Pagination is a common requirement in web applications dealing with datasets. Developers often use SQL queries with OFFSET and LIMIT to fetch paginated records. While this approach is straightforward and works well for many cases, it can become inefficient when dealing with very large datasets, leading to performance bottlenecks.
Consider the following query:
SELECT * FROM my_table
ORDER BY created_at DESC
LIMIT 10 OFFSET 100000;
The database engine must scan and discard 100,000 rows before returning the next 10 results. As the offset increases, the query slows down because the database has to traverse more records before fetching the relevant data.
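To make the arithmetic concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The helper name fetch_offset_page, the integer created_at values, and the sample data are assumptions made for illustration. It shows how a page number turns into an OFFSET: at 10 rows per page, page 10,001 produces the OFFSET 100000 from the query above.

```python
import sqlite3

def fetch_offset_page(conn, page, page_size=10):
    """Return one page using LIMIT/OFFSET (page is 1-based)."""
    offset = (page - 1) * page_size  # rows the engine must step over first
    return conn.execute(
        "SELECT id, created_at FROM my_table "
        "ORDER BY created_at DESC LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

# Hypothetical sample data: 50 rows with increasing created_at values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.executemany(
    "INSERT INTO my_table (id, created_at) VALUES (?, ?)",
    [(i, 1000 + i) for i in range(1, 51)],
)

# Page 3 skips the 20 newest rows before returning the next 10.
page_3 = fetch_offset_page(conn, page=3)
```

The deeper the page, the larger the offset, and the more rows are read only to be thrown away.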
Keyset Pagination
To improve pagination performance, we can use keyset pagination (also called the seek method). Instead of using OFFSET, the client sends back the key of the last record it received, and the query fetches only the records that come after it. This method is significantly more efficient because the database can seek directly to the starting point through an index instead of scanning rows it will discard. On large tables, keyset pagination can be as much as 100 times faster than offset-based pagination.
Let’s assume our table has an indexed id column. Instead of using OFFSET, we fetch records after the last retrieved id that we got from the client:
SELECT * FROM product
WHERE id > ?
ORDER BY id ASC
LIMIT 10;
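As a rough sketch of how a caller might walk through all pages with this query, the following Python example uses the standard-library sqlite3 module. The function name fetch_after, the name column, and the 25 sample rows are assumptions for illustration only. Each iteration passes the last id of the previous page back into the query.

```python
import sqlite3

def fetch_after(conn, last_id, page_size=10):
    """Return the next page of rows whose id is greater than last_id."""
    return conn.execute(
        "SELECT id, name FROM product WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, page_size),
    ).fetchall()

# Hypothetical sample data: 25 products with ids 1..25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO product (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 26)],
)

pages = []
last_id = 0  # smaller than any real id, so the first call returns page one
while True:
    rows = fetch_after(conn, last_id)
    if not rows:
        break
    pages.append([row[0] for row in rows])
    last_id = rows[-1][0]  # the client sends this value back for the next page
```

With 25 rows and a page size of 10, the loop yields pages of 10, 10, and 5 rows, and no query ever uses an offset.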
The limitation of the query above is that it only works when sorting by a unique column like id. If we need to order by a non-unique column like expire_date, a single-column key is not enough: multiple rows can share the same expire_date, so rows that tie with the last retrieved value may be skipped or repeated at page boundaries. To fix this, we paginate on a composite key (multiple columns), adding id as a tie-breaker to ensure a unique ordering sequence:
SELECT * FROM product
WHERE (expire_date < ? OR (expire_date = ? AND id < ?))
ORDER BY expire_date DESC, id DESC
LIMIT 10;
Because the combination of expire_date and id is unique, each query starts from an exact position in the ordering, so records are neither repeated nor skipped across pages.
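The tie-breaking behaviour can be checked with a small sketch in Python and the standard-library sqlite3 module. The function name fetch_next, the page size of 3, and the sample rows are illustrative assumptions. Dates are stored as ISO-8601 strings here, which compare correctly as text. Several rows share the same expire_date, and one page boundary falls in the middle of such a group.

```python
import sqlite3

def fetch_next(conn, last_expire_date, last_id, page_size=3):
    """Fetch the page that follows (last_expire_date, last_id) in DESC order."""
    return conn.execute(
        "SELECT id, expire_date FROM product "
        "WHERE expire_date < ? OR (expire_date = ? AND id < ?) "
        "ORDER BY expire_date DESC, id DESC LIMIT ?",
        (last_expire_date, last_expire_date, last_id, page_size),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, expire_date TEXT)")
# Hypothetical data: ids 1-4 share one date, ids 5-7 share a later one.
conn.executemany(
    "INSERT INTO product (id, expire_date) VALUES (?, ?)",
    [(i, "2025-01-01") for i in range(1, 5)]
    + [(i, "2025-02-01") for i in range(5, 8)],
)

seen = []
# Start from a position that sorts before every real row in DESC order.
position = ("9999-12-31", 0)
while True:
    rows = fetch_next(conn, *position)
    if not rows:
        break
    seen.extend(row[0] for row in rows)
    position = (rows[-1][1], rows[-1][0])  # last row's expire_date and id
```

The second page ends at id 2 while id 1 has the same expire_date. The id comparison is what lets the third page pick up exactly at id 1.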
Indexing for Best Performance
To maximize query speed, create a composite index that matches the columns and sort order of the query:
CREATE INDEX idx_product_expire_date_and_id
ON product (expire_date DESC, id DESC);
Keyset Pagination in Spring Boot
In a Spring Boot application using JPA and Hibernate, you can implement keyset pagination using JPQL as follows:
@Query("SELECT m FROM Product m WHERE m.expireDate < :expireDate OR (m.expireDate = :expireDate AND m.id < :id) ORDER BY m.expireDate DESC, m.id DESC")
List<Product> findNextPage(@Param("expireDate") LocalDateTime expireDate, @Param("id") Long id, Pageable pageable); // use pageable only for page size
Spring Data JPA will automatically apply LIMIT based on Pageable.getPageSize(), even though it’s not explicitly in the JPQL. Since you’re not using Pageable for offset-based pagination, just pass PageRequest.of(0, pageSize) to only control the limit.
Usage in the service layer would look like this:
Pageable pageable = PageRequest.of(0, 10);
List<Product> products = productRepository.findNextPage(expireDate, id, pageable);
Keyset Pagination in Django
In Django, you can implement keyset pagination with the ORM’s QuerySet API:
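from django.db.models import Q

# Product is assumed to be a model imported from your app's models module.
def get_next_page(expire_date, last_id, limit=10):
    # Fetch rows that come after (expire_date, last_id) in descending order.
    return Product.objects.filter(
        Q(expire_date__lt=expire_date) | (Q(expire_date=expire_date) & Q(id__lt=last_id))
    ).order_by('-expire_date', '-id')[:limit]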
Conclusion
Keyset pagination is a powerful technique for improving SQL query performance when paginating large datasets. While it works best with unique indexed columns, it can be adapted for non-unique sort columns by using composite keys. By applying this technique, you get faster and more efficient pagination on tables with a huge number of records. 🚀