Django Rest Framework is by far the most popular solution for providing APIs in the Django universe (with tastypie being a viable alternative). Last summer the project’s creator, Tom Christie, started a Kickstarter campaign to fund his work on the next major release. The funds were gathered quickly and Django Rest Framework v3 was born. The goals of v3 range from refactoring to extending the UI layer of the product; in this article, though, I want to focus on the improved pagination support which was introduced in the 3.1 release (beginning of March 2015).
Pagination in DRF v2
In DRF v2 there was only one strategy of paginating data, configurable in views or globally in settings.py:
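from rest_framework.generics import ListAPIView

class PaginatedListView(ListAPIView):
    queryset = ExampleModel.objects.all()
    serializer_class = ExampleModelSerializer
    paginate_by = 10                    # default page size
    paginate_by_param = 'page_size'     # query parameter that overrides the page size
    max_paginate_by = 100               # upper bound for the client-requested page size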
Possible extensions could affect the way data is rendered, but there was no option to change the way it was fetched, which leads us to the next section.
Improvements in version 3
You might ask what is wrong with that approach and why it needed to change. Well, as with many problems in programming, paging is not as simple as it looks. Under the hood this approach uses a limit/offset SQL query, which might be slow for large datasets even when using indexes (you can read about it here and here). So this paging approach is not always recommended. With that rationale in mind, the paging configuration was modified in the new release of DRF, and cursor pagination was included. Let’s take a quick look at the offered options:
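To make the limit/offset concern concrete, here is a minimal sketch using only the standard library's sqlite3 module. The record table, its contents and the helper names are hypothetical and exist only for illustration; this is not how DRF builds its queries. Both functions return the same rows, but the first asks the database to skip rows by count, while the second filters on the last id already seen, which an index can typically satisfy directly.

```python
import sqlite3

# Hypothetical in-memory table with 100 rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE record (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO record (id, name) VALUES (?, ?)",
    [(i, f"record-{i}") for i in range(1, 101)],
)

PAGE_SIZE = 10


def fetch_page_with_offset(page):
    """Limit/offset paging: skipped rows still have to be stepped over."""
    offset = (page - 1) * PAGE_SIZE
    return conn.execute(
        "SELECT id, name FROM record ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset),
    ).fetchall()


def fetch_page_after(last_seen_id):
    """Position-based paging: filter on the last id the client has seen."""
    return conn.execute(
        "SELECT id, name FROM record WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, PAGE_SIZE),
    ).fetchall()


offset_page = fetch_page_with_offset(page=3)
keyset_page = fetch_page_after(last_seen_id=20)
```

On a table this small the difference is invisible; the cost of the offset variant tends to show up only when the offset grows large.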
Configuration changes and custom pagination schemes
From the code point of view, the previously used options such as paginate_by, paginate_by_param and max_paginate_by have been moved to separate classes, which should now be referenced either in settings.py (for the default pagination scheme) or in generic views via the pagination_class attribute. Changing fields such as the page size is done by subclassing the pagination classes (or by writing your own custom pagination scheme).
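As a rough sketch of the global variant, a settings.py fragment could look like the one below. The choice of PageNumberPagination and a page size of 10 is just an assumption made for this example.

```python
# settings.py (fragment): default pagination scheme for all generic views.
# The chosen class and page size are example values, not recommendations.
REST_FRAMEWORK = {
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "PAGE_SIZE": 10,
}
```

On a PageNumberPagination subclass, the old view attributes correspond to page_size (formerly paginate_by), page_size_query_param (formerly paginate_by_param) and max_page_size (formerly max_paginate_by).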
It is also possible to extend the BasePagination class, this time with a higher degree of flexibility: overridable methods allow us to change the way data is rendered as well as the way it is fetched (with the whole request at our disposal, so we can implement any page retrieval idea).
PageNumberPagination and LimitOffsetPagination
These are really the same approach to pagination, the only difference being that LimitOffsetPagination uses a more explicit way of specifying which records to fetch.
Example configuration looks like this:
from rest_framework.pagination import LimitOffsetPagination, PageNumberPagination

class SmallPagesPagination(PageNumberPagination):
    page_size = 3

class SmallOffsetPagination(LimitOffsetPagination):
    default_limit = 3
with usage that might look like this:
from rest_framework import generics

class RecordsPaged(generics.ListAPIView):
    queryset = m.Record.objects.all()
    serializer_class = s.RecordSerializer
    pagination_class = SmallPagesPagination    # ?page=2

class RecordsLimited(generics.ListAPIView):
    queryset = m.Record.objects.all()
    serializer_class = s.RecordSerializer
    pagination_class = SmallOffsetPagination   # ?offset=3&limit=2
This will result in pages being retrievable by querying with either ?offset=3&limit=2 or ?page=2 (with other methods of setting query parameters supported by DRF also available). Every request will result in two SQL queries under the hood: one counting records, so we know how many pages are available (this happens regardless of whether we display some form of page links or not), and one fetching the actual data.
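The relationship between the two query styles is plain arithmetic, sketched below in standalone Python. The helper names page_to_limit_offset and page_count are made up for this example and are not part of DRF; the second helper also shows why the record count is needed to know how many pages exist.

```python
import math


def page_to_limit_offset(page, page_size):
    """Translate a 1-based page number into (limit, offset) values."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return page_size, (page - 1) * page_size


def page_count(total_records, page_size):
    """Number of pages needed to show every record."""
    return math.ceil(total_records / page_size)


# With a page size of 3, ?page=2 covers the same rows as ?limit=3&offset=3.
limit, offset = page_to_limit_offset(page=2, page_size=3)
pages_for_ten_records = page_count(total_records=10, page_size=3)
```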
CursorPagination
This is the second, more scalable approach, with example code looking like this:
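from rest_framework import generics
from rest_framework.pagination import CursorPagination

class SmallCursorPagination(CursorPagination):
    page_size = 3
    ordering = 'id'    # must be a stable, (nearly) unique ordering

class RecordsCursored(generics.ListAPIView):
    queryset = m.Record.objects.all()
    serializer_class = s.RecordSerializer
    pagination_class = SmallCursorPagination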
This time the query takes a cursor parameter (although it does not use a DB cursor), which rendered in a URL might look like ?cursor=cD03JnI9MQ%3D%3D. This is actually a base64 and URL encoded representation of a dictionary. Running binascii.a2b_base64(urllib.parse.unquote('cD03JnI9MQ%3D%3D')) gives us b'p=7&r=1', which gets translated to the cursor position (p), the offset (not present in this case) and whether we’re iterating in reverse (r). With that information it’s always possible to fetch the next and previous page with an efficient SQL query (which, instead of offsetting, takes into account the position we were previously in). There are limitations to this approach though:
- It cannot jump to an arbitrary page or count the number of pages
- The records need to be ordered in a way that does not change during list navigation (if it did, we could display duplicate records or skip some when moving between pages)
- It should order by non-nullable, unique or nearly unique values (so that when saving the last cursor position we don’t hit too many duplicates)
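For the curious, the cursor value shown earlier can be unpacked into its tokens with a few standard-library calls. This is only a sketch for inspecting the token; decode_cursor and encode_cursor are hypothetical helper names, not DRF functions, and a client should normally treat the cursor as opaque.

```python
import base64
from urllib.parse import parse_qsl, unquote, urlencode


def decode_cursor(raw):
    """Undo the URL quoting and the base64 layer, then parse the query string."""
    query_string = base64.b64decode(unquote(raw)).decode("ascii")
    return dict(parse_qsl(query_string))


def encode_cursor(tokens):
    """Build the base64 form of the tokens (before URL quoting)."""
    return base64.b64encode(urlencode(tokens).encode("ascii")).decode("ascii")


tokens = decode_cursor("cD03JnI9MQ%3D%3D")   # {'p': '7', 'r': '1'}
round_trip = encode_cursor(tokens)           # 'cD03JnI9MQ=='
```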
The implementation is a bit more complex, but it does take care of scenarios where the list is edited while being simultaneously paged by another user, which is also the reason why you don’t want to implement it yourself if there’s a ready-made component available. And even with the mentioned limitations, it is a perfect fit for infinite scroll solutions (if you wonder why infinite scroll is popular in large webapps, it’s probably for performance reasons).
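The concurrent-edit scenario can be simulated with plain Python lists. The sketch below assumes a newest-first listing of hypothetical record ids and a deliberately simplified cursor (just "ids smaller than the last one seen"); DRF's real implementation handles more cases, but the effect is the same: offset paging shows a record twice, while position-based paging does not.

```python
PAGE_SIZE = 3
ids = list(range(1, 11))  # hypothetical record ids 1..10


def offset_page(all_ids, offset):
    """Newest-first page selected by counting rows from the top."""
    newest_first = sorted(all_ids, reverse=True)
    return newest_first[offset:offset + PAGE_SIZE]


def cursor_page(all_ids, before=None):
    """Newest-first page selected relative to the last id already seen."""
    newest_first = sorted(all_ids, reverse=True)
    if before is not None:
        newest_first = [i for i in newest_first if i < before]
    return newest_first[:PAGE_SIZE]


first_offset = offset_page(ids, 0)   # [10, 9, 8]
first_cursor = cursor_page(ids)      # [10, 9, 8]

ids.append(11)  # another user adds a record between the two requests

second_offset = offset_page(ids, PAGE_SIZE)                 # [8, 7, 6]: 8 shown twice
second_cursor = cursor_page(ids, before=first_cursor[-1])   # [7, 6, 5]: no duplicate
```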
Wrapup
There you have it. Pagination is of course not rocket science, but getting it right is not as trivial as it might seem, and the extended support for it in DRF is a nice addition. It’s easy to use and provides options for working with larger datasets, which in effect leaves developers with more time to focus on the essential issues of their project instead of working around the framework they’ve chosen.