.. _pagination:

Pagination
==========

.. image:: https://live.staticflickr.com/3620/3314919669_c4a8aaa604_b.jpg

When an API exposes a large volume of data, it is common to break that dataset
up into manageable "pages". Depending on the API, this can be handled in any
number of ways, but the overall theme is the same: collect X objects in a
request, then collect the next X, then the next, and so on. From the caller's
perspective, this ends up looking something like the following:

>>> api = ExampleAPI()
>>> page = api.get('items', params={'page': 1, 'size': 10})
>>> next_page = api.get('items', params={'page': 2, 'size': 10})
>>> next_next = api.get('items', params={'page': 3, 'size': 10})

While this works just fine, it requires the developer to know how the
pagination works, what the limits of the API are, and how to burn down each
page of data. For some folks this may be perfectly fine; however, with a little
effort we can turn this series of calls into an iterator that makes the page
calls for the developer and allows the paginated API to be used within a simple
``for`` loop.

RESTfly has a basic iterator already set up with most of the pagination logic
already written. All we need to do is override the ``_get_page`` method with
the actual calls. For this example, we will assume that the items are enclosed
in an ``items`` attribute and that there is a ``total`` attribute:

.. code-block:: json

    {
        "items": [
            {"id": 1},
            {"id": 2},
            {"id": 3}
        ],
        "total": 101
    }

To wrap that response in an iterator, all we need to do is tell the iterator
how to handle the pagination of the responses:

>>> from restfly.iterator import APIIterator
>>> class ItemIterator(APIIterator):
...     page_size: int = 10
...
...     def _get_page(self) -> None:
...         """
...         Gets the next page of data
...         """
...         resp = self._api.get('items', params={
...             'page': self.num_pages + 1,
...             'size': self.page_size
...         }).json()
...
...         # get the total and the items list from the response and store them into
...         # the reserved attributes.
...         self.total = resp.get('total')
...         self.page = resp.get('items', [])

Alright, so we have an iterator class now, but how do we use it? To start, we
should manually test it and see how it works:

>>> items = ItemIterator(api, page_size=10)
>>> for item in items:
...     print(item)

We should see the items being fed to the ``for`` loop through the iterator.
Once the total number of records has been reached (regardless of how many pages
that takes), the iterator raises a ``StopIteration`` exception, which ends the
loop. Now, let's go ahead and wire this into an endpoint method:

>>> from typing import Optional
>>> from restfly.endpoint import APIEndpoint
>>> class ItemsEndpoint(APIEndpoint):
...     def list(
...         self,
...         page_size: int = 10,
...         max_items: Optional[int] = None,
...         max_pages: Optional[int] = None,
...     ) -> ItemIterator:
...         return ItemIterator(
...             self._api,
...             page_size=page_size,
...             max_items=max_items,
...             max_pages=max_pages
...         )
...

So now, once we wire it into the ``ExampleAPI`` class like so:

>>> from restfly.session import APISession
>>> class ExampleAPI(APISession):
...     @property
...     def items(self) -> ItemsEndpoint:
...         return ItemsEndpoint(self)

We can now simply call the ``items.list`` method and get the iterator returned
back to us. This allows us to loop over the results without any care as to how
the underlying pagination is being handled.

>>> api = ExampleAPI()
>>> for item in api.items.list():
...     print(item)
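
To make the bookkeeping behind such an iterator more concrete, here is a
minimal sketch of the same pattern that does not use RESTfly at all. It is an
illustration under stated assumptions, not RESTfly's actual implementation:
``FakeAPI``, its ``get_page`` method, and ``PagedIterator`` are hypothetical
names invented for this example, and the response shape is the ``items`` /
``total`` layout assumed earlier.

.. code-block:: python

    from typing import Any, Dict, List, Optional


    class FakeAPI:
        """Hypothetical in-memory stand-in for a paginated API."""

        def __init__(self, total: int) -> None:
            self._items = [{'id': i} for i in range(1, total + 1)]
            self.calls = 0  # number of page requests served so far

        def get_page(self, page: int, size: int) -> Dict[str, Any]:
            # Pages are 1-indexed, like the ``page`` parameter used above.
            self.calls += 1
            start = (page - 1) * size
            return {
                'items': self._items[start:start + size],
                'total': len(self._items),
            }


    class PagedIterator:
        """Simplified page-buffering iterator (not RESTfly's APIIterator)."""

        def __init__(self, api: FakeAPI, page_size: int = 10,
                     max_items: Optional[int] = None) -> None:
            self._api = api
            self.page_size = page_size
            self.max_items = max_items
            self.page: List[Dict[str, Any]] = []  # items of the current page
            self.page_count = 0  # position within the current page
            self.num_pages = 0   # pages fetched so far
            self.count = 0       # items yielded so far
            self.total: Optional[int] = None

        def _get_page(self) -> None:
            resp = self._api.get_page(self.num_pages + 1, self.page_size)
            self.total = resp.get('total')
            self.page = resp.get('items', [])

        def __iter__(self) -> 'PagedIterator':
            return self

        def __next__(self) -> Dict[str, Any]:
            # Stop once every record, or the caller's limit, has been yielded.
            if self.total is not None and self.count >= self.total:
                raise StopIteration
            if self.max_items is not None and self.count >= self.max_items:
                raise StopIteration
            # Fetch the next page when the current one is used up.
            if self.page_count >= len(self.page):
                self._get_page()
                self.num_pages += 1
                self.page_count = 0
                if not self.page:
                    raise StopIteration
            item = self.page[self.page_count]
            self.page_count += 1
            self.count += 1
            return item


    api = FakeAPI(total=25)
    ids = [item['id'] for item in PagedIterator(api, page_size=10)]

In this sketch, 25 items with a page size of 10 are collected in three page
requests, and the caller never sees the page boundaries. Passing ``max_items``
ends the iteration early without requesting pages that are not needed.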