I am working on creating Egnyte tap. They use non ...
# singer-tap-development
n
I am working on creating an Egnyte tap. They use non-standard, cursor-based API pagination. Depending on the startDate the GET request is made with, we might get data or an empty events list. An empty events list doesn't mean there is no data. This is their response body:
```json
{
  "events": [],
  "nextCursor": "AAGxMwAAAZGg69ZzAAAAAAAAAAAAAAAAAAAAAA",
  "moreEvents": true,
  "cursorExpired": false
}
```
In this case there are more events, but Meltano won't request the next page because the events list is empty. How can I force Meltano to keep making requests until "moreEvents" is false? This is the current paginator implementation:
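```python
from typing import Optional

from requests import Response
from singer_sdk.pagination import BaseAPIPaginator


class EgnytePaginator(BaseAPIPaginator):
    def get_next(self, response: Response) -> Optional[str]:
        data = response.json()
        # Stop only when the API explicitly reports that no events remain.
        if data.get("moreEvents") is False:
            return None
        return data.get("nextCursor")
```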
client.py get_url_params
```python
def get_url_params(
    self,
    context: Context | None,  # noqa: ARG002
    next_page_token: Any | None,  # noqa: ANN401
) -> dict[str, Any]:
    params: dict = {}
    if next_page_token:
        # Follow-up requests send only the cursor.
        logging.info(f"Next token: {next_page_token}")
        params["nextCursor"] = next_page_token
        return params
    # The first request starts from the configured start date.
    params["startDate"] = self.start_time
    return params
```
n
Hi, I have overridden has_more following the above suggestion. It still doesn't solve the issue. This is the log message I get:
```
2024-08-30 17:26:38,122 | INFO     | root                 | Parsed 0 events from response
2024-08-30 17:26:38,123 | INFO     | root                 | More events: True
2024-08-30 17:26:38,123 | INFO     | root                 | Next cursor: AAGxMwAAAZGIACLgAAAAAAAAAAAAAAAAAAAAAA
2024-08-30 17:26:38,123 | INFO     | tap-egnyte.auth_events | Pagination stopped after 0 pages because no records were found in the last response
```
e
@Nir Diwakar (Nir) maybe the situation with this API is similar to https://github.com/meltano/sdk/issues/2318#issuecomment-2207861611? The rationale for this is that most APIs do send an empty response whenever there's no more data available, so the current implementation is there to help a majority of developers so they don't need to implement the breaking logic themselves. We could implement an opt-out though, so I'm happy to read suggestions.
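To make the behavior described here concrete, the following is a simplified model of a pagination loop that stops as soon as a page yields no records. It is not the SDK's actual code: `PAGES`, `parse_events`, and `request_records` are hypothetical names, and the canned responses only mimic the shape of the Egnyte body quoted earlier in the thread.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical canned responses keyed by cursor (None = first request).
PAGES = {
    None: {"events": [], "nextCursor": "c1", "moreEvents": True},
    "c1": {"events": [{"id": 1}], "nextCursor": "c2", "moreEvents": True},
    "c2": {"events": [{"id": 2}], "nextCursor": "c3", "moreEvents": False},
}


def parse_events(page: dict) -> Iterable[dict]:
    """Yield only the real events found in one response."""
    yield from page.get("events", [])


def request_records() -> Iterator[dict]:
    """Simplified loop: break on the first page that parses to zero records."""
    cursor: Optional[str] = None
    while True:
        page = PAGES[cursor]
        records = iter(parse_events(page))
        try:
            first = next(records)
        except StopIteration:
            # No records parsed, so the loop stops even though
            # the page reports moreEvents = true.
            break
        yield first
        yield from records
        if page.get("moreEvents") is False:
            break
        cursor = page["nextCursor"]


result = list(request_records())
print(result)
```

In this model the first page is empty, so the loop ends before the paginator is ever consulted, and the events behind cursors `c1` and `c2` are never fetched. That matches the "Pagination stopped after 0 pages" log line above.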
n
A simple addition solves the issue of premature pagination:
```python
def parse_response(self, response: Response) -> Iterable[dict]:
    data = response.json()
    events = data.get("events", [])
    if events:
        yield from events
    elif data.get("moreEvents"):
        # Yield an empty placeholder record so pagination continues
        yield {}
```
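One possible side effect of this workaround is that the empty placeholder may travel downstream as a record. A minimal sketch of one way to avoid that, assuming the placeholder is tagged with a marker key and dropped again in a later step: `PLACEHOLDER_KEY`, `parse_events`, and `drop_placeholder` are hypothetical names, written as plain functions so the snippet runs on its own. In a tap, the two functions would correspond to the stream's `parse_response` and `post_process` methods, where returning `None` from `post_process` is expected to exclude the record.

```python
from typing import Iterable, Optional

# Hypothetical marker key; any key that real events never contain would do.
PLACEHOLDER_KEY = "_pagination_placeholder"


def parse_events(data: dict) -> Iterable[dict]:
    """Yield real events, or a tagged placeholder when the page is empty
    but the API reports that more events exist."""
    events = data.get("events", [])
    if events:
        yield from events
    elif data.get("moreEvents"):
        yield {PLACEHOLDER_KEY: True}


def drop_placeholder(row: dict) -> Optional[dict]:
    """Return None for placeholders so they are not emitted as records."""
    if row.get(PLACEHOLDER_KEY):
        return None
    return row


# An empty page that still reports more events.
empty_page = {"events": [], "nextCursor": "abc", "moreEvents": True}
parsed = list(parse_events(empty_page))
kept = [row for row in map(drop_placeholder, parsed) if row is not None]
print(len(parsed), len(kept))
```

The parsing step still produces one item for the empty page, which is what keeps pagination going, while the filtering step ensures nothing from that page is emitted.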
e
Oh yup, that's a reasonable workaround 👍