Guys, Can you please tell me what needs to be done...
# singer-tap-development
a
Guys, can you please tell me what needs to be done so that after the first request I get 1 in previous_value instead of None?
r
What does the body of the first request contain, and what is the value of your
next_page_token_jsonpath
?
a
Where is next_page_token_jsonpath?
response example:
{
  "ok": true,
  "logs": [
    {
      "service_id": 1234567890,
      "service_type": "Google Calendar",
      "user_id": "U1234ABCD",
      "user_name": "Johnny",
      "channel": "C1234567890",
      "date": "1392163200",
      "change_type": "enabled",
      "scope": "incoming-webhook"
    },
    {
      "app_id": "2345678901",
      "app_type": "Johnny App",
      "user_id": "U2345BCDE",
      "user_name": "Billy",
      "date": "1392163201",
      "change_type": "added",
      "scope": "chat:write:user,channels:read"
    },
    {
      "service_id": "3456789012",
      "service_type": "Airbrake",
      "user_id": "U3456CDEF",
      "user_name": "Joey",
      "channel": "C1234567890",
      "date": "1392163202",
      "change_type": "disabled",
      "reason": "user",
      "scope": "incoming-webhook"
    }
  ],
  "paging": {
    "count": 3,
    "total": 3,
    "page": 1,
    "pages": 1
  }
}
r
Sorry - ignore what I said about
next_page_token_jsonpath
, it's not applicable here.
a
hmm, It seems to make sense and I need to override this value
r
Try this:
def get_next_page_token(
    self, response: requests.Response, previous_token: Optional[Any]
) -> Optional[Any]:
    data = response.json()
    max_page = data["paging"]["pages"]
    current_page = data["paging"]["page"]

    # Only request another page while the API reports more pages.
    next_page = None
    if current_page < max_page:
        next_page = current_page + 1

    # ...

    return next_page
a
I still receive only one record
e
You need to change the page size to something larger
a
I specifically set it to 1 to test following links, because I have very few entries (only 13)
r
Is that sample response you sent accurate? There is only one page in that case, so no next request to make.
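To make that concrete, here is a minimal standalone sketch of the same page check applied to already-decoded response bodies. The helper name `next_page_from_paging` and the second payload (13 entries with a page size of 1) are assumptions for illustration, not code from the tap.

```python
from typing import Any, Dict, Optional


def next_page_from_paging(data: Dict[str, Any]) -> Optional[int]:
    """Return the next page number, or None when the last page was reached."""
    paging = data["paging"]
    if paging["page"] < paging["pages"]:
        return paging["page"] + 1
    return None


# Paging block copied from the sample response above: one page in total.
sample = {"ok": True, "logs": [], "paging": {"count": 3, "total": 3, "page": 1, "pages": 1}}

# Hypothetical first page when 13 entries are fetched one at a time.
multi = {"ok": True, "logs": [], "paging": {"count": 1, "total": 13, "page": 1, "pages": 13}}

print(next_page_from_paging(sample))
print(next_page_from_paging(multi))
```

With the sample payload there is nothing left to fetch, so pagination stops after the first request; only a payload that reports more than one page produces a follow-up request.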
a
Yes, I just set page_size=3 and now I only get 3 posts. The example I gave is from the official site. Also here is the response from curl
I think I know what the problem is. I don't see the get_next_page_token function being called anywhere
r
That's what I was thinking, but you had it working before.
a
It seems so. Maybe this happened after I synchronized my fork with the original repository (to be honest, I doubt this is the problem)
I see a new class has appeared here, perhaps it interferes with the get_next_page_token call?
r
Has
RESTStream.get_next_page_token
been removed in a recent version of the SDK? @edgar_ramirez_mondragon
a
Ahh, it seems so; I don't see it in the documentation
e
@Reuben (Matatika) it was removed but if implemented it will still be used
a
Strange that it doesn't work for me.
I don't see it being called here.
r
Ah, I see.
get_new_paginator
is overriding the
super
implementation, which does still call `get_next_page_token`: https://github.com/meltano/sdk/blob/b1a2ec4443759be51b76b76d80d3dd7ee72c190d/singer_sdk/streams/rest.py#L479-L496.
a
It seems I should override this function (get_next), right?
r
Sure, but probably just for the logs stream if that is what you are adding. Don't want to mess up other streams.
class SlackLogsPaginator(JSONPathPaginator):

    def get_next(self, response: Response) -> str | None:
        ...
a
Hmm, I also thought about this, but how do I make sure the logs paginator is used only for the logs stream?
And I think that logsPaginator should also work with throttling
r
class LogsStream(SlackStream):

    def get_new_paginator(self):
        return SlackLogsPaginator(self.next_page_token_jsonpath)
And I think that logsPaginator should also work with throttling
class SlackLogsPaginator(ThrottledJSONPathPaginator):

    def get_next(self, response: Response) -> str | None:
        # apply throttling logic
        super().get_next(response)

        # your custom pagination implementation
        ...
(this is just off the top of my head, so apologies if it is slightly wrong)
a
This is a good option, but for some reason it seems to me that I could try to use the basic paginator implementation instead of writing my own (but I'm not sure yet)
No problem. Thank you very much
r
Yes, but that will not have the throttling implementation if that matters to you.
a
But the throttled paginator calls super().get_next(response)
from the base implementation
Or am I confused?
r
You can just ignore the
super
call return value and apply your own custom pagination logic after - the purpose is just to apply the timeout.
Alternatively, you could modify
ThrottledJSONPathPaginator
...
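A minimal sketch of that idea, assuming a throttled parent whose `get_next` sleeps before looking up a token. Both classes are hypothetical stand-ins that take an already-decoded payload instead of a `requests.Response`; they are not the tap's real `ThrottledJSONPathPaginator`, and `delay_seconds` and `next_cursor` are made-up names.

```python
import time
from typing import Any, Dict, Optional


class ThrottledPaginatorSketch:
    """Hypothetical stand-in for a throttled, token-based paginator."""

    delay_seconds = 0.0  # assumed knob; a real value depends on the API rate limit

    def get_next(self, payload: Dict[str, Any]) -> Optional[Any]:
        time.sleep(self.delay_seconds)  # throttling side effect
        return payload.get("next_cursor")  # token lookup the logs stream does not use


class LogsPaginatorSketch(ThrottledPaginatorSketch):
    """Reuses the parent's delay, then applies page-number logic."""

    def get_next(self, payload: Dict[str, Any]) -> Optional[int]:
        # Called only for the delay; the returned token is deliberately discarded.
        super().get_next(payload)

        paging = payload["paging"]
        if paging["page"] < paging["pages"]:
            return paging["page"] + 1
        return None
```

The subclass pays for one unused token lookup per page, which is the price of not copying the throttling code.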
a
No, no, I'm talking about the fact that I want to try using this implementation
r
Your response doesn't contain a next page token, does it? I thought that was the root of the issue.
a
Yes, I don't have a next page token, but I do have a current page token. I'm thinking of some variant that could use next_page_token_jsonpath and an attribute like 'need_increment_next_page_token' but it seems to be nonsense(
This style of paging is very popular; doesn't singer_sdk support it?
or something like this...
r
Maybe BasePageNumberPaginator or BaseOffsetPaginator is what you want.
a
Could you tell me where I should specify the pagination option that I want to use for the selected stream?
r
You will have to override
get_new_paginator
for the corresponding stream class.
a
ahhh
It turns out that we have returned to the problem discussed above, where we also need to use throttling for this stream
r
Unfortunately, if you want to do that I think you're just going to have to copy that logic and create a new
Throttled<paginator-type>
class that uses it, since
ThrottledJSONPathPaginator
is for JSONPath implementations only.
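As an alternative to copying the logic into each new class, the delay could live in a small mixin that is combined with whichever paginator type a stream needs. This is only a sketch of that different technique: `ThrottleMixin`, `PageNumberSketch`, and `max_requests_per_minute` are hypothetical names, and the page-number class is a plain stand-in rather than the SDK's own paginator.

```python
import time
from typing import Any, Dict


class ThrottleMixin:
    """Sleeps before delegating to the next get_next in the MRO."""

    max_requests_per_minute = 0  # 0 disables the delay in this sketch

    def get_next(self, payload: Dict[str, Any]) -> Any:
        if self.max_requests_per_minute:
            time.sleep(60 / self.max_requests_per_minute)
        return super().get_next(payload)


class PageNumberSketch:
    """Plain stand-in for a page-number paginator."""

    def __init__(self, start_value: int) -> None:
        self.current_value = start_value

    def get_next(self, payload: Dict[str, Any]) -> int:
        return self.current_value + 1


class ThrottledPageNumberSketch(ThrottleMixin, PageNumberSketch):
    """The mixin is listed first so its get_next runs before the paginator's."""
```

The order of the base classes matters: with the mixin listed second, its `get_next` would never run.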
a
I understand, thank you. I'm going to try now
r
Good luck! 😄
a
thanks
Seems like it should be something like this
r
Yep, make sure you are always returning some
bool
value from
has_more
though (
return
statement needs to be indented one level less).
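A minimal sketch of a `has_more` that returns a `bool` on every path. It is a standalone stand-in that only mirrors the `has_more` and `get_next` method names and works on an already-decoded payload; the class name and the `paging` handling are assumptions based on the sample response earlier in the thread.

```python
from typing import Any, Dict


class LogsPageNumberSketch:
    """Standalone stand-in; it does not subclass the SDK paginator."""

    def __init__(self, start_value: int) -> None:
        self.current_value = start_value

    def has_more(self, payload: Dict[str, Any]) -> bool:
        paging = payload.get("paging")
        more = False
        if paging is not None:
            more = paging["page"] < paging["pages"]
        # The return sits at function level, so every path yields a bool.
        # Indented inside the if, the other path would return None.
        return more

    def get_next(self, payload: Dict[str, Any]) -> int:
        return self.current_value + 1
```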
a
good catch
Wow, it seems to work!!!
I'm glad, thank you very much
r
Haha, awesome! 😎 That was a fun problem! Glad you got it working.