# singer-tap-development
a
I'm using the SDK to build a tap for the Google Search Console API that can accept a service account credential. I'm using the `googleapiclient` library rather than a more traditional REST stream with a structured URL, so I'm just passing a dict to a function like so to get my JSON:
```python
self.authenticator.service.searchanalytics().query(
    siteUrl=self.config["site_url"],
    body=body,
)
```
What I need to do is create a date range between the start and end dates, and then pass each date to the API as a query parameter in the body. How can I store the last queried date in a state file? For instance, if I've already run the tap from 2023-11-01 up to 2023-11-19, then when I run it again tonight, I should only collect data for 2023-11-20, as my state file tells me I've already run up to 2023-11-19. I was going to mutate `context`, but the docs specifically advise against that.
s
I have done something like this in tap-rest-api-msdk. I'm not saying that I have followed best practice, but it seems to work where I used it for FHIR APIs.
https://github.com/s7clarke10/tap-rest-api-msdk/blob/4f87c1adae00446388ebbe418c70b87c231856dc/tap_rest_api_msdk/streams.py#L515
https://github.com/s7clarke10/tap-rest-api-msdk/blob/4f87c1adae00446388ebbe418c70b87c231856dc/tap_rest_api_msdk/utils.py#L99-L115
I have specified the replication key as part of my stream to ensure the current position is saved as the bookmark. If there is no bookmark value, it works from the supplied START_DATE; otherwise it works from the latest bookmark. It then adds an appropriate query to the API request to return data from the last position it got up to.
https://github.com/s7clarke10/tap-rest-api-msdk/blob/4f87c1adae00446388ebbe418c70b87c231856dc/tap_rest_api_msdk/streams.py#L537-L553
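For illustration, here is a minimal standard-library sketch of the "start date or latest bookmark" logic described above. It is not taken from tap-rest-api-msdk or the SDK; `dates_to_query` and its parameters are hypothetical names. In an SDK stream, the value passed as `bookmark` would presumably come from state via the stream's `replication_key` (for example through `get_starting_timestamp(context)`), rather than being tracked by hand.

```python
from datetime import date, timedelta
from typing import Iterator, Optional


def dates_to_query(
    start_date: date,
    end_date: date,
    bookmark: Optional[date] = None,
) -> Iterator[date]:
    """Yield each day that still needs to be requested, oldest first.

    Hypothetical helper: `bookmark` stands in for the last date recorded
    in the state file, or None on the very first run.
    """
    if bookmark is None:
        # No state yet: begin at the configured start date.
        current = start_date
    else:
        # Resume on the day after the last synced date, but never
        # earlier than the configured start date.
        current = max(start_date, bookmark + timedelta(days=1))

    while current <= end_date:
        yield current
        current += timedelta(days=1)


# Example from the question: already synced through 2023-11-19,
# running again on 2023-11-20.
remaining = list(
    dates_to_query(
        start_date=date(2023, 11, 1),
        end_date=date(2023, 11, 20),
        bookmark=date(2023, 11, 19),
    )
)
print(remaining)  # [datetime.date(2023, 11, 20)]
```

Each yielded date could then be placed in the request `body`. Resuming on the day after the bookmark matches the example in the question; if the API can revise figures for recent days, starting again from the bookmark date itself may be the safer choice.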
a
Thanks, I didn't even try setting `replication_key`; that seems to solve it completely. Another bit of SDK magic 🪄