# singer-tap-development
s
Hi everyone. I am trying to build a custom tap to get astronomy data from a REST API (sunrise-sunset.org/api). The API serves data for a single date, along with a few other parameters. I have a requirement to load the data since 1 Jan 2020. I am using the Meltano SDK to build the tap. Attached is the code snippet (streams.py) I am using to stream the data. Currently I get only one record, for one particular day. How can I change the stream so that it loads data over an interval, e.g. from 1 Jan 2020 to 20 Oct 2022? In other words, how do I make a separate REST API call for each day, given that the API has no pagination or date-interval support?
Detailed requirement: we want to build a dataset of sunrise and sunset timings for Pune (lat=18.5204, long=73.8567). The tap should perform the following functions:
1. Historical load: load historical data since 1 Jan 2020
2. Incremental load: append today's data to the existing target
3. Transform data: convert the timestamps from UTC to IST
s
Hey @shravan_g_h did you get this to work or are you still looking for a solution?
r
(also in response to https://meltano.slack.com/archives/C01TCRBBJD7/p1687526456344019) Not tested, but maybe something like this?
client.py
    def get_records(self, context):
        dates: list = self.config["dates"]
        # or implement some logic here to accept a start date/end date and construct a list of dates from those
        # dates = get_dates(start=self.config["start_date"])  # from start_date up to today
        # dates = get_dates(start=self.config["start_date"], end=self.config["end_date"])  # from start_date to end_date

        for date in dates:
            # Pass a copy of the context with the date added (context may be None),
            # and use "yield from" so records are yielded, not the generator itself.
            yield from super().get_records({**(context or {}), "date": date})

    def get_url_params(self, context, next_page_token):
        params = super().get_url_params(context, next_page_token)

        params["lat"] = 18.5204
        params["lng"] = 73.8567  # the sunrise-sunset.org API names the longitude parameter "lng"
        params["date"] = context["date"]

        return params
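The get_dates helper mentioned in the comments above is not defined anywhere in this thread. A minimal sketch of what it might look like, assuming start_date and end_date are configured as ISO strings (YYYY-MM-DD) and that a missing end means "up to today":

```python
from datetime import date, timedelta
from typing import Optional


def get_dates(start: str, end: Optional[str] = None) -> list[str]:
    """Hypothetical helper: list every day from start to end, inclusive.

    Both arguments are assumed to be ISO dates such as "2020-01-01".
    If end is omitted, the range runs up to today's date.
    """
    start_day = date.fromisoformat(start)
    end_day = date.fromisoformat(end) if end else date.today()
    n_days = (end_day - start_day).days
    # An end before the start gives a negative n_days and so an empty list.
    return [(start_day + timedelta(days=i)).isoformat() for i in range(n_days + 1)]
```

Returning ISO strings keeps the values directly usable as the date query parameter. Each date still costs one HTTP request, so the historical load from 1 Jan 2020 means roughly a thousand calls.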
A couple of things I'm not sure about:
• Is modifying context in this way best practice or not?
• What are the implications of introducing a loop in get_records that calls the super implementation?
• What are the implications of accepting a configurable date range as config?
Either way, can't hurt to try! 😅
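On the third requirement from the original question (UTC to IST), here is a rough sketch of the conversion itself. It assumes the timestamps arrive as ISO 8601 strings in UTC, which is not confirmed in this thread, and utc_to_ist is a made-up name. IST has no daylight saving, so a fixed +05:30 offset should be enough.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is a fixed UTC+05:30 offset with no DST.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def utc_to_ist(timestamp: str) -> str:
    """Hypothetical helper: convert an ISO 8601 UTC timestamp to IST."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(IST).isoformat()


print(utc_to_ist("2022-10-20T01:02:03+00:00"))  # 2022-10-20T06:32:03+05:30
```

In an SDK stream this could probably be applied per record in post_process, though whether to convert in the tap or leave it to a downstream transform step is a design choice.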