pierre_de_poulpiquet
03/23/2021, 8:07 PM
RESTStream.
The closest thing that I saw was: https://gitlab.com/meltano/singer-sdk/-/blob/development/singer_sdk/samples/sample_tap_gitlab/gitlab_rest_streams.py#L160 where 2 streams are used and a sort of state (is it the same as the tap state?) is used to pass data between the 2 streams.
In Hubspot, there are thousands or hundreds of thousands of Contact IDs to be passed between the 2 streams. Would the recommended solution be the same as the one implemented in the Gitlab tap?
b. Some streams in Hubspot are nested in a single API endpoint. E.g. Deal Pipeline and Deal Pipeline Stage are returned as 2 nested lists from a single endpoint.
How would that be mapped with the SDK? Is it possible to create a Stream that consumes the results of another Stream's API calls without making API calls of its own?
c. What is the concurrency model of the SDK between Streams? Are all streams synced in sequence, or is there some sort of concurrency?
My experience has shown that Hubspot paginated APIs have high latency, so the "total sync time" can be quite long: we're spending a lot of time waiting for the next page (and when people have 100k records, that's a lot of pages...).
Syncing different paginated streams (e.g. Contacts and Deals) at the same time saves time in this case.
How would the SDK have helped?
juan_sebastian_suarez_valencia
03/23/2021, 8:25 PM
aaronsteers
03/23/2021, 9:12 PM
RESTStream.append_extra_record_data() method (or similar).
This doesn't exist yet but in theory could default to doing nothing and then be called by the framework here before yielding each individual record.
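A minimal sketch of how such a per-record hook could behave, assuming a simplified stand-in base class rather than the real singer_sdk RESTStream. The names SketchStream, fetch_records, and ContactsStream are hypothetical, and append_extra_record_data() is the not-yet-existing method proposed above:

```python
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


class SketchStream:
    """Hypothetical stand-in for a stream base class (NOT the real SDK class)."""

    def fetch_records(self) -> Iterable[Record]:
        # Hypothetical: subclasses return the parsed records of the main endpoint.
        raise NotImplementedError

    def append_extra_record_data(self, record: Record) -> Record:
        # Default hook does nothing, so streams that don't need it are unaffected.
        return record

    def get_records(self) -> Iterator[Record]:
        for record in self.fetch_records():
            # The framework calls the hook just before yielding each record.
            yield self.append_extra_record_data(record)


class ContactsStream(SketchStream):
    def __init__(self, contacts: List[Record], details_by_id: Dict[int, Record]):
        self._contacts = contacts
        self._details_by_id = details_by_id

    def fetch_records(self) -> Iterable[Record]:
        return [dict(contact) for contact in self._contacts]

    def append_extra_record_data(self, record: Record) -> Record:
        # A real tap would issue a second request here; a dict lookup stands in.
        record["details"] = self._details_by_id.get(record["id"])
        return record


stream = ContactsStream([{"id": 1}, {"id": 2}], {1: {"email": "a@example.com"}})
enriched = list(stream.get_records())
```

Because the default hook returns the record unchanged, only streams that need a follow-up lookup would override it.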
Option 3: Implement subsequent request calls, in batch, during RESTStream.parse_response() here.
This is similar to option 2 but faster if you can send multiple record keys to the REST endpoint at the same time.
aaronsteers
03/23/2021, 9:15 PM
requests library apart from the main flow. We'd probably want to add those - perhaps a singleton call that assumes the same authenticator and HTTP headers, but takes a custom path, custom HTTP params, and custom payload.
Thoughts?
aaronsteers
03/23/2021, 9:16 PM
Are all streams synced in sequence, or is there some sort of concurrency?
Currently the streams are synced in sequence, but in theory this can be overridden, and in the future we might build in the capability to auto-parallelize up to a specific degree of parallelism. Another option which I've seen people do is to simply have two or more groupings of streams for the same tap, running in parallel at the orchestrator level (Meltano).
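As a rough illustration of a bounded degree of parallelism, here is a sketch that runs independent stream syncs in a thread pool. It is not how the SDK works today (streams are synced in sequence), and sync_in_parallel and make_fake_sync are hypothetical helpers, with time.sleep standing in for slow paginated requests:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

SyncFn = Callable[[], List[dict]]


def sync_in_parallel(streams: Dict[str, SyncFn], max_workers: int = 2) -> Dict[str, List[dict]]:
    """Run each stream's sync function in a thread, at most max_workers at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(sync) for name, sync in streams.items()}
        # result() blocks until that stream has finished (or re-raises its error).
        return {name: future.result() for name, future in futures.items()}


def make_fake_sync(name: str, pages: int, latency: float = 0.05) -> SyncFn:
    """Build a fake paginated sync; the sleep stands in for API latency."""

    def sync() -> List[dict]:
        records = []
        for page in range(pages):
            time.sleep(latency)  # waiting for the next page
            records.append({"stream": name, "page": page})
        return records

    return sync


results = sync_in_parallel(
    {
        "contacts": make_fake_sync("contacts", pages=3),
        "deals": make_fake_sync("deals", pages=2),
    }
)
```

Threads suit this case because the time is spent waiting on network I/O. A real tap would also have to serialize the record and state messages it writes to stdout, which this sketch sidesteps by returning records in memory.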
aaronsteers
03/23/2021, 9:21 PM
aaronsteers
03/23/2021, 9:51 PM
A feature that may be of interest to myself and other people developing RESTful taps may be the ability to query an endpoint for each record in a stream (e.g. /content/{content_id}/views).
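A small sketch of the per-record query idea, assuming a parent stream whose records carry an "id" field and an injectable fetch function. Both child_records and fake_fetch are hypothetical; a real tap would perform an authenticated HTTP GET in place of fake_fetch:

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def child_records(
    parents: Iterable[Record],
    path_template: str,
    fetch: Callable[[str], List[Record]],
) -> Iterator[Record]:
    """Query one endpoint per parent record and yield the child records."""
    for parent in parents:
        # Fill the path parameter from the parent record.
        path = path_template.format(content_id=parent["id"])
        for child in fetch(path):
            # Keep the parent key on each child so the rows can be joined later.
            yield {"content_id": parent["id"], **child}


def fake_fetch(path: str) -> List[Record]:
    # Canned responses standing in for the HTTP call (assumption for the sketch).
    canned = {
        "/content/1/views": [{"views": 10}],
        "/content/2/views": [{"views": 3}, {"views": 4}],
    }
    return canned.get(path, [])


views = list(child_records([{"id": 1}, {"id": 2}], "/content/{content_id}/views", fake_fetch))
```

Note that this makes one request per parent record, so with hundreds of thousands of parents a batched endpoint, where available, would be much faster.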
edgar_ramirez_mondragon
03/23/2021, 10:22 PM