stephen_lloyd
04/12/2021, 4:22 AM
teams
|
--spaces
  |
  --folders
    |
    --lists
      |
      --tasks
At each level, I need to send data to Snowflake, so each level is a valid stream. But I also need to remember all the IDs from each level so that I can loop through them.
It looks like partitions would help, but I’m not certain how they work. The GitLab example only deals with one level of a hierarchy, I think. Any tips?
EDIT: changed bottom level of the hierarchy from spaces -> tasks
aaronsteers
04/12/2021, 4:55 AM
`spaces` in your example basically requires one state entry and one call for each row from `lists`, which requires one call for each row from `folders`, and so on. Do you know what would be the approximate order of magnitude (hundreds, thousands, or closer to millions+) for the next-to-lowest grain? That would probably drive whether partitions are a feasible solution alone. Otherwise, the tap would likely need to make additional REST calls at runtime to loop through the parent structures.
stephen_lloyd
04/12/2021, 5:12 AM
<https://api.clickup.com/api/v2/team>
<https://api.clickup.com/api/v2/team/{team_id}/space>
<https://api.clickup.com/api/v2/space/{space_id}/folder>
<https://api.clickup.com/api/v2/folder/{folder_id}/list>
<https://api.clickup.com/api/v2/list/{list_id}/task>
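One way to walk these five endpoints could look like the sketch below. It is only an illustration under assumptions: `fetch` is a hypothetical callable that takes a URL and returns the decoded JSON body, each response is assumed to wrap its rows under a key equal to the level name (`teams`, `spaces`, ...), and each row is assumed to carry an `id` field. Authentication, paging, and any SDK integration are left out.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

BASE_URL = "https://api.clickup.com/api/v2"

# (level name, URL path template, context key that holds the parent's id).
# Assumption: the response key for each level equals the level name.
LEVELS: List[Tuple[str, str, Optional[str]]] = [
    ("teams", "/team", None),
    ("spaces", "/team/{team_id}/space", "team_id"),
    ("folders", "/space/{space_id}/folder", "space_id"),
    ("lists", "/folder/{folder_id}/list", "folder_id"),
    ("tasks", "/list/{list_id}/task", "list_id"),
]

Fetch = Callable[[str], dict]  # hypothetical: URL in, decoded JSON out


def walk(
    fetch: Fetch,
    max_depth: int = len(LEVELS) - 1,
    depth: int = 0,
    context: Optional[Dict[str, str]] = None,
) -> Iterator[Tuple[str, Dict[str, str], dict]]:
    """Yield (level name, parent-id context, record), depth first."""
    context = context or {}
    name, template, _ = LEVELS[depth]
    # str.format ignores context keys the template does not use.
    payload = fetch(BASE_URL + template.format(**context))
    for record in payload.get(name, []):
        yield name, context, record
        if depth < max_depth:
            child_key = LEVELS[depth + 1][2]
            child_context = {**context, child_key: record["id"]}
            yield from walk(fetch, max_depth, depth + 1, child_context)


def partitions_for(fetch: Fetch, level_name: str) -> List[Dict[str, str]]:
    """Return one context dict per parent record of `level_name`."""
    depth = [level[0] for level in LEVELS].index(level_name)
    if depth == 0:
        return [{}]  # the top level has no parent ids
    parent_name = LEVELS[depth - 1][0]
    child_key = LEVELS[depth][2]
    # Stop one level above the target so its own endpoint is never called.
    return [
        {**ctx, child_key: rec["id"]}
        for name, ctx, rec in walk(fetch, max_depth=depth - 1)
        if name == parent_name
    ]
```

Grouping the output of `walk()` by level name gives one flat list of records per level, while `partitions_for(fetch, "tasks")` returns only the parent-id contexts needed to request tasks, one per list. How those dicts are handed to the tap as partitions is not shown here.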
ken_payne
04/12/2021, 2:06 PM
`Service` class to traverse the hierarchy of my upstream source and present 'flat' lists of dicts for streams to consume. The implementation was actually one of the topics of last week 😅 If you have large volumes of data to retrieve, this solution may be quite memory-intensive, but it works for us.
stephen_lloyd
04/12/2021, 3:32 PM
aaronsteers
04/12/2021, 7:53 PM
> I think we’re at most in the 100s range for the next to lowest grain.
At this scale, partitions should still be scalable, since (1) you can traverse that number of items during job initialization with little realtime latency, and (2) the size of those items each having their own partition entry in `state` would not become a scaling challenge. That said, you could also take the service approach that @ken_payne used for Tableau instead of or in addition to the partitions approach.
ken_payne
04/12/2021, 8:27 PM
ken_payne
04/12/2021, 9:23 PM
aaronsteers
04/12/2021, 9:24 PM
aaronsteers
04/13/2021, 8:07 PM
aaronsteers
04/13/2021, 8:08 PM
aaronsteers
04/13/2021, 8:09 PM
stephen_lloyd
04/14/2021, 3:48 AM
aaronsteers
04/14/2021, 2:13 PM
stephen_lloyd
04/14/2021, 3:53 PM
aaronsteers
04/14/2021, 3:55 PM
stephen_lloyd
04/14/2021, 3:57 PM
aaronsteers
04/14/2021, 5:12 PM