michael_cooper
09/10/2021, 3:57 PM
• get_child_context() gets called after post_process(). Would it not make more sense to call get_child_context() on the unprocessed record? For example, I had to output a JSON string as the value of a column in the final record, but I needed a value from within that JSON string for get_child_context(). Because get_child_context() is called on the post-processed record, I had to json.loads(record), which seems like an unnecessary step.
• It would be nice to be able to access the context passed to a child stream from any section of the stream. For example, my parent stream might pass a list of ids to query in the child, but the child endpoint only takes a single id per query. It would be helpful to be able to access the context from get_next_page_token() and pop the id off as needed instead of writing indexing logic within get_next_page_token(). I’m sure there are other places where access to the context would be helpful but isn’t currently available; this is just the most immediate example I have.
• It could also be helpful to have some sort of mechanism to build up a context in a parent stream to pass to a child. Since get_child_context() is called on each record, you may only be able to pull one value from that record, but it’s possible that the child stream can accept a list of values per request. So if the parent stream produces N records, the current implementation means you have to make N calls to the API via the child stream. Building a context to pass to the child would mean you could make just one call.
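A minimal sketch of the first point above, using plain Python functions as stand-ins for the two hooks rather than the SDK's own classes. The field names (payload, account, account_id) and the sample row are hypothetical; only the hook names and the json.loads() workaround come from the description above.

```python
import json


def post_process(row, context=None):
    """Stand-in for a stream's post_process() hook.

    Serializes the nested payload so the final record carries a JSON string.
    """
    row = dict(row)  # copy so the raw row is left untouched
    row["payload"] = json.dumps(row["payload"])
    return row


def get_child_context(record, context=None):
    """Stand-in for get_child_context().

    It receives the post-processed record, so the JSON string has to be
    parsed again just to reach the nested id.
    """
    payload = json.loads(record["payload"])
    return {"account_id": payload["account"]["id"]}


# Hypothetical raw row as it might come back from the parent endpoint.
raw_row = {"id": 1, "payload": {"account": {"id": "abc-123"}}}

record = post_process(raw_row)             # payload is now a str
child_context = get_child_context(record)  # payload parsed a second time
print(child_context)
```

If get_child_context() were given the unprocessed row instead, the json.loads() round trip in the second function would not be needed.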
Maybe there is a way to access the context from anywhere within a stream, but I’m just unaware of it.
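One possible workaround for the second point, sketched in plain Python under stated assumptions: the class below is a stand-in and not the SDK's base class, start_partition() is a hypothetical hook name for "the first place the child stream sees its context", and the "ids" key is made up. The get_next_page_token(response, previous_token) signature is assumed to match the one discussed above, which does not receive the context.

```python
from collections import deque


class ChildStreamSketch:
    """Plain-Python stand-in for a child stream (not an SDK class)."""

    def __init__(self):
        self._pending_ids = deque()

    def start_partition(self, context):
        # Hypothetical hook: copy the ids out of the parent's context once,
        # so that later methods can reach them without being passed context.
        self._pending_ids = deque(context["ids"])

    def get_next_page_token(self, response, previous_token):
        # Pop the next id to query, or return None when nothing is left,
        # which signals the end of pagination.
        if self._pending_ids:
            return self._pending_ids.popleft()
        return None


stream = ChildStreamSketch()
stream.start_partition({"ids": ["a", "b", "c"]})

requested = []
token = stream.get_next_page_token(response=None, previous_token=None)
while token is not None:
    requested.append(token)  # one request per id would be issued here
    token = stream.get_next_page_token(response=None, previous_token=token)

print(requested)
```

Stashing the ids on the instance avoids index bookkeeping inside get_next_page_token(), at the cost of carrying mutable state between calls.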