# singer-tap-development
a
Is it possible for a parent stream to have variable child streams based on an array property of the parent stream?
a
Hi, @andrew_stewart! Can you say a bit more? I'm not sure I follow the question or the use case.
a
For example, let’s say the source is some API. Say the parent stream has some array property, which in some instance might be
'my_list': ['a','b','c']
(and this instance has another property
'id': 'xxx'
)… and then in the child stream I’d expect objects obtained from paths as follows:
/parent/{id}/{a}
/parent/{id}/{b}
/parent/{id}/{c}
Does something like that seem possible with parent-child streams?
(sorry for the crappy description :P)
a
Cool, yeah, I think that makes sense. I don't think this could be auto-populated to the path, from simply passing the array as part of the child's context. However - you could include this in your pagination logic. Basically, when you'd otherwise be done looping through pages, you instead check to see if all passed array items from the parent context have been "popped".
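A minimal sketch of that idea, written as plain Python rather than against the SDK's paginator classes: instead of stopping when one path runs out of records, move on to the next array item from the parent context. The names `child_paths`, `iter_child_records`, and `fetch` are hypothetical, and the `id` and `my_list` keys are assumed to be passed down in the child's context.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional


def child_paths(context: Optional[Dict[str, Any]]) -> List[str]:
    """Build one request path per item in the parent's array property."""
    if not context:
        return []
    parent_id = context["id"]
    # `my_list` may be missing or None for some parent records.
    return [f"/parent/{parent_id}/{item}" for item in context.get("my_list") or []]


def iter_child_records(
    context: Optional[Dict[str, Any]],
    fetch: Callable[[str], Iterable[Dict[str, Any]]],
) -> Iterator[Dict[str, Any]]:
    """Yield records path by path, moving to the next item when one is exhausted."""
    for path in child_paths(context):
        # `fetch` is a hypothetical stand-in for the real (paginated) API call.
        yield from fetch(path)
```

In a real tap, the loop body would presumably live in the child stream's request or pagination logic, with `fetch` replaced by the stream's own HTTP handling.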
a
I can imagine at least two approaches to this. Approach 1: I could just define a separate child stream class for each known value of the parent stream’s array that can be expected, but I imagine there’d need to be some kind of conditional logic on the child stream, because some parent objects might not have a child object of, say, 'b', etc.
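A rough sketch of what Approach 1 could look like, with one small subclass per known array value. `BaseStream` here is only a stand-in so the example runs on its own (in a real tap it would be the SDK's stream class), and `ListItemChildStream` and its `list_item` attribute are hypothetical names.

```python
from typing import Any, Dict, Iterable, Optional


class BaseStream:
    """Stand-in for an SDK stream class, so the sketch runs on its own."""

    def get_records(self, context: Optional[Dict[str, Any]]) -> Iterable[Dict[str, Any]]:
        # A real stream would call the API here; this just echoes its inputs.
        yield {"stream": type(self).__name__, "parent_id": (context or {}).get("id")}


class ListItemChildStream(BaseStream):
    """Child stream tied to one known value of the parent's array property."""

    list_item: str = ""  # hypothetical attribute: the array value this stream handles

    def get_records(self, context: Optional[Dict[str, Any]]) -> Iterable[Dict[str, Any]]:
        # Skip the sync when this parent record does not list our item.
        if not context or self.list_item not in (context.get("my_list") or []):
            return []
        return super().get_records(context)


class AChildStream(ListItemChildStream):
    list_item = "a"


class BChildStream(ListItemChildStream):
    list_item = "b"
```

Each subclass only declares which value it handles, so the conditional logic lives in one place.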
a
That approach (Approach 1) would also work. 👍
a
then Approach 2 being yeah.. some kind of auto-populated magic that I can’t quite get my head around and probably isn’t possible.
a
In that case (approach 1), you could short-circuit (basically just abort) the child stream if the context passed from the parent is not inclusive of this type of child.
a
Ok great, that should work fine. Any idea what that abort logic would look like?
a
One sec... yeah, let me check the docs real quick.
Probably easiest in the child's get_records() call: if the context passed to that method signals there's nothing to pull, then just return. Otherwise, you can call the base implementation.
I'm remembering I actually did something similar for GitHub issue comments. If we could tell from the parent context that there were zero comments, then there was no point in calling the issue comments endpoint.
a
Ok nice, I think that makes sense. So probably something like…
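from typing import Any, Dict, Iterable, Optional

# Method on the child stream class that handles item 'a'.
def get_records(self, context: Optional[dict] = None) -> Iterable[Dict[str, Any]]:
    """Return a generator of row-type dictionary objects.

    Each row emitted should be a dictionary of property names to their values.
    """
    # Skip this stream when the parent's list does not include 'a'.
    # `my_list` may be missing or None, so fall back to an empty list.
    if context is not None and "a" not in (context.get("my_list") or []):
        self.logger.debug(f"'a' not in parent's my_list. Skipping '{self.name}' sync.")
        return []

    return super().get_records(context)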
Thanks @aaronsteers! I thought I was just taking a gingerly stroll around the base of the learning curve here.. and now apparently I’m free soloing up the cliff face 😛
The SDK has been great to work with though!
a
Yeah - no worries at all. And thanks for the positive feedback! The way I think of it, there are always some edge case topics where it can be easy to feel like you're "off the beaten path". In those cases, it's nice to have confirmation that a certain approach is worth trying.
Seems like maybe we could put more of these into the code samples and make it easier for the next person. (Note to self!) 🙂