Is there a way to set a “where” condition for a ta...
# plugins-general
d
Is there a way to set a “where” condition for a table select for non-translog based syncs? Like only sync data where a column = X identifier?
t
The two ways I know of to do this are: 1. defining a view that includes the WHERE clause and selecting from the view instead of the table; 2. using a mapper that filters out the undesirable data. The former is more efficient from a database/data-movement perspective; the latter works when you can't modify the source (or if the source isn't a relational database...).
But there may be other options. It depends on the tap you're using too; there's no reason a tap couldn't implement such a feature. 🤷🏻‍♂️
d
What do you mean with a mapper on #2?
Unfortunately we cannot modify the source PG database
t
I haven't done it myself, to be honest, but I believe you can use the map-transformer to define a filter that will remove records before they're passed to the target.
d
oh interesting
i wasn’t aware of this
ty!
t
My pleasure 🙂
c
Yeah. I agree. A mapper would do the trick. Your pipeline will still extract the unwanted rows, but your target will only get the desired rows.
Setting up the mapper config takes a bit of getting used to, but once you get the hang of it, it's really powerful.
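To make the idea concrete, here's a rough Python sketch of what a filtering mapper does conceptually. It is not the map-transformer's actual config or code, just an illustration: Singer messages pass through line by line, and RECORD messages that fail the condition are dropped before the target sees them. The stream name `public-orders`, the column `tenant_id`, and the value `"X"` are all made up for the example.

```python
import json
from typing import Any, Iterable, Iterator


def filter_records(
    lines: Iterable[str], stream: str, column: str, wanted: Any
) -> Iterator[str]:
    """Pass Singer messages through, dropping RECORDs in `stream`
    whose `column` value is not equal to `wanted`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        is_unwanted = (
            msg.get("type") == "RECORD"
            and msg.get("stream") == stream
            and msg.get("record", {}).get(column) != wanted
        )
        if is_unwanted:
            continue  # the tap extracted this row, but the target never gets it
        yield line  # SCHEMA, STATE, and matching RECORD messages are kept


# Hypothetical tap output: stream and column names are assumptions.
sample = [
    json.dumps({"type": "SCHEMA", "stream": "public-orders", "schema": {}, "key_properties": ["id"]}),
    json.dumps({"type": "RECORD", "stream": "public-orders", "record": {"id": 1, "tenant_id": "X"}}),
    json.dumps({"type": "RECORD", "stream": "public-orders", "record": {"id": 2, "tenant_id": "Y"}}),
    json.dumps({"type": "STATE", "value": {"bookmarks": {}}}),
]

kept = list(filter_records(sample, "public-orders", "tenant_id", "X"))
for line in kept:
    print(line)
```

Note that the row with `tenant_id` "Y" is still read from the source; it just gets discarded in between, which is the trade-off versus a view.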
p
I think what has been said is great. One extra potential approach, if I'm understanding correctly, is to set your bookmarks for a table using `state set`. I think the tap would then see on the next run that state exists and run an incremental sync starting at those bookmarks.
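As a sketch of what that could look like, assuming "state set" here means Meltano's `meltano state set` command: the snippet below only builds a state payload and prints the command, it doesn't run anything. The state ID, the stream ID `public-orders`, the replication key `updated_at`, and the bookmark field names are all assumptions; the exact bookmark layout depends on the tap, so it's worth inspecting the existing state for your pipeline first and copying its shape. Also, a bookmark only moves the starting point of an incremental sync; it doesn't filter on an arbitrary column.

```python
import json
import shlex


def build_state_payload(stream_id: str, replication_key: str, start_value: str) -> dict:
    """Build a Singer-style bookmark wrapped the way Meltano stores state.
    The inner field names are tap-specific and assumed here."""
    return {
        "singer_state": {
            "bookmarks": {
                stream_id: {
                    "replication_key": replication_key,
                    "replication_key_value": start_value,
                }
            }
        }
    }


# All values below are hypothetical placeholders.
payload = build_state_payload("public-orders", "updated_at", "2024-01-01T00:00:00+00:00")
state_id = "dev:tap-postgres-to-target-jsonl"

# Print the command rather than executing it, so it can be reviewed first.
command = "meltano state set " + shlex.quote(state_id) + " " + shlex.quote(json.dumps(payload))
print(command)
```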