# troubleshooting
benjamin_maquet
Hi team, anyone encountered a case where 2 taps (e.g. `tap-salesforce` and `tap-zendesk`) share the same `stream` name? How do you deal with such cases? The target would write to the same file/table, wouldn't it? We haven't encountered it yet, but thinking about the future and wondering what the community would do in such cases…? Thanks!
dan_ladd
In my case, target-bigquery allows a `table_prefix` in the config, so our tables end up being `zendesk_users` and `salesforce_users`.
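As a minimal sketch of that approach (assuming your target-bigquery variant exposes the `table_prefix` setting described above; the prefix values are just examples):

```
# Set a static prefix in the project config
# ("meltano config <plugin> set <name> <value>" is standard Meltano)
meltano config target-bigquery set table_prefix zendesk_

# Or override it for a single run via Meltano's
# <PLUGIN_NAME>_<SETTING_NAME> environment-variable convention
TARGET_BIGQUERY_TABLE_PREFIX=salesforce_ meltano elt tap-salesforce target-bigquery
```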
nick_hamlin
I ran into this with redshift and addressed it by having the target put each data source in a separate schema. More details/context in this thread: https://meltano.slack.com/archives/CFG3C3C66/p1617224371207200?thread_ts=1617223616.203600&cid=CFG3C3C66
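A rough sketch of that pattern (the schema setting name varies by target-redshift variant, so treat `TARGET_REDSHIFT_SCHEMA` as illustrative):

```
# Route each source into its own Redshift schema by overriding the
# target's schema setting per pipeline (setting name is illustrative)
TARGET_REDSHIFT_SCHEMA=zendesk    meltano elt tap-zendesk target-redshift
TARGET_REDSHIFT_SCHEMA=salesforce meltano elt tap-salesforce target-redshift
```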
benjamin_maquet
thanks! @dan_ladd so do you maintain one `target-bigquery` for each of your taps? Or do you pass the `table_prefix` at runtime using Meltano? If the latter, how do you do it?
@nick_hamlin we also rely on plugin inheritance. We are building Meltano on Docker, and when we have a lot of targets, the Docker image size gets really big, which is causing us some issues. So we are trying to reduce the number of targets we install 🙂 A nice solution would be the ability to add/update a target config at runtime (in the same way that `--select` can be used to select an object/stream, it would be nice to be able to specify a schema/table name at runtime, without needing to install multiple targets and relying on plugin inheritance…). @douwe_maan, would love your opinion on this!
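For reference, the inheritance pattern being discussed looks roughly like this (plugin names are just examples; an inheriting plugin can reuse the parent's installation unless it declares its own `pip_url`):

```
# Create a second loader that inherits target-bigquery's installation
# but carries its own configuration
meltano add loader target-bigquery--zendesk --inherit-from target-bigquery
meltano config target-bigquery--zendesk set table_prefix zendesk_
```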
dan_ladd
The latter: we have our Meltano project wrapped up in Docker and run it for each job, passing in the prefix at runtime.
benjamin_maquet
How are you passing this prefix at runtime? Do you manually pass a `config.json` and have the prefix in there? AFAIK only a few options are allowed at runtime and they are listed here. Am I missing something? 😛
dan_ladd
We pass it as an environment variable, `TARGET_BIGQUERY_TABLE_PREFIX`, to the image.
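A sketch of what that invocation could look like (the image name and job ID are hypothetical):

```
# Meltano maps <PLUGIN_NAME>_<SETTING_NAME> env vars onto plugin settings,
# so the prefix can be injected per container run
docker run -e TARGET_BIGQUERY_TABLE_PREFIX=salesforce_ \
  my-meltano-image \
  meltano elt tap-salesforce target-bigquery --job_id salesforce-to-bigquery
```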
benjamin_maquet
Got it! I think we should be able to implement that as well. Thanks a lot!
c
@benjamin_maquet we use Chamber, which lets us easily keep all taps and targets in one Docker image but then hydrate specific env vars for each `elt` run, including overriding the schema name if we want.
If you’re wondering how that works, you set up parameter store like this:
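A sketch with illustrative keys and values, following the `chamber write` syntax shown further down:

```
# Vars shared by every pipeline (MELTANO_DATABASE_URI points Meltano's
# system database at the separate RDS Postgres mentioned below)
chamber write meltano MELTANO_DATABASE_URI postgresql://user:pass@host:5432/meltano

# Target-level vars
chamber write meltano/target-redshift TARGET_REDSHIFT_BATCH_SIZE_ROWS 500

# Tap-level vars, hydrated last so they can override target vars per-tap
# (the schema setting name is illustrative)
chamber write meltano/tap-zendesk TARGET_REDSHIFT_SCHEMA zendesk
```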
And then execute your container like this:
```
chamber exec meltano meltano/target-redshift meltano/tap-zendesk -- meltano elt tap-zendesk target-redshift --job_id zendesk-to-redshift
```
so that will hydrate all the `meltano/` vars first, followed by `meltano/target-redshift` vars next, and then `meltano/tap-zendesk` last (this allows you to override target vars per-tap). Chamber doesn't recurse downwards.
then you build a Docker image that's agnostic to anything and simply controlled by your container scheduler + Chamber
oh and it’s super easy to just set the vars from your own dev machine:
```
chamber write meltano/tap-zendesk TARGET_REDSHIFT_BATCH_SIZE_ROWS 500
```
Make sure you set up a separate RDS Postgres for persistence, and you're in a world of such pure containers it'll make you cry 🙂