# best-practices
a
Could I use plugin inheritance to run exactly the same tap, but with two different `select:` configurations? Reasoning: one of my API streams takes a long time to sync, so I would like to run just this stream every day, and then run every other stream every 3 hours or so. I think this might run the risk of getting the state mixed up, as the two plugins would need different names. So I would have to make the streams exclusive either to the parent plugin or the inherited one, not both. Happy to hear if there's a better way to do it!
p
@Andy Carter yes! I think that's a pretty common pattern.
> So I would have to make the streams exclusive either to the parent plugin or the inherited one, not both.
That's probably the way I would do it. Exclude that one stream from the 3 hr frequency parent tap, and then the inherited tap could select just that single stream, so there's no overlap to worry about.
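A minimal `meltano.yml` sketch of that split might look like the following. The plugin names `tap-source` and `tap-source-slow` and the stream name `slow_stream` are placeholders, not names from this thread, and the sketch assumes the child's `select` replaces the parent's rather than merging with it:

```yaml
plugins:
  extractors:
    # Parent tap: scheduled every ~3 hours, everything except the slow stream.
    - name: tap-source            # hypothetical plugin name
      select:
        - '*.*'
        - '!slow_stream.*'        # hypothetical stream name, excluded here
    # Inherited tap: scheduled daily, only the slow stream.
    - name: tap-source-slow       # hypothetical plugin name
      inherit_from: tap-source
      select:
        - 'slow_stream.*'
```

Since the two plugins have different names, each run should end up with its own state, and because the selections don't overlap, neither state should carry bookmarks for the other's streams.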
a
I wonder if 'use cases for plugin inheritance' might be a good docs section, if it doesn't exist already. Up until now I've only ever thought about it a bit like abstract base classes: you wouldn't actually run the parent tap, just subclass it multiple times to save duplicating config.
t
@Sven Balnojan 👆
a
Another fun use case for inheritance that I have in place: testing configuration without moving (very much) data around. This looks like:
• In my source DB I built a very tiny table
• I write a parent tap that I intend to use in production (call it `tap-source-parent`)
• I write a child tap to select just that small table and inherit every other property from the parent (call it `tap-source-child`)
Why do this? A few reasons:
1. I can design that table however I want, so that I can test how certain data types get translated from tap to target, a thing that has caused me some confusion before.
2. It's an incredibly fast way of ensuring that my configuration is exactly where I need it to be without actually touching any of the production tables.
3. It gives me the basic layout of the state JSON (and the naming convention of the state ID) in case I want to manually update state for the larger job (e.g., for picking up where you left off on already existing tables).
p
@anthony_shook that is another good use case! There's also the option to use environments for something like this: you could have a `test` Meltano environment where you override the select criteria for the tap to only select from the test table, and then in `prod` you don't override it. Then instead of calling `meltano run tap-source-parent target-x` and `meltano run tap-source-child target-x`, you'd toggle it using `meltano --environment=test run tap-source target-x` and `meltano --environment=prod run tap-source target-x`. I'm not sure if there are any major pros/cons to doing it one way over the other though, just more options.
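As a rough sketch of the environment approach, the override could live under `environments` in `meltano.yml`. The table name `tiny_test_table` is a placeholder, and this assumes `tap-source` is already defined under the top-level `plugins` section:

```yaml
environments:
  - name: test
    config:
      plugins:
        extractors:
          - name: tap-source
            select:
              - 'tiny_test_table.*'   # hypothetical table name
  # prod declares no override, so the tap's own select applies.
  - name: prod
```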
a
Totally true! It’s 100% a semantic/preference thing for me (and, admittedly, a laziness on my part for managing multiple environments :P)
a
Just for more food for thought, consider also just using env vars:
TAP_WHATEVER__SELECT='["someStream.*"]' meltano run ...
(or even dynamically select a single property)
TAP_WHATEVER__SELECT='["someStream.idColumn"]' meltano run ...
I think it's leaner than inheritance and more YAML when all you want is to modify `select` on the fly, and it requires no commits to the repo or other changes. But either way works! I like the environment route too if it's a consistent thing all plugins will adhere to, like Pat's `environment=test`.
a
That is another nice solution! I didn't think to use `select` as an env var; I guess I only ever considered env vars for secrets. TMTOWTDI!