# best-practices
n
Does anyone have advice on best practices for setting `job-id`s? In my local testing, I’ve basically just been using a constant job-id because it’s straightforward, but I can tell that particular “job” now has lots of various tables’ states tracked. This isn’t necessarily a problem (I don’t think), but it seems a bit clunky. OTOH, I could change the job id to reflect the specific table I’m syncing, but that seems like an optimization too far in the other direction. I’m curious what others typically do/would recommend here!
t
I don’t know that we’ve come up with a set of best practices for that! I think there are a lot of things we can do to make the `job-id` setting more usable overall, since it feels heavy to maintain a bunch of job-ids for many separate jobs…
I am hoping to spend a bit of time on this topic this week (https://gitlab.com/meltano/meltano/-/issues/2574). We don’t currently do a good job of highlighting this powerful feature in the docs, and its use is a bit confusing for sure.
d
Since a pipeline's state is specific to the tap and target used and how both are configured (including which streams are selected), I'd recommend using a different Job ID for each different tap/target/config combination. Also, `meltano elt` will fail if a pipeline with the same Job ID is already running, so you can't have multiple pipelines with overlapping schedules with the same Job ID.
So I'd recommend a unique Job ID for each scheduled pipeline, which in your case could be the same tap/target with different selected streams/tables.
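If it helps, here is a rough sketch of one way to generate a Job ID per tap/target/config combination rather than maintaining them by hand. This is not something Meltano provides: `make_job_id`, its parameters, and the naming scheme are all hypothetical, and the plugin and stream names in the usage note are placeholders.

```python
import hashlib
import re


def make_job_id(tap, target, label="", streams=()):
    """Build a deterministic Job ID for one tap/target/config combination.

    Hypothetical helper, not part of Meltano. `label` is a free-form
    schedule name (e.g. "hourly"); `streams` is the set of selected streams.
    """
    parts = [tap, "to", target]
    if label:
        parts.append(label)
    if streams:
        # Sort first so the same selection always yields the same digest,
        # regardless of the order the streams were listed in.
        joined = ",".join(sorted(streams))
        parts.append(hashlib.sha256(joined.encode("utf-8")).hexdigest()[:8])
    # Collapse anything outside letters, digits, "_" and "-" into a dash.
    return re.sub(r"[^A-Za-z0-9_-]+", "-", "-".join(parts)).strip("-").lower()
```

For example, `make_job_id("tap-postgres", "target-snowflake", "hourly")` and `make_job_id("tap-postgres", "target-snowflake", "nightly")` give two distinct IDs, so the two schedules keep separate state and never collide if their runs overlap. Passing `streams=` adds a short digest, so changing the selected tables also produces a new ID.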
n
Thanks guys, super helpful! That’s generally where I was starting to lean as well (e.g. separate job_ids for “hourly”, “nightly”, etc.)