or_barda
04/18/2022, 4:37 PM
v1.77.0. I also saw a discussion on this in the following thread.

or_barda
04/18/2022, 5:53 PM
Another 'elt_my_meltano' pipeline is already running which started at 2022-04-05 06:23:52.597229. To ignore this check use the '--force' option.
nick_hamlin
04/18/2022, 10:32 PM

or_barda
04/19/2022, 12:50 PM

douwe_maan
04/19/2022, 1:35 PM

aaronsteers
04/19/2022, 6:01 PM
1. --force?
2. Is there a specific way the jobs are aborting which is causing the job record to be in an orphaned/abandoned state?
nick_hamlin
04/19/2022, 6:06 PM
1. Using --force for regularly scheduled jobs is an antipattern, since it has the potential for jobs to "step on each other's toes" in unexpected ways. It makes sense to me why that would be something to avoid, but please correct me if that's not the case.
2. Yes. As far as I can tell, this happens when a job is running "directly" via Airflow (as opposed to the Meltano UI's wrapping of Airflow) and an issue occurs that disrupts that job while it's in process. For us, this most recently happened when we had an issue with the AWS hardware on which the underlying Meltano Postgres DB was running and it lost its connection to the running Airflow service (it's been pretty infrequent - maybe coming up once or twice since I put the original issue in?).
douwe_maan
04/19/2022, 6:26 PM

or_barda
04/19/2022, 6:27 PM
The --force flag is a backdoor for special cases like this, but that should happen only in rare cases. My concern is that when a task is stopped in Airflow, Meltano is not shutting down properly and I have to use this flag, even though Airflow does send SIGTERM when shutting down the task.
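To illustrate the shutdown behavior being discussed, here is a minimal sketch only (not Meltano's actual signal handling; the mark_job_failed helper is hypothetical): a long-running process can trap the SIGTERM that Airflow sends when it stops a task, so the job record is updated before the process exits instead of being left looking like it is still running.

```python
import signal
import sys
import time

def mark_job_failed():
    # Hypothetical placeholder: in a real pipeline this would update the
    # job's row in the system database (e.g. set its state to FAILED).
    print("marking job as failed before exit")

def handle_sigterm(signum, frame):
    # Airflow sends SIGTERM when it stops a task; clean up here so the
    # job record does not stay in a "running" state.
    mark_job_failed()
    sys.exit(143)  # conventional exit code for termination by SIGTERM

signal.signal(signal.SIGTERM, handle_sigterm)

# Stand-in for the long-running extract/load work.
while True:
    time.sleep(1)
```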
douwe_maan
04/19/2022, 6:29 PM

douwe_maan
04/19/2022, 6:29 PM

nick_hamlin
04/19/2022, 6:34 PM
I've seen the "--force" message, waited a few minutes, and had everything work fine.
aaronsteers
04/19/2022, 6:45 PM
"My concern is that when a task is stopped in Airflow, Meltano is not shutting down properly"
Second, we can check if stale detection is still working correctly to ignore/clear jobs with a heartbeat older than 5 minutes. (Might need to check timezone logic to make sure that's not a factor here.) And lastly, just to confirm: are we still aligned that within five minutes of a job being canceled, the --force flag is still okay to use in these cases? (The case being one where the person running the job is confident that their last job is no longer running, although the timeframe for stale detection may not yet have been reached.)
Does that sound right? I'll log (or dig up!) an issue on those two first points if so.
douwe_maan
04/19/2022, 6:45 PM
"And lastly, just to confirm: are we still aligned that within five minutes of a job being canceled, the --force flag is still okay to use in these cases?"
Agreed, it's a valid workaround when stale detection hasn't triggered yet.
or_barda
04/19/2022, 6:49 PM
If I run meltano run elt using a DockerOperator and this task is shut down by Airflow sometime in the middle, what is the expected behavior?
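For context, a minimal sketch of the kind of setup being described: an Airflow DAG that runs Meltano inside a container via DockerOperator. The DAG name, image name, plugin names, and schedule are made-up placeholders, not details from this conversation. If Airflow stops the task mid-run, the container receives SIGTERM and the Meltano job record may be left in a "running" state until the stale-job check clears it.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id="meltano_elt_example",  # hypothetical DAG name
    start_date=datetime(2022, 4, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    run_elt = DockerOperator(
        task_id="elt_my_meltano",
        image="my-meltano-project:latest",                 # placeholder image
        command="meltano elt tap-example target-example",  # placeholder plugins
        auto_remove=True,
    )
```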
douwe_maan
04/19/2022, 6:50 PM
The stale job check runs as part of meltano elt, meltano schedule --list and a few others (possibly meltano run, @aaronsteers?). Meltano then marks all jobs with a heartbeat older than 5 minutes as failed. So it does depend on one of those other commands running semi-regularly.
douwe_maan
04/19/2022, 6:50 PM
Possibly also as part of meltano run, but I haven't checked.
or_barda
04/19/2022, 6:53 PM
I meant meltano elt, not meltano run elt.

aaronsteers
04/19/2022, 6:54 PM
meltano elt, but thanks for confirming.
aaronsteers
04/19/2022, 6:55 PM

douwe_maan
04/19/2022, 6:56 PM
meltano elt definitely runs the stale job check, so that's not the issue here, although it's still good to ensure meltano run does it too.
or_barda
04/19/2022, 7:00 PM

douwe_maan
04/19/2022, 7:01 PM

or_barda
04/19/2022, 7:03 PM
v1.77.0

douwe_maan
04/19/2022, 7:04 PM

or_barda
04/19/2022, 7:04 PM

aaronsteers
04/21/2022, 3:57 PM