Hi What is a proper approach to re run failed or interrupted Meltano #troubleshooting

Denis I.

04/05/2023, 7:34 PM

Hi, What is a proper approach to re-run failed or interrupted tap with the last successful state? (for `run`/`elt` commands)

visch

04/05/2023, 8:11 PM

`meltano run tap-name target-name`

dima_petukhov

04/06/2023, 7:43 AM

`run` will not set intermediate states (only once, at the end of the sync), and `elt` will update states each time it writes a batch, as I understand it.

Denis I.

04/06/2023, 11:32 AM

Right, with the interrupted `run` I get additional keys in the state object (`replication_key_signpost`, `starting_replication_value`, `progress_markers`) and the previous `replication_key_value`, with a note "Progress is not resumable if interrupted." The subsequent `run` started from the same `replication_key_value` and finished correctly. That’s good. So that means it’s safe to re-run `run` in a pipeline in case of an interruption or failure. Is there any scenario in which the tap could be forced to re-sync everything from scratch?
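A minimal sketch of the resume behavior described above. Only the key names come from the message; the top-level `bookmarks` layout, the stream name `orders`, the `updated_at` key, all values, and the helper `resume_value` are hypothetical and chosen for illustration.

```python
# Hypothetical state left behind by an interrupted run. Only the key names
# (replication_key_value, progress_markers, ...) come from the thread; the
# surrounding layout and all values are assumptions.
interrupted_state = {
    "bookmarks": {
        "orders": {
            "replication_key": "updated_at",
            "replication_key_value": "2023-04-01T00:00:00Z",  # last finalized value
            "starting_replication_value": "2023-04-01T00:00:00Z",
            "replication_key_signpost": "2023-04-05T19:00:00Z",
            "progress_markers": {
                "Note": "Progress is not resumable if interrupted.",
                "replication_key": "updated_at",
                "replication_key_value": "2023-04-03T12:00:00Z",
            },
        }
    }
}


def resume_value(state, stream):
    """Return the value the next run should start from.

    Assumption: in-flight progress_markers are ignored and only the
    finalized replication_key_value is trusted.
    """
    bookmark = state.get("bookmarks", {}).get(stream, {})
    return bookmark.get("replication_key_value")
```

Under these assumptions the next run starts from the finalized value rather than from the in-flight marker, which matches the observation that the subsequent `run` started from the same `replication_key_value`.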

pat_nadolny

04/06/2023, 12:26 PM

TLDR: re-running the pipeline is safe and recommended. You can ask it to re-sync from scratch using `--full-refresh`, but it won’t randomly do that on its own.
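Since re-running is described as safe, a pipeline could simply retry the same command. The sketch below is one possible wrapper, not something from the thread: the helpers `build_command` and `run_with_retries` and the retry count are hypothetical, and `tap-name`/`target-name` are the placeholders used earlier.

```python
import subprocess


def build_command(tap, target, full_refresh=False):
    """Build a `meltano run` invocation; --full-refresh ignores saved state."""
    cmd = ["meltano", "run"]
    if full_refresh:
        cmd.append("--full-refresh")
    return cmd + [tap, target]


def run_with_retries(cmd, attempts=3, runner=subprocess.run):
    """Re-run the same command until it exits with code 0.

    Returns the number of attempts used; raises RuntimeError if every
    attempt fails. `runner` is injectable so the logic can be tested
    without Meltano installed.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")
```

Usage would look like `run_with_retries(build_command("tap-name", "target-name"))`; each retry picks up from whatever state the previous attempt managed to save.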

pat_nadolny

04/06/2023, 12:28 PM

The tap implementation manages how to communicate its state progress. I think at least one reason for the “Progress is not resumable” warning is that the records aren’t guaranteed to be sorted, meaning that even though record 2 was received by the target, record 1 isn’t guaranteed to be loaded already. In this case the tap waits until the sync completes to safely save state. If it was interrupted, it won’t save 2 as the state because of this lack of a guarantee.
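To illustrate the sorted vs. unsorted point, here is a toy simulation. It is not the tap's actual code: the function `sync_stream` and its parameters are hypothetical, and integers stand in for replication key values.

```python
def sync_stream(records, is_sorted, fail_after=None):
    """Simulate a sync and return the bookmark that is safe to persist.

    records: replication key values in arrival order.
    is_sorted: whether the stream guarantees ascending order.
    fail_after: simulate an interruption after this many records.
    """
    saved = None     # bookmark that survives an interruption
    max_seen = None  # in-flight progress, not trusted for unsorted streams
    for count, value in enumerate(records, start=1):
        max_seen = value if max_seen is None else max(max_seen, value)
        if is_sorted:
            # Everything below this value has already been seen.
            saved = max_seen
        if fail_after is not None and count >= fail_after:
            return saved  # interrupted: only the safe bookmark remains
    return max_seen  # completed: the full progress can be finalized
```

With an unsorted stream that delivers 2 before 1, an interruption after the first record leaves no new bookmark, so the next run starts from the previous state and record 1 is not skipped.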

pat_nadolny

04/06/2023, 12:31 PM

Targets are implemented to “emit” state only once they confirm a batch is successfully loaded in the destination. That means if the streams are “resumable”, a failure after a few batches have loaded successfully will still have moved the state forward, and the next sync will not resync those batches because they’re confirmed to be in the destination already.
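The emit-after-load idea can be sketched as a small in-memory target. This is an illustration under stated assumptions, not any real target's implementation: the class `BatchingTarget`, its methods, and the list-based "destination" are all hypothetical.

```python
class BatchingTarget:
    """Toy target that emits state only after the covering batch is loaded."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []          # records received but not yet loaded
        self.pending_state = None
        self.loaded = []          # stands in for the destination table
        self.emitted_states = []  # stands in for state messages sent downstream

    def handle_record(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def handle_state(self, state):
        if self.buffer:
            # Records covered by this state are still unloaded: hold it back.
            self.pending_state = state
        else:
            self.emitted_states.append(state)

    def flush(self):
        # "Load" the batch first, then release the state it covers.
        self.loaded.extend(self.buffer)
        self.buffer.clear()
        if self.pending_state is not None:
            self.emitted_states.append(self.pending_state)
            self.pending_state = None
```

If the process dies while records sit in `buffer`, the state covering them was never emitted, so the next sync re-sends them instead of skipping them.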

dima_petukhov

04/06/2023, 12:33 PM

@pat_nadolny For example, I have `[info ] INFO Writing table batch with 49 rows for ('orders', 'refunds') ... cmd_type=loader name=target-postgres` and I don't know if the state will be updated. I have a lot of `Writing table batch` lines, but the state table is still empty. I've used the command `elt tap-shopify target-postgres --state-id=XXX`

visch

04/06/2023, 12:34 PM

Which target postgres @dima_petukhov?

dima_petukhov

04/06/2023, 12:34 PM

datamill-co

visch

04/06/2023, 12:42 PM

https://github.com/datamill-co/target-postgres/blob/master/target_postgres/stream_tracker.py This target doesn't emit state along the way. They have a comment there saying Singer is the reason, which is wrong. You can try the MeltanoLabs variant, and you should see state properly updated along the way.

dima_petukhov

04/06/2023, 12:45 PM

@visch The MeltanoLabs variant is not stable, as I understand it

pat_nadolny

04/06/2023, 12:50 PM

One distinction is between a target emitting state and Meltano writing it into the DB. I’m not positive how this works (cc @cody_hanson would know), but my understanding is that Meltano collects these state messages during a sync and only writes them to the DB when the sync exits, whether it completed or failed. So I wouldn’t expect the DB to be updated mid-sync, but that doesn’t mean state isn’t being tracked. @cody_hanson can you clarify how it works?
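The collect-then-write behavior described here (which the author flags as unconfirmed) could look roughly like the sketch below. It mirrors that stated understanding only and is not Meltano's actual code: `InMemoryStateStore`, `run_sync`, and the message shapes are hypothetical.

```python
class InMemoryStateStore:
    """Hypothetical stand-in for the state table in the system database."""

    def __init__(self):
        self.saved = {}

    def write(self, state_id, state):
        self.saved[state_id] = state


def run_sync(state_id, target_output, store):
    """Collect emitted states; persist the latest one when the sync exits."""
    latest = None
    try:
        for message in target_output:
            latest = message  # nothing is written to the store mid-sync
    finally:
        # Runs both on normal completion and when the sync raises.
        if latest is not None:
            store.write(state_id, latest)
```

Under this assumption an empty state table during a sync is expected, and a state row appears only once the process exits, including when it exits with an error.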

dima_petukhov

04/06/2023, 1:03 PM

@visch `elt` hit an error when I uninstalled the loader, and, magically, it wrote a state before quitting with the error. How could we gracefully stop a running program? Which keyboard shortcut is better to use in this case?

visch

04/06/2023, 1:14 PM

@dima_petukhov if you use the MeltanoLabs target, you'd be helping us get closer to stable. I run it myself in production for 4 separate projects that run all the time

Matt Menzenski

05/05/2023, 7:03 PM

I think I’m running into this too https://meltano.slack.com/archives/C01TCRBBJD7/p1683308663819219

visch

05/05/2023, 7:04 PM

Which `target` are you using, @Matt Menzenski? This thread was using pipelinewise

Matt Menzenski

05/05/2023, 7:04 PM

I’m on MeltanoLabs target-postgres

visch

05/05/2023, 7:04 PM

Not the same thing then 😕
