Hello I have a question In Meltano you increment the state w Meltano #troubleshooting


juan_sebastian_suarez_valencia

04/21/2021, 8:58 PM

Hello, I have a question. In Meltano, the state is incremented when the ELT is run. However, I thought it was best practice to update it at the end, to be sure that the pipeline ran correctly. I just had an error with a pipeline, and the state I am now launching it with is the most recent one, so the pipeline won't take into account the older records from before the error. Furthermore, how can I detect an error in a pipeline using Airflow as an orchestrator?

taylor

04/22/2021, 3:38 PM

Can you share more about the error you're seeing and the commands it was run with? It's possible the target emitted the state, which Meltano would capture, and a failure happened afterwards. Most targets don't validate the state and just pass through what the tap emits.

taylor

04/22/2021, 3:39 PM

For Airflow, you should be able to view the logs directly, or via the UI by invoking the webserver: https://meltano.com/docs/orchestration.html#other-things-you-can-do-with-airflow
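Beyond reading the logs, an orchestrator typically notices a failed run through the exit code of the command it launched. The following is a minimal Python sketch of that check, assuming the pipeline command exits non-zero when it fails. `run_pipeline` is a hypothetical helper written for illustration; it is not part of Meltano or Airflow.

```python
import subprocess
import sys


def run_pipeline(command):
    """Run a command and raise if it exits non-zero (hypothetical helper)."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the tail of stderr so the failure is visible in the task logs.
        raise RuntimeError(
            f"pipeline failed with exit code {result.returncode}: "
            f"{result.stderr[-500:]}"
        )
    return result.stdout
```

If a function like this is called from a Python-based Airflow task, the uncaught exception marks the task as failed, which is what makes the error visible in the Airflow UI.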

juan_sebastian_suarez_valencia

04/23/2021, 1:39 PM

@taylor Here is the problem.

• I launch Meltano like this:

meltano elt tap-hubspot-meister target-bigquery --job_id=hubspot-bigquery

• Then I see that the incremental state has been updated
• But there's a problem in one of the pipelines
• And yet the state was updated even though there was an error

douwe_maan

04/23/2021, 3:40 PM

@juan_sebastian_suarez_valencia If the target outputs that state, that indicates that it actually loaded those records into the destination, and that we don't want to extract them again on the next run, even if the tap failed halfway through and wasn't able to extract some other streams/records. If the tap and target are working correctly, forwarding state only for records that were actually loaded, that state is where we'd like the next run to start off.

douwe_maan

04/23/2021, 3:41 PM

The "Pushing state: {}" you're seeing is a bug in target-bigquery that has been fixed: https://github.com/adswerve/target-bigquery/issues/9 but not released yet. I don't think it affects what you're seeing here, though, since we see a real non-empty state message at the end

douwe_maan

04/23/2021, 3:43 PM

It looks to me like the tap was able to extract a few records in the companies, contacts, and deals streams and wrote the corresponding state, after which the target was able to load those records and forwarded the corresponding state, which Meltano then stored to make sure the next run wouldn't extract and load those records again. The fact that the tap failed later on doesn't change the fact that those earlier records were loaded and don't need to be loaded again. Are you seeing any issues caused by this behavior? It's possible that tap and target are not behaving correctly, and I'm happy to debug that some more, but I think Meltano is right to treat any state output by the target as valid even if the tap or target failed
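The flow described above can be sketched as a small simulation. This is a simplified illustration only, not Meltano's actual implementation: `tap`, `target`, and `run` are hypothetical stand-ins, and the message dictionaries are modeled loosely on Singer RECORD and STATE messages. It assumes the target forwards a state only after loading the records that came before it.

```python
def tap():
    """Hypothetical tap: emits records and state, then fails partway through."""
    yield {"type": "RECORD", "stream": "companies", "record": {"id": 1}}
    yield {"type": "STATE", "value": {"companies": 1}}
    yield {"type": "RECORD", "stream": "contacts", "record": {"id": 7}}
    yield {"type": "STATE", "value": {"companies": 1, "contacts": 7}}
    yield {"type": "RECORD", "stream": "deals", "record": {"id": 3}}
    raise RuntimeError("tap failed before emitting state for deals")


def target(messages, destination):
    """Hypothetical target: loads records, then forwards the state that follows."""
    for message in messages:
        if message["type"] == "RECORD":
            destination.append(message["record"])
        elif message["type"] == "STATE":
            # Everything before this state has been loaded, so forward it.
            yield message["value"]


def run(destination):
    """Hypothetical runner: keeps the last state the target emitted, even on failure."""
    stored_state = None
    failed = False
    try:
        for state in target(tap(), destination):
            stored_state = state
    except RuntimeError:
        failed = True
    return stored_state, failed
```

In this sketch, `run([])` reports a failure but still returns the state covering companies and contacts. The deals record was loaded without a matching state, so a later run would extract it again rather than skip it.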
