I am stuck in a loop using Meltano to copy data from Mongo to Snowflake (Meltano #troubleshooting)

tim_suh

05/25/2023, 1:02 PM

I am stuck in a loop using Meltano to copy data from Mongo to Snowflake. I have set the replication strategy to incremental, but when a Meltano run fails (I'm not sure exactly why, maybe Mongo is busy; the reason is not important), Meltano seems to start from the beginning when restarted, even though the state in meltano.db shows the latest replication key. Is this the expected behavior?

thomas_briggs

05/25/2023, 1:14 PM

What command are you using to run meltano? If you're using elt then you need to specify a job name for it to track state. If you're using run it should do it automatically though.

tim_suh

05/25/2023, 1:22 PM

meltano run tap-mongodb target-snowflake

tim_suh

05/25/2023, 1:22 PM

That's how I run it

tim_suh

05/25/2023, 1:23 PM

I sqlite-d into meltano.db and can see that the state is up to date. It shows the last replication key. But then when meltano runs again, the replication key is reset.
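For anyone repeating this check without the sqlite shell, here is a minimal Python sketch that lists the tables in a SQLite file and peeks at a few rows. The helper names list_tables and peek are made up for illustration, and the path and table name in the usage comment are assumptions: .meltano/meltano.db is the usual default location, but the table that holds state can differ between Meltano versions, so list the tables first.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def peek(db_path, table, limit=5):
    """Return up to `limit` rows from `table` as dicts keyed by column name."""
    if table not in list_tables(db_path):
        raise ValueError(f"no such table: {table}")
    with closing(sqlite3.connect(db_path)) as conn:
        conn.row_factory = sqlite3.Row
        # Table names cannot be bound as parameters; the name was checked above.
        rows = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,)).fetchall()
    return [dict(row) for row in rows]


# Example usage (path and table name are assumptions, adjust to your project):
#   print(list_tables(".meltano/meltano.db"))
#   print(peek(".meltano/meltano.db", "state"))
```

The built-in meltano state list and meltano state get commands show the same information without touching the database directly.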

thomas_briggs

05/25/2023, 1:26 PM

What do your select rules in meltano.yml look like? Meltano will store the high water mark for each stream at the end of the run but whether or not it retrieves that and uses it for the next run depends on the configuration.

tim_suh

05/25/2023, 1:27 PM

select:
- table1.*
- table2.*
metadata:
  '*':
    replication-method: INCREMENTAL
    replication-key: _id

tim_suh

05/25/2023, 1:29 PM

Isn't that how you configure it?

thomas_briggs

05/25/2023, 1:34 PM

Yeah, that looks right... maybe try replacing the wildcard under metadata with an explicit table name? Also, none of the patterns in our configs are in single quotes... ours are all things like dbo-*

thomas_briggs

05/25/2023, 1:35 PM

Could also be something about Mongo... I haven't worked with that at all

tim_suh

05/25/2023, 1:43 PM

The single quotes were added by Meltano. I did not hand-edit meltano.yml.

thomas_briggs

05/25/2023, 1:49 PM

https://github.com/meltano/meltano/issues/6623

thomas_briggs

05/25/2023, 1:51 PM

I have no proof that's the problem, of course, but it sounds plausible 😉

tim_suh

05/25/2023, 2:52 PM

This is good information. I believe this issue has been resolved in the latest version of tap-mongodb.

mert_bakir

05/25/2023, 3:11 PM

Which tap-mongodb variant are you using, is it the default one?

tim_suh

05/25/2023, 3:16 PM

- name: tap-mongodb
  variant: z3z1ma
  pip_url: git+https://github.com/z3z1ma/tap-mongodb.git

tim_suh

05/25/2023, 3:17 PM

Yes

tim_suh

05/25/2023, 3:21 PM

I can see that Meltano is emitting state as the replication key is updated, but if I kill it and then restart, Meltano goes back to the beginning instead of picking up from where it left off, which should be the last replication_key it knows about.
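To make the expected behavior concrete, here is a rough Python sketch of how an incremental sync is supposed to resume from a saved bookmark. It is not the code of any tap-mongodb variant. The function names starting_value and sync are hypothetical, and the state layout (bookmarks, then stream name, then replication_key_value) follows a common Singer convention that a given tap may not use.

```python
def starting_value(state, stream, default=None):
    """Return the saved replication key value for a stream, or default."""
    bookmark = state.get("bookmarks", {}).get(stream, {})
    return bookmark.get("replication_key_value", default)


def sync(records, state, stream, key):
    """Yield only records newer than the bookmark, advancing it as we go."""
    start = starting_value(state, stream)
    for record in sorted(records, key=lambda r: r[key]):
        if start is not None and record[key] <= start:
            continue  # already replicated in an earlier run
        yield record
        # Record progress so a restart can resume after this record.
        bookmarks = state.setdefault("bookmarks", {})
        bookmarks.setdefault(stream, {})["replication_key_value"] = record[key]
```

If the tap ignores the incoming state, or the state it reads does not match the stream name it expects, starting_value falls back to the default and the sync starts from the beginning, which matches the symptom described here.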

Matt Menzenski

05/25/2023, 4:18 PM

My menzenski variant of tap-mongodb picks up where it left off in this scenario, for what it’s worth. You might see if that works for your use case.

tim_suh

05/25/2023, 4:35 PM

Thanks @Matt Menzenski I will try it today
