# best-practices
a
Random thought problem: Let’s say I have a table with 1 billion+ rows, and for the longest time, I’ve been replicating it on an auto-incrementing `id` column. However, the table is mutable at the source and has an `updated_at` column, so that means I’m not catching changes in the source table once I’ve pulled the at-the-moment value of a row. So my situation is this:
• I want to update my meltano config from using `id` as my replication-key to using `updated_at` as my replication key, with `id` as a value in `table-key-properties` (rough sketch of the intended config below)
• I don’t want to start from the beginning of time, because it’s absolutely too much data to handle, so I’ve got to manually set a date
So the question is: how would you go about it?
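For reference, the change I’m describing looks roughly like this in `meltano.yml` (a sketch only; the extractor name `tap-postgres` and the stream name `public-my_table` are placeholders for whatever the project actually uses):

```yaml
plugins:
  extractors:
    - name: tap-postgres
      metadata:
        # Stream name is a placeholder; use the name your tap reports for the table.
        public-my_table:
          replication-method: INCREMENTAL
          replication-key: updated_at   # switch the bookmark column from id to updated_at
          table-key-properties: [id]    # keep id as the primary key
```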
Fun addendum: I tried doing this. I updated the `yml` file and updated the stream entry in my state table (db, not json). What I got was a nondescript error; in fact, the error didn’t say anything, it just… errored without any message at all.
e
Hi @Anthony Shook! What does the `updated_at` field look like in the db?
a
In the source table (postgres) it’s timestamptz; in the target table (snowflake) it’s timestampntz
(further wrinkle — I have other tables where this works perfectly fine)
e
A safer approach could be to get the current state with `meltano state get ...`, edit the state, apply it with `meltano state set ...`, and run the pipeline. If it fails, you can easily revert with the same steps.
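Something like the following, as a sketch (the state ID, stream name, and date below are placeholders; `meltano state list` shows the real state IDs for your project):

```bash
# 1. Inspect the current state for the pipeline (state ID is a placeholder).
meltano state get dev:tap-postgres-to-target-snowflake

# 2. Write back an edited state whose bookmark uses updated_at and a manually
#    chosen resume date (stream name and date here are examples, not real values).
meltano state set dev:tap-postgres-to-target-snowflake \
  '{"singer_state": {"bookmarks": {"public-my_table": {"replication_key": "updated_at", "replication_key_value": "2024-01-01T00:00:00+00:00"}}}}'

# 3. Run the pipeline; if it misbehaves, repeat get/set to restore the previous state.
meltano run tap-postgres target-snowflake
```

If I remember right, `state set` asks for confirmation before overwriting; `--force` skips that prompt.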
a
that’s fair. I did notice after making the change that I started getting `bad handshake: SysCallError(-1, 'Unexpected EOF')` errors, but I suspect that might be unrelated?
e
Is there more, i.e. a traceback, on that error? It does look like a transient network error.
a
Just gave me the old broken pipe issue — but it worked when I pushed it to ECR, so probably shoddy home internet 🙂
🙌 1