# best-practices
a
Random thought problem: Let’s say I have a table with 1 billion+ rows, and for the longest time, I’ve been replicating it on an auto-incrementing `id` column. However, the table is mutable at the source and has an `updated_at` column, so that means I’m not catching changes in the source table once I’ve pulled the at-the-moment value of a row. So my situation is this:
• I want to update my meltano config from using `id` as my replication-key to using `updated_at` as my replication key, with `id` as a value in `table-key-properties` (rough sketch of the intended config below)
• I don’t want to start from the beginning of time, because it’s absolutely too much data to handle, so I’ve got to manually set a date
So the question is: how would you go about it?
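For reference, the change I’m describing looks roughly like this in `meltano.yml` (a sketch only; the extractor name `tap-postgres` and the stream name `public-my_table` are placeholders for whatever the project actually uses):

```yaml
plugins:
  extractors:
    - name: tap-postgres
      metadata:
        # Stream name is a placeholder; use the name your tap reports for the table.
        public-my_table:
          replication-method: INCREMENTAL
          replication-key: updated_at   # switch the bookmark column from id to updated_at
          table-key-properties: [id]    # keep id as the primary key
```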
Fun addendum: I tried doing this. I updated the `yml` file and updated the stream entry in my state table (db, not json). What I got was a nondescript error; in fact, the error didn’t say anything, it just… errored without any message at all.
e
Hi @Anthony Shook! What does the `updated_at` field look like in the db?
a
In the source table (postgres) it’s timestamptz; in the target table (snowflake) it’s timestampntz
(further wrinkle — I have other tables where this works perfectly fine)
e
A safer approach could be to get the current state with `meltano state get ...`, edit the state, apply it with `meltano state set ...`, and run the pipeline. If it fails, you can easily revert with the same steps.
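Something like the following, as a sketch (the state ID, stream name, and date below are placeholders; `meltano state list` shows the real state IDs for your project):

```bash
# 1. Inspect the current state for the pipeline (state ID is a placeholder).
meltano state get dev:tap-postgres-to-target-snowflake

# 2. Write back an edited state whose bookmark uses updated_at and a manually
#    chosen resume date (stream name and date here are examples, not real values).
meltano state set dev:tap-postgres-to-target-snowflake \
  '{"singer_state": {"bookmarks": {"public-my_table": {"replication_key": "updated_at", "replication_key_value": "2024-01-01T00:00:00+00:00"}}}}'

# 3. Run the pipeline; if it misbehaves, repeat get/set to restore the previous state.
meltano run tap-postgres target-snowflake
```

If I remember right, `state set` asks for confirmation before overwriting; `--force` skips that prompt.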
a
that’s fair. I did notice after making the change that I started getting `bad handshake: SysCallError(-1, 'Unexpected EOF')` errors, but I suspect that might be unrelated?
e
Is there more, i.e. a traceback, on that error? It does look like a transient network error.
a
Just gave me the old broken pipe issue — but it worked when I pushed it to ECR, so probably shoddy home internet 🙂
🙌 1