Is there anywhere I can read (docs or code) about ...
# troubleshooting
a
Is there anywhere I can read (docs or code) about how state is handled using a cloud backend? I'm using a modified `tap-spreadsheets-anywhere` and have a long-running process which should be emitting state messages quite regularly, at the end of each spreadsheet. My state.json is being modified very frequently according to its timestamps, but the content of the file doesn't seem to be changing, and new streams aren't being listed. I'm struggling to get to the end of the process to even see if the final state is recorded correctly. The tap uses the `singer` library, not the Meltano SDK.
v
If I ignore the cloud backend for a second, as that shouldn't change state behavior: the tap emits a STATE message (or many STATE messages). The target takes that STATE message and should (key word here, as targets sometimes decide to do other things, sometimes for great reasons) persist all data that has been sent so far, then emit a state message back to stdout. Only then does Meltano write that state to the backend. Singer does this so you know the data has been persisted before you store the state (you don't want to think you pulled all the data up to today but not actually have it stored in the DB).
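A minimal sketch of that flow, assuming a toy target. `toy_target` and the in-memory `sink` list are hypothetical stand-ins for illustration, not Meltano or `singer` library code. It shows state being echoed to the output only after the records received before it have been persisted:

```python
import io
import json


def toy_target(lines, sink, out):
    """Consume Singer messages; echo a STATE value only after flushing records."""
    buffer = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg["type"] == "RECORD":
            # Hold records until the next STATE message (or end of input).
            buffer.append(msg["record"])
        elif msg["type"] == "STATE":
            # Persist everything received so far *before* acknowledging state.
            sink.extend(buffer)
            buffer.clear()
            # Echo the state value on the output stream (stdout in a real target).
            out.write(json.dumps(msg["value"]) + "\n")
    # Flush records that arrived after the last STATE message.
    sink.extend(buffer)
    buffer.clear()
```

If the target never echoes a state line, whatever reads its output has nothing new to store, even though the tap emitted STATE messages.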
a
OK, so if I can see `STATE` type messages in the output, then the state store should update to reflect that, regardless of the backend chosen? Meltano wouldn't wait until the end of an entire `run` to update the state if a `STATE` message has been emitted during the first of many streams?
Ah, so the target is involved too! I missed that part.
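A rough sketch of one way to check which `STATE` messages a tap actually emits, assuming its stdout has been captured as one JSON message per line. The `summarize_states` helper and the `tap_output.jsonl` file name are hypothetical:

```python
import json


def summarize_states(lines):
    """Return (number of STATE messages, value of the last one or None)."""
    count, last = 0, None
    for line in lines:
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(msg, dict) and msg.get("type") == "STATE":
            count += 1
            last = msg.get("value")
    return count, last


# Example usage (file name is an assumption):
# with open("tap_output.jsonl") as f:
#     print(summarize_states(f))
```

Comparing the last value found here with the stored state can help show whether state is being lost at the tap or further downstream.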
v
Most of the time when I hear of people not having state written as they expect, it's due to the target.
a
All of a sudden my state.json now contains a lot more info! Maybe there was some write lock in place, multiple processes in contention.
I wonder if the multiple streams in my tap were completing so quickly that it was causing issues with the JSON write back to the cloud.
v
If you could replicate your setup somehow, we could put an issue in. It sounds like you're running multiple jobs in parallel.