# singer-tap-development
i
Back again with another question! So the custom tap I'm working on scans the local file system for files to sync. All files belong to the same stream, like a track-record of events per day, with a file for each day. Files should only be synced once to avoid duplication. How should I implement that? The docs recommend against writing to the state yourself, and I can't imagine that this is such a unique case, so what am I missing?
e
When you're processing records from those files, you could get the `mtime` from the file and add it as a metadata field to the record, e.g. `_modified_time`. Then, you can set that as the `replication_key` attribute in your stream class.
👍 1
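A minimal sketch of the idea above, using only the standard library rather than the SDK itself. The helper name `records_with_mtime` and the record fields `line_number` and `raw` are made up for illustration; only `_modified_time` comes from the suggestion. It assumes one record per line of a text file.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator


def records_with_mtime(path: Path) -> Iterator[dict]:
    """Yield one record per line of `path`, tagged with the file's mtime.

    Hypothetical helper: the record layout is an assumption, not SDK API.
    """
    # Read the modification time once per file, as a timezone-aware UTC value.
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            yield {
                "line_number": line_number,
                "raw": line.rstrip("\n"),
                # Every record from the same file carries the same value,
                # so it can serve as the stream's replication key.
                "_modified_time": mtime.isoformat(),
            }
```

In a stream class, something like this would be called from the method that yields records, with `replication_key` set to `"_modified_time"`.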
i
I think that will work! But does that mean that every file will be read entirely?
a
> The docs recommend against writing to the state yourself
I'm curious if that means 'let Meltano persist the state to the state store, don't do it yourself', or actually 'don't manually modify state values in any way'.
i
From the docs: "We do not recommend reading or writing state directly within your tap." I'm taking that to mean the latter. You can still access parts of the state, as found in this example, where the replication-key value from the most recent sync can be accessed. This example also solves my problem!
👍 1
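On the earlier worry about reading every file entirely: once the last synced replication-key value is available, files can be filtered on `mtime` before they are opened. A rough sketch, again standard library only. `files_to_sync` and its `bookmark` parameter are hypothetical names; in the SDK the bookmark would presumably come from something like `get_starting_replication_key_value(context)`, parsed into a timezone-aware datetime, but treat that as an assumption and check it against the example linked above.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator, Optional


def files_to_sync(directory: Path, bookmark: Optional[datetime]) -> Iterator[Path]:
    """Yield files modified after `bookmark`, oldest first, without opening them.

    Hypothetical helper: `bookmark` stands in for the last synced
    `_modified_time`; None means nothing has been synced yet.
    """
    candidates = []
    for path in directory.iterdir():
        if not path.is_file():
            continue
        # Only a stat() call is needed here, so skipped files are never read.
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if bookmark is None or mtime > bookmark:
            candidates.append((mtime, path))
    # Oldest first, so the replication key only ever increases during a sync.
    for _, path in sorted(candidates):
        yield path
```

Two things to keep in mind with this approach: the strict `>` skips a file whose `mtime` equals the bookmark exactly, and a file that is modified after it was synced will be picked up and read again in full.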