Stéphane Burwash
09/30/2024, 3:40 PM
`metadata` parameter.
However, we found that within the tap-stream definition itself, the `replication_method` parameter took precedence over the metadata config, meaning that we could not set a default `replication-method` and `replication-key`.
My question is: Is it possible to set a default replication-method & replication-key which could then be overwritten in the parameters?
Thanks 😄
Stéphane Burwash
09/30/2024, 3:54 PM
Edgar Ramírez (Arch.dev)
09/30/2024, 3:59 PM
"Where no timestamp is returned in incremental syncs"
Hmm, is that the behavior you're seeing? I think on the contrary, we return `None` only on Full-Table syncs.
Are you hardcoding the `replication_method` attribute in your streams? If not, the default replication method is derived from the presence or absence of a replication key:
https://github.com/meltano/sdk/blob/f49f25b1cd530c01d885c50ffac9656ae4ea9c9a/singer_sdk/streams/core.py#L658-L660
Otherwise, the catalog override is respected:
https://github.com/meltano/sdk/blob/f49f25b1cd530c01d885c50ffac9656ae4ea9c9a/singer_sdk/streams/core.py#L656-L657
The replication key also respects any overrides:
https://github.com/meltano/sdk/blob/f49f25b1cd530c01d885c50ffac9656ae4ea9c9a/singer_sdk/streams/core.py#L1270-L1271
Stéphane Burwash
09/30/2024, 4:01 PM
name = "tasks"
path = "/v4/tasks"
primary_keys = ["id"]
records_jsonpath = "$.data[*]"
replication_method = "INCREMENTAL"
replication_key = "updatedDate"
schema = Tasks.schema
And wanted a quick way to modify to `FULL_TABLE` while getting the starting_timestamp from the `start_date`, if possible.
Is it not best practice to hardcode the replication method?
Edgar Ramírez (Arch.dev)
09/30/2024, 4:09 PM
[If you hardcode the] `replication_method` attribute, you won't be able to override it with the catalog because you lose these:
https://github.com/meltano/sdk/blob/f49f25b1cd530c01d885c50ffac9656ae4ea9c9a/singer_sdk/streams/core.py#L656-L657
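To illustrate why the hardcoded attribute wins, here is a rough, self-contained sketch. It is not the SDK's actual code: `Stream` is a stripped-down stand-in, and `HardcodedTasks`, `DerivedTasks`, and `with_catalog_override` are made-up names. It assumes the base class resolves the method in a property (catalog override first, then presence of a replication key), as described earlier in the thread. A plain class attribute on the subclass shadows that property, so the override is never consulted.

```python
from __future__ import annotations


class Stream:
    """Hypothetical stand-in for an SDK stream base class."""

    replication_key: str | None = None

    def __init__(self) -> None:
        # Assumed to be filled in from catalog metadata when a catalog is applied.
        self.forced_replication_method: str | None = None

    @property
    def replication_method(self) -> str:
        # 1. A catalog override wins when present.
        if self.forced_replication_method:
            return self.forced_replication_method
        # 2. Otherwise derive the method from the replication key.
        return "INCREMENTAL" if self.replication_key else "FULL_TABLE"


class HardcodedTasks(Stream):
    replication_key = "updatedDate"
    # A plain class attribute shadows the base-class property entirely.
    replication_method = "INCREMENTAL"


class DerivedTasks(Stream):
    # No hardcoded method: the base-class property stays in charge.
    replication_key = "updatedDate"


def with_catalog_override(stream: Stream, method: str) -> str:
    """Simulate a catalog override and report the effective method."""
    stream.forced_replication_method = method
    return stream.replication_method


print(with_catalog_override(HardcodedTasks(), "FULL_TABLE"))
print(with_catalog_override(DerivedTasks(), "FULL_TABLE"))
```

In this sketch only `DerivedTasks` picks up the override, while still defaulting to incremental when no override is set, which is the behavior asked about at the top of the thread.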
"And wanted a quick way to modify to `FULL_TABLE` while getting the starting_timestamp from the `start_date` IF possible."
I believe a `--full-refresh` might give you that: state is ignored and the configured `start_date` is used. But you might've already explored that option.
Stéphane Burwash
09/30/2024, 4:10 PM
Sadly `--full-refresh` doesn't work that well in production for me 😅 so I'll check removing the hard-code, thanks!
Edgar Ramírez (Arch.dev)
09/30/2024, 4:13 PM
"Sadly `--full-refresh` doesn't work that well in production for me"
For my curiosity: is it due to constraints in your prod environment, or is it some aspect of Meltano that prohibits it? Gotta say, even if you remove the hard-code, the default behavior is to ignore `start_date` for `FULL_TABLE`, so you might wanna override `get_starting_replication_key_value`.
Stéphane Burwash
09/30/2024, 4:15 PM
"For my curiosity: is it due to constraints in your prod environment, or is it some aspect of Meltano that prohibits it?"
It requires that we update the CLI command structure when running in production, which at this time is tricky. We'll be working to make it more flexible in the future, however 😄
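For reference, the `get_starting_replication_key_value` override suggested above could look roughly like this. It's a minimal sketch under stated assumptions: `BaseStream` is a hypothetical stand-in rather than the real SDK class (its default behavior is only assumed here), and `TasksStream`, `state`, and the bookmark key are made up for illustration. Only the method name and the `start_date` config key come from the thread.

```python
from __future__ import annotations

from typing import Any


class BaseStream:
    """Hypothetical stand-in for an SDK stream class (not the real code)."""

    replication_method = "FULL_TABLE"

    def __init__(
        self,
        config: dict[str, Any],
        state: dict[str, Any] | None = None,
    ) -> None:
        self.config = config
        self.state = state or {}

    def get_starting_replication_key_value(
        self, context: dict[str, Any] | None
    ) -> Any | None:
        # Assumed default: FULL_TABLE syncs get no starting value;
        # other syncs resume from the stored bookmark, if any.
        if self.replication_method == "FULL_TABLE":
            return None
        return self.state.get("replication_key_value")


class TasksStream(BaseStream):
    def get_starting_replication_key_value(
        self, context: dict[str, Any] | None
    ) -> Any | None:
        # Keep the default when it yields a value; otherwise fall back
        # to the configured start_date (None if it is not configured).
        value = super().get_starting_replication_key_value(context)
        if value is None:
            return self.config.get("start_date")
        return value


stream = TasksStream({"start_date": "2024-01-01T00:00:00Z"})
print(stream.get_starting_replication_key_value(None))
```

In a real tap the override would live on the stream class that subclasses the SDK's stream base, so that a `FULL_TABLE` run can still begin from `start_date` without changing the CLI invocation.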
Stéphane Burwash
09/30/2024, 4:16 PM
Edgar Ramírez (Arch.dev)
09/30/2024, 4:16 PM
"It requires that we update the CLI command structure when running in production, which at this time is tricky. We'll be working to make it more flexible in the future however"
Would it help if there was an env var, e.g. `MELTANO_RUN_FULL_REFRESH`, that you could use instead of the CLI flag?
Stéphane Burwash
09/30/2024, 4:17 PM
Edgar Ramírez (Arch.dev)
09/30/2024, 4:28 PM