Debashis Adak
03/14/2024, 8:10 PM
meltano run or meltano el result in a complete data reload. This, in turn, leads to duplicates in the database. My stream data is not sorted (is_sorted = False), but I want to load incremental data per batch (starting from the previous batch's highest watermark).
Here's a snippet of the state file content:
{
  "completed": {
    "singer_state": {
      "bookmarks": {
        "book": {
          "replication_key": "id",
          "replication_key_value": 112233
        }
      }
    }
  },
  "partial": {}
}
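For reference, the bookmark in a payload shaped like the one above can be read with plain Python. This is only a minimal sketch: `read_bookmark` is a hypothetical helper written for illustration, not part of Meltano or the Singer SDK, and it assumes the state keeps exactly the nesting shown in the snippet.

```python
import json

# State payload copied from the snippet above.
STATE_JSON = """
{
  "completed": {
    "singer_state": {
      "bookmarks": {
        "book": {
          "replication_key": "id",
          "replication_key_value": 112233
        }
      }
    }
  },
  "partial": {}
}
"""


def read_bookmark(state, stream):
    """Return the stored replication key value for a stream, or None if absent.

    Hypothetical helper: assumes the nesting shown in the snippet above.
    """
    bookmarks = (
        state.get("completed", {})
        .get("singer_state", {})
        .get("bookmarks", {})
    )
    return bookmarks.get(stream, {}).get("replication_key_value")


state = json.loads(STATE_JSON)
print(read_bookmark(state, "book"))
```

If this returns None for the stream name in use, there is no bookmark to resume from, which would be consistent with a full reload.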
Below is the relevant code section:
def get_url_params(self, context, next_page_token):
    params = {}
    # Resume from the bookmark stored in state, if one exists.
    starting_id: int = self.get_starting_replication_key_value(context=context)
    if starting_id:
        params["after"] = starting_id
    self.logger.info("QUERY PARAMS: %s", params)
    return params
I also set:
replication_key = "id"
is_sorted = False
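To make the intended behaviour concrete, here is a minimal plain-Python sketch of per-batch loading from a high watermark when records arrive unsorted. It is not the Singer SDK's implementation: `load_batch` and its parameters are hypothetical names, and it assumes `id` is a comparable, ever-increasing value. Because the batch is unsorted, each record is compared against the watermark from the previous batch, and the new watermark is the maximum seen across the whole batch.

```python
def load_batch(records, previous_watermark, key="id"):
    """Return (new_records, new_watermark) for one unsorted batch.

    Hypothetical helper for illustration only.
    - Records at or below the previous watermark are skipped, so a re-read
      of already-loaded rows does not create duplicates.
    - The watermark advances to the largest key seen in the batch, and is
      only meant to be saved once the whole batch has been processed.
    """
    new_records = []
    watermark = previous_watermark
    for record in records:
        value = record[key]
        # Filter against the PREVIOUS watermark, not the running maximum:
        # in an unsorted batch a smaller unseen id may follow a larger one.
        if previous_watermark is not None and value <= previous_watermark:
            continue
        new_records.append(record)
        if watermark is None or value > watermark:
            watermark = value
    return new_records, watermark


batch = [{"id": 5}, {"id": 3}, {"id": 9}, {"id": 7}]
loaded, watermark = load_batch(batch, previous_watermark=4)
print([r["id"] for r in loaded], watermark)
```

Saving the watermark only after the batch completes matters here: persisting the running maximum mid-batch and then failing could cause smaller, not yet loaded ids to be skipped on the next run.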
I would greatly appreciate any assistance or guidance in resolving this issue. If you have reference code or any insights on how to properly implement incremental data loading with Meltano, it would be immensely helpful.
Edgar Ramírez (Arch.dev)
03/14/2024, 9:55 PM