Meltano #getting-started

Debashis Adak

03/14/2024, 8:10 PM

Hi All, I hope this message finds you well. I'm reaching out to seek guidance on implementing incremental data loading from a custom tap API into a PostgreSQL database using Meltano. I'm relatively new to Meltano and have been encountering an issue where, despite the state file being saved successfully to the local filesystem, subsequent runs of `meltano run` or `meltano el` result in a complete data reload. This, in turn, leads to duplicates in the database. My stream data is not sorted (`is_sorted = False`), but I want to load data incrementally per batch, starting from the previous batch's highest watermark. Here's a snippet of the state file content:

{
  "completed": {
    "singer_state": {
      "bookmarks": {
        "book": {
          "replication_key": "id",
          "replication_key_value": 112233
        }
      }
    }
  },
  "partial": {}
}

Below is the relevant code section:

def get_url_params(self, context, next_page_token):
        params = {}

        starting_id: int = self.get_starting_replication_key_value(context=context)
        if starting_id:
            params["after"] = starting_id

        self.logger.info("QUERY PARAMS: %s", params)

        return params

I also set:

replication_key = "id"
is_sorted = False

I would greatly appreciate any assistance or guidance in resolving this issue. If you have reference code or any insights on how to properly implement incremental data loading with Meltano, it would be immensely helpful.
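
To make the intended behaviour concrete, here is a minimal sketch in plain Python of "resume from the previous batch's highest watermark" for an unsorted stream. It is only an illustration under stated assumptions: `read_bookmark`, `load_batch`, and the sample batch are hypothetical names and data, not Meltano or Singer SDK APIs, and this is not necessarily the fix reached in the linked thread. The state payload has the same shape as the snippet quoted above.

```python
import json

# Same shape as the state payload quoted above.
STATE_JSON = """
{
  "completed": {
    "singer_state": {
      "bookmarks": {
        "book": {"replication_key": "id", "replication_key_value": 112233}
      }
    }
  },
  "partial": {}
}
"""


def read_bookmark(state_json, stream):
    """Return the saved replication key value for a stream, or None."""
    state = json.loads(state_json)
    bookmarks = state.get("completed", {}).get("singer_state", {}).get("bookmarks", {})
    return bookmarks.get(stream, {}).get("replication_key_value")


def load_batch(records, bookmark, key="id"):
    """Keep records newer than the bookmark and compute the next watermark.

    The batch may arrive in any order, so the watermark is the maximum key
    seen across the whole batch, not the key of the last record.
    """
    fresh = [r for r in records if bookmark is None or r[key] > bookmark]
    next_bookmark = max((r[key] for r in fresh), default=bookmark)
    return fresh, next_bookmark


bookmark = read_bookmark(STATE_JSON, "book")

# Hypothetical unsorted API response containing both old and new ids.
batch = [{"id": 112240}, {"id": 112230}, {"id": 112235}, {"id": 112233}]
fresh, next_bookmark = load_batch(batch, bookmark)

print([r["id"] for r in fresh])  # [112240, 112235]
print(next_bookmark)             # 112240
```

In this sketch a record whose id equals the bookmark is treated as already loaded, which is one way to avoid re-inserting the last row of the previous run.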

Edgar Ramírez (Arch.dev)

03/14/2024, 9:55 PM

For the curious, solved in https://meltano.slack.com/archives/C069CQNHDNF/p1710447088204849
