Hello everyone, I am running a `tap-postgres targe...
# troubleshooting
s
Hello everyone, I am running a `tap-postgres target-postgres` pipeline that does not save the state. I have set the following in the plugin configuration:
default_replication_method: INCREMENTAL
...
public:
     '*_*':
        replication-method: INCREMENTAL
        replication-key: key
I tried writing the state to the local file system, and I see only an empty object for each table:
{
  "completed": {
    "singer_state": {
      "bookmarks": {
        "public-user_current": {}
      }
    }
  },
  "partial": {}
}
Is there additional configuration that needs to be added, or is something incorrect? Meltano version is 3.4.2.
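For anyone checking the same symptom, here is a minimal Python sketch that lists the streams whose bookmark is empty. It assumes the state file has the `completed` / `singer_state` / `bookmarks` layout shown above; the function name `empty_bookmarks` is made up for illustration and is not part of Meltano.

```python
import json


def empty_bookmarks(state: dict) -> list[str]:
    """Return the stream names whose bookmark holds no replication value."""
    bookmarks = (
        state.get("completed", {})
        .get("singer_state", {})
        .get("bookmarks", {})
    )
    return sorted(name for name, bookmark in bookmarks.items() if not bookmark)


# Sample state shaped like the one posted in this thread.
sample_state = json.loads("""
{
  "completed": {"singer_state": {"bookmarks": {"public-user_current": {}}}},
  "partial": {}
}
""")

print(empty_bookmarks(sample_state))
```

An empty `{}` bookmark suggests the tap never treated the stream as incremental, so the next run has nothing to resume from.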
✅ 1
a
Is `key` the name of your replication column in Postgres? Normally it might be something like `modified_on`. You are looking for a 'high water mark' value column, so that when the next run starts, the tap will only emit rows where 'key' is higher than it was before.
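To make the 'high water mark' idea concrete, here is a rough Python sketch of the general technique. It is a simplification for illustration, not how tap-postgres is actually implemented; `incremental_sync` and the sample rows are hypothetical.

```python
from datetime import datetime


def incremental_sync(rows, replication_key, bookmark=None):
    """Return the rows newer than the bookmark, plus the updated bookmark."""
    new_rows = [
        row for row in rows
        if bookmark is None or row[replication_key] > bookmark
    ]
    # The new bookmark is the highest value seen; keep the old one if nothing is new.
    new_bookmark = max((row[replication_key] for row in new_rows), default=bookmark)
    return new_rows, new_bookmark


rows = [
    {"key": "a", "updated_at": datetime(2024, 1, 1)},
    {"key": "b", "updated_at": datetime(2024, 1, 2)},
]

# First run: no bookmark yet, so every row is emitted.
first_batch, bookmark = incremental_sync(rows, "updated_at")

# A new row arrives; the second run only emits rows past the saved bookmark.
rows.append({"key": "c", "updated_at": datetime(2024, 1, 3)})
second_batch, bookmark = incremental_sync(rows, "updated_at", bookmark)
print([row["key"] for row in second_batch])  # prints ['c']
```

This is why the column needs to increase whenever a row changes: a column like `updated_at` works, while an arbitrary string ID usually does not.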
👀 1
➕ 1
s
Using the `updated_at` column now but still getting empty `{}` in state for all tables. Tried `meltano install --clean`, tried giving it a `state-id`, did a `--full-refresh`, but it all results in empty state.
a
Can you share full `meltano.yml` or just this plugin bit? Maybe the arrangement is not quite right
s
plugins:
  extractors:
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
    config:
      database: <database_name>
      default_replication_method: INCREMENTAL
      flattening_enabled: false
      filter_schemas:
      - public
      host: <host_ip>
      port: <port>
      user: <username>
    select:
    - public-*_current.*
    public:
      '*_current':
        replication-method: INCREMENTAL
        replication-key: updated_at
e
@Saiyida Noor Fatima what's the command you're running?
s
Anytime I make a change:
meltano lock --update --all
meltano install --clean
To run the pipeline I've been using:
meltano el tap-postgres target-postgres --state-id=state_test
I have tried with `meltano run` and `meltano elt` as well
e
You could inspect the generated catalog with
meltano invoke --dump=catalog tap-postgres > catalog.json
to confirm that the right stream and field are marked as incremental.
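If the catalog is large, a small Python sketch like the one below can summarize the replication settings per stream instead of reading the JSON by eye. It assumes the usual Singer catalog layout with a top-level `streams` list; the `replication_key`, `replication-method`, and `replication-key` lookups are assumptions about where a configured value would show up, and `replication_summary` is a hypothetical helper.

```python
def replication_summary(catalog: dict) -> dict:
    """Map each stream ID to its (replication method, replication key), or None if unset."""
    summary = {}
    for stream in catalog.get("streams", []):
        # Stream-level metadata is the entry with an empty breadcrumb.
        root = next(
            (
                entry.get("metadata", {})
                for entry in stream.get("metadata", [])
                if not entry.get("breadcrumb")
            ),
            {},
        )
        method = (
            stream.get("replication_method")
            or root.get("forced-replication-method")
            or root.get("replication-method")  # assumed metadata key
        )
        key = stream.get("replication_key") or root.get("replication-key")  # assumed keys
        summary[stream["tap_stream_id"]] = (method or None, key or None)
    return summary


# Sample catalog with one unconfigured stream, as a stand-in for catalog.json.
sample_catalog = {
    "streams": [
        {
            "tap_stream_id": "public-table_current",
            "replication_method": "",
            "metadata": [
                {"breadcrumb": [], "metadata": {"forced-replication-method": ""}},
            ],
        }
    ]
}

print(replication_summary(sample_catalog))
```

To run it against a real dump, load the file with `json.load` and pass the result in; a stream reported as `(None, None)` has no incremental settings applied.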
s
For all tables in the catalog I see `"replication_method": ""`. In the metadata property I see `"forced-replication-method": ""`.
{
  "breadcrumb": [],
  "metadata": {
    "inclusion": "available",
    "table-key-properties": [
      "key"
    ],
    "forced-replication-method": "",
    "schema-name": "public",
    "selected": true
  }
}
The `updated_at` property looks like this in metadata:
{
  "breadcrumb": [
    "properties",
    "updated_at"
  ],
  "metadata": {
    "inclusion": "available",
    "selected": true
  }
}
Here's one of the tables that gets synced fully each time I run the pipeline. I don't see any property indicating an incremental sync:
{
  "tap_stream_id": "public-table_current",
  "table_name": "table_current",
  "replication_method": "",
  "key_properties": [
    "key"
  ],
  "schema": {
    "properties": {
      "created_at": {
        "format": "date-time",
        "type": [
          "string",
          "null"
        ]
      },
      "updated_at": {
        "format": "date-time",
        "type": [
          "string",
          "null"
        ]
      },
      "key": {
        "type": [
          "string"
        ]
      }
    },
    "type": "object",
    "required": [
      "key"
    ]
  },
  "is_view": false,
  "stream": "public-table_current",
  "metadata": [
    {
      "breadcrumb": [
        "properties",
        "created_at"
      ],
      "metadata": {
        "inclusion": "available",
        "selected": true
      }
    },
    {
      "breadcrumb": [
        "properties",
        "updated_at"
      ],
      "metadata": {
        "inclusion": "available",
        "selected": true
      }
    },
    {
      "breadcrumb": [
        "properties",
        "key"
      ],
      "metadata": {
        "inclusion": "automatic",
        "selected": true
      }
    },
    {
      "breadcrumb": [],
      "metadata": {
        "inclusion": "available",
        "table-key-properties": [
          "key"
        ],
        "forced-replication-method": "",
        "schema-name": "public",
        "selected": true
      }
    }
  ],
  "selected": true
}
e
Oh I see the problem. You got `public` as an attribute at the same level as config, but you need `metadata`:
plugins:
  extractors:
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
    config:
      database: <database_name>
      default_replication_method: INCREMENTAL
      flattening_enabled: false
      filter_schemas:
      - public
      host: <host_ip>
      port: <port>
      user: <username>
    select:
    - public-*_current.*
    metadata:
      'public-*_current':
        replication-method: INCREMENTAL
        replication-key: updated_at
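As a rough illustration of why the key under `metadata` is the full stream ID pattern rather than the schema name, here is a small Python sketch using glob matching. It assumes the patterns are matched against stream IDs such as `public-table_current` in a glob-like way; the stream IDs and the `matching_streams` helper are made up for the example, and this is not Meltano's own code.

```python
from fnmatch import fnmatchcase


def matching_streams(pattern: str, stream_ids: list[str]) -> list[str]:
    """Return the stream IDs that a glob-style pattern would select."""
    return [stream_id for stream_id in stream_ids if fnmatchcase(stream_id, pattern)]


# Hypothetical stream IDs in the "<schema>-<table>" form seen in the catalog.
stream_ids = ["public-table_current", "public-user_current", "public-audit_log"]

print(matching_streams("public-*_current", stream_ids))
print(matching_streams("public", stream_ids))
```

The first pattern picks up both `_current` streams, while a bare `public` matches none of them because no stream ID is exactly `public`.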
s
Works now when I run the pipeline via the command line. Thank you! Still seeing a full import when running a Docker image of the pipeline, but I'm looking into it.
👍 1