csp
04/12/2023, 9:18 PM
csp
04/12/2023, 9:29 PM
default_replication_method to INCREMENTAL, but the documentation says that I need to configure a replication_key column within the catalog's stream definitions. I'm not sure exactly what that means; can someone give an example?
I can set the replication method to FULL_TABLE and it works fine, but I'd like to replicate incrementally when rows are added or updated.
aaronsteers
04/12/2023, 10:34 PM
The default_replication_method config option that tap-postgres provides is not available on all taps. Assuming you set the value to INCREMENTAL, the tap interprets this setting as something like: "Use incremental replication for all tables that have an incremental key defined, assuming I have a bookmark to continue from. Otherwise, use full-table replication."
aaronsteers
04/12/2023, 10:35 PM
default_replication_method to INCREMENTAL, but you will also have to specify what the incremental key columns are (e.g. 'last_updated_as_of') in the meltano.yml file.
aaronsteers
04/12/2023, 10:38 PM
LOG_BASED replication, and that uses Postgres's own internal changelogs to track just the rows that have changed. While in theory this is the same as a column-based incremental sync, in practice it's much more powerful: it doesn't miss changes when your 'updated_at' column fails to record an update, and it can also track changes on tables that have no incremental column.
That said, you might need help from a DB admin to configure it, so it's perfectly normal to use INCREMENTAL or FULL_TABLE replication when you're first getting started.
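To illustrate the point about missed updates, here is a minimal Python sketch. It is not the tap's actual code: it assumes a key-based incremental sync selects rows whose updated_at is at or after the saved bookmark, and the `rows` data and the `incremental_sync` helper are hypothetical.

```python
from datetime import datetime


def incremental_sync(rows, bookmark):
    """Return the rows a key-based sync would pick up, plus the new bookmark.

    Assumption: the sync selects rows whose replication key (updated_at)
    is at or after the saved bookmark. This is a simplification.
    """
    picked = [r for r in rows if bookmark is None or r["updated_at"] >= bookmark]
    new_bookmark = max((r["updated_at"] for r in picked), default=bookmark)
    return picked, new_bookmark


rows = [
    {"id": 1, "age": 30, "updated_at": datetime(2023, 4, 1)},
    {"id": 2, "age": 41, "updated_at": datetime(2023, 4, 2)},
]

# First run: no bookmark yet, so every row is extracted.
first, bookmark = incremental_sync(rows, None)

# Row 1 changes, but updated_at is NOT bumped (e.g. a manual UPDATE).
rows[0]["age"] = 31
# Row 2 changes and updated_at IS bumped.
rows[1]["age"] = 42
rows[1]["updated_at"] = datetime(2023, 4, 3)

# Second run: only rows at or after the bookmark are extracted.
second, bookmark = incremental_sync(rows, bookmark)
synced_ids = [r["id"] for r in second]
print(synced_ids)  # the change to row 1 is not picked up
```

A log-based sync reads changes from the database's changelog instead, so it does not depend on updated_at being maintained.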
https://github.com/transferwise/pipelinewise-tap-postgres#log-based-replication-requirements
csp
04/13/2023, 12:36 AM
updated_at column as the replication_key for INCREMENTAL, but when I ran meltano config tap-postgres test, it gave me this error:
Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to
join our friendly Slack community.
Plugin configuration is invalid
AttributeError: 'NoneType' object has no attribute 'get'
That's when I got confused. The error message doesn't make clear what went wrong, so I checked out the code to take a look and got more confused. That's when I came here for help. I'll try to make sense of the code.
csp
04/13/2023, 12:37 AM
"The default_replication_method config option that tap-postgres provides is not available on all taps."
What do you mean by this?
aaronsteers
04/13/2023, 3:43 PM
meltano.yml content, with sensitive info redacted?
aaronsteers
04/13/2023, 3:43 PM
AttributeError: 'NoneType' object has no attribute 'get'
- Also, if you have the line number from the error message, or a fuller error message, that might be helpful for debugging.
aaronsteers
04/13/2023, 3:46 PM
csp
04/13/2023, 5:55 PM
version: 1
default_environment: dev
project_id: 4457f4ce-a3b3-4e19-8f1c-861c13d1d809
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      host: localhost
      port: 5432
      dbname: mydb
      filter_schemas: ''
      user: itsme
      default_replication_method: INCREMENTAL
      replication_key: updated_at
  loaders:
  - name: target-csv
    variant: hotgluexyz
    pip_url: git+https://github.com/hotgluexyz/target-csv.git
    config:
      destination_path: data/output/
      quotechar: '"'
And this is just a simplified test table:
create table customers(id int, first_name varchar(255), last_name varchar(255), age int, created_at timestamp, updated_at timestamp);
Running the test command meltano config tap-postgres test just failed with the error I gave above. The error is neither specific nor explanatory.
I believe the error means I'm missing something in meltano.yml after changing the replication method, but I don't know what exactly. The tap's doc is not clear either 🙂
csp
04/13/2023, 6:10 PM
replication_key_sql_datatype = md_map.get(('properties', replication_key)).get('sql-datatype')
in the sync_strategies/incremental.py file. It seems incremental replication needs more configuration, but that isn't documented anywhere.
aaronsteers
04/13/2023, 6:23 PM
csp
04/13/2023, 7:07 PM
edgar_ramirez_mondragon
04/13/2023, 10:54 PM
public.state table with updated_at as replication key:
version: 1
default_environment: dev
project_id: 4457f4ce-a3b3-4e19-8f1c-861c13d1d809
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      host: localhost
      port: 5432
      dbname: mydb
      filter_schemas: ''
      user: itsme
      default_replication_method: INCREMENTAL
    select:
    - public-state.*
    metadata:
      public-state:
        replication-key: updated_at
  loaders:
  - name: target-csv
    variant: hotgluexyz
    pip_url: git+https://github.com/hotgluexyz/target-csv.git
    config:
      destination_path: data/output/
      quotechar: '"'
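For orientation, the Python sketch below shows roughly where a stream-level setting such as replication-key is expected to end up in the Singer-style catalog the tap reads: on the metadata entry with an empty breadcrumb. This is a hypothetical helper, not Meltano's implementation; `apply_stream_metadata` and the sample catalog are made up for illustration.

```python
def apply_stream_metadata(catalog, stream_id, values):
    """Merge stream-level metadata into a Singer-style catalog dict.

    Hypothetical helper: Singer catalogs store metadata as a list of
    {"breadcrumb": [...], "metadata": {...}} entries, and the entry with
    an empty breadcrumb describes the stream itself.
    """
    for stream in catalog["streams"]:
        if stream["tap_stream_id"] != stream_id:
            continue
        for entry in stream["metadata"]:
            if entry["breadcrumb"] == []:
                entry["metadata"].update(values)
                break
        else:
            # No stream-level entry existed yet, so create one.
            stream["metadata"].append({"breadcrumb": [], "metadata": dict(values)})
    return catalog


# Invented sample catalog with one stream and one column-level entry.
catalog = {
    "streams": [
        {
            "tap_stream_id": "public-state",
            "metadata": [
                {"breadcrumb": [], "metadata": {"schema-name": "public"}},
                {
                    "breadcrumb": ["properties", "updated_at"],
                    "metadata": {"sql-datatype": "timestamp without time zone"},
                },
            ],
        }
    ]
}

apply_stream_metadata(catalog, "public-state", {"replication-key": "updated_at"})
stream_md = catalog["streams"][0]["metadata"][0]["metadata"]
print(stream_md)
```

Column-level settings would sit on the entries whose breadcrumb is ["properties", "<column>"] instead.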
(see the docs for metadata at https://docs.meltano.com/concepts/plugins#metadata-extra)
csp
04/14/2023, 2:46 AM
version: 1
default_environment: dev
project_id: 4457f4ce-a3b3-4e19-8f1c-861c13d1d809
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      host: localhost
      port: 5432
      dbname: mydb
      user: itsme
    metadata:
      public-customers:
        replication-method: INCREMENTAL
        replication-key: updated_at
        updated_at:
          is-replication-key: 'true'
  loaders:
  - name: target-csv
    variant: hotgluexyz
    pip_url: git+https://github.com/hotgluexyz/target-csv.git
    config:
      destination_path: data/output/
      quotechar: '"'
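The AttributeError reported earlier in the thread is consistent with the incremental.py line quoted above: if the lookup for ('properties', replication_key) finds nothing, the first .get returns None and the chained .get fails. A minimal Python reproduction follows. The md_map contents and the `lookup_sql_datatype` wrapper are invented for illustration, and whether this is exactly what happened inside the tap is an assumption.

```python
# Invented metadata map keyed by breadcrumb tuples, mimicking the shape
# implied by the quoted line; not taken from the tap's real catalog.
md_map = {
    (): {"schema-name": "public"},
    ("properties", "updated_at"): {"sql-datatype": "timestamp without time zone"},
}


def lookup_sql_datatype(md_map, replication_key):
    # Same chained-get pattern as the line quoted from incremental.py.
    return md_map.get(("properties", replication_key)).get("sql-datatype")


# With a replication key the catalog knows about, the lookup succeeds.
ok = lookup_sql_datatype(md_map, "updated_at")

# With no replication key in the stream metadata (None here), the first
# .get returns None and the chained .get raises AttributeError.
try:
    lookup_sql_datatype(md_map, None)
    error = None
except AttributeError as exc:
    error = str(exc)

print(ok)
print(error)
```

Under that assumption, declaring the replication key in the stream's metadata, as in the configuration above, gives the lookup a column entry to find.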