Me again! I had a question regarding the tap-post...
# singer-tap-development
s
Me again! I had a question regarding the tap-postgres, if anyone has ever worked with it: I'm getting this error:
Copy code
replication_key_sql_datatype = md_map.get(('properties', replication_key)).get('sql-datatype')
AttributeError: 'NoneType' object has no attribute 'get'
When running the tap. Anyone ever encounter this?
a
The
NoneType
output is the result of
md_map.get(('properties', replication_key))
, meaning the the tap could not find reference to the incremental replication key in your schema catalog. Can you double check the spelling of the replication_key perhaps? Might also be worth confirming manually if the generated catalog includes that column in its metadata.
s
Thanks for the answer! I did not specify any catalog anywhere; where would it be possible to view the generated catalog? Sorry if my questions are quite basic, I'm very new to the postgres tap
a
Can you try
Copy code
meltano invoke --dump=catalog <plugin>
where
<plugin>
is
tap-postgres
?
s
And is this generated catalog supposed to have a field indicating "replication_key"? Or, when running incremental, am I supposed to offer a catalog which contains that info?
a
I guess I assumed you were already providing the replication key in your
meltano.yml
file.
SQL taps generally can't detect replication keys so if you are not requesting one in
meltano.yml
, then this looks increasingly like a bug on the tap side.
Might be helpful if you are able to provided the partial code from your
meltano.yml
which pertains to this extractor.
s
Copy code
- name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
     host: localhost
     port: 36173
     dbname: name
     user: username
     default_replication_method: incremental
This is literrally all I am given; I couldn't really spot in the documentation where you would specify a key on a table-to-table basis
Update: Trying to specify in the .yml file following this documentation: https://transferwise.github.io/pipelinewise/connectors/taps/postgres.html
a
Those docs are for the pipelinewise orchestrator and would not work for this.
Can you try removing
default_replication_method: incremental
?
message has been deleted
s
raise Exception(f"Unrecognized replication_method {replication_method}")
Exception: Unrecognized replication_method None
a
Do you know if your environment meets these requirements for log-based replication? https://github.com/transferwise/pipelinewise-tap-postgres#log-based-replication-requirements
s
We have log based in Stitch, so it definitely should
a
Oh great. Then can you try this?:
Copy code
default_replication_method: LOG_BASED
s
Huh, that worked
a
🙂
s
So we would qualify this as a failing on my part with incremental, or with the tap?
But thank you so much 😄
a
So,
INCREMENTAL
in Singer-Spec is actually short-hand for
Key-Based Incremental
, which for database sources would require you to specify a
UpdatedOn
column name or similar for each table. The
LOG_BASED
is also incremental, but it uses the system internal logs so it doesn't need a designated incremntal key column to be specified.
So we would qualify this as a failing on my part with incremental, or with the tap?
The tap could do a better job of giving friendly error messages. For my part, I probably should have asked up front if you wanted to use log-based (automatic) incremental or key-based incremental.
s
Does meltano allow us to specify the column (in case we want to do incremental) ?
a
Yep.
s
Oh cool! Following the pipeline wise architecture, or a special nomenclature?
```extractors:
- name: tap-postgres
metadata:
some_stream_id:
replication-method: INCREMENTAL
replication-key: created_at
created_at:
is-replication-key: true```
s
Awesome, thank you so much!
a
Took me a while to find this code sample actually so I opened a docs issue to hopefully make it more obvious for future reference.
You're very welcome!
This sample 👆 also shows how you can override the replication-method for any tables that should not use the default.
s
That would be awesome! Also, I know that @visch created his own postgres tap using the SDK, so I'm excited to see if that's any better
c
@Stéphane Burwash Another thing to look out for with SQL sources is setting your primary keys in the meltano metadata. Took me a while to figure out what the primary key metadata is called ... Confirm with some test runs if manual primary key configs are required though. The tap might actually configure the correct primary key for you. But if you need to manually override, you would do that on a per stream basis:
Copy code
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    metadata:
      <streamnamefromtap>:
        table-key-properties:
          - KeyColumnName1
          - KeyColumnName2