# plugins-general
m
Hey, I'm trying to sync tables with jsonb data type from one postgres DB to another and it looks like the sync has issues with jsonb rows. I've tried all the taps and targets for postgres and they all fail for one reason or another. Has anyone achieved this?
d
@mark_poole TransferWise’s target-postgres should support that: https://github.com/transferwise/pipelinewise-target-postgres/blob/9470108c89b4a6f00420327ae2c66d48279d5cee/target_postgres/db_sync.py#L44-L45 What issue did you run into with that one?
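For a quick local check that jsonb values round-trip at all, here's a minimal sketch (assuming a throwaway scratch database — the DSN and table are hypothetical, and this is plain psycopg2, not Meltano code):
Copy code
import psycopg2
from psycopg2.extras import Json

# Hypothetical DSN for a local scratch database.
conn = psycopg2.connect("dbname=test")
with conn, conn.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS jsonb_demo (payload jsonb)")
    # Json() adapts a Python dict to a Postgres json/jsonb parameter.
    cur.execute("INSERT INTO jsonb_demo (payload) VALUES (%s)", [Json({"a": 1})])
    cur.execute("SELECT payload FROM jsonb_demo")
    print(cur.fetchone()[0])  # psycopg2 decodes jsonb back to a dict: {'a': 1}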
m
@douwe_maan is there a recommended tap to use?
Copy code
ras.py", line 143, in execute
target-postgres |     return super(DictCursor, self).execute(query, vars)
target-postgres | psycopg2.errors.SyntaxError: syntax error at or near "-"
target-postgres | LINE 1: CREATE TABLE pep_intake.public-auth_group ("id" bigint, "nam...
target-postgres |                                       ^
target-postgres |
d
That error suggests you’re actually still using the datamill-co variant: https://meltano.com/plugins/loaders/postgres.html#error-psycopg2-programmingerror-syntax-error-at-or-near
The TransferWise tap and target should work together perfectly
They’re literally made for each other :)
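To make the failure concrete, here's a minimal sketch of the difference (hypothetical helpers, not the actual pipelinewise code): tap-postgres emits stream names as <schema>-<table>, and interpolating that raw into DDL produces the unquoted hyphenated identifier Postgres rejects above, whereas splitting and quoting it yields valid SQL:
Copy code
def quote_ident(name: str) -> str:
    """Double-quote a Postgres identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'

def create_table_sql(target_schema: str, stream_name: str) -> str:
    # Split the source schema off the stream name instead of using it raw.
    # Folding it into the table name as schema_table is illustrative only,
    # not necessarily the transferwise target's exact naming scheme.
    source_schema, _, table = stream_name.partition("-")
    table_name = f"{source_schema}_{table}" if table else source_schema
    return f"CREATE TABLE {quote_ident(target_schema)}.{quote_ident(table_name)} (...)"

# Raw:    CREATE TABLE pep_intake.public-auth_group (...)    -> syntax error at "-"
# Quoted: CREATE TABLE "pep_intake"."public_auth_group" (...)
print(create_table_sql("pep_intake", "public-auth_group"))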
m
hmm
could this be a pip environment issue?
my config specifies
Copy code
- name: tap-postgres
    varient: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      dbname: xx
      host: xx
      port: xx
      ssl: true
      user: xx
      default_replication_method: FULL_TABLE
  loaders:
  - name: target-postgres
    varient: transferwise
    pip_url: pipelinewise-target-postgres
    config:
      logging_level: DEBUG
      #persist_empty_tables: true
      #invalid_records_threshold: 10
      user: xx
      host: xx
      dbname: xxx
      default_target_schema: xxx
      add_metadata_columns: true
d
Note that it says varient, not variant!
Fix that, run “meltano install” and try again :)
m
omg
thank you
I was doing this at 3am so it makes sense it was a typo
I hit the library error again this time. For reference, I 1) removed the venv and re-created it, 2) removed the .meltano folder, 3) ran meltano install, and 4) ran
Copy code
meltano elt tap-postgres target-postgres
Python 3.8.5
Latest meltano from pip
I was so hopeful about the typo causing the issue
The only thing I can think is it's some kind of python version incompatibility
d
Sounds like it could be incompatibility of cloudpickle with Python 3.8: https://github.com/cloudpipe/cloudpickle/issues/404
m
ah interesting, I actually switched the base VM to ubuntu 20.04 to get 3.8 so the default tap would work lol. I guess I should go back to buster where 3.7 is the default python3
d
That issue suggests that the problem is only when using files pickled under 3.7 with 3.8, but that’s not what we’re doing here :/
In any case, this is no longer Meltano specific so I’m going to have some dinner and leave the debugging to you for the moment ;)
m
@douwe_maan thank you, I'll try a different python version anyway. Is the transferwise target / tap my only option for jsonb with postgres + Meltano?
d
Right now, yes. The TransferWise taps and targets are very good and actively maintained, so they should serve you well!
m
@douwe_maan you are the man!
I installed pyenv and python 3.7, reset the venv, looks like it's running now!!!
I guess that bug applies to everyone using 3.8; I'll make a comment on the issue
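In case anyone hits this later, a quick sanity check that the recreated venv really picked up 3.7 (a trivial sketch; run it with the venv's interpreter):
Copy code
import sys

print(sys.version_info[:3])  # expect (3, 7, x) here, not (3, 8, x)
print(sys.executable)        # should point inside the recreated venv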
d
Awesome!
m
Copy code
meltano         | Incremental state has been updated at 2021-03-05 01:34:42.613960.
meltano         | Extract & load complete!
meltano         | Transformation skipped.
amazing
@douwe_maan I would buy you a beer if I could
d
A virtual 🍻 will have to do!
m
🍻 there you go 🙂
next step is getting this deployed to kubernetes
l
@mark_poole I ran into the same issue; it was fixed upstream but never pinned to a release of the TransferWise target. A workaround is to use
pip_url: git+https://github.com/transferwise/pipelinewise-target-postgres.git
since that references the master branch, where it is fixed. @douwe_maan A suggestion I would make: the docs for tap-mysql should include a specific note to use the TransferWise target.
d
@lars It may not be clear enough, but did you see the following section under https://meltano.com/plugins/extractors/mysql.html#next-steps?
Note that this extractor is incompatible with the default datamill-co variants of target-postgres and target-snowflake, because they don't support stream names that include the source schema in addition to the table name: <schema>-<table>, e.g. public-accounts. Instead, use the transferwise variants that were made to be used with this extractor: target-postgres and target-snowflake.
l
@douwe_maan I guess I would also expect a failure from Meltano's side, since it is incompatible: I was using the datamill-co variant of target-postgres with the tap-mysql extractor in our testing environment and it appeared to work fine. Maybe it's worth mentioning on the target-postgres and target-snowflake docs as well, since the failure is coming from the postgres target. FYI, TransferWise just released a new version of their postgres target that is compatible with Python 3.8, so that error should not be happening anymore.
d
“the failure is coming from the postgres target.”
@lars What exact error are you seeing? I agree we should mention it under https://meltano.com/plugins/loaders/postgres.html#troubleshooting and explain the possible workaround
l
That was the error I was seeing.
d
Ah all right, that should now be resolved by the new Python 3.8 compatible version, right? Or is it still showing up in a "stock" Meltano installation? Did you see another error relating to the incompatibility between tap-mysql and target-postgres?
l
No, that is resolved now and not showing up anymore. The only thing I saw is that with tap-mysql and the datamill-co variant of target-postgres, data wasn't updated correctly when using the LOG_BASED replication method: a field that was updated multiple times between runs only picked up the first updated value. For example, if a field was updated from 1 -> 2 -> 3 -> 4, after running the extractor and loader pipeline the field would show up as 2 in our postgres database. That was the only thing I noticed. This isn't the case when running the transferwise variant of target-postgres.
d
Interesting, that sounds like a bug in target-postgres where it doesn't handle multiple RECORD messages for the same underlying row correctly. I suggest filing an issue for that on https://github.com/datamill-co/target-postgres/issues
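The gist of the correct behavior, as a minimal sketch (hypothetical helper, not the datamill-co code): when a batch buffers multiple Singer RECORD messages for the same primary key, the last one has to win before the upsert is applied:
Copy code
import json

def dedupe_batch(lines, key_properties):
    """Keep only the latest RECORD per primary key within one batch."""
    latest = {}  # primary-key tuple -> record dict; later messages overwrite
    for line in lines:
        message = json.loads(line)
        if message.get("type") != "RECORD":
            continue
        record = message["record"]
        pk = tuple(record[k] for k in key_properties)
        latest[pk] = record
    return list(latest.values())

# The field goes 1 -> 2 -> 3 -> 4 within one batch; last-wins keeps 4,
# whereas the behavior described above effectively kept an early value.
batch = [
    json.dumps({"type": "RECORD", "stream": "t", "record": {"id": 1, "value": v}})
    for v in (1, 2, 3, 4)
]
print(dedupe_batch(batch, ["id"]))  # [{'id': 1, 'value': 4}]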
l
Yeah I'll file an issue with them. Thanks!!