# singer-targets
j
I'm trying to replicate a Postgres DB using FULL_TABLE replication (for now, just want to get something working). Seems like my JSON fields don't survive. I have a largeish object looking like this:
```json
{
    "header": "just-a-plain-old-string",
    "sections": [
        {"name": "another string", "questions": [ .... ]},
        {"name": "another string 2", "questions": [ .... ]},
        {"name": "another string 3", "questions": [ .... ]},
        {"name": "another string 4", "questions": [ .... ]}
    ]
}
```
... you get the gist... What I end up with in my warehouse is just this string:
```
{"\"header\"","\"text\"","\"sections\""}
```
So that's the two roots in the object, `header` and `sections`, and ... I don't know where `text` comes from. Also there's some quotation mark business that needs my attention. Digging into it, it kinda looks like I'm out of luck w.r.t. having the JSON Postgres data type supported in the target, and I suppose I could live with a string. But I'd kinda just expect the JSON to survive in a string field. But maybe the problems start already at the tap?
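For what it's worth, that garbled string is exactly how Postgres renders a `jsonb[]` (array of `jsonb`) value as text when the array holds the object's top-level key names as JSON strings. A small Python sketch - my reconstruction of the rendering, not anything the tap or target actually runs - reproduces the string from the three names that appear in the output:

```python
import json

# The three names that appear in the warehouse output.
keys = ["header", "text", "sections"]

# Each element of a jsonb[] is itself a JSON value; a bare key stored as a
# JSON string renders with its own double quotes, e.g. "header".
elems = [json.dumps(k) for k in keys]

# Postgres renders an array literal as {...}, double-quoting each element
# and backslash-escaping any quotes inside it.
literal = "{" + ",".join('"' + e.replace('"', '\\"') + '"' for e in elems) + "}"

print(literal)  # {"\"header\"","\"text\"","\"sections\""}
```

If that matches what's in the warehouse, it suggests the column was created as `jsonb[]` and only the key names survived the load.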
c
I ran into a similar problem recently going from tap-postgres to target-postgres, where the target schema was aggressively choosing `jsonb[]` even though the tap column was `jsonb`, and I ended up in a similar situation where I just had root-level attribute names with no associated JSON hashes. The fix was in target-postgres, where `jsonb` became the chosen default via a code change. What target are you using? (It’d be weird if the behavior of your target and target-postgres were exactly the same, because I believe it was the conversion during `INSERT` from `jsonb` to `jsonb[]` where data was lost.)
j
I'm also using target-postgres, the meltanolabs variant
brilliant.. I wonder if I can somehow force the data type on a column level in the project `yml` file....
c
The fix for me was to update to target-postgres 0.0.14 and change my column types in staging - didn’t care about the data loss there - and I manually defined my tables in production prior to running the pipeline the first time, to ensure `jsonb` was chosen over `jsonb[]`. I went over the source code for 0.0.14 and it seems like it will work as long as you drop your tables and let it recreate them.
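A minimal sketch of what pre-defining a table might look like - the table and column names here are made up, and the point is only that the payload column is declared `jsonb` (not `jsonb[]`) before the loader ever creates it itself:

```sql
-- Hypothetical target table, created manually before the first pipeline run.
-- Declaring the payload column as jsonb up front means the loader
-- never gets the chance to infer jsonb[] for it.
CREATE TABLE IF NOT EXISTS public.surveys (
    id                bigint PRIMARY KEY,
    payload           jsonb,        -- the nested header/sections document
    _sdc_extracted_at timestamptz   -- Singer SDK metadata column
);
```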
j
Fantastic, thanks a lot
c
```yaml
- name: target-postgres
  variant: meltanolabs
  pip_url: meltanolabs-target-postgres~=0.0.14
```
the version change should be enough, I was just extra cautious with production prior to having a job that runs for 24 hours give or take 🙂
j
oh wow, a full refresh takes about 20 minutes for me..
c
yeah similar situation for the staging environment on tiny itty bitty AWS instances for me, but production… has a lot of audit records
you can use this query to identify any `jsonb[]` columns still hanging around - note, my version produces a bunch of `ALTER TABLE` statements so I could just correct staging and move on:
```sql
SELECT DISTINCT
    'alter table ' || c.table_schema || '.' || c.table_name ||
    ' alter column ' || c.column_name ||
    ' type jsonb using to_jsonb(' || c.column_name || ');' AS ddl
FROM
    information_schema.columns c
JOIN
    pg_catalog.pg_class t ON c.table_name = t.relname
JOIN
    pg_catalog.pg_attribute a ON a.attname = c.column_name AND a.attrelid = t.oid
JOIN
    pg_catalog.pg_type tp ON a.atttypid = tp.oid
WHERE
    c.data_type = 'ARRAY' AND tp.typname = '_jsonb';
```
🙌 1
j
Thank you so much for all this! I really need to leave now, but I'll check back on it.
Worked great. Did have to nuke my DB though, but no biggie. Thanks again!
🙌 1