# plugins-general
l
Non-urgent question about `target-postgres` (datamill-co variant): is the target schema expected to be empty? I tried using `public` on a db which has pre-existing tables, and it failed because the description of some tables was not valid JSON (here https://github.com/datamill-co/target-postgres/blob/9c095d91e215f932caa897a6587c6dc6278db8cf/target_postgres/postgres.py#L223 `json.loads` didn't like that it was fed text for humans)
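The failure mode is easy to reproduce in isolation; a minimal sketch of what `json.loads` does when handed a human-readable table description instead of the target's JSON metadata:

```python
import json

# target-postgres (datamill-co) stores its own table-path metadata as JSON
# in the Postgres table comment. A human-written comment, like the ones on
# the pre-existing tables, is not valid JSON, so json.loads raises right away:
try:
    json.loads("the list of our clients")
except json.JSONDecodeError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)
```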
d
Can you share the error or other failure symptoms you're seeing?
l
sure, so I ran `meltano elt tap-csv target-postgres --job_id=xxxx` and it failed with the following stacktrace:
```
target-postgres | ERROR Exception writing records
target-postgres | Traceback (most recent call last):
target-postgres |   File "/.../ays-elt/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/postgres.py", line 236, in write_batch
target-postgres |     self.setup_table_mapping_cache(cur)
target-postgres |   File "/.../ays-elt/.meltano/loaders/target-postgres/venv/lib/python3.8/site-packages/target_postgres/postgres.py", line 223, in setup_table_mapping_cache
target-postgres |     table_path = json.loads(raw_json).get('path', None)
target-postgres |   File "/home/laurent/.pyenv/versions/3.8.7/lib/python3.8/json/__init__.py", line 357, in loads
target-postgres |     return _default_decoder.decode(s)
target-postgres |   File "/home/laurent/.pyenv/versions/3.8.7/lib/python3.8/json/decoder.py", line 337, in decode
target-postgres |     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
target-postgres |   File "/home/laurent/.pyenv/versions/3.8.7/lib/python3.8/json/decoder.py", line 355, in raw_decode
target-postgres |     raise JSONDecodeError("Expecting value", s, err.value) from None
target-postgres | json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
d
OK. Can you find out what is in that `raw_json` variable if it's not valid JSON?
l
so I added some extra logging in the postgres target, and on the line I mentioned, `raw_json` is the description of an existing table, something like "the list of our clients"
d
The value comes from `obj_description(c.oid, 'pg_class')` in the query on line 214, so it's odd that Postgres would return anything for that function other than a JSON object description
l
yep, my thoughts exactly
d
`the list of our clients` ... Does that value look familiar to you? Like, where does it actually occur in a table row?
l
yes, it's the description of a table
it's not IN the table, it's like metadata about the table
let me fish up the SQL that created it
d
OK, it sounds like this target has certain expectations of what the table description is used for, then
l
`COMMENT ON TABLE public.employees IS 'the comment I mentioned above'`
d
Right. Looks like it's fussy about table comments that don't follow its own idea of what should be there
I think that's worth filing an issue in https://github.com/datamill-co/target-postgres/issues
And then using a different schema for the time being if that's an option
If you don't set a `postgres_schema` at all, the default is actually to use a name derived from the tap you used: https://meltano.com/plugins/loaders/postgres.html#postgres-schema
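For the issue report, one possible fix sketch (my naming, not from the target's codebase): a defensive variant of the parse on `postgres.py` line 223 that treats a non-JSON table comment as "no metadata" rather than crashing. `parse_table_path` here is a hypothetical helper:

```python
import json

def parse_table_path(raw_comment):
    # Hypothetical defensive variant of target-postgres's
    # `json.loads(raw_json).get('path', None)`: a human-written table
    # comment (e.g. "the list of our clients") is simply ignored.
    if raw_comment is None:
        return None
    try:
        parsed = json.loads(raw_comment)
    except json.JSONDecodeError:
        return None  # not target-postgres metadata, just a description
    if isinstance(parsed, dict):
        return parsed.get('path', None)
    return None

print(parse_table_path('{"path": ["public", "employees"]}'))  # ['public', 'employees']
print(parse_table_path('the list of our clients'))            # None
```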
l
ok, I was thinking of it, just wanted to sense-check here first, seeing that you're pretty responsive 🙂
d
🙂
l
I tried with a different schema, but then it failed because it expected the schema to be there in the first place
d
Ah, interesting, I thought it auto-created schemas
l
let me try without specifying any schema, to see what happens
meta-question: is datamill the preferred variant for the postgres target?
d
Yep
l
alright, cool
confirmed, it does not auto-create the schema:
```
meltano         | Loading failed (1): target_postgres.exceptions.PostgresError: ('Exception writing records', InvalidSchemaName('schema "tap_csv" does not exist\nLINE 1: CREATE TABLE "tap_csv"."mytable" ();\n
```
d
OK, good to know!
l
as a workaround, there's a `before_run_sql` which could be used to create the schema, but it feels a little hackish. Maybe an option like `autocreate_schema` (true/false) would make sense?
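For reference, a sketch of that workaround in `meltano.yml`, assuming the schema name is the tap-derived `tap_csv`; the exact placement of `before_run_sql` depends on your config layout:

```yaml
loaders:
  - name: target-postgres
    config:
      postgres_schema: tap_csv
      # CREATE SCHEMA IF NOT EXISTS is idempotent, so safe on every run
      before_run_sql: CREATE SCHEMA IF NOT EXISTS tap_csv;
```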
d
`before_run_sql` sounds like a good option for now, but I think a dedicated option definitely wouldn't hurt. We'd probably set it to `true` by default in Meltano. Wanna create an issue on https://github.com/datamill-co/target-postgres?
l
yep, working on the issues now. I'll file 2, one for the table comments, and one for the autocreation, ok with you?
d
Yep that sounds good, but it's not my repo so what I think doesn't matter too much 😄
l
sure, but you're kind of a power-user I guess 🙂
d
Kind of
l
thanks for the quick response @douwe_maan much appreciated!
d
Happy to help as always, thanks for helping debug and improve the ecosystem!
l
haha, yeah 4 bugs in 2 days, good start 🙂
d
And some of those bugs already fixed, don't forget that part 😉
l
I know, that's amazing!