# getting-started
a
Hello hello! I have two questions which hopefully someone here can help me out with 🙂 First: is it possible to have Meltano automatically update a destination Postgres target schema if the inbound data includes new columns (no need to worry about removals)? Second: is there any way to configure the Meltano backend to use Postgres? I'm unable to find any documentation on the systemdb option and whether the URI it accepts can include a Postgres connection string...
e
> Is it possible to have Meltano automatically update a destination Postgres target schema if the inbound data includes new columns (no need to worry about removals)?
Depends on the target implementation, but for example MeltanoLabs/target-postgres does create the new columns when the incoming SCHEMA message includes them.
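To make that concrete, here's a minimal sketch (plain Python with simplified Singer messages; the stream and column names are made up for illustration) of how a later run's SCHEMA message can grow a property. A loader like MeltanoLabs/target-postgres compares the incoming schema's properties against the existing table and adds any missing columns:

```python
import json

# Simplified Singer SCHEMA message from an earlier run (hypothetical stream).
first_run = {
    "type": "SCHEMA",
    "stream": "test",
    "schema": {
        "type": "object",
        "properties": {
            "id": {"type": ["integer", "null"]},
            "name": {"type": ["string", "null"]},
        },
    },
    "key_properties": ["id"],
}

# A later run's SCHEMA message includes a new "school" property.
second_run = json.loads(json.dumps(first_run))  # cheap deep copy
second_run["schema"]["properties"]["school"] = {"type": ["string", "null"]}

# Diffing the property sets yields the columns the loader would ADD.
new_cols = sorted(
    set(second_run["schema"]["properties"]) - set(first_run["schema"]["properties"])
)
print(new_cols)  # ['school']
```

The key point from the answer above: the new column only materializes if the tap actually emits it in its SCHEMA message, which is where catalog caching comes into play later in this thread.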
> Is there any way to configure the Meltano backend to use Postgres?
Yes! See https://docs.meltano.com/concepts/project/#support-for-other-database-types. The URI should look like this:

```
postgresql+psycopg://user:password@host:port/dbname
```
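For reference, a sketch of what that looks like in practice (hypothetical credentials; this assumes the `MELTANO_DATABASE_URI` environment variable, which is the documented way to override the default SQLite system database):

```
# .env — hypothetical credentials, adjust to your environment.
# Points Meltano's system database at Postgres instead of SQLite.
MELTANO_DATABASE_URI=postgresql+psycopg://meltano:s3cret@db.example.com:5432/meltano_system
```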
a
Ah cool! Okay, so I'm running `tap-csv` and `target-postgres`. It notices that there are new columns, but they ultimately don't get created.
Is there a special flag I need to provide to trigger it?
e
> It notices that there are new columns
So where do you get this info from? The logs?
a
Yessir
e
And what does the log say?
a
```
2024-12-16 19:17:27,339 | WARNING  | target-postgres.test | No schema for record field 'school' cmd_type=elb consumer=True job_name=dev:tap-csv-to-target-postgres name=target-postgres producer=False run_id=c31951b7-8c8f-413f-bdfa-5cf6efdcc413 stdio=stderr string_id=target-postgres
```
e
Ah, can you try running with the `--refresh-catalog` flag?

```
meltano run --refresh-catalog ...
```
a
Trying... 🙂
Aah, yep. That did it!
Any consequences to that flag I should be mindful of?
e
Nothing of significance for most use cases. The long explanation is that Meltano caches the tap's catalog because discovery can return a lot of data and take a long time, especially for database sources. For most taps, including tap-csv, the discovery process is nearly instantaneous. We're still in the process of figuring out the best cache-invalidation heuristics; maybe enabling catalog caching for SQL, Salesforce, and similar dynamic sources and disabling it otherwise.
a
Ahhh neat, okay. That's easy enough to remember 🙂
Oh, one last thing... Is there any way to ask Meltano to make a "best effort" guess at types other than text for columns when loading to Postgres? I'm actually hoping to have it manage the destination schema for me like this.
So I definitely don't want to have to explicitly manage the schema. But if it has to be an "after the fact" thing, that's fine too. Just trying to fine tune a few last details.
(just noticing now, I can't change the type of destination columns in the DB after they're made without meltano complaining)
```
Altering columns is not supported. Could not convert column 'tap_csv.test.id' from 'INTEGER' to 'TEXT'.
```
e
> (just noticing now, I can't change the type of destination columns in the DB after they're made without meltano complaining)

Ah yeah, it's not safe to manually alter the column types on tables created by the loader. The best way to customize the types of certain columns would be using the `schema` extra:
```yaml
extractors:
- name: tap-csv
  schema:
    test:
      id:
        type: "integer"
```
We do have an issue to make column type alteration configurable: https://github.com/meltano/sdk/issues/1781. But the point above about not changing them manually would still hold: the loader would handle type alterations for you.
a
If I specify the schema manually after running once, and then run again, will the loader update it?
Added a comment to the ticket 🙂