# plugins-general
**s:** Hi everyone, I am trying to pull data from Snowflake and push it to Postgres. My `meltano.yml` has the following setup:
```yaml
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake certifi==2025.1.31
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
  utilities:
  - name: airflow
    variant: apache
    pip_url: git+https://github.com/meltano/airflow-ext.git@main apache-airflow==2.10.5 psycopg2-binary==2.9.10
      --constraint
      https://raw.githubusercontent.com/apache/airflow/constraints-2.10.5/constraints-no-providers-${MELTANO__PYTHON_VERSION}.txt
```
I am seeing 3 different issues:

1. All the tables in the target are of the form `schema_name-table_name`. Our initial builds were always generating `table_name` when the target schema was the same as the source, but now, whichever schema we try to push to, it creates tables in the `schema_name-table_name` format.
2. Every sync, we are seeing duplicate records being created, the only difference being the `_sdc_*` columns. I have set `load_method` to `upsert`, but to no avail; every sync I see duplicate records.
3. Every run is now failing with the exception below. I am hoping that fixing (1) will resolve this. Otherwise, is there any way to exclude the table? I have tried `exclude: [table-name]`, but that did not work.
```
sqlalchemy.exc.IdentifierError: Identifier 'bigchange_391_interdoor_ltd-sales_opportunity_note_workflow_answers_v1' exceeds maximum length of 63 characters cmd_type=elb consumer=True job_name=dev:tap-snowflake-to-target-postgres name=target-postgres producer=False run_id=41a540a7-6e96-4acc-abc3-d14e4a473c1d stdio=stderr string_id=target-postgres
```
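For context on (2), the loader config is along these lines (a trimmed sketch; the stream-map block for supplying key properties is an idea from the SDK docs that I haven't verified, and the stream/key names here are placeholders):

```yaml
loaders:
- name: target-postgres
  variant: meltanolabs
  config:
    load_method: upsert
    # Upsert can only deduplicate rows when the stream carries key
    # properties; if the tap emits none, SDK stream maps can supply
    # them (placeholder stream and key names):
    stream_maps:
      some_schema-some_table:
        __key_properties__: ["id"]
```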
**Reuben (Matatika):** On point 1, this probably explains that behaviour: https://meltano.slack.com/archives/C069CQNHDNF/p1747318577716309?thread_ts=1747314062.612539&cid=C069CQNHDNF

TL;DR - you probably have `default_target_schema` set, so it's now taking the stream names from `tap-snowflake` at face value rather than inferring the schema from them.
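i.e. if your loader config has something like this (sketch; `analytics` is a placeholder schema name):

```yaml
loaders:
- name: target-postgres
  variant: meltanolabs
  config:
    default_target_schema: analytics
```

then as of 0.4.1 a stream named `schema_name-table_name` lands as a table literally called `schema_name-table_name` inside `analytics`, rather than as `table_name` inside `schema_name`.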
**s:** On (1), thanks for pointing this out. I do have `default_target_schema` defined; trying it out without that. Also thanks for pointing out the #C069CQNHDNF channel, for some reason I glazed over it 🙂
**Reuben (Matatika):** You could also pin to a previous version of `target-postgres` before 0.4.1 if you wanted to keep the behaviour the same, i.e.

```yaml
pip_url: 'meltanolabs-target-postgres<0.4.1'
```

or

```yaml
pip_url: 'meltanolabs-target-postgres singer-sdk<0.46.0'
```
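(Note that after changing `pip_url` you need to reinstall the plugin for the pin to take effect; `meltano install loader target-postgres --clean` should do it.)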
**s:** @Reuben (Matatika) Thanks for the help. Removing `default_target_schema` did not help, but pinning the `target-postgres` plugin did the trick.
**Reuben (Matatika):** Hmm, that seems strange. What did you end up pinning to?
**s:** I just went with `pip_url: 'meltanolabs-target-postgres<0.4.1'`; I did not try pinning the singer-sdk.
**Reuben (Matatika):** Thanks, I'll give it a test myself sometime.

(cc @Edgar Ramírez (Arch.dev), might be something to be aware of)
**Edgar Ramírez (Arch.dev):** Ok, so I agree that if `default_target_schema` is set, the behavior should probably be to load into `<default_target_schema>.<src_table_name>` instead of `<default_target_schema>.<src_schema_name>-<table_name>`. Actually, no: that gets us back to the behavior before https://github.com/meltano/sdk/issues/2909 was fixed 😬. I don't know what the correct approach is here.
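In the meantime, (3) can probably be worked around by renaming the offending stream with a stream map so the identifier fits within Postgres's 63-character limit. An untested sketch (the alias is made up; the stream name is taken from the error message):

```yaml
loaders:
- name: target-postgres
  variant: meltanolabs
  config:
    stream_maps:
      # Alias the stream from the traceback to something shorter
      # than 63 characters (placeholder alias):
      bigchange_391_interdoor_ltd-sales_opportunity_note_workflow_answers_v1:
        __alias__: bigchange_391_sales_opp_note_workflow_answers_v1
```

Alternatively, instead of `exclude`, the stream can be deselected on the tap side via the `select` extra, e.g. `select: ['!bigchange_391_interdoor_ltd-sales_opportunity_note_workflow_answers_v1.*']` in the `tap-snowflake` definition (exact stream name may differ).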