# getting-started
t
Hi All, I am trying to run meltano tap-snowflake target-bigquery via a bash.sh file, passing the configuration dynamically so I can specify the table. The meltano.yml file is configured with the correct connection information. However, when I run my bash script it runs for all the tables in my Snowflake schema, not just the one in the code below:
MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG=$(cat <<EOF
{
  "account": "$TAP_SNOWFLAKE_ACCOUNT",
  "user": "$TAP_SNOWFLAKE_USER",
  "password": "$TAP_SNOWFLAKE_PASSWORD",
  "database": "$TAP_SNOWFLAKE_DATABASE",
  "warehouse": "$TAP_SNOWFLAKE_WAREHOUSE",
  "schema": "$TAP_SNOWFLAKE_SCHEMA",
  "tables": "$TAP_SNOWFLAKE_TABLE_NAME"
}
EOF
)

echo $MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG

export MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG
meltano run tap-snowflake target-bigquery
I have also tried
MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG=$(cat <<EOF
{
  "account": "$TAP_SNOWFLAKE_ACCOUNT",
  "user": "$TAP_SNOWFLAKE_USER",
  "password": "$TAP_SNOWFLAKE_PASSWORD",
  "database": "$TAP_SNOWFLAKE_DATABASE",
  "warehouse": "$TAP_SNOWFLAKE_WAREHOUSE",
  "schema": "$TAP_SNOWFLAKE_SCHEMA",
  "tables": 
        "-" "$TAP_SNOWFLAKE_TABLE_NAME"
}
EOF
)

echo $MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG

export MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG
meltano run tap-snowflake target-bigquery
and
MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG=$(cat <<EOF
{
  "account": "$TAP_SNOWFLAKE_ACCOUNT",
  "user": "$TAP_SNOWFLAKE_USER",
  "password": "$TAP_SNOWFLAKE_PASSWORD",
  "database": "$TAP_SNOWFLAKE_DATABASE",
  "warehouse": "$TAP_SNOWFLAKE_WAREHOUSE",
  "schema": "$TAP_SNOWFLAKE_SCHEMA",
  "tables": ["$TAP_SNOWFLAKE_TABLE_NAME"]
}
EOF
)

echo $MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG

export MELTANO_EXTRACTOR_TAP_SNOWFLAKE_CONFIG
meltano run tap-snowflake target-bigquery
Thank you for any help. Tim
e
Hi Tim! I'm curious why you're trying to set the entire JSON config as an env var instead of using individual env vars for each setting. I don't think the former works, but I'm curious if you saw that somewhere in the docs 🙂
t
Hi Edgar, I was trying to follow the idea in this document: https://docs.meltano.com/concepts/environments/?meltano-tabs=env (Environments | Meltano Documentation)
e
I see. You'd rather set the individual env vars. You can see the list of env vars a tap accepts with the command:
meltano config tap-snowflake list
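For example, a minimal sketch of that approach. The setting names follow Meltano's usual TAP_SNOWFLAKE_<SETTING> convention and the values are placeholders; confirm the exact names with meltano config tap-snowflake list in your own project:

```shell
# One env var per setting instead of a single JSON blob.
# Names assume Meltano's TAP_SNOWFLAKE_<SETTING> convention; verify with
# `meltano config tap-snowflake list`. Values are placeholders.
export TAP_SNOWFLAKE_ACCOUNT="my_account"
export TAP_SNOWFLAKE_USER="my_user"
export TAP_SNOWFLAKE_DATABASE="MY_DB"
export TAP_SNOWFLAKE_WAREHOUSE="MY_WH"
export TAP_SNOWFLAKE_SCHEMA="SCH"

# Array-valued settings are passed as JSON strings.
export TAP_SNOWFLAKE_TABLES='["SCH.TABLE1"]'

echo "tables setting: $TAP_SNOWFLAKE_TABLES"
# meltano run tap-snowflake target-bigquery   # commented: requires a Meltano project
```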
t
Hi Edgar, thank you for the pointer. I altered my bash file to the below, trying to set the config before calling the run statement:
meltano config tap-snowflake set tables '[{"schema": "Schema_One", "table_name": "Table_One"}]'
meltano config tap-snowflake
meltano run tap-snowflake target-bigquery
However, it doesn't seem to update the meltano.yml file. It states that the config has been updated, but meltano config tap-snowflake doesn't show the update, and it then runs for all tables. Any idea what I am doing wrong?
e
Hmm. I can confirm that this works
$ meltano config tap-snowflake set tables '[{"schema": "Schema_One", "table_name": "Table_One"}]'
2024-05-31T00:44:29.565981Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
Extractor 'tap-snowflake' setting 'tables' was set in `meltano.yml`: [{'schema': 'Schema_One', 'table_name': 'Table_One'}]
$ meltano@py311 config tap-snowflake
2024-05-31T00:44:35.362239Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
{
  "tables": [
    {
      "schema": "Schema_One",
      "table_name": "Table_One"
    }
  ]
}
1. What version of Meltano are you running?
2. What does your meltano.yml file look like?
3. Try running meltano lock --update --all
t
Hi, Meltano version 3.4.0. meltano.yml:
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    config:
      account: {$TAP_SNOWFLAKE_ACCOUNT: null}
      database: {$TAP_SNOWFLAKE_DATABASE: null}
      role: {$TAP_SNOWFLAKE_ROLE: null}
      schema: {$TAP_SNOWFLAKE_SCHEMA: null}
      user: {$TAP_SNOWFLAKE_USER: null}
      warehouse: {$TAP_SNOWFLAKE_WAREHOUSE: null}
After adding the lock statement, meltano config tap-snowflake returns the correct table added, but the stream log runs a different table (the first table in the schema alphabetically) and never runs Table_One. I have also tried altering the meltano.yml to pass in the table name via a variable (shown below), but it fails, saying it can't run a discover over the catalog. I suppose if I could pass the table directly to the meltano.yml file, that would avoid having to run meltano config before each run.
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    config:
      account: {$TAP_SNOWFLAKE_ACCOUNT: null}
      database: {$TAP_SNOWFLAKE_DATABASE: null}
      role: {$TAP_SNOWFLAKE_ROLE: null}
      schema: {$TAP_SNOWFLAKE_SCHEMA: null}
      user: {$TAP_SNOWFLAKE_USER: null}
      warehouse: {$TAP_SNOWFLAKE_WAREHOUSE: null}
      tables:
        - {$TAP_SNOWFLAKE_Table_Name: null}
v
https://docs.meltano.com/reference/command-line-interface/#no-color may also be helpful ;) Also, for selecting tables I'd highly recommend the select_filter extra https://docs.meltano.com/concepts/plugins/#select_filter-extra - that's how I do what you're doing for ~200 tables in a db
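A sketch of what that select_filter approach can look like in meltano.yml. The stream name format here (schema-table) is an assumption based on how SQL taps commonly name streams; verify yours with meltano select tap-snowflake --list --all before relying on it:

```yaml
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    # select_filter narrows the already-selected catalog to these streams only;
    # it sits at the plugin level, alongside (not inside) `config`.
    # Stream name format is an assumption - check it with
    # `meltano select tap-snowflake --list --all`.
    select_filter:
    - SCH-TABLE1
```

Plugin extras can also be set per run via an environment variable (note the double underscore in the extras naming convention), which fits the dynamic-table use case, e.g. export TAP_SNOWFLAKE__SELECT_FILTER='["SCH-TABLE1"]'.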
The error you're running into about not being able to find a table during discover: if you swap to select_filter that will probably help, but my bet is you're hitting catalog caching issues, see https://github.com/meltano/meltano/issues/6919. As part of our script right now we just delete the .meltano/run directory before every run (or run each run in a separate container), as for us the discovery process is very fast compared to pulling data from the tables we care about. Ideally we'd manage this a bit better.
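That workaround can be scripted; a minimal sketch, assuming you invoke it from the Meltano project root:

```shell
# Clear Meltano's cached per-run working files before each run so catalog
# discovery happens fresh (workaround for the caching issue linked above).
# Assumes the current directory is the Meltano project root.
rm -rf .meltano/run

# Then run the pipeline as usual (commented: requires a Meltano project):
# meltano run tap-snowflake target-bigquery
```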
t
Hi, sorry for the delay in replying; it was a national holiday on Monday so I had the day off. Derek, I tried your suggestions, but all that seems to happen now is that it only returns one table (the first alphabetically). I have tried:
a) --full-refresh to reset the state file
b) deleting the .meltano/run directory
c) reinstalling the tap-snowflake extractor
d) listing all the tables from the schema in the meltano.yml file
e) meltano select tap-snowflake --refresh all
with no success. No idea what I've broken or how to get it back to the original state so it at least returns all the tables.
v
Sounds like we need to start much simpler than what I was thinking. What exactly is your issue now? It sounds like something different. Also, your meltano.yml would be very helpful.
t
Morning Derek. Below is my meltano.yml file. Looking at it before sending, I noticed there was a config below the dev environment pointing to the one table it would extract; I removed it and it now runs for all tables. However, what I am trying to achieve is to call meltano run tap-snowflake target-bigquery from a bash.sh file and pass in the table I wish to load. bash.sh file:
meltano config tap-snowflake set tables '["SCH.TABLE1"]'
meltano run tap-snowflake target-jsonl
meltano.yml:
version: 1
default_environment: dev
project_id: 2c9b2c68-011b-463b-a3ec-8708821583c4
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    config:
      account: {$TAP_SNOWFLAKE_ACCOUNT: null}
      database: {$TAP_SNOWFLAKE_DATABASE: null}
      role: {$TAP_SNOWFLAKE_ROLE: null}
      schema: {$TAP_SNOWFLAKE_SCHEMA: null}
      user: {$TAP_SNOWFLAKE_USER: null}
      warehouse: {$TAP_SNOWFLAKE_WAREHOUSE: null}
      tables:
          - ["SCH.TABLE1"]
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    config:
      credentials_path: ./.secrets/GCP_Cred.json
      project: {$TARGET_BIGQUERY_PROJECT: null}
      dataset: {$TARGET_BIGQUERY_DATASET: null}
      location: {$TARGET_BIGQUERY_LOCATION: null}
      overwrite: true
      method: gcs_stage
      bucket: {$LANDING_BUCKET: null}
      batch_size: 100000
This works now, but I am unable to pass the table name via an environment variable; the below fails. bash.sh:
export $TAP_SNOWFLAKE_TABLENAME='["SCH.TABLE1"]'
meltano run tap-snowflake target-jsonl
meltano.yml:
version: 1
default_environment: dev
project_id: 2c9b2c68-011b-463b-a3ec-8708821583c4
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    config:
      account: {$TAP_SNOWFLAKE_ACCOUNT: null}
      database: {$TAP_SNOWFLAKE_DATABASE: null}
      role: {$TAP_SNOWFLAKE_ROLE: null}
      schema: {$TAP_SNOWFLAKE_SCHEMA: null}
      user: {$TAP_SNOWFLAKE_USER: null}
      warehouse: {$TAP_SNOWFLAKE_WAREHOUSE: null}
      tables:
          - {$TAP_SNOWFLAKE_TABLENAME: null}
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    config:
      credentials_path: ./.secrets/GCP_Cred.json
      project: {$TARGET_BIGQUERY_PROJECT: null}
      dataset: {$TARGET_BIGQUERY_DATASET: null}
      location: {$TARGET_BIGQUERY_LOCATION: null}
      overwrite: true
      method: gcs_stage
      bucket: {$LANDING_BUCKET: null}
      batch_size: 100000
Thanks for any help
Hi, I have managed to solve the issue. bash.sh:
export TAP_SNOWFLAKE_TABLE=SCH.TABLE1
echo $TAP_SNOWFLAKE_TABLE
meltano run tap-snowflake target-bigquery
meltano.yml:
plugins:
  extractors:
  - name: tap-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-tap-snowflake
    config:
      account: {$TAP_SNOWFLAKE_ACCOUNT: null}
      database: {$TAP_SNOWFLAKE_DATABASE: null}
      role: {$TAP_SNOWFLAKE_ROLE: null}
      schema: {$TAP_SNOWFLAKE_SCHEMA: null}
      user: {$TAP_SNOWFLAKE_USER: null}
      warehouse: {$TAP_SNOWFLAKE_WAREHOUSE: null}
      tables:
        - $TAP_SNOWFLAKE_TABLE
Quite a few syntax errors! Thank you both for the help. Tim
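For anyone landing here later, the working pattern above can be wrapped in a small parameterized script. The run_table.sh name and the guard around the meltano call are illustrative additions, not part of the original setup:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: ./run_table.sh SCH.TABLE1
set -euo pipefail

# Table to extract: first CLI argument, defaulting to a sample name for illustration.
TABLE="${1:-SCH.TABLE1}"

# meltano.yml expands $TAP_SNOWFLAKE_TABLE inside the tap's `tables` list,
# so exporting it is the only per-run configuration needed.
export TAP_SNOWFLAKE_TABLE="$TABLE"
echo "extracting table: $TAP_SNOWFLAKE_TABLE"

# Guarded so the sketch stays runnable even where meltano is not installed.
if command -v meltano >/dev/null 2>&1; then
  meltano run tap-snowflake target-bigquery
fi
```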