# plugins-general
m
So I'm following the pipeline-specific guide to try and override a specific environment variable for this pipeline, but it doesn't seem to be passing the value to the pipeline. Here's what I've got so far:
```yaml
# meltano.yml
...
  loaders:
  - name: target-bigquery
    namespace: bigquery
    pip_url: git+https://github.com/Mashey/target-bigquery.git
    executable: target-bigquery
    settings:
    - name: project_id
      value: vibrant-bonus-233400
    - name: dataset_id
    - name: validate_records
      value: false
    - name: stream_data
      value: true
    - name: disable_collection
      value: true
    - name: add_metadata_columns
      value: false
    - name: location
      value: US
schedules:
- name: shipstation-to-bigquery
  extractor: tap-shipstation
  loader: target-bigquery
  transform: skip
  interval: '@daily'
  start_date: 2020-10-07 16:24:54.035456
  env:
    TARGET_BIGQUERY_DATASET_ID: raw_shipstation
```
and when I run `meltano elt tap-shipstation target-bigquery --job_id shipstation-to-bigquery --dump=loader-config`, I get the following output:
```json
{
  "project_id": "vibrant-bonus-233400",
  "dataset_id": null,
  "validate_records": false,
  "stream_data": true,
  "disable_collection": true,
  "add_metadata_columns": false,
  "location": "US"
}
```
If I add `TARGET_BIGQUERY_DATASET_ID` to the `.env` file, it will take that value, but `env:` doesn't seem to be overriding it for this specific pipeline.
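For reference, the `.env` line that does get picked up is just this (value shown for illustration):
```
# .env
TARGET_BIGQUERY_DATASET_ID=raw_shipstation
```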
t
what does `meltano config target-bigquery list` show?
m
```
project_id [env: TARGET_BIGQUERY_PROJECT_ID, BIGQUERY_PROJECT_ID] current value: 'vibrant-bonus-233400' (from default)
dataset_id [env: TARGET_BIGQUERY_DATASET_ID, BIGQUERY_DATASET_ID] current value: '' (from `.env`)
validate_records [env: TARGET_BIGQUERY_VALIDATE_RECORDS, BIGQUERY_VALIDATE_RECORDS] current value: False (from default)
stream_data [env: TARGET_BIGQUERY_STREAM_DATA, BIGQUERY_STREAM_DATA] current value: True (from default)
disable_collection [env: TARGET_BIGQUERY_DISABLE_COLLECTION, BIGQUERY_DISABLE_COLLECTION] current value: True (from default)
add_metadata_columns [env: TARGET_BIGQUERY_ADD_METADATA_COLUMNS, BIGQUERY_ADD_METADATA_COLUMNS] current value: False (from default)
location [env: TARGET_BIGQUERY_LOCATION, BIGQUERY_LOCATION] current value: 'US' (from default)
```
d
@michael_cooper When you run `meltano elt tap-shipstation target-bigquery --job_id shipstation-to-bigquery`, the schedule definition is not actually used in any way, so neither is its `env`. The orchestrator itself is responsible for iterating over the schedules and populating the `env` before invoking the command: https://gitlab.com/meltano/files-airflow/-/blob/master/bundle/orchestrate/dags/meltano.py#L69
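In rough terms, that generator does something like the sketch below. This is a simplified illustration, not the actual bundle code; the `--format=json` flag and the exact JSON shape it returns are assumptions here.
```python
# Sketch: how an orchestrator-side DAG generator can apply a schedule's
# `env` before running `meltano elt`. Not the actual files-airflow code.
import json
import os
import subprocess

# Ask Meltano which schedules are defined in meltano.yml
# (assumes `meltano schedule list --format=json` and its output shape).
output = subprocess.run(
    ["meltano", "schedule", "list", "--format=json"],
    check=True,
    capture_output=True,
    text=True,
).stdout
schedules = json.loads(output)

for schedule in schedules:
    # Merge the schedule's `env` (e.g. TARGET_BIGQUERY_DATASET_ID) over the
    # current environment -- this is the step a plain `meltano elt` skips.
    env = {**os.environ, **schedule.get("env", {})}
    subprocess.run(
        [
            "meltano", "elt",
            schedule["extractor"],
            schedule["loader"],
            f"--job_id={schedule['name']}",
            f"--transform={schedule['transform']}",
        ],
        env=env,
        check=True,
    )
```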
Ideally you'd be able to run `meltano schedule run shipstation-to-bigquery` and get the behavior you're looking for, but that's not implemented yet: https://gitlab.com/meltano/meltano/-/issues/2227
I get that that may be confusing, but it's important to realize that `schedules` are exclusively used by orchestrator-specific DAG generators like https://gitlab.com/meltano/files-airflow/-/blob/master/bundle/orchestrate/dags/meltano.py
m
Is there a way to test whether it works then? I can't test it out within the UI either.
d
`meltano invoke airflow run ...` should work: https://airflow.apache.org/docs/stable/cli-ref#run
Or you can verify that it works by simulating what Airflow would actually invoke: `TARGET_BIGQUERY_DATASET_ID=raw_shipstation meltano elt tap-shipstation target-bigquery --job_id shipstation-to-bigquery --dump=loader-config`
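If the schedule's `env` is applied, the dumped config should now show the override, along these lines (expected output based on your dump above, not a captured run):
```json
{
  "project_id": "vibrant-bonus-233400",
  "dataset_id": "raw_shipstation",
  "validate_records": false,
  "stream_data": true,
  "disable_collection": true,
  "add_metadata_columns": false,
  "location": "US"
}
```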
> I can't test it out within the UI either.
Good point, the `env` is not currently taken into account when you use the "Run" button in the UI! I've filed a bug report: https://gitlab.com/meltano/meltano/-/issues/2379
m
Great, that all worked. Thank you (again!)
d
Happy to help! Thanks for the feedback, it's good to know that this wasn't clear, and I may want to look into https://gitlab.com/meltano/meltano/-/issues/2227 sooner rather than later. I'll also gladly take a contribution if you're so inclined 😉