# best-practices
a
Hi all, quick question following up on some of the work above:
• Targets appear to be tightly coupled to config or ENV vars.
• If we plan to re-use targets, is there a recommended approach here? e.g.:
pipeline1: pulling from Salesforce and loading via target-s3-csv to an S3 bucket `salesforce-data` with account role ARN:123
pipeline2: pulling from Jira and loading via target-s3-csv to an S3 bucket `jira-data` with account role ARN:456
If dockerizing this, I'm under the impression that we should have a single `meltano` container with a bunch of workers which are used for the actual processing. TL;DR: what's the recommended approach for re-using targets with different configs? (Writing to the same location is not feasible for us, so unfortunately we cannot consider that option.)
b
For re-use you can use `inherit_from:`. It works for taps/targets; you only have to change the configuration. Something like this:
  - name: target-redshift
    variant: transferwise
    pip_url: pipelinewise-target-redshift
    config:
      primary_key_required: false
      batch_size_rows: 100000
  - name: target-redshift-raw
    inherit_from: target-redshift
    config:
      primary_key_required: true
      batch_size_rows: 250000
      parallelism: -1
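You could then point a pipeline at whichever inherited loader you need. A minimal sketch in meltano.yml (the schedule and tap names here are just placeholders, not from the thread):

schedules:
  - name: salesforce-to-redshift-raw
    extractor: tap-salesforce
    loader: target-redshift-raw   # the inherited loader defined above
    transform: skip
    interval: '@daily'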
a
Legend!
e
@boggdan_barrientos's solution is probably the more robust one, especially if the config values will be hardcoded once the Docker image is baked. Another approach, which database loaders use, is to default the target schema/bucket/etc. to `$MELTANO_EXTRACT__LOAD_SCHEMA` (target-postgres does this, for example). That variable is filled in at runtime from the extractor definition: https://meltano.com/docs/plugins.html#load-schema-extra. So if you have an extractor for Salesforce with namespace `salesforce`, one for Jira with namespace `jira`, and a target-s3-csv that defines `s3_bucket: $MELTANO_EXTRACT__LOAD_SCHEMA`, then each source will land in a separate bucket in S3.
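Roughly, that might look like this in meltano.yml (the tap names are placeholders; the namespaces match the example above, and the extractor's load_schema extra defaults to its namespace):

extractors:
  - name: tap-salesforce
    namespace: salesforce   # load_schema defaults to the namespace -> bucket "salesforce"
  - name: tap-jira
    namespace: jira
loaders:
  - name: target-s3-csv
    config:
      s3_bucket: $MELTANO_EXTRACT__LOAD_SCHEMA   # resolved per extractor at run time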