# plugins-general
h
Hi, I am using batch strategy for a `tap-mssql:target-snowflake` pipeline. I noticed the `_sdc` metadata fields are empty, which makes sense for the same reason you can't use stream_maps on batch messages. But I am really interested in having metadata, especially for downstream dbt models. How might I achieve this? Or if it's currently impossible, what might I contribute to achieve this? Pondering the options, maybe allowing taps to add record metadata before batching?
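For context, here is a rough sketch of the difference between the two message shapes, which may help explain why per-record metadata goes missing. The stream name, file name, and timestamp are made up for illustration, and `has_inline_rows` is a hypothetical helper, not part of the Singer SDK.

```python
# A RECORD message carries the row itself, so a target has a record
# to attach _sdc_* values to as it passes through.
record_message = {
    "type": "RECORD",
    "stream": "dbo-Account",  # illustrative stream name
    "record": {"Id": 1},
    "time_extracted": "2024-01-01T00:00:00+00:00",  # illustrative value
}

# A BATCH message only points at files on disk or object storage;
# no row data travels through the message stream.
batch_message = {
    "type": "BATCH",
    "stream": "dbo-Account",
    "encoding": {"format": "jsonl", "compression": "gzip"},
    "manifest": ["file://project/dbo-Account-0001.jsonl.gz"],  # hypothetical file
}


def has_inline_rows(message: dict) -> bool:
    """Hypothetical check: does this message carry a row a target could annotate?"""
    return "record" in message
```

With only a manifest to work from, any metadata would have to be written into the batch files by the tap or added by the target at load time.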
v
interested in your meltano.yml as well!
h
extractors.meltano.yml
plugins:
  extractors:
  - name: tap-mssql
    variant: buzzcutnorman
    pip_url: tap-mssql[s3] git+https://github.com/BuzzCutNorman/tap-mssql.git
    config:
      port: 1433
      batch_config:
        encoding:
          format: jsonl
          compression: gzip
        storage:
          root: file://project
        batch_size: 250000
  - name: tap-mssql__academycrm
    inherit_from: tap-mssql
    load_schema: academycrm
    metadata:
      '*':
        key_properties: [Id]
        replication-key: ModifiedOn
        replication-method: INCREMENTAL
        Id:
          inclusion: automatic
      dbo-Audit:
        replication-key: CreatedOn
    select:
    - '*dbo-AcademicInstitution.*'
    - ...
    - '*dbo-UserTask.*'
  - name: tap-mssql__academycrm__ids
    inherit_from: tap-mssql__academycrm
    metadata:
      '*':
        replication-method: FULL_TABLE
        replication-key: ''
  - name: tap-mssql__academycrm__full
    inherit_from: tap-mssql__academycrm__ids
    config:
      batch_config:
        batch_size: 1000000
    select:
    - '*dbo-Account.*'
    - ...
    - '*dbo-vw_ValueMethod_lu.*'
loaders.meltano.yml
plugins:
  loaders:
  - name: target-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-target-snowflake
    env:
      SINGER_SDK_LOG_CONFIG: logging/snowflake.logging.yml
    config:
      role: ...
      warehouse: ...
      schema: $MELTANO_EXTRACT__LOAD_SCHEMA
      add_record_metadata: true
      default_target_schema: MELTANO_${ENV_PREFIX}
      validate_records: false
prod.meltano.yml
environments:
- name: prod
  config:
    plugins:
      loaders:
      - name: target-snowflake
        config:
          default_target_schema: ''
  env:
    ENV_PREFIX: MELTANO
    ENV_PREFIX_LOWER: meltano
In env vars
TARGET_SNOWFLAKE_ACCOUNT=...
TARGET_SNOWFLAKE_USER=...
TARGET_SNOWFLAKE_PASSWORD=...
TAP_MSSQL__ACADEMYCRM_DATABASE=...
TAP_MSSQL__ACADEMYCRM_HOST=...
TAP_MSSQL__ACADEMYCRM_PASSWORD=...
TAP_MSSQL__ACADEMYCRM_USER=...
e
@Holly Evans what's the metadata you're interested in? We may be able to tweak the insert statement to include some literals as `_sdc` columns.
h
@Edgar Ramírez (Arch.dev) `_sdc_batched_at` or `_sdc_sync_started_at`
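As a rough illustration of the "literals in the insert statement" idea, here is a minimal sketch that appends a constant `_sdc_batched_at` value to an `INSERT ... SELECT`. This is not target-snowflake's actual SQL: the helper name, table name, source name, and naive quoting are all assumptions, and the real statement that loads staged batch files will likely look different.

```python
from datetime import datetime, timezone


def insert_with_sdc_literals(
    table: str, source: str, columns: list[str], batched_at: datetime
) -> str:
    """Build an INSERT ... SELECT that adds a literal _sdc_batched_at column.

    Hypothetical helper for illustration only: identifiers are not quoted
    or escaped, so it is not safe for untrusted input.
    """
    literal = batched_at.strftime("%Y-%m-%d %H:%M:%S")
    target_cols = ", ".join([*columns, "_sdc_batched_at"])
    # Every row in the batch gets the same timestamp literal.
    select_cols = ", ".join([*columns, f"CAST('{literal}' AS TIMESTAMP)"])
    return f"INSERT INTO {table} ({target_cols}) SELECT {select_cols} FROM {source}"


sql = insert_with_sdc_literals(
    "dbo_account",   # hypothetical target table
    "staged_batch",  # hypothetical source for the staged batch rows
    ["Id", "ModifiedOn"],
    datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
)
print(sql)
```

Since the value is a literal, it is the same for every row in a batch, which seems to match what `_sdc_batched_at` or `_sdc_sync_started_at` would need.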