# troubleshooting
r
My current value for `stream_maps` is
```yaml
stream_maps:
  databasename-table_name:
  __alias__: table_name
```
and executing `meltano run tap-mysql target-bigquery` gives
```text
Loading failed                 code=1 message=singer_sdk.exceptions.RecordsWithoutSchemaException: A record for stream 'databasename-table_name' was encountered before a corresponding schema. name=meltano run_id=98b018b2-40dd-4c15-a9f8-29bb4e2b4a81 state_id=2025-01-30T110140--tap-mysql--target-bigquery
```
If I remove it, the command executes, but the table name in BigQuery is created as `databasename_table_name` instead of just `table_name`.
r
Indentation looks wrong. Does
```yaml
stream_maps:
  databasename-table_name:
    __alias__: table_name
```
work? https://sdk.meltano.com/en/v0.44.1/stream_maps.html#aliasing-a-stream-using-alias
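(To make the indentation point concrete, here is a quick check with plain PyYAML showing how the two versions parse; this is just an illustration, not anything Meltano runs.)
```python
# Quick illustration (plain PyYAML) of why the indentation matters.
import yaml

original = """
stream_maps:
  databasename-table_name:
  __alias__: table_name
"""

fixed = """
stream_maps:
  databasename-table_name:
    __alias__: table_name
"""

print(yaml.safe_load(original))
# {'stream_maps': {'databasename-table_name': None, '__alias__': 'table_name'}}
# __alias__ ends up as a sibling of the stream, so the stream itself has no map.

print(yaml.safe_load(fixed))
# {'stream_maps': {'databasename-table_name': {'__alias__': 'table_name'}}}
# __alias__ is nested under the stream, which is what the stream map expects.
```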
r
@Reuben (Matatika) I tried with 2 spaces as well, but I get the same error. Here is my complete meltano.yml config:
```yaml
version: 1
default_environment: dev
project_id: ***
environments:
- name: dev
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url: git+https://github.com/transferwise/pipelinewise.git#subdirectory=singer-connectors/tap-mysql
    config:
      host: ***
      user: ***
      database: databasename
      password: pwd
      port: ***
      tables: tbl_name
    select:
    - databasename-tbl_name.*
    metadata:
      databasename-tbl_name:
        replication-method: LOG_BASED
        replication-key: id
        id:
          is-replication-key: true
  loaders:
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    config:
      location: ***
      credentials_path: service_account.json
      dedupe_before_upsert: true
      upsert: true
      project: ***
      dataset: ***
      cluster_on_key_properties: true
      denormalized: true
      table_name_prefix: ''
      stream_maps:
        databasename-tbl_name:
          __alias__: tbl_name
```
If I remove `stream_maps`, the table and data are copied, but the table is created in BigQuery as `databasename_tbl_name`.
r
https://github.com/meltano/sdk/pull/2589 looks to have fixed this in SDK version 0.39.1. `target-bigquery` is currently using SDK version 0.22.0, so it won't have this fix. You should be able to use `meltano-map-transformer` (which is on a more recent SDK version) to apply `stream_maps` instead: https://github.com/MeltanoLabs/meltano-map-transform?tab=readme-ov-file#meltano-installation-instructions
```sh
meltano add mapper meltano-map-transformer
```
```yaml
version: 1
default_environment: dev
project_id: ***
environments:
- name: dev
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url: git+https://github.com/transferwise/pipelinewise.git#subdirectory=singer-connectors/tap-mysql
    config:
      host: ***
      user: ***
      database: databasename
      password: pwd
      port: ***
      tables: tbl_name
    select:
    - databasename-tbl_name.*
    metadata:
      databasename-tbl_name:
        replication-method: LOG_BASED
        replication-key: id
        id:
          is-replication-key: true
  mappers:
  - name: meltano-map-transformer
    pip_url: git+https://github.com/MeltanoLabs/meltano-map-transform.git
    mappings:
    - name: mysql-to-bigquery
      config:
        stream_maps:
          databasename-tbl_name:
            __alias__: tbl_name
  loaders:
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    config:
      location: ***
      credentials_path: service_account.json
      dedupe_before_upsert: true
      upsert: true
      project: ***
      dataset: ***
      cluster_on_key_properties: true
      denormalized: true
      table_name_prefix: ''
```
```sh
meltano run tap-mysql mysql-to-bigquery target-bigquery
```
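(For context on why this works: a stream map alias effectively rewrites the stream name on the Singer messages passing between the tap and the target, so target-bigquery only ever sees tbl_name. A rough conceptual sketch, not meltano-map-transform's actual code:)
```python
# Rough sketch of what a stream alias does to the messages flowing from
# tap to target (illustrative only, not meltano-map-transform's implementation).
import json

ALIASES = {"databasename-tbl_name": "tbl_name"}

def apply_alias(line: str) -> str:
    """Rename the stream on any message that carries a 'stream' field."""
    message = json.loads(line)
    if "stream" in message:
        message["stream"] = ALIASES.get(message["stream"], message["stream"])
    return json.dumps(message)

# A RECORD message as emitted by tap-mysql...
tap_line = json.dumps({"type": "RECORD", "stream": "databasename-tbl_name", "record": {"id": 1}})
print(apply_alias(tap_line))
# ...arrives at target-bigquery under the stream 'tbl_name', so that's the table it creates.
```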
r
@Reuben (Matatika) Thanks, it worked!
r
Great 🔥
I made an issue if you're interested in following along and (hopefully) switching back to your old config eventually: https://github.com/z3z1ma/target-bigquery/issues/111
r
Yes, I'm interested in following along! Hopefully, we can switch back to the old config eventually.
@Reuben (Matatika) Also, is there a way to prevent fields like `_sdc_batched_at`, `_sdc_extracted_at`, `_sdc_deleted_at`, and `_sdc_received_at` from being created in the BigQuery table unless they are required by the system? I tried setting `add_record_metadata: false` and `add_metadata_columns: false`, but those columns are still generated and saved in the table.
r
Looks like the target doesn't advertise the `add_record_metadata` setting (issue with the Meltano Hub plugin definition/target SDK version again). You can manually add it to your `meltano.yml`:
```yaml
loaders:
  - name: target-bigquery
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/target-bigquery.git
    settings:
    - name: add_record_metadata
      kind: boolean
      value: false
    config:
      location: ***
      credentials_path: service_account.json
      dedupe_before_upsert: true
      upsert: true
      project: ***
      dataset: ***
      cluster_on_key_properties: true
      denormalized: true
      table_name_prefix: ''
```
For reference: https://github.com/meltano/sdk/issues/1199 https://github.com/meltano/sdk/pull/1881
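(For context, add_record_metadata is the SDK-level switch that controls whether the target stamps those _sdc_* columns onto each record before loading it; a rough sketch of the idea, not the SDK's actual implementation:)
```python
# Rough sketch of what add_record_metadata toggles in an SDK-based target
# (illustrative only, not the Singer SDK's actual implementation).
from datetime import datetime, timezone

ADD_RECORD_METADATA = False  # the setting declared in meltano.yml above

def prepare_record(record: dict) -> dict:
    """Optionally stamp Singer metadata columns onto a record before it is loaded."""
    if ADD_RECORD_METADATA:
        now = datetime.now(timezone.utc).isoformat()
        record["_sdc_extracted_at"] = now
        record["_sdc_received_at"] = now
        record["_sdc_batched_at"] = now
        record["_sdc_deleted_at"] = None
    return record

print(prepare_record({"id": 1}))
# With the setting off, only the source columns survive: {'id': 1}
```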
r
@Reuben (Matatika) It still adds the `_sdc_*` columns
r
Do you see `add_record_metadata` set to `false` when you run `meltano config target-bigquery list`?
r
```text
add_record_metadata [env: TARGET_BIGQUERY_ADD_RECORD_METADATA] current value: False (default)
```