# troubleshooting
**farshad_ghorbani:** Hello everyone, I've created a data pipeline to transfer data from MongoDB to BigQuery. However, I've noticed that all boolean values in the documents are capitalized, which makes them difficult to work with in BigQuery. I'm currently able to extract the values, but I need to convert them to lowercase in order to use them. Additionally, I'm wondering if it's possible to save the flattened document as columns instead of as a JSON string. Does anyone have experience with this or know of a way to accomplish it? Any help or suggestions would be greatly appreciated. Thank you!
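(Editor's note: a minimal sketch, not from the thread, of the lowercase-boolean fix. It assumes the capitalized booleans arrive as the literal strings `"True"`/`"False"`; a recursive pass over each document converts them to real booleans, which then serialize as lowercase `true`/`false` in JSON for BigQuery.)

```python
import json

def lower_bools(obj):
    """Recursively replace the strings "True"/"False" with real booleans,
    so they serialize as lowercase true/false in JSON."""
    if isinstance(obj, dict):
        return {k: lower_bools(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [lower_bools(v) for v in obj]
    if obj == "True":
        return True
    if obj == "False":
        return False
    return obj

doc = {"active": "True", "tags": [{"hidden": "False"}], "qty": 3}
print(json.dumps(lower_bools(doc)))
# → {"active": true, "tags": [{"hidden": false}], "qty": 3}
```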
**edgar_ramirez_mondragon:** Hi @farshad_ghorbani. What variant of the bigquery loader are you using? I know for a fact that the `jmriego` variant doesn't load everything in a single JSON string column.
**farshad_ghorbani:** Hi @edgar_ramirez_mondragon, I am currently using the `Adswerve` variant, and I have tried other variants, but I had not tried the `jmriego` variant. I tested `jmriego`, but I got this error:

```
* Neither 'default_target_schema' (string) nor 'schema_mapping' (object) keys set in config.
```
**edgar_ramirez_mondragon:** That means you're missing at least one of those config options.
**farshad_ghorbani:** I double-checked my configuration multiple times, but the issue persisted.
**edgar_ramirez_mondragon:** Can you share your (redacted) `meltano.yml` contents?
**farshad_ghorbani:** This is the latest edit of my target-bigquery config:

```yaml
- name: target-orders-v2
  config:
    credentials_path: ./client_secrets.json
    dataset_id: db_ordermanagement
    project_id: db_ordres
    add_metadata_columns: true
    hard_delete: false
    data_flattening_max_level: 0
    primary_key_required: false
    default_target_schema_select_permission: true
```
**edgar_ramirez_mondragon:** Ok, so the config is missing `default_target_schema`.
**farshad_ghorbani:** Thank you for your help. However, the documentation says: 'Name of the schema where the tables will be created. If schema_mapping is not defined, then every stream sent by the tap is loaded into this schema.' Could you clarify whether this means the value is required?
**edgar_ramirez_mondragon:** One of `schema_mapping` and `default_target_schema` is required. If `schema_mapping` is not defined (or it's not exhaustive), then `default_target_schema` is used to decide where the tables will be created.
**farshad_ghorbani:** If you have any examples, could you please share them with me?
**edgar_ramirez_mondragon:** Sure. I set it in the plugin's default value since I haven't migrated away from a custom plugin for target-bigquery, but the same value should work in the `config` block: https://github.com/edgarrmondragon/meltano-dataops/blob/1fc245c037a9b3df217daa3c560919b926ebbed0/meltano.yml#L33-L37 — `$MELTANO_EXTRACT__LOAD_SCHEMA` will be set to the namespace of the extractor, e.g. `tap_github` will be the schema where GitHub data will land in the DWH.
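(Editor's note: a hedged sketch of the fix being suggested, i.e. the config shared earlier in the thread with a `default_target_schema` line added. The schema value shown here is illustrative, not confirmed by the thread.)

```yaml
- name: target-orders-v2
  config:
    credentials_path: ./client_secrets.json
    dataset_id: db_ordermanagement
    project_id: db_ordres
    add_metadata_columns: true
    hard_delete: false
    data_flattening_max_level: 0
    primary_key_required: false
    default_target_schema_select_permission: true
    # One of default_target_schema or schema_mapping is required;
    # the schema name below is a hypothetical example.
    default_target_schema: db_ordermanagement
```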
**farshad_ghorbani:** Thank you.