# troubleshooting
q
Anyone running into
`google.api_core.exceptions.BadRequest: 400 PATCH <https://bigquery.googleapis.com/bigquery/v2/projects/project_id/datasets/salesforce/tables/opportunity?prettyPrint=false>: Field id already exists in schema`
when running `target-bigquery`? I've been looking, but I really can't explain that error. Here's my plugin configuration:
loaders:
  - name: target-bigquery
    variant: z3z1ma
    config:
      project: project-id
      dataset: salesforce
      location: us-west1
      method: batch_job
      batch_size: 500
      fail_fast: true
      overwrite: true
      flattening_enabled: true
      denormalized: true
      column_name_transforms:
        lower: true
        quote: false
        add_underscore_when_invalid: true
        snake_case: true
This issue only arises when using `denormalized`. The table is properly created in BigQuery, but the data won't be loaded into it. This would constrain us to leaving the data as JSON in a single column, or parsing it with an additional layer of Python/dbt normalisation models.
PS: there is a field named `id` in the flattened stream data, and only one. Not sure how BigQuery could find a duplicate. But if that were the case, how could it be handled via stream maps?
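(For what it's worth, if BigQuery really were receiving a duplicate field, one workaround could be to drop or rename the offending property before it reaches the loader. A rough sketch, assuming the Singer SDK `stream_maps` setting is supported by this variant, and using an illustrative `opportunity` stream name:

```yaml
loaders:
  - name: target-bigquery
    variant: z3z1ma
    config:
      stream_maps:
        opportunity:
          # hypothetical: setting a property to null removes it from the stream
          Id: null
          # or alias it under a new, non-conflicting name instead:
          # opportunity_id: Id
```

That would only paper over the symptom, though, not explain where the duplicate comes from.)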
s
@quentin_gaborit did you manage to fix this? Facing the same issue here
q
Not yet; I opened an issue on the repo. I'm going to try to take a closer look today, but it doesn't seem trivial to identify.
s
@quentin_gaborit hey, I just tried the adswerve variant of target-bigquery and it's working nicely. Haven't faced any issues so far.
q
Yep, I saw that, but I think adswerve only supports authentication via a credentials JSON file, while we only do IAM via SSO with Okta. I also tried `youcruit`, which does work but offers neither the four write methods nor the snake_case transformations. Additionally, I think the performance isn't as good, but that remains to be tested.
However, it seems I've identified the issue. The plugin aims to support schema evolution: when a new field appears, the BigQuery table schema is updated. The problem is that when `denormalized` is coupled with the snake_case transformation, the schema passed to the client when updating the table is a concatenation of both the transformed and the non-transformed fields, so names that only differ by case collide.
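To make the failure mode concrete, here's a minimal Python sketch of that concatenation bug. The `snake_case` helper and the Salesforce-style field names are illustrative stand-ins, not the plugin's actual code:

```python
import re

def snake_case(name: str) -> str:
    # Naive snake_case transform, standing in for the plugin's
    # column_name_transforms logic.
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s).lower()

# Field names as they appear in the stream schema
original = ["Id", "Amount", "StageName"]
transformed = [snake_case(f) for f in original]

# If the schema-update path concatenates both lists instead of replacing
# the originals, fields whose transformed name matches the original
# (case-insensitively, as BigQuery compares column names) collide:
merged = original + transformed

duplicates = []
seen = set()
for f in merged:
    key = f.lower()
    if key in seen:
        duplicates.append(f)
    seen.add(key)
# duplicates -> ['id', 'amount'], hence "Field id already exists in schema"
```

Deduplicating the merged schema case-insensitively (or only sending the transformed names) would avoid the 400 on the PATCH call.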