Just getting started with `target-bigquery` . A f...
# getting-started
s
Just getting started with
target-bigquery
. A field was added in my source tap (one I’m developing custom) and I’m getting an error (below). Is this expected? I assumed that meltano/singer would automagically handle schema updates, or is this something to do with my tap?
Copy code
google.api_core.exceptions.BadRequest: 400 Provided Schema does not match Table mse-user-sterling-1:meltano_quickbase_json.wo_tags. Cannot add fields (field: wo_tag)
e
You might be seeing this because
If you add new columns to an existing table schema, the columns must be NULLABLE or REPEATED. You cannot add a REQUIRED column to an existing table schema. If you attempt to add a REQUIRED column to an existing table schema in the API or bq command-line tool, the following error is returned: BigQuery error in update operation: Provided Schema does not match Table project_id:dataset.table. Cannot add required columns to an existing schema.
https://cloud.google.com/bigquery/docs/managing-table-schemas#manually_adding_an_empty_column
So you might wanna make the new field nullable, e.g.
"type": ["string", "null"]
s
I don’t think that’s it. When I look in the logs, the schema does indicate that it it nullable.
Copy code
'configuration': {'load': {'schema': {'fields': [{'name': 'wo_tag', 'type': 'STRING', 'mode': 'NULLABLE'}
I tried using the
transferwise
variant and it was able to detect a new column and apply a schema change. So does this mean that the default bigquery target for meltano doesn’t support schema changes? Unfortunately the
transferwise
one doesn’t support date fields so I can’t really use that one either because my data has dates.
e
A fix just landed in `master`: https://github.com/jmriego/pipelinewise-target-bigquery/commit/8206827f64c51d74175a5e8972263bc5654342cc. You can try with the
pip_url: git+<https://github.com/jmriego/pipelinewise-target-bigquery.git>
s
Haven’t tried it yet, but just reviewing the code it looks like you’re casting dates as timestamps? Why not just use date?
e
j
that's already changed anyway, but the reason is that when I developed this target, the taps I was using did not differentiate between date and datetime. The Singer schema I was receiving for both was the same So because we use this and dbt together, I decided to set everything to timestamp so that it was possible to change the types in dbt later in the stage layer