Hi everyone, it looks like we are encountering a p...
# troubleshooting
g
Hi everyone, it looks like we are encountering a problem with MySQL JSON fields. I am trying to create pipeline between MySQL and BigQuery. I did some testing and pipeline breaks when I add JSON field type do the table. It works with other data types. Does anyone have similar experience or can you advice workaround or fix. It is quite difficult to google that problem. I run
$ meltano elt tap-mysql target-bigquery
MySQL and BQ credentials are located in .env file melaton.yaml
Copy code
version: 1
send_anonymous_usage_stats: true
project_id: 58bd7ef7-855e-4c97-a687-148db3eb7843
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url: pipelinewise-tap-mysql
    metadata:
      '*':
        replication-method: FULL_TABLE
  loaders:
  - name: target-bigquery
    variant: adswerve
    pip_url: git+<https://github.com/adswerve/target-bigquery.git>
    config:
      add_metadata_columns: true
      location: asia-southeast1
      replication_method: truncate
      table_suffix: _src
      validate_records: false
error log:
Copy code
...
tap-mysql       | time=2021-07-13 12:28:19 name=singer level=INFO message=METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"database": "meltano_poc_db", "table": "json_example"}}
target-bigquery | CRITICAL 'RECORD'
tap-mysql       | time=2021-07-13 12:28:19 name=singer level=INFO message=METRIC: {"type": "timer", "metric": "job_duration", "value": 4.003504991531372, "tags": {"job_type": "sync_table", "database": "meltano_poc_db", "table": "json_example", "status": "succeeded"}}
target-bigquery | CRITICAL ['Traceback (most recent call last):\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/__init__.py", line 103, in main\n    for state in state_iterator:\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/process.py", line 54, in process\n    for s in handler.handle_record_message(msg):\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/processhandler.py", line 171, in handle_record_message\n    nr = format_record_to_schema(nr, self.bq_schema_dicts[stream])\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/schema.py", line 349, in format_record_to_schema\n    record[k] = conversion_dict[bq_schema[k]["type"]](v)\n', "KeyError: 'RECORD'\n"]
meltano         | Loading failed (2): CRITICAL ['Traceback (most recent call last):\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/__init__.py", line 103, in main\n    for state in state_iterator:\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/process.py", line 54, in process\n    for s in handler.handle_record_message(msg):\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/processhandler.py", line 171, in handle_record_message\n    nr = format_record_to_schema(nr, self.bq_schema_dicts[stream])\n', '  File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-bigquery/venv/lib/python3.9/site-packages/target_bigquery/schema.py", line 349, in format_record_to_schema\n    record[k] = conversion_dict[bq_schema[k]["type"]](v)\n', "KeyError: 'RECORD'\n"]
...
I changed target to jsonl and it looks like it is still a problem:
$ meltano elt tap-mysql target-jsonl
Copy code
meltano | The `discovery.yml` manifest received from <https://www.meltano.com/discovery.yml> has version 18, while this version of Meltano requires version 16.
meltano | Please install the latest compatible version of Meltano using `meltano upgrade`.
meltano | Falling back on the cached `discovery.yml` manifest...
meltano      | Running extract & load...
meltano      | No state was found, complete import.
tap-mysql    | time=2021-07-13 12:44:47 name=tap_mysql level=INFO message=Server Parameters: version: 5.7.33-google-log, wait_timeout: 28800, innodb_lock_wait_timeout: 3600, max_allowed_packet: 33554432, interactive_timeout: 28800
tap-mysql    | time=2021-07-13 12:44:48 name=tap_mysql level=INFO message=Server SSL Parameters(blank means SSL is not active): [ssl_version: ], [ssl_cipher: ]
tap-mysql    | time=2021-07-13 12:44:52 name=tap_mysql level=INFO message=Beginning sync for InnoDB table meltano_poc_db.json_example
tap-mysql    | time=2021-07-13 12:44:52 name=tap_mysql level=INFO message=Stream meltano_poc_db-json_example is using full table replication
target-jsonl | WARNING Unknown message type ACTIVATE_VERSION in message {'type': 'ACTIVATE_VERSION', 'stream': 'meltano_poc_db-json_example', 'version': 1626151492705}
tap-mysql    | time=2021-07-13 12:44:54 name=tap_mysql level=INFO message=Detected auto-incrementing primary key(s) - will replicate incrementally
tap-mysql    | time=2021-07-13 12:44:55 name=tap_mysql level=INFO message=Running SELECT `id`,`payload` FROM `meltano_poc_db`.`json_example` WHERE `id` <= 2 ORDER BY `id` ASC
tap-mysql    | /Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/extractors/tap-mysql/venv/lib/python3.9/site-packages/pymysql/connections.py:1077: UserWarning: Previous unbuffered result was left incomplete
tap-mysql    |   warnings.warn("Previous unbuffered result was left incomplete")
tap-mysql    | time=2021-07-13 12:44:55 name=singer level=INFO message=METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"database": "meltano_poc_db", "table": "json_example"}}
target-jsonl | Traceback (most recent call last):
target-jsonl |   File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-jsonl/venv/bin/target-jsonl", line 8, in <module>
tap-mysql    | time=2021-07-13 12:44:55 name=singer level=INFO message=METRIC: {"type": "timer", "metric": "job_duration", "value": 4.377760887145996, "tags": {"job_type": "sync_table", "database": "meltano_poc_db", "table": "json_example", "status": "succeeded"}}
target-jsonl |     sys.exit(main())
target-jsonl |   File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-jsonl/venv/lib/python3.9/site-packages/target_jsonl.py", line 94, in main
target-jsonl |     state = persist_messages(input_messages, config.get('destination_path', ''), config.get('do_timestamp_file', True))
target-jsonl |   File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-jsonl/venv/lib/python3.9/site-packages/target_jsonl.py", line 59, in persist_messages
target-jsonl |     validators[o['stream']].validate(float_to_decimal(o['record']))
target-jsonl |   File "/Users/greg/projects/meltano-projects/data-analytics-meltano-elt/.meltano/loaders/target-jsonl/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 130, in validate
target-jsonl |     raise error
target-jsonl | jsonschema.exceptions.ValidationError: '{"key": "val2"}' is not of type 'null', 'object'
target-jsonl |
target-jsonl | Failed validating 'type' in schema['properties']['payload']:
target-jsonl |     {'inclusion': 'available', 'type': ['null', 'object']}
target-jsonl |
target-jsonl | On instance['payload']:
target-jsonl |     '{"key": "val2"}'
I am not sure if that could be the reason https://github.com/singer-io/tap-mysql/issues/110
a
Hi, @grzegorz_frydrychowicz. I spent some time yesterday digging into this but couldn't find a clear cause. It looks as though this JSON support was added in another pull requests. And also, it looks like the issue linked above is for the Singer variant, where your Meltano.yml file references the pipelinewise variant. I hate to suggest this randomly without a good root cause, but you could try switching variants and see if another one fixes the issue. (Note you'll need to explicitly reinstall the loader after switching.)