Hello everyone! I'm having troubles with `tap-bigq...
# troubleshooting
a
Hello everyone! I'm having troubles with
tap-bigquery
, the only error I get is:
Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)
I followed this guide and there is no setting for a catalog. The only reference is under 'Capabilities', which I do not yet understand 😅. When running
meltano invoke tap-bigquery -h
I see a 'catalog' option which I then set with
meltano invoke tap-bigquery --catalog catalog.json
but it returns the same error:
CRITICAL Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\Scripts\tap-bigquery.exe\__main__.py", line 7, in <module>
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\singer\utils.py", line 235, in wrapped
    return fnc(*args, **kwargs)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\tap_bigquery\__init__.py", line 154, in main
    args = parse_args()
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\tap_bigquery\__init__.py", line 147, in parse_args
    args.catalog = Catalog.load(args.catalog)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\singer\catalog.py", line 96, in load
    return Catalog.from_dict(json.load(fp))
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Since there is no catalog setting on the Meltano Hub guide I really don't know how to do this, can someone help me with this issue?
p
@andrea_radaelli You shouldn't have to worry about passing a catalog to the tap unless you're doing something advanced, but for more details see these docs. Catalogs are an internal detail of how Singer works (which Meltano uses). Meltano manages generating the catalog and passing it to the tap.
Regarding your "discovery failed" error: usually there's more detail further up in the stack trace. Can you see if you can find anything else? I haven't personally used this tap, but it could be related to credentials.
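For context, `Expecting value: line 1 column 1 (char 0)` is the message Python's `json` module raises when the input is empty or does not start with JSON at all, so an empty or non-JSON catalog would explain it. A minimal sketch reproducing the message (the helper name `describe_catalog_text` is made up for illustration and is not part of Meltano or the tap):

```python
import json


def describe_catalog_text(text: str) -> str:
    """Return 'ok' if text parses as JSON, else the decoder's error message."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return str(err)
    return "ok"


# An empty catalog file produces the same message seen in the thread.
print(describe_catalog_text(""))
# A trivial but well-formed catalog parses without complaint.
print(describe_catalog_text('{"streams": []}'))
```

If the first call prints the familiar message, the JSON error is probably only a symptom, and the real cause is whatever stopped discovery from writing a catalog.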
a
Hi @pat_nadolny, and thank you for your support! Regarding credentials: this is a test ELT; my tap and my target are the BigQuery extractor and loader. When isolated,
target-bigquery
works fine with the same credentials I give to
tap-bigquery
I tried with
meltano invoke tap-bigquery > output.json
but I keep having this error:
Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)
So I tried with
meltano --log-level=debug run tap-bigquery target-bigquery
and I get:
```
2023-05-16T07:29:15.198905Z [debug ] Creating engine '<meltano.core.project.Project object at 0x000001CF1BD8ACD0>@sqlite:///C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano/meltano.db'
2023-05-16T07:29:15.257643Z [debug ] Found plugin parent parent=tap-bigquery plugin=tap-bigquery source=<DefinitionSource.LOCKFILE: 8>
2023-05-16T07:29:15.262674Z [debug ] found plugin in cli invocation plugin_name=tap-bigquery
2023-05-16T07:29:15.281945Z [debug ] Found plugin parent parent=target-bigquery plugin=target-bigquery source=<DefinitionSource.LOCKFILE: 8>
2023-05-16T07:29:15.291019Z [debug ] found plugin in cli invocation plugin_name=target-bigquery
2023-05-16T07:29:15.291019Z [debug ] head of set is extractor as expected block=<meltano.core.plugin.project_plugin.ProjectPlugin object at 0x000001CF2245B250>
2023-05-16T07:29:15.360385Z [debug ] found block block_type=loaders index=1
2023-05-16T07:29:15.363459Z [debug ] blocks idx=1 offset=0
2023-05-16T07:29:15.441497Z [debug ] Variable '$MELTANO_LOAD_SCHEMA' is not set in the provided env dictionary.
2023-05-16T07:29:15.447906Z [debug ] found ExtractLoadBlocks set offset=0
2023-05-16T07:29:15.447906Z [debug ] All ExtractLoadBlocks validated, starting execution.
2023-05-16T07:29:16.526774Z [debug ] Created configuration at C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\run\tap-bigquery\tap.14fd7376-856c-42ea-9001-cf3651061b3b.config.json
2023-05-16T07:29:16.529101Z [debug ] Could not find tap.properties.json in C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\tap.properties.json, skipping.
2023-05-16T07:29:16.531876Z [debug ] Could not find tap.properties.cache_key in C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\tap.properties.cache_key, skipping.
2023-05-16T07:29:16.537455Z [debug ] Could not find state.json in C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\state.json, skipping.
2023-05-16T07:29:16.544904Z [warning ] No state was found, complete import.
2023-05-16T07:29:16.548841Z [debug ] Variable '$P' is not set in the provided env dictionary.
2023-05-16T07:29:16.548841Z [debug ] Variable '$G' is not set in the provided env dictionary.
2023-05-16T07:29:16.551045Z [debug ] Variable '$P' is not set in the provided env dictionary.
2023-05-16T07:29:16.555209Z [debug ] Variable '$G' is not set in the provided env dictionary.
2023-05-16T07:29:16.558623Z [debug ] Invoking: ['C:\\Users\\AndreaRadaelli\\Desktop\\Development\\Venvs\\Meltano_on_GCP\\Meltano_on_GCP_PROJECT\\.meltano\\extractors\\tap-bigquery\\venv\\Scripts\\tap-bigquery.exe', '--config', 'C:\\Users\\AndreaRadaelli\\Desktop\\Development\\Venvs\\Meltano_on_GCP\\Meltano_on_GCP_PROJECT\\.meltano\\run\\tap-bigquery\\tap.14fd7376-856c-42ea-9001-cf3651061b3b.config.json', '--discover']
ERROR state or start_datetime must be specified
2023-05-16T07:29:17.390668Z [debug ] Deleted configuration at C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\run\tap-bigquery\tap.14fd7376-856c-42ea-9001-cf3651061b3b.config.json
2023…
```
```Run invocation could not be completed as block failed: Cannot start plugin tap-bigquery: Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0) ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\core\plugin\singer\tap.py:394 in discover_catalog │ │ │ │ 391 │ │ # test for the result to be a valid catalog │ │ 392 │ │ try: │ │ 393 │ │ │ with catalog_path.open("r") as catalog_file: │ │ ❱ 394 │ │ │ │ catalog = json.load(catalog_file) │ │ 395 │ │ │ │ Draft4Validator.check_schema(catalog) │ │ 396 │ │ except Exception as err: │ │ 397 │ │ │ catalog_path.unlink() │ │ │ │ C\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\init.py293 in load │ │ │ │ 290 │ To use a custom ``JSONDecoder`` subclass, specify it with the ``cls`` │ │ 291 │ kwarg; otherwise ``JSONDecoder`` is used. │ │ 292 │ """ │ │ ❱ 293 │ return loads(fp.read(), │ │ 294 │ │ cls=cls, object_hook=object_hook, │ │ 295 │ │ parse_float=parse_float, parse_int=parse_int, │ │ 296 │ │ parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw) │ │ │ │ C\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\init.py346 in loads │ │ │ │ 343 │ if (cls is None and object_hook is None and │ │ 344 │ │ │ parse_int is None and parse_float is None and │ │ 345 │ │ │ parse_constant is None and object_pairs_hook is None and not kw): │ │ ❱ 346 │ │ return _default_decoder.decode(s) │ │ 347 │ if cls is None: │ │ 348 │ │ cls = JSONDecoder │ │ 349 │ if object_hook is not None: │ │ │ │ C\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\json\decoder.py337 in decode │ │ │ │ 334 │ │ containing a JSON doc…
```The above exception was the direct cause of the following exception: ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\core\block\singer.py:336 in start │ │ │ │ 333 │ │ │ stdin = asyncio.subprocess.PIPE │ │ 334 │ │ │ │ 335 │ │ try: │ │ ❱ 336 │ │ │ self.process_handle = await self.invoker.invoke_async( │ │ 337 │ │ │ │ limit=line_length_limit, │ │ 338 │ │ │ │ stdin=stdin, # Singer messages │ │ 339 │ │ │ │ stdout=asyncio.subprocess.PIPE, # Singer state │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\core\plugin_invoker.py:443 in invoke_async │ │ │ │ 440 │ │ Returns: │ │ 441 │ │ │ Subprocess. │ │ 442 │ │ """ │ │ ❱ 443 │ │ async with self._invoke(*args, **kwargs) as ( │ │ 444 │ │ │ popen_args, │ │ 445 │ │ │ popen_options, │ │ 446 │ │ │ popen_env, │ │ │ │ C\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\contextlib.py181 in │ │ aenter │ │ │ │ 178 │ │ # they are only needed for recreation, which is not possible anymore │ │ 179 │ │ del self.args, self.kwds, self.func │ │ 180 │ │ try: │ │ ❱ 181 │ │ │ return await self.gen.anext() │ │ 182 │ │ except StopAsyncIteration: │ │ 183 │ │ │ raise RuntimeError("generator didn't yield") from None │ │ 184 │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\c…
```The above exception was the direct cause of the following exception: ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\cli\run.py:172 in _run_blocks │ │ │ │ 169 │ │ │ continue │ │ 170 │ │ │ │ 171 │ │ try: │ │ ❱ 172 │ │ │ await blk.run() │ │ 173 │ │ except RunnerError as err: │ │ 174 │ │ │ logger.error( │ │ 175 │ │ │ │ "Block run completed.", │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\core\block\extract_load.py:444 in run │ │ │ │ 441 │ │ │ # TODO: legacy
meltano elt
style logging should be deprecated │ │ 442 │ │ │ legacy_log_handler = self.output_logger.out("meltano", logger) │ │ 443 │ │ │ with legacy_log_handler.redirect_logging(): │ │ ❱ 444 │ │ │ │ await self.run_with_job() │ │ 445 │ │ │ │ return │ │ 446 │ │ else: │ │ 447 │ │ │ logger.warning( │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\core\block\extract_load.py:474 in run_with_job │ │ │ │ 471 │ │ │ │ 472 │ │ with closing(self.context.session) as session: │ │ 473 │ │ │ async with job.run(session): │ │ ❱ 474 │ │ │ │ await self.execute() │ │ 475 │ │ │ 476 │ async def terminate(self, graceful: bool = False) -> None: │ │ 477 │ │ """Terminate an in flight ExtractLoad execution, potentially disruptive. │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ c…
```The above exception was the direct cause of the following exception: ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\cli\__init__.py:105 in _run_cli │ │ │ │ 102 │ """ │ │ 103 │ try: │ │ 104 │ │ try: # noqa: WPS225, WPS505 │ │ ❱ 105 │ │ │ cli(obj={"project": None}) │ │ 106 │ │ except ProjectReadonly as err: │ │ 107 │ │ │ raise CliError( │ │ 108 │ │ │ │ f"The requested action could not be completed: {err}", │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\click\core.py:1130 in call │ │ │ │ 1127 │ │ │ 1128 │ def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any: │ │ 1129 │ │ """Alias for meth`main`.""" │ │ ❱ 1130 │ │ return self.main(*args, **kwargs) │ │ 1131 │ │ 1132 │ │ 1133 class Command(BaseCommand): │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\cli\cli.py:43 in main │ │ │ │ 40 │ │ │ args: Positional arguments for the Click group. │ │ 41 │ │ │ kwargs: Keyword arguments for the Click group. │ │ 42 │ │ """ │ │ ❱ 43 │ │ return super().main(*args, windows_expand_args=False, **kwargs) │ │ 44 │ │ 45 │ │ 46 @click.group( │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\click\core.py:1055 in main …
```│ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\click\core.py:760 in invoke │ │ │ │ 757 │ │ │ │ 758 │ │ with augment_usage_errors(__self): │ │ 759 │ │ │ with ctx: │ │ ❱ 760 │ │ │ │ return __callback(*args, **kwargs) │ │ 761 │ │ │ 762 │ def forward( │ │ 763 │ │ __self, __cmd: "Command", *args: t.Any, **kwargs: t.Any # noqa: B902 │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\cli\params.py:27 in decorate │ │ │ │ 24 │ │ if database_uri: │ │ 25 │ │ │ ProjectSettingsService.config_override["database_uri"] = database_uri │ │ 26 │ │ │ │ ❱ 27 │ │ return func(*args, **kwargs) │ │ 28 │ │ │ 29 │ return functools.update_wrapper(decorate, func) │ │ 30 │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\meltano\cli\params.py:76 in decorate │ │ │ │ 73 │ │ │ │ except MigrationError as err: │ │ 74 │ │ │ │ │ raise CliError(str(err)) from err │ │ 75 │ │ │ │ │ ❱ 76 │ │ │ func(project, *args, **kwargs) │ │ 77 │ │ │ │ 78 │ │ return functools.update_wrapper(decorate, func) │ │ 79 │ │ │ │ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │ │ ckages\click\decorators.py:26 in new_func │ │ │ │ 23 │ """ │ │ 24 │ …
The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_VENV\lib\site-pa │
│ ckages\meltano\cli\__init__.py:115 in _run_cli                                                   │
│                                                                                                  │
│   112 │   │   except MeltanoError as err:                                                        │
│   113 │   │   │   handle_meltano_error(err)                                                      │
│   114 │   │   except Exception as err:                                                           │
│ ❱ 115 │   │   │   raise CliError(f"{troubleshooting_message}\n{err}") from err                   │
│   116 │   except CliError as cli_error:                                                          │
│   117 │   │   cli_error.print()                                                                  │
│   118 │   │   sys.exit(1)                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CliError: Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Run invocation could not be completed as block failed: Cannot start plugin tap-bigquery: Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)

Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Run invocation could not be completed as block failed: Cannot start plugin tap-bigquery: Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)
u
@andrea_radaelli I think
ERROR state or start_datetime must be specified
is your issue! It's not super clear, but the tap is throwing an error. Usually
invoke
is the best way to debug a particular plugin, see some more suggestions in https://docs.meltano.com/guide/troubleshooting. If you expect
start_datetime
to have been set already then run
meltano config tap-bigquery
to print out your config file; sometimes it's not what you'd expect.
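If it helps, the check can be done mechanically on the printed config. This is only a sketch: `missing_start_point` is a hypothetical helper that mirrors the wording of the error above, not the tap's actual validation code.

```python
import json
from typing import Optional


def missing_start_point(config: dict, state: Optional[dict] = None) -> bool:
    """True when neither a state nor a start_datetime is available."""
    return not state and not config.get("start_datetime")


# Paste the JSON printed by `meltano config tap-bigquery` here.
# The sample below is illustrative and deliberately lacks start_datetime.
printed_config = json.loads('{"credentials_path": "client_secret.json"}')

if missing_start_point(printed_config):
    print("start_datetime is missing from the effective config")
else:
    print("start_datetime is present")
```

The point is simply to confirm that the value you think you set actually reaches the tap.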
a
Hey Pat! Thanks again for your help. Unfortunately I still have problems with
tap-bigquery
even after your suggestions. I set both
start_datetime
and
end_datetime
and my config looks like this:
{
  "streams": [
    {
      "name": "Meltano_on_GCP",
      "table": "bq-test-202003.mt_test.magazzino",
      "columns": [
        "*"
      ],
      "datetime_key": "Data_Movimento"
    }
  ],
  "credentials_path": "client_secret.json",
  "start_datetime": "2017-01-01T00:00:00Z",
  "end_datetime": "2023-05-22T00:00:00Z",
  "limit": 10000,
  "start_always_inclusive": true
}
The error message I get is this:
```
Catalog discovery failed: command ['C:\\Users\\AndreaRadaelli\\Desktop\\Development\\Venvs\\Meltano_on_GCP\\Meltano_on_GCP_PROJECT\\.meltano\\extractors\\tap-bigquery\\venv\\Scripts\\tap-bigquery.exe', '--config', 'C:\\Users\\AndreaRadaelli\\Desktop\\Development\\Venvs\\Meltano_on_GCP\\Meltano_on_GCP_PROJECT\\.meltano\\run\\tap-bigquery\\tap.16d8f62f-39cd-47f9-8282-0640ffbc68d6.config.json', '--discover'] returned 1 with stderr:
INFO Running query: SELECT * FROM bq-test-202003.mt_test.magazzino WHERE 1=1 AND datetime '2017-01-01 00:00:00.000000' <= CAST(Data_Movimento as datetime) AND CAST(Data_Movimento as datetime) < datetime '2023-05-22 00:00:00.000000' ORDER BY Data_Movimento LIMIT 10000
CRITICAL 403 Access Denied: Table bq-test-202003:mt_test.magazzino: User does not have permission to query table bq-test-202003:mt_test.magazzino, or perhaps it does not exist in location europe-west4.
CRITICAL
CRITICAL Location: europe-west4
CRITICAL Job ID: 36cfb938-4a70-4b0c-8e59-3ea0a107314e
Traceback (most recent call last):
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\AndreaRadaelli\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\Scripts\tap-bigquery.exe\__main__.py", line 7, in <module>
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\singer\utils.py", line 235, in wrapped
    return fnc(*args, **kwargs)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\tap_bigquery\__init__.py", line 175, in main
    catalog = discover(CONFIG)
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\tap_bigquery\__init__.py", line 45, in discover
    stream_metadata, stream_key_properties, schema = source.do_discover(
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\tap_bigquery\sync_bigquery.py", line 103, in do_discover
    results = query_job.result()  # Waits for job to complete.
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\google\cloud\bigquery\job\query.py", line 1499, in result
    do_get_result()
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\google\api_core\retry.py", line 349, in retry_wrapped_func
    return retry_target(
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-packages\google\api_core\retry.py", line 191, in retry_target
    return target()
  File "C:\Users\AndreaRadaelli\Desktop\Development\Venvs\Meltano_on_GCP\Meltano_on_GCP_PROJECT\.meltano\extractors\tap-bigquery\venv\lib\site-p…
```
u
@andrea_radaelli hmm 🤔 and when you run this exact query manually using your sql editor it works?
SELECT * FROM bq-test-202003.mt_test.magazzino WHERE 1=1 AND datetime '2017-01-01 00:00:00.000000' <= CAST(Data_Movimento as datetime) AND CAST(Data_Movimento as datetime) < datetime '2023-05-22 00:00:00.000000' ORDER BY Data_Movimento LIMIT 10000
u
I ask because I've seen casing and dashes cause issues in other warehouses but I'm not super familiar with BQ
u
Ohhhh you know what I think I figured it out. You're supposed to be able to pass credentials either as an env var
GOOGLE_APPLICATION_CREDENTIALS_STRING
with the full json string or via the
credentials_path
setting. I assume you're using the setting? I found a bug in how the setting is communicated to the tap: https://github.com/meltano/hub/pull/1345
u
Once that merges you should be able to run
meltano lock tap-bigquery --update
to pull it in. For now you can test it by overriding the setting in your meltano.yml:
- name: tap-bigquery
  variant: anelendata
  pip_url: tap-bigquery
  settings:
    - name: credentials_path
      env: GOOGLE_APPLICATION_CREDENTIALS
      value: $MELTANO_PROJECT_ROOT/client_secrets.json
u
Let me know if that works for you!
a
@pat_nadolny it worked! Thankyou so much 🤗
e
Hi @pat_nadolny! I have been following this conversation as I want to use `GOOGLE_APPLICATION_CREDENTIALS_STRING`. I tried this:
- name: credentials_path
  env: GOOGLE_APPLICATION_CREDENTIALS_STRING
  value: '{"type": "X", "project_id": "X "auth_uri":...}'
as shown here but am facing a similar issue to Andrea with "Plugin configuration is invalid".
If you have any ideas, would love to hear them!
u
Unfortunately it looks like the tap only supports reading the credentials from a file. A hack you can try, though, is adding a bash or Python command/script that writes that file out for you as part of your pipeline job; see https://meltano.slack.com/archives/C01TCRBBJD7/p1685628417315209?thread_ts=1685613727.607619&cid=C01TCRBBJD7. Then your command would be something like
meltano run write_google_creds_to_file tap-bigquery target-y
and
write_google_creds_to_file
is a script of some sort that reads the env var and writes it to the file path... or you could refactor the tap to accept a string as well 😄
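A rough sketch of what such a script could look like in Python. Assumptions: the env var name `GOOGLE_APPLICATION_CREDENTIALS_STRING` comes from this thread, the default target `client_secrets.json` is just a placeholder, and how the script gets registered as a runnable step in your Meltano project is not shown here.

```python
import json
import os
from pathlib import Path


def write_google_creds_to_file(
    env_var: str = "GOOGLE_APPLICATION_CREDENTIALS_STRING",
    target: str = "client_secrets.json",  # placeholder path, adjust to your project
) -> Path:
    """Copy the JSON credentials held in env_var into a file and return its path."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set or is empty")

    # Fail early (ValueError) if the variable does not hold valid JSON,
    # rather than letting the tap fail later with a less obvious error.
    json.loads(raw)

    path = Path(target)
    path.write_text(raw, encoding="utf-8")
    return path
```

Calling `write_google_creds_to_file()` before the tap runs, and pointing `credentials_path` at the same file, should give the tap the file it expects. Keep in mind the file now holds a secret on disk, so treat it accordingly.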
e
Out of curiosity, what is the purpose of GOOGLE_APPLICATION_CREDENTIALS in that MR? I'm still facing issues even if I point it to the file instead with the addition described earlier on in the thread
plugins:
  extractors:
  - name: tap-bigquery
    namespace: tap_bigquery
    pip_url: git+https://github.com/anelendata/tap-bigquery.git@v0.3.7
    executable: tap-bigquery
    capabilities:
    - catalog
    - discover
    - state
    settings:
    - name: start_datetime
      kind: date_iso8601
    - name: start_always_inclusive
      kind: boolean
    - name: credentials_path
      env: GOOGLE_APPLICATION_CREDENTIALS
      value: 'path to client_secrets.json'
    env:
      GOOGLE_CLOUD_PROJECT: xxx
    config:
      streams:
      - name: xxx
        table: xxx.xxxx.xxx
        columns:
        - '*'
        datetime_key: updated_at
        filters: ''
      start_datetime: '2022-08-10T00:00:00Z'
      start_always_inclusive: false
  loaders:
  - name: target-s3-jsonl-bigquery
    inherit_from: target-s3-jsonl
    namespace: target_s3_jsonl_bigquery
u
I described the bug a bit in https://github.com/meltano/hub/pull/1345#issue-1723966527 the
env
key in a setting defines an environment variable that will be populated with the value of that setting. The BigQuery client needs the env var
GOOGLE_APPLICATION_CREDENTIALS
populated in order for it to properly connect.
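To see whether that env var is actually usable from the environment the tap runs in, a small check like the one below might help. It is a sketch only: `credential_problems` is a made-up helper, and it just tests that the variable is set and points at a readable JSON file, not that Google accepts the key.

```python
import json
import os
from typing import List, Mapping


def credential_problems(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a file"]
    try:
        with open(path, encoding="utf-8") as fp:
            json.load(fp)
    except ValueError as err:  # json.JSONDecodeError is a ValueError
        return [f"{path} is not valid JSON: {err}"]
    return []


for problem in credential_problems():
    print(problem)
```

An empty output only means the file is where the client will look for it; permission errors like the 403 earlier in the thread would still need to be checked on the BigQuery side.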
u
It looks like you have tap-bigquery installed as a custom plugin. None of the executable/capabilities/settings/namespace keys are needed if you're installing from the hub, so overriding them might be causing weird behavior. You should try removing it and then re-adding it from the hub
meltano remove extractor tap-bigquery
plus
meltano add extractor tap-bigquery
and make sure you set the config value
credentials_path