# singer-tap-development
j
Working on building a tap that dynamically builds a stream schema based on the first record returned from an API call. Things were going great until I hit a very unhelpful message that says:
Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)
full stack trace:
```
tap-rest-api % meltano --log-level=debug invoke tap-rest-api --catalog
[2021-07-26 135652,435] [16477|MainThread|root] [DEBUG] Creating engine <meltano.core.project.Project object at 0x10453e610>@sqlite:////Users/jlloyd/Documents/dev/tap-rest-api/.meltano/meltano.db
[2021-07-26 135652,537] [16477|MainThread|root] [DEBUG] Created configuration at /Users/jlloyd/Documents/dev/tap-rest-api/.meltano/run/tap-rest-api/tap.config.json
[2021-07-26 135652,537] [16477|MainThread|root] [DEBUG] Could not find tap.properties.json in /Users/jlloyd/Documents/dev/tap-rest-api/.meltano/extractors/tap-rest-api/tap.properties.json, skipping.
[2021-07-26 135652,538] [16477|MainThread|root] [DEBUG] Could not find tap.properties.cache_key in /Users/jlloyd/Documents/dev/tap-rest-api/.meltano/extractors/tap-rest-api/tap.properties.cache_key, skipping.
[2021-07-26 135652,538] [16477|MainThread|root] [DEBUG] Could not find state.json in /Users/jlloyd/Documents/dev/tap-rest-api/.meltano/extractors/tap-rest-api/state.json, skipping.
[2021-07-26 135652,538] [16477|MainThread|root] [DEBUG] Invoking: ['/Users/jlloyd/Documents/dev/tap-rest-api/tap-rest-api.sh', '--config', '/Users/jlloyd/Documents/dev/tap-rest-api/.meltano/run/tap-rest-api/tap.config.json', '--discover']
[2021-07-26 135652,538] [16477|MainThread|root] [DEBUG] Env: {...}
[2021-07-26 135657,519] [16477|MainThread|root] [DEBUG] Deleted configuration at /Users/jlloyd/Documents/dev/tap-rest-api/.meltano/run/tap-rest-api/tap.config.json
[2021-07-26 135657,520] [16477|MainThread|meltano.cli.utils] [DEBUG] Catalog discovery failed: invalid catalog: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/meltano/core/plugin/singer/tap.py", line 263, in discover_catalog
    catalog = json.load(catalog_file)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/jlloyd/.pyenv/versions/3.7.10/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/meltano/cli/__init__.py", line 44, in main
    cli(obj={"project": None})
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/click/core.py", line 829, in call
    return self.main(*args, **kwargs)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/meltano/cli/params.py", line 23, in decorate
    return func(*args, **kwargs)
  File "/Users/jlloyd/.pyenv/versions/3.7.10/envs/meltanosdk37/lib/python3.7/site-packages/meltano/cli/param…
```
a
I think you may have to test and develop this function outside of Meltano at first. Do you plan to feed this into a discovery process?
j
That’s the funny thing: I can run
meltano invoke tap-rest-api --discover
just fine …
Maybe I should clarify that I am using the SDK and the most recent version of Meltano.
ah, just figured it out. I had to run
meltano invoke tap-rest-api --discover > out.jsonl
and look at the file before I realized that the output was including some print statements that I forgot to take out …
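For anyone hitting the same error, here is a minimal sketch of the check described above: take the captured stdout of a `--discover` run and separate any stray text from the catalog JSON. The function name `split_stray_output` and the sample strings are hypothetical, and it assumes the catalog is the first parseable JSON object in the output.

```python
import json
from typing import Optional, Tuple


def split_stray_output(raw: str) -> Tuple[str, Optional[dict]]:
    """Split captured stdout into (stray_text, catalog).

    stray_text is whatever precedes the first parseable JSON object;
    catalog is None if no JSON object could be parsed at all.
    Hypothetical helper, not part of Meltano or the SDK.
    """
    decoder = json.JSONDecoder()
    pos = raw.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at index pos
            obj, _end = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            # This "{" did not start valid JSON; try the next one
            pos = raw.find("{", pos + 1)
            continue
        return raw[:pos].strip(), obj
    return raw.strip(), None


# Made-up example of discover output polluted by a leftover print()
raw = 'fetching first record\n{"streams": []}\n'
stray, catalog = split_stray_output(raw)
print("stray output:", repr(stray))
print("catalog:", catalog)
```

Passing the same polluted string straight to `json.loads` raises the `Expecting value: line 1 column 1 (char 0)` error from the trace, because the first character is not JSON. Singer taps reserve stdout for JSON output, so debug prints are safer sent to stderr, for example `print(msg, file=sys.stderr)`.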
a
@josh_lloyd Is this something you think we might be able to build into the SDK? Given `n` number of rows per stream, a tap developer could either use this to build a catalog during dev time to save coding effort, or else at runtime for dynamic schemas.
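To make the "sample `n` rows per stream" idea concrete, here is a rough sketch of inferring a JSON Schema from the first `n` records and merging the types seen. It is not the SDK's API or the prototype's actual logic; `infer_type`, `merge_schemas`, and `infer_schema` are hypothetical names, and array item types are deliberately left out to keep it short.

```python
from itertools import islice
from typing import Any, Dict, Iterable


def infer_type(value: Any) -> Dict[str, Any]:
    """Map one parsed JSON value to a small JSON Schema fragment."""
    if value is None:
        return {"type": ["null"]}
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return {"type": ["boolean"]}
    if isinstance(value, int):
        return {"type": ["integer"]}
    if isinstance(value, float):
        return {"type": ["number"]}
    if isinstance(value, str):
        return {"type": ["string"]}
    if isinstance(value, list):
        return {"type": ["array"]}  # item types are not inferred in this sketch
    if isinstance(value, dict):
        return {
            "type": ["object"],
            "properties": {k: infer_type(v) for k, v in value.items()},
        }
    raise TypeError(f"not a JSON value: {type(value).__name__}")


def merge_schemas(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Any]:
    """Union the types of two fragments, recursing into object properties."""
    merged: Dict[str, Any] = {"type": sorted(set(a["type"]) | set(b["type"]))}
    props_a, props_b = a.get("properties", {}), b.get("properties", {})
    if props_a or props_b:
        merged["properties"] = {}
        for key in {**props_a, **props_b}:
            if key in props_a and key in props_b:
                merged["properties"][key] = merge_schemas(props_a[key], props_b[key])
            else:
                # Seen on only one side: treat the property as nullable
                present = props_a[key] if key in props_a else props_b[key]
                merged["properties"][key] = merge_schemas(present, {"type": ["null"]})
    return merged


def infer_schema(records: Iterable[Dict[str, Any]], n: int) -> Dict[str, Any]:
    """Infer a schema from at most the first n records."""
    schema = None
    for record in islice(records, n):
        fragment = infer_type(record)
        schema = fragment if schema is None else merge_schemas(schema, fragment)
    return schema or {"type": ["object"], "properties": {}}


# Made-up sample records
records = [{"id": 1, "name": "a"}, {"id": 2.5, "name": None, "tags": ["x"]}, {"id": 3}]
schema = infer_schema(records, 2)
print(schema)
```

With `n=1` this reduces to the "first record only" approach. A larger `n` catches fields that are missing or null in the first row, at the cost of reading more data before discovery finishes.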
j
almost definitely, I’ve been making good progress over the last couple hours. But a lot of what I’ve built is a bit “hackish” for my particular situation. We can probably strip out the funky parts and decide on something more widely applicable.
The wackiness comes into play when trying to decide whether or not to flatten certain JSON sub-objects.
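One possible way to frame the flattening decision is a depth limit: flatten nested objects down to some depth and keep anything deeper as a JSON string. This is only an illustration under that assumption, not what the prototype does; `flatten_record`, `max_depth`, and `sep` are hypothetical names.

```python
import json
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], max_depth: int = 1, sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts up to max_depth levels; serialize the rest.

    Dicts nested deeper than max_depth, and all lists, are stored as
    JSON strings so the resulting row only has scalar columns.
    """
    flat: Dict[str, Any] = {}

    def walk(obj: Dict[str, Any], prefix: str, depth: int) -> None:
        for key, value in obj.items():
            name = f"{prefix}{sep}{key}" if prefix else key
            if isinstance(value, dict) and depth < max_depth:
                walk(value, name, depth + 1)  # still within the depth budget
            elif isinstance(value, (dict, list)):
                flat[name] = json.dumps(value, sort_keys=True)
            else:
                flat[name] = value

    walk(record, "", 0)
    return flat


# Made-up sample record
record = {"id": 1, "user": {"name": "a", "address": {"city": "x"}}, "tags": ["p", "q"]}
print(flatten_record(record, max_depth=1))
print(flatten_record(record, max_depth=2))
```

A single global depth is the simplest policy. Deciding per sub-object (which is where the wackiness tends to live) would need something like a per-path allow list instead.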
@aaronsteers I’ve completed an initial prototype. I went far enough to get it working for my use case, but it’s not a fully robust tap (no authentication capabilities, for example). Regardless, if you’re only looking to incorporate it into the SDK, it’s really just the logic we’d need to port over, not the tap itself. Here’s the link to my repo. The readme should have a little bit of explanation. Sidenote: awesome job on the SDK. This was all way simpler than I anticipated it would be. Took me only 3 days to build the tap and use it in production within my dagster environment. I’m happy to try and help incorporate it into the SDK, though I’m not really sure where to start with that and I think a good amount of discussion might need to take place before it does. Happy to create a ticket if you’re still interested.
a
@josh_lloyd - ty for this contribution to the community, and the positive feedback on the SDK. I have spent some time looking over your new tap, and I have to say, this seems to me a very exciting addition for the community! I just logged this issue and I've marked it as "accepting merge requests": https://gitlab.com/meltano/sdk/-/issues/174
cc @amanda.folson - 👆
j
love your summary in the ticket and positive feedback. I’m proud to help the community!
Although their code was altered to the point of non-recognition, I do feel like I should mention that I got the idea from https://github.com/anelendata/tap-rest-api
a
Thanks! I've noted that repo also in the "refs" section 👍