# singer-tap-development
j
I’ve created a new tap using the SDK. It infers the schema of a stream from the response to an API call. I had it working at one point, but now when it tries to infer the schema or emit a catalog file, it chokes as shown below when I run the command
meltano --log-level=debug invoke --dump catalog tap-sumologic
:
[2022-07-26 15:10:33,498] [40074|MainThread|meltano.cli.utils] [DEBUG] Could not dump catalog: Catalog discovery failed: invalid catalog: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/meltano/core/plugin/singer/tap.py", line 302, in discover_catalog
    catalog = json.load(catalog_file)
  File "/usr/local/Cellar/python@3.9/3.9.7_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/local/Cellar/python@3.9/3.9.7_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python@3.9/3.9.7_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python@3.9/3.9.7_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
the schema it produces is this:
{'type': 'object', 'properties': {'_sourcecategory': {'type': ['null', 'string']}, '_count': {'type': ['null', 'string', 'integer']}}, 'key_properties': ['_sourcecategory']}
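The message in the traceback is what Python's `json` module reports when it meets a single-quoted key, which is how a Python dict renders under `print()` or `str()`. A minimal sketch of the difference, assuming a cut-down version of the schema above (the `try_parse` helper is hypothetical, just for illustration):

```python
import json

# Cut-down stand-in for the inferred schema shown above.
schema = {
    "type": "object",
    "properties": {"_sourcecategory": {"type": ["null", "string"]}},
}


def try_parse(text):
    """Return None if text is valid JSON, else the decoder's error message."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as exc:
        return str(exc)


# str() gives the Python repr, with single quotes: not valid JSON.
repr_error = try_parse(str(schema))

# json.dumps() gives double-quoted keys and strings: valid JSON.
dumps_error = try_parse(json.dumps(schema))

print("repr:", repr_error)
print("json.dumps:", dumps_error)
```

So any text that looks like the schema line above, landing at the start of a file that is later read with `json.load`, would be enough to trigger this error.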
c
My comment below may seem unrelated to your question, but ...
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/meltano/core/plugin/singer/tap.py", line 302, in discover_catalog
Why is the SDK installed in a global location (
/usr/local/lib
)? The reason I'm asking is that what I would normally do in these situations is quickly fire up the Python debugger (
pdb
) by adding a quick
import pdb; pdb.set_trace()
line just before the line where the exception is raised, to force a debugger breakpoint. But I wouldn't want to do that for a globally installed library (i.e. something shared across everything on your computer because it is installed in
/usr/local/lib
). Normally, the Meltano SDK should be in a virtualenv that Meltano manages inside the
.meltano
folder in your Meltano project folder, which makes editing the source
.py
files and throwing in a quick
import pdb; pdb.set_trace()
a no-brainer to do.
j
That is a good question … there is no particular reason. I didn’t do that intentionally. I initialized this repo by running
poetry install
. I would have thought it would have avoided that for me.
c
Yup.
poetry install
would also manage a virtualenv for you that has the SDK in it. The global location could be a historical remnant on the machine you're working on. Anyway, in case you can just run your tap directly (without meltano cli), you could also just fire up
pdb
directly (something like
python -m pdb -c tap-sumologic --config ....
should do the trick, I think) and the debugger should throw you into an interactive shell as soon as your tap raises that JSONDecodeError exception and then you can have a look around. Quickstart for
pdb
in case you're not super familiar with it yet: https://realpython.com/lessons/getting-started-pdb/
Or
python -m pdb -c tap-sumologic --discover
in your case, I suppose, since that's where the problem occurs
j
I am totally new to pdb, so thanks for that. When I try to run that command though I get
getopt.GetoptError: option --discover not recognized
I can run
meltano invoke tap-sumologic --discover
just fine though
c
Yeah. Turns out that the arguments are passed to
pdb
and not to the python script ....
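Because `pdb` parses the options that follow `-m pdb` itself, one alternative is to start post-mortem debugging from inside Python instead of from the command line. A minimal sketch, assuming a hypothetical `run_with_post_mortem` wrapper around whatever function raises; `pdb.post_mortem` is the standard-library call, everything else here is illustrative:

```python
import pdb
import sys


def run_with_post_mortem(func, *args, debugger=pdb.post_mortem):
    """Call func(*args); on any exception, hand the traceback to a debugger.

    The debugger parameter defaults to pdb.post_mortem, which opens an
    interactive prompt at the frame where the exception was raised.
    It is a parameter only so the behaviour can be swapped out in tests.
    """
    try:
        return func(*args)
    except Exception:
        debugger(sys.exc_info()[2])  # traceback of the active exception
        raise


# Hypothetical usage (load_catalog is a made-up name, not part of any SDK):
#
#     run_with_post_mortem(load_catalog, "properties.json")
```

When nothing raises, the wrapper just returns the function's result, so it can be left in place while investigating.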
j
ah, but using
breakpoint()
led me to the
properties.json
file where I discovered that a lazy
print
statement had accidentally inserted bogus info into the file. Testing the removal of that print statement now …
yup, that did the trick. Thanks!
c
import logging
logging.debug("your message")
will be much safer and won't clobber your stdout.
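To illustrate why, here is a minimal sketch that captures stdout the way a caller reading the catalog might, assuming a simplified stand-in for the catalog output. The names `emit_catalog`, `capture_stdout`, and `is_valid_json` and the logger name are hypothetical; the only library behaviour relied on is that `logging.StreamHandler()` writes to stderr by default.

```python
import contextlib
import io
import json
import logging

# A logger with an explicit handler; StreamHandler() defaults to sys.stderr.
logger = logging.getLogger("tap_sumologic_sketch")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

SCHEMA = {"type": "object", "properties": {"_count": {"type": ["null", "integer"]}}}


def emit_catalog(schema, use_print_for_debug):
    """Write a simplified catalog to stdout, with one debug message first."""
    if use_print_for_debug:
        print(schema)  # debug output lands on stdout, ahead of the JSON
    else:
        logger.debug("inferred schema: %s", schema)  # goes to stderr
    print(json.dumps({"streams": [{"schema": schema}]}))


def capture_stdout(use_print_for_debug):
    """Run emit_catalog and return everything it wrote to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        emit_catalog(SCHEMA, use_print_for_debug)
    return buf.getvalue()


def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print("with print():", is_valid_json(capture_stdout(True)))
print("with logging:", is_valid_json(capture_stdout(False)))
```

With the stray `print`, the captured text starts with a Python dict repr and no longer parses; with the logger, stdout carries only the JSON document.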
j
Yeah, thanks. I think I switched it to prints to debug something at one point and forgot to switch it back. Next time I’ll remember to use logging instead.