# troubleshooting
j
Has anyone ever had a pipeline go from working fine one day to all of a sudden struggling to read a catalog input? This is on tap-dynamodb running in an ECS container orchestrated by Dagster:
tap-dynamodb v0.0.1, Meltano SDK v0.3.18)
Skipping parse of env var settings...
Config validation passed with 0 errors and 0 warnings.
Traceback (most recent call last):
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/bin/tap-dynamodb", line 8, in <module>
    sys.exit(TapDynamoDB.cli())
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 442, in cli
    tap = cls(  # type: ignore  # Ignore 'type not callable'
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 74, in __init__
    self._input_catalog = Catalog.from_dict(read_json_file(catalog))
  File "/opt/dagster/app/.meltano/extractors/tap-dynamodb/venv/lib/python3.8/site-packages/singer_sdk/helpers/_util.py", line 22, in read_json_file
    return cast(dict, json.loads(Path(path).read_text()))
  File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
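For context on the last line of the trace: `Expecting value: line 1 column 1 (char 0)` is what `json.loads` raises when the text is empty or its first character cannot start a JSON value, so the catalog file handed to the tap was probably empty or held non-JSON output. Below is a minimal sketch for checking such a file by hand. `diagnose_catalog` is a hypothetical helper, not part of the SDK, and the path to pass in depends on your setup.

```python
import json
from pathlib import Path


def diagnose_catalog(path):
    """Describe the state of a catalog file (hypothetical helper, not SDK code)."""
    p = Path(path)
    if not p.exists():
        return "missing"
    text = p.read_text()
    if not text.strip():
        # An empty file is the simplest way to get "Expecting value ... (char 0)".
        return "empty"
    try:
        catalog = json.loads(text)
    except json.JSONDecodeError as exc:
        # Typical when log output or an error page was written instead of JSON.
        return f"invalid JSON: {exc}"
    if not isinstance(catalog, dict):
        return "not a JSON object"
    # Singer catalogs keep their stream definitions under the "streams" key.
    return f"ok: {len(catalog.get('streams', []))} stream(s)"


# Reproducing the exact message from the traceback with an empty string:
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)
```

If the helper reports "empty" or "invalid JSON", the problem is upstream of the parse step, in whatever produced the catalog file.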
a
Cc @pat_nadolny
j
Can this result from network connectivity issues? I would expect that to manifest some other way, but I can run this same pipeline in my stage environment and locally just fine. In my prod environment it chokes like this 🤔
a
I think this is a new implementation and may not have a version pinned. Can you confirm @josh_lloyd if your pip_url is pinned to any URL or version ref?
I know @pat_nadolny has been working on this tap. Is it possible one of the new changes broke your existing flow? If so, you should be able to pin back to the prior commit ref or version (although I don't know if we have yet published pinnable versions).
j
ah, it is not pinned. I’ll give that a shot
actually that’s a lie
It’s pointing at my own version of the tap
<https://github.com/Widen/tap-dynamodb.git>
and I know I didn’t change that any time recently
a
Hey @josh_lloyd! How did you set up the schema for your DDB tables? I couldn’t get tap-dynamodb to get the catalog schema when I was setting up a few months ago.
p
Just catching up on this now - yeah it looks like Josh is using the Widen variant instead of https://github.com/MeltanoLabs/tap-dynamodb which AJ was referring to. I just built https://github.com/MeltanoLabs/tap-dynamodb in the last few weeks.
"Can this result from network connectivity issues?"
@josh_lloyd I don't know for sure, but I would guess this is the issue. I skimmed the Widen tap code, and I think if you don't get results back from the scan request then it has no records to generate a catalog from. I wonder if misconfigured credentials could cause this 🤔. Unfortunately the stack trace doesn't tell us much.
Side note: @abhishek_ajmera, if you want, you can test out the new MeltanoLabs variant. It's relatively new, but I'm running it in production for our simple use cases and it's working well for me. It infers the schema using genson right now. You can join #C04TSH483DF to discuss more!
j
@abhishek_ajmera The way this version of the tap is designed, it infers the schema from the first X records returned from a query. I don’t know whether all other variants of the tap make that possible (though it sounds similar to what the MeltanoLabs variant is doing).
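To make "infers the schema from the first X records" concrete, here is a rough sketch of the general idea. It is not the Widen or MeltanoLabs implementation. `infer_schema`, `json_type`, and `sample_size` are hypothetical names, and the type mapping is deliberately simplified. It also shows why an empty scan result leaves nothing to build a catalog from.

```python
from decimal import Decimal
from itertools import islice

# bool must be checked before int because bool is a subclass of int in Python.
_TYPE_MAP = [
    (bool, "boolean"),
    (int, "integer"),
    ((float, Decimal), "number"),
    (str, "string"),
    (list, "array"),
    (dict, "object"),
]


def json_type(value):
    """Map a Python value to a JSON Schema type name (simplified)."""
    if value is None:
        return "null"
    for py_type, name in _TYPE_MAP:
        if isinstance(value, py_type):
            return name
    return "string"  # assumption: fall back to string for anything else


def infer_schema(records, sample_size=100):
    """Build a flat JSON Schema from at most `sample_size` records."""
    seen = {}
    sampled = 0
    for record in islice(records, sample_size):
        sampled += 1
        for key, value in record.items():
            seen.setdefault(key, set()).add(json_type(value))
    if sampled == 0:
        # With no records there is nothing to infer from, so fail loudly.
        raise ValueError("no records returned; cannot infer a schema")
    return {
        "type": "object",
        "properties": {k: {"type": sorted(t)} for k, t in seen.items()},
    }


schema = infer_schema([{"id": 1, "name": "a"}, {"id": 2, "name": None}])
print(schema["properties"]["name"])  # {'type': ['null', 'string']}
```

Raising on an empty sample is a design choice for the sketch: it surfaces a credentials or connectivity problem at discovery time instead of as a JSON parse error later.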