# singer-tap-development
f
Probably something simple, but I'm getting:
tap-athena      | extractor |   File ".../meltano/.meltano/extractors/tap-athena/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 514, in discover_streams
tap-athena      | extractor |     for catalog_entry in self.catalog_dict["streams"]:
tap-athena      | extractor | KeyError: 'streams'
when I do an ELT run. However, I can run meltano invoke tap-athena --discover and get the discovery catalog. I'm trying to test a change before I submit the PR, to make the AWS credentials optional so they can be picked up from the standard locations (env, default profile, instance creds, etc.). With debug logging I get:
time=2021-11-05 15:09:19 name=tap-athena level=INFO message=Skipping parse of env var settings...
time=2021-11-05 15:09:19 name=tap-athena level=INFO message=Config validation passed with 0 errors and 0 warnings.
time=2021-11-05 15:09:19 name=root level=INFO message=Operator '__else__=None' was not found. Unmapped streams will be included in output.
time=2021-11-05 15:09:19 name=botocore.credentials level=INFO message=Found credentials in shared credentials file: ~/.aws/credentials
time=2021-11-05 15:09:50 name=botocore.credentials level=INFO message=Found credentials in shared credentials file: ~/.aws/credentials
[2021-11-05 15:10:21,414] [82571|MainThread|root] [DEBUG] Deleted configuration at .../meltano/.meltano/run/tap-athena/tap.....config.json
So it works with --discover, but in an actual ELT run it can't use the discovered catalog to find the streams...
v
you may have a cached catalog in the .meltano/something folder
If you could give the commands you're running and the logs that would be much more helpful
Hard to assume how errors are coming up
f
I posted:
tap-athena            | extractor |     for catalog_entry in self.catalog_dict["streams"]:
tap-athena            | extractor | KeyError: 'streams'
It's failing when looking at the streams in the catalog. I tried running it with a catalog file that I saved from a --discover call and it also failed: meltano elt tap-athena target-postgres --catalog catalog.json. Maybe it's something with my environment; I'll try building the Docker image and running it in Docker.
e
tap-athena is still experimental (using unreleased SDK features), so this may be a bug in how the input catalog is used to create stream instances at run time. cc @aaronsteers
a
Hi, @fred_reimer. It appears this is a new bug caused by a recent refactor on the open MR for SQL type streams. I will need to pin the tap-athena connector to a specific prerelease version or commit tag. I will take that action now and should hopefully be able to resolve it by EOD. Apologies for the inconvenience.
cc @pat_nadolny - this may affect you as well
p
I ended up hacking a forked version together to get my MVP working, since you guys were still working on the db type streams part: https://github.com/pnadolny13/tap-athena/commits/hard_fork_sdk_db_stream_stuff
I'm happy to switch back whenever it's in a good state for more testing 😄
f
Sorry, was dealing with an issue. I'll pin it to the commit for my testing and let you know.
Well, I would, but I'm not sure which commit to use. I'm done for the day, but if someone can tell me which commit to use, I'll test it out tomorrow.
Here is the issue:
if self.input_catalog:
    return self.input_catalog
self.input_catalog just returns self._input_catalog, which is a Catalog object, not a dict. This needs to be replaced with:
if self.input_catalog:
    return self.input_catalog.to_dict()
I'll make an MR on the 263-sql-type-targets-and-sinks branch.
And with this commit, the ELT pipeline works to pull data from Athena and put it in a PostgreSQL database.
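For anyone following along, here is a minimal self-contained sketch of the failure mode and the fix. It does not use the real singer_sdk classes: Catalog, BuggyTap and FixedTap are hypothetical stand-ins, and it assumes the catalog object behaves like a mapping keyed by stream ID, which would explain why indexing it with "streams" raises KeyError rather than a type error.

```python
class Catalog(dict):
    """Hypothetical stand-in (not the SDK class): maps stream ID -> catalog entry."""

    def to_dict(self):
        # Convert to the Singer catalog layout: {"streams": [entry, ...]}
        return {"streams": list(self.values())}


class BuggyTap:
    """Hypothetical tap that returns the Catalog object where a dict is expected."""

    def __init__(self, input_catalog=None):
        self._input_catalog = input_catalog

    @property
    def input_catalog(self):
        return self._input_catalog

    @property
    def catalog_dict(self):
        if self.input_catalog:
            # Bug: this is a Catalog keyed by stream ID, so it has no "streams" key.
            return self.input_catalog
        return {"streams": []}

    def discover_streams(self):
        # Mirrors the loop in the traceback: iterate over catalog_dict["streams"].
        return [entry["tap_stream_id"] for entry in self.catalog_dict["streams"]]


class FixedTap(BuggyTap):
    """Same tap with the one-line fix applied."""

    @property
    def catalog_dict(self):
        if self.input_catalog:
            # Fix: convert the Catalog object to a plain dict first.
            return self.input_catalog.to_dict()
        return {"streams": []}


# Illustrative catalog with a single made-up stream.
catalog = Catalog({"orders": {"tap_stream_id": "orders"}})

try:
    BuggyTap(catalog).discover_streams()
except KeyError as err:
    print(f"buggy tap failed with KeyError: {err}")

print(FixedTap(catalog).discover_streams())
```

In this sketch the buggy tap fails only when an input catalog is passed in, which would be consistent with --discover working while the ELT run fails.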
a
Thanks, @fred_reimer! Much appreciated. I'm planning to spend more time on this later today.
f
Any update on this? I'd like to sync up to the latest version, as I need to add partitions in target-athena for our use case.