With tap-spreadsheets-anywhere, it authenticates...
# troubleshooting
s
With tap-spreadsheets-anywhere, it authenticates "as described in the smart_open documentation here." So if I want to tap S3 with a key, private key, and bucket name, how would my `config:tables:path` actually look? Something like this, I'm assuming: "_Other examples of URLs that smart_open accepts:_ `s3://my_key:my_secret@my_bucket/my_key`"... I'm also assuming I don't need to expose my credentials in my config, so how could I get this URL to pull from my `.env` file given this "smart_open" format? Thanks!
m
I am setting the AWS environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_DEFAULT_REGION`. I believe that S3 access uses the boto3 library under the hood. That library supports a range of AWS access methods, but this is what I’m doing and it works. If you set those three variables in your `.env` file, Meltano should pull them from there automatically.
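A quick way to confirm that those three variables are visible to a process is a small check like the one below. It is only a sketch: the helper name `missing_aws_vars` is made up for illustration, and it checks that the variables are set, not that the credentials are valid.

```python
import os

# The three variable names mentioned above.
REQUIRED_AWS_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(env=None):
    """Return the required AWS variable names that are unset or empty.

    `missing_aws_vars` is a hypothetical helper, not part of Meltano or boto3.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All three AWS variables are set.")
```

Plain Python does not read `.env` on its own, so run this from a shell where the variables are already exported.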
As for the config itself, here’s a working example from my project (note: I’m using the jsonl format here, but the idea is the same):
plugins:
  extractors:
    - name: tap-spreadsheets-anywhere
      config:
        tables:
          - path: s3://my-bucket-name
            name: my_stream_name
            pattern: my-directory/my-subdirectory/.*json
            start_date: '2020-01-01T00:00:00Z'
            key_properties: []
            format: jsonl
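To sanity-check a `pattern` before running the tap, it can help to try it against a few sample object keys. The sketch below assumes the pattern is treated as a Python regular expression searched against each object key, which is an assumption about the tap's behavior and not something confirmed in this thread. The sample keys and the helper `matching_keys` are invented for illustration.

```python
import re


def matching_keys(pattern, keys):
    """Return the keys that the regex `pattern` matches anywhere in the key."""
    matcher = re.compile(pattern)
    return [key for key in keys if matcher.search(key)]


# Hypothetical object keys, for illustration only.
sample_keys = [
    "my-directory/my-subdirectory/2020-01-01.json",
    "my-directory/my-subdirectory/2020-01-02.jsonl",
    "my-directory/other/2020-01-01.json",
]

print(matching_keys("my-directory/my-subdirectory/.*json", sample_keys))
```

Under that assumption, `.*json` also matches keys ending in `.jsonl`, because the pattern is not anchored with `$`.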
s
Okay that's nice and clean! Will try it out. 👌
Sorry if this shows my greenness, but I've followed this format and am getting:
[Errno 2] No such file or directory: '/Users/sonnygeorge/meltano-test/.meltano/run/tap-spreadsheets-anywhere/tap.9f4dfeec-c938-4b81-bf74-264e76b926e6.config.json'
(and for the record, it is correct, that file is certainly not in that folder..)
m
Did you install the tap (with `meltano add extractor tap-spreadsheets-anywhere`)?
If you run `meltano config tap-spreadsheets-anywhere`, what’s the result? It should be a JSON representation containing that tables array.
s
Yeah don't know what I had done wrong earlier, but I went through the basics again and am getting "plugin configuration is valid" 😄
Hmmmm... Now running with `meltano run tap-spreadsheets-anywhere target-jsonl`, I seem to be getting stuck on the first of 1034 files...
m
is that a really big file?
s
Not 20 minutes big... Simple python/boto3 downloads it much quicker
m
Do you get different results if you run `meltano invoke tap-spreadsheets-anywhere`?
s
This seems to work... that is, there is a flurry of .jsonl lines in my terminal 😅
Does `invoke` generate a log anywhere? It's hard to tell if it is making it past the first file with all the mess flying through my terminal 😅
m
I don’t know (I’m still fairly new honestly)
You might try `meltano invoke tap-spreadsheets-anywhere | tee -a output.jsonl` to redirect to an output file as well as display in the terminal.
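Once the output is in a file, a short script can report how far the tap got. The sketch below assumes `output.jsonl` holds Singer messages, one JSON object per line, with `RECORD` messages carrying `stream` and `record` fields. It also assumes the tap adds a `_smart_source_file` field to each record; if that field is absent, the file count simply stays at zero. The helper name `summarize` is made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(lines):
    """Count RECORD messages per stream and collect distinct source files."""
    records_per_stream = Counter()
    source_files = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not JSON, e.g. stray log lines.
            continue
        if not isinstance(message, dict) or message.get("type") != "RECORD":
            continue
        records_per_stream[message.get("stream")] += 1
        # Assumed field name; ignored when the record does not have it.
        record = message.get("record") or {}
        source = record.get("_smart_source_file")
        if source:
            source_files.add(source)
    return records_per_stream, source_files


if __name__ == "__main__":
    path = Path("output.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            counts, files = summarize(handle)
        for stream, count in counts.items():
            print(f"{stream}: {count} records")
        print(f"distinct source files seen: {len(files)}")
    else:
        print(f"{path} not found")
```

Because `tee -a` appends, delete `output.jsonl` between runs to avoid counting records from an earlier attempt.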