shailesh_kochhar
05/25/2023, 10:45 AM
meltano add extractor tap-s3
meltano select --list --all tap-s3
Error produced:
```
Cannot list the selected attributes: Catalog discovery failed: command ['/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/bin/tap-airbyte', '--config', '/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/run/tap-s3/tap.0461fef6-3ada-4752-a6d6-ec9c8c47e155.config.json', '--discover'] returned 1 with stderr:
Traceback (most recent call last):
  File "/usr/lib64/python3.9/shutil.py", line 660, in _rmtree_safe_fd
    dirfd = os.open(entry.name, os.O_RDONLY, dir_fd=topfd)
PermissionError: [Errno 13] Permission denied: 'tmpzbuqebi0'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/bin/tap-airbyte", line 8, in <module>
    sys.exit(TapAirbyte.cli())
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/tap_airbyte/tap.py", line 269, in cli
    tap: TapAirbyte = cls( # type: ignore
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/tap_airbyte/tap.py", line 314, in __init__
    super().__init__(*args, **kwargs)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 97, in __init__
    self.mapper.register_raw_streams_from_catalog(self.catalog)
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 159, in catalog
    self._catalog = self.input_catalog or self._singer_catalog
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 251, in _singer_catalog
    for stream in self.streams.values()
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 122, in streams
    for stream in self.load_streams():
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/singer_sdk/tap_base.py", line 283, in load_streams
    for stream in self.discover_streams():
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/tap_airbyte/tap.py", line 688, in discover_streams
    for stream in self.airbyte_catalog["streams"]:
  File "/home/ec2-user/workspace/feature-store/meltano/first-pipe/.meltano/extractors/tap-s3/venv/lib/python3.9/site-packages/tap_a…
```
shailesh_kochhar
05/26/2023, 5:49 AM
Meltano runs the `tap-airbyte` extractor to discover a catalog, which tries to create some temporary files and then read them. Somehow those files are created by the root user rather than the current user, so they cannot be read.
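A quick way to confirm the ownership mismatch is to walk the directory in question and list anything not owned by the current user. This is only a sketch: `entries_not_owned_by_me` is a made-up helper name, and the directory to scan is an assumption (the traceback only shows the entry name `tmpzbuqebi0`, not its parent), so the demo just scans a scratch directory it creates itself.

```python
import os
import tempfile


def entries_not_owned_by_me(root: str) -> list:
    """Return paths under `root` whose owner differs from the current user."""
    my_uid = os.getuid()
    foreign = []
    # os.walk silently skips directories it cannot read, but the unreadable
    # directory itself is still reported through its parent's listing.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat reports a symlink's own owner instead of following it.
            if os.lstat(path).st_uid != my_uid:
                foreign.append(path)
    return foreign


if __name__ == "__main__":
    # Hypothetical demo directory; replace with the directory under suspicion.
    with tempfile.TemporaryDirectory() as scratch:
        with open(os.path.join(scratch, "example.txt"), "w") as handle:
            handle.write("demo")
        print(entries_not_owned_by_me(scratch))
```

Pointing it at wherever the tap writes its temporary files (possibly the system temp directory) should show whether root-owned entries are being left behind.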
Would anyone know where to look to understand this behaviour? Is this expected, and do I need to run as root? Or is there some other problem lurking?
ceyhun_kerti
05/26/2023, 9:24 PM
shailesh_kochhar
05/30/2023, 4:40 AM
pat_nadolny
05/30/2023, 7:26 PM
shailesh_kochhar
05/31/2023, 6:41 AM
shailesh_kochhar
06/01/2023, 4:53 AM
version: 1
default_environment: dev
project_id: d88d7541-5c12-4c90-8b59-03766a4e666c
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-s3
    variant: airbyte
    pip_url: git+https://github.com/MeltanoLabs/tap-airbyte-wrapper.git
    config:
      airbyte_config:
        dataset: XXXXXX-XXXXXX-logs
        path_pattern: dt=*/hr=*/*
        format:
          filetype: json
          infer_datatypes: true
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
And config.json:
{
  "airbyte_spec": {
    "image": "airbyte/source-s3",
    "tag": "latest"
  }
}
pat_nadolny
06/01/2023, 1:47 PM
If you run `meltano config tap-s3` you should get your full config. There's no need to manually pass in a config.json though; Meltano will manage that for you.
pat_nadolny
06/01/2023, 1:48 PM
`connector_config.provider.bucket` and `connector_config.path_pattern` are required, with docs at https://docs.airbyte.com/integrations/sources/s3/#s3-provider-settings
pat_nadolny
06/01/2023, 1:49 PM
pat_nadolny
06/01/2023, 1:51 PM
shailesh_kochhar
06/02/2023, 4:39 AM
`meltano run tap-s3 target-jsonl`
By outdated image do you mean the airbyte-tap image being run by docker?
shailesh_kochhar
06/02/2023, 4:40 AM
I ran `meltano config tap-s3` to see what it contained. I don't pass it into the run command.
user
06/02/2023, 12:58 PM
> By outdated image do you mean the airbyte-tap image being run by docker?
Yes. I doubt this is your problem, but if that image is really old it might have bugs. Just something to try.
user
06/02/2023, 12:59 PM
shailesh_kochhar
06/05/2023, 8:07 AM
`connector_config.path_pattern` is present in the snippet above. There's also another file in the directory named `.env` with sensitive parameters like bucket, aws_access_key_id and aws_secret_access_key:
TAP_S3_AIRBYTE_CONFIG_PROVIDER_BUCKET='XXXXXXXXXXXX'
TAP_S3_AIRBYTE_CONFIG_PROVIDER_AWS_ACCESS_KEY_ID='XXXXXXXXXXXX'
TAP_S3_AIRBYTE_CONFIG_PROVIDER_AWS_SECRET_ACCESS_KEY='XXXXXXXX'
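For reference, these names appear to follow Meltano's `<PLUGIN_NAME>_<SETTING_NAME>` convention, uppercased with non-alphanumeric characters turned into underscores. A minimal sketch of that mapping, assuming the convention also holds for nested keys like `airbyte_config.provider.bucket` (`env_var_name` is a hypothetical helper, not part of Meltano):

```python
import re


def env_var_name(plugin_name: str, setting_name: str) -> str:
    """Build the environment variable name assumed for a plugin setting."""
    raw = f"{plugin_name}_{setting_name}"
    # Replace every character that is not a letter or digit with "_",
    # then uppercase the result.
    return re.sub(r"[^A-Za-z0-9]", "_", raw).upper()


if __name__ == "__main__":
    # Setting names assumed from the variables in the .env file above.
    for setting in (
        "airbyte_config.provider.bucket",
        "airbyte_config.provider.aws_access_key_id",
        "airbyte_config.provider.aws_secret_access_key",
    ):
        print(env_var_name("tap-s3", setting))
```

The sketch only checks the spelling of the names. Whether the values are actually picked up for nested keys is something the output of `meltano config tap-s3` would have to confirm.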
Docker shows that the airbyte/source-s3 image is the latest version:
~/workspace/meltano/first-pipe$ docker image ls
REPOSITORY          TAG       IMAGE ID       CREATED       SIZE
airbyte/source-s3   latest    6cca1f00ca50   2 weeks ago   459MB
I have also tried explicitly deleting the image and running the s3 tap with:
~/workspace/meltano/first-pipe$ docker rmi airbyte/source-s3
~/workspace/meltano/first-pipe$ docker image ls
REPOSITORY          TAG       IMAGE ID       CREATED       SIZE
~/workspace/meltano/first-pipe$ meltano config tap-s3 test
~/workspace/meltano/first-pipe$ docker image ls
REPOSITORY          TAG       IMAGE ID       CREATED       SIZE
airbyte/source-s3   latest    6cca1f00ca50   2 weeks ago   459MB
The `meltano config` command fails with an 'Operation not permitted' error.
Since there was an earlier mention of docker permissions, I'm wondering if docker is correctly installed on the host. Is there a preferred way to install docker? IIRC, I followed the install instructions for installing docker engine on Ubuntu from apt repos.
https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
anthony_giannotti
08/18/2023, 2:23 AM