# plugins-general
b
Hello, I'm using tap-s3-csv to extract a CSV file from S3 and load it into Snowflake. Some of the files are empty, and I'm getting this error:
ValueError: activebaseaggregate.csv file(s) has no data and cannot analyse the content to generate the required schema.
Is there a way to skip a file if it's empty instead of failing the whole run?
e
The s7clarke10 variant has a `warning_if_no_file` setting that would probably help.
b
@edgar_ramirez_mondragon I tried it and I'm getting this error:
✘ ✝  ~/meltano-onmopay/test  meltano add extractor tap-s3-csv --variant s7clarke10
Added extractor 'tap-s3-csv' to your Meltano project
Variant:	s7clarke10
Repository:	<https://github.com/s7clarke10/pipelinewise-tap-s3-csv>
Documentation:	<https://hub.meltano.com/extractors/tap-s3-csv--s7clarke10>

Installing extractor 'tap-s3-csv'...
Extractor 'tap-s3-csv' could not be installed: failed to install plugin 'tap-s3-csv'.
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/pipelinewise-tap-s3-csv.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-req-build-b2gogocl
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/singer-encodings.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-install-lng669ok/singer-encodings_b27bdcd5a4d24eed9d19a1cf735b85ef
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/messytables.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-install-lng669ok/messytables_5bec95bf4d9b48329bbd4b6c9dc8b137
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      💥 maturin failed
        Caused by: pyproject.toml is invalid
        Caused by: pyproject.toml is not PEP 517 compliant: invalid type: string "on", expected a boolean for key `tool.maturin.strip` at line 12 column 9
      Error running maturin: Command '['maturin', 'pep517', 'write-dist-info', '--metadata-directory', '/private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-modern-metadata-1xsftzfi', '--interpreter', '/Users/baselabdo/meltano-onmopay/test/.meltano/extractors/tap-s3-csv/venv/bin/python']' returned non-zero exit status 1.
      Checking for Rust toolchain....
      Running `maturin pep517 write-dist-info --metadata-directory /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-modern-metadata-1xsftzfi --interpreter /Users/baselabdo/meltano-onmopay/test/.meltano/extractors/tap-s3-csv/venv/bin/python`
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Failed to install plugin(s)
e
Are you on Python 3.11 by chance? If that's the case, I think what's failing is the build of `orjson` (a transitive dependency), because it seems the build backend (maturin) changed along the way from the pinned `orjson==3.7.2`. My guess is that Python 3.10 wouldn't have this problem because binary wheels exist.
b
Yes I am 😄 will try 3.10
thanks
Tried with Python 3.10 and I'm still having the same issue:
(myenv)  ✝  ~/meltano-onmopay/test-2  python3 --version
Python 3.10.11
(myenv)  ✝  ~/meltano-onmopay/test-2  meltano add extractor tap-s3-csv --variant s7clarke10
Extractor 'tap-s3-csv' already exists in your Meltano project
To add it to your project another time so that each can be configured differently,
add a new plugin inheriting from the existing one with its own unique name:
	meltano add extractor tap-s3-csv--new --inherit-from tap-s3-csv

Installing extractor 'tap-s3-csv'...
Extractor 'tap-s3-csv' could not be installed: failed to install plugin 'tap-s3-csv'.
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/pipelinewise-tap-s3-csv.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-req-build-vcmne7xz
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/singer-encodings.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-install-n64fxycy/singer-encodings_2a8aab772124469b9b021870ac40e538
  Running command git clone --filter=blob:none --quiet <https://github.com/s7clarke10/messytables.git> /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-install-n64fxycy/messytables_654ce1341ff3459c80a85bf1419e2f20
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      💥 maturin failed
        Caused by: pyproject.toml is invalid
        Caused by: pyproject.toml is not PEP 517 compliant: invalid type: string "on", expected a boolean for key `tool.maturin.strip` at line 12 column 9
      Error running maturin: Command '['maturin', 'pep517', 'write-dist-info', '--metadata-directory', '/private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-modern-metadata-hz1geit2', '--interpreter', '/Users/baselabdo/meltano-onmopay/test-2/.meltano/extractors/tap-s3-csv/venv/bin/python']' returned non-zero exit status 1.
      Checking for Rust toolchain....
      Running `maturin pep517 write-dist-info --metadata-directory /private/var/folders/__/wnmv0801619bbtcjxpng6ghr0000gn/T/pip-modern-metadata-hz1geit2 --interpreter /Users/baselabdo/meltano-onmopay/test-2/.meltano/extractors/tap-s3-csv/venv/bin/python`
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Failed to install plugin(s)
e
Then that probably means you're on an M1 Mac. I think it's gonna be a pain to get that installed without figuring out the entire maturin + Rust toolchain from that point in time 😬.
b
@edgar_ramirez_mondragon I forked the transferwise variant repo of the tap-s3-csv extractor and manually ported the changes from the s7clarke10 variant into the fork. I reinstalled the plugins and the install passed. But when I ran the pipeline I got this error, and I couldn't figure out the cause (error log file attached):
```
2023-12-07T004419.537946Z [error ] Loading failed code=1 message=Exception: Line is missing required properties key(s): {} name=meltano run_id=e620a28c-79d0-4f24-9675-48f9dabb6181 state_id=2023-12-07T004405--tap-s3-csv--target-snowflake
2023-12-07T004419.538070Z [debug ] ELT could not be completed: Loader failed.
For more detailed log messages re-run the command using 'meltano --log-level=debug ...' CLI flag.
Note that you can also check the generated log file at '/Users/baselabdo/meltano-onmopay/test/.meltano/logs/elt/2023-12-07T004405--tap-s3-csv--target-snowflake/e620a28c-79d0-4f24-9675-48f9dabb6181/elt.log'.
For more information on debugging and logging: https://docs.meltano.com/reference/command-line-interface#debugging
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/meltano/cli/elt.py:288 in _run_elt
│
│   285 │   │   │   if elt_context.only_transform:
│   286 │   │   │   │   log.info("Extract & load skipped.")
│   287 │   │   │   else:
│ ❱ 288 │   │   │   │   await _run_extract_load(log, elt_context, output_logger)
│   289 │   │   │   │
│   290 │   │   │   if elt_context.transformer:
│   291 │   │   │   │   await _run_transform(log, elt_context, output_logger)
│
│ /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/meltano/cli/elt.py:357 in _run_extract_load
│
│   354 │   │   │   with loader_log.line_writer() as loader_log_writer:
│   355 │   │   │   │   with extractor_out_writer_ctxmgr() as extractor_out_writer:
│   356 │   │   │   │   │   with loader_out_writer_ctxmgr() as loader_out_writer:
│ ❱ 357 │   │   │   │   │   │   await singer_runner.run(
│   358 │   │   │   │   │   │   │   **kwargs,
│   359 │   │   │   │   │   │   │   extractor_log=extractor_log_writer,
│   360 │   │   │   │   │   │   │   loader_log=loader_log_writer,
│
│ /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/meltano/core/runner/singer.py:223 in run
│
│   220 │   │   async with tap.prepared(self.context.session), target.prepared(
```
e
The loader is failing, so the tap seems alright. Can you try running `meltano invoke tap-s3-csv > singer.jsonl` and inspect the contents of `singer.jsonl` for invalid lines?
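If the file is too big to eyeball, a rough Python sketch like the one below might help narrow things down. It is not part of Meltano or either tap variant. The function names `find_invalid_lines` and `main` are made up for illustration. It assumes the usual Singer message shape (one JSON object per line, `SCHEMA` messages carrying `key_properties`, `RECORD` messages carrying `record`) and guesses that the loader error refers to records lacking a declared key property, which may not be exactly what target-snowflake checks.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, reason) pairs for lines that look malformed.

    Assumption: each line should be one Singer message encoded as a JSON object.
    """
    problems = []
    key_props = {}  # stream name -> key_properties from the latest SCHEMA message
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            problems.append((number, "blank line"))
            continue
        try:
            message = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, f"not valid JSON: {exc.msg}"))
            continue
        if not isinstance(message, dict) or "type" not in message:
            problems.append((number, "missing 'type' field"))
            continue
        stream = message.get("stream")
        if message["type"] == "SCHEMA":
            # Remember which keys later records of this stream should carry.
            key_props[stream] = message.get("key_properties") or []
        elif message["type"] == "RECORD":
            record = message.get("record")
            if not isinstance(record, dict):
                problems.append((number, "RECORD without a 'record' object"))
                continue
            missing = [key for key in key_props.get(stream, []) if key not in record]
            if missing:
                problems.append((number, f"record is missing key properties: {missing}"))
    return problems


def main(path="singer.jsonl"):
    """Print every suspicious line found in the given file."""
    with open(path, encoding="utf-8") as handle:
        for number, reason in find_invalid_lines(handle):
            print(f"line {number}: {reason}")
```

Calling `main("singer.jsonl")` from a Python shell in the project directory prints one line per problem found. No output means only that these particular checks passed, not that the loader will accept the file.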