# troubleshooting
p
Hi all, I'm debugging an issue with a tap-snowflake-to-target-csv pipeline. It runs fine locally, but when I package it into a Docker image and run it in a k8s pod attached to an EFS volume, I get the error below and I'm not sure how to debug it.
```
2022-12-16T23:42:05.670660Z [info ] time=2022-12-16 23:42:05 name=singer level=INFO message=METRIC: {"type": "counter", "metric": "record_count", "value": 59725, "tags": {"database": "ANALYTICS", "table": "SALES_INVOICE_V"}} cmd_type=elb consumer=False name=tap-snowflake-sap-sales-invoice producer=True stdio=stderr string_id=tap-snowflake-sap-sales-invoice
2022-12-16T23:43:04.671892Z [error ] Loader failed
Traceback (most recent call last):

/.venv/lib/python3.9/site-packages/meltano/core/logging/output_logger.py:201 in redirect_logging

    198             *ignore_errors,
    199         )
    200         try:
  ❱ 201             yield
    202         except ignored_errors:  # noqa: WPS329
    203             raise
    204         except Exception as err:

  locals:
    err            = RunnerError('Loader failed')
    ignore_errors  = ()
    ignored_errors = (<class 'KeyboardInterrupt'>, <class 'asyncio.exceptions.CancelledError'>)
    logger         = <RootLogger root (INFO)>
    self           = <meltano.core.logging.output_logger.Out object at 0x7f8c9b48a970>

/.venv/lib/python3.9/site-packages/meltano/core/block/extract_load.py:461 in run

    458             # TODO: legacy `meltano elt` style logging should be deprecated
    459             legacy_log_handler = self.output_logger.out("meltano", logger)
    460             with legacy_log_handler.redirect_logging():
  ❱ 461                 await self.run_with_job()
    462                 return
…
```
w
Looks like you're getting exit code 9, i.e. the loader process is being killed with SIGKILL, which usually means it's running out of memory.
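If it helps, one way to confirm that is to look at the pod's last termination state. Here's a minimal sketch assuming the official `kubernetes` Python client, with placeholder pod and namespace names:
```
# Hedged sketch: check why the pipeline pod's container last terminated.
# The pod and namespace names below are placeholders, not from the thread.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

pod = v1.read_namespaced_pod(name="meltano-pipeline-pod", namespace="default")
for status in pod.status.container_statuses or []:
    last = status.last_state.terminated
    if last is not None:
        # reason == "OOMKilled" and exit_code == 137 (128 + SIGKILL) point at memory pressure
        print(status.name, last.reason, last.exit_code)
```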
p
That's problematic.
Thanks for pointing that out.
So it's not ideal to hold a large Snowflake table in memory while it waits to be written to CSV. Are there any strategies for handling this, like using partition keys? Do any CSV load plugins do chunked loading?
I see that `target-csv` (MeltanoLabs variant) implements `process_batch`. Would this help in copying only batches at a time from Snowflake to CSV? Does the tap need to implement batch as well?
w
@edgar_ramirez_mondragon would know better than I
e
> Would this help in copying only batches at a time from Snowflake to CSV?
It doesn't, unfortunately. It does process records in batches, but there's nothing controlling the maximum batch size. Some targets allow you to fine-tune this batch size, but that one doesn't.
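For context, in Singer SDK based targets that limit usually comes from the sink's `max_size`. A rough sketch of what a capped sink could look like (the `CappedCsvSink` name and the 10,000-record limit are made up for illustration; this is not target-csv's actual code):
```
# Hedged sketch (not target-csv's real code): a Singer SDK BatchSink that caps
# how many records it buffers before process_batch() is called to drain them.
from singer_sdk.sinks import BatchSink


class CappedCsvSink(BatchSink):
    """Hypothetical sink that flushes every 10,000 records."""

    max_size = 10_000  # drain the batch once this many records are buffered

    def process_batch(self, context: dict) -> None:
        # context["records"] holds at most `max_size` records here;
        # a real sink would append them to the stream's CSV file.
        for record in context["records"]:
            ...
```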
> Does the tap need to implement batch as well?
@peter_pezon not necessarily. Most targets should be able to batch the records they read from stdin and commit them based on a few triggers, e.g. a SCHEMA message from a different stream or a STATE message. https://github.com/MeltanoLabs/target-csv/issues/3
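In other words, the generic pattern looks roughly like this (a plain-Python sketch of the Singer message flow, not target-csv's actual implementation; the 10,000-record threshold is made up):
```
# Hedged sketch of the generic Singer target pattern described above:
# buffer RECORD messages read from stdin and flush ("commit") them when a
# SCHEMA or STATE message arrives, or when a size threshold is hit.
import json
import sys

MAX_BATCH = 10_000  # illustrative threshold; real targets choose/expose their own
buffers: dict[str, list[dict]] = {}


def flush(stream: str) -> None:
    records = buffers.pop(stream, [])
    if records:
        # a real CSV target would append these rows to the stream's CSV file
        print(f"flushing {len(records)} records for {stream}", file=sys.stderr)


for line in sys.stdin:
    message = json.loads(line)
    if message["type"] == "RECORD":
        stream = message["stream"]
        buffers.setdefault(stream, []).append(message["record"])
        if len(buffers[stream]) >= MAX_BATCH:
            flush(stream)
    elif message["type"] in ("SCHEMA", "STATE"):
        # commit everything buffered so far before handling the new message
        for stream in list(buffers):
            flush(stream)
```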