I'm not sure this is the best place to ask, since the question stems from troubleshooting a toy example inspired by
the modern data stack, so I'm happy to move it elsewhere.
How could I improve the performance of a similar Meltano pipeline that loads a CSV of ~1M records into a DuckDB database? In my pipeline, I used
the Meltano variant of tap-csv, and
jwills' target-duckdb. The pipeline took ~2 hours to complete. Importing the same CSV with DuckDB directly takes nowhere near as long (seconds to minutes).