simon_podhajsky
04/29/2024, 4:35 AM
I have a run tap-mysql target-duckdb dbt-duckdb:build pipeline setup, but I'd like to then create another .duckdb file with a small subset of the tables (notably excluding most of those created in the original target-duckdb loader action). Is this possible, or do I need to define another pipeline?
simon_podhajsky
04/29/2024, 5:17 AM
I tried a tap-duckdb duckdb-subsetter pipeline run, where duckdb-subsetter is set to inherit_from: target-duckdb. This seems workable but gets stuck very early on the following error:
2024-04-29T05:06:28.309258Z [info ] time=2024-04-29 05:06:28 name=target_duckdb level=CRITICAL message=Primary key is set to mandatory but not defined in the [main-final__all_achievements] stream cmd_type=elb consumer=True job_name=dev:tap-duckdb-to-duckdb-subsetter name=duckdb-subsetter producer=False run_id=4e28d909-bd6c-45a7-8ce3-2456a43bf1fb stdio=stderr string_id=duckdb-subsetter
The issue is that the table that's being replicated has no primary key requirement (though perhaps I'm misunderstanding the error message and it is in fact the database settings that require the primary key?):
-- adk_wrapped.main.final__all_achievements definition
CREATE TABLE final__all_achievements(clovek_id BIGINT,
school_year BIGINT,
achievement_id VARCHAR,
achievement_name VARCHAR,
achievement_description VARCHAR,
achievement_data JSON,
achievement_type VARCHAR,
achievement_priority INTEGER,
achievement_image VARCHAR);
I suppose to test that earlier hypothesis, I can try to define a primary key on final__all_achievements via a dbt constraint, but that seems like a pretty roundabout way of going at it. Any thoughts? Searching for the error in the Slack turns up nothing.
Edgar Ramírez (Arch.dev)
04/29/2024, 9:10 PM
You can set primary_key_required: false on the inherited target.
That said, I would try a different approach that doesn't require re-exporting a subset of the tables and instead use ATTACH to materialize those tables in a different database. Maybe use path for the desired subset of tables, and attach for the database generated by the EL pipeline.
simon_podhajsky
04/30/2024, 5:17 AM
Setting primary_key_required: false takes me to a different error (target-duckdb tries to create a table with no columns), so I'll investigate the ATTACH option first - thanks for pointing it out!
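For reference, the ATTACH approach suggested above might look roughly like the following in plain DuckDB SQL. This is only a minimal sketch under assumptions, not the setup actually used in the thread: the file name adk_wrapped.duckdb is hypothetical (the alias is borrowed from the database name in the table definition above), and the statements are meant to be run while connected to the new, smaller database file.

```sql
-- Assumption: the full database produced by the EL pipeline lives in
-- 'adk_wrapped.duckdb' (hypothetical file name). Attach it read-only
-- so the subset build cannot modify it.
ATTACH 'adk_wrapped.duckdb' AS adk_wrapped (READ_ONLY);

-- Materialize only the wanted table into the current (subset) database.
-- Repeat this statement for each table that belongs in the subset.
CREATE OR REPLACE TABLE final__all_achievements AS
SELECT *
FROM adk_wrapped.main.final__all_achievements;

-- Detach the source database once the copy is done.
DETACH adk_wrapped;
```

With the DuckDB CLI, something like duckdb subset.duckdb < subset.sql should run it, where both file names are again placeholders. The path and attach settings in a dbt-duckdb profile should achieve much the same thing from inside dbt, with path pointing at the subset file and attach at the pipeline's database.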