Question about FULL_TABLE replication, SQL=>Snowfl...
# troubleshooting
j
Question about FULL_TABLE replication, SQL=>Snowflake. One of our tables in SQL does a bit of an odd thing where it only wants a singular version of two records at a time. When there is an update it hard deletes the old record and inserts a new one in. When the FULL_TABLE replication happened, I expected the two columns to now be loaded into Snowflake and the old two rows removed; but it Snowflake still kept the old two rows that were deleted. Is this intended behavior with FULL_TABLE replication? Does it just copy the table as is and leaves any preexisting rows alone?
image.png
-2147479760 is the old version, and today it got updated to -2147479674. 760 was deleted from sql but was kept in Snowflake with full table replicate
Copy code
2024-11-30 07:38:08 +0000 - dagster - INFO - elt_run - db0cb6d0-eaac-4268-8f66-665d24491972 - elt_run_elOp[orgSettings] - 2024-11-30T07:38:07.877745Z [info     ] 2024-11-30 07:38:07,875 | INFO     | target-snowflake.Content-OrganizationSettingsEnvironmentTracking | Cleaning up Content-OrganizationSettingsEnvironmentTracking cmd_type=elb consumer=True job_name=prod:tap-mssql-orgSettings-to-target-snowflake name=target-snowflake producer=False run_id=02e69a59-c13d-460b-ab64-a134b808d7a0 stdio=stderr string_id=target-snowflake
2024-11-30 07:38:08 +0000 - dagster - INFO - elt_run - db0cb6d0-eaac-4268-8f66-665d24491972 - elt_run_elOp[orgSettings] - 2024-11-30T07:38:07.877960Z [info     ] 2024-11-30 07:38:07,875 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"stream": "Content-OrganizationSettingsEnvironmentTracking", "pid": 722748}} cmd_type=elb consumer=True job_name=prod:tap-mssql-orgSettings-to-target-snowflake name=target-snowflake producer=False run_id=02e69a59-c13d-460b-ab64-a134b808d7a0 stdio=stderr string_id=target-snowflake
It confirmed here that only two rows were loaded
{"type": "counter", "metric": "record_count", "value": 2, "tags": {"stream": "Content-OrganizationSettingsEnvironmentTracking", "pid": 722748}}
Copy code
plugins:
  extractors:
  - name: tap-mssql-orgSettings
    inherit_from: tap-mssql
    config:
      stream_maps:
        Content-OrganizationNameLookup:
          Checksum: __NULL__
    select:
    - Content-FiscalYearSettings.*
    - Content-OrganizationSettingsEnvironmentTracking.*
    - Content-OrganizationLocale.*
    - Content-OrganizationNameLookup.*
    - Content-OrganizationNameLocalization.*
☝️ current config
I thought it would do a full replace of the table, so to speak each time
e
Maybe https://github.com/MeltanoLabs/target-snowflake/pull/285 is relevant. I haven't tested the PR myself, though I'd be surprised if that did it cause
allow_overwrite
seems like a no-op upstream. It's at least worth trying with
load_method: overwrite
.
j
I'll take a look. Thanks as always Edgar
The
load_method: overwrite
worked, thank you!
But only for FULL_TABLE. It messes with the INCREMENTAL load where it only keeps the rows it loads in and dumps everything else
I guess I'll need two sets of loaders, one for my incremental where i dont allow_override and one for my FULL_TABLE that overrides it
e
Ah, https://github.com/meltano/sdk/issues/1350 is in my TODO list for the coming weeks!
melty bouncy 1
❤️ 1
j
As an aside Edgar, I don't know if that PR you linked should be merged. You can specify the overwrite behavior as part of the config already and it's part of the target-snowflake github. Not my call, but just based on what i've seen over the weekend.