# troubleshooting
s
I expected the snowflake target to support table schema changes. However, it looks like I'm getting an error when a new column is added to my source. Should this target be able to handle schema changes, and is there something I might be missing to make this work properly? Also, it appears that something is deleting the internal stage files, so I can't investigate them directly. Anyone know any tricks to turn off auto-deleting of these stage files?
```
2022-09-12T17:07:03.599035Z [info     ] snowflake.connector.errors.ProgrammingError: 100080 (22000): Number of columns in file (104) does not match that of the corresponding table (105), use file format option error_on_column_count_mismatch=false to ignore this error cmd_type=elb consumer=True name=target-snowflake--etahub producer=False stdio=stderr string_id=target-snowflake--etahub

2022-09-12T17:07:03.599148Z [info     ]   File '@LGENS/batch_spn8i2_5.csv.gz', line 2, character 1 cmd_type=elb consumer=True name=target-snowflake--etahub producer=False stdio=stderr string_id=target-snowflake--etahub
```
p
@sterling_paramore I don't know all the details, but it does support schema changes; check out the Transferwise documentation on how they do it. Also check out the `archive-load-files` setting for archiving the files in S3. I haven't used it, but it's in the README.
s
We don’t use S3 and are just relying on the internal stage.
p
@sterling_paramore Ah OK, yeah, I do the same. Now you've got me curious too 😄. I see that it cleans up after itself after the load. I wonder if you could run a list against your stage, `list @%mytable`, which looks like it matches the table name. If the load failed, I suspect it wouldn't have cleaned up the stage, so show stages should include the failed stage. Probably depends on the failure and how they're handled. Is that helpful?
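Something like this, if you want to poke at it (the table and schema names here are placeholders for your own):

```sql
-- Placeholder names: swap in your own table and schema.
-- List files sitting in the table stage the target loads through:
LIST @%MYTABLE;

-- And list the stages in the schema to spot anything left behind:
SHOW STAGES IN SCHEMA MYSCHEMA;
```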
s
I don't really know what's going on, because I actually can see them today. I must have made some mistake with the database/schema I was using that I didn't appreciate at the time. The root issue, though, is that in my source data a column was renamed, and it doesn't appear there's anything in this target that can handle that situation. The section about columns to replace only considers columns whose datatypes have changed. I would have expected the target to drop (or version) the old column name and create a new one. However, there doesn't appear to be anything in the code that would detect that a column exists in Snowflake but not in the data stream.
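For anyone comparing by hand later: one way to see what Snowflake currently thinks the table looks like (so you can diff it against the columns in the incoming stream) is the standard `INFORMATION_SCHEMA.COLUMNS` view; the database/schema/table names below are placeholders:

```sql
-- Placeholder identifiers: MYDB, MYSCHEMA, MYTABLE.
-- Columns listed here but absent from the stream are the orphans
-- the target doesn't currently detect.
SELECT column_name, data_type
FROM MYDB.INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'MYSCHEMA'
  AND table_name = 'MYTABLE'
ORDER BY ordinal_position;
```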
p
That makes sense. Do you have a workaround to get it fixed now that you know what's happening? I suppose you could alter your existing column name manually (see the sketch below). Another thing to note is that we're planning to build and maintain Meltano SDK based versions of some of the most used taps/targets so they're stable; feel free to add any context related to this that might be helpful in the target-snowflake issue!
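If you do go the manual route, the rename itself is a one-liner; the table and column names here are hypothetical:

```sql
-- Hypothetical names: adjust to your table.
-- Align the table's old column name with the renamed column in the stream:
ALTER TABLE MYSCHEMA.MYTABLE RENAME COLUMN OLD_COLUMN_NAME TO NEW_COLUMN_NAME;
```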
s
Working on it…
Hmmm… so maybe I don't actually need to drop the column. The solution was right there in the error message: `use file format option error_on_column_count_mismatch=false to ignore this error`. I was concerned this might cause data to end up in the wrong columns, but if I change the file format to use this option, the data gets loaded as expected. This wasn't a problem with adding a column as I originally suspected, but with renaming a column (i.e., one drop plus one add), so the number of columns in the data stream is less than the number of columns in the table. With that option set to ignore the error, it works. The old column is still there and has the data it used to have; it just won't be populated for new data.
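For posterity, the fix was just flipping that option on the CSV file format the target loads through. I'm sketching it with a placeholder format name; `SHOW FILE FORMATS` will tell you what yours is actually called:

```sql
-- Placeholder format name: find the real one with SHOW FILE FORMATS.
-- Lets COPY load files whose column count differs from the table's:
ALTER FILE FORMAT MYSCHEMA.MY_CSV_FORMAT
  SET ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE;
```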