I'm using tap-csv (meltanolabs) and target-postgre...
# troubleshooting
c
I'm using tap-csv (meltanolabs) and target-postgres (transferwise). When a new table is created, the columns are in alpha order rather than the order from the source CSV. Is there a way to persist the column order from the CSV?
Actually, I just realized it's not creating the columns in alpha order. Source column order: name, address, phone. Database column order: address, phone, name. Any ideas? (Order by column name length, descending?)
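A quick way to check the two guesses against the column names quoted above. This is only a sketch of the comparison, not target-postgres code, and the variable names are made up for illustration. Three columns is a very small sample, so a match could easily be coincidence.

```python
# Column names as reported in the message above.
source_order = ["name", "address", "phone"]
observed_db_order = ["address", "phone", "name"]

# Guess 1: the target sorts column names alphabetically.
alphabetical = sorted(source_order)

# Guess 2: the target sorts by column name length, longest first.
by_length_desc = sorted(source_order, key=len, reverse=True)

print("alphabetical matches:", alphabetical == observed_db_order)
print("length-desc matches: ", by_length_desc == observed_db_order)
```

With these three names the alphabetical guess does not match and the length guess does, but an unordered collection somewhere in the loader could produce the same result, so it would need testing with more columns.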
e
Hi @chrish! I just confirmed that tap-csv seems to respect the order in the source files, so this is probably a bug in target-postgres. They do seem to alpha-sort at first glance in https://github.com/transferwise/pipelinewise-target-postgres/blob/4f9aad7ef1a0314578ea1c299ef6a9b6d18537b2/target_postgres/db_sync.py#L124-L125
t
It's annoying, but I'm not sure I'd call it a bug. Queries shouldn't really rely on column order, and since the target may add columns of its own (e.g. _sdc_batched_at), you usually won't end up with the tables looking exactly the same anyway.
c
Perhaps "bug" is too strong, but it is certainly unexpected. I can't think of any cases where changing the column order by default would be a desirable feature. @thomas_briggs - I believe you are right that queries should not rely on column order, but humans do. In my particular case there are people who inspect the table to review the data, and when you have name columns, address columns, and 50+ other columns in "random" order from a usefulness standpoint, it slows down the humans.
Reading through the code, I'm guessing that the sort is there to facilitate the comparison of the existing columns with the new ones, but I'm not sure yet. It seems that there should be a setting to persist the original column order, and perhaps that persisting the order should be the default.
I'm just not sure yet how critical the sort order is to the rest of the loader's functionality. Anyone know?
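For what it's worth, comparing existing columns with incoming ones does not need a sort in principle. Here is a minimal sketch, assuming the loader has both column lists available as plain lists of names. The function names `columns_to_add` and `merged_column_order` are hypothetical and do not come from target-postgres.

```python
def columns_to_add(existing_columns, incoming_columns):
    """Return incoming columns missing from the table, in source order."""
    existing = set(existing_columns)  # set is used only for membership tests
    return [col for col in incoming_columns if col not in existing]


def merged_column_order(existing_columns, incoming_columns):
    """Keep the table's current order and append new columns at the end."""
    return list(existing_columns) + columns_to_add(existing_columns, incoming_columns)


# First load: the table does not exist yet, so source order is kept as-is.
print(merged_column_order([], ["name", "address", "phone"]))

# Later load: the source gained an "email" column in the middle.
print(merged_column_order(
    ["name", "address", "phone"],
    ["name", "email", "address", "phone"],
))
```

Whether the real loader depends on sorted columns elsewhere (for example when building its COPY or INSERT statements) is a separate question this sketch does not answer.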
t
Agreed that it's annoying from a human perspective. 😉 Your question about whether the sorting of columns is necessary is a good one. I believe new columns get added to the end of the table, so if the schema changes the columns ultimately end up out of alpha order anyway, so... I would think it isn't truly necessary, but I don't really know.