Hey all, I am involved in a project to take data ...
# troubleshooting
m
Hey all, I'm working on a project that takes data from several different sources and sends it to BigQuery. One of these sources is GSpread. To load this data, I am using tap-gsheet and transfer-bigquery, and both work perfectly. My problem is that the data contains special characters that BQ does not accept. I looked at the documentation for transformations with dbt and ran some tests, but I could not get the result I wanted. I need to normalize the data by removing any special characters before sending it to BQ. Does anyone know the best way to do this? Is there a project or example I could look at?
v
A few options:
1. Change the BigQuery target so it converts incoming data to characters it accepts.
2. Change the tap-gsheet connector to send data that BigQuery likes (not really a good option; the tap should send the data in the closest form possible to the source).
3. Add a step in between that does a transformation like the one you're describing. Putting https://github.com/transferwise/pipelinewise-transform-field in the middle and altering the data may work for your case.
Altering the target is probably the best way forward here.
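For anyone who wants to try option 3 by hand, here is a minimal sketch of an in-between step. It assumes the tap emits Singer-style messages, one JSON object per line, with the row data under `record` in messages of type `RECORD`. The function names (`strip_accents`, `clean_value`, `transform_line`, `run`) are made up for illustration; this is not how pipelinewise-transform-field works, and it is untested against tap-gsheet.

```python
import json
import unicodedata


def strip_accents(text):
    """Decompose accented characters and drop the combining marks (e.g. a-acute becomes a)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean_value(value):
    """Recursively clean string values inside a record; keys and other types are left alone."""
    if isinstance(value, str):
        return strip_accents(value)
    if isinstance(value, dict):
        return {key: clean_value(item) for key, item in value.items()}
    if isinstance(value, list):
        return [clean_value(item) for item in value]
    return value


def transform_line(line):
    """Clean RECORD messages; SCHEMA, STATE and anything else pass through unchanged."""
    message = json.loads(line)
    if message.get("type") == "RECORD":
        message["record"] = clean_value(message["record"])
    return json.dumps(message)


def run(lines, out):
    """Read messages from an iterable of lines and write cleaned messages to a file-like object."""
    for raw in lines:
        if raw.strip():
            out.write(transform_line(raw) + "\n")


# To use as a pipe step between tap and target (assumption: both speak Singer over stdio):
#   import sys; run(sys.stdin, sys.stdout)
```

Note that this throws information away (the accents are gone for good), which is part of why fixing the target is the cleaner route.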
d
I'd be curious as to what kind of characters are causing this. I agree with @visch on #1; the adswerve folks are pretty receptive to PRs if you make a fix.
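If it helps to pin down which characters are involved, a quick sketch like the one below lists every non-ASCII character in a cell value along with its code point and Unicode name. The helper name `describe_non_ascii` is hypothetical, and this only inspects a Python string; it says nothing about what BigQuery itself rejects.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each distinct non-ASCII character."""
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]


# Example call on a sample value:
#   describe_non_ascii("Coment\u00e1rio")
```

A precomposed accented letter shows up as a single entry (for example LATIN SMALL LETTER A WITH ACUTE), while a decomposed one shows up as a separate COMBINING ACUTE ACCENT, which can be a useful clue about how the sheet data is encoded.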
m
Thanks Derek, that is much better than the options I had in mind. I'll run the tests!
For example, @dan_ladd: one of the spreadsheets has a column called Comments. BQ does not accept the character a with an acute accent (á), and the process ends with an error. It only worked when I replaced the character with an unaccented one.
All of our content is in pt_BR