ian_lewis
05/15/2023, 7:25 AM
`tap-spreadsheets-anywhere`?
We have a case where spreadsheet data is provided to us in a non-standard way. Basically, before the actual data begins there are a number of rows that comprise titles, descriptions and other preamble. There are also blank lines.
It seems `tap-spreadsheets-anywhere` chokes on blank lines and fails to process any further lines.
Any experience or suggestions with this issue?
aaron_phethean
05/15/2023, 10:43 AM
ian_lewis
05/15/2023, 10:48 AM
`tap-spreadsheets-anywhere` had a release, it would save using specific commits 🤷
aaron_phethean
05/15/2023, 11:02 AM
craig_astill
05/15/2023, 11:26 AM
`skip_initial` changes (https://github.com/ets/tap-spreadsheets-anywhere/pull/37) supported skipping over rows with data in them.
To skip over blank rows, it looks like the skip needs to be pushed into the Excel `generator_wrapper` (https://github.com/ets/tap-spreadsheets-anywhere/blob/main/tap_spreadsheets_anywhere/excel_handler.py#L9-L32), before the `header_row` is populated. This avoids the `IndexError` raised when this function parses a blank row.
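A minimal sketch of that idea, not the tap's actual code: `is_blank` and `rows_as_dicts` are hypothetical names, and rows are assumed to arrive as plain sequences of cell values. It only deals with blank rows; preamble rows that contain text would still need `skip_initial`.

```python
from typing import Any, Dict, Iterable, Iterator, Sequence


def is_blank(row: Sequence[Any]) -> bool:
    """Treat a row as blank when every cell is None or whitespace-only."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def rows_as_dicts(rows: Iterable[Sequence[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per data row, keyed by the first non-blank row."""
    header_row = None
    for row in rows:
        if is_blank(row):
            # Skip blank rows before and after the header is found,
            # so an empty row is never indexed against the header.
            continue
        if header_row is None:
            header_row = [str(cell) for cell in row]
            continue
        # zip() stops at the shorter sequence, so a short row
        # cannot raise IndexError here.
        yield dict(zip(header_row, row))


# Hypothetical sample: blank rows before the header and between data rows.
sample = [
    (),
    (None, None),
    ("Date", "ULSP_per_litre"),
    ("", "  "),
    ("2023-05-15", 145.2),
]
records = list(rows_as_dicts(sample))
```

Doing the blank check before the header is assigned is the important part: the first non-blank row becomes the header, and later blank rows are dropped instead of being parsed.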
I'm thinking of cleaning up my experiment and raising an issue + PR. Although I will check out your links @aaron_phethean to see if there is a cleaner way.
aaron_phethean
05/15/2023, 11:29 AM
"field_names":["Date","ULSP_per_litre","ULSD_per_litre","ULSP_duty","ULSD_duty","ULSP_vat_pc","ULSD_vat_pc"],
craig_astill
05/15/2023, 11:31 AM
`field_names` didn't help. The tap blows up during sampling of the file in the discovery phase, instead of later on when `field_names` are used.
aaron_phethean
05/15/2023, 11:31 AM
craig_astill
05/15/2023, 2:13 PM
craig_astill
05/16/2023, 3:24 PM
craig_astill
05/22/2023, 4:50 PM
`zipfile.ZipFile(file_handl)` blows up on an S3 sourced file.
Any ideas?
craig_astill
05/22/2023, 4:52 PM
Matt Menzenski
05/22/2023, 11:46 PM
craig_astill
05/23/2023, 7:38 AM
peter_s
05/28/2023, 8:17 PM
Would anyone pay a small fee for a tap with 12-months support, regular patch releases, and an upgrade option in meltano?
We’d definitely be interested.