We need a page where users can manually upload CSV...
# getting-started
a
We need a page where users can manually upload CSV files to be loaded into BigQuery — does Meltano have anything built-in for a file upload form?
a
I haven't done this before, but in theory you could use something like tap-spreadsheets-anywhere for this...
You'd presumably then end up with something like:
1. Users can upload spreadsheet content to _ (fill in approved location).
2. This gets executed on a schedule or by a trigger:
meltano run tap-spreadsheets-anywhere target-bigquery
a
Thanks AJ. I will look into that. I am hoping to run Meltano synchronously so the uploader can see errors/success
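A minimal sketch of the synchronous run described above: the upload handler blocks on the command and hands the uploader a success/failure flag plus the tail of the log. It assumes `meltano` is on the PATH and exits non-zero when the tap or target fails. `RunResult` and `run_pipeline` are hypothetical names for illustration, not part of Meltano.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class RunResult:
    ok: bool          # True when the command exited with code 0
    returncode: int
    log_tail: str     # last few lines of combined stdout/stderr


def run_pipeline(command, timeout_s=600, tail_lines=20):
    """Run a command to completion and summarize the outcome for the uploader."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except FileNotFoundError as exc:
        # The executable is not installed or not on PATH.
        return RunResult(False, 127, str(exc))
    except subprocess.TimeoutExpired:
        return RunResult(False, -1, f"timed out after {timeout_s}s")
    lines = (proc.stdout + proc.stderr).splitlines()
    return RunResult(proc.returncode == 0, proc.returncode, "\n".join(lines[-tail_lines:]))


# The command from this thread; assumes both plugins are already configured.
MELTANO_COMMAND = ["meltano", "run", "tap-spreadsheets-anywhere", "target-bigquery"]

# Usage inside an upload handler (not executed here):
#   result = run_pipeline(MELTANO_COMMAND)
#   message = "Loaded OK" if result.ok else f"Load failed:\n{result.log_tail}"
```

The log tail is unstructured text, so this only tells the uploader that something failed and shows the raw error; it does not give per-row validation results.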
a
Hey @aaron is the file in a known format? Do you need validation errors from an uploaded file? If you need the validation results, I see 2 options:
1. Do the validation at file upload (PUT) before triggering Meltano for processing.
2. PUT, trigger 'meltano run', then GET for results: count of records processed, or errors.
PUTting the file to a location and then having it processed seems reasonable, but all validation would be outside Meltano, so it feels like Meltano wouldn't be doing much for you. Polling (GET) an API endpoint for validation results seems like an unhandled case too: I don't know of any structured output from Meltano. 'meltano run' just fails with errors from the tap. 'meltano test' could have structured output but, from memory, doesn't.
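Option 1 (validate at upload, before Meltano is involved) could look roughly like the sketch below. It assumes the file is in a known format with a fixed header; `EXPECTED_COLUMNS` and `validate_csv` are hypothetical names, and the column list is a placeholder.

```python
import csv
import io

# Hypothetical expected header; replace with the real column names.
EXPECTED_COLUMNS = ["id", "name", "amount"]


def validate_csv(text, expected_columns=EXPECTED_COLUMNS, max_errors=10):
    """Return a list of human-readable problems; an empty list means the file looks OK."""
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    header = [h.strip() for h in header]
    if header != list(expected_columns):
        return [f"unexpected header: {header}, expected {list(expected_columns)}"]
    errors = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            errors.append(
                f"line {reader.line_num}: expected {len(header)} fields, got {len(row)}"
            )
            if len(errors) >= max_errors:
                break  # don't flood the uploader with messages
    return errors
```

The upload endpoint can return these messages straight away and only trigger the pipeline when the list is empty. This checks structure only (header and field counts), not types or values.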
a
So many Aarons in this thread! 😄
(AJ = Aaron John)
n
BigQuery has native features for CSV files. Example: you can set a blob storage trigger so that when a CSV file gets put into a particular storage bucket (either directly, or via a web form that puts the file in the bucket) it triggers a load into BigQuery. Then leverage BigQuery's schema autodetect to read the CSV (it reads the first 100 rows or so to detect the schema … it's generally pretty good unless the data is inconsistent and those first rows look nothing like the rest of the file). There are also options to set the number of errors to ignore before failing, and to handle quoted strings, jagged rows, etc. Just another option.
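A rough sketch of the load step only, using the `google-cloud-bigquery` Python client (which must be installed, with credentials configured). The function names, the bucket URI and the table ID are hypothetical; the storage trigger that would call `load_csv` is not shown.

```python
def csv_load_options(max_bad_records=0):
    """Keyword arguments for bigquery.LoadJobConfig covering the options mentioned above."""
    return {
        "source_format": "CSV",
        "autodetect": True,              # let BigQuery infer the schema from a sample
        "skip_leading_rows": 1,          # treat the first row as a header
        "max_bad_records": max_bad_records,  # errors to tolerate before the job fails
        "allow_jagged_rows": True,       # missing trailing columns are read as NULL
        "allow_quoted_newlines": True,   # newlines inside quoted fields are allowed
    }


def load_csv(uri, table_id, max_bad_records=0):
    """Load one CSV from Cloud Storage into a table and return the number of rows loaded."""
    from google.cloud import bigquery  # imported here so the options helper has no dependency

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(**csv_load_options(max_bad_records))
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    job.result()  # blocks until the load finishes; raises if the job failed
    return job.output_rows


# Usage (hypothetical names, not executed here):
#   rows = load_csv("gs://my-upload-bucket/file.csv", "my_project.my_dataset.uploads")
```

Because `job.result()` blocks and raises on failure, the same call can also be made directly from an upload handler when the uploader needs to see the outcome.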
v
CSVs via websites and users = hard. https://flatfile.com/ = much easier 🤷. If I wanted free/cheap and my users were technical, I'd go with GitLab/GitHub and have the user upload a CSV file to /seeds in dbt: https://docs.getdbt.com/docs/building-a-dbt-project/seeds