isabella
07/05/2023, 10:22 AM
janis_puris
07/05/2023, 10:56 AM
meltano invoke --dump=catalog tap-s3-csv > catalog.json
Do the column types make sense compared to the CSV you are retrieving?
Henning Holgersen
07/05/2023, 11:04 AM
isabella
07/05/2023, 1:24 PM
janis_puris
07/05/2023, 1:25 PM
isabella
07/05/2023, 3:41 PM
janis_puris
07/05/2023, 5:30 PM
meltano run tap-s3-csv target-jsonl
recipe
mkdir -p tmp && cd $_
git clone git@github.com:isareply/meltanodemo.git
cd meltanodemo
python3 -m venv venv && source venv/bin/activate
pip install meltano
aws --profile dataplatform-test s3 cp test.csv s3://dpt-jps-sandbox/tmp/test.csv
cd s3-to-snowflake-test
replace the meltano.yml with the attached version
check select
❯ meltano select --list tap-s3-csv
2023-07-05T17:22:14.265194Z [info ] The default environment 'dev' will be ignored for `meltano select`. To configure a specific environment, please use the option `--environment=<environment name>`.
2023-07-05T17:22:15.251298Z [warning ] A catalog file was found, but it will be ignored as the extractor does not advertise the `catalog` or `properties` capability
Legend:
SelectionType.SELECTED
SelectionType.EXCLUDED
SelectionType.AUTOMATIC
Enabled patterns:
*.*
Selected attributes:
[SelectionType.SELECTED] test._sdc_extra
[SelectionType.SELECTED] test._sdc_source_bucket
[SelectionType.SELECTED] test._sdc_source_file
[SelectionType.SELECTED] test._sdc_source_lineno
[SelectionType.SELECTED] test.field1
[SelectionType.SELECTED] test.field2
[SelectionType.SELECTED] test.field3
[SelectionType.SELECTED] test.field4
[SelectionType.SELECTED] test.field5
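On the column-type question: the dumped catalog.json can be summarized per stream with a short script before running the pipeline. This is only a sketch, assuming the catalog follows the usual Singer layout (a top-level `streams` list whose entries carry `tap_stream_id` and a JSON `schema`). The function name `summarize_types` and the sample catalog are hypothetical, not taken from the real dump.

```python
import json


def summarize_types(catalog):
    """Map each stream name to {column: declared JSON-schema type}."""
    summary = {}
    for stream in catalog.get("streams", []):
        # Fall back to "stream" if "tap_stream_id" is absent.
        name = stream.get("tap_stream_id") or stream.get("stream")
        props = stream.get("schema", {}).get("properties", {})
        summary[name] = {col: spec.get("type") for col, spec in props.items()}
    return summary


# Hypothetical stand-in for the real catalog.json contents.
sample_catalog = {
    "streams": [
        {
            "tap_stream_id": "test",
            "schema": {
                "properties": {
                    "field1": {"type": ["null", "string"]},
                    "_sdc_source_lineno": {"type": "integer"},
                }
            },
        }
    ]
}

print(json.dumps(summarize_types(sample_catalog), indent=2))
```

For the real file, load it with `json.load(open("catalog.json"))` and pass the result to `summarize_types`, then compare the printed types against the CSV columns.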
run it
❯ meltano run tap-s3-csv target-jsonl
2023-07-05T17:24:44.241256Z [info ] Environment 'dev' is active
2023-07-05T17:24:46.236458Z [warning ] No state was found, complete import.
...
2023-07-05T17:24:49.982994Z [info ] Incremental state has been updated at 2023-07-05 17:24:49.982941.
2023-07-05T17:24:49.987608Z [info ] Block run completed. block_type=ExtractLoadBlocks err=None set_number=0 success=True
check the first jsonl entry
❯ head -1 output/test.jsonl | jq
{
  "field1": "value1_1",
  "field2": "value1_2",
  "field3": "value1_3",
  "field4": "value1_4",
  "field5": "value1_5",
  "_sdc_source_bucket": "dpt-jps-sandbox",
  "_sdc_source_file": "tmp/test.csv",
  "_sdc_source_lineno": 2
}
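The same check can be scripted instead of eyeballing the jq output. This is a rough sketch, assuming the target writes one JSON object per line. The helper names `first_record` and `missing_fields` are made up for illustration, and the sample line is copied from the record shown above.

```python
import json


def first_record(lines):
    """Return the first non-empty JSONL line parsed as a dict, or None."""
    for line in lines:
        if line.strip():
            return json.loads(line)
    return None


def missing_fields(record, expected):
    """List the expected field names that are absent from the record."""
    return [name for name in expected if name not in record]


# Sample line copied from the first entry of output/test.jsonl shown above.
sample_lines = [
    '{"field1": "value1_1", "field2": "value1_2", "field3": "value1_3", '
    '"field4": "value1_4", "field5": "value1_5", '
    '"_sdc_source_bucket": "dpt-jps-sandbox", '
    '"_sdc_source_file": "tmp/test.csv", "_sdc_source_lineno": 2}'
]

expected = ["field1", "field2", "field3", "field4", "field5"]
record = first_record(sample_lines)
print(missing_fields(record, expected))
```

Against the real output, pass `open("output/test.jsonl")` to `first_record` in place of `sample_lines`. An empty list means every CSV column made it through.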