# singer-taps
m
Hello all. I am struggling a bit with the tap-spreadsheets-anywhere tap. I have two use cases, one for S3 and the other with SFTP, and although I think I have the config right I am getting an error that the config is invalid. I decided to try a local file instead, thinking that would be easier, but that complains about an invalid config as well. Here are some examples of my various configs. Any help would be appreciated.

Local file:

plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    pip_url: git+https://github.com/ets/tap-spreadsheets-anywhere.git
    executable: tap-spreadsheets-anywhere
    capabilities:
    - catalog
    - discover
    - state
    config:
      tables:
        [
        {
        "name": "employee_list",
        "path": "file:///temp/upload_data",
        "pattern": "EmployeeList*.csv",
        "start_date": "2023-05-01T000000Z",
        "key_properties": [],
        "format": "csv",
        "delimiter": ",",
        "quotechar": '"',
        "encoding": "utf-8",
        "selected": true
        }]

SFTP:

plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    pip_url: git+https://github.com/ets/tap-spreadsheets-anywhere.git
    executable: tap-spreadsheets-anywhere
    capabilities:
    - catalog
    - discover
    - state
    config:
      tables:
      - path: sftp://filetransfer.mysource.com
        name: employee_list
        pattern: from_source/EmployeeList*.csv
        start_date: 2023-05-01T000000Z
        key_properties: []
        format: csv
        delimiter: ','
        quotechar: '"'
        universal_newlines: false
        sample_rate: 10
        max_sampling_read: 2000
        max_sampled_files: 3
        prefer_number_vs_integer: true
        selected: true

S3:

plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    pip_url: git+https://github.com/ets/tap-spreadsheets-anywhere.git
    executable: tap-spreadsheets-anywhere
    capabilities:
    - catalog
    - discover
    - state
    config:
      tables:
      - path: s3://mybucket
        name: dim_center
        pattern: myfolder/DimCenter/*.csv
        start_date:
        key_properties: []
        format: csv
        delimiter: ','
        quotechar: '"'
        universal_newlines: false
r
For your local file example, you've provided the config as JSON instead of YAML. Is that intentional?
m
I found examples both ways, so I have tried both to see if it made a difference. So far, it doesn't seem to matter which way I do it.
e
YAML is a superset of JSON, so I believe the JSON example is OK. I tried it out with a local file and it works:
plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    pip_url: git+https://github.com/ets/tap-spreadsheets-anywhere.git
    executable: tap-spreadsheets-anywhere
    capabilities:
    - catalog
    - discover
    - state
    config:
      tables:
        [
        {
        "name": "customers",
        "path": "file:///Users/edgarramirez/meltano/triage-tsa/data/",
        "pattern": "customers.csv",
        "start_date": "2021-05-01T00:00:00Z",
        "key_properties": [],
        "format": "csv",
        "delimiter": ",",
        "quotechar": '"',
        "encoding": "utf-8",
        "selected": true
        }]
r
I tested your configurations for myself and this is what I found:
Local - looks fine (as @edgar_ramirez_mondragon has shown).
SFTP - you need to quote your start date value (i.e. `'2023-05-01T00:00:00Z'`).
`meltano config tap-spreadsheets-anywhere list`
`Object of type TimeStamp is not JSON serializable`
S3 - you need a value for `start_date`.
`meltano config tap-spreadsheets-anywhere test`
`CRITICAL expected str for dictionary value @ data['tables'][0]['start_date']`
In my experience, `tap-spreadsheets-anywhere` has always been a bit unwieldy since you have to provide the entire config in a single setting. Hope this helps!
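Both error messages above come down to `start_date` not reaching the tap as a plain string. Here is a minimal Python sketch of that idea, using only the standard library. The helper `find_start_date_problems` and the sample dictionaries are hypothetical, written for illustration; they are not part of Meltano or the tap. The sketch also assumes that a YAML loader may turn an unquoted ISO timestamp into a datetime-like object and an empty value into `None`.

```python
import json
from datetime import datetime, timezone


def find_start_date_problems(tables):
    """Return one message per table whose start_date is not a plain string."""
    problems = []
    for index, table in enumerate(tables):
        value = table.get("start_date")
        if value is None:
            problems.append(f"tables[{index}]: start_date is empty; give it a value")
        elif not isinstance(value, str):
            problems.append(
                f"tables[{index}]: start_date is {type(value).__name__}, not str; quote it"
            )
    return problems


# Hand-built stand-ins for what a YAML loader might produce (an assumption,
# not output captured from Meltano): a quoted date stays a str, an unquoted
# ISO timestamp can become a datetime-like object, an empty value is None.
tables = [
    {"name": "quoted", "start_date": "2023-05-01T00:00:00Z"},
    {"name": "unquoted", "start_date": datetime(2023, 5, 1, tzinfo=timezone.utc)},
    {"name": "empty", "start_date": None},
]

for message in find_start_date_problems(tables):
    print(message)

# The standard json module rejects datetime objects with a TypeError, the
# same kind of failure as the "not JSON serializable" message.
try:
    json.dumps(tables[1])
except TypeError as error:
    print(f"json.dumps failed: {error}")
```

Only the first sample entry passes the check, which matches the advice to quote the SFTP date and to fill in the S3 one.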
m
This is very helpful! I still get an error that my config is invalid, but I also get some data back now, which is progress. So the test file is found and read. I will try the S3 and SFTP setting hints as well.
Thank you!