# troubleshooting
Hi, I am using `tap-postgres` and `target-csv`. I noticed that empty tables (i.e. tables with no records) have no corresponding .csv files; all other tables seem to be loaded correctly. Meltano.yml:
...
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
    config:
      dbname: northwind
      schema: public
      database: northwind
      host: localhost
      port: 5432
      user: northwind_user
    select:
    - public-*.*
  - name: target-csv
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/target-csv.git
    config:
      delimiter: ;
      destination_path: data/postgres/
Also, running `meltano invoke tap-postgres > tap.out`, I get:
...
{"type":"SCHEMA","stream":"public-categories","schema":{"properties":{"category_id":{"type":["integer"]},"category_name":{"type":["string","null"]},"description":{"type":["string","null"]},"picture":{"type":["string","null"]}},"type":"object","required":["category_id"]},"key_properties":["category_id"]}
{"type":"RECORD","stream":"public-categories","record":{"category_id":1,"category_name":"Beverages","description":"Soft drinks, coffees, teas, beers, and ales","picture":""},"time_extracted":"2024-06-15T04:08:41.871671+00:00"}
...
{"type":"SCHEMA","stream":"public-customer_customer_demo","schema":{"properties":{"customer_id":{"type":["string"]},"customer_type_id":{"type":["string"]}},"type":"object","required":["customer_id","customer_type_id"]},"key_properties":["customer_id","customer_type_id"]}
...
# another stream
Notice the lack of `"type":"RECORD"` messages for the stream `public-customer_customer_demo`. Because of this, I believe `target-csv` is to blame. Is this the default behavior? Can I force `target-csv` to generate empty .csv files for the empty tables?
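To confirm which streams are affected, the captured tap output can be scanned for streams that emit a SCHEMA message but no RECORD messages. The following is a minimal sketch, assuming `tap.out` holds one Singer JSON message per line (possibly mixed with non-JSON log lines); the function name `streams_without_records` is made up for illustration and is not part of Meltano or the Singer SDK.

```python
import json
from collections import Counter


def streams_without_records(lines):
    """Return streams that emitted a SCHEMA message but no RECORD messages."""
    schema_streams = []        # streams in the order their SCHEMA first appears
    record_counts = Counter()  # number of RECORD messages seen per stream

    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and anything that is not a JSON object
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that only look like JSON

        stream = message.get("stream")
        if message.get("type") == "SCHEMA":
            if stream not in schema_streams:
                schema_streams.append(stream)
        elif message.get("type") == "RECORD":
            record_counts[stream] += 1

    return [s for s in schema_streams if record_counts[s] == 0]
```

Calling it as `streams_without_records(open("tap.out"))` should list `public-customer_customer_demo` if that stream really has a schema but no rows in the tap output.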
Hi @Pedro Silva dos Santos! I don't think that's possible with the way target-csv currently works. The logic that writes to the file and, more importantly here, creates it if it doesn't exist lives in the `process_batch` method: https://github.com/MeltanoLabs/target-csv/blob/main/target_csv%2Fsinks.py#L103-L108. My guess is that it's never called for empty streams, because the method is only called when a "batch" is full, and that never happens for the stream in question. One fix I can think of is to add an optional early file-creation step controlled by a new setting (e.g. `create_files_for_empty_streams`), maybe within the Sink class's `__init__` method. By all means, feel free to log an issue in the repo and even submit a PR if you would like to contribute.
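The early file-creation idea could look roughly like the sketch below. It is a standalone illustration, not the actual target-csv code: the class `EmptyStreamCsvSink`, its constructor arguments, and the `<stream_name>.csv` file naming are assumptions, and `create_files_for_empty_streams` is the hypothetical setting suggested above, which does not exist in target-csv today.

```python
import csv
from pathlib import Path


class EmptyStreamCsvSink:
    """Hypothetical stand-in for a CSV sink; not the real target-csv class."""

    def __init__(self, stream_name, schema, config):
        self.stream_name = stream_name
        # Column names come from the SCHEMA message's "properties" keys.
        self.fieldnames = list(schema.get("properties", {}))
        self.delimiter = config.get("delimiter", ",")
        # Assumed naming scheme: one "<stream_name>.csv" per stream.
        self.path = Path(config["destination_path"]) / f"{stream_name}.csv"

        # Hypothetical opt-in setting; defaults to the current behavior (off).
        if config.get("create_files_for_empty_streams", False):
            self._create_file_with_header()

    def _create_file_with_header(self):
        """Create a header-only CSV unless the file already exists."""
        if self.path.exists():
            return
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(
                f, fieldnames=self.fieldnames, delimiter=self.delimiter
            )
            writer.writeheader()
```

Because the file is created when the sink is initialized (which happens once the SCHEMA message arrives), a stream with no RECORD messages would still end up with a header-only file. Keeping the setting off by default would leave existing pipelines unchanged.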