Hi I am using `tap postgres` and `target csv` I noticed empt Meltano #troubleshooting

Pedro Silva dos Santos

06/15/2024, 4:37 AM

Hi, I am using `tap-postgres` and `target-csv`. I noticed that empty tables (i.e. tables with no records) have no corresponding .csv files; all other tables seem to be loaded correctly. meltano.yml:

...
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
    config:
      dbname: northwind
      schema: public
      database: northwind
      host: localhost
      port: 5432
      user: northwind_user
    select:
    - public-*.*
  - name: target-csv
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/target-csv.git
    config:
      delimiter: ;
      destination_path: data/postgres/

Also, when I run `meltano invoke tap-postgres > tap.out`, I get:

...
{"type":"SCHEMA","stream":"public-categories","schema":{"properties":{"category_id":{"type":["integer"]},"category_name":{"type":["string","null"]},"description":{"type":["string","null"]},"picture":{"type":["string","null"]}},"type":"object","required":["category_id"]},"key_properties":["category_id"]}
{"type":"RECORD","stream":"public-categories","record":{"category_id":1,"category_name":"Beverages","description":"Soft drinks, coffees, teas, beers, and ales","picture":""},"time_extracted":"2024-06-15T04:08:41.871671+00:00"}
...
{"type":"SCHEMA","stream":"public-customer_customer_demo","schema":{"properties":{"customer_id":{"type":["string"]},"customer_type_id":{"type":["string"]}},"type":"object","required":["customer_id","customer_type_id"]},"key_properties":["customer_id","customer_type_id"]}
...
# another stream

Notice the lack of `"type":"RECORD"` messages for the stream `public-customer_customer_demo`: the tap still emits its SCHEMA message, so the target does know about the stream. Because of this, I believe `target-csv` is to blame. Is this the default behavior? Can I force target-csv to generate empty .csv files for the empty tables?
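The check described above (a SCHEMA message with no RECORD messages for the same stream) can be automated against the captured tap output. The following is a minimal sketch that relies only on the message shapes shown in the excerpt; the function name `streams_without_records` is made up for illustration and is not part of Meltano or the Singer SDK.

```python
import json
from collections import Counter


def streams_without_records(lines):
    """Return streams that have a SCHEMA message but no RECORD message.

    `lines` is any iterable of strings, e.g. an open file handle on tap.out.
    Stream names are returned in the order their SCHEMA was first seen.
    """
    seen_schemas = []
    record_counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON log output
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that only look like JSON
        stream = message.get("stream")
        if message.get("type") == "SCHEMA" and stream not in seen_schemas:
            seen_schemas.append(stream)
        elif message.get("type") == "RECORD":
            record_counts[stream] += 1
    return [s for s in seen_schemas if record_counts[s] == 0]


# Example usage (assumes tap.out was produced as shown above):
#     with open("tap.out", encoding="utf-8") as handle:
#         print(streams_without_records(handle))
```

Any stream this reports is one for which the target received a schema but no rows, which matches the tables that end up without a .csv file.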

Edgar Ramírez (Arch.dev)

06/15/2024, 7:22 AM

Hi @Pedro Silva dos Santos! I don't think that's possible with the way target-csv currently works. The logic to write to the file, and more importantly here to create it if it doesn't exist, lives in `process_batch`: https://github.com/MeltanoLabs/target-csv/blob/main/target_csv%2Fsinks.py#L103-L108. My guess is that it's never called for empty streams, because the method is only called when a "batch" is full, and that never happens for the stream in question. One fix I can think of is to add an optional early file creation step controlled by a new setting (e.g. `create_files_for_empty_streams`), maybe within the Sink class's `__init__` method. By all means feel free to log an issue in the repo, and even submit a PR if you would like to contribute.
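To make the suggested fix concrete, here is a rough, self-contained sketch of the idea: write a header-only file when the sink is constructed, if a setting asks for it. The class `HeaderFirstCSVSink` is a hypothetical stand-in written with the standard library only. It is not the actual target-csv sink or the Singer SDK API, and `create_files_for_empty_streams` is the proposed setting name from the message above, not an existing option.

```python
import csv
from pathlib import Path


class HeaderFirstCSVSink:
    """Hypothetical stand-in for a CSV sink (NOT the real target-csv class)."""

    def __init__(self, stream_name, schema, config):
        self.fieldnames = list(schema["properties"].keys())
        self.delimiter = config.get("delimiter", ",")
        self.path = Path(config["destination_path"]) / f"{stream_name}.csv"
        # Proposed (not yet existing) setting: create the file up front so
        # that streams with zero records still produce a header-only CSV.
        if config.get("create_files_for_empty_streams", False):
            self._write_header_if_missing()

    def _write_header_if_missing(self):
        if self.path.exists():
            return
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("w", newline="", encoding="utf-8") as handle:
            csv.DictWriter(
                handle, fieldnames=self.fieldnames, delimiter=self.delimiter
            ).writeheader()

    def process_batch(self, records):
        # Without the setting, the file is only created here, on the first
        # batch, so an empty stream never gets one.
        self._write_header_if_missing()
        with self.path.open("a", newline="", encoding="utf-8") as handle:
            csv.DictWriter(
                handle, fieldnames=self.fieldnames, delimiter=self.delimiter
            ).writerows(records)
```

Keeping the behavior behind an opt-in setting means existing pipelines see no change unless they enable it. A real PR would need to adapt this to the sink's actual constructor and file-naming logic.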
