silverbullet1
09/18/2023, 6:09 PM
target-postgres (meltanolabs variant)
1. Is the batch size fixed? How do I find out what it is, and how do I change it? I am noticing a different number in my logs every time:
Target sink for foo is full. Draining..
METRIC: {"type": "counter", "metric": "record_count", "value": 28711,..}
Target sink for foo is full. Draining..
METRIC: {"type": "counter", "metric": "record_count", "value": 40006,..}
2. When I do a full refresh, my pod (running the pipeline) crashes due to OOM. If we are batching, that shouldn't happen, right? Or am I missing something?
3. Is the batch size decided by the tap, the target, or both?
4. What strategy is used to write state info? I am noticing that the job writes state very infrequently (once an hour). More details in the 2nd comment.

silverbullet1
09/19/2023, 5:15 AM
MAX_SIZE_DEFAULT = 10000 is mentioned here: https://github.com/meltano/sdk/issues/1626
But I am seeing 28k, 40k, etc. in the logs above. Am I checking the wrong thing?
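
For reference: in the Singer SDK the sink's drain threshold comes from the Sink.max_size property, which defaults to the MAX_SIZE_DEFAULT class constant (10000), so a custom target sink could change it by overriding that property. The sketch below only illustrates that extension point; the subclass name is hypothetical and is not the actual meltanolabs target-postgres sink. (The 28k/40k values above may be the tap's record_count counter metric, which is emitted on its own logging schedule, rather than the target's drain size.)

```python
# Hedged sketch of where the "sink is full" threshold comes from in the Singer SDK.
# Assumption: Sink.max_size returns the MAX_SIZE_DEFAULT class constant (10000).
from singer_sdk.sinks import SQLSink


class MyPostgresSink(SQLSink):
    """Hypothetical sink that drains every 50k records instead of the default 10k."""

    @property
    def max_size(self) -> int:
        # Records buffered per stream before "Target sink for ... is full. Draining.." fires.
        return 50_000
```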

silverbullet1
09/19/2023, 5:15 AM
[2023-09-19, 05:44:09 UTC]
INFO | tap-redshift | Beginning incremental sync of 'foo-bar'.
but the incremental state got updated almost an hour later:
[2023-09-19, 06:31:51 UTC] Incremental state has been updated at 2023-09-19 06:31:51.323767.
So I would lose almost an hour's work if something went wrong?

silverbullet1
09/20/2023, 5:52 AM
also looks like we are not updating state after writing each batch record

user
10/02/2023, 2:50 PM
@silverbullet1 this might be related to https://sdk.meltano.com/en/latest/classes/singer_sdk.Stream.html#singer_sdk.Stream.is_sorted in the tap, though I'm not positive. If the tap doesn't send sorted data, the SDK can't know that all the data up to the new state timestamp has arrived until the run is completed. If the records are sorted, it can bookmark state more frequently, saving progress so that after a failure the run can resume where it left off instead of starting over.
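
A minimal sketch of the is_sorted override being suggested, assuming a Singer SDK Stream subclass; the stream class, schema, and replication key below are hypothetical and not taken from tap-redshift. Declaring a stream sorted is only safe if the underlying query really orders rows by the replication key.

```python
# Hedged sketch: marking a tap stream as sorted so the SDK can emit intermediate
# state bookmarks and resume after a failure. Names here are hypothetical.
from __future__ import annotations

import typing as t

from singer_sdk import Stream
from singer_sdk import typing as th  # JSON Schema typing helpers


class FooBarStream(Stream):
    """Hypothetical incremental stream (tap-redshift's real class may differ)."""

    name = "foo-bar"
    replication_key = "updated_at"  # assumed incremental bookmark column
    schema = th.PropertiesList(
        th.Property("id", th.IntegerType),
        th.Property("updated_at", th.DateTimeType),
    ).to_dict()

    @property
    def is_sorted(self) -> bool:
        # Promise the SDK that records arrive ordered by `updated_at`, so it can
        # advance the bookmark as records stream through instead of only at the end.
        return True

    def get_records(self, context: dict | None) -> t.Iterable[dict]:
        # A real implementation must yield rows ordered by updated_at for the
        # is_sorted promise to hold; otherwise records could be skipped on resume.
        yield from []
```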