# troubleshooting
Hi all! A few questions regarding `target-postgres` (meltanolabs variant):
1. Is the batch size fixed? How do I find out what it is, and how do I change it (see the sketch after this list for my guess)? I am noticing a different number in my logs every time:
```
Target sink for foo is full. Draining..
METRIC: {"type": "counter", "metric": "record_count", "value": 28711,..}
Target sink for foo is full. Draining..
METRIC: {"type": "counter", "metric": "record_count", "value": 40006,..}
```
2. When I do a full refresh, I notice that my pod (running the pipeline) crashes due to OOM. If we are batching, that shouldn't happen, right? Or am I missing something?
3. Is the batch size decided by the tap, the target, or both?
4. What strategy is used to write state? I am noticing that the job writes state very infrequently (about once an hour). More details in the 2nd comment.
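My guess for changing it would be to override the sink's `max_size` in a custom target class, but I'm not sure that's the intended way. A rough sketch, assuming the MeltanoLabs repo layout (`target_postgres.sinks.PostgresSink` / `target_postgres.target.TargetPostgres` import paths are my assumption):
```python
# Sketch only: shrink the per-sink batch to 5,000 records by overriding
# max_size, which the SDK's is_full check compares against before draining.
# Import paths are assumptions, not verified against the repo.
from target_postgres.sinks import PostgresSink
from target_postgres.target import TargetPostgres


class SmallBatchPostgresSink(PostgresSink):
    """PostgresSink that drains after 5,000 records instead of the default."""

    @property
    def max_size(self) -> int:
        return 5000


class SmallBatchTargetPostgres(TargetPostgres):
    # Tell the target to build sinks from the subclass above.
    default_sink_class = SmallBatchPostgresSink
```
Is that the right knob, or is there a plain config setting for this?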
I see `MAX_SIZE_DEFAULT = 10000` mentioned here: https://github.com/meltano/sdk/issues/1626. But I am seeing 28k, 40k, etc. in the logs above, so am I checking the wrong thing?
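For my own understanding, this is the mental model I have of what "full" means, just a simplified sketch and not the SDK's actual code. If this model is right, each drain should happen at roughly 10k records, which makes me wonder whether the `record_count` metric is counting something other than a single batch:
```python
# Simplified mental model (not the SDK's actual code): records are
# buffered per stream and the sink drains once the buffer reaches
# max_size (MAX_SIZE_DEFAULT = 10000).
class ToySink:
    MAX_SIZE_DEFAULT = 10000

    def __init__(self, stream_name: str) -> None:
        self.stream_name = stream_name
        self.records: list[dict] = []

    @property
    def is_full(self) -> bool:
        return len(self.records) >= self.MAX_SIZE_DEFAULT

    def process_record(self, record: dict) -> None:
        self.records.append(record)
        if self.is_full:
            print(f"Target sink for {self.stream_name} is full. Draining..")
            self.drain()

    def drain(self) -> None:
        # In the real target this would write the buffered records to
        # Postgres; here we just reset the buffer.
        self.records.clear()
```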
It also looks like we are not updating state after writing each batch 🤔 I am noticing that my logs started at
```
[2023-09-19, 05:44:09 UTC]
INFO     | tap-redshift         | Beginning incremental sync of 'foo-bar'.
```
but incremental state only got updated almost an hour later ...
```
[2023-09-19, 06:31:51 UTC] Incremental state has been updated at 2023-09-19 06:31:51.323767.
```
So I would lose almost an hour's work if something went wrong?
@edgar_ramirez_mondragon / @pat_nadolny requesting your 2 cents here!
> also looks like we are not updating state after writing each batch record
@silverbullet1 this might be related to https://sdk.meltano.com/en/latest/classes/singer_sdk.Stream.html#singer_sdk.Stream.is_sorted in the tap, though I'm not positive. If the tap doesn't send sorted data, the SDK can't know that all of the data up to a new state timestamp has arrived until the run is completed. If the records are sorted, it can bookmark state more frequently, so after a failure you resume where you left off instead of starting over.
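In case it helps, this is roughly what I mean, assuming tap-redshift is built on the Singer SDK. A sketch only: the class and column names are made up, and `is_sorted` should only return True if the underlying query really orders by the replication key:
```python
# Sketch: declaring is_sorted on an SDK stream lets the SDK emit STATE
# bookmarks incrementally during the sync, so a failed run can resume
# from the last bookmark instead of starting over.
from singer_sdk import SQLStream


class FooBarStream(SQLStream):  # hypothetical stream class
    name = "foo-bar"
    replication_key = "updated_at"  # hypothetical replication-key column

    @property
    def is_sorted(self) -> bool:
        # Only safe if records are emitted in ascending replication-key order.
        return True
```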