I have just spent time doing a Dagster upgrade with the dagster_ext, only to find I'm having more memory issues and jobs failing on my most intensive workload 😞. It does involve opening some large CSV files. Running the job in the container directly seems fine, although the error is intermittent.
I am already setting a small batch size, which I will look to reduce further:
- name: target-postgres-small-batch
  inherit_from: target-postgres
  config:
    batch_size_rows: 5000
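For example, the reduced variant I have in mind would just shrink batch_size_rows (the value below is illustrative, I haven't settled on a number):

- name: target-postgres-small-batch
  inherit_from: target-postgres
  config:
    # illustrative value only, not yet tested
    batch_size_rows: 1000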
Are there any other quick wins I could use to limit memory usage for just this one job? I'm not too bothered about run time, but I'm maxing out memory on my 4GB container, which I can't increase without a quota request to Azure.