# troubleshooting
k
Hi all. I'm having trouble running an initial extract of a large table, because the Heroku PostgreSQL instance runs out of temporary disk space. I also run out of space locally, on a machine with 83 GB free! Is there a way to apply a limiting query to reduce the number of rows extracted? I expect it will work fine once I have an incremental `replication_key_value` that's not 6 years old.
I've tried setting the target's `batch_size_rows`, but that doesn't seem to change how the tap behaves.
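For reference, what I tried looked roughly like this (a sketch assuming a Meltano project with `target-postgres`, using Meltano's convention of mapping plugin settings to env vars; the value is just an example):
```sh
# batch_size_rows is a loader-side setting: it controls how many rows
# the target buffers per batch when loading, not how many rows the tap
# SELECTs from Postgres, which would explain why it didn't help here
export TARGET_POSTGRES_BATCH_SIZE_ROWS=20000
```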
j
were you able to work around this?
k
Hi @jose_escudero, as a stopgap I added a `max_query_rows` parameter to `pipelinewise-tap-postgres`, but it's rather too hacky to submit upstream.
With that in place I have to set the job to repeat constantly until it catches up with the backlog (usually our jobs run once a day).
Here's my fork https://github.com/ClickMechanic/pipelinewise-tap-postgres in case it's of any use.
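If anyone wants to try the fork, the setting would be wired up like any other tap config; a sketch (the env-var name assumes Meltano's `TAP_POSTGRES_<SETTING>` mapping, and `max_query_rows` only exists in my fork):
```sh
# caps each extraction query at 20000 rows; the job is then re-run
# repeatedly until the replication key catches up with the backlog
export TAP_POSTGRES_MAX_QUERY_ROWS=20000
```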
That said, it looks like upstream have implemented something similar since then! https://github.com/transferwise/pipelinewise-tap-postgres/commit/c295d7c698e1036fbd31ef22f13ab0052a7e33cc
I assume you can add something like `TAP_POSTGRES_LIMIT=20000` to your ENV.
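i.e. something along these lines (a sketch assuming a Meltano project; the job id is made up):
```sh
# Meltano should map TAP_POSTGRES_LIMIT onto the tap's new `limit` setting
export TAP_POSTGRES_LIMIT=20000
meltano elt tap-postgres target-postgres --job_id=backfill-large-table
```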
j
I hadn't noticed the `limit` param. Thanks a lot.