# troubleshooting
c
Hi all! I am currently facing a loader issue with one of our pipelines. The loader I am using (Recharge) inherits from target-redshift (datamill variant). When running, we currently get the error message...
psycopg2.OperationalError: SSL connection has been closed unexpectedly
psycopg2.InterfaceError: cursor already closed
I have tried significantly decreasing the max_batch_size, to as low as 50, but even at an extremely low value it still will not load any batches of data! If anyone could shed some light on how to resolve this issue, that would be great!
@monika_rajput @edward_ryan
n
Hey @connor_flynn I’ve been looking for a solution to this issue myself. In my case, it’s with tap-postgres and target-snowflake. Were you ever able to resolve this?
c
Hello @nick_james, unfortunately I have yet to resolve this issue. I will be sure to follow up in this thread if it is resolved!
n
Thanks Connor. It’s a real headscratcher. I have also dropped batch sizes to an impractically low number, and tried bumping the MELTANO_ELT_BUFFER_SIZE environment variable. Oddly enough, it seems to fail right at the end of the job.
almost like it gets to the end of the table but doesn’t know how to hang up on the postgres tap
e
Yes, it is driving us up a wall. Trying to determine if this is in the tap or the target.
Seems like we're overlapping with the Postgres piece, since the Redshift target is built on top of Postgres. @connor_flynn and @monika_rajput, we don't have this issue with pipelinewise, right?
n
so for me the issue was my tap-postgres closing connections. Tweaking some postgres configs on my RDS instance appears to have fixed it.
I bumped the values for max_standby_archive_delay and max_standby_streaming_delay
i believe redshift has the same params in its parameter groups
@edward_ryan @connor_flynn I hope this helps!
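a quick way to sanity-check that the new values actually took effect, assuming you can connect with psql; this is just a verification sketch, since on RDS the values themselves are changed through the DB parameter group rather than SQL
-- verification sketch: confirm the replica picked up the new delay settings
-- (on RDS, change the values in the parameter group, not via SQL)
SHOW max_standby_archive_delay;
SHOW max_standby_streaming_delay;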
e
that makes A LOT of sense
@connor_flynn this also relates to what we were discussing re: persisting connections. Thanks a lot @nick_james, we'll report back ... very helpful
c
Awesome, thanks a lot for the insight @nick_james!
n
great! i set my values to 500 seconds. i may bring them down if that causes replication issues, but so far everything seems to be humming
m
Thanks a lot @nick_james
c
Hey @nick_james! We are trying to adjust our max_standby_streaming_delay & max_standby_archive_delay. Currently looking for where we need to apply these changes. How did you go about increasing these values for your database?
n
i did it in my RDS parameter groups. how are you hosting your db?
is your postgres a replica or a leader?
c
@edward_ryan can you help answer these?
n
if it’s a replica, you might also need to extend your WAL timeout for the replica
e
Ah, got it. We are currently hosting it on an EC2 instance. @connor_flynn it must be in the config / settings for Postgres on the EC2 instance (outside of Meltano)
Thanks a lot @nick_james!
n
then yea you’ll want to update the postgresql.conf file on the box using the command line
wal_receiver_timeout is the one you’ll want to extend if you’re using a replica
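if editing postgresql.conf by hand is a pain, roughly the same change can be made from psql (just a sketch, assuming superuser access on the EC2 box; the 500s value is only an example borrowed from what i used for the standby delays):
-- run on the replica: persist the setting, reload the config, and confirm
-- (500s is an illustrative value, not a recommendation)
ALTER SYSTEM SET wal_receiver_timeout = '500s';
SELECT pg_reload_conf();
SHOW wal_receiver_timeout;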
c
Hey @nick_james it does look like we are using a "replica". However, it also looks like the settings (wal_receiver_timeout, max_standby_streaming_delay & max_standby_archive_delay) will all be "ignored on a master server". And we can confirm we are using a master server by running the following and getting the output "false":
select pg_is_in_recovery();
Do you know if you are using a "standby" or "master" server?
n
ah ok yea that makes sense if you’re doing log-based replication
i think wal_sender_timeout might be your ticket then
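same idea, but this one runs on the primary (again just a sketch, with an illustrative 500s value):
-- on the primary: check the current walsender timeout, then extend it and reload
SHOW wal_sender_timeout;
ALTER SYSTEM SET wal_sender_timeout = '500s';
SELECT pg_reload_conf();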
c
I have this same issue between Postgres and Snowflake, but I'm not the owner of the database server. Is updating these keys in the postgresql.conf the only solution you've found?
e
Hey @neil_gorman, I think this is related to the issue we're facing
@connor_lough I think so ... @connor_flynn can you confirm?
c
@edward_ryan @connor_flynn - out of curiosity, is yours also failing on an...
• incremental load
◦ cursor field: 'date'
◦ table is over ~40M records?
◦ cursor field is unordered?
c
Hey @connor_lough I believe ours is failing with:
• Incremental loads
◦ created_at (cursor field)
◦ table is <1 million records
◦ cursor field is ascending
c
And your error occurs at the 300 sec mark?
c
No, several hours
m
We are facing this error too. Would it be possible to catch the operational error and re-connect the cursor client side?
c
Hello @martin_morset! In case you are still facing this issue: we found it to be related to the Recharge API.
m
@emil_nilsson fyi