# troubleshooting
g
Hi guys, I’ve got Meltano running against a live db now, in parallel with our existing processes. The existing process does not use the binary log, so Meltano is the only one doing so. If I run a full refresh it whizzes through. If I don’t run a full refresh I’m getting this in the logs and it takes about 40 seconds to run:
tap-mysql    | time=2021-04-25 16:01:00 name=tap_mysql level=INFO message=BinLog reader (file: mysql-bin-changelog.000016, pos:98078387) has reached or exceeded end position, exiting!
a
Hi, @giles_horwitch-smith. I’ve not seen this before myself but I did a code search on that repo for the error message you pasted here. There’s a detailed blurb inline with that error message here.
# The iterator across python-mysql-replication's fetchone method should ultimately terminate
# upon receiving an EOF packet. There seem to be some cases when a MySQL server will not send
# one causing binlog replication to hang.
Does this help at all? Seems like a condition where the MySQL server is perhaps failing to send an EOF packet?
g
Thanks @aaronsteers, I appreciate you coming back to me. Any ideas on how I can sort it? Is there a way for me to reset the Meltano log and do a full refresh, to see if that will sort it?
@aaronsteers Thanks for your advice so far. I’ve done a completely fresh install and restored the Meltano config, and I’m getting the same issue. Do you have any ideas? It’s working overall, but it’s just weird to be getting the message.
a
Hi, @giles_horwitch-smith. For my info, approximately how long does this sync run for? And to confirm, you are getting that warning but no hard failures, correct? Data appears to load ok?
g
Sync takes about 60 seconds for a full refresh and 40 with the pause on the incremental. When I was running on my local machine there was no pause and the incremental would take 5 seconds or so. The data loads fine from what I can see so far.
a
I spent a little more time looking at that code, and I think I might now have a better understanding of what is happening. The binlog is basically the transaction log, and at the beginning of the replication effort, we take ‘now’ in the binlog as the maximum record. (In the SDK implementation, we’ve started calling this a “signpost”.) If no rows are inserted or updated during our replication, we’ll end exactly at the point we snapshotted, our signpost value. But if new records are inserted or updated during the replication itself, we will eventually get to a binlog entry greater than our signpost value. At this point, the code exits, since it has retrieved all the records which it had intended to retrieve.
This is my theory, at least, based on what I’m seeing in the code.
I think you are okay. But it is confusing how the log message is phrased, and not clear from that message if the situation is problematic or if it is working fine as expected.
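To make the signpost idea concrete, here is a minimal sketch in Python. It is not the tap-mysql code: `BinlogPosition`, `BinlogEvent`, and `read_until_signpost` are hypothetical names made up for illustration, and it assumes binlog file names sort correctly as strings because of their zero-padded numeric suffix. It only shows the two ways the loop can end: the event stream runs out, or we reach or pass the signpost.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True, order=True)
class BinlogPosition:
    """Hypothetical (file, pos) pair, compared file first, then offset."""
    file: str  # e.g. "mysql-bin-changelog.000016"
    pos: int   # byte offset within that file


@dataclass(frozen=True)
class BinlogEvent:
    """Hypothetical stand-in for one row change read from the binlog."""
    position: BinlogPosition
    payload: str


def read_until_signpost(
    events: Iterable[BinlogEvent], signpost: BinlogPosition
) -> Tuple[List[str], bool]:
    """Collect payloads until the stream ends or the signpost is reached.

    Returns the collected payloads and a flag telling whether we stopped
    because the signpost was reached or exceeded.
    """
    rows: List[str] = []
    for event in events:
        rows.append(event.payload)
        if event.position >= signpost:
            # Analogous to the "has reached or exceeded end position" message.
            return rows, True
    # The stream ended on its own (the EOF case in the quoted comment).
    return rows, False


# The signpost is captured once, before reading starts.
signpost = BinlogPosition("mysql-bin-changelog.000016", 300)
stream = [
    BinlogEvent(BinlogPosition("mysql-bin-changelog.000016", p), f"row@{p}")
    for p in (100, 200, 300, 400)
]
rows, reached = read_until_signpost(stream, signpost)
```

In this sketch the event at offset 400 is never read, because it arrived after the signpost was taken. It would be picked up on the next run.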
g
Could it be the other way round? If I make no changes in the app, so the data on the source is unchanged, I get the problem, i.e. the binlog has not changed since the last run. If I’ve made a change, it runs through quickly.
I agree, I think it’s running ok.
Either way, thank you very much for your help
a
You are very welcome! And yes, sounds like this might actually be functioning ok. If you want to be extra-sure, or if you want to submit feedback to them on the error, it might be worthwhile opening a ticket for the Pipelinewise folks here on their repo. In general, I find their team very responsive and helpful. 🙂
g
Cheers will do.