# plugins-general
j
I’m using TransferWise’s variant of `tap-mysql` with `LOG_BASED` replication. I’ve got about a dozen Meltano pipelines running in a staggered fashion such that there are never more than 7 `elt` pipelines running at a given time. Each pipeline accesses a distinct database on the same AWS RDS MySQL instance. The first time I ran this series of pipelines, most completed successfully (some errored for unrelated reasons). On the second and third days, most of the pipelines failed with the following:
```
meltano | Extraction failed (1): pymysql.err.InternalError: (1236, "A slave with the same server_uuid/server_id as this slave has connected to the master; the first event 'mysql-bin-changelog.418935' at 777596, the last event read from '/rdsdbdata/log/binlog/mysql-bin-changelog.418937' at 737078, the last byte read from '/rdsdbdata/log/binlog/mysql-bin-changelog.418937' at 737078.")
```
So several pipelines were running in parallel and completed successfully, but most produced some variation of the error above at various points during extraction. There are several recommendations online for how to solve this, but I’m not sure which to attempt. Some recommend changing server configuration, which may not be possible since I don’t manage these source DBs. Others suggest that the `server_id` needs to differ for each concurrent connection, which doesn’t seem to be controllable through this tap, and many of the calls seem to be driven by the state emitted by Meltano … any thoughts?
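For context on the mechanics: this tap reads the binlog through the python-mysql-replication library, and every binlog client registers a `server_id` with the server; MySQL drops the older connection when two clients share the same id, which is exactly what the 1236 error above reports. A minimal sketch of that mechanism (connection details are hypothetical placeholders, not the actual pipeline config):

```python
# Sketch of the mechanism behind the 1236 error, using the
# python-mysql-replication library that tap-mysql builds on.
# Connection details below are hypothetical placeholders.
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import WriteRowsEvent

mysql_settings = {
    "host": "my-rds-instance.example.com",  # hypothetical RDS endpoint
    "port": 3306,
    "user": "repl_user",
    "passwd": "secret",
}

# Each concurrent binlog client must register a UNIQUE server_id;
# two readers sharing one id trigger the "slave with the same
# server_uuid/server_id" disconnect seen above.
stream = BinLogStreamReader(
    connection_settings=mysql_settings,
    server_id=101,  # a distinct value per concurrent pipeline
    only_events=[WriteRowsEvent],
    blocking=False,
)

for event in stream:
    event.dump()  # print row events as they arrive

stream.close()
```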
d
Hey, I’m seeing the same set of issues. My current workaround is to set retries (I’m orchestrating with Airflow). @josh_lloyd, did you ever get a resolution?
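Roughly, the retry setup looks like this (a minimal sketch, assuming Airflow 2.x; the DAG/task names and the `meltano elt` invocation are hypothetical, and retries only paper over the disconnect rather than fix the server_id collision):

```python
# Hypothetical Airflow DAG showing the retry workaround; names are
# illustrative, not from an actual project.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 3,                        # re-run the task after a 1236 disconnect
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="meltano_mysql_elt",
    default_args=default_args,
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    BashOperator(
        task_id="elt_db1",
        bash_command="meltano elt tap-mysql target-postgres",
    )
```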
j
Never did find a way to run two or more pipelines against the same MySQL instance at the same time. In the end, I opted to run all pipelines against an instance serially to avoid this error. Not ideal, but none of the other solutions seemed plausible to me.
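Roughly what I mean by serially (a sketch with hypothetical project paths; the point is that only one binlog reader is ever connected to the instance at a time):

```python
# Hypothetical serial runner: one `meltano elt` at a time against the
# same MySQL instance, so binlog readers never overlap.
import subprocess

DATABASES = ["db_a", "db_b", "db_c"]  # hypothetical source databases

for db in DATABASES:
    subprocess.run(
        ["meltano", "elt", "tap-mysql", "target-postgres"],
        cwd=f"/projects/{db}",  # one Meltano project per source database
        check=True,             # stop the loop if a pipeline fails
    )
```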
d
Did you ever try to play around with the `session_sqls` parameter?
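(For reference, `session_sqls` is a pipelinewise-tap-mysql setting listing SQL statements the tap runs when it opens its session. A sketch of a config using it, with hypothetical connection values and illustrative statements; this is not a verified fix for the 1236 error:)

```python
# Illustrative pipelinewise-tap-mysql config dict; session_sqls
# statements run at the start of the tap's MySQL session.
# Host/user are hypothetical and the SETs are examples only.
tap_config = {
    "host": "my-rds-instance.example.com",
    "port": 3306,
    "user": "repl_user",
    "password": "secret",
    "session_sqls": [
        'SET @@session.time_zone = "+0:00"',
        "SET @@session.wait_timeout = 28800",
        "SET @@session.net_read_timeout = 3600",
        "SET @@session.innodb_lock_wait_timeout = 3600",
    ],
}
```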
Hey @josh_lloyd, I have one MySQL server that seems to handle multiple concurrent Meltano connections fine (MySQL 5.6) and another that hits this error (MySQL 5.7). (Btw, retries didn’t solve the issue.) My hunch is that it has something to do with GTID mode. Since I know parallel runs are possible, I’m going to keep exploring options because I’d like to get the second server working. Any chance you could send me whatever articles you read that suggested server changes?
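One way to compare the two servers’ GTID settings from Python (a sketch with hypothetical credentials; very old builds may not expose `@@gtid_mode` at all):

```python
# Hypothetical diagnostic: compare GTID mode and server_id between the
# 5.6 server that works and the 5.7 server that errors.
import pymysql

conn = pymysql.connect(
    host="my-rds-instance.example.com",  # hypothetical endpoint
    user="repl_user",
    password="secret",
)
with conn.cursor() as cur:
    # @@gtid_mode may be absent on some older MySQL/Aurora builds
    cur.execute("SELECT @@global.gtid_mode, @@global.server_id")
    gtid_mode, server_id = cur.fetchone()
    print(f"gtid_mode={gtid_mode}, server_id={server_id}")
conn.close()
```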
I’m trying to find the articles that suggest MySQL server changes, but I can’t locate any (not sure what I was looking at before). I did find a lot of articles suggesting that a similar error is raised when a master->slave relationship is misconfigured, but those don’t apply in my case.
d
Ok, got it. One more thing: what version of Aurora MySQL are you using? I ask because if you’re on 5.6 I might have follow-up questions about parameters, since I could more easily compare my working parallelization with your non-working one on 5.6 (versus comparing my working 5.6 with my non-working 5.7).