#announcements

gentle-minister-47752

03/01/2021, 2:20 PM
Hello everyone! We are using Meltano in a Kubernetes environment and we want to export some tables to CSV files. We are working with a simple ELT pipeline (tap-mysql -> target-csv). It works well and this makes me very happy :D. While running various tests we noticed that the tables are exported in parallel. We had never seen this behavior before, only sequential exports. Does this behavior depend on the particular environment (Kubernetes) or is it a Meltano feature? Is it possible to set a maximum number of tables that can be processed in parallel, or does it scale according to the available resources?

flat-bear-81546

03/01/2021, 3:42 PM
I'm pretty sure it's a tap-mysql feature!
👍 1

ripe-musician-59933

03/01/2021, 3:54 PM
@gentle-minister-47752 Which specific tap-mysql are you using? https://github.com/transferwise/pipelinewise-tap-mysql as documented on https://meltano.com/plugins/extractors/mysql.html? Extracting data is entirely the tap's responsibility, with Meltano only handling the execution environment of the tap, so as Derek says, any specific extraction behavior you're seeing is necessarily a feature of the tap. Meltano doesn't know how to tell taps to run sequentially or in parallel (since there's no generic way that would work for all taps), and it doesn't detect whether it's running inside Kubernetes, so the tap must be using some heuristics of its own!
👀 1
👍 1
If we know which tap it is, we can investigate some more, and possibly find a setting that controls the parallelization
🙏 1

gentle-minister-47752

03/01/2021, 4:01 PM

ripe-musician-59933

03/01/2021, 4:05 PM
All right, I'm not immediately seeing code suggesting it extracts tables in parallel. What indications were you seeing that it was?

gentle-minister-47752

03/01/2021, 4:13 PM
We noticed that when we were testing on a single server, the CSV files were created one after another,
while on Kubernetes the files are created at roughly the same time and then filled with data, apparently in parallel, which is why we asked :)
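
One way to check an observation like this more rigorously is to poll the output directory during a run and see whether more than one file grows between two consecutive polls. The following is only a rough sketch: the helper names (`snapshot`, `files_growing`, `looks_concurrent`, `watch`) are hypothetical, it assumes the target writes `*.csv` files into a single directory, and it says nothing about what the tap is doing internally. A polling interval that is too coarse can also make sequential writes look concurrent.

```python
from pathlib import Path
import time


def snapshot(directory):
    """Return {file name: size in bytes} for every CSV file in `directory`."""
    return {p.name: p.stat().st_size for p in Path(directory).glob("*.csv")}


def files_growing(previous, current):
    """Return the sorted names of files that are new or larger than in `previous`."""
    return sorted(
        name for name, size in current.items()
        if size > previous.get(name, -1)  # -1 so a new, still-empty file counts
    )


def looks_concurrent(snapshots):
    """True if any pair of consecutive snapshots shows more than one file growing."""
    return any(
        len(files_growing(prev, cur)) > 1
        for prev, cur in zip(snapshots, snapshots[1:])
    )


def watch(directory, interval=1.0, rounds=30):
    """Collect `rounds` + 1 snapshots of `directory`, `interval` seconds apart."""
    snapshots = [snapshot(directory)]
    for _ in range(rounds):
        time.sleep(interval)
        snapshots.append(snapshot(directory))
    return snapshots
```

Running `looks_concurrent(watch("output"))` while the pipeline is active (with `"output"` standing in for wherever the CSV files land) on both the single server and the Kubernetes pod would show whether the two environments really differ, or whether the files only appear to fill in parallel.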

ripe-musician-59933

03/01/2021, 4:17 PM
Interesting! The answer has to be somewhere in https://github.com/singer-io/tap-mysql but I'm not seeing it yet. Perhaps it's behaving differently with different sync strategies (https://github.com/singer-io/tap-mysql/tree/master/tap_mysql/sync_strategies), and you were using one locally and another in prod?

gentle-minister-47752

03/01/2021, 4:24 PM
We are using the full table sync strategy both locally and in prod. We will look deeper into it 😉 Anyway, thank you both for your replies.
👍 1