It seems Meltano replicate tables one by one, is ...
# best-practices
l
It seems Meltano replicates tables one by one. Is it possible to process multiple tables in parallel?
a
I was just digging into this
So Meltano runs Singer taps and targets. A single tap uses one fd (stdout) piped to a single target's fd (stdin). So even if a tap is running with 20 tables selected, they all go through the same fd. Singer supports getting messages in any order, so long as a SCHEMA message for a stream precedes that stream's records. If you think of BFS (breadth-first search), that corresponds to a tap streaming all of its streams "simultaneously", though really it is just alternating messages between streams. DFS (depth-first search), to follow the comparison, is where a tap streams a single stream to completion and then moves on to the next. At the end of the day it's all going through the same proverbial pipe, so it doesn't really matter.
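To make the BFS/DFS comparison concrete, here is a rough sketch of the two message orderings. The stream names, rows, and helper functions are made up for illustration; a real tap would read from an actual source. Either way, everything ends up as one sequence of messages on a single stdout.

```python
import json
from itertools import zip_longest

# Hypothetical in-memory data standing in for two selected tables.
STREAMS = {
    "users": [{"id": 1}, {"id": 2}],
    "orders": [{"id": 10}, {"id": 11}],
}


def schema_msg(stream):
    # Minimal Singer SCHEMA message for the illustrative streams above.
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": {"properties": {"id": {"type": "integer"}}},
        "key_properties": ["id"],
    }


def record_msg(stream, record):
    return {"type": "RECORD", "stream": stream, "record": record}


def depth_first(streams):
    """Emit one stream to completion, then move on to the next."""
    for name, rows in streams.items():
        yield schema_msg(name)
        for row in rows:
            yield record_msg(name, row)


def breadth_first(streams):
    """Emit every schema first, then alternate records across streams."""
    for name in streams:
        yield schema_msg(name)
    for row_group in zip_longest(*streams.values()):
        for name, row in zip(streams, row_group):
            if row is not None:
                yield record_msg(name, row)


def order(messages):
    """Reduce messages to (type, stream) pairs to compare orderings."""
    return [(m["type"], m["stream"]) for m in messages]


if __name__ == "__main__":
    # Both orderings are written to the same single pipe, one line per message.
    for message in breadth_first(STREAMS):
        print(json.dumps(message))
```

Both generators produce the same set of messages, and in both a stream's SCHEMA comes before its records. Only the interleaving differs, and the target still reads them one at a time.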
True parallelism (single or multi node) comes from multiple processes. That means taking a single tap-xxx target-yyy pair, partitioning the selected streams, and then executing the partitions simultaneously, if supported by the target/destination.
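A minimal sketch of the partition-and-run-in-parallel idea, under some assumptions: `partition` and `run_all` are hypothetical helpers, and the plugin names `tap-xxx--part0` / `tap-xxx--part1` assume you have defined one tap variant per partition yourself, each with its own stream selection. None of this is built-in Meltano behavior.

```python
import subprocess
import sys


def partition(streams, n):
    """Split stream names round-robin into at most n non-empty groups."""
    groups = [streams[i::n] for i in range(n)]
    return [group for group in groups if group]


def run_all(commands):
    """Start one OS process per command, wait for all, return exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    selected = ["users", "orders", "invoices", "events"]
    for i, group in enumerate(partition(selected, 2)):
        print(f"partition {i} would select: {group}")

    # Assumption: tap-xxx--part0 and tap-xxx--part1 exist in your project,
    # each selecting only its own group of streams. Uncomment to launch
    # one pipeline process per partition:
    # commands = [["meltano", "run", f"tap-xxx--part{i}", "target-yyy"]
    #             for i in range(2)]
    # print(run_all(commands))

    # Harmless stand-in so the sketch runs without Meltano installed.
    print(run_all([[sys.executable, "-c", "pass"]] * 2))
```

Each partition gets its own process and its own stdout-to-stdin pipe, which is where the parallelism comes from. Whether the destination handles concurrent loads well still depends on the target.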
a
The other solution might be BATCH records (where the tap sends a filename pointing to a file with a lot of rows from one stream). In theory it would then be possible to fetch several streams simultaneously on the tap side and upload them simultaneously on the target side.
BATCH records are in the works in singer_sdk now if I'm not mistaken.
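Here is a toy sketch of how that could look, purely to illustrate the idea. The message shape (`type`, `stream`, `manifest`) and every function name here are assumptions for illustration, not the singer_sdk spec. The tap side writes each stream to its own file concurrently and emits one small message per file; the target side loads the files concurrently.

```python
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical source data standing in for two tables.
SOURCE = {"users": [{"id": 1}, {"id": 2}], "orders": [{"id": 10}]}


def export_stream(name, rows, out_dir):
    """Tap side: dump one stream to a JSONL file and describe it in a message."""
    path = os.path.join(out_dir, f"{name}.jsonl")
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    # Illustrative message shape only; the real format may differ.
    return {"type": "BATCH", "stream": name, "manifest": [path]}


def load_batch(message):
    """Target side: 'load' a batch by counting the rows in its files."""
    count = 0
    for path in message["manifest"]:
        with open(path, encoding="utf-8") as f:
            count += sum(1 for _ in f)
    return message["stream"], count


def tap_side(source, out_dir):
    """Export all streams concurrently; messages come back in stream order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(export_stream, n, r, out_dir)
                   for n, r in source.items()]
        return [future.result() for future in futures]


def target_side(messages):
    """Load all batches concurrently and report rows loaded per stream."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(load_batch, messages))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        messages = tap_side(SOURCE, tmp)
        print(target_side(messages))
```

The pipe between tap and target then only carries a handful of small messages, while the heavy row data moves through files that both sides can handle in parallel.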