# plugins-general
s
Hi! I’m taking a look at Meltano as a way to get data into and out of our platform, and I was wondering if people would recommend it for ingesting fast-moving data, e.g. from queues like MQTT. If so, could anyone point me at an example, please? I can see the Singer GitHub page talks about moving data between just about anything, including queues, but I cannot find an example of a Singer tap that works with queues.
t
I’m not aware of any examples of this use case. In theory it should be possible, but I suspect this is a less common use case than the batch movement of data.
s
Thank you, @taylor. Would you say that this is more commonly done with something like Logstash, Vector, or Telegraf?
t
I’m not familiar with those tools unfortunately. @aaronsteers do you have more experience with these?
i
Hey Steven, the tools you mentioned are most commonly used to collect and parse logs and metrics. In most cases, these logs are stored in Elasticsearch or something similar and visualised with, say, Grafana. I do have experience with collecting MQTT data, though. In my case, I collected the messages with a thin service that batched them into S3. From that point it was quite easy to read the messages with Meltano. I hope that helps.
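A minimal sketch of the "thin service" batching idea described above, not the actual service. Everything here is an assumption for illustration: the `MessageBatcher` class, the `write_batch` callable (a stand-in for an S3 upload), the `mqtt/batch-…` key layout, and the size and age thresholds are all hypothetical.

```python
import json
import time
from typing import Callable, List, Optional


class MessageBatcher:
    """Buffer incoming messages and flush them as one JSON Lines payload."""

    def __init__(
        self,
        write_batch: Callable[[str, str], None],  # hypothetical sink, e.g. an S3 upload
        max_messages: int = 500,
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._write_batch = write_batch
        self._max_messages = max_messages
        self._max_age = max_age_seconds
        self._clock = clock
        self._buffer: List[dict] = []
        self._opened_at: Optional[float] = None
        self._sequence = 0

    def add(self, topic: str, payload: str) -> None:
        """Buffer one message; flush when the batch is full or too old."""
        now = self._clock()
        if self._opened_at is None:
            self._opened_at = now
        self._buffer.append({"topic": topic, "payload": payload, "received_at": now})
        too_many = len(self._buffer) >= self._max_messages
        too_old = now - self._opened_at >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Write the buffered messages as one object and reset the buffer."""
        if not self._buffer:
            return
        self._sequence += 1
        key = f"mqtt/batch-{self._sequence:06d}.jsonl"  # assumed key layout
        body = "\n".join(json.dumps(m) for m in self._buffer) + "\n"
        self._write_batch(key, body)
        self._buffer = []
        self._opened_at = None


# Usage with an in-memory dict standing in for the bucket.
bucket = {}
batcher = MessageBatcher(lambda key, body: bucket.__setitem__(key, body), max_messages=2)
batcher.add("sensors/temperature", "21.5")
batcher.add("sensors/temperature", "21.7")  # second message fills the batch
batcher.add("sensors/humidity", "40")
batcher.flush()  # flush the remainder, e.g. on shutdown
```

In a real deployment the `add` method would be called from the MQTT client's message callback, and a file-based tap could then read the resulting JSON Lines objects on a schedule.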
d
I was about to say something similar. The challenge seems to be identifying and standardizing the schema of MQTT messages (maybe on a per-topic basis?). Once those standards are determined, an approach like the one @ivanovyordan described could be used to batch data from each topic to an intermediate destination that you can pull from with a source tap.
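One way to picture per-topic schema standardization is a small registry that maps each topic to the fields it must carry. This is only a sketch under assumptions: the topic names, field names, and the `parse_message` helper are hypothetical, and it assumes payloads are JSON objects.

```python
import json
from typing import Any, Dict

# Hypothetical registry: topic -> required fields and their expected types.
TOPIC_SCHEMAS: Dict[str, Dict[str, type]] = {
    "sensors/temperature": {"device_id": str, "celsius": float},
    "sensors/door": {"device_id": str, "open": bool},
}


def parse_message(topic: str, raw: str) -> Dict[str, Any]:
    """Validate a JSON payload against its topic's schema and return a record."""
    schema = TOPIC_SCHEMAS.get(topic)
    if schema is None:
        raise ValueError(f"no schema registered for topic {topic!r}")
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError(f"payload for {topic!r} is not a JSON object")
    record: Dict[str, Any] = {}
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"missing field {field!r} for topic {topic!r}")
        value = data[field]
        # JSON has a single number type, so accept whole numbers for float fields.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {field!r} should be of type {expected.__name__}")
        record[field] = value  # fields outside the schema are dropped
    return record


record = parse_message("sensors/temperature", '{"device_id": "a1", "celsius": 21}')
```

Records normalized this way share a fixed set of fields per topic, which makes it easier to map each topic to its own stream in a downstream tap.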
a
Hi, @steven_zarka, and welcome! The Meltano team is excited to announce today the official launch of our new Meltano Hub for Singer connectors, which lists over 200 possible source systems, although we are still working with the community to curate the best of the best and make the maturity of each tap more transparent.

First, can you tell if any connectors there would match a queue system you might be considering? (Singer Taps | MeltanoHub)

Second, if the connector does not yet exist, you are definitely empowered to create your own connector using our Python-based #C01PKLU5D1R. While this is possible in theory, the challenge with queue-based systems (I’m thinking of something similar to SQS) is that the queue system generally wants to confirm receipt and processing of each message individually, which you would have to assert optimistically. Can you confirm whether this is a requirement you are looking to meet in this implementation? In theory, if you run the sync cycle in small batches, the confirmation you receive from the target, in the form of the bookmark incremented in its subsequent execution, would give you a strong positive confirmation that the records were indeed written.

Update: as @ivanovyordan notes, another option could be to create a caching layer in a custom tap that provides its own independent confirmation of write (allowing the queue to receive positive per-message confirmation). Subsequent executions could then stream directly from the queue and also from the cache, in case a bookmark is requested at an earlier token than the latest in the queue. Hope this helps!
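The small-batch idea above can be sketched as "acknowledge only after the write is confirmed". This is an illustration under assumptions, not how any particular tap or the SDK works: `InMemoryQueue` is a stand-in for a real queue client, `write_records` is a hypothetical callable that returns `True` once the target has persisted the batch, and the `bookmark` key is just an example state field.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class InMemoryQueue:
    """Stand-in for a queue that requires explicit per-message acknowledgement."""

    def __init__(self, messages: List[str]) -> None:
        self._pending: Deque[Tuple[int, str]] = deque(enumerate(messages, start=1))
        self._in_flight: Dict[int, str] = {}

    def receive(self, max_messages: int) -> List[Tuple[int, str]]:
        """Hand out up to max_messages and track them as in flight."""
        batch: List[Tuple[int, str]] = []
        while self._pending and len(batch) < max_messages:
            msg_id, body = self._pending.popleft()
            self._in_flight[msg_id] = body
            batch.append((msg_id, body))
        return batch

    def ack(self, msg_id: int) -> None:
        """Confirm processing; the message will not be delivered again."""
        self._in_flight.pop(msg_id, None)

    def requeue_unacked(self) -> None:
        """Put unacknowledged messages back at the front, in original order."""
        for msg_id in sorted(self._in_flight, reverse=True):
            self._pending.appendleft((msg_id, self._in_flight.pop(msg_id)))

    def remaining(self) -> int:
        return len(self._pending) + len(self._in_flight)


def sync_batch(
    queue: InMemoryQueue,
    write_records: Callable[[List[str]], bool],  # hypothetical target write
    state: dict,
    batch_size: int = 10,
) -> int:
    """Read one small batch and acknowledge it only after the write succeeds."""
    batch = queue.receive(batch_size)
    if not batch:
        return 0
    if write_records([body for _, body in batch]):
        for msg_id, _ in batch:
            queue.ack(msg_id)
        state["bookmark"] = batch[-1][0]  # last message id confirmed as written
        return len(batch)
    queue.requeue_unacked()  # nothing acknowledged, so the batch is retried later
    return 0


queue = InMemoryQueue(["a", "b", "c"])
state: dict = {}
sync_batch(queue, lambda records: True, state, batch_size=2)
```

The trade-off is that a failed write causes the whole batch to be redelivered, so the target needs to tolerate duplicates.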
s
Understood, thank you for your help, and congratulations on launching Meltano Hub 🚀