hi everyone, looking into meltano to replace our c...
# getting-started
c
Hi everyone, I'm looking into Meltano to replace our current ETL. It looks like a good fit because I think it will make it super easy for others in my company to contribute new sources, and to use dbt to contribute new analyses. However, I do have a very specific data source: blockchain RPCs. They are notoriously unreliable because they fail often and are rate-limited. Our current extractor handles things like:
• Pulling a list of contracts from a DB so it only gets data for those contracts, and maintaining an import state for each contract
• Multi-process importer rate limiting using Redis shared locks
• Out-of-order processing: it generates a bunch of queries and tries each one; if a query fails, the failed range is persisted to that contract's import state and retried on the next run
• Optimizing queries to keep their number as low as possible, which lowers the per-query bill on paid RPCs
• Selecting the best RPC for each type of query, and fanning load out when multiple RPCs are available for a single chain
Given what I know about Singer and the current complexity of this extractor, I think the best approach would be to use it as is as a Singer tap and plug it into the Meltano ecosystem, but it's written in TypeScript so idk how that would work.
v
Not sure about your tap exactly. Since you're new to the ecosystem, I'd recommend creating a new tap with the SDK in Python to do something simple and get used to Singer. You can use the executable parameter on taps and targets to technically run any language you want; as long as it's Singer compliant it should just work. Personally I'd implement all of this in Python, but if your team is used to TypeScript I understand keeping it. I wonder if there's a Singer library on npm that someone has written (I vaguely remember something). The main reason I'd recommend Python is that the SDK is very nice and handles lots of edge cases for you, both in Singer and in standard extractor use cases. Of the things on your list that the extractor handles, the only one I wouldn't do in the tap is importer rate limiting using Redis shared locks. I'd dive into that requirement before implementing, but if it's truly needed then the target would be the place to implement that, I believe.
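For anyone wondering what "Singer compliant" means for a non-Python executable: a tap writes newline-delimited JSON messages (SCHEMA, RECORD, STATE) to stdout. Below is a minimal sketch in TypeScript, not a full tap. The stream name `transfers`, the `TransferRow` shape and the `toSingerLines` helper are all made up for illustration, and a real tap would also need to handle the config, catalog and state arguments.

```typescript
// The three Singer message types a tap emits, one JSON object per line.
type SingerMessage =
  | { type: "SCHEMA"; stream: string; schema: object; key_properties: string[] }
  | { type: "RECORD"; stream: string; record: Record<string, unknown> }
  | { type: "STATE"; value: Record<string, unknown> };

// Hypothetical row shape produced by the existing extractor (assumption).
type TransferRow = { contract: string; block: number; tx_hash: string };

// Hypothetical stream name (assumption).
const STREAM = "transfers";

// Describe the stream with a JSON Schema before any records are sent.
function schemaMessage(): SingerMessage {
  return {
    type: "SCHEMA",
    stream: STREAM,
    schema: {
      type: "object",
      properties: {
        contract: { type: "string" },
        block: { type: "integer" },
        tx_hash: { type: "string" },
      },
    },
    key_properties: ["tx_hash"],
  };
}

// Serialize one SCHEMA message, one RECORD per row, then a STATE message.
function toSingerLines(
  rows: TransferRow[],
  state: Record<string, unknown>
): string[] {
  const messages: SingerMessage[] = [
    schemaMessage(),
    ...rows.map(
      (row): SingerMessage => ({ type: "RECORD", stream: STREAM, record: row })
    ),
    { type: "STATE", value: state },
  ];
  return messages.map((message) => JSON.stringify(message));
}
```

A tap entry point would then write each line to stdout, for example `for (const line of toSingerLines(rows, state)) console.log(line);`, and keep any logging on stderr so it does not mix with the message stream.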
c
Thx for your answer sir, very insightful. I think I'll stick to TypeScript for now, as the extractor is already implemented and only needs a quick adaptation; the protocol is simple enough to do that. I'll play with the Python SDK in the meantime to plan a proper Python rewrite if needed. I'm not sure how bookmarks and import state will work with out-of-order processing though.
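One possible way to think about the out-of-order question, sketched in TypeScript: instead of a single bookmark, the state value could keep, per contract, the block ranges that succeeded and the ranges that failed, and derive a conventional bookmark as the end of the contiguous prefix. This is only a sketch under assumptions. `ContractState`, `recordResult` and `contiguousBookmark` are hypothetical names, and nesting the result under a `bookmarks` key is a common Singer convention rather than a requirement.

```typescript
// Inclusive block range [from, to].
type BlockRange = { from: number; to: number };

// Hypothetical per-contract import state (assumption, not a Singer type).
type ContractState = {
  done: BlockRange[]; // sorted, merged ranges that were imported
  failed: BlockRange[]; // ranges to retry on the next run
};

// Merge overlapping or adjacent ranges into a sorted, minimal list.
function mergeRanges(ranges: BlockRange[]): BlockRange[] {
  const sorted = [...ranges].sort((a, b) => a.from - b.from);
  const merged: BlockRange[] = [];
  for (const range of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && range.from <= last.to + 1) {
      last.to = Math.max(last.to, range.to);
    } else {
      merged.push({ ...range });
    }
  }
  return merged;
}

// Record the outcome of one query, whatever order queries finish in.
function recordResult(
  state: ContractState,
  range: BlockRange,
  ok: boolean
): ContractState {
  if (!ok) {
    return { done: state.done, failed: mergeRanges([...state.failed, range]) };
  }
  return {
    done: mergeRanges([...state.done, range]),
    // Drop failed ranges that this successful range fully covers.
    failed: state.failed.filter(
      (f) => !(f.from >= range.from && f.to <= range.to)
    ),
  };
}

// Highest block such that everything from startBlock up to it is imported.
// Returns startBlock - 1 when nothing contiguous has been imported yet.
function contiguousBookmark(state: ContractState, startBlock: number): number {
  const first = state.done[0];
  if (first !== undefined && first.from <= startBlock && first.to >= startBlock) {
    return first.to;
  }
  return startBlock - 1;
}

// Serialize the per-contract states as a Singer STATE message line.
function stateLine(states: Record<string, ContractState>): string {
  return JSON.stringify({ type: "STATE", value: { bookmarks: states } });
}
```

On the next run the tap would read the previous state back, retry everything in `failed`, and continue from the end of `done`. Whether this fits depends on how the chosen target and the Meltano state backend treat the state value, which is worth checking before relying on it.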
v
My guess is that's probably not an easy lift.