# random
a
I just want to say: there was a time I thought Meltano and Airbyte filled a similar role, but I am floored at just how much more I like Meltano, both fundamentally and in practice. Airbyte is a buggy nightmare. Meltano so far “just works”. Furthermore, the GitLab repo meltano squared is a fantastic reference point. I am keen to know where the new adapter-specific dbt integration sits for BigQuery, so I don't get too deep before the API changes or a better alternative arrives.
t
It's excellent to know that Airbyte is still very buggy. I've been considering it, but was put off by the fact that it is still in an alpha state.
a
Yeah @todd_de_quincey, the bugs weren't edge-case bugs either, IMO. They were major issues in both the BigQuery destination and the Salesforce source, which happened to be the two I needed during the PoV, after I had already pitched an architecture diagram. One bug caused the Salesforce source to fail and tip over on an undecodable byte, even though the Salesforce documentation states the two possible encodings; they were assuming only one. It would leave trash temp tables in the database too, a lot of them, since it defaulted to retrying automatically 3 times. Mind you, an initial Salesforce sync was SLOW. 7+ hours slow. Insane. In a day, I wrote some async Python that does it in under 15 minutes. The BigQuery bug was even more ridiculous, considering an update essentially bricked the connector: a JSON schema validation related to loading to GCS staging fails. I don't actually think this one is fixed, and obviously the integration testing is lacking some surface area. Outside these bugs, the performance is bad, the overhead is high, 2.5K issues and rising on GitHub is concerning, their Slack is full of posts (maybe 75% or more) about issues/bugs instead of good conversation, these issue posts are always met with a "please post this on our discourse", and lastly, if memory consumption is too high, the worker pods literally hang indefinitely. I get it, they are alpha. I also get that it is open source. But they really position themselves as if they are ready to be thrown up in your production environment with a few minutes of setup. I wasted a week on bugs in what you would think are 2 highly trafficked connectors. Meltano, on the other hand, again, just worked.
t
Thanks for taking the time to provide a detailed overview of the issues that you encountered. Extremely useful, as I too am evaluating different options to speed up the development of my EL process. I’d like to use open source where possible, but most of the frameworks don’t feel like they are quite there yet. I’d be interested to hear how you find Meltano in production. My only concern with Meltano is the disparate state of the various Singer taps (so more of a concern with Singer than Meltano itself). Since there isn’t a centralised tap repository, the taps are in varying states. Some with tests, some without. Some “battlefield tested”, some not. Given my use case is marketing analytics, I need to pull together quite a number of different taps (Google Ads, Facebook Ads, etc.), so my concern is that the varying states and quality could become a bit of a maintenance problem for me. If Airbyte is as buggy as you have described, then I have to 100% agree that they sound like they are misrepresenting the current state of the app. Their marketing is very slick, and I too thought that they were a lot more production ready than it sounds like they really are. They are very well funded, so I would suspect that in the coming year or two their product will mature quite a bit, and no doubt it will become a very strong contender in the EL space.
t
@alexander_butler thanks for the kind words! I’d love to chat more about this if you have time (will DM separately). As for the adapter specific work, https://gitlab.com/meltano/meltano/-/merge_requests/2642 is the MR where it’s being worked on by @cody_hanson. It’s in the review stage and should be merged this week 🙂
t
@alexander_butler also interesting that you say that Salesforce was slow. This is one of the (very few) source connectors that is in the “Generally Available” state