# plugins-general
y
Hello Team, currently we use Fivetran and would like to see if Meltano could replace it. Fivetran worked really well for replicating Salesforce into Snowflake. In Fivetran, table and column names are roughly converted to snake case. If we decide to use Meltano, I wonder if there is a way to configure target-snowflake to convert names to snake case, or please let me know if there is a better way to handle this. Otherwise, migrating our existing queries will be a difficult task. Many thanks. Below are a few examples.
| SOURCE NAME | NAME CONVERSION |
| --- | --- |
| My_Name | my_name |
| My Name | my_name |
| MyName | my_name |
| mYName | m_yname |
| MyName123 | my_name_123 |
| My123name | my_123_name |
| My_Name! | my_name_ |
| My_Name____c | my_name_c |
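For reference, the conversion rules implied by the examples above can be approximated in a few regex passes. This is a sketch reverse-engineered from the sample table, not Fivetran's actual implementation:

```python
import re

def fivetran_snake_case(name: str) -> str:
    """Approximate Fivetran-style snake_case conversion (sketch only)."""
    # Replace any non-alphanumeric character with an underscore.
    s = re.sub(r"[^A-Za-z0-9]", "_", name)
    # Underscore before an uppercase letter that follows a lowercase one
    # (note: "mYName" splits only at m|YName, giving "m_yname").
    s = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", s)
    # Underscore at letter/digit boundaries in both directions.
    s = re.sub(r"(?<=[A-Za-z])(?=[0-9])", "_", s)
    s = re.sub(r"(?<=[0-9])(?=[A-Za-z])", "_", s)
    # Collapse runs of underscores (trailing "_" is kept, per "My_Name!").
    return re.sub(r"_+", "_", s).lower()
```

Each input in the table above maps to the expected output with these rules, e.g. `fivetran_snake_case("MyName123")` yields `my_name_123`.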
t
It depends on the variant of
target-snowflake
that you use. I don't see a setting for that feature in https://github.com/transferwise/pipelinewise-target-snowflake, but I will say this is a good use case for dbt. If you're able to build thin source models on top of your raw data, you can build all of your downstream models off of that source layer; then, if you ever migrate EL tools, you'd only have a single place to update any mappings like that. I did that at GitLab and it worked quite well as we migrated from Stitch to Fivetran to Meltano for different data sources: https://gitlab.com/gitlab-data/analytics/-/blob/master/transform/snowflake-dbt/models/sources/sfdc/sfdc_opportunity_source.sql
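The "thin source model" pattern described above can be sketched as a dbt model that renames raw columns once, so downstream models never depend on the EL tool's naming. The source and column names here are hypothetical, not taken from the linked GitLab model:

```sql
-- Thin source model sketch (hypothetical names): rename once here,
-- and all downstream models reference these stable snake_case names.
with source as (
    select * from {{ source('salesforce', 'opportunity') }}
)

select
    "Id"        as opportunity_id,
    "AccountId" as account_id,
    "CloseDate" as close_date,
    "IsWon"     as is_won
from source
```

If the EL tool changes and the raw column names shift, only this one model needs updating.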
a
It could be a valid case for a stream mapper. I'll investigate 😉
y
Thank you @taylor and @alexander_butler. I think a stream mapper will be the better option. The dbt approach may pose challenges since we would have to keep running it after migration. We have a large number of BI reports built on Snowflake's data model, so we must keep the existing table and column names exactly as Fivetran created them. Using fal works, but it just adds an unnecessary layer of complexity. It would be great if there were a stream mapper with regex-replace functionality; that would have a big impact on Fivetran users who want to make the switch. @taylor Please correct me if I misunderstood your suggestion.
t
@yijie_wang if you're using Snowflake then dbt is a good solution for managing transformations in the warehouse, but if you're not currently using it then it would be a bigger lift to solve this immediate problem.
a
Agreed dbt could be used after loading, but if you want to control the target table naming, stream maps are a good option. This actually seems to be a very common use case, even if it is not robustly handled as of today. If anyone wanted to provide this as a PR on Meltano's mapper plugin, one way this could be made generic would be to accept a config like `stream_name_transform_rules`, which might be a list of regex search/replace patterns... 🤔
Another common request we get is for targets to honor/ignore/prefix or override the source schema names, and this could be a generic and portable fix for those use cases as well.
In case folks want to discuss in an issue, I've logged this as a stub: https://github.com/MeltanoLabs/meltano-map-transform/issues/26
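To make the proposal above concrete, a hypothetical config for such a setting might look like the following. This is a sketch only: `stream_name_transform_rules` is the proposed (not implemented) setting name, and the rule shapes are assumptions:

```yaml
# Hypothetical sketch of the proposed setting -- not an existing feature.
mappers:
  - name: meltano-map-transform
    config:
      stream_name_transform_rules:
        - search: "(?<=[a-z])(?=[A-Z])"   # camelCase boundary
          replace: "_"
        - search: "[^A-Za-z0-9_]"         # spaces and punctuation
          replace: "_"
```

Applied in order, rules like these could reproduce Fivetran-style snake_case names generically across streams.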
y
Thank you @aaronsteers! I don’t have the technical skill to contribute. However, from the perspective of a Fivetran user. If we decide to make a migration, this functionality will be crucial.
Hello @aaronsteers, I wonder if there has been any progress on this. @douwe_maan suggested I implement this function using the meltano-map-transform. I have the snake case function, but I can’t figure out how to incorporate it into the class. Thank you.
s
Hey @yijie_wang, @aaronsteers would probably be asking the same thing: What's stopping you from using the meltano-map-transform? Renaming columns seems like a very good use case for this mapper. (And yes, a general renaming-transformer would also make sense, but this one will do the job just fine)
y
@Sven Balnojan, thanks for the reply. My understanding is that meltano-map-transform can’t rename columns at the moment. Is this not the case?
s
So in general it can, but you have to do it column by column. The issue @aaronsteers created aims at doing it semi-automatically, as I understand it, e.g. just telling the stream "convert to snake case using this specific regex". But this should work:
```yaml
mappings:
  - name: rename_column
    config:
      stream_maps:
        new_table:
          oldEmailColumn: null              # drop the original column
          new_email_column: oldEmailColumn  # copy it under the new name
          ...
```
y
Thanks for the example, @Sven Balnojan. In my case, there are tens of thousands of columns that need to be transformed, so renaming them one by one isn't feasible.
We use the target-snowflake loader. I tried to implement the transformation here, which also failed: https://github.com/transferwise/pipelinewise-target-snowflake/blob/aa6ac2e392b6148994f9138241630ba8d684adac/target_snowflake/flattening.py#L22
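One way around the "tens of thousands of columns" problem, without patching the target, is to generate the `stream_maps` config programmatically from the tap's catalog. This is a sketch under assumptions: the `catalog` dict shape and names here are hypothetical, and the snake_case helper is a simplified version of the conversion rules discussed earlier:

```python
import json
import re

def snake_case(name: str) -> str:
    """Simplified snake_case conversion (sketch, not Fivetran's exact rules)."""
    s = re.sub(r"[^A-Za-z0-9]", "_", name)
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", s)
    return re.sub(r"_+", "_", s).lower()

def build_stream_maps(catalog: dict) -> dict:
    """For each stream, alias every column to snake_case and drop the original.

    In meltano-map-transform config, a value that names an existing field
    copies it, and a null value drops the field.
    """
    stream_maps = {}
    for stream, columns in catalog.items():
        mapping = {}
        for col in columns:
            new_name = snake_case(col)
            if new_name != col:
                mapping[new_name] = col  # copy old field under the new name
                mapping[col] = None      # drop the original field
        stream_maps[stream] = mapping
    return stream_maps

# Hypothetical catalog excerpt; in practice this would come from the tap.
catalog = {"Opportunity": ["AccountId", "CloseDate", "IsWon"]}
print(json.dumps(build_stream_maps(catalog), indent=2))
```

The resulting dict can be dumped to YAML/JSON and pasted (or templated) into `meltano.yml`, so no column is renamed by hand.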
s
Yeah, so the Meltano way would be to use a mapper. Let me take a look into that next week and I'll get back to you. And I'm sure @aaronsteers can give you a general update (though as far as I can see, nothing has changed since June, not in this regard).
y
Thank you so much @Sven Balnojan! A feature like this would make transitioning to Meltano easier for Fivetran users. It would be great if it could accept a function as an argument.
a
Confirmed that we're not actively prioritizing this as of now. This conversation did spark a much more robust spec exploration here for a V2 mapping syntax - but while we're excited to add that new capability, it is not likely to land within the near future. (Contributions always welcome.)
An upvote or a comment on the issue does help - broad interest makes a difference in our prioritization, and it can sometimes draw external contributors.
a
Hi @yijie_wang, just catching up on your Fivetran conversation. For compatibility/portability, I like the idea of a configuration option in the target. We've done this already for the pipelinewise Postgres target - our motivation was similar, to keep our existing transforms. DM me; maybe we could help with that change on the Snowflake target. https://github.com/transferwise/pipelinewise-target-postgres/compare/master...Matatika:pipelinewise-target-postgres:master
y
Thank you @aaronsteers for the comprehensive proposal. I have upvoted.
Thank you @aaron_phethean for sharing the link. I will DM you this week. It will save us a great deal of manual work.
Hello @aaron_phethean, I finally added the logic here after many failures. The flattening logic is in a different module, and I can't figure out a way to bring in SHOULD_INFLECT without merging the modules. Any suggestions? Thank you. https://github.com/transferwise/pipelinewise-target-snowflake/compare/master...yjw868:pipelinewise-target-snowflake:master
a
Looks like it would be better to avoid the global, but I need to look at the code (on something better than my phone…).