I hit an issue with my Gitlab pipeline: Gitlab variables are... (Meltano #getting-started)

jan_soubusta

03/03/2023, 9:35 AM

I hit an issue with my GitLab pipeline: GitLab variables are protected in our internal repositories, for security reasons, since developers could echo sensitive variables in the pre-merge phase. This limits what I can test in the pre-merge phase. For example, I cannot test tap-salesforce, because we test against the production env of Salesforce and the credentials must be protected. This leads me to ask how you approach this challenge. What makes sense to test in the pre-merge phase (regarding Meltano)?
• Can I validate meltano.yml?
• Should I create fixtures mimicking the result of an extract from Salesforce and test only the loader against a DEV DB (or even against PostgreSQL in Docker running on a GitLab worker)?
• Does anything else make sense to test?
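The fixture idea in the second bullet could look roughly like the sketch below. It emits a synthetic extract as Singer messages (SCHEMA, RECORD, STATE), one JSON object per line, which is the format Singer taps write and targets read. The stream name, the fields, the sample rows, and the shape of the STATE payload are all made up for illustration; a real tap-salesforce stream has many more columns.

```python
import json
import sys

# Hypothetical stream and columns, chosen only for illustration.
STREAM = "Account"
SCHEMA = {
    "type": "object",
    "properties": {
        "Id": {"type": "string"},
        "Name": {"type": ["string", "null"]},
        "SystemModstamp": {"type": "string", "format": "date-time"},
    },
}
ROWS = [
    {"Id": "001A", "Name": "Acme", "SystemModstamp": "2023-03-01T00:00:00Z"},
    {"Id": "001B", "Name": None, "SystemModstamp": "2023-03-02T00:00:00Z"},
]


def singer_messages(stream, schema, rows, key_properties=("Id",)):
    """Yield a SCHEMA message, one RECORD per row, then a STATE message."""
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": list(key_properties),
    }
    for row in rows:
        yield {"type": "RECORD", "stream": stream, "record": row}
    # The STATE value is free-form; this particular shape is an assumption.
    yield {"type": "STATE", "value": {"bookmarks": {stream: {"rows": len(rows)}}}}


def write_fixture(out, messages):
    """Write each message as one JSON line to a file-like object."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    write_fixture(sys.stdout, singer_messages(STREAM, SCHEMA, ROWS))
```

The output could be saved as a fixture file or piped into whichever Singer target the pre-merge job is meant to exercise, so the loader is tested without any Salesforce credentials. It does not tell you whether the real source still matches the fixture.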

aaron_phethean

03/03/2023, 10:30 AM

Hrrm. The best case is to have a Salesforce test account and a Meltano staging environment that uses that account. Without a test account, I question what you'd really be testing at the tap/target level. At the model level, you could take the view that synthetic data and dbt tests are all you need. This makes the big assumption that the incoming data still conforms to your expectations. Without a Salesforce test account (and therefore very likely without staging Salesforce changes), the data will probably just change intraday at some point and you'll see this as a broken pipeline...

aaron_phethean

03/03/2023, 10:36 AM

I'd be pretty interested in revisiting how tools like Azure DevOps deal with this. I seem to recall that protected variables are also obfuscated in the logs to stop devs seeing these sensitive outputs, though there are always clever ways to get the value out, of course... It strikes me that a policy of "meltano jobs only" could give the comfort that the variables are not accessible. Thinking about it, maybe this is an advantage of how we run the Meltano pipeline 'natively', rather than what happens in all other environments, where scripts kick off a 'meltano run'. Hrrm.

jan_soubusta

03/03/2023, 11:19 AM

I tried to execute only `meltano config tap-salesforce test` in the pre-merge, but it still requires the credentials because it tries to connect to the environment. Sad. Unfortunately, we do not have a test account for Salesforce. I'm going to fight for one; otherwise Meltano cannot be tested in pre-merge at all...

aaron_phethean

03/03/2023, 11:44 AM

Yeah. We built that to test the connection 😂

aaron_phethean

03/03/2023, 11:49 AM

Maybe have a look at the new ‘meltano compile’ command if you want a project validation. Our platform deploy stage validates the project and plugins. So we don't experience that potential problem.

visch

03/03/2023, 1:12 PM

https://meltano.slack.com/archives/CMN8HELB0/p1677842349556899?thread_ts=1677836145.062909&cid=CMN8HELB0 Salesforce has environments baked in; it's literally about someone ticking a few boxes. The hard part is making sure there's no data in there they don't want exported 🫣 If you don't have access to creds and you want to test the tap, you're not going to be able to, as that's really all you're testing. @jan_soubusta are you trying to test something else in the CI pipeline other than to see if the tap connects and runs?

jan_soubusta

03/03/2023, 2:03 PM

Thanks for all the comments! 😉 Meanwhile, I forced our RevOps team to get me access to the already existing test env (some kind of secret it was). Anyway, it can still make sense to run some kind of unit test in the first stage of the CI/CD pipeline to catch e.g. syntax errors as soon as possible. In the case of Meltano and dbt, I expect interfaces that allow me to test the syntax validity of meltano.yml, dbt profiles, and dbt models. dbt already provides the `dbt compile` command. `meltano config test` looks similar, but it also tests connections to sources, which makes it a non-unit test 😉

visch

03/03/2023, 2:05 PM

To test the validity of `meltano.yml` there are a lot of cheeky ways, because every time the CLI runs it gets checked (I think). Great use case for `meltano dragon`, I think 😄 `meltano dragon` seems very cheeky though 😮

visch

03/03/2023, 2:05 PM

`meltano compile` is better, like aaron said, as that makes more sense

jan_soubusta

03/03/2023, 2:13 PM

I get `Error: No such command 'compile'`. What do you mean by compile?

visch

03/03/2023, 2:17 PM

Sorry, that's a new command. I just feel terrible telling you to run `meltano dragon` to test your `meltano.yml`, even though it technically works

visch

03/03/2023, 2:17 PM

But you're really just running a test to make sure it's valid yaml 🤷

visch

03/03/2023, 2:18 PM

Back to your original Q about what things to test in CI: https://github.com/meltano/squared/pull/578/checks is from the repo Meltano uses, which does a bunch of things in CI: sql_fluff, dbt_seeding, etc.

visch

03/03/2023, 2:19 PM

For my use cases I built a CI flow that builds my project and runs it in "dev" mode, which to me means it pulls data from the production system, but instead of pushing data to the new target it runs in a Mock mode, and that validates a number of things for me 🤷

francis_potter

03/07/2023, 10:03 PM

Salesforce supports “test orgs” for developers as well as “scratch orgs” that can spin up and down dynamically. Or at least they used to; it’s been a while.

jan_soubusta

03/08/2023, 8:05 AM

As I mentioned above:

Meanwhile, I forced our RevOps team to get me access to the already existing test env (some kind of secret it was).

It is a so-called Sandbox environment, which must be configured in meltano.yml with `is_sandbox: true`.

Sven Balnojan

03/08/2023, 12:10 PM

@jan_soubusta so if you're just looking at E&L, what I'm trying to do is a combination of two tests:
1. Tap the real source, but with a very small amount of data (that won't work for your use case), with a target like jsonl/stdout, to make sure the extracting plugin etc. works.
2. Use tap-csv (or something similar) with the right target, making sure the target works alright (and possibly downstream transformations).
I use these two strategies to make sure the tutorial keeps on working: https://github.com/sbalnojan/meltano-tutorial (the repository has the sole purpose of ensuring that).
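For the first strategy, a CI job needs some way to decide whether the small extract "worked". One possible check, sketched below, reads the tap's raw Singer output line by line and fails if a line is not valid JSON, if a RECORD arrives before the SCHEMA for its stream, or if no records came through at all. The function name and the specific rules are assumptions made for illustration, not something Meltano or the tutorial repository provides.

```python
import json


def check_singer_output(lines):
    """Return record counts per stream; raise ValueError on malformed output."""
    seen_schemas = set()
    counts = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between messages
        try:
            message = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number}: not valid JSON") from exc
        if not isinstance(message, dict):
            raise ValueError(f"line {number}: expected a JSON object")
        kind = message.get("type")
        if kind == "SCHEMA":
            seen_schemas.add(message["stream"])
        elif kind == "RECORD":
            stream = message["stream"]
            if stream not in seen_schemas:
                raise ValueError(
                    f"line {number}: RECORD for {stream!r} before its SCHEMA"
                )
            counts[stream] = counts.get(stream, 0) + 1
        # Other message types (e.g. STATE) are accepted without inspection.
    if not counts:
        raise ValueError("no RECORD messages found")
    return counts
```

In a pipeline this could be fed from a file or from `sys.stdin`, with a non-zero exit on `ValueError`. It only checks the shape of the message stream, so it complements rather than replaces the second strategy, which exercises the target.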
