I hit an issue with my Gitlab pipeline: Gitlab var...
# getting-started
j
I hit an issue with my GitLab pipeline: GitLab variables are protected in our internal repositories for security reasons, since developers could echo sensitive variables in the pre-merge phase. This limits what I can test in the pre-merge phase. For example, I cannot test tap-salesforce, because we test against the production env of Salesforce and the credentials must be protected. This leads me to ask how you approach this challenge. What makes sense to test in the pre-merge phase (regarding Meltano)?
• Can I validate meltano.yml?
• Should I create fixtures mimicking the result of an extract from Salesforce and test only the loader to the DEV DB (or even to PostgreSQL in Docker running on a GitLab worker)?
• Does anything else make sense to test?
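One way to picture the fixture idea from the question: a small script that prints Singer-style SCHEMA, RECORD and STATE messages to stdout, standing in for the real extractor so that only the loader is exercised. This is a minimal sketch, not how tap-salesforce actually behaves. The stream name, the field names and the record values are hypothetical placeholders, not a real Salesforce export.

```python
import json
import sys


def singer_messages(stream, schema, records, key_properties=("Id",)):
    """Yield Singer SCHEMA, RECORD and STATE messages for one stream."""
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": list(key_properties),
    }
    for record in records:
        yield {"type": "RECORD", "stream": stream, "record": record}
    # A dummy bookmark so the target also sees a STATE message.
    yield {"type": "STATE", "value": {"bookmarks": {stream: {"fixture": True}}}}


# Hypothetical fixture data; the fields are placeholders, not a real export.
ACCOUNT_SCHEMA = {
    "type": "object",
    "properties": {
        "Id": {"type": "string"},
        "Name": {"type": ["string", "null"]},
    },
}
ACCOUNT_RECORDS = [
    {"Id": "001-fixture-1", "Name": "Acme"},
    {"Id": "001-fixture-2", "Name": None},
]


def main(out=sys.stdout):
    """Write one JSON message per line, as a Singer tap would."""
    for message in singer_messages("Account", ACCOUNT_SCHEMA, ACCOUNT_RECORDS):
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    main()
```

The output needs no credentials, so in principle it can be piped into a Singer target pointed at a throwaway database in the pre-merge job. How the target is invoked depends on the project setup.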
a
Hrrm. The best case is to have a Salesforce test account, and a Meltano staging environment that uses that account. Without a test account I question what you'd really be testing at the tap / target level. At the model level, you could take the view that synthetic data and dbt tests are all you need. This makes the big assumption that the incoming data still conforms to your expectations. Without a Salesforce test account (and therefore very likely without any staging of Salesforce changes), the incoming data will probably just change intraday at some point, and you'll see this as a broken pipeline...
I'd be pretty interested in revisiting how tools like Azure DevOps are dealing with this. I seem to recall that protected variables are also obfuscated in the logs to stop devs seeing these sensitive outputs - there are always clever ways to get the value out, of course... Strikes me that a policy of "meltano jobs only" could give the comfort that the variables are not accessible. Thinking about it, maybe this is an advantage of what we do to run the meltano pipeline 'natively', rather than what happens in all other environments, where scripts kick off a 'meltano run'. Hrrm.
j
I tried to execute only
meltano config tap-salesforce test
in the pre-merge, but it still requires the credentials because it tries to connect to the environment. Sad. Unfortunately, we do not have a test account for Salesforce. Going to fight for it, otherwise Meltano cannot be tested in pre-merge at all...
a
Yeah. We built that to test the connection 😂
Maybe have a look at the new ‘meltano compile’ command if you want a project validation. Our platform deploy stage validates the project and plugins. So we don't experience that potential problem.
v
https://meltano.slack.com/archives/CMN8HELB0/p1677842349556899?thread_ts=1677836145.062909&cid=CMN8HELB0 Salesforce has environments baked in; it's literally about someone hitting a few boxes. The hard part is making sure there's no data in there that they don't want exported 🫣 If you don't have access to creds, you're not going to be able to test the tap, as connecting and running is really all you're testing. @jan_soubusta are you trying to test something else in the CI pipeline other than to see if the tap connects and runs?
j
Thanks for all the comments! 😉 Meanwhile, I forced our RevOps team to get me access to the already existing test env (it was some kind of secret). Anyway - it can still make sense to run some kind of unit test in the first stage of the CI/CD pipeline, e.g. to catch syntax errors as soon as possible. In the case of Meltano and dbt, I expect the interfaces to allow me to test the syntax validity of meltano.yml, dbt profiles, and dbt models. dbt already provides the
dbt compile
command.
meltano config test
looks similar, but it tests connections to sources in addition, which makes it a non-unit test 😉
v
To test the validity of
meltano.yml
there are a lot of cheeky ways, because every time the CLI runs it gets checked (I think). Great use case for
meltano dragon
I think 😄
meltano dragon
seems very cheeky though 😮
meltano compile
is better like aaron said as that makes more sense
j
I get
Error: No such command 'compile'
. What do you mean by compile?
v
Sorry that's a new command. I just feel terrible telling you to run
meltano dragon
to test your
meltano.yml
even though it technically works
But you're really just running a test to make sure it's valid yaml 🤷
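A credential-free check along these lines could also be done outside the Meltano CLI. The sketch below parses the file with PyYAML (third-party, assumed to be installed in the CI image) and then applies a few hypothetical shape checks. It is not Meltano's own validation: the only assumption is that extractors and loaders are listed under `plugins` and that each entry has a `name`.

```python
import sys

PLUGIN_TYPES = ("extractors", "loaders")  # only the subset this sketch checks


def check_project(project):
    """Return a list of problems found in a parsed meltano.yml mapping."""
    if not isinstance(project, dict):
        return ["top level of meltano.yml is not a mapping"]
    plugins = project.get("plugins") or {}
    if not isinstance(plugins, dict):
        return ["'plugins' is not a mapping"]
    problems = []
    for plugin_type in PLUGIN_TYPES:
        entries = plugins.get(plugin_type) or []
        if not isinstance(entries, list):
            problems.append(f"plugins.{plugin_type} is not a list")
            continue
        for index, entry in enumerate(entries):
            if not isinstance(entry, dict) or "name" not in entry:
                problems.append(f"plugins.{plugin_type}[{index}] has no 'name'")
    return problems


def main(path="meltano.yml"):
    import yaml  # PyYAML; imported here so check_project has no dependencies

    with open(path, encoding="utf-8") as handle:
        project = yaml.safe_load(handle)  # raises yaml.YAMLError on bad syntax
    problems = check_project(project)
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0


if __name__ == "__main__":
    sys.exit(main(*sys.argv[1:2]))
```

A non-zero exit code fails the pre-merge job, so a syntax error or a plugin entry without a name surfaces before anything tries to connect to a source.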
Back to your original Q about what things to test in CI: https://github.com/meltano/squared/pull/578/checks This is the repo Meltano uses, and it does a bunch of things in CI: sql_fluff, dbt_seeding, etc.
For my use cases I built a CI flow that builds my project, and runs it in "dev" mode which to me means it pulls data from the production system, but instead of pushing data to the new target it runs in a Mock mode and that validates a number of things for me 🤷
f
Salesforce supports “test orgs” for developers as well as “scratch orgs” that can spin up and down dynamically. Or at least they used to; it’s been a while.
j
As I mentioned above:
Meanwhile, I forced our RevOps team to get me access to the already existing test env (some kind of secret it was).
It is a so-called Sandbox environment, which must be configured in meltano.yml with
is_sandbox: true
.
s
@jan_soubusta so if you're just looking at E&L, what I'm trying to do is a combination of two tests:
1. Tap the real source, but with a very small amount of data (that won't work for your use case), and target jsonl/stdout or something like that. This makes sure the extracting plugin etc. works.
2. Use tap-csv (or something similar) with the right target, making sure the target works alright (and possibly the downstream transformations).
I use these two strategies to make sure the tutorial keeps on working https://github.com/sbalnojan/meltano-tutorial (the repository has the sole purpose of ensuring that).
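For the first strategy, where the tap writes to stdout or a JSONL file, a small checker can confirm that the output looks like a well-formed Singer stream. This is a minimal sketch under assumptions: `check_singer_output` is a hypothetical helper, and the only rules it applies are that every line is a JSON object and that each RECORD follows a SCHEMA for its stream.

```python
import json
import sys


def check_singer_output(lines):
    """Return (record_counts, problems) for an iterable of Singer JSON lines."""
    seen_schemas = set()
    counts = {}
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {number}: not valid JSON")
            continue
        if not isinstance(message, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        kind = message.get("type")
        stream = message.get("stream")
        if kind == "SCHEMA":
            seen_schemas.add(stream)
        elif kind == "RECORD":
            if stream not in seen_schemas:
                problems.append(
                    f"line {number}: RECORD for stream {stream!r} before its SCHEMA"
                )
            counts[stream] = counts.get(stream, 0) + 1
    return counts, problems


if __name__ == "__main__":
    record_counts, found = check_singer_output(sys.stdin)
    for problem in found:
        print(problem, file=sys.stderr)
    print(json.dumps(record_counts))
    # Fail the job if anything looked wrong or no records arrived at all.
    sys.exit(1 if found or not record_counts else 0)
```

The script reads from stdin, so it can sit at the end of whatever command produces the tap output in the CI job.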