# troubleshooting
s
Hello, happy new year all! I have a question regarding installing a tap from a GitHub repo. I want to install an extractor by providing a GitHub URL within my `meltano.yml` pipeline declaration. As per the docs, we can give `pip_url` a GitHub URL in the following format within the extractor configuration:
```
git+https://github.com/private-org/repo.git
```
This works well with all public repos; however, the tap code I want to clone is located within a private org repo, and I cannot access it without the correct auth. I have been trying to use a GitHub API token for auth, which works well when I hardcode the token value within my `meltano.yml`. Example:
```
pip_url: git+https://"sample-token-xxx"@github.com/private-org/tap-repo.git
```
I have tested this and it works as expected, but I now need to find a way to pass a secret/env variable to the `meltano install` command. I have tried to export the env value and define my extractor URL like this:
```yaml
version: 1
send_anonymous_usage_stats: false
project_id: tap_cloudflare_graphql
plugins:
  extractors:
  - name: tap_cloudflare_graphql
    namespace: tap_cloudflare_graphql
    pip_url: git+https://$GITHUB_TOKEN@github.com/private-org/tap-repo.git
```
but the secret/env value never gets rendered when I run the `meltano install` command. I have already tried exporting this as a user/system environment variable, and also adding it as a Meltano configuration:
```
meltano config meltano set GITHUB_TOKEN ${GITHUB_TOKEN}
```
but still this does not seem to render the token value within the `meltano.yml` file. I wanted to ask the community: is it possible to reference a GitHub API key within the `meltano.yml` pipeline file? Or is there another way to add GitHub auth when referencing a GitHub repo as an extractor within my pipeline declaration?
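(For reference, a minimal sketch of the approach being attempted here. Whether Meltano expands environment variables inside `pip_url` depends on the Meltano version, so treat that as an assumption; the token value and repo URL are placeholders.)
```yaml
plugins:
  extractors:
  - name: tap_cloudflare_graphql
    namespace: tap_cloudflare_graphql
    # ${VAR} syntax, on the assumption that Meltano expands env vars here:
    pip_url: git+https://${GITHUB_TOKEN}@github.com/private-org/tap-repo.git
```
```sh
export GITHUB_TOKEN=sample-token-xxx   # must be set in the same shell/container
meltano install
```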
v
I use private repos all the time and just make sure my local git can clone them (by setting up .ssh etc.). What's your use case here? Do you need this to run in a GitHub Action or something?
s
Yeah, so this `meltano install` command will be run inside a container (apologies, I should have mentioned this). The container will:
• Install the extractors/loaders defined within the pipeline yaml via `meltano install`
• Run `meltano elt`
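(A minimal sketch of those two container steps, with the token forwarded at runtime rather than baked into the image; the image name and loader are placeholders.)
```sh
# -e GITHUB_TOKEN forwards the variable from the host environment.
docker run --rm -e GITHUB_TOKEN my-meltano-image \
  sh -c "meltano install && meltano elt tap_cloudflare_graphql target-jsonl"
```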
v
Is this a GitHub Action, or are you running somewhere else?
s
Right now we are running via Docker, but the plan is to deploy to ArgoWF.
v
Ok, so you're running the container in your own infrastructure, not a GitHub Action?
s
That's correct, yes.
v
The way I've done this (there are probably better ways) is to just use an SSH key: add it to your Dockerfile, just like you'd add any SSH key for git. That should just work here. I'm not certain about environment variable expansion everywhere in your meltano.yml; this may work, but I haven't deep-dived that part of the Meltano code to know (it works for plugin config for sure).
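(A sketch of that SSH-key approach; the key path and mounted-secret location are assumptions. The `insteadOf` rewrite makes existing `git+https://` pip_urls clone over SSH without a token.)
```sh
# Run inside the image, e.g. from an entrypoint step; paths are assumptions.
mkdir -p ~/.ssh
cp /run/secrets/github_deploy_key ~/.ssh/id_ed25519   # mounted at runtime, not baked in
chmod 600 ~/.ssh/id_ed25519
ssh-keyscan github.com >> ~/.ssh/known_hosts          # trust GitHub's host key
# Rewrite HTTPS GitHub URLs to SSH so git+https pip_urls keep working:
git config --global url."git@github.com:".insteadOf "https://github.com/"
```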
a
Hi @sean_glynn - feels like this thread might help: https://meltano.slack.com/archives/CMN8HELB0/p1628034231012200. Kudos to @stephen_bailey, who seems to be an Argo pro!
s
Hey @aaron_phethean, I found a solution based on Stephen's ArgoWF template. Thank you for pointing me to this!
a
Great to hear @sean_glynn - there's a lot to like about ArgoWF. I think they are working on automatically building an image for the template layer, which would be great. In our use case we have just one image and `meltano install` the plugins to be used in each job run. That is not optimal and makes the job run slower than I'd like; I imagine you are doing the same. With a little more effort you can of course build your own image for each template, but that kind of defeats the simplicity of the template layers.
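(For illustration, a hedged sketch of that "build your own image for each template" option; the base image tag, project contents, and image name are assumptions.)
```sh
# Bake the plugins in at build time so each job run skips `meltano install`.
cat > Dockerfile <<'EOF'
FROM meltano/meltano:latest-python3.9
WORKDIR /project
COPY . .
RUN meltano install
EOF
docker build -t my-org/meltano-template:latest .
```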
s
Yes, I agree. It does take a little longer than expected when we pull the Meltano base image, add our dependencies, and initialize the prerequisite steps every time we launch our workflow, but we are more concerned with accuracy and security than speed at present. We want to pull the latest version of our custom taps/targets and pipeline definitions when we execute our workflow, so we would prefer not to bake these into our own image. So, just to explain our workflow right now: we pull the Meltano base image, where we parameterize the image tag (e.g. `:latest-python3.8` | `:latest-python3.9`). We mount our secrets and init the container via an `/entrypoint` script (a common init script that installs our dependencies and initializes the prerequisite steps required for that pipeline) before running the ELT job. We share this `/entrypoint` script with our docker-compose setup for local pipeline development (before we run our pipelines in Argo). I would love to share this and get feedback on the approach (once I have the deployment in a better state, as it is still WIP šŸ˜‰). Thank you all for your feedback thus far!
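(A rough sketch of what such a shared `/entrypoint` script could look like; every path and variable name here is an assumption, since the actual script wasn't shared.)
```sh
#!/bin/sh
# Hypothetical /entrypoint: common init shared by docker-compose and ArgoWF.
set -e
export GITHUB_TOKEN="$(cat /secrets/github_token)"   # read a mounted secret
meltano install                                      # pull latest custom taps/targets
exec meltano elt "$EXTRACTOR" "$LOADER"              # plugin names supplied by the workflow
```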
a
šŸ¤› good work, sounds great.