# troubleshooting
s
Hello, happy new year all! I have a question regarding installing a tap from a GitHub repo. I want to install an extractor by providing a GitHub URL within my `meltano.yml` pipeline declaration. As per the docs, we can give `pip_url` a GitHub URL in the following format within the extractor configuration:
```
git+https://github.com/private-org/repo.git
```
This works well with all public repos; however, the tap code I want to clone is located within a private org repo, and I cannot access it without the correct auth. I have been trying to use a GitHub API token for auth, which works well when I hardcode the token value within my `meltano.yml`. Example:
```
pip_url: git+https://"sample-token-xxx"@github.com/private-org/tap-repo.git
```
I have tested this and it works as expected, but I now need to find a way to pass a secret/env variable to the `meltano install` command. I have tried to export the env value and define my extractor URL like this:
```yaml
version: 1
send_anonymous_usage_stats: false
project_id: tap_cloudflare_graphql
plugins:
  extractors:
  - name: tap_cloudflare_graphql
    namespace: tap_cloudflare_graphql
    pip_url: git+https://$GITHUB_TOKEN@github.com/private-org/tap-repo.git
```
but the secret/env value never gets rendered when I run the `meltano install` command. I have already tried exporting this as a user/system environment variable, and also adding it as a Meltano configuration:
```
meltano config meltano set GITHUB_TOKEN ${GITHUB_TOKEN}
```
but still this does not seem to render the token value within the `meltano.yml` file. I wanted to ask the community: is it possible to reference a GitHub API key within the `meltano.yml` pipeline file? Or is there another way to add GitHub auth when referencing a GitHub repo as an extractor within my pipeline declaration?
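(For reference, a minimal sketch of the approach being attempted here. Whether Meltano expands environment variables inside `pip_url` depends on the Meltano version, so treat that as an assumption; the token value and repo URL are placeholders.)
```yaml
plugins:
  extractors:
  - name: tap_cloudflare_graphql
    namespace: tap_cloudflare_graphql
    # ${VAR} syntax, on the assumption that Meltano expands env vars here:
    pip_url: git+https://${GITHUB_TOKEN}@github.com/private-org/tap-repo.git
```
```sh
export GITHUB_TOKEN=sample-token-xxx   # must be set in the same shell/container
meltano install
```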
v
I use private repos all the time and just make sure my local git can clone them (by setting up .ssh etc.). What's your use case here? Do you need this to run in a GitHub Action or something?
s
Yeah, so this `meltano install` command will be run inside a container (apologies, I should have mentioned this). The container will:
• Install the extractors/loaders defined within the pipeline yaml via `meltano install`
• Run `meltano elt`
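(A minimal sketch of those two container steps, with the token forwarded at runtime rather than baked into the image; the image name and loader are placeholders.)
```sh
# -e GITHUB_TOKEN forwards the variable from the host environment.
docker run --rm -e GITHUB_TOKEN my-meltano-image \
  sh -c "meltano install && meltano elt tap_cloudflare_graphql target-jsonl"
```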
v
Is this a GitHub Action, or are you running somewhere else?
s
Right now we are running via Docker, but the plan is to deploy to ArgoWF.
v
Ok, so you're running the container in your own infrastructure, not a GitHub Action?
s
That's correct, yes.
v
The way I've done this (there are probably better ways) is to just use an SSH key: add it to your Dockerfile, just like you'd add any SSH key for git. That should just work here. I'm not certain about environment variable expansion everywhere in your meltano.yml; this may work, but I haven't deep-dived that part of the Meltano code to know (it works for plugin config for sure).
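(A sketch of that SSH-key approach; the key path and mounted-secret location are assumptions. The `insteadOf` rewrite makes existing `git+https://` pip_urls clone over SSH without a token.)
```sh
# Run inside the image, e.g. from an entrypoint step; paths are assumptions.
mkdir -p ~/.ssh
cp /run/secrets/github_deploy_key ~/.ssh/id_ed25519   # mounted at runtime, not baked in
chmod 600 ~/.ssh/id_ed25519
ssh-keyscan github.com >> ~/.ssh/known_hosts          # trust GitHub's host key
# Rewrite HTTPS GitHub URLs to SSH so git+https pip_urls keep working:
git config --global url."git@github.com:".insteadOf "https://github.com/"
```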
a
Hi @sean_glynn - feels like this thread might help: https://meltano.slack.com/archives/CMN8HELB0/p1628034231012200. Kudos to @stephen_bailey, who seems to be an Argo pro!
s
Hey @aaron_phethean, I found a solution based on Stephen's ArgoWF template. Thank you for pointing me to this!
a
Great to hear @sean_glynn - there's a lot to like about ArgoWF. I think they are working on automatically building an image for the template layer, which would be great. In our use case we have just one image and `meltano install` the plugins to be used in each job run. That is not optimal and makes the job run slower than I'd like; I imagine you are doing the same. With a little more effort you can of course build your own image for each template, but that kind of defeats the simplicity of the template layers.
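(For illustration, a hedged sketch of that "build your own image for each template" option; the base image tag, project contents, and image name are assumptions.)
```sh
# Bake the plugins in at build time so each job run skips `meltano install`.
cat > Dockerfile <<'EOF'
FROM meltano/meltano:latest-python3.9
WORKDIR /project
COPY . .
RUN meltano install
EOF
docker build -t my-org/meltano-template:latest .
```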
s
Yes, I agree. It does take a little longer than expected when we pull the Meltano base image, add our dependencies, and initialize the prerequisite steps every time we launch our workflow, but we are more concerned with accuracy and security than speed at present. We want to pull the latest version of our custom taps/targets and pipeline definitions when we execute our workflow, so we would prefer not to bake these into our own image. So, just to explain our workflow right now: we pull the Meltano base image, where we parameterize the image tag (e.g. `:latest-python3.8` | `:latest-python3.9`). We mount our secrets and init the container via an `/entrypoint` script (a common init script that installs our dependencies and initializes the prerequisite steps required for that pipeline) before running the ELT job. We share this `/entrypoint` script with our docker-compose setup for local pipeline development (before we run our pipelines in Argo). I would love to share this and get feedback on the approach (once I have the deployment in a better state, as it is still WIP šŸ˜‰). Thank you all for your feedback thus far!
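(A rough sketch of what such a shared `/entrypoint` script could look like; every path and variable name here is an assumption, since the actual script wasn't shared.)
```sh
#!/bin/sh
# Hypothetical /entrypoint: common init shared by docker-compose and ArgoWF.
set -e
export GITHUB_TOKEN="$(cat /secrets/github_token)"   # read a mounted secret
meltano install                                      # pull latest custom taps/targets
exec meltano elt "$EXTRACTOR" "$LOADER"              # plugin names supplied by the workflow
```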
a
šŸ¤› good work, sounds great.