Hi - I'm trying to add the tap-oracle extractor pl...
# plugins-general
t
Hi - I'm trying to add the tap-oracle extractor plugin to a meltano installation that is in a private subnet and with no access to the internet. I was hoping I could just install the same plugin locally then simply zip up the tap-oracle directory that is found at .meltano/extractors/tap-oracle and copy over to the same directory on the private instance. I've tried this with no luck, still not sure which option to select when running
meltano add --custom extractor tap-oracle
in terms of the pip install argument, not sure what to go for when prompted with `Specify the plugin's
pip install
argument, for example:`
- PyPI package name:
tap-oracle
- Git repository URL:
git+https://<PLUGIN REPO URL>.git
- local directory, in editable/development mode:
-e extract/tap-oracle
- 'n' if using a local executable (nothing to install)
Default: plugin name as PyPI package name
Is it possible to do this? To install the oracle-tap on a machine with no internet access or access to a proxy etc. Any help much appreciated - Thanks!
p
Hey @tom_saunders ! Yeah that sounds like it would work to me. We build a docker image but zipping should get the job done too. For your
--custom
command, we're working on some features that will make this waay easier but for now this should work:
meltano add --custom extractor tap-oracle
(namespace) [tap_oracle]: <hit enter - accept default>
(pip_url) [tap-oracle]: pipelinewise-tap-oracle
(executable) [pipelinewise-tap-oracle]: tap-oracle
(capabilities) [[]]: catalog,discover,state
(settings) [[]]: <hit enter - accept default>
v
I happen to have an example implementation here https://gitlab.com/vischous/oracle2mssql/-/blob/master/oracle2mssql/meltano.yml Hope it helps 😄
t
Thanks @visch - I don't think I can use that config as my meltano instance doesn't have access to the internet.
v
Got it I missed that earlier, yeah it looks like you have the tap in your extract/tap-oracle directory? this should work I think
pip_url: -e extract/tap-oracle
Everything else the same
p
@tom_saunders I think you could use Derek's config if you wanted. Adding the config to your meltano.yml, then running
meltano install
will get your .meltano directory updated, then you can zip it. One thing to note is that you will need to update your meltano.yml in the private instance also so meltano knows how to find tap-oracle and call it properly. To avoid any inconsistencies it might be better to zip your entire directory and do a full replace instead of moving a single directory from .meltano, or build a docker image.
t
My meltano.yml looks like this on my local:
plugins:
  extractors:
  - name: tap-oracle
    namespace: tap_oracle
    pip_url: tap-oracle
    executable: tap-oracle
    capabilities:
    - catalog
    - discover
And that seems to have installed correctly as I can run
meltano invoke tap-oracle --help
I have zipped up the full meltano project now and moved over to the private instance. Do I need to change the meltano .yml before running
meltano install
? If I've understood correctly I should change the pip_url to
pip_url: -e extract/tap-oracle
, however there is no tap-oracle in my local extract directory. I do realise I might be better served using a container-based approach, and I will explore that, I was just trying to get started quickly and have failed miserably so far 😂
p
It looks like you're using the singer-io variant of tap-oracle but the default on MeltanoHub is set to transferwise. I don't know the details, but I think the transferwise variant is better since it's the default; follow my previous steps if you want to switch. When you run an install it pip installs and creates the .meltano directory, so if you've done that on your local you shouldn't need to do it again on your private instance, and keeping the pip_url the same should be fine.
c
I do realise I might be better served using a container-based approach
I don't think that's necessary and the infrastructure and know-how required for that will involve time and cost. All you need to do is to copy the contents of the pip cache from a box where you ran the install over into your network enclave / air-gapped host. And then just configure your
pip
configuration to look for wheels in that copied location (via
find-links
option). Works like a charm. Should take 5 minutes to configure.
t
OK thanks Christoph - I wasn't aware of the pip cache. I'll have to do some reading I think but this gives me some optimism and a route to follow. I'll let you know how that goes
v
Pip cache will totally work. I personally would go with zipping the whole directory as https://meltano.slack.com/archives/C013EKWA2Q1/p1652273427105849?thread_ts=1652268778.370059&cid=C013EKWA2Q1 suggested, but it's really a matter of preference and what you're used to. Both are pretty quick and get the job done!
t
Zipping the whole directory doesn't work as far as I can tell. The pip cache solution was in addition to having zipped the whole directory. I believe even with the full directory, pip on the air-gapped instance doesn't know those packages are installed and tries to install them from fresh and attempts to grab what it needs from remote repositories. I have attempted the pip cache thing but am getting no luck, I am getting dependency issues for packages that are not in my pip cache 🙃
v
Makes sense thinking about it more: the venvs pip installs into (i.e. .meltano/extractors/*) contain environment-specific stuff (bin/ links to python etc.)
p
@tom_saunders what commands are you using to run the tap? Meltano shouldn't be trying to pip install anything if your .meltano is available and you run a
meltano run/invoke/elt
command. I think 🤔
v
@pat_nadolny It sounds good until you try it. There are things like this baked into venvs:
visch@visch-ubuntu:~/git/tap-clickup/.meltano/extractors/tap-clickup/venv/bin$ cat activate
# This file must be used with "source bin/activate" *from bash*
# you cannot run it directly

deactivate () {
    # reset old environment variables
    if [ -n "${_OLD_VIRTUAL_PATH:-}" ] ; then
        PATH="${_OLD_VIRTUAL_PATH:-}"
        export PATH
        unset _OLD_VIRTUAL_PATH
    fi
    if [ -n "${_OLD_VIRTUAL_PYTHONHOME:-}" ] ; then
        PYTHONHOME="${_OLD_VIRTUAL_PYTHONHOME:-}"
        export PYTHONHOME
        unset _OLD_VIRTUAL_PYTHONHOME
    fi

    # This should detect bash and zsh, which have a hash command that must
    # be called to get it to forget past commands.  Without forgetting
    # past commands the $PATH changes we made may not be respected
    if [ -n "${BASH:-}" -o -n "${ZSH_VERSION:-}" ] ; then
        hash -r
    fi

    if [ -n "${_OLD_VIRTUAL_PS1:-}" ] ; then
        PS1="${_OLD_VIRTUAL_PS1:-}"
        export PS1
        unset _OLD_VIRTUAL_PS1
    fi

    unset VIRTUAL_ENV
    if [ ! "${1:-}" = "nondestructive" ] ; then
    # Self destruct!
        unset -f deactivate
    fi
}

# unset irrelevant variables
deactivate nondestructive

VIRTUAL_ENV="/home/visch/git/tap-clickup/.meltano/extractors/tap-clickup/venv"
Note VIRTUAL_ENV="/home/visch/git/tap-clickup/.meltano/extractors/tap-clickup/venv"
When you move systems 💥
I believe there are other things besides this, it's just one example. There are architecture-specific things as well.
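A minimal sketch of how one might spot the hardcoded path shown above after copying a .meltano directory to another machine. It assumes bin/activate contains a line of the form VIRTUAL_ENV="/abs/path", as in the output above; other venv versions may write that line differently. The function name check_venv_path is hypothetical.

```shell
# Sketch only: assumes bin/activate holds a line like VIRTUAL_ENV="/abs/path".
# Usage: check_venv_path VENV_DIR   (absolute path, no trailing slash)
# Prints "ok" when the recorded path matches where the venv lives now,
# otherwise prints "stale: <recorded path>" and returns non-zero.
check_venv_path() (
    venv_dir="$1"
    # Pull the path out of the first VIRTUAL_ENV="..." line
    recorded=$(sed -n 's/^VIRTUAL_ENV="\(.*\)"$/\1/p' "$venv_dir/bin/activate" | head -n 1)
    if [ "$recorded" = "$venv_dir" ]; then
        echo "ok"
    else
        echo "stale: $recorded"
        exit 1
    fi
)
```

For example, calling check_venv_path "$PWD/.meltano/extractors/tap-oracle/venv" on the target machine would report a stale path if the project lived somewhere else when it was installed. This only catches one symptom: the scripts in bin/ also carry an absolute interpreter path in their shebang line, and compiled extensions are tied to the platform they were built for.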
t
Yeah - I think I am just going to use a docker container and mount the important directories into the container 🤷‍♂️
v
That will work! I have examples of that if you need (gitlab files-docker or something like that has them as well!)
@pat_nadolny I wouldn't be surprised if there's some way to do what we're talking about here. I haven't dug enough to know, but some kind of application bundle would probably work (pyinstaller maybe?)
t
I am guessing you need to install the oracle instantclient during the image build i.e. via the dockerfile? Does anyone have an example? Or know what the underlying OS is on the meltano docker image?
Don't worry got it working
c
I have attempted the pip cache thing but am getting no luck, I am
getting dependency issues for packages that are not in my pip cache 🙃
@tom_saunders It's really pretty simple. I think me calling it 'pip cache' was a bit misleading. Here are the steps in order:
1. On a machine that matches your air-gapped target (i.e. same OS version, same Python version and same CPU architecture), download everything that is needed from PyPI into a directory. To do so run
pip download -d /tmp/tap-oracle-wheels tap-oracle
2. Archive that directory, using -C so the archive stores the relative path tap-oracle-wheels/ and step 3 unpacks it to /tmp/tap-oracle-wheels:
tar -C /tmp -czf /tmp/tap-oracle-wheels.tgz tap-oracle-wheels
3. Copy
/tmp/tap-oracle-wheels.tgz
to your air-gapped machine and extract it (
tar -C /tmp -xzf /tmp/tap-oracle-wheels.tgz
)
4. On your air-gapped machine (where you presumably already have your meltano.yml and related project files), add this to your pip configuration, either via config files or via env variables, to tell pip to look in the
/tmp/tap-oracle-wheels
folder for wheels and sources:
a. As a pip option:
--find-links /tmp/tap-oracle-wheels
b. When using
pip.conf
(either global
/etc/pip.conf
or per-user
~/.config/pip/pip.conf
), it would look like
[global]
find-links =
	/tmp/tap-oracle-wheels
but some kind of application bundle would probably work (pyinstaller maybe?)
@visch I think plain
pip
capabilities should suffice. See my detailed instructions above.
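The steps above could be wrapped in two small helpers, sketched here under a few assumptions. The function names and the config file location are hypothetical. The no-index = true line is an addition not mentioned in the thread: it tells pip to skip the package index entirely instead of trying to reach PyPI first. The fingerprint helper assumes python3 is on the PATH and only helps compare the download machine with the target; it is not a full compatibility check.

```shell
# Sketch only: names and paths are illustrative assumptions.

# Print a one-line fingerprint; run it on both machines and compare the output.
platform_fingerprint() (
    py_version=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null)
    echo "$(uname -s) $(uname -m) python=${py_version:-unknown}"
)

# Write a pip config that resolves packages from a local wheel directory only.
# Usage: write_offline_pip_conf CONF_FILE WHEEL_DIR
write_offline_pip_conf() (
    conf_file="$1"
    wheel_dir="$2"
    mkdir -p "$(dirname "$conf_file")" || exit 1
    # no-index is an assumption added here; find-links is the option from the thread
    printf '[global]\nno-index = true\nfind-links = %s\n' "$wheel_dir" > "$conf_file"
)
```

One possible use on the air-gapped machine is write_offline_pip_conf "$HOME/.config/pip/pip.conf" /tmp/tap-oracle-wheels followed by meltano install. Alternatively, pip's PIP_CONFIG_FILE environment variable can point at a config file in another location; that should work as long as Meltano passes its environment through to pip, which seems likely but is not confirmed in this thread.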
t
Thank you very much for the detailed instructions. The process I followed was not the same as above so worth giving this a go for sure!
c
No worries.
pip
is quite a flexible tool. There is more detailed information available on the official documentation site. https://pip.pypa.io/en/stable/topics/configuration Replacing
pip
with docker or k8s is a bit like calling in an airstrike when a precisely placed artillery strike is all that's needed. 😉
v
@christoph that's great. When you said cache, my brain hopped to a local pip server (which this kinda is, but it's much easier than what I was thinking)
meltano lock --local
? @edgar_ramirez_mondragon
c
Yup, you can easily use
pip download
to manage a very simple local pip index http server's content directory ... in case you need to scale my above described method to a few more computers. 😉