Hi team I am currently testing Meltano as an option for our Meltano #troubleshooting

Hi team, I am currently testing Meltano as an opt...

abhinav_prakash

10/18/2022, 5:28 AM

Hi team, I am currently testing Meltano as an option for our EL(T) - from Mongo into Snowflake. I was able to test it out locally and things looked fine. Next steps were to add meltano to our k8s cluster using helm and argocd. I am using this Repository of Helm Charts for use with Meltano from: https://github.com/meltano/helm-meltano for the same. What I have done so far: •

cd <path-to-meltano-folder-on-our-k8s-infra-repo>

•

helm repo add meltano <https://meltano.gitlab.io/infra/helm-meltano/meltano-ui>

•

helm search repo meltano-ui

◦ This returns: ▪︎

meltano/meltano-ui   0.3.0   1.96.0   Meltano UI webservice

• So, I do a

helm pull meltano/meltano-ui

followed by

tar -xvzf meltano-ui-0.3.0.tgz

(and delete the tar file) • After this I add the meltano app to argocd and finally run

kubectl apply -f meltano.yml

After pushing the changes, I see my pod in argocd but with an error that says:

Copy code

`meltano ui` must be run inside a Meltano project.
Use `meltano init <project_directory>` to create one.

So, I went into the deployment.yaml file and changed:

Copy code

command: ["meltano", "ui"]

Copy code

command: ["/bin/sh","-c"]
args:
  - meltano init mongo-snowflake-sync;
    cd mongo-snowflake-sync;
    echo pwd;
    pwd;
    find .;

and also updated the mount path to:

mountPath: "/project/mongo-snowflake-sync/.env"

because “pwd” said the my current directory was

/project/mongo-snowflake-sync/

Also, under the values.yaml file, I updated the database URI like so:

Copy code

database_uri: "sqlite:///project/mongo-snowflake-sync/.meltano/meltano.db"

Now, I see this error:

Copy code

Could not connect to the database after 3 attempts. Max retries exceeded.
Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.
(sqlite3.OperationalError) unable to open database file
(Background on this error at: <https://sqlalche.me/e/14/e3q8>)

Copy code

(sqlite3.OperationalError) unable to open database file
(Background on this error at: <https://sqlalche.me/e/14/e3q8>)

I tried multiple different options for the database URI and mount path with slightly varying results. For instance, I tried adding a “find .” to the commands to check all the files that were created in the meltano init step and while sometimes I could see the meltano.db most other times I did not(this might be a k8s thing though because I cannot reproduce that scenario where the meltano.db was getting listed in the find command 🤷). However, in all cases, I did still get the same error as above i.e. Could not connect to the database after 3 attempts. Max retries exceeded. I feel pretty much stuck at this point and out of ideas. Would appreciate any help/suggestions! TIA 🙏

visch

10/18/2022, 10:58 AM

Not certain about everything, but I know that version

1.96.0

is old, and I know the database connect issue was fixed after that version, so if you could upgrade meltano maybe you'd be off to the races?

Will Da Silva (Arch)

10/18/2022, 12:50 PM

CC @ken_payne

abhinav_prakash

10/18/2022, 2:14 PM

@visch That is interesting to know but I am not sure how to add a version to this repo. When I run:

helm search repo meltano-ui

(or meltano), I only get this one version:

NAME                    CHART VERSION   APP VERSION     DESCRIPTION

meltano/meltano-ui      0.3.0           1.96.0          Meltano UI webservice

Seems like this version needs to be updated: https://github.com/meltano/helm-meltano/blob/704ad4bb0adbd1c2b3f34f1e215c96571133cdf5/meltano-ui/Chart.yaml#L24 ?

visch

10/18/2022, 2:15 PM

Possibly, but the helm chart I would imagine* should add a DB (that's not sqllite) to the project so it could be something else too @Will Da Silva (Arch)’s tag is right here

ken_payne

10/18/2022, 2:33 PM

Thanks for checking this out @abhinav_prakash 🙏 Both the Meltano image to use and the database URI are configurable, via the normal

values.yaml

route for Helm. Here are the defaults, which is where the SQLite URI is coming from. In the squared project (our internal deployment of Mletano) we substitute the docker `repository`/`tag` for custom values with our project installed in it and the

database_uri

for a running RDS Postgres instance (different for prod/staging) managed by Terraform. You can do the same with your project 🙂

abhinav_prakash

10/18/2022, 5:31 PM

@ken_payne thanks for the response thankyou Couldn't get to test it out as yet, driving for most of today. But I'll try to implement what you suggested and reach out in case I get stuck.

abhinav_prakash

11/07/2022, 4:48 PM

Hey @ken_payne, Sorry for the late response on this, I was traveling the past week. I did try setting up meltano the way you suggested and was able to get it to work(for the most part). As of now, I have a meltano install that is running version 2.8.0. I have added the following taps: • Extractor: tap-mongodb(variant: singer-io) ◦ Runs fine in discovery mode ◦ Able to access oplog table in mongodb for oplog replication • Loader: tap-snowflake(variant: transferwise, File format: CSV) ◦ Able to connect to snowflake(queries showing up in the SF history) ISSUE: • After the pipeline has read data from mongo, tap-snowflake runs the following create table statement: ◦

CREATE TABLE IF NOT EXISTS MELTANO."TABLENAME" (PRIMARY KEY("_ID")) data_retention_time_in_days = 1

, and fails with the error message: ▪︎

SQL compilation error: error line 1 at position 66 invalid identifier '_ID'

▪︎ I manually ran a create statement

CREATE TABLE IF NOT EXISTS MELTANO.TABLENAME (_ID varchar)

in SF(manually, as the SF user above) and was able to create a table, so the user has the correct perms to read/create table ▪︎ HOWEVER, this statement fails: •

CREATE TABLE IF NOT EXISTS MELTANO.TABLENAME ('_ID' varchar)

(OR “_ID”) • So I am wondering if it is the extra quotes are getting added in the snowflake extractor that should not be.(Also checked the SF doc and

ID

is not a reserved keyword) • I have already tried to load the output to json and csv files and both worked fine. TIA, Abhinav

abhinav_prakash

11/07/2022, 5:47 PM

Update: One of the other things I tried was setting up the tap-mongodb with the transferwise variant(considering that the snowflake tap is transferwise). However, with that variant I am not even able to connect to my mongodb. Even though I am using the same uri/user/pass/port as that for the singer-io variant(which works). So haven’t really been able to test out that route. Would really appreciate some help on this 🙏

abhinav_prakash

11/07/2022, 6:24 PM

UPDATE: Testing workaround: https://github.com/meltano/meltano/issues/2472

douwe_maan

11/07/2022, 7:36 PM

@abhinav_prakash The double quotes around

_ID

seem fine: https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html#label-delimited-identifier. But you’re saying

CREATE TABLE IF NOT EXISTS MELTANO.TABLENAME ("_ID" varchar)

fails with that same “invalid identifier” error?

douwe_maan

11/07/2022, 7:38 PM

Does the create table query target-snowflake runs (

CREATE TABLE IF NOT EXISTS MELTANO."TABLENAME" (PRIMARY KEY("_ID")) data_retention_time_in_days = 1

) really not list any specific columns and types? I wonder if the issue is that it references

_ID

to be the primary key, but doesn’t actually specify how to create that column

abhinav_prakash

11/07/2022, 8:55 PM

Yea that is what I was wondering and found the github link where you mentioned about explicitly specifying each schema in the meltano.yml file. I am about to test that now.

abhinav_prakash

11/07/2022, 8:56 PM

About your other question, I just tried running this statement `CREATE TABLE IF NOT EXISTS MELTANO.TABLENAME ("_ID" varchar)`and it ran successfully.

3 Views

Open in Slack

Previous Next