Hello Team I m using meltano local not docker UI to sync dat Meltano #troubleshooting

parthasarathi_r

09/15/2021, 8:52 AM

Hello Team, I'm using the Meltano UI (local, not Docker) to sync data from MongoDB to Redshift. I searched the UI but did not find a way to pick entities or choose replication methods, so I started following the CLI documentation. I ran the pipeline I had configured in the UI (tap: MongoDB, target: Redshift), but it ended with the error below.

target-redshift | psycopg2.errors.UndefinedColumn: column "_id" named in key does not exist

After analyzing, I found that the tap produced the schema below:

{"type": "SCHEMA", "stream": "mongodb_document", "schema": {"type": "object"}, "key_properties": ["_id"]}

target-redshift tries to create a table from this schema. Since "_id" is declared as a key, it looks for that field in the schema and fails, because the schema defines no properties. Question: is it possible to update the schema manually? In Singer there would be a catalog.json, but the Meltano documentation says the catalog is handled internally. Could you please let me know if there is any way to update the schema?
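The mismatch described above can be reproduced outside the pipeline. This is a minimal sketch, assuming only the SCHEMA message quoted in this thread; the helper name `missing_key_properties` is hypothetical and not part of Meltano, Singer, or target-redshift.

```python
import json


def missing_key_properties(schema_message: dict) -> list:
    """Return the key_properties that are not declared under schema.properties."""
    properties = schema_message.get("schema", {}).get("properties", {})
    return [
        key
        for key in schema_message.get("key_properties", [])
        if key not in properties
    ]


# The SCHEMA message quoted in the thread, as emitted by the tap.
line = (
    '{"type": "SCHEMA", "stream": "mongodb_document", '
    '"schema": {"type": "object"}, "key_properties": ["_id"]}'
)
message = json.loads(line)

# "_id" is listed as a key but the schema declares no properties at all.
print(missing_key_properties(message))
```

A non-empty result means the target is being asked to build a key on a column it was never told about, which matches the UndefinedColumn error above.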

taylor

09/15/2021, 1:58 PM

https://meltano.com/docs/integration.html#selecting-entities-and-attributes-for-extraction is probably what you're looking for

taylor

09/15/2021, 1:59 PM

From https://hub.meltano.com/singer/spec#schemas you can set key_properties to an empty list as well to indicate there's no primary key.
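To illustrate what an empty key list looks like on the wire, here is a rough sketch that rewrites a Singer message line so that key_properties only keeps keys the schema actually declares. The function name `drop_undeclared_keys` is hypothetical, and whether a given target accepts streams without a primary key is an assumption to verify for your loader.

```python
import json


def drop_undeclared_keys(line: str) -> str:
    """Rewrite a SCHEMA message so key_properties only lists declared properties.

    Non-SCHEMA messages (RECORD, STATE) are returned with their content unchanged.
    """
    message = json.loads(line)
    if message.get("type") == "SCHEMA":
        properties = message.get("schema", {}).get("properties", {})
        message["key_properties"] = [
            key for key in message.get("key_properties", []) if key in properties
        ]
    return json.dumps(message)


schema_line = (
    '{"type": "SCHEMA", "stream": "mongodb_document", '
    '"schema": {"type": "object"}, "key_properties": ["_id"]}'
)
rewritten = json.loads(drop_undeclared_keys(schema_line))
print(rewritten["key_properties"])
```

With the schema from this thread, the rewritten message carries an empty key_properties list, so no key column is requested.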

francis_niu

09/16/2021, 9:49 AM

I hit the same issue and solved it as follows:
1. Override the schema with the extractor's schema extra (https://meltano.com/docs/integration.html#overriding-schemas); Meltano will then generate a correct catalog.json.
2. The default tap-mongodb variant is no longer maintained; it always generates the schema from the data rows and doesn't follow the schema in catalog.json. So I forked and modified tap-mongodb to use the specified schema.
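The intended effect of step 1 can be sketched in plain Python: merge a per-stream, per-property override into the SCHEMA message so that "_id" is declared. This is only an illustration, not Meltano's implementation; the function name `apply_schema_override` and the string type chosen for "_id" are assumptions.

```python
import copy


def apply_schema_override(schema_message: dict, override: dict) -> dict:
    """Return a copy of a SCHEMA message with property overrides merged in.

    `override` is keyed by stream name, then by property name.
    """
    patched = copy.deepcopy(schema_message)
    stream_override = override.get(patched.get("stream"), {})
    properties = patched.setdefault("schema", {}).setdefault("properties", {})
    properties.update(stream_override)
    return patched


message = {
    "type": "SCHEMA",
    "stream": "mongodb_document",
    "schema": {"type": "object"},
    "key_properties": ["_id"],
}

# Assumption: "_id" is serialized as a string by the tap.
override = {"mongodb_document": {"_id": {"type": ["string", "null"]}}}

patched = apply_schema_override(message, override)
print(patched["schema"]["properties"])
```

As step 2 notes, this only helps if the tap honors the schema it is given, which is why the fork was needed for the default tap-mongodb variant.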
