# troubleshooting
r
I am working through an issue replicating from Genesys Cloud to an Oracle DB using tap-purecloud and target-oracle. The problem shows up when trying to pull Users from PureCloud using the following config:
version: 1
default_environment: dev
project_id: id_here
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-purecloud
    variant: pathlight
    pip_url: git+https://github.com/Pathlight/tap-purecloud.git
    config:
      domain: mypurecloud.com
      start_date: '2024-10-06'
  loaders:
  - name: target-oracle
    variant: radbrt
    pip_url: git+https://github.com/radbrt/target-oracle.git
    config:
      sqlalchemy_url: oracle+cx_oracle://[USER]@proxy_user
      flattening_enabled: true
      flattening_max_depth: 3
We see the following error
[info     ] KeyError: 'division__id'
I assumed this was an issue with schema flattening, but we have played around with the max depth and it seems to have no bearing on the error. The only other thing I noticed is that the top-level id is a key property, while the KeyError is on the id inside the nested division object, division{id:, name:, url:}. Should this be resolved by schema flattening, or am I misinterpreting the error?
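For anyone following along, here is a rough sketch of how double-underscore flattening would produce a key like division__id. The flatten helper and the sample record are hypothetical and written only for illustration; this is not the actual flattening code in target-oracle or the SDK. It suggests that an object nested one level deep gets the same flattened key at any max depth of 1 or more, which would be consistent with the depth setting not changing the error.

```python
# Hypothetical helper for illustration; not the target's real flattening code.
def flatten(record, max_depth, sep="__", _prefix="", _depth=0):
    """Flatten nested dicts into one level, joining key names with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{_prefix}{sep}{key}" if _prefix else key
        if isinstance(value, dict) and _depth < max_depth:
            # Descend one level and carry the parent key as a prefix.
            flat.update(flatten(value, max_depth, sep, name, _depth + 1))
        else:
            flat[name] = value
    return flat


# Made-up record shaped like the division object described above.
user = {"id": "u1", "division": {"id": "d1", "name": "Home"}}

flat_depth_1 = flatten(user, max_depth=1)
flat_depth_3 = flatten(user, max_depth=3)
unflattened = flatten(user, max_depth=0)
```

In this sketch flat_depth_1 and flat_depth_3 are identical and both contain division__id, while a max depth of 0 leaves the nested division object untouched.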
v
Hard to say for sure. I'd hope there's a stack trace along with that to help? I'd try turning off flattening just to see if it works for you.
r
@visch we ran it without flattening first and got the same error. We ran the whole thing in debug and didn't see much useful in the stack trace other than the above error. I will post a larger snippet when I get a chance.
@visch @Edgar Ramírez (Arch.dev) I tried simplifying this by adding a select to the purecloud tap, but it doesn't appear to work, so all entities are pulled no matter what. I also switched to target-csv to simplify, and that also fails, but the debug output is not clear. The only thing I can successfully load is JSONL, but the output for the users entity is very different from the schema defined in the tap-purecloud spec:
{
  "type": "object",
  "properties": {
    "email": {
      "type": "string"
    },
    "id": {
      "type": "string"
    },
    "name": {
      "type": "string"
    },
    "username": {
      "type": "string"
    }
  }
}
The error is
ValueError: dict contains fields not in fieldnames: 'selfUri', 'division__id', 'division__selfUri', 'businessUnit__id', 'businessUnit__selfUri'
It looks like the records the tap emits don't match the schema the tap itself defines, so every loader fails except JSONL, which just writes the stream as is.
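A quick way to check this kind of mismatch is to compare the keys of an emitted record against the schema's declared properties. The sketch below uses the schema posted above, but the record values and the undeclared_fields helper are made up for illustration. It also reproduces the same kind of ValueError with Python's csv.DictWriter, whose default behavior is to reject keys that are not in fieldnames. That the CSV loader writes rows this way is an assumption based on the wording of the error message.

```python
import csv
import io

# Properties declared in the users schema posted above.
schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string"},
        "id": {"type": "string"},
        "name": {"type": "string"},
        "username": {"type": "string"},
    },
}

# Made-up record: the values are placeholders, the extra keys come from the error.
record = {
    "id": "u1",
    "name": "Example User",
    "selfUri": "placeholder-uri",
    "division__id": "d1",
}


def undeclared_fields(record, schema):
    """Return the record keys that the schema's properties do not declare."""
    declared = set(schema["properties"])
    return sorted(key for key in record if key not in declared)


extras = undeclared_fields(record, schema)

# csv.DictWriter raises ValueError on unknown keys (extrasaction="raise" is the default).
writer = csv.DictWriter(io.StringIO(), fieldnames=list(schema["properties"]))
try:
    writer.writerow(record)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Here extras lists the keys the schema never declared, and error_message holds the DictWriter complaint about fields not in fieldnames. Filtering each record down to the declared properties, or extending the schema to cover the extra fields, are the two obvious directions for a fix in a fork.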
e
I think it's unfortunately a known issue: https://github.com/Pathlight/tap-purecloud/pull/17
r
thanks. I commented on the PR. It looks like there is a solution there and no one has reviewed it.
I'm not sure if anyone is reviewing that repo anymore. I might need to fork and have one of my junior developers put in the fix