# troubleshooting
m
Good evening/morning, I'm sending some data from tap-mongodb to target-snowflake. It accepts a smaller set fine, but when I send the whole set I have (3k records) I get the following error (attached). There's no indication of what the issue might be. My config looks something like this:
environments:
- name: dev
  config:
    plugins:
      extractors:
      - name: tap-mongodb
        config:
          strategy: envelope
          mongo:
            host: mongodb://localhost:27017/
        select:
        - t_ent_393_entity_safety-report.*
      loaders:
      - name: target-snowflake
        config:
          account: XXX.eu-west-2.aws
          database: DEV
          schema: NET
          user: MELTANO
          warehouse: ELT
          role: DATALOADER
          default_target_schema: NET
          file_format: DEV.NET.MELTANO_CSV
          hard_delete: false
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-mongodb
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/tap-mongodb.git
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
  - name: target-snowflake
    variant: meltanolabs
    pip_url: meltanolabs-target-snowflake
u
Hi @marcin_wojciechowski! Does `meltano invoke tap-mongodb` also fail?
m
OK, I identified the problem. It's some funky Unicode characters in the source data (mongo). When I export it to a file I can see the following hex values (attached). As soon as I remove this record, things progress. We need to be able to support all Unicode characters. How can we achieve this?
To answer your previous question, `meltano invoke tap-mongodb` prints all the data to the console fine; however, running `meltano run tap-mongodb target-jsonl` fails with:
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1697: character maps to <undefined>
Just to be clear, I'm running it on Windows, and I believe the issue is that the intermediate file is opened with the wrong encoding (cp1252 instead of UTF-8), as per this thread: https://stackoverflow.com/questions/27092833/unicodeencodeerror-charmap-codec-cant-encode-characters. Does it sound like a bug that needs reporting?
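For anyone who wants to see the failure outside of Meltano, here is a minimal sketch of the same class of error in plain Python. It only shows that the character from the traceback (U+FFFD) has no mapping in cp1252 but encodes fine as UTF-8; it is not the code path Meltano takes. The `try_encode` helper and the `record` string are made up for illustration.

```python
def try_encode(text: str, encoding: str) -> str:
    """Return "ok" if text can be encoded with the given codec, else the error message."""
    try:
        text.encode(encoding)
        return "ok"
    except UnicodeEncodeError as exc:
        return str(exc)


# Hypothetical record containing U+FFFD, the character named in the traceback.
record = "report \ufffd"

print(try_encode(record, "cp1252"))  # fails: cp1252 has no mapping for U+FFFD
print(try_encode(record, "utf-8"))   # succeeds: UTF-8 can encode any code point
```

The cp1252 call reports a 'charmap' codec error of the same kind as the one above, which fits the theory that a stream somewhere in the pipeline is using the Windows default code page.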
Aaaaand it's solved by setting the following env var: PYTHONIOENCODING=utf-8
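A rough way to check what that variable does: the sketch below starts a child Python process with `PYTHONIOENCODING` set and asks it which encoding its stdout uses. This assumes the tap and target run as Python subprocesses that inherit the environment; the `child_stdout_encoding` helper is hypothetical and not part of Meltano.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env: dict) -> str:
    """Start a child Python process and return the encoding it reports for stdout."""
    env = {**os.environ, **extra_env}  # inherit the current environment, then override
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# With the variable set, the child's stdout uses UTF-8 regardless of the code page.
print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Without the override, a child whose stdout is a pipe on Windows would typically fall back to the locale code page (often cp1252), which matches the behaviour seen here.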
e
Ah, that's a useful find. Thanks @marcin_wojciechowski!