Hi. Is there any possibility to extract info from ...
# troubleshooting
m
Hi. Is there any way to extract info from XML files? Has anyone done it before? I haven't seen any tap for that. Thank you ☺️
b
Yes, it is possible. I know of one, tap-stackoverflow-sampledata, but I bet there are more out there. Its XML layout is fixed for specific files, so it is more of an edge case, but it might give you some ideas to start with. There is also tap-universal-file, which is another one to check out.
m
Yeah, I saw the stackoverflow one. I was just hoping I would not have to write a custom tap, though. Will look into the universal file one. Thank you 🙏
Another idea was to run some other software before Meltano to transform the XML to JSON. Is there any way I can run system commands from meltano.yml?
Hi again, I got the XML-to-JSON transformation working right before running Meltano. I checked, and there are not a lot of options for extracting data from JSON. One of them is tap-file from airbyte (which I could not get to work due to Docker in Docker) and the other is tap-universal-file (which is throwing a very strange issue). Does anybody know any other alternative? Thank you.
v
tap-universal-file is pretty new; could you share what you're hitting? I could start a different thread about it
u
Also, tap-spreadsheets-anywhere can read json or jsonl. Check out the examples on the hub https://hub.meltano.com/extractors/tap-spreadsheets-anywhere#common-config-examples
u
Also if you need to run the first transformation step in a meltano schedule then a hack is to execute it as a command pointing to a python/bash script off another plugin https://meltano.slack.com/archives/C01TCRBBJD7/p1685613727607619. I also wonder if you could register a dummy plugin and add commands 🤔
u
utilities:
  - name: py_script
    namespace: py_script
    commands:
      test:
        executable: python
        args: my_script.py
This would work like meltano install utility py_script then meltano invoke py_script:test. Or in your case, with a command named xml_to_json defined in place of test, meltano run py_script:xml_to_json tap-x target-y
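The thread never shows what my_script.py contains, so here is a minimal sketch of what an XML-to-JSONL step could look like. It assumes a flat layout in which every record is one repeated element (the tag name is passed in) carrying attributes and simple child elements. The function names and the command-line arguments are made up for illustration and are not part of Meltano or any tap.

```python
import json
import sys
import xml.etree.ElementTree as ET


def element_to_record(elem):
    """Flatten one XML element into a dict: its attributes plus child-element text."""
    record = dict(elem.attrib)
    for child in elem:
        record[child.tag] = (child.text or "").strip()
    return record


def xml_to_jsonl(xml_data, row_tag):
    """Return one JSON line per <row_tag> element found anywhere in the document."""
    root = ET.fromstring(xml_data)
    return "\n".join(json.dumps(element_to_record(e)) for e in root.iter(row_tag))


# Hypothetical usage: python my_script.py input.xml row > output.jsonl
# Does nothing when called without exactly two arguments.
if __name__ == "__main__" and len(sys.argv) == 3:
    src, row_tag = sys.argv[1], sys.argv[2]
    with open(src, "rb") as fh:  # bytes let the parser honour the declared encoding
        print(xml_to_jsonl(fh.read(), row_tag))
```

Nested elements are not handled here; a child that itself has children would only contribute its own direct text. The resulting .jsonl file is the kind of input that tap-spreadsheets-anywhere is said above to be able to read.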
u
I just wrote up a little section in the docs to describe this https://github.com/meltano/meltano/pull/7911
m
Thank you both!