# getting-started
b
Are there good examples of singer taps that are centered around standard unix utils via bash wrapped in minimal python? I can download a file from an sftp, decompress, and decrypt via gpg in under 100 characters without ever writing a temp file to disk using standard unix cli tools, and I am hesitant to write a bunch of python to replicate that. I also want all the state and singer ndjson transformations that the python libs provide. Has anyone seen something that seems similar?
m
You could probably still invoke those shell commands from Python, but then the trick becomes making sure the dependencies around those shell commands are in place on the system running the tap (Is sftp / gpg / etc. installed? What if we're running on a different environment / OS that doesn't have your shell of choice?)
Finding Python libraries that perform those tasks instead solves the dependency issue easily and also makes the tap more platform independent. Hopefully there are some Python libraries already out there that would save you the work!
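If the shell route is taken anyway, the dependency concern above could be handled by failing fast when a required tool is missing. A minimal sketch, assuming the tap needs `curl` and `gpg` on `PATH` (the tool list and the function names are hypothetical, chosen only for illustration):

```python
import shutil

# Assumption: these are the CLI tools the tap shells out to.
REQUIRED_TOOLS = ["curl", "gpg"]


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def check_environment(tools=REQUIRED_TOOLS):
    """Raise a clear error before the tap starts if a tool is absent."""
    missing = missing_tools(tools)
    if missing:
        raise RuntimeError(
            "this tap needs these commands on PATH: " + ", ".join(missing)
        )


if __name__ == "__main__":
    check_environment()
    print("all required tools found")
```

This does not make the tap portable, but it turns a confusing mid-run failure into an explicit statement of what environment the tap requires.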
b
The libraries for sftp and gpg aren't great and are already coupled to the environment. That's why I'm looking for a way to leverage simple, reliable, existing cli tools. Environments are easily controlled via docker.
m
It's a fair point, though still not everyone runs in Docker. I'm running Meltano in Docker in production but I just as easily do my development and testing from my Windows work-issued computer natively. It's definitely still possible but you'd have to make it clear what environment your tap requires.
b
The docs for Singer seem to imply that it's a spec that is typically implemented in Python, but that it could be any language. I’m hoping to find some examples of alternatives. I have had good success in the past using Python to transform a stream of data exactly the way Singer works.
Almost like a tap that reads from stdin:
<any bash command that outputs to stdout> | tap-csv | …
But that won't work, because a tap fed from stdin has no way to know about or manage state.
It's just infinitely simpler to do:
curl sftp://… | gpg -d > decrypted_file.txt
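One way this might work is a thin Python wrapper that launches the shell pipeline itself, so the tap owns the state while the Unix tools do the transfer and decryption. The following is only a sketch under assumptions: `stream_lines` and `run_tap` are hypothetical names, each output line is treated as one record with a single `line` field, the `lines_read` bookmark is a placeholder for whatever state the real tap would track, and `bash` is assumed to be installed.

```python
import json
import subprocess
import sys


def stream_lines(command):
    """Run `command` through bash and yield its stdout line by line.

    Nothing is written to a temp file; data flows through pipes only.
    """
    proc = subprocess.Popen(
        # pipefail makes a failure in any stage of the pipeline visible.
        ["bash", "-c", f"set -o pipefail; {command}"],
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in proc.stdout:
        yield line.rstrip("\n")
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError(f"pipeline exited with status {proc.returncode}")


def emit(message, out):
    """Write one Singer message as a single line of JSON."""
    out.write(json.dumps(message) + "\n")


def run_tap(command, stream="lines", out=sys.stdout):
    """Emit SCHEMA, one RECORD per output line, then a final STATE."""
    emit({
        "type": "SCHEMA",
        "stream": stream,
        "key_properties": [],
        "schema": {"type": "object",
                   "properties": {"line": {"type": "string"}}},
    }, out)
    count = 0
    for count, line in enumerate(stream_lines(command), start=1):
        emit({"type": "RECORD", "stream": stream,
              "record": {"line": line}}, out)
    # Hypothetical bookmark; a real tap might store a filename or timestamp.
    emit({"type": "STATE",
          "value": {"bookmarks": {stream: {"lines_read": count}}}}, out)


if __name__ == "__main__":
    # The shell pipeline is passed as the first command-line argument.
    run_tap(sys.argv[1])
```

The pipeline from the message above (without the redirect to a file) would be passed in as the `command` string, and the wrapper's stdout piped to a target. The STATE message is only written after the pipeline exits cleanly, so a failed download does not advance the bookmark.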
m
Yeah, there is nothing specific to Singer that requires Python, though the SDK does make it a lot quicker to develop. I took a custom-written script and converted it to work with Singer/Meltano by just replacing all the "write to DB" parts with "print Singer formatted record to stdout".
This particular script was pulling data from an incredibly unorthodox source, which made it easier to just do it myself rather than use the SDK, which assumes a somewhat well-behaved API 😅
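The "replace the DB write with a print" conversion described above might look roughly like this. It is an illustrative sketch, not the script being described: `singer_messages` is a hypothetical helper, rows are assumed to be plain dicts, and the schema is deliberately permissive (each property accepts any JSON value).

```python
import json
import sys


def singer_messages(stream, rows, key_properties=()):
    """Turn dict rows into Singer messages: one SCHEMA, then RECORDs."""
    rows = list(rows)
    # Collect every key seen in any row; {} is a schema that accepts anything.
    properties = {key: {} for row in rows for key in row}
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "key_properties": list(key_properties),
        "schema": {"type": "object", "properties": properties},
    }
    for row in rows:
        yield {"type": "RECORD", "stream": stream, "record": row}


if __name__ == "__main__":
    # Where the original script would have written to a database,
    # print each message as one line of JSON instead.
    example_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    for message in singer_messages("example", example_rows, ["id"]):
        sys.stdout.write(json.dumps(message) + "\n")
```

Buffering all rows to build the schema is a shortcut for small sources; a script that already knows its columns could emit the SCHEMA message up front and stream the records.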
b
I have a rough idea of how it would work, but I wanted to see if anyone had seen anything similar before.