# singer-tap-development
j
Has anyone built a singer tap that works with large binary files? H1 is looking for some tips and tricks. We have large PCAP files that we’re currently transforming to JSON and are looking to rewrite it to a tap. Thanks! /cc @martijn_wouters @maarten_van_gijssel
e
That’s interesting! https://github.com/BuzzCutNorman/tap-stackoverflow-sampledata is the closest that comes to mind, though its files are XML rather than binary. The trick would be to read the file in chunks if possible. Also, since binary formats are tricky, try to get the stream schema using an appropriate library. For example, tap-dbf uses `dbfread` to get the fields and map them to the right JSON type.
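To make the "read in chunks" idea concrete, here is a rough sketch of streaming a PCAP file one packet at a time and emitting Singer SCHEMA and RECORD messages. It is only an illustration under a few assumptions: the file is a classic little-endian libpcap capture with microsecond timestamps (not pcapng), the stream name `packets` and the helpers `iter_packets` and `write_messages` are made up for this example, and the payload is passed through as base64 rather than decoded. A real tap would more likely be built on the Meltano SDK and a proper PCAP library.

```python
import base64
import json
import struct
import sys

# Classic libpcap layout (assumed): 24-byte file header, then per packet
# a 16-byte header followed by incl_len bytes of captured data.
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
PCAP_PACKET_HEADER = struct.Struct("<IIII")
PCAP_MAGIC_LE = 0xA1B2C3D4  # little-endian, microsecond timestamps

# Hypothetical stream name and schema, chosen for this sketch only.
STREAM = "packets"
SCHEMA = {
    "type": "object",
    "properties": {
        "ts_sec": {"type": "integer"},
        "ts_usec": {"type": "integer"},
        "orig_len": {"type": "integer"},
        "data_b64": {"type": "string"},
    },
}


def iter_packets(fileobj):
    """Yield one dict per packet, reading only one packet into memory at a time."""
    header = fileobj.read(PCAP_GLOBAL_HEADER.size)
    if len(header) < PCAP_GLOBAL_HEADER.size:
        raise ValueError("file too short to be a pcap capture")
    if PCAP_GLOBAL_HEADER.unpack(header)[0] != PCAP_MAGIC_LE:
        raise ValueError("unsupported pcap variant for this sketch")

    while True:
        raw = fileobj.read(PCAP_PACKET_HEADER.size)
        if len(raw) < PCAP_PACKET_HEADER.size:
            return  # clean end of file (or a truncated trailing header)
        ts_sec, ts_usec, incl_len, orig_len = PCAP_PACKET_HEADER.unpack(raw)
        payload = fileobj.read(incl_len)
        if len(payload) < incl_len:
            return  # truncated final packet; stop rather than emit bad data
        yield {
            "ts_sec": ts_sec,
            "ts_usec": ts_usec,
            "orig_len": orig_len,
            # JSON cannot carry raw bytes, so the payload is base64-encoded.
            "data_b64": base64.b64encode(payload).decode("ascii"),
        }


def write_messages(fileobj, out=sys.stdout):
    """Write one Singer SCHEMA message, then one RECORD message per packet."""
    schema_msg = {
        "type": "SCHEMA",
        "stream": STREAM,
        "schema": SCHEMA,
        "key_properties": [],
    }
    out.write(json.dumps(schema_msg) + "\n")
    for record in iter_packets(fileobj):
        out.write(json.dumps({"type": "RECORD", "stream": STREAM, "record": record}) + "\n")
```

Calling `write_messages(open("capture.pcap", "rb"))` (path is a placeholder) would stream the file to stdout without ever holding more than one packet in memory, which is the main point for large captures.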
a
@ken_payne's Tableau tap comes to mind also. That takes a single file and treats it as a set of related data streams.
j
Sweet, thanks both! We’ll check those out.
p
@aaronsteers I think you meant to link to https://hub.meltano.com/extractors/tap-tableau--tailsdotcom/. The tails variant specifically 😄
a
Yes, that's correct. Thanks, @pat_nadolny! I didn't even realize (until now) we have more than one implementation of `tap-tableau` 🤯 😅
v
https://gitlab.com/autoidm/tap-oracle/-/merge_requests/8/diffs implements CLOBs and BLOBs for tap-oracle. Might not be super helpful but 🤷