# singer-target-development
v
@pat_nadolny your presentation yesterday had me thinking about a "target-pandas": a simple Python template that would pull records from STDIN, load all the records matching the schema into a DataFrame, and then let the user do whatever they want with that data. It would let "pure Python" folks who don't want the data to bounce through a database (and whose data fits in memory) skip that hop. I currently do this pattern via
`meltano run tap-name target-postgres autoidm-utility`
where the utility just uses pandas to run
`select * from xyz`
on the table I wrote to. But really I only do this because I have a lot of other things I use the data for in the DB; the use case you were talking about wouldn't 🤷
Instead of that pattern it'd be
`meltano run tap-name target-pandas`
Not much different than writing a simple target like
`target-apprise`
really, except you'd have a full DataFrame available 🤷 meh, it sounds worse the more I think about it
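The "pull records from STDIN into a DataFrame" part of the idea is small enough to sketch. This is a minimal, hypothetical version (not an actual target-pandas implementation, and `records_to_frames` is a made-up helper name): it parses Singer messages line by line, buckets `RECORD` messages by stream, and builds one DataFrame per stream.

```python
import json
from collections import defaultdict

import pandas as pd


def records_to_frames(lines):
    """Collect Singer RECORD messages per stream into pandas DataFrames."""
    records = defaultdict(list)
    for line in lines:
        msg = json.loads(line)
        # Only RECORD messages carry row data; SCHEMA/STATE are skipped here.
        if msg.get("type") == "RECORD":
            records[msg["stream"]].append(msg["record"])
    return {stream: pd.DataFrame(rows) for stream, rows in records.items()}


# A real target would read sys.stdin; here we fake a tiny tap output.
messages = [
    '{"type": "SCHEMA", "stream": "users", "schema": {}, "key_properties": ["id"]}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "ada"}}',
    '{"type": "RECORD", "stream": "users", "record": {"id": 2, "name": "grace"}}',
    '{"type": "STATE", "value": {}}',
]
frames = records_to_frames(messages)
```

From there the "do whatever you want" step is just ordinary pandas on `frames["users"]`.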
p
This is an interesting idea - I've heard others talk about using pandas somewhere in the pipeline. I could imagine a pandas mapper being useful too: users could batch up a portion (or all) of the tap's records and run operations over the batch before writing it out. Although the original idea with mappers was to avoid aggregations, etc., so this kind of goes against that 🤷
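For the mapper variant, here's a rough sketch of the batching idea (this is not the singer-sdk mapper API, and `map_batch` is a hypothetical helper): buffer a batch of `RECORD` messages for one stream, run an arbitrary pandas transform over the batch, then re-emit the rows as Singer messages.

```python
import json

import pandas as pd


def map_batch(record_lines, transform):
    """Buffer RECORD messages, run a pandas transform over the batch,
    then re-emit the rows as Singer RECORD messages (sketch only)."""
    msgs = [json.loads(line) for line in record_lines]
    stream = msgs[0]["stream"]  # assumes a single-stream batch
    df = pd.DataFrame(m["record"] for m in msgs)
    df = transform(df)  # e.g. dedupe or derive columns over the whole batch
    return [
        json.dumps({"type": "RECORD", "stream": stream, "record": rec})
        for rec in df.to_dict(orient="records")
    ]


batch = [
    '{"type": "RECORD", "stream": "orders", "record": {"id": 1, "amount": 10}}',
    '{"type": "RECORD", "stream": "orders", "record": {"id": 1, "amount": 10}}',
    '{"type": "RECORD", "stream": "orders", "record": {"id": 2, "amount": 7}}',
]
out = map_batch(batch, lambda df: df.drop_duplicates())
```

The `drop_duplicates` transform is exactly the kind of whole-batch operation a per-record mapper can't do, which is also why it cuts against the original "no aggregations" design intent.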