# singer-targets
I have a target-mssql and I want to know the number of rows in the source stream, so I can check that the data in the target is close to accurate. In tap-oracle the catalog has a row-count populated for each stream. I'd like to get that value and use it to see whether my target has received those records properly. There are issues with the row-count in the catalog being out of sync with new data in the tap, so I need to be careful not to assert source.rowcount == target.rowcount. But if target.rowcount is only 50% of source.rowcount, I'd like to throw an error and not continue. The closest write-ups on this I can find are https://www.stitchdata.com/docs/replication/deleted-record-handling#full-table and the metadata columns in a target like https://github.com/transferwise/pipelinewise-target-postgres
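A rough sketch of the threshold check described above, in Python. It assumes the catalog stores `row-count` in the stream-level metadata entry (the one with an empty breadcrumb), which is worth verifying against the catalog your tap actually emits. The helper names `source_row_count` and `check_row_count`, the `RowCountMismatch` exception, and the 0.5 default are all made up for illustration and are not part of any Singer library.

```python
class RowCountMismatch(Exception):
    """Raised when the target holds suspiciously few rows (hypothetical)."""


def source_row_count(catalog, stream_name):
    # Assumption: row-count lives in the metadata entry whose breadcrumb is [].
    for stream in catalog.get("streams", []):
        names = (stream.get("stream"), stream.get("tap_stream_id"))
        if stream_name not in names:
            continue
        for entry in stream.get("metadata", []):
            if entry.get("breadcrumb") == []:
                return entry.get("metadata", {}).get("row-count")
    return None


def check_row_count(source_count, target_count, min_ratio=0.5):
    # The catalog count can be stale, so compare against a ratio rather
    # than requiring source_count == target_count.
    if not source_count or source_count <= 0:
        return None  # nothing reliable to compare against
    ratio = target_count / source_count
    if ratio <= min_ratio:
        raise RowCountMismatch(
            f"target has {target_count} rows, source reported {source_count}"
        )
    return ratio


# Hand-written catalog fragment, for illustration only.
catalog = {
    "streams": [
        {
            "stream": "ORDERS",
            "tap_stream_id": "HR-ORDERS",
            "metadata": [{"breadcrumb": [], "metadata": {"row-count": 1000}}],
        }
    ]
}

expected = source_row_count(catalog, "ORDERS")
check_row_count(expected, 980)  # passes: target is close to the source count
```

The target-side count would come from something like a `SELECT COUNT(*)` after the load. That part is left out because it depends on how the target is wired up.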
I had a case in production where my table only had 50% of the data due to a random error (the source connection dropped).
The ultimate problem is a Singer architecture question: how do you know a tap is done pushing its data? I'm guessing this has been thought about a ton and I'm just not finding the write-ups?
Seems like the answer is metadata columns, right? _SDC_EXTRACTED_AT and _SDC_BATCHED_AT. Have your target do upserts, and then let your downstream apps decide whether something is deleted or not.
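A minimal sketch of that upsert-plus-metadata idea, using an in-memory SQLite database as a stand-in for the real target (it needs SQLite 3.24+ for `ON CONFLICT ... DO UPDATE`). The `users` table, the helper names, and the timestamps are all hypothetical. An actual target-mssql would more likely use `MERGE` or an equivalent.

```python
import sqlite3


def upsert_records(conn, records, extracted_at):
    # Insert new rows, or refresh existing ones and bump _SDC_EXTRACTED_AT.
    conn.executemany(
        """
        INSERT INTO users (id, name, _SDC_EXTRACTED_AT)
        VALUES (?, ?, ?)
        ON CONFLICT(id) DO UPDATE SET
            name = excluded.name,
            _SDC_EXTRACTED_AT = excluded._SDC_EXTRACTED_AT
        """,
        [(r["id"], r["name"], extracted_at) for r in records],
    )
    conn.commit()


def stale_ids(conn, run_started_at):
    # Rows not touched by the latest run: possibly deleted at the source,
    # or possibly missed because the run was cut short.
    rows = conn.execute(
        "SELECT id FROM users WHERE _SDC_EXTRACTED_AT < ? ORDER BY id",
        (run_started_at,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, _SDC_EXTRACTED_AT TEXT)"
)

# First run sees two rows; second run only sees id 1.
upsert_records(
    conn,
    [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}],
    "2024-01-01T00:00:00Z",
)
upsert_records(conn, [{"id": 1, "name": "ann-updated"}], "2024-01-02T00:00:00Z")

print(stale_ids(conn, "2024-01-02T00:00:00Z"))
```

The stale list alone can't tell a deleted row from a run that dropped halfway, which is why the downstream app (or a row-count sanity check) still has to make the final call.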