re: Data Deanonymization, follow-up to <https://melta...
# troubleshooting
Been doing a lot of thinking about what the "right" way is to maintain taps/targets for the long term. End goal: tested taps/targets that all work together perfectly 🙂

One of the biggest challenges that's been brought up multiple times is "simply" the data: the data that comes out of your source system is unique to that instance. One part of the solution, at least for this situation, would be getting a group of people who use tap-quickbooks to offer up their anonymized data so that we could test against it.

Anonymized data, defined: parse every record coming out of a tap and generate new data that follows the schema and "shape" of the existing data. An MVP of this would be pretty simple: if the schema of record A is `{string, maxLength: 7000}`, generate a random string of 7000 characters in place of the actual data coming out of tap-quickbooks (see the sketch below).

Why go through the trouble? Situations like this could be tested against without needing to see anyone's proprietary data, and you could run test suites against large amounts of anonymized data to be sure that changes to targets end up with the results you're looking for.
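A minimal sketch of that MVP, assuming Singer-style records paired with a JSON Schema that has `properties`, `type`, and `maxLength`. The function name `deidentify_record` and the QuickBooks-ish field names in the usage example are hypothetical, not anything from an existing tap:

```python
import random
import string


def deidentify_record(record: dict, schema: dict) -> dict:
    """Replace each value with synthetic data matching the field's
    JSON Schema type and length, preserving the record's shape."""
    fake = {}
    for field, value in record.items():
        props = schema.get("properties", {}).get(field, {})
        ftype = props.get("type", "string")
        # JSON Schema allows a list of types, e.g. ["null", "string"]
        if isinstance(ftype, list):
            ftype = next((t for t in ftype if t != "null"), "string")

        if value is None:
            fake[field] = None
        elif ftype == "string":
            # MVP: random string of the same length, capped by maxLength if present
            length = min(len(value), props.get("maxLength", len(value)))
            fake[field] = "".join(
                random.choices(string.ascii_letters + string.digits, k=length)
            )
        elif ftype == "integer":
            # random integer with the same number of digits
            fake[field] = random.randint(0, 10 ** len(str(abs(int(value)))))
        elif ftype == "number":
            fake[field] = random.uniform(0, abs(float(value)) or 1.0)
        elif ftype == "boolean":
            fake[field] = random.choice([True, False])
        else:
            # objects/arrays: just blank them out for the MVP
            fake[field] = type(value)() if isinstance(value, (dict, list)) else value
    return fake


# Hypothetical usage with a QuickBooks-shaped record
schema = {"properties": {"CustomerName": {"type": ["null", "string"], "maxLength": 7000}}}
record = {"CustomerName": "Acme Plumbing LLC"}
print(deidentify_record(record, schema))  # e.g. {'CustomerName': 'kX3fTq9ZbL2mWcQr0'}
```

The "shape" is preserved (same fields, same types, same lengths) while the values themselves are throwaway, so a shared corpus built this way could be run through targets without exposing anyone's real data.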
I"m guessing my idea here isn't new, anyone have pointers to people who have tried things around this? I think it's going to have to be a part of the tap /target service solution I'm going with anyway. Part of maintaining your tap/target includes making sure your data will continue to work or something along those lines
TL;DR of everything above: test against actual data by storing a close, "de-identified" copy of it. Everything would work perfectly then 😉