# troubleshooting
n
I'm building a pretty large job pulling all records out of various tables in SFDC. I am doing full-refresh for now. I've looked for this answer in a few places but haven't figured it out.
Expected behavior:
• Batches of rows from Salesforce.Account get appended, so that at the end of the day I get 1 CSV with every row from SFDC Account.
Actual behavior:
• Each batch from Account overwrites the file, so my final .csv output is simply the final batch, and I lose all the other records.
What is the method to have
meltano run tap-salesforce target-csv --full-refresh
give me a CSV with every row for a given object when there are multiple batches?
e
Hey @Nathan Sooter 👋 It's a bit of an overloaded term here, so what is a batch in this context?
n
Hey Edgar -- in this case, it's referring to the chunk of rows that Meltano says it is writing to the CSV
it will say things like
Writing 137033 records to file..
e
Ah, I see. Every record batch is indeed overwriting the file 🤦‍♂️
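To make the symptom concrete, here's a minimal sketch of it in plain Python (illustrative only, not target-csv's actual code): reopening the output in write mode on every batch truncates the file, so only the last batch survives.
```python
import csv

# Minimal repro of the symptom (illustrative only, not target-csv's actual code):
# each batch reopens the output in write mode, which truncates earlier batches.
def write_batch_overwriting(path, rows):
    with open(path, "w", newline="") as f:        # "w" truncates the file every time
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)

batches = [
    [{"Id": "001", "Name": "Acme"}],
    [{"Id": "002", "Name": "Globex"}],
]
for batch in batches:
    write_batch_overwriting("Account.csv", batch)

with open("Account.csv", newline="") as f:
    print(sum(1 for _ in csv.DictReader(f)))      # prints 1: only the last batch survived
```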
n
Right 😕 I'll get, say, 200,000+ rows out of SFDC successfully, but my final file will only be 15,000 rows because that's how big the final little batch was. I'm curious how to work around this?
e
The quick workaround would be to increase the time between batches so it's long enough to process all records in a single batch. The target SDK doesn't expose that setting yet, so I've opened https://github.com/MeltanoLabs/target-csv/pull/195.
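The general shape of the fix is to write the header once and then append on later batches; sketched in plain Python below (a hypothetical illustration, not the actual diff in that PR).
```python
import csv

_written_paths = set()  # illustrative per-run state: files that already got a header

def write_batch_appending(path, rows):
    first = path not in _written_paths
    with open(path, "w" if first else "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        if first:
            writer.writeheader()                  # header only with the first batch
        writer.writerows(rows)
    _written_paths.add(path)
```
With that, the final row count matches the sum of all the "Writing N records to file.." batches.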
Ok, can you reinstall with
meltano install --clean
and try again?
That might fail 😬. I hadn't merged the PR yet.
n
No worries, I can restart!
It worked!
Validated against prior data and things look good too. Thank you so much for the quick PR, this is a lifesaver for the project I'm working on. I think it'll be helpful for others, too!
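If anyone else wants to run a similar check, counting the data rows in the output and comparing against the Salesforce total is enough; something like this (the path and expected count are placeholders for your own run):
```python
import csv

# Quick sanity check: count data rows in the output CSV and compare with the
# total Salesforce reports (path and expected count below are placeholders).
path = "output/Account.csv"
expected = 200_000

with open(path, newline="") as f:
    actual = sum(1 for _ in csv.DictReader(f))

print(f"{actual} rows in {path} (expected roughly {expected})")
```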
e
Thanks for reporting! I think for quick runs with small volumes it was working correctly, so it would've been hard to catch otherwise.