# getting-started
b
Hi! For incremental loads via `INCREMENTAL` or `BIN_LOG`, is it possible to create a transient kind of table that stores only the delta retrieved between two fetches? For example, if the first run loaded 5,000 records at 12:00 AM and the second run detects changes to 10,000 records at 1:00 AM, is it possible to store only those 10,000 records somewhere for analysis or partial processing?
t
I have two thoughts here:
1. Run a second pipeline with the same tap but with the jsonl target. If you set the initial state correctly, you could get both pipelines to process the same data.
2. If your target is a database, you can identify the records inserted/updated in each batch by the timestamp column set by the loader (assuming you're using a loader that sets one, anyway, like the pipelinewise targets do).
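To illustrate the second thought: pipelinewise-style targets stamp each loaded row with an `_sdc_batched_at` metadata column, so the latest delta is just the rows whose batch timestamp matches the most recent batch. A minimal sketch using SQLite as a stand-in warehouse (the `orders` table and its data are hypothetical; check your loader's docs for the exact metadata column it writes):

```python
import sqlite3

# Stand-in warehouse table; pipelinewise-style targets add an
# _sdc_batched_at metadata column to every row they load.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        amount REAL,
        _sdc_batched_at TEXT
    )
""")

# First run at 12:00 AM loaded some rows; second run at 1:00 AM loaded more.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2023-01-01T00:00:00"),
        (2, 20.0, "2023-01-01T00:00:00"),
        (3, 30.0, "2023-01-01T01:00:00"),
        (4, 40.0, "2023-01-01T01:00:00"),
    ],
)

# Copy only the latest batch (the delta) into a transient table for analysis.
conn.executescript("""
    CREATE TABLE orders_delta AS
    SELECT * FROM orders
    WHERE _sdc_batched_at = (SELECT MAX(_sdc_batched_at) FROM orders);
""")

delta_ids = [row[0] for row in conn.execute("SELECT id FROM orders_delta ORDER BY id")]
print(delta_ids)  # only the rows from the 1:00 AM batch
```

You could rebuild `orders_delta` after every pipeline run, or keep one table per batch timestamp if you need the history of deltas.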