# getting-started
i
Hello all, I was researching the record metadata. Is it correct to get the last SDC set by selecting the max sdc_table_version from the record metadata for target-postgres?
e
Hi @Isoctcolo. Are you trying to get the most recently inserted records, or are you trying to accomplish something else?
i
Hello @edgar_ramirez_mondragon, I want to get the records that came in with the last run. As far as I understand, the load is incremental. What I'm missing is documentation not about the fields themselves, but about how to use the data together with those fields. For example, I use Meltano to source data into our big warehouse, which is built on Data Vault modelling. Meltano now pushes data into my staging area in PostgreSQL, and I want to write a select that gets me the latest incoming data for further processing.
In the long run I want to integrate dbt and run everything inside Meltano to build and generate the complete data warehouse.
In general, I want to know how to get the last inserted, updated and deleted records based on the metadata.
e
Gotcha. I think @pat_nadolny may know more about data modeling approaches that leverage Singer metadata columns.
i
Perfect, thanks. Just a suggestion: maybe put an example in the documentation, because I couldn't find anything related to that.
p
@Isoctcolo these docs might help clarify a bit: https://sdk.meltano.com/en/latest/implementation/record_metadata.html
i
@pat_nadolny I saw that, but my question is how to leverage those record fields. It just explains what the record fields are. For example, if I want to get the changes from the last load, what do I have to select? Everything that has the new table_version, or everything that has the last sdc_received timestamp? And is that timestamp transaction time or real time?
p
I'm not totally sure for your case because it might depend on your tap/target, but there's a new `_sdc_sync_started_at` metadata field (https://github.com/meltano/sdk/pull/1878) that is consistent across the whole sync. I would filter by the max value of that field to get the most recent set of data. Previously, without this column, people (including me) were using `_sdc_extracted_at` and `_sdc_batched_at` to loosely do the same thing.
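A rough sketch of that "filter by the max value" idea, in case it helps. It is not taken from anyone's actual setup: the `customers` table, its `id` and `name` columns, and the helper `latest_sync_rows` are hypothetical, SQLite stands in for PostgreSQL so the snippet runs on its own, and the integer values for `_sdc_sync_started_at` are placeholders (the real column type depends on your target). The SELECT itself is plain SQL and should carry over to a Postgres staging table.

```python
import sqlite3


def latest_sync_rows(conn, table):
    """Return the rows written by the most recent sync.

    Hypothetical helper: `table` is interpolated into the SQL text,
    so only pass trusted, hard-coded table names.
    """
    query = f"""
        SELECT *
        FROM {table}
        WHERE _sdc_sync_started_at = (
            SELECT MAX(_sdc_sync_started_at) FROM {table}
        )
        ORDER BY id
    """
    return conn.execute(query).fetchall()


# Stand-in staging table (assumed layout, not a real target-postgres schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, name TEXT, _sdc_sync_started_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "alice", 1000),  # written by an earlier sync (placeholder value)
        (2, "bob", 2000),    # written by the latest sync
        (3, "carol", 2000),  # written by the latest sync
    ],
)

rows = latest_sync_rows(conn, "customers")
print(rows)
```

If your tap/target does not populate `_sdc_sync_started_at` yet, the same query shape with `_sdc_extracted_at` or `_sdc_batched_at` only gives a loose approximation, since those values are not guaranteed to be identical across one sync.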
i
Perfect @pat_nadolny, thanks a lot, this is exactly what I needed.