narrow-analyst-73746
01/11/2021, 6:40 PM
state. I have read the Docs but I can’t seem to understand how it works. In my meltano.yml file I have added the below for the Event stream.
Event:
  replication-method: INCREMENTAL
  replication-key: CreatedDate
When I run my tap for the first time, I get
bash-4.2# meltano elt tap-salesforce-soap target-csv --catalog /artemis/configs/salesforce/event_properties.json
meltano | Running extract & load...
meltano | No state was found, complete import.
meltano | Found catalog in /artemis/configs/salesforce/event_properties.json
...
...
meltano | Incremental state has been updated at 2021-01-11 18:24:51.932253.
meltano | Extract & load complete!
which is expected since I didn’t have a state for this stream yet, right? But now it says the state has been updated to 2021-01-11 18:24:51.932253, so the next time I run, it should start from 2021-01-11 18:24:51.932253.
However, when I run this tap again, and I want to dump the state file to see its values, I get the below and the job is killed.
bash-4.2# meltano elt tap-salesforce-soap target-csv --catalog /artemis/configs/salesforce/event_properties.json --dump=state
[2021-01-11 18:25:06,496] [59|MainThread|meltano.core.plugin.singer.tap] [WARNING] No state was found, complete import.
[2021-01-11 18:25:06,496] [59|MainThread|meltano.core.plugin.singer.tap] [INFO] Found catalog in /artemis/configs/salesforce/event_properties.json
Could not find state file for this pipeline
If I re-run the tap, it fetches everything from the start again (so it looks like there’s no state…)
Am I missing something?
Thanks!
ripe-musician-59933
01/11/2021, 6:53 PM
--job_id on meltano elt! Per https://meltano.com/docs/command-line-interface.html#elt:
To allow subsequent pipeline runs with the same extractor/loader/transform combination to pick up right where the previous run left off, each ELT run has a Job ID that is used to store and look up the incremental replication state in the system database. If no stable identifier is provided using the --job_id flag or the MELTANO_JOB_ID environment variable, extraction will always start from scratch and a one-off Job ID is automatically generated using the current date and time.
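As a rough sketch of what that could look like for the command in the question: the wrapper function name run_event_elt and the job ID salesforce-soap-to-csv below are made up for illustration, and any stable string should do as the ID. The point is only that every run passes the same --job_id, so state saved by one run can be found by the next.

```shell
# Hypothetical wrapper (name and job ID are illustrative, not from the thread).
# It always passes the same --job_id so incremental state is stored and
# looked up under one stable identifier.
run_event_elt() {
  local job_id="salesforce-soap-to-csv"   # assumed name; pick any stable string
  local catalog="/artemis/configs/salesforce/event_properties.json"
  meltano elt tap-salesforce-soap target-csv \
    --catalog "$catalog" \
    --job_id="$job_id" "$@"
}

# Usage (not executed here):
#   run_event_elt                # first run: no state yet, so a complete import
#   run_event_elt                # later runs: should pick up the saved state
#   run_event_elt --dump=state   # inspect the state stored under this job ID
```

Per the docs quoted above, setting the MELTANO_JOB_ID environment variable to the same value should work in place of the flag.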
cuddly-king-83679
01/11/2021, 7:31 PM
WHERE SystemModstamp >= 2020-12-21T19:39:00.000000Z to queries to grab only changed records since my last run.
I used the replication-method and replication-key on my postgres databases, but I didn’t have to do anything to my Salesforce config to make incremental work.
extractors:
- name: tap-salesforce
  variant: meltano
  pip_url: git+https://gitlab.com/meltano/tap-salesforce.git
  config:
    api_type: BULK
    client_id: xxxxxxxx
    start_date: '2019-01-01T00:00:00Z'
    username: xxxxxx
  select:
  - Task.*
  - Account.*
  - User.*
  - Contact.*
  - Lead.*
  - Opportunity.*
  - OpportunityHistory.*