#announcements

narrow-analyst-73746

01/11/2021, 6:40 PM
Hey @ripe-musician-59933, I wish you a happy New Year, I hope you’re all good! 🙂 I have some issues with `state`. I have read the docs but I can’t seem to understand how it works. In my `meltano.yml` file I have added the below for the `Event` stream.
Event:
   replication-method: INCREMENTAL
   replication-key: CreatedDate
When I run my tap for the first time, I get
bash-4.2# meltano elt tap-salesforce-soap target-csv --catalog /artemis/configs/salesforce/event_properties.json

meltano             | Running extract & load...
meltano             | No state was found, complete import.
meltano             | Found catalog in /artemis/configs/salesforce/event_properties.json
...
...
meltano             | Incremental state has been updated at 2021-01-11 18:24:51.932253.
meltano             | Extract & load complete!
which is expected since I didn’t have a state for this stream yet, right? But now it says the state has been updated at `2021-01-11 18:24:51.932253`, so the next run should start from `2021-01-11 18:24:51.932253`. However, when I run this tap again and try to dump the state file to see its values, I get the below and the job is killed.
bash-4.2# meltano elt tap-salesforce-soap target-csv --catalog /artemis/configs/salesforce/event_properties.json --dump=state
[2021-01-11 18:25:06,496] [59|MainThread|meltano.core.plugin.singer.tap] [WARNING] No state was found, complete import.
[2021-01-11 18:25:06,496] [59|MainThread|meltano.core.plugin.singer.tap] [INFO] Found catalog in /artemis/configs/salesforce/event_properties.json
Could not find state file for this pipeline
If I re-run the tap, it fetches everything from the start again (so it looks like there’s no state…). Am I missing something? Thanks!

ripe-musician-59933

01/11/2021, 6:53 PM
@narrow-analyst-73746 It looks like you're missing a stable `--job_id` on `meltano elt`! Per https://meltano.com/docs/command-line-interface.html#elt:
To allow subsequent pipeline runs with the same extractor/loader/transform combination to pick up right where the previous run left off, each ELT run has a Job ID that is used to store and look up the incremental replication state in the system database. If no stable identifier is provided using the `--job_id` flag or the `MELTANO_JOB_ID` environment variable, extraction will always start from scratch and a one-off Job ID is automatically generated using the current date and time.
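As a rough sketch of what that could look like for the command above: the job ID `salesforce-event-to-csv` is a made-up name (any string should do as long as it stays the same between runs), and `run` with its `RUN` switch is a hypothetical helper that only prints each command unless `RUN=1` is set, so the snippet can be previewed outside a Meltano project.

```shell
# Hypothetical job ID: pick any name, but reuse it on every run of this pipeline.
JOB_ID="salesforce-event-to-csv"
CATALOG="/artemis/configs/salesforce/event_properties.json"

# Assemble the command once so every invocation shares the same job ID.
ELT="meltano elt tap-salesforce-soap target-csv --catalog $CATALOG --job_id=$JOB_ID"

# Hypothetical helper: prints the command by default, executes it when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

run $ELT               # first run: no state yet, so a complete import; state is saved under $JOB_ID
run $ELT --dump=state  # same job ID, so the saved state can be looked up and dumped
run $ELT               # later runs with the same job ID should pick up from the saved state
```

Going by the docs quoted above, exporting `MELTANO_JOB_ID` with the same value should work in place of the flag.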

cuddly-king-83679

01/11/2021, 7:31 PM
@narrow-analyst-73746 I did not have to set up incremental logic for Salesforce. I’m using the BULK API for Salesforce and I have verified in the log files that it automatically added `WHERE SystemModstamp >= 2020-12-21T19:39:00.000000Z` to queries to grab only the records changed since my last run. I used `replication-method` and `replication-key` on my Postgres databases, but I didn’t have to do anything to my Salesforce config to make incremental work.
My Salesforce setup:
extractors:
  - name: tap-salesforce
    variant: meltano
    pip_url: git+https://gitlab.com/meltano/tap-salesforce.git
    config:
      api_type: BULK
      client_id: xxxxxxxx
      start_date: '2019-01-01T00:00:00Z'
      username: xxxxxx
    select:
    - Task.*
    - Account.*
    - User.*
    - Contact.*
    - Lead.*
    - Opportunity.*
    - OpportunityHistory.*