# troubleshooting
d
Hi All, I’m wondering if it is possible to resume a job from the latest recorded state. A pod died in the middle of a job run, and the state recorded in the system db still shows the job status as running. I know I can pass a state file to a job run to start from the latest state, but I’m unsure how to interpret the state that was recorded in the system db. My state file would look like this:
{
  "singer_state": {
    "bookmarks": {
      "public-cost_usage_info": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638986978643,
        "xmin": null
      },
      "public-document_status": {
        "last_replication_method": "INCREMENTAL",
        "replication_key": "status_time",
        "version": 1638986979029,
        "replication_key_value": "2021-12-08T18:09:38.997474+00:00"
      },
      "public-export_job_status": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993360229,
        "xmin": null
      },
      "public-job_run_status": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993360541,
        "xmin": null
      },
      "public-post_step_info": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993361040,
        "xmin": null
      },
      "public-query_job_status": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993361335,
        "xmin": null
      },
      "public-step_info": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993361635,
        "xmin": 5978580
      }
    },
    "currently_syncing": "public-step_info"
  }
}
My question is focused on the `public-step_info` table. The pod died with this table marked as `currently_syncing`, but its replication method is still `FULL_TABLE`. Is the `version` field a timestamp of where it left off with the last recorded batch upload? Or is the `xmin` field used to calculate the last upload time? Will it resume from there? It is a large table and I would hate to have to do a full refresh, but I also don’t want to duplicate data.
Context: tap-postgres, target-snowflake.
Any help on the logic regarding state would be extremely helpful!
I was able to pass in the state file per the documentation, and it appears to have picked up from the right place. If there are any explanations of how the state logic works, that would still be appreciated! 😁
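For anyone else inspecting a recorded state before re-running, here is a minimal Python sketch that lists which streams look unfinished. It uses a trimmed copy of the state above (two of the seven bookmarks). The helper name `interrupted_streams` is made up for illustration and is not part of Meltano or tap-postgres. It also rests on an assumption worth checking against the tap-postgres source: that a non-null `xmin` on a `FULL_TABLE` stream marks a full-table sync that did not complete.

```python
import json

# Trimmed copy of the state from the post (two of the seven bookmarks).
STATE_JSON = """
{
  "singer_state": {
    "bookmarks": {
      "public-job_run_status": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993360541,
        "xmin": null
      },
      "public-step_info": {
        "last_replication_method": "FULL_TABLE",
        "version": 1638993361635,
        "xmin": 5978580
      }
    },
    "currently_syncing": "public-step_info"
  }
}
"""


def interrupted_streams(state):
    """Return stream names that look unfinished in a recorded state.

    Hypothetical helper: a stream is flagged if it is the one named in
    `currently_syncing`, or if its bookmark carries a non-null `xmin`
    (assumed here to indicate an incomplete full-table sync).
    """
    singer_state = state["singer_state"]
    current = singer_state.get("currently_syncing")
    flagged = []
    for name, bookmark in singer_state["bookmarks"].items():
        if name == current or bookmark.get("xmin") is not None:
            flagged.append(name)
    return flagged


state = json.loads(STATE_JSON)
print(interrupted_streams(state))  # ['public-step_info']
```

This only reads the file; it says nothing about what the tap will do with those values on the next run.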
e
Hi @drew_ipson! Thanks for sharing and letting us know the kludge worked for you. We certainly are aware that our current state docs are a bit lacking in detail. We do have an issue to improve that, so feel free to leave your 👍, comments, or even an MR to the docs (it would be nice to let people know of this workaround).
d
@edgar_ramirez_mondragon Thank you for this. I’d be happy to contribute. Do you know how the xmin field and version field are used to calculate where the tap should start?
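One thing that can be checked from the state file alone: the `version` values have the magnitude of Unix epoch timestamps in milliseconds. The sketch below decodes one under that assumption. The function name `version_to_utc` is made up for illustration. This does not show that `version` is a resume position, only when the value was generated if the epoch-milliseconds reading is right.

```python
from datetime import datetime, timedelta, timezone


def version_to_utc(version_ms):
    """Interpret a bookmark `version` as epoch milliseconds (an assumption)."""
    seconds, millis = divmod(version_ms, 1000)
    # Integer arithmetic avoids float rounding on the millisecond part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# `version` of public-document_status from the state file above.
print(version_to_utc(1638986979029).isoformat())
# 2021-12-08T18:09:39.029000+00:00
```

That lands within a fraction of a second of the `replication_key_value` recorded for the same stream (`2021-12-08T18:09:38.997474+00:00`), which is consistent with `version` being stamped around the time the stream's sync started rather than tracking per-batch progress.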