ilkka_peltola
06/01/2022, 11:50 AM
2022-06-01T11:41:35.176643Z [info ] time=2022-06-01 11:41:35 name=target_snowflake level=INFO message=Emitting state {"bookmarks": {"siren": {"replication_key_signpost": "2022-06-01T11:12:43.534679+00:00", "starting_replication_value": "2000-01-01T00:00:01", "progress_markers": {"Note": "Progress is not resumable if interrupted.", "replication_key": "dateDernierTraitementUniteLegale", "replication_key_value": "2008-09-20T04:50:47"}}}} cmd_type=loader job_id=sirene-prod-1 name=target-sf-transferwise run_id=29bdbab7-15b6-4335-9d47-3c5f170904ce stdio=stderr
So it is storing the progress marker, but why does it say as a note "Progress is not resumable if interrupted"? What do I need to change for it to be able to resume?
The API itself is a little quirky. I'll describe what is happening in my get_url_params in the thread, since I believe it could have something to do with that.
ilkka_peltola
06/01/2022, 11:57 AM
The API takes a q parameter that can be passed e.g. dateDernierTraitementUniteLegale:[2000-01-01T00:00:01 to 3000-01-01T00:00:01], which instructs the API to return all companies that were updated in that time period.
The API also accepts tri: dateDernierTraitementUniteLegale, so the results will actually be ordered by that field.
And the API accepts debut, an integer describing the starting point (0 = first result, 100 = 100th result).
nombre = results per page, max 1000
ilkka_peltola
06/01/2022, 11:57 AM
At first I kept the q parameter the same throughout and kept changing debut to traverse the pagination, but the API will not accept a debut larger than 10000.
So, I have to keep changing the q parameter instead.
What I'm doing now is checking the last record in my query result, taking the update time from that, and using it as the first date in the new query:
dateDernierTraitementUniteLegale:[2000-01-01T00:00:01 to 3000-01-01T00:00:01]
I keep debut at zero and nombre at 1000, and just keep updating the first date in the above query value.
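The date-window approach described above can be sketched roughly as follows. This is only an illustration, not the actual tap code: build_params and next_window_start are hypothetical helper names, and the range syntax inside q is copied from the message as written, so the exact form should be checked against the API documentation.

```python
from typing import Any, Dict, List, Optional

REPLICATION_KEY = "dateDernierTraitementUniteLegale"
DEFAULT_START = "2000-01-01T00:00:01"
WINDOW_END = "3000-01-01T00:00:01"
PAGE_SIZE = 1000  # nombre: the maximum page size mentioned in the thread


def build_params(window_start: Optional[str] = None) -> Dict[str, Any]:
    """Build the query parameters for one request (hypothetical helper)."""
    start = window_start or DEFAULT_START
    return {
        # Range syntax copied from the message above; verify it against the API docs.
        "q": f"{REPLICATION_KEY}:[{start} to {WINDOW_END}]",
        "tri": REPLICATION_KEY,  # sort by the replication key
        "debut": 0,              # never page by offset, so the 10000 limit is not hit
        "nombre": PAGE_SIZE,
    }


def next_window_start(
    records: List[Dict[str, Any]], current_start: str
) -> Optional[str]:
    """Return the lower bound for the next request, or None when finished."""
    if len(records) < PAGE_SIZE:
        return None  # a short (or empty) page means there is nothing left to fetch
    last_seen = records[-1][REPLICATION_KEY]
    if last_seen == current_start:
        # A full page of records sharing one timestamp: the window cannot advance.
        raise RuntimeError(f"more than {PAGE_SIZE} records share {last_seen}")
    return last_seen
```

One caveat with this design: if the range bounds are inclusive, the records sitting exactly on the boundary timestamp are returned again by the next request, so duplicates need to be tolerated or removed downstream.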
eric_boucher
06/01/2022, 12:03 PM
is_sorted = True
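For context on why is_sorted matters here: the "Progress is not resumable if interrupted." note in the first log appears when the stream is not declared as sorted, because the highest replication key value is then only known once the sync has finished. The sketch below is a simplified, self-contained model of that behaviour, not the SDK's actual code; record_progress and finalize are hypothetical names chosen for illustration.

```python
from typing import Any, Dict

NOTE = "Progress is not resumable if interrupted."


def record_progress(
    state: Dict[str, Any],
    record: Dict[str, Any],
    replication_key: str,
    is_sorted: bool,
) -> None:
    """Update a stream's bookmark after one record (simplified model)."""
    value = record[replication_key]
    if is_sorted:
        # Sorted stream: every record is a safe place to resume from.
        state["replication_key"] = replication_key
        state["replication_key_value"] = value
        return
    # Unsorted stream: only track the maximum seen so far, as a provisional marker.
    markers = state.setdefault("progress_markers", {"Note": NOTE})
    if "replication_key_value" not in markers or value > markers["replication_key_value"]:
        markers["replication_key"] = replication_key
        markers["replication_key_value"] = value


def finalize(state: Dict[str, Any]) -> None:
    """Promote provisional markers to a real bookmark once the sync completes."""
    markers = state.pop("progress_markers", None)
    if markers and "replication_key_value" in markers:
        state["replication_key"] = markers["replication_key"]
        state["replication_key_value"] = markers["replication_key_value"]
```

In this model an interrupted unsorted sync leaves only progress_markers behind, whereas a sorted stream writes replication_key_value directly into the bookmark, which is the shape a resumable state has.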
ilkka_peltola
06/02/2022, 7:09 AM
2022-06-01T13:02:32.755026Z [info ] time=2022-06-01 13:02:32 name=target_snowflake level=INFO message=Emitting state {"bookmarks": {"siren": {"replication_key": "dateDernierTraitementUniteLegale", "replication_key_value": "2006-06-02T17:27:34"}, "siret": {"replication_key": "dateDernierTraitementEtablissement", "replication_key_value": "2006-06-02T17:27:34"}}} cmd_type=loader job_id=sirene-prod-1 name=target-sf-transferwise run_id=ab82934e-cf07-48db-a1c9-ac682b555544 stdio=stderr
However, when I re-run Meltano, it starts from the beginning.
I've tried running the tap alone with poetry, injecting a state file, and that works correctly. For some reason, though, Meltano doesn't use the state.
avinash_gupta
06/03/2022, 10:54 AM
ilkka_peltola
06/03/2022, 11:45 AM
job_id=sirene-prod-1