# singer-tap-development
j
Hi, firstly, I wanted to say a massive thank you for developing the Singer SDK. I'd been struggling with the official Singer documentation, and with the SDK I've managed to build a working WooCommerce tap in a short space of time. I know this is going to have a really simple solution, but I am trying to use "replication_key_value" from the state message to pass to the WooCommerce REST API so that it only gets data after that date. I have set "date_modified" as the replication key and it is setting the state correctly:
{
  "type": "STATE",
  "value": {
    "bookmarks": {
      "orders": {
        "progress_markers": {},
        "replication_key": "date_modified",
        "replication_key_value": "2021-05-15T15:14:40"
      }
    }
  }
}
However, I'm not sure how I can extract "replication_key_value" to pass into the params on a subsequent run. Any help would be much appreciated and if anyone else is working on WooCommerce it would be good to chat 🙂 Best, Jazzy
a
Hi, @jazzy! Glad you found us! Generally you would send the replication key back in when you are overriding get_url_params().
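For illustration, here is a minimal sketch of that idea written against plain dictionaries rather than the SDK itself, assuming a state shaped like the message above. The helper names get_bookmark and build_params are hypothetical, and the "after" parameter and config key are assumptions about how the tap is configured.

```python
from typing import Any, Dict, Optional


def get_bookmark(state: Dict[str, Any], stream_name: str) -> Optional[str]:
    """Return the stream's replication_key_value from a Singer state dict, or None."""
    bookmark = state.get("bookmarks", {}).get(stream_name, {})
    return bookmark.get("replication_key_value")


def build_params(
    state: Dict[str, Any], stream_name: str, config: Dict[str, Any]
) -> Dict[str, Any]:
    """Build request params, preferring the bookmark over the configured start date."""
    params: Dict[str, Any] = {}
    bookmark = get_bookmark(state, stream_name)
    # Assumption: the API filters on an "after" parameter and the config
    # carries a fallback start date under the same key.
    params["after"] = bookmark if bookmark is not None else config.get("after")
    return params
```

Inside a real tap the same decision would live in the get_url_params() override, with the SDK supplying the bookmark instead of a raw state dict.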
j
Great! I will give it a go. Enjoy your weekend 😄
a
Thanks! You as well!
j
I think the logic is working as when I pass in a state.json file in the same format as the final state message like below:
{
  "bookmarks": {
    "orders": {
      "replication_key": "date_modified",
      "replication_key_value": "2021-05-01T08:00:25"
    }
  }
}
It uses the replication_key_value correctly. I think the issue was that running "poetry run tap-woocommerce --config sample-config.json" is basically running a fresh job each time, so the final state message does not persist between runs. Therefore it was always using the date set in the config file. This is the implementation logic, based on the link you previously sent:
if self.replication_key:
    current_bookmark = self.get_starting_timestamp(partition)
    if current_bookmark is None:
        params["after"] = self.config.get("after")
    else:
        params["after"] = current_bookmark
The WooCommerce REST API only allows sorting on "date" and doesn't specify which date this actually is, so the stream has to be treated as unsorted. As a result, the Singer SDK only writes the state message once the job has completed, as per: https://gitlab.com/meltano/singer-sdk/-/blob/main/docs/implementation/state.md#dealing-with-unsorted-streams If I print the output of self.get_starting_timestamp(partition), it always returns None. Is my understanding correct?
a
@jazzy - In order to use the state as input, you will need to pass it via command line:
poetry run tap-woocommerce --config sample-config.json --state state.json
Where state.json is the (final) state output from a prior run.
If you see an area where we could improve the docs for this, an MR would be much appreciated.
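For local testing, one way to get that file is to capture the last STATE message from the tap's output and save its "value". The following is only a rough sketch under that assumption. The helper names last_state_value and save_state are hypothetical and are not part of the SDK or the Singer spec.

```python
import json
from typing import Any, Dict, Iterable, Optional


def last_state_value(lines: Iterable[str]) -> Optional[Dict[str, Any]]:
    """Return the "value" of the last STATE message found in the tap output lines."""
    latest = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON message (e.g. stray log output).
            continue
        if isinstance(message, dict) and message.get("type") == "STATE":
            latest = message.get("value")
    return latest


def save_state(lines: Iterable[str], path: str) -> bool:
    """Write the final state value to path; return False if no STATE message was seen."""
    value = last_state_value(lines)
    if value is None:
        return False
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(value, handle, indent=2)
    return True
```

Calling save_state(sys.stdin, "state.json") in a small script that the tap's output is piped into would produce a file in the same shape as the state.json shown earlier in this thread, ready to pass back with --state on the next run.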
j
Hey @aaronsteers, hope you're well! Yep, I understand, as that is how I tested the logic above, and it does work when I pass the state file, which is great. What wasn't clear to me was what happens when I run:
poetry run tap-woocommerce --config sample-config.json
Should it persist the state afterwards, since it does end with a STATE message as expected? Or is running that command the same as running it fresh each time? Does that make sense?
a
Yeah - that makes total sense. And yes, the behavior according to the Singer spec is that the state must be passed in via the CLI. If it is not passed in via the CLI, the tap is expected to run as if for the first time (aka without leveraging incremental bookmarks).
j
Awesome, thanks for the confirmation. I think I'm at a weird midpoint between testing locally and testing as part of a pipeline in Meltano 😄