# singer-tap-development
Hi all. I think I need some help with a bug I found in `tap-github` and state management. Specifically, I think that the tap is never issuing a proper "end" state message. I'm running it manually outside of any other code, just to isolate things a bit. I run it on a single stream (`issue_comments`) for a single repo. The last `STATE` message I get is:
{
  "type": "STATE",
  "value": {
    "bookmarks": {
      "tempStream": {},
      "repositories": {
        "partitions": [
          {
            "context": {
              "org": "nextcloud",
              "repo": "server",
              "repo_id": 60243197
            }
          }
        ]
      },
      "issue_comments": {
        "partitions": [
          {
            "context": {
              "org": "nextcloud",
              "repo": "server"
            },
            "replication_key_signpost": "2022-11-09T16:44:57.220041+00:00",
            "starting_replication_value": "2022-11-07",
            "progress_markers": {
              "Note": "Progress is not resumable if interrupted.",
              "replication_key": "updated_at",
              "replication_key_value": "2022-11-09T16:41:17Z"
            }
          }
        ]
      }
    }
  }
}
What bugs me is that `progress_markers` should not be in there, from what I understand. `finalize_state_progress_markers` should remove it and promote `replication_key_value` one level up, so that it's used on the next run. But somehow that does not happen, and our tap runs the entire stream every time. This is with SDK 0.13.1 and the latest `master` on the tap. However, if I remove `state_partitioning_keys = ["repo", "org"]` from the stream definition, things work as I think they should. Can someone confirm my understanding?
• The final state message should not contain `progress_markers` at all.
• `tap-github` has a buggy stream definition, and I need to remove the state partitioning keys definition to use the default value.
• If both statements above are true, I feel that the SDK shouldn't let me write code like this, which generates invalid state. Should there be an extra test for this?
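To make the expectation concrete, here is a minimal Python sketch of the behavior I would expect for the final state. It is not the SDK's actual implementation: `promote_progress_markers` and `has_progress_markers` are hypothetical helper names, and I am assuming that only `replication_key` and `replication_key_value` get promoted while the other keys are left alone.

```python
import copy


def promote_progress_markers(partition_state: dict) -> dict:
    """Hypothetical helper: return a copy with progress markers promoted one level up."""
    finalized = copy.deepcopy(partition_state)
    # Remove the temporary markers; default to {} if there are none.
    markers = finalized.pop("progress_markers", {})
    # Assumption: only these two keys are promoted.
    for key in ("replication_key", "replication_key_value"):
        if key in markers:
            finalized[key] = markers[key]
    return finalized


def has_progress_markers(state: dict) -> bool:
    """Hypothetical check: True if any stream or partition still holds progress markers."""
    for stream_state in state.get("bookmarks", {}).values():
        if "progress_markers" in stream_state:
            return True
        for partition in stream_state.get("partitions", []):
            if "progress_markers" in partition:
                return True
    return False


# Partition state copied from the STATE message above.
partition = {
    "context": {"org": "nextcloud", "repo": "server"},
    "replication_key_signpost": "2022-11-09T16:44:57.220041+00:00",
    "starting_replication_value": "2022-11-07",
    "progress_markers": {
        "Note": "Progress is not resumable if interrupted.",
        "replication_key": "updated_at",
        "replication_key_value": "2022-11-09T16:41:17Z",
    },
}

emitted = {"bookmarks": {"tempStream": {}, "issue_comments": {"partitions": [partition]}}}
finalized = promote_progress_markers(partition)
expected = {"bookmarks": {"tempStream": {}, "issue_comments": {"partitions": [finalized]}}}
```

With these definitions, `has_progress_markers(emitted)` is true for the state I actually get, and `has_progress_markers(expected)` is false for the state I think the tap should emit. A check along those lines is roughly what I have in mind for the extra test.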