# singer-tap-development
n
I'm running into a problem where I've set `primary_keys = []` on a stream, which is being emitted in the schema message as `"key_properties": null` and breaking `pipelinewise-target-snowflake`, which expects either no key or an empty list in this scenario. Some traces:
{
  "type": "SCHEMA",
  "stream": "relationship_strengths",
  "schema": {
    "properties": {
      "internal_id": {
        "type": [
          "integer",
          "null"
        ]
      },
      "external_id": {
        "type": [
          "integer",
          "null"
        ]
      },
      "strength": {
        "type": [
          "number",
          "null"
        ]
      }
    },
    "type": "object"
  },
  "key_properties": null
}
target-snowflake       |     if len(stream_schema_message.get('key_properties', [])) > 0 else []
target-snowflake       | TypeError: object of type 'NoneType' has no len()
I'm not sure whether the problem here lies in my stream definition, the SDK, or the target. Any help would be greatly appreciated.
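For anyone reading the trace above: the target's `.get('key_properties', [])` only falls back to `[]` when the key is absent, so an explicit `null` comes through as `None` and `len()` fails. A minimal sketch of that behaviour follows; `read_key_properties` is a hypothetical helper for illustration, not code from the target.

```python
import json

# The SCHEMA message from the trace, trimmed to the relevant field.
message = json.loads(
    '{"type": "SCHEMA", "stream": "relationship_strengths", "key_properties": null}'
)

# The key is present, so the [] default is ignored and None is returned.
raw = message.get("key_properties", [])


def read_key_properties(schema_message):
    """Hypothetical defensive read: treat a missing key and an explicit null alike."""
    return schema_message.get("key_properties") or []


keys = read_key_properties(message)
```

With this reading, `len(keys)` is safe whether the tap omits the key, sends `null`, or sends `[]`.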
a
@niall_woodward on first glance, this sounds like the SDK could be updated to be compatible with the target's expectations. While I don't see `"key_properties": null` as a bug, per se, I do think leaving off `"key_properties"` entirely when it is missing or empty is a fine update to make if it improves compatibility with targets that have a specific expectation.
Do you mind logging an issue on this? We'd also be happy to take an MR if you have cycles to contribute.
Two caveats I can foresee:
• If there were a large mass of targets that needed `key_properties` to be present and null, that could influence our direction here. (I don't think this is the case, though.)
• I definitely don't want the tap developer to have to care whether they set `primary_keys = null` vs `primary_keys = []` vs leaving it omitted. Those three should be semantically identical to any target downstream, so we should pick one and only one way to communicate "no primary keys".
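One way to picture the "pick one and only one" idea is a single normalization step on the tap side. This is only a sketch: `build_schema_message` is a hypothetical helper, not the SDK's API, and emitting an empty list (rather than omitting the key) is an assumption made for illustration.

```python
import json


def build_schema_message(stream, schema, primary_keys=None):
    """Hypothetical helper: None, [] and an omitted argument all become []."""
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": list(primary_keys or []),
    }


schema = {"type": "object", "properties": {"strength": {"type": ["number", "null"]}}}

# All three spellings of "no primary keys" produce the same message.
omitted = build_schema_message("relationship_strengths", schema)
as_none = build_schema_message("relationship_strengths", schema, None)
as_empty = build_schema_message("relationship_strengths", schema, [])
serialized = json.dumps(omitted)
```

The tap developer's choice of spelling then never reaches the target.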
Wdyt?
n
Looking at the Singer standard, it definitely seems like we should be sending an empty list rather than a null, so perhaps this is a bug:
"key_properties - a list of strings indicating which properties make up the primary key for this stream. Each item in the list must be the name of a top-level property defined in the schema. An empty list may be used to indicate there is no primary key for the stream."
https://hub.meltano.com/singer/spec
Thanks for your detailed reply as always.
Let me know what you think is best, I can look at doing an MR tomorrow.
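The quoted spec text can be turned into a small check, sketched below. `check_key_properties` is a hypothetical function that covers only the rules stated in that quote; it is not part of the SDK or of any target.

```python
def check_key_properties(key_properties, schema):
    """Return a list of problems found; an empty list means the value passes."""
    if not isinstance(key_properties, list):
        return ["key_properties must be a list, got %s" % type(key_properties).__name__]

    top_level = schema.get("properties", {})
    problems = []
    for name in key_properties:
        if not isinstance(name, str):
            problems.append("key property %r is not a string" % (name,))
        elif name not in top_level:
            problems.append("key property %r is not a top-level schema property" % name)
    return problems


schema = {
    "type": "object",
    "properties": {
        "internal_id": {"type": ["integer", "null"]},
        "external_id": {"type": ["integer", "null"]},
        "strength": {"type": ["number", "null"]},
    },
}

null_result = check_key_properties(None, schema)   # the value from the trace
empty_result = check_key_properties([], schema)    # the spec's "no primary key"
```

Under this check the `null` from the trace is flagged, while an empty list passes.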
a
Yeah, that text makes it very clear the expectation is an empty list. Thanks for looking into that and sharing back.
n
I've realised I can also solve my problem by using a stream map to synthesize a PK from `external_id` and `internal_id`, which also removes the need to dedupe in the warehouse. I'll follow up with a fix for this empty list issue nevertheless.
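The idea of a synthesized key can be sketched in plain Python. This is not the SDK's stream map syntax: the `pk` field name, the `:` separator, and the sample records are all assumptions for illustration, and since both ID fields are nullable in the schema, real data would need a rule for nulls.

```python
def with_synthetic_key(record):
    """Return a copy of the record with a hypothetical `pk` field added."""
    keyed = dict(record)
    keyed["pk"] = "%s:%s" % (record["internal_id"], record["external_id"])
    return keyed


# Made-up records; the first two describe the same relationship.
records = [
    {"internal_id": 1, "external_id": 7, "strength": 0.5},
    {"internal_id": 1, "external_id": 7, "strength": 0.9},
    {"internal_id": 2, "external_id": 7, "strength": 0.1},
]

# Upsert semantics: a later record with the same key replaces the earlier one.
latest = {}
for keyed in map(with_synthetic_key, records):
    latest[keyed["pk"]] = keyed
```

With a key like this declared in `key_properties`, a target that upserts on the key would keep one row per pair instead of appending duplicates.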
a
Sweet!!