# singer-tap-development
I’m running into an odd problem: I have a stream with a schema. When I run the tap with --discover, I see the properties of the schema. However, when I run the tap directly, I’m only getting empty records. The schema in the tap output looks like this (please ignore the xmlStream name, I’ll rename this later):
{
  "type": "SCHEMA",
  "stream": "xmlStream",
  "schema": {
    "properties": {},
    "type": "object"
  },
  "key_properties": [
    "@ID"
  ]
}
Here’s the skeleton of the stream:
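from typing import Iterable, Optional

import xmltodict
from singer_sdk import Stream
from singer_sdk import typing as th


class xmlStream(Stream):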
    name = "xmlStream"
    primary_keys = ["@ID"]
    schema = th.PropertiesList(
        th.Property("@ID", th.StringType),
        th.Property("@Name", th.StringType),
    ).to_dict()

    def get_records(self, context: Optional[dict]) -> Iterable[dict]:
        print(self.schema)

        # Read the whole catalog file and close the handle afterwards.
        with open("/Users/jobert/Downloads/cwec_v4.5.xml", "r") as f:
            data = f.read()
        cwes = xmltodict.parse(data, process_namespaces=False)['Weakness_Catalog']['Weaknesses']['Weakness']

        for cwe in cwes:
            yield cwe
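To narrow this down, one check is whether the keys of a parsed item actually line up with the property names declared in the schema. The following is a minimal sketch using only the standard library; `compare_keys` is a hypothetical helper, not part of the SDK, and the sample record contains placeholder values rather than real data from the file.

```python
def compare_keys(schema: dict, record: dict) -> dict:
    """Compare declared schema property names with the keys of one record."""
    declared = set(schema.get("properties", {}))
    present = set(record)
    return {
        "declared_and_present": sorted(declared & present),
        "declared_but_missing": sorted(declared - present),
        "undeclared": sorted(present - declared),
    }


# Schema shaped like the one declared on the stream (two string properties).
schema = {
    "type": "object",
    "properties": {
        "@ID": {"type": ["string", "null"]},
        "@Name": {"type": ["string", "null"]},
    },
}

# Placeholder record; the real items come from xmltodict and may have more keys.
record = {"@ID": "1", "@Name": "example", "Description": "example"}

print(compare_keys(schema, record))
```

If the declared names show up under `declared_and_present` for a real item, the record itself carries the expected keys before it is emitted, which would point at something downstream of `get_records` rather than at the parsing step.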
Does anyone have a clue why I’m not seeing the properties in the schema output? Here’s what the records look like:
{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.100213Z"}
{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.100676Z"}
{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.100906Z"}
{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.101292Z"}
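For reference, output like the above can be summarized with a small script. This is only a sketch under the assumption that the tap writes one JSON message per line; `summarize_tap_output` is a hypothetical helper name. Lines that are not valid JSON are skipped, which matters here because the `print(self.schema)` call in `get_records` also writes to stdout.

```python
import json


def summarize_tap_output(lines):
    """Return, per stream, the schema property names and record counts."""
    summary = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. debug prints mixed into stdout
        if not isinstance(message, dict) or "stream" not in message:
            continue  # messages without a stream name are ignored here
        stats = summary.setdefault(
            message["stream"],
            {"schema_properties": None, "records": 0, "empty_records": 0},
        )
        if message.get("type") == "SCHEMA":
            stats["schema_properties"] = sorted(
                message.get("schema", {}).get("properties", {})
            )
        elif message.get("type") == "RECORD":
            stats["records"] += 1
            if not message.get("record"):
                stats["empty_records"] += 1
    return summary


# Sample lines modeled on the output shown in this post.
sample_lines = [
    '{"type": "SCHEMA", "stream": "xmlStream", "schema": {"properties": {}, "type": "object"}, "key_properties": ["@ID"]}',
    "{'type': 'object', 'properties': {}}",
    '{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.100213Z"}',
    '{"type": "RECORD", "stream": "xmlStream", "record": {}, "time_extracted": "2021-08-17T00:17:20.100676Z"}',
]

print(summarize_tap_output(sample_lines))
```

An empty `schema_properties` list together with `empty_records` equal to `records` matches the symptom described: the SCHEMA message carries no properties and every record is emitted as `{}`.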