Nir Diwakar (Nir)
09/11/2024, 10:22 AM
def post_process(
    self,
    row: dict,
    context: Context | None = None,  # noqa: ARG002
) -> dict | None:
    for key in ["loginDate", "logoutDate"]:
        if row.get(key) == 0:
            row.pop(key, None)
    for key in ["loginDate", "logoutDate", "date"]:
        if key in row:
            dt = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
                milliseconds=row[key]
            )
            row["eventDate"] = dt.isoformat().replace("+00:00", "Z")
            del row[key]  # Remove original key
            break
    logger.info(row)
    return super().post_process(row, context)
I get this error: Extraction failed singer_sdk.exceptions.InvalidReplicationKeyException: Field 'eventDate' is not in schema for stream 'events'
I have added eventDate to the schema.
Reuben (Matatika)
09/11/2024, 11:16 AM
Nir Diwakar (Nir)
09/11/2024, 11:18 AM
{
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "eventDate": { "type": "string" },
        "userName": { "type": "string" },
        "loginEvent": { "type": "string" },
        "ipAddress": { "type": "string" },
        "accessSource": { "type": "string" },
        "auditSource": { "type": "string" }
      },
      "required": ["eventDate", "userName", "auditSource"]
    },
    {
      "type": "object",
      "properties": {
        "eventDate": { "type": "string" },
        "sourcePath": { "type": "string" },
        "targetPath": { "type": ["string", "null"] },
        "user": { "type": "string" },
        "userId": { "type": "string" },
        "action": { "type": "string" },
        "access": { "type": "string" },
        "ipAddress": { "type": "string" },
        "actionInfo": { "type": "string" },
        "checksum": { "type": "string" },
        "groupId": { "type": "string" },
        "auditSource": { "type": "string" }
      },
      "required": ["eventDate", "sourcePath", "user", "action", "ipAddress", "auditSource"]
    },
    {
      "type": "object",
      "properties": {
        "eventDate": { "type": "string" },
        "actor": { "type": "string" },
        "subject": { "type": "string" },
        "action": { "type": "string" },
        "actionInfo": { "type": "string" },
        "source": { "type": "string" },
        "auditSource": { "type": "string" }
      },
      "required": ["eventDate", "actor", "subject", "action", "auditSource"]
    }
  ]
}
Reuben (Matatika)
09/11/2024, 11:37 AM
The SDK expects properties to be at the top level of a schema - not oneOf.
https://github.com/meltano/sdk/blob/6708cb995c68ab6f74d4874dfc8f978c3b054ceb/singer_sdk/streams/core.py#L228-L231
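As a rough illustration of why the error shows up even though eventDate is declared in every oneOf branch, here is a minimal sketch of that kind of lookup. The function name replication_key_in_schema is hypothetical and this is a paraphrase of the idea, not the SDK's actual code; the assumption is only that the replication key is looked up in the schema's top-level "properties".

```python
def replication_key_in_schema(schema: dict, replication_key: str) -> bool:
    """Hypothetical helper: is the key declared under top-level "properties"?"""
    # Assumption: only the top-level "properties" mapping is consulted,
    # so anything nested inside "oneOf" branches is not seen.
    return replication_key in schema.get("properties", {})


one_of_schema = {
    "oneOf": [
        {"type": "object", "properties": {"eventDate": {"type": "string"}}},
    ]
}
flat_schema = {
    "type": "object",
    "properties": {"eventDate": {"type": "string"}},
}

print(replication_key_in_schema(one_of_schema, "eventDate"))  # False
print(replication_key_in_schema(flat_schema, "eventDate"))  # True
```

With the oneOf-only schema there is no top-level "properties" key at all, so the lookup comes back empty regardless of what the branches contain.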
I would make all non-common properties nullable like this:
{
  "type": "object",
  "properties": {
    "eventDate": { "type": "string" },
    "userName": { "type": ["string", "null"] },
    "loginEvent": { "type": ["string", "null"] },
    "ipAddress": { "type": ["string", "null"] },
    "accessSource": { "type": ["string", "null"] },
    "sourcePath": { "type": ["string", "null"] },
    "targetPath": { "type": ["string", "null"] },
    "user": { "type": ["string", "null"] },
    "userId": { "type": ["string", "null"] },
    "action": { "type": ["string", "null"] },
    "access": { "type": ["string", "null"] },
    "actionInfo": { "type": ["string", "null"] },
    "checksum": { "type": ["string", "null"] },
    "groupId": { "type": ["string", "null"] },
    "actor": { "type": ["string", "null"] },
    "subject": { "type": ["string", "null"] },
    "source": { "type": ["string", "null"] },
    "auditSource": { "type": "string" }
  },
  "required": ["eventDate", "auditSource"]
}
What were you trying to do with oneOf?
Nir Diwakar (Nir)
09/11/2024, 12:10 PM
Reuben (Matatika)
09/11/2024, 1:51 PM
properties at the top level of the schema regardless.
Nir Diwakar (Nir)
09/11/2024, 1:52 PM
Reuben (Matatika)
09/11/2024, 1:58 PM
oneOf sub-schemas, where properties that are not common are made nullable since there is no guarantee they will be there (guessing from your initial attempt) - maybe I should have been clearer.
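For anyone who would rather generate the flattened schema than write it by hand, a minimal sketch of that merge in Python could look like the following. merge_one_of is a hypothetical helper, not part of the SDK, and it assumes every branch is an object schema whose properties each carry a "type" and that a property shared between branches has the same definition in each.

```python
def merge_one_of(schema: dict) -> dict:
    """Hypothetical helper: flatten a oneOf-only schema into one object schema."""
    branches = schema["oneOf"]

    # A property stays required only if every branch requires it.
    common_required = set(branches[0].get("required", []))
    for branch in branches[1:]:
        common_required &= set(branch.get("required", []))

    # Collect the union of all properties; the first definition seen wins.
    properties: dict = {}
    for branch in branches:
        for name, prop in branch.get("properties", {}).items():
            properties.setdefault(name, dict(prop))

    # Everything that is not required in every branch becomes nullable.
    for name, prop in properties.items():
        if name in common_required or "type" not in prop:
            continue
        types = prop["type"]
        types = [types] if isinstance(types, str) else list(types)
        if "null" not in types:
            types.append("null")
        prop["type"] = types

    return {
        "type": "object",
        "properties": properties,
        "required": sorted(common_required),
    }


example = {
    "oneOf": [
        {
            "type": "object",
            "properties": {
                "eventDate": {"type": "string"},
                "userName": {"type": "string"},
            },
            "required": ["eventDate", "userName"],
        },
        {
            "type": "object",
            "properties": {
                "eventDate": {"type": "string"},
                "actor": {"type": "string"},
            },
            "required": ["eventDate", "actor"],
        },
    ]
}

merged = merge_one_of(example)
print(merged["required"])  # ['eventDate']
print(merged["properties"]["userName"])  # {'type': ['string', 'null']}
```

The result has properties at the top level, so a replication key such as eventDate can be found there, while fields that only some event types carry are allowed to be null.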