is there a possibility to have default values via ...
# troubleshooting
p
Is there a way to have the tap inject default values via the schema when a field/attribute is not returned by the API? The reason I'm asking is that for some records the API only returns some of the fields, and then when I use target-s3 with the parquet format I get an error for the fields that are not returned.
One way is to define everything in stream maps, but is there something that works via the schema?
because some of the objects I'm dealing with are deeply nested
like I have a field right now which is returned as null when there's no value, and when there is a value it's returned as
[StringType]
e
Are those nested fields fully represented in the schema or is the parent field's schema just something like
{"type": "object", "properties": {}}
?
p
they're fully represented, so the field is
"bcc": {"type": ["array", "null"]}
but for some records in the tap stream the field doesn't show up at all, and for others it comes through as an empty
[]
although I should add that for the field
"bcc"
no items definition is given
this is proving to be a problem in downstream targets which are schema on write / schema on merge
I'm using tap-kustomer and target-s3
e
I see. Short of fixing the missing fields in the tap, you could use stream maps (Meltano docs) to fill in a default value.
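For reference, filling the gaps on the tap side could be sketched roughly like this in Python. This is only an illustration: `fill_missing` is a hypothetical helper, not part of the Meltano SDK, and it assumes the schema fully lists the nested `properties`. It sets `None` for any declared property the record lacks and recurses into nested objects.

```python
from typing import Any


def fill_missing(record: dict[str, Any], schema: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of record with every schema property present.

    Hypothetical helper: missing properties are set to None, and nested
    objects are filled recursively when the record holds a dict for them.
    """
    filled = dict(record)
    for name, prop_schema in schema.get("properties", {}).items():
        if name not in filled:
            filled[name] = None
        elif isinstance(filled[name], dict) and "properties" in prop_schema:
            filled[name] = fill_missing(filled[name], prop_schema)
    return filled


# Example schema, loosely modelled on the fields mentioned in this thread.
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "bcc": {"type": ["array", "null"]},
        "meta": {
            "type": ["object", "null"],
            "properties": {"subject": {"type": ["string", "null"]}},
        },
    },
}

print(fill_missing({"id": "1", "meta": {}}, schema))
```

Explicit `None` values only help if the schema marks the field as nullable, as `"bcc"` does here.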
p
I think I've pinned down that the tap had some schema errors
basically the tap let both string and object data through for the same field
so it was giving an error when I was using target-s3 to save as parquet
I was able to handle it using post process
might have to do it for some more fields but seems promising
this does have to do with the Kustomer API sending data out of contract: https://developer.kustomer.com/kustomer-api-docs/reference/getmessagesbyconversation
the contract on their page says that attributes.meta.to is an array of objects of the form {email: str}, but I'm getting plain strings back
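A post-process step for this kind of mismatch could look roughly like the sketch below. It is not the actual tap-kustomer code: `normalize_recipients` and the standalone `post_process` function are hypothetical, and the record layout (`attributes.meta.to`) is taken only from the description above. A bare string becomes a one-item list of `{"email": ...}` objects, so every record matches the documented array-of-objects shape.

```python
from typing import Any, Optional


def normalize_recipients(value: Any) -> Optional[list]:
    """Coerce a recipients value into a list of {"email": str} objects."""
    if value is None:
        return None
    if isinstance(value, str):
        # Out-of-contract case: a bare string instead of an array.
        return [{"email": value}]
    if isinstance(value, dict):
        # A single object is wrapped so the field is always a list.
        return [value]
    # Lists may themselves mix strings and objects.
    return [{"email": v} if isinstance(v, str) else v for v in value]


def post_process(row: dict, context: Optional[dict] = None) -> dict:
    """Standalone stand-in for a stream's post-process hook (assumed shape)."""
    meta = (row.get("attributes") or {}).get("meta")
    if isinstance(meta, dict) and "to" in meta:
        meta["to"] = normalize_recipients(meta["to"])
    return row


row = {"attributes": {"meta": {"to": "a@example.com"}}}
print(post_process(row))
```

In a Meltano SDK tap, logic like this would presumably live in the stream class's `post_process` method, with the stream schema tightened to the array-of-objects type only.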
e
Ok so the error you were seeing is a JSON schema validation issue or something else? If it's just that, then the schemas could be updated.
p
oh no, it was two-layered: first, the tap schema allowed both strings and objects to be populated for
attributes.meta.to
, so when I tried to save it as parquet, it failed because there was both string and array data. I thought this was because the API sometimes didn't return the field, but parquet does take the None fields into account and works fine. The real problem is the Kustomer API sending data out of contract, so I've finally handled that in the stream's post process
This is working fine now
thanks for the help @edgar_ramirez_mondragon, as always it is appreciated melty
e
awesome 🙌