daniel_luftspring
01/21/2022, 3:51 PM
meltano elt. So my question is - is there a good pattern for type casting the data in transit based on the JSON schema or do I just accept that everything is going to land in the database as strings and deal with it later?
aaronsteers
01/21/2022, 4:47 PM
schema) declaration.
2. Convert it to actually be so in post_process() if using the SDK.
aaronsteers
01/21/2022, 4:49 PM
aaronsteers
01/21/2022, 4:50 PM
daniel_luftspring
01/21/2022, 5:07 PM
post_process. I'm going to test this out locally and see how bad it is, but otherwise I'll probably take your suggestion and just deal with it in the database.
aaronsteers
01/21/2022, 5:27 PM
Since post_process() runs per record, you'd likely get better performance with hard-coded transforms than with a dynamic operation that determines which transformations to apply by checking each field in the schema. So, if you know which properties you want to convert to ints, for example:
def post_process(self, record, context=None):
    # Cast the fields known to hold integers.
    for int_field in ["user_id", "num_widgets", "age"]:
        record[int_field] = int(record[int_field])
    return record
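A minimal standalone sketch of the hard-coded approach above, runnable without the SDK. The field names come from the example; the `INT_FIELDS` constant, the `cast_int_fields` helper name, and the choice to skip missing or null values rather than raise are assumptions for illustration, not part of the SDK.

```python
# Hypothetical helper; the field names are taken from the example above.
INT_FIELDS = ("user_id", "num_widgets", "age")


def cast_int_fields(record: dict) -> dict:
    """Cast the known integer fields from strings to ints, in place."""
    for field in INT_FIELDS:
        value = record.get(field)
        # Assumption: leave missing keys and nulls untouched instead of raising.
        if value is not None:
            record[field] = int(value)
    return record


row = cast_int_fields({"user_id": "42", "num_widgets": "7", "age": None, "name": "Ada"})
print(row)
```

Inside an SDK stream, the body of post_process() could call a helper like this and return its result.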
daniel_luftspring
01/21/2022, 5:35 PM
def post_process(self, row, context=None):
    types = self.json_schema
    for key in row.keys():
        if types['properties'][key]['type'] == 'integer':
            row[key] = int(row[key])
    return row
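One possible way to keep the cast schema-driven without repeating the schema lookup for every field of every record is to work out the integer properties once and reuse that set. The sketch below is standalone and rests on some assumptions: `integer_fields` and `cast_record` are hypothetical helper names, the schema is a plain JSON Schema dict, and a property's `type` may be either a string or a list such as ["integer", "null"].

```python
def integer_fields(schema: dict) -> frozenset:
    """Return the names of properties whose JSON Schema type includes 'integer'."""
    names = set()
    for name, prop in schema.get("properties", {}).items():
        declared = prop.get("type", [])
        # JSON Schema allows "type" to be a single string or a list of strings.
        types = [declared] if isinstance(declared, str) else declared
        if "integer" in types:
            names.add(name)
    return frozenset(names)


def cast_record(row: dict, int_fields: frozenset) -> dict:
    """Cast the precomputed integer fields in place; nulls and other keys pass through."""
    for key in int_fields:
        # Assumption: missing keys and null values are left as they are.
        if row.get(key) is not None:
            row[key] = int(row[key])
    return row


schema = {
    "properties": {
        "user_id": {"type": "integer"},
        "age": {"type": ["integer", "null"]},
        "name": {"type": "string"},
    }
}
INT_FIELDS = integer_fields(schema)  # computed once, not per record
print(cast_record({"user_id": "42", "age": None, "name": "Ada"}, INT_FIELDS))
```

The per-record work is then the same loop over a fixed set of names as in the hard-coded version, while the set itself still comes from the schema.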
aaronsteers
01/21/2022, 5:53 PM
aaronsteers
01/21/2022, 5:54 PM