Michael Bi
04/30/2025, 6:02 PMtap-magento
hotglue variant to flatten the invoices
stream such that the items
array value within the stream is exploded either into additional rows within the invoices
table or within its own separate invoices_items
table. The data would then be fed into the target-redshift
ticketswap variant.
I have the relevant parameters set below in the config file, which I believe should be sufficient to flatten anything that comes out of the Magento tap.
flattening_enabled: true
flattening_max_depth: 20
Unfortunately, it doesn't appear that the flattening is occurring as the value for items
is a stringified list of JSON objects. From the singer SDK, it looks like the source code requires the value in the key-value pair in question to be a dict, not a string. To deserialize the JSON list, I have tried applying the json.loads(items)
command to the stream_maps
parameter to both tap-magento
and target-redshift
to no avail as shown below:
stream_maps:
invoices:
items: json.loads(items)
==========
When json.loads(items)
is applied to tap-magento
stream_maps
, the following error occurs:
singer_sdk.exceptions.MapExpressionError: Failed to evaluate simpleeval expressions json.loads(items)
singer_sdk.helpers._simpleeval.NameNotDefined: 'json' is not defined for expression 'json.loads(items)'
Which eventually terminates in the following schema validation error for invoices.items
as json.loads
never executes:
singer_sdk.exceptions.InvalidJSONSchema: Schema Validation Error: [['array', 'string', 'null'], 'null'] is not valid under any of the given schemas
Whereby the valid schema for invoices.items
should be ['array', 'string', 'null']
without the double nested extra null value.
This error is strange as the docs here suggest that the json.loads
command should be supported within the stream_maps
parameter? Or is `json.loads`not specifically supported by the tap-magento
plugin (as it does appear to work for the target-redshift
plugin)? If so, how could we get the desired flattening to occur? Another question would be whether json.loads
would be executed before or after the flattening by the singer SDK (of course, we would want it to happen before so that items
is recognized as a dict.
==========
When json.loads(items)
is applied to target-redshift
stream_maps
, the items
field is converted to the Redshift super
data format with the fields containing JSON objects, but the flattening does not occur. I believe this is expected behavior as tap-magento
should be the plugin performing the flattening, and target-redshift
does not appear to have the flattening parameter as usable.
cc: @Taylor GerikEdgar Ramírez (Arch.dev)
04/30/2025, 6:52 PMflatten theThat is something that is currently beyond the capabilities of the SDK at the moment. Flattening is limit to fields within a single stream, rather than "exploding" arrays into separate streams. That is a useful feature, and I'm a bit surprised it hasn't been requested so do log an issue 🙏. Thestream such that theinvoices
array value within the stream is exploded either into additional rows within theitems
table or within its own separateinvoices
table.invoices_items
NameError
is raised because the tap is in rather old version of the SDK.Michael Bi
04/30/2025, 7:28 PMinvoice
record has five items
the one invoice record should be replaced with five new invoice
records, each with its own unique item
, within that stream. Is this something that is possible?
For the JSON NameError
, are you implying that we would need to fork the tap and update the SDK such that the JSON expression would be valid? Given my testing thus far, my understanding is that having the json.loads
command execute correctly within the tap-magento
stream_maps
config would be a prerequisite for tap-magento
to flatten the records given the invoice.item
values are serialized JSON strings.