Michael Bi
04/30/2025, 6:02 PMtap-magento hotglue variant to flatten the invoices stream such that the items array value within the stream is exploded either into additional rows within the invoices table or within its own separate invoices_items table. The data would then be fed into the target-redshift ticketswap variant.
I have the relevant parameters set below in the config file, which I believe should be sufficient to flatten anything that comes out of the Magento tap.
flattening_enabled: true
flattening_max_depth: 20
Unfortunately, it doesn't appear that the flattening is occurring as the value for items is a stringified list of JSON objects. From the singer SDK, it looks like the source code requires the value in the key-value pair in question to be a dict, not a string. To deserialize the JSON list, I have tried applying the json.loads(items) command to the stream_maps parameter to both tap-magento and target-redshift to no avail as shown below:
stream_maps:
invoices:
items: json.loads(items)
==========
When json.loads(items) is applied to tap-magento stream_maps, the following error occurs:
singer_sdk.exceptions.MapExpressionError: Failed to evaluate simpleeval expressions json.loads(items)
singer_sdk.helpers._simpleeval.NameNotDefined: 'json' is not defined for expression 'json.loads(items)'
Which eventually terminates in the following schema validation error for invoices.items as json.loads never executes:
singer_sdk.exceptions.InvalidJSONSchema: Schema Validation Error: [['array', 'string', 'null'], 'null'] is not valid under any of the given schemas
Whereby the valid schema for invoices.items should be ['array', 'string', 'null'] without the double nested extra null value.
This error is strange as the docs here suggest that the json.loads command should be supported within the stream_maps parameter? Or is `json.loads`not specifically supported by the tap-magento plugin (as it does appear to work for the target-redshift plugin)? If so, how could we get the desired flattening to occur? Another question would be whether json.loads would be executed before or after the flattening by the singer SDK (of course, we would want it to happen before so that items is recognized as a dict.
==========
When json.loads(items) is applied to target-redshift stream_maps, the items field is converted to the Redshift super data format with the fields containing JSON objects, but the flattening does not occur. I believe this is expected behavior as tap-magento should be the plugin performing the flattening, and target-redshift does not appear to have the flattening parameter as usable.
cc: @Taylor GerikEdgar Ramírez (Arch.dev)
04/30/2025, 6:52 PMflatten theThat is something that is currently beyond the capabilities of the SDK at the moment. Flattening is limit to fields within a single stream, rather than "exploding" arrays into separate streams. That is a useful feature, and I'm a bit surprised it hasn't been requested so do log an issue 🙏. Thestream such that theinvoicesarray value within the stream is exploded either into additional rows within theitemstable or within its own separateinvoicestable.invoices_items
NameError is raised because the tap is in rather old version of the SDK.Michael Bi
04/30/2025, 7:28 PMinvoice record has five items the one invoice record should be replaced with five new invoice records, each with its own unique item, within that stream. Is this something that is possible?
For the JSON NameError, are you implying that we would need to fork the tap and update the SDK such that the JSON expression would be valid? Given my testing thus far, my understanding is that having the json.loads command execute correctly within the tap-magento stream_maps config would be a prerequisite for tap-magento to flatten the records given the invoice.item values are serialized JSON strings.