garret_cree
04/25/2023, 2:46 PM
{
  "field_1": "value_1",
  "field_2": "value_2",
  "field_3": "value_3",
  "field_4": "value_4"
}
We want to send data from 2 of these fields to Table A:
| field_1 | field_2 |
And send one of the other fields to Table B:
| field_4 |
We don't want to hit the API for the data twice (once for each table), because the API provides all the data we need in a single extraction. We'd rather parse the payload, redirect part of it to a second table (i.e., field_4 to Table B), and ignore field_3 altogether.
Any thoughts?
edgar_ramirez_mondragon
04/25/2023, 3:46 PM
{
  "stream_maps": {
    "stream_2": {
      "__source__": "stream_1",
      "field_4": "field_4",
      "__else__": null
    },
    "stream_1": {
      "field_1": "field_1",
      "field_2": "field_2",
      "__else__": null
    }
  }
}
garret_cree
04/25/2023, 3:54 PM
edgar_ramirez_mondragon
04/25/2023, 3:56 PM
--config config.json
or if you’re using Meltano you can do something like
plugins:
  extractors:
  - name: your-tap
    config:
      stream_maps: ...
garret_cree
04/25/2023, 5:50 PM
edgar_ramirez_mondragon
04/25/2023, 5:54 PM
garret_cree
04/28/2023, 7:30 PM
plugins:
  extractors:
  - name: tap-my-custom-tap
    config:
      stream_maps:
        stream_1:
          id: "id"
          ... # all my other fields
        stream_2:
          __source__: "stream_1"
          ... # more fields
I run my job like this:
meltano --log-level info elt tap-my-custom-tap target-postgres --state-id my-state-id --select stream_1 --select stream_2
And 2 issues arise:
1. The field names are being parsed as simpleeval commands, and aren't recognized (singer_sdk.exceptions.MapExpressionError: Failed to evaluate simpleeval expressions id.)
2. Both streams hit the API for the data. I imagine there's some other config I need to set up so stream_1 always hits the API for data, and stream_2 only hits the API if it is the only stream selected; otherwise it should use the same data from stream_1's run.
edgar_ramirez_mondragon
04/28/2023, 11:38 PM
1. --select does not currently work with streams generated by stream maps, only with the original streams (this has to do with the Singer catalog). You can see which streams those are by running meltano select tap-my-custom-tap --list --all
2. Make sure you’re referencing an existing stream, e.g. if the original single stream in your tap is my_stream (which is only extracted once, but you want to split each record into two), you’d need a config like this:
plugins:
  extractors:
  - name: tap-my-custom-tap
    config:
      stream_maps:
        my_stream:
          id: "id"
          ... # all my other fields
        new_stream:
          __source__: "my_stream"
          ... # more fields
The tap would only be hitting the API for my_stream, and stream maps only work on the already-extracted record.
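For anyone trying to picture the split, here is a rough plain-Python sketch of the effect. It is not how singer_sdk implements stream maps: the helper apply_stream_maps is hypothetical, it only handles bare field names (the real mapper evaluates expressions with simpleeval), and the field names are borrowed from the sample payload at the top of the thread, assuming the source stream is called my_stream.

```python
# Illustrative only: mimics the *effect* of the stream map config above.
# apply_stream_maps is a hypothetical helper, not part of singer_sdk.
STREAM_MAPS = {
    "my_stream": {"field_1": "field_1", "field_2": "field_2", "__else__": None},
    "new_stream": {"__source__": "my_stream", "field_4": "field_4", "__else__": None},
}


def apply_stream_maps(source_stream, record, stream_maps):
    """Return {output_stream: mapped_record} for one already-extracted record."""
    out = {}
    for name, mapping in stream_maps.items():
        # A map applies to the stream named by __source__, or to its own key.
        if mapping.get("__source__", name) != source_stream:
            continue
        if "__else__" in mapping and mapping["__else__"] is None:
            mapped = {}  # "__else__": null drops every field not listed
        else:
            mapped = dict(record)  # otherwise unlisted fields pass through
        for prop, expr in mapping.items():
            if prop.startswith("__"):
                continue  # skip directives such as __source__ and __else__
            # Only bare field names are supported here; a name missing from
            # the record raises KeyError.
            mapped[prop] = record[expr]
        out[name] = mapped
    return out


# One record, extracted once, fans out into two output streams.
record = {
    "field_1": "value_1",
    "field_2": "value_2",
    "field_3": "value_3",
    "field_4": "value_4",
}
result = apply_stream_maps("my_stream", record, STREAM_MAPS)
```

The point is that the record is fetched once and both output streams are derived from it, so no second API call is involved.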