# getting-started
g
How possible is the following use case with Meltano? We have one API data source we want to extract from, let's say it provides 4 fields in its payload:
```json
{
  "field_1": "value_1",
  "field_2": "value_2",
  "field_3": "value_3",
  "field_4": "value_4"
}
```
We want to send data from 2 of these fields to Table A:
```
| field_1 | field_2 |
```
And send one of the other fields to Table B:
```
| field_4 |
```
We don't want to hit the API for the data twice (once for each table), because the API provides all the data we need in a single extraction. We'd rather parse the payload and redirect part of it to a second table (i.e., field_4 to Table B), and ignore field_3 altogether. Any thoughts?
e
Hi Garret! I think you could use stream maps for that with a config like:
```json
{
  "stream_maps": {
    "stream_2": {
      "__source__": "stream_1",
      "field_4": "field_4",
      "__else__": null
    },
    "stream_1": {
      "field_1": "field_1",
      "field_2": "field_2",
      "__else__": null
    }
  }
}
```
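To make the effect of that config concrete, here's an illustrative Python sketch (not the actual Singer SDK implementation) of how each extracted record would fan out: `stream_1` keeps `field_1` and `field_2`, `stream_2` keeps only `field_4`, and `__else__: null` drops everything else. `apply_stream_maps` is a hypothetical helper for demonstration only.

```python
def apply_stream_maps(record: dict) -> dict:
    """Emit one record per mapped stream from a single source record."""
    return {
        # stream_1 keeps field_1 and field_2; __else__: null drops the rest
        "stream_1": {"field_1": record["field_1"], "field_2": record["field_2"]},
        # stream_2 aliases stream_1 via __source__ and keeps only field_4
        "stream_2": {"field_4": record["field_4"]},
    }

payload = {
    "field_1": "value_1",
    "field_2": "value_2",
    "field_3": "value_3",  # ignored by both streams
    "field_4": "value_4",
}
out = apply_stream_maps(payload)
```

Note the API payload is read once; both output records are derived from the same in-memory dict.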
g
That could be just what I'm looking for, thanks! I'm a bit confused as to where to put this config though. We have created custom taps with our own Tap and Stream classes that override some of the default methods. Would this config be a property of our custom Stream class?
e
You would put this in the tap config option, e.g. `--config config.json`, or if you're using Meltano you can use something like:
```yaml
plugins:
  extractors:
  - name: your-tap
    config:
      stream_maps: ...
```
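Putting the two together, a sketch of what the full `meltano.yml` entry might look like with the stream_maps config from above (the tap name and stream names are the placeholders from this thread):

```yaml
plugins:
  extractors:
  - name: your-tap
    config:
      stream_maps:
        stream_2:
          __source__: "stream_1"
          field_4: "field_4"
          __else__: null
        stream_1:
          field_1: "field_1"
          field_2: "field_2"
          __else__: null
```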
g
Amazing, thank you!
e
Np, do let me know how it goes!
g
I think I'm missing something. I set my meltano.yml file up like this:
```yaml
plugins:
  extractors:
    - name: tap-my-custom-tap
      config:
        stream_maps:
          stream_1:
            id: "id"
            ... # all my other fields
          stream_2:
            __source__: "stream_1"
            ... # more fields
```
I run my job like this:
```shell
meltano --log-level info elt tap-my-custom-tap target-postgres --state-id my-state-id --select stream_1 --select stream_2
```
And two issues arise:
1. The field names are being parsed as simpleeval expressions and aren't recognized (`singer_sdk.exceptions.MapExpressionError: Failed to evaluate simpleeval expressions id.`).
2. Both streams hit the API for the data. I imagine there's some other config I need to set up so that stream_1 always hits the API for data, and stream_2 only hits the API if it's the only stream selected; otherwise it should reuse the data from stream_1's run.
e
Ok, so a couple of notes:
1. `--select` does not currently work with streams generated by stream maps, only with the original streams (it has to do with the Singer catalog). You can see which streams are the originals by running `meltano select tap-my-custom-tap --list --all`.
2. Make sure you're referencing an existing stream. E.g., if the single original stream in your tap is `my_stream` (which is only extracted once, but you want to split each record in two), you'd need a config like this:
```yaml
plugins:
  extractors:
    - name: tap-my-custom-tap
      config:
        stream_maps:
          my_stream:
            id: "id"
            ... # all my other fields
          new_stream:
            __source__: "my_stream"
            ... # more fields
```
The tap would only be hitting the API for `my_stream`, and stream maps operate only on the already-extracted records.
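A sketch of why the API is hit only once: a hypothetical `extract()` is called a single time for `my_stream`, and each emitted record fans out to both streams downstream, since `new_stream` is just an alias of `my_stream` through `__source__`. The function names and payload here are illustrative, not Singer SDK internals.

```python
def extract():
    # One API call for the whole run (hypothetical payload)
    yield {"id": 1, "field_4": "value_4"}

def run(stream_maps):
    """Fan each extracted record out to every mapped stream."""
    emitted = []
    for record in extract():  # extraction happens once
        for stream_name, keep_fields in stream_maps.items():
            emitted.append((stream_name, {f: record[f] for f in keep_fields}))
    return emitted

messages = run({"my_stream": ["id"], "new_stream": ["field_4"]})
```

Both streams' records come from the single `extract()` pass; no second request is made for `new_stream`.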