Matt Menzenski
04/26/2023, 9:10 PM
Matt Menzenski
04/26/2023, 9:47 PM
`select` set in meltano.yml:
select:
  - stream_name.field_name
  - stream_name_two.*
etc., how can I get those values in my tap? I can’t find any sort of “get_selected_fields” type method in the SDK
Matt Menzenski
04/26/2023, 9:48 PM
Matt Menzenski
04/26/2023, 10:03 PM
Matt Menzenski
04/27/2023, 2:56 PM
pat_nadolny
04/27/2023, 3:58 PM
1. `tap-mongodb --discover --config config.json > .meltano/run/tap-mongodb/properties.json` to generate the full catalog. When you run `meltano select tap-mongodb --list` it shows you all streams available in your catalog file.
2. then it updates the catalog streams to be `"selected": true` or `false` based on your select criteria in your meltano.yml.
3. when you run the tap with `meltano run tap-mongodb target-jsonl`, it passes the updated catalog into the tap, equivalent to `tap-mongodb --config config.json --catalog .meltano/run/tap-mongodb/properties.json`
4. the tap receives the updated catalog as input and only syncs the appropriate streams.
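The handoff in the steps above can be sketched as a small reader for the catalog file. This is a minimal sketch, not Meltano's or the SDK's implementation: it assumes the Singer catalog layout in which each stream carries a `metadata` list, the stream-level entry has an empty `breadcrumb`, and property entries have a `["properties", name]` breadcrumb. The function names and the sample catalog are hypothetical.

```python
def stream_is_selected(stream):
    """Return True if the stream-level metadata entry marks the stream as selected."""
    for entry in stream.get("metadata", []):
        if entry.get("breadcrumb") == []:
            return bool(entry.get("metadata", {}).get("selected", False))
    return False


def selected_properties(stream):
    """List properties that are explicitly selected or marked as automatic."""
    names = []
    for entry in stream.get("metadata", []):
        crumb = entry.get("breadcrumb", [])
        if len(crumb) == 2 and crumb[0] == "properties":
            meta = entry.get("metadata", {})
            if meta.get("inclusion") == "automatic" or meta.get("selected") is True:
                names.append(crumb[1])
    return names


def selection_summary(catalog):
    """Map each selected stream id to its selected property names."""
    return {
        stream["tap_stream_id"]: selected_properties(stream)
        for stream in catalog.get("streams", [])
        if stream_is_selected(stream)
    }


# Hypothetical catalog, shaped like the file passed to the tap via --catalog.
SAMPLE_CATALOG = {
    "streams": [
        {
            "tap_stream_id": "users",
            "metadata": [
                {"breadcrumb": [], "metadata": {"selected": True}},
                {"breadcrumb": ["properties", "id"], "metadata": {"inclusion": "automatic"}},
                {"breadcrumb": ["properties", "name"], "metadata": {"selected": True}},
                {"breadcrumb": ["properties", "age"], "metadata": {"selected": False}},
            ],
        },
        {
            "tap_stream_id": "groups",
            "metadata": [{"breadcrumb": [], "metadata": {"selected": False}}],
        },
    ]
}

summary = selection_summary(SAMPLE_CATALOG)
```

With the sample above, `summary` contains only the `users` stream with `id` and `name`; `groups` is skipped because its stream-level entry is not selected.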
In step 1 the tap receives no input catalog, so it should run the dynamic schema generation code to build the schema for all available streams (in the MongoDB case, filtered using only the databases in the config). Then in steps 3/4 the tap receives a catalog as input, so it should not dynamically generate the schema; it should use what was provided and sync the streams that are selected. Does that make sense?
pat_nadolny
04/27/2023, 3:59 PM
Matt Menzenski
04/27/2023, 4:54 PM
“Does that make sense?”
Yes, although I think there’s a piece I’m still missing. It seems like a tap should be able to access the “selected entities and fields” catalog information without first having to scan all tables in the database and then filter them.
pat_nadolny
04/27/2023, 5:14 PM
Matt Menzenski
04/27/2023, 5:51 PM
Matt Menzenski
04/27/2023, 5:51 PM
Matt Menzenski
04/27/2023, 6:50 PM
$ cookiecutter https://github.com/meltano/sdk --directory="cookiecutter/tap-template"
You've downloaded /Users/matt/.cookiecutters/sdk before. Is it okay to delete and re-download it? [yes]: y
source_name [MySourceName]: sdk-testing
admin_name [FirstName LastName]:
tap_id [tap-sdk-testing]:
library_name [tap_sdk_testing]:
variant [None (Skip)]:
Select stream_type:
1 - REST
2 - GraphQL
3 - SQL
4 - Other
Choose from 1, 2, 3, 4 [1]: 4
Select auth_method:
1 - API Key
2 - Bearer Token
3 - Basic Auth
4 - OAuth2
5 - JWT
6 - Custom or N/A
Choose from 1, 2, 3, 4, 5, 6 [1]: 6
Select include_cicd_sample_template:
1 - GitHub
2 - None (Skip)
Choose from 1, 2 [1]: 1
I got it runnable (the hyphen in `sdk-testing` needed to be replaced with an underscore).
Then I added a `select` block. Here, `users.age` is a stream + property that is defined in the tap, while `documents` is a stream not defined in the tap at all.
select:
  - '!users.age'
  - 'documents.*'
Then I ran `meltano select`:
$ meltano select tap-sdk-testing --list --all
2023-04-27T18:39:11.375589Z [info ] The default environment 'test' will be ignored for `meltano select`. To configure a specific environment, please use the option `--environment=<environment name>`.
Legend:
SelectionType.SELECTED
SelectionType.EXCLUDED
SelectionType.AUTOMATIC
Enabled patterns:
!users.age
documents.*
Selected attributes:
[SelectionType.EXCLUDED] groups.id
[SelectionType.EXCLUDED] groups.modified
[SelectionType.EXCLUDED] groups.name
[SelectionType.EXCLUDED] users.age
[SelectionType.EXCLUDED] users.city
[SelectionType.EXCLUDED] users.email
[SelectionType.EXCLUDED] users.id
[SelectionType.EXCLUDED] users.name
[SelectionType.EXCLUDED] users.state
[SelectionType.EXCLUDED] users.street
[SelectionType.EXCLUDED] users.zip
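The listing above is consistent with glob-style matching of `stream.property` patterns, where a leading `!` marks an exclusion. The following is a minimal sketch under that assumption, not Meltano's actual implementation; `is_selected` and the attribute list are hypothetical, and special cases such as automatically included key properties are ignored.

```python
from fnmatch import fnmatchcase


def is_selected(attribute, patterns):
    """Selected if any include pattern matches and no '!' exclude pattern does."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    included = any(fnmatchcase(attribute, p) for p in includes)
    excluded = any(fnmatchcase(attribute, p) for p in excludes)
    return included and not excluded


# Patterns from the select block; attributes taken from the listing above.
PATTERNS = ["!users.age", "documents.*"]
ATTRIBUTES = ["groups.id", "groups.modified", "groups.name", "users.age", "users.id"]

selected = [a for a in ATTRIBUTES if is_selected(a, PATTERNS)]
```

Under these assumptions `selected` is empty: the only include pattern is `documents.*`, and the tap defines no `documents` stream, so every discovered attribute ends up excluded. A hypothetical `documents.title` attribute would match.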
I can see the `documents.*` in the response there, which is really encouraging to me.
Matt Menzenski
04/27/2023, 6:53 PM
I’m now wondering if this is something that the `meltano` CLI can pull from a tap but which the tap can’t pull itself?
Matt Menzenski
04/27/2023, 6:55 PM
pat_nadolny
04/27/2023, 7:15 PM
“I’m now wondering if this is something that the `meltano` CLI can pull from a tap but which the tap can’t pull itself?”
Right - meltano is pushing this info to the "dumb" tap that has no understanding of the fact that it's being called by meltano
pat_nadolny
04/27/2023, 7:20 PM
`select` CLI calls are all reading or manipulating the locally stored catalog. Then once we ask meltano to run a sync, it sends that updated catalog to the tap and says "run a sync with this catalog now that I've updated it with only my certain streams selected"
Matt Menzenski
04/27/2023, 7:25 PM
Matt Menzenski
04/27/2023, 7:26 PM
pat_nadolny
04/27/2023, 9:12 PM
Matt Menzenski
05/01/2023, 4:09 AM