# troubleshooting
f
I'm trying to speed up runs by running --discover on the tap, saving the catalog to a file, and then specifying that file in the catalog entry for the tap (in the meltano.yml file). The thought is that it won't have to do a discovery on every run if a catalog is specified. When I do that, I don't get any errors, but it also doesn't extract anything. Debug output provides no useful messages. Is the file produced by --discover supposed to be usable directly as a catalog entry in meltano.yml?
v
Instead of running --discover directly, try running
meltano invoke --dump=catalog tap-name
Then diff the catalog from this against the one you dumped, and you'll probably see the difference 😄
f
selected false? I have select in the meltano.yml file. You'd think that would override what was in the catalog, no?
v
My point is: which catalog? What command did you use when you say "by running a --discover"?
f
meltano invoke tap-mssql --discover > catalog.json
v
meltano invoke --dump=catalog tap-name
is different, as it calls discovery for you and adds the meltano magic sauce,
i.e. your selects.
The idea behind it (I believe) is that you don't want magic where it's not expected. If you call an executable like tap-mssql and pass it --discover, it's expected to be the vanilla --discover run.
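To make the difference between the two catalogs visible without reading a raw diff, here is a minimal Python sketch. It assumes both files follow the usual Singer catalog layout, where each stream carries a `metadata` list and the entry with an empty `breadcrumb` holds the stream-level `selected` flag. The helper names (`load_catalog`, `selected_streams`, `selection_diff`) are made up for illustration and are not part of Meltano or the tap.

```python
import json
from typing import Set


def load_catalog(path: str) -> dict:
    """Read a catalog JSON file from disk."""
    with open(path) as fh:
        return json.load(fh)


def selected_streams(catalog: dict) -> Set[str]:
    """Return the tap_stream_ids whose stream-level metadata has selected == True.

    Assumption: selection lives in the metadata entry with an empty breadcrumb.
    A missing flag is treated as "not selected".
    """
    selected = set()
    for stream in catalog.get("streams", []):
        for entry in stream.get("metadata", []):
            is_stream_level = entry.get("breadcrumb") == []
            if is_stream_level and entry.get("metadata", {}).get("selected") is True:
                selected.add(stream.get("tap_stream_id"))
    return selected


def selection_diff(raw: dict, dumped: dict) -> Set[str]:
    """Streams selected in the Meltano-dumped catalog but not in the raw one."""
    return selected_streams(dumped) - selected_streams(raw)
```

If `selected_streams(load_catalog("catalog.json"))` comes back empty for the file produced by the plain --discover run, that would be consistent with the extract silently doing nothing.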
f
Hmm. Must have missed that somewhere in the docs. So are you saying this is the correct process:
• Configure the tap with minimal config so you can do a discovery to see what is out there.
• Add the select, metadata, and anything else you want to change in the meltano.yml file.
• Run meltano invoke --dump=catalog tap-whatever > catalog.json
• Add catalog: catalog.json in the meltano.yml
• Meltano will now pass the catalog.json to the tap, ignoring any select, metadata, whatever in the meltano.yml file, but you probably want to keep it there so you can re-generate the catalog.json if necessary.
Is that about right? Not sure I believe that is the best process, but if that's the process then I can work with that. I'd think/hope that meltano would start with the catalog specified, update it with any changes in the meltano.yml file, and pass that to the tap...
v
Well, the best practice, I believe, is to generate the catalog on every run with discovery. You want to avoid that because the discovery process takes time on every run. So the best practice would be:
1. Configure the tap
2. Add select/metadata/etc.
3. Schedule your meltano tap (elt, etc.)
For overriding the catalog I think that's right @fred_reimer
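Since a cached catalog has to be re-generated by hand whenever the select or metadata rules change, a small helper can flag when that is due. The following is only a sketch under stated assumptions: it treats the catalog as stale when meltano.yml has been modified more recently than catalog.json, and it shells out to the `meltano invoke --dump=catalog` command mentioned above. The function names are hypothetical, and a schema change in the source database would not be detected by this check.

```python
import os
import subprocess


def catalog_is_stale(project_file: str, catalog_file: str) -> bool:
    """True when the cached catalog is missing or older than the project file."""
    if not os.path.exists(catalog_file):
        return True
    return os.path.getmtime(project_file) > os.path.getmtime(catalog_file)


def regenerate_catalog(tap_name: str, catalog_file: str) -> None:
    """Capture `meltano invoke --dump=catalog <tap>` and write it to catalog_file.

    The output is captured first and written only if the command succeeds,
    so a failed run does not truncate an existing catalog.
    """
    result = subprocess.run(
        ["meltano", "invoke", "--dump=catalog", tap_name],
        check=True,
        capture_output=True,
        text=True,
    )
    with open(catalog_file, "w") as fh:
        fh.write(result.stdout)
```

A scheduled job could call `catalog_is_stale("meltano.yml", "catalog.json")` before each run and call `regenerate_catalog("tap-mssql", "catalog.json")` only when it returns True, so discovery is paid for only after the project file changes.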
f
OK, thanks. Enjoy your weekend!