Hey love the concept for using a schema around taps amp targ Meltano #plugins-general

sam_woolerton

09/04/2020, 8:52 AM

Hey, love the concept for using a schema around taps & targets so that community infra can be built around it. I need to get data out of a few niche applications (construction software), so I'm happy to share these with the community once I've got them working. There are just a few implementation details holding me up.

I have a custom tap I'm building for SignOnSite that works perfectly with `meltano invoke tap-signonsite --properties path/to/properties.json` (it doesn't work unless properties is specified; I'm assuming that's intended behaviour). It doesn't work as part of a pipeline in the Meltano UI though (just to target-csv, so it couldn't be simpler). It gives 2 errors:

• `invalid catalog output` - I copied the base for my tap from `tap-github`, so I assumed that it would work. This looks the same as `tap-stripe` too, so any insight on what's wrong? It results in the JSON catalog being dumped to the console when run.
• `ERROR: unable to parse` - it can't parse the arguments for some reason. It works fine with `meltano invoke` as noted above, so does the UI pass the arguments differently?

Also, I'll likely want to use `meltano elt` once this is working to schedule the pipeline myself. Where does it get the properties file from? When I `meltano invoke`, I'm specifying the path, but it doesn't look like there's an option for that with `meltano elt`.

al_whatmough

09/04/2020, 10:35 AM

What happens if you run `meltano invoke tap-signonsite --discover`?

douwe_maan

09/04/2020, 2:30 PM

@sam_woolerton When using `meltano elt`, `meltano invoke`, or running a pipeline from the UI, you don't need to provide a properties file explicitly as long as the tap supports discovery mode. As it says on https://meltano.com/docs/integration.html#selecting-entities-and-attributes-for-extraction:

"Whenever an extractor is run using `meltano elt` or `meltano invoke`, Meltano will generate the desired catalog on the fly by running the tap in discovery mode and applying the selection, metadata, and schema rules to the resulting catalog file before passing it to the tap in sync mode."

I assume you've specified that the tap supports the `discover` and `properties` capabilities? (Note that `--properties` is considered deprecated, and new taps should use `--catalog`: https://github.com/singer-io/getting-started/blob/master/docs/SYNC_MODE.md)

In your case, it looks like this is not working correctly because Meltano considers the discovered catalog (the result of `meltano invoke tap-signonsite --discover`) invalid. Your discovery code looks OK, but can you please share the discovered catalog so that we can help you figure out why Meltano might still not like it?

The "unable to parse" error you're seeing `target-csv` print originates here: https://github.com/singer-io/target-csv/blob/1b73164ae7482a7f5dc625f2b08e85b7410e5473/target_csv.py#L51

The fact that nothing was printed after the colon suggests that the target received an empty line on its stdin, while all lines output by the tap are expected to be JSON-encoded Singer messages. Do you know why your tap may be outputting blank lines? If your tap is also outputting extra lines when you run `--discover`, that may explain why Meltano fails to parse the output, since it expects the entire output to be valid JSON, which it may not be in your case.

You may want to run `meltano elt` in debug mode (https://meltano.com/docs/command-line-interface.html#debugging), so that you can see each line output by the tap, as well as the exact way the tap and target are invoked, including their command line arguments (`--properties` etc).
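As a rough illustration of the two expectations described above (every line of sync output is a JSON-encoded message, and the whole discovery output is one JSON document), here is a small Python sketch for checking captured tap output. The helper names `find_non_json_lines` and `parses_as_single_json_document` and the sample stream `sites` are made up for this example; this is not how Meltano or target-csv implement their checks.

```python
import json


def find_non_json_lines(output: str):
    """Return (line_number, text) pairs for lines that are not valid JSON.

    Hypothetical helper: mirrors the idea that a target expects every
    line on its stdin to be a JSON-encoded message.
    """
    bad = []
    for number, line in enumerate(output.splitlines(), start=1):
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append((number, line))
    return bad


def parses_as_single_json_document(output: str) -> bool:
    """Return True if the entire output parses as one JSON document.

    Hypothetical helper: mirrors the idea that discovery output as a
    whole must be valid JSON.
    """
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True


# Made-up sync output with a stray debug print and a blank line mixed in.
sample = "\n".join([
    '{"type": "SCHEMA", "stream": "sites", "schema": {}, "key_properties": []}',
    "fetching sites...",
    "",
    '{"type": "RECORD", "stream": "sites", "record": {"id": 1}}',
])

print(find_non_json_lines(sample))
print(parses_as_single_json_document('entering discover\n{"streams": []}'))
```

Feeding it the saved output of a sync or discovery run should point at any stray lines, such as debug prints or blank lines.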

douwe_maan

09/04/2020, 3:09 PM

Since the "invalid catalog output" error is pretty uninformative, I've made this change to print the full error message when extractor catalog discovery fails, which will land in the next release: https://gitlab.com/meltano/meltano/-/merge_requests/1849

sam_woolerton

09/04/2020, 9:41 PM

Ok solved the catalog issue - I had some print statements so I could track control flow, and hadn't clicked that they would also go to stdout (I started as a web dev so still kind of new to stdin/stdout concepts). Commented those print lines out and the catalog issue is solved thanks

douwe_maan

09/04/2020, 9:42 PM

@sam_woolerton Great! Yeah, for logging you're better off using STDERR. Those messages will be reflected in the `meltano elt` output as well 🙂
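A minimal Python sketch of that split, assuming the tap is written in Python: debug messages go to stderr, and only JSON-encoded messages go to stdout. The helper names `log` and `write_message` and the example stream `sites` are hypothetical and not part of any Singer library.

```python
import json
import sys


def log(message: str) -> None:
    """Write a debug message to stderr so it never mixes with tap output."""
    print(message, file=sys.stderr)


def write_message(message: dict) -> None:
    """Write one JSON-encoded message per line to stdout."""
    sys.stdout.write(json.dumps(message) + "\n")
    sys.stdout.flush()


if __name__ == "__main__":
    log("fetching sites")  # goes to stderr
    write_message({"type": "RECORD", "stream": "sites", "record": {"id": 1}})  # goes to stdout
```

Python's standard `logging` module is another option here, since its `StreamHandler` writes to stderr by default.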

sam_woolerton

09/04/2020, 10:54 PM

Once I commented out the rest of my `print` statements, the pipeline itself ran perfectly too! Thanks for the help
