How can we test Meltano end-to-end? Thinking about...
# getting-started
a
How can we test Meltano end-to-end? Thinking about using WireMock and HTTP_PROXY to fake the source APIs, then writing to CSV or a temporary BigQuery dataset. Then we’d assert on the resulting data (not sure what language/framework to use for that). Is there a better approach? Anything built-in to make this easier?
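For the "assert on the resulting data" step, plain Python may be enough. Below is a minimal sketch, assuming the target wrote a CSV with a header row; `load_rows`, `check_output`, and the column names are hypothetical helpers for illustration, not anything Meltano provides.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text (as a CSV target might write it) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def check_output(rows, required_columns, key):
    """Return a list of problems found; an empty list means the checks passed."""
    if not rows:
        return ["no rows were loaded"]
    problems = []
    # Every required column must be present in the header.
    missing = set(required_columns) - set(rows[0].keys())
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    # The key column must be unique across rows.
    keys = [row.get(key) for row in rows]
    if len(keys) != len(set(keys)):
        problems.append(f"duplicate values in {key}")
    return problems
```

In a pytest test this could be used as `assert check_output(load_rows(path.read_text()), ["id", "name"], "id") == []`, where `path` points at the file the target produced.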
a
I think dbt tests are a great way to write tests in general, and they can be used at every layer of your pipeline. In addition to writing tests on the final product, I regularly build dbt tests for all my assumptions about the raw source table data as well, which helps flag issues during extract-load before downstream consumption by individual transforms.
A few other practices which you might already have considered, but which I think are helpful: (1) ideally find a way to extract-load just a slice of the full data to make EL tests faster, (2) ideally parameterize your EL landing zone so you can have "disposable" landing environments keyed by a CI build number (for instance), (3) consider dbt overrides to create everything as a view in your first round of tests, which can give you compile failures faster than actually transforming data.
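Point (2) could look something like the sketch below: derive the landing dataset name from the CI build number so each run gets its own throwaway target. The `CI_BUILD_NUMBER` variable and the `meltano_ci` prefix are assumptions; substitute whatever your CI system actually exposes.

```python
import os
import re


def landing_dataset(base="meltano_ci", env=None):
    """Build a per-build landing dataset name such as 'meltano_ci_1234'."""
    env = os.environ if env is None else env
    # Fall back to "local" when running outside CI.
    build = env.get("CI_BUILD_NUMBER", "local")
    # BigQuery dataset names allow only letters, digits, and underscores.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", build)
    return f"{base}_{safe}"
```

The resulting name can then be handed to the loader's dataset setting before the run, and the dataset dropped once the build finishes.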
a
Thank you!
Trying to wrap meltano elt in VCR.py, but maybe the tap is running in a subprocess(?) because VCR.py can’t see all the HTTP requests. The only request captured is to https://www.meltano.com/discovery.yml
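import pytest
from click.testing import CliRunner
from meltano.cli import cli  # Meltano's Click entry point

@pytest.mark.vcr()
def test_elt():
    runner = CliRunner()
    result = runner.invoke(cli, ['elt', 'tap-gitlab', 'target-jsonl', '--job_id=gitlab-to-jsonl', '--full-refresh'])
    print(result.output)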
Taps run in subprocesses so VCR.py can’t record & replay the requests this way. Any thoughts about how to run VCR from inside that subprocess?
j
Testing taps with VCR is a neat idea. Those API calls run in the tap's Python process, so you'd need to use it there somehow. Maybe you could make a "wrapper" tap that imports your actual tap code and calls it from within VCR, then use that tap in your Meltano ELT test.
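A rough sketch of what such a wrapper might look like, assuming the real tap exposes a `main()` function (the default `tap_gitlab:main` here is a guess at tap-gitlab's entry point) and that vcrpy is installed in the tap's environment. The `WRAPPED_TAP_ENTRY` and `VCR_CASSETTE` environment variables are made up for this example.

```python
import importlib
import os


def resolve(entry_point):
    """Turn a 'package.module:function' string into the callable it names."""
    module_name, _, func_name = entry_point.partition(":")
    return getattr(importlib.import_module(module_name), func_name)


def main():
    # Imported lazily so this module still imports where vcrpy is absent.
    import vcr

    # Hypothetical settings: which tap to wrap and where to store the cassette.
    tap_main = resolve(os.environ.get("WRAPPED_TAP_ENTRY", "tap_gitlab:main"))
    cassette = os.environ.get("VCR_CASSETTE", "cassettes/tap.yaml")

    # record_mode="once" records on the first run and replays afterwards.
    with vcr.use_cassette(cassette, record_mode="once"):
        # sys.argv is left untouched, so the arguments Meltano passes to the
        # wrapper executable are seen by the real tap.
        tap_main()
```

The wrapper would be packaged with a console-script entry point that calls `main()` and registered in Meltano as a custom extractor, so the recording happens inside the subprocess where the HTTP calls are actually made.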
a
(Late update) we ended up doing a proof of concept that captured traffic with tcpdump to assert on it