# troubleshooting
n
Hi all, I’m trying to set up my first pipeline locally to get data from postgres to snowflake. I am starting with the JSONL target just to test the first part of the pipeline and think I am missing something on the configuration front. Meltano keeps dumping all tables to my output directory even though I only have one in my select config:
- name: tap-postgres
  variant: transferwise
  pip_url: pipelinewise-tap-postgres
  config:
    dbname: swingleft
    user: meltano
    default_replication_method: FULL_TABLE
    filter_schemas: public
    select:
    - actblue_contribution.*
that select shows up in the config in my UI:
e
Yup, your `select` is nested in the plugin `config`, but it should be at the same level as the latter
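For reference, the suggested fix would look something like this in `meltano.yml` (a sketch based on the config pasted above, with `select` moved up to sit beside `config`):

```yaml
- name: tap-postgres
  variant: transferwise
  pip_url: pipelinewise-tap-postgres
  config:
    dbname: swingleft
    user: meltano
    default_replication_method: FULL_TABLE
    filter_schemas: public
  select:
  - actblue_contribution.*
```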
n
i actually just blew away the entire project and started from scratch, and I’m seeing the same stuff =/
when i move the select out one level in scope, i see `message=Selected streams: []` in the logs and no output
i first tried by adding the select with the cli
that moved it into the dev config (which would be ideal) but i got the same effect
e
So what cli command did you try?
n
meltano select tap-postgres actblue_contribution
and when i do that, it updates my dev config:
environments:
- name: dev
  config:
    plugins:
      extractors:
      - name: tap-postgres
        select:
        - actblue_contribution.*
- name: staging
- name: prod
but then the table does not show up in the UI for the tap configuration
e
Right, so the tap catalog (the `select` bit) and config are functionally separate, so the selected streams won't be displayed in that part of the UI. I'm not sure if they are shown somewhere else in the UI, though cc @alexmarple Another thing I'm not sure about is how environments (most relevant here, the default environment @cody_hanson @taylor ) play with the UI. It's not functionality we've tested together.
n
ok that’s good to know. I won’t worry about the UI for now, but if I do things properly, I should see the dump from that table in a json file in /output, is that correct?
e
That's correct
n
ok blew it all away and going to try just with the CLI
ok here’s the sequence of commands I ran:
meltano init meltano
cd meltano 
meltano add extractor tap-postgres
meltano config tap-postgres set host localhost
meltano config tap-postgres set port 5432
meltano config tap-postgres set user meltano
meltano config tap-postgres set password [[ PASSWORD ]]
meltano config tap-postgres set dbname [[ DBNAME ]]
meltano select tap-postgres [[ TABLENAME ]]
meltano add loader target-jsonl 
meltano run tap-postgres target-jsonl
output:
➜ meltano run tap-postgres target-jsonl
2022-04-22T22:37:37.240606Z [info     ] Environment 'dev' is active
2022-04-22T22:37:37.934829Z [info     ] commands                       commands={}
2022-04-22T22:37:38.089381Z [warning  ] No state was found, complete import.
2022-04-22T22:37:40.314796Z [info     ] time=2022-04-22 15:37:40 name=tap_postgres level=INFO message=Selected streams: [] cmd_type=elb consumer=False name=tap-postgres producer=True stdio=stderr string_id=tap-postgres
2022-04-22T22:37:40.644974Z [info     ] time=2022-04-22 15:37:40 name=tap_postgres level=INFO message=No streams marked as currently_syncing in state file cmd_type=elb consumer=False name=tap-postgres producer=True stdio=stderr string_id=tap-postgres
2022-04-22T22:37:40.678714Z [info     ] Block run completed.           block_type=ExtractLoadBlocks err=None set_number=0 success=True
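For anyone following along, the command sequence above should leave a project file roughly like the following. This is a sketch, not a dump of the actual file: the variant names are the Meltano defaults, the bracketed values stand in for the real ones, and sensitive settings like the password are normally written to `.env` rather than `meltano.yml` when set via the CLI:

```yaml
# meltano.yml (sketch of what the commands above would roughly produce)
plugins:
  extractors:
  - name: tap-postgres
    variant: transferwise
    pip_url: pipelinewise-tap-postgres
    config:
      host: localhost
      port: 5432
      user: meltano
      dbname: [[ DBNAME ]]
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
environments:
- name: dev
  config:
    plugins:
      extractors:
      - name: tap-postgres
        select:
        - [[ TABLENAME ]].*
- name: staging
- name: prod
```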
i just tried the following as well:
meltano schedule pg-to-json tap-postgres target-jsonl @once
meltano schedule run pg-to-json
And it had the same effect.
is it possible pyenv/virtualenvwrapper could be messing with meltano? not sure where it might be sourcing/writing files under the hood i suppose.
no difference with the standard virtualenv and system python 3. tried on 3.8 and 3.9. Stumped.
bingo. running `meltano select tap-postgres --list --all` showed the full list of possible tables. they were all prefixed with their schema like public-{{TABLENAME}} and when i changed it to public-{{TABLENAME}}.* in my config it worked. Oddly, I had tried this before because I noticed the naming convention of the files it was writing when it was doing the entire schema. No idea why, but at least this one is solved.
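The fix makes sense once you know that pipelinewise-tap-postgres names its streams `{schema}-{table}`, so the select pattern has to match that full, schema-prefixed stream id. A minimal sketch of the glob matching involved (illustrative only; Meltano's actual matcher lives in its catalog code, and the stream and property names here are made up):

```python
from fnmatch import fnmatch

def is_selected(pattern: str, stream: str, prop: str) -> bool:
    """Check a 'stream.property' glob pattern against a stream/property
    pair, the way select rules conceptually work."""
    return fnmatch(f"{stream}.{prop}", pattern)

# Stream ids from pipelinewise-tap-postgres are schema-prefixed,
# e.g. "public-mytable". A pattern without the prefix matches nothing:
assert not is_selected("mytable.*", "public-mytable", "id")
# With the schema prefix, it matches every property of that stream:
assert is_selected("public-mytable.*", "public-mytable", "id")
```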
t
glad you solved it! the `message=Selected streams: []` message was my clue and I was going to suggest reviewing what was actually selected 🙂