# troubleshooting
j
I added a new schema and set of tables to an existing mysql tap that's configured for incremental replication of an existing set of tables, and none of the new tables got picked up for extraction-- wondering if I missed a step somewhere, or if I need to run things w/a full refresh?
I just put them into meltano.yml manually + this is my production meltano instance that runs as a job in k8s using a postgres instance for state
d
I wonder if tap-mysql is using the incremental replication state file to only replicate those tables instead of scanning for available tables again
I assume you're not providing an explicit `catalog` that simply doesn't list the new tables? If you're using `select`, discovery should be running again every time and new streams should be discovered
(unless the tap thinks it's being clever by only syncing streams already in state)
j
it's just `select`, yeah--- no catalog
d
It would be useful to know whether running with `--full-refresh` does pick up the new tables, since that'd confirm the state file is the culprit
j
ok; running full refresh is a bit of a Thing b/c one of the tables is extremely large, but I'll see what I can do
d
Ok let's try something else then
Can you manually modify the state to add an entry for a new stream, and see if that causes it to be picked up?
j
mmm, like in postgres?
d
You can do that directly in the system DB, or by dumping the state using `meltano elt ... --dump=state`, modifying it, and passing it to `meltano elt ... --state=<state.json>`
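The "modify the dumped state" step could look roughly like the sketch below. It assumes the dumped file is a Singer state object with a top-level `bookmarks` key, and that tap-mysql names streams as `<schema>-<table>`; both are assumptions, and the table names `users` and `example_table` are made up for illustration.

```python
import json


def add_stream_bookmark(state, stream_id, bookmark=None):
    """Return a copy of a Singer state dict with an entry for stream_id.

    Assumes the state has (or may get) a top-level "bookmarks" mapping.
    An existing bookmark for stream_id is left untouched.
    """
    new_state = json.loads(json.dumps(state))  # deep copy via JSON round-trip
    bookmarks = new_state.setdefault("bookmarks", {})
    bookmarks.setdefault(stream_id, bookmark if bookmark is not None else {})
    return new_state


# Hypothetical dumped state with one existing incremental stream.
state = {
    "bookmarks": {
        "wg_static-users": {
            "replication_key": "updated_at",
            "replication_key_value": "2021-01-01T00:00:00",
        }
    }
}

# Add an empty bookmark for a (hypothetical) table in the new schema.
patched = add_stream_bookmark(state, "wg_mc-example_table")
print(json.dumps(patched, indent=2))
```

The printed JSON would be saved to a file and passed back via `--state`.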
j
looking in the system db, one sec
d
Does `meltano select --list <tap>` or `meltano elt ... --dump=catalog` include the new tables at all?
j
lemme see, one sec
nope, not there at all
d
Interesting. What about running `meltano invoke <tap> --discover` to do without Meltano's "cleverness" around state or discovery entirely?
If that doesn't pick them up, it's a tap-mysql (configuration) issue
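To check a dumped or discovered catalog for the new schema without eyeballing it, something like this sketch may help. It assumes the catalog is Singer-style JSON with a `streams` list whose entries carry a `tap_stream_id` of the form `<schema>-<table>`; the table names in the sample are hypothetical.

```python
import json
from collections import defaultdict


def streams_by_schema(catalog):
    """Group table names by schema, assuming tap_stream_id is '<schema>-<table>'."""
    grouped = defaultdict(list)
    for stream in catalog.get("streams", []):
        schema, _, table = stream.get("tap_stream_id", "").partition("-")
        grouped[schema].append(table)
    return dict(grouped)


# Hypothetical discovery output that only contains the old schema.
catalog = json.loads(
    '{"streams": [{"tap_stream_id": "wg_static-users"},'
    ' {"tap_stream_id": "wg_static-orders"}]}'
)

found = streams_by_schema(catalog)
expected_schemas = ["wg_static", "wg_mc"]
missing = [schema for schema in expected_schemas if schema not in found]
print("schemas missing from catalog:", missing)
```

If a schema shows up as missing here even for raw `--discover` output, the cause sits on the tap or database side rather than in Meltano.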
j
ok, lemme see
also i'm in the postgres db; where should I be looking for stuff?
I see e.g. the `job` table
d
`job` will have one row per job/pipeline run
Each row will have its own singer state payload
The last one will be picked up on the next run with the same Job ID
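As a rough illustration of "the last payload for a given Job ID", here is a sketch against an in-memory SQLite stand-in. The real system DB here is Postgres, and the column set (`job_id`, `state`, `ended_at`, `payload`), the `singer_state` wrapper, and the Job ID `mysql-to-warehouse` are all simplifying assumptions, so check the actual table before relying on any of it.

```python
import json
import sqlite3

# In-memory stand-in for the system DB; the real `job` table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    "id INTEGER PRIMARY KEY, job_id TEXT, state TEXT, ended_at TEXT, payload TEXT)"
)


def make_payload(value):
    # Assumed payload shape: Singer state nested under "singer_state".
    return json.dumps(
        {"singer_state": {"bookmarks": {"wg_static-users": {"replication_key_value": value}}}}
    )


conn.executemany(
    "INSERT INTO job (job_id, state, ended_at, payload) VALUES (?, ?, ?, ?)",
    [
        ("mysql-to-warehouse", "SUCCESS", "2021-03-01T00:10:00", make_payload(100)),
        ("mysql-to-warehouse", "SUCCESS", "2021-03-02T00:10:00", make_payload(250)),
        ("other-job", "SUCCESS", "2021-03-03T00:10:00", make_payload(999)),
    ],
)


def latest_payload(conn, job_id):
    """Return the most recent non-empty payload for job_id, or None."""
    row = conn.execute(
        "SELECT payload FROM job "
        "WHERE job_id = ? AND payload IS NOT NULL "
        "ORDER BY ended_at DESC LIMIT 1",
        (job_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


latest = latest_payload(conn, "mysql-to-warehouse")
print(json.dumps(latest, indent=2))
```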
j
gonna clone the repo on the bastion so it's easier to play with stuff
d
Did you try `meltano invoke <tap> --discover` already?
j
I did; it did not look like it was picking up the new tables
they're in a different schema, and I wonder if that's part of the problem
should be in a better spot to debug this in a moment
d
👍
Did you set the `filter_dbs` setting by any chance?
Because that'd definitely explain it 😄
j
I did!
d
Unlike the name suggests, "dbs" means "schemas" here
j
but I thought I added it in
d
`filter_dbs`: Comma separated list of schemas to extract tables only from particular schemas and to improve data extraction performance
j
it used to be `wg_static`, and now it's `wg_static,wg_mc`
d
I think we've ruled out Meltano though, if even `meltano invoke <tap> --discover` isn't including the tables in question
Unless it's a config issue
Try `meltano config <tap>` so we can verify `filter_dbs`
It's this query for `information_schema.tables` that determines the discovery output
And it would appear that that query is excluding your new tables
I'm off for the day, good luck debugging!
Let me know if it turns out to be a Meltano issue after all 😄 Or an opportunity to document tap-mysql better
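For context, the way a comma-separated `filter_dbs` value could narrow a discovery query over `information_schema.tables` can be sketched as follows. This is an approximation written for illustration, not the tap's actual code; the function name, the default exclusion list, and the use of `%s` placeholders are assumptions.

```python
def build_discovery_query(filter_dbs=None):
    """Build a table-discovery query, optionally limited to the given schemas.

    filter_dbs is a comma-separated string such as "wg_static,wg_mc".
    Returns (sql, params) with %s placeholders for a MySQL driver.
    """
    if filter_dbs:
        schemas = [name.strip() for name in filter_dbs.split(",") if name.strip()]
        placeholders = ", ".join(["%s"] * len(schemas))
        where = f"WHERE table_schema IN ({placeholders})"
        params = tuple(schemas)
    else:
        # Without a filter, skip MySQL's built-in system schemas.
        where = (
            "WHERE table_schema NOT IN "
            "('information_schema', 'performance_schema', 'mysql', 'sys')"
        )
        params = ()
    sql = f"SELECT table_schema, table_name FROM information_schema.tables {where}"
    return sql, params


sql, params = build_discovery_query("wg_static,wg_mc")
print(sql)
print(params)
```

Even with both schemas in the filter, MySQL only returns `information_schema.tables` rows for objects the connecting user has privileges on, so a correct filter can still yield no rows for a schema.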
j
i'm starting to bet it's a privilege issue for the user i'm using...one sec to let me confirm that!
sorry for the hassle!
YES-- that's the problem! So sorry Douwe-- thanks for your help!!
d
Glad we figured it out!
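A quick way to confirm this kind of privilege gap is to compare the schemas in `filter_dbs` against the output of `SHOW GRANTS FOR CURRENT_USER`. The sketch below parses grant lines that have already been fetched as strings; the user name `meltano` and the sample lines are hypothetical, and the parsing is deliberately simple (it ignores wildcard patterns in schema names, roles, and table-level grants).

```python
import re

# Matches e.g. "GRANT SELECT ON `wg_static`.* TO ..." or "GRANT ALL ... ON *.* TO ..."
GRANT_LINE = re.compile(r"^GRANT\s+(.+?)\s+ON\s+(`[^`]+`|\*)\.", re.IGNORECASE)


def schemas_with_grants(grant_lines):
    """Return schema names that have some privilege; '*' means all schemas."""
    schemas = set()
    for line in grant_lines:
        match = GRANT_LINE.match(line.strip())
        if not match:
            continue
        privileges, target = match.groups()
        if privileges.strip().upper() == "USAGE":
            continue  # USAGE means "no privileges" in MySQL
        if target == "*":
            schemas.add("*")
        else:
            # Underscores may appear escaped in grant output.
            schemas.add(target.strip("`").replace("\\_", "_"))
    return schemas


# Hypothetical SHOW GRANTS output for the extraction user.
grant_lines = [
    "GRANT USAGE ON *.* TO `meltano`@`%`",
    "GRANT SELECT ON `wg_static`.* TO `meltano`@`%`",
]

granted = schemas_with_grants(grant_lines)
wanted = ["wg_static", "wg_mc"]
missing = [s for s in wanted if "*" not in granted and s not in granted]
print("schemas without grants:", missing)
```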