Hi When running the `tap-jtl-mssql` to connect to...
# troubleshooting
a
Hi! When running `tap-jtl-mssql` to connect to MSSQL, we are experiencing extremely long execution times during the `--discover` phase. According to the logs, the invocation:
2025-05-28T12:50:05.060658Z [debug] Invoking: [.../tap-jtl-mssql ... --discover]
takes 2–3 minutes to complete. It seems that the tap is trying to scan and analyze the entire database schema, even though a local catalog file is explicitly defined in the configuration:

catalog: /home/on/Projects/meltano_jtl_pim/project/tap_jtl_mssql/catalog.json
However, this file appears to be ignored, as the tap still initiates a full database scan via SQLAlchemy and launches discovery instead of using the pre-generated catalog.
Expected behavior: When a `catalog.json` is provided, the tap should skip the discovery process and use the existing catalog instead.
Questions:
1. Is there a way to forcefully prevent the `--discover` step and work only with the existing `catalog.json`?
2. Is there a Meltano or tap configuration option to disable automatic discovery if the catalog file is already available?
3. Is there a way to cache discovery results to avoid hitting the database repeatedly?
Environment:
• Meltano version: 3.7.6
• OS: Ubuntu 22.04
• Python: 3.12.3
e
> When a `catalog.json` is provided, the tap should skip the discovery process and use the existing catalog instead.
I can confirm that's the actual behavior. Where are you defining `catalog.json` in the context of the rest of the plugin config in `meltano.yml`?
> Is there a way to cache discovery results to avoid hitting the database repeatedly?
That is also already the case. Are you running Meltano in an ephemeral environment, e.g. Docker?
a
Example of our `meltano.yml`:
version: 1
send_anonymous_usage_stats: true
project_id: tap-mssql
default_environment: dev
venv:
  backend: uv
environments:
- name: dev
plugins:
  extractors:
  - name: tap-jtl-mssql
    namespace: tap_jtl_mssql
    pip_url: -e .
    capabilities:
    - state
    - catalog
    - discover
    - about
    - stream-maps

    # TODO: Declare settings and their types here:
    settings_group_validation:
    - [host, port, database, user, password, query]

    # TODO: Declare default configuration values here:
    settings:
    - name: host
      label: Host
      description: The DB Host

    - name: port
      label: Port
      description: The DB Port

    - name: database
      label: Database
      description: The DB Name

    - name: user
      label: User
      description: The DB User

    - name: password
      kind: string
      label: DB User Password
      description: DB User Password
      sensitive: true

    - name: query
      label: Query for select data from DB
      description: Query for select data from DB

    - name: driver
      label: Driver for DB connection
      description: Driver, default - ODBC Driver 17 for SQL Server

    # TODO: Declare required settings here:
    config:
      driver: ODBC Driver 18 for SQL Server
      host: 217.154.199.124
      #database: test-db
      database: eazybusiness
      user: sa
      #query: SELECT TOP 10 * FROM products
      query: |
        SELECT
            k.*,
            ks.*
        FROM tkategorie k
                 LEFT JOIN tkategoriesprache ks ON ks.kKategorie = k.kKategorie
        WHERE kOberKategorie = 0;
      catalog: ./catalog.json
    select:
    - '*.*'

##################################################################################
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
    config:
      do_timestamp_file: false
      destination_path: ./output

  - name: target-mysql
    variant: thkwag
    pip_url: thk-target-mysql
    config:
      user: user
      database: db
      password: pass
      host: mysql
      port: "3306"
> That is also already the case. Are you running Meltano in an ephemeral environment, e.g. Docker?
We have run it both in Docker and on a Linux PC.
Do you have any idea how to resolve this issue?
e
Ok, so `catalog` is nested incorrectly. It should look something like:
    config:
      driver: ODBC Driver 18 for SQL Server
      ..
    catalog: ./catalog.json
    select:
    - '*.*'
i.e. at the same level as `config`, not nested inside it.
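The nesting rule above can be checked mechanically. This is a minimal sketch, assuming the extractor entry from `meltano.yml` has already been parsed into a Python dict (for example with a YAML library, not shown here). The helper name `find_misnested_extras` and the `PLUGIN_EXTRAS` set are hypothetical, and the set deliberately lists only the two plugin-level keys discussed in this thread.

```python
# Hypothetical helper: flags plugin-level extras that were nested under `config`.
# Assumption: only the two extras discussed in this thread are listed, not a full list.
PLUGIN_EXTRAS = {"catalog", "select"}


def find_misnested_extras(plugin: dict) -> list[str]:
    """Return the names of extras found inside the plugin's `config` mapping."""
    config = plugin.get("config") or {}
    return sorted(key for key in config if key in PLUGIN_EXTRAS)


# Extractor entry shaped like the one posted above, trimmed to the relevant keys.
broken = {
    "name": "tap-jtl-mssql",
    "config": {"driver": "ODBC Driver 18 for SQL Server", "catalog": "./catalog.json"},
    "select": ["*.*"],
}
# Same entry with `catalog` moved up to the plugin level.
fixed = {
    "name": "tap-jtl-mssql",
    "config": {"driver": "ODBC Driver 18 for SQL Server"},
    "catalog": "./catalog.json",
}

print(find_misnested_extras(broken))
print(find_misnested_extras(fixed))
```

A key inside `config` is passed to the tap as one of its own settings, which is why a `catalog` placed there would not be picked up by Meltano as the catalog file to use.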
a
Many thanks. It works! However, we got the following error:
2025-06-10T10:08:33.786000Z [info ] Skipping deselected stream 'dbo-products'. cmd_type=elb consumer=False job_name=dev:tap-jtl-mssql-to-target-jsonl name=tap-jtl-mssql producer=True run_id=5fd6b1b0-b0d8-45d2-94c1-ca5fe8ab9e8f stdio=stderr string_id=tap-jtl-mssql
2025-06-10T10:08:33.834393Z [debug ] head producer completed first as expected name=tap-jtl-mssql
e
Can't have both `catalog` and `select` set. The `dbo-products` stream most likely doesn't have the `selected: true` metadata in the catalog file.
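To illustrate what that metadata looks like: in a Singer catalog, each stream carries a `metadata` list, and the entry with an empty `breadcrumb` describes the stream itself. This is a minimal sketch, assuming the catalog follows that layout and that the stream ID is `dbo-products`, as in the log above. The helper name `mark_selected` is hypothetical and the sample catalog is made up for illustration.

```python
import json


def mark_selected(catalog: dict, stream_id: str) -> bool:
    """Set `selected: true` on the stream-level metadata of one stream.

    Returns True if the stream was found, False otherwise.
    """
    for stream in catalog.get("streams", []):
        if stream.get("tap_stream_id") != stream_id:
            continue
        entries = stream.setdefault("metadata", [])
        # The stream-level entry is the one with an empty breadcrumb.
        for entry in entries:
            if entry.get("breadcrumb") == []:
                entry.setdefault("metadata", {})["selected"] = True
                return True
        # No stream-level entry yet: add one.
        entries.append({"breadcrumb": [], "metadata": {"selected": True}})
        return True
    return False


# Made-up sample catalog; a real one would be read from catalog.json with json.load().
catalog = {
    "streams": [
        {
            "tap_stream_id": "dbo-products",
            "schema": {"type": "object", "properties": {}},
            "metadata": [{"breadcrumb": [], "metadata": {"inclusion": "available"}}],
        }
    ]
}

found = mark_selected(catalog, "dbo-products")
print(found)
print(json.dumps(catalog["streams"][0]["metadata"], indent=2))
```

Property-level entries (those with a non-empty `breadcrumb`) may need the same flag, depending on how the tap treats unselected properties, so this sketch covers only the stream-level case.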