# contributing
n
Hi, we are running into an issue where tap-snowflake takes about 90 seconds to execute `meltano invoke custom-snowflake-tap --discover`. We provide `select`, `tables`, and `schema` in the config, so discovery should be relatively quick; 90 seconds is far too long. The tap states that when tables are provided in the configuration, discovery should be limited to the schema and tables specified. However, based on the logs, the tap is introspecting the entire database: it goes through every table and every view in every schema the user has access to. This looks like a tap-snowflake bug, and we want to address it ASAP so our customers can have a better experience. @pat_nadolny It seems there is an outstanding issue filed in tap-snowflake that would address this. Does the Meltano team have an ETA on addressing it? Here are the configurations for your reference.
```yaml
- config:
    account: <redacted>
    database: CUSTOMER_DB
    password: <redacted>
    schema: customer_schema
    user: <redacted>
    warehouse: CUSTOMER_WH
    tables:
      - customer_schema.table_name
  inherit_from: tap-snowflake
  metadata: {}
  name: custom-snowflake-tap
  schema: {}
  select:
    - customer_schema-table_name.*
  select_filter: []
```
Snippet of logs during discovery:
```
2023-11-02 18:33:56,016 | INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_google_analytics]
2023-11-02 18:33:56,085 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,085 | INFO | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,086 | INFO | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,136 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,137 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,138 | INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_insights_proudmoments]
2023-11-02 18:33:56,252 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,253 | INFO | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,253 | INFO | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,307 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,307 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,308 | INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_insights_proudmoments]
2023-11-02 18:33:56,466 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,467 | INFO | snowflake.connector.cursor | Number of results in first chunk: 6
2023-11-02 18:33:56,467 | INFO | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,517 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,517 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,518 | INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_keyword_kabbage]
2023-11-02 18:33:56,615 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,615 | INFO | snowflake.connector.cursor | Number of results in first chunk: 2
2023-11-02 18:33:56,616 | INFO | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,686 | INFO | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,687 | INFO | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,687 | INFO | snowflake.connector.cursor | query: […
```
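To quantify how far discovery strays from the configured schema, a small script can list the schemas named in the `SHOW TABLES` / `SHOW VIEWS` queries. This is only a rough sketch: the function name `introspected_schemas` and the constant `CONFIGURED_SCHEMA` are made up for illustration, and the regex assumes the log lines keep the exact query format shown in the snippet above.

```python
import re

# Matches the SHOW TABLES / SHOW VIEWS statements seen in the discovery log.
SHOW_QUERY = re.compile(
    r"query: \[SHOW /\* sqlalchemy:get_(?:table|view)_names \*/ "
    r"(?:TABLES|VIEWS) IN ([^\]\s]+)\]"
)

# Assumption: the only schema we expect discovery to touch (from the config).
CONFIGURED_SCHEMA = "customer_schema"


def introspected_schemas(log_text: str) -> list[str]:
    """Return the distinct schema names that were introspected, in first-seen order."""
    seen: dict[str, None] = {}
    for match in SHOW_QUERY.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)


# Abbreviated lines from the log excerpt (timestamps omitted).
sample = """\
INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_google_analytics]
INFO | snowflake.connector.cursor | query: [ROLLBACK]
INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_insights_proudmoments]
INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_insights_proudmoments]
INFO | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_keyword_kabbage]
"""

schemas = introspected_schemas(sample)
unexpected = [s for s in schemas if s.lower() != CONFIGURED_SCHEMA]
print(f"{len(schemas)} schemas introspected, {len(unexpected)} outside the config")
```

Running this over the full discovery log would show how many schemas other than `customer_schema` are being scanned.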
Hi @pat_nadolny, would appreciate your input when you have a sec!
p
Hey @nidhi_kakulawaram, I personally won't have time to debug it soon, but from a quick look it seems that iterating over all schemas might be contributing to the issue. In https://github.com/MeltanoLabs/tap-snowflake/blob/59287ca8a1ddefb735df9b43674db95765bf3b94/tap_snowflake/client.py#L91 we have to iterate the schemas before filtering out the tables in each. We use tap-snowflake in the squared project, which has quite a few tables too, and I haven't noticed a major slowdown, but I'd have to double check. My comment in https://github.com/MeltanoLabs/tap-snowflake/issues/23#issuecomment-1565784344 makes me think that it was resolved once I was properly defining the table name. Here are our table config and base config for reference.
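For illustration, one way the schema loop could be narrowed is to derive the wanted schemas from the `tables` setting and skip everything else before any per-schema query runs. This is a minimal sketch, not the tap's actual code: `schemas_from_tables` and `schemas_to_inspect` are hypothetical helpers, and the case-insensitive comparison assumes unquoted Snowflake identifiers.

```python
def schemas_from_tables(tables: list[str]) -> set[str]:
    """Collect the schema part of each 'schema.table' entry, lowercased."""
    schemas: set[str] = set()
    for entry in tables:
        parts = entry.split(".")
        if len(parts) >= 2:  # entries without a schema prefix are ignored
            schemas.add(parts[-2].lower())
    return schemas


def schemas_to_inspect(all_schemas: list[str], tables: list[str]) -> list[str]:
    """Keep only the schemas referenced by the configured tables.

    Falls back to every schema when no schema-qualified table is configured.
    """
    wanted = schemas_from_tables(tables)
    if not wanted:
        return list(all_schemas)
    return [schema for schema in all_schemas if schema.lower() in wanted]


# Schema names taken from the log excerpt plus the configured one.
all_schemas = [
    "test_google_analytics",
    "test_insights_proudmoments",
    "CUSTOMER_SCHEMA",
    "test_keyword_kabbage",
]
print(schemas_to_inspect(all_schemas, ["customer_schema.table_name"]))
```

With a filter like this applied ahead of the per-schema table and view lookups, only the configured schema would trigger `SHOW TABLES` / `SHOW VIEWS` queries.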