I'm getting some strange behavior in target-snowflake...
# singer-targets
d
I'm getting some strange behavior in target-snowflake, wondering if anyone else might have run into this before. It's been working for a while now, but all of a sudden it stopped working in the last week or two. Basically, the loader gets stuck here:
2024-08-29 00:17:41,119 | INFO     | target-snowflake     | Target 'target-snowflake' is listening for input from tap.
2024-08-29 00:17:41,120 | INFO     | target-snowflake     | Initializing 'target-snowflake' target sink...
2024-08-29 00:17:41,120 | INFO     | target-snowflake.dim-AssetGroup | Initializing target sink for stream 'dbo-Test'...
2024-08-29 00:17:41,130 | INFO     | snowflake.connector.connection | Snowflake Connector for Python Version: 3.12.1, Python Version: 3.12.5, Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
2024-08-29 00:17:41,130 | INFO     | snowflake.connector.connection | Connecting to GLOBAL Snowflake domain
2024-08-29 00:17:41,130 | INFO     | snowflake.connector.connection | This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
It seems to just hang there indefinitely. The strange thing is that it works perfectly fine on Windows, as well as on our app server, which is also running Ubuntu. I've tried clearing out any Windows paths from PATH in WSL Ubuntu, clearing all non-relevant paths from the PATH variable, doing a fresh clone of the repo, reinstalling Ubuntu, and clearing PATH/env again on the new install. I can't seem to figure out what changed. Even the code from the same version as before, when it was working, is just stuck at this same spot. Next step is trying to debug this, but I haven't got that working yet.
v
What's the full command you're running? I wonder if it's not the target, but the tap just isn't sending any data.
I've had this happen in some quick python scripts I threw together 3-4 years ago where I called requests directly and didn't add a timeout
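For reference, a minimal sketch of the "no timeout" failure mode mentioned above. It uses the standard library's urllib as a stand-in for requests (requests.get accepts a similar timeout= argument); the helper name fetch_with_timeout and the 10-second default are illustrative assumptions, not anything from the target's code.

```python
import urllib.request


def fetch_with_timeout(url, timeout=10.0):
    """Fetch url and return the body bytes, or None if it fails or times out.

    Hypothetical helper for illustration. Without the timeout argument,
    a server that accepts the connection but never answers would block
    this call indefinitely.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:  # URLError and TimeoutError are OSError subclasses
        print(f"request to {url} failed: {exc!r}")
        return None
```

With the timeout in place, a stalled server turns into a logged failure after a bounded wait instead of a silent hang.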
d
It shouldn't be the tap.
cat output.json | meltano --log-level=debug invoke target-snowflake
I actually left this overnight and it's still hung up on the same spot
v
Makes sense, definitely something target related. What I'd want to know now is what the Python function is doing while it's hanging. We should be able to get a stack trace right now while it's hung
one sec
d
Oh, is that possible? I was going to set up the debug to try to figure this out. It would be nice if there was some easier way of knowing what it's hung up on. It's quite a hassle to set up linux vscode within WSL and the resolution seems odd (feels slightly squished)
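One way to see where a Python process is stuck without attaching a debugger is the standard library's faulthandler module. The sketch below is an assumption about how it could be wired in, for example near the top of the target's entry point; the function names and the 60-second delay are hypothetical.

```python
import faulthandler
import signal
import sys


def enable_hang_diagnostics(dump_after_seconds=60.0, file=sys.stderr):
    """Dump every thread's stack if the process is still running after the delay."""
    # Where the platform supports it, also dump on demand: kill -USR1 <pid>
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=file, all_threads=True)
    # A watchdog thread writes the tracebacks, so this can still fire
    # while the main thread is blocked inside a C call.
    faulthandler.dump_traceback_later(dump_after_seconds, repeat=False, file=file)


def disable_hang_diagnostics():
    """Cancel the pending dump and remove the signal hook, if any."""
    faulthandler.cancel_dump_traceback_later()
    if hasattr(faulthandler, "unregister") and hasattr(signal, "SIGUSR1"):
        faulthandler.unregister(signal.SIGUSR1)
```

Calling enable_hang_diagnostics(60) before the connection is opened should print the Python-level stack of each thread to stderr once the minute is up, which would show the frame the loader is sitting in.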
v
d
Is my command the one with the correct pid? I see a lot of other ones spun up with the format of target-snowflake --config ...config.json
v
I think I'm close but I don't have time to keep going
{
  "name": "Python Debugger: Attach using Process Id",
  "type": "debugpy",
  "request": "attach",
  "processId": "${command:pickProcess}"
}
I needed gdb, so I ran sudo apt install gdb, but then I was getting a segfault and didn't dig any further 🤷
d
No problem, let me try the other way first using vscode since I already started setting it up there
v
https://github.com/microsoft/debugpy/issues/882#issuecomment-1092139563 looks to be what I was hitting but I have to stop 😄
e
Even the code from the same version as before, when it was working, is just stuck at this same spot. Next step is trying to debug this, but I haven't got that working yet.
That's really weird. I'd try adding snowflake-connector-python==3.12.0 to the pip_url and see if that fixes it, because that'd be a recent thing that changed and we don't pin that dependency.
d
What's the correct syntax for adding that?
- name: target-snowflake
  variant: meltanolabs
  pip_url: meltanolabs-target-snowflake
v
- name: target-snowflake
  variant: meltanolabs
  pip_url: meltanolabs-target-snowflake snowflake-connector-python==3.12.0
d
No difference, deleted .meltano dir, ran install, and tried again
e
Hmm, maybe downgrade meltanolabs-target-snowflake too?
pip_url: meltanolabs-target-snowflake==0.9.1 snowflake-connector-python==3.12.0
d
Is there a command to verify the loader version installed?
e
Not currently, other than maybe .meltano/loaders/target-snowflake/venv/bin/pip list.
If pinning explicitly doesn't work, I'd try using the uv venv backend and run UV_EXCLUDE_NEWER=<some date when things were working> meltano install --clean to get all the dependencies as they were at a point in time.
d
Interesting, docs say there's a version here
Usage: target-snowflake [OPTIONS]

  Execute the Singer target.

Options:
  --version                 Display the package version.
but all it shows is target-snowflake v[could not be detected].
e
Ah, yeah that's right. We're failing to detect and that's a 🐛
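Until --version is fixed, the installed versions can also be read with the standard library's importlib.metadata. This is only a sketch: the helper name installed_versions is made up for illustration, and it assumes the script is run with the interpreter from the plugin's own virtual environment (presumably the python next to the pip at .meltano/loaders/target-snowflake/venv/bin/).

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip_url discussed in this thread.
    wanted = ["meltanolabs-target-snowflake", "snowflake-connector-python"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Running it with the system interpreter instead would report on the wrong environment, so a None result there does not mean the pin failed.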
d
I think pinning is working, but I will see if the date approach works
Made some progress, though still no idea what the problem is. The line that it gets stuck on is in network.py, on line 739: ret = self.fetch(...). It looks like all this does is call a REST API, so I copied the URL, headers, and data into Postman on Windows, and of course, it works. I then translated it into curl and ran it in WSL, and it gets stuck. Then I SSH'ed into a random Linux box and it works there too. So it's something with the connection, but I'm not sure where yet. I think it's probably safe to say that since curl doesn't work either, the issue is not with Meltano. Still very strange though
v
network.py from which library? I'd like to peek
Really a timeout should be happening so even if it does get "stuck" the timeout should hit and fail
v
From diving in, it looks like getting the Python logger set to debug logging for that module would do some good here. There are also a few code paths that don't have a timeout set, so a full stack trace at the place the job is stuck would be helpful
no fun though for sure 😕
• NOTE: this has not been tested extensively, but has been shown to improve the experience when using WSL
d
Interesting, but this was also in April. It stopped working for me 2 weeks ago. I think my first time setting up meltano was in May
The note sounds like it defaults to false as well, and I'm also not using externalbrowser auth
How do you enable debug logging here? I set logging.basicConfig(level=logging.DEBUG) in main, and I don't see any messages in the console. Added a log message in main to test, and it does show up.
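One possible reason basicConfig appears to do nothing: it is a no-op when the root logger already has handlers, which may be the case if the framework configured logging first (an assumption here, not something confirmed in this thread). A sketch that sidesteps the root logger by attaching a handler directly to the snowflake.connector logger, the name seen in the log output above; the function name is hypothetical.

```python
import logging
import sys


def enable_connector_debug_logging(logger_name="snowflake.connector", stream=sys.stderr):
    """Attach a DEBUG-level stream handler to one named logger."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(
        logging.Formatter("%(asctime)s | %(levelname)-8s | %(name)s | %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Child loggers such as snowflake.connector.network inherit the level and propagate to this handler, so their DEBUG records should show up as well. Passing force=True to logging.basicConfig (Python 3.8+) is another option, at the cost of replacing whatever handlers were already installed.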
I also can't get a stack trace at the exact point where it's stuck. Clicking pause on the debugger doesn't respond at that point in time
I don't think it outputs anything useful
2024-08-29 19:43:46,046 | INFO     | snowflake.connector.connection | Snowflake Connector for Python Version: 3.12.1, Python Version: 3.12.5, Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
2024-08-29 19:43:46,048 | INFO     | snowflake.connector.connection | Connecting to GLOBAL Snowflake domain
2024-08-29 19:43:46,048 | INFO     | snowflake.connector.connection | This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2024-08-29 19:43:48,169 | DEBUG    | snowflake.connector.network | Session status for SessionPool 'OMITTED.snowflakecomputing.com', SessionPool 1/1 active sessions
2024-08-29 19:43:48,170 | DEBUG    | snowflake.connector.network | remaining request timeout: N/A ms, retry cnt: 1
2024-08-29 19:43:48,172 | DEBUG    | snowflake.connector.network | Request guid: b35b14f9-3ca1-154a-bcf3-47ef4a1c6068
2024-08-29 19:43:48,172 | DEBUG    | snowflake.connector.network | socket timeout: 60
^C2024-08-29 19:43:55,632 | DEBUG    | snowflake.connector.network | Session status for SessionPool 'OMITTED.snowflakecomputing.com', SessionPool 0/1 active sessions
e
Is it respecting the socket timeout and killing the connection after 1 minute?
d
Let me leave it open for a bit.
Nope, that's the last log message
All that and it turns out it's some kind of DNS issue 🥲 Confirmed it by setting to Google DNS and having it work, but losing all access to internal resources. Seems like there's some kind of DNS magic happening within WSL, though not sure how it broke in the first place
e
So it's always DNS indeed 😅 I'm glad that you at least found the root problem.
v
It shouldn't be that hard to debug a DNS issue though; something's up with that library. If you could get that to replicate, they should really fix it
d
Agreed, it should just say that it failed to resolve and throw instead of just hanging. Strangely enough, curl and ping both fail to return as well. If I throw in some nonsense address, it knows right away that it can't resolve it. Something specific to our Snowflake and Power BI hostnames seems to make it get stuck. So this might be beyond the scope of the library
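As a rough illustration of why a socket timeout may not help here: Python's socket.getaddrinfo is a blocking call with no timeout parameter, so a lookup that never returns can stall before any socket timeout starts counting. The sketch below bounds the lookup by running it in a daemon thread; resolve_with_timeout and its resolver parameter are hypothetical names for illustration, and nothing here is taken from the Snowflake connector.

```python
import socket
import threading


def resolve_with_timeout(host, timeout=5.0, resolver=socket.getaddrinfo):
    """Return the sorted addresses for host, or raise TimeoutError if the lookup stalls."""
    result = {}

    def worker():
        try:
            result["infos"] = resolver(host, None)
        except OSError as exc:  # socket.gaierror is an OSError subclass
            result["error"] = exc

    # A daemon thread will not keep the process alive if the lookup never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        raise TimeoutError(f"DNS lookup for {host!r} did not finish within {timeout}s")
    if "error" in result:
        raise result["error"]
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the address.
    return sorted({info[4][0] for info in result["infos"]})
```

Running this against the Snowflake hostname from inside WSL would separate "resolution fails fast" (a gaierror) from "resolution never returns" (the TimeoutError), which matches the difference seen between the nonsense address and the real one.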