xinge_li
03/21/2022, 3:02 PMtap-mssql
in production env for quite a while, any help will be really appreciated! So I have a private data-pipeline runs on AWS for my company internal and I needed to access a microsoft sql database to fetch some data. So I created a private subnet for airflow scheduler
and built a NAT gateway to give it a public IP. Then whitelisted the IP from the mssql DB side (also whitelisted my local IP for testing purpose). The meltano pipeline tap-mssql
to target-postgres
works well in local env when I run meltano elt…
. However, in prod env, it does not work with the following errors:
I’ve tested the IP address from AWS ECS, and it gives the correct IP as set from NAT gateway and also checked there is no connectivity issue because of IP address. Could someone help me think what else could be wrong here? Why it works in local but not in prod?
When running a pipeline from meltano, is it only the airflow scheduler part will access to the database etc?
I am totally stuck, please help! Thank you very much!aaron_phethean
03/22/2022, 6:56 AMmeltano config tap-mssql test
This will try to get one record and then quit. Should simplify the issue if there is a problem elsewhere.
2. If you have set a meltano database uri, then meltano will also try to connect
3. From the log it looked like the tap failed to connect, but the target-postgres also needs to connect
4. The IP from airflow might not be what you think it is. I've used wget or curl to ipchicken in the past to check https://ipchicken.com
This still might not be your IP, so try looking at the database logs to see why the connection is being rejected, if it is getting that far…xinge_li
03/22/2022, 9:49 AMmeltano config tap-mssql test
it works with plugin is valid. But I do not know how I can run this in production environment.
2. I have set a meltano database uri, and when I run it from local, this database will also try to connect, what is the difference between this database try to connect from local vs prod, it is the same AWS RDS uri.
3. I believe the target-postgres should be fine as many of my other pipelines are connecting with the same target.
4. I have used curl in AWS ECS task to return the actual IP address of airflow scheduler service, and it gave me the correct IP address that the mssql DB whitelisted. Unfortunately I do not have access to that mssql db logs because that is a third party service we use from another company.
Thanks a lot for giving some hints, would be nice if anyone could guide me how to perform some tests in production environment. 😄xinge_li
03/22/2022, 9:53 AMcurl hostname port
and it actually can. I am wondering when running meltano pipeline, are there more than airflow scheduler that is trying to connect to the source database?aaron_phethean
03/23/2022, 5:13 PM