# troubleshooting
a
Hello all,

We use Airflow to start an AWS ECS container in which our Meltano tap and target are executed. When we build the container locally, the tap and target work fine. When running it in production, the target gets stuck (see logs below).

- The local and production runs have access to the same configurations and credentials.
- The target (with a modified name, but the same code) works fine for other imports that run daily (same Airflow and AWS ECS setup).
- We are able to pull data from the same source with a different tap and the same setup, credentials, and config.

(We decided to write our own purecloud tap from scratch because the one on the Meltano Hub seemed rather shoehorned the last time we checked.)

Can anyone help us find out what the issue is?

The tap: https://github.com/Ahaberling/tap-purecloud
The target: https://github.com/Ahaberling/target-s3-parquet-purecloud
The Docker image: https://hub.docker.com/layers/meltano/meltano/v3.3.2-python3.11/images/sha256-a97dcdb03f392e930e575dcaf85d2b71a68febbc9823c16e339e854188444a9f
The meltano command:
meltano el tap-purecloud target-s3-parquet --select locations --full-refresh
(The same issue exists for the other streams.)
The logs:
{'run_id': 'e4f66554-7912-45d7-9994-33dbfc54e681', 'state_id': 'purecloud_locations_to_s3', 'stdio': 'stderr', 'cmd_type': 'extractor', 'name': 'tap-purecloud', 'event': '2025-01-03 10:14:27,758 INFO Skipping parse of env var settings...', 'level': 'info', 'timestamp': '2025-01-03T10:14:27.759678Z'}
... | tap-purecloud        | Added 'conversation_participant' as child stream to 'conversation'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.833213Z'}
... | tap-purecloud        | Added 'conversation_participant_session' as child stream to 'conversation_participant'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.833852Z'}
... | tap-purecloud        | Added 'conversation_participant_session_metric' as child stream to 'conversation_participant_session'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.834261Z'}
... | tap-purecloud        | Added 'conversation_participant_session_segment' as child stream to 'conversation_participant_session'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.834554Z'}
... | tap-purecloud        | Added 'group_image' as child stream to 'groups'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.834880Z'}
... | tap-purecloud        | Added 'group_owner' as child stream to 'groups'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.835329Z'}
... | tap-purecloud        | Added 'queue_division' as child stream to 'queues'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.835612Z'}
... | tap-purecloud        | Added 'queue_membership' as child stream to 'queues'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.835881Z'}
... | tap-purecloud        | Added 'queue_wrapup_code' as child stream to 'queues'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.836157Z'}
... | tap-purecloud        | Added 'users_division' as child stream to 'users'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.836411Z'}
... | tap-purecloud        | Added 'users_language' as child stream to 'users'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.836666Z'}
... | tap-purecloud        | Added 'users_location' as child stream to 'users'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.836943Z'}
... | tap-purecloud        | Added 'users_presence' as child stream to 'users'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.837203Z'}
... | tap-purecloud        | Added 'users_skill' as child stream to 'users'", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.839141Z'}
... | tap-purecloud        | Skipping deselected stream 'conversation'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.841975Z'}
... | tap-purecloud        | Skipping deselected stream 'conversation_participant'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.842300Z'}
... | tap-purecloud        | Skipping deselected stream 'conversation_participant_session'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.842511Z'}
... | tap-purecloud        | Skipping deselected stream 'conversation_participant_session_metric'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.842705Z'}
... | tap-purecloud        | Skipping deselected stream 'conversation_participant_session_segment'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.842966Z'}
... | tap-purecloud        | Skipping deselected stream 'group_image'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.843163Z'}
... | tap-purecloud        | Skipping deselected stream 'group_owner'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.843519Z'}
... | tap-purecloud        | Skipping deselected stream 'groups'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.843704Z'}
... | tap-purecloud        | Skipping deselected stream 'languages'.", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.843893Z'}
... | tap-purecloud.locations | Beginning full_table sync of 'locations'...", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.844078Z'}
... | tap-purecloud.locations | Tap has custom mapper. Using 1 provided map(s).', 'level': 'info', 'timestamp': '2025-01-03T10:14:27.844270Z'}
... | target-s3-parquet    | Initializing 'target-s3-parquet' target sink...", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.844572Z'}
... | target-s3-parquet.locations | Initializing target sink for stream 'locations'...", 'level': 'info', 'timestamp': '2025-01-03T10:14:27.844782Z'}
... | target-s3-parquet.locations | Initialized S3 Parquet with batch size: 100000', 'level': 'info', 'timestamp': '2025-01-03T10:14:27.849688Z'}
... | target-s3-parquet.locations | Setting up locations', 'level': 'info', 'timestamp': '2025-01-03T10:14:27.850199Z'}

<<Note from author: here the target gets stuck (There are only 20 short rows to pull)>>
 
... | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "sync_duration", "value": 1441.0389611721039, "tags": {"stream": "locations", "pid": 12, "context": {}, "status": "failed"}}', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.875703Z'}
... | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "locations", "pid": 12, "context": {}}}', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.876287Z'}
... | tap-purecloud.locations | An unhandled error occurred while syncing 'locations'", 'level': 'info', 'timestamp': '2025-01-03T10:38:28.885488Z'}
... 'event': 'Traceback (most recent call last):', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.886146Z'}
... 'event': '  File "/meltano/.meltano/extractors/tap-purecloud/venv/lib/python3.11/site-packages/urllib3/connection.py", line 174, in _new_conn', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.886489Z'}
... 'event': '    conn = connection.create_connection(', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.886764Z'}
... 'event': '           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.887351Z'}
... 'event': '  File "/meltano/.meltano/extractors/tap-purecloud/venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.887737Z'}
... 'event': '    raise err', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.888774Z'}
... 'event': '  File "/meltano/.meltano/extractors/tap-purecloud/venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.889135Z'}
... 'event': '    sock.connect(sa)', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.889485Z'}
... 'event': 'TimeoutError: [Errno 110] Connection timed out', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.889990Z'}
... 'event': '', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.890273Z'}
... 'event': 'During handling of the above exception, another exception occurred:', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.890545Z'}
v
I can help point you in the right direction (there are also consulting options, but Slack is for helping everyone with no barriers!).
... 'event': '  File "/meltano/.meltano/extractors/tap-purecloud/venv/lib/python3.11/site-packages/urllib3/util/connection.py", line 95, in create_connection', 'level': 'info', 'timestamp': '2025-01-03T10:38:28.887737Z'}
That line shows it's related to your tap: the traceback comes from the tap's own virtualenv. "Connection timed out" points to a connection issue. If it works locally but not on the server, there are a few options. Since you're new, I'd honestly take everything you have there, paste it into Google's new LLM or OpenAI's o1, and ask it for help.
It's most likely one of:
1. A configuration issue in production, where something isn't set correctly. To debug, you want to run something like
meltano config tap-purecloud list
to see what's different in configuration between the local and production runs.
2. A networking issue in the container, where it can't reach the host it needs to reach.
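For option 2, one quick way to test reachability is to open a plain TCP connection from inside the production container. The following is a minimal sketch using only the Python standard library; `check_tcp` is a hypothetical helper name, and the host you pass in is an assumption that you would replace with the API host your tap is actually configured to call.

```python
import socket


def check_tcp(host: str, port: int = 443, timeout: float = 5.0) -> tuple[bool, str]:
    """Try to open a TCP connection and report the outcome quickly.

    The explicit timeout keeps the check from waiting on the OS-level
    connect timeout, which can take many minutes.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # OSError covers timeouts, DNS failures, and refused connections.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling, for example, `check_tcp("<your-api-host>", 443)` in both the local container and the ECS task would show whether only production fails to connect. If so, that would point toward egress rules, NAT, or proxy settings rather than the tap code.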
👍 1
a
Thank you. I will update the post once I have found the issue/solution.
v
Also, the job shouldn't get "stuck"; it should fail with an exit code.
a
With "shouldn't", are you referring to:
• the logs looking different than you would expect, or
• are you suggesting that I adjust the code so that it raises an additional error?
v
<<Note from author: here the target gets stuck (There are only 20 short rows to pull)>>
Jobs shouldn't get "stuck".
They should fail or succeed with a proper exit code.
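One way to enforce this from the outside is a wall-clock deadline around the pipeline command in the container entrypoint. This is only a sketch under assumptions: `run_with_deadline` is a hypothetical helper, the deadline value is arbitrary, and exit code 124 is simply borrowed from the GNU `timeout` convention.

```python
import subprocess
import sys


def run_with_deadline(cmd: list[str], deadline_s: float) -> int:
    """Run cmd and return its exit code, or 124 if it exceeds the deadline."""
    try:
        # subprocess.run kills the direct child when the timeout expires;
        # processes spawned by that child are not guaranteed to be stopped.
        return subprocess.run(cmd, timeout=deadline_s).returncode
    except subprocess.TimeoutExpired:
        print(f"command exceeded {deadline_s}s deadline", file=sys.stderr)
        return 124  # assumption: mirrors the GNU `timeout` exit code
```

An entrypoint could then end with something like `sys.exit(run_with_deadline(["meltano", "el", "tap-purecloud", "target-s3-parquet", "--select", "locations", "--full-refresh"], 600))`, so the ECS task stops with a non-zero exit code instead of waiting on a hung connection.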