# infra-deployment
j
I have a custom tap that runs fine on my local machine (in a Docker container), but when I run it in a Kubernetes pod through Cloud Composer (GCP), the pod gets killed after 15-16 minutes with no error. The last log is a successful HTTP call. Any ideas what is happening? One thing I have noticed is that the state file is really large, because it has several keys in the context and there are many entities, which makes the log very large.
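To get a feel for how large the state is, something like the following could help. It is only a rough sketch: it assumes the state follows the usual Singer SDK layout, with a top-level "bookmarks" key and a "partitions" list per stream, and the stream name and context keys in the sample are made up for illustration.

```python
import json


def summarize_state(state: dict) -> dict:
    """Return the serialized size in bytes and the partition count per stream."""
    bookmarks = state.get("bookmarks", {})
    return {
        "bytes": len(json.dumps(state).encode("utf-8")),
        "partitions": {
            stream: len(bookmark.get("partitions", []))
            for stream, bookmark in bookmarks.items()
        },
    }


# Hypothetical sample state; a real one would come from the state backend.
example_state = {
    "bookmarks": {
        "entities": {
            "partitions": [
                {"context": {"account_id": 1, "region": "eu"}},
                {"context": {"account_id": 2, "region": "us"}},
            ]
        }
    }
}

summary = summarize_state(example_state)
print(summary)
```

A stream with thousands of partitions, each carrying several context keys, would explain a state log line in the megabyte range.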
The exception is
RecursionError: maximum recursion depth exceeded
from
File "/opt/python3.11/lib/python3.11/site-packages/kubernetes/client/configuration.py", line 300, in logger_format
    self.logger_formatter = logging.Formatter(self.__logger_format)
e
Hi @James Stratford! What's the offending log message with the too-large state? Is it "Incremental state: ..."?
j
Hi @Edgar Ramírez (Arch.dev), the start of the massive state log is here:
2024-07-25T09:52:58.691311Z [info     ] 2024-07-25 09:52:58,678 | INFO     | target-bigquery      | Emitting completed target state {"bookmarks": ....
The total log size is 1 MB. I found this discussion on Airflow's GitHub, which may be the reason: https://github.com/apache/airflow/discussions/31315
Disabling container logs on the KubernetesPodOperator in the DAG lets the pipeline run to completion.
https://github.com/apache/airflow/discussions/29920#discussioncomment-5208504 Adding this to my DAG has stopped the mid-pipeline crash.
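For anyone landing here later, a minimal sketch of what disabling container logs might look like. It assumes the change amounts to passing `get_logs=False` to the KubernetesPodOperator; the snippet from the linked comment is not reproduced here and may differ. The task ID, image, namespace, and tap name are placeholders, not values from the actual DAG.

```python
def build_pod_operator_kwargs(task_id: str, image: str, arguments: list) -> dict:
    """Build keyword arguments for a KubernetesPodOperator that does not stream pod logs."""
    return {
        "task_id": task_id,
        "name": task_id,
        "namespace": "default",  # placeholder namespace
        "image": image,
        "cmds": ["meltano"],
        "arguments": arguments,
        # Stop Airflow from pulling the container's stdout into the task log,
        # so the very large state line is never read through the Kubernetes client.
        "get_logs": False,
    }


kwargs = build_pod_operator_kwargs(
    task_id="meltano_run",  # hypothetical task name
    image="example.invalid/meltano-project:latest",  # hypothetical image
    arguments=["run", "tap-custom", "target-bigquery"],  # tap name is a placeholder
)

# Inside the DAG file (requires the cncf.kubernetes provider; the import path
# varies between provider versions):
#   from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
#   run_tap = KubernetesPodOperator(**kwargs)
print(kwargs["get_logs"])
```

The trade-off is that the pod's output no longer appears in the Airflow task log, so it has to be read from the cluster's own logging instead.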
e
Gotcha. It seems like it can be fixed by tweaking the orchestrator, but do open a GitHub issue if you think Meltano should do things differently on its end.