# plugins-general
j
```
meltano | Incremental state has been updated at 2020-12-21 20:36:59.943053.
meltano | Extraction failed (1): TypeError: can't compare offset-naive and offset-aware datetimes
meltano | ELT could not be completed: Tap failed
```
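That `TypeError` can be reproduced with a minimal Python sketch (illustrative only, not Meltano's actual state-comparison code): a bookmark saved without timezone info (offset-naive) cannot be compared against a UTC timestamp (offset-aware).

```python
from datetime import datetime, timezone

# Offset-naive: no tzinfo attached, like "2020-12-21 20:36:59.943053" above.
naive = datetime(2020, 12, 21, 20, 36, 59)
# Offset-aware: carries an explicit UTC offset.
aware = datetime(2020, 12, 21, 20, 36, 59, tzinfo=timezone.utc)

try:
    naive < aware
except TypeError as err:
    print(err)  # → can't compare offset-naive and offset-aware datetimes
```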
d
@jason_waldrip Looks like the same issue I looked into earlier today: https://meltano.slack.com/archives/C013EKWA2Q1/p1608586206289200
Does your `start_date` setting also look like `YYYY-MM-DD` rather than `YYYY-MM-DDTHH:MM:SSZ`?
j
let me look
```yaml
schedules:
- name: gitlab-to-bigquery
  extractor: tap-gitlab
  loader: target-bigquery
  transform: run
  interval: '@hourly'
  start_date: 2020-12-01 00:00:00
- name: postgres-to-bigquery
  extractor: tap-postgres
  loader: target-bigquery
  transform: skip
  interval: '@daily'
  start_date: 2020-12-17 22:20:30.188498
```
d
Can you show me the `start_date` value from `meltano config tap-gitlab`? And the output of `meltano schedule run gitlab-to-bigquery --dump=state`?
j
```json
"start_date": "2018-01-01"
```
d
All right, we're definitely looking at the same issue then: https://gitlab.com/meltano/tap-gitlab/-/issues/33
The workaround there suggests doing a full refresh, but that would take ages for you, so there's another option; let me write it down in the issue
j
I'll just do another full refresh 😛
and report back in 2 days
d
Hehe
Gimme a moment 😉
@jason_waldrip Can I take it that that worked? 🙂
b
Hey @douwe_maan, would it be possible that this issue also exists for `tap-zendesk`? I got the error `ELT could not be completed: Tap failed` (without any extra info :/) when I used `start_date: '2021-01-13'`, and this error went away once I changed to `start_date: '2021-01-13T10:00:00Z'`
Well, I was on Meltano version `1.58.0`. I just switched to `1.65.0` and I get the same issue. Here's the log output; I can't seem to understand the problem:
```
tap-zendesk | INFO tickets: Starting sync
target-csv  | INFO Sending version information to singer.io. To disable sending anonymous usage data, set the config parameter "disable_collection" to true
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 8.863351821899414, "tags": {"status": "succeeded"}}
tap-zendesk | INFO Starting metrics capture at 2021-01-13T152307Z
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 8.364282369613647, "tags": {"status": "succeeded"}}
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 10.769266605377197, "tags": {"status": "succeeded"}}
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 7.557981252670288, "tags": {"status": "succeeded"}}
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 8.677280187606812, "tags": {"status": "succeeded"}}
tap-zendesk | INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 10.821506261825562, "tags": {"status": "succeeded"}}
meltano     | DEBUG Deleted configuration at /artemis/.meltano/run/elt/zendesk_tickets/43fc747f-eddb-4917-8a20-3662c36c402e/target.config.json
meltano     | DEBUG Deleted configuration at /artemis/.meltano/run/elt/zendesk_tickets/43fc747f-eddb-4917-8a20-3662c36c402e/tap.config.json
meltano     | ERROR Extraction failed (-9): INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 10.821506261825562, "tags": {"status": "succeeded"}}
meltano     | DEBUG ELT could not be completed: Extractor failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 238, in run_elt
    await run_extract_load(elt_context, output_logger)
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 276, in run_extract_load
    await singer_runner.run(
  File "/usr/local/lib/python3.8/site-packages/meltano/core/runner/singer.py", line 253, in run
    await self.invoke(
  File "/usr/local/lib/python3.8/site-packages/meltano/core/runner/singer.py", line 227, in invoke
    raise RunnerError("Extractor failed", {PluginType.EXTRACTORS: tap_code})
meltano.core.runner.RunnerError: Extractor failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 226, in redirect_output
    yield
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 247, in run_elt
    raise CliError(f"ELT could not be completed: {err}") from err
meltano.cli.utils.CliError: ELT could not be completed: Extractor failed
meltano     | ELT could not be completed: Extractor failed
[2021-01-13 152409,658] [28|MainThread|meltano.cli.utils] [DEBUG] ELT could not be completed: Extractor failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 238, in run_elt
    await run_extract_load(elt_context, output_logger)
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/elt.py", line 276, in run_extract_load
    await singer_runner.run(
  File "/usr/local/lib/python3.8/site-packages/meltano/core/runner/singer.py", line 253, in run
    await self.invoke(
  File "/usr/local/lib/python3.8/site-packages/meltano/core/runner/singer.py", line 227, in invoke
    raise RunnerError("Extractor failed", {PluginType.EXTRACTORS: tap_code})
meltano.core.runner.RunnerError: Extractor failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/meltano/cli/__init__.py", line 43, in main
    cli(obj={"project": None})
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in call
    return self.main(*args, **kwa…
```
Continuing my testing… Looks like `start_date: '2021-01-13T00:00:00Z'` is failing but `start_date: '2021-01-13T10:00:00Z'` is succeeding… 🤷‍♂️
d
@benjamin_maquet It's definitely possible that tap-zendesk also requires `start_date` to be a full timestamp, since that's what the spec prescribes. Meltano can do better here by making sure the tap always receives a full timestamp even if the user only set a date, but I haven't gotten to it yet: https://gitlab.com/meltano/meltano/-/issues/2234 A contribution would be much appreciated 🙂 At first glance, though, the log output doesn't point to an issue with `start_date`: there's no exception stacktrace coming from the tap, and `Extraction failed (-9)` shows that the tap process was killed with signal 9 (SIGKILL), which Python's async subprocess library reports as exit code `-9`. That suggests some outside influence rather than an error in the tap itself. But you're confident you're only seeing that error with an incomplete `start_date`, and everything's fine otherwise?
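For reference, the date-padding Meltano could do might look like this. This is a hypothetical sketch (the `pad_start_date` helper is invented here, not the actual implementation proposed in issue 2234): if the configured `start_date` is only a date, pad it to midnight UTC in the full `YYYY-MM-DDTHH:MM:SSZ` form the spec expects.

```python
from datetime import datetime, timezone

def pad_start_date(value: str) -> str:
    """Hypothetical helper: pad a date-only start_date to a full UTC timestamp."""
    try:
        # Already a full timestamp like 2021-01-13T10:00:00Z? Leave it alone.
        datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return value
    except ValueError:
        pass
    # Date-only like 2018-01-01: assume midnight UTC.
    day = datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")

print(pad_start_date("2018-01-01"))            # → 2018-01-01T00:00:00Z
print(pad_start_date("2021-01-13T10:00:00Z"))  # → 2021-01-13T10:00:00Z
```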
> Looks like `start_date: '2021-01-13T00:00:00Z'` is failing but `start_date: '2021-01-13T10:00:00Z'` is succeeding… 🤷‍♂️
That also suggests the issue isn't actually `start_date`, since both of those are obviously valid full timestamps... Are you consistently seeing the tap fail with exit code `-9` with `2021-01-13` and `2021-01-13T00:00:00Z`, but never with `2021-01-13T10:00:00Z`?
b
I confirm it consistently happens for `2021-01-13` or `2021-01-13T00:00:00Z` but never happened for `2021-01-13T10:00:00Z`, and I tested 10+ runs. I am still trying to understand what is causing this issue
d
@benjamin_maquet Super weird. I suggest putting some debugging print statements in the tap source to figure out where it's failing
b
@douwe_maan Ok, I finally found the issue… I'm running Meltano in Docker and developing locally. The job was killed because I had only 2GB of memory for my container. I've bumped it to 8GB now and it's running!
d
@benjamin_maquet What version of Meltano are you on? The memory issue is supposed to be fixed in v1.64.0: https://meltano.slack.com/archives/CP8K1MXAN/p1610064894064300
That fix should prevent Meltano itself from using too much memory, but it's still possible for a specific tap or target to handle memory poorly and build up an ever-growing buffer
b
Yes I saw this message on the channel and that’s why I investigated the memory usage. I’m on 1.65.0
I was monitoring the memory usage of my container while running tap-zendesk, and at the peak it was using ~4.5GB. It loaded ~7.5k tickets
d
Did you see if that memory usage was in tap-zendesk itself or in Meltano?
b
I’m not sure how to check that, but I can do it if you tell me how
d
I don't have an exact command ready, but I would run the container, then attach to it with a separate bash session, and run `ps <some flags> | grep python` to find the memory usage of each Python process, and then use the PIDs to determine whether that's Meltano itself, the tap, or the target
😄
Looks like it's the tap to me:
```
root      1354 56.7 24.3 2087740 1982028 pts/1 R    16:46   1:14 /artemis/.meltano/extractors/tap-zendesk-tickets/venv/bin/python /artemis/.meltano/extractors/tap-zendesk-tickets/venv/bin/tap-zendesk --config /artemis/.meltano/run/elt/zen
```
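The manual `ps | grep python` step could also be scripted, roughly like this (a throwaway sketch, not part of Meltano; it assumes a Linux-style `ps` is available inside the container):

```python
import subprocess

# List every python process with its RSS (resident memory, in KB), so the
# Meltano, tap, and target processes can be told apart by PID and command line.
out = subprocess.run(
    ["ps", "-eo", "pid,rss,args"],
    capture_output=True, text=True, check=True,
).stdout
for line in out.splitlines():
    if "python" in line:
        print(line)
```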
b
yes I agree
d
I wonder if it's building up a huge buffer of records that are yet to be output on stdout, or if it has a memory leak where it holds on to records that were already flushed
Either way, I think this is worth an issue on https://github.com/singer-io/tap-zendesk. There's really no reason for a tap to use that much memory since it should be holding off on loading more data when the stdout write stream is blocked
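That streaming behavior can be sketched in Python (illustrative only, not tap-zendesk's actual code): a tap that yields records page by page keeps only the current page in memory, and when the stdout pipe to the target fills up, the write blocks and no further pages are fetched, giving natural backpressure.

```python
def stream_records(fetch_page, pages):
    # Illustrative tap loop: fetch one page at a time and emit each record
    # immediately, so memory use stays bounded by a single page rather than
    # growing with the total number of records synced.
    for page_number in range(pages):
        for record in fetch_page(page_number):
            yield record

# Example: three pages of two records each, never all in memory at once.
records = stream_records(lambda n: [f"ticket-{n}-{i}" for i in range(2)], 3)
print(next(records))  # → ticket-0-0
```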
Meltano could do better here by monitoring the tap and target process memory usage so that you at least get a useful warning when some limit is exceeded
Would you like to file an issue for that in https://gitlab.com/meltano/meltano/issues/? 🙂
b
thanks a lot for the help btw!
d
My pleasure, thanks for filing the issue!