luke_rodgers
05/03/2023, 5:48 PM
self.logger.info("URIs: %s", ", ".join(self.uris))
where self.uris is probably a few hundred thousand URIs. see https://github.com/z3z1ma/target-bigquery/blob/c4cc54f760032c8e9f2575adbe7d72b7a0a92d94/target_bigquery/gcs_stage.py#L185
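for context, here’s a minimal sketch of why that one call produces such a huge write. this is a simplified stand-in, not the actual gcs_stage.py code; the URI format and count below are made up for illustration:
```python
# simplified stand-in for the pattern at gcs_stage.py#L185 (not the real code)
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("target_bigquery.gcs_stage")

# a few hundred thousand URIs at ~50-100 bytes each join into a string in the
# tens of megabytes -- the real run produced a single ~54 MB log record
uris = [f"gs://some_bucket/target/part-{i:06d}.jsonl.gz" for i in range(300_000)]

logger.info("URIs: %s", ", ".join(uris))  # one enormous write to the log handler

# a far smaller alternative (just a suggestion, not what the target does today):
logger.info("staged %d URIs (first: %s)", len(uris), uris[0] if uris else "n/a")
```
the point is just that the size of that log record scales with the number of staged files, so a big sync turns a routine info log into a multi-megabyte write.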
• strace on the ec2 instance i’m running this on shows this for pid 18758:
```
ubuntu@ip-10-1-4-224:~/meltano-elt$ sudo strace -f -p 18758
strace: Process 18758 attached with 2 threads
[pid 18760] futex(0x7fbf31416630, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 18758] write(2, "jsonl.gz,gs://some_bucket/targ"..., 56696505^Cstrace: Process 18758 detached
 <detached ...>
strace: Process 18760 detached
```
• this seems to confirm that it is indeed stuck trying to write this giant string to fd 2 (56696505 looks like the byte count, i.e. about 54 MB, which is roughly in line with hundreds of thousands of URIs). fd 2 is normally stderr, but looking at /proc/18758/fd i see 2 -> 'pipe:[361373]', and looking at lsof i see:
```
ubuntu@ip-10-1-4-224:~/meltano-elt$ lsof | grep 361373
meltano 18753 ubuntu 16r FIFO 0,13 0t0 361373 pipe
meltano 18753 18759 meltano ubuntu 16r FIFO 0,13 0t0 361373 pipe
target-bi 18758 ubuntu 2w FIFO 0,13 0t0 361373 pipe
target-bi 18758 18760 jemalloc_ ubuntu 2w FIFO 0,13 0t0 361373 pipe
gzip 19815 ubuntu 2w FIFO 0,13 0t0 361373 pipe
gzip 19816 ubuntu 2w FIFO 0,13 0t0 361373 pipe
gzip 575232 ubuntu 2w FIFO 0,13 0t0 361373 pipe
```
• which suggests that pid 18758 (among others) is writing to this fifo and pid 18753 is reading from it (or at least would normally be reading from it), probably to forward logs to stdout. 18753 is the parent process of 18758, see pstree (hiding threads); a toy reproduction of this pipe-backpressure deadlock follows the pstree output:
```
ubuntu@ip-10-1-4-224:~/meltano-elt$ pstree -p 18753 -T
meltano(18753)───target-bigquery(18758)─┬─gzip(19815)
                                        ├─gzip(19816)
                                        └─gzip(575232)
```
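to be clear about the mechanism (toy reproduction below, not meltano’s actual plumbing): a pipe only buffers a fixed amount in the kernel (typically 64 KiB on linux), so once the reading side stops draining it, any big write on the other end blocks indefinitely. something like:
```python
# toy reproduction of a write blocking on a full pipe (not meltano code).
# the child tries to write ~10 MB to its stderr, which is a pipe whose read
# end the parent deliberately never drains; the write stalls once the
# ~64 KiB kernel pipe buffer fills.
import subprocess
import sys
import time

child = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stderr.write('x' * 10_000_000)"],
    stderr=subprocess.PIPE,  # parent holds the read end but never reads it
)

time.sleep(2)
print("child still running (blocked in write)?", child.poll() is None)  # True

# draining the read end "unclogs" the child -- same idea as the head -n 1
# trick further down
child.stderr.read()
child.wait()
```
the backpressure is the same regardless of what holds the read end, which is why draining it from outside the stuck reader works.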
• the gzip processes are doing nothing; they look like they’re just waiting to read input.
• the parent process, 18753, is doing stuff according to strace, but i can’t really tell what. it seems to be opening/writing/reading the meltano sqlite db over and over, and there’s also activity related to sqlite itself (wal/shm files). the db doesn’t seem to be large or growing, just chilling at 176 kb
• i was eventually able to “unclog” this situation by just manually running head -n 1 on the parent process fd that pointed at the named pipe (a sketch of this appears after the error output below). according to strace, the child process 18758 then began attempting to hit the bigquery API, which was nice, but this eventually failed as per below:
```
2023-05-03T17:37:38.276785Z [error ] Loader failed
2023-05-03T17:37:38.279070Z [error ] Block run completed. block_type=ExtractLoadBlocks err=RunnerError('Loader failed') exit_codes={<PluginType.LOADERS: 'loaders…
```
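for completeness, the “unclog” was just draining the parent’s read end of the fifo by hand — based on the lsof output above (fd 16 on pid 18753) it was presumably something like head -n 1 /proc/18753/fd/16. a rough python equivalent, with the pid/fd taken from this particular run:
```python
# rough python equivalent of the head -n 1 "unclog" (pid/fd come from the
# lsof output in this thread -- adjust for your own run)
path = "/proc/18753/fd/16"  # read end of pipe:[361373] held by the meltano parent

with open(path, "rb") as read_end:
    line = read_end.readline()  # draining the pipe lets the blocked writer proceed

print(f"drained {len(line)} bytes from {path}")
```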
visch
05/03/2023, 6:30 PM