I have an application that is built partly in Clojure and partly in PHP. On each load, the PHP script connects to the Clojure application through nREPL, creates a new nREPL session, sends a few commands to retrieve some information, and then closes the session. So several new nREPL sessions are created and closed each second.
On my production server (Red Hat Linux), the Clojure app has started to crash after running for a few days. The crash file gives this message:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
Under running processes, there are about 32240 nREPL threads that look like this:
0x00007f0b681f2800 JavaThread "nRepl-session-c4e18c8b-26f7-437b-85eb-5b6b16145b0f" daemon [_thread_blocked, id=129841, stack(0x00007f01a4b2b000,0x00007f01a4c2c000)]
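As a rough sanity check on the thread line above, the stack(...) pair gives the address range reserved for that thread's stack, so the difference is the per-thread stack size. The sketch below is written in Java rather than Clojure, and the class and method names are made up for illustration. It works out to a little over 1 MiB per thread, or roughly 31.6 GiB of reserved address space across about 32240 threads. That figure is reserved virtual memory, not necessarily memory in use, and it assumes every thread has the same stack size as this one.

```java
public class StackEstimate {
    // Size in bytes of one thread stack, computed from the stack(low,high)
    // pair printed in the crash file. Both arguments are "0x..." hex strings.
    static long stackBytes(String lowHex, String highHex) {
        long low = Long.parseUnsignedLong(lowHex.substring(2), 16);
        long high = Long.parseUnsignedLong(highHex.substring(2), 16);
        return high - low;
    }

    public static void main(String[] args) {
        // Addresses copied from the thread line in the crash file.
        long perThread = stackBytes("0x00007f01a4b2b000", "0x00007f01a4c2c000");
        long threads = 32240; // approximate thread count from the crash file

        System.out.println("per-thread stack: " + (perThread / 1024) + " KiB");
        System.out.println("all threads:      "
                + (perThread * threads) / (1024L * 1024L) + " MiB reserved");
    }
}
```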
They are all blocked. I have just added a monitor of (.getThreadCount (ManagementFactory/getThreadMXBean)) to the app, and it seems to increase for each PHP request. It is at 527 now, but I have not waited until the next crash yet to see if it grows up to >32000.
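The total thread count alone does not show which threads are accumulating. A minimal sketch of a narrower monitor is below, counting only live threads whose name starts with the "nRepl-session-" prefix seen in the crash file. It is written in Java, but the same ThreadMXBean calls should be reachable from Clojure through interop. The class name, method name, and 10-second polling interval are hypothetical choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadNameMonitor {
    // Count live threads whose name starts with the given prefix.
    static long countByPrefix(String prefix) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long count = 0;
        for (ThreadInfo info : bean.getThreadInfo(bean.getAllThreadIds())) {
            // Entries are null for threads that exited between the two calls.
            if (info != null && info.getThreadName().startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Print a few samples; a real monitor would log these over time.
        for (int i = 0; i < 3; i++) {
            System.out.println("total=" + bean.getThreadCount()
                    + " nrepl-session=" + countByPrefix("nRepl-session-"));
            Thread.sleep(10_000); // assumed polling interval
        }
    }
}
```

If the nRepl-session count rises in step with the PHP requests while the number of live session IDs stays small, that would point at session threads outliving their sessions rather than at some other part of the app.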
I don’t see the same pattern in my dev environment (macOS). There, PHP requests do not increase (.getThreadCount (ManagementFactory/getThreadMXBean)), and the thread count stays pretty stable around 40.
I’m pretty clueless here, since I know nothing about Java or threads. Is it possible that nREPL leaves thousands of threads lying around even though each session is closed? I have confirmed that the number of live session IDs in nREPL is never more than a few, so it seems that nREPL discards the sessions themselves properly.
I would be very grateful for ideas!