Java client occasionally leaks (non-daemon) netty IO threads after shutdown II

Hello

I'm using Java SDK 1.4.12 with Couchbase Server 4.5.0-2601 Community Edition (build-2601).

I’m having the same issue that was presented in the following post:

The JVM process hangs after the shutdown command: a thread enters epollWait and the process fails to terminate.

I ran tcpdump, and it looks like the client keeps sending some kind of keep-alive packets to the server:

13:02:33.141969 IP src_ip.29578 > couchbase_ip.11210: Flags [P.], seq 1992:2016, ack 1993, win 44745, options [nop,nop,TS val 25896463 ecr 158881471], length 24
13:02:33.142652 IP couchbase_ip.11210 > src_ip.29578: Flags [P.], seq 1993:2017, ack 2016, win 26847, options [nop,nop,TS val 158882722 ecr 25896463], length 24
13:02:33.142666 IP src_ip.29578 > couchbase_ip.11210: Flags [.], ack 2017, win 44745, options [nop,nop,TS val 25896463 ecr 158882722], length 0
13:02:38.147428 IP src_ip.29578 > couchbase_ip.11210: Flags [P.], seq 2016:2040, ack 2017, win 44745, options [nop,nop,TS val 25897714 ecr 158882722], length 24
13:02:38.148102 IP couchbase_ip.11210 > src_ip.29578: Flags [P.], seq 2017:2041, ack 2040, win 26847, options [nop,nop,TS val 158883974 ecr 25897714], length 24
13:02:38.148117 IP src_ip.29578 > couchbase_ip.11210: Flags [.], ack 2041, win 44745, options [nop,nop,TS val 25897714 ecr 158883974], length 0
13:02:43.153020 IP src_ip.29578 > couchbase_ip.11210: Flags [P.], seq 2040:2064, ack 2041, win 44745, options [nop,nop,TS val 25898966 ecr 158883974], length 24
13:02:43.153706 IP couchbase_ip.11210 > src_ip.29578: Flags [P.], seq 2041:2065, ack 2064, win 26847, options [nop,nop,TS val 158885225 ecr 25898966], length 24
13:02:43.153721 IP src_ip.29578 > couchbase_ip.11210: Flags [.], ack 2065, win 44745, options [nop,nop,TS val 25898966 ecr 158885225], length 0
13:02:48.157865 IP src_ip.29578 > couchbase_ip.11210: Flags [P.], seq 2064:2088, ack 2065, win 44745, options [nop,nop,TS val 25900217 ecr 158885225], length 24
13:02:48.158563 IP couchbase_ip.11210 > src_ip.29578: Flags [P.], seq 2065:2089, ack 2088, win 26847, options [nop,nop,TS val 158886476 ecr 25900217], length 24
13:02:48.158578 IP src_ip.29578 > couchbase_ip.11210: Flags [.], ack 2089, win 44745, options [nop,nop,TS val 25900217 ecr 158886476], length 0
13:02:53.163324 IP src_ip.29578 > couchbase_ip.11210: Flags [P.], seq 2088:2112, ack 2089, win 44745, options [nop,nop,TS val 25901468 ecr 158886476], length 24
13:02:53.164007 IP couchbase_ip.11210 > src_ip.29578: Flags [P.], seq 2089:2113, ack 2112, win 26847, options [nop,nop,TS val 158887728 ecr 25901468], length 24
13:02:53.164022 IP src_ip.29578 > couchbase_ip.11210: Flags [.], ack 2113, win 44745, options [nop,nop,TS val 25901468 ecr 158887728], length 0

No JIRA issue was opened for this, so I can't tell whether it was resolved in a later version.
I'm wondering if migrating to SDK 2.x will resolve the problem.

Can you provide a thread dump taken after shutdown, please?

Also note that the 1.x SDK has reached end of life, and we recommend everyone move to 2.x anyway.
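If it helps, a minimal 2.x connect/teardown sketch looks roughly like this (the host and bucket name are placeholders; with an SDK-owned environment, disconnect() also stops the underlying IO threads):

	import com.couchbase.client.java.Bucket;
	import com.couchbase.client.java.Cluster;
	import com.couchbase.client.java.CouchbaseCluster;

	// Placeholder host and bucket name; adjust to your deployment.
	Cluster cluster = CouchbaseCluster.create("couchbase_host");
	Bucket bucket = cluster.openBucket("default");

	// ... application work ...

	// Close the bucket and disconnect; when the SDK created its own
	// environment, disconnect() shuts down the core/netty threads too.
	bucket.close();
	cluster.disconnect();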

The thread dump is the same as in the post linked above.

Here's mine:

No deadlocks found.

Thread 26999: (state = IN_NATIVE)
 - sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int) @bci=0 (Compiled frame; information may be imprecise)
 - sun.nio.ch.EPollArrayWrapper.poll(long) @bci=18, line=269 (Compiled frame)
 - sun.nio.ch.EPollSelectorImpl.doSelect(long) @bci=28, line=79 (Compiled frame)
 - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=87 (Compiled frame)
 - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=98 (Compiled frame)
 - net.spy.memcached.MemcachedConnection.handleIO() @bci=130, line=420 (Compiled frame)
 - com.couchbase.client.CouchbaseConnection.run() @bci=15, line=325 (Compiled frame)

Locked ownable synchronizers:
    - None

Thread 23590: (state = BLOCKED)

Locked ownable synchronizers:
    - None

Thread 23830: (state = BLOCKED)
 - java.lang.Thread.sleep(long) @bci=0 (Interpreted frame)
 - com.amazonaws.http.IdleConnectionReaper.run() @bci=21, line=112 (Interpreted frame)

Locked ownable synchronizers:
    - None

Thread 23794: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=135 (Compiled frame)
 - com.mysql.jdbc.AbandonedConnectionCleanupThread.run() @bci=16, line=41 (Compiled frame)

Locked ownable synchronizers:
    - None

Thread 23776: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=135 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=151 (Interpreted frame)
 - com.google.inject.internal.util.$Finalizer.run() @bci=5, line=114 (Interpreted frame)

Locked ownable synchronizers:
    - None

Thread 23703: (state = BLOCKED)

Locked ownable synchronizers:
    - None

Thread 23662: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.ref.ReferenceQueue.remove(long) @bci=44, line=135 (Interpreted frame)
 - java.lang.ref.ReferenceQueue.remove() @bci=2, line=151 (Interpreted frame)
 - java.lang.ref.Finalizer$FinalizerThread.run() @bci=36, line=209 (Interpreted frame)

Locked ownable synchronizers:
    - None

Thread 23660: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.Object.wait() @bci=2, line=503 (Interpreted frame)
 - java.lang.ref.Reference$ReferenceHandler.run() @bci=46, line=133 (Interpreted frame)

Locked ownable synchronizers:
    - None

I see. Can you share the code showing how and when you are shutting down the client? Note that this stack is different from the one in the original post: there it was a netty thread lingering around, here it's a spymemcached thread.

			client.shutdown(10, TimeUnit.SECONDS);

where client is a CouchbaseClient instance.
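For context, a stripped-down version of the create/shutdown flow looks roughly like this (the bootstrap URI and bucket name are placeholders):

	import java.net.URI;
	import java.util.Arrays;
	import java.util.List;
	import java.util.concurrent.TimeUnit;
	import com.couchbase.client.CouchbaseClient;

	// Placeholder bootstrap node and bucket; adjust to your cluster.
	List<URI> nodes = Arrays.asList(URI.create("http://couchbase_host:8091/pools"));
	CouchbaseClient client = new CouchbaseClient(nodes, "default", "");
	try {
	    // ... application work ...
	} finally {
	    // shutdown() returns false if the client could not stop cleanly within the timeout.
	    boolean clean = client.shutdown(10, TimeUnit.SECONDS);
	    System.out.println("clean shutdown: " + clean);
	}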

Based on the observation above, does it mean that it's not always the same thread that is still alive at the end of shutdown?

If I understand your question correctly, the answer is yes:
I get the same stack trace every time it hangs like that.
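For completeness, a quick way to see which non-daemon threads are still alive after shutdown() returns is something like the following (plain JDK APIs, nothing Couchbase-specific):

	import java.util.Map;

	// After client.shutdown(...) returns, list any non-daemon threads still alive.
	for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
	    Thread t = e.getKey();
	    if (t.isAlive() && !t.isDaemon()) {
	        System.out.println("Still alive: " + t.getName() + " (" + t.getState() + ")");
	        for (StackTraceElement frame : e.getValue()) {
	            System.out.println("    at " + frame);
	        }
	    }
	}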