Memory leaks in Java client 2.0.1


#1

The JVM frequently runs a full GC.

I took a heap dump with "jmap -dump:format=b,file=heap.bin" and analyzed it with IBM HeapAnalyzer.

My code is as follows:

public class CouchBaseAccessImpl implements CouchBaseAccess, InitializingBean {

    private final Log logger = LogFactory.getLog(getClass());

    private String servers;
    private List<String> serversList;
    private static Cluster cluster;
   
    @Override
    public JsonObject getjsonByKey(String id) {
        JsonDocument jd = null;
        Bucket bucket = null;
        try {
            bucket = getBucket(DEFAULT_BUCKET_NAME);
            if (bucket != null) jd = bucket.get(id);
        } catch (Exception e) {
            logger.error("getVoipjsonByKey:" + id + " - " + DEFAULT_BUCKET_NAME, e);
            initCluster();
        }
        if (jd == null) return null;

        return jd.content();
    }


    @Override
    public Bucket getBucket(String name) {
        try {
            Bucket bucket = cluster.openBucket(name);
            return bucket;
        } catch (Exception e) {
            logger.error("open Bucket:" + name, e);
            initCluster();
            try {
                Bucket bucket = cluster.openBucket(name);
                return bucket;
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return null;
    }

    private void initCluster() {
        if (cluster != null) {
            cluster.disconnect();
        }
        cluster = CouchbaseCluster.create(DefaultCouchbaseEnvironment.builder().queryEnabled(true).build(), serversList);
        logger.info("re init Cluster complete");
    }

    @Override
    public void afterPropertiesSet() throws Exception {

        Assert.hasText(servers, "couchbase servers must not be empty!");
        String[] serverArr = servers.split(",");
        serversList = Arrays.asList(serverArr);
        cluster = CouchbaseCluster.create(DefaultCouchbaseEnvironment.builder().queryEnabled(true).build(), serversList);
        logger.info("init Cluster complete");
    }
    public void setServers(String servers) {
        this.servers = servers;
    }

}

#2

Hi @liuw086,

thanks for your report. We’ve already released 2.0.2, can you update to this dependency and see if this fixes your issue?


#3

Thanks, daschl! I will try client 2.0.2 as soon as I can.

The Cluster is initialized only once, but I open the Bucket on every request.
When opening the Bucket fails, the Cluster is re-initialized.
So when one node goes down and the cluster auto-fails over,
re-initializing the Cluster ensures the client continues to work well.

Are there any questions?


#4

Hey @liuw086,

The SDK should already take care of reconnections and such, and you can safely share an instance of the Bucket in your code, so you can also set it as a static resource.
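
As a sketch of the "open once and share" advice above (an illustration, not SDK code): the holder below opens each named resource on first use and returns the same instance afterwards. `SharedBuckets` and its `opener` are hypothetical names, the type parameter `R` stands in for the SDK's `Bucket`, and the example assumes Java 8 or later for lambdas.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Opens each named resource at most once and hands the same instance to every caller.
 * R is a stand-in for the SDK's Bucket type (assumption for illustration).
 */
final class SharedBuckets<R> {
    private final ConcurrentMap<String, R> open = new ConcurrentHashMap<>();
    private final Function<String, R> opener; // stand-in for a call such as cluster.openBucket(name)

    SharedBuckets(Function<String, R> opener) {
        this.opener = opener;
    }

    /** Returns the cached instance, invoking the opener only on first use of a name. */
    R get(String name) {
        return open.computeIfAbsent(name, opener);
    }
}

final class SharedBucketsDemo {
    public static void main(String[] args) {
        AtomicInteger opens = new AtomicInteger();
        // A plain Object plays the role of an opened bucket here.
        SharedBuckets<Object> buckets = new SharedBuckets<Object>(name -> {
            opens.incrementAndGet();
            return new Object();
        });

        Object first = buckets.get("voip");
        Object second = buckets.get("voip");
        System.out.println(first == second); // true: the same instance is reused
        System.out.println(opens.get());     // 1: the opener ran only once
    }
}
```

With the real SDK, the opener could presumably be `cluster::openBucket`, so that `getBucket(name)` no longer opens a new Bucket on every request.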

I imagine you are doing some inserts/gets in order to see the full gc? What is the kind of workload you have?

Also, could you share either the logs of a run or some details on your configuration (what kind of machine, how many cores, how much memory, which JVM parameters)? All of that is in the logs that the SDK produces; see “Setting Up Logging” in the docs: http://docs.couchbase.com/developer/java-2.0/logging.html


#5

Hi simonbasle,

I changed the Bucket to static and stopped re-initializing on exceptions, then I stopped one node of my three-node cluster.

After the cluster auto-failed over, the client did not connect to the remaining two nodes.

In contrast, the old code works well.


#6

Mmh ok let’s focus on your initial problem first then.

  • Can you detail the workload you have?
  • Can you produce some logs?
  • And did you try 2.0.2 version as @daschl suggested?

#7

Thanks, simonbasle.

I used jmap to dump the JVM heap and uploaded it here:
http://pan.baidu.com/s/1dDeWqxr

I am now using client version 2.0.2.


#8

@liuw086 which server version are you using?


#9

couchbase-server-enterprise-3.0.1-centos6.x86_64.rpm


#10

@liuw086 I’ve been looking at your hprof and saw that the retained content is mostly Netty’s pool chunks. This could be an indication of a leak, but it might not be: just because a leak suspect is reported doesn’t mean there is one (suspect ;)). Now I checked your code and boiled it down to

while (true) {
    bucket.get(id);
}

Is that true? So you are only doing key/value gets right? I ran that and didn’t find a leak with 2.0.2. Can you try to reproduce this outside of the spring context and/or share some full code we can run that exhibits the behavior?


#11

Yes. It usually runs well, until a timeout has occurred several times.
I also use couchbase-query, enabled with:
DefaultCouchbaseEnvironment.builder().queryEnabled(true)

    @Override
    public <T> List<T> getListSimpleObject(String type, Class<T> clazz) {
        List<T> tlist = null;
        Bucket bucket = null;
        try {
            bucket = getBucket(DEFAULT_BUCKET_NAME);

            long s = System.currentTimeMillis();
            QueryResult query = bucket.query("select * from voip where type = '" + type + "'");
            if (logger.isDebugEnabled()) {
                logger.debug("getListSimpleObject " + type + " from couchbase  use time = " + (System.currentTimeMillis() - s) + "ms "
                             + query.success());
            }
            if (query.success()) {
                List<QueryRow> list = query.allRows();
                tlist = new ArrayList<T>();
                for (QueryRow queryRow : list) {
                    try {
                        if (logger.isDebugEnabled()) {
                            logger.debug("getListSimpleObject load " + type + " row :" + queryRow.toString());
                        }
                        Map<String, Object> ip = getClassAttributeMap(clazz, queryRow.value());
                        T t = JSONUtil.json2Object(JSONUtil.map2Json(ip), clazz);
                        tlist.add(t);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        } catch (Exception e) {
            logger.error("getListSimpleObject:" + type + " - " + clazz + " - " + DEFAULT_BUCKET_NAME, e);
            initCluster();
        }
        return tlist;
    }

The call to “CouchBaseAccess.getListSimpleObject” took 80.235 seconds.

2014-12-13-10:39:46,734 pool-2-thread-5 INFO - [(CouchBaseAccess.getListSimpleObject,SYSTEM_ERROR,80235ms)][args=“v_version”,class com.xiaomi.voip.model.judge.VersionDO,][result=null]
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:108)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:99)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:89)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:79)
at com.xiaomi.voip.module.da.impl.CouchBaseAccessImpl.initCluster(CouchBaseAccessImpl.java:220)
at com.xiaomi.voip.module.da.impl.CouchBaseAccessImpl.getListSimpleObject(CouchBaseAccessImpl.java:114)

2014-12-13-10:39:41,500 pool-2-thread-5 ERROR - getListSimpleObject:v_version - class com.xiaomi.voip.model.judge.VersionDO - voip
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
at com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:488)
at com.couchbase.client.java.CouchbaseBucket.query(CouchbaseBucket.java:454)
at com.xiaomi.voip.module.da.impl.CouchBaseAccessImpl.getListSimpleObject(CouchBaseAccessImpl.java:91)

2014-12-13-10:40:13,017 pool-2-thread-4 ERROR - getVoipjsonByKey:v_config_relay_ip_net - voip
java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
at com.couchbase.client.java.CouchbaseBucket.get(CouchbaseBucket.java:63)
at com.couchbase.client.java.CouchbaseBucket.get(CouchbaseBucket.java:58)
at com.xiaomi.voip.module.da.impl.CouchBaseAccessImpl.getVoipjsonByKey(CouchBaseAccessImpl.java:54)
Caused by: java.util.concurrent.TimeoutException
… 27 more
2014-12-13-10:40:13,037 pool-2-thread-4 INFO - re init Cluster complete

I have some scheduled tasks that load data from Couchbase like this:

scheduledExecutor.scheduleWithFixedDelay(new AbstractTask() {

    @Override
    public void runTask() {
        couchBaseAccess.getListSimpleObject("v_config_" + EngineType.AGOLA.getName(), Config.class);
        .....
    }
}, 1, 1, TimeUnit.MINUTES);

#12

I found another problem: the client uses a lot of CPU (40% of 8 cores).

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15724 root 20 0 13.5g 2.9g 18m S 400.1 9.4 124:32.94 java

top -p 15724 -H
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15763 root 20 0 13.5g 2.9g 18m R 100.7 9.4 15:58.79 java
15772 root 20 0 13.5g 2.9g 18m R 100.7 9.4 19:21.44 java
15770 root 20 0 13.5g 2.9g 18m R 98.7 9.4 39:12.18 java
15775 root 20 0 13.5g 2.9g 18m R 98.7 9.4 43:17.31 java

Four threads are each using about 100% CPU.
I looked at thread 15763 (0x3d93):

jstack 15724 | grep '0x3d93':

"cb-computations-1" daemon prio=10 tid=0x00007f5a10abd000 nid=0x3d93 runnable [0x00007f5a059f2000]
java.lang.Thread.State: RUNNABLE
at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.pollQueue(OperatorObserveOn.java:181)
at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.access$000(OperatorObserveOn.java:65)
at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber$2.call(OperatorObserveOn.java:153)
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:47)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

Threads 15772 (0x3d9c), 15770 (0x3d9a) and 15775 (0x3d9f):
"cb-computations-6" daemon prio=10 tid=0x00007f597c04a000 nid=0x3d9c runnable [0x00007f5a052eb000]
"cb-computations-5" daemon prio=10 tid=0x00007f597c047800 nid=0x3d9a runnable [0x00007f5a054ed000]
"cb-computations-8" daemon prio=10 tid=0x00007f596c005800 nid=0x3d9f runnable [0x00007f59fecb5000]
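
The hex values above come from converting the thread ID shown in the PID column of `top -H` to hexadecimal, which is the form jstack prints as `nid`. A small sketch of that conversion follows; `ThreadIdToNid` is a hypothetical helper name, and the IDs are the four busy threads from this post.

```java
final class ThreadIdToNid {
    /** Formats a thread ID from `top -H` the way jstack prints it in the nid field. */
    static String toNid(int tid) {
        return "0x" + Integer.toHexString(tid);
    }

    public static void main(String[] args) {
        int[] busyThreads = {15763, 15772, 15770, 15775};
        for (int tid : busyThreads) {
            // Prints e.g. "15763 -> nid=0x3d93"
            System.out.println(tid + " -> nid=" + toNid(tid));
        }
    }
}
```

The printed value can then be passed to grep on the jstack output to find the matching Java thread.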

What is the problem?


#13

It’s very hard to say with limited information. Can you please open a ticket here: http://www.couchbase.com/issues/browse/JCBC
and include the following information:

  • Environment (os, java version,…)
  • full code to reproduce
  • logs from the run
  • expected behaviour/actual behaviour

Using CPU is not a bad thing by itself, since that’s what CPUs are for :wink: Wasting CPU time because of bugs is something else. If you can also spare some time, doing a profiling run with JMH or YourKit would certainly make things even easier to debug. If you can attach such a run, it would be great.


#14

I found https://issues.couchbase.com/browse/JCBC-583, which says it was resolved in 2.0.2.
But I am already on 2.0.2:
https://issues.couchbase.com/browse/JCBC-670

Thanks

2015-01-05-15:35:42,385 cb-io-1-4 ERROR - LEAK: ByteBuf.release() was not called before it’s garbage-collected.
Recent access records: 0
Created at:
com.couchbase.client.deps.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:259)
com.couchbase.client.deps.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
com.couchbase.client.deps.io.netty.buffer.PooledUnsafeDirectByteBuf.copy(PooledUnsafeDirectByteBuf.java:320)
com.couchbase.client.deps.io.netty.buffer.SlicedByteBuf.copy(SlicedByteBuf.java:150)
com.couchbase.client.deps.io.netty.buffer.AbstractByteBuf.copy(AbstractByteBuf.java:919)
com.couchbase.client.core.endpoint.query.QueryHandler.parseQueryInfo(QueryHandler.java:326)
com.couchbase.client.core.endpoint.query.QueryHandler.parseQueryResponse(QueryHandler.java:212)
com.couchbase.client.core.endpoint.query.QueryHandler.decodeResponse(QueryHandler.java:152)
com.couchbase.client.core.endpoint.query.QueryHandler.decodeResponse(QueryHandler.java:57)
com.couchbase.client.core.endpoint.AbstractGenericHandler.decode(AbstractGenericHandler.java:152)
com.couchbase.client.deps.io.netty.handler.codec.MessageToMessageCodec$2.decode(MessageToMessageCodec.java:81)
com.couchbase.client.deps.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
com.couchbase.client.deps.io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
com.couchbase.client.deps.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
com.couchbase.client.deps.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
com.couchbase.client.deps.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
com.couchbase.client.deps.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
java.lang.Thread.run(Thread.java:724)
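
For context on the message above: Netty's pooled buffers are reference counted, and its leak detector logs this error when a buffer is garbage-collected while its count is still above zero. The class below is a stdlib-only toy model of that contract, not Netty's implementation; `ToyBuffer` and its methods are hypothetical names chosen for illustration, and the sketch assumes Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a reference-counted buffer (illustration only, not Netty code). */
final class ToyBuffer {
    // A freshly allocated buffer starts with one reference, owned by its creator.
    private final AtomicInteger refCnt = new AtomicInteger(1);

    int refCnt() {
        return refCnt.get();
    }

    /** Adds a reference, for example before handing the buffer to another component. */
    ToyBuffer retain() {
        int prev = refCnt.getAndUpdate(c -> c == 0 ? 0 : c + 1);
        if (prev == 0) {
            throw new IllegalStateException("buffer already deallocated");
        }
        return this;
    }

    /** Drops a reference; returns true when the count reaches zero and the memory can be reused. */
    boolean release() {
        int prev = refCnt.getAndUpdate(c -> c == 0 ? 0 : c - 1);
        if (prev == 0) {
            throw new IllegalStateException("released more often than retained");
        }
        return prev == 1;
    }
}

final class ToyBufferDemo {
    public static void main(String[] args) {
        ToyBuffer copy = new ToyBuffer();   // count is 1
        copy.retain();                      // count is 2
        System.out.println(copy.release()); // false: one reference remains
        System.out.println(copy.release()); // true: count reached zero
        // Skipping the final release() is the situation the LEAK message reports.
    }
}
```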


#15

Okay thanks, let’s follow up in the ticket. Btw, the ticket you linked is referring to views, not N1QL queries.