Unknown error in debug.log with Java SDK NullpointException from time to time

I have an error message in debug.log and don’t know what it mean.

Couchbase Server 3.0.2

=========================PROGRESS REPORT=========================
supervisor:
{local,menelaus_sup}
started: [{pid,<0.16556.158>},
{name,menelaus_web_alerts_srv},
{mfargs,{menelaus_web_alerts_srv,start_link,[]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,worker}]

[error_logger:error,2015-08-25T11:11:43.930,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]OS_MON (memsup) called by <0.308.0>, not started

[error_logger:error,2015-08-25T11:11:43.930,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]OS_MON (memsup) called by <0.308.0>, not started

[error_logger:error,2015-08-25T11:11:43.930,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]OS_MON (disksup) called by <0.308.0>, not started

[error_logger:error,2015-08-25T11:11:44.320,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]OS_MON (disksup) called by <0.16599.158>, not started

[error_logger:error,2015-08-25T11:11:44.320,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,menelaus_sup}
Context: child_terminated
Reason: {{badmatch,none},
[{menelaus_web_alerts_srv,'check/4-lc$^1/1-2',2,
[
{file,"src/menelaus_web_alerts_srv.erl"},{line,245}]},
{menelaus_web_alerts_srv,check,4,
[{file,"src/menelaus_web_alerts_srv.erl"}
,
{line,244}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
{menelaus_web_alerts_srv,'handle_info/2-fun-0',2,
[{file,"src/menelaus_web_alerts_srv.erl"},
{line,170}]}]}
Offender: [{pid,<0.16556.158>},
{name,menelaus_web_alerts_srv},
{mfargs,{menelaus_web_alerts_srv,start_link,[]}},
{restart_type,permanent},
{shutdown,5000},
{child_type,worker}]


[error_logger:error,2015-08-25T11:11:44.320,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]Error in process <0.16599.158> on node 'ns_1@127.0.0.1' with exit value: {{badmatch,none},[{menelaus_web_alerts_srv,'check/4-lc$^1/1-2',2,[{file,"src/menelaus_web_alerts_srv.erl"},{line,245}]},{menelaus_web_alerts_srv,check,4,[{file,"src/menelaus_web_alerts_srv.erl"},{line,244}
]},{lists,foldl...
[error_logger:info,2015-08-25T11:11:44.320,ns_1@127.0.0.1:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]

It will show an error message and resend it in every seconds.
I have no idea whether it occurred by killing the process memcached.
I kill the process is because to NullpointException in the project witch using Spring-Data-Couchbase 1.3.2.RELEASE depends on couchbase-cleint 1.4.7, memcached 2.11.6.
It happen randomly,not every save() method with spring interface CrudRepository would cause.

java.lang.NullPointerException: null
at net.spy.memcached.internal.OperationFuture.getCas(OperationFuture.java:226) ~[spymemcached-2.11.6.jar:2.11.6]
at org.springframework.data.couchbase.core.CouchbaseTemplate$4.doInBucket(CouchbaseTemplate.java:317) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.data.couchbase.core.CouchbaseTemplate$4.doInBucket(CouchbaseTemplate.java:299) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.data.couchbase.core.CouchbaseTemplate.execute(CouchbaseTemplate.java:244) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.data.couchbase.core.CouchbaseTemplate.save(CouchbaseTemplate.java:299) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.data.couchbase.core.CouchbaseTemplate.save(CouchbaseTemplate.java:149) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.data.couchbase.repository.support.SimpleCouchbaseRepository.save(SimpleCouchbaseRepository.java:82) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[na:1.7.0_60]
at java.lang.reflect.Method.invoke(Unknown Source) ~[na:1.7.0_60]
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:434) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:419) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:391) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
at org.springframework.data.couchbase.repository.support.ViewPostProcessor$ViewInterceptor.invoke(ViewPostProcessor.java:87) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
at com.sun.proxy.$Proxy59.save(Unknown Source) ~[na:na]

OperationFuture.java:226
A global variable ‘status’ is null.

In order to solve this issue, I kill the process and it will restart automatically.
After restart, this NullpointException disappear, everything can work.
Then the disk overview in the console cannot show correctly.
As below

Usable Free Space (0 B)
Total Cluster Storage (0 B)
In Use (3.44 GB)
Other Data (0 B)
Free (-3700562197 B)

I did not know whether has the relations with unusual status in couchbase.

Is there any recommendation to deal with it or any more info I can provide?
Should I re-install the couchbase? because of these issues can not be found in another machine(same version).

@simonbasle can you advise ?

it may be a bug in spymemcached (in OperationFuture.getCas()). maybe a race condition, seeing as the status should have been set by the call to get() just before?

@daschl, to add to the plate of next bugfix release of spymemcached/couchbase 1.4?

Note on link with Spring-Data-Couchbase: one user of the Spring Data Couchbase connector also saw it but at the time made the analysis that this was maybe due to autoboxing a null in the spring-data-couchbase code (which he attempted to fix via a pull request). In hindsight I don’t think it was the real root cause (even if the null-check protection made sense).

2 Likes

I think the couchbase server issue may solved [here][1].
But it still make no sense why the NullPointException happen.
[1]: https://issues.couchbase.com/browse/MB-16153

I’m not sure killing the memcached process is the best way to go if you can restart the server more gracefully… The NullPointerException maybe comes from a weird state from the server.

In the Java project, do you have logs enabled? Do you see a WARN message starting with “Error getting cas of operation” before the NullPointerException happens? If so can you post this part of the logs?

Also you say this happens randomly on save. Would you be able to identify a few keys for which the problem happens and confirm that they are served by the couchbase node that has errors?
To do that you can use the vbuckettool command line tool (in couchbaseInstallFolder/bin/tools):

cd bin/tools
curl http://ANYNODEINCLUSTER:8091/pools/default/buckets/BUCKETNAME | ./vbuckettool - PROBLEMATICKEY1 PROBLEMATICKEY2

This will print out the nodes on which each of the keys would be processed, allowing you to verify that all problematic keys go to the same node.

Here is the more complete logs I got

   2015-09-01 12:05:24.172 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl:  Discarding partially completed op: Cmd: 1 Opaque: 104228 Key: MonitorMsg@1441080322989 Cas: 0 Exp: 0 Flags: 0 Data Length: 11524804
   2015-09-01 12:05:24.174 INFO com.couchbase.client.CouchbaseConnection:  Reconnecting due to exception on {QA sa=WIN-RPSU7BPOQGI/172.28.11.158:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=1}
    java.lang.RuntimeException: Unhandled state: {OperationStatus success=false:  Temporary failure}
    	at net.spy.memcached.MemcachedClient$4.receivedStatus(MemcachedClient.java:640)
    	at net.spy.memcached.protocol.binary.OperationImpl.finishedPayload(OperationImpl.java:212)
    	at net.spy.memcached.protocol.binary.OperationImpl.readPayloadFromBuffer(OperationImpl.java:196)
    	at net.spy.memcached.protocol.binary.OperationImpl.readFromBuffer(OperationImpl.java:139)
    	at net.spy.memcached.MemcachedConnection.readBufferAndLogMetrics(MemcachedConnection.java:861)
    	at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:840)
    	at net.spy.memcached.MemcachedConnection.handleReadsAndWrites(MemcachedConnection.java:720)
    	at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:683)
    	at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:436)
    	at com.couchbase.client.CouchbaseConnection.run(CouchbaseConnection.java:325)
 2015-09-01 12:05:24.191 WARN com.couchbase.client.CouchbaseConnection:  Closing, and reopening {QA sa=WIN-RPSU7BPOQGI/172.28.11.158:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=1}, attempt 0.
 2015-09-01 12:05:24,199 ERROR [scheduler-3:locked] scheduling [GetSymbolDataStocksOnGQouteTask.java:55] - GetSymbolDataStocksOnGQouteTask fail: 
    java.lang.NullPointerException: null
    	at net.spy.memcached.internal.OperationFuture.getCas(OperationFuture.java:226) ~[spymemcached-2.11.6.jar:2.11.6]
    	at org.springframework.data.couchbase.core.CouchbaseTemplate$4.doInBucket(CouchbaseTemplate.java:317) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.data.couchbase.core.CouchbaseTemplate$4.doInBucket(CouchbaseTemplate.java:299) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.data.couchbase.core.CouchbaseTemplate.execute(CouchbaseTemplate.java:244) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.data.couchbase.core.CouchbaseTemplate.save(CouchbaseTemplate.java:299) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.data.couchbase.core.CouchbaseTemplate.save(CouchbaseTemplate.java:149) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.data.couchbase.repository.support.SimpleCouchbaseRepository.save(SimpleCouchbaseRepository.java:82) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_76]
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_76]
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_76]
    	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_76]
    	at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.executeMethodOn(RepositoryFactorySupport.java:434) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
    	at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.doInvoke(RepositoryFactorySupport.java:419) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
    	at org.springframework.data.repository.core.support.RepositoryFactorySupport$QueryExecutorMethodInterceptor.invoke(RepositoryFactorySupport.java:391) ~[spring-data-commons-1.10.2.RELEASE.jar:na]
    	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
    	at org.springframework.data.couchbase.repository.support.ViewPostProcessor$ViewInterceptor.invoke(ViewPostProcessor.java:87) ~[spring-data-couchbase-1.3.2.RELEASE.jar:na]
    	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
    	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
    	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
    	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) ~[spring-aop-4.1.0.RELEASE.jar:4.1.0.RELEASE]
    	at com.sun.proxy.$Proxy62.save(Unknown Source) ~[na:na]
    	...
    2015-09-01 12:05:26.193 INFO com.couchbase.client.CouchbaseConnection:  Reconnecting {QA sa=WIN-RPSU7BPOQGI/172.28.11.158:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
    2015-09-01 12:05:26.315 INFO net.spy.memcached.auth.AuthThread:  Authenticated to WIN-RPSU7BPOQGI/172.28.11.158:11210

it looks like the server declares a transient error (TemporaryFailure), but apparently it keeps hapening…
I wonder if there’s a problem with the installation on that particular node?

1 Like

I find the problem with the node in here.

With Couchbase Server, if you overwhelm the amount of memory from workload, rather than become slower, Couchbase will send back temporary failures indicating you can retry the operation. This gives the software a bit more control than with other database systems, where they become slower for all operations under load. It is a “relief valve” of sorts, and gives you as an app developer the opportunity to backoff and retry operations if appropriate in your application logic.

The bucket which occur NullpointException doesn’t allocate enough memory.
It occurs because of I save a huge document when the bucket mem_used is close to what it allocated.
I solved this issue after give it more memory.
@simonbasle, thanks for your replies.

you’re welcome, glad you found the root cause of your issue :thumbsup: