Error: Couchbase cluster scaling issue


#1

I am creating a Couchbase cluster in Kubernetes by following the instructions at the link below:

http://docs.couchbase.com/prerelease/couchbase-operator/beta/installKubernetes.html

I am able to create the Couchbase operator and the Couchbase cluster, but the cluster is running only one pod even though the YAML file specifies 3 servers:

servers:
  - size: 3
    name: all_services
    services:
      - data
      - index
      - query
      - search
    dataPath: /opt/couchbase/var/lib/couchbase/data
    indexPath: /opt/couchbase/var/lib/couchbase/data

Getting the following error in operator pod logs:

time="2018-08-14T19:26:19Z" level=error msg="cluster failed to setup: unable to contact cb-example-0000.cb-example.default.svc:8091 after 120 attempts (dial tcp: i/o timeout)" cluster-name=cb-example module=cluster
time="2018-08-14T19:26:19Z" level=error msg="failed to update cluster phase (Failed): failed to update CR status: Operation cannot be fulfilled on couchbaseclusters.couchbase.database.couchbase.com "cb-example": the object has been modified; please apply your changes to the latest version and try again" cluster-name=cb-example module=cluster
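
The first log line says the operator could not open a TCP connection to port 8091 on the first pod. A minimal sketch of the same kind of check, to be run from another pod in the cluster, is below. The function name `can_connect` and the 3-second timeout are illustrative assumptions, not part of the operator.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and opens the TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS lookup failures, refused connections, and timeouts.
        return False
```

Calling `can_connect("cb-example-0000.cb-example.default.svc", 8091)` from a pod on each worker node would show whether the failure depends on which node the caller runs on. A False result does not distinguish a DNS problem from a blocked or unrouted connection.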


#2

Thanks for reaching out @sasanth. Sorry to hear that you are running into issues with the Couchbase Autonomous Operator.

Can you please answer the following questions so that we can help you better?

How was the k8s cluster created?
What version of k8s is used?
How many worker nodes are in the k8s cluster? Are they able to reach each other (for the time being, you can turn off the firewall)?
Can you also copy and paste the output of the following command?
kubectl get pods --namespace=kube-system


#3

The k8s cluster was created directly on CentOS 7.
k8s version: v1.11.1
Worker nodes: 2, and yes, they are able to reach each other.

avid-cfv174 ~]$ kubectl get pods --namespace=kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-78fcdf6894-9hhpf               1/1     Running   58         6d
coredns-78fcdf6894-x475l               1/1     Running   1          6d
etcd-avid-cfv174                       1/1     Running   1          6d
kube-apiserver-avid-cfv174             1/1     Running   1          6d
kube-controller-manager-avid-cfv174    1/1     Running   1          6d
kube-proxy-c98lk                       1/1     Running   0          6d
kube-proxy-d9ntb                       1/1     Running   1          6d
kube-proxy-wxxsp                       1/1     Running   0          1d
kube-scheduler-avid-cfv174             1/1     Running   1          6d
kubernetes-dashboard-6948bdb78-pw92n   1/1     Running   0          1d
weave-net-kk8ld                        2/2     Running   0          6d
weave-net-mhx8j                        2/2     Running   1          1d
weave-net-vtb9f                        2/2     Running   9          6d
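
When reading a listing like this, the RESTARTS column is a quick place to look for unstable system pods. Below is a small sketch that picks out pods above a restart threshold from pasted output. The function name `pods_with_restarts` and the threshold of 5 are arbitrary assumptions, and it expects the five-column layout shown here. A high count is only a hint about where to look, not a diagnosis.

```python
def pods_with_restarts(kubectl_output: str, threshold: int = 5) -> list:
    """Return (pod name, restart count) pairs for pods above the threshold."""
    flagged = []
    for line in kubectl_output.strip().splitlines():
        fields = line.split()
        # Expect NAME READY STATUS RESTARTS AGE; skip the header and odd lines.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, restarts = fields[0], fields[3]
        if restarts.isdigit() and int(restarts) > threshold:
            flagged.append((name, int(restarts)))
    return flagged


# A few rows copied from the listing in this post.
sample = """\
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-9hhpf 1/1 Running 58 6d
coredns-78fcdf6894-x475l 1/1 Running 1 6d
weave-net-kk8ld 2/2 Running 0 6d
weave-net-vtb9f 2/2 Running 9 6d
"""
print(pods_with_restarts(sample))
```

On these rows it flags one coredns pod (58 restarts) and one weave-net pod (9 restarts).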


#4

Thanks for the information @sasanth.

I suspected the underlying OS was RHEL/CentOS. I ran into the exact same issue with RHEL. The issue is that the Weave networking module does not always work on RHEL/CentOS. As a result, when K8s places a pod on another worker node, that pod is not reachable, so we never see it.

From here there are two options:

Option 1: Try adding a ConfigMap for DNS. The YAML file is below.


apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  upstreamNameservers: |-
    ["8.8.8.8", "8.8.4.4"]

Option 2: Try the exact same thing on Ubuntu.
PS: It worked for me on Ubuntu.


#5

Thanks for the response @ram.

I tried Option 1 but I am getting the same error. I haven't tried Option 2 yet because all of my servers run RHEL/CentOS. Please let me know if you know of any other working solution for CentOS.


#6

I would highly recommend trying Option 2 or escalating on the CentOS/RHEL forums.
I spent a couple of days trying in vain to get the K8s cluster working on RHEL. I would use OpenShift for k8s on RHEL.


#7

@ram Thanks for looking into this. I am trying the OpenShift option.