Hi everyone, we’re facing some issues when trying to deploy a distributed Couchbase cluster.
We want to migrate an old cluster (5.1.1) to a newer, stable version (6.6.0). In the new setup, we'll split the deployment into a data cluster and an index cluster, as the image above shows, using XDCR to replicate data between the clusters.
We use 4 VMs per cluster, each with:
1 SSD disk for the OS, 150 GB
1 extreme SSD disk, 100 GB, for /opt/couchbase/data (to avoid disk contention)
1 extreme SSD disk, 100 GB, for /opt/couchbase/index (to avoid disk contention)
Both clusters (data and index) are in the same VPC region to improve internal connection throughput. We're using a fine-tuning shell script to adjust Linux settings (raise the max open files limit, disable swap, etc.).
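In case it helps anyone reproduce the setup, here is a minimal sketch of how the two settings mentioned above could be verified on a node. It is not our actual tuning script; the function names are made up for illustration, and it only reports the limits seen by the process that runs it, which may differ from the limits applied to the Couchbase service itself.

```python
import resource
from pathlib import Path


def read_text_or_none(path):
    """Return the stripped contents of a small /proc file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def swap_in_use(proc_swaps_text):
    """True if /proc/swaps lists at least one device below its header line."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    return len(lines) > 1


def report():
    """Collect the open-files limit and swap state for the current process/host."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "nofile_soft": soft,
        "nofile_hard": hard,
        "swappiness": read_text_or_none("/proc/sys/vm/swappiness"),
        "swap_in_use": swap_in_use(read_text_or_none("/proc/swaps") or ""),
    }


if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value}")
```

Running it on each VM after the tuning script gives a quick way to confirm the settings actually took effect.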
First, we create the buckets on both clusters and start copying items using XDCR. On the data cluster everything looks good, but when we start creating indexes on the index cluster, the clusters behave strangely: after an index is created, many Couchbase nodes go down.
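To make "nodes go down" easier to pin to a point in time, a sketch like the following could poll the cluster REST API and list nodes that are not reporting healthy. It assumes the standard `/pools/default` endpoint on port 8091 and the `nodes[].status` field in its response; the host name and credentials are placeholders, not our real values.

```python
import base64
import json
import urllib.request


def fetch_pool(host, user, password, port=8091):
    """GET /pools/default from one node and return the parsed JSON document."""
    request = urllib.request.Request(f"http://{host}:{port}/pools/default")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def unhealthy_nodes(pool):
    """Return (hostname, status, services) for every node not reporting 'healthy'."""
    return [
        (node.get("hostname"), node.get("status"), node.get("services", []))
        for node in pool.get("nodes", [])
        if node.get("status") != "healthy"
    ]


if __name__ == "__main__":
    # Placeholder host and credentials (assumptions for illustration only).
    pool = fetch_pool("index-node-1.example.internal", "Administrator", "password")
    for hostname, status, services in unhealthy_nodes(pool):
        print(f"{hostname}: {status} (services: {', '.join(services)})")
```

Running this in a loop while an index is being created would show which nodes change state and in what order.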
Keep in mind that the cluster has this problem with no users or queries running against it. We tried growing the disk to improve throughput, but every time we grow the disk we see the same behavior.
We suspect that some process is consuming resources even when the cluster is not in use.
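One way this suspicion could be checked is to snapshot per-process CPU and memory on an index node while it is idle. The sketch below is only an illustration: it shells out to `ps` (procps on Linux), the helper names are hypothetical, and `%CPU` from `ps` is an average over the process lifetime rather than an instantaneous reading.

```python
import os
import subprocess


def parse_ps(text):
    """Parse `ps -eo pid,pcpu,rss,comm` output into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue
        pid, cpu, rss_kb, comm = parts
        rows.append({
            "pid": int(pid),
            "cpu": float(cpu),
            "rss_mb": int(rss_kb) / 1024,  # ps reports RSS in KiB
            "comm": comm.strip(),
        })
    return rows


def top_consumers(rows, key="cpu", limit=10):
    """Return the `limit` rows with the highest value for `key` ('cpu' or 'rss_mb')."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:limit]


def snapshot():
    """Run ps once (with a C locale so decimals use '.') and parse its output."""
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,rss,comm"],
        capture_output=True, text=True, check=True, env=env,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    for row in top_consumers(snapshot(), key="rss_mb", limit=5):
        print(f'{row["pid"]:>7} {row["cpu"]:>6.1f}% {row["rss_mb"]:>10.1f} MB  {row["comm"]}')
```

Sorting by `rss_mb` versus `cpu` separates a memory problem from a CPU problem, which should narrow down which process to look at next.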