Suggestion for distributed VPN for Couchbase Cluster on public cloud

I know this is a longstanding problem that anyone who runs a database in a public cloud / VPS hosting environment is likely to have, so I’m sure there’s an obvious solution I’m simply unaware of. I’m a developer, and ops is not my strong suit.

PROBLEM

Since my cluster will live in a datacenter outside my control, every open port on each node is exposed to the world. I don’t have a way to create a private network, so I would like to create a virtual private network.

I’m looking for suggestions for open source software that does this, with the important requirement that there be no central VPN server.

GOAL

Thus I’d like there to be a separate network (e.g. 10.0.1.X) that all of the Couchbase nodes use to communicate with each other, so that 8091, et al., are only open on this network and not on the publicly routed interface. (A web server operating on port 80 would be how the outside world talks to the cluster, and all the rest of the ports, except SSH, would be firewalled off on the public IP.)
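To make the goal concrete, here is a rough sketch of the firewall policy described above, written as a small Python script that only prints iptables commands. The 10.0.1.0/24 subnet, the function name, and the choice to list only 8091 are assumptions for illustration; the remaining Couchbase ports would be added to the private list.

```python
def firewall_rules(private_subnet, private_ports, public_ports):
    """Build iptables commands: public ports open to all, private ports only to the VPN subnet."""
    rules = [
        "iptables -A INPUT -i lo -j ACCEPT",
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    # Ports the outside world may reach (SSH and the web server).
    for port in public_ports:
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    # Cluster ports are accepted only when the source is on the private network.
    for port in private_ports:
        rules.append(
            f"iptables -A INPUT -p tcp -s {private_subnet} --dport {port} -j ACCEPT"
        )
    # Everything else inbound is dropped.
    rules.append("iptables -P INPUT DROP")
    return rules


if __name__ == "__main__":
    # Assumed values: 10.0.1.0/24 as the private network, 8091 as one cluster port.
    for rule in firewall_rules("10.0.1.0/24", ["8091"], ["22", "80"]):
        print(rule)
```

Whatever port the VPN software itself listens on would also have to stay reachable on the public interface, otherwise the nodes could not build the private network in the first place.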

POSSIBLE SOLUTION

Conceivably, I can create SSH tunnels between each of the Couchbase nodes, use key-based authentication to make it work automatically, and set up some sort of watchdog to keep the tunnels up. That is quite possible, since I’ll know the map of the whole cluster, and thus each node would simply need N-1 tunnels to the other nodes. I believe that is a solution that will work.
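As a quick sketch of what that map looks like, the snippet below lists the N-1 peers each node would have to tunnel to and counts the total. The node names are made up for illustration.

```python
def tunnel_plan(nodes):
    """Map each node to the N-1 peers it must keep a tunnel open to."""
    return {node: [peer for peer in nodes if peer != node] for node in nodes}


if __name__ == "__main__":
    nodes = ["cb1", "cb2", "cb3", "cb4"]  # hypothetical node names
    plan = tunnel_plan(nodes)
    for node, peers in plan.items():
        print(node, "->", ", ".join(peers))
    # Every node opens N-1 tunnels, so the watchdog has N * (N-1) to look after.
    print("tunnels to maintain:", sum(len(peers) for peers in plan.values()))
```

The count grows quadratically with the cluster size, which is part of why I suspect a purpose-built tool would be less work to keep running.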

However, I suspect someone has solved this problem before in a better and more elegant way. I’ve googled, but I haven’t found an obviously good or best solution.

Hello,

Your question is not directly related to Couchbase; it is more of a network configuration/security question.

First of all, here is the list of all the ports used by Couchbase:
http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-network-ports.html

So you have to create a virtual network between all the nodes of your cluster and open the ports on it, then open the ports between your cluster and the application server. That way you do not need to open all ports to the “external world”.

This is typically the way it works when it is deployed on Amazon EC2.

Regards
Tug
@tgrall

I’m asking for suggestions for software that will create the virtual network between all the nodes in the cluster, without having a single point of failure.

Specifically, the standard network config/security answer is to have a VPN device to which all the nodes connect, and this device routes the connections between them.

Following that advice would completely defeat the purpose of using Couchbase instead of a regular SQL database:

  1. All data goes thru this one node, making it a bottleneck.
  2. This node becomes a single point of failure, such that if it goes down, the cluster is effectively offline (or partitioned into N separate partitions) even if all the Couchbase nodes are still functioning.
  3. It introduces increased latency and the potential for very nefarious failure modes, which will likely cause problems with rebalancing. What does a node do when one of its replica nodes disappears? It tries to rebalance, but the node it’s rebalancing onto is able to see the node that disappeared as if it’s still there. I think there are probably N^n potential issues that could come up when nodes can be both “up” and “down” at the same time in a cluster, and different nodes each see a different cluster. These problems can exist in a fully distributed VPN too, but are less likely, as there’s no single node to get backed up on routing packets across the cluster.
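To illustrate point 2, here is a toy model (not a measurement of any real VPN) that counts how many partitions are left when one machine fails, first with a central VPN hub and then with a full mesh. The node names and the `vpn` hub are hypothetical.

```python
def components(nodes, links):
    """Group nodes into connected partitions, given undirected links."""
    remaining = set(nodes)
    groups = []
    while remaining:
        start = remaining.pop()
        group, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            for a, b in links:
                # Pick the other end of any link that touches the current node.
                neighbour = b if a == current else a if b == current else None
                if neighbour in remaining:
                    remaining.discard(neighbour)
                    group.add(neighbour)
                    frontier.append(neighbour)
        groups.append(group)
    return groups


def without(node, nodes, links):
    """Return the nodes and links that survive when one node fails."""
    return ([n for n in nodes if n != node],
            [(a, b) for a, b in links if node not in (a, b)])


if __name__ == "__main__":
    cb = ["cb1", "cb2", "cb3", "cb4"]  # hypothetical Couchbase nodes
    star = [("vpn", n) for n in cb]    # every node talks through one VPN server
    mesh = [(a, b) for i, a in enumerate(cb) for b in cb[i + 1:]]

    hub_down = components(*without("vpn", cb + ["vpn"], star))
    node_down = components(*without("cb1", cb, mesh))
    print("central VPN server fails:", len(hub_down), "partitions")
    print("one mesh node fails:", len(node_down), "partition(s)")
```

With the central server gone, every Couchbase node ends up alone in its own partition, while losing one node of the mesh leaves the rest connected.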

The issue is not that I want to open all ports to the external world; that is the current state of affairs. I am asking how to close them off and protect them efficiently with a VPN. They’re “open to the external world” because there is no firewall between the cluster and the external world. So, yes, I need to create a VPN and set the Couchbase nodes up on the private virtual network, so that the ports are only open to other nodes… but that is the very software I’m asking for suggestions for.

All of the existing solutions that I’ve found (OpenVPN, strongSwan, etc.) have a single VPN server thru which all data goes.

For a decentralised VPN network you can try tinc ( http://tinc-vpn.org/ )

But my question is: what bandwidth does a Couchbase cluster require besides replication? Did you get anything running?
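For the tinc suggestion above, a minimal sketch of the per-node configuration might look like the following. It assumes tinc 1.0-style files (a `tinc.conf` with `Name` and `ConnectTo` lines, and one host file per node with `Address` and `Subnet`); the node names and IP addresses are placeholders, and each host file would also need the node's public key, which tinc generates separately.

```python
def tinc_files(name, nodes):
    """Return (tinc.conf text, host file text) for one node of a full mesh.

    `nodes` maps a node name to a (public_ip, vpn_ip) pair.
    """
    peers = [n for n in nodes if n != name]
    # Each node lists every other node, so no single peer is a central server.
    conf_lines = [f"Name = {name}"] + [f"ConnectTo = {p}" for p in peers]
    public_ip, vpn_ip = nodes[name]
    host_lines = [f"Address = {public_ip}", f"Subnet = {vpn_ip}/32"]
    return "\n".join(conf_lines) + "\n", "\n".join(host_lines) + "\n"


if __name__ == "__main__":
    # Placeholder addresses: 203.0.113.x is a documentation range, 10.0.1.x the private network.
    nodes = {
        "cb1": ("203.0.113.11", "10.0.1.1"),
        "cb2": ("203.0.113.12", "10.0.1.2"),
        "cb3": ("203.0.113.13", "10.0.1.3"),
    }
    for name in nodes:
        tinc_conf, host_file = tinc_files(name, nodes)
        print(f"# tinc.conf for {name}\n{tinc_conf}")
        print(f"# hosts/{name}\n{host_file}")
```

Treat this as a starting point to check against the tinc manual rather than a tested deployment.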