Auto-scaling and the Python SDK

whollycow007 · October 30, 2015, 7:31pm

Hi,

We’re currently setting up rules and working through scripts to auto-scale a Couchbase cluster. Once we add servers to or drop servers from the cluster, do we need to restart the Python app that handles all the requests, or would CB itself distribute the load?

Thanks!

mnunberg · October 30, 2015, 7:45pm

Typically you do not need an application restart. If you are only performing non-KV operations and are adding nodes, the SDK may need a restart to recognize those nodes. This is not the case with removing nodes.

There is an open ticket to poll for new nodes periodically anyway, so this limitation is more of a temporary restriction than something by design. See https://issues.couchbase.com/browse/CCBC-627

whollycow007 · October 30, 2015, 8:10pm

@mnunberg

We have auto-scaling set up for our API servers, which run Python applications that talk to our CB cluster. Because of our aggressive auto-scaling, I’m never sure how many API servers are running at any point in time. It gets very complicated if I have to restart the Python app on all those servers whenever the cluster auto-scales. We don’t plan on aggressive auto-scaling for the CB cluster, as rebalancing is heavy.

What do you mean by only performing non-KV operations?

mnunberg · October 30, 2015, 8:28pm

KV operations are insert, upsert, replace, get, etc. – basically anything accepting a key/document ID as a mandatory input parameter. Non-KV operations are things such as views and N1QL.

The client fetches a new configuration (which would contain information about all nodes currently in the cluster) when it encounters specific types of errors (basically, any error that isn’t a result of a data logic error: not-found, exists, etc).

Because of how Couchbase’s data sharding works, all KV nodes must be part of the data sharding process - and if a new node is added, the map (indicating which vbucket belongs to which server) inherently changes. A client still routing by the old map will eventually send a request to a node that no longer owns the vbucket, and that will cause an error to be sent back to the client.
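To make the mechanism above concrete, here is a minimal, self-contained sketch in plain Python. It does not use the Couchbase SDK: `ToyCluster`, `ToyClient`, `NotMyVbucket`, `vbucket_for` and `build_map` are hypothetical names invented for illustration, the CRC32-modulo hashing is a simplified stand-in for the real key hashing, and the vbucket count of 1024 is an assumption. The toy "rebalance" also moves far more vbuckets than a real one would. The only point is the control flow: a stale map leads to a misrouted request, the misrouted request leads to an error, and the error makes the client fetch a new map.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed vbucket count for this toy model


def vbucket_for(key):
    """Map a key to a vbucket id (simplified stand-in for the SDK's hashing)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(nodes):
    """Assign every vbucket to a node: {vbucket id: node name}."""
    return {vb: nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)}


class NotMyVbucket(Exception):
    """A node was asked about a vbucket it does not own."""


class ToyCluster:
    """Hypothetical server side: holds the authoritative vbucket map."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.vbmap = build_map(self.nodes)
        self.docs = {}

    def add_node(self, node):
        # Adding a node changes which node owns which vbucket.
        self.nodes.append(node)
        self.vbmap = build_map(self.nodes)

    def _check_owner(self, node, key):
        if self.vbmap[vbucket_for(key)] != node:
            raise NotMyVbucket(key)

    def upsert(self, node, key, value):
        self._check_owner(node, key)
        self.docs[key] = value

    def get(self, node, key):
        self._check_owner(node, key)
        return self.docs[key]


class ToyClient:
    """Hypothetical client: routes by a cached map, refreshes it on error."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.vbmap = dict(cluster.vbmap)  # cached copy, can go stale
        self.config_fetches = 0

    def _send(self, op, key, *args):
        for _attempt in range(2):
            node = self.vbmap[vbucket_for(key)]
            try:
                return op(node, key, *args)
            except NotMyVbucket:
                # Routing error: fetch the current map, then retry once.
                self.vbmap = dict(self.cluster.vbmap)
                self.config_fetches += 1
        raise RuntimeError("still misrouted after refreshing the map")

    def upsert(self, key, value):
        return self._send(self.cluster.upsert, key, value)

    def get(self, key):
        return self._send(self.cluster.get, key)


cluster = ToyCluster(["node1", "node2"])
client = ToyClient(cluster)
for i in range(50):
    client.upsert("user::%d" % i, i)

cluster.add_node("node3")  # the client is not told about this
values = [client.get("user::%d" % i) for i in range(50)]
```

In this sketch the application code never restarts and never handles the routing error itself. The first misrouted `get` after `add_node` triggers one map refresh inside the client, and every later request goes straight to the right node.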

Non-KV operations, on the other hand, are not bound to specific nodes - so all nodes performing non-KV operations are considered equal, and there is no ‘error’ received by the client if it sends an operation to one node and not the other.
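The contrast with KV traffic can be sketched the same way. This is again a toy model in plain Python, not the SDK: `ToyQueryClient` is a hypothetical name, and the round-robin node selection and the refresh-on-connection-failure behaviour are assumptions made for illustration, inferred from the description above that errors are what trigger a new configuration fetch.

```python
import itertools


class ToyQueryClient:
    """Hypothetical non-KV client: rotates queries over the nodes it knows."""

    def __init__(self, bootstrap_nodes):
        self._set_nodes(bootstrap_nodes)
        self.config_fetches = 0

    def _set_nodes(self, nodes):
        self.nodes = list(nodes)  # private copy, can go stale
        self._rotation = itertools.cycle(self.nodes)

    def query(self, statement, live_nodes):
        """Return the name of the node that would serve this query."""
        node = next(self._rotation)
        if node not in live_nodes:
            # Assumed behaviour: a dead node is an error, so refresh and retry.
            self._set_nodes(live_nodes)
            self.config_fetches += 1
            node = next(self._rotation)
        return node


live = ["node1", "node2"]
qc = ToyQueryClient(live)

# Adding a node: any known node can answer, so nothing ever fails
# and the client has no reason to look for node3.
live.append("node3")
used_after_add = {qc.query("SELECT 1", live) for _ in range(10)}

# Removing a node: the next request routed to node1 fails,
# which forces a refresh that also picks up node3.
live.remove("node1")
used_after_remove = {qc.query("SELECT 1", live) for _ in range(10)}
```

Under these assumptions, `used_after_add` only ever contains the two original nodes, which is the "may need a restart to recognize those nodes" case, while removing a node corrects itself through the error path.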