N1QL architecture questions

wrt question 1: you’ll have to do the load balancing and failover yourself. You can get the list of active query endpoints from ns_server, which maintains that list. You can build distributed prepare/execute support using the encoded_plan returned by the PREPARE statement. This is how the Couchbase SDKs implement it as well. You can look at the Java SDK (2.2.4) implementation, since it’s all open source :smile:
For a simple implementation, randomized or round-robin selection would work. Getting the “current workload” of each node and then distributing query execution accordingly is quite involved for a first implementation. Check out the query monitoring feature in the upcoming (soon!) Couchbase Watson developer preview.
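
To make the round-robin plus failover idea concrete, here is a minimal sketch in Python. It is not how any SDK does it; the class name `RoundRobinQueryBalancer` and the `send` callback are made up for illustration, and the sketch assumes you have already fetched the endpoint list from ns_server and that a dead node shows up as a `ConnectionError`.

```python
import itertools
from typing import Callable, Iterable, List


class RoundRobinQueryBalancer:
    """Rotate over query endpoints, failing over to the next one on error.

    Hypothetical helper: the endpoint list is assumed to come from ns_server,
    and `send` is whatever function actually posts the statement to a node.
    """

    def __init__(self, endpoints: Iterable[str]):
        self._endpoints: List[str] = list(endpoints)
        if not self._endpoints:
            raise ValueError("at least one query endpoint is required")
        # Endless iterator over endpoint positions: 0, 1, ..., n-1, 0, 1, ...
        self._cursor = itertools.cycle(range(len(self._endpoints)))

    def execute(self, statement: str, send: Callable[[str, str], dict]) -> dict:
        last_error = None
        # Try each endpoint at most once per call, starting at the next slot.
        for _ in range(len(self._endpoints)):
            endpoint = self._endpoints[next(self._cursor)]
            try:
                return send(endpoint, statement)
            except ConnectionError as exc:
                # Assumption: an unreachable node raises ConnectionError.
                last_error = exc
        raise ConnectionError("all query endpoints failed") from last_error
```

You would construct it once with the current endpoint list and call `execute(statement, send)` per query. A real implementation would also refresh the list when ns_server reports topology changes, which this sketch leaves out.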

wrt question (2): You just need to load balance among the query services. Index nodes are load balanced automatically if there are duplicate indexes (multiple indexes with the same signature). Ditto for data services. The query service orchestrates the use of index and data.

wrt question (3): Yes. Right now, if you want duplicate indexes, you should create them with the WITH clause.
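
As a rough illustration of creating duplicate indexes with the WITH clause, the Python sketch below generates one CREATE INDEX statement per index node. The function name, the index naming scheme, and the bucket, field, and node values are all made up. The `{"nodes": [...]}` option is the placement syntax as I recall it for the 4.x releases, so please check it against the CREATE INDEX docs for your version.

```python
import json
from typing import List


def duplicate_index_statements(bucket: str, field: str, nodes: List[str]) -> List[str]:
    """Build one CREATE INDEX statement per node, all with the same key.

    Hypothetical helper: each index gets a different name but the same
    definition, and the WITH clause pins it to a single index node.
    """
    statements = []
    for i, node in enumerate(nodes, start=1):
        # Assumed 4.x placement syntax: WITH {"nodes": ["host:port"]}
        with_clause = json.dumps({"nodes": [node]})
        statements.append(
            f"CREATE INDEX `idx_{field}_{i}` ON `{bucket}`(`{field}`) "
            f"USING GSI WITH {with_clause}"
        )
    return statements
```

Each returned string can then be sent to any query service. Because the definitions match, they count as duplicate indexes and get load balanced as described in the answer to question (2).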