Partitioned index doesn't get distributed over all available nodes

LeonidGvirtz · March 4, 2019, 11:34am

Hi

When I create partitioned indexes on a 3-node cluster (version 5.5.2), the indexes sometimes don’t get distributed over all available nodes. Since all 3 nodes run the indexer service, I assume that Couchbase is supposed to distribute each index across all of them. In fact, I see most of my indexes distributed on 2 nodes only. The issue is not tied to a specific node.

To find a workaround, I decided to specify the list of the nodes for an index explicitly, even though I understand that the list just limits the index creation to the nodes in the list. However, it doesn’t help. The index definition in the UI looks somewhat curious:

Definition
CREATE INDEX idx_1 ON bucket1(upper(attr1)) PARTITION BY hash((meta().Id)) WITH { "nodes":[ "node1:8091","node2:8091","node3:8091" ], "num_partition":8 }

Nodes
node2:8091, node3:8091

Note that the index is not created on node1 - and I don’t understand why. Am I missing something?

Thanks in advance
Leonid Gvirtz

Marco_Greco · March 4, 2019, 12:18pm

I haven’t done much digging, but it seems like the indexing client (which is the indexing module that n1ql uses to communicate with the indexing service) adds a default number of partitions of 8 to the create index statement.
I’m guessing that the indexing service is trying to make life easier for itself by spreading partitions evenly around nodes, and clearly 8 isn’t divisible by 3.
I’d try

CREATE INDEX idx_1 ON bucket1(upper(attr1)) PARTITION BY hash((meta().Id)) WITH { "nodes":[ "node1:8091","node2:8091","node3:8091" ], "num_partition":9 }

LeonidGvirtz · March 4, 2019, 1:37pm

Hi Marco

Thank you for your quick response. I tried your suggestion to explicitly increase the number of partitions to 9. Also, I explicitly specified the list of the nodes for the index, which I didn’t do during most of my tests. Unfortunately, I have received quite inconsistent results so far.

For example, creating an index with exactly the same definition, specifying the nodes and "num_partition":9, produced an index distributed on 3 nodes on the first run and on 2 nodes on the second run. Another example: I have an index on lower(attr1), which is spread over two nodes. Every attempt to create a similar index on upper(attr1) resulted in an index placed on a single node only. Even the primary index is distributed on two nodes only.

What is the expected behavior? As I understand the documentation, indexes should by default be distributed over all nodes that run the indexer service. Is that correct? Also, what would you advise me to do next? Should I review some Couchbase logs?

Thanks
Leonid Gvirtz

vsr1 · March 4, 2019, 2:09pm

You have an upper-case I in META().Id; please replace it with META().id (i.e. lower-case i).
Your hash function is based on the document key, not on a field of the document, so UPPER(attr1) or LOWER(attr1) doesn’t matter. cc @deepkaran.salooja. If there is no "nodes" option, the partitions are distributed over all the nodes. The items in each partition still depend on the hash key.
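
Applied to the statement from the original post, the change would look roughly like the sketch below. The index, bucket, attribute and node names are the placeholders from the question, and the only intended difference is the lower-case id in the partition key; the "nodes" list and "num_partition" value are kept as posted and may not be needed.

```sql
-- Sketch only: same definition as in the question, with META().id in lower case.
-- idx_1, bucket1, attr1 and node1..node3 are placeholders taken from the thread.
CREATE INDEX idx_1 ON bucket1(upper(attr1))
PARTITION BY hash(meta().id)
WITH { "nodes": [ "node1:8091", "node2:8091", "node3:8091" ], "num_partition": 8 };
```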

Also try the index port instead of the cluster port 8091 (see "Partition Placement").

https://dzone.com/articles/divide-and-conquer-couchbase-index-partitioning

LeonidGvirtz · March 5, 2019, 12:17pm

Replacing META().Id with META().id solved the problem. Thank you very much!
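
For anyone else running into this, a quick way to see the difference between the two spellings is to project both from the bucket. This is only a sketch, assuming field access on META() is case-sensitive as the fix above suggests; bucket1 is the placeholder bucket name from my original post and the aliases are made up for illustration.

```sql
-- Sketch only: compare the two spellings side by side.
-- If field names are case-sensitive, with_upper_i should come back MISSING
-- (that is, absent from the result object), while with_lower_i holds the document key.
SELECT META().id AS with_lower_i, META().Id AS with_upper_i
FROM bucket1
LIMIT 1;
```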