How to increase performance in FTS?


#1

Hello,

I need some advice on increasing my search performance. Right now I’m developing a backend service with Node.js and Couchbase. My index definition is as follows:

And my search code:

Product.searchAsync = function (term, categoryId, variations, limit, skip, sort) {
    return new Promise(function (resolve, reject) {
        var SearchQuery = Couchbase.SearchQuery;
        var SearchFacet = Couchbase.SearchFacet;

        var tqs = [];
        if (term) {
            tqs.push(SearchQuery.match(term)); // No field given, so this searches the default field (_all here)
        }
        if (categoryId) {
            tqs.push(SearchQuery.term(categoryId).field("categories.$ref"));
        }
        if (variations) {
            variations.forEach(v => tqs.push(SearchQuery.match(v).field("variationSerials")));
        }

        tqs.push(SearchQuery.booleanField(true).field("isActive"));
        tqs.push(SearchQuery.booleanField(true).field("isVisible"));
        var conjunction = SearchQuery.conjuncts(tqs);
        var query = SearchQuery.new("product_search", conjunction);

        query.addFacet("variations", SearchFacet.term("variationSerials", 100));
        query.limit(limit);
        query.skip(skip);
        if (sort)
            query.sort(sort);

        query.fields(["name", "fobCurrency", "fobPriceMin", "fobPriceMax", "fobUnit", "mainPicture", "minOrderQuan", "minOrderUnit"]);

        ottoman.bucket.query(query, (error, result, meta) => {
            if (error) {
                reject(error);
            }
            else {
                var resultset = {
                    hits: result,
                    totalHits: meta.totalHits
                };

                resolve(resultset);
               
            }
        });
    }.bind(this))
}

My test server has an Intel® Xeon® Processor E7-4890 v2 at 2.80 GHz and 12 GB of RAM. On a single node, if I use ‘Moss’ as the index type, my service can only handle 300 requests per second. If I change the index type to ‘Scorch’, that increases to 500-530 requests per second. I believe these are not very good results. What am I doing wrong? How can I increase the performance?


#2

How many documents are there in your index?


#3

The doc count says 104054, but that is my total number of documents. Only 20000 of them are of type ‘Product’, and I actually want to index only Products.


#4

Hi,

Regarding the performance, we don’t yet have query performance / throughput estimates for a given workload on a given hardware configuration.
Quantifying query performance against system resources is being considered as a future action item. As far as I currently understand, there won’t be a straightforward answer there either, because query performance can vary with many factors, such as document size, the mapping used (and hence index size), the mutation rate, and the types of queries used (match, term, phrase, fuzzy, prefix, conjunctions, etc.).

The docCount is shown as 104054 because that many documents were analysed and FTS stores their docIDs in the index. You can confirm this by running a match_all query.
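As a rough sketch of that check, the snippet below builds a match_all request for the FTS REST query endpoint with no hits requested, so the total_hits value in the response can be compared with the docCount. The index name product_search is taken from the code above; the helper name buildMatchAllRequest is hypothetical, and how you send the request (host, FTS port, credentials) depends on your setup.

```javascript
// Hypothetical helper: builds the path and JSON body for a match_all
// query against the FTS REST API (POST /api/index/{indexName}/query).
function buildMatchAllRequest(indexName) {
    return {
        path: "/api/index/" + encodeURIComponent(indexName) + "/query",
        body: {
            query: { match_all: {} }, // matches every document in the index
            size: 0                   // request no hits, only the total count
        }
    };
}

var req = buildMatchAllRequest("product_search");
console.log(req.path);
console.log(JSON.stringify(req.body));
```

If total_hits in the response equals the docCount, every document in the bucket has an entry in the index, not just the Products.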

In the meantime, may I know the memory quota given to FTS?

cheers!


#5

Hi,

Thanks for your reply. The memory quota for FTS is 2500 MB right now. I’m watching the graphical statistics during the test, and I’m also watching resource usage in Task Manager. The strange thing is that more resources are available than are being used: CPU is at 40-50%, there is plenty of free memory, and nothing is wrong with disk or network. Why doesn’t cbft use the free resources available to it? I can see some high values on the average query latency and slow queries graphs. These are the only negative results in my observations.


#6

I ran more tests today. I added 2 more nodes to my cluster. One of them runs Ubuntu and the other one also runs Windows Server 2012. If I remove the facet from my query, performance increases a lot.

query.addFacet("variations", SearchFacet.term("variationSerials", 100));

After removing this line of code, I get 1000-1200 queries per second. With facets, I can only get 300 queries per second. I think the facet query isn’t distributed over the nodes but is answered by only one node. Am I right, or is something different happening?


#7

In a multi-node cluster, all queries are scatter-gathered across the nodes. This applies to facets as well.
Facet queries usually involve an extra lookup of the faceted field’s values for every hit in the search results, which would explain the relatively low query throughput.
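To illustrate where that extra work comes from, here is a toy model of term-facet counting. It is only a sketch of the idea, not how FTS implements it: the function name termFacet, the in-memory hit objects and the lookups counter are all assumptions made for the example.

```javascript
// Toy model (not the FTS implementation): counts term-facet values
// across a list of hits and keeps the `size` most frequent terms.
function termFacet(hits, field, size) {
    var counts = new Map();
    var lookups = 0;
    hits.forEach(function (hit) {
        // One extra field lookup per hit, on top of matching and scoring.
        var values = hit[field] || [];
        lookups += 1;
        values.forEach(function (v) {
            counts.set(v, (counts.get(v) || 0) + 1);
        });
    });
    var terms = Array.from(counts, function (entry) {
        return { term: entry[0], count: entry[1] };
    }).sort(function (a, b) {
        // Highest count first; ties broken alphabetically.
        return b.count - a.count || (a.term < b.term ? -1 : 1);
    }).slice(0, size);
    return { terms: terms, lookups: lookups };
}

var sampleHits = [
    { variationSerials: ["red", "xl"] },
    { variationSerials: ["red"] },
    { variationSerials: ["blue", "xl"] }
];
var facetResult = termFacet(sampleHits, "variationSerials", 2);
console.log(JSON.stringify(facetResult));
```

In this model the number of lookups grows with the number of hits, while the facet size only limits how many terms are returned at the end.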


#8

I understand. However, isn’t a drop from 1000 to 300 a rather dramatic change? Is there a problem with my queries?


#9

One thing I noticed: do you really need “100” as the facet size? It usually tends to be lower than that. You could experiment with lower numbers.
That said, if you really need that many facets, then changing it is not an option.
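One way to run that experiment is to make the facet size a parameter and benchmark a few values. The sketch below builds FTS REST query bodies for several candidate sizes; the helper name buildFacetedQuery, the sample term "phone", the page size of 10 and the list of sizes are assumptions for illustration, while the facet name and field come from the code in the first post.

```javascript
// Hypothetical helper: builds an FTS REST query body whose term facet
// size is a parameter, so different sizes can be benchmarked.
function buildFacetedQuery(term, facetSize) {
    return {
        query: { match: term },
        size: 10, // assumed page size, not from the original code
        facets: {
            variations: { field: "variationSerials", size: facetSize }
        }
    };
}

// Candidate facet sizes to compare under the same load test.
var candidateSizes = [100, 50, 20, 10];
var bodies = candidateSizes.map(function (n) {
    return buildFacetedQuery("phone", n);
});
console.log(JSON.stringify(bodies[2]));
```

Running the same load test against each body should show whether throughput depends on the facet size at all or mostly on the number of hits being faceted.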


#10

I decreased it down to 20. Yes, we have a lot of facets. It’s like alibaba.com.

I’m happy with the performance as long as I don’t use facets. They really have a bad impact on performance. Adding nodes didn’t help. Do you have any other suggestions? Why don’t the nodes use all the free resources?