Some questions on best index performance


#1

Hi!
I am trying out Couchbase for use in our domain (automotive). When the big €€€ enterprise players fail to deliver, I thought, why not try it myself :)

Assume I have a view with the following map function that creates a compound key (an article is mapped onto several cars, and it can be placed into several categories):

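function (doc, meta) {
	// Only index documents of type "test" that carry both arrays.
	if (doc.type !== "test" || !Array.isArray(doc.cars) || !Array.isArray(doc.category)) {
		return;
	}
	// Emit one row per (category, car) pair; the compound key is [categoryId, carId].
	for (var i = 0; i < doc.cars.length; i++) {
		for (var j = 0; j < doc.category.length; j++) {
			emit([doc.category[j], doc.cars[i].id], {
				"sku": doc.sku,
				"name": doc.article_name,
				"description": doc.article_description
			});
		}
	}
}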

  1. I want to query on the first element of the key, the category ID. So for category 2 my startkey would be [2] and my endkey [2,{}]. Does this query perform as well as defining yet another view with a single (non-compound) key, queried with startkey 2 and endkey 2?

  2. Is there any problem with defining a rather complex map function (multiple for loops, as above) instead of modeling the keys as unique from the beginning, i.e. SKU_CARID (that is, every key has a single SKU and a single CAR_ID), rather than creating the index “on the fly” with the un-nesting map function?

  3. Is there any way to create the view index and then remove part of the JSON document (programmatically update/set the keys again) without the index automatically being rebuilt? What I mean is that sometimes it makes sense to create an index, but it does not make sense to keep the array used to build it inside the actual document in memory. In my case the JSON document can contain an array of 10k+ numbers, which with the above map function generates the structure needed for range queries. However, there is no need to have this array floating around in the document and stealing bandwidth when I GET the single key.

Thanks!


#2

[1] Startkey/endkey performance versus a dedicated view depends on the size of the documents and the size of the database. My suspicion is that with CB 3.0 a dedicated view will give slightly better retrieval performance than a startkey/endkey range, while incurring only marginally more CPU at ingestion time.
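
To make the comparison concrete, here is a rough sketch of what the dedicated view and the two sets of query parameters could look like. The function names, the sample document and the local emit() stand-in are assumptions for illustration only; inside Couchbase, emit() is provided by the view engine. The key parameters are JSON values, so they are JSON-encoded and then URL-encoded.

```javascript
// Stand-in for the emit() that the view engine provides; here it only collects rows.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Hypothetical dedicated view: keyed by the category ID alone (no car ID in the key).
function mapByCategory(doc, meta) {
  if (doc.type !== "test" || !Array.isArray(doc.category)) {
    return;
  }
  for (var i = 0; i < doc.category.length; i++) {
    emit(doc.category[i], { "sku": doc.sku, "name": doc.article_name });
  }
}

// Build a view query string: each value is JSON-encoded, then URL-encoded.
function toQueryString(params) {
  return Object.keys(params).map(function (name) {
    return name + "=" + encodeURIComponent(JSON.stringify(params[name]));
  }).join("&");
}

// Compound-key view from the question: range over every car in category 2.
var compoundQuery = toQueryString({ startkey: [2], endkey: [2, {}] });

// Dedicated view: exact match on category 2.
var dedicatedQuery = toQueryString({ key: 2 });

// Run the dedicated map function on a made-up document to see the rows it emits.
mapByCategory(
  { type: "test", sku: "A-1", article_name: "Brake pad", category: [2, 5], cars: [{ id: 10 }] },
  { id: "article::A-1" }
);
```

Note that in this sketch the dedicated view emits one row per article and category rather than one per article, category and car, so the two views do not return the same rows for category 2.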

[2] I’m not sure I understand the question. I believe you are asking about the difference between the overhead of a complex map function and the overhead of doing the modeling at the application tier. It’s a question of where the modeling logic resides and is applied for retrieval: in the application or within the database in a view.
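
As a rough illustration of the application-tier alternative, the sketch below expands one article into one small document per SKU/car pair before it is stored, using the SKU_CARID key shape from the question. The helper name flattenArticle, the "article_car" type and the sample data are made up for this example; how the resulting documents are written to the bucket is left out.

```javascript
// Hypothetical helper: turn one nested article into one flat document per car.
function flattenArticle(doc) {
  var out = [];
  (doc.cars || []).forEach(function (car) {
    out.push({
      // Document key in the SKU_CARID form described in the question.
      id: doc.sku + "_" + car.id,
      body: {
        type: "article_car",
        sku: doc.sku,
        car_id: car.id,
        category: doc.category,
        name: doc.article_name
      }
    });
  });
  return out;
}

// Made-up sample article mapped onto two cars.
var flattened = flattenArticle({
  sku: "A-1",
  article_name: "Brake pad",
  category: [2],
  cars: [{ id: 10 }, { id: 11 }]
});
```

With documents shaped like this, the map function only has to loop over the categories, at the cost of storing and updating more documents from the application.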

[3] There is no means of performing DML with views, currently.