Couchbase Design Document getting corrupted?

Such an active community, and yet I have been waiting for a response to this post. Either no one seems to have faced this problem, or I am making a trivial mistake that no one cares to answer; in either case, being a newbie on Couchbase, I stand disappointed.
Hi,

I am using Version: 4.0.0-4047 Community Edition (build-4047). I have been facing a frequent issue where views stop returning data.
I create multiple views in a design document, and after working with them for a while (fetching data, updating documents, etc.) the views suddenly stop returning any data.
None of the views in that design doc return data, and even if I create a new view in the doc or edit an existing one, nothing happens. The only way out is to create a new design document and recreate all the views.

This is blocking my work and I cannot move on.
Immediate help would be highly appreciated.

I have been using Spring Data to access the views, but that doesn't come into the picture here, because executing the views from the admin console also returns no results.

Wondering if any of the Couchbase engineers can answer this one.

Hi, are you using the development design document or the published design document?

Can you let us know if they are production views?

Hi @ldoguin @ctauro, Thanks for taking this up.
I am using Version: 4.0.0-4047 Community Edition (build-4047)
Here are the scenarios I have tried:

  1. I have a design document “XXX” with 5 views in it, which all work fine; all the views return proper data.

  2. After accessing documents through these views and doing multiple CRUD operations on documents via the Admin Console or the Java client (Spring Data), all the views stop returning data. If at this stage I add a simple view to XXX that should certainly return data, even that does not work. Yet if I create another design doc “YYY” with the same 5 views, they all return data as expected.

  3. After going through the various reference material available online, I thought this was a problem of the views indexing only a subset of the data, but even after selecting Full Cluster Data Set in the Admin Console the affected views do not return any data.

  4. The next step was to publish the affected design doc “XXX” and all its views to production, but even then the views do not return data. It seems that once the design doc is corrupted (I do not have a better word for the situation), none of the views, existing or newly added, return any data.

  5. Though I believe that CRUD on documents has nothing to do with the design doc and its views, I have tried combinations of Stale.False/Stale.True and StringDocument/JsonDocument, just to make sure that CRUD on documents isn't the reason for the design doc to show this behaviour (a rough sketch of the access pattern is below).
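
To make the access pattern concrete, here is a minimal sketch of the kind of code involved, written against the Couchbase Java SDK 2.x. It is illustrative only; the bucket name, design doc name, view name and document id (`data`, `dev_XXX`, `by_supplier`, `po::sample`) are placeholders, not my actual project code.

```
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.view.Stale;
import com.couchbase.client.java.view.ViewQuery;
import com.couchbase.client.java.view.ViewResult;
import com.couchbase.client.java.view.ViewRow;

public class ViewCheck {
    public static void main(String[] args) {
        // Placeholder connection details; not the real cluster/bucket names.
        Cluster cluster = CouchbaseCluster.create("127.0.0.1");
        Bucket bucket = cluster.openBucket("data");

        // The kind of CRUD done between view queries (JsonDocument upsert).
        JsonObject content = JsonObject.create()
                .put("type", "PO_Items")
                .put("supplier_id", "500474")
                .put("created_timestamp", "24/02/2016 09:41:27");
        bucket.upsert(JsonDocument.create("po::sample", content));

        // Query a dev view, forcing the index to be brought up to date first.
        ViewResult result = bucket.query(
                ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.FALSE));
        for (ViewRow row : result) {
            System.out.println(row.key() + " -> " + row.id());
        }

        cluster.disconnect();
    }
}
```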

Although I have not been able to pinpoint the reason for this behaviour, I have noticed it happening very frequently.

For your reference, I have very simple views and one of them is -

```
function (doc, meta) {
  if (doc.type == "PO_Items" && doc.created_timestamp && doc.created_timestamp != "") {
    var createdTimeStamp = doc.created_timestamp.split(" ")[0];
    var createdTimeStampParts = createdTimeStamp.split("/");
    var date = new Date(parseInt(createdTimeStampParts[2], 10),
                        parseInt(createdTimeStampParts[1], 10) - 1,
                        parseInt(createdTimeStampParts[0], 10) + 2);

    var year = date.getFullYear();
    var month = date.getMonth() + 1;
    var day = date.getDate();
    var commitmentDateEntryScore = "N";
    if (doc.po_commitment_date_entry_score && parseInt(doc.po_commitment_date_entry_score) > 0)
      commitmentDateEntryScore = "Y";

    emit([doc.supplier_id, commitmentDateEntryScore, year, month, day], null);
  }
}
```

Thanks

Can you also paste a sample document that satisfies your emit criteria?

Below is one of the 871 documents being returned right now.


```
{
  "type": "PO_Items",
  "po_no": "9000000345",
  "po_date": "24/02/2016",
  "supplier_id": "500474",
  "po_no_of_line_items": "1",
  "po_value": "120000",
  "po_status": "New",
  "po_delivery_status": "Open",
  "channels": "500474",
  "code": "007",
  "pu_id": "M",
  "pu_desc": "MTCP",
  "po_line_item_no": "10",
  "po_line_item_material_code": "Q3-14583-1",
  "po_material_desc": "Cylinder Tube",
  "po_schedule_date": "09/04/2016",
  "po_line_item_qty": "33",
  "po_line_item_pending_qty": "33",
  "po_line_item_value": "120000",
  "po_line_item_status": "New",
  "po_line_item_change_type": "",
  "po_line_item_delivery_status": "Open",
  "buyer_id": "M14",
  "so_number": "50000240",
  "so_line_number": "10",
  "qty_uom": "EA",
  "stages": [
    { "id": 1, "stageid": 1, "stagename": "RM procurement", "stagedays": 10,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": "33",
      "completion_date": "5/03/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 2, "stageid": 2, "stagename": "Profile cutting", "stagedays": 3,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "8/03/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 3, "stageid": 3, "stagename": "Fitup & Straightening", "stagedays": 3,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "11/03/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 4, "stageid": 4, "stagename": "Large weld Inspection", "stagedays": 1,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "12/03/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 5, "stageid": 5, "stagename": "Fabrication", "stagedays": 15,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "27/03/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 6, "stageid": 6, "stagename": "Machining", "stagedays": 9,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "5/04/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" },
    { "id": 28, "stageid": 7, "stagename": "Inspection & Dispatch", "stagedays": 1,
      "created_ts": "24/02/2016 09:41:27", "stage_completion_flag": "N", "pending_qty": 0,
      "completion_date": "6/04/2016", "on_schedule": "", "invoice_no": "",
      "stage_completion": [], "stage_entry_score": "" }
  ],
  "low_lead_time": "y",
  "po_changes": [],
  "grn_rejects": [],
  "po_change_flag": "n",
  "grn_reject_flag": "n",
  "po_commitment_date_entry": "",
  "po_commitment_date_entry_score": "",
  "ontime_delivery_flag": "",
  "ontime_delivery_score": "",
  "commitment_date": "",
  "low_lead_time_score": 0,
  "total_score": 0,
  "created_user_id": "",
  "created_timestamp": "24/02/2016 09:41:27",
  "approval": [],
  "modified_user_id": "",
  "modified_timestamp": "24/02/2016 09:41:27",
  "deleted_ts": "",
  "delivery_date": "",
  "commitment_due_date": "22/02/2016"
}
```

```
{
  "id": "9000000345_Q3-14583-1_500474_10",
  "rev": "1-1435d765ed7500000000000004000000",
  "expiration": 0,
  "flags": 67108864
}
```

Well… I tried this out… I don’t see any issues.

@anujbhatnagar84 have you verified that your cluster is properly sized for the key-value and views workload? Sizing inputs are in our official documentation. (At a high level it suggests: one additional CPU core per design doc, 4 cores for handling the KV workload, and one more core per XDCR replication.)

The problem you're reporting is one I've seen quite often on undersized clusters.

@ctauro yes, everything works fine until the views stop returning data. Can it be because of cluster size, as mentioned by @asingh? I just have one node running and there is no replication set up, but I did give the bucket size and other configuration some thought and increased the bucket size; even that didn't work.

@anujbhatnagar84 how many CPU cores do you have? How many documents overall? What's the average document size? What stale parameter are you providing?

Answers to the above questions would help in understanding the problem. Increasing the bucket size has nothing to do with the way views operate.

@asingh more than 1 core for sure (I will confirm later), Ubuntu on AWS, not more than 1000 docs, with each doc < 5 KB.
One case I have noticed:

  1. A document conforms to the emit criteria and is emitted successfully.

  2. The state of the doc is then changed so it no longer passes the emit criteria, and the doc is written back. You would expect the doc not to be returned, and that works perfectly fine.

  3. Now, at this point, the doc is changed back to conform to the emit criteria, so you would expect the view to emit it again. That doesn't happen even after a long time; creating another view in the same design doc also fails, but if a new design doc is created with a new view, the doc starts being emitted.

And yes, by emitting the doc I mean emitting some composite key/value, not the entire doc. A rough sketch of this sequence is below.
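
For clarity, this is roughly how the sequence above looks from the Java SDK 2.x. It is only a sketch; the bucket, design doc, view names and document id are placeholders, and the emit criteria are simplified to "type is PO_Items and created_timestamp is non-empty".

```
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;
import com.couchbase.client.java.view.Stale;
import com.couchbase.client.java.view.ViewQuery;

public class EmitCriteriaRepro {

    // Count rows in the view, forcing the index to catch up first (Stale.FALSE).
    static int countRows(Bucket bucket) {
        return bucket.query(ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.FALSE))
                     .totalRows();
    }

    public static void main(String[] args) {
        Bucket bucket = CouchbaseCluster.create("127.0.0.1").openBucket("data");
        String id = "po::repro";

        // 1. Document conforms to the emit criteria -> it shows up in the view.
        bucket.upsert(JsonDocument.create(id, JsonObject.create()
                .put("type", "PO_Items")
                .put("created_timestamp", "24/02/2016 09:41:27")));
        System.out.println("after insert: " + countRows(bucket));

        // 2. Change it so it no longer passes the criteria -> it disappears (works fine).
        bucket.upsert(JsonDocument.create(id, JsonObject.create()
                .put("type", "PO_Items")
                .put("created_timestamp", "")));
        System.out.println("after breaking criteria: " + countRows(bucket));

        // 3. Change it back so it conforms again -> expected to reappear,
        //    but in the failing case it never does.
        bucket.upsert(JsonDocument.create(id, JsonObject.create()
                .put("type", "PO_Items")
                .put("created_timestamp", "24/02/2016 09:41:27")));
        System.out.println("after restoring criteria: " + countRows(bucket));
    }
}
```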

If the issue has something to do with machine capacity, wouldn't one expect a proper error/warning response from Couchbase?

PS - I appreciate that you are trying to dig in to help me out.

In an ideal scenario you would expect the installation to return a specific error code for each type of issue, but unfortunately that is hard to do in the real world. From the log messages, though, you can get an idea of whether the hardware is undersized.

On 4.x we have a lot of components (2i, N1QL, XDCR, views, KV; I'm not sure how many you are using). If it's 1 core, I have no idea whether things would work.

Why don't you try things on a 4-core instance?

Thanks @asingh, I will see what can be done. I will update here if I get this problem resolved.

Hey @asingh / @ctauro, I think the reason for this weird behaviour is “Couchbase not being able to index data when a view is updated or new docs are created”.
Whenever I update a view and execute it with Stale=False, I normally see the indexing message on the Admin Console and proper results being returned. At times, however, I don't see that message; in those cases the view fails to return data, and in those particular cases the other views of the design doc stop returning data as well.

Can this possibly be the reason for the issue I am facing?

I have also tried deleting the entire design doc and recreating it using a curl command, but even that fails to work.
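
For what it's worth, the same delete-and-recreate can also be done through the Java SDK's BucketManager instead of curl. The sketch below assumes SDK 2.x; the design doc name, view name and map function are simplified placeholders, not the real ones.

```
import java.util.Arrays;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.bucket.BucketManager;
import com.couchbase.client.java.view.DefaultView;
import com.couchbase.client.java.view.DesignDocument;
import com.couchbase.client.java.view.View;

public class RecreateDesignDoc {
    public static void main(String[] args) {
        Bucket bucket = CouchbaseCluster.create("127.0.0.1").openBucket("data");
        BucketManager manager = bucket.bucketManager();

        // Drop the (possibly corrupted) production design document.
        // BucketManager also has development-mode overloads for the dev_ namespace.
        manager.removeDesignDocument("worxogo10");

        // Recreate it with a simplified map function (placeholder, not the real one).
        String map = "function (doc, meta) {\n"
                   + "  if (doc.type == \"PO_Items\" && doc.created_timestamp) {\n"
                   + "    emit(doc.supplier_id, null);\n"
                   + "  }\n"
                   + "}";
        DesignDocument designDoc = DesignDocument.create("worxogo10",
                Arrays.<View>asList(DefaultView.create("by_supplier", map)));
        manager.insertDesignDocument(designDoc);
    }
}
```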

In those cases, views.log shows:

```
Calling couch_set_view:mark_partitions_unindexable(mapreduce_view, <<"data">>, <<"_design/dev_worxogo10">>, [[]])
couch_set_view:mark_partitions_unindexable(mapreduce_view, <<"data">>, <<"_design/dev_worxogo10">>, [[]]) returned ok in 0ms
Calling couch_set_view:mark_partitions_unindexable(spatial_view, <<"data">>, <<"_design/dev_worxogo10">>, [[]])
couch_set_view:mark_partitions_unindexable(spatial_view, <<"data">>, <<"_design/dev_worxogo10">>, [[]]) returned ok in 0ms
```


@anujbhatnagar84 have you taken a look at the sizing suggestions I made yesterday? Stale=False is, by design, supposed to reflect the updated state of all documents (hence it waits until the index has caught up with the mutations since the last updater run) in order to give you a consistent response. If you lack system firepower, you will see issues.
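
In SDK terms (Java SDK 2.x), the consistency options look roughly like this; the design doc and view names here are placeholders:

```
import com.couchbase.client.java.view.Stale;
import com.couchbase.client.java.view.ViewQuery;

public class StaleOptions {
    // Bring the index fully up to date before returning rows (waits for pending mutations).
    static final ViewQuery CONSISTENT =
            ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.FALSE);

    // Return whatever is currently indexed, then trigger an index update afterwards.
    static final ViewQuery UPDATE_AFTER =
            ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.UPDATE_AFTER);

    // Return whatever is currently indexed, without triggering an update.
    static final ViewQuery ALLOW_STALE =
            ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.TRUE);
}
```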

Yes @asingh, I am looking to upgrade the system configuration from a single core to multiple cores and increase the RAM from 4 GB to maybe 8 or more. But my limited point is that the system will lack firepower at some stage if the data grows rapidly, and I would expect Couchbase to tell me that it is not able to index the data because of reason ABC. If it executes the view silently, without returning any data and without throwing an error, how on earth will my calling program know that something has gone wrong?

If you're using the Couchbase SDKs to interact with the cluster, then I'm pretty sure you will see BackPressure or Timeout related exceptions, which are typically pointers to the system being slow/undersized. See the official documentation about Sizing Best Practices.
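
For instance, on the Java SDK 2.x those failures can be caught around a view query roughly like this (a sketch only; the design doc and view names are placeholders):

```
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import com.couchbase.client.core.BackpressureException;
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.view.Stale;
import com.couchbase.client.java.view.ViewQuery;
import com.couchbase.client.java.view.ViewResult;

public class GuardedViewQuery {

    static ViewResult queryOrFail(Bucket bucket) {
        try {
            // Blocking call with an explicit timeout.
            return bucket.query(
                    ViewQuery.from("dev_XXX", "by_supplier").stale(Stale.FALSE),
                    30, TimeUnit.SECONDS);
        } catch (BackpressureException e) {
            // The client's request buffer is full: the cluster is not keeping up.
            throw new IllegalStateException("Cluster overloaded (backpressure)", e);
        } catch (RuntimeException e) {
            if (e.getCause() instanceof TimeoutException) {
                // stale=false could not be satisfied in time: often a sign of an
                // undersized node while the view indexer catches up.
                throw new IllegalStateException("View query timed out", e);
            }
            throw e;
        }
    }
}
```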

@asingh @ctauro. I upgraded the machine, but it didn't work even then, so I started to wonder whether a runtime exception from one of the views' map functions was causing the entire design doc to fail. And then, quite amusingly, I found that it was indeed related to the view function. Try doing this:
In the beer-sample bucket, create a view

```
function (doc, meta) {
  if (doc.type = "beer" && doc.category) {
    //var x = parseInt(doc.asas);
    emit(null, null);
  }
}
```

I am doing doc.type = "beer" instead of doc.type == "beer"
This is valid JavaScript code and this view works fine, but because of it the other views stop working. My understanding was that when Couchbase indexes this view, the assignment changes the type of every beer-sample document to “beer”, so the views that check for type “beer” should ideally still work, yet even those stop working.

@ctauro, let me know if my finding makes sense and whether you can explain this behaviour.

How do you conclude it’s a runtime exception?

If there are issues, you will want to look at the log files on your machine. Initially you might find them somewhat hard to make sense of, but a careful review will show the potential problem.

I simulated the exact same map function you showed and didn't see the issue on the latest build (I didn't have time to test your specific version).

I don't know why one would ever do an assignment operation within a map function and emit(null, null).

I'm not sure whether you've read how views work in our documentation and understand the way they operate. A view will never, ever modify a document that your application has populated in KV. Views only ever touch the index files they create on disk, located under the @indexes location.

I would highly recommend going through our docs; it would help both you and us to know whether there is a real problem.

We have had a similar issue here. We have a monitoring server that monitors our views across the hosts in the cluster and when it detects this problem, it resets the view manager.

This is supposed to be fixed in the “4.next” version of Couchbase.

We only encounter this problem intermittently and unpredictably. Hence the monitor / reset.