Production - When to use stale=false if ever?

#1

Sorry to beat this horse but the other posts are kinda vague.
I have a Couchbase Server 3.0 cluster.
I’m using the latest PHP SDK.
I use views to search for documents. Nothing complicated, something like this:

emit([doc.subscriptionId,doc.epoch],null)

My question is around the availability of the document in a production environment.
I have read the “Read Your Own Write” docs and countless others that give a good explanation of how DCP and 3.0 optimize indexes, but it’s hard to find any concrete examples. It might be that I’m not looking hard enough. I also know it depends on the individual application.

So hopefully this post will help others as well.

Say I have thousands of users creating documents like this, for example:
{
  "docId": "3F1EE72B-8EF4-4620-99B4-03AA0A63A610",
  "docType": "Subscription",
  "subscriptionId": "1234567890",
  "subscriberId": "0987654321",
  "lastUpdate": "1425416756454"
}
There is only ever one document that contains the same subscriptionId / subscriberId combo.

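I also have thousands of users querying a view built with this map function:
function (doc, meta) {
  // Index only subscription documents.
  if (doc.docType == 'Subscription') {
    // lastUpdate is stored as a string, so convert it to a number (base 10) for sorting.
    emit([doc.subscriptionId, parseInt(doc.lastUpdate, 10)], null);
  }
}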
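
To make the emitted key concrete, here is a small sketch that runs the map function above outside Couchbase. The `emit` stand-in, the `rows` array, and the `meta` object (which assumes the document key equals `docId`) are illustrative assumptions, not part of the server or the SDK.

```javascript
// Stand-in for the view engine's emit(): collects rows so they can be inspected.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// The map function from the post, given a name so it can be called directly.
function map(doc, meta) {
  if (doc.docType == 'Subscription') {
    emit([doc.subscriptionId, parseInt(doc.lastUpdate, 10)], null);
  }
}

// The sample document from the post.
const sample = {
  docId: "3F1EE72B-8EF4-4620-99B4-03AA0A63A610",
  docType: "Subscription",
  subscriptionId: "1234567890",
  subscriberId: "0987654321",
  lastUpdate: "1425416756454"
};

// Assumption: the document key is the same as docId.
map(sample, { id: sample.docId });
// A document of another type emits nothing.
map({ docType: "Other" }, { id: "some-other-key" });

console.log(JSON.stringify(rows));
```

The sample document produces a single row whose key is the array of the subscriptionId string and the numeric lastUpdate, so a range query on the first element can return all rows for one subscription ordered by time.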

Do I need to manually reindex the data with stale=false?
If so, where is the best place to do so? It doesn’t make sense to do it in the view query itself, for latency reasons. Should I do it after every insert/upsert/deletion?

I’m happy to provide any other details you may need.

Thanks so much.

Keith

#2

Okay let me first clarify some wordings so we are on the same page.

stale=false means that at the time of the query, the view indexer first refreshes the index with the latest data available to it and only then returns results. Depending on how often you run the query and how fast you insert data, there might be quite a lot to catch up on. With both stale=ok and stale=update_after, the query returns what is already in the index, and the indexer runs in the background to update it, so the index is eventually consistent.
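
The difference between the three settings can be sketched with a toy model. This is only an illustration of the behaviour described above, not how Couchbase implements it: `ToyView`, `upsert`, `refresh`, and `query` are made-up names, the refresh is synchronous, and the periodic background updater is not modelled at all.

```javascript
// Toy model only: NOT Couchbase's implementation.
class ToyView {
  constructor() {
    this.docs = new Map();  // data side: writes are visible here immediately
    this.index = new Map(); // index side: lags behind until refreshed
  }

  upsert(id, doc) {
    this.docs.set(id, doc);
  }

  // Bring the index up to date with the data.
  refresh() {
    this.index = new Map(this.docs);
  }

  query(stale) {
    if (stale === "false") {
      this.refresh(); // catch up first, then answer
      return Array.from(this.index.keys());
    }
    // "ok" and "update_after" both answer from the existing index.
    const result = Array.from(this.index.keys());
    if (stale === "update_after") {
      this.refresh(); // triggered by the query, but only after the answer is taken
    }
    return result; // any other value is treated like "ok": no refresh triggered
  }
}

const view = new ToyView();
view.upsert("a", { docType: "Subscription" });

const okResult = view.query("ok");              // index is still empty
const afterResult = view.query("update_after"); // still empty, refresh follows
const nextOk = view.query("ok");                // now sees "a"

view.upsert("b", { docType: "Subscription" });
const freshResult = view.query("false");        // sees "a" and "b"

console.log(okResult, afterResult, nextOk, freshResult);
```

In this model only stale=false guarantees that a write made just before the query shows up in the result, which is the price paid in query latency.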

What has changed between 2.x and 3.0 is that before 3.0 the indexer had to pick up data from disk, whereas with 3.0 it can take it straight out of memory. This removes the need to use PersistTo.MASTER on the client side (to make sure the data is persisted to disk), and of course, since you don’t have a disk round trip, the index gets up to date much more quickly.

In neither case can you, or do you need to, manually trigger an “index” process to refresh it. The question is what tradeoff you want between query performance and index freshness. Does that make sense?

#3

daschl,

Thank you for the concise reply. And yes, it does make sense.
For my system, which is a social type application, there will probably be more reads than writes.
I believe waiting for the normal index update (5000 ms) will suffice.
The problem arises when a user immediately [or within a second or two] queries their own insertion [or update, delete, etc.].
It’s not always available without stale=false.
As Mr Cihan Biyikoglu describes in the Read Your Own Write blog post:
http://blog.couchbase.com/read-your-own-write-faster-in-3.0-incremental-map-reduce-views-with-couchbase-server

“Messaging apps…” would benefit from stale=false. However, doing this on every view query seems to be overkill.
Would it not make sense to run a fake query after certain CRUD operations? Maybe not on all operations, but on those that require RYOW? By a fake query I mean one run with stale=false just to update the view index.

I appreciate the help.

Keith

#4

Yes, for certain queries you can do that and “force” such a reindex in the background. I think the tradeoff is operational complexity, but it may be a small one :smile:
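
A minimal sketch of that “fake query” idea, under stated assumptions. The helpers `buildViewUrl` and `writeThenRefresh` are hypothetical, as are the host, the bucket `default`, the design document `subscriptions`, and the view `by_subscription`. The `upsert` and `runQuery` arguments are callbacks you would wire to your own client code; nothing here is an SDK call. The URL follows the view REST endpoint layout on port 8092, and `limit=1` is only there to keep the throwaway response small.

```javascript
// Hypothetical helper: builds a view query URL for the view REST endpoint.
function buildViewUrl(base, bucket, designDoc, view, params) {
  const qs = Object.keys(params)
    .map((name) => encodeURIComponent(name) + "=" + encodeURIComponent(params[name]))
    .join("&");
  return base + "/" + bucket + "/_design/" + designDoc + "/_view/" + view + "?" + qs;
}

// Hypothetical helper: write a document, then, only for operations that need
// read-your-own-write, issue a throwaway stale=false query to refresh the index.
async function writeThenRefresh(upsert, runQuery, doc, needsRyow) {
  await upsert(doc);
  if (!needsRyow) {
    return null; // ordinary write: leave the index to the background updater
  }
  const url = buildViewUrl(
    "http://localhost:8092", // assumed host
    "default",               // assumed bucket name
    "subscriptions",         // assumed design document name
    "by_subscription",       // assumed view name
    { stale: "false", limit: 1 }
  );
  await runQuery(url); // the result is discarded; only the index refresh matters
  return url;
}

// Usage with stub callbacks that just record the order of calls.
const calls = [];
const done = writeThenRefresh(
  async (doc) => { calls.push("upsert"); },
  async (url) => { calls.push("query"); },
  { docType: "Subscription", subscriptionId: "1234567890" },
  true
);
done.then((url) => console.log(calls.join(" -> "), url));
```

Awaiting the refresh query makes the write path slower but means later stale=ok reads are more likely to see the new document. Firing it without waiting keeps writes fast at the cost of a short window where the document may still be missing from the view.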