Large JSON documents missing from view

zepinos · May 12, 2015, 3:27am

When I save a large JSON document, a “body_too_large, is content_length” error occurs.

So I wrote a program to store it, and the document was saved to the server.

However, when I emit the stored documents from a view, only the large documents that I stored this way are missing from the results.

What is the solution?
Thanks.

xiger · May 12, 2015, 2:37pm

Are you creating the documents in the admin console?

zepinos · May 13, 2015, 2:30am

No. I’ve created a program that uses the Java SDK.

daschl · May 13, 2015, 5:55am

@zepinos the UI has a limit on the size of documents you can store and display, which is much smaller than the document size the server itself supports (20 MB for Couchbase buckets, 1 MB for memcached buckets).

Or are you saying that it failed when creating a View? If so, can you share your map/reduce code and the doc size so we can take a look? Maybe a sample doc would be good too.

zepinos · May 13, 2015, 7:46am

The view code is very simple:

function (doc, meta) { emit(meta.id, doc); }

When I emit “null” instead of “doc”, all of the keys are displayed properly.
However, when I emit “doc”, the key of the document of about 1.6 MB disappears.

Bucket Type : Couchbase
RAM/Quota Usage : 34.4MB / 100MB

Thanks.

daschl · May 19, 2015, 8:20am

@zepinos I’ve sent a quick mail to our view engineering team, maybe they have an idea what's going on.

asingh · May 19, 2015, 8:36am

@zepinos by default the view engine skips emitting values greater than 1 MB in size. In your case, since you are emitting the whole document with emit(meta.id, doc), i.e. ~1.6 MB of data, the view engine has skipped it and logged a message on the server side.

Public JIRA about it - MB-9467

If you want to bump up the limits, you can tweak them through the REST endpoints we expose:

indexer_max_doc_size - documents larger than this value are skipped by the indexer. A message is logged (with document ID, its size, bucket name, view name, etc.) when such a document is encountered. A value of 0 means no limit (as it used to be before). The current default value is 1048576 bytes (1 MB).

max_kv_size_per_doc - maximum total size (in bytes) of KV pairs that can be emitted for a single document for a single view. When this limit is exceeded, a message is logged (with document ID, its size, bucket name, view name, etc.). A value of 0 means no limit (as it used to be before). The current default value is 1048576 bytes (1 MB), which is already too large a value and makes everything far from efficient.

Sample commands that you could use:

$ curl -X POST http://Administrator:password@localhost:8091/diag/eval -d \
'rpc:eval_everywhere(erlang, apply, [fun() -> couch_config:set("set_views", "indexer_max_doc_size", "2048576") end, []]).'

$ curl -X POST http://Administrator:password@localhost:8091/diag/eval -d \
'rpc:eval_everywhere(erlang, apply, [fun() -> couch_config:set("mapreduce", "max_kv_size_per_doc", "524288") end, []]).'
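
To get a rough idea of whether a given emit would run into the 1048576-byte default described above, you could estimate the size of the emitted key/value pair locally. This is only a sketch: the function names emittedSize and exceedsLimit are made up for illustration, and measuring the UTF-8 length of the JSON-encoded key and value is an assumption, so the view engine's own accounting may differ somewhat.

```javascript
// Default limit quoted in the post above (1 MB). A limit of 0 means "no limit".
const DEFAULT_LIMIT_BYTES = 1048576;

// Hypothetical helper: approximate size of one emitted KV pair as the
// UTF-8 byte length of the JSON-encoded key plus the JSON-encoded value.
// Assumption: the server may count bytes slightly differently.
function emittedSize(key, value) {
  const enc = new TextEncoder();
  return enc.encode(JSON.stringify(key)).length +
         enc.encode(JSON.stringify(value)).length;
}

// Hypothetical helper: true if the pair would exceed the given limit.
function exceedsLimit(key, value, limit = DEFAULT_LIMIT_BYTES) {
  return limit !== 0 && emittedSize(key, value) > limit;
}

// A ~1.6 MB document, similar in size to the one in this thread.
const bigDoc = { payload: "x".repeat(1600000) };

console.log(exceedsLimit("doc::1", bigDoc)); // whole document as the value
console.log(exceedsLimit("doc::1", null));   // null as the value
```

Running this against a few of your real documents should show which ones are likely to be skipped before you decide whether to touch the limits at all.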

asingh · May 19, 2015, 8:45am

@zepinos If you have quite a few such documents, CPU requirements will increase in proportion because of the indexing overhead.

Also, it isn’t good practice to emit the whole document in your view definition. As things stand, you have at least 2 copies of the same data: one within our caching/persistence layer and another inside the view index B+Tree.
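
As a rough sketch of the alternative, the map function can emit null as the value so the index only holds the keys, and the application then fetches the full documents by ID through the SDK. The emit stub and the sample document below are not part of Couchbase; they are assumptions added only so the function can be tried locally.

```javascript
// Local stand-in for the emit() that the view engine provides on the server.
// Assumption: added here only to make the sketch runnable outside Couchbase.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// Map function that indexes only the document ID and stores no value,
// so the document body is not copied into the view index.
function map(doc, meta) {
  emit(meta.id, null);
}

// Hypothetical ~1.6 MB document, like the one discussed in this thread.
map({ payload: "x".repeat(1600000) }, { id: "doc::1" });

console.log(emitted.length, emitted[0].key, emitted[0].value);
```

On the server, only the function (doc, meta) { emit(meta.id, null); } part goes into the view definition. If you need a few fields in the view result, emitting just those fields instead of the whole doc keeps the index small as well.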