Large JSON documents missing from view results


#1

When I save a large JSON document, a “body_too_large, is content_length” error occurs.

So I wrote a program to store it, and the document was saved to the server.

However, when a view emits the stored documents, only the large documents stored this way are missing from the view results.

What is the solution?
Thanks.


#2

Are you creating the documents in the admin console?


#3

No. I’ve created a program that uses the Java SDK.


#4

@zepinos the UI has a limit on the size of the document you can store and show, which is much smaller than the document size the server actually supports (20 MB for Couchbase buckets, 1 MB for memcached buckets).

Or are you saying that creating the View failed? If so, can you share your map/reduce code and the doc size so we can take a look? A sample doc would be good too.


#5

The view code is very simple:

function (doc, meta) {
  emit(meta.id, doc);
}

When I emit “null” instead of “doc”, all of the keys are displayed properly.
However, when I emit “doc”, the key of the document that is about 1.6 MB disappears.

Bucket Type : Couchbase
RAM/Quota Usage : 34.4MB / 100MB

Thanks.


#6

@zepinos I’ve sent a quick mail to our view engineering team; maybe they have an idea what’s going on.


#7

@zepinos by default the view engine skips emitting values greater than 1 MB in size. In your case, since you are emitting the whole document with emit(meta.id, doc), i.e. ~1.6 MB of data, the view engine skipped it and logged a message on the server side.

Public JIRA about it - MB-9467

If you want to bump up the limits, you could use the REST endpoints we expose to tweak them:

  • indexer_max_doc_size - documents larger than this value are skipped by the
    indexer. A message is logged (with document ID, its size, bucket name, view name, etc.)
    when such a document is encountered. A value of 0 means no limit (like what it used to
    be before). Current default value is 1048576 bytes (1 MB).

  • max_kv_size_per_doc - maximum total size (bytes) of KV pairs that can be emitted for
    a single document for a single view. When this limit is exceeded, a message is logged (with
    document ID, its size, bucket name, view name, etc.). A value of 0 means no limit (like what
    it used to be before). Current default value is 1048576 bytes (1 MB), which is already
    too large a value and makes everything far from efficient.
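
To get a rough feel for how these two limits interact, here is a minimal sketch (runnable under Node.js) that estimates sizes from the UTF-8 JSON serialization. The helper names jsonByteSize and checkLimits are hypothetical, and the byte accounting is an assumption: the view engine may measure sizes differently, so treat the result as an estimate only.

```javascript
// Default limits quoted in the list above, in bytes. 0 means "no limit".
const DEFAULT_LIMITS = {
  indexerMaxDocSize: 1048576,
  maxKvSizePerDoc: 1048576,
};

// Approximate size of a value once serialized as UTF-8 JSON.
// Assumption: the real engine's size accounting may differ from this.
function jsonByteSize(value) {
  return Buffer.byteLength(JSON.stringify(value), "utf8");
}

// Hypothetical helper: estimate whether a document would hit either limit.
// `emitted` is the list of [key, value] pairs the map function would emit.
function checkLimits(doc, emitted, limits = DEFAULT_LIMITS) {
  const docSize = jsonByteSize(doc);
  const kvSize = emitted.reduce(
    (sum, [key, value]) => sum + jsonByteSize(key) + jsonByteSize(value),
    0
  );
  const over = (size, limit) => limit !== 0 && size > limit;
  return {
    docSize,
    kvSize,
    docTooLarge: over(docSize, limits.indexerMaxDocSize),
    kvTooLarge: over(kvSize, limits.maxKvSizePerDoc),
  };
}

// A document of roughly the size discussed in this thread.
const bigDoc = { payload: "x".repeat(1600000) };

// Emitting the whole document: kvTooLarge is true.
console.log(checkLimits(bigDoc, [["doc::1", bigDoc]]));
// Emitting null as the value: kvTooLarge is false.
console.log(checkLimits(bigDoc, [["doc::1", null]]));
```

The limits are passed as a parameter so the same check can be repeated with whatever values are actually configured on the server.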

Sample commands that you could use:

$ curl -X POST http://Administrator:password@localhost:8091/diag/eval -d \
'rpc:eval_everywhere(erlang, apply, [fun() -> couch_config:set("set_views", "indexer_max_doc_size", "2048576") end, []]).'
$ curl -X POST http://Administrator:password@localhost:8091/diag/eval -d \
'rpc:eval_everywhere(erlang, apply, [fun() -> couch_config:set("mapreduce", "max_kv_size_per_doc", "524288") end, []]).'
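
If you need a different byte value, a small Node.js sketch like the one below can build the request body used in the commands above. The function name diagEvalBody is hypothetical; it only formats the same Erlang expression shown in the samples and does not talk to the server.

```javascript
// Number of bytes in one mebibyte.
const MIB = 1024 * 1024;

// Hypothetical helper: build the /diag/eval body from the sample commands
// above for a given config section, key, and size in bytes.
function diagEvalBody(section, key, bytes) {
  if (!Number.isInteger(bytes) || bytes < 0) {
    throw new RangeError("bytes must be a non-negative integer");
  }
  return (
    "rpc:eval_everywhere(erlang, apply, [fun() -> " +
    `couch_config:set("${section}", "${key}", "${bytes}") end, []]).`
  );
}

// Print the body for a limit of exactly 2 MiB (2097152 bytes).
console.log(diagEvalBody("set_views", "indexer_max_doc_size", 2 * MIB));
```

Note that the sample value 2048576 is slightly less than 2 * 1024 * 1024 = 2097152, but still above a ~1.6 MB document. Also, the second sample sets max_kv_size_per_doc to 524288, which is below the default; to emit a ~1.6 MB value, that limit would presumably need to be larger than the emitted data.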

#8

@zepinos If you have quite a few such documents, CPU resource requirements will increase in proportion because of the indexing overhead.

Also, it isn’t good practice to emit the whole document in your view definition. In the current state, you have at least 2 copies of the same data, i.e. one within our caching/persistence layer and another inside the view index B+Tree.
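
As a rough sketch of the alternative, the map function can emit null, or only the few fields the query really needs, and the full document can then be fetched by its ID through the SDK. The emit() stand-in below exists only so the sketch runs under Node.js (the view engine provides the real one), and the field name title is a hypothetical example.

```javascript
// Stand-in for the emit() that the view engine provides; here it only
// records rows so the map functions can be tried outside the server.
const viewRows = [];
function emit(key, value) {
  viewRows.push({ key, value });
}

// Map function that indexes the document ID only, with no value.
function mapIdOnly(doc, meta) {
  emit(meta.id, null);
}

// Variant that emits a small projection instead of the whole document.
// `title` is a hypothetical field used for illustration.
function mapWithTitle(doc, meta) {
  emit(meta.id, { title: doc.title });
}

// A large sample document, similar in size to the one in this thread.
const sampleDoc = { title: "report", payload: "x".repeat(1600000) };
mapIdOnly(sampleDoc, { id: "doc::1" });
mapWithTitle(sampleDoc, { id: "doc::1" });

// Both rows stay tiny no matter how large the document body is.
console.log(viewRows);
```

Only the bodies of mapIdOnly or mapWithTitle would go into the view definition; the rest is scaffolding for trying them locally.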