Couchbase - Elasticsearch transport plugin. Mapping to types using _class field


#1

Hello,

I am using Couchbase with the Elasticsearch transport plugin. My question is
about mapping Couchbase documents to Elasticsearch types. This can be done
by adding the following to elasticsearch.yml:

couchbase.typeSelector: org.elasticsearch.transport.couchbase.capi.RegexTypeSelector
couchbase.typeSelector.documentTypesRegex.type: ^type:.+$

A document in Couchbase with the ID “type:123” is then indexed under the type “type” in Elasticsearch.
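To make the ID-based selection concrete, here is a minimal Python sketch of the idea. It is not the plugin's actual code: `select_type`, `DOCUMENT_TYPES_REGEX`, and the fallback name in `DEFAULT_TYPE` are assumptions used only for illustration.

```python
import re

# Hypothetical mapping that mirrors the documentTypesRegex.* settings:
# type name -> regex applied to the Couchbase document ID.
DOCUMENT_TYPES_REGEX = {
    "type": re.compile(r"^type:.+$"),
}

# Assumed fallback type name for IDs that match no pattern.
DEFAULT_TYPE = "couchbaseDocument"


def select_type(doc_id: str) -> str:
    """Return the first type whose regex matches the document ID."""
    for type_name, pattern in DOCUMENT_TYPES_REGEX.items():
        if pattern.match(doc_id):
            return type_name
    return DEFAULT_TYPE


print(select_type("type:123"))  # matches ^type:.+$
print(select_type("order:42"))  # no pattern matches, falls back
```

Only the document ID is inspected here; the document body never has to be parsed to pick a type.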

I am not happy with this solution, because it dictates the format of
the ID field in Couchbase, which is not "elegant" and may be
troublesome.

Is it possible to map documents using, for example, the “_class” field?
This field appears after inserting a document into Couchbase using the Java
API. I think it would be a much better solution.

Thank you for your help.


#2

Hi @snosek,
I’m not sure I understand what you’re asking for. Let me see if I’ve got it.

I think you or a framework that you are using like Spring is inserting a “_class” field when you create a document in Couchbase. (You say this field appears “after” inserting the document into Couchbase, but I think it is part of the information you are writing inside the document. Otherwise, please clarify.)

You then want to use a regular expression to parse that “_class” field in Elasticsearch to select a mapping instead of using a regex on the document id.

If that is what you are looking for, it would be a new feature request. Currently type selection only inspects the keys (document ids). Modifying type selection so that it inspects the document bodies would impact performance and resource consumption.

Best,
-Will


#3

Thank you for the answer.

I am using Spring Data Couchbase to insert data into Couchbase. The _class field appears after inserting a document; it is not part of my document structure. But this field is only an example. In general, the best solution would be the ability to map a document field value to an Elasticsearch type. A sample document in my Elasticsearch now:
{
"id": "clv:1234",
"_class": "path.to.package.clv",
"exportTime": 1450903202321,
"clvValue": 2.7,
"customer": "customer"
}
From my point of view, it would be much better to derive the type from the value of “_class” rather than from the document ID.

Best,
Simon


#4

Hi Simon,

Just wanted to follow up with a link to David Ostrovsky’s answer on Stack Overflow for future forum users. The TL;DR version of his answer is that the Couchbase document key is used for type selection because it is immutable. Using a mutable document field would enable users to accidentally index a document multiple times as different types on the Elasticsearch side when the value of a document field changes.
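The duplicate-indexing problem can be illustrated with a small Python simulation. This is only a sketch under assumptions: the `index` dict is a toy stand-in for an Elasticsearch index in versions that address documents by type and ID, and `type_from_field` and `replicate` are hypothetical helpers, not part of the plugin.

```python
# Toy stand-in for an Elasticsearch index: documents are addressed by (type, id).
index = {}


def type_from_field(doc: dict) -> str:
    """Hypothetical selector: use the last segment of the _class value as the type."""
    return doc["_class"].rsplit(".", 1)[-1]


def replicate(doc_id: str, doc: dict) -> None:
    """Store the document under the type derived from its (mutable) _class field."""
    index[(type_from_field(doc), doc_id)] = dict(doc)


# First version of the document is indexed as type "clv".
replicate("clv:1234", {"_class": "path.to.package.clv", "clvValue": 2.7})

# The field value changes; the same Couchbase key now lands under type "customer",
# and the earlier copy under "clv" is never overwritten or removed.
replicate("clv:1234", {"_class": "path.to.package.customer", "clvValue": 2.7})

for key in sorted(index):
    print(key)
```

With a selector based on the key instead, both writes would resolve to the same type, so the second write would simply replace the first.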
-Will