Architecture of Plug-Ins

md1 · January 9, 2015, 5:29pm

Is there any information on plug-in architecture, such as that of the ElasticSearch plug-in and, presumably, the Hadoop plug-in?

I want to emulate XDCR behavior, but I do not want deletions propagated, so documents deleted here are not removed from the other cluster.

This way, my production buckets can slowly drain away while a backup copy remains on a deep storage server that is fed by an XDCR-like mechanism.

Or maybe just run Hadoop or ElasticSearch as my deep storage destination.

Ideas?

davido · January 9, 2015, 6:38pm

Hi there!

There’s no official “plugin guide” or anything like that, but there are essentially two types of plugins. The first implements the XDCR v1 protocol (CAPI), like the ElasticSearch plugin; the second uses a TAP or DCP stream to read data from a bucket. The former listens for incoming XDCR replications (push); the latter opens a connection to a bucket and reads data from it (pull, but not quite).

Both the CAPI and TAP/DCP parts are available as stand-alone modules, so you can write your own connector that uses either one. You can find an example of a plugin that uses TAP streams here: https://github.com/paypal/couchbasekafka . You can find the CAPI server code here: https://github.com/couchbaselabs/couchbase-capi-server

There’s also an open feature request for the ElasticSearch plugin to be able to ignore deletes/expirations, so it can serve as an archive service of sorts. If that’s what you need, you might want to bump that on Jira: https://issues.couchbase.com/browse/CBES-32
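To make the "ignore deletes/expirations" idea concrete, here is a minimal sketch of the filtering step only. It is not the ElasticSearch plugin's code and it uses no Couchbase API: `ChangeEvent`, `EventType`, and `ArchiveSink` are hypothetical names, and the events are plain in-memory objects standing in for whatever a CAPI or TAP/DCP connector would actually deliver.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional


class EventType(Enum):
    MUTATION = "mutation"
    DELETION = "deletion"
    EXPIRATION = "expiration"


@dataclass(frozen=True)
class ChangeEvent:
    # Hypothetical event shape; a real replication payload carries more metadata.
    type: EventType
    key: str
    value: Optional[bytes] = None


class ArchiveSink:
    """Stores the latest value written for each key and never removes a key."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}
        self.skipped = 0

    def handle(self, event: ChangeEvent) -> None:
        if event.type is EventType.MUTATION:
            # Creates and updates are archived; an absent value is stored as empty.
            self.store[event.key] = event.value or b""
        else:
            # Deletions and expirations are counted but deliberately not applied.
            self.skipped += 1


def replicate(events: Iterable[ChangeEvent], sink: ArchiveSink) -> None:
    """Feed every event from the source into the sink, in order."""
    for event in events:
        sink.handle(event)
```

With this shape, a document that is written and later deleted on the source stays in `sink.store`, which is the archive behavior described above.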

ingenthr · January 9, 2015, 7:05pm

In addition to what @davido said, note that long term we’re moving everything to DCP, and in the not-too-distant future we’ll have this in the Couchbase JVM Core for the JVM-based components. That work is going on right now, and you can see it on GitHub or in the issue tracker.

md1 · January 9, 2015, 10:30pm

I looked at the ElasticSearch issue on Jira for a bump button and found none.

I guess the best answer to both responses is to ask how to use the DCP protocol along with some sort of triggering mechanism to grab each document and send it onward to the deep archive (or ElasticSearch or Hadoop), while at the same time being able to search the bucket for any documents yet to be sent and send them.
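One way to picture the "send anything not yet sent" part is a forwarder that remembers how far it has gotten. The sketch below assumes each change carries a monotonically increasing sequence number (DCP assigns these per vBucket); `Change`, `ArchiveForwarder`, and the `send` callback are hypothetical names, and no real DCP client is involved.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Change:
    # Hypothetical record: seqno orders changes, value=None marks a deletion.
    seqno: int
    key: str
    value: Optional[bytes]


class ArchiveForwarder:
    """Forwards changes newer than a checkpoint and skips deletions."""

    def __init__(self, send: Callable[[str, bytes], None], last_sent_seqno: int = 0) -> None:
        self.send = send
        # Persisting this number between runs is what lets a restart catch up.
        self.last_sent_seqno = last_sent_seqno

    def forward(self, changes: Iterable[Change]) -> int:
        """Send every unsent mutation in seqno order; return how many were sent."""
        sent = 0
        for change in sorted(changes, key=lambda c: c.seqno):
            if change.seqno <= self.last_sent_seqno:
                continue  # already handled in an earlier pass
            if change.value is not None:
                self.send(change.key, change.value)
                sent += 1
            # Advance the checkpoint even for skipped deletions.
            self.last_sent_seqno = change.seqno
        return sent
```

Calling `forward` again with the same change log plus new entries sends only the new ones, so the same routine covers both the live trigger and the catch-up scan.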

I could even see XDCR having an option to not pass along document deletes, alongside the “Pause” option currently available.