Hadoop and Couchbase

pappu_pandit · September 11, 2014, 4:06pm

Hi Guys,

I am pretty new to NoSQL and Big Data. I was wondering whether we really need Hadoop, given that Couchbase supports map/reduce functions, or vice versa.

With Couchbase clusters able to distribute data across nodes, what benefit does Hadoop bring to Couchbase users?

Thanks

cihangirb · September 11, 2014, 4:29pm

Couchbase does provide map/reduce for creating views, and we process the views incrementally as changes come in, so access to the map/reduce views is low latency. Many folks use Hadoop with different objectives. The top few are complex statistical models that don't lend themselves easily to low-latency analytics, data archival, etc.
Couchbase and the Hadoop flavors (Cloudera, Hortonworks, MapR, etc.) work well hand in hand: you can easily exchange data between the two environments and get the best of both worlds if need be.
thanks
-cihan
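
To illustrate the "incrementally process the view as changes come in" idea from the reply above, here is a minimal sketch of an incrementally maintained map/reduce view. It is a toy in-memory model, not Couchbase's API or storage engine: the class `IncrementalView`, the function `orders_by_city`, and the document fields `type`, `city`, and `amount` are all hypothetical. It also assumes a sum-style reduce, which can be updated by subtracting a document's old rows and adding its new ones.

```python
from collections import defaultdict


class IncrementalView:
    """Toy model of a map/reduce view that is updated per document change.

    Hypothetical illustration only; not how Couchbase implements views.
    """

    def __init__(self, map_fn):
        self.map_fn = map_fn             # doc -> iterable of (key, value) pairs
        self.emitted = {}                # doc_id -> rows emitted by the current version
        self.totals = defaultdict(int)   # key -> summed value (the "reduce")

    def delete(self, doc_id):
        """Retract whatever the document previously contributed."""
        for key, value in self.emitted.pop(doc_id, []):
            self.totals[key] -= value

    def upsert(self, doc_id, doc):
        """Apply one change without rescanning any other document."""
        self.delete(doc_id)
        rows = list(self.map_fn(doc))
        for key, value in rows:
            self.totals[key] += value
        self.emitted[doc_id] = rows

    def query(self, key):
        """Read the already-reduced value for a key."""
        return self.totals.get(key, 0)


def orders_by_city(doc):
    # Map function: emit (city, amount) for order documents only.
    if doc.get("type") == "order":
        yield doc["city"], doc["amount"]


view = IncrementalView(orders_by_city)
view.upsert("o1", {"type": "order", "city": "Pune", "amount": 40})
view.upsert("o2", {"type": "order", "city": "Pune", "amount": 60})
view.upsert("o1", {"type": "order", "city": "Pune", "amount": 10})  # update o1
print(view.query("Pune"))  # 70
```

Each change touches only the rows of the document that changed, which is why reads stay cheap. A batch system such as Hadoop instead rescans the whole input per job, which suits heavy, infrequent analysis better than low-latency lookups.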

pappu_pandit · September 11, 2014, 5:17pm

Thanks.
One more question: does that mean we need to maintain two different stores, one for each system?

Thanks

cihangirb · September 11, 2014, 5:29pm

Yes. Couchbase stores its files in a specific format that we require for our own access, and concurrent access to these files from Hadoop is not possible. However, we are working on facilities that would let you use Hadoop query languages such as Hive or Pig, or even ODBC or JDBC, to access Couchbase directly. That could mean that Hadoop does not store the data but processes queries directly by communicating with Couchbase.
Would that be interesting to you?
-cihan
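
For the two-store setup described above, where a copy of the data is processed on the Hadoop side, a minimal sketch of a Hadoop Streaming style mapper follows. It assumes the documents have already been exported from Couchbase as one JSON object per line; the export step itself is not shown, and the field names `type`, `city`, and `amount` are hypothetical. Hadoop Streaming feeds input lines to the mapper on standard input and expects tab-separated key/value lines on standard output.

```python
import json


def map_lines(lines):
    """Yield 'key<TAB>value' records, one per exported order document."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        doc = json.loads(line)
        if doc.get("type") == "order":  # hypothetical document shape
            yield f"{doc['city']}\t{doc['amount']}"


def run(stdin, stdout):
    """Stream records from an input file object to an output file object."""
    for record in map_lines(stdin):
        stdout.write(record + "\n")
```

A streaming script would call `run(sys.stdin, sys.stdout)` after importing `sys`; a matching reducer would then sum the amounts per city.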

pappu_pandit · September 11, 2014, 6:43pm

Hi Cihan

Yes, that makes perfect sense for us. That would reduce the storage requirements and, moreover, the development work if we need to run Hadoop analytics.
-PP