Couchbase foundational underpinnings


#1

I am interested in knowing what the foundational blocks are that Couchbase Server is built upon. If I were to port the product to a new platform, what building blocks would be needed? More to the point:

  1. What frameworks are required to port Couchbase to a given server?
    a. IPC verbs
    b. Compilers
    c. Java, node.js etc.
  2. Are you leveraging any RDMA frameworks such as MVAPICH etc?
  3. What interfaces do you use for your SQL database interface adaptors? JDBC, ODBC etc?
  4. What are your plans to support InfiniBand?

#2

In order to build the server you need at least:
* C and C++ compilers (they need to support C99 and C++11)
* Go compiler
* Python
* CMake
* repo
* git
* OpenSSL

Then you’ll need specific versions of some third-party dependencies; the list is at http://src.couchbase.org/source/xref/trunk/tlm/deps/packages/CMakeLists.txt#126 . You’d most likely have to update http://src.couchbase.org/source/xref/trunk/tlm/cmake/Modules/PlatformIntrospection.cmake to add your new platform, and then specify the platform in http://src.couchbase.org/source/xref/trunk/tlm/deps/manifest.cmake.
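As a rough first check on a new platform, the tool list above can be turned into a small script that reports which executables are on the PATH. This is only a sketch: the candidate executable names (`cc`, `gcc`, `python3` and so on) are assumptions and will differ per platform, it does not verify C99/C++11 support or tool versions, and finding the `openssl` command does not prove the OpenSSL headers and libraries are installed.

```python
import shutil

# Candidate executable names are assumptions; adjust them for your platform.
REQUIRED_TOOLS = {
    "C compiler": ["cc", "gcc", "clang"],
    "C++ compiler": ["c++", "g++", "clang++"],
    "Go compiler": ["go"],
    "Python": ["python", "python3"],
    "CMake": ["cmake"],
    "repo": ["repo"],
    "git": ["git"],
    "OpenSSL": ["openssl"],
}


def find_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Map each tool label to the first candidate found on PATH, or None."""
    found = {}
    for label, candidates in required.items():
        # Take the first candidate name that resolves to a path.
        found[label] = next(
            (path for path in map(which, candidates) if path), None
        )
    return found


def missing_tools(found):
    """Return the labels whose executables were not found."""
    return [label for label, path in found.items() if path is None]


if __name__ == "__main__":
    report = find_tools()
    for label, path in report.items():
        print(f"{label:14} {path or 'NOT FOUND'}")
```

Anything reported as NOT FOUND (the Go compiler, for example) is a candidate for porting work before the server build can even start.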


#3

In addition, for the clients (SDKs), it depends on which ones you want to use. All the SDKs and the separate JDBC driver (which is developed by a partner, SIMBA) issue N1QL queries over HTTP, so there is nothing you need to build there as far as the server is concerned.
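To illustrate what "N1QL over HTTP" means for a client, here is a minimal sketch using only the Python standard library. The endpoint path `/query/service` and port 8093 are assumptions based on the query service defaults, the helper names are hypothetical, and authentication is left out entirely; check the query REST API documentation for your server version before relying on any of it.

```python
import json
import urllib.parse
import urllib.request


def build_n1ql_request(statement, host="localhost", port=8093):
    """Build (but do not send) an HTTP POST carrying a N1QL statement."""
    # Assumed default endpoint of the query service.
    url = f"http://{host}:{port}/query/service"
    body = urllib.parse.urlencode({"statement": statement}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_n1ql(statement, **kwargs):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_n1ql_request(statement, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The point is that the client side needs nothing beyond an HTTP client and a JSON parser, which is why the SDKs add no build requirements on the server.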


#4

Thank you both for your answers. I suspect there are some missing pieces though. For example, I know Couchbase relies heavily upon shared memory. What facilities do you use? I suspect you rely on the Unix IPC verbs from the runtime library. You won’t find those on Windows or other non-Unix platforms. Also, node.js is not implemented on all operating systems either, so it would be helpful to have a concise inventory of all the packages and operating system facilities the product relies on. I suppose I could pore through all the makefiles, but I was hoping a Couchbase developer could just tell me authoritatively.

I’m thinking of what it would take to get Couchbase ported to the HP NonStop fault-tolerant server. The Go compiler is not implemented on that platform, but there is an API (i.e. what I am working on) that does what Unix IPC does, so that part at least is covered. NonStop has an SVR4-compliant Unix variant called OSS that sits on top of the Guardian (i.e. NonStop OS) kernel, much like Linux sits on top of IBM’s mainframe OS. It also has a build package called floss (which includes cmake) that makes OSS look like Linux, so most Unix products port pretty cleanly. Where it gets dicey is shared memory and kernel-managed threads, because Guardian doesn’t support them. git, openssl, python, and C99 are supported. I’m not sure about C++11. Repo is not there, so it looks like that needs to be ported also.

Also, their JVM is a little behind, so it would be helpful to know which specific SDKs and versions are needed so I can check that too.


#5

I have to admit that my knowledge about the internals of the stuff we’ve implemented in Erlang and Go is limited, but the other modules in Couchbase do NOT use shared memory. I’ve tried hard to make sure that the parts of Couchbase that are implemented in C/C++ are written in a portable way: in addition to building on our “supported” platforms, I’m also building on Solaris and FreeBSD. That means we’ve tried to stick with POSIX, plus a pure Windows implementation where we need it. There is one module that is really platform specific that you might have to fix yourself (there is an HP-UX implementation there, but given that we don’t build for HP-UX I have no idea if it even compiles ;) )

Repo is just a python program so I would guess that it would just work out of the box for you.

So what is your goal? To build Couchbase Server on HP NonStop? Or do you also want to be able to run all of the different SDKs? The server doesn’t depend on or use node.js. I don’t know enough about the internals of all the various SDKs to put up a list of what they need or use.

I don’t know all the internal details of our dependencies (and I don’t know if you looked into the link I added), but we do use some dependencies that I find very likely to have low-level OS-related implementations, such as Erlang, jemalloc, libevent and v8…

Cheers,

Trond


#6

Hi Trond,

Would it be possible for us to converse privately? My email is dean@caleb-ltd.com

Erlang? That’s another porting dependency. I don’t believe that compiler has been ported to NonStop either. From what I have read, Erlang does not use shared memory but apparently Go does. Erlang’s architecture is consistent with NonStop’s. The Erlang OTP (i.e. the process monitoring and inter-process messaging framework) does what Pathway (process supervision and restarts) and Guardian (i.e. the NonStop OS) inter-process messaging do for a NonStop, and I’ll bet the NonStop does it a lot better. It is proven to support seven nines of availability and absolutely bullet-proof fault tolerance. As such, it would be a pretty great foundation to port Erlang to. I found the following quote (http://joneisen.tumblr.com/post/38188396218/concurrency-models-go-vs-erlang) in a comparison between Erlang and Go:

In addition to the author’s experience, there are underlying differences in the concurrency implementations. Go is built to run on multicore processors and uses a shared-memory model. Erlang is built to run on multiple computers and does not allow shared-memory…

I should clear up a few things. There are a couple main other differences between the languages’ concurrency models:

  •     Erlang’s concurrency model uses “processes” and does not share memory, which Go’s goroutines do.
    
  •     Erlang’s scheduler is preemptive, which is a goal of Go’s but not the current state.
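To make the quoted distinction concrete, here is a minimal sketch of the two styles. It is written in Python rather than Go or Erlang, purely for illustration, and the function names are hypothetical. In the first variant the workers update one shared variable under a lock; in the second they keep private state and only send messages through a queue. Note that both variants here run as threads inside a single process, so the second one only mimics the message-passing style; it does not reproduce the isolated process heaps that Erlang provides.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Workers add into one shared variable, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for value in chunk:
            with lock:  # without the lock, concurrent updates could be lost
                total += value

    threads = [
        threading.Thread(target=work, args=(values[i::workers],))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


def message_passing_sum(values, workers=4):
    """Workers keep private state and report partial sums as messages."""
    mailbox = queue.Queue()

    def work(chunk):
        mailbox.put(sum(chunk))  # the only interaction is sending a message

    threads = [
        threading.Thread(target=work, args=(values[i::workers],))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Exactly one message per worker is expected.
    return sum(mailbox.get() for _ in range(workers))
```

Both functions return the same result for the same input; the difference is whether the workers coordinate through a shared variable or through messages.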
    

As for my goal, this is an exploratory exercise. Here’s some background. I build products on NonStop and am investigating what it would take to get Couchbase to work on NonStop. More particularly, NonStop is a shared-nothing, clustered-processor architecture (i.e. 2-16 CPUs per node) where each processor has its own RAM. It is engineered from first principles to survive any single point of failure, and my implementation of IPC does that. Imagine! Fault-tolerant memory. The Guardian OS uses message passing on a bus as its core backplane. That bus is InfiniBand, so RDMA is an option. I am nearing the completion of a product that will leverage RDMA between these processors so that shared memory (i.e. think of the Unix IPC capabilities of shared memory, queue-based messages and semaphores) will be possible on this platform. It sounds to me like this would be a prerequisite to porting Go, and that Go is a prerequisite to porting Couchbase. The following link makes the need for shared memory even more apparent: http://stackoverflow.com/questions/13895147/whether-go-uses-shared-memory-or-distributed-computing

As for your question about the scope of the implementation, implementing just the Couchbase server would be a minimalist approach. I think having all the SDKs supported would be valuable if anyone is going to take the NonStop implementation seriously. This is about as much as I’m willing to discuss on an open forum. Please do reach out to me so we can communicate directly.