Error Using Java SDK in Spark Job

Hi,

Since I maintain the Spark connector, let me follow up here.

That is not needed at all; in fact, the Spark connector supports exactly what you are doing here. Your problem is a different one that the connector solves: you need to create Couchbase connections on the workers, not on the driver. Otherwise Spark needs to serialize the connection over the network, which obviously can't work.
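To illustrate why shipping a live connection from the driver can't work, here is a minimal, self-contained Java sketch (no Spark or Couchbase involved; `BadTask` and its `Socket` field are hypothetical stand-ins for a task closure that captured a `Bucket` on the driver):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.Socket;

public class SerializeDemo {

    // Stands in for a Spark closure that captured a live connection on the driver.
    static class BadTask implements Serializable {
        Socket conn = new Socket(); // a live resource (think: Bucket); not Serializable
    }

    // Attempts Java serialization, which is what Spark does before shipping a task.
    static String trySerialize(Object task) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(task);
            return "serialized OK";
        } catch (NotSerializableException e) {
            return "NotSerializableException: " + e.getMessage();
        } catch (IOException e) {
            return "IOException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Fails: the connection must instead be opened inside the task, on the worker.
        System.out.println(trySerialize(new BadTask()));
    }
}
```

The fix is the same in Spark: open the connection lazily inside the function that runs on the executor (for example, per partition), so nothing live ever crosses the wire.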

In your case an error got raised (a ChannelException from Netty) which can't be serialized over the network. The ClassNotFoundException is probably because Netty is not available on all executors.

So in your case I wonder if you need to build either a fat jar with all the dependencies or add the jars to the classpath on each worker node. In addition, you need to be very careful about where you open and use the Bucket instance.
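For the classpath side, both options look roughly like this (a sketch; the class and jar names below are hypothetical placeholders for your own build artifacts):

```shell
# Option 1: submit a fat/assembly jar that bundles the Couchbase SDK, Netty, etc.
# (built with e.g. sbt-assembly or the Maven shade plugin)
spark-submit --class com.example.MyJob my-job-assembly.jar

# Option 2: ship the individual dependency jars to every executor via --jars
spark-submit --class com.example.MyJob \
  --jars java-client.jar,core-io.jar,netty.jar \
  my-job.jar
```

Either way, the point is that the SDK and its transitive dependencies (including Netty) end up on the classpath of every executor, not just the driver.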

If you pull in the Spark connector version 1.1.0, check out the Couchbase SDKs documentation as an example of how to use it, and let me know if you run into issues.

Note that for the ChannelException you'd need to share more logs so we can spot what's going on; it's not possible to tell the root cause from just the exception type.