Create Spark SQL DataFrame: Pass bucket name/password to DF not to Session builder

connections


I’m using Spark and Spark Couchbase Connector version 2.1.

I want to load a bucket into a DataFrame. If I pass the bucket name and password to the session builder like this:

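import org.apache.spark.sql.SparkSession
import com.couchbase.spark.sql._ // provides spark.read.couchbase()

val spark = SparkSession.builder.master("local[*]").appName("Couchbase-app")
  .config("spark.couchbase.nodes", "127.0.0.1")
  .config("spark.couchbase.bucket.User", "123") // bucket "User", password "123"
  .getOrCreate()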

and then issue val df = spark.read.couchbase(), the DataFrame is created normally.

However, omitting bucket/password from the session builder and submitting them instead as options during the DataFrame creation

val df = spark.read.format("com.couchbase.spark.sql.DefaultSource").option("spark.couchbase.bucket.User","123").load()

…fails with the common error:

com.couchbase.client.java.error.InvalidPasswordException: Passwords for bucket "default" do not match.

That suggests that the bucket name/password aren’t picked up at all.

If I pass the bucket name as an option (which is apparently meant only for cases where more than one bucket is open), like:

val df = spark.read.format("com.couchbase.spark.sql.DefaultSource").option("bucket","User").load()

The bucket name seems to be picked up, but the password is missing: java.lang.IllegalStateException: Not able to find bucket password for bucket User.
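
To make the suspected behaviour concrete, here is a minimal sketch of a lookup that would produce both symptoms. It is only a model of what the two errors suggest, not the connector's actual code: the object BucketLookupSketch and the functions resolveBuckets and passwordFor are hypothetical names, and the fallback to a bucket called "default" with an empty password is an assumption inferred from the first error message.

```scala
// Simplified, hypothetical model of the lookup behaviour described above.
// It is NOT the connector's real code; all names here are made up.
object BucketLookupSketch {
  // Prefix of the session-level config keys used in the question.
  val Prefix = "spark.couchbase.bucket."

  // Assumption: credentials are read from the session-level config only,
  // falling back to a bucket named "default" with an empty password.
  def resolveBuckets(sessionConf: Map[String, String]): Map[String, String] = {
    val found = sessionConf.collect {
      case (key, password) if key.startsWith(Prefix) =>
        key.stripPrefix(Prefix) -> password
    }
    if (found.isEmpty) Map("default" -> "") else found
  }

  // Assumption: the reader's "bucket" option only selects a bucket name;
  // the password still has to come from the session-level config.
  def passwordFor(
      sessionConf: Map[String, String],
      readerOptions: Map[String, String]
  ): Either[String, (String, String)] = {
    val buckets = resolveBuckets(sessionConf)
    readerOptions.get("bucket") match {
      case Some(name) =>
        buckets.get(name).map(pw => name -> pw)
          .toRight(s"Not able to find bucket password for bucket $name")
      case None if buckets.size == 1 => Right(buckets.head)
      case None => Left("More than one bucket is configured; a bucket option is required")
    }
  }
}
```

Under this model, passing "spark.couchbase.bucket.User" as a reader option is simply ignored, so the lookup falls back to "default" with an empty password, while passing only "bucket" selects the name User but finds no password for it. If the connector really works this way, reader options alone could never carry the credentials.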

Question: My application design requires creating the Spark session separately from the DataFrame, so any database-specific information (user, password, host, port, etc.) can only be passed as options when the DataFrame is created. Is there a way to achieve this?