LDAP setup with autonomous operator fails due to tlsCert

Hi everybody,
I would like to set up LDAP on a Couchbase cluster managed by the autonomous operator.
I got everything up and running and I am able to log in with the LDAP credentials.

Here is the most relevant YAML part:

    ldap:
      ...
      port: 636
      encryption: TLS
      serverCertValidation: false
      ...

On the first apply, this works just fine. You see, I am not using any tlsCert since I do not need validation anyway. The problem is that any subsequent apply fails, because an admission validation error occurs on the tlsCert field:

    "couchbase-operator-admission.default.svc" denied the request: validation failure list:
    spec.security.ldap.tlsSecret in body is required
    resource name may not be empty

This means I cannot fail over nodes or update the cluster until an apply removes the LDAP configuration entirely.
Now my question is: how would one configure, via YAML, the cluster shown in the following image taken from the UI?
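In case it is useful: one possible workaround I have not verified (the secret name and key below are my assumptions, not from the docs) would be to satisfy the webhook by pointing tlsSecret at a secret containing the LDAP server's CA, even though validation stays disabled:

    ldap:
      ...
      port: 636
      encryption: TLS
      serverCertValidation: false
      tlsSecret: ldap-ca-cert          # non-empty reference to appease the webhook

    apiVersion: v1
    kind: Secret
    metadata:
      name: ldap-ca-cert               # assumed name, anything should do
    type: Opaque
    stringData:
      ca.crt: |                        # key name is a guess; verify against the docs
        < SOME CERTIFICATE HERE >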

Technical information:

Couchbase Operator: v2.2.1
Couchbase Admission: v2.2.1
Couchbase Server: v6.6.3 and v7.0.3

Hi @hickla, your config is correct for this use case, and as long as the following remains set:

    serverCertValidation: false

then tlsSecret is not required. Is it possible that the setting is somehow being reverted by the applied YAML?

Hi @tommie,
all my changes are applied correctly the first time. The steps to reproduce this issue are:

  1. Apply the config with the LDAP settings → works just fine
    → kubectl describe cbc cb-cluster also gives back the correct config
  2. Remove one of the pods to provoke a reconciliation by the operator
    → kubectl describe cbc cb-cluster still gives back the correct config
    → The operator tries to update the status and reports that it cannot complete
    the task due to the validation error on tlsSecret

An extract of the operator logs is attached:

{"level":"info","ts":1652426735.2141702,"logger":"cluster","msg":"Resource updated","cluster":"default/cb-cluster","diff":"  (\n  \t\"\"\"\n  \tauthenticationEnabled: true\n  \tauthorizationEnabled: true\n  \tbindDN: SOME BIND DN\n- \tcacert: |-\n- \t  SOME CERTIFICATE HERE\n+ \tcacert: \"\"\n  \t... // 11 identical lines\n  \t\"\"\"\n  )\n"}
{"level":"info","ts":1652426745.4501116,"logger":"cluster","msg":"Resource updated","cluster":"default/cb-kcluster","diff":"  (\n  \t\"\"\"\n  \tauthenticationEnabled: true\n  \tauthorizationEnabled: true\n  \tbindDN: SOME BIND DN\n- \tcacert: |-\n- \t  SOME CERTIFICATE HERE\n+ \tcacert: \"\"\n  \t... // 11 identical lines\n  \t\"\"\"\n  )\n"}

---> everything works, I can log in via LDAP. Now I kill the pod

{"level":"info","ts":1652426760.0685852,"logger":"cluster","msg":"Resource updated","cluster":"default/cb-cluster","diff":"  (\n  \t\"\"\"\n  \t... // 100 identical lines\n  \t  ready:\n  \t  - cb-cluster-0019\n+ \t  unready:\n  \t  - cb-cluster-0020\n  \tsize: 2\n  \t... // 9 identical lines\n  \t\"\"\"\n  )\n"}
{"level":"info","ts":1652426760.3767927,"logger":"cluster","msg":"unable to update status","cluster":"default/cb-cluster","error":"admission webhook \"couchbase-operator-admission.default.svc\" denied the request: validation failure list:\nspec.security.ldap.tlsSecret in body is required\nresource name may not be empty"}
{"level":"info","ts":1652426760.3961055,"logger":"cluster","msg":"Resource updated","cluster":"default/cb-cluster","diff":"  (\n  \t\"\"\"\n  \t... // 100 identical lines\n  \t  ready:\n  \t  - cb-cluster-0019\n+ \t  unready:\n  \t  - cb-cluster-0020\n  \tsize: 2\n  \t... // 9 identical lines\n  \t\"\"\"\n  )\n"}
{"level":"info","ts":1652426760.55743,"logger":"cluster","msg":"failed to update cluster status","cluster":"default/cb-cluster"}
{"level":"info","ts":1652426760.5634825,"logger":"cluster","msg":"Cluster status","cluster":"default/cb-cluster","balance":"balanced","rebalancing":false}
{"level":"info","ts":1652426760.5635242,"logger":"cluster","msg":"Node status","cluster":"default/cb-cluster","name":"cb-cluster-0019","version":"enterprise-7.0.3","class":"db2","managed":true,"status":"Active"}
{"level":"info","ts":1652426760.5635295,"logger":"cluster","msg":"Node status","cluster":"default/cb-cluster","name":"cb-cluster-0020","version":"enterprise-7.0.3","class":"db2","managed":true,"status":"Down"}

In addition, this is my complete LDAP configuration:

    ldap:
      authenticationEnabled: true
      authorizationEnabled: true
      hosts:
        - < SOME HOST >
      port: 636
      encryption: TLS
      serverCertValidation: false
      bindDN: < SOME BIND DN >
      bindSecret: ldap-secret
      userDNMapping:
        query: < SOME QUERY >
      groupsQuery: < SOME QUERY >
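For completeness, the bind secret referenced above is a plain Kubernetes Secret. Mine looks roughly like the sketch below; the key name `password` is from memory, so verify it against the operator documentation:

    apiVersion: v1
    kind: Secret
    metadata:
      name: ldap-secret                  # referenced by spec.security.ldap.bindSecret
    type: Opaque
    stringData:
      password: < SOME BIND PASSWORD >   # key name from memory; double-check the docs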

The problem is that I cannot use this in production as-is, since the cluster does not become healthy again if something happens.

Hi hickla!

I am trying to set up LDAP on our CouchbaseClusters, however I can't get the queries right. I get everything working as I want in the UI, but when I translate it into the queries in the YAML, it won't work.

Would you give me an example of how your queries work? I'm sure I can figure out how to adapt it for mine.

Kindly,
Tess

Hi Tess,
sadly, this topic was never driven forward, since Couchbase Support could not help me. Hence my config is a little old, as are my memories. Nevertheless, I was able to log in, and the issue mentioned in this topic occurred one step further on, so:

From what I can tell, my config with regard to userDNMapping and groupsQuery looked as follows:

    userDNMapping:
      query: ou=usr,o=employee??one?(uid=%u)
    groupsQuery: ou=cb-kapp,ou=apps,o=global??one?(member=%D)

I hope that helps you figure out how to design your query, but it is very specific to how your LDAP is structured.
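To unpack the format a bit (this is my reading of it, so double-check against the Couchbase docs): the query is an RFC 4516-style search fragment, base?attributes?scope?filter, where Couchbase substitutes %u with the login name and %D with the authenticated user's DN:

    # Template: base?attributes?scope?filter
    # - base:       where the search starts (e.g. ou=usr,o=employee)
    # - attributes: left empty here, hence the double '?'
    # - scope:      one = search only the direct children of the base
    # - filter:     standard LDAP filter; %u -> login name, %D -> user's DN
    userDNMapping:
      query: ou=usr,o=employee??one?(uid=%u)
    groupsQuery: ou=cb-kapp,ou=apps,o=global??one?(member=%D)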

Kind regards
Lars

Thanks Lars for your superfast reply!

I was definitely missing the ??one? in my queries; I will go ahead and try this out.

Many thanks,
Tess