Is there a nice way to prevent deletes with the subdoc API from leaving "stub" paths with no children?

jc · December 17, 2020, 4:40pm

I have some documents of the form:

{
  "root": {
    "field1a": {
      "field2a": {
        "value": 1
      },
      "field2b": {
        "value": 1
      }
    },
    "field1b": {
      "field2a": {
        "value": 2
      },
      "field2b": {
        "value": 2
      }
    }
  }
}

Various modifications are being done with the subdoc API, but deletes create an issue. I need to remove elements at the “field2” layer, and when a given node at the “field1” layer has no children left, it is left hanging as an empty object:

{
  "root": {
    "field1a": {}, //  field1a here is empty
    "field1b": {
      "field2a": {
        "value": 2
      }
    }
  }
}

Since some logic may rely on the existence of certain keys, I need to clean up these empty path stubs. I can obviously iterate through the document, check for childless nodes, and delete them. However, I’m wondering if there is a cleaner way to do this. I am implementing a bunch of such deletes at different layers, and it seems a bit cumbersome to iterate through and check whether everything has children.
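For reference, the "iterate and check" approach described above can be kept fairly small if the cleanup only walks back up the path that was just deleted, rather than scanning the whole document. The following is a minimal sketch in plain JavaScript operating on an in-memory document; `removeAndPrune` and its `keepDepth` parameter are hypothetical names chosen for illustration, not part of any Couchbase API.

```javascript
// Hypothetical helper: delete the value at `path` (an array of keys), then
// remove any ancestor objects that the delete left empty.
// `keepDepth` is the number of leading path levels that are never pruned
// (default 1, so a top-level key such as "root" always survives).
function removeAndPrune(doc, path, keepDepth = 1) {
    // Walk down the path, remembering each (parent, key) pair on the way.
    const parents = [];
    let node = doc;
    for (const key of path.slice(0, -1)) {
        if (node === null || typeof node !== "object" || !(key in node)) {
            return false; // path does not exist, nothing to do
        }
        parents.push([node, key]);
        node = node[key];
    }
    const leaf = path[path.length - 1];
    if (node === null || typeof node !== "object" || !(leaf in node)) {
        return false;
    }
    delete node[leaf];

    // Walk back up: drop each ancestor that is now an empty plain object,
    // stopping at the first one that still has children.
    for (let i = parents.length - 1; i >= keepDepth; i--) {
        const [parent, key] = parents[i];
        const child = parent[key];
        if (Array.isArray(child) || Object.keys(child).length > 0) break;
        delete parent[key];
    }
    return true;
}

// Usage on a document shaped like the one above.
const doc = {
    root: {
        field1a: { field2a: { value: 1 }, field2b: { value: 1 } },
        field1b: { field2a: { value: 2 }, field2b: { value: 2 } }
    }
};
removeAndPrune(doc, ["root", "field1a", "field2a"]);
removeAndPrune(doc, ["root", "field1a", "field2b"]); // field1a is now removed too
console.log(JSON.stringify(doc, null, 2));
```

This only helps when the whole document is available on the client, so it would have to be paired with a full-document read and write rather than a subdoc mutation.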

graham.pople · December 18, 2020, 10:29am

Hi @jc

There’s no conditional logic in Sub-Document (“delete field1a iff empty”), so I don’t think there’s a way to do this. For this kind of complex operation it’s simpler to do a full-document get-and-replace.
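A full-document get-and-replace is usually guarded with CAS so that a concurrent writer is not overwritten. Below is a rough sketch of that loop, assuming a `collection` object with `get(key)` returning `{ content, cas }` and `replace(key, value, { cas })`, which is the shape the Couchbase Node.js SDK 3.x and later exposes. The names `getAndReplace`, `mutate`, `maxAttempts`, and `isConflict` are hypothetical, and the default check on `err.name === "CasMismatchError"` is an assumption about how a conflict is reported; verify it against your SDK version.

```javascript
// Sketch of a CAS-guarded full-document get-and-replace.
// `mutate(content)` is a hypothetical callback that edits the document in
// place and returns true if it changed anything.
// `isConflict(err)` decides whether an error means "someone else wrote first";
// the default is an assumption, not a confirmed SDK contract.
async function getAndReplace(
    collection,
    key,
    mutate,
    maxAttempts = 5,
    isConflict = (err) => Boolean(err) && err.name === "CasMismatchError"
) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        // Read the whole document together with its current CAS value.
        const { content, cas } = await collection.get(key);

        // Apply the change locally; skip the write if nothing changed.
        if (!mutate(content)) return false;

        try {
            // The replace only succeeds if the CAS still matches.
            await collection.replace(key, content, { cas });
            return true;
        } catch (err) {
            // Retry on a conflict; rethrow anything else or on the last attempt.
            if (!isConflict(err) || attempt === maxAttempts) throw err;
        }
    }
    return false;
}
```

The `mutate` callback is where the delete plus any cleanup of empty parents would go, so the removal and the stub check happen against the same snapshot of the document.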

jon.strabala · December 18, 2020, 3:48pm

@jc I know you are asking about sub-doc operations, but I would like to point out that there is another option via the Eventing Service.

With Eventing you could clean up your entire bucket, either once as a single-shot tool or continuously in real time on each mutation. Here is a link to a handler that removes nulls and empty objects and elements; it will do what you want and much more.
https://docs.couchbase.com/server/current/eventing/eventing-handler-removeNullsAndEmptys.html

I took the above and stripped it down to a generic function that does exactly what you need.

sample input:

{
  "root": {
    "field1a": {},
    "field1b": {
      "field2a": {
        "value": 2
      }
    },
    "field1c": {
      "field2c": {
        "value": 3,
        "field3c": {
            "value": 4,
            "field4c": {}
        }
      }
    }
  }
}

processed output / cleaned up:

{
  "root": {
    "field1b": {
      "field2a": {
        "value": 2
      }
    },
    "field1c": {
      "field2c": {
        "value": 3,
        "field3c": {
          "value": 4
        }
      }
    }
  }
}

Eventing function “GenericRemoveStubs”:

function OnUpdate(doc, meta) {
    // optional filter to specific types
    // if (doc.type !== "my_data_to_prune") return;
    
    function removeEmptyParts(obj) {
        if (obj !== null && typeof obj === "object") {
            Object.keys(obj).forEach((k) => {
                if (obj[k] && typeof obj[k] === 'object') {
                    // recurse
                    removeEmptyParts(obj[k])
                    // remove stub {} object items
                    if (obj[k] && !Array.isArray(obj[k]) &&
                        Object.keys(obj[k]).length === 0
                    ) {
                        delete(obj[k]) // 6.6+ can use "delete obj"
                        updated = true; // set in scope OnUpdate
                    }
                }
            });
        }
        // scalars (numbers, strings, booleans, null) fall through unchanged
        return obj;
    }

    // prune {} stubs from doc in place (removeEmptyParts mutates and returns doc)
    var updated = false;
    var newdoc = removeEmptyParts(doc);

    // Requires 6.5+; src_bkt is a bucket alias to the source bucket in read+write mode
    if (updated) {
        // only update the KV if we updated the doc
        src_bkt[meta.id] = newdoc;
    }
}

As I said, the above function can be run as a point tool to clean up all items in a bucket (deploy it with the feed boundary set to Everything, then undeploy it), or it can be run continuously to repair this sort of issue on every mutation.
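To sanity-check the pruning logic before deploying anything, the recursion can be exercised locally in plain Node.js against the sample input above. This is a sketch of a standalone variant, not the Eventing handler itself: `pruneEmptyObjects` is a hypothetical name, it reports changes through its return value instead of an outer `updated` flag, and it leaves array elements in place because `delete` on an array index would leave a hole.

```javascript
// Standalone variant of the stub-pruning logic for local testing
// (plain Node.js, no Eventing runtime). Mutates `obj` in place and
// returns true if any empty {} object was removed.
function pruneEmptyObjects(obj) {
    let changed = false;
    if (obj === null || typeof obj !== "object") return changed;

    for (const key of Object.keys(obj)) {
        const child = obj[key];
        if (child === null || typeof child !== "object") continue;

        // Recurse first so a parent emptied by its children is caught too.
        if (pruneEmptyObjects(child)) changed = true;

        // Remove empty plain objects, but only from object parents:
        // deleting an array index would leave a hole in the array.
        if (!Array.isArray(obj) && !Array.isArray(child) &&
            Object.keys(child).length === 0) {
            delete obj[key];
            changed = true;
        }
    }
    return changed;
}

// The sample input from the post above.
const sample = {
    root: {
        field1a: {},
        field1b: { field2a: { value: 2 } },
        field1c: {
            field2c: {
                value: 3,
                field3c: { value: 4, field4c: {} }
            }
        }
    }
};

const changed = pruneEmptyObjects(sample);
console.log(changed);
console.log(JSON.stringify(sample, null, 2));
```

Running it on a document with no stubs returns `false`, which corresponds to the case where the handler skips the write back to the bucket.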