Solved - Remove_multi() - how to force direct removal


I have a case where I get a file generated by Spark containing the document keys of cold data that should be deleted. These files usually contain around 6,000,000 records/keys.

I wrote a Python script which reads the file line by line, builds packages of e.g. 250 lines and then deletes them using remove_multi(), passing a dictionary of the keys and their expected CAS values.
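For illustration, the batching step described above could look roughly like the sketch below. It assumes each line of the Spark file has the form `key,cas`, which the post does not state, and `iter_batches` is a hypothetical helper name, not part of the original script.

```python
from itertools import islice


def iter_batches(lines, batch_size=250):
    """Yield dicts of {key: cas} built from at most batch_size lines each."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        batch = {}
        for line in chunk:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Assumed line format: "<key>,<cas>"; adjust to the real file layout.
            key, cas = line.rsplit(',', 1)
            batch[key] = int(cas)
        if batch:
            yield batch
```

Each yielded dictionary could then be passed to a removal function such as remove_batch_data(), for example by looping over `iter_batches(fh)` with `fh` being the open file.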

As far as I can see, remove_multi() does not immediately remove the documents. In the Couchbase UI, I can see there are no “deletes per sec.” for quite a while; then, after a few minutes, they go up to several thousand per second and the disk write queue fills up too.

Am I missing something like a commit to force the removal after each package of 250 documents?

Any advice would be very much appreciated.

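import logging

import couchbase.exceptions as cb_errors
from couchbase.bucket import Bucket

# CB_CONNSTR, CB_PASSWORD, dryrun_mode and str2bool() are defined elsewhere in the script.

def remove_batch_data(rm_dict):
    """Remove one batch of {key: cas} pairs and return the number of documents removed."""
    suc = 0
    cb = Bucket(CB_CONNSTR, password=CB_PASSWORD)
    if len(rm_dict) > 0:
        if str2bool(dryrun_mode):
            # Dry run: only log what would be removed.
            for key in rm_dict:
                logging.info('Key: ' + key + ' - CAS: ' + str(rm_dict[key]))
        else:
            try:
                cb.remove_multi(rm_dict, quiet=True)
                suc += len(rm_dict)
            except cb_errors.KeyExistsError as exc:
                # CAS mismatch on at least one key: check the per-key results.
                for k, res in exc.all_results.items():
                    if not res.success:
                        logging.warning('Removal failed: ' + k)
                    else:
                        # Count only the keys that were actually removed.
                        suc += 1

    return suc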


Solved. The calling function had an error.