Delete seems to be unusable?

stevefoxover · December 31, 2015, 9:58pm

Testing: Enterprise 4.1 Server running on a Linux box, accessed from a C# console app.

I am testing a bucket with server log files. The bucket has about 3.5 million documents. At the current rate, deleting the roughly one million docs for a single day will probably take a week or two. I am using an index on type and N1QL statements. I reduced the test environment to a single server to simplify the test.
How can this "DELETE FROM logs WHERE type='LogDocument' LIMIT 1000" be so slow on a bucket with only 3 million items of simple log files? I could try more complicated things, like a TTL combined with an 'isdeleted': true value that I index.

But my question is:
Is there a better way than using a single bucket and separating different documents by type=value? If there were separate tables I could just split days into different tables and drop the table. Seems like this issue could be a deal breaker.

Steve

ingenthr · January 1, 2016, 7:48pm

There are probably two issues here. The first is that you're probably not walking the index as efficiently as you could, and the second is that your index read could be held up by the recalculation from the updates, depending on how your query is formed.

Try changing the query to work in batches, starting each batch after the last ID of the previous batch. With an index restricted to the type and keyed on meta().id, this will be the most efficient index walk.
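
A minimal sketch of that batched walk, assuming a partial index keyed on the document ID. The index name idx_logdoc_id and the named parameter $lastId are made up for illustration, and the batch size of 1000 is simply carried over from the original statement. Whether this exact index definition behaves well on 4.1 is something to verify on your own cluster.

```sql
/* Hypothetical index: covers only LogDocument items, keyed on the document ID. */
CREATE INDEX idx_logdoc_id ON logs(META().id) WHERE type = 'LogDocument';

/* Fetch the next batch of IDs, resuming just after the last ID already seen. */
SELECT META(l).id AS id
FROM logs AS l
WHERE l.type = 'LogDocument'
  AND META(l).id > $lastId
ORDER BY META(l).id
LIMIT 1000;
```

Start with $lastId set to the empty string, pass the last id returned by each batch into the next query, and stop when a batch comes back empty. The returned IDs can then be removed by key from the client instead of through a N1QL DELETE.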

Then from the .NET client, if you fire off asynchronous deletes, you should be able to get to a high rate. With a similar index walk and reading data using some of the latest code, we hit about 30k ops/s.

Can you post some code (either here or on gist.github.com) so we can make some recommendations? It'd also be useful to know where the C# app is running. On the Linux box? Which Linux distribution and which runtime?