We have a three-node cluster of Azure A3 VMs (4 cores, 7 GB RAM each). Our workloads are pretty light, except for a weekly data import that processes around 100-200K records. The nodes run Ubuntu 12.04 and Couchbase (CB) 3.0.1. We’re using the C# .NET SDK to insert and update the records.
The initial (fresh) inserts are the fastest. We can insert new documents at around 500 docs/sec, which shows up in the admin console as 1000-1500 ops/sec (I’m guessing the replica write is counted in this stat?). Our documents are pretty small, averaging about 2 KB. While this isn’t as fast as we’d hoped, it’s workable. Where the performance gets unworkable is in document updates.
Our import process sometimes has to do an update, when an incoming document ID already exists in CB. It loads the document from CB into the C# model, compares it to the incoming document, applies the updates in memory, then rewrites the entire document back to CB. From what we read in the SDK docs, this is the only way to update an existing doc. Unfortunately, this process is VERY slow: it tends to write documents at about 30-40 docs per second.
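The update path described above can be sketched roughly as follows. This is an illustration in Python, not the actual C# import code: `InMemoryBucket` is a hypothetical in-memory stand-in (it is not the Couchbase SDK API), and both the "fetch first to check whether the ID exists" step and the "skip the write when nothing changed" step are assumptions about how such an import might behave.

```python
import copy


class InMemoryBucket:
    """Hypothetical stand-in for a bucket. NOT the Couchbase SDK API."""

    def __init__(self):
        self.docs = {}
        self.reads = 0   # number of get() calls (one round trip each in a real cluster)
        self.writes = 0  # number of set() calls (one round trip each in a real cluster)

    def get(self, doc_id):
        self.reads += 1
        doc = self.docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def set(self, doc_id, doc):
        self.writes += 1
        self.docs[doc_id] = copy.deepcopy(doc)


def import_record(bucket, doc_id, incoming):
    """Insert a new document, or load, merge, and rewrite an existing one."""
    existing = bucket.get(doc_id)      # assumption: existence is checked with a read
    if existing is None:
        bucket.set(doc_id, incoming)   # fresh insert: a single write
        return "inserted"

    merged = dict(existing)
    merged.update(incoming)            # apply the incoming fields in memory
    if merged == existing:
        return "unchanged"             # assumption: no rewrite if nothing changed

    bucket.set(doc_id, merged)         # rewrite the entire document
    return "updated"


bucket = InMemoryBucket()
print(import_record(bucket, "rec::1", {"name": "a", "qty": 1}))
print(import_record(bucket, "rec::1", {"qty": 2}))
print(import_record(bucket, "rec::1", {"qty": 2}))
print("reads:", bucket.reads, "writes:", bucket.writes)
```

The point of the sketch is that every update costs at least two operations against the cluster (a read and then a full-document write), where a fresh insert needs only the write.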
We started our troubleshooting by doing a fairly deep review of our disk configuration. We’re currently using a software RAID 0 array (mdadm) across 8 Azure page blob disks, per the best “high-performance disk config” docs we could find for Linux on Azure.
500 docs/sec seems slow for a three-node cluster, but 30-40 seems REALLY slow. Does anyone else have any experience with Azure A3 VMs and blob storage disks? Definitely feels like there’s something we’re missing.
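For scale, here is a back-of-the-envelope calculation using only the figures quoted above. Two things in it are assumptions rather than measurements: that every record in a weekly batch takes the slow update path (the worst case), and that documents are written one at a time, so that the per-document time is simply the inverse of the throughput.

```python
def import_minutes(records, docs_per_sec):
    """Wall-clock minutes to process `records` at a steady throughput."""
    return records / docs_per_sec / 60.0


def per_doc_ms(docs_per_sec):
    """Average milliseconds per document, assuming strictly sequential writes."""
    return 1000.0 / docs_per_sec


for records in (100_000, 200_000):
    for rate in (500, 40, 30):
        minutes = import_minutes(records, rate)
        print(f"{records} records at {rate} docs/sec: {minutes:.1f} min")

for rate in (500, 40, 30):
    print(f"{rate} docs/sec is about {per_doc_ms(rate):.1f} ms per document")
```

Under those assumptions, a 200K-record batch made up entirely of updates would take roughly 83 to 111 minutes, against under 7 minutes at the insert rate, and each update would be taking around 25 to 33 ms end to end.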