We have a new mid-scale app, the company’s first to use Couchbase as its only database. We store data as documents in Couchbase at a rate of about 35 million per month. Analytics will be very important, and we are looking at BI suites and tools as well as dedicated databases. We’ve thought of a couple of options:
Store everything in Couchbase forever and build views and queries to do the analytics, possibly with something like the Spark connector for certain processing. I am worried that an ever-growing production database will hurt performance and drive up costs, since we would have to keep adding nodes to support it.
Move data to a secondary database, transforming it into a snowflake schema for more traditional data analytics. How will we move this data reliably? Can we query for all new or changed documents?
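To make the second option concrete, here is a rough sketch of the incremental extraction we have in mind. It is not tied to any Couchbase API: it assumes every document carries a hypothetical `updated_at` field that the application sets on each write, and it runs over an in-memory list purely for illustration. `Doc` and `changed_since` are made-up names. In practice the filter would presumably become a query against an index on that field.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Doc:
    doc_id: str
    updated_at: int  # hypothetical field: epoch seconds of the last write
    body: dict


def changed_since(docs: Iterable[Doc], watermark: int) -> Tuple[List[Doc], int]:
    """Return documents modified after `watermark`, plus the next watermark.

    The batch is ordered by (updated_at, doc_id) so a run is repeatable.
    """
    batch = sorted(
        (d for d in docs if d.updated_at > watermark),
        key=lambda d: (d.updated_at, d.doc_id),
    )
    # Advance the watermark only if something was found; otherwise keep it.
    next_watermark = batch[-1].updated_at if batch else watermark
    return batch, next_watermark


if __name__ == "__main__":
    store = [
        Doc("order::1", 100, {"total": 10}),
        Doc("order::2", 150, {"total": 25}),
        Doc("order::3", 120, {"total": 7}),
    ]
    batch, watermark = changed_since(store, 100)
    for d in batch:
        print(d.doc_id, d.updated_at)
    print("next watermark:", watermark)
```

Two gaps in this approach that we already see: a strict "greater than" comparison can miss a document written in the same second as the previous watermark, and deleted documents never show up in the result at all. That is part of why we are asking whether there is a more reliable mechanism.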