We have use cases where we need to extract large amounts of data from a bucket (e.g. 100,000s of documents from a bucket holding, say, 1,000,000s) by ad-hoc query. Some questions/observations:
- Ideally we would want to stream the results to a message broker (e.g. Kafka) for onward processing
- DCP is not suitable for this, since we don’t necessarily want only changed documents
- How can we manage such large quantities of documents efficiently?
Any ideas? Could cbexport be enhanced to accept a N1QL query that specifies which documents to extract?