Fast method to insert many documents

#1

Hello,

What would be the fastest way to insert a lot of documents through the Sync Gateway?

Currently, each of my documents consists of attachments (pictures) and some metadata.

The solution my team and I came up with was to write a Python script that uses the REST API to create the document, retrieve the id and revision, then update the document by adding the attachments one at a time:

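# Create the document; the response carries its id and first revision.
res = self.client.post('{}/'.format(self.server), json=data)
self.document = res.json()
for img in self.images:
    url = '{}/{}/{}?rev={}'.format(self.server, self.document.get('id'),
                                   img, self.document.get('rev'))
    # Upload one attachment per request; close the file when done.
    with open(os.path.join(filepath, img), 'rb') as fh:
        res = self.client.put(url, data=fh,
                              headers={'Content-Type': 'image/jpeg'})
    # Each PUT returns a new revision, which the next PUT must reference.
    self.document = res.json()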

We run this script as Celery tasks with RabbitMQ as the broker to parallelize it, but it is still not fast enough for us.

Does anyone have a better way of adding documents with attachments faster?

I’m getting an average of 200 ops/sec:

Server Config: Intel i7 3.60GHz
RAM: 32GB
SSD: 512GB

Thanks

#2

@rugolini You can cut down on your round trips by using _bulk_docs to write multiple documents at once.
http://developer.couchbase.com/documentation/mobile/1.2/develop/references/sync-gateway/rest-api/database-public/post-bulk-docs/index.html
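
As a rough sketch of the batching side, assuming the request body is a JSON object with a "docs" array as described on the page above: the helper names and the batch size of 100 are made up for illustration, and the best batch size is something you would have to measure.

```python
def chunked(docs, batch_size):
    """Yield successive lists of at most batch_size documents."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def bulk_docs_bodies(docs, batch_size=100):
    """Build one _bulk_docs request body per batch of documents.

    Hypothetical helper: batch_size=100 is an arbitrary starting point.
    """
    return [{"docs": batch} for batch in chunked(docs, batch_size)]


if __name__ == "__main__":
    docs = [{"type": "photo", "index": i} for i in range(250)]
    bodies = bulk_docs_bodies(docs, batch_size=100)
    print(len(bodies))  # 3 request bodies instead of 250 single POSTs
    # Each body would then go out in a single request, for example:
    #   client.post('{}/_bulk_docs'.format(server), json=body)
```

Each body costs one round trip, so 250 documents would take 3 requests here rather than 250.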

Another optimization might be to send the attachments inline along with the document body, instead of doing a subsequent update to add the attachments.
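
A minimal sketch of what an inline attachment could look like, assuming Sync Gateway accepts the CouchDB-style _attachments property with base64-encoded data (worth verifying against your version). The function names and the sample metadata are hypothetical.

```python
import base64


def inline_attachment(raw_bytes, content_type="image/jpeg"):
    """Return an inline attachment entry holding raw_bytes as base64 text."""
    return {
        "content_type": content_type,
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }


def doc_with_attachments(metadata, images):
    """Return a copy of metadata with the images embedded inline.

    images maps attachment name -> raw bytes (hypothetical input shape).
    """
    doc = dict(metadata)  # copy so the caller's dict is not mutated
    if images:
        doc["_attachments"] = {
            name: inline_attachment(raw) for name, raw in images.items()
        }
    return doc


if __name__ == "__main__":
    doc = doc_with_attachments({"type": "photo"}, {"a.jpg": b"abc"})
    print(doc["_attachments"]["a.jpg"]["data"])  # YWJj
    # The document and its pictures then travel in a single request:
    #   client.post('{}/'.format(server), json=doc)
```

The trade-off is that base64 makes the payload roughly a third larger, in exchange for dropping one PUT (and one new revision) per picture.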

#3

@adamf How can I do that? Can you give an example? From what I read in the documentation, you need to do an HTTP PUT with the binary file in the body to upload an attachment.
http://developer.couchbase.com/documentation/mobile/1.2/develop/references/sync-gateway/rest-api/document-public/put-db-doc-attachment/index.html

Are you suggesting that I send the attachments as base64?