Couchbase Team,
Bucket user-bucket has 30million+ records in it. I need to update the domain of the email id present in the document from xxmail.com to yymail.com.
I have created an index and executed the script in cbq as below:
cbq> CREATE INDEX USER_BKT-EMAIL-IDX ON user-bucket(emailId) WHERE emailId LIKE “%xxmail.com”;
cbq> SELECT COUNT(*) AS count FROM `user-bucket` WHERE emailId LIKE "%xxmail.com";
{
"requestID": "64633c5c-4559-4335-b363-4376d11d4ee6",
"signature": {
"count": "number"
},
"results": [
{
"count": 3063621
}
],
"status": "success",
"metrics": {
"elapsedTime": "11.009255457s",
"executionTime": "11.009217044s",
"resultCount": 1,
"resultSize": 40
}
}
cbq> UPDATE `user-bucket` SET emailId = REPLACE(emailId, "xxmail.com", "yymail.com") where emailId LIKE "%xxmail.com";
cbq>
This update statement executed for quite long time(15mins+), and it comes out with out any output message.
Then I executed the count query and found out only 100k+ records are updated.
Kindly help me in getting the update query executed for 30million+ records.
CREATE INDEX ix1 ON `user-bucket` (emailId) WHERE REVERSE(emailId) LIKE RVERSE("%xxmail.com");
UPDATE `user-bucket` SET emailId = REPLACE(emailId, "xxmail.com", "yymail.com")
WHERE REVERSE(emailId) LIKE RVERSE("%xxmail.com");
Use scan_consistency request_plus and timeout or error retry again so that it continues where it left off.
I tried with Prefix search (REVERSE) as you mentioned, and the update query executed for quite long time(15mins+), and it comes out with out any output message. I found below error in the couch logs:
Right you are Eventing was introduced in version 5.5.
In the 6.5 release, the handler code can directly mutate (or write back) to the source bucket, prior to 6.5 you needed to use an intermediate bucket (or write the KEYs to an intermediate bucket and let eventing use that as it source to update the true target).
Hi,
I tried as you said, and the update time is huge. To update 10k records, its taking more than 2minutes. Below indexes are created in the bucket before executing update script:
CREATE INDEX USERID-IDX ON user-bucket(emailId) WHERE REVERSE(emailId) LIKE REVERSE(“%xxmail.com”);
CREATE INDEX USERIDIQ-IDX ON user-bucket(emailId) WHERE REVERSE(emailId) LIKE REVERSE(“%yymail.com”);
N1QL update is full document update.
As REQUEST_PLUS. It needs wait Index catch upto latest mutation.
Then get a keys. get document and update one by one. N1QL update is full document update.
subdoc is helping me in completing my task .But I am getting DocumentTooDeepException for multiple documents.
com.couchbase.client.java.error.subdoc.DocumentTooDeepException: JSON is too deep in document D1D0AA2C42628D474D8EDF18012E5D1E
How do i handle such case? Do I need to retry the operation again?
I assume this is one time job, so skip those document and continue. At the end when index caught up you run the same query. You will have few skipped documents retry again or handle that manually or outside.