Restoring Backups With Indexes Behaves Inconsistently

btburnett3 · September 22, 2016, 8:28pm

I have an odd scenario regarding restoring backups with indexes. I’m sure there’s probably a sensible explanation, but I haven’t been able to figure it out yet.

First, I have a development “cluster”, which is really just a single node running all services. If I make a backup from this, it contains all views and indexes in the backup. I can restore this backup onto a brand new 5 server, MDS cluster (3 data, 1 index, 1 query), and it correctly restores all of the indexes. They’re waiting in a Created state for me to build them with BUILD INDEX commands.

I have a second, production cluster, which again has 5 servers. I make a backup of this cluster using one of the data nodes as the source, with the exact same backup script as the development cluster. I can open this backup and see that it does contain both views and indexes in JSON files. When I restore it onto a brand new 5 server cluster, the indexes aren’t restored, just the views.

All of the servers in question are running Couchbase Server 4.5.0 on Ubuntu. I’m still using the old cbbackup and cbrestore utilities, not the new enterprise backup, because we haven’t had the chance to convert yet.

This isn’t a huge problem because I can always make the indexes manually. But it is making my work with our pre-production environment more difficult, since my automation scripts that work for the QA environment aren’t working for pre-production.

Thanks,
Brant

deepkaran.salooja · September 23, 2016, 10:16pm

@btburnett3, can you change the Indexer log level to Debug (UI -> Settings -> Cluster -> Index Settings -> Advanced -> Log Level), execute the cbrestore command, and send us the output of cbrestore and the indexer.log from the index server?

btburnett3 · October 3, 2016, 12:42pm

@deepkaran.salooja

I started a new cluster creation and restore Friday with debug logging enabled, and it finished over the weekend. I just uploaded the logs under the name CenterEdgeRestore.

Thanks,
Brant

deepkaran.salooja · October 4, 2016, 5:53pm

@btburnett3, can you send me the full path of the log files you uploaded?

btburnett3 · October 4, 2016, 6:07pm

I just used the upload function built into the web console. Upload to host was set to s3.amazonaws.com/cb-customers, and the Customer name was CenterEdgeRestore.

Thanks,
Brant

deepkaran.salooja · October 6, 2016, 12:36am

@btburnett3, I do not have access to that location. Can you also upload it to s3-us-west-1.amazonaws.com/forumlogs?

btburnett3 · October 6, 2016, 1:04am

@deepkaran.salooja

Sure thing, I believe I’m recreating the environment again tomorrow so I’ll make some fresh logs.

Thanks,
Brant

btburnett3 · October 6, 2016, 4:45pm

@deepkaran.salooja

Okay, logs are uploaded to https://s3-us-west-1.amazonaws.com/forumlogs/CenterEdgeRestore

The node ending in .21 is the index node.

Brant

deepkaran.salooja · October 7, 2016, 12:06am

@btburnett3, the indexer logs from .21 indicate there was no index DDL received. Did you run cbrestore on this cluster before collecting the logs? Can you share how you are running the cbrestore command and what output you see?

btburnett3 · October 7, 2016, 1:21am

@deepkaran.salooja

Yeah, that’s what I was afraid it would show, that’s kind of how it’s behaving. Yes, the logs were shipped after the cbrestore was run. Here’s the command line I used:

for i in "Authentication" "CorpModule" "Stores" "General" "GiftCards"
do
  echo "Restoring $i bucket..."

  /opt/couchbase/bin/cbrestore -u Administrator -p xxxxxxx -b $i \
    /mnt/data/initialize/ couchbase://127.0.0.1:8091
done

The views from the backup are restored, only the GSI indexes are missing.

Brant

deepkaran.salooja · October 7, 2016, 7:02pm

@btburnett3, your backup has Memory Optimized Indexes but you are restoring to a cluster with storage mode as Global Secondary Indexes. That is the reason for this failure. You need to choose the correct storage mode in the new cluster.
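
One way to catch this kind of mismatch before running cbrestore is to read the index storage mode from the target cluster and compare it with the mode the backup was taken under. The sketch below assumes the cluster manager's GET /settings/indexes endpoint returns JSON containing a storageMode field (for example memory_optimized or forestdb on 4.5). The function names and credentials are placeholders, not part of the scripts in this thread.

```shell
# Pull the "storageMode" value out of an index-settings JSON document read from stdin.
storage_mode_from_json() {
  sed -n 's/.*"storageMode"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed only when the reported storage mode equals the expected one.
# $1 = expected mode (e.g. memory_optimized), stdin = index-settings JSON.
storage_mode_matches() {
  [ "$(storage_mode_from_json)" = "$1" ]
}

# Hypothetical usage against a live cluster (assumed endpoint, placeholder credentials):
#   curl -s -u Administrator:xxxxxxx http://127.0.0.1:8091/settings/indexes \
#     | storage_mode_matches memory_optimized || echo "storage mode mismatch" >&2
```

Running a check like this right after the rebalance, and before the restore loop, would make the script fail loudly instead of silently skipping the index definitions.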

btburnett3 · October 7, 2016, 7:12pm

Ahh, that makes sense, I guess. I could have sworn my script was turning on memory optimized indexes. Can you check my cluster init command line (this is run before attaching any servers)? The ${…} are placeholders for the templating engine building the script.

/opt/couchbase/bin/couchbase-cli cluster-init -c 127.0.0.1:8091 -u Administrator -p `ec2metadata --instance-id` \
  --cluster-username=Administrator --cluster-password=${cluster_password} \
  --cluster-ramsize=${data_ramsize} --cluster-index-ramsize=${index_ramsize} --cluster-fts-ramsize=10000 \
  --index-storage-setting=memopt \
  --services=${services} >> /celog/celog.log

deepkaran.salooja · October 7, 2016, 7:28pm

The command is correct. Does it return success when you run it? You can double check from the UI what storage mode has been set.

btburnett3 · October 7, 2016, 7:39pm

@deepkaran.salooja

The UI does show it running standard indexes, not memory optimized.

The output from my scripts looks like this. The first line is from the cluster-init. There is also a node-init being run before that, but I don’t have the output from that step. After that it attaches all of the servers (again missing output), then does a rebalance and creates the empty buckets in prep for the restore.

SUCCESS: init/edit 127.0.0.1
INFO: rebalancing .
SUCCESS: rebalanced cluster
SUCCESS: bucket-create
SUCCESS: bucket-create
SUCCESS: bucket-create
SUCCESS: bucket-create
SUCCESS: bucket-create
SUCCESS: bucket-create
SUCCESS: bucket-create

btburnett3 · October 7, 2016, 7:45pm

Also, here is the full script being executed, with passwords redacted. Chef is installing Couchbase and running node-init, turning off transparent huge pages, setting swappiness, mounting drives, etc.

#!/bin/bash -ex

mkdir celog
chmod -R 777 celog

apt-get update
apt-get install -y awscli >> /celog/celog.log
apt-get install -y zip >> /celog/celog.log

curl https://amazon-ssm-us-east-1.s3.amazonaws.com/latest/debian_amd64/amazon-ssm-agent.deb -o /tmp/amazon-ssm-agent.deb >> /celog/celog.log
dpkg -i /tmp/amazon-ssm-agent.deb >> /celog/celog.log
start amazon-ssm-agent >> /celog/celog.log

wget https://packages.chef.io/stable/ubuntu/12.04/chef_12.13.30-1_amd64.deb
dpkg -i chef_12.13.30-1_amd64.deb

echo "BEGIN CENTEREDGE CONFIGURATION" >> /celog/celog.log

aws s3 cp --region us-east-1 s3://bucketname/preprod/validator.pem /etc/chef/validation.pem >> /celog/celog.log
aws s3 cp --region us-east-1 s3://bucketname/preprod/linux-client.rb /etc/chef/client.rb >> /celog/celog.log
echo "{\"run_list\":[\"couchbase\"],\"couchbase\":{\"filename\":\"couchbase-server-enterprise_4.5.0-ubuntu14.04_amd64.deb\",\"bucket\":\"bucketname\"}}" > /etc/chef/first-boot.json

/opt/chef/bin/chef-client -j /etc/chef/first-boot.json >> /celog/celog.log

sleep 5

# Cluster init

/opt/couchbase/bin/couchbase-cli cluster-init -c 127.0.0.1:8091 -u Administrator -p `ec2metadata --instance-id` \
  --cluster-username=Administrator --cluster-password=xxxxxxx \
  --cluster-ramsize=16384 --cluster-index-ramsize=16384 --cluster-fts-ramsize=10000 \
  --index-storage-setting=memopt \
  --services=data >> /celog/celog.log

# Attach other Servers

for i in 1 2 3 4 5
do
  /opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
    --server-add=10.75.2.12:8091 --server-add-username=Administrator --server-add-password=i-7bc1d24a \
    --services=data \
    && break >> /celog/celog.log

    # Sleep 15 seconds between add attempts
    echo "Sleeping 15 seconds" >> /celog/celog.log
    sleep 15
done
for i in 1 2 3 4 5
do
  /opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
    --server-add=10.75.2.13:8091 --server-add-username=Administrator --server-add-password=i-f8c3d0c9 \
    --services=data \
    && break >> /celog/celog.log

    # Sleep 15 seconds between add attempts
    echo "Sleeping 15 seconds" >> /celog/celog.log
    sleep 15
done
for i in 1 2 3 4 5
do
  /opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
    --server-add=10.75.2.21:8091 --server-add-username=Administrator --server-add-password=i-7cc1d24d \
    --services=index \
    && break >> /celog/celog.log

    # Sleep 15 seconds between add attempts
    echo "Sleeping 15 seconds" >> /celog/celog.log
    sleep 15
done
for i in 1 2 3 4 5
do
  /opt/couchbase/bin/couchbase-cli server-add -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
    --server-add=10.75.2.31:8091 --server-add-username=Administrator --server-add-password=i-1dc2d12c \
    --services=query \
    && break >> /celog/celog.log

    # Sleep 15 seconds between add attempts
    echo "Sleeping 15 seconds" >> /celog/celog.log
    sleep 15
done


# Rebalance

/opt/couchbase/bin/couchbase-cli rebalance -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  >> /celog/celog.log

# Create buckets

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=General --bucket-ramsize=4096 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=Authentication --bucket-ramsize=1024 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=CorpModule --bucket-ramsize=2048 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=GiftCards --bucket-ramsize=100 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=Cache --bucket-ramsize=100 --bucket-type=memcached \
  --enable-flush=1 --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=Stores --bucket-ramsize=100 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

/opt/couchbase/bin/couchbase-cli bucket-create -c 127.0.0.1:8091 -u Administrator -p xxxxxxx \
  --bucket=Cart --bucket-ramsize=100 --bucket-replica=1 \
  --bucket-eviction-policy=fullEviction --wait >> /celog/celog.log

# Perform restore (if configured)

echo "DOWNLOADING BACKUP..." >> /celog/celog.log

aws s3 cp --region us-east-1 s3://bucketname/databases/preprod-2016-09-22.zip /mnt/data/initialize.zip >> /celog/celog.log

echo "EXTRACTING BACKUP..." >> /celog/celog.log

mkdir /mnt/data/initialize >> /celog/celog.log
unzip /mnt/data/initialize.zip -d /mnt/data/initialize >> /celog/celog.log
rm /mnt/data/initialize.zip >> /celog/celog.log

echo "RESTORING BUCKETS" >> /celog/celog.log

for i in "Authentication" "CorpModule" "Stores" "General" "GiftCards"
do
  echo "Restoring $i bucket..." >> /celog/celog.log

  /opt/couchbase/bin/cbrestore -u Administrator -p xxxxxxx -b $i \
    /mnt/data/initialize/ couchbase://127.0.0.1:8091 >> /celog/celog.log
done

# Note: not restoring Carts intentionally, data is inherently temporary at this time

echo "CLEANUP BACKUP FILES..." >> /celog/celog.log

rm -r /mnt/data/initialize >> /celog/celog.log

# Rebuild indexes

echo "REBUILDING INDEXES..." >> /celog/celog.log

sleep 5

apt-get install -y jq >> /celog/celog.log

QUERY_HOST=http://10.75.2.31:8091

for i in "Authentication" "CorpModule" "Stores" "General"
do
  echo "Rebuilding $i bucket..." >> /celog/celog.log

  /opt/couchbase/bin/cbq -e $QUERY_HOST -s="$( \
    echo "BUILD INDEX ON $i (\`$( \
      /opt/couchbase/bin/cbq -e $QUERY_HOST -q=true -s="SELECT name FROM system:indexes where keyspace_id = '$i' AND state = 'deferred'" | \
        sed -n -e '/{/,$p' | \
        jq -r '.results[].name' | \
        sed ':a;/.*/{N;s/\n/\`,\`/;ba}')\`)")" >> /celog/celog.log

  # Wait for completion
  until [ `/opt/couchbase/bin/cbq -e $QUERY_HOST -q=true -s="SELECT COUNT(*) as unbuilt FROM system:indexes WHERE keyspace_id = '$i' AND state <> 'online'" | \
             sed -n -e '/{/,$p' | \
             jq -r '.results[].unbuilt'` -eq 0 ];
  do
    sleep 5
  done
done

# Run update queries

echo "RUNNING UPDATE QUERIES..." >> /celog/celog.log

aws s3 cp --region us-east-1 s3://bucketname/preprod/updatescripts.n1ql ~/updatescripts.n1ql >> /celog/celog.log
/opt/couchbase/bin/cbq -e $QUERY_HOST -f ~/updatescripts.n1ql >> /celog/celog.log

echo "RESTORE COMPLETE" >> /celog/celog.log


# Configure automatic backup

mkdir /etc/backup >> /celog/celog.log
aws s3 cp --region us-east-1 s3://bucketname/preprod/backupscript.sh /etc/backup/backupscript.sh >> /celog/celog.log
chmod +x /etc/backup/backupscript.sh >> /celog/celog.log

(crontab -l 2>/dev/null; echo "0 9 * * * /etc/backup/backupscript.sh >> ~/couchbasebackup.log 2>&1") | crontab -

echo "END CENTEREDGE CONFIGURATION" >> /celog/celog.log
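
The nested cbq, sed, and jq pipeline in the index rebuild step above can be hard to follow. A minimal sketch of the same idea, assuming the deferred index names have already been collected into shell arguments, is to assemble the BUILD INDEX statement in a small helper. The name build_index_statement is hypothetical and does not appear in the original script.

```shell
# Print a BUILD INDEX statement for one bucket.
# $1 = bucket name, remaining arguments = names of the indexes to build.
build_index_statement() {
  bucket="$1"
  shift
  names=""
  for name in "$@"; do
    # Backtick-quote each index name and join the names with commas.
    names="${names:+$names,}\`$name\`"
  done
  echo "BUILD INDEX ON $bucket ($names)"
}

# Example with made-up index names:
#   build_index_statement General idx_a idx_b
```

The printed statement could then be passed to cbq with -s, as the loop in the script already does.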

deepkaran.salooja · October 12, 2016, 7:51pm

@btburnett3, I think you can try to set --index-storage-setting=memopt when doing server-add for index service nodes. Right now you are using --index-storage-setting in cluster-init but the “services” is only “data”. Or you can specify index service as well for cluster-init.
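
As a rough sketch of that suggestion, the attach step could pass the storage setting whenever the node being added runs the index service. The helper name server_add_flags is hypothetical; the flags themselves are the ones already used in this thread. It only prints the flags, so they can be reviewed or spliced into the existing server-add call.

```shell
# Print the server-add flags for one node.
# $1 = node address, $2 = comma-separated services, $3 = index storage setting (e.g. memopt)
server_add_flags() {
  flags="--server-add=$1:8091 --services=$2"
  case ",$2," in
    # Only index nodes need the storage setting.
    *,index,*) flags="$flags --index-storage-setting=$3" ;;
  esac
  echo "$flags"
}

# Example, using the index node address from the script above:
#   server_add_flags 10.75.2.21 index memopt
```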

btburnett3 · October 12, 2016, 8:45pm

@deepkaran.salooja Okay, I can try that. For the sake of my scripting, what would happen if I add two index nodes and put that setting on both server-adds? Or if I include it on non-index nodes?

Brant

deepkaran.salooja · October 12, 2016, 9:05pm

@btburnett3, having it on both index nodes should be fine as long as it is the same value (mixed mode is not supported). For non-index nodes, it will be ignored.

btburnett3 · October 12, 2016, 9:28pm

@deepkaran.salooja

Okay, that should make it simple. I’ll give it a try on my next restore and let you know.

Thanks,
Brant

btburnett3 · December 21, 2016, 4:20pm

@deepkaran.salooja

Finally got around to recreating the cluster, and your suggestions worked. Thanks!

Brant