Couchbase Hadoop Connector - Count mismatch?


#1

The Couchbase Hadoop connector retrieves ~1.8 million records to HDFS, whereas the bucket has ~2.25 million records.

The Sqoop job succeeds without any issues. The only error message we can find in the logs is - "ERROR vbucket.VBucketNodeLocator: Critical reconfiguration error: Server list from Configuration and Nodes are out of synch. causing serverName1:11210 to be removed".

16/06/13 11:10:04 INFO client.CouchbaseClient: CouchbaseConnectionFactory{bucket=‘bucketName’, nodes=[http://serverName:8091/pools/], order=RANDOM, opTimeout=2500, opQueue=16384, opQueueBlockTime=10000, obsPollInt=10, obsPollMax=500, obsTimeout=5000, viewConns=10, viewTimeout=75000, viewWorkers=1, configCheck=10, reconnectInt=1100, failureMode=Redistribute, hashAlgo=NATIVE_HASH, authWaitTime=2500}
16/06/13 11:10:04 INFO client.CouchbaseClient: viewmode property isn’t defined. Setting viewmode to production mode
16/06/13 11:10:04 INFO client.CouchbaseConnection: Shut down Couchbase client
16/06/13 11:10:04 INFO client.ViewConnection: I/O reactor terminated
16/06/13 11:10:04 WARN split.JobSplitWriter: Max block location exceeded for split: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 splitsize: 103 maxsize: 10

Env - CDH 5.5.x, Couchbase 3.1 (5 node cluster with 2 replicas), Couchbase hadoop plugin


#2

Is this a transitory error or is it repeatable?


#3

It is repeatable. Every time I get the same ERROR messages and the same record count.


#4

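Issue resolved. The node was removed by VBucketNodeLocator because its hostname differed only in case: the server configuration has every hostname in upper case, while the sqoop --connect argument was in lower case. It looks like the Hadoop connector's hostname check is case sensitive. Losing one of five nodes is also consistent with the counts, since 4/5 of ~2.25 million is ~1.8 million, assuming the data is spread evenly across the nodes.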


#5

Thanks for the update. One question: is the host portion of the URL being treated as case sensitive? It should not be, and that behavior would be confusing. (I'm thinking of RFC 3986 here as "the way everyone expects this to work".)

So, these three are all the same:

example.com
ExAmPlE.com
EXAMPLE.COM 

The scheme is also case insensitive, but the path is case sensitive, so these are different:

example.com/a 
example.com/A
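
As a rough illustration of that rule, here is a minimal Java sketch that compares only the host part of two URLs after lower-casing it. The class and method names (HostCase, normalizeHost, sameHost) are made up for this example and are not part of the connector or the Couchbase client. Note that java.net.URI keeps the host's case as written, so the lower-casing has to be done explicitly.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper, not connector code: compares URL hosts ignoring case.
class HostCase {
    // Extract the host and lower-case it; URI.getHost() preserves the original case.
    static String normalizeHost(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host in URL: " + url);
        }
        return host.toLowerCase(Locale.ROOT);
    }

    // Two URLs refer to the same host if the normalized hosts are equal.
    static boolean sameHost(String a, String b) {
        return normalizeHost(a).equals(normalizeHost(b));
    }

    public static void main(String[] args) {
        // Hosts differ only in case, so they match.
        System.out.println(sameHost("http://example.com/a", "http://EXAMPLE.COM/a")); // true

        // Paths are compared as written, so /a and /A stay different.
        String p1 = URI.create("http://example.com/a").getPath();
        String p2 = URI.create("http://example.com/A").getPath();
        System.out.println(p1.equals(p2)); // false
    }
}
```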

#6

Sqoop Job -
sqoop import -Dmapreduce.map.log.level=DEBUG --connect http://server1.domainweb.local:8091/pools/ --table DUMP --username device-storage --password blahblah -m 20 --target-dir /user/yarn/pre-prod/couchbase_data/device-storage/ --verbose

Log -
16/06/13 11:10:04 INFO client.CouchbaseConnection: Added {QA sa=server1.domainweb.local/10.0.128.53:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
16/06/13 11:10:04 INFO client.CouchbaseConnection: Added {QA sa=SERVER2.DOMAINWEB.LOCAL/10.0.128.54:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
16/06/13 11:10:04 INFO client.CouchbaseConnection: Added {QA sa=SERVER3.DOMAINWEB.LOCAL/10.0.128.59:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
16/06/13 11:10:04 INFO client.CouchbaseConnection: Added {QA sa=SERVER4.DOMAINWEB.LOCAL/10.0.128.255:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
16/06/13 11:10:04 INFO client.CouchbaseConnection: Added {QA sa=SERVER5.DOMAINWEB.LOCAL/10.0.128.61:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
16/06/13 11:10:04 ERROR vbucket.VBucketNodeLocator: Critical reconfiguration error: Server list from Configuration and Nodes are out of synch. causing SERVER1.DOMAINWEB.LOCAL:11210 to be removed
16/06/13 11:10:04 INFO client.CouchbaseClient: CouchbaseConnectionFactory{bucket=‘device-storage’, nodes=[http://server1.domainweb.local:8091/pools/], order=RANDOM, opTimeout=2500, opQueue=16384, opQueueBlockTime=10000, obsPollInt=10, obsPollMax=500, obsTimeout=5000, viewConns=10, viewTimeout=75000, viewWorkers=1, configCheck=10, reconnectInt=1100, failureMode=Redistribute, hashAlgo=NATIVE_HASH, authWaitTime=2500}

Just by changing the case of the hostname in the Sqoop connect URL, we are able to resolve the issue:

sqoop import -Dmapreduce.map.log.level=DEBUG --connect http://SERVER1.DOMAINWEB.LOCAL:8091/pools/ --table DUMP --username device-storage --password blahblah -m 20 --target-dir /user/yarn/pre-prod/couchbase_data/device-storage/ --verbose


#7

I am assuming this is the connector repo.

A HashMap is used for storing the hostnames.

The following loop adds the servers from the node config with null values.

The nodes are then added to the same HashMap behind a containsKey check. HashMap keys are case sensitive, so the node whose server name differs in case is never added.

Since the value in the HashMap is still null for the server name with the changed case, that server is removed in line # 238.

FIX
Change the HashMap to a TreeMap with case-insensitive ordering in line # 211.

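// Current: keys are compared case sensitively
HashMap<String, MemcachedNode> vbnodesMap =
    new HashMap<String, MemcachedNode>();

// Proposed (needs import java.util.TreeMap): keys are compared ignoring case
TreeMap<String, MemcachedNode> vbnodesMap =
    new TreeMap<String, MemcachedNode>(String.CASE_INSENSITIVE_ORDER);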
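
To make the difference concrete, here is a small standalone sketch of the pattern described above. It is not the connector's actual code: the class NodeMapDemo and the method countUnmatched are hypothetical, and plain strings stand in for MemcachedNode. The map is seeded with the configured server names and null values, a value is filled in only when containsKey finds the connected name, and the entries that are still null are counted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical reproduction of the lookup pattern; strings stand in for MemcachedNode.
class NodeMapDemo {
    // Seed the map with configured names, attach connected nodes via containsKey,
    // and return how many configured names never got a node (these would be removed).
    static int countUnmatched(Map<String, String> map, String[] configured, String[] connected) {
        for (String name : configured) {
            map.put(name, null);
        }
        for (String name : connected) {
            if (map.containsKey(name)) {
                map.put(name, "node:" + name);
            }
        }
        int unmatched = 0;
        for (String value : map.values()) {
            if (value == null) {
                unmatched++;
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        String[] configured = {"SERVER1.DOMAINWEB.LOCAL:11210", "SERVER2.DOMAINWEB.LOCAL:11210"};
        String[] connected = {"server1.domainweb.local:11210", "SERVER2.DOMAINWEB.LOCAL:11210"};

        // HashMap: the lower-case name is not found, so one server stays null.
        System.out.println(countUnmatched(new HashMap<String, String>(), configured, connected)); // 1

        // TreeMap with CASE_INSENSITIVE_ORDER: both names are found.
        System.out.println(countUnmatched(
            new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER), configured, connected)); // 0
    }
}
```

An alternative with the same effect would be to lower-case the names before they are used as keys, which keeps the HashMap. The TreeMap change touches fewer lines.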