Sync Gateway admin service not available after restart of server


#1

I have a Couchbase Community Edition server (6.0.0) running on CentOS 7.5 and a SyncGateway on the same box.

If I restart the box then the _admin service refuses a connection (HTTP: ERR_CONNECTION_REFUSED)… Only if I restart the sync.gateway service will it accept calls again.

Is there any way I can configure the sync.gateway to “re-connect” if it starts before the Couchbase server is available?


#2

What platform and version of Sync Gateway are you running?

The raw binary will retry its connection attempt for around a minute before abandoning. However, the standard service scripts automatically restart the sync gateway process on failure, which would allow it to continue retrying until Couchbase Server is accepting connections.


#3

Thanks for your response.

Sync Gateway is version 2.1.1 (Community Edition).

From what I understand then it sounds as it is more likely that I have a configuration problem… I have a setup quite similar to this on my dev. server - and have not seen this problem there.

So if I try to open port 4985 now then I get the ERR_CONNECTION_REFUSED. I have tried to look at the sg_error.log and the sync_gateway_error.log files. The first only contains this line:

2018-12-04T07:34:43.300+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====

and the latter has this:

2018-12-04T07:34:43.246+01:00 [WRN] Using deprecated config option: log. Use logging.console.log_keys instead. -- rest.(*ServerConfig).deprecatedConfigLoggingFallback() at config.go:623
2018-12-04T07:34:43.300+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-04T07:34:43.300+01:00 [INF] Console LogKeys: [HTTP HTTP+]
2018-12-04T07:34:43.300+01:00 [INF] Console LogLevel: info
2018-12-04T07:34:43.300+01:00 [INF] Log Redaction Level: none
2018-12-04T07:34:43.300+01:00 [INF] requestedSoftFDLimit < currentSoftFdLimit (5000 < 65535) no action needed
2018-12-04T07:34:43.300+01:00 [INF] Opening db /data as bucket "data", pool "default", server <http://localhost:8091>
2018-12-04T07:34:43.323+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.579+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.584+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.588+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.610+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.627+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.654+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.655+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.696+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.697+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.780+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.782+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:43.953+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:43.955+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:44.280+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:44.282+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:44.923+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:44.924+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:46.204+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:46.207+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:48.767+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:48.769+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:34:54.069+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:34:54.071+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:35:04.312+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:35:04.314+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-04T07:35:24.794+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:35:24.818+01:00 [INF] Successfully opened bucket

which confirms what you say.

If I look at the status of the service (systemctl status sync_gateway) then I get this:

● sync_gateway.service - Couchbase Sync Gateway server
   Loaded: loaded (/usr/lib/systemd/system/sync_gateway.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-12-04 07:34:41 CET; 6min ago
  Process: 5171 ExecStartPre=/bin/chown -R sync_gateway:sync_gateway /home/sync_gateway/data (code=exited, status=0/SUCCESS)
  Process: 5169 ExecStartPre=/bin/mkdir -p /home/sync_gateway/data (code=exited, status=0/SUCCESS)
  Process: 5163 ExecStartPre=/bin/chown -R sync_gateway:sync_gateway /home/sync_gateway/logs (code=exited, status=0/SUCCESS)
  Process: 5153 ExecStartPre=/bin/mkdir -p /home/sync_gateway/logs (code=exited, status=0/SUCCESS)
 Main PID: 5172 (bash)
   CGroup: /system.slice/sync_gateway.service
           ├─5172 /usr/bin/bash -c /opt/couchbase-sync-gateway/bin/sync_gateway --defaultLogFilePath "/home/sync_gateway/logs" /home/sync_gateway/sync_gateway.json >> /home/sync_gateway/logs/sync_gateway_acce...
           └─5173 /opt/couchbase-sync-gateway/bin/sync_gateway --defaultLogFilePath /home/sync_gateway/logs /home/sync_gateway/sync_gateway.json

Dec 04 07:34:40 db1.dalsgaard-data.dk systemd[1]: Starting Couchbase Sync Gateway server...
Dec 04 07:34:41 db1.dalsgaard-data.dk systemd[1]: Started Couchbase Sync Gateway server.

which I suppose also looks Ok.

So if this is really a config problem then I would have guessed it should be consistent. However, if I restart the service (systemctl restart sync_gateway) then I can open port 4985 right away…

And then the following is added to the sync_gateway_error.log file:

2018-12-04T07:42:40.373+01:00 [WRN] Using deprecated config option: log. Use logging.console.log_keys instead. -- rest.(*ServerConfig).deprecatedConfigLoggingFallback() at config.go:623
2018-12-04T07:42:40.373+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-04T07:42:40.373+01:00 [INF] Console LogKeys: [HTTP HTTP+]
2018-12-04T07:42:40.373+01:00 [INF] Console LogLevel: info
2018-12-04T07:42:40.373+01:00 [INF] Log Redaction Level: none
2018-12-04T07:42:40.373+01:00 [INF] requestedSoftFDLimit < currentSoftFdLimit (5000 < 65535) no action needed
2018-12-04T07:42:40.373+01:00 [INF] Opening db /data as bucket "data", pool "default", server <http://localhost:8091>
2018-12-04T07:42:40.373+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-04T07:42:40.378+01:00 [INF] Successfully opened bucket
2018-12-04T07:42:40.478+01:00 [INF] Design docs for current SG view version (2.1) found.
2018-12-04T07:42:40.478+01:00 [INF] Verifying view availability for bucket data...
2018-12-04T07:42:44.255+01:00 [INF] Views ready for bucket data.
2018-12-04T07:42:44.256+01:00 [INF] Initializing changes cache for database data
2018-12-04T07:42:44.735+01:00 [INFO] Using plain authentication for user <ud>remoteSync</ud> 
2018-12-04T07:42:44.740+01:00 [INF] Using metadata purge interval of 3.00 days for tombstone compaction.
2018-12-04T07:42:44.747+01:00 [INFO] Using plain authentication for user <ud>remoteSync</ud> 
2018-12-04T07:42:44.822+01:00 [INF]     Reset guest user to config
2018-12-04T07:42:44.822+01:00 [INF] Starting admin server on 0.0.0.0:4985
2018-12-04T07:42:44.825+01:00 [INF] Starting server on 0.0.0.0:4984 ...
2018-12-04T07:42:45.873+01:00 [INF] HTTP:  #001: GET /_all_dbs (as ADMIN)
2018-12-04T07:42:45.873+01:00 [INF] HTTP+: #001:     --> 200   (0.1 ms)
2018-12-04T07:42:46.022+01:00 [INF] HTTP:  #002: GET /favicon.ico/ (as ADMIN)
2018-12-04T07:42:46.022+01:00 [INF] HTTP: #002:     --> 400 invalid database name "favicon.ico"  (0.1 ms)

The first contents of the sync_gateway.json configuration is:

	"log": ["HTTP+"],
	"adminInterface": "0.0.0.0:4985",
	"interface": "0.0.0.0:4984",
	"databases": {
		"data": {
		"use_views":true,
           	"bucket": "data",
	        "server": "http://localhost:8091",
            	"username": "remoteSync",
            	"password": "5bAEGXcb2yNR3sWfRq23Dzajf",
            	"enable_shared_bucket_access": true,
            	"import_docs": "continuous",
            	"users": { "GUEST": { "disabled": true, "admin_channels": ["!"] } },
               :

The server is hidden behind a proxy from the outside world - and therefore it is not “open” to everybody outside on port 4985 ::slight_smile:


#4

Hi Adam

Any ideas on this? It is very consistent…! Here are a tail of a couple of log files:

sg_error.log

2018-12-04T07:34:43.300+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-04T07:42:40.373+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-13T09:10:02.504+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====

sg_info.log

2018-12-13T09:10:05.214+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:07.775+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:07.776+01:00 [INF] Auth: Attempting credential authentication http://localhost:8091?http_idle_conn_timeout=90000&http_max_idle_conns=64000&http_max_idle_conns_per_host=256
2018-12-13T09:10:07.777+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:13.278+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:13.278+01:00 [INF] Auth: Attempting credential authentication http://localhost:8091?http_idle_conn_timeout=90000&http_max_idle_conns=64000&http_max_idle_conns_per_host=256
2018-12-13T09:10:13.280+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:23.520+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:23.520+01:00 [INF] Auth: Attempting credential authentication http://localhost:8091?http_idle_conn_timeout=90000&http_max_idle_conns=64000&http_max_idle_conns_per_host=256
2018-12-13T09:10:23.529+01:00 [INF] Successfully opened bucket

sync_gateway_error.log

2018-12-13T09:10:03.930+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:03.933+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:05.213+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:05.214+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:07.775+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:07.777+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:13.278+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:13.280+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:23.520+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:23.529+01:00 [INF] Successfully opened bucket

I tried to open the admin console using: http://db1.my-domain.dk:4985/_admin/ - and I just get:

This site can’t be reached db1.dalsgaard-data.dk refused to connect.
Search Google for db1 my domain 4985 admin
ERR_CONNECTION_REFUSED

And it stays like that - I have even tried to leave it for days…

So issuing systemctl restart sync_gateway on the server will restart Sync Gateway - and then it responds on port 4985.

The log files now show (inlcuding the last few lines from before):

sg_error.log

2018-12-04T07:34:43.300+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-04T07:42:40.373+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-13T09:10:02.504+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-13T09:20:58.702+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====

sg_info.log

2018-12-13T09:10:23.520+01:00 [INF] Auth: Attempting credential authentication http://localhost:8091?http_idle_conn_timeout=90000&http_max_idle_conns=64000&http_max_idle_conns_per_host=256
2018-12-13T09:10:23.529+01:00 [INF] Successfully opened bucket
2018-12-13T09:20:58.702+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-13T09:20:58.702+01:00 [INF] Console LogKeys: [HTTP HTTP+]
2018-12-13T09:20:58.702+01:00 [INF] Console LogLevel: info
2018-12-13T09:20:58.702+01:00 [INF] Log Redaction Level: none
2018-12-13T09:20:58.702+01:00 [INF] requestedSoftFDLimit < currentSoftFdLimit (5000 < 65535) no action needed
2018-12-13T09:20:58.702+01:00 [INF] Opening db /data as bucket "data", pool "default", server <http://localhost:8091>
2018-12-13T09:20:58.702+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:20:58.702+01:00 [INF] Auth: Attempting credential authentication http://localhost:8091?http_idle_conn_timeout=90000&http_max_idle_conns=64000&http_max_idle_conns_per_host=256
2018-12-13T09:20:58.708+01:00 [INF] Successfully opened bucket
2018-12-13T09:20:58.775+01:00 [INF] Design docs for current SG view version (2.1) found.
2018-12-13T09:20:58.775+01:00 [INF] Verifying view availability for bucket data...
2018-12-13T09:20:59.374+01:00 [INF] Views ready for bucket data.
2018-12-13T09:20:59.375+01:00 [INF] Cache: Initializing changes cache with options {ChannelCacheOptions:{ChannelCacheMinLength:0 ChannelCacheMaxLength:0 ChannelCacheAge:0s} CachePendingSeqMaxWait:5s CachePendingSeqMaxNum:10000 CacheSkippedSeqMaxWait:1h0m0s}
2018-12-13T09:20:59.375+01:00 [INF] Initializing changes cache for database data
2018-12-13T09:20:59.375+01:00 [INF] DCP: Starting mutation feed on bucket data due to either channel cache mode or doc tracking (auto-import/bucketshadow)
2018-12-13T09:20:59.375+01:00 [INF] DCP: Using DCP feed for bucket: "data" (based on feed_type specified in config file)
2018-12-13T09:20:59.811+01:00 [INF] Using metadata purge interval of 3.00 days for tombstone compaction.
2018-12-13T09:20:59.830+01:00 [INF]     Reset guest user to config
2018-12-13T09:20:59.830+01:00 [INF] Starting admin server on 0.0.0.0:4985
2018-12-13T09:20:59.836+01:00 [INF] Starting server on 0.0.0.0:4984 ...
2018-12-13T09:20:59.872+01:00 [INF] DCP: Backfill in progress: 0% (4 / 10377)
2018-12-13T09:21:00.260+01:00 [INF] Cache: Received #463322 (unused sequence)
2018-12-13T09:21:00.878+01:00 [INF] DCP: Backfill complete
2018-12-13T09:21:03.217+01:00 [INF] HTTP:  #001: GET /_all_dbs (as ADMIN)
2018-12-13T09:21:03.217+01:00 [INF] HTTP+: #001:     --> 200   (0.1 ms)
2018-12-13T09:21:03.375+01:00 [INF] HTTP:  #002: GET /favicon.ico/ (as ADMIN)
2018-12-13T09:21:03.375+01:00 [INF] HTTP: #002:     --> 400 invalid database name "favicon.ico"  (0.1 ms)

sync_gateway_error.log

2018-12-13T09:10:13.280+01:00 [INF] Error opening bucket: failed to connect to any of the specified hosts
2018-12-13T09:10:23.520+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:10:23.529+01:00 [INF] Successfully opened bucket
2018-12-13T09:20:58.701+01:00 [WRN] Using deprecated config option: log. Use logging.console.log_keys instead. -- rest.(*ServerConfig).deprecatedConfigLoggingFallback() at config.go:623
2018-12-13T09:20:58.702+01:00 ==== Couchbase Sync Gateway/2.1.1(17;fea9947) ====
2018-12-13T09:20:58.702+01:00 [INF] Console LogKeys: [HTTP HTTP+]
2018-12-13T09:20:58.702+01:00 [INF] Console LogLevel: info
2018-12-13T09:20:58.702+01:00 [INF] Log Redaction Level: none
2018-12-13T09:20:58.702+01:00 [INF] requestedSoftFDLimit < currentSoftFdLimit (5000 < 65535) no action needed
2018-12-13T09:20:58.702+01:00 [INF] Opening db /data as bucket "data", pool "default", server <http://localhost:8091>
2018-12-13T09:20:58.702+01:00 [INF] GoCBCustomSGTranscoder Opening Couchbase database data on <http://localhost:8091> as user "remoteSync"
2018-12-13T09:20:58.708+01:00 [INF] Successfully opened bucket
2018-12-13T09:20:58.775+01:00 [INF] Design docs for current SG view version (2.1) found.
2018-12-13T09:20:58.775+01:00 [INF] Verifying view availability for bucket data...
2018-12-13T09:20:59.374+01:00 [INF] Views ready for bucket data.
2018-12-13T09:20:59.375+01:00 [INF] Initializing changes cache for database data
2018-12-13T09:20:59.811+01:00 [INF] Using metadata purge interval of 3.00 days for tombstone compaction.
2018-12-13T09:20:59.814+01:00 [INFO] Using plain authentication for user <ud>remoteSync</ud> 
2018-12-13T09:20:59.829+01:00 [INFO] Using plain authentication for user <ud>remoteSync</ud> 
2018-12-13T09:20:59.830+01:00 [INF]     Reset guest user to config
2018-12-13T09:20:59.830+01:00 [INF] Starting admin server on 0.0.0.0:4985
2018-12-13T09:20:59.836+01:00 [INF] Starting server on 0.0.0.0:4984 ...
2018-12-13T09:21:03.217+01:00 [INF] HTTP:  #001: GET /_all_dbs (as ADMIN)
2018-12-13T09:21:03.217+01:00 [INF] HTTP+: #001:     --> 200   (0.1 ms)
2018-12-13T09:21:03.375+01:00 [INF] HTTP:  #002: GET /favicon.ico/ (as ADMIN)
2018-12-13T09:21:03.375+01:00 [INF] HTTP: #002:     --> 400 invalid database name "favicon.ico"  (0.1 ms)

I guess the error on favicon.ico comes as I open the admin console in a browser - and thus can be ignored.

Any ideas what to do? I have not seen this problem on the prior release. And I have the same version installed on another CentOS 6.10 box - where this is not a problem either (though that is a single server installation).

This is a cluster of two nodes - problem is consistent across both. OS i s CentOS 7.6, Couchbase Community Edition 6.0 and Sync Gateway 2.1.1

The sync_gateways.json file looks like this (leaving out the sync formula):

{
        "log": ["HTTP+"],
        "adminInterface": "0.0.0.0:4985",
        "interface": "0.0.0.0:4984",
        "databases": {
                "data": {
                "use_views":true,
                "bucket": "data",
                "server": "http://localhost:8091",
                "username": "remote.....",
                "password": ".........",
                "enable_shared_bucket_access": true,
                "import_docs": "continuous",
                "users": { "GUEST": { "disabled": true, "admin_channels": ["!"] } },
                "sync": `function (doc, oldDoc) {
                :