Can sync gateway channels be used in Couchbase server


#1

We’re using channels in Sync Gateway to define the dataset / view of data that is synced to mobile, which looks to do what we need. We also need web users to have an identical view of the data according to the same channel rules.
Currently we’re accessing the documents on the web via the Java SDK, direct to the Couchbase bucket.
Is there a way to query the data in the bucket via Couchbase using the same channel rules?

Having examined the raw data in Couchbase that’s served through Sync Gateway, it’s not clear where the channels are defined. Do they translate to views that could be used in Couchbase?

using couchbase v5 beta & sync gateway 1.5 beta


#2

update:

When looking at a raw document via the Sync Gateway admin tool there is a _sync property which is not visible when viewing the same document through Couchbase. I believe this is new in CB 5 & SG 1.5, to prevent sync metadata being stripped when mutating through CB.

e.g.

"name": "Bob Smith",
"_attachments": {},
"_sync": {
  "rev": "1-e837006f23e9e64520760b2c1ab2c249",
  "sequence": 968,
  "recent_sequences": [
    968
  ],
  "channels": {
    "customer_f8a8ecb9-27d6-4c84-8a12-64d4b0d1fb9a": null,
    "sutton_2nd_office": null
  },
  "cas": "0x0000a82a887ddd14",
  "time_saved": "2017-08-23T14:44:34.395401096+01:00"
},

Presumably sync gateway is fetching all documents higher than the sequence of the request where the sync.channels match the channels of the requesters user/role.

building a similar query via couchbase does not seem possible since _sync is not visible via n1ql either.

select * from syncgw where type = 'customer' and _sync.channels.`customer_f8a8ecb9-27d6-4c84-8a12-64d4b0d1fb9a` is null

The query contains the following fields not found in the inferred schema for their bucket: _sync.channels.customer_f8a8ecb9-27d6-4c84-8a12-64d4b0d1fb9a


#3

A seemingly obvious answer to this is to add a channels array to the document itself and have the sync function copy these directly into _sync.channels.
e.g.
name: "bobsmith",
channels: [
"abc", "def"
]

function(doc) {
channel(doc.channels);
}

Then in CB we can write an N1QL query which fetches only documents where channels contains "abc", which I suspect is all SG is doing to build the channel view.

will report back.
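
For anyone following along, here is a rough sketch of that idea. It is not how Sync Gateway works internally; `runSyncFunction` is a hypothetical local harness that stands in for the `channel()` built-in so the routing can be checked outside Sync Gateway, and the `$channel` named parameter in the query text is an assumption about how the web side would pass the value.

```javascript
// Hypothetical local harness: runs a sync function and records the channels it assigns.
// In Sync Gateway itself, channel() is supplied by the runtime rather than passed in.
function runSyncFunction(syncFn, doc) {
  const assigned = [];
  const channel = (...args) => {
    for (const arg of args) {
      if (Array.isArray(arg)) assigned.push(...arg);
      else if (arg != null) assigned.push(arg); // null/undefined assigns nothing
    }
  };
  syncFn(doc, channel);
  return assigned;
}

// Same logic as the sync function above, with channel injected for testing.
const syncFn = (doc, channel) => {
  channel(doc.channels);
};

// Equivalent N1QL predicate on the document's own channels array (bucket name from this thread).
const channelQuery =
  "SELECT * FROM syncgw WHERE ANY c IN channels SATISFIES c = $channel END";

console.log(runSyncFunction(syncFn, { name: "bobsmith", channels: ["abc", "def"] }));
console.log(channelQuery);
```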


#4

The sync metadata is added by and used by the Sync Gateway for syncing documents over to the clients. It’s not used by CB server. With SG 1.5/CB5.0, the sync metadata is no longer stored in the document and that’s why you don’t see it in the document via the admin console… It is stored in XAttrs. There is currently no way to N1QL query for or view the XAttr information via CB.

Well. I am not so sure if it’s a good idea design-wise to update your application’s data model to retrofit it with data to deal with the way couchbase handles data routing.

If all you are doing is querying for the data, is there any reason why your web app can’t use the REST API exposed by Sync Gateway to query for this information?
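
As a rough illustration of the REST route, the sketch below builds a request URL for the Sync Gateway `_changes` feed filtered by channel. The host, port and database name are assumptions for a local setup, and `changesFeedUrl` is a hypothetical helper, not part of any SDK.

```javascript
// Hypothetical helper: builds a _changes URL restricted to the given channels.
function changesFeedUrl(baseUrl, db, channels) {
  const params = new URLSearchParams({
    filter: "sync_gateway/bychannel", // Sync Gateway's built-in channel filter
    channels: channels.join(","),
    include_docs: "true",
  });
  return `${baseUrl}/${encodeURIComponent(db)}/_changes?${params}`;
}

// Assumed local Sync Gateway public port and the bucket name used in this thread.
const feedUrl = changesFeedUrl("http://localhost:4984", "syncgw", [
  "customer_f8a8ecb9-27d6-4c84-8a12-64d4b0d1fb9a",
  "sutton_2nd_office",
]);
console.log(feedUrl);
```

The request would still need to be authenticated as the relevant user so that Sync Gateway applies that user's channel access.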

Alternatively, assuming that the sync function routes documents to different channels based on some criteria, why don’t you use the same criteria for querying the documents from server? (Example : If you are using “name” property for assigning channels, why don’t you issue a N1QL query for documents with “name” equal to specific value)


#5

Hi Priya,

I had assumed that all the Sync Gateway metadata was stored with the document in the hidden ‘_sync’ attribute. Are you saying that there are additional xattrs stored separately from the document JSON? I’m interested to understand more about the internals if we’re to use this in production.

I’ve not tried querying via the REST api, presumably this is all view based with no n1ql support. I’d also assumed that working directly with n1ql via the SDK would be more performant for the webapp.

Regarding the use of a channels attribute in the doc, I did start with a sync function that mapped the ‘name’ attr to a channel, but felt that it was simpler for the developer to reason with a permission system based on channels when writing an N1QL query, rather than having to refer back to the sync function to create the equivalent query. We have quite a complex permissions system which will require several channels defined on each document and user, and this needs to be adhered to whether syncing data to the app or viewing via the web.


#6

Hi
In Couchbase Server 5.0, there is no way to query for XAttr directly via N1QL (perhaps in future releases). While we see the potential need for use in development environments, we are not fully clear of its need in production environments.

As indicated above, in lieu of querying the SG, which is probably not what you want in your use case, your best option is to construct the query equivalent of the sync function logic.
I am wary of plugging in channel information into the document itself - you are effectively applying the sync function logic during document creation (Basically if you include channels property in document, then you don’t need to have the data routing logic in sync gateway and you lose the flexibility that a sync function offers)

(Tagging @adamf for insight and opinion)


#7

Querying rev info via N1QL is necessary for applications that use both N1QL and SG. For example, a web application can use N1QL to query data including the rev info, and then update the document via the SG REST API using that rev.
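
To make the second half of that concrete, here is a minimal sketch of the update step once the current rev is known. How the rev is obtained is left open; `buildUpdateRequest` is a hypothetical helper, and the host, port and database name are assumptions.

```javascript
// Hypothetical helper: describes a PUT to the Sync Gateway REST API that updates
// an existing document by supplying its current revision in the rev query parameter.
function buildUpdateRequest(baseUrl, db, docId, rev, body) {
  const url =
    `${baseUrl}/${encodeURIComponent(db)}/${encodeURIComponent(docId)}` +
    `?rev=${encodeURIComponent(rev)}`;
  return {
    url,
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// Example values only: the rev is the one from the raw document earlier in the thread.
const updateRequest = buildUpdateRequest(
  "http://localhost:4984",
  "syncgw",
  "task_123",
  "1-e837006f23e9e64520760b2c1ab2c249",
  { name: "Bob Smith" }
);
console.log(updateRequest.method, updateRequest.url);
```

The returned object can be handed to whatever HTTP client the web app uses; authentication is omitted here.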


#8

I’ll try and describe what I’m trying to achieve to add some context.

We’ve got a large number of customers (~10,000); for each customer there are ~30 JSON forms and ~20,000 JSON tasks.
Access to this information is via the web and via an app which needs the data available offline.
Permission control is handled via groups, and a single customer could belong to several groups. If the user is a member of a group then they can see the customer/forms/tasks, typically ~100-300 customers per user.
It won’t be practical to sync all the permitted customer data to the app, so we’ll sync all the customers & forms and the latest 7 days of task data.

The approach I’m currently taking is to define the list of groups to which a customer belongs on each customer, document & task JSON. This is being done with a channels array which translates directly to channels in Sync Gateway. Additionally we’d add a channel to each task for its date, so the app will sync all customers & forms that it has access to and will request tasks in the channels for the latest 7 days (I believe couchbase-lite/pouchdb can select channels to sync).
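
A minimal sketch of the per-day channel idea, assuming a `day_YYYY-MM-DD` naming scheme in UTC (the naming scheme and both helper names are made up for illustration). The sync function would assign `dateChannel(taskDate)` to each task, and the app would request the list returned by `lastNDaysChannels`.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical naming scheme: one channel per UTC calendar day.
function dateChannel(date) {
  return "day_" + date.toISOString().slice(0, 10);
}

// Channel names for today and the n - 1 days before it, newest first.
function lastNDaysChannels(today, n) {
  const channels = [];
  for (let i = 0; i < n; i++) {
    channels.push(dateChannel(new Date(today.getTime() - i * DAY_MS)));
  }
  return channels;
}

console.log(lastNDaysChannels(new Date("2017-08-23T12:00:00Z"), 7));
```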

If the device has access to the internet the remaining tasks can be fetched at runtime via an api which the webservice provides.
Web users will also be subject to the same permission rules via webservice APIs and a single page app. The webservices would be created using the Java SDK and N1QL queries which would apply the channel security constraint to every request.

I’d be interested to hear anyone’s view on this approach.

Generally the recommendation seems to be that if I’m using Sync Gateway then all communications with the DB should be via Sync Gateway, but the REST API seems limited compared with native CB functionality. It’s not clear how I’d achieve the complex queries over large data sets that the web APIs will require.


#9
  • To clarify, the recommendation applied more to pre SG 1.5 / CBS 5.0. With SG 1.5 / CBS 5.0 (which is what you are presumably using), you don’t always need to go through SG. SG imports documents from CBS via DCP even if they are uploaded via the web app, and adds the appropriate sync metadata in the document XAttr. The imported documents are processed via the sync function. These documents are available to mobile clients.
    So you shouldn’t have to reimplement the sync function’s permission rules within your web service for documents that are uploaded from within the web app. The Sync Gateway should take care of it.
  • We don’t have a good way to implement time series data, although this has come up before. We have a GitHub issue to track this. If you can add specifics on what you would like, this will greatly help with prioritization of this issue. Until that feature is available, you can use the approach you described: have a channel associated with the date and filter on that. If it’s just the last 7 days, maybe you can have some process running daily on your server that manually moves all the old documents to a "history" channel.

But we don’t have a way to query metadata via N1QL in 5.0 other than the _raw API on the Sync Gateway.
@atom_yang : Have you filed a GitHub ticket requesting this capability for future versions?


#10

ok FYI


#11

Thanks @atom_yang.

@adam.hawkins : I have added notes on supporting the ability to get sync channel info via N1QL to the ticket above. Please add your own notes as appropriate.
As indicated earlier, the sync metadata is expected to be used internally by the SG. We can’t provide any guarantees around backwards compatibility in the _sync metadata structure, so we don’t recommend users attempt to code against that information directly.


#12

Thanks Atom, Priya.

I’ve updated my model to hold a ‘security’ array on the document rather than call it channels, and a sync[Valid/Expiry] pair which declares the date range in which it should be included in the sync. e.g.
_id: 'task_123',
syncValid: '2017-06-24 00:00:00',
syncExpiry: '2017-07-02 00:00:00',
security: ['butcher', 'baker', 'candlestickmaker']

My sync function copies the security rules into channels, so that generally we sync everything you have security access to. However, this is not practical for high volume, time sensitive information, where we only want to sync a small time window of the full dataset you have access to, so the function does not propagate the channels if the document is out of range. A background process will need to poke these documents every 24 hrs to re-evaluate the sync function.

function (doc) {
  var now = new Date('2017-05-01'); // mock date for testing
  // if syncExpiry is set and is at or before now, the doc's sync window has expired
  var expired = doc.syncExpiry && (new Date(doc.syncExpiry) <= now);
  // if syncValid is set and is at or after now, the doc is not yet valid
  var notValid = doc.syncValid && (new Date(doc.syncValid) >= now);

  if (expired || notValid) {
    // outside the sync window: assign no channels
  } else {
    channel(doc.security); // copy the security groups into channels
  }
}
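
To check the window logic without the mocked date, the same rule can be pulled out into a pure function that takes `now` as an argument. This is only a sketch: `channelsForDoc` is a hypothetical helper, it assumes the groups live in the `security` array from the example document, and it uses ISO 8601 timestamps with a `Z` suffix so that `Date` parsing does not depend on the local time zone.

```javascript
// Hypothetical helper: returns the channels a doc should be routed to at time `now`.
function channelsForDoc(doc, now) {
  // syncExpiry set and at or before now: the sync window has closed
  const expired = Boolean(doc.syncExpiry) && new Date(doc.syncExpiry) <= now;
  // syncValid set and at or after now: the sync window has not opened yet
  const notValid = Boolean(doc.syncValid) && new Date(doc.syncValid) >= now;
  return expired || notValid ? [] : doc.security || [];
}

const task = {
  _id: "task_123",
  syncValid: "2017-06-24T00:00:00Z",
  syncExpiry: "2017-07-02T00:00:00Z",
  security: ["butcher", "baker", "candlestickmaker"],
};

console.log(channelsForDoc(task, new Date("2017-06-25T00:00:00Z"))); // inside the window
console.log(channelsForDoc(task, new Date("2017-05-01T00:00:00Z"))); // before the window
```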

The user can still access the data outside of this range by making live requests either via the web or the app to an api which queries couchbase directly using n1ql

select * from pass a WHERE ANY sec IN security SATISFIES sec WITHIN ['baker', 'candlestickmaker'] END

This all seems to work fine, but I’m getting slow (~12 s) response times with 600k docs from this query using the index
CREATE INDEX idx_security ON pass ( ALL DISTINCT ARRAY sec FOR sec IN security END );

but i’ll raise this separately in the n1ql forums.
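
For reference, a sketch of how the web API might apply the security constraint to every request with a named parameter instead of inlining the group list. `buildSecurityQuery` is a hypothetical helper and how the parameters are handed to the SDK is left out. Note that it uses `IN` rather than the `WITHIN` from the query above, on the assumption that the group list is a flat array; whether that changes how the index is used is not something this sketch verifies.

```javascript
// Hypothetical helper: pairs the N1QL statement with the caller's group list as $groups.
function buildSecurityQuery(groups) {
  if (!Array.isArray(groups) || groups.length === 0) {
    throw new Error("at least one security group is required");
  }
  return {
    statement:
      "SELECT a.* FROM pass a " +
      "WHERE ANY sec IN a.security SATISFIES sec IN $groups END",
    parameters: { groups: groups.slice() }, // copy so later changes by the caller don't leak in
  };
}

const securityQuery = buildSecurityQuery(["baker", "candlestickmaker"]);
console.log(securityQuery.statement);
console.log(securityQuery.parameters);
```

Refusing an empty group list keeps a request with no permissions from ever reaching the query service.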


#13

Yeah- it’s a bit redundant to have the “security” property that mimics the channels but I guess since you need that info for your N1QL queries, we probably have to live with it.

Since you are only interested in the last N days, How about having a single timestamp and then having a separate job that moves the documents older than N to an “archive” channel ?


#14

Hi Priya,

That won’t quite work, because we have both historical data older than 7 days and planned work further than 7 days into the future, neither of which should be in the sync view.


#15

Hi!

we also want to build a PWA App that runs offline and online.
We have to use PouchDB since CB Lite doesn’t support JavaScript.

We have the same problem juggling the permission structure and channels with Sync Gateway and Server Rest API (n1ql at the Backend).

I think couchbase needs a feature that syncs the Roles / Permission / Channels between SG and CB Server.

Best!