Explicitly selecting an INDEX to use

Hi,

I was wondering if you’re planning on being able to explicitly select an index to use in the N1QL syntax.

e.g. you could replace something like

SELECT email FROM bucket WHERE email LIKE ‘%@yahoo.com

could be replaced by creating an index that splits the emails into [domain,user] and then use an expression a bit like

SELECT email FROM bucket WHERE [domain] IN INDEX indexName SATISFIES domain = ‘yahoo.com

I know that this particular example could be handled completely with the REST API, and indeed would be more efficient, but there are more complicated examples where it wouldn’t be so easy/possible to use just the REST API without some application logic.

Thanks.

Yes, you will be able to create indexes on expressions, and then use those indexes in your queries.

In your example:

CREATE INDEX domain ON bucket (SUBSTR(email, POSITION(email, ‘@’)))

SELECT email FROM bucket WHERE SUBSTR(email, POSITION(email, ‘@’)) = ‘@yahoo.com

This query would use the created index.

OK, that sounds good. I’m guessing from the way you wrote that that it’s a feature for the future, and that it’s not yet ready in DP3 (as it didn’t work on your tutorial either).

The thing is that I’m specifically thinking about indexes that you create through views. I like the idea of being able to have the logic of the map separated from the N1QL, because it allows for all the complexity of JavaScript etc, but to have access to these possibly complex indexes directly from N1QL. I think it would be relatively (compared to other things you’re doing) to add a way to explicitly access a view’s index from N1QL

Is this something you’re planning?

DP4 will support some expression indexes. The GA release will support all expression indexes.

Note that N1QL does not support manually created views; only views created via N1QL’s CREATE INDEX statement. The idea is that N1QL will give you enough richness to define any index that would be usable for querying.

Do you have some specific logic in mind?

Thanks.

Not specifically right now, but I’m sure I would fairly soon. I can see the advantage of giving the possibility of accessing manually-created indexes, though. There’s a lot of flexibility in JavaScript that is not yet available in N1QL, and even if you have it in N1QL, it might be more elegant to write in JavaScript in the first place (from a dev point of view).

Is there any reason that you guys would have against it? I know that you probably are trying to completely separate N1QL logic from Couchbase logic, even if you’re generating it behind the scenes.

I’ll try to think of an example that would be good to have a mix of manually-defined views with N1QL that one or the other on their own wouldn’t work very well.

Any idea when DP4 will be released?

Thanks.

DP4 will released in a couple of weeks.

We are very much in favor of our JavaScript view indexes :smile: The challenge is being able to use them as lookup indexes for N1QL. In order to use a view as a lookup index, N1QL must be able to understand the definition of the view.

Looking forward to your example.
Cheers.

Yes. That’s what I was trying to get at.

What I had in my mind immediately was a subset case, where there is only a mapping.

N1QL could receive the objects in order based on the index, but would not receive the index itself. It would just need to search based on the index (with start/end probably), and then hand the docs back to the rest of the N1QL query to perform their normal tasks.

Essentially it would work in the same way as just defining an index on e.g. ‘name’. The reason why I see it as being easy to implement is that I assume that the underlying process for using indexes on specific fields probably already does this, so it would just be extending a feature you’ve already implemented.

Having an understanding of the meaning of the index, especially if there’s a reduce, is obviously much harder, but that wasn’t what I was thinking of. In fact I’m not sure how useful N1QL would be with reduce’s. I’m only thinking of the case with maps only.

OK, here’s an example that I think might demonstrate a usage:

Imagine you’ve got a list of people, with name etc, and a list of their experience with the age, like (I’ve left out quotes for speed):

{name:name,experience:[{company:co1,start_date:date,end_date},{…}]

You then create an index on the number of total years’ experience, which you call (originally) yearsExperience.

Then you want to perform some other N1QL queries, but using the ordering of the yearsExperience as the index. I think this would be a valid reason for wanting to access the yearsExperience index, even though it’s not defined in N1QL.

In general, indexes that aren’t directly based on key values, but are calculated from them, could warrant such usage.

Good news about DP4 - looking forward to trying out the new features. What’s the timeline for general availability?

Cheers.

Hi,

Thanks for the example. This is the type of usage we had in mind for computed indexes. You can do the following:

CREATE INDEX yearsExperience ON mybucket (ARRAY_SUM (ARRAY … END));

The important thing is you can create indexes on most expressions and functions, not just fields in the data.

Finally, note that indexes are used to process WHERE clauses. They are not used to process ORDER BY clauses. You cannot specify the order in which N1QL processes records; you can only specify the order in which it produces output results, using the ORDER BY clause.

GA is around mid-2015, and there will be a beta before that.

OK, I accept that it’s possible to do this particular example using N1QL. I guess if the range of expressions is large, that you’d be able to do pretty much everything in N1QL that you could with JavaScript. I just would like a way to access user-defined views from N1QL, since I’m sure that there are some situations where the code is more elegant in JavaScript.

Anyway, the most important thing is having the functionality, and if that’s already there, then great.

How many of the expressions listed on https://github.com/couchbaselabs/query/blob/master/docs/n1ql-select.md are actually implemented in N1QL?

Thanks for taking the time to reply.

All of the expressions in that spec are implemented in DP4.

A subset of them are indexable in DP4, and the rest will be indexable in GA.

Also, GA will include new secondary indexes specifically for N1QL. These will provide another mechanism to create indexes for N1QL. They will not provide JavaScript or MapReduce, but should provide faster lookups.

Please let us know your experience with DP4 in a couple of weeks.

Thanks.

Great, I shall look forward to trying it out.

Come GA, will the new secondary indexes be stored in the Couchbase or in the N1QL server - or will the two essentially be merged into one so there will only be one product rather than two?

Thanks.

Come GA, it will be one Couchbase server cluster, one install, one product, and 3 different services that can be enabled for each node in the cluster (data, index, query). Pretty exciting architecture.

Yeah, I’m really looking forward to it. I think I’ve decided to go with CB for the main database in my project, so I’m really looking forward to seeing it develop.

On a separate (sorry it’s off-topic here, but it’s quick to ask issue), would CB potentially be interested in getting invoved with (or at least offering more flexible licensing terms with the enterprise version) for sites that are for charitable purposes. I ask because I’m setting up a project to try to generate money for people in the third world, and would like to use CB as the DB. Would it be worth contacting someone in your sales department?

N1QL will be included in both the community and enterprise editions. The community edition is free. If you need the enterprise edition, you can certainly inquire with sales.

OK. The community version will probably be fine to begin with, so long as there aren’t any major bugs. Will there be a delay in the launching of the community version that has N1QL, or will the first version with it be launched at the same time as the enterprise version?

Also, will the differences between the Enterprise and Standard versions of Couchbase be only related to the main product, i.e. will the N1QL engine that’s provided with both versions of the same release be the same?

Yes, N1QL will be the same in both editions. In the past, the community edition is released after the enterprise edition.

http://blog.couchbase.com/n1ql-dp4-is-here