Best way to store docs for n1ql and subdoc api

KevinK · February 1, 2017, 6:39pm

I’m trying to query data using n1ql and mutate the data using subdocs and am trying to figure out the best way to store the documents.
Let’s say I have a document that has “people” in it (along with lots of other fields). I want to be able to search by a person and perform crud operation on a person (on a subdoc level).

The 2 ways I came up with are:

 {
   lots of other info
   "people" : {
        "bob" : { stuff about bob },
        "jim" : {stuff about jim}
    }
 }

OR:

{
  lots of other info
  "people" : [
       {"name" : "bob",  stuff about bob },
       {"name" : "jim",  stuff about jim}
   ]
}

For choice 1: (having the id as the “key”). I can’t seem to write a good n1ql query that uses indexes. To find the documents that “jim” is in, I can query using “people.jim IS VALUED”, but that doesn’t use indexes (and I can’t use a prepared statement).

For choice 2 (having the name as a field in an array). Indexes will work fine, but the subdoc crud operations aren’t as clean.
For instance, when I want to do an create, update or delete, I would need to select the whole list, modify it then “put” it (and worry about cas).
Using option 1, I would only to do an upsert or delete a subdoc using the person’s name (atomic and super easy).

Am I missing something in either n1ql or the subdoc api that would make this easier (or an entirely different way to store the doc)

Thanks

mnunberg · February 1, 2017, 7:36pm

You wouldn’t need to modify the whole list, though you would need to fetch the list and use CAS.

I’m unsure about the rest of this though

geraldss · February 1, 2017, 7:41pm

If your people id’s are unique, you can use choice 1 and we will help you with the indexes and N1QL queries. You need to be on 4.5.1 or above.

KevinK · February 1, 2017, 7:49pm

Geraldss,
That’s great news!!
Choice 1 is the one I’d like to go with.
We are currently on 4.5.0, but I could upgrade.
What would the indexes (and query) look like?

Thanks for the help

KevinK · February 1, 2017, 8:32pm

geraldss,
Just looking at new stuff, there is my guess (tell me if I’m in the right ballpark)

CREATE INDEX idx_people ON bucket(DISTINCT ARRAY t FOR t in TOKENS(people, {“specials”: true}) END);

(I needed specials:true)

The a select of:

select * from bucket WHERE ANY t in TOKENS(people, {“specials”:true}) SATISFIES t=‘jim’ END;

The explain plan on this looked good (had an index scan at the beginning).

If that’s right, I think I’m in business. Thanks again for the help.

geraldss · February 1, 2017, 9:26pm

TOKENS would be 4.6.0, which you are welcome to use (will GA shortly).

We also have a function OBJECT_NAMES in 4.5.0+, and you can index the results of that.

CREATE INDEX idx_person ON my_bucket( DISTINCT ARRAY n FOR n IN OBJECT_NAMES( people ) END );

EXPLAIN
SELECT *
FROM my_bucket
WHERE ANY n IN OBJECT_NAMES( people ) SATISFIES n = "jim" END;

Note that the variable n must match in both index and query statements.

KevinK · February 1, 2017, 9:33pm

Wow, awesome!!
Thanks for the help. OBJECT_NAMES work way better for me than TOKENS.

Thanks for all the help

geraldss · February 1, 2017, 11:03pm

BTW, pretty impressive that you created the right index and query using TOKENS! Wow!

At some point, you should write a dzone article about your use case. cc @keshav_m