Best way to store docs for n1ql and subdoc api

I’m trying to query data using n1ql and mutate the data using subdocs and am trying to figure out the best way to store the documents.
Let’s say I have a document that has “people” in it (along with lots of other fields). I want to be able to search by a person and perform crud operation on a person (on a subdoc level).

The 2 ways I came up with are:

 {
   lots of other info
   "people" : {
        "bob" : { stuff about bob },
        "jim" : {stuff about jim}
    }
 }

OR:

{
  lots of other info
  "people" : [
       {"name" : "bob",  stuff about bob },
       {"name" : "jim",  stuff about jim}
   ]
}

For choice 1: (having the id as the “key”). I can’t seem to write a good n1ql query that uses indexes. To find the documents that “jim” is in, I can query using “people.jim IS VALUED”, but that doesn’t use indexes (and I can’t use a prepared statement).

For choice 2 (having the name as a field in an array). Indexes will work fine, but the subdoc crud operations aren’t as clean.
For instance, when I want to do an create, update or delete, I would need to select the whole list, modify it then “put” it (and worry about cas).
Using option 1, I would only to do an upsert or delete a subdoc using the person’s name (atomic and super easy).

Am I missing something in either n1ql or the subdoc api that would make this easier (or an entirely different way to store the doc)

Thanks

You wouldn’t need to modify the whole list, though you would need to fetch the list and use CAS.

I’m unsure about the rest of this though

If your people id’s are unique, you can use choice 1 and we will help you with the indexes and N1QL queries. You need to be on 4.5.1 or above.

Geraldss,
That’s great news!!
Choice 1 is the one I’d like to go with.
We are currently on 4.5.0, but I could upgrade.
What would the indexes (and query) look like?

Thanks for the help

geraldss,
Just looking at new stuff, there is my guess (tell me if I’m in the right ballpark)

CREATE INDEX idx_people ON bucket(DISTINCT ARRAY t FOR t in TOKENS(people, {“specials”: true}) END);

(I needed specials:true)

The a select of:

select * from bucket WHERE ANY t in TOKENS(people, {“specials”:true}) SATISFIES t=‘jim’ END;

The explain plan on this looked good (had an index scan at the beginning).

If that’s right, I think I’m in business. Thanks again for the help.

TOKENS would be 4.6.0, which you are welcome to use (will GA shortly).

We also have a function OBJECT_NAMES in 4.5.0+, and you can index the results of that.

CREATE INDEX idx_person ON my_bucket( DISTINCT ARRAY n FOR n IN OBJECT_NAMES( people ) END );

EXPLAIN
SELECT *
FROM my_bucket
WHERE ANY n IN OBJECT_NAMES( people ) SATISFIES n = "jim" END;

Note that the variable n must match in both index and query statements.

1 Like

Wow, awesome!!
Thanks for the help. OBJECT_NAMES work way better for me than TOKENS.

Thanks for all the help

2 Likes

BTW, pretty impressive that you created the right index and query using TOKENS! Wow! :slight_smile:

At some point, you should write a dzone article about your use case. cc @keshav_m