N1QL query with second level nesting

I have a normalized model with this data:

INSERT INTO testBucket (key, value) VALUES ("order1", {"documentType" : "order", "orderNumber" : "3434223"});
INSERT INTO testBucket (key, value) VALUES ("order2", {"documentType" : "order", "orderNumber" : "3434224"});
INSERT INTO testBucket (key, value) VALUES ("order3", {"documentType" : "order", "orderNumber" : "3434225"});

INSERT INTO testBucket (key, value) VALUES ("product1", {"documentType" : "product", "type" : "physical", "displayName" : "Server"});
INSERT INTO testBucket (key, value) VALUES ("productPart1", {"documentType" : "productPart", "name" : "Part 1", "product" : "product1"});
INSERT INTO testBucket (key, value) VALUES ("productPart2", {"documentType" : "productPart", "name" : "Part 2", "product" : "product1"});
INSERT INTO testBucket (key, value) VALUES ("product2", {"documentType" : "product", "type" : "service", "displayName" : "N1QL training"});
INSERT INTO testBucket (key, value) VALUES ("productPart3", {"documentType" : "productPart", "name" : "Part A", "product" : "product2"});
INSERT INTO testBucket (key, value) VALUES ("productPart4", {"documentType" : "productPart", "name" : "Part B", "product" : "product2"});
INSERT INTO testBucket (key, value) VALUES ("product3", {"documentType" : "product", "type" : "hybrid", "displayName" : "Remote training kit"});
INSERT INTO testBucket (key, value) VALUES ("productPart5", {"documentType" : "productPart", "name" : "Part I", "product" : "product3"});
INSERT INTO testBucket (key, value) VALUES ("productPart6", {"documentType" : "productPart", "name" : "Part II", "product" : "product3"});
INSERT INTO testBucket (key, value) VALUES ("product4", {"documentType" : "product", "type" : "physical", "displayName" : "Phablet"});
INSERT INTO testBucket (key, value) VALUES ("productPart7", {"documentType" : "productPart", "name" : "Part 1/2", "product" : "product4"});
INSERT INTO testBucket (key, value) VALUES ("productPart8", {"documentType" : "productPart", "name" : "Part 2/2", "product" : "product4"});

INSERT INTO testBucket (key, value) VALUES ("orderItem1", {"documentType" : "orderItem", "fulfillment" : "Online", "order" : "order1", "product" : "product1"});
INSERT INTO testBucket (key, value) VALUES ("orderItem2", {"documentType" : "orderItem", "fulfillment" : "Delivery", "order" : "order1", "product" : "product2"});
INSERT INTO testBucket (key, value) VALUES ("orderItem3", {"documentType" : "orderItem", "fulfillment" : "Pickup", "order" : "order1", "product" : "product3"});
INSERT INTO testBucket (key, value) VALUES ("orderItem4", {"documentType" : "orderItem", "fulfillment" : "Delivery", "order" : "order2", "product" : "product1"});
INSERT INTO testBucket (key, value) VALUES ("orderItem5", {"documentType" : "orderItem", "fulfillment" : "Pickup", "order" : "order3", "product" : "product4"});

The test model consists of orders, products, order items. Apart from this, each product has some parts. Everything is normalized. (Note: this is a fictional model, which I forced to reflect my actual relationships in a more general context).

My goal is to query the data and receive it denormalized. For example, being able to get a single order with its products (first complexity level) and each product with its parts (second complexity level).

The first complexity level was feasible to me with a query like this:

SELECT orders.orderNumber,
    ARRAY_AGG( {
      "productName": p.displayName,
      "productType": p.type}) AS products
  FROM testBucket orderItems
    INNER JOIN testBucket orders
      ON KEYS orderItems.`order`
    INNER JOIN testBucket p
      ON KEYS orderItems.product
  WHERE orderItems.documentType = "orderItem"
  GROUP BY orders.orderNumber

But my final goal is to retrieve each product with a “parts” property with an array of the names of its parts.

A second ARRAY_AGG was my first choice, with no success. A subquery was next and then NEST clause.

Any ideas on how to address this with a single query?

1 Like

Are you doing a 1 to 1 mapping of your tables to JSON?

For this part of the model, yes. Not always, but I am doing this for document types on which I need to perform multiple queries filtering specific properties. When this is not the case, I rather embed them as sub-documents.

We will take a look today and propose something. -Gerald

Try denomilizing the order & order Items.

Couchbase 4.5 (dev preview) can index on items in an Arrays.
http://developer.couchbase.com/documentation/server/4.5-dp/indexing-arrays.html

Thanks @househippo for your answer. As I mentioned, the model I showed is basically reflecting my actual model, on which denormalizing for the involved tables is not an option, since the three of them are queried intensively on different combinations of attributes. I would know how to do it with a denormalized model, but my question is actually about doing this on a normalized model.

Hi @rlogman. Took a closer look. You can use NEST:

SELECT *
FROM orders
NEST orderItems ON KEY orderItems.order FOR orders
NEST products ON KEYS ARRAY oi.product FOR oi IN orderItems END
WHERE orders.orderNumber = "XYZ";

Thanks @geraldss. With the same data included on the original post, I tried with (only adding the bucket name):

SELECT *
FROM testBucket orders
NEST testBucket orderItems ON KEY orderItems.`order` FOR orders
NEST testBucket products ON KEYS ARRAY oi.product FOR oi IN orderItems END
WHERE orders.orderNumber = "XYZ";

But now it shows this error message:

[{"code":4100,"msg":"No index available for join term orderItems"},{"original_query":
"select cbq_query_workbench_limit.* from (SELECT *\nFROM testBucket orders\nNEST
testBucket orderItems ON KEY orderItems.`order` FOR orders\nNEST testBucket products
ON KEYS ARRAY oi.product FOR oi IN orderItems END\nWHERE
orders.orderNumber = \"XYZ\") cbq_query_workbench_limit limit 500;"}]

Am I missing something to be able to use this NEST?

Yes. Please create an index on testBucket.order.

I created the index, and the error is gone, but I think you misunderstood my initial purpose. The sample N1QL I put on the question does work, but I need one level more of complexity.

What I need, if I put WHERE orderNumber="3434223", is a structure similar to:

{
  "orderNumber": "3434223",
  "products": [
    {
      "productName": "N1QL training",
      "productType": "service",
      "parts": [{ "partName": "Part A" }, { "partName": "Part B" }]   // This is needed in the new query
    },
    {
      "productName": "Remote training kit",
      "productType": "hybrid",
      "parts": [{ "partName": "Part I" }, { "partName": "Part I" }]  // This is needed in the new query
    },
    {
      "productName": "Server",
      "productType": "physical",
      "parts": [{ "partName": "Part 1" }, { "partName": "Part 2" }]  // This is needed in the new query
    }
  ]
}

My requirements are:

  • Filter by order number: done
  • Show the the products of the order: done
  • For each product, show its parts: missing

Ok. You can do:

SELECT s.orderNumber, ARRAY_AGG(s.prod) AS products
FROM
(
    SELECT orders.orderNumber, { "productName": p.displayName, "productType": p.type, "parts": ARRAY_AGG(productPart) } AS prod
    FROM testBucket orders
    JOIN testBucket orderItem ON KEY orderItem.`order` FOR orders
    JOIN testBucket product p ON KEYS orderItem.product
    JOIN testBucket productPart ON KEY productPart.product FOR p
    WHERE orders.orderNumber = "XYZ"
    GROUP BY orders.orderNumber, p
) AS s
GROUP BY s.orderNumber;
1 Like

That made the trick @geraldss. I only removed the double alias product p (leaving only p), but apart from that, the sentence was perfect. The final working version was:

SELECT s.orderNumber, ARRAY_AGG(s.prod) AS products
  FROM (SELECT orders.orderNumber, { "productName": p.displayName, "productType": p.type,
      "parts": ARRAY_AGG(productPart) } AS prod
    FROM testBucket orders
    JOIN testBucket orderItem ON KEY orderItem.`order` FOR orders
    JOIN testBucket p ON KEYS orderItem.product
    JOIN testBucket productPart ON KEY productPart.product FOR p
    WHERE productPart.documentType = "productPart"
      AND orders.orderNumber = "XYZ"
    GROUP BY orders.orderNumber, p
  ) AS s
  GROUP BY s.orderNumber;

Thanks for your deep involvement on this.

2 Likes

Hi Gerald/rlogman,

How do you create index on “testBucket.order”? Can you please paste the exact command?

Hi @vancourse,

CREATE INDEX idx ON testBucket( order );

Thank you for the lightning fast response…
Looks like order is a keyword in Couchbase… I had to put in a quote around it for the index to succeed…

CREATE INDEX idx ON testBucket( ‘order’);

1 Like

Make sure you are using back ticks, `, not quotes, to escape the keyword order