N1QL query Filtering data within an array

query
n1ql
#1

I have a document that looks like this:

   {
	  "startTime": 123,
	  "endTime": 932345235,
	  "type": "results",
	  "validationResults": [
	    {
	      "name": "xyz",
	      "ruleViolated": 1,
	      "details": [
	        {
	          "timeStamp": 132345235,
	          "place": "abc",
	          "acknowledged": false
	        },
	        {
	          "timeStamp": 7658,
	          "place": "bcd",
	          "acknowledged": true
	        }
	      ]
	    },

	    {
	      "name": "xyz",
	      "ruleViolated": 2,
	      "details": [
	        {
	          "timeStamp": 132345235,
	          "place": "abc",
	          "acknowledged": false
	        },
	        {
	          "timeStamp": 7658,
	          "place": "bcd",
	          "acknowledged": true
	        }
	      ]
	    },

	    {
	      "name": "random",
	      "ruleViolated": 1,
	      "details": [
	        {
	          "timeStamp": 9679,
	          "place": "abc",
	          "acknowledged": false
	        },
	        {
	          "timeStamp": 12412,
	          "place": "bcd",
	          "acknowledged": true
	        }
	      ]
	    }

	  ]
	}

Here the startTime and endTime is the range for which the check was made while creating the doc.

Now I want the results within a certain time range [x,y] such that for every result in validation results, their details data get filtered within that range and i get the data in the same format with details being in the required range.

I tried using the query:
SELECT a.* FROM bucketName a unnest a.validationResults as b unnest b.details as c where ( (a.startTime <= 1462042800000 AND a.endTime >= 1462042800000) OR (a.startTime >= 1462042800000 AND a.startTime <= 1462043025018) ) AND (c.timeStamp between 1462042800000 and 1462043025018) and a.type=“results”

Issue is a.* is returning me every possible datapoint and filtering is only done when i do c.* but that is not the format that i want.
Any help is appreciated

#2
SELECT a.* , vr AS validationResults
FROM bucketName AS a
LET vr = (ARRAY OBJECT_PUT(b,"details",
                          (ARRAY c
                           FOR c IN b.details
                           WHEN c.timeStamp BETWEEN 1462042800000 AND 1462043025018
                           END))
         FOR b IN a.validationResults
         END)
WHERE a.type="results" AND ((a.startTime <= 1462042800000 AND a.endTime >= 1462042800000)
      OR (a.startTime >= 1462042800000 AND a.startTime <= 1462043025018));

If there is no inner array match and don’t want project the document.

SELECT a.* , ARRAY v FOR v IN vr WHEN ARRAY_LENGTH(v.details) > 0 END AS validationResults
FROM bucketName AS a
LET vr = (ARRAY OBJECT_PUT(b,"details",
                          (ARRAY c
                           FOR c IN b.details
                           WHEN c.timeStamp BETWEEN 1462042800000 AND 1462043025018
                           END))
         FOR b IN a.validationResults
         END)
WHERE a.type="results" AND ((a.startTime <= 1462042800000 AND a.endTime >= 1462042800000)
      OR (a.startTime >= 1462042800000 AND a.startTime <= 1462043025018))
      AND ANY v IN vr SATISIFES ARRAY_LENGTH(v.details) > 0 END;
1 Like
#3

Thanks for the reply @vsr1 . This is exactly the query I was looking for. Any tips on indexing to improve the performance?

#4

create index ix1 on bucketName(startTime, endTime) WHERE type = “results”

1 Like
#5

@vsr1 If I have multiple documents and one or more “name” keys are same in both then all the results should be in one array object. Basically I want only 1complete details array for 1 unique name.

#6
SELECT b.name, ARRAY_AGG(c) AS details
FROM bucketName AS a
UNNEST a.validationResults AS b
UNNEST b.details AS c
WHERE a.type="results" AND ((a.startTime <= 1462042800000 AND a.endTime >= 1462042800000)
            OR (a.startTime >= 1462042800000 AND a.startTime <= 1462043025018))
      AND (c.timeStamp BETWEEN 1462042800000 AND 1462043025018)
GROUP BY b.name;

OR

Replace query (FROM (query) AS t) from the previous working query.

SELECT RAW ARRAY_AGG(OBJECT_PUT(t,"validationResults",
                     ARRAY c FOR c IN t.validationResults WHEN c.name = v.name END))
     FROM ( query ) AS t
UNNEST t.validationResults AS v
GROUP BY v.name;
#7

@vsr1 The query above produces the desired result but it is taking too long to process. Around a minute when I have around 40,000 docs in the bucket.