Indexing & Querying Array for Aggregation

Hi there.

Looking to get some help with this issue.

I’ve got a large number of documents: 10,000,000 of them.

Documents look like:

{
  "Type": "scores",
  "StaffMember": "04828dd8-09d9-47fe-8a9d-87d23d109d84",
  "Departments": [
    "215fecad-c50c-4b3a-aa79-8f6a8dcd5179",
    "0b97f6f9-b62f-4a27-b6ed-5af0a1461756"
  ],
  "DateTimeUtc": "2019-07-28T07:57:23.3965708+01:00",
  "Jobs": [
    {
      "Name": "Associate",
      "RawValue": 2,
      "Score": 3,
      "Type": "Representative",
      "Duration": 241,
      "Weight": 60
    },
    {
      "Name": "Representative",
      "RawValue": 10,
      "Score": 5,
      "Type": "Analyst",
      "Duration": 1424,
      "Weight": 83
    },
    {
      "Name": "Assistant",
      "RawValue": 10,
      "Score": 1,
      "Type": "Supervisor",
      "Duration": 1100,
      "Weight": 18
    }
  ],
  "stats": {
    "RunDurationInSeconds": 2128,
    "TestDurationInSeconds": 2121,
    "PreTestDurationInSeconds": 8541,
  }
}

There are about 300 departments and around 5000 staff, there are around 27 distinct jobs

Struggling to get a good query performance here

I’d like to be able to say for a single Department, what each Staff Members Job scores (with some calculations in) and stats were.

I’ve got the following query, which currently provide me with this without the stats section:

SELECT d.StaffMember, 
       m[0] AS JobName, 
       SUM(m[1]) AS ScoreDurationWeight, 
       SUM(m[2]) AS DurationWeight, 
       SUM(m[3]) AS Duration, 
       SUM(m[4]) AS RawScore 
FROM ( 
    SELECT a.StaffMember, ARRAY [j.Name,
                  IFMISSINGORNULL(j.Score,0) * IFMISSINGORNULL(j.Duration,0) * IFMISSINGORNULL(j.Weight,0),
                  IFMISSINGORNULL(j.Duration,0) * IFMISSINGORNULL(j.Weight,0),
                  IFMISSINGORNULL(j.Duration,0),
                  IFMISSINGORNULL(j.RawScore,0)] 
           FOR j IN Jobs END as n
	    FROM staffjobscores AS a
    WHERE a.Type = "scores"
        AND a.DateTimeUtc > "2019-01-01" AND a.DateTimeUtc <= "2019-12-31"
	    AND ANY o IN a.Departments SATISFIES o = "a3598df8-f8fd-41a7-b837-1e7a8e12de1a" END
) 
AS d
UNNEST d.n as m
WHERE m[0] IS NOT NULL
GROUP BY d.StaffMember, m[0]

using this index:

CREATE INDEX `ixd_staffjobs_array` 
ON `staffjobscores`(StaffMember, `DateTimeUtc`,(distinct (`Departments`)),array [(`j`.`Name`), ((ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), ifmissingornull((`j`.`Duration`), 0), ifmissingornull((`j`.`RawScore`), 0)] for `j` in `Jobs` end) WHERE (`Type` = "scores")

Now this takes around 10 seconds to run. I really need to add in the Stats column into this, and I’d like to bring the query time down if possible.

Can anyone here suggest a better indexing strategy?

I need to be able to provide the following Filters / Results from the documents:

Filter: Staff Member/ Department/ Date … Output: Calculations by Job, Sum of Stats for the staff member provided.
Filter: Department/ Date … Output: Calculations by Job, Sum of Stats, per Staff Member within the provided department.
Filter: Department/ Date … Output: Calculations by Job, Sum of Stats for the whole department.

I’ve also tried looking at https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/groupby-aggregate-performance.html
And looking at my query plan, I’m getting the Old method and not the new method.

Any help would be greatly appreciated, my other option is to look at creating views, though not sure if they will scale better than N1QL.

Many thanks.

1 Like

Try the following index (https://blog.couchbase.com/create-right-index-get-right-performance, Rule#11).

CREATE INDEX `ixd_staffjobs_array` 
ON `staffjobscores`( DISTINCT  Departments,`DateTimeUtc`,  StaffMember, 
    ARRAY  [(`j`.`Name`), (
                       (ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), 
                      (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), 
                       ifmissingornull((`j`.`Duration`), 0), 
                       ifmissingornull((`j`.`RawScore`), 0)] for `j` in `Jobs` end) 
WHERE (`Type` = "scores")

Also if you need combination of Score/Duration/Weight you can store individual values and do multification inside your SUM

Thanks @vsr1 when I try this one I don’t actually even get any results.
It doesn’t seem to like having the Distinct Departments as the first entity in the index. If I move it to second or third entity I get results back.
I’m using Couchbase 6.0.1 Build 2037

Execute query provide output from PlanText Tab

Thanks @vsr1 for the continued support.
Output as requested:

> {
>   "#operator": "Sequence",
>   "#stats": {
> "#phaseSwitches": 1
>   },
>   "~children": [
> {
>   "#operator": "Authorize",
>   "#stats": {
>     "#phaseSwitches": 3,
>     "servTime": "3.9998ms"
>   },
>   "privileges": {
>     "List": [
>       {
>         "Target": "default:staffjobscores",
>         "Priv": 7
>       }
>     ]
>   },
>   "~child": {
>     "#operator": "Sequence",
>     "#stats": {
>       "#phaseSwitches": 1
>     },
>     "~children": [
>       {
>         "#operator": "DistinctScan",
>         "#stats": {
>           "#phaseSwitches": 5,
>           "kernTime": "2.0002ms"
>         },
>         "scan": {
>           "#operator": "IndexScan3",
>           "#stats": {
>             "#phaseSwitches": 5,
>             "servTime": "2.0002ms"
>           },
>           "as": "a",
>           "covers": [
>             "cover ((distinct ((`a`.``))))",
>             "cover ((`a`.`DateTimeUtc`))",
>             "cover ((`a`.`StaffMember`))",
>             "cover ((meta(`a`).`id`))"
>           ],
>           "filter_covers": {
>             "cover ((`a`.`Type`))": "scores",
>             "cover (any `o` in (`a`.`Departments`) satisfies \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\" end)": true
>           },
>           "index": "ixd_Departments_DateTimeUtc_StaffMember",
>           "index_id": "aba3c068e28f8589",
>           "index_projection": {
>             "entry_keys": [
>               1,
>               2
>             ],
>             "primary_key": true
>           },
>           "keyspace": "staffjobscores",
>           "namespace": "default",
>           "spans": [
>             {
>               "exact": true,
>               "range": [
>                 {
>                   "high": "null",
>                   "inclusion": 0,
>                   "low": "null"
>                 }
>               ]
>             }
>           ],
>           "using": "gsi",
>           "#time_normal": "00:00.0020",
>           "#time_absolute": 0.0020002
>         }
>       },
>       {
>         "#operator": "Sequence",
>         "#stats": {
>           "#phaseSwitches": 1
>         },
>         "~children": [
>           {
>             "#operator": "Filter",
>             "#stats": {
>               "#phaseSwitches": 3,
>               "kernTime": "2.0002ms"
>             },
>             "condition": "((((cover ((`a`.`Type`)) = \"scores\") and (\"2019-01-01 00:00:00\" < cover ((`a`.`DateTimeUtc`)))) and (cover ((`a`.`DateTimeUtc`)) <= \"2019-12-31 23:59:59\")) and cover (any `o` in (`a`.`Departments`) satisfies \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\" end))"
>           },
>           {
>             "#operator": "InitialProject",
>             "#stats": {
>               "#phaseSwitches": 5,
>               "kernTime": "2.0002ms"
>             },
>             "result_terms": [
>               {
>                 "expr": "cover ((`a`.`StaffMember`))"
>               }
>             ]
>           },
>           {
>             "#operator": "FinalProject",
>             "#stats": {
>               "#phaseSwitches": 1
>             }
>           }
>         ]
>       }
>     ]
>   },
>   "#time_normal": "00:00.0039",
>   "#time_absolute": 0.0039998
> },
> {
>   "#operator": "Stream",
>   "#stats": {
>     "#phaseSwitches": 3,
>     "kernTime": "6ms"
>   }
> }
>   ],
>   "~versions": [
> "2.0.0-N1QL",
> "6.0.1-2037-enterprise"
>   ]
> }

hum. Drop ixd_Departments_DateTimeUtc_StaffMember index and

try this.

CREATE INDEX `ixd_Departments_DateTimeUtc_StaffMember` 
ON `staffjobscores`( DISTINCT  ARRAY v FOR v IN Departments END,`DateTimeUtc`,  StaffMember, 
    ARRAY  [(`j`.`Name`), (
                       (ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), 
                      (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), 
                       ifmissingornull((`j`.`Duration`), 0), 
                       ifmissingornull((`j`.`RawScore`), 0)] for `j` in `Jobs` end) 
WHERE (`Type` = "scores")

Thanks @vsr1

That is running returning values now though its still rather slow ~19 seconds,

If i add in a StaffMember to filter it uses the same index and works fast. ~400ms. (See further down)

See Plan for slow return for one department for year::

> {
>   "#operator": "Sequence",
>   "#stats": {
>     "#phaseSwitches": 1
>   },
>   "~children": [
>     {
>       "#operator": "Authorize",
>       "#stats": {
>         "#phaseSwitches": 3,
>         "servTime": "19.9986ms"
>       },
>       "privileges": {
>         "List": [
>           {
>             "Target": "default:staffjobscores ",
>             "Priv": 7
>           }
>         ]
>       },
>       "~child": {
>         "#operator": "Sequence",
>         "#stats": {
>           "#phaseSwitches": 1
>         },
>         "~children": [
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "DistinctScan",
>                 "#stats": {
>                   "#itemsIn": 61967,
>                   "#itemsOut": 61967,
>                   "#phaseSwitches": 247873,
>                   "execTime": "312.0153ms",
>                   "kernTime": "16.7818771s"
>                 },
>                 "scan": {
>                   "#operator": "IndexScan3",
>                   "#stats": {
>                     "#itemsOut": 61967,
>                     "#phaseSwitches": 247873,
>                     "execTime": "396.0237ms",
>                     "kernTime": "16.3984442s",
>                     "servTime": "130.4258ms"
>                   },
>                   "index": "ixd_CbDojo_2_Departments_DateTimeUtc_StaffMember",
>                   "index_id": "ae3ccaeb09ba3d62",
>                   "index_projection": {
>                     "primary_key": true
>                   },
>                   "keyspace": "staffjobscores ",
>                   "namespace": "default",
>                   "spans": [
>                     {
>                       "exact": true,
>                       "range": [
>                         {
>                           "high": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\"",
>                           "inclusion": 3,
>                           "low": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\""
>                         },
>                         {
>                           "high": "\"2019-12-31 23:59:59\"",
>                           "inclusion": 2,
>                           "low": "\"2019-01-01 00:00:00\""
>                         }
>                       ]
>                     }
>                   ],
>                   "using": "gsi",
>                   "#time_normal": "00:00.5264",
>                   "#time_absolute": 0.5264495
>                 },
>                 "#time_normal": "00:00.3120",
>                 "#time_absolute": 0.31201530000000005
>               },
>               {
>                 "#operator": "Fetch",
>                 "#stats": {
>                   "#itemsIn": 61967,
>                   "#itemsOut": 61967,
>                   "#phaseSwitches": 255617,
>                   "execTime": "502.4534ms",
>                   "kernTime": "2.884487s",
>                   "servTime": "13.7829489s"
>                 },
>                 "keyspace": "staffjobscores ",
>                 "namespace": "default",
>                 "#time_normal": "00:14.2854",
>                 "#time_absolute": 14.2854023
>               },
>               {
>                 "#operator": "Sequence",
>                 "#stats": {
>                   "#phaseSwitches": 1
>                 },
>                 "~children": [
>                   {
>                     "#operator": "Filter",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 247871,
>                       "execTime": "5.9440434s",
>                       "kernTime": "11.2268467s"
>                     },
>                     "condition": "(((((`staffjobscores `.`Type`) = \"dpm::score\") and (\"2019-01-01 00:00:00\" < (`staffjobscores `.`DateTimeUtc`))) and ((`staffjobscores `.`DateTimeUtc`) <= \"2019-12-31 23:59:59\")) and any `v` in (`staffjobscores `.`Departments`) satisfies (`v` = \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\") end)",
>                     "#time_normal": "00:05.9440",
>                     "#time_absolute": 5.9440434
>                   },
>                   {
>                     "#operator": "InitialProject",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 185906,
>                       "execTime": "11.7987572s",
>                       "kernTime": "1.0506581s"
>                     },
>                     "result_terms": [
>                       {
>                         "expr": "(`staffjobscores `.`StaffMember`)"
>                       },
>                       {
>                         "as": "n",
>                         "expr": "array [(`j`.`Name`), ((ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), ifmissingornull((`j`.`Duration`), 0), ifmissingornull((`j`.`RawScore`), 0)] for `j` in (`staffjobscores `.`Jobs`) end"
>                       }
>                     ],
>                     "#time_normal": "00:11.7987",
>                     "#time_absolute": 11.7987572
>                   },
>                   {
>                     "#operator": "FinalProject",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 185902,
>                       "execTime": "114.9695ms",
>                       "kernTime": "4.2201501s"
>                     },
>                     "#time_normal": "00:00.1149",
>                     "#time_absolute": 0.1149695
>                   }
>                 ]
>               }
>             ]
>           },
>           {
>             "#operator": "Alias",
>             "#stats": {
>               "#itemsIn": 61967,
>               "#itemsOut": 61967,
>               "#phaseSwitches": 247871,
>               "execTime": "293.5329ms",
>               "kernTime": "17.0233571s"
>             },
>             "as": "d",
>             "#time_normal": "00:00.2935",
>             "#time_absolute": 0.2935329
>           },
>           {
>             "#operator": "Unnest",
>             "#stats": {
>               "#itemsIn": 61967,
>               "#itemsOut": 247770,
>               "#phaseSwitches": 619477,
>               "execTime": "2.0373681s",
>               "kernTime": "15.3625431s"
>             },
>             "as": "m",
>             "expr": "(`d`.`n`)",
>             "#time_normal": "00:02.0373",
>             "#time_absolute": 2.0373681
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "Filter",
>                 "#stats": {
>                   "#itemsIn": 247770,
>                   "#itemsOut": 247770,
>                   "#phaseSwitches": 991083,
>                   "execTime": "1.2689639s",
>                   "kernTime": "16.1509253s"
>                 },
>                 "condition": "((`m`[0]) is not null)",
>                 "#time_normal": "00:01.2689",
>                 "#time_absolute": 1.2689639
>               },
>               {
>                 "#operator": "InitialGroup",
>                 "#stats": {
>                   "#itemsIn": 247770,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 498593,
>                   "execTime": "16.3172543s",
>                   "kernTime": "1.1276341s"
>                 },
>                 "aggregates": [
>                   "sum((`m`[1]))",
>                   "sum((`m`[2]))",
>                   "sum((`m`[3]))",
>                   "sum((`m`[4]))"
>                 ],
>                 "group_keys": [
>                   "(`d`.`StaffMember`)",
>                   "(`m`[0])"
>                 ],
>                 "#time_normal": "00:16.3172",
>                 "#time_absolute": 16.3172543
>               }
>             ]
>           },
>           {
>             "#operator": "IntermediateGroup",
>             "#stats": {
>               "#itemsIn": 1525,
>               "#itemsOut": 1525,
>               "#phaseSwitches": 6103,
>               "execTime": "7.7863ms",
>               "kernTime": "17.4851514s"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0077",
>             "#time_absolute": 0.0077862999999999995
>           },
>           {
>             "#operator": "FinalGroup",
>             "#stats": {
>               "#itemsIn": 1525,
>               "#itemsOut": 1525,
>               "#phaseSwitches": 6103,
>               "execTime": "52.0349ms",
>               "kernTime": "17.4998563s"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0520",
>             "#time_absolute": 0.0520349
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "InitialProject",
>                 "#stats": {
>                   "#itemsIn": 1525,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 4580,
>                   "execTime": "45.1872ms",
>                   "kernTime": "17.5088951s"
>                 },
>                 "result_terms": [
>                   {
>                     "expr": "(`d`.`StaffMember`)"
>                   },
>                   {
>                     "as": "JobName",
>                     "expr": "(`m`[0])"
>                   },
>                   {
>                     "as": "ScoreDurationWeight",
>                     "expr": "sum((`m`[1]))"
>                   },
>                   {
>                     "as": "DurationWeight",
>                     "expr": "sum((`m`[2]))"
>                   },
>                   {
>                     "as": "Duration",
>                     "expr": "sum((`m`[3]))"
>                   },
>                   {
>                     "as": "RawScore",
>                     "expr": "sum((`m`[4]))"
>                   }
>                 ],
>                 "#time_normal": "00:00.0451",
>                 "#time_absolute": 0.0451872
>               },
>               {
>                 "#operator": "FinalProject",
>                 "#stats": {
>                   "#itemsIn": 1525,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 4576,
>                   "kernTime": "15.8134ms"
>                 }
>               }
>             ]
>           }
>         ]
>       },
>       "#time_normal": "00:00.0199",
>       "#time_absolute": 0.0199986
>     },
>     {
>       "#operator": "Stream",
>       "#stats": {
>         "#itemsIn": 1525,
>         "#itemsOut": 1525,
>         "#phaseSwitches": 6103,
>         "kernTime": "17.5908936s"
>       }
>     }
>   ],
>   "~versions": [
>     "2.0.0-N1QL",
>     "6.0.1-2037-enterprise"
>   ]
> }

Where as if i filter it down more by adding this between the Type and DateTimeUtc filter:

AND StaffMember= “02245346-0543-49d8-8826-db1699042d3a”

This now returns under 400ms, using the same index (Which is the same level of performance as I’m looking to get when doing the one above)

Plan for this one :

> {
>   "#operator": "Sequence",
>   "#stats": {
>     "#phaseSwitches": 1
>   },
>   "~children": [
>     {
>       "#operator": "Authorize",
>       "#stats": {
>         "#phaseSwitches": 3,
>         "servTime": "1.9968ms"
>       },
>       "privileges": {
>         "List": [
>           {
>             "Target": "default:staffjobscores ",
>             "Priv": 7
>           }
>         ]
>       },
>       "~child": {
>         "#operator": "Sequence",
>         "#stats": {
>           "#phaseSwitches": 1
>         },
>         "~children": [
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "DistinctScan",
>                 "#stats": {
>                   "#itemsIn": 1046,
>                   "#itemsOut": 1046,
>                   "#phaseSwitches": 4189,
>                   "kernTime": "216.2538ms"
>                 },
>                 "scan": {
>                   "#operator": "IndexScan3",
>                   "#stats": {
>                     "#itemsOut": 1046,
>                     "#phaseSwitches": 4189,
>                     "execTime": "5.2616ms",
>                     "servTime": "210.9922ms"
>                   },
>                   "index": "ixd_CbDojo_2_Departments_DateTimeUtc_StaffMember",
>                   "index_id": "ae3ccaeb09ba3d62",
>                   "index_projection": {
>                     "primary_key": true
>                   },
>                   "keyspace": "staffjobscores ",
>                   "namespace": "default",
>                   "spans": [
>                     {
>                       "exact": true,
>                       "range": [
>                         {
>                           "high": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\"",
>                           "inclusion": 3,
>                           "low": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\""
>                         },
>                         {
>                           "high": "\"2019-12-31 23:59:59\"",
>                           "inclusion": 2,
>                           "low": "\"2019-01-01 00:00:00\""
>                         },
>                         {
>                           "high": "\"02245346-0543-49d8-8826-db1699042d3a\"",
>                           "inclusion": 3,
>                           "low": "\"02245346-0543-49d8-8826-db1699042d3a\""
>                         }
>                       ]
>                     }
>                   ],
>                   "using": "gsi",
>                   "#time_normal": "00:00.2162",
>                   "#time_absolute": 0.2162538
>                 }
>               },
>               {
>                 "#operator": "Fetch",
>                 "#stats": {
>                   "#itemsIn": 1046,
>                   "#itemsOut": 1046,
>                   "#phaseSwitches": 4319,
>                   "execTime": "5.8573ms",
>                   "kernTime": "95.9923ms",
>                   "servTime": "158.1426ms"
>                 },
>                 "keyspace": "staffjobscores ",
>                 "namespace": "default",
>                 "#time_normal": "00:00.1639",
>                 "#time_absolute": 0.1639999
>               },
>               {
>                 "#operator": "Sequence",
>                 "#stats": {
>                   "#phaseSwitches": 1
>                 },
>                 "~children": [
>                   {
>                     "#operator": "Filter",
>                     "#stats": {
>                       "#itemsIn": 1046,
>                       "#itemsOut": 1046,
>                       "#phaseSwitches": 4187,
>                       "execTime": "91.042ms",
>                       "kernTime": "169.9779ms"
>                     },
>                     "condition": "((((((`staffjobscores `.`Type`) = \"dpm::score\") and ((`staffjobscores `.`StaffMember`) = \"02245346-0543-49d8-8826-db1699042d3a\")) and (\"2019-01-01 00:00:00\" < (`staffjobscores `.`DateTimeUtc`))) and ((`staffjobscores `.`DateTimeUtc`) <= \"2019-12-31 23:59:59\")) and any `v` in (`staffjobscores `.`Departments`) satisfies (`v` = \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\") end)",
>                     "#time_normal": "00:00.0910",
>                     "#time_absolute": 0.091042
>                   },
>                   {
>                     "#operator": "InitialProject",
>                     "#stats": {
>                       "#itemsIn": 1046,
>                       "#itemsOut": 1046,
>                       "#phaseSwitches": 3143,
>                       "execTime": "150.03ms",
>                       "kernTime": "109.06ms"
>                     },
>                     "result_terms": [
>                       {
>                         "expr": "(`staffjobscores `.`StaffMember`)"
>                       },
>                       {
>                         "as": "n",
>                         "expr": "array [(`j`.`Name`), ((ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), ifmissingornull((`j`.`Duration`), 0), ifmissingornull((`j`.`RawScore`), 0)] for `j` in (`staffjobscores `.`Jobs`) end"
>                       }
>                     ],
>                     "#time_normal": "00:00.1500",
>                     "#time_absolute": 0.15003
>                   },
>                   {
>                     "#operator": "FinalProject",
>                     "#stats": {
>                       "#itemsIn": 1046,
>                       "#itemsOut": 1046,
>                       "#phaseSwitches": 3139,
>                       "execTime": "934.1µs",
>                       "kernTime": "8.9655ms"
>                     },
>                     "#time_normal": "00:00.0009",
>                     "#time_absolute": 0.0009341
>                   }
>                 ]
>               }
>             ]
>           },
>           {
>             "#operator": "Alias",
>             "#stats": {
>               "#itemsIn": 1046,
>               "#itemsOut": 1046,
>               "#phaseSwitches": 4187,
>               "execTime": "7.0089ms",
>               "kernTime": "261.9807ms"
>             },
>             "as": "d",
>             "#time_normal": "00:00.0070",
>             "#time_absolute": 0.007008899999999999
>           },
>           {
>             "#operator": "Unnest",
>             "#stats": {
>               "#itemsIn": 1046,
>               "#itemsOut": 4186,
>               "#phaseSwitches": 10467,
>               "execTime": "17.8891ms",
>               "kernTime": "251.1005ms"
>             },
>             "as": "m",
>             "expr": "(`d`.`n`)",
>             "#time_normal": "00:00.0178",
>             "#time_absolute": 0.017889099999999998
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "Filter",
>                 "#stats": {
>                   "#itemsIn": 4186,
>                   "#itemsOut": 4186,
>                   "#phaseSwitches": 16747,
>                   "execTime": "7.3776ms",
>                   "kernTime": "291.6149ms"
>                 },
>                 "condition": "((`m`[0]) is not null)",
>                 "#time_normal": "00:00.0073",
>                 "#time_absolute": 0.0073776
>               },
>               {
>                 "#operator": "InitialGroup",
>                 "#stats": {
>                   "#itemsIn": 4186,
>                   "#itemsOut": 25,
>                   "#phaseSwitches": 8425,
>                   "execTime": "251.9318ms",
>                   "kernTime": "75.0577ms"
>                 },
>                 "aggregates": [
>                   "sum((`m`[1]))",
>                   "sum((`m`[2]))",
>                   "sum((`m`[3]))",
>                   "sum((`m`[4]))"
>                 ],
>                 "group_keys": [
>                   "(`d`.`StaffMember`)",
>                   "(`m`[0])"
>                 ],
>                 "#time_normal": "00:00.2519",
>                 "#time_absolute": 0.2519318
>               }
>             ]
>           },
>           {
>             "#operator": "IntermediateGroup",
>             "#stats": {
>               "#itemsIn": 25,
>               "#itemsOut": 25,
>               "#phaseSwitches": 103,
>               "execTime": "999.6µs",
>               "kernTime": "326.9895ms"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0009",
>             "#time_absolute": 0.0009996
>           },
>           {
>             "#operator": "FinalGroup",
>             "#stats": {
>               "#itemsIn": 25,
>               "#itemsOut": 25,
>               "#phaseSwitches": 103,
>               "execTime": "999.8µs",
>               "kernTime": "327.9891ms"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0009",
>             "#time_absolute": 0.0009998
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "InitialProject",
>                 "#stats": {
>                   "#itemsIn": 25,
>                   "#itemsOut": 25,
>                   "#phaseSwitches": 80,
>                   "execTime": "2ms",
>                   "kernTime": "328.9889ms"
>                 },
>                 "result_terms": [
>                   {
>                     "expr": "(`d`.`StaffMember`)"
>                   },
>                   {
>                     "as": "JobName",
>                     "expr": "(`m`[0])"
>                   },
>                   {
>                     "as": "ScoreDurationWeight",
>                     "expr": "sum((`m`[1]))"
>                   },
>                   {
>                     "as": "DurationWeight",
>                     "expr": "sum((`m`[2]))"
>                   },
>                   {
>                     "as": "Duration",
>                     "expr": "sum((`m`[3]))"
>                   },
>                   {
>                     "as": "RawScore",
>                     "expr": "sum((`m`[4]))"
>                   }
>                 ],
>                 "#time_normal": "02:00",
>                 "#time_absolute": 120
>               },
>               {
>                 "#operator": "FinalProject",
>                 "#stats": {
>                   "#itemsIn": 25,
>                   "#itemsOut": 25,
>                   "#phaseSwitches": 76,
>                   "kernTime": "1.0003ms"
>                 }
>               }
>             ]
>           }
>         ]
>       },
>       "#time_normal": "00:00.0019",
>       "#time_absolute": 0.0019968
>     },
>     {
>       "#operator": "Stream",
>       "#stats": {
>         "#itemsIn": 25,
>         "#itemsOut": 25,
>         "#phaseSwitches": 103,
>         "kernTime": "333.986ms"
>       }
>     }
>   ],
>   "~versions": [
>     "2.0.0-N1QL",
>     "6.0.1-2037-enterprise"
>   ]
> }

It is fetching 61K documents. I thought it uses covered index. What is index definition for ixd_CbDojo_2_Departments_DateTimeUtc_StaffMember and query.

do u have space in bucket name. “keyspace”: "staffjobscores ", Can you alias in query

With couchbase-server-6.0.1-2037.x86_64 the following query and index does covering

CREATE INDEX `ixd_Departments_DateTimeUtc_StaffMember` 
ON `staffjobscores`( DISTINCT  ARRAY v FOR v IN Departments END,`DateTimeUtc`,  StaffMember, 
    ARRAY  [(`j`.`Name`), (
                       (ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), 
                      (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), 
                       ifmissingornull((`j`.`Duration`), 0), 
                       ifmissingornull((`j`.`RawScore`), 0)] for `j` in `Jobs` end) 
WHERE (`Type` = "scores");
SELECT d.StaffMember, 
       m[0] AS JobName, 
       SUM(m[1]) AS ScoreDurationWeight, 
       SUM(m[2]) AS DurationWeight, 
       SUM(m[3]) AS Duration, 
       SUM(m[4]) AS RawScore 
FROM ( 
    SELECT a.StaffMember, ARRAY [j.Name,
                  IFMISSINGORNULL(j.Score,0) * IFMISSINGORNULL(j.Duration,0) * IFMISSINGORNULL(j.Weight,0),
                  IFMISSINGORNULL(j.Duration,0) * IFMISSINGORNULL(j.Weight,0),
                  IFMISSINGORNULL(j.Duration,0),
                  IFMISSINGORNULL(j.RawScore,0)] 
           FOR j IN Jobs END as n
	    FROM staffjobscores AS a
    WHERE a.Type = "scores"
        AND a.DateTimeUtc > "2019-01-01" AND a.DateTimeUtc <= "2019-12-31"
	    AND ANY o IN a.Departments SATISFIES o = "a3598df8-f8fd-41a7-b837-1e7a8e12de1a" END
) 
AS d
UNNEST d.n as m
WHERE m[0] IS NOT NULL
GROUP BY d.StaffMember, m[0];

Hi @vsr1, thanks that was a test index I had put on earlier, I missed it was using that one… I’ve been looking at this problem for so long I’m not seeing the wood through the trees, sorry about that… Just grabbed a coffee so back on form now :smiley: the space was my fault, renaming my bucket etc when publishing publicly :slight_smile:

Anyway that particular index was defined as.

CREATE INDEX ixd_CbDojo_2_Departments_DateTimeUtc_StaffMember
ON staffjobscores((distinct (array o for o in Departments end)),DateTimeUtc,StaffMember)
WHERE (Type = “scores”)

I’ve dropped that index and the engine is now picking up the CORRECT index :slight_smile:

Correct plan now, still taking - 15 second:

> {
>   "#operator": "Sequence",
>   "#stats": {
>     "#phaseSwitches": 1
>   },
>   "~children": [
>     {
>       "#operator": "Authorize",
>       "#stats": {
>         "#phaseSwitches": 3,
>         "servTime": "1.9673ms"
>       },
>       "privileges": {
>         "List": [
>           {
>             "Target": "default:staffjobscores",
>             "Priv": 7
>           }
>         ]
>       },
>       "~child": {
>         "#operator": "Sequence",
>         "#stats": {
>           "#phaseSwitches": 1
>         },
>         "~children": [
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "DistinctScan",
>                 "#stats": {
>                   "#itemsIn": 61967,
>                   "#itemsOut": 61967,
>                   "#phaseSwitches": 247873,
>                   "execTime": "375.4795ms",
>                   "kernTime": "14.1074227s"
>                 },
>                 "scan": {
>                   "#operator": "IndexScan3",
>                   "#stats": {
>                     "#itemsOut": 61967,
>                     "#phaseSwitches": 247873,
>                     "execTime": "778.774ms",
>                     "kernTime": "13.2376292s",
>                     "servTime": "349.4985ms"
>                   },
>                   "covers": [
>                     "cover ((distinct (array `v` for `v` in (`staffjobscores`.`Departments`) end)))",
>                     "cover ((`staffjobscores`.`DateTimeUtc`))",
>                     "cover ((`staffjobscores`.`StaffMember`))",
>                     "cover (array [(`j`.`Name`), ((ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), ifmissingornull((`j`.`Duration`), 0), ifmissingornull((`j`.`RawScore`), 0)] for `j` in (`staffjobscores`.`Jobs`) end)",
>                     "cover ((meta(`staffjobscores`).`id`))"
>                   ],
>                   "filter_covers": {
>                     "cover ((`staffjobscores`.`Type`))": "dpm::score",
>                     "cover (any `v` in (`staffjobscores`.`Departments`) satisfies (`v` = \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\") end)": true
>                   },
>                   "index": "ixd_Scores_Departments_DateTimeUtc_StaffMember",
>                   "index_id": "91a9ae274d0a32c8",
>                   "index_projection": {
>                     "entry_keys": [
>                       1,
>                       2,
>                       3
>                     ],
>                     "primary_key": true
>                   },
>                   "keyspace": "staffjobscores",
>                   "namespace": "default",
>                   "spans": [
>                     {
>                       "exact": true,
>                       "range": [
>                         {
>                           "high": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\"",
>                           "inclusion": 3,
>                           "low": "\"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\""
>                         },
>                         {
>                           "high": "\"2019-12-31 23:59:59\"",
>                           "inclusion": 2,
>                           "low": "\"2019-01-01 00:00:00\""
>                         }
>                       ]
>                     }
>                   ],
>                   "using": "gsi",
>                   "#time_normal": "00:01.1282",
>                   "#time_absolute": 1.1282725
>                 },
>                 "#time_normal": "00:00.3754",
>                 "#time_absolute": 0.37547949999999997
>               },
>               {
>                 "#operator": "Sequence",
>                 "#stats": {
>                   "#phaseSwitches": 1
>                 },
>                 "~children": [
>                   {
>                     "#operator": "Filter",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 247871,
>                       "execTime": "767.6966ms",
>                       "kernTime": "13.8412026s"
>                     },
>                     "condition": "((((cover ((`staffjobscores`.`Type`)) = \"dpm::score\") and (\"2019-01-01 00:00:00\" < cover ((`staffjobscores`.`DateTimeUtc`)))) and (cover ((`staffjobscores`.`DateTimeUtc`)) <= \"2019-12-31 23:59:59\")) and cover (any `v` in (`staffjobscores`.`Departments`) satisfies (`v` = \"a3598df8-f8fd-41a7-b837-1e7a8e12de1a\") end))",
>                     "#time_normal": "00:00.7676",
>                     "#time_absolute": 0.7676966
>                   },
>                   {
>                     "#operator": "InitialProject",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 185906,
>                       "execTime": "593.4594ms",
>                       "kernTime": "160.9314ms"
>                     },
>                     "result_terms": [
>                       {
>                         "expr": "cover ((`staffjobscores`.`StaffMember`))"
>                       },
>                       {
>                         "as": "n",
>                         "expr": "cover (array [(`j`.`Name`), ((ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), ifmissingornull((`j`.`Duration`), 0), ifmissingornull((`j`.`RawScore`), 0)] for `j` in (`staffjobscores`.`Kpis`) end)"
>                       }
>                     ],
>                     "#time_normal": "00:00.5934",
>                     "#time_absolute": 0.5934594
>                   },
>                   {
>                     "#operator": "FinalProject",
>                     "#stats": {
>                       "#itemsIn": 61967,
>                       "#itemsOut": 61967,
>                       "#phaseSwitches": 185902,
>                       "execTime": "79.8184ms",
>                       "kernTime": "13.8553668s"
>                     },
>                     "#time_normal": "00:00.0798",
>                     "#time_absolute": 0.0798184
>                   }
>                 ]
>               }
>             ]
>           },
>           {
>             "#operator": "Alias",
>             "#stats": {
>               "#itemsIn": 61967,
>               "#itemsOut": 61967,
>               "#phaseSwitches": 247871,
>               "execTime": "290.3578ms",
>               "kernTime": "14.5575424s"
>             },
>             "as": "d",
>             "#time_normal": "00:00.2903",
>             "#time_absolute": 0.2903578
>           },
>           {
>             "#operator": "Unnest",
>             "#stats": {
>               "#itemsIn": 61967,
>               "#itemsOut": 247770,
>               "#phaseSwitches": 619477,
>               "execTime": "1.8617778s",
>               "kernTime": "13.1051181s"
>             },
>             "as": "m",
>             "expr": "(`d`.`n`)",
>             "#time_normal": "00:01.8617",
>             "#time_absolute": 1.8617778
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "Filter",
>                 "#stats": {
>                   "#itemsIn": 247770,
>                   "#itemsOut": 247770,
>                   "#phaseSwitches": 991083,
>                   "execTime": "1.4131159s",
>                   "kernTime": "13.58778s"
>                 },
>                 "condition": "((`m`[0]) is not null)",
>                 "#time_normal": "00:01.4131",
>                 "#time_absolute": 1.4131159
>               },
>               {
>                 "#operator": "InitialGroup",
>                 "#stats": {
>                   "#itemsIn": 247770,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 498593,
>                   "execTime": "14.0111973s",
>                   "kernTime": "1.033784s"
>                 },
>                 "aggregates": [
>                   "sum((`m`[1]))",
>                   "sum((`m`[2]))",
>                   "sum((`m`[3]))",
>                   "sum((`m`[4]))"
>                 ],
>                 "group_keys": [
>                   "(`d`.`StaffMember`)",
>                   "(`m`[0])"
>                 ],
>                 "#time_normal": "00:14.0111",
>                 "#time_absolute": 14.0111973
>               }
>             ]
>           },
>           {
>             "#operator": "IntermediateGroup",
>             "#stats": {
>               "#itemsIn": 1525,
>               "#itemsOut": 1525,
>               "#phaseSwitches": 6103,
>               "execTime": "30.9384ms",
>               "kernTime": "15.0689573s"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0309",
>             "#time_absolute": 0.0309384
>           },
>           {
>             "#operator": "FinalGroup",
>             "#stats": {
>               "#itemsIn": 1525,
>               "#itemsOut": 1525,
>               "#phaseSwitches": 6103,
>               "execTime": "59.0327ms",
>               "kernTime": "15.1148628s"
>             },
>             "aggregates": [
>               "sum((`m`[1]))",
>               "sum((`m`[2]))",
>               "sum((`m`[3]))",
>               "sum((`m`[4]))"
>             ],
>             "group_keys": [
>               "(`d`.`StaffMember`)",
>               "(`m`[0])"
>             ],
>             "#time_normal": "00:00.0590",
>             "#time_absolute": 0.0590327
>           },
>           {
>             "#operator": "Sequence",
>             "#stats": {
>               "#phaseSwitches": 1
>             },
>             "~children": [
>               {
>                 "#operator": "InitialProject",
>                 "#stats": {
>                   "#itemsIn": 1525,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 4580,
>                   "execTime": "57.9724ms",
>                   "kernTime": "15.1209106s"
>                 },
>                 "result_terms": [
>                   {
>                     "expr": "(`d`.`StaffMember`)"
>                   },
>                   {
>                     "as": "JobName",
>                     "expr": "(`m`[0])"
>                   },
>                   {
>                     "as": "ScoreDurationWeight",
>                     "expr": "sum((`m`[1]))"
>                   },
>                   {
>                     "as": "DurationWeight",
>                     "expr": "sum((`m`[2]))"
>                   },
>                   {
>                     "as": "Duration",
>                     "expr": "sum((`m`[3]))"
>                   },
>                   {
>                     "as": "RawScore",
>                     "expr": "sum((`m`[4]))"
>                   }
>                 ],
>                 "#time_normal": "00:00.0579",
>                 "#time_absolute": 0.0579724
>               },
>               {
>                 "#operator": "FinalProject",
>                 "#stats": {
>                   "#itemsIn": 1525,
>                   "#itemsOut": 1525,
>                   "#phaseSwitches": 4576,
>                   "execTime": "999.5µs",
>                   "kernTime": "16.0148ms"
>                 },
>                 "#time_normal": "00:00.0009",
>                 "#time_absolute": 0.0009995
>               }
>             ]
>           }
>         ]
>       },
>       "#time_normal": "00:00.0019",
>       "#time_absolute": 0.0019673
>     },
>     {
>       "#operator": "Stream",
>       "#stats": {
>         "#itemsIn": 1525,
>         "#itemsOut": 1525,
>         "#phaseSwitches": 6103,
>         "execTime": "2.0002ms",
>         "kernTime": "15.1958644s"
>       },
>       "#time_normal": "00:00.0020",
>       "#time_absolute": 0.0020002
>     }
>   ],
>   "~versions": [
>     "2.0.0-N1QL",
>     "6.0.1-2037-enterprise"
>   ]
> }

Is it EE or CE?

It looks too many items and taking time. You can try query level max_parallelsim, it may improve some.

      {
        "#operator": "Unnest",
        "#stats": {
          "#itemsIn": 61967,
          "#itemsOut": 247770,
          "#phaseSwitches": 619477,
          "execTime": "1.8617778s",
          "kernTime": "13.1051181s"
        }

@vsr1 I’m using EE, three node cluster, 8 cores. 32GB RAM (dev box) Only me using it so no pressure from other queries, not ingesting any data while doing queries either.

I’ve tried setting max_parallelsim to 8 to test, getting result set in 8 seconds:

   "#operator": "Unnest",
          "#stats": {
            "#itemsIn": 61967,
            "#itemsOut": 247770,
            "#phaseSwitches": 619498,
            "execTime": "1.43841s",
            "kernTime": "57.4783863s"

setting to 1, 2, 4, still around 15 to 20 seonds.

Try Partition Index

 CREATE INDEX `ixd_Departments_DateTimeUtc_StaffMember` 
ON `staffjobscores`( DISTINCT  ARRAY v FOR v IN Departments END,`DateTimeUtc`,  StaffMember, 
    ARRAY  [(`j`.`Name`), (
                       (ifmissingornull((`j`.`Score`), 0) * ifmissingornull((`j`.`Duration`), 0)) * ifmissingornull((`j`.`Weight`), 0)), 
                      (ifmissingornull((`j`.`Duration`), 0) * ifmissingornull((`j`.`Weight`), 0)), 
                       ifmissingornull((`j`.`Duration`), 0), 
                       ifmissingornull((`j`.`RawScore`), 0)] for `j` in `Jobs` end) 
WHERE (`Type` = "scores")  PARTITION BY HASH(META().id);

kernTime - Time spent waiting to be scheduled for CPU time.

“scan”: {

              "#operator": "IndexScan3",
              "#stats": {
                "#itemsOut": 61967,
                "#phaseSwitches": 247873,
                "execTime": "778.774ms",
                "kernTime": "13.2376292s",
                "servTime": "349.4985ms"
              },

Thank you @vsr1 thats got my query down to 6 seconds.

kernTime I guess would be down to hardware performance?

So looks like the more StaffMembers I have the slower the performance, the unnest can turn 80000 returned from the index into 1,360,00 it needs to group by.
The group by performance is shocking

  • #stats
    • #itemsIn - 1360000
    • #itemsOut - 112
    • #phaseSwitches - 2720248
    • execTime - 1m55.7522247s
    • kernTime - 5m52.6061495s
  • aggregates
    • sum((m[1]))
    • sum((m[2]))
    • sum((m[3]))
    • sum((m[4]))
  • group_keys
    • (d.driverId)
    • (m[0])
  • #time_normal - 01:55.7522
  • #time_absolute - 115.7522247

Your kernTime is high check cpu usage on query node while running.