N1QL - array indexing, empty arrays and querying META().id


#1

I’m having an issue returning document IDs in an N1QL query where array indexing is in use and an indexed array is empty.

I’ve created the following documents:

101
{
  "docType": "Test",
  "tags": [],
  "title": "doc101"
}

102
{
  "docType": "Test",
  "tags": [
    "tag1"
  ],
  "title": "doc102"
}

103
{
  "docType": "Test",
  "tags": [
    "tag1",
    "tag2"
  ],
  "title": "doc103"
}

…and the following index:

CREATE INDEX `Idx_ArrayIndexingTest`
ON `test_bucket`(docType, tags, DISTINCT ARRAY t FOR t IN tags END, title)
WHERE docType="Test";

If I run this query:

SELECT META().id from `test_bucket`
WHERE docType="Test";

…I get this result:

[
  {},
  {
    "id": "102"
  },
  {
    "id": "103"
  }
]

As soon as I add an entry into the empty array in doc 101, I get all three IDs back in the query result as expected. Then if I remove again (to leave an empty array), the query result returns back to the above. A null for the array (rather than an empty array) fixes the problem, but that isn’t ideal. If I select a property other than META().id it’s fine too, it’s just the document ID that doesn’t appear in the result.

I’m running 4.5.0-2601 Enterprise Edition (build-2601) on Server 2012 R2.


#2

Could you post the EXPLAIN plan.


#3

Hi,

@vsr1 tested this internally on 4.5.1 and it works on 4.5.1. It seems you have hit a bug in 4.5. Sorry about that.


#4

No worries. We’re skipping 4.5 in production because of https://issues.couchbase.com/browse/MB-20166 anyway, so if this is fixed in 4.5.1 too that’s great.

I’ve included the EXPLAIN plan just in case it’s still of any use.

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "DistinctScan",
          "scan": {
            "#operator": "IndexScan",
            "covers": [
              "cover ((`test_bucket`.`docType`))",
              "cover ((`test_bucket`.`tags`))",
              "cover ((distinct (array `t` for `t` in (`test_bucket`.`tags`) end)))",
              "cover ((`test_bucket`.`title`))",
              "cover ((meta(`test_bucket`).`id`))"
            ],
            "filter_covers": {
              "cover ((`test_bucket`.`docType`))": "Test"
            },
            "index": "Idx_ArrayIndexingTest",
            "index_id": "bb13666ef6fd87c7",
            "keyspace": "test_bucket",
            "namespace": "default",
            "spans": [
              {
                "Range": {
                  "High": [
                    "successor(\"Test\")"
                  ],
                  "Inclusion": 1,
                  "Low": [
                    "\"Test\""
                  ]
                }
              }
            ],
            "using": "gsi"
          }
        },
        {
          "#operator": "Parallel",
          "~child": {
            "#operator": "Sequence",
            "~children": [
              {
                "#operator": "Filter",
                "condition": "(cover ((`test_bucket`.`docType`)) = \"Test\")"
              },
              {
                "#operator": "InitialProject",
                "result_terms": [
                  {
                    "expr": "cover ((meta(`test_bucket`).`id`))"
                  }
                ]
              },
              {
                "#operator": "FinalProject"
              }
            ]
          }
        }
      ]
    },
    "text": "SELECT META().id from `test_bucket` WHERE `docType`=\"Test\";"
  }
]

#5

Hi @geraldss,

I’m working with 4.5.1, and while I can now get META().id, there’s still an issue here. If I alter my query to include title, i.e.

SELECT META().id,title from `test_bucket`
WHERE docType="Test";

…the results I get are:

[
  {
    "id": "101"
  },
  {
    "id": "102",
    "title": "doc102"
  },
  {
    "id": "103",
    "title": "doc103"
  }
]

As you can see, the title field isn’t in the results for the doc that has an empty array. I’ve spotted this because an ORDER BY title in my real world data was placing some docs right at the beginning incorrectly (I assume because the query engine can’t see the title to determine sort order on these docs either).

As in 4.5, assigning a NULL to the array property (rather than an empty array) or adding at least one entry resolves the issue.

Cheers,
Fraser


#6

Hi @vsr1,

Please take a look at this so we can fix it in 4.5.1.

Thanks,
Gerald


#7

Opened Issue for the new problem.