Stuck building subquery

Hi!,

I am having trouble building a query and I am hoping to see if anybody can guide me on this.
I have the following documents in my bucket:
Project 1

  "id": "123@project",
  "_class": "com.package.Project",
  "name": "project1",
  "roles": [
    {
      "id": "abc@role",
      "name": "Manager",
      "projectId": "123@project"
    },
    {
      "id": "def@role",
      "name": "Developer",
      "projectId": "123@project"
    }
  ],
  "status": "ACTIVE"
}

Project 2:

{
  "id": "456@project",
  "_class": "com.package.Project",
  "name": "project2",
  "roles": [
    {
      "id": "ghi@role",
      "name": "Manager",
      "projectId": "456@project"
    },
    {
      "id": "jkl@role",
      "name": "Developer",
      "projectId": "456@project"
    }
  ],
  "status": "ACTIVE"
}

User 1:

{
  "id": "123@user",
  "_class": "com.package.User",
  "name": "user1",
  "projects": [
    {
      "projectId": "123@project"
    }
  ],
  "roles": [
    "abc@role",
    "def@role"
  ]
}

User 2:

{
  "id": "456@user
  "_class": "com.package.User",
  "name": "user2",
  "projects": [
    {
      "projectId": "123@project"
    },
    {
      "projectId": "456@project"
    }
  ],
  "roles": [
    "jkl@role"
  ]
}

Given these documents I need to get a list of projects but each result should include the list of users that have the “manager” role in that project.
The output should look like:

Result:

[
  {
    "id": "123@project",
    "name": "project1",
    "managers": [{ 
      "id": "123@user",
      "name": "user1"
    }]
  },
  {
    "id": "456@project",
    "name": "project2",
    "managers": []
  }
]

I tried to build this query but got stuck when trying to select the managers for each project:

SELECT meta(pg).id id, pg.name name,
       (
           SELECT META(um).id
           FROM identity um
           WHERE um._class= "com.package.User"
           AND ARRAY_CONTAINS(um.roles, <how-to-get-role-id-from-project-where-role-name-is-manager???>)
       ) AS managers
       FROM identity p
       JOIN identity u ON ANY x IN u.projects SATISFIES x.projectId = META(p).id END
       WHERE p._class LIKE "com.package.Project" AND u._class LIKE "com.package.User"
       GROUP BY p pg

The problem I am having is to select the role id from the project coming from the join. I tried different things like:

 (
           SELECT META(um).id, um.*
           FROM identity um USE KEYS SELF.p.roles[*].id
           WHERE um._class= "com.package.User"
               AND ARRAY_CONTAINS(um.roles, (
                   SELECT META(r).id rid
                   FROM identity r
                   WHERE r._class = "com.package.Project"
                       AND ANY x IN r.roles SATISFIES x.name = "Manager"
                       AND x.projectId = META(SELF.p).id END) ) ) AS managers

but this didnt work.
Any advice on how to achieve this?
Thanks!
Mauro.-

SELECT  META(p).id, p.name,
        (SELECT  META(u).id, u.name
         FROM identity AS u USE KEYS (ARRAY v.id FOR v IN p.`roles` WHEN v.name = "Manager" END)) AS managers
FROM identity AS p
WHERE p._class = "com.package.Project";

Hi @vsr1,
Thanks for the query. I tried that and I am getting managers as an a empty array:

[
  {
    "id": "123@project",
    "managers": [],
    "name": "project1"
  },
  {
    "id": "456@project",
    "managers": [],
    "name": "project2"
  }
]

I think the reason is that:

USE KEYS (ARRAY v.id FOR v IN p.`roles` WHEN v.name = "Manager" END)

returns the role ids from the project, but the USE KEYS should reference the User ids instead (if I understand correctly how USE KEYS work)? I am correct?

Thanks!

Thought project roles.id as user id

Try ANSI JOIN and GROUP BY

CREATE INDEX ix1 ON identity(DISTINCT `roles`, id, name) WHERE _class = "com.package.User";
SELECT d.id, d.name, ARRAY_AGG( DISTINCT {u.id, u.name}) AS managers
FROM (SELECT META(p).id, p.name, r.id AS rid
      FROM identity AS p
      UNNEST p.`roles` AS r
      WHERE p._class = "com.package.Project" AND r.name = "Manager") AS d
JOIN identity AS u ON ANY v IN u.`roles` SATISFIES v = d.rid END AND u._class = "com.package.User"
GROUP BY d.id, d.name;

Thanks @vsr1,
The ANSI JOIN kind of worked with some small modifications, but still getting the wrong output:

I added a left join instead of join and used CASE to handle the user.id null so I don’t end up with managers: [{}]

SELECT projectRoles.id, projectRoles.status,
       projectRoles.name,
       projectRoles.roleId,
       (CASE WHEN meta(users).id IS NOT NULL THEN ARRAY_AGG({META(users).id, users.name}) ELSE [] END) AS managers
FROM (
    SELECT META(project).id, project.name, roles.id AS roleId, project.status
    FROM identity AS project
    UNNEST project.`roles` AS roles
    WHERE project._class = "com.package.Project" AND roles.name = "Manager" ) AS projectRoles
    LEFT JOIN identity AS users ON users._class = "com.package.User" AND ARRAY_CONTAINS(users.roles, projectRoles.roleId)
WHERE projectRoles.status = "ACTIVE"
GROUP BY projectRoles, users

With those changes I get the following results

[
  {
    "id": "123@project",
    "managers": [
      {
        "id": "123@user",
        "name": "user1"
      }
    ],
    "name": "project1",
    "roleId": "abc@role",
    "status": "ACTIVE"
  },
  {
    "id": "123@project",
    "managers": [
      {
        "id": "789@user",
        "name": "user3"
      }
    ],
    "name": "project1",
    "roleId": "abc@role",
    "status": "ACTIVE"
  },
  {
    "id": "456@project",
    "managers": [],
    "name": "project2",
    "roleId": "ghi@role",
    "status": "ACTIVE"
  }
]

Now, the problem I have is that adding 1 more user (with manager role) to the project1, I end up with 2 records for project1, (I have 2 managers (123@user and 789@user).
How can I change the structure to have 1 record per project with an array with all the managers?
Thanks for the help!,
M.-

@monti.mauro ,

Try this. Also you may want to consider LEFT UNNEST, In case of project.roles is Non/empty ARRAY or no Manager . Based on the use case see if projectRoles.status = “ACTIVE” can be pushed into subquery

  SELECT projectRoles.*,
           ARRAY_AGG(DISTINCT (CASE users IS NOT MISSING TEHN {META(users).id, users.name} ELSE MISSING END)) AS managers
    FROM (
        SELECT META(project).id, project.name, roles.id AS roleId, project.status
        FROM identity AS project
        UNNEST project.`roles` AS roles
        WHERE project._class = "com.package.Project" AND roles.name = "Manager" ) AS projectRoles
        LEFT JOIN identity AS users ON users._class = "com.package.User" AND ARRAY_CONTAINS(users.roles, projectRoles.roleId)
    WHERE projectRoles.status = "ACTIVE"
    GROUP BY projectRoles

Check example 12 in ANSI JOIN blog vs ARRAY_CONTAINS() to avoid scan of all users

1 Like