N1QL Join to array within a document

Interesting… Here are our key naming conventions. The “123456789” is the unique account #, of which there are thousands per client. So if customer_xyz has 10,000 accounts, we would have 10,000 account- documents, 10,000 financial- documents, and 10,000 charges- documents, all linked together by the account # at the end of the key. Hopefully that makes sense.

account-client_name-123456789
financial-client_name-123456789
charges-client_name-123456789

Here are some short examples of what the documents structure looks like:

account-customer_xyz-123456789
{
  "acctNumber": 123456789,
  "tenantName": "customer_xyz",
  "code": "166"
}

financial-customer_xyz-123456789
{
  "acctFinancials": [
    {
      "acctNumber": "19800001",
      "transactionId": "164126601",
      "transactionType": "Credit Adjustment",
      "amount": 30.10
    },
    {
      "acctNumber": "19800001",
      "transactionId": "162555860",
      "transactionType": "Credit Adjustment",
      "amount": 255.14
	}
  ],
  "tenantName": "customer_xyz",
  "acctNumber": "123456789"
}

charges-customer_xyz-123456789
{
  "acctCharges": [
    {
      "acctNumber": "123456789",
      "chargeId": "165532430",
      "amount": 1193.64
	},
	{
      "acctNumber": "123456789",
      "chargeId": "165247278",
      "amount": 1461.09
	}
  ],
  "tenantName": "customer_xyz",
  "acctNumber": "123456789"
}

Our end-goal in most circumstances will be to join the account doc to the detailed docs (financial or charges), then aggregate on a value within the nested financials or charges. So for a simple example, if I wanted to show total charges for all accounts I would need to join the “account-” docs to the “charges-” docs, doing a SUM() on the nested amounts inside the charges document.

Then if I want limit the results to only one tenant, would it be the most efficient to limit by providing only partial key names? E.g.:

FROM acctbucket accounts JOIN accountbucket charges
             ON KEYS ("account-customer_xyz-" || "charges-customer_xyz-");

To restrict it to ONLY ONE account

FROM acctbucket accounts USE KEYS ["account-customer_xyz-123456789" 
          JOIN accountbucket charges
             ON KEYS ("charges-customer_xyz-" || TOSTRING(accounts.acctNumber);

This is LOGICALLY (syntax is unsupported).

Please see: https://dzone.com/articles/join-faster-with-couchbase-index-joins

FROM acctbucket accounts USE KEYS ["account-customer_xyz-123456789" 
          JOIN accountbucket charges
 ON KEYS ("charges-customer_xyz-" || TOSTRING(accounts.acctNumber) = META(charges).id;

Sorry, I misunderstood the concatenation operator for a logical “or”. I’m having trouble getting this to work.

SELECT *
FROM acctbucket accounts USE KEYS "account-customer_xyz-123456789"
JOIN acctbucket financials ON KEYS "financials-customer_xyz-" || TOSTRING(accounts.acctNumber) = META(financials).id

It’s throwing an error saying “Ambiguous reference to field financials”, however I only have the two aliases, account and financials.

This is the actual syntax you should use. The key (“charges-customer_xyz-” || TOSTRING(accounts.acctNumber) is implictly joined with document key of charges (meta(charges).id).

FROM acctbucket accounts USE KEYS ["account-customer_xyz-123456789" 
   JOIN accountbucket charges
  ON KEYS ("charges-customer_xyz-" || TOSTRING(accounts.acctNumber);

Thanks keshav_m, that fixed it :slight_smile: Going back to the original question, is there a way (without specifying a single account number) to join between the account- and financial- docs to gain access to the nested elements within the financials doc (such as “amount”)?

To provide a simple example, we want to look at all accounts for customer_xyz with a code = 166, then join to the financials document, and SUM() the nested “amount” value, grouped by account number.

We’ve been able to successfully SUM() the financial amounts within the financials doc, but we have not been able to properly join it back to the account doc to allow for filtering by account level attributes (such as “code” for example).

account-customer_xyz-123456789
{
  "acctNumber": 123456789,
  "tenantName": "customer_xyz",
  "code": "166"
}

financial-customer_xyz-123456789
{
  "acctFinancials": [
    {
      "acctNumber": "123456789",
      "transactionId": "164126601",
      "transactionType": "Credit Adjustment",
      "amount": 30.10
    },
    {
      "acctNumber": "123456789",
      "transactionId": "162555860",
      "transactionType": "Credit Adjustment",
      "amount": 255.14
	}
  ],
  "tenantName": "customer_xyz",
  "acctNumber": "123456789"
}

If you don’t use the USE KEYS, the query will join ALL qualifying documents from accounts.
Because you have the field acctNumber in both accounts and chargers, you should have a type field for each document type (or change the field name).

FROM acctbucket accounts  
   JOIN accountbucket charges
  ON KEYS ("charges-customer_xyz-" || TOSTRING(accounts.acctNumber) 
 WHERE accounts.type = 'accounts' and charges.type = 'charges';

This is really cool. I applied your index suggestions and they do help tremendously.
Can you help me understand why in the explain the LET is taking so much time? This is my explain text.

{
  "#operator": "Sequence",
  "#stats": {
    "#phaseSwitches": 1,
    "execTime": "2.174µs"
  },
  "~children": [
    {
      "#operator": "Authorize",
      "#stats": {
        "#phaseSwitches": 3,
        "execTime": "11.743µs",
        "servTime": "1.016365ms"
      },
      "privileges": {
        "List": [
          {
            "Target": "default:mdata",
            "Priv": 7
          }
        ]
      },
      "~child": {
        "#operator": "Sequence",
        "#stats": {
          "#phaseSwitches": 1,
          "execTime": "8.696µs"
        },
        "~children": [
          {
            "#operator": "IntersectScan",
            "#stats": {
              "#itemsIn": 864,
              "#itemsOut": 583,
              "#phaseSwitches": 2903,
              "execTime": "1.313074ms",
              "kernTime": "21.211604ms"
            },
            "scans": [
              {
                "#operator": "IndexScan3",
                "#stats": {
                  "#itemsOut": 184,
                  "#phaseSwitches": 745,
                  "execTime": "492.314µs",
                  "kernTime": "242.398µs",
                  "servTime": "2.73331ms"
                },
                "as": "o",
                "index": "ix_subset_version_fcstValidEpoch_DD_obs",
                "index_id": "9a0d59e5e39e178d",
                "index_projection": {
                  "primary_key": true
                },
                "keyspace": "mdata",
                "namespace": "default",
                "spans": [
                  {
                    "exact": true,
                    "range": [
                      {
                        "high": "\"METAR\"",
                        "inclusion": 3,
                        "low": "\"METAR\""
                      },
                      {
                        "high": "\"V01\"",
                        "inclusion": 3,
                        "low": "\"V01\""
                      },
                      {
                        "high": "(1615401000 + 900)",
                        "inclusion": 3,
                        "low": "(1612805400 - 900)"
                      }
                    ]
                  }
                ],
                "using": "gsi",
                "#time_normal": "00:00.003",
                "#time_absolute": 0.003225624
              },
              {
                "#operator": "IndexScan3",
                "#stats": {
                  "#itemsOut": 97,
                  "#phaseSwitches": 393,
                  "execTime": "292.162µs",
                  "kernTime": "59.505µs",
                  "servTime": "3.105274ms"
                },
                "as": "o",
                "index": "idx_type_docType_version_fcstValidEpoch_METAR",
                "index_id": "acf9f11cc0879013",
                "index_projection": {
                  "primary_key": true
                },
                "keyspace": "mdata",
                "namespace": "default",
                "spans": [
                  {
                    "exact": true,
                    "range": [
                      {
                        "high": "\"DD\"",
                        "inclusion": 3,
                        "low": "\"DD\""
                      },
                      {
                        "high": "\"obs\"",
                        "inclusion": 3,
                        "low": "\"obs\""
                      },
                      {
                        "high": "\"V01\"",
                        "inclusion": 3,
                        "low": "\"V01\""
                      },
                      {
                        "high": "(1615401000 + 900)",
                        "inclusion": 3,
                        "low": "(1612805400 - 900)"
                      }
                    ]
                  }
                ],
                "using": "gsi",
                "#time_normal": "00:00.003",
                "#time_absolute": 0.0033974359999999998
              },
              {
                "#operator": "IndexScan3",
                "#stats": {
                  "#itemsOut": 583,
                  "#phaseSwitches": 2337,
                  "execTime": "1.178854ms",
                  "kernTime": "320.14µs",
                  "servTime": "1.8744ms"
                },
                "as": "o",
                "index": "adv_fcstValidEpoch_docType_subset_version_type",
                "index_id": "aab729c871fbf576",
                "index_projection": {
                  "primary_key": true
                },
                "keyspace": "mdata",
                "namespace": "default",
                "spans": [
                  {
                    "exact": true,
                    "range": [
                      {
                        "high": "(1615401000 + 900)",
                        "inclusion": 3,
                        "low": "(1612805400 - 900)"
                      }
                    ]
                  }
                ],
                "using": "gsi",
                "#time_normal": "00:00.003",
                "#time_absolute": 0.003053254
              }
            ],
            "#time_normal": "00:00.001",
            "#time_absolute": 0.001313074
          },
          {
            "#operator": "Fetch",
            "#stats": {
              "#itemsIn": 583,
              "#itemsOut": 583,
              "#phaseSwitches": 2409,
              "execTime": "4.256622ms",
              "kernTime": "2.441514033s",
              "servTime": "490.409745ms"
            },
            "as": "o",
            "keyspace": "mdata",
            "namespace": "default",
            "#time_normal": "00:00.494",
            "#time_absolute": 0.494666367
          },
          {
            "#operator": "Sequence",
            "#stats": {
              "#phaseSwitches": 1,
              "execTime": "8.395µs"
            },
            "~children": [
              {
                "#operator": "NestedLoopJoin",
                "#stats": {
                  "#itemsIn": 583,
                  "#itemsOut": 7682,
                  "#phaseSwitches": 34229,
                  "execTime": "289.664183ms",
                  "kernTime": "47.837926883s"
                },
                "alias": "m",
                "on_clause": "(ceil((3600 * floor((((`o`.`fcstValidEpoch`) + (3600 / 2)) / 3600)))) = ceil((3600 * floor((((`m`.`fcstValidEpoch`) + (3600 / 2)) / 3600)))))",
                "~child": {
                  "#operator": "Sequence",
                  "#stats": {
                    "#phaseSwitches": 1166,
                    "execTime": "39.422477307s",
                    "kernTime": "1.750126ms",
                    "state": "running"
                  },
                  "~children": [
                    {
                      "#operator": "IntersectScan",
                      "#stats": {
                        "#itemsIn": 3781568,
                        "#itemsOut": 7682,
                        "#phaseSwitches": 7583747,
                        "execTime": "5.114748981s",
                        "kernTime": "5.684166012s"
                      },
                      "scans": [
                        {
                          "#operator": "IndexScan3",
                          "#stats": {
                            "#itemsOut": 1906903,
                            "#phaseSwitches": 7631940,
                            "execTime": "5.033923583s",
                            "kernTime": "2.487471313s",
                            "servTime": "2.783055257s"
                          },
                          "as": "m",
                          "index": "idx_type_docType_version_fcstValidEpoch_METAR",
                          "index_id": "acf9f11cc0879013",
                          "index_projection": {
                            "primary_key": true
                          },
                          "keyspace": "mdata",
                          "namespace": "default",
                          "nested_loop": true,
                          "spans": [
                            {
                              "range": [
                                {
                                  "high": "\"DD\"",
                                  "inclusion": 3,
                                  "low": "\"DD\""
                                },
                                {
                                  "high": "\"model\"",
                                  "inclusion": 3,
                                  "low": "\"model\""
                                },
                                {
                                  "high": "\"V01\"",
                                  "inclusion": 3,
                                  "low": "\"V01\""
                                },
                                {
                                  "high": "(1615401000 + 900)",
                                  "inclusion": 3,
                                  "low": "(1612805400 - 900)"
                                }
                              ]
                            }
                          ],
                          "using": "gsi"
                        },
                        {
                          "#operator": "IndexScan3",
                          "#stats": {
                            "#itemsOut": 7682,
                            "#phaseSwitches": 33643,
                            "execTime": "41.189062ms",
                            "kernTime": "54.520114ms",
                            "servTime": "10.109341365s"
                          },
                          "as": "m",
                          "index": "ix_subset_version_model_fcstLen_fcstValidEpoch_DD_model",
                          "index_id": "968e2954d5368e50",
                          "index_projection": {
                            "primary_key": true
                          },
                          "keyspace": "mdata",
                          "namespace": "default",
                          "nested_loop": true,
                          "spans": [
                            {
                              "range": [
                                {
                                  "high": "\"METAR\"",
                                  "inclusion": 3,
                                  "low": "\"METAR\""
                                },
                                {
                                  "high": "\"V01\"",
                                  "inclusion": 3,
                                  "low": "\"V01\""
                                },
                                {
                                  "high": "\"HRRR\"",
                                  "inclusion": 3,
                                  "low": "\"HRRR\""
                                },
                                {
                                  "high": "6",
                                  "inclusion": 3,
                                  "low": "6"
                                },
                                {
                                  "high": "(1615401000 + 900)",
                                  "inclusion": 3,
                                  "low": "(1612805400 - 900)"
                                },
                                {
                                  "high": "ceil((3600 * floor((((`o`.`fcstValidEpoch`) + (3600 / 2)) / 3600))))",
                                  "inclusion": 3,
                                  "low": "ceil((3600 * floor((((`o`.`fcstValidEpoch`) + (3600 / 2)) / 3600))))"
                                }
                              ]
                            }
                          ],
                          "using": "gsi"
                        },
                        {
                          "#operator": "IndexScan3",
                          "#stats": {
                            "#itemsOut": 1866983,
                            "#phaseSwitches": 7472279,
                            "execTime": "5.179590472s",
                            "kernTime": "2.449477912s",
                            "servTime": "2.909344991s"
                          },
                          "as": "m",
                          "index": "adv_fcstLen_model_fcstValidEpoch_type_docType_subset_version",
                          "index_id": "44cc5b35c7c4e605",
                          "index_projection": {
                            "primary_key": true
                          },
                          "keyspace": "mdata",
                          "namespace": "default",
                          "nested_loop": true,
                          "spans": [
                            {
                              "range": [
                                {
                                  "high": "6",
                                  "inclusion": 3,
                                  "low": "6"
                                },
                                {
                                  "high": "\"HRRR\"",
                                  "inclusion": 3,
                                  "low": "\"HRRR\""
                                },
                                {
                                  "high": "(1615401000 + 900)",
                                  "inclusion": 3,
                                  "low": "(1612805400 - 900)"
                                }
                              ]
                            }
                          ],
                          "using": "gsi"
                        }
                      ]
                    },
                    {
                      "#operator": "Fetch",
                      "#stats": {
                        "#itemsIn": 7682,
                        "#itemsOut": 7682,
                        "#phaseSwitches": 33949,
                        "execTime": "39.682079ms",
                        "kernTime": "10.801279003s",
                        "servTime": "1.126032525s"
                      },
                      "as": "m",
                      "keyspace": "mdata",
                      "namespace": "default",
                      "nested_loop": true
                    }
                  ]
                },
                "#time_normal": "00:00.289",
                "#time_absolute": 0.28966418299999996
              },
              {
                "#operator": "Let",
                "#stats": {
                  "#itemsIn": 7682,
                  "#itemsOut": 7682,
                  "#phaseSwitches": 30731,
                  "execTime": "51.289474254s",
                  "kernTime": "108.835001ms"
                },
                "bindings": [
                  {
                    "expr": "array {\"name\": (`station`.`name`), \"station\": (`station`.`Ceiling`)} for `station` in (`o`.`data`) when ((`station`.`name`) in [\"KBYG\", \"KGEY\", \"KSHR\"]) end",
                    "var": "observation"
                  },
                  {
                    "expr": "array {\"name\": (`station`.`name`), \"station\": (`station`.`Ceiling`)} for `station` in object_values((`m`.`data`)) when ((`station`.`name`) in [\"KBYG\", \"KGEY\", \"KSHR\"]) end",
                    "var": "prediction"
                  }
                ],
                "#time_normal": "00:51.289",
                "#time_absolute": 51.289474254
              },
              {
                "#operator": "Filter",
                "#stats": {
                  "#itemsIn": 7682,
                  "#itemsOut": 343,
                  "#phaseSwitches": 16053,
                  "execTime": "508.293338ms",
                  "kernTime": "50.890165348s"
                },
                "condition": "(((((((((((((((((`o`.`type`) = \"DD\") and ((`o`.`docType`) = \"obs\")) and ((`o`.`subset`) = \"METAR\")) and ((`o`.`version`) = \"V01\")) and ((1612805400 - 900) <= (`o`.`fcstValidEpoch`))) and ((`o`.`fcstValidEpoch`) <= (1615401000 + 900))) and (array_length(`observation`) = 3)) and ((`m`.`type`) = \"DD\")) and ((`m`.`docType`) = \"model\")) and ((`m`.`subset`) = \"METAR\")) and ((`m`.`version`) = \"V01\")) and ((`m`.`model`) = \"HRRR\")) and ((`m`.`fcstLen`) = 6)) and ((1612805400 - 900) <= (`m`.`fcstValidEpoch`))) and ((`m`.`fcstValidEpoch`) <= (1615401000 + 900))) and (array_length(`prediction`) = 3))",
                "#time_normal": "00:00.508",
                "#time_absolute": 0.508293338
              },
              {
                "#operator": "InitialProject",
                "#stats": {
                  "#itemsIn": 343,
                  "#itemsOut": 343,
                  "#phaseSwitches": 1034,
                  "execTime": "6.317129ms",
                  "kernTime": "51.390252994s"
                },
                "result_terms": [
                  {
                    "expr": "`observation`"
                  },
                  {
                    "expr": "`prediction`"
                  },
                  {
                    "as": "avtime",
                    "expr": "ceil((3600 * floor((((`o`.`fcstValidEpoch`) + (3600 / 2)) / 3600))))"
                  }
                ],
                "#time_normal": "00:00.006",
                "#time_absolute": 0.006317129
              },
              {
                "#operator": "FinalProject",
                "#stats": {
                  "#itemsIn": 343,
                  "#itemsOut": 343,
                  "#phaseSwitches": 1030,
                  "execTime": "649.917µs",
                  "kernTime": "988.742µs"
                },
                "#time_normal": "00:00.000",
                "#time_absolute": 0.0006499170000000001
              }
            ],
            "#time_normal": "00:00.000",
            "#time_absolute": 0.000008395
          }
        ],
        "#time_normal": "00:00.000",
        "#time_absolute": 0.000008696
      },
      "#time_normal": "00:00.001",
      "#time_absolute": 0.001028108
    },
    {
      "#operator": "Stream",
      "#stats": {
        "#itemsIn": 343,
        "#itemsOut": 343,
        "#phaseSwitches": 689,
        "execTime": "35.423083ms",
        "kernTime": "51.364131276s"
      },
      "#time_normal": "00:00.035",
      "#time_absolute": 0.035423083
    }
  ],
  "~versions": [
    "6.5.0-N1QL",
    "6.6.2-9588-enterprise"
  ],
  "#time_normal": "00:00.000",
  "#time_absolute": 0.000002174
}

@randy.pierce ,

Your query using IntersectScans, You should avoid intersectScan (How can an Intersect Scan return more items than MIN(inputs from respective indices). It's returning the MAX) by dropping unnecessary index or specify USE INDEX clause
mdata AS o USE INDEX (ix_subset_version_model_fcstLen_fcstValidEpoch_DD_model)
JOIN mdata AS m USE INDEX (ix_subset_version_model_fcstLen_fcstValidEpoch_DD_model)

Second JOIN produced documents, it needs to walk thorough two arrays and construct new arrays,
Later it throwing away Filter time 95% of documents are eliminated might be due to length of the array check.
That why i suggested in other post use Array index.

Thank you again!
randy

Thank you so much for your help with this. I still struggle with the end result of what I am trying to do.
What I really want to end up with is a set of contingencies for each fcstValidEpoch. This represents the contingencies, and I want to sum the contingencies (hits, misses, false_alarms, and correct_negatives) for
each fcstValidTime because depending on the number and value of predictions and observations there
can be different results. Obviously the commented COUNT… lines in m y query are not right. How do I apply this formula to the lists of predictions and observations in each result? I included my query and a snippet of the results, but of course I do not have any contingencies as they are commented out. I’ve worked at this for days now, any help is GREATLY appreciated.
This is my incorrect attempt at counting contingencies, it is the right formula, it just isn’t the right N1QL…

--COUNT (CASE WHEN (prediction[*].station.Ceiling < 1000 AND observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) hits,
--COUNT (CASE WHEN (prediction[*].station.Ceiling < 1000 AND NOT observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) fals_alarms,
--COUNT (CASE WHEN (NOT prediction[*].station.Ceiling < 1000 AND observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) misses,
--COUNT (CASE WHEN (NOT prediction[*].station.Ceiling < 1000 AND NOT observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) correct_negatives


This is my current query…

SELECT
observation observations,
prediction predictions,
ceil(3600*floor(((m0.fcstValidEpoch)+3600/2)/3600)) avtime,
COUNT(DISTINCT m0.fcstValidEpoch) N_times,
MIN(m0.fcstValidEpoch) min_secs,
MAX(m0.fcstValidEpoch) max_secs,
ARRAY_LENGTH(prediction) AS N0
--COUNT (CASE WHEN (prediction[*].station.Ceiling < 1000 AND observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) hits,
--COUNT (CASE WHEN (prediction[*].station.Ceiling < 1000 AND NOT observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) fals_alarms,
--COUNT (CASE WHEN (NOT prediction[*].station.Ceiling < 1000 AND observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) misses,
--COUNT (CASE WHEN (NOT prediction[*].station.Ceiling < 1000 AND NOT observation[*].station.Ceiling < 1000) THEN 1 ELSE 0 END) correct_negatives
FROM mdata AS o
JOIN mdata AS m0 ON CEIL(3600*FLOOR(((o.fcstValidEpoch)+3600/2)/3600)) = CEIL(3600*FLOOR(((m0.fcstValidEpoch)+3600/2)/3600))
LET observation = (ARRAY {"station":station,"name":station.name} FOR station IN o.data WHEN station.name IN ["KBYG","KGEY","KSHR"] END),
prediction = (ARRAY {"prediction":station,"name":station.name} FOR station IN OBJECT_VALUES(m0.data) WHEN station.name IN ["KBYG","KGEY","KSHR"] END)
WHERE o.type='DD'
AND o.docType='obs'
AND o.subset='METAR'
AND o.version='V01'
AND o.fcstValidEpoch >= 1612805400 - 900
AND o.fcstValidEpoch <= 1615401000 + 900
AND ARRAY_LENGTH(observation) = 3
AND m0.type='DD'
AND m0.docType='model'
AND m0.subset='METAR'
AND m0.version='V01'
AND m0.model="HRRR"
AND m0.fcstLen=6
AND m0.fcstValidEpoch >= 1612805400 - 900
AND m0.fcstValidEpoch <= 1615401000 + 900
AND CEIL(3600*FLOOR(((m0.fcstValidEpoch)+3600/2)/3600)) = CEIL(3600*FLOOR(((o.fcstValidEpoch)+3600/2)/3600))
AND ARRAY_LENGTH(prediction) = 3
GROUP BY m0.fcstValidEpoch,
         observation,
         prediction
ORDER BY m0.fcstValidEpoch

This is a snippet of my current result set…

[
  {
    "N0": 3,
    "N_times": 1,
    "avtime": 1612807200,
    "max_secs": 1612806780,
    "min_secs": 1612806780,
    "observations": [
      {
        "name": "KBYG",
        "station": {
          "Ceiling": 100,
          "DewPoint": -71,
          "Reported Time": 1612807200,
          "Surface Pressure": 10085,
          "Temperature": -9,
          "Visibility": 1000,
          "WD": 330,
          "WS": 9,
          "name": "KBYG"
        }
      },
      {
        "name": "KSHR",
        "station": {
          "Ceiling": 800,
          "DewPoint": -80,
          "Reported Time": 1612807200,
          "Surface Pressure": 10129,
          "Temperature": 19,
          "Visibility": 1000,
          "WD": 340,
          "WS": 5,
          "name": "KSHR"
        }
      },
      {
        "name": "KGEY",
        "station": {
          "Ceiling": 290,
          "DewPoint": 19,
          "Reported Time": 1612807200,
          "Surface Pressure": 10112,
          "Temperature": 100,
          "Visibility": 175,
          "WD": 0,
          "WS": 0,
          "name": "KGEY"
        }
      }
    ],
    "predictions": [
      {
        "name": "KBYG",
        "prediction": {
          "Ceiling": 6084,
          "DewPoint": -45,
          "RH": 531,
          "Reported Time": 1612806780,
          "Surface Pressure": 8457,
          "Temperature": 101,
          "VGTYP": 10,
          "Visibility": 2175,
          "WD": 108,
          "WS": 5,
          "name": "KBYG"
        }
      },
      {
        "name": "KGEY",
        "prediction": {
          "Ceiling": 6162,
          "DewPoint": -20,
          "RH": 521,
          "Reported Time": 1612806780,
          "Surface Pressure": 8763,
          "Temperature": 130,
          "VGTYP": 10,
          "Visibility": 2094,
          "WD": 153,
          "WS": 4,
          "name": "KGEY"
        }
      },
      {
        "name": "KSHR",
        "prediction": {
          "Ceiling": 6176,
          "DewPoint": -55,
          "RH": 742,
          "Reported Time": 1612806780,
          "Surface Pressure": 8774,
          "Temperature": 14,
          "VGTYP": 10,
          "Visibility": 1373,
          "WD": 352,
          "WS": 5,
          "name": "KSHR"
        }
      }
    ]
  },
  {
    "N0": 3,
    "N_times": 1,
    "avtime": 1612821600,
    "max_secs": 1612821180,
    "min_secs": 1612821180,
    "observations": [
      {
        "name": "KBYG",
        "station": {
          "Ceiling": 260,
          "DewPoint": -80,
          "Reported Time": 1612821600,
          "Surface Pressure": 10115,
          "Temperature": -29,
          "Visibility": 150,
          "WD": 350,
          "WS": 18,
          "name": "KBYG"
        }
      },
      {
        "name": "KSHR",
        "station": {
          "Ceiling": 550,
          "DewPoint": -99,
          "Reported Time": 1612821600,
          "Surface Pressure": 10166,
          "Temperature": -29,
          "Visibility": 200,
          "WD": 300,
          "WS": 21,
          "name": "KSHR"
        }
      },
      {
        "name": "KGEY",
        "station": {
          "Ceiling": 900,
          "DewPoint": -60,
          "Reported Time": 1612821600,
          "Surface Pressure": 10136,
          "Temperature": 10,
          "Visibility": 1000,
          "WD": 310,
          "WS": 21,
          "name": "KGEY"
        }
      }
    ],
    "predictions": [
      {
        "name": "KBYG",
        "prediction": {
          "Ceiling": 6084,
          "DewPoint": 6,
          "RH": 864,
          "Reported Time": 1612821180,
          "Surface Pressure": 8470,
          "Temperature": 41,
          "VGTYP": 10,
          "Visibility": 957,
          "WD": 341,
          "WS": 13,
          "name": "KBYG"
        }
      },
      {
        "name": "KGEY",
        "prediction": {
          "Ceiling": 764,
          "DewPoint": 114,
          "RH": 733,
          "Reported Time": 1612821180,
          "Surface Pressure": 8759,
          "Temperature": 189,
          "VGTYP": 10,
          "Visibility": 1603,
          "WD": 342,
          "WS": 15,
          "name": "KGEY"
        }
      },
      {
        "name": "KSHR",
        "prediction": {
          "Ceiling": 92,
          "DewPoint": 20,
          "RH": 820,
          "Reported Time": 1612821180,
          "Surface Pressure": 8786,
          "Temperature": 66,
          "VGTYP": 10,
          "Visibility": 1125,
          "WD": 322,
          "WS": 7,
          "name": "KSHR"
        }
      }
    ]
  }, 
....

Thanks in advance,

@randy.pierce ,

prediction[*].station.Ceiling will become array , < 1000 will not give right results.
Not sure what is your logic any element match or all element match etc. Also you want use SUM vs COUNT. COUNT, counts for all non non-MISSING, non-NULL values (if you need to use COUNT change ELSE 0 to ELSE NULL).

SUM( CASE WHEN (ANY v IN prediction SATISFIES v.station.Ceiling < 1000 END AND ANY v IN observation SATISFIES v.station.Ceiling < 1000 END )TEHN 1 ELSE 0 END)

SUM( CASE WHEN (ANY AND EVERY v IN prediction SATISFIES v.station.Ceiling < 1000 END AND ANY AND EVERY v IN observation SATISFIES v.station.Ceiling < 1000 END) TEHN 1 ELSE 0 END)