Problem with memory when running larger N1QL query


#1

Hi,

we seem to be running into a problem with memory when we run larger N1QL queries. The error message that we receive is:

Error: An unknown N1QL error occurred. This is usually related to an out-of-memory condition.

We tried allocating more memory to the N1QL service, but it did not help. Any additional information on this issue would really help us out. If you need anything else from me, please let me know.

Cheers

Mike Maik


#2

Hi @MikeMaik,

Can you post your query here?

How many results are being returned?

How large is each result (you can post one sample result here, using LIMIT 1)?

What version of Couchbase are you running?

How much memory does your system have?


#3

SELECT *
FROM data t
LEFT JOIN data c ON KEYS t.conversion
WHERE t.type = "transaction"
AND c.type = "conversion"
AND t.transaction_creationDate >= 1454284800000
AND t.transaction_creationDate <= 1456704000000
LIMIT {{limit}}
OFFSET {{offset}};

We get about 20,000 results returned before the system returns the error. Our limit/offset is 1000.
A single result looks like:

[
  {
    "c": {
      "transaction": "transaction:A1234567890",
      "createdAt": 1467801689486,
      "conversion_buyCurrency": "USD",
      "conversion_client": "client1",
      "conversion_clientBuyAmount": 100,
      "conversion_clientRate": 1.4289,
      "conversion_clientSellAmount": 50,
      "conversion_conversionDate": 1454284800000,
      "conversion_coreRate": 1.4289,
      "conversion_createdAt": 1454317346000,
      "conversion_currencyPair": "GBPUSD",
      "conversion_depositAmount": 0,
      "conversion_depositRequired": false,
      "conversion_depositStatus": "not_required",
      "conversion_fixedSide": "buy",
      "conversion_midMarketRate": 1.1234,
      "conversion_partnerBuyAmount": 0,
      "conversion_partnerSellAmount": 0,
      "conversion_partnerStatus": "funds_arrived",
      "conversion_sellCurrency": "GBP",
      "conversion_settlementDate": 1454344200000,
      "conversion_shortReference": "20160201-AAAAAA",
      "conversion_status": "trade_settled",
      "conversion_updatedAt": 1454348063000,
      "Payments": [
        "Payment:1234"
      ],
      "Settlement": "Settlement:1234",
      "type": "conversion",
      "updatedAt": 1468092288198
    },
    "t": {
      "transaction_api": "API",
      "transaction_deliveryMethod": "PICKUP",
      "transaction_fixedFee": 0.00,
      "transaction_folioNumber": "12345678",
      "transaction_fraudReported": 0,
      "transaction_id": "B1234567890",
      "transaction_lastUpdateDate": 1454385748000,
      "transaction_originalId": "20160201-AAAAAA",
      "transaction_originalSendingPrincipal": 44.12,
      "transaction_payer": "PAYER",
      "transaction_payerCode": "CL",
      "transaction_payerFolioNumber": "1234567",
      "transaction_payerLocationCode": "LOCATION",
      "transaction_paymentFolio": "B1234567890",
      "transaction_paymentStatus": "ACCEPTED",
      "transaction_paymentType": "DEBITCARD",
      "transaction_payoutCountry": "PHL",
      "transaction_payoutCurrency": "PHP",
      "transaction_payoutPrincipal": 2000.00,
      "transaction_purpose": "PURPOSE",
      "transaction_rate": 12.3456,
      "transaction_rateModifier": 0.00,
      "transaction_senderCity": "City",
      "transaction_senderCountry": "GBR",
      "transaction_senderId": 711815,
      "transaction_senderType": "INDIVIDUAL",
      "transaction_sendingCountry": "GBR",
      "transaction_sendingCurrency": "GBP",
      "transaction_sendingPrincipal": 50.00,
      "transaction_sequentialNumber": "8387",
      "transaction_settlementCreated": 1454025600000,
      "transaction_settlementCutOffTime": "EVENING",
      "transaction_settlementId": 56064,
      "transaction_settlementPayer": "Payer",
      "transaction_settlementShortReference": "20160129-AAAAAA",
      "transaction_settlementStatus": 2,
      "transaction_settlementSubmitTime": 1454340711000,
      "transaction_status": "CLOSED_PAID_OUT",
      "transaction_statusHistory": [
        {
          "createdAt": 1454285042000,
          "status": "PAYMENT_AWAITING_FUNDS"
        },
        {
          "createdAt": 1454319252000,
          "status": "PAID_AVAILABLE_FOR_PICK_UP"
        },
        {
          "createdAt": 1454385747000,
          "status": "CLOSED_PAID_OUT"
        }
      ],
      "transaction_totalSale": 100,
      "transaction_userId": 1234,
      "transaction_variableFee": 0,
      "createdAt": 1467800625935,
      "emisSettlement": "emisSettlement:20160202GBP",
      "emistransaction": "emistransaction:B1523072508",
      "payer": "payer:BDO",
      "conversion": "conversion:1234",
      "Payments": [
        "Payment:1234"
      ],
      "Settlement": "Settlement:1234",
      "type": "transaction",
      "updatedAt": 1468092288193
    }
  }
]

We are running Couchbase 4.5 and the machine has 24,165 MB of memory. Here is a breakdown:

Data RAM Quota: 8704 MB
Index RAM Quota: 2048 MB
Full Text RAM Quota: 512 MB

Thank you very much for the help and sorry it took me so long to respond.

Cheers

Mike Maik


#4

Could you please provide your index definition, or try the following index. Also, verify that the query is using the right index by running EXPLAIN.

create index idx_data_creationdate on data(`transaction_creationDate`) where type = "transaction";


#5

I forgot to mention that we already have an index, and it is exactly what you wrote above:

CREATE INDEX ext_conversion_idx ON data(transaction_creationDate) WHERE (type = "transaction")

Could the problem be caused by the pagination?


#6

Did you leave out an ORDER BY, or do you not sort at all?


#7

Using LIMIT and OFFSET should not require an ORDER BY in Couchbase 4.5. So no, we do not sort.


#8

Could you post the EXPLAIN? In an earlier post you mentioned that the query returns 20,000 results and then errors, while offset and limit are 1,000. Have these numbers changed?

To narrow down the issue, try without LEFT (the predicate c.type = "conversion" on the right side eliminates any NULL-projected rows anyway, so the LEFT JOIN behaves like an inner JOIN). Also try without the JOIN entirely.


#9

Here is the EXPLAIN for the query:
[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "index": "ext_conversion_idx",
              "index_id": "9b50de4c2bd6d911",
              "keyspace": "data",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "successor(1456704000000)"
                    ],
                    "Inclusion": 1,
                    "Low": [
                      "1454284800000"
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Fetch",
                    "as": "t",
                    "keyspace": "data",
                    "namespace": "default"
                  },
                  {
                    "#operator": "Join",
                    "as": "c",
                    "keyspace": "data",
                    "namespace": "default",
                    "on_keys": "(t.conversion)"
                  },
                  {
                    "#operator": "Filter",
                    "condition": "((((((t.type) = \"transaction\") and ((c.type) = \"conversion\")) and ((t.isTemporary) is missing)) and (1454284800000 <= (t.transaction_creationDate))) and ((t.transaction_creationDate) <= 1456704000000))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "self",
                        "star": true
                      }
                    ]
                  },
                  {
                    "#operator": "FinalProject"
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Offset",
          "expr": "0"
        },
        {
          "#operator": "Limit",
          "expr": "1000"
        }
      ]
    },
    "text": "SELECT *\nFROM data t\nUSE INDEX (ext_conversion_idx)\nJOIN data c ON KEYS t.tccConversion\nWHERE t.type = \"azimoTransaction\"\nAND c.type = \"tccConversion\"\nAND t.azimoTransaction_creationDate >= 1454284800000\nAND t.azimoTransaction_creationDate <= 1456704000000\nLIMIT 1000\nOFFSET 0;"
  }
]

So the strange thing is that any single query, with any LIMIT and OFFSET, goes through just fine. But when we download the data using pagination of 1,000 documents per page, it stops working around 20,000 documents in; that is why I mentioned this number. It seems the problem is not with any individual query but with fetching the full data set one page at a time.

We think it must be a memory issue, since the individual queries go through fine but the full run does not. We have tried different ways of allocating the memory, but nothing seems to help. Is there anything you can suggest?

Cheers

Mike Maik


#10

N1QL streams its results, so it is surprising that N1QL gives an error for 20K documents. Check resultSize at the end of the query: is it more than request-size-cap (default 67108864 bytes)?

Could you please try the query in the cbq shell without LIMIT and OFFSET?

If the result set is large, the best way to paginate with low memory consumption is to prepare the statement with offset and limit as parameters, then execute the prepared statement repeatedly, increasing the offset by the limit as long as resultCount == limit.
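As a rough illustration, the loop described above can be sketched like this in plain Python. `run_page(offset, limit)` is a hypothetical stand-in for executing the prepared N1QL statement with $offset/$limit parameters; the function name and the in-memory data are illustrative, not a Couchbase API.

```python
def paginate(run_page, limit):
    """Keep advancing the offset by `limit` until a short page comes back."""
    offset = 0
    while True:
        results = run_page(offset, limit)
        yield from results
        if len(results) < limit:          # short page => no more rows
            return
        offset += limit                   # resultCount == limit: fetch next page

# Stand-in for the server side: 2,345 fake result rows.
data = list(range(2345))
run_page = lambda offset, limit: data[offset:offset + limit]

fetched = list(paginate(run_page, 1000))
assert fetched == data                    # all rows fetched, 1,000 at a time
```

The termination test (a page shorter than the limit) is exactly the "if resultCount == limit, increase the offset" rule stated above, turned around.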


#11

Hi,

I ran the query in the cbq shell (without LIMIT/OFFSET) and it streamed just fine. Here are the metrics:

"status": "success",
"metrics": {
    "elapsedTime": "20m45.528252049s",
    "executionTime": "20m45.528123318s",
    "resultCount": 302922,
    "resultSize": 1992956528
}
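For what it is worth, putting these metrics next to the default request-size-cap mentioned earlier gives a quick sanity check (plain arithmetic on the numbers above, nothing Couchbase-specific):

```python
result_size = 1992956528     # resultSize from the metrics above, in bytes
result_count = 302922        # resultCount from the metrics above
cap = 67108864               # default request-size-cap: 64 MiB

# The full result set is roughly 30x the per-request cap, but a single
# 1,000-row page averages only a few MB, well under the cap.
assert result_size > cap
print(result_size / cap)               # ~29.7
print(result_size // result_count)     # average bytes per row
```

So the full unpaginated stream far exceeds the cap, while each individual page should not, which is consistent with single pages succeeding.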

I will test it again today to see whether it works consistently. However, this does not really help me, because I need to download the results and cannot do that in a single request for such a big query. This is why we are doing the pagination.

We do the pagination exactly the way you suggest: a limit of 1,000, with the offset increased by 1,000 each time until the entire result set has been fetched. However, the error still occurs somewhere along the line. I have no idea what is causing it, because the pagination should actually reduce memory consumption.

I really don’t know what to do to avoid this issue.

Cheers

Mike Maik


#12

What client are you using? Is it possible to post the SQL from the client?


#13

Hi @MikeMaik,

The results do not come back in the same order from one query to the next, so you should use an ORDER BY. You should also make sure the sort uses your index, by sorting on the leading keys of the index.
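A small, self-contained simulation (plain Python, not Couchbase code) of the pitfall described above: if the server is free to return rows in a different order on each request, OFFSET/LIMIT pagination silently skips some rows and returns others twice. Here the "changing order" is modelled as a one-step rotation of the result set before every page request.

```python
def fetch_all_pages(rows, limit, reorder):
    """Fetch every page; `reorder` mimics the server changing row order."""
    collected, offset = [], 0
    while True:
        rows = reorder(rows)              # order may change between requests
        page = rows[offset:offset + limit]
        if not page:
            return collected
        collected.extend(page)
        offset += limit

rotate = lambda r: r[1:] + r[:1]          # unstable: order shifts each request
identity = lambda r: r                    # stable: same order (i.e. ORDER BY)

rows = list(range(100))
unstable = fetch_all_pages(rows, 10, rotate)
stable = fetch_all_pages(rows, 10, identity)

assert sorted(stable) == rows             # stable order: every row exactly once
assert sorted(set(unstable)) != rows      # unstable order: rows lost and duplicated
print(len(set(unstable)))                 # 91 distinct rows: 9 lost, 9 fetched twice
```

The paginated download still "works" page by page, but the combined result is quietly wrong, which is why a deterministic ORDER BY matters for pagination independently of any memory question.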


#14

Hi,

we seem to have resolved the problem by eliminating the pagination altogether and just increasing the limit we allow per downloaded document. Simply streaming the data has increased our download speed and got rid of all of the "out of memory" errors. I'll be honest: I'm not sure why this worked, but the problem is resolved.
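For anyone landing here later, the streaming approach can be sketched like this: consume the one large result set row by row and write each row out immediately, so client-side memory stays bounded regardless of result size. `stream_rows` is a hypothetical stand-in for a client that yields decoded rows as they arrive; the name is illustrative, not a specific SDK API.

```python
import io
import json

def download(stream_rows, out):
    """Write each row as one line of JSON (NDJSON) as soon as it arrives."""
    count = 0
    for row in stream_rows:
        out.write(json.dumps(row) + "\n")
        count += 1
    return count

# Stand-in stream: a generator, so rows are never all held in memory at once.
rows = ({"transaction_id": i} for i in range(3))
buf = io.StringIO()
assert download(rows, buf) == 3
print(buf.getvalue().splitlines()[0])     # {"transaction_id": 0}
```

Because nothing is buffered beyond the current row, this also explains why dropping pagination could be both faster (one query instead of hundreds) and free of the memory errors.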

Thanks for the help.

Cheers

Mike Maik