This is interesting. The first thing to note is that you’re seeing the following value for the document with key “baaa”:
“test”: “<binary (32 b)>”
This is because the document, as it’s currently stored, is not valid JSON; that is, it’s being treated as binary data. In theory, the “cbimport json -f list” command will only successfully import data if it’s valid JSON, so after a successful import I would expect the imported documents to also be valid JSON.
I’ve had a go at reproducing this issue with the data you’ve provided on a couple of different releases of Couchbase Server, and I’m currently unable to do so. Please could you provide me with some additional information to aid with the debugging process?
What version of Couchbase server are you using?
Could you provide an attachment with the exact file you’re using?
Below are the steps I’m currently using to try to reproduce this issue:
Create a cluster with the data service, the indexing service and the query service
Create a Couchbase bucket
Use cbimport to load the data
Create a primary index on the created bucket (using the query workbench)
Run the “select * from bucket” query
Inspect the results to determine if there are any which are binary data
Are there any other steps which I should be taking to match the behavior you’re seeing?
The 7.0.0-beta cbimport tool switched to using a different JSON library; this library had a bug which would result in whitespace not being removed when parsing raw JSON values. If you look at the example document below, you’ll see that the value is prefixed with a carriage return/newline.
Please note the ‘0d0a’, which is ‘\r\n’, prefixed to the document value. This has already been fixed in the underlying library, and we updated the dependency in MB-44424, meaning this issue should be resolved for the second 7.0.0-beta.
This is interesting because, as discussed in the JSON RFC, this is completely valid JSON:
Insignificant whitespace is allowed before or after any of the six structural characters.
Although these documents are valid according to cbimport and the data service, they’re not valid according to the query service; this is the reason for the query returning documents which are binary data. I’ve raised MB-44423 to have this issue looked into.
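To illustrate the whitespace point above, here is a minimal Python sketch. The document value is a made-up stand-in (it is not the data from this thread), and Python's standard json module is used only as an example of a parser that follows the RFC; it says nothing about how cbimport or the query service parse values internally.

```python
import json

# Hypothetical raw document value: valid JSON prefixed with "\r\n",
# similar in shape to what the affected cbimport build would store.
raw_value = b'\r\n{"test": "value"}'

# The first two bytes in hex are 0d0a (carriage return, line feed).
hex_prefix = raw_value[:2].hex()

# Leading insignificant whitespace is permitted by the JSON RFC,
# so an RFC-following parser accepts the value unchanged.
parsed = json.loads(raw_value)

# Stripping the JSON whitespace characters yields an equivalent document.
stripped = raw_value.lstrip(b" \t\r\n")

print(hex_prefix)
print(parsed)
print(stripped)
```

If a parser rejects the prefixed value but accepts the stripped one, that points at the parser's whitespace handling rather than at the document itself.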
Regarding the error that you’re seeing when importing via the UI, I believe this issue has already been addressed. Please see the screenshot below which shows what the popup now looks like.
Thanks for the update. With regard to the UI import tab error, I believe you might be referring to MB-43946, in which case this issue has already been fixed. Please refer to the screenshot I shared previously, which shows what the completion dialog now looks like.