I’m storing lists of commonly used passwords (brute-force lists) and want to quickly check, when a user registers, whether the chosen password appears in any of these lists.
The passwords are stored in text files, one per line. For each text file, a Couchbase document is created that contains all of its passwords in an array. The documents are inserted with the following UPSERT query:
UPSERT INTO `blacklist`
VALUES ("password-blacklist::" || $name, {
    "type": "password-blacklist",
    "name": $name,
    "passwords": $passwords
})
$name is the name of the imported file (without the extension) and $passwords is the array of all passwords stored in that text file. I’ve uploaded the lists for you; they are also loaded in the Enterprise container that I’ve already uploaded, so you could also export them from there if you have it running.
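To make the import step concrete, here is a minimal sketch of how the two query parameters could be built from one text file. It is not my actual import code: the function names `build_upsert_params` and `document_key` are made up for illustration, and the UTF-8 encoding and the skipping of empty lines are assumptions. It only prepares the values and does not talk to Couchbase.

```python
from pathlib import Path


def build_upsert_params(path):
    """Build the $name and $passwords values for one password file.

    Hypothetical helper: assumes UTF-8 text with one password per line
    and skips empty lines (both are assumptions, not confirmed behaviour).
    """
    file_path = Path(path)
    with file_path.open(encoding="utf-8", errors="replace") as handle:
        # Strip only the line terminator so spaces inside passwords survive.
        lines = [line.rstrip("\r\n") for line in handle]
    passwords = [entry for entry in lines if entry]
    return {
        "name": file_path.stem,  # file name without the extension
        "passwords": passwords,
    }


def document_key(name):
    """Mirror the key expression used in the UPSERT query."""
    return "password-blacklist::" + name
```

The returned dictionary would then be passed as the named parameters of the UPSERT query through whichever SDK is in use.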
So here are some more details about the files:
- In most documents the array contains exactly 1,500,000 passwords. The others contain between 20,000 and ~800,000 elements.
- Password length varies. Since these are commonly used passwords, they aren’t very long: almost all of them should be under 20 characters. A few entries are up to 255 characters long; these exceptions are mostly bad data (e.g. fragments of an HTML page) that somehow got into the dataset.