Intuitively, the second is better, because it's harder to guess--even though they both use the same letters, and the second collection of letters is shorter (7 letters instead of 8).
This is the information theory concept of entropy[1]. Once you're familiar with entropy, you'll see it pop up everywhere in computing and everywhere in life. Intuitively, a set of rules that generates passwords that are harder to guess has higher entropy than a set of rules that generates passwords that are easier to guess.
If your set of rules for generating passwords has high entropy, that means it will take more attempts to guess your password. High entropy is crucial because an attacker might be able to make guesses very, very, very quickly. Thankfully, you can use rules that can outrun the guess rate of an attacker. This is why some web sites have obnoxious rules about including a mix of characters that makes your passwords hard to remember: the rules make them hard to guess as well.
So why is "ospawdr" better than "password"? Let's look at the rules that generated each password.
"password" is a word in the dictionary. Worse, it's a common default password. As a common default password, a manual attacker might try it within the first half-dozen attempts (and laugh hysterically when it works). But let's be generous and say the rules you used to arrive at the password "password" are just that it's a common English word. How many English words are there? Let's say there are a million. But "password" is a common English word--a clever attacker would surely try common words first, no? The second edition of the Oxford English Dictionary contains under 175,000 entries--most of which you probably haven't heard of. But rest assured, a computer can make 175,000 guesses within seconds. So with a generous estimate, we'll say we're guaranteed to guess "password" within the first 200,000 guesses.
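To make the "guaranteed within N guesses" idea concrete, here is a minimal Ruby sketch of a dictionary attack. The word list is a tiny hypothetical stand-in for a real dictionary, and `guesses_needed` is a name made up for this illustration. The point is only that the worst case is bounded by the length of the list.

```ruby
# Hypothetical stand-in for a dictionary, ordered from most to least common.
WORD_LIST = %w[love password dragon monkey sunshine princess].freeze

# Returns how many guesses an attacker walking the list in order needs,
# or nil if the target is not in the list at all.
def guesses_needed(target, word_list)
  index = word_list.index(target)
  index && index + 1
end

puts guesses_needed('password', WORD_LIST).inspect  # prints 2
puts guesses_needed('ospawdr', WORD_LIST).inspect   # prints nil
```

A password that is in the list falls in at most `WORD_LIST.size` guesses. One that is not in the list forces the attacker to switch to a different, larger set of rules.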
Now let's look at "ospawdr". My rules for making this password were to choose an arbitrary 7 letters from the word "password". "password" has 7 different letters, so my rules generate 7^7 = 823,543 different passwords. Much better!
There's a problem, though: my "arbitrary" 7 letters still don't have great entropy! Why? Because it turns out that humans are bad at making random-looking choices--what looks random to a human isn't as "random" as it should be! For example, I chose these characters in my head while looking at the word, and it turns out that I chose 7 distinct letters in a 7-character password from a 7-character alphabet. The number of truly random passwords that share this characteristic is 7 • 6 • 5 • 4 • 3 • 2 • 1 = 7! = 5,040 out of a space of 823,543. My "arbitrary" rules picked a password that was actually in a subset of about 0.6% of the space I thought I was using. What a disaster![2]
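The counting argument can be checked with a few lines of Ruby. This is a sketch under the assumptions stated in the text (a 7-letter alphabet and 7-character passwords); the variable names are illustrative only.

```ruby
alphabet = 'pasword'.chars   # the 7 distinct letters in "password"
length = 7

# Every position can hold any of the 7 letters: 7^7 strings.
total = alphabet.size**length

# Passwords that use each letter exactly once: 7 * 6 * ... * 1 = 7!
permutations = (1..alphabet.size).reduce(:*)

fraction = permutations.to_f / total

puts "all 7-letter strings: #{total}"                        # 823543
puts "permutations:         #{permutations}"                 # 5040
puts "share of the space:   #{(fraction * 100).round(2)}%"   # 0.61%
```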
Next: Entropy and Bits
[1] There's a related physics version of entropy.
[2] For comparison, here are 10 passwords I generated with a Ruby script using the rules I thought I was using:
dddrdrr wpswoow oaprrrr drpaawo sraddsd warsosr psdasso rdpdwao ooapwaw apsppwo
And the script:

    alphabet = 'pasword'
    10.times do
      password = ''
      # Pick each of the 7 characters independently, so repeats are allowed.
      7.times do
        password << alphabet.chars.sample
      end
      puts password
    end

Here are 10 sample "pasword" permutations:
posadwr owpdsar prowads podwasr sdawpro opadrsw rowsdpa wsaodpr rswdaop drsoapw
And the Ruby code:

    alphabet = 'pasword'
    10.times do
      # Shuffle the 7 letters, so each one appears exactly once.
      puts alphabet.chars.shuffle.join
    end

As an example of how the human mind is bad at picking out "random" data, here are the two next to each other.
Random characters (entropy ≈ 20 bits) | Random permutation (entropy ≈ 12 bits) |
---|---|
dddrdrr wpswoow oaprrrr drpaawo sraddsd warsosr psdasso rdpdwao ooapwaw apsppwo | posadwr owpdsar prowads podwasr sdawpro opadrsw rowsdpa wsaodpr rswdaop drsoapw |
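The bit counts for the two rule sets can be checked by taking the base-2 logarithm of the number of passwords each rule set can produce. This is a minimal sketch that assumes every password in a set is equally likely; `entropy_bits` is a helper name invented for this example.

```ruby
# Entropy in bits of a rule set that picks uniformly among `count` passwords.
def entropy_bits(count)
  Math.log2(count)
end

random_characters = 7**7                 # any of 7 letters in each of 7 positions
random_permutation = (1..7).reduce(:*)   # each of the 7 letters used exactly once

puts format('random characters:  %.1f bits', entropy_bits(random_characters))   # 19.7 bits
puts format('random permutation: %.1f bits', entropy_bits(random_permutation))  # 12.3 bits
```

Each bit of entropy doubles the number of guesses an attacker needs in the worst case, so a gap of about 7 bits means the permutation rules are roughly 160 times easier to exhaust.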
- xkcd on memorable master passwords
- Troy Hunt on why you should really, really, really use a password manager
- Ars Technica article on cracking passwords