Monday, September 15, 2014

Entropy and Passwords - Computing for Everyone

Which is a better password (neither is good): "password" or "ospawdr"?
Intuitively, the second is better, because it's harder to guess--even though they both use the same letters, and the second collection of letters is shorter (7 letters instead of 8).

This is the information theory concept of entropy[1]. Once you're familiar with entropy, you'll see it pop up everywhere in computing and everywhere in life. Intuitively, a set of rules that generates passwords that are harder to guess has higher entropy than a set of rules that generates passwords that are easier to guess.

If your set of rules for generating passwords has high entropy, that means it will take more attempts to guess your password. High entropy is crucial because an attacker might be able to make guesses very, very, very quickly. Thankfully, you can use rules that can outrun the guess rate of an attacker. This is why some web sites have obnoxious rules about including a mix of characters that makes your passwords hard to remember: the rules make them hard to guess as well.

So why is "ospawdr" better than "password"? Let's look at the rules that generated each password.
"password" is a word in the dictionary. Worse, it's a common default password, so a manual attacker might try it within the first half-dozen attempts (and laugh hysterically when it works). But let's be generous and say the rule you used to arrive at the password "password" is just that it's a common English word. How many English words are there? Let's say there are a million. But "password" is a common English word, and a clever attacker would surely try common words first, no? The second edition of the Oxford English Dictionary has full entries for roughly 171,000 words in current use, most of which you probably haven't heard of. Rest assured, a computer can make that many guesses within seconds. So with a generous estimate, we'll say an attacker is guaranteed to guess "password" within the first 200,000 guesses.
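
To put "within seconds" in perspective, here is a minimal sketch of the arithmetic. The guess rates below are hypothetical numbers I picked for illustration, not measurements of any real attacker, and the names WORD_LIST_SIZE, GUESS_RATES and seconds_to_exhaust are mine.

```ruby
# Worst-case time to try every word in a 200,000-word list.
WORD_LIST_SIZE = 200_000

# Hypothetical guess rates in guesses per second (assumptions, not measurements).
GUESS_RATES = {
  'rate-limited online login' => 10,
  'unthrottled online login' => 1_000,
  'offline attack on a leaked hash' => 1_000_000
}.freeze

# Seconds needed to try every candidate once at the given rate.
def seconds_to_exhaust(candidates, guesses_per_second)
  candidates.to_f / guesses_per_second
end

GUESS_RATES.each do |scenario, rate|
  puts format('%-32s %10.1f s', scenario, seconds_to_exhaust(WORD_LIST_SIZE, rate))
end
```

Even at the slowest assumed rate the whole list is exhausted in under six hours, and on average the attacker only needs to get halfway through it.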

Now let's look at "ospawdr". My rules for making this password were to choose an arbitrary 7 letters from the word "password". "password" has 7 different letters, so my rules generate 7^7 = 823,543 different passwords. Much better!
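
The count can be checked with a few lines of Ruby. This is only a sketch of the counting argument; the names PASSWORD_LETTERS, PASSWORD_LENGTH and keyspace_size are mine, and it assumes each of the 7 positions is filled independently, with repeats allowed.

```ruby
# Distinct letters of "password", in order of first appearance.
PASSWORD_LETTERS = 'password'.chars.uniq
PASSWORD_LENGTH = 7

# Each position can hold any letter of the alphabet, so the number of
# possible passwords is alphabet_size raised to the password length.
def keyspace_size(alphabet_size, length)
  alphabet_size**length
end

puts "distinct letters:   #{PASSWORD_LETTERS.join} (#{PASSWORD_LETTERS.size})"
puts "possible passwords: #{keyspace_size(PASSWORD_LETTERS.size, PASSWORD_LENGTH)}"
```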

There's a problem, though: my "arbitrary" 7 letters still don't have great entropy! Why? Because it turns out that humans are bad at making random-looking choices: what looks random to a human isn't as "random" as it should be! For example, I chose these characters in my head while looking at the word, and it turns out that I chose 7 distinct letters in a 7-character password from a 7-character alphabet. The number of truly random passwords that share this characteristic is 7 × 6 × 5 × 4 × 3 × 2 × 1 = 7! = 5,040 out of a space of 823,543. My "arbitrary" rules picked a password that was actually in a subset of about 0.6% of the space I thought I was using. What a disaster![2]
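
As a quick check of that arithmetic, the sketch below counts the orderings of 7 distinct letters and compares them with the full space. The helper names factorial and distinct_fraction are mine, introduced only for this illustration.

```ruby
# Number of orderings of n distinct letters: n * (n - 1) * ... * 1.
def factorial(n)
  (1..n).reduce(1, :*)
end

# Share of all n-letter strings over an n-letter alphabet
# in which every letter appears exactly once.
def distinct_fraction(n)
  factorial(n).to_f / n**n
end

n = 7
puts "all-distinct passwords: #{factorial(n)}"
puts "all passwords:          #{n**n}"
puts format('share of the space:     %.2f%%', distinct_fraction(n) * 100)
```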

Next: Entropy and Bits

[1] There's a related physics version of entropy.
[2] For comparison, here are 10 passwords I generated with a Ruby script using the rules I thought I was using:

dddrdrr
wpswoow
oaprrrr
drpaawo
sraddsd
warsosr
psdasso
rdpdwao
ooapwaw
apsppwo
And the script:
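# Print 10 passwords of 7 characters, each character drawn independently
# (repeats allowed) from the alphabet. Array#sample uses Ruby's default,
# non-cryptographic generator, which is fine for a demonstration.
alphabet = 'pasword'
10.times do
  password = ''
  7.times do
    password << alphabet.chars.sample
  end
  puts password
end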

These script-generated passwords don't "look as random," but they're actually harder to guess than the original. This is why I recommend using a password manager and generator.

Here are 10 sample "pasword" permutations:
posadwr
owpdsar
prowads
podwasr
sdawpro
opadrsw
rowsdpa
wsaodpr
rswdaop
drsoapw

And the Ruby code:
# Print 10 random orderings of the 7 distinct letters (no repeats).
alphabet = 'pasword'
10.times do
  puts alphabet.chars.shuffle.join
end

As an example of how the human mind is bad at picking out "random" data, here are the two next to each other.

Random characters        Random permutation
(entropy ≈ 20 bits)      (entropy ≈ 12 bits)
dddrdrr                  posadwr
wpswoow                  owpdsar
oaprrrr                  prowads
drpaawo                  podwasr
sraddsd                  sdawpro
warsosr                  opadrsw
psdasso                  rowsdpa
rdpdwao                  wsaodpr
ooapwaw                  rswdaop
apsppwo                  drsoapw
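
The bit counts in the headings are base-2 logarithms of the number of equally likely passwords each rule can produce. Here is a minimal sketch of that calculation, assuming every password under a rule is equally likely; entropy_bits is my own helper name.

```ruby
# Entropy, in bits, of a rule that picks uniformly among `count` passwords.
def entropy_bits(count)
  Math.log2(count)
end

random_characters = 7**7                  # any of 7 letters in each of 7 positions
random_permutations = (1..7).reduce(:*)   # orderings of 7 distinct letters

puts format('random characters:  %.1f bits', entropy_bits(random_characters))
puts format('random permutation: %.1f bits', entropy_bits(random_permutations))
```

Each extra bit doubles the number of guesses an attacker needs, so the gap between the two columns is larger than it looks.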
  
For more on passwords, see
