Cyberknowledge’s blog post “Analyzing 20,000 MySpace Passwords” has some interesting tidbits about user behavior, and it certainly underscores the importance of watching out for phishing attacks and choosing a good password.
Their analysis of password “strength” leaves a lot to be desired: it takes into account neither the length of the password nor whether it is a simple dictionary word. A password strength metric should assign zero points to dictionary words, and otherwise assign some number of points to each letter, slightly more points to each number, and still more points to each symbol. Because every character adds points, longer passwords naturally score higher. Once you have this metric, you can allow only passwords that score above a certain threshold.
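Here is a rough sketch of what such a metric might look like in Python. The point values (1, 2, 3), the threshold of 12, and the tiny word list are all my own placeholder assumptions, not anything from the original analysis; a real version would load a proper dictionary file and tune the numbers.

```python
# Hypothetical stand-in for a real word list; a real check would
# load a large dictionary file instead of this handful of words.
DICTIONARY_WORDS = {"password", "monkey", "soccer", "myspace"}

# Assumed point values. The only requirement described above is
# letters < numbers < symbols; the exact numbers are placeholders.
LETTER_POINTS = 1
DIGIT_POINTS = 2
SYMBOL_POINTS = 3


def password_score(password, dictionary=DICTIONARY_WORDS):
    """Return 0 for dictionary words, else a per-character point total."""
    if password.lower() in dictionary:
        return 0
    score = 0
    for ch in password:
        if ch.isalpha():
            score += LETTER_POINTS
        elif ch.isdigit():
            score += DIGIT_POINTS
        else:
            # Anything that is neither a letter nor a digit counts as a symbol.
            score += SYMBOL_POINTS
    return score


def is_acceptable(password, threshold=12):
    """Allow only passwords that score strictly over the (assumed) threshold."""
    return password_score(password) > threshold
```

With these placeholder values, `password_score("abc123!")` adds up three letters, three digits, and one symbol, while `password_score("password")` is zero because it is a plain dictionary word.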
Many people have pointed out that the data set probably contains many fake entries from people who realized that they were being phished. The data could be drastically improved by writing a quick script that attempted to log in to MySpace with each email and password to weed out the fake or incorrect entries, although actually doing that would probably be illegal and is definitely morally questionable.
If you’re willing to write the script, I will run it on the passwords.