Researchers: Microsoft's CAPTCHAs Easy to Solve

Microsoft's system to thwart automatic registrations of e-mail accounts leads to "a false sense of security," according to two researchers who have developed a low-cost way to break the security mechanism.

Jeff Yan and Ahmad Salah El Ahmad of the School of Computing Science at Newcastle University in the U.K. wrote in a research paper that their method can solve around 60 percent of Microsoft's CAPTCHAs used for validating registrations for its Windows Live Mail service.

A CAPTCHA (Completely Automated Public Turing test to Tell Computers and Humans Apart) is the distorted text that a person must decipher in order to be allowed to register for an e-mail account or perform other actions, such as post a comment, on a Web site. It's designed to prevent hackers from using automated tools for abusive purposes.

Microsoft could make its CAPTCHAs harder for computers to solve by, for instance, letting letters overlap, but that would also make them harder for people to read, said Yan, who lectures at Newcastle University.

Over the last few months, CAPTCHAs have become increasingly ineffective. The CAPTCHA systems used by free e-mail providers such as Microsoft, Google and Yahoo have been solved on a mass scale, leading to an increase in spam originating from their domains.

Details are scarce on how hackers are solving the CAPTCHAs in great numbers. It has been suspected that low-wage CAPTCHA solvers are being employed in order to get a steady stream of new e-mail accounts.

Yan and El Ahmad started their work in mid-2007. Microsoft was notified of the problems outlined in their paper in September 2007. The researchers released the paper a few days ago with Microsoft's blessing.

Overall, Microsoft's CAPTCHA system is well designed, and the company even holds three patents related to it, they wrote. But designing a foolproof CAPTCHA system isn't easy.

"To the best of our knowledge, this for the first time shows that a CAPTCHA that was carefully designed by serious professionals...is nevertheless vulnerable to novel but simple attacks," Yan and El Ahmad wrote.

In February, it was discovered that hackers were using a method that appeared to have a 30 percent to 35 percent success rate in solving the CAPTCHA used for Windows Live Hotmail.

Using their own analysis and algorithms, Yan and El Ahmad have almost doubled the success rate of the February attacks.

One of the hardest parts of breaking CAPTCHAs is separating the letters and putting the letters in the right order, a process known as segmentation. The twisting, wispy letters are confusing to machines, and humans are much better at sorting out extraneous lines.
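To illustrate why segmentation is the sticking point, here is a minimal Python sketch of one textbook approach, splitting a black-and-white image wherever a column contains no ink. It is not the researchers' method; the function name `segment_columns` and the toy images are hypothetical and chosen only for illustration.

```python
def segment_columns(image):
    """Split a binary image into character spans by finding empty columns.

    `image` is a list of rows, each a list of 0 (background) or 1 (ink).
    Returns a list of (start, end) column ranges, with `end` exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Count the ink pixels in every column.
    ink = [sum(row[x] for row in image) for x in range(width)]

    spans = []
    start = None
    for x, count in enumerate(ink):
        if count > 0 and start is None:
            start = x                    # a run of inked columns begins
        elif count == 0 and start is not None:
            spans.append((start, x))     # an empty column ends the run
            start = None
    if start is not None:
        spans.append((start, width))     # run reaches the right edge
    return spans


# Three well-separated toy "letters".
toy = [
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 0, 1, 1],
]
print(segment_columns(toy))      # [(0, 2), (4, 5), (7, 9)]

# The same letters with one horizontal stroke running through all of them.
linked = toy + [[1] * 9]
print(segment_columns(linked))   # [(0, 9)]
```

A single stroke joining the letters leaves no empty column, so this naive splitter returns one undivided blob. Connecting strokes of that kind are what make letters hard for a program to isolate.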

Yan and El Ahmad's analysis was performed with off-the-shelf hardware: a 1.86 GHz Intel Core 2 Duo CPU (central processing unit) with 2G bytes of RAM. Their seven-step method is capable of removing "arcs" or strokes that link letters and make letters hard to isolate.

Ninety-two percent of the time, they could isolate each of the eight characters used for Microsoft's CAPTCHA. Combined with character recognition techniques, the CAPTCHAs could be solved 61 percent of the time.
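Those two figures allow a rough back-of-the-envelope estimate of how well the recognition step alone would have to perform. The short Python calculation below is only a sketch: it assumes a CAPTCHA is solved only when segmentation succeeds, and that each of the eight characters is then recognized independently with the same accuracy. Neither assumption is stated in the reported figures.

```python
segmentation_rate = 0.92   # all eight characters isolated (reported)
overall_rate = 0.61        # whole CAPTCHA solved (reported)
num_chars = 8              # characters per CAPTCHA (reported)

# Assumption: the CAPTCHA is solved only if segmentation succeeded first.
recognition_given_segmented = overall_rate / segmentation_rate

# Assumption: characters are recognized independently with equal accuracy.
per_char_accuracy = recognition_given_segmented ** (1 / num_chars)

print(f"Recognition rate after successful segmentation: {recognition_given_segmented:.2f}")
print(f"Implied per-character accuracy: {per_char_accuracy:.2f}")
```

Under those assumptions, recognition succeeds on roughly two-thirds of correctly segmented CAPTCHAs, which corresponds to about 95 percent accuracy on each individual character.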

Their method also works against the latest CAPTCHAs deployed by Yahoo last month, although the success rates are not as high. Yan said he will soon release another research paper looking at Yahoo's CAPTCHAs.

Of the big three -- Yahoo, Microsoft and Google -- Google seems to have the most effective CAPTCHAs right now due to the difficulty automated programs have in separating the characters, Yan said.

"Actually I think at a high level, the idea of a CAPTCHA is a good one, but the devil is in the details," Yan said.