It looks like text-based CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) systems may soon be obsolete. A group of researchers based in the UK and China has announced a machine learning method for cracking the text-based CAPTCHA systems that once ruled the web and can still be found on the Internet today. Text-based CAPTCHAs use a jumble of letters, numbers and occluding lines to identify bots, which are automated, non-human users. According to the report, the new machine learning approach is both more efficient and more accurate than earlier CAPTCHA solvers.

The researchers, who are from Northwest University, Peking University, and Lancaster University, tested the new method on 33 different CAPTCHA schemes, including 11 used by some very popular websites. The experiment produced impressively accurate results.

According to a report from Naked Security by Sophos, the CAPTCHAs on the Chinese website Sohu were the easiest to crack, with 92 percent accuracy. This was followed by eBay (86.6 percent), (86 percent), Wikipedia (78 percent), and Microsoft (69.6 percent). Google’s text-based CAPTCHAs were the most resilient: the new method solved them with only 3 percent accuracy.

The surprising fact is that the new CAPTCHA solver needed only 500 genuine CAPTCHAs to refine itself, compared with the millions that earlier solvers required. In addition, the new solver can defeat a text-based CAPTCHA in about 0.05 seconds using just an ordinary PC and GPU.

Dr. Zheng Wang, a Senior Lecturer at Lancaster University’s School of Computing and Communications and co-author of the research, said: “We will be showing, for the first time that an adversary can be used to quickly launch an attack on a new text-based captcha scheme with very less effort. This is something to be worried about since this indicates that this first security defense of many websites is no longer reliable and safe.”

The solver is built on an artificial intelligence technique known as a Generative Adversarial Network (GAN), which was used to defeat the text-based CAPTCHA systems. A GAN has two parts: a generative network, which synthesizes large numbers of examples of the target (here, text-based CAPTCHAs), and a discriminative network, which assesses that output against real-world examples. The constant back-and-forth between the two networks helps both get better at their tasks as they gain experience.
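To make the generator-versus-discriminator cycle concrete, here is a minimal sketch of the idea on a toy one-dimensional problem. It is not the researchers' CAPTCHA solver: the class `ToyGAN`, its parameters, and the "real data" (numbers drawn around 4.0) are all assumptions made up for illustration, and real GANs use neural networks rather than the single-parameter models shown here.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


class ToyGAN:
    """Hypothetical 1-D toy, not the paper's model.

    Real samples come from N(4, 1). The generator is G(z) = z + b and the
    discriminator is D(x) = sigmoid(w * x + c).
    """

    def __init__(self, seed=0, lr=0.05, batch=32):
        self.rng = random.Random(seed)
        self.lr = lr
        self.batch = batch
        self.b = 0.0              # generator parameter
        self.w, self.c = 0.0, 0.0  # discriminator parameters

    def real_batch(self):
        return [self.rng.gauss(4.0, 1.0) for _ in range(self.batch)]

    def fake_batch(self):
        return [self.rng.gauss(0.0, 1.0) + self.b for _ in range(self.batch)]

    def score(self, x):
        # Discriminator's estimate that x is a real sample.
        return sigmoid(self.w * x + self.c)

    def discriminator_step(self):
        # Gradient ascent on log D(real) + log(1 - D(fake)).
        gw = gc = 0.0
        for x in self.real_batch():
            err = 1.0 - self.score(x)
            gw += err * x
            gc += err
        for x in self.fake_batch():
            err = self.score(x)
            gw -= err * x
            gc -= err
        self.w += self.lr * gw / self.batch
        self.c += self.lr * gc / self.batch

    def generator_step(self):
        # Gradient ascent on log D(G(z)): push fakes toward "looks real".
        gb = sum((1.0 - self.score(x)) * self.w for x in self.fake_batch())
        self.b += self.lr * gb / self.batch

    def train(self, rounds):
        # The alternating cycle: each side improves against the other.
        for _ in range(rounds):
            self.discriminator_step()
            self.generator_step()
```

Calling `ToyGAN().train(rounds)` alternates the two updates. In this toy setup the discriminator first learns to score real samples higher than fakes, and the generator then shifts its output in the direction the discriminator rewards. Simple GANs like this one are known to oscillate rather than settle, so the sketch shows the mechanism only and makes no claim about convergence.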

According to Naked Security, researchers have tried to use GANs against image-based CAPTCHAs in the past, but this is the first time the technique has been used successfully, and with impressive results.

Dr. Wang noted the implications of this research, calling them alarming even though text-based CAPTCHAs are not very common these days. He added that this is truly a wake-up call for websites that still think they are protected by good old CAPTCHA systems.

