NIXSolutions: Captcha Distinguishes People from Bots with Generated Images

The hCaptcha service has begun using images generated by neural networks to distinguish people from bots on the Internet, Reddit users have noticed. The captcha also asks more complex questions now: you need to recognize not only the object but also its characteristics, for example, cats with short hair.


According to some estimates, bots already account for almost as much activity on the Internet as people. Sites use different approaches to tell the two apart. For example, a site can watch for unusually quick actions and other non-human-like behavior. Another popular way to detect bots is a captcha, that is, a task that is simple for a person but difficult for an algorithm. Most often, captchas show several images and ask you to select certain objects in them, for example, cars or fire hydrants. To make the task harder for machines, noise and other distortions are usually added to the images. But because computer vision algorithms are developing very quickly, this is often no longer enough.
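The first approach mentioned above, watching for unusually quick actions, can be illustrated with a minimal sketch. It is not how hCaptcha or any particular site works; the function name `looks_automated`, the 0.15-second threshold, and the sample timestamps are all assumptions chosen for illustration.

```python
from statistics import median


def looks_automated(timestamps, min_median_gap=0.15, min_events=5):
    """Flag a session whose actions arrive faster than a person plausibly could.

    timestamps: action times in seconds (hypothetical input format).
    min_median_gap: assumed threshold; real systems tune this and combine
    it with many other signals.
    """
    if len(timestamps) < min_events:
        return False  # too little data to judge
    ordered = sorted(timestamps)
    # Time elapsed between each pair of consecutive actions.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) < min_median_gap


# Illustrative sessions, not real measurements.
human_session = [0.0, 1.2, 2.9, 4.1, 6.0, 7.3]
bot_session = [0.0, 0.02, 0.04, 0.06, 0.08, 0.10]

print(looks_automated(human_session))
print(looks_automated(bot_session))
```

A single timing heuristic like this is easy for a bot to evade by adding random delays, which is one reason sites also rely on captchas.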

Users of the r/artificial subreddit noticed that hCaptcha, one of the popular captcha services, now uses algorithmically generated images to separate people from bots. They suggested that the service uses people’s answers as labeled data to further train its algorithms and improve the quality of generation.

Although the images look fairly high quality, they contain artifacts typical of neural network generation algorithms, says N+1. For example, an image of a plate might have a smudged edge that blends into what looks like a fork, and instead of a photo of a bird, the service often shows something that resembles a bird in shape but looks very different from a real animal.

One of the tasks, in which the user is shown detailed cookies in the shape of animals, indirectly indicates that hCaptcha has begun to use modern generative algorithms. Convincingly combining two such concepts in one object (in this case, a cookie and an animal) was until recently a weakness of generative algorithms, and it is often what developers of recent neural network models based on CLIP or similar approaches highlight, notes NIXSolutions. For example, the developers of DALL-E demonstrated in their announcement how the algorithm generates an avocado-shaped chair.

Another complication is that the user must now not only select the specified item but also pay attention to its characteristics. For example, in one of the tasks you need to separate the potted plants standing on a table from those hanging from the ceiling.