HMM-based Attacks on Google's ReCAPTCHA with Continuous Visual and Audio Symbols


Computers & Electrical Engineering

Abstract

A Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is used to distinguish an automated computer program from a human. In computing, it determines whether the user interacting with a system is human. It requires the user to enter the correct sequence of digits, characters, or a combination of both. Solving a currently deployed CAPTCHA takes approximately 10 s on average. Research findings also show that many popular CAPTCHA schemes are neither robust nor secure, which further complicates the task of protecting services from automated abuse while keeping them accessible to individuals with disabilities. This motivates a new approach, the QR Code Scanning-to-Speech, Revamping, and Randomness CAPTCHA. The technique is encrypted end-to-end using Blowfish Fermat Little Theorem (BFLT). A total of 156 users (both visually impaired and visually fit) took part in a study to assess the effectiveness of the proposed approach. It is further compared with current state-of-the-art CAPTCHAs to evaluate its performance. The security analysis indicates that the proposed approach is robust and resists Hidden Markov Model (HMM) solvers, fuzzy solvers, and other recent prominent attacks.

Introduction

CAPTCHA, short for "Completely Automated Public Turing test to tell Computers and Humans Apart," is a security measure that protects internet services against automated exploitation with malicious intent [1]. CAPTCHAs are designed to restrict malicious attacks on websites; if a CAPTCHA is broken, cyber-criminals can attack the sites it protects. By differentiating between people and computer programs, CAPTCHA helps make online polls more legitimate, reduces spam and viruses, makes online shopping less risky, and curbs the creation of fake email accounts. Early text-based CAPTCHAs, which were widely used, combined distorted alphanumeric characters with superimposed noise so that humans could read them but automated programs could not. In practice, different websites deploy different CAPTCHA schemes. Each existing scheme can be broken individually by an Optical Character Recognition (OCR) program because of its distinct encoding algorithm. OCR is the mechanical or electronic conversion of images of handwritten, typed, or printed text into machine-encoded text [2]. The sole purpose of a CAPTCHA is to ensure that the response accepted by a web resource comes from a human and not from an automated program. In other words, a CAPTCHA determines whether the website is serving a real human or an automated bot. A CAPTCHA design should satisfy three principles to withstand bot attacks: it is randomly generated, effortless for humans, and difficult for machines. Text CAPTCHA is the most widely used design because its generation cost is low compared with other CAPTCHAs. However, robust OCR techniques can recognize text-based CAPTCHAs even when they are displayed with distortion, overlaps, strikeouts, missing ink, and background texture, and most users consider CAPTCHAs an annoyance.
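The first of the three principles, random generation, can be illustrated with a minimal sketch. The alphabet, the challenge length, and the function names below are assumptions made for illustration; they are not taken from the article.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits, minus look-alike glyphs.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)

def random_challenge(length: int = 6) -> str:
    """Return an unpredictable challenge string drawn from ALPHABET."""
    # secrets.choice uses the operating system's cryptographic RNG,
    # so a bot cannot predict the next challenge from earlier ones.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def is_correct(challenge: str, response: str) -> bool:
    """Compare the user's response with the challenge, ignoring case and outer spaces."""
    normalized = response.strip().upper()
    # compare_digest avoids leaking how many leading characters matched.
    return secrets.compare_digest(challenge.encode(), normalized.encode())
```

Drawing from a cryptographic random source rather than a seeded pseudo-random generator is one way to keep challenges unpredictable; the other two principles (easy for humans, hard for machines) depend on how the challenge is presented, not on how it is generated.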

A neural network can break a text CAPTCHA in a few simple steps: collect text CAPTCHA images from the website, split them into separate letter images to create a training dataset, and train the network to predict each individual letter. Conventional text-based CAPTCHAs are regarded as not very secure, yet they remain the most commonly used reverse Turing test for differentiating between humans and bots, and they have been in practice for more than two decades. Web resources apply CAPTCHA to protect their services against spam e-mail, bulk creation of fake accounts, Denial of Service (DoS) attacks, web-based voting fraud, and so on [3]. The AltaVista search engine was the first to use CAPTCHA, in 1997, to deter automated submission of URLs to its service, and it reduced the amount of spam by over 95% [4]. Since then, CAPTCHA has become a common part of the internet and has been applied in electronic services offered by large organizations such as Google, Microsoft, Facebook, Amazon, and Yahoo. The term "Turing" refers to the test, named after the scientist Alan Turing, that differentiates a person from a computer program.
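The letter-splitting step of such an attack can be sketched as follows. This is a simplified illustration, assuming the CAPTCHA has already been binarized into a grid of 0s and 1s and that the characters do not touch; the function names are hypothetical, and real attacks need more robust segmentation when characters overlap.

```python
from typing import List, Tuple

def segment_columns(image: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain ink.

    `image` is a binary grid: 1 marks an ink pixel, 0 marks background.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection: number of ink pixels in each column.
    ink_per_column = [sum(row[x] for row in image) for x in range(width)]

    segments = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink and start is None:
            start = x                    # a glyph begins at this column
        elif not ink and start is not None:
            segments.append((start, x))  # a blank column ends the glyph
            start = None
    if start is not None:                # glyph touching the right edge
        segments.append((start, width))
    return segments

def crop(image: List[List[int]], segment: Tuple[int, int]) -> List[List[int]]:
    """Cut one glyph out of the image so it can be labeled and used for training."""
    start, end = segment
    return [row[start:end] for row in image]

image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(segment_columns(image))  # [(0, 2), (4, 5), (6, 8)]
```

Each cropped glyph would then be paired with its label to form the training set for the per-letter classifier. Overlapping or connected characters leave no blank columns between glyphs, so this simple projection no longer separates them.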

A person with healthy sight can identify the text shown in the box, whereas a machine fails. Visually impaired people are often left behind and depend on visually fit people to complete tasks involving communication and information technology (websites and online applications for money transfer, education, online ticket booking, etc.). The proposed QR Code Scanning-to-Speech, Revamping, and Randomness CAPTCHA therefore helps them work independently to some extent. The World Health Organization (WHO) reports that about 1.3 billion people have some form of vision impairment: 188.5 million have mild vision impairment, 217 million have moderate to severe vision impairment, and 36 million are blind. These people experience difficulties in accessing transport, buildings, and information [5].

With advances in pattern recognition technology, even complex CAPTCHAs can be created to secure websites against cracking algorithms. Convolutional Neural Networks (CNNs) have shown rapid progress in pattern recognition; they excel at learning spatial features in images [6] and are well suited to character recognition problems [7]. The proposed design is a robust and user-friendly system that gives visually impaired people easy access to web applications and resources such as email accounts, chats, online voting, and forums.

(a): The proposed QR Code Scanning-to-Speech, Revamping, and Randomness CAPTCHA is a novel method designed specifically for visually impaired users. The user scans the given QR Code to receive the CAPTCHA and the revamping-text.
(b): To the best of our knowledge, this is the first work to use scanning and revamping in a CAPTCHA methodology. Users reconstruct the CAPTCHA text according to the revamping-text.
(c): The random text is end-to-end encrypted with the BFLT [8] algorithm and transmitted in encrypted form to the client. Finally, the decrypted text is converted into a QR Code CAPTCHA.
(d): More than 90% of voice CAPTCHAs are solved by HMM and fuzzy solvers when the audio is on the website itself. In the QR Code Scanning-to-Speech, Revamping, and Randomness CAPTCHA, however, the voice is on the user's smart device after the QR Code is scanned.
(e): Compared with existing voice CAPTCHAs, the proposed method is the only noise-free CAPTCHA that helps visually impaired people. The experimental results indicate that the proposed method is efficient and practical for visually fit as well as visually impaired users.
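Item (c) relies on BFLT, which combines a modified Blowfish cipher with Fermat's little theorem [8]. This excerpt does not describe how the two are combined, so the sketch below illustrates only the theorem itself: for a prime p and an integer a not divisible by p, a^(p-1) ≡ 1 (mod p), which also yields a modular inverse as a^(p-2) mod p. The function names are hypothetical and the code is not the BFLT algorithm.

```python
def fermat_holds(a: int, p: int) -> bool:
    """Check a^(p-1) ≡ 1 (mod p); assumes p is prime and p does not divide a."""
    return pow(a, p - 1, p) == 1

def inverse_mod_prime(a: int, p: int) -> int:
    """Return the inverse of a modulo a prime p, computed as a^(p-2) mod p."""
    if a % p == 0:
        raise ValueError("a has no inverse modulo p")
    # Since a * a^(p-2) = a^(p-1) ≡ 1 (mod p), a^(p-2) is the inverse of a.
    return pow(a, p - 2, p)

p = 101                          # a small prime chosen for illustration
a = 7
inv = inverse_mod_prime(a, p)
print(fermat_holds(a, p), (a * inv) % p)
```

Python's three-argument `pow` performs modular exponentiation without building the full power, which keeps the computation practical for the large primes used in cryptographic settings.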

The rest of the article is organized as follows. Section 2 surveys related work. Section 3 describes the structure and working of the proposed method. Section 4 presents the security scheme against a few prominent attacks and a detailed explanation of the study. Section 5 discusses the limitations and drawbacks of the existing methods. Finally, Section 6 concludes the article with the future scope of the research work.

Section snippets

Related works

Audio CAPTCHA is difficult to handle and time-consuming for both visually impaired and visually fit users [9]. Sano et al. [10] observed that earlier attacks on audio CAPTCHA targeted non-continuous audio CAPTCHA. However, the security of continuous audio CAPTCHA depends on the difficulty of performing proper segmentation. To enhance the security of audio CAPTCHA, the number of human voices used as background noise must be increased. The authors designed the proposed

The proposed system: QR code scanning-to-speech, revamping, and randomness CAPTCHA

A new style of CAPTCHA based on different kinds of images has been introduced because text-based CAPTCHA is not very secure. Researchers create text CAPTCHAs with more distortion and clutter to defeat OCR, which makes them very difficult for human beings as well as automated bots. For this reason, a unique genre of QR Code (Quick Response Code) Revamping CAPTCHA is introduced to make it simple for users (both visually impaired and visually fit) to handle and more secure than

Security analysis

Web resources need CAPTCHA to protect against automated account logins and forgotten-password recovery attempts, automated online registrations, fake account creation requests, and email bombing. If the CAPTCHA is not secure enough, website owners may suffer loss of reputation due to tool-based extraction of sensitive data, attacks on authentication, and DoS affecting users and admins. Since the proposed CAPTCHA has a large set of revamping-texts, the automated

Discussion

OCR tools recognize the most widely used text-based CAPTCHAs. In the proposed system, the random text is encrypted with the BFLT algorithm during transmission, and in parallel the online source displays the CAPTCHA text as a QR Code, which is difficult for OCR tools. Google's reCAPTCHA continuous voice CAPTCHA can be solved on the webpage by HMM-based automated voice recognition software. But the HMM solver cannot reach and break the proposed system because the smartphone alone possesses the text after scanning

Conclusion and future work

The proposed Quick Response Code Scanning-to-Speech, Revamping, and Randomness CAPTCHA is a new genre of challenge-response authentication test that uses Quick Response Code scanning with a revamping audio-based design to block the most prominent attacks. This approach is designed for visually impaired individuals. The challenge and revamping-text are end-to-end encrypted by Blowfish Fermat Little Theorem. Password strength alone does not offer information security in the face of rising malicious

Declaration of Competing Interest

The authors of this manuscript state that there is no conflict of interest.

Acknowledgment

We would like to thank everyone who helped carry out the experimental study successfully. This work has been funded by the University of Madras (India) under University Research Fellow Grant No. GCCO/URF/Comp. Science/2019–20/323.

Author statement

None.

PL. Chithra is a Professor in the Department of Computer Science, University of Madras, Tamil Nadu, India. She received her M.C.A. and Ph.D. degrees from Alagappa University and the University of Madras, Tamil Nadu, India, respectively. She was an awardee of the UGC Faculty Improvement Programme for two years. She serves as a supervisor for Ph.D. and M.Phil. scholars in the areas of Image Processing Techniques, Big Data Analytics, and Network Security.

References (23)

[1] L. Von Ahn, "CAPTCHA: using hard AI problems for security", in: E. Biham (Ed.), Advances in Cryptology - EUROCRYPT, (2003),...
[2] An overview of the Tesseract OCR engine
[3] K. Chellapilla et al., Designing human friendly human interaction proofs (HIPs)
[4] H.S. Baird et al., Human interactive proofs and document image analysis
[5] World Health Organization (2019)
[6] K. He et al., Deep residual learning for image recognition
[7] P. Li et al., Rejecting character recognition errors using CNN based confidence estimation, Chin J Electron (2016)
[8] P.L. Chithra et al., A novel encryption algorithm by fusion of modified Blowfish algorithm and Fermat's little theorem for data security, Int J Innovat Technol Explor Eng (IJITEE) (2020)
[9] J.P. Bigham et al., Evaluating existing audio CAPTCHAs and an interface optimized for non-visual use
[10] S. Sano et al., HMM-based attacks on Google's ReCAPTCHA with continuous visual and audio symbols, JIP (2021)
[11] T. Ahmed
