City Research Online

Investigating the detection of stored scripting attacks using machine learning

Mereani, F. A. (2021). Investigating the detection of stored scripting attacks using machine learning. (Unpublished Doctoral thesis, City, University of London)


Web applications now play an essential role in our daily lives; through them we can make bank transfers, purchase products and make bookings on the Internet. This makes them a target for attackers, who attempt to exploit security vulnerabilities in web applications in order to obtain access to sensitive user information or gain unauthorized privileges. One of the most common attacks aimed at stealing user information is Cross-Site Scripting (XSS), which is ranked among the top 10 security vulnerabilities in web applications. Traditional defense systems rely on a signature database describing known attacks; however, XSS attacks written in JavaScript are highly variable and do not exist in a single form. The most common cause of XSS vulnerabilities is weak validation of user input. This provides the motivation for finding a method for identifying malicious code, written in JavaScript, that an attacker attempts to have executed on the server.

Machine learning has contributed to the security of web applications. Several studies have been conducted on Intrusion Detection Systems (IDS), which detect and prevent attacks against web applications. Cross-Site Scripting is one of the attacks that has been studied using a number of methods: for example, using features to identify obfuscated scripts, using JavaScript keywords, or evaluating machine learning algorithms such as random forest and SVM in terms of their ability to detect attacks against web applications. These studies have achieved highly accurate results by using machine learning to detect XSS attacks, and they often attained better results than dynamic and static analysis when acting as a protection layer for web applications.

This study demonstrates the use of machine learning methods incorporated into a web application at the user input validation stage, before the request is passed to the application server. Classifiers are used to prevent persistent (stored) XSS attacks, which are caused by malicious code injected via an input point in the web application. The study relies on supervised machine learning and the application of Boolean feature sets in order to achieve ease and speed of classification. It also examines the use of such methods on two other types of injection attack: SQL injection (SQL-i) and LDAP injection. Cascading classifiers and ensemble techniques are used to reduce complexity while maintaining accuracy and speed. To understand how the classifier makes a decision, an approximate Boolean function is extracted, based on techniques that have been employed to extract rules from black-box classifiers.
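
As a rough illustration of classification on Boolean features at the input validation stage, the following Python sketch maps a submitted string to a small set of true/false features and consults a classifier before the input would be forwarded. The feature names, the regular expressions, and the `toy_classifier` rule are all hypothetical stand-ins chosen for illustration; the abstract does not list the thesis's actual feature set or trained models, and a real deployment would replace `toy_classifier` with a trained supervised model.

```python
import re
from typing import Callable, Dict

# Hypothetical Boolean features (assumptions for illustration only;
# not the feature set used in the thesis).
FEATURE_PATTERNS: Dict[str, re.Pattern] = {
    "has_script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "has_event_handler": re.compile(r"\bon\w+\s*=", re.IGNORECASE),
    "has_js_scheme": re.compile(r"javascript\s*:", re.IGNORECASE),
    "has_js_keyword": re.compile(r"\b(eval|alert|document\.cookie)\b", re.IGNORECASE),
    "has_encoded_chars": re.compile(r"(%[0-9a-f]{2}|&#x?[0-9a-f]+;)", re.IGNORECASE),
}


def extract_features(text: str) -> Dict[str, bool]:
    """Map an input string to a dictionary of Boolean features."""
    return {name: bool(pattern.search(text)) for name, pattern in FEATURE_PATTERNS.items()}


def toy_classifier(features: Dict[str, bool]) -> bool:
    """Stand-in for a trained model: returns True if the input looks malicious.

    Flags the input when a script tag is present or when at least two
    features fire. This hand-written rule is an assumption, not a learned one.
    """
    return features["has_script_tag"] or sum(features.values()) >= 2


def validate_input(text: str,
                   classify: Callable[[Dict[str, bool]], bool] = toy_classifier) -> bool:
    """Return True if the input may be passed on to the application server."""
    return not classify(extract_features(text))


if __name__ == "__main__":
    for sample in ["Great article, thanks!",
                   "<script>alert(document.cookie)</script>",
                   "<img src=x onerror=alert(1)>"]:
        print(validate_input(sample), repr(sample))
```

Because every feature is a single pattern test yielding true or false, extraction is cheap, and a rule such as the one in `toy_classifier` is itself a Boolean function of the features, which is the kind of form a rule-extraction step aims to approximate for a black-box model.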

Publication Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: Doctoral Theses
School of Science & Technology > School of Science & Technology Doctoral Theses
School of Science & Technology > Computer Science
Text - Accepted Version