City Research Online

A machine learning approach for smart computer security audit

Pozdniakov, K. (2017). A machine learning approach for smart computer security audit. (Unpublished Doctoral thesis, City, University of London)


This thesis presents a novel application of machine learning technology to automate network security audit and penetration testing processes in particular. A model-free reinforcement learning approach is presented. It is characterized by the absence of the environmental model. The model is derived autonomously by the audit system while acting in the tested computer network. The penetration testing process is specified as a Markov decision process (MDP) without definition of reward and transition functions for every state/action pair. The presented approach includes application of traditional and modified Q-learning algorithms. A traditional Q-learning algorithm learns the action-value function stored in the table, which gives the expected utility of executing a particular action in a particular state of the penetration testing process. The modified Q-learning algorithm differs by incorporation of the state space approximator and representation of the action-value function as a linear combination of features. Two deep architectures of the approximator are presented: autoencoder joint with artificial neural network (ANN) and autoencoder joint with recurrent neural network (RNN). The autoencoder is used to derive the feature set defining audited hosts. ANN is intended to approximate the state space of the audit process based on derived features. RNN is a more advanced version of the approximator and differs by the existence of the additional loop connections from hidden to input layers of the neural network. Such architecture incorporates previously executed actions into new inputs. It gives the opportunity to audit system learn sequences of actions leading to the goal of the audit, which is defined as receiving administrator rights on the host. The model-free reinforcement learning approach based on traditional Q-learning algorithms was also applied to reveal new vulnerabilities, buffer overflow in particular. The penetration testing system showed the ability to discover a string, exploiting potential vulnerability, by learning its formation process on the go.

In order to prove the concept and to test the efficiency of an approach, audit tool was developed. Presented results are intended to demonstrate the adaptivity of the approach, performance of the algorithms and deep machine learning architectures. Different sets of hyperparameters are compared graphically to test the ability of convergence to the optimal action policy. An action policy is a sequence of actions, leading to the audit goal (getting admin rights on the remote host). The testing environment is also presented. It consists of 80+ virtual machines based on a vSphere virtualization platform. This combination of hosts represents a typical corporate network with Users segment, Demilitarized zone (DMZ) and external segment (Internet). The network has typical corporate services available: web server, mail server, file server, SSH, SQL server. During the testing process, the audit system acts as an attacker from the Internet.

Publication Type: Thesis (Doctoral)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments: Doctoral Theses
School of Science & Technology > School of Science & Technology Doctoral Theses
School of Science & Technology > Computer Science
Text - Accepted Version
Download (2MB) | Preview



Downloads per month over past year

View more statistics

Actions (login required)

Admin Login Admin Login