Towards an efficient automation of network penetration testing using model-based reinforcement learning
Ghanem, M.C. (2022). Towards an efficient automation of network penetration testing using model-based reinforcement learning. (Unpublished Doctoral thesis, City, University of London)
Abstract
Penetration Testing (PT) is an offensive method for assessing and evaluating the security of a digital asset by planning, generating, and executing all or some of the possible attacks that aim to exploit its vulnerabilities. In large networks, penetration testing becomes repetitive, complex, and resource-consuming despite the use of autonomous tools. To maintain the consistency and efficiency of PT in medium and large networks, it is imperative to make it intelligent and optimized, which allows regular and systematic testing without requiring a prohibitive amount of human labor on the one hand, while reducing the time consumed and the downtime of the tested system on the other. Reinforcement Learning (RL)-led testing will unburden human experts from heavy repetitive tasks and uncover special and complex situations, such as unusual vulnerabilities or non-obvious combinations, which are often overlooked in manual testing. In this research, we are concerned with improving current automated testing systems and making them intelligent, targeted, and efficient by embedding reinforcement learning techniques where relevant. The proposed Intelligent Automated Penetration Testing Framework (IAPTF) uses RL because of its relevance to sequential decision-making problems. It relies on model-based RL, in which planning and learning are combined, and decomposes PT tasks so that the problem can be represented as a Partially Observable Markov Decision Process (POMDP) domain that accounts for the major PT features, tasks, and information flow in order to realistically reflect the real-world context. The problem is then solved on an external POMDP solver using different algorithms to identify the most efficient options.
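To make the POMDP framing concrete, the following is a minimal sketch of the standard Bayesian belief update that any POMDP formulation relies on, applied to a toy penetration-testing setting. It is not taken from the thesis: the two host states, the `scan` action, the `flagged`/`clean` observations, and every probability are hypothetical values chosen only for illustration.

```python
from typing import Dict

Belief = Dict[str, float]


def belief_update(belief: Belief, action: str, observation: str,
                  transition: dict, observation_model: dict) -> Belief:
    """Standard POMDP belief update:
    b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s).
    """
    unnormalized = {}
    for s_next in belief:
        # Prediction step: probability of reaching s_next under the action.
        predicted = sum(transition[action][s][s_next] * belief[s] for s in belief)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = observation_model[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical toy model (assumption): a host is either vulnerable or patched,
# and scanning does not change its state.
TRANSITION = {
    "scan": {
        "vulnerable": {"vulnerable": 1.0, "patched": 0.0},
        "patched": {"vulnerable": 0.0, "patched": 1.0},
    }
}
# Hypothetical scanner accuracy (assumption): imperfect in both directions.
OBSERVATION = {
    "scan": {
        "vulnerable": {"flagged": 0.9, "clean": 0.1},
        "patched": {"flagged": 0.2, "clean": 0.8},
    }
}

prior = {"vulnerable": 0.5, "patched": 0.5}
posterior = belief_update(prior, "scan", "flagged", TRANSITION, OBSERVATION)
```

In this toy setting, a `flagged` scan result shifts the belief toward `vulnerable`; a POMDP solver plans over such beliefs rather than over the hidden state itself.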
Because we encountered major scaling challenges in solving the large POMDPs that result from the regular representation of PT on large networks, we propose a hierarchical representation in which large networks are divided into security clusters, enabling IAPTF to deal with each cluster separately as a small network (intra-cluster). We then proceed to test the network of cluster heads to ensure coverage of all possible complex and multistep attack vectors widely adopted by today's hackers. The obtained results are consistent and outperform both previous results and human performance in terms of time consumed, number of tested vectors, and accuracy, especially in large networks. Learning is the second strength of our new model: the extracted knowledge becomes easier to generalize and can therefore be reused, notably when retesting the same network after a few changes, which is often the real-world context in PT. The performance enhancement and the extraction and reuse of knowledge confirm the efficiency, accuracy, and suitability of our proposed framework. Finally, IAPTF is designed to offload and ultimately replace the human expert and to be independent, comprehensive, and versatile so that it can be integrated with any automated PT platform or toolkit. Initially, the framework connects directly to the Metasploit and Nessus APIs, as the code architecture of the free versions of both tools allows such use.
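The scaling argument behind the hierarchical representation can be illustrated with a rough sketch. The code below groups hosts into clusters and compares the size of one flat state space with the combined size of the per-cluster problems plus one cluster-head problem. It is an illustration under simplifying assumptions, not the thesis's model: the host names, the cluster labels, the fixed number of states per host, and the choice of one state variable per cluster at the head level are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def split_into_clusters(host_to_cluster: Dict[str, str]) -> Dict[str, List[str]]:
    """Group hosts by their (hypothetical) security-cluster label."""
    clusters = defaultdict(list)
    for host, cluster in sorted(host_to_cluster.items()):
        clusters[cluster].append(host)
    return dict(clusters)


def flat_state_count(n_hosts: int, states_per_host: int) -> int:
    """Joint state count when every host is modeled in a single problem."""
    return states_per_host ** n_hosts


def hierarchical_state_count(cluster_sizes: List[int], states_per_host: int) -> int:
    """Sum of the intra-cluster state counts plus one head-level problem.

    Assumption: the head-level problem has one variable per cluster.
    """
    intra = sum(states_per_host ** size for size in cluster_sizes)
    heads = states_per_host ** len(cluster_sizes)
    return intra + heads


# Hypothetical six-host network split into three security clusters.
network = {
    "web1": "dmz", "web2": "dmz", "mail": "dmz",
    "pc1": "office", "pc2": "office",
    "db1": "database",
}
clusters = split_into_clusters(network)
sizes = [len(hosts) for hosts in clusters.values()]

flat = flat_state_count(len(network), states_per_host=2)
hierarchical = hierarchical_state_count(sizes, states_per_host=2)
```

Under these assumptions the flat problem has 64 joint states, while the hierarchical decomposition handles 22 states in total across four smaller problems; the gap widens quickly as the number of hosts grows.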
Publication Type: | Thesis (Doctoral) |
---|---|
Subjects: | Q Science > Q Science (General); T Technology > T Technology (General); T Technology > TA Engineering (General). Civil engineering (General) |
Departments: | School of Science & Technology > Engineering; School of Science & Technology > School of Science & Technology Doctoral Theses; Doctoral Theses |