Software dependability with off-the-shelf components
Gashi, I. (2007). Software dependability with off-the-shelf components. (Unpublished Doctoral thesis, City, University of London)
Abstract
When systems are built out of “off-the-shelf” (OTS) products, fault tolerance is often the only viable way of obtaining the required system dependability. Because acquisition costs are low, even running multiple versions of software in a parallel architecture, a scheme formerly reserved for a few highly critical applications, may become viable for many other applications. A wide range of fault tolerance solutions is known in the literature, but the difficulty remains in assessing the dependability gains that may be achieved.
The research detailed in this thesis aims to provide a new approach to assessing the dependability gains that may be achieved through software fault tolerance via modular redundancy with diversity in complex OTS software. OTS SQL database server products were used in the studies: they are a very complex, widely used category of off-the-shelf products, so the results reported in this thesis may be of immediate interest to practitioners dealing with complex software systems. Bug reports for the servers were used as evidence in the assessment: they were the only direct dependability evidence that was found for these products. A sample of bug reports from four OTS SQL database server products, and from later releases of two of them, was studied to check whether the reported bugs would cause coincident failures in more than one of the products. Very few bugs were found to affect more than one product, and none caused failures in more than two. Many of these faults caused systematic, non-crash failures, a category ignored by most studies and by standard implementations of fault tolerance for databases. Use of different releases of the same product was also found to tolerate a significant fraction of the faults for one of the products used in the study. Therefore, a fault-tolerant server built with diverse OTS server products seems to have a good chance of delivering improvements in availability and failure rates compared with the individual OTS server products or their replicated, non-diverse configurations.
Data diversity in the form of “SQL rephrasing rules” was also found to be a very useful fault tolerance mechanism. Data diversity is possible with these products thanks to the redundancy that exists in the SQL language: a statement can be specified in multiple different but logically equivalent ways. This thesis reports the results of all these studies and discusses their implications, the architectural options available for exploiting them, and the difficulties that they may present.
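The idea of rephrasing can be illustrated with a minimal sketch. The table, the data, and the particular rewrite (replacing `BETWEEN` with an explicit pair of comparisons) are hypothetical choices made for illustration only and are not taken from the rules defined in the thesis; SQLite is used here simply because it ships with Python, and it is not claimed to be one of the servers studied.

```python
import sqlite3

# Hypothetical statement and a logically equivalent rephrasing of it.
# The BETWEEN rewrite is an illustrative assumption, not a rule from the thesis.
ORIGINAL = "SELECT name FROM items WHERE price BETWEEN 10 AND 20"
REPHRASED = "SELECT name FROM items WHERE price >= 10 AND price <= 20"


def build_demo_db():
    """Create an in-memory database with a small hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT, price INTEGER)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("a", 5), ("b", 10), ("c", 15), ("d", 25)],
    )
    return conn


def run(conn, sql):
    """Execute a query and return its rows sorted, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())


def results_agree(conn, original, rephrased):
    """Return True when both phrasings of the statement yield the same rows."""
    return run(conn, original) == run(conn, rephrased)


if __name__ == "__main__":
    conn = build_demo_db()
    print(results_agree(conn, ORIGINAL, REPHRASED))
```

On a correct server the two phrasings must agree. A disagreement would suggest that one phrasing triggered a fault, and the other phrasing is then a candidate for masking or retrying around it.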
Two reliability models developed previously by colleagues at the Centre for Software Reliability, City University, have been extended to enable their use in assessing a fault-tolerant 1-out-of-2 diverse server. The bug reports were used as evidence in the assessment with one of these models, which enables an assessor to choose the pair of servers, from the possibly many pairs available, that will yield the highest reliability gains. The other extended model required additional data that was not available for the database servers. Therefore, another approach was studied in which bug report data alone can be used to derive estimates of the reliability gains that may be expected from employing a 1-out-of-2 diverse server in comparison to a non-diverse one.
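As a rough illustration of how bug report data might inform the choice of a server pair, the sketch below ranks pairs by the fraction of studied bugs that cause failures in both servers of the pair. This is a deliberately crude count-based proxy and not either of the reliability models mentioned above; the server names, the bug identifiers, and the function names are all hypothetical.

```python
from itertools import combinations


def coincident_fraction(failing_a, failing_b, all_bugs):
    """Fraction of all studied bugs that cause a failure in both servers."""
    if not all_bugs:
        raise ValueError("no bug reports to assess")
    return len(failing_a & failing_b) / len(all_bugs)


def best_pair(failures_by_server):
    """Return (pair, fraction) for the pair with the fewest coincident failures.

    failures_by_server maps a server name to the set of bug identifiers
    that were observed to cause a failure in that server.
    """
    all_bugs = set().union(*failures_by_server.values())
    scored = {
        pair: coincident_fraction(
            failures_by_server[pair[0]], failures_by_server[pair[1]], all_bugs
        )
        for pair in combinations(sorted(failures_by_server), 2)
    }
    pair = min(scored, key=scored.get)
    return pair, scored[pair]


if __name__ == "__main__":
    # Hypothetical data: bug identifiers that fail each hypothetical server.
    demo = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5}}
    print(best_pair(demo))
```

The sketch treats every bug report as equally important. Counts of bugs are not failure rates, so a ranking of this kind is only a starting point for the sort of assessment the abstract describes.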
Publication Type: | Thesis (Doctoral) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Departments: | School of Science & Technology > Computer Science > Software Reliability; School of Science & Technology > School of Science & Technology Doctoral Theses; Doctoral Theses |