Fault injection explained

In computer science, fault injection is a testing technique for understanding how computing systems behave when stressed in unusual ways. This can be achieved using physical- or software-based means, or using a hybrid approach.[1] Widely studied physical fault injections include the application of high voltages, extreme temperatures and electromagnetic pulses on electronic components, such as computer memory and central processing units.[2] [3] By exposing components to conditions beyond their intended operating limits, computing systems can be coerced into mis-executing instructions and corrupting critical data.

In software testing, fault injection is a technique for improving the coverage of a test by introducing faults to exercise code paths, in particular error-handling code paths, that might otherwise rarely be followed. It is often used with stress testing and is widely considered to be an important part of developing robust software.[4] Robustness testing[5] (also known as syntax testing, fuzzing or fuzz testing) is a type of fault injection commonly used to test for vulnerabilities in communication interfaces such as protocols, command line parameters, or APIs.

The propagation of a fault through to an observable failure follows a well-defined cycle. When executed, a fault may cause an error, which is an invalid state within a system boundary. An error may cause further errors within the system boundary, in which case each new error acts as a fault, or it may propagate to the system boundary and become observable. When error states are observed at the system boundary they are termed failures. This mechanism is termed the fault-error-failure cycle[6] and is a key mechanism in dependability.

History

The technique of fault injection dates back to the 1970s,[7] when it was first used to induce faults at the hardware level. This type of fault injection is called Hardware Implemented Fault Injection (HWIFI) and attempts to simulate hardware failures within a system. The first experiments in hardware fault injection involved nothing more than shorting connections on circuit boards and observing the effect on the system (bridging faults). It was used primarily as a test of the dependability of the hardware system. Later, specialised hardware was developed to extend this technique, such as devices to bombard specific areas of a circuit board with heavy radiation. It was soon found that faults could also be induced by software techniques and that aspects of this technique could be useful for assessing software systems. Collectively these techniques are known as Software Implemented Fault Injection (SWIFI).

Software implemented fault injection

SWIFI techniques can be categorized into two types: compile-time injection and runtime injection.

Compile-time injection is an injection technique where source code is modified to inject simulated faults into a system. One method, called mutation testing, changes existing lines of code so that they contain faults. A simple example of this technique would be changing a = a + 1 to a = a - 1.

Code mutation produces faults which are very similar to those unintentionally added by programmers.
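
As a minimal sketch of the idea, the C fragment below pairs a function with a hand-written mutant in which a = a + 1 has become a = a - 1. The names next_is_positive, next_is_positive_mutant and mutant_detected are hypothetical and chosen only for illustration; a mutation tool would normally generate the mutant automatically.

```c
#include <stdbool.h>

/* Original code: contains the statement a = a + 1. */
bool next_is_positive(int a)
{
    a = a + 1;
    return a > 0;
}

/* Mutant: identical except that '+' has been changed to '-'. */
bool next_is_positive_mutant(int a)
{
    a = a - 1;
    return a > 0;
}

/* An input exposes the injected fault only if the two versions disagree. */
bool mutant_detected(int input)
{
    return next_is_positive(input) != next_is_positive_mutant(input);
}
```

With an input of 5 both versions return true, so the injected fault stays hidden; with an input of 0 the original returns true and the mutant returns false, so the fault becomes visible.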

A refinement of code mutation is Code Insertion Fault Injection, which adds code rather than modifying existing code. This is usually done through the use of perturbation functions, simple functions that take an existing value and perturb it via some logic into another value, for example:

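int pFunc(int value) { return value + 20; }   /* illustrative perturbation; the original body is not shown */

int main(int argc, char * argv[]) { int a = pFunc(aFunction(argc)); return a > 20; }   /* aFunction: function under test, assumed to be defined elsewhere */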

In this case, pFunc is the perturbation function. It is applied to the return value of the function that has been called, introducing a fault into the system.

Runtime injection techniques use a software trigger to inject a fault into a running software system. Faults can be injected via a number of physical methods, and triggers can be implemented in a number of ways, such as time-based triggers (when the timer reaches a specified time, an interrupt is generated and the interrupt handler associated with the timer can inject the fault) and interrupt-based triggers (hardware exceptions and software trap mechanisms are used to generate an interrupt at a specific place in the system code or on a particular event within the system, for instance access to a specific memory location).
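
A time-based trigger can be sketched in a few lines of C. The structure time_trigger and the function on_timer_tick are hypothetical names, and the choice of a single bit flip as the injected fault is an assumption made for illustration; in a real system the function would be called from the timer's interrupt handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical time-based trigger: fires once when the clock reaches fire_at. */
struct time_trigger {
    uint64_t fire_at;   /* tick at which the fault is injected */
    bool     fired;     /* guarantees a single injection */
};

/* Intended to be called from the timer handler on every tick.
 * When the trigger fires, one bit of the target word is flipped to
 * simulate a transient fault. Returns true if a fault was injected. */
bool on_timer_tick(struct time_trigger *t, uint64_t now,
                   uint32_t *target, unsigned bit)
{
    if (t->fired || now < t->fire_at)
        return false;
    *target ^= (uint32_t)1 << (bit % 32u);
    t->fired = true;
    return true;
}
```

Passing the current time in as a parameter, rather than reading a hardware timer inside the function, keeps the sketch deterministic and easy to test.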

Runtime injection can use a number of different techniques to insert faults into a system via a trigger. These techniques are often based around the debugging facilities provided by computer processor architectures.

Protocol software fault injection

Complex software systems, especially multi-vendor distributed systems based on open standards, perform input/output operations to exchange data via stateful, structured exchanges known as "protocols". One kind of fault injection that is particularly useful for testing protocol implementations (a type of software code with the unusual characteristic that it cannot predict or control its input) is fuzzing. Fuzzing is an especially useful form of black-box testing, since the various invalid inputs that are submitted to the software system do not depend on, and are not created based on knowledge of, the details of the code running inside the system.
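
The following C sketch illustrates the black-box character of fuzzing under simple assumptions. The message format (one length byte followed by that many payload bytes) and the names parse_message, fuzz_mutate and fuzz_parser are all hypothetical. The mutator overwrites random bytes without using any knowledge of how the parser works.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy protocol parser (hypothetical): a message is a 1-byte length
 * followed by exactly that many payload bytes.
 * Returns 0 if the message is well formed, -1 otherwise. */
int parse_message(const uint8_t *buf, size_t len)
{
    if (len < 1)
        return -1;
    if ((size_t)buf[0] != len - 1)
        return -1;
    return 0;
}

/* Small deterministic generator so that fuzzing runs are reproducible. */
static uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* Black-box mutation: overwrite `count` randomly chosen bytes with
 * random values. */
void fuzz_mutate(uint8_t *buf, size_t len, unsigned count, uint32_t *state)
{
    for (unsigned i = 0; i < count && len > 0; i++)
        buf[next_random(state) % len] = (uint8_t)next_random(state);
}

/* Feed `rounds` mutated copies of a valid message to the parser and
 * return how many of them were rejected. */
unsigned fuzz_parser(const uint8_t *valid, size_t len,
                     unsigned rounds, uint32_t seed)
{
    uint8_t scratch[64];
    unsigned rejected = 0;

    if (len > sizeof scratch)
        return 0;
    for (unsigned i = 0; i < rounds; i++) {
        memcpy(scratch, valid, len);
        fuzz_mutate(scratch, len, 2, &seed);
        if (parse_message(scratch, len) != 0)
            rejected++;
    }
    return rejected;
}
```

In practice the interesting outcome is not the rejection count but whether any mutated input makes the implementation crash, hang or read out of bounds, which is why such loops are usually run under a memory checker.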

Hardware implemented fault injection

This technique is applied to a hardware prototype. Testers inject faults by changing the voltage of parts of a circuit, raising or lowering the temperature, bombarding the board with high-energy radiation, and so on.

Characteristics of fault injection

Faults have three main parameters: type, time and location.[8]

These parameters create the fault space realm. The fault space grows exponentially with system complexity. Traditional fault injection methods are therefore poorly suited to modern cyber-physical systems: they are too slow and find only a small number of faults (low fault coverage). Testers therefore need an efficient algorithm for choosing the critical faults that have the greatest impact on system behavior. The main research question is thus how to find the critical faults in the fault space that have catastrophic effects on system behavior. Methods proposed to help fault injection explore the fault space efficiently, reaching higher fault coverage in less simulation time, include sensitivity partitioning[9] and reinforcement learning[10].
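
To give a feel for the size of the fault space, the sketch below assumes that each fault is described by a fault type, an injection time step and a location, and that only one fault is injected per experiment. The names fault, fault_space_size and fault_from_index are hypothetical. Under these assumptions the number of distinct experiments is the product of the three ranges.

```c
#include <stdint.h>

/* One point in the fault space (hypothetical encoding). */
struct fault {
    uint64_t type;       /* which kind of fault */
    uint64_t time_step;  /* when it is activated */
    uint64_t location;   /* where in the system it is injected */
};

/* Number of single-fault experiments needed for exhaustive coverage. */
uint64_t fault_space_size(uint64_t types, uint64_t time_steps,
                          uint64_t locations)
{
    return types * time_steps * locations;
}

/* Map an experiment index onto a concrete fault (mixed-radix decoding),
 * so that a campaign can enumerate or sample the space by index. */
struct fault fault_from_index(uint64_t index, uint64_t types,
                              uint64_t time_steps)
{
    struct fault f;
    f.type = index % types;
    f.time_step = (index / types) % time_steps;
    f.location = index / (types * time_steps);
    return f;
}
```

For example, 5 fault types, 10,000 time steps and 1,000 locations already give 50,000,000 single-fault experiments, and allowing several simultaneous faults multiplies this further. This is why selecting a small set of critical faults matters.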

Fault injection tools

Although these types of faults can be injected by hand, the possibility of introducing an unintended fault is high, so tools exist to parse a program automatically and insert faults.

Research tools

A number of SWIFI tools have been developed. Six commonly used fault injection tools are Ferrari,[12] FTAPE,[13] Doctor,[14] Orchestra,[15] Xception[26] and Grid-FIT.[16]

Commercial tools

Libraries

Fault injection in functional properties or test cases

In contrast to traditional mutation testing, where mutant faults are generated and injected into the code description of the model, the application of a series of newly defined mutation operators directly to the model properties, rather than to the model code, has also been investigated.[28] Mutant properties that are generated from the initial properties (or test cases) and validated by the model checker should be considered new properties that were missed during the initial verification procedure. Adding these newly identified properties to the existing list of properties therefore improves the coverage metric of the formal verification and consequently leads to a more reliable design.

Application of fault injection

Fault injection can take many forms. In the testing of operating systems, for example, fault injection is often performed by a driver (kernel-mode software) that intercepts system calls (calls into the kernel) and randomly returns a failure for some of the calls. This type of fault injection is useful for testing low-level user-mode software. For higher-level software, various methods are used to inject faults. In managed code, it is common to use instrumentation. Although fault injection can be undertaken by hand, a number of fault injection tools exist to automate the process.[29]
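
The idea of randomly failing intercepted calls can be sketched in user space without a kernel driver. The wrapper below is a stand-in for that approach, not the driver-based mechanism itself. The names fi_configure and fi_malloc, and the percentage-based failure rate, are assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical injector state: chance (0-100) that a wrapped call fails. */
static unsigned fi_fail_percent = 0;
static unsigned fi_seed = 1;

void fi_configure(unsigned fail_percent, unsigned seed)
{
    fi_fail_percent = fail_percent > 100 ? 100 : fail_percent;
    fi_seed = seed;
}

/* Deterministic pseudo-random decision so that failures can be replayed. */
static int fi_should_fail(void)
{
    fi_seed = fi_seed * 1103515245u + 12345u;
    return (fi_seed >> 16) % 100u < fi_fail_percent;
}

/* Stand-in for malloc: either reports an out-of-memory failure or
 * forwards the request to the real allocator. */
void *fi_malloc(size_t size)
{
    if (fi_should_fail()) {
        errno = ENOMEM;
        return NULL;
    }
    return malloc(size);
}
```

Code under test would call fi_malloc in place of malloc, so its out-of-memory handling can be exercised on demand. Seeding the generator explicitly makes a failing run reproducible.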

Depending on the complexity of the API at the level where faults are injected, fault injection tests often must be carefully designed to minimize the number of false positives. Even a well-designed fault injection test can sometimes produce situations that are impossible in the normal operation of the software. For example, imagine there are two API functions, Commit and PrepareForCommit, such that each of these functions can fail on its own, but if PrepareForCommit is called and succeeds, a subsequent call to Commit is guaranteed to succeed. Now consider the following code:

error = PrepareForCommit();
if (error == SUCCESS) {
    error = Commit();
    assert(error == SUCCESS);
}

Often, it will be infeasible for the fault injection implementation to keep track of enough state to make the guarantee that the API functions make. In this example, a fault injection test of the above code might hit the assert, whereas this would never happen in normal operation.
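
For this two-function case the required state is small enough to sketch. The C fragment below assumes a hypothetical injector with a single armed fault and one flag recording a successful PrepareForCommit. The enum values, arm_fault and take_fault are illustrative names, and the two API functions are reduced to stubs that do nothing except succeed or fail.

```c
#include <stdbool.h>

enum status { SUCCESS = 0, FAILURE = 1 };

/* Hypothetical injector state. */
static bool inject_armed;  /* fail the next eligible call */
static bool prepare_ok;    /* PrepareForCommit succeeded, Commit still pending */

void arm_fault(void)
{
    inject_armed = true;
}

/* Consume the armed fault, if any. */
static bool take_fault(void)
{
    bool fire = inject_armed;
    inject_armed = false;
    return fire;
}

enum status PrepareForCommit(void)
{
    prepare_ok = !take_fault();
    return prepare_ok ? SUCCESS : FAILURE;
}

enum status Commit(void)
{
    /* Honour the API guarantee: never inject after a successful prepare. */
    if (prepare_ok) {
        prepare_ok = false;
        return SUCCESS;
    }
    return take_fault() ? FAILURE : SUCCESS;
}
```

A fault armed between a successful PrepareForCommit and the following Commit is held back until a call where failure is actually possible, so the assert in the example cannot fire spuriously. Real APIs usually carry many such guarantees, which is what makes tracking the state infeasible in general.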

Notes and References

  1. M. Moradi, B. Van Acker, K. Vanherpen, and J. Denil, "Model-Implemented Hybrid Fault Injection for Simulink (Tool Demonstrations)," in Cyber Physical Systems. Model-Based Design, R. Chamberlain, W. Taha, and M. Törngren, Eds., Lecture Notes in Computer Science, vol. 11615, Springer International Publishing, pp. 71–90, 2019. doi:10.1007/978-3-030-23703-5_4. ISBN 9783030237035. S2CID 195769468. https://figshare.com/articles/preprint/Model-Implemented_Hybrid_Fault_Injection_for_Simulink_Tool_Demonstrations_/12479930
  2. C. Shepherd, K. Markantonakis, N. Van Heijningen, D. Aboulkassimi, C. Gaine, T. Heckmann, and D. Naccache, "Physical fault injection and side-channel attacks on mobile devices: A comprehensive analysis," Computers & Security, Elsevier, vol. 111, 102471, 2021. arXiv:2105.04454. doi:10.1016/j.cose.2021.102471. S2CID 236957400.
  3. H. Bar-El, H. Choukri, D. Naccache, M. Tunstall, and C. Whelan, "The sorcerer's apprentice guide to fault attacks," Proceedings of the IEEE, vol. 94, no. 2, pp. 370–382, 2004. doi:10.1109/JPROC.2005.862424. S2CID 2397174.
  4. J. Voas, "Fault Injection for the Masses," Computer, vol. 30, pp. 129–130, 1997.
  5. R. Kaksonen, "A Functional Method for Assessing Protocol Implementation Security," 2001. https://publications.vtt.fi/pdf/publications/2001/P448.pdf
  6. A. Avizienis, J.-C. Laprie, Brian Randell, and C. Landwehr, "Basic Concepts and Taxonomy of Dependable and Secure Computing," Dependable and Secure Computing, vol. 1, pp. 11–33, 2004.
  7. J. V. Carreira, D. Costa, and S. J. G, "Fault Injection Spot-Checks Computer System Dependability," IEEE Spectrum, pp. 50–55, 1999.
  8. A. Benso and P. Prinetto, Fault Injection Techniques and Tools for Embedded Systems Reliability Evaluation, Frontiers in Electronic Testing, Springer US, 2003. ISBN 978-1-4020-7589-6.
  9. "Optimizing fault injection in FMI co-simulation through sensitivity partitioning," Proceedings of the 2019 Summer Simulation Conference. dl.acm.org. 2020-06-14.
  10. M. Moradi, B. J. Oakes, M. Saraoglu, A. Morozov, K. Janschek, and J. Denil, "Exploring Fault Parameter Space Using Reinforcement Learning-Based Fault Injection," 2020.
  11. R. Svenningsson, J. Vinter, H. Eriksson, and M. Torngren, "MODIFI: A MODel-Implemented Fault Injection Tool," Lecture Notes in Computer Science, vol. 6351, pp. 210–222, 2010.
  12. G. A. Kanawati, N. A. Kanawati, and J. A. Abraham, "FERRARI: A Flexible Software-Based Fault and Error Injection System," IEEE Transactions on Computers, vol. 44, pp. 248, 1995.
  13. T. Tsai and R. Iyer, "FTAPE: A Fault Injection Tool to Measure Fault Tolerance," presented at Computing in aerospace, San Antonio; TX, 1995.
  14. S. Han, K. G. Shin, and H. A. Rosenberg, "DOCTOR: An IntegrateD SOftware Fault InjeCTiOn EnviRonment for Distributed Real-time Systems," presented at International Computer Performance and Dependability Symposium, Erlangen; Germany, 1995.
  15. S. Dawson, F. Jahanian, and T. Mitton, "ORCHESTRA: A Probing and Fault Injection Environment for Testing Protocol Implementations," presented at International Computer Performance and Dependability Symposium, Urbana-Champaign, USA, 1996.
  16. Grid-FIT Web-site. http://wiki.grid-fit.org/
  17. N. Looker, B. Gwynne, J. Xu, and M. Munro, "An Ontology-Based Approach for Determining the Dependability of Service-Oriented Architectures," in the proceedings of the 10th IEEE International Workshop on Object-oriented Real-time Dependable Systems, USA, 2005.
  18. N. Looker, M. Munro, and J. Xu, "A Comparison of Network Level Fault Injection with Code Insertion," in the proceedings of the 29th IEEE International Computer Software and Applications Conference, Scotland, 2005.
  19. LFI Website. http://lfi.epfl.ch/
  20. beSTORM product information. http://www.beyondsecurity.com/black-box-testing.html
  21. ExhaustiF SWIFI Tool Site. http://www.exhaustif.es
  22. Holodeck product overview. http://www.securityinnovation.com/holodeck/index.shtml
  23. Codenomicon Defensics product overview. http://www.codenomicon.com/defensics/
  24. Mu Service Analyzer. http://www.mudynamics.com/products/overview.html
  25. Mu Dynamics, Inc. http://www.mudynamics.com/
  26. Xception Web Site. http://www.criticalsoftware.com/en/products/p/xception
  27. Critical Software SA. http://www.criticalsoftware.com
  28. A. Abbasinasab, M. Mohammadi, S. Mohammadi, S. Yanushkevich, and M. Smith, "Mutant Fault Injection in Functional Properties of a Model to Improve Coverage Metrics," 2011 14th Euromicro Conference on Digital System Design, pp. 422–425, 2011. doi:10.1109/DSD.2011.57. ISBN 978-1-4577-1048-3. S2CID 15992130.
  29. N. Looker, M. Munro, and J. Xu, "Simulating Errors in Web Services," International Journal of Simulation Systems, Science & Technology, vol. 5, 2004.