
Simple method for embedded code fault detection

Posted: 12 May 2014

Keywords: embedded system, DSP, state machines, fault detection, debuggers

In the technical literature there are a number of valid ways to define a system, and an embedded system in particular. For the purposes of this article we will use one of the most general and classic models of systems theory: a system is an interconnection of subunits that can be modelled by data and control inputs, a state machine, and data and control outputs (figure 1).

What turns this into an embedded system is that most of it is hidden and inaccessible, and that it is often characterized by real-time constraints: a sort of black box that must react to a set of stimuli with the behaviour the end user (the customer) expects of the system. Any deviation from this behaviour is reported by the customer as a defect, a fault.

Figure 1: Input-state-output system model.
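The model in figure 1 can be sketched in C as a step function that maps the current state and the inputs to the next state and the outputs. This is only a minimal illustration: the states, the fields of `input_t` and `output_t`, the `DATA_LIMIT` range and the `system_step` function are all hypothetical names chosen for the example, not taken from any real product.

```c
#include <stdint.h>

/* Hypothetical states; a real system defines its own. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FAULT } state_t;

typedef struct {
    int32_t data;   /* data input, e.g. a sensor sample */
    uint8_t start;  /* control input: request to start */
    uint8_t stop;   /* control input: request to stop */
} input_t;

typedef struct {
    int32_t data;   /* data output */
    uint8_t busy;   /* control output: system is running */
    uint8_t alarm;  /* control output: a fault was detected */
} output_t;

#define DATA_LIMIT 1000  /* assumed valid input range: -1000..1000 */

/* One step of the machine: the next state and the outputs depend
 * only on the current state and on the current inputs. */
state_t system_step(state_t state, const input_t *in, output_t *out)
{
    state_t next = state;

    out->data = 0;
    out->busy = 0;
    out->alarm = 0;

    switch (state) {
    case ST_IDLE:
        if (in->start) {
            next = ST_RUNNING;
        }
        break;
    case ST_RUNNING:
        if (in->data > DATA_LIMIT || in->data < -DATA_LIMIT) {
            next = ST_FAULT;      /* input outside the assumed range */
            out->alarm = 1;
        } else if (in->stop) {
            next = ST_IDLE;
        } else {
            out->data = in->data * 2;  /* placeholder processing */
            out->busy = 1;
        }
        break;
    case ST_FAULT:
        out->alarm = 1;           /* keep signalling until stopped */
        if (in->stop) {
            next = ST_IDLE;
        }
        break;
    }
    return next;
}
```

Seen from outside, only the contents of `input_t` and `output_t` are observable; the `state_t` value is the hidden part of the black box.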

At the time of their occurrence, faults are characterized by their impact on product functionality and by the chain of events that led to their manifestation. Deciding how to handle a fault when it appears and how to compensate for its effects is typically a static design issue, governed by the allowed tolerance level and the system's functional constraints. On the other hand, collecting run-time information about the causes of the system's misbehaviour should be a dynamic process, as flexible and scalable as possible.

As a matter of fact, fault handling strategies are commonly defined at the system design stage. They result from weighing the drawbacks that naturally arise when a minimal degree of divergence from the expected behaviour is allowed, and from setting the threshold above which the effects have to be considered unacceptable.

When the degradation in performance resulting from the fault is such that countermeasures are needed, the recovery actions have to be well defined in the specifications. The occurrence of an unexpected behaviour, unwanted by its nature, does not necessarily mean a complete loss of functionality, which is why establishing fallback solutions is typically part of the design process.

Depending on the system's nature, recovery actions can be handled by implementing redundancy or by allowing a temporary degradation of the service while corrections are performed. By contrast, defining strategies to collect data for debugging purposes is a step that is often left out, on the assumption that the test stages will filter out the defects. This deficiency is usually due to tight schedules in the development phase and to a lack of system resources, which results from fitting as many functions as possible into the product. Often, when a malfunction has to be dealt with in the field, strategies are decided and measures are set up on the fly.

One of the problems with embedded systems is that they are indeed embedded; that is, access to information is usually far from guaranteed. During the several test phases a product goes through, the designers and troubleshooters can usually make use of intrusive tools, such as target debuggers and oscilloscopes, to isolate a fault. Once the product is in service, it is often impossible to use such instruments, and the available investigation tools may not be sufficient to identify the root cause of the problem within a time that is reasonable from the customer's perspective. Moreover, establishing strict synchronisation between recording instruments and internal fault detection is not always possible, with the result that data collected at the inputs and outputs cannot be clearly tied to the fault occurrence itself and has to be correlated manually.

The other problem is fault localisation. As the system grows in complexity, the number of possible deviations from the expected operation increases. This is one reason (but not the only one) why the more complex the system is, the greater the distance between the fault and its symptoms can be. Symptoms alone are not sufficient to identify the root cause of a problem. The relevant information is hidden in the inputs and in the state of the system at the moment the fault occurred, but in most cases this information is gone forever.
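One common way to keep that information from being lost is a small circular trace buffer that records the most recent inputs and states and is frozen when a fault is detected. The sketch below is a generic illustration of that idea, not necessarily the method this article goes on to describe; `TRACE_DEPTH`, the entry layout and all function names are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 8u  /* assumed depth; size it to the available RAM */

/* One snapshot of what the system saw and where it was. */
typedef struct {
    uint32_t tick;   /* time stamp, e.g. a system tick counter */
    uint8_t  state;  /* state machine state at that tick */
    int32_t  input;  /* input value sampled at that tick */
} trace_entry_t;

typedef struct {
    trace_entry_t entries[TRACE_DEPTH];
    size_t  head;    /* next slot to write */
    size_t  count;   /* number of valid entries, at most TRACE_DEPTH */
    uint8_t frozen;  /* set on fault so the history is preserved */
} trace_t;

void trace_init(trace_t *t)
{
    t->head = 0;
    t->count = 0;
    t->frozen = 0;
}

/* Record one snapshot, overwriting the oldest one when full. */
void trace_record(trace_t *t, uint32_t tick, uint8_t state, int32_t input)
{
    if (t->frozen) {
        return;  /* keep the pre-fault history untouched */
    }
    t->entries[t->head] = (trace_entry_t){ tick, state, input };
    t->head = (t->head + 1u) % TRACE_DEPTH;
    if (t->count < TRACE_DEPTH) {
        t->count++;
    }
}

/* Call from the fault handler to stop further overwriting. */
void trace_freeze(trace_t *t)
{
    t->frozen = 1;
}

/* Return the i-th oldest entry (0 = oldest), or NULL if out of range. */
const trace_entry_t *trace_get(const trace_t *t, size_t i)
{
    if (i >= t->count) {
        return NULL;
    }
    size_t oldest = (t->head + TRACE_DEPTH - t->count) % TRACE_DEPTH;
    return &t->entries[(oldest + i) % TRACE_DEPTH];
}
```

Because every entry carries its own tick, the frozen history is already aligned with the moment of detection and does not need to be correlated manually with external recordings. The cost is a fixed amount of RAM and a few instructions per step, which has to fit the resource budget discussed above.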
