Archive for May, 2009

Improving EEG Analysis

Friday, May 29th, 2009

neuro_brain1The Technology Review article Better Brain-Wave Analysis looks at start-up ElMindA that is trying to find new quantitative methods for broadening the clinical use of EEG.

The company has developed a novel system that calculates a number of different parameters from EEG data, such as the frequency and amplitude of electrical activity in particular brain areas, the origin of specific signals, and the synchronicity in activity in two different brain areas as patients perform specific tests on a computer.

This description doesn't sound very novel, but I've always felt that EEG analysis has tremendous clinical potential. This is particularly true for rehabilitation purposes (like the stroke example) and EEG-based communications for paralysis patients.

I am skeptical of  "objective diagnosis" claims for things like attention-deficit hyperactivity disorder (ADHD) though.  In the 1980's EEG Topography was thought to be able to distinguish some psychiatric disorders. These claims were never proven to be true.

I'm not saying that new quantitative techniques like those being developed by ElMindA are comparible to the old EEG "brain mapping", but significant clinical validation will be required before they can be used clinically.

UPDATE (6/3/09):  Another related article (also from Technology Review) Reading the Surface of the Brain (and cool picture):

laser_neuro_x220

UPDATE (6/10/09): The BrainGate project at Massachusetts General Hospital has recently started clinical trials that may help paralyzed patients. The on-going MGH project:

... the ultimate goals of which include "turning thought into action": developing point and click capabilities on a computer screen, controlling a prosthetic limb and a robotic arm, controlling functional electrical stimulation (FES) of nerves disconnected from the brain due to paralysis, and further expanding the neuroscience underlying the field of intracortical neurotechnology.

UPDATE (7/5/09): This is more related to Brain Control Headsets, but if you're interested in developing your own EEG-based controller you should check out An SDK for your brain. The free NeuroSky MindSet Development Tools along with a $200 headset will get you started developing your own "mind-controlled" game. Good luck with that!

Guest Article: Static Analysis in Medical Device Software (Part 1) — The Traps of C

Friday, May 15th, 2009

Any software controlled device that is attached to a human presents unique and potentially life threatening risks.  A recent article on the use of static analysis for medical device software prompted Pascal Cuoq at Frama-C to share his thoughts on the subject. This is part 1 of 3.

The article Diagnosing Medical Device Software Defects Using Static Analysis gives an interesting overview of the applicability of static analysis to embedded medical software. I have some experience in the field of formal methods (including static analysis of programs), and absolutely none at all in the medical domain. I can see how it would be desirable to treat software involved at any stage of a medical procedure as critical, and coincidentally, producing tools for managing critical software has been my main occupation for the last five years. This blog post constitute the first part of what I have to say on the subject, and I hope someone finds it useful.

As the article states, in the development of medical software, as in many other embedded applications, C and C++ are used predominantly, for better or for worse. The "worse" part is an extensive list of subtle and less subtle pitfalls that seem to lay in each of these two languages' corner.

The most obvious perils can be avoided by restricting the programmer to a safer subset of the language -- especially if it is possible to recognize syntactically when a program has been written entirely in the desired subset. MISRA C, for instance, defines a set of rules, most of them syntactic, that help avoid the obvious mistakes in C. But only a minority of C's pitfalls can be eliminated so easily. A good sign that coding style standards are no silver bullet is that there exist so many. Any fool can invent theirs, and some have. The returns of mandating more and more coding rules diminish rapidly, to the point that overdone recommendations found in the wild contradict each other, or in the worst case, contradict common sense.

Even written according to a reasonable development standard, a program may contain bugs susceptible to result in run-time errors. Worse, such a bug may, in some executions, fail to produce any noticeable change, and in other executions crash the software. This lack of reproducibility means that a test may fail to reveal the problem, even if the problematic input vector is used.

A C program may yet hide other dangerous behaviors. The ISO 9899:1999 C standard, the bible for C compilers implementers and C analyzers implementers alike, distinguishes "undefined", "unspecified", and "implementation-defined" behaviors. Undefined behaviors correspond roughly to the run-time errors mentioned above. The program may do anything if one of these occurs, because it is not defined by the standard what it should do. A single undefined construct may cause the rest of the program to behave erratically in apparently unrelated ways. Proverbially, a standard-compliant compiler may generate a program that causes the device to catch fire when a division by zero happens.

Implementation-defined behaviors represent choices that are not imposed by the standard, but that have to be made by the compiler once and for all. In embedded software, it is not a viable goal to avoid implementation-defined constructions: the software needs to use them to interface with the hardware. Additionally, size and speed constraints for embedded code often force the developer to use implementation-defined constructs even where standard constructs exist to do the same thing.

However, in the development of critical systems, the underlying architecture and compiler are known before software development starts. Some static analysis techniques lend themselves well to this kind of parameterization, and many available tools that provide advanced static analysis can be configured for the commonly available embedded processors and compilers. Provided that the tests are made with the same compiler and hardware as the final device, the existence of implementation-defined behaviors does not invalidate testing as a quality assurance method, either.

Unspecified behaviors are not treated as seriously as they should by many static analysis tools. That's because unlike undefined behaviors, they cannot set the device on fire. Still, they can cause different results from one compilation to the other, from one execution to the other, or even, when they occur inside a loop, from one iteration to the other. Like the trickiest of run-time errors, they lessen the value of tests because they are not guaranteed to be reproducible.

The "uninitialized variable" example in the list of undesirable behaviors in the article is in fact an example of unspecified behavior. In the following program, the local variable L has a value, it is only unknown which one.

Computing (L-L) in this example reliably give a result of zero.

Note: For the sake of brevity, people who work in static analysis have a tendency to reduce their examples to the shortest piece of code that exhibits the problem. In fact, in writing this blog post I realized I could write an entire other blog post on the deformation of language in practitioners of static analysis. Coming back to the subject at hand, of course, no programmer wants to compute zero by subtracting an uninitialized variable from itself. But a cryptographic random generator might for instance initialize its seed variable by mixing external random data with the uninitialized value, getting at least as much entropy as provided by the external source but perhaps more. The (L-L) example should be considered as representing this example and all other useful uses of uninitialized values.

Knowledge of the compilation process and lower-level considerations may be necessary in order to reliably predict what happens when uninitialized variables are used. If the local variable L was declared of type float, the actual bit sequence found in it at run-time could happen to represent IEEE 754's NaN or one of the infinities, in which case the result of (L-L) would be NaN.

Uninitialized variables, and more generally unspecified behaviors, are indeed less harmful than undefined behaviors. Some "good" uses for them are encountered from time to time. We argue that critical software should not exhibit any unspecified behavior at all. Uses of uninitialized variables can be excluded by a simple syntactic rule "all local variables should be initialized at declaration", or, if material constraints on the embedded code mean this price is too high to pay, with one of the numerous static analyzers that reliably detect any use of an uninitialized variable. Note that because of C's predominant use of pointers, it may be harder than it superficially appears to determine if a variable is actually used before being initialized or not; and this is even in ordinary programs.

There are other examples of unspecified behaviors not listed in the article, such as the comparison of addresses that are not inside a same aggregate object, or the comparison of an invalid address to NULL. I am in fact still omitting details here. See the carefully worded §6.5.8 in the standard for the actual conditions.

An example of the latter unspecified behavior is (p == NULL) where p contains an invalid address computed as t+12345678 (t being a char array with only 10000000 cells). This comparison may produce 1 when t happens to have been located at a specific address by the compiler, typically UINT_MAX-12345677. It also produces 0 in all other cases. If there is an erroneous behavior that manifests itself only when this condition produces 1, a battery of tests is very unlikely to uncover the bug, which may remain hidden until after the device has been widely deployed.

An example of comparison of addresses that are not in the same aggregate object is the comparison (p <= q), when p and q are pointers to memory blocks that have both been obtained by separate calls to the allocation function malloc. Again, the result of the comparison depends on uncontrolled factors. Assume such a condition made its way by accident in a critical function. The function may have been unit-tested exhaustively, but the unit tests may not have taken into account the previous sequence of bloc allocations and deallocations that results in one block being positioned before or after the other in the heap. A typical static analysis tool is smarter, and may consider both possibilities for the result of the condition, but we argue that in critical software, the fact that the result is unspecified should in itself be reported as an error.

Another failure mode for programs written in C or any other algorithmic language is the infinite loop. In embedded software, one is usually interested in an even stronger property than the absence of infinite loops, the verification of a predetermined bound on the execution time of a task. Detection of infinite loops is a famous example of undecidable problem. Undecidable problems are problems for which it is mathematically impossible to provide an algorithm that for any input (here, a program to analyze) eventually answers "yes" or "no". People moderately familiar with undecidability sometimes assume this means it is impossible to make a static analyzer that provides useful information on the termination of an analyzed program, but the theoretical limitation can be worked around by accepting a little imprecision (false negatives, or false positives, or both, in the diagnostic), or by allowing that the analyzer itself will, in some cases, not terminate.

The same people who recognize termination of the analyzed program as an undecidable property for which theory states that a perfect analyzer cannot be made, usually fail to recognize that finely recognizing run-time errors or unspecified behaviors are undecidable problems as well. For these questions, it is also mathematically impossible to build an analyzer that always terminates and emits neither false positives nor false negatives.

Computing the worse-case execution time is at least as hard as verifying termination, therefore it's undecidable too. That's for theory. In practice, there exist useful static analyzers that provide guaranteed worse case execution times for the execution of a piece of software. They achieve this by limiting the scope of the analysis, firstly, to the style of code that is common in embedded software, and secondly, to the one sub-task whose timing is important. This kind of analysis cannot be achieved using the source code alone. The existing analyzers all use the binary code of the task at some point, possibly in addition to the source code, a sample of the processor to be used in the device, or only an abstract description of the processor.

This was part one of the article, where I tried to provide a list of issues to look for in embedded software. In part two, I plan to talk about methodology. In part three, I will introduce formal specifications, and show what they can contribute to the issue of software verification.

Continuous Learning: 14 Ways to Stay at the Top of Your Profession

Saturday, May 9th, 2009

"Professional development refers to skills and knowledge attained for both personal development and career advancement. " I'm fortunate in that my personal and career interests are well aligned. I must enjoy my work because I do a lot of the same activities with a majority of my free time (just ask my wife!).

Keeping up with an industry's current technologies and trends is a daunting task.  Karl Seguin's post Part of your job should be to learn got me to thinking about the things I do to stay on top of my interests.  I never really thought about it much before, but as I started making a list I was surprised by how fast it grew.  When it reached a critical mass that I thought it would be worth sharing.

I actually have two professions. I'm a Biomedical Engineer (formal training) and a Software Engineer (self proclaimed).  I primarily do software design and development, but being in the medical device industry also requires that I keep abreast of regulatory happenings (the FDA in particular, HIPAA, etc.), quality system issues,  and industry standards (e.g. HL7).

Keeping track of Healthcare IT trends is also a big task. With the new emphasis by the federal government on EMR adoption, even a small company like mine has started planning and investing in the future demand for medical device integration.

The other major topic of interest to me is software design and development methodologies. A lot of the good work in this area seems to come from people that are involved in building enterprise class systems. I've discussed the ALT.NET community (here) and still think they are worth following.

So here's my list.  I talk about them with respect to my interests (mostly software technologies), but I think they are generally applicable to any profession.

1. Skunk Works

Getting permission from your manager to investigate new technologies that could potentially be used by your company is win-win. In particular, if you can parlay your new-found skills into a product that makes money (for the company, of course), then it's WIN-WIN.

In case you've never heard this phrase:  Skunk works.

2. Personal Projects

I always seem to be working with a new software development tool or trying to learn a new programming language. Even if you don't become an expert at them, I think hands-on exposure to other technologies and techniques is invaluable. It gives you new perspectives on the things that you are an expert in.

Besides getting involved in an open source project, people have many interesting hobby projects.  See Do you have a hobby development project? for some examples.

3. Reading Blogs

I currently follow about 40 feeds on a variety of topics. I try to remove 2-3  feeds and replace them with new ones at least once a month. Here is my Google Reader trend for the last 30 days:

30 day RSS trendYou can see I'm pretty consistent. That's 1605 posts in 30 days, or about 53 posts per day. To some, this may seem like a lot. To others, I'm a wimp.  During the week I usually read them over lunch or in the evening.

4. Google Alerts

Google Alerts is a good way to keep track of topics and companies of interest. You get e-mail updates with news and blog entries that match any search term. For general search terms use 'once a day' and for companies use 'as-it-happens'.

5. Social Networks

I joined Twitter over a month ago.  The 30 or so people I follow seem to have the same interests as I do. What's more important is that they point me to topics and reference sites that I would not have discovered otherwise. I've dropped a few people that were overly verbose or had mostly inane (like  "I'm going to walk the dog now.") tweets.

I'm also a member of LinkedIn. Besides connecting with people you know there are numerous groups you can join and track topical discussions. Unfortunately, there are quite a few recruiters on LinkedIn which somewhat diminishes the experience for me.

I don't have a Facebook account because my kids told me you have to be under 30 to join. Is that true? 🙂

6. Books

I browse the computer section of the bookstore on a regular basis.  I even buy a technical book every now and then.

Downloading free Kindle e-books is another good source (and free, of course) e.g. here are a couple though Karl's post: Foundations of Programming. There's a lot of on-line technical reading material around. Having a variety on the Kindle allows me to read them whenever the mood strikes me.  One caution though: the Amazon conversion from PDF and HTML to e-book format is usually not very good. This is particularly true for images and code. But still, it's free -- you get what you pay for.

7. Magazines

There are numerous technical print publications around, but they are becoming rare because of the ease of on-line alternatives.  I used to get Dr. Dobbs journal but they no longer publish a print version, but it is still available electronically though.

I miss that great feeling of cracking open a fresh nerd magazine.  I still remember the pre-Internet days when I had stacks of BYTE laying around the house.

8. Webinars

These tend to be company sponsored, but the content about a product or service that you may not know a lot about is a good way to learn a new subject.  You just have to filter out the sales pitch. You typically get an e-mail invitation for these directly from a vendor.

9. Local User Groups

I've talked about this before (at the end of the post).  In addition to software SIGs, look into other groups as well. For me, IEEE has a number of interesting lectures in the area.

Face to face networking with like professionals is very important for career development ("It's not what you know -- it's who you know" may be a cliche, but it’s true.).  Go and participate as much as possible.

If there's not a user group in your area that covers your interests, then start your own! For example: Starting a User Group, Entry #1 (first entry of 4).

10. Conferences and Seminars

Press your employer for travel and expenses, and go when you can. This is another win-win for both of you.  Like Webinars, vendor sponsored one day or half day seminars can be valuable.  Also, as in #9, this is another opportunity to network.

Just getting out of the office every now and then is a good thing.

11. Podcasts

These may be good for some people, but I rarely listen to podcasts.  My experience is that the signal to noise ratio is very low (well below 1). You have to listen to nonsense for long periods of time before you get anything worthwhile. But that's just me. Maybe I don't listen to the right ones?

12. Discussion Sites

CodeProject and Stack Overflow are my favorites. Also, if you do a search at Google Groups you can find people talking about every conceivable subject.

Asking good questions and providing your expertise for answers is a great way to show your professionalism.

13. Blogging

IMO your single most important professional skill is writing. Having a blog that you consistently update with material that interests you is a great way to improve your writing skills.  It forces you to organize your thoughts and attempt to make them comprehensible (and interesting) to others.

14. Take a Class

If you have a University or College nearby, they probably have an Extension system that provide classes.  Also, there are free on-line courses available. e.g.: Stanford, MIT, and U. of Wash.

UPDATE (6/23/09): Here's some more fuel for #13: The benefits of technical blogging. All good points.

——
CodeProject Note:  This is not a technical article but I decided to add the 'CodeProject' tag anyway. I thought the content might be of general interest to CPians even though there's no code here.