Testing Snowcat software


After two months on the road, I find myself at home getting ready for winter at the same time upper New York State is being buried by snow and using special “snow machines” (this is what the news called them or what I call a snowcat) to save people. I have “tested” one of these machines. Yes they have software.  And yes they have bugs.Snowcat

I posted on tweeter a question about how one would go about testing such a software-system. Of course the simple answer is that you would test it, just as any piece of software (test requirements and functions), but this can be lacking. My first answer (and probably last phase of testing) would be to do field testing, which is where I found my bugs.

In this case I mean testing in the field with a user. Here is what I did:

Field conditions defined = Snowy mountain side (3 to 5 feet of snow), at night, in the cold (-10 C), novice user (users must be considered), and is machine warmed up to a good running temperature  (these systems take a while to “warm” up)

The exact definition of these should be captured in the test setup, but I won’t do that in a blog posting.

Test defined = Novice user runs machine. Adjust machine blade setting. Adjust machine power settings. Adjust machine speed settings. Adjust machine direction settings. Vary settings (combinatorial test attack) from low to high.

I was doing exploratory testing but guided by an attack. Here is what I found and could repeat.

Bugs report =  Bug 1) Unexpected (and documented) machine safe mode (engine running but no movements)  entered when combined rapid inputs of power, speed, direction, and blade done at the same time. No user and safety warnings were documented about this combination being “not permitted”. System had to be reset (powered off and wait 60 seconds) to clear the error condition. Bug 2) Micro process computer warning message “110a” did not inform user of actions to take, was meaningless (like message #404), and no user documentation of this message type was provided in operations guide. Bug 3) In a better system design, before safing the system, an alarm warning the user to “stop” current input actions, might have been sounded to improve usability and safety of the system to avoid system safing.

So in my partial answer to my own question, you can see what I did and found. These bugs made a user “unhappy” (we want happy users).

There should be more to testing an embedded system that what I outline here. Having a user-tester find such errors in the field can and should be avoided. We have a long way to go in testing embedded software control devices. Some industries get it and some are still learning.