This was a busy week. Progress is slow but constant. Meanwhile, we want to share some thoughts on the appliance development we’re starting at Next Stream Ltd. What should the main features of an appliance product be? Our top three:
- Easy to use
- Stable
- Resilient
Appliances must be easy to use, and not only for technical staff. If you ever need to interact with the device, you shouldn’t be expected to know all the technical details of the subject. This is mostly a matter of user interface: interaction between the user and the device should be straightforward. Define a goal and let the machine ‘walk the walk’. No one wants a full-time job dealing with a bunch of computers that constantly need attention. A simplified user interface with embedded, contextual documentation is a must. Whenever you enter configuration details or click a button, it must be clear what will happen, and the contextual help should give you this clarity. Data validation and verification, on the other hand, should prevent you from shooting yourself in the foot. So, to make an appliance easy to use:
- minimise the number of configuration options
- embed documentation in configuration screens
- always validate input for logical correctness
Don’t let your users make mistakes. It is better to sacrifice configurability than to allow malicious actions to be taken against your device.
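To illustrate what we mean by validating for logical correctness rather than just syntax, here is a minimal Python sketch. It assumes a hypothetical network settings screen with three fields (device address, prefix length, gateway); the function name and the error messages are our own invention for illustration, not part of any product.

```python
import ipaddress


def validate_network_config(address, prefix_len, gateway) -> list:
    """Return a list of human-readable problems; an empty list means the input is acceptable."""
    # Step 1: syntactic checks. Stop early, later checks need parsed values.
    try:
        iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    except ValueError:
        return [f"'{address}/{prefix_len}' is not a valid address and prefix length."]
    try:
        gw = ipaddress.ip_address(gateway)
    except ValueError:
        return [f"'{gateway}' is not a valid gateway address."]

    # Step 2: logical checks. Each value is valid alone; do they make sense together?
    errors = []
    if gw.version != iface.version:
        errors.append("Gateway and device address must both be IPv4 or both be IPv6.")
    elif gw not in iface.network:
        errors.append(f"Gateway {gw} is outside the device network {iface.network}.")
    if gw == iface.ip:
        errors.append("Gateway must differ from the device's own address.")
    return errors


if __name__ == "__main__":
    # A well-formed gateway that is unreachable from the configured network.
    for problem in validate_network_config("192.168.1.10", 24, "10.0.0.1"):
        print(problem)
```

The messages returned here could double as the contextual help shown next to the field, so the user learns why a value was rejected instead of just seeing a red box.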
Next on our list: stability. Your device must be stable. It must deliver what’s promised, and it must deliver it under almost all circumstances. Its behaviour must be deterministic. To achieve stability, minimise ‘moving parts’: keep the number of options minimal and keep the number of running components minimal. Test, test, TEST! Then test some more. We cannot stress enough how important testing is. There are two components that, mixed together, will give you stability: proper design and testing. So, think first, code later. And when coding is finished, test! Without entering the dark territories of proper test management, regression testing, fuzz testing, etc., we advise thinking about testing procedures during the early stages of development. At least we try to do it this way.
Stable operation is the path to resilience. Once your appliance is stable and predictable enough, you can move on to making it resilient. There are many operating systems out there, but only a few are really designed to work in any environment. Linux is not one of those, and neither are most other open source projects. Why? Because they follow design paradigms more than 30 years old, which were meant to accommodate the computers of that era. And we all suffer from this. Our everyday operating systems are not stable. How can they be resilient?!
There is work in this direction. QNX, RTEMS and Minix 3 are several examples. The new Minix 3 architecture is really impressive. Prof. Tanenbaum and his team came up with the idea of a ‘state server’: a system process that is constantly updated with state information from other processes. When one of those processes crashes, it is automatically restarted and its state information is restored, so it can continue operation from the moment right before the crash. This is a really good idea if you want to make your software resilient. The code is out there, and so are the papers describing the process in detail. We admire one of the Minix 3 goals: to make computers as reliable as a TV set.
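The checkpoint-and-restart idea described above can be sketched in a few lines of Python. This is a toy, in-process model under our own assumptions, not Minix 3 code: `StateStore`, `CountingWorker` and `supervise` are hypothetical names, and a raised exception stands in for a process crash.

```python
class StateStore:
    """Hypothetical stand-in for the 'state server': keeps the last state each worker reported."""

    def __init__(self):
        self._saved = {}

    def save(self, name, state):
        # Store a copy, so later mutation by the worker cannot corrupt the checkpoint.
        self._saved[name] = dict(state)

    def load(self, name):
        return dict(self._saved.get(name, {}))


class CountingWorker:
    """Toy worker that sums the integers it receives and checkpoints after every step."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        # Restore whatever was saved before a previous crash (nothing on first start).
        self.state = {"total": 0, **store.load(name)}

    def handle(self, value):
        if not isinstance(value, int):
            raise TypeError(f"cannot add {value!r}")  # simulated crash
        self.state["total"] += value
        self.store.save(self.name, self.state)
        return self.state["total"]


def supervise(inputs, store, name="counter"):
    """Feed inputs to a worker; after a crash, start a fresh worker restored from the store."""
    worker = CountingWorker(name, store)
    restarts = 0
    for value in inputs:
        try:
            worker.handle(value)
        except Exception:
            worker = CountingWorker(name, store)  # restart from the last saved state
            restarts += 1
    return worker.state["total"], restarts


if __name__ == "__main__":
    total, restarts = supervise([1, 2, "boom", 3], StateStore())
    print(total, restarts)
```

In this sketch the input that caused the crash is dropped rather than retried; a real system would have to decide whether the failed request is replayed, reported or discarded. The point is only that the restarted worker resumes from the last saved total instead of from zero.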
In conclusion, we would like to quote a story from Tom Van Vleck (Multics developer) about his discussion with Dennis Ritchie (UNIX developer):
… We went to lunch afterward, and I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, “We left all that stuff out. If there’s an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, ‘Hey, reboot it.’”
30+ years later, it’s the same.