Redundancy is not everything

05 10 10 - 20:08

Years ago when I was still maintaining highly available server clusters and thinking how to improve them, I learned quickly that redundancy of the servers by itself only brings you complications. The key to a meaningful redundant server setup are the sensory methods that monitor health of each server and the logic that acts upon those health states. One of the lessons I learned was that when you monitor some parameter via different methods and you get different outputs, it's usually the method that's at fault, be it either a timing issue or some simple text parsing (everyone loves to play with float numbers in bash</sarcasm>) error.

Now I just read an excellent report from  the Dutch Safety Board about a crash of Turkish 737-800 near Amsterdam Schiphol Airport last year. I was particularly interested in this accident because I know aircrafts usually carry two radio altimeters and I wondered what chain of events triggered a wrong reading from a single one that lead the plane to crash. Let me present my own view of this report and some thoughts that I got about the state of aviation software in general.