HW5: Reflections

Readings:

  • An Investigation of the Therac-25 Accidents by Nancy Leveson and Clark S. Turner
  • After Stroke Scans, Patients Face Serious Health Risks by Walt Bogdanich
  • Motor Vehicles Increasingly Vulnerable to Remote Exploits - FBI Public Service Announcement
  • The Role of Software in Spacecraft Accidents by Nancy G. Leveson
  • Who Killed the Virtual Case File? by Harry Goldstein
  • FBI Sentinel project is over budget and behind schedule, say IG auditors by Jeff Stein
  • Years Late and Millions Over Budget, FBI's Sentinel Finally On Line by Damon Poeter
  • FBI’s Sentinel System Still Not In Total Shape to Surveil by Robert N. Charette
  • Software Engineering (Ian Sommerville): 
    • Chapter 12: Safety Engineering
    • Chapter 13: Security Engineering  

Reflections:

          The readings above investigate a variety of software-related accidents and threats, from spacecraft crashes to the potential for hacking modern automobiles. The articles, although varied, share a common viewpoint: proper safety and security techniques are essential for designing and implementing dependable software. Chapter 12 of our textbook introduces the concept of safety-critical systems: “the system should never damage people or the system’s environment, irrespective of whether or not the system conforms to its specification” (Sommerville 327). Closely related, a dependable system is secure if it protects the confidentiality of its information, prevents that information from being changed by an unauthorized person, and remains available for use as intended (Sommerville 360). Flaws in a system’s security can also drastically affect its safety, as the FBI’s Public Service Announcement on cyber security threats to modern motor vehicles makes clear. The articles go into detail about how software systems were implemented incorrectly, what (sometimes devastating) accidents resulted, and what could have been done differently to avoid them.

          Commonalities between the articles include poor management of software design, where developers reused old code and ignored the golden rule of keeping everything simple; a lack of communication, or even of understanding, during program testing; and a general complacency, with schedule-versus-integrity trade-offs that left code functioning in less-than-optimal ways. Most of all, information about these accidents is very hard to find, which lulls people into a false sense of safety and security.

          Leveson and Turner’s article highlights software-related accidents that resulted in the deaths or serious injuries of patients exposed to the Therac-25, a radiation therapy machine released in the 1980s. They point out that software from the earlier Therac-20 was reused even though the Therac-25 gave the software more responsibility for maintaining safety than previous models had. Leveson’s article on the role of software in spacecraft accidents likewise highlights how software for the Ariane 501 (which was destroyed shortly after launch) was reused from the Ariane 4, even though the functions, while potentially useful, were not needed in either model (Leveson 2). In the FBI and SAIC’s Trilogy fiasco, the team relied on connecting ill-suited pieces of software to get an enormously complex system to run. All of these examples illustrate how hard it is to develop safe and secure software when developers lean on previously created code: the old code may run reliably, but it might not reliably do what the new system needs it to do, as the sketch after this paragraph tries to show.
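          To make the reuse hazard concrete: Leveson describes how the Ariane 501 failure traced back to an Ariane 4 alignment routine whose assumptions no longer held on the new rocket, where a value that had always fit into a 16-bit integer suddenly did not. The sketch below is purely illustrative (the real flight code was written in Ada, and the function names and numbers here are made up); it only shows how reused code can “work reliably” inside its old envelope and still silently fail in a new one.

```python
def to_int16_unchecked(value: float) -> int:
    """Reinterpret a value as a 16-bit signed integer, wrapping on overflow -
    a stand-in for the float-to-integer conversion in the reused alignment code
    (in the actual Ada code the failure surfaced as an Operand Error)."""
    n = int(value) & 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

def to_int16_checked(value: float) -> int:
    """The defensive alternative: reject out-of-range values explicitly
    instead of silently corrupting them."""
    if not -32768 <= value <= 32767:
        raise OverflowError(f"value {value} exceeds the 16-bit range")
    return int(value)

# Made-up numbers: the old rocket's flight profile kept the value small,
# so the unchecked conversion always "worked"; the new rocket's did not.
print(to_int16_unchecked(20_000.0))    # 20000 - looks fine
print(to_int16_unchecked(50_000.0))    # -15536 - silently wrong
try:
    to_int16_checked(50_000.0)
except OverflowError as err:
    print(err)                         # the fault is caught instead of hidden
```

          The reused routine was not “wrong” on the old system; it simply carried an assumption that nobody re-examined when it was carried over to the new one.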

          Another thing I want to discuss is the lack of openness about information. Some of the Therac-25 accidents, and the radiation overdoses received by patients undergoing stroke scans (Bogdanich’s article), show an alarming degree of information being hidden from the public and from the professionals using the software in the field, along with a general complacency about checking and rechecking the safety of these systems. Even the one person (Patton) who spoke up about the shortcomings of the FBI’s Trilogy project was then put under federal investigation. Or take the Mars Climate Orbiter: the fault was noticed and checked during the mission, but because no proper lines of communication had been set up, the people who found it were unable, or not motivated enough, to reach someone who could actually take care of the problem.

          With regard to security, the articles describing the long-drawn-out construction of the FBI’s Sentinel software and the PSA about motor vehicle hacking show that the drive to develop secure software tends to come after release, not before. Both examples explain that, although some work went into protecting things like confidentiality and availability, the time pressure to release the software meant that not enough penetration testing could be done to ensure the systems’ security. As the textbook points out, testers must exercise the whole system, not just its security, and therefore have only limited time to discover vulnerabilities that are open to exploitation; a hacker, however, has all the time in the world to poke and prod at the system, probing it for weaknesses (Sommerville 389).
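          To illustrate that gap between functional testing and security testing, here is a small hypothetical sketch of my own (it is not taken from the Sentinel or vehicle articles, and the table, data, and function names are invented): a lookup function that passes an ordinary “does it return the right row?” test while remaining wide open to SQL injection.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    """Builds the query by string concatenation - passes the functional test,
    but hostile input can rewrite the query itself."""
    return conn.execute(
        f"SELECT id, name FROM users WHERE name = '{username}'"
    ).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    """Same functional behaviour, but parameterised so input is always
    treated as data, never as part of the query."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

# The schedule-driven functional test, which both versions pass:
assert find_user_unsafe(conn, "alice") == [(1, "alice")]
assert find_user_safe(conn, "alice") == [(1, "alice")]

# The check only a security-minded tester (or an attacker) thinks to run:
print(find_user_unsafe(conn, "' OR '1'='1"))   # [(1, 'alice'), (2, 'bob')] - leaks every row
print(find_user_safe(conn, "' OR '1'='1"))     # [] - crafted input finds nothing
```

          The point is not this particular bug but the asymmetry: the functional test passes for both versions, so a team under schedule pressure could ship either one, while an attacker with unlimited time only needs the one input the test plan never tried.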

          So, although this has been a long rant about the various problems in all of these projects, a few things stand out: writing new code, rather than reusing old code, seems to be the better way of avoiding past problems (whether those problems were ever detected or not); software development teams need to communicate problems openly and work closely with testers to make sure tests are checking for the right outputs; and more time and effort need to go into developing secure systems.