History’s Worst Software Bugs [wired.com]

Simson Garfinkel 

 11.08.05

Last month automaker Toyota announced a recall of 160,000 of its Prius hybrid vehicles following reports of vehicle warning lights illuminating for no reason, and cars’ gasoline engines stalling unexpectedly. But unlike the large-scale auto recalls of years past, the root of the Prius issue wasn’t a hardware problem — it was a programming error in the smart car’s embedded code. The Prius had a software bug.
With that recall, the Prius joined the ranks of the buggy computer — a club that began in 1947 when engineers found a moth in Panel F, Relay #70 of the Harvard Mark II system. The computer was running a test of its multiplier and adder when the engineers noticed something was wrong. The moth was trapped, removed and taped into the computer’s logbook with the words: «first actual case of a bug being found.»
Sixty years later, computer bugs are still with us, and show no sign of going extinct. As the line between software and hardware blurs, coding errors are increasingly playing tricks on our daily lives. Bugs don’t just inhabit our operating systems and applications — today they lurk within our cell phones and our pacemakers, our power plants and medical equipment. And now, in our cars.
But which are the worst?
It’s all too easy to come up with a list of bugs that have wreaked havoc. It’s harder to rate their severity. Which is worse — a security vulnerability that’s exploited by a computer worm to shut down the internet for a few days or a typo that triggers a day-long crash of the nation’s phone system? The answer depends on whether you want to make a phone call or check your e-mail.
Many people believe the worst bugs are those that cause fatalities. To be sure, there haven’t been many, but cases like the Therac-25 are widely seen as warnings against the widespread deployment of software in safety-critical applications. Experts who study such systems, though, warn that even though the software might kill a few people, focusing on these fatalities risks inhibiting the migration of technology into areas where smarter processing is sorely needed. In the end, they say, the lack of software might kill more people than the inevitable bugs.
What seems certain is that bugs are here to stay. Here, in chronological order, is the Wired News list of the 10 worst software bugs of all time … so far.
July 22, 1962 — Mariner 1 space probe. A bug in the flight software for the Mariner 1 causes the rocket to divert from its intended path on launch. Mission control destroys the rocket over the Atlantic Ocean. The investigation into the accident discovers that a formula written on paper in pencil was improperly transcribed into computer code, causing the computer to miscalculate the rocket’s trajectory.
1982 — Soviet gas pipeline. Operatives working for the Central Intelligence Agency allegedly plant a bug in a Canadian computer system purchased to control the trans-Siberian gas pipeline. The Soviets had obtained the system as part of a wide-ranging effort to covertly purchase or steal sensitive U.S. technology. The CIA reportedly found out about the program and decided to make it backfire with equipment that would pass Soviet inspection and then fail once in operation. The resulting event is reportedly the largest non-nuclear explosion in the planet’s history.
1985-1987 — Therac-25 medical accelerator. A radiation therapy device malfunctions and delivers lethal radiation doses at several medical facilities. Based upon a previous design, the Therac-25 was an «improved» therapy system that could deliver two different kinds of radiation: either a low-power electron beam (beta particles) or X-rays. The Therac-25’s X-rays were generated by smashing high-power electrons into a metal target positioned between the electron gun and the patient. A second «improvement» was the replacement of the older Therac-20’s electromechanical safety interlocks with software control, a decision made because software was perceived to be more reliable.
What engineers didn’t know was that both the 20 and the 25 were built upon an operating system that had been kludged together by a programmer with no formal training. Because of a subtle bug called a «race condition,» a quick-fingered typist could accidentally configure the Therac-25 so the electron beam would fire in high-power mode but with the metal X-ray target out of position. At least five patients die; others are seriously injured.
1988 — Buffer overflow in Berkeley Unix finger daemon. The first internet worm (the so-called Morris Worm) infects between 2,000 and 6,000 computers in less than a day by taking advantage of a buffer overflow. The flaw lies in the daemon’s use of gets(), a function in the standard input/output library that reads a line of text, in this case a request arriving over the network. Unfortunately, gets() has no provision to limit its input, and an overly large input allows the worm to take over any machine to which it can connect.
Programmers respond by attempting to stamp out the gets() function in working code, but they refuse to remove it from the C programming language’s standard input/output library, where it remains to this day.
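The difference between an unbounded and a bounded read can be sketched in a few lines. This is only an analogy in Python, not the finger daemon's C code: the 8-byte buffer, the "SAFE" marker standing in for adjacent memory, and both helper functions are assumptions made up for illustration.

```python
BUF_SIZE = 8  # hypothetical buffer length, chosen for illustration

def make_memory():
    # Model a region of memory: an 8-byte buffer followed by 4 bytes of
    # adjacent data (standing in for something like a saved return address).
    return bytearray(b"\x00" * BUF_SIZE + b"SAFE")

def unbounded_read(memory, line):
    # In the spirit of gets(): copies every byte, never checking the buffer length.
    for i, byte in enumerate(line):
        memory[i] = byte

def bounded_read(memory, line, limit=BUF_SIZE):
    # In the spirit of a length-limited read: copies at most `limit` bytes.
    for i, byte in enumerate(line[:limit]):
        memory[i] = byte

attack = b"A" * BUF_SIZE + b"EVIL"  # input longer than the buffer

mem = make_memory()
unbounded_read(mem, attack)
overflowed = bytes(mem[BUF_SIZE:])  # adjacent data after the unchecked copy

mem = make_memory()
bounded_read(mem, attack)
protected = bytes(mem[BUF_SIZE:])   # adjacent data after the bounded copy

print(overflowed, protected)
```

In this toy model the unchecked copy replaces the adjacent bytes with attacker-chosen data, while the bounded copy leaves them untouched.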
1988-1996 — Kerberos Random Number Generator. The authors of the Kerberos security system neglect to properly «seed» the program’s random number generator with a truly random seed. As a result, for eight years it is possible to trivially break into any computer that relies on Kerberos for authentication. It is unknown if this bug was ever actually exploited.
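Why a poorly seeded generator is so damaging can be shown with a toy model. The sketch below is not the Kerberos code; it assumes, purely for illustration, a 56-bit key derived from a seed that falls somewhere in a known one-hour window of timestamps. The function names and the numbers are hypothetical.

```python
import random
import secrets

def weak_session_key(seed):
    # Hypothetical key generator: all randomness comes from a guessable seed.
    rng = random.Random(seed)
    return rng.getrandbits(56)

def recover_seed(observed_key, candidate_seeds):
    # An attacker simply tries every plausible seed until the key matches.
    for seed in candidate_seeds:
        if weak_session_key(seed) == observed_key:
            return seed
    return None

# Assumption: the seed is a timestamp within a known one-hour window.
window_start = 820_000_000           # arbitrary illustrative epoch second
true_seed = window_start + 1234
key = weak_session_key(true_seed)

guessed = recover_seed(key, range(window_start, window_start + 3600))
print(guessed == true_seed)

# One safer alternative: draw key material from the operating system's
# cryptographically secure generator instead of a seeded PRNG.
strong_key = secrets.randbits(56)
```

With only 3,600 candidate seeds, the search finishes almost instantly, which is the sense in which such a break is "trivial."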
January 15, 1990 — AT&T Network Outage. A bug in a new release of the software that controls AT&T’s #4ESS long distance switches causes these mammoth computers to crash when they receive a specific message from one of their neighboring machines — a message that the neighbors send out when they recover from a crash.
One day a switch in New York crashes and reboots, causing its neighboring switches to crash, then their neighbors’ neighbors, and so on. Soon, 114 switches are crashing and rebooting every six seconds, leaving an estimated 60,000 people without long distance service for nine hours. The fix: engineers load the previous software release.
1993 — Intel Pentium floating point divide. A silicon error causes Intel’s highly promoted Pentium chip to make mistakes when dividing floating-point numbers that occur within a specific range. For example, dividing 4195835.0/3145727.0 yields 1.33374 instead of 1.33382, an error of 0.006 percent. Although the bug affects few users, it becomes a public relations nightmare. With an estimated 3 million to 5 million defective chips in circulation, at first Intel only offers to replace Pentium chips for consumers who can prove that they need high accuracy; eventually the company relents and agrees to replace the chips for anyone who complains. The bug ultimately costs Intel $475 million.
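The 0.006 percent figure can be checked with a short calculation. The sketch below computes the quotient on whatever (presumably correct) hardware runs it and compares it with the flawed result quoted above; the variable names are just illustrative.

```python
correct = 4195835.0 / 3145727.0   # quotient as computed on a correctly working FPU
flawed = 1.33374                  # result the article reports for the defective chip

# Relative error of the flawed result, expressed as a percentage.
relative_error_percent = abs(correct - flawed) / correct * 100

print(f"correct quotient: {correct:.5f}")
print(f"relative error:   {relative_error_percent:.3f} percent")
```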
1995/1996 — The Ping of Death. A lack of sanity checks and error handling in the IP fragmentation reassembly code makes it possible to crash a wide variety of operating systems by sending a malformed «ping» packet from anywhere on the internet. Most obviously affected are computers running Windows, which lock up and display the so-called «blue screen of death» when they receive these packets. But the attack also affects many Macintosh and Unix systems.
June 4, 1996 — Ariane 5 Flight 501. Working code for the Ariane 4 rocket is reused in the Ariane 5, but the Ariane 5’s faster engines trigger a bug in an arithmetic routine inside the rocket’s flight computer. The error is in the code that converts a 64-bit floating-point number to a 16-bit signed integer. The faster engines cause the 64-bit numbers to be larger in the Ariane 5 than in the Ariane 4, triggering an overflow condition that results in the flight computer crashing.
First, Flight 501’s backup computer crashes, followed 0.05 seconds later by a crash of the primary computer. As a result of these crashed computers, the rocket’s primary processor overpowers the rocket’s engines and causes the rocket to disintegrate 40 seconds after launch.
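The conversion problem described above, a 64-bit floating-point value that no longer fits into a 16-bit signed integer, can be illustrated with a small sketch. The flight software was not written in Python, and the sample readings and function names here are hypothetical; the point is only that a value can be valid on one vehicle and out of range on another. The saturating variant is one possible way to handle the overflow, not a description of the actual fix.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_checked(value: float) -> int:
    # Raise an error instead of silently producing a wrong number.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return result

def to_int16_saturating(value: float) -> int:
    # Clamp out-of-range values to the nearest representable limit.
    return max(INT16_MIN, min(INT16_MAX, int(value)))

small = 1200.5    # hypothetical reading within the range the older rocket produced
large = 50000.0   # hypothetical larger reading from the faster rocket

ok = to_int16_checked(small)
try:
    to_int16_checked(large)
    overflow_detected = False
except OverflowError:
    # If nothing handles this error, the program stops, much like the
    # crash described above.
    overflow_detected = True

clamped = to_int16_saturating(large)
print(ok, overflow_detected, clamped)
```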
November 2000 — National Cancer Institute, Panama City. In a series of accidents, therapy planning software created by Multidata Systems International, a U.S. firm, miscalculates the proper dosage of radiation for patients undergoing radiation therapy.
Multidata’s software allows a radiation therapist to draw on a computer screen the placement of metal shields called «blocks» designed to protect healthy tissue from the radiation. But the software will only allow technicians to use four shielding blocks, and the Panamanian doctors wish to use five.
The doctors discover that they can trick the software by drawing all five blocks as a single large block with a hole in the middle. What the doctors don’t realize is that the Multidata software gives different answers in this configuration depending on how the hole is drawn: draw it in one direction and the correct dose is calculated, draw it in another direction and the software recommends twice the necessary exposure.
At least eight patients die, while another 20 receive overdoses likely to cause significant health problems. The physicians, who were legally required to double-check the computer’s calculations by hand, are indicted for murder.

http://www.wired.com/software/coolapps/news/2005/11/69355?currentPage=all

In Mixed Groups, Women Eat Less, Men Eat More

Female college students in a U.S. study bought an average of 833 calories per meal when they ate with other women but just 721 when they were with men, whereas men bought 952 calories when they were with other males but 1,162 when they were with females, says a team led by Molly Allen-O’Donnell of Indiana University of Pennsylvania. Meal size seems to be a tool for influencing others, the researchers say: In mixed company, women show their femininity by purchasing less, while men assert their masculinity by buying more.
Source: Impact of Group Settings and Gender on Meals Purchased by College Students

http://links.mkt3142.com/ctt?kn=32&ms=MzI4MTQ0OQS2&r=Mzc4ODQwNjIxS0&b=0&j=Mzc3OTgzNjMS1&mt=1&rt=0

What are telltale signs that you’re working at a «sinking ship» company? – Quora

Large Company Edition

  • New opportunities are framed in terms of how they impact the existing legacy businesses, not how they impact the customer and the future.
  • Mediocre employees are not fired since you can’t recruit better ones anyway.
  • You benchmark against your direct (mediocre) peers instead of the disruptive entrants in your market.
  • Your co-workers roll their eyes at Facebook, Twitter, cloud services, iPhones, about how they are for children and your customers will never trust their business to them.
  • You spend more time talking about how you are going to make the quarter than you do about long-term growth.
  • Cross-functional committees are formed to solve problems that would have been solved already if the people responsible for them were any good.
  • Management consultants are brought in to solve problems that would have been solved already if the people responsible for them were any good.
  • All problems can be solved with more budget (not better people or better decisions).
  • Acquired companies disintegrate after they regress to the mean of the rest of the company.
  • The CFO spends 95% of his time looking for places to spend less money and 5% looking for new investment opportunities.
  • The HR department thinks their job is administration, not leadership.
  • All technology decisions go through a centralized IT bottleneck steering committee.
  • IT sends out a memo that says anyone using unauthorized cloud services will be fired.
  • You have a Chief Strategy Officer.
  • You don’t have a recruiting playbook.
  • People argue over offices.
  • When risky, innovative projects are cancelled, the people working on them are laid off, thus getting richly punished for their risk-taking.
  • Spending and hiring decisions are «approved» by an entity outside of your group, even if they are within existing budget.
  • The company shuts down over the holidays just to get vacation off of the books.
  • No one can answer, «why work here?» except to talk about the dental plan.
  • Executives are shuffled around the company to new roles, but outsiders don’t ever seem to be brought in to raise the bar.
  • When an exec quits, their next in line is automatically promoted to their boss’s job, even if they weren’t that great and there would have been better candidates elsewhere.
  • Each employee has a rationale for why, «I’m glad I don’t have to work at Facebook/Twitter/Goldman.»

Small Company Edition

  • You «rehearse» for board meetings.
  • When pressured on the business by employees, CEO always starts with, «I need you to stay focused on…»
  • You have more than one MBA on the team.
  • You have a Chief Strategy Officer.
  • Your CTO just came out of a PhD program.
  • Your CEO sells instead of listens.
  • You have a launch party, and no customers attend.
  • Customers hate the product and vision, so the sales guy is fired.
  • You are not told the terms of the last funding round (5x liquidation preference?)
  • You never hear how much cash you have in the bank or see board meeting notes.
  • You complain about how the customers «just don’t get it» and aren’t «visionary.»
  • Your CEO says revenue is coming in in two weeks, just after he gets a meeting with the buyer, negotiates price, gets it approved, agrees on terms, writes up contracts, negotiates them, signs them, and invoices the customer on net 30 terms.
  • You add features because board members want them.
  • Your CEO calls himself a «visionary» in his bio.
  • The CEO keeps everything secret because, «that is how Apple does it.»
  • The CEO approves all of the design decisions because, «that is how Apple does it.»
  • You are selling a platform.
  • Co-founder agrees to bring in experienced execs but thinks they will report to him.
  • You are selling to schools, hospitals, or non-profits.
  • You are commercializing a technology.
  • Your value proposition is that you help workers break down organizational barriers and work cross-functionally.
  • Your business model assumes you will become one of the 7 websites that the average user visits every day.
  • Your site is going to be ad-supported, and you have 1500 users.
  • CEO avoids eye contact.
  • It gets really quiet.
  • You get free lunch but have no customers.
  • Your free lunch is taken away.
  • You get asked, «how much do you really need to live on?»
  • You get a pay cut. Your co-worker disappears.
  • Your CEO still doesn’t make eye contact.
  • You get laid off and become a creditor to the company because they didn’t reimburse your last 5 expense reports.
  • The company declines to buy your unvested shares back.
  • The liquidation yields 5 Aeron chairs and an espresso machine, and Ashton Kutcher’s stock is senior to yours.

If you don’t have passion for what you’re doing, quit right now! — Madonna

Madonna’s advice

«If you don’t believe in what you’re doing and you don’t have passion for what you’re doing, quit right now because anything that’s worth having and accomplishing is going to be hard. There’s going to be a lot of people trying to bring you down and tell you, ‘You can’t,’ so have faith in yourself. And number two, don’t take anything personally.»

Seven Options for Handling Interruptions in Scrum and Other Agile Methods – Agile Advice – Working With Agile Methods (Scrum, OpenAgile, Lean)

www.agileadvice.com

  1. Follow Scrum Strictly
  2. Allocate a Portion of Time to Interruptions
  3. Visible Negotiation of Change
  4. Separate Team for Interruptions
  5. Extremely Short Cycles
  6. Status Quo / Suffering
  7. Commitment Velocity