Burnout

Intergalactic visitors threaten to bring chaos to our electronic world. Is there anything we can do, asks David Hambling

The travellers hurtle towards Earth at close to the speed of light. Their origin is uncertain: some may have been born in mammoth supernovae far across the universe, travelling for millennia before reaching us. Yet in the time it has taken you to read this, billions of them have smashed into our planet's atmosphere, raining down sub-atomic shrapnel that is passing right through you.

These travellers are cosmic rays, a source of intrigue for astronomers, particle physicists, space scientists… and the car industry.

Car manufacturers are just the latest group to appreciate the importance of these mysterious particles. For decades physicists have known that high-energy cosmic rays and the particles they generate can contain enough energy to damage sensitive electronics. Yet we are packing our gadgets with larger numbers of ever-smaller circuits, and giving them ever greater responsibilities – from driving our cars to controlling drones and other robots. So can we continue to enjoy the benefits of powerful electronics without these other-worldly visitors compromising our safety?

First investigated over a century ago, cosmic rays are mostly protons and helium nuclei that originate far beyond the solar system. For many decades they were a curiosity with no practical significance, but that changed when we entered the space age: by the mid-1970s, satellite makers realised that some faults with their craft were caused by cosmic rays striking electronic components. This is hardly surprising, given that the most powerful cosmic rays have the energy of a tennis ball travelling at about 120 kilometres per hour.

These particles can create havoc in a number of ways.
One of the most common is called a single event upset, in which a cosmic ray ionises atoms in an electronic chip's semiconductor, releasing a burst of electrons that can flip a digital bit from, say, 1 to 0. This is known as a soft error (see diagram) and can be enough to crash a computer. More serious is a hard error or “single event burnout” in which components are damaged or destroyed by a sudden short-circuit created by the burst of electrons.

Not surprisingly, the space industry began designing its craft with these problems in mind. Protection is now standard on all satellites and extra care is taken in aircraft design to ensure critical systems have multiple back-ups to make them less susceptible to this kind of damage.

Down on the ground we are much safer. To a cosmic ray, Earth's atmosphere offers the same protection as a concrete layer 4 metres thick. Any incoming particle striking atoms in the atmosphere produces an avalanche of protons, neutrons or pions that in turn create a shower of millions of fragments – gamma rays, muons, neutrinos, electrons and other particles – which rain down over several square kilometres. Those reaching sea level may be the sixth generation descendants of the original cosmic ray, and back in the 1970s it was assumed that they weren't energetic enough to damage anything.

Flipping bits

However, in 1978, engineers at US chip maker Intel reported problems with some of their products. Following painstaking detective work, they discovered that batches of memory chips had been contaminated with radioactive material during manufacture. The weak radiation was enough to flip the occasional bit and create an error.

This got James Ziegler thinking. An engineer at computer maker IBM, he realised that if low-energy radiation could trigger soft errors then there was a chance that cosmic rays could do the same thing, even at sea level.
In 1979, he published calculations warning that cosmic rays could cause significant numbers of soft errors in the 256 kilobyte memory chips in use at the time (Science, vol 206, p 766). Worse, he suggested that in a 64 megabyte chip, cosmic rays would be expected to cause one soft error every day.

To investigate this, IBM researcher Timothy O'Gorman developed a mobile testing device – basically an oversized computer with an array of 248 chips – and ran it at sea level, at an altitude of 3 kilometres and deep underground in a former mine. The results were unequivocal: soft errors at sea level were 10 times less likely than at 3 kilometres, but they still occurred at a rate of roughly one every 11 hours. For the first time there was evidence that these particles could touch us all.

Over the next two decades, engineers at IBM and other chip makers investigated the problem, yet few results turned up in the open literature. In some cases, companies even asked customers to sign non-disclosure agreements so that news of problems didn't reach competitors. “Industry does not like to publicise problems,” says Clive Dyer of Surrey University in Guildford, UK, who studies the effects of cosmic rays. “What reaches the open literature is only the tip of the iceberg.”

Not that the electronics industry has been idle. It has developed a number of ways to beat the cosmic ray threat. The most common defence against soft errors is error-correcting software, typically involving “check bits” sent along with each packet of data, which are used to confirm that the contents haven't been corrupted. In addition, multiple copies of data are stored in different locations on memory chips. In particularly vulnerable or critical systems, like those on satellites or passenger aircraft, whole hardware systems are built in triplicate and readings such as airspeed are calculated by “voting”. Since the majority vote wins, this should prevent the odd soft error from having an impact.
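
For readers who like to see the idea in code, the two defences described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular chip or aircraft does it: it assumes a single even-parity check bit, which can detect one flipped bit but cannot repair it (error-correcting memory generally relies on stronger codes that can also fix the error), and three redundant sensor channels. The function names and the sample readings are hypothetical.

```python
from collections import Counter


def parity_bit(data: int) -> int:
    """Even-parity check bit: 1 if `data` contains an odd number of 1 bits."""
    return bin(data).count("1") % 2


def is_corrupted(data: int, check: int) -> bool:
    """True if the stored check bit no longer matches the data."""
    return parity_bit(data) != check


def majority_vote(readings):
    """Return the value reported by a strict majority of channels, else None."""
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) // 2 else None


# Hypothetical 8-bit word and the check bit stored alongside it.
word = 0b10110010
check = parity_bit(word)

# Simulate a soft error: a particle strike flips bit 3.
flipped = word ^ (1 << 3)

print(is_corrupted(word, check))     # intact word passes the check
print(is_corrupted(flipped, check))  # single bit flip is detected

# Three redundant airspeed channels (made-up values); one has been upset.
print(majority_vote([250, 250, 314]))
```

Note the limits of the sketch: a second flip in the same word would cancel out the parity change and go unnoticed, and voting only helps when at most one of the three channels is wrong at a time.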
However, these countermeasures are expensive, so they are only added when strictly necessary. And if systems don't perform as planned, the results can be catastrophic.

On 7 October 2008, a Qantas plane en route to Perth plunged more than 300 metres before recovering. Crew and passengers were flung around and dozens suffered broken bones, concussion or other injuries. Faulty readings from an airspeed and attitude sensor had put the plane into a steep dive, a fault subsequently blamed on a “rare type of internal or external error”.

Cosmic whodunnit

No one has been able to rule out a cosmic ray strike as being responsible. Airliners experience around 100 times the amount of cosmic rays we do at ground level, and aircraft are usually fitted with three sets of essential sensors to counter potential problems. Unfortunately, in this case the flight control software couldn't cope when an unexplained fault occurred, and the software put the plane's nose down to increase airspeed.

This kind of incident is difficult to predict. Worse, nobody knows for sure the exact rate of errors caused by cosmic ray strikes. But we are likely to see an increasing number of them. To cram more processing power into computers, the last decade has seen the size of transistors on electronic chips shrink from around 180 nanometres across to less than 20 nanometres. According to a study by US-based computer company Oracle, this comes at a price: cosmic rays create eight times more soft errors in circuits with 40 nanometre transistors than in those with 130 nanometre transistors (see diagram). This trend is mirrored in the drive for more energy-efficient chips that run at lower voltages: circuits running at 0.5 volts have twice the rate of soft errors as those at 0.7 volts.
“The ever-increasing need for faster, smaller electronic devices means that, in general, electronic devices are tending to become more vulnerable to cosmic rays,” says Christopher Frost, a physicist at the Rutherford Appleton lab in Harwell, UK.

Still, many in the electronics industry seem to underestimate the magnitude of the problem. The industry generally relies on computer models, plus limited lab tests, to predict whether a particular chip will be vulnerable, but this method has been called into question. In 2009, a team from the University of Toronto in Canada and Google examined the error rates experienced by memory chips in the company's servers over two and a half years of operation and found that hard and soft errors were up to 100 times more frequent than predicted.

And it's not just companies running data centres that are at risk. According to Glen Schaff, a digital imaging expert at Videogenix in Boston, digital cameras are vulnerable since cosmic rays can destroy pixels in the sensor which then show up as black dots in images. To minimise this, manufacturers ship their products by sea rather than by air, and most cameras contain software that spots dead pixels and then disguises them in the image.

More serious is the impact of cosmic rays on the many millions of traffic and surveillance cameras around the globe, says Schaff. Typically, these cameras fail at least once a year, and high energy particles are a major contributor, he says. This can be expensive to remedy. “One traffic camera could cost $5000 to reboot, because the freeway has to be closed to get access to it,” he says.

A faulty camera is one thing. A faulty car could be fatal. In 2004, drivers in the US started complaining about sudden unintended acceleration in Toyota Camrys with an electronic throttle. The company recalled more than 9 million vehicles after some were involved in fatal accidents.
Investigations suggested there were multiple causes: the accelerator pedal could become stuck on a floor mat, for example. But other incidents couldn't be easily explained. So NASA's Engineering and Safety Center was called in. It concluded that the vehicle's electronics were generally protected from the sort of upsets caused by cosmic rays, but the investigators were unable to pinpoint the exact cause of the problem.

However, last year a US court found against Toyota, after Michael Barr, an expert in embedded software at Barr Group based in Gaithersburg, Maryland, presented results of a detailed investigation. His work identified weaknesses in the vehicle's software that NASA had apparently missed and pointed to a lack of redundancy in the throttle control system so that if a single bit flipped state, from a 1 to a 0, say, it could cause unintended acceleration. No one can prove what might have caused such flips, but as Barr points out: “a cosmic ray is one of the ways that a bit could flip in the Toyota engine computer.”

Safety critical

Certainly the case should be a wake-up call for those designing safety-critical systems. Take graphics processing units, or GPUs. Originally developed for image manipulation, they are increasingly used as part of powerful computers to forecast weather or model epidemics. They are also becoming a key component in machines that use sensors to detect nearby objects and avoid them – such as robots, drones and self-driving cars. GPUs are also part of pedestrian-detection systems now being fitted to many cars.

Yet tests show that GPUs are particularly vulnerable to errors created by high energy particles. In fact, because of the way GPUs are designed, a single neutron can create multiple bit errors – up to 64, according to one study – which could have potentially fatal consequences.

Vehicles propelled by electricity are also susceptible to burnout by cosmic rays.
Research into component failures on German trains, for instance, showed that the rate of unexplained failures matched predictions for the impact of cosmic rays. In addition, an increasing number of systems such as wind turbines and electric vehicles use high-voltage electronic switches called insulated-gate bipolar transistors. Compared to older switches, IGBTs are particularly sensitive to cosmic radiation – in fact, the measured failure rate of IGBTs due to cosmic rays exceeds predictions by at least a factor of 10.

To work out the risk of failure due to cosmic rays for electronic chips and components, more lab tests are needed. That's where a facility called ChipIR comes in. Under construction at the Rutherford Appleton Laboratory, it aims to provide the most realistic testing facility yet by generating a beam of neutrons that closely matches those produced by cosmic rays hitting the atmosphere. The beam can also be shaped, to test single components or bathe entire devices, and it will be able to simulate the effects of years of cosmic rays in a single hour.

When it opens in 2015, ChipIR will be used by researchers and commercial organisations to screen microchips, circuits and devices and should provide a realistic assessment of risks, especially for new electronics. It will also help bring the issue to the fore. “Awareness is very important,” says Frost. “Your first thought when something goes wrong is not necessarily that a cosmic ray has just effectively reprogrammed your device.”

Only when risks are recognised can they be tackled. Schaff, for example, solved a problem with NASA's Chandra space telescope by adding an external circuit to automatically reboot the camera system whenever cosmic rays made it crash. He has developed a similar solution called iPulse for Earth-based electronics which will stop the need to call out an engineer when a system crashes.

Yet even new hardware may not be enough.
Cosmic ray intensity can increase dramatically during periods of strong solar activity. Dyer says that in one storm in 1956, the particle intensity recorded at ground level jumped by a factor of more than 100. Fortunately there were few susceptible devices around at the time. Records suggest even bigger events have occurred in the past.

Dyer says that the best solution would be to monitor the sun and warn when more cosmic rays are en route; complete protection is simply not possible. “It would be extremely difficult to harden against even the 1956 event,” says Dyer. Frost agrees: “We are becoming more dependent on electronics – we should understand the consequences.”

This article appeared in print under the headline “Burnout”

Dark lightning

Cosmic rays aren't the only danger to electronics. Avalanches of high-speed electrons created by thunderclouds can produce bursts of radiation known as terrestrial gamma-ray flashes. For airline passengers, exposure to a TGF amounts to no more radiation than a typical chest X-ray, but the impact on electronics can be far more serious. Studies have shown that the radiation dose from a TGF can pack more of a punch than a cosmic ray when it strikes electronics.

Alessandro Paccagnella at the University of Padua is working with Marco Tavani at Italy's National Institute of Astrophysics to explore how gamma rays interact with solid materials, like those in the fuselage of airliners, to produce potentially dangerous neutrons. When struck by a TGF, aircraft made from carbon composites will generate more neutrons than traditional metal airframes, warns Paccagnella, and these can trigger soft or hard errors (see main story). To find out more, the pair are measuring TGFs using sensors on the ground, in satellites and in aircraft.

What do we know so far? The RHESSI satellite launched in 2002 has recorded 50 flashes a day.
And David Smith, manager of NASA's ADELE project, which studies TGFs, recently discovered that the strongest flashes seen from orbit are some two to four times as powerful as those previously measured.

David Hambling is a freelance writer based in London.