Drift into failure, p.26
Complexity and Drift
The paradox is of course this. Complexity can guarantee resilience. Because they consist of complex webs of relationships, and because a lot of control is distributed rather than centralized, complex systems can adapt to a changing world. They can survive in the world thanks to this ability to adapt. So how can it be that complexity contributes to failure, to accidents? What is the relationship between complexity and drift? Complexity opens up a way for a particular kind of brittleness. The openness of complex systems means unpredictable behavior. Release a jackscrew into a world of competition and scarcity, and bets about maintenance intervals are off. Release a complicated engineered system into a world of cultural nuance, diversity and societal maturation, and original design assumptions get adapted, forgotten, muffled. But that is just the start. Complexity and systems theory give us a language, and some metaphors, to characterize what may happen during the journey into failure, during the trajectory toward an accident.
The path-dependence of complex systems (or of a transformative journey from complicated to complex system) is a great starting point. Drift into failure can never be seen synchronically; systems have to be studied diachronically to have any hope of being able to discern where they might be heading and why. The non-linearity of relationships between components offers opportunities for dampening and modulating risky influences (for example, an increase in lubrication intervals might be accompanied by better endplay measurement devices and checking, which could increase the reliability of that part non-linearly even with less lubrication). But the non-linearity can also turn small events into large ones. The same small event, of missing one lubrication opportunity, would have had small consequences in the 1960s, and huge ones in the late 1990s. Making it increasingly difficult for smugglers to get their drugs across the Caribbean by numerous small steps that improved monitoring and interdiction led to a large event: a wholesale deflection of smuggling routes through West Africa.
In the mechanistic worldview, it is enough to understand the functioning or breaking of parts to explain the behavior of the system as a whole. In complexity and systems thinking, where nothing really functions in an unbroken or strictly linear fashion, it is not. Recall, from the second chapter, the outlines of drift into failure. Here is how they interact with what complexity theory has to say:
❍ Resource scarcity and competition, which lead to a chronic need to balance cost pressures with safety. In a complex system, this means that the thousands of smaller and larger decisions and trade-offs that get made throughout the system each day can generate a joint preference without central coordination, and without apparent local consequences: production and efficiency get served in people's local goal pursuits while safety gets sacrificed – but not visibly so;
❍ Decrementalism, where constant organizational and operational adaptation around goal conflicts and uncertainty produces small, stepwise normalization where each next decrement is only a small deviation from the previously accepted norm, and continued operational success is relied upon as a guarantee of future safety;
❍ Sensitive dependence on initial conditions. Because of the lack of a central designer or any part that knows the entire complex system, conditions can be changed in one of its corners for a very good reason and without any apparent implications: it's simply no big deal. This may, however, generate reverberations through the interconnected webs of relationships; it can get amplified or suppressed as it modulates through the system;
❍ Unruly technology, which introduces and sustains uncertainties about how and when things may fail. Complexity can be a property of the technology-in-context. Even though parts or sub-systems can be modeled exhaustively in isolation (and therefore remain merely complicated), their operation with each other in a dynamic environment generates the unforeseeabilities and uncertainties of complexity;
❍ Contribution of the entire protective structure (the organization itself, but also the regulator, legislation, and other forms of oversight) that is set up and maintained to ensure safety (at least in principle: some regulators would stress that all they do is ensure regulatory compliance). Protective structures themselves can consist of complex webs of players and interactions, and are exposed to an environment that influences them with societal expectations, resource constraints, and goal interactions. This affects how they condone, regulate and help rationalize or even legalize definitions of "acceptable" system performance.
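The arithmetic behind decrementalism can be made concrete with a minimal sketch. It assumes, purely for illustration, that some accepted norm (say, a maintenance interval) starts at 100 units and is stretched by 10 percent at each review. The function names, the starting value, the step size and the number of steps are all hypothetical and are not taken from any actual case. The point is only that a check against the previously accepted norm never sees more than a small step, while the distance from the original design assumption keeps growing.

```python
def drift(original_norm, step_fraction, steps):
    """Return the accepted norm after each small, locally acceptable step."""
    norms = [original_norm]
    for _ in range(steps):
        # Each change is judged only against the previously accepted norm.
        norms.append(norms[-1] * (1 + step_fraction))
    return norms


def local_deviations(norms):
    """Deviation of each step relative to the norm accepted just before it."""
    return [(new - old) / old for old, new in zip(norms, norms[1:])]


# Hypothetical numbers: an interval of 100 units, extended by 10% per review.
norms = drift(original_norm=100.0, step_fraction=0.10, steps=20)
largest_local_step = max(local_deviations(norms))
total_deviation = (norms[-1] - norms[0]) / norms[0]

print(f"largest single step: {largest_local_step:.0%} of the prior norm")
print(f"cumulative change:   {total_deviation:.0%} of the original norm")
```

Under these assumed numbers no single step ever exceeds 10 percent of what was accepted just before it, yet after twenty such steps the norm has grown to several times its original value. A real organization would not adapt at a fixed rate, of course; the sketch only isolates the compounding.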
The concern behind complexity and drifting into failure is how a large number of things and processes interact, and generate organizational trajectories when exposed to different influences. Resource scarcity and goal oppositions form one such pervasive influence. They express themselves in thousands of smaller and larger trade-offs, sacrifices, budgetary decisions – some very obvious, others hardly noticed. The ripple effects of such decisions and trade-offs are sometimes easy to foresee, but often opaque and resistant to anything resembling deterministic prediction. Decrementalism shows up in all kinds of subtle ways as people in the organization adapt, rationalize and normalize their views, assessments and decisions.
The contribution of the protective structure (for example, a safety regulator) to such adaptation and normalization, as well as exposure to its own resource constraints and goal interactions, is another influence on this. Such influences ebb and flow to different parts of the operational organization or even originate there, and are negotiated, dealt with, ignored or integrated. As Heisenberg put it, "The world thus appears as a complex tissue of events, in which connections of different kinds alternate or overlap or combine and thereby determine the texture of the whole."40 Drift can be one property of all of these countless relationships and subtle interactions that emerges at the system level; as one aspect of Heisenberg's visible texture. Let's lift a few concepts out of that texture – emergence, phase shifts and the edge of chaos – to see how they can apply to drift into failure.
What is Emergence?
As always in the tug between Newtonian and holistic systems of thought, the basic tension that animated this development was that between the parts and the whole. What was the relationship between parts and whole? Could the parts explain the whole, or did they fall short in accounting for the behavior of the whole? What could be considered as "parts" in the first place? In twentieth-century science, the holistic perspective, which rejects the idea that the whole can be understood as the sum of parts, has become known as systemic, and the way of reasoning it encouraged was called systems thinking.41 Again, the basic reflex in systems thinking is to go up and out, not down and in. It is an understanding of relationships, not parts, that marks systems thinking.
The question of accidents in complex systems is most certainly a question about the relationship between parts and wholes. The Newtonian answer to the question of accident causation (and, by implication, the question of the relationship between parts and wholes) has always been simple: the parts fully account for the behavior of the whole. Hence our search for the broken part. Hence our satisfaction when we have found the "human error" (by any other name), or the human, that can be held responsible for the accident. In complexity and systems thinking, the relationship between parts and wholes is – you guessed it – a bit more complex. The most common way to describe that relationship is by using the idea of emergence. A whole has emergent properties if what it produces cannot be explained through the properties of the parts that make up the whole. Here is an example:
What are the emergent properties of kitchen salt?
It has a salty taste (no, really?)
It is edible (well, in reasonable quantities)
It forms crystals
So what are the components that make up kitchen salt? You might say: salt crystals. Well, no. That's the whole point. The crystals are an emergent property, as are the taste and their edibility. The "components" that kitchen salt is made up of are Na and Cl, or Sodium (Natrium) and Chlorine. Chlorine is a poisonous gas. Natrium is a violently reactive soft metal (Yes, you eat this. A lot). So kitchen salt displays properties that are completely different from the properties of its (chemical) component parts. And what's more, it doesn't display the properties of its component parts at all. You could argue that it only gets dangerous at quantities like those consumed by Morgan Spurlock in his 2004 documentary "Supersize Me," not because it's a poisonous gas or a violently reactive metal but because it does things with blood pressure and kidney function and such. You remember Morgan puking out of the car window after eating yet another Mackey D's breakfast, the camera gleefully tracking the yellow glop as it spluttered onto the pavement? Okay, that's not what I'm talking about.
We used to say that the whole is more than the sum of its parts. Today, we would say that the whole has emergent properties. But it means the same thing. Emergence comes from the Latin meaning of "bringing to light." The nineteenth century English science philosopher George Henry Lewes was probably the first who made a distinction between resultant and emergent phenomena.42 Resultant phenomena can be predicted on the basis of the constituent parts. Emergent phenomena cannot. The heat of apple sauce is a resultant phenomenon: the faster the constituent molecules are moving around, the hotter the apple sauce. The taste of an apple, however, is emergent. Taste does not reside in each individual cell, it cannot be predicted on the basis of the cells that make up the apple. Wetness does not reside in individual H2O molecules. They are not wet, they don't flow. Wetness is an emergent property, something that can only be observed and experienced at system level. Wetness cannot be reduced to (or found in) the molecular components that make up water.
Theorizing around emergence took off in earnest in the late sixties with the study of slime mold in New York City.43 Why slime mold? Because as a collective, or a whole, it does some amazing stuff that the component parts couldn't dream of pulling off. Slime mold is like ordinary fungi, or mildew. It's like the stuff on the inside of your basement walls. Mushrooms are fungi too, a kind of mold in another form. Not long ago, a Japanese scientist succeeded in getting slime mold to find its way through a maze, and to find food there. One clump of slime mold even stretched itself thin to gain access to two sources of food in different places in the maze simultaneously. In a later experiment, it was possible to have the slime mold stretch itself out in a mimic of the Tokyo subway map.
Sending organisms through mazes in the pursuit of food is something that scientists used to reserve for mice and rats and other rodents. What sets rodents apart from slime mold is the same thing that separates a rule-based system from a complex system: they have a central nervous system – a location for their supposed intelligence, a mechanism for them to learn about and map their world, to build some kind of memory, and to then act in accordance with it (for which rodents have four paws, to walk them through the maze). Slime mold doesn't have a central nervous system. It also doesn't have paws. In fact, slime mold doesn't have much of anything, other than basic biological material that can reproduce and that needs food to survive. Slime mold has no central organizing entity, like a brain, that connects eyes on the one side with paws on the other. Nothing or nobody is in charge, nothing or nobody directs eyes where to look, paws where to walk. There is no specialization of parts that take on the various tasks. Every cell or protoplasm of slime mold is basically the same. One big egalitarian democracy of jelly, or slime.
In other words, slime mold is made up of very simple components, but it does very complex stuff, in interaction with its changing environment. Its emergent behavior would seem intelligent. The slime mold as a collective, amazingly enough, pursues goals. Like getting food. And collectively, it adjusts its behavior on the basis of what it finds in the environment. It does this really well. Slime mold actually walks. Well, in a sense. It moves across the forest floor, for example, in pursuit of food. And it gets even more uncanny: if there's a lot of food, it will actually break up into a flock, so as to capitalize on the various food sources. If there's not a lot of food, slime mold will pull together into one larger clump and stick together to ride out the lean times.
Organized Complexity
Such emergent behavior relies on a kind of distributed, bottom-up intelligence, not on a unified, top-down intelligence. Rather than running one single, smart program (as the rat in the maze supposedly would), there is a whole swarm of simple mini-programs, each running inside the cellular or a-cellular components that make up the mold. These programs really are sets of very simple rules. Anthills get built like this. In fact, much of ant colony life is organized around a few sets of simple rules. To find the shortest route to food, for example, (1) wander around randomly until finding food, (2) grab it and follow your pheromone trail back to the nest. Other ants will follow the same trail out because it is strong and fresh. If they don't find food at the end of it, or the trail begins to evaporate, the ants will go back to the first rule. Anthills, which are very sophisticated emergent structures, get built in the same way, though the rules are slightly different. Even the ant garbage dump and the location of their cemetery (maximal distance from the anthill) emerge from the millions of interactive applications of a few simple rules. Even the biblical book Proverbs from around 950 BCE, containing maxims attributed mainly to Solomon, refers to the ants. They have no king, no central authority, Proverbs says. Yet they are industrious and build great things.
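How a few local rules can add up to a colony-level preference for the shortest route can be sketched in a few lines. What follows is a deliberately simplified stand-in for the rules above, not a model of real ants: instead of wandering over open ground, each ant picks one of two fixed routes to the food, with odds proportional to the strength of the trail on each. The route lengths, the evaporation rate, the colony size, and the assumption that the trail laid per round is inversely proportional to route length (a shorter round trip is completed more often) are all hypothetical choices made for illustration.

```python
import random

# Hypothetical parameters, chosen only for illustration.
ROUTE_LENGTHS = {"short": 2, "long": 6}   # steps from nest to food
EVAPORATION = 0.1                          # fraction of trail lost per round
N_ANTS = 50
N_ROUNDS = 100


def choose_route(trails, rng):
    """One ant's local rule: favor the stronger trail, with no global view."""
    total = sum(trails.values())
    pick = rng.uniform(0, total)
    running = 0.0
    for route, strength in trails.items():
        running += strength
        if pick <= running:
            return route
    return route  # guard against floating-point rounding


def simulate(seed=0):
    rng = random.Random(seed)
    # Equal trails to start: the first ants effectively wander at random.
    trails = {route: 1.0 for route in ROUTE_LENGTHS}
    for _ in range(N_ROUNDS):
        choices = [choose_route(trails, rng) for _ in range(N_ANTS)]
        # Old trails evaporate...
        for route in trails:
            trails[route] *= 1 - EVAPORATION
        # ...and each returning ant reinforces the route it used. A shorter
        # route is completed more often per unit time, modeled here as a
        # deposit inversely proportional to route length (an assumption).
        for route in choices:
            trails[route] += 1 / ROUTE_LENGTHS[route]
    return trails


trails = simulate()
share_short = trails["short"] / sum(trails.values())
print(f"share of pheromone on the short route: {share_short:.2f}")
```

No ant in this sketch compares the two routes or knows which is shorter. The preference for the short route sits in the trail, which is to say in the accumulated interactions, and it fades again through evaporation if the ants stop reinforcing it.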
What emerges has been called "organized complexity." It is complex because there are a large number of components, and, as a result, a dense throng of mini-programs running and interacting, cross-influencing each other. Such complex interaction and constant cross-adaptation is hard to model and understand using traditional deterministic, or Newtonian logic, because so much of it is non-linear and non-deterministic. But what it produces is not disorganized. Rather, it is organized: it creates higher-order behavior (like wandering in pursuit of food, or a huge ant stack that heats and cools and stores and incubates) as an amazing emergent product of the complex interactions between a multitude of simpler entities. Of course, the word "organized" is perhaps a bit problematic in this context, as it suggests some kind of entity that does the organizing or controls it. Which is not the case. The organization is purely emergent, purely bottom-up, the result of local interactions and the cumulative effect of simple rules.
Behavior of the brain has also been seen as emergent. In fact, it is both ironic and inspiring that we no longer think of the mammalian brain, the central nervous system, as a centralized smart program either. Rather, brains are made up of a huge mass of much simpler components with simpler programs (neurons or nerve cells) that, in their action-reaction or even ON–OFF behavior, cannot begin to show the sort of complex phenomena that a collective brain is capable of exhibiting. Indeed, one example of emergence that is often used is consciousness. The cells that make up the brain are not conscious (at least not as far as we know. And, for that matter, how could we ever know?). But their collective interaction produces something that we experience as consciousness. Of course, much of human factors, thanks to its reliance on information processing psychology, still treats the brain as a central executive function: as an entity that produces and directs intelligent behavior. But the components that make up the brain are hardly intelligent. Its intelligence cannot be reduced to its constituent components. Intelligence itself is an emergent property.
Accidents as Emergent Properties
Since ideas about systemic accident models were first published and popularized, system safety has been characterized as an emergent property, something that cannot be predicted on the basis of the components that make up the system.44 Accidents have also been characterized as emergent properties of complex systems.45 They cannot be predicted on the basis of the constituent parts; rather, they are one emergent result of the constituent components doing their normal work.
But, you may object, isn't there a relationship between some components (people, or technical parts) not doing their work, and having an accident? Indeed, if you believe this, it would affirm the idea that accidents are resultant phenomena, not emergent. After all, the fact that an accident happens can be traced back, or reduced, to a component not doing its job. The accidents-as-resultant-phenomena idea is alive and seems intuitive, consistent with common sense. In Alaska 261, for example, the component that didn't do its job was the jackscrew in the tail of the MD-80 airplane. It broke and that's why there was an accident. In Tenerife, one of the components that didn't do his job was the co-pilot, who didn't speak up against a stubborn captain.46 That human component didn't work, and that's why there was an accident. Such characterizations are quite popular and, in many circles, still hard to call into question.
Those who are involved with occupational health and safety issues may want to believe it too. For example, isn't there a relationship between the number of occupational accidents (people burning themselves, falling off stairs, not securing loads, and so on) and having an organizational accident? Isn't it true that having a lot of occupational accidents points to something like a weak safety culture, which ultimately could help produce larger system accidents as well? Not necessarily, because it depends on how you describe the occupational accidents. If accidents are emergent properties, then the accident-proneness of the organization cannot be reduced to the accident-proneness of the people who make up the organization (again, if that is the model you want to use for explaining workplace accidents). In other words, you don't need a large number of accident-prone people in order to suffer an organizational accident. The accident-proneness of individual employees fails to predict or explain system-level accidents. You can suffer an organizational accident in an organization where people themselves have no little accidents or incidents, in which everything looks normal, and everybody is abiding by their rules.
