AI: Intimations of “Emergence”

By Walter Donway

April 10, 2025


 

This is Part I of a two-part essay exploring the claims that artificial intelligence may exhibit signs of emerging consciousness.

 

After an extended tour of duty in neuroscience as the founding editor of Cerebrum: The Dana Forum on Brain Science, a Dana Foundation quarterly journal for laymen and professionals, I find that certain terms resonate with me. One is “emergence,” today’s favored framework for tackling the queen of neuroscience puzzles: how consciousness can exist in an exceptionlessly material universe. Thus, I did a double take when, reading about GPT-4 (Generative Pre-trained Transformer 4), the artificial intelligence behind ChatGPT, I saw a reference to “emerging” (untrained and unintended) phenomena in AI processes. “Emergence” is a big word in understanding natural consciousness.

“Emergence” is today’s favored framework for tackling the queen of neuroscience puzzles: how consciousness can exist in an exceptionlessly material universe.

The label slapped on the consciousness puzzle by Australian cognitive scientist and philosopher David Chalmers has stuck; it is “the hard problem” of consciousness.1 How did awareness and consciousness arise in the physical, material world? Francis Crick (1916–2004), the British molecular biologist (and later neuroscientist) who shared with James Watson and Maurice Wilkins the 1962 Nobel Prize in Physiology or Medicine for co-discovering the structure of DNA, framed the question in The Astonishing Hypothesis: The Scientific Search for the Soul (1994).

Science since the 18th century has been resolutely materialist. Crick embraced the view, shared by virtually every student of the brain in any profession, that consciousness must be a physical phenomenon. Collaborating with Christof Koch, he reinforced and refined the theory that consciousness arises from specific neural processes in the brain. This view, “biological naturalism,” maintains that conscious experience is produced by (and is) the interactions of neurons, synapses, and biochemical processes. Crick drove it home in The Astonishing Hypothesis:

You, your joys and your sorrows, your memories and your ambitions, your sense of personal identity and free will, are in fact no more than the behavior of a vast assembly of nerve cells and their associated molecules. (Emphasis added.)2

This reductionist view—that consciousness, self-awareness, and thought can ultimately be explained entirely in terms of neuroscience and biochemistry3—is shared by most of those in the professional study of consciousness, a field of philosopher/neuroscientists that includes Chalmers and Crick as well as near-legends such as Antonio Damasio, Michael Gazzaniga, and Joseph LeDoux. They are among a generation of scientists who know far more about the brain and central nervous system, thanks in part to astonishing imaging technology, than earlier students of the brain could have dreamed of knowing. Yet they have chosen to work as well in the great tradition of philosophy where all previous thinking about the brain, the body-and-brain, and consciousness itself has taken place. (Damasio’s famous bestseller was Descartes’ Error.)

The roots in philosophy are deep. Democritus (c. 460–370 BC), the ancient Greek atomist, theorized that everything, including the soul, was made of tiny, indivisible particles.4 Soon after, Aristotle (384–322 BC) identified the crucial distinction between the body (matter) and soul (form) in his concept of hylomorphism: The mind and body are inseparable (contemporary brain scientists would concur) but fundamentally different in their nature (only a minority phalanx of today’s brain scientists would concur…).

 

The “Hard Problem” of Neuroscience

David Chalmers’ “hard problem,” how physical processes could, even in principle, give rise to subjective experience, was in essence first posed by René Descartes (1596–1650), whose answer was dualism: mind and body are made of distinct substances. Almost at once, dualism was challenged by Baruch Spinoza (1632–1677), who argued instead for monism: mind and body are two aspects of a single, unified reality. By the 18th and 19th centuries, philosophers and students of the brain were wedded to materialism and sought an explanation of consciousness within the constraints of the purely physical. Few today in the brain sciences and philosophy (excluding theology) would claim that the fundamental question has been resolved: How does the objective, physical world give rise to the subjective realm of experience?

This might be the place to pause to acknowledge that there is a contemporary alternative to the materialist/physicalist premise of brain science, one not deriving from spiritualism or theology. There is a minority camp of neuroscientists and philosophers who find the view that the brain and nervous system are the mind (from a “first-person perspective”) self-evidently false. Yes, they agree that consciousness depends on the brain, but because it is so obviously radically different from physical processes, it cannot be identical to the brain. Some suggest it may even have an independent existence in some form. In other words, while eschewing supernaturalism, they also reject strict materialist reductionism.

Among these thinkers is David Chalmers who, rejecting that the brain explains consciousness, has investigated panpsychism (consciousness as a fundamental feature of reality, not just something that emerges from complexity). Thomas Nagel’s famous paper “What Is It Like to Be a Bat?” argues that subjective experience cannot be reduced to any physical description. Nobel laureate Roger Penrose and Stuart Hameroff find consciousness arising from quantum processes inside microtubules in brain cells. And the dean of today’s philosophers of the brain, John Searle, agrees that consciousness is a biological process, but holds that it is not the same thing as brain activity: no machine manipulating symbols understands anything; it is merely processing inputs according to rules. And Crick’s collaborator, Christof Koch, has migrated from strict materialism to a kind of panpsychism: consciousness emerges from complexity, but perhaps not just from the brain. Perhaps it is like mass or charge in physics, built into reality. These views, if correct, logically exclude the possibility that AI at any level of complexity or sophistication can be conscious. The special sauce may be biology, life itself, quantum processes, or something else.

When I read (and not infrequently published) articles deriving from the investigation of consciousness, they invariably mentioned “dual reality” (called “mysticism” or “supernaturalism”) only as a foil for “real science.” Science, however, with its focus on a single reality, concerned itself with the concept of emergence—how complex systems generate (give rise to) properties and behaviors that are not present in their individual parts. For example, the separate components of an automobile engine—pistons, cylinders, fuel injectors—have no intrinsic ability to produce movement. But when assembled and supplied with fuel, the engine generates force and motion, an ability that does not exist in any of the individual parts.

Similarly—if I may speak in sweeping terms—consciousness is hypothesized to be an emergent property of brain activity. Individual neurons are just electrochemical switches, and brain chemicals are just…chemicals—but as a complex network with its interactions, they give rise to perception, memory, and self-awareness.

But how “complex”? If emergence depends on complexity, what does that mean in the context of the brain, often described as the most complex structure in the known universe? Consider a few key parameters and characteristics:

If emergence depends on complexity, what does that mean in the context of the brain, often described as the most complex structure in the known universe?

The human brain comprises some 86 billion neurons (nerve cells), each with as many as 10,000 synaptic connections with other neurons. The result is trillions of connections, an intricate web of electrical and chemical signaling. A synapse is the junction where neurons connect and communicate; a given neuron may have anywhere from a few to, in extreme cases, hundreds of thousands of synaptic connections with itself, with neighboring neurons, or with neurons in other regions of the brain.

First, neurons are not randomly connected but are in a hierarchical structure of layers, circuits, and networks. The prefrontal cortex, responsible for reasoning and decision-making, has distinct layers and long-range connections that integrate information across the brain.

Second, machines, once built, are static. Not the brain. It continuously adapts and rewires itself in response to its “experience.” Synaptic pruning, learning, and memory formation all contribute to a brain that reorganizes its structure dynamically.

Third, brain complexity has another dimension: the oscillatory activity we call brain waves. Different regions of the brain synchronize through sharing specific frequency bands, enabling coherent thought, attention, and consciousness.

And fourth, consciousness in the brain apparently has no “center.” No headquarters. Instead, cognition emerges from the interaction of multiple specialized regions that integrate sensory input, emotions, memory, and reasoning. For example, the prefrontal cortex is the brain’s “executive,” making decisions, guiding conscious actions, but it is not the locus of memory, sensory processing, or regulation of bodily functions.

This multidimensional complexity—quantitative and in terms of varied processes—offers a lot to “play around with” in hypothesizing how consciousness may emerge. As we shall see, the same or analogous concepts, parameters, structures, and behaviors are now being brought to bear by researchers trying to understand unexpected emergent abilities observed in large language models. LLMs are advanced AI systems trained on vast amounts of text data to “understand” and generate human-like language. LLMs use deep-learning techniques, particularly transformer-based architectures, to recognize patterns, predict words, and respond to queries with contextually relevant answers. They can analyze language, summarize information and, as we explore here, exhibit surprising emergent capabilities. (“Transformer-based architectures,” in turn, are “mechanisms” that process and generate text by analyzing relationships between words in a sentence, regardless of their position. Whereas older AI models processed text sequentially, transformers use what is called “self-attention,” which enables them to consider multiple words at once, capture context more effectively, and generate the highly coherent and contextually relevant responses we get from ChatGPT.)
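For readers who want to see what “self-attention” amounts to, here is a minimal, illustrative sketch in Python (using NumPy). It is a toy with invented dimensions and random stand-in weight matrices, not the implementation inside GPT-4 or any production system; the point is only that each word’s output is computed by weighing every other word in the sequence at the same time.

    # A toy sketch of scaled dot-product self-attention.
    # Illustrative only: dimensions and weights are invented stand-ins,
    # not the machinery of any real model.
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        """X is a (sequence_length, d_model) matrix of word vectors;
        Wq, Wk, Wv are learned projection matrices (random here)."""
        Q = X @ Wq                                  # what each word is looking for
        K = X @ Wk                                  # what each word offers
        V = X @ Wv                                  # the content each word carries
        scores = Q @ K.T / np.sqrt(K.shape[-1])     # every word attends to every word
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
        return weights @ V                          # each output blends the whole sequence

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))                     # five "words," each an 8-number vector
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)      # (5, 8): one context-aware vector per word

In a real transformer, many such attention “heads” run in parallel and are stacked in dozens of layers; that stacking is one source of the scale discussed below.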

 

What Would Be “Emergent” Abilities?

Emergent abilities in LLMs would be abilities or behaviors that had not knowingly (explicitly) been programmed into the system, but that appear, apparently spontaneously, when the model reaches a certain level of complexity. The analogy with the emergence of consciousness is straightforward. New AI abilities, like awareness in humans, are hypothesized to arise when the interactions within the system reach a threshold of complexity or sophistication. Notable examples:

Mathematical reasoning has unexpectedly developed in LLMs trained primarily on language and not expected to develop “skills in logic and arithmetic.” GPT-4 demonstrated that it could solve complex mathematical problems or even generate novel proofs, although it was never intentionally trained for advanced theorem proving. And it seems that advanced models can learn on their own to use software tools, running Python scripts within a text-based interface to solve problems more efficiently.5

Multimodal reasoning, a surprising ability to analyze images or audio, has been observed in LLMs trained only on text. AI models have generated original poetry in the style of Shakespeare or modernist free verse, producing metaphorical and symbolic language beyond what was explicitly included in their training data. They sort of “created it.” In the category “commonsense reasoning,” some models have shown an ability to infer everyday causal relationships—recognizing, even when not explicitly taught this fact, that an ice cube left outside in the sun will melt. And how about thinking by analogy? AI has shown it can apply concepts from one domain to another, using knowledge of fluid dynamics to draw parallels with traffic flow patterns.

“Theory of mind,” a reliable marker of human intelligence, has seemed to manifest itself in LLMs, too, with an apparent ability to predict human intentions and emotions in ways they never were deliberately (explicitly) trained to do. With a test commonly used with infants or animals, AI models are given a scenario where a person hides an object and another person enters the room unaware of the hiding spot. Some AI models seem to infer that the second person will not know where the object is. It is hypothesized that this might show a primitive form of “understanding false beliefs”—a classic aspect of theory of mind.6

If complexity is the key to emergence, how do we measure complexity in AI models? Through a combination of scale, architecture, and training.

Translation capabilities seem to be learned, too: LLMs trained on multiple languages go on to learn, on their own, to translate between languages they have never explicitly encountered together. Say a model has been trained on English-to-French and English-to-Japanese translations but never on French-to-Japanese; some can still generate reasonable translations between the two by using English as an implicit bridge, as the sketch below illustrates. Who taught it, or told it, to do that?
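To make the “implicit bridge” idea concrete, here is a deliberately crude illustration. Real LLMs do not translate by dictionary lookup, and the word lists below are my own invention; the sketch only shows what it means, logically, to get from French to Japanese by composing two mappings that each pass through English.

    # Toy illustration only: pivoting French -> Japanese through English.
    # A real LLM learns statistical representations, not lookup tables.
    en_to_fr = {"water": "eau", "bread": "pain", "cat": "chat"}
    en_to_ja = {"water": "mizu", "bread": "pan", "cat": "neko"}

    # Invert English-to-French to get French -> English...
    fr_to_en = {fr: en for en, fr in en_to_fr.items()}

    # ...then compose with English -> Japanese: a French-Japanese pairing
    # that was never provided directly.
    fr_to_ja = {fr: en_to_ja[en] for fr, en in fr_to_en.items()}

    print(fr_to_ja["eau"])    # mizu
    print(fr_to_ja["chat"])   # neko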

If complexity is the key to emergence, how do we measure complexity in AI models? Large language models achieve complexity through a combination of scale, architecture, and training:

Just as neurons are the “numbers game” in the brain, AI complexity is often measured in terms of parameters, the number of adjustable weights in a neural network. GPT-3 has 175 billion parameters, while GPT-4 is estimated to have trillions. These parameters store patterns of relationships between words, concepts, and contexts.

Just as brains have organic architecture, modern AI models are built on transformer architectures, which enable them to identify the importance of different words and phrases in a sentence—one key to deeper meaning. In newer AI models, transformers process entire sequences at the same instant, enabling them to recognize long-range dependencies in language.
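To make concrete what “parameters” means in the paragraphs above, here is a back-of-the-envelope sketch. The layer widths are invented for illustration; the point is only that every connection weight and every bias is one adjustable number, so the count grows roughly with the product of layer sizes.

    # Illustrative only: counting the adjustable weights and biases in a
    # small, fully connected network with hypothetical layer widths.
    layer_widths = [512, 2048, 2048, 512]    # invented sizes, not any real model

    total_parameters = 0
    for n_in, n_out in zip(layer_widths, layer_widths[1:]):
        weights = n_in * n_out    # one weight per connection between the two layers
        biases = n_out            # one bias per unit in the receiving layer
        total_parameters += weights + biases

    print(f"{total_parameters:,} parameters")   # about 6.3 million for this toy network
    # GPT-3's 175 billion parameters are roughly 28,000 times this toy's count.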

 

AI Getting Organized

Training: Just as brains self-organize based on experience and have the plasticity to do so, AI trains itself on datasets of a magnitude that we might expect in brains if they became omniscient.

Training data volume: LLMs are trained on massive datasets of books, articles, websites, and dialogues, potentially hundreds of terabytes (a terabyte is a trillion bytes) of text. The sheer amount of diverse input helps the model generalize across topics.

Training time and compute power: Training an advanced LLM takes weeks to months on thousands of GPUs (Graphics Processing Units, specialized electronic circuits initially designed for digital image processing) or TPUs (Tensor Processing Units, designed to scale cost-efficiently across a wide range of AI workloads, spanning training, fine-tuning, and inference) running in parallel. The energy required to train a model like GPT-4 is estimated in megawatt-hours, sometimes compared to the annual energy use of a small town, and price competition for getting the same training done is keen.7

LLMs are not directly programmed for specific tasks, just as brains aren’t born to be doctors. Instead, LLMs learn by predicting missing words in billions of sentences, as sketched below. As models scale up, they exhibit unexpected capabilities, reflecting what are often called “scaling laws.”
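As a minimal illustration of that training signal, here is a toy next-word predictor built from nothing but word-pair counts. The tiny corpus and the counting “model” are stand-ins of my own invention; a real LLM adjusts billions of parameters against billions of sentences, but the explicit objective is the same kind of thing: guess what comes next.

    # Toy illustration of next-word prediction, the only task an LLM is
    # explicitly trained on. A count-based stand-in, not a real model.
    from collections import Counter, defaultdict

    corpus = "the brain is complex . the model is complex . the brain is not a model".split()

    # "Train" by counting which word follows which.
    following = defaultdict(Counter)
    for current_word, next_word in zip(corpus, corpus[1:]):
        following[current_word][next_word] += 1

    def predict_next(word):
        """Return the continuation most often seen during training."""
        counts = following[word]
        return counts.most_common(1)[0][0] if counts else None

    print(predict_next("the"))   # 'brain' -- the most frequent continuation
    print(predict_next("is"))    # 'complex'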

Researchers have fielded several hypotheses to try to explain how emergent abilities arise:

Self-awareness is thought to emerge in the brain when it attains a certain level of neural complexity, so perhaps AI models, when sufficiently scaled up, begin to organize themselves into more sophisticated structures. What if, at the scale of billions or trillions of parameters, some hidden layers begin to represent abstract concepts in ways that designers never intended and never explicitly designed?

The training of LLMs is characterized by forcing the model to compress vast amounts of information into its internal structures. What if, in doing this, unexpected abilities emerge when the model begins generalizing across tasks it had not seen before?

Learning by neural networks does not proceed in linear fashion. Rather, they have hidden layers that capture complex, multi-dimensional relationships among words, ideas, and contexts. At certain scales, then, these representations seem to start forming more abstract (and cognitive-like) behaviors.

Some investigators propose the possibility that LLMs learn how to learn—a process parallel to “metacognition” in humans. And that may explain why AI models sometimes are able to refine their reasoning strategies even with no direct programming for that.

How far have the AIs ridden off the reservation? No one not writing science fiction is suggesting we are close to seeing non-biological consciousness. But a careful study of the parallels between AI and the emergence of intelligence and self-awareness in biological systems seems to be our best shot at understanding AI’s unexpected capabilities.

 

Notes

  1. Chalmers, David. “Facing Up to the Problem of Consciousness.” Journal of Consciousness Studies, 2(3), 1995, pp. 200–219.
  2. Crick, Francis. The Astonishing Hypothesis: The Scientific Search for the Soul. Charles Scribner’s Sons, 1994.
  3. Koch, Christof, and Francis Crick. “The Problem of Consciousness.” Scientific American, 267(3), 1992, pp. 152–159.
  4. Taylor, C.C.W. The Atomists: Leucippus and Democritus. University of Toronto Press, 1999.
  5. Bubeck, Sébastien et al. “Sparks of Artificial General Intelligence: Early Experiments with GPT-4.” arXiv preprint arXiv:2303.12712, 2023.
  6. Kosinski, Michal. “Theory of Mind May Have Spontaneously Emerged in Large Language Models.” arXiv preprint arXiv:2302.02083, 2023.
  7. Patterson, David et al. “Carbon Emissions and Large Neural Network Training.” arXiv preprint arXiv:2104.10350, 2021.

 

 
