Body plan.
The intricacies of living bodies and their development have fascinated biologists, physicists, and thinkers alike. For biology, early studies in embryology provided evidence of evolution by common descent and revealed “missing links” among species. Before embryos were observed directly, it was inconceivable that developing humans could have gills and tails, that our hands and feet first develop as fins, and that most of us have hidden teeth.
For naturalists in eighteenth- and nineteenth-century Europe, such as Goethe, the order of our bodies was a product of biological archetypes, the essential properties of living things that distinguished us from inorganic matter. Therefore, development and embryology provided the mechanism for generating this order. Early evolutionary biologists were confronted with the need to explain the amazing diversity of living things. As our ability to measure physical features increased, evolutionary developmental biology had to explain increasingly complex features of living bodies, involving tens of thousands of genes, billions of molecules, and trillions of cells.
Beginning with the pioneering studies of Nüsslein-Volhard and her colleagues, evolutionary developmental biologists have now described many genes that initiate and control the development of specific body parts. Many dozens of master regulators are now known to exist. They specify the development of the brain, eyes, ears, heart, lungs, liver, pancreas, kidneys, intestine, muscles, bones, blood, and skin. Some are exclusively expressed in distinct tissues. Others are expressed in several different ones, like blood and brain, but not in others. They often regulate each other’s expression, including feed-forward and feedback loops that can exhibit bistability, amplification, robustness, and dynamic switching. These are indeed key properties of living things.
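As a minimal sketch of how mutual regulation can yield bistability, consider two hypothetical master regulators, X and Y, that repress each other's expression. The Hill-type equations, the parameter values, and the function name simulate_toggle are assumptions chosen for illustration; they are not taken from any specific circuit described here.

```python
def simulate_toggle(x0, y0, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Integrate a two-gene mutual-repression circuit with forward Euler.

    Assumed model (illustrative only):
        dx/dt = alpha / (1 + y**n) - x
        dy/dt = alpha / (1 + x**n) - y
    where x and y are the levels of two hypothetical regulators.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = alpha / (1.0 + y ** n) - x   # X is produced unless Y represses it
        dy = alpha / (1.0 + x ** n) - y   # Y is produced unless X represses it
        x += dt * dx
        y += dt * dy
    return x, y


if __name__ == "__main__":
    # Two starting points that differ only in which factor has a slight lead.
    print("X slightly ahead:", simulate_toggle(2.0, 1.0))
    print("Y slightly ahead:", simulate_toggle(1.0, 2.0))
```

Starting from a slight excess of either factor, the circuit settles into one of two stable states. This is one simple way in which a transient difference could be converted into a lasting cell identity.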
In addition to the control of development by the combinatorial expression of master transcription factors, tissues can also be patterned by specific molecular interactions between cells. In the case of cadherins, proteins that control the formation of junctions that bind many of our cells to each other in tissues, their specificity is determined principally by binding thermodynamics. Over one hundred types of cadherins have been documented in our tissues, with distinct binding preferences to each other, as determined by the energetics of their intermolecular binding interactions. Combined with tissue-specific transcription factors, the resultant “code” offers a developmental model that can explain both organization and complexity.
The core regulatory circuits formed by such master regulators provide the organizational basis for the development of many of our organs. For example, during early embryogenesis, distinct transcription factors localize to specific enhancers and regulatory DNA elements that sustain pluripotency, the ability of embryonic cells to form different tissue types and organs. As embryonic cells develop into progenitors of distinct somatic lineages, these factors relocalize to new genes with tissue-specific cofactors. These interactions can be cooperative, in which case they stabilize tissue-specific development, or antagonistic, in which case they inhibit lineage transitions, as needed for developmental stability and robustness. Similar arrangements of transcription factors control not only the development of various human cell types, but also early patterning of the embryo, with its primary tissue layers of ectoderm, mesoderm, and endoderm.
The most striking evidence of the developmental power of such an arrangement is the ability to experimentally reverse the development of specific somatic cells. This involves “reprogramming” of differentiated epithelial skin cells, mesenchymal blood cells, or ectodermal brain cells into induced pluripotent stem cells—mimicking the pluripotent state of the very first cells in our embryos—simply from ectopic expression of specific genes encoding master regulator transcription factors. These engineered stem cells have now been differentiated into diverse cell lineages, including those that otherwise are not known to interconvert in developed adult organs.
The extensive organization potential of such regulatory circuits was recognized almost seventy years ago by Turing. His simple reaction-diffusion model explained how organized patterns can arise spontaneously from an otherwise homogeneous, disordered starting state. Indeed, the development of our arms and legs can be neatly explained by a Turing-like model of tissue-specific growth factors. Here, spatial distribution of specific morphogenic determinants can lead to patterning and order formation. This was originally observed in Gurdon’s pioneering experiments, in which gradients of activin protein induced the development of blood, muscle, brain, and heart cells, depending on the specific morphogen concentration.
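The core of Turing's argument can be sketched with a linear stability calculation: a two-component system that is stable when well mixed can become unstable to spatial perturbations once the inhibitor diffuses much faster than the activator. The Jacobian entries and diffusion coefficients below are hypothetical numbers chosen for illustration, and the function names are our own.

```python
import math


def growth_rate(k, jac, d_u, d_v):
    """Largest real part of the eigenvalues of (J - k^2 * D).

    jac is the 2x2 Jacobian of the local reactions at the uniform state,
    d_u and d_v are the diffusion coefficients of activator and inhibitor.
    """
    (a, b), (c, d) = jac
    q = k * k
    a_k = a - d_u * q
    d_k = d - d_v * q
    tr = a_k + d_k
    det = a_k * d_k - b * c
    disc = tr * tr - 4.0 * det
    if disc >= 0.0:
        return 0.5 * (tr + math.sqrt(disc))
    return 0.5 * tr  # complex-conjugate pair: report the real part


def unstable_band(jac, d_u, d_v, k_max=3.0, n=3000):
    """Wavenumbers in [0, k_max] at which small perturbations grow."""
    ks = [k_max * i / n for i in range(n + 1)]
    return [k for k in ks if growth_rate(k, jac, d_u, d_v) > 0.0]


# Hypothetical activator-inhibitor kinetics (assumed values).
JAC = ((1.0, -1.0), (2.0, -1.5))

if __name__ == "__main__":
    band = unstable_band(JAC, d_u=1.0, d_v=20.0)
    print("uniform state growth rate:", growth_rate(0.0, JAC, 1.0, 20.0))
    print("unstable wavenumbers from", min(band), "to", max(band))
    print("equal diffusion, unstable modes:", len(unstable_band(JAC, 1.0, 1.0)))
```

In this sketch, the uniform state decays back to equilibrium on its own, yet a finite band of wavelengths grows when the two diffusion rates differ. With equal diffusion rates, no pattern-forming mode exists.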
While this experiment was done using frog embryos, it is intuitive to imagine how the coupling of similar processes with the combinatorial action of master regulators can have rich organizational potential in the development of our bodies. With sufficient knowledge of the number and properties of specific regulatory molecules, one can even write a series of coupled differential equations to generate diverse developmental dynamics of various tissues, as controlled by the tissue-specific expression of combinations of distinct master regulator genes. In turn, interactions of tissue-specific transcription factors with factors that regulate DNA and histone modifications can provide a physical basis for epigenetic memory as cells divide to grow tissues that maintain their developmental identities. This seems like a satisfying solution to our body plan.
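To make the idea of coupled differential equations concrete, here is a minimal sketch in which a morphogen gradient is read out by two hypothetical genes, A and B. Gene A requires a high morphogen concentration, gene B requires only a low one, and A represses B. The exponential gradient, the Hill functions, the thresholds, and all names are assumptions for illustration, not measured properties of activin or of any tissue.

```python
import math


def hill(x, k, n=4):
    """Sigmoidal activation: close to 0 below threshold k, close to 1 above."""
    return x ** n / (k ** n + x ** n)


def morphogen(pos, decay_length=0.3):
    """Assumed exponential gradient, with the source at pos = 0."""
    return math.exp(-pos / decay_length)


def steady_state(pos, k_a=0.5, k_b=0.1, k_rep=0.5, dt=0.01, steps=2000):
    """Integrate the coupled equations for genes A and B at one position.

        dA/dt = hill(M, k_a) - A
        dB/dt = hill(M, k_b) * (1 - hill(A, k_rep)) - B
    """
    m = morphogen(pos)
    a = b = 0.0
    for _ in range(steps):
        da = hill(m, k_a) - a
        db = hill(m, k_b) * (1.0 - hill(a, k_rep)) - b
        a += dt * da
        b += dt * db
    return a, b


def fate(pos):
    """Label a position by whichever hypothetical gene is dominant."""
    a, b = steady_state(pos)
    if a > 0.5:
        return "A"
    if b > 0.5:
        return "B"
    return "none"


if __name__ == "__main__":
    for pos in (0.0, 0.45, 1.0):
        print(pos, fate(pos))
```

A single smooth gradient is thus converted into discrete domains of gene expression. Adding more genes and cross-regulatory terms extends the same scheme to richer patterns.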
While experimental testing of such a model is now possible in model organisms, we also know of at least two features of mammalian body development that indicate that this paradigm will be insufficient to explain key properties of human development. First, by virtue of their deterministic nature, master regulatory circuits and their spatial organization in cells and tissues are inherently limited in the diversity of their biologic activities. Certain organs with fixed functions (perhaps skin, which largely serves a single function as a barrier) may be developed completely by master regulators. However, limitations of deterministic development are not compatible with the function of many other human organs, whose biologic functions depend on expansive diversity in structure or behavior.
Our immune system provides a clear example. This organ functions to recognize and clear invasive pathogens and to contribute to tissue homeostasis, yet the deterministic aspects of its development are inherently limited to the recognition of a finite number of molecules and cells. For example, innate immune cells use distinct receptors to bind various molecular components of infectious pathogens as a part of their recognition and consequent clearance. Toll-like receptors bind to specific forms of nucleic acids that are largely absent in human cells, but abundant in various viruses. Similarly, NOD-like receptors bind peptidoglycans specifically synthesized by bacteria. While multiple receptor systems can evolve to bind diverse molecular species, they remain small in number relative to the number of potential pathogens. Indeed, the human genome contains numerous paralogues of genes encoding such receptors. These genes themselves are rapidly evolving, as evidenced by their substantial divergence from their homologues in other mammals.
Host immune genes co-evolve with their pathogens, and recent evidence supports the existence of multiple concurrent evolutionary “arms races.” This dynamic process may provide some function and fitness. Nevertheless, the functional potential of these and other deterministic immune sensors is finite, while the number of potential pathogens is much less so. With unrelenting evolutionary pressure, the vast difference between the number of human individuals and their innate immune receptors (now a few billion) and that of their pathogens (many billion trillion) is effectively the difference between finite and infinite biology.
To deal with this, vertebrate animals have developed another immune system, based on the stochastic diversification of receptor molecules through somatic genetic reorganization, coupled with functional selection. The resultant diversity in the structure and function of immune receptors is astronomical in scale. The underlying process involves somatic DNA rearrangements of genes encoding immunoglobulin receptors in B- and T-lymphocytes. In B-lymphocytes, recombined immunoglobulin receptor genes encode secreted antibodies that bind to circulating pathogens. In T-lymphocytes, a different set of somatically recombined genes encodes T-cell receptors that bind to proteolytic products of intracellular pathogens presented by specialized receptors on the surface of somatic cells.
Most of the somatic rearrangements of the immunoglobulin receptor genes are non-productive, in that they fail to produce functional immunoglobulin receptors that signal at the lymphocyte cell surface. Developing lymphocytes with non-functional immune receptors die by programmed cell death, which otherwise must be inhibited by a functional receptor signal.
Recent studies indicate that each of us generates as many as a quintillion (one million trillion) distinct immunoglobulin receptors, each able to bind to a specific molecular pathogen. Since somatic diversification is essentially random, the spectrum of immune receptors and their molecular recognition activities is limited only by the composition of the human genome and the number of rearrangements and immune cells generated during development. Fundamentally, immune tissues develop by stochastic diversification coupled with selection, occurring somatically during development. From the viewpoint of statistical mechanics, somatic DNA rearrangements are analogous to energetic fluctuations, as they diversify the ensemble of molecular states to create substrates for functional selection. This enables specific recognition of pathogens unknown a priori, a kind of receptor universe within our bodies to sense the microbial expanse beyond.
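The logic of stochastic diversification coupled with selection can be sketched as a toy simulation. Each developing cell draws a random receptor sequence and a random reading frame; only cells with an in-frame, and thus signaling, receptor survive. The sequence length, the four-letter alphabet, the one-in-three productive frame rule, and all function names are simplifying assumptions, not a model of actual V(D)J recombination.

```python
import random

ALPHABET = "ACGT"


def rearrange(rng, length=12):
    """One random rearrangement: a junction sequence and a reading frame."""
    seq = "".join(rng.choice(ALPHABET) for _ in range(length))
    frame = rng.randrange(3)  # assumption: only frame 0 is productive
    return seq, frame


def develop_repertoire(n_cells, seed=0):
    """Diversify n_cells at random, then keep only functional receptors."""
    rng = random.Random(seed)
    survivors = []
    for _ in range(n_cells):
        seq, frame = rearrange(rng)
        if frame == 0:
            survivors.append(seq)  # functional receptor signal: cell survives
        # otherwise the cell is eliminated (programmed cell death in the text)
    return survivors


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(repertoire, target):
    """Fewest mismatches between any receptor and a target unknown a priori."""
    return min((mismatches(r, target) for r in repertoire), default=len(target))


if __name__ == "__main__":
    target = "GATTACAGATTA"  # hypothetical pathogen signature
    for n in (100, 10_000, 100_000):
        rep = develop_repertoire(n, seed=1)
        print(n, "cells,", len(rep), "survivors, best match:",
              best_match(rep, target), "mismatches")
```

Because receptors are generated without reference to the target, recognition improves only by enlarging the ensemble from which functional variants are selected. This is the sense in which the repertoire is limited by the number of rearrangements and cells generated.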
The hyperbole of this metaphor is not without basis in fact. For example, recent studies described extraordinarily diversified antibodies that specifically recognize malaria-infected red blood cells in individuals living in Africa, where malarial infection is endemic. In addition to the recombination of the canonical exons encoding the antibody molecule, induced by the known somatic gene rearrangements in human B-lymphocytes, several individuals were also found to have antibodies generated by somatic transposition of human genes from distant chromosomes. For example, some individuals possessed antibodies with insertions of neomorphic protein sequences that originate from the LAIR1 protein, which normally binds collagen and has no known immune functions. The transposed sequences included LAIR1’s coding exons, as well as portions of its non-coding introns, which were spliced in their new immunoglobulin receptor gene locus, producing functional antibody molecules. The resultant neomorphic antibodies bind to a specific malarial protein expressed on the surface of infected red blood cells, leading to opsonization and clearance of the infected cells and the malarial parasites within. Thus, somatic DNA rearrangements leading to the diversification of human immune receptors are even more varied than the already expansive repertoire currently documented in developing human immune cells.
What is the evolutionary origin of such an expansive developmental process? In unexpected ways that only evolution can create, the RAG1/2 recombinase enzyme that is expressed exclusively in developing immune cells and catalyzes rearrangements of specific DNA sequences to diversify our immune receptors is evolutionarily derived from the Transib transposon. Transposons are mobile genetic elements found in all living things. When evolving autonomously in certain species, transposons can be true selfish genes.
In prokaryotes, mobile genetic elements are responsible for genetic exchange among individuals, and their rapid diversification. This horizontal gene transfer by transposable elements is thought to be the dominant mode of evolution in prokaryotes. This biology is precisely why it is so difficult to draw an accurate “tree of life.” Prokaryotic features evolve not only by descent, but also by exchange. Thus, human immunity using somatic immune diversification against invasive pathogens is ultimately based on the co-option of transposable elements, similar to bacterial evolution itself. One cannot help but think of Lynn Margulis’ endosymbiotic theory (where prokaryotes and eukaryotes co-evolve), which neatly demonstrates the creative power of evolutionary biology within the physical constraints of molecular interactions.
Recognition of somatic diversification and selection in immune system development also prompted similar considerations for other human tissues. Spurred by his landmark work on the structure of antibodies, Gerald Edelman proposed that the human brain also develops as a result of somatic diversification. Edelman developed this theory of neuronal selection to explain how perceptual categorization in our minds could occur without assuming that the world is prearranged in an informational fashion or that the brain contains a homunculus. From this perspective, the human brain is organized by cellular populations containing individually variant cell networks, the structure and function of which are selected during development and behavior. The units of selection are thus collections of strongly interconnected neurons that generate specific cellular and organismal behaviors.
Those groups of neurons that produce functional activities and behaviors during brain development are selected for survival, whereas others that fail to form functional connections are eliminated. Multiple neurotrophic factors have now been found to support this process. For example, cerebellar granule neurons survive only by making specific circuits with Purkinje neurons. This is mediated by the secretion of specific proteins that bind their receptors on the surface of granule neurons, leading to their survival. In experimental models lacking secreted growth factors or their receptors, neurons die. One can envision how distinct combinations of specific intercellular interactions, secreted growth factors, and their reciprocal effects on constituent cells, can lead to the organization of functional neuronal circuits upon stochastic sampling and selection of diverse intercellular architectures. Indeed, inappropriate pruning of neuronal connections (synapses) has been proposed to contribute to a variety of neurologic disorders.
While current knowledge of the structural organization of the human brain is far from complete, its development by way of somatic diversification and selection is almost surely true. This is based on the observation that the mammalian brain produces just as many neurons that become functional as those that are eliminated by programmed cell death during development. Of course, this may be a non-adaptive byproduct of our evolutionary history. However, involvement of specific molecular mechanisms causing this developmental cell death argues for an evolutionarily adaptive, functional process.
Extensive studies of mouse brains observed that the survival of developing neurons requires repair of DNA breaks, consistent with the activation of somatic DNA rearrangements. This dependence is reminiscent of the requirement for similar forms of DNA damage repair in developing lymphocytes, where unrepaired somatic DNA rearrangements by RAG1/2 lead to programmed cell death due to the failure to generate functional immunoglobulin receptors. Experimentally induced defects in specific forms of DNA damage repair in mice lead to neurodevelopmental deficits. These deficits closely mimic human neurodevelopmental disorders, such as ataxia telangiectasia due to inherited mutations of the ATM gene, Seckel syndrome due to mutations of ATR, and others.
What biologic functions do somatic DNA rearrangements produce in developing neurons? One idea was originally proposed by Lee Hood and colleagues based on their estimate that the human genome does not have sufficient information encoding capacity to pattern the intricately interconnected human brain. Instead, a stochastic sampling process akin to immune receptor recombination would be needed to diversify cell adhesion molecules to organize the complex cell-cell architecture of the human brain. Subsequent cloning of the clustered protocadherin genes revealed a gene organization that is very similar to the immunoglobulin receptor genes, leading to proposals that our clustered protocadherins are subject to somatic DNA rearrangements. To date, no evidence of somatic rearrangements of the clustered protocadherin genes has been found in developing neurons. However, our clustered protocadherins do exhibit stochastic and mono-allelic expression of specific isoforms in individual developing neurons, and extensive diversity of their splice isoforms, as would be required for a developmental selection process.
In addition to stochastic diversification of adhesion molecules followed by their selection to form functional inter-neuron connections, it is also possible that intracellular molecules that control neuronal excitability can be diversified to produce ensembles of cells. Cell populations with variable excitability properties can then become diversified substrates for selection of populations with specific excitation dynamics required for distinct neuronal circuits. As a result, highly complex architectures with functionally varied dynamics can be organized spontaneously and adaptively.
In this regard, it seems striking that in addition to the immune DNA recombinase RAG1/2, the only other human gene that is conserved among vertebrates and known to have DNA transposition activity in human cells is PGBD5. Remarkably, PGBD5 is exclusively expressed in neuronal cells, consistent with a specific function in our nervous system, such as somatic diversification of genes required for neuronal development.
What other tissues and organs may depend on somatic cell selection? Developmental programmed cell death (apoptosis) offers an important clue. In this somatic selection framework, diversified somatic cells are functionally selected. Those that fail to achieve biologically functional states (based on cell connectivity, tissue organization or function) are physiologically eliminated by programmed cell death. Importantly, developmental cell death is not merely a byproduct of cell proliferation, as some suggested early on. For example, lymphocytes that are rapidly proliferating in germinal centers of lymph nodes undergo extensive cell death because of the failure to recombine immunoglobulin receptor genes into functional signaling molecules. In contrast, hepatocytes that proliferate as part of liver regeneration do not die, even when most of the liver has been removed.
In addition to the mammalian immune and nervous systems, experimentally induced defects in apoptosis also impair the physiologic development of heart and limbs. While our hearts and limbs are recognized for their regular activity (to predictably pump blood and move our bodies), it is not unreasonable to hypothesize that somatic selection also contributes to the development and function of these tissues. After all, the electrical conduction system of the heart and its endocrine integration are paralleled only by the brain in their complexity, and a physiologic mechanism is needed to explain their functional development.
The development of mechanical philosophy and its explanations of the natural world rattled the foundations of human thought during the Enlightenment. Many could not envision that the variety and complexity of living things could be explained by physical principles. Today, we are confronted with the same question about the organization of the human body.
Can the complexity of its organs and diversity of its activities be explained completely by deterministic principles? The specific genes expressed by distinct cells during particular phases of organ development are about to be completely enumerated. Advances in spatial imaging and molecular measurements of single cells in intact tissues are already mind-boggling. Their complete descriptions in human development are inevitable. It is clear that some aspects will be explained by combinatorial dynamics of gene expression and physics of protein interactions and cell adhesion. However, key organs and tissues are organized as a result of somatic selection of diversified states. This includes genetic diversification through somatic DNA rearrangements, and phenotypic diversification through stochastic gene expression and other biochemical reactions. Sampling of the resultant diversified cell ensembles allows for entropically favored body development, as self-organized by diverse and expansive biologic functions.
This means that organ development is inherently variable, constrained only by evolutionary fitness and statistical (energetic) fluctuations. As a result, while developmental processes and tissue organization with stereotypic reproducibility are evolutionarily selected for their fitness, they can also exhibit variability among individuals. This in and of itself may be evolutionarily favored, because such variation may present more diverse substrates for selection, especially when pressured by environmental conditions. While the human population is increasing, its size remains relatively small on evolutionary scales, as evidenced by multiple genetic bottlenecks documented in human history. Thus, diversification of individuals within a species due to the variation in somatic development can provide a substrate for selection and enhanced evolutionary fitness.
For tissues and organs that depend on expansive functions, somatic diversification is also a necessary aspect of their development. The immune system requires stochastic somatic diversification to produce diverse immune receptors to recognize unanticipated and more rapidly evolving pathogens. Similarly, our nervous system requires stochastic somatic diversification to produce diverse neuronal functions to process sensory stimuli and execute actions in variable and expansive environments. At the moment, we can only speculate whether the evolution of this diversifying ability was responsible for the evolution of animal behaviors that ultimately gave rise to human culture. But this idea is not so far-fetched as to be implausible and needs to be explored.
Other organs and tissues that require expansive functions, beyond those that can be encoded deterministically, may similarly use somatic diversification, with equally impressive biologic effects. Modern biophysical theory has all the necessary tools to quantitatively model deterministic and stochastic processes, using algebra and statistics, respectively. Recently, they have been successfully combined to model conformational dynamics of macromolecules. To understand how our bodies come to be, we must define the spectrum of developmental processes and organs that rely on somatic selection versus those that are encoded deterministically.
Development by somatic selection enables more varied and responsive biologic structures and functions. This may increase evolutionary fitness, by promoting morphologic and functional diversification of individuals. This type of development may also be susceptible to errors and environmental insults. In this sense, the understanding of disease in evolutionary physical terms provides a natural experiment to infer a physical theory of living things. However, before we can attempt this, we need to articulate the physics of evolutionary genetics.