Molecular paradox redux.
What separates biological and physical sciences? I can trace my thinking about this to when I sat by the Botany Pond at the University of Chicago in 1997. Originally conceived as a laboratory for botanists more than 100 years ago, and subsequently remodeled as a small park, the Botany Pond was a site of one of my courses in the spring of 1997. Spurred by one of the first warm days after a typically freezing Chicago winter, we sat by the side of the Botany Pond for an evolutionary biology seminar with Leigh Van Valen. He was an evolutionary biologist, paleontologist, ecologist, and philosopher, known for multiple fundamental contributions to evolutionary biology. His office and nearby seminar room were a labyrinth of books, impossible to navigate and impressive to all who visited him, particularly to precocious students who quickly learned that no subject was without interest to this self-described “generalist.” On this day, Van Valen talked of evolution and entropy seamlessly. The connections were new to me, as I had learned biology and physics on opposite sides of campus, but they felt naturally intuitive. The evolutionary physics that I aim to develop here can be traced to that time at the Botany Pond.
Today’s textbooks and teaching materials for university courses are noticeably more compelling, incorporating improved forms of media and more intuitive visualizations, as compared to those I read back then. They introduce students to the components of matter and living organisms. In the case of biology, the principles of genetics, evolution and ecology are explained. Physicists learn about quantum mechanics and thermodynamics. However, the relationship between the physical basis of the natural world and its biological activities is complicated by a vast chasm that separates atoms and light from the human body and its development. While modern biology incorporates extensive quantitative and theoretical concepts, and biophysics extends physical theory to biological macromolecules, potential connections remain confined to their separate fields, to be sought out only after decades of academic study. However, the intuition linking physics and biology must inform all thinking about living things. It is my expectation that human biology and its irreducible complexity in particular are in dire need of this new reassessment. This must begin with the link between molecular activity and biological function.
The fundamental basis for living things concerns the chemical transformations of matter that are required for biological life. The initial inquiries into this question concerned the origin of key biological molecules such as amino acids and nucleotides. Their macromolecular polymers comprise key effectors of prokaryotic and eukaryotic cells that mediate their replication and growth. It became clear as early as the Miller-Urey experiment in 1952, and elaborated in studies since then, that chemical matter comprising biological molecules can form from simple and ultimately elemental precursors. It is impossible to determine definitively how this abiogenesis proceeded, though recent studies suggest that the earliest self-replicating biological molecules were related to RNA. Identifying the simplest or most plausible self-replicating molecules is fascinating from the standpoint of scientific inquiry, and irresistible to us as thinking humans. However, knowing this is not required to define the physical principles governing living things and their biological functions.
The most fundamental property of living things is to overcome the inherent propensity of matter towards disorder. This propensity is recognized in the second law of thermodynamics, given its statistical interpretation by Boltzmann. Many have referred to the observation that repeated shuffling of playing cards inevitably eliminates any preceding order as an intuitive example. While this analogy does not capture the propensities of molecules to undergo specific chemical reactions, which themselves are forms of order, all living things must compensate for the entropic cost of maintaining homeostasis and increasing order for growth and development. It is a logical imperative. Therefore, the maintenance of the structural integrity of our bodies, and the transformations of molecules and cells that accompany this homeostasis, not to mention development and growth, require continued consumption, conversion, and dissipation of energy. This consumption and conversion are obviously important for the development, division, and growth of initially single-celled embryos into many differentiated cells and their organization into tissues and organs that ultimately comprise the body. However, the same conversion must accompany steady-state homeostasis, as molecules undergo spontaneous chemical reactions that inevitably inactivate them, e.g., oxidation and hydrolysis under ambient physiologic conditions. Continued replacement through biosynthesis is therefore an absolute requirement for life.
The input of energy into biological systems, be they organs, cells, or molecules, leads to fluctuations in their states. For example, proteins, which are chains of specific amino acids, function as distinct three-dimensional structures, formed by folding of their polymers. These structures are responsible for the enzymatic catalysis of chemical reactions, which would otherwise be too slow to sustain our living cells. Distinct protein shapes are also responsible for the stable and specific intermolecular interactions required for the spatial organization of cells and tissues, and their regulation during development and motion. Proteins cooled toward the lowest possible temperature of -273 °C (-460 °F), where thermal energy is essentially unavailable, remain frozen in a single, usually most stable conformation. But at physiologic body temperatures of +37 °C (+99 °F), where heat can transfer into molecules, molecular structures fluctuate. Indeed, the intrinsic propensity of heat to transfer through molecules is itself a manifestation of the second law of thermodynamics. As a result, proteins exist in multiple configurations, some compact and others more expanded. At or near equilibrium, the prevalence of each configuration is determined by its energetic stability: the more stable a configuration, the more often it is populated.
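The dependence of conformational prevalence on energetic stability can be illustrated with Boltzmann weighting. The following is only a minimal sketch: the function name, the three conformations, and their relative free energies are hypothetical values chosen for illustration, not measurements of any real protein.

```python
import math

# Molar gas constant in kcal/(mol*K); the per-mole form of Boltzmann's constant.
R_KCAL = 0.0019872041


def boltzmann_populations(energies_kcal_mol, temperature_k):
    """Return equilibrium fractions of states with the given relative free energies."""
    beta = 1.0 / (R_KCAL * temperature_k)
    lowest = min(energies_kcal_mol)
    # Shift by the lowest energy so the largest weight is exactly 1.
    weights = [math.exp(-beta * (e - lowest)) for e in energies_kcal_mol]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical relative free energies (kcal/mol) of a compact, an intermediate,
# and an expanded conformation. These numbers are assumptions.
energies = [0.0, 1.0, 2.5]

body_temperature = boltzmann_populations(energies, 310.15)  # +37 °C
near_absolute_zero = boltzmann_populations(energies, 1.0)   # about -272 °C

print("37 °C:", [round(p, 4) for p in body_temperature])
print("1 K:  ", [round(p, 4) for p in near_absolute_zero])
```

Under these assumed energies, the most stable conformation dominates at body temperature while the less stable ones remain measurably populated, whereas near absolute zero essentially all molecules occupy the single most stable state.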
These fluctuations are not mere noise, but rather have essential biological functions. For example, they are required for the binding of substrates by enzymes, whose structures would otherwise be too dense to allow the physical diffusion of substrates into their catalytic sites. Indeed, hemoglobin, which is needed for human life and the transfer of oxygen from the lungs to deep tissues, binds and releases oxygen from deep within its molecular interior through fluctuations in its three-dimensional shape. This allows the diffusion of oxygen through space that would otherwise be occluded by hemoglobin atoms.
Today, the evidence for fluctuations in macromolecular structure is ample and extensive. Initial studies of the atomic structure of biological macromolecules such as proteins and nucleic acids were based on crystallography, advanced throughout the twentieth century. This required the formation of molecular crystals, with each molecule arranged in a regularly spaced lattice. Solving atomic-resolution structures was a technical and conceptual tour de force, recounted by many authors, not the least of whom were the scientists themselves, who were recognized with no fewer than twenty distinct Nobel Prizes. This includes Dorothy Crowfoot Hodgkin, who was recognized for solving the structure of vitamin B12, and should have included Rosalind Franklin for the structure of DNA. While the deduced structures of specific biological macromolecules had unique three-dimensional shapes and atomic positions, all of them showed variations, attributed to imperfections in the arrangements of the crystals. Occasionally, these imperfections were irregularities in the lattice itself, but most of the time, they were caused by variations in the structures of individual molecules in the crystal. These variant structures are direct products of molecular fluctuations. As methods for the study of molecules under more physiologic conditions of aqueous solutions at near body temperature were developed, molecular fluctuations were observed by diverse spectroscopic methods. Recent development of single-particle electron microscopy methods allowed for the visualization of single macromolecules and the determination of their high-resolution structures. Numerous studies now directly show that all macromolecules adopt fluctuating structures, as observed by the structural variations among single molecules.
The fluctuations in the atomic structures of macromolecules are the direct product of thermal energy fluctuations of biological systems living at physiologic temperatures. Today, the dynamics of atoms in macromolecules can be predicted with good accuracy using classical Newtonian equations of particle motion, even when using rather approximate empirical potential energy functions of specific biological molecules. As we develop increasingly accurate potential energy functions that will ultimately reach physiologic precision, it is safe to expect that the dynamics of macromolecules and their complexes (of increasing size, limited primarily by computer speed) will become predictable with effectively perfect accuracy. The need to formalize the relationship between the physical properties of proteins and their biological evolution is emphasized by the fact that the current best-performing algorithm for protein structure prediction, AlphaFold 2, achieved its accuracy by integrating physical and evolutionary models of protein structure.
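To make the idea of predicting atomic motion from Newtonian equations concrete, here is a minimal sketch of the velocity Verlet scheme, a standard integrator in molecular dynamics. It is applied to a single particle in a one-dimensional harmonic potential, a deliberately crude stand-in for an empirical potential energy function. The force constant, mass, and units are arbitrary assumptions, and real simulations involve many thousands of interacting atoms.

```python
import math


def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate Newton's equation m * a = F(x) with the velocity Verlet scheme."""
    trajectory = [x]
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append(x)
    return trajectory, v


# Hypothetical parameters in arbitrary units (assumptions for illustration).
K_SPRING = 1.0
MASS = 1.0


def harmonic_force(x):
    """Force from the potential U(x) = 0.5 * K_SPRING * x**2."""
    return -K_SPRING * x


# Integrate over exactly one analytical oscillation period.
period = 2.0 * math.pi * math.sqrt(MASS / K_SPRING)
steps = 1000
dt = period / steps
trajectory, final_velocity = velocity_verlet(1.0, 0.0, harmonic_force, MASS, dt, steps)

print("position after one period:", trajectory[-1])
print("velocity after one period:", final_velocity)
```

After one period, the particle returns very nearly to its starting position and velocity, which is the behavior expected of an accurate integration of the classical equations of motion.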
It is now evident that molecular fluctuations are biologically important and essential for living things. Molecular structure is also a substrate of natural selection. Many have argued that the variation in the composition of biological macromolecules is essentially neutral with respect to fitness. Kimura noted that the adaptationist notion advanced by Darwinian studies of evolution was intuitive given the focus on visible features of organisms, such as beaks, wings and so on. Genetic studies have now shown that the compositional diversification of genes is occurring at a substantially faster rate than adaptive selection alone would predict, substantiating the neutral theory of molecular evolution. Indeed, at the molecular level, diversity of states and interactions is also high, raising the possibility that many states sampled by molecular or cellular systems are also neutral with respect to biologic function and natural selection.
Of course, this does not mean that the selection of specific variants and their biological function cannot exist. For example, a single amino acid substitution in hemoglobin confers resistance to infection by malaria parasites that are endemic in many parts of Africa, Asia and South America. This variant hemoglobin folds differently from the common form, leading to its propensity to oligomerize into fibers that distort and damage red blood cells. The sickling of red blood cells induced by this hemoglobin variant is inherently deleterious. However, in ecological conditions where the fitness of individuals with the sickle hemoglobin allele is affected by malarial infection, this hemoglobin variant serves a ‘beneficial’ biological function by conferring relative resistance to malaria that can otherwise be lethal or debilitating. This is natural selection operating on the compositional variation in protein sequence, induced by natural genetic variation.
The same selection principle applies to conformational variation in macromolecular structure. We already established that the three-dimensional structures of biological macromolecules fluctuate in response to thermal energy of physiologic temperature. Early studies explained the binding of macromolecules to their substrates or each other as part of biological function in terms of “induced fit,” as if biological molecules have intent. This teleological explanation is intuitive for biologists who historically explain biological phenomena in terms of some inferred purpose they serve. However, years of experimental and theoretical science have yet to find evidence of induced fit in molecular dynamics. Today, molecular recognition in the form of non-covalent binding of small ligands and macromolecules is best explained by statistical mechanics of conformational sampling and selection. “Induced fit” survives only as a metaphor.
In the case of hemoglobin, uptake of oxygen in the lungs and its release in deep tissues is controlled by thermodynamic linkage between hemoglobin conformations with relatively high and low oxygen binding affinity. One linkage involves relative acidification due to the accumulation of carbon dioxide in deep tissues. We now know that this phenomenon, originally described by Bohr, is due to the fluctuations in the conformation of hemoglobin, stabilized by the binding of oxygen to heme and acid protons to hemoglobin side chains like histidine. The affinity of hemoglobin’s heme group for oxygen is lower in the conformation that is stabilized by acidification. Thus, hemoglobin’s conformations fluctuate dynamically and are ‘selected’ by the binding of oxygen in the lungs and its dissociation in deep tissues. It is easy to imagine how compositional variation in hemoglobin sequence in human populations would be subject to natural selection of variants with beneficial oxygen tissue delivery under specific ecological conditions, such as high altitude. This is Darwinian natural selection operating on the variation in genetic diversity of hemoglobin in individuals.
We can postulate that analogous selection operates on the conformational diversity in molecular structure, induced by the physiologic fluctuations in thermal energy, and selected by their biological functionality. These “functionally important motions,” or fims, as named by Frauenfelder, are governed by the general principles of equilibrium thermodynamics, where they fluctuate about their mean values. Alternatively, in systems with energy flux away from equilibrium, conformational dynamics follow fluctuation-dissipation equations, developed by Onsager and formalized by Prigogine and their colleagues. Regardless of the thermodynamic regime, and its mathematical formalism, conformations that are stabilized by their binding to biological cofactors are selected through conformational sampling. This applies equivalently to conformational fluctuations such as those in hemoglobin during physiologic oxygen delivery, to hemoglobin oligomerization into fibers in sickle cell disease, and to large reorganizations of structures of all proteins, nucleic acids, and other biological macromolecules in living cells.
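Conformational selection upon binding can be sketched with a two-state equilibrium model. The molecule fluctuates between a low-affinity and a high-affinity conformation, and a ligand binds preferentially to the latter, shifting the population toward it. This is a deliberately simplified single-site illustration (hemoglobin itself has four binding sites), and the function name and all numerical constants below are hypothetical.

```python
def fraction_high_affinity(ligand, low_to_high_ratio, k_high, k_low):
    """Fraction of molecules in the high-affinity conformation at equilibrium.

    ligand            -- free ligand concentration (arbitrary units)
    low_to_high_ratio -- [low]/[high] population ratio in the absence of ligand
    k_high, k_low     -- dissociation constants of the two conformations
    """
    # Statistical weight of each conformation: unbound form plus ligand-bound form.
    weight_high = 1.0 + ligand / k_high
    weight_low = low_to_high_ratio * (1.0 + ligand / k_low)
    return weight_high / (weight_high + weight_low)


# Hypothetical constants: without ligand, the low-affinity conformation is
# favored 100 to 1, but it binds the ligand 1000-fold more weakly.
LOW_TO_HIGH_RATIO = 100.0
K_HIGH = 1.0
K_LOW = 1000.0

without_ligand = fraction_high_affinity(0.0, LOW_TO_HIGH_RATIO, K_HIGH, K_LOW)
with_excess_ligand = fraction_high_affinity(1.0e6, LOW_TO_HIGH_RATIO, K_HIGH, K_LOW)

print("high-affinity fraction without ligand:", round(without_ligand, 4))
print("high-affinity fraction with excess ligand:", round(with_excess_ligand, 4))
```

In this sketch, the high-affinity conformation is rare but already present before any ligand is added. Binding does not create it. Binding only stabilizes it and thereby increases its share of the ensemble.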
Conformational selection of fims is a general principle of biological molecular function. This principle is a direct consequence of statistical thermodynamics, which can link physical forces with biological functions. Many have suggested that the complexity of biological macromolecules, and proteins in particular, requires special mechanisms. How can biological macromolecules composed of long chains of components spontaneously organize into intricate and specific three-dimensional structures? Protein folding has captured the imaginations of generations of scientists because the complexity of three-dimensional structures of proteins, and the biological functions and evolutionary adaptations that they confer, are often intrinsic to the amino acid sequence. Their folding and organization do not require external input, and can occur spontaneously.
For proteins synthesized by polymerization of amino acids, the scope of this self-organization was appreciated early by Levinthal. Given a reorganization time for a torsion of the polypeptide backbone of about 10 billionths of a second, as limited by its diffusion in water under physiologic conditions, and a relatively small protein of 100 amino acids with at least 2 different conformations per amino acid, a random search of all possible conformations would take more than 100 billion years. Two kinds of models have been proposed to account for the actual efficiency and specificity of biological protein folding. Either the conformational space of unfolded proteins is pre-organized, or the conformational search is not purely diffusive, being organized by folding pathways.
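The arithmetic behind Levinthal's estimate can be checked directly from the numbers given above. The sketch below assumes that conformations are sampled one at a time, each taking one torsional reorganization time, and uses a year of 365.25 days. The variable names are illustrative.

```python
# Values taken from the text.
CONFORMATIONS_PER_RESIDUE = 2   # lower bound per amino acid
RESIDUES = 100                  # a relatively small protein
SECONDS_PER_STEP = 10e-9        # 10 billionths of a second per reorganization

# Assumption: a year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Number of distinct chain conformations, treating residues as independent.
total_conformations = CONFORMATIONS_PER_RESIDUE ** RESIDUES

# Time to visit every conformation once, sampled sequentially.
search_seconds = total_conformations * SECONDS_PER_STEP
search_years = search_seconds / SECONDS_PER_YEAR

print(f"conformations: {total_conformations:.3e}")
print(f"exhaustive search time: {search_years:.3e} years")
```

With these inputs, the exhaustive search comes to roughly 4 × 10^14 years, which comfortably exceeds the bound of 100 billion years even with only two conformations per residue.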
The structures of functional proteins are organized hierarchically into various secondary structure elements such as alpha-helices and beta-sheets. For many human proteins, these elemental structures can fold spontaneously from disordered amino acid chains. This autonomous stability, combined with the observations of their ultrafast formation as compared to the folding of intact proteins, led to the hierarchical model of protein folding. In this conception, folding begins with the formation of elements that are local in structure and marginal in stability, with the hierarchic organization of these modules leading to structures of increasing complexity. In this way, restriction of the random search and its global coupling by way of structural and kinetic hierarchy pre-organize the folding of biologically functional proteins.
Alternatively, absence of stable intermediates that accumulate in the course of folding and coincidence of secondary and tertiary structure elements lend support to the idea of folding pathways that guide the ensemble of diverse protein conformations to the biologically active structures. Extensive experimental and computational evidence now supports the existence of both models. The two mechanisms are not mutually exclusive. Conformational selection is the essential element of the self-organization of proteins and other biologic macromolecules.
The unifying property of macromolecular self-organization is the stabilization and selection of conformations that adopt biologically functional states. Binding of heme stabilizes the conformation of hemoglobin required for biological oxygen binding and release. In the absence of heme, hemoglobin folds incorrectly and too slowly to be functional in living red cells carrying oxygen in our bodies. Functional conformations of individual proteins and RNAs in many multi-component supramolecular complexes are selected and stabilized by the binding of individual subunits. In isolation, these individual subunits are unstable or aggregate. When induced experimentally in living cells, this leads to misfolding and ultimate cell death. Indeed, recent work indicates that proteins evolve on the edge of self-assembly, with natural variation in amino acid composition and its evolutionary selection acting to promote their interactions. Conformational selection of stabilizing interactions applies to supramolecular complexes responsible for DNA replication (DNA polymerase), transcription (RNA polymerase), translation (ribosome), and signaling (receptors among others). Order of molecules arises from conformational selection upon binding, leading directly to biological function.
Our observations of biological molecules are inherently incomplete. Atomic-resolution structures and their detailed dynamics are defined mostly for small single-domain proteins, and relatively few supramolecular assemblies like the DNA and RNA polymerases, ribosomes, and signaling receptors. Gaps in our current knowledge of biological molecules do not undermine the principles of thermodynamics and conformational selection. Similarly, gaps in the fossil record do not limit confidence in Darwin’s conclusions about natural selection. We need to learn more about the interactions of biological macromolecules in their physiologic environments of cells and tissues. And we need to continue to define their physical properties. However, today it is impossible to contemplate that any parts of our bodies would depend on “animalcules” or any other imagined concepts to explain intentional control of living things. Maxwell’s demon and homunculus are equally impossible in human bodies. The former cannot exist in isolation, and the latter cannot be made from biological molecules. This is not to say that microbes and other living things do not have roles in human physiology. The human microbiome is an inherent part of our gut and skin, and viruses and transposable elements are physical parts of human genomes. The central realization is that statistical thermodynamics and conformational selection are fundamental physical principles that explain biological functions of molecules that make up our tissues.
Chemical transformations of matter that are required for biological life convey the energy of electrons from one molecule to another. These transfers of energy are responsible for the ordering of atoms into metabolites, macromolecules, and cells and tissues making up our bodies. The process begins in the single-cell embryo and accompanies the organization of atoms in biological macromolecules, like amino acids in proteins, assemblies of macromolecules in cells, and organization of cells in tissues and organs. Folding of polypeptides into functional structures, such as the folding of hemoglobin into a functional oxygen carrier controlled by solution conditions, is energetically stabilized both by the specific atomic interactions of its evolutionarily selected three-dimensional structure, and by the release of water molecules upon folding, favored by their increased entropy. An analogous mechanism is responsible for the organization of other biological polymers including RNA and DNA, now extensively documented by experimental studies. It is intuitive to envision how similar principles are responsible for the organization of molecules into cells, and cells into organs, all of which involve stabilizing physical interactions in their three-dimensional structures, and entropic stabilization from desolvation, releasing water and counterion molecules. Experimental demonstration of these principles in these more complex systems awaits the development of more advanced synthetic biology tools, but today it seems inevitable, and the principles themselves incontrovertible.
Luria pointed out that the perfect geometric shape of virus shells (and by extension of many biological macromolecules) is as remarkable as the symmetrical shape of starfish and beautiful organization of many human organs. Yet the processes that give rise to these instances of biological order tend to be thought of as distinct. The shape of a virus is “simply the outcome of the assembly of protein molecules tending, like all molecular structures, to reach a state of minimal energy.” But the shape of these animals is “achieved through an elaborate process of development.” Evolutionary physics aims to dispel this distinction.
Biological organization of molecular systems is a hierarchical process of conformational selection, explained by statistical thermodynamics. Matter comprising our bodies and all living things is stabilized by physical forces. Electrons in atoms in sufficient proximity to each other fluctuate in response to energy fluctuations at physiologic temperature, leading to stabilizing interactions, originally called the London dispersion force, or now more commonly the van der Waals force. Atoms with fixed electrostatic charges are attracted and repelled, as described by Coulomb’s law. Folding of extended chains into compact three-dimensional shapes releases water molecules, an energetically favored process due to the increased entropy and disorder of free water, described as the hydrophobic effect. These physical forces stabilize the spatial organization of biological matter. Thermal energy inherent in all living things diversifies molecular conformations under physiologic conditions. These fluctuations provide the substrate for selection of biologically functional states. Statistical probabilities of thermodynamic ensembles are the equivalent of “survival of the fittest,” as applied to molecular states.
Living things are defined by genetic replicators and their physical vehicles as subjects of natural selection and evolution. This cannot be a single molecule, though very simple molecular systems have been engineered to replicate and evolve under selective pressure in experimental systems. We do not need to know the origin of life or the composition of the simplest genetic replicators to define the physical principles of biologic function in living things. For humans, our ability to reproduce as living beings and our existence in ecological systems subject to natural selection mean that the molecular organization of our cells and tissues is shaped by evolutionary processes. The diversification in molecular structure, induced concomitantly by genetic variation and by conformational (energetic) fluctuations, provides the substrate for selection for individual fitness and biological function. The timescales are different (humans evolve over many thousands of years, whereas molecules bind each other within several thousandths of a second), but both are selected from diverse ensembles of individual organisms and conformations. This is essentially statistical mechanics of living things. Atomic organization and electronic energy limit the possible states, including the three-dimensional structures of both relatively simple biological molecules like small proteins and complex assemblies such as those responsible for the replication of our genes, cell signaling, and organ homeostasis and growth. Their specific organization from a multitude that defies human intuition, however, involves no Aristotelian anima or Romantic vital force, but a self-organizing process guided by hierarchical conformational selection, bound by energetic fluctuations induced by the physiologic temperature and conditions of our bodies.
While the physics of atoms and molecules is separated by many orders of magnitude from the evolutionary dynamics of genes and species, they are connected by a shared conceptual framework of biological function through selection of diversified states. This is evolutionary physics of biological molecules.