Basic background: Everyday physics and its math

Figure: The qualifying exam is the start.

A few concepts found in high-school courses of math and physics are probably indispensable in trying to explain how chemists and physicists have come to understand the inanimate objects with which they routinely deal in their professions. This chapter presents a brief overview of what I have found to be concepts of particular relevance to the stated quest for defining a photon.

Understanding of physics at any scale cannot proceed far without some supportive understanding of selected portions of mathematics (long held to be the language of science), primarily arithmetic and algebra, and so a brief review of portions of these disciplines appears in Section 2.1. (Appendix A.3 treats in more detail the portions of linear algebra and differential equations that are needed for putting quantum physics to work.)

Following this preliminary diversion, Section 2.2 overviews that portion of physics that became the fabric of my own professional career, dealing with the submicroscopic world of atoms and molecules. Regarded in the nineteenth century as somewhat hypothetical, these building blocks of matter are nowadays routinely isolated, and manipulated, as individuals. Complementing this discussion of particles, Section 2.3 presents a brief reminder of readily witnessed wave phenomena in bulk matter, to anticipate the discussion of electromagnetic waves in Section 2.8. Empty space—the vacuum—is not nowadays what it was imagined to be prior to the development of quantum theory. Section 2.4 notes the changed views.

Sections 2.5 - 2.6 present a qualitative introduction to some basic ideas of general physics: Forces, vectors, and energy. Section 2.7 then summarizes the (classical) equations with which one describes particle motions and electromagnetic wave phenomena. Section 2.8 begins a qualitative discussion of radiation, laying the groundwork for subsequent introduction of radiation quanta and photons in Appendix B.

A section on probability, 2.11, introduces notions that are essential for discussions of quantum theory. The concluding Section 2.12 introduces the notion of discrete quantum states, a concept that remains key to understanding both historical and contemporary photons.

2.1 Some mathematics

Mathematics appears in a variety of guises, not just as the memorization of arithmetic facts that once formed, together with reading and writing, the most essential core of children's education in the United States. And it is more than the geometry that fascinated the ancient Greeks and was taught, decades ago, to US high-school students. Following are some examples of other aspects of the vast subject of Mathematics, relevant to enquiry concerning photons.

Aside: Sets. Mathematicians have attempted, with some success, to base their entire subject upon the notion of set, meaning a collection of distinct objects, tangible or imagined, whose defining characteristics allow them to be identifiable as being members (elements) of the collection [Dev12]. The objects might be as diverse as Platonic solids, odd integers, points in a plane, months whose names begin with J, or dining-room furniture. During the 1960s it was common for schools in the US to teach basic set theory to primary-grade students as part of the New Math. I missed that educational experiment, and my own children were not impressed. Consequently the language of set theory is not prevalent in this Memoir. The abstract vectors of Section 2.5.2 and Appendix A.3 will be recognized as examples of sets.

2.1.1 Mathematics of events and actions

All readers know that the ordering of sentences A and B in a story can make a significant difference. Automobile passengers know that the command B (“turn left”) given before command A (“turn right”), denoted symbolically AB in the customary right-to-left reading of quantum optics (and some languages), will often result in a different destination than the commands ordered as BA. Craftsmen know that the actions of measurement A and cut B should always be performed in a definite order, BAA for the admonition “measure twice and cut once”.

The written description of actions, and the symbols that represent them as a sequence, requires a formalism in which, unlike the arithmetic of number multiplication 2×4 and 4×2, ordering is important: The symbolic elements do not commute, AB is not the same as BA. In addition to the world of literature and navigation and craftsmanship, the world of quantum mechanics requires such a formalism, a mathematics (an algebra) of noncommuting elements. Appendix A.18 defines the basic mathematics (group theory) of actions or operations. Appendix A.3 (vector spaces) presents the underlying mathematics of the entities that are acted upon in describing quantum-mechanical objects such as electrons and photons. Neither of these is necessary for understanding the main text of this Memoir; they are part of the mathematical background presented tutorially in the appendices.

2.1.2 Mathematics of numbers and equations

Mathematics is an unavoidable part of daily life for all but a few of us. It may be only the viewing of a calendar or a digital watch, or it may be the paying of a supermarket cashier or the forming of a betting strategy for some professional sports event. Sciences, whether dealing with touchable objects or only with organized data, cannot exist without the mathematics of numbers [Cle16], and it is therefore reasonable to begin this scientific exposition with numbers.

Numbers.

In infancy I probably became aware of whole numbers, what mathematicians call the Natural Numbers, 1, 2, 3, …, (a set they denote by ℕ) and the fact that there is no upper bound on them—you can always find a larger one. Only later, being introduced to the arithmetic operations of addition and multiplication (defining for mathematicians an algebra, see Appendix A.17), did I encounter the notion of a Prime Number, meaning a natural number that could not be written as a product of other natural numbers (numbers that are not prime are composite), a property that has made them crucial for encoding messages [Sin99, Nie00, Bar09].

The notion of integer (a set denoted ℤ) adds to ℕ the very valuable element of zero, needed to specify that the vacuum has zero photons, and the negative integers −1, −2, −3, …, needed as labels on various discretized quantities (quantum numbers).

Aside: Powers of 10. In expressing very large or very small numbers it is common to incorporate powers of 10, as in 10³ for 1,000 or 10⁻³ for 0.001. The integer exponent n in the expression 10ⁿ defines the position of the decimal point, with 10⁻ⁿ meaning 1/10ⁿ and 10⁰ being just 1. This notation is used also for symbols: sec⁻¹ means 1/sec (or per second). Defined exponents, with the prefixes used with them, include:

Large                 Small
10¹   deka, da        10⁻¹   deci, d
10²   hecto, h        10⁻²   centi, c
10³   kilo, k         10⁻³   milli, m
10⁶   mega, M         10⁻⁶   micro, μ
10⁹   giga, G         10⁻⁹   nano, n
10¹²  tera, T         10⁻¹²  pico, p
10¹⁵  peta, P         10⁻¹⁵  femto, f
10¹⁸  exa, E          10⁻¹⁸  atto, a
10²¹  zetta, Z        10⁻²¹  zepto, z
10²⁴  yotta, Y        10⁻²⁴  yocto, y
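For computational work these prefixes amount to no more than a lookup of exponents. A minimal sketch in Python (the dictionary simply transcribes the table above; the helper name with_prefix is an illustrative choice, not standard):

    # SI prefixes as powers of ten, transcribed from the table above.
    SI_PREFIXES = {
        "deka": 1, "hecto": 2, "kilo": 3, "mega": 6, "giga": 9,
        "tera": 12, "peta": 15, "exa": 18, "zetta": 21, "yotta": 24,
        "deci": -1, "centi": -2, "milli": -3, "micro": -6, "nano": -9,
        "pico": -12, "femto": -15, "atto": -18, "zepto": -21, "yocto": -24,
    }

    def with_prefix(value, prefix):
        """Express a value (in base units) as a multiple of the given prefix."""
        return value / 10 ** SI_PREFIXES[prefix]

    print(with_prefix(1e-15, "femto"))  # a proton-sized length: 1.0 fm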

Enveloping those numbers comes the category of Real Numbers (ℝ), positive and negative and zero, unbounded and typically regarded by physicists as possible results of measurements. Physicists and other scientists use numbers as measures of quantity—the variables with which science deals—and it is those that I shall discuss. Mathematicians, being unhindered by any need to relate their objects to those of the world around us, struggled well into the twentieth century to find a satisfactory definition of real numbers.

When we select a pair of real numbers with which to specify the Cartesian (rectangular) coordinates x, y in a plane we deal with the two-dimensional set of independent reals ℝ². The unbounded coordinates in three dimensions (Euclidean space) form the set denoted ℝ³. Very generally, a set of n independent, unrestricted reals is denoted ℝⁿ.

On occasion we deal with subsets Sⁿ of reals, a locus of points at a common distance, say r, from some central point. Thus the symbol S¹ refers to points on the circumference of a circle while S² refers to points on the two-dimensional surface of a three-dimensional sphere in Euclidean space, i.e. a subspace of ℝ³; see the discussion of the Bloch sphere in Appendix A.14.2 and the Poincaré sphere in Appendix B.1.7.

When a measuring device has a digital output the measurement provides a special case of real numbers—a Rational Number (the ratio of two integers, forming the set ℚ)—but other categories of number, such as irrationals (needed for solving algebraic equations, e.g. √2) and transcendentals (not solutions to algebraic equations, e.g. π) seldom need to be specifically identified by their restricting family characteristic.

Physicists frequently need to deal with the limit ℚ → ℝ obtained by considering indefinitely large denominators and correspondingly infinitesimal spacing of discrete rationals (see Appendix B.3.2). They also consider the reverse replacement, ℝ → ℚ, in order to discretize a continuum and replace an integral by a sum.

Number systems.

As one contemplates seemingly conflicting aspects of photons, it is useful to keep in mind that, even with something as seemingly mundane as numbers, there are various alternatives possible. Transactions in all our daily lives employ a decimal system of numbers, based on the ten symbols 1,2,3,4,5,6,7,8,9,0. But we also encounter other systems that use other symbols. Preface-page numbering, analog clock faces, volume numberings, and years of filming all frequently employ Roman numerals, I, V, X, L, C, D, M, together with a position-sensitive system in which IV is not the same as VI and the doing of simple sums is an intellectual challenge formerly given to school children to keep them occupied.

Numerical strangeness; Paradox.

Whenever we deal with numbers, or measurements, we encounter opportunity for strangeness or paradox—some surprising result of using logic. Mathematicians, for amusement, like to point out that in the interval between the integers 0 and 1 (or any other numerical interval) there is an unlimited count (infinite but countable) of rational numbers ℚ, yet between any two of these, no matter how close together we may choose them, there are real numbers that are not in this set: Though the set ℚ is infinite, in a denumerable way that accords with familiar counting of objects, the numbers of ℝ are even more numerous—they form a continuum. It should be realized that even the most advanced digital computer must rely on discrete representation of its numbers: The size of its storage elements (its words) limits the number of digits in its numbers, thereby establishing a limit to the accuracy with which it can represent any number. Thus a number continuum is a mathematical construct akin to the frictionless plane or the perfect vacuum of a physicist.
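The discreteness of computer numbers is easy to exhibit. A minimal sketch in Python, assuming the usual IEEE-754 double-precision floats:

    import sys

    # The spacing of representable numbers near 1.0 (machine epsilon):
    eps = sys.float_info.epsilon
    print(eps)                   # about 2.22e-16

    # An increment smaller than half that spacing is simply lost:
    print(1.0 + eps / 2 == 1.0)  # True: no continuum inside the machine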

This little mathematical curiosity of the continuum needs only mental exertion to appreciate, as does the seeming paradox that there are as many integers as there are odd integers (because they can be arranged with one-to-one matching). Other mathematical curiosities become evident when one considers integrals, such as the representation of signals by their frequency decomposition, as do electrical engineers. They find a relationship between frequency (or musical tone) and time duration of a signal, a purely mathematical characteristic associated with periodicity: You cannot specify, with absolute precision, both the frequency of a signal and an instant of time for the signal; frequency and time are complementary variables in such mathematics, formalized with Fourier transforms [Ebe77b]; see Appendix B.5.1.
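A numerical sketch of this complementarity, using NumPy (the Gaussian pulse shapes and the two durations are arbitrary illustrative choices): shorten a pulse and its spectrum broadens.

    import numpy as np

    t = np.linspace(-50.0, 50.0, 4096)
    for width in (1.0, 5.0):                    # two pulse durations
        pulse = np.exp(-(t / width) ** 2)       # Gaussian pulse of that width
        power = np.abs(np.fft.fft(pulse)) ** 2  # power spectrum
        f = np.fft.fftfreq(t.size, d=t[1] - t[0])
        p = power / power.sum()                 # normalize to unit total
        df = np.sqrt((p * f ** 2).sum())        # rms spectral width
        print(f"duration ~{width}: spectral width ~{df:.4f}")
    # The longer pulse gives the narrower spectrum; the product of the
    # two widths stays roughly constant.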

Physicists encounter this sort of relationship in the experimentally-based position-velocity uncertainty relationship of Heisenberg, as discussed with eqn (2.12-1) and eqn (A.4-6), or the wave-particle duality relationships of Section 3.7. Physicists doing calculations get used to changing from discrete sums to continuum integrals and back; see Appendix B.3.2. But they may find the change between wave and particle computations, at the heart of atomic physics and quantum optics, to seem strange, even mysterious.

The atomic hypothesis: Continuous matter or denumerable atoms?

Towards the end of the nineteenth century there arose a lively debate amongst physicists concerning the nature of atoms. Were they discrete and denumerable (as is the set ℕ) or did they form a continuum (as does the set ℝ)? I can imagine that mechanical engineers, familiar with point-like centers of mass, would have been comfortable with Newton's equations for discrete masses (see Appendix A.1), while nautical and aeronautical engineers, dealing with flowing fluids, would prefer a continuum. It should be no surprise that a similar debate arose over the nature of radiation and its photons.

Symbols.

It was in high-school algebra class that I learned the power of using some letter symbol, say x, to stand in place of all possible numbers that might be used in some mathematical relationship. In my early FORTRAN programming, which used only upper case, the letters I, J, K, L, M, N were reserved for integers, leaving all other letters for use as the more general real numbers. In physics articles it is common to see i, j, k, l, m, n used to indicate numbers from the set ℕ as subscripts for denumerating instances; for example E₁, E₂, … Eₙ … for a list of discrete energy values.

It is possible to discuss physics without using symbols; one does this in conversations that make no use of computer displays. However, to do so in writing, where contemporary software and hardware offer typesetting with a vast supply of fonts, poses a severe challenge to an author. I shall not abstain completely from the use of symbols, for they have become a seemingly indispensable part of physics.

Equations.

An equation is just a simple name for a relationship between variables, i.e. measurable quantities of interest. Quite commonly they consist of two sets of numbers or symbols separated by an equal sign (or other binary relationship symbol). Few adults escape viewing the famous formula of Einstein, E = mc², for the relationship between energy E, mass m, and light speed c, and all attendees of a class on physics will have seen Newton's formula F = ma for the relationship of force F, mass m, and acceleration a, see eqn (A.1-9). Formulas such as these, though absent from conversation, readily appear in writings, where their absence would seem an affectation.

Often a special notation indicates that a particular relationship, such as i ≡ √−1, amounts to a definition of one symbol, i on the left, by some expression, √−1, on the right. The inequality symbol, ≠, is used to indicate that two things are not equal, as in the relationship AB ≠ BA between noncommuting operations.

Some computer-programming languages use an equation structure to indicate an instruction for the moving of information from one location to another in storage. The equation A=B+C is the FORTRAN instruction to add together the numbers stored in locations B and C and store the result in location A.

Even with the familiar symbols, an arithmetic may obey differing rules for addition and subtraction. Common alternatives are the analog-clockface system that leads to the addition rule 6+7=1 and the binary system of electronic messages, with the rule 1+1=0.
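Both alternative rules are instances of modular addition, easily checked in a minimal Python sketch:

    def add_mod(a, b, modulus):
        # Addition that wraps around, as on a clock face.
        return (a + b) % modulus

    print(add_mod(6, 7, 12))  # 1 -- the analog-clockface rule 6+7=1
    print(add_mod(1, 1, 2))   # 0 -- the binary rule 1+1=0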

Often an equation presents on the left the symbol for some incremental change (say the velocity as the ratio of incremental position change dx to incremental time change dt) and on the right an expression for its functional relationship to dependent variables (say a position-dependent force). Such equations of change are examples of differential equations and, for pupils of my era, their systematic study awaited courses in Differential Calculus. Inevitably they appear in discussions of quantum physics, and they appear profusely in the appendices of this Memoir. Given suitable differential equations it is possible, in principle, to follow their instructions from a specified initial position at a specified initial time and be directed to all possible positions and times that are consistent with these rules. The organization of those instructions into functions forms the solutions to the differential equations.
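Following those instructions step by step is exactly what a computer does. A minimal sketch in Python, assuming the simplest (Euler) stepping rule and an illustrative position-dependent force F(x) = −x acting on a unit mass:

    def evolve(x, v, force, mass, dt, steps):
        """Step dx/dt = v and dv/dt = force(x)/mass from initial x, v."""
        for _ in range(steps):
            a = force(x) / mass              # Newton: acceleration from force
            x, v = x + v * dt, v + a * dt    # simple Euler update
        return x, v

    # Unit mass on a spring (F(x) = -x), released from rest at x = 1:
    print(evolve(x=1.0, v=0.0, force=lambda x: -x, mass=1.0, dt=0.01, steps=628))
    # After ~2*pi time units the motion has returned near its starting point.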

Limitations.

The equations of physics present relationships between observable quantities. Inevitably there are limitations associated with application of these. It behooves us to appreciate these limitations—to recognize the assumptions and idealizations (frictionless planes, massless pulleys, point particles, empty surroundings, etc.)—that may negate their usefulness in particular instances.

Formulas, algorithms, and plans.

In chemistry, a formula is a fixed set of symbols, such as H₂O or C₂H₆O, that describes the composition characteristics that distinguish between various compounds—that differentiate water from ethanol in this example. Organic compounds typically require elaborate diagrams of loops and lines for their definition, and these too could be considered formulas of a sort—a template that allows association of a name with a substance. But the formulas for nitroglycerine or gunpowder, by themselves, bring no information about how these substances can be formed into destructive devices, as would be explained in diagrams of an assemblage. In physics and math a formula is typically an equation expressing some mathematical relationship, such as A = πr² for the area of a circle A from its radius r, but it may just be some arrangement of symbols—the πr² part of the area equation.

Algorithm is a technical term for a procedure, typically one for manipulating symbols or performing actions, that from some starting material (symbols or numbers) can produce a result that has desired characteristics. The area formula A = πr² presents an algorithm for obtaining A from r. A FORTRAN instruction is a written form of an elementary computer algorithm for moving numbers around in storage. Earlier generations of school children learned the algorithms for adding columns of multi-digit numbers (carries) and carrying out long division. Commonly the term is used for mathematical procedures, but it could apply to other activities. Students of organic chemistry are asked, on exams, to devise a synthesis procedure—an algorithm of sorts—for creating a compound defined by its formula. An experimentally-inclined physicist might try to explain the algorithm for producing a perfect cup of coffee; his formula for perfectly prepared coffee may have little to say, directly, about how it came to be so perfect.

The ambiguous noun “plan” is used with meanings similar to both those above: A retailer may compose a plan for an advertising campaign (an algorithm, not a formula, though it may include charts) and draw a (floor) plan for a revised store layout (which may include a timeline for actions). Conversely, in literary works one reads of “a formula for success” rather than the more accurate but less auspicious sounding “a plan for success”. But it would be misleading to call Einstein's equation E=mc2 the “formula for an atomic bomb” (as I was once told).

Complex numbers.

Although measurements invariably produce real numbers, much of the mathematics associated with quantum theory (and hence with photons) is presented more simply by introducing the square root of minus one as a unit coordinate in a two-dimensional space (the Complex Plane) whose axes are, in one direction (horizontal), just real numbers, and in the perpendicular direction (vertical) are real numbers times the imaginary unit i ≡ √−1, defined by the requirement i² = −1. A complex number z (from the set denoted ℂ) therefore has the form z = x + iy, with real-valued x and y. The squared length |z|² of such a two-dimensional vector is just the sum of the squares of its two components, x² + y², an example of the Pythagorean Theorem that students of trigonometry (and the diploma-endowed Scarecrow of Oz) learn for describing the long side of a right triangle:

The square of the hypotenuse is equal to the sum of the squares of the other two sides.

Complex numbers are ubiquitous in physics, primarily for their mathematical convenience. In particular they occur as probability amplitudes, whose absolute squares are real-valued probabilities, and as electric-field amplitudes, whose absolute squares provide positive measures of radiation intensity and energy density (and hence of photon density). Appendix A.17 discusses various mathematical structures that generalize the notion of complex numbers—mathematical constructs (forms of algebras) sometimes termed hypercomplex numbers.

Aside: Magnitude and phase. Complex numbers have two parts, labeled real and imaginary. The two parts can also be expressed as a magnitude and a phase; see Appendix A.3.2. The magnitude of the complex number x+iy is defined, for example, by the formula

|x + iy| = √(x² + y²).

(2.1-1)

The phase of a complex number z is the angle that specifies its direction on a line from the origin in the complex plane. It is the angle φ such that the real and imaginary components of z are obtained from the trigonometric relations defining sine (sin) and cosine (cos) for the two short sides of a right triangle having |z| as hypotenuse (see Appendix A.1.1):

real: x = |z| cos φ,  imaginary: y = |z| sin φ.

(2.1-2)

Physicists commonly express angles in radians, (π/2) rad = 90°, so sin(π/2) = 1.
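Python's built-in complex numbers make eqns (2.1-1) and (2.1-2) directly checkable; a minimal sketch (the values x = 3, y = 4 are an illustrative choice):

    import cmath, math

    z = 3 + 4j                  # x = 3, y = 4
    magnitude = abs(z)          # sqrt(3**2 + 4**2) = 5.0, as in eqn (2.1-1)
    phi = cmath.phase(z)        # the angle of eqn (2.1-2), in radians

    # Recover the real and imaginary parts from magnitude and phase:
    print(magnitude * math.cos(phi), magnitude * math.sin(phi))  # ~3.0 ~4.0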

Variables and functions.

Relationships between numbers form the basis for the equations with which physicists and others formulate their descriptions of Nature. For that purpose the equations are regarded as providing a connection between some quantities such as position and time that are to be chosen, and some other quantities, such as temperature or rainfall, that will then be found associated with that choice. The numbers involved are Variables: Either to be chosen (the independent variables, location and time) or to be found (the dependent variables, temperature and wind velocity). A mathematical Function is a rule for connecting the (resulting) values of dependent variables to the (chosen) values of the independent variables—a connection that I eventually learned is called by mathematicians a mapping.

In much of the physics literature a functional relationship of a single independent variable is exhibited typographically in the form F(t) or F(x), where t or x is the symbol being used for the independent variable (say, time or position) and F is the symbol for the dependent variable (say, temperature or stock price). To emphasize that the energy of Einstein's famous equation varies with mass one might write E(m) = mc². Appendix-defined examples from atomic physics include ψ(x, y, z) for the complex-valued probability amplitude (a wavefunction) of finding a given electron at a position whose Cartesian coordinates are x, y, z (see Appendix A.5), and E(x,y,z,t) for the complex-valued amplitude of an electric field at time t and a position specified by coordinates x, y, z.

The range of values of the independent variable for which a function is defined is known as the support for the function. A pulse of temperature or electric-field magnitude that is modeled as occurring during only a finite time interval is said to have finite support. It often proves mathematically convenient to allow the interval to extend indefinitely, to become a pulse of infinite support.

The locus of points x that map onto the function values F(x) forms a curve, possibly smooth but possibly with jumps and kinks, even singular points where F(x) is undefined. A point at which two curves have the same value, F₁(x) = F₂(x), is known as a crossing.
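Locating such a crossing numerically is a routine task. A minimal sketch in Python, by bisection on the difference F₁ − F₂ (the two functions and the bracketing interval below are illustrative choices):

    def crossing(f1, f2, lo, hi, tol=1e-12):
        """Find x with f1(x) = f2(x), given a bracketing interval [lo, hi]."""
        g = lambda x: f1(x) - f2(x)
        assert g(lo) * g(hi) < 0, "interval must bracket a sign change"
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0:
                hi = mid          # crossing lies in the lower half
            else:
                lo = mid          # crossing lies in the upper half
        return 0.5 * (lo + hi)

    # Where does F1(x) = x cross F2(x) = x**2 - 1?
    print(crossing(lambda x: x, lambda x: x * x - 1, 1.0, 2.0))  # ~1.618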

2.2 Particles: Elementary and structured

Particles, to physicists, are localized concentrations (usually small) of mass and other properties. They combine to produce the objects of everyday life, and they come in a great variety. Particles can be characterized, in part, by several intrinsic qualities (attributes), meaning characteristics that are unchangeable and independent of position and velocity. Examples include the electric charge of an electron, the chemical properties of an atom (or the number of its electrons), the color of a pool ball. Spin, associated with an intrinsic angular momentum, is another such attribute (see Section 2.2.8 and Appendix A.5). By contrast, characteristics such as the arrangement of electrons within an atom (governing atom size and shape) or of pool balls upon a pool table, are changeable and not intrinsic. Particles are regarded as identical if they have exactly the same set of intrinsic attributes.

Amongst the intrinsic attributes of everyday objects that are to be considered in identifying and using them is the pull of gravity (or resistance to force), quantified as mass [Adl87, Oku89, Roc05] and expressed in kilograms, kg.

Aside: Rest mass. A stationary particle has an intrinsic measure of its resistance to motion (an inertia) quantified as a rest mass m0. The special theory of relativity introduces changes of length and time measures as observed in different moving reference frames. Consequently a particle seen as moving steadily with speed v will have an apparent inertia (or relativistic mass) that increases without bound as v approaches c, the speed of light in vacuum—the ultimate limit of any particle speed; see eqn (A.1-11). It is rest mass that we list as an intrinsic characteristic of elementary particles.
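A minimal numerical sketch of that unbounded growth (Python; the chosen speeds are illustrative, and the formula is the one referenced as eqn (A.1-11)):

    import math

    C = 299_792_458.0  # speed of light in vacuum, m/s

    def relativistic_mass(m0, v):
        """Apparent inertia m0 / sqrt(1 - v^2/c^2); diverges as v -> c."""
        return m0 / math.sqrt(1.0 - (v / C) ** 2)

    m_e = 9.10e-31  # electron rest mass, kg
    for v in (0.1 * C, 0.9 * C, 0.999 * C):
        print(f"v = {v / C:.3f} c: m = {relativistic_mass(m_e, v):.3e} kg")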

2.2.1 Indivisibility

An important concept in physics and chemistry is the notion of indivisibility, and its ally, elementarity. Put operationally, this refers to the possibility, with given tools, of breaking apart some given object.

The notion of indivisibility very clearly depends upon the tools available for taking something apart and the decision to use them. When I gathered around our family dining table for a meal I recognized chairs as being basic units for sitting. These, along with plates and tableware, were present in unchanged numbers each day, and so they constituted the elementary constituents (particles) of meal support. But occasionally there would be a mishap, and a chair or plate might be broken into pieces. Discarded chairs, being wooden, could even be burned in a bonfire. The notion of indivisibility clearly depended on the conditions under which an object was being used.

Traffic engineers deal with motor vehicles as indivisible units, moving along discrete roadways. Air-traffic controllers deal with individual airplanes along approved flight paths. Ranchers deal with herds or flocks of individual animals pasturing during their growth. These and many other examples of temporarily indivisible units—structured particles—come readily to mind.

The notion of indivisibility occurs in chemistry for the definition of molecules and atoms. As long as only chemical processes are employed, these remain elementary units. Physicists use a variety of tools to break apart atoms, finding thereby a variety of more elementary units. The following paragraphs describe some of the particles that, for appropriate conditions, are treated as indivisible and elementary.

2.2.2 Atoms

It was common, in my school days, to read that “every schoolboy knows …” and then to be presented with some fact from History or Science that, at the time of the writing, was part of the general education curriculum of boys—and of some girls. The notion of an atom (or a molecule) was one of those concepts. The idea was very simple, an example of what is called in German a Gedankenexperiment or thought experiment: you imagine cutting some chunk of Matter (a chair or a plate or a hammer) into smaller and smaller pieces. Eventually you will come to a situation when the pieces no longer carry the recognizable chemical properties of the larger object. The smallest piece of iron is an atom of iron. The smallest piece of a kitchen chair is a molecule of wood or plastic. This, in turn, can be broken into atoms of carbon and oxygen and hydrogen and nitrogen and other elements.

As every schoolchild used to know, there are nearly a hundred chemically distinguishable kinds of atoms (different chemical elements) found in nature. Charts listing these were once found displayed in classrooms, organized into the Periodic Table, starting with hydrogen and helium, elements 1 and 2, and ending with such heavy elements as uranium, element 92. Already in my elementary-school days we learned that the naturally occurring elements were being augmented by unstable, man-made elements (for example element 43, technetium, and, more recently, element 116, livermorium), but the Periodic Table remained our guide to the world of atoms which, when suitably combined into molecules and solids, provided all the inanimate objects of everyday life. We understood that every atom, every molecule, carried intrinsic chemical attributes (originating with the number of electrons and their structure) along with physical attributes of mass, size and shape. Isolated atoms might be treated as moving mass points but they carried intrinsic attributes as well.

Artificial atoms.

Nowadays a variety of remarkable fabrication techniques and technology tools make possible the construction of submicroscopic (nanoscale) objects in which confined electrons mimic observable characteristics we attribute to the internal structure of traditional atoms and molecules. Often termed “artificial atoms” [Kas93, Kou01, Rei02, Cho10, Hon11, Lod15, Man16, Sal19] or “photonic nanostructures” they are available in many forms, including one-dimensional “quantum wires”, two-dimensional “quantum wells”, and three-dimensional “quantum dots”, with properties that can be designed by an experimenter. The physics, and the mathematics, of such structures is very much what I learned in my years of professional concern with laser action upon traditional atoms—only the names have been changed. All of them serve equally well as examples of quantum-state systems that absorb, emit, and modify discrete increments of radiation (photons) in a coherent manner.

Subatomic entities; electrons and ions.

We scholars also learned that the thought experiment of taking apart our everyday objects need not end with single atoms. These, we learned, are made from bits of electrically charged matter: Very light negatively charged electrons, removable from much heavier, positively charged atomic nuclei that are much smaller than the full atom (see Section 2.2.6 for a discussion of sizes). These electrons and nuclei are the basic particles whose intrinsic properties must be considered when one sets out to understand such things as the colors of stained glass windows or the strength of bricks and mortar or the combustion properties of gasoline. Their natures underlie the study of Chemistry and Materials Science, and the design of electronic devices that enable the creation of computer games and the sending of messages.

As commonly encountered, individual atoms and molecules have no electrical charge. They are electrically neutral because the number of their electrons exactly equals the number of positive charges in their nuclei. The removal of an electron from an atom or molecule leaves a positively charged ion. If that freed electron becomes attached to a neutral particle the result is a negatively charged ion.

Individual atoms.

The notion of breaking apart pieces of matter to obtain molecules and atoms as elementary units is not so far removed from household experiment. The simplest idea, say of grinding table salt into a fine powder, does not produce individual atoms of the sodium and chlorine that comprise the salt crystals, but when one pours the salt into water the molecular bonds between Na and Cl atoms break, and one has a salty solution in which individual atoms, carrying single electron charges as Na⁺ and Cl⁻ ions, are to be found moving.

At least that is the picture that I was presented as a youngster. To some extent this picture of solvent action by water, breaking chemical bonds of solid compounds, remains valid, but the subject of Aqueous Chemistry—the study of water as a solvent [Stu95, Sch10, Hor12]—has brought a more complicated picture of the resulting liquid. Metal ions in aqueous solution carry with them an entourage of molecules from the fluid environment—a clothing of loosely held ligands that depends on such conditions as how acidic or basic the environment is. Our glasses of dairy milk carry a variety of bovine-produced colloids. And seawater contains a vast assortment of minute organic and inorganic particles of sizes that range upward from a few aggregated atoms [Ben92, Guo97], all of which have their characterizing interactions (scattering and absorption) with photons.

As noted in Section 5.1.2, to best understand photons as individuals it is important that their source atoms be carefully prepared to prevent randomizing influences. For this purpose vapors have advantages over liquids, and so the individual atoms whose photon interactions are studied in Quantum Optics are often obtained from a dilute vapor. Conceptually the simplest procedure is to allow a heated gas to leak from a container as a particulate beam, available for manipulation that can capture and hold a single molecule, atom, or ion.

2.2.3 Electrons

Electrons, the indivisible bits of (negative) electricity, are the simplest of all contenders for the title of “elementary particle”, meaning part of the stock from which all matter can be constructed. They are readily generated (freed from captivity in atoms) as cathode rays in evacuated containers, traveling in the space between two charged electrodes—from the negatively charged cathode to the positive anode. Flows of electrons occurred in the vacuum tubes of radio sets. Electron beams in the cathode-ray tubes of laboratory oscilloscopes and home television sets were directed by electric and magnetic fields towards designated target areas on the vacuum side of a transparent viewing screen, there to produce fluorescence from atoms, visible to human viewers from their side of the screen. This charged-particle motion through vacuum has become a model of elementary particle behavior; see Section 8.3.

Electrons can be stopped (by atoms) and extracted (from atoms) but cannot normally be otherwise created or lost: Electric charge is conserved under all conditions. Whatever may be their source, all electrons are identical, indistinguishable, and indivisible. It is natural to have electrons in mind when one contemplates indivisible elementary particles of radiation.

Within a metal there occur conduction electrons that are free to move throughout the solid, under the direction of applied voltages and localized structure. But in any sample of matter most of the electrons are bound to atomic nuclei, held by attractive Coulomb forces to form atoms and molecules or rigid structures.

Aside: Electron attributes. The electron has three intrinsic (unalterable) attributes, electric charge e, rest mass m_e, and magnetic moment, whose values rank among the basic fundamental constants of nature (see Appendix B.1.1 for definition of SI electromagnetic units):

|e| ≈ 1.60×10⁻¹⁹ C,  m_e ≈ 9.10×10⁻³¹ kg.

(2.2-1)

The intrinsic magnetic moment of a free electron differs very slightly from one Bohr magneton, defined as [here ℏ, called h-bar, is the reduced Planck constant (or Dirac constant), the elementary unit of angular momentum],

μ_B = eℏ/2m_e ≈ 9.27×10⁻²⁴ J/T,  with ℏ ≡ h/2π ≈ 1.05×10⁻³⁴ J s/rad.

(2.2-2)

As Section 2.2.6 notes, electrons have no intrinsic size, although several combinations of fundamental constants associated with electrons have dimensions of length; see Appendix A.2.

Aside: Atomic (or Hartree) units. In treating manipulations of quantum structures it is often useful to express charge, mass, and angular momentum as multiples of electron charge e, electron mass m_e and the reduced Planck constant ℏ, respectively. The resulting atomic units provide formulas that set e = m_e = ℏ = 1. The Bohr radius, a_0, provides the atomic unit of length, while the atomic unit of energy, the Hartree energy E_AU, is twice the ionization energy of an idealized hydrogen electron

a_0 = 4πε₀ℏ²/(m_e e²) ≈ 5.29×10⁻¹¹ m,  E_AU = e²/(4πε₀a_0) ≈ 27.21 eV.

(2.2-3)

In atomic units the speed of light is c ≈ 137, the inverse of the fine structure constant α of eqn (A.2-3).
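These relations are easy to verify numerically. A minimal sketch in Python, using rounded CODATA values for the SI constants:

    import math

    hbar = 1.054_571_8e-34   # reduced Planck constant, J s
    m_e = 9.109_383_7e-31    # electron mass, kg
    e = 1.602_176_6e-19      # elementary charge, C
    eps0 = 8.854_187_8e-12   # vacuum permittivity, F/m
    c = 2.997_924_58e8       # speed of light, m/s

    a0 = 4 * math.pi * eps0 * hbar**2 / (m_e * e**2)   # Bohr radius
    E_AU = e**2 / (4 * math.pi * eps0 * a0)            # Hartree energy

    print(a0)                      # ~5.29e-11 m, eqn (2.2-3)
    print(E_AU / e)                # ~27.21 eV
    print(c / (a0 * E_AU / hbar))  # speed of light in atomic units: ~137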

Aside: Fractional electrons. Electric currents are carried through gases and liquids by electrons and ions, bearing integer increments of the electron charge. In conducting solids (metals) both electrons and lack-of-electrons (holes) can carry the charges, whose voltage-driven motion is seen as currents, and is deflected by a magnetic field. But in two-dimensional semiconductors at very low temperatures, subject to a strong, static magnetic field, there can occur a collective fluid state wherein electrons and holes combine with magnetic flux to form charge-carrying quasiparticles (neither fermions nor bosons, see Appendix A.8.3) that have fractional elementary charge, observable as the fractional quantum Hall (FQH) effect [Ait91, Sch98b].

2.2.4 Nuclei and their constituents

During the early years of the twentieth century physicists learned of two particles that account for the mass, charge, and volume of atomic nuclei: Protons, each carrying one unit of positive electric charge that exactly balances the negative charge of an electron, and electrically uncharged neutrons. The uncharged neutron is a constituent of the nuclei of all but the lightest hydrogen atoms. Each of these (known collectively as nucleons) is some two thousand times heavier than an electron.

Aside: Nuclear masses. The rest masses of the proton and neutron are

m_p ≈ 1836 m_e ≈ 1.007 u,  m_n ≈ 1.008 u,  u ≈ 1.66×10⁻²⁷ kg.

(2.2-4)

Here the atomic mass unit, u, is 1/12 the mass of a carbon ¹²C atom.

The summed masses of constituent nucleons accounted well for the atomic weights of chemical elements, although later, more accurate, mass values revealed energies of binding interactions—the energy of nuclear fission.

Each proton and neutron has an intrinsic magnetic moment, smaller than that of an electron by the mass ratio me/mp. These individual moments (vectors) combine with orbital motion of protons within the nucleus, to create an overall intrinsic nuclear magnetic moment (a vector) proportional to the nuclear spin [Eva55, Kop13]. In turn, the nuclear moment combines with the various magnetic moments of the electrons and their orbital currents to create a total atomic magnetic moment, again proportional to the total angular momentum of the electronic structure, as discussed in books on atomic structure [Con53, Bet57, Sla60, Hin67, Cow81, Sob92, Bra03, Dem10].

The names one encounters in reading of proposed particulate bits of matter—electron, proton, neutron, meson, and so on and on—originated with the notion that they had an elementary nature: They could not (normally) be further subdivided. Nowadays even the proverbial school attendee knows that, although electrons cannot be further divided (they are truly elementary particles), the atomic nuclei that chemists and atomic physicists treat as their elementary positively-charged particles, can, with sufficient effort and financial support, make evident an underlying structure based on neutrons and protons. These, we are told, are in turn built from quarks held together by gluons (see Appendix A.19).

But none of this sub-subatomic menagerie has immediate relevance to the workings of everyday objects. For their construction we need only electrons and nuclei. And for their understanding we need only the mathematics developed for the purpose in the early twentieth century, Quantum Theory; see Appendix A.4. It is within that world that the notion of photon, a companion to the electron, made its appearance.

2.2.5 Antimatter

Electrons and most naturally-occurring atomic nuclei (including the proton) are stable particles: Apart from radioactive nuclei they can remain unchanged indefinitely in isolation.

As physics experiments revealed during the mid-twentieth century, the particle constituents of the matter we encounter in everyday activities have antimatter counterparts that were first seen in cosmic rays and then in debris of target bombardments with particle accelerators. The positively-charged anti-electron (or positron) and the negatively-charged anti-proton have opposite electric charge from (but identical mass to) their respective electron and proton matter partners.

The neutron too has an antiparticle, the uncharged antineutron. The photon has no electric charge and no mass, and so it is indistinguishable from its antiphoton.

Amongst the various elementary particles (leptons) that constitute the Standard Model of Particle Physics described in Appendix A.19 are three types of electrically neutral, almost-massless neutrinos. Though these are uncharged they have distinguishable antineutrinos—differing by helicity (the projection of spin onto the propagation axis).

Although neutrons can remain stable within nuclei, a neutron isolated outside of proximity to protons is unstable: It spontaneously decays into an electron, a proton, and an electron antineutrino (see Appendix A.19). The electron and proton charges exactly balance, thereby retaining overall conservation of electric charge in this decay.

The various particles of antimatter have exactly the same masses as their normal-matter counterparts, and so both sorts are affected equally by gravity and electromagnetism. However, particles of antimatter, upon encountering a matter counterpart, undergo mutual annihilation: Their masses are ultimately converted into radiation energy (photons) and neutrinos. Conversely, a photon of sufficient energy can, in passing by an atom, become a particle-antiparticle pair. Such events are part of the realm of high-energy physics, not immediately relevant to the photon physics discussed in this Memoir.

2.2.6 Particle sizes

It is important to know the sizes of objects you hope to fit into a room or a container, and we regularly make judgements of sizes (small, grande, big, huge, XXL) even without that intent. The familiar units of feet and inches derive their usefulness from their association with everyday objects—human bodies. To appreciate the world of atoms and photons it is helpful to know how their sizes, however defined, compare with everyday measures of size. Appendix A.2 discusses size measurement and presents several values relevant to atoms, electrons, and photons based on constants of nature. Here I summarize the qualitative details of that presentation.

Size estimates. A simple measure of atom or molecule size comes from recognizing that in a liquid or solid the particles, though moving relative to each other, maintain contact. Thus if the mass density is ρ kg/m³, and each particle has mass m kg/atom, then the volume occupied by one particle is

V = m [kg/atom] / ρ [kg/m³].

(2.2-5)

Treating V as the volume of a sphere of radius r, one finds the particle radius r (a measure of size) from setting V = 4πr³/3.
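As a worked example of this estimate (a minimal Python sketch; the density and atomic mass of iron are rounded illustrative values):

    import math

    u = 1.66e-27          # atomic mass unit, kg
    m = 55.85 * u         # mass of one iron atom, kg/atom
    rho = 7870.0          # mass density of solid iron, kg/m^3

    V = m / rho                             # volume per atom, eqn (2.2-5)
    r = (3 * V / (4 * math.pi)) ** (1 / 3)  # radius of the equivalent sphere
    print(r)  # ~1.4e-10 m: atomic sizes are a few tenths of a nanometer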

Electrons have no intrinsic size, although several combinations of fundamental constants associated with electrons have dimensions of length; see Appendix A.2. Within a metal, the conduction electrons are free to move throughout the solid. In other aggregates they occupy space around minuscule atomic nuclei but their electric-charge (and mass) distribution is not uniform within that volume. The localization of electron charge within an atom is rather like a cloud: It has no sharp boundary and is most concentrated near the nucleus, where the Coulomb attraction is largest, but with notable overall nodal structure (places where the local mass and charge density is negligibly small). Well outside the nucleus the mass density and charge density of electrons within an atom fall exponentially with distance from the nucleus; the “size” of an atom (or subatomic particle) therefore refers to a mean radius of mass or charge. As atoms become heavier along the sequence displayed in charts of chemical families their sizes vary in accord with chemical properties, but not in a monotonic way: Alkali atoms (e.g. Na) are appreciably larger than halide (e.g. Cl) or noble-gas (e.g. He) atoms of comparable mass [DeV44].

By contrast, the protons and neutrons that comprise the minute atomic nuclei have basic intrinsic sizes (not sharp edges but mean radii): The radius of a proton is around 1 fm (a femtometer or fermi, 10⁻¹⁵ m) while the radius of the first Bohr orbit of hydrogen is approximately 0.5 Å = 0.5×10⁵ fm, see Appendix A.2. Nucleons are not appreciably compressible—in first approximation they can be regarded as forming a liquid. Thus as nuclei gain protons along the Periodic Table their volumes grow in proportion to the number of their nucleons—their atomic weight. Nuclei generally have shapes that are not spherical, as characterized by various charge-distribution multipole moments [Sto05, Kop13].

Well beyond an atom or nucleon its charge and mass can be treated as localized at a point-like center of mass. Any structure of the charge and current distribution appears as a series of multipoles; see Appendix B.2.6.

Although electrons occupy volumes that are defined by whatever confining forces may be present, several numbers with dimensions of length occur in theoretical treatments of electrons; see Appendix A.2 and Appendix B.8.2.

2.2.7 Radiations

Toward the end of the nineteenth century the terms “radiations” and “rays” found use in a variety of applications, and these still influence our discussions of the natural world. There were X-rays that could penetrate matter (subsequently found to be forms of electromagnetic waves). There were cathode rays, now known to be electrons. From radioactive solids came alpha rays (positively charged nuclei of helium devoid of any electronic cover), stopped by a sheet of paper, and beta rays (electrons at first, but subsequently also positrons), stopped by a few mm of aluminum, and massless, uncharged gamma rays (electromagnetic waves) that readily penetrated flesh and bone and were attenuated by sheets of lead. The form of radiation of most interest for the present exposition is visible light, now known to be electromagnetic waves, studied in the discipline of Optics. In the present Memoir the term radiation, with its photon increments, will mean exclusively electromagnetic fields. The examples of historical rays with nonzero rest mass are regarded here as particles.

2.2.8 Particle spin: Fermions and bosons

All of the common particles, whether elementary (an electron or photon) or composite (an atom), fall into two classes, distinguished by their intrinsic spin:

Fermions have intrinsic spin of half-odd-integers. They include electrons, protons and neutrons, whose spin is 1/2.

Bosons have spin of 0 or positive integers, and include photons, whose spin is 1.

The distinction between fermions and bosons becomes pronounced when we consider constructs formed from multiple particles, such as a nucleus formed from nucleons or a molecule formed from atoms. The mathematical connection between spin and statistics (see Appendix A.8.3) and the consequent Pauli exclusion principle, prevents multiple fermions from occupying the same space, and thereby maintains the electronic structure of atoms. A composite formed from an odd number of fermions will be a fermion. An even number of fermions will be a boson.
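The odd/even rule is simple enough to state as code (a minimal sketch, counting electrons, protons, and neutrons as the spin-1/2 constituents):

    def statistics(n_fermions):
        # Odd number of constituent fermions -> fermion; even -> boson.
        return "fermion" if n_fermions % 2 else "boson"

    print(statistics(1 + 1))      # hydrogen atom (p + e): boson
    print(statistics(2 + 1 + 2))  # helium-3 atom (2p + n + 2e): fermion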

2.3 Aggregates: Fluids, flows, waves and granules

When we reverse the process of breaking apart material objects into their basic units—atoms and molecules—and instead assemble them to create liquids and solids we encounter collective properties of aggregates that are not to be found in the individual constituents. These are traditionally treated by idealizing matter as being continuously distributed within some defining volume—a continuum (ℝ³) of mass or electric-charge density (property per unit volume), without any specially noteworthy position, often without any specified boundary; see Appendix C.2.1. It is with such material that we find prototypes of wavelike behavior and so, to appreciate possible wavelike attributes of photons, we do well to review the equations found there.

Aggregate forms of matter.

In my elementary-school education I was taught that gases, liquids, and solids were the three basic forms of bulk matter and that water could be found as any of these, depending on its temperature and atmospheric pressure. Much later I read of plasmas (ionized gases) as a fourth state of matter. These categories appeared as organizings of physics (and its publications and academic subdivisions) into solid-state physics and plasma physics. Enlargement of the categories now includes portions of matter that are soft and squishy. These are studied in condensed-matter physics and biophysics, and in departments of Biology. Section 2.3.5 notes more recently established categories.

Gases.

A gas is imagined by physicists as a collection of many particles, each idealized as an infinitesimal concentration of mass and other attributes, to be treated as mathematical points distributed uniformly throughout a given volume. In the absence of external forces the gas particles move along straight-line paths, making occasional infinitesimally brief encounters with other particles, treated as elastic (momentum-transferring) and inelastic (energy-transferring) collisions. The mean time between collisions quantifies the duration of coherence of particle phases.

A gas is usually regarded as a continuum of mass, momentum, and energy densities subject to an equation of state that relates the thermodynamic variables of pressure, temperature and volume. In particular, the averaged motion of the gas particles depends only on temperature; cooling of a gas requires that there be a mechanism for irreversibly transferring energy beyond the boundary of the gas—to a “bath”, see Section 5.1.3. In accord with the precepts of quantum theory, no bulk matter can ever be entirely without some uncontrollable thermal energy, even when the thermodynamic temperature approaches its lowest value, absolute zero (0 K).

Liquids.

By definition, a given mass of liquid has a definite density and volume but no particular shape: When acted upon by gravity it readily takes the form of a container (although blobs of liquid freed from gravity will be shaped by surface tension). Like the molecules of a gas, the molecules of a liquid undergo continual random (thermal) motion conditioned by surrounding temperature. Though they are packed together closely they readily slip and slide past each other.

All liquids are held together by short-range forces between their molecules. These forces impose, but only briefly, a local structure akin to that of a solid. However, unlike the rigid structure of a solid, this orderliness is only in the immediate neighborhood of each molecule. The lack of long-range order is a defining characteristic of a liquid (or a glass).

The local, microscopic forces between molecules of a liquid produce a resistance to deformation, quantified macroscopically as viscosity—a frictional force between adjacent fluid increments that have different velocities.

Solids.

The molecules of a solid, though subject to small, incessant thermal motions, are held by chemical bonds into a nearly-rigid stationary framework. The framework may exhibit a simple long-range regularity (as in a crystal) or may be quite irregular (as in glass). The defining characteristic of a solid is the (nearly) rigid structure of individual equilibrium positions of molecules, atoms or ions that form a relatively fixed lattice for electrons.

2.3.1 Fluids and flows

Aggregates of matter that can readily change shape—liquids and gases—are treated mathematically as fluids, a continuous distribution of mass, momentum, and kinetic energy whose parcels undergo collaborative motion—fluid flow. Streams and rivers gain their public popularity from their flows of water in bulk, while air flows drive windmills and sailboats. The equations that describe fluid flows, freely and past boundaries, form the topic of hydrodynamics [Lam45, Mil60, Lan87, Fal11]. (The specialized topic of aerodynamics provides a description of air flow over airplane wings.) All these modelings of nature form the topic of fluid dynamics—the treatment of a continuous distribution of properties including mass, momentum, and kinetic energy under the influence of bulk forces.

The basic equations of fluid dynamics express the time and space variation of a vector field, the velocity of an infinitesimal parcel of fluid. These equations derive, in part, from requiring that the motion of each fluid parcel should conserve mass and energy and respond appropriately (in accord with Newton's laws) to the imposition of forces: Those of pressure and the viscous forces between neighboring fluid parcels that move with different velocities. (The velocity gradient is referred to as a strain rate.) Additionally, the bulk matter is assumed to satisfy an equation of state relating thermodynamic variables of pressure, temperature, and volume. The resulting equation, the Navier-Stokes equation [Lam45, Mil60, Lan87, Fal11] [see eqn (A.1-47) in Appendix A.1.10] is a relationship—a partial differential equation (PDE)—between incremental changes in position and time whose solution describes distributions of fluid flow induced by pressure and viscosity. It is subject to constraints imposed by rigid boundaries of walls or wings or by boundaries between different fluids, such as the air-water interface bounding the upper surface of a sea.

Flows.

The fluid flow may either follow simple paths (streamlines) along a watercourse or it may encircle a point, forming a vortex. An example of vortex motion is often visible in the flow of kitchen sink or bathtub water out through the drain. Along with wave motion, vortices are a characteristic of fluids, not evident in few-particle systems.

All real fluids are viscous; an idealization that has no resistance to deformation is known as an ideal or inviscid fluid. The dimensionless Reynolds number, the ratio of inertial forces to viscous forces, fixes the type of flow that will occur near a rigid boundary such as that of a water pipe or a ship hull. When viscous forces dominate, at low Reynolds number, the fluid motion is smooth (laminar), following simple flow lines. When inertial forces dominate, at high Reynolds number (typically several thousand, depending on the geometry), the motion is turbulent and irregular, producing vortices and other instabilities. The presence of viscosity leads ultimately to dissipation as heat; see page 35.
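A minimal sketch of that criterion in Python (the pipe-flow values below are illustrative):

    def reynolds(rho, v, L, mu):
        """Reynolds number: density * speed * length scale / viscosity."""
        return rho * v * L / mu

    # Water (density ~1000 kg/m^3, viscosity ~1e-3 Pa s) moving at 1 m/s
    # through a 5 cm pipe:
    print(reynolds(rho=1000.0, v=1.0, L=0.05, mu=1.0e-3))
    # ~5e4, far above the few-thousand threshold: expect turbulent flow.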

Aside: Photons as fluids. In recent years I have appreciated that suitably crafted electromagnetic fields in free space (i.e. photons) can exhibit not only flows of energy and momentum but also vortices—distributed angular momentum; see page 98. In a vacuum, photons move freely with only negligible mutual interaction—the photon-photon scattering is there extremely slight. But when radiation passes into dense bulk matter the field can be strongly blended with matter characteristics, forming quasiparticles (polaritons) that carry the disturbances; see Appendix C.4. The possibility of frequent collisions between polaritons can lead to collective fluidlike behavior [Car13]; see Section 8.5.

2.3.2 Matter waves

All of the traditional forms of matter—solids, liquids, and gases—exhibit patterns of collective effects regarded as wavelike and characterized by minima (valleys or null-valued nodes) and maxima (crests or peak values or antinodes) that travel steadily (traveling waves) or are confined in space (standing waves).

Surface waves.

As a vapor of triatomic water molecules condenses to form a mist of droplets and then a pan-filling liquid we are able to observe one of the most familiar properties of a liquid surface: Ripple patterns (waves) that move across the interface between the dense liquid and the overlying vapor. Viewings of water waves are a popular excuse for visiting shorelines. The waves, visible as crests and valleys of the water surface, are collective actions of the molecules that make up the bulk liquid.

From the equations of fluid dynamics it follows that at an interface between two different fluids, such as the air-water interface above a sea, there can occur surface waves. These are regular patterns of vertical and horizontal displacement of fluid elements, localized around the boundary, that transport energy along a propagation direction and which exhibit interference structure between colliding waves.

Such surface waves are also found in the equations of electromagnetism, associated with sharp boundaries between different dielectrics; see the topics in Appendix C.3.

Acoustic waves.

Our ears, as well as such devices as microphones, respond to slight abrupt changes in air pressure that travel as acoustic waves. When these waves exhibit periodic variation of pressure they produce sensations of musical tones. Acoustic waves travel not only through gases but also through liquids and solids. They are generated by vibrating guitar strings, by gently struck wine glasses, and in many other ways. As molten iron solidifies the individual atoms of iron become confined between neighboring atoms. The atoms, though able to move slightly (a displacement from an equilibrium position), form a rigid macroscopic lattice. Collective motions of the atoms in an iron rod or a steel wire can be heard as sound waves, initiated by striking or plucking a source. The individual atoms of iron are not evident, and the solid can be regarded as a continuum of mass, lacking any granularity. Here too there is no evidence of waves when the sample of matter contains only a few atoms, although collective motions are used for understanding rotations of molecules and vibrations of their atomic parts.

All fluids are at least slightly compressible—the molecules can be pressed more closely together briefly. As a result of such small incremental changes in pressure there can occur pressure waves, moving at the local speed of sound.

Sound waves traveling through gases, liquids, and solids are examples of longitudinal waves: There occur small, compressional density changes (displacements of molecular constituents) along the direction of disturbance travel.

Solids and their waves.

In solids there may occur not only compressional (longitudinal) waves but also transverse shear waves and torsional (twisting) waves, having displacements perpendicular to the direction the waves are traveling. Fluids, lacking the rigid structure of solids, do not exhibit transverse waves of shear and torsion. The mathematical description of all such waves requires parametrization of bulk properties (for example, mass density) but makes no reference to atoms. Nevertheless the collective behavior, of wave energy and momentum, can be regarded as comprising phonons as individual units (quasiparticles), akin to the photons of the electromagnetic field.

Solitary waves.

Localized traveling disturbances, so-called solitary waves (solitons) occur in a variety of situations, most famously as destructive ocean-borne tsunamis. Solitary waves in channels have become a source of surfing activity that was long confined to coastlines [Fin18]; though the wave itself is distributed over a large distance, its action can be localized to that of an individual surfer.

Heat.

The eventual fate of matter waves is as disorganized motions, first of wavelike collective actions (e.g. phonons) and finally as thermal jostling of the molecules that comprise the bulk matter. This eventual conclusion of organized waves constitutes heat. (The oft-mentioned “heat wave” of weather reports has no relationship to the waves described here.)

2.3.3 Electromagnetic waves

The electric and magnetic fields that jointly define electromagnetism (and photons) are governed by Maxwell's equations (see Appendices B and C), a set of PDEs that allow wavelike solutions. Waves of electromagnetic radiation, by contrast to waves of bulk matter, do not rely on any material medium; the notion of an all-pervading elastic “ether” as the carrier of these waves has long been disproven. Electromagnetic fields travel perfectly well through an idealized vacuum. In contrast to pressure waves of sound, elementary free-space radiation waves are treatable as transverse, meaning that the fields that form the radiation have their oscillatory variation in directions perpendicular to the wave-travel direction, as do shear waves of solids. Like all waves, these are spatially extended entities—a defining property of a field.

It is noteworthy that electromagnetic waves in dielectrics—or indeed, under conditions of tight focusing or any spatial constraint—need not be transverse, and a variety of novel field structures involving longitudinal fields become possible [Bek11, Bli12, Bli14, Bli15, Tor15b, Bli17] in consequence of considering the nature of field momentum density (and angular momentum density) in a dielectric [Pad03, Pad04, Pfe07, Bar10, Bar10b, Bar10c, Mil10, Gri12].

2.3.4 Wave character

The familiar examples of traveling waves listed above all have one thing in common: They have some attribute that increases (to a peak value) and decreases (to a minimum, perhaps zero) over some characteristic distance (a wavelength) during some characteristic time (a period); see Section 2.7.3. That is, a wave is a distributed disturbance—along a line or on a surface or in a volume—that may have characteristic scales (of distance and time), but that cannot be localized completely, just as the beauty of a landscape painting cannot be localized more completely than the area enframed. In this respect wavelike phenomena are quite different from particle-like phenomena. Only with collective behavior in bulk matter are such waves evident, not with a few particles. Section 2.7 provides more details.

2.3.5 Granular character

In recent years the divisional organization of professional physicists has come to include soft-matter physics, dealing with materials whose properties differ from those of both rigid solids and viscous liquids. There has also come recognition that sand, sugar, salt, and other aggregates of granules [Jae92, Jae96, Ned05, Meh07, For08, Her13] can be poured and shaped by containers, but have collective characteristics markedly different from liquids. Sand and salt particles lack the interparticle attraction found between water molecules and so exhibit no surface tension.

Aggregates of photons are not like any of these.

2.4 Free space: The Vacuum

It is common for physicists to imagine massive, moving objects, such as baseballs and rocket ships, as well as elementary particles such as electrons, to be replaced by hypothetical point-particles that have mass but no significant spatial extent around their center of mass (see Appendix A.1.2). Physicists also imagine space in the absence of objects of any kind: A vacuum, completely devoid of any matter or any particles—or even any radiation.12 Outer space is a start, but even in the vast distances between stars or galaxies we have evidence that there are still a few atoms and molecules to be found. Certainly there are now known to be various forms of radiation supplementing visible pinpoints of starlight. And one reads increasingly often of “dark matter” [Tri88, Bar01, Kha02, Ber05b, Clo06, McD11, Cha15c, Aru17] and “dark energy” [Pee03, Aru17], which are now regarded seriously as entities to be considered when dealing with the vast scales of the universe; see Section 6.1. Section 5.1.2 presents an updated view of the quantum vacuum, a busy place that is far from devoid of matter and radiation.

The notion of Free Space, a complete vacuum, is an idealization, rather like the Frictionless Plane on which textbook-simplified motions of macroscopic objects might occur. It is in such a free-space background that one might imagine radiation traveling, at the speed of light c, between encounters with bits of matter (a viewpoint often associated with Richard Feynman, see Section 8.3). It is in such an empty void that one imagines a single atom to be placed, motionless, as it interacts with radiation.

Atoms and radiation in free space.

Nowadays it has become possible to create situations very like this ideal: Single atoms or ions can be formed into beams in a vacuum chamber and slowed down, thereafter to be trapped in a cluster from which a single particle can be selected for study [Bla88, Phi98, Coh98, Jav06, Met12, Nev15]. Only relatively rarely will such a scrutinized particle be affected by unwanted collision with some contaminant particle or by any radiation other than that which an experimenter controls. So studies of single atoms or single ions, in otherwise empty space, are not the fantastic proposals they once were. Single atoms demonstrably exist and their internal electronic structure can be manipulated by radiation that an experimenter controls to a desired end, using techniques that are part of many physics-laboratory courses [Dem96, Dem98, Dem05] and whose theory I have explained in textbooks [Sho90, Sho11].

But what about radiation? Can it be regarded as made up of little increments of some sort, to be called quanta or photons? And if these exist, how are we to regard them and measure them, and with what mathematics are we to describe them?

These are questions that have occupied writings of many of the renowned physicists of the twentieth century. This Memoir describes my own encounter with those fundamental questions, and explains how my views (and those of the scientific community) were formed and have changed—and how our certainties have become less so as advancing technology has allowed more control and scrutiny of the weak radiation fields where individual photons are to be expected to be noticeable.

2.5 Forces and vectors

My formal introduction to the world of physics as a subject for study in classrooms came in a high school course in 1951. The presentation followed an expository route that had been in place for a century or so, and would continue for many more decades, a pattern that began with the notion of a Force—not, as has become part of entertainment culture, an intangible power for accomplishing good deeds (“the Force be with you” say the Jedi), but something measurable and reproducible.

2.5.1 Force

I learned that forces were not objects that could actually be seen, as could tables and chairs, but their effects could most certainly be observed. A push exerted by a playground bully would have had very definite consequences to me: I would find myself falling to the ground, perhaps acquiring an uncomfortable skinned elbow or knee. Forces caused playground swings to move. They caused the contact between bat and ball to advance the play of a baseball game. They moved chairs away from tables at mealtime. On and on the list could go. Forces, though not themselves visible, can definitely make themselves apparent. They are pushes and pulls that change the motion of an object, in a manner quantified by Newton's laws (see Appendix A.1.2).

2.5.2 Vectors

In high school I learned that there was a mathematics that went with the notion of force. It was very simple: A force has a strength (strong or weak) and a direction. If you were lifting a suitcase the direction was either up or down and the suitcase could be heavy or light. So your mathematics deals with magnitude (the strength) and direction. I learned that the technical term for this kind of object is Vector, something that has magnitude and direction. The simplest cases were those when the direction had only two possibilities, up and down, or left and right, or, when it was expressed as numbers along the real-number line, positive and negative.

We students learned about the mathematics of vectors primarily in order to deal with forces, and the trigonometry of resolving forces into components—two directions (sine and cosine components) for wind propelling a sailboat or for actions of gravity on ladders leaning against walls or, more generally, three directions in the space of objects completely free to move. And we considered corresponding vectors of position for the objects that the forces acted on, and for vectors that described changing positions—velocities (see Appendix A.1.1). Our vectors were (relatively) unconstrained in magnitude and direction, and were used to describe things like positions and velocities of everyday objects such as leaning ladders and baseballs and steam engines (which could still be seen in action). This was the world of Mechanics, regarded as a foundation of the larger study of Physics, but also a foundation for Mechanical Engineering—the stuff upon which a good career could be built. Vectors have not only intrinsic interest as a segment of mathematics; they can also have economic value.

Discrete spaces.

But we pupils did not give any thought to situations, such as describing the position of a checker on a checkerboard, that require only discrete whole numbers, integers that can, with some convention, specify which of the 64 discrete squares of a checkerboard is being considered. For the game of checkers the position vector of a piece has to be supplemented by a notion of the color, red or black. To treat a chess game the supplementary information must include not only the color, black or white, but the particular type of piece, pawn or bishop or king or queen or whatever. And the possible moves (the allowable discrete changes of position) of each piece can be described by a vector that leads from a starting square to a small set of possible ending squares. A checkerboard is an example of a flat (two-dimensional) space of finite size. Appendix A.3.1 discusses other everyday examples.

Abstract vectors.

The velocities and forces discussed above are special examples of Euclidean vectors having three or fewer components and a number of relationships between these, notably how these numbers change with an alteration of position or orientation of coordinates. It is Euclidean vectors that mechanical engineers work with and that occur in equations of motion for particles—equations whose solution provides predictions of particle positions (see Appendix A.1.1).

Only after completing my public-school education, which in those days included instruction in arithmetic, algebra, geometry, and trigonometry but little else from the vast world of Mathematics, did I encounter the notion of (abstract) vectors as ordered arrays of elements—numbers or symbols or even words. In this view a five-dimensional vector is an ordered list of five elements, say the numerical values of measured temperature at five different weather stations on a particular morning. Applied to a family supper table the five elements of an abstract vector might be two spatial position coordinates, ordered as north-south and east-west, to which are appended the name of a piece of tableware that has those coordinates, the name of the family member by whom it is to be used, and a yes-no answer to whether it has been used at that meal and therefore requires washing. Such a mixed-element vector is typical of what programmers might deal with in providing instructions for a computer app or a robot.
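
Aside: A mixed-element vector in code. As a sketch of how a programmer might encode such a supper-table vector (the field names here are my own invention, chosen only for illustration):

    from dataclasses import dataclass

    @dataclass
    class TableSetting:
        north_south: float    # position coordinate on the table
        east_west: float      # position coordinate on the table
        item: str             # name of the piece of tableware
        user: str             # family member who uses it
        needs_washing: bool   # was it used at this meal?

    fork = TableSetting(north_south=0.3, east_west=-0.1,
                        item="fork", user="eldest child", needs_washing=True)
    print(fork)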

Statevectors.

As will be discussed in Section 2.12 and more fully in Appendix A.3, the mathematics of abstract-vector spaces provides a relatively straightforward approach to describing quantum systems. A particular system, characterized by its parts and their energies, is associated with a statevector in an abstract space. A task for those who wish a quantitative description for photon-changing alterations of a quantum system is to picture the motion of the statevector in its setting of a multidimensional abstract space. This activity is not so very different from following changes to weather maps on which are displayed daily inputs from a set of weather stations.

2.5.3 Forces from fields

My high-school physics studies of electricity and magnetism introduced me to the names of (Charles-Augustin) Coulomb, (André-Marie) Ampère, and others, who had quantified the forces exerted by charges and currents on other charges and currents.

In principle these force laws allow one to evaluate the electric and magnetic forces that would act upon a small, idealized test charge, but in practice this evaluation is possible only for extremely simple, highly idealized systems of charges and currents. In general one must proceed as did Michael Faraday, who imagined the assorted charges and currents to be replaced by a spatial distribution of lines of force—an example of what physicists now call a Field. These space-filling force lines, of electric and magnetic fields, act as forces upon the test charges. I recall the fascination with which I played with iron filings on a sheet of paper above a magnet, as the patterns of filings revealed magnetic field lines between what I learned were north and south poles.

The introduction of fields as a mathematical replacement for distributed charges and currents is an essential step in treating electromagnetic phenomena. It allows us to replace clumsy action-at-a-distance calculations with two separate and simpler calculations: First evaluating the field outside of a localized system of charges and currents, and then evaluating the effect of this field upon test charges. Fields, spread throughout space, rather than sources (localized moving charges), become the center of theoretical attention.

Aside: Mathematicians' notion of a field. The quantities that physicists and engineers refer to as fields (typically a continuous distribution of force or energy) are examples of what mathematicians call manifolds. The typical mathematical approach to such things begins by defining a topological space in terms of a set of points along with a set of neighborhoods (or open sets) for each point. This leads to defining properties such as continuity (smoothness and differentiability) and connectedness (gaps and cavities). For mathematicians a manifold is a differentiable topological space that at each point locally resembles an n-dimensional Euclidean space ℝⁿ. The everyday physical fields of fluids and electromagnetism are, amongst other things, three-dimensional differentiable manifolds for mathematicians.

Discontinuities.

Physicists and engineers must often deal with idealizations in which there are abrupt (discontinuous) changes of field values with infinitesimal changes of position or time, or with singular field points at which field magnitude vanishes, so direction and phase have no definable value; see Section 4.1.4.

Superpositions.

An essential characteristic of the fields considered by physicists is the possibility of superposition: We can add the values of two fields at a specified position and time, to produce a third field. It is this property of fields that is seen as interference between their waves and the production of node patterns. There is no counterpart in descriptions of classical particles, whether or not they are endowed with size.

Fundamental forces.

Theoretical physicists like to say they have reduced the world of inanimate objects to the study of four fundamental force fields, to which all observable effects can be attributed. Two of these, known as the weak interaction responsible for radioactive beta decay and the strong interaction that holds atomic nuclei together despite the abundance of unbalanced positive charge of the protons, act only over subatomic distances and are of interest primarily to those whose research involves nuclear physics or to users of large charged-particle accelerators or to cosmologists who study the origin of the universe. They are considered to be important parts of any theory that attempts to explain Everything and their study has been rewarded by numerous Nobel Prizes over the decades. Appendix A.19 summarizes the Standard Model of particle physics that is regarded as the present frontier for such work. But these two interactions have negligible effect on household or university-laboratory experience.

Of the remaining two forces, both of them acting over indefinitely large distances, that of gravity is the more easily recognized. It is with us everywhere, unlimited in extent. We cannot shield against it nor alter its action—attraction between masses. On earth it dominates our daily activities: It keeps us standing and causes the flight of thrown balls to descend for possible capture.

The final force, that of electromagnetism, is responsible for all the non-gravitational forces we observe in our everyday lives. The chemical forces that draw atoms together into molecules, the forces of attraction between neutral molecules, these are all understandable results of quantum theory together with electrostatic interactions, as developed by chemists and described in their texts on atomic and molecular structure (see references in Section 9.2). There are two types of electric charge, positive and negative, and these are normally paired so that bulk matter is generally electrically neutral. For this reason the observed static fields outside bulk matter are usually magnetic rather than electric. It is with the extended, time-dependent electromagnetic field, able to break free, as radiation, from the moving charges that create it (see Section 2.8), that photons are to be found, and with which this exposition deals.

Although the strengths of gravity and electrostatic forces both diminish with the same dependence on distance, their influence appears in different regimes. Atoms and molecules are held together entirely by electric and magnetic forces. The attraction of gravity has negligible effect in the immediate surroundings of atomic masses and in dealings with atomic and molecular physics it is justifiably neglected, apart from causing atoms to fall into boxes of radiation. (As noted by Feynman [Fey63], the gravitational attraction between two electrons is smaller by a factor of around 4×10⁴² than their electrostatic repulsion.) By contrast, stars, solar systems and galaxies are held together by gravity: Because opposed electrical charges occur in pairs, with resulting exact cancellation of charge seen at a distance, electric forces need not be taken into account for these structures. Magnetic forces are not so cancelled, and their effects are seen, for example, in the flow of ionized gases of the solar atmosphere.
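
Aside: Checking the Feynman ratio. Because both forces fall off as the inverse square of the separation, the ratio of electrostatic repulsion to gravitational attraction between two electrons is independent of distance; a few lines of Python with the standard constants confirm the quoted order of magnitude.

    e  = 1.602176634e-19     # electron charge, C
    me = 9.1093837015e-31    # electron mass, kg
    G  = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
    k  = 8.9875517923e9      # Coulomb constant 1/(4*pi*eps0), N m^2 C^-2

    ratio = (k * e**2) / (G * me**2)
    print(f"F_Coulomb / F_gravity = {ratio:.2e}")   # about 4.2e+42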

Fields and quanta.

Quantum theory provides a connection between the wavelike properties of a field and the discrete quanta that carry the field attributes such as mass and electric charge. There is an intimate connection between the range of a force and the mass of the quanta associated with the force field. Short-range forces, such as those of the weak and strong interaction, have quanta of finite rest mass: The larger the mass, the shorter the range. Their field quanta are the bosons of the Standard Model of particle physics, see Appendix A.19.13

The electromagnetic field in free space, by contrast, extends indefinitely, with magnitudes falling inversely with distance from the source. The waves move in vacuum with speed of light c and the quanta of electromagnetism—the photons—therefore have zero rest mass.

Effects of gravity are described by the general theory of relativity, in which gravitational forces are attributed to curvature of four-dimensional space-time induced by large masses and photons travel along mass-distorted free-space paths; see Section 6.1.2. There is presently no generally accepted quantum theory of gravity. The quanta of gravity (gravitons, spin-two bosons) remain to be observed, although gravitational waves have been observed (with great difficulty).

2.6 Energy and heat

Although I realized as a schoolboy that forces were not visible, as were aggregates of particles, they certainly produced visible effects: They could cause motions of objects. Some forces were evidently steady but seemingly inactive: When you are standing on a structurally weak bench that breaks under your weight you learn that prior to the breakage there had been a force to keep you in place, but once that bench broke then the force of gravity, which had been there all along, was able to pull you steadily toward the ground, perhaps with uncomfortable consequences. But forces could also be deliberately transient: A push or pull might make its effect (some displacement) known but might cease thereafter.

Work and energy.

Could the effects of a transient force be stored in some way? That question led our physics instruction to the notions of Work and Energy. We were taught that when a force moved an object through some distance (carrying a pail of water up a hill as Jack and Jill allegedly did, for example) the product of the two numbers (the strength of the force and the distance moved) was to be regarded as Work, a technical term that we were not to confuse with Labor, which was what one had to do for payment or as a regular family chore—things like mowing a lawn or weeding the garden or just picking up toys. We schoolboys found it very amusing that these tasks, though they might be drudgery, were not Work as physicists knew that word. Work, for a physicist, meant a number, the product of force times distance. Labor, as economists use the term, is something with little connection to Work as defined in physics, but has everything to do with daily life.

The interesting thing we scholars learned about Work is that it creates stored energy—Potential Energy—that can later be turned into energy of motion—Kinetic Energy. Physicists have formulas for these two sorts of energy.

Aside: Energy formulas; mass. The kinetic energy of a structureless particle of mass m is proportional to the square of its velocity v,

E_kin = ½ m v².    (2.6-1a)

The potential energy is proportional to the distance moved against a force. A mass m raised a height h against the pull of gravity (acceleration constant g) has potential energy

E_pot = m g h.    (2.6-1b)

The slope of the position dependence of potential energy—height or displacement from equilibrium—is a force. It is this force that causes balls to roll down hills, and causes a clock pendulum to oscillate; see Appendix A.1.4.

In each energy there is a proportionality constant that expresses the amount of material undergoing motion—what the textbooks called mass and denoted here by m.

Conservation of energy.

The neat thing about the notions of energy, from the viewpoint of a teacher constructing exam questions or a student seeking intended answers, is that it enables you to figure out how fast your dropped hammer will be going when it hits the ground below your ladder or how fast your car can coast at the bottom of a hill of some given height. Such problems are solved by invoking the principle that energy can be converted from one form to another, combining eqn (2.6-1a) with eqn (2.6-1b). This principle—Conservation of Energy—provided a major argument for the existence of photons, at least as energy increments: If atom energies are quantized (see Section 3.2) then so should be the energies carried to and from them by radiation fields. (This was one of the ideas associated with the photons of Einstein, see Section 3.4, and of Dirac, see Chapter 4.)
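
Aside: A worked example. Equating eqn (2.6-1b) to eqn (2.6-1a) for the dropped hammer, the mass cancels and the impact speed is v = √(2gh). A short Python check, with an illustrative height:

    import math

    g = 9.81   # gravitational acceleration, m/s^2
    h = 3.0    # height of the drop, m (illustrative value)

    # m*g*h = (1/2)*m*v**2  =>  v = sqrt(2*g*h); the mass m cancels.
    v = math.sqrt(2 * g * h)
    print(f"impact speed: {v:.2f} m/s")   # about 7.7 m/s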

Aside: Energy units. Workers in atomic and molecular physics often express energies in electron volts, eV. This is the energy gained, or lost, when a single electron moves through an electrostatic potential difference of one volt. In SI units [see eqn (A.1-13)] 1 eV is approximately 1.6×10⁻¹⁹ joule (J). Energies required to alter internal structures of atoms and molecules are typically a few eV; see Section 3.3. The energy required to photoionize a single hydrogen atom is 13.6 eV. A photon having energy 1 eV would have an infrared wavelength of 1240 nm and a frequency of 241.8 THz; see the table on page 73 and eqn (3.3-1).
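
Aside: Verifying the quoted photon numbers. Using E = hν and λ = c/ν together with 1 eV ≈ 1.6×10⁻¹⁹ J, a short calculation reproduces the wavelength and frequency quoted above.

    h  = 6.62607015e-34    # Planck constant, J s
    c  = 2.99792458e8      # speed of light, m/s
    eV = 1.602176634e-19   # joules per electron volt

    E   = 1.0 * eV         # a 1 eV photon
    nu  = E / h            # frequency, from E = h*nu
    lam = c / nu           # wavelength, from lambda = c/nu
    print(f"nu = {nu/1e12:.1f} THz, lambda = {lam*1e9:.0f} nm")
    # prints: nu = 241.8 THz, lambda = 1240 nm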

Hamiltonians.

When I gained the use of tools from differential calculus I learned that force can be expressed as the slope (directed spatial derivative) of a potential energy—balls roll down hills of potential energy. Energy thereby becomes a key director of motion; see Appendix A.1.4. The set of rules that relate energy to such measurable properties as mass, height and speed (for example eqn (2.6-1a)) form a Hamiltonian.14 From the Hamiltonian for a particular collection of particles or other inanimate objects one can determine their possible motions and system changes—the falling of rocks and the creation of photons. Appendix A.1.6 provides details.

Friction and heat.

But textbook calculations that rely on conservation of mechanical energy, if they are to be applied to real-world activities, have to deal with the reality of Friction Forces. These are what bring a sliding object to rest. These forces convert motional energy into Heat. Every child knows about friction heat, and about heat from fires and stoves. Heat is readily detectable by our built-in body sensors. We learn very early in our lives not to touch the hot stove, and we recognize the pleasure we have when the home surroundings hold sufficient heat (as measured by temperature) to allow us to run barefoot outdoors. We know heat by the sweating it causes as the human body attempts evaporative cooling. We know cold as the absence of heat, and the need to prevent loss of body heat—accomplished by using layers of warm clothing. So heat is definitely something perceptible and measurable, like positions of objects (and potential energy) and like velocities of objects (and kinetic energy).

Temperature.

As I was to learn from post-college instruction in Thermodynamics, heat is disorganized energy, only partially available for doing useful work. On a microscopic scale, heat is the amount of energy in random motion of atoms and molecules; see Appendix B.7. Temperature is a measure of the average kinetic energy of these constituents of matter.

Aside: Boltzmann's constant. The numerical equivalence of an energy increment and a temperature increment (in degrees Kelvin, K) is given by the Boltzmann constant k_B:

k_B = 1.380649×10⁻²³ J/K,    1/k_B = 11604.5 K/eV.    (2.6-2)

In a gas of freely moving classical particles that are in thermal equilibrium with their surroundings every degree of freedom (DoF), meaning every independently specifiable, unconstrained coordinate in an equation of motion (see page 193), has an average kinetic energy of k_B T/2; see Appendix B.7.
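
Aside: Equipartition in numbers. A short check of the k_B T/2 rule at room temperature (the temperature chosen here is illustrative):

    kB = 1.380649e-23      # Boltzmann constant, J/K
    eV = 1.602176634e-19   # joules per electron volt

    T = 300.0              # room temperature, K
    e_dof = kB * T / 2     # average kinetic energy per degree of freedom
    print(f"k_B*T/2 at {T:.0f} K = {e_dof:.2e} J = {e_dof/eV*1e3:.1f} meV")
    # about 12.9 meV per degree of freedom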

Entropy.

Energy that cannot be converted into useful work (in the technical sense of force times distance), and is therefore lost for practical use, is quantified as Entropy, the amount of heat, as disorganized thermal motion, present in a sample of matter. The nineteenth century development of thermodynamics dealt with incremental changes of entropy as the ratio of incremental change of heat energy, during a reversible process, to the temperature T at which that change occurred. At a temperature of zero degrees Kelvin, the random thermal motion of matter constituents is minimal (it can never be completely absent), and this is taken as the zero point of entropy. Entropy need not be conserved: It can increase indefinitely. The notion of entropy, as a measure of randomness, is applicable not only to the statistical mechanics of particles (and photons) but to information [Sha48].

Chemical energy.

The rather obvious exchange of mechanical energy between potential and kinetic forms has a counterpart in the chemical energy atoms have by virtue of their arrangement into molecules. As youngsters with Gilbert Chemistry Sets learned, certain sorts of atoms when brought together could form compounds with release of heat. Violent explosions could occur as atoms rearranged themselves. My first school course in Chemistry taught us about exoergic chemical changes that gave off energy and endoergic changes for which energy had to be supplied. Compounds were seen to be stores of Chemical Energy.

In college I learned that chemical changes occurred by breaking and remaking the electron-mediated chemical bonds that gave structure to molecules and solids. Only much later did I learn that energy changes of atoms and compounds could take place in non-damaging, reversible increments, induced by photons and discussed in Appendix A.10.2. These changes amount to reversible rearrangement of the internal structure of atoms and molecules. They are the changes (“coherent excitation”) induced by laser radiation with which I have spent the last few decades, treated in expositions on coherent excitation [Sho90, Sho08, Sho11, Sho17]. Appendix A presents the mathematics used to describe these changes.

Energy sources: Atomic vs. nuclear.

Prior to the 1950s almost all of our household energy came from the heat of chemical reactions—rearranging the electronic structure of hydrocarbon molecules and breaking them apart. The earliest domestic uses of fire by cave dwellers came from the rearrangement of wood molecules into smaller gaseous molecules. In time, power-line supplies of electrical energy came ultimately from combustion of coal or natural gas. All of these combustion-heat sources rely on rearrangement of electrons within atoms, and so they can be accurately called “atomic energy sources”. The responsible molecules are broken apart (split) and their constituent atoms are rearranged, with release of stored chemical energy as heat. Studies of these processes are traditionally found in courses on Atomic Physics, Molecular Physics, and Chemical Physics.

The nuclei of the heaviest atoms, most notably uranium, will break apart into smaller “fission fragments” with release of energy and free neutrons when they absorb a free neutron from their surroundings. Such nuclear-fission reactions, and the heat energy obtained by “splitting the nucleus”, are accurately called “nuclear energy sources” when they are controlled in nuclear reactors. They are studied in courses titled Nuclear Physics. Used militarily they are “nuclear weapons”. The associated restructuring of the atoms that surround the newly created nuclei are but a minor portion of the energy changes. Ion accelerators such as cyclotrons are often called “atom smashers”, although it is atomic nuclei that are the targets for their disruptive action.

Conservation laws.

Energy is but one of the things that physicists list for conservation (in the absence of friction). The conservation of electric charge (carried in circuits by electrons) underlies the analysis of electric circuits by electrical engineers. As embodied in Noether's theorem (see Appendix A.18.5), conserved quantities are often associated with some symmetry in the equations of motion. Although photons are not conserved, a number of other electromagnetic quantities are conserved [Lip64, Kib65, Cam12, Bar14].

2.7 Equations of change: Particles and fluids

The mathematics of Physics (and its dependent Engineering world of compromise and optimization) deals, in large measure, with equations that describe how Systems (collections of interacting parts, say linked pieces of a machine) change with time. I shall frequently mention “equations of motion”, meaning equations that govern the rate of change in some property, often the location of a localizable particle or a fluid mass. For solid objects the equations express how particle positions (and velocities) change when subject to forces; see Appendix A.1.

Other equations describe how forces acting on particles change with distance. The changes of interest are typically expressed as incremental changes to some observable (say position or velocity) resulting from some infinitesimal change (a differential) occurring in an independent variable (space or time), and so these are differential equations. Calculus is the broad branch of Mathematics that deals with their solution.

When the system we treat has motion in several directions or has several parts, each of which can affect others, we deal with sets of several interdependent (coupled) differential equations that must be solved simultaneously. A variety of techniques find use in obtaining solutions, either numerically or in algebraic expressions involving various well-studied functions bearing names of mathematicians. Appendix B.2 discusses several examples.

The lengthy literature on photons suggests that, under appropriate conditions, they can be regarded as particles or as waves. In finding appropriate equations for photons it is therefore useful to look not only to the physics of particles but also to the physics where waves are to be found—to fluids and their wave equations.

Rates of change.

In discourse on motions there is always an unstated notion of “fast” and “slow”. It is always necessary to place these adverbs into some context. Recollected extremes from childhood include “faster than greased lightning” and “slower than molasses in January”. Applied to quantum-state change these distinctions are often termed impulsive or diabatic (fast) and quasistatic or adiabatic (slow); see Appendix A.13. In discussions of atomic processes a common reference interval is the lifetime of a given atomic state awaiting spontaneous emission, or the mean time between phase-interrupting collisions. In much of the discussion in this Memoir the changes to quantum systems are assumed to occur in a time interval that is shorter than a time that measures interruptions (decoherence time); see Section 5.1.

2.7.1 Particles: Newtonian mechanics

These (classical) equations describe, for example, how the gravitational force between two masses (or the Coulomb force between point charges) diminishes as they become increasingly far apart. The equations describe how the velocity of a massive object (the time derivative of a varying position) changes with time when forces act. For objects much larger than an atom (but smaller than the solar system) the common rule for calculating the rate of change in velocity v (i.e. the acceleration a) is that this equals the ratio of force to mass, a=F/m. This equation of motion, known as one of Newton's laws, is at the heart of what is called (nonrelativistic) Classical Dynamics (see Appendix A.1) to distinguish it from motions over distances comparable to the size of atoms, where the behavior of electrons is governed by Quantum Mechanics. Underlying these equations is the assumption that the mass-endowed objects of interest are localizable—that it is possible to measure, and thereby define, positions for all the constituents of a system. Quite often these can be idealized as having mass localized at a specifiable position (the center of mass, a mathematical point) and having angular momentum centered at that point. This localization assumption is the essence of what we regard as a “particle”. Appendix A.1 presents the needed equations.
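
Aside: Newton's law in a computer. The equation a = F/m lends itself to step-by-step numerical solution; the following minimal sketch (crude Euler stepping, constant force, illustrative values) shows the idea behind most trajectory calculations.

    m, F = 2.0, 10.0    # mass (kg) and constant force (N); illustrative
    x, v = 0.0, 0.0     # initial position (m) and velocity (m/s)
    dt = 0.01           # time step, s

    for _ in range(100):        # integrate for one second
        a = F / m               # Newton's law: acceleration from force
        v += a * dt             # update velocity
        x += v * dt             # update position
    print(f"after 1 s: v = {v:.2f} m/s, x = {x:.2f} m")
    # exact values are v = 5 m/s and x = 2.5 m; the small excess in x
    # is the discretization error of this crude stepping scheme.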

Particulate light.

Newton's laws of motion predict straight-line trajectories for particle motion when no forces act. In free space or uniform, transparent matter, unfocused light rays also follow straight-line paths. It is such ray paths that gave an early impetus for regarding light as corpuscular and which underlie lens-design software today. But the use of discrete ray paths in optical design does not mean invoking the discrete photons of Planck, Einstein, and Dirac.

2.7.2 Wave equations

For descriptions of continuously distributed properties such as mass, velocity, electric charge, and force, the appropriate equations of motion are Partial Differential Equations (PDEs) whose two or more independent variables bring a connection between changes in time and changes over distance. Particularly noteworthy for eventual connection with photons are changes that appear as waves—patterns of regularity in some property value; see Section 2.3.2.

Maxwell equations for radiation.

The essence of a particle is the notion of localizability. Radiation is rather different. The basic equations of electricity and magnetism, combining as electromagnetism (see Appendix B), are for electric and magnetic fields, envisioned as spatial distributions of forces that act on electric charges and which, in turn, are created by electric charges and currents. The fields themselves are, at every location and at every time, Euclidean vectors: They have a magnitude and a direction. And these vectors, one for the electric field and one for the magnetic field, may change over distance and with time. The changes, expressed as partial differential equations, are known as the Maxwell equations; see Appendices B.1 and C.2. They describe not only the static fields surrounding fixed point charges and permanent magnets but also radiation fields: Electric and magnetic fields that move through empty space at a constant velocity c, the velocity of light in free space. The basic Maxwell equations for fields in vacuum can be presented as four interconnected vector equations for six field-vector components. (Because they are partial differential equations, and because such things were regarded by curriculum writers as difficult mathematics, it was not until I was a graduate student that I saw the Maxwell equations in full.)

The four Maxwell equations are first-order equations, meaning that only first derivatives appear—three for spatial change, one for time change. For treatments of electromagnetic fields moving through space it is common to combine the four first-order equations into two equivalent second-order equations (i.e. equations involving second derivatives, see page 191), one for the electric field, another for the magnetic field (see Appendix B.1). These second-order PDEs are examples of wave equations: They have solutions that can describe either standing waves of stationary node patterns that form in enclosures or the traveling waves of moving nodal surfaces that appear as radiation beams and other constructs; see eqn (4.1-1).

Wave equations for electrons.

A consequence of the uncertainty principle, regarded as an expression of how Nature works, is the need to describe the behavior of electrons inside atoms as governed by a wave equation—an equation akin to those needed to describe electromagnetic radiation waves and known as the (time-independent) Schrödinger equation (see Appendix A.4.3). Its solutions are wavefunctions. The solutions to any wave equation show patterns, in space and time, of places where the wave amplitude vanishes (a node) and places where the amplitude is a local maximum (an antinode). As quantum theory informs us, the electrons within atoms behave very much like an electric vapor, one that exhibits the sort of interference effects—nodes and antinodes—that are characteristic of waves in general—that occur in vibrating drumheads and tuning forks or that travel around obstacles. Section 4.1 discusses wave patterns expected from the charge distribution of electrons bound in atoms.

Aside: The de Broglie wavelength. Any particle having momentum p and rest mass m₀, moving with speed v in free space (far from any centers of attraction), has a wavelike behavior quantified by the de Broglie wavelength λ_de,

λ_de = h/|p| = (h/m₀v) √(1 − (v/c)²).    (2.7-1)

This wavelength diminishes, and the particle becomes more localizable (pointlike) as its speed increases. For a particle with speed much less than the speed of light c this wavelength varies inversely with mass and velocity, λ_de ≈ h/m₀v, becoming large without limit as the particle slows. Such wavelike behavior is associated with the probability amplitude—the wavefunction—for any monoenergetic particle in free space; see Appendix A.5. Electrons are the lightest particles and so for given velocity their de Broglie wavelength is largest.
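
Aside: A de Broglie estimate. For an electron at a speed well below c (the value chosen here is illustrative), the nonrelativistic form of eqn (2.7-1) gives a wavelength comparable to atomic dimensions:

    h  = 6.62607015e-34     # Planck constant, J s
    me = 9.1093837015e-31   # electron rest mass, kg

    v = 1.0e6               # electron speed, m/s (v << c)
    lam = h / (me * v)      # lambda_de ~ h/(m0*v)
    print(f"lambda_de = {lam*1e9:.2f} nm")   # about 0.73 nm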

Wave-particle electrons.

An electron can be localized as being within the small volume of an atom. If we do not enquire about its position more accurately than that, it can be idealized as is done in textbooks on classical motions: Its mathematical description is that of an electrically charged point particle, having a sharply defined position for use with Newton's laws of motion (suitably altered at high velocities to incorporate special relativity).

2.7.3 Wave attributes

A great variety of optical effects, studied as Physical Optics, involve phenomena that can be best understood by endowing light with wavelike properties: wavefronts of valleys (including null values—nodes) and crests (antinodes) that, in free space, move with constant speed c, the speed of light in vacuum. When applied to optical radiation the wavelike properties refer to the electric and magnetic field vectors and to their squares, measured as radiation intensity, all of which are expressible as functions of space and time.

Wavelength and frequency.

The distance between nodes, or antinodes, of any wavelike pattern is the wavelength λ. At any fixed location a passing wavefront of wavelength λ oscillates with frequency ν. Successive waves need not be identical; during an interval when the cycles repeat exactly the wave is periodic.

Aside: Units. The unit of frequency ν, the hertz (Hz), is one cycle per second. The connection between frequency and wavelength of any disturbance is through a wave velocity. For light in free space this is the vacuum speed of light,15 c:

ν = c/λ,    c ≈ 300×10⁶ m/s.    (2.7-2)

Aside: Angular frequency. To avoid a plethora of 2π factors the mathematics of oscillatory behavior is commonly treated with angular frequency ω = 2πν, measured in radians per second. A full cycle of rotation is 2π radians. Angular frequencies accompany reduced wavelengths λ̄ = λ/2π = c/ω.
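
Aside: From wavelength to frequencies. A short check of eqn (2.7-2) and the angular-frequency definitions for a common red laser line (the wavelength is an illustrative choice):

    import math

    c   = 2.99792458e8      # speed of light, m/s
    lam = 633e-9            # wavelength, m (helium-neon laser red)

    nu      = c / lam              # eqn (2.7-2)
    omega   = 2 * math.pi * nu     # angular frequency, rad/s
    lam_bar = lam / (2 * math.pi)  # reduced wavelength
    print(f"nu = {nu/1e12:.1f} THz, omega = {omega:.2e} rad/s, "
          f"lambda-bar = {lam_bar*1e9:.1f} nm")
    # nu is about 473.6 THz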

Phase.

Idealized, single-frequency (monochromatic) light is a periodically varying electromagnetic field: Its characteristics repeat regularly after a time interval (the wave period) 1/ν. During one period the field value will vary between its two extremes. Starting from a minimum (or any other arbitrary fiducial value), the subsequent time interval is quantified by a phase, varying from zero to 2π (radians) at the next repeat of the fiducial value (one period). The moment when we start counting waves, say the first minimum value, fixes a phase for the enduring wave pattern: Further minima occur at times n/ν for integer n as long as the phase is unchanged.

The maintenance of constant phase, or a phase whose variation can be specified in closed form, makes the field coherent, and allows interference between field samples.

Interference.

Amplitudes of any sort of wave are additive, so waves from different sources will, upon meeting, exhibit interference, producing weak or null intensities (destructive interference) when amplitudes cancel and peak intensities (constructive interference) when amplitudes reinforce each other. Wave properties include both interference and diffraction—the spreading of wavefronts that pass through a small opening in an opaque screen.

The existence of interference is demonstrated in electromagnetism and in quantum theory by observations on squares of amplitudes. In electromagnetic theory the amplitudes are electric and magnetic field magnitudes and the squares are radiation intensities; see Appendix B.1.1. In the quantum theory of atom structure the squares are probabilities and the amplitudes are wavefunctions or probability amplitudes; see Appendix A.3.

When optical intensity is a superposition of two or more competing field amplitudes their magnitudes can either add (constructive interference) to give a large (bright) intensity (maximal at an antinode) or subtract (destructive interference) to give a small (dark) intensity, perhaps zero (a node). The resulting bright-dark pattern of measurable intensities, in space or in time, is termed optical interference; see Figure 4.1 on page 97.

Quantum interference acts in the same way. When a probability is a superposition of two or more probability amplitudes these may contribute either constructively or destructively, to produce a pattern, in space or time, of varying high and low probabilities.

When traveling photons are invoked to treat examples of optical phenomena we consider superpositions of possible pathways leading from a source to a detector. If the pathways are indistinguishable then we add amplitudes before squaring. This allows interference. If pathways can be distinguished, even if only in principle, then we add separate squares. There is then no interference.
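
Aside: Amplitudes versus squares. The rule just stated can be made concrete with two equal-magnitude complex amplitudes and a relative phase φ (the values are illustrative): Indistinguishable pathways interfere, distinguishable ones do not.

    import cmath, math

    def relative_intensity(phi, distinguishable):
        a1 = 1 / math.sqrt(2)                    # amplitude for pathway 1
        a2 = cmath.exp(1j * phi) / math.sqrt(2)  # amplitude for pathway 2
        if distinguishable:
            return abs(a1)**2 + abs(a2)**2       # add squares: no interference
        return abs(a1 + a2)**2                   # add, then square: interference

    for phi in (0.0, math.pi / 2, math.pi):
        print(f"phi = {phi:4.2f}: "
              f"indistinguishable -> {relative_intensity(phi, False):.2f}, "
              f"distinguishable -> {relative_intensity(phi, True):.2f}")
    # the interfering intensity swings between 2 and 0 as phi varies;
    # the distinguishable sum stays fixed at 1.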

Interference, and the underlying superposition of different wave amplitudes, is the defining characteristic of waves, just as localization is considered the defining property of particles. It is therefore with measures of optical interference—in space or time—that we must look for evidence that optical photons have a wavelike nature. As was stressed by Glauber [Gla63, Gla95], what interferes in quantum theory are probability amplitudes (for specified events, such as electric field strength), not some notion of particle-like “photons”. One might say16 “photons follow from fields”, not the reverse.

Energy quantization.

Both the time-independent Schrödinger equation for the stationary distribution of an electron within an atom and the second-order Maxwell equations for free-space radiation have solutions that exhibit the patterns of nodes and antinodes that are a defining characteristic of waves. Although the wave nature of radiation is evident in all the mathematics of the Maxwell equations, the wave properties of the electron, as given by the Schrödinger equation, come into play only when our concerns are with atomic-scale behavior. Then it becomes evident that something beyond classical mechanics is required. The most notable effect is that the energy of an electron, or groups of electrons, confined within an atom or molecule (or a nanoscale structure), can only take selected discrete values: The energies, combining potential and kinetic energy, form a discrete set and are said to be quantized.17 This quantization is evidence that, although the electron has properties of a classical particle, it also has wavelike properties that invoke quantum theory—it is a quantum particle.

Not only are energies of electrons within atoms quantized, so too are the vibrations of atoms in molecules and solids, and the unhindered rotations of molecules. As older textbooks explained, the discreteness of these energies first became apparent in measurements of specific heats, in which energy exchanges with a thermal environment produce a temperature change. The transferred energy may be not only radiation but kinetic energy from particle collisions.

2.8 Light: Electromagnetic radiation

Sunlight is so ubiquitous in our lives that it seldom is directly noticed—except when it is absent or overly abundant. One of the welcoming characteristics of sunlight, and of other sources of radiation, is that it provides heat. It is that absorbed heat that helps popularize lying on warm sand at a seaside beach. As I learned eventually, visible light, whether from the sun outside or from a glowing lightbulb indoors, is a form of electromagnetic radiation, as are radio waves and X-rays. All of them carry, and can deposit, energy. And they all have wavelike characteristics—they are delocalized disturbances.

Electromagnetic waves: Optics.

Already in the nineteenth century the unification of electricity and magnetism into electromagnetism by Maxwell provided the needed theoretical understanding for light as traveling waves of electromagnetic energy—of spatially distributed electric and magnetic field vectors; see Appendix B.1.1. Visible light was but one portion of the vast catalog of wavelengths that ranged from much longer radio waves and infrared (IR, the source of much solar heat) to shorter ultraviolet (UV, once termed “black light” because it was not visible to our eyes) and much shorter X-rays and gamma rays, and on without limit.

Although it is common to regard all visible radiation as comprising photons, the sources of this radiation are remarkably varied, and the nature of the light, as recorded by its spectrum (the distribution of frequencies, responsible for the appearance of color), is accordingly also quite varied.

Elementary sources: Accelerated charges.

Any attempt to understand radiation must, at some point, deal with the possible microscopic mechanisms that generate it. Once it became recognized that radiation was a traveling electromagnetic disturbance, a phenomenon governed by Maxwell's equations, it was possible to find a description of radiation sources as motions of electrical charges.

The simplest example, familiar to youngsters from demonstrations of electrostatics via shoe shuffling on carpets, is the electric field that surrounds a very localized electric charge (idealized as a mathematical point), say that of an electron. While the charge remains stationary the electric field, the Coulomb field, diminishes monotonically with distance, but when the charge undergoes acceleration, from whatever cause, the field falls more slowly: At large distances from the source charge it takes on characteristics we associate with traveling waves of radiation. Indeed,

all the radiation commonly encountered can be attributed to localized sources of accelerating charges.

Appendix C.1 quantifies the connection between acceleration and radiation, for point charges. The force that causes the acceleration may come from encounters with other charged particles (in a hot gas) or it may come from an electromagnetic field, either static or oscillatory (i.e. radiation). The more abrupt is the acceleration the higher are the dominant Fourier frequency components of the associated radiation (and the more energetic are the photons).
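
Aside: Abruptness and frequency content. The connection between abrupt change and high Fourier frequencies can be seen numerically: The shorter a pulse, the broader its spectrum. A sketch (Gaussian pulses of two illustrative widths, using numpy):

    import numpy as np

    t = np.linspace(-1.0, 1.0, 4096)       # time grid, s
    dt = t[1] - t[0]
    freqs = np.fft.rfftfreq(t.size, dt)    # frequency grid, Hz

    for width in (0.1, 0.01):              # pulse durations, s
        pulse = np.exp(-(t / width)**2)
        spectrum = np.abs(np.fft.rfft(pulse))
        csum = np.cumsum(spectrum) / spectrum.sum()
        f90 = freqs[np.searchsorted(csum, 0.9)]
        print(f"width {width:5.2f} s -> 90% of spectral weight below {f90:6.1f} Hz")
    # the tenfold shorter pulse has roughly tenfold higher frequency content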

The acceleration need not be steady; it can undergo reversal. When the source-motion reversals are periodic (as occurs with a harmonic oscillator, see Appendix A.1.7) the field carries this frequency and is idealized as a monochromatic (i.e. single color) traveling wave; see Appendix B.2.2. This simple model of a charge (say, an electron) oscillating around an equilibrium position and thereby radiating, guided much of the thinking about radiation sources that underlay the early proposals for discrete increments of radiation; the model, originated by Hendrik Antoon Lorentz and subsequently justified for a two-state quantum system, is discussed in Appendix A.14.4.

When the accelerating charges are bound within an atom, molecule, or other composite the radiated electromagnetic energy originates with loss of structural energy (potential and kinetic) of charges within their confinement—a change of rest energy (see Section 8.3.1) of the composite. Because states of internal motion exhibit identifiable quantum characteristics such radiation will also have measurable quantum character. Alternatively, the charges may be unbound, unconstrained in momentum and energy. The fields from such acceleration draw upon changes in unquantized kinetic energy.

Quantum theory does not alter the prediction that accelerating charged particles will radiate, but it imposes an important proviso: When the motion is constrained, as it is for electrons bound in an atom or molecule, then the system change can only take place between discrete quantum states. Furthermore, the state of lowest energy will not radiate; it is a stable state.

Not all acceleration produces radiation. Macroscopic objects that are charged but structured may create radiation fields in which there is phased destructive interference from different parts, so the overall external radiation is not seen [Abb85]. It is possible to design bulk matter so that local induced dipole moments contribute destructively to incident radiation, thereby rendering objects (nearly) invisible [Kra19].

Radiation sources can be categorized in two broad classes, incoherent (disorganized) and coherent (organized), distinguished by the arrangement of the accelerating charges within the source. Light signals emerging from these classes of sources maintain their coherence properties. During the time interval when a signal is coherent it is predictable. The irregularity of an incoherent signal prevents reliable prediction for times longer than its coherence time. The distinction between coherent and incoherent sources is important when considering the nature of the photons associated with the radiation. The following paragraphs define, and illustrate, the difference.

2.8.1 Coherence

As Newton first demonstrated by passing sunbeams through prisms, visible light, from whatever source, can be separated into a range of colors, each quantified by a specific frequency or wavelength. It is also possible to pass light through optical filters that transmit only a very small range of frequencies.

Idealized, single-frequency (monochromatic) light is a periodically varying electromagnetic field: Its characteristics repeat regularly after a time interval 1/ν. The moment when we start counting waves, say the first peak value, fixes a phase for the wave: Further peaks occur at times n/ν, for integer n, as long as the phase is unchanged.

Experience with actual light sources never meets this ideal periodicity, if only because the observations do not continue indefinitely; see Appendix B.5.1. The deviation from perfect periodicity—a perfectly coherent wave train—can occur for many reasons. Beams of frequency-filtered sunlight and beams from lasers differ qualitatively in their coherence—the time duration of perfect periodicity—and these differences are responsible for observable qualitative differences in how their light affects single atoms. In brief, laser light is able to produce coherent excitation [Sho90, Sho11, Sho17], see Section 5.4, whereas sunlight and incandescent light cannot produce such effects, see Section 5.2.

With incoherent sources it is possible to use filters that restrict the frequency to a narrow range (a narrow bandwidth), but it is not possible to eliminate all evidence of the underlying randomness of the many emitting electrons whose combined output produces the light. This remains evident in measurements of correlation between field values at different times; see Section 6.5.2 and Appendix B.5.3.
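Aside: A numerical caricature. The distinction just described can be mimicked in a few lines of Python (a sketch with arbitrary parameters, not a model of any particular source): a perfectly periodic wave remains correlated with itself after long delays, whereas a wave whose phase jumps at random times loses that correlation beyond its coherence time.

```python
import numpy as np

rng = np.random.default_rng(0)
nu = 5.0                           # wave frequency (arbitrary units)
t = np.linspace(0.0, 20.0, 20000)

# A perfectly coherent wave keeps one phase for all time.
coherent = np.cos(2 * np.pi * nu * t)

# A chaotic wave suffers random phase jumps (roughly one per unit time),
# mimicking emission interrupted by collisions in a hot gas.
phase = np.zeros_like(t)
for t_jump in np.cumsum(rng.exponential(1.0, size=50)):
    phase[t >= t_jump] = rng.uniform(0.0, 2.0 * np.pi)
chaotic = np.cos(2 * np.pi * nu * t + phase)

def correlation(field, lag):
    """Normalized correlation of the field with itself `lag` samples later."""
    return np.mean(field[:-lag] * field[lag:]) / np.mean(field**2)

lag = 5000  # a delay of 5 time units, many wave periods
print(correlation(coherent, lag))  # stays near 1: still predictable
print(correlation(chaotic, lag))   # near 0: predictability lost
```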

2.8.2 Incoherent sources

Amongst the most common readily available sources of visible light are the following. In each of these the motions of the elementary charges responsible for the radiation do not maintain any regular pattern and their individual fields combine irregularly. The bulk radiation is therefore disordered.

Sunlight.

The most ancient of our light sources is undoubtedly that from the sun. As we now understand, the light from this and other stars is energy from a heated gas: Unconstrained collisions between atoms, ions, and free electrons produce a very broad spectrum of frequencies (very close to a “blackbody” distribution) as energy alternates between kinetic energy of moving charges, potential energy of bound electrons, and traveling radiation.

Firelight.

Controlled-burning fuel—a candle, a gas flame, a fireplace log—produces a flame of glowing gas whose radiation originates in the spontaneous emission of light from electrons of heated atoms and molecules undergoing chemical changes. The numerous constituent frequencies of the light are characteristic of the fuel composition.

Gas discharges.

Several generations of experimental physicists made use of light that was generated by gas discharges: A glass tube that enclosed two voltage terminals, between which there was established a suitably high voltage. Electrons bound to gas molecules would, under the influence of the electric field, emerge from their binding molecule, accelerate, and collide with other electrons (bound or free) to create an ionized vapor. The various processes would equilibrate, with some electrons becoming rebound and releasing their energy as radiation. The frequencies of the light were an indicator of the chemical composition of the gas.

Incandescence.

Any heated metal will, as the temperature rises, be seen to glow; first red, then orange, then blue-white. The charges responsible for this light are electrons, confined to the metal. The light spectrum, like that of the sun, is very broad, following closely a blackbody distribution characterized by a temperature. Until recently the most common sort of household lighting apart from flames, dating back to invention by Thomas Edison, relied on the glow of a metal filament heated by electrical current inside an evacuated glass envelope—a light bulb. Thermal motion of electrons (in the filament) underlies the emission of radiation, termed incandescent light.

Fluorescence.

A common alternative form of retail and workplace light comes from fluorescent lamps. The light in these originates with a glowing, electrically-driven gas discharge that produces ultraviolet (UV) light. The walls of the translucent glass enclosure are coated with compounds that absorb this light and reemit the energy, by spontaneous-emission fluorescence, in a range of other frequencies that partially fill out the visible spectrum. The accelerated charges are electrons, localized within atoms.

Phosphorescence.

Some materials, having been exposed to light, will thereafter glow with delayed emission known as phosphorescence. The source atoms have been placed, during their exposure to illumination, into metastable excited states that decay only slowly, in the same way that radioactive sources emit their radiations. Here too, the motions of confined electrons produce the radiation.

Chemiluminescence.

Various chemical reactions will produce products in states of electronic excitation that will subsequently emit light; glow sticks are a familiar example. This phenomenon is occasionally seen in nature as a will-o'-the-wisp. When the process occurs in a living organism it is known as bioluminescence.

Light-emitting diodes.

Nowadays much of the illumination we see originates with light-emitting diodes (LEDs) rather than the incandescent lightbulbs that once filled our household light sockets and automobile headlights. The light from these devices, like the incandescent or fluorescent sources that preceded them, is incoherent: It originates with spontaneous emission by electrons undergoing energy change in the spatial interface between n-type and p-type semiconductors, that is, a region between a solid whose structure makes available negative electrons (n-type) and one that has electron vacancies, or positive holes (p-type). What emerges is light of a frequency associated with a particular transition and spatially distributed throughout the transparent interface region. (White-light output is obtained either by incorporating three distinct frequency transitions or else converting a portion of blue light into yellow light by means of phosphors that absorb and reemit light.)

Bremsstrahlung.

Any collision between charged particles involves an acceleration and hence the generation of radiation. The resulting radiation, known as Bremsstrahlung (German for “braking radiation”), occurs for inelastic scattering of any pair of charged particles. An example occurs when a free electron passes by an atomic nucleus: The closer the approach, the greater the acceleration and the more complete the conversion of kinetic energy into radiation. A beam of electrons impinging on a metal target undergoes a variety of collisions and will emit a corresponding range of frequencies, up to the frequency fixed by the total incident kinetic energy of the electrons. Bremsstrahlung from beams of kilovolt electrons is the source of the continuum of frequencies found in X-ray tubes.
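Aside: The high-frequency cutoff. Anticipating the energy quanta of Section 2.9, the maximum Bremsstrahlung frequency follows from assigning the entire kinetic energy eV of an electron, accelerated through voltage V, to a single radiation increment of energy hν; this relation is known as the Duane-Hunt limit. A short Python evaluation (the 50 kV tube voltage is merely an assumed example):

```python
# Duane-Hunt limit: the highest Bremsstrahlung frequency occurs when an
# electron's entire kinetic energy e*V converts into one radiation quantum.
h = 6.62607015e-34    # Planck constant, J s (exact)
e = 1.602176634e-19   # elementary charge, C (exact)
c = 299792458.0       # speed of light, m/s (exact)
V = 50e3              # accelerating voltage of an X-ray tube, volts (assumed)

nu_max = e * V / h
print(f"{nu_max:.2e} Hz, lambda_min = {c / nu_max * 1e12:.1f} pm")
# -> ~1.21e19 Hz, ~24.8 pm: well into the X-ray regime
```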

The stopping of electrons by solids also accompanies collisions that eject tightly bound electrons, leaving holes in the atomic structure. In refilling those vacancies the rebinding electrons emit discrete energies associated with the vacancies. These appear as characteristic X-rays on the continuous background.

The Bremsstrahlung occurring in thermal collisions in a hot plasma is known by astrophysicists as free-free radiation, between unquantized scattering states of free electrons. Free-bound radiation originates with an unbound electron and concludes with the electron in a discrete bound state, a process termed radiative recombination. The inverse of this process, photoionization, is known as bound-free radiation.

Proton-induced X-ray emission (PIXE).

Although beams of protons in cyclotrons have traditionally been used to induce nuclear reactions (and were part of my thesis research [Sho61]), in recent years they have found new uses in non-destructive chemical analysis. A proton, passing a target atom, transfers kinetic energy to one of the tightly bound inner-shell electrons, expelling it from the atom. The filling of this vacancy by other electrons leads to emission of discrete-energy photons whose frequency provides a unique connection to the chemical element. The perfected technique, known as proton-induced X-ray emission (PIXE), is capable of determining the chemical composition of the ink used on individual letters of ancient manuscripts without harming or discoloring the document, a capability that was used in providing remarkable information about the printing procedures used by Johannes Gutenberg [Cah80, Cah81, Dav19]. The message conveyed by these freshly minted photons, about chemical composition of ink and paper, offers a vision of daily life from centuries past, told well by Margaret Davis [Dav19].

2.8.3 Coherent sources

The disorganized individual radiators of the sources listed above will, in turn, produce disorganized, incoherent radiation. We can expect them to produce single photons at random or uncorrelated crowds of photons. A number of radiation sources offer examples of radiation from organized groups of electrons, or from individual electrons. These sources have particular interest for studies of individual photons.

Radio sources.

The first controllably generated coherent electromagnetic waves outside the visible portion of the spectrum were those, originally called Hertzian waves after their demonstrator Heinrich Hertz, in the spectral region now termed radio waves. They range from extremely low frequency (ELF, 3 Hz to 30 Hz) to tremendously high frequency (THF, 300 GHz to 3 THz). Radio waves are generated by oscillating electric currents—collective periodic accelerations of electrons—in an antenna, and are detected as comparable collective oscillations of electrons in a receiving antenna. For communication purposes the basic periodic charge oscillation (the carrier frequency) is given modulation, in amplitude, frequency or phase, in which the information is encoded.
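Aside: Modulation in miniature. The encoding of information onto a carrier can be made concrete with a short Python sketch (the 1 MHz carrier, 1 kHz tone, and modulation depth are assumed illustrative values, not tied to any source described here). Amplitude modulation multiplies the carrier by a slowly varying envelope; the spectrum then shows the carrier flanked by sidebands offset by the message frequency.

```python
import numpy as np

fc = 1.0e6       # carrier frequency: 1 MHz, an assumed AM-band value
fm = 1.0e3       # message tone: 1 kHz
m = 0.5          # modulation depth (kept below 1 to avoid over-modulation)

t = np.linspace(0.0, 5.0e-3, 200_000, endpoint=False)  # 5 ms of signal
message = np.cos(2 * np.pi * fm * t)        # the information to be encoded
carrier = np.cos(2 * np.pi * fc * t)        # periodic electron oscillation
am_signal = (1.0 + m * message) * carrier   # amplitude modulation

# The spectrum shows the carrier at fc flanked by sidebands at fc +/- fm.
spectrum = np.abs(np.fft.rfft(am_signal))
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])
print(sorted(freqs[np.argsort(spectrum)[-3:]]))  # [fc - fm, fc, fc + fm]
```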

Cyclotron and synchrotron radiation.

Charged particles that are constrained by static magnetic fields to move in circular orbits, as happens in cyclotrons, undergo centripetal acceleration that is accompanied by cyclotron radiation. When the particle speed approaches the velocity of light the required machine becomes a synchrotron, and the radiation is termed synchrotron radiation. Quantitative description of this type of radiation makes no use of photons, other than the name.

Cherenkov radiation.

Charged particles in free space undergo abrupt deceleration when they strike the boundary of material where the speed of light is slower than the particle velocity. This deceleration produces Cherenkov (or Čerenkov) radiation, an electromagnetic version of a shock wave. The quantitative description of this process has no need for photons.

Atomic-transition radiation.

The electrons bound together in an atom or molecule are forever in motion in orbits around atomic nuclei. In this motion they are constantly acted on by centripetal Coulomb forces that produce acceleration toward positively charged centers of attraction. This acceleration of charged particles can be regarded, in the viewpoint of radiation-reaction theory [Ack73, Sen73, Ack74, Mil75, Dal82, Ser86, DiP12, Bur14, Mil19], as producing the radiation seen as spontaneous emission from excited states of motion. In simple situations the emission from a single atom transition produces a single photon. In other situations a single atomic transition will produce two photons whose total energy is equal to the change of atom energy—a two-photon transition.

Single-atom sources of single photons.

As lasers became commonplace tools in optics laboratories they made possible a variety of procedures in which single trapped atoms could be caused to emit single photons. Such sources require particular controls of the environment around the atom and particular control of its quantum state [Sho90, Sho08, Sho11, Sho13]; see Appendix A.10.2.

Lasers and masers.

The word “laser” began as the acronym LASER, for light amplification by stimulated emission of radiation; it refers to the mechanism responsible for the light: An initial spontaneous-emission event, converting internal energy of an atom or molecule into radiation, acts to induce similar energy conversions as it travels through matter where such potential energy is present—a set of actions proposed by Einstein. Such radiation was first studied with microwaves, where the devices were known as masers. Unlike the incoherent sources listed above, laser radiation can be very directional (though it need not be [Hor12b]) and nearly monochromatic, at a frequency that is dictated in part by the originating transition [Scu66, Sar74, Sie86, Mil88, Sal19]. Section 5.3 summarizes the physics underlying maser and laser radiation.

Particle annihilation; Positron emission tomography (PET).

Various processes act to produce anti-particles to the stable electrons and protons that comprise the matter of our surroundings. When an anti-particle encounters one of its particle counterparts, the two rest masses are converted into radiation energy in accord with the Einstein formula E = mc².

For example, one form of radioactivity alters nuclear structure by emitting a positron, the anti-particle to the electron, and turning a bound proton into a neutron. Just as all electrons are indistinguishable, so too are their antiparticles, and any pairing will result in annihilation of the pair. When positrons have little kinetic energy the energy of a single electron-positron annihilation appears as a pair of gamma rays whose characteristics, taken with those of the positron, conserve energy, momentum, and angular momentum. Particle annihilation of a single positron-electron pair produces a single pair of photons.

The positrons emitted from a radioactive isotope embedded in matter—say human tissue—travel only a very short distance before encountering an electron and converting to gamma rays. The source of these photons is therefore very close to the positron source, and so by imaging the photons it is possible to localize an isotopically labeled piece of tissue. This procedure is known as positron emission tomography (PET).
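The energy of those gamma rays follows directly from the Einstein formula; a short Python evaluation (a sketch using CODATA constants) recovers the familiar value near 511 keV for each member of the pair.

```python
# Each gamma ray from a slow electron-positron annihilation carries half
# of the converted rest energy E = m c^2 of the pair.
m_e = 9.1093837e-31     # electron (and positron) rest mass, kg
c = 299792458.0         # speed of light, m/s (exact by definition)
eV = 1.602176634e-19    # joules per electron volt (exact)

E_pair = 2.0 * m_e * c**2                     # both rest masses become radiation
E_gamma_keV = (E_pair / 2.0) / eV / 1.0e3     # energy of one gamma ray
print(f"{E_gamma_keV:.1f} keV per gamma ray") # -> 511.0 keV
```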

Quasistatic fields.

As the carrier frequency ω of a traveling wave tends toward zero the wavelength increases without bound (tends to infinity). Such fields are best treated as static or quasistatic (slowly varying) interactions. The Coulomb repulsion between electrons and the Coulomb attraction between electrons and nuclei are the primary examples of such fields (but see page 242 for other examples). Their dominant effects, dealt with in nonrelativistic evaluations of atomic and molecular structure—the bare energies En—are commonly evaluated without the need of individual photons; but see Section 8.3.

2.8.4 Visualizing chaotic vs. coherent; Eddington's photons

In contemplating the two extremes of thermal light and laser light as sources of photons it is instructive to read a delightful description “The inside of a star” written in 1926 by Sir Arthur Eddington [Edd59] and reprinted in [Sho90]:

The inside of a star is a hurly-burly of atoms, electrons and aether waves. We have to call to aid the most recent discoveries of atomic physics to follow the intricacies of the dance. We started to explore the inside of a star; we soon find ourselves exploring the inside of an atom. Try to picture the tumult! Disheveled atoms tear along at 50 miles a second with only a few tatters left of their elaborate cloaks of electrons torn from them in the scrimmage. The lost electrons are speeding a hundred times faster to find new resting-places. Look out! there is nearly a collision as an electron approaches an atomic nucleus; but putting on speed it sweeps around it in a sharp curve. A thousand narrow shaves happen to the electron in 10⁻¹⁰ of a second; sometimes there is a side-slip at the curve, but the electron still goes on with increased or decreased energy. Then comes a worse slip than usual; the electron is fairly caught and attached to the atom, and its career of freedom is at an end. But only for an instant. Barely has the atom arranged the new scalp on its girdle when a quantum of aether waves runs into it. With a great explosion the electron is off again for further adventures. Elsewhere two of the atoms are meeting full tilt and rebounding, with further disaster to their scanty remains of vesture.

As we watch the scene we ask ourselves: Can this be the stately drama of stellar evolution? It is more like the jolly crockery-smashing turn of a music-hall. The knockabout comedy of atomic physics is not very considerate towards our aesthetic ideals; but it is all a question of time-scale. The motions of the electrons are as harmonious as those of the stars but in a different scale of space and time, and the music of the spheres is being played on a keyboard 50 octaves higher. To recover this elegance we must slow down the action, or alternatively accelerate our own wits; just as the slow motion film resolves the lusty blows of the prize-fighter into movements of extreme grace—and insipidity.

In 1990 I wrote the following counterpoint to that lively description of thermodynamic equilibrium [Sho90]: “Eddington's charming prose presents a vivid mental picture of the excitation processes occurring in most hot gases, although the word `photon' has now replaced his phrase `aether wave'. As a portrayal of the microscopic behavior in such a milieu, his words remain as reliable a description today as they were when first written in 1926. But whereas an atom in Eddington's star must respond to frequent collisions with particles from chaotic surroundings, atoms exposed to laser light encounter primarily a procession of nearly identical photons. An atom encounters one of these photons far more frequently than it collides with other perturbers. Indeed, the encounters are so frequent that the discreteness of energy quanta becomes blurred. We can best view this situation as a scene dominated by a traveling electromagnetic wave. The periodic electric field of this wave forces sympathetic oscillations of the electrons. The result, coherent excitation, is a far more harmonious activity than Eddington imagined occurring inside a star. Eddington's words give no hint of the diversity of effects that occur under conditions of coherent excitation—the contemporary world of quantum optics. Were Eddington offering today a metaphor for coherent excitation he might suggest the following”:

Imagine a marching band on parade. In the distance we hear first the regular pulsations of the bass drum. Then the brasses are heard harmonizing a melody. Soon the full orchestration becomes apparent. As the players come into view we notice the precision of their march. Each left foot rises in unison, as if moved by some irresistible force—the feet of the musicians seem to be locked together with rigid but invisible bars. Soon even a few spectators can be seen tapping feet in time to the music.

As we watch this scene we ask ourselves: How is it that these individuals become, for a few moments, part of a musical machine? What force periodically propels each foot? We know the answer: It is the rhythm of the music, directed by the drum major, that provides this invisible bond between musicians. Each individual acts so as to fit coherently into a pattern produced by an unseen but audible carrier melody.

One must proceed with care in placing reliance upon such pictures and analogies. Unlike the marching band, which always has a fixed number of members during any performance, a laser beam does not have a precisely defined number of photons; see Appendix B.4.3. It has a large average number and so individuals are not readily discerned—they seem merely an organized crowd. Just as we know that a distant, indistinctly seen mountain is clothed with unresolved, individual trees, we know that radiation ultimately comprises photons. This Memoir aims to explain just what that assertion means—how we are to regard the photon trees of the radiation forest.

2.8.5 Pencil beams; Rays

Radiation can be formed, by means of mirrors and lenses and a succession of apertures (small openings) in opaque screens, into pencil-sized beams that travel, when unconstrained, along straight-line paths (at least to a first approximation) [Car74, Dav79, Sim16]. These are idealized as light rays, treated as mathematical lines. Radiation from a laser quite naturally forms a pencil beam, as does light from individual stars. The radiation itself, during passage from its source (e.g. the sun, a street lamp, or a laser pointer) is visible only by scattering from dust particles along an air path (or a path through some translucent medium such as milk) or from surfaces such as furniture or street pavement. Its arrival can be evidenced in many ways, including reflection and absorption (to become heat).

Contemporary designers of optical devices rely on modeling their assorted lenses and mirrors by software that carries out extensive ray tracing through optical paths—so-called Geometric Optics. Superficially their procedures resemble the use of what Newton regarded as light: Streams of colored “corpuscles” that followed straight-line paths in free space and bent paths upon entering or leaving transparent materials. This model offered a simple explanation for how lenses act and how a beam of white light became dispersed, by a prism or grating, into constituent beams of colored light.

Nowadays it is common to find optical beams traveling through optical fibers, where confining action of refractive-index variation guides narrow beams of light along curved paths, thereby eliminating the need for beam-directing mirrors in laboratories and allowing long-distance optical communications; see Appendix C.3.2. Here too one can imagine finding photons, as indivisible increments of energy.

Beam characteristics.

Beams of laser radiation are found not only in optics laboratories but as pointers in lecture halls and scanning devices at retail checkouts. Such beams, as well as beams of incandescent radiation and the sunbeams with which Newton experimented, have three common attributes [Car74, Dav79, Sim16].

Direction: A pencil beam is characterized first by its direction—idealized as a three-component wavevector k that specifies the propagation direction of the electromagnetic-field wavefronts; see Appendix B.2.2.

Radiation which is confined to a narrow beam has a well-defined dominant wavevector, pointing along the beam axis (the longitudinal direction), but it must incorporate a superposition of many wavevectors in the transverse direction in order to produce the confinement; see Appendix B.2.2. It is therefore very different from an idealized simple plane wave, which extends indefinitely and uniformly in the transverse direction—termed Nadelstrahlung (needle radiation) by Einstein and others (the needle-like coordinates are in momentum space). Beams also contrast with Kugelstrahlung (literally “ball radiation”) or spherical waves (multipole fields, see Appendix B.2.6).

Color: Next its specification requires some measure of its frequency composition—its colors;18 see Appendix B.5.1. The frequencies may be spread over a broad range, as with sunlight, or they may be concentrated within a narrow range, as with laser radiation (idealized as monochromatic). The distribution of frequencies present in the intensity is the spectrum of the light. A monochromatic light wave having wavevector k has angular frequency ω = c|k|. Thus color and propagation direction together require only three numerical parameters for complete specification.

Phase: As with any wave, a beam of coherent radiation can be assigned a phase—a number that specifies the moment during any period that is to serve as the origin for counting waves. The phase of a single beam can be chosen arbitrarily, but the difference in phase between two beams is determinable by interference measurements if they are coherent.

Polarization: Finally, light beams may be polarized—linear, circular, or elliptical. (Circularly polarized light is said to have a definite helicity: Positive helicity for right-circular polarization.) This attribute (demonstrable with polaroid sheets or sunglasses) is the time-averaged direction of the electric-field vector, perpendicular to the propagation direction (see page 298). At every position and every instant the electric field has a definite direction, but on average there may be no preferred direction: Unpolarized light is an incoherent mixture of polarizations, a time average of randomly varying electric-field directions. Appendix B.1.7 discusses intermediate situations of partial polarization as quantified by Stokes parameters.19

2.8.6 Elaborate fields

In recent years both theory and experiment have greatly enlarged the variety of beam-like classical fields being considered. Appendix B.2.1 mentions some of these, with references. But beams are only one of the many examples of radiation fields that contemporary optical techniques can prepare. A review article lists the many ideas that are under active investigation [Aie15]. Each of these fields offers possibilities for defining photons.

2.9 Possible radiation granularity; Photons

Discreteness in daily life is by no means uncommon. Until recently all cash financial transactions had to make use of a discrete set of values for the bills and coins that are to be found in all traditional monetary systems.

One might ask, as did philosophers of ancient times who thought about the structure of matter, whether there might be an ultimate granularity of radiation, as has been demonstrated for matter with its base of atoms: Might light rays possibly comprise some collection of massless elementary units—perhaps the corpuscles of Newton—just as matter is known to comprise collections of electrons and atomic nuclei? A stream of water flowing out of a faucet can be demonstrated to comprise discrete molecules. Can a flashlight beam somehow be demonstrated to comprise granules of light?

An affirmative response came in the early twentieth century. The hypothesized elementary units of radiation, the radiation Quanta, would eventually be termed Photons. They would be counterparts to the electron, the elementary unit of negative electricity, and to the atom, that elementary bit of matter. Presumably whatever definition might be used, a photon, in common with an electron, cannot be further split into parts.

Particles and rest mass.

However, there are important differences between radiation and electrons, or other bits of everyday matter. Most obviously, the traditional particles that make up matter can not only move but can also be brought to rest. And at rest they have nonzero rest mass—the characteristic that gives numerical magnitude to potential and kinetic energy. Radiation, on the other hand, travels through free space always at a constant velocity, the speed of light c; see eqn (2.7-2). Unlike atoms and electrons, free-space radiation increments can have no rest mass. However, radiation traveling through bulk matter, whether transparent or opaque, will travel more slowly and will acquire other characteristics that differ fundamentally from radiation in a vacuum; see Appendix C.

The notion that radiation might be regarded as being composed of massless particle-like entities was for some time quite perplexing. George Darwin wrote in 1932 [Dar32]:

… it is quite unlike any known particle to come into existence and later to disappear without a trace.

Nowadays, with the possibility of experiments using very high-energy particle accelerators, and the relativistic theory of particle-antiparticle pair production, even this seemingly outrageous proposal has become an accepted part of nuclear chemistry and high-energy physics; see page 55. The “unlikely” has become commonplace.

Stopping photons.

When radiation is stopped by some obstruction, be it an opaque surface or some small particle, it deposits energy—this is the heat energy of sunlight. Might the energy deposition (or its emission) occur in discrete increments—photons being absorbed and emitted?

This was a question that not only I, as a newcomer to physics, pondered, but one that had brought thoughtful examination and detailed expository writing by many illustrious physicists: Do photons exist? If so, how are they to be understood and treated with appropriate mathematics [Scu72, Lam95, Mar96, Coh98, Lig02, Mut03, Bia06c, Fri09]? After all, physics must eventually reduce Nature to mathematical expressions (equations of some sort) that will lead to predictions of measurable quantities. What are the equations for photons that supplement well-established equations for atoms and electrons? Only when we have such equations can we think we understand the associated physics of photons.

The following chapter begins the presentation of various lines of experiment and interpretation that supported the notion of granular radiation. There were several, and they led to somewhat differing details of how the radiation granules were to be regarded.

Photon size.

Massive particles such as atoms, and electrons bound to atoms, have a measurable extent—a size within which (most of) the mass and charge will be found; see Appendix A.2. Massless free-space photons are more difficult to pin down to any location.

For an idealized monochromatic field (one that extends indefinitely in space and time as a traveling wave or a standing wave) one might argue that localization could take place only near an antinode, over a distance set by the wavelength. When multiple frequencies are present there will occur interferences between waves, and localization along a standing-wave beam axis can become much more precise than a wavelength. This is the principle behind interferometry [Buc86, Pac93, Har96, Ber97, Cro09, Che16].

But the required periodicity that accompanies monochromatic waves then prevents identification of just which antinode is providing the localization. Superpositions of waves can improve the localization along a beam axis, but the ambiguity remains. Only when an atom or other bit of matter registers the absorption of a radiation increment is it possible to assign a position to the vanished photon—a position that is fixed by the localizable atom, not by the field (or photon). Appendix C.4.2 discusses issues related to localization in bulk matter.

Although localization is a defining characteristic of particles, light is regularly used to view small objects by means of lenses and light-sensitive surfaces. Monochromatic light that emerges into vacuum through an opening (an aperture) in an opaque surface will expand as spherical wave fronts. Along a single line of sight the wave crests move with the speed of light, separated in space by the wavelength, producing a steady intensity that is spread uniformly over small segments of a plane transverse (perpendicular) to the line of sight—there is usually no obvious nodal indicator of wavelength in the transverse direction (but see Appendix B.2 for examples of beams with transverse nodes). Nevertheless, the wavelike nature of light imposes long-established limitations upon the size of objects that can be distinguished using focusing optics. Traditional methods of optical observation set the limit of transverse localization of an unstructured beam as roughly a wavelength; see Appendix A.2.3. However, with the use of coherent radiation various techniques now exceed that classical resolution limit; see page 205.

Photon density.

Although electrons have no intrinsic size, there is a limit on how many can be packed into any finite volume. Apart from the Coulomb repulsion, which can be offset with positive charges of nuclei, there is a fundamental limit, expressed by the Pauli exclusion principle, that prevents two electrons (or any two fermions) from having the same set of four quantum numbers and that thereby inhibits them from occupying the same space; see Appendix A.8.2. This principle leads electrons in atoms to cluster into inert central cores (shells), around which a few valence electrons move, and it leads electrons in metals to occupy momentum bands. But there is no such fundamental restriction on photons (or any bosons) and on the associated intensity of electromagnetic radiation—the density of electromagnetic energy. Indeed, bosons prefer to share a common quantum state, and there is no theoretical upper limit on the energy density or the focused intensity of an electromagnetic field in vacuum.

As radiation becomes increasingly intense, the Lorentz force becomes sufficiently strong to rip bound electrons away from the residual ions, setting them free. This becomes evident when the electric field of the radiation exceeds the Coulomb field that binds the electrons. Pulses from industrial lasers used for welding and cutting routinely produce such intense radiation, well above the values used for controlled manipulation of quantum states by single photons.

Aside: Radiation scales. The significant scales of electric field magnitude E and radiation intensity I for overwhelming the electron-binding Coulomb forces are the atomic units of these quantities:20

E_{\mathrm{AU}} = \frac{1}{4\pi\epsilon_0}\,\frac{e}{a_0^{2}} \approx 5.14\times 10^{11}\ \mathrm{V/m},

(2.9-1a)

I_{\mathrm{AU}} = \frac{\hbar\,(\alpha c)^{2}}{a_0^{4}} \approx 6.4\times 10^{19}\ \mathrm{W/m^2},

(2.9-1b)

where a0 is the Bohr radius of eqn (A.2-5), the atomic unit of length. That is, a monochromatic wave of intensity 1 W/m² creates an electric field whose magnitude |E| is about 5×10⁻¹¹ atomic units of field strength.
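Aside: Checking the numbers. These atomic-unit values can be verified numerically; the following Python sketch evaluates eqns (2.9-1a) and (2.9-1b) from CODATA constants and then expresses the field of a 1 W/m² wave in atomic units.

```python
import math

hbar = 1.054571817e-34      # reduced Planck constant, J s
eps0 = 8.8541878128e-12     # vacuum permittivity, F/m
e = 1.602176634e-19         # elementary charge, C
a0 = 5.29177210903e-11      # Bohr radius, m
c = 299792458.0             # speed of light, m/s
alpha = 7.2973525693e-3     # fine-structure constant

E_AU = e / (4 * math.pi * eps0 * a0**2)   # eqn (2.9-1a): ~5.14e11 V/m
I_AU = hbar * (alpha * c)**2 / a0**4      # eqn (2.9-1b): ~6.4e19 W/m^2
print(f"E_AU = {E_AU:.3e} V/m, I_AU = {I_AU:.2e} W/m^2")

# Field of a 1 W/m^2 monochromatic wave, from I = (1/2) eps0 c |E|^2:
E_1W = math.sqrt(2.0 / (eps0 * c))        # ~27 V/m
print(f"E(1 W/m^2) = {E_1W / E_AU:.1e} atomic units")  # ~5e-11
```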

When radiation fields become still more intense they eventually affect even the vacuum, and it is no longer possible to avoid considering the creation and annihilation of massive particle-antiparticle pairs; see eqn (8.5-1). These provide a “dressing” to the “bare” photons and the field equations become nonlinear.

The name “photon”.

The name “photon” is commonly credited to physical chemist Gilbert Lewis who, in 1926, used it to describe his notion of a particle-like quantum of radiation [Lew26]. As his article title, “The conservation of photons”, makes clear, he had in mind discrete particles, of energy hν, momentum magnitude hν/c, and mass hν/c², that, together with atoms, obeyed conservation laws of energy, momentum, and mass and which were stored in atoms awaiting release. The “photons” of contemporary physicists, which I describe in this Memoir, are not what Lewis had in mind. As Willis Lamb has stressed [Lam95] (see page 177), the unconventional ideas of Lewis (conservation of photons) failed to gain acceptance—they were demonstrably wrong, photons are not conserved—but his chosen name has prevailed.

Update: The photon name.

Historian Helge Kragh [Kra14] has pointed out, in a posted but unpublished article cited by Wikipedia (in the wiki article “Photon”), that the word “photon” was used prior to Lewis's 1926 article by three other authors, primarily for describing physiological effects of light: Leonard Troland (in 1916), John Joly (in 1921), and René Wurmser (in 1924); it was also used in 1926 by Frithiof Wolfers. Kragh credits Arthur Compton with first citing Lewis and with using the term photon with a more lasting interpretation.

2.10 Angular momentum: Orbital and spin

A slowly moving (nonrelativistic) particle of mass m and velocity v (a vector) has linear momentum p=mv (a vector in the direction of motion) and kinetic energy (a scalar) expressible as either

E_{\mathrm{kin}} = \frac{m}{2}\,|\mathbf{v}|^{2} \qquad\mathrm{or}\qquad E_{\mathrm{kin}} = \frac{1}{2m}\,|\mathbf{p}|^{2}.

(2.10-1)

In the absence of any force, both the kinetic energy and the linear momentum remain constant.

Around any chosen reference position, such as the center of a fixed star, a particle may have an angular momentum (a moment of linear momentum—a twist), a vector whose magnitude is proportional to its mass and its velocity magnitude or to its linear momentum magnitude, and whose direction (the twist axis) is perpendicular to these collinear vectors and to the time-varying vector r that points from the chosen reference position to the moving point particle; see eqn (A.1-15) in Appendix A.1.2. The changing values of the particle-position vector r trace out a path known as the particle orbit (not necessarily a closed path), and the angular momentum associated with that path is therefore termed orbital angular momentum. Although these expressions are most commonly found in discussions of bound-state positions, they apply to any trajectory.
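For readers who want the arithmetic, the moment of linear momentum is a vector cross product; the short Python fragment below (a sketch with arbitrary illustrative numbers, not tied to any physical system) evaluates L = r × p for a point particle.

```python
import numpy as np

# Orbital angular momentum as the moment of linear momentum: L = r x p.
m = 2.0                          # mass, kg (illustrative)
r = np.array([1.0, 0.0, 0.0])    # position relative to the reference point, m
v = np.array([0.0, 3.0, 0.0])    # velocity, m/s

L = np.cross(r, m * v)           # perpendicular to both r and p
print(L)                         # [0. 0. 6.]: a twist about the z axis
```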

When the particle twists, or turns about an axis, as does a toy top, a gyroscope, a bicycle wheel, or a typical planet, it has a second form of angular momentum, a vector independent of its velocity or its position, pointing along the rotation axis (for the earth, toward the north star) with a magnitude proportional to the particle mass and to the rate of rotation about its axis (its angular velocity). This rotation, or twist, is termed spin, and its angular momentum is therefore termed spin angular momentum, commonly expressed in units of ℏ. This contribution to a total angular momentum is independent of the particle trajectory: It is an intrinsic property of a planet whatever may be the planetary orbit.

By contrast with a particle, a field can have spatially distributed energy and momentum, quantified as energy density and momentum density. It can also carry two forms of angular momentum density, orbital and spin. Quantum theory associates with an electron or other massive particle a field (a wavefunction) of electron-charge density and of mass density. The angular momentum density of this electron field has both an orbital contribution, dependent on the velocity and kinetic energy density, and an intrinsic contribution, a spin angular-momentum density whose spatial integral, for a single electron, is ℏ/2. Although the name suggests rotation about a moving axis, its origin is attributable to a wave-nature distribution (in a wavefunction) of intrinsic angular momentum [Oha86]. The spatially integrated spin density of the electromagnetic field of a single photon is ℏ, twice that of a single electron. Photons, as the quanta of electromagnetic fields, are therefore said to have spin one, while electrons have spin one-half. This marks them as bosons and fermions, respectively (see Section 2.2.8).

2.11 Probabilities

Much of the literature of modern life mentions probabilities, and numerous specialized texts and courses discuss their broad reach [Cox46, Fel68, Bal70, Dav70, Gri12b]. The enterprises of economists and investors, of weather forecasters and sports fans, of merchants and farmers, of police enforcers and medical advisors, all make use of probabilities—if only implicitly and unconsciously—for their decisions. Few organized contemporary activities are free of reference to probability (or likelihood)—and from the enticement of financial gain from using probabilities to place bets or avoid loss. The essence of probability is expressed as:

Probabilities are basically numerical representations of our best guess at outcomes of future or ongoing events.

For the events of everyday activities—of weather and traffic flows—such predictions are based on extrapolating data of past events. Their organization and use are now taught in courses entitled statistics, often part of the basic requirements for an undergraduate degree and administered by academic departments of Statistics. There the courses of interest for students of economics, medicine, and sociology often incorporate subjective notions of Value and Utility into procedures for decision making and Risk Management [Ber96].

In physics, and particularly quantum physics, probabilities are typically defined in terms of a set of measurements on a collection of identically prepared physical systems (or repeated observations of a single system). Expressed in words, the definition of probability used in Physics can be stated as:

The ratio of the number of events that are associated with some observable attribute x, to the totality of observed events, is taken to be the probability P(x) of attribute x.

The probabilities that underlie quantum theory and statistical physics—and which therefore appear in this Memoir—do not rely on historical records of measurements but upon simple rules. They have much in common with odds calculated centuries ago by Renaissance gamblers for games of chance played with dice [Ber96].

All interests in probability and chance, whether in everyday activities or in quantum theory, have some common features, discussed below.

2.11.1 Events

The events that are to be described by probabilities—observations of measurable characteristics of some reproducible phenomenon—may be of two types (casinos combine both).

Serial: First, we can repeatedly observe a single system, for example seeing whether a coin flipped into the air falls heads or tails, the only two acceptable possibilities. Or we might have a box of marbles, differing only by color, say, either green or orange. We shake the box well and, without peeking at the color, draw one out.

Parallel: Alternatively, we may have many copies of identical systems, prepared identically. For example, the contents of a well-shaken basket of dice are tossed onto a flat surface and the distribution of die faces is noted. A team of coin flippers announces its collective results after a coin flip. A group of competitors are issued decks of cards to shuffle and deal, and the assorted hands are duly noted. A morning report arrives from a set of weather stations giving their air temperatures.

So it is with observations of atoms. We can either deal with a large collection of identically prepared atoms (this is the traditional assumption) or we can somehow trap and hold a single atom and, after repeatedly restoring it (an essential step), make an observation of its properties. This procedure has become possible in recent years.

2.11.2 Evaluating probabilities; Classical and quantum

Within any enterprise that uses probabilities there are well established procedures for evaluating them.

For example, the probability that it will rain tomorrow is estimated from the ratio of past rainy days to the total number of days in the weather record (a matter of statistics). The possibility that a racehorse or a sports team will win is (usually) based on its past record.

Alternatively, one can define probabilities in terms of the known, or presumed, distribution of possible events. Unless you have reason to believe otherwise, you assume that each distinguishable possibility is equally likely. If a coin is well balanced so that neither side has an obvious advantage, we estimate the probability of heads to be the fraction 1/2. In the marble-box example, if we know there are equal numbers of green and orange marbles in the box, and they are indistinguishable by touch, then our best guess, prior to drawing out a marble, is that the probability of getting a green one is the fraction 1/2. The probability for drawing an ace (one of four possibilities) from a well-shuffled deck of 52 playing cards is regarded as the ratio 4/52 (a matter of combinatorics, an enumeration of possible ways of achieving a particular result).
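Aside: A numerical check. The frequency definition of probability (the ratio of favorable events to all observed events) converges, for many trials, to these combinatoric ratios. The following Python sketch (the trial count and seed are arbitrary choices) simulates the coin and card examples.

```python
import random

random.seed(1)
trials = 100_000

# A balanced coin: two equally likely possibilities.
heads = sum(random.random() < 0.5 for _ in range(trials))

# An ace from a 52-card deck: 4 favorable cases among 52 equally likely ones.
deck = ["ace"] * 4 + ["other"] * 48
aces = sum(random.choice(deck) == "ace" for _ in range(trials))

print(heads / trials)   # -> close to 1/2
print(aces / trials)    # -> close to 4/52 = 0.0769
```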

Those probabilities deal with activities that are governed by everyday, classical physics. Quantum theory has its own special rules for calculating probabilities. Most notable is that all the real-valued probabilities are obtained as absolute squares of complex-valued probability amplitudes. It is with those amplitudes, leading to wavelike nodal structure associated with photons, that the several equations of quantum theory deal; see Appendices A.3.3 and A.5.
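A minimal Python illustration of that rule (the two path amplitudes a1 and a2 are arbitrary illustrative choices) shows how adding complex amplitudes, rather than their squared magnitudes, can produce a wavelike node:

```python
import numpy as np

# The quantum rule: combine complex amplitudes first, then take the
# absolute square. Two equal-magnitude paths, one phase-shifted by pi.
a1 = np.exp(1j * 0.0) / np.sqrt(2.0)
a2 = np.exp(1j * np.pi) / np.sqrt(2.0)

added_probabilities = abs(a1)**2 + abs(a2)**2   # -> 1.0: no interference
added_amplitudes = abs(a1 + a2)**2              # -> 0.0 (to machine precision): a node
print(added_probabilities, added_amplitudes)
```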

2.11.3 Properties of probabilities

With either scenario—multiple systems or multiple trials with a single system—a probability is used to give a numerical estimate of our best guess of the results of many observations. It is with the prediction of such outcomes, taken in the limit of a very large number of trials, that probabilities deal. They generally say nothing about a single observation, only about an accumulation of a large number of observations, expressed as a fraction—a number between 0 and 1 inclusive.

Only if the probability is known to be 0 (meaning a result that is impossible) or 1 (meaning a mandatory, guaranteed result) have we any certainty about a single trial. It is the uncertainty of individual events that draws bettors to casinos. It is the numerical probabilities that keep the casinos in business.

One of the essential properties of any definition of probability is that the sum of the probabilities for all possible individual outcomes must be unity. For example, if one takes marbles from a box that contains only green and orange marbles, the probabilities of taking an orange one or taking a green one must sum to unity—the chosen marble can only be orange or green, given the underlying model. Probability cannot increase beyond complete certainty. (This intrinsic constraint does not affect the sportscasters who regularly exhort athletes to exert an effort in excess of 100%.)

Because of the way probabilities are defined they must be non-negative, real numbers, bounded by zero and unity. For persons of a mathematical inclination, they are mappings of event measures onto this small portion of the real-number line. For the examples above, of coins, cards, and marble colors, the mapping is to non-negative rational numbers.

As my education continued in graduate school, I learned how important probabilities are in any discussion of atoms and electrons—in the quantum mechanics with which their behavior is described. And I have found that concepts of photons rely also on probability concepts. Despite what Einstein hoped (in allegedly saying “God does not play dice”)21 our present physics of photons cannot proceed without encountering chance—meaning influences over which an experimenter has no control.

2.11.4 Random variables; Expectation values

The events with which probability deals can, for familiar examples, be represented by numbers associated with measurements or observations: The reading of a thermometer at a specified location; the value of a playing card drawn from a (presumably) well-shuffled deck; the published winning score of an athletic contest. (Measurements of qualities such as color can be given numerical values in accord with some scale.) The possible numerical outcomes of such measurements form a sample space, a set of numbers (say a discrete set N1, N2, …) from which any given observation will select one. Unless the measurement outcome is predetermined, it will have some uncontrollable irregular variation (termed random or stochastic) of the successive values. A measurement provides an incidence, or particular value, of such a variable. It is the purpose of probability values to provide estimates of the distribution of any set of observation values. From whatever source they come, a given set of probabilities allows us to predict the most likely result of measuring a random variable. This expectation value (the numerical value predicted as the average of an indefinitely large number of cases) for some stochastic variable V is here denoted 〈V〉. For example, 〈N〉 denotes the (theoretically) expected value of the (experimental) average defined in the simple example of eqn (2.11-1).

Aside: Mean and variance. In any set of events governed by chance there are many measurable properties that can be of use in characterizing the underlying probability distribution. Typically it is possible to assign some single number to an event, say the summed values of the upper faces of cast dice. From recording numerical values Ni of repeated events, i = 1, 2, …, one can evaluate the average numerical value, or mean, denoted N̄, defined as the sum of all successively observed numbers divided by the number of events, a construction of the form

\bar{N} = \frac{N_1 + N_2 + N_3 + \cdots}{1 + 1 + 1 + \cdots},

(2.11-1)

where each measured numerator number Ni accompanies an addition of 1 to the denominator count. The expectation value is the mean value, 〈N〉 = N̄.

When dealing with quantities that may be positive or negative (as happens with oscillations) it is common to use root-mean-square (RMS) values (the quadratic mean or square root of the mean square) to quantify the distribution of values. For n numbers Ni the RMS obtains from the formula,

N_{\mathrm{RMS}} = \sqrt{\frac{N_1^{2} + N_2^{2} + \cdots + N_n^{2}}{n}} = \sqrt{\langle N^{2}\rangle}.

(2.11-2)

Such measures, along with such quantities as the most probable single result, have practical value as one ponders a probabilistic investment. But other characteristics are also important in making such decisions: How likely is it that a single event returns a number close to the average? A measure of the variation amongst a succession of events (and the uncertainty in the next one) is the variance,

\mathrm{var}(N) = \frac{(N_1-\bar{N})^{2} + (N_2-\bar{N})^{2} + \cdots + (N_n-\bar{N})^{2}}{n} = \langle N^{2}\rangle - \langle N\rangle^{2}.

(2.11-3)

This number, the difference between the mean square and the square of the mean, expresses how sharply peaked around the mean value is the distribution of observed values. Such measures are important not only for risk managers of investment funds [Ber96] but for descriptions of quantum processes.
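For the dice example these measures are easily computed; the following Python sketch (100 000 simulated throws of two dice, with an arbitrary seed) evaluates the mean, RMS, and variance of eqns (2.11-1) through (2.11-3).

```python
import numpy as np

rng = np.random.default_rng(2)
# Summed upper faces of two cast dice, recorded over many simulated events.
N = rng.integers(1, 7, size=(100_000, 2)).sum(axis=1)

mean = N.mean()                    # eqn (2.11-1): the mean
rms = np.sqrt(np.mean(N**2))       # eqn (2.11-2): the quadratic mean
var = np.mean(N**2) - mean**2      # eqn (2.11-3): mean square minus squared mean

print(mean, rms, var)  # -> about 7.0, 7.40, 5.83 for two fair dice
```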

Uncertainty.

Whenever we measure properties of some system that has distinguishable irregularities we encounter variation of the numbers acquired from a succession of measurements: Weather patterns at different times or different places; prices of different stocks at different times. The tabulated values may trend, on average, with a predictable pattern (night and day, north and south, desert or coastline) but they may fluctuate around an average. One common measure of the range of values around the mean is known to mathematicians as the variance. Physicists call it uncertainty.

The uncertainties of physics occur for three reasons. First, the measurements may be of uncontrollable phenomena such as the rainfall outside a particular workplace during the lunch hour. Such data is stochastic—predictable only on average. A different, second, sort of uncertainty occurs when we measure the tone (frequency) of a musical note: The accuracy of the measurement depends on its duration. A third class of uncertainty occurs when the data comes from simultaneous measurements of certain characteristics of a single quantum system—say, measurements of position and velocity (or momentum) of a very small particle. Each of these three types of uncertainty requires a distinct approach and has distinct results and interpretation. Most notable for discussions of photons are the unavoidable uncertainties in the mathematically different pairings of frequency and time (see Appendix B.5.1) and of position and momentum (see Appendix A.4.2).

2.11.5 Conditional probabilities

Often we wish to consider repeated observations under constrained conditions in which we exclude some of the possible outcomes. We might, for example, know that a card packet comprises only spades or only black cards. Under these conditions we can assign probabilities to the occurrence of specific combinations. These probabilities are conditional—they depend upon prior certainty concerning the nature of the sample space. The conditional probability P(x|y) is the probability that, if the property y is known to hold, then the property x will be found. This conditional probability is the ratio of the number of systems that have both property x and property y, to the number of systems that have property y.
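As a concrete illustration of that counting ratio, the following Python sketch builds a 52-card deck and evaluates P(spade) and P(spade | black card); the helper names prob, is_spade, and is_black are my own labels for this sketch, not standard notation.

```python
from itertools import product

# A full deck as (rank, suit) pairs.
ranks = ["ace"] + [str(n) for n in range(2, 11)] + ["jack", "queen", "king"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

def prob(x, y=None):
    """P(x|y): cards having both properties, divided by cards having y."""
    pool = [card for card in deck if y is None or y(card)]
    return sum(1 for card in pool if x(card)) / len(pool)

is_spade = lambda card: card[1] == "spades"
is_black = lambda card: card[1] in ("spades", "clubs")

print(prob(is_spade))            # 13/52 = 0.25
print(prob(is_spade, is_black))  # 13/26 = 0.50: conditioning shifts the value
```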

Uncertainty.

The differing views of probability used by physicists and by others listed above have to do with the nature of the events being described. Whereas every electron, every proton, is identical and indistinguishable, such uniformity is not found in other domains of enquiry: Every person, every pet animal, every sporting event, every new day, is unique. Thus predictions of human, or weather, behavior stand on a somewhat different base than predictions applied to simple inanimate objects. Nonetheless, the general properties of probabilities noted above apply broadly to all dealings with events and measurements that involve uncertainty about outcomes.

However, significant differences occur between classical and quantum probabilities and their uncertainties. Whereas one might hope that improved tools might continually diminish errors (uncertainty) in a manufacturing process, and that more elaborate computer facilities might improve weather forecasting, all quantum systems are subject to the Heisenberg uncertainty principle that fixes a lower bound for many sorts of measurements and predictions.

2.12 Quantum states

Physics deals with information about physical systems—not only about what sorts of individual pieces and linkages may be present, what sort of atoms or molecules may be present, but how these constituents are distributed in space and how they are moving. Information about those basic variables—positions and velocities—defines a state of motion.

2.12.1 The uncertainty (indeterminacy) principle

Until the twentieth century brought quantum theory as the basic description of physical systems, there seemed no limit to the accuracy with which one could measure positions and velocities, thereby specifying a state of motion. But as even the most basic course in quantum theory informs students, the distinction between particle behavior governed by classical equations and behavior governed by quantum mechanics is set by Heisenberg's Uncertainty Principle: The product of uncertainty22 in particle position Δx and the uncertainty in its nonrelativistic velocity Δv cannot be smaller than an amount that is inversely proportional to the mass of the particle, m.

Aside: Heisenberg uncertainty. The Heisenberg uncertainty relationship is commonly expressed in terms of coordinate and momentum variables; see eqn (A.4-6). With momentum p = mv the position-velocity relationship reads

\Delta x \times \Delta v \geq \hbar/2m.

(2.12-1)

The proportionality constant in this uncertainty product is the reduced Planck constant ℏ = h/2π; its value essentially fixes the sizes of atoms. Only under special circumstances, termed minimal uncertainty states, does the uncertainty product reach its minimum value; see Appendix B.4.4.
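A single number conveys why this matters at atomic scales. The following Python evaluation (taking, as an assumed example, a position uncertainty of one Bohr radius) gives the minimum velocity uncertainty allowed by eqn (2.12-1):

```python
# Minimum velocity uncertainty from eqn (2.12-1) for an electron confined
# to a region the size of one Bohr radius (an assumed, atom-sized Delta-x).
hbar = 1.054571817e-34   # reduced Planck constant, J s
m_e = 9.1093837e-31      # electron mass, kg
dx = 5.29e-11            # position uncertainty, m

dv_min = hbar / (2.0 * m_e * dx)
print(f"{dv_min:.2e} m/s")  # -> ~1.1e6 m/s, comparable to atomic-orbit speeds
```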

2.12.2 Defining a quantum state

The notion of information defined by particular instances (a state) of system behavior is found in the microscopic systems, such as atoms and electrons (and photons), governed by quantum theory. There the information, though constrained by the uncertainty principle, still provides a distinction between different examples of system behavior. To paraphrase Michael Raymer [Ray97],

A quantum state can most simply be regarded as whatever is needed to specify the probabilities of measurement outcomes of all observable quantities pertaining to a physical system.

An experimental view, stated by Leonard Mandel [Man99], is:

In an experiment the [quantum] state reflects not what is actually known about the system, but rather what is knowable, in principle, with the help of auxiliary measurements that do not disturb the original experiment.

In these views, which I adopt, a “quantum state” is not an actual object that you can touch or see, or push and pull, it is information about how some physical object is constructed—a building plan. (The term “blueprint” would have been used decades ago, when such objects were commonplace amongst draftsmen.)

There are two obvious logical concerns with this definition: How is the information to be acquired? This is known as the Measurement Problem [Wig63, Leg80, Roy89, Sch05, Van08, Whe14]. And: How is the information to be stored and retrieved for use? This question leads to consideration of lists and to notions of vectors and abstract spaces for them. (Other questions concern information manipulation.) Two approaches, defined more fully in Appendix A.3, find regular use.

Wavefunctions.

When the information that defines a quantum state is presented as probabilities for finding a set of particles at a prescribed set of positions, the underlying probability amplitude is termed a wavefunction (see Appendix A.4.3). In keeping with Heisenberg's uncertainty constraint, such a function of coordinate variables says nothing about their velocities.

Statevectors.

Apart from very simple idealizations, quantum systems have many degrees of freedom, each of which requires description by a probability amplitude. The resulting multidimensional wavefunction is too complicated to be viewed without simplification. When only discrete-energy states need to be considered, as happens when we treat photon creation and destruction, the abstract vector spaces defined in Section 2.5.2 offer a convenient way of organizing and working with the information that describes the quantum states. Appendix A.3 describes the abstract-vector spaces commonly used in describing quantum systems and their changes. As I shall be explaining there, such a space is a setting for a statevector: An abstract vector whose components are probability amplitudes; their absolute squares provide probabilities for observing a quantum system in one of its possible quantum states; see Appendix A.8. Under many conditions all of the possible information about a quantum system, and its possible changes, is embodied in such a statevector—an abstract vector that has as many dimensions as the number of discrete quantum states that may be required to fully define the system. (Under more general conditions the description requires a density matrix; see Appendix A.14.)
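A two-state example makes this bookkeeping concrete. In the Python sketch below (a minimal sketch; the basis might be two discrete atomic energy states, and the component values are arbitrary), the statevector holds two complex probability amplitudes whose absolute squares give the probabilities of finding the system in either quantum state:

```python
import numpy as np

# A two-state statevector: its components are probability amplitudes.
psi = np.array([0.6, 0.8j])                       # complex components (assumed)
assert np.isclose(np.vdot(psi, psi).real, 1.0)    # normalized to unity

probabilities = np.abs(psi)**2
print(probabilities)  # [0.36 0.64]: chances of observing each quantum state
```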

For the discussion of photons that I will be presenting in the narrative portion of this Memoir, little more than those general principles (of quantum states and probability amplitudes viewed as components of abstract vectors) are needed for dealing with the results of quantum theory. The view I shall take is that it is not necessary to speculate on what an electron or a photon “really is”. It suffices to have equations that consistently predict its observable attributes and how it behaves under controlled, specified forces—results of experiments.23 This is what quantum theory provides.

Notes

1. One attometer is the sensitivity of the LIGO gravitational-wave detector. One yottameter is about 10⁸ light years, roughly the distance between some galaxies. The atomic unit of time, a₀/αc, is about 24 attoseconds; see eqn (A.2-5) in Appendix A.2. The age of the earth is around 140 petaseconds.
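
A back-of-envelope check of that last figure (my arithmetic, not part of the original note): taking the age of the earth as about 4.5 × 10⁹ years and one year as about 3.16 × 10⁷ seconds,

\[ 4.5\times10^{9}\ \mathrm{yr} \times 3.16\times10^{7}\ \mathrm{s/yr} \approx 1.4\times10^{17}\ \mathrm{s} = 140\ \mathrm{Ps} . \]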

2. Paradox—a statement relying on logic that seems to contradict itself—enlarges our culture. In the operetta “The Pirates of Penzance” William Gilbert and Arthur Sullivan (G&S) provide the “most amusing paradox” of Frederic, who was (mistakenly) obligated to serve as a pirate apprentice until his twenty-first birthday but, because he was born on 29 February in a leap year, must serve on as a detested pirate for many years after having lived for 21 years of “reckoning in the usual way.” The calendar-based paradox posed by G&S has a subsequent counterpart in the Einstein twin paradox, in which twins who travel by separate routes, involving different speeds, age at different rates, in accord with the timekeeping rules of special relativity; see Section 8.3.1 and [Lal01, Aha08].

3. The boldface font indicates that force and acceleration are examples of vectors; see Section 2.5.

4. It is relatively easy to remove an electron from an atom, thereby “splitting the atom” and creating an ion. A comb run through dry hair can do that. It is much more difficult to deliberately break apart an atomic nucleus.

5. Photons traveling through matter become similarly dressed; see Appendices C.2.2 and C.4.2.

6. Here C is coulomb, J is joule, T is tesla, kg is kilogram, s is second, and rad is radian.

7. The nuclei of (stable) deuterium and (unstable) tritium isotopes of hydrogen have one and two neutrons, respectively.

8. All electrons are identical and all positrons are identical, so any pairing will produce annihilation.

9. Situations occur in which particle-like constructs do not fit these two choices; see the mention of “anyons”, page 235.

10. I distinguish between “spin”, a dimensionless vector, and “spin angular momentum”, commonly expressed in units of ℏ.

11. The familiar substances steam, liquid water, and ice cubes are now regarded as just three phases of water, forms that are to be found in accord with pressure and temperature conditions. There are now some nineteen identified phases of the H₂O system, involving a variety of crystalline and amorphous fixed structures of ice [Gas18, Mil19b, Tse19, Kru19].

12. The quantum vacuum can be pictured in various equally valid ways; see page 335. In some pictures it is characterized by fluctuations in particle and photon numbers; see [Mil94].

13. Particle physicists regard the Universe as comprising matter fields, having fermion quanta, and force fields, having boson quanta.

14. Strictly speaking, a Hamiltonian is an expression of energy as a function of coordinates and momenta; see Appendix A.1.6. In practice, a Hamiltonian is a collection of relationships between energies and experimentally controlled parameters.

15. By international agreement c is defined to be exactly 299 792 458 m/s.

16. In a paraphrase of the maxim “form follows function” of the architect Louis Sullivan.

17. The second-order equations of electron distribution and radiation modes have discrete sets of solutions when boundary conditions enclose the field.
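
A familiar instance (a textbook example added here for concreteness): a wave required to vanish at the two ends of an interval of length L admits only the discrete wavenumbers and frequencies

\[ k_n = \frac{n\pi}{L}, \qquad \nu_n = \frac{n c}{2L}, \qquad n = 1, 2, 3, \dots \]

so that enclosing boundary conditions, by themselves, produce discreteness.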

18. Color, as distinct from frequency composition, is associated with the physiology of vision. In particular, any given color can be produced by many combinations of three primary frequencies; see Chapter 35, on “Color vision”, of the Feynman Lectures on Physics, Vol. 1, available through https://www.feynmanlectures.caltech.edu/info/

19. The name of George Gabriel Stokes appears in three separate contexts in the present monograph: that of the Stokes vector and Stokes parameters used for describing polarization characteristics of an electromagnetic field, Appendix B.1.7; the name of one field of the pair that contribute to a two-photon Raman transition, Appendix A.15; and, in passing, as a namesake of the Navier-Stokes equation, eqn (A.1-47).

20. The atomic unit of electric field magnitude is the electric field at a distance of one Bohr radius from one electron charge.
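
Written as a formula (standard definitions; the numerical value is my addition), that statement reads

\[ E_{\mathrm{au}} = \frac{e}{4\pi\varepsilon_{0}\,a_{0}^{2}} \approx 5.14\times10^{11}\ \mathrm{V/m} , \]

with e the electron charge and a₀ the Bohr radius.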

21. To which Bohr is said to have replied: “Einstein, you cannot tell God how He is to run the world” ([Isa07], page 609).

22. Appendix A.4.2 quantifies uncertainty.

23. This is a view that Feynman often expressed in his talks and books [Dud96].
