Quantum Field Theory II
Quantum Field Theory II
And God said, Let there be light; and there was light.
Genesis 1,3
Light is not only the basis of our biological existence, but also an essential
source of our knowledge about the physical laws of nature, ranging from
the seventeenth century geometrical optics up to the twentieth century
theory of general relativity and quantum electrodynamics.
Folklore
Don’t give us numbers: give us insight!
A contemporary natural scientist to a mathematician
Prologue
One thing I have learned in a long life: that all our science, measured
against reality, is primitive and childlike – and yet it is the most precise
thing we have.
Albert Einstein (1879–1955)
The development of quantum mechanics in the years 1925 and 1926 had
produced rules for the description of systems of microscopic particles,
which involved promoting the fundamental dynamical variables of a corresponding
classical system into operators with specified commutators. By
this means, a system, described initially in classical particle language, acquires
characteristics associated with the complementary classical wave
picture. It was also known that electromagnetic radiation contained in an
enclosure, when considered as a classical dynamical system, was equivalent
energetically to a denumerably infinite number of harmonic oscillators.
With the application of the quantization process to these fictitious oscillators,
the classical radiation field assumed characteristics describable in
the complementary classical particle language. The ensuing theory of light
quantum emission and absorption by atomic systems7 marked the beginning
of quantum electrodynamics. . .
When it was attempted to quantize the complete electromagnetic field,
difficulties were encountered that stem from the gauge ambiguity of the
potentials that appear in the Lagrangian formulation of the Maxwell equations.
. .
From the origin of quantum electrodynamics, in the classical theory of
point charges, came a legacy of difficulties.The coupling of an electron
with the electromagnetic field implied an infinite displacement, and, indeed,
an infinite shift of all spectral lines emitted by an atomic system
in the reaction of the electromagnetic field stimulated by the presence
of the electron, arbitrary short wave lengths play a disproportionate and
divergent role. The phenomenon of electron-positron pair creation, which
finds a natural place in the relativistic electron field theory, contributes to
this situation in virtue of the fluctuating densities of charge and current
that occur even in the vacuum state11 as the matter-field counterpart of
the fluctuations in electric and magnetic field strengths.
In computing the energy of a single electron relative to that of the vacuum
state, it is of significance that the presence of the electron tends to suppress
the charge-current fluctuations induced by the fluctuating electromagnetic
field. The resulting electron energy, while still divergent in its dependence
upon the contributions of arbitrarily short wave lengths exhibits only a
logarithmic infinity; the combination of quantum and relativistic effects
has destroyed all correspondence with the classical theory and its strongly
structured-dependent electromagnetic mass.
The existence of current fluctuations in the vacuum has other implications,
since the introduction of an electromagnetic field induces currents
that tend to modify the initial field; the “vacuum” acts as a polarizable
medium.
New nonlinear electromagnetic phenomena appear, such as the scattering
of one light beam by another, or by an electrostatic field. . .
It is not likely that future developments will change drastically the practical
results of the electron theory, which gives contemporary quantum
electrodynamics a certain enduring value. Yet the real significance of the
work of the past decade lies in the recognition of the ultimate problems
facing electrodynamics, the problems of conceptual consistency and of physical
completeness. No final solution can be anticipated until physical science
has met the heroic challenge to comprehend the structure of the
sub-microscopic world that nuclear exploration has revealed.
Julian Schwinger, 1958
This quotation is taken from a beautiful collection of 34 papers which played
a fundamental role in the development of quantum electrodynamics. This
volume was edited by Julian Schwinger from Harvard University who himself
made fundamental contributions to this fascinating field of contemporary
physics.
In the present volume, we will use Dyson’s extremely elegant approach to
quantum electrodynamics based on the Dyson series for the S-matrix (scat-
tering matrix). In the beginning of his Selected Papers, Freeman Dyson (born
1923) describes the history of quantum electrodynamics:16
My first stroke of luck was to find Nicholas Kemmer in Cambridge (England)
in 1946. He was the teacher I needed. He rapidly became a friend
as well as a teacher. Our friendship is still alive and well after 45 years.
Kemmer gave two courses of lectures in Cambridge, one on nuclear physics
and one on quantum field theory. In 1946, the only existing text-book on
quantum field theory was the book “Quantentheorie der Wellenfelder”, by
Gregor Wentzel (1898–1978) written in Z¨urich and published in 1943 in
Vienna in the middle of the war. Kemmer had been a student of Wentzel
and possessed a copy of Wentzel’s book. It was at that time a treasure
without price. I believe there were then only two copies in England. It was
later reprinted in America and translated into English. But in 1946, few
people in America knew of its existence and fewer considered it important.
Kemmer not only possessed a copy, he also lent it to me and explained
why it was important. . .
In 1947, I arrived at Cornell as a student and found myself, thanks to
Kemmer, the only person in the whole university who knew about quantum
field theory. The great Hans Bethe (1906–2005) and the brilliant Richard
Feynman (1918–1988) taught me a tremendous lot about many areas of
physics, but when we were dealing with quantum field theory, I was the
teacher and they were the students . . .
Julian Schwinger (1918–1994) had known about quantum field theory long
before. But he shared the American view that it was a mathematical extravagance,
better avoided unless it should turn out to be essential. In 1948,
he understood that it could be useful. He used it for calculations of the
energy level shifts20 revealed by the experiments of Lamb and Retherford,
Foley and Kusch at Columbia.21 But he used it grudgingly. In his publications,
he preferred not to speak explicitly about quantum field theory.
Instead, he spoke about Green’s Functions. It turned out that the Green’s
Functions that Schwinger talked about and the quantum field theory that
Kemmer talked about were fundamentally the same thing. . .
At Cornell, I was learning Richard Feynman’s quite different way of calculating
atomic processes. Feynman had never been interested in quantum
field theory. He had his own private way of doing calculations. His way
was based on things that he called “Propagators,” which were probability
amplitudes for particles to propagate themselves from one space-time
point to another. He calculated the probabilities of physical processes by
adding up the propagators. He had rules for calculating the propagators.
Each propagator was represented graphically by a collection of diagrams.
Each diagram gave a pictorial view of particles moving along straight lines
and colliding with one another at points where the straight lines met.
When I learned this technique of drawing diagrams and calculating propagators
from Feynman, I found it completely baffling, because it always
gave the right answer, but did not seem based on any solid mathematical
foundation. Feynman called his way of calculating physical processes
“the space-time approach,” because his diagrams represented events as
occurring at particular places and at particular times. The propagators
described sequences of events in space and time. It later turned out that
Feynman’s propagators were merely another kind of Green’s Functions.
Feynman had been talking the language of Green’s Functions all his life
without knowing it.
Green’s Functions also appeared in the work of Sin-Itiro Tomonaga (1906–
1979), who had developed independently a new elegant version of relativistic
quantum field theory. His work was done in the complete isolation of
war-time Japan, and was published in Japanese in 1943. The rest of the
world became aware of it only in the spring of 1948, when an English
translation of it arrived at Princeton sent by Hideki Yukawa (1907–1981)
to Robert Oppenheimer (1904–1967). Tomonaga was a physicist in the
European tradition, having worked as a student with Heisenberg (1901–
1976) at Leipzig before the war. For him, in contrast to Schwinger and
Feynman, quantum field theory was a familiar and natural language in
which to think.
After the war, Tomonaga’s students had been applying his ideas to calculate
the properties of atoms and electrons with high accuracy, and were
reaching the same results as Schwinger and Feynman. When Tomonaga’s
papers began to arrive in America, I was delighted to see that he was
speaking the language of quantum field theory that I had learned from
Kemmer. It did not take us long to put the various ingredients of the pudding
together. When the pudding was cooked, all three versions of the new
theory of atoms and electrons turned out to be different ways of expressing
the same basic idea. The basic idea was to calculate Green’s Functions
for all atomic processes that could be directly observed. Green’s Functions
appeared as the essential link between the methods of Schwinger and Feynman,
and Tomonaga’s relativistic quantum field theory provided the firm
mathematical foundation for all three versions of quantum electrodynamics.
Dyson wrote two fundamental papers on the foundations of quantum electrodynamics,
which are now classics:
F. Dyson, The radiation theories of Tomonaga, Schwinger, and Feynman,
Phys. Rev. 75 (1949), 486–502.
F. Dyson, The S-matrix in quantum electrodynamics, Phys. Rev. 75
(1949), 1736–1755.
The fascinating story of the first paper can be found on page 27 of Vol. I.
Dyson’s second paper on renormalization theory starts as follows:
The covariant (i.e., relativistically invariant) quantum electrodynamics of
Tomonaga, Schwinger, and Feynman is used as the basis for a general treatment
of scattering problems involving electrons, positrons, and photons.
Scattering processes, including the creation and annihilation of particles,
are completely described by the S-matrix (scattering matrix) of Heisenberg.
22 It is shown that the elements of this matrix can be calculated by
a consistent use of perturbation theory to any order in the fine-structure
constant. Detailed rules are given for carrying out such calculations, and
it is shown that divergences arising from higher order radiative corrections
can be removed from the S-matrix by a consistent use of the ideas of mass
and charge renormalization.
Not considered in this paper are the problems of extending the treatment
to bound-state phenomena, and of proving the convergence of the theory
as the order of perturbation itself tends to infinity
Nowadays this identity is called the Ward identity; it is the prototype of
the crucial Ward–Takahashi–Slavnov–Taylor identities in gauge field theories,
which are consequences of (local) gauge symmetry. In quantum electrodynamics,
the Ward identity guarantees the unitarity of the S-matrix; this
is a decisive ingredient of S-matrix theory. In fact, the unitarity is crucial for
relating elements of the S-matrix to transition probabilities (see Sect. 7.15 of
Vol. I). If the unitarity of the S-matrix is violated, then the theory becomes
meaningless from the physical point of view.
After thinking about the convergence problem in quantum electrodynamics
for a long time, Dyson published his paper Divergence of perturbation
theory in quantum electrodynamics, Phys. Rev. 85 (1952), 631–632. The abstract
of this paper reads as follows:
An argument is presented which leads tentatively to the conclusion that all
the power-series expansions currently in use in quantum electrodynamics
are divergent after the renormalization of mass and charge. The divergence
in no way restricts the accuracy of practical calculations that can be made
with the theory, but raises important questions of principle concerning the
nature of the physical concepts upon which the theory is built.
Dyson’s heuristic argument can be found in Sect. 15.5.1 of Vol. I. Silvan
Schweber writes the following in his excellent history on quantum electrodynamics
entitled QED and the Men Who Made It: Dyson, Feynman,
Schwinger, and Tomonaga, Princeton University Press, Princeton, New Jersey,
1994:25
The importance of Schwinger’s 1947 calculation of the anomalous magnetic
moment of the electron cannot be underestimated. In the course of theoretical
developments, there sometimes occur important calculations that alter
the way the community thinks about particular approaches. Schwinger’s
calculation is one such instance. . .
The papers of Tomonaga, Schwinger, and Feynman did not complete the
renormalization program since they confined themselves to low order calculations.
It was Dyson who dared to face the problem of high orders
and brought the program to completion. In magnificently penetrating papers,
he pointed out and resolved the main problems of this very difficult
analysis. . . Whatever the future may bring, it is safe to assert that the
theoretical advances made in the unravelling of the constitution of matter
since World War II comprise one of the greatest intellectual achievements
of mankind. They were based on the ground secured by the contributions
of Bethe, Tomonaga, Schwinger, Feynman, and Dyson to quantum field
theory and renormalization theory in the period from 1946 to 1951.
For creating quantum electrodynamics, Richard Feynman, Julian Schwinger,
and Sin-Itiro Tomonaga were awarded the Nobel prize in physics in 1965.
Freemann Dyson was awarded the Wolf prize in physics in 1981. Working at
the Institute for Advanced Study in Princeton, Dyson is one of the most influential
intellectuals of our time; his research concerns mathematics (number
theory, random matrices), physics (quantum field theory, statistical mechanics,
solid state physics, stability of matter), astrophysics (interstellar communication),
biology (origin of life), history, and philosophy of the sciences.
Much material can be found in the Selected Papers of Freeman Dyson, Amer.
Math. Soc., Providence, Rhode Island and International Press, Cambridge,
Massachusetts, 1996. In particular, we refer to the following beautiful essays
and books written by Dyson:
Essays:
• Mathematics in the physical sciences, Scientific American 211 (1964), 129–
164.
• Missed opportunities, Bull. Amer. Math. Soc. 78 (1972), 635–652.
• George Green and Physics, Physics World 6, August 1993, 33–38.
• A walk through Ramanujan’s garden. Lecture given at the Ramanujan
(1887–1920) Centenary Conference in 1987, University of Illinois. In: F.
Dyson, Selected Papers, pp. 187–208.
• Foreword to J. Havil, Gamma: Exploring Euler’s Constant, Princeton University
Press, 2003.
• Foreword to P. Odifreddi, The Mathematical Century: The 30 Greatest Problems
of the Last 100 Years, Princeton University Press, 2004.
• The Scientist as Rebel, New York Review Books, 2007.
Books:
• Disturbing the Universe, Harper and Row, New York, 1979.
• From Eros to Gaia, Pantheon Books, New York, 1992.
• Origins of Life, Cambridge University Press, 1999.
• The Sun, the Genome and the Internet: Tools of Scientific Revolution, Oxford
University Press, 1999.
Elliott Lieb (Princeton University) writes the following in the foreword to
Dyson’s Selected Papers (reprinted with permission):
If any proof be needed that theoretical physics papers are not ephemeral
and are not written on a blackboard that has to be erased every five years,
then the papers in this volume will supply ample witness. The writings of
Freeman Dyson are among the jewels that crown the subject and today
even the earliest among them can be read with profit and much pleasure
by beginners and experts. . .
Dyson along with Feynman, Schwinger, and Tomonaga was a founder of
quantum electrodynamics. When I started my graduate studies in the
fifties, it was not easy to find a coherent pedagogical representation of
the new field, but fortunately, Dyson had given lectures at Cornell in 1951
and these were available as notes. Thanks to their clarity many people,
including me, were able to enter the field.
Recently, these classic notes were published:
F. Dyson, Advanced Quantum Mechanics: Cornell Lectures on Quantum
Electrodynamics 1951, World Scientific, Singapore, 2007.
Feynman’s approach to quantum electrodynamics was elegantly based on
the use of graphs called Feynman diagrams today. David Kaiser writes the
following in his book Drawing Theories Apart: The Dispersion of Feynman
Diagrams in Postwar Physics (the University of Chicago Press, Chicago and
London, 2005 – reprinted with permission):
For all of Feynman’s many contributions to modern physics, his diagrams
have had the widest and longest-lasting influence. Feynman diagrams have
revolutionized nearly every aspect of theoretical physics since the middle
of the twentieth century. Feynman first introduced his diagrams in the
late 1940s as a bookkeeping device for simplifying lengthy calculations
in one area of physics – quantum electrodynamics, physicist’s quantummechanical
description of electromagnetic forces. Soon the diagrams gained
adherents throughout the fields of nuclear and particle physics. Not long
thereafter, other theorists adopted – and subtly adapted – Feynman diagrams
for many-body applications in solid-state physics. By the end of the
1960s, some physicists even wielded the line drawings for calculations in
gravitational physics. With the diagrams’ aid, entire new calculational vistas
opened for physicists; theorists learned to calculate things that many
had barely dreamed possible before World War II. With the list of diagrammatic
applications growing ever longer, Feynman diagrams helped to
transform the way physicists saw the world, and their place within it.
There is no doubt that quantum electrodynamics is one of the most beautiful
theories in theoretical physics. The following quotation is taken from the
forthcoming article “Quantum Theory and Relativity” written by Arthur
Jaffe, Contemporary Mathematics, American Mathematical Society, Providence,
Rhode Island, 2008, pp. 209–245 (reprinted with permission):26
Two major themes dominated twentieth century physics: quantum theory
and relativity. These two fundamental principles provide the cornerstones
upon which one might build the understanding of modern physics. And today
after one century of elaboration of the original discoveries by Poincar´e,
Einstein, Bohr, Schr¨odinger, Heisenberg, Dirac – and many others – one
still dreams of describing the forces of nature within such an arena. Yet
we do not know the answer to the basic question:
Are quantum theory, relativity, and interaction mathematically compatible?
Even if one restricts relativity to special relativity, we do not know the
answer to this question about our four-dimensional world – much less about
other higher-dimensional worlds considered by string theorists.
Should quantum theory with relativity not qualify as logic? Physics suggests
that a natural way to combine quantum theory, special relativity and
interaction is through a nonlinear quantum field. Enormous progress on
this problem has been made over the past forty years. This includes showing
that theories exist in space-times of dimension two and three. Building
this new mathematical framework and finding these examples has become
known as the subject of constructive quantum field theory. . .
For centuries, the tradition in physics has been to describe natural phenomena
by mathematics. Eugene Wigner marveled on the relevance of mathematics
in his famous essay: “On the Unreasonable Effectiveness of Mathematics
in the Natural Sciences,” Comm. Pure Appl. Math. 13 (1960), 1–14.
Intuition can go a long way. But by endowing physics with a mathematical
foundation, one also bestows physical laws with longevity. For mathematical
ideas can be understood and conveyed more easily than conjectures,
both from person to person, and also from generation to generation.
In recent years, we have witnessed enormous progress in another direction –
of transferring ideas from physics to mathematics: to play on Wigner’s title,
concepts from physics have had an unreasonable effectiveness in providing
insight to formulate mathematical conjectures! The resulting infusion
of new perspectives has truly blossomed into a mathematical revolution,
which has been sufficiently robust to touch almost every mathematical
frontier. . .
Genesis 1,3
Light is not only the basis of our biological existence, but also an essential
source of our knowledge about the physical laws of nature, ranging from
the seventeenth century geometrical optics up to the twentieth century
theory of general relativity and quantum electrodynamics.
Folklore
Don’t give us numbers: give us insight!
A contemporary natural scientist to a mathematician
Prologue
One thing I have learned in a long life: that all our science, measured
against reality, is primitive and childlike – and yet it is the most precise
thing we have.
Albert Einstein (1879–1955)
The development of quantum mechanics in the years 1925 and 1926 had
produced rules for the description of systems of microscopic particles,
which involved promoting the fundamental dynamical variables of a corresponding
classical system into operators with specified commutators. By
this means, a system, described initially in classical particle language, acquires
characteristics associated with the complementary classical wave
picture. It was also known that electromagnetic radiation contained in an
enclosure, when considered as a classical dynamical system, was equivalent
energetically to a denumerably infinite number of harmonic oscillators.
With the application of the quantization process to these fictitious oscillators,
the classical radiation field assumed characteristics describable in
the complementary classical particle language. The ensuing theory of light
quantum emission and absorption by atomic systems7 marked the beginning
of quantum electrodynamics. . .
When it was attempted to quantize the complete electromagnetic field,
difficulties were encountered that stem from the gauge ambiguity of the
potentials that appear in the Lagrangian formulation of the Maxwell equations.
. .
From the origin of quantum electrodynamics, in the classical theory of
point charges, came a legacy of difficulties.The coupling of an electron
with the electromagnetic field implied an infinite displacement, and, indeed,
an infinite shift of all spectral lines emitted by an atomic system
in the reaction of the electromagnetic field stimulated by the presence
of the electron, arbitrary short wave lengths play a disproportionate and
divergent role. The phenomenon of electron-positron pair creation, which
finds a natural place in the relativistic electron field theory, contributes to
this situation in virtue of the fluctuating densities of charge and current
that occur even in the vacuum state11 as the matter-field counterpart of
the fluctuations in electric and magnetic field strengths.
In computing the energy of a single electron relative to that of the vacuum
state, it is of significance that the presence of the electron tends to suppress
the charge-current fluctuations induced by the fluctuating electromagnetic
field. The resulting electron energy, while still divergent in its dependence
upon the contributions of arbitrarily short wave lengths exhibits only a
logarithmic infinity; the combination of quantum and relativistic effects
has destroyed all correspondence with the classical theory and its strongly
structured-dependent electromagnetic mass.
The existence of current fluctuations in the vacuum has other implications,
since the introduction of an electromagnetic field induces currents
that tend to modify the initial field; the “vacuum” acts as a polarizable
medium.
New nonlinear electromagnetic phenomena appear, such as the scattering
of one light beam by another, or by an electrostatic field. . .
It is not likely that future developments will change drastically the practical
results of the electron theory, which gives contemporary quantum
electrodynamics a certain enduring value. Yet the real significance of the
work of the past decade lies in the recognition of the ultimate problems
facing electrodynamics, the problems of conceptual consistency and of physical
completeness. No final solution can be anticipated until physical science
has met the heroic challenge to comprehend the structure of the
sub-microscopic world that nuclear exploration has revealed.
Julian Schwinger, 1958
This quotation is taken from a beautiful collection of 34 papers which played
a fundamental role in the development of quantum electrodynamics. This
volume was edited by Julian Schwinger from Harvard University who himself
made fundamental contributions to this fascinating field of contemporary
physics.
In the present volume, we will use Dyson’s extremely elegant approach to
quantum electrodynamics based on the Dyson series for the S-matrix (scat-
tering matrix). In the beginning of his Selected Papers, Freeman Dyson (born
1923) describes the history of quantum electrodynamics:16
My first stroke of luck was to find Nicholas Kemmer in Cambridge (England)
in 1946. He was the teacher I needed. He rapidly became a friend
as well as a teacher. Our friendship is still alive and well after 45 years.
Kemmer gave two courses of lectures in Cambridge, one on nuclear physics
and one on quantum field theory. In 1946, the only existing text-book on
quantum field theory was the book “Quantentheorie der Wellenfelder”, by
Gregor Wentzel (1898–1978) written in Z¨urich and published in 1943 in
Vienna in the middle of the war. Kemmer had been a student of Wentzel
and possessed a copy of Wentzel’s book. It was at that time a treasure
without price. I believe there were then only two copies in England. It was
later reprinted in America and translated into English. But in 1946, few
people in America knew of its existence and fewer considered it important.
Kemmer not only possessed a copy, he also lent it to me and explained
why it was important. . .
In 1947, I arrived at Cornell as a student and found myself, thanks to
Kemmer, the only person in the whole university who knew about quantum
field theory. The great Hans Bethe (1906–2005) and the brilliant Richard
Feynman (1918–1988) taught me a tremendous lot about many areas of
physics, but when we were dealing with quantum field theory, I was the
teacher and they were the students . . .
Julian Schwinger (1918–1994) had known about quantum field theory long
before. But he shared the American view that it was a mathematical extravagance,
better avoided unless it should turn out to be essential. In 1948,
he understood that it could be useful. He used it for calculations of the
energy level shifts20 revealed by the experiments of Lamb and Retherford,
Foley and Kusch at Columbia.21 But he used it grudgingly. In his publications,
he preferred not to speak explicitly about quantum field theory.
Instead, he spoke about Green’s Functions. It turned out that the Green’s
Functions that Schwinger talked about and the quantum field theory that
Kemmer talked about were fundamentally the same thing. . .
At Cornell, I was learning Richard Feynman’s quite different way of calculating
atomic processes. Feynman had never been interested in quantum
field theory. He had his own private way of doing calculations. His way
was based on things that he called “Propagators,” which were probability
amplitudes for particles to propagate themselves from one space-time
point to another. He calculated the probabilities of physical processes by
adding up the propagators. He had rules for calculating the propagators.
Each propagator was represented graphically by a collection of diagrams.
Each diagram gave a pictorial view of particles moving along straight lines
and colliding with one another at points where the straight lines met.
When I learned this technique of drawing diagrams and calculating propagators
from Feynman, I found it completely baffling, because it always
gave the right answer, but did not seem based on any solid mathematical
foundation. Feynman called his way of calculating physical processes
“the space-time approach,” because his diagrams represented events as
occurring at particular places and at particular times. The propagators
described sequences of events in space and time. It later turned out that
Feynman’s propagators were merely another kind of Green’s Functions.
Feynman had been talking the language of Green’s Functions all his life
without knowing it.
Green’s Functions also appeared in the work of Sin-Itiro Tomonaga (1906–
1979), who had developed independently a new elegant version of relativistic
quantum field theory. His work was done in the complete isolation of
war-time Japan, and was published in Japanese in 1943. The rest of the
world became aware of it only in the spring of 1948, when an English
translation of it arrived at Princeton sent by Hideki Yukawa (1907–1981)
to Robert Oppenheimer (1904–1967). Tomonaga was a physicist in the
European tradition, having worked as a student with Heisenberg (1901–
1976) at Leipzig before the war. For him, in contrast to Schwinger and
Feynman, quantum field theory was a familiar and natural language in
which to think.
After the war, Tomonaga’s students had been applying his ideas to calculate
the properties of atoms and electrons with high accuracy, and were
reaching the same results as Schwinger and Feynman. When Tomonaga’s
papers began to arrive in America, I was delighted to see that he was
speaking the language of quantum field theory that I had learned from
Kemmer. It did not take us long to put the various ingredients of the pudding
together. When the pudding was cooked, all three versions of the new
theory of atoms and electrons turned out to be different ways of expressing
the same basic idea. The basic idea was to calculate Green’s Functions
for all atomic processes that could be directly observed. Green’s Functions
appeared as the essential link between the methods of Schwinger and Feynman,
and Tomonaga’s relativistic quantum field theory provided the firm
mathematical foundation for all three versions of quantum electrodynamics.
Dyson wrote two fundamental papers on the foundations of quantum electrodynamics,
which are now classics:
F. Dyson, The radiation theories of Tomonaga, Schwinger, and Feynman,
Phys. Rev. 75 (1949), 486–502.
F. Dyson, The S-matrix in quantum electrodynamics, Phys. Rev. 75
(1949), 1736–1755.
The fascinating story of the first paper can be found on page 27 of Vol. I.
Dyson’s second paper on renormalization theory starts as follows:
The covariant (i.e., relativistically invariant) quantum electrodynamics of
Tomonaga, Schwinger, and Feynman is used as the basis for a general treatment
of scattering problems involving electrons, positrons, and photons.
Scattering processes, including the creation and annihilation of particles,
are completely described by the S-matrix (scattering matrix) of Heisenberg.
22 It is shown that the elements of this matrix can be calculated by
a consistent use of perturbation theory to any order in the fine-structure
constant. Detailed rules are given for carrying out such calculations, and
it is shown that divergences arising from higher order radiative corrections
can be removed from the S-matrix by a consistent use of the ideas of mass
and charge renormalization.
Not considered in this paper are the problems of extending the treatment
to bound-state phenomena, and of proving the convergence of the theory
as the order of perturbation itself tends to infinity
Nowadays this identity is called the Ward identity; it is the prototype of
the crucial Ward–Takahashi–Slavnov–Taylor identities in gauge field theories,
which are consequences of (local) gauge symmetry. In quantum electrodynamics,
the Ward identity guarantees the unitarity of the S-matrix; this
is a decisive ingredient of S-matrix theory. In fact, the unitarity is crucial for
relating elements of the S-matrix to transition probabilities (see Sect. 7.15 of
Vol. I). If the unitarity of the S-matrix is violated, then the theory becomes
meaningless from the physical point of view.
After thinking about the convergence problem in quantum electrodynamics
for a long time, Dyson published his paper Divergence of perturbation
theory in quantum electrodynamics, Phys. Rev. 85 (1952), 631–632. The abstract
of this paper reads as follows:
An argument is presented which leads tentatively to the conclusion that all
the power-series expansions currently in use in quantum electrodynamics
are divergent after the renormalization of mass and charge. The divergence
in no way restricts the accuracy of practical calculations that can be made
with the theory, but raises important questions of principle concerning the
nature of the physical concepts upon which the theory is built.
Dyson’s heuristic argument can be found in Sect. 15.5.1 of Vol. I. Silvan
Schweber writes the following in his excellent history on quantum electrodynamics
entitled QED and the Men Who Made It: Dyson, Feynman,
Schwinger, and Tomonaga, Princeton University Press, Princeton, New Jersey,
1994:25
The importance of Schwinger’s 1947 calculation of the anomalous magnetic
moment of the electron cannot be underestimated. In the course of theoretical
developments, there sometimes occur important calculations that alter
the way the community thinks about particular approaches. Schwinger’s
calculation is one such instance. . .
The papers of Tomonaga, Schwinger, and Feynman did not complete the
renormalization program since they confined themselves to low order calculations.
It was Dyson who dared to face the problem of high orders
and brought the program to completion. In magnificently penetrating papers,
he pointed out and resolved the main problems of this very difficult
analysis. . . Whatever the future may bring, it is safe to assert that the
theoretical advances made in the unravelling of the constitution of matter
since World War II comprise one of the greatest intellectual achievements
of mankind. They were based on the ground secured by the contributions
of Bethe, Tomonaga, Schwinger, Feynman, and Dyson to quantum field
theory and renormalization theory in the period from 1946 to 1951.
For creating quantum electrodynamics, Richard Feynman, Julian Schwinger,
and Sin-Itiro Tomonaga were awarded the Nobel prize in physics in 1965.
Freemann Dyson was awarded the Wolf prize in physics in 1981. Working at
the Institute for Advanced Study in Princeton, Dyson is one of the most influential
intellectuals of our time; his research concerns mathematics (number
theory, random matrices), physics (quantum field theory, statistical mechanics,
solid state physics, stability of matter), astrophysics (interstellar communication),
biology (origin of life), history, and philosophy of the sciences.
Much material can be found in the Selected Papers of Freeman Dyson, Amer.
Math. Soc., Providence, Rhode Island and International Press, Cambridge,
Massachusetts, 1996. In particular, we refer to the following beautiful essays
and books written by Dyson:
Essays:
• Mathematics in the physical sciences, Scientific American 211 (1964), 129–
164.
• Missed opportunities, Bull. Amer. Math. Soc. 78 (1972), 635–652.
• George Green and Physics, Physics World 6, August 1993, 33–38.
• A walk through Ramanujan’s garden. Lecture given at the Ramanujan
(1887–1920) Centenary Conference in 1987, University of Illinois. In: F.
Dyson, Selected Papers, pp. 187–208.
• Foreword to J. Havil, Gamma: Exploring Euler’s Constant, Princeton University
Press, 2003.
• Foreword to P. Odifreddi, The Mathematical Century: The 30 Greatest Problems
of the Last 100 Years, Princeton University Press, 2004.
• The Scientist as Rebel, New York Review Books, 2007.
Books:
• Disturbing the Universe, Harper and Row, New York, 1979.
• From Eros to Gaia, Pantheon Books, New York, 1992.
• Origins of Life, Cambridge University Press, 1999.
• The Sun, the Genome and the Internet: Tools of Scientific Revolution, Oxford
University Press, 1999.
Elliott Lieb (Princeton University) writes the following in the foreword to
Dyson’s Selected Papers (reprinted with permission):
If any proof be needed that theoretical physics papers are not ephemeral
and are not written on a blackboard that has to be erased every five years,
then the papers in this volume will supply ample witness. The writings of
Freeman Dyson are among the jewels that crown the subject and today
even the earliest among them can be read with profit and much pleasure
by beginners and experts. . .
Dyson along with Feynman, Schwinger, and Tomonaga was a founder of
quantum electrodynamics. When I started my graduate studies in the
fifties, it was not easy to find a coherent pedagogical representation of
the new field, but fortunately, Dyson had given lectures at Cornell in 1951
and these were available as notes. Thanks to their clarity many people,
including me, were able to enter the field.
Recently, these classic notes were published:
F. Dyson, Advanced Quantum Mechanics: Cornell Lectures on Quantum
Electrodynamics 1951, World Scientific, Singapore, 2007.
Feynman’s approach to quantum electrodynamics was elegantly based on
the use of graphs called Feynman diagrams today. David Kaiser writes the
following in his book Drawing Theories Apart: The Dispersion of Feynman
Diagrams in Postwar Physics (the University of Chicago Press, Chicago and
London, 2005 – reprinted with permission):
For all of Feynman’s many contributions to modern physics, his diagrams
have had the widest and longest-lasting influence. Feynman diagrams have
revolutionized nearly every aspect of theoretical physics since the middle
of the twentieth century. Feynman first introduced his diagrams in the
late 1940s as a bookkeeping device for simplifying lengthy calculations
in one area of physics – quantum electrodynamics, physicist’s quantummechanical
description of electromagnetic forces. Soon the diagrams gained
adherents throughout the fields of nuclear and particle physics. Not long
thereafter, other theorists adopted – and subtly adapted – Feynman diagrams
for many-body applications in solid-state physics. By the end of the
1960s, some physicists even wielded the line drawings for calculations in
gravitational physics. With the diagrams’ aid, entire new calculational vistas
opened for physicists; theorists learned to calculate things that many
had barely dreamed possible before World War II. With the list of diagrammatic
applications growing ever longer, Feynman diagrams helped to
transform the way physicists saw the world, and their place within it.
There is no doubt that quantum electrodynamics is one of the most beautiful
theories in theoretical physics. The following quotation is taken from the
forthcoming article “Quantum Theory and Relativity” written by Arthur
Jaffe, Contemporary Mathematics, American Mathematical Society, Providence,
Rhode Island, 2008, pp. 209–245 (reprinted with permission):26
Two major themes dominated twentieth century physics: quantum theory
and relativity. These two fundamental principles provide the cornerstones
upon which one might build the understanding of modern physics. And today
after one century of elaboration of the original discoveries by Poincar´e,
Einstein, Bohr, Schr¨odinger, Heisenberg, Dirac – and many others – one
still dreams of describing the forces of nature within such an arena. Yet
we do not know the answer to the basic question:
Are quantum theory, relativity, and interaction mathematically compatible?
Even if one restricts relativity to special relativity, we do not know the
answer to this question about our four-dimensional world – much less about
other higher-dimensional worlds considered by string theorists.
Should quantum theory with relativity not qualify as logic? Physics suggests
that a natural way to combine quantum theory, special relativity and
interaction is through a nonlinear quantum field. Enormous progress on
this problem has been made over the past forty years. This includes showing
that theories exist in space-times of dimension two and three. Building
this new mathematical framework and finding these examples has become
known as the subject of constructive quantum field theory. . .
For centuries, the tradition in physics has been to describe natural phenomena
by mathematics. Eugene Wigner marveled on the relevance of mathematics
in his famous essay: “On the Unreasonable Effectiveness of Mathematics
in the Natural Sciences,” Comm. Pure Appl. Math. 13 (1960), 1–14.
Intuition can go a long way. But by endowing physics with a mathematical
foundation, one also bestows physical laws with longevity. For mathematical
ideas can be understood and conveyed more easily than conjectures,
both from person to person, and also from generation to generation.
In recent years, we have witnessed enormous progress in another direction –
of transferring ideas from physics to mathematics: to play on Wigner’s title,
concepts from physics have had an unreasonable effectiveness in providing
insight to formulate mathematical conjectures! The resulting infusion
of new perspectives has truly blossomed into a mathematical revolution,
which has been sufficiently robust to touch almost every mathematical
frontier. . .
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
1. Mathematical Principles of Modern Natural
Philosophy
The book of nature is written in the language of mathematics.
Galileo Galilei (1564–1642)
At the beginning of the seventeenth century, two great philosophers, Francis
Bacon (1561–1626) in England and Ren´e Descartes (1596–1650) in
France, proclaimed the birth of modern science. Each of them described
his vision of the future. Their visions were very different. Bacon said, “All
depends on keeping the eye steadily fixed on the facts of nature.” Descartes
said, “I think, therefore I am.” According to Bacon, scientists should travel
over the earth collecting facts, until the accumulated facts reveal how Nature
works. The scientists will then induce from the facts the laws that
Nature obeys. According to Descartes, scientists should stay at home and
deduce the laws of Nature by pure thought. In order to deduce the laws
correctly, the scientists will need only the rules of logic and knowledge of
the existence of God. For four hundred years since Bacon and Descartes led
the way, science has raced ahead by following both paths simultaneously.
Neither Baconian empiricism nor Cartesian dogmatism has the power to
elucidate Nature’s secrets by itself, but both together have been amazingly
successful. For four hundred years English scientists have tended to
be Baconian and French scientists Cartesian.
Faraday (1791–1867) and Darwin (1809–1882) and Rutherford (1871–
1937) were Baconians; Pascal (1623–1662) and Laplace (1749–1827) and
Poincar´e (1854–1912) were Cartesians. Science was greatly enriched by the
cross-fertilization of the two contrasting national cultures. Both cultures
were always at work in both countries. Newton (1643–1727) was at heart
a Cartesian, using pure thought as Descartes intended, and using it to
demolish the Cartesian dogma of vortices. Marie Curie (1867–1934) was
at heart a Baconian, boiling tons of crude uranium ore to demolish the
dogma of the indestructibility of atoms.
Freeman Dyson, 2004
It is important for him who wants to discover not to confine himself to a
chapter of science, but keep in touch with various others.
Jacques Hadamard (1865–1963)
Mathematics takes us still further from what is human, into the region of
absolute necessity, to which not only the actual world, but every possible
world must conform.
Bertrand Russel (1872–1972)
1.1 Basic Principles
There exist the following fundamental principles for the mathematical description
of physical phenomena in nature.
(I) The infinitesimal principle due to Newton and Leibniz: The laws of nature
become simple on an infinitesimal level of space and time.
(II) The optimality principle (or the principle of least action): Physical processes
proceed in such an optimal way that the action is minimal (or at
least critical). Such processes are governed by ordinary or partial differential
equations called the Euler–Lagrange equations.
(III) Emmy Noether’s symmetry principle: Symmetries of the action functional
imply conservation laws for the corresponding Euler–Lagrange
equations (e.g., conservation of energy).
(IV) The gauge principle and Levi-Civita’s parallel transport: The fundamental
forces in nature (gravitational, eletromagnetic, strong, and weak
interaction) are based on the symmetry of the action functional under
local gauge transformations. The corresponding parallel transport
of physical information generates the intrinsic Gauss–Riemann–Cartan–
Ehresmann curvature which, roughly speaking, corresponds to the acting
force (also called interaction). Briefly: force = curvature.
(V) Planck’s quantization principle: Nature jumps.
(VI) Einstein’s principle of special relativity: Physics does not depend on the
choice of the inertial system.
(VII) Einstein’s principle of general relativity: Physics does not depend on
the choice of the local space-time coordinates of an observer.
(VIII) Dirac’s unitarity principle: Quantum physics does not depend on the
choice of the measurement device (i.e., on the choice of an orthonormal
basis in the relevant Hilbert space). This corresponds to the invariance
under unitary transformations.
Geometrization of physics. In mathematics, the properties of geometric
objects do not depend on the choice of the coordinate system. This is
similar to the principles (VI)–(VIII). Therefore, it is quite natural that geometric
methods play a fundamental role in modern physics.
Linearity and nonlinearity. We have to distinguish between
(i) linear processes, and
(ii) nonlinear processes.
In case (i), the superposition principle holds, that is, the superposition of
physical states yields again a physical state. Mathematically, such processes
are described by linear spaces and linear operator equations. The mathematical
analysis can be simplified by representing physical phenomena as
superposition of simpler phenomena. This is the method of harmonic analysis
(e.g., the Fourier method based on the Fourier series, the Fourier integral,
or the Fourier–Stieltjes integral).
In case (ii), the superposition principle is violated. As a rule, interactions
in nature are mathematically described by nonlinear operator equations (e.g.,
nonlinear differential or integral equations). The method of perturbation theory
allows us to reduce (ii) to (i), by using an iterative method.
Basic properties of physical effects. For the mathematical investigation
of physical effects, one has to take the following into account.
(A) Faraday’s locality principle: Physical effects propagate locally in space
and time (law of proximity theory).
(B) Green’s locality principle: The response of a linear physical system can be
described by localizing the external forces in space and time and by considering
the superposition of the corresponding special responses (method
of the Green’s function). Furthermore, this can be used for computing
nonlinear physical systems by iteration.
(C) Planck’s constant: The smallest action (energy × time) in nature is given
by the action quantum h = 6.626 0755 · 10−34Js.
(D) Einstein’s propagation principle: Physical effects travel at most with the
speed of light c in a vacuum. Explicitly, c = 2.997 92458 · 108m/s.
(E) Gauge invariance principle: Physical effects are invariant under local
gauge transformations. Physical experiments are only able to measure
quantities which do not depend on the choice of the gauge condition.
(F) The Planck scale hypothesis: Physics dramatically changes below the
Planck length given by l = 10−35m.
In what follows, let us discuss some basic ideas related to all of the principles
summarized above. To begin with, concerning Faraday’s locality principle,
Maxwell emphasized the following:
Before I began the study of electricity I resolved to read no mathematics on
the subject till I had first read through Faraday’s 1832 paper Experimental
researches on electricity. I was aware that there was supposed to be a
difference between Faraday’s way of conceiving phenomena and that of
the mathematicians, so that neither he nor they were satisfied with each
other’s language. I had also the conviction that this discrepancy did not
arise from either party being wrong. For instance, Faraday, in his mind, saw
lines of force traversing all space where the mathematicians (e.g., Gauss)
saw centers of force attracting at a distance; Faraday saw a medium where
they saw nothing but distance; Faraday sought the seat of the phenomena
in real actions going on in the medium, where they were satisfied that they
had found it in a power of action at a distance impressed on the electric
fluids.
When I had translated what I considered to be Faraday’s ideas into a
mathematical form, I found that in general the results of the two methods
coincide. . . I also found that several of the most fertile methods of research
discovered by the mathematicians could be expressed much better in terms
of the ideas derived from Faraday than in their original form.
Philosophy
The book of nature is written in the language of mathematics.
Galileo Galilei (1564–1642)
At the beginning of the seventeenth century, two great philosophers, Francis
Bacon (1561–1626) in England and Ren´e Descartes (1596–1650) in
France, proclaimed the birth of modern science. Each of them described
his vision of the future. Their visions were very different. Bacon said, “All
depends on keeping the eye steadily fixed on the facts of nature.” Descartes
said, “I think, therefore I am.” According to Bacon, scientists should travel
over the earth collecting facts, until the accumulated facts reveal how Nature
works. The scientists will then induce from the facts the laws that
Nature obeys. According to Descartes, scientists should stay at home and
deduce the laws of Nature by pure thought. In order to deduce the laws
correctly, the scientists will need only the rules of logic and knowledge of
the existence of God. For four hundred years since Bacon and Descartes led
the way, science has raced ahead by following both paths simultaneously.
Neither Baconian empiricism nor Cartesian dogmatism has the power to
elucidate Nature’s secrets by itself, but both together have been amazingly
successful. For four hundred years English scientists have tended to
be Baconian and French scientists Cartesian.
Faraday (1791–1867) and Darwin (1809–1882) and Rutherford (1871–
1937) were Baconians; Pascal (1623–1662) and Laplace (1749–1827) and
Poincar´e (1854–1912) were Cartesians. Science was greatly enriched by the
cross-fertilization of the two contrasting national cultures. Both cultures
were always at work in both countries. Newton (1643–1727) was at heart
a Cartesian, using pure thought as Descartes intended, and using it to
demolish the Cartesian dogma of vortices. Marie Curie (1867–1934) was
at heart a Baconian, boiling tons of crude uranium ore to demolish the
dogma of the indestructibility of atoms.
Freeman Dyson, 2004
It is important for him who wants to discover not to confine himself to a
chapter of science, but keep in touch with various others.
Jacques Hadamard (1865–1963)
Mathematics takes us still further from what is human, into the region of
absolute necessity, to which not only the actual world, but every possible
world must conform.
Bertrand Russel (1872–1972)
1.1 Basic Principles
There exist the following fundamental principles for the mathematical description
of physical phenomena in nature.
(I) The infinitesimal principle due to Newton and Leibniz: The laws of nature
become simple on an infinitesimal level of space and time.
(II) The optimality principle (or the principle of least action): Physical processes
proceed in such an optimal way that the action is minimal (or at
least critical). Such processes are governed by ordinary or partial differential
equations called the Euler–Lagrange equations.
(III) Emmy Noether’s symmetry principle: Symmetries of the action functional
imply conservation laws for the corresponding Euler–Lagrange
equations (e.g., conservation of energy).
(IV) The gauge principle and Levi-Civita’s parallel transport: The fundamental
forces in nature (gravitational, eletromagnetic, strong, and weak
interaction) are based on the symmetry of the action functional under
local gauge transformations. The corresponding parallel transport
of physical information generates the intrinsic Gauss–Riemann–Cartan–
Ehresmann curvature which, roughly speaking, corresponds to the acting
force (also called interaction). Briefly: force = curvature.
(V) Planck’s quantization principle: Nature jumps.
(VI) Einstein’s principle of special relativity: Physics does not depend on the
choice of the inertial system.
(VII) Einstein’s principle of general relativity: Physics does not depend on
the choice of the local space-time coordinates of an observer.
(VIII) Dirac’s unitarity principle: Quantum physics does not depend on the
choice of the measurement device (i.e., on the choice of an orthonormal
basis in the relevant Hilbert space). This corresponds to the invariance
under unitary transformations.
Geometrization of physics. In mathematics, the properties of geometric
objects do not depend on the choice of the coordinate system. This is
similar to the principles (VI)–(VIII). Therefore, it is quite natural that geometric
methods play a fundamental role in modern physics.
Linearity and nonlinearity. We have to distinguish between
(i) linear processes, and
(ii) nonlinear processes.
In case (i), the superposition principle holds, that is, the superposition of
physical states yields again a physical state. Mathematically, such processes
are described by linear spaces and linear operator equations. The mathematical
analysis can be simplified by representing physical phenomena as
superposition of simpler phenomena. This is the method of harmonic analysis
(e.g., the Fourier method based on the Fourier series, the Fourier integral,
or the Fourier–Stieltjes integral).
In case (ii), the superposition principle is violated. As a rule, interactions
in nature are mathematically described by nonlinear operator equations (e.g.,
nonlinear differential or integral equations). The method of perturbation theory
allows us to reduce (ii) to (i), by using an iterative method.
Basic properties of physical effects. For the mathematical investigation
of physical effects, one has to take the following into account.
(A) Faraday’s locality principle: Physical effects propagate locally in space
and time (law of proximity theory).
(B) Green’s locality principle: The response of a linear physical system can be
described by localizing the external forces in space and time and by considering
the superposition of the corresponding special responses (method
of the Green’s function). Furthermore, this can be used for computing
nonlinear physical systems by iteration.
(C) Planck’s constant: The smallest action (energy × time) in nature is given
by the action quantum h = 6.626 0755 · 10−34Js.
(D) Einstein’s propagation principle: Physical effects travel at most with the
speed of light c in a vacuum. Explicitly, c = 2.997 92458 · 108m/s.
(E) Gauge invariance principle: Physical effects are invariant under local
gauge transformations. Physical experiments are only able to measure
quantities which do not depend on the choice of the gauge condition.
(F) The Planck scale hypothesis: Physics dramatically changes below the
Planck length given by l = 10−35m.
In what follows, let us discuss some basic ideas related to all of the principles
summarized above. To begin with, concerning Faraday’s locality principle,
Maxwell emphasized the following:
Before I began the study of electricity I resolved to read no mathematics on
the subject till I had first read through Faraday’s 1832 paper Experimental
researches on electricity. I was aware that there was supposed to be a
difference between Faraday’s way of conceiving phenomena and that of
the mathematicians, so that neither he nor they were satisfied with each
other’s language. I had also the conviction that this discrepancy did not
arise from either party being wrong. For instance, Faraday, in his mind, saw
lines of force traversing all space where the mathematicians (e.g., Gauss)
saw centers of force attracting at a distance; Faraday saw a medium where
they saw nothing but distance; Faraday sought the seat of the phenomena
in real actions going on in the medium, where they were satisfied that they
had found it in a power of action at a distance impressed on the electric
fluids.
When I had translated what I considered to be Faraday’s ideas into a
mathematical form, I found that in general the results of the two methods
coincide. . . I also found that several of the most fertile methods of research
discovered by the mathematicians could be expressed much better in terms
of the ideas derived from Faraday than in their original form.
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
1.2 The Infinitesimal Strategy and Differential
Equations
Differential equations are the foundation of the natural scientific, mathematical
view of the world.
Vladimir Igorevich Arnold (born 1937)
The infinitesimal strategy due to Newton and Leibniz studies the behavior of
a physical system for infinitesimally small time intervals and infinitesimally
small distances. This leads to the encoding of physical processes into differential
equations (e.g., Newton’s equations of motion in mechanics, or Maxwell’s
equations in electrodynamics).
The task of mathematics is to decode this information; that is, to
solve the fundamental differential equations.
1.3 The Optimality Principle
It is crucial that the class of possible differential equations is strongly restricted
by the optimality principle. This principle tells us that the fundamental
differential equations are the Euler–Lagrange equations to variational
problems. In 1918, Emmy Noether formulated her general symmetry principle
in the calculus of variations. The famous Noether theorem combines Lie’s
theory of continuous groups with the calculus of variations due to Euler and
Lagrange. This will be studied in Section 6.6.
1.4 The Basic Notion of Action in Physics and the Idea
of Quantization
The most important physical quantity in physics is not the energy, but the
action which has the physical dimension energy times time. The following is
crucial.
(i) The fundamental processes in nature are governed by the principle of
least action
S = min!
where we have to add appropriate side conditions.In fact, one has to use
the more general principal S = critical! (principle of critical action).
In this case, the principle of critical action
reads as
S[q] = critical!, q(t0) = q0, q(t1) = q1 (1.1)
where we fix the following quantities: the initial time t0, the initial position
q0 of the particle, the final time t1, and the final position q1 of the
particle. The solutions of (1.1) satisfy the Euler–Lagrange equation
mddq(t)/dtdt = F(t), t∈ R
with the force F(t) = −U(q(t)). This coincides with the Newtonian
equation of motion (see Sect. 6.5).
(ii) In 1900 Planck postulated that there do not exist arbitrarily small
amounts of action in nature. The smallest amount of action in nature
is equal to the Planck constant h. In ancient times, philosophers said:
Natura non facit saltus. (Nature does never make a jump.)
In his “Noveaux essais,” Leibniz wrote:
Tout va par degr´es dans la nature et rien par saut. (In nature
everything proceeds little by little and not by jumping.)
In contrast to this classical philosophy, Planck formulated the hypothesis
in 1900 that the energy levels of a harmonic oscillator form a discrete set.
He used this fact in order to derive his radiation law for black bodies (see
Sect. 2.3.1 of Vol. I). This was the birth of quantum physics. More generally,
the energy levels of the bound states of an atom or a molecule are
discrete. The corresponding energy jumps cause the spectra of atoms and
molecules observed in physical experiments (e.g., the spectral analysis of
the light coming from stars). Nowadays, we say that:
Nature jumps.
This reflects a dramatic change in our philosophical understanding of
nature.
(iii) In order to mathematically describe quantum effects, one has to modify
classical theories. This is called the process of quantization, which we
will encounter in this series of monographs again and again. As an introduction
to this, we recommend reading Chapter 7. Now to the point.
Feynman discovered in the early 1940s in his dissertation in Princeton
that the process of quantization can be most elegantly described by path
integrals (also called functional integrals) of the form
exp(iS[ψ]/h) Dψ
where we sum over all classical fields ψ (with appropriate side conditions).
Here, h := h/2π. For example, the quantization of the classical particle
considered in (i) can be based on the formula
G(q0, t0; q1, t1) = exp(iS[q]/h) Dq.
Here, we sum over all classical motions q = q(t) which satisfy the side
condition q(t0) = q0 and q(t1) = q1. The Green’s function G determines
the time-evolution of the wave function ψ, that is, if we know the wave
function ψ = ψ(x, t0) at the initial time t0, then we know the wave
function at the later time t by the formula
ψ(x, t) = G(x, t; y, t0)ψ(y, t0)dy.
Finally, the wave function ψ tells us the probability
|ψ(x, t)|ψ(x, t)|dx
of finding the quantum particle in the interval [a, b] at time t.
(iv) In quantum field theory, one uses the functional integral
exp(iS[ψ]/h) exp(i<ψ|J >)Dψ
with the additional external source J. Differentiation with respect to J
yields the moments of the quantum field. In turn, the moments determine
the correlation functions (also called Green’s functions). The correlation
functions describe the correlations between different positions of
the quantum field at different points in time. These correlations are the
most important properties of the quantum field which can be related to
physical measurements.
Feynman’s functional integral approach to quantum physics clearly shows
that both classical and quantum physics are governed by the classical action
functional S. This approach can also be extended to the study of manyparticle
systems at finite temperature, as we have discussed in Sect. 13.8 of
Vol. I. Summarizing, let us formulate the following general strategy:
The main task in modern physics is the mathematical description of
the propagation of physical effects caused by interactions and their
quantization.
In Sect. 1.9 we will show that in modern physics, interactions are described
by gauge theories based on local symmetries.
Iterative solution of nonlinear problems. The experience of physicists
shows that
Interactions in nature lead to nonlinear terms in the corresponding
differential equations.
This explains the importance of nonlinear problems in physics. We want
to show that the Green’s function can also be used in order to investigate
nonlinear problems.
Therefore, nonlinear problems can be iteratively solved if the Green’s
function is known.
Resonance effects cause singularities of the Green’s function.
Resonances are responsible for complicated physical effects.
For example, the observed chaotic motion of some asteroids is due to resonance
effects in celestial mechanics (the Kolmogorov–Arnold–Moser theory).
In quantum field theory, internal resonances of the quantum field cause special
quantum effects (e.g., the Lamb shift in the spectrum of the hydrogen
atom and the anomalous magnetic moment of the electron), which have to
be treated with the methods of renormalization theory (see Chap. 17 on
radiative corrections in quantum electrodynamics).
Equations
Differential equations are the foundation of the natural scientific, mathematical
view of the world.
Vladimir Igorevich Arnold (born 1937)
The infinitesimal strategy due to Newton and Leibniz studies the behavior of
a physical system for infinitesimally small time intervals and infinitesimally
small distances. This leads to the encoding of physical processes into differential
equations (e.g., Newton’s equations of motion in mechanics, or Maxwell’s
equations in electrodynamics).
The task of mathematics is to decode this information; that is, to
solve the fundamental differential equations.
1.3 The Optimality Principle
It is crucial that the class of possible differential equations is strongly restricted
by the optimality principle. This principle tells us that the fundamental
differential equations are the Euler–Lagrange equations to variational
problems. In 1918, Emmy Noether formulated her general symmetry principle
in the calculus of variations. The famous Noether theorem combines Lie’s
theory of continuous groups with the calculus of variations due to Euler and
Lagrange. This will be studied in Section 6.6.
1.4 The Basic Notion of Action in Physics and the Idea
of Quantization
The most important physical quantity in physics is not the energy, but the
action which has the physical dimension energy times time. The following is
crucial.
(i) The fundamental processes in nature are governed by the principle of
least action
S = min!
where we have to add appropriate side conditions.In fact, one has to use
the more general principal S = critical! (principle of critical action).
In this case, the principle of critical action
reads as
S[q] = critical!, q(t0) = q0, q(t1) = q1 (1.1)
where we fix the following quantities: the initial time t0, the initial position
q0 of the particle, the final time t1, and the final position q1 of the
particle. The solutions of (1.1) satisfy the Euler–Lagrange equation
mddq(t)/dtdt = F(t), t∈ R
with the force F(t) = −U(q(t)). This coincides with the Newtonian
equation of motion (see Sect. 6.5).
(ii) In 1900 Planck postulated that there do not exist arbitrarily small
amounts of action in nature. The smallest amount of action in nature
is equal to the Planck constant h. In ancient times, philosophers said:
Natura non facit saltus. (Nature does never make a jump.)
In his “Noveaux essais,” Leibniz wrote:
Tout va par degr´es dans la nature et rien par saut. (In nature
everything proceeds little by little and not by jumping.)
In contrast to this classical philosophy, Planck formulated the hypothesis
in 1900 that the energy levels of a harmonic oscillator form a discrete set.
He used this fact in order to derive his radiation law for black bodies (see
Sect. 2.3.1 of Vol. I). This was the birth of quantum physics. More generally,
the energy levels of the bound states of an atom or a molecule are
discrete. The corresponding energy jumps cause the spectra of atoms and
molecules observed in physical experiments (e.g., the spectral analysis of
the light coming from stars). Nowadays, we say that:
Nature jumps.
This reflects a dramatic change in our philosophical understanding of
nature.
(iii) In order to mathematically describe quantum effects, one has to modify
classical theories. This is called the process of quantization, which we
will encounter in this series of monographs again and again. As an introduction
to this, we recommend reading Chapter 7. Now to the point.
Feynman discovered in the early 1940s in his dissertation in Princeton
that the process of quantization can be most elegantly described by path
integrals (also called functional integrals) of the form
exp(iS[ψ]/h) Dψ
where we sum over all classical fields ψ (with appropriate side conditions).
Here, h := h/2π. For example, the quantization of the classical particle
considered in (i) can be based on the formula
G(q0, t0; q1, t1) = exp(iS[q]/h) Dq.
Here, we sum over all classical motions q = q(t) which satisfy the side
condition q(t0) = q0 and q(t1) = q1. The Green’s function G determines
the time-evolution of the wave function ψ, that is, if we know the wave
function ψ = ψ(x, t0) at the initial time t0, then we know the wave
function at the later time t by the formula
ψ(x, t) = G(x, t; y, t0)ψ(y, t0)dy.
Finally, the wave function ψ tells us the probability
|ψ(x, t)|ψ(x, t)|dx
of finding the quantum particle in the interval [a, b] at time t.
(iv) In quantum field theory, one uses the functional integral
exp(iS[ψ]/h) exp(i<ψ|J >)Dψ
with the additional external source J. Differentiation with respect to J
yields the moments of the quantum field. In turn, the moments determine
the correlation functions (also called Green’s functions). The correlation
functions describe the correlations between different positions of
the quantum field at different points in time. These correlations are the
most important properties of the quantum field which can be related to
physical measurements.
Feynman’s functional integral approach to quantum physics clearly shows
that both classical and quantum physics are governed by the classical action
functional S. This approach can also be extended to the study of manyparticle
systems at finite temperature, as we have discussed in Sect. 13.8 of
Vol. I. Summarizing, let us formulate the following general strategy:
The main task in modern physics is the mathematical description of
the propagation of physical effects caused by interactions and their
quantization.
In Sect. 1.9 we will show that in modern physics, interactions are described
by gauge theories based on local symmetries.
Iterative solution of nonlinear problems. The experience of physicists
shows that
Interactions in nature lead to nonlinear terms in the corresponding
differential equations.
This explains the importance of nonlinear problems in physics. We want
to show that the Green’s function can also be used in order to investigate
nonlinear problems.
Therefore, nonlinear problems can be iteratively solved if the Green’s
function is known.
Resonance effects cause singularities of the Green’s function.
Resonances are responsible for complicated physical effects.
For example, the observed chaotic motion of some asteroids is due to resonance
effects in celestial mechanics (the Kolmogorov–Arnold–Moser theory).
In quantum field theory, internal resonances of the quantum field cause special
quantum effects (e.g., the Lamb shift in the spectrum of the hydrogen
atom and the anomalous magnetic moment of the electron), which have to
be treated with the methods of renormalization theory (see Chap. 17 on
radiative corrections in quantum electrodynamics).
由一星于2014-08-02, 06:28进行了最后一次编辑,总共编辑了4次
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
格林函数[编辑]
[ltr]在数学中,格林函数(点源函数、影响函数)是一种用来解有初始条件或边界条件的非齐次微分方程的函数。在物理学的多体理论中,格林函数常常指各种关联函数,有时并不符合数学上的定义。
格林函数的名称是来自于英国数学家乔治·格林(George Green),早在1830年代,他是第一个提出这个概念的人。
[/ltr]
[size][ltr]
定义以及用法[编辑]
给定流形上的微分算子 ,其格林函数,为以下方程的解
其中 为狄拉克δ函数。此技巧可用来解下列形式的微分方程:
若的 零空间非平凡,则格林函数不唯一。不过,实际上因着对称性、边界条件或其他的因素,可以找到唯一的格林函数。一般来说,格林函数只是一个广义函数。
格林函数在凝聚态物理学中常被使用,因为格林函数允许扩散方程式有较高的精度。在量子力学中,哈密顿算子的格林函数和状态密度有重要的关系。由于扩散方程式和薛定谔方程有类似的数学结构,因此两者对应的格林函数也相当接近。
动机[编辑]
若可找到线性算符 的格林函数 ,则可将 (1) 式两侧同乘,再对变量 积分,可得:
由公式 (2) 可知上式的等号右侧等于 ,因此:
由于算符 为线式,且只对变量 作用,不对被积分的变量 作用),所以可以将等号右边的算符 移到积分符号以外,可得:
而以下的式子也会成立:
因此,若知道 (1) 式的格林函数,及 (2) 式中的 f(x),由于 L 为线性算符,可以用上述的方式得到 u(x)。换句话说, (2) 式的解 u(x) 可以由 (3) 式的积分得到。若可以找到满足 (1) 式的格林函数 G ,就可以求出 u(x)。
并非所有的算符 L 都存在对应的格林函数。格林函数也可以视为算符 L 的左逆元素。撇开要找到特定算符的格林函数的难度不论,(3) 式的积分也很难求解,因此此方法只能算是提供了一个理论上存在的解法。
格林函数可以用来解非齐次的微-积分方程──多半是施图姆-刘维尔问题。若G 是算符 L 的格林函数,则方程式 Lu = f 的解 u 为
可以视为 f 依狄拉克δ函数的基底展开,再将所有投影量叠加的结果。以上的积分为弗雷德霍姆积分方程。
非齐次边界值问题的求解[编辑]
格林函数的主要用途是用来求解非齐次的边界值问题。在近代的理论物理中,格林函数一般是用来作为费曼图中的传播子,而“格林函数”一词也用来表示量子力学中的关联函数。
研究框架[编辑]
令 为一个施图姆-刘维尔运算子,是一个以以下形式表示的线性微分运算子
而 D 是边界条件运算子
令 为在 区间的连续函数,并假设以下问题
有正则特牲;即其齐次问题只存在寻常解。
定理[编辑]
则存在唯一解 满足以下方程式
而其解的计算方式如下
而中 即为格林函数,有以下的特性:
[/ltr][/size]
[size][ltr]
寻找格林函数[编辑]
特征向量展开[编辑]
若一微分算子 L 有一组完备的特征向量 (也就是一组函数 及标量 使得 成立)则可以由特征向量及特征值产生格林函数。
先假设函数 满足以下的完备性:
经由证明可得下式:
若在等号两侧加上微分算子 L,则可以证明以上假设的完备性。
有关以上格林函数的进一步研究,及格林函数和特征向量所组成空间的关系,则为弗雷德霍姆理论所要探讨的内容。
拉普拉斯算子的格林函数[编辑]
先由格林定理开始:
假设线性算符 L 为拉普拉斯算子 ,而 G 为拉普拉斯算子的格林函数。则因为格林函数的定义,可得下式:
令格林定理中的 ,可得:
根据上式,可以解拉普拉斯方程 或 泊松方程 ,其边界条件可以为狄利克雷边界条件或是诺伊曼边界条件。换句话说,在以下任一个条件成立时,可以解一空间内任一位置的 :
[/ltr][/size]
[size][ltr]
若想解在区域内的 ,由于狄拉克δ函数的特性,(4) 式等号左边的第一项
可化简为 ,再将 (4) 式等号左边第二项 用 表示,(若为泊松方程,,若为拉普拉斯方程,),可得:
上式即为调和函数(harmonic function)的特性之一:若边界上的值或法向导数已知,则可以求出区域内每个位置的数值。
在静电学中, 为电位, 为电荷密度,而法向导数 则为电场在法向的分量。
若目前的边界条件为狄利克雷边界条件,可以选择在 x 或 x' 在边界时,其值也为 0 的格林函数。若边界条件为诺伊曼边界条件,可以选择在 x 或 x' 在边界时,其法向导数为 0 的格林函数。因此 (5) 式等号右侧的二个积分项有一项为 0 ,只剩下一项需计算。
在自由空间的情形下(此时可将边界条件视为:),拉普拉斯算子的格林函数为:
若 为电荷密度,则可得到电荷密度和电位 的公式:
范例[编辑]
针对以下微分方程
找出格林函数。
第 1 步
根据定理中,格林函数的特性 2,可得
在 x < s 时因特性 3 可知
(此时不需考虑 的式子,因 )在 x > s 时因特性 3 可知
(此时不需考虑 的式子,因 )整理上述的结果,可得以下的式子。
第 2 步
依格林函数的特性,找出 a(s)和b(s).
根据特性 1,可得
.
根据特性 4,可得
解上述二式,可以求出 a(s)和b(s)
.
因此格林函数为
对照此解和格林函数的特性 5,可知此解也满足特性 5 的要求。
其他举例[编辑]
[/ltr][/size]
[size][ltr]
参见[编辑]
[/ltr][/size]
[size][ltr]
参考[编辑]
[/ltr][/size]
[size][ltr]
外部链接[编辑]
[/ltr][/size]
[ltr]在数学中,格林函数(点源函数、影响函数)是一种用来解有初始条件或边界条件的非齐次微分方程的函数。在物理学的多体理论中,格林函数常常指各种关联函数,有时并不符合数学上的定义。
格林函数的名称是来自于英国数学家乔治·格林(George Green),早在1830年代,他是第一个提出这个概念的人。
[/ltr]
[size][ltr]
定义以及用法[编辑]
给定流形上的微分算子 ,其格林函数,为以下方程的解
其中 为狄拉克δ函数。此技巧可用来解下列形式的微分方程:
若的 零空间非平凡,则格林函数不唯一。不过,实际上因着对称性、边界条件或其他的因素,可以找到唯一的格林函数。一般来说,格林函数只是一个广义函数。
格林函数在凝聚态物理学中常被使用,因为格林函数允许扩散方程式有较高的精度。在量子力学中,哈密顿算子的格林函数和状态密度有重要的关系。由于扩散方程式和薛定谔方程有类似的数学结构,因此两者对应的格林函数也相当接近。
动机[编辑]
若可找到线性算符 的格林函数 ,则可将 (1) 式两侧同乘,再对变量 积分,可得:
由公式 (2) 可知上式的等号右侧等于 ,因此:
由于算符 为线式,且只对变量 作用,不对被积分的变量 作用),所以可以将等号右边的算符 移到积分符号以外,可得:
而以下的式子也会成立:
因此,若知道 (1) 式的格林函数,及 (2) 式中的 f(x),由于 L 为线性算符,可以用上述的方式得到 u(x)。换句话说, (2) 式的解 u(x) 可以由 (3) 式的积分得到。若可以找到满足 (1) 式的格林函数 G ,就可以求出 u(x)。
并非所有的算符 L 都存在对应的格林函数。格林函数也可以视为算符 L 的左逆元素。撇开要找到特定算符的格林函数的难度不论,(3) 式的积分也很难求解,因此此方法只能算是提供了一个理论上存在的解法。
格林函数可以用来解非齐次的微-积分方程──多半是施图姆-刘维尔问题。若G 是算符 L 的格林函数,则方程式 Lu = f 的解 u 为
可以视为 f 依狄拉克δ函数的基底展开,再将所有投影量叠加的结果。以上的积分为弗雷德霍姆积分方程。
非齐次边界值问题的求解[编辑]
格林函数的主要用途是用来求解非齐次的边界值问题。在近代的理论物理中,格林函数一般是用来作为费曼图中的传播子,而“格林函数”一词也用来表示量子力学中的关联函数。
研究框架[编辑]
令 为一个施图姆-刘维尔运算子,是一个以以下形式表示的线性微分运算子
而 D 是边界条件运算子
令 为在 区间的连续函数,并假设以下问题
有正则特牲;即其齐次问题只存在寻常解。
定理[编辑]
则存在唯一解 满足以下方程式
而其解的计算方式如下
而中 即为格林函数,有以下的特性:
[/ltr][/size]
- 对 及 连续。。
- 对所有 , .
- 对所有 , .
- 微分跳跃:.
- 对称:.
[size][ltr]
寻找格林函数[编辑]
特征向量展开[编辑]
若一微分算子 L 有一组完备的特征向量 (也就是一组函数 及标量 使得 成立)则可以由特征向量及特征值产生格林函数。
先假设函数 满足以下的完备性:
经由证明可得下式:
若在等号两侧加上微分算子 L,则可以证明以上假设的完备性。
有关以上格林函数的进一步研究,及格林函数和特征向量所组成空间的关系,则为弗雷德霍姆理论所要探讨的内容。
拉普拉斯算子的格林函数[编辑]
先由格林定理开始:
假设线性算符 L 为拉普拉斯算子 ,而 G 为拉普拉斯算子的格林函数。则因为格林函数的定义,可得下式:
令格林定理中的 ,可得:
根据上式,可以解拉普拉斯方程 或 泊松方程 ,其边界条件可以为狄利克雷边界条件或是诺伊曼边界条件。换句话说,在以下任一个条件成立时,可以解一空间内任一位置的 :
[/ltr][/size]
- 已知 在边界上的值(狄利克雷边界条件)。
- 已知 在边界上的法向导数(诺伊曼边界条件)。
[size][ltr]
若想解在区域内的 ,由于狄拉克δ函数的特性,(4) 式等号左边的第一项
可化简为 ,再将 (4) 式等号左边第二项 用 表示,(若为泊松方程,,若为拉普拉斯方程,),可得:
上式即为调和函数(harmonic function)的特性之一:若边界上的值或法向导数已知,则可以求出区域内每个位置的数值。
在静电学中, 为电位, 为电荷密度,而法向导数 则为电场在法向的分量。
若目前的边界条件为狄利克雷边界条件,可以选择在 x 或 x' 在边界时,其值也为 0 的格林函数。若边界条件为诺伊曼边界条件,可以选择在 x 或 x' 在边界时,其法向导数为 0 的格林函数。因此 (5) 式等号右侧的二个积分项有一项为 0 ,只剩下一项需计算。
在自由空间的情形下(此时可将边界条件视为:),拉普拉斯算子的格林函数为:
若 为电荷密度,则可得到电荷密度和电位 的公式:
范例[编辑]
针对以下微分方程
找出格林函数。
第 1 步
根据定理中,格林函数的特性 2,可得
在 x < s 时因特性 3 可知
(此时不需考虑 的式子,因 )在 x > s 时因特性 3 可知
(此时不需考虑 的式子,因 )整理上述的结果,可得以下的式子。
第 2 步
依格林函数的特性,找出 a(s)和b(s).
根据特性 1,可得
.
根据特性 4,可得
解上述二式,可以求出 a(s)和b(s)
.
因此格林函数为
对照此解和格林函数的特性 5,可知此解也满足特性 5 的要求。
其他举例[编辑]
[/ltr][/size]
- 若流形为 R,而线性算符 L 为 d/dx,则单位阶跃函数 H(x − x0) 为 L 在x0 处的格林函数。
- 若流形为第一象限平面 { (x, y) : x, y ≥ 0 } 而线性算符 L 为拉普拉斯算子,并假设在x = 0 处有狄利克雷边界条件,而在y = 0 处有诺依曼边界条件,则其格林函数为
[size][ltr]
参见[编辑]
[/ltr][/size]
[size][ltr]
参考[编辑]
[/ltr][/size]
- Eyges, Leonard, The Classical Electromagnetic Field, Dover Publications, New York, 1972. ISBN 0-486-63947-9.(其中的第五章介绍如何使用格林函数解静电场的边界值问题)
- A. D. Polyanin and V. F. Zaitsev, Handbook of Exact Solutions for Ordinary Differential Equations (2nd edition), Chapman & Hall/CRC Press, Boca Raton, 2003. ISBN 1-58488-297-2
- A. D. Polyanin, Handbook of Linear Partial Differential Equations for Engineers and Scientists, Chapman & Hall/CRC Press, Boca Raton, 2002.ISBN 1-58488-299-9
[size][ltr]
外部链接[编辑]
[/ltr][/size]
- MathWorld上 Green's Function 的资料,作者:埃里克·韦斯坦因。
- Green's function for differential operator at PlanetMath.
- PlanetMath上Green's function的资料。
- 格林函数简介(英文)
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
傅里叶变换
[ltr]傅里叶变换(法语:Transformation de Fourier、英语:Fourier transform)是一种线性的积分变换,常在将信号在时域(或空域)和频域之间变换时使用,在物理学和工程学中有许多应用。因其基本思想首先由法国学者约瑟夫·傅里叶系统地提出,所以以其名字来命名以示纪念。
经过傅里叶变换而生成的函数 称作原函数 的傅里叶变换、亦或其频谱。傅里叶变换是可逆的,即可通过 确定其原函数 。通常情况下, 是实数函数,而 则是复数函数,用一个复数来表示振幅和相位。
“傅里叶变换”一词既可以指变换操作本身(将函数 进行傅里叶变换),又可以指该操作所生成的复数函数( 是 的傅里叶变换)。[/ltr]
5 傅里叶变换的不同变种
5.1 连续傅里叶变换
5.2 傅里叶级数
5.3 离散时间傅里叶变换
5.4 离散傅里叶变换
5.5 在阿贝尔群上的统一描述
5.6 时频分析变换
5.7 傅里叶变换家族
6 常用傅里叶变换表
6.1 函数关系
6.2 平方可积函数
6.3 分布
6.4 二元函数
6.5 三元函数
7 参见
8 参考资料
9 外部链接
[ltr]
简介[编辑]
参见:傅立叶变换家族中的关系[/ltr]
[ltr]
傅里叶变换源自对傅里叶级数的研究。在对傅里叶级数的研究中,复杂的周期函数可以用一系列简单的正弦、余弦波之和表示。傅里叶变换是对傅里叶级数的扩展,由它表示的函数的周期趋近于无穷。
中文译名[编辑]
英语:Fourier transform 或 法语:Transformée de Fourier 有多个中文译名,常见的有“傅里叶变换”、“傅立叶变换”、“付立叶变换”、“傅利葉轉換”、“傅氏轉換”及“傅氏變換”等等。为方便起见,本文统一写作“傅里叶变换”。
应用[编辑]
傅里叶变换在物理学、声学、光学、结构动力学、量子力学、数论、组合数学、概率论、统计学、信号处理、密码学、海洋学、通讯、金融等领域都有着广泛的应用。例如在信号处理中,傅里叶变换的典型用途是将信号分解成振幅分量和频率分量。
基本性质[编辑]
线性性质[编辑]
两函数之和的傅里叶变换等于各自变换之和。数学描述是:若函数和的傅里叶变换和都存在,和为任意常系数,则;傅里叶变换算符可经归一化成为幺正算符。
平移性质[编辑]
若函数存在傅里叶变换,则对任意实数,函数也存在傅里叶变换,且有。式中花体是傅里叶变换的作用算子,平体表示变换的结果(复函数),为自然对数的底,为虚数单位。
微分关系[编辑]
若函数当时的极限为0,而其导函数的傅里叶变换存在,则有,即导函数的傅里叶变换等于原函数的傅里叶变换乘以因子。更一般地,若,且存在,则,即k阶导数的傅里叶变换等于原函数的傅里叶变换乘以因子。
卷积特性[编辑]
若函数及都在上绝对可积,则卷积函数(或者)的傅里叶变换存在,且。卷积性质的逆形式为,即两个函数卷积的傅里叶逆变换等于它们各自的傅里叶逆变换的乘积乘以。
帕塞瓦尔定理[编辑]
若函数可积且平方可积,则。其中是的傅里叶变换。
更一般化而言,若函数和皆平方可积方程(Square-integrable function),则。其中中和中分别是和的傅里叶变换, 代表复共轭。
傅里叶变换的不同变种[编辑]
连续傅里叶变换[编辑]
主条目:连续傅里叶变换
一般情况下,若“傅里叶变换”一词不加任何限定语,则指的是“连续傅里叶变换”(连续函数的傅里叶变换)。连续傅里叶变换将平方可积的函数f(t)表示成复指数函数的积分或级数形式。
这是将频率域的函数F(ω)表示为时间域的函数f(t)的积分形式。
连续傅里叶变换的逆变换(inverse Fourier transform)为
即将时间域的函数f(t)表示为频率域的函数F(ω)的积分。
一般可称函数f(t)为原函数,而称函数F(ω)为傅里叶变换的像函数,原函数和像函数构成一个傅里叶变换对(transform pair)。
除此之外,还有其它型式的变换对,以下两种型式亦常被使用。在通讯或是讯号处理方面,常以来代换,而形成新的变换对:
或者是因系数重分配而得到新的变换对:
一种对连续傅里叶变换的推广称为分数傅里叶变换(Fractional Fourier Transform)。
当f(t)为偶函数(或奇函数)时,其正弦(或余弦)分量将消亡,而可以称这时的变换为余弦转换(cosine transform)或正弦转换(sine transform).
另一个值得注意的性质是,当f(t)为纯实函数时,F(−ω) = F*(ω)成立.
傅里叶级数[编辑]
主条目:傅里叶级数
连续形式的傅里叶变换其实是傅里叶级数(Fourier series)的推广,因为积分其实是一种极限形式的求和算子而已。对于周期函数,其傅里叶级数是存在的:
其中为复振幅。对于实值函数,函数的傅里叶级数可以写成:
其中an和bn是实频率分量的振幅。
傅里叶分析最初是研究周期性现象,即傅里叶级数的,后来通过傅里叶变换将其推广到了非周期性现象。理解这种推广过程的一种方式是将非周期性现象视为周期性现象的一个特例,即其周期为无限长。
离散时间傅里叶变换[编辑]
主条目:离散时间傅里叶变换
离散傅里叶变换是离散时间傅里叶变换(DTFT)的特例(有时作为后者的近似)。DTFT在时域上离散,在频域上则是周期的。DTFT可以被看作是傅里叶级数的逆转换。
离散傅里叶变换[编辑]
主条目:离散傅里叶变换
为了在科学计算和数字信号处理等领域使用计算机进行傅里叶变换,必须将函数xn定义在离散点而非连续域内,且须满足有限性或周期性条件。这种情况下,使用离散傅里叶变换,将函数xn表示为下面的求和形式:
其中是傅里叶振幅。直接使用这个公式计算的计算复杂度为,而快速傅里叶变换(FFT)可以将复杂度改进为。计算复杂度的降低以及数字电路计算能力的发展使得DFT成为在信号处理领域十分实用且重要的方法。
在阿贝尔群上的统一描述[编辑]
以上各种傅里叶变换可以被更统一的表述成任意局部紧致的阿贝尔群上的傅里叶变换。这一问题属于调和分析的范畴。在调和分析中,一个变换从一个群变换到它的对偶群(dual group)。此外,将傅里叶变换与卷积相联系的卷积定理在调和分析中也有类似的结论。傅里叶变换的广义理论基础参见庞特里亚金对偶性(Pontryagin duality)中的介绍。
时频分析变换[编辑]
主条目:时频分析变换
小波变换,chirplet转换和分数傅里叶变换试图得到时间信号的频率信息。同时解析频率和时间的能力在数学上受不确定性原理的限制。
傅里叶变换家族[编辑]
下表列出了傅里叶变换家族的成员。容易发现,函数在时(频)域的离散对应于其像函数在频(时)域的周期性.反之连续则意味着在对应域的信号的非周期性.[/ltr]
|
经过傅里叶变换而生成的函数 称作原函数 的傅里叶变换、亦或其频谱。傅里叶变换是可逆的,即可通过 确定其原函数 。通常情况下, 是实数函数,而 则是复数函数,用一个复数来表示振幅和相位。
“傅里叶变换”一词既可以指变换操作本身(将函数 进行傅里叶变换),又可以指该操作所生成的复数函数( 是 的傅里叶变换)。[/ltr]
[ltr]目录
[/ltr]
[/ltr]
[ltr]
简介[编辑]
参见:傅立叶变换家族中的关系[/ltr]
[ltr]
傅里叶变换源自对傅里叶级数的研究。在对傅里叶级数的研究中,复杂的周期函数可以用一系列简单的正弦、余弦波之和表示。傅里叶变换是对傅里叶级数的扩展,由它表示的函数的周期趋近于无穷。
中文译名[编辑]
英语:Fourier transform 或 法语:Transformée de Fourier 有多个中文译名,常见的有“傅里叶变换”、“傅立叶变换”、“付立叶变换”、“傅利葉轉換”、“傅氏轉換”及“傅氏變換”等等。为方便起见,本文统一写作“傅里叶变换”。
应用[编辑]
傅里叶变换在物理学、声学、光学、结构动力学、量子力学、数论、组合数学、概率论、统计学、信号处理、密码学、海洋学、通讯、金融等领域都有着广泛的应用。例如在信号处理中,傅里叶变换的典型用途是将信号分解成振幅分量和频率分量。
基本性质[编辑]
线性性质[编辑]
两函数之和的傅里叶变换等于各自变换之和。数学描述是:若函数和的傅里叶变换和都存在,和为任意常系数,则;傅里叶变换算符可经归一化成为幺正算符。
平移性质[编辑]
若函数存在傅里叶变换,则对任意实数,函数也存在傅里叶变换,且有。式中花体是傅里叶变换的作用算子,平体表示变换的结果(复函数),为自然对数的底,为虚数单位。
微分关系[编辑]
若函数当时的极限为0,而其导函数的傅里叶变换存在,则有,即导函数的傅里叶变换等于原函数的傅里叶变换乘以因子。更一般地,若,且存在,则,即k阶导数的傅里叶变换等于原函数的傅里叶变换乘以因子。
卷积特性[编辑]
若函数及都在上绝对可积,则卷积函数(或者)的傅里叶变换存在,且。卷积性质的逆形式为,即两个函数卷积的傅里叶逆变换等于它们各自的傅里叶逆变换的乘积乘以。
帕塞瓦尔定理[编辑]
若函数可积且平方可积,则。其中是的傅里叶变换。
更一般化而言,若函数和皆平方可积方程(Square-integrable function),则。其中中和中分别是和的傅里叶变换, 代表复共轭。
傅里叶变换的不同变种[编辑]
连续傅里叶变换[编辑]
主条目:连续傅里叶变换
一般情况下,若“傅里叶变换”一词不加任何限定语,则指的是“连续傅里叶变换”(连续函数的傅里叶变换)。连续傅里叶变换将平方可积的函数f(t)表示成复指数函数的积分或级数形式。
这是将频率域的函数F(ω)表示为时间域的函数f(t)的积分形式。
连续傅里叶变换的逆变换(inverse Fourier transform)为
即将时间域的函数f(t)表示为频率域的函数F(ω)的积分。
一般可称函数f(t)为原函数,而称函数F(ω)为傅里叶变换的像函数,原函数和像函数构成一个傅里叶变换对(transform pair)。
除此之外,还有其它型式的变换对,以下两种型式亦常被使用。在通讯或是讯号处理方面,常以来代换,而形成新的变换对:
或者是因系数重分配而得到新的变换对:
一种对连续傅里叶变换的推广称为分数傅里叶变换(Fractional Fourier Transform)。
当f(t)为偶函数(或奇函数)时,其正弦(或余弦)分量将消亡,而可以称这时的变换为余弦转换(cosine transform)或正弦转换(sine transform).
另一个值得注意的性质是,当f(t)为纯实函数时,F(−ω) = F*(ω)成立.
傅里叶级数[编辑]
主条目:傅里叶级数
连续形式的傅里叶变换其实是傅里叶级数(Fourier series)的推广,因为积分其实是一种极限形式的求和算子而已。对于周期函数,其傅里叶级数是存在的:
其中为复振幅。对于实值函数,函数的傅里叶级数可以写成:
其中an和bn是实频率分量的振幅。
傅里叶分析最初是研究周期性现象,即傅里叶级数的,后来通过傅里叶变换将其推广到了非周期性现象。理解这种推广过程的一种方式是将非周期性现象视为周期性现象的一个特例,即其周期为无限长。
离散时间傅里叶变换[编辑]
主条目:离散时间傅里叶变换
离散傅里叶变换是离散时间傅里叶变换(DTFT)的特例(有时作为后者的近似)。DTFT在时域上离散,在频域上则是周期的。DTFT可以被看作是傅里叶级数的逆转换。
离散傅里叶变换[编辑]
主条目:离散傅里叶变换
为了在科学计算和数字信号处理等领域使用计算机进行傅里叶变换,必须将函数xn定义在离散点而非连续域内,且须满足有限性或周期性条件。这种情况下,使用离散傅里叶变换,将函数xn表示为下面的求和形式:
其中是傅里叶振幅。直接使用这个公式计算的计算复杂度为,而快速傅里叶变换(FFT)可以将复杂度改进为。计算复杂度的降低以及数字电路计算能力的发展使得DFT成为在信号处理领域十分实用且重要的方法。
在阿贝尔群上的统一描述[编辑]
以上各种傅里叶变换可以被更统一的表述成任意局部紧致的阿贝尔群上的傅里叶变换。这一问题属于调和分析的范畴。在调和分析中,一个变换从一个群变换到它的对偶群(dual group)。此外,将傅里叶变换与卷积相联系的卷积定理在调和分析中也有类似的结论。傅里叶变换的广义理论基础参见庞特里亚金对偶性(Pontryagin duality)中的介绍。
时频分析变换[编辑]
主条目:时频分析变换
小波变换,chirplet转换和分数傅里叶变换试图得到时间信号的频率信息。同时解析频率和时间的能力在数学上受不确定性原理的限制。
傅里叶变换家族[编辑]
下表列出了傅里叶变换家族的成员。容易发现,函数在时(频)域的离散对应于其像函数在频(时)域的周期性.反之连续则意味着在对应域的信号的非周期性.[/ltr]
连续傅里叶变换 | 连续,非周期性 | 连续,非周期性 |
傅里叶级数 | 连续,周期性 | 离散,非周期性 |
离散时间傅里叶变换 | 离散,非周期性 | 连续,周期性 |
离散傅里叶变换 | 离散,周期性 | 离散,周期性 |
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
1.7 The Method of Averaging and the Theory of
Distributions
In the early 20th century, mathematicians and physicists noticed that for
wave problems, the Green’s functions possess strong singularities such that
the solution formulas of the type (1.3) fail to exist as classical integrals.14 In
his classic monograph
The Principles of Quantum Mechanics,
Clarendon Press, Oxford, 1930, Dirac introduced a singular object δ(t) (the
Dirac delta function), which is very useful for the description of quantum
processes and the computation of Green’s functions. In the 1940s, Laurent
Schwartz gave all these approaches a sound basis by introducing the notion
of distribution (generalized function). In order to explain Laurent Schwartz’s
basic idea of averaging, consider the continuous motion
x(t) := |t| for all t ∈ R
of a particle on the real line. We want to compute the force F(t) = mddx(t)/dtdt
acting on the particle at time t. Classically, F(t) = 0 if t = 0, and the force
does not exist at the point in time t = 0. We want to motivate that
F(t) = 2mδ(t) for all t ∈ R. (1.20)
Summarizing, we obtain the averaged
force
F(ϕ) = 2mϕ(0) for all ϕ ∈ D(R). (1.21)
In the language of distributions, we have F = 2mδ, where δ denotes the
Dirac delta distribution. A detailed study of the theory of distributions and
its applications to physics can be found in Chaps. 11 and 12 of Vol. I. In
particular, equation (1.20) is equivalent to (1.21), in the sense of distribution
theory.
In terms of experimental physics, distributions correspond to the fact
that measurement devices only measure averaged values. It turns out that
classical functions can also be regarded as distributions. However, in contrast
to classical functions, the following is true:
Distributions possess derivatives of all orders.
Therefore, the theory of distributions is the quite natural completion of the
infinitesimal strategy due to Newton and Leibniz, who lived almost three
hundred years before Laurent Schwartz. This shows convincingly that the
development of mathematics needs time.
1.8 The Symbolic Method
The point is that the symbols know a lot about the properties of the
corresponding operators, and an elegant algebraic calculus for operators
can be based on algebraic operations for the symbols.
The Fourier transformation allows us to reduce differentiation
to multiplication.
In the history of mathematics and physics, formal (also called
symbolic) methods were rigorously justified by using the following tools:
• the Fourier transformation,
• the Laplace transformation (which can be reduced to the Fourier transformation),
• Mikusi´nski’s operational calculus based on the quotient field over a convolution
algebra,
• von Neumann’s operator calculus in Hilbert space,
• the theory of distributions,
• pseudo-differential operators and distributions (e.g., the Weyl calculus in
quantum mechanics), and
• Fourier integral operators and distributions.
1.9 Gauge Theory – Local Symmetry and the
Description of Interactions by Gauge Fields
As we have discussed in Chap. 2 of Vol. I, the Standard Model in particle
physics is based on
• 12 basic particles (6 quarks and 6 leptons), and
• 12 interacting particles (the photon, the 3 vector bosons W+,W−, Z0 and
8 gluons).
This model was formulated in the 1960s and early 1970s. Note the following
crucial fact about the structure of the fundamental interactions in nature.
The fields of the interacting particles can be obtained from the fields
of the basic particles by using the principle of local symmetry (also
called the gauge principle).
(P) The Lagrangian L is invariant under local gauge transformations.
∇t := ∂/∂t+ iU(x, t), ∇x := ∂/∂x+ iA(x, t), (1.43)
α=α(x,t)
U+ := U − αt, A+ := A − αx.
∇+t := ∂/∂t+ iU+, ∇+x := ∂/∂x+ iA+. (1.44)
∇+t ψ+ = exp(iα)∇tψ, ∇+x ψ+ = exp(iα)∇xψ. (1.45)
Theorem 1.1 There holds (1.45).
This theorem tells us the crucial fact that, in contrast to the classical
partial derivatives, the covariant partial derivatives are transformed in
the same way as the field ψ itself. This property is typical for covariant
partial derivatives in mathematics. Indeed, our construction of covariant
partial derivatives has been chosen in such a way that (1.45) is valid.
Proof. By the product rule,
(∂/∂t+ iU+)ψ+ = exp(iα)(iαtψ + ψt + iU+ψ) = exp(iα)(∂/∂t+ iU)ψ.
This yields ∇+t ψ+ = exp(iα)∇tψ. Similarly, we get ∇+x ψ+ = exp(iα)∇xψ.
Now let us discuss the main idea of gauge theory:
We replace the classical partial derivatives ∂/∂t , ∂/∂x by the covariant
partial derivatives ∇t,∇x, respectively.
This is the main trick of gauge theory. In particular, we replace the
Lagrangian
L = ψ† ∂/∂tψ + ψ† ∂/∂xψ
from the original variational problem (1.35) by the modified Lagrangian
L := ψ†∇tψ + ψ†∇xψ.
Explicitly, we have
L = ψ†ψt + ψ†ψx + iψ†Uψ + iψ†Aψ.
The corresponding Euler–Lagrange equations (1.37) and (1.38) read as
∇tψ + ∇xψ = 0, (1.46)
and (∇tψ + ∇xψ)† = 0, respectively.
The local symmetry principle (P) above is closely related to the Faraday–
Green locality principle, saying that physical interactions are localized in
space and time.
Summarizing, the local symmetry principle (P) enforces the existence
of additional gauge fields U,A which interact with the originally given
field ψ.
In the Standard Model in particle physics and in the theory of general relativity,
the additional gauge fields are responsible for the interacting particles.
Consequently, the mathematical structure of the fundamental interactions
in nature is a consequence of the local symmetry principle.
In his search of a unified theory for all interactions in nature, Einstein was
not successful, since he was not aware of the importance of the principle of
local symmetry. In our discussion below, the following notions will be crucial:
• local gauge transformation,
• gauge force F,
• connection form A,
• curvature form F (gauge force form), and
• parallel transport of information.
Gauge force. Covariant partial derivatives can be used in order to introduce
the following notions:
(a) Gauge force (also called curvature): We define
iF := ∇x∇t −∇t∇x. (1.47)
In physics, the function F is called the force induced by the gauge fields
U,A. Explicitly, we get
F = Ux − At. (1.48)
Relation (1.47) tells us that:
The “gauge force” F measures the non-commutativity of the covariant
partial derivatives.
In particular, the force F vanishes if the gauge fields U,A vanish. The
proof of (1.48) follows from
∇t(∇xψ) =(∂∂t+ iU)(ψx + iAψ)= ψtx + iAtψ + iAψt + iUψx − UAψ
and
∇x(∇tψ) =(∂∂x+ iA)(ψt + iUψ)= ψxt + iUxψ + iUψx + iAψt − AUψ.
Hence (∇x∇t −∇t∇x)ψ = i(Ux − At)ψ.
The transformation of the force F with respect to the gauge transformation
ψ+(x, t) = exp(iα(x,y))ψ(x, t) is defined by
iF+ := ∇+x∇+t−∇+t∇+x .
Theorem 1.2 F+ = exp(iα)Fexp(−iα).
(b) Covariant directional derivative: Consider the curve
C : x = x(σ), t = t(σ),
where the curve parameter σ varies in the interval [0, σ0]. The classical
directional derivative along the curve C is defined by
d/dσ:= dx(σ)/dσ∂/∂x+ dt(σ)/dσ∂/∂t.
Explicitly, we get
d/dσ ψ(x(σ), t(σ)) = dx(σ)/dσ ψx(x(σ), t(σ)) + dt(σ)/dσ ψt(x(σ), t(σ)).
Similarly, the covariant directional derivative along the curve C is defined
by
D/dσ:= dx(σ)/dσ∇x + dt(σ)/dσ∇t.
Explicitly,
D/dσψ(x(σ), t(σ)) = d/dσψ(x(σ), t(σ)) + iA(x(σ)), t(σ))dx(σ)/dσ+iU(x(σ), t(σ)) dt(σ)/dσ. (1.49)
(c) Parallel transport: We say that the field function ψ is parallel along the
curve C iff
D/dσψ(x(σ), t(σ)) = 0, 0 ≤ σ ≤ σ0. (1.50)
By (1.49), this notion depends on the gauge fields U,A. In particular, if
the gauge fields U,A vanish, then parallel transport means that the field
ψ is constant along the curve C.
The following observation is crucial. It follows from the key relation (1.45)
on page 36 that the equation (1.50) of parallel transport is invariant under
local gauge transformations. This means that (1.50) implies
D+/dσψ+(x(σ), t(σ)) = 0, 0 ≤ σ ≤ σ0.
Consequently, in terms of mathematics, parallel transport possesses a geometric
meaning with respect to local symmetry transformations.
In terms of physics, parallel transport describes the transport of physical
information in space and time.
This transport is local in space and time, which reflects the Faraday–Green
locality principle.
The Cartan differential. The most elegant formulation of gauge theories
is based on the use of the covariant Cartan differential. As a preparation,
let us recall the classical Cartan calculus. We will use the following relations:
dx ∧ dt = −dt ∧ dx, dx ∧ dx = 0, dt∧
dt = 0. (1.51)
Moreover, the wedge product of three factors of the form dx, dt is always
equal to zero. For example,
dx ∧ dt ∧ dt = 0, dt∧
dx ∧ dt = 0. (1.52)
For the wedge product, both the distributive law and the associative law are
valid. Let ψ : R2 → C be a smooth function. By definition,
• dψ := ψxdx + ψtdt.
The differential 1-form
A := iAdx + iUdt (1.53)
is called the Cartan connection form. By definition,
• dA := idA ∧ dx + idU ∧ dt,
• d(ψ dx∧ dt) = dψ ∧ dx ∧ dt = 0 (Poincar´e identity).
The Poincar´e identity is a consequence of (1.52).
The covariant Cartan differential. We now replace the classical partial
derivatives by the corresponding covariant partial derivatives. Therefore,
we replace dψ by the definition
• Dψ := ∇xψ dx + ∇tψ dt.
Similarly, we define
• DA := iDA ∧ dx + iDU ∧ dt,
• D(ψ dx ∧ dt) = Dψ ∧ dx ∧ dt = 0 (Bianchi identity).
1.9 Gauge Theory and Local Symmetry 41
The Bianchi identity is a consequence of (1.52). Let us introduce the Cartan
curvature form F by setting
F := DA. (1.54)
Theorem 1.3 (i) Dψ = dψ + Aψ.
(ii) F = dA +A∧A (Cartan’s structural equation).
(iii) DF = 0 (Bianchi identity).
In addition, we have the following relations for the curvature form F:
• F = dA + [U,A]−.22
• F = iFdx ∧ dt, where iF = Ux − At.
Proof. Ad (i). Dψ = (ψx + iAψ)dx + (ψt + iUψ)dt.
Ad (ii). Note that
DA = i(Ax + A2)dx + i(At + iUA)dt,
DU = i(Ux + iAU)dx + (Ut + iU2)dt,
and
A∧A = −(Adx + Udt) ∧ (Adx + Udt) = (UA − AU) dx ∧ dt.
Hence
DA = iDA ∧ dx + iDU ∧ dt
= i(At + iUA) dt ∧ dx + i(Ux + iAU) dx ∧ dt = dA +A∧A.
This yields all the identities claimed above.
The results concerning the curvature form F above show that Cartan’s
structural equation (1.54) is nothing else than a reformulation of the equation
iF = Ux − At,
which relates the force F to the potentials U, A. Furthermore, we will show
in Sect. 5.11 on page 333 that Cartan’s structural equation is closely related
to both
• Gauss’ theorema egregium on the computation of the Gaussian curvature of
a classic su***ces by means of the metric tensor and its partial derivatives,
• and the Riemann formula for the computation of the Riemann curvature
tensor of a Riemannian manifold by means of the metric tensor and its
partial derivatives.
In the present case, the formulas can be simplified in the following way. It
follows from the commutativity property AU = UA that:
• F = dA,
• F = iF dx∧ dt = i(Ux − At) dx ∧ dt.
A similar situation appears in Maxwell’s theory of electromagnetism. For
more general gauge theories, the symbols A and U represent matrices. Then
we obtain the additional nonzero terms [U,A]− and A∧A. This is the case
in the Standard Model of elementary particles (see Vol. III).
The mathematical language of fiber bundles. In mathematics, we
proceed as follows:
• We consider the field ψ : R2 → C as a section of the line bundle R2 × C
(with typical fiber C) (see Fig. 4.9 on page 208).
• The line bundle R2×C is associated to the principal fiber bundle R2×U(1)
(with structure group U(1) called the gauge group in physics).23
• As above, the differential 1-form A := iAdx+iUdt is called the connection
form on the base manifold R2 of the principal fiber bundle R2×U(1) , and
• the differential 2-form
F = dA +A∧A
is called the curvature form on the base manifold R2 of the principle fiber
bundle R2 × U(1).
• Finally, we define
Dψ := dψ + Aψ. (1.55)
This is called the covariant differential of the section ψ of the line bundle
R2 × C.
Observe that:
The values of the gauge field functions iU, iA are contained in the Lie
algebra u(1) of the Lie group U(1). Thus, the connection form A is
a differential 1-form with values in the Lie algebra u(1).
This can be generalized by replacing
• the special commutative Lie group U(1)
• by the the general Lie group G.
Then the values of the gauge fields iU, iA are contained in the Lie algebra LG
to G. If G is a noncommutative Lie group (e.g., SU(N) with N ≥ 2), then the
additional force term A∧A does not vanish identically, as in the special case
of the commutative group U(1).24 In Vol. III on gauge theory, we will show
that the Standard Model in particle physics corresponds to this approach by
choosing the gauge group U(1) × SU(2) × SU(3). Here,
• the electroweak interaction is the curvature of a (U(1) × SU(2))-bundle
(Glashow, Salam and Weinberg in the 1960s), and
• the strong interaction is the curvature of a SU(3)-bundle (Gell-Mann and
Fritzsch in the early 1970s).
Historical remarks. General gauge theory is equivalent to modern differential
geometry. This will be thoroughly studied in Vol. III. At this point
let us only make a few historical remarks.
In 1827 Gauss proved that the curvature of a 2-dimensional su***ce
in 3-dimensional Euclidean space is an intrinsic property of the manifold.
This means that the curvature of the su***ce can be measured without using
the surrounding space. This is the content of Gauss’ theorema egregium. The
Gauss theory was generalized to higher-dimensional manifolds by Riemann
in 1854. Here, the Gaussian curvature has to be replaced by the Riemann
curvature tensor. In 1915 Einstein used this mathematical approach in order
to formulate his theory of gravitation (general theory of relativity). In
Einstein’s setting, the masses of celestial bodies (stars, planets, and so on)
determine the Riemann curvature tensor of the four-dimensional space-time
manifold which corresponds to the universe. Thus, Newton’s gravitational
force is replaced by the curvature of a four-dimensional pseudo-Riemannian
manifold M4. The motion of a celestial body (e.g., the motion of a planet
around the sun) is described by a geodesic curve C in M4. Therefore, Einstein’s
equation of motion tells us that the 4-dimensional velocity vector of
C is parallel along the curve C. Roughly speaking, this corresponds to (1.50)
where ψ has to be replaced by the velocity field of C. In the framework of
his theory of general relativity, Einstein established the principle
force = curvature
for gravitation. Nowadays, the Standard Model in particle physics is also
based on this beautiful principle which is the most profound connection between
mathematics and physics.
In 1917 Levi-Civita introduced the notion of parallel transport, and he
showed that both the Gaussian curvature of 2-dimensional su***ces and the
Riemann curvature tensor of higher-dimensional manifolds can be computed
by using parallel transport of vector fields along small closed curves. In the
1920s, ´Elie Cartan invented the method of moving frames.25 In the 1950s,
Ehresmann generalized Cartan’s method of moving frames to the modern
curvature theory for principal fiber bundles (i.e., the fibers are Lie groups)
and their associated vector bundles (i.e., the fibers are linear spaces). In 1963,
Kobayashi and Nomizu published the classic monograph
Foundations of Differential Geometry,
Vols. 1, 2, Wiley, New York. This finishes a longterm development in mathematics.
In 1954, the physicists Yang and Mills created the Yang–Mills theory. It
was their goal to generalize Maxwell’s electrodynamics. To this end, they
started with the observation that Maxwell’s electrodynamics can be formulated
as a gauge theory with the gauge group U(1). This was known from
Hermann Weyl’s paper: Elektron und Gravitation, Z. Phys. 56 (1929), 330–
352 (in German). Yang and Mills
• replaced the commutative group U(1)
• by the non-commutative group SU(2).
The group SU(2) consists of all the complex (2×2)-matrices A with AA† = I
and detA = 1. Interestingly enough, in 1954 Yang and Mills did not know
a striking physical application of their model. However, in the 1960s and
1970s, the Standard Model in particle physics was established as a modified
Yang–Mills theory with the gauge group
U(1) × SU(2) × SU(3).
The modification concerns the use of an additional field called Higgs field
in order to generate the masses of the three gauge bosons W+,W−, Z0. In
the early 1970s, Yang noticed that the Yang–Mills theory is a special case of
Ehresmann’s modern differential geometry in mathematics. For the history
of gauge theory, we refer to:
L. Brown et al. (Eds.), The Rise of the Standard Model, Cambridge University
Press, 1995.
L. O’Raifeartaigh, The Dawning of Gauge Theory, Princeton University
Press, 1997.
C. Taylor (Ed.), Gauge Theories in the Twentieth Century, World Scientific,
Singapore, 2001 (a collection of fundamental articles).
Mathematics and physics. Arthur Jaffe writes the following in his
beautiful survey article Ordering the universe: the role of mathematics in the
Notices of the American Mathematical Society 236 (1984), 589–608:26
There is an exciting development taking place right now, reunification of
mathematics with theoretical physics. . . In the last ten or fifteen years
mathematicians and physicists realized that modern geometry is in fact
the natural framework for gauge theory. The gauge potential in gauge
theory is the connection of mathematics. The gauge field is the mathematical
curvature defined by the connection; certain charges in physics are
the topological invariants studied by mathematicians. While the mathematicians
and physicists worked separately on similar ideas, they did not
duplicate each other’s efforts. The mathematicians produced general, farreaching
theories and investigated their ramifications. Physicists worked
out details of certain examples which turned out to describe nature beautifully
and elegantly. When the two met again, the results are more powerful
than either anticipated. . . In mathematics, we now have a new motivation
to use specific insights from the examples worked out by physicists. This
signals the return to an ancient tradition.
Felix Klein (1849–1925) writes about mathematics:
Our science, in contrast to others, is not founded on a single period of
human history, but has accompanied the development of culture through
all its stages. Mathematics is as much interwoven with Greek culture as
with the most modern problems in engineering. It not only lends a hand
to the progressive natural sciences but participates at the same time in
the abstract investigations of logicians and philosophers.
1.10 The Challenge of Dark Matter
Although science teachers often tell their students that the periodic table
of the elements shows what the Universe is made of, this is not true. We
now know that most of the universe – about 96% of it – is made of dark
matter that defies brief description, and certainly is not represented by
Mendeleev’s periodic table. This unseen ‘dark matter’ is the subject of
this book. . .
Dark matter provides a further remainder that we humans are not essential
to the Universe. Ever since Copernicus (1473–1543) and others suggested
that the Earth was not the center of the Universe, humans have been on
a slide away from cosmic significance. At first we were not at the center of
the Solar System, and then the Sun became just another star in the Milky
Way, not even in the center of our host Galaxy. By this stage the Earth
and its inhabitants had vanished like a speck of dust in a storm. This was
a shock.
In the 1930s Edwin Hubble showed that the Milky Way, vast as it is, is a
mere ‘island Universe’ far removed from everywhere special; and even our
home galaxy was suddenly insignificant in a sea of galaxies, then clusters
of galaxies. Now astronomers have revealed that we are not even made of
the same stuff as most of the Universe. While our planet – our bodies, even
– are tangible and visible, most of the matter in the Universe is not. Our
Universe is made of darkness. How do we respond to that?
Ken Freeman and Geoff McNamarra, 2006
This quotation is taken from the monograph by K. Freemann and G. Mc-
Namarra, In Search of Dark Matter, Springer, Berlin and Praxis Publishing
Chichester, United Kingdom, 2006 (reprinted with permission). As an introduction
to modern cosmology we recommend the monograph by S. Weinberg,
Cosmology, Oxford University, 2008.
Distributions
In the early 20th century, mathematicians and physicists noticed that for
wave problems, the Green’s functions possess strong singularities such that
the solution formulas of the type (1.3) fail to exist as classical integrals.14 In
his classic monograph
The Principles of Quantum Mechanics,
Clarendon Press, Oxford, 1930, Dirac introduced a singular object δ(t) (the
Dirac delta function), which is very useful for the description of quantum
processes and the computation of Green’s functions. In the 1940s, Laurent
Schwartz gave all these approaches a sound basis by introducing the notion
of distribution (generalized function). In order to explain Laurent Schwartz’s
basic idea of averaging, consider the continuous motion
x(t) := |t| for all t ∈ R
of a particle on the real line. We want to compute the force F(t) = mddx(t)/dtdt
acting on the particle at time t. Classically, F(t) = 0 if t = 0, and the force
does not exist at the point in time t = 0. We want to motivate that
F(t) = 2mδ(t) for all t ∈ R. (1.20)
Summarizing, we obtain the averaged
force
F(ϕ) = 2mϕ(0) for all ϕ ∈ D(R). (1.21)
In the language of distributions, we have F = 2mδ, where δ denotes the
Dirac delta distribution. A detailed study of the theory of distributions and
its applications to physics can be found in Chaps. 11 and 12 of Vol. I. In
particular, equation (1.20) is equivalent to (1.21), in the sense of distribution
theory.
In terms of experimental physics, distributions correspond to the fact
that measurement devices only measure averaged values. It turns out that
classical functions can also be regarded as distributions. However, in contrast
to classical functions, the following is true:
Distributions possess derivatives of all orders.
Therefore, the theory of distributions is the quite natural completion of the
infinitesimal strategy due to Newton and Leibniz, who lived almost three
hundred years before Laurent Schwartz. This shows convincingly that the
development of mathematics needs time.
1.8 The Symbolic Method
The point is that the symbols know a lot about the properties of the
corresponding operators, and an elegant algebraic calculus for operators
can be based on algebraic operations for the symbols.
The Fourier transformation allows us to reduce differentiation
to multiplication.
In the history of mathematics and physics, formal (also called
symbolic) methods were rigorously justified by using the following tools:
• the Fourier transformation,
• the Laplace transformation (which can be reduced to the Fourier transformation),
• Mikusi´nski’s operational calculus based on the quotient field over a convolution
algebra,
• von Neumann’s operator calculus in Hilbert space,
• the theory of distributions,
• pseudo-differential operators and distributions (e.g., the Weyl calculus in
quantum mechanics), and
• Fourier integral operators and distributions.
1.9 Gauge Theory – Local Symmetry and the
Description of Interactions by Gauge Fields
As we have discussed in Chap. 2 of Vol. I, the Standard Model in particle
physics is based on
• 12 basic particles (6 quarks and 6 leptons), and
• 12 interacting particles (the photon, the 3 vector bosons W+,W−, Z0 and
8 gluons).
This model was formulated in the 1960s and early 1970s. Note the following
crucial fact about the structure of the fundamental interactions in nature.
The fields of the interacting particles can be obtained from the fields
of the basic particles by using the principle of local symmetry (also
called the gauge principle).
(P) The Lagrangian L is invariant under local gauge transformations.
∇t := ∂/∂t+ iU(x, t), ∇x := ∂/∂x+ iA(x, t), (1.43)
α=α(x,t)
U+ := U − αt, A+ := A − αx.
∇+t := ∂/∂t+ iU+, ∇+x := ∂/∂x+ iA+. (1.44)
∇+t ψ+ = exp(iα)∇tψ, ∇+x ψ+ = exp(iα)∇xψ. (1.45)
Theorem 1.1 There holds (1.45).
This theorem tells us the crucial fact that, in contrast to the classical
partial derivatives, the covariant partial derivatives are transformed in
the same way as the field ψ itself. This property is typical for covariant
partial derivatives in mathematics. Indeed, our construction of covariant
partial derivatives has been chosen in such a way that (1.45) is valid.
Proof. By the product rule,
(∂/∂t+ iU+)ψ+ = exp(iα)(iαtψ + ψt + iU+ψ) = exp(iα)(∂/∂t+ iU)ψ.
This yields ∇+t ψ+ = exp(iα)∇tψ. Similarly, we get ∇+x ψ+ = exp(iα)∇xψ.
Now let us discuss the main idea of gauge theory:
We replace the classical partial derivatives ∂/∂t , ∂/∂x by the covariant
partial derivatives ∇t,∇x, respectively.
This is the main trick of gauge theory. In particular, we replace the
Lagrangian
L = ψ† ∂/∂tψ + ψ† ∂/∂xψ
from the original variational problem (1.35) by the modified Lagrangian
L := ψ†∇tψ + ψ†∇xψ.
Explicitly, we have
L = ψ†ψt + ψ†ψx + iψ†Uψ + iψ†Aψ.
The corresponding Euler–Lagrange equations (1.37) and (1.38) read as
∇tψ + ∇xψ = 0, (1.46)
and (∇tψ + ∇xψ)† = 0, respectively.
The local symmetry principle (P) above is closely related to the Faraday–
Green locality principle, saying that physical interactions are localized in
space and time.
Summarizing, the local symmetry principle (P) enforces the existence
of additional gauge fields U,A which interact with the originally given
field ψ.
In the Standard Model in particle physics and in the theory of general relativity,
the additional gauge fields are responsible for the interacting particles.
Consequently, the mathematical structure of the fundamental interactions
in nature is a consequence of the local symmetry principle.
In his search of a unified theory for all interactions in nature, Einstein was
not successful, since he was not aware of the importance of the principle of
local symmetry. In our discussion below, the following notions will be crucial:
• local gauge transformation,
• gauge force F,
• connection form A,
• curvature form F (gauge force form), and
• parallel transport of information.
Gauge force. Covariant partial derivatives can be used in order to introduce
the following notions:
(a) Gauge force (also called curvature): We define
iF := ∇x∇t −∇t∇x. (1.47)
In physics, the function F is called the force induced by the gauge fields
U,A. Explicitly, we get
F = Ux − At. (1.48)
Relation (1.47) tells us that:
The “gauge force” F measures the non-commutativity of the covariant
partial derivatives.
In particular, the force F vanishes if the gauge fields U,A vanish. The
proof of (1.48) follows from
∇t(∇xψ) =(∂∂t+ iU)(ψx + iAψ)= ψtx + iAtψ + iAψt + iUψx − UAψ
and
∇x(∇tψ) =(∂∂x+ iA)(ψt + iUψ)= ψxt + iUxψ + iUψx + iAψt − AUψ.
Hence (∇x∇t −∇t∇x)ψ = i(Ux − At)ψ.
The transformation of the force F with respect to the gauge transformation
ψ+(x, t) = exp(iα(x,y))ψ(x, t) is defined by
iF+ := ∇+x∇+t−∇+t∇+x .
Theorem 1.2 F+ = exp(iα)Fexp(−iα).
(b) Covariant directional derivative: Consider the curve
C : x = x(σ), t = t(σ),
where the curve parameter σ varies in the interval [0, σ0]. The classical
directional derivative along the curve C is defined by
d/dσ:= dx(σ)/dσ∂/∂x+ dt(σ)/dσ∂/∂t.
Explicitly, we get
d/dσ ψ(x(σ), t(σ)) = dx(σ)/dσ ψx(x(σ), t(σ)) + dt(σ)/dσ ψt(x(σ), t(σ)).
Similarly, the covariant directional derivative along the curve C is defined
by
D/dσ:= dx(σ)/dσ∇x + dt(σ)/dσ∇t.
Explicitly,
D/dσψ(x(σ), t(σ)) = d/dσψ(x(σ), t(σ)) + iA(x(σ)), t(σ))dx(σ)/dσ+iU(x(σ), t(σ)) dt(σ)/dσ. (1.49)
(c) Parallel transport: We say that the field function ψ is parallel along the
curve C iff
D/dσψ(x(σ), t(σ)) = 0, 0 ≤ σ ≤ σ0. (1.50)
By (1.49), this notion depends on the gauge fields U,A. In particular, if
the gauge fields U,A vanish, then parallel transport means that the field
ψ is constant along the curve C.
The following observation is crucial. It follows from the key relation (1.45)
on page 36 that the equation (1.50) of parallel transport is invariant under
local gauge transformations. This means that (1.50) implies
D+/dσψ+(x(σ), t(σ)) = 0, 0 ≤ σ ≤ σ0.
Consequently, in terms of mathematics, parallel transport possesses a geometric
meaning with respect to local symmetry transformations.
In terms of physics, parallel transport describes the transport of physical
information in space and time.
This transport is local in space and time, which reflects the Faraday–Green
locality principle.
The Cartan differential. The most elegant formulation of gauge theories
is based on the use of the covariant Cartan differential. As a preparation,
let us recall the classical Cartan calculus. We will use the following relations:
dx ∧ dt = −dt ∧ dx, dx ∧ dx = 0, dt∧
dt = 0. (1.51)
Moreover, the wedge product of three factors of the form dx, dt is always
equal to zero. For example,
dx ∧ dt ∧ dt = 0, dt∧
dx ∧ dt = 0. (1.52)
For the wedge product, both the distributive law and the associative law are
valid. Let ψ : R2 → C be a smooth function. By definition,
• dψ := ψxdx + ψtdt.
The differential 1-form
A := iAdx + iUdt (1.53)
is called the Cartan connection form. By definition,
• dA := idA ∧ dx + idU ∧ dt,
• d(ψ dx∧ dt) = dψ ∧ dx ∧ dt = 0 (Poincar´e identity).
The Poincar´e identity is a consequence of (1.52).
The covariant Cartan differential. We now replace the classical partial
derivatives by the corresponding covariant partial derivatives. Therefore,
we replace dψ by the definition
• Dψ := ∇xψ dx + ∇tψ dt.
Similarly, we define
• DA := iDA ∧ dx + iDU ∧ dt,
• D(ψ dx ∧ dt) = Dψ ∧ dx ∧ dt = 0 (Bianchi identity).
1.9 Gauge Theory and Local Symmetry 41
The Bianchi identity is a consequence of (1.52). Let us introduce the Cartan
curvature form F by setting
F := DA. (1.54)
Theorem 1.3 (i) Dψ = dψ + Aψ.
(ii) F = dA +A∧A (Cartan’s structural equation).
(iii) DF = 0 (Bianchi identity).
In addition, we have the following relations for the curvature form F:
• F = dA + [U,A]−.22
• F = iFdx ∧ dt, where iF = Ux − At.
Proof. Ad (i). Dψ = (ψx + iAψ)dx + (ψt + iUψ)dt.
Ad (ii). Note that
DA = i(Ax + A2)dx + i(At + iUA)dt,
DU = i(Ux + iAU)dx + (Ut + iU2)dt,
and
A∧A = −(Adx + Udt) ∧ (Adx + Udt) = (UA − AU) dx ∧ dt.
Hence
DA = iDA ∧ dx + iDU ∧ dt
= i(At + iUA) dt ∧ dx + i(Ux + iAU) dx ∧ dt = dA +A∧A.
This yields all the identities claimed above.
The results concerning the curvature form F above show that Cartan’s
structural equation (1.54) is nothing else than a reformulation of the equation
iF = Ux − At,
which relates the force F to the potentials U, A. Furthermore, we will show
in Sect. 5.11 on page 333 that Cartan’s structural equation is closely related
to both
• Gauss’ theorema egregium on the computation of the Gaussian curvature of
a classic su***ces by means of the metric tensor and its partial derivatives,
• and the Riemann formula for the computation of the Riemann curvature
tensor of a Riemannian manifold by means of the metric tensor and its
partial derivatives.
In the present case, the formulas can be simplified in the following way. It
follows from the commutativity property AU = UA that:
• F = dA,
• F = iF dx∧ dt = i(Ux − At) dx ∧ dt.
A similar situation appears in Maxwell’s theory of electromagnetism. For
more general gauge theories, the symbols A and U represent matrices. Then
we obtain the additional nonzero terms [U,A]− and A∧A. This is the case
in the Standard Model of elementary particles (see Vol. III).
The mathematical language of fiber bundles. In mathematics, we
proceed as follows:
• We consider the field ψ : R2 → C as a section of the line bundle R2 × C
(with typical fiber C) (see Fig. 4.9 on page 208).
• The line bundle R2×C is associated to the principal fiber bundle R2×U(1)
(with structure group U(1) called the gauge group in physics).23
• As above, the differential 1-form A := iAdx+iUdt is called the connection
form on the base manifold R2 of the principal fiber bundle R2×U(1) , and
• the differential 2-form
F = dA +A∧A
is called the curvature form on the base manifold R2 of the principle fiber
bundle R2 × U(1).
• Finally, we define
Dψ := dψ + Aψ. (1.55)
This is called the covariant differential of the section ψ of the line bundle
R2 × C.
Observe that:
The values of the gauge field functions iU, iA are contained in the Lie
algebra u(1) of the Lie group U(1). Thus, the connection form A is
a differential 1-form with values in the Lie algebra u(1).
This can be generalized by replacing
• the special commutative Lie group U(1)
• by the the general Lie group G.
Then the values of the gauge fields iU, iA are contained in the Lie algebra LG
to G. If G is a noncommutative Lie group (e.g., SU(N) with N ≥ 2), then the
additional force term A∧A does not vanish identically, as in the special case
of the commutative group U(1).24 In Vol. III on gauge theory, we will show
that the Standard Model in particle physics corresponds to this approach by
choosing the gauge group U(1) × SU(2) × SU(3). Here,
• the electroweak interaction is the curvature of a (U(1) × SU(2))-bundle
(Glashow, Salam and Weinberg in the 1960s), and
• the strong interaction is the curvature of a SU(3)-bundle (Gell-Mann and
Fritzsch in the early 1970s).
Historical remarks. General gauge theory is equivalent to modern differential
geometry. This will be thoroughly studied in Vol. III. At this point
let us only make a few historical remarks.
In 1827 Gauss proved that the curvature of a 2-dimensional su***ce
in 3-dimensional Euclidean space is an intrinsic property of the manifold.
This means that the curvature of the su***ce can be measured without using
the surrounding space. This is the content of Gauss’ theorema egregium. The
Gauss theory was generalized to higher-dimensional manifolds by Riemann
in 1854. Here, the Gaussian curvature has to be replaced by the Riemann
curvature tensor. In 1915 Einstein used this mathematical approach in order
to formulate his theory of gravitation (general theory of relativity). In
Einstein’s setting, the masses of celestial bodies (stars, planets, and so on)
determine the Riemann curvature tensor of the four-dimensional space-time
manifold which corresponds to the universe. Thus, Newton’s gravitational
force is replaced by the curvature of a four-dimensional pseudo-Riemannian
manifold M4. The motion of a celestial body (e.g., the motion of a planet
around the sun) is described by a geodesic curve C in M4. Therefore, Einstein’s
equation of motion tells us that the 4-dimensional velocity vector of
C is parallel along the curve C. Roughly speaking, this corresponds to (1.50)
where ψ has to be replaced by the velocity field of C. In the framework of
his theory of general relativity, Einstein established the principle
force = curvature
for gravitation. Nowadays, the Standard Model in particle physics is also
based on this beautiful principle which is the most profound connection between
mathematics and physics.
In 1917 Levi-Civita introduced the notion of parallel transport, and he
showed that both the Gaussian curvature of 2-dimensional su***ces and the
Riemann curvature tensor of higher-dimensional manifolds can be computed
by using parallel transport of vector fields along small closed curves. In the
1920s, ´Elie Cartan invented the method of moving frames.25 In the 1950s,
Ehresmann generalized Cartan’s method of moving frames to the modern
curvature theory for principal fiber bundles (i.e., the fibers are Lie groups)
and their associated vector bundles (i.e., the fibers are linear spaces). In 1963,
Kobayashi and Nomizu published the classic monograph
Foundations of Differential Geometry,
Vols. 1, 2, Wiley, New York. This finishes a longterm development in mathematics.
In 1954, the physicists Yang and Mills created the Yang–Mills theory. It
was their goal to generalize Maxwell’s electrodynamics. To this end, they
started with the observation that Maxwell’s electrodynamics can be formulated
as a gauge theory with the gauge group U(1). This was known from
Hermann Weyl’s paper: Elektron und Gravitation, Z. Phys. 56 (1929), 330–
352 (in German). Yang and Mills
• replaced the commutative group U(1)
• by the non-commutative group SU(2).
The group SU(2) consists of all the complex (2×2)-matrices A with AA† = I
and detA = 1. Interestingly enough, in 1954 Yang and Mills did not know
a striking physical application of their model. However, in the 1960s and
1970s, the Standard Model in particle physics was established as a modified
Yang–Mills theory with the gauge group
U(1) × SU(2) × SU(3).
The modification concerns the use of an additional field called Higgs field
in order to generate the masses of the three gauge bosons W+,W−, Z0. In
the early 1970s, Yang noticed that the Yang–Mills theory is a special case of
Ehresmann’s modern differential geometry in mathematics. For the history
of gauge theory, we refer to:
L. Brown et al. (Eds.), The Rise of the Standard Model, Cambridge University
Press, 1995.
L. O’Raifeartaigh, The Dawning of Gauge Theory, Princeton University
Press, 1997.
C. Taylor (Ed.), Gauge Theories in the Twentieth Century, World Scientific,
Singapore, 2001 (a collection of fundamental articles).
Mathematics and physics. Arthur Jaffe writes the following in his
beautiful survey article Ordering the universe: the role of mathematics in the
Notices of the American Mathematical Society 236 (1984), 589–608:26
There is an exciting development taking place right now, reunification of
mathematics with theoretical physics. . . In the last ten or fifteen years
mathematicians and physicists realized that modern geometry is in fact
the natural framework for gauge theory. The gauge potential in gauge
theory is the connection of mathematics. The gauge field is the mathematical
curvature defined by the connection; certain charges in physics are
the topological invariants studied by mathematicians. While the mathematicians
and physicists worked separately on similar ideas, they did not
duplicate each other’s efforts. The mathematicians produced general, farreaching
theories and investigated their ramifications. Physicists worked
out details of certain examples which turned out to describe nature beautifully
and elegantly. When the two met again, the results are more powerful
than either anticipated. . . In mathematics, we now have a new motivation
to use specific insights from the examples worked out by physicists. This
signals the return to an ancient tradition.
Felix Klein (1849–1925) writes about mathematics:
Our science, in contrast to others, is not founded on a single period of
human history, but has accompanied the development of culture through
all its stages. Mathematics is as much interwoven with Greek culture as
with the most modern problems in engineering. It not only lends a hand
to the progressive natural sciences but participates at the same time in
the abstract investigations of logicians and philosophers.
1.10 The Challenge of Dark Matter
Although science teachers often tell their students that the periodic table
of the elements shows what the Universe is made of, this is not true. We
now know that most of the universe – about 96% of it – is made of dark
matter that defies brief description, and certainly is not represented by
Mendeleev’s periodic table. This unseen ‘dark matter’ is the subject of
this book. . .
Dark matter provides a further remainder that we humans are not essential
to the Universe. Ever since Copernicus (1473–1543) and others suggested
that the Earth was not the center of the Universe, humans have been on
a slide away from cosmic significance. At first we were not at the center of
the Solar System, and then the Sun became just another star in the Milky
Way, not even in the center of our host Galaxy. By this stage the Earth
and its inhabitants had vanished like a speck of dust in a storm. This was
a shock.
In the 1930s Edwin Hubble showed that the Milky Way, vast as it is, is a
mere ‘island Universe’ far removed from everywhere special; and even our
home galaxy was suddenly insignificant in a sea of galaxies, then clusters
of galaxies. Now astronomers have revealed that we are not even made of
the same stuff as most of the Universe. While our planet – our bodies, even
– are tangible and visible, most of the matter in the Universe is not. Our
Universe is made of darkness. How do we respond to that?
Ken Freeman and Geoff McNamarra, 2006
This quotation is taken from the monograph by K. Freemann and G. Mc-
Namarra, In Search of Dark Matter, Springer, Berlin and Praxis Publishing
Chichester, United Kingdom, 2006 (reprinted with permission). As an introduction
to modern cosmology we recommend the monograph by S. Weinberg,
Cosmology, Oxford University, 2008.
由一星于2014-08-04, 04:21进行了最后一次编辑,总共编辑了3次
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
李群[编辑]
[ltr]
数学中,李群是具有群结构的流形或者复流形,并且群中的二元运算和逆元运算是流形中的解析映射。李群在数学分析、物理和几何中都有非常重要的作用。它以索菲斯·李命名。[/ltr]
[ltr]
李群定义[编辑][/ltr]
[ltr]
解析李群与光滑李群[编辑]
部份书籍在定义李群时假设了解析性,本条目采相同定义。另一种进路则是定义李群为实光滑(简记为)流形,并具有光滑的群二元运算与逆元运算。解析条件看似较强,实则两者等价:
定理.任意李群上具有唯一的实解析流形结构,使得群二元运算及逆元运算皆为解析映射。此时指数映射亦为解析映射。
同态和同构[编辑]
均为李群,二者之间的一个同态:为群同态并且是解析映射(事实上,可以证明这里解析的条件只需满足连续即可)。显然,两个同态的复合是同态。所有李群的类加上同态构成一个范畴。 两个李群之间存在一个双射,这个双射及其逆射均为同态,就称之为同构。
李代数[编辑]
李代数刻划了李群在单位元附近的局部性状;借助指数映射或源自李代数的叶状结构,可以将李代数的性质提升到李群的层次。
设为李群,其李代数定义为在单位元的切空间。自然具备了矢量空间结构,上的李括积定义如下:[/ltr]
[ltr]
不难验证满足李代数的抽象定义。李括积蕴含了群乘法的无穷小性质,例如:连通李群是交换群当且仅当是交换李代数。
李括积也可以用左不变矢量场及泊松括号定义,或者取定局部坐标,用群乘法映射在原点的泰勒级数定义。
李群对应李代数[编辑]
若是李群,是其子群,并带有李群结构,使得包含映射为浸入(不一定是闭的),则可得到子李代数。反之,任意子李代数透过左平移定义了上的叶状结构,取含单位元的极大积分流形,便得到满足前述条件的子群。此子群未必是闭子群,它可能是的稠密子集(考虑环面的例子)。
李代数的映射未必能提升至李群的映射,但可提升至映射,其中是的万有覆叠空间。
指数映射[编辑]
对于任意矢量,根据常微分方程式的基本理论,存在中的单参数子群使得。由此得到的映射
称为指数映射。它总是解析映射。
若为的子群,则,这是指数映射一词的缘由。
当连通且非交换时,指数映射并非同态;局部上,可以由Campbell-Baker-Hausdorff公式表成涉及括积的无穷级数。
一般域上的李群[编辑]
在任意域、环乃至于概形上,都可以定义群概形;这是概形范畴中的群对象。群概形具有深刻的几何与数论意义,然而李群未必是代数簇。
另一方面,若域对某个绝对值是完备域,其特征为零,则可照搬解析李群的定义以定义域 上的李群、李代数与指数映射。较常见的例子是;至于数论方面,特别涉及自守表示的研究上,则须用到为p进数域的情形。
李代数[编辑]
数学上,李代数是一个代数结构,主要用于研究象李群和微分流形之类的几何对象。李代数因研究无穷小变换的概念而引入。“李代数”(以索甫斯·李命名)一词是由赫尔曼·外尔在1930年代引入的。在旧文献中,无穷小群指的就是李代数。
[/ltr]
[ltr]
定义[编辑]
李代数是一个域代数,在某个域 F 上的向量空间 ,其二元运算是: ,称为李括号,符合以下条件:[/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
首两个条件蕴含
(反对称)。反之,这个反对称蕴含条件 2(F 的特征不是 2)。
用李括号表达的乘法不一定符合结合律。即 与 不一定相等。因此李代数通常并非环或结合代数。
例子[编辑]
1. 如果我们定义李括号等于,则每个向量空间自然成为一个平凡的交换李代数。
2. 欧几里得空间是一个李代数,如果选李括号为向量的叉乘。
3. 若一个结合代数给定乘法,它可以通过定义而成为李代数。这个表达式称为和的换位子。相反的,每个李代数可以嵌入到一个以这个方式从结合代数得到的李代数中。参看泛包络代数。
4. 另一个李代数的重要例子来自于微分几何:可微流形上的光滑向量场在把李导数作为李括号的时候成为一个无穷维李代数。李导数把向量场等同为作用在任何光滑标量场上的偏微分算子,这是通过令为在方向的方向导数达成的。这样,在表达式中,并列表示偏微分算子的复合。然后,李括号定义为
对于流形上的每个光滑函数。
这是流形的微分同胚集合构成的无穷维李群的李代数。
5. 李群的左不变向量场组成的向量空间在李括号这个操作下是闭的,因而是一个有限维李代数。或者,可以把属于一个李群的李代数的向量空间看成是该群的幺元的切空间。乘法就是群在幺元的微分的换位子,。
6. 作为一个具体的例子,考虑李群,所有实系数行列式为的矩阵。单位矩阵的切空间可以和所有迹为的实矩阵等同起来,其来自于李群的李代数结构和来自矩阵乘法的交换子的相同。
更多李群和它们相应的李代数,请参看李群条目。
同态,子代数,和理想[编辑]
在同样基域上的李代数和之间的一个同态是一个-线性映射,使得对于所有中的和有。这样的同态的复合也是同态,而域上的李代数,和这些态射一起,组成了一个范畴。如果一个同态是双射,它称为同构,而两个李代数和称为同构的。对于所有的应用目的,同构的李代数是相同的。
李代数的一个子代数是的一个线性子空间使得对于所有成立。则这个子代数自身是一个李代数。
李代数的理想是的一个子空间,使得对于所有和成立。所有理想都是子代数。若是的一个理想,则商空间成为一个李代数,这是通过定义为对于所有成立。理想刚好就是同态的核,而同态基本定理对于李代数是适用的。
李代数的分类[编辑]
实和复李代数可以分类到某种程度,而这个分类是李群分类的重要一步。每个有限维实或复李代数作为一个唯一的实或复单连通李群的李代数出现(Ado定理),但是可能有一个以上的群,甚至一个以上的连通群,有这个相同的李代数。例如,群 SO(3)(行列式值为1的 3×3 正交群)和SU(2) (行列式为1的 2×2 酉矩阵)有相同的李代数,就是 R3,以叉乘为李括号。
李代数是“交换的”,如果李括号为0,也就是 [x, y] = 0 对于所有 x 和 y。更一般的,一个李代数 是零幂(nilpotent)的,如果低中心序列(lower central series)
最终为 0。按照Engel定理,李代数零幂当且仅当对每个 中的 u 映射
是零幂的。更一般的,李代数 是可解的若导序列(derived series)
最终成为0。 极大可解子代数成为波莱尔子代数。
李代数 g 称为半单 如果 唯一的可解理想是平凡的。等价的, 是半单的当且仅当基灵型 K(u,v) = tr(ad(u)ad(v)) 是非退化的;这里 tr 表示迹算子。当域 F 的特征数为 0, 半单单当且仅当每个表示都是完全可约的,也就是对于每个表示的不变子空间,有一个不变的补空间(外尔定理 Weyl's theorem).
李代数是单的,如果它没有非平凡理想并且非交换。特别的有,一个单李代数是半单的,更一般的,半单李代数是单李代数的直和。
半单复李代数可通过它们的根系分类。
范畴理论定义[编辑]
使用范畴论的语言,李代数可以定义为向量空间范畴中的对象 A 和态射 使得[/ltr]
[ltr]
其中 而 σ 是复合 的循环枚举。用交换图形式:[/ltr]
[ltr]
参看[编辑][/ltr]
李导数[编辑]
[ltr]李导数(Lie derivative)是一种在流形M上的光滑函数组成的代数上的求导运算,以索甫斯·李命名。 所有李导数组成的向量空间对应于如下的李括号构成一个无限维李代数。
李导数用向量场表示,这些向量场可看作M上的流(时变微分同胚)的无穷小生成元。从另一角度看,M上的微分同胚组成的群,有其对应的李导数的李代数结构,在某种意义上和李群理论直接相关。
[/ltr]
[size][ltr]
定义[编辑]
李导数有几种等价的定义。在本节,为简便起见,我们用标量场和向量场的李导数的定义开始。李导数也可定义在一般的张量上,如后面的章节所述。
李导数的定义可以从函数的微分开始。这样,给定一个函数和一个M上的向量场X , f在点的李导数定义为
其中是f的微分。也就是,是由下式给出的[1-形式]
.
这里,是余切丛的基向量。这样,记号表示取f(在M中的点p)的微分和向量场X(在点p)的内积。
或者,可以先表明M上的光滑向量场X定义了一个M上的单参数曲线族。也就是,可以表明存在曲线在M上使得
其中对于所有M中的点p成立。这个一阶常微分方程的解的存在性由皮卡-林德洛夫定理给出(更一般的,这种曲线的存在性是弗罗贝尼乌斯定理给出)。然后可以定义李导数为
.
第三个可能的定义可以通过先定义一对向量场的李括号给出。首先注意到切空间的基向量可以写为,所以一个向量场,用一组选定的基向量可以表示为
定义李括号为
然后定义向量场Y的李导数等于X和Y的李导数,也就是,
.
根据上面任选的一个定义,其他的定义可被证明为其等价形式。 例如,可以证明,对于一个可微函数f,
并且
.
我们用在1-形式上的李导数的定义来结束本节:
.
性质[编辑]
李导数有一些属性。令为流形M上的函数组成的代数。则
是一个在代数上的导数。也就是, 是R-线性的,并且
.
类似的,它是上的一个导数,其中是M上的向量场的集合:
也可写为等价形式
其中张量积符号用于强调函数和向量场的积在整个流形上取。
另外的性质和李括号的一致。所以,例如,作为向量场的导数,
容易发现上面就是雅可比恒等式。这样,就可以得到“装备了李括号的M上的向量空间是李代数”的重要结果。
和外导数的关系、微分形式的李导数[编辑]
李导数和外导数密切相关,因此和埃里·嘉当的微分流形理论相关。 两个都试图给出导数的思想,其差别几乎只是记号上的。这个区别可以通过引入反导数或等效的内积来消除。 这之后,两者的关系就体现在一组恒等式上。
令M为一个流形,X为M上一个向量场。令为一k+1-形式。X和ω的内积为
注意
以及是-反导数。也就是,是R-线性的,并且
对于和另一个微分形式η成立。另外,对于一个函数,那是一个实或复值 的M上的函数,有
外导数和李导数的关系可以总结为以下这些。对于一般函数f,李导数就是外导数和向量场的内积:
对于一般的微分流形,李导数类似于内积,加上X的变化:
.
当ω为1-形式,上述恒等式经常写作
导数的乘积是可分配的
张量场的李导数[编辑]
在微分几何中,如果我们有一个阶可微张量场(我们可以把它当作余切丛的光滑截面和切丛的截面的线性映射 ),使得对于任何函数 有
),
而且如果进一步有一个可微向量场(也就是切丛的一个光滑截面),则线性映射
独立于联络∇;只要它是无挠率的,事实上,这个映射是一个张量。这个张量称为关于的李导数。
换句话说,如果你有一个张量场和一个由向量场给出的微分同胚的无穷小生成元,则就是在这个无穷小微分同胚下的无穷小变化。
或者,给定向向量场,令ψ为的积分曲线族,向上面那样。注意ψ是一个局部单参数局部微分同胚群。令为由ψ诱导的拉回(pullback)。则张量在点的李导数如下
.
参见[编辑]
[/ltr][/size]
本文介绍的是数学中的李群。关于其他含义,详见“李群 (消歧义)”。
[ltr]
数学中,李群是具有群结构的流形或者复流形,并且群中的二元运算和逆元运算是流形中的解析映射。李群在数学分析、物理和几何中都有非常重要的作用。它以索菲斯·李命名。[/ltr]
[ltr]
李群定义[编辑][/ltr]
- 为有限维实解析流形
- 两个解析映射,二元运算,和逆映射满足群公理,从而具有群结构。
[ltr]
解析李群与光滑李群[编辑]
部份书籍在定义李群时假设了解析性,本条目采相同定义。另一种进路则是定义李群为实光滑(简记为)流形,并具有光滑的群二元运算与逆元运算。解析条件看似较强,实则两者等价:
定理.任意李群上具有唯一的实解析流形结构,使得群二元运算及逆元运算皆为解析映射。此时指数映射亦为解析映射。
同态和同构[编辑]
均为李群,二者之间的一个同态:为群同态并且是解析映射(事实上,可以证明这里解析的条件只需满足连续即可)。显然,两个同态的复合是同态。所有李群的类加上同态构成一个范畴。 两个李群之间存在一个双射,这个双射及其逆射均为同态,就称之为同构。
李代数[编辑]
李代数刻划了李群在单位元附近的局部性状;借助指数映射或源自李代数的叶状结构,可以将李代数的性质提升到李群的层次。
设为李群,其李代数定义为在单位元的切空间。自然具备了矢量空间结构,上的李括积定义如下:[/ltr]
- 定义对自身的伴随作用为 ,。
- 取Ad对变元在单位元上的微分,得到李代数上的伴随作用,通常记为,。
- 再对变元微分,得到映射。定义李括积为。
[ltr]
不难验证满足李代数的抽象定义。李括积蕴含了群乘法的无穷小性质,例如:连通李群是交换群当且仅当是交换李代数。
李括积也可以用左不变矢量场及泊松括号定义,或者取定局部坐标,用群乘法映射在原点的泰勒级数定义。
李群对应李代数[编辑]
若是李群,是其子群,并带有李群结构,使得包含映射为浸入(不一定是闭的),则可得到子李代数。反之,任意子李代数透过左平移定义了上的叶状结构,取含单位元的极大积分流形,便得到满足前述条件的子群。此子群未必是闭子群,它可能是的稠密子集(考虑环面的例子)。
李代数的映射未必能提升至李群的映射,但可提升至映射,其中是的万有覆叠空间。
指数映射[编辑]
对于任意矢量,根据常微分方程式的基本理论,存在中的单参数子群使得。由此得到的映射
称为指数映射。它总是解析映射。
若为的子群,则,这是指数映射一词的缘由。
当连通且非交换时,指数映射并非同态;局部上,可以由Campbell-Baker-Hausdorff公式表成涉及括积的无穷级数。
一般域上的李群[编辑]
在任意域、环乃至于概形上,都可以定义群概形;这是概形范畴中的群对象。群概形具有深刻的几何与数论意义,然而李群未必是代数簇。
另一方面,若域对某个绝对值是完备域,其特征为零,则可照搬解析李群的定义以定义域 上的李群、李代数与指数映射。较常见的例子是;至于数论方面,特别涉及自守表示的研究上,则须用到为p进数域的情形。
李代数[编辑]
数学上,李代数是一个代数结构,主要用于研究象李群和微分流形之类的几何对象。李代数因研究无穷小变换的概念而引入。“李代数”(以索甫斯·李命名)一词是由赫尔曼·外尔在1930年代引入的。在旧文献中,无穷小群指的就是李代数。
[/ltr]
[ltr]
定义[编辑]
李代数是一个域代数,在某个域 F 上的向量空间 ,其二元运算是: ,称为李括号,符合以下条件:[/ltr]
- 双线性:
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
首两个条件蕴含
(反对称)。反之,这个反对称蕴含条件 2(F 的特征不是 2)。
用李括号表达的乘法不一定符合结合律。即 与 不一定相等。因此李代数通常并非环或结合代数。
例子[编辑]
1. 如果我们定义李括号等于,则每个向量空间自然成为一个平凡的交换李代数。
2. 欧几里得空间是一个李代数,如果选李括号为向量的叉乘。
3. 若一个结合代数给定乘法,它可以通过定义而成为李代数。这个表达式称为和的换位子。相反的,每个李代数可以嵌入到一个以这个方式从结合代数得到的李代数中。参看泛包络代数。
4. 另一个李代数的重要例子来自于微分几何:可微流形上的光滑向量场在把李导数作为李括号的时候成为一个无穷维李代数。李导数把向量场等同为作用在任何光滑标量场上的偏微分算子,这是通过令为在方向的方向导数达成的。这样,在表达式中,并列表示偏微分算子的复合。然后,李括号定义为
对于流形上的每个光滑函数。
这是流形的微分同胚集合构成的无穷维李群的李代数。
5. 李群的左不变向量场组成的向量空间在李括号这个操作下是闭的,因而是一个有限维李代数。或者,可以把属于一个李群的李代数的向量空间看成是该群的幺元的切空间。乘法就是群在幺元的微分的换位子,。
6. 作为一个具体的例子,考虑李群,所有实系数行列式为的矩阵。单位矩阵的切空间可以和所有迹为的实矩阵等同起来,其来自于李群的李代数结构和来自矩阵乘法的交换子的相同。
更多李群和它们相应的李代数,请参看李群条目。
同态,子代数,和理想[编辑]
在同样基域上的李代数和之间的一个同态是一个-线性映射,使得对于所有中的和有。这样的同态的复合也是同态,而域上的李代数,和这些态射一起,组成了一个范畴。如果一个同态是双射,它称为同构,而两个李代数和称为同构的。对于所有的应用目的,同构的李代数是相同的。
李代数的一个子代数是的一个线性子空间使得对于所有成立。则这个子代数自身是一个李代数。
李代数的理想是的一个子空间,使得对于所有和成立。所有理想都是子代数。若是的一个理想,则商空间成为一个李代数,这是通过定义为对于所有成立。理想刚好就是同态的核,而同态基本定理对于李代数是适用的。
李代数的分类[编辑]
实和复李代数可以分类到某种程度,而这个分类是李群分类的重要一步。每个有限维实或复李代数作为一个唯一的实或复单连通李群的李代数出现(Ado定理),但是可能有一个以上的群,甚至一个以上的连通群,有这个相同的李代数。例如,群 SO(3)(行列式值为1的 3×3 正交群)和SU(2) (行列式为1的 2×2 酉矩阵)有相同的李代数,就是 R3,以叉乘为李括号。
李代数是“交换的”,如果李括号为0,也就是 [x, y] = 0 对于所有 x 和 y。更一般的,一个李代数 是零幂(nilpotent)的,如果低中心序列(lower central series)
最终为 0。按照Engel定理,李代数零幂当且仅当对每个 中的 u 映射
是零幂的。更一般的,李代数 是可解的若导序列(derived series)
最终成为0。 极大可解子代数成为波莱尔子代数。
李代数 g 称为半单 如果 唯一的可解理想是平凡的。等价的, 是半单的当且仅当基灵型 K(u,v) = tr(ad(u)ad(v)) 是非退化的;这里 tr 表示迹算子。当域 F 的特征数为 0, 半单单当且仅当每个表示都是完全可约的,也就是对于每个表示的不变子空间,有一个不变的补空间(外尔定理 Weyl's theorem).
李代数是单的,如果它没有非平凡理想并且非交换。特别的有,一个单李代数是半单的,更一般的,半单李代数是单李代数的直和。
半单复李代数可通过它们的根系分类。
范畴理论定义[编辑]
使用范畴论的语言,李代数可以定义为向量空间范畴中的对象 A 和态射 使得[/ltr]
[ltr]
其中 而 σ 是复合 的循环枚举。用交换图形式:[/ltr]
[ltr]
参看[编辑][/ltr]
李导数[编辑]
[ltr]李导数(Lie derivative)是一种在流形M上的光滑函数组成的代数上的求导运算,以索甫斯·李命名。 所有李导数组成的向量空间对应于如下的李括号构成一个无限维李代数。
李导数用向量场表示,这些向量场可看作M上的流(时变微分同胚)的无穷小生成元。从另一角度看,M上的微分同胚组成的群,有其对应的李导数的李代数结构,在某种意义上和李群理论直接相关。
[/ltr]
[size][ltr]
定义[编辑]
李导数有几种等价的定义。在本节,为简便起见,我们用标量场和向量场的李导数的定义开始。李导数也可定义在一般的张量上,如后面的章节所述。
李导数的定义可以从函数的微分开始。这样,给定一个函数和一个M上的向量场X , f在点的李导数定义为
其中是f的微分。也就是,是由下式给出的[1-形式]
.
这里,是余切丛的基向量。这样,记号表示取f(在M中的点p)的微分和向量场X(在点p)的内积。
或者,可以先表明M上的光滑向量场X定义了一个M上的单参数曲线族。也就是,可以表明存在曲线在M上使得
其中对于所有M中的点p成立。这个一阶常微分方程的解的存在性由皮卡-林德洛夫定理给出(更一般的,这种曲线的存在性是弗罗贝尼乌斯定理给出)。然后可以定义李导数为
.
第三个可能的定义可以通过先定义一对向量场的李括号给出。首先注意到切空间的基向量可以写为,所以一个向量场,用一组选定的基向量可以表示为
定义李括号为
然后定义向量场Y的李导数等于X和Y的李导数,也就是,
.
根据上面任选的一个定义,其他的定义可被证明为其等价形式。 例如,可以证明,对于一个可微函数f,
并且
.
我们用在1-形式上的李导数的定义来结束本节:
.
性质[编辑]
李导数有一些属性。令为流形M上的函数组成的代数。则
是一个在代数上的导数。也就是, 是R-线性的,并且
.
类似的,它是上的一个导数,其中是M上的向量场的集合:
也可写为等价形式
其中张量积符号用于强调函数和向量场的积在整个流形上取。
另外的性质和李括号的一致。所以,例如,作为向量场的导数,
容易发现上面就是雅可比恒等式。这样,就可以得到“装备了李括号的M上的向量空间是李代数”的重要结果。
和外导数的关系、微分形式的李导数[编辑]
李导数和外导数密切相关,因此和埃里·嘉当的微分流形理论相关。 两个都试图给出导数的思想,其差别几乎只是记号上的。这个区别可以通过引入反导数或等效的内积来消除。 这之后,两者的关系就体现在一组恒等式上。
令M为一个流形,X为M上一个向量场。令为一k+1-形式。X和ω的内积为
注意
以及是-反导数。也就是,是R-线性的,并且
对于和另一个微分形式η成立。另外,对于一个函数,那是一个实或复值 的M上的函数,有
外导数和李导数的关系可以总结为以下这些。对于一般函数f,李导数就是外导数和向量场的内积:
对于一般的微分流形,李导数类似于内积,加上X的变化:
.
当ω为1-形式,上述恒等式经常写作
导数的乘积是可分配的
张量场的李导数[编辑]
在微分几何中,如果我们有一个阶可微张量场(我们可以把它当作余切丛的光滑截面和切丛的截面的线性映射 ),使得对于任何函数 有
),
而且如果进一步有一个可微向量场(也就是切丛的一个光滑截面),则线性映射
独立于联络∇;只要它是无挠率的,事实上,这个映射是一个张量。这个张量称为关于的李导数。
换句话说,如果你有一个张量场和一个由向量场给出的微分同胚的无穷小生成元,则就是在这个无穷小微分同胚下的无穷小变化。
或者,给定向向量场,令ψ为的积分曲线族,向上面那样。注意ψ是一个局部单参数局部微分同胚群。令为由ψ诱导的拉回(pullback)。则张量在点的李导数如下
.
参见[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
拉普拉斯变换[编辑]
[ltr]拉普拉斯变换是应用数学中常用的一种积分变换,又名拉氏转换,其符号为。拉氏变换是一个线性变换,可将一个有引数实数t(t ≥ 0)的函数转换为一个引数为复数s的函数。
有些情形下一个实变量函数在实数域中进行一些运算并不容易,但若将实变量函数作拉普拉斯变换,并在复数域中作各种运算,再将运算结果作拉普拉斯反变换来求得实数域中的相应结果,往往在计算上容易得多。拉普拉斯变换的这种运算步骤对于求解线性微分方程尤为有效,它可把微分方程化为容易求解的代数方程来处理,从而使计算简化。在经典控制理论中,对控制系统的分析和综合,都是建立在拉普拉斯变换的基础上的。引入拉普拉斯变换的一个主要优点,是可采用传递函数代替常系数微分方程来描述系统的特性。这就为采用直观和简便的图解方法来确定控制系统的整个特性、分析控制系统的运动过程,以及提供控制系统调整的可能性。
[/ltr]
[size][ltr]
基本定义[编辑]
如果定义:
[/ltr][/size]
[size][ltr]
则的拉普拉斯变换由下列式子给出:
双边拉普拉斯变换[编辑]
除了普遍使用的单边拉普拉斯变换外,双边拉普拉斯变换是将单边变换积分范围扩大为整个实数区域:
拉普拉斯逆变换[编辑]
拉普拉斯逆变换,是已知,求解的过程。用符号 表示。
拉普拉斯逆变换的公式是:
对于所有的;
是收敛区间的横坐标值,是一个实常数且直线处在的收敛域内。
拉普拉斯变换的存在性[编辑]
主条目:拉普拉斯变换的存在性
关于一个函数的拉普拉斯变换,只有在拉普拉斯积分是收敛的情况下才存在。也就是说,必须是在对于的每一个有限区间内都是间断性连续的,且当趋于无穷大的时候,是指数阶地变化。
拉普拉斯变换的基本性质[编辑]
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
,要求为真分式,即分子的最高次小于分母的最高次,否则使用多项式除法将分解
[/ltr][/size]
[size][ltr]
,要求的所有极点都在左半复平面或原点为单极点。终值定理的实用性在于它能预见到系统的长期表现,且避免部分分式展开。如果函数的极点在右半平面,那么系统的终值未定义(例如: 或 )。
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
注: 表示阶跃函数.
[/ltr][/size]
[size][ltr]
,是收敛区间的横坐标值,是一个实常数且大于所有的个别点的实部值。
变换简表[编辑]
[/ltr][/size]
[size][ltr]
与其他变换的联系[编辑]
[/ltr][/size]
[size][ltr]
令s = iω或s = 2πfi,有:
[/ltr][/size]
[size][ltr]
z 变换表达式为:
其中. 比较两者表达式有:
例子:如何应用此变换及其性质[编辑]
拉普拉斯变换在物理学和工程中是常用的;线性时不变系统的输出可以通过卷积单位脉冲响应与输入信号来计算,而在拉氏空间中执行此计算将卷积通过转换成乘法来计算。后者是更容易解决,由于它的代数形式。
拉普拉斯变换也可以用来解决微分方程,这被广泛应用于电气工程。拉普拉斯变换把线性差分方程化简为代数方程,这样就可以通过代数规则来解决。原来的微分方程可以通过施加逆拉普拉斯变换得到其解。英国电气工程师奥利弗·黑维塞第一次提出了一个类似的计划,虽然没有使用拉普拉斯变换;以及由此产生的演算被誉为黑维塞演算。
在工程学上的应用[编辑]
应用拉普拉斯变换解常变量齐次微分方程,可以将微分方程化为代数方程,使问题得以解决。在工程学上,拉普拉斯变换的重大意义在于:将一个信号从时域上,转换为复频域(s域)上来表示,对于分析系统特性,系统稳定有着重大意义;在线性系统,控制自动化上都有广泛的应用。
相关条目[编辑]
[/ltr][/size]
[ltr]拉普拉斯变换是应用数学中常用的一种积分变换,又名拉氏转换,其符号为。拉氏变换是一个线性变换,可将一个有引数实数t(t ≥ 0)的函数转换为一个引数为复数s的函数。
有些情形下一个实变量函数在实数域中进行一些运算并不容易,但若将实变量函数作拉普拉斯变换,并在复数域中作各种运算,再将运算结果作拉普拉斯反变换来求得实数域中的相应结果,往往在计算上容易得多。拉普拉斯变换的这种运算步骤对于求解线性微分方程尤为有效,它可把微分方程化为容易求解的代数方程来处理,从而使计算简化。在经典控制理论中,对控制系统的分析和综合,都是建立在拉普拉斯变换的基础上的。引入拉普拉斯变换的一个主要优点,是可采用传递函数代替常系数微分方程来描述系统的特性。这就为采用直观和简便的图解方法来确定控制系统的整个特性、分析控制系统的运动过程,以及提供控制系统调整的可能性。
[/ltr]
[size][ltr]
基本定义[编辑]
如果定义:
[/ltr][/size]
[size][ltr]
则的拉普拉斯变换由下列式子给出:
双边拉普拉斯变换[编辑]
除了普遍使用的单边拉普拉斯变换外,双边拉普拉斯变换是将单边变换积分范围扩大为整个实数区域:
拉普拉斯逆变换[编辑]
拉普拉斯逆变换,是已知,求解的过程。用符号 表示。
拉普拉斯逆变换的公式是:
对于所有的;
是收敛区间的横坐标值,是一个实常数且直线处在的收敛域内。
拉普拉斯变换的存在性[编辑]
主条目:拉普拉斯变换的存在性
关于一个函数的拉普拉斯变换,只有在拉普拉斯积分是收敛的情况下才存在。也就是说,必须是在对于的每一个有限区间内都是间断性连续的,且当趋于无穷大的时候,是指数阶地变化。
拉普拉斯变换的基本性质[编辑]
[/ltr][/size]
- 线性叠加
[size][ltr]
[/ltr][/size]
- 时域微分(单边拉普拉斯变换)
[size][ltr]
[/ltr][/size]
- s域微分
[size][ltr]
[/ltr][/size]
- s域积分
[size][ltr]
[/ltr][/size]
- 时域积分
[size][ltr]
[/ltr][/size]
[size][ltr]
,要求为真分式,即分子的最高次小于分母的最高次,否则使用多项式除法将分解
[/ltr][/size]
[size][ltr]
,要求的所有极点都在左半复平面或原点为单极点。终值定理的实用性在于它能预见到系统的长期表现,且避免部分分式展开。如果函数的极点在右半平面,那么系统的终值未定义(例如: 或 )。
[/ltr][/size]
- 域平移
[size][ltr]
[/ltr][/size]
- 时域平移
[size][ltr]
注: 表示阶跃函数.
[/ltr][/size]
[size][ltr]
,是收敛区间的横坐标值,是一个实常数且大于所有的个别点的实部值。
变换简表[编辑]
[/ltr][/size]
原函数 | 转换后函数 | 收敛区域 |
与其他变换的联系[编辑]
[/ltr][/size]
- 与傅里叶变换关系
[size][ltr]
令s = iω或s = 2πfi,有:
[/ltr][/size]
- 与z变换的联系
[size][ltr]
z 变换表达式为:
其中. 比较两者表达式有:
例子:如何应用此变换及其性质[编辑]
拉普拉斯变换在物理学和工程中是常用的;线性时不变系统的输出可以通过卷积单位脉冲响应与输入信号来计算,而在拉氏空间中执行此计算将卷积通过转换成乘法来计算。后者是更容易解决,由于它的代数形式。
拉普拉斯变换也可以用来解决微分方程,这被广泛应用于电气工程。拉普拉斯变换把线性差分方程化简为代数方程,这样就可以通过代数规则来解决。原来的微分方程可以通过施加逆拉普拉斯变换得到其解。英国电气工程师奥利弗·黑维塞第一次提出了一个类似的计划,虽然没有使用拉普拉斯变换;以及由此产生的演算被誉为黑维塞演算。
在工程学上的应用[编辑]
应用拉普拉斯变换解常变量齐次微分方程,可以将微分方程化为代数方程,使问题得以解决。在工程学上,拉普拉斯变换的重大意义在于:将一个信号从时域上,转换为复频域(s域)上来表示,对于分析系统特性,系统稳定有着重大意义;在线性系统,控制自动化上都有广泛的应用。
相关条目[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
规范场论[编辑]
[ltr]规范场论(Gauge Theory)是基于对称变换可以局部也可以全局地施行这一思想的一类物理理论。非交换对称群的规范场论最常见的例子为杨-米尔斯理论。物理系统往往用在某种变换下不变的拉格朗日量表述,当变换在每一时空点同时施行,它们有全局对称性。规范场论推广了这一思想,它要求拉格朗日量必须也有局部对称性—应该可以在时空的特定区域施行这些对称变换而不影响到另外一个区域。这个要求是广义相对论的等价原理的一个推广。
规范“对称性”反映了系统表述的一个冗余性。
规范场论在物理学上的重要性,在于其成功为量子电动力学、弱相互作用和强相互作用提供了一个统一的数学形式化架构——标准模型。这套理论精确地表述了自然界的三种基本力的实验预测,它是一个规范群为SU(3) × SU(2) × U(1)的规范场论。像弦论这样的现代理论,以及广义相对论的一些表述,都是某种意义上的规范场论。
有时,规范对称性一词被用于更广泛的含义,包括任何局部对称性,例如微分同胚。该术语的这个含义不在本条目使用。
[/ltr]
[size][ltr]
简史[编辑]
最早包含规范对称性的物理理论是麦克斯韦的电动力学。但是,该对称性的重要性在早期的表述中没有被注意到。在爱因斯坦发展广义相对论之后,赫尔曼·外尔在试图统一广义相对论和电磁学的尝试中,猜想Eichinvarianz或者说尺度(“规范”)变换下的不变性可能也是广义相对论的局部对称性。后来发现该猜想将导致某些非物理的结果。但是在量子力学发展以后,魏尔、Vladimir Fock和Fritz London实现了该思想,但作了一些修改(把缩放因子用一个复数代替,并把尺度变化变成了相位变化—一个U(1)规范对称性),这对一个相应于带电荷的量子粒子其波函数受到电磁场的影响,给定了一个漂亮的解释。这是第一个规范场论。泡利在1940年推动了该理论的传播,参看R.M. P.13, 203。
1950年代,为了解决一些基本粒子物理中的巨大混乱,杨振宁和罗伯特·米尔斯引入非交换规范场论作为理解将核子绑在原子核中的强相互作用的模型。(Ronald Shaw,和Abdus Salam一起工作,在他的博士论文中独立地引入了相同的概念。)通过推广电磁学中的规范不变性,他们试图构造基于(非交换的)SU(2)对称群在同位旋质子和中子对上的作用的理论,类似于U(1)群在量子电动力学的旋量场上的作用。在粒子物理中,重点是使用量子化规范场论。
该思想后来被发现能够用于弱相互作用的量子场论,以及它和电磁学的统一在电弱理论中。当人们意识到非交换规范场论能够导出一个称为渐近自由的特色的时候,规范场论变得更有吸引力,因为渐近自由被认为是强相互作用的一个重要特点—因而推动了寻找强相互作用的规范场论的研究。这个理论现在称为量子色动力学,是一个SU(3)群作用在夸克的色荷上的规范场论。标准模型用规范场论的语言统一了电磁力、弱相互作用和强相互作用的表述。
1970年代迈克尔·阿蒂亚爵士提出了研究经典杨-米尔斯方程的数学解的计划。1983年,Atiyah的学生Simon Donaldson 在这个工作之上证明了光滑4-流形的可微分类和它们只差一个同胚的分类非常不同。Michael Freedman采用Donaldson的工作证明伪R4的存在,也就是,欧几里得4维空间上的奇异微分结构。这导致对于规范场论本身的兴趣,独立于它在基础物理中的成功。1994年,爱德华·威滕和Nathan Seiberg发明了基于超对称的规范场技术,使得特定拓扑不变量的计算成为可能。这些从规范场论来的对数学的贡献导致了对该领域的新兴趣。
电磁学中的简单的规范对称性的例子[编辑]
电路中接地的定义是规范对称性的一个例子;当线路所有点的电位升高相同的值时,电路的行为完全不变;因为电路中的电位差不变。该事实的一个常见释例是栖息在高压电线上的鸟不会遭电击,因为鸟对地绝缘。
这称为整体规范对称性Trefil,1983。电压的绝对值不是真实的;真正影响电路的是电路组件两端的电压差。接地点的定义是任意的,但一旦该点确定了,则该定义必须全局的采用。
相反,如果某个对称性可以从一点到另一点任意的定义,它是一个局域规范对称性。
[/ltr][/size]
[size][ltr]
经典规范场论[编辑]
[/ltr][/size]
[size][ltr]
本节要求一些经典或量子场论的知识,以及拉格朗日量的使用。
本节中的定义:规范群,规范场,相互作用拉格朗日量,规范玻色子
[/ltr][/size]
[size][ltr]
一个例子:标量 O(n) 规范场论[编辑]
下面解释了局域规范不变性可以从整体对称性质启发式地“导出”,并且解释了它如何导向原来不相互作用的场之间的相互作用。
考虑一个n个无相互作用的标量场的集合,它们有相同的质量m。该系统用一个作用量表示,它是每个标量场φi的作用量之和
拉格朗日量可以简明的写作
这是通过引入一个场的矢量
现在很明显地,拉格朗日量在下面的变换中不变
只要G是一个常数 矩阵,G属于n-乘-n 正交群 O(n)。这是这个特定的拉格朗日量的全局对称性,而对称群经常称为规范群。很巧合的是,诺特定理蕴含着该变换群作用下的不变量导致如下的流的守恒
其中Ta矩阵是SO(n)群的生成元。每个生成元有一个守恒流。
现在,要求这个拉格朗日量必须有局域O(n)-不变性要求G矩阵(原来是常数)必须允许成为时空坐标x的函数。
不幸的是,G矩阵无法“传递”给导数。当G = G(x),
这意味着定义一个有如下属性的“导数”D
可以验证这样一个“导数”(称为协变导数)是
其中规范场 A(x)定义为有如下变换律的场
而g为耦合常数 - 定义一个相互作用强度的量。
规范场在一点的取值是李代数的一个元素,因此可以展开为
所以相互独立的测度场取值和李代数的生成元一样多。
最后,我们有了一个局域规范不变拉格朗日量
泡利把应用到象这样的场上的变换称为第一类规范变换,而把中的补偿变换称为第二类规范变换。
[/ltr][/size][size][ltr]
这个拉格朗日量和初始的全局规范不变的拉格朗日量的区别可以视为相互作用拉格朗日量
这个项作为要求局部规范不变性的结果而引入了n个标量场之间的相互作用。在这个经典场论的量子化版本中,规范场A(x)的量子称为规范玻色子。相互作用拉格朗日量在量子场论中的解释是标量玻色子通过交换这些规范玻色子来相互作用。
规范场的拉格朗日量[编辑]
我们关于经典规范理论的图像基本完成了,还剩协变导数D的定义,为此我们必须知道规范场 A(x) 在所有时空点的值。它可以通过一个场方程的解给出,而不是手工的设置这个场的值。进一步要求产生这个场方程的拉格朗日量也是局部规范不变的,规范场拉格朗日量的最一般的形式可以(传统地)写作
其中
而迹在场的矢量空间上取。
注意在这个拉格朗日量中,没有一个场其变换抵消的变换。该项在规范变换中的不变性是前面经典(或者说几何,如果喜欢的话)对称性的特殊情况。该对称性必须被限制以施行量子化,这个过程被称为规范固定,但是即使在限制之后,规范变换还是可能的(参看Sakurai, 高等量子力学,1-4节)。
O(n)规范场论的拉格朗日量现在成了
例子:电动力学[编辑]
作为前面章节中发展的形式化表述的简单应用,考虑电动力学的情形,只考虑电子场。产生电子场的狄拉克方程的最简单的作用(传统上)是
该系统的全局对称性是
这里的规范群是U(1),也就是场的相位角,带一个常数θ。
“局部”化这个对称性意味着用θ(x)取代θ。
一个合适的共变导数是
将“荷” e视为通常的电荷(这也是规范理论中这个术语的使用的来源),而把规范场A(x)视为电磁场的电磁四维势得到一个相互作用拉格朗日量
其中J(x)是通常的电流密度的四维矢量。规范原理因而可以视作以一种自然的方式引入了所谓的电磁场到电子场的最小耦合。
为规范场A(x)加入一个拉格朗日量,用场强张量的术语就象在电动力学中一样,可以得到在量子电动力学中作为起点的拉格朗日量。
参看:狄拉克方程,麦克斯韦方程组,量子电动力学
数学形式化[编辑]
规范理论通常用微分几何的语言讨论。数学上,一个规范就是某个流形的(局部)坐标系的一个选择。一个规范变换也就是一个坐标变换。
注意,虽然规范理论被联络的研究占据了大部分(主要是因为它主要在高能物理中研究),联络的思想一般不是规范理论的基本或者中心概念。事实上,一般规范理论的一个结果表明规范变换的仿射表示(也就是仿射模)可以分类到一种满足特定属性的节丛的截面。有些表示在每一点共变(物理学家称其为第一类规范变换),有些表示象联络形式一样变换(物理学家称其为第二类规范变换)(注意这是一种仿射表示),还有其它更一般的表示,例如BF理论中的B场。当然,我们可以考虑更一般的表示(实现),但那很复杂。但是,非线性σ模型非线性地变换,所以它们也有用处。
若我们有一个主丛P其底空间是空间或时空而结构群是一个李群,则P的截面组成一个群称为规范变换群。
我们可以在该主丛上定义一个联络(规范联络),这可以在每个相伴矢量丛上产生一个共变导数∇。若我们选择一个局部标架(截面的局部基),我们就可以用联络形式A表示这个共变导数,A是一个李代数-值的1-形式,在物理学中称为规范势,它显然不是内在的量,而是一个依赖于标架的选择的量。从这个联络形式,我们可以构造曲率形式F,这是一个李代数-值的2-形式,这是一个内在量,定义为
其中 d 代表外微分而代表楔积。
无穷小规范变换形成一个李代数,可以表述为一个光滑李代数值的标量,ε。在这样一个无穷小规范变换下,
其中是李括号。
一个有趣的结果是,若,则 其中D是共变导数
而且,,这意味着F共变地变换。
需要注意的一点是不是所有的一般规范变换都可以用无穷小规范变换生成;例如,当底流形是一个无边界的紧致流形使得从该流形到李群的映射的同伦类非平凡的时候。参看瞬子(instanton)中的例子。
杨-米尔斯作用现在可以如下给出
其中 * 代表霍奇对偶而积分和在微分几何中的定义一样。
一个规范-不变量也就是在规范变换下的不变量的例子是威尔逊环(Wilson loop),它定义在闭合路径γ上,定义如下:
其中χ是复表示ρ的特征标;而表示路径排序算子。
规范理论的量子化[编辑]
规范理论可以用能够应用到任何量子场论的方法的特殊化来量子化。但是,因为规范约束(参看上面的数学表述一节)所带来的微妙性,存在很多需要解决的理论问题,他们在其他场论中并不存在。同时,规范理论的更丰富的结构使得一些计算得以简化:例如Ward恒等式建立了不同的重正化常数的联系。
方法和目标[编辑]
第一个量子化的规范理论是量子电动力学(QED)。为此发展的最初的方法涉及规范固定和施行标准量子化。Gupta-Bleuler方法也被发展出来用于处理这个问题。非交换规范理论现在用很多不同的方法处理。量子化的方法在量子化条目有介绍。
量子化的要点在于能够计算对于理论所允许的各种进程的量子振幅。技术上,它们退化为在真空状态下的特定相关系数函数的计算。这涉及到理论的一个重正化。
当理论的巡行耦合足够小时,所有需要计算的量可以用微扰理论计算。设计用于简化这样的计算的量子化方案(例如标准量子化)可以称为微扰量子化方案。现在一些这种方法导向了规范理论的更精确的试验测试。
但是,在多数规范理论中,有很多有趣的问题是非微扰的。设计用于这些问题的量子化方案可以称为非微扰量子化方案。这样的方案的精确计算经常需要超级计算,因而目前比其他方案的发展要少。
反常[编辑]
一些理论经典的对称性在量子理论中不再成立—这个现象称为一个反常。最出名的包括:
[/ltr][/size]
[size][ltr]
在QCD中,这个反常导致了π介子衰变成为两个光子。
[/ltr][/size]
[size][ltr]
参看[编辑]
[/ltr][/size]
[ltr]规范场论(Gauge Theory)是基于对称变换可以局部也可以全局地施行这一思想的一类物理理论。非交换对称群的规范场论最常见的例子为杨-米尔斯理论。物理系统往往用在某种变换下不变的拉格朗日量表述,当变换在每一时空点同时施行,它们有全局对称性。规范场论推广了这一思想,它要求拉格朗日量必须也有局部对称性—应该可以在时空的特定区域施行这些对称变换而不影响到另外一个区域。这个要求是广义相对论的等价原理的一个推广。
规范“对称性”反映了系统表述的一个冗余性。
规范场论在物理学上的重要性,在于其成功为量子电动力学、弱相互作用和强相互作用提供了一个统一的数学形式化架构——标准模型。这套理论精确地表述了自然界的三种基本力的实验预测,它是一个规范群为SU(3) × SU(2) × U(1)的规范场论。像弦论这样的现代理论,以及广义相对论的一些表述,都是某种意义上的规范场论。
有时,规范对称性一词被用于更广泛的含义,包括任何局部对称性,例如微分同胚。该术语的这个含义不在本条目使用。
[/ltr]
[size][ltr]
简史[编辑]
最早包含规范对称性的物理理论是麦克斯韦的电动力学。但是,该对称性的重要性在早期的表述中没有被注意到。在爱因斯坦发展广义相对论之后,赫尔曼·外尔在试图统一广义相对论和电磁学的尝试中,猜想Eichinvarianz或者说尺度(“规范”)变换下的不变性可能也是广义相对论的局部对称性。后来发现该猜想将导致某些非物理的结果。但是在量子力学发展以后,魏尔、Vladimir Fock和Fritz London实现了该思想,但作了一些修改(把缩放因子用一个复数代替,并把尺度变化变成了相位变化—一个U(1)规范对称性),这对一个相应于带电荷的量子粒子其波函数受到电磁场的影响,给定了一个漂亮的解释。这是第一个规范场论。泡利在1940年推动了该理论的传播,参看R.M. P.13, 203。
1950年代,为了解决一些基本粒子物理中的巨大混乱,杨振宁和罗伯特·米尔斯引入非交换规范场论作为理解将核子绑在原子核中的强相互作用的模型。(Ronald Shaw,和Abdus Salam一起工作,在他的博士论文中独立地引入了相同的概念。)通过推广电磁学中的规范不变性,他们试图构造基于(非交换的)SU(2)对称群在同位旋质子和中子对上的作用的理论,类似于U(1)群在量子电动力学的旋量场上的作用。在粒子物理中,重点是使用量子化规范场论。
该思想后来被发现能够用于弱相互作用的量子场论,以及它和电磁学的统一在电弱理论中。当人们意识到非交换规范场论能够导出一个称为渐近自由的特色的时候,规范场论变得更有吸引力,因为渐近自由被认为是强相互作用的一个重要特点—因而推动了寻找强相互作用的规范场论的研究。这个理论现在称为量子色动力学,是一个SU(3)群作用在夸克的色荷上的规范场论。标准模型用规范场论的语言统一了电磁力、弱相互作用和强相互作用的表述。
1970年代迈克尔·阿蒂亚爵士提出了研究经典杨-米尔斯方程的数学解的计划。1983年,Atiyah的学生Simon Donaldson 在这个工作之上证明了光滑4-流形的可微分类和它们只差一个同胚的分类非常不同。Michael Freedman采用Donaldson的工作证明伪R4的存在,也就是,欧几里得4维空间上的奇异微分结构。这导致对于规范场论本身的兴趣,独立于它在基础物理中的成功。1994年,爱德华·威滕和Nathan Seiberg发明了基于超对称的规范场技术,使得特定拓扑不变量的计算成为可能。这些从规范场论来的对数学的贡献导致了对该领域的新兴趣。
电磁学中的简单的规范对称性的例子[编辑]
电路中接地的定义是规范对称性的一个例子;当线路所有点的电位升高相同的值时,电路的行为完全不变;因为电路中的电位差不变。该事实的一个常见释例是栖息在高压电线上的鸟不会遭电击,因为鸟对地绝缘。
这称为整体规范对称性Trefil,1983。电压的绝对值不是真实的;真正影响电路的是电路组件两端的电压差。接地点的定义是任意的,但一旦该点确定了,则该定义必须全局的采用。
相反,如果某个对称性可以从一点到另一点任意的定义,它是一个局域规范对称性。
[/ltr][/size]
- ^ James S. Trefil 1983年, 创造的瞬间。 Scribner, ISBN 0-684-17963-692-93页。
[size][ltr]
经典规范场论[编辑]
[/ltr][/size]
[size][ltr]
本节要求一些经典或量子场论的知识,以及拉格朗日量的使用。
本节中的定义:规范群,规范场,相互作用拉格朗日量,规范玻色子
[/ltr][/size]
[size][ltr]
一个例子:标量 O(n) 规范场论[编辑]
下面解释了局域规范不变性可以从整体对称性质启发式地“导出”,并且解释了它如何导向原来不相互作用的场之间的相互作用。
考虑一个n个无相互作用的标量场的集合,它们有相同的质量m。该系统用一个作用量表示,它是每个标量场φi的作用量之和
拉格朗日量可以简明的写作
这是通过引入一个场的矢量
现在很明显地,拉格朗日量在下面的变换中不变
只要G是一个常数 矩阵,G属于n-乘-n 正交群 O(n)。这是这个特定的拉格朗日量的全局对称性,而对称群经常称为规范群。很巧合的是,诺特定理蕴含着该变换群作用下的不变量导致如下的流的守恒
其中Ta矩阵是SO(n)群的生成元。每个生成元有一个守恒流。
现在,要求这个拉格朗日量必须有局域O(n)-不变性要求G矩阵(原来是常数)必须允许成为时空坐标x的函数。
不幸的是,G矩阵无法“传递”给导数。当G = G(x),
这意味着定义一个有如下属性的“导数”D
可以验证这样一个“导数”(称为协变导数)是
其中规范场 A(x)定义为有如下变换律的场
而g为耦合常数 - 定义一个相互作用强度的量。
规范场在一点的取值是李代数的一个元素,因此可以展开为
所以相互独立的测度场取值和李代数的生成元一样多。
最后,我们有了一个局域规范不变拉格朗日量
泡利把应用到象这样的场上的变换称为第一类规范变换,而把中的补偿变换称为第二类规范变换。
[/ltr][/size][size][ltr]
这个拉格朗日量和初始的全局规范不变的拉格朗日量的区别可以视为相互作用拉格朗日量
这个项作为要求局部规范不变性的结果而引入了n个标量场之间的相互作用。在这个经典场论的量子化版本中,规范场A(x)的量子称为规范玻色子。相互作用拉格朗日量在量子场论中的解释是标量玻色子通过交换这些规范玻色子来相互作用。
规范场的拉格朗日量[编辑]
我们关于经典规范理论的图像基本完成了,还剩协变导数D的定义,为此我们必须知道规范场 A(x) 在所有时空点的值。它可以通过一个场方程的解给出,而不是手工的设置这个场的值。进一步要求产生这个场方程的拉格朗日量也是局部规范不变的,规范场拉格朗日量的最一般的形式可以(传统地)写作
其中
而迹在场的矢量空间上取。
注意在这个拉格朗日量中,没有一个场其变换抵消的变换。该项在规范变换中的不变性是前面经典(或者说几何,如果喜欢的话)对称性的特殊情况。该对称性必须被限制以施行量子化,这个过程被称为规范固定,但是即使在限制之后,规范变换还是可能的(参看Sakurai, 高等量子力学,1-4节)。
O(n)规范场论的拉格朗日量现在成了
例子:电动力学[编辑]
作为前面章节中发展的形式化表述的简单应用,考虑电动力学的情形,只考虑电子场。产生电子场的狄拉克方程的最简单的作用(传统上)是
该系统的全局对称性是
这里的规范群是U(1),也就是场的相位角,带一个常数θ。
“局部”化这个对称性意味着用θ(x)取代θ。
一个合适的共变导数是
将“荷” e视为通常的电荷(这也是规范理论中这个术语的使用的来源),而把规范场A(x)视为电磁场的电磁四维势得到一个相互作用拉格朗日量
其中J(x)是通常的电流密度的四维矢量。规范原理因而可以视作以一种自然的方式引入了所谓的电磁场到电子场的最小耦合。
为规范场A(x)加入一个拉格朗日量,用场强张量的术语就象在电动力学中一样,可以得到在量子电动力学中作为起点的拉格朗日量。
参看:狄拉克方程,麦克斯韦方程组,量子电动力学
数学形式化[编辑]
规范理论通常用微分几何的语言讨论。数学上,一个规范就是某个流形的(局部)坐标系的一个选择。一个规范变换也就是一个坐标变换。
注意,虽然规范理论被联络的研究占据了大部分(主要是因为它主要在高能物理中研究),联络的思想一般不是规范理论的基本或者中心概念。事实上,一般规范理论的一个结果表明规范变换的仿射表示(也就是仿射模)可以分类到一种满足特定属性的节丛的截面。有些表示在每一点共变(物理学家称其为第一类规范变换),有些表示象联络形式一样变换(物理学家称其为第二类规范变换)(注意这是一种仿射表示),还有其它更一般的表示,例如BF理论中的B场。当然,我们可以考虑更一般的表示(实现),但那很复杂。但是,非线性σ模型非线性地变换,所以它们也有用处。
若我们有一个主丛P其底空间是空间或时空而结构群是一个李群,则P的截面组成一个群称为规范变换群。
我们可以在该主丛上定义一个联络(规范联络),这可以在每个相伴矢量丛上产生一个共变导数∇。若我们选择一个局部标架(截面的局部基),我们就可以用联络形式A表示这个共变导数,A是一个李代数-值的1-形式,在物理学中称为规范势,它显然不是内在的量,而是一个依赖于标架的选择的量。从这个联络形式,我们可以构造曲率形式F,这是一个李代数-值的2-形式,这是一个内在量,定义为
其中 d 代表外微分而代表楔积。
无穷小规范变换形成一个李代数,可以表述为一个光滑李代数值的标量,ε。在这样一个无穷小规范变换下,
其中是李括号。
一个有趣的结果是,若,则 其中D是共变导数
而且,,这意味着F共变地变换。
需要注意的一点是不是所有的一般规范变换都可以用无穷小规范变换生成;例如,当底流形是一个无边界的紧致流形使得从该流形到李群的映射的同伦类非平凡的时候。参看瞬子(instanton)中的例子。
杨-米尔斯作用现在可以如下给出
其中 * 代表霍奇对偶而积分和在微分几何中的定义一样。
一个规范-不变量也就是在规范变换下的不变量的例子是威尔逊环(Wilson loop),它定义在闭合路径γ上,定义如下:
其中χ是复表示ρ的特征标;而表示路径排序算子。
规范理论的量子化[编辑]
规范理论可以用能够应用到任何量子场论的方法的特殊化来量子化。但是,因为规范约束(参看上面的数学表述一节)所带来的微妙性,存在很多需要解决的理论问题,他们在其他场论中并不存在。同时,规范理论的更丰富的结构使得一些计算得以简化:例如Ward恒等式建立了不同的重正化常数的联系。
方法和目标[编辑]
第一个量子化的规范理论是量子电动力学(QED)。为此发展的最初的方法涉及规范固定和施行标准量子化。Gupta-Bleuler方法也被发展出来用于处理这个问题。非交换规范理论现在用很多不同的方法处理。量子化的方法在量子化条目有介绍。
量子化的要点在于能够计算对于理论所允许的各种进程的量子振幅。技术上,它们退化为在真空状态下的特定相关系数函数的计算。这涉及到理论的一个重正化。
当理论的巡行耦合足够小时,所有需要计算的量可以用微扰理论计算。设计用于简化这样的计算的量子化方案(例如标准量子化)可以称为微扰量子化方案。现在一些这种方法导向了规范理论的更精确的试验测试。
但是,在多数规范理论中,有很多有趣的问题是非微扰的。设计用于这些问题的量子化方案可以称为非微扰量子化方案。这样的方案的精确计算经常需要超级计算,因而目前比其他方案的发展要少。
反常[编辑]
一些理论经典的对称性在量子理论中不再成立—这个现象称为一个反常。最出名的包括:
[/ltr][/size]
- 共形反常,它导致了一个跑动耦合常数。在QED中,这导致了朗道奇点(Landau pole)。在量子色动力学(QCD)中,这导致渐近自由。
- 手征反常,出现在费米子手性或者矢量场论中。这通过瞬子的概念和拓扑有紧密的关联。
[size][ltr]
在QCD中,这个反常导致了π介子衰变成为两个光子。
[/ltr][/size]
[size][ltr]
参看[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
纤维丛[编辑]
[ltr]数学上,特别是在拓扑学中,一个纤维丛是一个局部看来像两个空间的直积的空间,但是整体可能有不同的结构。每个纤维丛对应一个连续满射
并对每个E的局部空间(需存在B的局部空间能够保持上述的满射),都存在一个F(F称作纤维空间),使得E与直积空间B × F同胚。(通常会用此满射:π : E → B来表示一个纤维丛,而忽略F )
(这里局部表示在B上局部。)一个可以整体上如此表达的丛(通过一个保持π的同胚)叫做平凡丛。丛的理论建立在如何用一些比这个直接的定义更简单的方法表达丛不是平凡丛的意义的问题之上。
纤维丛扩展了向量丛,向量丛的主要实例就是流形的切丛。他们在微分拓扑和微分几何领域有着重要的作用。他们也是规范场论的基本概念。[/ltr]
[ltr]
形式化定义[编辑]
一个纤维丛由四元组(E, B, π, F)组成,其中E, B, F是拓扑空间而π : E →B是一个 连续满射,满足下面给出的局部平凡条件。B称为丛的基空间,E称为总空间,而F称为纤维。映射π称为投影映射.下面我们假定基空间B是连通的。
我们要求对于B中的每个x,存在一个x的开邻域U,使得π−1(U)是同胚于积空间U × F的,并满足π转过去就变成到第一个因子的投影。也就是一下的图可交换:[/ltr]
[ltr]
其中proj1 : U × F → U是自然投影而φ : π−1(U) → U × F是一个同胚。所有{(Ui, φi)}的集合称为丛的局部平凡化。
对于B中每个x,原象π−1(x)和F同胚并称为x上的纤维.一个纤维丛(E, B, π,F)经常记为
以引入一个空间的短恰当序列。注意每个纤维从π : E → B都是一个开映射,因为积空间的投影是开映射。所以B有由映射π决定的商拓扑.
一个光滑纤维丛是一个在光滑流形的范畴内的纤维丛。也就是,E, B, F都必须是光滑流形而所有上面用到的函数都必须是光滑映射。这是纤维丛研究和使用的通常环境。
例子[编辑]
令E = B × F并令π : E → B为对第一个因子的投影,则E是B上的丛.这里E不仅是局部的积而且是整体的积。任何这样的纤维丛称为平凡丛.[/ltr]
[ltr]
最简单的非平凡丛的例子可能要算莫比乌斯带(Möbius strip).莫比乌斯带是一个以圆为基空间B并以线段为纤维F的丛。对于一点的邻域是一段圆弧;在图中,就是其中一个方块的长。原象在图中是个(有些扭转的)切片,4个方块宽一个方块长。同胚φ把U的原象映到柱面的一块:弯曲但不扭转.
相应的平凡丛B × F看起来像一个圆柱,但是莫比乌斯带有个整体上的扭转。注意这个扭转只有整体上才能看出来;局部看来莫比乌斯带和圆柱完全一样(在其中任何一个竖直的切一刀会产生同样的空间).
一个类似的非平凡丛是克莱因瓶,它可以看作是一个"扭转"的圆在另一个圆上的丛。相应的平凡丛是一个环, S1 × S1.
一个覆盖空间是一个以离散空间为纤维的纤维丛。
纤维丛的一个特例,叫做向量丛,是那些纤维为向量空间的丛(要成为一个向量丛,丛的结构群—见下面—必须是一个线性群)。向量丛的重要实例包括光滑流形的切丛和余切丛。
另一个纤维丛的特例叫做主丛。更多的例子参看该条目。
一个球丛是一个纤维为n维球面的纤维丛。给定一个有度量的向量丛(例如黎曼流形的切丛),可以构造一个相应的单位球丛,其在一点x的纤维是所有Ex的单位向量的集合.
截面[编辑]
主条目:截面 (纤维丛)
纤维丛的截面(section或者cross section)是一个连续映射f : B → E使得π(f(x))=x对于所有B中的x成立。因为丛通常没有全局有定义的截面,理论的一个重要作用就是检验和证明他们的存在性。这导致了代数拓扑的示性类理论。
截面经常只被局部的定义(特别是当全局截面不存在时)。纤维丛的局部截面是一个连续映射f : U → E其中U是一个B中的开集而π(f(x))=x对所有U中的x成立。若(U, φ)是一个局部平凡化图,则局部截面在 U上总是存在的。这种截面和连续映射U → F有1-1对应。截面的集合组成一个层(sheaf)。
结构群和转移函数[编辑]
纤维丛经常有一个对称群描述重叠的图之间的相容条件。特别的,令G为一个拓扑群,它连续的从左边作用在纤维空间F上。不失一般性的,我们可以要求G有效的作用在F上,以便把它看成是F的同胚群。丛的一个G-图册(E, B, π, F)是一个局部平凡化,使得对任何两个重叠的图(Ui, φi)和(Uj, φj)函数
可以这样给出:
其中是一个称为转移函数的连续映射。两个G-图集等效如果他们的并也是一个G-图集。一个G-丛是一个有G-图集等价类的纤维丛。群G成为该丛的结构群.
在光滑范畴中,一个G-丛是一个光滑纤维丛,其中G是一个李群而相应的在F上的作用是光滑的并且变换函数都是光滑映射。
转移函数tij满足以下条件[/ltr]
[ltr]
这三个条件用到重叠的三元组上叫做上链条件 (见Čech上同调).
一个主丛是一个G-丛,其纤维可以认为是G本身,并且有一个在全空间上的G的右作用保持纤维不变。
参见[编辑][/ltr]
截面 (纤维丛)[编辑]
R2 上一个向量场。切丛的一个截面是一个向量场。[size][ltr]
在数学之拓扑学领域中,拓扑空间B 上纤维丛 π: E → B 的一个截面或横截面(section 或 cross section),是一个连续映射 s : B →E,使得对 x 属于 B 有 π(s(x))=x。
[/ltr][/size]
[size][ltr]
从函数图像开始[编辑]
截面是函数图像概念的某种推广。一个函数 g : X → Y 的图像可以等价于取值为 X 与 Y 的笛卡儿积的一个函数:
一个截面是什么是一个函数图像的抽象刻划。令 π : E → X 是到第一个分量的投影:π(x,y) = x,则一个图是任何使得 π(f(x))=x 的函数。
纤维丛的语言保证了截面的概念可以推广到当 E 不必为一个笛卡儿积的情形。如果 π : E → B 是一个纤维丛,则一个截面是在每个纤维中选取一个点f(x) 。条件 π(f(x)) = x 不过意味着在点 x 处的截面必须在 x 上(见右上图)。
例如,当 E 是一个向量丛,E 的一个截面是在每一点 x ∈ B 上的向量空间 Ex中有一个元素。特别地,光滑流形 M 上一个向量场是在 M 的每一点选取一个切向量:这是 M 的切丛的一个截面。类似地,M 上一个 1-形式是余切丛的一个截面。
局部截面[编辑]
纤维丛一般不一定有如上的整体截面,从而定义局部截面也是有用的。纤维丛的一个局部截面(local section)是一个连续函数 f : U → E,其中 U 是 B的一个开集,并满足 π(f(x))=x 对所有 x ∈ U。如果 (U, φ) 是 E 的一个局部平凡化,这里 φ 是从 π-1(U) 到 U × F 一个同胚(这里 F 是纤维),在 U 上的整体截面总存在且一一对应于从 U 到 F 的连续函数。局部截面形成了 B 上一个层,称为 E 的截面层(sheaf of sections)。
一个纤维丛 E 在 U 上的连续截面有时记成 C(U,E),而 E 的整体截面通常记做 Γ(E) 或 Γ(B,E)。
截面在同伦论与代数拓扑中都有研究,其中一个主要目标是确定整体截面的存在性或不存在性。这导向了层上同调和示性类理论。例如,一个主丛有一个整体截面当且仅当它是平凡的。另一方面,一个向量丛总有一个整体截面,即零截面。但只有当它的欧拉类为零时,才有在任何地方都不为零的整体截面。关于向量场的零点可参见庞加莱-霍普夫定理。
光滑截面[编辑]
截面,特别是对主丛和向量丛,是微分几何中的重要工具。在这种情形,底空间 B 是一个光滑流形 M,而 E 总假设是 M 上一个光滑纤维丛(即 E 是一个光滑流形且投影 π: E → M 是一个光滑映射)。此时,我们考虑 E 在一个开集 U 上的光滑截面,记做 C∞(U,E)。在几何分析中,考虑具有中等正则性的截面也是有用的。例如 Ck 截面,或满足赫尔德条件或索伯列夫空间的截面。
另见[编辑]
[/ltr][/size]
[ltr]数学上,特别是在拓扑学中,一个纤维丛是一个局部看来像两个空间的直积的空间,但是整体可能有不同的结构。每个纤维丛对应一个连续满射
并对每个E的局部空间(需存在B的局部空间能够保持上述的满射),都存在一个F(F称作纤维空间),使得E与直积空间B × F同胚。(通常会用此满射:π : E → B来表示一个纤维丛,而忽略F )
(这里局部表示在B上局部。)一个可以整体上如此表达的丛(通过一个保持π的同胚)叫做平凡丛。丛的理论建立在如何用一些比这个直接的定义更简单的方法表达丛不是平凡丛的意义的问题之上。
纤维丛扩展了向量丛,向量丛的主要实例就是流形的切丛。他们在微分拓扑和微分几何领域有着重要的作用。他们也是规范场论的基本概念。[/ltr]
[ltr]
形式化定义[编辑]
一个纤维丛由四元组(E, B, π, F)组成,其中E, B, F是拓扑空间而π : E →B是一个 连续满射,满足下面给出的局部平凡条件。B称为丛的基空间,E称为总空间,而F称为纤维。映射π称为投影映射.下面我们假定基空间B是连通的。
我们要求对于B中的每个x,存在一个x的开邻域U,使得π−1(U)是同胚于积空间U × F的,并满足π转过去就变成到第一个因子的投影。也就是一下的图可交换:[/ltr]
[ltr]
其中proj1 : U × F → U是自然投影而φ : π−1(U) → U × F是一个同胚。所有{(Ui, φi)}的集合称为丛的局部平凡化。
对于B中每个x,原象π−1(x)和F同胚并称为x上的纤维.一个纤维丛(E, B, π,F)经常记为
以引入一个空间的短恰当序列。注意每个纤维从π : E → B都是一个开映射,因为积空间的投影是开映射。所以B有由映射π决定的商拓扑.
一个光滑纤维丛是一个在光滑流形的范畴内的纤维丛。也就是,E, B, F都必须是光滑流形而所有上面用到的函数都必须是光滑映射。这是纤维丛研究和使用的通常环境。
例子[编辑]
令E = B × F并令π : E → B为对第一个因子的投影,则E是B上的丛.这里E不仅是局部的积而且是整体的积。任何这样的纤维丛称为平凡丛.[/ltr]
[ltr]
最简单的非平凡丛的例子可能要算莫比乌斯带(Möbius strip).莫比乌斯带是一个以圆为基空间B并以线段为纤维F的丛。对于一点的邻域是一段圆弧;在图中,就是其中一个方块的长。原象在图中是个(有些扭转的)切片,4个方块宽一个方块长。同胚φ把U的原象映到柱面的一块:弯曲但不扭转.
相应的平凡丛B × F看起来像一个圆柱,但是莫比乌斯带有个整体上的扭转。注意这个扭转只有整体上才能看出来;局部看来莫比乌斯带和圆柱完全一样(在其中任何一个竖直的切一刀会产生同样的空间).
一个类似的非平凡丛是克莱因瓶,它可以看作是一个"扭转"的圆在另一个圆上的丛。相应的平凡丛是一个环, S1 × S1.
一个覆盖空间是一个以离散空间为纤维的纤维丛。
纤维丛的一个特例,叫做向量丛,是那些纤维为向量空间的丛(要成为一个向量丛,丛的结构群—见下面—必须是一个线性群)。向量丛的重要实例包括光滑流形的切丛和余切丛。
另一个纤维丛的特例叫做主丛。更多的例子参看该条目。
一个球丛是一个纤维为n维球面的纤维丛。给定一个有度量的向量丛(例如黎曼流形的切丛),可以构造一个相应的单位球丛,其在一点x的纤维是所有Ex的单位向量的集合.
截面[编辑]
主条目:截面 (纤维丛)
纤维丛的截面(section或者cross section)是一个连续映射f : B → E使得π(f(x))=x对于所有B中的x成立。因为丛通常没有全局有定义的截面,理论的一个重要作用就是检验和证明他们的存在性。这导致了代数拓扑的示性类理论。
截面经常只被局部的定义(特别是当全局截面不存在时)。纤维丛的局部截面是一个连续映射f : U → E其中U是一个B中的开集而π(f(x))=x对所有U中的x成立。若(U, φ)是一个局部平凡化图,则局部截面在 U上总是存在的。这种截面和连续映射U → F有1-1对应。截面的集合组成一个层(sheaf)。
结构群和转移函数[编辑]
纤维丛经常有一个对称群描述重叠的图之间的相容条件。特别的,令G为一个拓扑群,它连续的从左边作用在纤维空间F上。不失一般性的,我们可以要求G有效的作用在F上,以便把它看成是F的同胚群。丛的一个G-图册(E, B, π, F)是一个局部平凡化,使得对任何两个重叠的图(Ui, φi)和(Uj, φj)函数
可以这样给出:
其中是一个称为转移函数的连续映射。两个G-图集等效如果他们的并也是一个G-图集。一个G-丛是一个有G-图集等价类的纤维丛。群G成为该丛的结构群.
在光滑范畴中,一个G-丛是一个光滑纤维丛,其中G是一个李群而相应的在F上的作用是光滑的并且变换函数都是光滑映射。
转移函数tij满足以下条件[/ltr]
[ltr]
这三个条件用到重叠的三元组上叫做上链条件 (见Čech上同调).
一个主丛是一个G-丛,其纤维可以认为是G本身,并且有一个在全空间上的G的右作用保持纤维不变。
参见[编辑][/ltr]
截面 (纤维丛)[编辑]
R2 上一个向量场。切丛的一个截面是一个向量场。
在数学之拓扑学领域中,拓扑空间B 上纤维丛 π: E → B 的一个截面或横截面(section 或 cross section),是一个连续映射 s : B →E,使得对 x 属于 B 有 π(s(x))=x。
[/ltr][/size]
[size][ltr]
从函数图像开始[编辑]
截面是函数图像概念的某种推广。一个函数 g : X → Y 的图像可以等价于取值为 X 与 Y 的笛卡儿积的一个函数:
一个截面是什么是一个函数图像的抽象刻划。令 π : E → X 是到第一个分量的投影:π(x,y) = x,则一个图是任何使得 π(f(x))=x 的函数。
纤维丛的语言保证了截面的概念可以推广到当 E 不必为一个笛卡儿积的情形。如果 π : E → B 是一个纤维丛,则一个截面是在每个纤维中选取一个点f(x) 。条件 π(f(x)) = x 不过意味着在点 x 处的截面必须在 x 上(见右上图)。
例如,当 E 是一个向量丛,E 的一个截面是在每一点 x ∈ B 上的向量空间 Ex中有一个元素。特别地,光滑流形 M 上一个向量场是在 M 的每一点选取一个切向量:这是 M 的切丛的一个截面。类似地,M 上一个 1-形式是余切丛的一个截面。
局部截面[编辑]
纤维丛一般不一定有如上的整体截面,从而定义局部截面也是有用的。纤维丛的一个局部截面(local section)是一个连续函数 f : U → E,其中 U 是 B的一个开集,并满足 π(f(x))=x 对所有 x ∈ U。如果 (U, φ) 是 E 的一个局部平凡化,这里 φ 是从 π-1(U) 到 U × F 一个同胚(这里 F 是纤维),在 U 上的整体截面总存在且一一对应于从 U 到 F 的连续函数。局部截面形成了 B 上一个层,称为 E 的截面层(sheaf of sections)。
一个纤维丛 E 在 U 上的连续截面有时记成 C(U,E),而 E 的整体截面通常记做 Γ(E) 或 Γ(B,E)。
截面在同伦论与代数拓扑中都有研究,其中一个主要目标是确定整体截面的存在性或不存在性。这导向了层上同调和示性类理论。例如,一个主丛有一个整体截面当且仅当它是平凡的。另一方面,一个向量丛总有一个整体截面,即零截面。但只有当它的欧拉类为零时,才有在任何地方都不为零的整体截面。关于向量场的零点可参见庞加莱-霍普夫定理。
光滑截面[编辑]
截面,特别是对主丛和向量丛,是微分几何中的重要工具。在这种情形,底空间 B 是一个光滑流形 M,而 E 总假设是 M 上一个光滑纤维丛(即 E 是一个光滑流形且投影 π: E → M 是一个光滑映射)。此时,我们考虑 E 在一个开集 U 上的光滑截面,记做 C∞(U,E)。在几何分析中,考虑具有中等正则性的截面也是有用的。例如 Ck 截面,或满足赫尔德条件或索伯列夫空间的截面。
另见[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
配丛[编辑]
[ltr]在数学中,带有结构群 G(拓扑群)的纤维丛理论允许产生一个配丛(associated bundle)的操作,将丛的典型纤维由 F1 变成 F2,两者都是具有群 G 作用的拓扑空间。对具有结构群 G 的纤维丛 F,纤维在两个局部坐标系 Uα 与 Uβ 交集上的转移函数(即上链)由一个 Uα∩Uβ 上 G-值函数 gαβ给出。我们可以构造一个纤维丛 F′ 有同样的转移函数,但可能具有不同的纤维。[/ltr]
3 结构群的约化
3.1 约化的例子
4 另见
5 参考文献
[ltr]
一个例子[编辑]
一个简单的例子来自莫比乌斯带,这里 G 是 2 阶循环群 。我们可任取F 为实数线 、区间 、去掉 0 的实数线或两个点的集合 。直觉看来 G 在它们上的作用(在每种情形,非单位元素作用为 )是可比较的。可以更形式地说,把两个矩形 与 黏合在一起:我们其实需要的是将一端的 直接与自己等同,而在另一端扭转后等同。这个数据可用一个取值于 G 的补丁函数记下。配丛构造恰是观察到这个数据对 与对 是一样的。
构造[编辑]
一般地只需解释由一个具有纤维 F 作用的丛,G 作用在 F 上,变为相配的主丛(即以 G 为纤维的丛,考虑为作用在自身的平移)。然后,我们可由 F1经过主丛变为 F2。由一个开覆盖数据表述的细节由下降的一种情形给出。
这一节是这样组织的:我们首先引入从一个给定的纤维丛,产生一个具有制定的纤维的配丛的一般程序。然后是当制定的纤维是关于这个群在自身上左作用的一个主齐性空间特例,得到了配主丛。如果另外,在主丛的纤维上给出了一个右作用,我们叙述如何利用纤维积构造任何配丛 [1]。
一般配丛[编辑]
设 π : E → X 是拓扑空间 X 上一个纤维丛,带有结构群 G 及典型纤维 F。由定义,有 G 在纤维 F 上一个左作用(作为变换群)。此外假设这个作用是有效的[2]。存在 E 的一个由 X 的一个开覆盖 Ui,以及一族纤维映射
φi : π-1(Ui) → Ui × F
组成的局部平凡化,使得转移映射由 G 的元素给出。更确切地,存在连续函数 gij : (Ui ∩ Uj) → G 使得
ψij(u,f) := φi o φj-1(u,f) = (u,gij(u)f) 对每个 (u,f) ∈ (Ui ∩ Uj) × F。
现在设 F′ 是一个制定的拓扑空间,装备有 G 的一个连续左作用。则相配于E、具有纤维 F′ 的丛是一个丛 E′ 具有从属于覆盖 Ui 其转移函数为:
ψ′ij(u,f′) = (u, gij(u) f′),对 (u,f′ ) ∈(Ui ∩ Uj) × F′
这里 G-值函数 gij(u) 与由原先的丛 E 的局部平凡化得到的相同。
这个定义显然遵守转移函数的上链条件,因为在每一种情形它们由同样 G-值函数系统给出(使用另一个局部平凡化,如果有必要使用一般的加细过程,则 gij 通过相同的上边缘变换)。从而,由纤维丛构造定理(fiber bundle construction theorem),这样便产生了所要求的具有纤维 F′ 的纤维丛 E′ 。
主丛配于纤维丛[编辑]
和前面一样,假设 E 是一个具有结构群 G 的纤维丛。当 G-左自由且传递作用于 F′ 的特例时,所以 F′ 是 G 在自身上左作用的一个主齐性空间,则相配的丛 E′ 称为相配于纤维丛 E 的主 G-丛。如果此外新纤维 F′ 等同于 G(从而F′ 不仅有左作用也继承了 G 的一个右作用),则 G 在 F′ 上的右作用诱导了G 在 E′ 上的右作用。通过选取等同化,E′ 成为通常意义的主丛。注意,尽管没有典范的方式选取 G 的一个主齐性空间上的右作用,任何这样的作用将得出相同的具有结构群 G 的承载纤维丛(因为这是由 G 的左作用得到),而且作为 G-空间在存在一个整体定义的 G-值函数联系两者的意义下同构。
以这样方式,装备一个右作用的主 G-丛通常视为确定具有结构群 G 的纤维丛的数据之一部分,因为对纤维丛我们可以由配丛构造法来建构主丛。在下一节中,我们经相反的道路利用一个纤维积得到任何纤维丛。
纤维丛配于主丛[编辑]
设 π : P → X 是一个主 G-丛,令 ρ : G → Homeo(F) 是 G 在空间 F上一个连续左作用(在连续范畴中,我们需有光滑流形上一个光滑作用)。不失一般性,我们取作用是有效的(ker(ρ) = 1)。
G 在 P × F 上定义 G 的一个右作用为
然后我们将这个作用等化得到空间 E = P ×ρ F = (P × F) /G。将 (p,f) 的等价类记为 [p,f]。注意到
,对所有
由 πρ([p,f]) = π(p),定义投影映射 πρ : E → X。注意这是良定义的。
那么 πρ : E → X 是一个纤维丛,具有纤维 F 与结构群 G。转移函数由 ρ(tij) 给出,这里 tij 是主丛 P 的转移函数。
结构群的约化[编辑]
更多资料:结构群的约化
配丛的一个相伴的概念是一个 G-丛 B 的结构群的约化。我们问是否存在一个 H-丛 C,使得相配的 G-丛是 B(在同构的意义下)。更具体地,这是问 B的转移数据能否一致的取值于 H 中。换句话说,我们要求确认相配丛映射的像(这其实是一个函子)。
约化的例子[编辑]
向量丛的例子包括:引入一个度量导致结构群由一个一般线性群约化为正交群 O(n);一个实丛的复结构的存在性导致结构群由实一般线性群 GL(2n,R) 约化为复线性群 GL(n,C)。
另一个重要的情形实寻找一个秩 n 向量丛 V 的作为秩 k 与秩 n-k 子丛的惠特尼和(Whitney sum),这将导致结构群由 GL(n,R) 约化为 GL(k,R) × GL(n-k,R).
我们也能将叶状结构的条件表述为将切丛的结构群约化为分块矩阵子群——但这里约化只是必要条件,还有一个可积性条件使得弗罗贝尼乌斯定理可以使用。
另见[编辑][/ltr]
[ltr]
参考文献[编辑][/ltr]
纤维化 (数学)[编辑]
数学中,尤其是代数拓扑,一个纤维化(fibration)是一个连续映射
对任何空间满足同伦提升性质。纤维丛(在仿紧底上)构成一类重要例子。在同伦论中任何映射和纤维化“一样好”——即任何映射可以分解为到“映射道路空间”的同伦等价复合一个纤维化(参见同伦纤维)。
对 CW复形(或等价地,只用多方体 In)有同伦提升性质的纤维化称为塞尔纤维化,让-皮埃尔·塞尔在其博士论文中部分提出了这个概念。这篇论文牢固地在代数拓扑学中建立了谱序列的使用,并将纤维丛与纤维化的概念从层中清晰地分离出来(这两个概念在早期让·勒雷的处理中是不清晰的)。因为一个层(想象为一个艾达尔空间)可以视为一个局部同胚,那时候这些概念是密切相连的。
“纤维”由定义是 E 的子空间,是 B 中一个点 b 的逆像。如果底空间 B 是道路连通的,有定义可以推出 B 中两个不同点 b1 和 b2 的纤维是同伦等价的。从而我们通常就说纤维 F。纤维化不必有定义更受限的纤维丛时的局部笛卡儿乘积结构,但弱一点仍可从纤维到纤维移动。塞尔谱序列的一个主要令人满意的性质是说明了底 B 的基本群在全空间 E 的同调上的作用。
乘积空间的投影映射容易看出是一个纤维化。纤维丛有局部平凡化性质——这样的笛卡儿乘积结构在 B 上局部存在,就通常足够证明一个纤维丛是一个纤维化。更确切地,如果在 B 一个可数开覆盖上有局部平凡化,则丛是纤维化。仿紧空间上任何覆盖——比如任何度量空间,有一个棵树加细,所以任何这样空间上的纤维丛是纤维化。局部平凡化也蕴含了良定义的“纤维”的存在性(差一个同胚),至少在 B 的每个连通分支上。
[/ltr][/size]
[size][ltr]
例子[编辑]
下面纤维化的例子记作
F → E → B,
这里第一个映射是“纤维” F 到群空间的包含,第二个是到底空间 B 的纤维化映射。这也称为一个纤维化序列。
[/ltr][/size]
[size][ltr]
性质[编辑]
欧拉示性数[编辑]
主条目:欧拉示性数
对具有一定条件的纤维化欧拉示性数是可乘的。
如果 是一个纤维化,纤维为 F,底 B 是道路连通的,且纤维化在一个域 K 上可定向,则在系数 K 中的欧拉示性数满足乘积性质:[1]
这包括了特例乘积空间与覆叠空间,可用纤维化的同调塞尔谱序列证明。
对一个纤维丛,这也可用转移映射 来理解——注意这是一个提升且朝“错误的方向”—— 它与投影映射 复合的效果是乘以纤维的欧拉类:[2]
闭模型范畴中的纤维化[编辑]
拓扑空间范畴的纤维化可放入更一般的框架中,所谓闭模型范畴(closed model category)。在这样的范畴中,有一些特殊的态射,所谓的“纤维化”、上纤维化与弱等价。某些公理,比如纤维化在复合与拉回下的稳定性,任何映射可分解为一个非周期上纤维化与一个纤维化或一个上纤维化与一个非周期纤维化的复合,这里词“非周期”表示相应的箭头不是一个弱等价,以及其他一些要求允许抽象地处理同伦理论。(在原先丹尼尔·奎伦的处理中,使用“平凡”代替“非周期”。)
可以证明拓扑空间范畴确实是一个模型范畴,这里(抽象的)纤维化恰好就是上面介绍的纤维化而弱等价是同伦等价,参考 Dwyer, Spaliński (1995)。
相关条目[编辑]
[/ltr][/size]
[ltr]在数学中,带有结构群 G(拓扑群)的纤维丛理论允许产生一个配丛(associated bundle)的操作,将丛的典型纤维由 F1 变成 F2,两者都是具有群 G 作用的拓扑空间。对具有结构群 G 的纤维丛 F,纤维在两个局部坐标系 Uα 与 Uβ 交集上的转移函数(即上链)由一个 Uα∩Uβ 上 G-值函数 gαβ给出。我们可以构造一个纤维丛 F′ 有同样的转移函数,但可能具有不同的纤维。[/ltr]
[ltr]
一个例子[编辑]
一个简单的例子来自莫比乌斯带,这里 G 是 2 阶循环群 。我们可任取F 为实数线 、区间 、去掉 0 的实数线或两个点的集合 。直觉看来 G 在它们上的作用(在每种情形,非单位元素作用为 )是可比较的。可以更形式地说,把两个矩形 与 黏合在一起:我们其实需要的是将一端的 直接与自己等同,而在另一端扭转后等同。这个数据可用一个取值于 G 的补丁函数记下。配丛构造恰是观察到这个数据对 与对 是一样的。
构造[编辑]
一般地只需解释由一个具有纤维 F 作用的丛,G 作用在 F 上,变为相配的主丛(即以 G 为纤维的丛,考虑为作用在自身的平移)。然后,我们可由 F1经过主丛变为 F2。由一个开覆盖数据表述的细节由下降的一种情形给出。
这一节是这样组织的:我们首先引入从一个给定的纤维丛,产生一个具有制定的纤维的配丛的一般程序。然后是当制定的纤维是关于这个群在自身上左作用的一个主齐性空间特例,得到了配主丛。如果另外,在主丛的纤维上给出了一个右作用,我们叙述如何利用纤维积构造任何配丛 [1]。
一般配丛[编辑]
设 π : E → X 是拓扑空间 X 上一个纤维丛,带有结构群 G 及典型纤维 F。由定义,有 G 在纤维 F 上一个左作用(作为变换群)。此外假设这个作用是有效的[2]。存在 E 的一个由 X 的一个开覆盖 Ui,以及一族纤维映射
φi : π-1(Ui) → Ui × F
组成的局部平凡化,使得转移映射由 G 的元素给出。更确切地,存在连续函数 gij : (Ui ∩ Uj) → G 使得
ψij(u,f) := φi o φj-1(u,f) = (u,gij(u)f) 对每个 (u,f) ∈ (Ui ∩ Uj) × F。
现在设 F′ 是一个制定的拓扑空间,装备有 G 的一个连续左作用。则相配于E、具有纤维 F′ 的丛是一个丛 E′ 具有从属于覆盖 Ui 其转移函数为:
ψ′ij(u,f′) = (u, gij(u) f′),对 (u,f′ ) ∈(Ui ∩ Uj) × F′
这里 G-值函数 gij(u) 与由原先的丛 E 的局部平凡化得到的相同。
这个定义显然遵守转移函数的上链条件,因为在每一种情形它们由同样 G-值函数系统给出(使用另一个局部平凡化,如果有必要使用一般的加细过程,则 gij 通过相同的上边缘变换)。从而,由纤维丛构造定理(fiber bundle construction theorem),这样便产生了所要求的具有纤维 F′ 的纤维丛 E′ 。
主丛配于纤维丛[编辑]
和前面一样,假设 E 是一个具有结构群 G 的纤维丛。当 G-左自由且传递作用于 F′ 的特例时,所以 F′ 是 G 在自身上左作用的一个主齐性空间,则相配的丛 E′ 称为相配于纤维丛 E 的主 G-丛。如果此外新纤维 F′ 等同于 G(从而F′ 不仅有左作用也继承了 G 的一个右作用),则 G 在 F′ 上的右作用诱导了G 在 E′ 上的右作用。通过选取等同化,E′ 成为通常意义的主丛。注意,尽管没有典范的方式选取 G 的一个主齐性空间上的右作用,任何这样的作用将得出相同的具有结构群 G 的承载纤维丛(因为这是由 G 的左作用得到),而且作为 G-空间在存在一个整体定义的 G-值函数联系两者的意义下同构。
以这样方式,装备一个右作用的主 G-丛通常视为确定具有结构群 G 的纤维丛的数据之一部分,因为对纤维丛我们可以由配丛构造法来建构主丛。在下一节中,我们经相反的道路利用一个纤维积得到任何纤维丛。
纤维丛配于主丛[编辑]
设 π : P → X 是一个主 G-丛,令 ρ : G → Homeo(F) 是 G 在空间 F上一个连续左作用(在连续范畴中,我们需有光滑流形上一个光滑作用)。不失一般性,我们取作用是有效的(ker(ρ) = 1)。
G 在 P × F 上定义 G 的一个右作用为
然后我们将这个作用等化得到空间 E = P ×ρ F = (P × F) /G。将 (p,f) 的等价类记为 [p,f]。注意到
,对所有
由 πρ([p,f]) = π(p),定义投影映射 πρ : E → X。注意这是良定义的。
那么 πρ : E → X 是一个纤维丛,具有纤维 F 与结构群 G。转移函数由 ρ(tij) 给出,这里 tij 是主丛 P 的转移函数。
结构群的约化[编辑]
更多资料:结构群的约化
配丛的一个相伴的概念是一个 G-丛 B 的结构群的约化。我们问是否存在一个 H-丛 C,使得相配的 G-丛是 B(在同构的意义下)。更具体地,这是问 B的转移数据能否一致的取值于 H 中。换句话说,我们要求确认相配丛映射的像(这其实是一个函子)。
约化的例子[编辑]
向量丛的例子包括:引入一个度量导致结构群由一个一般线性群约化为正交群 O(n);一个实丛的复结构的存在性导致结构群由实一般线性群 GL(2n,R) 约化为复线性群 GL(n,C)。
另一个重要的情形实寻找一个秩 n 向量丛 V 的作为秩 k 与秩 n-k 子丛的惠特尼和(Whitney sum),这将导致结构群由 GL(n,R) 约化为 GL(k,R) × GL(n-k,R).
我们也能将叶状结构的条件表述为将切丛的结构群约化为分块矩阵子群——但这里约化只是必要条件,还有一个可积性条件使得弗罗贝尼乌斯定理可以使用。
另见[编辑][/ltr]
[ltr]
参考文献[编辑][/ltr]
- ^ 所有这些构造都属于埃雷斯曼 Ehresmann (1941-3);由 Steenrod (1951) p. 36 给出。
- ^ 有效性是对纤维丛的通常假设,参见 Steenrod (1951)。特别地,这个条件足够保证相配于 E 的主丛的存在性与惟一性。
- Steenrod, Norman. Topology of Fibre Bundles. Princeton University Press. 1951. ISBN 0-691-00548-6.
纤维化 (数学)[编辑]
本文介绍的是代数拓扑中的纤维化。关于用于下降理论与范畴逻辑的范畴论中的纤维化,请参看“纤维范畴”。
[size][ltr]数学中,尤其是代数拓扑,一个纤维化(fibration)是一个连续映射
对任何空间满足同伦提升性质。纤维丛(在仿紧底上)构成一类重要例子。在同伦论中任何映射和纤维化“一样好”——即任何映射可以分解为到“映射道路空间”的同伦等价复合一个纤维化(参见同伦纤维)。
对 CW复形(或等价地,只用多方体 In)有同伦提升性质的纤维化称为塞尔纤维化,让-皮埃尔·塞尔在其博士论文中部分提出了这个概念。这篇论文牢固地在代数拓扑学中建立了谱序列的使用,并将纤维丛与纤维化的概念从层中清晰地分离出来(这两个概念在早期让·勒雷的处理中是不清晰的)。因为一个层(想象为一个艾达尔空间)可以视为一个局部同胚,那时候这些概念是密切相连的。
“纤维”由定义是 E 的子空间,是 B 中一个点 b 的逆像。如果底空间 B 是道路连通的,有定义可以推出 B 中两个不同点 b1 和 b2 的纤维是同伦等价的。从而我们通常就说纤维 F。纤维化不必有定义更受限的纤维丛时的局部笛卡儿乘积结构,但弱一点仍可从纤维到纤维移动。塞尔谱序列的一个主要令人满意的性质是说明了底 B 的基本群在全空间 E 的同调上的作用。
乘积空间的投影映射容易看出是一个纤维化。纤维丛有局部平凡化性质——这样的笛卡儿乘积结构在 B 上局部存在,就通常足够证明一个纤维丛是一个纤维化。更确切地,如果在 B 一个可数开覆盖上有局部平凡化,则丛是纤维化。仿紧空间上任何覆盖——比如任何度量空间,有一个棵树加细,所以任何这样空间上的纤维丛是纤维化。局部平凡化也蕴含了良定义的“纤维”的存在性(差一个同胚),至少在 B 的每个连通分支上。
[/ltr][/size]
[size][ltr]
例子[编辑]
下面纤维化的例子记作
F → E → B,
这里第一个映射是“纤维” F 到群空间的包含,第二个是到底空间 B 的纤维化映射。这也称为一个纤维化序列。
[/ltr][/size]
- 霍普夫纤维化 S1 → S3 → S2 在历史上是纤维化最早的非平凡例子。
- 塞尔纤维化 SO(2) → SO(3) → S2 来自旋转群 SO(3) 在球面 S2 上的作用。
- 在复射影空间上,存在一个纤维化 S1 → S2n+1 → CPn。
[size][ltr]
性质[编辑]
欧拉示性数[编辑]
主条目:欧拉示性数
对具有一定条件的纤维化欧拉示性数是可乘的。
如果 是一个纤维化,纤维为 F,底 B 是道路连通的,且纤维化在一个域 K 上可定向,则在系数 K 中的欧拉示性数满足乘积性质:[1]
这包括了特例乘积空间与覆叠空间,可用纤维化的同调塞尔谱序列证明。
对一个纤维丛,这也可用转移映射 来理解——注意这是一个提升且朝“错误的方向”—— 它与投影映射 复合的效果是乘以纤维的欧拉类:[2]
闭模型范畴中的纤维化[编辑]
拓扑空间范畴的纤维化可放入更一般的框架中,所谓闭模型范畴(closed model category)。在这样的范畴中,有一些特殊的态射,所谓的“纤维化”、上纤维化与弱等价。某些公理,比如纤维化在复合与拉回下的稳定性,任何映射可分解为一个非周期上纤维化与一个纤维化或一个上纤维化与一个非周期纤维化的复合,这里词“非周期”表示相应的箭头不是一个弱等价,以及其他一些要求允许抽象地处理同伦理论。(在原先丹尼尔·奎伦的处理中,使用“平凡”代替“非周期”。)
可以证明拓扑空间范畴确实是一个模型范畴,这里(抽象的)纤维化恰好就是上面介绍的纤维化而弱等价是同伦等价,参考 Dwyer, Spaliński (1995)。
相关条目[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
2. The Basic Strategy of Extracting Finite
Information from Infinities – Ariadne’s Thread
in Renormalization Theory
There is no doubt that renormalization is one of the most sophisticated
procedures for obtaining significant numerical quantities by starting from
meaningless mathematical expressions. This is fascinating for both physicists
and mathematicians.
Alain Connes, 2003
Quantum field theory deals with fields ψ(x) that destroy and create particles
at a spacetime point x. Earlier experience with classical electron theory
provided a warning that a point electron will have infinite electromagnetic
self-mass; this mass is
ee/6πacc
for a sur face distribution of charge with radius a, and therefore blows
up for a → 0. Disappointingly this problem appeared with even greater
severity in the early days of quantum field theory, and although greatly
ameliorated by subsequent improvements in the theory, it remains with us
to the present day.
The problem of infinities in quantum field theory was apparently first
noted in the 1929–30 papers by Heisenberg and Pauli. Soon after, the
presence of infinities was confirmed by calculations of the electromagnetic
self-energy of a bound electron by Oppenheimer, and of a free electron by
Ivar Waller.
Steven Weinberg, 1995
2.1 Renormalization Theory in a Nutshell
In renormalization theory, one has to clearly distinguish between
(I) mathematical regularization of infinities, and
(II) renormalization of physical parameters by introducing effective parameters
which relate the mathematical regularization to physical measurements.
In order to help the reader to find his/her way through the jungle of renormalization
theory, let us discuss a few basic ideas. (The details will be studied later on.) This
concerns:
• Bare and effective parameters – the effective frequency and the running coupling
constant of an anharmonic oscillator.
• The renormalized Green’s function and the renormalization group.
• The zeta function, Riemann’s idea of analytic continuation, and the Casimir
effect in quantum electrodynamics.
• Meromorphic functions and Mittag-Leffler’s idea of subtractions.
• Euler’s gamma function and the dimensional regularization of integrals.
Behind this there is the general strategy of extracting finite information from infinities.
For example, we will consider the following examples:
• Regularization of divergent integrals (including the famous overlapping divergences).
• Abel’s adiabatic regularization of infinite series.
• Adiabatic regularization of oscillating integrals (the prototype of the Feynman
path integral trick).
• Poincar´e’s asymptotic series, the Landau singularity, and the Ritt theorem.
• The summation methods by Euler, Frobenius, H¨older, and Borel.
• Tikhonov regularization of ill-posed problems.
In modern renormalization theory, the combinatorics of Feynman diagrams plays
the crucial role. This is related to the mathematical notion of
• Hopf algebras and
• Rota–Baxter algebras.
The prototypes of these algebras appear in classical complex function theory in
connection with the inversion of holomorphic functions (the relation between Lagrange’s
inversion formula and the coinverse (antipode) of a Hopf algebra) and
the regularization of Laurent series (Rota-Baxter algebras). This will be studied in
Chap. 3.
2.1.1 Effective Frequency and Running Coupling Constant of an
Anharmonic Oscillator
In quantum electrodynamics, physicists distinguish between
• the bare mass of the electron, and
• its effective mass (resp. the bare charge of the electron and its effective charge).
The effective mass and the effective charge of the electron coincide with the physical
quantities measured in physical experiments. In contrast to this, the bare parameters
are only introduced as auxiliary quantities in renormalization theory for constructing
the effective quantities in terms of a mathematical algorithm. To illustrate
the crucial difference between bare parameters and effective physical parameters, let
us summarize the main features of a simple rigorous model which is studied in Sect.
11. 5 of Vol. I in full detail.
Let us now investigate the nonlinear problem of an
Anharmonic Oscillator. We have to distinguish
between two cases, namely,
• the regular non-resonance case and
• the singular resonance case which corresponds to renormalization in physics.
We will show that the trouble comes from resonances between the external
forces and the eigenoscillations.
This is the typical trouble in physics if resonance occurs.
To overcome the difficulties in the resonance case, we will use the method of
the pseudo-resolvent introduced by Erhard Schmidt in 1908.
This simple model corresponds to the (much more complicated) determination of
the electron charge, the electron mass, and the running coupling constant in quantum
electrodynamics. In particle physics, the running coupling constants of electromagnetic,
weak, and strong interaction depend on energy and momentum of the
scattering process observed in a particle accelerator.
2.1.2 The Zeta Function and Riemann’s Idea of Analytic
Continuation
Nature sees analytic continuation.
2.1.3 Meromorphic Functions and Mittag-Leffler’s Idea of
Subtractions
The method of subtractions plays a fundamental role in the renormalization of
quantum field theories according to Bogoliubov, Parasiuk, Hepp, and Zimmermann
(called BPHZ renormalization).
The basic idea is to enforce convergence by subtracting additional terms
called subtractions.
2.2 Regularization of Divergent Integrals in Baby
Renormalization Theory
In order to distinguish between convergent and divergent integrals, we will use the
method of power-counting based on a cut-off of the domain of integration. In terms
of physics, this cut-off corresponds to the introduction of an upper bound for the
admissible energies (resp. momenta). For the regularization of divergent integrals,
we will discuss the following methods used by physicists in renormalization theory:
(i) the method of differentiating parameter integrals,
(ii) the method of subtractions (including the famous overlapping divergences),
(iii) Pauli–Villars regularization,
(iv) dimensional regularization by means of Euler’s Gamma function,
(v) analytic regularization via integrals of Riemann–Liouville type, and
(vi) distribution-valued analytic functions.
Summarizing, regularization (and hence renormalization) methods produce
additional parameters which have to be determined by physical experiments.
The role of the renormalization group in physics. The change of the
normalizing momentum is governed by a transformation which is described by the
so-called renormalization group. In particular, this is crucial for studying the highenergy
behavior of quantum field theories (see Chap. 3 of Vol. I on scale changes in
physics). For example, quarks behave like free particles at very high energies. This
is the so-called asymptotic freedom of quarks.
The Method of Differentiating Parameter Integrals
Improve the convergence behavior of a parameter-depending integral by
differentiation with respect to the parameter.
Folklore
Overlapping Divergences
Overlapping divergences caused a lot of trouble in the history of renormalization
theory.
Folklore
The Role of Counterterms
The method of subtractions changes the integrands by subtracting regularizing
terms in order to enforce convergence of the integrals. Physicists try to connect the
subtraction terms with additional terms in the Lagrangian of the quantum field
theory under consideration. These additional terms of the Lagrangian are called
counterterms. It is one of the tasks of renormalization theory
• to study the structure of the necessary subtraction terms and
• to show that the subtraction terms can be generated by appropriate counterterms
of the classical Lagrangian (see Chap. 16).
The philosophy behind this approach is that the procedure of quantization adds
quantum effects to the classical field theory. These quantum effects can be described
by changing the classical Lagrangian by adding counterterms.
Historical remarks. Dimensional regularization was introduced by Gerardus
’t Hooft and Martinus Veltman in 1972. They showed that, in contrast to the 1938
Fermi model, the electroweak Standard Model in particle physics is renormalizable.
For this, ’t Hooft and Veltman were awarded the Nobel prize in physics in 1999.
Dimensional regularization is the standard method used by physicists in
modern renormalization theory
Pauli–Villars Regularization
Regularize divergent integrals by introducing additional ghost particles of
large masses.
Folklore
The Pauli–Villars regularization preserves the relativistic invariance. However, the
introduction of additional masses may destroy the gauge invariance.
Application to Algebraic Feynman Integrals in Minkowski
Space
The important mathematical problem of evaluating (algebraic) Feynman
integrals arises quite naturally in elementary particle physics when one
treats various quantities (corresponding to Feynman diagrams) in the
framework of perturbation theory.
Vladimir Smirnov, 2006
Distribution-Valued Meromorphic Functions
In quantum field theory, Green’s functions are closely related to distribution-
valued meromorphic functions.
Folklore
The following considerations will be used in the next section in order to study
Newton’s equation of motion in terms of the Fourier transform of tempered distributions.
For the convenience of the reader, our approach will be chosen in such a
way that it serves as a prototype for the study of Green’s functions in quantum
field theory later on. The essential tool is given by families of tempered distributions
which analytically depend on a complex parameter λ. This approach dates
back to a fundamental paper by Marcel Riesz in 1948。
2.3 Further Regularization Methods in Mathematics
In physical experiments, physicists measure finite numbers. In contrast to this, the
theory frequently generates infinite expressions.
The main task is to extract relevant finite information from infinite expressions.
In the history of mathematics, mathematicians frequently encountered this problem.
36 Let us discuss some important approaches.
2.3.1 Euler’s Philosophy
One of the greatest masters in mathematics, Leonhard Euler, relied on formal analytic
identities. He used the principle that the sum of a divergent series is the
value of the function from which the series is derived.
Adiabatic averaging cancels infinities.
This is the secret behind the incredible success of path integral methods in quantum
physics.
The Feynman approach to quantum field theory via path integrals is based
on formal adiabatic averages.
The basic ideas are studied in Chap. 7 of Vol. I, in terms of a finite-dimensional
rigorous setting.
Tauberian theorems. Let us consider an arbitrary series with complex
terms a0, a1, a2, . . . Then the following implications hold:
classical convergence ⇒ regularization by averaging ⇒ adiabatic regularization.
In particular, this tells us that if the series can be regularized by averaging, then
it can also be regularized in the adiabatic sense, and the regularized values are the
same. Conversely, we have the following convergence theorem:
The series is convergent if the series can be regularized in the
adiabatic sense (or by averaging in the sense of (2.73)) and we have the
relation |an| = O( 1/n ) as n → +∞.
This was proved by Hardy and Littlewood in the 1910s by generalizing a weaker
theorem proved by Tauber in 1890.39 Combining the Tauberian theorem above with
the Fej´er theorem, we get the following basic theorem:
If the 2π-periodic function f : R → C is continuous and its Fourier coefficients
satisfy the condition |an| = O( 1/n ) as n → ∞, then the Fourier
series converges to f on the real line.
2.4 Trouble in Mathematics
Formal manipulations in mathematics can lead to completely wrong results.
Folklore
2.4.1 Interchanging Limits
In the mathematics and physics literature, one frequently encounters the interchange
of limits. By considering three simple examples, we want to illustrate the
crucial fact that the formal interchange of limits can lead to wrong results.
In the mathematical literature, one proves theorems which guarantee the interchange
of limits.
Roughly speaking, one needs uniform convergence.
Ill-Posed Problems
Distinguish carefully between well-posed and ill-posed problems.
Folklore
By using the simple example (2.88) below, we want to show that an uncritical use
of the method of perturbation theory may lead to wrong results if the problem
is ill-posed. This is a possible paradigm for quantum field theory. According to
Hadamard (1865–1863), a mathematical problem is called well-posed iff it has a
unique solution which depends continuously on the data of the problem. Otherwise,
the problem is called ill-posed.
Roughly speaking, ill-posed problems refer to incomplete information.
Therefore, if we use perturbation theory in a formal manner, we can never exclude
such a singular behavior. This tells us that
The results obtained by formal perturbation theory have to be handled cautiously.
Formal perturbation theory and quantum field theory. Until now, all of
the predictions made by quantum field theory are based on the method of formal
perturbation theory. Surprisingly, for small coupling constants in quantum electrodynamics
and electroweak interaction, the theoretical predictions coincide with the
experimental results with very high accuracy. Physicists belief that this cannot happen
by chance. There remains the task to create a mathematically rigorous theory
which explains the great success of formal perturbation theory.
2.5 Mathemagics
Euler truly did not sour his life with limiting value considerations, convergence
and continuity criteria and he could not and did not wish to
bother about the logical foundation of analysis, but rather he relied – only
on occasion unsuccessfully – on his astonishing certitude of instinct and
algorithmic power.
Emil Fellmann, 1975
Seen statistically, Euler must have made a discovery every week. . . About
1911, Enestr¨om published an almost complete (from today’s viewpoint) list
of works with 866 titles. Of the 72 volumes of Euler’s Collected Works all
but three have appeared as of today. Euler’s correspondence with nearly
300 colleagues is estimated to constitute 4500 to 5000 letters, of which
perhaps a third appear to have been lost. These letters are to appear in
13 Volumes.
Euler was not only one of the greatest mathematicians, but also in general
one of the most creative human beings.
R¨udiger Thiele, 1982
In the entire history of mathematics, aside from the golden age of Greek
mathematics, there has never been a better time than that of Leonhard
Euler. It was his privilege to leave mathematics with a completely changed
face, *** it into a powerful machine that it is today.
Andreas Speiser, 1934
Pierre Cartier writes the following in his beautiful article Mathemagics, A tribute to
L. Euler and R. Feynman, S´eminaire Lotharingien 44, 1–71 from the year 2000
The implicit philosophical belief of the working mathematician is today the
Hilbert–Bourbaki formalism. Ideally, one works within a closed system: the
basic principles are clearly enunciated once for all, including (that is an
addition of twentieth century science) the formal rules of logical reasoning
clothed in mathematical form. The basic principles include precise definitions
of all mathematical objects, and the coherence between the various
branches of mathematical sciences is achieved through reduction to basic
models in the universe of sets. A very important feature of the system is its
non-contradiction; after G¨odel (1906–1978), we have lost the initial hopes
to establish the non-contradiction by a formal reasoning, but one can live
with a corresponding belief in non-contradiction. The whole structure is
certainly very appealing, but the illusion is that it is eternal, that it will
function for ever according to the same principles. What history of mathematics
teaches us is that the principles of mathematical deduction, and
not simply the mathematical theories, have evolved over the centuries. In
modern times, theories like General Topology or Lebesgue’s Integration
Theory represent an almost perfect model of precision, flexibility, and harmony,
and their applications, for instance to probability theory, have been
very successful. My thesis is:
There is another way of doing mathematics, equally successful, and the two
methods should supplement each other and not fight.
This other way bears various names: symbolic method, operational calculus,
operator theory. . . Euler was the first to use such methods in his
extensive study of infinite series, convergent as well as divergent. The calculus
of differences was developed by Boole (1815–1864) around 1860 in a
symbolic way, then Heaviside (1850–1925) created his own symbolic calculus
to deal with systems of differential equations in electric circuits. But
the modern master was Feynman (1918–1988) who used his diagrams, his
disentangling of operators, his path integrals. . .
The method consists in stretching the formulas to their extreme
consequences, resorting to some internal feeling of coherence and
harmony.
There are obviously pitfalls in such methods, and only experience can tell
you that for the Dirac delta function an expression like xδ(x) or δ'(x) is
lawful, but not δ(x)/x or δ(x)2. Very often, these so-called symbolic methods
have been substantiated by later rigorous developments, for instance,
the Schwartz distribution theory gives a rigorous meaning to δ(x), but
physicists used sophisticated formulas in “momentum space” long before
Laurent Schwartz codified the Fourier transformation for distributions.
The Feynman “sums over histories” have been immensely successful in
many problems, coming from physics as well from mathematics, despite
the lack of a comprehensive rigorous theory.
Newton (1643–1727), Leibniz (1646–1716), Euler (1707–1783) and their successors
very successfully used infinitesimals, that is, quantities with the strange property
dx =/= 0 and (dx) (dx)= 0. (2.90)
Such quantities are still frequently used in the physics literature. Obviously, classical
numbers dx do not have the property (2.90). Based on the notion of ultra-filters,
we will show in Sect. 4.6 how to introduce rigorously such infinitesimals as equivalence
classes of real numbers. We will embed this into the discussion of a general
mathematical strategy called the strategy of equivalence classes
Information from Infinities – Ariadne’s Thread
in Renormalization Theory
There is no doubt that renormalization is one of the most sophisticated
procedures for obtaining significant numerical quantities by starting from
meaningless mathematical expressions. This is fascinating for both physicists
and mathematicians.
Alain Connes, 2003
Quantum field theory deals with fields ψ(x) that destroy and create particles
at a spacetime point x. Earlier experience with classical electron theory
provided a warning that a point electron will have infinite electromagnetic
self-mass; this mass is
ee/6πacc
for a sur face distribution of charge with radius a, and therefore blows
up for a → 0. Disappointingly this problem appeared with even greater
severity in the early days of quantum field theory, and although greatly
ameliorated by subsequent improvements in the theory, it remains with us
to the present day.
The problem of infinities in quantum field theory was apparently first
noted in the 1929–30 papers by Heisenberg and Pauli. Soon after, the
presence of infinities was confirmed by calculations of the electromagnetic
self-energy of a bound electron by Oppenheimer, and of a free electron by
Ivar Waller.
Steven Weinberg, 1995
2.1 Renormalization Theory in a Nutshell
In renormalization theory, one has to clearly distinguish between
(I) mathematical regularization of infinities, and
(II) renormalization of physical parameters by introducing effective parameters
which relate the mathematical regularization to physical measurements.
In order to help the reader to find his/her way through the jungle of renormalization
theory, let us discuss a few basic ideas. (The details will be studied later on.) This
concerns:
• Bare and effective parameters – the effective frequency and the running coupling
constant of an anharmonic oscillator.
• The renormalized Green’s function and the renormalization group.
• The zeta function, Riemann’s idea of analytic continuation, and the Casimir
effect in quantum electrodynamics.
• Meromorphic functions and Mittag-Leffler’s idea of subtractions.
• Euler’s gamma function and the dimensional regularization of integrals.
Behind this there is the general strategy of extracting finite information from infinities.
For example, we will consider the following examples:
• Regularization of divergent integrals (including the famous overlapping divergences).
• Abel’s adiabatic regularization of infinite series.
• Adiabatic regularization of oscillating integrals (the prototype of the Feynman
path integral trick).
• Poincar´e’s asymptotic series, the Landau singularity, and the Ritt theorem.
• The summation methods by Euler, Frobenius, H¨older, and Borel.
• Tikhonov regularization of ill-posed problems.
In modern renormalization theory, the combinatorics of Feynman diagrams plays
the crucial role. This is related to the mathematical notion of
• Hopf algebras and
• Rota–Baxter algebras.
The prototypes of these algebras appear in classical complex function theory in
connection with the inversion of holomorphic functions (the relation between Lagrange’s
inversion formula and the coinverse (antipode) of a Hopf algebra) and
the regularization of Laurent series (Rota-Baxter algebras). This will be studied in
Chap. 3.
2.1.1 Effective Frequency and Running Coupling Constant of an
Anharmonic Oscillator
In quantum electrodynamics, physicists distinguish between
• the bare mass of the electron, and
• its effective mass (resp. the bare charge of the electron and its effective charge).
The effective mass and the effective charge of the electron coincide with the physical
quantities measured in physical experiments. In contrast to this, the bare parameters
are only introduced as auxiliary quantities in renormalization theory for constructing
the effective quantities in terms of a mathematical algorithm. To illustrate
the crucial difference between bare parameters and effective physical parameters, let
us summarize the main features of a simple rigorous model which is studied in Sect.
11. 5 of Vol. I in full detail.
Let us now investigate the nonlinear problem of an
Anharmonic Oscillator. We have to distinguish
between two cases, namely,
• the regular non-resonance case and
• the singular resonance case which corresponds to renormalization in physics.
We will show that the trouble comes from resonances between the external
forces and the eigenoscillations.
This is the typical trouble in physics if resonance occurs.
To overcome the difficulties in the resonance case, we will use the method of
the pseudo-resolvent introduced by Erhard Schmidt in 1908.
This simple model corresponds to the (much more complicated) determination of
the electron charge, the electron mass, and the running coupling constant in quantum
electrodynamics. In particle physics, the running coupling constants of electromagnetic,
weak, and strong interaction depend on energy and momentum of the
scattering process observed in a particle accelerator.
2.1.2 The Zeta Function and Riemann’s Idea of Analytic
Continuation
Nature sees analytic continuation.
2.1.3 Meromorphic Functions and Mittag-Leffler’s Idea of
Subtractions
The method of subtractions plays a fundamental role in the renormalization of
quantum field theories according to Bogoliubov, Parasiuk, Hepp, and Zimmermann
(called BPHZ renormalization).
The basic idea is to enforce convergence by subtracting additional terms
called subtractions.
2.2 Regularization of Divergent Integrals in Baby
Renormalization Theory
In order to distinguish between convergent and divergent integrals, we will use the
method of power-counting based on a cut-off of the domain of integration. In terms
of physics, this cut-off corresponds to the introduction of an upper bound for the
admissible energies (resp. momenta). For the regularization of divergent integrals,
we will discuss the following methods used by physicists in renormalization theory:
(i) the method of differentiating parameter integrals,
(ii) the method of subtractions (including the famous overlapping divergences),
(iii) Pauli–Villars regularization,
(iv) dimensional regularization by means of Euler’s Gamma function,
(v) analytic regularization via integrals of Riemann–Liouville type, and
(vi) distribution-valued analytic functions.
Summarizing, regularization (and hence renormalization) methods produce
additional parameters which have to be determined by physical experiments.
The role of the renormalization group in physics. The change of the
normalizing momentum is governed by a transformation which is described by the
so-called renormalization group. In particular, this is crucial for studying the highenergy
behavior of quantum field theories (see Chap. 3 of Vol. I on scale changes in
physics). For example, quarks behave like free particles at very high energies. This
is the so-called asymptotic freedom of quarks.
The Method of Differentiating Parameter Integrals
Improve the convergence behavior of a parameter-depending integral by
differentiation with respect to the parameter.
Folklore
Overlapping Divergences
Overlapping divergences caused a lot of trouble in the history of renormalization
theory.
Folklore
The Role of Counterterms
The method of subtractions changes the integrands by subtracting regularizing
terms in order to enforce convergence of the integrals. Physicists try to connect the
subtraction terms with additional terms in the Lagrangian of the quantum field
theory under consideration. These additional terms of the Lagrangian are called
counterterms. It is one of the tasks of renormalization theory
• to study the structure of the necessary subtraction terms and
• to show that the subtraction terms can be generated by appropriate counterterms
of the classical Lagrangian (see Chap. 16).
The philosophy behind this approach is that the procedure of quantization adds
quantum effects to the classical field theory. These quantum effects can be described
by changing the classical Lagrangian by adding counterterms.
Historical remarks. Dimensional regularization was introduced by Gerardus
’t Hooft and Martinus Veltman in 1972. They showed that, in contrast to the 1938
Fermi model, the electroweak Standard Model in particle physics is renormalizable.
For this, ’t Hooft and Veltman were awarded the Nobel prize in physics in 1999.
Dimensional regularization is the standard method used by physicists in
modern renormalization theory
Pauli–Villars Regularization
Regularize divergent integrals by introducing additional ghost particles of
large masses.
Folklore
The Pauli–Villars regularization preserves the relativistic invariance. However, the
introduction of additional masses may destroy the gauge invariance.
Application to Algebraic Feynman Integrals in Minkowski
Space
The important mathematical problem of evaluating (algebraic) Feynman
integrals arises quite naturally in elementary particle physics when one
treats various quantities (corresponding to Feynman diagrams) in the
framework of perturbation theory.
Vladimir Smirnov, 2006
Distribution-Valued Meromorphic Functions
In quantum field theory, Green’s functions are closely related to distribution-
valued meromorphic functions.
Folklore
The following considerations will be used in the next section in order to study
Newton’s equation of motion in terms of the Fourier transform of tempered distributions.
For the convenience of the reader, our approach will be chosen in such a
way that it serves as a prototype for the study of Green’s functions in quantum
field theory later on. The essential tool is given by families of tempered distributions
which analytically depend on a complex parameter λ. This approach dates
back to a fundamental paper by Marcel Riesz in 1948。
2.3 Further Regularization Methods in Mathematics
In physical experiments, physicists measure finite numbers. In contrast to this, the
theory frequently generates infinite expressions.
The main task is to extract relevant finite information from infinite expressions.
In the history of mathematics, mathematicians frequently encountered this problem.
36 Let us discuss some important approaches.
2.3.1 Euler’s Philosophy
One of the greatest masters in mathematics, Leonhard Euler, relied on formal analytic
identities. He used the principle that the sum of a divergent series is the
value of the function from which the series is derived.
Adiabatic averaging cancels infinities.
This is the secret behind the incredible success of path integral methods in quantum
physics.
The Feynman approach to quantum field theory via path integrals is based
on formal adiabatic averages.
The basic ideas are studied in Chap. 7 of Vol. I, in terms of a finite-dimensional
rigorous setting.
Tauberian theorems. Let us consider an arbitrary series with complex
terms a0, a1, a2, . . . Then the following implications hold:
classical convergence ⇒ regularization by averaging ⇒ adiabatic regularization.
In particular, this tells us that if the series can be regularized by averaging, then
it can also be regularized in the adiabatic sense, and the regularized values are the
same. Conversely, we have the following convergence theorem:
The series is convergent if the series can be regularized in the
adiabatic sense (or by averaging in the sense of (2.73)) and we have the
relation |an| = O( 1/n ) as n → +∞.
This was proved by Hardy and Littlewood in the 1910s by generalizing a weaker
theorem proved by Tauber in 1890.39 Combining the Tauberian theorem above with
the Fej´er theorem, we get the following basic theorem:
If the 2π-periodic function f : R → C is continuous and its Fourier coefficients
satisfy the condition |an| = O( 1/n ) as n → ∞, then the Fourier
series converges to f on the real line.
2.4 Trouble in Mathematics
Formal manipulations in mathematics can lead to completely wrong results.
Folklore
2.4.1 Interchanging Limits
In the mathematics and physics literature, one frequently encounters the interchange
of limits. By considering three simple examples, we want to illustrate the
crucial fact that the formal interchange of limits can lead to wrong results.
In the mathematical literature, one proves theorems which guarantee the interchange
of limits.
Roughly speaking, one needs uniform convergence.
Ill-Posed Problems
Distinguish carefully between well-posed and ill-posed problems.
Folklore
By using the simple example (2.88) below, we want to show that an uncritical use
of the method of perturbation theory may lead to wrong results if the problem
is ill-posed. This is a possible paradigm for quantum field theory. According to
Hadamard (1865–1863), a mathematical problem is called well-posed iff it has a
unique solution which depends continuously on the data of the problem. Otherwise,
the problem is called ill-posed.
Roughly speaking, ill-posed problems refer to incomplete information.
Therefore, if we use perturbation theory in a formal manner, we can never exclude
such a singular behavior. This tells us that
The results obtained by formal perturbation theory have to be handled cautiously.
Formal perturbation theory and quantum field theory. Until now, all of
the predictions made by quantum field theory are based on the method of formal
perturbation theory. Surprisingly, for small coupling constants in quantum electrodynamics
and electroweak interaction, the theoretical predictions coincide with the
experimental results with very high accuracy. Physicists belief that this cannot happen
by chance. There remains the task to create a mathematically rigorous theory
which explains the great success of formal perturbation theory.
2.5 Mathemagics
Euler truly did not sour his life with limiting value considerations, convergence
and continuity criteria and he could not and did not wish to
bother about the logical foundation of analysis, but rather he relied – only
on occasion unsuccessfully – on his astonishing certitude of instinct and
algorithmic power.
Emil Fellmann, 1975
Seen statistically, Euler must have made a discovery every week. . . About
1911, Enestr¨om published an almost complete (from today’s viewpoint) list
of works with 866 titles. Of the 72 volumes of Euler’s Collected Works all
but three have appeared as of today. Euler’s correspondence with nearly
300 colleagues is estimated to constitute 4500 to 5000 letters, of which
perhaps a third appear to have been lost. These letters are to appear in
13 Volumes.
Euler was not only one of the greatest mathematicians, but also in general
one of the most creative human beings.
R¨udiger Thiele, 1982
In the entire history of mathematics, aside from the golden age of Greek
mathematics, there has never been a better time than that of Leonhard
Euler. It was his privilege to leave mathematics with a completely changed
face, *** it into a powerful machine that it is today.
Andreas Speiser, 1934
Pierre Cartier writes the following in his beautiful article Mathemagics, A tribute to
L. Euler and R. Feynman, S´eminaire Lotharingien 44, 1–71 from the year 2000
The implicit philosophical belief of the working mathematician is today the
Hilbert–Bourbaki formalism. Ideally, one works within a closed system: the
basic principles are clearly enunciated once for all, including (that is an
addition of twentieth century science) the formal rules of logical reasoning
clothed in mathematical form. The basic principles include precise definitions
of all mathematical objects, and the coherence between the various
branches of mathematical sciences is achieved through reduction to basic
models in the universe of sets. A very important feature of the system is its
non-contradiction; after G¨odel (1906–1978), we have lost the initial hopes
to establish the non-contradiction by a formal reasoning, but one can live
with a corresponding belief in non-contradiction. The whole structure is
certainly very appealing, but the illusion is that it is eternal, that it will
function for ever according to the same principles. What history of mathematics
teaches us is that the principles of mathematical deduction, and
not simply the mathematical theories, have evolved over the centuries. In
modern times, theories like General Topology or Lebesgue’s Integration
Theory represent an almost perfect model of precision, flexibility, and harmony,
and their applications, for instance to probability theory, have been
very successful. My thesis is:
There is another way of doing mathematics, equally successful, and the two
methods should supplement each other and not fight.
This other way bears various names: symbolic method, operational calculus,
operator theory. . . Euler was the first to use such methods in his
extensive study of infinite series, convergent as well as divergent. The calculus
of differences was developed by Boole (1815–1864) around 1860 in a
symbolic way, then Heaviside (1850–1925) created his own symbolic calculus
to deal with systems of differential equations in electric circuits. But
the modern master was Feynman (1918–1988) who used his diagrams, his
disentangling of operators, his path integrals. . .
The method consists in stretching the formulas to their extreme
consequences, resorting to some internal feeling of coherence and
harmony.
There are obviously pitfalls in such methods, and only experience can tell
you that for the Dirac delta function an expression like xδ(x) or δ'(x) is
lawful, but not δ(x)/x or δ(x)2. Very often, these so-called symbolic methods
have been substantiated by later rigorous developments, for instance,
the Schwartz distribution theory gives a rigorous meaning to δ(x), but
physicists used sophisticated formulas in “momentum space” long before
Laurent Schwartz codified the Fourier transformation for distributions.
The Feynman “sums over histories” have been immensely successful in
many problems, coming from physics as well from mathematics, despite
the lack of a comprehensive rigorous theory.
Newton (1643–1727), Leibniz (1646–1716), Euler (1707–1783) and their successors
very successfully used infinitesimals, that is, quantities with the strange property
dx =/= 0 and (dx) (dx)= 0. (2.90)
Such quantities are still frequently used in the physics literature. Obviously, classical
numbers dx do not have the property (2.90). Based on the notion of ultra-filters,
we will show in Sect. 4.6 how to introduce rigorously such infinitesimals as equivalence
classes of real numbers. We will embed this into the discussion of a general
mathematical strategy called the strategy of equivalence classes
由一星于2014-08-10, 06:29进行了最后一次编辑,总共编辑了21次
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
黎曼猜想[编辑]
[ltr]
黎曼猜想由德国数学家波恩哈德·黎曼(Bernhard Riemann)于1859年提出。它是数学中一个重要而又著名的未解决的问题。多年来它吸引了许多出色的数学家为之绞尽脑汁。[/ltr]
[ltr]
黎曼猜想(RH)是关于黎曼ζ函数ζ(s)的零点分布的猜想。黎曼ζ函数在任何复数s ≠ 1上有定义。它在负偶数上也有零点(例如,当s = −2, s = −4, s = −6, ...)。这些零点是“平凡零点”。黎曼猜想关心的是非平凡零点。
黎曼猜想提出:
黎曼ζ函数非平凡零点的实数部份是½
即所有的非平凡零点都应该位于直线(“临界线”)上。t为一实数,而i为虚数的基本单位。沿临界线的黎曼ζ函数有时通过Z-函数进行研究。它的实零点对应于ζ函数在临界线上的零点。
素数在自然数中的分布问题在纯粹数学和应用数学上都很重要。素数在自然数中的分布并没有简单的规律。黎曼(1826-1866)发现素数出现的频率与黎曼ζ函数紧密相关。
1901年Helge von Koch指出,黎曼猜想与强条件的素数定理等价。现在已经验证了最初的1,500,000,000个素数对这个定理都成立。但是是否所有的解对此定理都成立,至今尚无人给出证明。
黎曼猜想所以被认为是当代数学中一个重要的问题,主要是因为很多深入和重要的数学和物理结果都能在它成立的大前提下被证明。大部份数学家也相信黎曼猜想是正确的(约翰·恩瑟·李特尔伍德与塞尔伯格曾提出怀疑。塞尔伯格于晚年部分改变了他的怀疑立场。在1989年的一篇论文中,他猜测黎曼猜想对更广泛的一类函数也应当成立。)克雷数学研究所设立了$1,000,000美元的奬金给予第一个得出正确证明的人。[/ltr]
4 已否证的猜想
5 相对弱的猜想
5.1 Lindelöf猜想
5.2 大素数间隙猜想
6 证明黎曼猜想的尝试
7 黎曼猜想证明的可能的着手方向
8 与算子理论的可能联系
9 搜寻ζ函数的零点
10 参考文献
10.1 历史文献
10.2 现代技术参考
10.3 受欢迎的参考资料
10.4 引用来源
11 参见
[ltr]
历史[编辑][/ltr]
[ltr]
黎曼1859年在他的论文《Über die Anzahl der Primzahlen unter einer gegebenen Größe》中提及了这个著名的猜想,但它并非该论文的中心目的,他也没有试图给出证明。黎曼知道ζ函数的不平凡零点对称地分布在直线s = ½ + it上,以及他知道它所有的不平凡零点一定位于区域0 ≤ Re(s) ≤ 1中。
1896年,雅克·阿达马和Charles Jean de la Vallée-Poussin分别独立地证明了在直线Re(s) = 1上没有零点。连同了黎曼对于不非凡零点已经证明了的其他特性,这显示了所有不平凡零点一定处于区域0 < Re(s) < 1上。这是素数定理第一个完整证明中很关键的一步。
1900年,大卫·希尔伯特将黎曼猜想包括在他著名的23条问题中,与哥德巴赫猜想一起组成了希尔伯特名单上的第8号问题。同时黎曼猜想也是希尔伯特问题中唯一一个被收入克雷数学研究所的千禧年大奖数学难题的。希尔伯特曾说,如果他在沉睡1000年后醒来,他将问的第一个问题便是:黎曼猜想得到证明了吗?[1]
1914年,高德菲·哈罗德·哈代证明了有无限个零点在直线Re(s) = ½上。然而仍然有可能有无限个不平凡零点位于其它地方(而且有可能是最主要的零点)。后来哈代与约翰·恩瑟·李特尔伍德在1921年及塞尔伯格在1942年的工作(临界线定理)也就是计算零点在临界线Re(s) = ½上的平均密度。
近年来的工作主要集中于清楚的计算大量零点的位置(希望借此能找到一个反例)以及对处于临界线以外零点数目的比例置一上界(希望能把上界降至零)[来源请求]。
黎曼猜想与素数定理[编辑]
黎曼猜想传统的表达式隠藏了这个猜想的真正重要性。黎曼ζ函数与素数的分布有着深厚的连结。Helge von Koch在1901年证明了黎曼猜想等价于素数定理一个可观的强化:给出任何ε > 0,我们有
式中π(x)为素数计数函数,ln(x)为x 的自然对数,以及右手边用上了大O符号[2]。一个由Lowell Schoenfeld提出的非近似版本,表示黎曼猜想等价于
。
黎曼ζ函数的零点与素数满足一个称为明确公式的对偶性,这表明了:在调和分析的意义下,黎曼ζ函数的零点可视为素数分布的谐波。
将黎曼ζ函数代为更一般的L-函数,此时仍有相应的猜想:整体L-函数的非平凡零点的实部必等于。这被称为广义黎曼猜想。函数域上的广义黎曼猜想已被证明,数域的情形仍悬而未决。
黎曼猜想之结果及其等价命题[编辑]
黎曼猜想的实际用途包括一些在黎曼猜想成立前提底下能被证明为真的命题,当中有些更被证明了跟黎曼猜想等价。其中一个就是以上素数定理误差项的增长率。
默比乌斯函数的增长率[编辑]
其中一个命题牵涉了默比乌斯函数μ。命题“等式
在s的实部大于½的时候成立,而且右边项的和收敛”就等价于黎曼猜想。由此我们能够总结出假如Mertens函数的定义为
那黎曼猜想就等价于对任何都有
,
这将会对于M的增长给出了一个更紧的限制,因为即使没有黎曼猜想我们也能得出
(关于这些符号的意思,见大O符号。)
积性函数增长率[编辑]
黎曼猜想等价于一些除μ(n)以外一些积性函数增长率的猜想。例如,约数函数σ(n)由下式给出:
那在n > 5040的时候,
,
这名为Robin定理并在1984年以Guy Robin命名。另一个有关的上限在2002年由Jeffrey Lagarias提出,他证明了黎曼猜想等价于命题“对于任意自然数n,
而为第n个调和数。
里斯判准与二项式系数和[编辑]
里斯判准由里斯在1916年给出[3],它断言黎曼猜想等价于下式对所有成立
哈代稍后于1918年以波莱尔求和法及梅林变换证明了下式的积分表法。
其它相关的积性函数的增长率也具有与黎曼猜想等价的表述。
考虑二项式系数和
,
Báez-Duarte[4][5]与Flajolet、Brigitte Vallée[6]证明了黎曼猜想等价于对所有的下式成立
。
类似的还有以下级数
。
对此。Flajolet与Vepstas [7]证明了黎曼猜想等价于对所有的下式成立
其中的是依赖于的某个常数。
韦伊判准、李判准[编辑]
韦伊判准断言某些函数的正定性等价于广义黎曼猜想。与此相似的还有李判准,这断言某些数列的正性等价于黎曼猜想。
跟法里数列的关系[编辑]
另外两个跟黎曼猜想等价的命题牵涉了法里数列。假如Fn是法里数列中的第n项,由1/n开始而终于1/1,那命题“给出任何e > ½
”等价于黎曼猜想。在这里是法里数列中n阶项的数目。类似地等价于黎曼猜想的命题是“给出任何e > −1.
”
跟群论的关系[编辑]
黎曼猜想等价于群论中的一些猜想。举例说,g(n),是对称群Sn的所有元素的秩之中,最大的一个,也就是兰道函数,则黎曼猜想等价于:对够大的n,下式成立:
。
与埃拉托斯特尼筛法的关系[编辑]
参见埃拉托斯特尼筛法,黎曼猜想的素数公式直接来源于埃拉托斯特尼筛法的过程。
临界线定理[编辑]
黎曼猜想等价于命题“的导函数在区域
上无零点。” 函数ζ在临界线上只有单零点的充要条件是其导函数在临界线上非零。所以若黎曼猜想成立,命题中的非零区域可以延伸为。这条进路带来了一些成果。Norman Levinson将此条件加细,从而得到了较强的临界线定理。
已否证的猜想[编辑]
一些比黎曼猜想强的猜想曾被提出,但它们有被否证的趋势。Paul Turan证明了假如级数
当s大于1时没有零点,则黎曼猜想成立,但Hugh Montgomery证明了这前提并不成立。另一个更强的默滕斯猜想也同样被否证。
相对弱的猜想[编辑]
Lindelöf猜想[编辑]
黎曼猜想有各种比较弱的结果;其中一个是关于ζ函数于临界线上的增长速度的Lindelöf猜想,表明了给出任意的e > 0,当t趋向无限,
记第n 个素数为pn,一个由Albert Ingham得出的结果显示,Lindelöf猜想将推导出“给出任意e > 0,对足够大的n 有
pn+1 - pn < p1/2+e,”
不过这个结果比大素数间隙猜想弱,详如下述。
大素数间隙猜想[编辑]
另一个猜想是大素数间隙猜想。哈拉尔德·克拉梅尔证明了:假设黎曼猜想成立,素数p 与其后继者之间的间隙将会为。平均来说,该间隙的阶仅为,而根据数值计算结果,它的增长率并不似黎曼猜想所预测的那么大。
证明黎曼猜想的尝试[编辑]
过去的一百多年,有很多数学家声称证明了黎曼猜想。截至2007年为止,尚有一些证明还未被验证;但它们都被数学社群所质疑,多数专家并不相信它们是正确的。艾希特大学的Matthew R. Watkins为这些或是严肃或是荒唐的证明编辑了一份列表[8]。其他一些证明可在arXiv数据库中找到。
黎曼猜想证明的可能的着手方向[编辑]
由于黎曼猜想是有关2维变量(临界线(critical line)上的虚数解和黎曼ζ函数中的自然数变量n)的问题,故不但要考虑在2维变量下的情况,似乎还可以从更高维数(例如3或4维甚至更高维)变量的情况下来考虑问题。
另外,由于黎曼猜想从本质上来说是证明一个方程的非平凡的复数解必然是1/2+bi的形式(b是实数,i是虚数单位),因此应该与代数学是密不可分的;就是说,代数几何、代数数论甚至代数拓扑等学科的知识是不可缺少的。如果能从上述几个分支学科之间找到新的联系,以及对这些分支学科有进一步的新发现,那可能可以为证明黎曼猜想打下基础,或为黎曼猜想的证明做好准备。
与算子理论的可能联系[编辑]
主条目:希尔伯特-波利亚猜想
长久以来,人们猜测黎曼猜想的“正解”是找到一个适当的自伴算符,再由实特征值的判准导出零点实部的资讯。在此方向上已有许多工作,却仍未有决定性的进展。
黎曼ζ函数的统计学性质与随机矩阵的特征值有许多相似处。这为希尔伯特-波利亚猜想提供了一些支持。
在1999年,Michael Berry与Jon Keating猜想经典哈密顿函数有某个未知的量子化,使得下式成立
更奇特的是,黎曼ζ函数的零点与算子的谱相同。正则量子化的情形则相反:正则量子化引致海森堡测不准原理,并使量子谐振子的谱为自然数。重点在于,所求的哈密顿算符应当是个闭自伴算符,方能满足希尔伯特-波利亚猜想之要求。
搜寻ζ函数的零点[编辑][/ltr]
[ltr]
关于计算上找寻ζ函数零点越多越好的尝试,已经有一段很长的历史了。其中一个出名的尝试乃ZetaGrid,一个分散式计算的计划,一天可检查上十亿个零点。这计划在2005年11月终止。直至2006年,没有计算计划成功找到黎曼猜想的一个反例。
2004年,Xavier Gourdon与Patrick Demichel透过Odlyzko-Schönhage algorithm验证了黎曼猜想的头十兆个非平凡零点。
Michael Rubinstein给了公众一个算法去算出零点。[/ltr]
[ltr]参见[size=13][编辑]
[/ltr][/size]
P/NP问题 |
霍奇猜想 |
庞加莱猜想(已证明) |
黎曼猜想 |
杨-米尔斯存在性与质量间隙 |
纳维-斯托克斯存在性与光滑性 |
贝赫和斯维讷通-戴尔猜想 |
黎曼猜想由德国数学家波恩哈德·黎曼(Bernhard Riemann)于1859年提出。它是数学中一个重要而又著名的未解决的问题。多年来它吸引了许多出色的数学家为之绞尽脑汁。[/ltr]
- 黎曼猜想:
黎曼ζ函数,
。 非平凡零点(在此情况下是指s不为-2、-4、-6‧‧‧等点的值)的实数部份是。
未解决的数学问题:黎曼ζ函数的每个非平凡零点的实部是否同为½? |
黎曼猜想(RH)是关于黎曼ζ函数ζ(s)的零点分布的猜想。黎曼ζ函数在任何复数s ≠ 1上有定义。它在负偶数上也有零点(例如,当s = −2, s = −4, s = −6, ...)。这些零点是“平凡零点”。黎曼猜想关心的是非平凡零点。
黎曼猜想提出:
黎曼ζ函数非平凡零点的实数部份是½
即所有的非平凡零点都应该位于直线(“临界线”)上。t为一实数,而i为虚数的基本单位。沿临界线的黎曼ζ函数有时通过Z-函数进行研究。它的实零点对应于ζ函数在临界线上的零点。
素数在自然数中的分布问题在纯粹数学和应用数学上都很重要。素数在自然数中的分布并没有简单的规律。黎曼(1826-1866)发现素数出现的频率与黎曼ζ函数紧密相关。
1901年Helge von Koch指出,黎曼猜想与强条件的素数定理等价。现在已经验证了最初的1,500,000,000个素数对这个定理都成立。但是是否所有的解对此定理都成立,至今尚无人给出证明。
黎曼猜想所以被认为是当代数学中一个重要的问题,主要是因为很多深入和重要的数学和物理结果都能在它成立的大前提下被证明。大部份数学家也相信黎曼猜想是正确的(约翰·恩瑟·李特尔伍德与塞尔伯格曾提出怀疑。塞尔伯格于晚年部分改变了他的怀疑立场。在1989年的一篇论文中,他猜测黎曼猜想对更广泛的一类函数也应当成立。)克雷数学研究所设立了$1,000,000美元的奬金给予第一个得出正确证明的人。[/ltr]
- 1 历史
- 2 黎曼猜想与素数定理
- 3 黎曼猜想之结果及其等价命题
- 3.1 默比乌斯函数的增长率
- 3.2 积性函数增长率
- 3.3 里斯判准与二项式系数和
- 3.4 韦伊判准、李判准
- 3.5 跟法里数列的关系
- 3.6 跟群论的关系
- 3.7 与埃拉托斯特尼筛法的关系
- 3.8 临界线定理
[ltr]
历史[编辑][/ltr]
[ltr]
黎曼1859年在他的论文《Über die Anzahl der Primzahlen unter einer gegebenen Größe》中提及了这个著名的猜想,但它并非该论文的中心目的,他也没有试图给出证明。黎曼知道ζ函数的不平凡零点对称地分布在直线s = ½ + it上,以及他知道它所有的不平凡零点一定位于区域0 ≤ Re(s) ≤ 1中。
1896年,雅克·阿达马和Charles Jean de la Vallée-Poussin分别独立地证明了在直线Re(s) = 1上没有零点。连同了黎曼对于不非凡零点已经证明了的其他特性,这显示了所有不平凡零点一定处于区域0 < Re(s) < 1上。这是素数定理第一个完整证明中很关键的一步。
1900年,大卫·希尔伯特将黎曼猜想包括在他著名的23条问题中,与哥德巴赫猜想一起组成了希尔伯特名单上的第8号问题。同时黎曼猜想也是希尔伯特问题中唯一一个被收入克雷数学研究所的千禧年大奖数学难题的。希尔伯特曾说,如果他在沉睡1000年后醒来,他将问的第一个问题便是:黎曼猜想得到证明了吗?[1]
1914年,高德菲·哈罗德·哈代证明了有无限个零点在直线Re(s) = ½上。然而仍然有可能有无限个不平凡零点位于其它地方(而且有可能是最主要的零点)。后来哈代与约翰·恩瑟·李特尔伍德在1921年及塞尔伯格在1942年的工作(临界线定理)也就是计算零点在临界线Re(s) = ½上的平均密度。
近年来的工作主要集中于清楚的计算大量零点的位置(希望借此能找到一个反例)以及对处于临界线以外零点数目的比例置一上界(希望能把上界降至零)[来源请求]。
黎曼猜想与素数定理[编辑]
黎曼猜想传统的表达式隠藏了这个猜想的真正重要性。黎曼ζ函数与素数的分布有着深厚的连结。Helge von Koch在1901年证明了黎曼猜想等价于素数定理一个可观的强化:给出任何ε > 0,我们有
式中π(x)为素数计数函数,ln(x)为x 的自然对数,以及右手边用上了大O符号[2]。一个由Lowell Schoenfeld提出的非近似版本,表示黎曼猜想等价于
。
黎曼ζ函数的零点与素数满足一个称为明确公式的对偶性,这表明了:在调和分析的意义下,黎曼ζ函数的零点可视为素数分布的谐波。
将黎曼ζ函数代为更一般的L-函数,此时仍有相应的猜想:整体L-函数的非平凡零点的实部必等于。这被称为广义黎曼猜想。函数域上的广义黎曼猜想已被证明,数域的情形仍悬而未决。
黎曼猜想之结果及其等价命题[编辑]
黎曼猜想的实际用途包括一些在黎曼猜想成立前提底下能被证明为真的命题,当中有些更被证明了跟黎曼猜想等价。其中一个就是以上素数定理误差项的增长率。
默比乌斯函数的增长率[编辑]
其中一个命题牵涉了默比乌斯函数μ。命题“等式
在s的实部大于½的时候成立,而且右边项的和收敛”就等价于黎曼猜想。由此我们能够总结出假如Mertens函数的定义为
那黎曼猜想就等价于对任何都有
,
这将会对于M的增长给出了一个更紧的限制,因为即使没有黎曼猜想我们也能得出
(关于这些符号的意思,见大O符号。)
积性函数增长率[编辑]
黎曼猜想等价于一些除μ(n)以外一些积性函数增长率的猜想。例如,约数函数σ(n)由下式给出:
那在n > 5040的时候,
,
这名为Robin定理并在1984年以Guy Robin命名。另一个有关的上限在2002年由Jeffrey Lagarias提出,他证明了黎曼猜想等价于命题“对于任意自然数n,
而为第n个调和数。
里斯判准与二项式系数和[编辑]
里斯判准由里斯在1916年给出[3],它断言黎曼猜想等价于下式对所有成立
哈代稍后于1918年以波莱尔求和法及梅林变换证明了下式的积分表法。
其它相关的积性函数的增长率也具有与黎曼猜想等价的表述。
考虑二项式系数和
,
Báez-Duarte[4][5]与Flajolet、Brigitte Vallée[6]证明了黎曼猜想等价于对所有的下式成立
。
类似的还有以下级数
。
对此。Flajolet与Vepstas [7]证明了黎曼猜想等价于对所有的下式成立
其中的是依赖于的某个常数。
韦伊判准、李判准[编辑]
韦伊判准断言某些函数的正定性等价于广义黎曼猜想。与此相似的还有李判准,这断言某些数列的正性等价于黎曼猜想。
跟法里数列的关系[编辑]
另外两个跟黎曼猜想等价的命题牵涉了法里数列。假如Fn是法里数列中的第n项,由1/n开始而终于1/1,那命题“给出任何e > ½
”等价于黎曼猜想。在这里是法里数列中n阶项的数目。类似地等价于黎曼猜想的命题是“给出任何e > −1.
”
跟群论的关系[编辑]
黎曼猜想等价于群论中的一些猜想。举例说,g(n),是对称群Sn的所有元素的秩之中,最大的一个,也就是兰道函数,则黎曼猜想等价于:对够大的n,下式成立:
。
与埃拉托斯特尼筛法的关系[编辑]
参见埃拉托斯特尼筛法,黎曼猜想的素数公式直接来源于埃拉托斯特尼筛法的过程。
临界线定理[编辑]
黎曼猜想等价于命题“的导函数在区域
上无零点。” 函数ζ在临界线上只有单零点的充要条件是其导函数在临界线上非零。所以若黎曼猜想成立,命题中的非零区域可以延伸为。这条进路带来了一些成果。Norman Levinson将此条件加细,从而得到了较强的临界线定理。
已否证的猜想[编辑]
一些比黎曼猜想强的猜想曾被提出,但它们有被否证的趋势。Paul Turan证明了假如级数
当s大于1时没有零点,则黎曼猜想成立,但Hugh Montgomery证明了这前提并不成立。另一个更强的默滕斯猜想也同样被否证。
相对弱的猜想[编辑]
Lindelöf猜想[编辑]
黎曼猜想有各种比较弱的结果;其中一个是关于ζ函数于临界线上的增长速度的Lindelöf猜想,表明了给出任意的e > 0,当t趋向无限,
记第n 个素数为pn,一个由Albert Ingham得出的结果显示,Lindelöf猜想将推导出“给出任意e > 0,对足够大的n 有
pn+1 - pn < p1/2+e,”
不过这个结果比大素数间隙猜想弱,详如下述。
大素数间隙猜想[编辑]
另一个猜想是大素数间隙猜想。哈拉尔德·克拉梅尔证明了:假设黎曼猜想成立,素数p 与其后继者之间的间隙将会为。平均来说,该间隙的阶仅为,而根据数值计算结果,它的增长率并不似黎曼猜想所预测的那么大。
证明黎曼猜想的尝试[编辑]
过去的一百多年,有很多数学家声称证明了黎曼猜想。截至2007年为止,尚有一些证明还未被验证;但它们都被数学社群所质疑,多数专家并不相信它们是正确的。艾希特大学的Matthew R. Watkins为这些或是严肃或是荒唐的证明编辑了一份列表[8]。其他一些证明可在arXiv数据库中找到。
黎曼猜想证明的可能的着手方向[编辑]
由于黎曼猜想是有关2维变量(临界线(critical line)上的虚数解和黎曼ζ函数中的自然数变量n)的问题,故不但要考虑在2维变量下的情况,似乎还可以从更高维数(例如3或4维甚至更高维)变量的情况下来考虑问题。
另外,由于黎曼猜想从本质上来说是证明一个方程的非平凡的复数解必然是1/2+bi的形式(b是实数,i是虚数单位),因此应该与代数学是密不可分的;就是说,代数几何、代数数论甚至代数拓扑等学科的知识是不可缺少的。如果能从上述几个分支学科之间找到新的联系,以及对这些分支学科有进一步的新发现,那可能可以为证明黎曼猜想打下基础,或为黎曼猜想的证明做好准备。
与算子理论的可能联系[编辑]
主条目:希尔伯特-波利亚猜想
长久以来,人们猜测黎曼猜想的“正解”是找到一个适当的自伴算符,再由实特征值的判准导出零点实部的资讯。在此方向上已有许多工作,却仍未有决定性的进展。
黎曼ζ函数的统计学性质与随机矩阵的特征值有许多相似处。这为希尔伯特-波利亚猜想提供了一些支持。
在1999年,Michael Berry与Jon Keating猜想经典哈密顿函数有某个未知的量子化,使得下式成立
更奇特的是,黎曼ζ函数的零点与算子的谱相同。正则量子化的情形则相反:正则量子化引致海森堡测不准原理,并使量子谐振子的谱为自然数。重点在于,所求的哈密顿算符应当是个闭自伴算符,方能满足希尔伯特-波利亚猜想之要求。
搜寻ζ函数的零点[编辑][/ltr]
[ltr]
关于计算上找寻ζ函数零点越多越好的尝试,已经有一段很长的历史了。其中一个出名的尝试乃ZetaGrid,一个分散式计算的计划,一天可检查上十亿个零点。这计划在2005年11月终止。直至2006年,没有计算计划成功找到黎曼猜想的一个反例。
2004年,Xavier Gourdon与Patrick Demichel透过Odlyzko-Schönhage algorithm验证了黎曼猜想的头十兆个非平凡零点。
Michael Rubinstein给了公众一个算法去算出零点。[/ltr]
[ltr]参见[size=13][编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
Divergent series
From Wikipedia, the free encyclopedia
[ltr]For the media franchise, see Divergent trilogy.[/ltr]
[ltr]
In mathematics, a divergent series is aninfinite series that is not convergent, meaning that the infinite sequence of the partial sums of the series does not have a finite limit.
If a series converges, the individual terms of the series must approach zero. Thus any series in which the individual terms do not approach zero diverges. However, convergence is a stronger condition: not all series whose terms approach zero converge. The simplest counterexample is the harmonic series
The divergence of the harmonic series was proven by the medieval mathematician Nicole Oresme.
In specialized mathematical contexts, values can be usefully assigned to certain series whose sequence of partial sums diverges. A summability method or summation method is a partial function from the set of sequences of partial sums of series to values. For example, Cesàro summation assigns Grandi's divergent series
the value 1/2. Cesàro summation is an averaging method, in that it relies on the arithmetic mean of the sequence of partial sums. Other methods involve analytic continuations of related series. In physics, there are a wide variety of summability methods; these are discussed in greater detail in the article on regularization.[/ltr]
5 Nørlund means
5.1 Cesàro summation
6 Abelian means
6.1 Abel summation
6.2 Lindelöf summation
7 Analytic continuation
7.1 Analytic continuation of power series
7.2 Euler summation
7.3 Analytic continuation of Dirichlet series
7.4 Zeta function regularization
8 Integral function means
8.1 Borel summation
8.2 Valiron's method
9 Moment methods
9.1 Borel summation
10 Miscellaneous methods
10.1 Hausdorff transformations
10.2 Hölder summation
10.3 Hutton's method
10.4 Ingham summability
10.5 Lambert summability
10.6 Le Roy summation
10.7 Mittag-Leffler summation
10.8 Ramanujan summation
10.9 Riemann summability
10.10 Riesz means
10.11 Vallée-Poussin summability
11 See also
12 References
[ltr]
History[edit][/ltr]
[ltr]
Before the 19th century divergent series were widely used by Euler and others, but often led to confusing and contradictory results. A major problem was Euler's idea that any divergent series should have a natural sum, without first defining what is meant by the sum of a divergent series. Cauchy eventually gave a rigorous definition of the sum of a (convergent) series, and for some time after this divergent series were mostly excluded from mathematics. They reappeared in 1886 with Poincaré's work on asymptotic series. In 1890 Cesaro realized that one could give a rigorous definition of the sum of some divergent series, and defined Cesaro summation. (This was not the first use of Cesaro summation which was used implicitly by Frobenius in 1880; Cesaro's key contribution was not the discovery of this method but his idea that one should give an explicit definition of the sum of a divergent series.) In the years after Cesaro's paper several other mathematicians gave other definitions of the sum of a divergent series, though these are not always compatible: different definitions can give different answers for the sum of the same divergent series, so when talking about the sum of a divergent series it is necessary to specify which summation method one is using.
Theorems on methods for summing divergent series[edit]
A summability method M is regular if it agrees with the actual limit on allconvergent series. Such a result is called an abelian theorem for M, from the prototypical Abel's theorem. More interesting and in general more subtle are partial converse results, called tauberian theorems, from a prototype proved by Alfred Tauber. Here partial converse means that if Msums the series Σ, and some side-condition holds, then Σ was convergent in the first place; without any side condition such a result would say that Monly summed convergent series (*** it useless as a summation method for divergent series).
The operator giving the sum of a convergent series is linear, and it follows from the Hahn–Banach theorem that it may be extended to a summation method summing any series with bounded partial sums. This fact is not very useful in practice since there are many such extensions, inconsistentwith each other, and also since proving such operators exist requires invoking the axiom of choice or its equivalents, such as Zorn's lemma. They are therefore nonconstructive.
The subject of divergent series, as a domain of mathematical analysis, is primarily concerned with explicit and natural techniques such as Abel summation, Cesàro summation and Borel summation, and their relationships. The advent of Wiener's tauberian theorem marked an epoch in the subject, introducing unexpected connections to Banach algebramethods in Fourier analysis.
Summation of divergent series is also related to extrapolation methods andsequence transformations as numerical techniques. Examples for such techniques are Padé approximants, Levin-type sequence transformations, and order-dependent mappings related to renormalization techniques for large-order perturbation theory in quantum mechanics.
Properties of summation methods[edit]
Summation methods usually concentrate on the sequence of partial sums of the series. While this sequence does not converge, we may often find that when we take an average of larger and larger initial terms of the sequence, the average converges, and we can use this average instead of a limit to evaluate the sum of the series. So in evaluatinga = a0 + a1 + a2 + ..., we work with the sequence s, where s0 = a0 andsn+1 = sn + an+1. In the convergent case, the sequence s approaches the limit a. A summation method can be seen as a function from a set of sequences of partial sums to values. If A is any summation method assigning values to a set of sequences, we may mechanically translate this to a series-summation method AΣ that assigns the same values to the corresponding series. There are certain properties it is desirable for these methods to possess if they are to arrive at values corresponding to limits and sums, respectively.[/ltr]
[ltr]
The third condition is less important, and some significant methods, such as Borel summation, do not possess it.[citation needed]
One can also give a weaker alternative to the last condition.[/ltr]
[ltr]
A desirable property for two distinct summation methods A and B to share is consistency: A and B are consistent if for every sequence s to which both assign a value, A(s) = B(s). If two methods are consistent, and one sums more series than the other, the one summing more series isstronger.
There are powerful numerical summation methods that are neither regular nor linear, for instance nonlinear sequence transformations like Levin-type sequence transformations and Padé approximants, as well as the order-dependent mappings of perturbative series based on renormalizationtechniques.
Taking regularity, linearity and stability as axioms, it is possible to sum many divergent series by elementary algebraic manipulations. This partly explains why many different summation methods give the same answer for certain series.
For instance, whenever r ≠ 1, the geometric series
can be evaluated regardless of convergence. More rigorously, any summation method that possesses these properties and which assigns a finite value to the geometric series must assign this value. However, when ris a real number larger than 1, the partial sums increase without bound, and averaging methods assign a limit of ∞.
Classical summation methods[edit]
The two classical summation methods for series, ordinary convergence and absolute convergence, define the sum as a limit of certain partial sums. Strictly speaking these are not really summation methods for divergent series, as by definition a series is divergent only if these methods do not work, but are included for completeness. Most but not all summation methods for divergent series extend these methods to a larger class of sequences.
Absolute convergence[edit]
Absolute convergence defines the sum of a sequence (or set) of numbers to be the limit of the net of all partial sums ak[sub]1[/sub]+ ...+ak[sub]n[/sub], if it exists. It does not depend on the order of the elements of the sequence, and a classical theorem says that a sequence is absolutely convergent if and only if the sequence of absolute values is convergent in the standard sense.
Sum of a series[edit]
Cauchy's classical definition of the sum of a series a0+a1+... defines the sum to be the limit of the sequence of partial sums a0+ ...+an. This is the default definition of convergence of a sequence.
Nørlund means[edit]
Suppose pn is a sequence of positive terms, starting from p0. Suppose also that
If now we transform a sequence s by using p to give weighted means, setting
then the limit of tn as n goes to infinity is an average called the Nørlundmean Np(s).
The Nørlund mean is regular, linear, and stable. Moreover, any two Nørlund means are consistent.
Cesàro summation[edit]
The most significant of the Nørlund means are the Cesàro sums. Here, if we define the sequence pk by
then the Cesàro sum Ck is defined by Ck(s) = N(pk)(s). Cesàro sums are Nørlund means if k ≥ 0, and hence are regular, linear, stable, and consistent. C0 is ordinary summation, and C1 is ordinary Cesàro summation. Cesàro sums have the property that if h > k, then Ch is stronger than Ck.
Abelian means[edit]
Suppose λ = {λ0, λ1, λ2, ...} is a strictly increasing sequence tending towards infinity, and that λ0 ≥ 0. Suppose
converges for all real numbers x>0. Then the Abelian mean Aλ is defined as
More generally, if the series for f only converges for large x but can be analytically continued to all positive real x, then one can still define the sum of the divergent series by the limit above.
A series of this type is known as a generalized Dirichlet series; in applications to physics, this is known as the method of heat-kernel regularization.
Abelian means are regular and linear, but not stable and not always consistent between different choices of λ. However, some special cases are very important summation methods.
Abel summation[edit]
See also: Abel's theorem
If λn = n, then we obtain the method of Abel summation. Here
where z = exp(−x). Then the limit of ƒ(x) as x approaches 0 through positive reals is the limit of the power series for ƒ(z) as z approaches 1 from below through positive reals, and the Abel sum A(s) is defined as
Abel summation is interesting in part because it is consistent with but more powerful than Cesàro summation: A(s) = Ck(s) whenever the latter is defined. The Abel sum is therefore regular, linear, stable, and consistent with Cesàro summation.
Lindelöf summation[edit]
If λn = n log(n), then (indexing from one) we have
Then L(s), the Lindelöf sum (Volkov 2001), is the limit of ƒ(x) as x goes to zero. The Lindelöf sum is a powerful method when applied to power series among other applications, summing power series in the Mittag-Leffler star.
If g(z) is analytic in a disk around zero, and hence has a Maclaurin seriesG(z) with a positive radius of convergence, then L(G(z)) = g(z) in the Mittag-Leffler star. Moreover, convergence to g(z) is uniform on compact subsets of the star.
Analytic continuation[edit]
Several summation methods involve taking the value of an analytic continuation of a function.
Analytic continuation of power series[edit]
If Σanxn converges for small complex x and can be analytically continued along some path from x=0 to the point x=1, then the sum of the series can be defined to be the value at x=1. This value may depend on the choice of path.
Euler summation[edit]
Main article: Euler summation
Euler summation is essentially an explicit form of analytic continuation. If a power series converges for small complex z and can be analytically continued to the open disk with diameter from −1/(q+1) to 1 and is continuous at 1, then its value at is called the Euler or (E,q) sum of the series a0+.... Euler used it before analytic continuation was defined in general, and gave explicit formulas for the power series of the analytic continuation.
The operation of Euler summation can be repeated several times, and this is essentially equivalent to taking an analytic continuation of a power series to the point z=1.
Analytic continuation of Dirichlet series[edit]
This method defines the sum of a series to be the value of the analytic continuation of the Dirichlet series
at s=0, if this exists and is unique. This method is sometimes confused with zeta function regularization.
Zeta function regularization[edit]
If the series
(for positive values of the an) converges for large real s and can beanalytically continued along the real line to s=−1, then its value at s=−1 is called the zeta regularized sum of the series a1+a2+... Zeta function regularization is nonlinear. In applications, the numbers ai are sometimes the eigenvalues of a self-adjoint operator A with compact resolvant, andf(s) is then the trace of A−s. For example, if A has eigenvalues 1, 2, 3, ... then f(s) is the Riemann zeta function, ζ(s), whose value at s=−1 is −1/12, assigning a value to the divergent series is 1 + 2 + 3 + 4 + ⋯. Other values of s can also be used to assign values for the divergent sums ζ(0)=1 + 1 + 1 + ... = -1/2, ζ(-2)=1 + 4 + 9 + ... = 0 and in general , where Bk is a Bernoulli number.[3]
Integral function means[edit]
If J(x)=Σpnxn is an integral function, then the J sum of the series a0+... is defined to be
if this limit exists.
There is a variation of this method where the series for J has a finite radius of convergence r and diverges at x=r. In this case one defines the sum as above, except taking the limit as x tends to r rather than infinity.
Borel summation[edit]
In the special case when J(x)=ex this gives one (weak) form of Borel summation.
Valiron's method[edit]
Valiron's method is a generalization of Borel summation to certain more general integral functions J. Valiron showed that under certain conditions it is equivalent to defining the sum of a series as
where H is the second derivative of G and c(n)=e−G(n).
Moment methods[edit]
Suppose that dμ is a measure on the real line such that all the moments
are finite. If a0+a1+... is a series such that
converges for all x in the support of μ, then the (dμ) sum of the series is defined to be the value of the integral
if it is defined. (Note that if the numbers μn increase too rapidly then they do not uniquely determine the measure μ.)
Borel summation[edit]
For example, if dμ = e−xdx for positive x and 0 for negative x then μn = n!, and this gives one version of Borel summation, where the value of a sum is given by
There is a generalization of this depending on a variable α, called the (B',α) sum, where the sum of a series a0+... is defined to be
if this integral exists. A further generalization is to replace the sum under the integral by its analytic continuation from small t.
Miscellaneous methods[edit]
Hausdorff transformations[edit]
Hardy (1949, chapter 11).
Hölder summation[edit]
Main article: Hölder summation
Hutton's method[edit]
In 1812 Hutton introduced a method of summing divergent series by starting with the sequence of partial sums, and repeated applying the operation of replacing a sequence s0, s1, ... by the sequence of averages (s0+ s1)/2, (s1+ s2)/2, ..., and then taking the limit (Hardy 1949, p. 21).
Ingham summability[edit]
The series a1+... is called Ingham summable to s if
.
Albert Ingham showed that if δ is any positive number then (C,−δ) (Cesaro) summability implies Ingham summability, and Ingham summability implies (C,δ) summability Hardy (1949, Appendix II).
Lambert summability[edit]
The series a1+... is called Lambert summable to s if
.
If a series is (C,k) (Cesaro) summable for any k then it is Lambert summable to the same value, and if a series is Lambert summable then it is Abel summable to the same value Hardy (1949, Appendix II).
Le Roy summation[edit]
The series a0+... is called Le Roy summable to s if
.
Hardy (1949, 4.11)
Mittag-Leffler summation[edit]
The series a0+... is called Mittag-Leffler (M) summable to s if
.
Hardy (1949, 4.11)
Ramanujan summation[edit]
Main article: Ramanujan summation
Ramanujan summation is a method of assigning a value to divergent series used by Ramanujan and based on the Euler–Maclaurin summation formula. The Ramanujan sum of a series f(0) + f(1) + ... depends not only on the values of f at integers, but also on values of the function f at non-integral points, so it is not really a summation method in the sense of this article.
Riemann summability[edit]
The series a1+... is called (R,k) (or Riemann) summable to s if
.
Hardy (1949, 4.17). The series a1+... is called R2 summable to s if
.
Riesz means[edit]
Main article: Riesz mean
If λn form an increasing sequence of real numbers and
then the Riesz (R,λ,κ) sum of the series a0+... is defined to be
Vallée-Poussin summability[edit]
The series a1+... is called VP (or Vallée-Poussin) summable to s if
.
Hardy (1949, 4.17).
See also[edit][/ltr]
[ltr]References[size=13][edit]
[/ltr][/size]
From Wikipedia, the free encyclopedia
[ltr]For the media franchise, see Divergent trilogy.[/ltr]
Les séries divergentes sont en général quelque chose de bien fatal et c’est une honte qu’on ose y fonder aucune démonstration. ("Divergent series are in general something fatal, and it is a disgrace to base any proof on them." Often translated as "Divergent series are an invention of the devil...")
N. H. Abel, letter to Holmboe, January 1826, reprinted in volume 2 of his collected papers.
[ltr]
In mathematics, a divergent series is aninfinite series that is not convergent, meaning that the infinite sequence of the partial sums of the series does not have a finite limit.
If a series converges, the individual terms of the series must approach zero. Thus any series in which the individual terms do not approach zero diverges. However, convergence is a stronger condition: not all series whose terms approach zero converge. The simplest counterexample is the harmonic series
The divergence of the harmonic series was proven by the medieval mathematician Nicole Oresme.
In specialized mathematical contexts, values can be usefully assigned to certain series whose sequence of partial sums diverges. A summability method or summation method is a partial function from the set of sequences of partial sums of series to values. For example, Cesàro summation assigns Grandi's divergent series
the value 1/2. Cesàro summation is an averaging method, in that it relies on the arithmetic mean of the sequence of partial sums. Other methods involve analytic continuations of related series. In physics, there are a wide variety of summability methods; these are discussed in greater detail in the article on regularization.[/ltr]
- 1 History
- 2 Theorems on methods for summing divergent series
- 3 Properties of summation methods
- 4 Classical summation methods
- 4.1 Absolute convergence
- 4.2 Sum of a series
[ltr]
History[edit][/ltr]
... before Cauchy mathematicians asked not 'How shall we define 1−1+1...?' but 'What is 1−1+1...?' and that this habit of mind led them into unnecessary perplexities and controversies which were often really verbal.
G. H. Hardy, Divergent series, page 6
[ltr]
Before the 19th century divergent series were widely used by Euler and others, but often led to confusing and contradictory results. A major problem was Euler's idea that any divergent series should have a natural sum, without first defining what is meant by the sum of a divergent series. Cauchy eventually gave a rigorous definition of the sum of a (convergent) series, and for some time after this divergent series were mostly excluded from mathematics. They reappeared in 1886 with Poincaré's work on asymptotic series. In 1890 Cesaro realized that one could give a rigorous definition of the sum of some divergent series, and defined Cesaro summation. (This was not the first use of Cesaro summation which was used implicitly by Frobenius in 1880; Cesaro's key contribution was not the discovery of this method but his idea that one should give an explicit definition of the sum of a divergent series.) In the years after Cesaro's paper several other mathematicians gave other definitions of the sum of a divergent series, though these are not always compatible: different definitions can give different answers for the sum of the same divergent series, so when talking about the sum of a divergent series it is necessary to specify which summation method one is using.
Theorems on methods for summing divergent series[edit]
A summability method M is regular if it agrees with the actual limit on allconvergent series. Such a result is called an abelian theorem for M, from the prototypical Abel's theorem. More interesting and in general more subtle are partial converse results, called tauberian theorems, from a prototype proved by Alfred Tauber. Here partial converse means that if Msums the series Σ, and some side-condition holds, then Σ was convergent in the first place; without any side condition such a result would say that Monly summed convergent series (*** it useless as a summation method for divergent series).
The operator giving the sum of a convergent series is linear, and it follows from the Hahn–Banach theorem that it may be extended to a summation method summing any series with bounded partial sums. This fact is not very useful in practice since there are many such extensions, inconsistentwith each other, and also since proving such operators exist requires invoking the axiom of choice or its equivalents, such as Zorn's lemma. They are therefore nonconstructive.
The subject of divergent series, as a domain of mathematical analysis, is primarily concerned with explicit and natural techniques such as Abel summation, Cesàro summation and Borel summation, and their relationships. The advent of Wiener's tauberian theorem marked an epoch in the subject, introducing unexpected connections to Banach algebramethods in Fourier analysis.
Summation of divergent series is also related to extrapolation methods andsequence transformations as numerical techniques. Examples for such techniques are Padé approximants, Levin-type sequence transformations, and order-dependent mappings related to renormalization techniques for large-order perturbation theory in quantum mechanics.
Properties of summation methods[edit]
Summation methods usually concentrate on the sequence of partial sums of the series. While this sequence does not converge, we may often find that when we take an average of larger and larger initial terms of the sequence, the average converges, and we can use this average instead of a limit to evaluate the sum of the series. So in evaluatinga = a0 + a1 + a2 + ..., we work with the sequence s, where s0 = a0 andsn+1 = sn + an+1. In the convergent case, the sequence s approaches the limit a. A summation method can be seen as a function from a set of sequences of partial sums to values. If A is any summation method assigning values to a set of sequences, we may mechanically translate this to a series-summation method AΣ that assigns the same values to the corresponding series. There are certain properties it is desirable for these methods to possess if they are to arrive at values corresponding to limits and sums, respectively.[/ltr]
- Regularity. A summation method is regular if, whenever the sequence s converges to x, A(s) = x. Equivalently, the corresponding series-summation method evaluates AΣ(a) = x.
- Linearity. A is linear if it is a linear functional on the sequences where it is defined, so that A(k r + s) = k A(r) + A(s) for sequencesr, s and a real or complex scalar k. Since the terms an = sn+1 − snof the series a are linear functionals on the sequence s and vice versa, this is equivalent to AΣ being a linear functional on the terms of the series.
- Stability. If s is a sequence starting from s0 and s′ is the sequence obtained by omitting the first value and subtracting it from the rest, so that s′n = sn+1 − s0, then A(s) is defined if and only if A(s′) is defined, and A(s) = s0 + A(s′). Equivalently, whenever a′n = an+1 for all n, then AΣ(a) = a0 + AΣ(a′).[1][2]
[ltr]
The third condition is less important, and some significant methods, such as Borel summation, do not possess it.[citation needed]
One can also give a weaker alternative to the last condition.[/ltr]
- Finite Re-indexability. If a and a′ are two series such that there exists a bijection such that ai = a′f(i) for all i, and if there exists some such that ai = a′i for all i > N, thenAΣ(a) = AΣ(a′). (In other words, a′ is the same series as a, with only finitely many terms re-indexed.) Note that this is a weaker condition than Stability, because any summation method that exhibitsStability also exhibits Finite Re-indexability, but the converse is not true.
[ltr]
A desirable property for two distinct summation methods A and B to share is consistency: A and B are consistent if for every sequence s to which both assign a value, A(s) = B(s). If two methods are consistent, and one sums more series than the other, the one summing more series isstronger.
There are powerful numerical summation methods that are neither regular nor linear, for instance nonlinear sequence transformations like Levin-type sequence transformations and Padé approximants, as well as the order-dependent mappings of perturbative series based on renormalizationtechniques.
Taking regularity, linearity and stability as axioms, it is possible to sum many divergent series by elementary algebraic manipulations. This partly explains why many different summation methods give the same answer for certain series.
For instance, whenever r ≠ 1, the geometric series
can be evaluated regardless of convergence. More rigorously, any summation method that possesses these properties and which assigns a finite value to the geometric series must assign this value. However, when ris a real number larger than 1, the partial sums increase without bound, and averaging methods assign a limit of ∞.
Classical summation methods[edit]
The two classical summation methods for series, ordinary convergence and absolute convergence, define the sum as a limit of certain partial sums. Strictly speaking these are not really summation methods for divergent series, as by definition a series is divergent only if these methods do not work, but are included for completeness. Most but not all summation methods for divergent series extend these methods to a larger class of sequences.
Absolute convergence[edit]
Absolute convergence defines the sum of a sequence (or set) of numbers to be the limit of the net of all partial sums ak[sub]1[/sub]+ ...+ak[sub]n[/sub], if it exists. It does not depend on the order of the elements of the sequence, and a classical theorem says that a sequence is absolutely convergent if and only if the sequence of absolute values is convergent in the standard sense.
Sum of a series[edit]
Cauchy's classical definition of the sum of a series a0+a1+... defines the sum to be the limit of the sequence of partial sums a0+ ...+an. This is the default definition of convergence of a sequence.
Nørlund means[edit]
Suppose pn is a sequence of positive terms, starting from p0. Suppose also that
If now we transform a sequence s by using p to give weighted means, setting
then the limit of tn as n goes to infinity is an average called the Nørlundmean Np(s).
The Nørlund mean is regular, linear, and stable. Moreover, any two Nørlund means are consistent.
Cesàro summation[edit]
The most significant of the Nørlund means are the Cesàro sums. Here, if we define the sequence pk by
then the Cesàro sum Ck is defined by Ck(s) = N(pk)(s). Cesàro sums are Nørlund means if k ≥ 0, and hence are regular, linear, stable, and consistent. C0 is ordinary summation, and C1 is ordinary Cesàro summation. Cesàro sums have the property that if h > k, then Ch is stronger than Ck.
Abelian means[edit]
Suppose λ = {λ0, λ1, λ2, ...} is a strictly increasing sequence tending towards infinity, and that λ0 ≥ 0. Suppose
converges for all real numbers x>0. Then the Abelian mean Aλ is defined as
More generally, if the series for f only converges for large x but can be analytically continued to all positive real x, then one can still define the sum of the divergent series by the limit above.
A series of this type is known as a generalized Dirichlet series; in applications to physics, this is known as the method of heat-kernel regularization.
Abelian means are regular and linear, but not stable and not always consistent between different choices of λ. However, some special cases are very important summation methods.
Abel summation[edit]
See also: Abel's theorem
If λn = n, then we obtain the method of Abel summation. Here
where z = exp(−x). Then the limit of ƒ(x) as x approaches 0 through positive reals is the limit of the power series for ƒ(z) as z approaches 1 from below through positive reals, and the Abel sum A(s) is defined as
Abel summation is interesting in part because it is consistent with but more powerful than Cesàro summation: A(s) = Ck(s) whenever the latter is defined. The Abel sum is therefore regular, linear, stable, and consistent with Cesàro summation.
Lindelöf summation[edit]
If λn = n log(n), then (indexing from one) we have
Then L(s), the Lindelöf sum (Volkov 2001), is the limit of ƒ(x) as x goes to zero. The Lindelöf sum is a powerful method when applied to power series among other applications, summing power series in the Mittag-Leffler star.
If g(z) is analytic in a disk around zero, and hence has a Maclaurin seriesG(z) with a positive radius of convergence, then L(G(z)) = g(z) in the Mittag-Leffler star. Moreover, convergence to g(z) is uniform on compact subsets of the star.
Analytic continuation[edit]
Several summation methods involve taking the value of an analytic continuation of a function.
Analytic continuation of power series[edit]
If Σanxn converges for small complex x and can be analytically continued along some path from x=0 to the point x=1, then the sum of the series can be defined to be the value at x=1. This value may depend on the choice of path.
Euler summation[edit]
Main article: Euler summation
Euler summation is essentially an explicit form of analytic continuation. If a power series converges for small complex z and can be analytically continued to the open disk with diameter from −1/(q+1) to 1 and is continuous at 1, then its value at is called the Euler or (E,q) sum of the series a0+.... Euler used it before analytic continuation was defined in general, and gave explicit formulas for the power series of the analytic continuation.
The operation of Euler summation can be repeated several times, and this is essentially equivalent to taking an analytic continuation of a power series to the point z=1.
Analytic continuation of Dirichlet series[edit]
This method defines the sum of a series to be the value of the analytic continuation of the Dirichlet series
at s=0, if this exists and is unique. This method is sometimes confused with zeta function regularization.
Zeta function regularization[edit]
If the series
(for positive values of the an) converges for large real s and can beanalytically continued along the real line to s=−1, then its value at s=−1 is called the zeta regularized sum of the series a1+a2+... Zeta function regularization is nonlinear. In applications, the numbers ai are sometimes the eigenvalues of a self-adjoint operator A with compact resolvant, andf(s) is then the trace of A−s. For example, if A has eigenvalues 1, 2, 3, ... then f(s) is the Riemann zeta function, ζ(s), whose value at s=−1 is −1/12, assigning a value to the divergent series is 1 + 2 + 3 + 4 + ⋯. Other values of s can also be used to assign values for the divergent sums ζ(0)=1 + 1 + 1 + ... = -1/2, ζ(-2)=1 + 4 + 9 + ... = 0 and in general , where Bk is a Bernoulli number.[3]
Integral function means[edit]
If J(x)=Σpnxn is an integral function, then the J sum of the series a0+... is defined to be
if this limit exists.
There is a variation of this method where the series for J has a finite radius of convergence r and diverges at x=r. In this case one defines the sum as above, except taking the limit as x tends to r rather than infinity.
Borel summation[edit]
In the special case when J(x)=ex this gives one (weak) form of Borel summation.
Valiron's method[edit]
Valiron's method is a generalization of Borel summation to certain more general integral functions J. Valiron showed that under certain conditions it is equivalent to defining the sum of a series as
where H is the second derivative of G and c(n)=e−G(n).
Moment methods[edit]
Suppose that dμ is a measure on the real line such that all the moments
are finite. If a0+a1+... is a series such that
converges for all x in the support of μ, then the (dμ) sum of the series is defined to be the value of the integral
if it is defined. (Note that if the numbers μn increase too rapidly then they do not uniquely determine the measure μ.)
Borel summation[edit]
For example, if dμ = e−xdx for positive x and 0 for negative x then μn = n!, and this gives one version of Borel summation, where the value of a sum is given by
There is a generalization of this depending on a variable α, called the (B',α) sum, where the sum of a series a0+... is defined to be
if this integral exists. A further generalization is to replace the sum under the integral by its analytic continuation from small t.
Miscellaneous methods[edit]
Hausdorff transformations[edit]
Hardy (1949, chapter 11).
Hölder summation[edit]
Main article: Hölder summation
Hutton's method[edit]
In 1812 Hutton introduced a method of summing divergent series by starting with the sequence of partial sums, and repeated applying the operation of replacing a sequence s0, s1, ... by the sequence of averages (s0+ s1)/2, (s1+ s2)/2, ..., and then taking the limit (Hardy 1949, p. 21).
Ingham summability[edit]
The series a1+... is called Ingham summable to s if
.
Albert Ingham showed that if δ is any positive number then (C,−δ) (Cesaro) summability implies Ingham summability, and Ingham summability implies (C,δ) summability Hardy (1949, Appendix II).
Lambert summability[edit]
The series a1+... is called Lambert summable to s if
.
If a series is (C,k) (Cesaro) summable for any k then it is Lambert summable to the same value, and if a series is Lambert summable then it is Abel summable to the same value Hardy (1949, Appendix II).
Le Roy summation[edit]
The series a0+... is called Le Roy summable to s if
.
Hardy (1949, 4.11)
Mittag-Leffler summation[edit]
The series a0+... is called Mittag-Leffler (M) summable to s if
.
Hardy (1949, 4.11)
Ramanujan summation[edit]
Main article: Ramanujan summation
Ramanujan summation is a method of assigning a value to divergent series used by Ramanujan and based on the Euler–Maclaurin summation formula. The Ramanujan sum of a series f(0) + f(1) + ... depends not only on the values of f at integers, but also on values of the function f at non-integral points, so it is not really a summation method in the sense of this article.
Riemann summability[edit]
The series a1+... is called (R,k) (or Riemann) summable to s if
.
Hardy (1949, 4.17). The series a1+... is called R2 summable to s if
.
Riesz means[edit]
Main article: Riesz mean
If λn form an increasing sequence of real numbers and
then the Riesz (R,λ,κ) sum of the series a0+... is defined to be
Vallée-Poussin summability[edit]
The series a1+... is called VP (or Vallée-Poussin) summable to s if
.
Hardy (1949, 4.17).
See also[edit][/ltr]
[ltr]References[size=13][edit]
[/ltr][/size]
- Jump up^ see Michon's Numericanahttp://www.numericana.com/answer/sums.htm
- Jump up^ see also Translativity at The Encyclopedia of Mathematics wiki (Springer) [1]
- Jump up^ Tao, Terence (10 April 2010). "The Euler-Maclaurin formula, Bernoulli numbers, the zeta function, and real-variable analytic continuation".
- Arteca, G.A.; Fernández, F.M.; Castro, E.A. (1990), Large-Order Perturbation Theory and Summation Methods in Quantum Mechanics, Berlin: Springer-Verlag.
- Baker, Jr., G. A.; Graves-Morris, P. (1996), Padé Approximants, Cambridge University Press.
- Brezinski, C.; Zaglia, M. Redivo (1991), Extrapolation Methods. Theory and Practice, North-Holland.
- Hardy, G. H. (1949), Divergent Series, Oxford: Clarendon Press.
- LeGuillou, J.-C.; Zinn-Justin, J. (1990), Large-Order Behaviour of Perturbation Theory, Amsterdam: North-Holland.
- Volkov, I.I. (2001), "Lindelöf summation method", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4.
- Zakharov, A.A. (2001), "Abel summation method", in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4.
|
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
3. The Power of Combinatorics
Hopf algebra is invading quantum field theory from both ends, both at
the foundational level and the computational level. . . The approach from
quantum theoretical first principle is still in its first infancy.
H´ector Figueroa and Jos´e Gracia-Bondia, 2005
In this series of monographs, we will show that:
There are highly complex mathematical structures behind the idea of the
renormalization of quantum field theories.
The combinatorial structure of Feynman diagrams lies at the heart of renormalization
methods. In the standard Boguliubov–Parasiuk–Hepp–Zimmermann (BPHZ)
approach, the regularization of algebraic Feynman integrals is carried out by an
iterative method which was invented by Bogoliubov in the 1950s. It was shown by
Zimmermann in 1969 that Bogoliubov’s iterative method can be solved in a closed
form called the Zimmermann forest formula. Finally, it was discovered by Kreimer
in 1998 that Zimmermann’s forest formula can be formulated by using the coinverse
of an appropriate Hopf algebra for Feynman graphs. This will be thoroughly
studied later on. In this chapter, we only want to discuss some basic ideas about
Hopf algebras and Rota–Baxter algebras.
3.1 Algebras
Products play a fundamental role in quantum field theory (e.g., normal
products, time-ordered products, retarded products). They are used in
order to construct correlation functions.
Folklore
Algebras are linear spaces equipped with a distributive multiplication. Let us discuss
this.
The algebra of smooth functions as a prototype. Fix n = 1, 2, . . . Recall
that E(RN) denotes the set of all smooth complex-valued functions f : RN → C.
We write A instead of E(RN). For all functions f, g ∈ A and all complex numbers
α, β, we define the linear combination αf + βg and the product fg by setting for
all x ∈ RN:
• (αf + βg)(x) := αf(x) + βg(x),
• (fg)(x) := f(x)g(x).
In addition, we define the so-called unit element 1 by setting
• 1(x) := 1 for all x ∈ RN.
Then for all f, g, h ∈ A and all α, β ∈ C, the following hold:
(A1) Linearity: The set A is a complex linear space.
(A2) Consistency: fg ∈ A, and (αf)g = f(αg) = α(fg).
(A3) Distributivity: (αf + βg)h = αfh + βfh, and
h(αf + βg) = αhf + βhg.
(A4) Associativity: (fg)h = f(gh).
(A5) Commutativity: fg = gf.
(A6) Unitality: There exists precisely one element 1 in A such that
1f = f1 = f for all f ∈ A.
Here, 1 is called the unit element of A.
The definition of an algebra. As we will show later on, algebras play a fundamental
role in the mathematical description of quantum processes. By definition,
the set A is called a complex algebra iff for all f, g ∈ A and all complex numbers
α and β, the linear combination αf + βg and the product fg are defined in such a
way that the conditions (A1), (A2), and (A3) are always satisfied.2 In addition, we
use the following terminology.
• The algebra A is called associative iff condition (A4) is always satisfied.
• The algebra A is called commutative iff condition (A5) is always satisfied.
• The algebra A is called unital iff condition (A6) is satisfied.
For example, the space D(RN) of smooth test functions f : RN → C with compact
support is a complex algebra. This algebra is associative and commutative. The
same is true for the spaces E(RN) and S(RN) of test functions.3 In addition, the
space E(RN) is unital.
For fixed n = 2, 3, . . . , the set of complex (n × n)-matrices forms a complex
algebra which is associative, noncommutative, and unital. Here, the unit element
is given by the unit matrix I := diag(1, 1, . . . , 1).
A subset B of a complex algebra A is called a subalgebra iff it is an algebra
with respect to the operations induced by A. Explicitly, this means that if f, g ∈ B
and α, β ∈ C, then αf + βg ∈ B and fg ∈ B.
Algebra morphism. Let A and B be algebras over C. The map
χ : A → B
is called an algebra morphism iff it respects linear combinations and products, that
is, the map χ is linear and we have χ(fg) = χ(f)χ(g) for all f, g ∈ A. Bijective
algebra morphisms are also called algebra isomorphisms.
The map S : A → B is called an algebra anti-morphism iff it is linear and we
have S(fg) = S(g)S(f) for all f, g ∈ A.
Modification. Real algebras (also called algebras over R) are defined analogously.
We only replace the field C of complex numbers by the field R of real
numbers.
Perspectives. In this series of monographs, algebras will be encountered quite
often. Let us mention the following examples:
• the Hopf algebra of linear differential operators (Sect. 3.3.2);
• Hopf algebras, formal power series expansions, and renormalization (Sect. 3.4);
• symmetries and Lie groups; linearized symmetries and Lie algebras (Vols. I–VI);
• the algebra of multilinear functionals (Sect. 3.2);
• the tensor algebra of a linear space (Vol. III);
• the algebra of symmetric multilinear functionals and the symmetric algebra of a
linear space (Vol. III);
• the algebra of antisymmetric multilinear functionals and the Grassmann (or exterior)
algebra of a linear space (Vol. III);
• the Clifford (or inner) algebras of a linear space equipped with a bilinear form
(Vol. III);
• the enveloping algebra of a Lie algebra (Vol. III).
Concerning applications to physics, we mention the following:
• the convolution algebra and the Heaviside calculus for computing electric circuits
in electrical engineering (Sect. 4.2);
• Lie algebras in classical mechanics based on Poisson brackets (Sect. 6.9.2);
• Lie super algebras and the supersymmetry of elementary particles and strings
(Sect. 7.21 and Vols. III–VI);
• the ∗-algebra approach to physics (classical mechanics, statistical physics, quantum
physics) (Sect. 7.17);
• ∗-algebras, C
∗
-algebras, and von Neumann algebras (Sect. 7.18 and Vol. IV);
• Clifford algebras and the Dirac equation for fermions (Vol. III);
• C
∗
-algebras and quantum information (Vol. IV);
• local nets of operator algebras and the algebraic approach to quantum field theory
due to Haag and Kastler (Vols. IV–VI);
• operator algebras and spectral theory of observables in quantum physics (the
Gelfand theory) (Vol. IV);
• the Gelfand–Naimark theorem and noncommutative geometry (Vol. IV);
• the Connes–Kreimer–Moscovici Hopf algebra and renormalization (Vol. IV);
• operator algebras and quantum gravity (Vol. VI).
Roughly speaking, products and hence algebras are everywhere.
3.2 The Algebra of Multilinear Functionals
Multilinear algebra studies all kinds of products.
Folklore
In what follows we want to study the elements of multilinear algebra, which play
a crucial role in modern physics. The main tool are multilinear functionals. The
tensor product ⊗ and the Grassmann product ∧ correspond to special multilinear
functionals which are called decomposable. A special role is played by symmetric
and antisymmetric multilinear functionals, which is related to bosons and fermions
in elementary particle physics, respectively.
In this section, the symbols X, Y,Z,Xα denote linear spaces over K. Here, we
choose K = R and K = C; this corresponds to real and complex linear spaces,
respectively. The index α runs in the nonempty index set A. For the basic definitions
concerning linear spaces, we refer to Sect. 7.3 of Vol. I. Recall that:
Two finite-dimensional real (resp. complex) linear spaces are linear isomorphic
iff they have the same dimension.
This well-known theorem from the basic course in linear algebra essentially simplifies
the theory of finite-dimensional linear spaces. It shows that a finite-dimensional
linear space can be described by a single invariant, namely, its dimension. Note
that an analogous theorem for infinite-dimensional linear spaces is not true.
The tensor product A⊗B of two algebras. Let A and B be algebras over K
with K = R,C. Since A and B are linear spaces over K, we have the tensor product
A⊗B at hand. In addition, we define the product
(a ⊗ b)(c ⊗ d) := ac ⊗ bd
for all a, c ∈ A and all b, d ∈ B. In a natural way, this definition can be extended
to expressions of the form (3.3).
Proposition 3.4 The tensor product A⊗B is an algebra over K.
The proof will be given in Problem 4.17 on page 261. We have to show that the
product on A × B does not depend on the choice of the representations (3.3). To
this end, we will use the language of equivalence classes.
Tensor products will be studied in greater detail in Vol. III. In terms of physics,
tensor products are used in order to describe composite particles. For example, the
tensor product ϕ ⊗ ψ of the two states ϕ and ψ of single particles is the state of
the composite particle. In terms of mathematics, tensor products are used in order
to reduce multilinear functionals to linear operators on tensor products.
3.3 Fusion, Splitting, and Hopf Algebras
In nature, one observes fusion and splitting of physical states. From the
mathematical point of view, this corresponds to products and coproducts
of Hopf algebras, respectively.
Folklore
In this section, we will use tensor products in order to define Hopf algebras. In
particular, the language of tensor products will tell us why coassociativity (CA)
and counitality (CU) are dual concepts to associativity (A) and unitality (U) of
algebras (see page 128).
3.3.2 The Definition of Hopf Algebras
The canonical morphisms of an associative unital algebra. Let A be an
associative unital complex algebra A with the unit element 1. For all a, b ∈ A and
all complex numbers z, we define the following maps:
(i) Multiplication map: μ(a ⊗ b) := ab.
(ii) Unitality map: η(z) := z1.
(iii) Identical map: id(a) := a.
Using linear extension, we get the following three algebra morphisms:
μ : A⊗A →A, η: C→A, id :A→A.
These so-called three canonical morphisms of the algebra A have the following
properties:
(A) Associativity: μ(μ ⊗ id) = μ(id ⊗ μ).
(U) Unitality: μ(η ⊗ id) = μ(id ⊗ η) = id.
Let us prove this. Relation (A) follows from the associative law a(bc) = (ab)c for
all a, b, c ∈ A. In fact,
μ(id ⊗ μ)(a ⊗ b ⊗ c) = μ(a ⊗ μ(b ⊗ c)) = μ(a ⊗ bc) = a(bc).
Similarly, μ(μ ⊗ id)(a ⊗ b ⊗ c) = (ab)c. This proves (A). Relation (U) follows from
μ(η ⊗ id)(1 ⊗ a) = μ(η(1) ⊗ a) = μ(1 ⊗ a) = 1a = a.
Similarly, μ(id ⊗ η)(a ⊗ 1) = a1 = a. This yields (U) if we identify a ⊗ 1 and 1⊗ a
with a. This corresponds to the isomorphisms A⊗C = A = C⊗A.
Dualization and the definition of bialgebras. It is our goal to dualize the
relations (A) and (U) above by using the replacement
μ ⇒ Δ, η ⇒ ε
and by commuting the factors. This way, we obtain the following two dual relations:
(CA) Coassociativity: (id ⊗ Δ)Δ = (Δ ⊗ id)Δ.
(CU) Counitality: (id ⊗ ε)Δ = (ε ⊗ id)Δ = id.
Let A be an associative unital complex algebra. Such an algebra is called a complex
bialgebra iff there exist two algebra morphisms
(i) Δ :A→A⊗A (coproduct) and
(ii) ε :A→C (counit)
such that the conditions (CA) and (CU) are satisfied. The counitality map ε is also
called the augmentation map.
The definition of Hopf algebras. The complex bialgebra A is called a Hopf
algebra iff there exists a linear map S :A→A such that
μ(S ⊗ id)Δ = μ(id ⊗ S)Δ = ηε. (3.15)
This condition looks strange at the first glance. However, we will show below that
this is a very natural condition in terms of both Sweedler’s notation and a convolution
on the space of linear operators L(A,A) on the algebra A.
The coinverse is also called
antipode.
As we will show later on, the coinverse lies at the heart of renormalization
in quantum field theory.
Commutative and cocommutative Hopf algebras. The Hopf algebra A is
called commutative iff ab = ba for all a, b ∈ A.
Proposition 3.5 The convolution on L(A,A) is associative and has the unit element
ηε.
Explicitly, this means that, for all B,C,D ∈ L(A,A), we have
(B ∗ C) ∗ D = B ∗ (C ∗ D), B∗ ηε = ηε ∗ B = B.
The proof can be found in Problem 3.7 on page 170. Using this notion of convolution,
the defining relation (3.15) of the coinverse S can be elegantly written as
S ∗ id = id ∗ S = ηε
where id is the identical map on A. This means the following:
The complex bialgebra A is a Hopf algebra iff there exists a linear map
S : A → A which is the two-sided inverse of the identical map on A for
the convolution on L(A,A).
Historical remarks. Hopf algebras were studied first by Heinz Hopf (1894–
1971) in 1941 in order to compute the cohomology of Lie groups and more general
topological spaces.
H. Hopf, On the topology of group manifolds and its generalizations, Ann.
Math. 42 (1941), 22–52 (in German).
This can be found in
E. Spanier, Algebraic Topology, Springer, New York, 1989.
In what follows, we will study the relation of Hopf algebras to both
• power series expansions and
• symmetry.
Roughly speaking, Hopf algebras are frequently used in order to carry out sophisticated
computations for problems where a nontrivial symmetry is behind. Using the
language of commutative diagrams, it turns out that a bialgebra can be understood
best as a mathematical concept which combines the concept of algebra with its dual
concept. One only has to reinverse the arrows in the commutative diagrams of an
algebra. This will be thoroughly considered in Problem 3.4 on page 168. Hopf algebras
are bialgebras which carry the additional structure of a coinverse. Typically,
the coinverse comes from dualizing the inverse of a group structure (see Sect. 3.5.2).
3.4 Power Series Expansion and Hopf Algebras
3.4.1 The Importance of Cancellations
The big surprise in renormalization theory is the appearance of unexpected
huge cancellations in the lengthy computations.
Folklore
It happens quite often in mathematics and physics that extremely complicated
long expressions dramatically simplify by rearranging them as alternating sums
and by cancelling the alternating terms.
As a nontrivial example, let us mention that the elegant heat-kernel approach to
the sophisticated Atiyah–Singer index theorem for elliptic differential operators
on compact manifolds was discovered by Atiyah, Bott, and Patodi in 1973; they
noticed completely unexpected cancellations in long formulas related to the spectral
geometry on manifolds.8 It is typical for topology that topological invariants are
related to alternating sums. The prototype is the Euler characteristic (see Sect. 5.6.2
in Vol. I). Unexpected cancellations are also typical for quite lengthy computations
in renormalization theory. The experience of mathematicians and physicists shows
that symmetries are behind cancellations. It turns out that the cancellations in
renormalization theory can be based on Hopf algebras. As a prototype, we want to
study the Fa`a di Bruno Hopf algebra related to the local diffeomorphism group in
the complex plane. As a preparation for this, we need two classical formulas for the
coefficients of power series expansions, namely,
• the Lagrange inversion formula (3.20) related to the famous Kepler equation
(3.19) in celestial mechanics, and
• the Fa`a di Bruno composition formula (3.30).
Let us study these two formulas first.
The Generalized Zimmermann Forest Formula
The coinverse (also called antipode) of Hopf algebras allows us to elegantly
describe complicated inversion processes in mathematics and physics.
Folklore
It turned out that the whole iterative and intricate structure of renormalization
theory could be mapped to the theory of Hopf algebras, with
Zimmermann’s forest formula for the counterterm coming along as antipode.
Dirk Kreimer, 1994
Correlation Functions in Quantum Field Theory
In the perturbative approach to quantum field theory, combinatorial formulas
are used for computing correlation functions of interacting quantum
fields by means of simpler correlation functions of free fields.
Folklore
3.4.8 Random Variables, Moments, and Cumulants
Generating functions for the moments of random variables represent a
basic tool in quantum field theory.
Folklore
The following considerations are basic for the theory of random variables. Let X
and Y be two independent random variables. Then, for the mean values and the
mean fluctuations, we have the following additivity property
X + Y = X + Y , (ΔX + ΔY )(ΔX + ΔY ) = (ΔX)(ΔX) + (ΔY )(ΔY ).
We want to generalize this to higher moments. The basic idea is to pass from the
multiplicative family of moments to the additive family of cumulants by using the
logarithmic function. In terms of algebra, the passage from cumulants to moments
is given by Schur polynomials. In quantum field theory, the passage from moments
to cumulants corresponds to the passage from correlation functions (i.e., Green’s
functions) to reduced correlation functions (i.e., connected Green’s functions). In
terms of Feynman diagrams, this corresponds to a passage from general Feynman
graphs to connected Feynman graphs (see (3.41)). The proofs for the following
statements can be found in Sect. II.12 of the textbook by A. Shiryaev, Probability,
Springer, New York, 1996.
The moment problem played an important role in the development of measure
theory and functional analysis. In particular, the Riesz–Markov representation theorem
(about linear continuous functionals on spaces of real continuous functions
defined on compact sets) and the Hahn–Banach theorem (on the extension of linear
continuous functionals in Banach spaces) were proved for solving the moment
problem. This fascinating history can be found in J. Dieudonn´e, History of Functional
Analysis, 1900–1975, North-Holland, Amsterdam, 1983. We also refer to J.
Shohat and J. Tamarkin, The Problem of Moments, New York, 1950, and P. Lax,
Functional Analysis, Sect. 33.5, Wiley, New York, 2002.
Hopf algebra is invading quantum field theory from both ends, both at
the foundational level and the computational level. . . The approach from
quantum theoretical first principle is still in its first infancy.
H´ector Figueroa and Jos´e Gracia-Bondia, 2005
In this series of monographs, we will show that:
There are highly complex mathematical structures behind the idea of the
renormalization of quantum field theories.
The combinatorial structure of Feynman diagrams lies at the heart of renormalization
methods. In the standard Boguliubov–Parasiuk–Hepp–Zimmermann (BPHZ)
approach, the regularization of algebraic Feynman integrals is carried out by an
iterative method which was invented by Bogoliubov in the 1950s. It was shown by
Zimmermann in 1969 that Bogoliubov’s iterative method can be solved in a closed
form called the Zimmermann forest formula. Finally, it was discovered by Kreimer
in 1998 that Zimmermann’s forest formula can be formulated by using the coinverse
of an appropriate Hopf algebra for Feynman graphs. This will be thoroughly
studied later on. In this chapter, we only want to discuss some basic ideas about
Hopf algebras and Rota–Baxter algebras.
3.1 Algebras
Products play a fundamental role in quantum field theory (e.g., normal
products, time-ordered products, retarded products). They are used in
order to construct correlation functions.
Folklore
Algebras are linear spaces equipped with a distributive multiplication. Let us discuss
this.
The algebra of smooth functions as a prototype. Fix n = 1, 2, . . . Recall
that E(RN) denotes the set of all smooth complex-valued functions f : RN → C.
We write A instead of E(RN). For all functions f, g ∈ A and all complex numbers
α, β, we define the linear combination αf + βg and the product fg by setting for
all x ∈ RN:
• (αf + βg)(x) := αf(x) + βg(x),
• (fg)(x) := f(x)g(x).
In addition, we define the so-called unit element 1 by setting
• 1(x) := 1 for all x ∈ RN.
Then for all f, g, h ∈ A and all α, β ∈ C, the following hold:
(A1) Linearity: The set A is a complex linear space.
(A2) Consistency: fg ∈ A, and (αf)g = f(αg) = α(fg).
(A3) Distributivity: (αf + βg)h = αfh + βfh, and
h(αf + βg) = αhf + βhg.
(A4) Associativity: (fg)h = f(gh).
(A5) Commutativity: fg = gf.
(A6) Unitality: There exists precisely one element 1 in A such that
1f = f1 = f for all f ∈ A.
Here, 1 is called the unit element of A.
The definition of an algebra. As we will show later on, algebras play a fundamental
role in the mathematical description of quantum processes. By definition,
the set A is called a complex algebra iff for all f, g ∈ A and all complex numbers
α and β, the linear combination αf + βg and the product fg are defined in such a
way that the conditions (A1), (A2), and (A3) are always satisfied.2 In addition, we
use the following terminology.
• The algebra A is called associative iff condition (A4) is always satisfied.
• The algebra A is called commutative iff condition (A5) is always satisfied.
• The algebra A is called unital iff condition (A6) is satisfied.
For example, the space D(RN) of smooth test functions f : RN → C with compact
support is a complex algebra. This algebra is associative and commutative. The
same is true for the spaces E(RN) and S(RN) of test functions.3 In addition, the
space E(RN) is unital.
For fixed n = 2, 3, . . . , the set of complex (n × n)-matrices forms a complex
algebra which is associative, noncommutative, and unital. Here, the unit element
is given by the unit matrix I := diag(1, 1, . . . , 1).
A subset B of a complex algebra A is called a subalgebra iff it is an algebra
with respect to the operations induced by A. Explicitly, this means that if f, g ∈ B
and α, β ∈ C, then αf + βg ∈ B and fg ∈ B.
Algebra morphism. Let A and B be algebras over C. The map
χ : A → B
is called an algebra morphism iff it respects linear combinations and products, that
is, the map χ is linear and we have χ(fg) = χ(f)χ(g) for all f, g ∈ A. Bijective
algebra morphisms are also called algebra isomorphisms.
The map S : A → B is called an algebra anti-morphism iff it is linear and we
have S(fg) = S(g)S(f) for all f, g ∈ A.
Modification. Real algebras (also called algebras over R) are defined analogously.
We only replace the field C of complex numbers by the field R of real
numbers.
Perspectives. In this series of monographs, algebras will be encountered quite
often. Let us mention the following examples:
• the Hopf algebra of linear differential operators (Sect. 3.3.2);
• Hopf algebras, formal power series expansions, and renormalization (Sect. 3.4);
• symmetries and Lie groups; linearized symmetries and Lie algebras (Vols. I–VI);
• the algebra of multilinear functionals (Sect. 3.2);
• the tensor algebra of a linear space (Vol. III);
• the algebra of symmetric multilinear functionals and the symmetric algebra of a
linear space (Vol. III);
• the algebra of antisymmetric multilinear functionals and the Grassmann (or exterior)
algebra of a linear space (Vol. III);
• the Clifford (or inner) algebras of a linear space equipped with a bilinear form
(Vol. III);
• the enveloping algebra of a Lie algebra (Vol. III).
Concerning applications to physics, we mention the following:
• the convolution algebra and the Heaviside calculus for computing electric circuits
in electrical engineering (Sect. 4.2);
• Lie algebras in classical mechanics based on Poisson brackets (Sect. 6.9.2);
• Lie super algebras and the supersymmetry of elementary particles and strings
(Sect. 7.21 and Vols. III–VI);
• the ∗-algebra approach to physics (classical mechanics, statistical physics, quantum
physics) (Sect. 7.17);
• ∗-algebras, C
∗
-algebras, and von Neumann algebras (Sect. 7.18 and Vol. IV);
• Clifford algebras and the Dirac equation for fermions (Vol. III);
• C
∗
-algebras and quantum information (Vol. IV);
• local nets of operator algebras and the algebraic approach to quantum field theory
due to Haag and Kastler (Vols. IV–VI);
• operator algebras and spectral theory of observables in quantum physics (the
Gelfand theory) (Vol. IV);
• the Gelfand–Naimark theorem and noncommutative geometry (Vol. IV);
• the Connes–Kreimer–Moscovici Hopf algebra and renormalization (Vol. IV);
• operator algebras and quantum gravity (Vol. VI).
Roughly speaking, products and hence algebras are everywhere.
3.2 The Algebra of Multilinear Functionals
Multilinear algebra studies all kinds of products.
Folklore
In what follows we want to study the elements of multilinear algebra, which play
a crucial role in modern physics. The main tool are multilinear functionals. The
tensor product ⊗ and the Grassmann product ∧ correspond to special multilinear
functionals which are called decomposable. A special role is played by symmetric
and antisymmetric multilinear functionals, which is related to bosons and fermions
in elementary particle physics, respectively.
In this section, the symbols X, Y,Z,Xα denote linear spaces over K. Here, we
choose K = R and K = C; this corresponds to real and complex linear spaces,
respectively. The index α runs in the nonempty index set A. For the basic definitions
concerning linear spaces, we refer to Sect. 7.3 of Vol. I. Recall that:
Two finite-dimensional real (resp. complex) linear spaces are linear isomorphic
iff they have the same dimension.
This well-known theorem from the basic course in linear algebra essentially simplifies
the theory of finite-dimensional linear spaces. It shows that a finite-dimensional
linear space can be described by a single invariant, namely, its dimension. Note
that an analogous theorem for infinite-dimensional linear spaces is not true.
The tensor product A⊗B of two algebras. Let A and B be algebras over K
with K = R,C. Since A and B are linear spaces over K, we have the tensor product
A⊗B at hand. In addition, we define the product
(a ⊗ b)(c ⊗ d) := ac ⊗ bd
for all a, c ∈ A and all b, d ∈ B. In a natural way, this definition can be extended
to expressions of the form (3.3).
Proposition 3.4 The tensor product A⊗B is an algebra over K.
The proof will be given in Problem 4.17 on page 261. We have to show that the
product on A × B does not depend on the choice of the representations (3.3). To
this end, we will use the language of equivalence classes.
Tensor products will be studied in greater detail in Vol. III. In terms of physics,
tensor products are used in order to describe composite particles. For example, the
tensor product ϕ ⊗ ψ of the two states ϕ and ψ of single particles is the state of
the composite particle. In terms of mathematics, tensor products are used in order
to reduce multilinear functionals to linear operators on tensor products.
3.3 Fusion, Splitting, and Hopf Algebras
In nature, one observes fusion and splitting of physical states. From the
mathematical point of view, this corresponds to products and coproducts
of Hopf algebras, respectively.
Folklore
In this section, we will use tensor products in order to define Hopf algebras. In
particular, the language of tensor products will tell us why coassociativity (CA)
and counitality (CU) are dual concepts to associativity (A) and unitality (U) of
algebras (see page 128).
3.3.2 The Definition of Hopf Algebras
The canonical morphisms of an associative unital algebra. Let A be an
associative unital complex algebra A with the unit element 1. For all a, b ∈ A and
all complex numbers z, we define the following maps:
(i) Multiplication map: μ(a ⊗ b) := ab.
(ii) Unitality map: η(z) := z1.
(iii) Identical map: id(a) := a.
Using linear extension, we get the following three algebra morphisms:
μ : A⊗A →A, η: C→A, id :A→A.
These so-called three canonical morphisms of the algebra A have the following
properties:
(A) Associativity: μ(μ ⊗ id) = μ(id ⊗ μ).
(U) Unitality: μ(η ⊗ id) = μ(id ⊗ η) = id.
Let us prove this. Relation (A) follows from the associative law a(bc) = (ab)c for
all a, b, c ∈ A. In fact,
μ(id ⊗ μ)(a ⊗ b ⊗ c) = μ(a ⊗ μ(b ⊗ c)) = μ(a ⊗ bc) = a(bc).
Similarly, μ(μ ⊗ id)(a ⊗ b ⊗ c) = (ab)c. This proves (A). Relation (U) follows from
μ(η ⊗ id)(1 ⊗ a) = μ(η(1) ⊗ a) = μ(1 ⊗ a) = 1a = a.
Similarly, μ(id ⊗ η)(a ⊗ 1) = a1 = a. This yields (U) if we identify a ⊗ 1 and 1⊗ a
with a. This corresponds to the isomorphisms A⊗C = A = C⊗A.
Dualization and the definition of bialgebras. It is our goal to dualize the
relations (A) and (U) above by using the replacement
μ ⇒ Δ, η ⇒ ε
and by commuting the factors. This way, we obtain the following two dual relations:
(CA) Coassociativity: (id ⊗ Δ)Δ = (Δ ⊗ id)Δ.
(CU) Counitality: (id ⊗ ε)Δ = (ε ⊗ id)Δ = id.
Let A be an associative unital complex algebra. Such an algebra is called a complex
bialgebra iff there exist two algebra morphisms
(i) Δ :A→A⊗A (coproduct) and
(ii) ε :A→C (counit)
such that the conditions (CA) and (CU) are satisfied. The counitality map ε is also
called the augmentation map.
The definition of Hopf algebras. The complex bialgebra A is called a Hopf
algebra iff there exists a linear map S :A→A such that
μ(S ⊗ id)Δ = μ(id ⊗ S)Δ = ηε. (3.15)
This condition looks strange at the first glance. However, we will show below that
this is a very natural condition in terms of both Sweedler’s notation and a convolution
on the space of linear operators L(A,A) on the algebra A.
The coinverse is also called
antipode.
As we will show later on, the coinverse lies at the heart of renormalization
in quantum field theory.
Commutative and cocommutative Hopf algebras. The Hopf algebra A is
called commutative iff ab = ba for all a, b ∈ A.
Proposition 3.5 The convolution on L(A,A) is associative and has the unit element
ηε.
Explicitly, this means that, for all B,C,D ∈ L(A,A), we have
(B ∗ C) ∗ D = B ∗ (C ∗ D), B∗ ηε = ηε ∗ B = B.
The proof can be found in Problem 3.7 on page 170. Using this notion of convolution,
the defining relation (3.15) of the coinverse S can be elegantly written as
S ∗ id = id ∗ S = ηε
where id is the identical map on A. This means the following:
The complex bialgebra A is a Hopf algebra iff there exists a linear map
S : A → A which is the two-sided inverse of the identical map on A for
the convolution on L(A,A).
Historical remarks. Hopf algebras were studied first by Heinz Hopf (1894–
1971) in 1941 in order to compute the cohomology of Lie groups and more general
topological spaces.
H. Hopf, On the topology of group manifolds and its generalizations, Ann.
Math. 42 (1941), 22–52 (in German).
This can be found in
E. Spanier, Algebraic Topology, Springer, New York, 1989.
In what follows, we will study the relation of Hopf algebras to both
• power series expansions and
• symmetry.
Roughly speaking, Hopf algebras are frequently used in order to carry out sophisticated
computations for problems where a nontrivial symmetry is behind. Using the
language of commutative diagrams, it turns out that a bialgebra can be understood
best as a mathematical concept which combines the concept of algebra with its dual
concept. One only has to reinverse the arrows in the commutative diagrams of an
algebra. This will be thoroughly considered in Problem 3.4 on page 168. Hopf algebras
are bialgebras which carry the additional structure of a coinverse. Typically,
the coinverse comes from dualizing the inverse of a group structure (see Sect. 3.5.2).
3.4 Power Series Expansion and Hopf Algebras
3.4.1 The Importance of Cancellations
The big surprise in renormalization theory is the appearance of unexpected
huge cancellations in the lengthy computations.
Folklore
It happens quite often in mathematics and physics that extremely complicated
long expressions dramatically simplify by rearranging them as alternating sums
and by cancelling the alternating terms.
As a nontrivial example, let us mention that the elegant heat-kernel approach to
the sophisticated Atiyah–Singer index theorem for elliptic differential operators
on compact manifolds was discovered by Atiyah, Bott, and Patodi in 1973; they
noticed completely unexpected cancellations in long formulas related to the spectral
geometry on manifolds.8 It is typical for topology that topological invariants are
related to alternating sums. The prototype is the Euler characteristic (see Sect. 5.6.2
in Vol. I). Unexpected cancellations are also typical for quite lengthy computations
in renormalization theory. The experience of mathematicians and physicists shows
that symmetries are behind cancellations. It turns out that the cancellations in
renormalization theory can be based on Hopf algebras. As a prototype, we want to
study the Fa`a di Bruno Hopf algebra related to the local diffeomorphism group in
the complex plane. As a preparation for this, we need two classical formulas for the
coefficients of power series expansions, namely,
• the Lagrange inversion formula (3.20) related to the famous Kepler equation
(3.19) in celestial mechanics, and
• the Fa`a di Bruno composition formula (3.30).
Let us study these two formulas first.
The Generalized Zimmermann Forest Formula
The coinverse (also called antipode) of Hopf algebras allows us to elegantly
describe complicated inversion processes in mathematics and physics.
Folklore
It turned out that the whole iterative and intricate structure of renormalization
theory could be mapped to the theory of Hopf algebras, with
Zimmermann’s forest formula for the counterterm coming along as antipode.
Dirk Kreimer, 1994
Correlation Functions in Quantum Field Theory
In the perturbative approach to quantum field theory, combinatorial formulas
are used for computing correlation functions of interacting quantum
fields by means of simpler correlation functions of free fields.
Folklore
3.4.8 Random Variables, Moments, and Cumulants
Generating functions for the moments of random variables represent a
basic tool in quantum field theory.
Folklore
The following considerations are basic for the theory of random variables. Let X
and Y be two independent random variables. Then, for the mean values and the
mean fluctuations, we have the following additivity property
X + Y = X + Y , (ΔX + ΔY )(ΔX + ΔY ) = (ΔX)(ΔX) + (ΔY )(ΔY ).
We want to generalize this to higher moments. The basic idea is to pass from the
multiplicative family of moments to the additive family of cumulants by using the
logarithmic function. In terms of algebra, the passage from cumulants to moments
is given by Schur polynomials. In quantum field theory, the passage from moments
to cumulants corresponds to the passage from correlation functions (i.e., Green’s
functions) to reduced correlation functions (i.e., connected Green’s functions). In
terms of Feynman diagrams, this corresponds to a passage from general Feynman
graphs to connected Feynman graphs (see (3.41)). The proofs for the following
statements can be found in Sect. II.12 of the textbook by A. Shiryaev, Probability,
Springer, New York, 1996.
The moment problem played an important role in the development of measure
theory and functional analysis. In particular, the Riesz–Markov representation theorem
(about linear continuous functionals on spaces of real continuous functions
defined on compact sets) and the Hahn–Banach theorem (on the extension of linear
continuous functionals in Banach spaces) were proved for solving the moment
problem. This fascinating history can be found in J. Dieudonn´e, History of Functional
Analysis, 1900–1975, North-Holland, Amsterdam, 1983. We also refer to J.
Shohat and J. Tamarkin, The Problem of Moments, New York, 1950, and P. Lax,
Functional Analysis, Sect. 33.5, Wiley, New York, 2002.
由一星于2014-08-13, 13:12进行了最后一次编辑,总共编辑了6次
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
霍普夫代数[编辑]
[ltr]在数学中,霍普夫代数是一类双代数,亦即具有相容的结合代数与余代数结构的向量空间,配上一个对极映射,后者推广了群上的逆元运算 。霍普夫代数以数学家海因茨·霍普夫命名,此类结构广见于代数拓扑、群概形、群论、量子群等数学领域。
[/ltr]
[size][ltr]
定义[编辑]
所谓霍普夫代数,是指一个域 上的双代数 ,配上一个线性映射 (称为对极映射),使得下述图表交换:
[/ltr][/size][size][ltr]
利用 Sweedler 记号,此定义亦可表为
对极映射可理解为 对卷积之逆,故其若存在必唯一。当 ,则称 为对合的;交换或余交换霍普夫代数必对合。
根据定义,有限维霍普夫代数的对偶空间也带有自然的霍普夫代数结构。
例子[编辑]
群代数. 设 为群,可赋予群代数 下述霍普夫代数结构:
[/ltr][/size]
[size][ltr]
有限群上的函数. 设 为有限群,置 为所有 的函数,并以逐点的加法与乘法使之成为结合代数。此时有自然的同构 。定义:
[/ltr][/size]
[size][ltr]
仿射代数概形的座标环:处理方式同上。
泛包络代数. 假设 是域 上的李代数,置 为其泛包络代数,定义:
[/ltr][/size]
[size][ltr]
后两条规则与交换子相容,因此可唯一地延拓至整个 上。
李群的上同调[编辑]
李群的上同调代数构成一个霍普夫代数,其代数结构由上同调的上积给出,余代数结构则来自群乘法 ,由此导出
对极映射来自 。这是霍普夫代数的历史起源,事实上,霍普夫借着研究这种结构,得以证明李群上同调的结构定理:
定理(霍普夫,1941年)[1].
设 为 上的有限维分次交换、余交换之霍普夫代数,则 (视为 -代数)同构于由奇数次元素生成的自由外代数。
量子群与非交换几何[编辑]
主条目:量子群
上述所有例子若非交换便是余交换的。另一方面,泛包络代数的某些“变形”或“量子化”可给出非交换亦非余交换的例子;这类霍普夫代数常被称为量子群,尽管严格而言它们并不是群。这类代数在非交换几何中相当重要:一个仿射代数群可以由其座标环构成的霍普夫代数刻划,而这些霍普夫代数的变形则可设想为某类“量子化”了的代数群(实则非群)。
文献[编辑]
[/ltr][/size]
[size][ltr]
注记[编辑]
[/ltr][/size]
[ltr]在数学中,霍普夫代数是一类双代数,亦即具有相容的结合代数与余代数结构的向量空间,配上一个对极映射,后者推广了群上的逆元运算 。霍普夫代数以数学家海因茨·霍普夫命名,此类结构广见于代数拓扑、群概形、群论、量子群等数学领域。
[/ltr]
[size][ltr]
定义[编辑]
所谓霍普夫代数,是指一个域 上的双代数 ,配上一个线性映射 (称为对极映射),使得下述图表交换:
[/ltr][/size][size][ltr]
利用 Sweedler 记号,此定义亦可表为
对极映射可理解为 对卷积之逆,故其若存在必唯一。当 ,则称 为对合的;交换或余交换霍普夫代数必对合。
根据定义,有限维霍普夫代数的对偶空间也带有自然的霍普夫代数结构。
例子[编辑]
群代数. 设 为群,可赋予群代数 下述霍普夫代数结构:
[/ltr][/size]
[size][ltr]
有限群上的函数. 设 为有限群,置 为所有 的函数,并以逐点的加法与乘法使之成为结合代数。此时有自然的同构 。定义:
[/ltr][/size]
[size][ltr]
仿射代数概形的座标环:处理方式同上。
泛包络代数. 假设 是域 上的李代数,置 为其泛包络代数,定义:
[/ltr][/size]
[size][ltr]
后两条规则与交换子相容,因此可唯一地延拓至整个 上。
李群的上同调[编辑]
李群的上同调代数构成一个霍普夫代数,其代数结构由上同调的上积给出,余代数结构则来自群乘法 ,由此导出
对极映射来自 。这是霍普夫代数的历史起源,事实上,霍普夫借着研究这种结构,得以证明李群上同调的结构定理:
定理(霍普夫,1941年)[1].
设 为 上的有限维分次交换、余交换之霍普夫代数,则 (视为 -代数)同构于由奇数次元素生成的自由外代数。
量子群与非交换几何[编辑]
主条目:量子群
上述所有例子若非交换便是余交换的。另一方面,泛包络代数的某些“变形”或“量子化”可给出非交换亦非余交换的例子;这类霍普夫代数常被称为量子群,尽管严格而言它们并不是群。这类代数在非交换几何中相当重要:一个仿射代数群可以由其座标环构成的霍普夫代数刻划,而这些霍普夫代数的变形则可设想为某类“量子化”了的代数群(实则非群)。
文献[编辑]
[/ltr][/size]
- Eiichi Abe, Hopf Algebras (1980), translated by Hisae Kino***a and Hiroko Tanaka, Cambridge University Press. ISBN 0-521-22240-0
- Jurgen Fuchs, Affine Lie Algebras and Quantum Groups, (1992), Cambridge University Press. ISBN 0-521-48412-X
- Ross Moore, Sam Williams and Ross Talent: Quantum Groups: an entrée to modern algebra
- Pierre Cartier, A primer of Hopf algebras, IHES preprint, September 2006, 81 pages
[size][ltr]
注记[编辑]
[/ltr][/size]
- ^ H. Hopf, Uber die Topologie der Gruppen-Mannigfaltigkeiten und ihrer Verallgemeinerungen, Ann. of Math. 42 (1941), 22-52. Reprinted in Selecta Heinz Hopf, pp. 119-151, Springer, Berlin (1964). MR4784
分类:
对偶 (数学)[编辑]
[ltr]在数学领域中,对偶一般来说是以一对一的方式,常常(但并不总是)通过某个对合算子,把一种概念、公理或数学结构转化为另一种概念、公理或数学结构:如果A的对偶是B,那么B的对偶是A。由于对合有时候会存在不动点,因此A的对偶有时候会是A自身。比如射影几何中的笛沙格定理,即是在这一意义下的自对偶。
对偶在数学背景当中具有很多种意义,而且,尽管它是“现代数学中极为普遍且重要的概念(a very pervasive and important concept in (modern) mathematics)”[1]并且是“在数学几乎每一个分支中都会出现的重要的一般性主题(an important general theme that has manifestations in almost every area of mathematics)”[2],但仍然没有一个能把对偶的所有概念统一起来的普适定义。[2]
在两类对象之间的对偶很多都和配对(pairing),也就是把一类对象和另一类对象映射到某一族标量上的双线性函数相对应。例如,线性代数的对偶对应着把线性空间中的向量对双线性映射到标量上,广义函数及其相关的试验函数也对应着一个配对且在该配对中可用试验函数来对广义函数进行积分,庞加莱对偶从给定流形的子流形之间的配对的角度看同样也对应着交数。[3]
[/ltr]
[size][ltr]
序逆对偶[编辑]
一种特别简单的对偶形式来自于序理论。偏序关系P = (X, ≤)的对偶是由同一偏序集组成但关系相反的偏序关系Pd。我们比较熟悉的对偶偏序的例子有:
[/ltr][/size]
[size][ltr]
为某一偏序P定义的概念会对应到对偶偏序集Pd的对偶概念上。例如,P的极小元对应于Pd的极大元:极小和极大是序理论中的对偶概念。序理论中的其他对偶概念还包括上界和下界、上闭集合和下闭集合、理想和滤子。
一种特殊的序逆对偶存在于某个集合S的幂集合中:若表示补集,则当且仅当。在拓扑学中,开集和闭集是对偶概念:开集的补是闭的,反之亦然。在拟阵论中,某个给定拟阵的独立集合的补集簇形成另一个拟阵,称作对偶拟阵。在逻辑中,我们可以把非量化公式中变量的成真赋值表示为对该赋值为真的变量集合。成真赋值满足该公式当且仅当该成真赋值的补满足该公式的德·摩根定律。逻辑中的全称量词和存在量词也是类似的对偶。
偏序可以解释为范畴,在该范畴中存在从x到y的arrow当且仅当偏序中有x ≤ y。偏序的序逆对偶可扩展为对偶范畴的概念,即由给定范畴中所有arrow的逆所组成的范畴。后面将要描述的很多具体的对偶都是在此意义下的范畴的对偶。
维逆对偶[编辑]
[/ltr][/size][size][ltr]
存在着很多种不同但互相联系的在同一类几何或拓扑对象之间的对偶,不过具有对偶关系的对象在特征维数上是相反的。这方面的经典例子是正多面体的对偶,其中立方体和正八面体形成了一个对偶配对,正十二面体和正二十面体形成了另一个对偶配对,而正四面体是自对偶的。任何一种这类多面体的对偶多面体可作为主要多面体每一面中心点的凸包。
相关条目[编辑]
[/ltr][/size]
[size][ltr]
Notes[编辑]
[/ltr][/size]
[size][ltr]
参考资料[编辑]
[/ltr][/size]
[size]
分类:
[/size]
对偶 (数学)[编辑]
[ltr]在数学领域中,对偶一般来说是以一对一的方式,常常(但并不总是)通过某个对合算子,把一种概念、公理或数学结构转化为另一种概念、公理或数学结构:如果A的对偶是B,那么B的对偶是A。由于对合有时候会存在不动点,因此A的对偶有时候会是A自身。比如射影几何中的笛沙格定理,即是在这一意义下的自对偶。
对偶在数学背景当中具有很多种意义,而且,尽管它是“现代数学中极为普遍且重要的概念(a very pervasive and important concept in (modern) mathematics)”[1]并且是“在数学几乎每一个分支中都会出现的重要的一般性主题(an important general theme that has manifestations in almost every area of mathematics)”[2],但仍然没有一个能把对偶的所有概念统一起来的普适定义。[2]
在两类对象之间的对偶很多都和配对(pairing),也就是把一类对象和另一类对象映射到某一族标量上的双线性函数相对应。例如,线性代数的对偶对应着把线性空间中的向量对双线性映射到标量上,广义函数及其相关的试验函数也对应着一个配对且在该配对中可用试验函数来对广义函数进行积分,庞加莱对偶从给定流形的子流形之间的配对的角度看同样也对应着交数。[3]
[/ltr]
[size][ltr]
序逆对偶[编辑]
一种特别简单的对偶形式来自于序理论。偏序关系P = (X, ≤)的对偶是由同一偏序集组成但关系相反的偏序关系Pd。我们比较熟悉的对偶偏序的例子有:
[/ltr][/size]
- 任何集合簇上的子集和超集关系和;
- 整数上的因数和倍数关系;
- 人类集合上的后代和祖先关系。
[size][ltr]
为某一偏序P定义的概念会对应到对偶偏序集Pd的对偶概念上。例如,P的极小元对应于Pd的极大元:极小和极大是序理论中的对偶概念。序理论中的其他对偶概念还包括上界和下界、上闭集合和下闭集合、理想和滤子。
一种特殊的序逆对偶存在于某个集合S的幂集合中:若表示补集,则当且仅当。在拓扑学中,开集和闭集是对偶概念:开集的补是闭的,反之亦然。在拟阵论中,某个给定拟阵的独立集合的补集簇形成另一个拟阵,称作对偶拟阵。在逻辑中,我们可以把非量化公式中变量的成真赋值表示为对该赋值为真的变量集合。成真赋值满足该公式当且仅当该成真赋值的补满足该公式的德·摩根定律。逻辑中的全称量词和存在量词也是类似的对偶。
偏序可以解释为范畴,在该范畴中存在从x到y的arrow当且仅当偏序中有x ≤ y。偏序的序逆对偶可扩展为对偶范畴的概念,即由给定范畴中所有arrow的逆所组成的范畴。后面将要描述的很多具体的对偶都是在此意义下的范畴的对偶。
维逆对偶[编辑]
[/ltr][/size][size][ltr]
存在着很多种不同但互相联系的在同一类几何或拓扑对象之间的对偶,不过具有对偶关系的对象在特征维数上是相反的。这方面的经典例子是正多面体的对偶,其中立方体和正八面体形成了一个对偶配对,正十二面体和正二十面体形成了另一个对偶配对,而正四面体是自对偶的。任何一种这类多面体的对偶多面体可作为主要多面体每一面中心点的凸包。
相关条目[编辑]
[/ltr][/size]
[size][ltr]
Notes[编辑]
[/ltr][/size]
- ^ Kostrikin 2001
- ^ 2.0 2.1 Gowers 2008,p. 187, col. 1
- ^ Gowers 2008,p. 189, col. 2
[size][ltr]
参考资料[编辑]
[/ltr][/size]
- Kostrikin, A. I., Duality//Hazewinkel, Michiel, 数学百科全书, 克鲁维尔学术出版社. 2001, ISBN 978-1556080104.
- Gowers, Timothy, III.19 Duality, The Princeton Companion to Mathematics, Princeton University Press. 2008: 187–190.
- Cartier, Pierre, A mad day's work: from Grothendieck to Connes and Kontsevich. The evolution of concepts of space and symmetry, American Mathematical Society. Bulletin. New Series. 2001, 38 (4): 389–408, doi:10.1090/S0273-0979-01-00913-2, MR1848254,ISSN 0002-9904 (a non-technical overview about several aspects of geometry, including dualities)
- Artstein-Avidan, Shiri; Milman, Vitali, The concept of duality for measure projections of convex bodies, Journal of functional analysis. 2008, 254 (10): 2648–2666, doi:10.1016/j.jfa.2007.11.008. Alsoauthor's site.
- Artstein-Avidan, Shiri; Milman, Vitali, A characterization of the concept of duality, Electronic research announcements in mathematical sciences. 2007, 14: 42–59. Also author's site.
- Dwyer, William G.; Spaliński, J., Homotopy theories and model categories, Handbook of algebraic topology, Amsterdam: North-Holland. 1995: 73–126, MR1361887
- Fulton, William, Introduction to toric varieties, Princeton University Press. 1993, ISBN 978-0-691-00049-7
- Griffiths, Phillip; Harris, Joseph, Principles of algebraic geometry, Wiley Classics Library, New York: John Wiley & Sons. 1994, MR1288523,ISBN 978-0-471-05059-9
- Hartshorne, Robin, Residues and Duality, Lecture Notes in Mathematics 20, Berlin, New York: Springer-Verlag. 1966: 20–48
- Hartshorne, Robin, Algebraic Geometry, Berlin, New York: Springer-Verla. 1977, MR0463157, ISBN 978-0-387-90244-9,OCLC 13348052
- Iversen, Birger, Cohomology of sheaves, Universitext, Berlin, New York: Springer-Verlag. 1986, MR842190, ISBN 978-3-540-16389-3
- Joyal, André; Street, Ross, An introduction to Tannaka duality and quantum groups, Category theory (Como, 1990), Lecture notes in mathematics, 1488, Berlin, New York: Springer-Verlag. 1991: 413–492, MR1173027
- Lam, Tsit-Yuen, Lectures on modules and rings, Graduate Texts in Mathematics No. 189, Berlin, New York: Springer-Verlag. 1999,MR1653294, ISBN 978-0-387-98428-5
- Lang, Serge, Algebra, Graduate Texts in Mathematics, 211, Berlin, New York: Springer-Verlag. 2002, MR1878556, ISBN 978-0-387-95385-4
- Loomis, Lynn H., An introduction to abstract harmonic analysis, Toronto-New York-London: D. Van Nostrand Company, Inc.. 1953: pp. x+190
- Mac Lane, Saunders, Categories for the Working Mathematician. 2nd, Berlin, New York: Springer-Verlag. 1998, ISBN 978-0-387-98403-2
- Mazur, Barry, Notes on étale cohomology of number fields, Annales Scientifiques de l'École Normale Supérieure. Quatrième Série. 1973, 6: 521–552, MR0344254, ISSN 0012-9593
- Milne, James S., Étale cohomology, Princeton University Press. 1980,ISBN 978-0-691-08238-7
- Milne, James S., Arithmetic duality theorems. 2nd, Charleston, SC: BookSurge, LLC. 2006, MR2261462, ISBN 978-1-4196-4274-6
- Negrepontis, Joan W., Duality in analysis from the point of view of triples, Journal of Algebra. 1971, 19 (2): 228–253, doi:10.1016/0021-8693(71)90105-0, MR0280571, ISSN 0021-8693
- Veblen, Oswald; Young, John Wesley, Projective geometry. Vols. 1, 2, Blaisdell Publishing Co. Ginn and Co. New York-Toronto-London. 1965, MR0179666
- Weibel, Charles A., An introduction to homological algebra, Cambridge University Press. 1994, MR1269324, ISBN 978-0-521-55987-4
[size]
分类:
[/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
幺半范畴[编辑]
[ltr]张量范畴(tensor category),或曰幺半范畴 (monoidal category), 直觉地讲,是个配上张量积的阿贝尔范畴(abelian category),可当作环的范畴化。
[/ltr]
[size][ltr]
定义[编辑]
数学中中,一个张量范畴(tensor category) (或称幺半范畴 monoidal category) 是一个包含单一个对象的双范畴(bicategory)。 更具体的描述:一个张量范畴是
[/ltr][/size]
[size][ltr]
;
[/ltr][/size]
[size][ltr]
,;
[/ltr][/size]
[size][ltr]
每, , , ,和都交换.///
在这以上两道相容条件下,任何以结合子,左右单位子和张量积组成的图表都交换,因为 Mac Lane 凝聚定理(Mac Lane's coherence theorem): 每个幺半范畴都 幺半等价(monoidally equivalent) 于一严格幺半范畴(见下).
严格幺半范畴[编辑]
严格幺半范畴(strict monoidal category) 是个幺半范畴 ,其自然态射 , 和 都是恒等影射.
取任一 范畴 , 我们可构筑其 自由严格幺半范畴 :
[/ltr][/size]
[size][ltr]
按:此算符 ,向由任一 范畴 配上 ,可推广到 上的严格-2-单子 (en:strict 2-monad) 。
例[编辑]
取任一范畴,若以其平常范畴积作张量积,以其终对象作单位对象,则成为一个张量范畴。 亦可取任一范畴,以其余积(co-product)作张量积,以其始对象作单位对象,亦成一个张量范畴。 (此二例实为对称幺半范畴结构。) 但亦有许多张量范畴 (例如 -Mod,如下),其张量积 既非 范畴积 亦非 范畴余积。
以下举张量范畴二例——向量空间范畴和集合范畴——并表明其类比:
[/ltr][/size]
[size][ltr]
相关的结构[编辑]
[/ltr][/size]
[size][ltr]
应用[编辑]
[/ltr][/size]
[size][ltr]
参考[编辑]
[/ltr][/size]
[ltr]张量范畴(tensor category),或曰幺半范畴 (monoidal category), 直觉地讲,是个配上张量积的阿贝尔范畴(abelian category),可当作环的范畴化。
[/ltr]
[size][ltr]
定义[编辑]
数学中中,一个张量范畴(tensor category) (或称幺半范畴 monoidal category) 是一个包含单一个对象的双范畴(bicategory)。 更具体的描述:一个张量范畴是
[/ltr][/size]
- 一个范畴 ;
- 被赋予张量积,即一个二元函子
[size][ltr]
;
[/ltr][/size]
- 被赋予一个单位对象 ;
- 被赋予三组自然同构映射:
- 结合子: ;
- 左/右单位子: 自然同构映射 , :
[size][ltr]
,;
[/ltr][/size]
- 满足以下相容条件:
[size][ltr]
每, , , ,和都交换.///
在这以上两道相容条件下,任何以结合子,左右单位子和张量积组成的图表都交换,因为 Mac Lane 凝聚定理(Mac Lane's coherence theorem): 每个幺半范畴都 幺半等价(monoidally equivalent) 于一严格幺半范畴(见下).
严格幺半范畴[编辑]
严格幺半范畴(strict monoidal category) 是个幺半范畴 ,其自然态射 , 和 都是恒等影射.
取任一 范畴 , 我们可构筑其 自由严格幺半范畴 :
[/ltr][/size]
- 对象:其每一对象是一串由里面的对象组成之有限序列 );
- 态射:当且仅当时,我们在二个对象 和 之间定义 态射:每 -态射 是一串由 -态射组成的有限序列 ;
- 张量积: 二个-对象 及 之张量积, 我们定义为 此二有限序列之串接(concatenation) ; 同样地任何二 -态射之张量积, 我们定义为其串接。
[size][ltr]
按:此算符 ,向由任一 范畴 配上 ,可推广到 上的严格-2-单子 (en:strict 2-monad) 。
例[编辑]
取任一范畴,若以其平常范畴积作张量积,以其终对象作单位对象,则成为一个张量范畴。 亦可取任一范畴,以其余积(co-product)作张量积,以其始对象作单位对象,亦成一个张量范畴。 (此二例实为对称幺半范畴结构。) 但亦有许多张量范畴 (例如 -Mod,如下),其张量积 既非 范畴积 亦非 范畴余积。
以下举张量范畴二例——向量空间范畴和集合范畴——并表明其类比:
[/ltr][/size]
取任一域 或交换环 , 各 -模 所成之 范畴 -模 (若R 为一域, 则 R-模即 R-向量空间) 是一 对称幺半范畴;其张量积 ⊗ 与单位对象为:. | 范畴 集 为一对称幺半范畴赋有张量积 × 与单位对象 {*}. |
单元结合代数为-模之 一对象,赋上态射 与 并满足以下条件: | A 幺半群 为一对象 M ,配上态射 与 并满足 |
and | 与 |
. | . |
A 余代数(coalgebra) 是一个 对象 C ,被赋予 态射 和 并满足以下条件: | 集内每一对象(即每一集合)S, 都被赋予 态射 和 满足以下条件: |
and | and |
. | . |
此 ε 是唯一的,因为 (即一元集合)是个终对象. |
相关的结构[编辑]
[/ltr][/size]
- 很多张量范畴更进一步有 辫, 交换态射 or 封闭等结构. 详见下述参考。
- 幺半函子 (en:monoidal functor)为二张量范畴(么半范畴)间、保存张量积结构之函子; 幺半态射为二么半函子间之态射(自然变换 (natural transformations))。
- 一般幺半群之概念可推广成么半范畴中的幺半对象(en:monoid object)。尤其者,可视一严格么半范畴作 范畴之“范畴” Cat中的么半对象(并以卡氏积为么半结构)。
- 上有界交半格 构成一严格对称么半范畴:其积为交,而单位元则为顶。
[size][ltr]
应用[编辑]
[/ltr][/size]
[size][ltr]
参考[编辑]
[/ltr][/size]
- Mac Lane, Saunders (1963). "Natural Associativity and Commutativity".Rice University Studies 49, 28–46.
- Kelly, G. Max (1964). "On MacLane's Conditions for Coherence of Natural Associativities, Commutativities, etc." Journal of Algebra 1, 397–402
- Joyal, André; Street, Ross (1993). "Braided Tensor Categories".Advances in Mathematics 102, 20–78.
- Mac Lane, Saunders (1997), Categories for the Working Mathematician (2nd ed.). New York: Springer-Verlag.
- Baez, John, Definitions
- : <
>, < >,Springer On-line Reference Works
分类:
直积[编辑]
[size][ltr]
在数学中,经常定义已知对象的直积(direct product)来给出新对象。例子有集合的乘积(参见笛卡尔积),群的乘积(下面描述),环的乘积和其他代数结构的乘积。拓扑空间的乘积是另一个例子。
[/ltr][/size]
[size][ltr]
例子[编辑]
[/ltr][/size]
[size][ltr]
以类似的方式,我们可以谈论多于两个对象的乘积,比如 。我们甚至可以谈论无限多个对象的乘积比如 。
群直积[编辑]
在群论中可以定义两个群 (G, *) 和 (H, o) 的直积,指示为 G × H。对于写为加法的阿贝尔群,它也可以叫做两个群的直和,指示为 。
它定义为如下:
[/ltr][/size]
[size][ltr]
(g, h) × (g' , h' ) = (g * g' , h o h' )
(注意运算 * 可以同于 o。)
这个构造给出了新群。它有同构于 G (构成自形如 (g, 1) 的元素)的一个正规子群,和同构于 H (构成自元素 (1, h))的一个正规子群。
逆命题也成立,有下列识别定理: 如果群 K 包含两个正规子群 G 和 H,使得K= GH 并且 G 和 H 的交集只包含单位元,则 K = G × H。将其中一个正规子群条件弱化为一般子群则给出半直积。
作为一个例子,选取 G 和 H 是唯一(不别同构之异) 2 阶群 C2 的两个复本: 即 {1, a} 和 {1, b}。则 C2×C2 = {(1,1), (1,b), (a,1), (a,b)},带有逐元素运算。例如,(1,b)*(a,1) = (1*a, b*1) = (a,b),而 (1,b)*(1,b) = (1,b2) = (1,1)。
通过直积,我们得到一些自然群同态: 投影映射
,
叫做坐标函数。
还有,在直积上的所有同态 f 都完全决定自它的分量(component)函数 。
对于任何群 (G, *),和任何整数 n ≥ 0,多次应用直积得到所有 n-元组的群Gn (n=0 时是平凡群)。例如:
[/ltr][/size]
[size][ltr]
模的直积[编辑]
模的直积(不要混淆于张量积)非常类似于上述群直积的定义,使用笛卡尔积带有逐分量的加法运算,和只分布在所有分量上的标量乘法运算。开始于R 我们得到欧几里得空间 Rn,它是实 n-维向量空间的原型例子。Rm 和 Rn的直积是 Rm + n。
注意有限索引 的直积同一于直和 。直和与直积只对无限索引有区别,这里直和的元素对于除了对于有限多个之外所有的项目是零。它们是对偶的: 直和是上积,而直积是乘积。
例如,考虑 和 ,实数的无限直积和直和。在Y 中只有有着有限多个非零元素的序列。例如,(1,0,0,0,...) 在 Y 中但(1,1,1,1,...) 不在。这两种序列都在直积 X 中;事实上,Y 是 X 的真子集(也就是 Y⊂X)。
拓扑空间直积[编辑]
拓扑空间的搜集 Xi 即对于 i 在 I 中的某个索引集合的直积,再次利用了笛卡尔积
定义拓扑是有些技巧的。对于有限多个因子这是明显和自然的事情: 简单的选取开集构成的基为来自每个因子的开子集的所有笛卡尔积的搜集:
这个拓扑叫做乘积拓扑。例如,直接通过 R 的开集们(开区间的不交并)定义在 R2 上的乘积拓扑,这个拓扑的基由在平面上的开矩形的所有不交并构成(明显的它一致于平常的度量拓扑)。
无限乘积的拓扑就有些曲折了,要能够确使所有投影映射连续,并确使所有到乘积中的函数连续当且仅当所有它的分量函数是连续的(就是满足乘积范畴定义: 这里的态射是连续函数): 我们同上面一样的选取的开集构成的基围来自每个因子的开子集的所有笛卡尔积的搜集,但带有除了有限多个开子集之外所有都是整个因子的限制条件:
在这种情况下更自然可靠的拓扑将是如上那样选取无限多个开子集的乘积,而这产生了有些意思的拓扑,即盒拓扑,但是不难找到其乘积函数不是连续的连续分量函数丛(例子请参见盒拓扑的条目)。使这种曲折成为必须的问题最终根源于在拓扑定义中开集的交集对无限多集合不保证是开集的事实。
乘积(带有乘积拓扑)关于保持它们因子的性质是良好的;例如,豪斯多夫空间的乘积是豪斯多夫空间;连通空间的乘积是连通空间,而紧致空间的乘积是紧致空间。最后一个也叫做吉洪诺夫定理,它是选择公理的另一个等价形式。
更多的形式和等价公式请参见单独条目乘积拓扑。
二元关系的直积[编辑]
在带有二元关系 R 和 S 的两个集合上的笛卡尔积上,定义 (a, b) T (c, d) 为a R c 并且 b S d。如果 R 和 S 都是自反的、反自反的、传递的、对称的或反对称的,则 T 有同样性质。[1] 组合各性质,可得出这还适用于作为预序和作为等价关系情况。但是如果 R 和 S 是完全关系,T 一般不是。
度量和范数[编辑]
在度量空间的笛卡尔积上的度量,和在赋范向量空间的直积上的范数,可以用各种方式定义,例子请参见p-范数。
参见[编辑]
[/ltr][/size]
[size][ltr]
注释[编辑]
[/ltr][/size]
[size][ltr]
引用[编辑]
[/ltr][/size]
[size]
分类:
[/size]
直积[编辑]
[size][ltr]
在数学中,经常定义已知对象的直积(direct product)来给出新对象。例子有集合的乘积(参见笛卡尔积),群的乘积(下面描述),环的乘积和其他代数结构的乘积。拓扑空间的乘积是另一个例子。
[/ltr][/size]
[size][ltr]
例子[编辑]
[/ltr][/size]
- 如果我们认 为实数集在加法下的群,则直积 仍构成自 。和上个例子的不同是 现在是群。我们必须定义如何做它们的元素的加法。这个定义为 。
- 如果我们认 为实数集的环,则直积 仍构成自 。要使它成为环,我们必须定义它们的元素的运算,加法定义为 ,而乘法定义为 。
- 但是如果我们认 为实数集的域,则直积 不存在! 以类似上例的方式天真的定义 不会结果一个域,因为元素 不存在乘法逆元。
[size][ltr]
以类似的方式,我们可以谈论多于两个对象的乘积,比如 。我们甚至可以谈论无限多个对象的乘积比如 。
群直积[编辑]
在群论中可以定义两个群 (G, *) 和 (H, o) 的直积,指示为 G × H。对于写为加法的阿贝尔群,它也可以叫做两个群的直和,指示为 。
它定义为如下:
[/ltr][/size]
[size][ltr]
(g, h) × (g' , h' ) = (g * g' , h o h' )
(注意运算 * 可以同于 o。)
这个构造给出了新群。它有同构于 G (构成自形如 (g, 1) 的元素)的一个正规子群,和同构于 H (构成自元素 (1, h))的一个正规子群。
逆命题也成立,有下列识别定理: 如果群 K 包含两个正规子群 G 和 H,使得K= GH 并且 G 和 H 的交集只包含单位元,则 K = G × H。将其中一个正规子群条件弱化为一般子群则给出半直积。
作为一个例子,选取 G 和 H 是唯一(不别同构之异) 2 阶群 C2 的两个复本: 即 {1, a} 和 {1, b}。则 C2×C2 = {(1,1), (1,b), (a,1), (a,b)},带有逐元素运算。例如,(1,b)*(a,1) = (1*a, b*1) = (a,b),而 (1,b)*(1,b) = (1,b2) = (1,1)。
通过直积,我们得到一些自然群同态: 投影映射
,
叫做坐标函数。
还有,在直积上的所有同态 f 都完全决定自它的分量(component)函数 。
对于任何群 (G, *),和任何整数 n ≥ 0,多次应用直积得到所有 n-元组的群Gn (n=0 时是平凡群)。例如:
[/ltr][/size]
[size][ltr]
模的直积[编辑]
模的直积(不要混淆于张量积)非常类似于上述群直积的定义,使用笛卡尔积带有逐分量的加法运算,和只分布在所有分量上的标量乘法运算。开始于R 我们得到欧几里得空间 Rn,它是实 n-维向量空间的原型例子。Rm 和 Rn的直积是 Rm + n。
注意有限索引 的直积同一于直和 。直和与直积只对无限索引有区别,这里直和的元素对于除了对于有限多个之外所有的项目是零。它们是对偶的: 直和是上积,而直积是乘积。
例如,考虑 和 ,实数的无限直积和直和。在Y 中只有有着有限多个非零元素的序列。例如,(1,0,0,0,...) 在 Y 中但(1,1,1,1,...) 不在。这两种序列都在直积 X 中;事实上,Y 是 X 的真子集(也就是 Y⊂X)。
拓扑空间直积[编辑]
拓扑空间的搜集 Xi 即对于 i 在 I 中的某个索引集合的直积,再次利用了笛卡尔积
定义拓扑是有些技巧的。对于有限多个因子这是明显和自然的事情: 简单的选取开集构成的基为来自每个因子的开子集的所有笛卡尔积的搜集:
这个拓扑叫做乘积拓扑。例如,直接通过 R 的开集们(开区间的不交并)定义在 R2 上的乘积拓扑,这个拓扑的基由在平面上的开矩形的所有不交并构成(明显的它一致于平常的度量拓扑)。
无限乘积的拓扑就有些曲折了,要能够确使所有投影映射连续,并确使所有到乘积中的函数连续当且仅当所有它的分量函数是连续的(就是满足乘积范畴定义: 这里的态射是连续函数): 我们同上面一样的选取的开集构成的基围来自每个因子的开子集的所有笛卡尔积的搜集,但带有除了有限多个开子集之外所有都是整个因子的限制条件:
在这种情况下更自然可靠的拓扑将是如上那样选取无限多个开子集的乘积,而这产生了有些意思的拓扑,即盒拓扑,但是不难找到其乘积函数不是连续的连续分量函数丛(例子请参见盒拓扑的条目)。使这种曲折成为必须的问题最终根源于在拓扑定义中开集的交集对无限多集合不保证是开集的事实。
乘积(带有乘积拓扑)关于保持它们因子的性质是良好的;例如,豪斯多夫空间的乘积是豪斯多夫空间;连通空间的乘积是连通空间,而紧致空间的乘积是紧致空间。最后一个也叫做吉洪诺夫定理,它是选择公理的另一个等价形式。
更多的形式和等价公式请参见单独条目乘积拓扑。
二元关系的直积[编辑]
在带有二元关系 R 和 S 的两个集合上的笛卡尔积上,定义 (a, b) T (c, d) 为a R c 并且 b S d。如果 R 和 S 都是自反的、反自反的、传递的、对称的或反对称的,则 T 有同样性质。[1] 组合各性质,可得出这还适用于作为预序和作为等价关系情况。但是如果 R 和 S 是完全关系,T 一般不是。
度量和范数[编辑]
在度量空间的笛卡尔积上的度量,和在赋范向量空间的直积上的范数,可以用各种方式定义,例子请参见p-范数。
参见[编辑]
[/ltr][/size]
[size][ltr]
注释[编辑]
[/ltr][/size]
[size][ltr]
引用[编辑]
[/ltr][/size]
- Lang, S. Algebra. New York: Springer-Verlag, 2002.
[size]
分类:
[/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
矩阵加法[编辑]
(重定向自直和)
[ltr]在数学里,矩阵加法一般是指两个矩阵把其相对应元素加在一起的运算。但有另一运算也可以认为是一种矩阵的加法。
[/ltr]
[size][ltr]
个别元素相加(减)[编辑]
通常的矩阵加法被定义在两个相同大小的矩阵。两个m×n矩阵A和B的和,标记为A+B,一样是个m×n矩阵,其内的各元素为其相对应元素相加后的值。例如:
也可以做矩阵的减法,只要其大小相同的话。A-B内的各元素为其相对应元素相减后的值,且此矩阵会和A、B有相同大小。例如:
在MS Excel做矩阵加(减)法[编辑]
一般的矩阵加(减)法如下,至于下一节的“直和”请另找参考资料。
[/ltr][/size]
[size][ltr]
直和[编辑]
另较少用来的一种运算为直和。直和可以由任何一对矩阵形成,其定义为:
举例来说:
注意到两个方阵的直和可以表示两个图论的联集之邻接矩阵。
在任两个向量空间内取定基底,并取两基底的联集为向量空间直和的基底,则两空间上的线性变换的直和可以表成两矩阵的直和。
一般地,n个矩阵的直和可以写成:
另见[编辑]
[/ltr][/size]
[size][ltr]
外部链接[编辑]
[/ltr][/size]
(重定向自直和)
[ltr]在数学里,矩阵加法一般是指两个矩阵把其相对应元素加在一起的运算。但有另一运算也可以认为是一种矩阵的加法。
[/ltr]
[size][ltr]
个别元素相加(减)[编辑]
通常的矩阵加法被定义在两个相同大小的矩阵。两个m×n矩阵A和B的和,标记为A+B,一样是个m×n矩阵,其内的各元素为其相对应元素相加后的值。例如:
也可以做矩阵的减法,只要其大小相同的话。A-B内的各元素为其相对应元素相减后的值,且此矩阵会和A、B有相同大小。例如:
在MS Excel做矩阵加(减)法[编辑]
一般的矩阵加(减)法如下,至于下一节的“直和”请另找参考资料。
[/ltr][/size]
- 先输入要相加的两个矩阵,大小必须一致为,一般矩阵加法才有定义;
- 用鼠标选取大小为的空白格矩阵;
- 输入 =
- 用鼠标选取矩阵1
- 输入 + (若做减法则输入 -)
- 用鼠标选取矩阵2
- 按Ctrl+⇧ Shift+↵ Enter这三个键的组合。
[size][ltr]
直和[编辑]
另较少用来的一种运算为直和。直和可以由任何一对矩阵形成,其定义为:
举例来说:
注意到两个方阵的直和可以表示两个图论的联集之邻接矩阵。
在任两个向量空间内取定基底,并取两基底的联集为向量空间直和的基底,则两空间上的线性变换的直和可以表成两矩阵的直和。
一般地,n个矩阵的直和可以写成:
另见[编辑]
[/ltr][/size]
[size][ltr]
外部链接[编辑]
[/ltr][/size]
分类:
笛卡儿积[编辑]
(重定向自笛卡尔积)
在数学中,两个集合X和Y的笛卡儿积(Cartesian product),又称直积,表示为X × Y,是其第一个对象是X的成员而第二个对象是Y的一个成员的所有可能的有序对:
。
笛卡儿积得名于笛卡儿,他的解析几何的公式化引发了这个概念。
具体的说,如果集合X是13个元素的点数集合{ A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3, 2 }而集合Y是4个元素的花色集合{♠, ♥, ♦, ♣},则这两个集合的笛卡儿积是52个元素的标准扑克牌的集合{ (A, ♠), (K, ♠), ..., (2, ♠), (A, ♥), ..., (3, ♣), (2, ♣) }。
[/ltr][/size]
[size][ltr]
笛卡儿积的性质[编辑]
易见笛卡儿积满足下列性质:
[/ltr][/size]
[size][ltr]
笛卡儿平方和n-元乘积[编辑]
集合X的笛卡儿平方(或二元笛卡儿积)是笛卡儿积X × X。一个例子是二维平面R × R,这里R是实数的集合 - 所有的点(x,y),这里的x和y是实数(参见笛卡儿坐标系)。
可以推广出在n个集合X1, ..., Xn上的n-元笛卡儿积:
。
实际上,它可以被认同为 (X1 × ... × Xn-1) × Xn。它也是n-元组的集合。
一个例子是欧几里得三维空间R × R × R,这里的R再次是实数的集合。
为了辅助它的计算,可绘制一个表格。一个集合作为行而另一个集合作为列,从行和列的集合选择元素形成有序对作为表的单元格。
无穷乘积[编辑]
对最常用的数学应用而言,上述定义通常就是所需要的全部。但是有可能在任意(可能无限)的集合的搜集上定义笛卡儿积。如果I是任何指标集合,而
是由I索引的集合的搜集,则我们定义
,
就是定义在索引集合上的所有函数的集合,使得这些函数在特定索引i上的值是Xi 的元素。
对在I中每个j,定义自
的函数
叫做第j投影映射。
n-元组可以被看作在{1, 2, ..., n}上的函数,它在i上的值是这个元组的第i个元素。所以,在I是{1, 2, ..., n}的时候这个定义一致于对有限情况的定义。在无限情况下这个定义是集合族。
特别熟悉的一个无限情况是在索引集合是自然数的集合的时候:这正是其中第i项对应于集合Xi 的所有无限序列的集合。再次,提供了这样的一个例子:
是实数的无限序列的搜集,并且很容易可视化为带有有限数目构件的向量或元组。另一个特殊情况(上述例子也满足它)是在乘积涉及因子Xi都是相同的时候,类似于“笛卡儿指数”。则在定义中的无限并集自身就是这个集合自身,而其他条件被平凡的满足了,所以这正是从I到X的所有函数的集合。
此外,无限笛卡儿积更少直觉性,尽管有应用于高级数学的价值。
断言非空集合的任意非空搜集的笛卡儿积为非空等价于选择公理。
函数的笛卡儿积[编辑]
如果f是从A到B的函数而g是从X到Y的函数,则它们的笛卡儿积f×g是从A×X到B×Y的函数,带有
上述可以被扩展到函数的元组和无限指标。
外部链接[编辑]
[/ltr][/size]
[size][ltr]
参见[编辑]
[/ltr][/size]
[size]
分类:
[/size]
笛卡儿积[编辑]
(重定向自笛卡尔积)
“笛卡儿平方”重定向至此,关于范畴论中的笛卡儿平方,参见拉回 (范畴论)
[size][ltr]在数学中,两个集合X和Y的笛卡儿积(Cartesian product),又称直积,表示为X × Y,是其第一个对象是X的成员而第二个对象是Y的一个成员的所有可能的有序对:
。
笛卡儿积得名于笛卡儿,他的解析几何的公式化引发了这个概念。
具体的说,如果集合X是13个元素的点数集合{ A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3, 2 }而集合Y是4个元素的花色集合{♠, ♥, ♦, ♣},则这两个集合的笛卡儿积是52个元素的标准扑克牌的集合{ (A, ♠), (K, ♠), ..., (2, ♠), (A, ♥), ..., (3, ♣), (2, ♣) }。
[/ltr][/size]
[size][ltr]
笛卡儿积的性质[编辑]
易见笛卡儿积满足下列性质:
[/ltr][/size]
[size][ltr]
笛卡儿平方和n-元乘积[编辑]
集合X的笛卡儿平方(或二元笛卡儿积)是笛卡儿积X × X。一个例子是二维平面R × R,这里R是实数的集合 - 所有的点(x,y),这里的x和y是实数(参见笛卡儿坐标系)。
可以推广出在n个集合X1, ..., Xn上的n-元笛卡儿积:
。
实际上,它可以被认同为 (X1 × ... × Xn-1) × Xn。它也是n-元组的集合。
一个例子是欧几里得三维空间R × R × R,这里的R再次是实数的集合。
为了辅助它的计算,可绘制一个表格。一个集合作为行而另一个集合作为列,从行和列的集合选择元素形成有序对作为表的单元格。
无穷乘积[编辑]
对最常用的数学应用而言,上述定义通常就是所需要的全部。但是有可能在任意(可能无限)的集合的搜集上定义笛卡儿积。如果I是任何指标集合,而
是由I索引的集合的搜集,则我们定义
,
就是定义在索引集合上的所有函数的集合,使得这些函数在特定索引i上的值是Xi 的元素。
对在I中每个j,定义自
的函数
叫做第j投影映射。
n-元组可以被看作在{1, 2, ..., n}上的函数,它在i上的值是这个元组的第i个元素。所以,在I是{1, 2, ..., n}的时候这个定义一致于对有限情况的定义。在无限情况下这个定义是集合族。
特别熟悉的一个无限情况是在索引集合是自然数的集合的时候:这正是其中第i项对应于集合Xi 的所有无限序列的集合。再次,提供了这样的一个例子:
是实数的无限序列的搜集,并且很容易可视化为带有有限数目构件的向量或元组。另一个特殊情况(上述例子也满足它)是在乘积涉及因子Xi都是相同的时候,类似于“笛卡儿指数”。则在定义中的无限并集自身就是这个集合自身,而其他条件被平凡的满足了,所以这正是从I到X的所有函数的集合。
此外,无限笛卡儿积更少直觉性,尽管有应用于高级数学的价值。
断言非空集合的任意非空搜集的笛卡儿积为非空等价于选择公理。
函数的笛卡儿积[编辑]
如果f是从A到B的函数而g是从X到Y的函数,则它们的笛卡儿积f×g是从A×X到B×Y的函数,带有
上述可以被扩展到函数的元组和无限指标。
外部链接[编辑]
[/ltr][/size]
[size][ltr]
参见[编辑]
[/ltr][/size]
[size]
分类:
[/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
自由积[编辑]
[ltr]在数学的群论中,自由积(英语:free product,法语:produit libre)是从两个以上的群构造出一个群的一种操作。两个群G和H的自由积,是一个新的群G ∗ H。这个群包含G和H为子群,由G和H的元素生成,并且是有以上性质的群之中“最一般”的。自由积一定是无限群,除非G和H其一是平凡群。自由积的构造方法和自由群(由给定的生成元集合所能构造出的最一般的群)相似。
自由积是群范畴中的余积。
[/ltr]
[size][ltr]
建构方式[编辑]
若G和H是群,以G和H形成的字是以下形式的乘积:
其中si是G或H的元。这种字可以用以下的操作简化:
[/ltr][/size]
[size][ltr]
每个简约字都是G的元素和H的元素交替的积,例如:
自由积G ∗ H的元素是以G和H形成的简约字,其上的运算是将两字接合后简化。
例如若G是无穷循环群<x>,H是无穷循环群<y>,则G ∗ H的元素是x的幂和y的幂交替的积。此时G ∗ H同构于以x和y生成的自由群。
设是群的一个族。用形成的字,也可以用上述操作简化为简约字。仿上可定义出的自由积。
展示[编辑]
设
是G的一个展示(SG是生成元的集合,RG是关系元的集合),又设
是H的一个展示。那么
即是G ∗ H是G的生成元和H的生成元所生成,而其关系是G的关系元和H的关系元所组成。(两者都是不交并。)
性质[编辑]
[/ltr][/size]
[size][ltr]
泛性质[编辑]
设G是群,是由群组成的一个族,有一族群同态。那么存在唯一的群同态,使得对所有都有
其中是把嵌入到中的群同态。
共合积[编辑]
共合积(英语:amalgamated (free) product或free product with amalgamation,法语:produit (libre) amalgamé)是自由积的推广。设G和H是群,又设F是另一个群,并有群同态
及
对F中所有元素f,在自由积G ∗ H中加入关系
便得出其共合积。换言之,在G ∗ H中取最小的正规子群N,使得上式左方的元素都包含在内,则商群
就是共合积。
共合积可视为在群范畴中图表的推出。
塞弗特-范坎彭定理指,两个路径连通的拓扑空间沿着一个路径连通子空间接合的并,其基本群是这两个拓扑空间的基本群的共合积。
共合积及与之相近的HNN扩张,是讨论在树上作用的群的Bass–Serre理论的基本组件。
参考[编辑]
[/ltr][/size]
[ltr]在数学的群论中,自由积(英语:free product,法语:produit libre)是从两个以上的群构造出一个群的一种操作。两个群G和H的自由积,是一个新的群G ∗ H。这个群包含G和H为子群,由G和H的元素生成,并且是有以上性质的群之中“最一般”的。自由积一定是无限群,除非G和H其一是平凡群。自由积的构造方法和自由群(由给定的生成元集合所能构造出的最一般的群)相似。
自由积是群范畴中的余积。
[/ltr]
[size][ltr]
建构方式[编辑]
若G和H是群,以G和H形成的字是以下形式的乘积:
其中si是G或H的元。这种字可以用以下的操作简化:
[/ltr][/size]
- 除去其中的(G或H的)单位元,
- 将其中的g1g2一对元素以其在G中的积代替,将其中的h1h2一对元素以其在H中的积代替。
[size][ltr]
每个简约字都是G的元素和H的元素交替的积,例如:
自由积G ∗ H的元素是以G和H形成的简约字,其上的运算是将两字接合后简化。
例如若G是无穷循环群<x>,H是无穷循环群<y>,则G ∗ H的元素是x的幂和y的幂交替的积。此时G ∗ H同构于以x和y生成的自由群。
设是群的一个族。用形成的字,也可以用上述操作简化为简约字。仿上可定义出的自由积。
展示[编辑]
设
是G的一个展示(SG是生成元的集合,RG是关系元的集合),又设
是H的一个展示。那么
即是G ∗ H是G的生成元和H的生成元所生成,而其关系是G的关系元和H的关系元所组成。(两者都是不交并。)
性质[编辑]
[/ltr][/size]
[size][ltr]
泛性质[编辑]
设G是群,是由群组成的一个族,有一族群同态。那么存在唯一的群同态,使得对所有都有
其中是把嵌入到中的群同态。
共合积[编辑]
共合积(英语:amalgamated (free) product或free product with amalgamation,法语:produit (libre) amalgamé)是自由积的推广。设G和H是群,又设F是另一个群,并有群同态
及
对F中所有元素f,在自由积G ∗ H中加入关系
便得出其共合积。换言之,在G ∗ H中取最小的正规子群N,使得上式左方的元素都包含在内,则商群
就是共合积。
共合积可视为在群范畴中图表的推出。
塞弗特-范坎彭定理指,两个路径连通的拓扑空间沿着一个路径连通子空间接合的并,其基本群是这两个拓扑空间的基本群的共合积。
共合积及与之相近的HNN扩张,是讨论在树上作用的群的Bass–Serre理论的基本组件。
参考[编辑]
[/ltr][/size]
- PlanetMath上Free product的资料。
- PlanetMath上Free product with amalgamated subgroup的资料。
- Pierre de la Harpe. Topics in Geometric Group Theory. Chicago and London: The University of Chicago Press. 2000. ISBN 0-226-31721-8.
分类:
半直积[编辑]
[size][ltr]
在数学中,特别是叫做群论的抽象代数领域中,半直积(semidirect product)是从其中一个是正规子群的两个子群形成一个群的特定方法。半直积是直积的推广。半直积是作为集合的笛卡尔积,但带有特定的乘法运算。
[/ltr][/size]
[size][ltr]
一些等价的定义[编辑]
令G为群,N为G的一个正规子群,并且H是G的一个子群。下列命题等价:
[/ltr][/size]
[size][ltr]
如果这些命题中的一个(从而所有)成立,则称G是一个N和H的半直积,或者说G在N上“分裂(splits)”,并写作G = N ⋊ H。
基本事实和提醒[编辑]
若G是正规子群N和子群H的半直积,而且N和H都是有限的,则G的阶等于N和H的阶的积。
注意,和直积的情况不同,半直积通常不是唯一的;如果G和G' 是两个群,都包含N为正规子群,并且都包含H为子群,而且二者都是N和H的半直积,则未必 G和G' 是同构的。
外半直积[编辑]
若G是一个N和H的半直积,则映射φ : H → Aut(N) (其中Aut(N)表示N的所有自同构组成的群)(定义为φ(h)(n) = hnh–1 对于所有H中的h和N中的n)是一个群同态。实际上N, H 和 φ 一起确定了G 最多相差一个同构,如下面所证。
给定任意两个群N和H(不必是某个群的子群)和一个群同态φ : H → Aut(N),我们定义一个新群N ⋊φ H,N和H相对于φ的半直积,如下: 基础的集合是集合直积 N × H,而群运算*给定为
(n1, h1) * (n2, h2) = (n1 φ(h1)(n2), h1 h2)
对于所有n1, N中的n2 和H中的h1, h2。这确实定义了一个群;其幺元为(eN,eH)而元素(n, h)的逆为(φ(h–1)(n–1), h–1). N × {eH}是同构于N的正规子群, {eN} × H是同构于H的子群,而该群是这两个子群在上面给出的意义下的半直积。
现在反过来假设我们有上述定义的内半直积,也就是说,一个群G有一个正规子群N,一个子群H,并且使得G的每个元素g 可以唯一的写成g=nh的形式,其中n在N中而h在H中。令φ : H→Aut(N)为如下同态
φ(h)(n)=hnh–1.
则G同构于外半直积N ⋊φ H; 该同构把乘积nh映到2元组(n,h)。在G中,我们有如下规则
(n1h1)(n2h2) = n1(h1n2h1–1)(h1h2)
而这是上述外半直积的定义的深层原因,也是一个记住它的方便办法。
群的分裂引理(splitting lemma)的一个版本称群G同构于两个群N和H的半直积当且仅当存在短正合序列
和一个群同态r : H → G 使得v o r = idH, H上的恒等映射。在这种情况, φ : H→ Aut(N)给出如下
φ(h)(n) = u–1(r(h)u(n)r(h–1)).
例子[编辑]
有 2n个元素的二面体群 Dn 同构于循环群Cn 和C2的半直积。这里,C2的非单位元作用于Cn,将元素变成其逆;这是一个自同构因为Cn是交换群。
平面的刚体运动群(映射f : R2 → R2 使得x和y之间的欧氏距离等于f(x) 和f(y)之间的距离对于所有在R2中的x和y成立)同构于交换群R2 (描述平移)和正交2×2矩阵的群O(2)(描述转动和反射)的半直积。每个正交矩阵通过矩阵乘法作用在R2上,并且是一个自同构。
所有正交n×n矩阵的群O(n)(直观的讲,所有n维空间的所有转动和反射的集合)同构于群SO(n) (所有行列式值为1的正交矩阵,直观的讲n维空间的转动的集合)和C2的准直积。如果我们将C2表示为矩阵{I, R}的乘法群,其中R是n维空间的翻转(也就是行列式为-1的正交对角矩阵),则φ : C2 → Aut(SO(n)) 由φ(H)(N) = H N H–1对所有 在C2中的H 和SO(n)中的N给出.
与直积的关系[编辑]
假设G是一个正规子群N和子群H的半直积。若H也在G中正规,或者说,若存在一个同态G → N是N上的恒等映射,则G是N和H的直积。
两个群N和H的直积可以视为N和H相对于φ(h) = idN (对于所有H中的h)的外半直积。
注意在直积中,因子的次序不重要,因为N × H同构于H × N。这在半直积中不成立,因为两个因子的角色不同。
推广[编辑]
半直积的构造可以推得更广。在环理论中有一个版本,环的交叉积(crossed product of rings)。一旦构造了群的一个半直积的群环,这可以很自然的看出。还有李代数的半直和。给定拓扑空间上的一个群作用,存在一个相应的交叉积,它通常非交换,即使群是可交换的。这样的环在群作用的轨道空间有重要作用,特别是当该空间不能用常规的拓扑技术处理的时候,例如在阿兰·孔涅的工作中(细节请参见非交换几何)。
在范畴论中也有推广。它们表明了如何从“指标范畴(indexed categories)”构造“纤维范畴(fibred categories)”。这是外准直积的抽象形式。
参看[编辑]
[/ltr][/size]
[size]
分类:
[/size]
半直积[编辑]
[size][ltr]
在数学中,特别是叫做群论的抽象代数领域中,半直积(semidirect product)是从其中一个是正规子群的两个子群形成一个群的特定方法。半直积是直积的推广。半直积是作为集合的笛卡尔积,但带有特定的乘法运算。
[/ltr][/size]
[size][ltr]
一些等价的定义[编辑]
令G为群,N为G的一个正规子群,并且H是G的一个子群。下列命题等价:
[/ltr][/size]
- G = NH 且 N ∩ H = {e} (其中e是G的幺元)
- G = HN 且 N ∩ H = {e}
- G的每个元素可以写作唯一的N的一个元素和H的一个元素的积
- G的每个元素可以写作唯一的H的一个元素和N的一个元素的积
- 自然的嵌入H → G, 和自然的投影G → G/N的复合,给出一个在H和G/N之间的同构
- 存在同态G → H,它的像是H本身而其核是N
[size][ltr]
如果这些命题中的一个(从而所有)成立,则称G是一个N和H的半直积,或者说G在N上“分裂(splits)”,并写作G = N ⋊ H。
基本事实和提醒[编辑]
若G是正规子群N和子群H的半直积,而且N和H都是有限的,则G的阶等于N和H的阶的积。
注意,和直积的情况不同,半直积通常不是唯一的;如果G和G' 是两个群,都包含N为正规子群,并且都包含H为子群,而且二者都是N和H的半直积,则未必 G和G' 是同构的。
外半直积[编辑]
若G是一个N和H的半直积,则映射φ : H → Aut(N) (其中Aut(N)表示N的所有自同构组成的群)(定义为φ(h)(n) = hnh–1 对于所有H中的h和N中的n)是一个群同态。实际上N, H 和 φ 一起确定了G 最多相差一个同构,如下面所证。
给定任意两个群N和H(不必是某个群的子群)和一个群同态φ : H → Aut(N),我们定义一个新群N ⋊φ H,N和H相对于φ的半直积,如下: 基础的集合是集合直积 N × H,而群运算*给定为
(n1, h1) * (n2, h2) = (n1 φ(h1)(n2), h1 h2)
对于所有n1, N中的n2 和H中的h1, h2。这确实定义了一个群;其幺元为(eN,eH)而元素(n, h)的逆为(φ(h–1)(n–1), h–1). N × {eH}是同构于N的正规子群, {eN} × H是同构于H的子群,而该群是这两个子群在上面给出的意义下的半直积。
现在反过来假设我们有上述定义的内半直积,也就是说,一个群G有一个正规子群N,一个子群H,并且使得G的每个元素g 可以唯一的写成g=nh的形式,其中n在N中而h在H中。令φ : H→Aut(N)为如下同态
φ(h)(n)=hnh–1.
则G同构于外半直积N ⋊φ H; 该同构把乘积nh映到2元组(n,h)。在G中,我们有如下规则
(n1h1)(n2h2) = n1(h1n2h1–1)(h1h2)
而这是上述外半直积的定义的深层原因,也是一个记住它的方便办法。
群的分裂引理(splitting lemma)的一个版本称群G同构于两个群N和H的半直积当且仅当存在短正合序列
和一个群同态r : H → G 使得v o r = idH, H上的恒等映射。在这种情况, φ : H→ Aut(N)给出如下
φ(h)(n) = u–1(r(h)u(n)r(h–1)).
例子[编辑]
有 2n个元素的二面体群 Dn 同构于循环群Cn 和C2的半直积。这里,C2的非单位元作用于Cn,将元素变成其逆;这是一个自同构因为Cn是交换群。
平面的刚体运动群(映射f : R2 → R2 使得x和y之间的欧氏距离等于f(x) 和f(y)之间的距离对于所有在R2中的x和y成立)同构于交换群R2 (描述平移)和正交2×2矩阵的群O(2)(描述转动和反射)的半直积。每个正交矩阵通过矩阵乘法作用在R2上,并且是一个自同构。
所有正交n×n矩阵的群O(n)(直观的讲,所有n维空间的所有转动和反射的集合)同构于群SO(n) (所有行列式值为1的正交矩阵,直观的讲n维空间的转动的集合)和C2的准直积。如果我们将C2表示为矩阵{I, R}的乘法群,其中R是n维空间的翻转(也就是行列式为-1的正交对角矩阵),则φ : C2 → Aut(SO(n)) 由φ(H)(N) = H N H–1对所有 在C2中的H 和SO(n)中的N给出.
与直积的关系[编辑]
假设G是一个正规子群N和子群H的半直积。若H也在G中正规,或者说,若存在一个同态G → N是N上的恒等映射,则G是N和H的直积。
两个群N和H的直积可以视为N和H相对于φ(h) = idN (对于所有H中的h)的外半直积。
注意在直积中,因子的次序不重要,因为N × H同构于H × N。这在半直积中不成立,因为两个因子的角色不同。
推广[编辑]
半直积的构造可以推得更广。在环理论中有一个版本,环的交叉积(crossed product of rings)。一旦构造了群的一个半直积的群环,这可以很自然的看出。还有李代数的半直和。给定拓扑空间上的一个群作用,存在一个相应的交叉积,它通常非交换,即使群是可交换的。这样的环在群作用的轨道空间有重要作用,特别是当该空间不能用常规的拓扑技术处理的时候,例如在阿兰·孔涅的工作中(细节请参见非交换几何)。
在范畴论中也有推广。它们表明了如何从“指标范畴(indexed categories)”构造“纤维范畴(fibred categories)”。这是外准直积的抽象形式。
参看[编辑]
[/ltr][/size]
- 圈积(Wreath product)
[size]
分类:
[/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
开普勒定律[编辑]
(重定向自行星运动定律)
[size][ltr]
开普勒定律是开普勒所发现、关于行星运动的定律。他于1609年在他出版的《新天文学》科学杂志上发表了关于行星运动的两条定律,又于1618年,发现了第三条定律。
开普勒幸运地得到了著名丹麦天文学家第谷·布拉赫所观察与收集、且非常精确的天文资料。大约于1605年,根据布拉赫的行星位置资料,开普勒发现行星的移动遵守着三条相当简单的定律。同年年底,他撰写完成了发表文稿。但是,直到1609年,才在《新天文学》科学杂志发表,这是因为布拉赫的观察数据属于他的继承人,不能随便让别人使用,因此产生的一些法律纠纷造成了延迟。
在天文学与物理学上、开普勒的定律给予亚里士多德派与托勒密派极大的挑战。他主张地球是不断地移动的;行星轨道不是周转圆(epicycle)的,而是椭圆形的;行星公转的速度不等恒。这些论点,大大地动摇了当时的天文学与物理学。经过了几乎一个世纪披星戴月,废寝忘食的研究,物理学家终于能够运用物理理论解释其中的奥秘。牛顿应用他的第二定律和万有引力定律,在数学上严格地证明了开普勒定律,也让人们了解了其中的物理意义。
[/ltr][/size]
[size][ltr]
开普勒定律[编辑]
开普勒的三条行星运动定律改变了整个天文学,彻底摧毁了托勒密复杂的宇宙体系,完善并简化了哥白尼的日心说。
开普勒第一定律[编辑]
[/ltr][/size][size][ltr]
开普勒第一定律,也称椭圆定律、轨道定律:每一个行星都沿各自的椭圆轨道环绕太阳,而太阳则处在椭圆的一个焦点中。
开普勒第二定律[编辑]
[/ltr][/size][size][ltr]
开普勒第二定律,也称等面积定律:在相等时间内,太阳和运动着的行星的连线所扫过的面积都是相等的。
这一定律实际揭示了行星绕太阳公转的角动量守恒。用公式表示为
。
开普勒第三定律[编辑]
[/ltr][/size][size][ltr]
开普勒第三定律,也称周期定律:各个行星绕太阳公转周期的平方和它们的椭圆轨道的半长轴的立方成正比。
由这一定律不难导出:行星与太阳之间的引力与半径的平方成反比。这是牛顿的万有引力定律的一个重要基础。
用公式表示为
;
这里, 是行星公转轨道半长轴, 是行星公转周期, 是常数。
数学推导:由牛顿万有引力定律导出开普勒定律[编辑]
开普勒定律描述的是行星围绕太阳的运动,牛顿定律可以更广义地描述几个粒子因万有引力相互吸引而形成的运动。假设只有两个粒子,其中一个粒子超轻于另外一个粒子,则轻的粒子会绕着重的粒子运动,就好似行星根据开普勒定律绕著太阳运动。另外,牛顿定律还可计算出关于其它方面的解答,行星轨道可以呈抛物线运动或双曲线运动。这是开普勒定律所无法预测到的结果。在一个粒子并不超轻于另外一个粒子的状况下,依照广义二体问题的解答,每一个粒子会绕着它们的共同质心运动。这也是开普勒定律无法预测到的。
开普勒定律使用几何语言将行星的坐标及时间跟轨道参数相连结。牛顿第二定律是一个微分方程。开普勒定律的推导涉及一些解析微分方程的技巧。在推导开普勒第一定律之前,必须先推导出开普勒第二定律,因为开普勒第一定律需要用到开普勒第二定律里的一些计算结果。
开普勒第二定律推导[编辑]
牛顿万有引力定律表明,任意两个粒子由通过连线方向的力相互吸引。该引力的的大小与它们的质量乘积成正比,与它们距离的平方成反比。由于太阳超重于行星,可以假设太阳是固定的。用方程表示,
;
这里, 是太阳作用于行星的万有引力, 是行星的质量, 是太阳的质量, 是行星相对于太阳的位移矢量, 是 的单位矢量。
牛顿第二定律表明,物体受力后所产生的加速度 ,和其所受的合力 成正比,和其质量 成反比,以方程表示,
。
合并这两个方程,
。(1)
思考位置矢量 ,对于时间 微分一次可得到速度矢量,再微分一次则可得到加速度矢量:
, 。(2)
在这里,用到了单位矢量微分方程:
, 。
合并方程 (1) 与 (2) ,可以得到矢量运动方程:
取各个分量,可以得到两个常微分方程,一个是关于径向加速度,另一个是关于切向加速度:
,(3) 。(4)
导引开普勒第二定律只需切向加速度方程。试想行星的角动量 。由于行星的质量是常数,角动量对于时间的导数为
。
角动量 也是一个运动常数,即使距离 与角速度 都可能会随时间变化。
从时间 到时间 扫过的区域 ,
。
行星太阳连线扫过的区域面积相依于间隔时间 。所以,开普勒第二定律是正确的。
开普勒第一定律推导[编辑]
设定 。这样,角速度是
。
对时间微分和对角度微分有如下关系:
。
根据上述关系,径向距离 对时间的导数为:
。
再求一次导数:
。
代入径向运动方程 (3) , ,
。
将此方程除以 ,则可得到一个简单的常系数非齐次线性全微分方程来描述行星轨道:
。
为了解这个微分方程,先列出一个特解
。
再求解剩余的常系数齐次线性全微分方程,
。
它的解为
;
这里, 与 是常数。合并特解和与齐次方程解,可以得到通解
。
选择坐标轴,让 。代回 ,
;
其中, 是离心率。
这是圆锥曲线的极坐标方程,坐标系的原点是圆锥曲线的焦点之一。假若 ,则 所描述的是椭圆轨道。这证明了开普勒第一定律。
开普勒第三定律推导[编辑]
在建立牛顿万有引力定律的概念与数学架构上,开普勒第三定律是牛顿依据的重要线索之一。假若接受牛顿运动定律。试想一个虚拟行星环绕着太阳公转,行星的移动轨道恰巧呈圆形,轨道半径为 。那么,太阳作用于行星的万有引力为 。行星移动速度为 。依照开普勒第三定律,这速度 与半径的平方根 成反比。所以,万有引力 。猜想这大概是牛顿发现万有引力定律的思路,但这个猜想无法被证实,因为在他的计算本里,并没有找到任何关于这方面的证据。
[/ltr][/size][size][ltr]
开普勒第一定律阐明,行星环绕太阳的轨道是椭圆形的。椭圆的面积是 ;这里, 与 分别为椭圆的半长轴与半短轴。在开普勒第二定律推导里,行星-太阳连线扫过区域速度 为
。
所以,行星公转周期 为
。(5)
关于此行星环绕太阳,椭圆的半长轴 ,半短轴 与近拱距 (近拱点 A 与引力中心之间的距离),远拱距 (远拱点 B 与引力中心之间的距离)的关系分别为
,(6) 。(7)
如果想要知道半长轴与半短轴,必须先求得近拱距与远拱距。依据能量守恒定律,
。
在近拱点 A 与远拱点 B,径向速度都等于零:
。
所以,
。
稍为加以编排,可以得到 的一元二次方程:
。
其两个根分别为椭圆轨道的近拱距 与远拱距 。
; 。
代入方程 (6) 与 (7) ,
, 。
代入方程 (5) ,周期的方程为
。
Q.E.D.
参见[编辑]
[/ltr][/size]
(重定向自行星运动定律)
[size][ltr]
开普勒定律是开普勒所发现、关于行星运动的定律。他于1609年在他出版的《新天文学》科学杂志上发表了关于行星运动的两条定律,又于1618年,发现了第三条定律。
开普勒幸运地得到了著名丹麦天文学家第谷·布拉赫所观察与收集、且非常精确的天文资料。大约于1605年,根据布拉赫的行星位置资料,开普勒发现行星的移动遵守着三条相当简单的定律。同年年底,他撰写完成了发表文稿。但是,直到1609年,才在《新天文学》科学杂志发表,这是因为布拉赫的观察数据属于他的继承人,不能随便让别人使用,因此产生的一些法律纠纷造成了延迟。
在天文学与物理学上、开普勒的定律给予亚里士多德派与托勒密派极大的挑战。他主张地球是不断地移动的;行星轨道不是周转圆(epicycle)的,而是椭圆形的;行星公转的速度不等恒。这些论点,大大地动摇了当时的天文学与物理学。经过了几乎一个世纪披星戴月,废寝忘食的研究,物理学家终于能够运用物理理论解释其中的奥秘。牛顿应用他的第二定律和万有引力定律,在数学上严格地证明了开普勒定律,也让人们了解了其中的物理意义。
[/ltr][/size]
[size][ltr]
开普勒定律[编辑]
开普勒的三条行星运动定律改变了整个天文学,彻底摧毁了托勒密复杂的宇宙体系,完善并简化了哥白尼的日心说。
开普勒第一定律[编辑]
[/ltr][/size][size][ltr]
开普勒第一定律,也称椭圆定律、轨道定律:每一个行星都沿各自的椭圆轨道环绕太阳,而太阳则处在椭圆的一个焦点中。
开普勒第二定律[编辑]
[/ltr][/size][size][ltr]
开普勒第二定律,也称等面积定律:在相等时间内,太阳和运动着的行星的连线所扫过的面积都是相等的。
这一定律实际揭示了行星绕太阳公转的角动量守恒。用公式表示为
。
开普勒第三定律[编辑]
[/ltr][/size][size][ltr]
开普勒第三定律,也称周期定律:各个行星绕太阳公转周期的平方和它们的椭圆轨道的半长轴的立方成正比。
由这一定律不难导出:行星与太阳之间的引力与半径的平方成反比。这是牛顿的万有引力定律的一个重要基础。
用公式表示为
;
这里, 是行星公转轨道半长轴, 是行星公转周期, 是常数。
数学推导:由牛顿万有引力定律导出开普勒定律[编辑]
开普勒定律描述的是行星围绕太阳的运动,牛顿定律可以更广义地描述几个粒子因万有引力相互吸引而形成的运动。假设只有两个粒子,其中一个粒子超轻于另外一个粒子,则轻的粒子会绕着重的粒子运动,就好似行星根据开普勒定律绕著太阳运动。另外,牛顿定律还可计算出关于其它方面的解答,行星轨道可以呈抛物线运动或双曲线运动。这是开普勒定律所无法预测到的结果。在一个粒子并不超轻于另外一个粒子的状况下,依照广义二体问题的解答,每一个粒子会绕着它们的共同质心运动。这也是开普勒定律无法预测到的。
开普勒定律使用几何语言将行星的坐标及时间跟轨道参数相连结。牛顿第二定律是一个微分方程。开普勒定律的推导涉及一些解析微分方程的技巧。在推导开普勒第一定律之前,必须先推导出开普勒第二定律,因为开普勒第一定律需要用到开普勒第二定律里的一些计算结果。
开普勒第二定律推导[编辑]
牛顿万有引力定律表明,任意两个粒子由通过连线方向的力相互吸引。该引力的的大小与它们的质量乘积成正比,与它们距离的平方成反比。由于太阳超重于行星,可以假设太阳是固定的。用方程表示,
;
这里, 是太阳作用于行星的万有引力, 是行星的质量, 是太阳的质量, 是行星相对于太阳的位移矢量, 是 的单位矢量。
牛顿第二定律表明,物体受力后所产生的加速度 ,和其所受的合力 成正比,和其质量 成反比,以方程表示,
。
合并这两个方程,
。(1)
思考位置矢量 ,对于时间 微分一次可得到速度矢量,再微分一次则可得到加速度矢量:
, 。(2)
在这里,用到了单位矢量微分方程:
, 。
合并方程 (1) 与 (2) ,可以得到矢量运动方程:
取各个分量,可以得到两个常微分方程,一个是关于径向加速度,另一个是关于切向加速度:
,(3) 。(4)
导引开普勒第二定律只需切向加速度方程。试想行星的角动量 。由于行星的质量是常数,角动量对于时间的导数为
。
角动量 也是一个运动常数,即使距离 与角速度 都可能会随时间变化。
从时间 到时间 扫过的区域 ,
。
行星太阳连线扫过的区域面积相依于间隔时间 。所以,开普勒第二定律是正确的。
开普勒第一定律推导[编辑]
设定 。这样,角速度是
。
对时间微分和对角度微分有如下关系:
。
根据上述关系,径向距离 对时间的导数为:
。
再求一次导数:
。
代入径向运动方程 (3) , ,
。
将此方程除以 ,则可得到一个简单的常系数非齐次线性全微分方程来描述行星轨道:
。
为了解这个微分方程,先列出一个特解
。
再求解剩余的常系数齐次线性全微分方程,
。
它的解为
;
这里, 与 是常数。合并特解和与齐次方程解,可以得到通解
。
选择坐标轴,让 。代回 ,
;
其中, 是离心率。
这是圆锥曲线的极坐标方程,坐标系的原点是圆锥曲线的焦点之一。假若 ,则 所描述的是椭圆轨道。这证明了开普勒第一定律。
开普勒第三定律推导[编辑]
在建立牛顿万有引力定律的概念与数学架构上,开普勒第三定律是牛顿依据的重要线索之一。假若接受牛顿运动定律。试想一个虚拟行星环绕着太阳公转,行星的移动轨道恰巧呈圆形,轨道半径为 。那么,太阳作用于行星的万有引力为 。行星移动速度为 。依照开普勒第三定律,这速度 与半径的平方根 成反比。所以,万有引力 。猜想这大概是牛顿发现万有引力定律的思路,但这个猜想无法被证实,因为在他的计算本里,并没有找到任何关于这方面的证据。
[/ltr][/size][size][ltr]
开普勒第一定律阐明,行星环绕太阳的轨道是椭圆形的。椭圆的面积是 ;这里, 与 分别为椭圆的半长轴与半短轴。在开普勒第二定律推导里,行星-太阳连线扫过区域速度 为
。
所以,行星公转周期 为
。(5)
关于此行星环绕太阳,椭圆的半长轴 ,半短轴 与近拱距 (近拱点 A 与引力中心之间的距离),远拱距 (远拱点 B 与引力中心之间的距离)的关系分别为
,(6) 。(7)
如果想要知道半长轴与半短轴,必须先求得近拱距与远拱距。依据能量守恒定律,
。
在近拱点 A 与远拱点 B,径向速度都等于零:
。
所以,
。
稍为加以编排,可以得到 的一元二次方程:
。
其两个根分别为椭圆轨道的近拱距 与远拱距 。
; 。
代入方程 (6) 与 (7) ,
, 。
代入方程 (5) ,周期的方程为
。
Q.E.D.
参见[编辑]
[/ltr][/size]
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
Random variable
From Wikipedia, the free encyclopedia
(Redirected from Stochastic variable)
[size][ltr]
In probability and statistics, arandom variable, aleatory variable or stochastic variable is a variable whose value is subject to variations due to chance (i.e.randomness, in a mathematical sense).[1]:391 A random variable can take on a set of possible different values (similarly to other mathematical variables), each with an associated probability(if discrete) or a probability density function (if continuous), in contrast to other mathematical variables.
A random variable's possible values might represent the possible outcomes of a yet-to-be-performed experiment, or the possible outcomes of a past experiment whose already-existing value is uncertain (for example, as a result of incomplete information or imprecise measurements). They may also conceptually represent either the results of an "objectively" random process (such as rolling a die) or the "subjective" randomness that results from incomplete knowledge of a quantity. The meaning of the probabilities assigned to the potential values of a random variable is not part of probability theory itself but is instead related to philosophical arguments over the interpretation of probability. The mathematics works the same regardless of the particular interpretation in use.
The mathematical function describing the possible values of a random variable and their associated probabilities is known as a probability distribution. Random variables can be discrete, that is, taking any of a specified finite or countable list of values, endowed with a probability mass function, characteristic of a probability distribution; or continuous, taking any numerical value in an interval or collection of intervals, via a probability density function that is characteristic of a probability distribution; or a mixture of both types. The realizations of a random variable, that is, the results of randomly choosing values according to the variable's probability distribution function, are called random variates.
The formal mathematical treatment of random variables is a topic inprobability theory. In that context, a random variable is understood as afunction defined on a sample space whose outputs are numerical values.[2]
[/ltr][/size]
[size][ltr]
Definition[edit]
Random variable is usually understood to mean a real-valued random variable; this discussion assumes real values. A random variable is a real-valued function defined on a set of possible outcomes, the sample spaceΩ. That is, the random variable is a function that maps from its domain, the sample space Ω, to its range, the real numbers or a subset of the real numbers. It is typically some kind of a property or measurement on the random outcome (for example, if the random outcome is a randomly chosen person, the random variable might be the person's height, or number of children).
The fine print: the admissible functions for defining random variables are limited to those for which a probability distribution exists, derivable from aprobability measure that turns the sample space into a probability space. That is, for the mapping to be an admissible random variable, it must be theoretically possible to compute the probability that the value of the random variable is less than any particular real number. Equivalently, the preimage of any range of values of the random variable must be a subset of Ω that has a defined probability; that is, there exists a subset of Ω, anevent, the probability of which is the same probability as the random variable being in the range of real numbers that that event maps to. Furthermore, the notion of a "range of values" here must be generalizable to the non-pathological subset of reals known as Borel sets.[3]
Random variables are typically distinguished as discrete versus continuous ones. Mixtures of both types also exist.
Discrete random variables can take on either a finite or at most acountably infinite set of discrete values (for example, the integers).[1]:392Their probability distribution is given by a probability mass function which directly maps each value of the random variable to a probability; for each possible value of the random variable, the probability is equal to the probability of the event containing all possible outcomes in Ω that map to that value.
Continuous random variables, on the other hand, take on values that vary continuously within one or more real intervals[1]:399, and have a cumulative distribution function (CDF) that is absolutely continuous. As a result, the random variable has an uncountably infinite number of possible values, all of which have probability 0, though ranges of such values can have nonzero probability. The resulting probability distribution of the random variable can be described by a probability density. (Some sources refer to this class as "absolutely continuous random variables", and allow a wider class of "continuous random variables",[4] including those with singular distributions, but note that these are typically not encountered in practical situations.[5])
Random variables with discontinuities in their CDFs can be treated as mixtures of discrete and continuous random variables.
Examples[edit]
For example, in an experiment a person may be chosen at random, and one random variable may be the person's height. Mathematically, the random variable is interpreted as a function which maps the person to the person's height. Associated with the random variable is a probability distribution that allows the computation of the probability that the height is in any non-pathological subset of possible values, such as probability that the height is between 180 and 190 cm, or the probability that the height is either less than 150 or more than 200 cm.
Another random variable may be the person's number of children; this is a discrete random variable with non-negative integer values. It allows the computation of probabilities for individual integer values – the probability mass function (PMF) – or for sets of values, including infinite sets. For example, the event of interest may be "an even number of children". For both finite and infinite event sets, their probabilities can be found by adding up the PMFs of the elements; that is, the probability of an even number of children is the infinite sum PMF(0) + PMF(2) + PMF(4) + ...
In examples such as these, the sample space (the set of all possible persons) is often suppressed, since it is mathematically hard to describe, and the possible values of the random variables are then treated as a sample space. But when two random variables are measured on the same sample space of outcomes, such as the height and number of children being computed on the same random persons, it is easier to track their relationship if it is acknowledged that both height and number of children come from the same random person, for example so that questions of whether such random variables are correlated or not can be posed.
Probability density[edit]
The probability distribution for continuous random variables can be defined using a probability density function (PDF or p.d.f), which indicates the "density" of probability in a small neighborhood around a given value. The probability that a random variable is in a particular range can then be computed from the integral of the probability density function over that range. The PDF is the derivative of the CDF.
Mixtures[edit]
Some random variables are neither discrete nor continuous, but a mixture of both types. Their CDF is not absolutely continuous, and a PDF does not exist. For example, a typical "sparse" continuous random variable may be exactly 0 with probability 0.9, and continuously distributed otherwise, so its CDF has a big jump discontinuity at 0. The PDF therefore does not exist as an ordinary function in this case, though such situations are easily handled by using a distribution instead of a function to represent a PDF, or by using other representations of measure.
Extensions[edit]
The basic concept of "random variable" in statistics is real-valued, and therefore expected values, variances and other measures can be computed. However, one can consider arbitrary types such as boolean values, categorical variables, complex numbers, vectors, matrices,sequences, trees, sets, shapes, manifolds, functions, and processes. The term random element is used to encompass all such related concepts.
Another extension is the stochastic process, a set of indexed random variables (typically indexed by time or space).
These more general concepts are particularly useful in fields such ascomputer science and natural language processing where many of the basic elements of analysis are non-numerical. Such general random elements can sometimes be treated as sets of real-valued random variables — often more specifically as random vectors. For example:
[/ltr][/size]
[size][ltr]
Reduction to numerical values is not essential for dealing with random elements: a randomly selected individual remains an individual, not a number.
Examples[edit]
The possible outcomes for one coin toss can be described by the sample space . We can introduce a real-valued random variable that models a $1 payoff for a successful bet on heads as follows:
If the coin is equally likely to land on either side then Y has a probability mass function given by:
[/ltr][/size]
If the sample space is the set of possible numbers rolled on two dice, and the random variable of interest is the sum S of the numbers on the two dice, then S is a discrete random variable whose distribution is described by the probability mass function plotted as the height of picture columns here.[size][ltr]
A random variable can also be used to describe the process of rolling dice and the possible outcomes. The most obvious representation for the two-dice case is to take the set of pairs of numbers n1 and n2 from {1, 2, 3, 4, 5, 6} representing the numbers on the two dice as the sample space, defining the random variable X to be equal to the total number rolled, the sum of the numbers in each pair. In this case, the random variable of interest X is defined as the function that maps the pair to the sum:
and has probability mass function ƒX given by:
An example of a continuous random variable would be one based on a spinner that can choose a horizontal direction. Then the values taken by the random variable are directions. We could represent these directions by North, West, East, South, Southeast, etc. However, it is commonly more convenient to map the sample space to a random variable which takes values which are real numbers. This can be done, for example, by mapping a direction to a bearing in degrees clockwise from North. The random variable then takes values which are real numbers from the interval [0, 360), with all parts of the range being "equally likely". In this case, X = the angle spun. Any real number has probability zero of being selected, but a positive probability can be assigned to any range of values. For example, the probability of choosing a number in [0, 180] is ½. Instead of speaking of a probability mass function, we say that the probabilitydensity of X is 1/360. The probability of a subset of [0, 360) can be calculated by multiplying the measure of the set by 1/360. In general, the probability of a set for a given continuous random variable can be calculated by integrating the density over the given set.
An example of a random variable of mixed type would be based on an experiment where a coin is flipped and the spinner is spun only if the result of the coin toss is heads. If the result is tails, X = −1; otherwise X = the value of the spinner as in the preceding example. There is a probability of ½ that this random variable will have the value −1. Other ranges of values would have half the probability of the last example.
Measure-theoretic definition[edit]
The most formal, axiomatic definition of a random variable involvesmeasure theory. Continuous random variables are defined in terms of setsof numbers, along with functions that map such sets to probabilities. Because of various difficulties (e.g. the Banach–Tarski paradox) that arise if such sets are insufficiently constrained, it is necessary to introduce what is termed a sigma-algebra to constrain the possible sets over which probabilities can be defined. Normally, a particular such sigma-algebra is used, the Borel σ-algebra, which allows for probabilities to be defined over any sets that can be derived either directly from continuous intervals of numbers or by a finite or countably infinite number of unions and/orintersections of such intervals.[2]
The measure-theoretic definition is as follows.
Let be a probability space and a measurable space. Then an -valued random variable is a function which is -measurable. The latter means that, for every subset , its preimage where .[6] This definition enables us to measure any subset in the target space by looking at its preimage, which by assumption is measurable.
When is a topological space, then the most common choice for the σ-algebra is the Borel σ-algebra , which is the σ-algebra generated by the collection of all open sets in . In such case the -valued random variable is called the -valued random variable. Moreover, when space is the real line , then such real-valued random variable is called simply the random variable.
Real-valued random variables[edit]
In this case the observation space is the real numbers. Recall, is the probability space. For real observation space, the function is a real-valued random variable if
This definition is a special case of the above because the set generates the Borel σ-algebra on the real numbers, and it suffices to check measurability on any generating set. Here we can prove measurability on this generating set by using the fact that .
Distribution functions of random variables[edit]
If a random variable defined on the probability space is given, we can ask questions like "How likely is it that the value of is equal to 2?". This is the same as the probability of the event which is often written as or for short.
Recording all these probabilities of output ranges of a real-valued random variable yields the probability distribution of . The probability distribution "forgets" about the particular probability space used to define and only records the probabilities of various values of . Such a probability distribution can always be captured by its cumulative distribution function
and sometimes also using a probability density function, . In measure-theoretic terms, we use the random variable to "push-forward" the measure on to a measure on . The underlying probability space is a technical device used to guarantee the existence of random variables, sometimes to construct them, and to define notions such ascorrelation and dependence or independence based on a joint distributionof two or more random variables on the same probability space. In practice, one often disposes of the space altogether and just puts a measure on that assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.
Moments[edit]
The probability distribution of a random variable is often characterised by a small number of parameters, which also have a practical interpretation. For example, it is often enough to know what its "average value" is. This is captured by the mathematical concept of expected value of a random variable, denoted E[X], and also called the first moment. In general, E[f(X)] is not equal to f(E[X]). Once the "average value" is known, one could then ask how far from this average value the values of X typically are, a question that is answered by the variance and standard deviation of a random variable. E[X] can be viewed intuitively as an average obtained from an infinite population, the members of which are particular evaluations of X.
Mathematically, this is known as the (generalised) problem of moments: for a given class of random variables X, find a collection {fi} of functions such that the expectation values E[fi(X)] fully characterise the distribution of the random variable X.
Moments can only be defined for real-valued functions of random variables (or complex-valued, etc.). If the random variable is itself real-valued, then moments of the variable itself can be taken, which are equivalent to moments of the identity function of the random variable. However, even for non-real-valued random variables, moments can be taken of real-valued functions of those variables. For example, for acategorical random variable X that can take on the nominal values "red", "blue" or "green", the real-valued function can be constructed; this uses the Iverson bracket, and has the value 1 if X has the value "green", 0 otherwise. Then, the expected value and other moments of this function can be determined.
Functions of random variables[edit]
A new random variable Y can be defined by applying a real Borel measurable function to the outcomes of a real-valuedrandom variable X. The cumulative distribution function of is
If function g is invertible, i.e. g−1 exists, and is either increasing or decreasing, then the previous relation can be extended to obtain
and, again with the same hypotheses of invertibility of g, assuming also differentiability, we can find the relation between the probability density functions by differentiating both sides with respect to y, in order to obtain
If there is no invertibility of g but each y admits at most a countable number of roots (i.e. a finite, or countably infinite, number of xi such that y = g(xi)) then the previous relation between the probability density functions can be generalized with
where xi = gi-1(y). The formulas for densities do not demand g to be increasing.
In the measure-theoretic, axiomatic approach to probability, if we have a random variable on and a Borel measurable function , then will also be a random variable on , since the composition of measurable functions is also measurable. (However, this is not true if is Lebesgue measurable.) The same procedure that allowed one to go from a probability space to can be used to obtain the distribution of .
Example 1[edit]
Let X be a real-valued, continuous random variable and let Y = X2.
If y < 0, then P(X2 ≤ y) = 0, so
If y ≥ 0, then
so
Example 2[edit]
Suppose is a random variable with a cumulative distribution
where is a fixed parameter. Consider the random variable Then,
The last expression can be calculated in terms of the cumulative distribution of so
Example 3[edit]
Suppose is a random variable with a standard normal distribution, whose density is
Consider the random variable We can find the density using the above formula for a change of variables:
In this case the change is not monotonic, because every value of has two corresponding values of (one positive and negative). However, because of symmetry, both halves will transform identically, i.e.
The inverse transformation is
and its derivative is
Then:
This is a chi-squared distribution with one degree of freedom.
Equivalence of random variables[edit]
There are several different senses in which random variables can be considered to be equivalent. Two random variables can be equal, equal almost surely, or equal in distribution.
In increasing order of strength, the precise definition of these notions of equivalence is given below.
Equality in distribution[edit]
If the sample space is a subset of the real line, random variables X and Yare equal in distribution (denoted ) if they have the same distribution functions:
Two random variables having equal moment generating functions have the same distribution. This provides, for example, a useful method of checking equality of certain functions of i.i.d. random variables. However, the moment generating function exists only for distributions that have a defined Laplace transform.
Almost sure equality[edit]
Two random variables X and Y are equal almost surely if, and only if, the probability that they are different is zero:
For all practical purposes in probability theory, this notion of equivalence is as strong as actual equality. It is associated to the following distance:
where "ess sup" represents the essential supremum in the sense ofmeasure theory.
Equality[edit]
Finally, the two random variables X and Y are equal if they are equal as functions on their measurable space:
Convergence[edit]
A significant theme in mathematical statistics consists of obtaining convergence results for certain sequences of random variables; for instance the law of large numbers and the central limit theorem.
There are various senses in which a sequence (Xn) of random variables can converge to a random variable X. These are explained in the article onconvergence of random variables.
See also[edit]
[/ltr][/size]
[size][ltr]
References[edit]
[/ltr][/size]
[size][ltr]
Literature[edit]
[/ltr][/size]
[size][ltr]
External links[size=13][edit]
[/size][/ltr][/size]
From Wikipedia, the free encyclopedia
(Redirected from Stochastic variable)
|
In probability and statistics, arandom variable, aleatory variable or stochastic variable is a variable whose value is subject to variations due to chance (i.e.randomness, in a mathematical sense).[1]:391 A random variable can take on a set of possible different values (similarly to other mathematical variables), each with an associated probability(if discrete) or a probability density function (if continuous), in contrast to other mathematical variables.
A random variable's possible values might represent the possible outcomes of a yet-to-be-performed experiment, or the possible outcomes of a past experiment whose already-existing value is uncertain (for example, as a result of incomplete information or imprecise measurements). They may also conceptually represent either the results of an "objectively" random process (such as rolling a die) or the "subjective" randomness that results from incomplete knowledge of a quantity. The meaning of the probabilities assigned to the potential values of a random variable is not part of probability theory itself but is instead related to philosophical arguments over the interpretation of probability. The mathematics works the same regardless of the particular interpretation in use.
The mathematical function describing the possible values of a random variable and their associated probabilities is known as a probability distribution. Random variables can be discrete, that is, taking any of a specified finite or countable list of values, endowed with a probability mass function, characteristic of a probability distribution; or continuous, taking any numerical value in an interval or collection of intervals, via a probability density function that is characteristic of a probability distribution; or a mixture of both types. The realizations of a random variable, that is, the results of randomly choosing values according to the variable's probability distribution function, are called random variates.
The formal mathematical treatment of random variables is a topic inprobability theory. In that context, a random variable is understood as afunction defined on a sample space whose outputs are numerical values.[2]
[/ltr][/size]
- 1 Definition
- 1.1 Examples
- 1.2 Probability density
- 1.3 Mixtures
- 1.4 Extensions
- 2 Examples
- 3 Measure-theoretic definition
- 3.1 Real-valued random variables
- 3.2 Distribution functions of random variables
- 4 Moments
- 5 Functions of random variables
- 5.1 Example 1
- 5.2 Example 2
- 5.3 Example 3
- 6 Equivalence of random variables
- 6.1 Equality in distribution
- 6.2 Almost sure equality
- 6.3 Equality
- 7 Convergence
- 8 See also
- 9 References
- 9.1 Literature
- 10 External links
[size][ltr]
Definition[edit]
Random variable is usually understood to mean a real-valued random variable; this discussion assumes real values. A random variable is a real-valued function defined on a set of possible outcomes, the sample spaceΩ. That is, the random variable is a function that maps from its domain, the sample space Ω, to its range, the real numbers or a subset of the real numbers. It is typically some kind of a property or measurement on the random outcome (for example, if the random outcome is a randomly chosen person, the random variable might be the person's height, or number of children).
The fine print: the admissible functions for defining random variables are limited to those for which a probability distribution exists, derivable from aprobability measure that turns the sample space into a probability space. That is, for the mapping to be an admissible random variable, it must be theoretically possible to compute the probability that the value of the random variable is less than any particular real number. Equivalently, the preimage of any range of values of the random variable must be a subset of Ω that has a defined probability; that is, there exists a subset of Ω, anevent, the probability of which is the same probability as the random variable being in the range of real numbers that that event maps to. Furthermore, the notion of a "range of values" here must be generalizable to the non-pathological subset of reals known as Borel sets.[3]
Random variables are typically distinguished as discrete versus continuous ones. Mixtures of both types also exist.
Discrete random variables can take on either a finite or at most acountably infinite set of discrete values (for example, the integers).[1]:392Their probability distribution is given by a probability mass function which directly maps each value of the random variable to a probability; for each possible value of the random variable, the probability is equal to the probability of the event containing all possible outcomes in Ω that map to that value.
Continuous random variables, on the other hand, take on values that vary continuously within one or more real intervals[1]:399, and have a cumulative distribution function (CDF) that is absolutely continuous. As a result, the random variable has an uncountably infinite number of possible values, all of which have probability 0, though ranges of such values can have nonzero probability. The resulting probability distribution of the random variable can be described by a probability density. (Some sources refer to this class as "absolutely continuous random variables", and allow a wider class of "continuous random variables",[4] including those with singular distributions, but note that these are typically not encountered in practical situations.[5])
Random variables with discontinuities in their CDFs can be treated as mixtures of discrete and continuous random variables.
Examples[edit]
For example, in an experiment a person may be chosen at random, and one random variable may be the person's height. Mathematically, the random variable is interpreted as a function which maps the person to the person's height. Associated with the random variable is a probability distribution that allows the computation of the probability that the height is in any non-pathological subset of possible values, such as probability that the height is between 180 and 190 cm, or the probability that the height is either less than 150 or more than 200 cm.
Another random variable may be the person's number of children; this is a discrete random variable with non-negative integer values. It allows the computation of probabilities for individual integer values – the probability mass function (PMF) – or for sets of values, including infinite sets. For example, the event of interest may be "an even number of children". For both finite and infinite event sets, their probabilities can be found by adding up the PMFs of the elements; that is, the probability of an even number of children is the infinite sum PMF(0) + PMF(2) + PMF(4) + ...
In examples such as these, the sample space (the set of all possible persons) is often suppressed, since it is mathematically hard to describe, and the possible values of the random variables are then treated as a sample space. But when two random variables are measured on the same sample space of outcomes, such as the height and number of children being computed on the same random persons, it is easier to track their relationship if it is acknowledged that both height and number of children come from the same random person, for example so that questions of whether such random variables are correlated or not can be posed.
Probability density[edit]
The probability distribution for continuous random variables can be defined using a probability density function (PDF or p.d.f), which indicates the "density" of probability in a small neighborhood around a given value. The probability that a random variable is in a particular range can then be computed from the integral of the probability density function over that range. The PDF is the derivative of the CDF.
Mixtures[edit]
Some random variables are neither discrete nor continuous, but a mixture of both types. Their CDF is not absolutely continuous, and a PDF does not exist. For example, a typical "sparse" continuous random variable may be exactly 0 with probability 0.9, and continuously distributed otherwise, so its CDF has a big jump discontinuity at 0. The PDF therefore does not exist as an ordinary function in this case, though such situations are easily handled by using a distribution instead of a function to represent a PDF, or by using other representations of measure.
Extensions[edit]
The basic concept of "random variable" in statistics is real-valued, and therefore expected values, variances and other measures can be computed. However, one can consider arbitrary types such as boolean values, categorical variables, complex numbers, vectors, matrices,sequences, trees, sets, shapes, manifolds, functions, and processes. The term random element is used to encompass all such related concepts.
Another extension is the stochastic process, a set of indexed random variables (typically indexed by time or space).
These more general concepts are particularly useful in fields such ascomputer science and natural language processing where many of the basic elements of analysis are non-numerical. Such general random elements can sometimes be treated as sets of real-valued random variables — often more specifically as random vectors. For example:
[/ltr][/size]
- A "random word" may be parameterized by an integer-valued index into the vocabulary of possible words; alternatively, as an indicator vector, in which exactly one element is a 1, and the others are 0, with the one indexing a particular word into a vocabulary.
- A "random sentence" may be parameterized as a vector of random words.
- A random graph, for a graph with V edges, may be parameterized as an NxN matrix, indicating the weight for each edge, or 0 for no edge. (If the graph has no weights, 1 indicates an edge; 0 indicates no edge.)
[size][ltr]
Reduction to numerical values is not essential for dealing with random elements: a randomly selected individual remains an individual, not a number.
Examples[edit]
The possible outcomes for one coin toss can be described by the sample space . We can introduce a real-valued random variable that models a $1 payoff for a successful bet on heads as follows:
If the coin is equally likely to land on either side then Y has a probability mass function given by:
[/ltr][/size]
If the sample space is the set of possible numbers rolled on two dice, and the random variable of interest is the sum S of the numbers on the two dice, then S is a discrete random variable whose distribution is described by the probability mass function plotted as the height of picture columns here.
A random variable can also be used to describe the process of rolling dice and the possible outcomes. The most obvious representation for the two-dice case is to take the set of pairs of numbers n1 and n2 from {1, 2, 3, 4, 5, 6} representing the numbers on the two dice as the sample space, defining the random variable X to be equal to the total number rolled, the sum of the numbers in each pair. In this case, the random variable of interest X is defined as the function that maps the pair to the sum:
and has probability mass function ƒX given by:
An example of a continuous random variable would be one based on a spinner that can choose a horizontal direction. Then the values taken by the random variable are directions. We could represent these directions by North, West, East, South, Southeast, etc. However, it is commonly more convenient to map the sample space to a random variable which takes values which are real numbers. This can be done, for example, by mapping a direction to a bearing in degrees clockwise from North. The random variable then takes values which are real numbers from the interval [0, 360), with all parts of the range being "equally likely". In this case, X = the angle spun. Any real number has probability zero of being selected, but a positive probability can be assigned to any range of values. For example, the probability of choosing a number in [0, 180] is ½. Instead of speaking of a probability mass function, we say that the probabilitydensity of X is 1/360. The probability of a subset of [0, 360) can be calculated by multiplying the measure of the set by 1/360. In general, the probability of a set for a given continuous random variable can be calculated by integrating the density over the given set.
An example of a random variable of mixed type would be based on an experiment where a coin is flipped and the spinner is spun only if the result of the coin toss is heads. If the result is tails, X = −1; otherwise X = the value of the spinner as in the preceding example. There is a probability of ½ that this random variable will have the value −1. Other ranges of values would have half the probability of the last example.
Measure-theoretic definition[edit]
The most formal, axiomatic definition of a random variable involvesmeasure theory. Continuous random variables are defined in terms of setsof numbers, along with functions that map such sets to probabilities. Because of various difficulties (e.g. the Banach–Tarski paradox) that arise if such sets are insufficiently constrained, it is necessary to introduce what is termed a sigma-algebra to constrain the possible sets over which probabilities can be defined. Normally, a particular such sigma-algebra is used, the Borel σ-algebra, which allows for probabilities to be defined over any sets that can be derived either directly from continuous intervals of numbers or by a finite or countably infinite number of unions and/orintersections of such intervals.[2]
The measure-theoretic definition is as follows.
Let be a probability space and a measurable space. Then an -valued random variable is a function which is -measurable. The latter means that, for every subset , its preimage where .[6] This definition enables us to measure any subset in the target space by looking at its preimage, which by assumption is measurable.
When is a topological space, then the most common choice for the σ-algebra is the Borel σ-algebra , which is the σ-algebra generated by the collection of all open sets in . In such case the -valued random variable is called the -valued random variable. Moreover, when space is the real line , then such real-valued random variable is called simply the random variable.
Real-valued random variables[edit]
In this case the observation space is the real numbers. Recall, is the probability space. For real observation space, the function is a real-valued random variable if
This definition is a special case of the above because the set generates the Borel σ-algebra on the real numbers, and it suffices to check measurability on any generating set. Here we can prove measurability on this generating set by using the fact that .
Distribution functions of random variables[edit]
If a random variable defined on the probability space is given, we can ask questions like "How likely is it that the value of is equal to 2?". This is the same as the probability of the event which is often written as or for short.
Recording all these probabilities of output ranges of a real-valued random variable yields the probability distribution of . The probability distribution "forgets" about the particular probability space used to define and only records the probabilities of various values of . Such a probability distribution can always be captured by its cumulative distribution function
and sometimes also using a probability density function, . In measure-theoretic terms, we use the random variable to "push-forward" the measure on to a measure on . The underlying probability space is a technical device used to guarantee the existence of random variables, sometimes to construct them, and to define notions such ascorrelation and dependence or independence based on a joint distributionof two or more random variables on the same probability space. In practice, one often disposes of the space altogether and just puts a measure on that assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.
Moments[edit]
The probability distribution of a random variable is often characterised by a small number of parameters, which also have a practical interpretation. For example, it is often enough to know what its "average value" is. This is captured by the mathematical concept of expected value of a random variable, denoted E[X], and also called the first moment. In general, E[f(X)] is not equal to f(E[X]). Once the "average value" is known, one could then ask how far from this average value the values of X typically are, a question that is answered by the variance and standard deviation of a random variable. E[X] can be viewed intuitively as an average obtained from an infinite population, the members of which are particular evaluations of X.
Mathematically, this is known as the (generalised) problem of moments: for a given class of random variables X, find a collection {fi} of functions such that the expectation values E[fi(X)] fully characterise the distribution of the random variable X.
Moments can only be defined for real-valued functions of random variables (or complex-valued, etc.). If the random variable is itself real-valued, then moments of the variable itself can be taken, which are equivalent to moments of the identity function of the random variable. However, even for non-real-valued random variables, moments can be taken of real-valued functions of those variables. For example, for acategorical random variable X that can take on the nominal values "red", "blue" or "green", the real-valued function can be constructed; this uses the Iverson bracket, and has the value 1 if X has the value "green", 0 otherwise. Then, the expected value and other moments of this function can be determined.
Functions of random variables[edit]
A new random variable Y can be defined by applying a real Borel measurable function to the outcomes of a real-valuedrandom variable X. The cumulative distribution function of is
If function g is invertible, i.e. g−1 exists, and is either increasing or decreasing, then the previous relation can be extended to obtain
and, again with the same hypotheses of invertibility of g, assuming also differentiability, we can find the relation between the probability density functions by differentiating both sides with respect to y, in order to obtain
If there is no invertibility of g but each y admits at most a countable number of roots (i.e. a finite, or countably infinite, number of xi such that y = g(xi)) then the previous relation between the probability density functions can be generalized with
where xi = gi-1(y). The formulas for densities do not demand g to be increasing.
In the measure-theoretic, axiomatic approach to probability, if we have a random variable on and a Borel measurable function , then will also be a random variable on , since the composition of measurable functions is also measurable. (However, this is not true if is Lebesgue measurable.) The same procedure that allowed one to go from a probability space to can be used to obtain the distribution of .
Example 1[edit]
Let X be a real-valued, continuous random variable and let Y = X2.
If y < 0, then P(X2 ≤ y) = 0, so
If y ≥ 0, then
so
Example 2[edit]
Suppose is a random variable with a cumulative distribution
where is a fixed parameter. Consider the random variable Then,
The last expression can be calculated in terms of the cumulative distribution of so
Example 3[edit]
Suppose is a random variable with a standard normal distribution, whose density is
Consider the random variable We can find the density using the above formula for a change of variables:
In this case the change is not monotonic, because every value of has two corresponding values of (one positive and negative). However, because of symmetry, both halves will transform identically, i.e.
The inverse transformation is
and its derivative is
Then:
This is a chi-squared distribution with one degree of freedom.
Equivalence of random variables[edit]
There are several different senses in which random variables can be considered to be equivalent. Two random variables can be equal, equal almost surely, or equal in distribution.
In increasing order of strength, the precise definition of these notions of equivalence is given below.
Equality in distribution[edit]
If the sample space is a subset of the real line, random variables X and Yare equal in distribution (denoted ) if they have the same distribution functions:
Two random variables having equal moment generating functions have the same distribution. This provides, for example, a useful method of checking equality of certain functions of i.i.d. random variables. However, the moment generating function exists only for distributions that have a defined Laplace transform.
Almost sure equality[edit]
Two random variables X and Y are equal almost surely if, and only if, the probability that they are different is zero:
For all practical purposes in probability theory, this notion of equivalence is as strong as actual equality. It is associated to the following distance:
where "ess sup" represents the essential supremum in the sense ofmeasure theory.
Equality[edit]
Finally, the two random variables X and Y are equal if they are equal as functions on their measurable space:
Convergence[edit]
A significant theme in mathematical statistics consists of obtaining convergence results for certain sequences of random variables; for instance the law of large numbers and the central limit theorem.
There are various senses in which a sequence (Xn) of random variables can converge to a random variable X. These are explained in the article onconvergence of random variables.
See also[edit]
[/ltr][/size]
Statistics portal |
- Aleatoricism
- Algebra of random variables
- Event (probability theory)
- Multivariate random variable
- Observable variable
- Probability distribution
- Random element
- Random function
- Random measure
- Randomness
- Stochastic process
[size][ltr]
References[edit]
[/ltr][/size]
- ^ Jump up to:a b c Yates, Daniel S.; Moore, David S; Starnes, Daren S. (2003). The Practice of Statistics (2nd ed.). New York: Freeman. ISBN 978-0-7167-4773-4.
- ^ Jump up to:a b Steigerwald, Douglas G. [url=http://econ.ucsb.edu/~doug/245a/Lectures/Measure Theory.pdf]"Economics 245A – Introduction to Measure Theory"[/url]. University of California, Santa Barbara. Retrieved April 26, 2013.
- Jump up^ Emanuel, Parzen (1962). Stochastic Processes. SIAM. p. 8.ISBN 9780898714418.
- Jump up^ L. Castañeda, V. Arunachalam, and S. Dharmaraja (2012). Introduction to Probability and Stochastic Processes with Applications. Wiley. p. 67.
- Jump up^ Epps, T. W. (2007). Pricing Derivative Securities. World Scientific. p. 52. ISBN 9789812700339.
- Jump up^ Fristedt & Gray (1996, page 11)
This article includes a list of references, butits sources remain unclear because it has insufficient inline citations. Please help to improve this article by introducingmore precise citations. (February 2012) |
Literature[edit]
[/ltr][/size]
- Fristedt, Bert; Gray, Lawrence (1996). A modern approach to probability theory. Boston: Birkhäuser. ISBN 3-7643-3807-5.
- Kallenberg, Olav (1986). Random Measures (4th ed.). Berlin: Akademie Verlag. ISBN 0-12-394960-2. MR MR0854102.
- Kallenberg, Olav (2001). Foundations of Modern Probability (2nd ed.). Berlin: Springer Verlag. ISBN 0-387-95313-2.
- Papoulis, Athanasios (1965). Probability, Random Variables, and Stochastic Processes (9th ed.). Tokyo: McGraw–Hill. ISBN 0-07-119981-0.
[size][ltr]
External links[size=13][edit]
[/size][/ltr][/size]
- Hazewinkel, Michiel, ed. (2001), "Random variable", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
Characteristic function (probability theory)
From Wikipedia, the free encyclopedia
[ltr]
In probability theory andstatistics, thecharacteristic functionof any real-valued random variable completely defines its probability distribution. If a random variable admits aprobability density function, then the characteristic function is the inverse Fourier transform of the probability density function. Thus it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the characteristic functions of distributions defined by the weighted sums of random variables.
In addition to univariate distributions, characteristic functions can be defined for vector- or matrix-valued random variables, and can even be extended to more generic cases.
The characteristic function always exists when treated as a function of a real-valued argument, unlike the moment-generating function. There are relations between the behavior of the characteristic function of a distribution and properties of the distribution, such as the existence of moments and the existence of a density function.[/ltr]
6 Uses
6.1 Basic manipulations of distributions
6.2 Moments
6.3 Data analysis
6.4 Example
7 Entire characteristic functions
8 Related concepts
9 See also
10 References
11 Notes
12 External links
[ltr]
Introduction[edit]
The characteristic function provides an alternative way for describing arandom variable. Similarly to the cumulative distribution function
( where 1{X ≤ x} is the indicator function — it is equal to 1 when X ≤ x, and zero otherwise), which completely determines behavior and properties of the probability distribution of the random variable X, the characteristic function
also completely determines behavior and properties of the probability distribution of the random variable X. The two approaches are equivalent in the sense that knowing one of the functions it is always possible to find the other, yet they both provide different insight for understanding the features of the random variable. However, in particular cases, there can be differences in whether these functions can be represented as expressions involving simple standard functions.
If a random variable admits a density function, then the characteristic function is its dual, in the sense that each of them is a Fourier transform of the other. If a random variable has a moment-generating function, then the domain of the characteristic function can be extended to the complex plane, and
[1]
Note however that the characteristic function of a distribution always exists, even when the probability density function or moment-generating functiondo not.
The characteristic function approach is particularly useful in analysis of linear combinations of independent random variables: a classical proof of the Central Limit Theorem uses characteristic functions and Lévy's continuity theorem. Another important application is to the theory of thedecomposability of random variables.
Definition[edit]
For a scalar random variable X the characteristic function is defined as the expected value of eitX, where i is the imaginary unit, and t ∈ R is the argument of the characteristic function:
Here FX is the cumulative distribution function of X, and the integral is of the Riemann–Stieltjes kind. If random variable X has a probability density function fX, then the characteristic function is its Fourier transform with sign reversal in the complex exponential,[2][3] and the last formula in parentheses is valid. QX(p) is the inverse cumulative distribution function of X also called the quantile function of X.[4]
It should be noted though, that this convention for the constants appearing in the definition of the characteristic function differs from the usual convention for the Fourier transform.[5] For example some authors[6]define φX(t) = Ee−2πitX, which is essentially a change of parameter. Other notation may be encountered in the literature: as the characteristic function for a probability measure p, or as the characteristic function corresponding to a density f.
Generalizations[edit]
The notion of characteristic functions generalizes to multivariate random variables and more complicated random elements. The argument of the characteristic function will always belong to the continuous dual of the space where random variable X takes values. For common cases such definitions are listed below:[/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
Here denotes matrix transpose, tr(·) — the matrix trace operator, Re(·) is the real part of a complex number, z denotes complex conjugate, and * isconjugate transpose (that is z* = zT ).
Examples[edit][/ltr]
[ltr]
Oberhettinger (1973) provides extensive tables of characteristic functions.
Properties[edit][/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
[/ltr]
[ltr]
One specific case is the sum of two independent random variablesX1 and X2 in which case one has[/ltr]
[ltr]
Continuity[edit]
The bijection stated above between probability distributions and characteristic functions is continuous. That is, whenever a sequence of distribution functions Fj(x) converges (weakly) to some distribution F(x), the corresponding sequence of characteristic functions φj(t) will also converge, and the limit φ(t) will correspond to the characteristic function of law F. More formally, this is stated as
Lévy’s continuity theorem: A sequence Xj of n-variate random variables converges in distribution to random variable X if and only if the sequence φX[sub]j[/sub] converges pointwise to a function φ which is continuous at the origin. Then φ is the characteristic function of X.[12]
This theorem is frequently used to prove the law of large numbers, and thecentral limit theorem.
Inversion formulas[edit]
Since there is a one-to-one correspondence between cumulative distribution functions and characteristic functions, it is always possible to find one of these functions if we know the other one. The formula in definition of characteristic function allows us to compute φ when we know the distribution function F (or density f). If, on the other hand, we know the characteristic function φ and want to find the corresponding distribution function, then one of the following inversion theorems can be used.
Theorem. If characteristic function φX is integrable, then FX is absolutely continuous, and therefore X has the probability density function given by
when X is scalar;
in multivariate case the pdf is understood as the Radon–Nikodym derivative of the distribution μX with respect to the Lebesgue measure λ:
Theorem (Lévy).[13] If φX is characteristic function of distribution functionFX, two points a are such that {x|a < x < b} is a continuity set of μX (in the univariate case this condition is equivalent to continuity of FX at pointsa and b), then[/ltr]
[ltr]
[/ltr]
[ltr]
Theorem. If a is (possibly) an atom of X (in the univariate case this means a point of discontinuity of FX ) then[/ltr]
[ltr]
[/ltr]
[ltr]
Theorem (Gil-Pelaez).[14] For a univariate random variable X, if x is a continuity point of FX then
The integral may be not Lebesgue-integrable; for example, when X is thediscrete random variable that is always 0, it becomes the Dirichlet integral.
Inversion formulas for multivariate distributions are available.[15]
Criteria for characteristic functions[edit]
First note that the set of all characteristic functions is closed under certain operations:[/ltr]
[ltr]
It is well known that any non-decreasing càdlàg function F with limits F(−∞) = 0, F(+∞) = 1 corresponds to a cumulative distribution function of some random variable. There is also interest in finding similar simple criteria for when a given function φ could be the characteristic function of some random variable. The central result here is Bochner’s theorem, although its usefulness is limited because the main condition of the theorem, non-negative definiteness, is very hard to verify. Other theorems also exist, such as Khinchine’s, Mathias’s, or Cramér’s, although their application is just as difficult. Pólya’s theorem, on the other hand, provides a very simple convexity condition which is sufficient but not necessary. Characteristic functions which satisfy this condition are called Pólya-type.[16]
Bochner’s theorem. An arbitrary function φ : Rn → C is the characteristic function of some random variable if and only if φ is positive definite, continuous at the origin, and if φ(0) = 1.
Khinchine’s criterion. A complex-valued, absolutely continuous function φ, with φ(0) = 1, is a characteristic function if and only if it admits the representation
Mathias’ theorem. A real-valued, even, continuous, absolutely integrable function φ, with φ(0) = 1, is a characteristic function if and only if
for n = 0,1,2,…, and all p > 0. Here H2n denotes the Hermite polynomial of degree 2n.[/ltr]
[ltr]
Pólya’s theorem. If φ is a real-valued, even, continuous function which satisfies the conditions[/ltr]
[ltr]
then φ(t) is the characteristic function of an absolutely continuous symmetric distribution.
Uses[edit]
Because of the continuity theorem, characteristic functions are used in the most frequently seen proof of the central limit theorem. The main trick involved in *** calculations with a characteristic function is recognizing the function as the characteristic function of a particular distribution.
Basic manipulations of distributions[edit]
Characteristic functions are particularly useful for dealing with linear functions of independent random variables. For example, if X1, X2, ..., Xnis a sequence of independent (and not necessarily identically distributed) random variables, and
where the ai are constants, then the characteristic function for Sn is given by
In particular, φX+Y(t) = φX(t)φY(t). To see this, write out the definition of characteristic function:
Observe that the independence of X and Y is required to establish the equality of the third and fourth expressions.
Another special case of interest is when ai = 1/n and then Sn is the sample mean. In this case, writing X for the mean,
Moments[edit]
Characteristic functions can also be used to find moments of a random variable. Provided that the nth moment exists, characteristic function can be differentiated n times and
For example, suppose X has a standard Cauchy distribution. ThenφX(t) = e−|t|. See how this is not differentiable at t = 0, showing that the Cauchy distribution has no expectation. Also see that the characteristic function of the sample mean X of n independent observations has characteristic function φX(t) = (e−|t|/n)n = e−|t|, using the result from the previous section. This is the characteristic function of the standard Cauchy distribution: thus, the sample mean has the same distribution as the population itself.
The logarithm of a characteristic function is a cumulant generating function, which is useful for finding cumulants; note that some instead define the cumulant generating function as the logarithm of the moment-generating function, and call the logarithm of the characteristic function thesecond cumulant generating function.
Data analysis[edit]
Characteristic functions can be used as part of procedures for fitting probability distributions to samples of data. Cases where this provides a practicable option compared to other possibilities include fitting the stable distribution since closed form expressions for the density are not available which makes implementation of maximum likelihood estimation difficult. Estimation procedures are available which match the theoretical characteristic function to the empirical characteristic function, calculated from the data. Paulson et al. (1975) and Heathcote (1977) provide some theoretical background for such an estimation procedure. In addition, Yu (2004) describes applications of empirical characteristic functions to fittime series models where likelihood procedures are impractical.
Example[edit]
The Gamma distribution with scale parameter θ and a shape parameter khas the characteristic function
Now suppose that we have
with X and Y independent from each other, and we wish to know what the distribution of X + Y is. The characteristic functions are
which by independence and the basic properties of characteristic function leads to
This is the characteristic function of the gamma distribution scale parameter θ and shape parameter k1 + k2, and we therefore conclude
The result can be expanded to n independent gamma distributed random variables with the same scale parameter and we get
Entire characteristic functions[edit][/ltr]
[ltr]
As defined above, the argument of the characteristic function is treated as a real number: however, certain aspects of the theory of characteristic functions are advanced by extending the definition into the complex plane by analytical continuation, in cases where this is possible.[17]
Related concepts[edit]
Related concepts include the moment-generating function and theprobability-generating function. The characteristic function exists for all probability distributions. This is not the case for the moment-generating function.
The characteristic function is closely related to the Fourier transform: the characteristic function of a probability density function p(x) is the complex conjugate of the continuous Fourier transform of p(x) (according to the usual convention; see continuous Fourier transform – other conventions).
where P(t) denotes the continuous Fourier transform of the probability density function p(x). Likewise, p(x) may be recovered from φX(t) through the inverse Fourier transform:
Indeed, even when the random variable does not have a density, the characteristic function may be seen as the Fourier transform of the measure corresponding to the random variable.
Another related concept is the representation of probability distributions as elements of a reproducing kernel Hilbert space via the kernel embedding of distributions. This framework may be viewed as a generalization of the characteristic function under specific choices of the kernel function.
See also[edit][/ltr]
From Wikipedia, the free encyclopedia
The characteristic function of a uniform U(–1,1) random variable. This function is real-valued because it corresponds to a random variable that is symmetric around the origin; however in general case characteristic functions may be complex-valued.
[ltr]
In probability theory andstatistics, thecharacteristic functionof any real-valued random variable completely defines its probability distribution. If a random variable admits aprobability density function, then the characteristic function is the inverse Fourier transform of the probability density function. Thus it provides the basis of an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the characteristic functions of distributions defined by the weighted sums of random variables.
In addition to univariate distributions, characteristic functions can be defined for vector- or matrix-valued random variables, and can even be extended to more generic cases.
The characteristic function always exists when treated as a function of a real-valued argument, unlike the moment-generating function. There are relations between the behavior of the characteristic function of a distribution and properties of the distribution, such as the existence of moments and the existence of a density function.[/ltr]
- 1 Introduction
- 2 Definition
- 3 Generalizations
- 4 Examples
- 5 Properties
- 5.1 Continuity
- 5.2 Inversion formulas
- 5.3 Criteria for characteristic functions
[ltr]
Introduction[edit]
The characteristic function provides an alternative way for describing arandom variable. Similarly to the cumulative distribution function
( where 1{X ≤ x} is the indicator function — it is equal to 1 when X ≤ x, and zero otherwise), which completely determines behavior and properties of the probability distribution of the random variable X, the characteristic function
also completely determines behavior and properties of the probability distribution of the random variable X. The two approaches are equivalent in the sense that knowing one of the functions it is always possible to find the other, yet they both provide different insight for understanding the features of the random variable. However, in particular cases, there can be differences in whether these functions can be represented as expressions involving simple standard functions.
If a random variable admits a density function, then the characteristic function is its dual, in the sense that each of them is a Fourier transform of the other. If a random variable has a moment-generating function, then the domain of the characteristic function can be extended to the complex plane, and
[1]
Note however that the characteristic function of a distribution always exists, even when the probability density function or moment-generating functiondo not.
The characteristic function approach is particularly useful in analysis of linear combinations of independent random variables: a classical proof of the Central Limit Theorem uses characteristic functions and Lévy's continuity theorem. Another important application is to the theory of thedecomposability of random variables.
Definition[edit]
For a scalar random variable X the characteristic function is defined as the expected value of eitX, where i is the imaginary unit, and t ∈ R is the argument of the characteristic function:
Here FX is the cumulative distribution function of X, and the integral is of the Riemann–Stieltjes kind. If random variable X has a probability density function fX, then the characteristic function is its Fourier transform with sign reversal in the complex exponential,[2][3] and the last formula in parentheses is valid. QX(p) is the inverse cumulative distribution function of X also called the quantile function of X.[4]
It should be noted though, that this convention for the constants appearing in the definition of the characteristic function differs from the usual convention for the Fourier transform.[5] For example some authors[6]define φX(t) = Ee−2πitX, which is essentially a change of parameter. Other notation may be encountered in the literature: as the characteristic function for a probability measure p, or as the characteristic function corresponding to a density f.
Generalizations[edit]
The notion of characteristic functions generalizes to multivariate random variables and more complicated random elements. The argument of the characteristic function will always belong to the continuous dual of the space where random variable X takes values. For common cases such definitions are listed below:[/ltr]
- If X is a k-dimensional random vector, then for t ∈ Rk
[ltr]
[/ltr]
- If X is a k×p-dimensional random matrix, then for t ∈ Rk×p
[ltr]
[/ltr]
- If X is a complex random variable, then for t ∈ C [7]
[ltr]
[/ltr]
- If X is a k-dimensional complex random vector, then for t ∈ Ck[8]
[ltr]
[/ltr]
- If X(s) is a stochastic process, then for all functions t(s) such that the integral ∫Rt(s)X(s)ds converges for almost all realizations of X [9]
[ltr]
Here denotes matrix transpose, tr(·) — the matrix trace operator, Re(·) is the real part of a complex number, z denotes complex conjugate, and * isconjugate transpose (that is z* = zT ).
Examples[edit][/ltr]
Degenerate δa | |
Bernoulli Bern(p) | |
Binomial B(n, p) | |
Negative binomial NB(r, p) | |
Poisson Pois(λ) | |
Uniform U(a, b) | |
Laplace L(μ, b) | |
Normal N(μ, σ2) | |
Chi-squared χ2k | |
Cauchy C(μ, θ) | |
Gamma Γ(k, θ) | |
Exponential Exp(λ) | |
Multivariate normal N(μ, Σ) | |
Multivariate Cauchy MultiCauchy(μ, Σ) [10] |
Oberhettinger (1973) provides extensive tables of characteristic functions.
Properties[edit][/ltr]
- The characteristic function of a real-valued random variable always exists, since it is an integral of a bounded continuous function over a space whose measure is finite.
- A characteristic function is uniformly continuous on the entire space
- It is non-vanishing in a region around zero: φ(0) = 1.
- It is bounded: |φ(t)| ≤ 1.
- It is Hermitian: φ(−t) = φ(t). In particular, the characteristic function of a symmetric (around the origin) random variable is real-valued and even.
- There is a bijection between distribution functions and characteristic functions. That is, for any two random variables X1, X2
[ltr]
[/ltr]
- If a random variable X has moments up to k-th order, then the characteristic function φX is k times continuously differentiable on the entire real line. In this case
[ltr]
[/ltr]
- If a characteristic function φX has a k-th derivative at zero, then the random variable X has all moments up to k if k is even, but only up tok – 1 if k is odd.[11]
[ltr]
[/ltr]
- If X1, …, Xn are independent random variables, and a1, …, an are some constants, then the characteristic function of the linear combination of the Xi 's is
[ltr]
One specific case is the sum of two independent random variablesX1 and X2 in which case one has[/ltr]
- The tail behavior of the characteristic function determines thesmoothness of the corresponding density function.
[ltr]
Continuity[edit]
The bijection stated above between probability distributions and characteristic functions is continuous. That is, whenever a sequence of distribution functions Fj(x) converges (weakly) to some distribution F(x), the corresponding sequence of characteristic functions φj(t) will also converge, and the limit φ(t) will correspond to the characteristic function of law F. More formally, this is stated as
Lévy’s continuity theorem: A sequence Xj of n-variate random variables converges in distribution to random variable X if and only if the sequence φX[sub]j[/sub] converges pointwise to a function φ which is continuous at the origin. Then φ is the characteristic function of X.[12]
This theorem is frequently used to prove the law of large numbers, and thecentral limit theorem.
Inversion formulas[edit]
Since there is a one-to-one correspondence between cumulative distribution functions and characteristic functions, it is always possible to find one of these functions if we know the other one. The formula in definition of characteristic function allows us to compute φ when we know the distribution function F (or density f). If, on the other hand, we know the characteristic function φ and want to find the corresponding distribution function, then one of the following inversion theorems can be used.
Theorem. If characteristic function φX is integrable, then FX is absolutely continuous, and therefore X has the probability density function given by
when X is scalar;
in multivariate case the pdf is understood as the Radon–Nikodym derivative of the distribution μX with respect to the Lebesgue measure λ:
Theorem (Lévy).[13] If φX is characteristic function of distribution functionFX, two points a are such that {x|a < x < b} is a continuity set of μX (in the univariate case this condition is equivalent to continuity of FX at pointsa and b), then[/ltr]
- If X is scalar:
[ltr]
[/ltr]
- If X is a vector random variable:
[ltr]
Theorem. If a is (possibly) an atom of X (in the univariate case this means a point of discontinuity of FX ) then[/ltr]
- If X is scalar:
[ltr]
[/ltr]
- If X is a vector random variable:
[ltr]
Theorem (Gil-Pelaez).[14] For a univariate random variable X, if x is a continuity point of FX then
The integral may be not Lebesgue-integrable; for example, when X is thediscrete random variable that is always 0, it becomes the Dirichlet integral.
Inversion formulas for multivariate distributions are available.[15]
Criteria for characteristic functions[edit]
First note that the set of all characteristic functions is closed under certain operations:[/ltr]
- A convex linear combination (with ) of a finite or a countable number of characteristic functions is also a characteristic function.
- The product of a finite number of characteristic functions is also a characteristic function. The same holds for an infinite product provided that it converges to a function continuous at the origin.
- If φ is a characteristic function and α is a real number, then , Re(φ), |φ|2, and φ(αt) are also characteristic functions.
[ltr]
It is well known that any non-decreasing càdlàg function F with limits F(−∞) = 0, F(+∞) = 1 corresponds to a cumulative distribution function of some random variable. There is also interest in finding similar simple criteria for when a given function φ could be the characteristic function of some random variable. The central result here is Bochner’s theorem, although its usefulness is limited because the main condition of the theorem, non-negative definiteness, is very hard to verify. Other theorems also exist, such as Khinchine’s, Mathias’s, or Cramér’s, although their application is just as difficult. Pólya’s theorem, on the other hand, provides a very simple convexity condition which is sufficient but not necessary. Characteristic functions which satisfy this condition are called Pólya-type.[16]
Bochner’s theorem. An arbitrary function φ : Rn → C is the characteristic function of some random variable if and only if φ is positive definite, continuous at the origin, and if φ(0) = 1.
Khinchine’s criterion. A complex-valued, absolutely continuous function φ, with φ(0) = 1, is a characteristic function if and only if it admits the representation
Mathias’ theorem. A real-valued, even, continuous, absolutely integrable function φ, with φ(0) = 1, is a characteristic function if and only if
for n = 0,1,2,…, and all p > 0. Here H2n denotes the Hermite polynomial of degree 2n.[/ltr]
[ltr]
Pólya’s theorem. If φ is a real-valued, even, continuous function which satisfies the conditions[/ltr]
- φ(0) = 1,
- φ is convex for t > 0,
- φ(∞) = 0,
[ltr]
then φ(t) is the characteristic function of an absolutely continuous symmetric distribution.
Uses[edit]
Because of the continuity theorem, characteristic functions are used in the most frequently seen proof of the central limit theorem. The main trick involved in *** calculations with a characteristic function is recognizing the function as the characteristic function of a particular distribution.
Basic manipulations of distributions[edit]
Characteristic functions are particularly useful for dealing with linear functions of independent random variables. For example, if X1, X2, ..., Xnis a sequence of independent (and not necessarily identically distributed) random variables, and
where the ai are constants, then the characteristic function for Sn is given by
In particular, φX+Y(t) = φX(t)φY(t). To see this, write out the definition of characteristic function:
Observe that the independence of X and Y is required to establish the equality of the third and fourth expressions.
Another special case of interest is when ai = 1/n and then Sn is the sample mean. In this case, writing X for the mean,
Moments[edit]
Characteristic functions can also be used to find moments of a random variable. Provided that the nth moment exists, characteristic function can be differentiated n times and
For example, suppose X has a standard Cauchy distribution. ThenφX(t) = e−|t|. See how this is not differentiable at t = 0, showing that the Cauchy distribution has no expectation. Also see that the characteristic function of the sample mean X of n independent observations has characteristic function φX(t) = (e−|t|/n)n = e−|t|, using the result from the previous section. This is the characteristic function of the standard Cauchy distribution: thus, the sample mean has the same distribution as the population itself.
The logarithm of a characteristic function is a cumulant generating function, which is useful for finding cumulants; note that some instead define the cumulant generating function as the logarithm of the moment-generating function, and call the logarithm of the characteristic function thesecond cumulant generating function.
Data analysis[edit]
Characteristic functions can be used as part of procedures for fitting probability distributions to samples of data. Cases where this provides a practicable option compared to other possibilities include fitting the stable distribution since closed form expressions for the density are not available which makes implementation of maximum likelihood estimation difficult. Estimation procedures are available which match the theoretical characteristic function to the empirical characteristic function, calculated from the data. Paulson et al. (1975) and Heathcote (1977) provide some theoretical background for such an estimation procedure. In addition, Yu (2004) describes applications of empirical characteristic functions to fittime series models where likelihood procedures are impractical.
Example[edit]
The Gamma distribution with scale parameter θ and a shape parameter khas the characteristic function
Now suppose that we have
with X and Y independent from each other, and we wish to know what the distribution of X + Y is. The characteristic functions are
which by independence and the basic properties of characteristic function leads to
This is the characteristic function of the gamma distribution scale parameter θ and shape parameter k1 + k2, and we therefore conclude
The result can be expanded to n independent gamma distributed random variables with the same scale parameter and we get
Entire characteristic functions[edit][/ltr]
This section requires expansion.(December 2009) |
As defined above, the argument of the characteristic function is treated as a real number: however, certain aspects of the theory of characteristic functions are advanced by extending the definition into the complex plane by analytical continuation, in cases where this is possible.[17]
Related concepts[edit]
Related concepts include the moment-generating function and theprobability-generating function. The characteristic function exists for all probability distributions. This is not the case for the moment-generating function.
The characteristic function is closely related to the Fourier transform: the characteristic function of a probability density function p(x) is the complex conjugate of the continuous Fourier transform of p(x) (according to the usual convention; see continuous Fourier transform – other conventions).
where P(t) denotes the continuous Fourier transform of the probability density function p(x). Likewise, p(x) may be recovered from φX(t) through the inverse Fourier transform:
Indeed, even when the random variable does not have a density, the characteristic function may be seen as the Fourier transform of the measure corresponding to the random variable.
Another related concept is the representation of probability distributions as elements of a reproducing kernel Hilbert space via the kernel embedding of distributions. This framework may be viewed as a generalization of the characteristic function under specific choices of the kernel function.
See also[edit][/ltr]
- Subindependence, a weaker condition than independence, that is defined in terms of characteristic functions.
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
3.5 Symmetry and Hopf Algebras
Whoever understands symmetries can understand everything in this world.
Folklore
Symmetries play a crucial role in mathematics and physics. As a rule, it is only
possible to explicitly solve a mathematical or physical problem if a certain symmetry
is available. In the history of sciences, mathematicians and physicists encountered
more and more complicated symmetries.
3.5.1 The Strategy of Coordinatization in Mathematics and
Physics
In what follows, we will show that appropriate complex-valued coordinate functions
χj : G → C, j= 1, 2, . . . (3.45)
on a group G generate a Hopf algebra denoted by H(G). This is the so-called coordinate
algebra of G. In ancient times, mathematicians studied geometric objects.
The ‘coordinatization’ of geometric objects arose in analytic geometry founded by
Descartes (1596–1650). The physicist Galilei (1564–1642) wrote:
Measure everything that is measurable, and make measurable everything
that is not yet so.
In terms of physics,
(i) the group G above describes abstract physical quantities, and
(ii) the coordinate functions χ1, χ2, . . . correspond to measurements performed by
an observer.
Typically, the mathematical structure concerning (ii) looks more complicated (e.g.,
Hopf algebras appear) than the mathematical structure concerning (i). Furthermore,
note the following:
The theory of Hopf algebras allows us to study symmetries which are not
necessarily related to the classical theory of Lie groups and Lie algebras.
For example, this leads to quantum groups. In the physics literature, roughly speaking,
Hopf algebras are also called quantum groups. Sometimes, only special Hopf
algebras are called quantum groups (e.g., one-parameter deformations of the enveloping
algebras of semi-simple Lie algebras).
In (3.45), the translation of properties of the group G into properties of the
coordinate functions corresponds to a general strategy used in mathematics and
physics:
Investigate mathematical objects by studying families of maps defined on
the objects.
In topology, this leads to the crucial concept of cohomology. In physics, this means
that we investigate the properties of the space-time by studying physical fields
depending on space and time. For the convenience of the reader, let us summarize
mathematical and physical topics which are closely related to the concept of Hopf
algebra:
• factorization of the scattering matrix and the Yang–Baxter equation,
• integrable models in statistical physics and the Yang–Baxter equation,
• solutions of the Yang–Baxter equation by means of Hopf algebras and the braid
group,
• Artin’s braid group and braid group statistics of particles,
• integrable models in quantum field theory and quantum groups,
• conformal field theory, Virasora algebras, affine Lie algebras (Kac–Moody algebras),
Verma modules, vertex algebras, and operator products,
• vertex algebras, the completion of the classification of the finite simple groups by
discovering the monster group, which acts on the monstrous moonshine algebra,
• vertex algebras and algebraic curves in algebraic geometry,
• complex function theory, Riemann su***ces, conformal field theory, and strings,
• fusion rules for Feynman diagrams in conformal field theory and the Verlinde
formula,
• models in quantum gravitation, the Moyal product, and the Seiberg–Witten map,
• generalized differential calculi (with Leibniz rule) and noncommutative geometry,
• quantum groups and new topological invariants of knots and 3-dimensional manifolds
due to Jones (related to von Neumann algebras), Vassiliev, and Kontsevich,
• Witten’s topological quantum field theory and topological invariants of knots
and 3-dimensional manifolds,
• number theory (lattices, modular forms, and zeta functions),
• Frobenius manifolds, quantum cohomology and moduli spaces.
In mathematics and physics, one frequently encounters the solution of problems by
using iterative processes.
3.5.2 The Coordinate Hopf Algebra of a Finite Group
Finite groups can equivalently described by incidence numbers.
Folklore
Let G be a finite group with the unit element e. Let H(G) denote the set of all
complex-valued functions
ϕ : G → C
on the group G. For functions ϕ, ψ ∈ H(G), group elements g, h ∈ G, and complex
numbers α, β, we define the following operations:
(i) Linear combination: (αϕ + βψ)(g) := αϕ(g) + βψ(g).
(ii) Product: (ϕψ)(g) := ϕ(g)ψ(g).
(iii) Unit element: 1(g) := 1.
(iv) Coproduct: (Δϕ)(g, h) := ϕ(gh).
(v) Counit: ε(ϕ) := ϕ(e).
(vi) Coinverse: (Sϕ)(g) := ϕ(g−1).
Concerning the definition of αϕ+βψ,ϕψ, 1,Δ(ϕ), ε(ϕ), and S(ϕ), note the following:
• The coproduct Δ sends complex-valued functions g → ϕ(g) of one variable on
the group G to functions (g, h) → ϕ(gh) of two variables by using the group
product gh.
• The counit ε sends complex-valued functions g → ϕ(g) of one variable on the
group G to complex numbers ϕ(e) by using the unit element e of the group.
• The coinverse S sends complex-valued functions g → ϕ(g) of one variable on G
to functions g → ϕ(g−1) of one variable by using the inverse [size=14]g[sup]−1[size=13] of the group[/size][/sup][/size]
element g.
The maps Δ, S, and ε are linear and multiplicative. This means that, for all functions
ϕ, ψ : G → C and all complex numbers α, β, we have
Δ(αϕ + βψ) := αΔϕ + βΔψ, Δ(ϕψ) = ΔϕΔψ.
The same is true if we replace Δ by S (resp. ε).
Recall the following definition. For given functions ϕ, ψ : G → C, we define the
tensor product ϕ ⊗ ψ by setting
(ϕ ⊗ ψ)(g, h) := ϕ(g)ψ(h) for all g, h ∈ G.
This is a function of the form ϕ⊗ψ : G×G → C. In other words, ϕ⊗ψ is a function
of two variables. Note the following peculiarity which follows from the finiteness of
the group G. The functions
χ : G × G → C
of two variables are in one-to-one correspondence to the elements of the tensor
product H(G)⊗H(G). We will show this in (3.46) below by using a basis. In this
sense, the coproduct is a map of the form Δ : H(G)→H(G)⊗H(G).
Proposition 3.8 The algebra H(G) is a commutative Hopf algebra. The dimension
of H(G) is equal to the number of group elements. The Hopf algebra H(G) is
cocommutative iff the group G is commutative.
The proof will be given in Problem 3.11 on page 172.
3.5.4 The Tannaka–Krein Duality for Compact Lie Groups
Try to dualize in mathematics as much as you can.
Folklore
We want to generalize the classical exponential function
χ(x) := exp(ix), x∈ R
in the setting of groups and C
∗
-algebras. The starting point is the functional equation
χ(x + y) = χ(x)χ(y) for all x, y ∈ R.
This means that the map χ : R → U(1) is a group morphism from the additive
group R of real numbers onto the multiplicative group U(1) of the unit circle in
the complex plane. Explicitly, U(1) := {z ∈ C : |z| = 1}. The group U(1) is the
prototype of a compact Lie group. Our goal is to use the group U(1) in order to
study the structure of more general objects, namely, groups and C∗-algebras. By
definition, the character of a group G is a group morphism χ : G → U(1), that is,
χ(gh) = χ(g)χ(h) for all g, h ∈ G.
For example, the characters of the group U(1) are given by the following maps
χn : U(1) → U(1), n = 0,±1,±2, . . . , where
χn(exp(iϕ)) := exp(inϕ), ϕ∈ R. (3.47)
In terms of the angle variable ϕ, this corresponds to the map ϕ → nϕ. The integer
n is called the winding number (or the topological charge) of the map χn. In this
connection, let us discuss the following dualities:
• de Rham duality for manifolds and cohomology,
• Pontryagin duality for commutative compact groups,
• Tannaka–Krein duality for noncommutative compact groups,
• Gelfand–Naimark duality for commutative C∗-algebras and noncommutative geometry
(see Vol. IV).
This fits into the following general strategy in mathematics:
Study the structure of a mathematical object X by investigating maps
χ : X → Y
that live on the object X. From the physical point of view, the prototype
for this strategy is given by physical fields χ that live on the space-time
manifold X.
De Rham duality. Let X be a manifold, and let χ : X → Y be a differential
form on X. Using the Cartan derivative dχ of the differential form χ with the crucial
property
d(dχ) = 0,
that is, dd = 0, the topological properties of the manifold X can be studied by
de Rham cohomology which lies at the heart of modern differential topology. This
yields a powerful theory of topological invariants (called topological charges in
physics) including the theory of characteristic classes and Chern numbers (see Chap.
5 of Vol. I). Cohomology also lies at the heart of both the methods of BRSTquantization
and algebraic renormalization.
3.6 Regularization and Rota–Baxter Algebras
It was noted long ago by Tricomi, and later independently by Cotlar, that
the Hilbert transform
(Hf)(x) := limε→+0f(ξ)/(ξ − x)dξ, x ∈ R, |ξ|≥ε
operating on a suitable function algebra, satisfies the identity
(Hf) (Hf)= ff + 2H(fHf).
Later on, Glen Baxter was the first to point out that the evaluation of
several functionals of sums of independent random variables depended on
a purely algebraic study of a closely related identity,
Pf · Pg = P(Pf · g + f · Pg − wfg).
Here, the fixed real parameter w is called the weight of the Rota–Baxter
operator P.27 The very same identity reappeared in the same guise in
various estimates involving the iteration of the maximum function
x → max(x, 0),
such as occur in the theory of almost-everywhere convergence. . . By looking
at the problem in the rarified atmosphere of universal algebra, we were
led to a systematic method for guessing and verifying identities for Baxter
operators, based upon reducing all computations to identities between
symmetric functions.
Gian-Carlo Rota and David Smith, 1972
It is our goal to generalize the classical operations of
• differentiation and integration of smooth functions, and
• regularization of singular functions
by using deformations. This will lead us to the notion of Rota–Baxter operators
with weights. Hopf algebras and Rota–Baxter algebras play a crucial role in modern
renormalization theory, as we will study later on.
Let A be a complex algebra. Consider an operator P :A→A. One frequently
encounters the following special cases.
(i) Linear map: The operator P is called linear iff we have
P(αf + βg) = αPf + βPg
for all f, g ∈ A and all complex numbers α, β.
(ii) Antilinear map: The operator P is called antilinear iff
P(αf + βg) = α†Pf + β†Pg
for all f, g ∈ A and all complex numbers α, β.
(iii) Multiplicative map: The operator P is called multiplicative iff for all f, g ∈ A,
we have the product property
P(fg) = Pf · Pg.
The operator P is called an endomorphism of the algebra A iff it is linear and
multiplicative. Bijective endomorphisms are called automorphisms.
(iv) Anti-multiplicative map: The operator P is called anti-multiplicative iff for all
f, g ∈ A,
P(fg) = Pg · Pf.
The operator P is called an anti-endomorphism of the algebra A iff it is antilinear
and anti-multiplicative. Bijective anti-endomorphisms are called antiautomorphisms.
(v) Derivation: The operator P is called a derivation iff it is linear, and for all
f, g ∈ A, we have the following Leibniz (product) rule:
P(fg) = Pf · g + f · Pg. (3.49)
Derivations are also called infinitesimal endomorphisms (or generalized differential
operators).
(vi) Inverse derivation: The operator P is called an inverse derivation iff it is linear
and for all f, g ∈ A, we have the following rule:
Pf · Pg = P(Pf · g + f · Pg). (3.50)
As we will show below, this rule generalizes integration by parts. Inverse derivations
are also called generalized integral operators.
(vii) Truncation operator: The operator P is called a truncation iff it is linear with
the projector property P2 = P and with the truncation property, that is, for
all f, g ∈ A we have
Pf · Pg = P(Pf · g + f · Pg − fg). (3.51)
The operator R := I − P is called a regularization operator.
(viii) Rota–Baxter operator: Fix the real number w. The operator P is called a
Rota–Baxter operator of weight w iff it is linear and for all f, g ∈ A, we have
the relation
Pf · Pg = P(Pf · g + f · Pg − wfg). (3.52)
Obviously, the inverse derivation (vi) (resp. the truncation operator (vii)) is a
Rota–Baxter operator of weight w = 0 (resp. w = 1). If P is a Rota–Baxter
operator of nonzero weight w, then 1/wP (resp. −w/P) is a Rota–Baxter
operator of weight 1 (resp. −1).
(ix) Rota–Baxter algebra: By definition, a Rota–Baxter algebra of weight w is a
complex commutative unital algebra A equipped with a fixed Rota–Baxter
operator of weight w.
Differentiation and integration. Set A := E(R), that is, the complex algebra
A consists of all the smooth functions f : R → C.
• Fix n = 1, 2, . . . The operator
Pf := f...f(n个) for all f ∈ A
is multiplicative. This operator is linear iff n = 1. In this special case, the operator
P is the trivial automorphism of the algebra A.
• The operator Pf := f
†
is anti-linear, anti-multiplicative, and bijective. Hence
it is an anti-automorphism of A. (Since f†g†= g†f†, the operator A is also
multiplicative.)
• For fixed real number q, set
(Pf)(x) := f(qx) for all x ∈ R.
Then the operator P :A→A is an endomorphism.
• The operator Pf := df
dx is a derivation of A. In fact, the Leibniz rule of differentiation
tells us that
d(fg)/dx=df/dx· g + f · dg/dx
This is precisely relation (3.49).
• Define the integral operator (Pf)(x) := 0-->x f(u)du for all f ∈ A. Choosing
arbitrary functions f, g ∈ A, we obtain
P(Pf · g + f · Pg) = Pf · Pg.
This means that the operator P : A → A is a Rota–Baxter operator of weight
w = 0. In fact, setting F(x) :=0-->x f(u)du and G(x) := 0-->x g(u)du, and noting
that G(0) = F(0) = 0, integration by parts yields
0-->x(F(u)g(u) + f(u)G(u)) du =0-->xd{F(u)G(u)}/du du = F(x)G(x).
The Importance of the Exponential Function in
Mathematics and Physics
The exponential function is the most important function in mathematics.
Folklore
The following exponential and logarithmic formulas play a crucial role:
• The Dyson series via time-ordering operator (see Sect. 7.17.4 of Vol. I).
• The Trotter exponential formula (see Sect. 8.3 of Vol. I).
• The Baker–Campbell–Hausdorff exponential formula for Lie algebras (see Sect.
8.4 of Vol. I).
• The Fa`a di Bruno exponential formula for Bell polynomials (see (3.32) on page
136).
• The logarithmic formula for Schur polynomials (see (3.39) on page 140).
• The logarithmic formula for reduced correlation functions (or connected Green’s
functions) (see (3.41) on page 142).
• The logarithmic formula for cumulants (see (3.44) on page 144).
• Group characters (see Sect. 3.5.4 on page 152).
• The Volterra–Spitzer formula (see (3.57) on page 160).
• The Spitzer formula for Rota–Baxter algebras (see (3.61) on page 161).
• The perturbation formula (see (3.70) on page 167).
Whoever understands symmetries can understand everything in this world.
Folklore
Symmetries play a crucial role in mathematics and physics. As a rule, it is only
possible to explicitly solve a mathematical or physical problem if a certain symmetry
is available. In the history of sciences, mathematicians and physicists encountered
more and more complicated symmetries.
3.5.1 The Strategy of Coordinatization in Mathematics and
Physics
In what follows, we will show that appropriate complex-valued coordinate functions
χj : G → C, j= 1, 2, . . . (3.45)
on a group G generate a Hopf algebra denoted by H(G). This is the so-called coordinate
algebra of G. In ancient times, mathematicians studied geometric objects.
The ‘coordinatization’ of geometric objects arose in analytic geometry founded by
Descartes (1596–1650). The physicist Galilei (1564–1642) wrote:
Measure everything that is measurable, and make measurable everything
that is not yet so.
In terms of physics,
(i) the group G above describes abstract physical quantities, and
(ii) the coordinate functions χ1, χ2, . . . correspond to measurements performed by
an observer.
Typically, the mathematical structure concerning (ii) looks more complicated (e.g.,
Hopf algebras appear) than the mathematical structure concerning (i). Furthermore,
note the following:
The theory of Hopf algebras allows us to study symmetries which are not
necessarily related to the classical theory of Lie groups and Lie algebras.
For example, this leads to quantum groups. In the physics literature, roughly speaking,
Hopf algebras are also called quantum groups. Sometimes, only special Hopf
algebras are called quantum groups (e.g., one-parameter deformations of the enveloping
algebras of semi-simple Lie algebras).
In (3.45), the translation of properties of the group G into properties of the
coordinate functions corresponds to a general strategy used in mathematics and
physics:
Investigate mathematical objects by studying families of maps defined on
the objects.
In topology, this leads to the crucial concept of cohomology. In physics, this means
that we investigate the properties of the space-time by studying physical fields
depending on space and time. For the convenience of the reader, let us summarize
mathematical and physical topics which are closely related to the concept of Hopf
algebra:
• factorization of the scattering matrix and the Yang–Baxter equation,
• integrable models in statistical physics and the Yang–Baxter equation,
• solutions of the Yang–Baxter equation by means of Hopf algebras and the braid
group,
• Artin’s braid group and braid group statistics of particles,
• integrable models in quantum field theory and quantum groups,
• conformal field theory, Virasora algebras, affine Lie algebras (Kac–Moody algebras),
Verma modules, vertex algebras, and operator products,
• vertex algebras, the completion of the classification of the finite simple groups by
discovering the monster group, which acts on the monstrous moonshine algebra,
• vertex algebras and algebraic curves in algebraic geometry,
• complex function theory, Riemann su***ces, conformal field theory, and strings,
• fusion rules for Feynman diagrams in conformal field theory and the Verlinde
formula,
• models in quantum gravitation, the Moyal product, and the Seiberg–Witten map,
• generalized differential calculi (with Leibniz rule) and noncommutative geometry,
• quantum groups and new topological invariants of knots and 3-dimensional manifolds
due to Jones (related to von Neumann algebras), Vassiliev, and Kontsevich,
• Witten’s topological quantum field theory and topological invariants of knots
and 3-dimensional manifolds,
• number theory (lattices, modular forms, and zeta functions),
• Frobenius manifolds, quantum cohomology and moduli spaces.
In mathematics and physics, one frequently encounters the solution of problems by
using iterative processes.
3.5.2 The Coordinate Hopf Algebra of a Finite Group
Finite groups can equivalently described by incidence numbers.
Folklore
Let G be a finite group with the unit element e. Let H(G) denote the set of all
complex-valued functions
ϕ : G → C
on the group G. For functions ϕ, ψ ∈ H(G), group elements g, h ∈ G, and complex
numbers α, β, we define the following operations:
(i) Linear combination: (αϕ + βψ)(g) := αϕ(g) + βψ(g).
(ii) Product: (ϕψ)(g) := ϕ(g)ψ(g).
(iii) Unit element: 1(g) := 1.
(iv) Coproduct: (Δϕ)(g, h) := ϕ(gh).
(v) Counit: ε(ϕ) := ϕ(e).
(vi) Coinverse: (Sϕ)(g) := ϕ(g−1).
Concerning the definition of αϕ+βψ,ϕψ, 1,Δ(ϕ), ε(ϕ), and S(ϕ), note the following:
• The coproduct Δ sends complex-valued functions g → ϕ(g) of one variable on
the group G to functions (g, h) → ϕ(gh) of two variables by using the group
product gh.
• The counit ε sends complex-valued functions g → ϕ(g) of one variable on the
group G to complex numbers ϕ(e) by using the unit element e of the group.
• The coinverse S sends complex-valued functions g → ϕ(g) of one variable on G
to functions g → ϕ(g−1) of one variable by using the inverse [size=14]g[sup]−1[size=13] of the group[/size][/sup][/size]
element g.
The maps Δ, S, and ε are linear and multiplicative. This means that, for all functions
ϕ, ψ : G → C and all complex numbers α, β, we have
Δ(αϕ + βψ) := αΔϕ + βΔψ, Δ(ϕψ) = ΔϕΔψ.
The same is true if we replace Δ by S (resp. ε).
Recall the following definition. For given functions ϕ, ψ : G → C, we define the
tensor product ϕ ⊗ ψ by setting
(ϕ ⊗ ψ)(g, h) := ϕ(g)ψ(h) for all g, h ∈ G.
This is a function of the form ϕ⊗ψ : G×G → C. In other words, ϕ⊗ψ is a function
of two variables. Note the following peculiarity which follows from the finiteness of
the group G. The functions
χ : G × G → C
of two variables are in one-to-one correspondence to the elements of the tensor
product H(G)⊗H(G). We will show this in (3.46) below by using a basis. In this
sense, the coproduct is a map of the form Δ : H(G)→H(G)⊗H(G).
Proposition 3.8 The algebra H(G) is a commutative Hopf algebra. The dimension
of H(G) is equal to the number of group elements. The Hopf algebra H(G) is
cocommutative iff the group G is commutative.
The proof will be given in Problem 3.11 on page 172.
3.5.4 The Tannaka–Krein Duality for Compact Lie Groups
Try to dualize in mathematics as much as you can.
Folklore
We want to generalize the classical exponential function
χ(x) := exp(ix), x∈ R
in the setting of groups and C
∗
-algebras. The starting point is the functional equation
χ(x + y) = χ(x)χ(y) for all x, y ∈ R.
This means that the map χ : R → U(1) is a group morphism from the additive
group R of real numbers onto the multiplicative group U(1) of the unit circle in
the complex plane. Explicitly, U(1) := {z ∈ C : |z| = 1}. The group U(1) is the
prototype of a compact Lie group. Our goal is to use the group U(1) in order to
study the structure of more general objects, namely, groups and C∗-algebras. By
definition, the character of a group G is a group morphism χ : G → U(1), that is,
χ(gh) = χ(g)χ(h) for all g, h ∈ G.
For example, the characters of the group U(1) are given by the following maps
χn : U(1) → U(1), n = 0,±1,±2, . . . , where
χn(exp(iϕ)) := exp(inϕ), ϕ∈ R. (3.47)
In terms of the angle variable ϕ, this corresponds to the map ϕ → nϕ. The integer
n is called the winding number (or the topological charge) of the map χn. In this
connection, let us discuss the following dualities:
• de Rham duality for manifolds and cohomology,
• Pontryagin duality for commutative compact groups,
• Tannaka–Krein duality for noncommutative compact groups,
• Gelfand–Naimark duality for commutative C∗-algebras and noncommutative geometry
(see Vol. IV).
This fits into the following general strategy in mathematics:
Study the structure of a mathematical object X by investigating maps
χ : X → Y
that live on the object X. From the physical point of view, the prototype
for this strategy is given by physical fields χ that live on the space-time
manifold X.
De Rham duality. Let X be a manifold, and let χ : X → Y be a differential
form on X. Using the Cartan derivative dχ of the differential form χ with the crucial
property
d(dχ) = 0,
that is, dd = 0, the topological properties of the manifold X can be studied by
de Rham cohomology which lies at the heart of modern differential topology. This
yields a powerful theory of topological invariants (called topological charges in
physics) including the theory of characteristic classes and Chern numbers (see Chap.
5 of Vol. I). Cohomology also lies at the heart of both the methods of BRSTquantization
and algebraic renormalization.
3.6 Regularization and Rota–Baxter Algebras
It was noted long ago by Tricomi, and later independently by Cotlar, that
the Hilbert transform
(Hf)(x) := limε→+0f(ξ)/(ξ − x)dξ, x ∈ R, |ξ|≥ε
operating on a suitable function algebra, satisfies the identity
(Hf) (Hf)= ff + 2H(fHf).
Later on, Glen Baxter was the first to point out that the evaluation of
several functionals of sums of independent random variables depended on
a purely algebraic study of a closely related identity,
Pf · Pg = P(Pf · g + f · Pg − wfg).
Here, the fixed real parameter w is called the weight of the Rota–Baxter
operator P.27 The very same identity reappeared in the same guise in
various estimates involving the iteration of the maximum function
x → max(x, 0),
such as occur in the theory of almost-everywhere convergence. . . By looking
at the problem in the rarified atmosphere of universal algebra, we were
led to a systematic method for guessing and verifying identities for Baxter
operators, based upon reducing all computations to identities between
symmetric functions.
Gian-Carlo Rota and David Smith, 1972
It is our goal to generalize the classical operations of
• differentiation and integration of smooth functions, and
• regularization of singular functions
by using deformations. This will lead us to the notion of Rota–Baxter operators
with weights. Hopf algebras and Rota–Baxter algebras play a crucial role in modern
renormalization theory, as we will study later on.
Let A be a complex algebra. Consider an operator P :A→A. One frequently
encounters the following special cases.
(i) Linear map: The operator P is called linear iff we have
P(αf + βg) = αPf + βPg
for all f, g ∈ A and all complex numbers α, β.
(ii) Antilinear map: The operator P is called antilinear iff
P(αf + βg) = α†Pf + β†Pg
for all f, g ∈ A and all complex numbers α, β.
(iii) Multiplicative map: The operator P is called multiplicative iff for all f, g ∈ A,
we have the product property
P(fg) = Pf · Pg.
The operator P is called an endomorphism of the algebra A iff it is linear and
multiplicative. Bijective endomorphisms are called automorphisms.
(iv) Anti-multiplicative map: The operator P is called anti-multiplicative iff for all
f, g ∈ A,
P(fg) = Pg · Pf.
The operator P is called an anti-endomorphism of the algebra A iff it is antilinear
and anti-multiplicative. Bijective anti-endomorphisms are called antiautomorphisms.
(v) Derivation: The operator P is called a derivation iff it is linear, and for all
f, g ∈ A, we have the following Leibniz (product) rule:
P(fg) = Pf · g + f · Pg. (3.49)
Derivations are also called infinitesimal endomorphisms (or generalized differential
operators).
(vi) Inverse derivation: The operator P is called an inverse derivation iff it is linear
and for all f, g ∈ A, we have the following rule:
Pf · Pg = P(Pf · g + f · Pg). (3.50)
As we will show below, this rule generalizes integration by parts. Inverse derivations
are also called generalized integral operators.
(vii) Truncation operator: The operator P is called a truncation iff it is linear with
the projector property P2 = P and with the truncation property, that is, for
all f, g ∈ A we have
Pf · Pg = P(Pf · g + f · Pg − fg). (3.51)
The operator R := I − P is called a regularization operator.
(viii) Rota–Baxter operator: Fix the real number w. The operator P is called a
Rota–Baxter operator of weight w iff it is linear and for all f, g ∈ A, we have
the relation
Pf · Pg = P(Pf · g + f · Pg − wfg). (3.52)
Obviously, the inverse derivation (vi) (resp. the truncation operator (vii)) is a
Rota–Baxter operator of weight w = 0 (resp. w = 1). If P is a Rota–Baxter
operator of nonzero weight w, then 1/wP (resp. −w/P) is a Rota–Baxter
operator of weight 1 (resp. −1).
(ix) Rota–Baxter algebra: By definition, a Rota–Baxter algebra of weight w is a
complex commutative unital algebra A equipped with a fixed Rota–Baxter
operator of weight w.
Differentiation and integration. Set A := E(R), that is, the complex algebra
A consists of all the smooth functions f : R → C.
• Fix n = 1, 2, . . . The operator
Pf := f...f(n个) for all f ∈ A
is multiplicative. This operator is linear iff n = 1. In this special case, the operator
P is the trivial automorphism of the algebra A.
• The operator Pf := f
†
is anti-linear, anti-multiplicative, and bijective. Hence
it is an anti-automorphism of A. (Since f†g†= g†f†, the operator A is also
multiplicative.)
• For fixed real number q, set
(Pf)(x) := f(qx) for all x ∈ R.
Then the operator P :A→A is an endomorphism.
• The operator Pf := df
dx is a derivation of A. In fact, the Leibniz rule of differentiation
tells us that
d(fg)/dx=df/dx· g + f · dg/dx
This is precisely relation (3.49).
• Define the integral operator (Pf)(x) := 0-->x f(u)du for all f ∈ A. Choosing
arbitrary functions f, g ∈ A, we obtain
P(Pf · g + f · Pg) = Pf · Pg.
This means that the operator P : A → A is a Rota–Baxter operator of weight
w = 0. In fact, setting F(x) :=0-->x f(u)du and G(x) := 0-->x g(u)du, and noting
that G(0) = F(0) = 0, integration by parts yields
0-->x(F(u)g(u) + f(u)G(u)) du =0-->xd{F(u)G(u)}/du du = F(x)G(x).
The Importance of the Exponential Function in
Mathematics and Physics
The exponential function is the most important function in mathematics.
Folklore
The following exponential and logarithmic formulas play a crucial role:
• The Dyson series via time-ordering operator (see Sect. 7.17.4 of Vol. I).
• The Trotter exponential formula (see Sect. 8.3 of Vol. I).
• The Baker–Campbell–Hausdorff exponential formula for Lie algebras (see Sect.
8.4 of Vol. I).
• The Fa`a di Bruno exponential formula for Bell polynomials (see (3.32) on page
136).
• The logarithmic formula for Schur polynomials (see (3.39) on page 140).
• The logarithmic formula for reduced correlation functions (or connected Green’s
functions) (see (3.41) on page 142).
• The logarithmic formula for cumulants (see (3.44) on page 144).
• Group characters (see Sect. 3.5.4 on page 152).
• The Volterra–Spitzer formula (see (3.57) on page 160).
• The Spitzer formula for Rota–Baxter algebras (see (3.61) on page 161).
• The perturbation formula (see (3.70) on page 167).
由一星于2014-08-15, 04:09进行了最后一次编辑,总共编辑了5次
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
Braid group
From Wikipedia, the free encyclopedia
[ltr]In mathematics, the braid group on n strands, denoted by Bn, is a groupwhich has an intuitive geometrical representation, and in a sense generalizes the symmetric group Sn. Here, n is a natural number; if n > 1, then Bn is an infinite group. Braid groups find applications in knot theory, since any knot may be represented as the closure of certain braids.[/ltr]
2 Basic properties
2.1 Generators and relations
2.2 Further properties
3 Interactions of braid groups
3.1 Relation with symmetric group and the pure braid group
3.2 Relation between B3 and the modular group
3.3 Relationship to the mapping class group and classification of braids
3.4 Connection to knot theory
3.5 Computational aspects
4 Actions of braid groups
4.1 Representations
5 Infinitely generated braid groups
6 See also
7 Notes
7.1 References
7.2 Further reading
8 External links
[ltr]
Introduction[edit]
Intuitive description[edit]
In this introduction let n = 4; the generalization to other values of n will be straightforward. Consider two sets of four items lying on a table, with the items in each set being arranged in a vertical line, and such that one set sits next to the other. (In the illustrations below, these are the black dots.) Using four strands, each item of the first set is connected with an item of the second set so that a one-to-one correspondence results. Such a connection is called a braid. Often some strands will have to pass over or under others, and this is crucial: the following two connections are differentbraids:[/ltr]
[ltr]
On the other hand, two such connections which can be made to look the same by "pulling the strands" are considered the same braid:[/ltr]
[ltr]
All strands are required to move from left to right; knots like the following are not considered braids:[/ltr]
[ltr]
Any two braids can be composed by drawing the first next to the second, identifying the four items in the middle, and connecting corresponding strands:[/ltr]
[ltr]
Another example:[/ltr]
[ltr]
The composition of the braids σ and τ is written as στ.
The set of all braids on four strands is denoted by B4. The above composition of braids is indeed a group operation. The identity element is the braid consisting of four parallel horizontal strands, and the inverse of a braid consists of that braid which "undoes" whatever the first braid did, which is obtained by flipping a diagram such as the ones above across a vertical line going through its centre. (The first two example braids above are inverses of each other.)
Formal treatment[edit]
To put the above informal discussion of braid groups on firm ground, one needs to use the homotopy concept of algebraic topology, defining braid groups as fundamental groups of a configuration space. This is outlined in the article on braid theory.
Alternatively, one can define the braid group purely algebraically via the braid relations, keeping the pictures in mind only to guide the intuition.
History[edit]
Braid groups were introduced explicitly by Emil Artin in 1925, although (asWilhelm Magnus pointed out in 1974[1]) they were already implicit in Adolf Hurwitz's work on monodromy (1891). In fact, as Magnus says, Hurwitz gave the interpretation of a braid group as the fundamental group of a configuration space (cf. braid theory), an interpretation that was lost from view until it was rediscovered by Ralph Fox and Lee Neuwirth in 1962.
Basic properties[edit]
Generators and relations[edit]
Consider the following three braids:[/ltr]
[ltr]
Every braid in B4 can be written as a composition of a number of these braids and their inverses. In other words, these three braids generate the group B4. To see this, an arbitrary braid is scanned from left to right for crossings; beginning at the top, whenever a crossing of strands i andi + 1 is encountered, σi or σi−1 is written down, depending on whether strand i moves under or over strand i + 1. Upon reaching the right hand end, the braid has been written as a product of the σ's and their inverses.
It is clear that
(i) σ1σ3 = σ3σ1,
while the following two relations are not quite as obvious:
(iia) σ1σ2σ1 = σ2σ1σ2,(iib) σ2σ3σ2 = σ3σ2σ3
(these can be appreciated best by drawing the braid on a piece of paper). It can be shown that all other relations among the braids σ1, σ2 and σ3already follow from these relations and the group axioms.
Generalising this example to n strands, the group Bn can be abstractly defined via the following presentation:
where in the first group of relations 1 ≤ i ≤ n−2 and in the second group of relations, |i − j| ≥ 2. This presentation leads to generalisations of braid groups called Artin groups. The cubic relations, known as the braid relations, play an important role in the theory of Yang–Baxter equation.
Further properties[edit][/ltr]
[ltr]
Interactions of braid groups[edit]
Relation with symmetric group and the pure braid group[edit]
By forgetting how the strands twist and cross, every braid on n strands determines a permutation on n elements. This assignment is onto, compatible with composition, and therefore becomes a surjective group homomorphism Bn → Sn from the braid group into the symmetric group. The image of the braid σi ∈ Bn is the transposition si = (i, i+1) ∈ Sn. These transpositions generate the symmetric group, satisfy the braid group relations, and have order 2. This transforms the Artin presentation of the braid group into the Coxeter presentation of the symmetric group:
The kernel of the homomorphism Bn → Sn is the subgroup of Bn called the pure braid group on n strands and denoted Pn. In a pure braid, the beginning and the end of each strand are in the like positions. Pure braid groups fit into a short exact sequence
This sequence splits and therefore pure braid groups are realized as iterated semi-direct products of free groups.
Relation between B3 and the modular group[edit][/ltr]
[ltr]
The braid group B3is theuniversal central extensionof themodular groupPSL(2, Z), with these sitting as lattices inside the (topological) universal covering group
Furthermore, the modular group has trivial center, and thus the modular group is isomorphic to the quotient group of B3 modulo its center, Z(B3), and equivalently, to the group of inner automorphisms of B3.
Here is a construction of this isomorphism. Define
From the braid relations it follows that a2 = b3. Denoting this latter product as c, one may verify from the braid relations that
implying that c is in the center of B3. Let C denote the subgroup of B3generated by c, since C ⊂ Z(B3), it is a normal subgroup and one may take the quotient group B3/C. We claim B3/C ≅ PSL(2, Z); this isomorphism can be given an explicit form. The cosets σ1C and σ2C map to
where L and R are the standard left and right moves on the Stern-Brocot tree; it is well known that these moves generate the modular group.
Alternately, one common presentation for the modular group is
where
Mapping a to v and b to p yields a surjective group homomorphismB3 → PSL(2, Z).
The center of B3 is equal to C, a consequence of the facts that c is in the center, the modular group has trivial center, and the above surjective homomorphism has kernel C.
Relationship to the mapping class group and classification of braids[edit]
The braid group Bn can be shown to be isomorphic to the mapping class group of a punctured disk with n punctures. This is most easily visualized by imagining each puncture as being connected by a string to the boundary of the disk; each mapping homomorphism that permutes two of the punctures can then be seen to be a homotopy of the strings, that is, a braiding of these strings.
Via this mapping class group interpretation of braids, each braid may be classified as periodic, reducible or pseudo-Anosov.
Connection to knot theory[edit]
If a braid is given and one connects the first left-hand item to the first right-hand item using a new string, the second left-hand item to the second right-hand item etc. (without creating any braids in the new strings), one obtains a link, and sometimes a knot. Alexander's theorem in braid theorystates that the converse is true as well: every knot and every link arises in this fashion from at least one braid; such a braid can be obtained by cutting the link. Since braids can be concretely given as words in the generators σi, this is often the preferred method of entering knots into computer programs.
Computational aspects[edit]
The word problem for the braid relations is efficiently solvable and there exists a normal form for elements of Bn in terms of the generatorsσ1, ..., σn−1. (In essence, computing the normal form of a braid is the algebraic analogue of "pulling the strands" as illustrated in our second set of images above.) The free GAP computer algebra system can carry out computations in Bn if the elements are given in terms of these generators. There is also a package called CHEVIE for GAP3 with special support for braid groups. The word problem is also efficiently solved via the Lawrence-Krammer representation.
Since there are nevertheless several hard computational problems about braid groups, applications in cryptography have been suggested.
Actions of braid groups[edit]
In analogy with the action of the symmetric group by permutations, in various mathematical settings there exists a natural action the braid group on n-tuples of objects or on the n-folded tensor product that involves some "twists". Consider an arbitrary group G and let X be the set of all n-tuples of elements of G whose product is the identity element of G. ThenBn acts on X in the following fashion:
Thus the elements xi and xi+1 exchange places and, in addition, xi is twisted by the inner automorphism corresponding to xi+1 — this ensures that the product of the components of x remains the identity element. It may be checked that the braid group relations are satisfied and this formula indeed defines a group action of Bn on X. As another example, abraided monoidal category is a monoidal category with a braid group action. Such structures play an important role in modern mathematical physics and lead to quantum knot invariants.
Representations[edit]
Elements of the braid group Bn can be represented more concretely by matrices. One classical such representation is Burau representation, where the matrix entries are single variable Laurent polynomials. It had been a long-standing question whether Burau representation was faithful, but the answer turned out to be negative for n ≥ 5. More generally, it was a major open problem whether braid groups were linear. In 1990, Ruth Lawrence described a family of more general "Lawrence representations" depending on several parameters. In 1996, C. Nayak and Frank Wilczekposited that in analogy to projective representations of SO(3), the projective representations of the braid group have a physical meaning for certain quasiparticles in the fractional quantum hall effect. Around 2001 Stephen Bigelow and Daan Krammer independently proved that all braid groups are linear. Their work used the Lawrence–Krammer representationof dimension n(n−1)/2 depending on the variables q and t. By suitably specializing these variables, the braid group Bn may be realized as a subgroup of the general linear group over the complex numbers.
Infinitely generated braid groups[edit]
There are many ways to generalize this notion to an infinite number of strands. The simplest way is take the direct limit of braid groups, where the attaching maps f : Bn → Bn+1 send the n−1 generators of Bn to the firstn−1 generators of Bn+1 (i.e., by attaching a trivial strand). Fabel has shown that there are two topologies that can be imposed on the resulting group each of whose completion yields a different group. One is a very tame group and is isomorphic to the mapping class group of the infinitely punctured disk — a discrete set of punctures limiting to the boundary of the disk.
The second group can be thought of the same as with finite braid groups. Place a strand at each of the points (0, 1/n) and the set of all braids — where a braid is defined to be a collection of paths from the points(0, 1/n, 0) to the points (0, 1/n, 1) so that the function yields a permutation on endpoints — is isomorphic to this wilder group. An interesting fact is that the pure braid group in this group is isomorphic to both the inverse limit of finite pure braid groups Pn and to the fundamental group of the Hilbert cube minus the set
See also[edit][/ltr]
[ltr]
Notes[edit][/ltr]
[ltr]
References[edit][/ltr]
[ltr]
Further reading[edit][/ltr]
[ltr]
External links[edit][/ltr]
Yang–Baxter equation
From Wikipedia, the free encyclopedia
[size][ltr]
In physics, the Yang–Baxter equation (or star-*** relation) is a consistency equation which was first introduced in the field of statistical mechanics. It depends on the idea that in some scattering situations, particles may preserve their momentum while changing their quantum internal states. It states that a matrix , acting on two out of three objects, satisfies
In one dimensional quantum systems, is the scattering matrix and if it satisfies the Yang–Baxter equation then the system is integrable. The Yang–Baxter equation also shows up when discussing knot theory and thebraid groups where corresponds to swapping two strands. Since one can swap three strands two different ways, the Yang–Baxter equation enforces that both paths are the same.
[/ltr][/size][size][ltr]
It takes its name from independent work of C. N. Yang from 1968, andR. J. Baxter from 1971.
[/ltr][/size]
[size][ltr]
Parameter-dependent Yang–Baxter equation[edit]
Let be a unital associative algebra. The parameter-dependent Yang–Baxter equation is an equation for , a parameter-dependentinvertible element of the tensor product (here, is the parameter, which usually ranges over all real numbers in the case of an additive parameter, or over all positive real numbers in the case of a multiplicative parameter). The Yang–Baxter equation is
for all values of and , in the case of an additive parameter. At some value of the parameter can turn into one dimensional projector, this gives rise to quantum determinant. For multiplicative parameter Yang–Baxter equation is
for all values of and , where , , and , for all values of the parameter , and , , and are algebra morphisms determined by
In some cases the determinant of can vanish at specific values of the spectral parameter . Some matrices turn into one dimensional projector at . In this case quantum determinant can be defined.
Parameter-independent Yang–Baxter equation[edit]
Let be a unital associative algebra. The parameter-independent Yang–Baxter equation is an equation for , an invertible element of the tensor product . The Yang–Baxter equation is
where , , and .
Let be a module of . Let be the linear map satisfying for all . Then arepresentation of the braid group, , can be constructed on by for , where on . This representation can be used to determine quasi-invariants of braids, knots and links.
See also[edit]
[/ltr][/size]
[size][ltr]
References[edit]
[/ltr][/size]
[size][ltr]
External links[edit]
[/ltr][/size]
From Wikipedia, the free encyclopedia
[ltr]In mathematics, the braid group on n strands, denoted by Bn, is a groupwhich has an intuitive geometrical representation, and in a sense generalizes the symmetric group Sn. Here, n is a natural number; if n > 1, then Bn is an infinite group. Braid groups find applications in knot theory, since any knot may be represented as the closure of certain braids.[/ltr]
[ltr]
Introduction[edit]
Intuitive description[edit]
In this introduction let n = 4; the generalization to other values of n will be straightforward. Consider two sets of four items lying on a table, with the items in each set being arranged in a vertical line, and such that one set sits next to the other. (In the illustrations below, these are the black dots.) Using four strands, each item of the first set is connected with an item of the second set so that a one-to-one correspondence results. Such a connection is called a braid. Often some strands will have to pass over or under others, and this is crucial: the following two connections are differentbraids:[/ltr]
is different from |
On the other hand, two such connections which can be made to look the same by "pulling the strands" are considered the same braid:[/ltr]
is the same as |
All strands are required to move from left to right; knots like the following are not considered braids:[/ltr]
is not a braid |
Any two braids can be composed by drawing the first next to the second, identifying the four items in the middle, and connecting corresponding strands:[/ltr]
composed with | yields |
Another example:[/ltr]
composed with | yields |
The composition of the braids σ and τ is written as στ.
The set of all braids on four strands is denoted by B4. The above composition of braids is indeed a group operation. The identity element is the braid consisting of four parallel horizontal strands, and the inverse of a braid consists of that braid which "undoes" whatever the first braid did, which is obtained by flipping a diagram such as the ones above across a vertical line going through its centre. (The first two example braids above are inverses of each other.)
Formal treatment[edit]
To put the above informal discussion of braid groups on firm ground, one needs to use the homotopy concept of algebraic topology, defining braid groups as fundamental groups of a configuration space. This is outlined in the article on braid theory.
Alternatively, one can define the braid group purely algebraically via the braid relations, keeping the pictures in mind only to guide the intuition.
History[edit]
Braid groups were introduced explicitly by Emil Artin in 1925, although (asWilhelm Magnus pointed out in 1974[1]) they were already implicit in Adolf Hurwitz's work on monodromy (1891). In fact, as Magnus says, Hurwitz gave the interpretation of a braid group as the fundamental group of a configuration space (cf. braid theory), an interpretation that was lost from view until it was rediscovered by Ralph Fox and Lee Neuwirth in 1962.
Basic properties[edit]
Generators and relations[edit]
Consider the following three braids:[/ltr]
σ1 | σ2 | σ3 |
Every braid in B4 can be written as a composition of a number of these braids and their inverses. In other words, these three braids generate the group B4. To see this, an arbitrary braid is scanned from left to right for crossings; beginning at the top, whenever a crossing of strands i andi + 1 is encountered, σi or σi−1 is written down, depending on whether strand i moves under or over strand i + 1. Upon reaching the right hand end, the braid has been written as a product of the σ's and their inverses.
It is clear that
(i) σ1σ3 = σ3σ1,
while the following two relations are not quite as obvious:
(iia) σ1σ2σ1 = σ2σ1σ2,(iib) σ2σ3σ2 = σ3σ2σ3
(these can be appreciated best by drawing the braid on a piece of paper). It can be shown that all other relations among the braids σ1, σ2 and σ3already follow from these relations and the group axioms.
Generalising this example to n strands, the group Bn can be abstractly defined via the following presentation:
where in the first group of relations 1 ≤ i ≤ n−2 and in the second group of relations, |i − j| ≥ 2. This presentation leads to generalisations of braid groups called Artin groups. The cubic relations, known as the braid relations, play an important role in the theory of Yang–Baxter equation.
Further properties[edit][/ltr]
- The braid group B1 is trivial, B2 is an infinite cyclic group Z, and B3 is isomorphic to the knot group of the trefoil knot – in particular, it is an infinite non-abelian group.
- The n-strand braid group Bn embeds as a subgroup into the (n+1)-strand braid group Bn+1 by adding an extra strand that does not cross any of the first n strands. The increasing union of the braid groups with all n ≥ 1 is the infinite braid group B∞.
- All non-identity elements of Bn have infinite order; i.e., Bn is torsion-free.
- There is a left-invariant linear order on Bn called the Dehornoy order.
- For n ≥ 3, Bn contains a subgroup isomorphic to the free group on two generators.
- There is a homomorphism Bn → Z defined by σi ↦ 1. So for instance, the braid σ2σ3σ1−1σ2σ3 is mapped to 1 + 1 − 1 + 1 + 1 = 3.
[ltr]
Interactions of braid groups[edit]
Relation with symmetric group and the pure braid group[edit]
By forgetting how the strands twist and cross, every braid on n strands determines a permutation on n elements. This assignment is onto, compatible with composition, and therefore becomes a surjective group homomorphism Bn → Sn from the braid group into the symmetric group. The image of the braid σi ∈ Bn is the transposition si = (i, i+1) ∈ Sn. These transpositions generate the symmetric group, satisfy the braid group relations, and have order 2. This transforms the Artin presentation of the braid group into the Coxeter presentation of the symmetric group:
The kernel of the homomorphism Bn → Sn is the subgroup of Bn called the pure braid group on n strands and denoted Pn. In a pure braid, the beginning and the end of each strand are in the like positions. Pure braid groups fit into a short exact sequence
This sequence splits and therefore pure braid groups are realized as iterated semi-direct products of free groups.
Relation between B3 and the modular group[edit][/ltr]
[ltr]
The braid group B3is theuniversal central extensionof themodular groupPSL(2, Z), with these sitting as lattices inside the (topological) universal covering group
Furthermore, the modular group has trivial center, and thus the modular group is isomorphic to the quotient group of B3 modulo its center, Z(B3), and equivalently, to the group of inner automorphisms of B3.
Here is a construction of this isomorphism. Define
From the braid relations it follows that a2 = b3. Denoting this latter product as c, one may verify from the braid relations that
implying that c is in the center of B3. Let C denote the subgroup of B3generated by c, since C ⊂ Z(B3), it is a normal subgroup and one may take the quotient group B3/C. We claim B3/C ≅ PSL(2, Z); this isomorphism can be given an explicit form. The cosets σ1C and σ2C map to
where L and R are the standard left and right moves on the Stern-Brocot tree; it is well known that these moves generate the modular group.
Alternately, one common presentation for the modular group is
where
Mapping a to v and b to p yields a surjective group homomorphismB3 → PSL(2, Z).
The center of B3 is equal to C, a consequence of the facts that c is in the center, the modular group has trivial center, and the above surjective homomorphism has kernel C.
Relationship to the mapping class group and classification of braids[edit]
The braid group Bn can be shown to be isomorphic to the mapping class group of a punctured disk with n punctures. This is most easily visualized by imagining each puncture as being connected by a string to the boundary of the disk; each mapping homomorphism that permutes two of the punctures can then be seen to be a homotopy of the strings, that is, a braiding of these strings.
Via this mapping class group interpretation of braids, each braid may be classified as periodic, reducible or pseudo-Anosov.
Connection to knot theory[edit]
If a braid is given and one connects the first left-hand item to the first right-hand item using a new string, the second left-hand item to the second right-hand item etc. (without creating any braids in the new strings), one obtains a link, and sometimes a knot. Alexander's theorem in braid theorystates that the converse is true as well: every knot and every link arises in this fashion from at least one braid; such a braid can be obtained by cutting the link. Since braids can be concretely given as words in the generators σi, this is often the preferred method of entering knots into computer programs.
Computational aspects[edit]
The word problem for the braid relations is efficiently solvable and there exists a normal form for elements of Bn in terms of the generatorsσ1, ..., σn−1. (In essence, computing the normal form of a braid is the algebraic analogue of "pulling the strands" as illustrated in our second set of images above.) The free GAP computer algebra system can carry out computations in Bn if the elements are given in terms of these generators. There is also a package called CHEVIE for GAP3 with special support for braid groups. The word problem is also efficiently solved via the Lawrence-Krammer representation.
Since there are nevertheless several hard computational problems about braid groups, applications in cryptography have been suggested.
Actions of braid groups[edit]
In analogy with the action of the symmetric group by permutations, in various mathematical settings there exists a natural action the braid group on n-tuples of objects or on the n-folded tensor product that involves some "twists". Consider an arbitrary group G and let X be the set of all n-tuples of elements of G whose product is the identity element of G. ThenBn acts on X in the following fashion:
Thus the elements xi and xi+1 exchange places and, in addition, xi is twisted by the inner automorphism corresponding to xi+1 — this ensures that the product of the components of x remains the identity element. It may be checked that the braid group relations are satisfied and this formula indeed defines a group action of Bn on X. As another example, abraided monoidal category is a monoidal category with a braid group action. Such structures play an important role in modern mathematical physics and lead to quantum knot invariants.
Representations[edit]
Elements of the braid group Bn can be represented more concretely by matrices. One classical such representation is Burau representation, where the matrix entries are single variable Laurent polynomials. It had been a long-standing question whether Burau representation was faithful, but the answer turned out to be negative for n ≥ 5. More generally, it was a major open problem whether braid groups were linear. In 1990, Ruth Lawrence described a family of more general "Lawrence representations" depending on several parameters. In 1996, C. Nayak and Frank Wilczekposited that in analogy to projective representations of SO(3), the projective representations of the braid group have a physical meaning for certain quasiparticles in the fractional quantum hall effect. Around 2001 Stephen Bigelow and Daan Krammer independently proved that all braid groups are linear. Their work used the Lawrence–Krammer representationof dimension n(n−1)/2 depending on the variables q and t. By suitably specializing these variables, the braid group Bn may be realized as a subgroup of the general linear group over the complex numbers.
Infinitely generated braid groups[edit]
There are many ways to generalize this notion to an infinite number of strands. The simplest way is take the direct limit of braid groups, where the attaching maps f : Bn → Bn+1 send the n−1 generators of Bn to the firstn−1 generators of Bn+1 (i.e., by attaching a trivial strand). Fabel has shown that there are two topologies that can be imposed on the resulting group each of whose completion yields a different group. One is a very tame group and is isomorphic to the mapping class group of the infinitely punctured disk — a discrete set of punctures limiting to the boundary of the disk.
The second group can be thought of the same as with finite braid groups. Place a strand at each of the points (0, 1/n) and the set of all braids — where a braid is defined to be a collection of paths from the points(0, 1/n, 0) to the points (0, 1/n, 1) so that the function yields a permutation on endpoints — is isomorphic to this wilder group. An interesting fact is that the pure braid group in this group is isomorphic to both the inverse limit of finite pure braid groups Pn and to the fundamental group of the Hilbert cube minus the set
See also[edit][/ltr]
[ltr]
Notes[edit][/ltr]
- Jump up^ Wilhelm Magnus. Braid groups: A survey. In Lecture Notes in Mathematics, volume 372, pages 463–487. Springer, 1974. Proceedings of the Second International Conference on the Theory of Groups, Canberra, Australia, 1973. ISBN 978-3-540-06845-7
[ltr]
References[edit][/ltr]
- Deligne, Pierre (1972), "Les immeubles des groupes de tresses généralisés", Inventiones Mathematicae 17 (4): 273–302,doi:10.1007/BF01406236, ISSN 0020-9910, MR 0422673
[ltr]
Further reading[edit][/ltr]
- Birman, Joan, and Brendle, Tara E., "Braids: A Survey", revised 26 February 2005. In Menasco and Thistlethwaite.
- Carlucci, Lorenzo; Dehornoy, Patrick; and Weiermann, Andreas,"Unprovability results involving braids", 23 November 2007
- Kassel, Christian; and Turaev, Vladimir, Braid Groups, Springer, 2008. ISBN 0-387-33841-1
- Menasco, W., and Thistlethwaite, M., (editors), Handbook of Knot Theory, Amsterdam : Elsevier, 2005. ISBN 0-444-51452-X
[ltr]
External links[edit][/ltr]
- Braid group at PlanetMath.org.
- CRAG: CRyptography and Groups at Algebraic Cryptography Center Contains extensive library for computations with Braid Groups
- P. Fabel, Completing Artin's braid group on infinitely many strands, Journal of Knot Theory and its Ramifications, Vol. 14, No. 8 (2005) 979–991
- P. Fabel, The mapping class group of a disk with infinitely many holes, Journal of Knot Theory and its Ramifications, Vol. 15, No. 1 (2006) 21–29
- Chernavskii, A.V. (2001), "Braid theory", in Hazewinkel, Michiel,Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
- Stephen Bigelow's exploration of B5 Java applet.
- C. Nayak and F. Wilczek's connection of projective braid group representations to the fractional quantum Hall effect [1]
- Presentation for FradkinFest by C. V. Nayak [2]
- N. Read's criticism of the reality of Wilczek-Nayak representation [3]
- Cryptography and Braid Groups page - Helger Lipmaa
- Braid group: List of Authority Articles on arxiv.org.
Yang–Baxter equation
From Wikipedia, the free encyclopedia
This article provides insufficient context for those unfamiliar with the subject.Please help improve the article with a good introductory style. (October 2009) |
In physics, the Yang–Baxter equation (or star-*** relation) is a consistency equation which was first introduced in the field of statistical mechanics. It depends on the idea that in some scattering situations, particles may preserve their momentum while changing their quantum internal states. It states that a matrix , acting on two out of three objects, satisfies
In one dimensional quantum systems, is the scattering matrix and if it satisfies the Yang–Baxter equation then the system is integrable. The Yang–Baxter equation also shows up when discussing knot theory and thebraid groups where corresponds to swapping two strands. Since one can swap three strands two different ways, the Yang–Baxter equation enforces that both paths are the same.
[/ltr][/size][size][ltr]
It takes its name from independent work of C. N. Yang from 1968, andR. J. Baxter from 1971.
[/ltr][/size]
[size][ltr]
Parameter-dependent Yang–Baxter equation[edit]
Let be a unital associative algebra. The parameter-dependent Yang–Baxter equation is an equation for , a parameter-dependentinvertible element of the tensor product (here, is the parameter, which usually ranges over all real numbers in the case of an additive parameter, or over all positive real numbers in the case of a multiplicative parameter). The Yang–Baxter equation is
for all values of and , in the case of an additive parameter. At some value of the parameter can turn into one dimensional projector, this gives rise to quantum determinant. For multiplicative parameter Yang–Baxter equation is
for all values of and , where , , and , for all values of the parameter , and , , and are algebra morphisms determined by
In some cases the determinant of can vanish at specific values of the spectral parameter . Some matrices turn into one dimensional projector at . In this case quantum determinant can be defined.
Parameter-independent Yang–Baxter equation[edit]
Let be a unital associative algebra. The parameter-independent Yang–Baxter equation is an equation for , an invertible element of the tensor product . The Yang–Baxter equation is
where , , and .
Let be a module of . Let be the linear map satisfying for all . Then arepresentation of the braid group, , can be constructed on by for , where on . This representation can be used to determine quasi-invariants of braids, knots and links.
See also[edit]
[/ltr][/size]
[size][ltr]
References[edit]
[/ltr][/size]
- H.-D. Doebner, J.-D. Hennig, eds, Quantum groups, Proceedings of the 8th International Workshop on Mathematical Physics, Arnold Sommerfeld Institute, Clausthal, FRG, 1989, Springer-Verlag Berlin,ISBN 3-540-53503-9.
- Vyjayanthi Chari and Andrew Pressley, A Guide to Quantum Groups, (1994), Cambridge University Press, Cambridge ISBN 0-521-55884-0.
- Jacques H.H. Perk and Helen Au-Yang, "Yang–Baxter Equations", (2006), arXiv:math-ph/0606053.
[size][ltr]
External links[edit]
[/ltr][/size]
- Hazewinkel, Michiel, ed. (2001), "Yang-Baxter equation",Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
一星- 帖子数 : 3787
注册日期 : 13-08-07
回复: Quantum Field Theory II
Quantum group
From Wikipedia, the free encyclopedia
(Redirected from Quantum groups)
[size][ltr]
In mathematics and theoretical physics, the term quantum group denotes various kinds ofnoncommutative algebra with additional structure. In general, a quantum group is some kind ofHopf algebra. There is no single, all-encompassing definition, but instead a family of broadly similar objects.
The term "quantum group" often denotes a kind of noncommutative algebra with additional structure that first appeared in the theory ofquantum integrable systems, and which was then formalized by Vladimir Drinfeld and Michio Jimbo as a particular class of Hopf algebra. The same term is also used for other Hopf algebras that deform or are close to classical Lie groups or Lie algebras, such as a `bicrossproduct' class of quantum groups introduced by Shahn Majid a little after the work of Drinfeld and Jimbo.
In Drinfeld's approach, quantum groups arise as Hopf algebras depending on an auxiliary parameter q or h, which become universal enveloping algebras of a certain Lie algebra, frequently semisimple or affine, when q = 1 or h = 0. Closely related are certain dual objects, also Hopf algebras and also called quantum groups, deforming the algebra of functions on the corresponding semisimple algebraic group or a compact Lie group.
Just as groups often appear as symmetries, quantum groups act on many other mathematical objects and it has become fashionable to introduce the adjective quantum in such cases; for example there are quantum planesand quantum Grassmannians.
[/ltr][/size]
[size][ltr]
Intuitive meaning[edit]
The discovery of quantum groups was quite unexpected, since it was known for a long time that compact groups and semisimple Lie algebras are "rigid" objects, in other words, they cannot be "deformed". One of the ideas behind quantum groups is that if we consider a structure that is in a sense equivalent but larger, namely a group algebra or a universal enveloping algebra, then a group or enveloping algebra can be "deformed", although the deformation will no longer remain a group or enveloping algebra. More precisely, deformation can be accomplished within the category of Hopf algebras that are not required to be eithercommutative or cocommutative. One can think of the deformed object as an algebra of functions on a "noncommutative space", in the spirit of thenoncommutative geometry of Alain Connes. This intuition, however, came after particular classes of quantum groups had already proved their usefulness in the study of the quantum Yang-Baxter equation andquantum inverse scattering method developed by the Leningrad School (Ludwig Faddeev, Leon Takhtajan, Evgenii Sklyanin, Nicolai Reshetikhinand Korepin) and related work by the Japanese School.[1] The intuition behind the second, bicrossproduct, class of quantum groups was different and came from the search for self-dual objects as an approach to quantum gravity.[2]
Drinfeld-Jimbo type quantum groups[edit]
One type of objects commonly called a "quantum group" appeared in the work of Vladimir Drinfeld and Michio Jimbo as a deformation of theuniversal enveloping algebra of a semisimple Lie algebra or, more generally, a Kac–Moody algebra, in the category of Hopf algebras. The resulting algebra has additional structure, *** it into a quasitriangular Hopf algebra.
Let A = (aij) be the Cartan matrix of the Kac–Moody algebra, and let q be a nonzero complex number distinct from 1, then the quantum group, Uq(G), where G is the Lie algebra whose Cartan matrix is A, is defined as theunital associative algebra with generators kλ (where λ is an element of theweight lattice, i.e. 2(λ, αi)/(αi, αi) is an integer for all i), and ei and fi (forsimple roots, αi), subject to the following relations:
[/ltr][/size]
[size][ltr]
where for all positive integers n, and These are the q-factorial and q-number, respectively, the q-analogs of the ordinaryfactorial. The last two relations above are the q-Serre relations, the deformations of the Serre relations.
In the limit as q → 1, these relations approach the relations for the universal enveloping algebra U(G), where kλ → 1 and as q → 1, where the element, tλ, of the Cartan subalgebra satisfies (tλ, h) = λ(h) for all h in the Cartan subalgebra.
There are various coassociative coproducts under which these algebras are Hopf algebras, for example,
[/ltr][/size]
[size][ltr]
where the set of generators has been extended, if required, to include kλfor λ which is expressible as the sum of an element of the weight lattice and half an element of the root lattice.
In addition, any Hopf algebra leads to another with reversed coproduct T oΔ, where T is given by T(x ⊗ y) = y ⊗ x, giving three more possible versions.
The counit on Uq(A) is the same for all these coproducts: ε(kλ) = 1, ε(ei) = ε(fi) = 0, and the respective antipodes for the above coproducts are given by
[/ltr][/size]
[size][ltr]
Alternatively, the quantum group Uq(G) can be regarded as an algebra over the field C(q), the field of all rational functions of an indeterminate qover C.
Similarly, the quantum group Uq(G) can be regarded as an algebra over the field Q(q), the field of all rational functions of an indeterminate q overQ (see below in the section on quantum groups at q = 0). The center of quantum group can be described by quantum determinant.
Representation theory[edit]
Just as there are many different types of representations for Kac–Moody algebras and their universal enveloping algebras, so there are many different types of representation for quantum groups.
As is the case for all Hopf algebras, Uq(G) has an adjoint representationon itself as a module, with the action being given by
where
.
Case 1: q is not a root of unity[edit]
One important type of representation is a weight representation, and the corresponding module is called a weight module. A weight module is a module with a basis of weight vectors. A weight vector is a nonzero vector vsuch that kλ · v = dλv for all λ, where dλ are complex numbers for all weights λ such that
[/ltr][/size]
[size][ltr]
A weight module is called integrable if the actions of ei and fi are locally nilpotent (i.e. for any vector v in the module, there exists a positive integerk, possibly dependent on v, such that for all i). In the case of integrable modules, the complex numbers dλ associated with a weight vector satisfy , where ν is an element of the weight lattice, and cλ are complex numbers such that
[/ltr][/size]
[size][ltr]
Of special interest are highest weight representations, and the corresponding highest weight modules. A highest weight module is a module generated by a weight vector v, subject to kλ · v = dλv for all weights μ, and ei · v = 0 for all i. Similarly, a quantum group can have a lowest weight representation and lowest weight module, i.e. a module generated by a weight vector v, subject to kλ · v = dλv for all weights λ, andfi · v = 0 for all i.
Define a vector v to have weight ν if for all λ in the weight lattice.
If G is a Kac–Moody algebra, then in any irreducible highest weight representation of Uq(G), with highest weight ν, the multiplicities of the weights are equal to their multiplicities in an irreducible representation ofU(G) with equal highest weight. If the highest weight is dominant and integral (a weight μ is dominant and integral if μ satisfies the condition that is a non-negative integer for all i), then the weight spectrum of the irreducible representation is invariant under the Weyl group for G, and the representation is integrable.
Conversely, if a highest weight module is integrable, then its highest weight vector v satisfies , where cλ · v = dλv are complex numbers such that
[/ltr][/size]
[size][ltr]
and ν is dominant and integral.
As is the case for all Hopf algebras, the tensor product of two modules is another module. For an element x of Uq(G), and for vectors v and w in the respective modules, x ⋅ (v ⊗ w) = Δ(x) ⋅ (v ⊗ w), so that , and in the case of coproduct Δ1, and .
The integrable highest weight module described above is a tensor product of a one-dimensional module (on which kλ = cλ for all λ, and ei = fi = 0 for all i) and a highest weight module generated by a nonzero vector v0, subject to for all weights λ, and for all i.
In the specific case where G is a finite-dimensional Lie algebra (as a special case of a Kac–Moody algebra), then the irreducible representations with dominant integral highest weights are also finite-dimensional.
In the case of a tensor product of highest weight modules, its decomposition into submodules is the same as for the tensor product of the corresponding modules of the Kac–Moody algebra (the highest weights are the same, as are their multiplicities).
Case 2: q is a root of unity[edit]
Quasitriangularity[edit]
Case 1: q is not a root of unity[edit]
Strictly, the quantum group Uq(G) is not quasitriangular, but it can be thought of as being "nearly quasitriangular" in that there exists an infinite formal sum which plays the role of an R-matrix. This infinite formal sum is expressible in terms of generators ei and fi, and Cartan generators tλ, where kλ is formally identified with qtλ. The infinite formal sum is the product of two factors,
,
and an infinite formal sum, where λj is a basis for the dual space to the Cartan subalgebra, and μj is the dual basis, and η = ±1.
The formal infinite sum which plays the part of the R-matrix has a well-defined action on the tensor product of two irreducible highest weight modules, and also on the tensor product if two lowest weight modules. Specifically, if v has weight α and w has weight β, then
,
and the fact that the modules are both highest weight modules or both lowest weight modules reduces the action of the other factor on v ⊗ W to a finite sum.
Specifically, if V is a highest weight module, then the formal infinite sum, R, has a well-defined, and invertible, action on V ⊗ V, and this value of R (as an element of End(V ⊗ V)) satisfies the Yang-Baxter equation, and therefore allows us to determine a representation of the braid group, and to define quasi-invariants for knots, links and braids.
Case 2: q is a root of unity[edit]
Quantum groups at q = 0[edit]
Main article: Crystal base
Masaki Kashiwara has researched the limiting behaviour of quantum groups as q → 0, and found a particularly well behaved base called acrystal base.
Description and classification by root-systems and Dynkin diagrams[edit]
There has been considerable progress in describing finite quotients of quantum groups such as the above Uq(g) for qn =1; one usually considers the class of pointed Hopf algebras, meaning that all subcoideals are 1-dimensional and thus there sum form a group called coradical:
[/ltr][/size]
[size][ltr]
Here, as in the classical theory V is a braided vector space of dimension n spanned by the E´s, and σ (a so-called cocylce twist) creates the nontrivial linking between E´s and F´s. Note that in contrast to classical theory, more than two linked components may appear. The role of the quantum Borel algebra is taken by aNichols algebra of the braided vectorspace.
[/ltr][/size]
[size][ltr]
Compact matrix quantum groups[edit]
See also compact quantum group.
S.L. Woronowicz introduced compact matrix quantum groups. Compact matrix quantum groups are abstract structures on which the "continuous functions" on the structure are given by elements of a C*-algebra. The geometry of a compact matrix quantum group is a special case of anoncommutative geometry.
The continuous complex-valued functions on a compact Hausdorff topological space form a commutative C*-algebra. By the Gelfand theorem, a commutative C*-algebra is isomorphic to the C*-algebra of continuous complex-valued functions on a compact Hausdorff topological space, and the topological space is uniquely determined by the C*-algebra up tohomeomorphism.
For a compact topological group, G, there exists a C*-algebra homomorphism Δ: C(G) → C(G) ⊗ C(G) (where C(G) ⊗ C(G) is the C*-algebra tensor product - the completion of the algebraic tensor product ofC(G) and C(G)), such that Δ(f)(x, y) = f(xy) for all f ∈ C(G), and for all x, y∈ G (where (f ⊗ g)(x, y) = f(x)g(y) for all f, g ∈ C(G) and all x, y ∈ G). There also exists a linear multiplicative mapping κ: C(G) → C(G), such that κ(f)(x) = f(x−1) for all f ∈ C(G) and all x ∈ G. Strictly, this does not makeC(G) a Hopf algebra, unless G is finite. On the other hand, a finite-dimensional representation of G can be used to generate a *-subalgebra of C(G) which is also a Hopf *-algebra. Specifically, if is an n-dimensional representation of G, then for all i, j uij ∈ C(G) and
It follows that the *-algebra generated by uij for all i, j and κ(uij) for all i, j is a Hopf *-algebra: the counit is determined by ε(uij) = δij for all i, j (where δijis the Kronecker delta), the antipode is κ, and the unit is given by
As a generalization, a compact matrix quantum group is defined as a pair(C, fu), where C is a C*-algebra and is a matrix with entries in C such that
[/ltr][/size]
[size][ltr]
[/ltr][/size]
[size][ltr]
where I is the identity element of C. Since κ is antimultiplicative, then κ(vw) = κ(w) κ(v) for all v, w in C0.
As a consequence of continuity, the comultiplication on C is coassociative.
In general, C is not a bialgebra, and C0 is a Hopf *-algebra.
Informally, C can be regarded as the *-algebra of continuous complex-valued functions over the compact matrix quantum group, and u can be regarded as a finite-dimensional representation of the compact matrix quantum group.
A representation of the compact matrix quantum group is given by acorepresentation of the Hopf *-algebra (a corepresentation of a counital coassociative coalgebra A is a square matrix with entries in A (so v belongs to M(n, A)) such that
for all i, j and ε(vij) = δij for all i, j). Furthermore, a representation v, is called unitary if the matrix for v is unitary (or equivalently, if κ(vij) = v*ij for all i, j).
An example of a compact matrix quantum group is SUμ(2), where the parameter μ is a positive real number. So SUμ(2) = (C(SUμ(2)), u), where C(SUμ(2)) is the C*-algebra generated by α and γ, subject to
and
so that the comultiplication is determined by ∆(α) = α ⊗ α − γ ⊗ γ*, ∆(γ) = α ⊗ γ + γ ⊗ α*, and the coinverse is determined by κ(α) = α*, κ(γ) = −μ−1γ, κ(γ*) = −μγ*, κ(α*) = α. Note that u is a representation, but not a unitary representation. u is equivalent to the unitary representation
Equivalently, SUμ(2) = (C(SUμ(2)), w), where C(SUμ(2)) is the C*-algebra generated by α and β, subject to
and
so that the comultiplication is determined by ∆(α) = α ⊗ α − μβ ⊗ β*, Δ(β) = α ⊗ β + β ⊗ α*, and the coinverse is determined by κ(α) = α*, κ(β) = −μ−1β, κ(β*) = −μβ*, κ(α*) = α. Note that w is a unitary representation. The realizations can be identified by equating .
When μ = 1, then SUμ(2) is equal to the algebra C(SU(2)) of functions on the concrete compact group SU(2).
Bicrossproduct quantum groups[edit]
Whereas compact matrix pseudogroups are typically versions of Drinfeld-Jimbo quantum groups in a dual function algebra formulation, with additional structure, the bicrossproduct ones are a distinct second family of quantum groups of increasing importance as deformations of solvable rather than semisimple Lie groups. They are associated to Lie splittings of Lie algebras or local factorisations of Lie groups and can be viewed as the cross product or Mackey quantisation of one of the factors acting on the other for the algebra and a similar story for the coproduct Δ with the second factor acting back on the first. The very simplest nontrivial example corresponds to two copies of R locally acting on each other and results in a quantum group (given here in an algebraic form) with generators p, K,K−1, say, and coproduct
where h is the deformation parameter. This quantum group was linked to a toy model of Planck scale physics implementing Born reciprocity when viewed as a deformation of the Heisenberg algebra of quantum mechanics. Also, starting with any compact real form of a semisimple Lie algebra g its complexification as a real Lie algebra of twice the dimension splits into gand a certain solvable Lie algebra (the Iwasawa decomposition), and this provides a canonical bicrossproduct quantum group associated to g. Forsu(2) one obtains a quantum group deformation of the Euclidean groupE(3) of motions in 3 dimensions.
See also[edit]
[/ltr][/size]
[size][ltr]
Notes[edit]
[/ltr][/size]
[size][ltr]
References[edit]
[/ltr][/size]
From Wikipedia, the free encyclopedia
(Redirected from Quantum groups)
In mathematics and theoretical physics, the term quantum group denotes various kinds ofnoncommutative algebra with additional structure. In general, a quantum group is some kind ofHopf algebra. There is no single, all-encompassing definition, but instead a family of broadly similar objects.
The term "quantum group" often denotes a kind of noncommutative algebra with additional structure that first appeared in the theory ofquantum integrable systems, and which was then formalized by Vladimir Drinfeld and Michio Jimbo as a particular class of Hopf algebra. The same term is also used for other Hopf algebras that deform or are close to classical Lie groups or Lie algebras, such as a `bicrossproduct' class of quantum groups introduced by Shahn Majid a little after the work of Drinfeld and Jimbo.
In Drinfeld's approach, quantum groups arise as Hopf algebras depending on an auxiliary parameter q or h, which become universal enveloping algebras of a certain Lie algebra, frequently semisimple or affine, when q = 1 or h = 0. Closely related are certain dual objects, also Hopf algebras and also called quantum groups, deforming the algebra of functions on the corresponding semisimple algebraic group or a compact Lie group.
Just as groups often appear as symmetries, quantum groups act on many other mathematical objects and it has become fashionable to introduce the adjective quantum in such cases; for example there are quantum planesand quantum Grassmannians.
[/ltr][/size]
- 1 Intuitive meaning
- 2 Drinfeld-Jimbo type quantum groups
- 2.1 Representation theory
- 2.1.1 Case 1: q is not a root of unity
- 2.1.2 Case 2: q is a root of unity
- 2.2 Quasitriangularity
- 2.2.1 Case 1: q is not a root of unity
- 2.2.2 Case 2: q is a root of unity
- 2.3 Quantum groups at q = 0
- 2.4 Description and classification by root-systems and Dynkin diagrams
- 3 Compact matrix quantum groups
- 4 Bicrossproduct quantum groups
- 5 See also
- 6 Notes
- 7 References
[size][ltr]
Intuitive meaning[edit]
The discovery of quantum groups was quite unexpected, since it was known for a long time that compact groups and semisimple Lie algebras are "rigid" objects, in other words, they cannot be "deformed". One of the ideas behind quantum groups is that if we consider a structure that is in a sense equivalent but larger, namely a group algebra or a universal enveloping algebra, then a group or enveloping algebra can be "deformed", although the deformation will no longer remain a group or enveloping algebra. More precisely, deformation can be accomplished within the category of Hopf algebras that are not required to be eithercommutative or cocommutative. One can think of the deformed object as an algebra of functions on a "noncommutative space", in the spirit of thenoncommutative geometry of Alain Connes. This intuition, however, came after particular classes of quantum groups had already proved their usefulness in the study of the quantum Yang-Baxter equation andquantum inverse scattering method developed by the Leningrad School (Ludwig Faddeev, Leon Takhtajan, Evgenii Sklyanin, Nicolai Reshetikhinand Korepin) and related work by the Japanese School.[1] The intuition behind the second, bicrossproduct, class of quantum groups was different and came from the search for self-dual objects as an approach to quantum gravity.[2]
Drinfeld-Jimbo type quantum groups[edit]
One type of objects commonly called a "quantum group" appeared in the work of Vladimir Drinfeld and Michio Jimbo as a deformation of theuniversal enveloping algebra of a semisimple Lie algebra or, more generally, a Kac–Moody algebra, in the category of Hopf algebras. The resulting algebra has additional structure, *** it into a quasitriangular Hopf algebra.
Let A = (aij) be the Cartan matrix of the Kac–Moody algebra, and let q be a nonzero complex number distinct from 1, then the quantum group, Uq(G), where G is the Lie algebra whose Cartan matrix is A, is defined as theunital associative algebra with generators kλ (where λ is an element of theweight lattice, i.e. 2(λ, αi)/(αi, αi) is an integer for all i), and ei and fi (forsimple roots, αi), subject to the following relations:
[/ltr][/size]
- ,
- ,
- ,
- ,
- ,
- If i ≠ j then:
[size][ltr]
where for all positive integers n, and These are the q-factorial and q-number, respectively, the q-analogs of the ordinaryfactorial. The last two relations above are the q-Serre relations, the deformations of the Serre relations.
In the limit as q → 1, these relations approach the relations for the universal enveloping algebra U(G), where kλ → 1 and as q → 1, where the element, tλ, of the Cartan subalgebra satisfies (tλ, h) = λ(h) for all h in the Cartan subalgebra.
There are various coassociative coproducts under which these algebras are Hopf algebras, for example,
[/ltr][/size]
- ,
- ,
- ,
- ,
- ,
- ,
- ,
- ,
- ,
[size][ltr]
where the set of generators has been extended, if required, to include kλfor λ which is expressible as the sum of an element of the weight lattice and half an element of the root lattice.
In addition, any Hopf algebra leads to another with reversed coproduct T oΔ, where T is given by T(x ⊗ y) = y ⊗ x, giving three more possible versions.
The counit on Uq(A) is the same for all these coproducts: ε(kλ) = 1, ε(ei) = ε(fi) = 0, and the respective antipodes for the above coproducts are given by
[/ltr][/size]
- ,
- ,
[size][ltr]
Alternatively, the quantum group Uq(G) can be regarded as an algebra over the field C(q), the field of all rational functions of an indeterminate qover C.
Similarly, the quantum group Uq(G) can be regarded as an algebra over the field Q(q), the field of all rational functions of an indeterminate q overQ (see below in the section on quantum groups at q = 0). The center of quantum group can be described by quantum determinant.
Representation theory[edit]
Just as there are many different types of representations for Kac–Moody algebras and their universal enveloping algebras, so there are many different types of representation for quantum groups.
As is the case for all Hopf algebras, Uq(G) has an adjoint representationon itself as a module, with the action being given by
where
.
Case 1: q is not a root of unity[edit]
One important type of representation is a weight representation, and the corresponding module is called a weight module. A weight module is a module with a basis of weight vectors. A weight vector is a nonzero vector vsuch that kλ · v = dλv for all λ, where dλ are complex numbers for all weights λ such that
[/ltr][/size]
- ,
- , for all weights λ and μ.
[size][ltr]
A weight module is called integrable if the actions of ei and fi are locally nilpotent (i.e. for any vector v in the module, there exists a positive integerk, possibly dependent on v, such that for all i). In the case of integrable modules, the complex numbers dλ associated with a weight vector satisfy , where ν is an element of the weight lattice, and cλ are complex numbers such that
[/ltr][/size]
- , for all weights λ and μ,
- for all i.
[size][ltr]
Of special interest are highest weight representations, and the corresponding highest weight modules. A highest weight module is a module generated by a weight vector v, subject to kλ · v = dλv for all weights μ, and ei · v = 0 for all i. Similarly, a quantum group can have a lowest weight representation and lowest weight module, i.e. a module generated by a weight vector v, subject to kλ · v = dλv for all weights λ, andfi · v = 0 for all i.
Define a vector v to have weight ν if for all λ in the weight lattice.
If G is a Kac–Moody algebra, then in any irreducible highest weight representation of Uq(G), with highest weight ν, the multiplicities of the weights are equal to their multiplicities in an irreducible representation ofU(G) with equal highest weight. If the highest weight is dominant and integral (a weight μ is dominant and integral if μ satisfies the condition that is a non-negative integer for all i), then the weight spectrum of the irreducible representation is invariant under the Weyl group for G, and the representation is integrable.
Conversely, if a highest weight module is integrable, then its highest weight vector v satisfies , where cλ · v = dλv are complex numbers such that
[/ltr][/size]
- ,
- , for all weights λ and μ,
- for all i,
[size][ltr]
and ν is dominant and integral.
As is the case for all Hopf algebras, the tensor product of two modules is another module. For an element x of Uq(G), and for vectors v and w in the respective modules, x ⋅ (v ⊗ w) = Δ(x) ⋅ (v ⊗ w), so that , and in the case of coproduct Δ1, and .
The integrable highest weight module described above is a tensor product of a one-dimensional module (on which kλ = cλ for all λ, and ei = fi = 0 for all i) and a highest weight module generated by a nonzero vector v0, subject to for all weights λ, and for all i.
In the specific case where G is a finite-dimensional Lie algebra (as a special case of a Kac–Moody algebra), then the irreducible representations with dominant integral highest weights are also finite-dimensional.
In the case of a tensor product of highest weight modules, its decomposition into submodules is the same as for the tensor product of the corresponding modules of the Kac–Moody algebra (the highest weights are the same, as are their multiplicities).
Case 2: q is a root of unity[edit]
Quasitriangularity[edit]
Case 1: q is not a root of unity[edit]
Strictly, the quantum group Uq(G) is not quasitriangular, but it can be thought of as being "nearly quasitriangular" in that there exists an infinite formal sum which plays the role of an R-matrix. This infinite formal sum is expressible in terms of generators ei and fi, and Cartan generators tλ, where kλ is formally identified with qtλ. The infinite formal sum is the product of two factors,
,
and an infinite formal sum, where λj is a basis for the dual space to the Cartan subalgebra, and μj is the dual basis, and η = ±1.
The formal infinite sum which plays the part of the R-matrix has a well-defined action on the tensor product of two irreducible highest weight modules, and also on the tensor product if two lowest weight modules. Specifically, if v has weight α and w has weight β, then
,
and the fact that the modules are both highest weight modules or both lowest weight modules reduces the action of the other factor on v ⊗ W to a finite sum.
Specifically, if V is a highest weight module, then the formal infinite sum, R, has a well-defined, and invertible, action on V ⊗ V, and this value of R (as an element of End(V ⊗ V)) satisfies the Yang-Baxter equation, and therefore allows us to determine a representation of the braid group, and to define quasi-invariants for knots, links and braids.
Case 2: q is a root of unity[edit]
Quantum groups at q = 0[edit]
Main article: Crystal base
Masaki Kashiwara has researched the limiting behaviour of quantum groups as q → 0, and found a particularly well behaved base called acrystal base.
Description and classification by root-systems and Dynkin diagrams[edit]
There has been considerable progress in describing finite quotients of quantum groups such as the above Uq(g) for qn =1; one usually considers the class of pointed Hopf algebras, meaning that all subcoideals are 1-dimensional and thus there sum form a group called coradical:
[/ltr][/size]
- In 2002 H.-J. Schneider and N. Andruskiewitsch [3] finished their long-term classification effort of pointed Hopf algebras with coradical an abelian group (excluding primes 2, 3, 5, 7), especially as the above finite quotients of Uq(g) Just like ordinary Semisimple Lie algebra they decompose into E´s (Borel part), dual F´s and K´s (Cartan algebra):
[size][ltr]
Here, as in the classical theory V is a braided vector space of dimension n spanned by the E´s, and σ (a so-called cocylce twist) creates the nontrivial linking between E´s and F´s. Note that in contrast to classical theory, more than two linked components may appear. The role of the quantum Borel algebra is taken by aNichols algebra of the braided vectorspace.
[/ltr][/size]
- A crucial ingredient was hence the classification of finite Nichols algebras for abelian groups by I. Heckenberger [4]in terms of generalized Dynkin diagrams. When small primes are present, some exotic examples, such as a ***, occur (see also the Figure of a rank 3 Dankin diagram).
- In the meanwhile, Schneider and Heckenberger[5] have generally proven the existence of an arithmetic root systemalso in then nonabelian case, generating a PBW basis as proven by Kharcheko in the abelian case (without the assumption on finite dimension).This could recently be used[6] on the specific casesUq(g) and explains e.g. the numerical coincidence between certain coideal subalgebras of these quantum groups to the order of the Weyl group of the Lie algebra g.
[size][ltr]
Compact matrix quantum groups[edit]
See also compact quantum group.
S.L. Woronowicz introduced compact matrix quantum groups. Compact matrix quantum groups are abstract structures on which the "continuous functions" on the structure are given by elements of a C*-algebra. The geometry of a compact matrix quantum group is a special case of anoncommutative geometry.
The continuous complex-valued functions on a compact Hausdorff topological space form a commutative C*-algebra. By the Gelfand theorem, a commutative C*-algebra is isomorphic to the C*-algebra of continuous complex-valued functions on a compact Hausdorff topological space, and the topological space is uniquely determined by the C*-algebra up tohomeomorphism.
For a compact topological group, G, there exists a C*-algebra homomorphism Δ: C(G) → C(G) ⊗ C(G) (where C(G) ⊗ C(G) is the C*-algebra tensor product - the completion of the algebraic tensor product ofC(G) and C(G)), such that Δ(f)(x, y) = f(xy) for all f ∈ C(G), and for all x, y∈ G (where (f ⊗ g)(x, y) = f(x)g(y) for all f, g ∈ C(G) and all x, y ∈ G). There also exists a linear multiplicative mapping κ: C(G) → C(G), such that κ(f)(x) = f(x−1) for all f ∈ C(G) and all x ∈ G. Strictly, this does not makeC(G) a Hopf algebra, unless G is finite. On the other hand, a finite-dimensional representation of G can be used to generate a *-subalgebra of C(G) which is also a Hopf *-algebra. Specifically, if is an n-dimensional representation of G, then for all i, j uij ∈ C(G) and
It follows that the *-algebra generated by uij for all i, j and κ(uij) for all i, j is a Hopf *-algebra: the counit is determined by ε(uij) = δij for all i, j (where δijis the Kronecker delta), the antipode is κ, and the unit is given by
As a generalization, a compact matrix quantum group is defined as a pair(C, fu), where C is a C*-algebra and is a matrix with entries in C such that
[/ltr][/size]
- The *-subalgebra, C0, of C, which is generated by the matrix elements of u, is dense in C;
- There exists a C*-algebra homomorphism called the comultiplication Δ: C → C ⊗ C (where C ⊗ C is the C*-algebra tensor product - the completion of the algebraic tensor product of C and C) such that for all i, j we have:
[size][ltr]
[/ltr][/size]
- There exists a linear antimultiplicative map κ: C0 → C0 (the coinverse) such that κ(κ(v*)*) = v for all v ∈ C0 and
[size][ltr]
where I is the identity element of C. Since κ is antimultiplicative, then κ(vw) = κ(w) κ(v) for all v, w in C0.
As a consequence of continuity, the comultiplication on C is coassociative.
In general, C is not a bialgebra, and C0 is a Hopf *-algebra.
Informally, C can be regarded as the *-algebra of continuous complex-valued functions over the compact matrix quantum group, and u can be regarded as a finite-dimensional representation of the compact matrix quantum group.
A representation of the compact matrix quantum group is given by acorepresentation of the Hopf *-algebra (a corepresentation of a counital coassociative coalgebra A is a square matrix with entries in A (so v belongs to M(n, A)) such that
for all i, j and ε(vij) = δij for all i, j). Furthermore, a representation v, is called unitary if the matrix for v is unitary (or equivalently, if κ(vij) = v*ij for all i, j).
An example of a compact matrix quantum group is SUμ(2), where the parameter μ is a positive real number. So SUμ(2) = (C(SUμ(2)), u), where C(SUμ(2)) is the C*-algebra generated by α and γ, subject to
and
so that the comultiplication is determined by ∆(α) = α ⊗ α − γ ⊗ γ*, ∆(γ) = α ⊗ γ + γ ⊗ α*, and the coinverse is determined by κ(α) = α*, κ(γ) = −μ−1γ, κ(γ*) = −μγ*, κ(α*) = α. Note that u is a representation, but not a unitary representation. u is equivalent to the unitary representation
Equivalently, SUμ(2) = (C(SUμ(2)), w), where C(SUμ(2)) is the C*-algebra generated by α and β, subject to
and
so that the comultiplication is determined by ∆(α) = α ⊗ α − μβ ⊗ β*, Δ(β) = α ⊗ β + β ⊗ α*, and the coinverse is determined by κ(α) = α*, κ(β) = −μ−1β, κ(β*) = −μβ*, κ(α*) = α. Note that w is a unitary representation. The realizations can be identified by equating .
When μ = 1, then SUμ(2) is equal to the algebra C(SU(2)) of functions on the concrete compact group SU(2).
Bicrossproduct quantum groups[edit]
Whereas compact matrix pseudogroups are typically versions of Drinfeld-Jimbo quantum groups in a dual function algebra formulation, with additional structure, the bicrossproduct ones are a distinct second family of quantum groups of increasing importance as deformations of solvable rather than semisimple Lie groups. They are associated to Lie splittings of Lie algebras or local factorisations of Lie groups and can be viewed as the cross product or Mackey quantisation of one of the factors acting on the other for the algebra and a similar story for the coproduct Δ with the second factor acting back on the first. The very simplest nontrivial example corresponds to two copies of R locally acting on each other and results in a quantum group (given here in an algebraic form) with generators p, K,K−1, say, and coproduct
where h is the deformation parameter. This quantum group was linked to a toy model of Planck scale physics implementing Born reciprocity when viewed as a deformation of the Heisenberg algebra of quantum mechanics. Also, starting with any compact real form of a semisimple Lie algebra g its complexification as a real Lie algebra of twice the dimension splits into gand a certain solvable Lie algebra (the Iwasawa decomposition), and this provides a canonical bicrossproduct quantum group associated to g. Forsu(2) one obtains a quantum group deformation of the Euclidean groupE(3) of motions in 3 dimensions.
See also[edit]
[/ltr][/size]
[size][ltr]
Notes[edit]
[/ltr][/size]
- Jump up^ Schwiebert, Christian (1994), Generalized quantum inverse scattering, p. 12237, arXiv:hep-th/9412237v3, Bibcode:1994hep.th...12237S
- Jump up^ Majid, Shahn (1988), "Hopf algebras for physics at the Planck scale",Classical and Quantum Gravity 5 (12): 1587–1607,Bibcode:1988CQGra...5.1587M, doi:10.1088/0264-9381/5/12/010
- Jump up^ Andruskiewitsch, Schneider: Pointed Hopf algebras, New directions in Hopf algebras, 1–68, Math. Sci. Res. Inst. Publ., 43, Cambridge Univ. Press, Cambridge, 2002.
- Jump up^ Heckenberger: Nichols algebras of diagonal type and arithmetic root systems, Habilitation thesis 2005.
- Jump up^ Heckenberger, Schneider: Root system and Weyl gruppoid for Nichols algebras, 2008.
- Jump up^ Heckenberger, Schneider: Right coideal subalgebras of Nichols algebras and the Duflo order of the Weyl grupoid, 2009.
[size][ltr]
References[edit]
[/ltr][/size]
- Jagannathan, R. (2001). "Some introductory notes on quantum groups, quantum algebras, and their applications". arXiv:math-ph/0105002v1.
- Kassel, Christian (1995), Quantum groups, Graduate Texts in Mathematics 155, Berlin, New York: Springer-Verlag, ISBN 978-0-387-94370-1, MR 1321145
- Lusztig, George (2010) [1993]. Introduction to Quantum Groups. Cambridge, MA: Birkhäuser. ISBN 978-0-817-64716-2.
- Majid, Shahn (2002), A quantum groups primer, London Mathematical Society Lecture Note Series 292, Cambridge University Press,ISBN 978-0-521-01041-2, MR 1904789
- Majid, Shahn (January 2006), "What Is...a Quantum Group?" (PDF),Notices of the American Mathematical Society 53 (1): 30–31, retrieved 2008-01-16
- Podles, P.; Muller, E. (1997), Introduction to quantum groups, p. 4002,arXiv:q-alg/9704002, Bibcode:1997q.alg.....4002P
- Shnider, Steven; Sternberg, Shlomo (1993). Quantum groups: From coalgebras to Drinfeld algebras. Graduate Texts in Mathematical Physics 2. Cambridge, MA: International Press.
- Street, Ross (2007), Quantum groups, Australian Mathematical Society Lecture Series 19, Cambridge University Press, ISBN 978-0-521-69524-4, MR 2294803
一星- 帖子数 : 3787
注册日期 : 13-08-07
您在这个论坛的权限:
您不能在这个论坛回复主题