Sections in the text:
4.a. De Broglie waves and their proof by Davisson and Germer
4.b. Phase velocity and group velocity of waves
4.c. Uncertainty relation
4.d. Particles, waves, and probability
As we discussed earlier, the Bohr model of atoms left a number of questions unanswered. The most important of these concerned its basic assumption: why are the laws of classical mechanics and electrodynamics violated when the electron occupies a stable orbit? Furthermore, if the orbit is indeed stable, why does the atom decay spontaneously?
In 1923 de Broglie, then a Ph.D. student, decided that the Bohr model, which pictures the atom as a mechanistic system, must be wrong. He recalled the progress made in radiation physics by realizing that electromagnetic radiation has a dual nature: wave and particle. He asked the question: if electromagnetic radiation has a dual nature, whereby in certain circumstances it behaves like a wave and in other situations as a collection of particles (photons), then why should matter not have this dual nature as well? Until that time electrons and protons were regarded as mechanistic particles only. His motivation was the following: something must happen at atomic sizes, when electrons are bound inside atoms. There must be a reason why the nature of the laws of physics changes so drastically. He realized that one way to explain this is that electrons have a dual nature as well, and that inside atoms the wave nature of electrons must play a crucial role. He argued as follows. Mechanical systems such as the solar system are allowed to have all possible periods, radii, etc., depending on masses and initial conditions. Some other systems, like an oscillating string, an air column, or a drum, allow only a definite, discrete set of frequencies. The oscillation of these instruments is characterized by standing waves. Why not consider atoms as systems that oscillate at a well defined set of frequencies? His assumption was the initial spark that eventually produced wave mechanics, or quantum mechanics.
De Broglie adopted Bohr's idea that electrons occupy essentially circular orbits, but he assumed that the object occupying the circular orbit is a wave. To have a mechanical analogy, imagine either a circular tube or a metal ring. Such a system, like any other mechanical system, has only a well defined set of oscillation modes. The condition for these circular systems is that when we go around the circle and follow the phase of the wave, we must arrive back at the same phase we started from; otherwise the wave would have a forbidden jump at one point. This requirement (called a boundary condition) implies that the circumference of the ring must be exactly an integer multiple of the wavelength. This is a so-called periodic boundary condition. In other words, the allowed wavelengths are given by the relation
n λ = 2 π r
This must be true for mechanical systems like a metal ring or a circular tube, so de Broglie decided that it must be true for electrons in an atom as well. This assumption was still not sufficient for obtaining the results of the Bohr model. If electrons are also waves, then one must decide what the wavelength is. Here de Broglie took a hint from photons. The wavelength of photons is related to their energy or, alternatively, to their momentum. These relations are
f = E/h and λ = hc/E
or alternatively,
f = pc/h and λ = h/p
The first choice is not very appealing, because it would assign a finite wavelength even to an electron at rest. It is hard to imagine a particle showing interference phenomena while at rest. Therefore, de Broglie adopted the second option. One must emphasize that he did not have the tiniest shred of experimental evidence for his assumption. On the other hand, the assumption that every moving particle represents a wave of wavelength h/p allowed him to give a brilliant explanation for the Bohr model. Of course, the other motivation of de Broglie for choosing the relation λ = h/p was that, combined with his relation between the wavelength and the radius, n λ = 2 π r, it gave back Bohr's results. This is fairly easy to see. Just eliminate the wavelength from these last two equations to get
λ = h/p = 2 π r/n
Rearranging, we obtain
L = rp = nh/2π,
which is just Bohr's quantization condition for angular momentum. As we showed earlier, the Balmer formula follows from this quantization condition.
Note now that though the relation λ = hc/E is not appropriate, we can still have f = E/h. This is so because for particles of finite mass the relation between f and λ is not what it is for photons. Accordingly, de Broglie also postulated that the frequency associated with matter waves is given by the formula f = E/h.
It should be remarked that it is quite sufficient to treat electrons non-relativistically in the hydrogen atom. One can easily calculate the velocities, and they come out to be a fraction of a percent of the velocity of light. The situation changes for heavy atoms (the velocity is proportional to Z), where for low n relativistic effects are non-negligible.
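The standing-wave condition and the size of the electron velocity can be checked with a few lines of Python. This is only a sketch under stated assumptions: the constants `HBAR_C`, `M_C2` and `K_E2` (in eV and nm) and the helper `bohr_orbit` are illustrative names that do not appear in the text, and the orbit radius is taken from the usual classical force balance for a circular orbit.

```python
import math

# Illustrative constants (assumed values, in eV and nm).
HBAR_C = 197.327   # hbar * c in eV nm
M_C2 = 510_999.0   # electron rest energy in eV
K_E2 = 1.44        # Coulomb constant times e^2 in eV nm


def bohr_orbit(n):
    """Return (radius in nm, wavelength in nm, v/c) for the n-th circular orbit."""
    # Force balance m v^2 / r = k e^2 / r^2 combined with r p = n hbar.
    radius = n**2 * HBAR_C**2 / (M_C2 * K_E2)
    pc = n * HBAR_C / radius                  # momentum times c, in eV
    wavelength = 2 * math.pi * HBAR_C / pc    # lambda = h / p
    beta = pc / M_C2                          # nonrelativistic v / c
    return radius, wavelength, beta


if __name__ == "__main__":
    for n in (1, 2, 3):
        r, lam, beta = bohr_orbit(n)
        # n wavelengths should fit exactly around the circumference.
        print(n, r, n * lam, 2 * math.pi * r, beta)
```

For n = 1 the ratio v/c comes out below one percent, which is why the nonrelativistic treatment is adequate for hydrogen.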
The nature of these electron waves was not clear to de Broglie. In fact, it was only clarified many years later. The natural assumption (which later turned out to be false) is that electrons are a kind of smeared object and that the smearing follows a wave pattern. Over many wavelengths the amplitude of the wave (or rather the intensity, that is, the square of the amplitude) averages out, so one sees a smooth object. In other words, a quantum object is smooth at lengths much longer than the wavelength but fuzzy at distances of the order of λ. The natural question then arises whether one could see such fuzziness in a macroscopic object.
Example: A person of mass 60 kg walks at a speed of 1 m/s. What is the wavelength of this person?
Solution: The momentum of such a person is 60 kg m/s. Then the wavelength is λ = h/p = 6.6×10^-34 J s / 60 kg m/s = 1.1×10^-35 m. This is a really small distance. I challenge anyone to make a measurement of this precision. The result is that the person does not seem fuzzy in the least.
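The same estimate can be written as a small Python function. This is a minimal sketch: the function name and the comparison case of an electron moving at 10^6 m/s are illustrative assumptions, not part of the text.

```python
# Planck's constant in J s, rounded as in the example above.
H = 6.6e-34


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Return lambda = h / p in meters for a nonrelativistic object."""
    momentum = mass_kg * speed_m_per_s
    return H / momentum


if __name__ == "__main__":
    person = de_broglie_wavelength(60.0, 1.0)
    # Assumed comparison case: an electron (9.11e-31 kg) at 1e6 m/s.
    electron = de_broglie_wavelength(9.11e-31, 1.0e6)
    print(f"person:   {person:.2e} m")
    print(f"electron: {electron:.2e} m")
```

The electron wavelength is comparable to atomic distances, while the wavelength of the person is many orders of magnitude smaller than any measurable length.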
In 1927 Davisson and Germer performed their famous experiment at Bell Laboratories. They set out to study the surface of metals, in particular nickel. They intended to do that by scattering electrons off the surface of nickel and observing the scattered electrons. They used a well collimated electron beam and observed electrons at well defined angles. Instead of the smooth spectrum they expected, they observed minima and maxima. These minima and maxima were very similar to those of the Bragg reflection of X-rays. In fact, the maxima satisfied the relation (note that φ is the angle between the incoming particles and the surface)
sin φ = n k,
where k is a constant. They also observed that k changed as a function of the momentum of the electrons (which could be easily controlled by changing the potential difference accelerating the electrons in the "electron gun"). In fact, they found that the constant k was inversely proportional to the electron momentum. Since the lattice constant of nickel was well known from X-ray diffraction experiments, they could show that the data satisfied
2d sin φ = n h/p,
proving de Broglie's relation between wavelength and momentum,
λ = h/p.
Example: Calculate the accelerating potential needed to produce electrons which, when Bragg-scattered off a single crystal of nickel, would produce the first maximum when the angle between the incoming and reflected electron beams is 50°. Take the lattice spacing to be d = 20 nm.
Solution: If the angle between the incoming and reflected waves is 50°, then the angle φ, which is measured from the surface, is
φ = (180° − 50°)/2 = 65°
Then λ = 2d sin φ = 40 nm × sin 65° = 36.3 nm.
The momentum of the electrons is p = h/λ, or equivalently, pc = hc/λ = 1240 eV nm / 36.3 nm = 34.2 eV. Consequently the momentum is p = 34.2 eV/c. Then the kinetic energy is K = p^2/2m = (pc)^2/2mc^2 = (34.2 eV)^2 / 1.02 MeV ≈ 1.1 meV (milli electron volt). So the accelerating potential is about 0.0011 V.
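The chain of steps in this solution (angle, wavelength, momentum, kinetic energy) can be sketched in Python. The function name `bragg_electron`, the constants `HC` and `TWO_MC2`, and the parameter names are illustrative assumptions; the nonrelativistic relation K = (pc)^2/2mc^2 is used, as in the text.

```python
import math

HC = 1240.0         # h * c in eV nm
TWO_MC2 = 1.022e6   # twice the electron rest energy, in eV


def bragg_electron(d_nm, angle_between_beams_deg, order=1):
    """Return (wavelength in nm, pc in eV, kinetic energy in eV)."""
    # Angle measured from the surface, as in the solution above.
    phi = math.radians((180.0 - angle_between_beams_deg) / 2.0)
    wavelength = 2.0 * d_nm * math.sin(phi) / order   # n lambda = 2 d sin(phi)
    pc = HC / wavelength                              # p c = h c / lambda
    kinetic_ev = pc**2 / TWO_MC2                      # K = (pc)^2 / (2 m c^2)
    return wavelength, pc, kinetic_ev


if __name__ == "__main__":
    lam, pc, kinetic = bragg_electron(20.0, 50.0)
    # The accelerating potential in volts equals K expressed in eV.
    print(lam, pc, kinetic)
```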
Maxima from Bragg scattering are very sharp because electrons reflected from many layers of atoms interfere. Even if there is only the slightest phase difference between rays reflected from subsequent layers, one will find another, deeper layer such that the ray reflected from it interferes destructively with the first one. In fact, in their first experiment Davisson and Germer used lower energy electrons, which can hardly penetrate nickel. In that case the interference of rays scattered from atoms in the surface layer shows broad maxima, because even if electrons scattered by neighboring atoms are slightly out of phase the waves still mostly add up.
Following Davisson and Germer, diffraction experiments were performed with H and He atoms, and later with neutrons as well. They all showed the expected wave behavior, exactly as predicted by de Broglie.
Example: How fast should a fly fly so that there would be appreciable diffraction phenomena in its path when it flies through a metal lattice with lattice constant 0.5 cm? Suppose that the mass of the fly is m = 10^-4 kg and that we are able to resolve diffraction maxima if they are at least 10^-4 radian apart. Is the obtained velocity observable?
Solution: Using Bragg's formula n λ = 2d sin φ, for small diffraction angles sin φ ≈ φ. Subsequent maxima, φ_n and φ_(n-1), then satisfy
λ = 2d (φ_n − φ_(n-1)) > 2 × 0.5 × 10^-2 × 10^-4 m = 10^-6 m = 1000 nm = 1 μm (micrometer).
In other words, we are able to notice the wave (quantum) nature of flies if their wavelength is larger than a micron. If the wavelength is larger than 10^-6 m, then the momentum is smaller than
p = h/λ < 6.6×10^-34 J s / 10^-6 m = 6.6×10^-28 kg m/s. Then the velocity of the fly is v = p/m = 6.6×10^-24 m/s.
Is this velocity detectable? Say the fly needs to travel at least 1 mm = 10^-3 m for us to notice its progress. This would take
s/v = 10^-3 m / 6.6×10^-24 m/s = 1.5×10^20 s. Now one year is about 3×10^7 s. Thus we would have to wait 5×10^12 years for the fly to fly that far. Unfortunately the age of the universe is only about 1.5×10^10 years. So we should be unable to measure the wave nature of a fly.
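The estimate can be reproduced with a short Python sketch. The function names and the rounded constants (`H`, `SECONDS_PER_YEAR`) are illustrative choices that follow the rounded values used in the example.

```python
H = 6.6e-34               # Planck's constant in J s, rounded as in the example
SECONDS_PER_YEAR = 3.0e7  # rounded, as in the example


def max_speed_for_diffraction(mass_kg, spacing_m, angular_resolution_rad):
    """Largest speed at which neighboring maxima are still resolvable."""
    # Small-angle Bragg condition: lambda = 2 d (phi_n - phi_(n-1)).
    min_wavelength = 2.0 * spacing_m * angular_resolution_rad
    max_momentum = H / min_wavelength
    return max_momentum / mass_kg


def years_to_travel(distance_m, speed_m_per_s):
    """Time needed to cover distance_m at the given speed, in years."""
    return distance_m / speed_m_per_s / SECONDS_PER_YEAR


if __name__ == "__main__":
    v = max_speed_for_diffraction(1.0e-4, 0.5e-2, 1.0e-4)
    print(f"speed: {v:.2e} m/s, wait: {years_to_travel(1.0e-3, v):.2e} years")
```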
The wave nature of an object is incompatible with its particle nature. The problem can be stated simply. Suppose a particle has a definite momentum. Then, according to de Broglie, it also has a definite wavelength and a definite frequency. The frequency is f = E/h. Then the wave corresponding to such an object, assuming that the phase is zero at the origin at t = 0, is
ψ = C cos[2π (x/λ − t f)] = C cos[(p x − E t)/ħ]
This periodic function was chosen to change by a full period when either x changes by λ or t changes by T = 1/f. It stays constant if one travels with a velocity of x/t = λ f = v_p, the phase velocity. Alternatively, we may write the wave in terms of the wave number k and the angular frequency ω, defined as k = 2π/λ, ω = 2πf. Then we have the form
ψ = C cos[k x − ω t].
Note now the relation between k and p on the one hand and between ω and E on the other hand. We have
k = 2π/λ = 2π p/h = p/ħ, ω = 2π f = 2π E/h = E/ħ
This function, describing a particle-wave object, is called a wave function. Unfortunately, such an object, though its amplitude oscillates, extends over an infinite range in space. In other words, since we believe that this function, ψ, says something about where the object (particle) is, we can say it is everywhere in space. If you think about the object as a particle, then we can say that the particle is not localized. It is everywhere in space. This is the first manifestation of a principle that will later be called the Heisenberg uncertainty relation. In this particular case it just says that since we know the momentum of the particle, it is described by a monochromatic wave, which extends over the whole of space, and so we cannot say where the particle is at all. It is everywhere. What we will see later is that if we relax the requirement of fixing the momentum (and thus the wavelength and frequency) of a particle exactly, then the particle can be localized much more precisely, though not exactly.
For a nonrelativistic particle the energy is E = p^2/2m. Then the frequency is f = E/h = p^2/2mh. Then the phase velocity is x/t = λ f = v_p = p/2m = v/2.
This is just half of the velocity of the particle. This seems surprising at first sight, but if one thinks a bit one realizes that this velocity is not the velocity of the particle. It is just the velocity of the wave crest, and it does not carry any energy. In fact, relativistically the wave crests travel faster than light. Since the particle is anywhere, as far as we know, this wave form cannot say anything about how fast the particle travels.
To escape from the extreme wave description of a particle one needs to localize it, at least to some extent. Then one needs a different analytic form for the wave. The form of a monochromatic wave (a single momentum) is unique, up to the arbitrarily chosen overall phase. Then the only possibility left is to take a superposition of waves. In fact, there is a theorem (Fourier's theorem) stating that an arbitrary smooth function can be constructed from the superposition of waves of different wavelengths. We will return to the discussion of this question later.
Our purpose is to construct a wave packet which has a finite extension, and consists of many different waves of similar frequencies. Such a wave packet will have an almost well defined momentum and an almost well defined coordinate. It has roughly the following form

Such a wave packet can only be constructed as the linear combination of infinitely many waves of different wavelengths. One can model a wave packet, however, by the linear combination of two waves of wavelengths that are close together. One obtains an infinite sequence of wave packets of similar form. For the sake of simplicity take the amplitudes of the two waves to be identical. Then we have
ψ = C ( cos[k1 x − ω1 t] + cos[k2 x − ω2 t] )
Using the rule for combining cosines, cos α + cos β = 2 cos[(α+β)/2] cos[(α−β)/2], we obtain
ψ = 2 C cos[x (k1 + k2)/2 − t (ω1 + ω2)/2] cos[x (k1 − k2)/2 − t (ω1 − ω2)/2]
Now, if the wave numbers and angular frequencies differ little, then k = (k1 + k2)/2 does not differ much from either k1 or k2. Similarly, ω = (ω1 + ω2)/2 does not differ much from ω1 or ω2. Then the first cosine is nearly the same as either of the original waves. On the other hand, the second cosine multiplier has a wave number, (k1 − k2)/2, that is much smaller than k, and an angular frequency, (ω1 − ω2)/2, that is much smaller than ω. If the wave number is small then the wavelength is large; if the angular frequency is small then the oscillation is slow. The result is that the waves created by the first multiplier are modulated by the second multiplier. The amplitude of the oscillation increases, then decreases, then vanishes, and then increases again. In music this is well known, and is called a beat. When musicians in an orchestra tune their instruments to each other, they listen to the beat created by slightly mistuned instruments. They strive to slow the beat (i.e. to decrease ω1 − ω2). When the period of the beat is infinite the two frequencies are identical (note that the period of the beat is T12 = 2π/(ω1 − ω2)).
One can understand this phenomenon in the following way. Oscillations of nearly equal frequency are in phase for a long time. Then they add up and strengthen each other's effect. But then, since the frequencies are not exactly equal, they get out of phase. After a while they will be in completely opposite phase and cancel each other. Then the total amplitude vanishes. This process repeats itself endlessly, as shown in the figure below
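The factorization into a fast carrier and a slow envelope can be checked numerically. The following Python sketch uses illustrative function names and arbitrary sample values for k1, k2, ω1, ω2; none of these appear in the text.

```python
import math


def two_wave_sum(x, t, k1, w1, k2, w2, amplitude=1.0):
    """Direct sum of two cosine waves of equal amplitude."""
    return amplitude * (math.cos(k1 * x - w1 * t) + math.cos(k2 * x - w2 * t))


def carrier_times_envelope(x, t, k1, w1, k2, w2, amplitude=1.0):
    """The same wave written as a fast carrier times a slow envelope."""
    carrier = math.cos(x * (k1 + k2) / 2 - t * (w1 + w2) / 2)
    envelope = 2 * amplitude * math.cos(x * (k1 - k2) / 2 - t * (w1 - w2) / 2)
    return carrier * envelope


def beat_period(w1, w2):
    """Period of the beat, T12 = 2 pi / |w1 - w2|."""
    return 2 * math.pi / abs(w1 - w2)


if __name__ == "__main__":
    k1, w1, k2, w2 = 1.0, 10.5, 0.9, 9.5   # arbitrary sample values
    for x, t in ((0.0, 0.0), (0.37, 1.2), (-2.5, 4.0)):
        print(two_wave_sum(x, t, k1, w1, k2, w2),
              carrier_times_envelope(x, t, k1, w1, k2, w2))
    print("beat period:", beat_period(w1, w2))
```

At x = 0 the two waves cancel half a beat period after they were in phase, which is the vanishing amplitude described above.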

We can see that there are two very different velocities. One is the velocity of the wave inside the wave packet,
v_p = ω/k,
while the other one is the group velocity, the velocity of the envelope,
v_g = (ω1 − ω2)/(k1 − k2) = Δω/Δk.
Let us calculate the group velocity for nonrelativistic and relativistic particles, provided the difference between the waves is infinitesimal. For the infinitesimal case the group velocity is defined as the limit of the above formula:
v_g = dω/dk = dE/dp.
For a nonrelativistic particle we obtain
v_g = dE/dp = (1/2m) d(p^2)/dp = p/m = v
For a relativistic particle we obtain
v_g = dE/dp = d(p^2 c^2 + m^2 c^4)^(1/2)/dp = p c^2/E = m c^2 γ v / m c^2 γ = v
We can see that both relativistically and nonrelativistically the group velocity is exactly equal to the velocity of the particle.
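The two derivatives can also be checked numerically. The sketch below is written under stated assumptions: units are chosen so that c = 1, the derivative dE/dp is approximated by a central difference, and all function names are illustrative.

```python
import math


def energy_nonrel(p, m):
    """Nonrelativistic kinetic energy E = p^2 / 2m."""
    return p**2 / (2 * m)


def energy_rel(p, m, c=1.0):
    """Relativistic energy E = (p^2 c^2 + m^2 c^4)^(1/2)."""
    return math.sqrt((p * c) ** 2 + (m * c**2) ** 2)


def group_velocity(energy, p, m, dp=1e-6):
    """Central-difference approximation of v_g = dE/dp."""
    return (energy(p + dp, m) - energy(p - dp, m)) / (2 * dp)


def phase_velocity(energy, p, m):
    """v_p = omega / k = E / p."""
    return energy(p, m) / p


if __name__ == "__main__":
    p, m = 0.5, 1.0   # sample values in units with c = 1
    print("nonrelativistic:", group_velocity(energy_nonrel, p, m),
          phase_velocity(energy_nonrel, p, m))
    print("relativistic:   ", group_velocity(energy_rel, p, m),
          phase_velocity(energy_rel, p, m))
```

In the relativistic case the group velocity stays below c = 1 while the phase velocity exceeds it, in line with the earlier remark about wave crests.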
The wave given by the superposition of two simple plane waves is not really a wave packet. It looks almost like a wave packet repeating itself infinitely many times. The reason for this is the simplicity of the construction. The more wave components of nearby wavelengths and frequencies one uses, the better a real wave packet can be approximated.
One can use the above superposition of waves to get another, better glimpse of the complementarity between having definite momenta and definite coordinates. Let us, for a moment, forget about the repetition of the wave packets. Then the fuzziness of a particle described by such a wave is the length of the wave packet, determined by the relation
Δx Δk = 2π.
In other words the fuzziness, or uncertainty of location, of the particle is related to the fuzziness of the wave number. The wave number is, however, related to the wavelength, and the wavelength, using de Broglie's relation, to the momentum. Namely
Δk = 2π (1/λ1 − 1/λ2) = 2π (p1 − p2)/h = 2π Δp/h
Substituting into Δx Δk = 2π we obtain
Δx Δp = h,
which, up to a numerical constant, is the Heisenberg uncertainty relation between momentum and coordinate. In other words, we cannot fix the momentum and the coordinate of the particle at the same time. The more precise one is, the less precise the other must necessarily become. We saw the extreme example of this earlier: if one takes a monochromatic wave, then the momentum is exactly defined, but then we cannot pin down the location of the particle at all. This form of the Heisenberg uncertainty relation is approximate, valid for the particular case of two simple waves. Generally it will take the form of an inequality, to be discussed later.
A relation can be obtained between the uncertainty of time and energy, using similar considerations. The wave packet limits the time interval in which the particle can be found at a given coordinate as
Δt Δω = 2π
Then, using the relation between frequency and energy, we obtain
Δω = 2π (f1 − f2) = 2π (E1 − E2)/h = 2π ΔE/h
Then we obtain
Δt Δω = Δt 2π ΔE/h = 2π,
or in other words
Δt ΔE = h.
We will see later that the waves of quantum mechanics are by necessity complex. In other words we will write waves as
ψ = C exp[ikx − iωt]
for a wave traveling to the right and
ψ = C exp[−ikx − iωt]
for a wave traveling to the left. The real part of these wave forms, using Euler's relation
e^(ix) = cos x + i sin x,
becomes identical to the wave forms we used previously.
Euler's relation can be easily proved if we define the exponential, as we should, by its Taylor series around x = 0. Then, using the relations i^2 = −1, i^3 = −i, i^4 = 1, we can separate the even and the odd order terms of the Taylor series. Even order terms are real and form exactly the series for cos x, while odd order terms are imaginary, and the coefficients of i form exactly the series for sin x.
Suppose first that we are only interested in the shape of the wave at t = 0. Then a general wave is a linear combination of waves of the form C exp[ikx]. If we wish to include all the possible values of the wave number then, since they form a continuous set, we need to integrate over the waves of various k. The most general expression is obtained as
ψ(x) = [1/(2π)^(1/2)] ∫ f(k) exp[ikx] dk
where f(k) is the amplitude of the component with wave number k. Fourier's theorem says that every sufficiently well behaved function of x, ψ(x), can be represented in this form (the condition is that the integral of |ψ(x)|^2 over all admissible coordinates exists). Then to find a nice wave packet we have to choose an appropriate weight function f(k). An integral like the one above is called a Fourier integral. The great thing about Fourier integrals is that they are invertible. In other words, knowing ψ(x) we are able to calculate the function f(k). Give me the shape of the wave packet you want and I will find the weight function f(k) that chooses the appropriate linear combination of waves to result in this function. The inverse formula is
f(k) = [1/(2π)^(1/2)] ∫ ψ(x) exp[−ikx] dx.
The integral over functions containing the imaginary unit i should not disturb you. It should be treated just like any other constant. One could also say that when one integrates a complex function, one can split the function into its real and imaginary parts and integrate those independently, i.e.
∫ f(k) dk = ∫ [Re f(k) + i Im f(k)] dk = ∫ Re f(k) dk + i ∫ Im f(k) dk
In the last form the integrands are real, so they are well defined. The end result is the same as if we just operated with i as a constant.
Example: Do the integral of e^(ix) from a to b in two different ways: 1) working with i as with any other constant, and 2) using Euler's relation.
Solution: If we do the integral by regarding i as any other constant, we obtain ∫_a^b e^(ix) dx = (1/i)[e^(ib) − e^(ia)]. On the other hand, if we use Euler's relation first, then we have ∫_a^b e^(ix) dx = ∫_a^b [cos x + i sin x] dx = sin b − sin a − i (cos b − cos a) = −i (cos b + i sin b) + i (cos a + i sin a) = (1/i) e^(ib) − (1/i) e^(ia), which agrees with the previous result obtained by working with i as with any other constant.
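A numerical check of this example can be made with Python's built-in complex numbers. The sketch assumes a simple midpoint rule; the function names and the sample limits a = 0.3, b = 2.0 are illustrative.

```python
import cmath
import math


def integrate_complex(func, a, b, steps=10_000):
    """Midpoint-rule integral of a complex-valued function of a real variable."""
    width = (b - a) / steps
    total = 0j
    for j in range(steps):
        total += func(a + (j + 0.5) * width)
    return total * width


def closed_form(a, b):
    """(1/i) [exp(ib) - exp(ia)], treating i like any other constant."""
    return (cmath.exp(1j * b) - cmath.exp(1j * a)) / 1j


if __name__ == "__main__":
    a, b = 0.3, 2.0   # sample limits
    numeric = integrate_complex(lambda x: cmath.exp(1j * x), a, b)
    print(numeric, closed_form(a, b))
    # The real and imaginary parts match the separate real integrals.
    print(math.sin(b) - math.sin(a), -(math.cos(b) - math.cos(a)))
```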
Example: Fourier decomposition of a square wave. ψ(x) is defined by ψ(x) = A if −c < x < c, and it is zero everywhere else. Then we obtain f(k) as
f(k) = [1/(2π)^(1/2)] ∫ A exp[−ikx] dx,
where the limits of the integral are −c and c. The integral of the exponential function is easily taken. We obtain
f(k) = [A/(2π)^(1/2)] (1/(−ik)) ( exp[−ikc] − exp[+ikc] ) = [2A/(k (2π)^(1/2))] sin kc
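The result can be compared with a direct numerical evaluation of the Fourier integral. This is a sketch under stated assumptions: a midpoint rule is used, the pulse height is kept as an explicit factor, and the names `fourier_amplitude` and `square_pulse_amplitude` as well as the sample values are illustrative.

```python
import cmath
import math


def fourier_amplitude(psi, k, x_min, x_max, steps=20_000):
    """Midpoint-rule estimate of (2 pi)^(-1/2) times the integral of psi(x) exp(-ikx)."""
    width = (x_max - x_min) / steps
    total = 0j
    for j in range(steps):
        x = x_min + (j + 0.5) * width
        total += psi(x) * cmath.exp(-1j * k * x)
    return total * width / math.sqrt(2 * math.pi)


def square_pulse_amplitude(k, height, c):
    """Closed form for a pulse of the given height on -c < x < c."""
    return 2 * height * math.sin(k * c) / (k * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    height, c = 1.5, 2.0   # sample pulse
    for k in (0.7, 1.3, 3.0):
        # The pulse vanishes outside (-c, c), so integrate over that range only.
        numeric = fourier_amplitude(lambda x: height, k, -c, c)
        print(k, numeric.real, square_pulse_amplitude(k, height, c))
```

Because the pulse is symmetric around the origin, the imaginary part of the numerical result vanishes up to rounding.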
Example: Gaussian wave packet. A combination of waves formed with a Gaussian weight distribution best approximates the way we really imagine a wave packet. Assume that the weight of the waves is
f(k) = C exp[−(k − k0)^2/4(Δk)^2],
where k0 is the average wave number and 2Δk is the width of the k distribution. Such a function is called Gaussian (an exponential of a quadratic function). First, this function is symmetric around k0. That is to say, it takes the same value at k = k0 − a and at k = k0 + a. This is fairly obvious from the form. It is maximal where the exponent vanishes, at k = k0. If k moves away from k0 then the function decreases, first slowly, then faster and faster. When |k − k0| = 2Δk the function drops to e^-1 of its value at the maximum, C. At twice that distance from k0, when |k − k0| = 4Δk, the function is e^-4 C, etc. 2Δk is called the half width of the distribution. The curve has the following form
Let us now calculate the wave function ψ(x):
ψ(x) = [1/(2π)^(1/2)] ∫ f(k) exp[ikx] dk = [1/(2π)^(1/2)] ∫ C exp[−(k − k0)^2/4(Δk)^2 + ikx] dk,
where the range of integration is the whole k axis. This integral can be performed by completing the square in the exponent of the integrand. Note that the exponent has the form
−(k − k0)^2/4(Δk)^2 + ikx = −k^2/4(Δk)^2 + k k0/2(Δk)^2 + ikx − k0^2/4(Δk)^2
= −k^2/4(Δk)^2 + k [k0/2(Δk)^2 + ix] − k0^2/4(Δk)^2
= −{k^2 − 2k [k0 + 2ix(Δk)^2] + [k0 + 2ix(Δk)^2]^2}/4(Δk)^2 + {[k0 + 2ix(Δk)^2]^2 − k0^2}/4(Δk)^2
= −[k − k0 − 2ix(Δk)^2]^2/4(Δk)^2 + {[k0 + 2ix(Δk)^2]^2 − k0^2}/4(Δk)^2
= −[k − k0 − 2ix(Δk)^2]^2/4(Δk)^2 − x^2(Δk)^2 + ix k0
Notice now that the first term 1) contains all the k dependence and 2) is the square of a linear function of k. Substituting this into the expression for ψ(x) we obtain
ψ(x) = C exp{−x^2(Δk)^2 + ix k0} [1/(2π)^(1/2)] ∫ exp{−[k − k0 − 2ix(Δk)^2]^2/4(Δk)^2} dk
In the integral over k we can substitute k − k0 − 2ix(Δk)^2 --> k Δk, to get Δk times an integral that does not depend on any parameters, ∫ exp{−k^2/4} dk = 2π^(1/2). Together with the prefactor 1/(2π)^(1/2) this gives a factor 2^(1/2) Δk, so we finally obtain
ψ(x) = 2^(1/2) C Δk exp{−x^2(Δk)^2 + ix k0}
We can see that the resulting wave function is also Gaussian in x-space, centered around the origin. It also oscillates with the "average" wave number, k0. The real part of this wave function has a form very much like the wave packet shown before.
These distributions, as we will interpret them later, are amplitudes of a probability distribution. |ψ(x)|^2 dx provides the probability that the particle described by the wave function ψ(x) is in the interval (x, x+dx). Actually, the above expression is only proportional to the probability. As we learned earlier, we need to normalize the distribution to get a true probability distribution. In other words,
P(x) dx = |ψ(x)|^2 dx / ∫ |ψ(x)|^2 dx,
because the integral of this distribution is now 1. We also learned how to find the average of a variable described by a probability distribution:
⟨x⟩ = ∫ x P(x) dx = ∫ x |ψ(x)|^2 dx / ∫ |ψ(x)|^2 dx
In our particular case, for the Gaussian distribution we obtained, the average value of x, or as it is usually called, the expectation value of x, is 0. This is true because the integrand in the numerator of ⟨x⟩ is an odd function that is integrated over a symmetric interval. One can also say that for every negative value of x there is a positive value with the same weight. To describe a distribution more precisely one needs information about the spread of the distribution around the average value. The average deviation from the average value is of course zero, because positive and negative deviations from the average cancel. If one defines the average of a function of x, g(x), the same way as one defined the average of x, i.e.
⟨g(x)⟩ = ∫ g(x) P(x) dx = ∫ g(x) |ψ(x)|^2 dx / ∫ |ψ(x)|^2 dx,
then the average deviation is ⟨x − ⟨x⟩⟩ = ⟨x⟩ − ⟨x⟩ = 0. One needs a better measure of the average deviation from the average. Obviously there is no cancellation of contributions if they are all positive. Thus the average
(Δx)^2 = ⟨(x − ⟨x⟩)^2⟩ = ∫ (x − ⟨x⟩)^2 P(x) dx = ∫ (x − ⟨x⟩)^2 |ψ(x)|^2 dx / ∫ |ψ(x)|^2 dx
is an appropriate way to define the average deviation from the average. If we calculate Δx for our Gaussian distribution exp{−x^2(Δk)^2 + ix k0} then we obtain Δx = 1/2Δk.
Now, the function f(k), the "momentum wave function" (note that p = hk/2π), provides the average momentum and the average deviation from the average momentum in the same way. |f(k)|^2 dk is (proportional to) the probability that the wave number is between k and k+dk. Then a simple calculation shows that for the distribution
f(k) = C exp[−(k − k0)^2/4(Δk)^2]
the average wave number is ⟨k⟩ = k0 and
the width of the wave number distribution is (⟨(k − k0)^2⟩)^(1/2) = Δk.
Now if we multiply the uncertainty of the coordinate (the size of the wave packet) by that of the wave number (the width of the wave number distribution) we obtain
Δx Δk = (1/2Δk) Δk = 1/2
This translates to the uncertainty relation between the momentum and the coordinate, using
k = 2π/λ = 2π p/h, so that Δx Δp = h/4π. This is a manifestation of the Heisenberg uncertainty relation that will be discussed below. In fact, it can be shown that no other combination of monochromatic waves can achieve a smaller value of the product Δx Δp. The Gaussian wave packet is the minimum uncertainty wave packet. For all other combinations Δx Δp > h/4π.
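The product Δx Δk = 1/2 can be verified numerically from the two Gaussian densities. In this Python sketch the helper `spread`, the density functions, and the sample values k0 = 5 and Δk = 0.8 are illustrative assumptions; the densities are the squared moduli of f(k) and of the Gaussian wave function, with constant prefactors dropped.

```python
import math


def spread(density, lo, hi, steps=20_000):
    """Return (mean, rms deviation) of an unnormalized density on [lo, hi]."""
    width = (hi - lo) / steps
    points = [lo + (j + 0.5) * width for j in range(steps)]
    weights = [density(u) for u in points]
    norm = sum(weights)
    mean = sum(u * w for u, w in zip(points, weights)) / norm
    variance = sum((u - mean) ** 2 * w for u, w in zip(points, weights)) / norm
    return mean, math.sqrt(variance)


def k_density(k, k0, dk):
    """|f(k)|^2 for f(k) = C exp[-(k - k0)^2 / 4 dk^2], with C dropped."""
    return math.exp(-((k - k0) ** 2) / (2 * dk**2))


def x_density(x, dk):
    """Squared modulus of the Gaussian wave function, prefactor dropped."""
    return math.exp(-2 * x**2 * dk**2)


if __name__ == "__main__":
    k0, dk = 5.0, 0.8   # sample values
    mean_k, width_k = spread(lambda k: k_density(k, k0, dk), k0 - 10 * dk, k0 + 10 * dk)
    mean_x, width_x = spread(lambda x: x_density(x, dk), -10.0, 10.0)
    print(mean_k, width_k, mean_x, width_x, width_x * width_k)
```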
Heisenberg realized in the mid-1920s that the relation between the uncertainty of the momentum and that of the coordinate, which we have been discussing, is a general rule. In fact, using general arguments and a precise mathematical definition of the uncertainties of the momentum and of the coordinate, one can prove that the uncertainties must always satisfy the relation
Δx Δp_x ≥ h/4π.
Here p_x denotes the component of the momentum along the x axis. Similar uncertainty relations hold for the other coordinate and momentum components. The x coordinate and the momentum components p_y, p_z can, however, be determined simultaneously. As Heisenberg emphasized, these uncertainties are the consequences of deep laws of nature and not of our inability to make precise experiments. As we saw earlier, they are the consequence of the dual nature of particles. This is exactly the price we must pay for having objects that behave in some circumstances as particles and in other circumstances as waves. A similar uncertainty relation connects the precision of measuring the energy of the system and the time it takes to perform the measurement:
ΔE Δt ≥ h/4π.
We saw this relation emerge earlier in the simple example of the superposition of two monochromatic waves and in other examples, such as the square pulse and Gaussian wave packet.
Heisenberg devised a thought experiment that shows why the dual nature of particles prevents us from knowing the exact location and momentum of particles. This thought experiment is described in the book in much detail. I would rather provide some examples to illustrate the power of the uncertainty relation. First of all, the notion that the coordinate and the momentum of a particle cannot be defined exactly at the same time is completely alien to classical physics. We have to show that it does not contradict our everyday experience.
Example: Uncertainty in the location and velocity of germs. Take the extreme example of trying to measure the velocity and location of a germ moving under a powerful microscope. People do this and do not observe any hint of the uncertainty relation. Suppose the size of the germ is r = 10^-6 m = 1 micron. Suppose we are able to measure the location of the germ with 1% precision, to Δx = 10^-8 m. A good guess is that the density, ρ, of the germ is the same as that of water (it needs to be able to navigate in water!), i.e. 1000 kg/m^3. What is the minimum uncertainty of its velocity? Is this measurable under the microscope? Well, we know from the uncertainty relation that Δp > h/(4π Δx), or, taking the germ to be a sphere of mass m = 4π ρ r^3/3, Δv > h/(4π m Δx) = 3h/(16π^2 ρ r^3 Δx) ≈ 1.3×10^-12 m/s. Again, we can perceive the germ moving if it moves away by Δx = 10^-8 m (which is really a very, very low limit). At this velocity it would take the germ roughly t ≈ 10^4 s, that is 2 to 3 hours, to move that far. Germs naturally move much faster than that. Also, even dead germs would move much more, due to flows in the fluid. The conclusion is that it is hopeless to detect the effect of the uncertainty relation on the motion of germs.
Example: Localization of electrons in metals. Conduction band electrons, as we learned when studying the photoelectric effect, have a binding energy of the order of 2-4 eV. They are supposed to behave as free electrons in a box, allowed to roam around; in other words, their wave functions extend to macroscopic sizes. What is the smallest size of a piece of metal such that conduction electrons can be confined to it?
Solution: An upper limit on the uncertainty of the momentum is
Δp c < p c < (2mc^2 E)^(1/2) = (2 × 500 keV × 4 eV)^(1/2) = 2,000 eV,
where E is the binding energy, because for bound electrons the kinetic energy cannot be larger than the binding energy.
Thus the uncertainty of the coordinate is
Δx > hc/[4π (2mc^2 E)^(1/2)] = hc/(4π × 2,000 eV) = 0.05 nm.
Thus if the metal piece is larger than 0.05 nm, then the uncertainty principle allows the electrons to be bound inside the metal.
Example: Localization of electrons inside nuclei. Early nuclear models envisioned electrons bound inside the nucleus to explain the fact that A > 2Z for most nuclei. Let us check whether this is possible.
Solution: Take the deuteron nucleus as an example. According to the model suggested above it should contain 2 protons and an electron, since Z = 1 and A = 2. The size of the deuteron is approximately r = 10⁻¹⁵ m. Then Δx < 10⁻¹⁵ m = 10⁻⁶ nm. The electron localized inside the deuteron has a potential energy of approximately E ≈ −ke²/r, so its binding energy is about |E| ≈ ke²/r.
Then, as before,
Δp c < p c < (2mc²|E|)^(1/2) = (2 × 500 keV × 1.44 eV nm / 10⁻⁶ nm)^(1/2) = 1.2 MeV.
Then Δx Δp c < 1.2 MeV × 10⁻⁶ nm = 1.2 eV nm.
This, however, contradicts the uncertainty relation
Δx Δp c > hc / 4π = 1240 eV nm / 4π ≈ 100 eV nm.
Consequently, no electron can be confined inside the nucleus.
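The same test can be written as a small function that compares the largest possible product Δx Δp c with the bound hc / 4π. This is a sketch built on the assumptions of the text: a nonrelativistic momentum estimate and a Coulomb binding energy ke²/r. The function name and the atom-sized comparison value are illustrative.

```python
import math

HC_EV_NM = 1240.0   # h*c in eV nm
KE2_EV_NM = 1.44    # Coulomb constant times e^2 in eV nm
MC2_EV = 500e3      # electron rest energy in eV, rounded as in the text

def confinement_check(size_nm):
    """Return (largest dx*dp*c, required minimum, allowed?) for an electron
    bound by Coulomb attraction in a region of the given size."""
    binding_ev = KE2_EV_NM / size_nm
    pc_max_ev = math.sqrt(2.0 * MC2_EV * binding_ev)
    product = size_nm * pc_max_ev
    bound = HC_EV_NM / (4.0 * math.pi)
    return product, bound, product >= bound

# Deuteron-sized region, 1e-6 nm, as in the example.
product_d, bound, allowed_d = confinement_check(1e-6)

# Atom-sized region of 0.05 nm (a comparison value, not from the text).
product_a, _, allowed_a = confinement_check(0.05)

print(f"nucleus: {product_d:.2f} eV nm vs {bound:.1f} eV nm, allowed: {allowed_d}")
print(f"atom:    {product_a:.0f} eV nm vs {bound:.1f} eV nm, allowed: {allowed_a}")
```

With these inputs the nucleus-sized region fails the test while the atom-sized one passes it.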
Example: Finite width of spectral lines. According to quantum mechanics, spectral lines are sharp energy lines. Observed lines, however, have a finite width. There are two reasons for this phenomenon. One is the so-called natural linewidth, which we are going to explore below. The other is line broadening due to thermal motion. At nonzero absolute temperature (which means always) atoms in a gas perform more or less random motion, with an average kinetic energy proportional to the temperature. Since the motion is random, some of the atoms emitting photons are moving away from the observer, some towards the observer, some sideways. The radiation reaching the observer is then Doppler shifted, some of it redshifted, some blueshifted. The net result is a smearing of the spectral line by an amount proportional to the square root of the absolute temperature. In contrast, the natural linewidth is independent of temperature and is inherent in our world, which is a quantum world. Decaying systems follow an exponential decay law. The lifetime, τ, is only an average lifetime. After τ seconds the probability that an individual atom has not decayed is e⁻¹ ≈ 0.368. Within roughly this time interval it is therefore uncertain whether the system undergoes a decay process or not. This is the uncertainty of time, Δt = τ. But systems that have a finite uncertainty of time have a nonvanishing uncertainty of energy as well. The uncertainty of energy of this system is larger than
ΔE > Γ = h / (4π τ),
where Γ is the so-called half width of the energy distribution. Since ΔE = h Δf, we obtain that the frequency of emitted photons has a minimum uncertainty of
Δf > 1 / (4π τ).
For example, if the lifetime of an atomic state is τ = 10⁻⁸ s, then the natural width of the spectral line of the photon emitted in the decay of this excited state is Δf = 10⁸ / 4π Hz ≈ 8 MHz.
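The natural linewidth follows directly from the lifetime, as this short sketch shows. It uses the convention of the text, Γ = h / (4π τ) = ħ / (2τ); the function name is illustrative.

```python
import math

HBAR_EV_S = 6.582e-16  # reduced Planck constant in eV s

def natural_width(lifetime_s):
    """Return (frequency width in Hz, energy half width in eV) for a given lifetime."""
    delta_f_hz = 1.0 / (4.0 * math.pi * lifetime_s)
    gamma_ev = HBAR_EV_S / (2.0 * lifetime_s)  # h / (4 pi tau) = hbar / (2 tau)
    return delta_f_hz, gamma_ev

delta_f, gamma = natural_width(1e-8)  # atomic state with a lifetime of 1e-8 s

print(f"delta f >= {delta_f / 1e6:.1f} MHz, Gamma = {gamma:.2e} eV")
```

The energy width is tiny compared with the eV-scale energy of an optical photon, which is why spectral lines look sharp.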
In the scattering of elementary particles, e.g. of protons and π⁺ mesons, one encounters prominent bumps in the total cross section, something like what we can see on the graph below:
The reason for such a bump is that a short-lived intermediate state, an unstable particle, is formed. Such a particle is usually called a resonance. If this particle were stable, there would be an infinitely sharp bump in the cross section. The finite width signifies that the system is unstable. In fact, by measuring the cross section one can find out the lifetime of the intermediate unstable particle. The Δ⁺⁺ particle formed in the collision of protons and π⁺ mesons has a width of Γ = 120 MeV. What is its lifetime? Use
τ = h / (4π Γ) = 0.527×10⁻³⁴ J s / (120 × 1.6×10⁻¹³ J) = 2.7×10⁻²⁴ s,
a very short time indeed. There is no way such a short lifetime could be measured directly. Still, using the relation between the width of the resonance and the lifetime, we have very precise information about its lifetime.
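The conversion from a measured resonance width to a lifetime is a one-line calculation. The sketch below works in MeV rather than joules and uses the same convention as the text, τ = h / (4π Γ) = ħ / (2Γ); the function name is illustrative.

```python
HBAR_MEV_S = 6.582e-22  # reduced Planck constant in MeV s

def lifetime_from_width(gamma_mev):
    """Lifetime tau = h / (4 pi Gamma) = hbar / (2 Gamma), in seconds."""
    return HBAR_MEV_S / (2.0 * gamma_mev)

tau_delta = lifetime_from_width(120.0)  # width of 120 MeV, as in the example

print(f"tau = {tau_delta:.1e} s")
```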
After all this discussion we are left with the basic question: What is the meaning of the wave function? Before answering this question let us discuss some of the basic principles behind quantum mechanics. These are more like axioms than something one can derive. Intuitively we understand that, to be considered a wave, a particle amplitude should be described by a periodic function. We also understand that to get interference phenomena we need to allow the superposition principle. In other words, we need to allow that the linear combination of two admissible wave forms for a particle is also an admissible wave. That is to say, if ψ₁ and ψ₂ are both admissible (i.e. describe real systems) then the wave function
ψ = ψ₁ + ψ₂
is also admissible. This is the characteristic property of systems described by linear differential equations, like the Maxwell equations, which are linear in E, B, and their derivatives. So we expect to have an equation that is linear in ψ. Furthermore, these wave functions are in general complex, so they cannot be directly related to intensities. The simplest way to associate a real, non-negative number with a complex one, analogous to taking the square in electrodynamics, is to take the absolute square of the wave function. This gives a picture of interference phenomena very similar to that of electrodynamic theory. There, one takes the square of the electric field to calculate the radiation intensity. After taking the square one can explain all the familiar diffraction and interference phenomena.
Interference phenomena become obvious if one calculates the absolute square of the wave function ψ,
|ψ|² = |ψ₁ + ψ₂|² = (ψ₁ + ψ₂)(ψ₁* + ψ₂*) = |ψ₁|² + |ψ₂|² + ψ₂ψ₁* + ψ₁ψ₂*,
where the star signifies the complex conjugate. Now every complex number can be written as
z = |z| e^(iα) = |z| cos α + i |z| sin α = Re z + i Im z,
where |z| is the modulus of the complex number and α is its phase. The complex conjugate is of course
z* = Re z − i Im z = |z| cos α − i |z| sin α = |z| e^(−iα).
Then we can write |ψ|² as
|ψ|² = |ψ₁|² + |ψ₂|² + |ψ₂||ψ₁| [e^(i(α₂−α₁)) + e^(i(α₁−α₂))]
= |ψ₁|² + |ψ₂|² + 2|ψ₂||ψ₁| cos(α₁ − α₂),
where α₁ and α₂ are the phases of ψ₁ and ψ₂. Clearly, depending on the relative phase, α₁ − α₂, of the two complex wave functions there is constructive or destructive interference. At fixed magnitudes of ψ₁ and ψ₂ the value of |ψ|² for destructive interference is (|ψ₁| − |ψ₂|)², while for constructive interference it is (|ψ₁| + |ψ₂|)². These happen at the phases α₁ − α₂ = π and 0, respectively. Of course, similar but much more complicated interference phenomena occur in the construction of the Gaussian wave packet.
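The interference formula can be verified with ordinary complex arithmetic. The following Python sketch builds two complex amplitudes from their moduli and phases and compares |ψ₁ + ψ₂|² with the cosine form; the magnitudes 3 and 2 are arbitrary illustrative values.

```python
import cmath
import math

def intensity_direct(mag1, mag2, phase1, phase2):
    """|psi1 + psi2|^2 computed directly from the complex amplitudes."""
    psi1 = cmath.rect(mag1, phase1)  # mag1 * exp(i * phase1)
    psi2 = cmath.rect(mag2, phase2)
    return abs(psi1 + psi2) ** 2

def intensity_cosine(mag1, mag2, phase1, phase2):
    """The same quantity from the interference formula with the cosine term."""
    return mag1 ** 2 + mag2 ** 2 + 2.0 * mag1 * mag2 * math.cos(phase1 - phase2)

# Relative phase 0 (constructive) and pi (destructive) for magnitudes 3 and 2.
constructive = intensity_direct(3.0, 2.0, 0.7, 0.7)
destructive = intensity_direct(3.0, 2.0, 0.7 + math.pi, 0.7)

print(constructive, destructive)
```

Only the relative phase matters: adding the same constant to both phases leaves the intensity unchanged.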
Now we will discuss the interpretation of the wave function. We have already decided that the wavefunction itself, being complex, cannot have a direct physical significance. A physical interpretation can only be given to the quantity |ψ|². One natural interpretation could be that particles are not really pointlike, but rather diffuse objects, and that |ψ|² is the energy (or mass) density of the particle at a given point, just like E² is the energy density created by an electric field. In other words, |ψ|² would provide the distribution of matter. This interpretation, however, does not hold water. It contradicts the requirement of causality. Causality is a codename for the requirement that no signal can travel faster than light. To see the problem with this interpretation, imagine that in an experiment the wavefunction of a particle spreads out to large distances. Suppose that we set up a small detector somewhere where the wavefunction is nonzero. A detector never detects half a particle, say half an electron. Either it detects a whole electron or nothing. At the time of the detection the wave function of the electron spreads way beyond the boundaries of the detector. Yet, in the next instant the particle is entirely inside the detector. In other words, it is sucked into the detector instantaneously. If the wavefunction represented real energy, then energy would be transmitted instantaneously, certainly faster than the speed of light. This contradicts causality. Even if there were sufficient time for the faraway pieces of the electron to be sucked into the detector, it is not clear how the signal to rush in would be given to those faraway parts of the electron. There is no logical way to explain this.
The probability interpretation of the wavefunction: Once the equation governing the time development of the wavefunction is established, it does not really matter what the interpretation of the wavefunction is, because we would be able to make predictions with confidence. Still, philosophically, the above interpretation is unacceptable. The universally accepted interpretation of the wavefunction is due to Max Born. He postulated that, instead of the physical extent of the particle, the wavefunction provides the probability of finding the particle at a certain coordinate. More exactly,
ρ(x) dx = |ψ(x)|² dx / ∫ |ψ(x')|² dx'
is the probability that the particle is found between the coordinates x and x+dx. This distribution is constructed in such a way that when one integrates it over all possible coordinates one obtains one,
∫ ρ(x) dx = ∫ |ψ(x)|² dx / ∫ |ψ(x')|² dx' = 1,
as one should, because the probability that one finds the particle anywhere is one, if the particle is known to exist. The factor
N = 1 / ∫ |ψ(x')|² dx'
is a constant, called a normalization factor. Without that factor the probability density, ρ(x), would not be correctly normalized (i.e. its integral would not be equal to one). We will see later that one can normalize the wavefunction itself (multiply it by N^(1/2)) so that no extra multiplication is needed.
Note that the probability density, ρ(x), gives a measurable number to the experimentalist who performs a measurement. If one integrates it over the volume of the detector, ∫ ρ(x) dx, then one obtains the probability that one finds the electron inside the detector. In other words, if one performs an experiment in which one creates n electrons then, on average, n ∫ ρ(x) dx of them will be detected by the detector (here we assume that the detector has 100% efficiency).
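To make the normalization and the detector probability concrete, here is a numerical sketch. The Gaussian wave function, the detector interval from -1 to 1, and the number of electrons are assumptions chosen only for illustration.

```python
import math

def psi(x, sigma=1.0):
    """Unnormalized Gaussian wave function (an illustrative choice, not from the text)."""
    return complex(math.exp(-x * x / (4.0 * sigma ** 2)))

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def abs_square(x):
    return abs(psi(x)) ** 2

# Normalization factor N = 1 / integral of |psi|^2; the range [-20, 20]
# stands in for "all x" because the Gaussian is negligible beyond it.
N = 1.0 / integrate(abs_square, -20.0, 20.0)

def rho(x):
    """Normalized probability density."""
    return N * abs_square(x)

total = integrate(rho, -20.0, 20.0)       # should be 1
p_detector = integrate(rho, -1.0, 1.0)    # detector covering -1 <= x <= 1
expected_hits = 1000 * p_detector         # average count for n = 1000 electrons

print(f"total = {total:.4f}, P(detector) = {p_detector:.4f}, hits ~ {expected_hits:.0f}")
```

Multiplying ψ by any constant changes N but leaves ρ(x), and therefore every predicted count, unchanged.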
The probability interpretation completely contradicts our common sense, which is, of course, based on our everyday experience. It was so repulsive to Einstein that he said "God does not play dice with nature." Still, the probability interpretation is the only viable one existing today. The problem most people have with this interpretation is that we lose the predictability of the world, a cherished concept since the age of enlightenment, going back two hundred years. Born said that the world is inherently unpredictable. For every possible outcome of an experiment the chance of that outcome can be calculated, but there is no way to predict which outcome will be realized in any individual case. If the experiment is repeated many times then we can predict approximately how many times a given outcome will occur, but that is the best we can do. It is just like throwing a die 6 million times: we can predict that we will get approximately 1 million sixes, give or take about 1,000.
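The size of the "give or take" in the dice example follows from the binomial distribution, a standard result of probability theory that is not derived in these notes: for n throws with success probability p the mean is np and the standard deviation is (np(1−p))^(1/2). The Python sketch below evaluates this and, as an illustration, simulates a smaller run of 60,000 throws with a fixed seed.

```python
import math
import random

def expected_count_and_spread(n_rolls, p=1.0 / 6.0):
    """Mean and standard deviation of the number of sixes in n_rolls throws."""
    mean = n_rolls * p
    sigma = math.sqrt(n_rolls * p * (1.0 - p))
    return mean, sigma

mean_big, sigma_big = expected_count_and_spread(6_000_000)

def count_sixes(n_rolls, seed=1):
    """Simulate n_rolls throws of a fair die and count the sixes."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)

n_small = 60_000
sixes = count_sixes(n_small)
mean_small, sigma_small = expected_count_and_spread(n_small)

print(f"6 million throws: {mean_big:.0f} +/- {sigma_big:.0f} sixes")
print(f"simulated {n_small} throws: {sixes} sixes, expected {mean_small:.0f} +/- {sigma_small:.0f}")
```

No individual throw can be predicted, yet the relative spread shrinks as the number of throws grows.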
Let us now return to the thought experiment of detecting a particle with an extended wavefunction in a small detector. If the wavefunction is nonvanishing at the detector, then there is a chance that the particle will be detected. What happens after detection? The previous form of the wavefunction cannot be valid any more, because it predicts that the particle is most probably outside of the detector. This can only be reconciled with observations if we assume that after the measurement the wavefunction of the particle changes drastically and abruptly. But there is no problem with causality, because it is not real matter that rushes in with infinite speed; only the probability distribution changes. The situation is very similar to the casting of a die: before the die is cast each outcome has a 1/6 probability. After the die is cast we know the outcome. The probability distribution is radically changed. This is the second feature of the probability interpretation, beyond the unpredictability of the future, that was not palatable to physicists who had grown up on nineteenth-century science. Measurements radically change the state of the system. Before quantum mechanics and its probability interpretation it was universally believed that, though in practice measurements do influence the system to some extent, an ideal measuring device that does not transfer energy, momentum, or other physical quantities can be defined and devised. It was assumed that by decreasing the size and coupling of measuring devices, in an idealized limit they would not influence measurements at all. This idealistic picture was completely shattered. Of course, such a development is not completely unexpected on philosophical grounds. When physical parameters of a macroscopic object are measured, one can always imagine creating a measuring device that is smaller, and in general has properties such that the object will not be disturbed.
When one deals with elementary particles the smallest "device" one can imagine is another particle. There is no assurance that the coordinate, velocity, etc. of the particle will not be changed by the measurement. Anyway, the Heisenberg uncertainty relation tells us that not all possible measurements can be performed with complete precision.
The dual, particle-wave, nature of objects also assumes a new meaning under the probability interpretation. This can be seen by another thought experiment (which has also been performed in the laboratory). In a double slit diffraction experiment on electrons a fluorescent screen is set up behind the double slit. The screen is a detection device, or rather a multitude of extremely small detection devices. Each point of the screen serves as a separate detector. For all practical purposes electrons are detected at a point. How is such a detection method compatible with the notion that electrons form waves and indeed show a diffraction pattern in such a double slit experiment? The resolution of this apparent contradiction lies in quantum mechanics. The two slits, just like in optics, serve as emitters of electron waves. They emit in every direction. Suppose that the wave functions of the waves emitted by the two slits are denoted by ψ₁ and ψ₂. Then the intensity of electron waves at a point of the screen is |ψ₁ + ψ₂|², where the wavefunctions should be taken at the appropriate point of the screen. Note that we need wavefunctions that depend on all three coordinates rather than on x only. By moving from point to point on the screen the intensity, and with that the probability of detection, changes. The electron beam can be set to such a low intensity that one can be practically certain that there is at most one electron in the area between the slits and the screen at any given time. Any one electron will be detected only at a single point, albeit with differing probabilities at different points. After the first electron hits we do not really have any information about the diffraction pattern. After a million electrons hit the screen the situation changes significantly. The probability of detection at points where |ψ₁ + ψ₂|² is large is higher, so more electrons will be detected there.
The net effect will be the formation of a beautiful diffraction pattern. So our conclusion is that, though the wavefunction of every electron shows the diffraction pattern, individual electrons are detected at a single point only. The diffraction pattern is formed only after the observation of a large number of electrons.
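The way a pattern builds up from single hits can be imitated on a computer. The sketch below is a toy model under stated assumptions: each slit is treated as a point source of a spherical wave e^(ikr)/r, the single-slit envelope is ignored, the screen is a row of discrete detector positions, and the wavelength, slit separation and screen distance are arbitrary units chosen for illustration. Each electron is detected at exactly one position, drawn with probability proportional to |ψ₁ + ψ₂|².

```python
import cmath
import math
import random

WAVELENGTH = 1.0          # arbitrary length unit (assumed value)
SLIT_SEPARATION = 5.0     # distance between the two slits (assumed value)
SCREEN_DISTANCE = 1000.0  # distance from the slits to the screen (assumed value)
K = 2.0 * math.pi / WAVELENGTH

def amplitude(x_screen, slit_position):
    """Spherical wave exp(i k r) / r from one slit, evaluated on the screen."""
    r = math.hypot(SCREEN_DISTANCE, x_screen - slit_position)
    return cmath.exp(1j * K * r) / r

def detection_weight(x_screen):
    """Unnormalized detection probability |psi1 + psi2|^2 at a screen position."""
    psi1 = amplitude(x_screen, +SLIT_SEPARATION / 2.0)
    psi2 = amplitude(x_screen, -SLIT_SEPARATION / 2.0)
    return abs(psi1 + psi2) ** 2

# Discrete detector positions on the screen and their weights.
positions = [10.0 * i for i in range(-50, 51)]
weights = [detection_weight(x) for x in positions]

def detect_electrons(n_electrons, seed=0):
    """Detect electrons one at a time; return the number of hits per position."""
    rng = random.Random(seed)
    counts = {x: 0 for x in positions}
    for x in rng.choices(positions, weights=weights, k=n_electrons):
        counts[x] += 1
    return counts

counts = detect_electrons(20000)

# Crude text histogram of every fifth detector: one '#' per 20 hits.
for x in positions[::5]:
    print(f"{x:7.1f} {'#' * (counts[x] // 20)}")
```

Calling detect_electrons(1) gives a single hit that reveals nothing about the fringes; they emerge only in the accumulated counts. With these parameters the fringe spacing is about λL/d = 200 length units, so a minimum falls near x = 100.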
Finally, let us return to the Heisenberg uncertainty relations in view of the probability interpretation of quantum mechanics. The probability distribution is defined by
P(x) dx = |ψ(x)|² dx / ∫ |ψ(x')|² dx'
This is the probability that the particle is found between x and x+dx. Then, according to the definition used in the theory of probability, the average, or expectation value, of x is
x_av = <x> = ∫ x P(x) dx = ∫ x |ψ(x)|² dx / ∫ |ψ(x')|² dx'
Furthermore, the square of the standard deviation, or uncertainty, of the value of the variable x is
[(x − x_av)²]_av = < (x − <x>)² > = ∫ (x − x_av)² P(x) dx
= ∫ (x − <x>)² |ψ(x)|² dx / ∫ |ψ(x')|² dx'
These quantities now have a probability interpretation. They provide certain information concerning the probability distribution itself.
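As an illustration of these definitions, the expectation value and the uncertainty of x can be evaluated numerically for a sample wave function. The Gaussian centred at x0 = 2 with width parameter σ = 0.5 and the arbitrary prefactor 3 are assumptions made only for this sketch.

```python
import math

X0 = 2.0      # centre of the sample wave packet (assumed value)
SIGMA = 0.5   # width parameter of the sample wave packet (assumed value)

def abs_psi_squared(x):
    """|psi(x)|^2 for an unnormalized Gaussian psi = 3 exp(-(x - X0)^2 / (4 SIGMA^2))."""
    return (3.0 * math.exp(-(x - X0) ** 2 / (4.0 * SIGMA ** 2))) ** 2

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

A, B = X0 - 10.0, X0 + 10.0   # |psi|^2 is negligible outside this range

norm = integrate(abs_psi_squared, A, B)

# Expectation value <x> and mean squared deviation <(x - <x>)^2>.
x_av = integrate(lambda x: x * abs_psi_squared(x), A, B) / norm
variance = integrate(lambda x: (x - x_av) ** 2 * abs_psi_squared(x), A, B) / norm
delta_x = math.sqrt(variance)

print(f"<x> = {x_av:.4f}, uncertainty = {delta_x:.4f}")
```

Dividing by the integral of |ψ|² makes the arbitrary prefactor drop out, so the results depend only on the shape of the wave function.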