German Pages 52 Year 1969
FORTSCHRITTE DER PHYSIK
Published on behalf of the Physical Society in the German Democratic Republic by F. Kaschluhn, A. Lösche, R. Ritschl and R. Rompe
Volume 16, Issue 6, 1968
Akademie-Verlag, Berlin
Fortschritte der Physik 16, 325-355 (1968)

Design of Physical Experiments (Statistical Methods)

V. V. FEDOROV 1), A. PÁZMAN 2)

Joint Institute for Nuclear Research, Dubna
At the present stage of development of theory and experimental technique, many investigations in physics, especially those concerning the properties of elementary particles, require sophisticated, expensive, and lengthy experiments. Experience shows that designing such experiments by the methods of mathematical statistics makes it possible, in most cases, to use the available means (e.g. time, money, and material) more effectively than in passive (non-planned) experiments. A more extensive application of mathematical methods for designing experiments therefore becomes necessary. In this review the existing methods of designing experiments are described in a generally accessible form. The examples considered concern mainly experiments on the phase shift analysis of elastic scattering and measurements of differential effective cross sections. For simplicity, these examples are presented schematically. They are taken from nuclear physics not by chance but because experiments in this field are especially expensive. In comparison with the book by KLEPIKOV and SOKOLOV (1961), which contains one chapter on the planning of physical experiments, the present paper covers a wider variety of problems.
Contents
List of symbols
§ 1. The method of maximum likelihood
§ 2. The criteria for optimality of regression experiments
§ 3. The static design of regression experiments
§ 4. The sequential and continuous design
§ 5. The design of discriminating experiments
§ 6. The design of experiments based on the measure of information
List of Symbols

$E\{\,\}$   the operator of averaging a random variable,
$\mathcal{D}\{\,\}$   the operator of computing the variance of a random variable or the covariance matrix of a multi-dimensional random variable,
$f(\,)$   a probability density function,
$'$   the symbol of the transposition of a vector,
$^{-1}$   the symbol of the matrix inversion,
$\Theta = (\Theta_1, \ldots, \Theta_m)'$, $x$, $\eta(\Theta, x)$, $y$, $t$, $m$, $T$   [...]

1) Moscow State University, Laboratory of Statistical Research.
2) On leave from Czechoslovak Academy of Sciences, Bratislava, Czechoslovakia.

[...] the $\Theta^{(0)}$ are supposed to be near to the true values of the studied parameters.

4. The minimax criteria

Let
$\eta(\Theta, x)$ be a linear function of the unknown parameters:

$$\eta(\Theta, x) = f'(x)\,\Theta.$$

Consider the variance of the function $\hat\eta(\Theta, x)$ at the point $x$. By Eq. (2.8)

$$\mathcal{D}\{\hat\eta(\Theta, x)\} = f'(x)\, D\, f(x). \qquad (2.10)$$

When the experimenter is interested in a precise determination of the surface $\eta(\Theta, x)$, it is natural to demand that the maximal value of $\mathcal{D}\{\hat\eta(\Theta, x)\}$ in a region $x \in X'$ be minimal:

$$\min_{\varepsilon} \max_{x \in X'} \mathcal{D}\{\hat\eta(\Theta, x)\}. \qquad (2.11)$$
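The variance formula (2.10) and the minimax criterion (2.11) can be illustrated numerically. The following is a minimal sketch, assuming a straight-line model with $f(x) = (1, x)'$, the region $X' = [-1, 1]$, weights normalized to a total of one, and two hypothetical two-point plans; none of the function names come from the paper.

```python
def info_matrix(points, weights):
    """Entries (m00, m01, m11) of M = sum_i w_i f(x_i) f(x_i)' for f(x) = (1, x)'."""
    m00 = sum(weights)
    m01 = sum(w * x for x, w in zip(points, weights))
    m11 = sum(w * x * x for x, w in zip(points, weights))
    return m00, m01, m11


def prediction_variance(x, points, weights):
    """f'(x) D f(x) with D = M^{-1}, cf. (2.10)."""
    m00, m01, m11 = info_matrix(points, weights)
    det = m00 * m11 - m01 * m01
    # D = (1/det) * [[m11, -m01], [-m01, m00]]
    return (m11 - 2.0 * m01 * x + m00 * x * x) / det


def max_variance(points, weights, grid):
    """Largest prediction variance over the grid, the quantity minimized in (2.11)."""
    return max(prediction_variance(x, points, weights) for x in grid)


grid = [-1.0 + 0.01 * k for k in range(201)]   # X' = [-1, 1]
plan_ends = ([-1.0, 1.0], [0.5, 0.5])          # measurements at the ends of the interval
plan_inner = ([-0.5, 0.5], [0.5, 0.5])         # measurements inside the interval

print(max_variance(*plan_ends, grid))
print(max_variance(*plan_inner, grid))
```

For these two hypothetical plans the variance is $1 + x^2$ and $1 + 4x^2$ respectively, so the plan with measurements at the ends of the interval is preferable in the minimax sense.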
Note that the region $X'$ can be a part of $X$ (the region in which the measurements are possible), can include the whole region $X$ or a part of it, or can have no common points with $X$ (extrapolation to the region $X'$). In a particular case $X'$ can degenerate to a single point.

5. I-optimality

If $\eta(\Theta, x)$ is a nonlinear function of $\Theta$ and the size of the sample is small, a situation may occur in which the elements of $D$ (the second moments) are not sufficient to describe the accuracy of our knowledge. It is then more convenient to use the measure of the amount of information (based on the notion of entropy) acquired from the experiment:

$$I = \int L(\Theta/y)\, \log L(\Theta/y)\, d\Theta \qquad (2.12)$$

where $L(\Theta/y)$ is the normed likelihood function. The experiment will be optimal if it minimizes the mean increment of information averaged with respect to all possible samples:

$$\min_{\varepsilon} J = \min_{\varepsilon} E\{I - I_0\} \qquad (2.13)$$

where $I_0$ is the amount of information before the planned experiment.
If only $k$ parameters are needed, then the integration in (2.12) must be made only with respect to the "useful" parameters $\omega$. Let us note that the information criterion is a kind of generalization of the criteria described above and of the criterion for discriminating experiments described in § 5 (for more detail see § 6). All the criteria for optimality considered are given in Table 2.1, which also shows the relations between the different criteria.

Table 2.1
The conditions of equivalency of different criteria for optimality (under the assumption that $\eta(\Theta, x) = \sum_{\alpha=1}^{m} \Theta_\alpha f_\alpha(x)$)

Criteria compared: $D$, $A$, $d$, $I$, and $\min_{\varepsilon} \max_{x \in X} \sigma^2(x)$.
Conditions appearing in the table: "the matrix $D$ of the optimal plan is diagonal"; "$m = 1$"; "always equivalent"; "[...] are normally distributed or $T \to \infty$"; "the condition $I \leftrightarrow D$"; "the condition $I \leftrightarrow D$ and the condition $A \leftrightarrow D$".
[...]
Example 2.1

The D-optimality and I-optimality criteria are used in designing experiments on elastic scattering (see example 1.1). The experimenter requires the maximum accuracy of all phase shifts up to the phase shift corresponding to some maximal value of the orbital angular momentum. D-optimality can be required if the set of phase shift estimates is single-valued (a single solution in (1.17)), and I-optimality should be required if there are several possible sets of phase shifts (several solutions in (1.17)).

Example 2.2

The minimax criterion is suited for planning the experiments discussed in example 1.2. The condition (2.11) requires that the maximal deviation of the estimate of the differential cross section from its real value, considered in a given interval of angles, should be minimal. This is equivalent to requiring a maximal accuracy of all the parameters $\Theta_\alpha$ in (1.18) (in the sense of D-optimality; see Table 2.1). Sometimes it may be useful to require that the mean-squared deviation of the estimate $\hat\eta(x)$ from the true curve $\eta(x)$ should be minimal:

$$\min_{\varepsilon} \int_a^b \sigma^2(x)\, dx \qquad (2.14)$$

where $(a, b)$ is the interval in which we estimate the differential cross section. MONAHAN and LANDSDORF (1965) prove the equivalence of (2.14) and of the A-optimality criterion, supposing that $n = m$ (the number of parameters coincides with the number of measurement points) and that the polynomials $p_\alpha(x)$ are orthogonal, i.e.

$$\int_a^b q(x)\, p_\alpha(x)\, p_\beta(x)\, dx = \delta_{\alpha\beta} \qquad (2.15)$$

where $q(x)$ is an arbitrary continuous weight function.
§ 3. The static design of regression experiments

Consider the case when the measured quantity is linearly dependent on the studied parameters:

$$\eta(\Theta, x) = f'(x)\,\Theta. \qquad (3.1)$$

Let us suppose that we know the function of effectiveness $\lambda(x)$ (KLEPIKOV and SOKOLOV, 1961), which is defined as:

$$\lambda(x) = \frac{1}{\sigma^2(x)\, t}. \qquad (3.2)$$

Here $\sigma^2(x)$ is the variance of $y$ measured at the point $x$ during the time $t$.

The static design is a design of the whole experiment (or of a series of experiments) from the beginning to the end. It is evident that for a static design the function of effectiveness must be invariable throughout the designed experiment or must be a known function of time. Here we shall suppose that the experimental conditions are invariable and that no initial information on the parameters $\Theta$ exists ($M = 0$). The static design with varying experimental conditions and given initial information represents a complex and cumbersome problem. We shall not consider this problem here, but we shall give in short form some approximate solutions of such tasks in § 4.

As was mentioned in § 2, the optimal experiment is determined by the criterion for optimality, the corresponding set of measurement points $x^{(1)}, \ldots, x^{(n)}$ and the measurement times at these points $t^{(1)}, \ldots, t^{(n)}$. At the same time the condition $\sum_{i=1}^{n} t^{(i)} = T$ must be satisfied, where $T$ is the time of the whole experiment.
I. A single controlled variable

Let us suppose that a single controlled variable $x$, assuming its values from some interval $(a, b)$, is available. The general case, when the variable assumes its values from some compact set, is taken up by KIEFER and WOLFOWITZ (1959).

1.1. D-optimal plans

The minimum of the determinant of the error matrix corresponds to the maximum of the determinant of the information matrix $M$ (see (1.11)):

$$M = F W F' \qquad (3.3)$$

where $F_{\alpha i} = f_\alpha(x^{(i)})$, $i = 1, \ldots, n$, $\alpha = 1, \ldots, m$, and $W_{ij} = \lambda(x^{(i)})\, t^{(i)}\, \delta_{ij}$, $i, j = 1, \ldots, n$. It follows from the condition $|M| \neq 0$ that the number of points $n$ at which the measurements are made must not be less than the number of unknown parameters $m$:

$$n \ge m. \qquad (3.4)$$

The information matrix, being symmetric, is completely defined by its $m(m+1)/2$ elements. On the basis of this, and using the methods of linear algebra, we can demonstrate (see STONE, 1959) that for any D-optimal plan with more than $m(m+1)/2$ measurement points and with a known matrix $M_0$ it is possible to find a plan with the number of measurement points $n$:

$$n \le \frac{m(m+1)}{2} \qquad (3.5)$$

and with the same information matrix

$$M = M_0. \qquad (3.6)$$

So the optimal number of measurement points is limited by

$$m \le n \le \frac{m(m+1)}{2}. \qquad (3.7)$$
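The role of the lower bound $n \ge m$ can be checked on a small case. The following is a minimal sketch, assuming a quadratic model with $f(x) = (1, x, x^2)'$ (so $m = 3$) and hypothetical plans whose products $\lambda(x^{(i)})\, t^{(i)}$ are given directly; the helper names are illustrative only.

```python
def info_matrix(points, lam_t, basis):
    """M = F W F' with F[a][i] = f_a(x_i) and W = diag(lambda(x_i) * t_i), cf. (3.3)."""
    m = len(basis)
    return [[sum(w * basis[a](x) * basis[b](x) for x, w in zip(points, lam_t))
             for b in range(m)] for a in range(m)]


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]   # f(x) = (1, x, x^2)'
m = len(basis)

# n = 2 < m: the information matrix is singular, so the plan is inadmissible.
det_two = det3(info_matrix([-1.0, 1.0], [0.5, 0.5], basis))

# n = 3 = m: the information matrix is regular.
det_three = det3(info_matrix([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3], basis))

print(det_two, det_three)
print("bounds on n:", m, m * (m + 1) // 2)   # cf. (3.7)
```

With two points the determinant vanishes, while the three-point plan gives $|M| = 4/27$; by (3.7) no more than six points are needed for this model.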
Analogically in the case of finding $k$ parameters from $m$ (STONE, 1959): [...]

[...] $\mathcal{D}\{h(\hat\Theta)\} = s^2(x_0)$ will be minimal if the "amplitude" of the curve $s(x)$ on the whole interval $(a, b)$ is as small as possible (see Fig. 3.1), i.e. if the curve $s(x)$ can be inscribed into the corridor $(-1/\sqrt{\lambda(x)},\ 1/\sqrt{\lambda(x)})$ (for a detailed proof see the paper by SOKOLOV and KLEPIKOV (1963)). It follows from (3.14) that the functions [...] are linear combinations of the functions $f_\alpha(x)$. In consequence (see (3.21)) the function $\sqrt{T}\, s(x)$ can also be written as a linear combination of these functions:

$$\sqrt{T}\, s(x) = \sum_{\alpha=1}^{m} a_\alpha f_\alpha(x). \qquad (3.23)$$

Thus, to determine the optimal experimental points, the coefficients $a_1, \ldots, a_m$ need to be chosen so that $\sum_\alpha a_\alpha f_\alpha(x)$ is alternately tangent to the curves $1/\sqrt{\lambda(x)}$ and $-1/\sqrt{\lambda(x)}$ for $x$ taken from the interval $(a, b)$. The abscissas of the points of tangency constitute the optimal experimental points. When it is impossible to inscribe a curve like (3.23) into the given corridor, the optimal experiment is degenerate, i.e. the number of measurement points is less than the number of parameters. In such a case the curve (3.23) must be calculated so that the deviation $|\sqrt{T}\, s(x) \mp 1/\sqrt{\lambda(x)}|$ of $\sqrt{T}\, s(x)$ outside the corridor is as small as possible. The abscissas of the points of intersection of $\sqrt{T}\, s(x)$ with the bounds of the corridor constitute some approximately optimal measurement points. If the functions $f_\alpha(x)$ are sufficiently simple and the number of parameters $\Theta_\alpha$ is small ($m \le 6$), the described solution can be obtained graphically (KLEPIKOV and SOKOLOV, 1961).
Example 3.1 (SOKOLOV and KLEPIKOV, 1963)

It is known that the differential effective cross section (see example 1.2) cannot be directly measured at the angle $x = 0$. Therefore, the estimation is made in an indirect way, by measurements at angles different from zero (an extrapolation).

Suppose that in some interval $0 \le x \le b$ the differential cross section $\eta(x)$ varies as a polynomial of second order

$$\eta(x) = \eta(\Theta, x) = \Theta_1 + \Theta_2 x + \Theta_3 x^2 \qquad (3.24)$$

where $\Theta_1$, $\Theta_2$, $\Theta_3$ are some (yet unknown) coefficients. We are interested in the estimate of $\eta(0) = \Theta_1$, i.e. the experiment determining $\Theta_1$ with a minimal variance is required (d-optimality). In the given task the function of effectiveness is proportional to the unknown differential cross section (see (1.19)): $\lambda(x) \sim \eta(x)$. So $\eta(x)$ has to be estimated preliminarily (from the theory or by measurements of short duration). We can determine the measurement points [...] $= 0.477$ [...] $= 0.187$ [...]
II. Several controlled variables

The design of the optimal experiment in the multi-dimensional case (with several controlled variables) is technically a much more complicated task than in the one-dimensional case. Besides the technical difficulties, other basic difficulties occur. For example, it may be possible that for some classes of functions $\eta(\Theta, x)$ many D-optimal plans exist. This practically excludes the possibility of using approximate computing methods (e.g. a gradient method of minimization or a sequential method of design (see § 4 of this review)), since these methods may converge to plans with too large a number of measurement points (much larger than the number of unknown parameters) or may not converge at all.

At the present time, plans which are optimal with respect to the minimax criterion (i.e. also with respect to the D-optimality criterion; see Table 2.1) have been found only for the simplest functions $\eta(\Theta, x)$, namely, when $\eta(\Theta, x)$ is a linear or quadratic function of the controlled variables $x_1, \ldots, x_s$:

$$\eta(\Theta, x) = \Theta_1 + \Theta_2 x_1 + \ldots + \Theta_{s+1} x_s \qquad (3.26)$$

or

$$\eta(\Theta, x) = \Theta_1 + \sum_{j=1}^{s} \Theta_{j+1} x_j + \sum_{j=1}^{s} \sum_{k \ge j} \Theta_{jk}\, x_j x_k. \qquad (3.27)$$
The case (3.26) is rather trivial and is described in detail in (NALIMOV and CHERNOVA, 1965). Consider briefly an example of the case (3.27) (see KIEFER, 1961b).

We shall suppose that $-1 \le x_i \le 1$ for $i = 1, \ldots, s$ (this may be obtained by a change of the scale of $x_i$) and that $\lambda(x) \approx 1$ (this is a strong restriction). From (3.27) it follows that the number of studied parameters is equal to:

$$m = \frac{(s+1)(s+2)}{2} \qquad (3.28)$$

where $s$ is the number of the controlled variables. We shall choose the measurement points as follows: from all possible plans we shall consider only the plans with measurements of weight $w_i = \alpha$ at each of the $2^s$ vertices of an $s$-dimensional cube, with measurements of weight $w_i = \beta$ at each of the $s\,2^{s-1}$ points which are the centers of the edges, and with measurements of weight $w_i = \gamma$ at each of the $s(s-1)\,2^{s-3}$ centers of the two-dimensional faces. An optimal minimax plan satisfies the following condition (KARLIN and STUDDEN, 1966):

$$\max_{-1 \le x_i \le 1,\ i = 1, \ldots, s} \lambda(x)\, \sigma^2(x) = \frac{m}{T}. \qquad (3.29)$$
Therefore, in our case the values [...]

Table 4.1 [...]
Columns: the criterion for optimality; the optimal measurement point corresponds to the maximum over $x$ of the listed function.
Surviving fragments: the criteria $\min |D|$ and $\min |D(\omega)|$, where $\omega' = (\Theta_1, \ldots, \Theta_k)$; functions to be maximized built from $\lambda(x)\,T$, $\lambda(x)\,[h' D(\Theta) f(x)]$, $\sigma^2(x)$ and $[\lambda(x)\,T]^{-1} + \sigma^2(x)$, where $C(x) = D(\Theta)\, f(x)\, f'(x)\, D(\Theta)$.
[...]
Note that, in principle, the experiment can be designed so that during the given time $T$ the measurements are made at several measurement points, the number of which is given beforehand (PÁZMAN, 1966). Such a design necessitates rather complicated computations.

It is interesting to compare the economy of expenses (of measurement time) when using the continuous (sequential) and the static experimental design. When there are no previous experiments, the expenses for an experiment are the same for the continuous and for the static design. If some previous experiments were made in a passive way (without planning), then at the beginning the continuous design is more economical than the static one. But for a long measurement time $T$, taking the results of the non-planned experiments into account, the static design is slightly better than the continuous one.

To determine the optimal measurement point $x_0$, the maximum of some function $\kappa(x)$ must be found (see Table 4.1). If no more than 3 controlled variables exist ($x$ is a three-dimensional vector), then $x_0$ can be found by computing $\kappa(x)$ for different $x$. It is useful to compute the values of $\kappa(x)$ in the neighbourhood of $x_0$ in order to estimate how small shifts of $x_0$ change the value of $\kappa(x_0)$ (the stability of the solution). If the dimensionality of $x$ is great, we must use more effective methods of finding an extremum (see e.g. BEREZIN and JIDKOV, 1966).
An estimate of the stability of the solution can be found from the second derivative of the function $\kappa(x)$ at the point $x_0$ (approximation of $\kappa(x)$ in the neighbourhood of $x_0$ by a multidimensional paraboloid).

Example 4.1

Let us model an experiment which will show how effective the sequential design is. Consider the following chemical reaction:

$$R \to P_1 + P_2.$$

The rate $\eta$ of that chemical reaction is given as:

$$\eta = \frac{\Theta_1 \Theta_3 x_1}{1 + \Theta_1 x_1 + \Theta_2 x_2} \qquad (4.6)$$

where $x_1$ is the partial pressure of the starting product $R$; $x_2$ is the partial pressure of the product $P_1$; $\Theta_1$, $\Theta_2$, $\Theta_3$ are the parameters to be determined. To model the experiment we shall use the true parameter values known from the literature:

$$\Theta_1^{true} = 2.9, \quad \Theta_2^{true} = 12.2, \quad \Theta_3^{true} = 0.69. \qquad (4.7)$$

In the following we shall assume that the measurements are possible in the region

$$0 \le x_1 \le 3, \quad 0 \le x_2 \le 3. \qquad (4.8)$$

We shall model the measurements $y$ in the following way. A sample is taken from a table of random numbers (the normal distribution with parameters $0$, $\sigma = 0.01$). The obtained number is added to $\eta(\Theta, x)$, where $\Theta$ is determined by eqs. (4.7). It is easily seen that in this case the effectiveness $\lambda(x)$ has a constant value in the region (4.8), and the weight of the measurement made at any point $x^{(i)}$ is equal to $W_i = N_i / \sigma^2$, where $N_i$ is the number of separate measurements at the point $x^{(i)}$, $\sum_i N_i = N$ (note that $N$ is proportional to the measurement time $T$). The initial estimates of the parameters $\Theta^{(0)}$ are obtained by the least squares method from the first four experiments of Table 4.2.
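The modelling of the measurements can be sketched as follows. This is a minimal illustration, assuming that the rate law (4.6) has the form $\eta = \Theta_1 \Theta_3 x_1 / (1 + \Theta_1 x_1 + \Theta_2 x_2)$ (the form suggested by (4.9)), replacing the table of random numbers by a seeded pseudo-random generator, and assuming four starting points with three repeated measurements each; the first starting point $(1, 1)$ and all names are hypothetical.

```python
import random

THETA_TRUE = (2.9, 12.2, 0.69)   # values (4.7)
SIGMA = 0.01                     # standard deviation of a single measurement


def rate(theta, x1, x2):
    """Assumed form of the rate law (4.6)."""
    t1, t2, t3 = theta
    return t1 * t3 * x1 / (1.0 + t1 * x1 + t2 * x2)


def simulate(x1, x2, n_repeats, rng):
    """Model n_repeats measurements y = eta(theta_true, x) + normal noise."""
    return [rate(THETA_TRUE, x1, x2) + rng.gauss(0.0, SIGMA)
            for _ in range(n_repeats)]


rng = random.Random(0)   # fixed seed so that the run is reproducible
start_points = [(1.0, 1.0), (1.0, 2.0), (2.0, 2.0), (2.0, 1.0)]   # (1, 1) is assumed
data = {p: simulate(p[0], p[1], 3, rng) for p in start_points}

for p, ys in data.items():
    weight = len(ys) / SIGMA ** 2   # W_i = N_i / sigma^2
    print(p, [round(y, 4) for y in ys], weight)
```

Each point then enters the least squares fit with the weight $W_i = N_i / \sigma^2$ printed in the last column.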
Supposing the estimates are not much different from the true ones, we use the method developed above and compute

$$f_\alpha(x) = \left. \frac{\partial}{\partial \Theta_\alpha} \left[ \frac{\Theta_3 \Theta_1 x_1}{1 + \Theta_1 x_1 + \Theta_2 x_2} \right] \right|_{\Theta = \Theta^{(0)}}. \qquad (4.9)$$

By Eq. (1.12), after the additional (fifth) measurement the Fisher information matrix $M$ takes the following form: [...]

[...] $x_1^{(5)} = 0.1$; $x_2^{(5)} = 0$. (4.11)

We obtained this result by computing the determinant of the information matrix $M$ for $31 \times 31$ different points $x$. The experiment "performed" at the point $x^{(5)}$ gives the result $y_5 = 0.186$. Including the fifth measurement in the least squares method we obtain "new" parameter estimates

$$\Theta_1^{(1)} = 3.11, \quad \Theta_2^{(1)} = 15.19, \quad \Theta_3^{(1)} = 0.79. \qquad (4.12)$$
By formula (4.9) we calculate the derivatives at $\Theta = \Theta^{(1)}$ and, composing the matrix $M$, we find the maximum of its determinant (the maximum is attained at $x_1^{(6)} = 3.0$ and $x_2^{(6)} = 0.0$). Then we make the measurement at the point $x^{(6)}$ and obtain the estimates $\Theta^{(2)}$, and so on. Seven measurements have been designed in a similar way. The results of the design and the results of the experiments are given in Table 4.2. The experiments were stopped when the parameter estimates differed only slightly and the estimated values became close to the theoretical values of the parameters $\Theta_1^{true}$, $\Theta_2^{true}$, [...]

[...] $x^{(2)} = (1, 2)$, $x^{(3)} = (2, 2)$, $x^{(4)} = (2, 1)$; $N_1 = 3$, $N_2 = 3$, $N_3 = 3$, $N_4 = 3$ [...]
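One step of the sequential procedure of this example (derivatives by (4.9), information matrix, search for the maximum of its determinant on a $31 \times 31$ grid) can be sketched as follows. The sketch assumes the rate law $\eta = \Theta_1 \Theta_3 x_1 / (1 + \Theta_1 x_1 + \Theta_2 x_2)$, takes the estimates (4.12) as the current parameter values, and uses a hypothetical first starting point $(1, 1)$; since these inputs are partly assumed, the point found need not coincide with the one reported in the text.

```python
def grad(theta, x1, x2):
    """Analytic derivatives f_alpha = d(eta)/d(theta_alpha), cf. (4.9)."""
    t1, t2, t3 = theta
    d = 1.0 + t1 * x1 + t2 * x2
    return (t3 * x1 * (1.0 + t2 * x2) / d ** 2,
            -t1 * t3 * x1 * x2 / d ** 2,
            t1 * x1 / d)


def info_matrix(theta, plan, sigma=0.01):
    """M = sum_i W_i f(x_i) f(x_i)' with W_i = N_i / sigma^2."""
    m = [[0.0] * 3 for _ in range(3)]
    for (x1, x2), n in plan:
        f = grad(theta, x1, x2)
        w = n / sigma ** 2
        for a in range(3):
            for b in range(3):
                m[a][b] += w * f[a] * f[b]
    return m


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def next_point(theta, plan, steps=31, upper=3.0):
    """Scan a steps x steps grid of the region (4.8) for the largest det M."""
    grid = [upper * k / (steps - 1) for k in range(steps)]
    best_det, best_x = None, None
    for x1 in grid:
        for x2 in grid:
            d = det3(info_matrix(theta, plan + [((x1, x2), 1)]))
            if best_det is None or d > best_det:
                best_det, best_x = d, (x1, x2)
    return best_det, best_x


theta_1 = (3.11, 15.19, 0.79)   # estimates (4.12)
plan = [((1.0, 1.0), 3), ((1.0, 2.0), 3), ((2.0, 2.0), 3), ((2.0, 1.0), 3),
        ((0.1, 0.0), 1)]        # the point (1, 1) is an assumption
base_det = det3(info_matrix(theta_1, plan))
best_det, best_x = next_point(theta_1, plan)
print(best_x, best_det, base_det)
```

Repeating the step with re-estimated parameters after each new measurement gives the sequential design; a finer grid or a local search around `best_x` could be used to examine the stability of the solution.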
If the second hypothesis is true, the measurement must be made at the point corresponding to

$$\max_x\, [\ldots + \sigma_1^2(x) - \sigma_2^2(x)]. \qquad (5.8)$$

If the points obtained from (5.7) and (5.8) do not coincide, then the measurement must be made at the point corresponding to

$$\max_x \left[ W_1 E_1\{\chi_2^2(n+1) - \chi_1^2(n+1)\} + W_2 E_2\{\chi_1^2(n+1) - \chi_2^2(n+1)\} \right] \qquad (5.9)$$

where the weights $W_1$ and $W_2$, generally speaking, depend on the aim of the experiment. If the loss occurring when the false hypothesis is accepted is equal to the loss occurring when the true hypothesis is rejected, the weights are defined as:

$$W_1 \sim e^{-\frac{1}{2}\chi_1^2(n)}, \quad W_2 \sim e^{-\frac{1}{2}\chi_2^2(n)}. \qquad (5.10)$$

In some cases the aim of the experiment is not expressed in a form which makes it possible to find the ratio of the loss due to the false acceptance of the first hypothesis to the loss due to the false acceptance of the second one. In such a case the measurement must be made at the point obtained from:

$$\max_x \min_{j \ne k} E_k\{\chi_j^2(n+1) - \chi_k^2(n+1)\}. \qquad (5.11)$$
A different method of designing, which is applicable if $\sigma_1^2(x) = \sigma_2^2(x)$, is described in a paper by FEDOROV and KLEPIKOV (1965).
Example 5.1. Scattering of polarized proton beams (see examples 1.1 and 2.1)

Let us discuss the following experiment (Fig. 5.1). A beam of polarized protons is scattered on a non-polarized neutron target $T_1$. As a consequence of this scattering the polarization vector $P_1$ changes its direction. This can be described by five Wolfenstein parameters $R(x)$, $R'(x)$, $A(x)$, $A'(x)$, $D(x)$ (FAISSNER, 1959). Here $x$ is the scattering angle on the target $T_1$. If the angle between the normal $n$ to the scattering plane (for the scattering on $T_1$) and the polarization vector $P_1$ of the primary beam is properly chosen, it is possible to determine one of the Wolfenstein parameters, say the parameter $R(x)$, by a second scattering on the analysing target $T_2$. Thus, in the given experiment $R(x)$ is the "directly" measured quantity $\eta(x)$.

When preparing an experiment, the experimenter knows some previous phase shift estimates (obtained usually from other experiments made at different laboratories) and is limited by the time $T$ allocated to him at the accelerator. LEHAR, FEDOROV and JANOUT (1967) describe a situation where two sets of phase shift estimates $\Theta^{(1)}$ and $\Theta^{(2)}$ exist with "$\chi^2$-values"

$$\chi_1^2 = 112.35, \quad \chi_2^2 = 115.9$$

respectively. As a consequence of this there exist two curves reliably estimating $R(x)$ [...] (It will then be possible to compare the obtained results with those obtained in example 6.1 in § 6.) The design consists in computing the values of the functions

$$\varphi(x) = \lambda(x)\, \{[\hat\eta_1(x) - \hat\eta_2(x)]^2 + \sigma_2^2(x) - \sigma_1^2(x)\} \qquad (5.12)$$

(obtained from (5.7) for $T \to 0$) and

$$\psi(x) = \lambda(x)\, \{[\hat\eta_1(x) - \hat\eta_2(x)]^2 + \sigma_1^2(x) - \sigma_2^2(x)\} \qquad (5.13)$$

(obtained from (5.8) for $T \to 0$) for different angles $x$ and finding their maxima. (Here $\hat\eta_1(x) = R(\Theta^{(1)}, x)$ etc.) The results are presented in Fig. 5.4 (for $\eta(x) = R(x)$) and in Fig. 5.6 (for $\eta(x) = D(x)$). We can see that $\varphi(x)$ and $\psi(x)$ in both figures are nearly equal in the neighbourhood of their maxima, i.e. the formulae (5.9) and (5.11) yield the same results here. Comparing the experiment measuring $R(x)$ with the experiment measuring $D(x)$, we find the second one more advantageous (mostly because of the great distance between $D(\Theta^{(1)}, x)$ and $D(\Theta^{(2)}, x)$).
§ 6. The design of experiments based on the measure of information
In the case of a strongly nonlinear dependence of $\eta(\Theta, x)$ on the parameters $\Theta$ and of an insufficient number of measurements, several maxima of the likelihood function $L(\Theta)$ (see § 1), differing only slightly in height, can occur. The positions of these maxima are considered as the sets of estimates of the parameters $\Theta$ obtained by the maximum likelihood method. The experiments designed with the methods described in the previous section make it possible to reject the false sets of estimates most efficiently. In that case the experimenter is not interested in the accuracy of the parameter estimates (i.e. in the error matrix). That is, only after making some discriminating experiments (§ 5), and after a reliable discrimination, are experiments specifying the parameters designed. Evidently the most promising criteria for optimality are those which make it possible to find experiments that discriminate and specify simultaneously. The criterion based on the entropy measure of the amount of information (LINDLEY, 1956) has such a property.

The amount of information contained in the experimental data is equal to:

$$I_\Theta = \int L(\Theta)\, \ln L(\Theta)\, d\Theta \qquad (6.1)$$

where $L(\Theta)$ is the likelihood function. The function $L(\Theta)$ must be normalized, i.e.

$$\int L(\Theta)\, d\Theta = 1. \qquad (6.2)$$

(The integrals (6.1) and (6.2) are taken over the region of all possible parameter values.)
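The behaviour of the measure (6.1) can be illustrated on a one-parameter case. The following is a minimal sketch, assuming a normalized likelihood of Gaussian shape with standard deviation $\sigma$, for which the integral (6.1) as written above equals $-\frac{1}{2}\ln(2\pi e \sigma^2)$; the integration limits, the number of steps and the function names are illustrative choices.

```python
import math


def information(density, lo, hi, n=20000):
    """Midpoint-rule approximation of I = integral of L ln L, cf. (6.1)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        v = density(lo + (k + 0.5) * h)
        if v > 0.0:                       # L ln L tends to 0 where L vanishes
            total += v * math.log(v) * h
    return total


def normal(sigma):
    """Normalized Gaussian likelihood in one parameter, cf. (6.2)."""
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return lambda t: c * math.exp(-0.5 * (t / sigma) ** 2)


i_wide = information(normal(1.0), -12.0, 12.0)     # poorly determined parameter
i_narrow = information(normal(0.1), -12.0, 12.0)   # well determined parameter

print(i_wide, -0.5 * math.log(2.0 * math.pi * math.e))
print(i_narrow, -0.5 * math.log(2.0 * math.pi * math.e * 0.01))
```

In this form of (6.1) the more sharply peaked likelihood gives the larger value of the integral, so the difference between the two values measures what is gained when the error of the parameter is reduced tenfold.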
Analogically, if the experimenter is interested in $k$ out of $m$ parameters (see § 4), then the amount of information on these "useful" parameters $\omega$ is equal to:

$$I_\omega = \int L(\omega)\, \ln L(\omega)\, d\omega. \qquad (6.3)$$

Here $L(\omega) = \int L(\Theta)\, d[\ldots]$ [...]

[...] $x'^\mu = x^\mu + \delta x^\mu$. The functional determinant
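$\partial(x')/\partial(x)$ is assumed to be unequal to zero. We write the corresponding infinitesimal transformations of the fields and their derivatives as

$$U'_r(x') = U_r(x) + \delta U_r(x), \qquad (3a)$$

$$\frac{\partial U'_r(x')}{\partial x'^\mu} = \frac{\partial U_r(x)}{\partial x^\mu} + \delta \frac{\partial U_r(x)}{\partial x^\mu}. \qquad (3b)$$

On the other hand, the variation of $U_r$ without changing the independent variables is defined by

$$U'_r(x) = U_r(x) + \bar\delta U_r(x). \qquad (4a)$$

The variations $\bar\delta U_r$ and $\delta U_r$ are related by the equation

$$\bar\delta U_r(x) = \delta U_r(x) - \frac{\partial U_r}{\partial x^\mu}\, \delta x^\mu. \qquad (4b)$$

We further note some relations which will be used later. Because of

$$\bar\delta U_r(x + \delta x) = \bar\delta U_r(x) + \frac{\partial\, \bar\delta U_r}{\partial x^\mu}\, \delta x^\mu + \ldots$$

one has in first order

$$\bar\delta U_r(x + \delta x) = \bar\delta U_r(x). \qquad (5)$$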
) This condition is also necessary. See, for example, R. Courant and D. Hilbert [I.5] p. 195.
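From the equation (4a) and the corresponding one for $\bar\delta\, \partial U_r / \partial x^\mu$, one gets

$$\frac{\partial}{\partial x^\mu}\, \bar\delta U_r = \bar\delta\, \frac{\partial U_r}{\partial x^\mu}.$$

On the other hand, the computation of $\delta\, \partial U_r / \partial x^\mu$ yields

$$\delta\, \frac{\partial U_r}{\partial x^\mu} = \frac{\partial\, \delta U_r}{\partial x^\mu} - \frac{\partial U_r}{\partial x^\nu}\, \frac{\partial\, \delta x^\nu}{\partial x^\mu} \qquad (6)$$

from which one sees that in general $\delta\, \frac{\partial}{\partial x^\mu} U_r \neq \frac{\partial}{\partial x^\mu}\, \delta U_r$. 9) Neglecting all infinitesimal quantities of order higher than the first, one obtains the functional determinant

$$\frac{\partial(x')}{\partial(x)} = 1 + \frac{\partial\, \delta x^\mu}{\partial x^\mu}. \qquad (7)$$

In the following we shall use the symbol $\partial_\mu$ to denote the complete partial derivative

$$\partial_\mu \mathfrak{L} = \frac{\partial \mathfrak{L}}{\partial x^\mu} + \frac{\partial \mathfrak{L}}{\partial U_r}\, \frac{\partial U_r}{\partial x^\mu} + \frac{\partial \mathfrak{L}}{\partial U_{r,\nu}}\, \frac{\partial^2 U_r}{\partial x^\mu\, \partial x^\nu} \qquad (8)$$

which means that the derivative $\partial_\mu$ also refers to those variables $x^\mu$ occurring implicitly in $\mathfrak{L}$. As an abbreviation we shall further write $\partial_\mu U_r = U_{r,\mu}$.

2.3. The characterization of admissible transformation groups

The group $G$ of transformations $T$ is called an admissible group if for all transformations $T \in G$ the action integral remains invariant (in the sense of equation (2)). In order to find a criterion for transformations of this kind, one starts from the definition of an invariant action integral. Combining equations (2) and (1a) one obtains

$$\int_R dx\, \mathfrak{L}(x, U, \ldots) = \int_{R'} dx'\, \{\mathfrak{L}(x', U', \ldots) + \partial_\mu\, \delta\Omega^\mu(x', U')\}. \qquad (9)$$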
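The transformation is assumed infinitesimal. Now the first term on the right hand side of equation (9) can be expanded into a Taylor series. Retaining only first order terms and introducing with (7) the variable of integration $x$, one gets

$$\int_{R'} dx'\, \mathfrak{L}(x', U', \ldots) = \int_R dx\, \left\{ \mathfrak{L}(x, U, \ldots) + \frac{\partial \mathfrak{L}}{\partial x^\mu}\, \delta x^\mu + \frac{\partial \mathfrak{L}}{\partial U_r}\, \delta U_r + \frac{\partial \mathfrak{L}}{\partial U_{r,\mu}}\, \delta \frac{\partial U_r}{\partial x^\mu} + \mathfrak{L}\, \frac{\partial\, \delta x^\mu}{\partial x^\mu} \right\}.$$

Thus from (9) one concludes

$$\int_R dx\, \left\{ \frac{\partial \mathfrak{L}}{\partial x^\mu}\, \delta x^\mu + \frac{\partial \mathfrak{L}}{\partial U_r}\, \delta U_r + \frac{\partial \mathfrak{L}}{\partial U_{r,\mu}}\, \delta \frac{\partial U_r}{\partial x^\mu} + \mathfrak{L}\, \frac{\partial\, \delta x^\mu}{\partial x^\mu} + \partial_\mu\, \delta\Omega^\mu \right\} = 0.$$

9) According to (6), the sign of equality is obtained for translations.

Up to now the region of integration $R$ has not been fixed by the variations leading to the field equations. Since the region $R$ is arbitrary, the integrand itself is equal to zero:

$$\frac{\partial \mathfrak{L}}{\partial x^\mu}\, \delta x^\mu + \frac{\partial \mathfrak{L}}{\partial U_r}\, \delta U_r + \frac{\partial \mathfrak{L}}{\partial U_{r,\mu}}\, \delta \frac{\partial U_r}{\partial x^\mu} + \mathfrak{L}\, \frac{\partial\, \delta x^\mu}{\partial x^\mu} + \partial_\mu\, \delta\Omega^\mu = 0. \qquad (10)$$

Thus the equation (10) is a necessary condition for the variational integral to be invariant under the transformations considered. By reversing the last conclusions, one confirms that this condition is also sufficient. The group of transformations $G$ is admissible if and only if the terms in equation (10) derivable from the Lagrangian density and the transformations considered can be written as a divergence expression.

2.4. The theorem of Noether

The first theorem of Noether states: If the action integral

$$I = \int_R dx\, \mathfrak{L}(x, U_r, U_{r,\mu})$$

is invariant under the transformations of the $n$-dimensional Lie group $G_n$, then the relation

$$[\mathfrak{L}]^r\, \bar\delta U_r = \partial_\lambda (B^\lambda - \delta\Omega^\lambda) \qquad (11)$$

holds, where

$$[\mathfrak{L}]^r = \frac{\partial \mathfrak{L}}{\partial U_r} - \partial_\lambda\, \frac{\partial \mathfrak{L}}{\partial U_{r,\lambda}} \qquad (12)$$

are the variational derivatives of the Lagrangian $\mathfrak{L}$, and the definition of $B^\lambda$ is

$$B^\lambda = -\frac{\partial \mathfrak{L}}{\partial U_{r,\lambda}}\, \delta U_r + \delta x^\mu \left( \frac{\partial \mathfrak{L}}{\partial U_{r,\lambda}}\, U_{r,\mu} - g^\lambda_\mu\, \mathfrak{L} \right). \qquad (13)$$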
This means that the inner product of the variational derivatives and the local variations of the field functions 6U r can be written as a divergence. In order to give a brief proof of the theorem, we write down explicitly the left hand side of equation (11) using relation (4 b) and definition (12)
+ B"W~ •
8U 0
j
r
r i f t
' > '^r
V dX
r
8x"
^
8 VTill
+
r
8x"
U
^dXi -
+
+
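Here the last three pairs of terms with opposite signs have been added. Now, in order to use the condition for a transformation to be admissible (10), the terms of the sum shall be written in a convenient ordering

$$[\mathcal{L}]^r\,\bar\delta U_r = -\partial_\lambda\Big(\frac{\partial\mathcal{L}}{\partial U_{r,\lambda}}\,\bar\delta U_r\Big) - \partial_\lambda\big(\mathcal{L}\,\delta x^\lambda\big) + \frac{\partial\mathcal{L}}{\partial U_r}\,\delta U_r + \frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,\bar\delta U_{r,\mu} + \frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,U_{r,\mu\lambda}\,\delta x^\lambda + \frac{\partial\mathcal{L}}{\partial x^\lambda}\,\delta x^\lambda + \mathcal{L}\,\partial_\lambda\delta x^\lambda ,$$

where $\partial_\lambda(\mathcal{L}\,\delta x^\lambda)$ is understood as a total derivative, acting also on the field functions contained in $\mathcal{L}$.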
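Using (6) and (13) we find that

$$[\mathcal{L}]^r\,\bar\delta U_r = \partial_\lambda\delta Q^\lambda + \frac{\partial\mathcal{L}}{\partial U_r}\,\delta U_r + \frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,\bar\delta U_{r,\mu} + \frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,U_{r,\mu\lambda}\,\delta x^\lambda + \frac{\partial\mathcal{L}}{\partial x^\lambda}\,\delta x^\lambda + \mathcal{L}\,\partial_\lambda\delta x^\lambda$$

and further with (8)

$$[\mathcal{L}]^r\,\bar\delta U_r = \partial_\lambda\delta Q^\lambda + \frac{\partial\mathcal{L}}{\partial U_r}\,\delta U_r + \frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,\delta U_{r,\mu} + \frac{\partial\mathcal{L}}{\partial x^\lambda}\,\delta x^\lambda + \mathcal{L}\,\partial_\lambda\delta x^\lambda .$$

By assumption, the group of transformations considered is admissible and therefore condition (10) holds. The application of this criterion immediately leads to the divergence relation (11)

$$[\mathcal{L}]^r\,\bar\delta U_r = \partial_\lambda\big(\delta Q^\lambda - \delta\Omega^\lambda\big),$$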
which completes the proof. The inverse of this theorem can also be proved [5].

3. The conservation laws from Noether's theorem

3.1. Closed and non-closed systems

The theorem of Noether shall now be applied to the theory of local fields. This theorem follows from the existence of an invariant action integral. The field equations to be derived via the variational principle are then covariant under admissible transformations. Consequently one can define the set of all admissible transformations as the invariance group of the system. We now distinguish between closed and non-closed systems by a convenient utilization of the variational derivatives of the Lagrangian density. A closed system is defined as an isolated one, so that no external influences affect it. The Lagrangian density does not depend explicitly on the coordinates, i.e. $\partial\mathcal{L}/\partial x^\mu = 0$, and the field equations $[\mathcal{L}]^r = 0$ follow from the Lagrangian $\mathcal{L}$ describing the isolated system. A non-closed system is not isolated from external influences but interacts with potentials or sources.10) It can be considered as the subsystem of a closed one.

10) The corresponding Lagrangian may depend explicitly on the coordinates.
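To make the distinction concrete, a minimal worked example may help. It is an illustration under stated assumptions and is not taken from the paper: a real scalar field $\varphi$ of mass $m$, metric signature $(+,-,-,-)$, and a prescribed external source $\rho(x)$. The symbols $\varphi$, $m$, $\rho$ and $\mathcal{L}_0$ are introduced only for this sketch.

```latex
\begin{align*}
% Closed system (assumed example): free field, no explicit coordinate dependence
\mathcal{L}_0 &= \tfrac12\,\partial_\mu\varphi\,\partial^\mu\varphi - \tfrac12\,m^2\varphi^2 ,
& \left.\frac{\partial\mathcal{L}_0}{\partial x^\mu}\right|_{\text{explicit}} &= 0 , \\
% Non-closed system (assumed example): the same field driven by a prescribed source rho(x)
\mathcal{L} &= \mathcal{L}_0 + \rho(x)\,\varphi ,
& \left.\frac{\partial\mathcal{L}}{\partial x^\mu}\right|_{\text{explicit}} &= (\partial_\mu\rho)\,\varphi , \\
% Field equation: the source enters as an inhomogeneity
\frac{\partial\mathcal{L}}{\partial\varphi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi)}
 &= -\left(\Box + m^2\right)\varphi + \rho = 0 . & &
\end{align*}
```

For a source that actually varies in space-time the explicit derivative in the second line does not vanish, so the Lagrangian of the driven field depends explicitly on the coordinates, in line with footnote 10.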
The total Lagrangian of two interacting fields $U_r^{(1)}$ and $U_r^{(2)}$ can be written as $\mathcal{L}_{\rm tot} = \mathcal{L}_1 + \mathcal{L}_2 + \mathcal{L}_w$, where $\mathcal{L}_1$ and $\mathcal{L}_2$ are the Lagrangian densities of the corresponding free fields, and $\mathcal{L}_w$ denotes the coupling term containing both fields. Now by varying $U_r^{(1)}$ in $\mathcal{L}_{\rm tot}$ one obtains the field equations for $U_r^{(1)}$. These equations contain the coupling to the second field $U_r^{(2)}$. On the other hand, by varying $U_r^{(2)}$ one is led to the field equations for $U_r^{(2)}$ with coupling to the field $U_r^{(1)}$. The coupling term (inhomogeneity of the field equation) may contain the varied field functions or may be independent of them. In the first case the field is influenced e.g. by potentials; in the second it interacts with sources. As an example we point to the electromagnetic field, depending on currents and charges, with the perturbation (inhomogeneity) given by the current density. Now for the subsystems $\mathcal{L}_1 + \mathcal{L}_w$ and $\mathcal{L}_2 + \mathcal{L}_w$, the perturbation terms can be calculated from the interaction term in the Lagrangian by varying the fields $U_r^{(1)}$ and $U_r^{(2)}$ …

… where the sum over $\mu$ now is limited to $\mu < \nu$. Since the six parameters … $= 0$ the tensor $T'^{\mu\nu} = \Theta^{\mu\nu} + \tfrac12\,\partial_\lambda H^{\lambda\mu\nu}$ is symmetric. Thus for closed systems the invariance under the inhomogeneous Lorentz group leads to a symmetrical tensor $T'^{\mu\nu}$. The expression in (22)

$$M'^{\lambda\mu\nu} = x^\nu\,\Theta^{\lambda\mu} - x^\mu\,\Theta^{\lambda\nu} - H^{\lambda\mu\nu}$$

is often called the angular momentum tensor.14) However, this tensor does not exhibit the relationship between momentum and angular momentum quantities as it is known from classical mechanics.15) A natural definition of the angular momentum tensor is

$$M^{\lambda\mu\nu} = x^\nu\,T^{\lambda\mu} - x^\mu\,T^{\lambda\nu},$$
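where $T^{\lambda\mu}$ represents the energy-momentum tensor still to be defined. One can achieve this analogy to classical mechanics by completing $T'^{\mu\nu}$ to a tensor the divergence of which vanishes and which is also symmetric. It can now be shown that $\tfrac12\,\partial_\lambda(H^{\mu\nu\lambda} + H^{\nu\mu\lambda})$ is the correct symmetrical completion which leads to the conservation law $\partial_\lambda M^{\lambda\mu\nu} = 0$, with $M^{\lambda\mu\nu} = x^\nu T^{\lambda\mu} - x^\mu T^{\lambda\nu}$. For that purpose one adds $\tfrac12\,\partial_\lambda(H^{\mu\nu\lambda} + H^{\nu\mu\lambda})$ to the second term of $T'^{\mu\nu}$ and confirms that

$$\Phi^{\lambda\mu\nu} = \tfrac12\big(H^{\lambda\mu\nu} + H^{\mu\nu\lambda} + H^{\nu\mu\lambda}\big)\;{}^{16)}$$

is antisymmetric in the first two indices

$$\Phi^{\lambda\mu\nu} = -\Phi^{\mu\lambda\nu}. \tag{23}$$

From the definition of $\Phi^{\lambda\mu\nu}$ and from (23) one obtains

$$H^{\lambda\mu\nu} = \Phi^{\lambda\mu\nu} - \Phi^{\lambda\nu\mu}. \tag{24}$$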
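The two statements (23) and (24) can be checked in a few lines. The sketch below assumes that $H^{\lambda\mu\nu}$ is antisymmetric in its last two indices, $H^{\lambda\mu\nu} = -H^{\lambda\nu\mu}$ (a property not restated in this section), and writes $\Phi^{\lambda\mu\nu}$ for the combination $\tfrac12(H^{\lambda\mu\nu} + H^{\mu\nu\lambda} + H^{\nu\mu\lambda})$; the letter $\Phi$ is a notational assumption.

```latex
\begin{align*}
% (23): exchange the first two indices, then use H^{abc} = -H^{acb} on every term
\Phi^{\mu\lambda\nu}
 &= \tfrac12\left(H^{\mu\lambda\nu} + H^{\lambda\nu\mu} + H^{\nu\lambda\mu}\right)
  = -\tfrac12\left(H^{\mu\nu\lambda} + H^{\lambda\mu\nu} + H^{\nu\mu\lambda}\right)
  = -\Phi^{\lambda\mu\nu} , \\
% (24): antisymmetrize in the last two indices; the mixed terms cancel pairwise
\Phi^{\lambda\mu\nu} - \Phi^{\lambda\nu\mu}
 &= \tfrac12\left(H^{\lambda\mu\nu} + H^{\mu\nu\lambda} + H^{\nu\mu\lambda}
      - H^{\lambda\nu\mu} - H^{\nu\mu\lambda} - H^{\mu\nu\lambda}\right) \\
 &= \tfrac12\left(H^{\lambda\mu\nu} - H^{\lambda\nu\mu}\right)
  = H^{\lambda\mu\nu} .
\end{align*}
```

Only the assumed antisymmetry of $H^{\lambda\mu\nu}$ enters, so the check does not depend on the particular field considered.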
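Because of

$$\Phi^{\mu\lambda\nu} = \partial_\kappa\big(x^\mu\,\Phi^{\kappa\lambda\nu}\big) - x^\mu\,\partial_\kappa\Phi^{\kappa\lambda\nu}$$

$M'^{\lambda\mu\nu}$ can be written in the form

$$M'^{\lambda\mu\nu} = x^\nu\,\Theta^{\lambda\mu} - x^\mu\,\Theta^{\lambda\nu} - \Phi^{\lambda\mu\nu} + \Phi^{\lambda\nu\mu}$$
$$\qquad = x^\nu\big(\Theta^{\lambda\mu} + \partial_\kappa\Phi^{\kappa\lambda\mu}\big) - x^\mu\big(\Theta^{\lambda\nu} + \partial_\kappa\Phi^{\kappa\lambda\nu}\big) + \partial_\kappa\big(x^\mu\,\Phi^{\kappa\lambda\nu} - x^\nu\,\Phi^{\kappa\lambda\mu}\big).$$

Now the derivative $\partial_\lambda\partial_\kappa\big(x^\mu\,\Phi^{\kappa\lambda\nu} - x^\nu\,\Phi^{\kappa\lambda\mu}\big)$ vanishes, and hence the conservation law

$$\partial_\lambda\big\{x^\nu\,T^{\lambda\mu} - x^\mu\,T^{\lambda\nu}\big\} = 0 \tag{25}$$

holds, with

$$T^{\lambda\mu} = \Theta^{\lambda\mu} + \partial_\kappa\Phi^{\kappa\lambda\mu}. \tag{26}$$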
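The vanishing of the double derivative used for (25) follows from antisymmetry alone. As a short sketch, $A^{\kappa\lambda}$ is an auxiliary symbol introduced here (it does not appear in the original) for the bracket $x^\mu\Phi^{\kappa\lambda\nu} - x^\nu\Phi^{\kappa\lambda\mu}$ at fixed $\mu$, $\nu$, and partial derivatives are assumed to commute:

```latex
\begin{align*}
% A is antisymmetric in (kappa, lambda) because Phi is antisymmetric
% in its first two indices, eq. (23)
A^{\kappa\lambda} &:= x^\mu\,\Phi^{\kappa\lambda\nu} - x^\nu\,\Phi^{\kappa\lambda\mu}
  = -A^{\lambda\kappa} , \\
% relabel the dummy indices, commute the derivatives, then use antisymmetry
\partial_\lambda\partial_\kappa A^{\kappa\lambda}
 &= \partial_\kappa\partial_\lambda A^{\lambda\kappa}
  = \partial_\lambda\partial_\kappa A^{\lambda\kappa}
  = -\,\partial_\lambda\partial_\kappa A^{\kappa\lambda}
 \;\Longrightarrow\; \partial_\lambda\partial_\kappa A^{\kappa\lambda} = 0 .
\end{align*}
```

The same argument applied to $\Phi^{\kappa\lambda\mu}$ itself gives $\partial_\lambda\partial_\kappa\Phi^{\kappa\lambda\mu} = 0$.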
The tensor $T^{\mu\nu}$ is symmetric because the equations (22) and (24) lead with (18a) to

$$\Theta^{\nu\mu} - \Theta^{\mu\nu} - \partial_\lambda\Phi^{\lambda\mu\nu} + \partial_\lambda\Phi^{\lambda\nu\mu} = 0.$$

13) In general the divergence of $T'^{\mu\nu}$ is different from zero, i.e. $T'^{\mu\nu}$ can not be used as energy-momentum tensor.
14) See, for example, W. PAULI [12].
15) Moreover for the electromagnetic field the tensor $M'^{\lambda\mu\nu}$ is not gauge invariant.
16) The tensor $\Phi^{\lambda\mu\nu}$ has been introduced by F. J. BELINFANTE to symmetrize the canonical tensor. See also the clear discussion in the paper by W. PAULI [12].
The divergence of $T^{\mu\nu}$ is equal to the divergence of the canonical tensor, since using (23) one has $\partial_\lambda\partial_\kappa\Phi^{\kappa\lambda\mu} = 0$. The term $\partial_\kappa\Phi^{\kappa\lambda\mu}$ does not contribute to the total energy and momentum of the field, since the integral of the spatial divergence $\partial_k\Phi^{k0\mu}$ vanishes,

$$\int d^3x\,\partial_\kappa\Phi^{\kappa 0\mu} = \int d^3x\,\partial_k\Phi^{k0\mu} = 0.$$

Because of these properties $T^{\mu\nu}$ can be considered as the general energy-momentum tensor. In context with the treatment in [7] it should be noted that for deriving the law of angular-momentum conservation, the validity of energy-momentum conservation is not necessary. However, to establish the symmetry of $T^{\mu\nu}$, the invariance under translations is to be assumed. For a non-closed system one obtains from (21) with $[\mathcal{L}]^r = -[\mathcal{L}_w]^r$, $k^\mu = -[\mathcal{L}_w]^r\,U_r^{,\mu}$

$$x^\nu k^\mu - x^\mu k^\nu + \ldots = \partial_\lambda\big\{x^\nu\,\Theta^{\lambda\mu} - x^\mu\,\Theta^{\lambda\nu} - H^{\lambda\mu\nu}\big\}. \tag{27}$$

Consider now the scalar field. The results in this case are for a closed system,

$$\partial_\lambda\Theta^{\lambda\mu} = 0, \qquad \partial_\lambda\big\{x^\nu\,\Theta^{\lambda\mu} - x^\mu\,\Theta^{\lambda\nu}\big\} = 0$$

and for the non-closed system, $\partial_\lambda\Theta^{\lambda\mu} = k^\mu$, $\partial_\lambda\{x^\nu\,\Theta^{\lambda\mu} - x^\mu\,\Theta^{\lambda\nu}\} = $ …

… is a pure spin contribution. Hence the spin can be defined in this context as that property of a field which contributes the term $\partial_\lambda\Phi^{\lambda\mu\nu}$ to the energy-momentum tensor. The canonical tensor does not contain any spin properties.

17) Here the relation $\partial_\lambda M'^{\lambda\mu\nu} = \partial_\lambda M^{\lambda\mu\nu}$ has been used.
It should be noted that it is in general not possible to extract a pure spin density from a decomposition of the angular momentum tensor, because in …

… (15) thus obtaining

$$\big\{[\mathcal{L}]^r\,U_r - [\mathcal{L}]^{r*}\,U_r^*\big\}\,i\alpha = -i\alpha\,\partial_\mu\Big\{\frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,U_r - \frac{\partial\mathcal{L}}{\partial U^*_{r,\mu}}\,U_r^*\Big\}.$$

Using the field equations $[\mathcal{L}]^r = [\mathcal{L}]^{r*} = 0$ one gets $\partial_\mu s^\mu = 0$. From the invariance under phase transformations there follows the existence of a vector

$$s^\mu = i\,\Big\{\frac{\partial\mathcal{L}}{\partial U_{r,\mu}}\,U_r - \frac{\partial\mathcal{L}}{\partial U^*_{r,\mu}}\,U_r^*\Big\}$$

which can be interpreted as the current density. Then $\partial_\mu s^\mu = 0$ describes the conservation law of charge. It should be noted that $[\mathcal{L}]^r\,U_r = [\mathcal{L}]^{r*}\,U_r^*$ is sufficient to guarantee this conservation law if the invariance under phase transformations holds.

18) However, a covariant decomposition of the total angular momentum into orbital and spin term is not possible, because only the total angular momentum is conserved and not the individual terms. This question is discussed for the example of the electromagnetic field by F. ROHRLICH [20] p. 95 ff.
19) The application of this result to the electromagnetic field in matter shall be discussed elsewhere.

3.5. Intrinsic symmetry groups and conserved currents

Let us suppose that there exists a so-called intrinsic symmetry group of the system, i.e. the Lagrangian is invariant under transformations acting on "internal labels"20) of the fields, whereas the space-time coordinates are not affected. The corresponding infinitesimal transformations can be written

$$\delta U = i\varepsilon X_k U, \qquad \delta x^\lambda = 0, \qquad \bar\delta U = \delta U,$$

where $\varepsilon$ denotes an infinitesimal parameter and $X_k$ is the $k$-th generating element of the Lie group under consideration.21) Now according to Noether's theorem with $[\mathcal{L}]^r = 0$ one obtains

$$\partial_\lambda j_k^\lambda(x) = 0, \qquad j_k^\lambda \propto \frac{\partial\mathcal{L}}{\partial U_{,\lambda}}\,X_k U .$$

After integrating the zeroth component of the conserved current it follows that the generalized charge

$$G_k = \int d^3x\; j_k^0(x)$$

is a constant of time. As examples of such intrinsic symmetry groups we point to the unitary groups such as SU(2), SU(3), and others, which are used in the theory of fundamental particles.

I wish to thank Dr. H. D. DOEBNER for many valuable discussions as well as Professor G. SÜSSMANN for critical remarks.
20) These labels refer to internal degrees of freedom such as charge, baryonic number, hypercharge, etc.
21) The generating elements $X_k$ form a Lie algebra of dimension $n$ ($k = 1, \ldots, n$), where $n$ is the number of independent parameters of the corresponding Lie group. The indices of the fields are omitted.

References

[1] G. HAMEL, Z. Math. Phys. 50, 1 (1904).
[2] G. HERGLOTZ, Ann. Phys. 36, 493 (1911).
[3] H. WEYL, Raum, Zeit, Materie, Springer-Verlag, Berlin, 1918.
[4] F. KLEIN, Nachr. Akad. Wiss. Göttingen, Math.-Phys. Kl. 171 (1918).
[5] E. NOETHER, Nachr. Akad. Wiss. Göttingen, Math.-Phys. Kl. 235 (1918).
[6] E. BESSEL-HAGEN, Math. Ann. 84, 258 (1921).
[7] F. J. BELINFANTE, Physica 6, 887 (1939); 7, 499 (1940).
[8] L. ROSENFELD, Mem. Acad. Roy. Belgique 6, 30 (1940).
[9] R. ISKRAUT, Z. Phys. 119, 659 (1942).
[10] F. HUND, Materie als Feld, Springer-Verlag, Berlin, 1954.
[11] E. L. HILL, Rev. Mod. Phys. 23, 253 (1951).
[12] W. PAULI, Rev. Mod. Phys. 13, 203 (1941).
[13] T. A. MORGAN and D. W. JOSEPH, Nuovo Cimento 39, 494 (1965).
[14] D. M. LIPKIN, J. Math. Phys. 5, 696 (1964); T. A. MORGAN, J. Math. Phys. 5, 1659 (1964).
[15] R. COURANT and D. HILBERT, Methods of Mathematical Physics, Vol. I, Interscience, New York, 1953.
[16] D. HORN, Ann. Phys. (N.Y.) 32, 444 (1965).
[17] H. STEUDEL, Z. Naturforsch. 17a, 129 (1962); Nuovo Cimento 39, 395 (1965).
[18] T. DASS, Phys. Rev. 145, 1011 (1966); ibid. 150, 1251 (1966).
[19] T. H. BOYER, Ann. Phys. (N.Y.) 42, 445 (1967).
[20] F. ROHRLICH, Classical Charged Particles, Addison-Wesley, Reading, Mass., 1965.
Editors: Prof. Dr. Frank Kaschluhn, Prof. Dr. Artur Lösche, Prof. Dr. Rudolf Ritschl and Prof. Dr. Robert Rompe. Manuscripts should be sent to the editorial office: Dr. Lutz Rothkirch, II. Physikalisches Institut der Humboldt-Universität Berlin, 104 Berlin, Hessische Str. 2. Publisher: Akademie-Verlag GmbH, 108 Berlin, Leipziger Str. 3/4, telephone: 220441, telex no. 0112020, postal cheque account: Berlin 35021. The journal "Fortschritte der Physik" appears monthly; price of this issue 10 (special price for the GDR 6 M). Order number of this issue: 1027/16/6. Typesetting and printing: VEB Druckhaus "Maxim Gorki", 74 Altenburg, Bez. Leipzig, Carl-von-Ossietzky-Str. 30/31. Published under licence number 1324 of the Press Office of the Chairman of the Council of Ministers of the German Democratic Republic.
CONTENTS

V. V. FEDOROV, A. PAZMAN: Design of Physical Experiments (Statistical Methods), p. 325
U. E. SCHRÖDER: Noether's Theorem and the Conservation Laws in Classical Field Theories, p. 357

"Fortschritte der Physik" publishes review articles on current areas of physics. In addition to original reports in German or English, German translations of important foreign-language reports are printed.

From the contents of the following issues:
B. O. ENFLO, B. E. LAURENT: Dirac formalism for arbitrary spin in S-matrix theory
K. H. KREBS: Electron ejection from solids by atomic particles with kinetic energy
G. C. HEGERFELDT, J. HENNIG: Coupling of space-time and internal symmetry
G. DAUTCOURT, G. WALLIS: The cosmic blackbody radiation

Manuscripts are accepted by the editorial office by arrangement with the author.