This dissertation has been microfilmed exactly as received 68-8271

WILSON, Paul Robert, 1939-
LINEAR ALGEBRA OVER SKEWFIELDS.
University of Illinois, Ph.D., 1967
Mathematics

University Microfilms, Inc., Ann Arbor, Michigan
LINEAR ALGEBRA OVER SKEWFIELDS
BY
PAUL ROBERT WILSON A.B., University of Cincinnati, 1961 A.M., University of Cincinnati, 1962
THESIS Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematics in the Graduate College of the University of Illinois, 1967
Urbana, Illinois
UNIVERSITY OF ILLINOIS THE GRADUATE COLLEGE
SEPTEMBER 11, 1967

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY SUPERVISION BY PAUL ROBERT WILSON ENTITLED LINEAR ALGEBRA OVER SKEWFIELDS BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

In Charge of Thesis
Head of Department
Recommendation concurred in
If T ∈ Hom(V), then a matrix in (K)ₙ afforded by T will often be indicated by T; matrices will usually be represented by underlined capitals. The symbol I always stands for the identity transformation and I for the identity matrix. The symbol ⊕ means "direct sum"; iff means "if and only if."
TABLE OF CONTENTS

CHAPTER                                          PAGE
1. PRELIMINARIES                                    1
2. REDUCED CHARACTERISTIC POLYNOMIALS              12
3. THE REDUCED DETERMINANT                         21
4. LINEAR TRANSFORMATIONS                          29
5. DIAGONALIZABLE OPERATORS AND CANONICAL FORMS    43
APPENDIX I                                         50
APPENDIX II                                        57
REFERENCES
VITA
CHAPTER 1 PRELIMINARIES
It is assumed that the reader is familiar with the elementary theory of vector spaces as presented in (8) and with the theory of Dieudonné determinants (of which there is an excellent short exposition in (1)). In this chapter we review, briefly and selectively, some important results in the theory of linear algebras over skewfields. For all details the reader is referred to the third chapter of Jacobson's Theory of Rings.

It is well known that the ring of polynomials in one indeterminate over a skewfield has both left and right division algorithms, so we can make the following definition.

DEFINITION 1.1.
Let K be a skewfield and R = K[X] be its ring of polynomials. Given a,b ∈ R, not both zero, there exist uniquely determined elements d,m ∈ R such that (i) d and m are monic polynomials, (ii) Ra + Rb = Rd and Ra ∩ Rb = Rm. Then d will be called the GREATEST COMMON RIGHT DIVISOR of a and b and written d = gcrd(a,b); m will be called the LEAST COMMON LEFT MULTIPLE of a and b and written m = lclm(a,b). The polynomials gcld(a,b) and lcrm(a,b) are similarly defined.

Most important results in linear algebra depend in some way on the correspondence between linear transformations and finitely generated modules over polynomial rings. Such modules can always be decomposed into cyclic modules. In the commutative case two cyclic modules R/Ra and R/Rb are isomorphic iff a and b are associates in R. The situation is less simple in the non-commutative case.

DEFINITION 1.2. Two elements a,b ∈ R are said to be LEFT SIMILAR if R/Ra ≅ R/Rb; RIGHT SIMILAR if R/aR ≅ R/bR. O. Ore (see (12)) introduced this notion and proved that left and right similarity are equivalent, so we may speak of SIMILAR elements; we shall often write a ~ b to indicate the similarity of a and b.

THEOREM 1.3. Let a,b ∈ R be non-constant polynomials. The following statements are equivalent:
(i) a and b are similar;
(ii) there is a u ∈ R such that deg(u) < deg(b), Ru + Rb = R, and Ru ∩ Rb = Rau;
(iii) there is a v ∈ R such that deg(v) < deg(a), vR + aR = R, and vR ∩ aR = vbR;
(iv) there exist u,v ∈ R such that gcrd(u,b) = 1, gcld(v,a) = 1, deg(u) < deg(b), deg(v) < deg(a), and au = vb.
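The division algorithms behind Definition 1.1, and the Euclidean computation of gcrd that they support, can be made concrete. Below is a minimal sketch over the rational quaternions (the skewfield used in the examples of Chapter 2); polynomials are coefficient lists, quotients are built on the left, and gcrd is obtained by repeated division. All names here (`Quat`, `divmod_left`, `gcrd`) are illustrative, not taken from the thesis.

```python
from fractions import Fraction

class Quat:
    """A rational quaternion w + x*i + y*j + z*k."""
    def __init__(self, w=0, x=0, y=0, z=0):
        self.c = tuple(Fraction(t) for t in (w, x, y, z))
    def __add__(self, o):  return Quat(*(a + b for a, b in zip(self.c, o.c)))
    def __sub__(self, o):  return Quat(*(a - b for a, b in zip(self.c, o.c)))
    def __mul__(self, o):  # Hamilton product (non-commutative)
        a, b, c, d = self.c
        e, f, g, h = o.c
        return Quat(a*e - b*f - c*g - d*h,
                    a*f + b*e + c*h - d*g,
                    a*g - b*h + c*e + d*f,
                    a*h + b*g - c*f + d*e)
    def inv(self):  # conjugate divided by the norm
        n = sum(t * t for t in self.c)
        a, b, c, d = self.c
        return Quat(a/n, -b/n, -c/n, -d/n)
    def __eq__(self, o):  return self.c == o.c
    def __bool__(self):   return any(self.c)
    def __repr__(self):   return "Quat%s" % (self.c,)

# A polynomial in R = K[X] is a list of Quat coefficients, lowest degree first.
def deg(p):
    return max((i for i, c in enumerate(p) if c), default=-1)

def sub(p, q):
    n = max(len(p), len(q))
    pad = lambda r: r + [Quat()] * (n - len(r))
    return [a - b for a, b in zip(pad(p), pad(q))]

def divmod_left(a, b):
    """Right division algorithm: a = q*b + r with deg r < deg b."""
    q, r = [Quat()] * max(1, len(a)), list(a)
    while deg(r) >= deg(b) >= 0:
        s = deg(r) - deg(b)
        c = r[deg(r)] * b[deg(b)].inv()   # X is central, so this kills the top term
        q[s] = q[s] + c
        r = sub(r, [Quat()] * s + [c * t for t in b])
    return q, r

def gcrd(a, b):
    """Greatest common right divisor via the Euclidean algorithm, made monic."""
    while deg(b) >= 0:
        a, b = b, divmod_left(a, b)[1]
    lead_inv = a[deg(a)].inv()
    return [lead_inv * t for t in a[:deg(a) + 1]]

# Example: a = (X - j)(X - i) and b = (X - k)(X - i) share the right factor X - i.
one, i, j, k = Quat(1), Quat(0, 1), Quat(0, 0, 1), Quat(0, 0, 0, 1)
a = [j * i, Quat() - (i + j), one]    # X^2 - (i+j)X + ji
b = [k * i, Quat() - (i + k), one]    # X^2 - (i+k)X + ki
print(gcrd(a, b))                     # the monic gcrd, namely X - i
```

Because quotients are taken on the left, the chain Ra + Rb = Rr + Rb is preserved at each step, which is exactly why the ordinary Euclidean argument still yields the gcrd of Definition 1.1.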
The proof is clear.

The annihilator in R of any cyclic R-module is a principal two-sided ideal; Jacobson shows that any such ideal has a generator which is a monic polynomial in the center C of R.

DEFINITION 1.4. The sum of all two-sided ideals contained in a left (or right) ideal A of R will be called the BOUND of A and denoted by bd(A). If a ∈ R, a ≠ 0, then the unique monic polynomial a* ∈ C such that bd(Ra) = Ra* will be called the BOUND of a and written either bd(a) or a*. We set bd(0) = 0.

The apparent left-right asymmetry in the definition of bd(a) is illusory, for bd(Ra) = bd(aR); also, bd(Ra) is the annihilator in R of R/Ra, so a ~ b implies bd(a) = bd(b).

THEOREM 1.5. Every non-zero element of R has a non-zero bound.

It suffices to prove that if a ∈ R, a ≠ 0, then bd(Ra) ≠ 0. If a is a unit, Ra = R and bd(Ra) = R ≠ 0. If a has degree n ≥ 1 then R/Ra is an n-dimensional (left) K-vector space. Let T ∈ Hom(R/Ra) be defined by T(u + Ra) = Xu + Ra. Obviously T ∈ Hom_Z(R/Ra, R/Ra), so by the Cayley-Hamilton Theorem there is an f ∈ C, f ≠ 0, such that f(T) = 0. Then f ∈ bd(Ra).

COROLLARY 1.6. If a ∈ R, a ≠ 0, then bd(a) is the monic polynomial of least degree in C which is a multiple (left and right) of a.

That bd(a) is both a left and right multiple of a follows from bd(Ra) = bd(aR) = R(bd(a)).

DEFINITION 1.7. If a,b ∈ R and Ra ⊇ RbR, we say that a is a TOTAL DIVISOR of b and write a|b.

NOTE. Since aR ⊇ bd(aR) = bd(Ra) it is clear that a|b iff aR ⊇ RbR.

The next theorem asserts that total divisibility is a similarity invariant.

THEOREM 1.8. Suppose a ~ a', b ~ b'. Then a|b iff a'|b'.

The proof is straightforward.

We now take up the study of square matrices over
R and their reduction to a canonical form. Jacobson, working over a non-commutative principal ideal domain, defines three kinds of elementary transformations in (R)ₙ and proves that if A ∈ (R)ₙ, then by using these transformations on A one can reduce it to a diagonal matrix diag(a₁,...,aₙ) in which (if n > 1) aᵢ|aᵢ₊₁ for 1 ≤ i < n. Because we are working over a ring which has both left and right division algorithms, we can simplify Jacobson's set of elementary transformations and at the same time make an important improvement in his canonical form.

DEFINITION 1.9. Let GL(n,R) be the multiplicative group of units in (R)ₙ. For each a ∈ R and each pair of integers i,j such that 1 ≤ i,j ≤ n, i ≠ j, let Eᵢⱼ(a) be the element of (R)ₙ having 1's down the main diagonal, a in the (i,j) spot, and 0's elsewhere. Denote by SL(n,R) the subgroup of GL(n,R) generated by all such Eᵢⱼ(a)'s.

NOTE 1. The inverse of Eᵢⱼ(a) is Eᵢⱼ(−a).

NOTE 2. GL(1,R) = K*, SL(1,R) = {1}. For n > 1, SL(n,R) is the set of elements of GL(n,R) having Dieudonné determinant 1 (see (5)). Therefore SL(n,R) is a normal subgroup of GL(n,R) for all n.
Observing that all of the elementary transformations used by Jacobson are members of SL(n,R), we have:

THEOREM 1.10. Let A ∈ (R)ₙ. Then there exist U,W in SL(n,R) such that UAW = diag(a₁,...,aₙ), where aᵢ|aᵢ₊₁ for 1 ≤ i < n. See lemma 1, Appendix I.

We now fix our attention on the class of matrices over R of form IX − T, where T = (aᵢⱼ) ∈ (K)ₙ. Let us recall how such matrices appear in the study of linear transformations. Let V be an n-dimensional K-vector space with basis {v₁,...,vₙ} and let T ∈ Hom(V) afford the matrix T relative to this basis. Make V into a left R-module V_T by letting X act as T. Now if F = Rx₁ ⊕...⊕ Rxₙ is a free left R-module on n generators, and N is the kernel in F of the R-homomorphism defined by xᵢ → vᵢ, 1 ≤ i ≤ n, then V_T is R-isomorphic to F/N. It is easily proved that the elements y₁,...,yₙ defined by yⱼ = Xxⱼ − Σᵢ aᵢⱼxᵢ form a basis for N (i.e. N = Ry₁ ⊕...⊕ Ryₙ) and that (x₁,...,xₙ)(IX − T) = (y₁,...,yₙ). By (1.10) we can find U,W ∈ SL(n,R) such that U(IX − T)W = diag(a₁,...,aₙ), where aᵢ|aᵢ₊₁ for 1 ≤ i < n.

... n > 1 and there exist U,W ∈ SL(n,R)
such that U(X − α)W = (X − β)I, it is not difficult to show that W⁻¹ = U and that W may be chosen from SL(n,K). Therefore if (X − α)I and (X − β)I are equivalent normal forms of T there is a W ∈ SL(n,K) such that ...

... therefore a[K₁*,K₁*] = Δ(U·diag(1,...,a)·V) = Δ(diag(1,...,b)) = b[K₁*,K₁*], as asserted.

COROLLARY 2.6. If a,b ∈ R are monic and similar, then Nrd(a) = Nrd(b) ∈ C.
The converse is false. Let Z be the field of rational numbers, and K = Z[i,j] be the quaternions over Z. Every element of K₁ has a unique representation of the form f₁ + f₂i + f₃j + f₄k, where k = ij and fₘ ∈ Z₁ for 1 ≤ m ≤ 4. The reduced norm of such an element is f₁² + f₂² + f₃² + f₄². Consider the elements a = (X² − 1) + (2X)i and b = X² + 1. It is clear that Nrd(a) = Nrd(b). If a ~ b then by (1.3) we can find monic polynomials u,v ∈ R of degree < 2 such that au = vb and gcld(u,b) = 1. But no such polynomials exist: obviously u and v cannot be constants, so write u = X + α, v = X + β, α,β ∈ K; then au = vb implies (comparing coefficients) that α = −i, so X − i is the only possible choice for u. But then gcld(u,b) = X − i ≠ 1, for b = (X − i)(X + i). Thus a and b are not similar.

This example also proves that similarity classes do not multiply; that is, if f ~ f' and g ~ g' it may happen that fg ≁ f'g', for in the example a = (X + i)², b = (X + i)(X − i), and the polynomials X + i and X − i are similar.

We have shown that Nrd induces a multiplicative
group homomorphism from K₁*/[K₁*,K₁*] into Z₁, which takes cosets of elements of K* into Z, and cosets of elements of R into C. We now compose this induced map with the Dieudonné determinant Δ from (K₁)ₙ into K₁*/[K₁*,K₁*].

DEFINITION 2.7. If A ∈ (K₁)ₙ is not invertible, define the REDUCED DETERMINANT of A to be 0. If A ∈ GL(n,K₁), let α be any representative of the coset Δ(A); define the reduced determinant of A to be Nrd(α). For all A ∈ (K₁)ₙ the reduced determinant of A will be written Δrd_{K₁/Z₁}(A) or simply Δrd(A).

The discussion preceding (2.7) and the fact that the Dieudonné determinant on (K₁)ₙ is an extension of that on (K)ₙ implies that if A ∈ (K)ₙ then Δrd(A) ∈ Z, while A ∈ (R)ₙ implies Δrd(A) ∈ C. Therefore Δrd may be viewed as a group homomorphism from GL(n,K) into Z*, or from GL(n,K₁) into Z₁*. As the subscript K₁/Z₁ in the definition of Δrd serves no useful purpose, it will be dropped.
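Before passing to Chapter 3, the counterexample following (2.6) can be checked mechanically. Writing an element f₁ + f₂i + f₃j + f₄k of K₁ as four rational-coefficient polynomials, the reduced norm is f₁² + f₂² + f₃² + f₄²; the sketch below (the helper names are mine, not the thesis's) confirms that a = (X² − 1) + (2X)i and b = X² + 1 have the same reduced norm (X² + 1)², even though they were shown above not to be similar.

```python
def pmul(p, q):
    """Product of two integer-coefficient polynomials (lists, lowest degree first)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def nrd(f1, f2, f3, f4):
    """Reduced norm of f1 + f2*i + f3*j + f4*k over the rational quaternions."""
    total = [0]
    for f in (f1, f2, f3, f4):
        total = padd(total, pmul(f, f))
    return total

a = ([-1, 0, 1], [0, 2], [0], [0])   # (X^2 - 1) + (2X)i
b = ([1, 0, 1],  [0],    [0], [0])   # X^2 + 1
print(nrd(*a))   # [1, 0, 2, 0, 1], i.e. (X^2 + 1)^2
print(nrd(*b))   # the same polynomial
```

Equality of reduced norms here is exactly the failure of the converse of (2.6).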
CHAPTER 3 THE REDUCED DETERMINANT

Let T ∈ Hom(V); if T₁ and T₂ are matrices afforded by T, then obviously Δrd(T₁) = Δrd(T₂). Suppose T is afforded by T and diag(e₁,...,eₙ) is a normal form of IX − T; then there exist U,W ∈ SL(n,R) such that U(IX − T)W = diag(e₁,...,eₙ).

THEOREM 3.1. If T ∈ (K)ₙ has normal form diag(e₁,...,eₙ), then Δrd(IX − T) = Nrd(e₁e₂···eₙ).

The proof follows from the fact that SL(n,R) is contained in the kernel of Δrd viewed as a homomorphism on GL(n,K₁); theorem (1.11) guarantees that IX − T ∈ GL(n,K₁).

If (e₁,...,eₙ) and (f₁,...,fₙ) are invariant lists for
T ∈ Hom(V), then by Nakayama's theorem eᵢ ~ fᵢ for each i, so by (2.6) Nrd(eᵢ) = Nrd(fᵢ) for each i. This shows that the next definition is well formed.

DEFINITION 3.2. Let T ∈ Hom(V) and (e₁,...,eₙ) be any invariant list for T; then the sequence (Nrd(e₁),...,Nrd(eₙ)) will be called the REDUCED LIST of T. An analogous definition is given for the reduced list of a T ∈ (K)ₙ.
The reader might note here that if K is a field (so K = Z), then Nrd is the identity map on K and Δrd is the ordinary determinant; thus the definitions and results so far obtained coincide with the usual ones of linear algebra.

Two questions occur naturally at this point: Is there a practicable way to calculate the reduced list of T ∈ (K)ₙ? Can T be recovered from its reduced list?

The second question is easy. Choose a,b ∈ R, monic, with the same reduced norm, but not similar (see the example following (2.6)). If a and b have degree n > 1, then the n-tuples (1,...,1,a) and (1,...,1,b) are invariant lists for two non-similar transformations on an n-dimensional vector space, yet these transformations have exactly the same reduced list: (1,...,1,Nrd(a)). In general T cannot be recovered from its reduced list.

To answer the first question we proceed in the following way.
Suppose A ∈ (K₁)ₙ and 0 < r < n; after striking out any n−r rows and n−r columns of A, we are left with an r×r submatrix of A; the reduced determinant of this submatrix is an element of Z₁, and it will be called an r-rowed reduced minor of A.

DEFINITION 3.3. Let A ∈ (K₁)ₙ and r be a non-negative integer ≤ n. Define Dᵣ : (K₁)ₙ → Z₁ as follows. If every r-rowed reduced minor of A is 0, put Dᵣ(A) = 0. If some r-rowed reduced minor of A is different from 0, let Dᵣ(A) be the greatest common divisor of all the r-rowed reduced minors of A.

It is obvious that if A ∈ (K₁)ₙ and B,C ∈ SL(n,K₁), then Dᵣ(BAC) = Dᵣ(A) for all r. In particular, if T ∈ (K)ₙ, then for any normal form diag(e₁,...,eₙ) of T we have: Dᵣ(IX − T) = Dᵣ(diag(e₁,...,eₙ)). It is easily verified that the greatest common divisor of the (ordinary) r-rowed minors of the matrix diag(Nrd(e₁),...,Nrd(eₙ)) is Dᵣ(diag(e₁,...,eₙ)). But for each i < n, eᵢ|eᵢ₊₁, so Nrd(eᵢ)|Nrd(eᵢ₊₁); therefore the greatest common divisor of the r-rowed minors of diag(Nrd(e₁),...,Nrd(eₙ)) is exactly Nrd(e₁)···Nrd(eᵣ). We can now calculate the reduced list of T ∈ (K)ₙ in this way: (1) Calculate the reduced norm of each entry of IX − T; the greatest common divisor of these reduced norms (in C) will be Nrd(e₁). (2) Calculate all 2-rowed reduced minors of IX − T; the greatest common divisor of these will be D₂(IX − T) = Nrd(e₁)Nrd(e₂). Divide by Nrd(e₁) to obtain Nrd(e₂).
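When K is a field the reduced minors are ordinary minors, and steps (1) and (2) become the classical computation of invariant factors from determinantal divisors. The integer-matrix sketch below (over Z in place of C; the matrix and helper names are purely illustrative) exhibits the same gcd-of-r-rowed-minors mechanism: the quotient Dᵣ/Dᵣ₋₁ recovers the r-th invariant factor.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j+1:] for row in m[1:]])
               for j in range(len(m)))

def D(A, r):
    """gcd of all r-rowed minors of A: the r-th determinantal divisor."""
    n, g = len(A), 0
    for rows in combinations(range(n), r):
        for cols in combinations(range(n), r):
            g = gcd(g, det([[A[i][j] for j in cols] for i in rows]))
    return g

def invariant_factors(A):
    n, prev, result = len(A), 1, []
    for r in range(1, n + 1):
        d = D(A, r)
        if d == 0:
            break
        result.append(d // prev)   # D_r / D_{r-1}, as in steps (1)-(2) above
        prev = d
    return result

A = [[2, 4, 4], [-6, 6, 12], [10, -4, -16]]
print(invariant_factors(A))   # [2, 6, 12]
```

Each invariant factor divides the next, mirroring the chain eᵢ|eᵢ₊₁ of the normal form.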
Repeating this process for r = 3,4,...,n produces the desired result.

We now turn our attention from matrices to linear transformations. Let V be an n-dimensional K-vector space and
S = Hom(V); then S is a central simple Z-algebra isomorphic to (K)ₙ, and S can be decomposed into a direct sum of n minimal left ideals, all of which are isomorphic to V as S-modules: S ≅ V^(n). Let E be a splitting field for S; S ⊗ E is a central simple E-algebra; if W is one of its minimal left ideals, then S ⊗ E ≅ W^(nd) and V ⊗ E ≅ W^(d). Let s ∈ S, and let s ⊗ 1 be its image in S ⊗ E; clearly, if (s ⊗ 1)_L ∈ Hom_E(W,W) is left multiplication by s ⊗ 1 on the left ideal W, then CP_E((s ⊗ 1)_L;X) = CPrd_{S/Z}(s;X) ∈ Z[X] = C. In addition, we observe that if (s ⊗ 1)_L' ∈ Hom_E(W^(d),W^(d)) is left multiplication by s ⊗ 1 on the nd²-dimensional E-vector space W^(d), then CP_E((s ⊗ 1)_L';X) is exactly CP_Z(s_L;X), where s_L ∈ Hom_Z(V,V) is left multiplication by s on the nd²-dimensional Z-vector space V. But CP_E((s ⊗ 1)_L';X) = (CP_E((s ⊗ 1)_L;X))^d = (CPrd_{S/Z}(s;X))^d. We have proved:

THEOREM 3.4. Let K be a skewfield of index d over its center Z, let V be an n-dimensional vector space over K, S = Hom_K(V), and T ∈ S. Then, letting CP_Z(T;X) denote the characteristic polynomial of T considered as an element of Hom_Z(V,V), we have (CPrd_{S/Z}(T;X))^d = CP_Z(T;X).

From this point on, for T ∈ S = Hom_K(V), CPrd(T;X) will always mean CPrd_{S/Z}(T;X). Suppose
a ∈ R is a monic polynomial of degree n ≥ 1, let V = R/Ra be the associated n-dimensional K-vector space, and let T ∈ Hom(V) correspond to multiplication by X in V. Since any Z-basis for K is also a C-basis for R, as well as a Z₁-basis for K₁, we have N(a) = CP_Z(T;X) ∈ C. (This follows from the fact that N(a) is the determinant of a_L, where a_L ∈ Hom_{Z₁}(K₁,K₁) is left multiplication by a in K₁.) Therefore, by (2.4), (Nrd(a))^d = N(a) = CP_Z(T;X) = (CPrd(T;X))^d.

LEMMA 3.5. Let
x,y ∈ C be monic and let m be any positive integer; then x^m = y^m implies x = y.

Let xy⁻¹ = z, so that z^m = 1. Since x and y have the same degree, z is a power series of the form Σᵢ₌₀^∞ αᵢX^(−i), where αᵢ ∈ Z for all i and α₀ ≠ 0. Let α₁ = ... = α_{s−1} = 0, α_s ≠ 0. Assume the characteristic of Z is 0. Then z^m = 1 implies that α₀^m + mα₀^(m−1)α_sX^(−s) + ... = 1, whence α_s = 0. The contradiction shows that if Z has characteristic 0, z = α₀ ∈ Z; but then x = α₀y and, x and y being monic, x = y. If Z has characteristic p ≠ 0, let m = p^k·m₁, where (m₁,p) = 1. The proof just given shows that z^(p^k) = 1, so x^(p^k) = y^(p^k). But then x^(p^k) − y^(p^k) = (x − y)^(p^k) = 0, whence x = y.

THEOREM 3.6. Let a ∈ R be monic of degree n ≥ 1, and let V = R/Ra be the associated n-dimensional K-vector space. If T ∈ Hom(V) is multiplication by X on V, then CPrd(T;X) = Nrd(a).

The proof is an immediate consequence of (3.5) and the remarks preceding it.

Now let V be any n-dimensional vector space over K and choose
T ∈ Hom(V). We can decompose the left R-module V_T into a direct sum R/Ra₁ ⊕...⊕ R/Raₙ, in which the aᵢ's are monic polynomials. Let Tᵢ be the restriction of T to R/Raᵢ. Relative to an appropriate basis, the matrix of T is the direct sum of the matrices of the Tᵢ's; from the ordinary properties of the determinant used to define CPrd(T;X), we therefore have the following important consequence of (3.6).

THEOREM 3.7. If T ∈ Hom(V) and V_T = R/Ra₁ ⊕...⊕ R/Raₙ, where the aᵢ's are monic, then we have: CPrd(T;X) = Nrd(a₁)···Nrd(aₙ) = Nrd(a₁···aₙ).

COROLLARY 3.8. Let (e₁,...,eₙ) be any invariant list for T ∈ Hom(V); then CPrd(T;X) = Nrd(e₁···eₙ) = Nrd(e₁)···Nrd(eₙ). In particular, CPrd(T;X) is the product of the entries in the reduced list of T.

THEOREM 3.9. If
T ∈ (K)ₙ, then CPrd(T;X) = Δrd(IX − T).

The proof follows from (3.1) and (3.8).

We can now prove a non-commutative version of one of the basic results of linear algebra.

THEOREM 3.10. (Cayley-Hamilton). For every T ∈ Hom(V), CPrd(T;T) = 0.

The ordinary Cayley-Hamilton theorem says that CP_Z(T;T) = 0, so the proof follows from (3.4).

The matrix form of (3.10) is:

THEOREM 3.11. For every T ∈ (K)ₙ, CPrd(T;T) = 0.

We conclude with two more generalizations of standard results.

THEOREM 3.12. If S,T ∈ Hom(V), then CPrd(ST;X) = CPrd(TS;X).

The proof results from (3.4), (3.5), and the truth of the corresponding theorem in the ordinary case.

COROLLARY 3.13. If S,T ∈ Hom(V) are similar, then CPrd(S;X) = CPrd(T;X).
CHAPTER 4 LINEAR TRANSFORMATIONS
Earlier, in (1.4), we defined the bound a* of a polynomial a s R.
Recall that a* is monic (unless a = 0)
and belongs to C, it generates the annihilator in R of R/Ra, and is the monic multiple of a of least degree which lies in C. THEOREM 4.1.
If
a e R , a + 0 , then
a|a*
and
a*|N r d (a). It suffices to prove the result for monic polynomials; we may assume a has degree n > 1. vector space
V = R/Ra, and let T s Hom(V) be multi-
plication by X on V. chapter, N
N
rd
Form the
By the results of the last rd
(a) = CP (T;X), and
CP(T;T) = 0, so that
(a) belongs to the annihilator of R/Ra; that is,
a* |N
(a). We already know that a|a*, so the theorem
is proved. The following example shows that a,a*, and N can all be distinct.
Let Z be the field of rational
numbers, and K be the quaternions over Z. a = X(X+i) s R. and
(a)
It is easily seen that
Choose a* = iP + X
N r d (a) = X 4 + X 2 .
DEFINITION 4.2. let
Let V be a vector space over K, and
T s Hom(V).
We define the MINIMAL POLYNOM-
30
IAL of T over K to be the unique monic polynomial f e R
of least degree such that f annihilates
the R-module V™.
This f will be denoted by
MPK(T;X) of by MP(T;X). We note that such an f always exists, it is the monic generator of the two-sided ideal in R which annihilates V,jj and therefore lies in C. It is immediate from the definition that MP(T;T) = 0. THEOREM 4.3.
Let V be a K-vector space, T s Hom(V).
If (e1,...,e ) is any invariant list for T, then MP(T;X) o e*. First, if (a,,...,a) is another invariant list, we have
a„/v e_ , so n n
a* = e* . We know that V m is n n T
isomorphic to the direct sum of the modules R/Re., for i = 1,2,...,n. Since
e
ilen
f° r each'i, the annihila-
tor of V™ is exactly the annihilator of R/Re , i.e. e*. Theorem (4.3) has the following easy generalization. THEOREM 4.4.
Let V and T be as in (4.3). If &v...,an -,
are arbitrary polynomials in R such that V T = R/Ra]_ ©...© R/Ran , then MP(T;X) is the least common multiple of the polynomials a?,...,a*. The proof is obvious. One other elementary fact
about minimal polynomials deserves mention; if V is a K-vector space, T e Hom(V), then considering T as a member of Homz(V,V), the minimal polynomial of T, denoted by MPZ(T;X), is exactly the same as MP(T;X). We next examine questions relating to characteristic values of linear transformations. Recall that for
a,p s K, the relation
a = p means that there is c
a X s K* such that \oc\~ = P; conjugate in K.
we say that a and p are
Throughout this discussion V will re-
i
present an n-dimensional K-vector space and T will be a fixed transformation on V. \~ aX will be denoted by
For oc,\ s K, the product
a .
!
DEFINI$I0N 4.5. An element a sK will be called a CHARACTERISTIC VALUE of T if Tv = av for some vsV,* v 4 = 0 . For each a e K, define S_a to be ' {vsV:Tv=av}, and define W
to be
ES, \v, where
X runs over all non-zero elements of K. THEOREM 4.6.
For each a s K, W
is a K-subspace of V.
Note that S> ?uis a vector space over Z for each X, and that if Therefore
P e K, then pWa = fl&\y
PQI^-N"
{pvsveSaX} - S a (\p~ 1 ).
SSa(\p~1) C W a .
DEFINITION 4.7. W a will be called the CHARACTERISTIC. SPACE of T associated with a.
NOTE. W„ cx is T-invariant, i.e., We observe that
a = P
T(WJ ex C V tx.
implies
W
= W« . The
converse is also true. LEMMA 4.8. a
If
oc,p s K
are not conjugate, then
a A V p - tO}.
Suppose
v e S a ^Vo » v + 0. Then
also v = v, +...+V
Tv = av ; but
, v. s S Q , where p. = p for
each i, and p,,...,p
are distinct. Of all such v's,
choose one for which m is minimal. We have av = Tv = T(v, 1 +...+ v m) = i p,v l n +...+ P„v m m. But
av = av, +...+ av
so
X = P-i-a + 0.
E(p.-a) = 0.
a 4 P
so
Obviously
write
v 1 = X" (a-P2)v2 +...+ X
By hypothesis
m > 1, so we can
(a-Pm)vm« Putting
X. = X~ (a-pi) for i = 2,...,m , we have v = X20 v2 0 +...+ X„v m m . This contradiction to the minimality of m shows that no such v exists.
—>
THEOREM 4.9.
are non-
Suppose that
o^,...,^ s K
conjugate in pairs and that W„ 4= 0 a i Then Wa +...+ W o^ is a direct sum.
for each i.
1
If the sum is not direct choose that
w. 6 W
such
Ew. = 0 , not all w. = 0. By renumbering we may
assume
w, +...+ w. = 0
with
w. 4= 0 for
1 < d < t.
Let t be the smallest integer for which such a relation holds. Now from all such relations, (t is fixed), pick one in which w-, is expressible as a sum
v, +...+ v_
i
l
v.. e S„(X.) , m minimal. By (4.8) m > 1. mality of m implies that the elements for
The mini-
p. = a, ^ i^ ,
1 < i < m , are all distinct. Since
have
m ,
Ew. = 0, we
0 = T(w1 +...+ w^) - P1(w1 +...+ w^) = wj+...+w£,
where
w!. = T(w.)-P,w. e W
for each j. In particu-
lar, w.J_ = T(w1)-p1w1 = E P ^ - Ep.jV.j_ = 2(p±-P1)vi
is
a sum of fewer that m elements, each of the same genre as the v.'s (see the proof of (4.6)).
If this fact is
not to contradict the minimality of m, then it must happen that each of the w\ 's is 0 (otherwise the minimality of t would be contradicted). But if then T(w.)-|3,w. = w'. = 0 implies w. e S n d
i j
d
^
"l
j > 1, = S (X,). 1
Since, by hypothesis, the a.*s are non-conjugate in pairs, this contradicts (4.8).
We conclude that no
such w.'s can exist, so the sum is direct. We can use (4.9) to prove the non-commutative counterparts of many familiar results. THEOREM 4.10. A linear transformation has at most n distinct, non-conjugate characteristic values in K, when n is the dimension of the vector space.
34
If a,,...,a, are characteristic values of T, nonconjugate in pairs, then for each i, W
has dimension a
i
at least 1 over K, so by (4.9) there can be at most n such a.'s. Let us look at a special case; suppose monic of degree
a e R is
m > 1. Let U = R/Ra be the associated
m-dimensional vector space, and let
S s Hom(U) be
multiplication by X on U. THEOREM 4.11.
If a, U, and S are as above, then a s K
is a characteristic value of S iff there is a X + 0 in K such that Suppose
u=f+RasU
(X-a ) is a left divisor of a. , u + 0 , and
Su = au.
We may assume the degree of f is less than that of a. Then, using the fact that S is multiplication by X, we see that
(X-a)f e Ra. By comparing degrees of the
two sides we find that there is a X 4= 0 in K such that (X-a)f = Xa. The result follows. Conversely, if (X-a )f = a, then for the element have:
u = Xf + Ra s U we
Su = au. Since u cannot be 0, the converse is
proved. In the classical theory the characteristic values of a transformation are exactly the zeros of its minimal polynomial; after some preliminary work we shall prove the same result for the non-commutative case.
DEFINITION 4.12. Let
f e a, f 4= 0 , and let
We shall call a a ZERO of f if NOTE.
f(a) = 0._
Since Z(a) is a field for all
no ambiguity about this substitution. a zero of
a s K.
a e K , there is Obviously a is
f e C iff X-a is a divisor of f in R.
LEMMA 4.13.
Let a, U, and S be as in (4.11).
If aeK
is a characteristic value of S, then it is a zero of CPrd(S;X), i.e., CPrd(S;a) = 0. By (4.11) a = (X-aX)f Therefore (4.1)
for some X 4= 0 , f e R.
CPrd(S;X) = N rd (a) = N rd (X-a X ).N rd (f). By
(X-aX) divides N r d (X-a X ), so it divides
CP rd (S;X).
Therefore CPrd(S;aX) = 0, whence the result,
COROLLARY 4.14. If a is a characteristic value of S, then a is a zero of a*. Same proof as for (4.13). Before we can prove a converse to (4.14), we need to know more about zeros of polynomials. THEOREM 4.15.
If
c e C is irreducible and non-con-
stant, then all zeros of c in K are conjugate. Let
a,p e K
be zeros of c. Let a: Z(a) ->• Z(p)
be the Z-isomorphism which takes a to P". By the SkolemNoether theorem (see (3)> chapter 8, page 110) there
is a X 4= 0
in K such that o*(x) = x
in particular, a
for every xeZ(a);
= p.
Since K is finite dimensional over Z. every element of K is algebraic over Z-.
For
p e K , let the
irreducible polynomial of p over Z- be written Irr(p;X). It follows from the definitions that for a s K , Irr(a;X) => (X-a)*; if
]i e K , u 4s 0 , it is
clear that (X-a*1) is a left divisor of (X-a)*.
Let f
be the monic polynomial of least degree in R having the property that for each \i 4s 0, (X-a*1) is a left divisor of f; for each )i | 0 f = (X-a*1)^ . If that and
X f 4 f j let
f | 0
choose
choose
so that
X s K , X + 0 , such
X f - f = 5h , where
6 4= 0 . For each u 4> 0
g^ e R
h s R
is monic
we have fX = (X-a^)ff X;
but as u ranges over the non-zero elements of K, so does uX. Thus each (X-a") is a left divisor of 5h, and so of h.
The degree of h is less than that of f;
this contradicts the minimality of the degree of f. We conclude that
f e C. Since (X-a)* is the monic
polynomial of least degree in C having the stated property, it follows that
f = (X-a)*.
THEOREM 4.16. For each a e K , Irr(a;X) can be written as a product of linear factors (X-a1)(X-a2)...(X-cxk) ,
where the a . ' s are a l l conjugate and a, = a.
We can write Irr(a;X) = (X-a1)v1 , where a, = a and
v, e E , If
a e Z , then v.,= 1
and we are
through; if a £ Z , we shall pick a conjugate of a by making use of the following observation. Suppose that X-p is a left divisor of uv, but not of u. Let u = (X-p")un + X , X 4= 0; then uv = (X-p)u,v + Xv, so (X-P ) is a left divisor of v. Returning to our proof, we note that since
a | Z
we can pick a conjugate a.1
of a which is not a; therefore (X-ai) is a left divisor of
(X-a,)vn , but not of (X-a,). By the above remark
there is a conjugate a 2 of ai such that Irr(a;X) = (X-a,)(X-a2)v2. Now if, for each i± 4= 0 , (X-a^") is a left divisor of (X-a,)(X-a2) then by the remarks preceeding the theorem (X-a,)(X-a2) is already equal to Irr(a;X), and we are through. Otherwise we proceed as before to find
ou = a
such that Irr(a;X) is equal
to (X-a, )(X-a2)(X-a,)v, . An obvious induction completes the proof. LEMMA 4.17. Let
a,b s R be distinct, monic, non-
constant, irreducible polynomials, and let a,PsK be zeros of a and b. If
aX = P , then
Then a and p are not conjugate.
a(p) = a(aX) = (a(a))X = 0.
But this is impossible, for 3 satisfies only one irreducible polynomial in C.
38
We wish to relate the factors of
a e R
to the
zeros of its bound a*. LEMMA 4.18.
Let
a e R . If
aeK
has the propyl in K, (X-a ) is not
erty that for each X 4= 0
a left divisor of a, then gcld(a,Irr(a;X)) = 1. By (4.15) all zeros of Irr(a;X) are conjugate; by (4.16) Irr(a;X) splits in R. LEMMA 4.19. where
Suppose
a e R
c,d e C.
The rest is obvious.
is a left divisor of cd,
If gcld(a,c) = 1, then a is a
left divisor of d. Choose
x,y e R
a(xd)+c(yd) = d . But THEOREM 4.20. Let
such that d e C
a eR
of a*, then for some
ax+cy = 1; then
so the result follows.
and a e K . X 4= 0
If a is a zero
(X-a ) is a left
divisor of a. If for all
X 4= 0, (X-aX) is not a left divisor
of a, then by (4.18) geld (a,Irr(a;X)) = 1. Since a is a zero of a*, (X-a), is a factor of a*, and necessarily a* belongs to the annihilator of R/R(X-a), i.e., (X-a)* is a factor of a*. Let
a* = (X-a)*d ,
d s C. We know that a divides a*, so by (4.19) & divides d. follows.
This contradicts (1.6).
The result
Let us return to a situation dealt with earlier: a eR
is monic of degree m > 1, U = R/Ra is the
associated m-dimensional vector space over K, and S s Hom(U) is multiplication by X on U. earlier results that is equal to a*.
CPrd(S;X) = N rd (a)
We know from and MP(S;X)
In the discussion just completed,
we have proved most of the following theorem. THEOREM 4.21.
The following are equivalent:
(i)
a e K is a characteristic value of S;
(ii)
a e K is a zero of
a* = MP(S;X) ;
(iii) a e K is a zero of N rd (a) = CPrd(S;X) ; (iv)
there is a X 4= 0
in K such that (X-a )
is a left divisor of a in R. (i) implies (ii), by (4.14). because a* divides N
(ii) implies (iii),
(a), (iii) implies (iv): we
have (CPrd(S;X))d = CPZ(S;X) , and MP(S;X) = MPZ(S;X). Ordinary linear algebra tells us that any zero of CPZ(S;X) (in an extension field of Z) is also a zero of MP(S;X) = MPZ(S;X), so any zero of CPrd(S;X) must be a zero of MP(S;X).
This proves that (iii) implies
(ii); by (4.20) (ii) implies (iv); therefore (iii) implies (iv). Finally, (iv) implies (i), by (4.11). NOTE.
It is worth remarking that we can omit the
word left in statement (iv) of (4.21); for conditions
(ii) and (iv) say that; a e K bound of a iff for some isor of a.
is a zero of the (left)
X 4= 0 , (X-a ) is a left div-
It is quite clear that by working with
right modules rather than left, we could prove that a s K some
is a zero of the (right) bound of a iff for X 4= O , (X-a ) is a right divisor of a.
We long
ago ooserved that the right and left bounds of a are the same, so we may assert that (X-a) is a left divisor of a iff for some
X + 0 , (X-a ) is a right divi-
sor of a. We can use (4.21) to prove the corresponding result in the more general case.
Let V; be an n-dimen-
sional vector space over K, T s Hom(V), and let (e,,...,e 1 n) be an invariant list for T. THEOREM 4.22.
The following are equivalent:
(i)
a s K is a characteristic value of T;
(ii)
a s K is a zero of
(iii)
a e K is a zero of N r d (e 1 )---N r d (e ) -
e* = MP(T;X) ;
CP rd (T;X) ; (iv)
there is a
X + 0
in K such that (X-a^)
is a left divisor of e„ in R. n (i) implies (ii). V™ is isomorphic to the direct sum of the modules R/Re., 1 < i < n , so if Tv = av for some
v + 0
in V, we can find
^i^m,m^^r\
E
R »
such that under the aoove mentioned isomorphism.v
41
corresponds to implies that
(f,+Re,,...,f_+Re ) . Then (X-a)f. e Re.
Tv = av
for each i. We may assume
that the degree of f. is less than that of e. for each i, and that Then
f . 4= 0
for at least one j (since
«J
(X-a)f. s He . implies that 0
v | 0).
(X-a)f. = Xe. , X not
0
do
0. In turn, this implies that (X-a ) is a left divisor of e ., so by (4.21) a is a zero of e*.. Since e*. 0
divides
3
0
e* divides N
(en)»
e* we are through.
(ii) implies (iii).
Obvious, for
(iii) implies (iv). If a is a zero of N_rd(e_1)···N_rd(e_n), then a is a zero of some N_rd(e_j), so by (4.21) it also has the property that (X - a^λ) is a left divisor of e_j for some λ ≠ 0 in K. Since e_j | e_n, there is b ∈ R such that e_n = be_j. Consider the polynomial b(X - a^λ); by virtue of the note following (4.21), there is a μ ≠ 0 in K such that (X - a^μ) is a left divisor of b(X - a^λ) and therefore also of be_j = e_n.

(iv) implies (i). Let T_n be the restriction of T to R/Re_n. By (4.11) a is a characteristic value of T_n. Choose
f ∈ R, f ≠ 0, such that (X - a^λ)f = e_n. Taking v ∈ V to be the (non-zero) vector identified with (0, ..., 0, λf + Re_n), we have Tv = av, as required.

Before closing this section, let us use the results just obtained to draw some interesting conclusions about 'zeros' of polynomials in R.

DEFINITION 4.23. Let a ∈ R, a ≠ 0. An element α of K will be called a PSEUDO-ZERO of a if for some λ ≠ 0 in K, (X - α^λ) is a left (or right) divisor of a.

COROLLARY 4.24. Let a ∈ R, a ≠ 0, α ∈ K. Then α is a pseudo-zero of a iff α is a zero of a*.

This is a restatement of part of (4.21).

The next theorem is the non-commutative version of the fundamental theorem of algebra.

THEOREM 4.25. Let a ∈ R have degree n ≥ 1. Then a has at most n non-conjugate pseudo-zeros in K.

Let U = R/Ra, and let S ∈ Hom(U) be multiplication by X on U. By (4.10) S has at most n non-conjugate characteristic values in K. Thus by (4.21) and (4.23) the proof is complete.
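To make the bound concrete in a case the thesis's framework covers, take K to be the real quaternions and R = K[X] with X central (this example is mine, not the text's). The polynomial X^2 + 1 factors as (X - i)(X + i), and its pseudo-zeros are exactly the quaternions q with q^2 = -1; by the standard fact that two quaternions are conjugate iff they have the same real part and the same norm, these form a single conjugacy class, within the bound n = 2. A minimal sketch (all function names are mine):

```python
# Quaternions as 4-tuples (a, b, c, d) representing a + bi + cj + dk.
def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def polymul(f, g):
    # Multiply polynomials with quaternion coefficients and central X:
    # coefficients convolve, but the coefficient products do not commute.
    h = [(0, 0, 0, 0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = tuple(x + y for x, y in zip(h[i + j], qmul(a, b)))
    return h

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
NEG = lambda q: tuple(-x for x in q)

# (X - i)(X + i) = X^2 + 1 in R, so i is a pseudo-zero of X^2 + 1 ...
assert polymul([NEG(I), ONE], [I, ONE]) == [ONE, (0, 0, 0, 0), ONE]

# ... as are j and k: each satisfies q^2 = -1, and all such q are
# conjugate (real part 0, norm 1) -- one class, within the bound deg = 2.
for q in (I, J, K):
    assert qmul(q, q) == NEG(ONE)
```

The point of the sketch is only that a polynomial of degree 2 can have infinitely many pseudo-zeros, yet they fall into at most 2 conjugacy classes.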
CHAPTER 5

DIAGONALIZABLE OPERATORS AND CANONICAL FORMS
Our first theorem is the well known Primary Decomposition Theorem.

THEOREM 5.1. Let V be an n-dimensional vector space over K, and let T ∈ Hom(V). Suppose that MP(T;X) = (p_1)^{r_1}(p_2)^{r_2}···(p_k)^{r_k}, where the p_i's are monic irreducible polynomials in C, and the r_i's are positive integers. Let W_i = { v ∈ V : p_i(T)^{r_i}(v) = 0 }, 1 ≤ i ≤ k, be the null space of the transformation p_i(T)^{r_i}. Then:

(i) V = W_1 ⊕ ··· ⊕ W_k.

(ii) T(W_i) ⊆ W_i, 1 ≤ i ≤ k.

(iii) If T_i is the restriction of T to W_i, then MP(T_i;X) = (p_i)^{r_i}.

The proof found in most elementary books on linear algebra (e.g., (7)) remains valid in the non-commutative case.

Until further notice V and T are fixed, and for each a ∈ K, W_a will represent the characteristic space of T associated with a. (See (4.7).)
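When K is a (commutative) field, (5.1) specializes to the classical primary decomposition, and conclusion (i) can be checked numerically. The matrix below, with minimal polynomial (X - 1)(X - 2), is a hypothetical illustration of mine, not an example from the text:

```python
import numpy as np

# T has minimal polynomial (X - 1)(X - 2): p_1 = X - 1, p_2 = X - 2,
# r_1 = r_2 = 1, so W_i is the null space of (T - i I).
T = np.array([[1.0, 1.0],
              [0.0, 2.0]])
n = T.shape[0]
I = np.eye(n)

def null_dim(A):
    # dimension of the null space = n - rank
    return A.shape[0] - np.linalg.matrix_rank(A)

# (i) of (5.1): V = W_1 (+) W_2, so the dimensions of the W_i sum to n.
assert null_dim(T - 1*I) + null_dim(T - 2*I) == n

# The minimal polynomial annihilates T:
assert np.allclose((T - 1*I) @ (T - 2*I), 0)
```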
DEFINITION 5.2. T is said to be DIAGONALIZABLE if V = Σ W_a, where a ranges over all of K.

THEOREM 5.3. T is diagonalizable iff there exist a_1, ..., a_k ∈ K, k ≤ n, such that the a_i's are non-conjugate in pairs and V = W_{a_1} + ··· + W_{a_k}.

This follows immediately from (5.2) and (4.9).
THEOREM 5.4. T is diagonalizable iff there is a basis of V, each element of which is a characteristic vector of T.

Suppose T is diagonalizable; then V = Σ_{i=1}^{k} W_{a_i} as in (5.3). Recalling that W_{a_i} = Σ_{λ≠0} X_{a_i^λ}, it is clear that one can choose a K-basis for W_{a_i}, each element of which is a characteristic vector of T. Conversely, if {v_1, ..., v_n} is a basis of V and Tv_i = a_iv_i for each i, arrange the v_i's so that {a_1, ..., a_k} is a complete set of pairwise non-conjugate elements of the set {a_1, ..., a_n}. Then for each j > k, there is a unique i ≤ k such that a_j is conjugate to a_i, and W_{a_j} = W_{a_i}. Thus V = W_{a_1} + ··· + W_{a_k}, so by (4.9) the proof is complete.

DEFINITION 5.5.
A set {a_1, ..., a_k} ⊆ K will be called a COMPLETE SET OF CHARACTERISTIC VALUES of T if every characteristic value of T is conjugate to exactly one of the a_i's.

THEOREM 5.6.
Let T be diagonalizable, and let {a_1, ..., a_k} be a complete set of characteristic values of T. Then MP(T;X) = (X-a_1)*···(X-a_k)*.

Let f = MP(T;X), and let {v_1, ..., v_n} be the basis of V guaranteed by (5.4). As in the proof of (5.4), arrange the v_i's so that Tv_i = a_iv_i, where, for i > k, a_i is conjugate to one of a_1, ..., a_k. Thus, f(T) = 0 implies that f annihilates the diagonal matrix diag(a_1, ..., a_n) afforded by T; that is,

0 = f(diag(a_1, ..., a_n)) = diag(f(a_1), ..., f(a_n)) .

But then f(a_i) = 0 for each i ≤ n. Therefore (X-a_i)* divides f for each i. By (4.15), the polynomials (X-a_i)*, 1 ≤ i ≤ k, are distinct; all are irreducible, so they are relatively prime in pairs. We conclude that their product (X-a_1)*···(X-a_k)* divides f. Let f' denote this product; by (4.16) each (X-a_i), i ≤ k, is a left divisor of f', so f'(a_i) = 0 for i ≤ k. It follows that f'(a_i) = 0 for all i ≤ n, so f'(diag(a_1, ..., a_n)) = 0; thus f' annihilates the R-module V_T. But then f | f', so f = f'.
We could also give a non-matrix proof of theorem (5.6) by arguing that for 1 ≤ i ≤ n, the vector space Kv_i is isomorphic to R/R(X-a_i) as an R-module, so that V_T ≅ ⊕_{i=1}^{n} R/R(X-a_i). For each j > k, there is a unique i ≤ k for which R/R(X-a_j) and R/R(X-a_i) have the same annihilator in R. From this, and from (4.4), it follows that the annihilator of V_T is the least common multiple of the polynomials (X-a_i)*, 1 ≤ i ≤ k, i.e., (X-a_1)*···(X-a_k)*. This proves the theorem, for the annihilator of V_T is MP(T;X).

LEMMA 5.7.

Assume that n ≥ 2 and that the result holds for all matrices of order n-1 which satisfy conditions (i)-(iii) of the lemma.
If a_{1j} = a_{j1} = 0 for all j > 1, then by the induction hypothesis we are through. Otherwise, some a_{1j} or a_{j1} is different from 0; for notational convenience assume a_{21} = α ≠ 0. Using a_{21} to eliminate all other column 1 entries, we obtain a matrix whose first column is (0, α, 0, ..., 0) and whose first row is (0, g_2, g_3, ..., g_n), where

g_j = a_{1j} - a_{11}α^{-1}a_{2j} ,  j = 2, ..., n .

Note that g_2 is monic of degree m+1, while g_3, ..., g_n are all of degree m. All of the entries in the lower right hand (n-2)×(n-1) block are constants except those marked "lin", which are either constants or linear polynomials; those polynomials marked "lin" along the upper diagonal of this block are all monic of first degree. We now use α to eliminate all a_{2j}, j > 2, and then interchange rows 1 and 2, at the same time changing the signs of the new row two entries. This gives us:
[Matrix display: α occupies the (1,1) position, with zeros elsewhere in the first row and first column; the second row is (0, -g_2, -g_3, ..., -g_n); the entries marked "lin" along the upper diagonal of the lower right hand block are monic of first degree.]
Notice that by adding an appropriate constant multiple of the j-th column (j > 2) to the 2nd column we can insure that all of the entries (3,2), (4,2), ..., (n,2) are constants, while keeping the leading coefficient of the (2,2) entry unchanged. Finally, use the algorithm in R to reduce the (2,3), (2,4), ..., (2,n) entries to constants, by adding to them appropriate left multiples of the (3,3), (4,4), ..., (n,n) elements. None of these alterations affects either the degree or the leading coefficient of the (2,2) entry, for that polynomial has degree m+1 > 1. The result is a matrix of the required form.
Let x, y ∈ R be monic, with Rx ⊉ RyR. Then there is a monic polynomial b such that yb ∉ Rx. Using the algorithm, we get:

yb = q_0x + r_1 ,  degree(r_1) < degree(x) ,
x = q_1r_1 + r_2 ,  degree(r_2) < degree(r_1) ,
. . .
r_{t-1} = q_tr_t .
Let λ be the leading coefficient of r_t. We now choose elements from SL(n,R) to effect the transformations indicated below by arrows.

[Matrix display: a chain of 2×2 matrices, beginning

x 0      x  0      x  0
0 y  ->  yb y  ->  r_1 y  ->  ... ,

whose later stages have first-column entries r_1, r_2, ... and second-column entries of the form ±P(q_i, ..., q_1)y, after which the process starts over again.]

Because degrees are continually reduced, this must lead finally to a matrix which satisfies the requirements of the lemma. This completes the proof of (1.12).
APPENDIX II
The characterizations of similarity noted in chapter 1 are deficient in two respects: first, they suggest no general method for determining whether two given elements of a ring are similar; second, they provide no method for constructing similar elements (except for associates). Here we give a characterization of similarity in a non-commutative Euclidean ring, which eliminates at least the second of these difficulties.

Let R be a ring with 1 having both left and right division algorithms and no zero divisors; we shall assume the definition of the function P on finite sequences of elements of R to be known (see Appendix I, definition 6). It is an easy induction to prove that if n ≥ 2 and x_1, ..., x_n ∈ R, then

P(x_1, ..., x_n) = P(x_1, ..., x_{n-1})x_n + P(x_1, ..., x_{n-2}) .

Readers familiar with continued fractions will recognize P as the recursive function which gives the numerators and denominators of the convergents.
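The recursion determines P once its values on short sequences are fixed. Taking P of the empty sequence to be 1 and P(x_1) = x_1 (an assumption consistent with the recursion, since definition 6 of Appendix I is not reproduced here), the commutative case recovers exactly the numerator recurrence for convergents. A sketch:

```python
from fractions import Fraction

def P(xs):
    # P() = 1, P(x1) = x1, and
    # P(x1, ..., xn) = P(x1, ..., x_{n-1}) * xn + P(x1, ..., x_{n-2}).
    # In a non-commutative ring the order of the product matters.
    if len(xs) == 0:
        return 1
    if len(xs) == 1:
        return xs[0]
    return P(xs[:-1]) * xs[-1] + P(xs[:-2])

def cf_value(a):
    # value of the continued fraction [a0; a1, ..., an]
    v = Fraction(a[-1])
    for x in reversed(a[:-1]):
        v = x + 1 / v
    return v

a = [1, 2, 2]
assert cf_value(a) == Fraction(7, 5)
assert P(a) == 7     # P gives the numerator of the convergent
```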
Suppose x_1, y_1 ∈ R are similar. If they are in K they are associates; if they are not in K we can choose x_2 ∈ R of lower degree than x_1, such that Rx_1 + Rx_2 = R and Rx_1 ∩ Rx_2 = Ry_1x_2. From the division algorithm we get

x_1 = w_1x_2 + x_3 ,
x_2 = w_2x_3 + x_4 ,      (1)
. . .

with deg(x_{i+1}) < deg(x_i) at each step, and, by induction from the recursion for P,

x_1 = P(w_1, ..., w_i)x_{i+1} + P(w_1, ..., w_{i-1})x_{i+2} for i