English Pages [162] Year 2016
Computer Systems Architecture exercises
Computer Systems Architecture Exercises
Drawing by Tamar Yadin
Aharon Yadin
© Revision 1.0
Preface
This booklet contains many exercises related to various chapters of the book. The intention is to provide additional practice material for students. The exercises are grouped by book chapter and by learning subject. Students do not have to solve all the exercises. For each subject, the student should try to solve several exercises. If these are solved correctly, the student can proceed to the next subject. If, on the other hand, the solutions are wrong, the student is advised to continue solving additional exercises. For the calculation exercises, there are solutions at the end of the booklet.
Revision History
Date | Changes | Revision
October 2016 | Original document | 1.0
Chapter 2 – Data Representation exercises
1. Decimal to Binary conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Decimal Number | Binary Number
1 | 123 | 111 1011
2 | 1 023 |
3 | 765 |
4 | 9 898 |
5 | 3 999 |
6 | 159 |
7 | 762 |
8 | 7 602 |
9 | 5 577 |
10 | 2 004 |
11 | 11 000 |
12 | 35 355 |
13 | 404 |
14 | 1 234 |
15 | 4 949 |
16 | 573 |
17 | 669 |
18 | 917 |
19 | 8 192 |
20 | 7 811 |
21 | 8 642 |
22 | 3 789 |
23 | 7 003 |
24 | 3 887 |
25 | 6 423 |
26 | 12 129 |
27 | 27 298 |
28 | 19 999 |
29 | 9 873 |
30 | 17 399 |
31 | 57 |
32 | 634 |
33 | 9 824 |
34 | 10 000 |
35 | 5 665 |
36 | 7 991 |
37 | 999 |
38 | 800 |
39 | 3 333 |
40 | 7 007 |
41 | 12 123 |
42 | 255 |
43 | 7 777 |
44 | 5 656 |
45 | 4 321 |
46 | 99 |
47 | 375 |
48 | 1 010 |
49 | 8 119 |
50 | 6 944 |
51 | 2 468 |
52 | 1 753 |
53 | 4 762 |
54 | 8 117 |
55 | 1 928 |
56 | 7 956 |
57 | 19 175 |
58 | 22 222 |
59 | 7 275 |
60 | 1 983 |
61 | 5 555 |
62 | 36 133 |
63 | 11 223 |
64 | 4 590 |
65 | 21 325 |
66 | 9 176 |
67 | 9 |
68 | 81 |
69 | 5 933 |
70 | 9 724 |
71 | 5 311 |
72 | 14 000 |
73 | 781 |
74 | 35 |
75 | 28 753 |
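The decimal-to-binary conversions above can be checked with a short script. The following is a minimal sketch of the repeated-division-by-2 method; the function names `to_binary` and `group4` are illustrative and do not come from the book.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n < 0:
        raise ValueError("only unsigned values are handled here")
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # each remainder is the next bit, least significant first
        bits.append(str(remainder))
    return "".join(reversed(bits))


def group4(bits: str) -> str:
    """Group bits in fours from the right, the way the tables print them."""
    chunks = []
    while bits:
        chunks.append(bits[-4:])
        bits = bits[:-4]
    return " ".join(reversed(chunks))


print(group4(to_binary(123)))  # solved example from the table: 111 1011
```

Working through the remainders by hand first and using the script only to confirm the answer keeps the exercise useful.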
2. Binary to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Binary Number | Decimal Number
1 | 1 0000 0000 | 256
2 | 1100 1100 1101 |
3 | 1 0010 0011 0100 |
4 | 1 1000 0100 0010 |
5 | 100 0000 0001 |
6 | 1010 1010 1010 |
7 | 1 0000 0000 0001 |
8 | 100 1000 1000 1101 |
9 | 1100 1100 1100 |
10 | 111 1000 0111 |
11 | 1 0000 0010 0100 |
12 | 10 1010 1010 1010 |
13 | 111 0111 0111 0111 |
14 | 111 1011 0110 |
15 | 1101 0000 1101 |
16 | 1110 1001 1110 |
17 | 101 0010 1100 0001 |
18 | 101 0000 0000 1010 |
19 | 101 0001 0001 1111 |
20 | 100 1101 1100 0001 |
21 | 1 1111 0011 0011 |
22 | 11 0001 1111 1111 |
23 | 11 0100 0101 0110 |
24 | 111 1010 0001 0011 |
25 | 111 1010 1011 1100 |
26 | 11 0011 1001 1000 |
27 | 1111 1010 1010 |
28 | 10 1000 1100 1001 |
29 | 1011 0101 0101 |
30 | 110 1110 1111 |
31 | 11 1111 0100 0000 |
32 | 1 0001 0000 0001 |
33 | 1 1111 1111 1111 |
34 | 110 0110 1000 0100 |
35 | 1001 1001 1001 |
36 | 1001 1010 1010 |
37 | 11 0110 0111 |
38 | 1111 0000 0000 |
39 | 1000 0000 0001 |
40 | 1010 0000 1111 |
41 | 1000 1001 1010 1011 |
42 | 1001 0111 0110 |
43 | 11 0011 0011 0011 |
44 | 1001 0111 0101 0011 |
45 | 11 1000 1010 |
46 | 1011 1011 1000 |
47 | 1001 1110 1110 |
48 | 1110 1110 1001 |
49 | 101 0001 1110 0101 |
50 | 11 0000 0000 0011 |
51 | 10 0100 0110 1001 |
52 | 111 1010 1011 |
53 | 1 1010 0010 0010 |
54 | 101 0001 0001 0001 |
55 | 111 0110 0101 0100 |
56 | 1001 0010 0011 0100 |
57 | 1 1001 0010 1000 |
58 | 1000 0010 0111 0011 |
59 | 1011 1100 1100 |
60 | 1010 1001 1000 |
61 | 110 1000 1010 0001 |
62 | 100 1011 0001 1010 |
63 | 10 1011 0110 1001 |
64 | 111 0001 0101 0111 |
65 | 111 0101 1001 0100 |
66 | 1 1111 0000 1000 |
67 | 1 0101 1011 |
68 | 10 1011 0000 0011 |
69 | 10 0010 1011 1000 |
70 | 1 1000 1011 1101 |
71 | 1110 1001 1001 |
72 | 1 1110 1000 1111 |
73 | 110 1100 0001 1001 |
74 | 100 1010 0011 1101 |
75 | 10 0000 0011 0110 |
76 | 100 0000 0000 0000 |
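For the binary-to-decimal direction, one possible check is Horner's rule: scan the bits from the left, doubling the running value and adding each bit. This is a sketch only; the name `binary_to_decimal` is an assumption made for illustration.

```python
def binary_to_decimal(text: str) -> int:
    """Evaluate an unsigned binary string; spaces between 4-bit groups are ignored."""
    value = 0
    for ch in text.replace(" ", ""):
        if ch not in "01":
            raise ValueError(f"not a binary digit: {ch!r}")
        value = value * 2 + int(ch)  # shift the running value left, then add the new bit
    return value


print(binary_to_decimal("1 0000 0000"))  # solved example from the table: 256
```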
3. Decimal to Hexadecimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Decimal Number | Hexadecimal Number
1 | 31 987 | 7CF3
2 | 21 007 |
3 | 17 000 |
4 | 14 555 |
5 | 17 800 |
6 | 15 841 |
7 | 14 773 |
8 | 17 183 |
9 | 79 891 |
10 | 22 888 |
11 | 12 871 |
12 | 21 680 |
13 | 14 000 |
14 | 29 612 |
15 | 16 658 |
16 | 22 888 |
17 | 10 000 |
18 | 29 999 |
19 | 30 651 |
20 | 31 113 |
21 | 30 001 |
22 | 17 325 |
23 | 9 876 |
24 | 21 111 |
25 | 911 |
26 | 14 590 |
27 | 17 618 |
28 | 9 784 |
29 | 11 011 |
30 | 8 933 |
31 | 12 617 |
32 | 21 039 |
33 | 10 000 |
34 | 6 785 |
35 | 12 777 |
36 | 24 242 |
37 | 19 898 |
38 | 6 444 |
39 | 717 |
40 | 3 982 |
41 | 10 986 |
42 | 519 |
43 | 4 102 |
44 | 22 097 |
45 | 27 963 |
46 | 16 741 |
47 | 3 785 |
48 | 9 261 |
49 | 4 022 |
50 | 26 789 |
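The repeated-division method is not specific to base 2. The sketch below generalizes it to any base from 2 to 16, so it can be used to check the hexadecimal answers here as well as conversions to octal or other bases. The helper name `to_base` and the `DIGITS` table are illustrative assumptions, not part of the book.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to the given base by repeated division."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("only unsigned values are handled here")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)  # the remainder is the next digit, lowest first
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(to_base(31987, 16))  # solved example from the table: 7CF3
```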
4. Hexadecimal to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Hexadecimal Number | Decimal Number
1 | 8 000 | 32 768
2 | 6 B8C |
3 | 3 5DD |
4 | 3 E1D |
5 | 2 B67 |
6 | 3 FCD |
7 | 2 6A0 |
8 | 4 705 |
9 | A FC8 |
10 | 2 FAD |
11 | 8 791 |
12 | 7 A7A |
13 | B 00F |
14 | A 000 |
15 | 7 575 |
16 | 2 FAD |
17 | 5 AB0 |
18 | 5 7E4 |
19 | 4 944 |
20 | 313 |
21 | 3 745 |
22 | 6 20B |
23 | 2 2CF |
24 | 5 A9B |
25 | 9 C40 |
26 | 20F |
27 | 7E0 |
28 | 7 97B |
29 | 2 F59 |
30 | 2 333 |
31 | 4 2E0 |
32 | 2 000 |
33 | 1 D6B |
34 | 395 |
35 | 3 30A |
36 | 2 222 |
37 | 4 D6D |
38 | 4 36F |
39 | 2C7 |
40 | 1 B63 |
41 | DAB |
42 | FFA |
43 | 1 F62 |
44 | 1 565 |
45 | 393 |
46 | 9A |
47 | 4 7B3 |
48 | 4 267 |
49 | 2 DE2 |
50 | 2 107 |
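Converting from hexadecimal (or any other base) to decimal is a sum of digit-times-weight terms. The following sketch makes the individual terms visible, which can help when locating a mistake in a hand calculation. The names `positional_terms` and `from_base` are assumptions made for this illustration.

```python
def positional_terms(text: str, base: int):
    """Return (digit value, positional weight) pairs, lowest position first."""
    digits = text.replace(" ", "")  # the tables print a space as a thousands separator
    terms = []
    for position, ch in enumerate(reversed(digits)):
        # int(ch, base) raises ValueError if the digit is not valid in this base
        terms.append((int(ch, base), base ** position))
    return terms


def from_base(text: str, base: int) -> int:
    """Sum digit * weight over all positions."""
    return sum(digit * weight for digit, weight in positional_terms(text, base))


print(positional_terms("8 000", 16))  # the only non-zero term is 8 * 4096
print(from_base("8 000", 16))         # solved example from the table: 32768
```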
5. Decimal to Octal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Decimal Number | Octal Number
1 | 3 755 | 7 253
2 | 11 111 |
3 | 1 733 |
4 | 12 121 |
5 | 9 875 |
6 | 9 077 |
7 | 96 017 |
8 | 23 777 |
9 | 35 000 |
10 | 33 333 |
11 | 44 330 |
12 | 45 145 |
13 | 53 723 |
14 | 2 323 |
15 | 47 474 |
16 | 32 432 |
17 | 53 521 |
18 | 39 999 |
19 | 22 555 |
20 | 40 960 |
21 | 261 549 |
22 | 20 480 |
23 | 32 767 |
24 | 319 |
25 | 26 788 |
26 | 10 976 |
27 | 17 435 |
28 | 8 099 |
29 | 11 099 |
30 | 25 000 |
31 | 9 822 |
32 | 12 377 |
33 | 5 077 |
34 | 7 333 |
35 | 49 |
36 | 18 555 |
37 | 9 000 |
38 | 7 102 |
39 | 8 989 |
40 | 15 415 |
41 | 2 017 |
42 | 6 522 |
43 | 16 581 |
44 | 14 037 |
45 | 9 378 |
46 | 14 043 |
47 | 9 998 |
48 | 5 037 |
49 | 17 321 |
50 | 12 055 |
6. Octal to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Octal Number | Decimal Number
1 | 4 227 | 2 199
2 | 23 417 |
3 | 1 733 |
4 | 6 474 |
5 | 36 475 |
6 | 1 761 |
7 | 17 630 |
8 | 47 037 |
9 | 76 220 |
10 | 67 175 |
11 | 137 743 |
12 | 114 265 |
13 | 67 742 |
14 | 2 274 |
15 | 27 351 |
16 | 105 033 |
17 | 202 152 |
18 | 121 666 |
19 | 51 167 |
20 | 65 500 |
21 | 17 777 |
22 | 77 123 |
23 | 37 000 |
24 | 325 |
25 | 1 111 |
26 | 12 345 |
27 | 25 252 |
28 | 6 666 |
29 | 70 001 |
30 | 56 712 |
31 | 1 000 |
32 | 63 451 |
33 | 36 712 |
34 | 3 412 |
35 | 6 711 |
36 | 7 033 |
37 | 4 510 |
38 | 2 777 |
39 | 10 666 |
40 | 12 012 |
41 | 43 651 |
42 | 33 773 |
43 | 6 236 |
44 | 1 177 |
45 | 17 451 |
46 | 6 755 |
47 | 23 773 |
48 | 5 000 |
49 | 16 666 |
50 | 7 654 |
7. Bases conversions (3-10)
All numbers are positive (unsigned numbers). The first line is a solved example.
No. 1
Decimal Number 159
2
1 991
Base 3
Base 4
12 220
2 133
Base 5 1 114
Base 6 423
133 102
4
304
5
363
6 789 1 210 010
8
33 221
9
24 102
10
21 241
11
20 352
12 13
8 192 101 021 210
14
332 133
15
112 421
16
51 155
17
20 410
18 19 20 21
315
101 111 111
3
7
Base 7
2 966 2 121 021 120 132 21 401
22
5 153
23
11 520
24
No. 25
Decimal Number 1 733
Base 3
Base 4
Base 5
Base 6
22 212 210
26
20 321 323
27
24 021
28
213 342
29
212 660
30 31
16 220 1 101 021 111
32
21 031 233
33
424 030
34
114 143
35
6 426
36 37
21 351 111 202 121
38
10 322 320
39
331 313
40
140 543
41
22 015
42 43
6 236 1 121 121
44
3 201 303
45
222 010
46
302 021
47
20 402
48 49 50
Base 7
16 666 101 021 022
8. Bases conversions (10-15)
All numbers are positive (unsigned numbers). The first line is a solved example.
No. 1
Decimal Number 1234
2
3591
Base 11
Base 12
Base 13
Base 14
Base 15
A22
86A
73C
642
574
586A
3
3580
4
101C
5
1386
6
13AC
7 8
2987 911A
9
5953
10
3B3B
11
2939
12
1DBC
13 14
3926 4151
15
4238
16
5BB0
17
4099
18
2E74
19 20 21 22
13456 15211 B6A7 62BC
23
5655
24
9. Decimal fractions to Binary fractions conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Decimal Fraction | Binary Fraction
1 | 0.4375 | 0.0111
2 | 0.1875 |
3 | 0.625 |
4 | 0.1171875 |
5 | 0.546875 |
6 | 0.875 |
7 | 0.34375 |
8 | 0.9375 |
9 | 0.375 |
10 | 0.40625 |
11 | 0.140625 |
12 | 0.65625 |
13 | 0.234375 |
14 | 0.734375 |
15 | 0.8125 |
16 | 0.8984375 |
17 | 0.71875 |
18 | 0.109375 |
19 | 0.578125 |
20 | 0.953125 |
21 | 0.5546875 |
22 | 0.4765625 |
23 | 0.13671875 |
24 | 0.49609375 |
10. Binary fractions to Decimal fractions conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No. | Binary Fraction | Decimal Fraction
1 | 0.1 | 0.5
2 | 0.001 |
3 | 0.1111111 |
4 | 0.010001 |
5 | 0.010111 |
6 | 0.000001 |
7 | 0.00011 |
8 | 0.1011 |
9 | 0.00101 |
10 | 0.000101 |
11 | 0.10001 |
12 | 0.001101 |
13 | 0.11101 |
14 | 0.11 |
15 | 0.100001 |
16 | 0.00111 |
17 | 0.1001 |
18 | 0.010011 |
19 | 0.011011 |
20 | 0.11111 |
21 | 0.0101101 |
22 | 0.1100101 |
23 | 0.0000011 |
24 | 0.00011011 |
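Both fraction exercises (decimal to binary and binary to decimal) can be checked with exact rational arithmetic. The sketch below uses repeated multiplication by 2 in one direction and a sum of negative powers of 2 in the other. The function names and the `max_bits` cut-off are assumptions introduced for this example.

```python
from fractions import Fraction


def fraction_to_binary(decimal_text: str, max_bits: int = 32) -> str:
    """Convert a decimal fraction in [0, 1) to binary by repeated multiplication by 2."""
    value = Fraction(decimal_text)
    if not 0 <= value < 1:
        raise ValueError("expected a fraction in the range [0, 1)")
    bits = []
    while value != 0 and len(bits) < max_bits:
        value *= 2
        bit = 1 if value >= 1 else 0  # the integer part is the next binary digit
        bits.append(str(bit))
        value -= bit
    return "0." + ("".join(bits) or "0")


def binary_to_fraction(binary_text: str) -> Fraction:
    """Sum bit * 2**-i over the digits after the binary point."""
    _, _, digits = binary_text.partition(".")
    total = Fraction(0)
    for i, ch in enumerate(digits, start=1):
        if ch not in "01":
            raise ValueError(f"not a binary digit: {ch!r}")
        total += Fraction(int(ch), 2 ** i)
    return total


print(fraction_to_binary("0.4375"))      # solved example: 0.0111
print(float(binary_to_fraction("0.1")))  # solved example: 0.5
```

Using `Fraction` instead of floating point avoids rounding noise; the `max_bits` limit matters only for fractions that have no finite binary expansion.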
11. Negative numbers representations
Convert the negative decimal number to binary (16 bits). The first line is a solved example.
No. | Decimal Number | One's Complement | Two's Complement | Sign and Magnitude
1 | -4095 | 1111 0000 0000 0000 | 1111 0000 0000 0001 | 1000 1111 1111 1111
2 | -7612 | | |
3 | -1985 | | |
4 | -5777 | | |
5 | -8000 | | |
6 | -729 | | |
7 | -3333 | | |
8 | -4799 | | |
9 | -6222 | | |
10 | -3488 | | |
11 | -8190 | | |
12 | -9999 | | |
13 | -7231 | | |
14 | -3 | | |
15 | -127 | | |
16 | -4777 | | |
17 | -1741 | | |
18 | -7676 | | |
19 | -4288 | | |
20 | -3636 | | |
21 | -1901 | | |
22 | -8076 | | |
23 | -5555 | | |
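The three 16-bit encodings used in this exercise can be produced mechanically. This is a minimal sketch, assuming a 16-bit word and magnitudes no larger than 32767; the function names are illustrative.

```python
def check_range(n: int) -> None:
    if abs(n) > 0x7FFF:
        raise ValueError("magnitude does not fit in 15 bits")


def sign_magnitude(n: int) -> int:
    check_range(n)
    return (0x8000 | -n) if n < 0 else n   # set the sign bit, keep the magnitude


def ones_complement(n: int) -> int:
    check_range(n)
    return (~(-n)) & 0xFFFF if n < 0 else n  # invert every bit of the magnitude


def twos_complement(n: int) -> int:
    check_range(n)
    return n & 0xFFFF                        # one's complement plus one, modulo 2**16


def fmt(word: int) -> str:
    bits = format(word, "016b")
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))


for name, encode in (("one's complement", ones_complement),
                     ("two's complement", twos_complement),
                     ("sign and magnitude", sign_magnitude)):
    print(f"{name}: {fmt(encode(-4095))}")  # compare with the solved first line
```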
12. Numbers representations
Assuming the hexadecimal number represents a 16-bit signed binary number, convert it to a decimal number. The three columns represent the binary notation. The first line is a solved example.
No. | Hexadecimal Number | One's Complement (decimal) | Two's Complement (decimal) | Sign and Magnitude (decimal)
1 | 64 | 100 | 100 | 100
2 | 8FFF | | |
3 | 6ABC | | |
4 | F0F0 | | |
5 | FC92 | | |
6 | FAB0 | | |
7 | 2000 | | |
8 | EEEE | | |
9 | D012 | | |
10 | 8111 | | |
11 | 7FFF | | |
12 | BAD1 | | |
13 | A000 | | |
14 | E900 | | |
15 | D780 | | |
16 | 9BBB | | |
17 | AAAA | | |
18 | 6819 | | |
19 | 8750 | | |
20 | C009 | | |
21 | ABCD | | |
22 | DFEF | | |
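Decoding works in the opposite direction: the same 16-bit pattern yields a different decimal value depending on the notation assumed. The following sketch decodes one hexadecimal word under all three notations; the function name `decode16` and the dictionary keys are assumptions made for illustration.

```python
def decode16(hex_text: str) -> dict:
    """Interpret a 16-bit hexadecimal word under three signed notations."""
    raw = int(hex_text, 16)
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    negative = bool(raw & 0x8000)  # the most significant bit is the sign in all three
    return {
        "ones_complement": -((~raw) & 0xFFFF) if negative else raw,
        "twos_complement": raw - 0x10000 if negative else raw,
        "sign_magnitude": -(raw & 0x7FFF) if negative else raw,
    }


print(decode16("64"))  # solved first line: the sign bit is clear, so all three give 100
```

For a positive pattern the three notations agree; they differ only when the sign bit is set.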
13. Adding binary numbers
The first line is a solved example.
No. | First Number | Second Number | Result
1 | 0011 0000 1101 0100 | 0010 0111 0000 0011 | 0101 0111 1101 0111
2 | 0010 0111 0000 1111 | 0010 1110 1110 1100 |
3 | 0000 1010 1110 0100 | 0010 1110 0000 1000 |
4 | 0010 1011 0110 0111 | 0000 0010 0010 1011 |
5 | 0001 1111 1111 1111 | 0001 1111 1111 1111 |
6 | 0000 1111 1111 1111 | 0010 1010 1100 1010 |
7 | 0010 1010 0101 1010 | 0000 0111 1111 1111 |
8 | 0001 1010 1011 1100 | 0001 1011 1100 1101 |
9 | 0001 1111 1110 1111 | 0000 1111 1101 1011 |
10 | 0011 1011 1010 1101 | 0001 1101 1101 1101 |
11 | 0101 1001 1001 0011 | 0001 1111 1111 1111 |
12 | 0001 0010 0011 0100 | 0001 0010 0011 0100 |
13 | 0100 1010 1011 0101 | 0010 0101 0111 1011 |
14 | 0011 1111 1111 1111 | 0011 1111 1111 1111 |
15 | 0001 1010 1010 1010 | 0010 1101 1101 1101 |
16 | 0100 1110 1110 1110 | 0001 1100 1100 1100 |
17 | 0101 0101 0101 0101 | 0001 1111 1111 1111 |
18 | 0001 1001 1111 1011 | 0011 1111 1011 1001 |
19 | 0011 1010 0110 0111 | 0001 0110 0111 1010 |
20 | 0001 1111 1011 1110 | 0011 1010 1100 1101 |
21 | 0101 1010 1000 1011 | 0000 1010 1000 1011 |
22 | 0010 0111 0111 0111 | 0010 1011 1011 1011 |
23 | 0100 0100 0100 0100 | 0001 1100 1100 1100 |
14. Adding Hexadecimal unsigned numbers
The first line is a solved example.
No. | First Number | Second Number | Result
1 | 7895 | 4ABC | C351
2 | 1DDD | 2CCC |
3 | 3DEF | 2999 |
4 | 42BD | 2BFC |
5 | 4623 | 3BDE |
6 | 23DD | 3D1F |
7 | 5A96 | 1E0F |
8 | 8B1E | 1ED6 |
9 | 22FF | 3DF9 |
10 | 2D2D | 3F3F |
11 | 3DDE | 3EED |
12 | 17EF | 6E3F |
13 | 9ABC | 1234 |
14 | 5555 | 6789 |
15 | 1FFF | 6BAD |
16 | 6841 | 30FB |
17 | 351D | 4FE2 |
18 | 10FF | 3EC7 |
19 | 22EE | 3DC9 |
20 | 6699 | 1EB5 |
21 | 3D8E | 3F0F |
22 | 90E9 | 2E3F |
23 | 4BA4 | 5CB5 |
15. Adding Octal unsigned numbers
The first line is a solved example.
No. | First Number | Second Number | Result
1 | 3125 | 5632 | 10757
2 | 7777 | 1111 |
3 | 21430 | 6705 |
4 | 23561 | 17777 |
5 | 36217 | 23567 |
6 | 15374 | 25517 |
7 | 52527 | 12774 |
8 | 22337 | 33557 |
9 | 17652 | 21576 |
10 | 11666 | 55645 |
11 | 21766 | 51622 |
12 | 62357 | 22366 |
13 | 12345 | 12345 |
14 | 76543 | 1111 |
15 | 32145 | 12306 |
16 | 17365 | 36547 |
17 | 47147 | 15647 |
18 | 16736 | 26576 |
19 | 26077 | 25716 |
20 | 37437 | 26726 |
21 | 42637 | 20675 |
22 | 33512 | 32645 |
23 | 32325 | 35353 |
16. Adding Base 3 unsigned numbers
The first line is a solved example.
No. | First Number | Second Number | Result
1 | 111020 | 11221 | 200011
2 | 11101 | 100022 |
3 | 10121 | 20120 |
4 | 10202 | 21111 |
5 | 11010 | 22020 |
6 | 11120 | 21220 |
7 | 101201 | 20111 |
8 | 101011 | 102121 |
9 | 21101 | 21102 |
10 | 21201 | 21220 |
11 | 22020 | 20120 |
12 | 101120 | 20021 |
13 | 102010 | 2001 |
14 | 100212 | 20011 |
15 | 101210 | 21002 |
16 | 20101 | 12201 |
17 | 22122 | 20120 |
18 | 101201 | 21110 |
19 | 100110 | 20111 |
20 | 21202 | 12111 |
21 | 21211 | 101010 |
22 | 102101 | 11121 |
23 | 22002 | 101011 |
17. Adding Base 4 unsigned numbers
The first line is a solved example.
No. | First Number | Second Number | Result
1 | 30110 | 10320 | 101030
2 | 22323 | 11120 |
3 | 20223 | 11031 |
4 | 21223 | 11322 |
5 | 22333 | 10332 |
6 | 30312 | 12233 |
7 | 30112 | 10331 |
8 | 22100 | 20311 |
9 | 32011 | 12003 |
10 | 31221 | 2001 |
11 | 23331 | 22032 |
12 | 33031 | 12300 |
13 | 31320 | 22122 |
14 | 32211 | 23020 |
15 | 22323 | 21121 |
16 | 23202 | 30322 |
17 | 30230 | 13212 |
18 | 32003 | 23113 |
19 | 32331 | 22233 |
20 | 33220 | 13310 |
21 | 103102 | 3222 |
22 | 101113 | 11031 |
23 | 31333 | 30122 |
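The addition exercises in binary, hexadecimal, octal, base 3 and base 4 all follow the same column-by-column procedure with a carry. The sketch below performs the addition directly on the digit strings, without converting to decimal first, which mirrors the manual method. The name `add_in_base` is an assumption made for this example.

```python
DIGITS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def add_in_base(a: str, b: str, base: int) -> str:
    """Add two unsigned numbers written in the given base, digit by digit with carry."""
    a = a.replace(" ", "").upper()
    b = b.replace(" ", "").upper()
    width = max(len(a), len(b))
    a, b = a.rjust(width, "0"), b.rjust(width, "0")  # align the columns
    carry = 0
    out = []
    for da, db in zip(reversed(a), reversed(b)):     # rightmost column first
        total = int(da, base) + int(db, base) + carry
        carry, digit = divmod(total, base)
        out.append(DIGITS[digit])
    if carry:
        out.append(DIGITS[carry])
    return "".join(reversed(out))


print(add_in_base("7895", "4ABC", 16))   # solved example, hexadecimal: C351
print(add_in_base("3125", "5632", 8))    # solved example, octal: 10757
print(add_in_base("111020", "11221", 3)) # solved example, base 3: 200011
print(add_in_base("30110", "10320", 4))  # solved example, base 4: 101030
```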
18. Multiplying binary numbers
All numbers are unsigned 16-bit numbers. The first line is a solved example.
No. | First Number | Second Number | Result
1 | 1111 1111 | 110 0010 | 110 0001 1001 1110
2 | 10 0011 | 101 0110 |
3 | 100 1000 | 1000 1010 |
4 | 1 1010 1011 0010 | 1101 |
5 | 101 1101 1111 | 1 0011 |
6 | 100 0000 1111 | 1 1111 |
7 | 11 1100 1011 | 1001 |
8 | 1 1000 0011 | 1 1101 |
9 | 111 1000 1001 | 1101 |
10 | 1101 1101 1110 | 110 |
11 | 1 0000 1000 1001 | 101 |
12 | 10 0011 1001 | 1011 |
13 | 1010 1011 1011 | 111 |
14 | 1 1010 1011 | 1 0010 |
15 | 11 0110 1001 | 1110 |
16 | 101 0101 0010 | 1 1110 |
17 | 1110 1110 0010 | 1011 |
18 | 1011 1010 1101 | 1110 |
19 | 1011 0000 0000 | 1101 |
20 | 100 1000 0011 | 1 1011 |
21 | 1011 1011 0001 | 110 |
22 | 1111 1111 | 1111 1111 |
23 | 1011 1011 | 1011 1011 |
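Binary multiplication by hand is a sequence of shifted partial products, one for every 1 bit in the multiplier. The sketch below implements this shift-and-add procedure and also reports whether the product still fits in 16 bits. The function name and the `width` parameter are assumptions for illustration.

```python
def shift_and_add(multiplicand: int, multiplier: int, width: int = 16):
    """Multiply two unsigned integers by adding shifted copies of the multiplicand."""
    product = 0
    shift = 0
    while multiplier:
        if multiplier & 1:                    # this multiplier bit is 1:
            product += multiplicand << shift  # add the multiplicand shifted into place
        multiplier >>= 1
        shift += 1
    overflow = product >= (1 << width)        # True if the result needs more than `width` bits
    return product, overflow


product, overflow = shift_and_add(0b11111111, 0b1100010)
print(format(product, "b"), overflow)  # solved example: 110000110011110 False
```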
19. Floating point (754) notation (signed integers)
Convert between the numbers. The first line is a solved example.
No. | Decimal Number | Floating point Number
1 | 1 | 3F800000
2 | 2 |
3 | | 40400000
4 | 5 |
5 | | 41300000
6 | 17 |
7 | | 41C00000
8 | 28 |
9 | | 42000000
10 | 49 |
11 | | 426C0000
12 | 65 |
13 | | 429E0000
14 | 85 |
15 | | 42C60000
16 | 127 |
17 | | 436E0000
18 | 256 |
19 | | 43808000
20 | 313 |
21 | | 43B18000
22 | 485 |
23 | | 43F50000
24 | 515 |
25 | | 44054000
26 | -4 |
27 | | C0E00000
28 | -13 |
29 | | C1980000
30 | -25 |
31 | | C1F80000
32 | -37 |
33 | | C22C0000
34 | -55 |
35 | | C2860000
36 | -73 |
37 | | C29E0000
38 | -86 |
39 | | C2BA0000
40 | -101 |
41 | | C2EC0000
42 | -197 |
43 | | C3710000
44 | -267 |
45 | | C39A8000
46 | -333 |
47 | | C3C80000
48 | -499 |
49 | | C4034000
50 | -541 |
20. Floating point (754) notation (signed fractions)
Convert between the numbers. The first line is a solved example.
No. 1
Decimal Number 123.75
2
76.25
3 4
429B0000 64.3125
5 6
42378000 75.75
7 8
42158000 19.25
9 10
41F90000 0.625
11 12
42478000 63.5
13 14
42C68000 12.125
15 16
41C40000 31.625
17 18
42B10000 73.25
19 20
3F900000 3.375
21 22
419E0000 66.875
23 24
Floating point Number (answer to solved example No. 1): 42F78000
425A8000 27.25
Decimal Number
25 26
-25.625 BE000000
27 28
-8.125 C1CA0000
29 30
-26.375 C2AF0000
31 32
-63.75 C2250000
33 34
-0.3125 C0BC0000
35 36
-33.375 C1420000
37 38
-68.875 C2978000
39 40
-1.75 C2854000
41 42
-49.5 C1CE0000
43 44
-99.125 BF600000
45 46
-61.25 C21D8000
47 48
-25.25 C2560000
49 50
Floating point Number C2B24000
-71.125
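The floating-point conversions can be verified with Python's standard `struct` module, which packs a value as an IEEE 754 single-precision number. This is a small sketch for checking answers; the helper names `float_to_hex` and `hex_to_float` are assumptions, and the sketch does not replace working out the sign, exponent and mantissa by hand.

```python
import struct


def float_to_hex(value: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern as 8 hexadecimal digits."""
    return struct.pack(">f", value).hex().upper()  # ">f" = big-endian 32-bit float


def hex_to_float(hex_text: str) -> float:
    """Interpret 8 hexadecimal digits as an IEEE 754 single-precision number."""
    return struct.unpack(">f", bytes.fromhex(hex_text))[0]


print(float_to_hex(1.0))     # solved example (signed integers): 3F800000
print(float_to_hex(123.75))  # solved example (signed fractions): 42F78000
```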
21. Adding Floating point (754) numbers
Add the floating-point numbers, then convert them to decimal, add the decimal numbers, and check the sum against the floating-point result obtained. The first line is a solved example.
No. | First Number (floating point) | Second Number (floating point) | Result (floating point) | First Number (decimal) | Second Number (decimal) | Result (decimal)
1 | 41700000 | 420C0000 | 42480000 | 15.0 | 35.0 | 50.0
2 | C14C0000 | 40300000 | | | |
3 | 418E0000 | 41B20000 | | | |
4 | 3F600000 | 4280A000 | | | |
5 | C2AF0000 | 422D0000 | | | |
6 | 429A0000 | 42080000 | | | |
7 | 44400000 | 43800000 | | | |
8 | 41CA0000 | 43BBB000 | | | |
9 | 414A0000 | 41460000 | | | |
10 | C1EC0000 | 3F000000 | | | |
11 | C18E0000 | C20B0000 | | | |
12 | BE400000 | BF500000 | | | |
13 | 41DD0000 | C0500000 | | | |
14 | 42C00000 | 41C00000 | | | |
15 | 44B08000 | 43008000 | | | |
16 | 42C80000 | 41100000 | | | |
17 | 41E40000 | 3FC00000 | | | |
18 | BF400000 | 40300000 | | | |
19 | 413E0000 | 411E0000 | | | |
20 | C0FC0000 | 41140000 | | | |
21 | 40C40000 | 40C40000 | | | |
22 | 40B80000 | C1340000 | | | |
22. Multiplying Floating point (754) numbers
Multiply the floating-point numbers, then convert them to decimal, multiply the decimal numbers, and check the product against the floating-point result obtained. The first line is a solved example.
No. | First Number (floating point) | Second Number (floating point) | Result (floating point) | First Number (decimal) | Second Number (decimal) | Result (decimal)
1 | 40A00000 | C0000000 | C1200000 | 5 | -2 | -10
2 | 41400000 | 40800000 | | | |
3 | C0700000 | C1000000 | | | |
4 | 40200000 | 40700000 | | | |
5 | 42C80000 | C17C0000 | | | |
6 | 43000000 | 3FC00000 | | | |
7 | C2800000 | C2000000 | | | |
8 | 40900000 | 40B00000 | | | |
9 | 44800000 | 44000000 | | | |
10 | C1C00000 | 42120000 | | | |
11 | 41700000 | 42080000 | | | |
12 | 411C0000 | C0500000 | | | |
13 | 44200000 | 3E000000 | | | |
14 | C2960000 | C0800000 | | | |
15 | 417A0000 | 40900000 | | | |
16 | 41940000 | C22D0000 | | | |
17 | C1900000 | C2B60000 | | | |
18 | C2F00000 | C2600000 | | | |
19 | 41C80000 | 41C00000 | | | |
20 | C0C80000 | 42EC0000 | | | |
21 | 414A0000 | 41840000 | | | |
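To check the decimal side of the addition and multiplication exercises, each 32-bit pattern can be split into its sign, exponent and fraction fields and evaluated with the single-precision formula. The sketch below does this explicitly; the function name `decode_single` is an assumption, and infinities and NaNs are deliberately left out.

```python
def decode_single(hex_text: str) -> float:
    """Evaluate an IEEE 754 single-precision pattern from its three fields."""
    raw = int(hex_text, 16)
    sign = -1 if (raw >> 31) & 1 else 1
    exponent = (raw >> 23) & 0xFF   # 8-bit biased exponent
    fraction = raw & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0xFF:
        raise ValueError("infinity and NaN are not handled in this sketch")
    if exponent == 0:               # zero or subnormal: no hidden leading 1
        return sign * fraction * 2.0 ** -149
    return sign * (1 + fraction / 2 ** 23) * 2.0 ** (exponent - 127)


first, second = decode_single("41700000"), decode_single("420C0000")
print(first, second, first + second)  # solved addition example: 15.0 35.0 50.0
print(decode_single("40A00000") * decode_single("C0000000"))  # solved multiplication example: -10.0
```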
23. BCD numbers (8421, 2421)
All numbers are unsigned 16-bit numbers. The first line is a solved example.
No. | Decimal Number | BCD 8421 | BCD 2421
1 | 1234 | 0001 0010 0011 0100 | 0001 0010 0011 0100
2 | 2468 | |
3 | 7136 | |
4 | 5497 | |
5 | 3780 | |
6 | 9026 | |
7 | 4512 | |
8 | 7462 | |
9 | 3297 | |
10 | 4582 | |
11 | 1097 | |
12 | 7651 | |
13 | 1928 | |
14 | 8763 | |
15 | 7329 | |
16 | 8461 | |
17 | 1357 | |
18 | 9042 | |
19 | 6703 | |
20 | 2730 | |
21 | 4587 | |
22 | 9768 | |
23 | 8264 | |
24. BCD numbers (84-2-1, Excess-3)
All numbers are unsigned 16-bit numbers. The first line is a solved example.
No. | Decimal Number | BCD 84-2-1 | BCD Excess-3
1 | 1234 | 0111 0110 0101 0100 | 0100 0101 0110 0111
2 | 2468 | |
3 | 7136 | |
4 | 5497 | |
5 | 3780 | |
6 | 9026 | |
7 | 4512 | |
8 | 7462 | |
9 | 3297 | |
10 | 4582 | |
11 | 1097 | |
12 | 7651 | |
13 | 1928 | |
14 | 8763 | |
15 | 7329 | |
16 | 8461 | |
17 | 1357 | |
18 | 9042 | |
19 | 6703 | |
20 | 2730 | |
21 | 4587 | |
22 | 9768 | |
23 | 8264 | |
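All four BCD variants encode each decimal digit separately as a 4-bit group, so a per-digit lookup table is enough to check the answers. The sketch below assumes the common self-complementing (Aiken) assignment for the 2421 code, in which digits 5 to 9 use the patterns 1011 to 1111; the 2421 weights allow other assignments, so the book's table may differ for some digits. The names `CODES` and `encode_bcd` are illustrative.

```python
CODES = {
    "8421": [format(d, "04b") for d in range(10)],          # plain binary per digit
    "excess-3": [format(d + 3, "04b") for d in range(10)],  # digit value plus 3
    # Assumption: Aiken (self-complementing) variant of the 2421 code.
    "2421": ["0000", "0001", "0010", "0011", "0100",
             "1011", "1100", "1101", "1110", "1111"],
    # Weights 8, 4, -2, -1.
    "84-2-1": ["0000", "0111", "0110", "0101", "0100",
               "1011", "1010", "1001", "1000", "1111"],
}


def encode_bcd(number: int, code: str) -> str:
    """Encode each decimal digit of a non-negative integer with the chosen 4-bit code."""
    if number < 0:
        raise ValueError("only unsigned values are handled here")
    return " ".join(CODES[code][int(digit)] for digit in str(number))


for code in ("8421", "2421", "84-2-1", "excess-3"):
    print(code, encode_bcd(1234, code))  # compare with the solved first lines
```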
Chapter 4 – Central Processing Unit exercises
1. Architectures
In the following questions, the answer should relate to a stack-based architecture, an accumulator-based architecture, and a register-register architecture.
a) Write the instructions needed for executing the following formula: C = 2A + 3B, where A, B, and C are variables. Try to optimize the code (use the minimum number of instructions possible).
b) Write the instructions needed for executing the following formula: Sum = 8A + 4B + 2C + D, where A, B, C, D, and Sum are variables.
c) Write the instructions needed for executing the following formula: Sum = A + 2B + 4C + 8D, where A, B, C, D, and Sum are variables.
d) Write the instructions needed for executing the following formula: Sum = 2(A + 2B) – 3(C + 2D), where A, B, C, D, and Sum are variables.
e) Write the instructions needed for executing the following formula: Sum = 2AB + 3CD, where A, B, C, D, and Sum are variables.
f) Write the instructions needed for executing the following formula: Sum = 3(A+B)*(C+D), where A, B, C, D, and Sum are variables.
g) Write the instructions needed for executing the following formula: Sum = (A+2B+3C)*(3A+2B+C), where A, B, C, and Sum are variables.
h) Write the instructions needed for executing the following formula: Sum = A*B*C/(A+B+C), where A, B, C, and Sum are variables.
i) Write the instructions needed for executing the following formula: Sum = (A+B)*(B+C)*(A+C), where A, B, C, and Sum are variables.
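To experiment with stack-based instruction sequences, a tiny interpreter is enough. The sketch below is not the book's instruction set: the mnemonics PUSH, POP, ADD and MUL, the tuple encoding of instructions, and the sample formula X = (A + B) * C are all assumptions chosen for illustration.

```python
def run_stack_program(program, memory):
    """Execute a list of stack-machine instructions against a dict of variables."""
    stack = []
    for op, *args in program:
        if op == "PUSH":                 # push the value of a variable
            stack.append(memory[args[0]])
        elif op == "POP":                # pop the top of the stack into a variable
            memory[args[0]] = stack.pop()
        elif op == "ADD":                # replace the two top entries by their sum
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":                # replace the two top entries by their product
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# Hypothetical program for X = (A + B) * C
program = [("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("PUSH", "C"), ("MUL",), ("POP", "X")]
result = run_stack_program(program, {"A": 2, "B": 3, "C": 4})
print(result["X"], len(program))  # value of X and the number of instructions used
```

Counting the instructions in the program list gives a quick way to compare alternative sequences for the same formula.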
2. CPI
a) A specific program was executed on two computer systems. M1 has a cycle time of 1 ns and CPI = 5. M2 has a cycle time of 1.6 ns and CPI = 2.5. Which system is faster (for this specific program), and by how much?
b) Assume we need to get similar speed on both systems (described in the question above). What should the CPI of each of the systems be?
c) Assume we need to get similar speed on both systems (described in the questions above). What should the cycle time of each of the systems be?
d) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:
Group | Frequency | Cycles
ALU | 50% | 1
Load | 20% | 2
Store | 10% | 2
Branch | 20% | 2
Calculate the average CPI for that specific program.
e) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:
Group | Frequency | Cycles
ALU | 43% | 1
Load | 21% | 1
Store | 12% | 2
Branch | 24% | 2
The hardware engineers can improve the performance of the "Store" group of instructions and execute each instruction in one cycle. However, this will require increasing the system's cycle time by 15%. Is it worthwhile to implement the proposed change?
f) A specific system runs at 1 GHz and supports four groups of instructions. When a specific program was executed, the following usage was observed:
Group | Usage | CPI
1 | 20% | 1
2 | 30% | 2
3 | 10% | 3
4 | 40% | 4
Calculate the CPI for that program. If there were 10^10 instructions, how many cycles did the program require and what was the execution time?
g) A program was executed on a specific system and the following attributes were measured:
- The number of instructions executed was 10^9
- The CPI measured during the run was 2.5
- The clock rate was 2.5 GHz
Calculate the amount of time this program ran.
h) The program from the previous question was executed once again, but this time using a different compiler and a different hardware system. The attributes of the second run were as follows:
- The number of instructions executed was 950,000,000
- The CPI measured during the run was 3.0
- The clock rate was 3 GHz
Which system is faster, and by how much?
i) A program was executed on two systems, the first with a 20 ns cycle time and the second with a 30 ns cycle time. On the first system the CPI measured was 2.0, and on the second system the CPI was 0.5. Calculate the run time on each of the systems.
j) A program was executed on two systems, the first with a 500 MHz clock rate and the second with a 650 MHz clock rate. On the first system it used 1 × 10^8 cycles, and on the second system it used 1.2 × 10^8 cycles. Which system is faster, and by how much?
k) A system has three types of instructions. The first type executes in one cycle, the second requires two cycles, and the third requires three cycles. When running a piece of code, five instructions were of type one, three were of type two, and two were of type three. Calculate the CPI of this piece of code.
l) While running a program on system A with a clock rate of 600 MHz, the CPI was 1.3. Running the program on system B with a clock rate of 750 MHz produced a CPI of 2.5. Assuming the number of instructions on system A was 100,000, what should the number of instructions on system B be to achieve the same execution time?
m) Running a specific program requires 3 × 10^10 cycles. How long does it run if it executes on a system with a clock rate of 100 MHz? And how long does it run on a system with a clock rate of 3 GHz?
n) A specific system uses five groups of instructions. Each group requires a different number of cycles for execution. While executing a test run, the group frequencies were measured, as outlined in the following table:
Group | Name | Usage | CPI
1 | ALU | 22% | 1
2 | Memory Access | 36% | 5
3 | Branch | 16% | 3
4 | Call | 13% | 4
5 | Return | 13% | 4
In an attempt to increase the system's performance, the hardware engineers have come up with several improvement suggestions. However, each of the suggestions has some drawbacks (in addition to the benefits). The following table summarizes the suggestions and the drawbacks. Usually an improvement relates to one group of instructions, and the drawback is an increase in the cycle time.
Group | Name | Improvement | Cycle time increase
1 | ALU | 25% | 7%
2 | Memory Access | 35% | 17%
3 | Branch | 90% | 1%
4 | Call | 45% | 2%
5 | Return | 45% | 2%
For example, it is possible to decrease the number of cycles required by the ALU by 25%, but this requires increasing the cycle time (for all the groups) by 7%. The only limitation is that only one improvement can be implemented. Which is the preferred suggestion? Which is the worst one?
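The CPI questions above rest on two relations: the average CPI is the frequency-weighted sum of the cycles per group, and the execution time is the instruction count times the CPI times the cycle time. The sketch below expresses both; the function names and the sample instruction mix are hypothetical values chosen for illustration, not data from any exercise.

```python
def average_cpi(mix):
    """mix is a list of (frequency in percent, cycles) pairs; returns the weighted CPI."""
    total_percent = sum(percent for percent, _ in mix)
    if total_percent != 100:
        raise ValueError("the frequencies must add up to 100%")
    return sum(percent * cycles for percent, cycles in mix) / 100


def execution_time_ns(instructions, cpi, cycle_time_ns):
    """Execution time = instruction count * CPI * cycle time."""
    return instructions * cpi * cycle_time_ns


# Hypothetical mix: 60% one-cycle, 30% two-cycle, 10% four-cycle instructions
cpi = average_cpi([(60, 1), (30, 2), (10, 4)])
print(cpi)                                   # weighted average CPI
print(execution_time_ns(1_000_000, cpi, 2))  # time in ns for 10^6 instructions at a 2 ns cycle
```

A proposed hardware change can be evaluated by computing the product of CPI and cycle time before and after the change and comparing the two.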
3. Amdahl's Law
a) A specific system performs a task in 100 ns. It is possible to introduce an improvement after which the task will be performed in 20 ns. The task is executed only during 30% of the time. Calculate the improvement.
b) A new system can execute a specific task 10 times faster than the existing system. The task is performed during 40% of the time, while 60% of the time is dedicated to I/O operations. What is the total speedup achieved by replacing the system? (Note: the 60% I/O operations will not change.)
c) In a specific application, 5% of the code has to run in serial mode while 95% of the code can be executed in parallel. What will the improvement be when moving the application to a 10-CPU system?
d) After spending a lot of time, a hardware engineer managed to improve the floating-point arithmetic, which now executes twice as fast. Unfortunately, floating-point operations are executed only during 10% of the time. What is the overall speedup achieved?
e) A scientific application was developed so that it can exploit parallel systems utilizing many processors. The parallel part is executed during 95% of the time.
- On how many processors should the application run so that it is 10 times faster?
- What number of processors is required to achieve a run that is 25 times faster?
- After additional work, the developers succeeded in increasing the parallel percentage to 97%.
- How many processors are required for a 10 times faster run?
- How many are required for a 20 times faster run?
f) A scientific application runs on parallel systems; however, only 70% of it is suited for parallel processing. Calculate the speedup obtained by using two, three, four, and five processors.
g) A scientific application runs for 100 minutes: 40 minutes are CPU time and 60 minutes are I/O time. To increase the speed, two alternatives are considered: replacing the processor with a new model that is 90 times faster, or replacing the disk system with a newer model that is 4 times faster.
- Which alternative is better?
- What will the run time be for each alternative?
h) In an attempt to improve the system's performance, the hardware engineers came up with two possible solutions: adding a special hardware device that will execute the square root instruction 10 times faster, or improving the floating-point instructions so that they execute twice as fast. On the benchmark programs, the square root instructions are executed during 20% of the time, while the floating-point instructions are executed during 50% of the time. Which alternative is better?
i) In a specific processor, the Load and Store instructions were improved so that they execute four times faster. These instructions account for 50% of the total run time. What is the overall speedup? Assuming the test program ran for 160 seconds, how long will it run after the improvement?
j) In a specific processor, all logic instructions were improved, and the new ones execute five times faster. Assuming the logic instructions account for 50% of the run time, how long will a program run that previously ran for 10 seconds?
k) After improving the arithmetic instructions, they run five times faster. What should their percentage be if the overall speedup required is 3 times?
l) You are the CIO of a manufacturing organization. Due to anticipated changes, the board asked that the main application executed on the system run twice as fast. You checked the application and discovered that 65% of it is suited for parallel execution. How many processors (or cores) do you have to buy?
m) A specific system is using RAM only (no cache memory). It is possible to add a cache memory that runs five times faster. How will the execution time change if the applications use the cache 80% of the time on average? What will happen if the cache is used only 50% of the time?
n) A vector computer executes an instruction on a vector (array) of values, in contrast to a scalar computer, which executes the instruction on a single value. On a specific vector computer, the vector instructions are 20 times faster than the scalar instructions. What will the speedup be, assuming only 70% of the application can be vectorized?
o) There was an urgent need to improve the performance of a specific CISC-based system. Like any other CISC-based system, it supported many instructions. The hardware engineers mapped the instructions and addressed the 10 most used ones. These 10 instructions account for 90% of the time. After the improvement, these 10 instructions executed 6 times faster. What is the overall speedup obtained?
p) During a meeting dedicated to the newly acquired system, the CIO said that the fact that the new system has 100 processors means that all applications will benefit. Some applications may see an increase of 50 times and others less. Nevertheless, he said, all applications will run at least twice as fast. Do you agree with this assumption?
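All the questions in this section apply Amdahl's law: if a fraction f of the run time is accelerated by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below encodes the formula together with its upper bound; the function names and the sample values are hypothetical and are not taken from the exercises.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the run time becomes `factor` times faster."""
    if not 0 <= fraction <= 1 or factor <= 0:
        raise ValueError("fraction must be in [0, 1] and factor must be positive")
    return 1 / ((1 - fraction) + fraction / factor)


def max_speedup(fraction: float) -> float:
    """Limit of the speedup as the improvement factor grows without bound."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    return 1 / (1 - fraction)


# Hypothetical values: 40% of the run time is made 4 times faster
print(amdahl_speedup(0.4, 4))  # overall speedup for this case
print(max_speedup(0.4))        # bound that no improvement factor can exceed
```

The bound is worth computing first: if the requested speedup exceeds it, no number of processors or degree of improvement can reach the target.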
4. Scoreboarding
a) Draw the scoreboard for the following register-register based architecture's instructions:
Add R2,R3,R4
Sub R3,R1,R4
Add R7,R2,R3
Sub R8,R7,R5
b) Draw the scoreboard for the following register-register based architecture's instructions:
Mult R1,R2,R2
Mult R3,R2,R1
Add R5,R1,R3
Div R8,R5,R4
c) The high-level instruction SUM = A+B+C+D can be implemented in assembly language by:
Add R5,R1,R2
Add R5,R5,R3
Add R5,R5,R4
Or
Add R5,R1,R2
Add R6,R3,R4
Add R5,R5,R6
Which one is better? Draw the scoreboard for the two implementations. Assume that after issuing an instruction there is one extra cycle before the register becomes available.
5. Branch Prediction
a) The following string represents the behavior of a specific branch instruction:
- "1" means the branch was taken
- "0" means the branch was not taken
1001110111
Calculate the success rate of the branch prediction when using one bit and when using two bits. In both cases assume the default is not to branch.
b) What are the success rates of the branches in the previous exercise if the default is branch taken?
c) Calculate the success rates of the branch prediction mechanisms for the following scenarios. A set bit represents a branch taken. The first line is solved.
Scenario          One bit default  Two bits default  One bit success  Two bits success
1110000110110     Taken            Taken             61.54%           46.15%
10101010111       Taken            Not taken
10101010111       Not taken        Taken
1110001010        Taken            Taken
1110001010        Not taken        Not taken
1011100110110110  Not taken        Taken
1110101010111011  Not taken        Taken
1011001110001010  Taken            Not taken
1100110011        Taken            Taken
1100011011101011  Not taken        Not taken
0011101000011001  Taken            Taken
1011011101111011  Taken            Taken
1100011110000011  Not taken        Not taken
1110101011        Not taken        Not taken
1010101011        Taken            Taken
1101101101101101  Taken            Taken
0111111111110     Not taken        Not taken
0111111111110     Taken            Taken
1111000011110000  Not taken        Not taken
0111111101111111  Not taken        Not taken
0001111010101111  Not taken        Not taken
1110101000110111  Taken            Taken
1101011111011000  Taken            Taken
1101110111011101  Taken            Taken
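The two predictors can be simulated with a short sketch. It assumes the one-bit predictor repeats the previous outcome and the two-bit predictor is a saturating counter that starts in the strong state of the default direction. These rules are an assumption; they were chosen because they reproduce the solved first row (61.54% and 46.15%). The function names are illustrative.

```python
def one_bit_success(history, default_taken):
    """Predict the previous outcome; the first prediction is the default."""
    prediction = default_taken
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += prediction == taken
        prediction = taken
    return hits / len(history)


def two_bit_success(history, default_taken):
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""
    state = 3 if default_taken else 0   # assumed: start in the strong state
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += (state >= 2) == taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return hits / len(history)


pattern = "1110000110110"
print(f"{one_bit_success(pattern, True):.2%}")   # 61.54%
print(f"{two_bit_success(pattern, True):.2%}")   # 46.15%
```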
Chapter 6 – Cache Memory exercises
1. Cache Memory improvements
a) A specific system has three levels of memory (Main, Cache L1 and Cache L2).
The memory access time is 50 ns
L1 access time is 1 ns
L2 access time is 5 ns
An application was executed on that system. 30% of its instructions access memory; 90% of these accesses are found in L1 and only 1% have to be brought from main memory.
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.
In the last case 0.56 ns will be added to the calculated CPI.
b) A specific system has four levels of memory (Main, Cache L1, Cache L2 and Cache L3).
The memory access percent is 33%
The memory access time is 60 ns, 0 missing rate
L1 access time is 1 ns, 15% missing rate (the datum is not found in L1)
L2 access time is 10 ns, 7% missing rate
L3 access time is 20 ns, 1% missing rate
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory and L3
- The application is using main memory, L1 and L2.
- The application is using main memory, L1, L2 and L3.
c) The following table defines several systems with different characteristics. All systems have a three-level hierarchical memory (main, L1, L2).
- The memory access represents the percent of the instructions that access memory.
- The access time defines the time (in ns) required to access each of the levels.
- The missing rate represents the percentage of misses per each level. The missing rate for the main memory is always zero.
For each line calculate the amount of time added to the CPI if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.
The first line is solved.
No.  Memory     Access time (ns)   Missing rate (%)   Added time (ns)
     access %   Main   L1   L2     L1    L2           Main    L1     L2     L1+L2
1    35         75     2    10     12    2            26.25   3.77   3.96   1.49
2    40         80     2    15     8     1
3    25         50     3    12     9     2
4    29         48     2    10     12    2
5    32         56     1    8      11    2
6    48         40     2    12     14    3
7    28         50     3    15     12    2
8    36         54     1    12     10    1
9    65         50     1    11     8     1
10   52         52     2    11     9     2
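The solved first line can be reproduced with a small sketch. It assumes each missing rate is the fraction of all memory accesses that are not satisfied at that level or above, so with L1 and L2 present, 88% of accesses cost the L1 time, 10% the L2 time and 2% the main memory time. This reading is an assumption; it was chosen because it matches the solved values to within rounding of the last digit. The name `added_time` is illustrative.

```python
def added_time(access_fraction, t_main, levels):
    """Average time (ns) added per instruction by memory accesses.

    levels: list of (access_time_ns, miss_rate) for the caches in use,
    ordered from the fastest level to the slowest.
    """
    reaching = 1.0   # fraction of accesses that get as far as this level
    total = 0.0
    for t_level, miss_rate in levels:
        total += (reaching - miss_rate) * t_level   # accesses served here
        reaching = miss_rate
    total += reaching * t_main                      # the rest go to main memory
    return access_fraction * total


# Line 1 of the table: 35% memory access, main 75 ns, L1 2 ns, L2 10 ns,
# missing rates 12% (L1) and 2% (L2).
print(added_time(0.35, 75, []))                        # main memory only
print(added_time(0.35, 75, [(2, 0.12)]))               # main and L1
print(added_time(0.35, 75, [(10, 0.02)]))              # main and L2
print(added_time(0.35, 75, [(2, 0.12), (10, 0.02)]))   # main, L1 and L2
```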
Chapter 7 – BUS exercises
1. Parity
a) The following table contains binary numbers and the definition of the parity bit (odd or even). Calculate the parity and complete the table. The first exercise is solved.
No.  Binary Number      Parity  Parity Bit
1    10 1010 1010 1010  Odd     0
2    1111 0000 1111     Even
3    1 1011 1101 1010   Even
4    11 0000 1011       Odd
5    1111 1100 0000     Odd
6    10 0000 0100       Odd
7    100 0111 1010      Even
8    000 1101 1010      Even
9    1 0111 0101 1011   Even
10   1 1111 1111        Odd
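A minimal sketch of the parity calculation follows. The parity bit is chosen so that the total number of 1 bits, including the parity bit itself, is even or odd as requested. The function name `parity_bit` is illustrative.

```python
def parity_bit(bits, kind):
    """Return the parity bit for a binary string; kind is "even" or "odd"."""
    ones = bits.replace(" ", "").count("1")
    if kind == "even":
        return ones % 2        # add a 1 only when the count is odd
    return 1 - ones % 2        # add a 1 only when the count is even


print(parity_bit("10 1010 1010 1010", "odd"))  # 0
```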
b) To increase the integrity of the information sent over the network, a special parity algorithm was defined. Instead of adding just one bit per block (seven bits), the algorithm adds three parity bits. Each such parity bit guards several data bits. Unlike the ordinary simple parity bit mechanism, this mechanism increases the overhead but provides the capability to correct faulty blocks without the need to re-transmit. The algorithm is defined by:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Encode the binary number 0110010 by adding the three parity bits as defined by the algorithm above.
c) The following table contains a list of 7-bit blocks that have to be encoded using the previously described algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block. The first line is solved.
No.  Block Content  P0  P1  P2
1    111 1111       1   1   1
2    101 1110
3    101 1110
4    001 0011
5    101 0101
6    110 1101
7    111 0011
8    101 1111
9    111 1001
10   100 0111
11   100 1001
12   111 1100
13   001 0010
d) The following table contains a list of 7-bit blocks that have to be encoded using a very similar algorithm (same locations, but the parity is odd):
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block. The first line is solved.
No.  Block Content  P0  P1  P2
1    011 1111       1   0   1
2    000 1010
3    111 0000
4    101 0000
5    111 0111
6    110 0011
7    111 1111
8    100 0001
9    011 1110
10   101 0101
11   111 0001
12   101 1101
13   110 1101
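The three-parity-bit encoding can be sketched as follows. The sketch assumes D0 is the leftmost bit of the block; this ordering is an assumption, chosen because it reproduces the solved rows (111 1111 with even parity gives 1 1 1, and 011 1111 with odd parity gives 1 0 1). The names `GROUPS` and `parity_bits` are illustrative.

```python
# Data bit indices guarded by each parity bit, as defined in the exercise.
GROUPS = {
    "P0": (0, 1, 3, 4, 5),
    "P1": (1, 2, 3, 5, 6),
    "P2": (0, 2, 3, 4, 6),
}


def parity_bits(block, odd=False):
    """Return [P0, P1, P2] for a 7-bit block (D0 assumed leftmost)."""
    data = [int(c) for c in block.replace(" ", "")]
    result = []
    for name in ("P0", "P1", "P2"):
        ones = sum(data[i] for i in GROUPS[name])
        bit = ones % 2          # even parity: make the group's total even
        if odd:
            bit ^= 1            # odd parity is the complement
        result.append(bit)
    return result


print(parity_bits("111 1111"))            # [1, 1, 1]
print(parity_bits("011 1111", odd=True))  # [1, 0, 1]
```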
e) The block: 111 0101 110 arrived through the network after it was encoded using the following algorithm: P0 = even_parity (D0, D1, D3, D4, D5) P1 = even_parity (D1, D2, D3, D5, D6) P2 = even_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.
f) The block: 101 1111 000 arrived through the network after it was encoded using the following algorithm: P0 = odd_parity (D0, D1, D3, D4, D5) P1 = odd_parity (D1, D2, D3, D5, D6) P2 = odd_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.
g) The block: 111 0111 101 arrived through the network after it was encoded using the following algorithm: P0 = odd_parity (D0, D1, D3, D4, D5) P1 = odd_parity (D1, D2, D3, D5, D6) P2 = odd_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.
h) The following table contains a list of blocks that arrived through the network. All blocks were encoded using the following algorithm, with either odd or even parity:
P0 = parity (D0, D1, D3, D4, D5)
P1 = parity (D1, D2, D3, D5, D6)
P2 = parity (D0, D2, D3, D4, D6)
The parity to use (odd or even) is defined in the table for each block. Decode each received block to find the correct data bits of the original block, assuming no more than one bit flipped. The first line is solved.
No.  Block Received  Algorithm  Flipped bit  Original Block
1    111 1101 000    Odd        D5           111 1111
2    000 0100 111    Odd
3    110 1001 110    Even
4    110 0111 000    Even
5    110 0001 000    Even
6    101 1010 000    Even
7    100 1111 010    Odd
8    010 1111 000    Odd
9    101 0111 000    Odd
10   111 0110 000    Even
11   110 0111 110    Even
12   111 0001 011    Even
13   111 0101 101    Odd
14   101 0100 110    Odd
15   001 1000 110    Odd
16   011 1010 101    Odd
17   000 0000 100    Odd
18   111 1111 010    Even
19   000 0010 001    Even
20   010 0101 101    Even
21   011 1011 110    Even
22   011 0101 010    Odd
23   011 0111 100    Odd
24   010 1001 000    Odd
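A first decoding step is to recompute the three parity bits from the received data bits and see which ones disagree with the received parity bits. The sketch below does only that; mapping the failing checks back to a flipped bit is left as in the exercise. It assumes the block layout is D0..D6 followed by P0 P1 P2, which is consistent with the solved first row. The name `failing_checks` is illustrative.

```python
GROUPS = {
    "P0": (0, 1, 3, 4, 5),
    "P1": (1, 2, 3, 5, 6),
    "P2": (0, 2, 3, 4, 6),
}


def failing_checks(received, odd=False):
    """Return the names of the parity bits that do not match the data."""
    bits = [int(c) for c in received.replace(" ", "")]
    data, sent = bits[:7], bits[7:]          # assumed layout: D0..D6, P0, P1, P2
    failed = []
    for name, sent_bit in zip(("P0", "P1", "P2"), sent):
        ones = sum(data[i] for i in GROUPS[name])
        expected = (ones % 2) ^ (1 if odd else 0)
        if expected != sent_bit:
            failed.append(name)
    return failed


# Solved row: P0 and P1 fail, P2 passes, and the booklet reports D5 as flipped.
print(failing_checks("111 1101 000", odd=True))  # ['P0', 'P1']
```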
2. Hamming Codes
a) Use odd Hamming codes for encoding the value 1111
b) Use odd Hamming codes for encoding the value 1010
c) Use even Hamming codes for encoding the value 1101 1110
d) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes for each line that represents a data block. The specific parity to be used for each block is defined as well. The first line is solved.
No.  Original Data  Parity  Encoded Block
1    011 1111       Even    000 1111 1111
2    011 1111       Odd
3    111 0001       Odd
4    101 1011       Even
5    000 1111       Even
6    010 0100       Odd
7    101 1100       Odd
8    101 1100       Even
9    100 1000       Even
10   111 1000       Even
11   010 1001       Odd
12   010 1001       Even
13   001 1111       Odd
14   000 0111       Odd
15   101 0101       Even
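The encoding can be sketched as below. It assumes the usual Hamming layout: positions are numbered from 1 starting at the left, parity bits sit at the power-of-two positions, and each parity bit covers the positions whose number contains that power of two. This layout is an assumption, chosen because it reproduces the solved first row (011 1111 with even parity gives 000 1111 1111). The name `hamming_encode` is illustrative.

```python
def hamming_encode(data, odd=False):
    """Encode a binary string with Hamming parity bits at positions 1, 2, 4, 8, ..."""
    bits = [int(c) for c in data.replace(" ", "")]
    n_parity = 0
    while (1 << n_parity) < len(bits) + n_parity + 1:
        n_parity += 1
    total = len(bits) + n_parity
    code = [0] * (total + 1)              # index 0 unused; positions are 1-based
    remaining = iter(bits)
    for pos in range(1, total + 1):
        if pos & (pos - 1):               # not a power of two: data position
            code[pos] = next(remaining)
    for p in range(n_parity):
        mask = 1 << p
        ones = sum(code[pos] for pos in range(1, total + 1)
                   if pos & mask and pos != mask)
        code[mask] = (ones % 2) ^ (1 if odd else 0)
    return "".join(map(str, code[1:]))


print(hamming_encode("011 1111"))  # 00011111111
```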
3. SECDED
a) Use even Hamming codes for encoding the value 110 1001 and in addition add an odd SECDED bit.
b) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes as well as the SECDED bit. The specific parity to be used for each block is defined as well. The first line is solved.
No.  Original Data  Hamming Parity  SECDED Parity  Encoded Block
1    101 1111       Even            Odd            1011 0011 1111
2    010 1100       Even            Even
3    001 0001       Odd             Odd
4    100 1110       Even            Odd
5    101 0001       Odd             Even
6    111 0001       Odd             Odd
7    011 0110       Even            Odd
8    001 1010       Even            Even
9    010 0010       Odd             Even
10   100 1110       Odd             Odd
11   000 0111       Even            Odd
12   010 1110       Even            Odd
13   111 1000       Odd             Even
14   011 0011       Odd             Odd
c) A message with the content 0001 0111 0111 arrived from the network. It was encoded using odd Hamming codes and an odd SECDED. Decode it to obtain the original data.
d) A message with the content 1111 0010 0011 arrived from the network. It was encoded using even Hamming codes and an even SECDED. Decode it to obtain the original data.
e) The following table contains a list of blocks that were received. Decode the Hamming codes as well as the SECDED. The specific parity to be used for each block is defined as well. The first line is solved.
No.  Received Block  Hamming Parity  SECDED Parity  Bit Flipped  Original Data
1    0001 0101 0111  Odd             Odd            D6           111 1111
2    1010 1101 1001  Even            Even
3    1000 1000 0000  Odd             Odd
4    0110 1100 0101  Even            Odd
5    0100 0111 1110  Odd             Even
6    0001 0010 1001  Odd             Odd
7    1110 1101 1111  Even            Odd
8    0110 0010 1010  Even            Even
9    1000 0111 1000  Odd             Even
10   0100 0000 0001  Odd             Odd
11   1011 1100 1101  Even            Odd
12   1000 1010 1110  Even            Odd
13   1001 1111 1011  Odd             Even
14   1110 0111 1010  Odd             Odd
15   1011 1101 1100  Even            Odd
16   1101 1010 0101  Even            Even
17   1100 0001 1111  Even            Even
18   0110 1000 0001  Odd             Odd
19   1010 0001 0000  Odd             Even
20   0001 1001 0111  Even            Odd
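Decoding a received block can be sketched as follows. The sketch assumes the leftmost bit is the SECDED bit and the remaining bits are Hamming positions 1 to 11, with the syndrome giving the position of a single flipped bit. This layout is an assumption, chosen because it reproduces the solved first row (position 6 flipped, original data 111 1111). The name `secded_decode` and the status strings are illustrative.

```python
def secded_decode(block, hamming_odd, secded_odd):
    """Return (status, flipped_position, data_bits) for a received block."""
    bits = [int(c) for c in block.replace(" ", "")]
    code = [0] + bits[1:]            # code[1..n]: Hamming positions; bits[0]: SECDED bit
    n = len(code) - 1
    syndrome, mask = 0, 1
    while mask <= n:
        # Each check covers every position whose number contains `mask`.
        ones = sum(code[pos] for pos in range(1, n + 1) if pos & mask)
        if ones % 2 != int(hamming_odd):
            syndrome += mask
        mask <<= 1
    overall_ok = sum(bits) % 2 == int(secded_odd)
    if syndrome and (overall_ok or syndrome > n):
        return "uncorrectable", None, None   # looks like more than one flipped bit
    if syndrome:
        code[syndrome] ^= 1                  # single error: flip it back
        status = "corrected"
    else:
        status = "ok" if overall_ok else "SECDED bit flipped"
    data = "".join(str(code[pos]) for pos in range(1, n + 1) if pos & (pos - 1))
    return status, syndrome or None, data


print(secded_decode("0001 0101 0111", hamming_odd=True, secded_odd=True))
# ('corrected', 6, '1111111')
```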
System Architecture Exercises Solutions
Chapter 2 – Data Representation exercises
1. Decimal to Binary conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Number
Binary Number
1
123
111 1011
2
1 023
11 1111 1111
3
765
10 1111 1101
4
9 898
10 0110 1010 1010
5
3 999
1111 1001 1111
6
159
1001 1111
7
762
10 1111 1010
8
7 602
1 1101 1011 0010
9
5 577
1 0101 1100 1001
10
2 004
111 1101 0100
11
11 000
10 1010 1111 1000
12
35 355
1000 1010 0001 1011
13
404
1 1001 0100
14
1 234
100 1101 0010
15
4 949
1 0011 0101 0101
16
573
10 0011 1101
17
669
10 1001 1101
18
917
11 1001 0101
19
8 192
10 0000 0000 0000
20
7 811
1 1110 1000 0011
21
8 642
10 0001 1100 0010
22
3 789
1110 1100 1101
23
7 003
1 1011 0101 1011
24
3 887
1111 0010 1111
25
6 423
1 1001 0001 0111
26
12 129
10 1111 0110 0001
27
27 298
110 1010 1010 0010
28
19 999
100 1110 0001 1111
29
9 873
10 0110 1001 0001
30
17 399
100 0011 1111 0111
31
57
11 1001
32
634
10 0111 1010
33
9 824
10 0110 0110 0000
34
10 000
10 0111 0001 0000
35
5 665
1 0110 0010 0001
36
7 991
1 1111 0011 0111
37
999
11 1110 0111
38
800
11 0010 0000
39
3 333
1101 0000 0101
40
7 007
1 1011 0101 1111
41
12 123
10 1111 0101 1011
42
255
1111 1111
43
7 777
1 1110 0110 0001
44
5 656
1 0110 0001 1000
45
4 321
1 0000 1110 0001
46
99
110 0011
47
375
1 0111 0111
48
1 010
11 1111 0010
49
8 119
1 1111 1011 0111
50
6 944
1 1011 0010 0000
51
2 468
1001 1010 0100
52
1 753
110 1101 1001
53
4 762
1 0010 1001 1010
54
8 117
1 1111 1011 0101
55
1 928
111 1000 1000
56
7 956
1 1111 0001 0100
57
19 175
100 1010 1110 0111
58
22 222
101 0110 1100 1110
59
7 275
1 1100 0110 1011
60
1 983
111 1011 1111
61
5 555
1 0101 1011 0011
62
36 133
1000 1101 0010 0101
63
11 223
10 1011 1101 0111
64
4 590
1 0001 1110 1110
65
21 325
101 0011 0100 1101
66
9 176
10 0011 1101 1000
67
9
1001
68
81
101 0001
69
5 933
1 0111 0010 1101
70
9 724
10 0101 1111 1100
71
5 311
1 0100 1011 1111
72
14 000
11 0110 1011 0000
73
781
11 0000 1101
74
35
10 0011
75
28 753
111 0000 0101 0001
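The answers in this table can be checked with a short sketch that converts a decimal number to binary and groups the digits in fours from the right, as the booklet does. The function name `to_binary_groups` is illustrative.

```python
def to_binary_groups(n):
    """Return n in binary, grouped in fours from the right (e.g. 123 -> '111 1011')."""
    digits = format(n, "b")
    digits = "0" * (-len(digits) % 4) + digits          # pad to a multiple of 4
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    groups[0] = groups[0].lstrip("0") or "0"            # drop the padding again
    return " ".join(groups)


print(to_binary_groups(123))   # 111 1011
print(to_binary_groups(1023))  # 11 1111 1111
```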
2. Binary to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Binary Number
Decimal Number
1
1 0000 0000
256
2
1100 1100 1101
3 277
3
1 0010 0011 0100
4 660
4
1 1000 0100 0010
6 210
5
100 0000 0001
1 025
6
1010 1010 1010
2 730
7
1 0000 0000 0001
4 097
8
100 1000 1000 1101
18 573
9
1100 1100 1100
3 276
10
111 1000 0111
1 927
11
1 0000 0010 0100
4 132
12
10 1010 1010 1010
10 922
13
111 0111 0111 0111
30 583
14
111 1011 0110
1 974
15
1101 0000 1101
3 341
16
1110 1001 1110
3 742
17
101 0010 1100 0001
21 185
18
101 0000 0000 1010
20 490
19
101 0001 0001 1111
20 767
20
100 1101 1100 0001
19 905
21
1 1111 0011 0011
7 987
22
11 0001 1111 1111
12 799
23
11 0100 0101 0110
13 398
24
111 1010 0001 0011
31 251
25
111 1010 1011 1100
31 420
26
11 0011 1001 1000
13 208
27
1111 1010 1010
4 010
28
10 1000 1100 1001
10 441
29
1011 0101 0101
2 901
30
110 1110 1111
1 775
31
11 1111 0100 0000
16 192
32
1 0001 0000 0001
4 353
33
1 1111 1111 1111
8 191
34
110 0110 1000 0100
26 244
35
1001 1001 1001
2 457
36
1001 1010 1010
2 474
37
11 0110 0111
871
38
1111 0000 0000
3 840
39
1000 0000 0001
2 049
40
1010 0000 1111
2 575
41
1000 1001 1010 1011
35 243
42
1001 0111 0110
2 422
43
11 0011 0011 0011
13 107
44
1001 0111 0101 0011
38 739
45
11 1000 1010
906
46
1011 1011 1000
3 000
47
1001 1110 1110
2 542
48
1110 1110 1001
3 817
49
101 0001 1110 0101
20 965
50
11 0000 0000 0011
12 291
51
10 0100 0110 1001
9 321
52
111 1010 1011
1 963
53
1 1010 0010 0010
6 690
54
101 0001 0001 0001
20 753
55
111 0110 0101 0100
30 292
56
1001 0010 0011 0100
37 428
57
1 1001 0010 1000
6 440
58
1000 0010 0111 0011
33 395
59
1011 1100 1100
3 020
60
1010 1001 1000
2 712
61
110 1000 1010 0001
26 785
62
100 1011 0001 1010
19 226
63
10 1011 0110 1001
11 113
64
111 0001 0101 0111
29 015
65
111 0101 1001 0100
30 100
66
1 1111 0000 1000
7 944
67
1 0101 1011
347
68
10 1011 0000 0011
11 011
69
10 0010 1011 1000
8 888
70
1 1000 1011 1101
6 333
71
1110 1001 1001
3 737
72
1 1110 1000 1111
7 823
73
110 1100 0001 1001
27 673
74
100 1010 0011 1101
19 005
75
10 0000 0011 0110
8 246
76
100 0000 0000 0000
16 384
3. Decimal to Hexadecimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Number
Hexadecimal Number
1
31 987
7CF3
2
21 007
520F
3
17 000
4268
4
14 555
38DB
5
17 800
4588
6
15 841
3DE1
7
14 773
39B5
8
17 183
431F
9
79 891
13813
10
22 888
5968
11
12 871
3247
12
21 680
54B0
13
14 000
36B0
14
29 612
73AC
15
16 658
4112
16
22 888
5968
17
10 000
2710
18
29 999
752F
19
30 651
77BB
20
31 113
7989
21
30 001
7531
22
17 325
43AD
23
9 876
2694
24
21 111
5277
25
911
38F
26
14 590
38FE
27
17 618
44D2
28
9 784
2638
29
11 011
2B03
30
8 933
22E5
31
12 617
3149
32
21 039
522F
33
10 000
2710
34
6 785
1A81
35
12 777
31E9
36
24 242
5EB2
37
19 898
4DBA
38
6 444
192C
39
717
2CD
40
3 982
F8E
41
10 986
2AEA
42
519
207
43
4 102
1006
44
22 097
5651
45
27 963
6D3B
46
16 741
4165
47
3 785
EC9
48
9 261
242D
49
4 022
FB6
50
26 789
68A5
4. Hexadecimal to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Hexadecimal Number
Decimal Number
1
8 000
32 768
2
6 B8C
27 532
3
3 5DD
13 789
4
3 E1D
15 901
5
2 B67
11 111
6
3 FCD
16 333
7
2 6A0
9 888
8
4 705
18 181
9
A FC8
45 000
10
2 FAD
12 205
11
8 791
34 705
12
7 A7A
31 354
13
B 00F
45 071
14
A 000
40 960
15
7 575
30 069
16
2 FAD
12 205
17
5 AB0
23 216
18
5 7E4
22 500
19
4 944
18 756
20
313
787
21
3 745
14 149
22
6 20B
25 099
23
2 2CF
8 911
24
5 A9B
23 195
25
9 C40
40 000
26
20F
527
27
7E0
2 016
28
7 97B
31 099
29
2 F59
12 121
30
2 333
9 011
31
4 2E0
17 120
32
2 000
8 192
33
1 D6B
7 531
34
395
917
35
3 30A
13 066
36
2 222
8 738
37
4 D6D
19 821
38
4 36F
17 263
39
2C7
711
40
1 B63
7 011
41
DAB
3 499
42
FFA
4 090
43
1 F62
8 034
44
1 565
5 477
45
393
915
46
9A
154
47
4 7B3
18 355
48
4 267
16 999
49
2 DE2
11 746
50
2 107
8 455
5. Decimal to Octal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Number
Octal Number
1
3 755
7 253
2
11 111
25 547
3
1 733
3 305
4
12 121
27 531
5
9 875
23 223
6
9 077
21 565
7
96 017
273 421
8
23 777
56 341
9
35 000
104 270
10
33 333
101 065
11
44 330
126 452
12
45 145
130 131
13
53 723
150 733
14
2 323
4 423
15
47 474
134 562
16
32 432
77 260
17
53 521
150 421
18
39 999
116 077
19
22 555
54 033
20
40 960
120 000
21
261 549
776 655
22
20 480
50 000
23
32 767
77 777
24
319
477
25
26 788
64 244
26
10 976
25 340
27
17 435
42 033
28
8 099
17 643
29
11 099
25 533
30
25 000
60 650
31
9 822
23 136
32
12 377
30 131
33
5 077
11 725
34
7 333
16 245
35
49
61
36
18 555
44 173
37
9 000
21 450
38
7 102
15 676
39
8 989
21 435
40
15 415
36 067
41
2 017
3 741
42
6 522
14 572
43
16 581
40 305
44
14 037
33 325
45
9 378
22 242
46
14 043
33 333
47
9 998
23 416
48
5 037
11 655
49
17 321
41 651
50
12 055
27 427
6. Octal to Decimal conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Octal Number
Decimal Number
1
4 227
2 199
2
23 417
9 999
3
1 733
987
4
6 474
3 388
5
36 475
15 677
6
1 761
1 009
7
17 630
8 088
8
47 037
19 999
9
76 220
31 888
10
67 175
28 285
11
137 743
49 123
12
114 265
39 093
13
67 742
28 642
14
2 274
1 212
15
27 351
12 009
16
105 033
35 355
17
202 152
66 666
18
121 666
41 910
19
51 167
21 111
20
65 500
27 456
21
17 777
8 191
22
77 123
32 339
23
37 000
15 872
24
325
213
25
1 111
585
26
12 345
5 349
27
25 252
10 922
28
6 666
3 510
29
70 001
28 673
30
56 712
24 010
31
1 000
512
32
63 451
26 409
33
36 712
15 818
34
3 412
1 802
35
6 711
3 529
36
7 033
3 611
37
4 510
2 376
38
2 777
1 535
39
10 666
4 534
40
12 012
5 130
41
43 651
18 345
42
33 773
14 331
43
6 236
3 230
44
1 177
639
45
17 451
7 977
46
6 755
3 565
47
23 773
10 235
48
5 000
2 560
49
16 666
7 606
50
7 654
4 012
7. Bases conversions (3-10)
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Number
Base 3
Base 4
Base 5
Base 6
Base 7
1
159
12 220
2 133
1 114
423
315
2
1 991
2 201 202
133 013
30 431
13 115
5 543
3
7 654
101 111 111
1 313 212
221 104
55 234
31 213
4
2 002
2 202 011
133 102
31 002
13 134
5 560
5
79
2 221
1 033
304
211
142
6
192
21 010
3 000
1 232
520
363
7
789
1 002 020
30 111
11 124
3 353
2 205
8
1 299
1 210 010
110 103
20 144
10 003
3 534
9
1 001
1 101 002
33 221
13 001
4 345
2 630
10
1 777
2 102 211
123 301
24 102
12 121
5 116
11
2 905
10 222 121
231 121
43 110
21 241
11 320
12
4 986
20 211 200
1 031 322
124 421
35 030
20 352
13
8 192
102 020 102
2 000 000
230 232
101 532
32 612
14
7 500
101 021 210
1 311 030
220 000
54 420
30 603
15
3 999
12 111 010
332 133
111 444
30 303
14 442
16
4 111
12 122 021
1 000 033
112 421
31 011
14 662
17
6 767
100 021 122
1 221 233
204 032
51 155
25 505
18
5 005
20 212 102
1 032 031
130 010
35 101
20 410
19
2 966
11 001 212
232 112
43 331
21 422
11 435
20
1 897
2 121 021
131 221
30 042
12 441
5 350
21
1 566
2 011 000
120 132
22 231
11 130
4 365
22
1 476
2 000 200
113 010
21 401
10 500
4 206
23
1 149
1 120 120
101 331
14 044
5 153
3 231
24
3 003
11 010 020
232 323
44 003
21 523
11 520
25
1 733
2 101 012
123 011
23 413
12 005
5 024
26
6 474
22 212 210
1 211 022
201 344
45 550
24 606
27
36 475
1 212 000 221
20 321 323
2 131 400
440 511
211 225
28
1 761
2 102 020
123 201
24 021
12 053
5 064
29
17 630
220 011 222
10 103 132
1 031 010
213 342
102 254
30
37 037
1 212 210 202
21 002 231
2 141 122
443 245
212 660
31
16 220
211 020 202
3 331 130
1 004 340
203 032
65 201
32
27 175
1 101 021 111
12 220 213
1 332 200
325 451
142 141
33
37 743
1 220 202 220
21 031 233
2 201 433
450 423
215 016
34
14 265
201 120 100
3 132 321
424 030
150 013
56 406
35
9 999
111 201 100
2 130 033
304 444
114 143
41 103
36
2 274
10 010 020
203 202
33 044
14 310
6 426
37
21 351
1 002 021 210
11 031 213
1 140 401
242 503
116 151
38
10 033
111 202 121
2 130 301
310 113
114 241
41 152
39
20 152
1 000 122 101
10 322 320
1 121 102
233 144
112 516
40
11 666
121 000 002
2 312 102
331 313
130 002
46 004
41
13 167
200 001 200
3 031 233
410 132
140 543
53 250
42
5 500
21 112 201
1 111 330
134 000
41 244
22 015
43
6 236
22 112 222
1 201 130
144 421
44 512
24 116
44
1 177
1 121 121
102 121
14 202
5 241
3 301
45
14 451
201 211 020
3 201 303
430 301
150 523
60 063
46
7 755
101 122 020
1 321 023
222 010
55 523
31 416
47
23 773
1 012 121 111
11 303 131
1 230 043
302 021
126 211
48
5 000
20 212 012
1 032 020
130 000
35 052
20 402
49
16 666
211 212 021
10 010 122
1 013 131
205 054
66 406
50
7 811
101 021 022
1 322 003
222 221
100 055
31 526
8. Bases conversions (10-15)
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Number
Base 11
Base 12
Base 13
Base 14
Base 15
1
1234
A22
86A
73C
642
574
2
3591
2775
20B3
1833
1447
10E6
3
7699
586A
4557
3673
2B3D
2434
4
6000
4565
3580
2967
2288
1BA0
5
2222
1740
1352
101C
B4A
9D2
6
3450
2657
1BB6
1755
1386
1050
7
4212
318A
2530
1BC0
176C
13AC
8
2987
2276
188B
148A
1135
D42
9
12121
911A
7021
5695
45BB
38D1
10
9999
7570
5953
4722
3903
2E69
11
8500
6428
4B04
3B3B
3152
27BA
12
7303
553A
4287
342A
2939
226D
13
6477
4959
38B9
2C43
2509
1DBC
14
3926
2A4A
2332
1A30
1606
126B
15
5501
4151
3225
2672
200D
196B
16
7244
5496
4238
33B3
28D6
222E
17
12987
9837
7623
5BB0
4A39
3CAC
18
11111
8391
651B
5099
4099
345B
19
10009
757A
5961
472C
390D
2E74
20
13456
A123
7954
6181
4C92
3EC1
21
21550
15211
1057A
9A69
7BD4
65BA
22
19999
14031
B6A7
9145
7407
5DD4
23
13675
A302
7AB7
62BC
4DAB
40BA
24
14971
10280
87B7
6A78
5655
4681
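The conversions in exercises 7 and 8 all follow the same repeated-division method, which can be sketched as below. Digits above 9 are written as letters (A = 10, B = 11 and so on), as in the table. The names `DIGITS` and `to_base` are illustrative.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n, base):
    """Convert a non-negative integer to the given base (2 to 16)."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, base)   # the remainders are the digits, last first
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(to_base(159, 3))    # 12220
print(to_base(1234, 11))  # A22
print(to_base(1234, 15))  # 574
```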
9. Decimal fractions to Binary fractions conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Decimal Fraction
Binary Fraction
1
0.4375
0.0111
2
0.1875
0.0011
3
0.625
0.101
4
0.1171875
0.0001111
5
0.546875
0.100011
6
0.875
0.111
7
0.34375
0.01011
8
0.9375
0.1111
9
0.375
0.011
10
0.40625
0.01101
11
0.140625
0.001001
12
0.65625
0.10101
13
0.234375
0.001111
14
0.734375
0.101111
15
0.8125
0.1101
16
0.8984375
0.1110011
17
0.71875
0.10111
18
0.109375
0.000111
19
0.578125
0.100101
20
0.953125
0.111101
21
0.5546875
0.1000111
22
0.4765625
0.0111101
23
0.13671875
0.00100011
24
0.49609375
0.01111111
10. Binary fractions to Decimal fractions conversions
All numbers are positive (unsigned numbers). The first line is a solved example.
No.
Binary Fraction
Decimal Fraction
1
0.1
0.5
2
0.001
0.125
3
0.1111111
0.9921875
4
0.010001
0.265625
5
0.010111
0.359375
6
0.000001
0.015625
7
0.00011
0.09375
8
0.1011
0.6875
9
0.00101
0.15625
10
0.000101
0.078125
11
0.10001
0.53125
12
0.001101
0.203125
13
0.11101
0.90625
14
0.11
0.75
15
0.100001
0.515625
16
0.00111
0.21875
17
0.1001
0.5625
18
0.010011
0.296875
19
0.011011
0.421875
20
0.11111
0.96875
21
0.0101101
0.3515625
22
0.1100101
0.7890625
23
0.0000011
0.0234375
24
0.00011011
0.10546875
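The fraction conversions of exercises 9 and 10 can be sketched in both directions. Converting to binary repeatedly doubles the fraction and takes the integer part as the next bit; converting back sums the negative powers of two. Exact rational arithmetic is used to avoid rounding; the function names and the 16-bit limit are illustrative choices.

```python
from fractions import Fraction


def fraction_to_binary(text, max_bits=16):
    """Convert a decimal fraction such as '0.4375' to a binary fraction string."""
    value = Fraction(text)
    bits = []
    while value and len(bits) < max_bits:
        value *= 2
        bit = int(value >= 1)     # the integer part is the next binary digit
        bits.append(str(bit))
        value -= bit
    return "0." + "".join(bits)


def binary_to_fraction(text):
    """Convert a binary fraction string such as '0.0111' to an exact Fraction."""
    digits = text.split(".")[1]
    return sum(Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(digits))


print(fraction_to_binary("0.4375"))       # 0.0111
print(float(binary_to_fraction("0.1")))   # 0.5
```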
11. Negative numbers representations
Convert the negative decimal number to binary (16 bits). The first line is a solved example. All results are binary numbers.
No.
Decimal Number
One’s Complement
Two’s Complement
Sign and Magnitude
1
- 4095
1111 0000 0000 0000
1111 0000 0000 0001
1000 1111 1111 1111
2
- 7612
1110 0010 0100 0011
1110 0010 0100 0100
1001 1101 1011 1100
3
- 1985
1111 1000 0011 1110
1111 1000 0011 1111
1000 0111 1100 0001
4
- 5777
1110 1001 0110 1110
1110 1001 0110 1111
1001 0110 1001 0001
5
- 8000
1110 0000 1011 1111
1110 0000 1100 0000
1001 1111 0100 0000
6
- 729
1111 1101 0010 0110
1111 1101 0010 0111
1000 0010 1101 1001
7
- 3333
1111 0010 1111 1010
1111 0010 1111 1011
1000 1101 0000 0101
8
- 4799
1110 1101 0100 0000
1110 1101 0100 0001
1001 0010 1011 1111
9
- 6222
1110 0111 1011 0001
1110 0111 1011 0010
1001 1000 0100 1110
10
- 3488
1111 0010 0101 1111
1111 0010 0110 0000
1000 1101 1010 0000
11
- 8190
1110 0000 0000 0001
1110 0000 0000 0010
1001 1111 1111 1110
12
- 9999
1101 1000 1111 0000
1101 1000 1111 0001
1010 0111 0000 1111
13
- 7231
1110 0011 1100 0000
1110 0011 1100 0001
1001 1100 0011 1111
14
-3
1111 1111 1111 1100
1111 1111 1111 1101
1000 0000 0000 0011
15
- 127
1111 1111 1000 0000
1111 1111 1000 0001
1000 0000 0111 1111
16
- 4777
1110 1101 0101 0110
1110 1101 0101 0111
1001 0010 1010 1001
17
- 1741
1111 1001 0011 0010
1111 1001 0011 0011
1000 0110 1100 1101
18
- 7676
1110 0010 0000 0011
1110 0010 0000 0100
1001 1101 1111 1100
19
- 4288
1110 1111 0011 1111
1110 1111 0100 0000
1001 0000 1100 0000
20
- 3636
1111 0001 1100 1011
1111 0001 1100 1100
1000 1110 0011 0100
21
- 1901
1111 1000 1001 0010
1111 1000 1001 0011
1000 0111 0110 1101
22
- 8076
1110 0000 0111 0011
1110 0000 0111 0100
1001 1111 1000 1100
23
- 5555
1110 1010 0100 1100
1110 1010 0100 1101
1001 0101 1011 0011
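The three representations can be generated with a short sketch: one's complement inverts every bit of the magnitude, two's complement adds 1 to that, and sign and magnitude sets the top bit. The function name `negative_representations` is illustrative.

```python
def negative_representations(value, width=16):
    """Return (one's complement, two's complement, sign and magnitude) of a negative value."""
    magnitude = abs(value)
    mask = (1 << width) - 1
    ones = ~magnitude & mask                       # invert every bit
    twos = (ones + 1) & mask                       # one's complement plus 1
    sign_magnitude = (1 << (width - 1)) | magnitude

    def group(v):
        bits = format(v, f"0{width}b")
        return " ".join(bits[i:i + 4] for i in range(0, width, 4))

    return group(ones), group(twos), group(sign_magnitude)


print(negative_representations(-4095))
# ('1111 0000 0000 0000', '1111 0000 0000 0001', '1000 1111 1111 1111')
```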
12. Numbers representations
Assuming the hexadecimal number represents a 16-bit signed binary number, convert it to a decimal number. The three result columns give the decimal value under each binary notation. The first line is a solved example.
No.
Hexadecimal Number
One’s Complement
Two’s Complement
Sign and Magnitude
1
64
100
100
100
2
8FFF
- 28672
- 28673
- 4095
3
6ABC
27324
27324
27324
4
F0F0
- 3855
- 3856
- 28912
5
FC92
- 877
- 878
- 31890
6
FAB0
-1359
- 1360
- 31408
7
2000
8192
8192
8192
8
EEEE
- 4369
- 4370
- 28398
9
D012
- 12269
- 12270
- 20498
10
8111
- 32494
- 32495
- 273
11
7FFF
32767
32767
32767
12
BAD1
- 17710
- 17711
- 15057
13
A000
- 24575
- 24576
- 8192
14
E900
- 5887
- 5888
- 26880
15
D780
- 10367
- 10368
-22400
16
9BBB
- 25668
- 25669
- 7099
17
AAAA
- 21845
- 21846
- 10922
18
6819
26649
26649
26649
19
8750
- 30895
- 30896
- 1872
20
C009
- 16374
- 16375
-16393
21
ABCD
- 21554
- 21555
- 11213
22
DFEF
- 8208
- 8209
- 24559
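The three interpretations of a 16-bit pattern can be computed with a short sketch. When the top bit is 0 all three agree; otherwise each notation gives a different negative value. The function name `signed_values` is illustrative.

```python
def signed_values(hex_text, width=16):
    """Return the value as (one's complement, two's complement, sign and magnitude)."""
    raw = int(hex_text, 16)
    mask = (1 << width) - 1
    if not raw >> (width - 1):          # sign bit clear: a plain positive number
        return raw, raw, raw
    ones = -(~raw & mask)               # invert the bits to get the magnitude
    twos = raw - (1 << width)           # subtract 2^width
    sign_magnitude = -(raw & (mask >> 1))
    return ones, twos, sign_magnitude


print(signed_values("64"))    # (100, 100, 100)
print(signed_values("8FFF"))  # (-28672, -28673, -4095)
```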
13. Adding binary numbers
The first line is a solved example. All values are binary numbers.
No.
First Number
Second Number
Result
1
0011 0000 1101 0100
0010 0111 0000 0011
0101 0111 1101 0111
2
0010 0111 0000 1111
0010 1110 1110 1100
0101 0101 1111 1011
3
0000 1010 1110 0100
0010 1110 0000 1000
0011 1000 1110 1100
4
0010 1011 0110 0111
0000 0010 0010 1011
0010 1101 1001 0010
5
0001 1111 1111 1111
0001 1111 1111 1111
0011 1111 1111 1110
6
0000 1111 1111 1111
0010 1010 1100 1010
0011 1010 1100 1001
7
0010 1010 0101 1010
0000 0111 1111 1111
0011 0010 0101 1001
8
0001 1010 1011 1100
0001 1011 1100 1101
0011 0110 1000 1001
9
0001 1111 1110 1111
0000 1111 1101 1011
0010 1111 1100 1010
10
0011 1011 1010 1101
0001 1101 1101 1101
0101 1001 1000 1010
11
0101 1001 1001 0011
0001 1111 1111 1111
0111 1001 1001 0010
12
0001 0010 0011 0100
0001 0010 0011 0100
0010 0100 0110 1000
13
0100 1010 1011 0101
0010 0101 0111 1011
0111 0000 0011 0000
14
0011 1111 1111 1111
0011 1111 1111 1111
0111 1111 1111 1110
15
0001 1010 1010 1010
0010 1101 1101 1101
0100 1000 1000 0111
16
0100 1110 1110 1110
0001 1100 1100 1100
0110 1011 1011 1010
17
0101 0101 0101 0101
0001 1111 1111 1111
0111 0101 0101 0100
18
0001 1001 1111 1011
0011 1111 1011 1001
0101 1001 1011 0100
19
0011 1010 0110 0111
0001 0110 0111 1010
0101 0000 1110 0001
20
0001 1111 1011 1110
0011 1010 1100 1101
0101 1010 1000 1011
21
0101 1010 1000 1011
0000 1010 1000 1011
0110 0101 0001 0110
22
0010 0111 0111 0111
0010 1011 1011 1011
0101 0011 0011 0010
23
0100 0100 0100 0100
0001 1100 1100 1100
0110 0001 0001 0000
14. Adding Hexadecimal unsigned numbers
The first line is a solved example. All values are hexadecimal numbers.
No.
First Number
Second Number
Result
1
7895
4ABC
C351
2
1DDD
2CCC
4AA9
3
3DEF
2999
6788
4
42BD
2BFC
6EB9
5
4623
3BDE
8201
6
23DD
3D1F
60FC
7
5A96
1E0F
78A5
8
8B1E
1ED6
A9F4
9
22FF
3DF9
60F8
10
2D2D
3F3F
6C6C
11
3DDE
3EED
7CCB
12
17EF
6E3F
862E
13
9ABC
1234
ACF0
14
5555
6789
BCDE
15
1FFF
6BAD
8BAC
16
6841
30FB
993C
17
351D
4FE2
84FF
18
10FF
3EC7
4FC6
19
22EE
3DC9
60B7
20
6699
1EB5
854E
21
3D8E
3F0F
7C9D
22
90E9
2E3F
BF28
23
4BA4
5CB5
A859
15. Adding Octal unsigned numbers
The first line is a solved example. All values are octal numbers.
No.
First Number
Second Number
Result
1
3125
5632
10757
2
7777
1111
11110
3
21430
6705
30335
4
23561
17777
43560
5
36217
23567
62006
6
15374
25517
43113
7
52527
12774
65523
8
22337
33557
56116
9
17652
21576
41450
10
11666
55645
67533
11
21766
51622
73610
12
62357
22366
104745
13
12345
12345
24712
14
76543
1111
77654
15
32145
12306
44453
16
17365
36547
56134
17
47147
15647
65016
18
16736
26576
45534
19
26077
25716
54015
20
37437
26726
66365
21
42637
20675
63534
22
33512
32645
66357
23
32325
35353
67700
16. Adding Base 3 unsigned numbers
The first line is a solved example. All values are base 3 numbers.
No.
First Number
Second Number
Result
1
111020
11221
200011
2
11101
100022
111200
3
10121
20120
101011
4
10202
21111
102020
5
11010
22020
110100
6
11120
21220
110110
7
101201
20111
122012
8
101011
102121
210202
9
21101
21102
112210
10
21201
21220
120121
11
22020
20120
112210
12
101120
20021
121211
13
102010
2001
111011
14
100212
20011
121000
15
101210
21002
122212
16
20101
12201
110002
17
22122
20120
120012
18
101201
21110
200011
19
100110
20111
120221
20
21202
12111
111020
21
21211
101010
122221
22
102101
11121
120222
23
22002
101011
200020
17. Adding Base 4 unsigned numbers
The first line is a solved example. All values are base 4 numbers.
No.
First Number
Second Number
Result
1
30110
10320
101030
2
22323
11120
100103
3
20223
11031
31320
4
21223
11322
33211
5
22333
10332
33331
6
30312
12233
103211
7
30112
10331
101103
8
22100
20311
103011
9
32011
12003
110020
10
31221
2001
33222
11
23331
22032
112023
12
33031
12300
111331
13
31320
22122
120102
14
32211
23020
121231
15
22323
21121
110110
16
23202
30322
120130
17
30230
13212
110102
18
32003
23113
121122
19
32331
22233
121230
20
33220
13310
113130
21
103102
3222
112330
22
101113
11031
112210
23
31333
30122
122121
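The additions in exercises 14 to 17 can be checked with one sketch that parses both operands in the given base, adds them, and converts the sum back. Python's built-in `int(text, base)` does the parsing; the names `DIGITS` and `add_in_base` are illustrative.

```python
DIGITS = "0123456789ABCDEF"


def add_in_base(first, second, base):
    """Add two unsigned numbers written in the given base (2 to 16)."""
    total = int(first, base) + int(second, base)
    if total == 0:
        return "0"
    out = []
    while total:
        total, remainder = divmod(total, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(add_in_base("7895", "4ABC", 16))    # C351
print(add_in_base("3125", "5632", 8))     # 10757
print(add_in_base("111020", "11221", 3))  # 200011
print(add_in_base("30110", "10320", 4))   # 101030
```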
18. Multiplying binary numbers
All are unsigned 16-bit numbers. The first line is a solved example. All values are binary numbers.
No.
First Number
Second Number
Result
1
1111 1111
110 0010
110 0001 1001 1110
2
10 0011
101 0110
1011 1100 0010
3
100 1000
1000 1010
10 0110 1101 0000
4
1 1010 1011 0010
1101
101 1011 0000 1010
5
101 1101 1111
1 0011
110 1111 1000 1101
6
100 0000 1111
1 1111
111 1101 1101 0001
7
11 1100 1011
1001
10 0010 0010 0011
8
1 1000 0011
1 1101
10 1011 1101 0111
9
111 1000 1001
1101
110 0001 1111 0101
10
1101 1101 1110
110
101 0011 0011 0100
11
1 0000 1000 1001
101
101 0010 1010 1101
12
10 0011 1001
1011
1 1000 0111 0011
13
1010 1011 1011
111
100 1011 0001 1101
14
1 1010 1011
1 0010
1 1110 0000 0110
15
11 0110 1001
1110
10 1111 1011 1110
16
101 0101 0010
1 1110
1001 1111 1001 1100
17
1110 1110 0010
1011
1010 0011 1011 0110
18
1011 1010 1101
1110
1010 0011 0111 0110
19
1011 0000 0000
1101
1000 1111 0000 0000
20
100 1000 0011
1 1011
111 1001 1101 0001
21
1011 1011 0001
110
100 0110 0010 0110
22
1111 1111
1111 1111
1111 1110 0000 0001
23
1011 1011
1011 1011
1000 1000 1001 1001
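A quick way to check a row of the multiplication table is to multiply the operands as integers and keep only the low 16 bits. This is a small sketch under that assumption; `mul16` is a hypothetical helper name, not something defined in the booklet.

```python
def mul16(first: str, second: str) -> str:
    """Multiply two unsigned binary strings and keep the low 16 bits of the product."""
    a = int(first.replace(" ", ""), 2)
    b = int(second.replace(" ", ""), 2)
    product = (a * b) & 0xFFFF  # truncate to 16 bits
    bits = format(product, "b")
    # Group the bits into nibbles from the right, as printed in the table.
    groups = []
    while bits:
        groups.append(bits[-4:])
        bits = bits[:-4]
    return " ".join(reversed(groups))


print(mul16("1111 1111", "110 0010"))  # solved example: 110 0001 1001 1110
```

Row 4 is a useful second check because its full product needs 17 bits, so the truncation is visible.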
19. Floating point (754) notation (signed integers)
Convert between the numbers. First line is a solved example.
No.
Decimal Number
Floating point Number
1
1
3F800000
2
2
40000000
3
3
40400000
4
5
40A00000
5
11
41300000
6
17
41880000
7
24
41C00000
8
28
41E00000
9
32
42000000
10
49
42440000
11
59
426C0000
12
65
42820000
13
79
429E0000
14
85
42AA0000
15
99
42C60000
16
127
42FE0000
17
238
436E0000
18
256
43800000
19
257
43808000
20
313
439C8000
21
355
43B18000
22
485
43F28000
23
490
43F50000
24
515
4400C000
25
533
44054000
26
-4
C0800000
27
-7
C0E00000
28
-13
C1500000
29
-19
C1980000
30
-25
C1C80000
31
-31
C1F80000
32
-37
C2140000
33
-43
C22C0000
34
-55
C25C0000
35
-67
C2860000
36
-73
C2920000
37
-79
C29E0000
38
-86
C2AC0000
39
-93
C2BA0000
40
-101
C2CA0000
41
-118
C2EC0000
42
-197
C3450000
43
-241
C3710000
44
-267
C3858000
45
-309
C39A8000
46
-333
C3A68000
47
-400
C3C80000
48
-499
C3F98000
49
-525
C4034000
50
-541
C4074000
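The conversions in exercises 19 and 20 can be verified with Python's standard `struct` module, which packs a value as an IEEE 754 single-precision word. The sketch below is illustrative only; the helper names `float_to_hex`, `hex_to_float` and `fields` are assumptions made for this example.

```python
import struct


def float_to_hex(value: float) -> str:
    """Return the 32-bit IEEE 754 encoding of value as 8 hex digits."""
    return struct.pack(">f", value).hex().upper()


def hex_to_float(word: str) -> float:
    """Decode 8 hex digits as a 32-bit IEEE 754 number."""
    return struct.unpack(">f", bytes.fromhex(word))[0]


def fields(word: str) -> tuple:
    """Split a 32-bit word into (sign, biased exponent, 23-bit mantissa)."""
    bits = int(word, 16)
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


print(float_to_hex(1.0))         # 3F800000
print(hex_to_float("42F78000"))  # 123.75
print(fields("C1500000"))        # (1, 130, 5242880), i.e. -1.101b * 2^(130-127) = -13
```

Splitting a word into its three fields is usually the fastest way to locate a mistake in a manual conversion.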
20. Floating point (754) notation (signed fractions)
Convert the numbers. First line is a solved example.
No.
Decimal Number
Floating point Number
1
123.75
42F78000
2
76.25
42988000
3
77.5
429B0000
4
64.3125
4280A000
5
45.875
42378000
6
75.75
42978000
7
37.375
42158000
8
19.25
419A0000
9
31.125
41F90000
10
0.625
3F200000
11
49.875
42478000
12
63.5
427E0000
13
99.25
42C68000
14
12.125
41420000
15
24.5
41C40000
16
31.625
41FD0000
17
88.5
42B10000
18
73.25
42928000
19
1.125
3F900000
20
3.375
40580000
21
19.75
419E0000
22
66.875
4285C000
23
54.625
425A8000
24
27.25
41DA0000
25
-89.125
C2B24000
26
-25.625
C1CD0000
27
-0.125
BE000000
28
-8.125
C1020000
29
-25.25
C1CA0000
30
-26.375
C1D30000
31
-87.5
C2AF0000
32
-63.75
C27F0000
33
-41.25
C2250000
34
-0.3125
BEA00000
35
-5.875
C0BC0000
36
-33.375
C2058000
37
-12.125
C1420000
38
-68.875
C289C000
39
-75.75
C2978000
40
-1.75
BFE00000
41
-66.625
C2854000
42
-49.5
C2460000
43
-25.75
C1CE0000
44
-99.125
C2C64000
45
-0.875
BF600000
46
-61.25
C2750000
47
-39.375
C21D8000
48
-25.25
C1CA0000
49
-53.5
C2560000
50
-71.125
C28E4000
21. Adding Floating point (754) numbers
Add the floating-point numbers, then convert them to decimal, add the decimal numbers and check against the floating-point result obtained. First line is a solved example.
No.
First Number (floating point)
Second Number (floating point)
Result (floating point)
First Number (decimal)
Second Number (decimal)
Result (decimal)
1
41700000
420C0000
42480000
15.0
35.0
50.0
2
C14C0000
40300000
C1200000
-12.75
2.75
-10.0
3
418E0000
41B20000
42200000
17.75
22.25
40.0
4
3F600000
42C64000
42C80000
0.875
99.125
100.0
5
C2AF0000
422D0000
C2310000
-87.5
43.25
-44.25
6
429A0000
42080000
42DE0000
77.0
34.0
111.0
7
44400000
43800000
44800000
768.0
256.0
1024.0
8
41CA0000
43BBB000
43C85000
25.25
375.375
400.625
9
414A0000
41460000
41C80000
12.625
12.375
25.0
10
C1EC0000
3F000000
C1E80000
-29.5
0.5
-29.0
11
C18E0000
C20B0000
C2520000
-17.75
-34.75
-52.5
12
BE400000
BF500000
BF800000
-0.1875
-0.8125
-1.0
13
41DD0000
C0500000
41C30000
27.625
-3.25
24.375
14
42C00000
41C00000
42F00000
96.0
24.0
120.0
15
440B0800
43008000
442B2800
556.125
128.5
684.625
16
42C80000
41100000
42DA0000
100.0
9.0
109.0
17
41E40000
3FC00000
41F00000
28.5
1.5
30.0
18
BF400000
40300000
40000000
-0.75
2.75
2.0
19
413E0000
411E0000
41AE0000
11.875
9.875
21.75
20
C0FC0000
41140000
3FB00000
-7.875
9.25
1.375
21
40C40000
40C40000
41440000
6.125
6.125
12.25
22
40B80000
C1340000
C0B00000
5.75
-11.25
-5.5
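The check described in this exercise (decode both operands, operate in decimal, re-encode the result) can be sketched as follows. The helper `check_row` is hypothetical. Python computes in double precision and rounds once when packing, which agrees with single-precision arithmetic for the exactly representable values used in these tables.

```python
import operator
import struct


def hex_to_float(word: str) -> float:
    """Decode 8 hex digits as a 32-bit IEEE 754 number."""
    return struct.unpack(">f", bytes.fromhex(word))[0]


def float_to_hex(value: float) -> str:
    """Encode value as a 32-bit IEEE 754 word, printed as 8 hex digits."""
    return struct.pack(">f", value).hex().upper()


def check_row(first_hex: str, second_hex: str, op=operator.add) -> tuple:
    """Return (result word, first decimal, second decimal, decimal result)."""
    first = hex_to_float(first_hex)
    second = hex_to_float(second_hex)
    result = op(first, second)
    return float_to_hex(result), first, second, result


print(check_row("41700000", "420C0000"))
# solved example: ('42480000', 15.0, 35.0, 50.0)
```

Passing `op=operator.mul` gives the same check for the multiplication exercise, for example `check_row("41400000", "40800000", operator.mul)`.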
22. Multiplying Floating point (754) numbers
Multiply the floating-point numbers, then convert them to decimal, multiply the decimal numbers and check against the floating-point result obtained. First line is a solved example.
No.
First Number (floating point)
Second Number (floating point)
Result (floating point)
First Number (decimal)
Second Number (decimal)
Result (decimal)
1
40A00000
C0000000
C1200000
5
-2
-10
2
41400000
40800000
42400000
12
4
48
3
C0700000
C1000000
41F00000
-3.75
-8
30
4
40200000
40700000
41160000
2.5
3.75
9.375
5
42C80000
C17C0000
C4C4E000
100
-15.75
-1575
6
43000000
3FC00000
43400000
128
1.5
192
7
C2800000
C2000000
45000000
-64
-32
2048
8
40900000
40B00000
41C60000
4.5
5.5
24.75
9
44800000
44000000
49000000
1024
512
524288
10
C1C00000
42120000
C45B0000
-24
36.5
-876
11
41700000
42080000
43FF0000
15
34
510
12
411C0000
C0500000
C1FD8000
9.75
-3.25
-31.6875
13
44200000
3E000000
42A00000
640
0.125
80
14
C2960000
C0800000
43960000
-75
-4
300
15
417A0000
40900000
428CA000
15.625
4.5
70.3125
16
41940000
C22D0000
C4480800
18.5
-43.25
-800.125
17
C1900000
C2B60000
44CCC000
-18
-91
1638
18
C2F00000
C2600000
45D20000
-120
-56
6720
19
41C80000
41C00000
44160000
25
24
600
20
C0C80000
42EC0000
C4386000
-6.25
118
-737.5
21
414A0000
41840000
43505000
12.625
16.5
208.3125
23. BCD numbers (8421, 2421)
All are unsigned 16-bit numbers. First line is a solved example.
No.
Decimal Number
BCD Number (8421)
BCD Number (2421)
1
1234
0001 0010 0011 0100
0001 0010 0011 0100
2
2468
0010 0100 0110 1000
0010 0100 1100 1110
3
7136
0111 0001 0011 0110
1101 0001 0011 1100
4
5497
0101 0100 1001 0111
1011 0100 1111 1101
5
3780
0011 0111 1000 0000
0011 1101 1110 0000
6
9026
1001 0000 0010 0110
1111 0000 0010 1100
7
4512
0100 0101 0001 0010
0100 1011 0001 0010
8
7462
0111 0100 0110 0010
1101 0100 1100 0010
9
3297
0011 0010 1001 0111
0011 0010 1111 1101
10
4582
0100 0101 1000 0010
0100 1011 1110 0010
11
1097
0001 0000 1001 0111
0001 0000 1111 1101
12
7651
0111 0110 0101 0001
1101 1100 1011 0001
13
1928
0001 1001 0010 1000
0001 1111 0010 1110
14
8763
1000 0111 0110 0011
1110 1101 1100 0011
15
7329
0111 0011 0010 1001
1101 0011 0010 1111
16
8461
1000 0100 0110 0001
1110 0100 1100 0001
17
1357
0001 0011 0101 0111
0001 0011 1011 1101
18
9042
1001 0000 0100 0010
1111 0000 0100 0010
19
6703
0110 0111 0000 0011
1100 1101 0000 0011
20
2730
0010 0111 0011 0000
0010 1101 0011 0000
21
4587
0100 0101 1000 0111
0100 1011 1110 1101
22
9768
1001 0111 0110 1000
1111 1101 1100 1110
23
8264
1000 0010 0110 0100
1110 0010 1100 0100
24. BCD numbers (84-2-1, Excess-3)
All are unsigned 16-bit numbers. First line is a solved example.
No.
Decimal Number
BCD Number (84-2-1)
BCD Number (Excess-3)
1
1234
0111 0110 0101 0100
0100 0101 0110 0111
2
2468
0110 0100 1010 1000
0101 0111 1001 1011
3
7136
1001 0111 0101 1010
1010 0100 0110 1001
4
5497
1011 0100 1111 1001
1000 0111 1100 1010
5
3780
0101 1001 1000 0000
0110 1010 1011 0011
6
9026
1111 0000 0110 1010
1100 0011 0101 1001
7
4512
0100 1011 0111 0110
0111 1000 0100 0101
8
7462
1001 0100 1010 0110
1010 0111 1001 0101
9
3297
0101 0110 1111 1001
0110 0101 1100 1010
10
4582
0100 1011 1000 0110
0111 1000 1011 0101
11
1097
0111 0000 1111 1001
0100 0011 1100 1010
12
7651
1001 1010 1011 0111
1010 1001 1000 0100
13
1928
0111 1111 0110 1000
0100 1100 0101 1011
14
8763
1000 1001 1010 0101
1011 1010 1001 0110
15
7329
1001 0101 0110 1111
1010 0110 0101 1100
16
8461
1000 0100 1010 0111
1011 0111 1001 0100
17
1357
0111 0101 1011 1001
0100 0110 1000 1010
18
9042
1111 0000 0100 0110
1100 0011 0111 0101
19
6703
1010 1001 0000 0101
1001 1010 0011 0110
20
2730
0110 1001 0101 0000
0101 1010 0110 0011
21
4587
0100 1011 1000 1001
0111 1000 1011 1010
22
9768
1111 1001 1010 1000
1100 1010 1001 1011
23
8264
1000 0110 1010 0100
1011 0101 1001 0111
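All four codes used in exercises 23 and 24 encode each decimal digit independently, so a lookup table per code is enough to reproduce the answers. This is a sketch only; the `CODES` table and `encode` helper are assumptions for illustration. The 2421 column uses the self-complementing assignment that matches the answers above (2421 allows other assignments for some digits).

```python
# One 4-bit pattern per decimal digit 0..9 for each code.
CODES = {
    "8421":     ["0000", "0001", "0010", "0011", "0100",
                 "0101", "0110", "0111", "1000", "1001"],
    "2421":     ["0000", "0001", "0010", "0011", "0100",
                 "1011", "1100", "1101", "1110", "1111"],
    "84-2-1":   ["0000", "0111", "0110", "0101", "0100",
                 "1011", "1010", "1001", "1000", "1111"],
    "excess-3": ["0011", "0100", "0101", "0110", "0111",
                 "1000", "1001", "1010", "1011", "1100"],
}


def encode(number: int, code: str) -> str:
    """Encode a 4-digit decimal number digit by digit in the chosen code."""
    return " ".join(CODES[code][int(digit)] for digit in f"{number:04d}")


for name in CODES:
    print(name, encode(1234, name))
```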
Chapter 4 – Central Processing Unit exercises
1. Architectures
In the following questions the answer should relate to a Stack-based architecture, an Accumulator-based architecture, a Memory-Register architecture and a Register-Register architecture.
a) Write the instructions needed for executing the following formula: C = 2A + 3B
Where A, B, C are variables. Try to optimize the code (minimum instructions possible).
Stack-based architecture
First version - the simple and straightforward solution:
# First Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Pop C           # C=2A
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push B          # Push B on TOS
Add             # 3B is stored on TOS
Push C          # Push C (contains 2A) on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B
Since the stack can contain several variables, it is a waste to move 2A into variable C and then load it once again. The second version is a little shorter.
# Second Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push B          # Push B on TOS
Add             # 3B is stored on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B
The third version uses a multiply instruction, thus it is even shorter.
# Third Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push 3          # Push constant 3 on TOS
Push B          # Push variable B on TOS
Mult            # 3B is stored on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B
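The stack-based listings above can be traced with a very small interpreter. The following is a minimal sketch, not the book's simulator: `run_stack_program` and `THIRD_VERSION` are hypothetical names, and only the `Push`, `Pop`, `Add` and `Mult` instructions used in exercise a) are supported.

```python
def run_stack_program(program: str, memory: dict) -> dict:
    """Execute Push/Pop/Add/Mult instructions against a stack and a variable memory."""
    stack = []
    for line in program.strip().splitlines():
        parts = line.split("#")[0].split()  # drop the trailing comment, if any
        if not parts:
            continue
        op = parts[0].lower()
        if op == "push":
            arg = parts[1]
            # A variable name is read from memory; anything else is a constant.
            stack.append(memory[arg] if arg in memory else int(arg))
        elif op == "pop":
            memory[parts[1]] = stack.pop()
        elif op in ("add", "mult"):
            top, second = stack.pop(), stack.pop()
            stack.append(top + second if op == "add" else top * second)
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return memory


THIRD_VERSION = """
Push A
Push A
Add
Push 3
Push B
Mult
Add
Pop C
"""

print(run_stack_program(THIRD_VERSION, {"A": 5, "B": 7})["C"])  # 2*5 + 3*7 = 31
```

Running the first and second versions through the same function should give the same value of C, which is a convenient way to confirm that a shorter listing is still correct.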
Accumulator-based architecture
As with the previous case the first version is straightforward.
# Version 1
Load A          # Load variable A into the Accumulator
Add A           # Accumulator = 2A
Add B           # Accumulator = 2A+B
Add B           # Accumulator = 2A+2B
Add B           # Accumulator = 2A+3B
Store C         # C=2A+3B
Version 2 uses the multiply instruction.
# Version 2
Load B          # Load variable B into the Accumulator
Mult 3          # Accumulator = 3B
Add A           # Accumulator = 3B+A
Add A           # Accumulator = 3B+2A
Store C         # C=2A+3B
Version 3 uses the multiply instruction with a slightly different algorithm. It does not decrease the number of instructions.
# Version 3
Load A          # Load variable A into the Accumulator
Add B           # Accumulator = A+B
Mult 2          # Accumulator = 2A+2B
Add B           # Accumulator = 2A+3B
Store C         # C=2A+3B
Memory-Register architecture
As with previous cases the first version is simple.
# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Add R1,A        # R1=2A
Add R2,B        # R2=2B
Add R2,B        # R2=3B
Store Temp,R1   # Temp variable contains 2A
Add R2,Temp     # R2=3B+2A
Store C,R2      # C=2A+3B
The second version uses the multiply instruction, assuming one of its operands can be in memory or a constant.
# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Mult R1,2       # R1=2A
Mult R2,3       # R2=3B
Store Temp,R1   # Temp variable = 2A
Add R2,Temp     # R2=2A+3B
Store C,R2      # C=2A+3B
Register-Register architecture
As with previous cases the first version is simple.
# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Add R1,R1,R1    # R1=2A
Add R3,R2,R2    # R3=2B
Add R3,R3,R2    # R3=3B
Add R3,R3,R1    # R3=3B+2A
Store C,R3      # C=2A+3B
The second version uses the multiply instruction, assuming one of its operands can be a constant.
# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Mult R1,R1,2    # R1=2A
Mult R2,R2,3    # R2=3B
Add R3,R1,R2    # R3=2A+3B
Store C,R3      # C=2A+3B
As with the Accumulator-based architecture, using a different algorithm does not produce a better solution.
b) Write the instructions needed for executing the following formula: Sum = 8A + 4B + 2C + D
Where A, B, C, D and Sum are variables.
Stack-based architecture
First version - the simple and straightforward solution:
# First Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push B          # Push variable B on TOS
Add             # 2A+B is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 4A+2B is stored on TOS
Push C          # Push C on TOS
Add             # 4A+2B+C is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 8A+4B+2C is stored on TOS
Push D          # D is stored on TOS
Add             # 8A+4B+2C+D is on TOS
Pop Sum         # Sum = 8A+4B+2C+D
The same program can be written using Polish notation, but this does not shorten the program.
# Second Version – Polish notation
Push D          # Push variable D on TOS
Push 2          # Push 2 on TOS
Push C          # Push variable C on TOS
Push 2          # Push 2 on TOS
Push B          # Push variable B on TOS
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Add             # 2A+B is stored on TOS
Mult            # 4A+2B is stored on TOS
Add             # 4A+2B+C is stored on TOS
Mult            # 8A+4B+2C is stored on TOS
Add             # 8A+4B+2C+D is stored on TOS
Pop Sum         # Sum = 8A+4B+2C+D
Accumulator-based architecture
As with the previous case the first version is straightforward.
# Version 1
Load A          # Load variable A into the Accumulator
Mult 8          # Accumulator = 8A
Store Sum       # Sum = 8A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add Sum         # Accumulator = 8A+4B
Store Sum       # Sum = 8A+4B
Load C          # Accumulator = C
Mult 2          # Accumulator = 2C
Add D           # Accumulator = 2C+D
Add Sum         # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D
Version 2 is slightly better. There is no need to store 8A+4B, which saves two instructions.
# Version 2
Load A          # Load variable A into the Accumulator
Mult 8          # Accumulator = 8A
Store Sum       # Sum = 8A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add Sum         # Accumulator = 8A+4B
Add C           # Accumulator = 8A+4B+C
Add C           # Accumulator = 8A+4B+2C
Add D           # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D
Version 3 uses a slightly different algorithm, which is more efficient.
# Version 3
Load A          # Load variable A into the Accumulator
Add A           # Accumulator = 2A
Add B           # Accumulator = 2A+B
Mult 2          # Accumulator = 4A+2B
Add C           # Accumulator = 4A+2B+C
Mult 2          # Accumulator = 8A+4B+2C
Add D           # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
# Version 1
Load R1,A       # Load variable A into R1
Mult R1,8       # R1=8A
Store Sum,R1    # Sum=8A
Load R2,B       # R2=B
Mult R2,4       # R2=4B
Add R2,Sum      # R2=8A+4B
Add R2,C        # R2=8A+4B+C
Add R2,C        # R2=8A+4B+2C
Add R2,D        # R2=8A+4B+2C+D
Store Sum,R2    # Sum=8A+4B+2C+D
The second version is similar to the third Accumulator-based version. Although many registers are available, this implementation uses just one.
# Version 2
Load R1,A       # Load variable A into R1
Add R1,A        # R1=2A
Add R1,B        # R1=2A+B
Mult R1,2       # R1=4A+2B
Add R1,C        # R1=4A+2B+C
Mult R1,2       # R1=8A+4B+2C
Add R1,D        # R1=8A+4B+2C+D
Store Sum,R1    # Sum=8A+4B+2C+D
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers.
# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Load R6,4       # Load the constant 4 into R6
Load R7,8       # Load the constant 8 into R7
Mult R1,R1,R7   # R1=8A
Mult R2,R2,R6   # R2=4B
Mult R3,R3,R5   # R3=2C
Add R4,R4,R1    # R4=8A+D
Add R4,R4,R2    # R4=8A+4B+D
Add R4,R4,R3    # R4=8A+4B+2C+D
Store Sum,R4    # Sum=8A+4B+2C+D
The second version is slightly better.
# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Add R1,R1,R1    # R1=2A
Add R6,R1,R2    # R6=2A+B
Mult R6,R6,R5   # R6=4A+2B
Add R6,R6,R3    # R6=4A+2B+C
Mult R6,R6,R5   # R6=8A+4B+2C
Add R6,R6,R4    # R6=8A+4B+2C+D
Store Sum,R6    # Sum=8A+4B+2C+D
c) Write the instructions needed for executing the following formula: Sum = A + 2B + 4C + 8D
Where A, B, C, D and Sum are variables.
This exercise is identical to the previous one; only the variables are different. Instead of the variables A, B, C, D we have to use D, C, B, A (respectively) and the previous solutions will fit.
d) Write the instructions needed for executing the following formula: Sum = 2(A + 2B) – 3(C + 2D)
Where A, B, C, D and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push C          # Push C on TOS
Push D          # Push D on TOS
Add             # C+D is stored on TOS
Push D          # D is stored on TOS
Add             # C+2D is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3(C+2D) is on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Add             # A+B is stored on TOS
Push B          # Push variable B on TOS
Add             # A+2B is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 2(A+2B) is stored on TOS
Sub             # 2(A+2B)-3(C+2D) is stored on TOS
Pop Sum         # Sum = 2(A+2B)-3(C+2D)
Accumulator-based architecture
# Version 1
Load A          # Load variable A into the Accumulator
Add B           # Accumulator = A+B
Add B           # Accumulator = A+2B
Mult 2          # Accumulator = 2(A+2B)
Store Sum       # Sum = 2(A+2B)
Load C          # Accumulator = C
Add D           # Accumulator = C+D
Add D           # Accumulator = C+2D
Mult 3          # Accumulator = 3(C+2D)
Store Tmp       # Temporary location = 3(C+2D)
Load Sum        # Accumulator = 2(A+2B)
Sub Tmp         # Accumulator = 2(A+2B)-3(C+2D)
Store Sum       # Sum = 2(A+2B)-3(C+2D)
Version 2 is slightly different; it uses the expanded formula:
Sum = 2(A+2B)-3(C+2D) = 2A+4B-3C-6D
Unfortunately it is worse (17 instructions compared to 13 in version 1).
# Version 2
Load A          # Load variable A into the Accumulator
Mult 2          # Accumulator = 2A
Store A         # A=2A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add A           # Accumulator = 2A+4B
Store Sum       # Sum = 2A+4B
Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C         # C=3C
Load D          # Accumulator = D
Mult 6          # Accumulator = 6D
Add C           # Accumulator = 3C+6D
Store Tmp       # Temporary location = 3C+6D
Load Sum        # Accumulator = 2A+4B
Sub Tmp         # Accumulator = 2A+4B-3C-6D
Store Sum       # Sum = 2(A+2B)-3(C+2D)
Changing the order of the calculation can save some instructions; nevertheless version 1 is still better.
# Version 3
Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C         # C=3C
Load D          # Accumulator = D
Mult 6          # Accumulator = 6D
Add C           # Accumulator = 3C+6D
Store Tmp       # Temporary location = 3C+6D
Load A          # Load variable A into the Accumulator
Mult 2          # Accumulator = 2A
Store A         # A=2A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add A           # Accumulator = 2A+4B
Sub Tmp         # Accumulator = 2A+4B-3C-6D
Store Sum       # Sum = 2(A+2B)-3(C+2D)
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,B       # Load variable B into R1
Add R1,B        # R1=2B
Add R1,A        # R1=A+2B
Mult R1,2       # R1=2(A+2B)
Store Sum,R1    # Sum=2(A+2B)
Load R2,D       # R2=D
Add R2,D        # R2=2D
Add R2,C        # R2=C+2D
Mult R2,3       # R2=3(C+2D)
Store Tmp,R2    # Tmp=3(C+2D)
Load R1,Sum     # R1=2(A+2B)
Sub R1,Tmp      # R1=2(A+2B)-3(C+2D)
Store Sum,R1    # Sum=2(A+2B)-3(C+2D)
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Load R6,3       # Load the constant 3 into R6
Add R1,R1,R2    # R1=A+B
Add R1,R1,R2    # R1=A+2B
Mult R1,R1,R5   # R1=2(A+2B)
Add R3,R3,R4    # R3=C+D
Add R3,R3,R4    # R3=C+2D
Mult R3,R3,R6   # R3=3(C+2D)
Sub R1,R1,R3    # R1=2(A+2B)-3(C+2D)
Store Sum,R1    # Sum=2(A+2B)-3(C+2D)
e) Write the instructions needed for executing the following formula: Sum = 2AB + 3CD
Where A, B, C, D and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push C          # Push C on TOS
Push D          # Push D on TOS
Mult            # CD is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3CD is on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Mult            # AB is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 2AB is stored on TOS
Add             # 2AB+3CD is stored on TOS
Pop Sum         # Sum = 2AB+3CD
Accumulator-based architecture
Load C          # Load variable C into the Accumulator
Mult D          # Accumulator = CD
Mult 3          # Accumulator = 3CD
Store Tmp       # Tmp = 3CD
Load A          # Accumulator = A
Mult B          # Accumulator = AB
Mult 2          # Accumulator = 2AB
Add Tmp         # Accumulator = 2AB+3CD
Store Sum       # Sum = 2AB+3CD
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,A       # Load variable A into R1
Mult R1,B       # R1=AB
Mult R1,2       # R1=2AB
Store Tmp,R1    # Tmp = 2AB
Load R2,C       # R2=C
Mult R2,D       # R2=CD
Mult R2,3       # R2=3CD
Add R2,Tmp      # R2=2AB+3CD
Store Sum,R2    # Sum=2AB+3CD
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,D       # R4=D
Load R5,2       # R5=2
Load R6,3       # R6=3
Mult R1,R1,R2   # R1=AB
Mult R1,R1,R5   # R1=2AB
Mult R3,R3,R4   # R3=CD
Mult R3,R3,R6   # R3=3CD
Add R3,R3,R1    # R3=2AB+3CD
Store Sum,R3    # Sum=2AB+3CD
f) Write the instructions needed for executing the following formula: Sum = 3(A+B)*(C+D)
Where A, B, C, D and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push C          # Push C on TOS
Push D          # Push D on TOS
Add             # C+D is stored on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Add             # A+B is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3(A+B) is on TOS
Mult            # 3(A+B)*(C+D) is stored on TOS
Pop Sum         # Sum = 3(A+B)*(C+D)
Accumulator-based architecture
Load C          # Load variable C into the Accumulator
Add D           # Accumulator = C+D
Store Tmp       # Tmp = C+D
Load A          # Accumulator = A
Add B           # Accumulator = A+B
Mult 3          # Accumulator = 3(A+B)
Mult Tmp        # Accumulator = 3(A+B)*(C+D)
Store Sum       # Sum = 3(A+B)*(C+D)
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,A       # Load variable A into R1
Add R1,B        # R1=A+B
Mult R1,3       # R1=3(A+B)
Store Tmp,R1    # Tmp = 3(A+B)
Load R2,C       # R2=C
Add R2,D        # R2=C+D
Mult R2,Tmp     # R2=3(A+B)*(C+D)
Store Sum,R2    # Sum=3(A+B)*(C+D)
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,D       # R4=D
Load R5,3       # R5=3
Add R1,R1,R2    # R1=A+B
Mult R1,R1,R5   # R1=3(A+B)
Add R3,R3,R4    # R3=C+D
Mult R3,R3,R1   # R3=3(A+B)*(C+D)
Store Sum,R3    # Sum=3(A+B)*(C+D)
g) Write the instructions needed for executing the following formula: Sum = (A+2B+3C)*(3A+2B+C)
Where A, B, C and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push C          # Push C on TOS
Push 3          # Push 3 on TOS
Mult            # 3C is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push A          # A is stored on TOS
Add             # A+2B is on TOS
Add             # A+2B+3C is stored on TOS
Push A          # Push A on TOS
Push 3          # Push 3 on TOS
Mult            # 3A is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push C          # C is stored on TOS
Add             # 2B+C is on TOS
Add             # 3A+2B+C is stored on TOS
Mult            # (3A+2B+C)*(A+2B+3C) is stored on TOS
Pop Sum         # Sum = (3A+2B+C)*(A+2B+3C)
Accumulator-based architecture
Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C3        # Variable C3=3C
Load B          # Accumulator = B
Add B           # Accumulator = 2B
Store B2        # Variable B2=2B
Load A          # Accumulator = A
Add B2          # Accumulator = A+2B
Add C3          # Accumulator = A+2B+3C
Store Part1     # Variable Part1 = A+2B+3C
Load A          # Accumulator = A
Mult 3          # Accumulator = 3A
Store A3        # Variable A3=3A
Load C          # Accumulator = C
Add B2          # Accumulator = C+2B
Add A3          # Accumulator = C+2B+3A
Mult Part1      # Accumulator = (C+2B+3A)*(A+2B+3C)
Store Sum       # Sum = (3A+2B+C)*(A+2B+3C)
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,A       # Load variable A into R1
Mult R1,3       # R1=3A
Add R1,B        # R1=3A+B
Add R1,B        # R1=3A+2B
Add R1,C        # R1=3A+2B+C
Store Tmp,R1    # Tmp = 3A+2B+C
Load R2,C       # R2=C
Mult R2,3       # R2=3C
Add R2,B        # R2=3C+B
Add R2,B        # R2=3C+2B
Add R2,A        # R2=3C+2B+A
Mult R2,Tmp     # R2=(3A+2B+C)*(A+2B+3C)
Store Sum,R2    # Sum=(3A+2B+C)*(A+2B+3C)
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,3       # R4=3
Mult R5,R3,R4   # R5 = 3C
Add R5,R5,R1    # R5 = 3C+A
Add R5,R5,R2    # R5 = 3C+A+B
Add R5,R5,R2    # R5 = 3C+A+2B
Mult R6,R1,R4   # R6 = 3A
Add R6,R6,R2    # R6 = 3A+B
Add R6,R6,R2    # R6 = 3A+2B
Add R6,R6,R3    # R6 = 3A+2B+C
Mult R6,R6,R5   # R6=(3A+2B+C)*(A+2B+3C)
Store Sum,R6    # Sum=(3A+2B+C)*(A+2B+3C)
h) Write the instructions needed for executing the following formula: Sum = A*B*C/(A+B+C)
Where A, B, C and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push C          # Push C on TOS
Push B          # Push B on TOS
Add             # C+B is stored on TOS
Push A          # Push variable A on TOS
Add             # A+B+C is stored on TOS
Push A          # A is stored on TOS
Push B          # B is stored on TOS
Mult            # A*B is on TOS
Push C          # Push C on TOS
Mult            # A*B*C is stored on TOS
Div             # A*B*C/(A+B+C) is stored on TOS
Pop Sum         # Sum = A*B*C/(A+B+C)
Accumulator-based architecture
Load C          # Accumulator = C
Add B           # Accumulator = B+C
Add A           # Accumulator = B+C+A
Store Tmp       # Variable Tmp = B+C+A
Load A          # Accumulator = A
Mult B          # Accumulator = AB
Mult C          # Accumulator = A*B*C
Div Tmp         # Accumulator = A*B*C/(A+B+C)
Store Sum       # Sum = A*B*C/(A+B+C)
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,A       # Load variable A into R1
Add R1,B        # R1=A+B
Add R1,C        # R1=A+B+C
Store Tmp,R1    # Tmp = A+B+C
Load R2,C       # R2=C
Mult R2,B       # R2=B*C
Mult R2,A       # R2=A*B*C
Div R2,Tmp      # R2=A*B*C/(A+B+C)
Store Sum,R2    # Sum=A*B*C/(A+B+C)
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Mult R5,R1,R2   # R5 = AB
Mult R5,R5,R3   # R5 = ABC
Add R4,R1,R2    # R4 = A+B
Add R4,R4,R3    # R4 = A+B+C
Div R5,R5,R4    # R5 = A*B*C/(A+B+C)
Store Sum,R5    # Sum = A*B*C/(A+B+C)
i) Write the instructions needed for executing the following formula: Sum = (A+B)*(B+C)*(A+C)
Where A, B, C and Sum are variables.
Stack-based architecture
Simple and straightforward solution:
Push A          # Push A on TOS
Push B          # Push B on TOS
Add             # A+B is stored on TOS
Push B          # Push variable B on TOS
Push C          # Push variable C on TOS
Add             # B+C is stored on TOS
Push A          # A is stored on TOS
Push C          # C is stored on TOS
Add             # A+C is on TOS
Mult            # (A+C)*(B+C) is stored on TOS
Mult            # (A+C)*(B+C)*(A+B) is stored on TOS
Pop Sum         # Sum = (A+C)*(B+C)*(A+B)
Accumulator-based architecture
Load A          # Accumulator = A
Add B           # Accumulator = A+B
Store TmpAB     # Variable TmpAB = A+B
Load B          # Accumulator = B
Add C           # Accumulator = B+C
Store TmpBC     # Variable TmpBC = B+C
Load C          # Accumulator = C
Add A           # Accumulator = C+A
Mult TmpAB      # Accumulator = (A+B)*(C+A)
Mult TmpBC      # Accumulator = (A+B)*(C+A)*(B+C)
Store Sum       # Sum = (A+B)*(C+A)*(B+C)
Memory-Register architecture
As with previous cases the first version is simple, similar to the Accumulator version but using more registers.
Load R1,A       # Load variable A into R1
Add R1,B        # R1=A+B
Store TmpAB,R1  # Variable TmpAB = A+B
Load R2,B       # Load variable B into R2
Add R2,C        # R2=B+C
Store TmpBC,R2  # Variable TmpBC = B+C
Load R3,C       # Load variable C into R3
Add R3,A        # R3=C+A
Mult R3,TmpAB   # R3 = (C+A)*(A+B)
Mult R3,TmpBC   # R3 = (C+A)*(A+B)*(B+C)
Store Sum,R3    # Sum=(C+A)*(A+B)*(B+C)
Register-Register architecture
As with previous cases the first version is simple and it starts by loading the variables into the registers.
Load R1,A       # Load variable A into R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Add R4,R1,R2    # R4 = A+B
Add R5,R2,R3    # R5 = B+C
Add R6,R1,R3    # R6 = A+C
Mult R6,R6,R4   # R6 = (A+B)*(A+C)
Mult R6,R6,R5   # R6 = (A+B)*(A+C)*(B+C)
Store Sum,R6    # Sum = (A+B)*(A+C)*(B+C)
2. CPI
a) A specific program was executed on two computer systems.
A has a cycle time of 1 ns and CPI=5
B has a cycle time of 1.6 ns and CPI=2.5
Which system is faster (for this specific program) and by how much?
Solution
Assuming that C is the number of instructions executed:
TimeA = C * 5 * 1 * 10^-9 = 5 * C * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9 = 4 * C * 10^-9
TimeA/TimeB = (5 * C * 10^-9)/(4 * C * 10^-9) = 1.25
System B is faster by 25%.
b) Assume we need to get similar speed on both systems (described in the question above). What should be the CPI of each one of the systems?
Solution
There are two possibilities. The first one is slowing down system B.
TimeA = TimeB
TimeA = C * 5 * 1 * 10^-9
TimeB = C * X * 1.6 * 10^-9
X = 5/1.6 = 3.125
If the CPI of the second (the faster) system increases from 2.5 to 3.125, the speeds of the two systems will be identical.
The second possibility is increasing the speed of the first system.
TimeA = TimeB
TimeA = C * X * 1 * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9
X = 2.5 * 1.6 = 4
If the CPI of the first system decreases from 5 to 4, the speeds of the two systems will be identical.
c) Assume we need to get similar speed on both systems (described in the questions above). What should be the cycle time of each one of the systems?
Solution
As with the previous example, here too there are two possible solutions. The first is slowing down the second system.
TimeA = TimeB
TimeA = C * 5 * 1 * 10^-9
TimeB = C * 2.5 * X * 10^-9
X = 5/2.5 = 2
If the cycle time of the second system increases from 1.6 ns to 2 ns, the speeds of the two systems will be identical.
The second possibility is increasing the speed of the first system.
TimeA = TimeB
TimeA = C * 5 * X * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9
X = 2.5 * 1.6/5 = 4/5 = 0.8
If the cycle time of the first system decreases from 1 ns to 0.8 ns, the speeds of the two systems will be identical.
d) A specific system implements four groups of instructions as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:
Group    Frequency  Cycles
ALU      50%        1
Load     20%        2
Store    10%        2
Branch   20%        2
Calculate the average CPI for that specific program.
Solution
CPI = 50%*1 + 20%*2 + 10%*2 + 20%*2 = 0.5 + 0.4 + 0.2 + 0.4 = 1.5
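The average CPI is a weighted sum of the cycles per group, weighted by how often each group occurs. A small sketch of that calculation follows; `average_cpi` is a hypothetical helper, and frequencies are given as whole percentages so the arithmetic stays exact.

```python
def average_cpi(mix):
    """mix is a list of (frequency in percent, cycles per instruction) pairs."""
    total_percent = sum(percent for percent, _ in mix)
    if total_percent != 100:
        raise ValueError("instruction frequencies must add up to 100%")
    # Weighted sum of cycles, then convert percent back to a fraction.
    return sum(percent * cycles for percent, cycles in mix) / 100


# ALU, Load, Store, Branch as given in exercise d)
print(average_cpi([(50, 1), (20, 2), (10, 2), (20, 2)]))  # 1.5
```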
e) A specific system implemented four groups of instructions as outlined in the following table. When a specific program was executed the following instruction frequencies were observed:
Group | Frequency | Cycles
ALU | 43% | 1
Load | 21% | 1
Store | 12% | 2
Branch | 24% | 2
The hardware engineers can improve the performance of the “Store” group of instructions and execute each instruction in one cycle. However, this will require increasing the system’s cycle time by 15%. Is it worthwhile to implement the proposed change?
Solution
We will start by calculating the CPI for each case. We will mark the unknown cycle time by T.
CPIOld = 43%*1 + 21%*1 + 12%*2 + 24%*2 = 1.36
CPINew = 43%*1 + 21%*1 + 12%*1 + 24%*2 = 1.24
Speedup = Old time / New time = (C * CPIOld * T) / (C * CPINew * 1.15 * T) = 1.36 / (1.24 * 1.15) = 0.95
The proposed solution is not worthwhile since it slows the system down.
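The weighted-average CPI and the speedup comparison of exercise e) can be checked with a short Python sketch. The helper name average_cpi and the (frequency, cycles) pair layout are assumptions made for this illustration only.

```python
def average_cpi(mix):
    """Weighted CPI; mix is a list of (frequency, cycles) pairs summing to 1."""
    return sum(freq * cycles for freq, cycles in mix)

# Instruction mix of exercise e): ALU, Load, Store, Branch
old_mix = [(0.43, 1), (0.21, 1), (0.12, 2), (0.24, 2)]
new_mix = [(0.43, 1), (0.21, 1), (0.12, 1), (0.24, 2)]  # Store now takes 1 cycle

cpi_old = average_cpi(old_mix)
cpi_new = average_cpi(new_mix)

# The instruction count and the base cycle time cancel out;
# only the 15% longer cycle time remains in the ratio.
speedup = cpi_old / (cpi_new * 1.15)

print(cpi_old, cpi_new, speedup)
```

A speedup below 1 means the change makes the program slower, which is the conclusion reached in the solution.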
f) A specific system runs at 1 GHz and supports four groups of instructions. When a specific program was executed the following usage was observed:
Group | Usage | CPI
1 | 20% | 1
2 | 30% | 2
3 | 10% | 3
4 | 40% | 4
Calculate the CPI for that program. If there were 10^10 instructions, how many cycles did it require and what was the execution time?
Solution
CPI = 1*20% + 2*30% + 3*10% + 4*40% = 2.7
The total number of cycles is calculated by multiplying the number of instructions by the average CPI:
Total number of cycles = 2.7 * 10^10
The run time is calculated by dividing the total number of cycles by the clock rate:
Total runtime = CPI * Instruction Count / Clock rate = 2.7 * 10^10 / (1 * 10^9) = 27 seconds
g) A program was executed on a specific system and the following attributes were measured: - The number of instruction executed was 1009 - The CPI measured during the run was 2.5 - The clock rate was 2.5 GHz. Calculate the amount of time this program ran. Solution 1.25 seconds obtained by multiplying the number of instruction by the CPI and the cycle time
h) The program from the previous question was executed once again, but this time using a different compiler and a different hardware system. The attributes of the second run were as follows:
- The number of instructions executed was 950,000,000
- The CPI measured during the run was 3.0
- The clock rate was 3 GHz.
Which system is faster and by how much?
Solution
The new execution time is 0.95 seconds (950,000,000 * 3.0 / (3 * 10^9)), which represents an improvement of 32% compared to the first system.
i) A program was executed on two systems. The first system has a 20 ns cycle time and the second a 30 ns cycle time. On the first system the CPI measured was 2.0 and on the second system the CPI was 0.5. Calculate the run time on each one of the systems.
Solution
Since the number of instructions is not given we can only define the formula. Assuming the number of instructions executed is X, then
Time1 = X * 2.0 * 2.0 * 10^-8
Time2 = X * 0.5 * 3 * 10^-8
j) A program was executed on two systems. The first system has a 500 MHz clock rate and the second a 650 MHz clock rate. On the first system it used 1*10^8 cycles and on the second system it used 1.2*10^8 cycles. Which system is faster and by how much?
Solution
Time1 = 1*10^8 / (500*10^6) = 0.2 seconds
Time2 = 1.2*10^8 / (650*10^6) = 0.185 seconds
This means that the second system is faster by about 8% (0.2/0.185 = 1.08).
k) A system has three types of instructions. The first type executes in one cycle, the second requires two cycles and the third requires three cycles. When running a piece of code, five instructions were of type one, three were of type two and two instructions were of type three. Calculate the CPI of this piece of code.
Solution
The total number of cycles is given by: 5*1 + 3*2 + 2*3 = 17
The number of instructions executed was: 5 + 3 + 2 = 10
CPI = 17/10 = 1.7
l) While running a program on system A with a clock rate of 600 MHz the CPI was 1.3. Running the program on system B with a clock rate of 750 MHz produced a CPI of 2.5. Assuming the number of instructions on system A was 100,000, what should the number of instructions on system B be to achieve the same execution time?
Solution
TimeA = 100,000 * 1.3 * 1/(600 * 10^6)
TimeB = X * 2.5 * 1/(750 * 10^6)
X = (100,000 * 1.3 / (600 * 10^6)) / (2.5 / (750 * 10^6)) = 65,000
If the number of instructions on system B is 65,000 the execution times will be identical.
m) Running a specific program requires 3*10^10 cycles. How long does it run if it executes on a system with a clock rate of 100 MHz? And how long does it run on a system with a clock rate of 3 GHz?
Solution
TimeA = 3*10^10 * 1/(100*10^6) = 300
TimeB = 3*10^10 * 1/(3*10^9) = 10
On the first system it will run 300 seconds and on the second system it will run 10 seconds.
n) A specific system uses 5 groups of instructions. Each group requires a different number of cycles for execution. While executing a test run the group frequencies were measured as outlined in the following table:
Group | Name | Usage | CPI
1 | ALU | 22% | 1
2 | Memory Access | 36% | 5
3 | Branch | 16% | 3
4 | Call | 13% | 4
5 | Return | 13% | 4
In an attempt to increase the system’s performance the hardware engineers have come up with several improvement suggestions. However, each one of the suggestions has some drawbacks (in addition to the benefits). The following table summarizes the suggestions and the drawbacks. Usually an improvement relates to one group of instructions and the drawback is an increase in the cycle time.
Group | Name | Improvement | Cycle time increase
1 | ALU | 25% | 7%
2 | Memory Access | 35% | 17%
3 | Branch | 90% | 1%
4 | Call | 45% | 2%
5 | Return | 45% | 2%
For example, it is possible to decrease the number of cycles required by the ALU by 25%, but it requires increasing the cycle time (for all the groups) by 7%. The only limitation is that only one improvement can be implemented. Which one is the preferred suggestion? Which one is the worst?
Solution
First we have to calculate the CPI prior to the improvements.
CPI = 3.54
Then we’ll calculate the CPI improvement for each one of the possibilities, as defined in the following table:
Alternative | Relative Improvement
1 | 1.053
2 | 0.977
3 | 1.079
4 | 1.085
5 | 1.085
The best improvement is alternative 2 (improving the memory access). Actually this is the only alternative that improves the situation. All other alternatives make it worse. The worst alternative is 4/5 (improving the Call/Return).
3. Amdahl’s Law
a) A specific system performs a task in 100 ns. It is possible to introduce an improvement after which the task will be performed in 20 ns. The task is executed only during 30% of the time. Calculate the improvement.
Solution
Using the Amdahl formula, Speedup = 1/((1-Fe) + Fe/Se), where Fe is the fraction of the time that is enhanced and Se is the speedup of that fraction.
Fe = 0.3
Se = 100/20 = 5
Speedup = 1/((1-0.3) + 0.3/5) = 1.32
The improvement gained is 32%.
b) A new system can execute a specific task 10 times faster compared to the existing system. The task is performed during 40% of the time while 60% of the time is dedicated to I/O operations. What is the total speedup to be achieved from replacing the system? (Note: the 60% I/O operations will not change.)
Solution
Using the Amdahl formula.
Fe = 0.4
Se = 10
Speedup = 1/((1-0.4) + 0.4/10) = 1.56
The improvement gained is 56%.
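The Amdahl formula used throughout this section is Speedup = 1/((1-Fe) + Fe/Se). A minimal Python sketch, with a function name chosen here only for illustration, reproduces the two results above:

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of the run time is made faster."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

speedup_a = amdahl_speedup(0.3, 5)   # exercise a): Fe = 0.3, Se = 5
speedup_b = amdahl_speedup(0.4, 10)  # exercise b): Fe = 0.4, Se = 10

print(round(speedup_a, 2), round(speedup_b, 2))
```

The remaining exercises in this section only change the two arguments, or solve the same formula for one of them.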
c) On a specific application 5% of the code has to run in serial mode while 95% of the code can be executed in parallel. What will be the improvement when moving the application to a 10 CPU system?
Solution
Using the Amdahl formula.
Fe = 0.95
Se = 10
Speedup = 1/((1-0.95) + 0.95/10) = 6.9
The code will execute 6.9 times faster.
d) After spending a lot of time a hardware engineer managed to improve the floating point arithmetic, which now executes twice as fast. Unfortunately floating point operations are executed only during 10% of the time. What is the overall speedup achieved?
Solution
Using the Amdahl formula.
Fe = 0.1
Se = 2
Speedup = 1/((1-0.1) + 0.1/2) = 1.05
The code will execute only 1.05 times faster (an improvement of about 5%).
e) A scientific application was developed so it can exploit parallel systems utilizing many processors. As such the parallel part is executed during 95% of the time.
- On how many processors should the application run so it is 10 times faster?
- What will be the number of processors required to achieve a 25 times faster run?
After additional work the developers succeeded in increasing the parallel percentage to 97%.
- How many processors are required for a 10 times faster run?
- How many are required for a 20 times faster run?
Solution
Using the Amdahl formula.
- 19 processors. Here F is given (95%) and the speedup is defined (10), so S has to be calculated.
- It is not possible. The system will never achieve this speed, since with F = 95% the speedup cannot exceed 1/(1-0.95) = 20.
- 14 processors.
- 49 processors.
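Solving the Amdahl formula for the number of processors gives S = F / (1/Speedup - (1-F)), rounded up to a whole processor. The sketch below is one possible way to compute it; the function name and the use of whole percentages with exact fractions (to avoid floating point rounding at the boundaries) are choices made for this illustration.

```python
import math
from fractions import Fraction

def processors_needed(parallel_percent, target_speedup):
    """Smallest processor count reaching the target, or None if unreachable."""
    parallel = Fraction(parallel_percent, 100)
    serial = 1 - parallel
    # Share of the original time that is left for the parallel part
    budget = Fraction(1, target_speedup) - serial
    if budget <= 0:
        return None  # even infinitely many processors are not enough
    return math.ceil(parallel / budget)

print(processors_needed(95, 10))  # 19
print(processors_needed(95, 25))  # None
print(processors_needed(97, 10))  # 14
print(processors_needed(97, 20))  # 49
```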
f) A scientific application runs on parallel systems, however only 70% of it is suited for parallel processing. Calculate the speedup obtained by using two, three, four and five processors.
Solution
Using the Amdahl formula.
F = 70%
S equals the number of processors (2 for two processors, 3 for three, and so on).
The speedup for two processors is 1.54
The speedup for three processors is 1.88
The speedup for four processors is 2.11
The speedup for five processors is 2.27
g) A scientific application runs for 100 minutes. 40 minutes are CPU time while 60 minutes are I/O time. For increasing the speed, two alternatives are considered: replacing the processor by a new model 90 times faster or replacing the disk system by a newer model 4 times faster.
- Which alternative is better?
- What will be the run time for each alternative?
Solution
Using the Amdahl formula twice.
For the CPU upgrade
F = 40%
S = 90
Speedup = 1.65
Run time = 100/1.65 = about 60 minutes
For the Disk upgrade
F = 60%
S = 4
Speedup = 1.82
Run time = 100/1.82 = 55 minutes
The preferred alternative is upgrading the disk.
h) In an attempt to improve the system’s performance, the hardware engineers came up with two possible solutions: adding a special hardware device that will execute the square root instruction 10 times faster, or improving the floating point instructions so they will execute twice as fast. On the benchmark programs the square root instructions are executed 20% of the time while the floating point instructions execute during 50% of the time. Which alternative is better?
Solution
Using the Amdahl formula twice.
For the square root instructions upgrade
F = 20%
S = 10
Speedup = 1.22
For the floating point instructions upgrade
F = 50%
S = 2
Speedup = 1.33
This means that the floating point upgrade is better.
i) In a specific processor the Load and Store instructions were improved so they execute four times faster. These instructions account for 50% of the total run time. What is the overall speedup? Assuming the test program ran for 160 seconds, how long will it run after the improvement?
Solution
Using the Amdahl formula.
F = 50%
S = 4
Speedup = 1.6
The new run time = 160/1.6 = 100 seconds
j) In a specific processor all logic instructions were improved and the new ones execute five times faster. Assuming the logic instructions account for 50% of the run time, how long will a program run that previously ran for 10 seconds?
Solution
Using the Amdahl formula.
F = 50%
S = 5
Speedup = 1.67
The new run time = 10/1.67 = 6 seconds
k) After improving the arithmetic instructions they run five times faster. What should their percentage be if the overall speedup required is 3 times?
Solution
Using the Amdahl formula.
F = needs to be calculated
S = 5
Speedup = 3
Solving 1/((1-F) + F/5) = 3 gives F = (1 - 1/3)/0.8 = 0.833, so the percentage should be ~83%.
l) You are the CIO of a manufacturing organization. Due to anticipated changes the board asked that the main application executed on the system run twice as fast. You checked the application and discovered that 65% of it is suited for parallel execution. How many processors (or cores) do you have to buy?
Solution
Using the Amdahl formula.
F = 65%
S = needs to be calculated
Speedup = 2
The number of processors (or cores) should be 5.
m) A specific system is using RAM only (no cache memory). It is possible to add a cache memory that runs five times faster. How will the execution time change if the applications use the cache 90% of the time on average? What will happen if the cache is used only 50% of the time?
Solution
Using the Amdahl formula.
The case of 90%
F = 90%
S = 5
Speedup = 3.57
The case of 50%
F = 50%
S = 5
Speedup = 1.67
If the cache is used 90% of the time the speedup will be 3.57.
If the cache is used only 50% of the time the speedup will be 1.67.
n) A vector computer executes an instruction on a vector (array) of values, contrary to a scalar computer that executes the instruction on a single value. On a specific vector computer, the vector instructions are 20 times faster compared to the scalar instructions. What will the speedup be, assuming only 70% of the application can be vectorized?
Solution
Using the Amdahl formula.
F = 70%
S = 20
Speedup = 2.99
o) There was an urgent need to improve the performance of a specific CISC based system. Like any other CISC based system it supported many instructions. The hardware engineers mapped the instructions and addressed the 10 most used ones. These 10 instructions account for 90% of the time. After the improvement these 10 instructions executed 6 times faster. What is the overall speedup obtained?
Solution
Using the Amdahl formula.
F = 90%
S = 6
Speedup = 4
p) During a meeting dedicated to the newly acquired system, the CIO said that the fact that the new system has 100 processors means that all applications will benefit. Some applications may see an increase of 50 times and others less. Nevertheless, he said, all applications will run at least twice as fast. Do you agree with this assumption?
Solution
No. Achieving the 50 times faster run means that 99% of the application should be ready for parallel execution. Furthermore, obtaining the promised speed (twice as fast) requires that 50% of the application be parallel ready. Applications with a smaller percentage will not see the promised speedup.
4. Scoreboarding
a) Draw the scoreboard for the following instructions of a register-register based architecture:
Add R2,R3,R4
Sub R3,R1,R4
Add R7,R2,R3
Sub R8,R7,R5
Solution
At the beginning all the registers’ valid bits are set (there is no hazard). Each stage below lists the entry situation (Start Stage), the tests and the activities performed during the stage (Activities) and the completion situation (End Stage). Please note that since this is dynamic scheduling there might be other solutions as well, for example several attempts to issue an instruction before the input register becomes valid.

Stage 1 - Attempt executing the first instruction (Add R2,R3,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R3.valid=1, R4.valid=1; issue instruction; set R2.valid=0
End Stage: R1: 1, R2: 0, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 2 - Attempt executing the second instruction (Sub R3,R1,R4)
Start Stage: R1: 1, R2: 0, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R1.valid=1, R4.valid=1; issue instruction; set R3.valid=0
End Stage: R1: 1, R2: 0, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 3 - Attempt executing the third instruction (Add R7,R2,R3)
Start Stage: R1: 1, R2: 0, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=0; stall
End Stage: R1: 1, R2: 0, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 4 - First instruction completed
Start Stage: R1: 1, R2: 0, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: Add R2,R3,R4 completed; set R2.valid=1
End Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 5 - Attempt executing the third instruction (Add R7,R2,R3)
Start Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1, R3.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 6 - Second instruction completed
Start Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: Sub R3,R1,R4 completed; set R3.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 7 - Attempt executing the third instruction (Add R7,R2,R3)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1, R3.valid=1; issue instruction; set R7.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 0, R8: 1

Stage 8 - Attempt executing the fourth instruction (Sub R8,R7,R5)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 0, R8: 1
Activities: R7.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 0, R8: 1

Stage 9 - Third instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 0, R8: 1
Activities: Add R7,R2,R3 completed; set R7.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 10 - Attempt executing the fourth instruction (Sub R8,R7,R5)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R7.valid=1, R5.valid=1; issue instruction; set R8.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 0
b) Draw the scoreboard for the following instructions of a register-register based architecture:
Mult R1,R2,R2
Mult R3,R2,R1
Add R5,R1,R3
Div R8,R5,R4
Solution
At the beginning all the registers’ valid bits are set (there is no hazard). Each stage below lists the entry situation (Start Stage), the tests and the activities performed during the stage (Activities) and the completion situation (End Stage).

Stage 1 - Attempt executing the first instruction (Mult R1,R2,R2)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1; issue instruction; set R1.valid=0
End Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 2 - Attempt executing the second instruction (Mult R3,R2,R1)
Start Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1, R1.valid=0; stall
End Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 3 - Attempt executing the second instruction (Mult R3,R2,R1)
Start Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1, R1.valid=0; stall
End Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
The first instruction still did not complete.

Stage 4 - First instruction completed
Start Stage: R1: 0, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: Mult R1,R2,R2 completed; set R1.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 5 - Attempt executing the second instruction (Mult R3,R2,R1)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R2.valid=1, R1.valid=1; issue instruction; set R3.valid=0
End Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 6 - Attempt executing the third instruction (Add R5,R1,R3)
Start Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R1.valid=1, R3.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 7 - Second instruction completed
Start Stage: R1: 1, R2: 1, R3: 0, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: Mult R3,R2,R1 completed; set R3.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 8 - Attempt executing the third instruction (Add R5,R1,R3)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R1.valid=1, R3.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 9 - Attempt executing the fourth instruction (Div R8,R5,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: R5.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 10 - Third instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: Add R5,R1,R3 completed; set R5.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 11 - Attempt executing the fourth instruction (Div R8,R5,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R5.valid=1, R4.valid=1; issue instruction; set R8.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 0
c) The high level computer instruction
SUM = A+B+C+D
can be implemented in assembly language by:
Add R5,R1,R2
Add R5,R5,R3
Add R5,R5,R4
Or
Add R5,R1,R2
Add R6,R3,R4
Add R5,R5,R6
Which one is better? Draw the scoreboard for the two implementations. Assume that after issuing an instruction there is one extra cycle before the register becomes available.
Solution
In both cases we’ll assume that at the beginning all the registers’ valid bits are set (there is no hazard). In addition, the variables A, B, C, D were already loaded into R1, R2, R3, R4. Each stage below lists the entry situation (Start Stage), the tests and the activities performed during the stage (Activities) and the completion situation (End Stage).

First implementation

Stage 1 - Attempt executing the first instruction (Add R5,R1,R2)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R1.valid=1, R2.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 2 - Attempt executing the second instruction (Add R5,R5,R3)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: R5.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 3 - First instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: Add R5,R1,R2 completed; set R5.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 4 - Attempt executing the second instruction (Add R5,R5,R3)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R5.valid=1, R3.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 5 - Attempt executing the third instruction (Add R5,R5,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: R5.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 6 - Second instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: Add R5,R5,R3 completed; set R5.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 7 - Attempt executing the third instruction (Add R5,R5,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R5.valid=1, R4.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Due to the excessive usage of register 5, seven stages were required for executing the code.

Second implementation

Stage 1 - Attempt executing the first instruction (Add R5,R1,R2)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R1.valid=1, R2.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Stage 2 - Attempt executing the second instruction (Add R6,R3,R4)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1
Activities: R3.valid=1, R4.valid=1; issue instruction; set R6.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 0, R7: 1, R8: 1

Stage 3 - First instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 0, R7: 1, R8: 1
Activities: Add R5,R1,R2 completed; set R5.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 0, R7: 1, R8: 1

Stage 4 - Attempt executing the third instruction (Add R5,R5,R6)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 0, R7: 1, R8: 1
Activities: R5.valid=1, R6.valid=0; stall
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 0, R7: 1, R8: 1

Stage 5 - Second instruction completed
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 0, R7: 1, R8: 1
Activities: Add R6,R3,R4 completed; set R6.valid=1
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1

Stage 6 - Attempt executing the third instruction (Add R5,R5,R6)
Start Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 1, R6: 1, R7: 1, R8: 1
Activities: R5.valid=1, R6.valid=1; issue instruction; set R5.valid=0
End Stage: R1: 1, R2: 1, R3: 1, R4: 1, R5: 0, R6: 1, R7: 1, R8: 1

Here the second instruction can be executed without delay, so only six stages were required.
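The issue rule used in all the scoreboard solutions (issue only when the source registers are valid, then mark the destination as not valid until the instruction completes) can be sketched in a few lines of Python. The class and method names are assumptions made for this illustration, and completion is triggered by hand rather than by a timing model.

```python
class Scoreboard:
    """Tracks only the register valid bits: 1 = available, 0 = result pending."""

    def __init__(self, registers=8):
        self.valid = {f"R{i}": 1 for i in range(1, registers + 1)}

    def try_issue(self, dest, src1, src2):
        """Issue if both source registers are valid; otherwise stall."""
        if self.valid[src1] and self.valid[src2]:
            self.valid[dest] = 0  # the result is not available yet
            return True
        return False

    def complete(self, dest):
        """The instruction writing dest has finished; the register is valid again."""
        self.valid[dest] = 1

# Replaying stages 1 to 7 of exercise a)
sb = Scoreboard()
log = []
log.append(sb.try_issue("R2", "R3", "R4"))  # stage 1: Add R2,R3,R4 issues
log.append(sb.try_issue("R3", "R1", "R4"))  # stage 2: Sub R3,R1,R4 issues
log.append(sb.try_issue("R7", "R2", "R3"))  # stage 3: stall, R2 not valid
sb.complete("R2")                           # stage 4: first instruction completed
log.append(sb.try_issue("R7", "R2", "R3"))  # stage 5: stall, R3 not valid
sb.complete("R3")                           # stage 6: second instruction completed
log.append(sb.try_issue("R7", "R2", "R3"))  # stage 7: Add R7,R2,R3 issues

print(log)
```

The sketch follows the solutions in checking only the source registers, so it does not model other hazards or the exact number of cycles each instruction needs.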
5. Branch Prediction
a) The following string represents the behavior of a specific branch instruction.
- “1” means the branch was taken
- “0” means the branch was not taken
1001110111
Calculate the success rate of the branch prediction when using one and two bits. In both cases assume the default is not to branch.
Solution
One Bit Prediction
Cycle | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Value | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1
Branch | Y | N | N | Y | Y | Y | N | Y | Y | Y
Anticipation | N | Y | N | N | Y | Y | Y | N | Y | Y
Success | N | N | Y | N | Y | Y | N | N | Y | Y
Success rate = 50% (5 out of the 10 branches)
Two Bits Prediction
Cycle | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Value | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1
Branch | Y | N | N | Y | Y | Y | N | Y | Y | Y
Anticipation | N! | N? | N! | N! | N? | Y? | Y! | Y? | Y! | Y!
Success | N | Y | Y | N | N | Y | N | Y | Y | Y
Success rate = 60% (6 out of the 10 branches)
b) What are the success rates of the branch predictions in the previous exercise if the default is branch taken?
Solution
One Bit Prediction = 60%
Two Bits Prediction = 60%
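The two predictors can be simulated with a short Python sketch. It assumes that the one-bit predictor simply repeats the last outcome, and that the two-bit predictor is a saturating counter that starts in the strong state of the default (“N!” or “Y!” in the table above, with “?” marking the weak states). The function names are illustrative.

```python
def one_bit_success(history, default_taken):
    """Success rate of a predictor that repeats the previous outcome."""
    prediction = default_taken
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += prediction == taken
        prediction = taken  # remember the last outcome
    return hits / len(history)

def two_bit_success(history, default_taken):
    """Success rate of a two-bit saturating counter.

    States: 0 = strong not taken, 1 = weak not taken,
            2 = weak taken, 3 = strong taken.
    """
    state = 3 if default_taken else 0
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += (state >= 2) == taken
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return hits / len(history)

pattern = "1001110111"
print(one_bit_success(pattern, False), two_bit_success(pattern, False))
print(one_bit_success(pattern, True), two_bit_success(pattern, True))
```

With these assumptions the sketch gives 50% and 60% for the default of not taken, and 60% and 60% for the default of taken, matching the solutions of exercises a) and b).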
c) Calculate the success rates of the branch prediction mechanisms for the following scenarios. A bit set represents a branch taken.
Scenario | One bit default | Two bits default | One bit success | Two bits success
1110000110110 | Taken | Taken | 61.54% | 46.15%
10101010111 | Taken | Not taken | 27.27% | 45.45%
10101010111 | Not taken | Taken | 18.18% | 63.64%
1110001010 | Taken | Taken | 50.00% | 60.00%
1110001010 | Not taken | Not taken | 40.00% | 40.00%
1011100110110110 | Not taken | Taken | 37.50% | 56.25%
1110101010111011 | Not taken | Taken | 31.25% | 68.75%
1011001110001010 | Taken | Not taken | 43.75% | 37.50%
1100110011 | Taken | Taken | 60.00% | 40.00%
1100011011101011 | Not taken | Not taken | 43.75% | 43.75%
0011101000011001 | Taken | Taken | 50.00% | 37.50%
1011011101111011 | Taken | Taken | 50.00% | 75.00%
1100011110000011 | Not taken | Not taken | 68.75% | 43.75%
1110101011 | Not taken | Not taken | 30.00% | 50.00%
1010101011 | Taken | Taken | 20.00% | 60.00%
1101101101101101 | Taken | Taken | 37.50% | 68.75%
0111111111110 | Not taken | Not taken | 84.62% | 76.92%
0111111111110 | Taken | Taken | 76.92% | 84.62%
1111000011110000 | Not taken | Not taken | 75.00% | 50.00%
0111111101111111 | Not taken | Not taken | 81.25% | 81.25%
0001111010101111 | Not taken | Not taken | 56.25% | 68.75%
1110101000110111 | Taken | Taken | 50.00% | 50.00%
1101011111011000 | Taken | Taken | 56.25% | 68.75%
1101110111011101 | Taken | Taken | 50.00% | 75.00%
Chapter 6 – Cache Memory exercises
1. Cache Memory improvements
a) A specific system has three levels of memory (Main, Cache L1 and Cache L2).
The memory access time is 50 ns
L1 access time is 1 ns
L2 access time is 5 ns
An application was executed on that system, characterized by the fact that 30% of the instructions access memory. 90% of these accesses are found in L1 and only 1% have to be brought from main memory. Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.
Solution
No cache (just main memory)
New CPI = CPI + 30% * 50 = CPI + 15
Since 30% of the instructions access memory, 15 ns will be added to the calculated CPI.
Main memory and L1 only
New CPI = CPI + 30% * (10%*50 + 90%*1) = CPI + 1.77
30% of the instructions access memory, however 90% of them find the datum in L1 and only 10% have to reach main memory. In this case 1.77 ns will be added to the calculated CPI.
Main memory and L2 only
New CPI = CPI + 30% * (99%*5 + 1%*50) = CPI + 1.64
30% of the instructions access memory, however only 1% have to access main memory while 99% find the datum in L2. In this case 1.64 ns will be added to the calculated CPI.
Main memory, L1 and L2
New CPI = CPI + 30% * (90%*1 + 9%*5 + 1%*50) = CPI + 0.56
30% of the instructions access memory, of which 90% find the datum in L1, another 9% find it in L2 and only in 1% of the cases the instruction has to access main memory. In this case 0.56 ns will be added to the calculated CPI.
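All four cases use the same weighted sum: the fraction of instructions that access memory multiplied by the sum of (share of accesses served by a level * access time of that level). A minimal Python sketch, with a function name and argument layout chosen only for this illustration:

```python
def added_memory_time(access_fraction, levels):
    """Average time (ns) added per instruction.

    levels: list of (share of the accesses served by this level, access time in ns);
    the shares must sum to 1.
    """
    return access_fraction * sum(share * time_ns for share, time_ns in levels)

no_cache = added_memory_time(0.30, [(1.00, 50)])
l1_only = added_memory_time(0.30, [(0.90, 1), (0.10, 50)])
l2_only = added_memory_time(0.30, [(0.99, 5), (0.01, 50)])
l1_and_l2 = added_memory_time(0.30, [(0.90, 1), (0.09, 5), (0.01, 50)])

print(no_cache, l1_only, l2_only, l1_and_l2)
```

The values correspond to the 15, 1.77, 1.64 and 0.56 ns obtained in the solution (the last two after rounding).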
b) A specific system has four levels of memory (Main, Cache L1, Cache L2 and Cache L3).
The memory access percent is 33%
The memory access time is 60 ns, 0 missing rate
L1 access time is 1 ns, 15% missing rate (the datum is not found in L1)
L2 access time is 10 ns, 7% missing rate
L3 access time is 20 ns, 1% missing rate
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory and L3
- The application is using main memory, L1 and L2.
- The application is using main memory, L1, L2 and L3.
Solution
No cache (just main memory)
New CPI = CPI + 33% * 60 = CPI + 19.8 ns
Memory and cache L1
New CPI = CPI + 33% * (85%*1 + 15%*60) = CPI + 3.25 ns
Memory and cache L2
New CPI = CPI + 33% * (93%*10 + 7%*60) = CPI + 4.46 ns
Memory and cache L3
New CPI = CPI + 33% * (99%*20 + 1%*60) = CPI + 6.73 ns
Memory, cache L1 and cache L2
New CPI = CPI + 33% * (85%*1 + 8%*10 + 7%*60) = CPI + 1.93 ns
Memory, cache L1, cache L2 and cache L3
New CPI = CPI + 33% * (85%*1 + 8%*10 + 6%*20 + 1%*60) = CPI + 1.14 ns
c) The following table defines several systems with different characteristics. All systems have a three level hierarchical memory (main, L1, L2).
- The memory access represents the percent of the instructions that access memory.
- The access time defines the time required to access each of the levels.
- The missing rate represents the percentage of misses for each level. The missing rate for the main memory is always zero.
For each line calculate the amount of time added to the CPI if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.
No.  Memory     Access time (ns)     Missing rate (%)   Added time (ns)
     access %   Main   L1    L2      L1     L2          Main    L1     L2     L1+L2
1    35         75     2     10      12     2           26.25   3.77   3.96   1.49
2    40         80     2     15      8      1           32.00   3.30   6.26   1.48
3    25         50     3     12      9      2           12.50   1.81   3.19   1.14
4    29         48     2     10      12     2           13.92   2.18   3.12   1.08
5    32         56     1     8       11     2           17.92   2.26   2.87   0.87
6    48         40     2     12      14     3           19.20   3.51   6.16   2.04
7    28         50     3     15      12     2           14.00   2.42   4.40   1.44
8    36         54     1     12      10     1           19.44   2.27   4.47   0.91
9    65         50     1     11      8      1           32.50   3.20   7.40   1.42
10   52         52     2     11      9      2           27.04   3.38   6.15   1.89
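The added-time calculations in exercises b) and c) all follow the same pattern, which can be sketched in a few lines of Python. This is a minimal sketch, not part of the original booklet: the function name `added_time` and the `levels` list layout are assumptions made for illustration, and each missing rate is taken as a fraction of all memory accesses, which is how the solutions above use it.

```python
def added_time(access_fraction, levels):
    """Return the time (in ns) that memory accesses add to the CPI.

    levels is ordered from the fastest level to main memory. Each entry is
    (access_time_ns, missing_rate); main memory has a missing rate of 0.
    Missing rates are fractions of all memory accesses (an assumption that
    matches the worked solutions).
    """
    total = 0.0
    reaching = 1.0  # fraction of accesses not yet served by a faster level
    for access_time, missing_rate in levels:
        served = reaching - missing_rate  # accesses served by this level
        total += served * access_time
        reaching = missing_rate
    return access_fraction * total


# Exercise b): 33% memory accesses, main memory 60 ns
main = (60, 0.0)
l1, l2, l3 = (1, 0.15), (10, 0.07), (20, 0.01)

print(f"main only:        {added_time(0.33, [main]):.2f} ns")
print(f"main, L1 and L2:  {added_time(0.33, [l1, l2, main]):.2f} ns")
print(f"main, L1, L2, L3: {added_time(0.33, [l1, l2, l3, main]):.2f} ns")
```

The same function can be used to check any line of the table in exercise c) by passing that line's access times and missing rates.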
Chapter 7 – BUS exercises
1. Parity
a) The following table contains binary numbers and the definition of the parity (odd or even). Calculate the parity bit and complete the table. The first exercise is solved.

No.  Binary Number        Parity  Parity Bit
1    10 1010 1010 1010    Odd     0
2    1111 0000 1111       Even    0
3    1 1011 1101 1010     Even    1
4    11 0000 1011         Odd     0
5    1111 1100 0000       Odd     1
6    10 0000 0100         Odd     1
7    100 0111 1010        Even    0
8    000 1101 1010        Even    1
9    1 0111 0101 1011     Even    1
10   1 1111 1111          Odd     0
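The parity calculation in the table above can be sketched as follows. This is only an illustration: the function name `parity_bit` is an assumption, and the rule it applies is the usual one, in which the total number of ones (data bits plus the parity bit) must be even for even parity and odd for odd parity.

```python
def parity_bit(bits, parity):
    """Return the parity bit (0 or 1) for a string of binary digits.

    Spaces in the string are ignored. parity is "even" or "odd".
    """
    ones = bits.replace(" ", "").count("1")
    if parity == "even":
        return ones % 2        # add a 1 only if the count of ones is odd
    if parity == "odd":
        return 1 - ones % 2    # add a 1 only if the count of ones is even
    raise ValueError("parity must be 'even' or 'odd'")


# Line 1 of the table: seven ones, odd parity, so no extra 1 is needed
print(parity_bit("10 1010 1010 1010", "odd"))
```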
b) For increasing the integrity of the information sent over the network a special parity algorithm was defined. Instead of adding just one bit per block (seven bits) the algorithm adds three parity bits. Each such parity bit guards several data bits. Unlike the ordinary simple parity bit mechanism, this mechanism increases the overhead but provides the capability to correct faulty blocks without the need to re-transmit. The algorithm is defined by:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Encode the binary number 0110010 by adding the three parity bits as defined by the algorithm above.
Solution
The following table defines the original value (the data bits), the bits that are guarded by each parity bit and the value of each parity bit.

D0   D1   D2   D3   D4   D5   D6   Parity  Value
0    1    1    0    0    1    0
0    1    ---  0    0    1    ---  P0      0
---  1    1    0    ---  1    0    P1      1
0    ---  1    0    0    ---  0    P2      1
The encoded value is: 0110010 011

c) The following table contains a list of 7-bit blocks that have to be encoded using the previously described algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

No.  Block Content  P0  P1  P2
1    111 1111       1   1   1
2    101 1110       0   1   0
3    101 1110       1   1   0
4    001 0011       1   1   0
5    101 0101       0   0   0
6    110 1101       0   1   0
7    111 0011       1   0   1
8    101 1111       0   0   1
9    111 1001       1   0   0
10   100 0111       1   0   1
11   100 1001       0   0   1
12   111 1100       0   1   0
13   001 0010       0   1   0
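The three-parity encoding used in exercises b) and c) can be sketched in Python as shown below. The names `GROUPS`, `parity_bits` and `encode` are assumptions made for this illustration; the group definitions are copied from the algorithm above, with D0 taken as the leftmost bit of the block, as in the solved example. The `parity` argument also covers an odd variant of the same scheme.

```python
# Data-bit indices guarded by each parity bit (D0 is the leftmost bit)
GROUPS = {
    "P0": (0, 1, 3, 4, 5),
    "P1": (1, 2, 3, 5, 6),
    "P2": (0, 2, 3, 4, 6),
}


def parity_bits(data, parity="even"):
    """Return [P0, P1, P2] for a 7-bit block given as a string of 0s and 1s."""
    d = [int(c) for c in data.replace(" ", "")]
    if len(d) != 7:
        raise ValueError("the block must contain exactly seven data bits")
    if parity not in ("even", "odd"):
        raise ValueError("parity must be 'even' or 'odd'")
    offset = 0 if parity == "even" else 1
    # For even parity the bit is 1 when the guarded ones are odd in number;
    # for odd parity the result is inverted.
    return [(sum(d[i] for i in idx) + offset) % 2 for idx in GROUPS.values()]


def encode(data, parity="even"):
    """Return the data bits followed by the three parity bits."""
    p = parity_bits(data, parity)
    return data.replace(" ", "") + " " + "".join(str(b) for b in p)


print(encode("0110010"))  # the solved example from exercise b)
```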
d) The following table contains a list of 7-bit blocks that have to be encoded using a very similar algorithm (same locations but the parity is odd):
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

No.  Block Content  P0  P1  P2
1    011 1111       1   0   1
2    000 1010       1   1   0
3    111 0000       1   1   1
4    101 0000       0   0   1
5    111 0111       1   1   1
6    110 0011       0   0   1
7    111 1111       0   0   0
8    100 0001       0   0   1
9    011 1110       1   1   0
10   101 0101       1   1   1
11   111 0001       1   0   0
12   101 1101       0   0   0
13   110 1101       1   0   1
e) The block: 111 0101 110 arrived through the network after it was encoded using the following algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.
Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=0. Since the calculated parity bits are identical to the values in the message we assume the block is correct. The data bits are: 111 0101
f) The block: 101 1111 000 arrived through the network after it was encoded using the following algorithm:
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.
Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=0. It can be seen that parity bits P0 and P1 are different from the received ones. Assuming just one bit flipped during the transmission, it should be D5, a bit that is guarded by these two parity bits and not by P2 (D1 is guarded by the same pair of parity bits, so the parity bits alone cannot tell D1 and D5 apart). Taking D5 as the flipped bit, the correct data bits are: 101 1101
g) The block: 111 0111 101 arrived through the network after it was encoded using the following algorithm:
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.
Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=1. It can be seen that parity bit P1 is different. Assuming just one bit flipped during the transmission, it should be P1 itself. If it was a data bit that flipped then more than one parity bit should have been erroneous. The correct data bits are: 111 0111
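The checking procedure used in exercises e), f) and g) can be sketched as follows. The names `GROUPS` and `check_block` are assumptions made for this illustration. The sketch recalculates the parity bits from the received data bits, lists the parity bits that differ, and then lists every single bit whose flip would explain exactly that set of differences. With the group definitions above, some data bits share the same guarding parity bits (for example D1 and D5), so the function returns all candidates rather than a single bit.

```python
# Data-bit indices guarded by each parity bit (D0 is the leftmost bit)
GROUPS = {
    "P0": (0, 1, 3, 4, 5),
    "P1": (1, 2, 3, 5, 6),
    "P2": (0, 2, 3, 4, 6),
}


def check_block(block, parity="even"):
    """Check a received block of 7 data bits followed by P0, P1, P2.

    Returns (wrong, candidates): the parity bits that do not match the
    recalculated values, and the single bits whose flip would explain them.
    """
    bits = [int(c) for c in block.replace(" ", "")]
    if len(bits) != 10:
        raise ValueError("the block must contain 7 data bits and 3 parity bits")
    data, received = bits[:7], bits[7:]
    offset = 0 if parity == "even" else 1
    wrong = []
    for (name, idx), got in zip(GROUPS.items(), received):
        expected = (sum(data[i] for i in idx) + offset) % 2
        if expected != got:
            wrong.append(name)
    if not wrong:
        return wrong, []            # the block is assumed to be correct
    if len(wrong) == 1:
        return wrong, list(wrong)   # only the parity bit itself can explain it
    candidates = [
        f"D{i}" for i in range(7)
        if sorted(n for n, idx in GROUPS.items() if i in idx) == wrong
    ]
    return wrong, candidates


print(check_block("111 0101 110", "even"))  # exercise e)
print(check_block("101 1111 000", "odd"))   # exercise f)
print(check_block("111 0111 101", "odd"))   # exercise g)
```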
h) The following table contains a list of blocks that arrived through the network. All blocks were encoded using the following odd or even algorithm:
P0 = parity (D0, D1, D3, D4, D5)
P1 = parity (D1, D2, D3, D5, D6)
P2 = parity (D0, D2, D3, D4, D6)
The correct algorithm (odd or even) is defined in the table for each block. Decode each received block to find the correct data bits of the original block assuming no more than one bit flipped.

No.  Block Received  Algorithm  Flipped bit  Original Block
1    111 1101 000    Odd        D5           111 1111
2    000 0100 111    Odd        D0           100 0100
3    110 1001 110    Even       P2           110 1001
4    110 0111 000    Even       D2           111 0111
5    110 0001 000    Even       None         110 0001
6    101 1010 000    Even       D3           101 0010
7    100 1111 010    Odd        D3           100 0111
8    010 1111 000    Odd        D1           000 1111
9    101 0111 000    Odd        P2           101 0111
10   111 0110 000    Even       D6           111 0111
11   110 0111 110    Even       D4           110 0011
12   111 0001 011    Even       None         111 0001
13   111 0101 101    Odd        P0           111 0101
14   101 0100 110    Odd        P1           101 0100
15   001 1000 110    Odd        D4           001 1100
16   011 1010 101    Odd        D5           011 1000
17   000 0000 100    Odd        D6           000 0001
18   111 1111 010    Even       D0           011 1111
19   000 0010 001    Even       D3           000 1010
20   010 0101 101    Even       D4           010 0001
21   011 1011 110    Even       P2           011 1011
22   011 0101 010    Odd        D5           011 0111
23   011 0111 100    Odd        D5           011 0101
24   010 1001 000    Odd        D4           010 1101
2. Hamming Codes
a) Use odd Hamming codes for encoding the value 1111
Solution
According to the Hamming codes mechanism:
- P1 guards all bits with the “one” bit on in their address (bits D3, D5, D7)
- P2 guards the bits with the “two” bit on in their address (D3, D6, D7)
- P4 guards the bits with the “four” bit on in their address (D5, D6, D7)
We have to calculate the parity bits according to these rules and we’ll get:
- P1=0
- P2=0
- P4=0
The following table describes the process divided into steps.
- First line defines the addresses of the bits
- Second line adds the data bits (in the proper locations)
- The next three lines set the parity bits
- The last line is the encoded value

Bit      P1   P2   D3   P4   D5   D6   D7
Address  001  010  011  100  101  110  111
Data     ---  ---  1    ---  1    1    1
P1       0    ---  1    ---  1    ---  1
P2       ---  0    1    ---  ---  1    1
P4       ---  ---  ---  0    1    1    1
Encoded  0    0    1    0    1    1    1
There is however another way, which some students may find easier: performing the XOR function on all the addresses of the bits that are set in the original message (3, 5, 6, 7 or in binary 011, 101, 110, 111). Such a function will produce the value 111. This value is correct for even parity. Since in this case the parity was defined as odd, the number has to be inverted and we will get 000. All that remains is placing the parity bits in their proper location in the message to get the encoded message 0010111
b) Use odd Hamming codes for encoding the value 1010
Solution
According to the Hamming codes mechanism:
- P1 guards all bits with the “one” bit on in their address (bits D3, D5, D7)
- P2 guards the bits with the “two” bit on in their address (D3, D6, D7)
- P4 guards the bits with the “four” bit on in their address (D5, D6, D7)
We have to calculate the parity bits according to these rules and we’ll get:
- P1=0
- P2=1
- P4=0
The following table describes the process divided into steps.
- First line defines the addresses of the bits
- Second line adds the data bits (in the proper locations)
- The next three lines set the parity bits
- The last line is the encoded value

Bit      P1   P2   D3   P4   D5   D6   D7
Address  001  010  011  100  101  110  111
Data     ---  ---  1    ---  0    1    0
P1       0    ---  1    ---  0    ---  0
P2       ---  1    1    ---  ---  1    0
P4       ---  ---  ---  0    0    1    0
Encoded  0    1    1    0    0    1    0
There is also the shorter way of using the XOR function on all the addresses of the bits that are set in the original message (3, 6 or in binary 011, 110). Such a function will produce the value 101. This value is correct for even parity. Since in this case the parity was defined as odd, the number has to be inverted and we will get 010 (P4=0, P2=1, P1=0). All that remains is placing the parity bits in their proper location in the message to get the encoded message 0110010
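The shorter way described above (XOR of the addresses of the set data bits, inverted for odd parity) can be sketched in Python. The function name `hamming_encode` is an assumption made for this illustration; the sketch places the parity bits at the addresses that are powers of two (1, 2, 4, 8, ...) and the data bits, left to right, at the remaining addresses.

```python
def hamming_encode(data, parity="even"):
    """Encode a string of data bits using the XOR-of-addresses shortcut."""
    data = data.replace(" ", "")
    # Number of parity bits r: the smallest r with 2**r >= len(data) + r + 1
    r = 0
    while 2 ** r < len(data) + r + 1:
        r += 1
    n = len(data) + r
    code = [0] * (n + 1)  # index 0 is unused so that indices equal addresses
    bits = iter(data)
    for addr in range(1, n + 1):
        if addr & (addr - 1):  # not a power of two, so a data location
            code[addr] = int(next(bits))
    # XOR of the addresses of all data bits that are set
    value = 0
    for addr in range(1, n + 1):
        if code[addr]:
            value ^= addr
    if parity == "odd":
        value ^= (1 << r) - 1  # invert every parity bit
    # Bit k of the value is the parity bit stored at address 2**k
    for k in range(r):
        code[1 << k] = (value >> k) & 1
    return "".join(str(b) for b in code[1:])


print(hamming_encode("1111", "odd"))  # exercise a)
print(hamming_encode("1010", "odd"))  # exercise b)
```

Because the number of parity bits is derived from the length of the data, the same sketch also handles longer values that need a P8 parity bit.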
c) Use even Hamming codes for encoding the value 1101 1110
Solution
Since this number contains more bits we will need more parity bits as well (P1, P2, P4 and P8). It is possible to encode the number using the longer way (with a table, as described in the previous two exercises), but we will use the shorter way. We will place the data bits in their proper location and perform a XOR between the addresses of the bits that are on (in this case addresses 3, 5, 7, 9, 10, 11).
XOR (0011, 0101, 0111, 1001, 1010, 1011) = 1001
Since in this case the parity is defined as even, the XOR code represents the parity bits (P8=1, P4=0, P2=0, P1=1). All that remains is to construct the encoded message 1010 1011 1110

d) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes for each line that represents a data block. The specific parity to be used for each block is defined as well.

No.  Original Data  Parity  Encoded Block
1    011 1111       Even    000 1111 1111
2    011 1111       Odd     110 0111 0111
3    111 0001       Odd     001 1110 0001
4    101 1011       Even    111 0011 0011
5    000 1111       Even    110 1001 1111
6    010 0100       Odd     110 0100 0100
7    101 1100       Odd     001 1011 0100
8    101 1100       Even    111 0011 1100
9    100 1000       Even    001 1001 0000
10   111 1000       Even    111 1111 0000
11   010 1001       Odd     010 1101 0001
12   010 1001       Even    100 0101 1001
13   001 1111       Odd     010 1011 0111
14   000 0111       Odd     110 1000 0111
15   101 0101       Even    111 1010 0101
3. SECDED
a) Use even Hamming codes for encoding the value 110 1001 and in addition add an odd SECDED bit
Solution
Calculating the Hamming codes was addressed in previous exercises (either by calculating the codes using a table or using the shorter way with the XOR function). In this specific case the data 110 1001 will be encoded into 011 0101 1001.
SECDED adds one more parity bit, calculated over the whole encoded block. In case of failure this makes it possible to correct a single bit flip and to detect a double bit flip. After adding the odd SECDED parity the block will contain: 1011 0101 1001

b) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes as well as the SECDED. The specific parity to be used for each block is defined as well.

No.  Original Data  Hamming Parity  SECDED Parity  Encoded Block
1    101 1111       Even            Odd            1011 0011 1111
2    010 1100       Even            Even           0110 0101 1100
3    001 0001       Odd             Odd            0010 0010 0001
4    100 1110       Even            Odd            0111 1001 0110
5    101 0001       Odd             Even           0101 0010 0001
6    111 0001       Odd             Odd            0001 1110 0001
7    011 0110       Even            Odd            1000 0110 0110
8    001 1010       Even            Even           0110 0011 1010
9    010 0010       Odd             Even           0000 0100 0010
10   100 1110       Odd             Odd            0001 0001 1110
11   000 0111       Even            Odd            1000 0000 1111
12   010 1110       Even            Odd            0100 0101 0110
13   111 1000       Odd             Even           1001 0111 1000
14   011 0011       Odd             Odd            0100 1110 1011
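Adding the SECDED bit to a block that was already encoded with Hamming codes can be sketched as follows. The function name `add_secded` is an assumption made for this illustration, and the SECDED bit is placed in front of the Hamming block, as in the solved example of exercise a).

```python
def add_secded(hamming_block, parity):
    """Prefix a Hamming-encoded block with its SECDED parity bit.

    The SECDED bit is a parity bit over the whole block: the total number
    of ones, including the SECDED bit, is even or odd as requested.
    """
    block = hamming_block.replace(" ", "")
    ones = block.count("1")
    if parity == "even":
        bit = ones % 2
    elif parity == "odd":
        bit = 1 - ones % 2
    else:
        raise ValueError("parity must be 'even' or 'odd'")
    return str(bit) + block


# Exercise a): 011 0101 1001 has six ones, so an odd SECDED bit must be 1
print(add_secded("011 0101 1001", "odd"))
```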
c) A message with the content: 0001 0111 0111 arrived from the network. It was encoded using odd Hamming codes and an odd SECDED. Decode it to obtain the original data.
Solution
The first step is checking the SECDED. Since it is correct it implies that either the message is correct or that two bits were flipped. In the case two bits were flipped the original data cannot be calculated.
The next step is calculating the Hamming codes. In this case the codes are correct. This means that the message is correct.
The third and last step is to retrieve the original data bits. In this case the data sent was: 111 1111
d) A message with the content: 1111 0010 0011 arrived from the network. It was encoded using even Hamming codes and an even SECDED. Decode it to obtain the original data.
Solution
The first step is checking the SECDED. Since it is wrong it implies that one bit was flipped.
The next step is calculating the Hamming codes. In this case the three Hamming codes P1, P2, P4 are wrong. The only bit that is shared by these three guard bits is bit 7.
The third and last step is to retrieve the original data bits and flip bit 7. In this case the data sent was: 101 1011.
As with the previous example (of calculating the Hamming codes), there is also a short way to figure out the integrity of the message received.
Step 1 - Performing a XOR between the addresses of the data bits set in the message received (3, 6, 10, 11)
XOR (0011 0110 1010 1011) = 0100
Step 2 - Performing a XOR between this value and the parity bits obtained in the message (P8 P4 P2 P1 = 0011)
XOR (0100 0011) = 0111
The value calculated is the address of the bit flipped (bit 7).
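The three decoding steps used in exercises c) and d) can be sketched in Python. The function name `secded_decode` and its return format are assumptions made for this illustration. The sketch uses a variant of the short way: it XORs the addresses of all the bits that are set in the received Hamming block, parity bits included, which for even parity gives zero for a correct block and otherwise the address of the flipped bit. For odd parity every parity bit is inverted, so the result is inverted as well.

```python
def secded_decode(block, hamming_parity="even", secded_parity="even"):
    """Decode a block made of a SECDED bit followed by a Hamming block.

    Returns (flipped, data): flipped is None, "SECDED" or the address of
    the corrected bit; data is the string of original data bits.
    """
    bits = [int(c) for c in block.replace(" ", "")]
    code = bits[1:]  # Hamming block, addresses 1..n
    n = len(code)
    # Step 1: check the SECDED parity over the whole block
    secded_ok = sum(bits) % 2 == (0 if secded_parity == "even" else 1)
    # Step 2: XOR the addresses of all the bits that are set
    syndrome = 0
    for addr, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= addr
    if hamming_parity == "odd":
        syndrome ^= (1 << n.bit_length()) - 1  # one bit per parity bit
    flipped = None
    if syndrome and secded_ok:
        raise ValueError("two bits flipped; the data cannot be recovered")
    if syndrome:
        if syndrome > n:
            raise ValueError("more than one bit flipped")
        code[syndrome - 1] ^= 1  # correct the single flipped bit
        flipped = syndrome
    elif not secded_ok:
        flipped = "SECDED"  # only the SECDED bit itself was flipped
    # Step 3: the data bits are at the addresses that are not powers of two
    data = "".join(
        str(b) for addr, b in enumerate(code, start=1) if addr & (addr - 1)
    )
    return flipped, data


print(secded_decode("0001 0111 0111", "odd", "odd"))    # exercise c)
print(secded_decode("1111 0010 0011", "even", "even"))  # exercise d)
```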
e) The following table contains a list of blocks that were received. Decode the Hamming codes as well as the SECDED. The specific parity to be used for each block is defined as well.

No.  Received Block  Hamming Parity  SECDED Parity  Bit Flipped  Original Data
1    0001 0101 0111  Odd             Odd            D6           111 1111
2    1010 1101 1001  Even            Even           D7           010 0001
3    1000 1000 0000  Odd             Odd            D11          000 0001
4    0110 1100 0101  Even            Odd            SECDED       010 0101
5    0100 0111 1110  Odd             Even           P1           011 1110
6    0001 0010 1001  Odd             Odd            D9           101 0101
7    1110 1101 1111  Even            Odd            D5           000 1111
8    0110 0010 1010  Even            Even           D7           001 1010
9    1000 0111 1000  Odd             Even           D3           111 1000
10   0100 0000 0001  Odd             Odd            D5           010 0001
11   1011 1100 1101  Even            Odd            D10          110 0111
12   1000 1010 1110  Even            Odd            D9           001 0010
13   1001 1111 1011  Odd             Even           D5           101 1011
14   1110 0111 1010  Odd             Odd            D10          011 1000
15   1011 1101 1100  Even            Odd            D6           111 1100
16   1101 1010 0101  Even            Even           P2           101 0101
17   1100 0001 1111  Even            Even           D6           001 1111
18   0110 1000 0001  Odd             Odd            D3           100 0001
19   1010 0001 0000  Odd             Even           D10          000 1010
20   0001 1001 0111  Even            Odd            P8           100 1111