Computer Systems Architecture (Solutions, Instructor Solution Manual) [1 ed.] 9781482231052, 1482231050



English Pages [162] Year 2016


Computer Systems Architecture exercises

Computer Systems Architecture Exercises

Drawing by Tamar Yadin

Aharon Yadin

© Revision 1.0


Preface

This booklet contains many exercises related to various chapters of the book. The intention is to provide additional practice material for the students. The exercises are organized by the relevant book chapters and by learning subject. Students do not have to solve all the exercises. For each subject, the student should try to solve several exercises. If these were solved correctly, the student can proceed to the next subject. If, on the other hand, the solutions are wrong, the student is advised to continue solving additional exercises. For the calculation exercises, solutions are provided at the end of the booklet.


Revision History

Revision | Date | Changes
1.0 | October 2016 | Original document

Chapter 2 – Data Representation exercises

1. Decimal to Binary conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Number

1

123

2

1 023

3

765

4

9 898

5

3 999

6

159

7

762

8

7 602

9

5 577

10

2 004

11

11 000

12

35 355

13

404

14

1 234

15

4 949

16

573

17

669

18

917

19

8 192

20

7 811

21

8 642

22

3 789

23

7 003

Binary Number (solved example, No. 1): 111 1011

24

3 887

25

6 423

26

12 129

27

27 298

28

19 999

29

9 873

30

17 399

31

57

32

634

33

9 824

34

10 000

35

5 665

36

7 991

37

999

38

800

39

3 333

40

7 007

41

12 123

42

255

43

7 777

44

5 656

45

4 321

46

99

47

375

48

1 010

49

8 119


50

6 944

51

2 468

52

1 753

53

4 762

54

8 117

55

1 928

56

7 956

57

19 175

58

22 222

59

7 275

60

1 983

61

5 555

62

36 133

63

11 223

64

4 590

65

21 325

66

9 176

67

9

68

81

69

5 933

70

9 724

71

5 311

72

14 000

73

781

74

35

75

28 753
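The conversions in this exercise can be checked with the repeated-division method. The following is a minimal Python sketch, not part of the original booklet; the function names `to_binary` and `group4` are illustrative assumptions, and the grouping in fours simply mimics how the booklet prints binary numbers.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n < 0:
        raise ValueError("only unsigned numbers are handled")
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # remainder is the next bit, least significant first
        bits.append(str(remainder))
    return "".join(reversed(bits))


def group4(bits: str) -> str:
    """Group the bits in fours, starting from the right."""
    groups = []
    while bits:
        groups.append(bits[-4:])
        bits = bits[:-4]
    return " ".join(reversed(groups))


print(group4(to_binary(123)))  # solved example: 111 1011
```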


2. Binary to Decimal conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Binary Number

1

1 0000 0000

2

1100 1100 1101

3

1 0010 0011 0100

4

1 1000 0100 0010

5

100 0000 0001

6

1010 1010 1010

7

1 0000 0000 0001

8

100 1000 1000 1101

9

1100 1100 1100

10

111 1000 0111

11

1 0000 0010 0100

12

10 1010 1010 1010

13

111 0111 0111 0111

14

111 1011 0110

15

1101 0000 1101

16

1110 1001 1110

17

101 0010 1100 0001

18

101 0000 0000 1010

19

101 0001 0001 1111

20

100 1101 1100 0001

21

1 1111 0011 0011

22

11 0001 1111 1111

23

11 0100 0101 0110

24

111 1010 0001 0011

Decimal Number (solved example, No. 1): 256

25

111 1010 1011 1100

26

11 0011 1001 1000

27

1111 1010 1010

28

10 1000 1100 1001

29

1011 0101 0101

30

110 1110 1111

31

11 1111 0100 0000

32

1 0001 0000 0001

33

1 1111 1111 1111

34

110 0110 1000 0100

35

1001 1001 1001

36

1001 1010 1010

37

11 0110 0111

38

1111 0000 0000

39

1000 0000 0001

40

1010 0000 1111

41

1000 1001 1010 1011

42

1001 0111 0110

43

11 0011 0011 0011

44

1001 0111 0101 0011

45

11 1000 1010

46

1011 1011 1000

47

1001 1110 1110

48

1110 1110 1001

49

101 0001 1110 0101

50

11 0000 0000 0011


51

10 0100 0110 1001

52

111 1010 1011

53

1 1010 0010 0010

54

101 0001 0001 0001

55

111 0110 0101 0100

56

1001 0010 0011 0100

57

1 1001 0010 1000

58

1000 0010 0111 0011

59

1011 1100 1100

60

1010 1001 1000

61

110 1000 1010 0001

62

100 1011 0001 1010

63

10 1011 0110 1001

64

111 0001 0101 0111

65

111 0101 1001 0100

66

1 1111 0000 1000

67

1 0101 1011

68

10 1011 0000 0011

69

10 0010 1011 1000

70

1 1000 1011 1101

71

1110 1001 1001

72

1 1110 1000 1111

73

110 1100 0001 1001

74

100 1010 0011 1101

75

10 0000 0011 0110

76

100 0000 0000 0000


3. Decimal to Hexadecimal conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Number

1

31 987

2

21 007

3

17 000

4

14 555

5

17 800

6

15 841

7

14 773

8

17 183

9

79 891

10

22 888

11

12 871

12

21 680

13

14 000

14

29 612

15

16 658

16

22 888

17

10 000

18

29 999

19

30 651

20

31 113

21

30 001

22

17 325

23

9 876

24

21 111

Hexadecimal Number (solved example, No. 1): 7CF3

25

911

26

14 590

27

17 618

28

9 784

29

11 011

30

8 933

31

12 617

32

21 039

33

10 000

34

6 785

35

12 777

36

24 242

37

19 898

38

6 444

39

717

40

3 982

41

10 986

42

519

43

4 102

44

22 097

45

27 963

46

16 741

47

3 785

48

9 261

49

4 022

50

26 789


4. Hexadecimal to Decimal conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Hexadecimal Number

1

8 000

2

6 B8C

3

3 5DD

4

3 E1D

5

2 B67

6

3 FCD

7

2 6A0

8

4 705

9

A FC8

10

2 FAD

11

8 791

12

7 A7A

13

B 00F

14

A 000

15

7 575

16

2 FAD

17

5 AB0

18

5 7E4

19

4 944

20

313

21

3 745

22

6 20B

23

2 2CF

24

5 A9B

Decimal Number (solved example, No. 1): 32 768

25

9 C40

26

20F

27

7E0

28

7 97B

29

2 F59

30

2 333

31

4 2E0

32

2 000

33

1 D6B

34

395

35

3 30A

36

2 222

37

4 D6D

38

4 36F

39

2C7

40

1 B63

41

DAB

42

FFA

43

1 F62

44

1 565

45

393

46

9A

47

4 7B3

48

4 267

49

2 DE2

50

2 107


5. Decimal to Octal conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Number

1

3 755

2

11 111

3

1 733

4

12 121

5

9 875

6

9 077

7

96 017

8

23 777

9

35 000

10

33 333

11

44 330

12

45 145

13

53 723

14

2 323

15

47 474

16

32 432

17

53 521

18

39 999

19

22 555

20

40 960

21

261 549

22

20 480

23

32 767

24

319

Octal Number (solved example, No. 1): 7 253

25

26 788

26

10 976

27

17 435

28

8 099

29

11 099

30

25 000

31

9 822

32

12 377

33

5 077

34

7 333

35

49

36

18 555

37

9 000

38

7 102

39

8 989

40

15 415

41

2 017

42

6 522

43

16 581

44

14 037

45

9 378

46

14 043

47

9 998

48

5 037

49

17 321

50

12 055


6. Octal to Decimal conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Octal Number

1

4 227

2

23 417

3

1 733

4

6 474

5

36 475

6

1 761

7

17 630

8

47 037

9

76 220

10

67 175

11

137 743

12

114 265

13

67 742

14

2 274

15

27 351

16

105 033

17

202 152

18

121 666

19

51 167

20

65 500

21

17 777

22

77 123

23

37 000

24

325

Decimal Number (solved example, No. 1): 2 199

25

1 111

26

12 345

27

25 252

28

6 666

29

70 001

30

56 712

31

1 000

32

63 451

33

36 712

34

3 412

35

6 711

36

7 033

37

4 510

38

2 777

39

10 666

40

12 012

41

43 651

42

33 773

43

6 236

44

1 177

45

17 451

46

6 755

47

23 773

48

5 000

49

16 666

50

7 654


7. Base conversions (3-10)

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Number

1

159

2

1 991

Base 3

Base 4

12 220

2 133

Base 5 1 114

Base 6 423

133 102

4

304

5

363

6 789 1 210 010

8

33 221

9

24 102

10

21 241

11

20 352

12 13

8 192 101 021 210

14

332 133

15

112 421

16

51 155

17

20 410

18 19 20 21

315

101 111 111

3

7

Base 7

2 966 2 121 021 120 132 21 401

22

5 153

23

11 520

24

25

Decimal Number 1 733

Base 3

Base 4

Base 5

Base 6

22 212 210

26

20 321 323

27

24 021

28

213 342

29

212 660

30 31

16 220 1 101 021 111

32

21 031 233

33

424 030

34

114 143

35

6 426

36 37

21 351 111 202 121

38

10 322 320

39

331 313

40

140 543

41

22 015

42 43

6 236 1 121 121

44

3 201 303

45

222 010

46

302 021

47

20 402

48 49 50

Base 7

16 666 101 021 022


8. Base conversions (10-15)

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Number

1

1234

2

3591

Base 11

Base 12

Base 13

Base 14

Base 15

A22

86A

73C

642

574

586A

3

3580

4

101C

5

1386

6

13AC

7 8

2987 911A

9

5953

10

3B3B

11

2939

12

1DBC

13 14

3926 4151

15

4238

16

5BB0

17

4099

18

2E74

19 20 21 22

13456 15211 B6A7 62BC

23

5655

24
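The base-conversion exercises (3 to 8) all follow the same two procedures: repeated division by the target base, and positional summation when going back to decimal. Below is a minimal Python sketch of both, offered as a checking aid rather than the booklet's own method. The names `to_base`, `from_base` and `DIGITS` are illustrative assumptions; digits above 9 use the letters A to F, as in the solved examples.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to the given base (2..16) by repeated division."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 16")
    if n < 0:
        raise ValueError("only unsigned numbers are handled")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)  # remainder is the next digit, least significant first
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def from_base(text: str, base: int) -> int:
    """Convert a digit string in the given base to an integer (spaces are ignored)."""
    value = 0
    for ch in text.replace(" ", "").upper():
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    return value


# Solved example of exercise 8: 1234 in bases 11 to 15
print([to_base(1234, b) for b in range(11, 16)])  # ['A22', '86A', '73C', '642', '574']
```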


9. Decimal fractions to Binary fractions conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Decimal Fraction

1

0.4375

2

0.1875

3

0.625

4

0.1171875

5

0.546875

6

0.875

7

0.34375

8

0.9375

9

0.375

10

0.40625

11

0.140625

12

0.65625

13

0.234375

14

0.734375

15

0.8125

16

0.8984375

17

0.71875

18

0.109375

19

0.578125

20

0.953125

21

0.5546875

22

0.4765625

23

0.13671875

24

0.49609375

Binary Fraction (solved example, No. 1): 0.0111

10. Binary fractions to Decimal fractions conversions

All numbers are positive (unsigned numbers). The first line is a solved example.

No.

Binary Fraction

1

0.1

2

0.001

3

0.1111111

4

0.010001

5

0.010111

6

0.000001

7

0.00011

8

0.1011

9

0.00101

10

0.000101

11

0.10001

12

0.001101

13

0.11101

14

0.11

15

0.100001

16

0.00111

17

0.1001

18

0.010011

19

0.011011

20

0.11111

21

0.0101101

22

0.1100101

23

0.0000011

24

0.00011011

Decimal Fraction (solved example, No. 1): 0.5
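Exercises 9 and 10 can be checked with the repeated-multiplication method (multiply the fraction by 2, take the integer part as the next bit) and with positional summation in the other direction. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and Python's `fractions.Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction


def fraction_to_binary(value: str, max_bits: int = 16) -> str:
    """Convert a decimal fraction in [0, 1) to a binary fraction string."""
    frac = Fraction(value)  # exact rational value, no floating-point rounding
    if not 0 <= frac < 1:
        raise ValueError("expected a fraction in the range [0, 1)")
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = frac.numerator // frac.denominator  # integer part is the next bit
        bits.append(str(bit))
        frac -= bit
    return "0." + ("".join(bits) or "0")


def binary_to_fraction(text: str) -> Fraction:
    """Sum bit_i * 2**-i over the bits after the binary point."""
    _, _, bits = text.partition(".")
    return sum((Fraction(int(b), 2 ** i) for i, b in enumerate(bits, start=1)), Fraction(0))


print(fraction_to_binary("0.4375"))      # 0.0111
print(float(binary_to_fraction("0.1")))  # 0.5
```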

11. Negative number representations

Convert the negative decimal number to binary (16 bits). The first line is a solved example.

Binary Numbers

No.

Decimal Number

One’s Complement

Two’s Complement

1

- 4095

1111 0000 0000 0000

1111 0000 0000 0001

2

- 7612

3

- 1985

4

- 5777

5

- 8000

6

- 729

7

- 3333

8

- 4799

9

- 6222

10

- 3488

11

- 8190

12

- 9999

13

- 7231

14

-3

15

- 127

16

- 4777

17

- 1741

18

- 7676

19

- 4288

20

- 3636

21

- 1901

22

- 8076

23

- 5555

Sign and Magnitude (solved example, No. 1): 1000 1111 1111 1111
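The three 16-bit notations used in this exercise can be sketched in Python as follows. This is an illustrative aid, not the booklet's solution method; the function names and the `BITS` constant are assumptions introduced here.

```python
BITS = 16
MASK = (1 << BITS) - 1


def group4(bits: str) -> str:
    """Print a 16-bit string in groups of four bits."""
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


def sign_magnitude(n: int) -> str:
    """Sign bit followed by the 15-bit magnitude."""
    if abs(n) >= 1 << (BITS - 1):
        raise ValueError("magnitude does not fit in 15 bits")
    sign = 1 if n < 0 else 0
    return format((sign << (BITS - 1)) | abs(n), f"0{BITS}b")


def ones_complement(n: int) -> str:
    """Negative values are the bitwise inverse of the magnitude."""
    if abs(n) >= 1 << (BITS - 1):
        raise ValueError("magnitude does not fit in 15 bits")
    value = n if n >= 0 else (~(-n)) & MASK
    return format(value, f"0{BITS}b")


def twos_complement(n: int) -> str:
    """Negative values are the one's complement plus one (n modulo 2**16)."""
    if not -(1 << (BITS - 1)) <= n < (1 << (BITS - 1)):
        raise ValueError("value does not fit in 16 bits")
    return format(n & MASK, f"0{BITS}b")


for encode in (ones_complement, twos_complement, sign_magnitude):
    print(encode.__name__, group4(encode(-4095)))
```

For the solved example (-4095) this prints the three bit patterns given in the first line of the table.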

12. Number representations

Assuming the hexadecimal number represents a 16-bit signed binary number, convert it to a decimal number. The three columns represent the binary notation used. The first line is a solved example.

Decimal Numbers

No.

Hexadecimal Number

1

64

2

8FFF

3

6ABC

4

F0F0

5

FC92

6

FAB0

7

2000

8

EEEE

9

D012

10

8111

11

7FFF

12

BAD1

13

A000

14

E900

15

D780

16

9BBB

17

AAAA

18

6819

19

8750

20

C009

21

ABCD

22

DFEF

Decimal Numbers (solved example, No. 1): One’s Complement 100, Two’s Complement 100, Sign and Magnitude 100
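The reverse direction, reading a 16-bit pattern under each of the three notations, can be sketched as below. The function name `decode` and the dictionary keys are assumptions made for illustration; the checks use the bit patterns of the solved example in the previous exercise (-4095).

```python
def decode(hex_text: str, bits: int = 16) -> dict:
    """Interpret a hexadecimal bit pattern under three signed notations."""
    raw = int(hex_text, 16)
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    if raw > mask:
        raise ValueError(f"value does not fit in {bits} bits")
    if (raw & sign_bit) == 0:
        # Non-negative numbers look the same in all three notations.
        return {"ones_complement": raw, "twos_complement": raw, "sign_magnitude": raw}
    return {
        "ones_complement": -((~raw) & mask),        # invert all bits to get the magnitude
        "twos_complement": raw - (1 << bits),       # subtract 2**bits
        "sign_magnitude": -(raw & (sign_bit - 1)),  # drop the sign bit
    }


print(decode("64"))  # solved example: 100 in all three notations
```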

13. Adding binary numbers

The first line is a solved example.

Binary Numbers

No.

First Number

Second Number

1

0011 0000 1101 0100

0010 0111 0000 0011

2

0010 0111 0000 1111

0010 1110 1110 1100

3

0000 1010 1110 0100

0010 1110 0000 1000

4

0010 1011 0110 0111

0000 0010 0010 1011

5

0001 1111 1111 1111

0001 1111 1111 1111

6

0000 1111 1111 1111

0010 1010 1100 1010

7

0010 1010 0101 1010

0000 0111 1111 1111

8

0001 1010 1011 1100

0001 1011 1100 1101

9

0001 1111 1110 1111

0000 1111 1101 1011

10

0011 1011 1010 1101

0001 1101 1101 1101

11

0101 1001 1001 0011

0001 1111 1111 1111

12

0001 0010 0011 0100

0001 0010 0011 0100

13

0100 1010 1011 0101

0010 0101 0111 1011

14

0011 1111 1111 1111

0011 1111 1111 1111

15

0001 1010 1010 1010

0010 1101 1101 1101

16

0100 1110 1110 1110

0001 1100 1100 1100

17

0101 0101 0101 0101

0001 1111 1111 1111

18

0001 1001 1111 1011

0011 1111 1011 1001

19

0011 1010 0110 0111

0001 0110 0111 1010

20

0001 1111 1011 1110

0011 1010 1100 1101

21

0101 1010 1000 1011

0000 1010 1000 1011

22

0010 0111 0111 0111

0010 1011 1011 1011

23

0100 0100 0100 0100

0001 1100 1100 1100

Result (solved example, No. 1): 0101 0111 1101 0111

14. Adding Hexadecimal unsigned numbers

The first line is a solved example.

Hexadecimal Numbers

No.

First Number

Second Number

1

7895

4ABC

2

1DDD

2CCC

3

3DEF

2999

4

42BD

2BFC

5

4623

3BDE

6

23DD

3D1F

7

5A96

1E0F

8

8B1E

1ED6

9

22FF

3DF9

10

2D2D

3F3F

11

3DDE

3EED

12

17EF

6E3F

13

9ABC

1234

14

5555

6789

15

1FFF

6BAD

16

6841

30FB

17

351D

4FE2

18

10FF

3EC7

19

22EE

3DC9

20

6699

1EB5

21

3D8E

3F0F

22

90E9

2E3F

23

4BA4

5CB5

Result (solved example, No. 1): C351

15. Adding Octal unsigned numbers

The first line is a solved example.

Octal Numbers

No.

First Number

Second Number

1

3125

5632

2

7777

1111

3

21430

6705

4

23561

17777

5

36217

23567

6

15374

25517

7

52527

12774

8

22337

33557

9

17652

21576

10

11666

55645

11

21766

51622

12

62357

22366

13

12345

12345

14

76543

1111

15

32145

12306

16

17365

36547

17

47147

15647

18

16736

26576

19

26077

25716

20

37437

26726

21

42637

20675

22

33512

32645

23

32325

35353

Result (solved example, No. 1): 10757

16. Adding Base 3 unsigned numbers

The first line is a solved example.

Base 3 Numbers

No.

First Number

Second Number

1

111020

11221

2

11101

100022

3

10121

20120

4

10202

21111

5

11010

22020

6

11120

21220

7

101201

20111

8

101011

102121

9

21101

21102

10

21201

21220

11

22020

20120

12

101120

20021

13

102010

2001

14

100212

20011

15

101210

21002

16

20101

12201

17

22122

20120

18

101201

21110

19

100110

20111

20

21202

12111

21

21211

101010

22

102101

11121

23

22002

101011

Result (solved example, No. 1): 200011

17. Adding Base 4 unsigned numbers

The first line is a solved example.

Base 4 Numbers

No.

First Number

Second Number

1

30110

10320

2

22323

11120

3

20223

11031

4

21223

11322

5

22333

10332

6

30312

12233

7

30112

10331

8

22100

20311

9

32011

12003

10

31221

2001

11

23331

22032

12

33031

12300

13

31320

22122

14

32211

23020

15

22323

21121

16

23202

30322

17

30230

13212

18

32003

23113

19

32331

22233

20

33220

13310

21

103102

3222

22

101113

11031

23

31333

30122

Result (solved example, No. 1): 101030
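The addition exercises (13 to 17) use the same column-by-column procedure in every base: add the two digits and the carry, write the sum modulo the base, and carry the quotient. A minimal Python sketch follows; `add_in_base` is a hypothetical helper name, and the checks use the solved examples of the addition exercises.

```python
DIGITS = "0123456789ABCDEF"


def add_in_base(a: str, b: str, base: int) -> str:
    """Add two unsigned digit strings column by column in the given base."""
    x = [DIGITS.index(ch) for ch in reversed(a.replace(" ", "").upper())]
    y = [DIGITS.index(ch) for ch in reversed(b.replace(" ", "").upper())]
    if any(d >= base for d in x + y):
        raise ValueError(f"digit not valid in base {base}")
    result = []
    carry = 0
    for i in range(max(len(x), len(y))):
        total = carry + (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0)
        carry, digit = divmod(total, base)  # digit stays in this column, carry moves left
        result.append(DIGITS[digit])
    if carry:
        result.append(DIGITS[carry])
    return "".join(reversed(result))


print(add_in_base("111020", "11221", 3))  # 200011
print(add_in_base("30110", "10320", 4))   # 101030
```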

18. Multiplying binary numbers

All numbers are unsigned 16-bit numbers. The first line is a solved example.

Binary Numbers

No.

First Number

Second Number

1

1111 1111

110 0010

2

10 0011

101 0110

3

100 1000

1000 1010

4

1 1010 1011 0010

1101

5

101 1101 1111

1 0011

6

100 0000 1111

1 1111

7

11 1100 1011

1001

8

1 1000 0011

1 1101

9

111 1000 1001

1101

10

1101 1101 1110

110

11

1 0000 1000 1001

101

12

10 0011 1001

1011

13

1010 1011 1011

111

14

1 1010 1011

1 0010

15

11 0110 1001

1110

16

101 0101 0010

1 1110

17

1110 1110 0010

1011

18

1011 1010 1101

1110

19

1011 0000 0000

1101

20

100 1000 0011

1 1011

21

1011 1011 0001

110

22

1111 1111

1111 1111

23

1011 1011

1011 1011

Result (solved example, No. 1): 110 0001 1001 1110
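Binary multiplication by hand is a sequence of shifts and additions: for every 1 bit in the second number, the first number, shifted to that bit's position, is added to the product. The sketch below illustrates this; the function name `shift_and_add` is an assumption, and the sketch does not model a fixed 16-bit register, so it never truncates the result.

```python
def shift_and_add(a: int, b: int) -> int:
    """Multiply two unsigned integers using only shifts and additions."""
    product = 0
    while b:
        if b & 1:          # current multiplier bit is 1: add the shifted multiplicand
            product += a
        a <<= 1            # next partial product is shifted one position left
        b >>= 1            # move on to the next multiplier bit
    return product


result = shift_and_add(0b11111111, 0b1100010)
print(format(result, "b"))  # 110000110011110, i.e. 110 0001 1001 1110
```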

19. Floating point (754) notation (signed integers)

Convert between the numbers. The first line is a solved example.

No.

Decimal Number

1

1

2

2

3 4

40400000 5

5 6

41300000 17

7 8

41C00000 28

9 10

42000000 49

11 12

426C0000 65

13 14

429E0000 85

15 16

42C60000 127

17 18

436E0000 256

19 20

43808000 313

21 22

43B18000 485

23 24

Floating point Number 3F800000

43F50000 515


Decimal Number

Floating point Number 44054000

25 26

-4 C0E00000

27 28

-13 C1980000

29 30

-25 C1F80000

31 32

-37 C22C0000

33 34

-55 C2860000

35 36

-73 C29E0000

37 38

-86 C2BA0000

39 40

-101 C2EC0000

41 42

-197 C3710000

43 44

-267 C39A8000

45 46

-333 C3C80000

47 48

-499 C4034000

49 50

-541
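Conversions between decimal values and the 32-bit (single precision) IEEE 754 pattern can be checked with Python's standard `struct` module. The sketch below is an illustrative aid; the helper names are assumptions, and `fields` splits a pattern into the sign, the unbiased exponent (stored exponent minus 127) and the 23-bit fraction, which are the quantities worked out by hand in these exercises.

```python
import struct


def float_to_hex(x: float) -> str:
    """Encode a value as a big-endian IEEE 754 single and return 8 hex digits."""
    return struct.pack(">f", x).hex().upper()


def hex_to_float(h: str) -> float:
    """Decode 8 hex digits as an IEEE 754 single."""
    return struct.unpack(">f", bytes.fromhex(h))[0]


def fields(h: str) -> tuple:
    """Return (sign, unbiased exponent, 23-bit fraction) of a normalized single."""
    raw = int(h, 16)
    sign = raw >> 31
    exponent = ((raw >> 23) & 0xFF) - 127  # remove the bias of 127
    fraction = raw & 0x7FFFFF
    return sign, exponent, fraction


print(float_to_hex(1.0))         # 3F800000
print(float_to_hex(123.75))      # 42F78000
print(hex_to_float("40400000"))  # 3.0
```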


20. Floating point (754) notation (signed fractions)

Convert between the numbers. The first line is a solved example.

No.

Decimal Number

1

123.75

2

76.25

3 4

429B0000 64.3125

5 6

42378000 75.75

7 8

42158000 19.25

9 10

41F90000 0.625

11 12

42478000 63.5

13 14

42C68000 12.125

15 16

41C40000 31.625

17 18

42B10000 73.25

19 20

3F900000 3.375

21 22

419E0000 66.875

23 24

Floating point Number 42F78000

425A8000 27.25


Decimal Number

25 26

-25.625 BE000000

27 28

-8.125 C1CA0000

29 30

-26.375 C2AF0000

31 32

-63.75 C2250000

33 34

-0.3125 C0BC0000

35 36

-33.375 C1420000

37 38

-68.875 C2978000

39 40

-1.75 C2854000

41 42

-49.5 C1CE0000

43 44

-99.125 BF600000

45 46

-61.25 C21D8000

47 48

-25.25 C2560000

49 50

Floating point Number C2B24000

-71.125


21. Adding Floating point (754) numbers

Add the floating-point numbers, then convert them to decimal, add the decimal numbers and check the sum against the floating-point result obtained. The first line is a solved example.

Floating Point Numbers

No.

First Number

Second Number

1

41700000

420C0000

2

C14C0000

40300000

3

418E0000

41B20000

4

3F600000

4280A000

5

C2AF0000

422D0000

6

429A0000

42080000

7

44400000

43800000

8

41CA0000

43BBB000

9

414A0000

41460000

10

C1EC0000

3F000000

11

C18E0000

C20B0000

12

BE400000

BF500000

13

41DD0000

C0500000

14

42C00000

41C00000

15

44B08000

43008000

16

42C80000

41100000

17

41E40000

3FC00000

18

BF400000

40300000

19

413E0000

411E0000

20

C0FC0000

41140000

21

40C40000

40C40000

22

40B80000

C1340000

Solved example (No. 1): Result 42480000. Decimal Numbers: First Number 15.0, Second Number 35.0, Result 50.0

22. Multiplying Floating point (754) numbers

Multiply the floating-point numbers, then convert them to decimal, multiply the decimal numbers and check the product against the floating-point result obtained. The first line is a solved example.

Floating Point Numbers

No.

First Number

Second Number

1

40A00000

C0000000

2

41400000

40800000

3

C0700000

C1000000

4

40200000

40700000

5

42C80000

C17C0000

6

43000000

3FC00000

7

C2800000

C2000000

8

40900000

40B00000

9

44800000

44000000

10

C1C00000

42120000

11

41700000

42080000

12

411C0000

C0500000

13

44200000

3E000000

14

C2960000

C0800000

15

417A0000

40900000

16

41940000

C22D0000

17

C1900000

C2B60000

18

C2F00000

C2600000

19

41C80000

41C00000

20

C0C80000

42EC0000

21

414A0000

41840000

Solved example (No. 1): Result C1200000. Decimal Numbers: First Number 5, Second Number -2, Result -10
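The cross-check asked for in exercises 21 and 22 (decode both operands, operate on the decimal values, encode the result again) can be sketched as follows. The helper names are assumptions. Note that Python performs the arithmetic in double precision and rounds only when packing to a single, which is exact for the small values used in these exercises but is not a model of single-precision hardware in general.

```python
import struct


def hex_to_float(h: str) -> float:
    """Decode 8 hex digits as a big-endian IEEE 754 single."""
    return struct.unpack(">f", bytes.fromhex(h))[0]


def float_to_hex(x: float) -> str:
    """Encode a value as a big-endian IEEE 754 single."""
    return struct.pack(">f", x).hex().upper()


def check(op: str, first_hex: str, second_hex: str) -> tuple:
    """Return (first, second, result, result pattern) for op '+' or '*'."""
    first, second = hex_to_float(first_hex), hex_to_float(second_hex)
    if op == "+":
        result = first + second
    elif op == "*":
        result = first * second
    else:
        raise ValueError("op must be '+' or '*'")
    return first, second, result, float_to_hex(result)


print(check("+", "41700000", "420C0000"))  # (15.0, 35.0, 50.0, '42480000')
print(check("*", "40A00000", "C0000000"))  # (5.0, -2.0, -10.0, 'C1200000')
```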

23. BCD numbers (8421, 2421)

All numbers are unsigned 16-bit numbers. The first line is a solved example.

No.

Decimal Number

1

1234

2

2468

3

7136

4

5497

5

3780

6

9026

7

4512

8

7462

9

3297

10

4582

11

1097

12

7651

13

1928

14

8763

15

7329

16

8461

17

1357

18

9042

19

6703

20

2730

21

4587

22

9768

23

8264

BCD Numbers (solved example, No. 1): 8421: 0001 0010 0011 0100; 2421: 0001 0010 0011 0100

24. BCD numbers (84-2-1, Excess-3)

All numbers are unsigned 16-bit numbers. The first line is a solved example.

No.

Decimal Number

1

1234

2

2468

3

7136

4

5497

5

3780

6

9026

7

4512

8

7462

9

3297

10

4582

11

1097

12

7651

13

1928

14

8763

15

7329

16

8461

17

1357

18

9042

19

6703

20

2730

21

4587

22

9768

23

8264

BCD Numbers (solved example, No. 1): 84-2-1: 0111 0110 0101 0100; Excess-3: 0100 0101 0110 0111
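All four codes in exercises 23 and 24 encode each decimal digit separately in four bits, so a lookup table per code is enough to check the answers. The sketch below is illustrative: the names `CODES` and `encode` are assumptions, and for 2421 it assumes the common Aiken assignment (digits 5 to 9 use the patterns 1011 to 1111), since the weights 2-4-2-1 alone do not define a unique pattern for every digit.

```python
CODES = {
    # Plain weighted binary, weights 8-4-2-1
    "8421": [format(d, "04b") for d in range(10)],
    # Weights 2-4-2-1, Aiken assignment (an assumption, see the note above)
    "2421": ["0000", "0001", "0010", "0011", "0100",
             "1011", "1100", "1101", "1110", "1111"],
    # Weights 8, 4, -2, -1
    "84-2-1": ["0000", "0111", "0110", "0101", "0100",
               "1011", "1010", "1001", "1000", "1111"],
    # Binary value of (digit + 3)
    "excess-3": [format(d + 3, "04b") for d in range(10)],
}


def encode(number: int, code: str) -> str:
    """Encode a 4-digit decimal number, one 4-bit group per digit."""
    if not 0 <= number <= 9999:
        raise ValueError("expected a number with at most four decimal digits")
    return " ".join(CODES[code][int(ch)] for ch in f"{number:04d}")


for name in CODES:
    print(name, encode(1234, name))
```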

Chapter 4 – Central Processing Unit exercises

1. Architectures

In the following questions, the answer should relate to a stack-based architecture, an accumulator-based architecture and a register-register architecture.

a) Write the instructions needed for executing the following formula: C = 2A + 3B, where A, B and C are variables. Try to optimize the code (use the minimum number of instructions possible).
b) Write the instructions needed for executing the following formula: Sum = 8A + 4B + 2C + D, where A, B, C, D and Sum are variables.
c) Write the instructions needed for executing the following formula: Sum = A + 2B + 4C + 8D, where A, B, C, D and Sum are variables.
d) Write the instructions needed for executing the following formula: Sum = 2(A + 2B) - 3(C + 2D), where A, B, C, D and Sum are variables.
e) Write the instructions needed for executing the following formula: Sum = 2AB + 3CD, where A, B, C, D and Sum are variables.
f) Write the instructions needed for executing the following formula: Sum = 3(A + B) * (C + D), where A, B, C, D and Sum are variables.
g) Write the instructions needed for executing the following formula: Sum = (A + 2B + 3C) * (3A + 2B + C), where A, B, C and Sum are variables.
h) Write the instructions needed for executing the following formula: Sum = A * B * C / (A + B + C), where A, B, C and Sum are variables.
i) Write the instructions needed for executing the following formula: Sum = (A + B) * (B + C) * (A + C), where A, B, C and Sum are variables.
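To make the stack-based variant concrete, the sketch below interprets a tiny, hypothetical stack instruction set (PUSH, PUSHI, ADD, MUL, POP) in Python and runs one possible instruction sequence for C = 2A + 3B. The mnemonics and the interpreter are assumptions made for illustration; they are not the instruction set of the book, and the sequence shown is not claimed to be the minimal one. The accumulator-based and register-register variants can be explored with a similar interpreter.

```python
def run_stack_program(program, memory):
    """Execute a list of (opcode, operand...) tuples on a simple stack machine."""
    stack = []
    for op, *args in program:
        if op == "PUSH":        # push the value of a variable
            stack.append(memory[args[0]])
        elif op == "PUSHI":     # push an immediate constant
            stack.append(args[0])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "POP":       # store the top of the stack into a variable
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return memory


# One possible sequence for C = 2A + 3B
PROGRAM = [
    ("PUSH", "A"), ("PUSHI", 2), ("MUL",),
    ("PUSH", "B"), ("PUSHI", 3), ("MUL",),
    ("ADD",),
    ("POP", "C"),
]

print(run_stack_program(PROGRAM, {"A": 5, "B": 7})["C"])  # 2*5 + 3*7 = 31
```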


2. CPI

a) A specific program was executed on two computer systems. M1 has a cycle time of 1 ns and CPI = 5. M2 has a cycle time of 1.6 ns and CPI = 2.5. Which system is faster (for this specific program), and by how much?
b) Assume we need to get similar speed on both systems (described in the question above). What should the CPI of each system be?
c) Assume we need to get similar speed on both systems (described in the questions above). What should the cycle time of each system be?
d) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:

Group | Frequency | Cycles
ALU | 50% | 1
Load | 20% | 2
Store | 10% | 2
Branch | 20% | 2

Calculate the average CPI for that specific program.
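The average CPI is the frequency-weighted sum of the per-group cycle counts, CPI = Σ (frequency_i × cycles_i). A minimal Python sketch, using the instruction mix of question d) as input, is shown below; the function name `average_cpi` is an assumption.

```python
def average_cpi(mix):
    """mix is a list of (frequency, cycles) pairs; frequencies must sum to 1."""
    total = sum(frequency for frequency, _ in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 100%")
    return sum(frequency * cycles for frequency, cycles in mix)


# Instruction mix of question d): ALU, Load, Store, Branch
mix_d = [(0.50, 1), (0.20, 2), (0.10, 2), (0.20, 2)]
print(round(average_cpi(mix_d), 2))  # 1.5
```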

e) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:

Group | Frequency | Cycles
ALU | 43% | 1
Load | 21% | 1
Store | 12% | 2
Branch | 24% | 2

The hardware engineers can improve the performance of the “Store” group of instructions and execute each instruction in one cycle. However, this will require increasing the system’s cycle time by 15%. Is it worthwhile to implement the proposed change?

f) A specific system runs at 1 GHz and supports four groups of instructions. When a specific program was executed, the following usage was observed:

Group | Usage | CPI
1 | 20% | 1
2 | 30% | 2
3 | 10% | 3
4 | 40% | 4

Calculate the CPI for that program. If there were 10^10 instructions, how many cycles did it require, and what was the execution time?

g) A program was executed on a specific system and the following attributes were measured:
- The number of instructions executed was 10^9
- The CPI measured during the run was 2.5
- The clock rate was 2.5 GHz
Calculate the amount of time this program ran.

h) The program from the previous question was executed once again, but this time using a different compiler and a different hardware system. The attributes of the second run were as follows:
- The number of instructions executed was 950,000,000
- The CPI measured during the run was 3.0
- The clock rate was 3 GHz
Which system is faster, and by how much?

i) A program was executed on two systems. The first system has a 20 ns cycle time and the second a 30 ns cycle time. On the first system the CPI measured was 2.0 and on the second system the CPI was 0.5. Calculate the run time on each of the systems.
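The run-time questions in this section rest on one relation: execution time = instruction count × CPI × cycle time (where cycle time = 1 / clock rate). The sketch below applies it to the two systems of question a). The function name and the instruction count of one million are assumptions; since both systems run the same program, the count cancels out in the ratio.

```python
def execution_time(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """Execution time in seconds: instructions x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi * cycle_time_s


INSTRUCTIONS = 1_000_000  # hypothetical count, identical on both systems

t_m1 = execution_time(INSTRUCTIONS, cpi=5.0, cycle_time_s=1.0e-9)
t_m2 = execution_time(INSTRUCTIONS, cpi=2.5, cycle_time_s=1.6e-9)

print(round(t_m1 / t_m2, 2))  # 1.25, so M2 is faster by a factor of 1.25
```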

j) A program was executed on two systems. The first system has a 500 MHz clock rate and the second a 650 MHz clock rate. On the first system it used 1 * 10^8 cycles and on the second system it used 1.2 * 10^8 cycles. Which system is faster, and by how much?
k) A system has three types of instructions. The first type executes in one cycle, the second requires two cycles and the third requires three cycles. When running a piece of code, five instructions were of type one, three were of type two and two were of type three. Calculate the CPI of this piece of code.
l) While running a program on system A with a clock rate of 600 MHz, the CPI was 1.3. Running the program on system B with a clock rate of 750 MHz produced a CPI of 2.5. Assuming the number of instructions on system A was 100,000, what should the number of instructions on system B be to achieve the same execution time?
m) Running a specific program requires 3 * 10^10 cycles. How long does it run on a system with a clock rate of 100 MHz? And how long does it run on a system with a clock rate of 3 GHz?
n) A specific system uses five groups of instructions. Each group requires a different number of cycles for execution. While executing a test run, the group frequencies were measured as outlined in the following table:

Group | Name | Usage | CPI
1 | ALU | 22% | 1
2 | Memory Access | 36% | 5
3 | Branch | 16% | 3
4 | Call | 13% | 4
5 | Return | 13% | 4

In an attempt to increase the system’s performance, the hardware engineers have come up with several improvement suggestions. However, each of the suggestions has some drawbacks (in addition to the benefits). The following table summarizes the suggestions and the drawbacks. Usually an improvement relates to one group of instructions, and the drawback is a required increase in the cycle time.

Group | Name | Improvement | Cycle time increase
1 | ALU | 25% | 7%
2 | Memory Access | 35% | 17%
3 | Branch | 90% | 1%
4 | Call | 45% | 2%
5 | Return | 45% | 2%

For example, it is possible to decrease the number of cycles required by the ALU group by 25%, but this requires increasing the cycle time (for all the groups) by 7%. The only limitation is that only one improvement can be implemented. Which suggestion is the preferred one? Which one is the worst?


3. Amdahl’s Law

a) A specific system performs a task in 100 ns. It is possible to introduce an improvement, after which the task will be performed in 20 ns. The task is executed only during 30% of the time. Calculate the improvement.
b) A new system can execute a specific task 10 times faster than the existing system. The task is performed during 40% of the time, while 60% of the time is dedicated to I/O operations. What is the total speedup to be achieved by replacing the system? (Note: the 60% I/O operations will not change.)
c) In a specific application, 5% of the code has to run in serial mode while 95% of the code can be executed in parallel. What will be the improvement when moving the application to a 10-CPU system?
d) After spending a lot of time, a hardware engineer managed to improve the floating-point arithmetic, which now executes twice as fast. Unfortunately, floating-point operations are executed only during 10% of the time. What is the overall speedup achieved?
e) A scientific application was developed so that it can exploit parallel systems utilizing many processors. As such, the parallel part is executed during 95% of the time.
- On how many processors should the application run so that it is 10 times faster?
- How many processors are required to make it 25 times faster?
- After additional work, the developers succeeded in increasing the parallel percentage to 97%. How many processors are now required for a 10 times faster run?
- How many are required for a 20 times faster run?
f) A scientific application runs on parallel systems; however, only 70% of it is suited for parallel processing. Calculate the speedup obtained by using two, three, four and five processors.
g) A scientific application runs for 100 minutes: 40 minutes are CPU time and 60 minutes are I/O time. To increase the speed, two alternatives are considered: replacing the processor with a new model that is 90 times faster, or replacing the disk system with a newer model that is 4 times faster.
- Which alternative is better?
- What will the run time be for each alternative?

h) In an attempt to improve the system’s performance, the hardware engineers came up with two possible solutions: adding a special hardware device that will execute the square root instruction 10 times faster, or improving the floating-point instructions so that they execute twice as fast. On the benchmark programs, the square root instructions are executed 20% of the time while the floating-point instructions execute during 50% of the time. Which alternative is better?
i) In a specific processor, the Load and Store instructions were improved so that they execute four times faster. These instructions account for 50% of the total run time. What is the overall speedup? Assuming the test program ran for 160 seconds, how long will it run after the improvement?
j) In a specific processor, all logic instructions were improved and the new ones execute five times faster. Assuming the logic instructions account for 50% of the run time, how long will a program that previously ran for 10 seconds run?
k) After improving the arithmetic instructions, they run five times faster. What should their percentage be if the required overall speedup is 3 times?
l) You are the CIO of a manufacturing organization. Due to anticipated changes, the board asked that the main application executed on the system run twice as fast. You checked the application and discovered that 65% of it is suited for parallel execution. How many processors (or cores) do you have to buy?
m) A specific system uses RAM only (no cache memory). It is possible to add a cache memory that runs five times faster. How will the execution time change if the applications use the cache 80% of the time on average? What will happen if the cache is used only 50% of the time?
n) A vector computer executes an instruction on a vector (array) of values, in contrast to a scalar computer, which executes the instruction on a single value. On a specific vector computer, the vector instructions are 20 times faster than the scalar instructions. What will the speedup be, assuming only 70% of the application can be vectorized?
o) There was an urgent need to improve the performance of a specific CISC-based system. Like any other CISC-based system, it supported many instructions. The hardware engineers mapped the instructions and addressed the 10 most used ones. These 10 instructions account for 90% of the time. After the improvement, these 10 instructions executed 6 times faster. What is the overall speedup obtained?

© Revision 1.0

45

Computer Systems Architecture exercises p) During a meeting dedicated to the new system acquired, the CIO said that the fact the new system has 100 processors means that all application will benefit. Some application may see an= a increase of 50 times and others less. Nevertheless he said all application will run at least twice as fast. Do you agree to this assumption?
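The speedup questions above all follow the same pattern (Amdahl's law). The following is a minimal sketch, not part of the original booklet; the function names are illustrative assumptions.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the run time becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def max_speedup(fraction: float) -> float:
    """Upper bound on the speedup when the improved part takes no time at all."""
    return 1.0 / (1.0 - fraction)


# Exercise i): Load/Store are 50% of the run time and become 4 times faster.
speedup = amdahl_speedup(0.5, 4)
new_time = 160 / speedup
print(f"speedup = {speedup:.2f}, new run time = {new_time:.0f} s")
```

The same function can be reused for the other items by changing the fraction and the factor; `max_speedup` is useful for judging claims such as the one in item p).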


4. Scoreboarding

a) Draw the score board for the following Register-Register based architecture’s instructions:

Add R2,R3,R4
Sub R3,R1,R4
Add R7,R2,R3
Sub R8,R7,R5

b) Draw the score board for the following Register-Register based architecture’s instructions:

Mult R1,R2,R2
Mult R3,R2,R1
Add R5,R1,R3
Div R8,R5,R4

c) The high level computer instruction SUM = A+B+C+D can be implemented in assembly language by:

Add R5,R1,R2
Add R5,R5,R3
Add R5,R5,R4

Or

Add R5,R1,R2
Add R6,R3,R4
Add R5,R5,R6

Which one is better? Draw the score board for the two implementations. Assume that after issuing an instruction there is one extra cycle before the register becomes available.
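Before drawing a score board it helps to list which instruction has to wait for which register. The sketch below is an illustration only; it assumes the first operand is the destination register (as in the SUM example) and reports read-after-write dependencies. The helper names are hypothetical.

```python
def parse(instruction: str):
    """Split 'Add R2,R3,R4' into (opcode, destination, sources)."""
    opcode, operands = instruction.split()
    dest, src1, src2 = operands.split(",")
    return opcode, dest, (src1, src2)


def raw_dependencies(program):
    """Return (reader index, writer index, register) for every read-after-write pair."""
    last_writer = {}
    deps = []
    for index, instruction in enumerate(program):
        _, dest, sources = parse(instruction)
        for reg in dict.fromkeys(sources):  # drop duplicates, keep order
            if reg in last_writer:
                deps.append((index, last_writer[reg], reg))
        last_writer[dest] = index
    return deps


program_a = ["Add R2,R3,R4", "Sub R3,R1,R4", "Add R7,R2,R3", "Sub R8,R7,R5"]
for reader, writer, reg in raw_dependencies(program_a):
    print(f"instruction {reader + 1} waits for {reg} written by instruction {writer + 1}")
```

For the two SUM implementations, the chained version makes every instruction depend on the previous one, while the second version lets the first two additions proceed independently.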


5. Branch Prediction

a) The following string represents the behavior of a specific branch instruction.
- “1” means the branch was taken
- “0” means the branch was not taken

1001110111

Calculate the success rate of the branch prediction when using one and two bits. In both cases assume the default is not to branch.

b) What are the success rates of the branches in the previous exercise if the default is branch taken?

c) Calculate the success rates of the branch prediction mechanisms for the following scenarios. A bit set represents a branch taken.

Scenario           One bit default   Two bits default   One bit success   Two bits success
1110000110110      Taken             Taken              61.54%            46.15%
10101010111        Taken             Not taken
10101010111        Not taken         Taken
1110001010         Taken             Taken
1110001010         Not taken         Not taken
1011100110110110   Not taken         Taken
1110101010111011   Not taken         Taken
1011001110001010   Taken             Not taken
1100110011         Taken             Taken
1100011011101011   Not taken         Not taken
0011101000011001   Taken             Taken
1011011101111011   Taken             Taken
1100011110000011   Not taken         Not taken
1110101011         Not taken         Not taken
1010101011         Taken             Taken
1101101101101101   Taken             Taken
0111111111110      Not taken         Not taken
0111111111110      Taken             Taken
1111000011110000   Not taken         Not taken
0111111101111111   Not taken         Not taken
0001111010101111   Not taken         Not taken
1110101000110111   Taken             Taken
1101011111011000   Taken             Taken
1101110111011101   Taken             Taken
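The success rates can be checked with a small simulation. This is a sketch under stated assumptions: the one-bit predictor repeats the last outcome, and the two-bit predictor is modeled as a saturating counter that starts in the strongest state matching the default. With these assumptions the solved scenario 1110000110110 (61.54% and 46.15%) is reproduced; other two-bit schemes would give different numbers.

```python
def one_bit_success(history: str, default_taken: bool) -> float:
    """One-bit predictor: predict whatever the branch did last time."""
    prediction = default_taken
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += prediction == taken
        prediction = taken
    return hits / len(history)


def two_bit_success(history: str, default_taken: bool) -> float:
    """Two-bit saturating counter (assumed scheme): 0-1 predict not taken, 2-3 predict taken."""
    counter = 3 if default_taken else 0
    hits = 0
    for ch in history:
        taken = ch == "1"
        hits += (counter >= 2) == taken
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return hits / len(history)


scenario = "1110000110110"
print(f"one bit : {one_bit_success(scenario, True):.2%}")
print(f"two bits: {two_bit_success(scenario, True):.2%}")
```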


Chapter 6 – Cache Memory exercises

1. Cache Memory improvements

a) A specific system has three levels of memory (Main, Cache L1 and Cache L2).
The memory access time is 50 ns
L1 access time is 1 ns
L2 access time is 5 ns
An application was executed on that system which is characterized by the fact that 30% of the instructions access memory. 90% of these accesses are found in L1 and only 1% has to be brought from main memory.
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.
In the last case, 0.56 ns will be added to the calculated CPI.

b) A specific system has four levels of memory (Main, Cache L1, Cache L2 and Cache L3).
The memory access percent is 33%
The memory access time is 60 ns, 0% missing rate
L1 access time is 1 ns, 15% missing rate (the datum is not found in L1)
L2 access time is 10 ns, 7% missing rate
L3 access time is 20 ns, 1% missing rate
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory and L3
- The application is using main memory, L1 and L2.
- The application is using main memory, L1, L2 and L3.

c) The following table defines several systems with different characteristics. All systems have a three level hierarchical memory (main, L1, L2).
- The memory access represents the percent of the instructions that access memory.
- The access time defines the time required to access each of the levels.
- The missing rate represents the percentage of misses per each level. The missing rate for the main memory is always zero.
For each line calculate the amount of time added to the CPI if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.

     Memory      Access time (ns)    Missing Rate (%)   Added Time (ns)
No.  Access %    Main   L1    L2     L1    L2           Main    L1     L2     L1+L2
1    35          75     2     10     12    2            26.25   3.77   3.96   1.49
2    40          80     2     15     8     1
3    25          50     3     12     9     2
4    29          48     2     10     12    2
5    32          56     1     8      11    2
6    48          40     2     12     14    3
7    28          50     3     15     12    2
8    36          54     1     12     10    1
9    65          50     1     11     8     1
10   52          52     2     11     9     2
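The added time can be computed with a short helper. This sketch assumes the interpretation that reproduces the solved first line: each missing rate is a fraction of all memory accesses (not of the accesses that reach that level), and an access costs only the time of the level that finally serves it. The function name and argument layout are illustrative.

```python
def added_time(access_fraction, main_time, caches=()):
    """Time (ns) added per instruction.

    caches: (access_time_ns, missing_rate) pairs ordered from the level
    closest to the CPU; missing rates are fractions of all accesses.
    """
    remaining = 1.0  # fraction of accesses not yet served
    total = 0.0
    for access_time, missing_rate in caches:
        total += (remaining - missing_rate) * access_time
        remaining = missing_rate
    total += remaining * main_time
    return access_fraction * total


# Line 1 of the table: 35% memory access, main 75 ns, L1 2 ns (12%), L2 10 ns (2%)
l1, l2 = (2, 0.12), (10, 0.02)
print(round(added_time(0.35, 75), 2))            # main memory only
print(round(added_time(0.35, 75, [l1]), 3))      # main + L1
print(round(added_time(0.35, 75, [l2]), 3))      # main + L2
print(round(added_time(0.35, 75, [l1, l2]), 3))  # main + L1 + L2
```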


Chapter 7 – BUS exercises

1. Parity

a) The following table contains binary numbers and the definition of the parity bit (odd or even). Calculate the parity and complete the table. The first exercise is solved.

No.  Binary Number        Parity  Parity Bit
1    10 1010 1010 1010    Odd     0
2    1111 0000 1111       Even
3    1 1011 1101 1010     Even
4    11 0000 1011         Odd
5    1111 1100 0000       Odd
6    10 0000 0100         Odd
7    100 0111 1010        Even
8    000 1101 1010        Even
9    1 0111 0101 1011     Even
10   1 1111 1111          Odd
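A parity bit only depends on the number of 1 bits in the value. The following minimal sketch (the function name is an assumption) computes it for both conventions and reproduces the solved first line.

```python
def parity_bit(bits: str, kind: str) -> int:
    """Return the parity bit that makes the total number of 1s even or odd."""
    ones = bits.replace(" ", "").count("1")
    even_bit = ones % 2  # 1 only when the data holds an odd number of 1s
    return even_bit if kind.lower() == "even" else 1 - even_bit


print(parity_bit("10 1010 1010 1010", "odd"))
```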

b) To increase the integrity of the information sent over the network, a special parity algorithm was defined. Instead of adding just one bit per block (seven bits), the algorithm adds three parity bits. Each such parity bit guards several data bits. Unlike the ordinary simple parity bit mechanism, this mechanism increases the overhead but provides the capability to correct faulty blocks without the need to re-transmit. The algorithm is defined by:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Encode the binary number 0110010 by adding the three parity bits as defined by the algorithm above.
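The three parity bits can be computed mechanically from the three groups. The sketch below assumes that D0 is the leftmost bit of the written block; this ordering is an inference, chosen because it reproduces the solved examples in this chapter (111 1111 with even parity gives 1 1 1, and 011 1111 with odd parity gives 1 0 1).

```python
# Data-bit indexes guarded by P0, P1 and P2 (D0 assumed to be the leftmost bit)
GROUPS = ((0, 1, 3, 4, 5), (1, 2, 3, 5, 6), (0, 2, 3, 4, 6))


def parity_bits(block: str, kind: str = "even"):
    """Return (P0, P1, P2) for a 7-bit block written as a string of 0s and 1s."""
    data = [int(b) for b in block.replace(" ", "")]
    if len(data) != 7:
        raise ValueError("expected exactly 7 data bits")
    result = []
    for group in GROUPS:
        even_bit = sum(data[i] for i in group) % 2
        result.append(even_bit if kind.lower() == "even" else 1 - even_bit)
    return tuple(result)


print(parity_bits("111 1111", "even"))
print(parity_bits("011 1111", "odd"))
```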

c) The following table contains a list of 7-bit blocks that have to be encoded using the previously described algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

No.  Block Content  P0  P1  P2
1    111 1111       1   1   1
2    101 1110
3    101 1110
4    001 0011
5    101 0101
6    110 1101
7    111 0011
8    101 1111
9    111 1001
10   100 0111
11   100 1001
12   111 1100
13   001 0010

d) The following table contains a list of 7-bit blocks that have to be encoded using a very similar algorithm (same locations but the parity is odd):
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

No.  Block Content  P0  P1  P2
1    011 1111       1   0   1
2    000 1010
3    111 0000
4    101 0000
5    111 0111
6    110 0011
7    111 1111
8    100 0001
9    011 1110
10   101 0101
11   111 0001
12   101 1101
13   110 1101

e) The block: 111 0101 110 arrived through the network after it was encoded using the following algorithm: P0 = even_parity (D0, D1, D3, D4, D5) P1 = even_parity (D1, D2, D3, D5, D6) P2 = even_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.

f) The block: 101 1111 000 arrived through the network after it was encoded using the following algorithm: P0 = odd_parity (D0, D1, D3, D4, D5) P1 = odd_parity (D1, D2, D3, D5, D6) P2 = odd_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.

g) The block: 111 0111 101 arrived through the network after it was encoded using the following algorithm: P0 = odd_parity (D0, D1, D3, D4, D5) P1 = odd_parity (D1, D2, D3, D5, D6) P2 = odd_parity (D0, D2, D3, D4, D6) Decode it and find the correct data bits of the original block.

h) The following table contains a list of blocks that arrived through the network. All blocks were encoded using the following odd or even algorithm:
P0 = parity (D0, D1, D3, D4, D5)
P1 = parity (D1, D2, D3, D5, D6)
P2 = parity (D0, D2, D3, D4, D6)
The correct algorithm (odd or even) is defined in the table for each block. Decode each received block to find the correct data bits of the original block, assuming no more than one bit flipped.

No.  Block Received  Algorithm  Flipped bit  Original Block
1    111 1101 000    Odd        D5           111 1111
2    000 0100 111    Odd
3    110 1001 110    Even
4    110 0111 000    Even
5    110 0001 000    Even
6    101 1010 000    Even
7    100 1111 010    Odd
8    010 1111 000    Odd
9    101 0111 000    Odd
10   111 0110 000    Even
11   110 0111 110    Even
12   111 0001 011    Even
13   111 0101 101    Odd
14   101 0100 110    Odd
15   001 1000 110    Odd
16   011 1010 101    Odd
17   000 0000 100    Odd
18   111 1111 010    Even
19   000 0010 001    Even
20   010 0101 101    Even
21   011 1011 110    Even
22   011 0101 010    Odd
23   011 0111 100    Odd
24   010 1001 000    Odd


2. Hamming Codes

a) Use odd hamming codes for decoding the value 1111
b) Use odd hamming codes for decoding the value 1010
c) Use even hamming codes for decoding the value 1101 1110
d) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes for each line that represents a data block. The specific parity to be used for each block is defined as well.

No.  Original Data  Parity  Encoded Block
1    011 1111       Even    000 1111 1111
2    011 1111       Odd
3    111 0001       Odd
4    101 1011       Even
5    000 1111       Even
6    010 0100       Odd
7    101 1100       Odd
8    101 1100       Even
9    100 1000       Even
10   111 1000       Even
11   010 1001       Odd
12   0101001        Even
13   001 1111       Odd
14   000 0111       Odd
15   101 0101       Even
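The encoding step can be sketched in a few lines. The layout below is an assumption inferred from the solved example: positions are numbered from 1 starting at the leftmost bit, parity bits sit at positions 1, 2, 4 and 8, and the data bits fill the remaining positions in order. With this layout 011 1111 and even parity yields 000 1111 1111.

```python
def hamming_encode(data: str, kind: str = "even") -> str:
    """Encode data bits with Hamming parity bits at positions 1, 2, 4, 8, ..."""
    data_bits = [int(b) for b in data.replace(" ", "")]
    n_parity = 0
    while (1 << n_parity) < len(data_bits) + n_parity + 1:
        n_parity += 1
    total = len(data_bits) + n_parity
    code = [0] * (total + 1)  # index 0 unused, positions 1..total
    remaining = iter(data_bits)
    for pos in range(1, total + 1):
        if pos & (pos - 1):  # not a power of two, so a data position
            code[pos] = next(remaining)
    for p in range(n_parity):
        mask = 1 << p
        ones = sum(code[pos] for pos in range(1, total + 1)
                   if pos & mask and pos != mask)
        code[mask] = ones % 2 if kind.lower() == "even" else 1 - ones % 2
    return "".join(str(bit) for bit in code[1:])


print(hamming_encode("011 1111", "even"))
```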


3. SECDED

a) Use even Hamming codes for decoding the value 110 1001 and in addition add an odd SECDED bit
b) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes as well as the SECDED. The specific parity to be used for each block is defined as well.

No.  Original Data  Hamming Parity  SECDED Parity  Encoded Block
1    101 1111       Even            Odd            1011 0011 1111
2    010 1100       Even            Even
3    001 0001       Odd             Odd
4    100 1110       Even            Odd
5    101 0001       Odd             Even
6    111 0001       Odd             Odd
7    011 0110       Even            Odd
8    001 1010       Even            Even
9    010 0010       Odd             Even
10   100 1110       Odd             Odd
11   000 0111       Even            Odd
12   010 1110       Even            Odd
13   111 1000       Odd             Even
14   011 0011       Odd             Odd

c) A message with the content: 0001 0111 0111 arrived from the network. It was encoded using odd Hamming codes and an odd SECDED. Decode it to obtain the original data. d) A message with the content: 1111 0010 0011 arrived from the network. It was encoded using even Hamming codes and an even SECDED. Decode it to obtain the original data.

e) The following table contains a list of blocks that were received. Decode the Hamming codes as well as the SECDED. The specific parity to be used for each block is defined as well.

No.  Received Block   Hamming Parity  SECDED Parity  Bit Flipped  Original Data
1    0001 0101 0111   Odd             Odd            D6           111 1111
2    1010 1101 1001   Even            Even
3    1000 1000 0000   Odd             Odd
4    0110 1100 0101   Even            Odd
5    0100 0111 1110   Odd             Even
6    0001 0010 1001   Odd             Odd
7    1110 1101 1111   Even            Odd
8    0110 0010 1010   Even            Even
9    1000 0111 1000   Odd             Even
10   0100 0000 0001   Odd             Odd
11   1011 1100 1101   Even            Odd
12   1000 1010 1110   Even            Odd
13   1001 1111 1011   Odd             Even
14   1110 0111 1010   Odd             Odd
15   1011 1101 1100   Even            Odd
16   1101 1010 0101   Even            Even
17   1100 0001 1111   Even            Even
18   0110 1000 0001   Odd             Odd
19   1010 0001 0000   Odd             Even
20   0001 1001 0111   Even            Odd
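Decoding a received block can be sketched as follows. The layout is an assumption inferred from the solved examples: the leftmost bit is the SECDED bit (overall parity over all 12 bits), followed by the Hamming code with positions 1 to 11 and parity bits at positions 1, 2, 4 and 8. The reported flipped bit is the position inside the Hamming code, which matches the "D6" of the solved first line.

```python
def secded_decode(block: str, hamming: str = "even", secded: str = "even"):
    """Return (flipped position or 0, corrected data bits) for a received block."""
    bits = [int(b) for b in block.replace(" ", "")]
    code = [0] + bits[1:]  # code[1..n] holds the Hamming code
    total = len(code) - 1
    wanted = 0 if hamming.lower() == "even" else 1
    syndrome, mask = 0, 1
    while mask <= total:
        ones = sum(code[pos] for pos in range(1, total + 1) if pos & mask)
        if ones % 2 != wanted:
            syndrome += mask  # this parity group fails
        mask <<= 1
    overall_ok = sum(bits) % 2 == (0 if secded.lower() == "even" else 1)
    if syndrome and (overall_ok or syndrome > total):
        raise ValueError("uncorrectable error (more than one bit flipped)")
    if syndrome:
        code[syndrome] ^= 1  # correct the single flipped bit
    data = "".join(str(code[pos]) for pos in range(1, total + 1) if pos & (pos - 1))
    return syndrome, data


print(secded_decode("0001 0101 0111", "odd", "odd"))
```

A syndrome of 0 with a failing overall parity means that only the SECDED bit itself was flipped, so the data is returned unchanged.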


System Architecture Exercises Solutions


Chapter 2 – Data Representation exercises 1. Decimal to Binary conversions All numbers are positive (unsigned numbers). First line is a solved example No.

Decimal Number

Binary Number

1

123

111 1011

2

1 023

11 1111 1111

3

765

10 1111 1101

4

9 898

10 0110 1010 1010

5

3 999

1111 1001 1111

6

159

1001 1111

7

762

10 1111 1010

8

7 602

1 1101 1011 0010

9

5 577

1 0101 1100 1001

10

2 004

111 1101 0100

11

11 000

10 1010 1111 1000

12

35 355

1000 1010 0001 1011

13

404

1 1001 0100

14

1 234

100 1101 0010

15

4 949

1 0011 0101 0101

16

573

10 0011 1101

17

669

10 1001 1101

18

917

11 1001 0101

19

8 192

10 0000 0000 0000

20

7 811

1 1110 1000 0011

21

8 642

10 0001 1100 0010

22

3 789

1110 1100 1101

23

7 003

1 1011 0101 1011


24

3 887

1111 0010 1111

25

6 423

1 1001 0001 0111

26

12 129

10 1111 0110 0001

27

27 298

110 1010 1010 0010

28

19 999

100 1110 0001 1111

29

9 873

10 0110 1001 0001

30

17 399

100 0011 1111 0111

31

57

11 1001

32

634

10 0111 1010

33

9 824

10 0110 0110 0000

34

10 000

10 0111 0001 0000

35

5 665

1 0110 0010 0001

36

7 991

1 1111 0011 0111

37

999

11 1110 0111

38

800

11 0010 0000

39

3 333

1101 0000 0101

40

7 007

1 1011 0101 1111

41

12 123

10 1111 0101 1011

42

255

1111 1111

43

7 777

1 1110 0110 0001

44

5 656

1 0110 0001 1000

45

4 321

1 0000 1110 0001

46

99

110 0011

47

375

1 0111 0111

48

1 010

11 1111 0010

49

8 119

1 1111 1011 0111


50

6 944

1 1011 0010 0000

51

2 468

1001 1010 0100

52

1 753

110 1101 1001

53

4 762

1 0010 1001 1010

54

8 117

1 1111 1011 0101

55

1 928

111 1000 1000

56

7 956

1 1111 0001 0100

57

19 175

100 1010 1110 0111

58

22 222

101 0110 1100 1110

59

7 275

1 1100 0110 1011

60

1 983

111 1011 1111

61

5 555

1 0101 1011 0011

62

36 133

1000 1101 0010 0101

63

11 223

10 1011 1101 0111

64

4 590

1 0001 1110 1110

65

21 325

101 0011 0100 1101

66

9 176

10 0011 1101 1000

67

9

1001

68

81

101 0001

69

5 933

1 0111 0010 1101

70

9 724

10 0101 1111 1100

71

5 311

1 0100 1011 1111

72

14 000

11 0110 1011 0000

73

781

11 0000 1101

74

35

10 0011

75

28 753

111 0000 0101 0001
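The conversions in this table follow the repeated division by 2 method. The sketch below is an illustration only (the function name is an assumption); it also groups the result in fours from the right, as the table does.

```python
def to_binary(number: int) -> str:
    """Convert an unsigned decimal number to binary, grouped in fours from the right."""
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 2)
        digits = str(remainder) + digits  # remainders are produced LSB first
    digits = digits or "0"
    groups = []
    while digits:
        groups.insert(0, digits[-4:])
        digits = digits[:-4]
    return " ".join(groups)


print(to_binary(123))
```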


2. Binary to Decimal conversions All numbers are positive (unsigned numbers). First line is a solved example No.

Binary Number

Decimal Number

1

1 0000 0000

256

2

1100 1100 1101

3 277

3

1 0010 0011 0100

4 660

4

1 1000 0100 0010

6 210

5

100 0000 0001

1 025

6

1010 1010 1010

2 730

7

1 0000 0000 0001

4 097

8

100 1000 1000 1101

18 573

9

1100 1100 1100

3 276

10

111 1000 0111

1 927

11

1 0000 0010 0100

4 132

12

10 1010 1010 1010

10 922

13

111 0111 0111 0111

30 583

14

111 1011 0110

1 974

15

1101 0000 1101

3 341

16

1110 1001 1110

3 742

17

101 0010 1100 0001

21 185

18

101 0000 0000 1010

20 490

19

101 0001 0001 1111

20 767

20

100 1101 1100 0001

19 905

21

1 1111 0011 0011

7 987

22

11 0001 1111 1111

12 799

23

11 0100 0101 0110

13 398

24

111 1010 0001 0011

31 251


25

111 1010 1011 1100

31 420

26

11 0011 1001 1000

13 208

27

1111 1010 1010

4 010

28

10 1000 1100 1001

10 441

29

1011 0101 0101

2 901

30

110 1110 1111

1 775

31

11 1111 0100 0000

16 192

32

1 0001 0000 0001

4 353

33

1 1111 1111 1111

8 191

34

110 0110 1000 0100

26 244

35

1001 1001 1001

2 457

36

1001 1010 1010

2 474

37

11 0110 0111

871

38

1111 0000 0000

3 840

39

1000 0000 0001

2 049

40

1010 0000 1111

2 575

41

1000 1001 1010 1011

35 243

42

1001 0111 0110

2 422

43

11 0011 0011 0011

13 107

44

1001 0111 0101 0011

38 739

45

11 1000 1010

906

46

1011 1011 1000

3 000

47

1001 1110 1110

2 542

48

1110 1110 1001

3 817

49

101 0001 1110 0101

20 965

50

11 0000 0000 0011

12 291


51

10 0100 0110 1001

9 321

52

111 1010 1011

1 963

53

1 1010 0010 0010

6 690

54

101 0001 0001 0001

20 753

55

111 0110 0101 0100

30 292

56

1001 0010 0011 0100

37 428

57

1 1001 0010 1000

6 440

58

1000 0010 0111 0011

33 395

59

1011 1100 1100

3 020

60

1010 1001 1000

2 712

61

110 1000 1010 0001

26 785

62

100 1011 0001 1010

19 226

63

10 1011 0110 1001

11 113

64

111 0001 0101 0111

29 015

65

111 0101 1001 0100

30 100

66

1 1111 0000 1000

7 944

67

1 0101 1011

347

68

10 1011 0000 0011

11 011

69

10 0010 1011 1000

8 888

70

1 1000 1011 1101

6 333

71

1110 1001 1001

3 737

72

1 1110 1000 1111

7 823

73

110 1100 0001 1001

27 673

74

100 1010 0011 1101

19 005

75

10 0000 0011 0110

8 246

76

100 0000 0000 0000

16 384


3. Decimal to Hexadecimal conversions All numbers are positive (unsigned numbers). First line is a solved example No. 1

Decimal Number 31 987

Hexadecimal Number 7CF3

2

21 007

520F

3

17 000

4268

4

14 555

38DB

5

17 800

4588

6

15 841

3DE1

7

14 773

39B5

8

17 183

431F

9

79 891

13813

10

22 888

5968

11

12 871

3247

12

21 680

54B0

13

14 000

36B0

14

29 612

73AC

15

16 658

4112

16

22 888

5968

17

10 000

2710

18

29 999

752F

19

30 651

77BB

20

31 113

7989

21

30 001

7531

22

17 325

43AD

23

9 876

2694

24

21 111

5277

25

911

38F

26

14 590

38FE

27

17 618

44D2

28

9 784

2638

29

11 011

2B03

30

8 933

22E5

31

12 617

3149

32

21 039

522F

33

10 000

2710

34

6 785

1A81

35

12 777

31E9

36

24 242

5EB2

37

19 898

4DBA

38

6 444

192C

39

717

2CD

40

3 982

F8E

41

10 986

2AEA

42

519

207

43

4 102

1006

44

22 097

5651

45

27 963

6D3B

46

16 741

4165

47

3 785

EC9

48

9 261

242D

49

4 022

FB6

50

26 789

68A5


4. Hexadecimal to Decimal conversions All numbers are positive (unsigned numbers). First line is a solved example No. 1

Hexadecimal Number 8 000

Decimal Number 32 768

2

6 B8C

27 532

3

3 5DD

13 789

4

3 E1D

15 901

5

2 B67

11 111

6

3 FCD

16 333

7

2 6A0

9 888

8

4 705

18 181

9

A FC8

45 000

10

2 FAD

12 205

11

8 791

34 705

12

7 A7A

31 354

13

B 00F

45 071

14

A 000

40 960

15

7 575

30 069

16

2 FAD

12 205

17

5 AB0

23 216

18

5 7E4

22 500

19

4 944

18 756

20

313

787

21

3 745

14 149

22

6 20B

25 099

23

2 2CF

8 911

24

5 A9B

23 195

25

9 C40

40 000

26

20F

527

27

7E0

2 016

28

7 97B

31 099

29

2 F59

12 121

30

2 333

9 011

31

4 2E0

17 120

32

2 000

8 192

33

1 D6B

7 531

34

395

917

35

3 30A

13 066

36

2 222

8 738

37

4 D6D

19 821

38

4 36F

17 263

39

2C7

711

40

1 B63

7 011

41

DAB

3 499

42

FFA

4 090

43

1 F62

8 034

44

1 565

5 477

45

393

915

46

9A

154

47

4 7B3

18 355

48

4 267

16 999

49

2 DE2

11 746

50

2 107

8 455


5. Decimal to Octal conversions All numbers are positive (unsigned numbers). First line is a solved example No. 1

Decimal Number 3 755

Octal Number 7 253

2

11 111

25 547

3

1 733

3 305

4

12 121

27 531

5

9 875

23 223

6

9 077

21 565

7

96 017

273 421

8

23 777

56 341

9

35 000

104 270

10

33 333

101 065

11

44 330

126 452

12

45 145

130 131

13

53 723

150 733

14

2 323

4 423

15

47 474

134 562

16

32 432

77 260

17

53 521

150 421

18

39 999

116 077

19

22 555

54 033

20

40 960

120 000

21

261 549

776 655

22

20 480

50 000

23

32 767

77 777

24

319

477

25

26 788

64 244

26

10 976

25 340

27

17 435

42 033

28

8 099

17 643

29

11 099

25 533

30

25 000

60 650

31

9 822

23 136

32

12 377

30 131

33

5 077

11 725

34

7 333

16 245

35

49

61

36

18 555

44 173

37

9 000

21 450

38

7 102

15 676

39

8 989

21 435

40

15 415

36 067

41

2 017

3 741

42

6 522

14 572

43

16 581

40 305

44

14 037

33 325

45

9 378

22 242

46

14 043

33 333

47

9 998

23 416

48

5 037

11 655

49

17 321

41 651

50

12 055

27 427


6. Octal to Decimal conversions All numbers are positive (unsigned numbers). First line is a solved example No. 1

Octal Number 4 227

Decimal Number 2 199

2

23 417

9 999

3

1 733

987

4

6 474

3 388

5

36 475

15 677

6

1 761

1 009

7

17 630

8 088

8

47 037

19 999

9

76 220

31 888

10

67 175

28 285

11

137 743

49 123

12

114 265

39 093

13

67 742

28 642

14

2 274

1 212

15

27 351

12 009

16

105 033

35 355

17

202 152

66 666

18

121 666

41 910

19

51 167

21 111

20

65 500

27 456

21

17 777

8 191

22

77 123

32 339

23

37 000

15 872

24

325

213

25

1 111

585

26

12 345

5 349

27

25 252

10 922

28

6 666

3 510

29

70 001

28 673

30

56 712

24 010

31

1 000

512

32

63 451

26 409

33

36 712

15 818

34

3 412

1 802

35

6 711

3 529

36

7 033

3 611

37

4 510

2 376

38

2 777

1 535

39

10 666

4 534

40

12 012

5 130

41

43 651

18 345

42

33 773

14 331

43

6 236

3 230

44

1 177

639

45

17 451

7 977

46

6 755

3 565

47

23 773

10 235

48

5 000

2 560

49

16 666

7 606

50

7 654

4 012


7. Bases conversions (3-10) All numbers are positive (unsigned numbers). First line is a solved example No.

Decimal Number

Base 3

Base 4

Base 5

Base 6

Base 7

1

159

12 220

2 133

1 114

423

315

2

1 991

2 201 202

133 013

30 431

13 115

5 543

3

7 654

101 111 111

1 313 212

221 104

55 234

31 213

4

2 002

2 202 011

133 102

31 002

13 134

5 560

5

79

2 221

1 033

304

211

142

6

192

21 010

3 000

1 232

520

363

7

789

1 002 020

30 111

11 124

3 353

2 205

8

1 299

1 210 010

110 103

20 144

10 003

3 534

9

1 001

1 101 002

33 221

13 001

4 345

2 630

10

1 777

2 102 211

123 301

24 102

12 121

5 116

11

2 905

10 222 121

231 121

43 110

21 241

11 320

12

4 986

20 211 200

1 031 322

124 421

35 030

20 352

13

8 192

102 020 102

2 000 000

230 232

101 532

32 612

14

7 500

101 021 210

1 311 030

220 000

54 420

30 603

15

3 999

12 111 010

332 133

111 444

30 303

14 442

16

4 111

12 122 021

1 000 033

112 421

31 011

14 662

17

6 767

100 021 122

1 221 233

204 032

51 155

25 505

18

5 005

20 212 102

1 032 031

130 010

35 101

20 410

19

2 966

11 001 212

232 112

43 331

21 422

11 435

20

1 897

2 121 021

131 221

30 042

12 441

5 350

21

1 566

2 011 000

120 132

22 231

11 130

4 365

22

1 476

2 000 200

113 010

21 401

10 500

4 206

23

1 149

1 120 120

101 331

14 044

5 153

3 231

24

3 003

11 010 020

232 323

44 003

21 523

11 520

25

1 733

2 101 012

123 011

23 413

12 005

5 024

26

6 474

22 212 210

1 211 022

201 344

45 550

24 606

27

36 475

1 212 000 221

20 321 323

2 131 400

440 511

211 225

28

1 761

2 102 020

123 201

24 021

12 053

5 064

29

17 630

220 011 222

10 103 132

1 031 010

213 342

102 254

30

37 037

1 212 210 202

21 002 231

2 141 122

443 245

212 660

31

16 220

211 020 202

3 331 130

1 004 340

203 032

65 201

32

27 175

1 101 021 111

12 220 213

1 332 200

325 451

142 141

33

37 743

1 220 202 220

21 031 233

2 201 433

450 423

215 016

34

14 265

201 120 100

3 132 321

424 030

150 013

56 406

35

9 999

111 201 100

2 130 033

304 444

114 143

41 103

36

2 274

10 010 020

203 202

33 044

14 310

6 426

37

21 351

1 002 021 210

11 031 213

1 140 401

242 503

116 151

38

10 033

111 202 121

2 130 301

310 113

114 241

41 152

39

20 152

1 000 122 101

10 322 320

1 121 102

233 144

112 516

40

11 666

121 000 002

2 312 102

331 313

130 002

46 004

41

13 167

200 001 200

3 031 233

410 132

140 543

53 250

42

5 500

21 112 201

1 111 330

134 000

41 244

22 015

43

6 236

22 112 222

1 201 130

144 421

44 512

24 116

44

1 177

1 121 121

102 121

14 202

5 241

3 301

45

14 451

201 211 020

3 201 303

430 301

150 523

60 063

46

7 755

101 122 020

1 321 023

222 010

55 523

31 416

47

23 773

1 012 121 111

11 303 131

1 230 043

302 021

126 211

48

5 000

20 212 012

1 032 020

130 000

35 052

20 402

49

16 666

211 212 021

10 010 122

1 013 131

205 054

66 406

50

7 811

101 021 022

1 322 003

222 221

100 055

31 526


8. Bases conversions (10-15) All numbers are positive (unsigned numbers). First line is a solved example No.

Decimal Number

Base 11

Base 12

Base 13

Base 14

Base 15

1

1234

A22

86A

73C

642

574

2

3591

2775

20B3

1833

1447

10E6

3

7699

586A

4557

3673

2B3D

2434

4

6000

4565

3580

2967

2288

1BA0

5

2222

1740

1352

101C

B4A

9D2

6

3450

2657

1BB6

1755

1386

1050

7

4212

318A

2530

1BC0

176C

13AC

8

2987

2276

188B

148A

1135

D42

9

12121

911A

7021

5695

45BB

38D1

10

9999

7570

5953

4722

3903

2E69

11

8500

6428

4B04

3B3B

3152

27BA

12

7303

553A

4287

342A

2939

226D

13

6477

4959

38B9

2C43

2509

1DBC

14

3926

2A4A

2332

1A30

1606

126B

15

5501

4151

3225

2672

200D

196B

16

7244

5496

4238

33B3

28D6

222E

17

12987

9837

7623

5BB0

4A39

3CAC

18

11111

8391

651B

5099

4099

345B

19

10009

757A

5961

472C

390D

2E74

20

13456

A123

7954

6181

4C92

3EC1

21

21550

15211

1057A

9A69

7BD4

65BA

22

19999

14031

B6A7

9145

7407

5DD4

23

13675

A302

7AB7

62BC

4DAB

40BA

24

14971

10280

87B7

6A78

5655

4681
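All of the base conversions above use the same repeated division, only the divisor changes. A minimal sketch follows; the function name is an assumption, and the digits A to E stand for the values 10 to 14 as in the table.

```python
DIGITS = "0123456789ABCDE"  # enough symbols for bases 2 to 15


def to_base(number: int, base: int) -> str:
    """Convert an unsigned decimal number to the given base by repeated division."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 15")
    if number == 0:
        return "0"
    result = ""
    while number > 0:
        number, remainder = divmod(number, base)
        result = DIGITS[remainder] + result
    return result


for base in (11, 12, 13, 14, 15):
    print(base, to_base(1234, base))
```

Python's built-in `int(text, base)` performs the opposite conversion and can be used to check a result, for example `int("A22", 11)`.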


9. Decimal fractions to Binary fractions conversions All numbers are positive (unsigned numbers). First line is a solved example No.

Decimal Fraction

Binary Fraction

1

0.4375

0.0111

2

0.1875

0.0011

3

0.625

0.101

4

0.1171875

0.0001111

5

0.546875

0.100011

6

0.875

0.111

7

0.34375

0.01011

8

0.9375

0.1111

9

0.375

0.011

10

0.40625

0.01101

11

0.140625

0.001001

12

0.65625

0.10101

13

0.234375

0.001111

14

0.734375

0.101111

15

0.8125

0.1101

16

0.8984375

0.1110011

17

0.71875

0.10111

18

0.109375

0.000111

19

0.578125

0.100101

20

0.953125

0.111101

21

0.5546875

0.1000111

22

0.4765625

0.0111101

23

0.13671875

0.00100011

24

0.49609375

0.01111111
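Fractions are converted by repeated multiplication by 2, where each integer part that appears becomes the next binary digit. The sketch below uses exact rational arithmetic to avoid floating point rounding; the function name and the `max_bits` limit are assumptions.

```python
from fractions import Fraction


def fraction_to_binary(decimal_text: str, max_bits: int = 16) -> str:
    """Convert a decimal fraction such as '0.4375' to a binary fraction string."""
    value = Fraction(decimal_text)
    if not 0 <= value < 1:
        raise ValueError("expected a fraction between 0 and 1")
    bits = ""
    while value and len(bits) < max_bits:
        value *= 2
        if value >= 1:
            bits += "1"
            value -= 1
        else:
            bits += "0"
    return "0." + (bits or "0")


print(fraction_to_binary("0.4375"))
```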


10.Binary fractions to Decimal fractions conversions All numbers are positive (unsigned numbers). First line is a solved example No.

Binary Fraction

Decimal Fraction

1

0.1

0.5

2

0.001

0.125

3

0.1111111

0.9921875

4

0.010001

0.265625

5

0.010111

0.359375

6

0.000001

0.015625

7

0.00011

0.09375

8

0.1011

0.6875

9

0.00101

0.15625

10

0.000101

0.078125

11

0.10001

0.53125

12

0.001101

0.203125

13

0.11101

0.90625

14

0.11

0.75

15

0.100001

0.515625

16

0.00111

0.21875

17

0.1001

0.5625

18

0.010011

0.296875

19

0.011011

0.421875

20

0.11111

0.96875

21

0.0101101

0.3515625

22

0.1100101

0.7890625

23

0.0000011

0.0234375

24

0.00011011

0.10546875


11.Negative numbers representations Convert the negative decimal number to binary (16 bits). First line is a solved example

                      Binary Numbers
No.  Decimal Number   One’s Complement      Two’s Complement      Sign and Magnitude
1    - 4095           1111 0000 0000 0000   1111 0000 0000 0001   1000 1111 1111 1111
2    - 7612           1110 0010 0100 0011   1110 0010 0100 0100   1001 1101 1011 1100
3    - 1985           1111 1000 0011 1110   1111 1000 0011 1111   1000 0111 1100 0001
4    - 5777           1110 1001 0110 1110   1110 1001 0110 1111   1001 0110 1001 0001
5    - 8000           1110 0000 1011 1111   1110 0000 1100 0000   1001 1111 0100 0000
6    - 729            1111 1101 0010 0110   1111 1101 0010 0111   1000 0010 1101 1001
7    - 3333           1111 0010 1111 1010   1111 0010 1111 1011   1000 1101 0000 0101
8    - 4799           1110 1101 0100 0000   1110 1101 0100 0001   1001 0010 1011 1111
9    - 6222           1110 0111 1011 0001   1110 0111 1011 0010   1001 1000 0100 1110
10   - 3488           1111 0010 0101 1111   1111 0010 0110 0000   1000 1101 1010 0000
11   - 8190           1110 0000 0000 0001   1110 0000 0000 0010   1001 1111 1111 1110
12   - 9999           1101 1000 1111 0000   1101 1000 1111 0001   1010 0111 0000 1111
13   - 7231           1110 0011 1100 0000   1110 0011 1100 0001   1001 1100 0011 1111
14   - 3              1111 1111 1111 1100   1111 1111 1111 1101   1000 0000 0000 0011
15   - 127            1111 1111 1000 0000   1111 1111 1000 0001   1000 0000 0111 1111
16   - 4777           1110 1101 0101 0110   1110 1101 0101 0111   1001 0010 1010 1001
17   - 1741           1111 1001 0011 0010   1111 1001 0011 0011   1000 0110 1100 1101
18   - 7676           1110 0010 0000 0011   1110 0010 0000 0100   1001 1101 1111 1100
19   - 4288           1110 1111 0011 1111   1110 1111 0100 0000   1001 0000 1100 0000
20   - 3636           1111 0001 1100 1011   1111 0001 1100 1100   1000 1110 0011 0100
21   - 1901           1111 1000 1001 0010   1111 1000 1001 0011   1000 0111 0110 1101
22   - 8076           1110 0000 0111 0011   1110 0000 0111 0100   1001 1111 1000 1100
23   - 5555           1110 1010 0100 1100   1110 1010 0100 1101   1001 0101 1011 0011
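The three representations of a negative number can be derived from its magnitude. The following is a minimal sketch (the function name is an assumption): one's complement inverts all bits of the magnitude, two's complement adds 1 to that, and sign and magnitude only sets the leftmost bit.

```python
def negative_representations(value: int, bits: int = 16):
    """Return (one's complement, two's complement, sign and magnitude) bit strings."""
    magnitude = abs(value)
    if value >= 0 or magnitude >= 1 << (bits - 1):
        raise ValueError("expected a negative value that fits in the given width")
    mask = (1 << bits) - 1
    ones = ~magnitude & mask                       # invert every bit
    twos = (ones + 1) & mask                       # one's complement plus 1
    sign_magnitude = (1 << (bits - 1)) | magnitude # set the sign bit only
    return tuple(format(x, f"0{bits}b") for x in (ones, twos, sign_magnitude))


for text in negative_representations(-4095):
    print(text)
```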


12.Numbers representations Assuming the Hexadecimal number represents a 16 bits signed binary number convert it to a decimal number. The three columns represent the binary notation. First line is a solved example Decimal Numbers No.

Hexadecimal Number

One’s Complement

Two’s Complement

Sign and Magnitude

1

64

100

100

100

2

8FFF

- 28672

- 28673

- 4095

3

6ABC

27324

27324

27324

4

F0F0

- 3855

- 3856

- 28912

5

FC92

- 877

- 878

- 31890

6

FAB0

-1359

- 1360

- 31408

7

2000

8192

8192

8192

8

EEEE

- 4369

- 4370

- 28398

9

D012

- 12269

- 12270

- 20498

10

8111

- 32494

- 32495

- 273

11

7FFF

32767

32767

32767

12

BAD1

- 17710

- 17711

- 15057

13

A000

- 24575

- 24576

- 8192

14

E900

- 5887

- 5888

- 26880

15

D780

- 10367

- 10368

-22400

16

9BBB

- 25668

- 25669

- 7099

17

AAAA

- 21845

- 21846

- 10922

18

6819

26649

26649

26649

19

8750

- 30895

- 30896

- 1872

20

C009

- 16374

- 16375

-16393

21

ABCD

- 21554

- 21555

- 11213

22

DFEF

- 8208

- 8209

- 24559
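The same 16-bit pattern gives three different values depending on the notation. The sketch below (the function name is an assumption) interprets a hexadecimal string under each of the three notations.

```python
def interpret(hex_text: str, bits: int = 16):
    """Return the value as (one's complement, two's complement, sign and magnitude)."""
    raw = int(hex_text, 16)
    mask = (1 << bits) - 1
    if raw >> (bits - 1) == 0:  # sign bit clear: all notations agree
        return raw, raw, raw
    ones = -(~raw & mask)                # invert the bits, then negate
    twos = raw - (1 << bits)             # subtract 2^bits
    sign_magnitude = -(raw & (mask >> 1))  # drop the sign bit, then negate
    return ones, twos, sign_magnitude


print(interpret("64"))
print(interpret("8FFF"))
```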


13. Adding binary numbers

The first line is a solved example.

       Binary Numbers
No.    First Number           Second Number          Result
1      0011 0000 1101 0100    0010 0111 0000 0011    0101 0111 1101 0111
2      0010 0111 0000 1111    0010 1110 1110 1100    0101 0101 1111 1011
3      0000 1010 1110 0100    0010 1110 0000 1000    0011 1000 1110 1100
4      0010 1011 0110 0111    0000 0010 0010 1011    0010 1101 1001 0010
5      0001 1111 1111 1111    0001 1111 1111 1111    0011 1111 1111 1110
6      0000 1111 1111 1111    0010 1010 1100 1010    0011 1010 1100 1001
7      0010 1010 0101 1010    0000 0111 1111 1111    0011 0010 0101 1001
8      0001 1010 1011 1100    0001 1011 1100 1101    0011 0110 1000 1001
9      0001 1111 1110 1111    0000 1111 1101 1011    0010 1111 1100 1010
10     0011 1011 1010 1101    0001 1101 1101 1101    0101 1001 1000 1010
11     0101 1001 1001 0011    0001 1111 1111 1111    0111 1001 1001 0010
12     0001 0010 0011 0100    0001 0010 0011 0100    0010 0100 0110 1000
13     0100 1010 1011 0101    0010 0101 0111 1011    0111 0000 0011 0000
14     0011 1111 1111 1111    0011 1111 1111 1111    0111 1111 1111 1110
15     0001 1010 1010 1010    0010 1101 1101 1101    0100 1000 1000 0111
16     0100 1110 1110 1110    0001 1100 1100 1100    0110 1011 1011 1010
17     0101 0101 0101 0101    0001 1111 1111 1111    0111 0101 0101 0100
18     0001 1001 1111 1011    0011 1111 1011 1001    0101 1001 1011 0100
19     0011 1010 0110 0111    0001 0110 0111 1010    0101 0000 1110 0001
20     0001 1111 1011 1110    0011 1010 1100 1101    0101 1010 1000 1011
21     0101 1010 1000 1011    0000 1010 1000 1011    0110 0101 0001 0110
22     0010 0111 0111 0111    0010 1011 1011 1011    0101 0011 0011 0010
23     0100 0100 0100 0100    0001 1100 1100 1100    0110 0001 0001 0000


14. Adding Hexadecimal unsigned numbers

The first line is a solved example.

       Hexadecimal Numbers
No.    First Number    Second Number    Result
1      7895            4ABC             C351
2      1DDD            2CCC             4AA9
3      3DEF            2999             6788
4      42BD            2BFC             6EB9
5      4623            3BDE             8201
6      23DD            3D1F             60FC
7      5A96            1E0F             78A5
8      8B1E            1ED6             A9F4
9      22FF            3DF9             60F8
10     2D2D            3F3F             6C6C
11     3DDE            3EED             7CCB
12     17EF            6E3F             862E
13     9ABC            1234             ACF0
14     5555            6789             BCDE
15     1FFF            6BAD             8BAC
16     6841            30FB             993C
17     351D            4FE2             84FF
18     10FF            3EC7             4FC6
19     22EE            3DC9             60B7
20     6699            1EB5             854E
21     3D8E            3F0F             7C9D
22     90E9            2E3F             BF28
23     4BA4            5CB5             A859

15. Adding Octal unsigned numbers

First line is a solved example.

       Octal Numbers
No.    First Number    Second Number    Result
1      3125            5632             10757
2      7777            1111             11110
3      21430           6705             30335
4      23561           17777            43560
5      36217           23567            62006
6      15374           25517            43113
7      52527           12774            65523
8      22337           33557            56116
9      17652           21576            41450
10     11666           55645            67533
11     21766           51622            73610
12     62357           22366            104745
13     12345           12345            24712
14     76543           1111             77654
15     32145           12306            44453
16     17365           36547            56134
17     47147           15647            65016
18     16736           26576            45534
19     26077           25716            54015
20     37437           26726            66365
21     42637           20675            63534
22     33512           32645            66357
23     32325           35353            67700

16. Adding Base 3 unsigned numbers

First line is a solved example.

       Base 3 Numbers
No.    First Number    Second Number    Result
1      111020          11221            200011
2      11101           100022           111200
3      10121           20120            101011
4      10202           21111            102020
5      11010           22020            110100
6      11120           21220            110110
7      101201          20111            122012
8      101011          102121           210202
9      21101           21102            112210
10     21201           21220            120121
11     22020           20120            112210
12     101120          20021            121211
13     102010          2001             111011
14     100212          20011            121000
15     101210          21002            122212
16     20101           12201            110002
17     22122           20120            120012
18     101201          21110            200011
19     100110          20111            120221
20     21202           12111            111020
21     21211           101010           122221
22     102101          11121            120222
23     22002           101011           200020

17. Adding Base 4 unsigned numbers

The first line is a solved example.

       Base 4 Numbers
No.    First Number    Second Number    Result
1      30110           10320            101030
2      22323           11120            100103
3      20223           11031            31320
4      21223           11322            33211
5      22333           10332            33331
6      30312           12233            103211
7      30112           10331            101103
8      22100           20311            103011
9      32011           12003            110020
10     31221           2001             33222
11     23331           22032            112023
12     33031           12300            111331
13     31320           22122            120102
14     32211           23020            121231
15     22323           21121            110110
16     23202           30322            120130
17     30230           13212            110102
18     32003           23113            121122
19     32331           22233            121230
20     33220           13310            113130
21     103102          3222             112330
22     101113          11031            112210
23     31333           30122            122121
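The additions in exercises 13 to 17 all follow the same digit-by-digit rule: add the two digits and the incoming carry, write the remainder modulo the base, and carry the quotient. The sketch below illustrates that rule in Python; the function name `add_in_base` and the digit alphabet are assumptions made for this example, not something defined in the exercises.

```python
DIGITS = "0123456789ABCDEF"


def add_in_base(first, second, base):
    """Add two unsigned numerals written in `base` (2..16), digit by digit."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    # Least significant digit first, so position i lines up in both lists.
    a = [DIGITS.index(ch) for ch in reversed(first.upper())]
    b = [DIGITS.index(ch) for ch in reversed(second.upper())]
    if any(digit >= base for digit in a + b):
        raise ValueError("digit out of range for this base")
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        total = carry
        total += a[i] if i < len(a) else 0
        total += b[i] if i < len(b) else 0
        result.append(DIGITS[total % base])  # digit written in this column
        carry = total // base                # carry into the next column
    if carry:
        result.append(DIGITS[carry])
    return "".join(reversed(result))


print(add_in_base("7895", "4ABC", 16))    # C351
print(add_in_base("111020", "11221", 3))  # 200011
```

Working column by column, rather than converting to decimal and back, mirrors the manual method the exercises are meant to practise.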

18. Multiplying binary numbers

All are unsigned 16-bit numbers. The first line is a solved example.

       Binary Numbers
No.    First Number        Second Number    Result
1      1111 1111           110 0010         110 0001 1001 1110
2      10 0011             101 0110         1011 1100 0010
3      100 1000            1000 1010        10 0110 1101 0000
4      1 1010 1011 0010    1101             101 1011 0000 1010
5      101 1101 1111       1 0011           110 1111 1000 1101
6      100 0000 1111       1 1111           111 1101 1101 0001
7      11 1100 1011        1001             10 0010 0010 0011
8      1 1000 0011         1 1101           10 1011 1101 0111
9      111 1000 1001       1101             110 0001 1111 0101
10     1101 1101 1110      110              101 0011 0011 0100
11     1 0000 1000 1001    101              101 0010 1010 1101
12     10 0011 1001        1011             1 1000 0111 0011
13     1010 1011 1011      111              100 1011 0001 1101
14     1 1010 1011         1 0010           1 1110 0000 0110
15     11 0110 1001        1110             10 1111 1011 1110
16     101 0101 0010       1 1110           1001 1111 1001 1100
17     1110 1110 0010      1011             1010 0011 1011 0110
18     1011 1010 1101      1110             1010 0011 0111 0110
19     1011 0000 0000      1101             1000 1111 0000 0000
20     100 1000 0011       1 1011           111 1001 1101 0001
21     1011 1011 0001      110              100 0110 0010 0110
22     1111 1111           1111 1111        1111 1110 0000 0001
23     1011 1011           1011 1011        1000 1000 1001 1001
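Binary multiplication is usually carried out by shift-and-add: for every 1 bit of the multiplier, the multiplicand, shifted to that bit's position, is added to the running product. The Python sketch below illustrates this. The names `shift_and_add` and `grouped` are hypothetical, and the 16-bit mask is an assumption based on the statement that all numbers are 16 bits wide, so a product that does not fit keeps only its low 16 bits.

```python
def shift_and_add(first, second, width=16):
    """Multiply two unsigned binary strings with the shift-and-add method."""
    multiplicand = int(first.replace(" ", ""), 2)
    multiplier = int(second.replace(" ", ""), 2)
    product = 0
    while multiplier:
        if multiplier & 1:           # current multiplier bit is 1
            product += multiplicand  # add the shifted multiplicand
        multiplicand <<= 1           # next bit weighs twice as much
        multiplier >>= 1
    return product & ((1 << width) - 1)  # keep only `width` bits


def grouped(value):
    """Format a non-negative integer in binary, in groups of four bits."""
    bits = format(value, "b")
    head = len(bits) % 4 or 4
    groups = [bits[:head]] + [bits[i:i + 4] for i in range(head, len(bits), 4)]
    return " ".join(groups)


print(grouped(shift_and_add("1111 1111", "110 0010")))  # 110 0001 1001 1110
```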


19. Floating point (754) notation (signed integers)

Convert between the numbers. First line is a solved example.

No.    Decimal Number    Floating point Number
1      1                 3F800000
2      2                 40000000
3      3                 40400000
4      5                 40A00000
5      11                41300000
6      17                41880000
7      24                41C00000
8      28                41E00000
9      32                42000000
10     49                42440000
11     59                426C0000
12     65                42820000
13     79                429E0000
14     85                42AA0000
15     99                42C60000
16     127               42FE0000
17     238               436E0000
18     256               43800000
19     257               43808000
20     313               439C8000
21     355               43B18000
22     485               43F28000
23     490               43F50000
24     515               4400C000
25     533               44054000
26     -4                C0800000
27     -7                C0E00000
28     -13               C1500000
29     -19               C1980000
30     -25               C1C80000
31     -31               C1F80000
32     -37               C2140000
33     -43               C22C0000
34     -55               C25C0000
35     -67               C2860000
36     -73               C2920000
37     -79               C29E0000
38     -86               C2AC0000
39     -93               C2BA0000
40     -101              C2CA0000
41     -118              C2EC0000
42     -197              C3450000
43     -241              C3710000
44     -267              C3858000
45     -309              C39A8000
46     -333              C3A68000
47     -400              C3C80000
48     -499              C3F98000
49     -525              C4034000
50     -541              C4074000
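The conversions in this table can be verified with Python's standard `struct` module, which packs a number into the IEEE 754 single-precision layout. This is a quick checking aid rather than the manual method the exercise asks for; the helper names `float_to_hex` and `hex_to_float` are chosen for this sketch.

```python
import struct


def float_to_hex(value):
    """Encode a number as IEEE 754 single precision, as 8 hex digits."""
    return struct.pack(">f", float(value)).hex().upper()


def hex_to_float(hex_word):
    """Decode 8 hex digits as an IEEE 754 single-precision number."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]


print(float_to_hex(1))           # 3F800000
print(hex_to_float("C0800000"))  # -4.0
```

The `">f"` format selects big-endian byte order, so the hexadecimal digits appear in the same order as in the table.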


20. Floating point (754) notation (signed fractions)

Convert the numbers. First line is a solved example.

No.    Decimal Number    Floating point Number
1      123.75            42F78000
2      76.25             42988000
3      77.5              429B0000
4      64.3125           4280A000
5      45.875            42378000
6      75.75             42978000
7      37.375            42158000
8      19.25             419A0000
9      31.125            41F90000
10     0.625             3F200000
11     49.875            42478000
12     63.5              427E0000
13     99.25             42C68000
14     12.125            41420000
15     24.5              41C40000
16     31.625            41FD0000
17     88.5              42B10000
18     73.25             42928000
19     1.125             3F900000
20     3.375             40580000
21     19.75             419E0000
22     66.875            4285C000
23     54.625            425A8000
24     27.25             41DA0000
25     -89.125           C2B24000
26     -25.625           C1CD0000
27     -0.125            BE000000
28     -8.125            C1020000
29     -25.25            C1CA0000
30     -26.375           C1D30000
31     -87.5             C2AF0000
32     -63.75            C27F0000
33     -41.25            C2250000
34     -0.3125           BEA00000
35     -5.875            C0BC0000
36     -33.375           C2058000
37     -12.125           C1420000
38     -68.875           C289C000
39     -75.75            C2978000
40     -1.75             BFE00000
41     -66.625           C2854000
42     -49.5             C2460000
43     -25.75            C1CE0000
44     -99.125           C2C64000
45     -0.875            BF600000
46     -61.25            C2750000
47     -39.375           C21D8000
48     -25.25            C1CA0000
49     -53.5             C2560000
50     -71.125           C28E4000
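The manual conversion builds the 32-bit word from three fields: a sign bit, an 8-bit exponent biased by 127, and the 23 fraction bits that follow the implicit leading 1. The sketch below spells those steps out in Python. The function name `encode_single` is hypothetical, and the sketch assumes a non-zero value in the normal range whose fraction fits exactly in 23 bits, which holds for every row of the table.

```python
import math


def encode_single(value):
    """Build the 32-bit IEEE 754 single-precision pattern field by field."""
    if value == 0:
        raise ValueError("zero is not handled in this sketch")
    sign = 1 if value < 0 else 0
    # frexp returns m and e with abs(value) == m * 2**e and 0.5 <= m < 1.
    mantissa, exponent = math.frexp(abs(value))
    significand = mantissa * 2          # normalised 1.xxx form
    biased = (exponent - 1) + 127       # exponent of the 1.xxx form, plus bias
    fraction = int((significand - 1) * 2**23)  # bits after the binary point
    return (sign << 31) | (biased << 23) | fraction


# 123.75 = 1111011.11b = 1.11101111b * 2**6, so the biased exponent is 133.
print(format(encode_single(123.75), "08X"))  # 42F78000
```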


21. Adding Floating point (754) numbers

Add the floating-point numbers, then convert them to decimal, add the decimal numbers and check against the floating-point result obtained. The first line is a solved example.

       Floating Point Numbers                      Decimal Numbers
No.    First Number    Second Number    Result      First Number    Second Number    Result
1      41700000        420C0000         42480000    15.0            35.0             50.0
2      C14C0000        40300000         C1200000    -12.75          2.75             -10.0
3      418E0000        41B20000         42200000    17.75           22.25            40.0
4      3F600000        42C64000         42C80000    0.875           99.125           100.0
5      C2AF0000        422D0000         C2310000    -87.5           43.25            -44.25
6      429A0000        42080000         42DE0000    77.0            34.0             111.0
7      44400000        43800000         44800000    768.0           256.0            1024.0
8      41CA0000        43BBB000         43C85000    25.25           375.375          400.625
9      414A0000        41460000         41C80000    12.625          12.375           25.0
10     C1EC0000        3F000000         C1E80000    -29.5           0.5              -29.0
11     C18E0000        C20B0000         C2520000    -17.75          -34.75           -52.5
12     BE400000        BF500000         BF800000    -0.1875         -0.8125          -1.0
13     41DD0000        C0500000         41C30000    27.625          -3.25            24.375
14     42C00000        41C00000         42F00000    96.0            24.0             120.0
15     440B0800        43008000         442B2800    556.125         128.5            684.625
16     42C80000        41100000         42DA0000    100.0           9.0              109.0
17     41E40000        3FC00000         41F00000    28.5            1.5              30.0
18     BF400000        40300000         40000000    -0.75           2.75             2.0
19     413E0000        411E0000         41AE0000    11.875          9.875            21.75
20     C0FC0000        41140000         3FB00000    -7.875          9.25             1.375
21     40C40000        40C40000         41440000    6.125           6.125            12.25
22     40B80000        C1340000         C0B00000    5.75            -11.25           -5.5

22. Multiplying Floating point (754) numbers

Multiply the floating-point numbers, then convert them to decimal, multiply the decimal numbers and check against the floating-point result obtained. The first line is a solved example.

       Floating Point Numbers                      Decimal Numbers
No.    First Number    Second Number    Result      First Number    Second Number    Result
1      40A00000        C0000000         C1200000    5               -2               -10
2      41400000        40800000         42400000    12              4                48
3      C0700000        C1000000         41F00000    -3.75           -8               30
4      40200000        40700000         41160000    2.5             3.75             9.375
5      42C80000        C17C0000         C4C4E000    100             -15.75           -1575
6      43000000        3FC00000         43400000    128             1.5              192
7      C2800000        C2000000         45000000    -64             -32              2048
8      40900000        40B00000         41C60000    4.5             5.5              24.75
9      44800000        44000000         49000000    1024            512              524288
10     C1C00000        42120000         C45B0000    -24             36.5             -876
11     41700000        42080000         43FF0000    15              34               510
12     411C0000        C0500000         C1FD8000    9.75            -3.25            -31.6875
13     44200000        3E000000         42A00000    640             0.125            80
14     C2960000        C0800000         43960000    -75             -4               300
15     417A0000        40900000         428CA000    15.625          4.5              70.3125
16     41940000        C22D0000         C4480800    18.5            -43.25           -800.125
17     C1900000        C2B60000         44CCC000    -18             -91              1638
18     C2F00000        C2600000         45D20000    -120            -56              6720
19     41C80000        41C00000         44160000    25              24               600
20     C0C80000        42EC0000         C4386000    -6.25           118              -737.5
21     414A0000        41840000         43505000    12.625          16.5             208.3125
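The check described in exercises 21 and 22 (decode both operands, operate on the decimal values, re-encode the result) can be sketched in a few lines of Python. The helper names below are hypothetical. Python computes in double precision and `struct` rounds the result to single precision, which is exact for operands as small as the ones used here.

```python
import struct


def to_float(hex_word):
    """Decode 8 hex digits as an IEEE 754 single-precision number."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]


def to_hex(value):
    """Encode a number as IEEE 754 single precision, as 8 hex digits."""
    return struct.pack(">f", value).hex().upper()


def check_row(first, second, operation):
    """Return (result as hex, first as decimal, second as decimal, result)."""
    a, b = to_float(first), to_float(second)
    if operation == "add":
        result = a + b
    elif operation == "mul":
        result = a * b
    else:
        raise ValueError("operation must be 'add' or 'mul'")
    return to_hex(result), a, b, result


print(check_row("41700000", "420C0000", "add"))  # ('42480000', 15.0, 35.0, 50.0)
print(check_row("40A00000", "C0000000", "mul"))  # ('C1200000', 5.0, -2.0, -10.0)
```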

23. BCD numbers (8421, 2421)

All are unsigned 16 bits numbers. First line is a solved example.

                         BCD Numbers
No.    Decimal Number    8421                   2421
1      1234              0001 0010 0011 0100    0001 0010 0011 0100
2      2468              0010 0100 0110 1000    0010 0100 1100 1110
3      7136              0111 0001 0011 0110    1101 0001 0011 1100
4      5497              0101 0100 1001 0111    1011 0100 1111 1101
5      3780              0011 0111 1000 0000    0011 1101 1110 0000
6      9026              1001 0000 0010 0110    1111 0000 0010 1100
7      4512              0100 0101 0001 0010    0100 1011 0001 0010
8      7462              0111 0100 0110 0010    1101 0100 1100 0010
9      3297              0011 0010 1001 0111    0011 0010 1111 1101
10     4582              0100 0101 1000 0010    0100 1011 1110 0010
11     1097              0001 0000 1001 0111    0001 0000 1111 1101
12     7651              0111 0110 0101 0001    1101 1100 1011 0001
13     1928              0001 1001 0010 1000    0001 1111 0010 1110
14     8763              1000 0111 0110 0011    1110 1101 1100 0011
15     7329              0111 0011 0010 1001    1101 0011 0010 1111
16     8461              1000 0100 0110 0001    1110 0100 1100 0001
17     1357              0001 0011 0101 0111    0001 0011 1011 1101
18     9042              1001 0000 0100 0010    1111 0000 0100 0010
19     6703              0110 0111 0000 0011    1100 1101 0000 0011
20     2730              0010 0111 0011 0000    0010 1101 0011 0000
21     4587              0100 0101 1000 0111    0100 1011 1110 1101
22     9768              1001 0111 0110 1000    1111 1101 1100 1110
23     8264              1000 0010 0110 0100    1110 0010 1100 0100
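Both codes encode each decimal digit separately as four bits. In 8421 the nibble is simply the digit in binary. In the 2421 code used in this table, digits 0 to 4 match 8421 and digits 5 to 9 use the patterns 1011 to 1111. A minimal Python sketch follows; the names `CODE_2421`, `to_8421` and `to_2421` are hypothetical, and the lookup table is read off the rows above.

```python
# 2421 code word for each decimal digit 0..9, as used in the table above.
CODE_2421 = ("0000", "0001", "0010", "0011", "0100",
             "1011", "1100", "1101", "1110", "1111")


def to_8421(number):
    """Encode each decimal digit as its plain 4-bit binary value."""
    return " ".join(format(int(digit), "04b") for digit in str(number))


def to_2421(number):
    """Encode each decimal digit with the 2421 weighted code."""
    return " ".join(CODE_2421[int(digit)] for digit in str(number))


print(to_8421(2468))  # 0010 0100 0110 1000
print(to_2421(2468))  # 0010 0100 1100 1110
```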

24. BCD numbers (84-2-1, Excess-3)

All are unsigned 16-bit numbers. The first line is a solved example.

                         BCD Numbers
No.    Decimal Number    84-2-1                 Excess-3
1      1234              0111 0110 0101 0100    0100 0101 0110 0111
2      2468              0110 0100 1010 1000    0101 0111 1001 1011
3      7136              1001 0111 0101 1010    1010 0100 0110 1001
4      5497              1011 0100 1111 1001    1000 0111 1100 1010
5      3780              0101 1001 1000 0000    0110 1010 1011 0011
6      9026              1111 0000 0110 1010    1100 0011 0101 1001
7      4512              0100 1011 0111 0110    0111 1000 0100 0101
8      7462              1001 0100 1010 0110    1010 0111 1001 0101
9      3297              0101 0110 1111 1001    0110 0101 1100 1010
10     4582              0100 1011 1000 0110    0111 1000 1011 0101
11     1097              0111 0000 1111 1001    0100 0011 1100 1010
12     7651              1001 1010 1011 0111    1010 1001 1000 0100
13     1928              0111 1111 0110 1000    0100 1100 0101 1011
14     8763              1000 1001 1010 0101    1011 1010 1001 0110
15     7329              1001 0101 0110 1111    1010 0110 0101 1100
16     8461              1000 0100 1010 0111    1011 0111 1001 0100
17     1357              0111 0101 1011 1001    0100 0110 1000 1010
18     9042              1111 0000 0100 0110    1100 0011 0111 0101
19     6703              1010 1001 0000 0101    1001 1010 0011 0110
20     2730              0110 1001 0101 0000    0101 1010 0110 0011
21     4587              0100 1011 1000 1001    0111 1000 1011 1010
22     9768              1111 1001 1010 1000    1100 1010 1001 1011
23     8264              1000 0110 1010 0100    1011 0101 1001 0111
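In the 84-2-1 code the four bit positions carry the weights 8, 4, -2 and -1, so every code word can be checked by summing the weights of its 1 bits. Excess-3 stores each digit as the digit plus three in plain binary. The Python sketch below illustrates both; the names `CODE_84_2_1`, `weighted_value`, `to_84_2_1` and `to_excess3` are hypothetical.

```python
# 84-2-1 code word for each decimal digit 0..9.
CODE_84_2_1 = ("0000", "0111", "0110", "0101", "0100",
               "1011", "1010", "1001", "1000", "1111")
WEIGHTS = (8, 4, -2, -1)


def weighted_value(code_word):
    """Sum the weights of the bits that are set in a 4-bit code word."""
    return sum(w for w, bit in zip(WEIGHTS, code_word) if bit == "1")


def to_84_2_1(number):
    """Encode each decimal digit with the 84-2-1 weighted code."""
    return " ".join(CODE_84_2_1[int(digit)] for digit in str(number))


def to_excess3(number):
    """Encode each decimal digit as (digit + 3) in 4-bit binary."""
    return " ".join(format(int(digit) + 3, "04b") for digit in str(number))


print(to_84_2_1(1234))   # 0111 0110 0101 0100
print(to_excess3(1234))  # 0100 0101 0110 0111
```

Because each code word sums to its digit under the stated weights, a nibble such as 0011 (which would sum to -3) can never appear in a valid 84-2-1 number.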

Chapter 4 – Central Processing Unit exercises

1. Architectures

In the following questions the answer should relate to a Stack-based architecture, an Accumulator-based architecture, a Memory-Register architecture and a Register-Register architecture.

a) Write the instructions needed for executing the following formula:
C = 2A + 3B
Where A, B, C are variables. Try to optimize the code (minimum instructions possible).

Stack-based architecture. First version, the simple and straightforward solution:

# First Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Pop C           # C=2A
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push B          # Push B on TOS
Add             # 3B is stored on TOS
Push C          # Push C (contains 2A) on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B

Since the stack can hold several values, it is a waste to move 2A into variable C and then load it once again. The second version is a little shorter:

# Second Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push B          # Push B on TOS
Add             # 3B is stored on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B

The third version uses a multiply instruction, so it is even shorter:

# Third Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push 3          # Push constant 3 on TOS
Push B          # Push variable B on TOS
Mult            # 3B is stored on TOS
Add             # 2A+3B is stored on TOS
Pop C           # C=2A+3B
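A stack listing like the ones above can be traced with a very small interpreter. The Python sketch below is an illustration only: the function `run_stack` and its handling of operands (a name found in `memory` is a variable, anything else is read as an integer constant) are assumptions of this example, not a definition taken from the book.

```python
def run_stack(program, memory):
    """Execute Push/Pop/Add/Mult lines against a dict of variables."""
    stack = []
    for line in program.strip().splitlines():
        op, *args = line.split()
        if op == "Push":
            name = args[0]
            # A known variable pushes its value; otherwise treat as constant.
            stack.append(memory[name] if name in memory else int(name))
        elif op == "Pop":
            memory[args[0]] = stack.pop()
        elif op in ("Add", "Mult"):
            top = stack.pop()
            below = stack.pop()
            stack.append(top + below if op == "Add" else top * below)
        else:
            raise ValueError("unknown instruction: " + op)
    return memory


third_version = """
Push A
Push A
Add
Push 3
Push B
Mult
Add
Pop C
"""

print(run_stack(third_version, {"A": 5, "B": 7})["C"])  # 31
```

With A=5 and B=7 the expected value is 2*5 + 3*7 = 31, so the trace confirms that the eight-instruction version computes the same result as the longer ones.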

Accumulator-based architecture

As with the previous case, the first version is straightforward:

# Version 1
Load A          # Load variable A into the Accumulator
Add A           # Accumulator = 2A
Add B           # Accumulator = 2A+B
Add B           # Accumulator = 2A+2B
Add B           # Accumulator = 2A+3B
Store C         # C=2A+3B

Version 2 is using the multiply instruction:

# Version 2
Load B          # Load variable B into the Accumulator
Mult 3          # Accumulator = 3B
Add A           # Accumulator = 3B+A
Add A           # Accumulator = 3B+2A
Store C         # C=2A+3B

Version 3 is using the multiply instruction but the algorithm was changed a little bit. It does not decrease the number of instructions:

# Version 3
Load A          # Load variable A into the Accumulator
Add B           # Accumulator = A+B
Mult 2          # Accumulator = 2A+2B
Add B           # Accumulator = 2A+3B
Store C         # C=2A+3B

Memory-Register architecture

As with previous cases the first version is simple:

# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Add R1,A        # R1=2A
Add R2,B        # R2=2B
Add R2,B        # R2=3B
Store Temp,R1   # Temp variable contains 2A
Add R2,Temp     # R2=3B+2A
Store C,R2      # C=2A+3B

The second version uses the multiply instruction, assuming it can have one of the operands in memory or a constant:

# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Mult R1,2       # R1=2A
Mult R2,3       # R2=3B
Store Temp,R1   # Temp variable = 2A
Add R2,Temp     # R2=2A+3B
Store C,R2      # C=2A+3B

Register-Register architecture

As with previous cases the first version is simple:

# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Add R1,R1,R1    # R1=2A
Add R3,R2,R2    # R3=2B
Add R3,R3,R2    # R3=3B
Add R3,R3,R1    # R3=3B+2A
Store C,R3      # C=2A+3B

The second version uses the multiply instruction, assuming one of its operands can be a constant:

# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Mult R1,R1,2    # R1=2A
Mult R2,R2,3    # R2=3B
Add R3,R1,R2    # R3=2A+3B
Store C,R3      # C=2A+3B

As with the Accumulator-based architecture, using a different algorithm does not produce a better solution.
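The three-operand register listings can be traced in the same spirit. The Python sketch below is a hypothetical interpreter written for this illustration: `run_register_machine` is not from the book, `Load` and `Store` are assumed to be the only instructions that touch memory, and the last operand of an arithmetic instruction may be a register or an integer constant.

```python
def run_register_machine(program, memory):
    """Execute Load/Store/Add/Sub/Mult lines in three-operand form."""
    regs = {}

    def value(operand):
        # A register name yields its content; anything else is a constant.
        return regs[operand] if operand in regs else int(operand)

    for line in program.strip().splitlines():
        op, operands = line.split(None, 1)
        args = [a.strip() for a in operands.split(",")]
        if op == "Load":
            reg, source = args
            regs[reg] = memory[source] if source in memory else int(source)
        elif op == "Store":
            target, reg = args
            memory[target] = regs[reg]
        elif op in ("Add", "Sub", "Mult"):
            dest, left, right = args
            x, y = value(left), value(right)
            regs[dest] = x + y if op == "Add" else x - y if op == "Sub" else x * y
        else:
            raise ValueError("unknown instruction: " + op)
    return memory


version_2 = """
Load R1,A
Load R2,B
Mult R1,R1,2
Mult R2,R2,3
Add R3,R1,R2
Store C,R3
"""

print(run_register_machine(version_2, {"A": 5, "B": 7})["C"])  # 31
```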

b) Write the instructions needed for executing the following formula:
Sum = 8A + 4B + 2C + D
Where A, B, C, D and Sum are variables.

Stack-based architecture. First version, the simple and straightforward solution:

# First Version
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Push B          # Push variable B on TOS
Add             # 2A+B is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 4A+2B is stored on TOS
Push C          # Push C on TOS
Add             # 4A+2B+C is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 8A+4B+2C is stored on TOS
Push D          # D is stored on TOS
Add             # 8A+4B+2C+D is on TOS
Pop Sum         # Sum = 8A+4B+2C+D

The same program can be written using Polish notation, but it does not shorten the program.

# Second Version – Polish notation
Push D          # Push variable D on TOS
Push 2          # Push 2 on TOS
Push C          # Push variable C on TOS
Push 2          # Push 2 on TOS
Push B          # Push variable B on TOS
Push A          # Push variable A on TOS
Push A          # Push variable A on TOS
Add             # 2A is stored on TOS
Add             # 2A+B is stored on TOS
Mult            # 4A+2B is stored on TOS
Add             # 4A+2B+C is stored on TOS
Mult            # 8A+4B+2C is stored on TOS
Add             # 8A+4B+2C+D is stored on TOS
Pop Sum         # Sum = 8A+4B+2C+D

Accumulator-based architecture

As with the previous case the first version is straightforward:

# Version 1
Load A          # Load variable A into the Accumulator
Mult 8          # Accumulator = 8A
Store Sum       # Sum = 8A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add Sum         # Accumulator = 8A+4B
Store Sum       # Sum = 8A+4B
Load C          # Accumulator = C
Mult 2          # Accumulator = 2C
Add D           # Accumulator = 2C+D
Add Sum         # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D

Version 2 is slightly better. There is no need to store 8A+4B, which saves two instructions:

# Version 2
Load A          # Load variable A into the Accumulator
Mult 8          # Accumulator = 8A
Store Sum       # Sum = 8A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add Sum         # Accumulator = 8A+4B
Add C           # Accumulator = 8A+4B+C
Add C           # Accumulator = 8A+4B+2C
Add D           # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D

Version 3 uses a slightly different algorithm, which is more efficient:

# Version 3
Load A          # Load variable A into the Accumulator
Add A           # Accumulator = 2A
Add B           # Accumulator = 2A+B
Mult 2          # Accumulator = 4A+2B
Add C           # Accumulator = 4A+2B+C
Mult 2          # Accumulator = 8A+4B+2C
Add D           # Accumulator = 8A+4B+2C+D
Store Sum       # Sum = 8A+4B+2C+D
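Version 3 works because the formula can be nested as 8A+4B+2C+D = ((2A+B)*2+C)*2+D, so no intermediate value ever has to leave the accumulator. The Python sketch below traces the listing and counts its instructions. The interpreter `run_accumulator` is a hypothetical helper written for this example; it assumes an operand is either a variable present in `memory` or an integer constant.

```python
def run_accumulator(program, memory):
    """Execute Load/Store/Add/Sub/Mult lines on a single accumulator.

    Returns the updated memory and the number of instructions executed.
    """
    accumulator = 0
    count = 0
    for line in program.strip().splitlines():
        op, arg = line.split()
        count += 1
        if op == "Store":
            memory[arg] = accumulator
            continue
        operand = memory[arg] if arg in memory else int(arg)
        if op == "Load":
            accumulator = operand
        elif op == "Add":
            accumulator += operand
        elif op == "Sub":
            accumulator -= operand
        elif op == "Mult":
            accumulator *= operand
        else:
            raise ValueError("unknown instruction: " + op)
    return memory, count


version_3 = """
Load A
Add A
Add B
Mult 2
Add C
Mult 2
Add D
Store Sum
"""

memory, count = run_accumulator(version_3, {"A": 1, "B": 2, "C": 3, "D": 4})
print(memory["Sum"], count)  # 26 8
```

With A=1, B=2, C=3 and D=4 the expected value is 8+8+6+4 = 26, reached in eight instructions and without any temporary store.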

Memory-Register architecture

As with previous cases the first version is simple, similar to the Accumulator version but using more registers:

# Version 1
Load R1,A       # Load variable A into the R1
Mult R1,8       # R1=8A
Store Sum,R1    # Sum=8A
Load R2,B       # R2=B
Mult R2,4       # R2=4B
Add R2,Sum      # R2=8A+4B
Add R2,C        # R2=8A+4B+C
Add R2,C        # R2=8A+4B+2C
Add R2,D        # R2=8A+4B+2C+D
Store Sum,R2    # Sum=8A+4B+2C+D

The second version is similar to the third Accumulator-based architecture version. Although here there are many registers, this implementation is using just one register.

# Version 2
Load R1,A       # Load variable A into R1
Add R1,A        # R1=2A
Add R1,B        # R1=2A+B
Mult R1,2       # R1=4A+2B
Add R1,C        # R1=4A+2B+C
Mult R1,2       # R1=8A+4B+2C
Add R1,D        # R1=8A+4B+2C+D
Store Sum,R1    # Sum=8A+4B+2C+D

Register-Register architecture

As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers:

# Version 1
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Load R6,4       # Load the constant 4 into R6
Load R7,8       # Load the constant 8 into R7
Mult R1,R1,R7   # R1=8A
Mult R2,R2,R6   # R2=4B
Mult R3,R3,R5   # R3=2C
Add R4,R4,R1    # R4=8A+D
Add R4,R4,R2    # R4=8A+4B+D
Add R4,R4,R3    # R4=8A+4B+2C+D
Store Sum,R4    # Sum=8A+4B+2C+D

The second version is slightly better:

# Version 2
Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Add R1,R1,R1    # R1=2A
Add R6,R1,R2    # R6=2A+B
Mult R6,R6,R5   # R6=4A+2B
Add R6,R6,R3    # R6=4A+2B+C
Mult R6,R6,R5   # R6=8A+4B+2C
Add R6,R6,R4    # R6=8A+4B+2C+D
Store Sum,R6    # Sum=8A+4B+2C+D

c) Write the instructions needed for executing the following formula:
Sum = A + 2B + 4C + 8D
Where A, B, C, D and Sum are variables.

This exercise is identical to the previous one; only the variables are different. Instead of using variables A, B, C, D we have to use D, C, B, A (respectively) and the solutions will fit.

d) Write the instructions needed for executing the following formula:
Sum = 2(A + 2B) - 3(C + 2D)
Where A, B, C, D and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push C          # Push C on TOS
Push D          # Push D on TOS
Add             # C+D is stored on TOS
Push D          # D is stored on TOS
Add             # C+2D is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3(C+2D) is on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Add             # A+B is stored on TOS
Push B          # Push variable B on TOS
Add             # A+2B is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 2(A+2B) is stored on TOS
Sub             # 2(A+2B)-3(C+2D) is stored on TOS
Pop Sum         # Sum = 2(A+2B)-3(C+2D)

Accumulator-based architecture

# Version 1
Load A          # Load variable A into the Accumulator
Add B           # Accumulator = A+B
Add B           # Accumulator = A+2B
Mult 2          # Accumulator = 2(A+2B)
Store Sum       # Sum = 2(A+2B)
Load C          # Accumulator = C
Add D           # Accumulator = C+D
Add D           # Accumulator = C+2D
Mult 3          # Accumulator = 3(C+2D)
Store Tmp       # Temporary location = 3(C+2D)
Load Sum        # Accumulator = 2(A+2B)
Sub Tmp         # Accumulator = 2(A+2B)-3(C+2D)
Store Sum       # Sum = 2(A+2B)-3(C+2D)

Version 2 is slightly different: it uses the expanded formula
Sum = 2(A+2B)-3(C+2D) = 2A+4B-3C-6D
Unfortunately it is worse (17 instructions compared to 13 in Version 1).

# Version 2
Load A          # Load variable A into the Accumulator
Mult 2          # Accumulator = 2A
Store A         # A=2A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add A           # Accumulator = 2A+4B
Store Sum       # Sum = 2A+4B
Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C         # C=3C
Load D          # Accumulator = D
Mult 6          # Accumulator = 6D
Add C           # Accumulator = 3C+6D
Store Tmp       # Temporary location = 3C+6D
Load Sum        # Accumulator = 2A+4B
Sub Tmp         # Accumulator = 2A+4B-3C-6D
Store Sum       # Sum = 2(A+2B)-3(C+2D)

Changing the order of the calculation saves two instructions (15 instead of 17); nevertheless, Version 1 is still better.

# Version 3
Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C         # C=3C
Load D          # Accumulator = D
Mult 6          # Accumulator = 6D
Add C           # Accumulator = 3C+6D
Store Tmp       # Temporary location = 3C+6D
Load A          # Load variable A into the Accumulator
Mult 2          # Accumulator = 2A
Store A         # A=2A
Load B          # Accumulator = B
Mult 4          # Accumulator = 4B
Add A           # Accumulator = 2A+4B
Sub Tmp         # Accumulator = 2A+4B-3C-6D
Store Sum       # Sum = 2(A+2B)-3(C+2D)

Memory-Register architecture

As with previous cases the first version is simple, similar to the Accumulator version but using more registers:

Load R1,B       # Load variable B into the R1
Add R1,B        # R1=2B
Add R1,A        # R1=A+2B
Mult R1,2       # R1=2(A+2B)
Store Sum,R1    # Sum=2(A+2B)
Load R2,D       # R2=D
Add R2,D        # R2=2D
Add R2,C        # R2=C+2D
Mult R2,3       # R2=3(C+2D)
Store Tmp,R2    # Tmp=3(C+2D)
Load R1,Sum     # R1=2(A+2B)
Sub R1,Tmp      # R1=2(A+2B)-3(C+2D)
Store Sum,R1    # Sum=2(A+2B)-3(C+2D)

Register-Register architecture

As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers:

Load R1,A       # Load variable A into R1
Load R2,B       # Load variable B into R2
Load R3,C       # Load variable C into R3
Load R4,D       # Load variable D into R4
Load R5,2       # Load the constant 2 into R5
Load R6,3       # Load the constant 3 into R6
Add R1,R1,R2    # R1=A+B
Add R1,R1,R2    # R1=A+2B
Mult R1,R1,R5   # R1=2(A+2B)
Add R3,R3,R4    # R3=C+D
Add R3,R3,R4    # R3=C+2D
Mult R3,R3,R6   # R3=3(C+2D)
Sub R1,R1,R3    # R1=2(A+2B)-3(C+2D)
Store Sum,R1    # Sum=2(A+2B)-3(C+2D)

e) Write the instructions needed for executing the following formula:
Sum = 2AB+3CD
Where A, B, C, D and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push C          # Push C on TOS
Push D          # Push D on TOS
Mult            # CD is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3CD is on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Mult            # AB is stored on TOS
Push 2          # Push 2 on TOS
Mult            # 2AB is stored on TOS
Add             # 2AB+3CD is stored on TOS
Pop Sum         # Sum = 2AB+3CD

Accumulator-based architecture

Load C          # Load variable C into the Accumulator
Mult D          # Accumulator = CD
Mult 3          # Accumulator = 3CD
Store Tmp       # Tmp = 3CD
Load A          # Accumulator = A
Mult B          # Accumulator = AB
Mult 2          # Accumulator = 2AB
Add Tmp         # Accumulator = 2AB+3CD
Store Sum       # Sum = 2AB+3CD

Memory-Register architecture

As with previous cases the first version is simple, similar to the Accumulator version but using more registers:

Load R1,A       # Load variable A into the R1
Mult R1,B       # R1=AB
Mult R1,2       # R1=2AB
Store Tmp,R1    # Tmp = 2AB
Load R2,C       # R2=C
Mult R2,D       # R2=CD
Mult R2,3       # R2=3CD
Add R2,Tmp      # R2=2AB+3CD
Store Sum,R2    # Sum=2AB+3CD

Register-Register architecture

As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers:

Load R1,A       # Load variable A into the R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,D       # R4=D
Load R5,2       # R5=2
Load R6,3       # R6=3
Mult R1,R1,R2   # R1=AB
Mult R1,R1,R5   # R1=2AB
Mult R3,R3,R4   # R3=CD
Mult R3,R3,R6   # R3=3CD
Add R3,R3,R1    # R3=2AB+3CD
Store Sum,R3    # Sum=2AB+3CD

f) Write the instructions needed for executing the following formula:
Sum = 3(A+B)*(C+D)
Where A, B, C, D and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push C          # Push C on TOS
Push D          # Push D on TOS
Add             # C+D is stored on TOS
Push A          # Push variable A on TOS
Push B          # Push variable B on TOS
Add             # A+B is stored on TOS
Push 3          # 3 is stored on TOS
Mult            # 3(A+B) is on TOS
Mult            # 3(A+B)*(C+D) is stored on TOS
Pop Sum         # Sum = 3(A+B)*(C+D)

Accumulator-based architecture

Load C          # Load variable C into the Accumulator
Add D           # Accumulator = C+D
Store Tmp       # Tmp = C+D
Load A          # Accumulator = A
Add B           # Accumulator = A+B
Mult 3          # Accumulator = 3(A+B)
Mult Tmp        # Accumulator = 3(A+B)*(C+D)
Store Sum       # Sum = 3(A+B)*(C+D)

Memory-Register architecture

As with previous cases the first version is simple, similar to the Accumulator version but using more registers:

Load R1,A       # Load variable A into the R1
Add R1,B        # R1=A+B
Mult R1,3       # R1=3(A+B)
Store Tmp,R1    # Tmp = 3(A+B)
Load R2,C       # R2=C
Add R2,D        # R2=C+D
Mult R2,Tmp     # R2=3(A+B)*(C+D)
Store Sum,R2    # Sum=3(A+B)*(C+D)

Register-Register architecture

As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers:

Load R1,A       # Load variable A into the R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,D       # R4=D
Load R5,3       # R5=3
Add R1,R1,R2    # R1=A+B
Mult R1,R1,R5   # R1=3(A+B)
Add R3,R3,R4    # R3=C+D
Mult R3,R3,R1   # R3=3(A+B)*(C+D)
Store Sum,R3    # Sum=3(A+B)*(C+D)

g) Write the instructions needed for executing the following formula:
Sum = (A+2B+3C)*(3A+2B+C)
Where A, B, C and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push C          # Push C on TOS
Push 3          # Push 3 on TOS
Mult            # 3C is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push A          # A is stored on TOS
Add             # A+2B is on TOS
Add             # A+2B+3C is stored on TOS
Push A          # Push A on TOS
Push 3          # Push 3 on TOS
Mult            # 3A is stored on TOS
Push B          # Push variable B on TOS
Push B          # Push variable B on TOS
Add             # 2B is stored on TOS
Push C          # C is stored on TOS
Add             # 2B+C is on TOS
Add             # 3A+2B+C is stored on TOS
Mult            # (3A+2B+C)*(A+2B+3C) is stored on TOS
Pop Sum         # Sum = (3A+2B+C)*(A+2B+3C)

Accumulator-based architecture

Load C          # Accumulator = C
Mult 3          # Accumulator = 3C
Store C3        # Variable C3=3C
Load B          # Accumulator = B
Add B           # Accumulator = 2B
Store B2        # Variable B2=2B
Load A          # Accumulator = A
Add B2          # Accumulator = A+2B
Add C3          # Accumulator = A+2B+3C
Store Part1     # Variable Part1 = A+2B+3C
Load A          # Accumulator = A
Mult 3          # Accumulator = 3A
Store A3        # Variable A3=3A
Load C          # Accumulator = C
Add B2          # Accumulator = C+2B
Add A3          # Accumulator = C+2B+3A
Mult Part1      # Accumulator = (C+2B+3A)*(A+2B+3C)
Store Sum       # Sum = (3A+2B+C)*(A+2B+3C)

Memory-Register architecture

As with previous cases the first version is simple, similar to the Accumulator version but using more registers:

Load R1,A       # Load variable A into the R1
Mult R1,3       # R1=3A
Add R1,B        # R1=3A+B
Add R1,B        # R1=3A+2B
Add R1,C        # R1=3A+2B+C
Store Tmp,R1    # Tmp = 3A+2B+C
Load R2,C       # R2=C
Mult R2,3       # R2=3C
Add R2,B        # R2=3C+B
Add R2,B        # R2=3C+2B
Add R2,A        # R2=3C+2B+A
Mult R2,Tmp     # R2=(3A+2B+C)*(A+2B+3C)
Store Sum,R2    # Sum=(3A+2B+C)*(A+2B+3C)

Register-Register architecture

As with previous cases the first version is simple and it starts by loading the variables and the constants into the registers:

Load R1,A       # Load variable A into the R1
Load R2,B       # R2=B
Load R3,C       # R3=C
Load R4,3       # R4=3
Mult R5,R3,R4   # R5 = 3C
Add R5,R5,R1    # R5 = 3C+A
Add R5,R5,R2    # R5 = 3C+A+B
Add R5,R5,R2    # R5 = 3C+A+2B
Mult R6,R1,R4   # R6 = 3A
Add R6,R6,R2    # R6 = 3A+B
Add R6,R6,R2    # R6 = 3A+2B
Add R6,R6,R3    # R6 = 3A+2B+C
Mult R6,R6,R5   # R6=(3A+2B+C)*(A+2B+3C)
Store Sum,R6    # Sum=(3A+2B+C)*(A+2B+3C)

h) Write the instructions needed for executing the following formula:
Sum = A*B*C/(A+B+C)
Where A, B, C and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push C          # Push C on TOS
Push B          # Push B on TOS
Add             # C+B is stored on TOS
Push A          # Push variable A on TOS
Add             # A+B+C is stored on TOS
Push A          # A is stored on TOS
Push B          # B is stored on TOS
Mult            # A*B is on TOS
Push C          # Push C on TOS
Mult            # A*B*C is stored on TOS
Div             # A*B*C/(A+B+C) is stored on TOS
Pop Sum         # Sum = A*B*C/(A+B+C)

Accumulator-based architecture

Load   C      # Accumulator = C
Add    B      # Accumulator = B+C
Add    A      # Accumulator = B+C+A
Store  Tmp    # Variable Tmp = B+C+A
Load   A      # Accumulator = A
Mult   B      # Accumulator = A*B
Mult   C      # Accumulator = A*B*C
Div    Tmp    # Accumulator = A*B*C/(A+B+C)
Store  Sum    # Sum = A*B*C/(A+B+C)

Memory-Register architecture

As with the previous cases, the first version is simple: similar to the Accumulator version, but using more registers.

Load   R1,A      # Load variable A into R1
Add    R1,B      # R1 = A+B
Add    R1,C      # R1 = A+B+C
Store  Tmp,R1    # Tmp = A+B+C
Load   R2,C      # R2 = C
Mult   R2,B      # R2 = B*C
Mult   R2,A      # R2 = A*B*C
Div    R2,Tmp    # R2 = A*B*C/(A+B+C)
Store  Sum,R2    # Sum = A*B*C/(A+B+C)


Register-Register architecture

As with the previous cases, the first version is simple and it starts by loading the variables into the registers.

Load   R1,A        # Load variable A into R1
Load   R2,B        # R2 = B
Load   R3,C        # R3 = C
Mult   R5,R1,R2    # R5 = A*B
Mult   R5,R5,R3    # R5 = A*B*C
Add    R4,R1,R2    # R4 = A+B
Add    R4,R4,R3    # R4 = A+B+C
Div    R5,R5,R4    # R5 = A*B*C/(A+B+C)
Store  Sum,R5      # Sum = A*B*C/(A+B+C)

i) Write the instructions needed for executing the following formula:
Sum = (A+B)*(B+C)*(A+C)
where A, B, C and Sum are variables.

Stack-based architecture. Simple and straightforward solution:

Push   A      # Push A on TOS
Push   B      # Push B on TOS
Add           # A+B is stored on TOS
Push   B      # Push variable B on TOS
Push   C      # Push variable C on TOS
Add           # B+C is stored on TOS
Push   A      # A is stored on TOS
Push   C      # C is stored on TOS
Add           # A+C is on TOS
Mult          # (A+C)*(B+C) is stored on TOS
Mult          # (A+C)*(B+C)*(A+B) is stored on TOS
Pop    Sum    # Sum = (A+C)*(B+C)*(A+B)

Accumulator-based architecture

Load   A        # Accumulator = A
Add    B        # Accumulator = A+B
Store  TmpAB    # Variable TmpAB = A+B
Load   B        # Accumulator = B
Add    C        # Accumulator = B+C
Store  TmpBC    # Variable TmpBC = B+C
Load   C        # Accumulator = C
Add    A        # Accumulator = C+A
Mult   TmpAB    # Accumulator = (A+B)*(C+A)
Mult   TmpBC    # Accumulator = (A+B)*(C+A)*(B+C)
Store  Sum      # Sum = (A+B)*(C+A)*(B+C)

Memory-Register architecture

As with the previous cases, the first version is simple: similar to the Accumulator version, but using more registers.

Load   R1,A        # Load variable A into R1
Add    R1,B        # R1 = A+B
Store  TmpAB,R1    # Variable TmpAB = A+B
Load   R2,B        # Load variable B into R2
Add    R2,C        # R2 = B+C
Store  TmpBC,R2    # Variable TmpBC = B+C
Load   R3,C        # Load variable C into R3
Add    R3,A        # R3 = C+A
Mult   R3,TmpAB    # R3 = (C+A)*(A+B)
Mult   R3,TmpBC    # R3 = (C+A)*(A+B)*(B+C)
Store  Sum,R3      # Sum = (C+A)*(A+B)*(B+C)

Register-Register architecture

As with the previous cases, the first version is simple and it starts by loading the variables into the registers.

Load   R1,A        # Load variable A into R1
Load   R2,B        # R2 = B
Load   R3,C        # R3 = C
Add    R4,R1,R2    # R4 = A+B
Add    R5,R2,R3    # R5 = B+C
Add    R6,R1,R3    # R6 = A+C
Mult   R6,R6,R4    # R6 = (A+B)*(A+C)
Mult   R6,R6,R5    # R6 = (A+B)*(A+C)*(B+C)
Store  Sum,R6      # Sum = (A+B)*(A+C)*(B+C)
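The stack-based listing for exercise i) can be checked with a small interpreter. The following is only a sketch: the function name `run_stack_program`, the tuple encoding of the instructions and the sample values for A, B and C are assumptions made for illustration, not part of the booklet.

```python
def run_stack_program(program, variables):
    """Execute (opcode, operand) pairs on a minimal stack machine.

    Push copies a variable onto the top of the stack (TOS), Pop stores
    TOS into a variable, Add and Mult replace the two top entries with
    their sum or product.
    """
    stack = []
    memory = dict(variables)
    binary_ops = {
        "Add": lambda left, right: left + right,
        "Mult": lambda left, right: left * right,
    }
    for opcode, operand in program:
        if opcode == "Push":
            stack.append(memory[operand])
        elif opcode == "Pop":
            memory[operand] = stack.pop()
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(binary_ops[opcode](left, right))
    return memory


# Stack-based solution of exercise i): Sum = (A+B)*(B+C)*(A+C)
program_i = [
    ("Push", "A"), ("Push", "B"), ("Add", None),
    ("Push", "B"), ("Push", "C"), ("Add", None),
    ("Push", "A"), ("Push", "C"), ("Add", None),
    ("Mult", None), ("Mult", None),
    ("Pop", "Sum"),
]

result = run_stack_program(program_i, {"A": 2, "B": 3, "C": 4})
print(result["Sum"])  # (2+3)*(3+4)*(2+4) = 210
```

Only commutative operations are modelled here, so the sketch does not have to fix an operand order for Div.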


2. CPI

a) A specific program was executed on two computer systems:
- System A has a cycle time of 1 ns and CPI = 5
- System B has a cycle time of 1.6 ns and CPI = 2.5
Which system is faster (for this specific program) and by how much?

Solution
Assume that C is the number of instructions executed.
TimeA = C * 5 * 1 * 10^-9 = 5 * C * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9 = 4 * C * 10^-9
TimeA/TimeB = (5 * C * 10^-9) / (4 * C * 10^-9) = 1.25
System B is faster by 25%.

b) Assume we need to get the same speed on both systems (described in the question above). What should the CPI of each one of the systems be?

Solution
There are two possibilities. The first one is slowing down system B:
TimeA = TimeB
TimeA = C * 5 * 1 * 10^-9
TimeB = C * X * 1.6 * 10^-9
X = 5/1.6 = 3.125
If the CPI of the second (the faster) system increases from 2.5 to 3.125, the speeds of the two systems will be identical.

The second possibility is increasing the speed of the first system:
TimeA = TimeB
TimeA = C * X * 1 * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9
X = 2.5 * 1.6 = 4
If the CPI of the first system decreases from 5 to 4, the speeds of the two systems will be identical.

c) Assume we need to get the same speed on both systems (described in the questions above). What should the cycle time of each one of the systems be?

Solution
As with the previous example, here too there are two possible solutions. The first is slowing down the second system:
TimeA = TimeB
TimeA = C * 5 * 1 * 10^-9
TimeB = C * 2.5 * X * 10^-9
X = 5/2.5 = 2
If the cycle time of the second system increases from 1.6 ns to 2 ns, the speeds of the two systems will be identical.

The second possibility is increasing the speed of the first system:
TimeA = TimeB
TimeA = C * 5 * X * 10^-9
TimeB = C * 2.5 * 1.6 * 10^-9
X = 2.5 * 1.6 / 5 = 4/5 = 0.8
If the cycle time of the first system decreases from 1 ns to 0.8 ns, the speeds of the two systems will be identical.
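The calculations in exercises a) to c) all rest on one relation: execution time = instruction count * CPI * cycle time. A minimal sketch follows; the function name `execution_time` and the instruction count of one million are assumptions chosen for illustration (the count cancels out in every ratio).

```python
def execution_time(instructions, cpi, cycle_time_s):
    """Execution time in seconds = instruction count * CPI * cycle time."""
    return instructions * cpi * cycle_time_s


n = 1_000_000  # arbitrary; cancels out when the two systems are compared

time_a = execution_time(n, 5, 1e-9)      # system A: CPI 5, 1 ns cycle
time_b = execution_time(n, 2.5, 1.6e-9)  # system B: CPI 2.5, 1.6 ns cycle
ratio = time_a / time_b                  # about 1.25, so B is faster

# Exercise b): CPI that equalises the run times (cycle times unchanged)
cpi_b_equal = time_a / (n * 1.6e-9)      # about 3.125 for system B
cpi_a_equal = time_b / (n * 1e-9)        # about 4 for system A

# Exercise c): cycle time that equalises the run times (CPIs unchanged)
cycle_b_equal = time_a / (n * 2.5)       # about 2 ns for system B
cycle_a_equal = time_b / (n * 5)         # about 0.8 ns for system A

print(ratio, cpi_b_equal, cpi_a_equal, cycle_b_equal, cycle_a_equal)
```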

d) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:

Group    Frequency   Cycles
ALU      50%         1
Load     20%         2
Store    10%         2
Branch   20%         2

Calculate the average CPI for that specific program.

Solution
CPI = 50%*1 + 20%*2 + 10%*2 + 20%*2 = 0.5 + 0.4 + 0.2 + 0.4 = 1.5
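The average CPI is a weighted sum of the per-group cycle counts. The sketch below assumes a dictionary layout and a function name (`average_cpi`) that are not part of the booklet; the figures are those of exercise d).

```python
def average_cpi(mix):
    """Weighted average CPI.

    mix maps a group name to (frequency, cycles); the frequencies are
    fractions and are expected to sum to 1.
    """
    total = sum(frequency for frequency, _ in mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 100%")
    return sum(frequency * cycles for frequency, cycles in mix.values())


mix_d = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}
print(round(average_cpi(mix_d), 2))  # 1.5
```

The frequency check catches a mix whose percentages were copied wrongly from the table.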


e) A specific system implements four groups of instructions, as outlined in the following table. When a specific program was executed, the following instruction frequencies were observed:

Group    Frequency   Cycles
ALU      43%         1
Load     21%         1
Store    12%         2
Branch   24%         2

The hardware engineers can improve the performance of the "Store" group of instructions and execute each instruction in one cycle. However, this will require increasing the system's cycle time by 15%. Is it worthwhile to implement the proposed change?

Solution
We will start by calculating the CPI for each case. We will mark the unknown cycle time by T and the number of instructions by C.
CPIOld = 43%*1 + 21%*1 + 12%*2 + 24%*2 = 1.36
CPINew = 43%*1 + 21%*1 + 12%*1 + 24%*2 = 1.24
Speedup = Old time / New time = (C * CPIOld * T) / (C * CPINew * 1.15 * T) = 1.36 / (1.24 * 1.15) = 0.95

The proposed change is not worthwhile, since it slows the system down.
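The trade-off in exercise e) (fewer cycles for one group against a longer cycle time for all groups) can be sketched as follows. The function name `speedup_with_cycle_change` is an assumption; the numbers come from the exercise.

```python
def speedup_with_cycle_change(cpi_old, cpi_new, cycle_time_factor):
    """Old time / new time when the CPI and the cycle time both change.

    The instruction count and the original cycle time cancel out, so
    only the two CPI values and the cycle time factor are needed.
    """
    return cpi_old / (cpi_new * cycle_time_factor)


cpi_old = 0.43 * 1 + 0.21 * 1 + 0.12 * 2 + 0.24 * 2  # Store takes 2 cycles
cpi_new = 0.43 * 1 + 0.21 * 1 + 0.12 * 1 + 0.24 * 2  # Store takes 1 cycle

result = speedup_with_cycle_change(cpi_old, cpi_new, 1.15)
print(round(result, 2))  # 0.95, below 1: the change slows the system down
```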

f) A specific system runs at 1 GHz and supports four groups of instructions. When a specific program was executed, the following usage was observed:

Group   Usage   CPI
1       20%     1
2       30%     2
3       10%     3
4       40%     4

Calculate the CPI for that program. If there were 10^10 instructions, how many cycles did the program require and what was the execution time?

Solution
CPI = 1*20% + 2*30% + 3*10% + 4*40% = 2.7

The total number of cycles is the number of instructions multiplied by the average CPI:
Total number of cycles = 2.7 * 10^10

The run time is calculated by dividing the total number of cycles by the clock rate:
Total runtime = CPI * Instruction count / Clock rate = 2.7 * 10^10 / (1 * 10^9) = 27 seconds

g) A program was executed on a specific system and the following attributes were measured:
- The number of instructions executed was 10^9
- The CPI measured during the run was 2.5
- The clock rate was 2.5 GHz.
Calculate the amount of time this program ran.

Solution
The run time is obtained by multiplying the number of instructions by the CPI and the cycle time: 10^9 * 2.5 * (1 / (2.5 * 10^9)) = 1.0 second.

h) The program from the previous question was executed once again, but this time using a different compiler and a different hardware system. The attributes of the second run were as follows:
- The number of instructions executed was 950,000,000
- The CPI measured during the run was 3.0
- The clock rate was 3 GHz.
Which system is faster and by how much?

Solution
The new execution time is 950,000,000 * 3.0 / (3 * 10^9) = 0.95 seconds. The figures of the first run (10^9 instructions, CPI of 2.5, 2.5 GHz) give 10^9 * 2.5 / (2.5 * 10^9) = 1.0 second, so the second system is faster by about 5% (1.0/0.95 = 1.05).

i) A program was executed on two systems. The first system has a 20 ns cycle time and the second a 30 ns cycle time. On the first system the CPI measured was 2.0 and on the second system the CPI was 0.5. Calculate the run time on each one of the systems.

Solution
Since the number of instructions is not given, we can only define the formula. Assuming the number of instructions executed is X, then
Time1 = X * 2.0 * 2.0 * 10^-8
Time2 = X * 0.5 * 3 * 10^-8

j) A program was executed on two systems. The first system has a 500 MHz clock rate and the second a 650 MHz clock rate. On the first system it used 1*10^8 cycles and on the second system it used 1.2*10^8 cycles. Which system is faster and by how much?

Solution
Time1 = 1*10^8 / (500 * 10^6) = 0.2 seconds
Time2 = 1.2*10^8 / (650 * 10^6) = 0.185 seconds
This means that the second system is faster by about 8% (0.2/0.185 = 1.08).

k) A system has three types of instructions. The first type executes in one cycle, the second requires two cycles and the third requires three cycles. When running a piece of code, five instructions were of type one, three were of type two and two instructions were of type three. Calculate the CPI of this piece of code.

Solution
The total number of cycles is given by: 5*1 + 3*2 + 2*3 = 17
The number of instructions executed was: 5+3+2 = 10
CPI = 17/10 = 1.7

l) While running a program on system A with a clock rate of 600 MHz, the CPI was 1.3. Running the program on system B with a clock rate of 750 MHz produced a CPI of 2.5. Assuming the number of instructions on system A was 100,000, what should the number of instructions on system B be to achieve the same execution time?


Solution
TimeA = 100,000 * 1.3 * 1/(600 * 10^6)
TimeB = X * 2.5 * 1/(750 * 10^6)
X = (100,000 * 1.3 / (600 * 10^6)) * (750 * 10^6) / 2.5 = 65,000
If the number of instructions on system B is 65,000, the execution times will be identical.

m) Running a specific program requires 3*10^10 cycles. How long does it run if it executes on a system with a clock rate of 100 MHz? And how long does it run on a system with a clock rate of 3 GHz?

Solution
TimeA = 3*10^10 * 1/(100 * 10^6) = 300
TimeB = 3*10^10 * 1/(3 * 10^9) = 10
On the first system it will run for 300 seconds and on the second system it will run for 10 seconds.

n) A specific system uses 5 groups of instructions. Each group requires a different number of cycles for execution. While executing a test run, the group frequencies were measured as outlined in the following table:

Group   Name            Usage   CPI
1       ALU             22%     1
2       Memory Access   36%     5
3       Branch          16%     3
4       Call            13%     4
5       Return          13%     4

In an attempt to increase the system's performance, the hardware engineers have come up with several improvement suggestions. However, each one of the suggestions has some drawbacks (in addition to the benefits). The following table summarizes the suggestions and the drawbacks. Each improvement relates to one group of instructions, and the drawback is an increase in the cycle time.

Group   Name            Improvement   Cycle time increase
1       ALU             25%           7%
2       Memory Access   35%           17%
3       Branch          90%           1%
4       Call            45%           2%
5       Return          45%           2%

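For example, it is possible to decrease the number of cycles required by the ALU by 25%, but this requires increasing the cycle time (for all the groups) by 7%. The only limitation is that only one improvement can be implemented. Which one is the preferred suggestion? Which one is the worst?

Solution
First we have to calculate the CPI prior to the improvements:
CPI = 22%*1 + 36%*5 + 16%*3 + 13%*4 + 13%*4 = 3.54

Then, for each alternative, we calculate the new CPI (with the cycles of the improved group reduced) and multiply it by the increased cycle time. Dividing by the original CPI gives the new execution time relative to the old one, so values below 1 mean a faster system:

Alternative   New CPI   Relative execution time (new/old)
1             3.485     3.485 * 1.07 / 3.54 = 1.053
2             2.910     2.910 * 1.17 / 3.54 = 0.962
3             3.108     3.108 * 1.01 / 3.54 = 0.887
4             3.306     3.306 * 1.02 / 3.54 = 0.953
5             3.306     3.306 * 1.02 / 3.54 = 0.953

The best improvement is alternative 3 (improving the Branch), which shortens the execution time by about 11%. Alternatives 2, 4 and 5 also improve the situation, but by less. The worst alternative is 1 (improving the ALU); it is the only one that makes the execution time longer.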
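The comparison of the alternatives in exercise n) can be recomputed with a short script. This is a sketch under the interpretation given in the exercise text (an improvement of x% reduces the cycles of that group by x%, and the cycle time increase applies to all groups); the names `MIX`, `PROPOSALS` and `relative_time` are assumptions.

```python
# (usage share, cycles per instruction) for each group, from the exercise
MIX = {
    "ALU": (0.22, 1),
    "Memory Access": (0.36, 5),
    "Branch": (0.16, 3),
    "Call": (0.13, 4),
    "Return": (0.13, 4),
}

# (fractional cycle reduction, fractional cycle time increase) per proposal
PROPOSALS = {
    "ALU": (0.25, 0.07),
    "Memory Access": (0.35, 0.17),
    "Branch": (0.90, 0.01),
    "Call": (0.45, 0.02),
    "Return": (0.45, 0.02),
}


def cpi(mix):
    """Weighted average CPI of an instruction mix."""
    return sum(share * cycles for share, cycles in mix.values())


def relative_time(improved_group):
    """New execution time divided by the old one (below 1 means faster)."""
    reduction, cycle_increase = PROPOSALS[improved_group]
    share, cycles = MIX[improved_group]
    new_mix = dict(MIX)
    new_mix[improved_group] = (share, cycles * (1 - reduction))
    return cpi(new_mix) * (1 + cycle_increase) / cpi(MIX)


for group in PROPOSALS:
    print(f"{group}: {relative_time(group):.3f}")
```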


3. Amdahl's Law

a) A specific system performs a task in 100 ns. It is possible to introduce an improvement, after which the task will be performed in 20 ns. The task is executed only during 30% of the time. Calculate the improvement.

Solution
Using the Amdahl formula, Speedup = 1 / ((1 - Fe) + Fe/Se):
Fe = 0.3
Se = 100/20 = 5
Speedup = 1 / ((1 - 0.3) + 0.3/5) = 1.32
The improvement gained is 32%.

b) A new system can execute a specific task 10 times faster compared to the existing system. The task is performed during 40% of the time, while 60% of the time is dedicated to I/O operations. What is the total speedup to be achieved from replacing the system? (Note: the 60% I/O operations will not change.)

Solution
Using the Amdahl formula:
Fe = 0.4
Se = 10
Speedup = 1 / ((1 - 0.4) + 0.4/10) = 1.56
The improvement gained is 56%.
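The Amdahl calculations in this section all use the same expression, which can be sketched as a one-line function. The name `amdahl_speedup` and its parameter names are assumptions; the sample values are those of exercises a) and b).

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup = 1 / ((1 - Fe) + Fe / Se)."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


speedup_a = amdahl_speedup(0.3, 100 / 20)  # exercise a): Fe = 0.3, Se = 5
speedup_b = amdahl_speedup(0.4, 10)        # exercise b): Fe = 0.4, Se = 10

print(round(speedup_a, 2), round(speedup_b, 2))
```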

c) In a specific application, 5% of the code has to run in serial mode while 95% of the code can be executed in parallel. What will be the improvement when moving the application to a 10-CPU system?

Solution
Using the Amdahl formula:
Fe = 0.95
Se = 10
Speedup = 1 / ((1 - 0.95) + 0.95/10) = 6.9
The code will execute 6.9 times faster.

d) After spending a lot of time, a hardware engineer managed to improve the floating point arithmetic, which now executes twice as fast. Unfortunately, floating point operations are executed only during 10% of the time. What is the overall speedup achieved?

Solution
Using the Amdahl formula:
Fe = 0.1
Se = 2
Speedup = 1 / ((1 - 0.1) + 0.1/2) = 1.05
The code will execute only about 5% faster.

e) A scientific application was developed so it can exploit parallel systems utilizing many processors. As such, the parallel part is executed during 95% of the time.
- On how many processors should the application run so that it is 10 times faster?
- What will be the number of processors required to achieve a 25 times faster run?
- After additional work the developers succeeded in increasing the parallel percentage to 97%. How many processors are required for a 10 times faster run?
- How many are required for a 20 times faster run?

Solution
Using the Amdahl formula. Here F and the required speedup are given, so S (the number of processors) has to be calculated: S = F / (1/Speedup - (1 - F)).
- 19 processors (F = 95%, speedup = 10).
- It is not possible. With 5% serial code the speedup can never exceed 1/0.05 = 20, so the system will never achieve this speed.
- 14 processors (F = 97%, speedup = 10).
- 49 processors (F = 97%, speedup = 20).
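The processor counts in exercise e) follow from solving the Amdahl formula for S. A minimal sketch is shown below; the function name `processors_needed` is an assumption, and exact fractions are used (rather than floating point) so that a result such as 19 is not pushed up to 20 by rounding error.

```python
import math
from fractions import Fraction


def processors_needed(parallel_percent, target_speedup):
    """Smallest whole number of processors reaching the target speedup.

    Solves target = 1 / ((1 - F) + F / S) for S and rounds up.
    Returns None when the target cannot be reached with any number of
    processors (the serial part alone already takes too long).
    """
    f = Fraction(parallel_percent, 100)
    gap = Fraction(1, target_speedup) - (1 - f)
    if gap <= 0:
        return None
    return math.ceil(f / gap)


print(processors_needed(95, 10))  # 19
print(processors_needed(95, 25))  # None: the limit is 1/0.05 = 20
print(processors_needed(97, 10))  # 14
print(processors_needed(97, 20))  # 49
```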

f) A scientific application runs on parallel systems; however, only 70% of it is suited for parallel processing. Calculate the speedup obtained by using two, three, four and five processors.

Solution
Using the Amdahl formula:
F = 70%
S is the number of processors: 2 for two processors, 3 for three, and so on.
The speedup for two processors is 1.54
The speedup for three processors is 1.88
The speedup for four processors is 2.11
The speedup for five processors is 2.27

g) A scientific application runs for 100 minutes: 40 minutes are CPU time while 60 minutes are I/O time. For increasing the speed, two alternatives are considered: replacing the processor by a new model that is 90 times faster, or replacing the disk system by a newer model that is 4 times faster.
- Which alternative is better?
- What will be the run time for each alternative?

Solution
Using the Amdahl formula twice.

For the CPU upgrade:
F = 40%
S = 90
Speedup = 1.65
Run time = 100/1.65 = 60 minutes

For the disk upgrade:
F = 60%
S = 4
Speedup = 1.82
Run time = 100/1.82 = 55 minutes

The preferred alternative is upgrading the disk.

h) In an attempt to improve the system's performance, the hardware engineers came up with two possible solutions: adding a special hardware device that will execute the square root instructions 10 times faster, or improving the floating point instructions so they will execute twice as fast. On the benchmark programs the square root instructions are executed 20% of the time, while the floating point instructions execute during 50% of the time. Which alternative is better?

Solution
Using the Amdahl formula twice.

For the square root instructions upgrade:
F = 20%
S = 10
Speedup = 1.22

For the floating point instructions upgrade:
F = 50%
S = 2
Speedup = 1.33

This means that the floating point upgrade is better.


i) In a specific processor the Load and Store instructions were improved so they execute four times faster. These instructions account for 50% of the total run time. What is the overall speedup? Assuming the test program ran for 160 seconds, how long will it run after the improvement?

Solution
Using the Amdahl formula:
F = 50%
S = 4
Speedup = 1.6
The new run time = 160/1.6 = 100 seconds

j) In a specific processor all logic instructions were improved and the new ones execute five times faster. Assuming the logic instructions account for 50% of the run time, how long will a program that previously ran for 10 seconds run now?

Solution
Using the Amdahl formula:
F = 50%
S = 5
Speedup = 1.67
The new run time = 10/1.67 = 6 seconds

k) After improving the arithmetic instructions they run five times faster. What should their percentage be if the overall speedup required is 3 times?

Solution
Using the Amdahl formula:
F = needs to be calculated
S = 5
Speedup = 3
F = (1 - 1/3) / (1 - 1/5) = 0.833
The percentage should be ~83%
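When the speedup and S are known and the fraction F is the unknown, as in exercise k), the Amdahl formula can be rearranged to F = (1 - 1/Speedup) / (1 - 1/S). The following sketch assumes the function name `required_fraction`.

```python
def required_fraction(target_speedup, speedup_enhanced):
    """Fraction of the run time that must be enhanced to reach the target.

    Rearranged from target = 1 / ((1 - F) + F / S).
    """
    if target_speedup > speedup_enhanced:
        raise ValueError("the overall speedup cannot exceed the enhancement")
    return (1 - 1 / target_speedup) / (1 - 1 / speedup_enhanced)


fraction_k = required_fraction(3, 5)  # exercise k): Speedup = 3, S = 5
print(round(fraction_k * 100, 1))     # percentage of arithmetic instructions
```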

125

l) You are the CIO of a manufacturing organization. Due to anticipated changes, the board asked that the main application executed on the system run twice as fast. You checked the application and discovered that 65% of it is suited for parallel execution. How many processors (or cores) do you have to buy?

Solution
Using the Amdahl formula:
F = 65%
S = needs to be calculated
Speedup = 2
The number of processors (or cores) should be 5

m) A specific system is using RAM only (no cache memory). It is possible to add a cache memory that runs five times faster. How will the execution time change if the applications use the cache 90% of the time on average? What will happen if the cache is used only 50% of the time?

Solution
Using the Amdahl formula.

The case of 90%:
F = 90%
S = 5
Speedup = 3.57

The case of 50%:
F = 50%
S = 5
Speedup = 1.67

If the cache is used 90% of the time the speedup will be 3.57.
If the cache is used only 50% of the time the speedup will be 1.67.


n) A vector computer executes an instruction on a vector (array) of values, contrary to a scalar computer that executes the instruction on a single value. On a specific vector computer, the vector instructions are 20 times faster compared to the scalar instructions. What will the speedup be, assuming only 70% of the application can be vectorized?

Solution
Using the Amdahl formula:
F = 70%
S = 20
Speedup = 2.99

o) There was an urgent need to improve the performance of a specific CISC based system. Like any other CISC based system it supported many instructions. The hardware engineers mapped the instructions and addressed the 10 most used ones. These 10 instructions account for 90% of the time. After the improvement these 10 instructions executed 6 times faster. What is the overall speedup obtained?

Solution
Using the Amdahl formula:
F = 90%
S = 6
Speedup = 4

p) During a meeting dedicated to the newly acquired system, the CIO said that the fact that the new system has 100 processors means that all applications will benefit. Some applications may see an increase of 50 times and others less. Nevertheless, he said, all applications will run at least twice as fast. Do you agree with this assumption?

Solution
No. Achieving the 50 times faster time means that 99% of the application should be ready for parallel execution. Furthermore, obtaining the promised speed (twice as fast) requires that about 50% of the application be parallel ready. Applications with a smaller percentage will not see the promised speedup.


4. Scoreboarding

a) Draw the scoreboard for the following Register-Register based architecture's instructions:

Add   R2,R3,R4
Sub   R3,R1,R4
Add   R7,R2,R3
Sub   R8,R7,R5

Solution
At the beginning all the registers' valid bits are set (there is no hazard). The table for each stage shows the valid bits on entry (Start Stage) and on completion (End Stage). The Activities line lists the tests as well as the actions performed in that stage. Please note that since this is dynamic scheduling there might be other solutions as well, for example several attempts to issue an instruction before the input register becomes valid.

Stage 1 - Attempt executing the first instruction (Add R2,R3,R4)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   0   1   1   1   1   1   1
Activities: R3.valid=1, R4.valid=1; issue instruction; set R2.valid=0

Stage 2 - Attempt executing the second instruction (Sub R3,R1,R4)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   0   1   1   1   1   1   1
End Stage     1   0   0   1   1   1   1   1
Activities: R1.valid=1, R4.valid=1; issue instruction; set R3.valid=0

Stage 3 - Attempt executing the third instruction (Add R7,R2,R3)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   0   0   1   1   1   1   1
End Stage     1   0   0   1   1   1   1   1
Activities: R2.valid=0; stall

Stage 4 - First instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   0   0   1   1   1   1   1
End Stage     1   1   0   1   1   1   1   1
Activities: Add R2,R3,R4 completed; set R2.valid=1

Stage 5 - Attempt executing the third instruction (Add R7,R2,R3)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   0   1   1   1   1   1
End Stage     1   1   0   1   1   1   1   1
Activities: R2.valid=1, R3.valid=0; stall

Stage 6 - Second instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   0   1   1   1   1   1
End Stage     1   1   1   1   1   1   1   1
Activities: Sub R3,R1,R4 completed; set R3.valid=1

Stage 7 - Attempt executing the third instruction (Add R7,R2,R3)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   1   1   1   1   1   0   1
Activities: R2.valid=1, R3.valid=1; issue instruction; set R7.valid=0

Stage 8 - Attempt executing the fourth instruction (Sub R8,R7,R5)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   0   1
End Stage     1   1   1   1   1   1   0   1
Activities: R7.valid=0; stall

Stage 9 - Third instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   0   1
End Stage     1   1   1   1   1   1   1   1
Activities: Add R7,R2,R3 completed; set R7.valid=1

Stage 10 - Attempt executing the fourth instruction (Sub R8,R7,R5)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   1   1   1   1   1   1   0
Activities: R7.valid=1, R5.valid=1; issue instruction; set R8.valid=0
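The valid-bit bookkeeping used in the stages above can be sketched in a few lines. This is a simplified model, not a full scoreboard: the class name `Scoreboard` and its methods are assumptions, and, as in the solution above, only the source registers are tested before an instruction is issued. The replay covers stages 1 to 7 of exercise a).

```python
class Scoreboard:
    """One valid bit per register; True means the value is available."""

    def __init__(self, register_count=8):
        self.valid = {f"R{i}": True for i in range(1, register_count + 1)}

    def try_issue(self, dest, *sources):
        """Issue if all source registers are valid, otherwise stall.

        On issue the destination is marked invalid until the
        instruction completes. Returns True when issued.
        """
        if not all(self.valid[register] for register in sources):
            return False
        self.valid[dest] = False
        return True

    def complete(self, dest):
        """The instruction writing dest has finished; mark dest valid."""
        self.valid[dest] = True


board = Scoreboard()
log = []
log.append(board.try_issue("R2", "R3", "R4"))  # stage 1: Add R2,R3,R4 issues
log.append(board.try_issue("R3", "R1", "R4"))  # stage 2: Sub R3,R1,R4 issues
log.append(board.try_issue("R7", "R2", "R3"))  # stage 3: stall, R2 invalid
board.complete("R2")                           # stage 4: first Add completes
log.append(board.try_issue("R7", "R2", "R3"))  # stage 5: stall, R3 invalid
board.complete("R3")                           # stage 6: Sub completes
log.append(board.try_issue("R7", "R2", "R3"))  # stage 7: Add R7,R2,R3 issues
print(log)  # [True, True, False, False, True]
```

When the stalls and completions happen is decided by the caller here, which mirrors the remark that other stage orderings are possible.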

b) Draw the scoreboard for the following Register-Register based architecture's instructions:

Mult  R1,R2,R2
Mult  R3,R2,R1
Add   R5,R1,R3
Div   R8,R5,R4

Solution
At the beginning all the registers' valid bits are set (there is no hazard). The table for each stage shows the valid bits on entry (Start Stage) and on completion (End Stage). The Activities line lists the tests as well as the actions performed in that stage.

Stage 1 - Attempt executing the first instruction (Mult R1,R2,R2)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     0   1   1   1   1   1   1   1
Activities: R2.valid=1; issue instruction; set R1.valid=0

Stage 2 - Attempt executing the second instruction (Mult R3,R2,R1)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   0   1   1   1   1   1   1   1
End Stage     0   1   1   1   1   1   1   1
Activities: R2.valid=1, R1.valid=0; stall

Stage 3 - Attempt executing the second instruction (Mult R3,R2,R1)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   0   1   1   1   1   1   1   1
End Stage     0   1   1   1   1   1   1   1
Activities: R2.valid=1, R1.valid=0; stall

The first instruction still did not complete.

Stage 4 - The first instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   0   1   1   1   1   1   1   1
End Stage     1   1   1   1   1   1   1   1
Activities: Mult R1,R2,R2 completed; set R1.valid=1

Stage 5 - Attempt executing the second instruction (Mult R3,R2,R1)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   1   0   1   1   1   1   1
Activities: R2.valid=1, R1.valid=1; issue instruction; set R3.valid=0

Stage 6 - Attempt executing the third instruction (Add R5,R1,R3)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   0   1   1   1   1   1
End Stage     1   1   0   1   1   1   1   1
Activities: R1.valid=1, R3.valid=0; stall

Stage 7 - Second instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   0   1   1   1   1   1
End Stage     1   1   1   1   1   1   1   1
Activities: Mult R3,R2,R1 completed; set R3.valid=1

Stage 8 - Attempt executing the third instruction (Add R5,R1,R3)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   1   1   1   0   1   1   1
Activities: R1.valid=1, R3.valid=1; issue instruction; set R5.valid=0

Stage 9 - Attempt executing the fourth instruction (Div R8,R5,R4)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   0   1   1   1
End Stage     1   1   1   1   0   1   1   1
Activities: R5.valid=0; stall

Stage 10 - Third instruction completed

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   0   1   1   1
End Stage     1   1   1   1   1   1   1   1
Activities: Add R5,R1,R3 completed; set R5.valid=1

Stage 11 - Attempt executing the fourth instruction (Div R8,R5,R4)

             R1  R2  R3  R4  R5  R6  R7  R8
Start Stage   1   1   1   1   1   1   1   1
End Stage     1   1   1   1   1   1   1   0
Activities: R5.valid=1, R4.valid=1; issue instruction; set R8.valid=0

c) The high level computer instruction SUM = A+B+C+D Can be implemented in assembly language by: Add Add Add

R5,R1,R2 R5,R5,R3 R5,R5,R4

Add Add Add

R5,R1,R2 R6,R3,R4 R5,R5,R6

Or

Which one is better? Draw the score board for the two implementations. Assume that after issuing an instruction there is one extra cycle before the register becomes available

© Revision 1.0

139

Computer Systems Architecture exercises Solution In both cases we’ll assume that at the beginning all the registers valid bits are set (there is no hazard). In addition the variables A,B,C,D were loaded already into R1,R2,R3,R4. The table on each stage represents the entry situation (Start Stage) and the completion situation (End Stage). The middle column define the tests as well as the activities on each stage. First implementation Stage 1 - Attempt executing the first instruction Add Add Add Start Stage

R5,R1,R2 R5,R5,R3 R5,R5,R4 Activities

R1: 1

End Stage R1: 1

R2: 1

R1.valid=1

R2: 1

R3: 1

R2.valid=1

R3: 1

R4: 1 R5: 1

R4: 1 Issue instruction

R6: 1 R7: 1

R5: 0 R6: 1

Set R5.valid=0

R8: 1

R7: 1 R8: 1

Stage 2 - Attempt executing the second instruction Add Add Add

R5,R1,R2 R5,R5,R3 R5,R5,R4

© Revision 1.0

140

Computer Systems Architecture exercises Start Stage

Activities

R1: 1

End Stage R1: 1

R2: 1

R5.valid=0

R2: 1

R3: 1

R3: 1

R4: 1

R4: 1

R5: 0

Stall

R5: 0

R6: 1

R6: 1

R7: 1

R7: 1

R8: 1

R8: 1

Stage 3 – First instruction completed Add Add Add Start Stage

R5,R1,R2 R5,R5,R3 R5,R5,R4 Activities

R1: 1 R2: 1

End Stage R1: 1

Add

R3: 1

R5,R1,R2

Completed

R2: 1 R3: 1

R4: 1

R4: 1

R5: 0

R5: 1

R6: 1

R6: 1

R7: 1

Set R5.valid=1

R8: 1

R7: 1 R8: 1

Stage 4 - Attempt executing the second instruction Add Add Add

R5,R1,R2 R5,R5,R3 R5,R5,R4

© Revision 1.0

141

Computer Systems Architecture exercises

Start Stage

Activities

R1: 1

End Stage R1: 1

R2: 1

R5.valid=1

R2: 1

R3: 1

R3.valid=1

R3: 1

R4: 1 R5: 1

R4: 1 Issue instruction

R6: 1 R7: 1

R5: 0 R6: 1

Set R5.valid=0

R8: 1

R7: 1 R8: 1

Stage 5 - Attempt executing the third instruction Add Add Add Start Stage

R5,R1,R2 R5,R5,R3 R5,R5,R4 Activities

R1: 1

End Stage R1: 1

R2: 1

R5.valid=0

R2: 1

R3: 1

R3: 1

R4: 1

R4: 1

R5: 0

Stall

R5: 0

R6: 1

R6: 1

R7: 1

R7: 1

R8: 1

R8: 1

Stage 6 – Second instruction completed Add Add Add

R5,R1,R2 R5,R5,R3 R5,R5,R4

© Revision 1.0

142

Computer Systems Architecture exercises

Start Stage

Activities

R1: 1 R2: 1

End Stage R1: 1

Add

R3: 1

R5,R5,R3

Completed

R2: 1 R3: 1

R4: 1

R4: 1

R5: 0

R5: 1

R6: 1

R6: 1

R7: 1

Set R5.valid=1

R8: 1

R7: 1 R8: 1

Stage 7 - Attempt executing the third instruction Add Add Add Start Stage

R5,R1,R2 R5,R5,R3 R5,R5,R4 Activities

R1: 1

End Stage R1: 1

R2: 1

R5.valid=1

R2: 1

R3: 1

R4.valid=1

R3: 1

R4: 1 R5: 1

R4: 1 Issue instruction

R6: 1 R7: 1

R5: 0 R6: 1

Set R5.valid=0

R8: 1

R7: 1 R8: 1

Due to the excessive usage of register 5 seven stages were required for executing the code.

© Revision 1.0

143

Computer Systems Architecture exercises Second implementation Stage 1 - Attempt executing the first instruction Add Add Add Start Stage

R5,R1,R2 R6,R3,R4 R5,R5,R6 Activities

R1: 1

End Stage R1: 1

R2: 1

R1.valid=1

R2: 1

R3: 1

R2.valid=1

R3: 1

R4: 1 R5: 1

R4: 1 Issue instruction

R6: 1 R7: 1

R5: 0 R6: 1

Set R5.valid=0

R8: 1

R7: 1 R8: 1

Stage 2 - Attempt executing the second instruction Add Add Add Start Stage

R5,R1,R2 R6,R3,R4 R5,R5,R6 Activities

R1: 1

End Stage R1: 1

R2: 1

R3.valid=1

R2: 1

R3: 1

R4.valid=1

R3: 1

R4: 1 R5: 0

R4: 1 Issue instruction

R6: 1 R7: 1

R5: 0 R6: 0

Set R6.valid=0

R8: 1

R7: 1 R8: 1

Stage 3 – First instruction completed

Instructions: Add R5,R1,R2; Add R6,R3,R4; Add R5,R5,R6
Activities: Add R5,R1,R2 completed; Set R5.valid=1

| Valid bit | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|---|
| Start Stage | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| End Stage | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |

Stage 4 - Attempt executing the third instruction

Instructions: Add R5,R1,R2; Add R6,R3,R4; Add R5,R5,R6
Activities: R5.valid=1; R6.valid=0; Stall

| Valid bit | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|---|
| Start Stage | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
| End Stage | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |

Stage 5 – Second instruction completed

Instructions: Add R5,R1,R2; Add R6,R3,R4; Add R5,R5,R6
Activities: Add R6,R3,R4 completed; Set R6.valid=1

| Valid bit | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|---|
| Start Stage | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
| End Stage | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

Stage 6 - Attempt executing the third instruction

Instructions: Add R5,R1,R2; Add R6,R3,R4; Add R5,R5,R6
Activities: R5.valid=1; R6.valid=1; Issue instruction; Set R5.valid=0

| Valid bit | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 |
|---|---|---|---|---|---|---|---|---|
| Start Stage | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| End Stage | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |

Here the second instruction can be executed without delay.
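The valid-bit bookkeeping used in the stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `try_issue` and `complete` are hypothetical, and the sketch checks only the source registers before issuing, as the Activities entries in the stages do. The order of the events is copied from the second implementation rather than derived from a timing model.

```python
def try_issue(dest, sources, valid):
    """Issue only if every source register is valid; then mark dest as pending."""
    if all(valid[reg] for reg in sources):
        valid[dest] = False      # the result is not available yet
        return True              # "Issue instruction"
    return False                 # "Stall"


def complete(dest, valid):
    """An instruction finished: its destination register becomes valid again."""
    valid[dest] = True


# Second implementation: Add R5,R1,R2; Add R6,R3,R4; Add R5,R5,R6
valid = {f"R{i}": True for i in range(1, 9)}
log = []
log.append(try_issue("R5", ["R1", "R2"], valid))   # stage 1: issue
log.append(try_issue("R6", ["R3", "R4"], valid))   # stage 2: issue
complete("R5", valid)                              # stage 3: first instruction completed
log.append(try_issue("R5", ["R5", "R6"], valid))   # stage 4: stall, R6 is not valid
complete("R6", valid)                              # stage 5: second instruction completed
log.append(try_issue("R5", ["R5", "R6"], valid))   # stage 6: issue
print(log)
```

Because the second instruction writes R6 instead of R5, it does not have to wait for the first one, which is where the second implementation saves a stage.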

5. Branch Prediction

a) The following string represents the behavior of a specific branch instruction.
- “1” means the branch was taken
- “0” means the branch was not taken

1001110111

Calculate the success rate of the branch prediction when using one and two bits. In both cases assume the default is not to branch.

Solution

One Bit Prediction

| Cycle | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Value | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| Branch | Y | N | N | Y | Y | Y | N | Y | Y | Y |
| Anticipation | N | Y | N | N | Y | Y | Y | N | Y | Y |
| Success | N | N | Y | N | Y | Y | N | N | Y | Y |

Success rate = 50% (5 out of the 10 branches)

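Two Bits Prediction

In the Anticipation line, “!” marks a strong prediction and “?” a weak one.

| Cycle | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Value | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| Branch | Y | N | N | Y | Y | Y | N | Y | Y | Y |
| Anticipation | N! | N? | N! | N! | N? | Y? | Y! | Y? | Y! | Y! |
| Success | N | Y | Y | N | N | Y | N | Y | Y | Y |

Success rate = 60% (6 out of the 10 branches)

b) What are the success rates of the branches in the previous exercise if the default is branch taken?

Solution
One Bit Prediction = 60%
Two Bits Prediction = 60%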
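The two prediction schemes can be checked with a short simulation. The following is a minimal Python sketch, assuming the two-bit predictor is a saturating counter that starts in the strong state of the default direction (this matches the N!, N?, Y?, Y! sequence of the solution). The function names are illustrative only.

```python
def one_bit_hits(history, default_taken):
    """One-bit predictor: predict whatever the branch did last time."""
    prediction = default_taken
    hits = 0
    for bit in history:
        taken = bit == "1"
        hits += prediction == taken
        prediction = taken           # remember the last outcome
    return hits


def two_bit_hits(history, default_taken):
    """Two-bit saturating counter: 0 = N!, 1 = N?, 2 = Y?, 3 = Y!."""
    state = 3 if default_taken else 0
    hits = 0
    for bit in history:
        taken = bit == "1"
        hits += (state >= 2) == taken
        # Move one step towards the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return hits


history = "1001110111"
for default_taken in (False, True):
    print(default_taken,
          one_bit_hits(history, default_taken) / len(history),
          two_bit_hits(history, default_taken) / len(history))
```

Dividing the number of hits by the length of the string gives the success rate, so the same two functions can be used to check any scenario string with either default.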

c) Calculate the success rates of the branch prediction mechanisms for the following scenarios. A bit set represents a branch taken.

| Scenario | One bit default | Two bits default | One bit success | Two bits success |
|---|---|---|---|---|
| 1110000110110 | Taken | Taken | 61.54% | 46.15% |
| 10101010111 | Taken | Not taken | 27.27% | 45.45% |
| 10101010111 | Not taken | Taken | 18.18% | 63.64% |
| 1110001010 | Taken | Taken | 50.00% | 60.00% |
| 1110001010 | Not taken | Not taken | 40.00% | 40.00% |
| 1011100110110110 | Not taken | Taken | 37.50% | 56.25% |
| 1110101010111011 | Not taken | Taken | 31.25% | 68.75% |
| 1011001110001010 | Taken | Not taken | 43.75% | 37.50% |
| 1100110011 | Taken | Taken | 60.00% | 40.00% |
| 1100011011101011 | Not taken | Not taken | 43.75% | 43.75% |
| 0011101000011001 | Taken | Taken | 50.00% | 37.50% |
| 1011011101111011 | Taken | Taken | 50.00% | 75.00% |
| 1100011110000011 | Not taken | Not taken | 68.75% | 43.75% |
| 1110101011 | Not taken | Not taken | 30.00% | 50.00% |
| 1010101011 | Taken | Taken | 20.00% | 60.00% |
| 1101101101101101 | Taken | Taken | 37.50% | 68.75% |
| 0111111111110 | Not taken | Not taken | 84.62% | 76.92% |
| 0111111111110 | Taken | Taken | 76.92% | 84.62% |
| 1111000011110000 | Not taken | Not taken | 75.00% | 50.00% |
| 0111111101111111 | Not taken | Not taken | 81.25% | 81.25% |
| 0001111010101111 | Not taken | Not taken | 56.25% | 68.75% |
| 1110101000110111 | Taken | Taken | 50.00% | 50.00% |
| 1101011111011000 | Taken | Taken | 56.25% | 68.75% |
| 1101110111011101 | Taken | Taken | 50.00% | 75.00% |

Chapter 6 – Cache Memory exercises

1. Cache Memory improvements

a) A specific system has three levels of memory (Main, Cache L1 and Cache L2).
The memory access time is 50 ns
L1 access time is 1 ns
L2 access time is 5 ns
An application was executed on that system which is characterized by the fact that 30% of the instructions access memory. 90% of these accesses are found in L1 and only 1% has to be brought from main memory.
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.

Solution

No cache (just main memory)
New CPI = CPI + 30% * 50 = CPI + 15
Since 30% of the instructions access memory, 15 ns will be added to the calculated CPI.

Main memory and L1 only
New CPI = CPI + 30% * (10%*50 + 90%*1) = CPI + 1.77
30% of the instructions access memory; however, 90% of them find the datum in L1 and only 10% have to reach main memory. In this case 1.77 ns will be added to the calculated CPI.

Main memory and L2 only
New CPI = CPI + 30% * (99%*5 + 1%*50) = CPI + 1.64
30% of the instructions access memory; however, only 1% has to access main memory while 99% find the datum in L2. In this case 1.64 ns will be added to the calculated CPI.

Main memory, L1 and L2
New CPI = CPI + 30% * (90%*1 + 9%*5 + 1%*50) = CPI + 0.56
30% of the instructions access memory, of which 90% find the datum in L1, another 9% find it in L2, and only in 1% of the cases the instruction has to access main memory. In this case 0.56 ns will be added to the calculated CPI.
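All four cases use the same weighted sum, so they can be computed by one small function. The sketch below is an illustration only: the name `added_time` and the representation of each level as a (fraction served, access time) pair are assumptions made for the example.

```python
def added_time(access_fraction, levels):
    """Time (ns) added per instruction.

    access_fraction: fraction of instructions that access memory
    levels: list of (fraction of accesses served by the level, access time in ns)
    """
    return access_fraction * sum(share * time for share, time in levels)


main_only = added_time(0.30, [(1.00, 50)])
main_l1 = added_time(0.30, [(0.90, 1), (0.10, 50)])
main_l2 = added_time(0.30, [(0.99, 5), (0.01, 50)])
main_l1_l2 = added_time(0.30, [(0.90, 1), (0.09, 5), (0.01, 50)])

for name, value in [("main only", main_only), ("main + L1", main_l1),
                    ("main + L2", main_l2), ("main + L1 + L2", main_l1_l2)]:
    print(f"{name}: {value:.3f} ns")
```

The shares in each list must add up to 1, since every memory access is served by exactly one level.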

b) A specific system has four levels of memory (Main, Cache L1, Cache L2 and Cache L3).
The memory access percent is 33%
The memory access time is 60 ns, 0% missing rate
L1 access time is 1 ns, 15% missing rate (the datum is not found in L1)
L2 access time is 10 ns, 7% missing rate
L3 access time is 20 ns, 1% missing rate
Calculate the amount of time added to each cycle, if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory and L3
- The application is using main memory, L1 and L2.
- The application is using main memory, L1, L2 and L3.

Solution

No cache (just main memory)
New CPI = CPI + 33% * 60 = CPI + 19.8 ns

Memory and cache L1
New CPI = CPI + 33% * (85%*1 + 15%*60) = CPI + 3.25 ns

Memory and cache L2
New CPI = CPI + 33% * (93%*10 + 7%*60) = CPI + 4.46 ns

Memory and cache L3
New CPI = CPI + 33% * (99%*20 + 1%*60) = CPI + 6.73 ns

Memory, cache L1 and cache L2
New CPI = CPI + 33% * (85%*1 + 8%*10 + 7%*60) = CPI + 1.93 ns

Memory, cache L1, cache L2 and cache L3
New CPI = CPI + 33% * (85%*1 + 8%*10 + 6%*20 + 1%*60) = CPI + 1.14 ns

c) The following table defines several systems with different characteristics. All systems have a three level hierarchical memory (main, L1, L2).
- The memory access represents the percent of the instructions that access memory.
- The access time defines the time required to access each of the levels.
- The missing rate represents the percentage of misses per each level. The missing rate for the main memory is always zero.
For each line calculate the amount of time added to the CPI if:
- The application is using only main memory
- The application is using main memory and L1
- The application is using main memory and L2
- The application is using main memory, L1 and L2.

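| No. | Memory Access % | Access time Main (ns) | Access time L1 (ns) | Access time L2 (ns) | Missing Rate L1 (%) | Missing Rate L2 (%) | Added Time Main (ns) | Added Time L1 (ns) | Added Time L2 (ns) | Added Time L1+L2 (ns) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 35 | 75 | 2 | 10 | 12 | 2 | 26.25 | 3.77 | 3.96 | 1.49 |
| 2 | 40 | 80 | 2 | 15 | 8 | 1 | 32.00 | 3.30 | 6.26 | 1.48 |
| 3 | 25 | 50 | 3 | 12 | 9 | 2 | 12.50 | 1.81 | 3.19 | 1.14 |
| 4 | 29 | 48 | 2 | 10 | 12 | 2 | 13.92 | 2.18 | 3.12 | 1.08 |
| 5 | 32 | 56 | 1 | 8 | 11 | 2 | 17.92 | 2.26 | 2.87 | 0.87 |
| 6 | 48 | 40 | 2 | 12 | 14 | 3 | 19.20 | 3.51 | 6.16 | 2.04 |
| 7 | 28 | 50 | 3 | 15 | 12 | 2 | 14.00 | 2.42 | 4.40 | 1.44 |
| 8 | 36 | 54 | 1 | 12 | 10 | 1 | 19.44 | 2.27 | 4.47 | 0.91 |
| 9 | 65 | 50 | 1 | 11 | 8 | 1 | 32.50 | 3.20 | 7.40 | 1.42 |
| 10 | 52 | 52 | 2 | 11 | 9 | 2 | 27.04 | 3.38 | 6.15 | 1.89 |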

Chapter 7 – BUS exercises

1. Parity

a) The following table contains binary numbers and the definition of the parity bit (odd or even). Calculate the parity and complete the table. The first exercise is solved.

| No. | Binary Number | Parity | Parity Bit |
|---|---|---|---|
| 1 | 10 1010 1010 1010 | Odd | 0 |
| 2 | 1111 0000 1111 | Even | 0 |
| 3 | 1 1011 1101 1010 | Even | 1 |
| 4 | 11 0000 1011 | Odd | 0 |
| 5 | 1111 1100 0000 | Odd | 1 |
| 6 | 10 0000 0100 | Odd | 1 |
| 7 | 100 0111 1010 | Even | 0 |
| 8 | 000 1101 1010 | Even | 1 |
| 9 | 1 0111 0101 1011 | Even | 1 |
| 10 | 1 1111 1111 | Odd | 0 |
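The parity bit depends only on the number of bits that are set. The following Python sketch assumes the usual convention, which the solved first line follows: with even parity the total number of set bits (including the parity bit) is even, and with odd parity it is odd. The function name is illustrative.

```python
def parity_bit(number, odd=False):
    """Return the parity bit for a binary string (spaces are ignored)."""
    ones = number.replace(" ", "").count("1")
    even_bit = ones % 2          # makes the total number of ones even
    return even_bit ^ 1 if odd else even_bit


print(parity_bit("10 1010 1010 1010", odd=True))   # the solved first line
print(parity_bit("1111 0000 1111"))
```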

b) For increasing the integrity of the information sent over the network, a special parity algorithm was defined. Instead of adding just one bit per block (seven bits), the algorithm adds three parity bits. Each such parity bit guards several data bits. Unlike the ordinary simple parity bit mechanism, this mechanism increases the overhead but provides the capability to correct faulty blocks without the need to re-transmit.
The algorithm is defined by:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Encode the binary number 0110010 by adding the three parity bits as defined by the algorithm above.

Solution
The following table defines the original value (the data bits), the bits that are being guarded by each parity bit, and the value of these parity bits.

|  | D0 | D1 | D2 | D3 | D4 | D5 | D6 | Parity |
|---|---|---|---|---|---|---|---|---|
| Value | 0 | 1 | 1 | 0 | 0 | 1 | 0 |  |
| P0 | 0 | 1 | --- | 0 | 0 | 1 | --- | 0 |
| P1 | --- | 1 | 1 | 0 | --- | 1 | 0 | 1 |
| P2 | 0 | --- | 1 | 0 | 0 | --- | 0 | 1 |

The encoded value is: 0110010 011

c) The following table contains a list of 7-bit blocks that have to be encoded using the previously described algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

| No. | Block Content | P0 | P1 | P2 |
|---|---|---|---|---|
| 1 | 111 1111 | 1 | 1 | 1 |
| 2 | 101 1110 | 0 | 1 | 0 |
| 3 | 101 1110 | 0 | 1 | 0 |
| 4 | 001 0011 | 1 | 1 | 0 |
| 5 | 101 0101 | 0 | 0 | 0 |
| 6 | 110 1101 | 0 | 1 | 0 |
| 7 | 111 0011 | 1 | 0 | 1 |
| 8 | 101 1111 | 0 | 0 | 1 |
| 9 | 111 1001 | 1 | 0 | 0 |
| 10 | 100 0111 | 1 | 0 | 1 |
| 11 | 100 1001 | 0 | 0 | 1 |
| 12 | 111 1100 | 0 | 1 | 0 |
| 13 | 001 0010 | 1 | 0 | 1 |
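The three parity bits can be computed directly from the definition of the algorithm. The sketch below is a minimal Python illustration; the names `GROUPS` and `parity_bits` are assumptions made for the example, and the `odd` flag simply inverts each even parity bit.

```python
# Data-bit indices guarded by P0, P1 and P2, as defined by the algorithm.
GROUPS = {
    "P0": (0, 1, 3, 4, 5),
    "P1": (1, 2, 3, 5, 6),
    "P2": (0, 2, 3, 4, 6),
}


def parity_bits(block, odd=False):
    """Return [P0, P1, P2] for a 7-bit block written as D0..D6."""
    data = [int(ch) for ch in block.replace(" ", "")]
    result = []
    for name in ("P0", "P1", "P2"):
        bit = sum(data[i] for i in GROUPS[name]) % 2   # even parity
        result.append(bit ^ 1 if odd else bit)         # invert for odd parity
    return result


print(parity_bits("0110010"))             # the block encoded in exercise b
print(parity_bits("011 1111", odd=True))
```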

d) The following table contains a list of 7-bit blocks that have to be encoded using a very similar algorithm (same locations but the parity is odd):
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Calculate the three parity bits for each block.

| No. | Block Content | P0 | P1 | P2 |
|---|---|---|---|---|
| 1 | 011 1111 | 1 | 0 | 1 |
| 2 | 000 1010 | 1 | 1 | 0 |
| 3 | 111 0000 | 1 | 1 | 1 |
| 4 | 101 0000 | 0 | 0 | 1 |
| 5 | 111 0111 | 1 | 1 | 1 |
| 6 | 110 0011 | 0 | 0 | 1 |
| 7 | 111 1111 | 0 | 0 | 0 |
| 8 | 100 0001 | 0 | 0 | 1 |
| 9 | 011 1110 | 1 | 1 | 0 |
| 10 | 101 0101 | 1 | 1 | 1 |
| 11 | 111 0001 | 1 | 0 | 0 |
| 12 | 101 1101 | 0 | 0 | 0 |
| 13 | 110 1101 | 1 | 0 | 1 |

e) The block: 111 0101 110 arrived through the network after it was encoded using the following algorithm:
P0 = even_parity (D0, D1, D3, D4, D5)
P1 = even_parity (D1, D2, D3, D5, D6)
P2 = even_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.

Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=0. Since the calculated parity bits are identical to the values in the message, we assume the block is correct.
The data bits are: 111 0101

f) The block: 101 1111 000 arrived through the network after it was encoded using the following algorithm:
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.

Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=0, while the received parity bits are 000. It can be seen that parity bits 0 and 1 are different. Assuming just one bit flipped during the transmission, it should be a data bit that is guarded by P0 and P1 but not by P2. The solution takes this bit to be D5 (D1 is guarded by the same two parity bits, so the parity bits alone do not distinguish between the two).
Flipping D5 gives the data bits: 101 1101

g) The block: 111 0111 101 arrived through the network after it was encoded using the following algorithm:
P0 = odd_parity (D0, D1, D3, D4, D5)
P1 = odd_parity (D1, D2, D3, D5, D6)
P2 = odd_parity (D0, D2, D3, D4, D6)
Decode it and find the correct data bits of the original block.

Solution
First we will encode the data bits in the received message. The calculated parity bits are: P0=1, P1=1, P2=1, while the received parity bits are 101. It can be seen that parity bit 1 is different. Assuming just one bit flipped during the transmission, it should be P1 itself. If it was a data bit that flipped, then more than one parity bit should have been erroneous.
The correct data bits are: 111 0111
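The decoding steps of the last three exercises can be sketched as follows. This is a hedged illustration, not part of the original solutions: the names `GROUPS` and `diagnose` are hypothetical, a single flipped bit is assumed, and the function returns every bit whose flip would explain the mismatching parity bits. Under this algorithm some pairs of data bits are guarded by the same parity bits, so more than one candidate may be returned.

```python
# Data-bit indices guarded by P0, P1 and P2.
GROUPS = [(0, 1, 3, 4, 5), (1, 2, 3, 5, 6), (0, 2, 3, 4, 6)]


def diagnose(received, odd=False):
    """Return the bits that may have flipped in a 7+3 bit block (D0..D6 P0..P2)."""
    bits = [int(ch) for ch in received.replace(" ", "")]
    data, sent = bits[:7], bits[7:]
    mismatched = set()
    for p, group in enumerate(GROUPS):
        expected = sum(data[i] for i in group) % 2
        if odd:
            expected ^= 1
        if expected != sent[p]:
            mismatched.add(p)
    if not mismatched:
        return []                                   # block assumed correct
    if len(mismatched) == 1:
        return [f"P{p}" for p in mismatched]        # the parity bit itself flipped
    # A data bit flipped: it is guarded by exactly the mismatching parity bits.
    return [f"D{i}" for i in range(7)
            if {p for p, group in enumerate(GROUPS) if i in group} == mismatched]


print(diagnose("111 0101 110"))             # exercise e
print(diagnose("101 1111 000", odd=True))   # exercise f
print(diagnose("111 0111 101", odd=True))   # exercise g
```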

h) The following table contains a list of blocks that arrived through the network. All blocks were encoded using the following odd or even algorithm:
P0 = parity (D0, D1, D3, D4, D5)
P1 = parity (D1, D2, D3, D5, D6)
P2 = parity (D0, D2, D3, D4, D6)
The correct algorithm (odd or even) is defined in the table for each block. Decode each received block to find the correct data bits of the original block, assuming no more than one bit flipped.

| No. | Block Received | Algorithm | Flipped bit | Original Block |
|---|---|---|---|---|
| 1 | 111 1101 000 | Odd | D5 | 111 1111 |
| 2 | 000 0100 111 | Odd | D0 | 100 0100 |
| 3 | 110 1001 110 | Even | P2 | 110 1001 |
| 4 | 110 0111 000 | Even | D2 | 111 0111 |
| 5 | 110 0001 000 | Even | None | 110 0001 |
| 6 | 101 1010 000 | Even | D3 | 101 0010 |
| 7 | 100 1111 010 | Odd | D3 | 100 0111 |
| 8 | 010 1111 000 | Odd | D1 | 000 1111 |
| 9 | 101 0111 000 | Odd | P2 | 101 0111 |
| 10 | 111 0110 000 | Even | D6 | 111 0111 |
| 11 | 110 0111 110 | Even | D4 | 110 0011 |
| 12 | 111 0001 011 | Even | None | 111 0001 |
| 13 | 111 0101 101 | Odd | P0 | 111 0101 |
| 14 | 101 0100 110 | Odd | P1 | 101 0100 |
| 15 | 001 1000 110 | Odd | D4 | 001 1100 |
| 16 | 011 1010 101 | Odd | D5 | 011 1000 |
| 17 | 000 0000 100 | Odd | D6 | 000 0001 |
| 18 | 111 1111 010 | Even | D0 | 011 1111 |
| 19 | 000 0010 001 | Even | D3 | 000 1010 |
| 20 | 010 0101 101 | Even | D4 | 010 0001 |
| 21 | 011 1011 110 | Even | P2 | 011 1011 |
| 22 | 011 0101 010 | Odd | D5 | 011 0111 |
| 23 | 011 0111 100 | Odd | D5 | 011 0101 |
| 24 | 010 1001 000 | Odd | D4 | 010 1101 |

2. Hamming Codes

a) Use odd Hamming codes for encoding the value 1111

Solution
According to the Hamming codes mechanism:
- P1 guards all bits with the “one” bit on in their address (bits D3, D5, D7)
- P2 guards the bits with the “two” bit on in their address (D3, D6, D7)
- P4 guards the bits with the “four” bit on in their address (D5, D6, D7)
We have to calculate the parity bits according to these rules and we’ll get:
- P1=0
- P2=0
- P4=0
The following table describes the process divided into steps.
- First line defines the addresses of the bits
- Second line adds the data bits (in the proper locations)
- The next three lines set the parity bits
- The last line is the encoded value.

| Bit | P1 | P2 | D3 | P4 | D5 | D6 | D7 |
|---|---|---|---|---|---|---|---|
| Address | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| Data |  |  | 1 |  | 1 | 1 | 1 |
| P1 | 0 |  | 1 |  | 1 |  | 1 |
| P2 |  | 0 | 1 |  |  | 1 | 1 |
| P4 |  |  |  | 0 | 1 | 1 | 1 |
| Encoded | 0 | 0 | 1 | 0 | 1 | 1 | 1 |

There is, however, another way, which some students may find easier: performing the XOR function on all the addresses of the bits that are set in the original message (3, 5, 6, 7 or in binary 011, 101, 110, 111). Such a function will produce the value 111. This value is correct for even parity. Since in this case the parity was defined as odd, the number has to be inverted and we will get 000. All that remains is placing the parity bits in their proper location in the message to get the encoded message 0010111

b) Use odd Hamming codes for encoding the value 1010

Solution
According to the Hamming codes mechanism:
- P1 guards all bits with the “one” bit on in their address (bits D3, D5, D7)
- P2 guards the bits with the “two” bit on in their address (D3, D6, D7)
- P4 guards the bits with the “four” bit on in their address (D5, D6, D7)
We have to calculate the parity bits according to these rules and we’ll get:
- P1=0
- P2=1
- P4=0
The following table describes the process divided into steps.
- First line defines the addresses of the bits
- Second line adds the data bits (in the proper locations)
- The next three lines set the parity bits
- The last line is the encoded value.

| Bit | P1 | P2 | D3 | P4 | D5 | D6 | D7 |
|---|---|---|---|---|---|---|---|
| Address | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
| Data |  |  | 1 |  | 0 | 1 | 0 |
| P1 | 0 |  | 1 |  | 0 |  | 0 |
| P2 |  | 1 | 1 |  |  | 1 | 0 |
| P4 |  |  |  | 0 | 0 | 1 | 0 |
| Encoded | 0 | 1 | 1 | 0 | 0 | 1 | 0 |

There is also the shorter way of using the XOR function on all the addresses of the bits that are set in the original message (3, 6 or in binary 011, 110). Such a function will produce the value 101. This value is correct for even parity. Since in this case the parity was defined as odd, the number has to be inverted and we will get 010. All that remains is placing the parity bits in their proper location in the message to get the encoded message 0110010
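The shorter way can be written as a small program. The following Python sketch is an illustration under stated assumptions: the function name `hamming_encode` is hypothetical, the data bits are placed from left to right in the addresses that are not powers of two, and the odd variant is obtained by inverting all the parity bits, as described above.

```python
def hamming_encode(data_bits, odd=False):
    """Encode a binary string using the XOR of the addresses of the set bits."""
    data = [int(ch) for ch in data_bits.replace(" ", "")]

    # Smallest number of parity bits r with 2**r >= len(data) + r + 1.
    n_par = 0
    while (1 << n_par) < len(data) + n_par + 1:
        n_par += 1
    total = len(data) + n_par

    code = [0] * (total + 1)           # code[address]; index 0 is unused
    remaining = iter(data)
    for address in range(1, total + 1):
        if address & (address - 1):    # not a power of two: a data position
            code[address] = next(remaining)

    parity = 0
    for address in range(1, total + 1):
        if code[address]:
            parity ^= address          # XOR of the addresses of the set bits
    if odd:
        parity ^= (1 << n_par) - 1     # invert every parity bit

    for k in range(n_par):
        code[1 << k] = (parity >> k) & 1   # P1, P2, P4, ...
    return "".join(str(bit) for bit in code[1:])


print(hamming_encode("1111", odd=True))
print(hamming_encode("1010", odd=True))
```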

c) Use even Hamming codes for encoding the value 1101 1110

Solution
Since this number contains more bits, we will need more parity bits as well. It is possible to encode the number using the longer way (with a table, as described in the previous two exercises), but we will use the shorter way. We will place the data bits in their proper location and perform a XOR between the addresses of the bits that are on (in this case addresses 3, 5, 7, 9, 10, 11).
XOR (0011 ⊕ 0101 ⊕ 0111 ⊕ 1001 ⊕ 1010 ⊕ 1011) = 1001
Since in this case the parity is defined as even, the XOR code represents the parity bits. All that remains is to construct the encoded message 1010 1011 1110

d) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes for each line that represents a data block. The specific parity to be used per each block is defined as well.

| No. | Original Data | Parity | Encoded Block |
|---|---|---|---|
| 1 | 011 1111 | Even | 000 1111 1111 |
| 2 | 011 1111 | Odd | 110 0111 0111 |
| 3 | 111 0001 | Odd | 001 1110 0001 |
| 4 | 101 1011 | Even | 111 0011 0011 |
| 5 | 000 1111 | Even | 110 1001 1111 |
| 6 | 010 0100 | Odd | 110 0100 0100 |
| 7 | 101 1100 | Odd | 001 1011 0100 |
| 8 | 101 1100 | Even | 111 0011 1100 |
| 9 | 100 1000 | Even | 001 1001 0000 |
| 10 | 111 1000 | Even | 111 1111 0000 |
| 11 | 010 1001 | Odd | 010 1101 0001 |
| 12 | 010 1001 | Even | 100 0101 1001 |
| 13 | 001 1111 | Odd | 010 1011 0111 |
| 14 | 000 0111 | Odd | 110 1000 0111 |
| 15 | 101 0101 | Even | 111 1010 0101 |

3. SECDED

a) Use even Hamming codes for encoding the value 110 1001 and in addition add an odd SECDED bit

Solution
Calculating the Hamming codes was addressed in previous exercises (either by calculating the codes using a table or using the shorter way with the XOR function). In this specific case the data 110 1001 will be encoded into 011 0101 1001. The SECDED bit, which in case of failure makes it possible to correct a single bit flip and detect double bit flips, adds a parity bit on the whole encoded block. After adding the odd SECDED parity the block will contain: 1011 0101 1001

b) The following table contains a list of blocks that have to be encoded using Hamming codes. Calculate the Hamming codes as well as the SECDED. The specific parity to be used per each block is defined as well.

| No. | Original Data | Hamming Parity | SECDED Parity | Encoded Block |
|---|---|---|---|---|
| 1 | 101 1111 | Even | Odd | 1011 0011 1111 |
| 2 | 010 1100 | Even | Even | 0110 0101 1100 |
| 3 | 001 0001 | Odd | Odd | 0010 0010 0001 |
| 4 | 100 1110 | Even | Odd | 0111 1001 0110 |
| 5 | 101 0001 | Odd | Even | 0101 0010 0001 |
| 6 | 111 0001 | Odd | Odd | 0001 1110 0001 |
| 7 | 011 0110 | Even | Odd | 1000 0110 0110 |
| 8 | 001 1010 | Even | Even | 0110 0011 1010 |
| 9 | 010 0010 | Odd | Even | 0000 0100 0010 |
| 10 | 100 1110 | Odd | Odd | 0001 0001 1110 |
| 11 | 000 0111 | Even | Odd | 1000 0000 1111 |
| 12 | 010 1110 | Even | Odd | 0100 0101 0110 |
| 13 | 111 1000 | Odd | Even | 1001 0111 1000 |
| 14 | 011 0011 | Odd | Odd | 0100 1110 1011 |

c) A message with the content: 0001 0111 0111 arrived from the network. It was encoded using odd Hamming codes and an odd SECDED. Decode it to obtain the original data.

Solution
The first step is checking the SECDED. Since it is correct, it implies that either the message is correct or that two bits were flipped. In the case two bits were flipped, the original data cannot be calculated.
The next step is calculating the Hamming codes. In this case the codes are correct. This means that the message is correct.
The third and last step is to retrieve the original data bits. In this case the data sent was: 111 1111

d) A message with the content: 1111 0010 0011 arrived from the network. It was encoded using even Hamming codes and an even SECDED. Decode it to obtain the original data.

Solution
The first step is checking the SECDED. Since it is wrong, it implies that one bit was flipped.
The next step is calculating the Hamming codes. In this case the three Hamming codes P1, P2, P4 are wrong. The only bit that is shared by these three guard bits is bit 7.
The third and last step is to retrieve the original data bits and flip bit 7. In this case the data sent was: 101 1011.

As with the previous example (of calculating the Hamming codes), there is also a short way to figure out the integrity of the message received.
Step 1 – Performing a XOR between the addresses of the data bits set in the message received:
XOR (0011 ⊕ 0110 ⊕ 1010 ⊕ 1011) = 0100
Step 2 – Performing a XOR between this value and the parity bits obtained in the message (P8 P4 P2 P1 = 0011):
XOR (0100 ⊕ 0011) = 0111
The value calculated is the address of the bit flipped (bit 7).
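The short way can be combined with the SECDED check in one function. The sketch below is an assumption-laden illustration: the name `secded_check` is hypothetical, the first bit of the block is taken as the SECDED bit, and the remaining bits are addressed from 1. It XORs the addresses of all the bits that are set (data and parity alike), which gives the same result as the two steps of the short way for even Hamming parity. For odd Hamming parity the result is inverted.

```python
def secded_check(block, hamming_odd=False, secded_odd=False):
    """Return (syndrome, secded_ok) for a received block.

    syndrome: address of the flipped bit if exactly one bit flipped, 0 if none
    secded_ok: True when the parity of the whole block is as expected
    """
    bits = [int(ch) for ch in block.replace(" ", "")]
    total = len(bits) - 1                 # bits[0] is the SECDED bit
    n_par = total.bit_length()            # parity bits at addresses 1, 2, 4, ...

    syndrome = 0
    for address in range(1, total + 1):
        if bits[address]:
            syndrome ^= address
    if hamming_odd:
        syndrome ^= (1 << n_par) - 1      # odd parity inverts every parity bit

    secded_ok = sum(bits) % 2 == (1 if secded_odd else 0)
    return syndrome, secded_ok


print(secded_check("0001 0111 0111", hamming_odd=True, secded_odd=True))  # exercise c
print(secded_check("1111 0010 0011"))                                     # exercise d
```

A zero syndrome with a correct SECDED means no error was detected. A nonzero syndrome with a wrong SECDED points at the single flipped bit. A nonzero syndrome with a correct SECDED indicates two flipped bits, and a zero syndrome with a wrong SECDED means the SECDED bit itself flipped.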

e) The following table contains a list of blocks that were received. Decode the Hamming codes as well as the SECDED. The specific parity to be used per each block is defined as well.

| No. | Received Block | Hamming Parity | SECDED Parity | Bit Flipped | Original Data |
|---|---|---|---|---|---|
| 1 | 0001 0101 0111 | Odd | Odd | D6 | 111 1111 |
| 2 | 1010 1101 1001 | Even | Even | D7 | 010 0001 |
| 3 | 1000 1000 0000 | Odd | Odd | D11 | 000 0001 |
| 4 | 0110 1100 0101 | Even | Odd | SECDED | 010 0101 |
| 5 | 0100 0111 1110 | Odd | Even | P1 | 011 1110 |
| 6 | 0001 0010 1001 | Odd | Odd | D9 | 101 0101 |
| 7 | 1110 1101 1111 | Even | Odd | D5 | 000 1111 |
| 8 | 0110 0010 1010 | Even | Even | D7 | 001 1010 |
| 9 | 1000 0111 1000 | Odd | Even | D3 | 111 1000 |
| 10 | 0100 0000 0001 | Odd | Odd | D5 | 010 0001 |
| 11 | 1011 1100 1101 | Even | Odd | D10 | 110 0111 |
| 12 | 1000 1010 1110 | Even | Odd | D9 | 001 0010 |
| 13 | 1001 1111 1011 | Odd | Even | D5 | 101 1011 |
| 14 | 1110 0111 1010 | Odd | Odd | D10 | 011 1000 |
| 15 | 1011 1101 1100 | Even | Odd | D6 | 111 1100 |
| 16 | 1101 1010 0101 | Even | Even | P2 | 101 0101 |
| 17 | 1100 0001 1111 | Even | Even | D6 | 001 1111 |
| 18 | 0110 1000 0001 | Odd | Odd | D3 | 100 0001 |
| 19 | 1010 0001 0000 | Odd | Even | D10 | 000 1010 |
| 20 | 0001 1001 0111 | Even | Odd | P8 | 100 1111 |