English Pages xiv, 342 [350] Year 2023
Sebastian Slama
Experimental Physics Compact for Scientists Mechanics, Thermodynamics, Electrodynamics, Optics & Quantum Physics
Sebastian Slama Institute of Physics University of Tübingen Tübingen, Germany
ISBN 978-3-662-67894-7
ISBN 978-3-662-67895-4 (eBook)
https://doi.org/10.1007/978-3-662-67895-4
This book is a translation of the original German edition „Experimentalphysik kompakt für Naturwissenschaftler“ by Slama, Sebastian, published by Springer-Verlag GmbH, DE in 2020. The translation was done with the help of artificial intelligence (machine translation by the service DeepL.com). A subsequent human revision was done primarily in terms of content, so that the book will read stylistically differently from a conventional translation. Springer Nature works continuously to further the development of tools for the production of books and on the related technologies to support the authors.

© Springer-Verlag GmbH Germany, part of Springer Nature 2023

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer-Verlag GmbH, DE, part of Springer Nature. The registered company address is: Heidelberger Platz 3, 14197 Berlin, Germany

Paper in this product is recyclable.
Preface
Practically all mathematical-scientific and medical subjects include a physics module in their curricula, both in Bachelor of Science and in Bachelor of Education/teaching degree programmes. The relevant lectures usually cover a wide range of physics topics in a relatively short time. Most of the lectures run for two semesters. In addition, students often have to pass a practical course in physics. This is also the case at the University of Tübingen, where students of biochemistry, bioinformatics, biology, chemistry, geoecology, geosciences, computer science, mathematics, medicine, medical informatics, medical technology, natural sciences and technology, pharmacy and environmental sciences have to take a physics module. Yet for many STEM students, physics in particular is a problem subject. Many find it difficult to follow the rapidly progressing lecture without losing the overview. The physics exam at the end of the lectures is then often a major hurdle in their studies. It is therefore all the more important to have a book in one’s hands that (a) explains the material of the lecture in a compact manner so that one can reread difficult topics, (b) contains exercises that can be used to prepare optimally for the exam, and (c) in the best case serves as a reference work for important formulas. I have tried to fulfil these three criteria with this book. The chapters of the book cover all subfields of physics that are relevant in the lecture “Experimental Physics for Natural Scientists”. These are (1) mechanics of rigid bodies, (2) continuum mechanics (including elastomechanics and hydrodynamics), (3) wave physics (including sound waves), (4) thermodynamics, (5) electrostatics, (6) magnetostatics, (7) electrodynamics, (8) electronics, (9) optics, and (10) quantum physics. Relevant to the physics practical courses is the chapter on physical measurements, which describes how to deal correctly with measurement uncertainties.
Despite this abundance of topics, the book has remained relatively compact with about 350 pages. It also contains 85 exam and exercise problems with solutions. The problems are marked by grey boxes. Examples and important definitions are also printed in grey boxes. The book contains 133 graphics. Each chapter concludes with a summary of all the important formulas of the corresponding subfield of physics for reference. I believe that this book can be useful to all STEM students in mastering physics, and I hope that you will enjoy it. For ease of reading, we use the generic masculine form throughout most of this book. This always implies both forms, thus including the feminine form.
Finally, I must express my gratitude, especially to PD Dr. Roland Speith, from whom I took over the lecture “Experimental Physics for Natural Scientists” at the University of Tübingen in the summer semester 2016 after many years of collaboration, and to Mr. Uwe Pettke, who contributes greatly to the success of the lecture with his experiments. I would also like to thank the project “Successful Studying in Tübingen” (ESIT), funded by the Federal Ministry of Education and Research, in the framework of which I have been able to devote myself specifically to improving teaching in the STEM subjects since 2011. Furthermore, I would like to thank Springer-Verlag and especially Ms. Maly for the cooperation and the publishing of my book. And last but not least, I would like to thank the harshest critics of my work, namely my wife Heidi and our children Verena, Maximilian, Valentin and Lucia, who always bring me back to the ground of non-physical facts.
Notes on the Second Edition

A few errors had crept into the first edition. I hope that by the revisions in the second edition I caught all of them, or at least a large part of them, and did not cause new errors. I also added a few things. In the “Thermodynamics” chapter, there are now sections on heat transport and phase transitions, as well as a note on molar quantities. In the chapter “Magnetostatics”, the vector potential is introduced in an outlook. I moved the small tutorial on complex numbers from “Quantum Mechanics” to “Electronics”, as a basis for the new section on complex impedance. In the “Optics” chapter there is now an outlook on X-ray diagnostics, and in the “Fundamentals of Quantum Physics” there is an outlook on alternative interpretations of quantum mechanics. There is also now an appendix of physical constants at the end of the book. I hope you will benefit even more from the second edition than from the first, and I hope you enjoy reading it.

Tübingen, Germany
October 2019
Sebastian Slama
Contents

1 Physical Quantities and Measurements
  1.1 Physical Quantities
  1.2 Measurement Uncertainty
    1.2.1 Statistical Uncertainty
    1.2.2 Histograms and Distributions
    1.2.3 Gaussian Error Propagation
      1.2.3.1 Addition and Subtraction of Measured Quantities
      1.2.3.2 Multiplication of Measured Quantities
      1.2.3.3 Division of Measured Quantities
    1.2.4 Linear Regression
      1.2.4.1 Example: Disintegration of Beer Foam
  1.3 Physical Measurements: Compact

2 Mechanics of Rigid Bodies
  2.1 Kinematics
    2.1.1 Position, Velocity and Acceleration
    2.1.2 Determination of the Position from the Acceleration
    2.1.3 Motion in Three Dimensions
      2.1.3.1 Oblique Projectile Motion
    2.1.4 Circular Motion
  2.2 Forces
    2.2.1 Newton’s Axioms
    2.2.2 Gravitational Force
    2.2.3 Spring Force
      2.2.3.1 Combination of Springs
    2.2.4 Pseudo Forces
      2.2.4.1 Inertia Force
      2.2.4.2 Centrifugal Force
      2.2.4.3 Coriolis Force
    2.2.5 Frictional Forces
      2.2.5.1 Inclined Plane
    2.2.6 Equation of Motion
  2.3 Conserved Quantities in Mechanics
    2.3.1 Work, Energy, Potential and Power
      2.3.1.1 Potential Energy
      2.3.1.2 Elastic Energy
      2.3.1.3 Kinetic Energy
      2.3.1.4 Power
      2.3.1.5 Potential
      2.3.1.6 Outlook: General Definition of Work
    2.3.2 Straight Motion: Momentum and Collisions
      2.3.2.1 Central Collision
    2.3.3 Circular Motion: Angular Momentum and Torque
  2.4 Statics and Equilibria
    2.4.1 Statics of Translation
      2.4.1.1 Suspension of a Lamp from the Suspension Cable
    2.4.2 Statics of Rotation
      2.4.2.1 Leverage Law
      2.4.2.2 Equilibria of Extended Bodies
      2.4.2.3 Centre of Mass
  2.5 Rotation of Extended Bodies
    2.5.1 Moment of Inertia
      2.5.1.1 Moments of Inertia of Special Bodies and Steiner’s Theorem
    2.5.2 Rotational Energy
  2.6 Mechanics: Compact

3 Continuum Mechanics
  3.1 Elastic Deformations of Solid Bodies
    3.1.1 Elongation and Compression
    3.1.2 Bending
    3.1.3 Shear
    3.1.4 Torsion
    3.1.5 Beyond the Elastic Range: Fracture
  3.2 Hydrostatics
    3.2.1 Pressure in Liquids and Gases
      3.2.1.1 Gravity Pressure
      3.2.1.2 Plunger Pressure
      3.2.1.3 Pressure Measurement with the U-Tube
      3.2.1.4 Compression of Liquids
    3.2.2 Buoyancy Force: Swimming, Floating, Sinking
    3.2.3 Interfaces of Liquids
      3.2.3.1 Force on Edge Lines and Overpressure
      3.2.3.2 Adhesion and Cohesion
    3.2.4 Aerostatics
      3.2.4.1 Barometric Pressure Law
  3.3 Hydrodynamics
    3.3.1 Continuity Equation
    3.3.2 Bernoulli Effect
      3.3.2.1 Pitot Tube
      3.3.2.2 Aerodynamic Lift Force on the Aircraft Wing
      3.3.2.3 Magnus Effect
      3.3.2.4 Hydrodynamic Paradox
      3.3.2.5 Torricelli’s Law
    3.3.3 Flow of Real Fluids
      3.3.3.1 Application: Stokes Law of Friction
      3.3.3.2 Flows in Tubes: Hagen-Poiseuille Law
    3.3.4 Turbulence
  3.4 Continuum Mechanics: Compact

4 Oscillations and Waves
  4.1 Harmonic Oscillation
    4.1.1 Examples of Harmonic Oscillations
      4.1.1.1 Spring Pendulum
      4.1.1.2 Thread Pendulum
    4.1.2 Driven Harmonic Oscillator with Damping
      4.1.2.1 Damped Harmonic Oscillator
      4.1.2.2 Driven Harmonic Oscillator
  4.2 Harmonic Wave
    4.2.1 Interference
      4.2.1.1 Standing Waves
      4.2.1.2 Beatings
    4.2.2 Sound Waves
      4.2.2.1 Sound Perception
    4.2.3 Doppler Effect and Supersonic Speed
  4.3 Oscillations and Waves: Compact

5 Thermodynamics
  5.1 Basic Concepts of Thermodynamics
    5.1.1 Heat Transport
    5.1.2 Thermal Expansion
  5.2 Ideal Gas
    5.2.1 Molar Quantities
    5.2.2 Internal Energy
    5.2.3 First Law of Thermodynamics
    5.2.4 Changes of State in the Ideal Gas
      5.2.4.1 Isochoric Change of State
      5.2.4.2 Isobaric Change of State
      5.2.4.3 Isothermal Change of State
      5.2.4.4 Adiabatic Change of State
  5.3 Entropy and Reversibility
    5.3.1 Example of Irreversible Expansion
    5.3.2 Entropy
    5.3.3 Heat Engines
      5.3.3.1 Carnot Cycle
  5.4 Phase Transitions
  5.5 Thermodynamics – Compact

6 Electrostatics
  6.1 Electric Charges, Forces and Fields
    6.1.1 Forces Between Charges: Coulomb’s Law
    6.1.2 Electric Field and Electric Force on Charges
      6.1.2.1 Electric Field Lines
  6.2 Gauss’ Theorem Calculation of Electric Fields
    6.2.1 Spherically Symmetrical Charge Distribution with Charge Q
    6.2.2 Long Straight Charged Wire
    6.2.3 Infinitely Extended, Homogeneously Charged Plate
    6.2.4 Plate Capacitor
  6.3 Electrostatic Potential and Electric Voltage
    6.3.1 Examples of Electrical Potentials
      6.3.1.1 Potential of a Spherically Symmetrical Charge Distribution
      6.3.1.2 Potential of an Infinitely Long Straight Charged Wire
      6.3.1.3 Voltage in the Plate Capacitor
    6.3.2 Energy in the Electric Field
  6.4 Matter in the Electric Field
    6.4.1 Electrical Conductors and Currents
      6.4.1.1 Ohm’s Law
      6.4.1.2 Electrical Power
      6.4.1.3 Emission of Electrons from Metals
    6.4.2 Electrical Insulators and Dipoles
      6.4.2.1 Electric Dipole
      6.4.2.2 Dielectricity
  6.5 Electrostatics – Compact

7 Magnetostatics
  7.1 Permanent Magnets and Magnetic Field Lines
  7.2 Lorentz Force and Definition of the B-Field
    7.2.1 Trajectories of Charged Particles in the Homogeneous B-Field
    7.2.2 Lorentz Forces on Electrical Conductors
    7.2.3 Hall Effect
  7.3 Ampère’s Law and Calculation of B-Fields
    7.3.1 Current-Carrying Straight Wire
    7.3.2 Circular Current – Magnetic Dipole Moment
    7.3.3 Long Thin Coil
    7.3.4 Magnetic Flux
    7.3.5 Energy in the Magnetic Field
    7.3.6 Outlook: Vector Potential
  7.4 Magnetism in Matter
    7.4.1 Diamagnetism
    7.4.2 Paramagnetism
    7.4.3 Ferromagnetism and Hysteresis
  7.5 Magnetostatics – Compact

8 Electrodynamics
  8.1 Relationship Between Electric and Magnetic Fields
  8.2 Induction Laws
    8.2.1 Induction of Magnetic Fields
    8.2.2 Induction of Electric Fields – Faraday’s Law of Induction
    8.2.3 Lenz’s Rule
      8.2.3.1 Eddy Currents
    8.2.4 Generation of Alternating Current – The Dynamo
    8.2.5 Self-Induction
  8.3 Maxwell Equations
  8.4 Electrodynamics – Compact

9 Electronics
  9.1 Passive Components and Alternating Current
    9.1.1 Alternating Current and Alternating Current Resistance
    9.1.2 Alternating Current in the Capacitor
    9.1.3 Alternating Current in the Coil
  9.2 Electrical Networks and Circuits
    9.2.1 Parallel and Series Connections
    9.2.2 Complex Impedance
      9.2.2.1 Outlook: Calculating with Complex Numbers
    9.2.3 Special Circuits
      9.2.3.1 Voltage Divider
      9.2.3.2 RC Filter
      9.2.3.3 RL Filter
      9.2.3.4 Transformer
    9.2.4 Switching On and Off in the RC and RL Circuits
      9.2.4.1 RC Circuit
      9.2.4.2 RL Circuit
  9.3 Electric Resonant Circuit
    9.3.1 Electric Resonant Circuit with Damping
    9.3.2 RCL Bandpass Filter
    9.3.3 Hertzian Dipole – Radiation of Electromagnetic Waves
  9.4 Electronics – Compact

10 Optics
  10.1 Electromagnetic Waves
    10.1.1 Waves in Matter: Refractive Index
      10.1.1.1 Outlook: Superluminal Velocity in Media
      10.1.1.2 Dispersion
    10.1.2 Absorption and Scattering
      10.1.2.1 Outlook: X-Ray Diagnostics
  10.2 Geometrical Optics
    10.2.1 Refraction and Reflection at Interfaces
    10.2.2 Optical Imaging
      10.2.2.1 Lenses
      10.2.2.2 Lens Equation
      10.2.2.3 Imaging with the Eye
  10.3 Polarization of Light
    10.3.1 Linear Polarization
    10.3.2 Circular Polarization
    10.3.3 Elliptical Polarization
    10.3.4 Unpolarized Light
    10.3.5 Polarizing Elements and Effects
      10.3.5.1 Polarizers
      10.3.5.2 Polarisation by Light Scattering
      10.3.5.3 Polarization by Brewster Reflection
    10.3.6 Birefringence
      10.3.6.1 Outlook: Changing the Polarization with Wave Plates
    10.3.7 Optical Activity
  10.4 Wave Optics
    10.4.1 Interference of Light
      10.4.1.1 Outlook: Light Rays as Interference Patterns
      10.4.1.2 Coherence
      10.4.1.3 Mach-Zehnder Interferometer
      10.4.1.4 Thin Film Interference
    10.4.2 Diffraction
      10.4.2.1 Huygens’ Principle
      10.4.2.2 Diffraction at the Single Slit
      10.4.2.3 Diffraction at the Double Slit
      10.4.2.4 Diffraction at the Grating
    10.4.3 Resolving Power of Optical Images
      10.4.3.1 Rayleigh Criterion
      10.4.3.2 Abbe Criterion
  10.5 Quantum Optics
    10.5.1 Properties of Photons
    10.5.2 Experimental Detection of the Photon
      10.5.2.1 Black-Body Radiation
      10.5.2.2 Photoelectric Effect
      10.5.2.3 Compton Scattering
      10.5.2.4 Photon Statistics in Parametric Fluorescence
  10.6 Optics: Compact

11 Fundamentals of Quantum Physics
  11.1 Properties of Quantum Objects
    11.1.1 Wave-Particle Duality
    11.1.2 Copenhagen Interpretation
      11.1.2.1 Electron Diffraction at the Double Slit Experiment
      11.1.2.2 Outlook: Alternative Interpretations of Quantum Mechanics
    11.1.3 Heisenberg’s Uncertainty Principle
    11.1.4 Superposition
    11.1.5 Entanglement
      11.1.5.1 EPR Experiment
      11.1.5.2 Schrödinger’s Cat
  11.2 Atomic Physics
    11.2.1 Atomic Model According to Bohr
      11.2.1.1 Absorption and Emission
    11.2.2 Atomic Model According to Schrödinger
      11.2.2.1 Principal Quantum Number n
      11.2.2.2 Angular Momentum Quantum Number l
      11.2.2.3 Magnetic Quantum Number m
      11.2.2.4 Corrections to Schrödinger’s Atomic Model
    11.2.3 Elements and Periodic Table
      11.2.3.1 Spin Quantum Number m_s and Pauli Principle
      11.2.3.2 Occupation of the Orbitals
      11.2.3.3 Periodic Table
  11.3 Nuclear Physics
    11.3.1 Structure of the Atomic Nucleus
    11.3.2 Decays of Atomic Nuclei
      11.3.2.1 α-Decay
      11.3.2.2 β⁻-Decay
      11.3.2.3 β⁺-Decay
      11.3.2.4 Proton and Neutron Emission
      11.3.2.5 γ-Decay
      11.3.2.6 Energy in Nuclear Decay
      11.3.2.7 Decay Law and Activity
      11.3.2.8 Biological Effects of Radioactivity
    11.3.3 Nuclear Fusion
  11.4 Quantum Physics: Compact

12 Appendix: Physical Constants

Index
1 Physical Quantities and Measurements

1.1 Physical Quantities
© Springer-Verlag GmbH Germany, part of Springer Nature 2023
S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_1

Physics investigates and describes fundamental phenomena in nature, such as the movements of bodies, electrical, magnetic and optical phenomena, or the basic building blocks of matter, to name just a few examples. Physicists claim to pursue a quantitative science: theoretical models are used to make quantitative predictions, and experiments then test these predictions quantitatively. Sometimes it is found that the model does not describe reality with sufficient precision or generality. In this case, theoretical physics is called upon to develop a new and hopefully better model. The new model must then also be tested experimentally, which is again the task of experimental physics. In this way, theory and experiment have been in a cycle for centuries, extending our understanding of the laws of nature ever further.

In order to make quantitative statements in physics, physical quantities are needed, for example, time t, distance s or velocity v. Some quantities are related to each other by a theoretical model, which can be described by formulas, for example, v = s/t, which we will discuss in more detail in Sect. 2.1.1. Measuring a physical quantity always means comparing the object to be measured with a reference object and then expressing the measured value in units of the reference quantity. A length, for example, is measured by comparing it with a ruler and stating the result in the unit of the ruler, that is, in cm or mm. Each measurement result thus consists of three parts: the measurand, the numerical value and the unit. For example, the following specification of a length is correct:

$$l = 5.7\ \mathrm{cm}$$
$$\text{measured quantity} = \text{numerical value} \times \text{unit}$$

If you want to say which unit a measurand has, use square brackets: [l] = cm.
To ensure that all people on earth use the same reference system, and thus to make troublesome conversions between systems unnecessary, an international system of units, the SI (Système international d'unités), was introduced; its base units were agreed upon in 1954, and the name SI was adopted in 1960. Only a few countries, notably the USA and Myanmar, have not adopted the SI system as their official system of units. This is one of the reasons why distances in the USA are still given in inches, feet, yards and miles and not in metres or kilometres. This is not expected to change in the near future. The SI system now defines seven base units from which all other physical units can be derived. Special definitions have been agreed upon which determine the base units as precisely as possible:

| Physical quantity | Symbol | Unit | Unit symbol |
|---|---|---|---|
| Length | $l$ | Metre | m |
| Time | $t$ | Second | s |
| Mass | $m$ | Kilogram | kg |
| Electric current | $I$ | Ampere | A |
| Temperature | $T$ | Kelvin | K |
| Amount of substance | $n$ | Mole | mol |
| Luminous intensity | $I_v$ | Candela | cd |
We do not go into the individual definitions of the units in more detail here, but they are, as far as possible, defined in such a way that they can be traced back to natural constants. Until 2019, one exception was the kilogram, which was defined by a mass stored in Paris, the so-called international prototype kilogram. Since it was discovered that the mass of the prototype had changed over the years, it was agreed to redefine the unit. Now the kilogram is defined via Planck's constant.

Example: The Second
One second is 9 192 631 770 times the period of the radiation corresponding to the transition between the two hyperfine structure levels of the ground state of atoms of the caesium isotope $^{133}$Cs. Here, the caesium atom with its period establishes a time scale, just like the pendulum swing in a grandfather clock or the oscillating quartz in a wristwatch. Each period of time (and thus also the second) can then be measured in units of this time scale. Since every caesium atom in the universe behaves in the same way, the second is unambiguously fixed by this definition. In Germany, time is determined at the Physikalisch-Technische Bundesanstalt in Braunschweig with the help of caesium atomic clocks. However, this definition is no longer precise enough according to the current state of the art, as it is now possible to measure time intervals much more precisely. Scientists are therefore currently working on redefining the time standard using new types of optical atomic clocks (e.g., so-called optical lattice clocks or ion clocks).
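Since one often wants to indicate multiples or fractions of the base units, for example, a length of 1 km instead of 1000 m, appropriate designations and abbreviations have been agreed upon:

| Multiple | Symbol | Factor | Fraction | Symbol | Factor |
|---|---|---|---|---|---|
| Kilo | k | $10^{3}$ | Milli | m | $10^{-3}$ |
| Mega | M | $10^{6}$ | Micro | μ | $10^{-6}$ |
| Giga | G | $10^{9}$ | Nano | n | $10^{-9}$ |
| Tera | T | $10^{12}$ | Pico | p | $10^{-12}$ |
| Peta | P | $10^{15}$ | Femto | f | $10^{-15}$ |
| Exa | E | $10^{18}$ | Atto | a | $10^{-18}$ |

1.2 Measurement Uncertainty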
Every measurement of a quantity X (measured value x) is in principle subject to errors, that is, to a measurement uncertainty. This is obvious, for example, when we read the position of an analogue pointer with respect to a measuring scale; however, it also applies to measuring instruments with a digital display, because there the measurement uncertainty is at least as large as one unit of the last digit displayed. In principle, a distinction is made between the following types of uncertainty:

• Systematic uncertainty $\Delta x_\mathrm{sys}$: One speaks of systematic uncertainties whenever the measured value is systematically shifted compared to the true value of the quantity. For example, when measuring the outdoor temperature, if the thermometer is placed too close to the house, one will always measure too high temperatures in winter, because the heat escaping from the house influences the thermometer. In addition, systematic errors can be caused by a meter being incorrectly calibrated and always reading, say, 5% too much. In car speedometers this is often done on purpose. Systematic errors can also occur when reading an analogue display, for example, if the pointer is read obliquely from the side, so that the viewing angle shifts its apparent position on the scale. Since systematic errors depend very specifically on the measuring method and the measuring devices and are sometimes also difficult to identify, we will not go into them further here.

• Statistical uncertainty $\Delta x_\mathrm{stat}$: Statistical uncertainties are characterized by the fact that the measured value varies statistically around the true value of the quantity, or around the value shifted by systematic errors. The causes of these random fluctuations can be rooted in the measurand itself: if, for example, the number of bacteria in several Petri dishes prepared in the same way is counted, the true values already fluctuate around the mean number of bacteria per dish. Statistical errors also occur, however, in particular due to random inaccuracies in the measurement. The good thing is that this type of uncertainty can be accounted for by statistical methods by repeating a measurement a sufficient number of times.

A measurand including systematic and statistical uncertainty is reported as follows:

$$X = x \pm \Delta x_\mathrm{sys} \pm \Delta x_\mathrm{stat}. \tag{1.1}$$
Often one prefers to calculate the relative uncertainty:

$$X = x \pm \frac{\Delta x_\mathrm{sys}}{x} \pm \frac{\Delta x_\mathrm{stat}}{x}, \tag{1.2}$$

where the relative uncertainty is typically expressed as a percentage. How to determine the statistical uncertainty $\Delta x_\mathrm{stat}$ from many measurements is explained in Sect. 1.2.1.

Example: Relative Uncertainty
A measurement has the result X = 1.22 m ± 0.5 cm. The relative error is therefore

$$\frac{\Delta x}{x} = \frac{0.5\ \mathrm{cm}}{122\ \mathrm{cm}} = 0.004 = 0.4\%. \tag{1.3}$$

The important thing here is to convert x and Δx to the same unit so that the units cancel in the fraction.
1.2.1 Statistical Uncertainty
The starting point for the determination of statistical uncertainties is the repetition of a measurement (as often as possible). Here, one tries to keep all parameters that could influence the measurement constant, so that the measured value is not disturbed by changing parameters. This results in a data set of a total of N measurements with the individual results $x_i$, $i \in [1, N]$. Since it is assumed that the individual measured values vary statistically around the true value (apart from systematic errors), the mean value μ (the arithmetic mean) of the measured values will be close to the true value. Angle brackets are used to denote a mean value:

$$\mu = \langle x_i \rangle = \frac{1}{N}\left(x_1 + x_2 + \ldots + x_N\right) = \frac{1}{N}\sum_{i=1}^{N} x_i. \tag{1.4}$$
In general, one would also like to know how well defined the mean is. The standard deviation σ of the measurement series provides a first indication. It describes the average deviation of the measured values from the mean value:

$$\sigma = \sqrt{\frac{1}{N-1}\left[(x_1-\mu)^2 + (x_2-\mu)^2 + \ldots + (x_N-\mu)^2\right]} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i-\mu)^2}. \tag{1.5}$$
It thus indicates the range around the mean in which an individual measurement can typically lie. The value $\mathrm{Var} = \sigma^2$ is also referred to as the variance. If the mean value and the standard deviation have been determined from a sufficient number of normally distributed measured values, about 68% of all measured values lie within the range $[\mu - \sigma, \mu + \sigma]$ around the mean value, and as many as about 95% of all measured values lie within the range $[\mu - 2\sigma, \mu + 2\sigma]$. The standard deviation does not change systematically when even more values are measured, because the new values are also distributed around the mean value according to the standard deviation. The uncertainty of the mean value Δμ, on the other hand, decreases with the number of measurements:

$$\Delta\mu = \frac{\sigma}{\sqrt{N}}. \tag{1.6}$$

The more measurements you make, the more precisely you can determine the middle of the distribution. These quantities can now be used to specify the error of a measured quantity. Which quantity is given as the measurement error depends on the context. Whenever you want to specify the range over which a measured value scatters, you specify the standard deviation as the error:

$$X = \mu \pm \sigma. \tag{1.7}$$

An example of this would be the indication of the average monthly temperature in a country. What is relevant here is the range in which the temperature can fluctuate around the mean value, that is, the standard deviation. If, on the other hand, you want to determine a physical quantity as precisely as possible, you specify the uncertainty of the mean value as the error:

$$X = \mu \pm \Delta\mu. \tag{1.8}$$
An example of this would be the measurement of the speed of light. Here, one is not interested in the range in which the measured values fluctuate, but one is interested in the mean value and would like to determine this as accurately as possible.
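The three quantities of this section can be computed with a few lines of code. The following is a minimal sketch in Python; the function names and the five length readings are hypothetical and serve only as an illustration of (1.4), (1.5) and (1.6).

```python
import math


def mean(values):
    """Arithmetic mean, eq. (1.4)."""
    return sum(values) / len(values)


def standard_deviation(values):
    """Standard deviation with the 1/(N-1) prefactor, eq. (1.5)."""
    mu = mean(values)
    n = len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / (n - 1))


def uncertainty_of_mean(values):
    """Uncertainty of the mean value, eq. (1.6)."""
    return standard_deviation(values) / math.sqrt(len(values))


# Hypothetical repeated length measurements in cm (illustrative data only).
lengths_cm = [5.6, 5.8, 5.7, 5.9, 5.5]

mu = mean(lengths_cm)
sigma = standard_deviation(lengths_cm)
delta_mu = uncertainty_of_mean(lengths_cm)

print(f"range of single values: X = {mu:.2f} cm +- {sigma:.2f} cm")
print(f"best estimate:          X = {mu:.2f} cm +- {delta_mu:.2f} cm")
```

The first printed line corresponds to the form (1.7), the second to the form (1.8).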
1.2.2 Histograms and Distributions
Mean and standard deviation can be graphically illustrated by plotting the number of measurements in a certain range against the range. Such a plot is called a histogram. In Fig. 1.1 we see an example of the result of an old physics exam as a histogram.
Fig. 1.1 Histogram of the points in a physics exam. The mean value is μ = 20.02 points, the standard deviation is σ = 5.5 points. The blue lines indicate the mean and delimit the ranges [μ - σ, μ + σ] and [μ - 2σ, μ + 2σ], respectively. The graph also shows the corresponding Gaussian distribution as a red curve (right axis scale)
The horizontal axis indicates the points achieved and the vertical axis the number of persons who achieved this number of points. The middle of the distribution (the mean) and the width of the distribution (the standard deviation) can be clearly seen. There is also a statistical variation; for example, there seem to be rather too few people with 24 points and rather too many with 19 points. This graininess of the histogram is quite normal and is due to the fact that only a limited number of people participated in the exam. The larger the number of data, the smoother the curve formed by the histogram. In the limit of an infinite number of measured values, $N \to \infty$, one then obtains the distribution function f(x). This is normalized (i.e., multiplied by a factor) so that it can be understood as a probability density: f(x) dx is the probability with which a measurement value lies in the range [x, x + dx]. The probability $P(x_1 < x < x_2)$ of measuring a value in the range between $x_1$ and $x_2$ is thus given by the integral

$$P(x_1 < x < x_2) = \int_{x_1}^{x_2} f(x)\,\mathrm{d}x. \tag{1.9}$$

Due to normalization, the probability of measuring any value in the range between $-\infty$ and $+\infty$ is one:

$$P(-\infty < x < \infty) = \int_{-\infty}^{+\infty} f(x)\,\mathrm{d}x = 1. \tag{1.10}$$
There are several distribution functions, the most important of which is the Gaussian distribution (or normal distribution), which occurs whenever the deviations are statistically random and each measurement is independent of all other measurements. The Gaussian distribution is

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}. \tag{1.11}$$
It is determined by its mean value μ and the standard deviation σ, which determines the width of the distribution function. In Fig. 1.1, the corresponding Gaussian distribution with mean value μ = 20.02 and standard deviation σ = 5.5 is plotted in addition to the histogram.
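The statements about normalization and about the 68% and 95% ranges can be checked numerically. The sketch below, in Python, integrates the Gaussian (1.11) according to (1.9) with a simple trapezoidal rule; the helper names and the number of integration steps are assumptions made for illustration, and the values of μ and σ are those quoted for Fig. 1.1.

```python
import math


def gaussian(x, mu, sigma):
    """Gaussian distribution, eq. (1.11)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)


def probability(x1, x2, mu, sigma, steps=10_000):
    """Trapezoidal approximation of the integral in eq. (1.9)."""
    dx = (x2 - x1) / steps
    total = 0.5 * (gaussian(x1, mu, sigma) + gaussian(x2, mu, sigma))
    for k in range(1, steps):
        total += gaussian(x1 + k * dx, mu, sigma)
    return total * dx


mu, sigma = 20.02, 5.5  # values quoted for Fig. 1.1

p_1sigma = probability(mu - sigma, mu + sigma, mu, sigma)
p_2sigma = probability(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)
# The infinite limits of eq. (1.10) are replaced by +-10 sigma as an approximation.
p_all = probability(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)

print(f"P(within 1 sigma) = {p_1sigma:.4f}")
print(f"P(within 2 sigma) = {p_2sigma:.4f}")
print(f"P(within 10 sigma) = {p_all:.4f}")
```

The first two results reproduce the approximate 68% and 95% ranges, and the last one approaches 1 as required by the normalization.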
1.2.3 Gaussian Error Propagation
In practice, one often has to calculate another physical quantity z from one or more measured values $y_i$. Here the $y_i$ do not denote the repeated measured values of one quantity, but different quantities, each of which was measured with an uncertainty $\Delta y_i$. Now the question arises how the measurement uncertainties $\Delta y_i$ influence the uncertainty Δz of the calculated quantity. This depends primarily on how the quantity z depends on the measured quantities, $z = f(y_1, y_2, \ldots)$. According to Gauss, the uncertainty of the calculated quantity is given by

$$\Delta z = \sqrt{\left(\frac{\partial f}{\partial y_1}\Delta y_1\right)^2 + \left(\frac{\partial f}{\partial y_2}\Delta y_2\right)^2 + \ldots}. \tag{1.12}$$

In this formula we introduce the partial derivative $\frac{\partial f}{\partial y_i}$. It means that the function f, which depends on several variables $y_j$, is differentiated with respect to the variable $y_i$, treating all other variables $y_j$, $j \neq i$, as constant parameters. In Sects. 1.2.3.1, 1.2.3.2 and 1.2.3.3 we deal with the most important special cases; for the sake of simplicity we always consider only the two measured variables $y_1$ and $y_2$. However, the results can easily be generalized.
1.2.3.1 Addition and Subtraction of Measured Quantities

When the measured quantities $y_1$ and $y_2$ are added or subtracted, the composite quantity is $z = f(y_1, y_2) = y_1 \pm y_2$. With the two derivatives

$$\frac{\partial f}{\partial y_1} = +1, \qquad \frac{\partial f}{\partial y_2} = \pm 1 \tag{1.13}$$

you get the compound uncertainty

$$\Delta z = \sqrt{(1 \cdot \Delta y_1)^2 + (\pm 1 \cdot \Delta y_2)^2} = \sqrt{\Delta y_1^2 + \Delta y_2^2}. \tag{1.14}$$
1.2.3.2 Multiplication of Measured Quantities

If the two measured variables $y_1$ and $y_2$ are multiplied, $z = f(y_1, y_2) = y_1 y_2$. The two derivatives are now

$$\frac{\partial f}{\partial y_1} = y_2, \qquad \frac{\partial f}{\partial y_2} = y_1. \tag{1.15}$$

In these derivatives, the quantity with respect to which one is not differentiating is regarded as a constant prefactor. This results in

$$\Delta z = \sqrt{(y_2 \Delta y_1)^2 + (y_1 \Delta y_2)^2}. \tag{1.16}$$

In multiplication, it is often easier to calculate with the relative uncertainties $\frac{\Delta y_1}{y_1}$ and $\frac{\Delta y_2}{y_2}$. The relative composite uncertainty is then given by

$$\frac{\Delta z}{z} = \frac{\sqrt{(y_2 \Delta y_1)^2 + (y_1 \Delta y_2)^2}}{y_1 y_2} = \sqrt{\left(\frac{\Delta y_1}{y_1}\right)^2 + \left(\frac{\Delta y_2}{y_2}\right)^2}. \tag{1.17}$$
The relative uncertainty is thus composed in the same way during multiplication as the absolute uncertainty is composed during the addition of the measured quantities.
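1.2.3.3 Division of Measured Quantities

When the measured quantities are divided, $z = f(y_1, y_2) = \frac{y_1}{y_2}$. The derivatives are

$$\frac{\partial f}{\partial y_1} = \frac{1}{y_2}, \qquad \frac{\partial f}{\partial y_2} = -\frac{y_1}{y_2^2}. \tag{1.18}$$

Thus, after a short calculation, the compound uncertainty is

$$\Delta z = \frac{\sqrt{(y_2 \Delta y_1)^2 + (y_1 \Delta y_2)^2}}{y_2^2}. \tag{1.19}$$

Again, it is easier to calculate with the relative uncertainties:

$$\frac{\Delta z}{z} = \sqrt{\left(\frac{\Delta y_1}{y_1}\right)^2 + \left(\frac{\Delta y_2}{y_2}\right)^2}. \tag{1.20}$$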
Example: Speed of Light
When measuring the speed of light, a light pulse travels over a distance of l = 10 m ± 0.7%. The time required for this is determined as t = 34 ns ± 1.2%. The speed of light is then given by c = l/t = 10 m/34 ns ≈ 2.94 × 10^8 m/s. To estimate the uncertainty, we need the error propagation for the division of measured quantities according to formula (1.20):

$$\frac{\Delta c}{c} = \sqrt{\left(\frac{\Delta l}{l}\right)^2 + \left(\frac{\Delta t}{t}\right)^2} = \sqrt{0.007^2 + 0.012^2} = 0.014 = 1.4\%. \tag{1.21}$$

The relative uncertainty in the measured speed of light is therefore 1.4%. The absolute uncertainty is thus given by Δc = 0.014 c ≈ 0.04 × 10^8 m/s.
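The general formula (1.12) can also be evaluated numerically without working out the partial derivatives by hand. The following Python sketch approximates the derivatives by central finite differences; the function name `propagate` and the step size are assumptions for illustration, and the input values are those of the speed-of-light example.

```python
import math


def propagate(f, values, uncertainties, rel_step=1e-6):
    """Gaussian error propagation, eq. (1.12), with numerical partial derivatives."""
    total = 0.0
    for i, (y, dy) in enumerate(zip(values, uncertainties)):
        h = abs(y) * rel_step if y != 0 else rel_step
        up = list(values)
        up[i] = y + h
        down = list(values)
        down[i] = y - h
        # Central difference: all other variables are kept constant.
        dfdy = (f(*up) - f(*down)) / (2 * h)
        total += (dfdy * dy) ** 2
    return math.sqrt(total)


length = 10.0               # m
time = 34e-9                # s
d_length = 0.007 * length   # 0.7 % relative uncertainty
d_time = 0.012 * time       # 1.2 % relative uncertainty

c = length / time
d_c = propagate(lambda l, t: l / t, [length, time], [d_length, d_time])

print(f"c = {c:.3e} m/s, relative uncertainty = {100 * d_c / c:.1f} %")
```

The relative uncertainty obtained in this way agrees with the 1.4% from (1.21), as expected for a quotient.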
1.2.4 Linear Regression
For some measurements, one would like to check which functional relationship exists between the measured quantity x and a variable parameter p, for example, how the location x of a wagon changes with time t at constant speed v of the wagon. One therefore measures the quantity $x_i$ for different parameter values $p_i$, obtaining a total number of N data points $(p_i, x_i)$, $i \in [1, N]$. These data points are then to be described by a theory curve, that is, by a function $x = f(p, \alpha_j)$, $j \in [1, m]$. This function generally contains further constant parameters $\alpha_j$ in addition to the varied parameter p. In the example, the time t plays the role of p, and the velocity v is such a constant parameter. The aim is now to find the values of $\alpha_j$ for which the theory curve best fits the measured values. By "best" is meant here that the sum of the squared deviations of the data values $x_i$ from the function $f(p_i, \alpha_j)$ is minimal. Thus, one searches for the minimum

$$\min_{\alpha_j} \sum_{i=1}^{N} \left(f(p_i, \alpha_j) - x_i\right)^2. \tag{1.22}$$
This is also referred to as the method of least squares. Since the parameters $\alpha_j$ can take on any values, it is generally not possible to find an analytical solution for the best $\alpha_j$. Typically, one uses computer programs that vary the values $\alpha_j$ until the squared deviation reaches a local minimum. In fact, however, there is an analytical solution if there is a linear relationship between p and x. In this case, it is called linear regression. Often, linear regression is even possible when there is initially no linear relationship between p and x, but one can establish a linear relationship by substituting the parameter or measurand. An example of this is discussed below. For now, we assume in the following that the relationship is linear:

$$x(p) = \alpha_1 + \alpha_2\, p. \tag{1.23}$$
The curve x(p) thus describes a straight line with intercept $\alpha_1$ and slope $\alpha_2$. The least squares method now determines the straight line that best approximates the data points $(p_i, x_i)$. The corresponding $\alpha_j$ are:

$$\alpha_1 = \langle x_i \rangle - \alpha_2 \langle p_i \rangle, \qquad \alpha_2 = \frac{\sum_{i=1}^{N} (x_i - \langle x_i \rangle)(p_i - \langle p_i \rangle)}{\sum_{i=1}^{N} (p_i - \langle p_i \rangle)^2}. \tag{1.24}$$

In these equations, both the mean value of the measured data $\langle x_i \rangle$ and the mean value of the parameters $\langle p_i \rangle$ occur. To demonstrate the procedure in more detail, an example is calculated below:
1.2.4.1 Example: Decay of Beer Foam

We consider the exponential decay of beer foam. The height of the beer foam as a function of time is given by

$$h(t) = h_0\, e^{-t/\tau}. \tag{1.25}$$

Here $h_0$ is the height at the time t = 0 and τ is the time constant of the decay. A large τ is a quality characteristic of a good beer. In order to be able to carry out a linear regression, we replace the measured value h(t) by the quantity x(t):

$$x(t) = \ln\left(\frac{h(t)}{h_0}\right) = -\frac{1}{\tau}\, t. \tag{1.26}$$
Thus, there is a linear relationship between x(t) and the time t. The slope of the straight line is given by $\alpha_2 = -1/\tau$. Let us now turn to the measured values $(t_i, h_i)$ and the quantities calculated from them, which are summarized in the following table:

| $i$ | $t_i$ | $t_i - \langle t_i \rangle$ | $(t_i - \langle t_i \rangle)^2$ | $h_i$ | $x_i$ | $x_i - \langle x_i \rangle$ | $(t_i - \langle t_i \rangle)(x_i - \langle x_i \rangle)$ |
|---|---|---|---|---|---|---|---|
| 1 | 0 s | -45 s | 2025 s² | 5.0813 cm | 0 | 1.1376 | -51.1930 s |
| 2 | 10 s | -35 s | 1225 s² | 3.7854 cm | -0.2944 | 0.8432 | -29.5126 s |
| 3 | 20 s | -25 s | 625 s² | 3.0257 cm | -0.5184 | 0.6192 | -15.4800 s |
| 4 | 30 s | -15 s | 225 s² | 2.5489 cm | -0.6899 | 0.4477 | -6.7158 s |
| 5 | 40 s | -5 s | 25 s² | 1.7120 cm | -1.0879 | 0.0497 | -0.2486 s |
| 6 | 50 s | 5 s | 25 s² | 1.4416 cm | -1.2598 | -0.1222 | -0.6110 s |
| 7 | 60 s | 15 s | 225 s² | 0.9681 cm | -1.6580 | -0.5204 | -7.8056 s |
| 8 | 70 s | 25 s | 625 s² | 1.0137 cm | -1.6120 | -0.4744 | -11.8596 s |
| 9 | 80 s | 35 s | 1225 s² | 0.6030 cm | -2.1314 | -0.9938 | -34.7829 s |
| 10 | 90 s | 45 s | 2025 s² | 0.6073 cm | -2.1243 | -0.9867 | -44.4024 s |
| ∑ | 450 s | | 8250 s² | | -11.3762 | | -202.6115 s |
| ⟨ ⟩ | 45 s | | | | -1.1376 | | |
The height of the beer foam $h_i$ and the calculated quantity $x_i = \ln(h_i/h_1)$ are plotted against time $t_i$ in Fig. 1.2; the first measured value $h_1$, taken at t = 0, serves as $h_0$. The linear decrease of the quantity $x_i$ is approximated here with a linear regression by the red straight line. The slope of the straight line is given by (1.24) as follows:

$$\alpha_2 = \frac{\sum_{i=1}^{10} (t_i - \langle t_i \rangle)(x_i - \langle x_i \rangle)}{\sum_{i=1}^{10} (t_i - \langle t_i \rangle)^2} = \frac{-202.6115\ \mathrm{s}}{8250\ \mathrm{s}^2} = -0.0246\ \mathrm{s}^{-1}. \tag{1.27}$$

This slope corresponds to a time constant of $\tau = -1/\alpha_2 = 41$ s.
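The regression of this example can be reproduced with a short program. The sketch below, in Python, implements (1.24) directly and applies it to the measured values $(t_i, h_i)$ of the table; the function name `linear_regression` is chosen here for illustration only.

```python
import math


def linear_regression(p, x):
    """Return (alpha1, alpha2) of the least-squares line x = alpha1 + alpha2 * p, eq. (1.24)."""
    n = len(p)
    p_mean = sum(p) / n
    x_mean = sum(x) / n
    numerator = sum((xi - x_mean) * (pi - p_mean) for pi, xi in zip(p, x))
    denominator = sum((pi - p_mean) ** 2 for pi in p)
    alpha2 = numerator / denominator
    alpha1 = x_mean - alpha2 * p_mean
    return alpha1, alpha2


# Measured values from the table: time in s, foam height in cm.
t = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
h = [5.0813, 3.7854, 3.0257, 2.5489, 1.7120,
     1.4416, 0.9681, 1.0137, 0.6030, 0.6073]

# Linearize the exponential decay, eq. (1.26), using the first value as h0.
x = [math.log(hi / h[0]) for hi in h]

alpha1, alpha2 = linear_regression(t, x)
tau = -1.0 / alpha2

print(f"slope alpha2 = {alpha2:.4f} 1/s, time constant tau = {tau:.0f} s")
```

Within rounding, the slope and the time constant agree with the values of (1.27).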
Fig. 1.2 Measured height h(t) of the beer foam as a function of time (left axis scale). The exponential decay is linearly represented by x(t) = ln (h(t)/h0) and approximated by a straight line using linear regression. This gives a (1/e) decay time of τ = 41 s

1.3 Physical Measurements: Compact

Here is a summary of the most important formulas for calculating uncertainties in physical measurements:

Mean (1.4):

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i.$$

Standard deviation (1.5):

$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu)^2}.$$

Uncertainty of the mean (1.6):

$$\Delta\mu = \frac{\sigma}{\sqrt{N}}.$$

Error propagation in addition and subtraction (1.14):

$$\Delta z = \sqrt{\Delta y_1^2 + \Delta y_2^2}.$$

Error propagation in multiplication and division (1.17) and (1.20):

$$\frac{\Delta z}{z} = \sqrt{\left(\frac{\Delta y_1}{y_1}\right)^2 + \left(\frac{\Delta y_2}{y_2}\right)^2}.$$

Linear regression (1.23) and (1.24):

$$x(p) = \alpha_1 + \alpha_2\, p, \qquad \alpha_1 = \langle x_i \rangle - \alpha_2 \langle p_i \rangle, \qquad \alpha_2 = \frac{\sum_{i=1}^{N} (x_i - \langle x_i \rangle)(p_i - \langle p_i \rangle)}{\sum_{i=1}^{N} (p_i - \langle p_i \rangle)^2}.$$
2 Mechanics of Rigid Bodies

2.1 Kinematics
Kinematics is only about describing the motion of a body and not about what causes the motion. We introduce a simplified model and describe the body as a mass point. This means that we disregard the exact shape of the body and imagine that the entire mass m of the body is located at its centre of mass. How to determine the centre of mass is explained in Sect. 2.4. The centre of mass moves in space, just like the body as a whole. Only rotations of the body around itself cannot be described by this model; more about this in Sect. 2.5.
2.1.1 Position, Velocity and Acceleration
For the sake of simplicity, we start here with the motion of a body in only one direction. The motion is defined by the position x(t) at which the mass point is located at a time t. The curve x(t) can be an arbitrarily complicated function, where the velocity v can also change. We are interested in the instantaneous velocity v(t) of the body at time t. Typically, one determines the velocity by measuring the distance Δx that the body travels in a time interval Δt:

$$v = \frac{\Delta x}{\Delta t}, \quad \text{for } v = \text{const}, \qquad [v] = \frac{\mathrm{m}}{\mathrm{s}}. \tag{2.1}$$
This is the formula familiar from school, but it only applies if the velocity does not change within the time interval Δt. If the velocity changes, one must make the time interval Δt so small that the velocity remains practically constant within it. This is the case if the length of the time interval tends to zero:

$$v(t) = \lim_{\Delta t \to 0} \frac{\Delta x}{\Delta t} = \frac{\mathrm{d}x}{\mathrm{d}t} = \dot{x}(t). \tag{2.2}$$

Fig. 2.1 The derivative of the position with respect to time is the instantaneous velocity. Graphically, the velocity is represented by the slope of the tangent to the position curve. At extreme points of the position curve, the direction of motion reverses and the velocity is v = 0

© Springer-Verlag GmbH Germany, part of Springer Nature 2023
S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_2
The velocity function v(t) is thus given by the time derivative of the position function. In physics, this is also indicated by a dot above the function, $\dot{x}(t)$. Graphically, velocity can be represented by the slope of the tangent to the position function (see Fig. 2.1). In the same way that velocity is defined as the derivative of the position function, acceleration a can be defined. The concept of acceleration is concerned with how fast the velocity changes as a function of time. Therefore

$$a(t) = \frac{\mathrm{d}v}{\mathrm{d}t} = \dot{v}(t) = \frac{\mathrm{d}^2 x}{\mathrm{d}t^2} = \ddot{x}(t), \qquad [a] = \frac{\mathrm{m}}{\mathrm{s}^2}. \tag{2.3}$$
The acceleration is thus given by the time derivative of the velocity function or by the second derivative of the position curve. Note again the physical notation of the time derivative with dots.

Example: Acceleration of a Quadratic Position Function
A body is at time t = 0 at position $x_0$ and moves with velocity $v_0$. For times t > 0 its position is given by the quadratic function

$$x(t) = x_0 + v_0 t + \frac{1}{2} a t^2, \tag{2.4}$$

with the constant quantities $x_0$, $v_0$ and a. What is the acceleration acting on the body? We need to differentiate the position function twice. The first derivative gives the velocity:

$$v(t) = \frac{\mathrm{d}x}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t}\left(x_0 + v_0 t + \frac{1}{2} a t^2\right) = v_0 + at. \tag{2.5}$$

The second derivative finally gives the acceleration:

$$a(t) = \frac{\mathrm{d}v}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t}\left(v_0 + at\right) = a. \tag{2.6}$$
Motions of the form (2.4), where the position is a quadratic function of time, thus all have a constant acceleration a. This statement is also true in reverse: all accelerated motions with constant acceleration a can be described by a function of the form (2.4).
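The limit in (2.2) can be imitated numerically by choosing a small but finite time interval. The Python sketch below approximates the first and second time derivatives of the quadratic position function (2.4) by difference quotients; the parameter values $x_0$ = 1 m, $v_0$ = 2 m/s and a = 3 m/s² as well as the step dt are hypothetical choices for illustration.

```python
def position(t, x0=1.0, v0=2.0, a=3.0):
    """Quadratic position function, eq. (2.4), with illustrative parameters."""
    return x0 + v0 * t + 0.5 * a * t ** 2


def derivative(f, t, dt=1e-3):
    """Symmetric difference quotient as an approximation of eq. (2.2)."""
    return (f(t + dt) - f(t - dt)) / (2 * dt)


def second_derivative(f, t, dt=1e-3):
    """Second difference quotient as an approximation of eq. (2.3)."""
    return (f(t + dt) - 2 * f(t) + f(t - dt)) / dt ** 2


for t in (0.0, 1.0, 2.5):
    v = derivative(position, t)
    a = second_derivative(position, t)
    print(f"t = {t:.1f} s: v = {v:.3f} m/s, a = {a:.3f} m/s^2")
```

The velocity grows linearly as $v_0 + at$, while the numerically determined acceleration is the same at all times, in line with (2.5) and (2.6).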
2.1.2 Determination of the Position from the Acceleration
In many problems in physics the acceleration is given and one wants to determine the position. For this purpose, the mathematical operation of differentiating twice, which yields the acceleration from the position curve, must be reversed (see Fig. 2.2). Thus, the acceleration must be integrated twice. The process of integration is a little more complicated than that of differentiation, because one has to consider the limits of the integral. Furthermore, there is an arbitrary constant of integration, which can be added to the antiderivative. The complete physical problem is now as follows: A body is at time $t_0$ at location $x_0$ and is moving with velocity $v_0$. The acceleration a(t) acts on it. What is the velocity of the body at time t, and at what position is it then? The increase in velocity in a time interval dt at time t is given according to (2.3) by

$$\mathrm{d}v(t) = a(t)\,\mathrm{d}t. \tag{2.7}$$

Thus, the total velocity increase in the time interval from $t_0$ to t is given by

$$\Delta v = \int_{t_0}^{t} a(t')\,\mathrm{d}t'. \tag{2.8}$$
Here we have replaced the notation of the variable t in the integral by t′ to avoid confusion, since t has already been defined as the upper integration limit of the integral. The velocity at time t is thus given by the starting velocity $v_0$ plus the velocity increase:

$$v(t) = \int_{t_0}^{t} a(t')\,\mathrm{d}t' + v_0. \tag{2.9}$$

Fig. 2.2 The position function follows from the acceleration by integrating twice
The integration constant can therefore be interpreted as the starting velocity at the time $t_0$. In order to determine the position curve from the velocity v(t) obtained in this way, one proceeds quite analogously and integrates a second time:

$$x(t) = \int_{t_0}^{t} v(t')\,\mathrm{d}t' + x_0. \tag{2.10}$$
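The two integrations (2.9) and (2.10) can also be carried out numerically when no antiderivative is at hand. The following Python sketch uses a simple trapezoidal rule; the helper names, the number of steps and the test case (a constant acceleration of -9.81 m/s², start from rest at $x_0$ = 50 m) are assumptions chosen for illustration.

```python
def integrate(f, t0, t, steps=200):
    """Trapezoidal approximation of the integral of f from t0 to t."""
    dt = (t - t0) / steps
    total = 0.5 * (f(t0) + f(t))
    for k in range(1, steps):
        total += f(t0 + k * dt)
    return total * dt


def velocity(a, t, t0=0.0, v0=0.0):
    """Velocity from the acceleration function a, eq. (2.9)."""
    return integrate(a, t0, t) + v0


def position(a, t, t0=0.0, v0=0.0, x0=0.0):
    """Position from the acceleration function a, eq. (2.10)."""
    return integrate(lambda s: velocity(a, s, t0, v0), t0, t) + x0


g = 9.81  # m/s^2


def accel(t):
    """Assumed test case: constant downward acceleration."""
    return -g


v = velocity(accel, 2.0)
x = position(accel, 2.0, x0=50.0)
print(f"after 2 s: v = {v:.2f} m/s, x = {x:.2f} m")
```

For a constant acceleration the numerical result can be compared with the quadratic position function (2.4); for a time-dependent a(t) only the function `accel` has to be exchanged.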
Since in general one can choose the zero point of the time axis arbitrarily, one can simplify the integrals (2.9) and (2.10) by setting $t_0 = 0$.

Example: Free Fall
In free fall, a body starts from rest at the height $h_0$, i.e. it has the starting speed $v_0 = 0$. Due to the constant acceleration due to gravity, $a = -g = -9.81\ \mathrm{m/s^2}$, the body falls to the ground, where it impacts at h = 0. The negative sign of the acceleration comes from the fact that the height h is counted as positive upwards from the ground, so that the downward acceleration points against the positive axis. We first integrate the acceleration

$$v(t) = \int_{0}^{t} (-g)\,\mathrm{d}t' + v_0 = -gt \tag{2.11}$$

and then the obtained velocity

$$h(t) = \int_{0}^{t} (-gt')\,\mathrm{d}t' + h_0 = h_0 - \frac{1}{2} g t^2, \tag{2.12}$$
to obtain the position function h(t). With this result we can now calculate the time T at which the body hits the ground. To do this, we set the position function h(T) = 0:

$$h_0 - \frac{1}{2} g T^2 = 0 \;\Rightarrow\; T = \sqrt{\frac{2h_0}{g}}. \tag{2.13}$$

Using the duration of the free fall, we can also calculate the speed with which the body hits the ground by substituting t = T into the velocity function:

$$v(T) = -gT = -g\sqrt{\frac{2h_0}{g}} = -\sqrt{2 g h_0}. \tag{2.14}$$

The velocity obtained is again negative, since the body is moving in the negative h-direction (downwards). If one is only interested in the magnitude of the velocity, the following applies:

$$|v(T)| = \sqrt{2 g h_0}. \tag{2.15}$$

2.1.3 Motion in Three Dimensions
The results obtained so far for the position function, velocity and acceleration are also valid in three-dimensional space, only now all quantities are to be treated as three-dimensional vectors:

$$\vec{r}(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix}, \qquad \vec{v}(t) = \begin{pmatrix} v_x(t) \\ v_y(t) \\ v_z(t) \end{pmatrix}, \qquad \vec{a}(t) = \begin{pmatrix} a_x(t) \\ a_y(t) \\ a_z(t) \end{pmatrix}. \tag{2.16}$$
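The superposition principle applies here: the movements in the three dimensions are completely independent of each other. The results of kinematics in one dimension can therefore be applied to the position functions x(t), y(t) and z(t) of the body individually and then combined as a vector sum:

$$\vec{r}(t) = x(t) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + y(t) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + z(t) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{2.17}$$

The derivatives are given by:

$$\vec{v}(t) = \frac{\mathrm{d}x(t)}{\mathrm{d}t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \frac{\mathrm{d}y(t)}{\mathrm{d}t} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \frac{\mathrm{d}z(t)}{\mathrm{d}t} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{2.18}$$

and

$$\vec{a}(t) = \frac{\mathrm{d}v_x(t)}{\mathrm{d}t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \frac{\mathrm{d}v_y(t)}{\mathrm{d}t} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \frac{\mathrm{d}v_z(t)}{\mathrm{d}t} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{2.19}$$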
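Thus, the calculations are not more difficult than in one dimension, but they may be three times as extensive. The superposition principle also applies to the integrals:

$$\vec{v}(t) = \left(\int_{t_0}^{t} a_x(t')\,\mathrm{d}t' + v_{x,0}\right) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \left(\int_{t_0}^{t} a_y(t')\,\mathrm{d}t' + v_{y,0}\right) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \left(\int_{t_0}^{t} a_z(t')\,\mathrm{d}t' + v_{z,0}\right) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{2.20}$$

and

$$\vec{r}(t) = \left(\int_{t_0}^{t} v_x(t')\,\mathrm{d}t' + x_0\right) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \left(\int_{t_0}^{t} v_y(t')\,\mathrm{d}t' + y_0\right) \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \left(\int_{t_0}^{t} v_z(t')\,\mathrm{d}t' + z_0\right) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{2.21}$$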
2.1.3.1 Oblique Projectile Motion

The prime example of motion in several dimensions is oblique projectile motion, where we neglect the effects of air friction. Here, a body is thrown with a starting velocity $v_0$ at an angle α to the horizontal (see Fig. 2.3). The motion effectively takes place in only two of the three dimensions of space, depending on the choice of axes, for example, in the xz-plane. We place the launch position at the zero point: $z_0 = 0$ and $x_0 = 0$. The launch velocities along the two directions are $v_{x,0} = v_0 \cos(\alpha)$ and $v_{z,0} = v_0 \sin(\alpha)$. Thus, $v_0 = \sqrt{v_{x,0}^2 + v_{z,0}^2}$.

Fig. 2.3 In the oblique throw, a body is thrown at the angle α with the velocity v0. The maximum throwing height is h, and the throwing distance is l. In (a) the trajectory z(x) is sketched, in (b) the time functions x(t) and z(t) are shown

The acceleration due to gravity has the value

$$\vec{a} = \begin{pmatrix} 0 \\ 0 \\ -g \end{pmatrix} \tag{2.22}$$
and points in the negative z-direction. We calculate the motion of the body for each direction separately and start with the motion in the z-direction. Furthermore, we set the time zero to the launch time, $t_0 = 0$. According to (2.9), the velocity of the body is given by

$$v_z(t) = \int_{0}^{t} (-g)\,\mathrm{d}t' + v_{z,0} = v_{z,0} - gt. \tag{2.23}$$

With (2.10), the position in the z-direction becomes

$$z(t) = \int_{0}^{t} \left(v_{z,0} - gt'\right)\mathrm{d}t' = v_{z,0}\, t - \frac{1}{2} g t^2. \tag{2.24}$$
From these results we can calculate the duration T of the motion until the body hits the ground. With z(T) = 0, we get

$$T = \frac{2 v_{z,0}}{g}. \tag{2.25}$$

We can also determine the maximum height reached. At the upper reversal point the vertical velocity is zero for a short moment; from this we can calculate the corresponding time:

$$v_{z,0} - gt = 0 \;\Rightarrow\; t = \frac{v_{z,0}}{g} = \frac{T}{2}. \tag{2.26}$$

The maximum height is therefore reached at half the duration. If we put this into the position curve, we get the maximum height

$$z(T/2) = v_{z,0}\,\frac{T}{2} - \frac{1}{2}\, g \left(\frac{T}{2}\right)^2 = \frac{1}{2}\,\frac{v_{z,0}^2}{g}. \tag{2.27}$$
Now we calculate the motion in the x-direction. Since there is no acceleration in this direction ($a_x = 0$), the velocity remains constant:

$$v_x(t) = v_{x,0}, \tag{2.28}$$

and the x-position increases linearly with time:

$$x(t) = v_{x,0} \cdot t. \tag{2.29}$$
By substituting the duration T from (2.25) into this equation, we can calculate the throwing distance l of the motion:

$$l = x(T) = \frac{2 v_{x,0} v_{z,0}}{g}. \tag{2.30}$$
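A frequently asked question is at which angle the greatest distance is achieved. To answer it, we express the distance in terms of the launch angle, using $v_{x,0} = v_0 \cos(\alpha)$ and $v_{z,0} = v_0 \sin(\alpha)$, and calculate the derivative of the distance l with respect to the angle α:

$$\frac{\mathrm{d}l}{\mathrm{d}\alpha} = \frac{\mathrm{d}}{\mathrm{d}\alpha}\, \frac{2 v_0^2 \cos(\alpha) \sin(\alpha)}{g} \tag{2.31}$$

$$= \frac{2 v_0^2}{g} \left(\cos(\alpha)^2 - \sin(\alpha)^2\right). \tag{2.32}$$

At the maximum, the derivative must be zero. For angles between 0° and 90° this results in

$$\cos(\alpha)^2 - \sin(\alpha)^2 = 0 \tag{2.33}$$

$$\cos(\alpha) = \sin(\alpha) \tag{2.34}$$

$$\tan(\alpha) = 1 \tag{2.35}$$

$$\alpha = 45^\circ. \tag{2.36}$$

Thus, the largest distance is achieved at an angle of α = 45°.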
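The result for the optimal angle can be cross-checked by simply trying out all launch angles. The Python sketch below evaluates the throwing distance (2.30) and the maximum height (2.27) for a given launch speed; the function names, the launch speed of 10 m/s and the one-degree angle grid are assumptions made for illustration.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2


def throw_distance(v0, alpha_deg, g=G):
    """Throwing distance l according to eqs. (2.25) and (2.30)."""
    alpha = math.radians(alpha_deg)
    vx0 = v0 * math.cos(alpha)
    vz0 = v0 * math.sin(alpha)
    flight_time = 2 * vz0 / g   # eq. (2.25)
    return vx0 * flight_time    # eq. (2.29) evaluated at t = T


def max_height(v0, alpha_deg, g=G):
    """Maximum height according to eq. (2.27)."""
    vz0 = v0 * math.sin(math.radians(alpha_deg))
    return 0.5 * vz0 ** 2 / g


v0 = 10.0  # hypothetical launch speed in m/s

# Scan whole-degree angles from 0 to 90 degrees and pick the largest distance.
best_angle = max(range(0, 91), key=lambda a: throw_distance(v0, a))

print(f"best angle: {best_angle} degrees")
print(f"distance:   {throw_distance(v0, best_angle):.2f} m")
print(f"height:     {max_height(v0, best_angle):.2f} m")
```

Such a scan is of course no substitute for the derivation, but it is a quick way to test the formula, and it carries over to cases without a closed solution (for example, with air friction).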
2.1.4 Circular Motion
In many physical processes, bodies move on circular paths with a fixed radius R around an axis; for example, the earth moves approximately on a circular path with a radius of R = 150 000 000 km around the sun. The position $\vec{r}$ of the body is described by the angle φ measured from the x-axis:

$$\vec{r}(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} = R \begin{pmatrix} \cos(\varphi(t)) \\ \sin(\varphi(t)) \\ 0 \end{pmatrix}. \tag{2.37}$$
Fig. 2.4 (a) In circular motion, the body moves at a distance R from the axis of rotation. Its instantaneous position is described by the angle φ(t) relative to the x-axis. (b) Right-hand rule for determining $\vec{\omega}$. The curved fingers indicate the orbital direction of the mass around the axis, the outstretched thumb indicates the direction of $\vec{\omega}$

In Fig. 2.4a the z-axis was defined as the axis of rotation. The time dependence of the circular motion is contained in the angle φ(t). An important special case is the uniform circular motion. It can be described by introducing the constant angular velocity ω (also referred to as angular frequency) as

$$\varphi(t) = \omega t. \tag{2.38}$$

The angular velocity is given by

$$\omega = \frac{2\pi}{T}, \qquad [\omega] = \frac{1}{\mathrm{s}}, \quad \text{resp.} \tag{2.39}$$

$$f = \frac{1}{T} = \frac{\omega}{2\pi}, \qquad [f] = \frac{1}{\mathrm{s}} = \mathrm{Hz}. \tag{2.40}$$
Here T denotes the orbital period of the body about the axis and f denotes the frequency in units of hertz. The angular velocity can also be defined as a vector:

$$\vec{\omega} = \omega \cdot \vec{e}. \tag{2.41}$$

It has the length ω and points along a unit vector $\vec{e}$ (a vector of length 1) in the direction of the axis of rotation. The direction of $\vec{\omega}$ can be determined with the help of the right-hand rule (see Fig. 2.4b). The uniform circular motion in Cartesian coordinates is therefore

$$\vec{r}(t) = R \begin{pmatrix} \cos(\omega t) \\ \sin(\omega t) \\ 0 \end{pmatrix}. \tag{2.42}$$
From this we can now determine the velocity $\vec{v}(t)$ of the body:

$$\vec{v}(t) = \frac{\mathrm{d}}{\mathrm{d}t}\, \vec{r}(t) = R \begin{pmatrix} -\omega \sin(\omega t) \\ \omega \cos(\omega t) \\ 0 \end{pmatrix}. \tag{2.43}$$

The magnitude of the velocity is given by

$$v = \left|\vec{v}(t)\right| = \sqrt{v_x^2 + v_y^2 + v_z^2} = \omega R \tag{2.44}$$
and thus constant in time. It is worth noting here that the velocity as a vector constantly changes its direction and always points along the path. In this context, one also speaks of the tangential velocity. So there must be an acceleration that causes this change in velocity. It is called centripetal acceleration and is given by
$$\vec a_Z(t) = \frac{d}{dt}\vec v(t) = R \begin{pmatrix} -\omega^2 \cos(\omega t) \\ -\omega^2 \sin(\omega t) \\ 0 \end{pmatrix} = -\omega^2\, \vec r(t). \tag{2.45}$$
This result means that the centripetal acceleration always points in a direction opposite to the position vector $\vec r$ of the body, i.e. always from the instantaneous position of the body to the axis of rotation. The magnitude of the centripetal acceleration is
$$a_Z = |\vec a_Z(t)| = \omega^2 R. \tag{2.46}$$
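The relations v = ωR and a_Z = ω²R can be reproduced numerically by differentiating the position function with finite differences. This is only a sketch: the helper names, the step size h and the values R = 0.5 m and ω = 1.6 1/s are illustrative assumptions.

```python
import math

def position(t, R, omega):
    """Position on the circle for uniform circular motion about the z-axis."""
    return (R * math.cos(omega * t), R * math.sin(omega * t), 0.0)

def derivative(f, t, h=1e-4):
    """Central finite difference of a vector-valued function f(t)."""
    ahead, behind = f(t + h), f(t - h)
    return tuple((x - y) / (2 * h) for x, y in zip(ahead, behind))

def norm(vec):
    return math.sqrt(sum(c * c for c in vec))

R, omega, t0 = 0.5, 1.6, 0.3  # illustrative values (m, 1/s, s)

def velocity(t):
    return derivative(lambda s: position(s, R, omega), t)

v = velocity(t0)              # first derivative of the position
a = derivative(velocity, t0)  # second derivative of the position

print(norm(v), omega * R)     # both close to 0.8 m/s
print(norm(a), omega**2 * R)  # both close to 1.28 m/s^2
```

Comparing `a` component by component with `position(t0, R, omega)` shows that the acceleration points opposite to the position vector, i.e. towards the axis of rotation.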
If the magnitude of the angular velocity changes, it is no longer a uniform circular motion. In this case, in addition to the centripetal acceleration, which always exists with circular motion, one can also define an angular acceleration $\vec\alpha$:
$$\vec\alpha = \frac{d\vec\omega}{dt}, \quad [\alpha] = \frac{1}{\mathrm{s}^2}. \tag{2.47}$$
This relationship also applies to the absolute value:
$$\alpha = \frac{d\omega}{dt}. \tag{2.48}$$
The angular acceleration is related to the tangential acceleration $\vec a_T$, i.e. the change in the velocity of the body along the circular path, by
$$\vec a_T(t) = \vec\alpha \times \vec r(t), \quad [a_T] = \frac{\mathrm{m}}{\mathrm{s}^2}, \tag{2.49}$$
with the cross product × (see also Sect. 2.2.4.3). This connection is also valid for the absolute value using normal multiplication:
$$a_T = \alpha R. \tag{2.50}$$
Written Test: Tangential Acceleration
A child sits at the edge of a turntable with a radius of R = 0.5 m. A constant angular acceleration α = 0.8 s⁻² acts on the turntable.
1. What is the tangential acceleration acting on the child?
2. What is the angular velocity of the disc and the tangential velocity of the child at t = 2 s?
3. What is the centripetal acceleration acting on the child at t = 2 s?
Solution
1. The tangential acceleration of the child is
$$a_T = \alpha R = 0.8\,\frac{1}{\mathrm{s}^2} \cdot 0.5\,\mathrm{m} = 0.4\,\frac{\mathrm{m}}{\mathrm{s}^2}. \tag{2.51}$$
2. Assuming the turntable starts from rest, the angular velocity at t = 2 s is
$$\omega = \alpha \cdot t = 0.8\,\frac{1}{\mathrm{s}^2} \cdot 2\,\mathrm{s} = 1.6\,\frac{1}{\mathrm{s}}. \tag{2.52}$$
The child then has the tangential velocity
$$v = \omega \cdot R = 1.6\,\frac{1}{\mathrm{s}} \cdot 0.5\,\mathrm{m} = 0.8\,\frac{\mathrm{m}}{\mathrm{s}}. \tag{2.53}$$
By the way, the same result is obtained from the tangential acceleration:
$$v = a_T \cdot t = 0.4\,\frac{\mathrm{m}}{\mathrm{s}^2} \cdot 2\,\mathrm{s} = 0.8\,\frac{\mathrm{m}}{\mathrm{s}}. \tag{2.54}$$
3. The centripetal acceleration acting on the child is
$$a_Z = \omega^2 \cdot R = \left(1.6\,\frac{1}{\mathrm{s}}\right)^2 \cdot 0.5\,\mathrm{m} = 1.28\,\frac{\mathrm{m}}{\mathrm{s}^2}. \tag{2.55}$$
2.2 Forces
The following section explains how forces affect the motion of bodies. Since forces always point in a certain direction, they are vectors. This means that a force can always be decomposed into its components, for example, into the x, y and z components. These components then influence the motion of a body only in the respective direction. This effectively simplifies a three-dimensional problem in space to three one-dimensional problems that are easier to solve. However, forces can not only be decomposed, but also added. For example, if several different forces act on a body, the total force can be calculated by vector addition of the individual forces. The basic physical principles of how forces act on the motion of a body were already written down in the seventeenth century by Isaac Newton in his work Philosophiae Naturalis Principia Mathematica.
2.2.1 Newton’s Axioms
Isaac Newton established three axioms which today form the basis of classical mechanics: the so-called principle of inertia, the principle of action and the principle of reaction (actio = reactio).
Newton’s First Axiom: The Principle of Inertia
If there are no external forces acting on a body, then this body continues to move at a constant velocity: $\vec v = \mathrm{const}$.
Inertia in this sense means the property of a body to maintain its state of motion. This also includes the case where the velocity is v = 0, i.e. the body is not moving at all. Here it must be said that there is always a coordinate system in which the body does not move, namely the system that moves with the body. An example of inertia would be a spaceship in space, which continues to fly at a constant speed into infinity (if it does not first come too close to another celestial body). The principle of inertia is often not properly understood, as it is contrary to everyday observation, where all movements seem to slow down by themselves. In fact, friction forces are the cause of this slowing down.
Newton’s Second Axiom: The Action Principle
The acceleration of a body is proportional to the force acting on it:
$$\vec F = m \cdot \vec a, \quad [F] = \frac{\mathrm{kg} \cdot \mathrm{m}}{\mathrm{s}^2} = \mathrm{N}\ (\text{Newton}). \tag{2.56}$$
Here the property of the inertia of the body is expressed as the mass m with unit kg, which for this reason is also called the inertial mass of the body. The more inert a body is, i.e. the more mass it possesses, the more force is required to accelerate this body with a given acceleration a. If several forces act on a body, the total force as the vector sum of the individual forces $\vec F_i$ is decisive for the acceleration of the centre of mass:
$$\vec F = \sum_{i=1}^{N} \vec F_i = m \cdot \vec a. \tag{2.57}$$
Fig. 2.5 The accelerating force is given by the gravitational force acting on the mass M, but the system of both masses is accelerated with the total mass M + m
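The vector sum in (2.57) translates directly into a component-wise addition. The following sketch uses hypothetical helper names (`net_force`, `acceleration`) and made-up force values purely for illustration.

```python
def net_force(forces):
    """Component-wise vector sum of a list of force vectors (x, y, z) in newtons."""
    return tuple(sum(components) for components in zip(*forces))

def acceleration(forces, mass):
    """Acceleration of the centre of mass from Newton's second axiom, a = F / m."""
    return tuple(f / mass for f in net_force(forces))

# Three illustrative forces; the two x-components cancel each other.
forces = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (-3.0, 0.0, 0.0)]
print(net_force(forces))
print(acceleration(forces, mass=2.0))
```

Each component is treated separately, which mirrors the statement that a three-dimensional problem splits into three one-dimensional ones.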
Newton’s Third Axiom: The Reaction Principle (actio = reactio)
If a body acts on a second body with a force $\vec F_{12}$, then the second body acts on the first body with a counterforce $\vec F_{21}$, for which the following applies:
$$\vec F_{21} = -\vec F_{12}. \tag{2.58}$$
The counterforce is therefore equal in magnitude but opposite in direction, so that the sum of the two forces is zero:
$$\vec F_{21} + \vec F_{12} = 0. \tag{2.59}$$
The reaction principle is different from Newton’s first two laws in that it specifies where the force on each body comes from. Therefore, the reaction principle also requires the presence of two bodies acting on each other with forces. The system under consideration is thus closed. We will see later that the physically deeper reason for the reaction principle lies in the conservation of momentum. In real situations, it is often not so easy to determine which are the two bodies acting on each other with forces. An example of this is a car accelerating on a road. Obviously, a force is acting on the car. However, it is neither the driver nor the engine that exerts this force on the car; it is the road. In turn, the car exerts a force on the road through its tires. In the following Sects. 2.2.2, 2.2.3, 2.2.4 and 2.2.5 we will learn about some important forces.
Written Test: Newton’s Second Law
A mass m = 1 kg lies frictionless on a flat surface and is connected via a cord and a pulley to a second freely suspended mass M = 0.25 kg (see Fig. 2.5). What is the acceleration of the mass m?
Solution
In principle, the force of gravity acts on both masses. However, as the mass m rests on the plane, the gravitational force of the earth is compensated by a counterforce of the plane on the mass according to the reaction principle. The accelerating force is therefore only given by the gravitational force on the freely suspended mass M:
$$F = Mg, \tag{2.60}$$
with g ≈ 10 m/s². This force accelerates both masses, because they are connected by the cord and therefore gain speed at the same rate. The accelerated inertial mass is therefore the total mass M + m. The action principle then leads to
$$Mg = (M + m)\,a. \tag{2.61}$$
We solve this equation for the acceleration:
$$a = \frac{M}{M + m}\, g = \frac{0.25\,\mathrm{kg}}{1.25\,\mathrm{kg}} \cdot 10\,\frac{\mathrm{m}}{\mathrm{s}^2} = 2\,\frac{\mathrm{m}}{\mathrm{s}^2}. \tag{2.62}$$
2.2.2 Gravitational Force
Gravity is one of the four fundamental forces in physics (along with the electromagnetic force and the strong and weak nuclear forces). It exists between all bodies that have mass and is always attractive. Since it is rather weak compared to the other forces, it mainly appears when the masses involved are large, such as in the case of celestial bodies. Two bodies with masses m1 and m2 attract each other with the gravitational force
$$F_G = G \cdot \frac{m_1 m_2}{r^2}, \quad G = 6.67 \cdot 10^{-11}\,\frac{\mathrm{N\,m^2}}{\mathrm{kg}^2}. \tag{2.63}$$
Here the quantity G denotes a natural constant, the so-called gravitational constant. The forces $\vec F_1$ and $\vec F_2$ on the respective masses act on the centres of gravity of the bodies and point along the axis connecting the two centres of gravity (see Fig. 2.6). Because of the reaction principle, the two forces are equal in magnitude, $|\vec F_1| = |\vec F_2| = F_G$, and point in opposite directions.
Fig. 2.6 Two bodies with masses m1 and m2 attract each other due to gravitation
If the direction is included, the gravitational force acting on body 2 is
$$\vec F_2 = G \cdot \frac{m_1 m_2}{r^2} \cdot \frac{-\vec r_{12}}{r}, \tag{2.64}$$
where $\vec r_{12}$ is the position vector of the mass m2 measured from the mass m1. The law of gravity can also be used to calculate the acceleration due to gravity g. To do this, we imagine a body (e.g., a human being) of mass m1 = m on the surface of the earth. The second body is the earth with mass m2 = mE = 5.972 · 10²⁴ kg. Since the center of mass of the earth lies in good approximation in the center of the earth, the distance between the two centers of gravity (the earth and the body) is given by the radius of the earth r = rE = 6371 km. We now set the gravitational force according to (2.63) equal to the weight force F = mg,
$$G\, \frac{m\, m_E}{r_E^2} = mg, \tag{2.65}$$
and solve for the acceleration due to gravity:
$$g = G\, \frac{m_E}{r_E^2} = 6.67 \cdot 10^{-11}\,\frac{\mathrm{N\,m^2}}{\mathrm{kg}^2} \cdot \frac{5.972 \cdot 10^{24}\,\mathrm{kg}}{(6371\,\mathrm{km})^2} = 9.81\,\frac{\mathrm{m}}{\mathrm{s}^2}. \tag{2.66}$$
The exact value of g, however, still depends on where exactly one is located on the earth, since the distance from the center of the earth varies depending on latitude and altitude above sea level.
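The calculation in (2.66) and the dependence on altitude can be sketched in a few lines. The constants are the values quoted above; the function name and the altitude of 10 km are illustrative assumptions, and the Earth is treated as a perfect sphere.

```python
G = 6.67e-11        # gravitational constant in N m^2 / kg^2
M_EARTH = 5.972e24  # mass of the Earth in kg
R_EARTH = 6.371e6   # radius of the Earth in m

def gravitational_acceleration(altitude_m=0.0):
    """g = G * m_E / r^2 at a given altitude above the surface (spherical Earth assumed)."""
    r = R_EARTH + altitude_m
    return G * M_EARTH / r**2

g_surface = gravitational_acceleration()
g_high = gravitational_acceleration(10_000.0)  # illustrative altitude of 10 km
print(round(g_surface, 2), round(g_high, 2))
```

The sketch only covers the altitude effect; the latitude dependence mentioned above would need a non-spherical model of the Earth.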
2.2.3 Spring Force
The spring force is the force with which a spring reacts to a deflection from its rest position. If the spring is compressed, the force counteracts this compression; if the spring is pulled apart, the force also counteracts this change (see Fig. 2.7).
Fig. 2.7 According to Hooke’s law, the spring force always acts against the deflection
Fig. 2.8 Examination task: Order the three spring combinations from small to large spring constant. The individual spring constants are all the same: Di = D
Here, the magnitude of the force is proportional to the deflection. This behaviour is known as Hooke’s law:
$$F = -D \cdot x, \quad [D] = \frac{\mathrm{N}}{\mathrm{m}}, \tag{2.67}$$
with the spring constant D, which indicates the hardness of the spring. Springs with a large spring constant are harder and are deflected by a smaller distance when loaded with the same force. The negative sign in (2.67) is because the force points in the opposite direction of the deflection. Hooke’s law is not only valid for springs, but also for elastic forces in general, for example, for the elastic deformation of solid bodies (see also Sect. 3.1).
Examination Task: Spring Force
A mass m = 2 kg is suspended from a spring with a spring constant D = 5 N/cm. By what distance is the spring deflected?
Solution
The deflecting force is the weight force of the mass F = mg. Thus, the spring is deflected until the restoring force of the spring is equal in magnitude to the weight force. We use Hooke’s law and solve for the deflection x:
$$x = \frac{F}{D} = \frac{mg}{D} = \frac{2\,\mathrm{kg} \cdot 10\,\frac{\mathrm{N}}{\mathrm{kg}}}{5\,\frac{\mathrm{N}}{\mathrm{cm}}} = 4\,\mathrm{cm}. \tag{2.68}$$
The spring is thus deflected by 4 cm.
2.2.3.1 Combination of Springs
Springs can be combined either in series by hanging them together and thus extending the total spring length (see Fig. 2.8a), or in parallel by hanging them side by side and connecting them crosswise by a bar at the lower end (see Fig. 2.8b). These spring combinations can also be described as an effective single spring with a total spring constant Dtot, which is calculated from the single spring constants Di. For springs suspended in parallel,
$$D_{\mathrm{tot}} = \sum_{i=1}^{N} D_i. \tag{2.69}$$
The total spring constant is therefore the sum of the individual spring constants, and the effective spring is harder than the individual springs. In the case of springs connected in series, the reciprocal values add up:
$$\frac{1}{D_{\mathrm{tot}}} = \sum_{i=1}^{N} \frac{1}{D_i}. \tag{2.70}$$
The effective spring is therefore softer than the individual springs. If only two springs are considered, (2.70) can be easily solved for Dtot:
$$D_{\mathrm{tot}} = \frac{D_1 \cdot D_2}{D_1 + D_2}. \tag{2.71}$$
Examination Task: Spring Combinations
Order the three spring combinations in Fig. 2.8 from small to large spring constant. The single spring constants are all the same: Di = D.
Solution
In situation (a), the two springs are connected in series. The total spring constant is therefore
$$D_{\mathrm{tot},a} = \frac{D \cdot D}{D + D} = \frac{1}{2} D. \tag{2.72}$$
In situation (b), the two springs hang side by side. The total spring constant is therefore
$$D_{\mathrm{tot},b} = D + D = 2D. \tag{2.73}$$
Situation (c) is a little more complicated. Here we first calculate the total spring constant of the upper part, i.e. of the two springs hanging in parallel. As we already know from (b), the result is 2D. If we now hang the third spring below it, we get the following for the total spring constant:
$$D_{\mathrm{tot},c} = \frac{D \cdot 2D}{D + 2D} = \frac{2}{3} D. \tag{2.74}$$
So the correct order is Dtot,a < Dtot,c < Dtot,b.
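The combination rules (2.69) and (2.70) can be written as two small functions and nested to reproduce the three situations of the examination task. The function names and the choice D = 1 (arbitrary units) are assumptions made for this sketch.

```python
def parallel(*constants):
    """Total spring constant of springs hanging side by side, Eq. (2.69)."""
    return sum(constants)

def series(*constants):
    """Total spring constant of springs hanging one below the other, Eq. (2.70)."""
    return 1.0 / sum(1.0 / d for d in constants)

D = 1.0  # single spring constant in arbitrary units
d_a = series(D, D)               # situation (a): two springs in series
d_b = parallel(D, D)             # situation (b): two springs in parallel
d_c = series(parallel(D, D), D)  # situation (c): parallel pair with a third spring below

print(d_a, d_b, d_c)
```

Because the functions accept any number of arguments and can be nested, more complicated arrangements can be evaluated in the same way.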
2.2.4 Pseudo Forces
Pseudo forces are forces that arise because a body is in an accelerated frame of reference. An example of this is experienced every day when accelerating or braking in a car. Depending on the situation, one feels pressed into the car seat or into the seat belt; one thus measures a force backwards or forwards, respectively. So, despite their name, pseudo forces actually have measurable consequences. We will get to know three types of pseudo forces in the following.
2.2.4.1 Inertial Force
Inertial forces occur in linearly accelerated systems, i.e. the acceleration points in a fixed direction. As an example, we imagine a scale in an elevator, on which a mass m is lying. If the elevator is not accelerated, the scale shows a weight force F = mg. By the way, this is the case both when the elevator is stationary and when it is moving upwards or downwards at constant speed. Thus, without looking at the external system in which the elevator is located, it is physically impossible to tell whether the elevator is moving or not. Unaccelerated systems that move with respect to each other at a constant speed are, in principle, physically indistinguishable from each other. These systems are also called inertial systems. If now the elevator accelerates upwards (this can happen by starting upwards or by decelerating while moving downwards), the scale will show a larger weight. The weight force is increased by
$$\vec F_T = -m\, \vec a. \tag{2.75}$$
Here $\vec F_T$ is the inertial force (an apparent force) acting on the mass m. The acceleration $\vec a$ is the externally measured acceleration of the reference system in which the mass m is located, i.e. in the example the acceleration of the elevator. If the elevator accelerates downwards, the scale shows a correspondingly smaller weight. Here one must be careful not to confuse the inertial force (2.75) with Newton’s second law (2.56), because the formulas differ only by the sign. Nevertheless, the interpretation of Newton’s second law is different, because there a force F acts directly on the mass m and accelerates it. The reference system is not accelerated.
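The elevator example can be put into numbers with a one-dimensional sketch. It assumes that upward accelerations are counted as positive, so that the inertial force of magnitude ma adds to the weight mg; the function name, g = 9.81 m/s² and the mass of 70 kg are illustrative assumptions.

```python
def scale_reading(mass, elevator_acceleration, g=9.81):
    """Force in newtons shown by the scale; upward elevator acceleration is positive."""
    return mass * (g + elevator_acceleration)

at_rest = scale_reading(70.0, 0.0)         # stationary or moving at constant speed
starting_up = scale_reading(70.0, 2.0)     # accelerating upwards
starting_down = scale_reading(70.0, -2.0)  # accelerating downwards
free_fall = scale_reading(70.0, -9.81)     # reference system in free fall

print(at_rest, starting_up, starting_down, free_fall)
```

In the free-fall case the reading drops to zero, which is the limiting case of the smaller weight shown during downward acceleration.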
2.2.4.2 Centrifugal Force
The centrifugal force FZ is a pseudo force that occurs in rotating coordinate systems. For example, on Earth we are in a rotating coordinate system, since the Earth rotates once around itself in 24 h. Since the centrifugal force associated with this is small compared to the Earth’s gravitational force, we generally don’t notice it. However, if you have ever been on a fast rotating disc (a turntable at the playground), you will have noticed that you are pulled outwards from the disc by a force called centrifugal force. This is the force opposing the centripetal force (actio = reactio). The centripetal force is the force FZ = m · aZ that is needed to accelerate a mass m with the centripetal acceleration aZ from Eq. (2.46) and thereby force it onto a circular path. Like the centripetal acceleration, the centripetal force points to the axis of rotation, while the centrifugal force acts radially outward. The magnitudes of the two forces are equal:
$$F_Z = m \omega^2 r = m\, \frac{v^2}{r}, \tag{2.76}$$
with the radius r of the circular path, the angular velocity ω and the velocity v.
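How small the centrifugal effect of the Earth's rotation is can be estimated from (2.76). The sketch below evaluates the centrifugal acceleration ω²r at the equator using the rotation period of 24 h and the Earth radius quoted earlier; the function name and the comparison value g = 9.81 m/s² are assumptions of this sketch.

```python
import math

def centrifugal_acceleration(radius_m, period_s):
    """Centrifugal acceleration omega^2 * r for one rotation per period."""
    omega = 2 * math.pi / period_s
    return omega**2 * radius_m

a_equator = centrifugal_acceleration(6.371e6, 24 * 3600)
ratio = a_equator / 9.81  # fraction of the acceleration due to gravity

print(a_equator, ratio)
```

The ratio is well below one percent, which is consistent with the statement that the effect generally goes unnoticed.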
2.2.4.3 Coriolis Force
The Coriolis force also occurs in rotating coordinate systems, but only when the body under consideration is moving relative to the rotating system. Anyone who has ever tried to walk on a turntable has probably noticed that it can easily pull your feet away. The origin of this effect is the Coriolis force
$$\vec F_C = 2m\, \vec v \times \vec\omega. \tag{2.77}$$
Here it is important that $\vec v$ is the velocity of the body relative to the rotating system and not relative to the external system. Movements of the body in space solely due to the rotation of the reference system with angular frequency $\vec\omega$ are therefore irrelevant.
Fig. 2.9 (a) Right-hand rule for cross products, (b) Direction of rotation in high and low pressure areas in the northern hemisphere due to the Coriolis force
The direction of the Coriolis force is given by the cross product of $\vec v$ and $\vec\omega$. Note the right-hand rule for cross products according to Fig. 2.9a. The Coriolis force is responsible, for example, for the clockwise and counterclockwise rotation of high and low pressure areas in the northern hemisphere. This is illustrated in Fig. 2.9b. The angular velocity $\vec\omega$ of the Earth’s rotation points vertically upwards along the axis of rotation at the North Pole. In a high-pressure area, air masses now move away from the location of high pressure along the Earth’s surface towards the outside, i.e. to where the air pressure is lower. Due to the spherical shape of the earth, there is thus a component of the velocity $\vec v$ which is perpendicular to $\vec\omega$. Applying the right-hand rule for cross products to the expression $\vec v \times \vec\omega$ gives the direction of the Coriolis force and thus the direction in which the airflow is deflected. In the southern hemisphere, by the way, the directions of rotation of high and low pressure areas are exactly the opposite of those in the northern hemisphere.
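The direction given by the cross product in (2.77) can also be evaluated component by component. The following sketch uses a hypothetical coordinate system in which the z-axis points along the rotation axis at the North Pole; the function names and unit values are illustrative.

```python
def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def coriolis_force(mass, velocity, omega):
    """Coriolis force F_C = 2 m v x omega; velocity is relative to the rotating system."""
    return tuple(2 * mass * c for c in cross(velocity, omega))

omega = (0.0, 0.0, 1.0)     # rotation axis along +z (counterclockwise seen from above)
velocity = (1.0, 0.0, 0.0)  # body moving in the +x direction
print(coriolis_force(1.0, velocity, omega))
```

For motion in the +x direction the force points in the −y direction, i.e. to the right of the direction of motion when viewed from above, as expected for the northern hemisphere.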
2.2.5 Frictional Forces
Frictional forces are familiar to us from everyday experience. They are responsible for the fact that movements become slower without external drive. For example, the amplitude of a swing oscillation decreases without drive, because there is friction in the suspension of the swing and air friction. Within this section we will discuss friction forces of bodies with plane surfaces. These can be divided into static friction, sliding friction and rolling friction. In the case of static friction, the body is not yet moving, i.e. it adheres to the plane. However, an external force Fext acts along a possible direction of motion on the plane (see Fig. 2.10a). This external force is compensated by a counteracting static friction force FH of the same magnitude. The static friction force cannot exceed a maximum value $F_H^{\max}$:
$$F_H^{\max} = \mu_H \cdot F_N. \tag{2.78}$$
This value is proportional to the normal force FN = mg. This is the weight force with which the body presses perpendicularly (= normal) on the plane (see Fig. 2.10). The unitless static friction coefficient μH depends here on the materials and the properties of the plane and the body. As soon as the external force becomes greater than the maximum static friction, the body begins to slide. In this case, the friction force is opposite to the direction of motion of the body and given by the sliding friction force FG. Round bodies can also roll. In this case, the so-called rolling friction force FR acts. The following applies to both forces:
$$F_{G,R} = \mu_{G,R} \cdot F_N. \tag{2.79}$$
The coefficient of sliding or rolling friction μG,R, which is also unitless, is typically smaller than the coefficient of static friction, so that the body is accelerated further after overcoming the static friction. In addition, μR ≪ μG, i.e., rolling is associated with significantly less friction than sliding.
Fig. 2.10 (a) Static friction FH on the plane, (b) Forces on the inclined plane
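The transition from static to sliding friction described by (2.78) and (2.79) can be summarised in a small decision function. This is a sketch with hypothetical names and made-up coefficients; rolling friction would follow the same pattern with μR in place of μG.

```python
def friction_force(f_ext, normal_force, mu_static, mu_sliding):
    """Return (magnitude of the friction force in N, True if the body slides)."""
    f_static_max = mu_static * normal_force  # Eq. (2.78)
    if abs(f_ext) <= f_static_max:
        # Static friction exactly compensates the external force.
        return abs(f_ext), False
    # The body slides; sliding friction from Eq. (2.79) acts.
    return mu_sliding * normal_force, True

# Illustrative values: F_N = 20 N, mu_H = 0.5, mu_G = 0.3
print(friction_force(5.0, 20.0, 0.5, 0.3))   # body stays at rest
print(friction_force(12.0, 20.0, 0.5, 0.3))  # body slides
```

Once the body slides, the friction force is smaller than the external force that set it in motion, so a net accelerating force remains.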
2.2.5.1 Inclined Plane
If the body lies on a plane inclined to the horizontal by the angle α, the weight force must be divided into a component parallel to the plane and a perpendicular component (see Fig. 2.10b). The perpendicular component is again called the normal force FN and causes frictional forces. The parallel component Fp (downward slope force), on the other hand, acts like an external force pulling the body downward parallel to the plane. The corresponding magnitudes are
$$F_N = mg \cos(\alpha), \tag{2.80}$$
$$F_p = mg \sin(\alpha). \tag{2.81}$$
2.2.6 Equation of Motion
As we already know, with the help of the formulas (2.9) and (2.10) one can determine the position function $\vec r(t)$ of a body from the acceleration $\vec a(t)$ by integration. The acceleration is given by Newton’s second axiom through the force $\vec F(t)$, which acts on the body. However, in real problems in physics, usually only the force $\vec F(\vec r)$ is known as a function of location $\vec r$, and not as a function of time. This is because how the force on the body changes in time depends precisely on where the body is at time t, and therefore implicitly on the position function $\vec r(t)$ to be determined. So, to solve this problem, one has to proceed differently: One inserts the relation $\vec a = \ddot{\vec r}$ known from kinematics into Newton’s second axiom $\vec F(\vec r) = m\, \vec a$. This results in:
$$\ddot{\vec r}(t) = \frac{1}{m}\, \vec F(\vec r(t)). \tag{2.82}$$
This is the so-called equation of motion. On the left side of the equation is the second derivative of the function $\vec r(t)$ with respect to time, and on the right side is a force that depends on where the body is at the time t. The unknown variable in this equation is not a specific position, but the entire function. Such an equation of a function and its derivatives is called a differential equation (DEQ). Solving differential equations is a problem of mathematics and can be arbitrarily difficult depending on the type of equation, so we will focus on only a few special cases in this book.
Fig. 2.11 Kepler orbits: For negative energy (measured at an infinite distance from the Sun) the comet moves on an ellipse around the Sun, for a kinetic energy equal to zero the comet comes flying out of infinity on a parabolic orbit. For positive energy the parabola changes into a hyperbola
One example is discussed in Sect. 4.1 on oscillations. Another example is the motion of celestial bodies, such as the motion of a comet with mass mK around a star with mass mS ≫ mK. In this case, the gravitational force acting on the comet is given by
$$\vec F_G(\vec r) = G \cdot \frac{m_K m_S}{r^2} \cdot \frac{-\vec r}{r}, \tag{2.83}$$
where $\vec r$ is the comet’s position vector measured from the Sun’s position with magnitude $r = |\vec r|$. The expression $-\vec r / r$ describes the direction of the force opposite to the position vector towards the Sun. The equation of motion for the comet is thus
$$\ddot{\vec r}(t) = -G m_S \cdot \frac{\vec r}{r^3}. \tag{2.84}$$
This differential equation has several different solutions depending on the energy of the comet at an infinite distance from the Sun, which are called Kepler orbits in honour of the astronomer Johannes Kepler (see Fig. 2.11). For negative energy, the comet is bound to the Sun and periodically orbits the Sun at a finite distance. The corresponding orbital curve is an ellipse with the Sun at one focal point. The planetary orbits in our solar system are also ellipses, but they have a very small ellipticity, making them almost circular. If the energy of the comet increases to zero, the orbital curve is no longer bound to the Sun. In this case, the comet comes flying out of infinity, circles the Sun once, and disappears back into infinity. The corresponding orbital curve is a parabola, i.e. the orbital curve is still curved even for arbitrarily large distances from the sun. If the energy increases to positive values, the parabola becomes a hyperbola, i.e. for large distances from the sun the comet flies asymptotically on a straight path.
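Where an analytical solution is out of reach, the equation of motion (2.84) can be integrated numerically in small time steps. The sketch below is one possible approach, not the method of the text: it uses the velocity-Verlet scheme in two dimensions and units in which G·mS = 1. All function names, the step size and the start values are assumptions. The sign of the energy per unit mass, ½v² − G·mS/r, is used to classify the orbit.

```python
import math

def acceleration(x, y, gm=1.0):
    """Right-hand side of Eq. (2.84) in the orbital plane."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3

def specific_energy(x, y, vx, vy, gm=1.0):
    """Energy per unit comet mass; the potential energy is zero at infinite distance."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)

def orbit_type(x, y, vx, vy, gm=1.0, tol=1e-12):
    energy = specific_energy(x, y, vx, vy, gm)
    if energy < -tol:
        return "ellipse"
    if energy > tol:
        return "hyperbola"
    return "parabola"

def integrate_orbit(x, y, vx, vy, dt=1e-3, steps=7000, gm=1.0):
    """Velocity-Verlet integration; returns the visited points and the final state."""
    path = [(x, y)]
    ax, ay = acceleration(x, y, gm)
    for _ in range(steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = acceleration(x, y, gm)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
        path.append((x, y))
    return path, (x, y, vx, vy)

# Start at distance 1 with speed 1 perpendicular to the radius: a circular orbit.
path, final_state = integrate_orbit(1.0, 0.0, 0.0, 1.0)
print(orbit_type(1.0, 0.0, 0.0, 1.0), len(path))
```

Increasing the start speed to √2 gives zero energy (parabola), and larger speeds give a hyperbola, in line with the three cases of Fig. 2.11.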
2.3 Conserved Quantities in Mechanics
Conserved quantities are physical quantities that do not change in a closed system. There is a direct connection between symmetries in the system and the corresponding conserved quantities. This connection is especially important in modern particle physics, where one assumes symmetries of the elementary particles, from which one can derive important connections. In the following, we will deal with the conserved quantities in classical mechanics, i.e. energy, momentum and angular momentum.
2.3.1 Work, Energy, Potential and Power
The energy E is one of the most important conserved quantities in physics. It has the unit joule:
$$[E] = \mathrm{N\,m} = \frac{\mathrm{kg\,m^2}}{\mathrm{s}^2} = \mathrm{J}\ (\text{Joule}). \tag{2.85}$$
Law of Conservation of Energy
The total energy in a closed system is constant.
In terms of conservation of energy, a closed system does not exchange energy with its surroundings. In particular, no work is performed on the system from the outside. The system itself can consist of many individual subsystems or objects, each of which contains a different amount of energy, which can also exist in different forms of energy. The total energy is then the sum of all these individual energies and forms of energy Ei. It is constant:
$$E_{\mathrm{tot}} = \sum_i E_i = \mathrm{const}. \tag{2.86}$$
Within the system under consideration, however, the energy can be converted between different objects and forms of energy as desired. If an object is moved by the force $\vec F$ by a distance $\vec r$, the work W is performed on this object:
$$W = \vec F \cdot \vec r, \quad [W] = \mathrm{N\,m} = \mathrm{J}. \tag{2.87}$$
In the definition of work, it is crucial that force and distance vector are connected by a scalar product. So only the contribution of the force in the direction of the motion counts. Furthermore, it is assumed in (2.87) that the path is straight and the force on this path is constant. For curved paths or a changing force on the path, the general definition of work is needed (see Sect. 2.3.1.6).
The work done on the object changes the energy of the object by the amount of the work:
$$\Delta E = W. \tag{2.88}$$
For example, when the work is positive (W > 0), the energy of the body increases. Energy is therefore work stored in the object. The body can then perform work on another object by releasing the energy itself. In the following, we will learn some examples of different forms of energy.
2.3.1.1 Potential Energy
If an object with mass m is lifted upwards by a height z against the force of gravity, an upwards acting force F = mg is required for this. When lifting, the lifting work W = mgz is then performed. The potential energy Epot with respect to the initial height z = 0 is thus
$$E_{\mathrm{pot}} = mgz. \tag{2.89}$$
2.3.1.2 Elastic Energy
Energy can also be stored as elastic deformation, for example, when a spring is compressed as in an air gun. The calculation of the corresponding elastic work is a little more complicated than for the lifting work, since the force depends on the deflection x: to deflect the spring, a force F = Dx must be applied against the spring force of Hooke’s law (2.67). The work is thus given by the integral
$$W = \int F(x)\, dx = \int D x\, dx = \frac{1}{2} D x^2. \tag{2.90}$$
The elastic energy is
$$E_{\mathrm{elast}} = \frac{1}{2} D x^2. \tag{2.91}$$
In the air gun, the elastic energy of the spring is used to accelerate the projectile.
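The integral in (2.90) can be approximated by summing the work on many small pieces of the deflection. This sketch uses a midpoint sum with hypothetical names; the spring constant of 5 N/cm = 500 N/m and the deflection of 4 cm are illustrative values.

```python
def work_along_x(force, x_end, n=1000):
    """Midpoint sum of force(x) * dx from x = 0 to x = x_end."""
    dx = x_end / n
    return sum(force((i + 0.5) * dx) for i in range(n)) * dx

D = 500.0     # spring constant in N/m
x_max = 0.04  # deflection in m

numeric = work_along_x(lambda x: D * x, x_max)
analytic = 0.5 * D * x_max**2

print(numeric, analytic)  # both close to 0.4 J
```

For a force that grows linearly with x, the midpoint sum agrees with ½Dx² up to rounding; for other force laws the agreement improves as n is increased.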
2.3.1.3 Kinetic Energy
Kinetic energy is stored acceleration work. In the case of acceleration, it is not the distance covered that is important, but the velocity achieved. With the help of the definition of the velocity v = dx/dt and Newton’s second law F = m · dv/dt, the integral over the distance dx for calculating the work can be rewritten into an integral over the velocity dv:
$$W = \int F\, dx = \int m\, \frac{dv}{dt}\, v\, dt = \int m v\, dv = \frac{1}{2} m v^2. \tag{2.92}$$
The kinetic energy is therefore
$$E_{\mathrm{kin}} = \frac{1}{2} m v^2. \tag{2.93}$$
Exercise: Energy Conversion in a Rubber Ball
A stationary rubber ball is dropped from a height h, hits the ground and bounces back up.
1. Which forms of energy are involved in the course of such a bouncing process?
2. After each bounce, the ball reaches a slightly lower height. How can this observation be reconciled with the conservation of energy?
Solution
1. The ball starts from rest. All the energy is therefore stored as potential energy. During the fall, more and more energy is converted into kinetic energy until the ball reaches the ground with its lower end. From now on, the ball is decelerated by the ground and thereby compressed, i.e. deformed. The kinetic energy is thus converted into elastic energy. After the ball has reached the speed v = 0 at its lowest point, the ball expands again and is accelerated upwards, i.e. elastic energy is converted back into kinetic energy until the ball bounces off the ground. From now on, the ball flies upwards, while at the same time its speed decreases, i.e. kinetic energy is converted into potential energy until the ball has reached the highest point of its trajectory.
2. Air friction occurs during the movement of the ball, and energy is also converted into heat energy and sound energy during elastic deformation. All these processes are irreversible, i.e. these forms of energy cannot be converted back into mechanical energy.
2.3.1.4 Power
The physical concept of power P denotes the ability to perform a certain amount of work W in a time t:
$$P = \frac{W}{t}, \quad [P] = \frac{\mathrm{J}}{\mathrm{s}} = \mathrm{W}\ (\text{Watt}). \tag{2.94}$$
The faster one has to perform a certain work, the greater is the power required for this. Here it is assumed that the work is performed at a constant rate during the time t; otherwise one must define the instantaneous power by
$$P(t) = \frac{dW}{dt}. \tag{2.95}$$
The formula (2.95) can be rewritten by expressing the work as $dW = \vec F \cdot d\vec r$ and then inserting the velocity $\vec v = d\vec r / dt$:
$$P = \frac{dW}{dt} = \frac{\vec F \cdot d\vec r}{dt} = \vec F \cdot \vec v. \tag{2.96}$$
When interpreting this formula, we have to imagine a body moving with velocity $\vec v$ and a force acting on it along the direction of the velocity. This force can, for example, further accelerate a car that is already moving. From (2.96) it follows that the faster the car is moving, the more power the motor must supply for the same acceleration a = F/m.
Examination Task: Elevator Performance
An elevator lifts a mass of m = 5000 kg by a height h = 20 m. The motor has a power of P = 20 kW. How long does it take? Assume that the power is constant during the lift. Use g = 10 m/s².
Solution
We solve formula (2.94) for the time and insert the lifting work W = mgh:
$$t = \frac{mgh}{P} = \frac{5000\,\mathrm{kg} \cdot 10\,\mathrm{m/s^2} \cdot 20\,\mathrm{m}}{20\,000\,\mathrm{W}} = 50\,\mathrm{s}. \tag{2.97}$$
The lift takes 50 s.
2.3.1.5 Potential
The forms of energy known so far depend on the properties of the object under consideration, for example, the potential energy Epot = mgz depends on the mass of the object. This is in contrast to the aim of physics to represent physical relationships as generally as possible. Therefore, the concept of the potential $U(\vec r)$ has been developed in physics. The potential no longer depends on the properties of the object, but is only a property of space. An important example is the electrostatic potential, which is discussed in Sect. 6.3. At this point, we will briefly discuss the gravitational potential on Earth:
$$U(\vec r) = \frac{E_{\mathrm{pot}}}{m} = gz, \quad [U] = \frac{\mathrm{J}}{\mathrm{kg}}. \tag{2.98}$$
The gravitational potential is thus independent of the mass m and increases proportionally with the height z. Places with the same potential value are called equipotential lines or equipotential surfaces and are thus places that have the same height. On a map, these are the contour lines. The elevation profile of a landscape is thus equivalent to the potential landscape. From the potential $U(\vec{r})$, the force $\vec{F}(\vec{r})$, which acts on an object with mass m, can be calculated in general:
\[ \vec{F}(\vec{r}) = -m \begin{pmatrix} \partial U/\partial x \\ \partial U/\partial y \\ \partial U/\partial z \end{pmatrix}. \tag{2.99} \]
Up to the factor $-m$, the components of the force vector $\vec{F}(\vec{r})$ are the partial derivatives of the potential with respect to the respective axis. In the example of the gravitational potential one obtains the usual expression
\[ \vec{F}(\vec{r}) = -mg \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{2.100} \]
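The relation (2.99) between potential and force can be illustrated numerically. The following sketch approximates the partial derivatives by central differences; the function names, the step size and the test position are assumptions made for this illustration only.

```python
def gravitational_potential(x, y, z, g=10.0):
    """Gravitational potential near the Earth's surface, U = g * z, in J/kg."""
    return g * z


def force_from_potential(potential, position, mass, step=1e-6):
    """Approximate F = -m * grad U with central differences along each axis."""
    force = []
    for axis in range(3):
        forward = list(position)
        backward = list(position)
        forward[axis] += step
        backward[axis] -= step
        derivative = (potential(*forward) - potential(*backward)) / (2 * step)
        force.append(-mass * derivative)
    return tuple(force)


# A body of mass 2 kg at an arbitrary position (assumed values).
f_gravity = force_from_potential(gravitational_potential, (1.0, 2.0, 3.0), mass=2.0)
print(f_gravity)  # approximately (0, 0, -m*g): only a z-component remains
```

The same helper could be applied to any other potential that is given as a function of x, y and z.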
2.3.1.6 Outlook: General Definition of Work
If an object is moved by a position-dependent force $\vec{F}(\vec{r})$ on a path from the starting point $\vec{r}_1$ to a target point $\vec{r}_2$, the work W is performed:
\[ W = \int_{\vec{r}_1}^{\vec{r}_2} \vec{F}(\vec{r}) \cdot d\vec{r}. \tag{2.101} \]
Mathematically, in (2.101) the path is divided into tiny straight pieces $d\vec{r}$, on each of which the force $\vec{F}(\vec{r})$ can be assumed to be constant (Sect. 6.3). The total work is then the sum or integral of the work on the individual pieces of the path. This integral is called a path integral, since it integrates along a path.
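The idea of splitting the path into small straight pieces translates directly into a numerical sum. The sketch below is an illustration under stated assumptions: the spring force with D = 200 N/m and the straight path along the x-axis are example choices, not taken from the text. For this force the work done by the spring when it is stretched by x is $-\frac{1}{2}Dx^2$, which the sum should reproduce.

```python
def work_along_path(force, path_points):
    """Sum F(r) . dr over the straight pieces between consecutive path points."""
    total_work = 0.0
    for start, end in zip(path_points, path_points[1:]):
        # Evaluate the force in the middle of the piece, where it is taken as constant.
        midpoint = tuple((a + b) / 2 for a, b in zip(start, end))
        step = tuple(b - a for a, b in zip(start, end))
        f = force(midpoint)
        total_work += sum(fc * dc for fc, dc in zip(f, step))
    return total_work


def spring_force(position, spring_constant=200.0):
    """Hooke's law along x (assumed example force), F = -D * x."""
    x, _, _ = position
    return (-spring_constant * x, 0.0, 0.0)


n_pieces = 1000
x_end = 0.1  # stretch the spring by 10 cm
path = [(i * x_end / n_pieces, 0.0, 0.0) for i in range(n_pieces + 1)]
work = work_along_path(spring_force, path)
print(f"Work done by the spring force: {work:.4f} J")
```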
2.3.2 Straight Motion: Momentum and Collisions
For a body with mass m and velocity $\vec{v}$ the momentum $\vec{p}$ is defined as
\[ \vec{p} = m\vec{v}, \qquad [p] = \frac{\mathrm{kg\,m}}{\mathrm{s}}. \tag{2.102} \]
Law of Conservation of Momentum
The total momentum in a closed system is constant.

In this context, the system is considered to be closed if it does not exchange any momentum with the environment, i.e. if no forces act on the system from outside and there are no collisions with bodies that do not belong to the system. Again, the system can consist of many individual systems or objects with individual momenta $\vec{p}_i$. The total momentum as the sum of the individual momenta is constant:
\[ \vec{p}_\mathrm{tot} = \sum_i \vec{p}_i = \mathrm{const.} \tag{2.103} \]
However, the momentum of a single object of the system can change:
\[ \frac{d\vec{p}}{dt} = \frac{d}{dt}\left(m\vec{v}\right) = m\vec{a} = \vec{F}. \tag{2.104} \]
The change of momentum is thus given by the acting force, if the mass of the body is constant in time. Combining Eq. (2.104) with the reaction principle (actio = reactio), we can derive the conservation of momentum. To do this, consider a system consisting of two objects. The two objects act on each other for a time Δt with the forces $\vec{F}_1 = -\vec{F}_2$. The change of momentum of the two objects is thus
\[ \Delta\vec{p}_1 = \vec{F}_1 \Delta t = -\vec{F}_2 \Delta t = -\Delta\vec{p}_2. \tag{2.105} \]
Thus, the change in the total momentum is the sum of the two individual momentum changes:
\[ \Delta\vec{p}_\mathrm{tot} = \Delta\vec{p}_1 + \Delta\vec{p}_2 = 0. \tag{2.106} \]
As expected, the total momentum does not change. Reaction principle and conservation of momentum are thus closely related. Using the example just calculated, we can also see that, unlike total momentum, total velocity is not a conserved quantity. Let us assume that the two objects have the masses $m_1$ and $m_2$. The change of the total velocity as the sum of the individual velocities is then
\[ \Delta\vec{v}_\mathrm{tot} = \Delta\vec{v}_1 + \Delta\vec{v}_2 = \frac{\Delta\vec{p}_1}{m_1} + \frac{\Delta\vec{p}_2}{m_2}. \tag{2.107} \]
This expression is generally non-zero, except in the special case when the two masses are equal. Otherwise, the velocity of the body with the smaller mass changes more than that of the body with the larger mass.

Written Test: The Rocket
The propulsion of a rocket is a good example of the conservation of momentum. Here, the rocket, which is initially at rest and has an original total mass M, ejects fuel of mass m backwards with velocity $-u$ and thus accelerates forwards. The total mass of the rocket is reduced by the mass of the fuel. What is the velocity v of the rocket after the propellant has been ejected? Assume that all the fuel is ejected at once.

Solution
Since the rocket is at rest before the propellant is ejected, the total momentum of the system (rocket + propellant) is
\[ p_\mathrm{tot} = 0. \tag{2.108} \]
Because of conservation of momentum, this must still be true after the propellant is ejected, where the subsystem of the propellant has momentum $p_T = -mu$ and the rest of the rocket has momentum $p_R = (M - m)v$. So
\[ p_\mathrm{tot} = p_T + p_R = -mu + (M - m)v = 0. \tag{2.109} \]
We can solve this equation for v:
\[ v = \frac{m}{M - m}\, u. \tag{2.110} \]
If nothing is ejected (m = 0), the speed of the rocket is also zero. An interesting case is the one where almost the entire mass is ejected as propellant, i.e. the case m → M. In this case, the velocity of the rocket tends to infinity. One can see from this that rockets can reach very high velocities when a significant portion of the original mass is ejected as propellant. In reality, of course, the fuel is not expelled all at once, but continuously over a longer period of time. This case is a little more difficult to calculate and leads to the result
\[ v = \ln\left(\frac{M}{M - m}\right) u. \tag{2.111} \]
In this case, too, the velocity v diverges in the limit m → M, but much more slowly than if the propellant is ejected all at once.
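The two results (2.110) and (2.111) can be compared numerically. The following is a small sketch; the total mass of 1000 kg and the exhaust speed of 2000 m/s are assumed example values, and the function names are made up for this illustration.

```python
import math


def speed_single_ejection(total_mass, fuel_mass, exhaust_speed):
    """Rocket speed if all fuel is ejected at once, Eq. (2.110)."""
    return fuel_mass / (total_mass - fuel_mass) * exhaust_speed


def speed_continuous_ejection(total_mass, fuel_mass, exhaust_speed):
    """Rocket speed if the fuel is ejected continuously, Eq. (2.111)."""
    return math.log(total_mass / (total_mass - fuel_mass)) * exhaust_speed


total_mass = 1000.0     # kg, assumed
exhaust_speed = 2000.0  # m/s, assumed
for fuel_fraction in (0.1, 0.5, 0.9, 0.99):
    fuel_mass = fuel_fraction * total_mass
    v_once = speed_single_ejection(total_mass, fuel_mass, exhaust_speed)
    v_cont = speed_continuous_ejection(total_mass, fuel_mass, exhaust_speed)
    print(f"m/M = {fuel_fraction:4.2f}: at once {v_once:10.1f} m/s, continuous {v_cont:8.1f} m/s")
```

For every fuel fraction the continuous result stays below the single-ejection result, and the gap widens strongly as m approaches M.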
2.3.2.1 Central Collision
If a force $\vec{F}$ acts on a body for a short time Δt, this is also referred to as an impulse, with an associated change in momentum
\[ \Delta\vec{p} = \vec{F}\,\Delta t. \tag{2.112} \]
This happens, for example, when a billiard ball is hit with the cue or with a second ball. In the following, we will take a closer look at collisions between bodies. In general, such collisions can be very complicated to describe if the bodies have an irregular shape, do not collide at a central point and additionally rotate. We will neglect these complications here: in the following we will deal with the central collision of two non-rotating bodies with masses $m_1$ and $m_2$ and velocities $v_1$ and $v_2$ (see Fig. 2.12). We are looking for the velocities $v_1'$ and $v_2'$ of the two bodies after the collision. There are two extreme cases, the perfectly elastic collision and the perfectly inelastic collision. Most real collisions are a mixture of elastic and inelastic collisions.

In the case of an elastic collision, it is assumed that the kinetic energy is conserved, i.e.
\[ \frac{1}{2} m_1 v_1^2 + \frac{1}{2} m_2 v_2^2 = \frac{1}{2} m_1 v_1'^2 + \frac{1}{2} m_2 v_2'^2. \tag{2.113} \]
This is the case if during the impact no energy is converted into the deformation of the bodies, into heat, sound energy or similar. In addition, the conservation of momentum applies to the impact:
\[ m_1 v_1 + m_2 v_2 = m_1 v_1' + m_2 v_2'. \tag{2.114} \]
From Eqs. (2.113) and (2.114) one can calculate the velocities after the impact:
\[ v_1' = \frac{m_1 - m_2}{m_1 + m_2}\, v_1 + \frac{2 m_2}{m_1 + m_2}\, v_2, \tag{2.115} \]
\[ v_2' = \frac{2 m_1}{m_1 + m_2}\, v_1 + \frac{m_2 - m_1}{m_1 + m_2}\, v_2. \tag{2.116} \]

Fig. 2.12 Two bodies with masses $m_1$ and $m_2$ and velocities $v_1$ and $v_2$ collide centrally, i.e. they meet on the axis connecting their centres of mass
Here one has to consider the signs of $v_1$ and $v_2$. If the velocity axis v is selected as in Fig. 2.12, then $v_1 > 0$ and $v_2 < 0$. The signs of $v_1'$ and $v_2'$ then also refer to the selected velocity axis.

In the case of an inelastic collision, it is assumed that the two bodies form a unit after the impact, which moves together. This is the case, for example, when one body gets stuck in the second body, just as a projectile gets stuck in a block of wood during a shot. The two velocities after the impact are thus identical; we denote them here by $v'$:
\[ v_1' = v_2' \equiv v'. \tag{2.117} \]
From the conservation of momentum according to (2.114) and the condition of equal final velocity (2.117) one can then calculate $v'$:
\[ v' = \frac{m_1 v_1 + m_2 v_2}{m_1 + m_2}. \tag{2.118} \]
Again, one must note the signs of the individual velocities. You may have noticed that in order to derive (2.118), we did not use the condition (2.113) that the kinetic energy is conserved. In fact, the kinetic energy is not conserved in an inelastic collision. This is because an inelastic collision always involves a deformation of the original bodies. The energy required for this comes from the kinetic energy of the bodies before the impact.

Written Test: ICE and Football
An ICE (intercity express train) with a speed of v = 100 km/h hits a stationary football head-on. The mass of the football is much smaller than the mass of the ICE. What is the speed of the football after the (perfectly) elastic impact?

Solution
We denote the mass of the ICE by $m_1$ and its velocity by $v_1$. Accordingly, the mass of the football is $m_2$ and its velocity before the impact is $v_2 = 0$. The velocity of the football after the impact is then given by Eq. (2.116):
\[ v_2' = \frac{2 m_1}{m_1 + m_2}\, v_1 + \frac{m_2 - m_1}{m_1 + m_2}\, v_2 = \frac{2 m_1}{m_1 + m_2}\, v_1. \tag{2.119} \]
The masses of the football and the ICE are not explicitly given, but we know that $m_1 \gg m_2$. For this reason, we can make the approximation $m_1 + m_2 \approx m_1$. This gives
\[ v_2' \approx \frac{2 m_1}{m_1}\, v_1 = 2 v_1. \tag{2.120} \]
After the impact, the football has twice the speed of the ICE, i.e. $v_2' = 200$ km/h. By the way, the ICE is hardly slowed down at all by the impact with the football because of its large mass.
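The collision formulas (2.115), (2.116) and (2.118) are easy to try out numerically. The sketch below is illustrative only: the train mass of 400 t and the ball mass of 0.4 kg are assumed values, since the task deliberately does not specify them, and the function names are invented for this example.

```python
def elastic_collision(m1, v1, m2, v2):
    """Velocities after a perfectly elastic central collision, Eqs. (2.115) and (2.116)."""
    total = m1 + m2
    v1_after = (m1 - m2) / total * v1 + 2 * m2 / total * v2
    v2_after = 2 * m1 / total * v1 + (m2 - m1) / total * v2
    return v1_after, v2_after


def inelastic_collision(m1, v1, m2, v2):
    """Common velocity after a perfectly inelastic central collision, Eq. (2.118)."""
    return (m1 * v1 + m2 * v2) / (m1 + m2)


def kinetic_energy(m, v):
    """Kinetic energy of a mass point."""
    return 0.5 * m * v ** 2


# ICE and football with assumed masses; velocities in km/h.
train_after, ball_after = elastic_collision(m1=4.0e5, v1=100.0, m2=0.4, v2=0.0)
print(f"football: {ball_after:.3f} km/h, train: {train_after:.5f} km/h")

# Inelastic collision of 2 kg at 3 m/s with 1 kg at rest: kinetic energy is lost.
v_common = inelastic_collision(2.0, 3.0, 1.0, 0.0)
energy_before = kinetic_energy(2.0, 3.0) + kinetic_energy(1.0, 0.0)
energy_after = kinetic_energy(3.0, v_common)
print(f"common velocity: {v_common} m/s, energy lost: {energy_before - energy_after} J")
```

With equal masses, the elastic formulas simply exchange the two velocities, which is a useful quick check of an implementation.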
2.3.3 Circular Motion: Angular Momentum and Torque
The last conserved quantity we address in this section is the angular momentum. For this we consider a mass point with mass m, which rotates with velocity $\vec{v}$ at distance r around an axis (the z-axis) (see Fig. 2.13a). Here the position vector $\vec{r}$ of the mass point is measured from the point on the axis which has the smallest distance from the mass point. The position vector is therefore perpendicular to the axis of rotation. The angular momentum is now defined as
\[ \vec{L} = \vec{r} \times \vec{p} = m\,\vec{r} \times \vec{v}, \qquad [L] = \frac{\mathrm{kg\,m^2}}{\mathrm{s}} = \mathrm{N\,m\,s}. \tag{2.121} \]
Since in circular motion the cross product $\vec{r} \times \vec{v}$ points in the direction of the angular velocity $\vec{\omega}$, the angular momentum $\vec{L}$ is parallel to $\vec{\omega}$ and thus also points along the z-axis. It has the absolute value
\[ L = |\vec{L}| = m r v = m r^2 \omega. \tag{2.122} \]
At this point we want to introduce the torque $\vec{M}$ as the origin of rotational motion:
\[ \vec{M} = \vec{r} \times \vec{F}, \qquad [M] = \mathrm{N\,m}. \tag{2.123} \]

Fig. 2.13 (a) Definition of the angular momentum $\vec{L}$ for the circular motion of a mass point m with velocity $\vec{v}$ at a distance r around the z-axis, (b) Definition of the torque $\vec{M}$

Fig. 2.14 (a) Mr. Slama is handed a rotating wheel on the turntable. (b) As soon as Mr. Slama turns the wheel downwards, the turntable begins to rotate

Due to the cross product $\vec{r} \times \vec{F}$, a torque is generated whenever a force acts at a distance r from the axis of rotation and this force has a component in the direction of rotation (Fig. 2.13b). This accelerates the mass point to a faster orbital velocity. It can be shown that, analogous to (2.104),
\[ \frac{d\vec{L}}{dt} = \vec{M}, \tag{2.124} \]
i.e., the angular momentum of a body changes whenever a torque acts. This relationship is also known as the angular momentum theorem. In closed systems, the conservation of angular momentum applies.

Law of Conservation of Angular Momentum
The total angular momentum in a closed system is constant.

In this context, the system is considered closed if it does not exchange angular momentum with the environment, i.e. if no torque acts on the system from outside and no non-central collisions occur with bodies that do not belong to the system. Again, the system can consist of many individual systems or objects with individual angular momenta $\vec{L}_i$. The total angular momentum as the sum of the individual angular momenta is constant:
\[ \vec{L}_\mathrm{tot} = \sum_i \vec{L}_i = \mathrm{const.} \tag{2.125} \]
However, the angular momentum of a single object of the system can change if the angular momentum of a second object of the system changes in exactly the opposite way. We will illustrate this with the following example:

Exercise: On the Turntable
During a demonstration experiment in the lecture hall, Mr. Slama sits on an initially stationary turntable and is handed a rotating wheel by Mr. Pettke (see Fig. 2.14a). Nothing happens yet. What happens when Mr. Slama turns the wheel downward, as in Fig. 2.14b?

Solution
The total system consists of the turntable with Mr. Slama and the rotating wheel. At the beginning the turntable does not rotate; the total angular momentum is therefore given by the angular momentum $\vec{L}_1$ of the rotating wheel. If Mr. Slama rotates the wheel downward, the angular momentum is also flipped downward to $-\vec{L}_1$. The angular momentum of the wheel has thus changed by
\[ \Delta\vec{L}_1 = -2\vec{L}_1. \tag{2.126} \]
Due to the conservation of angular momentum of the entire system, the angular momentum of the turntable must change at the same time by
\[ \Delta\vec{L}_2 = -\Delta\vec{L}_1 = 2\vec{L}_1. \tag{2.127} \]
So now the whole turntable including Mr. Slama is turning in the same direction as the rotating wheel did before.
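The bookkeeping of the turntable exercise can be written down in a few lines. This is a sketch only: the numerical value of the wheel's angular momentum (5 kg m²/s along the z-axis) is an assumed example, and the function name is made up for this illustration.

```python
def turntable_angular_momentum(wheel_before, wheel_after):
    """Angular momentum taken up by a turntable that is initially at rest.

    The total angular momentum equals the wheel's initial angular momentum and
    is conserved, so L_turntable = L_total - L_wheel_after.
    """
    return tuple(before - after for before, after in zip(wheel_before, wheel_after))


wheel_before = (0.0, 0.0, 5.0)                      # wheel axis points upward (assumed value)
wheel_after = tuple(-component for component in wheel_before)  # wheel flipped downward
turntable = turntable_angular_momentum(wheel_before, wheel_after)
print(turntable)  # twice the original angular momentum of the wheel, as in Eq. (2.127)
```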
2.4 Statics and Equilibria
Statics is a branch of engineering mechanics and important for all engineering and mechanical professions. Basically, one wants to know what forces act when the objects involved do not move. In the following, we will first discuss the statics of translation and then the statics of rotation.
2.4.1 Statics of Translation
According to (2.57), the centre of mass of a body remains at rest if the total force acting on it is zero. The total force is here the vector sum of all individual forces $\vec{F}_i$ acting on the body. Mathematically expressed, the condition for the statics of translation is thus
\[ \vec{F}_\mathrm{tot} = \sum_{i=1}^{N} \vec{F}_i = 0. \tag{2.128} \]
This condition is also called equilibrium of forces. This means that if a force is pulling the body to the left, there must be an equally strong force pulling to the right so that the body remains at rest. You can think of this as a tug of war: if the left team pulls with the same force as the right team, the forces balance out and the rope stays at rest. We will apply this principle in the following example:

Fig. 2.15 (a) A tensioned suspension rope is loaded with a weight and sags downwards. (b) The vector sum of the forces $\vec{F}_g$, $\vec{F}_l$ and $\vec{F}_r$ is zero. Thus there is no acceleration
2.4.1.1 Suspension of a Lamp from the Suspension Cable
We stretch a suspension rope from one wall to the opposite wall and hang a weight (e.g., a lamp) from the middle of the rope (Fig. 2.15). The weight causes the rope to sag slightly. This results in an angle of the rope to the vertical of α < 90°. We want to calculate how the force F with which the rope pulls on the wall depends on the angle α. To do this, we consider the point of suspension of the weight from the suspension rope. At this point three forces act: (1) the weight of the lamp $\vec{F}_g$, (2) the force $\vec{F}_l$, with which the left wall pulls on the rope, and (3) the force $\vec{F}_r$, with which the right wall pulls on the rope. In order for the suspension point under consideration to remain at rest, the sum of the forces must be zero:
\[ \vec{F}_g + \vec{F}_l + \vec{F}_r = 0. \tag{2.129} \]
Since the forces point in different directions in space, we decompose them into their components in the horizontal x-direction and the vertical y-direction. Thus (2.129) is in vector notation
\[ \begin{pmatrix} 0 \\ -mg \end{pmatrix} + \begin{pmatrix} -F_l \sin(\alpha) \\ F_l \cos(\alpha) \end{pmatrix} + \begin{pmatrix} F_r \sin(\alpha) \\ F_r \cos(\alpha) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{2.130} \]
From the first line of (2.130),
\[ -F_l \sin(\alpha) + F_r \sin(\alpha) = 0, \tag{2.131} \]
it immediately follows that
\[ F_l = F_r. \tag{2.132} \]
The forces of the walls on the suspension point and thus also the forces with which the suspension rope pulls on the left or right wall are therefore equal. This could be expected, because the lamp is hanging in the middle of the suspension rope and therefore the situation is mirror-symmetrical with respect to the middle of the rope. We insert the result $F = F_l = F_r$ into the second line of (2.130),
\[ -mg + F \cos(\alpha) + F \cos(\alpha) = 0, \tag{2.133} \]
and solve for the force F:
\[ F = \frac{mg}{2 \cos(\alpha)}. \tag{2.134} \]
So the rope pulls with the force F on both side walls. This result is remarkable, because if you want the rope not to sag at all (this corresponds to an angle of α = 90°), the force F would have to be infinitely large. This is because one cannot compensate for the downward pointing weight force with forces at right angles to it. In practice, therefore, it is impossible to tension a suspension rope such that it does not sag. Even the rope's own weight will inevitably cause it to sag.
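How quickly the rope force (2.134) grows as the rope is tensioned more and more can be seen from a short calculation. The following sketch assumes a lamp of mass 2 kg and g = 10 m/s²; both values and the function name are illustrative choices.

```python
import math


def rope_force(mass_kg, angle_deg, g=10.0):
    """Force on each wall for a weight hanging in the middle of a rope, Eq. (2.134).

    angle_deg is the angle of the rope to the vertical.
    """
    return mass_kg * g / (2 * math.cos(math.radians(angle_deg)))


for angle in (0.0, 60.0, 80.0, 89.0, 89.9):
    print(f"alpha = {angle:5.1f} deg: F = {rope_force(2.0, angle):9.1f} N")
```

At α = 0° each half of the rope carries half of the weight; close to α = 90° the force exceeds the weight of the lamp by orders of magnitude.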
2.4.2 Statics of Rotation
A body can satisfy the condition of statics of translation and still be set in rotational motion if a torque $\vec{M}$ acts. An example of this is sketched in Fig. 2.16a, where a turntable has been suspended so that it can rotate freely about its centre of mass in the middle of the disc. At the sides of the turntable the two forces $\vec{F}_1$ and $\vec{F}_2 = -\vec{F}_1$ are acting. This could happen, for example, if someone grabs the disc on the left and right with both hands and pulls it down and up with equal force, respectively. Since in this example the sum is $\vec{F}_1 + \vec{F}_2 = 0$, there is an equilibrium of forces. The centre of mass is therefore not accelerated. In reality, the centre of mass is not accelerated even if the forces are unequal. This is because in this case the net force $\vec{F}_1 + \vec{F}_2 \neq 0$ acting on the centre of mass is balanced by a counter force on the axle, which is fixed to the wall. In both cases, the turntable is set in rotation by the action of the two forces. For the symmetric example with $\vec{r}_2 = -\vec{r}_1$ and $\vec{F}_2 = -\vec{F}_1$, the total torque is given by
\[ \vec{M}_\mathrm{tot} = \vec{M}_1 + \vec{M}_2 = \vec{r}_1 \times \vec{F}_1 + \vec{r}_2 \times \vec{F}_2 = 2\,\vec{r}_1 \times \vec{F}_1. \tag{2.135} \]

Fig. 2.16 (a) A turntable is supported at its centre of mass. Since the forces $\vec{F}_1$ and $\vec{F}_2$ cancel each other, there is no force acting on the centre of mass or the axis. Nevertheless, there is a torque $\vec{M}$, which sets the disk in rotation. (b) Derivation of the law of leverage from the statics of rotation
The two individual torques $\vec{M}_1$ and $\vec{M}_2$ make the same contribution to the total torque. Statics of rotation is only achieved when all acting torques $\vec{M}_i$ cancel each other out and the total torque is zero:
\[ \vec{M}_\mathrm{tot} = \sum_{i=1}^{N} \vec{M}_i = 0. \tag{2.136} \]
As an example, let us derive the law of leverage below.
2.4.2.1 Leverage Law
For this purpose, we consider a lever as sketched in Fig. 2.16b. At the distances $l_1$ and $l_2$ from the axis of rotation, two forces $F_1$ and $F_2$ act vertically downwards. The statics of rotation requires that the magnitudes of the corresponding torques $M_1 = l_1 F_1$ and $M_2 = l_2 F_2$, which act in opposite senses of rotation, must be equal:
\[ l_1 F_1 = l_2 F_2. \tag{2.137} \]
The force on the shorter lever arm is therefore greater than the force on the longer lever arm by the ratio of the two lengths. This is often used, for example, when lifting a heavy weight with a lever. If we interpret the lever as a seesaw and the two forces as weight forces of the masses $m_1$ and $m_2$, statics of rotation means that we balance the seesaw. In this case, using $F_i = m_i g$, we get the condition
\[ l_1 m_1 = l_2 m_2. \tag{2.138} \]
So the heavier person must sit closer to the axis of rotation, at a distance that is smaller by the ratio of the two masses.
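As a quick illustration of the seesaw condition (2.138), the sketch below computes where the second person has to sit. The masses (30 kg and 75 kg) and the distance of 2 m are assumed example values, and the function names are hypothetical.

```python
def balancing_distance(l1, m1, m2):
    """Distance l2 at which mass m2 balances mass m1 sitting at distance l1, Eq. (2.138)."""
    return l1 * m1 / m2


def net_torque(l1, m1, l2, m2, g=10.0):
    """Net torque magnitude difference of the two weight forces about the axis."""
    return l1 * m1 * g - l2 * m2 * g


# A child of 30 kg sits 2.0 m from the axis; an adult of 75 kg sits on the other side.
l2 = balancing_distance(l1=2.0, m1=30.0, m2=75.0)
print(f"The adult must sit {l2:.2f} m from the axis.")
print(f"Remaining net torque: {net_torque(2.0, 30.0, l2, 75.0):.1f} N m")
```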
2.4.2.2 Equilibria of Extended Bodies
Statics of rotation always means that a body is in equilibrium. There are different types of equilibria, which we will learn about below. For this purpose, let us consider a body of any shape which is attached to a point on the wall so that it can rotate freely. We already know that the body hangs in equilibrium when there is no total torque acting on it with respect to the axis of rotation through the point of suspension. We also know that the weight force $\vec{F}_g = m\vec{g}$ acting on the body acts at the centre of mass (com). Therefore, it matters for the torque where the com is located (see Fig. 2.17). With the distance vector $\vec{r}_\mathrm{com}$ of the com from the axis of rotation, the torque is given by
\[ \vec{M} = \vec{r}_\mathrm{com} \times \vec{F}_g. \tag{2.139} \]

Fig. 2.17 Depending on the position of the centre of gravity relative to the axis of rotation, one has (a) counterclockwise rotation, (b) clockwise rotation, (c) stable equilibrium, (d) unstable equilibrium and (e) indifferent equilibrium
If the com is located to the left or right of a vertical straight line through the axis of rotation, this inevitably leads to a torque which causes the body to rotate counterclockwise or clockwise. If, on the other hand, the com is on the vertical straight line through the axis of rotation, the torque is zero, since $\vec{r}_\mathrm{com}$ and $\vec{F}_g$ are parallel or antiparallel to each other, and for this reason the cross product of these two vectors vanishes. Equilibrium therefore prevails whenever the com lies on this vertical straight line. In this case, the type of equilibrium is further classified depending on how the system reacts to a small deflection (rotation) from equilibrium:

Stable equilibrium always exists when disturbances of the equilibrium are automatically led back to equilibrium, i.e. when a small rotation out of equilibrium leads to a torque which rotates the com back to equilibrium. This is the case when the com is located on the vertical straight line below the axis of rotation.

Unstable equilibrium refers to the situation when any small disturbance of the equilibrium leads the system away from equilibrium. This is the case when the com is on the vertical straight line above the axis of rotation. Even a minimal rotation of the com to the left or right of the straight line above the axis of rotation produces a torque that causes the body to tip over.

Indifferent equilibrium is the situation where the equilibrium is independent of the angle of rotation. This is the case when the com is exactly on the axis of rotation. In this case, the com always remains on the axis of rotation, no matter by what angle the body is rotated. The torque is therefore always zero, and the body hangs in equilibrium for all angles.
2.4.2.3 Centre of Mass
Finally, in this section we will look at how to determine the centre of mass. For a collection of individual masses $m_i$ with distance vectors $\vec{r}_i$ the centre of mass $\vec{r}_\mathrm{com}$ is the average of the distance vectors, weighted with the individual masses:
\[ \vec{r}_\mathrm{com} = \frac{1}{M} \sum_{i=1}^{N} m_i \vec{r}_i, \tag{2.140} \]
with the total mass $M = \sum m_i$. For extended bodies with continuous mass density $\rho(\vec{r})$ (= mass per volume) one decomposes the body into infinitesimally small mass elements and integrates over the whole space:
\[ \vec{r}_\mathrm{com} = \frac{1}{M} \int \rho(\vec{r})\, \vec{r}\; d^3r. \tag{2.141} \]
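Example: Calculation of the Centre of Mass
Given are three mass points $m_i$ with coordinates $x_i$, $y_i$:
\[ m_1 = 1\,\mathrm{kg} \text{ at } \vec{r}_1 = \begin{pmatrix} 4 \\ 0 \end{pmatrix}, \qquad m_2 = 2\,\mathrm{kg} \text{ at } \vec{r}_2 = \begin{pmatrix} 4 \\ 3 \end{pmatrix}, \qquad m_3 = 3\,\mathrm{kg} \text{ at } \vec{r}_3 = \begin{pmatrix} 0 \\ 4 \end{pmatrix}. \]
According to (2.140), the centre of mass is given by
\[ \vec{r}_\mathrm{com} = \frac{1}{6\,\mathrm{kg}} \left[ 1\,\mathrm{kg} \begin{pmatrix} 4 \\ 0 \end{pmatrix} + 2\,\mathrm{kg} \begin{pmatrix} 4 \\ 3 \end{pmatrix} + 3\,\mathrm{kg} \begin{pmatrix} 0 \\ 4 \end{pmatrix} \right] = \frac{1}{6} \begin{pmatrix} 4 + 8 + 0 \\ 0 + 6 + 12 \end{pmatrix} = \frac{1}{6} \begin{pmatrix} 12 \\ 18 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}. \]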
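The weighted average (2.140) can be reproduced in a few lines of Python. This is a minimal sketch using the three mass points of the example; the function name `centre_of_mass` is chosen here for illustration.

```python
def centre_of_mass(masses, positions):
    """Mass-weighted average of the position vectors, Eq. (2.140)."""
    total_mass = sum(masses)
    dimension = len(positions[0])
    return tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(dimension)
    )


masses = [1.0, 2.0, 3.0]                          # kg
positions = [(4.0, 0.0), (4.0, 3.0), (0.0, 4.0)]  # coordinates (x_i, y_i)
print(centre_of_mass(masses, positions))
```

The same function works unchanged for three-dimensional position vectors, since it only loops over the components that are present.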
2.5 Rotation of Extended Bodies
So far we have always described the motion of bodies by the motion of the centre of mass. This works for arbitrary translations of the body. For rotations, this simplification is legitimate only if the axis of rotation is far away compared to the extension of the body, because then one can approximate the body as a mass point. But if one considers rotations of bodies about themselves or about axes directly on the body, one must take into account the shape of the body. Some parts of the body then have a smaller distance and others a greater distance from the axis of rotation. The question of how an acting torque $\vec{M}$ affects the angular velocity $\vec{\omega}$ of a body thus depends on the shape of the body. Practically, however, one can define a physical quantity for each body, the so-called moment of inertia J, so that one can write down a corresponding law for rotations quite analogously to Newton's second law $\vec{F} = m\vec{a}$:
\[ \vec{M} = J \vec{\alpha}, \tag{2.142} \]
with the angular acceleration $\vec{\alpha} = \dot{\vec{\omega}}$ known from formula (2.47). Comparing (2.142) with Newton's second law, one can identify corresponding quantities between rotation and translation. The torque corresponds to the force, the angular acceleration to the acceleration, and the moment of inertia takes over the role of the mass.
2.5.1 Moment of Inertia
The moment of inertia describes the inertia of a body with respect to a torque. The moment of inertia of a point mass m is given by
\[ J = m r_\perp^2, \qquad [J] = \mathrm{kg\,m^2}. \tag{2.143} \]
Here $r_\perp$ is the (minimum) distance of the mass point from the axis of rotation, which results from a perpendicular projection of the mass point onto the axis. The moment of inertia is additive, i.e. the moments of inertia $J_i$ of several mass points $m_i$ can be added,
\[ J = \sum_{i=1}^{N} m_i r_{i\perp}^2, \tag{2.144} \]
as shown in Fig. 2.18a using the example of three mass points.

Fig. 2.18 Moment of inertia (a) of three mass points rotating about an axis, (b) of a sphere when rotating about an axis through its centre of gravity, (c) of a solid cylinder when rotating about the cylinder axis a, i.e. through the centre of gravity of the cylinder. In (c), the rotation about a second axis a′ is also shown, which is parallel to the axis a

For a continuous mass distribution with mass density $\rho(\vec{r})$ one determines the moment of inertia by integration over the mass elements:
\[ J = \int \rho(\vec{r})\, r_\perp^2\; d^3r. \tag{2.145} \]
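The step from the sum (2.144) to the integral (2.145) can be made concrete numerically. The sketch below decomposes a homogeneous solid disc into thin concentric rings, treats each ring as a mass element at a fixed distance from the axis, and adds up the contributions. The decomposition into rings, the values m = 100 kg and R = 1 m, and the function name are assumptions of this illustration. For a homogeneous disc rotating about its symmetry axis, the sum should approach the standard result $J = \frac{1}{2} m R^2$.

```python
import math


def disc_moment_of_inertia(mass, radius, rings=1000):
    """Approximate J of a homogeneous disc by summing dm * r^2 over thin rings."""
    area_density = mass / (math.pi * radius ** 2)  # mass per area
    ring_width = radius / rings
    total = 0.0
    for i in range(rings):
        r_mid = (i + 0.5) * ring_width             # distance of the ring from the axis
        ring_mass = area_density * 2 * math.pi * r_mid * ring_width
        total += ring_mass * r_mid ** 2            # contribution m_i * r_i^2
    return total


j_numeric = disc_moment_of_inertia(mass=100.0, radius=1.0)
print(f"numerical sum: {j_numeric:.4f} kg m^2, exact: {0.5 * 100.0 * 1.0 ** 2:.4f} kg m^2")
```

Increasing the number of rings brings the sum closer to the value of the integral.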
2.5.1.1 Moments of Inertia of Special Bodies and Steiner's Theorem
For certain symmetrical bodies, one can solve the integral (2.145) and obtain a formula for the moment of inertia, which depends on the mass of the body and its geometric shape. Here it is assumed that the axis of rotation always passes through the centre of mass of the body. For a solid sphere with mass m and radius R rotating about an axis through its centre, the moment of inertia is
\[ J = \frac{2}{5} m R^2. \tag{2.146} \]
For a solid cylinder with mass m and radius R rotating about its cylinder axis, the moment of inertia is
\[ J = \frac{1}{2} m R^2. \tag{2.147} \]
Since for the moment of inertia in (2.145) only the radial distance of the mass elements from the axis of rotation plays a role, it is independent of the height of the cylinder. In tables (e.g., Wikipedia) one can look up the moments of inertia of many other bodies. Often, however, such a body does not rotate about an axis a through the centre of gravity, but about an axis a′ parallel to it at a distance d from the old axis. This case is sketched in Fig. 2.18c. Steiner's theorem allows us to calculate the new moment of inertia $J_{a'}$ from the old moment of inertia $J_a$:
\[ J_{a'} = J_a + m d^2. \tag{2.148} \]
Since the expression $md^2$ is always positive, the moment of inertia increases quadratically with the distance d from the centre of gravity when the axis of rotation is shifted outward.

Written Test: Moment of Inertia and Angular Velocity
A torque of M = 20 Nm acts on a cylindrical rotating disc (the axis of rotation passes through the centre of gravity of the disc) with radius R = 1 m and mass m = 100 kg for a time of t = 10 s. What is the angular velocity achieved?

Solution
The moment of inertia of the disc is
\[ J = \frac{1}{2} m R^2 = \frac{1}{2} \cdot 100\,\mathrm{kg} \cdot (1\,\mathrm{m})^2 = 50\,\mathrm{kg\,m^2}. \tag{2.149} \]
The angular acceleration α is according to (2.142)
\[ \alpha = \frac{M}{J} = \frac{20\,\mathrm{N\,m}}{50\,\mathrm{kg\,m^2}} = 0.4\,\mathrm{s^{-2}}. \tag{2.150} \]
It is constant over the acceleration time t. This results in the angular velocity
\[ \omega = \alpha \cdot t = 0.4\,\mathrm{s^{-2}} \cdot 10\,\mathrm{s} = 4\,\mathrm{s^{-1}}. \tag{2.151} \]

2.5.2 Rotational Energy
If a body rotates around an axis, the individual mass elements of the body move at different speeds. They therefore all have different kinetic energy depending on their distance from the axis of rotation. The total kinetic energy of all mass points is then called rotational energy. Analogous to the kinetic energy $E_\mathrm{kin} = \frac{1}{2} m v^2$ of a mass point with mass m and velocity v, the rotational energy of the rotating body can be expressed by the formula
\[ E_\mathrm{rot} = \frac{1}{2} J \omega^2. \tag{2.152} \]
This formula has the same form as $E_\mathrm{kin} = \frac{1}{2} m v^2$, if we again identify the moment of inertia with the mass and the angular velocity with the speed.
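The rotating disc from the written test in Sect. 2.5.1.1 (M = 20 Nm, R = 1 m, m = 100 kg, t = 10 s) can serve as a numerical example for the rotational energy as well. The following sketch recomputes J, α and ω and then evaluates (2.152); the function names are chosen for this illustration only.

```python
def solid_cylinder_inertia(mass, radius):
    """Moment of inertia of a solid cylinder about its axis, Eq. (2.147)."""
    return 0.5 * mass * radius ** 2


def angular_velocity_after(torque, inertia, duration):
    """Angular velocity reached from rest under a constant torque, using M = J * alpha."""
    angular_acceleration = torque / inertia
    return angular_acceleration * duration


def rotational_energy(inertia, omega):
    """Rotational energy, Eq. (2.152)."""
    return 0.5 * inertia * omega ** 2


inertia = solid_cylinder_inertia(mass=100.0, radius=1.0)
omega = angular_velocity_after(torque=20.0, inertia=inertia, duration=10.0)
energy = rotational_energy(inertia, omega)
print(f"J = {inertia} kg m^2, omega = {omega} 1/s, E_rot = {energy} J")
```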
Exercise Task: Rolling Cans
Several cans with the same radius R and the same mass m, but (due to the mass distribution in the can) different moments of inertia J roll down an inclined plane. In what order do the cans arrive at the bottom?

Solution
Here you can argue with conservation of energy. All cans start from rest with the same potential energy $E_\mathrm{pot} = mgh$. When a can rolls down the plane, on the one hand its centre of gravity moves along the plane, on the other hand the can rotates around its centre of gravity at the same time. The potential energy of the can is therefore converted into both kinetic translational energy and rotational energy:
\[ E_\mathrm{pot} = \frac{1}{2} m v^2 + \frac{1}{2} J \omega^2. \tag{2.153} \]
The can that reaches the highest velocity arrives first. Therefore, we express the angular velocity by the velocity using the relation ω = v/R,
\[ E_\mathrm{pot} = \frac{1}{2} m v^2 + \frac{1}{2} J \frac{v^2}{R^2} = \frac{1}{2} \left( m + \frac{J}{R^2} \right) v^2, \tag{2.154} \]
and solve for the speed:
\[ v = \left( \frac{2 E_\mathrm{pot}}{m + \frac{J}{R^2}} \right)^{1/2}. \tag{2.155} \]
For bodies with vanishing moment of inertia J = 0 we get the already known formula for freely falling bodies: $v = \sqrt{2gh}$. The larger the moment of inertia, the smaller the velocity becomes. This is due to the fact that then the proportion of potential energy which is converted into rotational energy increases, and thus less energy remains for the kinetic translational energy.
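The ordering of the cans can be made explicit with Eq. (2.155). In the sketch below, the can mass, radius and starting height are assumed example values. The three mass distributions are idealisations chosen for this illustration: J = 0 (all mass on the axis), a solid cylinder with $J = \frac{1}{2} m R^2$ from (2.147), and a thin-walled hollow cylinder, for which the standard result $J = m R^2$ is used.

```python
import math


def rolling_speed(mass, radius, inertia, height, g=10.0):
    """Final speed of a body rolling down from a given height, Eq. (2.155)."""
    potential_energy = mass * g * height
    return math.sqrt(2 * potential_energy / (mass + inertia / radius ** 2))


mass, radius, height = 0.4, 0.03, 1.0  # kg, m, m (assumed values)
cans = {
    "mass on the axis (J = 0)": 0.0,
    "solid cylinder": 0.5 * mass * radius ** 2,
    "thin-walled hollow cylinder": mass * radius ** 2,
}
for name, inertia in sorted(cans.items(), key=lambda item: item[1]):
    print(f"{name:28s} v = {rolling_speed(mass, radius, inertia, height):.3f} m/s")
```

The can with the smallest moment of inertia reaches the highest speed and therefore arrives first.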
2.6 Mechanics: Compact
Here again the most important formulas for mechanics are summarized:

Differential relationship between position s, velocity v and acceleration a (2.1) and (2.2):
\[ \vec{v}(t) = \frac{d\vec{r}}{dt}, \qquad \vec{a}(t) = \frac{d\vec{v}}{dt} = \frac{d^2\vec{r}}{dt^2}. \]
The same in integral form according to (2.9) and (2.10):
\[ \vec{v}(t) = \int_{t_0}^{t} \vec{a}(t')\,dt' + \vec{v}_0, \qquad \vec{r}(t) = \int_{t_0}^{t} \vec{v}(t')\,dt' + \vec{r}_0. \]
Uniform circular motion (2.42):
\[ \vec{r}(t) = R \begin{pmatrix} \cos(\omega t) \\ \sin(\omega t) \\ 0 \end{pmatrix}. \]
Velocity in uniform circular motion (2.44):
\[ v = \omega R. \]
Centripetal acceleration in uniform circular motion (2.46):
\[ a_Z = \omega^2 R. \]
Action principle/Newton's second law (2.56):
\[ \vec{F} = m \cdot \vec{a}. \]
The same applies to rotations (2.142):
\[ \vec{M} = J \vec{\alpha}. \]
Reaction principle/Newton's third law (2.58):
\[ \vec{F}_{21} = -\vec{F}_{12}. \]
Gravitational force (2.63):
\[ F_G = G \frac{m_1 m_2}{r^2}. \]
Hooke's law/spring force (2.67):
\[ F = -D \cdot x. \]
Springs hung parallel or one below the other (2.69) and (2.70):
\[ D_\mathrm{tot} = \sum_{i=1}^{N} D_i, \qquad \frac{1}{D_\mathrm{tot}} = \sum_{i=1}^{N} \frac{1}{D_i}. \]
Centrifugal force (2.76):
\[ F_Z = m \frac{v^2}{r}. \]
Coriolis force (2.77):
\[ \vec{F}_C = 2m\, \vec{v} \times \vec{\omega}. \]
Frictional force (2.78) and (2.79):
\[ F_{H,G,R} = \mu_{H,G,R} \cdot F_N. \]
Potential energy (2.89):
\[ E_\mathrm{pot} = mgz. \]
Elastic energy (2.91):
\[ E_\mathrm{elast} = \frac{1}{2} D x^2. \]
Kinetic energy (2.93):
\[ E_\mathrm{kin} = \frac{1}{2} m v^2. \]
Rotational energy (2.152):
\[ E_\mathrm{rot} = \frac{1}{2} J \omega^2. \]
Work and power (2.87) and (2.96):
\[ W = \vec{F} \cdot \vec{r}, \qquad P = \frac{dW}{dt} = \vec{F} \cdot \vec{v}. \]
Momentum and momentum theorem (2.102) and (2.104):
\[ \vec{p} = m\vec{v}, \qquad \frac{d\vec{p}}{dt} = \vec{F}. \]
Perfectly elastic central collision (2.115) and (2.116):
\[ v_1' = \frac{m_1 - m_2}{m_1 + m_2}\, v_1 + \frac{2 m_2}{m_1 + m_2}\, v_2, \qquad v_2' = \frac{2 m_1}{m_1 + m_2}\, v_1 + \frac{m_2 - m_1}{m_1 + m_2}\, v_2. \]
Completely inelastic central collision (2.118):
\[ v' = \frac{m_1 v_1 + m_2 v_2}{m_1 + m_2}. \]
Angular momentum, torque and angular momentum theorem (2.121), (2.123) and (2.124):
\[ \vec{L} = m\,\vec{r} \times \vec{v}, \qquad \vec{M} = \vec{r} \times \vec{F}, \qquad \frac{d\vec{L}}{dt} = \vec{M}. \]
Statics of translation and rotation (2.128) and (2.136):
\[ \vec{F}_\mathrm{tot} = \sum_{i=1}^{N} \vec{F}_i = 0, \qquad \vec{M}_\mathrm{tot} = \sum_{i=1}^{N} \vec{M}_i = 0. \]
Centre of mass (2.140):
\[ \vec{r}_\mathrm{com} = \frac{1}{M} \sum_{i=1}^{N} m_i \vec{r}_i. \]
Moment of inertia and Steiner's theorem (2.144) and (2.148):
\[ J_a = \sum_{i=1}^{N} m_i r_{i\perp}^2, \qquad J_{a'} = J_a + m d^2. \]
3 Continuum Mechanics

3.1 Elastic Deformations of Solid Bodies
Elastomechanics deals with elastic deformations of solid bodies. In order to understand what happens microscopically during such a deformation, let us take the example of a metal and consider how the body is structured by its atoms (see Fig. 3.1a). The atoms are arranged so that the body occupies as small a volume as possible. We can imagine this as if we were to pour small balls into a glass, which then pile up in a lattice shape. Correspondingly, the arrangement of atoms in a metal is also referred to as a lattice or dense sphere packing. Each atom with mass $m_A$ can be assigned a volume $V_A$ in the lattice. This is not the volume of the atom itself; rather, the volume is determined by the distance from the neighbouring atoms in the solid body. This defines the density ρ of a solid body with mass m and volume V:
\[ \rho = \frac{m}{V} = \frac{m_A}{V_A}, \qquad [\rho] = \frac{\mathrm{kg}}{\mathrm{m^3}}. \tag{3.1} \]
The question why atoms behave like small solid spheres at all can be understood by looking at the molecular potential U(R), that is, the energy between two atoms at a distance R (see Fig. 3.1b). For large distances, the two atoms attract each other. This force depends on the type of bond (ionic bond, covalent bond, metal bond, van der Waals bond); for example, in ionic bonding, attractive electrostatic forces act between negatively and positively charged ions. For small distances R → 0 the two atoms strongly repel each other. This force is caused on the one hand by the electrostatic repulsion of the atomic nuclei, but also by quantum effects such as the Pauli principle (see Sect. 11.2.3.1). The binding potential of two atoms thus has a potential minimum at a certain distance $R_B$, the so-called binding distance, due to the attraction at large distances and the repulsion at small distances. The corresponding potential depth is the binding energy $U_B$. This is the energy required to completely separate the two atoms. The individual atoms of a solid body now arrange themselves in such a way that the total energy, that is, the sum of all binding energies, is as small as possible. This is how, for example, the spherical packing comes about.

Fig. 3.1 (a) Dense sphere packing of atoms in a lattice. Each atom with mass $m_A$ occupies the volume $V_A$. (b) Bonding potential U(R) of two atoms against their distance R. For large R the atoms attract each other, for small R they repel each other. This results in bonding at the distance $R_B$. There the potential is given in first approximation by a quadratic function

© Springer-Verlag GmbH Germany, part of Springer Nature 2023. S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_3

For the elastic properties, it is now important that the potential minimum in Fig. 3.1b can be approximated by a quadratic function for small deflections $R - R_B$ from the bond distance. The energy is thus approximately of the form
\[ U(R) \approx \frac{1}{2} D \left( R - R_B \right)^2, \tag{3.2} \]
just like the elastic energy of the spring from (2.91), if one chooses the constant D (the spring constant) accordingly. The atoms in a solid body are thus bound to each other as if by springs. For deflections from equilibrium, therefore, Hooke’s law (2.67) applies, according to which the restoring force is proportional to the deflection. Thus, if forces act on a body, it can be deformed. Within the range of Hooke’s law, this deformation is reversible, that is, the body returns to its original shape when the force disappears. We will find Hooke’s law again in the following Sects. 3.1.1, 3.1.2, 3.1.3 and 3.1.4 on the various elastic deformations. Different ways of deforming a solid are shown in Fig. 3.2.
3.1.1
Elongation and Compression
A solid can be stretched or compressed by a force acting along the longitudinal axis of the body (Fig. 3.2a). Here, one defines the elastic stress $\sigma$ exerted on the body as the force $F$ per cross-sectional area $A$:

$$\sigma = \frac{F}{A}, \qquad [\sigma] = \frac{\mathrm{N}}{\mathrm{m}^2} = \mathrm{Pa}\ (\text{Pascal}). \tag{3.3}$$

Fig. 3.2 Elastic deformations of solid bodies: (a) Elongation/compression, (b) Bending, (c) Shear, (d) Torsion

The relative change in length of the body $\varepsilon = \Delta l / l$ is then given by Hooke’s law

$$\sigma = E\,\varepsilon, \qquad [E] = \mathrm{Pa}, \tag{3.4}$$
with the modulus of elasticity $E$, a material constant. For example, the modulus of elasticity is

$$\text{Steel:}\quad E = 210\ \mathrm{GPa}, \tag{3.5}$$

$$\text{Rubber:}\quad E \le 5\ \mathrm{GPa}. \tag{3.6}$$

The absolute change in length is therefore

$$\Delta l = \frac{F}{E A}\,l. \tag{3.7}$$
When stretched, $\Delta l > 0$, and when compressed, $\Delta l < 0$. Experience shows us that a body becomes thinner when stretched (e.g., a rubber band). This effect is known as transverse contraction. The relative change of lengths transverse to the direction of stretching, for example, of the cylinder radius $R$ in Fig. 3.2a, is proportional to the relative change of length:

$$\frac{\Delta R}{R} = -\mu\,\frac{\Delta l}{l} \tag{3.8}$$

with the unitless Poisson number $\mu$. This is a material constant with typical values in the range $0 < \mu < 0.5$. A change in volume $\Delta V$ is generally associated with the change in length and the transverse contraction:

$$\frac{\Delta V}{V} = (1 - 2\mu)\,\frac{\Delta l}{l}, \tag{3.9}$$

that is, when a body is stretched, its volume generally increases; only for $\mu = 0.5$ (e.g., rubber) does the volume of the body remain unchanged.
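As a quick numerical illustration of (3.7) to (3.9), the following Python sketch evaluates the elongation, the transverse contraction and the volume change of a steel wire. Only E = 210 GPa is taken from (3.5); the wire dimensions, the load and the Poisson number of 0.3 are assumed example values that do not appear in the text.

```python
import math

E_STEEL = 210e9   # modulus of elasticity of steel in Pa, from (3.5)
MU_STEEL = 0.3    # Poisson number; assumed typical value, not from the text


def elongation(force, length, area, modulus):
    """Absolute change in length, Eq. (3.7): dl = F * l / (E * A)."""
    return force * length / (modulus * area)


def radius_change(radius, strain, mu):
    """Transverse contraction, Eq. (3.8): dR = -mu * R * (dl / l)."""
    return -mu * radius * strain


def relative_volume_change(strain, mu):
    """Relative volume change, Eq. (3.9): dV / V = (1 - 2 mu) * (dl / l)."""
    return (1.0 - 2.0 * mu) * strain


# Assumed example: wire of length 2 m and radius 1 mm, pulled with 100 N
length, radius, force = 2.0, 1.0e-3, 100.0
area = math.pi * radius**2
dl = elongation(force, length, area, E_STEEL)
strain = dl / length

print(f"elongation    dl   = {dl * 1e3:.3f} mm")
print(f"radius change dR   = {radius_change(radius, strain, MU_STEEL) * 1e9:.1f} nm")
print(f"volume change dV/V = {relative_volume_change(strain, MU_STEEL):.2e}")
```

Setting `mu` to 0.5 in `relative_volume_change` returns zero, which is the limiting case of constant volume mentioned above.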
3.1.2
Bending
Bodies can generally be bent in very different ways. In this section we treat as special cases objects with a rectangular cross-section which are loaded by a force either at the end or in the middle (Fig. 3.2b). If the body (a board of length $l$, width $b$ and thickness $d$) is clamped at one end and loaded by a force $F$ at the other end, as in the case of a springboard, the deflection $\Delta$ at the end of the board is given by

$$\Delta = \frac{1}{E}\,\frac{4 l^3}{d^3 b}\,F. \tag{3.10}$$
The second possibility is that the (same) body rests on supports at its two ends and is loaded with a force $F$ in the middle. This leads to a deflection in the middle of the board:

$$\Delta = \frac{1}{E}\,\frac{l^3}{4 d^3 b}\,F. \tag{3.11}$$
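The two bending formulas (3.10) and (3.11) can be compared numerically. The sketch below is only an illustration: the modulus of elasticity of 10 GPa (a rough value for wood), the board dimensions and the load are assumptions, not values from the text.

```python
def deflection_clamped(force, length, width, thickness, modulus):
    """Board clamped at one end and loaded at the free end, Eq. (3.10)."""
    return 4.0 * length**3 * force / (modulus * thickness**3 * width)


def deflection_supported(force, length, width, thickness, modulus):
    """Board resting on both ends and loaded in the middle, Eq. (3.11)."""
    return length**3 * force / (4.0 * modulus * thickness**3 * width)


# Assumed example values (not from the text)
E_WOOD = 10e9          # Pa, rough order of magnitude for wood
l, b, d = 2.0, 0.2, 0.04   # length, width, thickness in m
F = 700.0              # load in N

print(f"clamped at one end : {deflection_clamped(F, l, b, d, E_WOOD) * 100:.1f} cm")
print(f"supported, centre  : {deflection_supported(F, l, b, d, E_WOOD) * 100:.1f} cm")

# Doubling the thickness reduces the deflection by a factor 2**3 = 8
ratio = deflection_supported(F, l, b, d, E_WOOD) / deflection_supported(F, l, b, 2 * d, E_WOOD)
print(f"effect of doubling d: deflection smaller by a factor {ratio:.0f}")
```

For identical board and load, the clamped configuration deflects 16 times more than the centrally loaded one, as the ratio of the prefactors 4 and 1/4 suggests.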
The modulus of elasticity also plays a role in bending. It is interesting to note that in both cases the deflection is proportional to $1/d^3$, that is, the deflection decreases considerably with the board thickness. Furthermore, according to Hooke’s law, the deflection is proportional to the force $F$.

Examination Task: Board on Edge
A board with length $l$, width $b = 25$ cm and thickness $d = 5$ cm, supported at both ends, is loaded by a weight and deflects by $\Delta = 5$ cm. How great is the deflection if the board is used on edge (as a beam) with the same load?

Solution
The deflection as a board is

$$\Delta_1 = \frac{1}{E}\,\frac{l^3}{4 d^3 b}\,F. \tag{3.12}$$

If the board is used on edge, the width $b$ and thickness $d$ are exchanged. The new deflection is

$$\Delta_2 = \frac{1}{E}\,\frac{l^3}{4 b^3 d}\,F. \tag{3.13}$$

The ratio of the deflections is given by

$$\frac{\Delta_2}{\Delta_1} = \frac{\dfrac{1}{E}\,\dfrac{l^3}{4 b^3 d}\,F}{\dfrac{1}{E}\,\dfrac{l^3}{4 d^3 b}\,F} = \frac{d^2}{b^2}. \tag{3.14}$$

This results in the following for the new deflection:

$$\Delta_2 = \frac{d^2}{b^2}\,\Delta_1 = \frac{(5\ \mathrm{cm})^2}{(25\ \mathrm{cm})^2}\cdot 5\ \mathrm{cm} = 2\ \mathrm{mm}. \tag{3.15}$$

3.1.3
Shear
Shear forces act along the surface as outlined in Fig. 3.2c. Shear stress is defined as force per area

$$\tau = \frac{F}{A}, \qquad [\tau] = \frac{\mathrm{N}}{\mathrm{m}^2} = \mathrm{Pa}\ (\text{Pascal}). \tag{3.16}$$

As a result, the surfaces opposite each other at a distance $h$ are displaced parallel to each other by the distance $\Delta x$, or sheared by the shear angle $\theta$. The shear $\gamma$ is defined as the relative displacement

$$\gamma = \frac{\Delta x}{h} = \tan(\theta). \tag{3.17}$$

Hooke’s law relates the shear stress to the shear via

$$\tau = G\,\gamma, \qquad [G] = \mathrm{Pa}, \tag{3.18}$$

with the shear modulus $G$, a material constant. The shear modulus is also referred to as torsional modulus or modulus of rigidity, depending on the context. For example, the shear modulus is

$$\text{Steel:}\quad G = 79.3\ \mathrm{GPa}, \tag{3.19}$$

$$\text{Rubber:}\quad G = 0.0003\ \mathrm{GPa}. \tag{3.20}$$
The displacement of the parallel surfaces is given by

$$\Delta x = \frac{1}{G}\,\frac{h}{A}\,F. \tag{3.21}$$

3.1.4
Torsion
In torsion, an object with length $l$ is subjected to opposing torques $M = R\,F$ at its ends (Fig. 3.2d). This twists the object (e.g., a wire) by the angle $\varphi$. According to Hooke’s law, the twist is proportional to the torque:

$$M = D\,\varphi, \tag{3.22}$$

with the so-called torsional coefficient

$$D = G\,\frac{\pi R^4}{2 l}, \qquad [D] = \frac{\mathrm{N\,m}}{\mathrm{rad}}, \tag{3.23}$$

in which the torsional modulus (shear modulus) is included as a material constant. The twisting angle is thus given by

$$\varphi = \frac{2 l}{G \pi R^4}\,M = \frac{2 l}{G \pi R^3}\,F. \tag{3.24}$$

3.1.5
Beyond the Elastic Range: Fracture
Elastic deformations are reversible. This means that the body returns to its original shape when the deforming force is removed. However, if a body is deformed too much, the elastic range is exceeded and irreversible deformations occur. In this case, the stress in the body initially no longer increases in proportion to the deformation, but less and less. Hooke’s law is therefore no longer valid. This is referred to as the elastic-plastic transition region. Above a critical stress in the body, the stress limit $\sigma_{\mathrm{Grenz}}$, the deformation increases by itself (plastic flow) until the body breaks or tears. The stress limit or tensile strength is again a material constant.

Written Test: Maximum Load on Steel Cable
What is the minimum thickness of a steel cable required to lift a load of mass $m = 1500$ kg? The stress limit of steel is

$$\sigma_{\mathrm{Grenz}} = 520\ \mathrm{MPa}. \tag{3.25}$$

Solution
The stress in the cable must not exceed the stress limit:

$$\sigma = \frac{F}{A} = \frac{m g}{\pi r^2} \le \sigma_{\mathrm{Grenz}}. \tag{3.26}$$

This inequality is solved for the cable radius $r$:

$$r \ge \sqrt{\frac{m g}{\pi\,\sigma_{\mathrm{Grenz}}}} = \sqrt{\frac{1500\ \mathrm{kg}\cdot 10\ \mathrm{m/s^2}}{\pi\cdot 520\times 10^{6}\ \mathrm{N/m^2}}} = 3\ \mathrm{mm}. \tag{3.27}$$

The steel cable must have a minimum radius of 3 mm, that is, a minimum diameter of 6 mm.
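The estimate of the cable radius can be checked with a few lines of Python. This is a minimal sketch of (3.27); the function name `min_cable_radius` is chosen here for illustration, and g = 10 m/s² is used as in the worked solution.

```python
import math


def min_cable_radius(mass, stress_limit, g=10.0):
    """Smallest radius r for which m*g / (pi * r**2) stays below the stress limit, Eq. (3.27)."""
    return math.sqrt(mass * g / (math.pi * stress_limit))


r = min_cable_radius(mass=1500.0, stress_limit=520e6)
print(f"minimum radius   r = {r * 1e3:.2f} mm")
print(f"minimum diameter   = {2 * r * 1e3:.2f} mm")
```

Because the radius scales with the square root of the mass, lifting four times the load requires only twice the cable radius.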
3.2
Hydrostatics
Hydrostatics deals with the properties of liquids at rest. Microscopically, similar to solids, there are forces between the particles of the fluid (mostly molecules, e.g., H2O molecules in water), but the individual molecules can move freely within the fluid. Therefore, no shear forces exist in liquids. These would move the molecules within the liquid and change the surface shape until all the forces are perpendicular to the surface of the liquid. This is why a liquid always takes the shape of the container it is in.
3.2.1
Pressure in Liquids and Gases
Fluids (liquids and gases) exert forces on the adjacent medium at their edges. The adjacent medium can be a body which is completely or partially in the fluid, for example, a diver under water, or it can be the wall which limits the fluid, for example, the container in which the fluid is located. Since there are no shear forces in fluids, these forces are always perpendicular to the edge of the fluid (see Fig. 3.3a). The pressure in the fluid is defined as the force $F$ per area $A$ of the edge:

$$p = \frac{F}{A}, \qquad [p] = \frac{\mathrm{N}}{\mathrm{m}^2} = \mathrm{Pa}\ (\text{Pascal}). \tag{3.28}$$
Another common unit of pressure is the bar (not an SI unit), since $p = 1$ bar is roughly equivalent to the air pressure at the surface of the earth. The conversion is

$$1\ \mathrm{bar} = 10^5\ \mathrm{Pa}. \tag{3.29}$$
Blood pressure is often also measured in millimetres of mercury column (mm Hg), as earlier pressure measuring devices contained a mercury column. The conversion is

$$1\ \mathrm{mm\,Hg} = 133\ \mathrm{Pa}. \tag{3.30}$$

Fig. 3.3 (a) The pressure in a liquid exerts forces on the edge of the liquid, shown here as a force on the wall of the jar, and on a sphere immersed in the liquid. This force is always perpendicular to the edge surface. The pressure in the liquid due to its own weight (gravity pressure) and thus also the force per area increases linearly with the depth $h$ of the liquid, shown here as a linear increase in the lengths of the force arrows with depth. The pressure at the top of the liquid is not necessarily zero, even though the depth is $h = 0$. This is due to the fact that there may be a gas (e.g., air) above the liquid, which in turn acts on the liquid with its gravity pressure. (b) The gravity pressure at a fixed depth $h$ is independent of the shape of the vessel or liquid. For this reason, the pressure at the bottom of the three sketched containers is exactly the same
Pressure in a liquid can arise in various ways. In the following, we will learn about gravity pressure and plunger pressure in particular.
3.2.1.1 Gravity Pressure
The gravity pressure of a liquid is the pressure that results from the liquid’s own weight. It amounts to

$$p_S = \rho\,g\,h, \tag{3.31}$$
with the density $\rho$ of the liquid and the acceleration due to gravity $g$. The shape of the container holding the liquid is completely irrelevant (see Fig. 3.3b). This relationship is known as the hydrostatic paradox. The only decisive factor for the gravity pressure is the difference in height $h$ between the top of the liquid and the depth in the liquid at which the gravity pressure is measured. In this case, one speaks of the liquid column with height $h$. This is still true if the liquid is in a bent tube. For water with a density of $\rho = 1000\ \mathrm{kg/m^3}$ there is a simple rule of thumb: for every 10 m of water depth, the pressure increases by about

$$p = 1000\ \frac{\mathrm{kg}}{\mathrm{m}^3}\cdot 10\ \frac{\mathrm{m}}{\mathrm{s}^2}\cdot 10\ \mathrm{m} = 10^5\ \mathrm{Pa} = 1\ \mathrm{bar}. \tag{3.32}$$

The gravity pressure of a 1 mm mercury column is 133 Pa, as already anticipated in (3.30). The air pressure of approx. 1 bar on the Earth’s surface is caused by the gravity pressure of the Earth’s atmosphere, that is, by the weight of the air on the Earth. The gravity pressure in gases will be discussed in more detail in Sect. 3.2.4.
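The rule of thumb for water and the conversion (3.30) for mercury both follow from (3.31). The short Python sketch below reproduces them; the density of mercury (13 600 kg/m³) is a literature value assumed here, since the text does not state it, and the more precise g = 9.81 m/s² is used for the mercury column.

```python
PA_PER_BAR = 1e5          # conversion from (3.29)
RHO_WATER = 1000.0        # kg/m^3, from the text
RHO_MERCURY = 13_600.0    # kg/m^3, assumed literature value (not given in the text)


def gravity_pressure(density, depth, g=10.0):
    """Gravity pressure of a liquid column of height `depth`, Eq. (3.31)."""
    return density * g * depth


# Rule of thumb (3.32): 10 m of water correspond to about 1 bar
p_water = gravity_pressure(RHO_WATER, 10.0)
print(f"10 m water   : {p_water:.0f} Pa = {p_water / PA_PER_BAR:.1f} bar")

# 1 mm mercury column, evaluated with g = 9.81 m/s^2
p_hg = gravity_pressure(RHO_MERCURY, 1e-3, g=9.81)
print(f"1 mm mercury : {p_hg:.1f} Pa")
```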
3.2.1.2 Plunger Pressure
In addition to the weight force of the liquid itself, an external force can also lead to a pressure in the liquid. In Fig. 3.3a, for example, the weight force of the air presses on the liquid from above. The pressure at the top of the liquid column is therefore given by the air pressure above it. In general, one speaks of the plunger pressure. Here one imagines a piston (plunger), which presses with a force $F$ over its surface $A$ on the liquid. The plunger pressure is then given by

$$p_0 = \frac{F}{A}. \tag{3.33}$$
The plunger pressure does not only act at the point where the plunger touches the liquid, but increases the pressure in the entire liquid by the constant value p0. The total pressure in a liquid is thus the sum of the plunger pressure p0 and the gravity pressure pS. Places in a liquid with the same depth h have the same total pressure p = p0 + pS. This relationship can be used to measure pressure with a U-tube.
3.2.1.3 Pressure Measurement with the U-Tube
A U-tube is a tube bent into a U-shape that contains a liquid. The difference in height $\Delta h$ of the two liquid columns on the left and right side of the U-tube can be used to measure the pressure difference of the gas above the columns on the left and right side. Figure 3.4a shows a U-tube manometer. This can be used to determine the unknown gas pressure $p$ in the closed volume on the left side of the U-tube. The right side of the U-tube is open. The air pressure $p_0$ on this right side is known. The pressure at the top of the liquid column on the left is given by the plunger pressure $p_l = p$ of the gas. At the same height on the right in the U-tube, the pressure is given by the sum of the plunger pressure of the air and the gravity pressure, $p_r = p_0 + \rho g\,\Delta h$. Since $p_l = p_r$ holds at the same height, we get

$$p = p_0 + \rho g\,\Delta h. \tag{3.34}$$
By measuring the difference in height $\Delta h$ one can thus determine the unknown pressure $p$ if one knows the air pressure $p_0$. One can also use a U-tube to measure the air pressure itself. In this case it is called a U-tube barometer. The corresponding construction is shown in Fig. 3.4b. On the closed left side of the U-tube, the plunger pressure is $p_0 = 0$. This can be achieved experimentally by filling the U-tube with liquid so that no air bubble forms on the left side, and then rotating the U-tube to the position sketched in Fig. 3.4b without allowing air to enter the U-tube. A vacuum then exists above the left column of the liquid, with $p_0 = 0$. The pressure at the top of the column on the right corresponds to the unknown air pressure $p_r = p$. At the same height on the left, the pressure is given by $p_l = p_0 + \rho g\,\Delta h = \rho g\,\Delta h$. By equating $p_l = p_r$, the air pressure is given by

$$p = \rho g\,\Delta h. \tag{3.35}$$

Fig. 3.4 (a) U-tube manometer for determining the pressure $p$ from the known air pressure $p_0$. (b) U-tube barometer for determining the unknown air pressure $p$
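Both U-tube relations are simple enough to evaluate directly. The following Python sketch implements (3.34) and (3.35); the height differences, the choice of water and mercury as filling liquids and the mercury density of 13 600 kg/m³ are assumed example values, not data from the text.

```python
def manometer_pressure(p_air, density, delta_h, g=10.0):
    """Unknown gas pressure from a U-tube manometer, Eq. (3.34)."""
    return p_air + density * g * delta_h


def barometer_pressure(density, delta_h, g=10.0):
    """Air pressure from a U-tube barometer with vacuum on the closed side, Eq. (3.35)."""
    return density * g * delta_h


# Assumed example: water-filled manometer showing 20 cm height difference
p_gas = manometer_pressure(p_air=1e5, density=1000.0, delta_h=0.20)
print(f"gas pressure in the closed volume: {p_gas:.0f} Pa")

# Assumed example: mercury barometer (density 13 600 kg/m^3) showing 76 cm
p_atm = barometer_pressure(density=13_600.0, delta_h=0.76, g=9.81)
print(f"air pressure from the barometer  : {p_atm:.0f} Pa")
```

The comparison also shows why mercury was the traditional barometer liquid: with water, a column of roughly 10 m would be needed to balance the same air pressure.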
3.2.1.4 Compression of Liquids
Unlike gases, liquids are, to first approximation, incompressible. This means that the volume $V$ of a liquid is approximately independent of the pressure in the liquid. This property is important for hydrodynamics when studying flowing liquids. However, if you look more closely, the volume of a liquid does change slightly due to pressure. The relative volume change is given by

$$\frac{\Delta V}{V} = \frac{p}{K} = \kappa\,p. \tag{3.36}$$

Here the compressibility $\kappa$, or equivalently the modulus of compressibility $K$, describes as a material property how well the liquid can be compressed. The two are related by

$$K = \frac{1}{\kappa}, \qquad [K] = \mathrm{Pa}, \qquad [\kappa] = \frac{1}{\mathrm{Pa}}. \tag{3.37}$$
Work is done during compression. It can be calculated with the following model: We consider a plunger with cross-sectional area $A$, which is moved into the liquid by the distance $\Delta s$ with the force $F = pA$ against the pressure $p$ of the liquid (see Fig. 3.5a). In this process, the piston performs work on the liquid:

$$\Delta W = F\,\Delta s = p\,A\,\Delta s = -p\,\Delta V. \tag{3.38}$$
Here $\Delta V = -A\,\Delta s$ is the change in volume of the liquid. This is negative when the plunger is pressed into the liquid, that is, for $\Delta s > 0$. So for $\Delta W > 0$ the piston does work on the liquid. This increases the energy stored in the compressed liquid. Formula (3.38) is valid for constant pressure $p$, or when $\Delta V$ is so small that the pressure can be assumed to be constant. In general, however, the pressure changes with compression. The work of compression is then obtained by integration

$$\Delta W = \int_{V_1}^{V_2} -p(V)\,\mathrm{d}V, \tag{3.39}$$

where the volume is changed from $V_1$ to $V_2$.

Fig. 3.5 (a) Determination of the compression work, (b) Force amplification with a hydraulic press

Example: Force Amplification with a Hydraulic Press
A hydraulic press can be used to amplify a small force. For this purpose, as sketched in Fig. 3.5b, a piston with a cross-sectional area $A_1$ is connected via a fluid (hydraulically) to a second piston with a cross-sectional area $A_2$. We neglect the gravity pressure here. In the sketch the two pistons are at the same height anyway; in real situations (e.g., in an excavator) the plunger pressure is much higher than the gravity pressure due to the possible height differences of the two pistons. Thus, the plunger pressure on the left is equal to the plunger pressure on the right:

$$\frac{F_1}{A_1} = \frac{F_2}{A_2}, \tag{3.40}$$

where $F_{1,2}$ is the respective force on the piston with cross-sectional area $A_{1,2}$. The force on the piston with the larger area $A_2 > A_1$ is thereby amplified by the ratio of the areas:

$$F_2 = \frac{A_2}{A_1}\,F_1. \tag{3.41}$$

For circular pistons with area $A_{1,2} = \pi R_{1,2}^2$ the force $F_2$ thus increases quadratically with the piston radius $R_2$.
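The quadratic dependence on the piston radius can be made concrete with a small Python sketch of (3.41) for circular pistons. The input force and the two radii are assumed example values.

```python
import math


def amplified_force(force_1, radius_1, radius_2):
    """Force on the second circular piston of a hydraulic press, Eq. (3.41)."""
    area_1 = math.pi * radius_1**2
    area_2 = math.pi * radius_2**2
    return force_1 * area_2 / area_1


# Assumed example: 100 N on a piston of radius 1 cm, output piston of radius 10 cm
F2 = amplified_force(force_1=100.0, radius_1=0.01, radius_2=0.10)
print(f"output force F2 = {F2:.0f} N")
```

A tenfold larger radius gives a hundredfold larger force, since only the ratio of the areas enters.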
3.2.2
Buoyancy Force: Swimming, Floating, Sinking
Experience shows us that bodies are lighter under water. This is due to the buoyancy force that a body experiences when it is submerged in a liquid. The buoyancy force is due to the pressure differences at the surface of the submerged body. The body experiences a greater pressure on its lower side than on its upper side due to the greater gravity pressure of the liquid. This pressure difference results in an upward force acting on the body. To determine the buoyancy force, consider a cylindrical body in a liquid (see Fig. 3.6a). For this special shape the calculation is simplest; however, the result is valid for bodies of any shape. The pressure difference between the lower side (pressure $p_u$) and the upper side (pressure $p_o$) is

$$\Delta p = p_u - p_o = \rho_{Fl}\,g\,h, \tag{3.42}$$

with the height of the cylinder $h$ and the density of the liquid $\rho_{Fl}$. The buoyancy force $F_A$ on the body is therefore

$$F_A = \Delta p\,A = \rho_{Fl}\,g\,h\,A = \rho_{Fl}\,g\,V_K. \tag{3.43}$$

Here we denote the cross-sectional area of the cylinder by $A$ and its volume by $V_K$. The buoyancy force acting on the body is thus equal to the weight force of the quantity of liquid displaced by the body with its volume. Opposite to the buoyancy force, the weight force additionally acts on the body:

$$F_g = m_K\,g = \rho_K\,g\,V_K. \tag{3.44}$$

The total force acting upwards on the body is thus given by

$$F = F_A - F_g = (\rho_{Fl} - \rho_K)\,g\,V_K. \tag{3.45}$$

Fig. 3.6 (a) The buoyancy force on a body is as great as the weight force of the body volume filled with the liquid. (b) Floating bodies immerse part of their volume in the liquid
The total force is positive if the density of the liquid is greater than the density of the body, for example, in the case of water and wood. In this case, the body rises in the liquid and ultimately floats on top. On the other hand, if the density of the liquid is less than the density of the body, for example in the case of water and iron, then the force is negative and the body sinks to the bottom. If the densities of the liquid and the body are exactly equal, then no net force acts on the body, that is, it remains suspended (hovers) in the liquid. Divers, for example, attach lead weights to themselves so that their average density is as close as possible to that of water and they can hover in the water with as little effort as possible.

Exercise: Why Can Iron Ships Float?
According to (3.45), a body with a greater density than the liquid sinks to the bottom. Since iron certainly has a greater density than water, the question arises why ships made of iron can float at all.

Solution
If the ship were completely filled with iron, it would sink in the water. However, a ship is essentially hollow within its hull and filled with air. So when the ship sinks into the water by a certain amount, a large volume is displaced, which is determined by the outer shape of the ship. However, it is essentially only the hull, which is made of iron, that contributes to the weight of the ship. The decisive factor for the total force acting on the ship is the average density of the ship, which is made up of the air density (less than the water density) and the iron density.

The question now arises as to how far a floating body is immersed in the liquid. For this purpose, we consider the situation sketched in Fig. 3.6b. The buoyancy force is given by the lower part of the volume immersed in the liquid, $V_u$, while the weight force is given by the total mass of the body, that is, by its total volume $V_K = V_u + V_o$. If the body floats stably, the buoyancy force and the weight force are equal:

$$\rho_{Fl}\,g\,V_u = \rho_K\,g\,V_K. \tag{3.46}$$

We solve the equation for the lower part of the volume:

$$V_u = \frac{\rho_K}{\rho_{Fl}}\,V_K. \tag{3.47}$$

Alternatively, we can calculate the upper part of the volume $V_o$ protruding from the water by substituting $V_u = V_K - V_o$ into (3.47) and solving:

$$V_o = \left(1 - \frac{\rho_K}{\rho_{Fl}}\right) V_K. \tag{3.48}$$
Written Test: Iceberg
What percentage of the volume of a floating iceberg protrudes from the water? The densities of ice and water are

$$\rho_{\mathrm{Eis}} = 0.92\ \frac{\mathrm{g}}{\mathrm{cm}^3}, \tag{3.49}$$

$$\rho_{\mathrm{H_2O}} = 1\ \frac{\mathrm{g}}{\mathrm{cm}^3}. \tag{3.50}$$

Solution
According to (3.48) the part of the total volume protruding from the water is

$$\frac{V_o}{V_K} = 1 - \frac{\rho_K}{\rho_{Fl}} = 1 - \frac{0.92\ \mathrm{g/cm^3}}{1\ \mathrm{g/cm^3}} = 0.08. \tag{3.51}$$
This means that only 8% of the volume of the iceberg, and thus only its tip, protrudes from the water.
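The sign rule of (3.45) and the floating fraction of (3.48) can be sketched in Python as follows. The densities of ice and water are those of the written test; the density of iron (7870 kg/m³) and the body volume are assumed values used only to show a sinking case.

```python
def net_upward_force(rho_liquid, rho_body, volume, g=10.0):
    """Total upward force on a fully submerged body, Eq. (3.45)."""
    return (rho_liquid - rho_body) * g * volume


def fraction_above_surface(rho_body, rho_liquid):
    """Fraction of a floating body that protrudes from the liquid, Eq. (3.48)."""
    if rho_body > rho_liquid:
        raise ValueError("body is denser than the liquid and sinks")
    return 1.0 - rho_body / rho_liquid


# Iceberg from the written test (densities in g/cm^3)
print(f"iceberg above water: {fraction_above_surface(0.92, 1.0):.0%}")

# Assumed example: 1 litre of iron (7870 kg/m^3) fully submerged in water
force = net_upward_force(rho_liquid=1000.0, rho_body=7870.0, volume=1e-3)
print(f"net force on iron block: {force:.1f} N (negative means sinking)")
```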
3.2.3
Interfaces of Liquids
We have said so far that molecules can move freely in a liquid and thus there are no shear forces in a liquid. This leads to the fact that the liquid level in a container is always horizontal, that is, perpendicular to the acting force of gravity. However, if we look more closely at the surface of a small amount of liquid (e.g., water in a glass), we see that this surface can be curved. Obviously, there are internal forces at the edge of a liquid that have an influence on its shape. This is the surface tension of a liquid. It is due to the fact that the molecules at the edge of the liquid have fewer nearest neighbor molecules than molecules inside the liquid. As we have already seen in Fig. 3.1b, the presence of neighboring molecules lowers the energy. Molecules at the edge of the liquid are therefore energetically unfavourable. A force is created which acts on the liquid in such a way that it has as small a surface area as possible and thus as few edge molecules as possible. The surface tension $\sigma$ describes the increase in energy $\Delta E$ per increase in surface area $\Delta A$:

$$\sigma = \frac{\Delta E}{\Delta A}, \qquad [\sigma] = \frac{\mathrm{N}}{\mathrm{m}}. \tag{3.52}$$
This quantity is a material constant of the liquid; however, it depends strongly on the conditions, for example, whether and which substances are dissolved in the liquid.

Exercise: Surface Energy of a Drop of Water
Water drops are spherical in the absence of gravity. This is because, for a given fixed volume, the sphere is the geometric shape with the smallest surface area. Therefore, the spherical shape has the smallest surface energy. What would be the difference in energy for $1\ \mathrm{cm}^3$ of water if the spherical drop were reshaped into a cube? The surface tension of water is

$$\sigma_{\mathrm{H_2O}} = 0.07\ \frac{\mathrm{N}}{\mathrm{m}}. \tag{3.53}$$

Solution
For the surface area $A_W$ (six square side faces) and the volume $V_W$ of the cube with side length $a$ the following holds:

$$A_W = 6 a^2, \tag{3.54}$$

$$V_W = a^3 = 1\ \mathrm{cm}^3. \tag{3.55}$$

The surface area of the cube is connected with the volume via

$$A_W = 6\,V_W^{2/3} = 6\,(1\ \mathrm{cm}^3)^{2/3} = 6\ \mathrm{cm}^2. \tag{3.56}$$

The surface area $A_K$ and the volume $V_K$ of a sphere of radius $r$ are given by

$$A_K = 4\pi r^2, \tag{3.57}$$

$$V_K = \frac{4\pi}{3}\,r^3. \tag{3.58}$$

The surface area of the sphere is connected with the volume via

$$A_K = \left(36\pi V_K^2\right)^{1/3} = \left(36\pi\,(1\ \mathrm{cm}^3)^2\right)^{1/3} = 4.84\ \mathrm{cm}^2. \tag{3.59}$$

The surface area of the cube is therefore larger by

$$\Delta A = 6\ \mathrm{cm}^2 - 4.84\ \mathrm{cm}^2 = 1.16\ \mathrm{cm}^2. \tag{3.60}$$

So the cube shape has more surface energy. The energy difference is

$$\Delta E = \sigma\,\Delta A = 0.07\ \frac{\mathrm{N}}{\mathrm{m}}\cdot 1.16\ \mathrm{cm}^2 = 8.12\ \mathrm{\mu J}. \tag{3.61}$$
3.2.3.1 Force on Edge Lines and Overpressure
Surface tension can, for example, cause an iron sewing needle to float on the surface of water, even though iron has a greater density than water. This is because the sinking of the needle into the water causes the water surface to increase in size in order to provide space for the needle. This is associated with an increase in energy, or a force that counteracts this increase. If the opposing force is large enough, the needle will remain on the surface of the liquid. Basically, a force acts on the edge of length $l$ of any liquid surface:

$$F = \sigma\,l. \tag{3.62}$$

Fig. 3.7 (a) Force on the edge of a soap solution film in a ring, for example, Pustefix. (b) Capillary rise pulls the liquid upwards in a thin tube and (c) capillary depression lowers it

Thin liquid films like the one in Fig. 3.7a have two sides, that is, two surfaces. Therefore, the force on the edge is twice as large:

$$F = 2\sigma\,l. \tag{3.63}$$
Here you can imagine, for example, a soap solution film that forms in a Pustefix ring before you blow through it. The length of the edge corresponds to the circumference of the ring. The plastic ring is thus subjected to an inward force. If you then create a soap bubble by blowing through the ring, you have to apply a force to increase the surface area. The finished soap bubble is stable and does not contract to a drop, because you have created an overpressure within the bubble by blowing into it. In general, the pressure $p_i$ inside a bubble with radius $R$ is greater than the pressure $p_a$ outside:

$$\Delta p = p_i - p_a = \frac{4\sigma}{R}. \tag{3.64}$$
Thus, the smaller the radius of the bubble, the greater the overpressure. This means that a large force is needed at the beginning to form a small bubble from the flat film. The larger the bubble becomes, the easier it is to enlarge it further.

Exercise Task: Connected Soap Bubbles
What would happen if two soap bubbles of different sizes were connected by a pipe so that the two bubbles could exchange air?

Solution
Since the smaller soap bubble has the greater internal pressure, air flows from the smaller bubble into the larger bubble. The smaller bubble shrinks, inflating the larger bubble.
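The argument for the connected soap bubbles can be illustrated with numbers. The following Python sketch evaluates (3.64) for two bubbles; the surface tension of the soap solution (0.03 N/m) and the two radii are assumed example values, since the text does not give them.

```python
SIGMA_SOAP = 0.03  # N/m, assumed typical value for a soap solution


def bubble_overpressure(sigma, radius):
    """Overpressure inside a soap bubble of radius `radius`, Eq. (3.64)."""
    return 4.0 * sigma / radius


# Assumed example radii of the two connected bubbles
r_small, r_large = 0.01, 0.05  # in m
dp_small = bubble_overpressure(SIGMA_SOAP, r_small)
dp_large = bubble_overpressure(SIGMA_SOAP, r_large)

print(f"small bubble: {dp_small:.1f} Pa, large bubble: {dp_large:.1f} Pa")
if dp_small > dp_large:
    print("air flows from the small bubble into the large one")
```

The overpressures are only a few pascals, tiny compared with the air pressure of about 10⁵ Pa, yet their difference is enough to drive the air from one bubble to the other.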
3.2.3.2 Adhesion and Cohesion
So far, when we have spoken of the surface tension of a liquid, we have implicitly assumed that the surface under consideration is adjacent to air. We have argued that the molecules at the edge have fewer neighboring molecules than the molecules inside the liquid. This assumption is no longer valid when the liquid is adjacent to a solid body, such as the container in which it is held, since there are now also attractive forces between the molecules of the liquid and the atoms of the solid body. These forces (adhesive forces) can either be stronger than the forces between the molecules of the liquid (cohesive forces) or weaker, depending on the properties of the liquid and the solid. In the case where the adhesive forces predominate, the energy of the system decreases when as many molecules of the liquid as possible are in contact with the wall. This causes drops on such a surface to spread out and wet the surface. In the context of wetting with water, one also speaks of hydrophilic surfaces. If, on the other hand, cohesive forces predominate, the energy is minimized by ensuring that as few molecules of the liquid as possible have contact with the wall. This results in a pronounced droplet formation, also known as the lotus effect, as water forms almost perfectly round drops on the leaves of the lotus plant. Such surfaces are called hydrophobic. Particularly strong effects of adhesion forces occur in very thin tubes (capillaries) filled with liquid. Adhesion between the liquid and the capillary wall can even pull the liquid upwards in the capillary against gravity. This effect is called capillary rise (capillary ascension) and plays a role, for example, in supplying the branches and leaves of trees with water from the roots (among other effects). If a capillary with radius $r$ is immersed in a liquid reservoir, the liquid with density $\rho$ can rise in the capillary by a maximum height

$$\Delta h = \frac{2\sigma}{r \rho g}, \tag{3.65}$$
relative to the filling level of the reservoir (see Fig. 3.7b). Correspondingly, when the cohesive force is dominant, there is the opposite effect of capillary depression, where the level of the fluid in the capillary is lowered relative to the level of the reservoir. The maximum height difference is also given by (3.65).
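To get a feeling for the magnitude of the capillary effect, (3.65) can be evaluated for water in capillaries of different radii. In this Python sketch the surface tension is taken from (3.53) and the density of water from the text; the capillary radii are assumed example values.

```python
def capillary_height(sigma, radius, density, g=10.0):
    """Maximum height difference in a capillary of radius `radius`, Eq. (3.65)."""
    return 2.0 * sigma / (radius * density * g)


SIGMA_WATER = 0.07   # N/m, from (3.53)
RHO_WATER = 1000.0   # kg/m^3

# Assumed example radii: 1 mm, 0.1 mm and 0.01 mm
for radius in (1e-3, 1e-4, 1e-5):
    dh = capillary_height(SIGMA_WATER, radius, RHO_WATER)
    print(f"r = {radius * 1e3:5.2f} mm  ->  maximum height = {dh * 100:7.1f} cm")
```

The height is inversely proportional to the radius, so each tenfold reduction of the radius raises the maximum height by a factor of ten.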
3.2.4
Aerostatics
This section on gases at rest (aerostatics) deals with the main differences from the liquids considered so far. Gases have a much smaller density than liquids, for example, for air

$$\rho_{\mathrm{Luft}} \approx 1.3\ \mathrm{kg/m^3}, \tag{3.66}$$

in contrast to water with

$$\rho_{\mathrm{H_2O}} = 1000\ \mathrm{kg/m^3}. \tag{3.67}$$
This means that the distances between the gas particles (molecules or atoms) are much larger than the distances in a liquid. The individual molecules or atoms of the gas are not bound to each other and can move freely in space. They continue to move until they collide either with one of the other gas particles or with a wall, thereby changing their direction of motion. Therefore, unlike liquids and solids, there are no forces holding a gas together as an object. The gas always completely fills the volume made available to it (e.g., a gas cylinder). Since there are no bonds between the gas particles, a gas can be compressed, whereas liquids are almost incompressible. At constant temperature, the product of the volume $V$ of a gas and its pressure $p$ is constant. This relationship is known as the law of Boyle-Mariotte:

$$p\,V = \mathrm{const}. \tag{3.68}$$

If, for example, the volume of a gas is changed from $V_1$ to $V_2$, the pressure changes accordingly from $p_1$ to $p_2$:

$$p_2 = \frac{V_1}{V_2}\,p_1. \tag{3.69}$$
Boyle-Mariotte’s law corresponds to the isothermal change of state of an ideal gas, which we will discuss in more detail in Sect. 5.2.3. Under these conditions the density $\rho$ of the gas is proportional to the pressure $p$. This can be seen by substituting (3.68) into the definition of density:

$$\rho = \frac{m}{V} = \frac{m}{\mathrm{const}}\,p. \tag{3.70}$$
3.2.4.1 Barometric Pressure Law
With the help of Boyle-Mariotte’s law, one can determine the air pressure $p(z)$ as a function of the height $z$ above the earth’s surface. We will not go into the derivation of the so-called barometric pressure law here, but only give the result:

$$p(z) = p_0\,e^{-\frac{\rho_0 g}{p_0} z} = p_0\,e^{-z/h_0}, \tag{3.71}$$
with the air pressure at the earth’s surface $p_0$, the air density at the earth’s surface $\rho_0$ and the acceleration due to gravity $g$. Thus, the air pressure decreases exponentially with height $z$. This is also different from liquids, where the gravity pressure changes linearly (and not exponentially) with height. For exponential functions, one can specify a characteristic length scale on which the value of the function decreases to $1/e$. In the barometric pressure law, this is the characteristic altitude $h_0$, which can also be interpreted as the typical thickness of the atmosphere:

$$h_0 = \frac{p_0}{\rho_0\,g}. \tag{3.72}$$
With values of $p_0 = 1$ bar and $\rho_0 = 1.29\ \mathrm{kg/m^3}$ typical on Earth (and $g = 10\ \mathrm{m/s^2}$), the characteristic height is $h_0 = 7750$ m. Thus, the Earth’s atmosphere has a typical thickness of about 8 km. The density of air is, as already mentioned, proportional to the pressure; thus the density also decreases exponentially with altitude:

$$\rho(z) = \rho_0\,e^{-\frac{\rho_0 g}{p_0} z} = \rho_0\,e^{-z/h_0}. \tag{3.73}$$
The formulas (3.71) and (3.73) do not take into account that the temperature decreases with increasing altitude. This leads in reality to deviation of pressure and density from the barometric pressure law.
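The barometric pressure law lends itself to a short numerical sketch. The Python code below evaluates (3.71) and (3.72) with the values quoted above and g = 10 m/s²; like the formulas themselves, it assumes a constant temperature, and the sample altitudes are arbitrary choices.

```python
import math


def scale_height(p0, rho0, g=10.0):
    """Characteristic altitude h0 of the atmosphere, Eq. (3.72)."""
    return p0 / (rho0 * g)


def pressure_at_height(z, p0=1e5, rho0=1.29, g=10.0):
    """Air pressure at height z above the ground (isothermal model), Eq. (3.71)."""
    return p0 * math.exp(-z / scale_height(p0, rho0, g))


h0 = scale_height(1e5, 1.29)
print(f"characteristic height h0 = {h0:.0f} m")

# Arbitrary sample altitudes in metres
for z in (0.0, 2000.0, 4000.0, 8000.0):
    print(f"z = {z:6.0f} m  ->  p = {pressure_at_height(z) / 1e5:.2f} bar")
```

At the height `h0` the pressure has dropped to 1/e, that is, to about 37% of its ground value.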
3.3
Hydrodynamics
Hydrodynamics is the study of the flow of liquids and gases (fluids). A flow is described by the flow field. This is the velocity $\vec{v}(\vec{r}, t)$ of the flowing medium at any location $\vec{r} = (x, y, z)$ and at any time $t$. Graphically, a velocity field at a fixed time can be thought of as having a small velocity vector attached to each location in space, indicating the velocity of the medium at that location. In Fig. 3.8, the flow field of a river is sketched as an example. In such a sketch, for the sake of clarity, one must make a judicious selection of locations where the velocity vector is represented. If one connects velocity vectors that lie behind each other, one obtains the streamlines. Particles that are carried along with the flow move along the streamlines, for example, a petal in a river flow. We now distinguish between different types of flow. If the streamlines run smoothly alongside each other, as in a calm stretch of river, it is called a laminar flow. This type of flow is usually also stationary, that is, the flow field is constant in time. In contrast, highly complex flow fields occur in turbulent flows, for example, vortices, whose shape changes constantly over time. Turbulent flows are therefore unsteady. Flows can also be classified as compressible (for flowing gases) or incompressible (for flowing liquids) according to the properties of the flowing medium. In addition, a distinction is made between ideal (non-viscous) flows, in which no friction occurs, and real (viscous) flows, in which frictional forces occur both in the fluid itself and between the fluid and the edges.

Fig. 3.8 Flow field of a river flow. The flow lines of the laminar flow map the shape of the river. At the constriction, the flow is faster; this is indicated by the greater length of the vectors. Just before the weir, however, the flow slows down. After the water has fallen over the weir and thus becomes very fast, turbulence occurs
3.3.1 Continuity Equation
In a stationary flow, conservation of mass applies. This means the following: If we look at any region in the flowing liquid (see Fig. 3.9), then the mass Δm1 that flows into the region per time Δt is exactly as large as the mass Δm2 that flows out at another point per time Δt. Loosely speaking, what goes in at the front must come out at the back. If this were not the case, the mass in the volume under consideration would change with time, and thus the flow would not be stationary. Let us assume that the incoming fluid has the density ρ1 and the outgoing fluid has the density ρ2. The mass Δm1,2 corresponds to the volume ΔV1,2 via
$$ \Delta m_{1,2} = \rho_{1,2}\, \Delta V_{1,2}. \tag{3.74} $$
During the time Δt the flow in the two volumes ΔV1,2 with their cross-sectional areas A1,2 moves on by the distance Δx1,2, thus
$$ \Delta V_{1,2} = A_{1,2}\, \Delta x_{1,2}. \tag{3.75} $$
The flow velocity is in each case
$$ v_{1,2} = \frac{\Delta x_{1,2}}{\Delta t}. \tag{3.76} $$
Thus one can express the mass flowing in or out per time by:
$$ \frac{\Delta m_{1,2}}{\Delta t} = \frac{\rho_{1,2}\, A_{1,2}\, \Delta x_{1,2}}{\Delta t} = \rho_{1,2}\, A_{1,2}\, v_{1,2}. \tag{3.77} $$
Fig. 3.9 Derivation of the continuity equation. The mass flowing in at the front per time, $\Delta m_1 / \Delta t$, is equal to the mass flowing out at the end per time, $\Delta m_2 / \Delta t$
By equating the mass flowing in and out per time, one finally obtains the continuity equation
$$ \rho_1 A_1 v_1 = \rho_2 A_2 v_2. \tag{3.78} $$
The continuity equation becomes even simpler in incompressible fluids, since in this case the densities are equal, ρ1 = ρ2, and can be cancelled in (3.78). One obtains
$$ A_1 v_1 = A_2 v_2. \tag{3.79} $$
The smaller the cross-sectional area, the greater the flow velocity. The product A v of the cross-sectional area and the flow velocity is constant in the liquid. This value can be interpreted as the volume ΔV of the liquid flowing per time Δt and is referred to as the volume flow rate I:
$$ I = \frac{\Delta V}{\Delta t} = A\, v, \qquad [I] = \frac{\text{m}^3}{\text{s}}. \tag{3.80} $$
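The continuity equation for incompressible fluids and the volume flow rate can be tried out with a few lines of Python. This is only a minimal sketch; the function names and the pipe dimensions in the usage part are assumptions chosen for illustration.

```python
def volume_flow_rate(area, velocity):
    """Volume flow rate I = A * v in m^3/s (area in m^2, velocity in m/s)."""
    return area * velocity


def velocity_after_change(area_1, velocity_1, area_2):
    """Velocity v2 from A1 * v1 = A2 * v2 for an incompressible fluid."""
    return area_1 * velocity_1 / area_2


if __name__ == "__main__":
    # Assumed example: a pipe narrowing from 4 cm^2 to 1 cm^2.
    a1, v1, a2 = 4e-4, 0.5, 1e-4
    v2 = velocity_after_change(a1, v1, a2)
    print(f"v2 = {v2:.2f} m/s")
    # The volume flow rate is the same in both sections.
    print(f"I1 = {volume_flow_rate(a1, v1):.2e} m^3/s")
    print(f"I2 = {volume_flow_rate(a2, v2):.2e} m^3/s")
```

Reducing the cross section to a quarter raises the velocity by a factor of four, while the volume flow rate stays the same.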
Examination Task: Fill Watering Can
The nozzle at the end of a garden hose has a cross-sectional area of A1 = 0.1 cm². It takes 25 s to fill a watering can with a capacity of 5 l (1 l = 10⁻³ m³).
1. What is the velocity v1 of the water flowing out of the nozzle?
2. What is the velocity v2 of the water flow in the hose if it has a cross-sectional area of A2 = 2.8 cm²?
Solution
1. The volume flow rate is given by
$$ I = \frac{\Delta V}{\Delta t} = \frac{5 \times 10^{-3}\,\text{m}^3}{25\,\text{s}} = 2 \times 10^{-4}\,\frac{\text{m}^3}{\text{s}}. \tag{3.81} $$
The velocity results from this with (3.80):
$$ v_1 = \frac{I}{A_1} = \frac{2 \times 10^{-4}\,\frac{\text{m}^3}{\text{s}}}{0.1 \times 10^{-4}\,\text{m}^2} = 20\,\frac{\text{m}}{\text{s}}. \tag{3.82} $$
2. The velocity in the garden hose is given by the continuity Eq. (3.79)
$$ v_2 = \frac{A_1}{A_2}\, v_1 = \frac{0.1\,\text{cm}^2}{2.8\,\text{cm}^2} \cdot 20\,\frac{\text{m}}{\text{s}} = 0.71\,\frac{\text{m}}{\text{s}}. \tag{3.83} $$
3.3.2 Bernoulli Effect
In ideal incompressible fluids there is a simple relationship between the flow velocity of a flow and its pressure. The relationship can be derived by assuming that the energy density (= energy per volume) in the fluid is constant. Here two forms of energy occur. The first form of energy is based on the pressure in the fluid. Since liquid with a pressure p is able to build up a liquid column of height h, the pressure energy per volume can be expressed using the gravitational pressure (3.31) by
$$ p = \rho\, g\, h = \frac{m}{V}\, g\, h = \frac{E_\text{pot}}{V}. \tag{3.84} $$
The pressure therefore corresponds, with Epot = mgh, to a kind of potential energy per volume. The second form of energy is the kinetic energy of the flowing fluid. Per volume the kinetic energy is given by
$$ \frac{E_\text{kin}}{V} = \frac{\frac{1}{2} m v^2}{V} = \frac{1}{2} \rho v^2. \tag{3.85} $$
We now consider in Fig. 3.10a a narrowing tube in which a fluid flows. Because of the change in cross-sectional area, the flow velocity in the two regions differs according to (3.79) and is denoted by v1 and v2, respectively. Correspondingly, the pressure is p1 and p2, respectively. Conservation of energy means that the sum of potential and kinetic energy per volume is exactly the same in region 1 as in region 2:
$$ \frac{E_\text{pot,1}}{V} + \frac{E_\text{kin,1}}{V} = \frac{E_\text{pot,2}}{V} + \frac{E_\text{kin,2}}{V}. \tag{3.86} $$
Fig. 3.10 (a) Derivation of Bernoulli’s equation. The faster the flow velocity, the lower the pressure. (b) Written test: In which section of the tubes is the pressure lowest?
Substituting (3.84) and (3.85) into (3.86), we obtain Bernoulli’s equation
$$ p_1 + \frac{1}{2} \rho v_1^2 = p_2 + \frac{1}{2} \rho v_2^2. \tag{3.87} $$
The greater the flow velocity, the smaller the pressure at that point in the fluid. This is because the total energy must be divided between pressure and flow velocity. Therefore, if more kinetic energy is required, the amount of pressure energy must decrease. For flows where not only the flow velocity changes, but also the height, one must add gravity pressure to Bernoulli’s equation (3.87). An example of this would be a water pipe going from the ground floor at height h1 to an upper floor at height h2. The full law is given by
$$ p_1 + \frac{1}{2} \rho v_1^2 + \rho g h_1 = p_2 + \frac{1}{2} \rho v_2^2 + \rho g h_2. \tag{3.88} $$
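To get a feeling for the size of the Bernoulli effect, the full law (3.88) can be solved for p2 and evaluated numerically. The following Python sketch does this; the function name and the water-pipe numbers in the usage part are assumptions for illustration only.

```python
G = 9.81  # acceleration due to gravity in m/s^2


def pressure_downstream(p1, v1, v2, rho, h1=0.0, h2=0.0, g=G):
    """Solve Bernoulli's equation (3.88) for the pressure p2 in Pa."""
    return p1 + 0.5 * rho * (v1**2 - v2**2) + rho * g * (h1 - h2)


if __name__ == "__main__":
    rho_water = 1000.0  # kg/m^3
    p1 = 2.0e5          # Pa, assumed upstream pressure
    v1 = 1.0            # m/s, assumed upstream velocity
    v2 = 4.0            # m/s, e.g. after the cross section shrinks to a quarter
    # Same height on both sides: only the velocity term changes the pressure.
    print(f"p2 (horizontal) = {pressure_downstream(p1, v1, v2, rho_water):.0f} Pa")
    # Same velocities, but the pipe rises by 3 m: only the height term acts.
    print(f"p2 (3 m higher) = {pressure_downstream(p1, v1, v1, rho_water, 0.0, 3.0):.0f} Pa")
```

Both a higher velocity and a greater height lower the pressure, in line with the discussion of (3.87) and (3.88).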
Examination Task: Pressure in Tubes
Consider the tube system sketched in Fig. 3.10b, through which a liquid flows. In which section (A) to (E) of the pipe system is the pressure the lowest?
Solution
From area (A) via (B) to (C) (all three areas are at the same height) the tube cross-sectional area decreases. Therefore, according to the continuity equation, the flow velocity increases here. According to Bernoulli’s equation, the pressure decreases accordingly: pA > pB > pC. From area (C) to area (D) the tube leads upwards at the same cross section (against gravity). Therefore, the pressure in the fluid continues to decrease: pC > pD. Finally, from area (D) to area (E) the cross-sectional area increases again, the velocity decreases, and the pressure increases again: pD < pE. The smallest pressure is therefore reached in area (D).
In the following sections “Pitot tube”, “Aerodynamic lift force on the aircraft wing”, “Magnus effect”, “Hydrodynamic paradox” and “Torricelli’s law” we will get to know some examples where Bernoulli’s equation plays a role.
3.3.2.1 Pitot Tube
Pitot tubes are used in aircraft to determine the velocity v of the aircraft relative to the surrounding air (see Fig. 3.11a). The Pitot tube has two air inlets. One of them is located laterally to the air flowing past. The static air pressure pstat prevails in the tube connected to it; this pressure would also be measured if the aircraft were at rest relative to the air. The second air inlet is located on the front side in the direction of movement, so that the incoming air generates, in addition to the static pressure, the so-called dynamic pressure pdyn in the adjacent tube (the actual pitot tube). Both tubes are internally connected by a pressure gauge, which measures the pressure difference between them. Now we identify these quantities with the variables in Bernoulli’s equation (3.87): since the air does not move in the pitot tube, v1 = 0; the corresponding pressure is p1 = pstat + pdyn. The velocity of the air flowing past the side tube is v2 = v, and the corresponding pressure is p2 = pstat. Bernoulli’s equation is thus
$$ p_\text{stat} + p_\text{dyn} = p_\text{stat} + \frac{1}{2} \rho v^2. \tag{3.89} $$
The pressure gauge measures the pressure difference Δp = p1 - p2 = pdyn. Thus one can solve (3.89) for the aircraft speed and obtains
$$ v = \sqrt{\frac{2 \Delta p}{\rho}}. \tag{3.90} $$
For a reliable measurement of v, the air density ρ must also be known.
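Equation (3.90) translates directly into a small conversion routine. The following Python sketch is only an illustration: the function names and the sample values (air density and speed) are assumptions, and compressibility is neglected as in the derivation above.

```python
import math


def dynamic_pressure(v, rho):
    """Dynamic pressure p_dyn = 0.5 * rho * v^2 in Pa."""
    return 0.5 * rho * v**2


def airspeed_from_pitot(delta_p, rho):
    """Airspeed v = sqrt(2 * delta_p / rho) from the measured pressure difference."""
    if delta_p < 0 or rho <= 0:
        raise ValueError("delta_p must be non-negative and rho positive")
    return math.sqrt(2.0 * delta_p / rho)


if __name__ == "__main__":
    rho_air = 1.2   # kg/m^3, assumed air density near the ground
    v_true = 70.0   # m/s, assumed airspeed
    delta_p = dynamic_pressure(v_true, rho_air)
    print(f"delta_p = {delta_p:.0f} Pa")
    print(f"v = {airspeed_from_pitot(delta_p, rho_air):.1f} m/s")
    # Using a wrong density changes the result, which is why rho must be known.
    print(f"v (rho = 0.9) = {airspeed_from_pitot(delta_p, 0.9):.1f} m/s")
```

The last line of the usage part shows how a wrong air density falsifies the computed airspeed.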
3.3.2.2 Aerodynamic Lift Force on the Aircraft Wing
The Bernoulli effect also leads to a lift force acting on the wings of an aircraft. We assume that the air flow in front of and behind the wing is homogeneous and equal (see Fig. 3.11b). Due to the shape of the wing, however, the path of the air flow along the upper side is longer than along the lower side. The air flowing over the top must therefore cover a greater distance than the air flowing along the bottom. If we also assume that this does not significantly change the air density, the air speed vo above the wing must be greater than the air speed vu below the wing. This assumption is of course not quite legitimate, as air flows are compressible. Here, however, we will only sketch approximately what happens physically. Since vo > vu, Bernoulli’s equation leads to po < pu. The pressure difference Δp = pu - po leads to an upward force F = Δp A, where A is the area of the wing. Contrary to popular belief, however, this force accounts for only a (smaller) portion of the force that makes an airplane fly. The larger part is due to the fact that the airflow is deflected downwards by the wings. The force acting upwards on the aircraft then results from the conservation of momentum.
Fig. 3.11 Examples of Bernoulli’s equation: (a) The Pitot tube is used to determine the speed of an aircraft relative to the air by measuring the dynamic pressure pdyn of the incoming air. (b) Aerodynamic lift of an aircraft wing. Due to the faster airflow at the top, the pressure under the wing is greater
3.3.2.3 Magnus Effect
The Magnus effect occurs when objects rotate in a flow. This results in a force perpendicular to the direction of flow. Well-known examples are the curling “banana” cross in football or sliced balls in tennis and table tennis. The balls fly on a curved path due to their rotation. This can be explained as follows: We consider a flying rotating ball in Fig. 3.12a. In the reference frame moving along with the centre of gravity of the ball, the air flows past the ball from the front with velocity v. Due to the rotation, one side of the ball (left side in the sketch) moves with the air flow and the other (right) side of the ball moves in the opposite direction. As there is friction between the ball surface and the air molecules, the molecules on the left side of the ball are accelerated in the direction of the air flow, while those on the right side are slowed down. As a result, the flow velocity on the left side is greater than on the right side. Therefore, according to Bernoulli’s equation, there is less pressure on the left side than on the right side, and there is a force on the ball to the left.
3.3.2.4 Hydrodynamic Paradox
The hydrodynamic paradox can be observed when a liquid or gas flows out of a pipe end with a movable lid (Fig. 3.12b). One might think that the flow always pushes the lid away from the end of the pipe. In fact, one observes that the lid is attracted to the end of the pipe within a certain range of distances, which seems paradoxical at first sight. However, this effect can be explained with the help of Bernoulli’s equation. If the lid is almost at the end of the pipe, the fluid can only flow out through the resulting gap, which has a small cross-sectional area compared to the pipe. According to the continuity equation, the flow velocity in the gap is correspondingly large. Directly above the lid, however, the air is at rest; therefore, there is a greater pressure above the lid than in the gap. This leads to a force that pushes the lid further towards the pipe end and is greater the smaller the gap becomes. However, this effect only persists as long as the fluid actually flows through the gap. When the lid seals the gap, the velocity in the gap is zero, and the lid is forced away from the end of the pipe by the back pressure. This can cause the lid to be alternately attracted and repelled by the end of the pipe, causing it to rattle or vibrate.
Fig. 3.12 Further examples of Bernoulli’s equation: (a) The Magnus effect causes rotating balls to fly on curves. (b) Hydrodynamic paradox: A disc is attracted by the outflowing air. (c) Torricelli’s law: The outflow velocity of the liquid from the water tank is identical to free fall
3.3.2.5 Torricelli’s Law
Torricelli’s law gives the velocity v2 with which liquid flows out of a large container (see Fig. 3.12c). The derivation is carried out using Bernoulli’s equation by considering, on the one hand, the liquid level sinking with velocity v1, on which the air pressure p1 rests, and, on the other hand, the liquid flowing out further down with velocity v2. After the outflow, only the air pressure p2 acts on the liquid. The Bernoulli equation including the height difference h between the liquid level at the top and the outlet at the bottom reads:
$$ p_1 + \frac{1}{2} \rho v_1^2 + \rho g h = p_2 + \frac{1}{2} \rho v_2^2. \tag{3.91} $$
Since the gravity pressure of air is typically much smaller than that of the liquid, the air pressure at the top and bottom can be equated, p1 = p2. Moreover, for a container whose cross-sectional area is much larger than the cross-sectional area of the outlet, the sinking velocity is much smaller than the outlet velocity. Therefore, one can set v1 = 0 to a good approximation. Thus the Bernoulli equation reads
$$ \rho g h = \frac{1}{2} \rho v_2^2. \tag{3.92} $$
This can be solved for the speed v2:
$$ v_2 = \sqrt{2 g h}. \tag{3.93} $$
We already know this result from free fall. The outlet velocity is exactly the same as the velocity of a mass after free fall through the height h. So you can imagine that the column of liquid falls freely from the top of the liquid to the outlet. Incidentally, the result is independent of the angle at which the liquid flows out of the container.
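The quality of the approximation v1 = 0 can be checked numerically. The following Python sketch compares (3.93) with the result obtained when the sinking velocity is kept via the continuity equation, v1 = v2 · A2/A1, which turns (3.91) with p1 = p2 into v2² (1 - (A2/A1)²) = 2gh. The function names and the sample values are assumptions for illustration.

```python
import math


def outflow_velocity(h, g=9.81):
    """Torricelli's law (3.93): v2 = sqrt(2 g h), assuming v1 = 0."""
    return math.sqrt(2.0 * g * h)


def outflow_velocity_with_sinking(h, area_ratio, g=9.81):
    """Outflow velocity if v1 = v2 * (A2/A1) is kept; area_ratio = A2/A1 < 1."""
    return math.sqrt(2.0 * g * h / (1.0 - area_ratio**2))


if __name__ == "__main__":
    h = 0.5  # m, assumed filling height above the outlet
    v_simple = outflow_velocity(h)
    print(f"v2 = {v_simple:.3f} m/s")
    for ratio in (0.5, 0.1, 0.01):
        v_full = outflow_velocity_with_sinking(h, ratio)
        print(f"A2/A1 = {ratio}: relative deviation {(v_full - v_simple) / v_simple:.2e}")
```

For an outlet that is small compared with the container cross section, the deviation from Torricelli’s law becomes negligible.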
3.3.3 Flow of Real Fluids
In contrast to ideal fluids, real fluids are characterised by the fact that friction effects play a role. Friction occurs both in the liquid and at the interface between the liquid and other substances, for example, at container walls. Microscopically, friction occurs when the molecules of the liquid move against each other in a flow. In this process, the attractive forces acting between neighbouring molecules have to be overcome. Macroscopically, friction appears as viscosity η. To define the viscosity, we consider in Fig. 3.13a two layers of a fluid at a distance dx, which move relative to each other with the velocity difference dv. The frictional force F between the layers defines the viscosity:
$$ F = \eta\, A\, \frac{dv}{dx}, \qquad [\eta] = \text{Pa s}. \tag{3.94} $$
Fig. 3.13 Friction in laminar flows. (a) Viscosity is defined in terms of the frictional force between layers of fluid passing each other. (b) For Newtonian fluids, the velocity of the flowing fluid decreases linearly with the perpendicular distance from the shear force
The frictional force is proportional to the area A with which the two liquid layers rub against each other and to the gradient of the velocity dv/dx. Typical values for the viscosity are, for example,
$$ \text{Water:} \quad \eta = 10^{-3}\,\text{Pa s}, \tag{3.95} $$
$$ \text{Honey:} \quad \eta = 10\,\text{Pa s}. \tag{3.96} $$
According to (3.94), in order to generate a velocity gradient in a fluid, one needs a force along the surface, that is, a shear force. An example of this is the wind passing over a water surface and acting with shear force F on the uppermost layer of water. This also causes fluid layers further down to move (see Fig. 3.13b). Fluids in which the viscosity does not depend on the velocity of the fluid layers or other parameters are called Newtonian fluids; formula (3.94) is also called the Newtonian friction force. In this case, the velocity v(x) decreases linearly with increasing water depth (x is measured upwards from the bottom). This can be derived by integrating (3.94). If the total water depth D is not too large, the linear decrease continues to the bottom of the water; there the velocity is v(0) = 0 because of friction with the resting bottom. Thus, the velocity gradient in (3.94) can be expressed by the change in velocity Δv = v0 per depth of water D:
$$ F = \eta\, A\, \frac{v_0}{D}, \tag{3.97} $$
where the velocity of the top layer of water is v0. To a good approximation, water is indeed a Newtonian fluid; however, there are many other fluids that exhibit non-Newtonian behaviour, such as blood, quicksand, or cement. Such fluids can behave like a solid when loaded quickly, but like a liquid when loaded slowly. An example of this is bouncing putty, which can be deformed plastically when loaded slowly, but behaves elastically like a solid when deformed quickly. As an application of Newtonian friction, we will now calculate the friction on a sphere in laminar flow.
3.3.3.1 Application: Stokes Law of Friction
We consider a sphere with density ρK and radius R moving in a fluid with density ρFl and viscosity η with velocity $\vec{v}$ (see Fig. 3.14a). The corresponding flow of the medium with velocity $-\vec{v}$ around the sphere is assumed to be laminar (i.e., without turbulence and vortices). Then the laminar friction force
$$ \vec{F}_R = -6 \pi \eta R\, \vec{v} \tag{3.98} $$
acts on the sphere, against the direction of motion $\vec{v}$ of the sphere. This can be derived from Newton’s friction force (without proof). We now want to calculate the constant velocity v with which the sphere sinks in the liquid due to the gravitational force of the earth. For this we must consider the forces acting on the sphere with their respective directions:
the weight force
$$ F_g = \rho_K\, \frac{4\pi}{3} R^3 g \quad (\text{downwards}), \tag{3.99} $$
the buoyancy force
$$ F_A = \rho_{Fl}\, \frac{4\pi}{3} R^3 g \quad (\text{upwards}), \tag{3.100} $$
the Stokes friction force
$$ F_R = 6 \pi \eta R v \quad (\text{upwards}). \tag{3.101} $$
Since the velocity of the sphere is constant, the sum of all forces must be zero:
$$ \rho_K\, \frac{4\pi}{3} R^3 g - \rho_{Fl}\, \frac{4\pi}{3} R^3 g - 6 \pi \eta R v = 0. \tag{3.102} $$
We solve this equation for the velocity v and get:
$$ v = \frac{2 g (\rho_K - \rho_{Fl})}{9 \eta}\, R^2. \tag{3.103} $$
Fig. 3.14 Applications of Newtonian friction. (a) Stokes’ friction force in the motion of a sphere with constant velocity v in a viscous fluid. (b) In laminar flow in a pipe, the pressure decreases linearly with the length of the pipe. In the radial direction, the velocity profile of the fluid is quadratic. The volume flow rate is given by the law of Hagen-Poiseuille
The sinking speed is proportional to the square of the radius of the sphere. Larger spheres of the same density sink faster.
Examination Task: Sinking Spheres
Given are three spheres: sphere (A) with mass m1 = 1 g and radius R1 = 0.6 cm, sphere (B) with mass m2 = 1.5 g and radius R2 = 0.7 cm and sphere (C) with mass m3 = 2.3 g and radius R3 = 0.8 cm. Rank the spheres according to their sinking speed in water (density ρW = 1 g/cm³) from slow to fast.
Solution
We need the densities of the three spheres:
$$ \rho_1 = \frac{m_1}{\frac{4\pi}{3} R_1^3} = 1.105\,\frac{\text{g}}{\text{cm}^3}, \tag{3.104} $$
$$ \rho_2 = \frac{m_2}{\frac{4\pi}{3} R_2^3} = 1.044\,\frac{\text{g}}{\text{cm}^3}, \tag{3.105} $$
$$ \rho_3 = \frac{m_3}{\frac{4\pi}{3} R_3^3} = 1.072\,\frac{\text{g}}{\text{cm}^3}. \tag{3.106} $$
Since the factor 2g/(9η) in (3.103) is identical for all three spheres, it is sufficient to compare the remaining factor (ρK - ρFl) R² of formula (3.103):
$$ (\rho_1 - \rho_W)\, R_1^2 = 0.038\,\frac{\text{g}}{\text{cm}}, \tag{3.107} $$
$$ (\rho_2 - \rho_W)\, R_2^2 = 0.022\,\frac{\text{g}}{\text{cm}}, \tag{3.108} $$
$$ (\rho_3 - \rho_W)\, R_3^2 = 0.046\,\frac{\text{g}}{\text{cm}}. \tag{3.109} $$
This expression is proportional to the sinking speed of the respective sphere. Thus sphere (B) sinks slowest, sphere (A) somewhat faster and sphere (C) fastest.
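The ranking can be reproduced with a short Python sketch that evaluates the factor (ρK - ρFl) R² from (3.103) for the three spheres of the task. The function and variable names are illustrative assumptions; masses and radii are those given in the task, in g and cm.

```python
import math

RHO_WATER = 1.0  # g/cm^3, density of water as given in the task


def density(mass_g, radius_cm):
    """Density of a homogeneous sphere in g/cm^3."""
    return mass_g / (4.0 / 3.0 * math.pi * radius_cm**3)


def sinking_factor(mass_g, radius_cm, rho_fluid=RHO_WATER):
    """(rho_K - rho_Fl) * R^2 in g/cm, proportional to the sinking speed (3.103)."""
    return (density(mass_g, radius_cm) - rho_fluid) * radius_cm**2


# (mass in g, radius in cm) for the three spheres of the task
spheres = {"A": (1.0, 0.6), "B": (1.5, 0.7), "C": (2.3, 0.8)}

# Sort the labels from slowest to fastest sinking sphere.
ranking = sorted(spheres, key=lambda name: sinking_factor(*spheres[name]))

if __name__ == "__main__":
    for name in ranking:
        mass, radius = spheres[name]
        print(name, f"rho = {density(mass, radius):.3f} g/cm^3,",
              f"factor = {sinking_factor(mass, radius):.3f} g/cm")
```

Multiplying the factor by 2g/(9η) in consistent units would give the actual sinking speed, provided the flow around the sphere stays laminar.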
3.3.3.2 Flows in Tubes: Hagen-Poiseuille Law
As a further application of Newton’s law of friction (3.94), we treat the flow of a fluid with viscosity η through a pipe of radius R and length l. Due to friction, the flow requires a pressure difference Δp = p1 - p2 between the two ends of the pipe (see Fig. 3.14b). Here the pressure in the pipe decreases linearly along the length of the pipe. Using (3.94), one can also calculate the velocity profile of the flow in the pipe (without proof). The distance from the centre of the pipe is denoted by r. Due to the friction of the fluid with the pipe wall, the velocity at the edge is v(r = R) = 0. The velocity then increases radially inward and reaches its maximum in the centre at r = 0. The velocity profile follows a quadratic function:
$$ v(r) = \frac{\Delta p}{4 \eta l} \left( R^2 - r^2 \right). \tag{3.110} $$
From (3.110), one can also calculate the volume flow rate (3.80) in the pipe by integrating the velocity v(r) over the cross-sectional area of the pipe. The result is the law of Hagen-Poiseuille:
$$ I = \frac{\pi\, \Delta p}{8 \eta l}\, R^4. \tag{3.111} $$
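As a plausibility check of (3.110) and (3.111), the following Python sketch integrates the velocity profile numerically over the pipe cross section and compares the result with the closed-form flow rate. The function names and the sample values in the usage part are illustrative assumptions, not taken from the text.

```python
import math


def velocity_profile(r, delta_p, eta, length, radius):
    """Laminar velocity profile v(r) from Eq. (3.110), in m/s."""
    return delta_p / (4.0 * eta * length) * (radius**2 - r**2)


def volume_flow_rate(delta_p, eta, length, radius):
    """Hagen-Poiseuille law, Eq. (3.111), in m^3/s."""
    return math.pi * delta_p * radius**4 / (8.0 * eta * length)


def volume_flow_rate_numeric(delta_p, eta, length, radius, steps=10000):
    """Midpoint-rule integral of v(r) over ring areas 2*pi*r*dr from 0 to R."""
    dr = radius / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += velocity_profile(r, delta_p, eta, length, radius) * 2.0 * math.pi * r * dr
    return total


if __name__ == "__main__":
    # Assumed example: water in a thin tube of 1 mm radius and 1 m length.
    delta_p, eta, length, radius = 1.0e3, 1.0e-3, 1.0, 1.0e-3
    print(f"closed form : {volume_flow_rate(delta_p, eta, length, radius):.4e} m^3/s")
    print(f"numerical   : {volume_flow_rate_numeric(delta_p, eta, length, radius):.4e} m^3/s")
```

The numerical integral agrees with the closed form, which illustrates where the factor R⁴ comes from: the area grows with R² and the velocity at a given relative position also grows with R².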
The volume flow rate increases with the fourth power of the pipe radius. If a pipe is narrowed to half the radius by deposits, only one sixteenth of the original volume of liquid will flow through it at the same pressure difference. This plays a major role in the calcification of pipelines, for example.
Written Test: Garden Hose
A garden hose has a length of l = 20 m and a radius of R = 5 mm. The water pressure at the tap where the hose is connected is p1 = 5 bar; at the end of the hose the pressure has dropped by Δp = 1 × 10⁵ Pa to p2 = 4 bar.
1. How long does it take to fill a paddling pool with V = 1 m³ of water?
2. What is the maximum flow velocity in the hose?
Solution
1. According to (3.111) the volume flow rate is
$$ I = \frac{\pi \cdot 1 \times 10^{5}\,\text{Pa} \cdot (5 \times 10^{-3}\,\text{m})^4}{8 \cdot 10^{-3}\,\text{Pa s} \cdot 20\,\text{m}} = 1.23 \times 10^{-3}\,\frac{\text{m}^3}{\text{s}}. \tag{3.112} $$
With I = ΔV/Δt the filling time follows as
$$ \Delta t = \frac{\Delta V}{I} = \frac{1\,\text{m}^3}{1.23 \times 10^{-3}\,\frac{\text{m}^3}{\text{s}}} = 815\,\text{s} \approx 13.6\,\text{min}. \tag{3.113} $$
2. The maximum speed is reached in the middle of the hose. Using (3.110), it amounts to
$$ v(r = 0) = \frac{1 \times 10^{5}\,\text{Pa} \cdot (5 \times 10^{-3}\,\text{m})^2}{4 \cdot 10^{-3}\,\text{Pa s} \cdot 20\,\text{m}} = 31.25\,\frac{\text{m}}{\text{s}}. \tag{3.114} $$
3.3.4 Turbulence
So far we have only considered laminar flows, with a friction force proportional to the velocity, see, for example, Eq. (3.97). However, for large flow velocities, turbulent behaviour of the fluid and the formation of vortices can occur. Since the generation of vortices requires much more energy, the frictional force in turbulent flows typically increases faster than linearly with velocity, for example, quadratically. First, however, let us explore the question of when turbulence begins. Since turbulent behaviour is difficult to predict and depends very much on the circumstances, we can only give a rough guide value here. To do this, one calculates the Reynolds number
$$ Re = \frac{\rho\, d\, v}{\eta}, \qquad [Re] = 1 \tag{3.115} $$
of a liquid with viscosity η, density ρ and flow velocity v. The quantity d denotes the characteristic geometric length scale on which the flow is disturbed. If, for example, a liquid flows through a pipe, d would be given by the pipe diameter. Turbulence sets in when the Reynolds number reaches values of roughly
$$ Re \gtrsim 1000. \tag{3.116} $$
This is only an approximate indication; it is not possible to predict exactly when turbulence begins. If a fluid flows with velocity v around a body with cross-sectional area A, the turbulent friction force is
$$ F_R = \frac{1}{2}\, \rho\, c_W\, A\, v^2. \tag{3.117} $$
The shape of the body is also decisive for the strength of the frictional force; it is accounted for by the so-called cW value. Examples of the cW value are:
$$ \text{Parachute:} \quad c_W = 1.33, \tag{3.118} $$
$$ \text{Road bike:} \quad c_W = 0.4, \tag{3.119} $$
$$ \text{Aircraft:} \quad c_W = 0.08, \tag{3.120} $$
$$ \text{Penguin:} \quad c_W = 0.03. \tag{3.121} $$
Examination Task: Parachute Jump
We want to calculate the descent rate v of a parachutist. The parachute has a radius of R = 4 m, the total mass of the parachutist including the parachute is m = 100 kg, and the viscosity of the air is η = 17.1 × 10⁻⁶ Pa s.
1. Calculate v assuming laminar flow. Describe the parachute as a sphere with radius R.
2. Calculate v assuming turbulent flow. Use the cW value for the parachute, cW = 1.33, and the air density ρ = 1.2 kg/m³.
Solution
1. Neglecting the buoyancy force, the parachute is subject to the gravitational force of the earth and Stokes’ friction. We equate the two forces,
$$ m g = 6 \pi \eta R v, \tag{3.122} $$
and solve for the speed:
$$ v = \frac{m g}{6 \pi \eta R} = \frac{100\,\text{kg} \cdot 10\,\text{m/s}^2}{6 \pi \cdot 17.1 \times 10^{-6}\,\text{Pa s} \cdot 4\,\text{m}} = 775\,609\,\frac{\text{m}}{\text{s}}. \tag{3.123} $$
At this high speed, the skydiver would not survive the impact on the ground.
2. We now set the turbulent frictional force equal to the gravitational force,
$$ m g = \frac{1}{2}\, \rho\, c_W\, A\, v^2, \tag{3.124} $$
and solve for the velocity v:
$$ v = \sqrt{\frac{2 m g}{\rho\, c_W\, A}} = \sqrt{\frac{200\,\text{kg} \cdot 10\,\text{m/s}^2}{1.2\,\text{kg/m}^3 \cdot 1.33 \cdot \pi\, (4\,\text{m})^2}} = 5.0\,\frac{\text{m}}{\text{s}}. \tag{3.125} $$
At this value, the impact on the ground is already somewhat more pleasant. It is noteworthy here that skydiving works only thanks to the turbulence in the air.
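The two results can be reproduced, and their consistency checked with the Reynolds number (3.115), in a few lines of Python. This is a sketch under stated assumptions: the function names are illustrative, g = 10 m/s² is used as in the solution above, and the parachute diameter 2R is taken as the characteristic length d, which is a choice made here and not specified in the task.

```python
import math


def stokes_velocity(mass, eta, radius, g=10.0):
    """Descent rate from m g = 6 pi eta R v (laminar Stokes friction)."""
    return mass * g / (6.0 * math.pi * eta * radius)


def turbulent_velocity(mass, rho, c_w, area, g=10.0):
    """Descent rate from m g = 0.5 rho c_W A v^2 (turbulent friction)."""
    return math.sqrt(2.0 * mass * g / (rho * c_w * area))


def reynolds_number(rho, d, v, eta):
    """Reynolds number Re = rho d v / eta (dimensionless)."""
    return rho * d * v / eta


# Values from the task
M, R, ETA, RHO, C_W = 100.0, 4.0, 17.1e-6, 1.2, 1.33

v_laminar = stokes_velocity(M, ETA, R)
v_turbulent = turbulent_velocity(M, RHO, C_W, math.pi * R**2)
# Assumption: characteristic length d = parachute diameter 2R.
re_turbulent = reynolds_number(RHO, 2.0 * R, v_turbulent, ETA)

if __name__ == "__main__":
    print(f"laminar   : v = {v_laminar:.0f} m/s")
    print(f"turbulent : v = {v_turbulent:.2f} m/s")
    print(f"Re at the turbulent descent rate: {re_turbulent:.2e}")
```

Even at the slow turbulent descent rate the Reynolds number lies far above the guide value of 1000, so the laminar (Stokes) description is not applicable to the parachute.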
3.4 Continuum Mechanics: Compact
Here again the most important formulas for continuum mechanics are summarized:
Length change (3.7) and transverse contraction (3.8):
$$ \Delta l = \frac{F\, l}{E A}, \qquad \frac{\Delta R}{R} = -\mu\, \frac{\Delta l}{l}. $$
Bending with one-sided (3.10) and two-sided support (3.11):
$$ \Delta = \frac{1}{E}\, \frac{4 l^3}{d^3 b}\, F, \qquad \Delta = \frac{1}{E}\, \frac{l^3}{4 d^3 b}\, F. $$
Shear (3.21):
$$ \Delta x = \frac{1}{G}\, \frac{h}{A}\, F. $$
Torsion (3.24):
$$ \varphi = \frac{2 l}{G \pi R^3}\, F. $$
Gravity pressure (3.31) and plunger pressure (3.33):
$$ p_S = \rho g h, \qquad p_0 = \frac{F}{A}. $$
Compression (3.36) and compression work (3.38):
$$ \frac{\Delta V}{V} = \frac{p}{K} = \kappa\, p, \qquad \Delta W = -p\, \Delta V. $$
Buoyancy force (3.43):
$$ F_A = \rho_{Fl}\, g\, V_K. $$
Surface tension (3.52):
$$ \sigma = \frac{\Delta E}{\Delta A}. $$
Force on an edge line (3.62):
$$ F = \sigma\, l. $$
Overpressure in liquid bubbles (3.64):
$$ \Delta p = \frac{4 \sigma}{R}. $$
Capillary rise and depression (3.65):
$$ \Delta h = \frac{2 \sigma}{r \rho g}. $$
Boyle-Mariotte law (3.69):
$$ p_1 V_1 = p_2 V_2. $$
Barometric pressure law (3.71):
$$ p(z) = p_0\, e^{-\frac{\rho_0 g}{p_0} z}. $$
Continuity equation in incompressible fluids (3.79):
$$ A_1 v_1 = A_2 v_2. $$
Volume flow rate (3.80):
$$ I = \frac{\Delta V}{\Delta t} = A\, v. $$
Bernoulli’s law (3.88):
$$ p_1 + \frac{1}{2} \rho v_1^2 + \rho g h_1 = p_2 + \frac{1}{2} \rho v_2^2 + \rho g h_2. $$
Stokes law of friction (3.98):
$$ \vec{F}_R = -6 \pi \eta R\, \vec{v}. $$
Hagen-Poiseuille law (3.110) and (3.111):
$$ v(r) = \frac{\Delta p}{4 \eta l} \left( R^2 - r^2 \right), \qquad I = \frac{\pi\, \Delta p}{8 \eta l}\, R^4. $$
Reynolds number (3.115) and onset of turbulence (3.116):
$$ Re = \frac{\rho\, d\, v}{\eta} \gtrsim 1000. $$
Turbulent friction force (3.117):
$$ F_R = \frac{1}{2}\, \rho\, c_W\, A\, v^2. $$
4 Oscillations and Waves
4.1 Harmonic Oscillation
Oscillations occur in many physical systems, for example, in mechanics as pendulum oscillations, spring oscillations or as vibrations of solid bodies; in thermodynamics they occur as oscillations of single atoms in a solid lattice, in electrodynamics as oscillating electric or magnetic fields and in atomic physics as oscillations of electrons in atoms. It is remarkable that all these oscillations can be described mathematically identically:
$$ x(t) = x_0 \sin(\omega t + \varphi). \tag{4.1} $$
Here, the displacement x(t) is a time-periodic function which describes the oscillating quantity (see Fig. 4.1). However, the exact meaning of the quantity x depends on the system; for example, in the case of spring oscillation, the quantity x describes the displacement of the spring from the rest position. The prefactor x0 before the sine is the amplitude of the oscillation and indicates the maximum value of the oscillating quantity. The term “harmonic” means here that the variable oscillates with a fixed angular frequency ω, which is linked to the period T and the frequency f in Hertz in analogy to the uniform angular motion in Sect. 2.1.4:
$$ \omega = \frac{2\pi}{T}, \qquad [\omega] = \frac{1}{\text{s}}, \tag{4.2} $$
$$ f = \frac{1}{T}, \qquad [f] = \frac{1}{\text{s}} = \text{Hz}. \tag{4.3} $$
Another parameter is the phase φ of the oscillation. It specifies the value at which the oscillation starts at t = 0. A change of φ shifts the curve x(t) to the left or right in time. Thus one can turn the sine function in (4.1) into a cosine function with a phase of φ = 90° (Fig. 4.1). The harmonic oscillation x(t) from (4.1) is also the solution of the following differential equation (DEQ):
$$ \ddot{x}(t) = -\omega^2\, x(t). \tag{4.4} $$
To prove this, we differentiate the function x(t) = x0 sin(ωt + φ) twice with respect to time:
$$ \frac{d}{dt} \left( x_0 \sin(\omega t + \varphi) \right) = \omega\, x_0 \cos(\omega t + \varphi), \tag{4.5} $$
$$ \frac{d^2}{dt^2} \left( x_0 \sin(\omega t + \varphi) \right) = -\omega^2\, x_0 \sin(\omega t + \varphi) \tag{4.6} $$
$$ = -\omega^2\, x(t). \tag{4.7} $$
So whenever there is a differential equation of the form (4.4), one knows that it describes a harmonic oscillation. Here one may still have to identify the angular frequency ω with the quantities of the system under consideration. We will practice this in the following examples.
Fig. 4.1 Harmonic oscillations with amplitude x0 and period T are typically represented by sine and cosine functions. Depending on the phase φ of the oscillation, the curves are shifted to the left or right
Fig. 4.2 (a) In the case of the spring pendulum, the restoring force is given by Hooke’s law. (b) In the case of the thread pendulum, the restoring force is caused by the component of the earth’s gravitational force perpendicular to the thread
© Springer-Verlag GmbH Germany, part of Springer Nature 2023. S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_4
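That the harmonic oscillation (4.1) solves the differential equation (4.4) can also be checked numerically. The following Python sketch approximates the second time derivative by a central finite difference and evaluates the residual of (4.4). The function names and the parameter values (x0, ω, φ) are arbitrary assumptions for illustration.

```python
import math

X0, OMEGA, PHI = 1.0, 2.0, 0.3  # assumed amplitude, angular frequency, phase


def x(t):
    """Harmonic oscillation x(t) = x0 * sin(omega * t + phi), Eq. (4.1)."""
    return X0 * math.sin(OMEGA * t + PHI)


def second_derivative(f, t, h=1e-4):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


def residual(t):
    """Left side minus right side of (4.4); close to zero for a solution."""
    return second_derivative(x, t) + OMEGA**2 * x(t)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.3, 2.7):
        print(f"t = {t:.1f} s: residual = {residual(t):.2e}")
```

The residuals are zero up to the small error of the finite-difference approximation, independent of the chosen amplitude and phase.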
4.1.1 Examples of Harmonic Oscillations
Here we consider two simple examples from mechanics, namely the spring pendulum and the thread pendulum (Fig. 4.2).
4.1.1.1 Spring Pendulum
The spring pendulum is sketched in Fig. 4.2a. We assume that a mass m can move without friction on a plane and is attached laterally to the wall by a spring with spring constant D. When the mass is deflected from its equilibrium position (relaxed spring), it begins to oscillate about the equilibrium. In the same way, one could also consider a mass hanging vertically from a spring, except that the earth’s gravity would then lead to a changed equilibrium position around which the mass would swing. The chosen example with the horizontal spring is conceptually simpler, since gravity does not play a role, but leads to the same result. We now consider the equation of motion $\ddot{x}(t) = F/m$ (see also Sect. 2.2.6) and insert Hooke’s spring force F(t) = - D x(t) from (2.67):
$$ \ddot{x}(t) = -\frac{D}{m}\, x(t), \tag{4.8} $$
$$ \ddot{x}(t) = -\omega^2\, x(t). \tag{4.9} $$
The second equation is again the DEQ of the harmonic oscillation from (4.4). Obviously both DEQs have the same form if one identifies the angular frequency ω by
$$ \omega^2 = \frac{D}{m}. \tag{4.10} $$
From this the oscillation period T = 2π/ω of the spring pendulum can be derived as
$$ T = 2\pi \sqrt{\frac{m}{D}}. \tag{4.11} $$
The greater the mass, the slower the spring pendulum oscillates, and the greater the spring constant, the faster it oscillates.
Written Test: Spring Pendulum
A spring pendulum with spring constant D and mass m oscillates with period T. How does the period change if the spring constant is doubled (D′ = 2D) and the mass is tripled (m′ = 3m)?
Solution
We put the new values into the formula (4.11) for the period duration and express the result by the old period duration:
$$ T' = 2\pi \sqrt{\frac{m'}{D'}} = 2\pi \sqrt{\frac{3m}{2D}} = \sqrt{\frac{3}{2}} \cdot 2\pi \sqrt{\frac{m}{D}} = \sqrt{\frac{3}{2}} \cdot T. \tag{4.12} $$
The new period duration is therefore greater than the original one by a factor of $\sqrt{3/2} \approx 1.22$.
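The scaling of the period with mass and spring constant can be confirmed with a short Python sketch based on (4.11). The function name and the numerical values of m and D are assumptions for illustration; only their ratios matter for the result.

```python
import math


def spring_period(mass, spring_constant):
    """Period T = 2 * pi * sqrt(m / D) of a spring pendulum, Eq. (4.11)."""
    return 2.0 * math.pi * math.sqrt(mass / spring_constant)


m, d = 0.2, 5.0  # assumed mass in kg and spring constant in N/m

period_old = spring_period(m, d)
period_new = spring_period(3.0 * m, 2.0 * d)  # mass tripled, spring constant doubled
ratio = period_new / period_old

if __name__ == "__main__":
    print(f"T  = {period_old:.3f} s")
    print(f"T' = {period_new:.3f} s")
    print(f"T'/T = {ratio:.3f}")
```

The ratio does not depend on the particular values chosen for m and D.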
4.1.1.2 Thread Pendulum
The thread pendulum is sketched in Fig. 4.2b: A mass m hangs on a thread of length l and oscillates around its equilibrium position. This situation is a little more difficult to describe, because the mass is moving on a circular path with radius l. The position x(t) of the mass on the circular path is given by
$$ x(t) = l\, \alpha(t). \tag{4.13} $$
The restoring force is given by the component of the earth’s gravitational force that points tangentially to the arc, opposite to the deflection, and therefore depends on the angle: F(t) = - mg sin(α(t)). The equation of motion $\ddot{x}(t) = F/m$ is thus
$$ \ddot{x}(t) = -g \sin(\alpha(t)). \tag{4.14} $$
To have the same function on both sides of the equation, we express the position by the angle using (4.13): $\ddot{x}(t) = l\, \ddot{\alpha}(t)$. The DEQ for the function α(t) is thus
$$ \ddot{\alpha}(t) = -\frac{g}{l} \sin(\alpha). \tag{4.15} $$
However, this differential equation is not of the form (4.4) of a harmonic oscillation because of the sine function. The general solution for α(t) is therefore not given by a harmonic oscillation, but by a more complicated function which we cannot derive here. However, we can simplify the DEQ by approximating the sine function for small angles (in rad) by
$$ \sin(\alpha) \approx \alpha. \tag{4.16} $$
This approximation is very good for small angles; for example, the error made by this approximation is just 0.5% for an angle of 10°. The DEQ in the approximation for small angles is
$$ \ddot{\alpha}(t) = -\frac{g}{l}\, \alpha(t) \tag{4.17} $$
and is therefore identical in form to the DEQ of the harmonic oscillation, if the angular frequency ω is identified by
$$ \omega^2 = \frac{g}{l}. \tag{4.18} $$
Accordingly, the period of oscillation of the thread pendulum is
$$ T = 2\pi \sqrt{\frac{l}{g}}. \tag{4.19} $$
The oscillation period therefore increases with the square root of the thread length. It is also independent of the attached mass m.

Exercise: Pendulum Clock

The pendulum of a pendulum clock should be dimensioned in such a way that the pendulum requires exactly 2 s for a full swing. This allows the pendulum to trigger a mechanism that advances the fourth wheel by one unit every time it passes through the lowest point at a rhythm of 1 s. How long must the pendulum be if the acceleration due to gravity is g = 9.81 m/s²?

Solution

We solve formula (4.19) for the length l:

$$l = g \cdot \left(\frac{T}{2\pi}\right)^2 = 9.81\,\frac{\mathrm{m}}{\mathrm{s}^2} \cdot \left(\frac{2\,\mathrm{s}}{2\pi}\right)^2 = 99.4\,\mathrm{cm}. \tag{4.20}$$
The pendulum length must therefore be l = 99.4 cm. For good reason, the pendulums in grandfather clocks are therefore approx. 1 m long.
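The pendulum formulas can be checked numerically. The following is a minimal sketch in Python; the function names are chosen here for illustration and do not come from the book. It evaluates (4.19) and (4.20) and the relative error of the small-angle approximation (4.16).

```python
import math

def pendulum_period(length_m, g=9.81):
    """Small-angle period T = 2*pi*sqrt(l/g) of a thread pendulum, cf. (4.19)."""
    return 2.0 * math.pi * math.sqrt(length_m / g)

def pendulum_length(period_s, g=9.81):
    """Thread length l = g*(T/(2*pi))**2 for a desired period, cf. (4.20)."""
    return g * (period_s / (2.0 * math.pi)) ** 2

def small_angle_error(angle_deg):
    """Relative error of the approximation sin(alpha) ~ alpha, cf. (4.16)."""
    alpha = math.radians(angle_deg)
    return (alpha - math.sin(alpha)) / math.sin(alpha)

length = pendulum_length(2.0)           # pendulum with a period of 2 s
error_10deg = small_angle_error(10.0)   # error of the approximation at 10 degrees
print(f"l = {100 * length:.1f} cm, error at 10 deg = {100 * error_10deg:.2f} %")
```

The two numbers correspond to the 99.4 cm of the exercise and the roughly 0.5% error quoted for an angle of 10°.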
4.1.2 Driven Harmonic Oscillator with Damping
This section, in simple terms, is about the physics of a playground swing. The swing by itself represents (in the approximation of small angles) the harmonic oscillator. In addition, friction forces are considered, for example, air friction or friction within the suspension of the swing, which damp the swinging when there is no external drive. However, a person sitting on the swing can counteract the damping and stimulate the swing by moving their legs in the right rhythm. Surprisingly, the harmonic oscillator, despite having such a simple mechanical interpretation as the swing, is one of the most successful concepts in physics and is even relevant to quantum mechanics. Before introducing the drive, we first describe the influence of damping on the oscillation.
4.1.2.1 Damped Harmonic Oscillator

Let a restoring force F = -D · x act on a mass m, for example, by a spring (Fig. 4.3). Damping is introduced as a viscous friction force (see Sect. 3.3.3), which is proportional to the velocity of the body:

$$F_R = -\beta \cdot v = -\beta \cdot \dot{x}, \qquad [\beta] = \frac{\mathrm{N}}{\mathrm{m/s}}, \tag{4.21}$$

with damping constant β. In the model in Fig. 4.3, the oscillating mass is decelerated by a viscous medium (Newtonian friction).

Fig. 4.3 (a) A damped harmonic oscillator can be imagined in the model as a spring pendulum which is immersed in a viscous medium and thus subjected to a frictional force. (b) Depending on the strength of the damping rate γ, a distinction is made between the oscillatory case (red curve: γ = 0.1ω0), the aperiodic limiting case (black curve, γ = ω0) and the overdamped case (blue curve, γ = 10ω0)

If one puts the total force, which is composed of the restoring force and the friction force, into the equation of motion, one obtains the differential equation
$$\ddot{x}(t) = -\frac{D}{m} \cdot x(t) - \frac{\beta}{m} \cdot \dot{x}(t). \tag{4.22}$$

In order to write this differential equation in a more compact way, we introduce into (4.22) the (already known) angular frequency of the oscillator without damping, $\omega_0 = \sqrt{D/m}$, and the damping rate γ = β/2m with the unit [γ] = 1/s. Thus the DEQ is

$$\ddot{x}(t) = -\omega_0^2 \cdot x(t) - 2\gamma \cdot \dot{x}(t). \tag{4.23}$$
This DEQ is more difficult to solve than that of the oscillator without damping, which is why we only give the solutions here. In fact, there are different solutions depending on the strength of the damping rate γ compared to the frequency ω0. For weak damping, γ < ω0, the oscillator oscillates with an amplitude that decreases exponentially. The larger the damping rate becomes, the faster the oscillations are damped, until the case of critical damping (also called the aperiodic limiting case) is reached at γ = ω0. Now the damping is so strong that the oscillator can no longer oscillate. This is exactly the situation in which a deflection from equilibrium decays in the fastest way. If the damping rate increases even further, so that γ > ω0, this is known as creep. Here, too, the oscillator can no longer oscillate, but the viscous friction is then so strong that a deflection returns to equilibrium only very slowly. The solution is then a combination of two exponential functions, one of which decays very quickly and the other very slowly. We consider here the solutions in the special case where the oscillator is deflected by the amplitude x0 at the time t = 0 and is released from rest, that is, the initial velocity is zero. The solutions are
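$$x(t) = x_0 \cdot \exp(-\gamma t) \cdot \cos(\omega t), \quad \text{for } \gamma < \omega_0, \tag{4.24}$$

$$x(t) = x_0 \cdot \exp(-\gamma t), \quad \text{for } \gamma = \omega_0, \tag{4.25}$$

$$x(t) = \frac{x_0}{2\sqrt{\gamma^2 - \omega_0^2}} \left[\gamma_+ \exp(-\gamma_- t) - \gamma_- \exp(-\gamma_+ t)\right], \quad \text{for } \gamma > \omega_0, \tag{4.26}$$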
where the oscillation frequency in the oscillatory case becomes smaller due to the damping, $\omega = \sqrt{\omega_0^2 - \gamma^2}$, and the damping rates of the exponential functions in the creep case are given by $\gamma_\pm = \gamma \pm \sqrt{\gamma^2 - \omega_0^2}$. The corresponding functions are sketched in Fig. 4.3.
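The three damping regimes can be evaluated with a short Python sketch. The function name and the sample values are assumptions made here for illustration. The sketch uses the standard closed-form solutions of (4.23) for release from rest; compared with the compact forms (4.24) and (4.25), these carry an additional term (γ/ω) · sin(ωt) and a factor (1 + γt), respectively, which make the initial velocity exactly zero.

```python
import math

def damped_release(t, x0, omega0, gamma):
    """Displacement x(t) of a damped oscillator released from rest at x0.

    Solves x'' = -omega0**2 * x - 2*gamma*x' with x(0) = x0 and x'(0) = 0.
    """
    if math.isclose(gamma, omega0):
        # aperiodic limiting case (critical damping)
        return x0 * (1.0 + gamma * t) * math.exp(-gamma * t)
    if gamma < omega0:
        # oscillatory case with reduced frequency omega
        omega = math.sqrt(omega0**2 - gamma**2)
        return x0 * math.exp(-gamma * t) * (
            math.cos(omega * t) + gamma / omega * math.sin(omega * t)
        )
    # creep case: one fast and one slow exponential, cf. (4.26)
    root = math.sqrt(gamma**2 - omega0**2)
    g_plus, g_minus = gamma + root, gamma - root
    return x0 / (2.0 * root) * (
        g_plus * math.exp(-g_minus * t) - g_minus * math.exp(-g_plus * t)
    )

# Damping rates of Fig. 4.3 in units of omega0 (omega0 = 1 is an assumed scale)
for gamma in (0.1, 1.0, 10.0):
    print(gamma, [round(damped_release(t, 1.0, 1.0, gamma), 3) for t in (0.0, 2.0, 5.0)])
```

For weak damping the additional sine term is small, so the curve is close to (4.24); in the creep case the function coincides with (4.26).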
4.1.2.2 Driven Harmonic Oscillator

A harmonic oscillator is said to be driven if, in addition to the restoring force and a possible damping force, an external force acts which is itself periodic. In the picture of the swing, one exerts a periodic force on the swing by rhythmically swinging the legs back and forth. The differential Eq. (4.23) thus receives one more term:

$$\ddot{x}(t) = -\omega_0^2 \cdot x(t) - 2\gamma \cdot \dot{x}(t) + \frac{F_0}{m} \cdot \sin(\omega_{\mathrm{ext}} t). \tag{4.27}$$

Here we assume that the driving force Fext(t) = F0 · sin(ωext t) is also harmonic, that is, oscillates with a sinusoidal function with excitation frequency ωext and amplitude F0. The force could just as well be assumed to be a cosine function. The DEQ leads to an overall dynamic of the oscillator, which consists of a transient and a subsequent steady-state solution. The transient depends on the start parameters, that is, the start displacement and the start velocity. For example, it takes a little time to bring the swing from rest to the desired amplitude of oscillation by moving the legs. We will not discuss this part of the solution further here; instead, we are interested in the stationary solution xstat(t). Stationary in this case means that the oscillator oscillates with constant amplitude x0 and phase φ in time:

$$x_{\mathrm{stat}}(t) = x_0 \cdot \sin(\omega_{\mathrm{ext}} t + \varphi). \tag{4.28}$$
The oscillator oscillates with the external excitation frequency ωext and not with its own oscillation frequency ω. The phase φ describes the phase shift between the oscillation xstat(t) and the external force Fext(t). Although amplitude and phase are constant in time, they depend on the parameters of the excitation and the oscillator, and in particular on the excitation frequency ωext:

$$x_0(\omega_{\mathrm{ext}}) = \frac{F_0/m}{\sqrt{\left(\omega_0^2 - \omega_{\mathrm{ext}}^2\right)^2 + (2\gamma\omega_{\mathrm{ext}})^2}}, \tag{4.29}$$

$$\varphi(\omega_{\mathrm{ext}}) = \arctan\left(\frac{2\gamma\omega_{\mathrm{ext}}}{\omega_0^2 - \omega_{\mathrm{ext}}^2}\right). \tag{4.30}$$

Fig. 4.4 (a) Amplitude of oscillation x0 of the driven harmonic oscillator in units of the static deflection $x_0^{\mathrm{stat}}$ as a function of the excitation frequency ωext in units of the resonance frequency ω0 of the undamped oscillator. The curves correspond from top to bottom to increasing damping rates of γ = 0.1ω0, 0.2ω0, 0.3ω0, 0.5ω0, 1.0ω0 and γ = 3ω0. (b) Phase difference φ between the oscillator vibration and the exciting force. The curves correspond, from steep to flat, to increasing damping rates of γ = 0.01ω0, 0.1ω0, 0.5ω0 and γ = 5ω0
The corresponding curves are shown in Fig. 4.4 for different damping values. We first consider the case of small damping with γ ≪ ω0, where the amplitude curve x0(ωext) has a pronounced maximum at the resonant frequency ω0. If the oscillator is excited resonantly at the frequency ωext = ω0, the amplitude of oscillation is given by

$$x_0^{\mathrm{res}} = \frac{F_0}{2m\gamma\omega_0}, \tag{4.31}$$

whereby very large values can be achieved with small damping γ → 0. The phase shift at resonance is exactly φ = 90°. If one excites more slowly, both the oscillation amplitude and the phase shift decrease. This means that with slow excitation the oscillator excursion can follow the driving force, and in the limit ωext → 0 excitation and excursion are in phase, that is, φ → 0°. In this limit, the oscillator is deflected by the constant force Fext(t) = F0 by a fixed value, the static deflection $x_0^{\mathrm{stat}}$:

$$x_0^{\mathrm{stat}} = \frac{F_0}{m\omega_0^2} = \frac{F_0}{D}. \tag{4.32}$$
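The resonance curves can be reproduced with a few lines of Python. This is a minimal sketch with assumed function names and sample parameters; it evaluates (4.29) and (4.30). For the phase, `math.atan2` is used instead of a plain arctangent so that the result stays on the branch between 0° and 180° described in the text.

```python
import math

def driven_amplitude(omega_ext, f0, m, omega0, gamma):
    """Stationary amplitude x0(omega_ext) of the driven oscillator, cf. (4.29)."""
    return (f0 / m) / math.sqrt(
        (omega0**2 - omega_ext**2) ** 2 + (2.0 * gamma * omega_ext) ** 2
    )

def driven_phase(omega_ext, omega0, gamma):
    """Phase shift from (4.30) in radians, taken on the branch 0..pi."""
    return math.atan2(2.0 * gamma * omega_ext, omega0**2 - omega_ext**2)

# Assumed sample values: 1 kg mass, 1 N drive amplitude, weak damping
f0, m, omega0 = 1.0, 1.0, 2.0 * math.pi
gamma = 0.1 * omega0
for ratio in (0.0, 0.5, 1.0, 2.0):
    amp = driven_amplitude(ratio * omega0, f0, m, omega0, gamma)
    phase = math.degrees(driven_phase(ratio * omega0, omega0, gamma))
    print(f"omega_ext = {ratio:.1f} omega0: x0 = {amp:.4f} m, phi = {phase:.1f} deg")
```

At ωext = 0 the amplitude equals the static deflection (4.32), and at ωext = ω0 it equals the resonance value (4.31) with a phase of 90°.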
For excitation frequencies greater than the resonant frequency, the oscillator cannot keep up with the rapid change in force. As a result, the amplitude becomes smaller
and smaller for increasing excitation frequency, x0(ωext) → 0, and the oscillation gets into antiphase with the excitation, φ(ωext) → 180°. As the damping rate γ increases, the maximum of the resonance curves x0(ωext) shifts to lower frequencies and becomes wider and flatter until no resonance can be detected. The phase curves φ(ωext) also become wider, but the point at which the phase shift is φ = 90° always remains at the oscillator frequency ω0.

If the damping in an oscillating system is too small, the amplitude of the oscillation under resonant excitation, which is inversely proportional to the damping with $x_0^{\mathrm{res}} \propto 1/\gamma$, can under certain circumstances become so large that the whole system is destroyed by it. This is then referred to as a resonance catastrophe. A striking example of this is the collapse of the Tacoma Narrows Bridge, a suspension bridge that was caused to vibrate by the wind. You can find videos of the collapse of this bridge on YouTube, for example at https://www.youtube.com/watch?v=XggxeuFDaDU. The Millennium Bridge in London experienced similarly threatening vibrations when a large number of people were walking on the bridge shortly after it opened. The problem was then solved by installing stronger dampers.

Exercise: Oscillation of an Inflatable Boat

A loaded inflatable boat with total mass m floats on the water. Its cross-sectional area A is immersed in the water to the depth h0. In equilibrium (floating, Sect. 3.2.2), the gravitational force mg on the boat and the buoyant force ρAh0g, with water density ρ, compensate each other.

1. With which period do the persons in the boat have to bounce so that the boat is excited to a resonant oscillation on the water? The oscillating variable here is the change in immersion depth x = h - h0. Derive a formula and initially neglect damping effects.
2. Insert numbers: ρ = 1000 kg/m³, A = 5 m², g = 10 m/s² and m = 300 kg.
3. With which time constant τ = 1/γ is the oscillation damped if one assumes Stokes' friction FR = -6πηRv from Sect. 3.3.3.1 "Application: Stokes' friction law", with the viscosity η = 10⁻³ Pa · s of water and a model radius $R = \sqrt{A/\pi} \approx 1.26$ m?

Solution

When the boat is immersed in the water to the depth h = h0 + x, the total force as the sum of the gravitational force of the earth and the buoyant force is

$$F = mg - \rho A h g = mg - \rho A h_0 g - \rho A x g = -\rho A x g. \tag{4.33}$$
In this equation we have used the equilibrium condition (see above) mg - ρAh0g = 0. We put this force into the equation of motion for x:

$$m\ddot{x} = -\rho A g x, \tag{4.34}$$

respectively

$$\ddot{x} = -\frac{\rho A g}{m}\, x. \tag{4.35}$$

This differential equation is identical in form to that of the harmonic oscillation from (4.4), if one identifies the angular frequency with

$$\omega = \sqrt{\frac{\rho A g}{m}}. \tag{4.36}$$

The period duration is therefore

$$T = \frac{2\pi}{\omega} = 2\pi \sqrt{\frac{m}{\rho A g}} = 2\pi \sqrt{\frac{300\,\mathrm{kg}}{1000\,\mathrm{kg/m^3} \cdot 5\,\mathrm{m^2} \cdot 10\,\mathrm{m/s^2}}} = 0.49\,\mathrm{s}. \tag{4.37}$$
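From Stokes' friction law, the damping constant can be determined by comparison with (4.21):

$$\beta = 6\pi\eta R, \tag{4.38}$$

and, with the definition γ = β/2m, the damping rate is

$$\gamma = \frac{6\pi\eta R}{2m}. \tag{4.39}$$

From this the damping time can be deduced, using the model radius $R = \sqrt{A/\pi} \approx 1.26$ m:

$$\tau = \frac{1}{\gamma} = \frac{2m}{6\pi\eta R} = \frac{2 \cdot 300\,\mathrm{kg}}{6\pi \cdot 10^{-3}\,\mathrm{Pa \cdot s} \cdot 1.26\,\mathrm{m}} \approx 2.5 \cdot 10^{4}\,\mathrm{s}. \tag{4.40}$$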
Apparently, the vibration of the inflatable boat on the water is only weakly damped.
4.2 Harmonic Wave
In the case of oscillations, we have considered the change in time of a physical quantity in the form x(t). This is the change in its magnitude at a specific point in space. A wave, on the other hand, is an object extended in space (see Fig. 4.5), which changes as a function of time and place.

Fig. 4.5 A harmonic wave with wavelength λ propagates in the direction of $\vec{k}$. At any fixed time t1 < t2 < t3 < t4 < t5 = t1 + T the wave has the form of a sinusoidal function in space. Here, the places of equal phase (illustrated by the circles on the wave crests) move in the direction of $\vec{k}$. After a period of oscillation T the wave has moved on by exactly one wavelength

The harmonic wave has the form

$$\vec{x}(\vec{r}, t) = \vec{x}_0 \cdot \sin\left(\vec{k} \cdot \vec{r} - \omega t\right), \tag{4.41}$$
with the angular frequency (defined identically to the oscillation) ω = 2π/T and the wave vector $\vec{k}$, which points in the direction of propagation of the wave and whose magnitude (the wavenumber k) is given by the wavelength λ:

$$k = |\vec{k}| = \frac{2\pi}{\lambda}, \qquad [k] = \frac{1}{\mathrm{m}}. \tag{4.42}$$

The points of equal phase $\varphi = \vec{k} \cdot \vec{r} - \omega t$ are planes in space; for this reason, this wave is also called a plane wave. The wavelength here is the shortest distance in space at which the wave has the same phase at a fixed point in time, for example, the distance between two adjacent wave crests in Fig. 4.5. The amplitude $\vec{x}_0$ is set as a vectorial quantity in (4.41), since the direction in which the quantity $\vec{x}$ oscillates can generally be oriented arbitrarily in space. A distinction is made here between transverse waves, in which the direction of $\vec{x}$ is perpendicular to the direction of propagation $\vec{k}$, and longitudinal waves, in which $\vec{x}$ and $\vec{k}$ are parallel to each other. Examples of transverse waves are rope waves or light waves. Longitudinal waves are, for example, sound waves in gases. The wavenumber and the angular frequency are linked to each other by the propagation speed c of the wave:

$$c = \frac{\lambda}{T} = \frac{\omega}{k}, \qquad [c] = \frac{\mathrm{m}}{\mathrm{s}}. \tag{4.43}$$
The relationship c = λ/T means here that the wave moves on by exactly one wavelength λ for each period T. Well known propagation speeds are those of sound waves in air and those of light waves in vacuum:
$$\text{Sound wave:} \quad c \approx 340\,\frac{\mathrm{m}}{\mathrm{s}}, \tag{4.44}$$

$$\text{Light wave:} \quad c = 299\,792\,458\,\frac{\mathrm{m}}{\mathrm{s}}. \tag{4.45}$$
The exact value of the speed of sound depends on environmental influences such as air pressure, air temperature and humidity. The value of the speed of light is a constant of nature.
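The relations between wavelength, wavenumber, frequency and propagation speed can be illustrated with a small Python sketch. The function names and the example tone of 440 Hz are assumptions made here; the speeds are the values from (4.44) and (4.45).

```python
import math

def wavelength(c, frequency):
    """Wavelength lambda = c / f, which follows from c = lambda / T, cf. (4.43)."""
    return c / frequency

def wavenumber(lam):
    """Wavenumber k = 2*pi / lambda, cf. (4.42)."""
    return 2.0 * math.pi / lam

def angular_frequency(frequency):
    """Angular frequency omega = 2*pi*f = 2*pi / T."""
    return 2.0 * math.pi * frequency

c_sound = 340.0            # m/s, approximate value from (4.44)
c_light = 299_792_458.0    # m/s, value from (4.45)
f_tone = 440.0             # Hz, assumed example frequency

lam = wavelength(c_sound, f_tone)
k = wavenumber(lam)
omega = angular_frequency(f_tone)
print(f"lambda = {lam:.3f} m, k = {k:.2f} 1/m, omega/k = {omega / k:.1f} m/s")
```

The ratio ω/k returns the propagation speed that was put in, as required by (4.43).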
4.2.1 Interference
Interference is the superposition of two or more waves. Think, for example, of a lake into which several stones are thrown at the same time, each producing concentric waves on the surface of the water. When the rings of waves meet, there are places where the amplitudes of the individual waves reinforce each other, namely where the crest of the first wave meets a crest of the second. At other places, however, where wave crest meets wave trough, the two waves cancel each other out. This results in a characteristic pattern called the interference pattern (Fig. 4.6a). Mathematically, the superposition principle can be applied, that is, the individual waves can be added vectorially. We will show this here in the example of two plane waves $\vec{x}_{1,2}(\vec{r}, t)$ with identical angular frequency ω and wave vectors $\vec{k}_{1,2}$ which, for simplicity, have the same amplitude $\vec{x}_0$:

$$\vec{x}(\vec{r}, t) = \vec{x}_0 \cdot \sin\left(\vec{k}_1 \cdot \vec{r} - \omega t\right) + \vec{x}_0 \cdot \sin\left(\vec{k}_2 \cdot \vec{r} - \omega t\right). \tag{4.46}$$
Fig. 4.6 (a) Interference pattern of two concentric waves, for example, caused by two stones thrown into the water. (b) Two plane waves with wave vectors $\vec{k}_1$ and $\vec{k}_2$ interfere with each other and produce a characteristic interference pattern

With the trigonometric formula

$$\sin(\alpha) + \sin(\beta) = 2 \cos\left(\frac{\alpha - \beta}{2}\right) \sin\left(\frac{\alpha + \beta}{2}\right) \tag{4.47}$$

the total wave can be expressed by

$$\vec{x}(\vec{r}, t) = \underbrace{2\vec{x}_0 \cos\left(\tfrac{1}{2}\left(\vec{k}_1 - \vec{k}_2\right) \cdot \vec{r}\right)}_{\text{position-dependent amplitude}} \cdot \underbrace{\sin\left(\tfrac{1}{2}\left(\vec{k}_1 + \vec{k}_2\right) \cdot \vec{r} - \omega t\right)}_{\text{new wave}}. \tag{4.48}$$
The second part of the product (the sine function) is again a wave that propagates in the direction of the sum of the wave vectors, $\tfrac{1}{2}(\vec{k}_1 + \vec{k}_2)$. The first part of the product is the amplitude of the wave, which now depends on the location with a cosine function. Places where the wave oscillates at full amplitude alternate in the direction of the difference of the wave vectors, $\tfrac{1}{2}(\vec{k}_1 - \vec{k}_2)$, with locations where the amplitude is zero. This leads to parallel tubes in which the waves travel (Fig. 4.6b).
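The equivalence of the direct sum (4.46) and the product form (4.48) can be verified numerically. The following Python sketch uses scalar amplitudes and two-dimensional wave vectors; the function names and the sample wave vectors are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    """Scalar product of two vectors given as sequences of numbers."""
    return sum(ai * bi for ai, bi in zip(a, b))

def superposition(k1, k2, r, t, omega, x0=1.0):
    """Direct sum of the two plane waves, cf. (4.46)."""
    return (x0 * math.sin(dot(k1, r) - omega * t)
            + x0 * math.sin(dot(k2, r) - omega * t))

def product_form(k1, k2, r, t, omega, x0=1.0):
    """Position-dependent amplitude times new wave, cf. (4.48)."""
    k_diff = [0.5 * (a - b) for a, b in zip(k1, k2)]
    k_sum = [0.5 * (a + b) for a, b in zip(k1, k2)]
    return 2.0 * x0 * math.cos(dot(k_diff, r)) * math.sin(dot(k_sum, r) - omega * t)

# Two plane waves crossing at a right angle (assumed example, units 1/m)
k1, k2 = (1.0, 1.0), (1.0, -1.0)
r, t, omega = (0.3, 0.7), 0.4, 2.0
print(superposition(k1, k2, r, t, omega), product_form(k1, k2, r, t, omega))
```

For these wave vectors the difference vector points along the y-axis, so the amplitude vanishes on the lines y = π/2, 3π/2, ... regardless of time, which corresponds to the dark stripes between the tubes.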
4.2.1.1 Standing Waves

A special case of interference is the so-called standing wave, which always occurs when two waves of the same frequency with opposite propagation directions are superimposed, that is, when $\vec{k}_2 = -\vec{k}_1$ (see Fig. 4.7a). In this case, the sum of the two wave vectors is $\vec{k}_1 + \vec{k}_2 = 0$, and the difference is given by $\vec{k}_1 - \vec{k}_2 = 2\vec{k}_1$. The total wave according to (4.48) is then

$$\vec{x}(\vec{r}, t) = \underbrace{2\vec{x}_0 \cos\left(\vec{k}_1 \cdot \vec{r}\right)}_{A(\vec{r})} \cdot \underbrace{\sin(-\omega t)}_{f(t)}. \tag{4.49}$$

This is the product of a location-dependent amplitude $A(\vec{r})$ and an oscillation f(t).

Fig. 4.7 (a) In a standing wave, places where no oscillation takes place (nodes) alternate with those where the oscillation takes place with maximum amplitude (anti-nodes). Nodes and anti-nodes always remain at the same place; the wave does not propagate in space. The curves show the standing wave at different times. (b) Standing rope waves with rope tensioned at both ends have nodes at both ends. Due to this boundary condition, only certain modes, which have an integer number of n anti-nodes, can exist. In addition to the fundamental mode with n = 1, the first and second harmonics with n = 2 and n = 3 are shown in the sketch
Locations where the amplitude is zero are called nodes of the standing wave. At nodes, the quantity $\vec{x}(t)$ is zero at all times. Places where the amplitude is maximum, on the other hand, are called anti-nodes of the standing wave. At the anti-nodes, the quantity $\vec{x}(t)$ oscillates with amplitude $2\vec{x}_0$. The anti-nodes and nodes remain at the same places for all times, so that the wave no longer propagates in space. Standing waves occur, for example, when a wave is reflected back into itself, as in the case of a rope wave reflected from the end of the rope. In Fig. 4.7b standing rope waves are sketched where the rope is firmly clamped at both ends. This means that the rope cannot move at its ends, that is, nodes of the standing wave must be located there. These boundary conditions define waves with very specific wavelengths and frequencies, which are called modes. The standing wave with the largest possible wavelength has no nodes other than the two nodes at the edge; it has exactly one anti-node (n = 1) between the edges and is called the fundamental mode. Modes with smaller wavelengths are labeled with n according to the number of anti-nodes and are called harmonics of the fundamental mode. The distance between adjacent nodes is given by half the wavelength. Therefore, the wavelength λn of the mode n is given by:

$$\lambda_n = \frac{2L}{n}, \tag{4.50}$$
with the rope length L. The corresponding frequency of the mode is given with f = c/λ by

$$f_n = \frac{c}{2L} \cdot n = f_1 \cdot n, \tag{4.51}$$
where we have defined the frequency of the fundamental mode as f1 = c/2L. The frequencies of the harmonics are therefore multiples of the fundamental frequency.

Exercise: Violin

The length of the vibrating part of the strings of a violin is L = 32.5 cm. What must be the speed of propagation of the rope wave on the string so that the fundamental vibration of the string produces the concert pitch a with a frequency of f = 440 Hz?

Solution

We solve the formula (4.51) with n = 1 for the speed of propagation of the wave:

$$c = 2L \cdot f = 0.65\,\mathrm{m} \cdot 440\,\mathrm{s}^{-1} = 286\,\mathrm{m/s}. \tag{4.52}$$
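The mode structure of a string clamped at both ends can be tabulated with a short Python sketch. The function names are assumptions; the numbers are those of the violin exercise with a fundamental frequency of 440 Hz.

```python
def mode_wavelength(n, length):
    """Wavelength of mode n on a string of given length, cf. (4.50)."""
    return 2.0 * length / n

def mode_frequency(n, length, c):
    """Frequency of mode n for wave speed c, cf. (4.51)."""
    return n * c / (2.0 * length)

def wave_speed_for_fundamental(f1, length):
    """Wave speed needed for a given fundamental frequency, cf. (4.52)."""
    return 2.0 * length * f1

string_length = 0.325                                   # m, violin string
c_string = wave_speed_for_fundamental(440.0, string_length)
for n in (1, 2, 3):
    lam = mode_wavelength(n, string_length)
    f_n = mode_frequency(n, string_length, c_string)
    print(f"n = {n}: lambda = {lam:.3f} m, f = {f_n:.0f} Hz")
```

The harmonics come out as integer multiples of the fundamental frequency, as stated in (4.51).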
4.2.1.2 Beats

So far we have considered the case where the two superimposed waves have the same frequency. We have seen that in this case the wave has a location-dependent amplitude, for example, anti-nodes and nodes in the standing wave. When the frequencies are different, this location-dependent amplitude additionally moves in space. Where just a moment ago there was an anti-node, a short time later a node will appear and vice versa. At any fixed location in space, this results in a periodic modulation of the oscillation amplitude. To demonstrate this mathematically, we superimpose two oscillations x1(t) = x0 sin(ω1t) and x2(t) = x0 sin(ω2t) with the same amplitude but different frequencies:

$$x(t) = x_1(t) + x_2(t) = x_0 \left(\sin(\omega_1 t) + \sin(\omega_2 t)\right) \tag{4.53}$$

$$= 2x_0 \cdot \underbrace{\sin\left(\frac{\omega_1 + \omega_2}{2} \cdot t\right)}_{\text{mean frequency}} \cdot \underbrace{\cos\left(\frac{\omega_1 - \omega_2}{2} \cdot t\right)}_{\text{difference frequency}}. \tag{4.54}$$
The conversion is again done with the trigonometric formula (4.47). The oscillation is therefore the product of two oscillations with the mean frequency and with half the difference frequency. We now speak of a beat if the two frequencies are similar, that is, if ω1 ≈ ω2, so that the difference frequency is much smaller than the mean frequency. In this case, one measures an oscillation with the mean frequency, whose oscillation amplitude slowly oscillates with the period duration

$$T_{\mathrm{beat}} = \frac{2\pi}{|\omega_1 - \omega_2|} = \frac{1}{|f_1 - f_2|}. \tag{4.55}$$
Fig. 4.8 (a) If waves with similar frequencies are superimposed, a beat is produced, that is, the listener hears a tone that periodically becomes louder and softer. In (b) the beating of a C with a C sharp is shown; this results in a beating period of 0.11 s

Fig. 4.9 Sound waves are waves of pressure and density. Here the gas molecules move back and forth in the direction of propagation

Example: Beat in a Duet

Claus and Christiane sing a duet. Unfortunately, the two notes are a semitone apart, that is, Claus sings a C sharp with the frequency f1 = 139 Hz, whereas Christiane sings a C with a frequency of f2 = 130 Hz (Fig. 4.8). This causes a beat for the listener, that is, the note becomes periodically louder and softer, although both sing at the same volume. How long is the period of the beat?

Solution

According to (4.55), the period is given by

$$T_{\mathrm{beat}} = \frac{1}{|139\,\mathrm{Hz} - 130\,\mathrm{Hz}|} = 0.11\,\mathrm{s}. \tag{4.56}$$

So the sound gets louder and softer about nine times per second.
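The beat can be reproduced numerically. The following Python sketch, with assumed function names, compares the direct sum of the two tones (4.53) with the product form (4.54) and evaluates the beat period (4.55) for the duet.

```python
import math

def beat_period(f1, f2):
    """Period of the amplitude modulation, cf. (4.55)."""
    return 1.0 / abs(f1 - f2)

def two_tones(t, f1, f2, x0=1.0):
    """Direct sum of two oscillations with equal amplitude, cf. (4.53)."""
    return x0 * (math.sin(2.0 * math.pi * f1 * t) + math.sin(2.0 * math.pi * f2 * t))

def product_of_tones(t, f1, f2, x0=1.0):
    """Fast oscillation at the mean frequency times slow envelope, cf. (4.54)."""
    f_mean = 0.5 * (f1 + f2)
    f_half_diff = 0.5 * (f1 - f2)
    return (2.0 * x0 * math.sin(2.0 * math.pi * f_mean * t)
            * math.cos(2.0 * math.pi * f_half_diff * t))

f_claus, f_christiane = 139.0, 130.0    # Hz, values from the example
print(f"T_beat = {beat_period(f_claus, f_christiane):.2f} s")
print(two_tones(0.05, f_claus, f_christiane), product_of_tones(0.05, f_claus, f_christiane))
```

The envelope cos(2π · 4.5 Hz · t) has a period of 0.22 s, but the loudness depends on its magnitude, which repeats after 0.11 s; this is why (4.55) contains the full difference frequency.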
4.2.2 Sound Waves
In this section we will deal with sound waves. These are oscillations of the pressure p and density ρ in a medium (see Fig. 4.9). Usually we assume a gas, for example air, but sound waves can also propagate in a liquid or a solid. The pressure and density oscillations are caused by the molecules in the gas oscillating back and forth and thus compressing in some places and thinning in others. Due to the compressibility of the medium, there is a restoring force which causes the oscillation. In gases and liquids, sound waves are always longitudinal waves, that is, the direction of vibration of the
molecules is parallel to the direction of propagation of the wave. The motion of a molecule at the mean position x is described by the position function

$$s(x, t) = x + s_0 \cdot \sin(kx - \omega t). \tag{4.57}$$

The molecule oscillates with the amplitude s0 around its mean position x. The pressure and density waves are given accordingly by

$$p(x, t) = p_m + p_0 \cdot \sin(kx - \omega t), \tag{4.58}$$

$$\rho(x, t) = \rho_m + \rho_0 \cdot \sin(kx - \omega t). \tag{4.59}$$

Pressure and density thus oscillate with amplitudes p0 and ρ0 around their respective mean values pm and ρm. In air, for example, the mean values are pm ≈ 1 bar and ρm ≈ 1.2 kg/m³. The wave propagates with the speed of sound cS. Analogous to (4.43), the general relationship cS = λ/T = ω/k applies. In the case of sound waves, the propagation velocity depends on the properties of the medium, in particular on the mean density ρm and the compressibility κ:

$$c_S = \frac{1}{\sqrt{\kappa \cdot \rho_m}}. \tag{4.60}$$

At a temperature of T = 20 °C, the speed of sound in air is, for example, cS = 343 m/s. The speed of sound must be distinguished from the speed with which the individual molecules of the medium oscillate back and forth. The velocity function of the molecules can be determined from the position function (4.57):

$$v(x, t) = \frac{\mathrm{d}s(x, t)}{\mathrm{d}t} = -s_0 \cdot \omega \cdot \cos(kx - \omega t). \tag{4.61}$$

The molecules thus reach a maximum speed of

$$u_0 = s_0 \cdot \omega, \tag{4.62}$$

which is referred to as the sound particle velocity u0. The pressure amplitude is directly proportional to the sound particle velocity:

$$p_0 = \rho \cdot c_S \cdot u_0. \tag{4.63}$$

We can now calculate how much energy is transported in a sound wave. In general, for waves of all types, the intensity I is defined as the amount of energy E that flows through the area A in the time t, divided by t and A:

$$I = \frac{E}{A \cdot t}, \qquad [I] = \frac{\mathrm{J}}{\mathrm{m}^2 \cdot \mathrm{s}} = \frac{\mathrm{W}}{\mathrm{m}^2}. \tag{4.64}$$
For sound waves, the sound intensity is quadratically related to the pressure amplitude:

$$I = \frac{1}{2} \cdot \frac{p_0^2}{\rho \cdot c_S} = \frac{1}{2} \cdot \rho \cdot c_S \cdot u_0^2. \tag{4.65}$$

Exercise: Sound Intensity

For a normal table conversation, the pressure amplitude is p0 ≈ 5 · 10⁻³ Pa. With what maximum speed do the air molecules move, and what is the sound intensity?

Solution

We calculate the sound particle velocity using (4.63):

$$u_0 = \frac{p_0}{\rho \cdot c_S} = \frac{5 \cdot 10^{-3}\,\mathrm{Pa}}{1.2\,\mathrm{kg/m^3} \cdot 343\,\mathrm{m/s}} = 12 \cdot 10^{-6}\,\mathrm{m/s}. \tag{4.66}$$

The sound particle velocity is therefore extremely small compared to the speed of sound. We calculate the intensity of the sound wave using (4.65):

$$I = \frac{1}{2} \cdot \frac{p_0^2}{\rho \cdot c_S} = \frac{1}{2} \cdot \frac{\left(5 \cdot 10^{-3}\,\mathrm{Pa}\right)^2}{1.2\,\mathrm{kg/m^3} \cdot 343\,\mathrm{m/s}} = 3 \cdot 10^{-8}\,\frac{\mathrm{W}}{\mathrm{m}^2}. \tag{4.67}$$
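The two results of the exercise can be recomputed with a short Python sketch. The function names are assumptions; the input values are those given in the text for air at 20 °C.

```python
def particle_velocity(p0, rho, c_s):
    """Sound particle velocity u0 = p0 / (rho * c_s), cf. (4.63)."""
    return p0 / (rho * c_s)

def sound_intensity(p0, rho, c_s):
    """Sound intensity I = p0**2 / (2 * rho * c_s), cf. (4.65)."""
    return 0.5 * p0**2 / (rho * c_s)

p0 = 5e-3      # Pa, pressure amplitude of a table conversation
rho = 1.2      # kg/m^3, mean density of air
c_s = 343.0    # m/s, speed of sound at 20 degrees Celsius

u0 = particle_velocity(p0, rho, c_s)
intensity = sound_intensity(p0, rho, c_s)
print(f"u0 = {u0:.2e} m/s, I = {intensity:.2e} W/m^2")
```

Both forms of (4.65) give the same intensity, since 0.5 · ρ · cS · u0² with u0 from (4.63) reduces to p0²/(2ρcS).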
4.2.2.1 Sound Perception

In the following we will deal with how the human ear perceives sound and which physical quantities have been introduced for this purpose. First of all, it should be noted that our ear can only perceive sound in the frequency range between about 16 Hz and 20 kHz. The sensitivity of the ear decreases drastically at the edges of the range. Sound with frequencies below 16 Hz is called infrasound, sound with frequencies above 20 kHz is called ultrasound. Because it was evolutionarily advantageous to be able to hear very soft sounds as well as endure very loud sounds, human hearing covers a very wide range of sound intensities. This is made possible by the fact that sound perception is logarithmically dependent on intensity. This logarithmic dependency has been emulated in the definition of loudness L. It is given by

$$L = 10\,\mathrm{dB} \cdot \log_{10}\left(\frac{I}{I_0}\right), \qquad [L] = \mathrm{dB}\ (\text{decibel}). \tag{4.68}$$
Here, the intensity I0 is the intensity that the human ear can just perceive at a frequency of f = 1 kHz, the so-called hearing threshold. It amounts to

$$I_0 = 10^{-12}\,\frac{\mathrm{W}}{\mathrm{m}^2}. \tag{4.69}$$

A barely perceptible sound with a frequency of 1 kHz therefore has a volume of L = 0 dB. Conversations take place at volumes of approx. 50 dB. The upper limit of hearing begins at the pain threshold of about 130 dB. This corresponds approximately to the volume of a jet taking off at a distance of 100 m. The corresponding sound intensity is obtained by solving Eq. (4.68):

$$I = I_0 \cdot 10^{\frac{L}{10\,\mathrm{dB}}} = 10^{-12}\,\frac{\mathrm{W}}{\mathrm{m}^2} \cdot 10^{\frac{130\,\mathrm{dB}}{10\,\mathrm{dB}}} = 10\,\frac{\mathrm{W}}{\mathrm{m}^2}. \tag{4.70}$$
Human hearing thus functions in an intensity range of 13 orders of magnitude, from 10⁻¹² W/m² to 10 W/m².
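The conversion between intensity and loudness in decibels can be sketched in Python as follows. The function names are assumptions; the reference intensity is the hearing threshold from (4.69).

```python
import math

I0 = 1e-12  # W/m^2, hearing threshold at 1 kHz, cf. (4.69)

def sound_level_db(intensity, i_ref=I0):
    """Loudness L = 10 dB * log10(I / I0), cf. (4.68)."""
    return 10.0 * math.log10(intensity / i_ref)

def intensity_from_level(level_db, i_ref=I0):
    """Inverse relation I = I0 * 10**(L / 10 dB), cf. (4.70)."""
    return i_ref * 10.0 ** (level_db / 10.0)

pain_threshold = intensity_from_level(130.0)   # W/m^2
conversation = sound_level_db(3e-8)            # dB, intensity from (4.67)
print(f"I(130 dB) = {pain_threshold:.1f} W/m^2, L(3e-8 W/m^2) = {conversation:.1f} dB")
```

The intensity of 3 · 10⁻⁸ W/m² from the table conversation exercise corresponds to about 45 dB, which is of the same order as the approx. 50 dB quoted for conversations.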
4.2.3 Doppler Effect and Supersonic Speed
In the last section, we will look at a physical phenomenon that you have certainly all experienced: When an ambulance with a siren is coming towards you, you hear a higher pitch than when the car is moving away from you. This is due to the Doppler effect, which, by the way, applies not only to sound waves but to all waves propagating in a medium. A special case that we cannot deal with here is the relativistic Doppler effect, which occurs with electromagnetic waves (such as light). We consider here a source moving with velocity vQ, which emits a wave with frequency f. The speed is measured relative to the (stationary) medium in which the wave propagates. The wave is then recorded by a receiver, which can also move relative to the medium, at the speed vE. We now ask what frequency f′ the receiver is measuring. The general case where the source and receiver are both moving can be traced back to the combination of the two cases where (1) only the source is moving (vE = 0) and (2) only the receiver is moving (vQ = 0) (see Fig. 4.10).

In case (1), the source emits a wave of frequency f and moves behind the wave at vQ > 0. Thus, after one period of oscillation T = 1/f, the source has moved on by the distance d = vQ · T. Since the wave has just moved on by one wavelength in one period of oscillation, the wavelength is shortened by the distance d compared to the source at rest, λ′ = λ - d. For the corresponding frequency in the rest frame of the medium, which is measured by a receiver at rest, f′ = c/λ′ gives the expression

$$f' = \frac{f}{1 - \frac{v_Q}{c}}. \tag{4.71}$$
For positive vQ, the source approaches the receiver, and for vQ < 0, the source moves away from the receiver. In case (2) the wave propagates with its normal wavelength λ in the medium. However, if the receiver now moves against the direction of propagation of the wave, it takes less time T′ < T to scan one wavelength of the wave. This can be thought of as driving a motorboat over an oncoming water wave. The faster you go, the shorter the intervals between the rise and fall of the wave. In this case, the measured frequency is given by

$$f' = f \cdot \left(1 + \frac{v_E}{c}\right). \tag{4.72}$$

Fig. 4.10 Doppler effect: (a) The source follows the emitted sound wave. This effectively makes the wavelength shorter. (b) The receiver moves against the sound wave. This effectively shortens the measured period
Again, vE > 0 corresponds to the situation where the receiver is moving towards the source, and vE < 0 corresponds to the situation where the receiver is moving away from the source. The two cases can now be combined, so that when the source and receiver move simultaneously:

$$f' = f \cdot \frac{1 + \frac{v_E}{c}}{1 - \frac{v_Q}{c}}. \tag{4.73}$$
It is remarkable that it makes a difference whether the source moves or the receiver. This becomes particularly clear when one looks at what happens in the case v → c. If only the receiver is moving with the speed of sound c, the measured frequency is doubled, f′ = 2f. However, if the source moves at the speed of sound, the frequency diverges: f′ → ∞. In fact, when the source exceeds the speed of sound, a bang occurs, which is called a sonic boom. If the source is moving faster than the speed of sound, the wave will not be able to catch up with the source. Therefore, the wave can only propagate around the source in a backward cone, the supersonic cone (or Mach cone). The half opening angle α of the cone is related to the Mach number Ma:

$$\mathrm{Ma} = \frac{v_Q}{c} = \frac{1}{\sin\alpha}. \tag{4.74}$$
Fig. 4.11 Example of a motion with supersonic speed. The duck moves faster on the lake than the propagation speed of the water waves on the lake surface. This causes a backward Mach cone to form, also known as a bow wave in shipping. Note: In reality, the physics of bow waves is more complicated
As an example, consider a jet flying at twice the speed of sound, v = 2 · c. The Mach number is therefore Ma = 2, and the corresponding half opening angle of the supersonic cone is α = arcsin(0.5) = 30°. Another example is shown in Fig. 4.11.

Examination Task: Bat

Bats orient themselves with the help of the Doppler effect. They emit ultrasonic waves that are reflected by objects and then received back by the bats. Due to the bat's movement, the received frequency is Doppler-shifted. What is the measured frequency if the bat emits a frequency of f = 100 kHz and moves towards a wall at a speed of v = 15 m/s? Consider that the bat is both a moving source and a moving receiver. The speed of sound is c = 340 m/s.

Solution

We use the formula (4.73) for moving sources and receivers with vE = vQ = v:

$$f' = f \cdot \frac{1 + \frac{v}{c}}{1 - \frac{v}{c}} = 100\,\mathrm{kHz} \cdot \frac{1 + \frac{15\,\mathrm{m/s}}{340\,\mathrm{m/s}}}{1 - \frac{15\,\mathrm{m/s}}{340\,\mathrm{m/s}}} = 109.2\,\mathrm{kHz}. \tag{4.75}$$
The transmitted frequency is thus shifted by approx. 9 kHz by the Doppler effect.
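The Doppler formula and the Mach cone can be evaluated with a short Python sketch. The function names and the sign convention of the arguments (positive speeds mean motion towards the other party) are assumptions that follow the conventions of (4.71) to (4.74).

```python
import math

def doppler_frequency(f, c, v_receiver=0.0, v_source=0.0):
    """Received frequency for motion along the line of sight, cf. (4.73).

    Positive speeds mean "moving towards the other party"; both speeds are
    measured relative to the medium.
    """
    if v_source >= c:
        raise ValueError("formula only holds for a source slower than the wave speed")
    return f * (1.0 + v_receiver / c) / (1.0 - v_source / c)

def mach_half_angle_deg(v_source, c):
    """Half opening angle of the Mach cone from Ma = 1 / sin(alpha), cf. (4.74)."""
    if v_source <= c:
        raise ValueError("a Mach cone only forms for a supersonic source")
    return math.degrees(math.asin(c / v_source))

f_bat = doppler_frequency(100e3, 340.0, v_receiver=15.0, v_source=15.0)
alpha_jet = mach_half_angle_deg(2.0 * 340.0, 340.0)
print(f"f' = {f_bat / 1e3:.1f} kHz, alpha = {alpha_jet:.0f} deg")
```

With the values of the bat task and of the jet example, this reproduces the 109.2 kHz and the 30° from the text. Setting only `v_receiver = c` doubles the frequency, whereas `v_source` approaching c makes the result grow without bound.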
4.3 Oscillations and Waves: Compact
Here once more the most important formulas concerning oscillatory motion and waves are summarized:

Differential Eq. (4.4) and solution (4.1) of the harmonic oscillation:

$$\ddot{x}(t) = -\omega^2 \cdot x(t), \qquad x(t) = x_0 \cdot \sin(\omega t + \varphi).$$

Period of oscillation (4.11) of the spring pendulum:

$$T = 2\pi \sqrt{\frac{m}{D}}.$$

Period of oscillation (4.19) of the thread pendulum for small deflections:

$$T = 2\pi \sqrt{\frac{l}{g}}.$$

Differential Eq. (4.23) of the damped harmonic oscillator:

$$\ddot{x}(t) = -\omega_0^2 \cdot x(t) - 2\gamma \cdot \dot{x}(t).$$

Differential Eq. (4.27) of the driven harmonic oscillator:

$$\ddot{x}(t) = -\omega_0^2 \cdot x(t) - 2\gamma \cdot \dot{x}(t) + \frac{F_0}{m} \cdot \sin(\omega_{\mathrm{ext}} t).$$

Harmonic wave (4.41):

$$\vec{x}(\vec{r}, t) = \vec{x}_0 \cdot \sin\left(\vec{k} \cdot \vec{r} - \omega t\right).$$

Velocity of propagation of a wave (4.43):

$$c = \frac{\lambda}{T} = \frac{\omega}{k}.$$

Wavelengths (4.50) and frequencies (4.51) of standing waves of a rope firmly clamped at the ends:

$$\lambda_n = \frac{2L}{n}, \qquad f_n = \frac{c}{2L} \cdot n = f_1 \cdot n.$$

Period of a beat (4.55):

$$T_{\mathrm{beat}} = \frac{1}{|f_1 - f_2|}.$$

Speed of sound (4.60):

$$c_S = \frac{1}{\sqrt{\kappa \cdot \rho_m}}.$$

Sound particle speed (4.62):

$$u_0 = s_0 \cdot \omega.$$

Sound intensity (4.64) and (4.65):

$$I = \frac{E}{A \cdot t} = \frac{1}{2} \cdot \frac{p_0^2}{\rho \cdot c_S} = \frac{1}{2} \cdot \rho \cdot c_S \cdot u_0^2.$$

Volume (4.68):

$$L = 10\,\mathrm{dB} \cdot \log_{10}\left(\frac{I}{I_0}\right).$$

Doppler effect (4.73):

$$f' = f \cdot \frac{1 + \frac{v_E}{c}}{1 - \frac{v_Q}{c}}.$$

Mach cone (4.74):

$$\frac{v_Q}{c} = \frac{1}{\sin\alpha}.$$
5 Thermodynamics

5.1 Basic Concepts of Thermodynamics
Thermodynamics (heat theory) deals with the thermal movement of the individual atoms or molecules of a gas, a liquid or a solid body. This usually involves a large number of particles, for example, one litre of a gas contains approx. 10²³ particles. All these particles can move in the three spatial dimensions and possibly rotate or perform molecular vibrations. Furthermore, the particles can collide with each other and with the walls, exchanging energy and momentum. It is therefore almost impossible to reconstruct the movement of each individual particle or to simulate it with a computer. Instead, such systems are described with macroscopic quantities such as the temperature T or the pressure p of the substance. Classical thermodynamics relates these terms to the number of particles N, the volume V of the substance, and the internal energy U contained in the substance. We will explain these terms in more detail later in this section. Within the framework of statistical thermodynamics, it is shown that the macroscopic quantities can be derived from the individual motions of the particles by forming mean values over all N particles (so-called ensemble mean values). We describe the motion of particle no. n with the position function $\vec{r}_n(t)$. On average, the total system does not move due to the thermal motion of its particles, therefore the motion is undirected, that is, the average value is

$$\left\langle \vec{v}_n(t) \right\rangle_n = 0. \tag{5.1}$$
Nevertheless, the movement of the individual particles contains kinetic energy: Ekin =
N n=1
1 →2 m v ≥ 0: 2 n n
# Springer-Verlag GmbH Germany, part of Springer Nature 2023 S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_5
ð5:2Þ
119
120
5 Thermodynamics
If one speaks of the heat energy or the thermal energy of a substance, one means the kinetic energy of all individual particles (in addition to any rotational and vibrational energy of the particles). The temperature T of a substance is a measure of this thermal energy. The thermodynamic unit of temperature is the kelvin:
$$[T] = \mathrm{K}\ (\text{kelvin}). \tag{5.3}$$
The state in which all atoms are at rest ($\vec{v}_n(t) = 0$) and the kinetic energy is therefore Ekin = 0 is called absolute zero, with the temperature T = 0 K. Water freezes at a temperature of T ≈ 273.15 K. With this normalization of the Kelvin scale, temperature differences in kelvin are just as large as temperature differences on the commonly used Celsius scale:
$$\Delta T\ (\text{in K}) = \Delta T\ (\text{in}\ {}^\circ\mathrm{C}). \tag{5.4}$$
A system can now be supplied with energy from outside in the form of a quantity of heat ΔQ, for example, with the aid of a flame. This changes the temperature of the system:
$$\Delta Q = C \cdot \Delta T, \qquad [Q] = \mathrm{J}. \tag{5.5}$$
The proportionality constant C is the so-called heat capacity with the unit [C] = J/K. The heat capacity of a substance is proportional to its mass, that is, twice the mass of a substance also has twice the heat capacity. It therefore makes sense to define the specific heat capacity per mass as the material constant:
$$c = \frac{C}{m}, \qquad [c] = \frac{\mathrm{J}}{\mathrm{kg} \cdot \mathrm{K}}. \tag{5.6}$$
For example, the specific heat capacity of water has a particularly large value of cH2O = 4182 J/(kg · K); for iron the specific heat capacity is cFe = 452 J/(kg · K).

If you thermally connect two objects with different temperatures T1 > T2 so that heat can be exchanged between the objects, heat flows from the hotter object to the cooler one until the temperatures have equalized so that T′1 = T′2 applies. This is known as thermal equilibrium. This statement is also known as the zeroth law of thermodynamics.

Zeroth Law of Thermodynamics
Two systems in thermal equilibrium with each other have the same temperature.

In practice, one often wants to calculate the mixing temperature Tm to which two objects with starting temperatures T1 > T2 and corresponding heat capacities C1, C2 adjust. This can be calculated by considering that the amount of heat |ΔQ1| = C1 · (T1 − Tm) emitted by object 1 is exactly as large as the amount of heat |ΔQ2| = C2 · (Tm − T2) absorbed by object 2. Thus,
$$C_1 \cdot (T_1 - T_m) = C_2 \cdot (T_m - T_2), \tag{5.7}$$
which we can solve for the mixing temperature:
$$T_m = \frac{C_1 \cdot T_1 + C_2 \cdot T_2}{C_1 + C_2}. \tag{5.8}$$
Written Test: Mixing Temperature
A hot piece of iron with a temperature of T1 = 98 °C and a mass of 5 kg is thrown into a litre (1 kg) of water with a temperature of T2 = 20 °C. What is the mixing temperature? Note: cH2O = 4182 J/(kg · K), cFe = 452 J/(kg · K).

Solution
We first calculate the heat capacities: that of the iron piece is
$$C_1 = 452\ \mathrm{J/(kg \cdot K)} \cdot 5\ \mathrm{kg} = 2260\ \mathrm{J/K} \tag{5.9}$$
and that of the water is
$$C_2 = 4182\ \mathrm{J/(kg \cdot K)} \cdot 1\ \mathrm{kg} = 4182\ \mathrm{J/K}. \tag{5.10}$$
To determine the mixing temperature, we use Eq. (5.8):
$$T_m = \frac{C_1 \cdot T_1 + C_2 \cdot T_2}{C_1 + C_2} = \frac{2260\ \mathrm{J/K} \cdot 98\,^\circ\mathrm{C} + 4182\ \mathrm{J/K} \cdot 20\,^\circ\mathrm{C}}{2260\ \mathrm{J/K} + 4182\ \mathrm{J/K}} = 47.4\,^\circ\mathrm{C}. \tag{5.11}$$
So the mixing temperature is Tm = 47.4 °C.
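The calculation in this written test can be reproduced with a few lines of code. The following is a minimal sketch; the function name `mixing_temperature` and its parameter names are chosen here for illustration and do not come from the text.

```python
def mixing_temperature(c1, m1, t1, c2, m2, t2):
    """Mixing temperature from Eq. (5.8).

    c1, c2: specific heat capacities in J/(kg*K)
    m1, m2: masses in kg
    t1, t2: starting temperatures (both in degrees Celsius or both in kelvin)
    """
    heat_cap_1 = c1 * m1  # heat capacity C1 in J/K, Eq. (5.6)
    heat_cap_2 = c2 * m2  # heat capacity C2 in J/K
    return (heat_cap_1 * t1 + heat_cap_2 * t2) / (heat_cap_1 + heat_cap_2)


# Values from the written test: 5 kg of iron at 98 °C, 1 kg of water at 20 °C
t_mix = mixing_temperature(452.0, 5.0, 98.0, 4182.0, 1.0, 20.0)
print(f"T_m = {t_mix:.1f} °C")
```

Because Eq. (5.8) is a weighted average of the two temperatures, it gives consistent results in degrees Celsius and in kelvin, as long as both inputs use the same scale.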
5.1.1 Heat Transport
We have just learned that two objects equalize in temperature through heat exchange. Now we want to understand more precisely how and on what time scale heat is transported from one object to the other. In principle, there are several mechanisms for this. One possibility is convection in liquids or gases. For example, if you heat a pot of water, first the layer of water at the bottom of the pot gets hot. The hot water rises and thus transports the heat upwards, where it can give it off to the air, or mix with the colder layers of water. The laws of hydrodynamics apply here. We will not go into this further here. A second possibility is radiant heat. Every body
emits a spectrum of electromagnetic radiation due to its temperature. More details can be found in Sect. 10.5.2.1. Within solid bodies, heat transfer occurs primarily when the vibrating atoms that make up the solid collide with each other. In this way, vibrational energy can be transferred to the atoms at adjacent lattice sites. How well this exchange works depends, among other things, on how close the atoms come to each other when vibrating, that is, on the density of the body. A special feature of metals is their free electrons, which can move throughout the metal and thus transport heat very efficiently. For this reason, metals usually conduct heat very well. Regardless of the mechanism, one introduces a phenomenological quantity called thermal conductivity λ with the unit W/(m · K). It determines how much heat dQ flows per time dt through a body of length d and cross-sectional area A. This rate is called heat flow or heat flux I and has the unit of power:
$$I = \frac{dQ}{dt} = \lambda \cdot \frac{A}{d} \cdot \Delta T, \qquad [I] = \mathrm{W}. \tag{5.12}$$
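However, heat only flows if there is a temperature difference ΔT along the length of the body. Examples of thermal conductivities are
$$\text{Glass:} \quad \lambda = 0.76\ \frac{\mathrm{W}}{\mathrm{m} \cdot \mathrm{K}}, \tag{5.13}$$
$$\text{Air:} \quad \lambda = 0.026\ \frac{\mathrm{W}}{\mathrm{m} \cdot \mathrm{K}}, \tag{5.14}$$
$$\text{Copper:} \quad \lambda = 300\text{ to }400\ \frac{\mathrm{W}}{\mathrm{m} \cdot \mathrm{K}}. \tag{5.15}$$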
Examination Task: Heat Losses at the Window
In the past, windows consisted only of simple panes of glass. We consider a pane with area A = 0.5 m² and thickness d = 5 mm. What is the heat flow through the window when it is 20 °C in the room and 0 °C outside?

Solution
The heat flow is
$$I = 0.76\ \frac{\mathrm{W}}{\mathrm{m} \cdot \mathrm{K}} \cdot \frac{0.5\ \mathrm{m}^2}{0.005\ \mathrm{m}} \cdot 20\ \mathrm{K} = 1520\ \mathrm{W}. \tag{5.16}$$
To prevent the room from cooling down, this power must be provided by the heating system.

Fig. 5.1 (a) Two objects with temperatures T1 and T2 and heat capacities C1 and C2 are thermally connected by a thermal bridge with thermal conductivity λ, cross-sectional area A and length d. As a result, heat flows from the warmer to the colder side. (b) The temperature difference ΔT of the two objects decreases exponentially with time

With the help of the heat flow, one can calculate how fast the temperatures of two objects converge. For this purpose, we consider two bodies with starting temperatures T1 and T2 and heat capacities C1 and C2, respectively (see Fig. 5.1). The two bodies are connected by a thermal bridge across which heat flows from the hotter to the colder object. This causes the temperatures to equalise to the mixing temperature from Eq. (5.8). The temperature difference ΔT becomes smaller according to an exponential function (without proof):
$$\Delta T(t) = \Delta T_0 \cdot \exp\left(-\frac{t}{\tau}\right), \tag{5.17}$$
with the temperature difference ΔT0 at the beginning and the 1/e time scale
$$\tau = \frac{C_1 \cdot C_2}{C_1 + C_2} \cdot \frac{d}{\lambda A}, \tag{5.18}$$
on which the temperature equalization takes place.
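Equations (5.12), (5.17) and (5.18) can be combined in a short numerical sketch. The function names are illustrative, and the copper rod used for the time constant is a hypothetical example with assumed values (λ = 350 W/(m · K), A = 1 cm², d = 10 cm); only the window values are taken from the examination task.

```python
import math


def heat_flow(lam, area, length, delta_t):
    """Heat flow I in W from Eq. (5.12)."""
    return lam * area / length * delta_t


def time_constant(cap_1, cap_2, lam, area, length):
    """1/e time scale tau in s from Eq. (5.18)."""
    return cap_1 * cap_2 / (cap_1 + cap_2) * length / (lam * area)


def temperature_difference(t, delta_t0, tau):
    """Temperature difference after time t from Eq. (5.17)."""
    return delta_t0 * math.exp(-t / tau)


# Window from the examination task: glass, A = 0.5 m^2, d = 5 mm, 20 K difference
window_loss = heat_flow(0.76, 0.5, 0.005, 20.0)

# Hypothetical thermal bridge (assumed values, not from the text): a copper rod
# connecting the iron piece (2260 J/K) and the water (4182 J/K) from the
# mixing example
tau = time_constant(2260.0, 4182.0, 350.0, 1.0e-4, 0.1)
print(f"I = {window_loss:.0f} W, tau = {tau:.0f} s")
```

After one time constant, `temperature_difference(tau, delta_t0, tau)` has dropped to 1/e (about 37 %) of the initial difference.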
5.1.2 Thermal Expansion
When a solid body is heated, it usually also expands, that is, the length and volume of the body increase with temperature. However, there are also a few substances for which the length along a certain direction becomes shorter as the temperature increases. In both cases, the change in length Δl with a change in temperature ΔT can be described by
$$\Delta l = \alpha \cdot l_0 \cdot \Delta T. \tag{5.19}$$
Here, the original length of the body is l0, and the quantity α with [α] = 1/K is a material constant, the linear expansion coefficient. For iron, for example, it is α = 12 · 10⁻⁶/K. The sign of α determines whether a body becomes longer or shorter with increasing temperature. It should be noted here that α itself depends on temperature, that is, formula (5.19) is valid only as long as the temperature change ΔT is not too large. For liquids, it makes more sense to consider the thermal volume change ΔV, because liquids always take on the shape of the vessel as long as the boiling temperature is not exceeded. Analogous to (5.19), the volume change is
$$\Delta V = \gamma \cdot V_0 \cdot \Delta T, \tag{5.20}$$
with the original volume V0 and the volume expansion coefficient γ. For water, for example, it is γ = 6 · 10⁻⁴/K.

Written Test: Mercury Thermometer
A mercury thermometer has a reservoir at the bottom containing a quantity of V0 = 1 cm³ of mercury. How thin must the cylindrical thermometer capillary be, in which the liquid rises, so that the level increases by d = 2 mm per heating of ΔT = 1 °C? Calculate the required radius R of the capillary. Note: The coefficient of volume expansion of mercury is γ = 0.18 · 10⁻³/K.

Solution
The volume change with an increase d of the liquid in the cylindrical capillary with radius R is
$$\Delta V = \pi \cdot R^2 \cdot d. \tag{5.21}$$
This change in volume is caused by the heating ΔT. According to (5.20),
$$\Delta V = \gamma \cdot V_0 \cdot \Delta T. \tag{5.22}$$
We set the two expressions equal and solve for the radius:
$$\pi \cdot R^2 \cdot d = \gamma \cdot V_0 \cdot \Delta T, \tag{5.23}$$
$$R = \sqrt{\frac{\gamma \cdot V_0 \cdot \Delta T}{\pi \cdot d}} \tag{5.24}$$
$$= \sqrt{\frac{0.18 \cdot 10^{-3}/\mathrm{K} \cdot 1\ \mathrm{cm}^3 \cdot 1\ \mathrm{K}}{\pi \cdot 2\ \mathrm{mm}}} = 170\ \mu\mathrm{m}. \tag{5.25}$$
The capillary must therefore have a radius of 170 μm.
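The unit conversion is the error-prone step in this solution (cm³ and mm must both be expressed in metres). As a hedged check, the sketch below evaluates Eq. (5.24) in SI units; the function name `capillary_radius` is an illustrative choice.

```python
import math


def capillary_radius(gamma, volume, delta_t, rise):
    """Radius R in m from Eq. (5.24): R = sqrt(gamma * V0 * dT / (pi * d))."""
    return math.sqrt(gamma * volume * delta_t / (math.pi * rise))


# Values from the written test, converted to SI units:
# V0 = 1 cm^3 = 1e-6 m^3, d = 2 mm = 2e-3 m
radius = capillary_radius(gamma=0.18e-3, volume=1.0e-6, delta_t=1.0, rise=2.0e-3)
print(f"R = {radius * 1e6:.0f} µm")
```

The printed value agrees with the 170 μm quoted in the solution when rounded to two significant figures.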
5.2 Ideal Gas
In the following we consider the thermodynamics of gases. We consider N gas particles (atoms or molecules) confined in a volume V at a pressure p and a temperature T (see Fig. 5.2a). In the model of the ideal gas, we make the following simplifying assumptions, most of which are satisfied to a very good approximation: The gas particles are assumed to be point-like and therefore do not occupy a volume of their own in V. Furthermore, gas particles collide with each other only by elastic collisions (see Sect. 2.3.2.1), so that the kinetic energy is conserved during collisions. Furthermore, we assume that no long-range forces act between the particles, that is, the trajectories of two particles are deflected only if they actually collide with each other.

Fig. 5.2 (a) In the model of the ideal gas, there are N point-like gas particles at temperature T and pressure p in a volume V. (b) The velocities of the particles in the ideal gas are given by the Maxwell-Boltzmann distribution, here in the example of nitrogen molecules at room temperature. The three vertical lines from left to right correspond to the most probable, the mean and the thermal velocity

Under these assumptions the ideal gas law is valid:
$$p \cdot V = N \cdot k_B \cdot T, \tag{5.26}$$
with the Boltzmann constant kB = 1.38 × 10⁻²³ J/K. Thus, for a fixed volume and fixed number of particles, the pressure in the gas is proportional to the temperature. This is because the particles move faster on average as the temperature rises, and thus hit the walls of the volume at a higher speed. However, how fast each particle is moving can only be predicted with a certain probability. The probability p(v) dv of encountering a particle in an infinitesimally small range dv around the velocity v is given by the Maxwell-Boltzmann velocity distribution:
$$p(v) = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} \cdot v^2 \cdot e^{-\frac{m v^2}{2 k_B T}}, \tag{5.27}$$
with the mass m of a gas particle. The function (5.27) is sketched in Fig. 5.2b. We see that for small velocities the function initially increases quadratically, reaches a maximum at the most probable velocity $\hat{v}$ and decreases exponentially for still larger velocities. The position of the maximum can be calculated by setting the derivative to zero:
$$\frac{dp(v)}{dv} = 0. \tag{5.28}$$
This leads to the most probable velocity
$$\hat{v} = \sqrt{\frac{2 k_B T}{m}}. \tag{5.29}$$
The function p(v) can also be used to calculate the mean velocity ⟨v⟩:
$$\langle v \rangle = \int_0^\infty v \cdot p(v)\, dv = \sqrt{\frac{8 k_B T}{\pi m}}. \tag{5.30}$$
The mean velocity is not identical with the most probable velocity, since the function p(v) is not symmetric about its maximum. Another important quantity is the mean squared velocity ⟨v²⟩ with
$$\langle v^2 \rangle = \int_0^\infty v^2 \cdot p(v)\, dv = \frac{3 k_B T}{m}, \tag{5.31}$$
because from that you can calculate the average kinetic energy per particle:
$$\langle E_\text{kin} \rangle = \frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} k_B T. \tag{5.32}$$
This energy depends only on the temperature and not on the properties of the gas particles such as the mass. Often one also introduces the thermal velocity vth with
$$v_\text{th} = \sqrt{\langle v^2 \rangle} = \sqrt{\frac{3 k_B T}{m}}. \tag{5.33}$$
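The integrals in (5.30) and (5.31) are stated without derivation. As a plausibility check, they can be evaluated numerically. The sketch below uses a simple midpoint rule in plain Python; the function names, the upper integration limit of 5000 m/s and the number of steps are assumptions made for this illustration, and the particle mass is that of a nitrogen molecule.

```python
import math

K_B = 1.38e-23  # Boltzmann constant in J/K (value used in the text)


def maxwell_boltzmann(v, m, temperature):
    """Probability density p(v) from Eq. (5.27)."""
    a = m / (2.0 * K_B * temperature)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v**2 * math.exp(-a * v**2)


def moment(k, m, temperature, v_max=5000.0, steps=50_000):
    """Midpoint-rule estimate of the integral of v**k * p(v) from 0 to v_max."""
    h = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += v**k * maxwell_boltzmann(v, m, temperature)
    return total * h


m_n2, temp = 4.7e-26, 300.0          # nitrogen molecule at room temperature
norm = moment(0, m_n2, temp)         # should be close to 1
v_mean = moment(1, m_n2, temp)       # compare with Eq. (5.30)
v_thermal = math.sqrt(moment(2, m_n2, temp))  # compare with Eq. (5.33)

print(norm, v_mean, math.sqrt(8 * K_B * temp / (math.pi * m_n2)))
print(v_thermal, math.sqrt(3 * K_B * temp / m_n2))
```

The numerical and closed-form values agree to many digits, because the integrand is negligible far above the most probable velocity.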
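Exercise: Gas Velocities in Air
Air consists mainly of N2 molecules with a mass of m = 4.7 × 10⁻²⁶ kg. Calculate the most probable, the mean and the thermal velocity of air at room temperature T = 300 K.

Solution
The most probable velocity is
$$\hat{v} = \sqrt{\frac{2 k_B T}{m}} = \sqrt{\frac{2 \cdot 1.38 \times 10^{-23}\ \mathrm{J/K} \cdot 300\ \mathrm{K}}{4.7 \times 10^{-26}\ \mathrm{kg}}} = 420\ \frac{\mathrm{m}}{\mathrm{s}}, \tag{5.34}$$
the mean velocity is
$$\langle v \rangle = \sqrt{\frac{8 k_B T}{\pi m}} = \sqrt{\frac{8 \cdot 1.38 \times 10^{-23}\ \mathrm{J/K} \cdot 300\ \mathrm{K}}{\pi \cdot 4.7 \times 10^{-26}\ \mathrm{kg}}} = 474\ \frac{\mathrm{m}}{\mathrm{s}}, \tag{5.35}$$
and the thermal velocity is
$$v_\text{th} = \sqrt{\frac{3 k_B T}{m}} = \sqrt{\frac{3 \cdot 1.38 \times 10^{-23}\ \mathrm{J/K} \cdot 300\ \mathrm{K}}{4.7 \times 10^{-26}\ \mathrm{kg}}} = 514\ \frac{\mathrm{m}}{\mathrm{s}}. \tag{5.36}$$

5.2.1 Molar Quantities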
To avoid excessively large or small numbers when handling laboratory quantities, amounts of substance are often specified in chemistry in the unit mol. The quantity of n = 1 mol refers to a specific number of atoms or molecules, which is defined by the Avogadro constant NA:
$$N_A = \frac{6.02 \cdot 10^{23}}{\mathrm{mol}}. \tag{5.37}$$
For example, there are 6.02 · 10²³ water molecules in one mole of water. Accordingly, some quantities are renormalized to mol. An example is the molar heat capacity cmol with the unit
$$[c_\text{mol}] = \frac{\mathrm{J}}{\mathrm{mol} \cdot \mathrm{K}}, \tag{5.38}$$
which describes the energy required to heat one mole of the substance by one kelvin. The molar heat capacity is calculated from the specific heat capacity c through
$$c_\text{mol} = c \cdot m_\text{mol}, \tag{5.39}$$
with the molar mass mmol with [mmol] = kg/mol. This refers to the mass of one mole of the substance. It can be calculated from the mass mA of a single atom or molecule:
$$m_\text{mol} = m_A \cdot N_A. \tag{5.40}$$
Another quantity that must be renormalized is the Boltzmann constant. It becomes the general gas constant R with
$$R = k_B \cdot N_A = 8.31\ \frac{\mathrm{J}}{\mathrm{mol} \cdot \mathrm{K}}. \tag{5.41}$$

5.2.2 Internal Energy
The internal energy U is the total energy of a substance that is available for thermodynamic processes. We already got to know part of this in Sect. 5.2 when we calculated the average kinetic energy per particle in (5.32). The total kinetic energy of all N particles is therefore
$$E_\text{kin,tot} = \frac{3}{2} N k_B T. \tag{5.42}$$
In the case of molecular gas particles, which are composed of several atoms, there are, in addition to the kinetic energy, contributions to the internal energy due to the rotation of the molecules (rotational energy) and due to oscillations of the atoms in the molecules (vibrational energy). In real gases, there are also other forms of energy, for example, the interaction energy between particles. These individual ways of accommodating thermal energy in different forms are called degrees of freedom. For example, there are f = 3 kinetic degrees of freedom per atom, since each atom can move in the three spatial directions x, y, and z. The equipartition theorem states that in thermal equilibrium each degree of freedom of each particle has the mean energy
$$\langle E_1 \rangle = \frac{1}{2} k_B T. \tag{5.43}$$
For this reason, a particle with f degrees of freedom has the mean energy
$$E_f = \frac{f}{2} k_B T. \tag{5.44}$$
Thus, the mean kinetic energy of a gas particle in accordance with (5.32) is given by
$$\langle E_\text{kin} \rangle = \frac{3}{2} k_B T, \tag{5.45}$$
because of the three kinetic degrees of freedom. The internal energy U of a gas with N particles is then given by
$$U = N \cdot E_f = \frac{f}{2} N k_B T. \tag{5.46}$$
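To get a feeling for the magnitudes involved, Eq. (5.46) can be evaluated for one mole of gas at room temperature. This is a minimal sketch; the function name is illustrative, and the choice of one mole at 300 K is an assumed example. The values f = 3 and f = 5 correspond to a 1-atomic gas and to a 2-atomic gas at room temperature, as discussed in the example that follows in the text.

```python
K_B = 1.38e-23  # Boltzmann constant in J/K
N_A = 6.02e23   # number of particles in one mole, Eq. (5.37)


def internal_energy(f, n_particles, temperature):
    """Internal energy U in J from Eq. (5.46): U = f/2 * N * k_B * T."""
    return 0.5 * f * n_particles * K_B * temperature


u_mono = internal_energy(3, N_A, 300.0)  # 1-atomic gas, f = 3
u_di = internal_energy(5, N_A, 300.0)    # 2-atomic gas at room temperature, f = 5
print(f"U(f=3) = {u_mono:.0f} J, U(f=5) = {u_di:.0f} J")
```

Both values are a few kilojoules per mole, and their ratio is 5/3, independent of temperature and particle number.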
Fig. 5.3 (a) A 1-atomic gas has only the three degrees of freedom of translation in the x, y, and z directions. (b–d) Rotations of a 2-atomic molecule about the three x, y, and z axes. The rotation about the molecular axis x in (d) has no rotational energy, so this rotation does not count as a degree of freedom. (e) Stretching vibration of a 2-atomic molecule. At room temperature, the thermal energy is not sufficient to excite this stretching vibration
Example: Internal Energy in 1- and 2-Atomic Gas
A single-atom gas, in which the gas particles consist of individual atoms, as in the case of helium in Fig. 5.3a, has neither rotational nor vibrational degrees of freedom, since the atom, as a point particle with a moment of inertia J = 0, has no rotational energy (for rotational energy, see also Sect. 2.5.2). Nor does a single atom have any vibrational degrees of freedom, since vibrations can only occur between the bound atoms in molecules. The number of degrees of freedom is thus given only by the motion along the three spatial directions and is f = 3. The internal energy of a 1-atom gas consisting of N particles is thus
$$U = \frac{3}{2} N k_B T. \tag{5.47}$$
However, 1-atomic gases are rather rare. It is much more common to find diatomic gases such as oxygen O2 or nitrogen N2. Here, two atoms are linearly bound to each other. In addition to the f = 3 translational degrees of freedom, there are therefore also rotational degrees of freedom (see Fig. 5.3b–d). In principle, the molecule can rotate about all three spatial axes, that is, about the x-, y- and z-axis. However, when rotating about the bond axis of the molecule (the x-axis), the moment of inertia is J = 0, since the two point-like atoms are located on the axis of rotation. Therefore, there are effectively only f = 2 rotational degrees of freedom left, in each of which the molecule rotates around the common center of gravity of the two atoms like a dumbbell. For 2-atom molecules, there is in principle also a vibrational degree of freedom. Here the two atoms move back and forth along the bond axis in opposite directions, that is, the molecule is alternately compressed and stretched (see Fig. 5.3e). This is known as a stretching vibration. However, the associated elastic energy is so large that at room temperature this oscillation cannot be thermally excited. The degree of freedom is said to be frozen out. Therefore, molecular vibrations at room temperature do not contribute any degree of freedom to the internal energy. However, at very high temperatures, these degrees of freedom can play a role. The total number of degrees of freedom at room temperature is therefore f = 3 + 2 = 5, and the internal energy of a diatomic gas at room temperature is
$$U = \frac{5}{2} N k_B T. \tag{5.48}$$

5.2.3 First Law of Thermodynamics
The first law of thermodynamics deals with the way in which the internal energy U of a thermodynamic system can change. The change ΔU is given by

First Law of Thermodynamics
$$\Delta U = \Delta Q + \Delta W. \tag{5.49}$$

The internal energy can change as a result of the amount of heat ΔQ being supplied to the system or as a result of the work ΔW being performed on the system (see Fig. 5.4). This work is either compression work
$$\Delta W = -p \cdot \Delta V, \tag{5.50}$$
where work from outside changes the volume V of the system at constant pressure p by ΔV, or it is frictional work within the system, where other forms of energy (e.g., mechanical energy) are converted into thermal energy. We only consider compression work in this section.

Fig. 5.4 The internal energy of a thermodynamic system changes either (a) by supply (removal) of heat ΔQ or (b) by compression work on the system ΔW = −p · ΔV

In the ideal gas, the relationship
$$\Delta U = C_V \cdot \Delta T \tag{5.51}$$
describes the relation between the change of the internal energy ΔU and the change of the temperature ΔT, with the heat capacity CV. If the volume remains constant (ΔV = 0), the work of compression is ΔW = 0, and according to the first law (5.49), ΔU = ΔQ. The supplied heat is then also given by
$$\Delta Q = C_V \cdot \Delta T. \tag{5.52}$$
Equation (5.52), however, in contrast to (5.51), is valid only for constant volume, which is why CV is also called heat capacity at constant volume.

Exercise: Heat Capacity CV in a 1-Atom Gas
Calculate the heat capacity of a 1-atom gas consisting of N gas particles.

Solution
As we calculated in (5.47), the internal energy of a 1-atom gas with its 3 degrees of freedom of translation is
$$U = \frac{3}{2} N k_B T. \tag{5.53}$$
If the temperature in the gas changes, the internal energy changes proportionally:
$$\Delta U = \frac{3}{2} N k_B \Delta T. \tag{5.54}$$
By comparison with (5.51) it follows that the heat capacity is
$$C_V = \frac{3}{2} N k_B. \tag{5.55}$$

5.2.4 Changes of State in the Ideal Gas
In this section we consider an ideal gas in which we manipulate one or more of the quantities p, V and T externally, thereby changing the state of the gas. We are interested here in how the other quantities respond to this change. We also assume that one condition is held fixed during each change: constant volume (isochoric), constant pressure (isobaric), constant temperature (isothermal), or no heat exchange with the environment (adiabatic) (see Fig. 5.5).
Fig. 5.5 Thermodynamic changes of state: (a) In an isochoric change of state, the volume V is constant. (b) In an isobaric change of state, the pressure p is constant, for example, due to a constant force on the piston. (c) In the case of an isothermal change of state, the temperature T is constant due to coupling to a heat bath. (d) In the case of an adiabatic change of state, the system is thermally insulated so that no heat change ΔQ can take place
5.2.4.1 Isochoric Change of State
In the isochoric change of state, the volume remains constant. We imagine that the gas is in a container with solid walls. The work of compression
$$\Delta W = -p \cdot \Delta V = 0 \tag{5.56}$$
is therefore zero, and the heat change in the gas is given by Eq. (5.52):
$$\Delta Q = C_V \cdot \Delta T. \tag{5.57}$$
The change in the quantities T and p is obtained by solving the ideal gas Eq. (5.26) for the constant quantities:
$$\frac{V}{N k_B} = \frac{T}{p} = \text{const}. \tag{5.58}$$
In addition to the volume V, the number of particles N is also constant. The Boltzmann constant kB does not change either, of course. This means that in the case of an isochoric change of state, the ratio of temperature T and pressure p is constant. In the case of a change from the state with T1 and p1 to T2 and p2, the following equation therefore applies:
$$\frac{T_1}{p_1} = \frac{T_2}{p_2}. \tag{5.59}$$
If heat is added to the gas and the temperature is increased as a result, the pressure also increases in the same proportion.

5.2.4.2 Isobaric Change of State
In the case of an isobaric change of state, the pressure is constant, that is, Δp = 0. Technically, this can be achieved, for example, by enclosing the gas in a cylinder with a piston and applying a constant force to the piston from outside. Analogous to the procedure for the isochoric change of state, the ideal gas equation yields the relationship
$$\frac{T_1}{V_1} = \frac{T_2}{V_2}. \tag{5.60}$$
Here, the ratio of temperature and volume is constant. If the temperature in the gas is increased, the volume increases in the same ratio. The compression work done here is given by the change in volume:
$$\Delta W = -p \cdot \Delta V. \tag{5.61}$$
If the volume increases (ΔV > 0), the work is ΔW < 0. This means that the gas is doing work by pushing the piston outward. According to the first law (5.49), the change in the amount of heat in the gas is given by
$$\Delta Q = \Delta U - \Delta W = C_V \cdot \Delta T + p \cdot \Delta V. \tag{5.62}$$
This equation can be further simplified by expressing the work of compression as a function of temperature change using the ideal gas law:
$$p \cdot \Delta V = N k_B \Delta T. \tag{5.63}$$
The change in the amount of heat is then
$$\Delta Q = C_V \cdot \Delta T + N k_B \Delta T = (C_V + N k_B) \cdot \Delta T = C_p \cdot \Delta T. \tag{5.64}$$
Here we have defined the heat capacity at constant pressure:
$$C_p = C_V + N k_B. \tag{5.65}$$
5.2.4.3 Isothermal Change of State
In an isothermal change of state, the temperature is constant, ΔT = 0. Imagine that the whole system is in a heat bath with a constant temperature, with which it can exchange heat at will, so that the temperature in the system remains the same at all times. From the ideal gas law, analogous to the procedure for the isochoric change of state, we deduce the following equation:
$$p_1 \cdot V_1 = p_2 \cdot V_2. \tag{5.66}$$
The product of the pressure in the gas and its volume is constant. If the pressure in the gas increases, the volume of the gas must decrease accordingly so that the temperature remains the same. We have already encountered this equation in Sect. 3.2.4 on aerostatics as the Boyle-Mariotte law. There is now a complication when calculating the work ΔW = −p · ΔV, since the pressure also changes when the volume changes. Therefore one must divide the volume change into many infinitesimally small steps dV and integrate over the corresponding infinitesimal work dW = −p · dV:
$$\Delta W = -\int_{V_1}^{V_2} p\, dV = -\int_{V_1}^{V_2} \frac{N k_B T}{V}\, dV = -N k_B T \cdot \ln \frac{V_2}{V_1}. \tag{5.67}$$
Here we used the ideal gas law and the fact that the antiderivative of 1/V is the natural logarithm ln(V). The result is
$$\Delta W = -N k_B T \cdot \ln \frac{V_2}{V_1}. \tag{5.68}$$
The internal energy does not change during an isothermal change of state:
$$\Delta U = C_V \cdot \Delta T = 0. \tag{5.69}$$
For this reason the heat exchanged according to the first law is given by
$$\Delta Q = -\Delta W = N k_B T \cdot \ln \frac{V_2}{V_1}. \tag{5.70}$$
This means that heat flowing into the gas is not converted into internal energy but is immediately used to do work on the outside.
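The step from the sum over infinitesimal work contributions to the logarithm in (5.67) can be checked numerically. The sketch below compares a midpoint-rule sum of dW = −p · dV with the closed form (5.68). The function names, the step count and the example (one mole at 293 K compressed from 5 litres to 1 litre) are assumptions made for this illustration.

```python
import math

K_B = 1.38e-23  # Boltzmann constant in J/K
N_A = 6.02e23   # number of particles in one mole


def isothermal_work(n_particles, temperature, v1, v2):
    """Work done on the gas from Eq. (5.68)."""
    return -n_particles * K_B * temperature * math.log(v2 / v1)


def isothermal_work_numeric(n_particles, temperature, v1, v2, steps=100_000):
    """Midpoint-rule sum of dW = -p dV with p = N k_B T / V (ideal gas law)."""
    dv = (v2 - v1) / steps
    work = 0.0
    for i in range(steps):
        v = v1 + (i + 0.5) * dv
        work -= n_particles * K_B * temperature / v * dv
    return work


# Assumed example: one mole at 293 K, compressed from 5 litres to 1 litre
work = isothermal_work(N_A, 293.0, 5.0e-3, 1.0e-3)
work_numeric = isothermal_work_numeric(N_A, 293.0, 5.0e-3, 1.0e-3)
heat = -work  # Eq. (5.70): heat exchanged with the bath

print(f"analytic: {work:.1f} J, numeric: {work_numeric:.1f} J, heat: {heat:.1f} J")
```

For a compression (V2 < V1) the work is positive, that is, work is done on the gas, and the same amount of heat flows out into the heat bath.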
5.2.4.4 Adiabatic Change of State
An adiabatic change of state is characterised by the fact that the gas cannot exchange heat with the environment, that is, ΔQ = 0. Technically, this can be achieved by insulating the system as well as possible, for example, in a thermos flask or by using Styrofoam. The work is then given by the first law of thermodynamics as
$$\Delta W = \Delta U - \Delta Q = \Delta U = C_V \cdot \Delta T. \tag{5.71}$$
Temperature changes in the gas are therefore always associated with work. This process is reversible, that is, on the one hand the gas can be heated by doing work from the outside, and on the other hand the gas can do work to the outside by cooling down. Later we will see that with reversible changes of state the entropy S, a measure of the disorder in the system, remains constant:
$$\Delta S = 0. \tag{5.72}$$
Such processes are thus called isentropic. The question of how temperature T, volume V and pressure p behave in the case of an adiabatic change of state cannot, in contrast to the isochoric, isobaric and isothermal change of state, be answered with the aid of the ideal gas law alone, since, for example, a change in temperature can be compensated by a change in pressure as well as in volume. However, with the condition that no heat may flow, one can derive the so-called adiabatic equation (here without proof):
$$\frac{T_2}{T_1} = \left(\frac{V_1}{V_2}\right)^{\kappa - 1} = \left(\frac{p_2}{p_1}\right)^{\frac{\kappa - 1}{\kappa}}, \tag{5.73}$$
with the isentropic exponent
$$\kappa = \frac{C_p}{C_V}. \tag{5.74}$$
For a 2-atom gas, the isentropic exponent is κ = 1.4.

Examination Task: Compression in the Air Pump
If a bicycle pump is closed at the air outlet and the gas is compressed very quickly, the gas in the pump has no time to exchange heat with the environment via the pump housing. So this process can be considered as an adiabatic change of state, with isentropic exponent κ = 1.4. How hot does the gas with starting temperature T1 = 20 °C ≈ 293 K become in this process when the volume V1 is compressed to one fifth, V2 = V1/5? To what value does the pressure rise (starting pressure p1 = 1 bar)? What happens if you wait long enough for the gas in the pump to cool down to ambient temperature again after compression while keeping the volume V2 the same?

Solution
The temperature of the gas is obtained from the adiabatic Eq. (5.73) by solving for T2:
$$T_2 = T_1 \cdot \left(\frac{V_1}{V_2}\right)^{\kappa - 1} = 293\ \mathrm{K} \cdot 5^{0.4} = 558\ \mathrm{K} \approx 285\,^\circ\mathrm{C}. \tag{5.75}$$
Similarly, the pressure is obtained by solving the adiabatic Eq. (5.73) for p2:
$$p_2 = p_1 \cdot \left(\frac{V_1}{V_2}\right)^{\kappa} = 1\ \mathrm{bar} \cdot 5^{1.4} = 9.5\ \mathrm{bar}. \tag{5.76}$$
If one waits after the compression, heat flows out of the compressed gas to the outside until the gas has reached the ambient temperature again, which it also had at the beginning. Overall, therefore, ΔT = 0, that is, the total process of adiabatic compression and waiting is isothermal. For isothermal processes, however, the following equation determines the final pressure:
$$p_2 = p_1 \cdot \frac{V_1}{V_2} = 1\ \mathrm{bar} \cdot 5 = 5\ \mathrm{bar}. \tag{5.77}$$
This means that as the gas cools, the pressure decreases from 9.5 bar to 5 bar.
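The two steps of this task, adiabatic compression followed by cooling at constant volume, can be traced in a short sketch. The function name `adiabatic_final_state` and the variable `p3` for the pressure after cooling are illustrative names, not notation from the text.

```python
def adiabatic_final_state(t1, p1, compression, kappa=1.4):
    """Return (T2, p2) after adiabatic compression, from Eq. (5.73).

    compression is the volume ratio V1/V2.
    """
    t2 = t1 * compression ** (kappa - 1.0)
    p2 = p1 * compression ** kappa
    return t2, p2


# Values from the examination task: T1 = 293 K, p1 = 1 bar, V2 = V1/5
t2, p2 = adiabatic_final_state(293.0, 1.0, 5.0)

# Cooling back to T1 at constant volume: p/T stays constant, Eq. (5.59)
p3 = p2 * 293.0 / t2

print(f"T2 = {t2:.0f} K, p2 = {p2:.1f} bar, p3 = {p3:.1f} bar")
# prints: T2 = 558 K, p2 = 9.5 bar, p3 = 5.0 bar
```

The pressure after cooling equals the isothermal result of Eq. (5.77), which confirms that the overall process connects two states of equal temperature.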
5.3 Entropy and Reversibility
Thermodynamics is also concerned with the direction in which a system evolves. We already know that systems always strive towards the state of minimum energy. But even among states that have the same energy, it may happen that the system always evolves towards only one of them. Let us first demonstrate this with an example.

5.3.1 Example of Irreversible Expansion
We consider the volume in Fig. 5.6, which is divided into two halves by a partition. At the beginning there is a gas in the left half. Then a small hole is opened through which the gas can flow from the left half to the right half. Since no piston is moved by opening the hole, no work is done, ΔW = 0. Likewise, there is no heat exchange, ΔQ = 0, when we isolate the volume from the environment. Thus there is no change in internal energy, ΔU = ΔQ + ΔW = 0. Nevertheless, the gas flows from the left half to the right half of the volume until pressure equilibrium is established between the two sides. This process is not reversible, that is, the gas does not evolve by itself back to the initial state A, where all the gas particles were in the left half. This is because the final state B is statistically much more probable than the initial state. To show this, we calculate the probabilities that the system is in state A and state B, respectively. The probability of finding a single particle in the left half or in the right half is
$$p_l = p_r = \frac{1}{2}, \tag{5.78}$$
provided that both volumes are equal. State A, in which all N gas particles are located in the left half, therefore has probability
$$p_A = \left(\frac{1}{2}\right)^N. \tag{5.79}$$
For typical particle numbers N ~ 10²³ in gases, this probability is extremely low. State B, where exactly half of the particles are on the left side and the other half of the particles are on the right side, can now be realized in various ways by selecting N/2 of the total N particles to be on the left side. This determines the remaining particles on the right side. The total probability for state B is then given by the number of possible realizations times the probability of a single realization:
$$p_B = \binom{N}{N/2} \cdot \left(\frac{1}{2}\right)^N = \frac{N!}{\left(\frac{N}{2}\right)! \left(\frac{N}{2}\right)!} \cdot \left(\frac{1}{2}\right)^N. \tag{5.80}$$

Fig. 5.6 (a) If one opens a passage from the left to the right half of the volume, the gas particles distribute themselves approximately equally over both halves of the volume. The initial state, in which all the particles are in the left half, is virtually never reoccupied. This is because the initial state is much less likely than the final state. This can be seen in (b) using four particles as an example. There are six ways to distribute the particles so that the uniformly distributed state is realized; for the initial state, there is only one realization. The more particles are involved, the more probable the uniformly distributed state becomes
The individual realizations are also called microstates, while state A and B are the so-called macrostates. State A is realized by a single microstate, while for state B there are a number of
$$w = \frac{N!}{\left(\frac{N}{2}\right)! \left(\frac{N}{2}\right)!} \tag{5.81}$$
microstates. The number increases very rapidly with the number of particles, as the following table shows:

Number of particles N    Number of microstates w
2                        2
4                        6
6                        20
8                        70
10                       252
100                      ≈ 10²⁹

According to the so-called ergodic hypothesis, while the system evolves, each microstate is passed through with equal probability. Macrostates, which are formed by very many microstates, are for this reason much more frequent than those which are formed by only a few realizations. Thus, in the example of the subdivided volume, it is quite possible for all atoms to be randomly located in the left half; it is only extremely rare for this to occur. Even with a number of particles of only N = 100, the state in which there are exactly Nl = Nr = 50 particles on the left and on the right is 10²⁹ times as frequent as the state in which all particles are located in the left half.
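The table of microstates can be regenerated exactly with integer arithmetic. The sketch below uses `math.comb` from the Python standard library (available from Python 3.8); the function names are illustrative.

```python
from math import comb


def microstates(n_particles):
    """Number of microstates w of the evenly split macrostate B, Eq. (5.81)."""
    return comb(n_particles, n_particles // 2)


def probability_even_split(n_particles):
    """Probability p_B of the evenly split macrostate from Eq. (5.80)."""
    return microstates(n_particles) / 2**n_particles


for n in (2, 4, 6, 8, 10, 100):
    print(n, microstates(n), probability_even_split(n))
```

Note that the probability of the exactly even split decreases as N grows, because neighbouring, nearly even splits also gain microstates. The ratio between the even split and state A, which is what the text compares, is w itself.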
5.3.2 Entropy
We now define a measure of the probability of a state: the entropy S of a state Z realized by a number of w microstates is defined as SðZÞ = kB . lnðwÞ, ½S] =
J , K
ð5:82Þ
with the Boltzmann constant $k_B$. Thus, entropy is proportional to the logarithm of the number of microstates. By this definition, the entropy behaves additively when two systems are united. Let us imagine that the macrostate $Z$ of one system has $w_1$ realizations, and the same macrostate $Z$ of a second system has $w_2$ realizations. If we unify the two systems into a total system, the macrostate $Z$ is still present. However, there are now a number of $w = w_1 \cdot w_2$ ways to realize the state. The entropy of the total system

$$S(Z) = k_B \cdot \ln(w_1 \cdot w_2) = k_B \cdot \ln(w_1) + k_B \cdot \ln(w_2) = S_1 + S_2 \tag{5.83}$$

is given by the sum of the two individual entropies $S_1$ and $S_2$. In the example of the four particles in the box, state A has a single realization. The entropy of state A is thus given by

$$S(A) = k_B \cdot \ln(1) = 0. \tag{5.84}$$

For the uniformly distributed state B there are six different realizations. Thus the entropy is given by

$$S(B) = k_B \cdot \ln(6) = 1.79 \cdot k_B. \tag{5.85}$$

Fig. 5.7 The entropy in a non-closed system can also decrease if the entropy in a second system increases at the same time. In the closed overall system, which includes both subsystems, the second law of thermodynamics then applies again, that is, the total entropy of both systems can only remain the same or become greater
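The entropy values of the four-particle example and the additivity of Eq. (5.83) can be checked numerically. The following is a minimal sketch; the microstate numbers `w1` and `w2` of the two combined systems are hypothetical values chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def entropy(w: int) -> float:
    """Entropy S = k_B * ln(w) of a macrostate with w microstates, Eq. (5.82)."""
    return K_B * math.log(w)

s_a = entropy(1)  # state A: all four particles on the left
s_b = entropy(6)  # state B: two particles on each side
print(s_a, s_b / K_B)  # entropy of A, and entropy of B in units of k_B

# Additivity: two systems with w1 and w2 realizations (hypothetical numbers)
w1, w2 = 6, 20
print(entropy(w1 * w2), entropy(w1) + entropy(w2))
```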
The fact that a system always evolves by itself towards the most probable states is the statement of the second law of thermodynamics.

Second Law of Thermodynamics
In an energetically closed system, the entropy change must obey

$$\Delta S \geq 0, \tag{5.86}$$

where equality $\Delta S = 0$ applies exactly when the process is reversible, and inequality $\Delta S > 0$ applies exactly when the process is irreversible.

The second law is only valid for closed systems. In a non-closed system, the entropy can also become smaller ($\Delta S_1 < 0$) if, for example, a second system performs work on the first and exchanges heat with it (see Fig. 5.7). In this case, however, the entropy of the second system must become correspondingly larger, $\Delta S_2 > 0$, since the overall system consisting of system 1 and system 2 is closed as a whole and thus the second law applies. One can also imagine that entropy flows from the first system into the second system. Thus, in non-closed systems, processes with $\Delta S > 0$ can also be reversible if the entropy flow between the systems can be
reversed. One can show that in such reversible processes the entropy change of the subsystems is linked to the heat change $\Delta Q$ of the subsystems:

$$\Delta S = \frac{\Delta Q}{T}, \tag{5.87}$$
where the temperature should be constant during the heat change. The entropy in a system is also related to the temperature. Let us consider this in more detail using the example of an isochoric change of state. A gas in a constant volume $V$ is heated or cooled by a change in the amount of heat $\Delta Q$ from outside. Here the temperature changes according to (5.57) with $\Delta Q = C_V \cdot \Delta T$. This process is reversible, since one can reverse the change in heat from outside at any time. The change in entropy is thus given by

$$\Delta S = \frac{\Delta Q}{T} = C_V \cdot \frac{\Delta T}{T}. \tag{5.88}$$
However, since the temperature changes with the heat change, one must decompose the process into infinitesimally small steps $dT$ and integrate over the infinitesimal entropy change $dS$. Thus, a temperature change from $T_1$ to $T_2$ results in a change in entropy of

$$\Delta S = \int dS = \int_{T_1}^{T_2} C_V \cdot \frac{dT}{T} = C_V \cdot \ln\frac{T_2}{T_1}. \tag{5.89}$$
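The decomposition into small temperature steps can be imitated numerically and compared with the closed form of Eq. (5.89). The sketch below assumes a heat capacity of roughly $\frac{3}{2}R$ for one mole of a monatomic ideal gas and a heating from 300 K to 600 K; both are illustrative choices, not values from the text.

```python
import math

def entropy_change_isochoric(c_v: float, t1: float, t2: float) -> float:
    """Closed form of Eq. (5.89): Delta S = C_V * ln(T2 / T1)."""
    return c_v * math.log(t2 / t1)

def entropy_change_numeric(c_v: float, t1: float, t2: float, steps: int = 100_000) -> float:
    """Sum dS = C_V * dT / T over many small temperature steps (midpoint rule)."""
    d_t = (t2 - t1) / steps
    return sum(c_v * d_t / (t1 + (i + 0.5) * d_t) for i in range(steps))

C_V = 12.47  # J/K, assumed: about 3/2 R for one mole of a monatomic ideal gas

print(entropy_change_isochoric(C_V, 300.0, 600.0))
print(entropy_change_numeric(C_V, 300.0, 600.0))
```

Swapping the two temperatures describes cooling and yields a negative entropy change of the same magnitude.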
The entropy increases logarithmically with the temperature. Here the question arises towards which value the entropy tends if the temperature in the system becomes smaller and smaller, that is, tends towards $T = 0$. According to (5.89) the entropy must become smaller and smaller. This question is answered in the third law of thermodynamics in the sense of a definition.

Third Law of Thermodynamics
In the limit $T \to 0$ the entropy tends to zero:

$$\lim_{T \to 0} S(T) = 0. \tag{5.90}$$
Here, the temperature T = 0 can never be reached completely, since this would require an infinitely large change in entropy. The smallest temperatures to which gases could be cooled with immense effort in laboratories are of the order of T≲100 pK.
Exercise: Entropy in Isothermal Expansion
A gas with pressure $p_1$ and temperature $T$ in a flask with volume $V_1$ is expanded by a plunger from the outside to the volume $V_2$. The system is in thermal equilibrium with a bath of temperature $T$, so that the temperature in the gas does not change during the process. Is it a reversible process? And how large is the entropy change?

Solution
First of all, the process is reversible, because one can compress the volume again with the help of the plunger from the outside just as one could expand it. To calculate the entropy change in the gas, we therefore calculate the amount of heat $\Delta Q$ which flows into the gas during the isothermal expansion. In the case of isothermal changes of state, according to Eq. (5.70) the following applies:

$$\Delta Q = N k_B T \cdot \ln\frac{V_2}{V_1}. \tag{5.91}$$
Thus, according to (5.87), the entropy change is given by

$$\Delta S = \frac{\Delta Q}{T} = N k_B \cdot \ln\frac{V_2}{V_1}. \tag{5.92}$$
Therefore, entropy increases with the logarithm of the volume.
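As a numerical illustration of Eqs. (5.91) and (5.92), the following sketch evaluates the heat and the entropy change for one mole of gas whose volume is doubled at 300 K. The particle number, the temperature and the volumes are assumed example values.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
N_A = 6.02214076e23  # Avogadro constant in 1/mol

def heat_isothermal(n_particles: float, temperature: float, v1: float, v2: float) -> float:
    """Heat flowing into the gas during isothermal expansion, Eq. (5.91)."""
    return n_particles * K_B * temperature * math.log(v2 / v1)

def entropy_change_isothermal(n_particles: float, v1: float, v2: float) -> float:
    """Entropy change of the gas, Eq. (5.92); it does not depend on T."""
    return n_particles * K_B * math.log(v2 / v1)

# Assumed example: one mole of gas, volume doubled from 1 litre to 2 litres at 300 K
delta_q = heat_isothermal(N_A, 300.0, 1.0e-3, 2.0e-3)
delta_s = entropy_change_isothermal(N_A, 1.0e-3, 2.0e-3)
print(delta_q, delta_s)  # heat in J, entropy change in J/K
```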
5.3.3 Heat Engines
Using the terms of thermodynamics introduced in this chapter, we want to deal in the following with the question of whether a quantity of heat ΔQ can in principle be used to perform work in a process, for example, to drive a motor. Here we require that such a process should be cyclic, because the engine should not just turn once, but should be able to do work continuously. A machine that accomplishes this is called a heat engine. In principle, every engine, for example, the petrol engine, the diesel engine and also the Stirling engine is a heat engine. In the internal combustion engines, heat is generated by burning a fuel such as gasoline. In the Stirling engine, on the other hand, heat is extracted from a hot reservoir (usually the hot piston). In this process, the reservoir must be constantly heated so that it does not cool down. Technically, the heating in the Stirling engine is often done with a Bunsen burner or also electrically. The various heat engines now differ in how effectively they can convert the supplied amount of heat ΔQin into work ΔW. This is called the efficiency η of the heat engine:
$$\eta = \frac{-\Delta W}{\Delta Q_{in}}. \tag{5.93}$$

Fig. 5.8 Assuming that there is a heat engine with a higher efficiency $\eta_1$ than a reversible heat engine with efficiency $\eta_2$, which is operated as a heat pump, energy could be generated in a cycle. Such a perpetuum mobile cannot exist, therefore a heat engine with a reversible process has the best possible (greatest) efficiency
The negative sign for the work in this definition comes from the fact that the work done by the engine on the outside world is $\Delta W < 0$. Thus, the efficiency in the best case is $\eta = 1$, namely exactly when 100% of the heat can be converted into work. We will see that an efficiency of $\eta = 1$ is not possible in principle. So the question is which process has the highest efficiency. If we assume that there is a heat engine whose process is reversible, then this engine has the best possible efficiency. Reversible here means that the machine can be operated in reverse. In this case, the machine is driven as a heat pump (or refrigerator) from the outside with the work $\Delta W$ and releases the amount of heat $\Delta Q_{in}$ to the outside. The proof that a reversible heat engine has the best possible efficiency is by contradiction. We assume that there is a heat engine (engine 1, reversible or irreversible) whose efficiency $\eta_1$ is greater than the efficiency $\eta_2$ of a second reversible heat engine (engine 2) (see Fig. 5.8). We use the work done by engine 1, $-\Delta W = \eta_1 \cdot \Delta Q_{in,1}$, to drive engine 2 as a heat pump. The heat output of engine 2 is thus given by

$$\Delta Q_{in,2} = \frac{-\Delta W}{\eta_2} = \frac{\eta_1}{\eta_2} \cdot \Delta Q_{in,1} > \Delta Q_{in,1} \tag{5.94}$$
and, by the assumption $\eta_1 > \eta_2$, is greater than the original heat quantity $\Delta Q_{in,1}$. If the amount of heat $\Delta Q_{in,2}$ is returned to engine 1, there is more thermal energy available after one cycle than at the beginning. Thus one could produce a perpetuum mobile, which constantly performs work and thereby keeps itself running. This is not possible according to the second law of thermodynamics, because this would lead to a decrease of entropy in a closed system (without proof).
5.3.3.1 Carnot Cycle The Carnot cycle is a reversible process that uses the amount of heat $\Delta Q_{in}$ to do the work $\Delta W$. It consists of four process steps: (1) an isothermal expansion at temperature $T_H$, (2) an adiabatic expansion with final temperature $T_K < T_H$, (3) an isothermal compression at temperature $T_K$ and (4) an adiabatic compression back to the initial state with temperature $T_H$. These four steps are shown in a so-called pV diagram (Fig. 5.9a). Here, each state of the system is plotted by its instantaneous pressure $p$ and instantaneous volume $V$ as a point with coordinates $p$, $V$. During expansion, the volume increases and the pressure decreases; during compression, it is just the opposite. While there is no heat flow in both adiabatic process steps (2) and (4), $\Delta Q = 0$, heat $\Delta Q_{in}$ flows into the system in process step (1) and heat $\Delta Q_{out}$ flows out of the system in process step (3). Thus, there is a waste heat $\Delta Q_{out}$, which cannot be converted into work (see Fig. 5.9b). The work done during one cycle is therefore given by

$$-\Delta W = \Delta Q_{in} - \Delta Q_{out}. \tag{5.95}$$

Fig. 5.9 (a) Carnot cycle in a pV diagram. Here the process steps are (1) isothermal expansion at temperature $T_H$, (2) adiabatic expansion with cooling to temperature $T_K$, (3) isothermal compression at $T_K$ and (4) adiabatic compression with heating to $T_H$. In process step (1), the system absorbs the amount of heat $\Delta Q_{in}$, and in process step (3), it releases the amount of heat $\Delta Q_{out}$ as waste heat. (b) The difference between the quantities of heat absorbed and released is converted into work $\Delta W$
It can be shown that the efficiency of a Carnot machine is determined only by the ratio of the temperatures $T_K$ and $T_H$ (without derivation):

$$\eta = 1 - \frac{T_K}{T_H} < 1. \tag{5.96}$$
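Equations (5.93), (5.95) and (5.96) can be combined into a small calculation. The sketch below uses hypothetical reservoir temperatures and a hypothetical amount of supplied heat; the function names are chosen for this illustration only.

```python
def carnot_efficiency(t_cold: float, t_hot: float) -> float:
    """Efficiency of a Carnot engine, Eq. (5.96), with temperatures in kelvin."""
    if not 0.0 <= t_cold < t_hot:
        raise ValueError("need 0 <= t_cold < t_hot")
    return 1.0 - t_cold / t_hot

def work_and_waste_heat(q_in: float, t_cold: float, t_hot: float):
    """Work delivered (-Delta W) and waste heat Delta Q_out for supplied heat q_in."""
    work = carnot_efficiency(t_cold, t_hot) * q_in  # Eq. (5.93) solved for -Delta W
    q_out = q_in - work                             # Eq. (5.95) solved for Delta Q_out
    return work, q_out

# Hypothetical example: hot reservoir at 600 K, cold reservoir at 300 K, 1000 J supplied
eta = carnot_efficiency(300.0, 600.0)
work, q_out = work_and_waste_heat(1000.0, 300.0, 600.0)
print(eta, work, q_out)
```

For these example values half of the supplied heat is converted into work and the other half leaves the engine as waste heat.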
Thus, the efficiency for all temperatures $T_K \neq 0$ is smaller than 1.

[...] positively charged ($q > 0$), negatively charged ($q < 0$) or also uncharged ($q = 0$). Charges without mass, however, do not exist. If a mass is uncharged, this does not automatically mean that the mass carries no charge at all, but only that it contains an equal amount of positive and negative charge. These charges then balance each other out, so that the body appears electrically neutral to the outside.

Example: The Uncharged Hydrogen Atom
A hydrogen atom is uncharged as a whole. However, it consists of charged individual parts, namely a proton (in the atomic nucleus) with a positive elementary charge $q_p = +e$ and an electron (in the atomic shell) with a negative elementary charge $q_e = -e$. The elementary charge is a very special charge, because it is the smallest freely occurring charge unit. So every charge we will deal with here is a multiple of this elementary charge. Its value is

$$e \approx 1.6 \times 10^{-19}\,\mathrm{C}. \tag{6.2}$$

© Springer-Verlag GmbH Germany, part of Springer Nature 2023
S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_6

6.1.1 Forces Between Charges: Coulomb’s Law
Coulomb’s law describes forces between charges. Both attractive and repulsive Coulomb forces can occur. Repulsion always occurs between charges with the same sign, that is, when both charges are either positive or both charges are negative. Attraction is only found between charges with different signs, that is, between a negative and a positive charge. To formulate Coulomb’s law in general terms, consider two charges $q_1$ and $q_2$ in a vacuum at a distance $r$ apart, connected by the position vector $\vec{r}$. The Coulomb force exerted by charge $q_1$ on charge $q_2$ is then

$$\vec{F}_{C,\,q_1 \to q_2} = \frac{1}{4\pi\varepsilon_0} \frac{q_1 q_2}{r^2} \frac{\vec{r}}{r}. \tag{6.3}$$
The force is a product of three parts: The first part $\frac{1}{4\pi\varepsilon_0}$ consists only of constants, for example, $\varepsilon_0 = 8.85 \times 10^{-12}\,\frac{\mathrm{As}}{\mathrm{Vm}}$ is the electric field constant. These constants are needed to get the calculated force in SI units, that is, in newtons (N). The third part $\frac{\vec{r}}{r}$ is a vector of length 1, which indicates the direction of the force. This always points along the line connecting the two charges. The physics takes place in the middle part $\frac{q_1 q_2}{r^2}$, which contains the dependence on the experimental parameters, that is, on the two charges and their distance from each other. By the way, the Coulomb force exerted by the charge $q_2$ on the charge $q_1$ is exactly opposite and equal in magnitude: $\vec{F}_{q_2 \to q_1} = -\vec{F}_{q_1 \to q_2}$. If one has more than two charges, the force on one charge is given by the sum of the forces of all other charges on this specific charge. For example, if you have three charges $q_1$, $q_2$ and $q_3$, the total Coulomb force on $q_3$ is given by $\vec{F}_{q_3} = \vec{F}_{q_1 \to q_3} + \vec{F}_{q_2 \to q_3}$. The forces are to be added as usual with vector addition. Often one is only interested in the magnitude of the Coulomb force and not in the direction. Then one can express Eq. (6.3) also more simply by

$$F_C = \frac{1}{4\pi\varepsilon_0} \frac{q_1 q_2}{r^2}. \tag{6.4}$$
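Coulomb's law in vector form and the superposition of forces can be sketched in Python as follows. This is an illustration under stated assumptions: the connecting vector is taken to point from the charge that exerts the force to the charge that feels it, and the charges and positions in the example are hypothetical.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)

def coulomb_force(q1, pos1, q2, pos2):
    """Force in N exerted by charge q1 at pos1 on charge q2 at pos2, Eq. (6.3).

    Assumption of this sketch: the connecting vector r points from q1 to q2,
    so that charges of equal sign repel each other.
    """
    r_vec = [b - a for a, b in zip(pos1, pos2)]
    r = math.sqrt(sum(c * c for c in r_vec))
    magnitude = q1 * q2 / (4 * math.pi * EPSILON_0 * r**2)
    return [magnitude * c / r for c in r_vec]

def total_force(sources, q_target, pos_target):
    """Vector sum of the forces of all source charges on one target charge."""
    total = [0.0, 0.0, 0.0]
    for q, pos in sources:
        force = coulomb_force(q, pos, q_target, pos_target)
        total = [t + f for t, f in zip(total, force)]
    return total

# Hypothetical example: two charges of 1 microcoulomb, 1 m apart on the x-axis
f_12 = coulomb_force(1e-6, (0, 0, 0), 1e-6, (1, 0, 0))
f_21 = coulomb_force(1e-6, (1, 0, 0), 1e-6, (0, 0, 0))
print(f_12, f_21)  # equal in magnitude, opposite in direction
```

A target charge placed exactly midway between two equal source charges, for example `total_force([(1e-6, (0, 0, 0)), (1e-6, (2, 0, 0))], 1e-6, (1, 0, 0))`, experiences no net force, because the two contributions cancel in the vector sum.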
Examination Task: Change of the Coulomb Force
How does the Coulomb force between two charges $q_1$ and $q_2$ at a distance $r$ change if the charges change as $q_1 \to 2q_1$ and $q_2 \to 3q_2$, and the distance changes as $r \to 2r$ at the same time?

Solution: For this purpose we consider the relevant part of Coulomb’s law and replace the original parameters by the changed quantities:

$$\frac{q_1 q_2}{r^2} \to \frac{(2q_1)(3q_2)}{(2r)^2} = \frac{6}{4} \frac{q_1 q_2}{r^2} = 1.5 \cdot \frac{q_1 q_2}{r^2}. \tag{6.5}$$
Thus, the new force is 1.5 times the original force.
6.1.2 Electric Field and Electric Force on Charges
We have seen that charges can cause forces on other charges. We now consider a sample charge $q$ in a volume which is surrounded by other charges unknown to us. At each location $\vec{r}$ of the volume, therefore, a force $\vec{F}(\vec{r})$ acts on the sample charge. Since we do not know the charges outside the volume, it is useful to introduce a new concept for the cause of the force. The cause of the force is supposed to be a property of the space at the location of the charge, that is, where we measure the force $\vec{F}(\vec{r})$. Therefore, we define the electric field at the location $\vec{r}$:

$$\vec{E}(\vec{r}) = \frac{\vec{F}(\vec{r})}{q}, \qquad [E] = \frac{\mathrm{N}}{\mathrm{C}} = \frac{\mathrm{V}}{\mathrm{m}}. \tag{6.6}$$
Since the electric field is supposed to be a property of space, it must not depend on the sample charge. This condition is fulfilled as long as the sample charge $q$ is sufficiently small. Sufficiently small means here that the forces from the sample charge on the other charges outside the volume are so small that these other charges are not influenced by it and in particular do not change their position. If one knows the electric field in a spatial area, one can also calculate the forces on charges by the above definition with

$$\vec{F}(\vec{r}) = q\,\vec{E}(\vec{r}). \tag{6.7}$$
This force is usually referred to as the electric force.
6.1.2.1 Electric Field Lines Electric fields are an abstract concept, but it is also possible to illustrate electric fields with the help of field lines. To do this, imagine that a small arrow is attached to each point in space, indicating the direction of the electric field. These arrows are then connected by tangential lines (similar to flow lines in a liquid). Here the strength of the field is expressed by the line density. The more lines that pass through a volume, the greater the field strength. Some examples of typical electric fields are shown below (see Fig. 6.1). For a positive point charge the field lines point radially outwards away from the charge (see Fig. 6.1a). In the sketch, only the section through the drawing plane is shown. In three-dimensional space, the field lines
Fig. 6.1 Field line images of selected charge distributions: (a) Positive point charge: Field lines point radially away from the charge. (b) Negative point charge: The field lines point radially towards the charge. (c) El. dipole: The field lines start at the positive charge and end at the negative charge. (d) Plate capacitor: The field lines point straight and parallel from the positively to the negatively charged plate
point radially outwards in all directions, so that the field line diagram looks like a porcupine massage ball. Since all field lines start in the positive charge, positive charges are also called the sources of electric fields. In the case of a negative point charge (see Fig. 6.1b) the field line pattern is quite similar to that of the positive point charge, with the difference that the direction of the field lines is reversed: all field lines end in the negative point charge, so these are also called sinks for electric fields. Figure 6.1c shows the combination of a positive and a negative point charge. This is called an electric dipole, and the resulting electric field is called a dipole field. This field can be calculated by vectorially adding the field of the positive point charge to that of the negative point charge at any point in space. Here, all field lines start in the positive point charge and end in the negative point charge. Finally, in Fig. 6.1d we see the field of a charged plate capacitor. A positively charged metal plate is shown on the left and a negatively charged metal plate is shown on the right. The electric field between the plates is homogeneous, that is, the field strength is the same at each point and points in the same direction. In the sketch we also see that the electric field is perpendicular to the plate surfaces. This is generally true for any curved metal surface: on any metal surface, the electric field (if there is one) is perpendicular to the surface.
6.2 Gauss’ Theorem Calculation of Electric Fields
We have learned in Sect. 6.1 that charges are the causes of electric fields. Gauss’ theorem now describes this relationship quantitatively. To do this, we must first introduce a new quantity: the electric flux $\phi_{el}$. The electric flux is a measure of how much electric field $E$ flows through a surface $A$, and has the unit of “area times electric field”. For now, we consider only flat surfaces and constant fields. The surface is defined by a surface vector $\vec{A}$, which is perpendicular to the surface and whose magnitude equals the area (see Fig. 6.2). An electric flux can now be imagined as a flowing liquid passing through a surface. The size of the flow depends on the strength of the electric field and the size of the surface, but also on the angle of the electric field to the surface. The flux is greatest when the electric field flows perpendicularly through the surface and is thus parallel to the surface vector $\vec{A}$ (field $\vec{E}_1$ in Fig. 6.2). The flux is then $\phi_{el} = EA$. The other extreme case is when the electric field is perpendicular to the surface vector, that is, parallel to the surface (field $\vec{E}_2$). The electric field thus flows parallel to the surface, and the flux through the surface in this case is zero, $\phi_{el} = 0$. For general angles (field $\vec{E}_3$), the flux is given by the scalar product between the electric field and the surface vector,

$$\phi_{el} = \vec{E} \cdot \vec{A}. \tag{6.8}$$

Fig. 6.2 (a) The surface vector $\vec{A}$ is perpendicular to the surface and has a length $|\vec{A}|$, which is equal to the area. (b) For Gauss’s theorem we consider a charge $Q$ in a volume $V$ with surface $A$
This mathematical connection also correctly describes the two extreme cases. If we now also consider curved surfaces, for example, spherical surfaces, and fields that change on the surface under consideration, then, in order to calculate the flux, we must decompose the surface into infinitesimally small plane surface elements $d\vec{A}$ through which the infinitesimally small flux $d\phi_{el} = \vec{E} \cdot d\vec{A}$ flows. The total flux through the whole surface is then the integral of the infinitesimal fluxes over the whole surface $A$:

$$\phi_{el} = \int_A \vec{E} \cdot d\vec{A}. \tag{6.9}$$
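For a flat surface and a constant field, the flux of Eq. (6.8) is a plain scalar product. The sketch below evaluates it for the three cases discussed for Fig. 6.2; the field strength of 5 V/m, the area of 2 m² and the angle of 60° are hypothetical example values.

```python
import math

def flux_flat(e_field, area_vector):
    """Electric flux through a flat surface in a constant field, Eq. (6.8)."""
    return sum(e * a for e, a in zip(e_field, area_vector))

# Hypothetical surface of 2 m^2 lying in the xy-plane: surface vector along z
area_vector = (0.0, 0.0, 2.0)

e_parallel = (0.0, 0.0, 5.0)       # field parallel to the surface vector
e_perpendicular = (5.0, 0.0, 0.0)  # field parallel to the surface itself
angle = math.radians(60.0)         # general case: 60 degrees to the surface vector
e_tilted = (5.0 * math.sin(angle), 0.0, 5.0 * math.cos(angle))

print(flux_flat(e_parallel, area_vector))       # maximum flux E * A
print(flux_flat(e_perpendicular, area_vector))  # no flux
print(flux_flat(e_tilted, area_vector))         # E * A * cos(60 degrees)
```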
Let now $V$ be an arbitrary volume with a (closed) surface $O$, through which the electric flux $\phi_{el}$ flows. According to the theorem of Gauss,

$$\phi_{el} = \oint_O \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0}, \tag{6.10}$$
where $Q$ is the total charge in the volume $V$. $Q$ can be divided into several positive and negative charges $q_i$. Only the total charge $Q = \sum q_i$ is decisive for the theorem. The small circle in the integral sign here means that the integral is carried out over a closed surface, as opposed to an open surface in space. In Fig. 6.2b, such a charge $Q$ is drawn in a cube volume as an example. In this example, the closed surface consists of the six side faces of the cube. Thus, the electric flux through the surface of the volume is equal to the charge in the volume divided by the electric field constant. The theorem is a manifestation of the already known fact that charges are the sources and sinks of electric fields. With the help of Gauss’s theorem, we are now able to calculate electric fields of some charge distributions that have a particular symmetry.

Written Test: The Theorem of Gauss
In the centre of a spherical volume with radius $R$ is the charge $Q$. This causes the electric flux $\phi_{el}$ to flow through the surface of the sphere. How does the flux change if you double the radius of the sphere (→ four times the surface area)? The charge in the volume does not change with the increase.

Solution: The electric flux depends only on the charge in the volume. So if you increase the volume,

$$V_1 \to V_2 > V_1, \tag{6.11}$$

and the charge in the volume remains the same,

$$Q_2 = Q_1, \tag{6.12}$$

then the flux stays the same,

$$\phi_{el,1} = \phi_{el,2}, \tag{6.13}$$

regardless of how the surface changes.
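The statement of the written test can be checked numerically by summing $\vec{E} \cdot d\vec{A}$ over many small surface elements of a sphere. The sketch below is a simple midpoint-rule integration for the field of a point charge; the charge of 1 nC, the radii and the grid resolution are assumptions made for this illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)

def flux_through_sphere(q, charge_pos, radius, n_theta=200, n_phi=200):
    """Numerically integrate E . dA of a point charge over a sphere centred at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    k = q / (4 * math.pi * EPSILON_0)
    flux = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the surface element
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            # Vector from the charge to the surface element
            s = [radius * n_c - c for n_c, c in zip(n, charge_pos)]
            dist = math.sqrt(sum(c * c for c in s))
            e_dot_n = k * sum(s_c * n_c for s_c, n_c in zip(s, n)) / dist**3
            # Surface element dA = r^2 sin(theta) dtheta dphi
            flux += e_dot_n * radius**2 * math.sin(theta) * d_theta * d_phi
    return flux

Q = 1e-9  # hypothetical charge of 1 nC
for radius in (1.0, 2.0):
    print(radius, flux_through_sphere(Q, (0.0, 0.0, 0.0), radius), Q / EPSILON_0)
```

Within the accuracy of the numerical integration, both radii give the same flux $Q/\varepsilon_0$. Moving the charge away from the centre, for example to `(0.0, 0.0, 0.3)` inside the sphere of radius 1, should not change the result either, since only the enclosed charge matters.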
6.2.1 Spherically Symmetrical Charge Distribution with Charge Q
We consider a spherically symmetrical charge distribution, for example, a charged metal sphere or a point charge (Fig. 6.3a). We already know that the electric field points radially outwards or inwards. Furthermore, because of spherical symmetry, the field strength $E$ must be constant for all points at a fixed distance $r$. For the application of Gauss’s theorem, we choose a volume adapted to this symmetry: a sphere with radius $r > R$. Here $R$ is the radius of the charge distribution. The surface of this sphere is $A = 4\pi r^2$. Furthermore, since the electric field is perpendicular to the surface at every point on the surface, the electric flux is given by $\phi_{el} = EA = E(r)\,4\pi r^2$. Now we apply Gauss’s theorem $\phi_{el} = \frac{Q}{\varepsilon_0}$ and set the two results for the flux equal: $4\pi r^2 E(r) = \frac{Q}{\varepsilon_0}$. From this, the electric field follows as

$$E(r) = \frac{1}{4\pi\varepsilon_0} \frac{Q}{r^2}. \tag{6.14}$$

Fig. 6.3 Examples of application of Gauss’ theorem: (a) spherically symmetrical charge distribution, (b) long straight wire, (c) infinitely extended charged surface, (d) plate capacitor

6.2.2 Long Straight Charged Wire
We consider a long straight and homogeneously charged wire (Fig. 6.3b). This has cylindrical symmetry, that is, the wire can be rotated around its axis without changing the charge distribution in space. Therefore, the field must also be cylindrically symmetrical. So it points radially away from the wire or towards the wire. As a volume for the application of Gauss’s theorem we therefore choose a cylinder with length $l$ and radius $r$, which contains the charge $Q$. The surface of the cylinder is composed of the cylinder jacket $A = 2\pi r l$, the bottom, and the lid. However, there is no electric flux through the bottom and the lid, since the electric field is parallel to the two surfaces. In the case of the cylinder jacket, the field is perpendicular to the surface everywhere, so the flux is given by $\phi_{el} = EA = E(r)\,2\pi r l$. Using the theorem of Gauss we set this result equal to $\phi_{el} = \frac{Q}{\varepsilon_0}$, and solve for the field:

$$E(r) = \frac{1}{2\pi\varepsilon_0} \frac{Q/l}{r} = \frac{1}{2\pi\varepsilon_0} \frac{\lambda}{r}. \tag{6.15}$$
This formula contains the charge per length of the wire $\frac{Q}{l}$, which we define in this context as the line charge density: $\lambda = \frac{Q}{l}$.
6.2.3 Infinitely Extended, Homogeneously Charged Plate
For reasons of symmetry, the electric field of a charged plate must point perpendicularly away from or towards the plate. The electric field strength may only depend on the distance $x$ from the plate. We choose as volume a cuboid with height $h = 2x$ and cross-sectional area $A$, which contains the charge $Q$ (Fig. 6.3c). The parts of the cuboid surface relevant for the electric flux are the top and the bottom. There is no flux through the side surfaces because the electric field is parallel to them. The flux is thus given by $\phi_{el} = E(x)\,2A$. The factor of 2 comes from the fact that there is electric flux through both the bottom and the top surface. We again set the result equal to $\phi_{el} = \frac{Q}{\varepsilon_0}$ using Gauss’ theorem and solve for the field:

$$E(x) = \frac{Q/A}{2\varepsilon_0} = \frac{\sigma}{2\varepsilon_0}. \tag{6.16}$$
In this formula the charge per area of the plate $\frac{Q}{A}$ occurs, which we define here as the surface charge density: $\sigma = \frac{Q}{A}$. It is noteworthy that the electric field $E(x)$ does not depend on the distance $x$ from the plate. This means that the field at a distance of one centimeter is just as large as at a distance of a thousand kilometers. Here we must not forget that we have assumed the plate to be infinitely large. Only then does the formula for the electric field apply. For real plates of finite size, however, the formula is correct to a very good approximation if the distance from the plate is much smaller than the extent of the plate. If the plate is, for example, 1 cm × 1 cm large, the formula is approximately valid for distances up to approx. 1 mm, apart from the edge of the plate, where the field is inhomogeneous and curved outwards.
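The three results obtained from Gauss' theorem differ mainly in how the field depends on the distance. The following sketch compares Eqs. (6.14), (6.15) and (6.16) when the distance is doubled; the charge, the line charge density and the surface charge density are hypothetical example values.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)

def field_sphere(q: float, r: float) -> float:
    """Field of a spherically symmetrical charge distribution, Eq. (6.14)."""
    return q / (4 * math.pi * EPSILON_0 * r**2)

def field_wire(line_density: float, r: float) -> float:
    """Field of a long straight charged wire, Eq. (6.15)."""
    return line_density / (2 * math.pi * EPSILON_0 * r)

def field_plate(surface_density: float) -> float:
    """Field of an infinitely extended charged plate, Eq. (6.16); no distance dependence."""
    return surface_density / (2 * EPSILON_0)

# Ratio of the field at distance 2 m to the field at distance 1 m
print(field_sphere(1e-9, 2.0) / field_sphere(1e-9, 1.0))  # falls off as 1/r^2
print(field_wire(1e-9, 2.0) / field_wire(1e-9, 1.0))      # falls off as 1/r
print(field_plate(1e-9))                                  # the same at every distance
```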
6.2.4 Plate Capacitor
A plate capacitor consists of two parallel, homogeneously charged plates with opposite charges $+Q$ and $-Q$, respectively, and opposite surface charge densities $+\sigma$ and $-\sigma$, respectively (Fig. 6.3d). The electric field in the plate capacitor now results from the superposition of the electric fields of the two individual plates. Outside the plates, the fields of the positively and negatively charged plates cancel each other out, while the fields in the intermediate region between the plates reinforce each other. Neglecting boundary effects (i.e., fields at the edge of the plates), the field outside the plates is $E = 0$, and between the plates

$$E = \frac{\sigma}{2\varepsilon_0} + \frac{\sigma}{2\varepsilon_0} = \frac{\sigma}{\varepsilon_0}. \tag{6.17}$$

6.3 Electrostatic Potential and Electric Voltage
In this section we deal with the question of which energy charges possess in the electric field. Here we have the freedom to choose the energy zero point $\vec{r}_0$ in space. This defines a reference with respect to which we can determine the energy of a charge at any other point. The energy at the position $\vec{r}$ is then given by the work $W$, which is needed to bring the charge $q$ from $\vec{r}_0$ to $\vec{r}$:

$$W = -q \int_{\vec{r}_0}^{\vec{r}} \vec{E} \cdot d\vec{r}, \qquad [W] = \mathrm{VAs} = \mathrm{J}. \tag{6.18}$$
This integral is a path integral in which the path from $\vec{r}_0$ to $\vec{r}$ is broken down into small steps $d\vec{r}$, and the infinitesimal work $dW = \vec{F} \cdot d\vec{r}$ is calculated at each location along the path (see Fig. 6.4). Here, the electric force on the charge must be overcome from the outside, hence $\vec{F} = -q\vec{E}$. The total work is then the integral over the infinitesimal work along the steps of the path. Using the work calculated in this way, we can also define a potential that is independent of the charge we are moving. The electrostatic potential is

$$\varphi_{el}(\vec{r}) = -\int_{\vec{r}_0}^{\vec{r}} \vec{E} \cdot d\vec{r}, \qquad [\varphi_{el}] = \frac{\mathrm{J}}{\mathrm{C}} = \mathrm{V}. \tag{6.19}$$
Thus the relation between the electric potential and the energy of a charge $q$ at the position $\vec{r}$ results as

$$W(\vec{r}) = q \cdot \varphi_{el}(\vec{r}). \tag{6.20}$$

Fig. 6.4 In the path integral (6.18), the path is divided into small steps, and the work $dW = \vec{F} \cdot d\vec{r}$ is calculated for each step

The possibility that we can define a potential $\varphi_{el}(\vec{r})$ at any point $\vec{r}$ of space
requires that the result of the integral (6.19) is independent of the path by which we get from $\vec{r}_0$ to $\vec{r}$. Otherwise the potential would not be unique. This property is true for so-called conservative fields, to which the electric field $\vec{E}(\vec{r})$ belongs. All conservative fields have in common that they can always be written as a derivative (gradient) of a potential. The gradient is calculated using the Nabla operator $\nabla = \begin{pmatrix} \partial/\partial x \\ \partial/\partial y \\ \partial/\partial z \end{pmatrix}$:

$$\vec{E}(\vec{r}) = -\nabla \varphi_{el}(\vec{r}) = -\begin{pmatrix} \partial\varphi_{el}/\partial x \\ \partial\varphi_{el}/\partial y \\ \partial\varphi_{el}/\partial z \end{pmatrix}. \tag{6.21}$$
The gradient has the meaning of a directional derivative. It describes the direction in space in which the potential increases the most. To calculate the three components, one must calculate the derivative of the potential with respect to $x$, $y$, and $z$, with each of the other two variables treated as constants in each of the three derivatives. Note that there are also non-conservative fields, such as vortex fields. The electric potential can also be represented graphically using equipotential lines (in 3-D: equipotential surfaces) by connecting points in space with the same potential strength. The equipotential lines are always perpendicular to the electric field lines. This can be thought of as contour lines on a map connecting points of equal height (see Fig. 6.5). The electrical voltage is now defined as the electrical potential difference between two points $\vec{r}_1$ and $\vec{r}_2$:

$$U = \varphi_{el}(\vec{r}_2) - \varphi_{el}(\vec{r}_1) = -\int_{\vec{r}_1}^{\vec{r}_2} \vec{E} \cdot d\vec{r}, \qquad [U] = \mathrm{V}. \tag{6.22}$$
When one speaks of voltage, one must therefore always state between which points this voltage is measured.
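The relation $\vec{E} = -\nabla\varphi_{el}$ from Eq. (6.21) and the definition of the voltage in Eq. (6.22) can be illustrated numerically. The sketch below approximates the gradient by central differences and uses the well-known potential of a point charge with the zero point at infinity as a test case; the charge of 1 nC, the evaluation points and the step size `h` are assumptions made for this example.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)
Q = 1e-9                      # hypothetical point charge of 1 nC at the origin

def potential(pos):
    """Potential of the point charge at position pos, zero point at infinity."""
    r = math.sqrt(sum(c * c for c in pos))
    return Q / (4 * math.pi * EPSILON_0 * r)

def field_from_potential(phi, pos, h=1e-5):
    """Approximate E = -grad(phi) at pos by central differences with step h."""
    field = []
    for axis in range(3):
        plus, minus = list(pos), list(pos)
        plus[axis] += h
        minus[axis] -= h
        # Only one coordinate is varied, the other two are kept constant
        field.append(-(phi(plus) - phi(minus)) / (2 * h))
    return field

point = (1.0, 2.0, 2.0)  # distance r = 3 m from the charge
e_numeric = field_from_potential(potential, point)
e_exact = [Q / (4 * math.pi * EPSILON_0 * 3.0**3) * c for c in point]
print(e_numeric, e_exact)

# Voltage between two points on the x-axis, Eq. (6.22)
voltage = potential((2.0, 0.0, 0.0)) - potential((1.0, 0.0, 0.0))
print(voltage)
```

The numerically obtained field points radially away from the positive charge, and the voltage is negative because the potential decreases with increasing distance from it.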
6.3.1 Examples of Electrical Potentials
Now let us consider the electrical potential of some important examples:
Fig. 6.5 Equipotential lines of the electric potential (red) connect points of equal potential strength and are always perpendicular to the electric field lines, here using the example of a charged sphere in front of a metallic plane
6.3.1.1 Potential of a Spherically Symmetrical Charge Distribution In Sect. 6.2.1 we saw that the electric field of a spherically symmetrical charge distribution with total charge $Q$ points radially outwards (for positive charge) or inwards (for negative charge) and also decreases with the radius $r$ as $1/r^2$. The equipotential surfaces are therefore spherical shells around the charge. The zero point of the potential, which can be set arbitrarily, is typically set at an infinitely large distance from the charge. Since $\int \frac{1}{r^2}\,dr = -\frac{1}{r}$, one obtains for the potential the function

$$\varphi_{el}(r) = \frac{1}{4\pi\varepsilon_0} \frac{Q}{r}. \tag{6.23}$$
Therefore, for a point charge, the potential diverges at the location of the charge.
6.3.1.2 Potential of an Infinitely Long Straight Charged Wire For the long straight charged wire with wire radius $R_D$, the electric field points radially away from the wire (for positive charge) or towards the wire (for negative charge), and the field strength drops with distance $r$ from the wire as $1/r$. The equipotential surfaces are therefore cylindrical surfaces around the wire. The potential can be found with the help of the integral $\int \frac{1}{r}\,dr = \ln(r)$. Here, one now sets the potential zero at an arbitrary finite distance $R \geq R_D$ from the wire. This time it is not convenient to set the zero point at an infinite distance, because then the potential would diverge at any distance, since $\lim_{R \to \infty} \ln(R) = \infty$. The potential is thus:

$$\varphi_{el}(r) = \frac{\lambda}{2\pi\varepsilon_0} \left[\ln(R) - \ln(r)\right], \tag{6.24}$$

where $\lambda = \frac{Q}{l}$ is the line charge density (charge per wire length).
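Equation (6.24) can be cross-checked by evaluating the path integral of Eq. (6.19) numerically along a radial path from the reference distance $R$ to the distance $r$. The following sketch uses the wire field $E(r) = \lambda/(2\pi\varepsilon_0 r)$; the line charge density and the two distances are hypothetical example values.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)

def field_wire(line_density: float, r: float) -> float:
    """Radial field of the long straight charged wire."""
    return line_density / (2 * math.pi * EPSILON_0 * r)

def potential_wire(line_density: float, r: float, r_ref: float) -> float:
    """Closed form of Eq. (6.24) with the potential zero at distance r_ref."""
    return line_density / (2 * math.pi * EPSILON_0) * (math.log(r_ref) - math.log(r))

def potential_wire_numeric(line_density: float, r: float, r_ref: float, steps: int = 100_000) -> float:
    """Path integral -int E dr from r_ref to r along a radial line (midpoint rule)."""
    d_r = (r - r_ref) / steps
    return -sum(field_wire(line_density, r_ref + (i + 0.5) * d_r) * d_r for i in range(steps))

LAMBDA = 1e-9  # hypothetical line charge density in C/m
print(potential_wire(LAMBDA, 0.01, 1.0))
print(potential_wire_numeric(LAMBDA, 0.01, 1.0))
```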
6.3.1.3 Voltage in the Plate Capacitor In the following we calculate the voltage in the plate capacitor, that is, the difference between the electrical potentials on the two plates. To do this, we again consider the plate capacitor from Fig. 6.3d with charge $Q$ and plate area $A$. We have seen that the electric field between the plates is homogeneous, $E = \frac{Q}{\varepsilon_0 A}$, and points perpendicularly from one plate to the other. The equipotential surfaces are therefore planes parallel to the plate surfaces. For the voltage between the two plates (plate spacing $d$), we must now calculate the path integral from one plate to the other, for example, on a straight path (in the x-direction). This has the result

$$U = \int_0^d E\,dx = \frac{Q}{\varepsilon_0 A}\,d = Ed. \tag{6.25}$$
Here the positive side of the voltage is on the plate charged with the positive charge. Now we can introduce another important quantity that characterizes the capacitor: The capacitance $C$ describes how much charge $\pm Q$ is loaded onto the two capacitor plates when a voltage $U$ (e.g., with a battery) is applied to the plates:

$$C = \frac{Q}{U}. \tag{6.26}$$

This is true for any capacitor. In the specific case of the plate capacitor, $C = \frac{\varepsilon_0 A}{d}$.

6.3.2 Energy in the Electric Field
We have seen that to generate electric fields we need charges. But in order to position these charges in space, we need to do some work W, because moving charges in electric fields involves work. So we can say that work is necessary to generate electric fields. We already know this from mechanics, where work is done to lift a mass, for example. This mass then has a certain potential energy. We can now ask where this potential energy is in the case of the generated electric field, and a legitimate answer is that this energy is in the field itself: In every small volume of space where the electric field is not exactly zero, there is energy. With the help of a somewhat longer calculation, the total potential energy in a volume V can be represented as

$$E_{pot} = \int_V \frac{1}{2}\,\varepsilon_0 \left|\vec{E}(\vec{r})\right|^2 dV. \tag{6.27}$$

This is an integral over the whole volume V. The integrand can now be interpreted as the energy per infinitesimal volume dV. This is the electric energy density $w_{el}$, which we thus define as

$$w_{el} = \frac{1}{2}\,\varepsilon_0 \left|\vec{E}(\vec{r})\right|^2, \qquad [w_{el}] = \frac{\mathrm{J}}{\mathrm{m}^3}. \tag{6.28}$$
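Example: Energy in the Plate Capacitor
In the plate capacitor (plate area A, plate spacing d, charge Q, voltage U = Q/C, capacitance C) there is a homogeneous electric field $E = \frac{Q}{\varepsilon_0 A}$ between the plates. The energy density is

$$w_{el} = \frac{1}{2}\,\varepsilon_0 \left|\vec{E}(\vec{r})\right|^2 = \frac{Q^2}{2\varepsilon_0 A^2}.$$

The energy is now given by the energy density times the volume in the capacitor, $W = wV = wA \cdot d$. By inserting w and using $C = \frac{\varepsilon_0 A}{d}$ and $Q = CU$, one obtains

$$W = \frac{Q^2 d}{2\varepsilon_0 A} = \frac{Q^2}{2C} = \frac{C}{2}\,U^2. \tag{6.29}$$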
The two expressions as a function of the charge Q or the voltage U are equivalent alternatives here. This potential energy is equal to the work needed to charge the capacitor with this charge or to this voltage.
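The equivalence of the expressions in (6.29) can be verified with a few lines of code. This is a minimal sketch with assumed, purely illustrative values for Q, A and d; the function name is hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def plate_capacitor(Q, A, d):
    """Return field, voltage, capacitance and field energy of a plate capacitor."""
    E = Q / (EPS0 * A)        # homogeneous field between the plates
    U = E * d                 # voltage, Eq. (6.25)
    C = Q / U                 # capacitance, Eq. (6.26)
    w = 0.5 * EPS0 * E**2     # energy density, Eq. (6.28)
    W = w * A * d             # energy = energy density times volume
    return E, U, C, W

# Assumed example values: 1 nC on plates of 100 cm^2 at 1 mm spacing
Q, A, d = 1e-9, 1e-2, 1e-3
E, U, C, W = plate_capacitor(Q, A, d)

print("W from energy density:", W, "J")
print("Q^2 / (2C):           ", Q**2 / (2 * C), "J")
print("C U^2 / 2:            ", 0.5 * C * U**2, "J")
```

All three printed energies should coincide up to floating-point rounding.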
6.4
Matter in the Electric Field
So far we have assumed that the electric field is in a vacuum. Now we want to look at what happens when the electric field encounters matter. Here we have to distinguish between electrical conductors, such as metals, and electrical insulators, such as glass or plastic.
6.4.1
Electrical Conductors and Currents
Electrical conductors are characterised by the fact that they contain freely movable charges. In metals such as copper, these are the electrons. However, there are also other conductors, for example, electrolytes. These are liquids in which ions are dissolved. An example is a solution of common salt (sodium chloride), in which Na+ and Cl- ions are dissolved and freely mobile. An external electric field $\vec{E}_{ext}$ exerts forces on the charge carriers via the relation $\vec{F} = q\vec{E}$ and displaces them in the conductor. This effect is referred to as electrostatic induction. Figure 6.6 illustrates what happens inside the conductor: Because the negatively charged electrons in the conductor are shifted to the left, against the electric field direction, they accumulate at the left edge of the conductor. Since the conductor as a whole is uncharged, this creates positive excess charges at the right edge. This charge separation itself generates an internal electric field $\vec{E}_{int}$, which is directed against the external field. Charges are displaced in the conductor until the fields inside the conductor exactly compensate each other, that is, until the interior is free of fields. This is generally valid for all conductors.

Fig. 6.6 An external electric field causes charge separation in the conductor, which in turn generates an internal electric field. In equilibrium, the two fields compensate each other inside the conductor

Example: The Faraday cage
The interior of electrical conductors is always free of fields. This also applies to holes inside conductors. An example of this is the Faraday cage. This is a cage made of metal. No matter how large the electric field strength is outside the cage, there is no electric field inside and therefore no voltage. This means that in such a cage you are safe from lightning strikes, for example. Cars, which are almost always made of metal, are an example of a Faraday cage.

Now consider a conductor that has no edge, for example, a wire whose ends terminate in a battery. Within this conductor, an electric field along the wire then causes a continuous movement of the charge carriers. By the electric current I we understand the quantity of charge ΔQ which moves per time Δt through a cross-sectional area of the wire. Since the current can change as a function of time, we define the current using infinitesimal quantities dQ and dt:

$$I(t) = \frac{dQ}{dt}, \qquad [I] = \frac{\mathrm{C}}{\mathrm{s}} = \mathrm{A}. \tag{6.30}$$
Current is thus the charge transported per time and is measured in amperes (A). In this book we use only the technical direction of current, that is, we consider the current of positive charges in the conductor, even if the electrons are moving in the opposite direction. The direction of the current is defined consistently by the current density $\vec{j}$. It describes the current per area and also includes the unit vector $\vec{e}$ as the direction in which the current flows. The definition is

$$\vec{j} = \frac{I}{A}\,\vec{e} = n \cdot q \cdot \vec{v}, \qquad [j] = \frac{\mathrm{A}}{\mathrm{m}^2}. \tag{6.31}$$

Here, n denotes the density of the charge carriers with charge q that contribute to the current, and $\vec{v}$ their velocity. Thus, for negative charge carriers such as electrons with q = -e, the direction of $\vec{j}$ is opposite to their direction of motion.
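Definition (6.31) can be used to estimate how fast the charge carriers actually move. The sketch below solves j = I/A = n·e·v for the speed v. The carrier density used for copper (about 8.5 × 10^28 per cubic metre) as well as the current and cross section are assumed, illustrative values that do not come from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C

def drift_speed(current, area, carrier_density, charge=E_CHARGE):
    """Magnitude of the carrier velocity from j = I/A = n * q * v."""
    j = current / area                    # current density in A/m^2
    return j / (carrier_density * charge)

# Assumed example: 1 A through a copper wire with 1 mm^2 cross section
n_copper = 8.5e28   # assumed electron density of copper in 1/m^3 (approximate)
v = drift_speed(current=1.0, area=1e-6, carrier_density=n_copper)

print("current density:", 1.0 / 1e-6, "A/m^2")
print("drift speed:", v, "m/s")
```

Under these assumptions the speed is far below one millimetre per second, which shows that even sizeable currents correspond to a very slow average motion of the electrons.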
6.4.1.1 Ohm’s Law
Now we can turn our attention to Ohm’s law. It describes the relationship between electric fields and currents in conductors and reads as follows:

$$\vec{j} = \sigma \vec{E}. \tag{6.32}$$

So we see that the current density points in the direction of the electric field and is also proportional to the electric field. The proportionality constant here is the electrical conductivity σ. This is a material constant with the unit $[\sigma] = \frac{\mathrm{A}}{\mathrm{V\,m}} = \frac{\mathrm{S}}{\mathrm{m}}$. Here the new unit siemens, $\mathrm{S} = \frac{\mathrm{A}}{\mathrm{V}}$, has been introduced. A quantity related to the conductivity is the specific resistance $\rho = \frac{1}{\sigma}$ with the unit $[\rho] = \frac{\Omega\,\mathrm{mm}^2}{\mathrm{m}}$, where Ω denotes the ohm. If we now consider conductors of certain dimensions, for example, a wire with a cross-sectional area A and length l, and apply a voltage U to the wire, we find that a current I flows through the wire which is proportional to the voltage. The constant of proportionality is called the ohmic resistance R:

$$U = RI, \qquad [R] = \Omega. \tag{6.33}$$

This relationship is also commonly referred to as Ohm’s law. Of course, both laws are correct, and they are even related: with $U = El$ and $I = jA = \sigma E A$, the resistance of the wire is given by

$$R = \frac{l}{\sigma A} = \rho\,\frac{l}{A}. \tag{6.34}$$
Written Test: Resistors in Wires
Given five metal wires that are made of the same material. Which is the one with the smallest resistance?
(a) Length 40 mm, radius 1 mm,
(b) Length 50 mm, radius 1.1 mm,
(c) Length 60 mm, radius 1.2 mm,
(d) Length 70 mm, radius 1.3 mm,
(e) Length 80 mm, radius 1.5 mm.
Solution: The resistance of a wire is given by $R = \rho \frac{l}{A}$. Since the five wires are made of the same material, they all have the same resistivity ρ, which means we need to find the wire for which the ratio of length to area is the smallest. Also, since the area increases with the radius squared ($A = \pi r^2$), we need to calculate the ratio of length to radius squared, $\frac{l}{r^2}$. The result is: $\frac{l}{r^2}$ = (a) 40/mm, (b) 41.32/mm, (c) 41.67/mm, (d) 41.42/mm, (e) 35.56/mm. The wire with the smallest resistance is therefore wire (e).
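The comparison in the solution can be reproduced in a few lines. This is a minimal sketch; the dictionary layout and variable names are chosen here for illustration only.

```python
# Length and radius of each wire in mm, as given in the task
wires = {
    "a": (40, 1.0),
    "b": (50, 1.1),
    "c": (60, 1.2),
    "d": (70, 1.3),
    "e": (80, 1.5),
}

# For equal material, R is proportional to l / A and thus to l / r^2
ratios = {name: length / radius**2 for name, (length, radius) in wires.items()}

for name, ratio in ratios.items():
    print(f"({name}) l/r^2 = {ratio:.2f} /mm")

best = min(ratios, key=ratios.get)
print("smallest resistance: wire", best)
```

Because the common factor ρ/π is the same for all wires, it does not need to be known to rank them.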
6.4.1.2 Electrical Power
We have seen that a current in an electrical conductor is always associated with a voltage. Here charges are carried from a higher to a lower potential, which accelerates them. After a certain acceleration distance, however, the moving charges collide at random with the oscillating (but otherwise stationary) atomic cores, whereby the charges are slowed down. In such a collision, the kinetic energy of the charge carriers is transferred to the oscillations of the atomic cores, and the conductor heats up as a result. The heating power P is given by

$$P = UI, \qquad [P] = \mathrm{VA} = \mathrm{W}. \tag{6.35}$$

P is also referred to as electrical power. If we insert Ohm’s law (6.33) into the definition of electrical power, we obtain

$$P = RI^2 = \frac{U^2}{R}. \tag{6.36}$$
6.4.1.3 Emission of Electrons from Metals
Several important applications require electrons that can move freely in a vacuum. Examples are the Braun tube and the electron microscope. In order to produce these free electrons, metals are used, which have a very large number of freely movable electrons in their interior. However, in order to release an electron from the metal, a certain material-dependent amount of energy is required, which we refer to here as the work function $W_A$ (see Fig. 6.7). To provide this energy, the metal is typically heated, so that the energy is supplied as thermal energy. Since the metal usually begins to glow, this is also referred to as thermionic emission. The electron is emitted from the hot cathode. Technically, the emitted electrons are then drawn off by a positively charged anode, towards which the electrons are accelerated. The resulting beam of electrons is called a cathode ray. Another method, field emission, is based on the use of very strong electric fields. If the electric force F = -eE on an electron is greater than the force holding the electron in the metal, that is, for $|F| > \frac{dE}{dx}$ in Fig. 6.7, the electron is released from the metal. Sufficiently strong electric fields can be produced, for example, by fabricating the cathode in the form of a tip and applying a negative voltage to the tip (see Fig. 6.7b). Very strong local electric fields can be produced at such tips.

Fig. 6.7 (a) Work function of electrons from metals, (b) Thermionic and field emission of free electrons from the cathode and acceleration to the anode, (c) Deflection of a cathode ray in a plate capacitor

On its way from the cathode to the anode, the electron acquires, according to (6.20), the electric energy

$$W_{el} = eU, \qquad [W_{el}] = \mathrm{CV} = \mathrm{VAs} = \mathrm{J}, \tag{6.37}$$

where U is the voltage between the anode and cathode. Another common unit is the so-called electron volt (eV). This is the electrical energy of an electron accelerated by the voltage U = 1 V. Thus, $1\,\mathrm{eV} \approx 1.6 \times 10^{-19}\,\mathrm{J}$. The velocity of the accelerated electrons is then calculated by equating the electrical energy with the kinetic energy $\frac{1}{2}mv^2$. For non-relativistic velocities, which are significantly less than the speed of light, it is given by

$$v = \sqrt{\frac{2eU}{m}}, \tag{6.38}$$

with elementary charge e and mass of an electron m. In a Braun tube, a cathode ray is deflected by plate capacitors (Fig. 6.7c). The homogeneous electric field in the capacitor leads to a constant force $\vec{F} = -e\vec{E}$, which deflects the electrons with the constant acceleration $\vec{a} = -\frac{e}{m}\vec{E}$. This leads to a parabolic trajectory, similar to the oblique throw of masses in the Earth’s gravitational field.
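To get a feeling for the numbers in (6.38), the following sketch evaluates the electron speed for an assumed accelerating voltage of 1000 V (an illustrative value, not from the text) and compares it with the speed of light to check that the non-relativistic formula is still reasonable.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
M_ELECTRON = 9.1093837015e-31  # electron mass in kg
C_LIGHT = 299792458.0        # speed of light in m/s

def electron_speed(voltage):
    """Non-relativistic speed after acceleration through `voltage`, Eq. (6.38)."""
    return math.sqrt(2 * E_CHARGE * voltage / M_ELECTRON)

U = 1000.0  # assumed accelerating voltage in V
v = electron_speed(U)

print("kinetic energy:", E_CHARGE * U, "J  (=", U, "eV )")
print("speed:", v, "m/s")
print("fraction of the speed of light:", v / C_LIGHT)
```

For voltages of many tens of kilovolts the ratio v/c is no longer small, and (6.38) would then overestimate the speed.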
6.4.2
Electrical Insulators and Dipoles
Now let us look at what happens when an external electric field hits an electrical insulator. This is characterized by the fact that, unlike electrical conductors, it has no freely movable charge carriers, that is, all electrons in the material are bound to the atomic cores. Nevertheless, a shift of the centres of charge can occur at the atomic level. This results in the formation of atomic dipoles, which we will first take a closer look at.
6.4.2.1 Electric Dipole
The electric dipole quite generally consists of two opposite charges ±q at a distance d from each other (see Fig. 6.8a). The electric dipole moment $\vec{p}$ is defined by

$$\vec{p} = q\,\vec{d}, \qquad [p] = \mathrm{Cm}, \tag{6.39}$$

where the distance vector $\vec{d}$ points from the negative to the positive charge by convention. Dipoles can be permanent, that is, present even without an external electric field, if, for example, the centres of the negative and positive charges in a molecule do not coincide, as in the case of the water molecule (see Fig. 6.8b). Alternatively, an electric dipole can be induced by an external electric field. In an atom, the external field leads to a shift of the centres of charge of the outermost electrons (valence electrons) relative to the location of the atomic nucleus (Fig. 6.8c). A dipole generates its own electric field (see Fig. 6.1c). This field has a complicated structure and can be described by superposition of the individual fields of the two charges. The same applies to the corresponding electric potential, which is given by the sum of the individual potentials of the positive and negative charge.

Dipoles in Electric Fields
If an electric dipole $\vec{p}$ is in an external electric field $\vec{E}$, the dipole has the energy

$$U = -\vec{p} \cdot \vec{E}. \tag{6.40}$$
The energy therefore depends on the orientation of the dipole relative to the electric field and is smallest when the dipole points in the same direction as the electric field. This can also be understood graphically by considering the forces $\vec{F}_-$ and $\vec{F}_+$ acting on the two charges of the dipole (Fig. 6.9). In a homogeneous electric field, the two forces are equal in magnitude and point in opposite directions; therefore, the total force acting on the dipole is zero. Depending on the angle of the dipole moment to the external field, however, there is a torque which causes a rotation of the dipole moment in the direction of the external field:

$$\vec{M} = \vec{p} \times \vec{E}. \tag{6.41}$$

Fig. 6.8 Electric dipoles: (a) general definition via two opposite charges at distance d, (b) permanent electric dipole in the water molecule, (c) induced electric dipole in an atom generated by an external electric field

Fig. 6.9 (a) In a homogeneous electric field, the forces acting on the two charges of the dipole are equal in magnitude. Thus the total force on the dipole is zero, but there is a torque which makes the dipole moment parallel to the external field. (b) In an inhomogeneous electric field, the forces acting on the two charges are different. This results in a force on the dipole in addition to the torque

In the sketch, the positive charge is pulled to the right and the negative charge is pulled to the left. This causes the dipole moment $\vec{p}$ to rotate clockwise, and the direction of $\vec{p}$ aligns with the direction of $\vec{E}$. This minimizes the energy of the dipole. On the other hand, if the electric field is inhomogeneous, there is a total force acting on the dipole in addition to the torque. In the simplified situation sketched in Fig. 6.9b, where the dipole is aligned in the direction of the electric field (here the x-direction), the force on the dipole is given by

$$F = p \cdot \frac{dE}{dx}. \tag{6.42}$$
The force is therefore proportional to the derivative of the electric field with respect to the position.
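The orientation dependence of the dipole energy (6.40) and the torque (6.41) can be illustrated numerically. The sketch below uses plain tuples for vectors; the magnitude of the dipole moment and the field strength are assumed, illustrative values, and the helper names are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dipole_energy(p, E):
    """Energy of a dipole in a field, U = -p . E, Eq. (6.40)."""
    return -dot(p, E)

def dipole_torque(p, E):
    """Torque on a dipole in a field, M = p x E, Eq. (6.41)."""
    return cross(p, E)

p_mag = 1e-29          # assumed dipole moment in C m
E = (1e5, 0.0, 0.0)    # assumed homogeneous field along x in V/m

for angle_deg in (0, 90, 180):
    a = math.radians(angle_deg)
    p = (p_mag * math.cos(a), p_mag * math.sin(a), 0.0)
    print(angle_deg, "deg: U =", dipole_energy(p, E), "J, M =", dipole_torque(p, E))
```

The energy is lowest for the parallel orientation, while the torque vanishes for the parallel and antiparallel orientations and is largest at 90 degrees.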
6.4.2.2 Dielectricity
With the help of the knowledge of dipoles, we can now understand the influence of electric fields on insulators. The insulator consists of many atoms or molecules that react to the external electric field. Either atomic dipoles are induced in the atoms (displacement polarization), or existing permanent dipoles are aligned in the external field (orientation polarization). This results in a polarization $\vec{P}$ of the entire medium, which is understood as the average dipole moment per volume. The polarization generates its own electric field, which is superimposed on the external field in the insulator and weakens the external electric field. At this point we introduce a new field, the electric flux density $\vec{D}$, which takes the effect of the polarization into account:

$$\vec{D} = \varepsilon_0 \vec{E} + \vec{P}, \qquad [D] = \frac{\mathrm{C}}{\mathrm{m}^2}. \tag{6.43}$$

The D field has the advantage that, unlike the E field, it remains unchanged when entering an insulating medium. There is a linear relationship between the two fields:

$$\vec{D} = \varepsilon_0 \varepsilon_r \vec{E}, \tag{6.44}$$
with the relative dielectric constant $\varepsilon_r$. This material constant is almost always $\varepsilon_r \ge 1$; for vacuum it is $\varepsilon_r = 1$, in air it is, for example, $\varepsilon_r = 1.0006$, and in water it is, for example, $\varepsilon_r = 88$ (static value, valid up to frequencies on the order of 1 GHz). All the formulas noted so far, which are valid in vacuum, are also valid in dielectric media if one replaces $\varepsilon_0$ with $\varepsilon_0\varepsilon_r$.

Example: Dielectric in Plate Capacitor
We assume a plate capacitor with plate area A and plate distance d, which is charged with the charge ±Q. Let there be a vacuum between the plates with $\varepsilon_r = 1$. Then the electric field is given by

$$E_1 = \frac{Q}{\varepsilon_0 A}, \tag{6.45}$$

the voltage between the plates is

$$U_1 = E_1 d = \frac{Qd}{\varepsilon_0 A}, \tag{6.46}$$

and the capacitance of the capacitor is

$$C_1 = \frac{Q}{U_1} = \frac{\varepsilon_0 A}{d}. \tag{6.47}$$

The energy stored in the plate capacitor is given by

$$W_1 = \frac{Q^2}{2C_1}. \tag{6.48}$$

The electric flux density is

$$D_1 = \varepsilon_0 \varepsilon_r E_1 = \frac{Q}{A}. \tag{6.49}$$

Now we fill the area between the plates with a dielectric with $\varepsilon_r > 1$. We assume that the capacitor is separated from the voltage source, thus the charge Q on the capacitor plates remains constant. The electric flux density now remains the same by definition:

$$D_2 = D_1 = \frac{Q}{A}. \tag{6.50}$$

This causes the other quantities to change: The electric field

$$E_2 = \frac{D_2}{\varepsilon_0 \varepsilon_r} = \frac{Q}{\varepsilon_0 \varepsilon_r A} = \frac{E_1}{\varepsilon_r} \tag{6.51}$$

and the voltage between the plates

$$U_2 = E_2 d = \frac{E_1}{\varepsilon_r}\,d = \frac{U_1}{\varepsilon_r} \tag{6.52}$$

become smaller. The capacitance

$$C_2 = \frac{Q}{U_2} = \varepsilon_r\,\frac{Q}{U_1} = \varepsilon_r C_1, \tag{6.53}$$

on the other hand, becomes larger. This also reduces the energy stored in the capacitor:

$$W_2 = \frac{Q^2}{2C_2} = \frac{Q^2}{2\varepsilon_r C_1} = \frac{W_1}{\varepsilon_r}. \tag{6.54}$$

One can now ask where the missing energy is. In fact, the capacitor exerts a force on the dielectric, which pulls the dielectric into the capacitor. So when the dielectric is inserted into the capacitor, the capacitor does work on the dielectric. This causes the dielectric to gain potential energy, which comes from the electrical energy of the capacitor and is now stored in the polarization of the dielectric.
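The scaling of all quantities with the relative dielectric constant can be checked with a short calculation. The following sketch assumes a capacitor that is disconnected from the voltage source (constant charge), as in the example; the numerical values of Q, A, d and the dielectric constant of 4 are illustrative assumptions, and the function name is hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def capacitor_state(Q, A, d, eps_r=1.0):
    """Field, voltage, capacitance, energy and flux density at fixed charge Q."""
    E = Q / (EPS0 * eps_r * A)   # Eq. (6.45) / (6.51)
    U = E * d                    # Eq. (6.46) / (6.52)
    C = Q / U                    # Eq. (6.47) / (6.53)
    W = Q**2 / (2 * C)           # Eq. (6.48) / (6.54)
    D = EPS0 * eps_r * E         # Eq. (6.49) / (6.50)
    return {"E": E, "U": U, "C": C, "W": W, "D": D}

# Assumed example values
Q, A, d = 1e-9, 1e-2, 1e-3
vacuum = capacitor_state(Q, A, d)
filled = capacitor_state(Q, A, d, eps_r=4.0)

for key in ("E", "U", "C", "W", "D"):
    print(key, "ratio filled/vacuum:", filled[key] / vacuum[key])
```

With a dielectric constant of 4, the field, voltage and energy drop to one quarter, the capacitance increases fourfold, and the flux density stays the same.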
6.5
Electrostatics – Compact
Here are the most important formulas of electrostatics:

Coulomb’s law (6.4):

$$F_C = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^2}.$$

Electric force (6.7):

$$\vec{F}(\vec{r}) = q\,\vec{E}(\vec{r}).$$

Gauss’ theorem (6.10):

$$\phi_{el} = \oint_O \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0}.$$

Electric field and potential of a spherically symmetric charge distribution/point charge (6.14) and (6.23):

$$E(r) = \frac{1}{4\pi\varepsilon_0}\,\frac{Q}{r^2}, \qquad \varphi_{el}(r) = \frac{1}{4\pi\varepsilon_0}\,\frac{Q}{r}.$$

Electric field and potential of a long charged wire (6.15) and (6.24):

$$E(r) = \frac{1}{2\pi\varepsilon_0}\,\frac{\lambda}{r} = \frac{1}{2\pi\varepsilon_0}\,\frac{Q/l}{r}, \tag{6.55}$$

$$\varphi_{el}(r) = \frac{\lambda}{2\pi\varepsilon_0}\,[\ln(R) - \ln(r)]. \tag{6.56}$$

Plate capacitor with dielectric (6.17), (6.25), (6.26) and (6.29):

$$E = \frac{U}{d} = \frac{Q}{\varepsilon_0 \varepsilon_r A}, \qquad C = \frac{Q}{U} = \frac{\varepsilon_0 \varepsilon_r A}{d}, \qquad E_{pot} = \frac{CU^2}{2} = \frac{Q^2}{2C}.$$

Ohm’s law (6.34) and (6.33):

$$U = RI, \qquad R = \rho\,\frac{l}{A}.$$

Electric dipole (6.39) and its energy in the electric field (6.40):

$$\vec{p} = q\,\vec{d}, \qquad U = -\vec{p} \cdot \vec{E}.$$

Force (6.42) and torque (6.41) on an electric dipole in the electric field:

$$F = p \cdot \frac{dE}{dx}, \qquad \vec{M} = \vec{p} \times \vec{E}.$$

Relationship between D field and E field (6.44):

$$\vec{D} = \varepsilon_0 \varepsilon_r \vec{E}.$$
7
Magnetostatics
7.1
Permanent Magnets and Magnetic Field Lines
Magnets exist in almost every household, for example, as locking mechanisms for cupboards or as the well-known refrigerator or pinboard magnets. All these magnets are so-called permanent magnets, that is, materials that are magnetic without external action. However, there are also electromagnets, which are only magnetic when an electric current flows (see Sect. 7.3). The properties of magnets described below are identical, irrespective of whether the magnet is a permanent magnet or an electromagnet. In particular, each magnet has a magnetic south pole and a magnetic north pole (see Fig. 7.1). Each magnet is therefore a magnetic dipole, by analogy with the electric dipole in electrostatics (Sect. 6.4.2.1), which consists of two charges of opposite sign. If two magnets are present, then these two magnets exert forces on each other, which are mediated by the magnetic field

$$\vec{B}(\vec{r}), \qquad [B] = \mathrm{T}, \tag{7.1}$$

where the SI unit for the magnetic field is the tesla. Another common unit for the B field is the gauss, with $1\,\mathrm{G} = 10^{-4}\,\mathrm{T}$. Unlike magnetic poles (north and south poles) attract each other, while like poles (e.g., two north poles) repel each other. A special feature here is the fact that magnets can also exert forces on substances that are not permanently magnetic by themselves. These are the so-called ferromagnetic substances (e.g., iron). The presence of a magnet magnetizes the ferromagnetic substance in such a way that the magnet and the ferromagnetic substance always attract each other, regardless of whether the north pole or the south pole of the magnet is brought close to the ferromagnetic substance. For more on this see Sect. 7.4.3 on ferromagnetism.

Analogous to the electric field lines, one can also introduce magnetic field lines, which by convention leave the permanent magnet at its north pole and re-enter the magnet at the south pole. Inside the magnet, the field lines then run from the south pole back to the north pole. Magnetic field lines are therefore always closed. The closer the field lines are to each other, the greater the magnetic field.

Fig. 7.1 Magnetic field lines of a bar magnet (a) and a horseshoe magnet (b). In the case of the horseshoe magnet, the magnetic field between the pole pieces (the straight areas on the left and right sides) is almost homogeneous. (c) A permanent magnet consists of many microscopic magnets. If the permanent magnet is divided into two halves, two new magnets are created, each with a north and a south pole

However, there is also a decisive difference from electrostatics: Electric charges can also occur as individual objects, for example, as positive point charges. These charges are called electric monopoles. With magnetism, on the other hand, there are no monopoles, that is, magnetic north and south poles never occur alone, but always in combination with their respective counterpart. One could now come up with the idea of splitting a magnet in half to make a monopole out of the dipole. However, this does not work, because a new south or north pole is created at the cut edge and thus the two newly created magnets are also dipoles. This is due to the fact that the permanent magnet is composed of many microscopic magnets, which are ideally all aligned with each other (see Fig. 7.1c). A cut through the permanent magnet thus divides the many microscopic magnets into two groups, each of which again forms a permanent magnet. The magnets thus obtained can now be further divided until we arrive at the microscopic elementary magnets, but even these cannot be divided into two magnetic monopoles. We must accept this at this point. An exact justification would require a description of the magnetic effects in atoms, which we cannot go into in detail here, since quantum mechanics is needed for that. A simple (but physically not quite correct) idea is based on the fact that small circular currents flow within the atom, that is, the atom is regarded as a tiny electromagnet.

Example: Earth’s Magnetic Field and Compass
The Earth is also a giant magnet. The magnetic north pole of the Earth is located near the geographic south pole and vice versa. The cause of the Earth’s magnetic field lies in electrical currents in the liquid interior of the Earth and is still the subject of research. The strength of the Earth’s magnetic field depends on the position on the Earth, but is approximately in the range of B ~ 0.5 G. Since the electric currents in the Earth’s interior are permanently changing slightly, both the strength of the Earth’s magnetic field and the position of the magnetic poles on the Earth’s surface fluctuate. The magnetic field even seems to have completely reversed polarity several times in the course of the Earth’s history. A much smaller magnet, on the other hand, is contained in a compass. This magnet is freely movable and aligns itself in the Earth’s magnetic field in such a way that the magnetic north pole of the compass points in the direction of the Earth’s magnetic field lines towards the magnetic south pole of the Earth.
7.2
Lorentz Force and Definition of the B-Field
The electric field was defined in Sect. 6.1.2 via the electric force acting on a test charge. Analogously, we now want to find a definition for the $\vec{B}$ field. The Lorentz force $\vec{F}_L$ is suitable for this purpose. This is the force experienced by a charge q when it moves with the velocity $\vec{v}$ in a magnetic field $\vec{B}$:

$$\vec{F}_L = q \cdot \vec{v} \times \vec{B}. \tag{7.2}$$

To determine the Lorentz force, one must calculate the cross product between the velocity and the $\vec{B}$ field. For the direction of the Lorentz force, it is best to use the right-hand rule for cross products: Extend the thumb in the direction of $\vec{v}$ and the index finger in the direction of $\vec{B}$, then the extended middle finger points in the direction of $\vec{v} \times \vec{B}$ (see Fig. 7.2). To determine the direction of the Lorentz force, one must then consider the sign of q. For positive charges q > 0 the direction of $\vec{F}_L$ is parallel to $\vec{v} \times \vec{B}$, and for negative charges q < 0 such as electrons it is opposite.

Fig. 7.2 (a) Right-hand rule for determining the Lorentz force on a charge q. For positive charges q > 0 the Lorentz force points forward in the direction of the middle finger, for negative charges it points backward in the opposite direction. (b) Circular motion with radius R of a positive charge in a homogeneous magnetic field. For a negative charge, the circular path is traversed in the opposite direction

With the help of (7.2) one can now define the unit tesla of the magnetic field:

$$[B] = \frac{[F_L]}{[q][v]}, \qquad 1\,\mathrm{T} = 1\,\frac{\mathrm{N}}{\mathrm{A\,m}}. \tag{7.3}$$
Example: Typical Magnetic Field Strengths
Brain waves cause magnetic fields of $B \sim 10^{-15}$ T
Earth’s magnetic field $B \sim 5 \times 10^{-5}$ T = 0.5 G
Permanent magnets (neodymium) $B \lesssim 0.5$ T
Superconducting electromagnets $B \lesssim 45$ T
Neutron stars generate magnetic fields of $B \sim 10^{8}$ T

If, in addition to the magnetic field, an electric field is also present, both forces can be combined as

$$\vec{F} = q \cdot \left(\vec{v} \times \vec{B} + \vec{E}\right). \tag{7.4}$$
Another important insight from (7.2) is the proportionality of the Lorentz force to the → velocity v , that is, no Lorentz forces act on charges at rest. Moving charges, on the other hand, are forced onto special trajectories because of the cross product. This is the subject of Sect. 7.2.1.
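The direction and magnitude of the force in (7.2) and (7.4) can be evaluated directly from the cross product. The sketch below uses plain tuples for vectors; the field strength and velocity are assumed, illustrative values, and the function names are hypothetical.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_force(q, v, B, E=(0.0, 0.0, 0.0)):
    """Combined force F = q (v x B + E), Eq. (7.4); with E = 0 this is Eq. (7.2)."""
    v_cross_B = cross(v, B)
    return tuple(q * (c + e) for c, e in zip(v_cross_B, E))

B = (0.0, 0.0, 1e-3)       # assumed field of 1 mT along z
v = (1e5, 0.0, 0.0)        # assumed velocity of 100 km/s along x

print("positive charge:", lorentz_force(+E_CHARGE, v, B))
print("electron:       ", lorentz_force(-E_CHARGE, v, B))
print("charge at rest: ", lorentz_force(+E_CHARGE, (0.0, 0.0, 0.0), B))
print("v parallel to B:", lorentz_force(+E_CHARGE, (0.0, 0.0, 1e5), B))
```

For a velocity along x and a field along z, the force on a positive charge points along the negative y-direction, and it reverses for an electron. For a charge at rest or moving parallel to the field the force is zero.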
7.2.1
Trajectories of Charged Particles in the Homogeneous B-Field
We consider here a charge q moving with velocity $\vec{v}$ in a homogeneous magnetic field $\vec{B} = B\,\vec{e}_z$, which we assume here to point in the z-direction. The simplest case is when the velocity is parallel (or antiparallel) to the magnetic field, $\vec{v} = v\,\vec{e}_z$, because in this case the cross product is zero:

$$\vec{F}_L = qvB \cdot \vec{e}_z \times \vec{e}_z = 0. \tag{7.5}$$

Thus, no Lorentz force acts and the charge continues to move unaccelerated with its velocity $\vec{v}$.
The other extreme case is when the velocity vector $\vec{v}$ is perpendicular to the magnetic field. In this case, the Lorentz force has the magnitude

$$F_L = qvB \tag{7.6}$$

and is perpendicular to the velocity and the magnetic field (right-hand rule). Forces which are always perpendicular to the instantaneous direction of motion have already been discussed in Sect. 2.2.4.2. These forces cause a circular motion. Since the Lorentz force is also oriented perpendicular to the magnetic field, the plane of the circular path is perpendicular to the magnetic field (see Fig. 7.2b). We can therefore say that the charge moves in a circle around the direction of the magnetic field. Viewed against the field direction (with $\vec{B}$ pointing towards the observer), positively charged particles travel clockwise and negatively charged particles travel counterclockwise around $\vec{B}$.

Fig. 7.3 Examination task: One of the outlined movements is not possible

One can also determine the radius of the circular path. Since the Lorentz force drives the charge onto a circular path, it plays the role of the centripetal force $F_z = mv^2/R$, with particle mass m and orbital radius R. By equating the two forces, one can solve for the orbital radius R:

$$qvB = \frac{mv^2}{R}, \tag{7.7}$$

$$R = \frac{m}{q} \cdot \frac{v}{B}. \tag{7.8}$$
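As a numerical illustration of (7.8), the following sketch computes the orbital radius of an electron. The speed of 10^6 m/s is an assumed, illustrative value, and the field strength is taken to be of the order of the Earth's magnetic field mentioned above; the function name is hypothetical.

```python
E_CHARGE = 1.602176634e-19     # elementary charge in C
M_ELECTRON = 9.1093837015e-31  # electron mass in kg

def orbit_radius(mass, charge, speed, B):
    """Radius of the circular path for v perpendicular to B, Eq. (7.8)."""
    return mass * speed / (abs(charge) * B)

v = 1e6    # assumed electron speed in m/s (non-relativistic)
B = 5e-5   # field strength in T, order of magnitude of the Earth's field

R = orbit_radius(M_ELECTRON, -E_CHARGE, v, B)
print("orbital radius:", R, "m")

# Consistency check with Eq. (7.7): Lorentz force equals centripetal force
print("q v B    =", E_CHARGE * v * B, "N")
print("m v^2/R  =", M_ELECTRON * v**2 / R, "N")
```

The absolute value of the charge is used here because the sign only determines the sense of rotation, not the radius.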
In the general case where the velocity $\vec{v}$ points in any direction relative to the magnetic field, the velocity vector can be decomposed into a component parallel to the magnetic field and a component perpendicular to it. As shown above, the velocity component along the magnetic field direction remains constant, while perpendicular to it the charge moves on a circular path around the magnetic field direction. If one adds both movements, one finally arrives at a helical movement around the magnetic field direction.
Written Test: Motion in the Magnetic Field
A charge moves in a homogeneous magnetic field B in the z-direction. Which of the motions sketched in Fig. 7.3 cannot be performed by the charge?
Solution: Motion (E) is not possible. It shows a circle around the y-axis. However, only linear movements along the direction of the magnetic field (C and D), circular paths (A), and helical movements (B) around the direction of the magnetic field are possible.
7.2.2
Lorentz Forces on Electrical Conductors
The fact that Lorentz forces act on moving charges in a magnetic field also applies to charges moving in a conductor, that is, to conductors through which current flows. Here, the charge carriers (i.e., the electrons in metals) are pressed sideways against the edge of the conductor from the inside by the Lorentz force as they move along the conductor. The force on the charges is thus transferred into a force acting on the entire conductor (Fig. 7.4). The force on a piece of conductor of length d, where the vector $\vec{d}$ points in the technical direction of the current I, is given by

$$\vec{F}_L = I \cdot \vec{d} \times \vec{B}. \tag{7.9}$$

Fig. 7.4 (a) Lorentz force on a conductor in a homogeneous magnetic field. (b) Parallel current-carrying conductors attract each other if, as shown in the sketch, the current direction of the two conductors is the same. If the direction of current is opposite, the conductors repel each other

Because of this, parallel current-carrying conductors also exert forces on each other: one of the conductors generates a magnetic field at the location of the other conductor, which then exerts a Lorentz force on the current of the second conductor (Fig. 7.4b). To calculate the force, we take the result for the magnetic field of a current-carrying straight conductor from Sect. 7.3: The magnetic field runs circularly around the conductor and has the magnitude $B(r) = \frac{\mu_0 I_1}{2\pi r}$ at a distance r from the conductor, where $I_1$ is the current through the first conductor. We put this magnetic field into the Lorentz force (7.9) on the second conductor with current $I_2$; moreover, we assume that both conductors have length d. Then the Lorentz force acting between the two conductors is given by

$$F_L = \frac{\mu_0 I_1 I_2}{2\pi} \cdot \frac{d}{r}. \tag{7.10}$$
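The two steps that lead to (7.10), the field of the first conductor and the Lorentz force on the second, can be written as a short calculation. This is a minimal sketch; the function names are hypothetical and the example of two conductors of 1 m length at 1 m distance carrying 1 A each is chosen only for illustration. The classical value of the magnetic constant, 4π × 10^-7 Tm/A, is used.

```python
import math

MU0 = 4 * math.pi * 1e-7  # magnetic constant in T m / A (classical value)

def field_of_straight_wire(current, r):
    """Magnitude of the magnetic field at distance r from a long straight wire."""
    return MU0 * current / (2 * math.pi * r)

def force_between_wires(I1, I2, length, r):
    """Force between two parallel wires, Eq. (7.10): field of wire 1 acting on wire 2."""
    B = field_of_straight_wire(I1, r)
    return I2 * length * B   # Eq. (7.9) with the wire perpendicular to B

# Illustrative example: 1 A in each wire, 1 m length, 1 m apart
F = force_between_wires(I1=1.0, I2=1.0, length=1.0, r=1.0)
print("force:", F, "N")
```

For these values the force is 2 × 10^-7 N, which corresponds to the situation used in the former definition of the ampere.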
The Lorentz forces acting on the two conductors are equal and opposite. This results either from an analogous calculation or directly from Newton’s third law, “actio = reactio”. For the same direction of current in the two conductors, the conductors attract each other with the force $F_L$; for opposite directions of current, the two conductors repel each other.
Examination Task: High Voltage Cable
The cables of a high-voltage overhead line can each conduct a current of I = 2000 A. With what force F do two such cables at a distance of 3 m and a length of 300 m attract each other at maximum current?
Solution: We use formula (7.10) and insert the values:

$$F_L = \frac{\mu_0 I_1 I_2}{2\pi} \cdot \frac{d}{r} = \frac{4\pi \cdot 10^{-7}\,\frac{\mathrm{T\,m}}{\mathrm{A}}}{2\pi} \cdot \frac{2000\,\mathrm{A} \cdot 2000\,\mathrm{A} \cdot 300\,\mathrm{m}}{3\,\mathrm{m}} = 80\,\mathrm{N}.$$

7.2.3
Hall Effect
Another consequence of the Lorentz force is the so-called Hall effect. Here, too, a current-carrying conductor is considered which is penetrated by a homogeneous magnetic field perpendicular to the direction of the current (see Fig. 7.5).

Fig. 7.5 In the Hall effect, a magnetic field penetrates a current-carrying conductor. This produces a voltage on the side surfaces, the Hall voltage UH

Just as with the Lorentz force on current-carrying conductors, the charge carriers (the electrons in metals) are forced onto one of the two lateral inner walls of the conductor. On this side, therefore, a negative charge surplus is created. Since the conductor as a whole is uncharged, however, these negative charges are missing on the other inner side, resulting in a positive charge surplus there. Between the positive and negative charges on the two side walls, this creates an electric field, as in a plate capacitor. The field points from the positively to the negatively charged side and counteracts further charge separation. At equilibrium, the electric force due to the electric field just compensates the Lorentz force due to the magnetic field. By equating the two forces

$$eE = evB \tag{7.11}$$
we can calculate the voltage between the two sidewalls with distance b by using the relation between voltage and electric field U = Eb and also relating the current to the velocity of the electrons: I = nevA, where n is the electron density in the metal and A = bd is the conductor cross-sectional area. From (7.11) we get E = vB and thus U = vBb; inserting v = I/(nebd), the width b cancels. The voltage UH which arises at the two side walls and which can be measured is the Hall voltage

$$U_H = \frac{IB}{ned}, \tag{7.12}$$
where d is the thickness of the conductor in the direction of the magnetic field. The Hall effect is used technically to build sensors for magnetic fields (Hall probes).

Written Test: Hall Voltage
On a current-carrying conductor (current I), which is brought into a constant magnetic field B, a voltage appears at the edges lying perpendicular to the direction of the current (width b of the conductor). The voltage is
(a) independent of the geometry of the conductor,
(b) larger, the smaller the magnetic field B is,
(c) larger, the smaller the charge carrier density in the metal is,
(d) smaller, the larger the current I is,
(e) independent of the strength of the magnetic field.
Which of these statements is correct?

Solution
The solution is in formula (7.12). The statements (a), (b), (d) and (e) are wrong, because the Hall voltage depends on the geometry via the thickness d and is proportional to the magnetic field and the current. Only statement (c) is correct, because the Hall voltage is proportional to 1/n, that is, if the charge carrier density decreases, the Hall voltage increases.
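To get a feeling for the size of the Hall voltage, formula (7.12) can be evaluated for a metal strip. The sketch below is a minimal example under stated assumptions: the electron density of copper (roughly 8.5 · 10^28 per cubic metre) and the strip dimensions are illustrative values that do not come from the text.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C


def hall_voltage(current, b_field, carrier_density, thickness):
    """Hall voltage U_H = I * B / (n * e * d), Eq. (7.12)."""
    return current * b_field / (carrier_density * E_CHARGE * thickness)


# Illustrative values (assumptions, not from the text)
n_copper = 8.5e28   # approximate free-electron density of copper in 1/m^3
thickness = 1e-3    # thickness d of the strip along the field, in m

u_h = hall_voltage(current=1.0, b_field=1.0,
                   carrier_density=n_copper, thickness=thickness)
print(f"U_H = {u_h:.2e} V")

# Halving the carrier density doubles the Hall voltage (statement (c) above)
u_h_half_n = hall_voltage(1.0, 1.0, n_copper / 2, thickness)
print(f"ratio = {u_h_half_n / u_h:.1f}")
```

For a metal the result is only some tens of nanovolts. Because the voltage scales with 1/n, materials with a much lower charge carrier density than metals give far larger signals for the same current and field.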
7.3
Ampère’s Law and Calculation of B-Fields
Now we want to establish a quantitative relationship between electric currents and the magnetic field $\vec{B}$. To do this, we must first introduce a new quantity, namely the path integral $S_B$ of the $\vec{B}$ field along a closed path around an open surface. We have already learned about path integrals in the definition of electric work (Sect. 6.3). Here we will explain the path integral using Fig. 7.6. An open area A is sketched, which is bounded by an edge K. The path integral starts at an arbitrary point of the boundary (e.g., at the marked black point) and runs along the boundary until the starting point is reached again. Along the way, one makes infinitesimally small steps $d\vec{r}$. At each point of the edge the scalar product $\vec{B} \cdot d\vec{r}$ between the local magnetic field and the infinitesimally small step is formed, and this quantity is integrated along the closed path:

$$S_B = \oint_K \vec{B} \cdot d\vec{r}, \qquad [S_B] = \mathrm{T \cdot m}, \tag{7.13}$$
where the small circle in the integral sign means that the path is closed, that is, the end point of the path integral is equal to the start point. Now we can write down Ampère’s law: Consider any open area A in space that is bounded by the edge K and through which the current I flows (Fig. 7.6). Then

$$\oint_K \vec{B} \cdot d\vec{r} = \mu_0 I. \tag{7.14}$$

Fig. 7.6 (a) We calculate the path integral over the magnetic field $\vec{B}$ along the closed path K at the edge of a surface A. According to Ampère’s law, the result of the integral is given by the current I flowing through the surface A. (b) Applying Ampère’s law to a straight long wire with current I. The magnetic field depends only on the distance r and runs circularly around the wire
The path integral along the edge K of a surface A is equal to the current I through this surface times the magnetic field constant $\mu_0 = 4\pi \cdot 10^{-7}\,\frac{\mathrm{Tm}}{\mathrm{A}}$. Here I denotes the total current through the surface. This can be distributed over several currents, whereby currents flowing through the surface in the direction of the surface vector $\vec{A}$ are counted positively, and currents flowing in the opposite direction are counted negatively. The total current causes a magnetic field to circulate around the surface under consideration. In Sects. 7.3.1, 7.3.2 and 7.3.3 we will use Ampère’s law to determine the magnetic fields of some important current configurations.
7.3.1
Current-Carrying Straight Wire
The simplest current configuration is that of a straight long wire (Fig. 7.6b). Since a straight wire has cylindrical symmetry, we apply Ampère’s law to a circular surface A through which the wire passes centrally. The associated boundary K is therefore a circle of radius r around the wire. According to Ampère’s law, the path integral $S_B$ has the value μ0I, so it is not zero. Moreover, the magnetic field must be rotationally symmetric about the wire, matching the symmetry of the wire. We conclude that the magnetic field is circular around the wire and has a value that depends only on the distance from the wire: B = B(r). In this case, at every point of the circle K, the magnetic field $\vec{B}$ is parallel to the infinitesimal step $d\vec{r}$; thus $\vec{B} \cdot d\vec{r} = B(r)\,ds$ holds at every point of the circle, where ds is the length of the step. The path integral thus reduces to a simple product of the magnetic field B(r) and the circumference of the circle 2πr. Ampère’s law therefore reads B(r) · 2πr = μ0I, which we can solve for the magnetic field:

$$B(r) = \frac{\mu_0 I}{2\pi r}. \tag{7.15}$$
We summarize: The magnetic field of a straight long wire runs circularly (technical term: azimuthal) around the wire and falls like 1/r with the distance from the wire. The only thing left to clarify is the direction in which the magnetic field runs around the wire. This depends on the direction of the current. For this we use the right-hand rule for magnetic fields of straight wires: The thumb points in the technical direction of the current, then the curved four other fingers indicate the direction of the magnetic field around the wire as in Fig. 7.7.
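Ampère’s law (7.14) holds for any closed path around the wire, not only for the circle used in the derivation. The following Python sketch checks this numerically for a square path; it is a minimal illustration, and the function names, the current and the size of the square are assumptions made for this example.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic field constant in T*m/A


def wire_field_xy(current, x, y):
    """B-field components (Bx, By) of a long straight wire along +z
    through the origin. Magnitude from Eq. (7.15); the azimuthal
    direction follows the right-hand rule."""
    r_squared = x * x + y * y
    k = MU_0 * current / (2 * math.pi * r_squared)
    return (-k * y, k * x)


def path_integral_square(current, half_side, steps_per_side=1000):
    """Midpoint-rule approximation of the closed path integral of B . dr
    along a square centred on the wire, traversed counter-clockwise."""
    a = half_side
    corners = [(-a, -a), (a, -a), (a, a), (-a, a), (-a, -a)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx = (x1 - x0) / steps_per_side
        dy = (y1 - y0) / steps_per_side
        for i in range(steps_per_side):
            x_mid = x0 + (i + 0.5) * dx
            y_mid = y0 + (i + 0.5) * dy
            bx, by = wire_field_xy(current, x_mid, y_mid)
            total += bx * dx + by * dy  # scalar product B . dr
    return total


current = 5.0  # A
s_b = path_integral_square(current, half_side=0.1)
print(s_b / (MU_0 * current))  # close to 1 if Ampère's law holds
```

The ratio printed at the end stays close to 1 for any size of the square, as long as the wire passes through it.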
Fig. 7.7 (a) Right-hand rule for magnetic fields of straight wires: The thumb is placed in the direction of the current, then the curved other fingers indicate the direction of the magnetic field around the wire. (b) Right-hand rule for conductor loops and coils: The curved fingers are placed in the direction of the current in the loops, then the thumb indicates the direction of the magnetic field

Fig. 7.8 (a) A circular current generates a magnetic dipole field, similar to a bar magnet. (b) Magnetic field in a long thin coil
7.3.2
Circular Current – Magnetic Dipole Moment
Another important geometry is that of the circular current, for example, in a circular wire loop, because a circular current generates a magnetic dipole field which is very similar to the magnetic field of a bar magnet (see Fig. 7.8a). This is no coincidence, because circular currents are directly linked to the magnetic dipole moment $\vec{\mu}$, which is given by

$$\vec{\mu} = I \vec{A}, \qquad [\vec{\mu}] = \mathrm{A\,m^2}. \tag{7.16}$$
Here $\vec{A}$ is the area vector of the circular surface with radius r ($|\vec{A}| = \pi r^2$), on whose edge the current I runs in a circle. The direction of the dipole moment and of the generated magnetic field for a given current direction can be found by applying the right-hand rule for magnetic fields of straight wires to individual small pieces of the circle (Fig. 7.8a). Alternatively, one can directly use the right-hand rule for magnetic fields of circular currents (Fig. 7.7b): one curves the four fingers from the index finger to the little finger in the direction of the technical circular current; the thumb then indicates the direction of the dipole moment and of the generated magnetic field. Quite analogous to the electric dipole in the $\vec{E}$ field (Sect. 6.4.2.1), a magnetic dipole in the external $\vec{B}$ field has the energy

$$U = -\vec{\mu} \cdot \vec{B}. \tag{7.17}$$
This energy is minimal when the magnetic dipole is aligned parallel to the external magnetic field (Sect. 7.4.2 on paramagnetism). Similar to the electric dipole in the electric field, the magnetic dipole in the magnetic field also experiences a torque

$$\vec{M} = \vec{\mu} \times \vec{B}. \tag{7.18}$$
For this reason, magnetic dipoles align themselves in an external magnetic field. Compasses, for example, work this way.
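Relations (7.16) to (7.18) can be tried out with a few lines of vector arithmetic. The sketch below is only an illustration: the helper functions and the numerical values (loop current, radius, field strength) are assumptions for this example.

```python
import math


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dipole_moment(current, radius, normal):
    """mu = I * A for a circular loop, Eq. (7.16).
    `normal` is the unit vector along the area vector A."""
    area = math.pi * radius ** 2
    return tuple(current * area * n for n in normal)


def energy(mu, b):
    """U = -mu . B, Eq. (7.17)."""
    return -dot(mu, b)


def torque(mu, b):
    """M = mu x B, Eq. (7.18)."""
    return cross(mu, b)


b_ext = (0.0, 0.0, 0.2)                                   # field along z, in T
mu_parallel = dipole_moment(2.0, 0.05, (0.0, 0.0, 1.0))   # dipole along the field
mu_perp = dipole_moment(2.0, 0.05, (1.0, 0.0, 0.0))       # dipole perpendicular to it

print(energy(mu_parallel, b_ext), torque(mu_parallel, b_ext))
print(energy(mu_perp, b_ext), torque(mu_perp, b_ext))
```

For the parallel orientation the energy takes its minimum value and the torque vanishes. For the perpendicular orientation the energy is zero and the torque is maximal, which is what turns a compass needle towards the field direction.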
7.3.3
Long Thin Coil
Solenoid coils consist of many wire loops that are attached to each other. This amplifies the magnetic field within the coil compared to a single loop, since the magnetic fields of the individual circular currents add up. Also, the magnetic field within the coil is largely homogeneous if the coil is wound tightly enough. The magnetic field inside the coil can also be calculated using Ampère’s law. Here we assume a long thin coil with N windings, which has the length l (see Fig. 7.8b). The current I flows through the coil. As a surface for the application of Ampère’s law we choose a rectangle which is penetrated exactly once by each turn of the coil. Thus, the total current through it is NI, and the right-hand side of Ampère’s law is Nμ0I. Furthermore, we assume that the magnetic field inside the coil is homogeneous with value B and outside the coil is zero. This is a good approximation for thin coils, since the magnetic field outside the coil falls off on the length scale of the coil radius. For this reason, only the part of the path inside the coil contributes to the magnetic path integral, which has the value Bl. Ampère’s law thus reads Bl = Nμ0I, which we can solve for the magnetic field inside the coil:

$$B = \mu_0 \frac{N I}{l}. \tag{7.19}$$
What may seem surprising is the fact that the magnetic field does not depend on the diameter of the coil. But this is only correct in the approximation of long thin coils, where the coil length is significantly greater than the coil diameter.

Written Test: Magnetic Path Integral
What is the magnetic path integral along the path sketched in Fig. 7.9?

Fig. 7.9 Written test: A circular surface is penetrated by several windings of a coil. How large is the path integral along the sketched path around the circular surface?

Solution
The solution is very simple once you notice that each of the windings pierces the circular surface once upwards and once downwards. Thus the currents through the surface cancel each other out, and the total current through the surface is zero. According to Ampère’s law, the magnetic path integral is therefore also zero.
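How good the long-coil formula (7.19) is can be estimated by adding up the fields of the individual loops. The sketch below does this for the centre of the coil. It relies on the on-axis field of a single circular loop, a standard result that is not derived in this text, and all numerical values are assumptions chosen for illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic field constant in T*m/A


def loop_field_on_axis(current, radius, z):
    """On-axis field of one circular loop at distance z from its centre
    (standard result, assumed here and not derived in this text)."""
    return MU_0 * current * radius ** 2 / (2 * (radius ** 2 + z ** 2) ** 1.5)


def coil_field_centre(current, n_turns, length, radius):
    """Sum of the on-axis fields of n_turns loops spread evenly over
    `length`, evaluated at the centre of the coil."""
    spacing = length / n_turns
    total = 0.0
    for k in range(n_turns):
        z = -length / 2 + (k + 0.5) * spacing  # position of loop k
        total += loop_field_on_axis(current, radius, z)
    return total


def long_coil_field(current, n_turns, length):
    """Long thin coil approximation, Eq. (7.19)."""
    return MU_0 * n_turns * current / length


# Long thin coil: length 1 m, radius 1 cm
ratio_long = (coil_field_centre(1.0, 1000, 1.0, 0.01)
              / long_coil_field(1.0, 1000, 1.0))
# Short wide coil: length 5 cm, radius 5 cm
ratio_short = (coil_field_centre(1.0, 1000, 0.05, 0.05)
               / long_coil_field(1.0, 1000, 0.05))
print(ratio_long, ratio_short)
```

For the long thin coil the summed field agrees with (7.19) to a small fraction of a percent, while for the short wide coil it is less than half of the value predicted by the approximation.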
7.3.4
Magnetic Flux
Analogous to the electric flux in Sect. 6.2, a magnetic flux can also be defined; one only has to replace the electric field $\vec{E}$ by the magnetic field $\vec{B}$ (Fig. 6.2a). The definition of the magnetic flux is thus

$$\phi_m = \vec{B} \cdot \vec{A}, \qquad [\phi_m] = \mathrm{T \cdot m^2}, \tag{7.20}$$

$$\phi_m = \int_A \vec{B} \cdot d\vec{A}, \tag{7.21}$$

where (7.20) is valid for plane surfaces with surface vector $\vec{A}$ and constant magnetic fields on this surface, while (7.21) is generally valid for arbitrarily curved surfaces A with local surface vector $d\vec{A}$ and variable magnetic fields. If the magnetic field penetrates a plane surface perpendicularly, Eq. (7.20) simplifies further to the simple relation ϕm = BA. Since the unit of the B-field is given by $[B] = \frac{[\phi_m]}{[A]}$, the B-field is also called magnetic flux density.
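When the field varies over the surface, the integral (7.21) has to be evaluated. The following sketch does this numerically for a rectangular surface next to a long straight wire, lying in a plane that contains the wire, so that the field (7.15) is perpendicular to the surface everywhere. The geometry and all values are assumptions for this example; the closed-form result with the logarithm is the exact integral of 1/r and serves as a check.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic field constant in T*m/A


def wire_field(current, r):
    """Field of a long straight wire, Eq. (7.15)."""
    return MU_0 * current / (2 * math.pi * r)


def flux_through_rectangle(current, r_inner, r_outer, height, steps=1000):
    """Midpoint-rule version of Eq. (7.21): the rectangle is cut into
    narrow strips of area height * dr parallel to the wire."""
    dr = (r_outer - r_inner) / steps
    flux = 0.0
    for i in range(steps):
        r_mid = r_inner + (i + 0.5) * dr
        flux += wire_field(current, r_mid) * height * dr
    return flux


current = 10.0                       # A
r_inner, r_outer, height = 0.1, 0.2, 0.5   # m

flux_numeric = flux_through_rectangle(current, r_inner, r_outer, height)
flux_exact = (MU_0 * current * height / (2 * math.pi)
              * math.log(r_outer / r_inner))
print(flux_numeric, flux_exact)
```

Using the simple product (7.20) with the field at the inner edge would overestimate the flux here, because the field drops by a factor of two across the surface.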
Exercise: Is There a “Gauss Theorem” for Magnetic Flux?
In electrostatics we have become familiar with Gauss’ theorem, which relates the electric flux through the closed surface O of a volume to the charge Q in that volume:

$$\oint_O \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0}. \tag{7.22}$$

Now that we have defined magnetic flux, the question of an analogous law for magnetostatics is not far-fetched. On the other hand, we have already seen that charges are electric monopoles, but that there are no magnetic monopoles, only magnetic dipoles. How does this fit together?

Answer: In fact, there is a corresponding law in magnetostatics. But since there are no magnetic monopoles, the law is

$$\oint_O \vec{B} \cdot d\vec{A} = 0. \tag{7.23}$$
That is, the flux through any closed surface is zero, regardless of whether the volume is in free space or whether the volume contains magnets. Even if magnets only partially enter the volume, the magnetic flux through the surface is zero. Figuratively speaking, this statement means that exactly as many magnetic field lines always go into any volume as come out again at another place. The statement (7.23) is therefore equivalent to the statement that all magnetic field lines are closed.
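Statement (7.23) can be checked numerically for the field of a small magnet. The sketch below integrates the field of a point dipole over a sphere, once with the dipole inside the sphere and once with the dipole outside. The point-dipole field formula is a standard result that is assumed here and not derived in this text; the common prefactor μ0/(4π) is omitted because it does not affect whether the net flux vanishes. Positions, radius and grid resolution are illustrative assumptions.

```python
import math


def dipole_field(m, r):
    """Field of a point dipole m at the origin, evaluated at position r
    (standard formula, prefactor mu_0 / (4 pi) omitted)."""
    dist = math.sqrt(sum(c * c for c in r))
    r_hat = tuple(c / dist for c in r)
    m_dot_r = sum(a * b for a, b in zip(m, r_hat))
    return tuple((3 * m_dot_r * rh - mc) / dist ** 3
                 for rh, mc in zip(r_hat, m))


def flux_through_sphere(m, centre, radius, n_theta=200, n_phi=200):
    """Return (net flux, outgoing flux) through a sphere, using the
    midpoint rule in the polar angle theta and the azimuth phi."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    net = 0.0
    outgoing = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            point = tuple(c + radius * n for c, n in zip(centre, normal))
            b = dipole_field(m, point)
            d_area = radius ** 2 * math.sin(theta) * d_theta * d_phi
            contribution = sum(bc * n for bc, n in zip(b, normal)) * d_area
            net += contribution
            if contribution > 0:
                outgoing += contribution
    return net, outgoing


m = (0.0, 0.0, 1.0)
net_in, out_in = flux_through_sphere(m, centre=(0.3, 0.2, 0.1), radius=1.0)
net_out, out_out = flux_through_sphere(m, centre=(3.0, 0.0, 0.0), radius=1.0)
print(net_in / out_in, net_out / out_out)
```

In both cases the net flux is negligible compared to the flux that leaves the sphere: the same amount of flux enters the volume somewhere else.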
7.3.5
Energy in the Magnetic Field
Similar to the energy in the electric field, energy is also stored in the magnetic field. This is because in order to create a magnetic field in a certain region of space, you have to turn on currents. For example, you have to switch on the current in a coil to generate the magnetic field in the coil. In the process of switching on, electrical power is required. However, in an ideal coil, this energy is not lost as heat (as in an ohmic resistor, for example), but can be recovered later when the current in the coil is switched off. In the meantime, this energy is stored in the magnetic field of the coil. Analogous to the electrical energy density $w_{el} = \frac{1}{2} \varepsilon_0 \vec{E}(\vec{r})^2$, the magnetic energy density is

$$w_m = \frac{1}{2\mu_0} \vec{B}(\vec{r})^2, \qquad [w_m] = \mathrm{\frac{J}{m^3}}. \tag{7.24}$$
That is, the magnetic energy density increases with the square of the magnetic field.

Example: Magnetic Energy in a Coil
As an example, let us calculate the energy stored in a long thin coil. For this we assume a coil with N windings, coil length l, cross-sectional area A of the coil and current I. The volume contained in the coil is thus given by V = Al. Also, we know the magnetic field in the coil from Eq. (7.19): $B = \mu_0 \frac{NI}{l}$. The energy stored in the magnetic field of the coil is thus

$$W = w_m V = \frac{1}{2\mu_0} B^2 A l = \frac{1}{2\mu_0} \mu_0^2 \frac{N^2 I^2}{l^2} A l = \frac{1}{2} \frac{\mu_0 N^2 A}{l} I^2. \tag{7.25}$$

7.3.6
Outlook: Vector Potential
In electrostatics we have seen that electric fields can always be calculated as the gradient of a potential, the electric potential: $\vec{E} = -\nabla \varphi_{el}$. For magnetic fields this is not possible, because magnetic fields are so-called vortex fields, where the field lines are closed in themselves. Thus the path integral $\int_A^B \vec{B} \cdot d\vec{r}$ between two points A and B is not uniquely defined, but depends on the path taken. However, magnetic fields can be represented as a so-called rotation (curl) of a vector potential $\vec{A}$. The rotation is calculated by forming the cross product between the Nabla operator ∇ and the vector potential:

$$\vec{B} = \nabla \times \vec{A} = \begin{pmatrix} \partial/\partial x \\ \partial/\partial y \\ \partial/\partial z \end{pmatrix} \times \begin{pmatrix} A_x \\ A_y \\ A_z \end{pmatrix} = \begin{pmatrix} \partial A_z/\partial y - \partial A_y/\partial z \\ \partial A_x/\partial z - \partial A_z/\partial x \\ \partial A_y/\partial x - \partial A_x/\partial y \end{pmatrix}. \tag{7.26}$$

7.4
Magnetism in Matter
Now we want to consider the case that a magnetic field penetrates into matter. We already know that also the single atoms, of which the matter consists, can be small magnetic dipoles. But whether matter as a whole is magnetic depends on whether the orientation of these elementary magnets is randomly distributed. For if this is the case, the magnetic effects of the individual dipoles cancel each other. If, however, the dipoles have a preferred direction, the whole medium gets a magnetization M, which describes the average magnetic dipole moment per volume. The magnetization can either be permanent, that is, it can exist even without applying an external
magnetic field (this is the case of ferromagnetism), or the magnetization is induced by the external field. There are two fundamentally different mechanisms by which magnetization is caused by an external field. Depending on the mechanism, one speaks either of diamagnetism or paramagnetism. In all these cases, the microscopic dipoles generate their own magnetic field via the magnetization, which is superimposed on the external magnetic field. With the definition of the magnetic field strength

$$\vec{H} = \frac{\vec{B}}{\mu_0} - \vec{M}, \qquad [\vec{H}] = \mathrm{\frac{A}{m}}, \tag{7.27}$$

analogous to the definition of electric flux density $\vec{D}$ in electrostatics, the effect of the magnetization can be taken into account. There is a linear relationship between the $\vec{H}$ field and the $\vec{B}$ field,

$$\vec{H} = \frac{\vec{B}}{\mu_0 \mu_r}, \tag{7.28}$$

with the unitless material constant μr, the so-called relative permeability. With this definition, the $\vec{H}$ field is constant at the transition between media with different μr as long as no electric currents flow at the boundary between the media. The $\vec{B}$ field, on the other hand, scales with the corresponding value of μr.
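Combining (7.27) and (7.28) gives the magnetization that a given H field produces in a material: M = B/μ0 − H = (μr − 1)H. The short sketch below evaluates this for the example permeabilities quoted in this chapter; the H field of 1000 A/m and the value μr = 1000 for iron (chosen from within the quoted range) are assumptions for illustration.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # magnetic field constant in T*m/A


def b_field(h, mu_r):
    """B = mu_0 * mu_r * H, Eq. (7.28) solved for B."""
    return MU_0 * mu_r * h


def magnetization(h, mu_r):
    """M = B / mu_0 - H, Eq. (7.27) solved for M."""
    return b_field(h, mu_r) / MU_0 - h


h_ext = 1000.0  # A/m, illustrative value
materials = {
    "lead (diamagnetic)": 0.999,
    "aluminium (paramagnetic)": 1 + 1e-5,
    "iron (ferromagnetic)": 1000.0,
}

for name, mu_r in materials.items():
    print(f"{name}: M = {magnetization(h_ext, mu_r):.3g} A/m")
```

The sign of M distinguishes the cases: negative for the diamagnet, where the internal field opposes the external one, slightly positive for the paramagnet and very large for the ferromagnet.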
7.4.1
Diamagnetism
Diamagnetic substances do not have magnetic dipoles without an external magnetic field, i.e. the magnetic dipoles in the interior are only caused by the magnetic field. The external field $\vec{B}_{ext}$ influences the movement of the electrons in the atoms and forces the electrons into a circular orbit around the respective atomic nucleus (see Fig. 7.10). In order for a circular orbit to occur, a centripetal force is required. This is caused by the Lorentz force, which for this reason must point in the direction of the atomic nucleus. Thus the direction of motion of the electrons on the circle is fixed. The internal magnetic field $\vec{B}_{int}$ caused by the circular current (opposite to the direction of movement of the electron) is thus opposed to the external magnetic field, i.e. the external magnetic field is weakened by the internal field. This is expressed by the fact that the relative permeability of diamagnetic materials is μr ≲ 1. For example, for lead μr = 0.999.
7.4.2
Paramagnetism
In contrast to diamagnetic substances, paramagnetic substances have permanent magnetic dipoles inside them, that is, the paramagnet consists of many small permanent magnets. Here, only small forces act between the individual dipoles, that is, the forces are so small that the orientation of the individual dipoles with respect to each other is random, for example because of their thermal motion. Then the magnetic effect of the small magnets averages out, and the magnetization is $\vec{M} = 0$. However, if an external magnetic field is applied, the elementary magnets can align themselves in this field like small compass needles in the Earth’s magnetic field (Fig. 7.10b). The resulting internal magnetic field is parallel to the external magnetic field and amplifies the external field. Paramagnetic materials have a relative permeability of μr ≳ 1, for example aluminium has $\mu_r = 1 + 10^{-5}$.

Fig. 7.10 (a) In diamagnetism, the external $\vec{B}$ field forces the electrons into circular orbits. This generates an internal $\vec{B}$ field which is directed in the opposite direction to the external field. (b) In paramagnetism, permanent magnetic dipoles are aligned by the external $\vec{B}$ field in such a way that the external field is strengthened. (c) In ferromagnetism, the permanent magnetic dipoles are so strong that adjacent dipoles align with each other. This results in a permanent macroscopic magnetization of the medium
7.4.3
Ferromagnetism and Hysteresis
Ferromagnetic materials (e.g., iron and nickel) are actually also paramagnetic, that is, permanent magnetic dipoles exist in the medium, with the difference that now the forces between the individual dipoles are strong (see Fig. 7.10c). Correspondingly, the relative permeability is μr ≫ 1; for iron, for example, 300 < μr < 10,000, depending on the nature of the metal. The forces between dipoles are now so strong that adjacent dipoles can align with each other. On a macroscopic scale, there are whole regions (domains) in the material where the dipoles point in one direction on average (see Fig. 7.11). These regions are called Weiss domains. The magnetization of the domains can be influenced by an external magnetic field. If, for example, one starts with non-magnetic iron, in which the magnetization of different regions cancels each other, and increases the external $\vec{H}$ field, the magnetization of the material and thus the $\vec{B}$ field initially increases linearly with $\vec{H}$ according to Eq. (7.28). Microscopically, more and more Weiss domains flip to the direction of the external field. At some point, however, all the domains are oriented, and a further increase of the external field has no effect anymore. The $\vec{B}$ field therefore saturates at the value $\vec{B}_S$. If the external field is reduced again, the $\vec{B}$ field also decreases, but the Weiss domains remain partially aligned due to their mutual forces. Therefore, a remanence field $\vec{B}_R$ remains in the ferromagnet when the external magnetic field is switched off. To completely eliminate the magnetization in the material, one must instead apply an external field in the opposite direction with coercivity $\vec{H}_c$. All these effects (saturation, remanence and disappearance of magnetization at coercivity) are present in the same way when the external field is applied in the opposite direction. Therefore, if one plots the $\vec{B}$ field against the external $\vec{H}$ field, one obtains a magnetic hysteresis curve that is point-symmetric about the zero point (see Fig. 7.11b). The phenomenon of hysteresis describes the fact that there are two possible values for a measured quantity (here the $\vec{B}$ field) for each value of a certain parameter (here the $\vec{H}$ field). Which value the $\vec{B}$ field takes when changing the $\vec{H}$ field depends on the previous value of $\vec{B}$.

Fig. 7.11 (a) In a ferromagnet, the magnetic dipoles in domains align themselves by mutual forces, so that each domain is magnetized. (b) Hysteresis curve of the magnetic flux density $\vec{B}$ with a change in the external magnetic field $\vec{H}$. Once all Weiss domains are flipped, the flux density saturates at the value $\vec{B}_S$. If one then switches off the external magnetic field, a remanence field $\vec{B}_R$ remains. Only at the coercivity $\vec{H}_c$ does the $\vec{B}$ field become zero again

Written Test: Magnetic Hysteresis
What cannot be read from the hysteresis curve of the $\vec{B}$ field of a material with an applied magnetic field $\vec{H}$?
(a) whether the material is ferromagnetic or not,
(b) how large the remanence of the magnetic field is,
(c) how large the saturation field is,
(d) how large an opposing magnetic field must be in order to reverse the magnetization,
(e) how many Weiss domains there are.

Solution
What you cannot tell from the hysteresis curve is the number of Weiss domains.
7.5
Magnetostatics – Compact
Here again the most important formulas for magnetostatics are summarized:

Lorentz force on moving charges (7.2):
$$\vec{F}_L = q \cdot \vec{v} \times \vec{B}.$$

Lorentz force on straight wire (7.9):
$$\vec{F}_L = I \cdot \vec{d} \times \vec{B}.$$

Lorentz force between two straight conductors (7.10):
$$F_L = \frac{\mu_0}{2\pi} \cdot \frac{I_1 I_2 d}{r}.$$

Hall voltage (7.12):
$$U_H = \frac{IB}{ned}.$$

Ampère’s law (7.14):
$$\oint_K \vec{B} \cdot d\vec{r} = \mu_0 I.$$

Magnetic flux through any closed surface (7.23):
$$\oint_O \vec{B} \cdot d\vec{A} = 0.$$

Magnetic field of a straight wire (7.15):
$$B(r) = \frac{\mu_0 I}{2\pi r}.$$

Magnetic dipole moment of a circular current (7.16), its energy in the magnetic field (7.17), and the corresponding torque on the dipole (7.18):
$$\vec{\mu} = I \vec{A},$$
$$U = -\vec{\mu} \cdot \vec{B},$$
$$\vec{M} = \vec{\mu} \times \vec{B}.$$

Magnetic field of the long thin coil (7.19):
$$B = \mu_0 \frac{N I}{l}.$$

Relationship between $\vec{B}$ field and $\vec{H}$ field (7.28):
$$\vec{H} = \frac{\vec{B}}{\mu_0 \mu_r},$$
with μr ≲ 1 for diamagnetism, μr ≳ 1 for paramagnetism, and μr ≫ 1 for ferromagnetism.
8
Electrodynamics
8.1
Relationship Between Electric and Magnetic Fields
In electromagnetism, electric and magnetic phenomena are brought into a common context. Formally, electric and magnetic fields are linked by Maxwell’s equations (Sect. 8.3). Unfortunately, there is no easy-to-understand rigorous derivation of Maxwell’s equations. Therefore, in this section we restrict ourselves to the description of induction as an essential phenomenon of electromagnetism. Induction will then also play a role in Chap. 9 on electronic components in connection with coils. Before we go into induction in more detail, we will use an easy-to-understand thought experiment to show why there must be a connection between electric and magnetic fields. For this purpose we imagine an observer at rest in the laboratory system S, who measures the force effect on a charge q moving with constant velocity $\vec{v}$ in a magnetic field $\vec{B}$. An electric field is not supposed to exist in the laboratory system, that is, $\vec{E} = 0$. As we have seen in Sect. 7.2, in this case the Lorentz force acts on the charge with

$$\vec{F}_L = q \cdot \vec{v} \times \vec{B}. \tag{8.1}$$
Now we imagine that the observer is moving with the same velocity and in the same direction as the charge, that is, with $\vec{v}$. We denote this system, which is moving along with the observer, by S′. From the perspective of the co-moving observer, the charge is now at rest, that is, he or she measures the velocity of the charge $\vec{v}' = 0$. Therefore, the Lorentz force in the co-moving system is

$$\vec{F}_L' = q \cdot \vec{v}' \times \vec{B}' = 0. \tag{8.2}$$
Nevertheless, the observer in the co-moving system measures the same force effect on the charge as the observer at rest. Let us first prove this statement: The force acting on the charge leads to an acceleration $\vec{a} = \ddot{\vec{r}}(t)$ in the system at rest, with the position function $\vec{r}(t)$ of the charge. The position function in the co-moving system is given by

$$\vec{r}'(t) = \vec{r}(t) - \vec{v} \cdot t. \tag{8.3}$$

The acceleration $\vec{a}' = \ddot{\vec{r}}'(t)$ in the co-moving system

$$\vec{a}' = \frac{d^2}{dt^2}\left(\vec{r}(t) - \vec{v} \cdot t\right) = \frac{d}{dt}\left(\dot{\vec{r}}(t) - \vec{v}\right) = \ddot{\vec{r}}(t) = \vec{a} \tag{8.4}$$

is the same as the acceleration in the laboratory system. Using Newton’s second law $\vec{F} = m\vec{a}$ we conclude that the forces must be equal:

$$\vec{F}' = \vec{F}. \tag{8.5}$$
According to (8.2), the force in the co-moving system cannot be caused by the Lorentz force, since the charge in this system does not move. Nevertheless, according to (8.1), the force is proportional to the charge q. The only force we know which acts on charges at rest is the electric force according to (6.7):

$$\vec{F}' = q \cdot \vec{E}'. \tag{8.6}$$

To generate this force, there must be an electric field $\vec{E}'$ in the moving system. By equating (8.6) with (8.1)

$$q \cdot \vec{E}' = q \cdot \vec{v} \times \vec{B} \tag{8.7}$$

one obtains the electric field strength

$$\vec{E}' = \vec{v} \times \vec{B}. \tag{8.8}$$

Thus the magnetic field $\vec{B}$ in the system at rest transforms into an electric field $\vec{E}'$ in the moving system. Conversely, an electric field $\vec{E}$ in the system at rest can also generate a magnetic field $\vec{B}'$ in the moving system (without proof), with

$$\vec{B}' = -\frac{1}{c^2}\, \vec{v} \times \vec{E}, \tag{8.9}$$

where c is the speed of light. By the way, the formulas (8.8) and (8.9) are valid only for velocities which are small compared to the speed of light, v ≪ c. A relation valid for all velocities is given in special relativity by the Lorentz transformation of electromagnetic fields. However, we do not want to go deeper into this topic here; in the following we will instead get to know the phenomenon of induction in more detail.
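The thought experiment of this section can be replayed with concrete numbers. The sketch below computes the Lorentz force in the laboratory system from (8.1) and the electric force in the co-moving system from (8.6) and (8.8), and shows that both agree. The helper functions and the numerical values (an elementary charge moving at 1000 m/s through a field of 0.5 T) are assumptions for illustration; as stated above, the formulas only apply for v ≪ c.

```python
C_LIGHT = 299_792_458.0     # speed of light in m/s
E_CHARGE = 1.602176634e-19  # elementary charge in C


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(s, a):
    """Multiply a 3-vector by a scalar."""
    return tuple(s * c for c in a)


def lorentz_force(q, v, b):
    """F_L = q * v x B in the laboratory system, Eq. (8.1)."""
    return scale(q, cross(v, b))


def e_field_moving_frame(v, b):
    """E' = v x B in the co-moving system, Eq. (8.8)."""
    return cross(v, b)


def b_field_moving_frame(v, e):
    """B' = -(1/c^2) * v x E in the co-moving system, Eq. (8.9)."""
    return scale(-1.0 / C_LIGHT ** 2, cross(v, e))


v = (1.0e3, 0.0, 0.0)   # velocity of the charge in m/s
b = (0.0, 0.0, 0.5)     # magnetic field in the laboratory system in T

force_lab = lorentz_force(E_CHARGE, v, b)
e_prime = e_field_moving_frame(v, b)
force_moving = scale(E_CHARGE, e_prime)   # F' = q * E', Eq. (8.6)

print(e_prime)
print(force_lab == force_moving)
```

The co-moving observer attributes the force to an electric field of 500 V/m pointing in the negative y direction, while the observer at rest attributes the same force to the magnetic field.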
8.2
Induction Laws
We have seen in Sect. 8.1 that there is a connection between electric and magnetic fields as soon as one moves from one inertial frame to a second reference frame. With induction phenomena, on the other hand, we consider a fixed reference frame. What changes is the electric flux ϕel(t) or the magnetic flux ϕm(t) through an open bounded surface A with boundary curve K (Fig. 8.1). The time dependence could be caused, for example, by a magnet moving relative to the surface, or by a time-varying magnetic field being generated by an alternating current. Regardless of the actual cause, however, in principle the time dependence could also always be artificially generated by choosing a suitable field distribution $\vec{E}(\vec{r})$ or $\vec{B}(\vec{r})$ in the reference frame and moving the surface A with a suitable position function in the reference frame. An example of this is the dynamo, in which a coil is rotated in a homogeneous magnetic field. This causes the flux through the coil to oscillate sinusoidally (see Sect. 8.2.4). The reference frame of the surface is thus moved relative to the inertial frame, whereby electric and magnetic fields are transformed into each other. It is irrelevant here whether the surface actually moves in this way or not; the physically measurable and thus relevant quantity is only the variable flux. For this reason, the transformation of the fields occurs in all systems in which the flux changes. In particular, an electric flux that changes as a function of time causes a corresponding magnetic field to emerge. And conversely, a magnetic flux that changes as a function of time causes the emergence of a corresponding electric field. Both variants are generally referred to as induction.
Fig. 8.1 (a) Magnetic induction: A varying electric flux through the area A contributes to the magnetic path integral. (b) Example of magnetic induction: If a capacitor is charged, a magnetic vortex field is generated between the capacitor plates
8.2.1
Induction of Magnetic Fields
For the induction of magnetic fields (i.e. magnetic induction), one considers a surface A in space, which is bounded by the boundary curve K (Fig. 8.1). As we already know from magnetostatics, the magnetic path integral along the curve K according to Ampère’s law (7.14) is given by the current I through the area:

$$\oint_K \vec{B} \cdot d\vec{r} = \mu_0 I. \tag{8.10}$$

In electrodynamics, this law must be supplemented by another term. A magnetic vortex field $\vec{B}(t)$ can also be caused if the electric flux through the surface A changes in time. The electric flux ϕel was introduced in electrostatics in (6.9) as an integral of the electric field over the area A:

$$\phi_{el} = \int_A \vec{E} \cdot d\vec{A}. \tag{8.11}$$

The law of induction for magnetic fields combines Ampère’s law (8.10) with the temporal change of the electric flux:

$$\oint_K \vec{B} \cdot d\vec{r} = \mu_0 I + \mu_0 \varepsilon_0 \cdot \frac{d}{dt} \int_A \vec{E} \cdot d\vec{A}. \tag{8.12}$$
An example of magnetic induction can be observed when charging a plate capacitor: Exercise: Magnetic Field in a Plate Capacitor We have learned about the plate capacitor in electrostatics. It consists of two parallel plates (Fig. 8.1b). When a voltage U is applied to the plates, a charging current IL flows and the plates are charged with the charges and +Q -Q, respectively. What is the magnitude of the magnetic field between the plates during charging as a function of IL ? Solution For the solution we consider the magnetic vortex field along the circular path → sketched in Fig. 8.1b and assume that the corresponding magnetic field B is oriented circularly (azimuthally) and depends only on the radial distance (continued)
8.2 Induction Laws
197
r from the center of the plate, that is, B = B(r). The vortex field along the circular path with radius r is thus given by ! B . dr = BðrÞ . 2πr,
→
ð8:13Þ
K
analogous to the calculation of the magnetic field around a long, straight, current-carrying conductor in Sect. 7.3. The difference here is that no direct current I flows through the area covered by the circuit, because the charging current IL ends at the two plates. In the capacitor, however, there is an electric field which changes in time during the charging process. According to (6.17) E=
Q : ε0 . A
ð8:14Þ
Since the electric field in the plate capacitor is homogeneous, the electric flux is given by ! E . dA = E . πr2 :
→
ð8:15Þ
A
The time derivative of the electric flux is related to the current via the relation I = dQ/dt: ! d d E . dA = ðE . πr 2 Þ = dt dt
→
d dt
Qπr2 ε0 . A
= IL .
πr 2 : ε0 . A
ð8:16Þ
A
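We put these results into (8.12):
$$\oint_K \vec{B} \cdot d\vec{r} = \underbrace{\mu_0 I}_{=0} + \mu_0 \varepsilon_0 \cdot \frac{d}{dt} \int_A \vec{E} \cdot d\vec{A}, \tag{8.17}$$
$$B(r) \cdot 2\pi r = \mu_0 \varepsilon_0 \cdot I_L \cdot \frac{\pi r^2}{\varepsilon_0 \cdot A} \tag{8.18}$$
and solve for the magnetic field B(r):
$$B(r) = \frac{\mu_0 I_L}{2A} \cdot r, \tag{8.19}$$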
that is, the magnetic field increases linearly with the distance from the centre of the capacitor outwards. This relation is valid as long as the considered area is still completely inside the capacitor. Otherwise, edge effects are added, which also depend on the shape of the capacitor plates.
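The linear growth in (8.19) can be made concrete with a short numerical sketch. The plate radius and charging current below are assumed illustrative values, not taken from the text, and the function name is hypothetical; μ0 is approximated by 4π · 10^-7 V s/(A m).

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V s / (A m), approximate value


def b_field_in_capacitor(r, charging_current, plate_area):
    """Magnitude of B in tesla at radius r (in m) between the plates, Eq. (8.19)."""
    return MU_0 * charging_current * r / (2 * plate_area)


# Assumed example: circular plates with radius 5 cm, charging current 1 A
plate_radius = 0.05
plate_area = math.pi * plate_radius**2

for r in (0.0, 0.025, 0.05):
    b = b_field_in_capacitor(r, 1.0, plate_area)
    print(f"r = {r:.3f} m -> B = {b:.2e} T")
```

At the edge of circular plates the result agrees with the field μ0 IL/(2πr) of a long straight wire carrying the same current, which is a useful plausibility check.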
8.2.2
Induction of Electric Fields – Faraday’s Law of Induction
Compared to magnetic induction, the following process, in which a magnetic field that changes over time causes an electric field, is generally better known, since this process leads to electrical voltages and currents that can be used for many technical applications. In this case, one also speaks of Faraday’s law of induction. Here, the temporal change of a magnetic flux through a surface A causes an electric vortex field along the boundary curve K of the surface (Fig. 8.2). The following applies to the path integral:
$$\oint_K \vec{E} \cdot d\vec{r} = -\frac{d}{dt} \int_A \vec{B} \cdot d\vec{A}. \tag{8.20}$$
Faraday’s law of induction is generally valid for all surfaces A in space, with their respective boundary curve K. In particular, it is also valid if the boundary curve is formed by an electrically conducting wire, that is, by a conductor loop. If one breaks the conducting wire at one point and integrates the electric field over the conductor without the gap (Fig. 8.2b), one obtains with the definition from (6.22) the (positively counted) induction voltage
$$U_{ind} = -\frac{d\phi_m}{dt}, \tag{8.21}$$
where we have inserted the definition of magnetic flux $\phi_m = \int_A \vec{B} \cdot d\vec{A}$ according to (7.21). If we close the gap in the conductor, the induction voltage causes an electric induction current to flow in a circle in the conductor. The current Iind is related to the induction voltage via Ohm’s law:
$$U_{ind} = R \cdot I_{ind}, \tag{8.22}$$
with the ohmic resistance R.
Fig. 8.2 (a) Faraday’s law of induction: A varying magnetic flux through the surface A causes an electric vortex field. (b) If the electric vortex field is integrated along the edge, the electric voltage U is obtained, the induction voltage
8.2.3
Lenz’s Rule
Let us now take a closer look at the negative sign in Faraday’s induction law according to (8.21). Obviously the induction voltage is opposite to the change of the magnetic flux. Its meaning is described in Lenz’s rule:
Lenz’s Rule
The voltage induced in an electrical conductor by a change in magnetic flux causes a current to flow in the conductor, the direction of which is oriented so that the magnetic field generated by the current opposes the change in flux with time.
The direction in which the induction current flows will be demonstrated by means of an example. For this purpose, we will consider a conductor loop towards which a bar magnet is moving with the north pole in front (Fig. 8.3a). In the sketch, there is a magnetic flux from bottom to top through the conductor loop. As the bar magnet approaches the loop, this upward flux increases. Thus, the flux change dϕm points upward. Therefore, according to Lenz’s rule, the magnetic field generated by the induced current must point downward. According to the right-hand rule for magnetic fields of conductor loops and coils (see Fig. 7.7), this corresponds to a clockwise current in the loop. The conductor loop then behaves like a magnet with its north pole pointing downwards. This results in a repulsive mechanical force on the bar magnet moving upwards. The force is thus opposite to the motion.
In Fig. 8.3b, we consider the opposite direction of motion of the bar magnet, which is still oriented with its north pole facing upward toward the loop, but is now moving downward, away from the loop. As before, the bar magnet generates a magnetic flux upwards through the conductor loop. If the distance between the bar magnet and the loop increases, this upward flux decreases, that is, the flux change dϕm points downward. The induction current must therefore generate a magnetic field which points upwards. This corresponds to a counterclockwise current in the loop. Now the conductor loop behaves like a magnet with its north pole pointing upwards. Its south pole thus produces an attractive force on the bar magnet, which is moving away. So also in this case the mechanical force is opposite to the movement.
Fig. 8.3 Induction current in a conductor loop. (a) A bar magnet moves with its north pole towards the loop, causing the magnetic flux ϕm to increase. (b) The bar magnet is oriented as in (a) but moves away from the loop; this causes the magnetic flux ϕm to decrease. The induction currents in (a) and (b) flow in opposite directions. In both cases, there is a force on the bar magnet that slows down the movement
Fig. 8.4 (a) A square conductor loop with side length d = 5 cm is moved at constant speed v = 2 m/s from an area with B = 0 to an area with a homogeneous magnetic field with B = 0.3 T, which penetrates the loop perpendicularly. (b) This induces a current I in the conductor loop
Written Test: Induction in a Conductor Loop
Consider the situation sketched in Fig. 8.4a: A square conductor loop with side length d = 5 cm is moved at constant speed v = 2 m/s from an area x < 0 with B = 0 into an area x > 0 with a homogeneous magnetic field with B = 0.3 T which penetrates the loop perpendicularly. The conductor loop has an ohmic resistance of R = 0.1 Ω. What is the current I induced in the loop as a function of the position of the conductor loop xL?
Solution
We calculate the time change of the magnetic flux through the loop area. For xL < 0, the loop is completely in the left half with B = 0, so when the loop moves, the flux ϕm = 0 does not change.
In this case, there is no induction voltage and therefore no induction current. The same is true for the region xL > d, where the loop is completely in the right half with magnetic field B. In this case there is a non-zero flux ϕm = B · d², but it does not change either. Induction only occurs when the flux through the loop changes. This is the case when the loop partially overlaps with the right half. The flux is then given by the magnetic field penetrating the shaded area of the surface:
$$\phi_m = B \cdot d \cdot x_L. \tag{8.23}$$
The loop now moves into the right-hand region at $v = \frac{dx_L}{dt}$, increasing the shaded area. This causes the flux to increase:
$$\frac{d\phi_m}{dt} = B \cdot d \cdot \frac{dx_L}{dt} = B \cdot v \cdot d. \tag{8.24}$$
The induction voltage is thus given by
$$U_{ind} = -B \cdot v \cdot d, \tag{8.25}$$
and the induction current is
$$I_{ind} = \frac{U_{ind}}{R} = -\frac{B \cdot v \cdot d}{R} = -\frac{0.3\,\mathrm{T} \cdot 2\,\mathrm{m/s} \cdot 0.05\,\mathrm{m}}{0.1\,\Omega} = -0.3\,\mathrm{A}. \tag{8.26}$$
Since the magnetic field points out of the sheet plane and the flux increases due to the movement, the direction of the induction current is oriented so that the resulting magnetic field points into the sheet plane. This corresponds to a clockwise current (Fig. 8.4). The function I(xL) is sketched in Fig. 8.4b.
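The piecewise behaviour of I(xL) can also be written down as a small function. This is only a sketch: it assumes that xL denotes the position of the leading edge of the loop, as the solution implies, and the function name and the sample positions are chosen here for illustration.

```python
def induced_current(x_l, b=0.3, v=2.0, d=0.05, r=0.1):
    """Induced current in A for leading-edge position x_l in m, following (8.23)-(8.26)."""
    if 0.0 < x_l < d:
        # The loop straddles the field boundary: the flux B*d*x_l grows at rate B*v*d.
        return -b * v * d / r
    # Loop entirely outside (x_l <= 0) or entirely inside (x_l >= d): flux is constant.
    return 0.0


for x in (-0.02, 0.01, 0.04, 0.08):
    print(f"x_L = {x:+.2f} m -> I = {induced_current(x):+.2f} A")
```

The negative sign carries the convention of (8.25); the physical direction of the current follows from Lenz’s rule.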
8.2.3.1 Eddy Currents
Eddy currents are a special form of induction currents. The term is used whenever a current is induced in an extended electrical conductor, for example, in a metal plate, where the current path is not prescribed by the geometry, such as the shape of a conductor loop. Typically, multiple circular currents (vortices) then form in the extended conductor. Because induction currents and thus also eddy currents always result in a force that slows down the movement, eddy currents are frequently used as so-called eddy current brakes. For example, ICE (intercity express) trains in Germany are braked at high speeds by eddy currents in the metallic rails. For this purpose, strong magnets are located a few millimetres above the rails. In Fig. 8.5 this is schematically sketched for a single magnet. If the magnet moves along the rail, eddy currents are induced in front of and behind the magnet.
Fig. 8.5 ICE trains are braked at high speeds by eddy current brakes. For this purpose, strong magnets are positioned at a distance of a few millimetres above the tracks, which induce eddy currents in the tracks. The Lorentz force on the eddy currents results in a decelerating force on the train
For simplicity, we model this for the calculation as square circular currents whose side length is given by the rail width d = 7 cm. Thus, we can calculate the induction current Iind using Eq. (8.26) in each of the two eddy currents:
$$|I_{ind}| = \frac{B \cdot d \cdot v}{R} = \frac{0.5\,\mathrm{T} \cdot 0.07\,\mathrm{m} \cdot 300\,\mathrm{km/h}}{10^{-5}\,\Omega} = 292\,000\,\mathrm{A}, \tag{8.27}$$
where we have made estimates for the magnetic field B = 0.5 T, the ohmic resistance R = 10^-5 Ω in the metal rail and have chosen v = 300 km/h ≈ 83.3 m/s as the velocity. The decelerating force now comes from the Lorentz force on the eddy current. According to (7.9) it is given by
$$F_L = 2 \cdot I_{ind} \cdot d \cdot B = \frac{2 B^2 \cdot d^2 \cdot v}{R} = \frac{2 \cdot (0.5\,\mathrm{T})^2 \cdot (0.07\,\mathrm{m})^2 \cdot 300\,\mathrm{km/h}}{10^{-5}\,\Omega} = 20\,417\,\mathrm{N}, \tag{8.28}$$
where the factor 2 results from the presence of two eddy currents in front of and behind the magnet. According to actio = reactio, the force on the ICE is just as great. In the ICE, this force is multiplied by the fact that there are several of these magnets above the tracks on each axle.
Exercise: Induction Cooker
In an induction cooker, an alternating magnetic field is generated which induces an eddy current in the bottom of the pot placed on the cooker. Let us assume that the magnetic field perpendicular to the bottom of the pot oscillates harmonically with
$$B(t) = B_0 \cdot \cos(\omega \cdot t) \tag{8.29}$$
and the radius of the circular current in the pot is given by the pot radius r. What is the heating power P(t) = U(t) · I(t), with which the pot can be heated? For this, assume that the resistance for a circuit along the circumference 2πr of the eddy current is given by R = R′ · (2πr) with the resistance R′ per length of the current flow. How does the heating power increase with the radius of the pot?
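Solution
The induction voltage along the circular circumference of the pot with radius r is given by Faraday’s law of induction by
$$U_{ind} = -\frac{d\phi_m}{dt} = -\frac{d}{dt}\left(B_0 \cdot \cos(\omega \cdot t) \cdot \pi r^2\right) = B_0 \cdot \omega \cdot \pi r^2 \cdot \sin(\omega \cdot t). \tag{8.30}$$
According to Ohm’s law, the corresponding current is given by
$$I_{ind} = \frac{U_{ind}}{R} = \frac{B_0 \cdot \omega \cdot \pi r^2 \cdot \sin(\omega \cdot t)}{R' \cdot (2\pi r)} = \frac{B_0 \cdot \omega \cdot r}{2R'} \cdot \sin(\omega \cdot t). \tag{8.31}$$
This results in the electrical heating power
$$P(t) = U_{ind} \cdot I_{ind} = \frac{B_0^2 \cdot \omega^2 \cdot \pi \cdot r^3}{2R'} \cdot \sin^2(\omega \cdot t). \tag{8.32}$$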
The heating power therefore increases with the third power of the pot radius. However, since the mass of the pot bottom to be heated only increases quadratically with the radius (with the area of the bottom), large pots heat up faster than small pots. In reality, this result is only correct to a limited extent, as modern induction cookers have several small coils integrated into them, which can detect the size and shape of the pot and which are automatically switched on or off depending on this.
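The cubic scaling can be checked with a few lines of code. The field amplitude, frequency and resistance per length below are assumed illustrative values, not data from the text, and the function name is hypothetical; the sketch evaluates (8.32) at the instant of maximum power.

```python
import math


def heating_power(t, b0, omega, radius, r_per_length):
    """Instantaneous heating power in W according to Eq. (8.32)."""
    amplitude = b0**2 * omega**2 * math.pi * radius**3 / (2 * r_per_length)
    return amplitude * math.sin(omega * t) ** 2


# Assumed illustrative values
B0 = 0.01                      # field amplitude in T
OMEGA = 2 * math.pi * 25e3     # angular frequency in 1/s
R_PER_LENGTH = 0.1             # resistance per length in ohm/m
T_PEAK = math.pi / (2 * OMEGA)  # instant at which sin^2(omega t) = 1

p_small = heating_power(T_PEAK, B0, OMEGA, 0.10, R_PER_LENGTH)
p_large = heating_power(T_PEAK, B0, OMEGA, 0.20, R_PER_LENGTH)
print(f"peak power ratio for doubled radius: {p_large / p_small:.2f}")
```

Doubling the radius multiplies the power by eight while the area of the pot bottom only grows by a factor of four, which is the argument made in the text.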
8.2.4
Generation of Alternating Current – The Dynamo
The phenomenon of induction is also the basis for generating alternating current with the help of a dynamo. A homogeneous magnetic field passes through a rotating coil with N windings (Fig. 8.6). Depending on the position of the surface vector $\vec{A}$ in relation to the magnetic field $\vec{B}$, the magnetic flux is
$$\phi_m(\alpha) = \vec{B} \cdot \vec{A} = B \cdot A \cdot \cos(\alpha). \tag{8.33}$$
Thus, when the coil rotates uniformly with α = ω · t, the flux has a harmonic time dependence:
$$\phi_m(t) = B \cdot A \cdot \cos(\omega \cdot t). \tag{8.34}$$
Fig. 8.6 If a coil is rotated in a homogeneous magnetic field, an alternating voltage U(t) = U0 · sin(ω · t) is induced
The induction voltage for N windings is then given by
$$U_{ind} = -N \cdot \frac{d\phi_m}{dt} = N \cdot B \cdot A \cdot \omega \cdot \sin(\omega \cdot t) = U_0 \cdot \sin(\omega \cdot t). \tag{8.35}$$
This is an AC voltage with an angular frequency ω and amplitude U0 = N . B . A . ω.
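As a plausibility check, the analytic voltage can be compared with a numerical derivative of the flux. The coil data below (100 windings, B = 0.1 T, A = 0.01 m², 50 Hz) are assumed for illustration only, and the function names are hypothetical.

```python
import math


def flux(t, b, area, omega):
    """Magnetic flux through one winding of the rotating coil."""
    return b * area * math.cos(omega * t)


def induced_voltage(t, n, b, area, omega):
    """Analytic induction voltage U_ind = N * B * A * omega * sin(omega * t)."""
    return n * b * area * omega * math.sin(omega * t)


def numeric_voltage(t, n, b, area, omega, dt=1e-7):
    """-N * dphi/dt approximated by a central difference."""
    dphi = flux(t + dt, b, area, omega) - flux(t - dt, b, area, omega)
    return -n * dphi / (2 * dt)


# Assumed illustrative coil data
N, B, A = 100, 0.1, 0.01
OMEGA = 2 * math.pi * 50

u0 = N * B * A * OMEGA
print(f"amplitude U0 = {u0:.2f} V")
t = 0.003
print(f"analytic: {induced_voltage(t, N, B, A, OMEGA):.4f} V")
print(f"numeric:  {numeric_voltage(t, N, B, A, OMEGA):.4f} V")
```

The two values agree to within the accuracy of the finite difference, which confirms the sign and the factor ω in the amplitude.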
8.2.5
Self-Induction
Self-induction is the induction voltage in a coil that arises because the current I(t) in the coil changes over time. This is the case, for example, when an alternating current flows through the coil. Thus the magnetic field B(t) ∝ I(t) generated by the coil and the magnetic flux
$$\phi_m(t) = B(t) \cdot A \propto I(t) \tag{8.36}$$
through the cross-sectional area A of the coil are changing. According to Faraday’s law of induction, the induction voltage in the coil with N windings is then given by
$$U_{ind} = -N \cdot \frac{d\phi_m}{dt} = -N \cdot A \cdot \frac{dB}{dt} \propto \frac{dI(t)}{dt} \tag{8.37}$$
and thus proportional to the time derivative of the current in the coil. For the proportionality factor, a new quantity is introduced, the so-called inductance L:
$$U_{ind} = -L \cdot \frac{dI(t)}{dt}, \qquad [L] = \frac{\mathrm{V \cdot s}}{\mathrm{A}} = \mathrm{H}\ (\mathrm{Henry}). \tag{8.38}$$
The inductance is a characteristic quantity for the respective coil. The negative sign indicates that the induction voltage always counteracts the change in current in the coil. If an alternating voltage is applied to a coil from the outside, the induction voltage opposes this alternating voltage.
Exercise: Inductance of a Long Thin Coil
The magnetic field in a long thin coil of length l and N windings is given according to (7.19) by
$$B = \mu_0 \frac{N I}{l}. \tag{8.39}$$
What is the inductance of the coil?
Solution
We put (8.39) in (8.37) with ϕm(t) = B(t) · A:
$$U_{ind} = -N \cdot A \cdot \frac{dB}{dt} = -\frac{\mu_0 N^2 A}{l} \cdot \frac{dI(t)}{dt}. \tag{8.40}$$
By comparison with (8.38) the inductance is
$$L = \frac{\mu_0 N^2 A}{l}. \tag{8.41}$$
The inductance therefore increases quadratically with the number of windings if the length l is constant. If, on the other hand, the number of windings per length N/l is constant, the inductance increases linearly with the number of windings or with the length of the coil. By comparison with the energy in a long thin coil from Eq. (7.25), it is evident that the energy in a coil can be written compactly as
$$W = \frac{1}{2} L I^2. \tag{8.42}$$
This equation is valid not only for long thin coils, but in general for all current-carrying inductances L.
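To get a feeling for the orders of magnitude, (8.41) and (8.42) can be evaluated for a concrete coil. The coil dimensions and the current below are assumed illustrative values, not taken from the text, and the function names are hypothetical.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability in V s / (A m), approximate value


def solenoid_inductance(windings, area, length):
    """Inductance in H of a long thin coil, Eq. (8.41)."""
    return MU_0 * windings**2 * area / length


def coil_energy(inductance, current):
    """Magnetic energy in J stored in the coil, Eq. (8.42)."""
    return 0.5 * inductance * current**2


# Assumed example: 1000 windings, coil radius 1 cm, length 10 cm, current 2 A
area = math.pi * 0.01**2
inductance = solenoid_inductance(1000, area, 0.1)
print(f"L = {inductance * 1e3:.2f} mH")
print(f"W = {coil_energy(inductance, 2.0) * 1e3:.2f} mJ")
```

Doubling the number of windings at fixed length quadruples the inductance, in line with the N² dependence discussed above.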
8.3
Maxwell Equations
We will conclude this chapter by writing down the four basic equations of electromagnetism, the so-called Maxwell equations. The equations listed here are valid in a vacuum; we will refrain from formulating the corresponding equations valid in media at this point. It should also be mentioned that the equations presented here are the integral form of Maxwell’s equations, that is, integrals occur in the equations. In theoretical physics, however, Maxwell’s equations are usually written down in differential form, in which differential operators such as divergence or curl (rotation) are used instead of integrals; we cannot discuss these further here. The following equations are thus the basis of all electromagnetic phenomena, including electrostatics, magnetostatics, electromagnetic waves and optical phenomena.
The Theorem of Gauss
$$\phi_{el} = \oint_O \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0}, \tag{8.43}$$
where the integral is performed over a closed surface O of a volume in space and Q is the charge in the volume.
The non-existence of magnetic monopoles
$$\phi_m = \oint_O \vec{B} \cdot d\vec{A} = 0. \tag{8.44}$$
Again, the integral is performed over a closed surface O of any volume in space.
Ampère’s law
$$\oint_K \vec{B} \cdot d\vec{s} = \mu_0 I + \mu_0 \varepsilon_0 \cdot \frac{d}{dt} \int_A \vec{E} \cdot d\vec{A}. \tag{8.45}$$
Here the line integral is performed over the closed curve K in space, and A is any surface bounded by K through which the electric current I flows.
Faraday’s law of induction
$$\oint_K \vec{E} \cdot d\vec{s} = -\frac{d}{dt} \int_A \vec{B} \cdot d\vec{A}. \tag{8.46}$$
Again, the line integral is performed over the closed curve K in space, and A is an arbitrary surface bounded by K.
8.4
Electrodynamics – Compact
Here, the most important formulas for electrodynamics are summarized:
Transformation of electric and magnetic fields from the laboratory system to a system moving at speed v ≪ c (8.8) and (8.9):
$$\vec{E}' = \vec{v} \times \vec{B},$$
$$\vec{B}' = -\frac{1}{c^2}\, \vec{v} \times \vec{E}.$$
Induction law for magnetic fields (8.12):
$$\oint_K \vec{B} \cdot d\vec{r} = \mu_0 I + \mu_0 \varepsilon_0 \cdot \frac{d}{dt} \int_A \vec{E} \cdot d\vec{A}.$$
Faraday’s law of induction for electric fields or voltages (8.46) and (8.21):
$$\oint_K \vec{E} \cdot d\vec{r} = -\frac{d}{dt} \int_A \vec{B} \cdot d\vec{A}, \qquad U_{ind} = -\frac{d\phi_m}{dt}.$$
Self-induction in a coil (8.38):
$$U_{ind} = -L \cdot \frac{dI(t)}{dt}.$$
9
Electronics
9.1
Passive Components and Alternating Current
Passive components are all electronic components that do not require their own supply voltage, such as resistors, capacitors and coils, which consist of electrical conductors, but also, for example, diodes, which consist of semiconductors. Active components are typically semiconductor components, for example, electrical amplifiers (transistors). In this case, the energy for amplification must be provided by an additional supply voltage. Since this section is also particularly concerned with the reaction of the components to changing currents, the concept of alternating current through an ohmic resistor is first introduced.
9.1.1
Alternating Current and Alternating Current Resistance
An alternating current is generated by a voltage source which produces a harmonically oscillating voltage
$$U(t) = U_0 \sin(\omega t). \tag{9.1}$$
In Sect. 8.2.4, for example, we learned about the dynamo as an AC voltage source. The alternating voltage is characterized by its amplitude U0 and its frequency f = ω/2π. Typically, AC voltage sources are symbolized in a circuit diagram by a circle with an included tilde (Fig. 9.1). If a circuit is connected to the AC voltage, for example, via an ohmic resistor R, then the AC current flows in the circuit:
$$I(t) = I_0 \sin(\omega t). \tag{9.2}$$
The amplitudes U0 of the alternating voltage and I0 of the alternating current are linked by the ohmic resistance:
$$R = \frac{U_0}{I_0}. \tag{9.3}$$
© Springer-Verlag GmbH Germany, part of Springer Nature 2023 S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_9
Fig. 9.1 (a) An alternating voltage source is connected to a resistor R. This causes an alternating current to flow in the circuit. (b) The resistive power in the resistor oscillates between P = 0 and the maximum value Pmax. On time average, the power is given by Peff = 1/2 Pmax. This can be seen geometrically, since the area above the dashed line fits exactly into the missing area below it
It is important to note here that the amplitude ratio U0/I0 for the ohmic resistor is independent of the frequency of the AC voltage. This is not the case for every component: for the capacitor and the coil, we will see that the corresponding ratio depends on the frequency. In this case we speak of the alternating current resistance
$$R(\omega) = \frac{U_0}{I_0}. \tag{9.4}$$
The AC resistance is also called impedance. In the case of ohmic resistance, the AC resistance is constant, R(ω) = R. The electrical power according to (6.36)
$$P = U I = \frac{U^2}{R} = R I^2 \tag{9.5}$$
for an alternating current in an ohmic resistance is given by
$$P(t) = U_0 I_0 \sin^2(\omega t) = \frac{U_0^2}{R} \sin^2(\omega t) = R I_0^2 \sin^2(\omega t). \tag{9.6}$$
The power oscillates with a sin² function. It is therefore always P ≥ 0 and oscillates between zero and the maximum value Pmax = U0 I0 (Fig. 9.1b). In order to describe how much electrical power a device requires over a longer period of time, one introduces the average value of the power, the effective power:
$$P_{eff} = \langle P(t) \rangle = \frac{1}{2} U_0 I_0 = \frac{U_0^2}{2R} = R \frac{I_0^2}{2}. \tag{9.7}$$
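The time average in (9.7) can be checked numerically by sampling the instantaneous power over one oscillation period. The amplitude and resistance below are assumed illustrative values, and the function name is hypothetical.

```python
import math


def mean_power(u0, resistance, samples=10_000):
    """Average of P(t) = U0^2 / R * sin^2(omega t) over one period (midpoint sampling)."""
    total = 0.0
    for k in range(samples):
        phase = 2 * math.pi * (k + 0.5) / samples  # omega * t within one period
        total += (u0 * math.sin(phase)) ** 2 / resistance
    return total / samples


# Assumed example: amplitude 10 V across a 5 ohm resistor
U0, R = 10.0, 5.0
p_max = U0**2 / R
print(f"P_max = {p_max:.2f} W, <P> = {mean_power(U0, R):.2f} W")
```

The sampled average comes out as half of the maximum power, independent of the frequency.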
The factor 1/2 is obtained by averaging the sin² function over one oscillation period. Usually one now also introduces effective values for the voltage and the current with
$$\frac{U_0^2}{2R} = \frac{U_{eff}^2}{R} \;\rightarrow\; U_{eff} = \frac{U_0}{\sqrt{2}}, \tag{9.8}$$
$$R \frac{I_0^2}{2} = R\, I_{eff}^2 \;\rightarrow\; I_{eff} = \frac{I_0}{\sqrt{2}}. \tag{9.9}$$
The effective values for voltage and current are therefore smaller than the amplitudes by a factor of √2. This is also the case for the voltage from the wall socket.
Example: Power Socket
It is generally known that in many countries an alternating voltage of about U = 230 V comes out of the socket, whereby the frequency is approx. 50 Hz. Contrary to common belief, however, this quantity is not the amplitude value of the voltage, but the effective value. The amplitude of the AC voltage is therefore significantly higher and amounts to
$$U_0 = \sqrt{2}\, U_{eff} = \sqrt{2} \cdot 230\,\mathrm{V} = 325\,\mathrm{V}. \tag{9.10}$$
9.1.2
Alternating Current in the Capacitor
A plate capacitor consists of two opposing metal plates, between which there may be a dielectric. The circuit is thus interrupted at this point. It should therefore not be possible for a current to flow through the capacitor. For direct current (DC) this consideration is correct, but for alternating current one has to analyse voltage and current at the capacitor more exactly. This is because the capacitor is periodically charged or discharged by applying an AC voltage.
Fig. 9.2 (a) An alternating voltage source is connected to a capacitor with capacitance C or to a coil with inductance L. This causes an alternating current to flow in the circuits, which depends on the frequency ω. (b) The electrical power oscillates around the mean value Peff = 0 for the capacitor and the coil. This means that no energy is lost in the sense of ohmic heat. Instead, energy flows periodically into and out of the electric field of the capacitor and the magnetic field of the coil
Let us therefore assume that the capacitor in Fig. 9.2 is subject to an alternating voltage of
$$U(t) = U_0 \sin(\omega t). \tag{9.11}$$
The charge on the capacitor is proportional to the voltage, with the capacitance C as the constant of proportionality:
$$Q(t) = C\, U(t) = C\, U_0 \sin(\omega t). \tag{9.12}$$
The current I, defined as the time derivative of the charge, is then given by
$$I(t) = \frac{dQ(t)}{dt} = C\, U_0\, \omega \cos(\omega t) = I_0 \cos(\omega t). \tag{9.13}$$
The current therefore oscillates with a cos-function and is 90° ahead of the voltage oscillation. The capacitive AC resistance RC is thus
$$R_C(\omega) = \frac{U_0}{I_0} = \frac{1}{\omega C}. \tag{9.14}$$
Thus, the resistance of a capacitor is proportional to 1/ω, that is, at direct current (ω = 0) the resistance is infinite. This is consistent with the capacitor breaking the circuit. However, the greater the frequency, the smaller is the capacitive resistance. The electrical power P(t), which arises with an alternating current through the capacitor, is given by
$$P(t) = U(t)\, I(t) = \frac{U_0^2}{R_C} \sin(\omega t) \cos(\omega t). \tag{9.15}$$
The power oscillates harmonically around the value P = 0 (Fig. 9.2b). A positive instantaneous power P(t) > 0 means that work is being done on the capacitor. The capacitor is charged and the energy is stored in the electric field of the capacitor. Accordingly, a negative instantaneous power P(t) < 0 means that the capacitor is doing work and is reducing its energy in the electric field by discharging. In these processes, no energy is lost as heat, so the time average power is Peff = 0. In this case, the instantaneous power P(t) is also called reactive power.
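Both statements, the 1/ω dependence of the capacitive resistance and the vanishing average power, can be illustrated numerically. The capacitance, amplitude and frequencies below are assumed example values, and the function names are hypothetical.

```python
import math


def capacitive_resistance(omega, capacitance):
    """Capacitive AC resistance R_C = 1 / (omega * C), Eq. (9.14)."""
    return 1.0 / (omega * capacitance)


def mean_capacitor_power(u0, omega, capacitance, samples=10_000):
    """Average of P(t) = U(t) * I(t) over one period, using (9.11) and (9.13)."""
    i0 = u0 * omega * capacitance
    period = 2 * math.pi / omega
    total = 0.0
    for k in range(samples):
        t = period * (k + 0.5) / samples
        total += u0 * math.sin(omega * t) * i0 * math.cos(omega * t)
    return total / samples


# Assumed example: 1 microfarad capacitor, amplitude 325 V
C, U0 = 1e-6, 325.0
for frequency in (50.0, 500.0, 5000.0):
    omega = 2 * math.pi * frequency
    print(f"f = {frequency:6.0f} Hz -> R_C = {capacitive_resistance(omega, C):8.1f} ohm")

print(f"mean power at 50 Hz: {mean_capacitor_power(U0, 2 * math.pi * 50.0, C):.1e} W")
```

The resistance drops by a factor of ten for every tenfold increase in frequency, and the averaged power vanishes up to rounding errors.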
9.1.3
Alternating Current in the Coil
We consider here an ideal coil, where we assume that the wire from which it is wound has no ohmic resistance. This means that when a direct current is passed through the coil, the voltage across the coil is zero. In fact, there are coils made of superconducting wire that have this property. Coils wound from normal wire, however, all have an ohmic resistance, which we will neglect here for the moment. If we now connect an ideal coil to an AC voltage source, as shown in Fig. 9.2, self-induction occurs in the coil (see Sect. 8.2.5). The induction voltage Uind at the coil has the opposite sign compared to the voltage of the voltage source U(t), so that using (8.38)
$$U(t) = -U_{ind} = L \frac{dI(t)}{dt}. \tag{9.16}$$
For an AC voltage
$$U(t) = U_0 \sin(\omega t) \tag{9.17}$$
the current I(t) results by integration of Eq. (9.16):
$$I(t) = -\frac{U_0}{\omega L} \cos(\omega t) = -I_0 \cos(\omega t). \tag{9.18}$$
The current therefore oscillates with a (-cos) function and lags behind the voltage oscillation by 90°. Thus the inductive AC resistance RL is given by
$$R_L(\omega) = \frac{U_0}{I_0} = \omega L. \tag{9.19}$$
The time-dependent electric power in the ideal coil is given by
$$P(t) = U(t)\, I(t) = -\frac{U_0^2}{R_L} \sin(\omega t) \cos(\omega t). \tag{9.20}$$
Analogous to the capacitor, the power at the coil also oscillates harmonically around the value P = 0 (Fig. 9.2b). Here, too, the time average is Peff = 0, that is, no electrical energy is lost as heat. Positive or negative values of P(t) mean that electrical energy is stored as a magnetic field in the coil, or that energy from the magnetic field is converted into electrical energy, respectively. Therefore, P(t) is also referred to as reactive power in the coil.
Written Test: AC Resistance in Coil and Capacitor
A capacitor and a coil are connected to an alternating voltage with an angular frequency of ω = 2π · 50 Hz. How large must the capacitance C or the inductance L be so that the AC resistance is R = 1 kΩ in each case?
Solution
In the case of the capacitor, we solve (9.14) for the capacitance:
$$C = \frac{1}{\omega R} = \frac{1}{2\pi \cdot 50\,\mathrm{s^{-1}} \cdot 10^3\,\mathrm{V/A}} = 3.2\,\mu\mathrm{F}. \tag{9.21}$$
By the way, capacitors can be bought with capacitances in a huge range between 1 pF ≲ C ≲ 1000 F. In the case of the coil, we solve (9.19) for the inductance:
$$L = \frac{R}{\omega} = \frac{10^3\,\mathrm{V/A}}{2\pi \cdot 50\,\mathrm{s^{-1}}} = 3.2\,\mathrm{H}. \tag{9.22}$$
This value for inductance is quite large. Normal coils in electrical circuits tend to have inductances in the range between 1 μH ≲ L ≲ 1 mH.
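The two results of the written test can be reproduced with a few lines; the function names are chosen here for illustration and are not part of the text.

```python
import math


def capacitance_for_resistance(omega, resistance):
    """Capacitance in F that yields the AC resistance R at angular frequency omega, from (9.14)."""
    return 1.0 / (omega * resistance)


def inductance_for_resistance(omega, resistance):
    """Inductance in H that yields the AC resistance R at angular frequency omega, from (9.19)."""
    return resistance / omega


omega = 2 * math.pi * 50.0   # angular frequency in 1/s
target = 1e3                 # desired AC resistance in ohm

c_value = capacitance_for_resistance(omega, target)
l_value = inductance_for_resistance(omega, target)
print(f"C = {c_value * 1e6:.1f} microfarad")
print(f"L = {l_value:.1f} H")
```

Both numbers round to 3.2 in their respective units because C and L enter the AC resistance reciprocally at the same ω and R.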
9.2
Electrical Networks and Circuits
Electrical networks are circuits that consist of any number of electrical components and can therefore look complicated. Just think of the circuit boards in a computer. Nevertheless, the calculation of all currents and voltages in the network can be traced back to only a few rules: the two so-called Kirchhoff’s laws, and Ohm’s law, which links current and voltage at the individual components of the network.
1. Kirchhoff’s Law: The Node Rule
In a node of an electrical circuit, the sum of the incoming currents Iin,j is equal to the sum of the outgoing currents Iab,j (Fig. 9.3),
$$\sum_j I_{in,j} = \sum_j I_{ab,j}. \tag{9.23}$$
Fig. 9.3 (a) Node rule: Iin, 1 + Iin, 2 + Iin, 3 = Iab, 1 + Iab, 2 + Iab, 3. Here all currents are counted as positive. (b) Mesh rule: U1 + U2 + U3 + U4 + U5 = 0. Here the voltage U1 from the negative to the positive terminal of the battery is negative, and the remaining voltages measured in the direction of the current U2 to U5 are positive
Fig. 9.4 Exercise for the application of Kirchhoff’s laws. What are the voltages and currents in the circuit for R1 = R2 = R3 ?
Here, all currents have a positive sign. The node rule follows from the conservation of charge and means that any charge that flows into a node must also flow out of the node. No charge can be lost at a node, and no charge can accumulate.
2. Kirchhoff’s Law: The Mesh Rule
The sum of all voltages in an electric loop (mesh) is equal to zero (Fig. 9.3b):
$$\sum_j U_j = 0. \tag{9.24}$$
In this case, all voltages occurring in a loop must be specified consistently in the same direction. With regard to the sign of the voltages, electrical engineering usually uses a convention that is opposite to the physical definition of voltage in electrostatics (Sect. 6.3). The voltage from the positive to the negative side of a component has a positive sign in electrical engineering. This means that a positive current through the corresponding element also produces a positive voltage measured in the direction of the current. In this context, one also speaks of the (positive) voltage drop. In the case of voltage sources, it should be noted that although the battery voltage UB is always specified as positive, if the voltage is measured consistently with the direction of all other voltage drops in the mesh from the negative to the positive terminal of the battery, the voltage U1 = -UB is negative according to the convention of electrical engineering. The mesh rule is based on the fact that the voltage between two locations is equal to the difference in the electrostatic potentials of those locations. Therefore, if one goes once around the mesh and returns to the starting point, the potential is subtracted from its own value, so the corresponding potential difference, and therefore the total voltage, is zero.
Example: Kirchhoff’s Laws
An example of the application of Kirchhoff’s laws is sketched in Fig. 9.4. Here three equal resistors R1 = R2 = R3 are connected to a battery with voltage UB. The resistors R2 and R3 together form a mesh. From the mesh rule U3 - U2 = 0 it follows immediately that
$$U_2 = U_3. \tag{9.25}$$
This can be seen directly in the circuit, since the left and right sides of the two resistors are directly connected and thus at the same potential. At the two nodes of the circuit it also follows from the node rule that
$$I_1 = I_2 + I_3. \tag{9.26}$$
Here I1 is the total current flowing through the circuit, divided into the two partial currents I2 and I3. Ohm’s law links the voltages and currents at the three resistors:
$$U_1 = R_1 I_1, \tag{9.27}$$
$$U_2 = R_2 I_2, \tag{9.28}$$
$$U_3 = R_3 I_3. \tag{9.29}$$
Since the voltages U2 = U3 and the resistances R2 = R3 are equal, the currents must also be equal:
$$I_2 = I_3 = \frac{I_1}{2}. \tag{9.30}$$
The current I1 is therefore divided equally between the two paths at the node. With (9.27) and (9.28) we get
$$U_2 = R_2 \frac{I_1}{2} = \frac{R_2}{2R_1} U_1 = \frac{1}{2} U_1. \tag{9.31}$$
Now we consider the overall circuit as a mesh. According to the mesh rule
$$U_B = U_1 + U_2 = U_1 + \frac{1}{2} U_1 = \frac{3}{2} U_1, \tag{9.32}$$
or solved for the voltage U1
$$U_1 = \frac{2}{3} U_B. \tag{9.33}$$
With this result, one can also calculate the voltages U2 = U3 with the help of (9.31):
$$U_2 = U_3 = \frac{U_1}{2} = \frac{1}{3} U_B. \tag{9.34}$$
So two thirds of the battery’s voltage drops across the resistor R1, and the remaining third across each of the other two resistors. Now we can also calculate the currents using Ohm’s law:
$$I_1 = \frac{U_1}{R_1} = \frac{2}{3R_1} U_B, \tag{9.35}$$
$$I_2 = I_3 = \frac{U_2}{R_2} = \frac{1}{3R_2} U_B. \tag{9.36}$$
This simple example shows how to calculate the currents and voltages in an electrical network. For more complicated circuits, one must proceed more systematically in order not to lose track of the many unknown parameters. Therefore, one writes down all determining equations resulting from Kirchhoff’s rules and Ohm’s law. This results in a linear system of equations, which can be solved by numerical methods (preferably with the help of a computer).
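For the circuit of Fig. 9.4, the systematic approach can be sketched as follows. The unknowns are the three currents; the equations are the node rule (9.26), the outer mesh (9.32) written with Ohm’s law, and the inner mesh (9.25). The battery voltage of 9 V and the resistance of 100 Ω are assumed example values, and the small Gaussian elimination routine is only one of many possible numerical methods.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so that the caller's lists stay untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


# Assumed example values: U_B = 9 V, R1 = R2 = R3 = 100 ohm
u_b, r1, r2, r3 = 9.0, 100.0, 100.0, 100.0
matrix = [
    [1.0, -1.0, -1.0],  # node rule: I1 - I2 - I3 = 0
    [r1, r2, 0.0],      # outer mesh: R1*I1 + R2*I2 = U_B
    [0.0, r2, -r3],     # inner mesh: R2*I2 - R3*I3 = 0
]
rhs = [0.0, u_b, 0.0]

i1, i2, i3 = solve_linear_system(matrix, rhs)
print(f"I1 = {i1:.3f} A, I2 = {i2:.3f} A, I3 = {i3:.3f} A")
print(f"U1 = {r1 * i1:.2f} V, U2 = {r2 * i2:.2f} V")
```

The numerical solution reproduces the division of the battery voltage into two thirds and one third found analytically; for larger networks only the matrix and the right-hand side change.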
9.2.1
Parallel and Series Connections
One application of Kirchhoff’s rules is parallel and series connections of resistors, capacitors and coils (Fig. 9.5). Several components connected in parallel or in series can be represented by a single component of the same type.
Fig. 9.5 Series and parallel circuits of (two) resistors, capacitors and coils. The circuits can each be described by a total resistance, a total capacitance and a total inductance, respectively
For resistors Rj connected in series, the total resistance Rtot of all resistors together is given by the sum of the individual resistances:
$$R_j \text{ in series}: \quad R_{tot} = \sum_{j=1}^{N} R_j. \tag{9.37}$$
This follows from the mesh rule by summation of the voltages applied to the individual resistors; however, a rigorous derivation is deliberately omitted at this point. If the N resistors are connected in parallel, the reciprocal values add up:
$$R_j \text{ parallel}: \quad \frac{1}{R_{tot}} = \sum_{j=1}^{N} \frac{1}{R_j}. \tag{9.38}$$
This follows from the node rule, considering that the same voltage is applied to all resistors. The total resistance in a parallel circuit is smaller than each of the individual resistances. If only two resistors are connected in a parallel circuit, formula (9.38) can be solved for the total resistance in a simple manner:
$$R_1 \text{ and } R_2 \text{ parallel}: \quad R_{tot} = \frac{R_1 R_2}{R_1 + R_2}. \tag{9.39}$$
The corresponding formula for the total capacitance Ctot of capacitors connected in series with individual capacitances Cj is as follows:
$$C_j \text{ in series}: \quad \frac{1}{C_{tot}} = \sum_{j=1}^{N} \frac{1}{C_j}. \tag{9.40}$$
For a parallel connection of capacitors, the total capacitance is
$$C_j \text{ parallel}: \quad C_{tot} = \sum_{j=1}^{N} C_j. \tag{9.41}$$
So for capacitors the same formulas apply as for resistors, except that the rules for series and parallel connection are reversed. With coils, on the other hand, one must pay attention to whether the individual coils are magnetically coupled. This is the case if the magnetic field of one coil generates magnetic flux through the cross-section of the other coils. One then has to consider the mutual induction between the coils, which will not be discussed further here. Instead, we consider the simpler case where the coils are spaced so far apart that they do not affect each other. In this case, the total inductance $L_\mathrm{tot}$ of series-connected coils with individual inductances $L_j$ is given by the sum

$$L_j \text{ in series:} \quad L_\mathrm{tot} = \sum_{j=1}^{N} L_j. \tag{9.42}$$
When the individual inductances are connected in parallel, the reciprocal values add up again:

$$L_j \text{ parallel:} \quad \frac{1}{L_\mathrm{tot}} = \sum_{j=1}^{N} \frac{1}{L_j}. \tag{9.43}$$

So the formulas are again analogous to those for resistors.
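The combination rules (9.37) to (9.43) reduce to two operations: a plain sum, and a sum of reciprocals. The following is a minimal sketch of both; the function names and the component values are assumptions chosen for illustration, not part of the text.

```python
def series_total(values):
    """Quantities that add directly: R or L in series, C in parallel."""
    return sum(values)


def reciprocal_total(values):
    """Quantities whose reciprocals add: R or L in parallel, C in series."""
    return 1.0 / sum(1.0 / v for v in values)


# Example values (assumed, not from the text)
r_series = series_total([100.0, 300.0])        # resistors in series, Eq. (9.37)
r_parallel = reciprocal_total([100.0, 300.0])  # resistors in parallel, Eq. (9.38)
c_series = reciprocal_total([2e-6, 2e-6])      # capacitors in series, Eq. (9.40)
c_parallel = series_total([2e-6, 2e-6])        # capacitors in parallel, Eq. (9.41)

print(r_series, r_parallel, c_series, c_parallel)
```

For two resistors, `reciprocal_total` agrees with the closed form (9.39), and the parallel result is smaller than either individual resistance.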
Fig. 9.6 Exercise: What is the total resistance of the circuit?
Exercise: Total Resistance
Consider the parallel and series connection of six equal resistors R shown in Fig. 9.6. What is the total resistance of the circuit?

Solution
The usual procedure for such circuits is to work step by step from the inside to the outside. In this example, one first calculates the series connections in the three individual branches. This gives the resistances $R_1 = R$, $R_2 = 2R$ and $R_3 = 3R$. These three resistances are connected in parallel. The reciprocal of the total resistance therefore follows from (9.38):

$$\frac{1}{R_\mathrm{tot}} = \frac{1}{R} + \frac{1}{2R} + \frac{1}{3R} = \frac{6 + 3 + 2}{6R} = \frac{11}{6R}. \tag{9.44}$$

Thus the total resistance is

$$R_\mathrm{tot} = \frac{6}{11} R. \tag{9.45}$$

9.2.2 Complex Impedance
In Sect. 9.1.1 we introduced the alternating current (AC) resistance. We saw that the resistance of the capacitor and the coil depends on the frequency and that the oscillations of voltage and current are not in phase. If we want to calculate the AC resistance of an electrical network, for example of a series or parallel connection of a resistor with a capacitor or an inductor, we have to take into account the different phases of the individual elements. This is sometimes done with so-called phasor diagrams, which we will not go into here. Instead, we introduce a straightforward mathematical procedure for solving problems of this kind. For this purpose, we treat the oscillations of current and voltage as complex quantities, that is, we replace

$$U(t) = U_0 \cos(\omega t) \;\rightarrow\; U(t) = U_0\, e^{i\omega t}, \tag{9.46}$$

$$I(t) = I_0 \cos(\omega t + \varphi) \;\rightarrow\; I(t) = I_0\, e^{i(\omega t + \varphi)}. \tag{9.47}$$
An introduction to complex numbers can be found in Sect. 9.2.2.1. In Eq. (9.47) an arbitrary phase difference $\varphi$ between current and voltage has been introduced. The real amplitudes of the oscillations are the magnitudes of the complex numbers:

$$U_0 = |U(t)|, \tag{9.48}$$

$$I_0 = |I(t)|. \tag{9.49}$$
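The already known AC resistance is given by the ratio of the amplitudes,

$$R(\omega) = \frac{U_0}{I_0}, \tag{9.50}$$

while the complex impedance $Z(\omega)$ is defined by the ratio of the complex oscillations:

$$Z(\omega) = \frac{U(t)}{I(t)} = \frac{U_0\, e^{i\omega t}}{I_0\, e^{i(\omega t + \varphi)}} = \frac{U_0}{I_0}\, e^{-i\varphi} = R(\omega)\, e^{-i\varphi}. \tag{9.51}$$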
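Coil, capacitor and ohmic resistor have the following complex impedances:

$$\text{Resistor:} \quad \varphi = 0 \;\rightarrow\; Z(\omega) = R, \tag{9.52}$$

$$\text{Capacitor:} \quad \varphi = \frac{\pi}{2} \;\rightarrow\; Z(\omega) = \frac{1}{i\omega C}, \tag{9.53}$$

$$\text{Coil:} \quad \varphi = -\frac{\pi}{2} \;\rightarrow\; Z(\omega) = i\omega L, \tag{9.54}$$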
with the imaginary unit $i$. The great advantage of complex impedances is that the total impedance $Z$ of a circuit of $N$ individual impedances $Z_j$ can be calculated in exactly the same way as the total resistance of individual resistors:

$$\text{Series connection:} \quad Z = \sum_{j=1}^{N} Z_j, \tag{9.55}$$

$$\text{Parallel connection:} \quad \frac{1}{Z} = \sum_{j=1}^{N} \frac{1}{Z_j}. \tag{9.56}$$
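The ratio of the voltage and current amplitudes is then given by the magnitude of the total impedance,

$$\frac{U_0}{I_0} = |Z|, \tag{9.57}$$

and the phase between current and voltage follows from the ratio of imaginary to real part. Because $Z = R(\omega)\, e^{-i\varphi}$ according to (9.51), the phase $\varphi$ of the current relative to the voltage is the negative of the argument of $Z$:

$$\varphi = -\arctan\frac{\mathrm{Im}(Z)}{\mathrm{Re}(Z)}. \tag{9.58}$$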
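Python’s built-in complex numbers make it easy to try out the rules (9.52) to (9.57). The sketch below evaluates a series connection of a resistor and a capacitor; the helper names and component values are assumptions for illustration. The phase is taken as the negative argument of $Z$, which follows from $Z = R(\omega)\, e^{-i\varphi}$ in (9.51).

```python
import cmath
import math


def z_resistor(R):
    return complex(R, 0.0)              # Eq. (9.52)


def z_capacitor(C, omega):
    return 1.0 / (1j * omega * C)       # Eq. (9.53)


def z_coil(L, omega):
    return 1j * omega * L               # Eq. (9.54)


def z_series(*impedances):
    return sum(impedances)              # Eq. (9.55)


def z_parallel(*impedances):
    return 1.0 / sum(1.0 / z for z in impedances)   # Eq. (9.56)


# Example values (assumed): R = 1 kOhm in series with C = 100 nF at f = 1 kHz
R, C = 1.0e3, 100e-9
omega = 2 * math.pi * 1.0e3
Z = z_series(z_resistor(R), z_capacitor(C, omega))

amplitude_ratio = abs(Z)       # U0 / I0, Eq. (9.57)
phi = -cmath.phase(Z)          # phase of the current relative to the voltage

print(f"|Z| = {amplitude_ratio:.1f} Ohm, phi = {math.degrees(phi):.1f} deg")
```

The positive phase obtained here reflects that in a capacitive circuit the current leads the voltage.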
9.2.2.1 Outlook: Calculating with Complex Numbers
A complex number $z \in \mathbb{C}$ can be represented as

$$z = a + i\,b, \tag{9.59}$$

with two real numbers $a, b \in \mathbb{R}$ and the so-called imaginary unit $i$. The two real numbers $a$ and $b$ are called the real and imaginary parts, respectively:

$$a = \mathrm{Re}(z), \quad b = \mathrm{Im}(z). \tag{9.60}$$

The imaginary unit, or more precisely the two numbers $\pm i$, are defined as the two solutions of the equation

$$z^2 = -1, \tag{9.61}$$
which has no solution in the real numbers. Complex numbers can be represented graphically in a coordinate system by plotting the real part on the horizontal axis and the imaginary part on the vertical axis (Fig. 9.7). The real and imaginary parts then behave like the two components of a two-dimensional vector: when adding complex numbers $z_1 = a_1 + i\,b_1$ and $z_2 = a_2 + i\,b_2$, the real and imaginary parts are added separately, analogous to the addition of the x and y components of the corresponding vectors:

$$z = z_1 + z_2 = (a_1 + a_2) + i\,(b_1 + b_2). \tag{9.62}$$

Fig. 9.7 Representation of a complex number $z = a + i\,b$ and its complex conjugate $z^* = a - i\,b$ in the complex plane. The magnitude $|z|$ of the complex number is a real quantity and is equal to the length of its position vector in the complex plane, with $|z|^2 = z\,z^* = a^2 + b^2$. The angle $\varphi$ between the position vector and the real axis appears in the so-called polar representation of the complex number: $z = |z|\,e^{i\varphi}$
The absolute value or magnitude $|z|$ of a complex number is given by the length of its position vector in the complex plane. According to Pythagoras this length is

$$|z| = \sqrt{a^2 + b^2}. \tag{9.63}$$
However, the square of the absolute value can also be represented by the product of the complex number $z = a + i\,b$ with its complex conjugate $z^* = a - i\,b$. In the complex conjugate, the sign of the imaginary part is reversed. The equality

$$|z|^2 = z\,z^* \tag{9.64}$$

can be easily proven:

$$z\,z^* = (a + i\,b)(a - i\,b) \tag{9.65}$$

$$= a^2 + i\,ab - i\,ab - i^2 b^2 \tag{9.66}$$

$$= a^2 + b^2, \tag{9.67}$$

where $i^2 = -1$ was used. A complex number can also be written in the so-called polar representation,

$$z = |z|\,e^{i\varphi}, \tag{9.68}$$
with the angle $\varphi \in \mathbb{R}$ between the real axis and the position vector of the complex number. The connection to the representation (9.59) is made by the so-called Euler formula

$$e^{i\varphi} = \cos(\varphi) + i \sin(\varphi), \tag{9.69}$$

$$|e^{i\varphi}| = 1. \tag{9.70}$$
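The relations (9.63) to (9.70) can be checked numerically with Python’s complex type and the standard `cmath` module. This is only an illustrative sketch; the number $z = 3 + 4i$ is an arbitrary example.

```python
import cmath
import math

z = complex(3.0, 4.0)                    # z = a + i b with a = 3, b = 4 (example)

magnitude = abs(z)                       # |z| = sqrt(a^2 + b^2), Eq. (9.63)
product = (z * z.conjugate()).real       # z z* = a^2 + b^2, Eq. (9.64)
phi = cmath.phase(z)                     # angle between position vector and real axis

z_polar = magnitude * cmath.exp(1j * phi)            # polar representation, Eq. (9.68)
euler_rhs = complex(math.cos(phi), math.sin(phi))    # right-hand side of Eq. (9.69)

print(magnitude, product, phi)
print(z_polar, euler_rhs)
```

Note that `cmath.phase` returns the angle in the correct quadrant for any sign of the real and imaginary parts.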
The validity of Euler’s formula can be shown by expanding the exponential function into a Taylor series, sorting the terms into real and imaginary parts, and identifying these with the Taylor series of the cosine and sine functions, respectively. The angle $\varphi$ can be calculated trigonometrically. With the definition of the tangent, $\tan(\varphi) = b/a$, the angle is given by

$$\varphi = \arctan\frac{\mathrm{Im}(z)}{\mathrm{Re}(z)}. \tag{9.71}$$

9.2.3 Special Circuits
This section presents some simple circuits, each consisting of only two passive components, but which perform important tasks in many electrical networks.
9.2.3.1 Voltage Divider
A voltage divider is sketched in Fig. 9.8a. An input voltage $U_\mathrm{in}$ is applied to two resistors $R_1$ and $R_2$ connected in series. The output voltage $U_\mathrm{out}$ of the circuit is the voltage drop across the resistor $R_2$. Using the total resistance $R = R_1 + R_2$ and Ohm’s law, one can calculate the current $I$ flowing in the mesh:

$$I = \frac{U_\mathrm{in}}{R_1 + R_2}. \tag{9.72}$$

The output voltage is now obtained by applying Ohm’s law to the resistor $R_2$ and inserting (9.72):

$$U_\mathrm{out} = R_2\, I = \frac{R_2}{R_1 + R_2}\, U_\mathrm{in}. \tag{9.73}$$
Fig. 9.8 (a) A voltage divider consists of two resistors connected in series and divides the input voltage Uin. (b) If resistor R2 of the voltage divider is replaced by a capacitor, an RC low-pass filter is obtained, which passes only low frequencies. If resistor R1 of the voltage divider is replaced by a capacitor, an RC high-pass filter is obtained, which passes only high frequencies. (c) If the resistor R2 or R1 is replaced by a coil, one obtains an RL high-pass filter or an RL low-pass filter, respectively
By suitable choice of the resistors R1 and R2 one can reduce the input voltage to any smaller value. Often a potentiometer is installed as a resistor R2. This is a resistor that can be adjusted by turning a screw. This also allows you to set the output voltage to the desired value.
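As a small illustration of Eqs. (9.72) and (9.73), the following sketch computes the output voltage of a divider. The function name and the numerical values are assumptions chosen for the example.

```python
def divider_output(u_in, r1, r2):
    """Voltage across r2 for two resistors r1 and r2 in series."""
    current = u_in / (r1 + r2)    # Eq. (9.72)
    return r2 * current           # Eq. (9.73)


# Example values (assumed): 12 V input, R1 = 3 kOhm, R2 = 1 kOhm
u_out = divider_output(u_in=12.0, r1=3.0e3, r2=1.0e3)
print(u_out)

# Sweeping r2, as a potentiometer would, changes the output between 0 V and u_in
for r2 in (0.0, 1.0e3, 3.0e3, 9.0e3):
    print(r2, divider_output(12.0, 3.0e3, r2))
```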
9.2.3.2 RC Filter
In principle, an RC filter looks like a voltage divider, except that one of the two resistors has been replaced by a capacitor with capacitance $C$ (Fig. 9.8b). Depending on whether the resistor $R_2$ or the resistor $R_1$ has been replaced, the filter is either a low-pass filter or a high-pass filter. The reason for these names can be seen in the way the circuits react to an alternating voltage with angular frequency $\omega = 2\pi f$ as input voltage. The output voltage is calculated analogously to Eq. (9.73) by replacing the resistors $R_1$ and $R_2$ by complex impedances $Z_1$ and $Z_2$. One obtains

$$U_\mathrm{out}(t) = \frac{Z_2}{Z_1 + Z_2}\, U_\mathrm{in}(t). \tag{9.74}$$

In the case of the RC high-pass filter, $Z_1 = \frac{1}{i\omega C}$ and $Z_2 = R$, so that

$$U_\mathrm{out}(t) = \frac{R}{R + 1/(i\omega C)}\, U_\mathrm{in}(t). \tag{9.75}$$
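The amplitudes $U_0^\mathrm{in}$ and $U_0^\mathrm{out}$ of the voltages are obtained by calculating the absolute value:

$$U_0^\mathrm{out} = \left|\frac{R}{R + 1/(i\omega C)}\right| U_0^\mathrm{in} = \frac{1}{\sqrt{1 + \left(\frac{1}{\omega R C}\right)^2}}\, U_0^\mathrm{in}. \tag{9.76}$$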
For small frequencies $\omega \rightarrow 0$, the fraction $1/(\omega R C)$ in the denominator of Eq. (9.76) goes to infinity. Thus the output voltage $U_0^\mathrm{out}$ goes to zero. Voltages with small frequencies are therefore blocked by the circuit. At large frequencies $\omega \rightarrow \infty$, the fraction $1/(\omega R C)$ goes to zero. Thus the output voltage becomes $U_0^\mathrm{out} = U_0^\mathrm{in}$. So the circuit allows voltages with high frequencies to pass, hence the name high-pass filter. The interesting question now is at which frequency the circuit starts to block. The cut-off frequency $f_\mathrm{Limit}$ is defined as the frequency at which the transmitted power is halved. Since the power is proportional to the square of the voltage, this means that $U_0^\mathrm{out} = \frac{1}{\sqrt{2}}\, U_0^\mathrm{in}$. We insert this condition into Eq. (9.76) and solve for the angular frequency $\omega$:

$$\omega = \frac{1}{RC}. \tag{9.77}$$

The cut-off frequency in hertz is therefore

$$f_\mathrm{Limit} = \frac{\omega}{2\pi} = \frac{1}{2\pi R C}. \tag{9.78}$$
With the low-pass filter, the output voltage drops across the capacitor. Analogous to the high-pass filter, the impedances $Z_1 = R$ and $Z_2 = \frac{1}{i\omega C}$ are used. The following result is obtained:

$$U_0^\mathrm{out} = \frac{1}{\sqrt{1 + (\omega R C)^2}}\, U_0^\mathrm{in}. \tag{9.79}$$
For large frequencies, the output voltage approaches zero, while for small frequencies $U_0^\mathrm{out} = U_0^\mathrm{in}$. As the name suggests, the filter allows DC voltages and AC voltages with low frequencies to pass. The cut-off frequency of the low-pass filter, at which the output power is halved, is identical to that of the high-pass filter and is given by formula (9.78).
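The behaviour of both RC filters can be reproduced directly from the complex voltage divider (9.74). The sketch below is a minimal example under assumed component values; the function names are illustrative only. It evaluates the amplitude ratio well below, at, and well above the cut-off frequency.

```python
import math


def divider_gain(z1, z2):
    """Amplitude ratio |U_out / U_in| of a divider with complex impedances, Eq. (9.74)."""
    return abs(z2 / (z1 + z2))


def rc_high_pass_gain(omega, R, C):
    return divider_gain(z1=1.0 / (1j * omega * C), z2=R)


def rc_low_pass_gain(omega, R, C):
    return divider_gain(z1=R, z2=1.0 / (1j * omega * C))


R, C = 1.0e3, 1.0e-6                  # example values (assumed): 1 kOhm, 1 uF
omega_c = 1.0 / (R * C)               # cut-off angular frequency, Eq. (9.77)
f_limit = omega_c / (2 * math.pi)     # cut-off frequency in hertz, Eq. (9.78)

for factor in (0.01, 1.0, 100.0):
    w = factor * omega_c
    print(f"omega = {factor:6.2f} omega_c: "
          f"high-pass {rc_high_pass_gain(w, R, C):.4f}, "
          f"low-pass {rc_low_pass_gain(w, R, C):.4f}")
```

At the cut-off frequency both filters give an amplitude ratio of $1/\sqrt{2}$, which corresponds to half the power.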
9.2.3.3 RL Filter
Analogous to the RC filter there is also the RL filter, in which one of the two resistors of the voltage divider is replaced by a coil with inductance $L$ (Fig. 9.8c). The output voltage is obtained from Eq. (9.74) by substituting the appropriate impedances for the resistor and the coil. In the case $Z_1 = i\omega L$ and $Z_2 = R$ one obtains for the amplitudes

$$U_0^\mathrm{out} = \frac{1}{\sqrt{1 + \left(\frac{\omega L}{R}\right)^2}}\, U_0^\mathrm{in}. \tag{9.80}$$
For low frequencies, the output voltage is $U_0^\mathrm{out} = U_0^\mathrm{in}$, while for high frequencies it approaches zero. It is therefore a low-pass filter. Accordingly, in the case where the resistor $R_2$ is replaced by a coil, the output voltage is given by

$$U_0^\mathrm{out} = \frac{1}{\sqrt{1 + \left(\frac{R}{\omega L}\right)^2}}\, U_0^\mathrm{in}. \tag{9.81}$$
For low frequencies, the output voltage approaches zero, while for high frequencies $U_0^\mathrm{out} = U_0^\mathrm{in}$. This case represents a high-pass filter. Again, analogous to the procedure for the RC filter, one can calculate the cut-off frequency at which the output power has dropped to half of the input power. The corresponding angular frequency is

$$\omega = \frac{R}{L}. \tag{9.82}$$

The cut-off frequency in hertz is then

$$f_\mathrm{Limit} = \frac{\omega}{2\pi} = \frac{R}{2\pi L}. \tag{9.83}$$
The result for the cut-off frequency is independent of whether the filter is a high-pass or low-pass filter.

Examination Task: RC and RL Filters
A high-pass filter with a cut-off frequency of $f_\mathrm{Limit} = 1$ kHz shall be built. The resistance in the filter is $R = 1$ kΩ. How large must the capacitance or inductance be when building a corresponding RC or RL filter?

Solution
For the RC filter, solving (9.78) for the capacitance gives

$$C = \frac{1}{2\pi R f_\mathrm{Limit}} = \frac{1}{2\pi \cdot 1\,\mathrm{k\Omega} \cdot 1\,\mathrm{kHz}} = 159\,\mathrm{nF}. \tag{9.84}$$

For the RL filter, solving (9.83) for the inductance gives

$$L = \frac{R}{2\pi f_\mathrm{Limit}} = \frac{1\,\mathrm{k\Omega}}{2\pi \cdot 1\,\mathrm{kHz}} = 159\,\mathrm{mH}. \tag{9.85}$$
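The two results of the examination task can be reproduced with a few lines of Python. This is a sketch with illustrative function names; it simply solves (9.78) and (9.83) for the component value.

```python
import math


def capacitance_for_cutoff(R, f_limit):
    """Solve Eq. (9.78) for the capacitance C."""
    return 1.0 / (2 * math.pi * R * f_limit)


def inductance_for_cutoff(R, f_limit):
    """Solve Eq. (9.83) for the inductance L."""
    return R / (2 * math.pi * f_limit)


R, f_limit = 1.0e3, 1.0e3            # 1 kOhm and 1 kHz, as in the task
C = capacitance_for_cutoff(R, f_limit)
L = inductance_for_cutoff(R, f_limit)

print(f"C = {C * 1e9:.0f} nF, L = {L * 1e3:.0f} mH")   # C = 159 nF, L = 159 mH
```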
9.2.3.4 Transformer
For many applications in electronics it is necessary to increase an existing voltage. For AC voltages, a transformer can be used for this purpose (Fig. 9.9). There, two coils with numbers of turns $N_1$ and $N_2$ are wound around a common ferrite core. The ferrite core causes the magnetic flux of the first coil (the primary coil) to be guided through the second coil (the secondary coil) largely without loss. The magnetic flux per turn is thus the same in both coils:

$$\phi_{m,1} = \phi_{m,2} \equiv \phi_m. \tag{9.86}$$
The voltages $U_1$ and $U_2$ across the two coils are given by the law of induction (8.35):

$$U_1 = -N_1 \frac{d\phi_m}{dt}, \tag{9.87}$$

$$U_2 = -N_2 \frac{d\phi_m}{dt}. \tag{9.88}$$

Fig. 9.9 A transformer transforms AC voltages up or down in proportion to the number of turns of the respective coil
By solving (9.87) for $d\phi_m/dt$ and substituting into (9.88) we obtain

$$U_2 = \frac{N_2}{N_1}\, U_1, \tag{9.89}$$

that is, the voltage is transformed up or down with the ratio of the numbers of turns of the two coils. The coil with the larger number of turns also has the larger voltage. We can also calculate how the currents $I_1$ and $I_2$ in the coils are transformed. We assume that, because of the conservation of energy, the electrical power in the two coils must be the same:

$$U_1 I_1 = U_2 I_2. \tag{9.90}$$

Substituting (9.89) into (9.90), it follows that

$$I_2 = \frac{N_1}{N_2}\, I_1. \tag{9.91}$$
That is, the currents are transformed inversely to the ratio of the voltages. Thus, the smaller current flows in the coil with the larger number of turns. In the case of real coils, it must be taken into account that electrical components are typically connected to the secondary coil. Depending on the input impedance of this secondary circuit, there will be a corresponding phase between current and voltage. This can lead to the power in the secondary coil being partially or even completely reflected back to the primary coil. However, there is always a circuit connected to the primary coil as well (at least the AC source circuit), which can also cause reflections (depending on the output impedance). The calculation of the real voltages and currents can therefore be more complicated than expressed in the idealized Eqs. (9.89) and (9.91).

Written Test: Transformer
The primary coil of a transformer has 20 turns, the secondary coil 160 turns. An alternating voltage with amplitude $U_1 = 15$ V is applied to the primary coil. What is the voltage across the secondary coil? What is the current in the secondary circuit if the voltage source in the primary circuit has a power of $P = 30$ W? Assume that all the power is transferred from the primary to the secondary circuit.

Solution
The voltage at the secondary coil is

$$U_2 = \frac{N_2}{N_1}\, U_1 = \frac{160}{20} \cdot 15\,\mathrm{V} = 120\,\mathrm{V}. \tag{9.92}$$

Since the power in the primary and secondary circuits is assumed to be the same, the current in the secondary circuit is

$$I_2 = \frac{P}{U_2} = \frac{30\,\mathrm{W}}{120\,\mathrm{V}} = 0.25\,\mathrm{A}. \tag{9.93}$$

9.2.4 Switching On and Off in the RC and RL Circuits
In Sect. 9.2.3 it was shown, among other things, what influence RC and RL circuits have on AC voltages. It turned out that depending on the circuit either high or low frequencies are attenuated. In the following we will address the question of how currents and voltages in such a circuit change as a function of time when a battery is either connected to the circuit with the aid of a switch or disconnected from the circuit.
9.2.4.1 RC Circuit
We first consider the RC circuit sketched in Fig. 9.10. At the beginning, that is, up to the time $t = 0$, the switch S is open and the capacitor is discharged, that is, no current flows in the circuit, $I = 0$, and the charge and voltage on the capacitor are zero. Now the switch is closed so that current can flow and charge the capacitor. After the charging process is complete, that is, when the capacitor is fully charged, the capacitor voltage is equal to the battery voltage, and again no current flows. We will now analyze the charging process in more detail by deriving a differential equation for the current in the circuit and solving it. To do this, we apply the mesh rule $-U_B + U_R + U_C = 0$ to the RC circuit, with battery voltage $U_B$, voltage $U_R$ across the resistor and voltage $U_C$ across the capacitor, and take the derivative with respect to time:

$$U_B = U_R + U_C, \tag{9.94}$$

$$\frac{dU_B}{dt} = \frac{dU_R}{dt} + \frac{dU_C}{dt}. \tag{9.95}$$

Fig. 9.10 (a) Circuit diagram of the RC circuit during charging. At the time t = 0 the switch is closed and the charging process of the capacitor begins. (b) Voltages UR and UC at the resistor and the capacitor (left axis) and current I in the circuit (right axis)
We use that the battery voltage is constant in time, $dU_B/dt = 0$, and that the voltage across the capacitor with capacitance $C$ is given by $U_C = Q/C$:

$$0 = \frac{dU_R}{dt} + \frac{1}{C} \frac{dQ}{dt}. \tag{9.96}$$

Furthermore, we use the relationship between current and charge, $I = dQ/dt$, and Ohm’s law, $U_R = R\,I$:

$$0 = R \frac{dI}{dt} + \frac{1}{C}\, I. \tag{9.97}$$

We solve for the time derivative of the current and thus obtain the differential equation

$$\frac{dI(t)}{dt} = -\frac{1}{RC}\, I(t). \tag{9.98}$$

Here the current $I(t)$ is a time-dependent function. The solution of the differential equation is an exponential function:

$$I(t) = I_0\, e^{-t/\tau}, \quad \text{with time constant } \tau = RC. \tag{9.99}$$
The current $I_0$ at time $t = 0$ is limited only by the resistance $R$, that is, $I_0 = U_B/R$. We do not elaborate here on the proof that the exponential function (9.99) solves the differential equation (9.98). It can be verified, for example, by substituting the solution into both sides of the differential equation and showing that equality holds. Knowing the current $I(t)$, one can now calculate the voltage across the resistor using Ohm’s law:

$$U_R(t) = R\, I(t) = U_B\, e^{-t/\tau}. \tag{9.100}$$

The voltage curve $U_R(t)$ at the resistor therefore has the same shape as the current curve $I(t)$. The voltage at the capacitor now follows from the mesh rule:

$$U_C(t) = U_B - U_R(t) = U_B \left(1 - e^{-t/\tau}\right). \tag{9.101}$$
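The charging curves (9.99) to (9.101) are easy to tabulate. The following sketch uses assumed example values and an illustrative function name; it prints how far the capacitor has charged after whole multiples of the time constant.

```python
import math


def rc_charging(t, U_B, R, C):
    """Return (I, U_R, U_C) at time t after the switch is closed."""
    tau = R * C                           # time constant, Eq. (9.99)
    I = (U_B / R) * math.exp(-t / tau)    # Eq. (9.99) with I0 = U_B / R
    U_R = R * I                           # Eq. (9.100)
    U_C = U_B - U_R                       # Eq. (9.101), mesh rule
    return I, U_R, U_C


U_B, R, C = 9.0, 10.0e3, 100.0e-6         # example values (assumed); tau = 1 s
tau = R * C

for n in range(6):
    I, U_R, U_C = rc_charging(n * tau, U_B, R, C)
    print(f"t = {n} tau: U_C / U_B = {U_C / U_B:.3f}")
```

After one time constant the capacitor has reached a fraction $1 - 1/e$ of the battery voltage; after five time constants it is charged to better than 99%.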
Fig. 9.11 (a) Circuit diagram of the RC circuit during the discharge process. At the time t = 0 the switch is closed and the capacitor is discharged through the resistor. (b) Voltages UR and UC across the resistor and capacitor (left axis) and current I in the circuit (right axis). The negative values of UR and I mean that the current and voltage are opposite to the directions of the arrows
All three curves $I(t)$, $U_R(t)$ and $U_C(t)$ are shown in Fig. 9.10b. It takes several multiples of the 1/e time constant $\tau$ for the capacitor to charge almost completely. The larger the resistance and the capacitance, the longer the time constant. A capacitor can also be discharged, again via a resistor. To do this, imagine that we charge the capacitor to the voltage $U_{C,0}$ using the circuit sketched in Fig. 9.10, open the switch S and remove the battery. The charge on the capacitor remains stored during that time, that is, the capacitor voltage is constant. Now we close the circuit again, so that the charges flow from one side of the capacitor through the resistor to the other side and equalize. The corresponding circuit is sketched in Fig. 9.11. A derivation of the corresponding differential equation is omitted here, since the calculation is very similar to the one for charging the capacitor. Instead, the solutions are discussed below. The directions of the voltages and the current are defined here in the same way as for charging the capacitor. The solutions are:

$$I(t) = -I_0\, e^{-t/\tau}, \tag{9.102}$$

$$U_R(t) = -U_{C,0}\, e^{-t/\tau}, \tag{9.103}$$

$$U_C(t) = U_{C,0}\, e^{-t/\tau}, \tag{9.104}$$

with maximum current $I_0 = U_{C,0}/R$ (Fig. 9.11b). The voltage $U_C(t)$ across the capacitor decreases exponentially on the time scale $\tau$. This is also true for the current $I(t)$ and the voltage $U_R(t)$ across the resistor, with the values $I(t = 0)$ and $U_R(t = 0)$ now negative when the switch is closed. This is because the current now flows in the opposite direction to the original definition of the current direction, that is, opposite to the direction in which it flowed when the capacitor was charged.
9.2.4.2 RL Circuit
The RL circuit is analogous to the RC circuit (Fig. 9.12). At the beginning the switch is open, so no current flows. As soon as the switch is closed, a current begins to flow through the resistor and the coil, whereby the coil with its large AC resistance $R_L = \omega L$ counteracts the rapid change in current at high frequencies. As a result, the current cannot be switched on abruptly. Mathematically, this situation can be described analogously to the RC circuit by replacing the capacitor voltage with the voltage across the coil, $U_L = L\, dI/dt$. Contrary to the definition of the induction voltage according to (8.38), the voltage at the coil does not have a negative sign here. This is because the voltage drop is measured in the same direction as the current flow, according to the convention of electrical engineering. One obtains the differential equation for the voltage $U_L$ at the coil (without proof)

$$\frac{dU_L(t)}{dt} = -\frac{R}{L}\, U_L(t), \tag{9.105}$$

Fig. 9.12 (a) Circuit diagram of the RL circuit during the switch-on process. At the time t = 0 the switch is closed and the current in the circuit starts to flow. (b) Voltages UR and UL at the resistor and coil (left axis) and current I in the circuit (right axis)
with solution

$$U_L(t) = U_B\, e^{-t/\tau}, \quad \text{with time constant } \tau = \frac{L}{R}. \tag{9.106}$$

Analogous to the RC circuit, the solution for the RL circuit is also an exponential function. At the time $t = 0$, the resistance of the coil is infinitely high, so that the entire battery voltage drops across the coil, $U_L(t = 0) = U_B$. The voltage at the resistor follows with the help of the mesh rule:

$$U_R(t) = U_B - U_L(t) = U_B \left(1 - e^{-t/\tau}\right). \tag{9.107}$$

It starts at zero for $t = 0$ and approaches the battery voltage with an inverted exponential function. The current in the RL circuit again follows from Ohm’s law according to $I = U_R/R$ and thus has the same course as the voltage at the resistor:

$$I(t) = \frac{U_B}{R} \left(1 - e^{-t/\tau}\right). \tag{9.108}$$
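The switch-on behaviour (9.106) to (9.108) can be sketched in the same way as for the RC circuit. The component values and the function name below are assumptions for illustration.

```python
import math


def rl_switch_on(t, U_B, R, L):
    """Return (I, U_R, U_L) at time t after the switch is closed."""
    tau = L / R                             # time constant, Eq. (9.106)
    U_L = U_B * math.exp(-t / tau)          # Eq. (9.106)
    U_R = U_B - U_L                         # Eq. (9.107), mesh rule
    I = U_R / R                             # Eq. (9.108), Ohm's law
    return I, U_R, U_L


U_B, R, L = 12.0, 100.0, 0.5                # example values (assumed); tau = 5 ms
tau = L / R

for n in range(6):
    I, U_R, U_L = rl_switch_on(n * tau, U_B, R, L)
    print(f"t = {n} tau: I = {I * 1e3:6.1f} mA, U_R = {U_R:5.2f} V, U_L = {U_L:5.2f} V")
```

The current rises from zero towards its final value $U_B/R$, while the coil voltage decays from $U_B$ to zero.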
All three curves are sketched in Fig. 9.12b.

Fig. 9.13 (a) RL circuit with voltage source that periodically switches its voltage UB > 0 on and off. This causes the current in the circuit to switch on and off periodically. (b) Voltages UR and UL across the resistor and coil (left axis) and current I in the circuit (right axis) while the current is switched off

Turning off the current in an RL circuit is a little more subtle. In this case, it depends on how the current is switched off. The simpler case is when the battery voltage is set to $U_B = 0$ with the switch closed. This can be done, for example, with a common laboratory power supply that periodically switches between two voltages (Fig. 9.13). It is important here that the circuit remains closed and thus the current can continue to flow. The inductance of the coil keeps the current running in the same direction until the magnetic energy stored in the coil is used up. The current thereby drops exponentially. The same applies to the voltage across the resistor, which, according to Ohm’s law, is always proportional to the current. The voltage at the coil again follows from the mesh rule, which because of $U_B = 0$ now reads $U_L = -U_R$. Again, we do not set up the corresponding differential equation. Instead, the solutions are given here:

$$I(t) = I_0\, e^{-t/\tau}, \tag{9.109}$$

$$U_R(t) = R\, I_0\, e^{-t/\tau}, \tag{9.110}$$

$$U_L(t) = -R\, I_0\, e^{-t/\tau}, \tag{9.111}$$
where the current before switching off is given by $I_0$. All three curves are sketched in Fig. 9.13b. An alternative way to switch off the current is to open the switch S. This changes the current flow abruptly. According to the law of induction, this produces a very high induction voltage at the coil, which is typically much greater than the original battery voltage. The voltage can even become so large that spark flashovers occur at the switch contacts. This causes energy to be radiated in the form of electromagnetic waves. Such an arrangement is also referred to as a spark gap.
9.3 Electric Resonant Circuit
Fig. 9.14 (a) Undamped LC resonant circuit. (b) The energy is exchanged between coil and capacitor, but is not lost in the undamped case
This section deals with the combination of a capacitor and a coil (Fig. 9.14). This is what is known as an electrical resonator. The mesh rule for this circuit with voltage $U_C$ at the capacitor and voltage $U_L$ at the coil is given by

$$U_L + U_C = 0. \tag{9.112}$$

By inserting the relations for the capacitor, $U_C = Q/C$, and the coil, $U_L = L\, dI/dt$, we get

$$L \frac{dI}{dt} + \frac{Q}{C} = 0. \tag{9.113}$$

Taking the derivative with respect to time, using $I = dQ/dt$ and rearranging yields the differential equation

$$\frac{d^2 I(t)}{dt^2} = -\frac{1}{LC}\, I(t). \tag{9.114}$$
Differential equations of this type, where the second derivative of a function is equal to the original function multiplied by a negative factor, are already known from Sect. 4.1 on oscillations. Their solutions are harmonic oscillations of the form

$$I(t) = I_0 \sin(\omega t), \quad \text{with } \omega = \frac{1}{\sqrt{LC}}. \tag{9.115}$$

The current therefore oscillates sinusoidally around zero with the amplitude $I_0$, that is, the current periodically changes direction. The voltage across the coil is obtained from the derivative

$$U_L = L \frac{dI}{dt} = L\, I_0\, \omega \cos(\omega t) = U_0 \cos(\omega t), \tag{9.116}$$

where the amplitude of the harmonically oscillating voltage is defined by

$$U_0 = L\, I_0\, \omega. \tag{9.117}$$

According to the mesh rule, the voltage across the capacitor is

$$U_C = -U_L = -U_0 \cos(\omega t). \tag{9.118}$$
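The solution (9.115) to (9.118) can be checked numerically: the second derivative of the current, approximated by a finite difference, should equal $-I/(LC)$, and the two voltages should add up to zero. The sketch below does this for assumed example values; the function names are illustrative.

```python
import math

L, C, I0 = 10e-3, 1e-6, 0.1        # example values (assumed): 10 mH, 1 uF, 0.1 A
omega = 1.0 / math.sqrt(L * C)     # angular frequency, Eq. (9.115)
U0 = L * I0 * omega                # voltage amplitude, Eq. (9.117)


def current(t):
    return I0 * math.sin(omega * t)          # Eq. (9.115)


def u_coil(t):
    return U0 * math.cos(omega * t)          # Eq. (9.116)


def u_capacitor(t):
    return -U0 * math.cos(omega * t)         # Eq. (9.118)


t, h = 3.0e-5, 1.0e-7
second_derivative = (current(t + h) - 2 * current(t) + current(t - h)) / h**2

print(f"f = {omega / (2 * math.pi):.1f} Hz, U0 = {U0:.2f} V")
print(second_derivative, -current(t) / (L * C))   # both sides of Eq. (9.114)
print(u_coil(t) + u_capacitor(t))                 # mesh rule, Eq. (9.112)
```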
It is remarkable that the current and voltages in the circuit in Fig. 9.14a are not damped, that is, the sine and cosine functions continue to oscillate indefinitely. Thus, no energy is lost. The energy in the oscillating circuit consists of the sum of the magnetic energy $W_m$ in the coil and the electrical energy $W_\mathrm{el}$ in the capacitor:

$$W = W_m + W_\mathrm{el} = \frac{1}{2} L I_0^2 \sin^2(\omega t) + \frac{1}{2} C U_0^2 \cos^2(\omega t). \tag{9.119}$$

We can transform the electric energy using (9.117) and (9.115), so that we see that the energy in the oscillating circuit

$$W = \frac{1}{2} L I_0^2 \sin^2(\omega t) + \frac{1}{2} L I_0^2 \cos^2(\omega t) = \frac{1}{2} L I_0^2 \tag{9.120}$$
is constant in time. Here we have used $\cos^2(\omega t) + \sin^2(\omega t) = 1$. The energy is therefore periodically exchanged between the coil and the capacitor, while the sum remains constant. The distribution of energy between coil and capacitor as a function of time is sketched in Fig. 9.14b.

Written Test: LC Resonant Circuit
In an ideal LC resonant circuit with capacitance $C = 1.5$ F and inductance $L = 60$ mH, at a time $t'$ during the oscillation the energy is divided into $W_\mathrm{el} = 1$ J in the capacitor and $W_m = 2$ J in the coil. What is the maximum voltage $U_0$ across the capacitor and the maximum current $I_0$ in the coil at the appropriate times?

Solution
The total energy is

$$W = W_m + W_\mathrm{el} = 3\,\mathrm{J}. \tag{9.121}$$

The charge on the capacitor is at a maximum when all the energy is stored in the capacitor; therefore

$$\frac{1}{2} C U_0^2 = W \tag{9.122}$$

or, solved for the voltage:

$$U_0 = \sqrt{\frac{2W}{C}} = \sqrt{\frac{6\,\mathrm{J}}{1.5\,\mathrm{F}}} = 2\,\mathrm{V}. \tag{9.123}$$

Accordingly, the current is at a maximum when all the energy is present as magnetic energy in the coil:

$$\frac{1}{2} L I_0^2 = W \tag{9.124}$$

and, solved for the current:

$$I_0 = \sqrt{\frac{2W}{L}} = \sqrt{\frac{6\,\mathrm{J}}{60\,\mathrm{mH}}} = 10\,\mathrm{A}. \tag{9.125}$$

9.3.1 Electric Resonant Circuit with Damping
In reality, an undamped oscillating circuit does not normally exist. The only exceptions are electrical oscillating circuits made of superconducting materials, which can conduct current without loss. All conventional conductors, on the other hand, have an ohmic resistance. This can be inserted into the LC resonant circuit as an additional resistor. The corresponding circuit is sketched in Fig. 9.15a. The mesh rule (9.112) must therefore be extended to include the voltage $U_R = R\,I$ across the resistor. In the differential equation (9.114), an additional term thus arises which is proportional to the derivative of the current $dI(t)/dt$:

$$\frac{d^2 I(t)}{dt^2} = -\frac{1}{LC}\, I(t) - \frac{R}{L} \frac{dI(t)}{dt}. \tag{9.126}$$
The new term in the differential equation (9.126) describes the voltage drop across the resistor, which is linked to ohmic losses. This reduces the total energy in the oscillating circuit, and the oscillation is damped. This is quite analogous to the damped oscillation in Sect. 4.1.2. Depending on the strength of the damping, a distinction is made between different cases. In the case of weak damping, $R < R_\mathrm{crit}$, the current oscillates harmonically but with exponentially decreasing amplitude (Fig. 9.15b). In the aperiodic limit case with critical damping,

$$R_\mathrm{crit} = 2 \sqrt{\frac{L}{C}}, \tag{9.127}$$

the damping is just strong enough that no more oscillation occurs. In the so-called overdamped case, $R > R_\mathrm{crit}$, the resistance is so high and the current therefore so small that the energy in the oscillating circuit decays only on a very long time scale. In this case, the current “creeps” slowly towards zero.

Fig. 9.15 (a) Damped RCL resonant circuit. (b) Depending on the magnitude of the resistance, a distinction is made between oscillatory case (red), aperiodic limit case (black) and overdamped case (creep case, blue)
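The distinction between the three damping cases depends only on how the resistance compares with $R_\mathrm{crit}$ from (9.127). A minimal sketch, with illustrative function names and assumed component values:

```python
import math


def critical_resistance(L, C):
    """Critical damping resistance, Eq. (9.127)."""
    return 2.0 * math.sqrt(L / C)


def damping_regime(R, L, C):
    """Classify a series RCL circuit by comparing R with the critical resistance."""
    R_crit = critical_resistance(L, C)
    if math.isclose(R, R_crit, rel_tol=1e-9):
        return "aperiodic limit case"
    return "oscillatory case" if R < R_crit else "overdamped case"


L, C = 10e-3, 1e-6                       # example values (assumed): 10 mH, 1 uF
print(f"R_crit = {critical_resistance(L, C):.1f} Ohm")
for R in (20.0, critical_resistance(L, C), 2000.0):
    print(f"R = {R:7.1f} Ohm: {damping_regime(R, L, C)}")
```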
9.3.2 RCL Bandpass Filter
If a weakly damped RCL resonant circuit is combined with an AC voltage source (Fig. 9.16a), it is referred to as a bandpass filter. Here, the amplitude of the output voltage $U_\mathrm{out} = U_R$, which drops across the resistor, depends on the difference between the frequency $\omega$ of the applied AC voltage and the resonant frequency $\omega_0$ of the resonant circuit. We have already learned about the corresponding behaviour of mechanical oscillators in Sect. 4.1.2 on the driven harmonic oscillator. Exactly on resonance, i.e. $\omega = \omega_0$, with resonant frequency

$$\omega_0 = \sqrt{\frac{1}{LC} - \left(\frac{R}{2L}\right)^2}, \tag{9.128}$$

the output voltage is equal to the input voltage, while for both higher and lower input frequencies the output voltage drops (Fig. 9.16b). The frequency range (the band) in which the filter passes at least 50% of the input power is given by the width

$$\Delta\omega = \frac{R}{L}. \tag{9.129}$$

Fig. 9.16 (a) RCL bandpass filter. (b) The bandpass filter passes frequencies in a band with width Δω = R/L around the resonant frequency ω0
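The bandpass behaviour can be illustrated with the complex voltage divider (9.74), taking $Z_2 = R$ and $Z_1$ as the series connection of coil and capacitor. The sketch below uses assumed values for a weakly damped circuit. It evaluates the amplitude ratio at $1/\sqrt{LC}$, which for weak damping lies close to $\omega_0$ in (9.128), and at the two band edges where $|\omega L - 1/(\omega C)| = R$.

```python
import math


def bandpass_gain(omega, R, L, C):
    """|U_out / U_in| for a series RCL circuit with the output taken across R."""
    z_total = R + 1j * omega * L + 1.0 / (1j * omega * C)
    return abs(R / z_total)


R, L, C = 10.0, 10e-3, 1e-6            # weakly damped example values (assumed)
omega_lc = 1.0 / math.sqrt(L * C)      # close to omega_0 of Eq. (9.128) for weak damping
delta_omega = R / L                    # bandwidth, Eq. (9.129)

# Band edges where |omega L - 1/(omega C)| = R, i.e. half of the power is passed
root = math.sqrt((R / (2 * L)) ** 2 + 1.0 / (L * C))
omega_low = root - R / (2 * L)
omega_high = root + R / (2 * L)

for w in (omega_low, omega_lc, omega_high):
    print(f"omega = {w:9.1f} rad/s, gain = {bandpass_gain(w, R, L, C):.4f}")
print(f"omega_high - omega_low = {omega_high - omega_low:.1f} rad/s, R/L = {delta_omega:.1f} rad/s")
```

The distance between the two band edges equals $R/L$, in agreement with (9.129).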
9.3.3 Hertzian Dipole – Radiation of Electromagnetic Waves
As a last application we consider a so-called rod antenna, which consists of a straight piece of wire. This wire can also be thought of as an LC resonant circuit consisting of a single turn, whose circuit has been bent open so that the former capacitor plates correspond to the two ends of the wire (Fig. 9.17). Although the shape has been greatly changed compared to the LC resonant circuit, the rod antenna still has characteristics of the resonant circuit: if the frequency of the applied AC voltage matches the resonant frequency $f_A$ of the antenna, electric charges are excited to oscillate between the ends of the antenna. The antenna frequency is

$$f_A = \frac{c}{2l}, \tag{9.130}$$
with the speed of light c = 3 × 108 m/s and the length of the antenna l. Formula (9.130) describes a standing wave in the antenna with an antinode in the middle of the antenna, analogous to the mechanical standing waves in Sect. 4.2.1.1. An oscillation starts analogous to the charged capacitor in the oscillating circuit with → → charges on the two wire ends (Fig. 9.17b). This creates an electric dipole p = q d with a corresponding electric field. The electric field in the conductor causes a
Fig. 9.17 (a) From left to right: Starting with an LC resonant circuit with AC supply, the coil is first omitted so that the circuit itself constitutes a single winding. Then the capacitor plates are separated from each other and the circuit is bent open so that a straight piece of wire consisting of two parts is created. The wire ends correspond to the former capacitor plates. This configuration is called a dipole antenna. It still has characteristics of the LC resonant circuit, such as a resonant frequency at which the charge oscillates between the wire ends. (b) The charge oscillation produces → an oscillating electric dipole p , which is called a Hertzian dipole. This causes electromagnetic waves to be emitted
238
9
Electronics
Fig. 9.18 Angular characteristic of dipole radiation. The length of the arrow at the angle θ to the oscillating dipole illustrates the power per area radiated at that angle. The ends of all possible arrows result in the sketched line shape. The sketch here shows only a section through the dipole; in three dimensions, the dependence is rotationally symmetric about the direction of oscillation of the dipole, that is, the sketched lobes are a section through a torus around the dipole
current to flow from the positively to the negatively charged end of the wire. The current here is greatest when the two ends are just uncharged. At this point, the electric dipole and the electric field are zero. However, the current flow through the conductor generates a magnetic field which runs circularly around the conductor. Due to the self-induction of the wire, the current continues to flow until the two ends of the wire are again charged with the original charge, but now with opposite signs. Thus the electric dipole and the dipole field have also reversed. The current flow can now start again in the opposite direction. The dipole performs a harmonic oscillation:

$$\vec{p}(t) = \vec{p}_0 \cos(\omega t). \tag{9.131}$$
An oscillating electric dipole moment is generally also called a Hertzian dipole. Hertzian dipoles are not only found in rod antennas, but also in atoms, for example, when an electron makes a harmonic oscillation around the atomic nucleus. The special thing about Hertzian dipoles is the fact that during such an oscillation electric and magnetic fields are detached from the source (the antenna or the atom) and travel through space as an electromagnetic wave. Here the wavelength λ is related to the length of the rod antenna l:

$$l = \frac{\lambda}{2}. \tag{9.132}$$
Such antennas are therefore also referred to as λ/2 dipole antennas. Electromagnetic waves and their properties are described in detail in Sect. 10.1. Relevant at this point is the directional dependence of the dipole radiation. Figure 9.18 illustrates the power per area (= intensity)

$$P_A(\theta) \propto \sin^2(\theta), \tag{9.133}$$
which is emitted at the angle θ relative to the direction of oscillation of the dipole. The length of the sketched arrow is proportional to the intensity radiated in the
direction of the arrow $P_A(\theta)$. The ends of all possible arrows then give the sketched curve shape. It is remarkable that because of the sin² dependence no power is radiated in the oscillation direction of the dipole. Similarly, a rod antenna does not radiate power in the direction of the rod. Most power, however, is radiated perpendicular to the dipole oscillation or perpendicular to the rod. This is relevant, for example, for remote-controlled (RC) cars. Beginners often make the mistake of aiming the antenna of the remote control at the car. In this orientation, however, the connection is at its worst.

Examination Task: Antenna
What must be the length l of a dipole antenna so that it radiates a frequency of f = 90 MHz?

Solution
According to (9.130)

$$l = \frac{c}{2f} = \frac{3 \times 10^8\ \mathrm{m/s}}{2 \cdot 90 \times 10^6\ \mathrm{1/s}} = 1.67\ \mathrm{m}. \tag{9.134}$$

The antenna must therefore have a length of 1.67 m.
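The antenna task and the angular characteristic (9.133) can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names are illustrative assumptions, and the speed of light is rounded to 3 × 10^8 m/s as in the solution above.

```python
import math

# Speed of light in m/s, rounded as in the worked solution (assumption of this sketch)
C_LIGHT = 3.0e8


def dipole_length(frequency_hz):
    """Length l (in m) of a lambda/2 dipole resonant at frequency_hz, from Eq. (9.130)."""
    return C_LIGHT / (2.0 * frequency_hz)


def relative_intensity(theta_deg):
    """Radiated power per area relative to its maximum, sin^2(theta), from Eq. (9.133)."""
    return math.sin(math.radians(theta_deg)) ** 2


# Antenna for f = 90 MHz
print(f"l = {dipole_length(90e6):.2f} m")

# No power along the rod (0 degrees), maximum perpendicular to it (90 degrees)
for theta in (0, 30, 60, 90):
    print(f"theta = {theta:2d} deg: relative intensity = {relative_intensity(theta):.2f}")
```

The loop illustrates why pointing the rod directly at the receiver gives the weakest signal: the relative intensity vanishes at θ = 0.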
9.4 Electronics – Compact
Here, the most important formulas for electronics are summarized:

RMS values for alternating currents in resistors (9.7) and (9.7):

$$P_{\mathrm{eff}} = \frac{1}{2} U_0 I_0, \qquad U_{\mathrm{eff}} = \frac{U_0}{\sqrt{2}}, \qquad I_{\mathrm{eff}} = \frac{I_0}{\sqrt{2}}.$$

AC resistance of ohmic resistor (9.4), capacitor (9.14) and coil (9.19):

$$R(\omega) = R, \qquad R_C(\omega) = \frac{1}{\omega C}, \qquad R_L(\omega) = \omega L.$$

Complex impedance of ohmic resistor (9.52), capacitor (9.53) and coil (9.54):

$$\text{resistor: } Z(\omega) = R, \qquad \text{capacitor: } Z(\omega) = \frac{1}{i\omega C}, \qquad \text{coil: } Z(\omega) = i\omega L.$$

Node (knot) rule (9.23) and mesh rule (9.24):

$$\sum_j I_{\mathrm{in},j} = \sum_j I_{\mathrm{ab},j}, \qquad \sum_j U_j = 0.$$

Series connection of resistors (9.37), capacitors (9.40) and coils (9.42):

$$R_{\mathrm{tot}} = \sum_{j=1}^{N} R_j, \qquad \frac{1}{C_{\mathrm{tot}}} = \sum_{j=1}^{N} \frac{1}{C_j}, \qquad L_{\mathrm{tot}} = \sum_{j=1}^{N} L_j.$$

Parallel connection of resistors (9.38), capacitors (9.41) and coils (9.43):

$$\frac{1}{R_{\mathrm{tot}}} = \sum_{j=1}^{N} \frac{1}{R_j}, \qquad C_{\mathrm{tot}} = \sum_{j=1}^{N} C_j, \qquad \frac{1}{L_{\mathrm{tot}}} = \sum_{j=1}^{N} \frac{1}{L_j}.$$

Voltage divider (9.73):

$$U_{\mathrm{out}} = R_2 I = \frac{R_2}{R_1 + R_2}\, U_{\mathrm{in}}.$$

Cut-off frequency at RC high and low pass filter (9.75):

$$f_{\mathrm{Limit}} = \frac{1}{2\pi R C}.$$

Cut-off frequency at RL high and low pass filter (9.82):

$$f_{\mathrm{Limit}} = \frac{R}{2\pi L}.$$

Current and voltage in transformer (9.89) and (9.91):

$$U_2 = \frac{N_2}{N_1}\, U_1, \qquad I_2 = \frac{N_1}{N_2}\, I_1.$$

Resonant frequency of the electric oscillating circuit (9.115):

$$\omega = \frac{1}{\sqrt{LC}}.$$

Critical damping in the electric resonant circuit (9.127):

$$R_{\mathrm{krit}} = 2 \sqrt{\frac{L}{C}}.$$

Resonant frequency of the dipole antenna (9.130):

$$f_A = \frac{c}{2l}.$$
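To make a few of the summarized formulas concrete, here is a short Python sketch. It is an illustration only: the component values (1 kΩ, 1 μF, 10 mH) and the function names are hypothetical choices, not values taken from the book.

```python
import math


def rc_cutoff_frequency(r_ohm, c_farad):
    """Cut-off frequency of an RC high or low pass filter, Eq. (9.75)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def rl_cutoff_frequency(r_ohm, l_henry):
    """Cut-off frequency of an RL high or low pass filter, Eq. (9.82)."""
    return r_ohm / (2.0 * math.pi * l_henry)


def lc_resonant_omega(l_henry, c_farad):
    """Angular resonant frequency of the LC circuit, Eq. (9.115)."""
    return 1.0 / math.sqrt(l_henry * c_farad)


def critical_resistance(l_henry, c_farad):
    """Resistance for critical damping, Eq. (9.127)."""
    return 2.0 * math.sqrt(l_henry / c_farad)


def voltage_divider(u_in, r1_ohm, r2_ohm):
    """Output voltage across R2 of a voltage divider, Eq. (9.73)."""
    return u_in * r2_ohm / (r1_ohm + r2_ohm)


# Hypothetical example components (assumed for illustration)
R, C, L = 1.0e3, 1.0e-6, 10.0e-3

print(f"RC cut-off:        {rc_cutoff_frequency(R, C):.1f} Hz")
print(f"RL cut-off:        {rl_cutoff_frequency(R, L):.1f} Hz")
print(f"LC resonance:      {lc_resonant_omega(L, C):.1f} rad/s")
print(f"critical damping:  {critical_resistance(L, C):.1f} Ohm")
print(f"divider (10 V in): {voltage_divider(10.0, R, R):.1f} V")
```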
10 Optics

10.1 Electromagnetic Waves
Electromagnetic waves are oscillating electric and magnetic fields which move through space as waves. In general, these waves can have arbitrarily complicated dependencies on position and time, but all these waves can be described by superposition of harmonic waves (see also Sect. 4.2), which are of the form

$$\vec{E}(\vec{r}, t) = \vec{E}_0 \cdot \sin(\vec{k} \cdot \vec{r} - \omega \cdot t), \tag{10.1}$$

$$\vec{B}(\vec{r}, t) = \vec{B}_0 \cdot \sin(\vec{k} \cdot \vec{r} - \omega \cdot t). \tag{10.2}$$
The electric and magnetic components of the wave oscillate harmonically with amplitude $\vec{E}_0$ and $\vec{B}_0$ and angular frequency ω. The wave propagates in the direction of the wave vector $\vec{k}$. Electromagnetic waves in free space are transverse waves, that is, the electric and magnetic fields are both perpendicular to the wave vector and also perpendicular to each other (Fig. 10.1). As with all waves, wavenumber k and wavelength λ are connected by

$$k = |\vec{k}| = \frac{2\pi}{\lambda} \tag{10.3}$$

and angular frequency ω and period T of an oscillation by

$$\omega = \frac{2\pi}{T}. \tag{10.4}$$

The speed of propagation of electromagnetic waves in free space is the speed of light
# Springer-Verlag GmbH Germany, part of Springer Nature 2023 S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_10
Fig. 10.1 An electromagnetic wave consists of a propagating electric wave $\vec{E}(\vec{r}, t)$ and, perpendicular to it, a magnetic wave $\vec{B}(\vec{r}, t)$. Both components are perpendicular to the direction of propagation of the wave. The wave propagates at the speed of light in the direction of the wave vector $\vec{k}$
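$$c = \frac{\lambda}{T} = \frac{\omega}{k}. \tag{10.5}$$

By definition it is

$$c = 299\,792\,458\ \frac{\mathrm{m}}{\mathrm{s}} \approx 3 \cdot 10^8\ \frac{\mathrm{m}}{\mathrm{s}}. \tag{10.6}$$

Electric and magnetic amplitude in electromagnetic waves are related by

$$E_0 = c \cdot B_0. \tag{10.7}$$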
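Relations (10.3) to (10.7) tie wavelength, wavenumber, angular frequency, period and the two field amplitudes together. The following Python sketch evaluates them for one example; the wavelength of 500 nm, the field amplitude of 300 V/m and the function names are assumptions made for illustration.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s, Eq. (10.6)


def wavenumber(wavelength_m):
    """k = 2*pi / lambda, Eq. (10.3)."""
    return 2.0 * math.pi / wavelength_m


def angular_frequency(wavelength_m):
    """omega = c * k, rearranged from Eq. (10.5)."""
    return C_LIGHT * wavenumber(wavelength_m)


def period(wavelength_m):
    """T = 2*pi / omega, rearranged from Eq. (10.4)."""
    return 2.0 * math.pi / angular_frequency(wavelength_m)


def magnetic_amplitude(e0_volt_per_m):
    """B0 = E0 / c, rearranged from Eq. (10.7)."""
    return e0_volt_per_m / C_LIGHT


lam = 500e-9  # assumed example wavelength (green light)
print(f"k     = {wavenumber(lam):.3e} 1/m")
print(f"omega = {angular_frequency(lam):.3e} 1/s")
print(f"f     = {angular_frequency(lam) / (2.0 * math.pi):.3e} Hz")
print(f"T     = {period(lam):.3e} s")
print(f"B0    = {magnetic_amplitude(300.0):.3e} T for E0 = 300 V/m")
```

The resulting frequency lies at several hundred THz, in line with the range quoted for visible light.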
Electromagnetic waves, by the way, also propagate without a propagation medium (in a vacuum). This is in contrast to sound waves, for example, which require a compressible medium (e.g., air) in which to propagate. Electromagnetic waves exist in a huge frequency range. Relevant for our everyday life are, for example, radio waves and microwaves with frequencies in the MHz and GHz range. At airports, we increasingly find THz radiation, which is used in full-body scanners (the so-called nude scanners). Visible light consists of electromagnetic waves with frequencies in the range of several 100 THz. The corresponding visible wavelength range is from about 400 nm (violet) to about 750 nm (red). X-rays have even higher frequencies in the range from $10^{17}$ to $10^{19}$ Hz.

Electromagnetic waves transport energy through space. The time-averaged energy that flows per area A and time t is called the light intensity I of the electromagnetic wave:

$$I = \frac{1}{2\mu_0} E_0 \cdot B_0 = \frac{1}{2} c\, \varepsilon_0 E_0^2, \quad \text{with } [I] = \frac{\mathrm{W}}{\mathrm{m}^2}, \tag{10.8}$$

with magnetic field constant μ0 and electric field constant ε0. For spatially constant intensity, it is related to the power P by:

$$I = \frac{P}{A}. \tag{10.9}$$
Electromagnetic waves do not only transport energy, they also have momentum. This can be seen when light is absorbed or reflected by an object. As a result, momentum is transferred to the object, as in the case of a classical impact, and a pressure is created on the object, the so-called radiation pressure $p_S$. When the light is completely absorbed, it is given by

$$p_S = \frac{I}{c} \tag{10.10}$$

and when it is fully reflected:

$$p_S = 2 \cdot \frac{I}{c}. \tag{10.11}$$

As with classical collisions, reflection transfers twice the momentum. The radiation pressure stabilizes, for example, stars in their interior against collapse due to gravity. The radiation pressure caused by sunlight is also partly responsible for the tails of comets that fly past the Sun.

Exercise: Electric Field in the Laser Beam
What is the electric field amplitude in the beam of a laser pointer with a power of P = 1 mW and a beam radius of r = 1 mm? Assume that the intensity is homogeneous over the beam cross-sectional area A = π · r².

Solution
The cross-sectional area of the beam is

$$A = \pi \cdot r^2 = 3.14\ \mathrm{mm}^2. \tag{10.12}$$

This results in the intensity of the laser beam

$$I = \frac{P}{A} = \frac{1\ \mathrm{mW}}{3.14\ \mathrm{mm}^2} = 318.5\ \frac{\mathrm{W}}{\mathrm{m}^2}. \tag{10.13}$$

The electric field amplitude according to (10.8) is thus

$$E_0 = \sqrt{\frac{2I}{\varepsilon_0 c}} = 490\ \frac{\mathrm{V}}{\mathrm{m}}. \tag{10.14}$$
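The laser pointer exercise can be reproduced with a few lines of Python. This is a sketch under the same assumption of homogeneous intensity; the constant values are the standard SI values, and the function names are illustrative. It also evaluates the radiation pressure from (10.10) and (10.11) for the same beam, which the exercise itself does not ask for.

```python
import math

C_LIGHT = 299_792_458.0      # speed of light in m/s
EPSILON_0 = 8.8541878128e-12  # electric field constant in As/(Vm)
MU_0 = 1.0 / (EPSILON_0 * C_LIGHT**2)  # magnetic field constant in Vs/(Am)


def intensity(power_w, radius_m):
    """I = P / A with A = pi * r^2, Eqs. (10.9) and (10.12)."""
    return power_w / (math.pi * radius_m**2)


def field_amplitude(intensity_w_m2):
    """E0 = sqrt(2 I / (eps0 c)), rearranged from Eq. (10.8)."""
    return math.sqrt(2.0 * intensity_w_m2 / (EPSILON_0 * C_LIGHT))


def radiation_pressure(intensity_w_m2, reflected=False):
    """p_S = I / c for absorption, 2 I / c for full reflection, Eqs. (10.10), (10.11)."""
    factor = 2.0 if reflected else 1.0
    return factor * intensity_w_m2 / C_LIGHT


I_beam = intensity(1e-3, 1e-3)  # P = 1 mW, r = 1 mm
E0 = field_amplitude(I_beam)
print(f"I  = {I_beam:.1f} W/m^2")
print(f"E0 = {E0:.0f} V/m")
print(f"p_S (absorbed)  = {radiation_pressure(I_beam):.2e} Pa")
print(f"p_S (reflected) = {radiation_pressure(I_beam, reflected=True):.2e} Pa")
```

The intensity differs slightly from 318.5 W/m² because the sketch uses the full value of π instead of 3.14; the field amplitude still rounds to 490 V/m.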
10.1.1 Waves in Matter: Refractive Index
When an electromagnetic wave propagates through a medium, the electric field $\vec{E}(t)$ of the wave acts with the force $\vec{F}(t) = q \cdot \vec{E}(t)$ on the negatively charged electrons and on the positively charged atomic nuclei. This causes the electrons to oscillate back and forth at the frequency of the wave relative to the nuclei of the atoms. Each atom or molecule in the medium is therefore a microscopic Hertzian dipole, as described in Sect. 9.3.3, and emits an electromagnetic wave. The total wave in the medium is therefore the sum of the incident wave and all the individual waves of the microscopic dipoles. At first glance this seems to be an extremely complicated system, and in fact complicated phenomena can occur in the details, which are an active area of research, for example, localization of light. In the normal case, however, the electric field of the total wave can be described very simply as

$$\vec{E}(\vec{r}, t) = \vec{E}_0 \cdot \sin(\vec{k} \cdot n \cdot \vec{r} - \omega \cdot t). \tag{10.15}$$
The wave in the medium differs from the wave in the vacuum only by the fact that its spatial phase $\vec{k} \cdot n \cdot \vec{r}$ contains the refractive index n, which is a number characteristic for the frequency ω and the medium. The refractive index n is related to the relative dielectric constant $\varepsilon_r$ (Sect. 6.4.2.2) of the medium via

$$n(\omega) = \sqrt{\varepsilon_r(\omega)}, \quad [n] = 1. \tag{10.16}$$
In vacuum, by definition n = 1. Typical refractive indices in the optically visible frequency range are

$$\text{Air}: n \approx 1.0, \tag{10.17}$$

$$\text{Water}: n \approx 1.33, \tag{10.18}$$

$$\text{Glass}: n \approx 1.5. \tag{10.19}$$
One can also interpret the refractive index in such a way that the wavenumber $k_n$ is changed in the medium with

$$k_n = k \cdot n, \tag{10.20}$$

with the wavenumber k = 2π/λ in vacuum. One can also formally assign the refractive index to the wavelength, so that the wavelength in the medium is

$$\lambda_n = \frac{\lambda}{n} \tag{10.21}$$

with the wavelength λ in vacuum. Since the frequency of oscillation in the medium is the same as in the vacuum,

$$\omega_n = \omega, \tag{10.22}$$

the speed of light $c_n$ in a medium with refractive index n is different from the speed of light c in vacuum:

$$c_n = \frac{c}{n}. \tag{10.23}$$
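Relations (10.20) to (10.23) can be tabulated for the media listed above. The Python sketch below does this for an assumed vacuum wavelength of 600 nm; the function names and the example wavelength are illustrative, while the refractive indices are the approximate values from (10.17) to (10.19).

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in vacuum in m/s


def wavelength_in_medium(wavelength_vacuum_m, n):
    """lambda_n = lambda / n, Eq. (10.21)."""
    return wavelength_vacuum_m / n


def wavenumber_in_medium(wavelength_vacuum_m, n):
    """k_n = k * n with k = 2*pi / lambda, Eq. (10.20)."""
    return 2.0 * math.pi * n / wavelength_vacuum_m


def phase_velocity(n):
    """c_n = c / n, Eq. (10.23)."""
    return C_VACUUM / n


# Approximate refractive indices in the visible range, Eqs. (10.17) to (10.19)
media = {"air": 1.0, "water": 1.33, "glass": 1.5}
lam = 600e-9  # assumed vacuum wavelength

for name, n in media.items():
    print(f"{name:5s}: lambda_n = {wavelength_in_medium(lam, n) * 1e9:.0f} nm, "
          f"c_n = {phase_velocity(n):.3e} m/s")
```

Multiplying `phase_velocity(n)` by `wavenumber_in_medium(lam, n)` gives the same angular frequency for every medium, which is the content of (10.22).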
10.1.1.1 Outlook: Superluminal Velocity in Media
Since in most normal materials n > 1, the speed of light in a medium is normally $c_n < c$. However, if a medium has resonances, the refractive index can also be n < 1 for small frequency ranges, and the speed of light in this case is $c_n > c$. So light propagates faster in the medium than the speed of light in vacuum. This seems to be a contradiction, because since Einstein we know that nothing can travel faster than the speed of light. Einstein's statement needs to be clarified in this context, because strictly speaking it only concerns the speed at which information can be transmitted. The speed of light $c_n$ in the medium, in contrast, is the so-called phase velocity, that is, the speed at which places of the same phase propagate in the oscillating electromagnetic wave. The phase velocity is not the velocity with which information can be transmitted. To really transmit data, one has to influence the amplitude of the wave, for example. One possibility is to generate and transmit short light pulses which then represent the information “one” (pulse on) or “zero” (pulse off). Within such a pulse, the light phase can certainly move at $c_n > c$, but the pulse as a whole can only propagate at a speed that is less than the speed of light in vacuum.

10.1.1.2 Dispersion
Dispersion is the property of materials that their refractive index depends on the wavelength, n = n(λ). Under normal circumstances, that is at frequencies not too close to a resonance, the refractive index decreases with increasing wavelength (Fig. 10.2a):

$$\frac{dn}{d\lambda} < 0, \tag{10.24}$$

and one speaks of normal dispersion. For example, in glass of the designation N-BK7 (also known as Schott glass), the refractive index decreases from the blue wavelength range with n ≈ 1.54 to the red range with n ≈ 1.52. An example of dispersion is the color separation of white light in a prism (Fig. 10.2b), with its application as a spectrometer. Another example is the colours in the rainbow, which are produced by dispersion in water droplets.
Fig. 10.2 (a) The refractive index of glass is subject to normal dispersion, that is, it decreases with increasing wavelength. (b) White light (shown as a black line) is refracted in a prism due to dispersion and is split into its colored spectral components
10.1.2 Absorption and Scattering
In media, in addition to the change in the phase of the light wave due to the refractive index, there is typically also an attenuation of the light power. This can occur either by absorption of the light in the medium or by scattering of the light in directions other than the direction of propagation. In both cases, the light power in the wave decreases according to an exponential law called the Lambert-Beer law. The light power after the distance x in the medium is

$$P(x) = P(0) \cdot e^{-\mu \cdot x}, \quad \text{with absorption coefficient } \mu, \quad [\mu] = \frac{1}{\mathrm{m}}, \tag{10.25}$$

where the light power at the beginning of the medium is P(0). Like the refractive index, the absorption coefficient also depends on the wavelength of the light, that is, μ = μ(λ). In glass, the absorption coefficient has a minimum at a wavelength of about λ = 1.5 μm. This is the reason why exactly this wavelength is used in optical telecommunications via fiber optic networks, because as little light power as possible is lost in the optical fibers.

Depending on the size of the scattering objects, a distinction is made between Rayleigh scattering (for objects smaller than the wavelength of light) and Mie scattering (for objects larger than the wavelength of light). Rayleigh scattering occurs, for example, in the atmosphere when sunlight is scattered by air molecules. The electric field of the light wave excites oscillating dipoles in the air molecules. The dipoles then radiate light in different directions according to the dipole characteristics (Sect. 9.3.3). The radiated light power $P_{sc}$ integrated over all directions is proportional to

$$P_{sc} \propto \omega^4 \tag{10.26}$$
and thus depends very strongly on the frequency or the wavelength. Blue light is therefore scattered much more strongly than red light. This is the reason why the sky appears blue to us when we look in any direction other than where the sun is. We see the scattered blue part of the sunlight spectrum. If, on the other hand, we look in the direction of the sun, for example, in the evening, we see the light spectrum without the scattered blue components, and the sky around the sun looks yellow to reddish. Mie scattering occurs, for example, in water droplets in fog and clouds and in soot particles in smoke. Unlike Rayleigh scattering, Mie scattering is almost independent of wavelength, so all colors are scattered equally. This is the reason why fog, clouds and also smoke appear white, or grey to black if the light is also absorbed.

Exercise: Absorption in Glass Fibres
The absorption coefficient of light of wavelength λ = 1.5 μm in glass is $\mu = 3.6 \cdot 10^{-5}$/m. After how many kilometres at the latest must a light amplifier (repeater) be installed if the light power must not drop below 10% of its initial value over this distance?

Solution
According to (10.25)

$$P(0) \cdot e^{-\mu \cdot x} = 0.1 \cdot P(0). \tag{10.27}$$

We cancel P(0) and solve the equation for the length x:

$$x = \frac{\ln(0.1)}{-\mu} = \frac{-2.3026}{-3.6 \cdot 10^{-5}/\mathrm{m}} = 64\ \mathrm{km}. \tag{10.28}$$

A repeater must therefore be installed after a distance of 64 km at the latest.
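The repeater calculation follows directly from the Lambert-Beer law (10.25). A minimal Python sketch with illustrative function names:

```python
import math


def transmitted_fraction(mu_per_m, distance_m):
    """P(x) / P(0) = exp(-mu * x), Eq. (10.25)."""
    return math.exp(-mu_per_m * distance_m)


def distance_for_fraction(mu_per_m, fraction):
    """Distance x at which P(x) / P(0) has dropped to the given fraction, Eq. (10.28)."""
    return -math.log(fraction) / mu_per_m


mu = 3.6e-5  # absorption coefficient in 1/m from the exercise
x = distance_for_fraction(mu, 0.1)
print(f"repeater distance: {x / 1000.0:.0f} km")
print(f"remaining power after {x / 1000.0:.0f} km: {transmitted_fraction(mu, x):.2f}")
```

Changing the fraction or the absorption coefficient shows how strongly the usable fibre length depends on μ: halving μ doubles the distance.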
10.1.2.1 Outlook: X-Ray Diagnostics
X-ray diagnostics is an important medical application of the absorption of electromagnetic waves. In this process, radiation with typical wavelengths in the range between 5 pm and 500 pm is generated in an X-ray tube by accelerating electrons from a cathode to high speeds with electric fields and then strongly decelerating them in an anode. This produces X-ray light with a characteristic spectrum consisting of a broad background (bremsstrahlung) and also containing characteristic lines depending on the material of the anode. This radiation is absorbed to different degrees in the human body depending on the density of the tissue. Bones absorb significantly more than adipose tissue, for example. On the basis of the radiation intensity transmitted through the body, it is therefore possible to draw conclusions about the type of tissue being examined. In modern X-ray computed tomography (CT), even different soft tissues such as the lungs and heart can be distinguished from one another, and three-dimensional images of the inside of the body can be generated in an imaging process.
10.2 Geometrical Optics

For many applications, the complete formalism of electromagnetic waves is unnecessarily complicated. For example, if one only wants to know in which direction $\vec{k}$ an electromagnetic wave propagates, it is more practical to describe light as rays. This field, which deals with light rays, is also called geometrical optics. It is applied to the refraction and reflection of light and forms the basis for optical imaging.

10.2.1 Refraction and Reflection at Interfaces
We consider a light beam passing from a medium with refractive index n1 to a second medium with refractive index n2. The angle of incidence α1 at the interface is measured with respect to the perpendicular (Fig. 10.3). One of the media can also be vacuum or air with n = 1. At the interface, part of the light is reflected. The angle $\alpha_1'$ of the reflected beam is the same as the angle of incidence:

$$\alpha_1' = \alpha_1. \tag{10.29}$$
The remaining part of the light power propagates as a beam in the second medium. Here, the direction of the beam changes at the interface, that is, the beam is refracted to the angle α2 with respect to the perpendicular. The relationship between the angle of refraction and the angle of incidence is given by the well-known Snell's law:
Fig. 10.3 Refraction of a light beam at the interface of two media with refractive indices n1 and n2. In case (a) n1 < n2 the beam is refracted towards the perpendicular, in case (b) n1 > n2 it is refracted away from the perpendicular. At the interface, reflection occurs in addition to refraction. Here, $\alpha_1' = \alpha_1$
Fig. 10.4 Reflected part R of the light power incident from a medium with refractive index n1 to a second medium with refractive index n2 as a function of the angle of incidence α. The two curves correspond respectively to p-polarized and s-polarized light. Here, in (a) light is reflected from the optically denser medium and in (b) from the optically thinner medium. For p-polarized light, there is a Brewster angle at which the reflectivity is R = 0. In the case of reflection from the optically thinner medium, total internal reflection occurs at large angles of incidence, where R = 1
$$n_1 \cdot \sin(\alpha_1) = n_2 \cdot \sin(\alpha_2). \tag{10.30}$$

If the refractive index of the first medium is less than that of the second medium, n1 < n2, it follows from (10.30) that the beam is refracted toward the perpendicular (Fig. 10.3a). The angle at which the refracted beam propagates is smaller than the angle of incidence, α2 < α1. Incidentally, the medium with the larger refractive index is also called the optically denser medium. If, on the other hand, n1 > n2, it follows from (10.30) that the ray is refracted away from the perpendicular, that is, α2 > α1 (Fig. 10.3b). Thus, the refracted beam propagates closer to the interface. Under this condition, if the angle of incidence exceeds a critical value, the refracted beam is turned completely into the interface. In the second medium, there is then no longer a refracted beam. The entire light power is then reflected at the interface. This phenomenon is therefore called total internal reflection. From (10.30), with the condition α2 = 90°, one can calculate the critical angle αT, called the total reflection angle:

$$\alpha_T = \arcsin\left(\frac{n_2}{n_1}\right). \tag{10.31}$$

How much of the incident light power the boundary surface reflects thus depends on the angle of incidence. The corresponding dependence of the reflectivity, defined as the ratio of the reflected power $P_R$ to the incident power $P_{in}$,

$$R = \frac{P_R}{P_{in}}, \tag{10.32}$$
is sketched in Fig. 10.4. A distinction is made here between (a) reflection from the optically denser medium and (b) reflection from the optically thinner medium. In case (b), the reflectivity is R = 1 in the region of total internal reflection, α > αT. In case (a), there is no total internal reflection. Figure 10.4 also distinguishes whether the light field oscillates parallel to the sketched plane of incidence (p-polarization) or perpendicular to the plane of incidence (s-polarization). The details of polarization are introduced in Sect. 10.3. The curves sketched in Fig. 10.4 are described by the so-called Fresnel equations, which will not be discussed in this book.

Written Test: Total Reflection in Water
If a diver looks at the water surface from below at an angle so that total internal reflection occurs, the surface looks like a silver mirror. What is the angle of total internal reflection? The refractive index of water is n1 = 1.33, that of air is n2 = 1.0.

Solution
The total internal reflection angle is according to (10.31)

$$\alpha_T = \arcsin\left(\frac{1.0}{1.33}\right) = 48.8°. \tag{10.33}$$
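Snell's law (10.30) and the critical angle (10.31) are easy to evaluate numerically. The following Python sketch uses illustrative function names and returns `None` where no refracted beam or no critical angle exists; that convention is a choice made for this sketch.

```python
import math


def refraction_angle_deg(n1, n2, alpha1_deg):
    """Angle of refraction from n1 sin(a1) = n2 sin(a2), Eq. (10.30).

    Returns None if total internal reflection occurs (no refracted beam).
    """
    s = n1 * math.sin(math.radians(alpha1_deg)) / n2
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))


def critical_angle_deg(n1, n2):
    """Total reflection angle arcsin(n2 / n1), Eq. (10.31).

    Returns None for n1 <= n2, where no total internal reflection exists.
    """
    if n1 <= n2:
        return None
    return math.degrees(math.asin(n2 / n1))


# Diver looking up from water (n1 = 1.33) into air (n2 = 1.0)
print(f"critical angle: {critical_angle_deg(1.33, 1.0):.1f} deg")

# Air to water: refraction towards the perpendicular
print(f"air -> water, 30 deg: {refraction_angle_deg(1.0, 1.33, 30.0):.1f} deg")

# Water to air beyond the critical angle: no refracted beam
print(f"water -> air, 60 deg: {refraction_angle_deg(1.33, 1.0, 60.0)}")
```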
10.2.2 Optical Imaging
If you use lenses to capture the light emitted by an object, you can create a real image of the object at another point in space. This means that the light distribution at this point is ideally the same as at the object itself, except for a magnification factor. For example, if a CCD camera is placed at the point in space where the image is created, the camera can be used to observe the object. The objective lenses on cameras have exactly this purpose: they create an image of the object to be photographed on the CCD chip of the camera. First, let us describe the properties of lenses here.
10.2.2.1 Lenses
Lenses usually consist of a plate made of plastic or glass, one or both sides of which have a curved surface. If the surface is curved outwards, it is called convex; if it is curved inwards, it is called concave. Both types of curvature can also be combined in one lens, in which case it is a concave-convex lens. If one side of the lens is flat, it is called a planar surface. Most lens surfaces are spherically curved, that is, the surface of the lens is part of the surface of a sphere with radius r. For convex surfaces the radius of curvature is by definition r > 0, for concave surfaces r < 0, and for plane surfaces r = ∞. The effect of a lens on a light beam is described by the focal length f of the lens. For thin lenses it is given by
Fig. 10.5 (a) Parallel rays are focused by a converging lens into a point in the focal plane behind the lens. (b) A diverging lens disperses parallel incoming rays so that the rays appear to come from a point in the focal plane in front of the lens. Rays passing through the center of a converging or diverging lens are not deflected
$$\frac{1}{f} = (n - 1) \cdot \left(\frac{1}{r_1} + \frac{1}{r_2}\right), \quad [f] = \mathrm{m}, \tag{10.34}$$

where the material of which the lens is made has refractive index n and the radii of curvature of the two sides of the lens are given by r1 and r2 respectively. The reciprocal of the focal length is called the refractive power D of the lens,

$$D = \frac{1}{f}, \quad [D] = \frac{1}{\mathrm{m}} = \mathrm{dpt}. \tag{10.35}$$

The unit of refractive power is the diopter (dpt) known from ophthalmic optics. If the focal length of a lens is f > 0, it is a converging lens, that is, rays entering the lens in parallel are concentrated in a focal point in the focal plane behind the lens (Fig. 10.5a). The focal plane is at a distance f from the lens. Rays passing through the center of the lens are not deflected, even if they pass through the lens at an angle. If, on the other hand, f < 0, the lens is a diverging lens, deflecting parallel rays outward so that they appear to emerge from a focal point in the focal plane in front of the lens (Fig. 10.5b). If two lenses with focal lengths f1 and f2 are combined by placing them at a short distance d ≪ f1, f2 from each other, the combination acts as a single effective lens with focal length f, where

$$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2}. \tag{10.36}$$
Written Test: Plano-Convex Lens
What is the focal length of a plano-convex lens made of glass with refractive index n = 1.5, whose curved side has a radius of curvature of r = 10 cm?

Solution
The focal length is according to (10.34)

$$\frac{1}{f} = (1.5 - 1) \cdot \left(\frac{1}{10\ \mathrm{cm}} + \frac{1}{\infty}\right) = \frac{1}{20\ \mathrm{cm}}, \tag{10.37}$$

that is, the focal length is f = 20 cm.
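The thin-lens formula (10.34), the refractive power (10.35) and the combination rule (10.36) can be sketched in Python as follows. The function names are illustrative; a plane surface is represented by `math.inf`, so that 1/r evaluates to zero, and the sign convention for the radii is the one used in the text (r > 0 for convex surfaces).

```python
import math


def focal_length(n, r1_m, r2_m):
    """Focal length of a thin lens, 1/f = (n - 1) (1/r1 + 1/r2), Eq. (10.34)."""
    return 1.0 / ((n - 1.0) * (1.0 / r1_m + 1.0 / r2_m))


def refractive_power(f_m):
    """Refractive power D = 1/f in diopters, Eq. (10.35)."""
    return 1.0 / f_m


def combined_focal_length(f1_m, f2_m):
    """Effective focal length of two thin lenses close together, Eq. (10.36)."""
    return 1.0 / (1.0 / f1_m + 1.0 / f2_m)


# Plano-convex lens from the written test: n = 1.5, r = 10 cm, second side plane
f = focal_length(1.5, 0.10, math.inf)
print(f"f = {f * 100:.0f} cm, D = {refractive_power(f):.0f} dpt")

# Two such lenses placed directly behind each other
print(f"combination: f = {combined_focal_length(f, f) * 100:.0f} cm")
```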
10.2.2.2 Lens Equation
In the context of geometrical optics, it is possible to determine graphically where the image of an object is formed. This is shown by way of example in Fig. 10.6. An object of size G is located at a distance g in front of a converging lens. We trace the rays emanating from the tip of the object in different directions. Ray (1), incident on the lens parallel to the optical axis, is deflected so that it passes through the focal point behind the lens. Ray (2), passing through the center of the lens, is not deflected, and ray (3), passing through the focal point in front of the lens, is deflected by the lens so that it is parallel to the optical axis. All three rays meet at one point. This is the image point of the tip of the object. If one performs this procedure for each point of the object, one thereby constructs the complete image. The image is formed at a distance b from the lens and has the size B. The ratio of the image size to the object size is the magnification V of the image. According to the intercept theorem, it also corresponds to the ratio of the image distance to the object distance:
Fig. 10.6 Geometric construction of an image with a converging lens of focal length f. The object with object distance g from the lens and object size G is projected onto an image at image distance b after the lens and image size B
$$V = \frac{B}{G} = \frac{|b|}{g}. \tag{10.38}$$

Since, as we shall see in a moment, the image distance can theoretically also be negative, namely whenever the image is formed in front of the lens, the absolute value of the image distance appears in (10.38). The relation between image distance and object distance with the focal length is given by the lens law

$$\frac{1}{f} = \frac{1}{g} + \frac{1}{b}, \quad \text{resp.} \quad b = \frac{g \cdot f}{g - f}. \tag{10.39}$$
For a converging lens with f > 0, if the object is farther from the lens than the focal length, g > f, then the image distance is b > 0, that is, a real image is formed behind the lens, which can be captured with a CCD camera at that location. However, if the object is closer to the lens than the focal length, g < f, then b < 0. In this case, there is no real image behind the lens; the refractive power of the lens is not large enough to focus the rays coming from the object, so the rays continue to diverge behind the lens. But they diverge in such a way that it looks as if they are coming from a place at a distance |b| in front of the lens. It is said that a virtual image is formed at this location. You can't put a CCD camera in the place of a virtual image, because it only exists when you look at it through the lens. For diverging lenses with f < 0, according to (10.39), the image distance is always b < 0, that is, there is no real image. However, virtual images (just like real images) can be re-imaged by a second lens. The so-called intermediate image is then treated as a new object; the distance of the intermediate image from the second lens is the new object distance. Such imaging systems with one or more intermediate images are often found in microscopes and telescopes.

Exercise: Imaging
An object is imaged using a lens with a focal length of f = 10 cm. First the object is in front of the lens at a distance of g1 = 30 cm and second at a distance of g2 = 8 cm. How large are the image distance and the magnification in each case?

Solution
The image distance according to (10.39) is given by

$$b_1 = \frac{30\ \mathrm{cm} \cdot 10\ \mathrm{cm}}{30\ \mathrm{cm} - 10\ \mathrm{cm}} = 15\ \mathrm{cm} > 0. \tag{10.40}$$

So it is a real image with magnification

$$V = \frac{15\ \mathrm{cm}}{30\ \mathrm{cm}} = \frac{1}{2}. \tag{10.41}$$

Since the magnification is less than one, it is actually a reduction. The image is exactly half the size of the original object. In the second case, the image distance is

$$b_2 = \frac{8\ \mathrm{cm} \cdot 10\ \mathrm{cm}}{8\ \mathrm{cm} - 10\ \mathrm{cm}} = -40\ \mathrm{cm} < 0. \tag{10.42}$$

Now there is a virtual image. The magnification is

$$V = \frac{40\ \mathrm{cm}}{8\ \mathrm{cm}} = 5. \tag{10.43}$$

The image is larger than the object by a factor of 5.
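The imaging exercise can be reproduced with the lens law (10.39) and the magnification (10.38). The Python sketch below uses illustrative function names and works in centimetres; the sign of the returned image distance distinguishes real (b > 0) from virtual (b < 0) images, as in the text.

```python
def image_distance(g, f):
    """Image distance b = g f / (g - f), Eq. (10.39). Same length unit as the inputs."""
    return g * f / (g - f)


def magnification(g, f):
    """Magnification V = |b| / g, Eq. (10.38)."""
    return abs(image_distance(g, f)) / g


def describe(g, f):
    """Summarize image distance, image type and magnification for one configuration."""
    b = image_distance(g, f)
    kind = "real" if b > 0 else "virtual"
    return f"g = {g} cm, f = {f} cm: b = {b:.1f} cm ({kind}), V = {magnification(g, f):.2f}"


print(describe(30.0, 10.0))   # object outside the focal length of a converging lens
print(describe(8.0, 10.0))    # object inside the focal length of a converging lens
print(describe(30.0, -10.0))  # diverging lens: always a virtual image
```

The third call uses an assumed diverging lens with f = −10 cm to illustrate that b < 0 for any object distance when f < 0.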
10.2.2.3 Imaging with the Eye
As an application of the lens equation, we will deal with the eye in the following. The eye contains a lens at its front, the eye lens, which has a variable focal length. Depending on the tension of the eye muscle, one can squeeze the eye lens more or less and thereby adjust the focal length in a range between about fmin = 18 mm when the muscle is contracted and about fmax = 25 mm when the muscle is relaxed. This allows the eye to focus on either distant objects (distance vision) or close objects (near vision). This process is also called accommodation of the eye. The distance between the lens of the eye and the retina at the back of the eye of an adult is approximately b = 22 mm. This is the target image distance in order to see a sharp image. Using this information, we now want to calculate how close an object can be in front of the eye so that we can see it in focus. The eye lens should have the minimum focal length of f = fmin = 18 mm. We solve the lens equation (10.39) for the object distance:

$$g = \frac{b \cdot f}{b - f} = \frac{22\ \mathrm{mm} \cdot 18\ \mathrm{mm}}{22\ \mathrm{mm} - 18\ \mathrm{mm}} = 99\ \mathrm{mm}. \tag{10.44}$$
So the minimum viewing distance is about 10 cm. You are welcome to try this out for yourself.
10.3 Polarization of Light

As explained in Sect. 10.1, electromagnetic waves in free space are transverse waves in which the electric field $\vec{E}$ and the magnetic field $\vec{B}$ both oscillate perpendicularly to the direction of propagation, which we place here in the direction of the z-axis. However, within the xy-plane perpendicular to the direction of propagation, the direction of the $\vec{E}$ field is free to be chosen. This freedom to specify the direction of $\vec{E}$ is called the polarization degree of freedom. The electromagnetic wave is said to be polarized in the direction of the $\vec{E}$ field. By the choice of $\vec{E}$ the direction of the $\vec{B}$ field is then also fixed with $\vec{B} \perp \vec{E}$. One now distinguishes different kinds of polarization.
10.3.1 Linear Polarization
In linear polarization, the direction of oscillation of $\vec{E}$ is constant (Fig. 10.7a). In general, the $\vec{E}$ field oscillates in one direction at an angle α to the x-axis. The wave is then

$$\vec{E}(z, t) = E_0 \cdot \begin{pmatrix} \cos(\alpha) \\ \sin(\alpha) \end{pmatrix} \cdot \cos(kz - \omega t), \tag{10.45}$$

with field amplitude E0. Any linear polarization of the form (10.45) can be decomposed into the two basic polarizations with linear polarization in the x and y directions:

$$\vec{E}(z, t) = E_0 \cos(\alpha) \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \cos(kz - \omega t) + E_0 \sin(\alpha) \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \cos(kz - \omega t). \tag{10.46}$$

Here, the two base polarizations oscillate in phase. This means that the component of the $\vec{E}$ field in the x-direction always reaches its maximum when the component in the y-direction also does so. Mathematically, this can be seen from the fact that both components in (10.46) oscillate with a cosine function.

Fig. 10.7 (a) In linear polarization, the electric field oscillates constantly back and forth in one direction. (b) In circular polarization, the electric field vector rotates in a circle. (c) In elliptical polarization, the electric field vector rotates on an ellipse. (d) Unpolarized light contains all possible polarizations
10.3.2 Circular Polarization
In circular polarization, the direction of oscillation of the electric field rotates around the z-axis, both at a fixed time when moving along the axis and at a fixed position on the axis as a function of time (Fig. 10.7b). Mathematically, circular polarization can be described by
$$\vec{E}(z,t) = E_0 \cdot \begin{pmatrix} \cos(kz - \omega t) \\ \sin(kz - \omega t) \end{pmatrix} \tag{10.47}$$
$$= E_0 \cdot \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \cos(kz - \omega t) + E_0 \cdot \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \sin(kz - \omega t). \tag{10.48}$$
The x-component oscillates with a cosine function, while the y-component oscillates with a sine function; that is, when the x-component reaches its maximum, the y-component is just zero.
10.3.3 Elliptical Polarization
The circular polarization just described, in which the electric field rotates in a circle, is a special case of elliptical polarization, in which the field rotates on an elliptical path. This can be described by the fact that the amplitudes of the oscillations in the x and y directions, which were the same in circular polarization, are now different:
$$\vec{E}(z,t) = \begin{pmatrix} E_{0,x} \cdot \cos(kz - \omega t) \\ E_{0,y} \cdot \sin(kz - \omega t) \end{pmatrix}. \tag{10.49}$$
This stretches the circle in the x or y direction (Fig. 10.7c).
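The three polarization states (10.45), (10.47) and (10.49) can be compared numerically. The following is a minimal Python sketch; the function name `field`, the flag `in_phase` and the sample amplitudes are illustrative assumptions and do not come from the text.

```python
import math

def field(phase, e0x, e0y, in_phase):
    """Return (Ex, Ey) for a given phase kz - wt.

    in_phase=True  -> both components follow a cosine (linear, Eq. 10.46)
    in_phase=False -> the y-component follows a sine (Eqs. 10.47 and 10.49)
    """
    ex = e0x * math.cos(phase)
    ey = e0y * (math.cos(phase) if in_phase else math.sin(phase))
    return ex, ey

# Eight sample phases spread over one full oscillation period.
phases = [2 * math.pi * i / 8 for i in range(8)]

# Circular: equal amplitudes, so the field magnitude stays constant.
circular = [math.hypot(*field(p, 1.0, 1.0, False)) for p in phases]

# Linear at alpha = 30 degrees: the field always points along the same line.
alpha = math.radians(30)
linear = [field(p, math.cos(alpha), math.sin(alpha), True) for p in phases]

# Elliptical: unequal amplitudes, the magnitude varies between E0y and E0x.
elliptical = [math.hypot(*field(p, 2.0, 1.0, False)) for p in phases]

print(circular)
print(min(elliptical), max(elliptical))
```

For the circular case every magnitude equals the amplitude, while for the elliptical case the magnitude swings between the two semi-axes.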
10.3.4 Unpolarized Light
Unpolarized light has no defined polarization. It arises, for example, in all thermal light sources such as the sun or also in light bulbs. Unpolarized light can be thought of as containing all types of polarization simultaneously (Fig. 10.7d). In fact, the polarization jumps back and forth between all possibilities statistically on a very fast time scale. Therefore, unpolarized light cannot easily be described mathematically.
10.3.5 Polarizing Elements and Effects
The following section deals with how polarized light is created and how the polarization of an electromagnetic wave can be changed.
10.3.5.1 Polarizers
An element frequently used in optics to change or adjust the polarization of a light wave is the so-called polarizer. It is available in the form of polarization foils, cubes or plates. The polarizer is characterized by an axis (Fig. 10.8). Polarizers are often mounted so that they can be rotated, which allows the direction of the polarizer axis to be adjusted. If a light wave of any polarization falls on a polarizer, only that portion of the incident polarization which is parallel to the polarizer axis is transmitted. The light wave after the polarizer is then linearly polarized in the direction of the polarizer axis. If linearly polarized light of power $P_0$ falls on the polarizer and its polarization encloses an angle α with the polarizer axis, the transmitted power $P_t$ behind the polarizer is given by the so-called law of Malus:
$$P_t = P_0 \cdot \cos^2(\alpha). \tag{10.50}$$
For a circularly polarized incident light wave, the polarizer allows exactly half of the power to pass through,
$$P_t = \frac{1}{2} \cdot P_0, \tag{10.51}$$
since according to (10.47) circular polarization can be represented as a superposition of two linearly polarized waves with the same amplitude. Here one can always choose the direction of the first wave parallel to the polarizer axis, and the direction of the second wave is then perpendicular to it. The first wave, which carries half the total power, is completely transmitted, while the second wave is completely blocked. After the polarizer, the polarization is then linear in the direction of the polarizer axis. The same is true for unpolarized light. Since it contains all possible polarizations, exactly half of the power is always polarized parallel to the polarizer axis. So half of the power is transmitted, and this part is then polarized parallel to the polarizer axis.
Fig. 10.8 When linearly polarized light impinges on a polarizer, only the part parallel to the polarizer axis is transmitted. The transmitted power is given by $P_t = P_0 \cdot \cos^2(\alpha)$
Exercise: Crossed Polarizers
A light wave falls on two crossed polarizers, that is, the two polarizer axes are at an angle of 90° to each other (Fig. 10.9). For the sake of simplicity, let the polarization direction of the incident wave be oriented parallel to the polarizer axis of the first polarizer. (a) What is the power transmitted through both polarizers? (b) How does the power change if a third polarizer is placed between the two crossed polarizers with its polarization axis at an angle of 45° to the other two polarizers?
Fig. 10.9 (a) No light passes through two crossed polarizers. (b) If a third polarizer is placed between two crossed polarizers, the axis of which does not coincide with the other two polarizer axes (45° in the sketch), light passes through all three polarizers
Solution
(a) The first polarizer is at an angle of $\alpha_1 = 0^\circ$ to the incident polarization. Therefore, it transmits the entire power, that is, $P_1 = P_0$. Since the second polarizer is perpendicular to it, the angle is $\alpha_2 = 90^\circ$. Therefore, the second polarizer does not let the light beam pass, that is,
$$P_t = 0. \tag{10.52}$$
(b) Now we analyze what happens if a third polarizer is added. Since each polarizer by itself takes away a part of the power, or in the best case leaves the power unchanged, and since the transmitted power in case (a) is zero, one could assume that nothing changes, because there cannot be less than zero power. However, this consideration ignores the fact that a polarizer also changes the direction of polarization. For the first polarizer, nothing changes compared to (a), that is, $P_1 = P_0$. The second, new polarizer is now at an angle of $\alpha_2 = 45^\circ$ to the polarization, so that according to the law of Malus
$$P_2 = P_1 \cdot \cos^2(45^\circ) = \frac{1}{2} \cdot P_1 = \frac{1}{2} \cdot P_0. \tag{10.53}$$
It is now decisive that the axis of polarization of the light wave is tilted by 45° after the second polarizer. Therefore, the angle between the polarization and the axis of the third polarizer is also given by $\alpha_3 = 45^\circ$, and the transmitted power is
$$P_t = P_2 \cdot \cos^2(45^\circ) = \frac{1}{2} \cdot P_2 = \frac{1}{4} \cdot P_0. \tag{10.54}$$
Thus, in case (b), a quarter of the total incident power is transmitted, although more polarizers are involved than in case (a). In fact, one can increase the transmitted power by inserting more polarizers, each with smaller angular differences. In the limit of infinitely many polarizers, whose axes differ from one to the next only by an infinitesimally small angle, 100% of the total power is transmitted.
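The closing remark of the exercise can be checked numerically by applying the law of Malus (10.50) repeatedly. The following Python sketch assumes ideal, lossless polarizers and incident light polarized along the first polarizer axis; the function name `transmitted_fraction` is a hypothetical helper introduced only for this illustration.

```python
import math

def transmitted_fraction(n_steps, total_angle_deg=90.0):
    """Fraction of the power left after n_steps ideal polarizers.

    Each polarizer axis is rotated by total_angle_deg / n_steps relative to
    the polarization arriving at it, so Malus' law contributes one factor
    cos^2(step) per polarizer.
    """
    step = math.radians(total_angle_deg / n_steps)
    return math.cos(step) ** (2 * n_steps)

# n_steps = 1 is case (a), n_steps = 2 is case (b) of the exercise.
for n in (1, 2, 10, 100, 1000):
    print(n, transmitted_fraction(n))
```

With one step the result is numerically zero, with two steps it is one quarter, and for many small steps the fraction approaches 1.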
10.3.5.2 Polarization by Light Scattering
Polarization also occurs in the Rayleigh scattering of light. We consider the situation sketched in Fig. 10.10. A light wave excites a Hertzian dipole to oscillate, and an observer measures the scattered light at an angle of 90° to the axis of incidence. We assume that the incident light is unpolarized; for example, it could be sunlight scattered by air molecules. It therefore contains components that oscillate parallel to the direction of observation (p-polarized light) and components that oscillate perpendicular to it (s-polarized light). For the p-polarized component, the Hertzian dipole oscillates parallel to the direction of observation, following the light field. According to the dipole characteristic (9.133), however, no power is emitted in exactly this direction. For the s-polarized component, on the other hand, the dipole oscillates perpendicular to the direction of observation. The dipole therefore radiates a proportion of the incident light power in the direction of the observer. For this reason, the light field arriving at the observer is s-polarized. The polarization of scattered light is exploited in photography by screwing a polarization filter onto the camera lens. This allows the photographer to attenuate the light scattered by the air and thus make the sky darker in the photo, while the object in the foreground remains bright.
Fig. 10.10 Rayleigh scattering can polarize a light wave. Light with (a) p-polarization is not scattered at the 90° angle due to the dipole characteristic, but light with (b) s-polarization is. An observer at the 90° angle therefore only measures s-polarized light
10.3.5.3 Polarization by Brewster Reflection
The reflection at an interface between media with different refractive indices can also lead to polarization of unpolarized incident light (Fig. 10.11). For this purpose, we must understand the reflection at the interface microscopically. The molecules of the second medium at the interface are excited to dipole oscillations by the incident light field. Thus, Hertzian dipoles at the surface radiate light. It can be shown that the reflected light beam is formed by the superposition of the light fields emitted by the dipoles. The so-called Brewster angle $\alpha_B$ is characterized by the fact that the reflected beam is exactly perpendicular to the transmitted beam, which propagates at the angle $\alpha_2$, that is,
$$\alpha_B + \alpha_2 = 90^\circ. \tag{10.55}$$
In the case of p-polarized light, the dipoles therefore all oscillate parallel to the direction of the reflected beam. Due to the dipole characteristic, therefore, no power can be radiated in that direction. The reflectivity for p-polarized light is zero at the Brewster angle:
$$R_p(\alpha_B) = 0. \tag{10.56}$$
This can also be seen in Fig. 10.4, where the reflectivity $R(\alpha_B)$ goes to zero for p-polarized light. For incident s-polarized light, on the other hand, the dipoles oscillate perpendicular to the direction of the reflected beam. In this case, light is reflected. In general, reflectivity depends on the polarization of the incident light. The Brewster angle can be derived with the help of Snellius’ law (10.30) under the Brewster condition (10.55): inserting $\alpha_2 = 90^\circ - \alpha_B$ gives $n_1 \sin(\alpha_B) = n_2 \cos(\alpha_B)$, and therefore
$$\alpha_B = \arctan\left(\frac{n_2}{n_1}\right). \tag{10.57}$$
Fig. 10.11 In Brewster reflection, the refracted beam is perpendicular to the reflected beam. Therefore, due to the dipole radiation characteristics (a) p-polarized light cannot be reflected, but (b) s-polarized light can
Brewster reflection exists for reflection at both the optically thinner and the optically denser medium, but at different angles.
Exercise: Brewster Reflection
What are the Brewster angles when a ray of light is reflected at the interface from air with $n_1 = 1$ to glass with $n_2 = 1.5$, and when it is reflected in the opposite direction at the interface from glass to air? What is the angle $\alpha_2$ of the transmitted beam in the glass when the incident beam falls on the glass surface at the Brewster angle?
Solution
For the reflection at the air-glass interface, the Brewster angle is
$$\alpha_{B,1} = \arctan\left(\frac{1.5}{1}\right) = 56.3^\circ. \tag{10.58}$$
For the reflection at the interface from glass to air, the Brewster angle is
$$\alpha_{B,2} = \arctan\left(\frac{1}{1.5}\right) = 33.7^\circ. \tag{10.59}$$
To calculate the angle in the glass, we use Snellius’ law:
$$\alpha_2 = \arcsin\left(\frac{n_1}{n_2} \cdot \sin(\alpha_{B,1})\right) = \arcsin\left(\frac{1}{1.5} \cdot \sin(56.3^\circ)\right) = 33.7^\circ. \tag{10.60}$$
The fact that the angle in the glass corresponds exactly to the Brewster angle for the glass-air interface is no coincidence. It can be shown in general that this is always the case. This means: if a light beam passes through a plate of a medium with a different refractive index, for example a glass plate, and there is Brewster reflection at the front side of the plate, then there is always also Brewster reflection at the back side.
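The numbers of this exercise can be reproduced with a few lines of Python. This is a minimal sketch of (10.57) and Snellius’ law; the function names `brewster_angle_deg` and `refraction_angle_deg` are hypothetical helpers chosen for this illustration.

```python
import math

def brewster_angle_deg(n1, n2):
    """Brewster angle (10.57) for light travelling from medium n1 into n2."""
    return math.degrees(math.atan(n2 / n1))

def refraction_angle_deg(n1, n2, incidence_deg):
    """Angle of the transmitted beam from n1 * sin(a1) = n2 * sin(a2)."""
    sin_a2 = n1 / n2 * math.sin(math.radians(incidence_deg))
    return math.degrees(math.asin(sin_a2))

air, glass = 1.0, 1.5
front = brewster_angle_deg(air, glass)             # air to glass
back = brewster_angle_deg(glass, air)              # glass to air
inside = refraction_angle_deg(air, glass, front)   # beam angle inside the glass

print(round(front, 1), round(back, 1), round(inside, 1))  # 56.3 33.7 33.7
```

The angle inside the glass coincides with the Brewster angle of the glass-air interface, and it adds up to 90° with the front-side Brewster angle, as required by (10.55).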
10.3.6 Birefringence
Some crystals that have a certain microscopic structure are birefringent. They are characterized by the fact that their refractive index depends on the polarization direction of the light. Since a comprehensive description of birefringence would exceed the scope of this book, only the most important effects of birefringent materials will be described here. First, polarization-dependent beam splitting can occur (Fig. 10.12a). Here, a distinction is made between the ordinary and the extraordinary beam. The ordinary beam behaves normally in the sense that it travels in the direction of the wave vector $\vec{k}$. The corresponding refractive index is called the ordinary refractive index $n_o$. The ordinary beam is always polarized perpendicular to the so-called optical axis of the crystal, which is given by the crystal structure. The extraordinary beam is polarized in a plane parallel to the optical axis of the crystal. Its special feature is that it can propagate in a direction different from the one that the wave vector $\vec{k}$ specifies. The angle between the $\vec{k}$ vector and the extraordinary beam is called the walk-off angle $\alpha_{wo}$. It depends on the angle θ at which the optical axis of the crystal is inclined to the surface of the crystal, on the ordinary refractive index $n_o$ and on the extraordinary refractive index $n_e$. Typical values, for example for calcite, are
$$n_o = 1.658, \tag{10.61}$$
$$n_e = 1.468. \tag{10.62}$$
Fig. 10.12 (a) A birefringent crystal can lead to polarization-dependent beam splitting. Here the ordinary beam behaves normally in the sense that it propagates in the direction of the $\vec{k}$ vector. It is polarized perpendicular to the optical axis of the crystal. The extraordinary, p-polarized beam, on the other hand, is deflected by the walk-off angle $\alpha_{wo}$ from the direction of the $\vec{k}$ vector. (b) Walk-off angle $\alpha_{wo}$ as a function of the angle θ of the optical axis in the example of calcite
The size of the walk-off angle is sketched as a function of the angle θ for calcite in Fig. 10.12b. If the optical axis is perpendicular or parallel to the wave vector, the walk-off angle is $\alpha_{wo} = 0$, and no beam splitting occurs. The refractive index experienced by the extraordinary beam in the crystal also depends on the angle θ:
$$n = \frac{n_o \cdot n_e}{\sqrt{n_o^2 \cdot \cos^2(\theta) + n_e^2 \cdot \sin^2(\theta)}}. \tag{10.63}$$
The refractive index is in the range between $n_e$ for θ = 0° and $n_o$ for θ = 90°. In the case θ = 90°, where the optical axis is parallel to the wave vector, the refractive indices of the ordinary and extraordinary beams are the same. In fact, this condition is precisely the definition of the direction of the optical axis. We now further consider the case where the optical axis is parallel to the crystal surface and the beam propagates perpendicularly through a birefringent medium of thickness d. In this case, although the optical axis is not parallel to the wave vector, the ordinary and extraordinary beams propagate in the same direction, because the walk-off angle is zero. There is thus no splitting of the differently polarized beams, but there is a shift in phase between the s-polarized and p-polarized components due to the different refractive indices $n_o$ and $n_e$. This affects the polarization of the light wave without changing its power.
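The angle dependence (10.63) can be tabulated for calcite with a short Python sketch. It assumes the form of (10.63) with a square root in the denominator, which reproduces the limiting values $n_e$ and $n_o$ stated in the text; the function name `extraordinary_index` is a hypothetical helper.

```python
import math

N_O, N_E = 1.658, 1.468  # calcite, Eqs. (10.61) and (10.62)

def extraordinary_index(theta_deg, n_o=N_O, n_e=N_E):
    """Refractive index of the extraordinary beam according to Eq. (10.63)."""
    t = math.radians(theta_deg)
    denominator = math.sqrt((n_o * math.cos(t)) ** 2 + (n_e * math.sin(t)) ** 2)
    return n_o * n_e / denominator

# Tabulate the index between the two limiting orientations of the optical axis.
for theta in (0, 30, 45, 60, 90):
    print(theta, round(extraordinary_index(theta), 4))
```

The printed values rise monotonically from $n_e$ at θ = 0° to $n_o$ at θ = 90°.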
10.3.6.1 Outlook: Changing the Polarization with Wave Plates
The fact that a polarization-dependent shift of the phase of the light wave occurs in birefringent plates is used in optics to manipulate the polarization in a controlled way. For this purpose, so-called half-wave and quarter-wave plates are fabricated, whose effective thickness
$$d_{\mathrm{eff}} = (n_o - n_e) \cdot d \tag{10.64}$$
corresponds to half an optical wavelength or a quarter of the optical wavelength, according to the designation. Here d is the actual thickness. Thus, in the case of the half-wave plate, the relative phase changes by exactly 180°, reversing the sign of one of the two components. This corresponds to mirroring the linear polarization at the optical axis. Typically, wave plates are mounted rotatably around the beam direction, so that depending on the rotation of the plate, one can rotate the linear polarization of a light wave through any angle. In the quarter-wave plate, the relative phase of the two components changes by 90°. This causes the oscillations of the two components to go out of phase. Thus, quarter-wave plates convert linear into circular polarization, and vice versa.
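To get a feeling for the thicknesses involved, (10.64) can be solved for d. The sketch below is only an illustration under stated assumptions: it uses the calcite indices (10.61) and (10.62), assumes a wavelength of 500 nm, and ignores the wavelength dependence of the refractive indices. The function name `plate_thickness` is hypothetical.

```python
def plate_thickness(wavelength, n_o, n_e, fraction):
    """Actual thickness d for which (n_o - n_e) * d = fraction * wavelength,
    following Eq. (10.64)."""
    return fraction * wavelength / abs(n_o - n_e)

wavelength = 500e-9  # metres, assumed value for illustration
half_wave = plate_thickness(wavelength, 1.658, 1.468, 0.5)
quarter_wave = plate_thickness(wavelength, 1.658, 1.468, 0.25)

# Convert to micrometres for printing.
print(half_wave * 1e6, quarter_wave * 1e6)
```

Under these assumptions, both thicknesses are on the order of one micrometre, and the half-wave plate is twice as thick as the quarter-wave plate.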
10.3.7 Optical Activity
Optically active substances are also able to rotate the linear polarization of a light wave around the beam axis (Fig. 10.13). Optically active substances are usually chiral molecules whose three-dimensional shape exhibits left-right asymmetry, similar to our right and left hands. Therefore, there are two classes of molecules with identical composition but with three-dimensional shapes that are mirror images of each other. These classes of molecules are called enantiomers. Molecules of both classes rotate the polarization axis to the same extent, but in opposite directions. For solid chiral substances, the angle of rotation α depends on the length l of the substance and a material constant, the so-called specific rotation $\alpha_s$:
$$\alpha = \alpha_s \cdot l, \quad [\alpha_s] = \frac{^\circ}{\mathrm{m}}. \tag{10.65}$$
For substances dissolved in liquids, the rotation also depends on the concentration c of the solution:
$$\alpha = \alpha_s \cdot c \cdot l, \quad [\alpha_s] = \frac{^\circ \cdot \mathrm{cm}^2}{\mathrm{g}}, \quad [c] = \frac{\mathrm{g}}{\mathrm{cm}^3}, \tag{10.66}$$
where the unit of specific rotation is different compared to that of solid substances. Many naturally occurring substances exist in only one of the two forms. Dextrose, for example, is dextrorotatory with a specific rotation of
$$\alpha_s = 52.5\,\frac{^\circ \cdot \mathrm{cm}^2}{\mathrm{g}}. \tag{10.67}$$
Fig. 10.13 (a) An optically active substance rotates the polarization of a light beam by an angle α proportional to the length l passed through. (b) In polarimetry, the optically active sample to be examined is located between two crossed polarizers. The second polarizer, called the analyzer, is rotated until no more light passes through. The corresponding angle of rotation α can then be used, for example, to determine the concentration of the solution
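Relation (10.66) is easy to evaluate in both directions. The following Python sketch uses the specific rotation of dextrose from (10.67); the sample concentration of 20 mg/cm³ and the cuvette length of 10 cm are assumed values for illustration, and the function names are hypothetical.

```python
def rotation_angle(specific_rotation, concentration, length):
    """Rotation angle in degrees from Eq. (10.66).

    specific_rotation in deg*cm^2/g, concentration in g/cm^3, length in cm.
    """
    return specific_rotation * concentration * length

def concentration_from_angle(angle, specific_rotation, length):
    """Eq. (10.66) solved for the concentration, result in g/cm^3."""
    return angle / (specific_rotation * length)

ALPHA_S_DEXTROSE = 52.5  # deg*cm^2/g, Eq. (10.67)

# Assumed sample: 0.02 g/cm^3 dextrose in a 10 cm cuvette.
angle = rotation_angle(ALPHA_S_DEXTROSE, 0.02, 10.0)
print(angle)
print(concentration_from_angle(angle, ALPHA_S_DEXTROSE, 10.0))
```

The second function is the form used in polarimetry, where the angle is measured and the concentration is the unknown.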
Exercise: Polarimetry
A light beam falls on two crossed polarizers, as sketched in Fig. 10.9b, so that initially no light can be detected after the second polarizer (the so-called analyzer). The transmitted power is zero, $P_t = 0$. Now a cuvette of length l = 10 cm filled with a glucose solution of unknown concentration is placed between the two polarizers. The optical activity of the glucose solution rotates the polarization axis of the light so that light now passes through the analyzer, $P_t > 0$. The analyzer is now rotated until the transmitted power is again zero, $P_t = 0$. The corresponding angle by which the analyzer had to be rotated is α = 7°. What is the concentration of the glucose solution?
Solution
The angle of rotation of the analyzer is the same as the angle through which the polarization of the light was rotated by the glucose solution. Therefore, by solving (10.66) for c, the concentration of the solution is
$$c = \frac{\alpha}{\alpha_s \cdot l} = \frac{7^\circ}{52.5\,\frac{^\circ \cdot \mathrm{cm}^2}{\mathrm{g}} \cdot 10\,\mathrm{cm}} = 13.3\,\frac{\mathrm{mg}}{\mathrm{cm}^3}. \tag{10.68}$$
10.4 Wave Optics
While geometrical optics is mainly concerned with the propagation of light rays, wave optics deals with phenomena based on the wave properties of light, such as interference and diffraction. In the following, we describe these phenomena exclusively with the help of the electric part of the harmonic light wave,
$$\vec{E}(\vec{r}, t) = \vec{E}_0 \cdot \sin\left(\vec{k} \cdot \vec{r} - \omega \cdot t\right). \tag{10.69}$$
Since the magnetic part is proportional to the electric part, it contains no additional information.
10.4.1 Interference of Light
Interference of light waves can be described mathematically in the same way as the interference of harmonic waves, as already shown in general in Sect. 4.2.1. The electric field of the total wave is the sum of the electric fields $\vec{E}_1(\vec{r}, t)$ and $\vec{E}_2(\vec{r}, t)$ of the individual waves,
$$\vec{E}(\vec{r}, t) = \vec{E}_1(\vec{r}, t) + \vec{E}_2(\vec{r}, t). \tag{10.70}$$
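The intensity I of the total wave is given by (10.8) as
$$I = \frac{1}{2} c \varepsilon_0 \vec{E}^2 = \frac{1}{2} c \varepsilon_0 \cdot \left(\vec{E}_1^2 + \vec{E}_2^2 + 2 \cdot \vec{E}_1 \cdot \vec{E}_2\right) \tag{10.71}$$
$$= I_1 + I_2 + c \varepsilon_0 \vec{E}_1 \cdot \vec{E}_2. \tag{10.72}$$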
The intensity is therefore the sum of the two individual intensities $I_1$ and $I_2$ plus a mixing term (interference term), which consists of the product of the two individual fields. If both individual fields have the same sign, the total intensity is increased, and if the individual fields have opposite signs, it is decreased, down to complete extinction. However, interference only occurs when the two light fields have the same linear polarization. If the two fields $\vec{E}_1$ and $\vec{E}_2$ are perpendicular to each other, the interference term disappears because the scalar product vanishes, $\vec{E}_1 \cdot \vec{E}_2 = 0$. Another condition for interference, the coherence of the light waves, is explained in more detail below.
Analogous to the interference of harmonic waves, interference patterns can occur in space for light waves that have the same frequency and propagate in different directions (Fig. 4.5). In particular, there are also standing light waves when, for example, a light beam is reflected at a mirror and interferes with itself in the opposite direction. Positions with high electrical oscillation amplitude corresponding to high light intensity (antinodes) alternate with positions where the electrical oscillation amplitude and thus the light intensity are zero (nodes) (Fig. 4.7a). Furthermore, analogous to sound waves, there are also beats between light waves. If two light waves with different frequencies are superimposed, the intensity of the light wave changes periodically with the difference frequency of the two individual waves (Fig. 4.8).
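The role of the interference term in (10.72) can be illustrated with a small Python sketch. It evaluates (10.71) and (10.72) for two field amplitudes at one instant and one position, with the scalar product written as $E_1 E_2 \cos$ of the angle between the two linear polarizations. The function name `total_intensity` and the amplitude of 1 V/m are assumptions made for this illustration.

```python
import math

C = 299_792_458.0        # speed of light in m/s
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def total_intensity(e1, e2, angle_deg):
    """Intensity from Eqs. (10.71) and (10.72) for two signed field values
    e1 and e2 (V/m) whose linear polarizations enclose angle_deg."""
    i1 = 0.5 * C * EPS0 * e1 ** 2
    i2 = 0.5 * C * EPS0 * e2 ** 2
    interference = C * EPS0 * e1 * e2 * math.cos(math.radians(angle_deg))
    return i1 + i2 + interference

single = 0.5 * C * EPS0 * 1.0 ** 2  # intensity of one wave alone

print(total_intensity(1.0, 1.0, 0) / single)    # same sign, parallel polarization
print(total_intensity(1.0, -1.0, 0) / single)   # opposite sign, parallel polarization
print(total_intensity(1.0, 1.0, 90) / single)   # perpendicular polarizations
```

In units of the single-wave intensity, the three cases give approximately four, zero and two: constructive interference, complete extinction, and a plain sum of intensities without interference.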
10.4.1.1 Outlook: Light Rays as Interference Patterns
Plane waves as described in (10.69) are infinitely extended objects. In particular, the intensity $I \propto E_0^2$ of a plane wave is constant perpendicular to the direction of propagation and thus infinitely extended. A plane wave thus contradicts the everyday experience of a light beam having a finite beam cross-section profile I(x, y), with a finite beam radius and a finite beam cross-section area. However, light and laser beams can be composed by superimposing many plane waves with the same frequency and different $\vec{k}$ vectors with matching amplitudes $\vec{E}_0(\vec{k})$. The amplitude spectrum $\vec{E}_0(\vec{k})$ is given by the so-called Fourier transformation of the beam cross-section profile. The smaller the extension of the beam, the more different $\vec{k}$ vectors are needed. Laser beams with a Gaussian beam profile are then described in the context of Gaussian optics, which we will not discuss further here.
10.4.1.2 Coherence
As already mentioned above, coherence is a condition for the occurrence of interference of light waves. However, coherence is not a simple concept; in the following, we will restrict ourselves to a description of temporal coherence that is as simple as possible. The question here is how electromagnetic waves must behave as a function of time in order for interference to occur when they are superimposed. First of all, it can be stated that the waves must have the same frequency. If they have different frequencies, beating occurs between the individual waves and the interference pattern is cancelled by averaging. White light (e.g., from a light bulb), which contains many frequencies, is therefore incoherent in time. In the following, we will therefore consider monochromatic light; for example, white light can be sent through a color filter that only allows a small range of wavelengths to pass through, or we can use laser light, which consists essentially of a single wavelength. It should not be forgotten that monochromatic light (even laser light) always has a finite width in frequency, the linewidth Δf. This is due to the fact that every real light wave is subject to phase fluctuations. This means that the phase of the light wave makes a random jump from time to time (Fig. 10.14). The duration $\Delta t_i$ from one jump to the next jump is also random. The mean value of these time differences is called the coherence time $\tau_c = \langle \Delta t_i \rangle$ of the light wave. It is related to the linewidth of the light via
$$\tau_c = \frac{1}{\Delta f}. \tag{10.73}$$
The distance the light wave travels within the coherence time is called the coherence length $l_c$:
$$l_c = \tau_c \cdot c. \tag{10.74}$$
The coherence length is therefore the average length of a wave train without phase jump. If two superimposed light waves are observed within a time period in which no jump occurs, a stable interference pattern appears. However, as soon as one of the two waves makes a phase jump, the interference pattern shifts in space. Over long times, the interference then cancels by averaging. The longer the coherence time of the waves, the more stable is the interference pattern. An example is the Mach-Zehnder interferometer (Fig. 10.15), in which a light wave is split into two parts that travel different paths with lengths $l_1$ and $l_2$ before being superimposed again. If the path length difference $\Delta l = l_1 - l_2$ is smaller than the coherence length, the interference pattern is stable. But if the path length difference is greater than the coherence length, the interference disappears. How precisely the path lengths must be matched therefore depends on the frequency width of the light wave. An extreme case is the so-called white-light interferometer, which, as the name suggests, is operated with white light. Here, the path lengths must be matched to within one micrometer or even more precisely. Such white-light interferometers are used, for example, for 3-D profile measurements. A well-known application in medical technology is optical coherence tomography (OCT).
Fig. 10.14 The coherence time $\tau_c$ of a light wave is defined as the mean value of the time intervals $\Delta t_i$ between successive phase jumps
Fig. 10.15 In a Mach-Zehnder interferometer, a light beam is split into two paths by a beam splitter (BS) and superimposed again on a second beam splitter after the individual path lengths $l_1$ and $l_2$, respectively. At which of the two output ports, port 1 or port 2, the light beam leaves the second beam splitter depends very sensitively on the path length difference $\Delta l = l_1 - l_2$. By inserting a glass plate into one of the two interferometer arms (dashed box), the splitting depends on the length of the plate. This allows temperature changes of the plate to be measured interferometrically, see the exercise
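Relations (10.73) and (10.74) can be turned into numbers with the following Python sketch. The linewidths listed are assumed orders of magnitude chosen only for illustration, not measured values from the text, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def coherence_time(linewidth_hz):
    """Coherence time from Eq. (10.73)."""
    return 1.0 / linewidth_hz

def coherence_length(linewidth_hz):
    """Coherence length from Eq. (10.74)."""
    return coherence_time(linewidth_hz) * C

# Assumed, purely illustrative linewidths in Hz.
examples = {
    "white light (about 3e14 Hz)": 3e14,
    "color-filtered light (about 1e13 Hz)": 1e13,
    "narrow-linewidth laser (about 1e6 Hz)": 1e6,
}
for name, linewidth in examples.items():
    print(name, coherence_length(linewidth), "m")
```

With the assumed linewidth for white light, the coherence length comes out at about one micrometer, which matches the matching precision quoted for white-light interferometers, while the assumed laser linewidth gives a coherence length of several hundred meters.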
10.4.1.3 Mach-Zehnder Interferometer
A Mach-Zehnder interferometer is sketched in Fig. 10.15. A linearly polarized laser beam with intensity $I_0$ is split at a semi-transparent mirror, a so-called beam splitter (BS), into two beams with (ideally) the same power, which propagate on two different paths (the interferometer arms) with lengths $l_1$ and $l_2$. On these paths, they are deflected by mirrors so that they can be superimposed again at a second beam splitter. The second beam splitter has two output ports at which the intensities $I_1$ and $I_2$ of the corresponding laser beams are measured. The interference of the superimposed beams results in output intensities of
$$I_1 = I_0 \cdot \cos^2\left(\frac{k \cdot \Delta l}{2}\right), \tag{10.75}$$
$$I_2 = I_0 \cdot \sin^2\left(\frac{k \cdot \Delta l}{2}\right), \tag{10.76}$$
with the path length difference $\Delta l = l_1 - l_2$. If the two partial beams constructively interfere at one output port, destructive interference results at the other output port and vice versa. Depending on the length difference of the two paths, the light beam is thus either directed into port 1 or into port 2. The total intensity at both ports here corresponds to the input intensity:
$$I_1 + I_2 = I_0. \tag{10.77}$$
Interferometers react extremely sensitively to changes in length. If the length difference of the two paths changes by only half a wavelength, Δ l = λ/2, the laser beam switches completely from one port to the other. By sensitive measurement of the two intensities I1 and I2, one can thus measure length differences smaller than the diameter of an atom.
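The switching between the two ports can be followed numerically with (10.75) and (10.76). The Python sketch below assumes a wavelength of 500 nm and a normalized input intensity of 1; the function name `mach_zehnder_outputs` is a hypothetical helper.

```python
import math

def mach_zehnder_outputs(i0, wavelength, delta_l):
    """Output intensities (I1, I2) from Eqs. (10.75) and (10.76)."""
    k = 2 * math.pi / wavelength  # wavenumber
    half_phase = k * delta_l / 2
    return i0 * math.cos(half_phase) ** 2, i0 * math.sin(half_phase) ** 2

wavelength = 500e-9  # metres, assumed value for illustration

# Path length differences of 0, a quarter and half a wavelength.
for fraction in (0.0, 0.25, 0.5):
    i1, i2 = mach_zehnder_outputs(1.0, wavelength, fraction * wavelength)
    print(fraction, round(i1, 3), round(i2, 3))
```

At zero path difference all light leaves through port 1, at a quarter wavelength it is split equally, and at half a wavelength it has switched completely to port 2, while the sum of both outputs stays equal to the input intensity as in (10.77).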
Exercise: Mach-Zehnder Interferometer as Temperature Sensor
Consider the interferometer sketched in Fig. 10.15. A glass plate with thickness d = 10 cm and refractive index n = 1.5 is inserted in one of the two interferometer arms. When the temperature of the glass plate changes, thermal expansion (see also Sect. 5.1.2) changes the length of the plate according to (5.19) by
$$\Delta d = \alpha \cdot d \cdot \Delta T. \tag{10.78}$$
How large must the temperature change ΔT be so that the laser beam output changes completely from one port to the other? The coefficient of linear expansion of glass is $\alpha = 7 \cdot 10^{-6}\,/\mathrm{K}$, and the laser light used has a wavelength of λ = 500 nm.
Solution
The beam changes the output port completely with a length change of $\Delta d = \lambda_n / 2$. Here, we have to insert the wavelength $\lambda_n = \lambda / n$ inside the glass plate, which includes the refractive index n. In the example, the length change is caused by the thermal expansion of the glass. We solve (10.78) for the temperature change and insert the length change:
$$\Delta T = \frac{\lambda / 2}{\alpha \cdot n \cdot d} = \frac{250\,\mathrm{nm}}{7 \cdot 10^{-6}\,/\mathrm{K} \cdot 1.5 \cdot 10\,\mathrm{cm}} = 0.24\,\mathrm{K}. \tag{10.79}$$
With a temperature change of Δ T = 0.24 K, the beam changes the output port completely. The resolution at which a temperature difference can be detected is correspondingly much smaller (in the mK range).
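The arithmetic of (10.79) can be checked in SI units with a short Python sketch. It follows the simplified model of the exercise, in which only the thermal expansion of the plate is taken into account; the function name `switching_temperature_change` is a hypothetical helper.

```python
def switching_temperature_change(wavelength, n, d, alpha):
    """Temperature change from Eq. (10.79): (lambda / 2) / (alpha * n * d)."""
    return (wavelength / 2) / (alpha * n * d)

delta_t = switching_temperature_change(
    wavelength=500e-9,  # metres
    n=1.5,              # refractive index of the glass plate
    d=0.10,             # metres
    alpha=7e-6,         # coefficient of linear expansion in 1/K
)
print(round(delta_t, 2), "K")  # 0.24 K
```

Because the result scales inversely with the plate thickness d, a thicker plate makes the sensor proportionally more sensitive in this model.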
10.4.1.4 Thin Film Interference
If white light is reflected by a thin layer, for example an oil film, colored rings can be observed. This is caused by interference in a so-called Fabry-Perot etalon (Fig. 10.16). This is a thin medium with thickness d made of a dielectric material with refractive index n > 1. When a light beam falls on the layer, it is partly refracted and partly reflected at the front side, and the refracted part is again refracted and reflected at the back side. Multiple reflection in the medium thus results in a bundle of rays that travel back and forth in the layer and superimpose in the total reflected or total transmitted wave. Here the light fields of the phase fronts of the outgoing plane wave interfere at the two points A and B in Fig. 10.16. The relative phase of the two light fields results from the respective distances Δx and Δy covered from the common starting point P. Here one must take into account the refraction at the interface and the fact that the wavenumber in the medium, $k_n = k \cdot n$, is greater than outside. It also plays a role that the electric field experiences a phase jump of π when reflected from the outside, while no phase jump occurs when reflected from the inside of the layer. Constructive interference occurs when the relative phase of the two beams at points A and B is a multiple of 2π. This condition can also be expressed as
$$2d \cdot \sqrt{n^2 - \sin^2(\alpha_m)} = \left(m - \frac{1}{2}\right) \cdot \lambda, \quad \text{with } m \in \{1, 2, 3, \ldots\}, \tag{10.80}$$
where we omit the somewhat extensive derivation. Destructive interference occurs accordingly for
$$2d \cdot \sqrt{n^2 - \sin^2(\alpha_m)} = m \cdot \lambda, \quad \text{with } m \in \{1, 2, 3, \ldots\}. \tag{10.81}$$
Here the number m indicates by how many multiples of 2π the two phases differ. If the layer thickness is d ≳ λ/4, the phases differ by more than 2π even at perpendicular incidence. Therefore, for thicker layers, there is a smallest multiple m above which Eq. (10.80) has a solution at all. This smallest m can be determined by rearranging Eq. (10.80):
$$\sin^2(\alpha_m) = n^2 - \left(\frac{\left(m - \frac{1}{2}\right) \cdot \lambda}{2d}\right)^2 \leq 1. \tag{10.82}$$
The condition ≤ 1 follows from the fact that the sine function cannot exceed 1. The inequality can be solved for the multiple m:
$$m \geq \sqrt{n^2 - 1} \cdot \frac{2d}{\lambda} + \frac{1}{2}. \tag{10.83}$$
Fig. 10.16 Within a thin plate (a so-called Fabry-Perot etalon), the rays are reflected several times inside and interfere with each other. The phases of the individual waves depend on the wavelength, the refractive index of the medium and the angle of incidence. As a result, there are certain angles $\alpha_m$ under which light of a certain wavelength is reflected particularly well
Exercise: Thin Film Interference
A glass plate with refractive index n = 1.5 has a thickness of d = 10 μm. How large must the natural number m be at least, so that for red light with wavelength λ = 650 nm constructive interference occurs in the reflection according to (10.80), and how large is the angle of reflection $\alpha_m$ in this case?
Solution
We use the condition (10.83):
$$m \geq \sqrt{n^2 - 1} \cdot \frac{2d}{\lambda} + \frac{1}{2} = \sqrt{1.5^2 - 1} \cdot \frac{2 \cdot 10\,\mu\mathrm{m}}{650\,\mathrm{nm}} + \frac{1}{2} = 34.9. \tag{10.84}$$
Thus, the smallest natural number satisfying this condition is m = 35. The corresponding angle according to (10.82) is
$$\alpha_m = \arcsin\sqrt{n^2 - \left(\frac{\left(m - \frac{1}{2}\right) \cdot \lambda}{2d}\right)^2} \tag{10.85}$$
$$= \arcsin\sqrt{1.5^2 - \left(\frac{\left(35 - \frac{1}{2}\right) \cdot 650\,\mathrm{nm}}{2 \cdot 10\,\mu\mathrm{m}}\right)^2} = 85.1^\circ. \tag{10.86}$$
At an angle of incidence of α = 85.1°, red light is therefore reflected particularly well by the glass plate.
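The two steps of this solution, finding the smallest order from (10.83) and then the angle from (10.82), can be sketched in Python as follows. The function names `smallest_order` and `reflection_angle_deg` are hypothetical helpers introduced for this illustration.

```python
import math

def smallest_order(n, d, wavelength):
    """Smallest natural number m that satisfies Eq. (10.83)."""
    bound = math.sqrt(n ** 2 - 1) * 2 * d / wavelength + 0.5
    return math.ceil(bound)

def reflection_angle_deg(m, n, d, wavelength):
    """Angle alpha_m of constructive interference from Eq. (10.82)."""
    sin_squared = n ** 2 - ((m - 0.5) * wavelength / (2 * d)) ** 2
    if not 0.0 <= sin_squared <= 1.0:
        raise ValueError("no real angle exists for this order m")
    return math.degrees(math.asin(math.sqrt(sin_squared)))

n, d, wavelength = 1.5, 10e-6, 650e-9  # values of the exercise, in SI units
m = smallest_order(n, d, wavelength)
print(m, round(reflection_angle_deg(m, n, d, wavelength), 1))  # 35 85.1
```

Calling `reflection_angle_deg` with larger orders m gives the further, smaller angles at which the same wavelength is reflected well, until the expression under the square root becomes negative.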
10.4.2 Diffraction
Whenever light passes through small apertures, diffraction occurs. Here, not only the beam passing the aperture is transmitted, but the light intensity is amplified or attenuated at certain angles to the optical axis. The same is true when a beam of light is shadowed by a small object. This not only creates a shadow behind the object, but again there are certain angles of amplified or attenuated light intensity. Diffraction is based on the interference of many rays passing close to each other. This can be seen in the principle of Huygens.
10.4.2.1 Huygens’ Principle
Huygens’ principle states that each point of a phase front in a wave is a starting point of a spherical elementary wave. A phase front denotes all connected points with the same phase (Fig. 10.17a). The new wave then results from the superposition of all elementary waves. In Fig. 10.17b a plane wave is incident on an aperture. Each location within the width of the aperture is thus the starting point of an elementary spherical wave propagating in the right half-space, as indicated by the drawn circles.
Fig. 10.17 (a) According to Huygens’ principle, every point of a phase front of a wave is the starting point of an elementary spherical wave. In the sketch, a plane wave falls on a small slit which allows only a single elementary wave to pass. This causes a spherical wave to propagate behind the slit. (b) The slit has a width that allows several elementary waves. Through interference of these elementary waves, diffraction occurs, that is, there are certain directions (diffraction orders) under which the elementary waves interfere constructively
In the directions where many of the circles intersect, the elementary waves have the same phase. That is, in these directions the elementary waves interfere constructively, and the light intensity is increased. The optical axis, at the angle α = 0°, always features a maximum, because in this direction all elementary waves interfere constructively. In the following, three examples of diffraction are analysed in more detail.
10.4.2.2 Diffraction at the Single Slit In the case of diffraction at a single slit with slit width a, the elementary waves of all points in the slit are superimposed. We explicitly consider the two beams which emanate from the ends of the slit at the angle α (Fig. 10.18). The two beams have a path length difference of Δ = a · sin(α). If this path length difference is just a multiple of the wavelength, m · λ, these two outermost beams overlap constructively, but within the slit width there is for each point exactly one second point, at a distance of half the slit width, whose elementary wave interferes destructively with that of the first point. So at the angles $\alpha_m^-$ there are diffraction minima if
$$m \cdot \lambda = a \cdot \sin(\alpha_m^-), \quad \text{with } m \in \{1, 2, 3, \ldots\}. \tag{10.87}$$
The diffraction minima can be observed on a screen as intensity minima. Correspondingly, there are diffraction maxima at the angles $\alpha_m^+$ if
$$\left(m + \frac{1}{2}\right) \cdot \lambda = a \cdot \sin(\alpha_m^+). \tag{10.88}$$
Fig. 10.18 (a) The two beams which leave the ends of the slit at the angle α have a path length difference Δ = a · sin(α). (b) Depending on the phase of the two beams, diffraction maxima therefore arise at the angles $\alpha_m^+$, with diffraction minima in between. These can be observed on a screen at distance l as a diffraction pattern with intensity profile I(x)
The diffraction-maximum belonging to m is also called m-th diffraction-order. The smaller the slit-width a is, the bigger is the distance Δx between the neighbouring diffraction-orders on the screen.
10.4.2.3 Diffraction at the Double Slit In the case of a double slit, a light wave falls on two single slits with width a, which have a distance d from each other (Fig. 10.19a). Here, the distance denotes the distance between the centres of the two slits. Each of the two individual slits generates a corresponding diffraction pattern. This results in an enveloping amplitude of light intensity on the screen corresponding to the diffraction pattern of the single slit. Superimposed on this is the interference of rays from both slits. The path difference between two rays coming from the two slits at the angle α is Δ = d · sin(α). Similar to the single slit, these two rays overlap constructively if the path difference is equal to a multiple of the wavelength. In the case of the double slit, however, there are no rays from the region between the two slits, so diffraction maxima occur at the angles $\alpha_m^+$ if
$$m \cdot \lambda = d \cdot \sin(\alpha_m^+), \quad \text{with } m \in \{1, 2, 3, \ldots\}, \tag{10.89}$$
and diffraction minima occur at the angles $\alpha_m^-$ if
$$\left(m + \frac{1}{2}\right) \cdot \lambda = d \cdot \sin(\alpha_m^-), \quad \text{with } m \in \{1, 2, 3, \ldots\}. \tag{10.90}$$
Thus with a double-slit the formulas for maxima and minima are exchanged in comparison with single-slit, and the slit-width a is replaced by the slit-distance d. In particular, the smaller the slit-distance d, the larger is the distance Δx between neighbouring diffraction-orders.
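The exchange of the conditions for maxima and minima between single slit and double slit can be made concrete with a few numbers. The sketch below evaluates (10.87) to (10.90) for hypothetical values λ = 500 nm, a = 5 μm and d = 20 μm; these values and the helper `order_angles_deg` are assumptions chosen for illustration only.

```python
import math

def order_angles_deg(spacing, lam, offset=0.0, m_max=3):
    """Angles (degrees) solving (m + offset) * lam = spacing * sin(alpha) for m = 1..m_max."""
    angles = []
    for m in range(1, m_max + 1):
        s = (m + offset) * lam / spacing
        if s <= 1:  # orders with sin(alpha) > 1 do not exist
            angles.append(math.degrees(math.asin(s)))
    return angles

lam = 500e-9   # hypothetical wavelength
a = 5e-6       # hypothetical slit width
d = 20e-6      # hypothetical slit distance

single_minima = order_angles_deg(a, lam)        # (10.87)
single_maxima = order_angles_deg(a, lam, 0.5)   # (10.88)
double_maxima = order_angles_deg(d, lam)        # (10.89)
double_minima = order_angles_deg(d, lam, 0.5)   # (10.90)

print("single slit minima:", [round(x, 2) for x in single_minima])
print("double slit maxima:", [round(x, 2) for x in double_maxima])
```

Because d is larger than a in this example, several double-slit maxima fit inside the central lobe of the single-slit envelope.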
Fig. 10.19 (a) Diffraction at the double slit. Two beams which leave the two slits with slit distance d at the angle α have the path length difference Δ = d · sin(α). This results in diffraction maxima at the angles $\alpha_m^+$ and diffraction minima in between (red curve). The amplitudes of the diffraction maxima are given by an envelope (black dotted curve), which is identical to the diffraction pattern of the single slit with slit width a. (b) The diffraction pattern of a grating consists of diffraction maxima at the same angles $\alpha_m^+$ as at the double slit. Likewise, the envelope is given by the diffraction pattern of the single slit. The difference to the double slit is that the width of the diffraction maxima is smaller. This width is smaller the more lines of the grating are illuminated and emit rays which interfere with each other. In addition, in comparison with the double slit, small secondary maxima occur between the diffraction orders
10.4.2.4 Diffraction at the Grating A grating consists of a periodic arrangement of slits with widths a and spacings d (Fig. 10.19b). The diffraction pattern of a grating therefore looks in principle like the diffraction pattern of the double slit. In particular, the enveloping amplitude of light intensity is given by the diffraction pattern of a single slit. Moreover, the diffraction maxima and minima within the envelope occur under the same conditions (10.89) and (10.90) as for the double slit. But each diffraction order is narrower than with a double slit. The more lines of the grating contribute to the diffraction pattern, the narrower are the diffraction maxima. Between the diffraction orders, secondary maxima are observed. The more lines of the grating contribute, the smaller the height of these secondary maxima.
Examination Task: Diffraction at the Grating
A slit grating with slit distance d = 10 μm is illuminated with light of wavelength λ = 500 nm. A screen is set up at a distance of l = 30 cm behind the grating. What is the distance x of the first diffraction order from the optical axis on the screen?
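Solution We solve (10.89) for the angle α and insert m = 1:
$$\alpha = \arcsin\left(\frac{\lambda}{d}\right) = \arcsin\left(\frac{500\,\mathrm{nm}}{10\,\mu\mathrm{m}}\right) = 2.9^\circ. \tag{10.91}$$
The distance x of the diffraction maximum from the optical axis on the screen is then given by
$$x = l \cdot \tan(\alpha) = 30\,\mathrm{cm} \cdot \tan(2.9^\circ) = 1.5\,\mathrm{cm}. \tag{10.92}$$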
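The two steps of this solution, angle from (10.89) and position on the screen from the tangent, can be reproduced as follows. This is a small sketch in SI units; the function name `grating_order_position` is an illustrative choice.

```python
import math

def grating_order_position(d, lam, l, m=1):
    """Return (angle in degrees, distance x on the screen) of the m-th grating order."""
    s = m * lam / d
    if s > 1:
        raise ValueError("this diffraction order does not exist (sin(alpha) > 1)")
    alpha = math.asin(s)          # condition (10.89) solved for the angle
    return math.degrees(alpha), l * math.tan(alpha)

# Values from the examination task
alpha_deg, x = grating_order_position(d=10e-6, lam=500e-9, l=0.30)
print(f"alpha = {alpha_deg:.1f} deg, x = {x * 100:.1f} cm")
```

Passing a larger `m` gives the positions of higher orders until the guard on sin(α) rejects orders that cannot occur.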
10.4.3 Resolving Power of Optical Images With the help of diffraction, we can now explain why microscopic objects cannot be resolved with arbitrary precision in an optical image. This is due to a fundamental physical effect called the diffraction limit. To understand this, consider a parallel beam of rays focused through a lens of diameter D onto a screen at a distance equal to the focal length. According to geometrical optics, all rays should meet at one point, the focal point. In reality this is not the case, because the lens with its diameter represents a round slit, a so-called pinhole. Rays that are further out than the radius of the pinhole are not caught by the lens. So there is diffraction at the lens, and the typical diffraction pattern of a pinhole appears on the screen (Fig. 10.20). The light rays are thus not imaged onto a point but onto a small disc, with a disc radius r = f . tan (α), which is given approximately by the distance of the first diffraction minimum from the optical axis. Because of the round symmetry of the pinhole,
Fig. 10.20 When focusing a light beam, diffraction effects occur due to the limited diameter D of the lens. Therefore, parallel incident rays are not focused in a common point at the distance of the focal length f, but in a small area of space, the so-called diffraction disk. The radius is given by the distance of the first diffraction minimum
Eq. (10.87), which holds for diffraction minima at a single slit, is slightly modified by a factor of 1.22 to become
$$m \cdot \lambda = \frac{D}{1.22} \cdot \sin(\alpha_m). \tag{10.93}$$
We set m = 1 and further assume that the angle α is small, that is, tan(α) ≈ sin(α) holds. The radius of the diffraction disk is thus given by
$$r = f \cdot \tan(\alpha) \approx f \cdot \sin(\alpha) = 1.22 \cdot f \cdot \frac{\lambda}{D}. \tag{10.94}$$
As a last step we have to consider how the focal length f of the lens scales with its diameter D. Here, the focal length of the lens becomes shorter the more curved the surface of the lens is. In the best case we then have a spherical lens with radius of curvature
$$f = \frac{D}{2}, \tag{10.95}$$
where the focus is exactly on the surface of the lens. We put the focal length of this optimal lens into (10.94) and obtain as minimum radius of the disk
$$r = 1.22 \cdot \frac{\lambda}{2}, \tag{10.96}$$
that is, in the best case, light can be focused on a disk with a radius of approximately λ/2. With the help of these considerations, we can now understand the resolving power of an optical image according to Rayleigh.
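How the disk radius (10.94) shrinks as the focal length approaches the limiting case f = D/2 can be seen from a short calculation. The sketch below uses hypothetical values (λ = 500 nm, D = 1 cm) that are not taken from the book; the function name is likewise illustrative.

```python
def diffraction_disk_radius(f, D, lam):
    """Radius of the diffraction disk according to (10.94), valid for small angles."""
    return 1.22 * f * lam / D

lam = 500e-9   # hypothetical wavelength
D = 0.01       # hypothetical lens diameter

# Decreasing focal lengths, ending with the limiting case f = D/2 of (10.95)
for f in (0.10, 0.02, D / 2):
    r = diffraction_disk_radius(f, D, lam)
    print(f"f = {f * 1000:5.1f} mm -> r = {r / lam:.2f} wavelengths")

r_min = diffraction_disk_radius(D / 2, D, lam)
```

In the limiting case the radius depends only on the wavelength, in line with (10.96). Note that the small-angle assumption behind (10.94) is no longer well satisfied there, so the value is an estimate.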
10.4.3.1 Rayleigh Criterion The Rayleigh criterion deals with the minimum angle $\delta_{\min}$ between two point-like objects measured from the lens, so that these two objects can just be perceived as different objects in the image (Fig. 10.21a). Because of the diffraction limit, the images of the objects are not points but small discs that blur into each other when the distance is sufficiently small. According to Rayleigh, the two discs can just be distinguished from each other when the first diffraction minimum of object 1 coincides with the central diffraction maximum of object 2. Therefore we get
$$\sin(\delta_{\min}) = 1.22 \cdot \frac{\lambda}{D}. \tag{10.97}$$
A second criterion with similar significance is the Abbe criterion.
10.4.3.2 Abbe Criterion According to Abbe, the information about an object is contained in its diffraction orders, that is the more diffraction orders of an object the imaging lens captures, the
Fig. 10.21 (a) Resolving power according to Rayleigh. The diffraction discs of two objects (red and blue point) can just be distinguished from each other if the diffraction maximum of one object coincides with the diffraction minimum of the other object. (b) Resolving power according to Abbe. Two objects can just be distinguished from each other if the imaging lens catches the first diffraction maximum of the two objects (interpreted as a double slit). Here the wavelength in the medium with refractive index n is decisive. If the objects are in a medium with a large n, this improves the resolution (immersion microscopy)
more details of the object can be imaged. The question now is what minimum distance $d_{\min}$ two point-like objects must have so that the lens can still catch the first diffraction maximum, if one interprets the two objects as a double slit with slit distance $d_{\min}$ (Fig. 10.21b). From the condition for diffraction maxima at the double slit (10.89) with m = 1, we get
$$\lambda_n = d_{\min} \cdot \sin(\alpha), \tag{10.98}$$
where $\lambda_n = \lambda/n$ is the wavelength in the medium with refractive index n in which the objects are located. The condition for the minimum object distance follows as
$$d_{\min} = \frac{\lambda}{n \cdot \sin(\alpha)}. \tag{10.99}$$
The quantity NA = n · sin(α) is also called the numerical aperture of the image and is a measure of how good the resolution is. You can improve the resolution, for example, by immersing the objects in a medium with the highest possible refractive index. This is the case with so-called immersion microscopy, in which an immersion fluid (usually an oil) with a high refractive index is placed between the microscope objective and the objects to be imaged.
Examination Task: Diffraction Limit in the Eye
An object at a distance of g = 25 cm in front of the eye is imaged onto the retina through the lens of the eye with lens radius r = 0.5 cm. How good is the resolving power theoretically if it were limited only by the diffraction limit? Find the minimum object size G that can just be resolved. Assume that the object has a green color, that is, it emits light of wavelength λ = 500 nm.
Solution We use the Abbe criterion. The half aperture angle α, at which the lens appears from the object, is given by
$$\alpha = \arctan\left(\frac{r}{g}\right) = 1.15^\circ. \tag{10.100}$$
This results in the following minimum object size:
$$G = d_{\min} = \frac{\lambda}{\sin(\alpha)} = 25\,\mu\mathrm{m}, \tag{10.101}$$
with the refractive index of the air between the eye and the object set to n = 1. In fact, at this distance, the eye can resolve objects of approximately G = 150 μm size. This is because the eye (like most lenses) contains lens aberrations that degrade the actual resolution compared to the diffraction limit.
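The estimate for the eye follows directly from (10.99) and (10.100) and can be verified with a few lines. In this sketch the function name `abbe_min_distance` is illustrative, and the value n = 1.5 in the last line is a hypothetical immersion medium added only for comparison.

```python
import math

def abbe_min_distance(lam, alpha, n=1.0):
    """Minimum resolvable object distance according to the Abbe criterion (10.99)."""
    return lam / (n * math.sin(alpha))

# Values from the examination task
r, g, lam = 0.5e-2, 25e-2, 500e-9
alpha = math.atan(r / g)              # half aperture angle (10.100), in radians
G = abbe_min_distance(lam, alpha)     # air, n = 1

print(f"alpha = {math.degrees(alpha):.2f} deg, G = {G * 1e6:.0f} micrometres")

# Hypothetical comparison: the same geometry in a medium with n = 1.5
G_immersion = abbe_min_distance(lam, alpha, n=1.5)
```

The comparison value illustrates the statement above that a larger refractive index improves the resolution by the factor n.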
10.5 Quantum Optics
The last section of the optics chapter deals with the quantum particles of light, the photons, and is a transition to Chap. 11, which will deal with quantum physics. Quantum optics here refers to the field of research in physics that deals with photons and their interaction with matter. Important potential applications of this research are quantum encryption, the construction of quantum networks and quantum memories, and ultimately the construction of quantum computers. However, we cannot go that far in this section. In the following, we will introduce the properties of photons and show how it was historically concluded that photons must exist.
10.5.1 Properties of Photons So far we know light as a wave. This wave has an electric field amplitude $E_0$, which is a continuously variable quantity. If the wave is sent to a beam splitter, for example, it is divided into two parts, each of which has a smaller field amplitude. However, observation shows that the energy of electromagnetic radiation is quantized in the smallest units. The electric field identified with a single quantum of light is not further divisible. The energy E of a single photon depends here on the frequency f or on the angular frequency ω of the light field. It is given by
$$E = hf = \hbar\omega, \quad \text{with } \hbar = \frac{h}{2\pi} = 1.05 \cdot 10^{-34}\,\mathrm{Js}. \tag{10.102}$$
The constant of proportionality ħ is the so-called reduced Planck constant, a natural constant. One can also describe a light beam by many successive photons, all moving at the speed of light in the direction of propagation of the beam. The power P of the light beam is related to the photon rate j via
$$P = j \cdot \hbar\omega. \tag{10.103}$$
Here, the photon rate is given by the average number N of photons that pass through a cross-sectional area of the light beam per time t:
$$j = \frac{N}{t}, \quad \text{with } [j] = \frac{1}{\mathrm{s}}. \tag{10.104}$$
In addition to its energy, the photon also has a fixed momentum p, which depends on the wavelength λ or the wavenumber k of the light field:
$$p = \frac{h}{\lambda} = \hbar k. \tag{10.105}$$
The photon can transfer its momentum to other objects, for example, when it collides with an electron or is absorbed by an atom (see Sect. 10.5.2.3).
Written Test: Photon Number in the Laser Pointer
A red laser pointer typically has a wavelength of λ = 633 nm and a maximum allowed power of P = 1 mW. How many photons does the laser pointer emit per second?
Solution The photon rate in the laser, after solving (10.103) for j and using the relationship ω = 2πc/λ between light frequency and wavelength, is given by
$$j = \frac{P}{\hbar\omega} = \frac{P \cdot \lambda}{2\pi\hbar c} = \frac{10^{-3}\,\mathrm{W} \cdot 633\,\mathrm{nm}}{2\pi \cdot 1.05 \cdot 10^{-34}\,\mathrm{Js} \cdot 3 \cdot 10^{8}\,\mathrm{m/s}} = 3.2 \cdot 10^{15}\,\frac{1}{\mathrm{s}}. \tag{10.106}$$
This means that the laser pointer emits an incredibly large number of $3.2 \cdot 10^{15}$ photons per second, even though it has a relatively small power. Normally, we always have to deal with light powers that correspond to such a large number of photons that we do not notice the quantization.
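The photon rate can be recomputed with the same rounded constants as in (10.106). The following sketch assumes SI units; `photon_rate` is an illustrative name.

```python
import math

HBAR = 1.05e-34   # reduced Planck constant in J s, rounded as in (10.102)
C = 3e8           # speed of light in m/s, rounded as in (10.106)

def photon_rate(power, lam):
    """Photons per second in a beam of given power and wavelength, from (10.103)."""
    omega = 2 * math.pi * C / lam
    return power / (HBAR * omega)

# Values from the written test: P = 1 mW, lambda = 633 nm
j = photon_rate(1e-3, 633e-9)
print(f"j = {j:.1e} photons per second")
```

Since j is proportional to P, even a beam attenuated by ten orders of magnitude would still carry several hundred thousand photons per second.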
10.5.2 Experimental Detection of the Photon Since in everyday life we always have to deal with so many photons that the quantization is not noticeable, the question arises how scientists found out that photons exist in the first place. In the following, we will therefore present several famous experiments in physics that provided clues to the quantization of electromagnetic radiation and therefore mark the birth of quantum mechanics at the beginning of the twentieth century. Since the early experiments, as was found later, can also be explained with the help of a classical electromagnetic wave, assuming at the same time that the interaction with matter is quantized, these experiments are not considered as proof of the existence of photons. Only the measurement of photon statistics (see Sect. 10.5.2.4) is regarded as proof of the existence of photons.
10.5.2.1 Black-Body Radiation Thermal radiation (also black-body radiation) is the radiation emitted by a so-called black body that has a temperature T. A black body is an idealized object that absorbs all the radiation incident on it and emits its thermal energy in the form of a characteristic spectrum that depends only on the temperature of the body. Here the body is in thermal equilibrium, that is, the temperature of the body is constant. Although these conditions are idealizations of reality, there are indeed bodies that behave like black bodies to a very good approximation. Examples include charcoal embers, whose color depends on temperature, or the light that reaches us from the sun and other stars. The spectrum of a star can then be used to infer the temperature on the surface of the star. What was first succinctly called a spectrum here is, strictly speaking, the spectral specific radiation M(λ, T ) of a body, which is sketched in
Fig. 10.22 The spectrum of a black body with temperature T is continuous. The higher the temperature, the further the spectrum shifts with its maximum in the direction of short wavelengths. The exact shape of the spectrum could only be explained with the help of Planck’s law of radiation. Since in the derivation of the law the light energy is quantized, this is regarded as the birth of quantum mechanics
Fig. 10.22 for various temperatures. Here M(λ, T) is given by the power dP radiated by the body per surface element dA of the body and per wavelength range dλ:
$$M(\lambda, T) = \frac{dP}{dA \cdot d\lambda}, \quad [M(\lambda, T)] = \frac{\mathrm{W}}{\mathrm{m}^2 \cdot \mathrm{nm}}. \tag{10.107}$$
To derive the shape of the blackbody spectrum theoretically from a model was an unsolved problem towards the end of the nineteenth century. There were some approaches, but they could only reproduce the observed spectrum correctly in certain wavelength ranges. Only Max Planck succeeded in 1900 in setting up a model from which Planck's radiation law, named after him, emerged, which correctly reproduced the blackbody spectrum for all wavelengths. Crucial in Planck's model was the assumption that radiation energy can only be emitted from the blackbody in multiples of ħω. Thus the idea of quantization of light was born. In fact, Max Planck himself did not believe that light could be quantized; for him the whole thing was just a mathematical trick. Planck's law of radiation is
$$M(\lambda, T)\,dA\,d\lambda = \frac{2\pi h c^2}{\lambda^5} \cdot \frac{1}{e^{\frac{hc}{\lambda k_B T}} - 1}\,dA\,d\lambda. \tag{10.108}$$
The formula is relatively complicated, but the essential aspects can be summarized as follows: The higher the temperature, the further the spectrum shifts towards short wavelengths. A blue-colored spectrum thus corresponds to a higher temperature than a red-colored spectrum. The wavelength $\lambda_{\max}$, at which the spectral specific emission is at a maximum, is given by a simple relationship, which is also known as Wien's displacement law:
$$\lambda_{\max} = \frac{2897.8\,\mu\mathrm{m} \cdot \mathrm{K}}{T}. \tag{10.109}$$
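That Wien's displacement law follows from Planck's law can be checked numerically by locating the maximum of (10.108) on a wavelength grid. The sketch below is a simple brute-force scan, not an analytical derivation. The constants are CODATA values, and the grid limits (100 nm to 10 μm in 1 nm steps) are assumptions that suit temperatures of a few thousand kelvin.

```python
import math

H = 6.62607015e-34    # Planck constant in J s
C = 299792458.0       # speed of light in m/s
KB = 1.380649e-23     # Boltzmann constant in J/K

def planck_m(lam, T):
    """Spectral specific radiation M(lam, T) of (10.108) in W per m^2 per m of wavelength."""
    x = H * C / (lam * KB * T)
    return 2 * math.pi * H * C**2 / lam**5 / math.expm1(x)

def peak_wavelength(T, lam_min=100e-9, lam_max=10e-6, step=1e-9):
    """Wavelength on the grid at which planck_m is largest (brute-force scan)."""
    n_steps = int(round((lam_max - lam_min) / step))
    grid = [lam_min + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda lam: planck_m(lam, T))

for T in (3000.0, 5000.0, 8000.0):
    numeric = peak_wavelength(T)
    wien = 2897.8e-6 / T          # Wien's displacement law (10.109)
    print(f"T = {T:.0f} K: scan {numeric * 1e9:.0f} nm, Wien {wien * 1e9:.0f} nm")
```

The scan and the closed formula agree to within the grid spacing, and the peak moves to shorter wavelengths as the temperature rises.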
The total radiated power P of a body with surface A is obtained by integrating (10.108) over the surface of the body and all wavelengths. The result is the Stefan-Boltzmann law
$$P = \sigma \cdot A \cdot T^4, \quad \sigma = 5.67 \cdot 10^{-8}\,\frac{\mathrm{W}}{\mathrm{m}^2 \cdot \mathrm{K}^4}, \tag{10.110}$$
with the Stefan-Boltzmann constant σ.
Written Test: Radiation of the Sun
The sun has its spectral radiation maximum at a wavelength of $\lambda_{\max}$ = 580 nm; the radius of the sun is R = 696 342 km. What is the surface temperature of the sun, and how much total power does the sun radiate?
Solution The surface temperature T results from Wien's displacement law:
$$T = \frac{2897.8\,\mu\mathrm{m} \cdot \mathrm{K}}{580\,\mathrm{nm}} = 4996\,\mathrm{K}. \tag{10.111}$$
So the surface of the sun has a temperature of about 5000 K. This is relatively little compared to the temperature inside the sun, where nuclear fusion takes place. There, temperatures of approx. 15 million degrees are reached. With the surface of the sun A = 4πR² and the Stefan-Boltzmann law, the total radiated power is
$$P = 5.67 \cdot 10^{-8}\,\frac{\mathrm{W}}{\mathrm{m}^2 \cdot \mathrm{K}^4} \cdot 4\pi \cdot (696\,342\,\mathrm{km})^2 \cdot (4996\,\mathrm{K})^4 = 2.2 \cdot 10^{26}\,\mathrm{W}. \tag{10.112}$$
The radiation power of the sun is enormous.
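Both steps of this solution combine into a short calculation. The sketch below uses the constants as quoted in (10.109) and (10.110) and assumes that the sun radiates as an ideal black body, as the exercise does; the function names are illustrative.

```python
import math

SIGMA = 5.67e-8      # Stefan-Boltzmann constant in W/(m^2 K^4), as in (10.110)
WIEN_B = 2897.8e-6   # Wien constant in m K, as in (10.109)

def wien_temperature(lam_max):
    """Temperature of a black body whose spectrum peaks at lam_max (10.109)."""
    return WIEN_B / lam_max

def stefan_boltzmann_power(radius, T):
    """Total power radiated by a spherical black body of given radius (10.110)."""
    area = 4 * math.pi * radius**2
    return SIGMA * area * T**4

# Values from the written test
T = wien_temperature(580e-9)
P = stefan_boltzmann_power(696342e3, T)
print(f"T = {T:.0f} K, P = {P:.2e} W")
```

Because the power scales with T⁴, a change of only a few percent in the assumed peak wavelength shifts the result noticeably.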
10.5.2.2 Photoelectric Effect A second experiment, which could only be explained at the beginning of the twentieth century by the assumption of quantization of radiation, is the so-called photoelectric effect. Here, light of frequency f and power P falls on a metal plate which is in a vacuum (Fig. 10.23). A voltage is applied between the metal plate and an anode which is also in a vacuum. When the light strikes the metal plate, electrons can be knocked out of the metal. These are accelerated to the anode by the voltage and detected as a current in the cable to the battery. The observation of the way in which the measured current depends on frequency for different powers (Fig. 10.24a),
Fig. 10.23 Set-up of an experiment to measure the photoelectric effect. Light falls onto a metal plate in a vacuum chamber. This allows electrons to be knocked out of the plate, which are then accelerated to the anode by an applied voltage. The electrons arriving at the anode cause a current I, which is measured in this experiment
Fig. 10.24 (a) Measurement curves for the photoelectric effect. The anode current I is plotted against the light frequency f. Current is measured only for frequencies above the cutoff frequency, f > fLimit. Above the cutoff frequency, the current increases with the light power P; below the cutoff frequency it is independent of P. (b) Model to explain the photoelectric effect. The electrons are located in an energy region called the conduction band (CB) of the metal. The conduction band is filled with electrons up to the so-called Fermi energy EF. They require at least the work function WA as additional energy to leave the metal. This energy is supplied by single photons with E = h · f
was not understood at the beginning of the twentieth century. If the frequency is less than a cutoff frequency characteristic of the metal, f < fLimit, the current is I = 0, and this is independent of the light power. For frequencies f > fLimit, on the other hand, the current increases with both frequency and power. An interpretation of this observation was proposed by Albert Einstein in 1905, for which he was awarded the Nobel Prize in Physics in 1921. According to Einstein, in order to release the electrons from the metal, a minimum energy characteristic of the metal is required. Today we know that the conduction electrons are located in a certain energy range in the so-called conduction band. The energy difference between the maximum energy of the electrons, the so-called Fermi energy EF, and the energy of a free electron is the so-called work function WA. Thus, one needs at least the work function to release electrons from the metal. Einstein further assumed that light hits the plate in quantized energy units of E = hf. Only when the photon energy is equal to or greater than the work function can electrons be released from the metal. We equate the two energies to get the limiting frequency:
$$h f_{\mathrm{Limit}} = W_A. \tag{10.113}$$
If the photon energy is not sufficient, it does not help to increase the light power. This increases the number of photons that fall on the plate per unit time, but the energy of each photon remains the same and is thus too small. Theoretically, of course, it could happen that two photons interact with an electron at the same time and release the electron with their combined energy. However, this process is very unlikely and only occurs at very high light powers. If one increases the light power in the case f > fLimit, the number of photons that can release an electron increases and so does the current. If one increases the light frequency, starting from f = fLimit, electrons located further down in the conduction band can also be released. The current increases with the number of electrons that can potentially be released.
Written Test: Photoelectric Effect
The work function of zinc is WA = 4.34 eV. What is the maximum wavelength of light $\lambda_{\max}$ with which one can just release electrons from a zinc plate?
Solution The cutoff frequency according to (10.113) is
$$f_{\mathrm{Limit}} = \frac{W_A}{h} = \frac{4.34 \cdot 1.6 \cdot 10^{-19}\,\mathrm{J}}{2\pi \cdot 1.05 \cdot 10^{-34}\,\mathrm{Js}} = 1.05 \cdot 10^{15}\,\mathrm{s}^{-1}, \tag{10.114}$$
where the unit of the work function was converted from eV to joules by multiplication with the numerical value of the elementary charge. The corresponding wavelength is given by the relation c = λ · f. This results in
$$\lambda_{\max} = \frac{c}{f_{\mathrm{Limit}}} = \frac{3 \cdot 10^{8}\,\mathrm{m/s}}{1.05 \cdot 10^{15}\,\mathrm{s}^{-1}} = 286\,\mathrm{nm}. \tag{10.115}$$
So, to extract electrons from a zinc plate, you need light with a wavelength of λ < 286 nm.
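The conversion from work function to cutoff frequency and wavelength can be sketched as follows. Note one deliberate difference from the solution above: the constants here are rounded CODATA values (an assumption of this sketch) rather than the coarser values used in (10.114), so the digits differ slightly before rounding.

```python
H = 6.626e-34          # Planck constant in J s (rounded CODATA value)
E_CHARGE = 1.602e-19   # elementary charge in C, converts eV to J
C = 2.998e8            # speed of light in m/s

def photoelectric_cutoff(work_function_ev):
    """Return (cutoff frequency in Hz, maximum wavelength in m) from (10.113)."""
    f_limit = work_function_ev * E_CHARGE / H
    return f_limit, C / f_limit

# Zinc, work function from the written test
f_limit, lam_max = photoelectric_cutoff(4.34)
print(f"f_Limit = {f_limit:.2e} Hz, lambda_max = {lam_max * 1e9:.0f} nm")
```

Light with a wavelength above `lam_max` cannot release electrons from the plate, regardless of its power.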
10.5.2.3 Compton Scattering In Compton scattering, an electromagnetic wave with wavelength λ hits a free electron at rest with v = 0 (Fig. 10.25). In the first experiments, free electrons in graphite were used, that is, the electrons were not completely free and also not completely at rest, but the assumption of a free electron at rest in graphite is a good approximation. Due to the interaction of the light wave with the electrons, both the electron and the light wave are deflected. Experimentally, the scattered light wave is measured with the help of a detector which is moved around the location of the
Fig. 10.25 In Compton scattering, a photon collides with an electron at rest. This results in an elastic collision of the two particles with conservation of energy and momentum. The wavelength of the scattered photon is therefore larger than that of the incoming photon and depends on the scattering angle ϕ
scattering and thus angle-resolved information about the scattered wave is obtained. It is observed that the scattered waves have different wavelengths λ′(ϕ) depending on the angle relative to the incoming wave (the scattering angle ϕ). This observation does not conform to the assumption that a wave has been scattered in this case. Under this assumption, the incident power would be redistributed in different directions during the scattering, but the wavelength would not change. If, on the other hand, the incoming wave is described as a particle stream and the scattering of the wave by the electrons as an elastic collision of particles, the observed behaviour is reproduced exactly. Analogous to the elastic collision of spheres in Sect. 2.3.2.1, conservation of momentum and conservation of energy also apply to the collision of the photon with the electron. The conservation of momentum is
$$\hbar\vec{k} = m_e \cdot \vec{v}\,' + \hbar\vec{k}', \tag{10.116}$$
where $\vec{k}$ and $\vec{k}'$ are the wave vectors before and after the collision, respectively, and $\vec{v}\,'$ is the velocity of the electron after the collision. Analogously, the condition for conservation of energy is
$$hf = \frac{1}{2} \cdot m_e \cdot v'^2 + hf', \tag{10.117}$$
with light frequencies f and f′ before and after the collision. During the collision, the incoming photon releases energy to the electron. This changes the frequency and thus also the wavelength of the scattered photon. Based on the two conditions (10.116) and (10.117), one can determine the change in the wavelength of light at the collision as a function of the scattering angle. It reads
$$\Delta\lambda = \lambda' - \lambda = \lambda_c \cdot (1 - \cos(\phi)), \quad \text{with } \lambda_c = \frac{h}{m_e \cdot c}. \tag{10.118}$$
The quantity $\lambda_c$ is called the Compton wavelength. It depends only on natural constants.
Exercise: Scattering of a Photon by an Atom
Analogous to the scattering of a photon by an electron, a photon can also be scattered by an atom. We consider here the backward scattering (scattering angle ϕ = 180°) of a photon with wavelength λ = 780 nm at a resting rubidium atom with a mass of m = $10^{-25}$ kg. (1) Show that the wavelength of the photon changes only insignificantly during scattering. To do this, replace the electron mass in Compton scattering by the mass of the atom. (2) Now, assuming λ′ = λ, determine the velocity v′ of the atom after the scattering. Use the condition for conservation of momentum for this.
Solution According to (10.118) the change of the wavelength is
$$\Delta\lambda = \frac{2\pi \cdot 1.05 \cdot 10^{-34}\,\mathrm{Js}}{10^{-25}\,\mathrm{kg} \cdot 3 \cdot 10^{8}\,\mathrm{m/s}} \cdot (1 - \cos(180^\circ)) = 4.4 \cdot 10^{-8}\,\mathrm{nm}. \tag{10.119}$$
The relative change of the wavelength is negligibly small, with Δλ/λ = $5.6 \cdot 10^{-11}$. So, to an excellent approximation, one can say that λ′ = λ or that the magnitudes of the wave vectors are equal, $|\vec{k}'| = |\vec{k}|$. Since the photon is scattered exactly in the backward direction, the direction of motion reverses. Therefore, $\vec{k}' = -\vec{k}$ holds for the wave vectors. We put this condition into the equation for conservation of momentum (10.116),
$$\hbar k = m \cdot v' - \hbar k, \tag{10.120}$$
written without the vector notation. We solve for the velocity:
$$v' = \frac{2\hbar k}{m} = \frac{2\hbar \cdot \frac{2\pi}{\lambda}}{m} = \frac{2 \cdot 1.05 \cdot 10^{-34}\,\mathrm{Js} \cdot \frac{2\pi}{780\,\mathrm{nm}}}{10^{-25}\,\mathrm{kg}} = 1.7\,\frac{\mathrm{cm}}{\mathrm{s}}. \tag{10.121}$$
Thus, after the collision with the photon, the rubidium atom moves with a velocity of v′ = 1.7 cm/s.
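Both parts of this exercise can be reproduced numerically. The sketch below uses the rounded constants from the solution and SI units; the function names are illustrative, and `recoil_velocity_backscatter` relies on the approximation λ′ = λ made above.

```python
import math

HBAR = 1.05e-34   # reduced Planck constant in J s, rounded as in the solution
C = 3e8           # speed of light in m/s

def compton_shift(mass, phi):
    """Wavelength change (10.118) for a scatterer of given mass, phi in radians."""
    lam_c = 2 * math.pi * HBAR / (mass * C)
    return lam_c * (1 - math.cos(phi))

def recoil_velocity_backscatter(mass, lam):
    """Velocity of the scatterer after backward scattering, from (10.120) with lam' = lam."""
    k = 2 * math.pi / lam
    return 2 * HBAR * k / mass

# Values from the exercise: rubidium atom, 780 nm photon, backward scattering
m_atom, lam = 1e-25, 780e-9
dlam = compton_shift(m_atom, math.pi)
v = recoil_velocity_backscatter(m_atom, lam)

print(f"dlam = {dlam * 1e9:.1e} nm, relative change = {dlam / lam:.1e}")
print(f"v' = {v * 100:.1f} cm/s")
```

Calling `compton_shift` with the electron mass instead of the atomic mass gives a shift that is larger by the ratio of the two masses, which is why the effect is observable for electrons but not for atoms.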
10.5.2.4 Photon Statistics in Parametric Fluorescence The experiments discussed so far point, according to today's view, to the existence of photons, but they are not considered compelling evidence. The experiment presented in the following, on the other hand, is widely accepted as evidence. The experiment is based on the fact that photons are indivisible objects. If one sends a single photon onto a beam splitter, the photon can only be detected on one of the two possible output paths. The problem here is to ensure that only a single photon hits the beam splitter. In almost all light fields, photons also occur together as pairs, regardless of how weak the power of the light beam is. However, a photon pair can split up at the beam splitter. In this case, one photon is measured on each of the two paths. The measurement therefore behaves in exactly the same way as if a wave whose amplitude is split were to hit the beam splitter. With the aid of an additional tool, however, it is possible to determine at which point in time a single photon falls on the beam splitter. To do this, one sends a beam of light onto a nonlinear crystal, such as
Fig. 10.26 A nonlinear crystal such as potassium niobate KNbO3 can absorb blue light and generate red light from it. Due to conservation of energy, two red photons are always produced from one blue photon. These two photons come out of the crystal at different angles due to conservation of momentum. If you detect the photon on the upper path with a detector (the trigger T), you can be sure that there is also a photon on the lower path. This lower path is split with a beam splitter (BS) and the two output ports of the beam splitter are monitored with detectors D1 and D2. If a photon is measured at the trigger, another photon is measured either at detector D1 or at detector D2, but never do both detectors D1 and D2 click simultaneously. This is regarded as proof of the indivisibility of the photon at the beam splitter
potassium niobate (KNbO3) (Fig. 10.26). In such crystals, parametric fluorescence can occur. Here, an incoming photon with wavelength λ is absorbed in the crystal and converted into two outgoing photons with longer wavelength. Energy conservation applies here. In KNbO3, for example, a blue photon with λ = 400 nm can be converted into two infrared photons with λ′ = 800 nm. Due to dispersion in the crystal, the two photons leave the crystal at an angle to the axis of incidence. Since conservation of momentum applies, the two photons always leave the crystal at complementary angles, that is, along two different exit paths. At one of the output paths, a photon detector is set up, the so-called trigger T. If the trigger measures a photon, then one can be sure that at the second output path, which leads to a beam splitter, there is also a photon, and only a single photon. The two outputs of the beam splitter are monitored by photon detectors D1 and D2. Whenever the trigger T provides a signal, one measures a signal at either detector D1 or detector D2, but not at both simultaneously. This observation results from the indivisibility of the photon arriving at the beam splitter and is proof of the existence of photons. In real experiments, one sometimes nevertheless detects simultaneous triggering of D1 and D2. This occurs when the nonlinear crystal produces two pairs of photons simultaneously, which also allows two photons to arrive at the beamsplitter and split between the two paths. However, if the input power of the blue light beam is sufficiently small, this process is very unlikely and occurs accordingly rarely.
10.6 Optics: Compact
The most important formulas for optics are summarized:
Electromagnetic waves (10.1) and (10.7):
\[ \vec{E}(\vec{r}, t) = \vec{E}_0 \cdot \sin(\vec{k} \cdot \vec{r} - \omega \cdot t), \quad \vec{B}(\vec{r}, t) = \vec{B}_0 \cdot \sin(\vec{k} \cdot \vec{r} - \omega \cdot t), \quad E_0 = c \cdot B_0. \]
Intensity of an electromagnetic wave (10.8):
\[ I = \frac{P}{A} = \frac{1}{2 \mu_0 \cdot c} E_0^2. \]
Radiation pressure of an electromagnetic wave (10.10):
\[ p_S = \frac{I}{c}. \]
Velocity of light in a medium with refractive index n (10.23):
\[ c_n = \frac{c}{n}. \]
Attenuation of electromagnetic waves according to the Lambert-Beer law (10.25):
\[ P(x) = P(0) \cdot e^{-\mu \cdot x}. \]
Refraction of light according to Snellius’ law (10.30):
\[ n_1 \cdot \sin(\alpha_1) = n_2 \cdot \sin(\alpha_2). \]
Total internal reflection (10.31):
\[ \alpha_T = \arcsin\left(\frac{n_2}{n_1}\right). \]
Brewster reflection (10.57):
\[ \alpha_B = \arctan\left(\frac{n_2}{n_1}\right). \]
Focal length of a lens (10.34):
\[ \frac{1}{f} = (n - 1) \cdot \left(\frac{1}{r_1} + \frac{1}{r_2}\right). \]
Lens equation and magnification (10.38) and (10.39):
\[ \frac{1}{f} = \frac{1}{g} + \frac{1}{b}, \quad V = \frac{B}{G} = \frac{|b|}{g}. \]
Law of Malus (10.50):
\[ P_t = P_0 \cdot \cos(\alpha)^2. \]
Optical activity of solids (10.65) and liquids (10.66):
\[ \alpha = \alpha_s \cdot l, \quad \alpha = \alpha_s \cdot c \cdot l. \]
Coherence time and coherence length (10.73) and (10.74):
\[ \tau_c = \frac{1}{\Delta f}, \quad l_c = \tau_c \cdot c. \]
Mach-Zehnder interferometer (10.75):
\[ I_1 = I_0 \cdot \cos^2\left(\frac{k \cdot \Delta l}{2}\right), \quad I_2 = I_0 \cdot \sin^2\left(\frac{k \cdot \Delta l}{2}\right). \]
Constructive thin-film interference (10.80):
\[ 2d \cdot \sqrt{n^2 - \sin(\alpha_m)^2} = \left(m - \frac{1}{2}\right) \cdot \lambda. \]
Diffraction maxima at the single slit (10.88):
\[ \left(m + \frac{1}{2}\right) \cdot \lambda = a \cdot \sin(\alpha_m^+). \]
Diffraction maxima at the double slit and grating (10.89):
\[ m \cdot \lambda = d \cdot \sin(\alpha_m^+). \]
Resolution according to the Rayleigh criterion (10.97):
\[ \sin(\delta_{min}) = 1.22 \cdot \frac{\lambda}{D}. \]
Resolution according to the Abbe criterion (10.98):
\[ d_{min} = \frac{\lambda}{n \cdot \sin(\alpha)}. \]
Energy and momentum of a photon (10.102) and (10.105):
\[ E = hf = \hbar\omega, \quad p = \frac{h}{\lambda} = \hbar k. \]
Wien’s law of displacement (10.109):
\[ \lambda_{max} = \frac{2897.8\ \mu\mathrm{m} \cdot \mathrm{K}}{T}. \]
Stefan-Boltzmann law (10.110):
\[ P = \sigma \cdot A \cdot T^4, \quad \sigma = 5.67 \cdot 10^{-8}\ \frac{\mathrm{W}}{\mathrm{m}^2 \cdot \mathrm{K}^4}. \]
Cut-off frequency for the photoelectric effect (10.113):
\[ h f_{Limit} = W_A. \]
Compton scattering (10.118):
\[ \Delta\lambda = \lambda_c \cdot (1 - \cos(\phi)), \quad \text{with } \lambda_c = \frac{h}{m_e \cdot c}. \]
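As a quick numerical companion to the summary, a few of the formulas can be evaluated directly. The following Python sketch is illustrative only: the function names, the rounded constants and the sample values (a glass-air interface with n = 1.5, a temperature of 5800 K) are assumptions chosen for this example and are not taken from the text.

```python
import math

H = 6.626e-34     # Planck constant in J*s (rounded)
C = 2.998e8       # speed of light in m/s (rounded)
M_E = 9.109e-31   # electron mass in kg (rounded)
SIGMA = 5.67e-8   # Stefan-Boltzmann constant in W/(m^2 K^4), as in (10.110)


def critical_angle_deg(n1: float, n2: float) -> float:
    """Angle of total internal reflection (10.31) for light going from n1 to n2 < n1."""
    return math.degrees(math.asin(n2 / n1))


def brewster_angle_deg(n1: float, n2: float) -> float:
    """Brewster angle (10.57) for light going from n1 to n2."""
    return math.degrees(math.atan(n2 / n1))


def wien_peak_um(temperature_k: float) -> float:
    """Wavelength of maximum emission (10.109) in micrometres."""
    return 2897.8 / temperature_k


def radiated_power_w(area_m2: float, temperature_k: float) -> float:
    """Stefan-Boltzmann law (10.110)."""
    return SIGMA * area_m2 * temperature_k ** 4


def compton_wavelength_m() -> float:
    """Compton wavelength of the electron from (10.118)."""
    return H / (M_E * C)


print(critical_angle_deg(1.5, 1.0))   # glass to air
print(brewster_angle_deg(1.0, 1.5))   # air to glass
print(wien_peak_um(5800.0))           # hypothetical emitter at 5800 K
print(radiated_power_w(1.0, 300.0))   # 1 m^2 at room temperature
print(compton_wavelength_m())
```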
11 Fundamentals of Quantum Physics
11.1 Properties of Quantum Objects
Quantum physics has become an indispensable part of technical applications. The functionality of every semiconductor component, every diode and every transistor is based on quantum effects. There is quantum physics in every computer and every smartphone. Yet active research is still being conducted in this field. In almost all areas of physics research, quantum physics is either the direct focus, such as in the development of quantum technologies, or at least plays a role, such as in nanophysics. Even in the life sciences, quantum effects are becoming increasingly observable; for example, according to the latest findings, quantum phenomena seem to be responsible for the high efficiency of energy transfer in the photosynthesis process. In the future, quantum physics will play an even greater role, especially in technical developments, for example, in secure encryption using quantum cryptography, in the construction of quantum networks or in the construction of extremely powerful quantum computers. It is not yet clear which applications can ultimately be realized, but in any case it is worth knowing a little about quantum objects and their properties. The fact that the laws of the quantum world often contradict our everyday experience of particles is very exciting. As we will see in Sect. 11.1.1, a quantum object, although a particle in our mind, can behave like a wave, depending on the experiment. Quantum objects are, for example, elementary particles such as photons and electrons, but also protons, neutrons, and so on. The list of such particles is very extensive. Composite objects can also be quantum objects, for example, an atom consisting of electrons, protons and neutrons (Sect. 11.3.1). Under certain conditions, even much larger objects can behave as quantum objects. Exploring the boundary between quantum physics and classical physics is a current area of research in physics. 
For example, giant molecules are produced and cooled down to extremely low temperatures. These molecules are so large that they can be observed with an optical microscope. Here you can see quite clearly that it is a particle. However, if these molecules are sent through a grating, one observes diffraction, that is, a wave phenomenon. This wave-particle duality is examined more closely in the following.
© Springer-Verlag GmbH Germany, part of Springer Nature 2023 S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_11
11.1.1 Wave-Particle Duality
In Sect. 10.5 on quantum optics, it was shown that light can be described not only as a wave with frequency f and wavelength λ, but also as a stream of light quanta, the photons, with energy E and momentum p:
\[ E = hf, \tag{11.1} \]
\[ p = \frac{h}{\lambda}, \tag{11.2} \]
with Planck’s quantum of action $h = 2\pi \cdot 1.05 \cdot 10^{-34}$ Js. Depending on the experiment performed, either the particle character or the wave character is revealed. If, for example, a photon is detected with a photon detector, a particle is measured at a certain time at the location of the detector. If, on the other hand, a wave is allowed to pass through a double slit, a wave phenomenon is observed, namely diffraction. In this property, photons do not differ in any way from all other quantum particles. Every quantum object behaves either as a wave or as a particle, depending on which experiment is being carried out. If one assumes a particle with mass m and velocity v, the kinetic energy and the momentum in the particle picture are given by
\[ E_{kin} = \frac{m}{2} \cdot v^2, \tag{11.3} \]
\[ p = m \cdot v. \tag{11.4} \]
In the wave picture, particles with a mass are referred to as a matter wave. The wavelength $\lambda_{dB}$ and the frequency f of the matter wave are
\[ \lambda_{dB} = \frac{h}{p} = \frac{h}{m \cdot v}, \tag{11.5} \]
\[ f = \frac{E_{kin}}{h}. \tag{11.6} \]
Since this relationship was written down by Louis de Broglie in 1924, the wavelength of the matter wave is also called the de Broglie wavelength. The smaller the velocity and the mass of the particle, the larger the de Broglie wavelength. For typical objects from our everyday experience, the de Broglie wavelength and correspondingly the coherence length (Sect. 10.4.1.2) of the matter wave are so small that we cannot observe any interference phenomena. The object therefore behaves classically. In order to be able to observe quantum effects, one therefore needs particles with mass and velocity as small as possible. This is also the reason why quantum experiments are often carried out at very low temperatures, because this makes the thermal velocity $v_{th}$ of the particles according to (5.33) small.
Exercise: de Broglie Wavelength of Air
Air consists predominantly of nitrogen molecules with mass $m = 2.3 \cdot 10^{-26}$ kg, which move with the thermal velocity
\[ v_{th} = \sqrt{\frac{3 k_B T}{m}} \tag{11.7} \]
with the Boltzmann constant $k_B = 1.38 \cdot 10^{-23}$ J/K. What is the de Broglie wavelength of a nitrogen molecule at room temperature T = 300 K?
Solution
At room temperature the thermal velocity is
\[ v_{th} = \sqrt{\frac{3 \cdot 1.38 \cdot 10^{-23}\ \mathrm{J/K} \cdot 300\ \mathrm{K}}{2.3 \cdot 10^{-26}\ \mathrm{kg}}} = 734.8\ \frac{\mathrm{m}}{\mathrm{s}}. \tag{11.8} \]
The corresponding de Broglie wavelength is
\[ \lambda_{dB} = \frac{h}{m \cdot v} = \frac{2\pi \cdot 1.05 \cdot 10^{-34}\ \mathrm{Js}}{2.3 \cdot 10^{-26}\ \mathrm{kg} \cdot 734.8\ \frac{\mathrm{m}}{\mathrm{s}}} = 3.9 \cdot 10^{-11}\ \mathrm{m}. \tag{11.9} \]
So the wavelength is smaller than an atom. Since the motion of the molecule is thermal, random motion, the coherence length of the matter wave is on the order of the wavelength. So if interference effects occur, they average out on a length scale smaller than an atom. Therefore, no interference is observable in air at room temperature. Air thus behaves as a classical gas.
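The arithmetic of this exercise can be reproduced with a short Python sketch. The function names are hypothetical; the constants and the molecular mass are the values quoted in the exercise.

```python
import math

HBAR = 1.05e-34          # reduced Planck constant in J*s, as used in the text
H = 2.0 * math.pi * HBAR  # Planck's quantum of action
K_B = 1.38e-23           # Boltzmann constant in J/K


def thermal_velocity(mass_kg: float, temperature_k: float) -> float:
    """Thermal velocity according to (11.7)."""
    return math.sqrt(3.0 * K_B * temperature_k / mass_kg)


def de_broglie_wavelength(mass_kg: float, velocity_m_s: float) -> float:
    """De Broglie wavelength according to (11.5)."""
    return H / (mass_kg * velocity_m_s)


mass = 2.3e-26  # kg, value stated in the exercise
v_th = thermal_velocity(mass, 300.0)
lam = de_broglie_wavelength(mass, v_th)
print(f"v_th = {v_th:.1f} m/s, lambda_dB = {lam:.2e} m")
```

Calling the same two functions with a much lower temperature shows how the wavelength grows as the gas is cooled, which is the reason given above for performing quantum experiments at very low temperatures.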
11.1.2 Copenhagen Interpretation
We have seen that quantum objects can be assigned a wavelength. We now want to understand in more detail what the nature of the matter wave is all about. For this purpose, let us once again consider a light wave. The electric field in a light wave oscillates with the function
\[ E(\vec{r}, t) = E_0(\vec{r}) \cdot \cos(\vec{k} \cdot \vec{r} - \omega \cdot t), \tag{11.10} \]
where the strength of the electric field at the location $\vec{r}$ is given by the amplitude $E_0(\vec{r})$, a real number. The vector character of the electric field $\vec{E}$ has been neglected in this formula, for the sake of simplicity. Importantly, the intensity I of the wave according to (10.8) is proportional to
\[ I(\vec{r}) \propto E_0(\vec{r})^2, \tag{11.11} \]
thus to the square of the electric field amplitude. The intensity of the wave at a location in turn directly indicates the number of photons that are present at the location per time and cross-sectional area. The probability of detecting a photon at the location is thus proportional to the intensity I and hence to the square of the amplitude $E_0^2$. We will apply this relationship to matter waves in a moment. Before that, however, let us introduce the matter wave as a mathematical object. The essential difference to the electromagnetic wave is the fact that the quantities in the matter wave are complex numbers. The wave function is
\[ \Psi(\vec{r}, t) = \Psi_0(\vec{r}) \cdot e^{i(\vec{k} \cdot \vec{r} - \omega \cdot t)} \tag{11.12} \]
and is usually denoted by the Greek letter “Psi”. Both the amplitude $\Psi_0 \in \mathbb{C}$ and the exponential function are complex quantities. How to deal with complex quantities is shown in Sect. 9.2.2.1. The complex matter wave in the form (11.12) is analogous to the polar representation (9.68) of a complex number, with the difference that in the matter wave the pre-factor $\Psi_0$ can also be complex. This difference is unimportant for the following argumentation. With (9.70) the absolute square of the wave follows as
\[ |\Psi(\vec{r}, t)|^2 = |\Psi_0(\vec{r})|^2. \tag{11.13} \]
In the Copenhagen interpretation, this absolute square is interpreted as the probability $W(\vec{r})$ of finding the quantum object at the location $\vec{r}$ during a measurement. Thus
\[ W(\vec{r}) = |\Psi_0(\vec{r})|^2. \tag{11.14} \]
Analogous to the probability of measuring a photon at the location $\vec{r}$, which is proportional to $E_0^2$, the probability of measuring a quantum object described by a matter wave is given by $|\Psi_0(\vec{r})|^2$. Before the measurement is performed, the quantum object is a matter wave. This is an extended object that exists simultaneously at all locations where its amplitude $\Psi_0(\vec{r})$ is not exactly zero. So we can say that before the measurement the quantum object is at all of these places simultaneously, with position-dependent probability $|\Psi_0(\vec{r}, t)|^2$. During a measurement, a random location is statistically selected from this distribution. This determines and fixes the location $\vec{r}$ of the particle, so the particle character returns through the measurement of the location. This is called the collapse of the wave function. In the following, the concepts of the wave function and the measurement of the location explained above are illustrated with an example: electron diffraction at the double slit.
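Before turning to that experiment, the statistical selection of a location from the distribution given by the absolute square of the wave function can be sketched in Python. Everything specific here is an assumption made for illustration: a one-dimensional grid, a real Gaussian amplitude chosen so that its square has standard deviation 1, and the function names.

```python
import math
import random


def sample_positions(amplitude, grid, n_samples, seed=0):
    """Draw measured positions with probability proportional to |amplitude(x)|^2."""
    weights = [abs(amplitude(x)) ** 2 for x in grid]
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    return rng.choices(grid, weights=weights, k=n_samples)


def gaussian_amplitude(x):
    """Hypothetical amplitude; its square is proportional to exp(-x^2 / 2)."""
    return math.exp(-x ** 2 / 4.0)


grid = [-6.0 + 0.05 * i for i in range(241)]  # positions from -6 to 6
hits = sample_positions(gaussian_amplitude, grid, 20000)

mean = sum(hits) / len(hits)
spread = math.sqrt(sum((x - mean) ** 2 for x in hits) / len(hits))
print(mean, spread)
```

Each entry of `hits` plays the role of one measurement outcome. A single entry is random; only the statistics of many entries reproduce the shape of the distribution, here a mean close to 0 and a spread close to 1.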
11.1.2.1 Electron Diffraction at the Double Slit Experiment
The diffraction experiment of electrons passing a double slit was carried out by Möllenstedt and Jönsson in Tübingen in 1960 and is one of the outstanding experiments for the investigation of wave-particle duality. It was named the most beautiful experiment of all time by Physics World magazine in 2002. In the experiment, electrons are accelerated by means of an anode and sent onto a copper foil containing thin slits (Fig. 11.1). The outstanding experimental achievement of Möllenstedt and Jönsson was to make the slit distance d so small that both slits could be illuminated coherently with the incoming matter wave. As a result, coherent matter waves $\Psi_1(\vec{r}, t)$ and $\Psi_2(\vec{r}, t)$, which are capable of interfering with each other, emerge from the two slits. The total wave behind the screen is thus
\[ \Psi(\vec{r}, t) = \Psi_1(\vec{r}, t) + \Psi_2(\vec{r}, t). \tag{11.15} \]
Fig. 11.1 In the experiment of Möllenstedt and Jönsson 1960 in Tübingen, a matter wave of electrons is sent through a double slit. By interference of the two matter waves $\Psi_1(\vec{r}, t)$ and $\Psi_2(\vec{r}, t)$ emerging from the slits, a matter wave is produced at a fluorescent screen behind the slit, whose absolute square $|\Psi(\vec{r}, t)|^2$ corresponds to the typical diffraction pattern of the double slit. A single measurement of an electron statistically chooses an arbitrary point with probability $|\Psi(\vec{r}, t)|^2$. Only by averaging over the points of impact of many electrons at the screen can one observe the diffraction pattern
At a distance l behind the double slit a luminescent screen is attached which lights up at the point of impact of impinging electrons. The fluorescent screen is thus a detector which measures the position of each electron individually. According to the Copenhagen interpretation, the probability of measuring an electron at location x on the detector is given by
\[ |\Psi(x)|^2 = |\Psi_1 + \Psi_2|^2 = (\Psi_1 + \Psi_2) \cdot (\Psi_1^* + \Psi_2^*) \tag{11.16} \]
\[ = |\Psi_1|^2 + |\Psi_2|^2 + \Psi_1 \Psi_2^* + \Psi_1^* \Psi_2, \tag{11.17} \]
whereby for the sake of simplicity the arguments of the functions $\Psi_1$ and $\Psi_2$ were omitted here. The last two terms in (11.17) describe the interference between the two matter waves at the screen and lead to the typical diffraction pattern at the double slit (Fig. 10.19). Whenever a single electron is detected, a random place at the screen lights up with probability $|\Psi(\vec{r}, t)|^2$. At the same time the wave function collapses. The electron loses its wave character and becomes a particle at the point of impact. The typical diffraction pattern can therefore only be observed if a large number of electrons are allowed to hit the screen and the impact locations are evaluated statistically. Möllenstedt and Jönsson did this by placing a photographic plate behind the fluorescent screen and recording the light flashes over a longer period of time. The photographic plate turned black where many electrons were incident. This resulted in clear interference fringes on the photographic plate.
Exercise: Measurement at the Double Slit
We have seen that in the electron diffraction of Möllenstedt and Jönsson the wave function of the electrons arriving at the luminous screen corresponds to the diffraction pattern of the double slit. What happens to the diffraction pattern if one measures, with the help of another detector, through which of the two slits the electron is transmitted? Here we assume that the detector is ideal in the sense that the trajectory of the electron is not affected by the measurement.
Solution
As soon as one can measure in principle through which of the two slits the electron passes, the diffraction pattern at the screen disappears. This is because every measurement of the position, even if one does not evaluate the measurement at all, leads to a collapse of the wave function. It is absolutely irrelevant whether the measurement method influences the path of the electron or not. In concrete terms, this means that if the electron is measured as it goes through slit 1 or slit 2, the wave function collapses:
\[ \Psi = \Psi_1 + \Psi_2 \xrightarrow{\text{collapse}} \Psi = \Psi_{1, \text{resp. } 2}. \tag{11.18} \]
The magnitude square of the wave function on the screen is then
\[ |\Psi(x)|^2 = |\Psi_{1, \text{resp. } 2}(x)|^2. \tag{11.19} \]
Assuming that half of the electrons have gone through slit 1 and the other half through slit 2, the time average of electrons at the screen will be given by the distribution function
\[ W(x) = \frac{1}{2} \left( |\Psi_1(x)|^2 + |\Psi_2(x)|^2 \right). \tag{11.20} \]
The crucial point here is that, compared to (11.17), there is no interference term included. Therefore there are no diffraction fringes.
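The difference between (11.17) and (11.20) can be made concrete with a small Python sketch. It rests on a simplifying assumption that is not part of the text: on the screen the two partial waves are modelled as pure phase factors with opposite phase gradients, Ψ1 = exp(+iκx/2) and Ψ2 = exp(−iκx/2), where κ is a hypothetical parameter that sets the fringe spacing.

```python
import cmath
import math


def slit_waves(x, kappa=1.0):
    """Model partial waves from slit 1 and slit 2 at screen position x."""
    return cmath.exp(0.5j * kappa * x), cmath.exp(-0.5j * kappa * x)


def coherent_pattern(x, kappa=1.0):
    """|Psi1 + Psi2|^2 as in (11.16): no which-path measurement."""
    psi1, psi2 = slit_waves(x, kappa)
    return abs(psi1 + psi2) ** 2


def interference_term(x, kappa=1.0):
    """The last two terms of (11.17)."""
    psi1, psi2 = slit_waves(x, kappa)
    return (psi1 * psi2.conjugate() + psi1.conjugate() * psi2).real


def which_path_pattern(x, kappa=1.0):
    """Distribution (11.20) after a which-path measurement."""
    psi1, psi2 = slit_waves(x, kappa)
    return 0.5 * (abs(psi1) ** 2 + abs(psi2) ** 2)


for x in (0.0, math.pi / 2, math.pi):
    print(x, coherent_pattern(x), which_path_pattern(x))
```

In this model the coherent pattern oscillates between a maximum and zero along the screen, while the which-path distribution is flat: removing the interference term removes the fringes.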
11.1.2.2 Outlook: Alternative Interpretations of Quantum Mechanics
First of all, it is important to emphasize that so far all experiments performed, and there have been quite a few since the development of quantum mechanics, have been in precise agreement with the calculations of quantum mechanics. So there is no doubt that quantum mechanics as a method works. There is more debate about the meaning of the concepts: The interpretation that before a measurement the wave function is a spatially extended object, and only the measurement assigns a location to the particle, is still difficult for many people to accept today. Even famous physicists like Albert Einstein could not come to terms with this. In fact, the idea that a particle, before its location has been measured, is everywhere at once is incompatible with our everyday conception. Moreover, it is also impossible to explain the principles by which the location is ultimately chosen in a measurement. It seems to be completely random. Therefore, several alternative interpretations have been developed during the twentieth century. One of these is known as local realism, with Albert Einstein as its most prominent proponent. The key message here is that particles can be assigned a fixed location even before they are measured, so the location is a reality at any point in time. Since quantum mechanics does not describe this reality, it must necessarily be an incomplete theory. Thus, it lacks certain information about nature, which in local realism is described by so-called hidden parameters. These parameters determine the location of the particle even before it is measured, but are not captured within the framework of quantum mechanics. The quantum mechanical probability $W(x) = |\Psi(x)|^2$ of finding the particle at a location is a classical probability in local realism: this means that the location of a particle is already determined at the beginning of each repetition of the experiment, randomly, according to the probability W(x). 
Thus, the same statistical measurement result should be obtained as with quantum theory. In fact, however, one obtains different results from local realistic theories and quantum mechanics if one correlates the measurement results of entangled particles (Sect. 11.1.5) in a certain way. The physicist John Stewart Bell was able to show in 1964 that the correlations of the measurement results in every locally realistic theory must satisfy a certain inequality, which for this reason is known as Bell’s inequality. If one follows the rules of quantum mechanics, the inequality is violated. Since then, these correlations have been studied in numerous experiments, and indeed it has been found that Bell’s inequality is violated in a way that is consistent with the results of quantum mechanics. Local realist theories are therefore nowadays considered to be refuted. Another alternative interpretation is due to the physicists Hugh Everett (1957) and Bryce DeWitt (~1970). Their approach is also known as the many-worlds theory. Here it is assumed that the whole universe can be described by a single wave function. This wave function initially contains the coherent superposition of all possible measurement results. If a measurement now takes place, the wave function does not collapse. Instead, decoherence occurs between the part of the wave function that contains the corresponding measurement result and the remaining wave function. The measured reality splits off, so to speak, from all other possibilities. Thus, many worlds emerge that exist in parallel. The idea that we possibly exist as copies of ourselves in different worlds, and have different experiences in each world, is of course even more difficult to accept than the Copenhagen interpretation. So far there is no idea whether and how to prove or disprove the many-worlds theory. So, for the time being, it remains a grandiose idea for science fiction.
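The statement that every locally realistic theory obeys an inequality which quantum mechanics violates can be illustrated numerically. The text does not specify which form of the inequality is meant; the sketch below assumes the widely used CHSH form, in which a combination S of four correlations satisfies |S| ≤ 2 in any local realistic theory. It further assumes polarization-entangled photons analysed with linear polarizers at angles α and β, for which quantum mechanics predicts the correlation cos(2(α − β)). All function names and the chosen analyser angles are illustrative.

```python
import itertools
import math


def max_local_chsh():
    """Largest |S| when every outcome (+1 or -1) is fixed in advance.

    Probabilistic hidden-parameter models are averages over such fixed
    assignments and therefore cannot exceed this value.
    """
    best = 0
    for a1, a2, b1, b2 in itertools.product((-1, 1), repeat=4):
        s = a1 * b1 - a1 * b2 + a2 * b1 + a2 * b2
        best = max(best, abs(s))
    return best


def quantum_correlation(alpha, beta):
    """Assumed quantum prediction for polarizer angles alpha and beta (radians)."""
    return math.cos(2.0 * (alpha - beta))


def quantum_chsh(a1, a2, b1, b2):
    """CHSH combination S built from the quantum correlations."""
    return (quantum_correlation(a1, b1) - quantum_correlation(a1, b2)
            + quantum_correlation(a2, b1) + quantum_correlation(a2, b2))


# Analyser angles 0 and 45 degrees on one side, 22.5 and 67.5 degrees on the other.
s_quantum = quantum_chsh(0.0, math.pi / 4, math.pi / 8, 3 * math.pi / 8)
print(max_local_chsh(), s_quantum)
```

Under these assumptions the quantum value is 2·√2, which exceeds the local realistic bound of 2.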
11.1.3 Heisenberg’s Uncertainty Principle
Another peculiarity of quantum objects is connected with the question of how precisely the position and the momentum of a particle can be determined simultaneously. According to the classical notion, one can simultaneously assign a fixed location and a fixed momentum to any particle. Many problems in classical mechanics deal with an object that is at a fixed location x in space at a time t and has a fixed momentum p at that location. Such a statement cannot be made for a quantum object, in principle. For a quantum object, the specification of its position x always includes an uncertainty Δx, the so-called position uncertainty of the object. It tells us how precisely the position of a quantum object is defined. As an example we consider here a plane matter wave,
\[ \Psi(x, t) = \Psi_0 \cdot e^{i(kx - \omega t)}, \tag{11.21} \]
with fixed wavenumber k and fixed frequency ω. Since plane waves are infinitely extended objects, the spatial uncertainty of a plane wave is Δx = ∞. Thus, one has no information about the position of a quantum object in a plane wave. Corresponding to the position uncertainty, the specification of the momentum p always includes the momentum uncertainty Δp. The momentum uncertainty states how precisely the momentum of a quantum object is defined. In the example of the plane matter wave, the momentum is given by the precisely defined quantity p = ħk. The momentum uncertainty is therefore Δp = 0. The Heisenberg uncertainty principle relates the uncertainty of position and the uncertainty of momentum. It says that the product of the two uncertainties fulfils the inequality
\[ \Delta x \cdot \Delta p \gtrsim h, \tag{11.22} \]
where the exact value of the right side of the inequality depends on the system and on how the uncertainties are defined (values such as ħ/2 are also found in the literature). The more precisely the position of a quantum object is defined, the less precisely the momentum is known, and vice versa. The plane wave is an extreme example, where the momentum has no uncertainty at all, at the expense of a position with infinitely large uncertainty. The other extreme example is the point particle, whose position uncertainty is zero, but which has an infinitely large momentum uncertainty. In the intermediate region, there are all possible combinations of momentum and position uncertainty that satisfy the inequality (11.22). For example, wave packets of length Δx can be generated by superimposing plane waves of different wavenumbers (Fig. 11.2). The minimum width Δk = Δp/ħ of different wavenumbers required for this is given here by
\[ \Delta k = \frac{2\pi}{\Delta x}. \tag{11.23} \]
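The construction of a wave packet from plane waves can be tried out numerically. The following Python sketch superimposes plane waves at t = 0 with Gaussian amplitudes A(k) and measures the width of the resulting packet. Several choices are assumptions made for this illustration: the Gaussian form of A(k), the use of standard deviations σ_x and σ_k as the measure of width, and all names and grid sizes.

```python
import cmath
import math


def gaussian_packet_widths(sigma_k=1.0, k0=5.0, n_k=241, n_x=401):
    """Superimpose plane waves exp(i*k*x) and return (sigma_x, sigma_k).

    The amplitudes A(k) are chosen so that |A(k)|^2 is a Gaussian with
    standard deviation sigma_k around the central wavenumber k0.
    """
    ks = [k0 + sigma_k * (-6.0 + 12.0 * j / (n_k - 1)) for j in range(n_k)]
    amps = [math.exp(-((k - k0) ** 2) / (4.0 * sigma_k ** 2)) for k in ks]

    x_max = 5.0 / sigma_k
    xs = [x_max * (-1.0 + 2.0 * i / (n_x - 1)) for i in range(n_x)]

    # |Psi(x)|^2 of the superposition at time t = 0
    density = []
    for x in xs:
        psi = sum(a * cmath.exp(1j * k * x) for a, k in zip(amps, ks))
        density.append(abs(psi) ** 2)

    norm = sum(density)
    mean_x = sum(x * p for x, p in zip(xs, density)) / norm
    var_x = sum((x - mean_x) ** 2 * p for x, p in zip(xs, density)) / norm
    return math.sqrt(var_x), sigma_k


sigma_x, sigma_k = gaussian_packet_widths()
print(sigma_x * sigma_k)
```

With this definition of the widths the product σ_x·σ_k comes out close to 1/2, that is, σ_x·σ_p ≈ ħ/2 with p = ħk. Other definitions of the width lead to other numerical factors, such as the 2π in (11.23). A broader spread of wavenumbers gives a shorter packet and vice versa.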
The minimum product of position and momentum uncertainty is only achieved for wave packets of a certain shape (a Gaussian shape). The Heisenberg uncertainty principle is often found in the context that in the case of a simultaneous measurement of position and momentum, the measurement uncertainties Δx and Δp must satisfy the inequality (11.22). In this case, the individual uncertainties Δx and Δp of a quantum object may well change during a measurement. For example, if a plane matter wave with Δx = ∞ and Δp = 0 impinges on a wall with a small hole of diameter a, the position of the quantum object while it is transmitted through the hole is measured with an uncertainty Δx = a. As a result, the momentum uncertainty must increase according to the Heisenberg uncertainty principle and is thus given by the inequality
\[ \Delta p \gtrsim \frac{h}{a}. \tag{11.24} \]
Fig. 11.2 By superimposing plane waves with different wavenumbers k and respective amplitudes A(k), one can generate a wave packet with an extension Δx which moves through space. This packet can be interpreted as a particle with spatial uncertainty Δx and momentum uncertainty Δp = ħ · Δk. Within the dashed envelope the real part of the wave function Re(Ψ(x, t)) oscillates back and forth
This momentum uncertainty means that the wave, after passing the hole, no longer propagates only parallel to the incoming wave, but also in directions perpendicular to it, i.e., it diverges. For very small hole diameters, this results in a spherical wave that propagates in all directions, analogous to Huygens’ principle.
Examination Task: Atom in the Box
If one traps an atom in a box, the position of the atom is thereby fixed with an uncertainty given by the extension of the box. Because of the Heisenberg uncertainty principle, the speed of the atom therefore cannot be arbitrarily small. What is the minimum velocity of a rubidium atom with a mass of $m = 1 \cdot 10^{-25}$ kg, which is placed into a box with a length of a = 100 nm?
Solution
The length of the box is the spatial uncertainty of the atom: Δx = a. The momentum uncertainty Δp = m · Δv is given by the Heisenberg uncertainty principle (11.22). We solve for the velocity and we get
\[ \Delta v \gtrsim \frac{h}{m \cdot a} = \frac{2\pi \cdot 1.05 \cdot 10^{-34}\ \mathrm{Js}}{1 \cdot 10^{-25}\ \mathrm{kg} \cdot 100\ \mathrm{nm}} = 6.6\ \frac{\mathrm{cm}}{\mathrm{s}}. \tag{11.25} \]
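The estimate of this task is easy to reproduce and to vary. The following Python sketch uses the form Δx · Δp ≳ h of (11.22); the function name is hypothetical, and the numbers are those of the task.

```python
import math

H = 2.0 * math.pi * 1.05e-34  # Planck's quantum of action in J*s, as in the text


def minimum_velocity_uncertainty(mass_kg: float, box_length_m: float) -> float:
    """Velocity uncertainty from Delta_x * Delta_p >~ h with Delta_x = box length."""
    return H / (mass_kg * box_length_m)


dv = minimum_velocity_uncertainty(1e-25, 100e-9)
print(f"{dv * 100:.1f} cm/s")
```

Halving the box length doubles the velocity uncertainty, which shows directly how tighter confinement increases the minimum motion of the atom.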
11.1.4 Superposition
One of the essential concepts in quantum physics is the superposition of wave functions. Implicitly, we have already learned about this concept in electron diffraction at the double slit. There, the wave function behind the double slit consisted of the superposition of the two spherical waves emanating from the slits. However, superpositions do not only exist for matter waves moving through space, but also for wave functions anchored at a fixed location in space. An example in classical physics is standing waves on a rope that is fixed at both ends (Fig. 4.7b). The fundamental vibration of the rope has an antinode in the middle and nodes at the ends of the rope. We will discuss here a corresponding example of quantum physics: Fig. 11.3 sketches the wave function of a particle that moves in a potential U(x) and has energy $E_0$. This is a so-called double well potential; however, for the moment we consider only the left well with the corresponding wave function $\Psi_l$. Like the fundamental mode of the standing wave of the rope, this wave function has its maximum in the middle. There, the probability of finding the particle is largest, while it falls off towards the edges. At the edges, where the height of the potential is as large as the energy of the particle, one would expect the node of the standing wave according to classical ideas. However, because of the uncertainty principle, in quantum mechanics the wave function is not quite zero at this point. Even in the regions where the potential height exceeds the energy of the particle, one can still find the particle with some, albeit small, probability. This region is forbidden in classical physics because of energy conservation. Conservation of energy also exists in quantum mechanics, of course. However, for short periods of time, energy can be “borrowed”, so to speak, and thus overcome the conservation of energy for a short time. This is the only reason why the quantum particle can run up higher in potential than its energy classically allows. The particle can thus even pass from the left well through the barrier in the middle to the right well. The particle is said to tunnel through the barrier. We now assume that the particle is somewhere in the double well, that is, either in the left well with wave function $\Psi_l$ or in the right well with wave function $\Psi_r$. In fact, when we make a measurement, we find the particle in exactly one of the two wells. However, as long as one does not make a measurement on the system, the total wave function is given by the superposition of the two possibilities. One can superimpose the oscillations of the two single wave functions with arbitrary phase φ:
\[ \Psi = \Psi_l + e^{i\varphi} \cdot \Psi_r. \tag{11.26} \]
Fig. 11.3 Double well potential. A particle in the potential is simultaneously in the left and in the right well. The wave function is given by superposition of the two single wave functions $\Psi_l$ and $\Psi_r$ with energy $E_0$. In the case of symmetric superposition $\Psi_+$, the particle is less localized than in the case of a single well, thus the energy decreases. With the antisymmetric superposition $\Psi_-$ the particle is more strongly localized, and the energy belonging to it is correspondingly larger
You can imagine this as two pendulums swinging with the same frequency, but with different phase φ. Of particular interest are the so-called eigenmodes or eigenstates of the system. These are those superpositions where the energy of the system is constant in time. One possibility is the wave function $\Psi_+$ with phase φ = 0,
\[ \Psi_+ = \Psi_l + \Psi_r. \tag{11.27} \]
That is, the wave function of the right well oscillates with the same phase as the wave function of the left well (for those experienced in quantum physics: for the sake of simplicity, we omit the correct normalization of the wave function here). This is also called the symmetric superposition state. As a consequence, the two wave functions interfere constructively in the middle between the wells (where the barrier is). As a result, the particle is less localized than in the single well. The spatial uncertainty of the particle is thus larger, and the momentum uncertainty according to Heisenberg’s uncertainty principle correspondingly smaller. As a result, the kinetic energy of the particle decreases, lowering the total energy of the symmetric superposition state:
\[ E_+ < E_0. \tag{11.28} \]
The second eigenmode is the superposition $\Psi_-$ with phase φ = 180°:
\[ \Psi_- = \Psi_l - \Psi_r. \tag{11.29} \]
The two single wave functions therefore oscillate in antiphase; this is referred to as an antisymmetric superposition state. As a result, the two waves interfere destructively in the region of the barrier. Particles in the antisymmetric superposition state have a smaller spatial uncertainty, which increases the energy of the state:
\[ E_- > E_0. \tag{11.30} \]
In both superposition states, the particle is simultaneously located in the left as well as in the right well. When measuring the particle location with result left or right, the wave function collapses accordingly to $\Psi_l$ or $\Psi_r$.
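The constructive and destructive interference in the barrier region can be made visible with a simple model. The Python sketch below assumes, purely for illustration, that the two single-well wave functions are real Gaussians centred at the two well positions; the widths, the well separation and the function names are hypothetical and the states are not normalized, as in the text.

```python
import cmath
import math


def psi_single(x, center, width=1.0):
    """Hypothetical single-well wave function: a Gaussian around the well centre."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def superposition_density(x, phase, separation=3.0):
    """|Psi_l + exp(i*phase) * Psi_r|^2 following (11.26)."""
    left = psi_single(x, -separation / 2.0)
    right = psi_single(x, +separation / 2.0)
    return abs(left + cmath.exp(1j * phase) * right) ** 2


single_at_barrier = psi_single(0.0, -1.5) ** 2
symmetric_at_barrier = superposition_density(0.0, 0.0)          # Psi_+, phase 0
antisymmetric_at_barrier = superposition_density(0.0, math.pi)  # Psi_-, phase 180 degrees

print(single_at_barrier, symmetric_at_barrier, antisymmetric_at_barrier)
```

In the middle of the barrier the symmetric state has four times the density of a single well function, whereas the density of the antisymmetric state vanishes there. This is the difference in localization that the text links to the energies of the two states.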
11.1.5 Entanglement
Another special feature of quantum physics is entanglement. In contrast to superposition, this requires not just one quantum object, but several. For the sake of simplicity, we restrict ourselves here to the case of only two quantum objects and again dispense with the normalization. Each of the two objects is in a superposition state. Object 1 is in the state
\[ \Psi^1 = \Psi^1_1 + \Psi^1_2, \tag{11.31} \]
where we denote the object by the superscript number and $\Psi^1_1$ and $\Psi^1_2$ may be any wave functions. Object 2 is also in a superposition state:
\[ \Psi^2 = \Psi^2_1 + \Psi^2_2. \tag{11.32} \]
Again, $\Psi^2_1$ and $\Psi^2_2$ are arbitrary wave functions. We now speak of entanglement if the state of the first object is related to the state of the second object, in the sense of: If object 1 is in state $\Psi^1_1$, then object 2 must be in state $\Psi^2_1$, and if object 1 is in state $\Psi^1_2$, then object 2 must be in state $\Psi^2_2$. One writes this entangled state as
\[ \Psi = \Psi^1_1 \otimes \Psi^2_1 + \Psi^1_2 \otimes \Psi^2_2, \tag{11.33} \]
where the multiplication sign in the circle is the so-called tensor product of the states, which we cannot go into further here. The interesting thing happens now with the collapse of the wave function. If one measures in which state object 1 is, the wave function of the object collapses either to $\Psi^1_1$ or to $\Psi^1_2$, depending on the outcome of the measurement. However, because of entanglement, the wave function of object 2 also collapses at the same time, regardless of how far object 2 is from object 1. The same is true in reverse when measuring the state of object 2. Entangled states, like all waves, are extended objects; any change at any location will result in a change at all other locations in the wave. This property is also called non-locality.
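The difference between the entangled state (11.33) and two independent superpositions can be expressed with a table of coefficients. The Python sketch below is a hedged illustration: it writes a general two-object state as a sum of terms c[i][j] · Ψ¹ᵢ ⊗ Ψ²ⱼ and uses the standard criterion that such a state factorizes into a product of single-object states exactly when c₁₁c₂₂ − c₁₂c₂₁ = 0. This criterion assumes that the two states of each object are distinguishable (linearly independent); the function names are hypothetical, and the list indices 0 and 1 stand for the subscripts 1 and 2.

```python
def is_product_state(c, tol=1e-12):
    """True if the coefficient table c[i][j] describes a non-entangled state."""
    determinant = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(determinant) < tol


def allowed_partner_states(c, i, tol=1e-12):
    """States of object 2 that can still occur once object 1 is found in state i."""
    return [j for j in range(2) if abs(c[i][j]) > tol]


# Entangled state (11.33): only the combinations (1,1) and (2,2) occur.
entangled = [[1.0, 0.0],
             [0.0, 1.0]]

# Independent superpositions (11.31) and (11.32) multiplied out: all four combinations occur.
independent = [[1.0, 1.0],
               [1.0, 1.0]]

print(is_product_state(entangled), allowed_partner_states(entangled, 0))
print(is_product_state(independent), allowed_partner_states(independent, 0))
```

For the entangled table, finding object 1 in its first state leaves only one possibility for object 2. For the independent table, both possibilities for object 2 remain open.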
11.1.5.1 EPR Experiment
The consequences of entanglement in quantum mechanics have long been a source of debate. A famous example is the so-called EPR paradox, a thought experiment devised by the physicists Einstein, Podolsky, and Rosen to disprove non-locality. The idea is basically this: You create an entangled pair of photons on Earth. Each of the two photons is in a superposition state consisting of a right circular polarization (RZ) and a left circular polarization (LZ), that is, photon $i$ with $i = 1, 2$ is represented by the wave function
$$\Psi^{i} = \Psi^{i}_{RZ} + \Psi^{i}_{LZ}. \tag{11.34}$$
The two photons are supposed to be entangled with each other such that they always have opposite polarizations; thus the total wave function is
$$\Psi = \Psi^{1}_{RZ} \otimes \Psi^{2}_{LZ} + \Psi^{1}_{LZ} \otimes \Psi^{2}_{RZ}. \tag{11.35}$$
One of the two photons (Ph 1) remains on Earth, for example, by running it in a long glass fibre, while the second photon (Ph 2) is sent into space, for example, to Mars. There a photon detector measures the polarization of Ph 2. If one now, shortly before Ph 2 is measured, measures the polarization of Ph 1 on Earth, the total wave function Ψ collapses and with it also the state of Ph 2. This means that the physicists on Earth know, at the very moment they measure the polarization on Earth, what the result of the measurement on Mars will be. Since the wave function collapses without time delay, this would seem to allow information to be transmitted faster than the speed of light. This would contradict Einstein’s theory of relativity, according to which information can be transmitted at most at the speed of light. However, the paradox is resolved once one realizes that no information can be transmitted with this method. The smallest unit of information is a bit, that is, a yes/no piece of information, where we encode, for instance, “yes” with right circular polarization and “no” with left circular polarization. So if you want to transmit a certain piece of information to Mars, you have to determine beforehand which polarization the detector should measure there. But this is just not possible. The outcome is only determined when the measurement is made on Earth at Ph 1. But
since Ph 1 is in a superposition state, the result of the measurement is random and thus in principle not predictable. Indeed, something is transmitted with superluminal velocity, but these are only random results, that is, data garbage. In the meantime, experiments have also been carried out on the EPR paradox. Here, an entangled photon pair was created with the help of parametric fluorescence (see Sect. 10.5.2.4). The polarization entanglement is due to the conservation of angular momentum. The angular momentum L = 0 of the incoming photon with linear polarization must equal the sum of the angular momenta of the two outgoing photons. Therefore, the two outgoing photons can only have different circular polarizations. The two photons were then sent through optical fibers to different laboratories on Earth, where their relative polarization was measured. It was found that as soon as the polarization of one photon was measured, the polarization of the other photon was fixed and different from that of the first. This is also true for measurement time differences smaller than the light transit time between the two laboratories. In further experiments, it was shown that when the polarization of the two photons was measured, the so-called Bell’s inequality was violated. This is regarded as proof that the non-locality of quantum mechanics actually corresponds to reality (Sect. 11.1.2.2).
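The argument that only "data garbage" arrives on Mars can be illustrated with a toy simulation. This is a deliberately classical stand-in and an assumption of this sketch: it reproduces only the perfect anticorrelation of (11.35) in one fixed measurement basis and cannot reproduce the violation of Bell's inequality. The function names are hypothetical.

```python
import random

def measure_entangled_pair(rng):
    """Random outcome on Earth; the Mars outcome is always the opposite polarization."""
    earth = rng.choice(["RZ", "LZ"])
    mars = "LZ" if earth == "RZ" else "RZ"
    return earth, mars

def simulate(num_pairs, seed=0):
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    return [measure_entangled_pair(rng) for _ in range(num_pairs)]

results = simulate(10_000)

# The two results always differ ...
anticorrelated = all(earth != mars for earth, mars in results)

# ... but the sequence seen on Mars alone is a fair coin toss that nobody on Earth can steer.
mars_rz_fraction = sum(mars == "RZ" for _, mars in results) / len(results)

print(anticorrelated, round(mars_rz_fraction, 2))
```

Because the experimenter on Earth cannot choose which outcome occurs, the Mars record carries no message; the correlation only becomes visible once both lists are compared over an ordinary communication channel.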
11.1.5.2 Schrödinger’s Cat
A second, very well-known example of entanglement is Schrödinger’s cat. This example was also meant to expose the supposed absurdity of quantum physics. Here, a quantum object (a radioactive nucleus) is entangled with a classical, macroscopic object (the cat). In the thought experiment, the cat and the radioactive nucleus are locked in an absolutely opaque and soundproof box, which makes any measurement on the objects in the box from the outside impossible (Fig. 11.4a). This is important so that the wave functions of the objects in the box do not collapse prematurely. The radioactive decay of the atomic nucleus is, according to quantum mechanical laws, a tunneling effect, that is, the wave function of the object is a superposition of a
Fig. 11.4 (a) In the Schrödinger cat thought experiment, the radioactive decay of an atomic nucleus is entangled with the state of a cat. The radioactive decay releases a poison which kills the cat. Since the decay is a quantum process, the wave function of the atomic nucleus is entangled with the state “cat is alive” or “cat is dead”. So the cat is dead and alive at the same time until the measurement. (b) The wave function of the atomic nucleus is composed of a decayed part and a not yet decayed part. The longer one waits, the greater the probability that the atomic nucleus has already decayed
decayed part $\Psi_{\text{coll}}$ and a not yet decayed part $\Psi_{\text{non-coll}}$. As a function of time, the not yet decayed fraction of the wave function decreases, while the decayed fraction increases (Fig. 11.4b). However, as long as one does not make a measurement on the nucleus, it is simultaneously decayed and not yet decayed. The box also contains a mechanism that responds to the decayed portion of the nuclear wave function and releases a poison that kills the cat. So, according to the superposition of decayed and not yet decayed atomic nucleus, the poison is released and not released, and the cat is dead and alive at the same time. The corresponding entangled state is
$$\Psi = \left(\Psi_{\text{non-coll}} \otimes \Psi_{\text{cat alive}}\right) + \left(\Psi_{\text{coll}} \otimes \Psi_{\text{cat dead}}\right). \tag{11.36}$$
A dead and simultaneously living cat naturally seems absurd from our everyday experience. And indeed, it is very unlikely that one will ever create a superposition of a dead and a living cat. But this is not because the entanglement of a quantum object with a macroscopic object is inherently impossible. Rather, it is because entangled objects are very sensitive to perturbations. The sensitivity increases exponentially with the number of entangled objects. In the case of the cat, one must take into account all the atoms that make up the cat; it is therefore an entangled state of a huge number of atoms, on the order of $10^{25}$. Even the thermal motion of the cat’s atoms destroys this entanglement within a very short time, collapsing the wave function to either dead or alive. This is the reason why you cannot entangle a real cat. However, researchers are working on entangling smaller objects that can be very well controlled and isolated from the environment, such as collections of a few atoms prepared free-floating at temperatures of T < 1 μK in an extremely good vacuum in a laboratory. In his 2012 Nobel Prize speech, physicist and Nobel laureate Dave Wineland reported how the motion of an ion in a so-called ion trap is entangled with an internal degree of freedom of the ion called the spin (see Sect. 11.2.3.1), thus forming a Schrödinger kitten. At the moment, physicists in quantum optics are working on making these kittens larger by extending the entanglement to as many particles as possible. This is important, among other things, for the construction of quantum computers.
11.2 Atomic Physics
Atomic physics is about the structure of the electron shell of atoms. This consists of the negatively charged electrons moving around the positively charged atomic nucleus (Fig. 11.5a). For the sake of simplicity, we consider here only a single electron that is moving in the Coulomb potential of the atomic nucleus. This is not a bad approximation at all, since it is mainly the valence electrons (that is, the electrons of the outermost shell) that determine the chemical and optical properties of an atom. In experiments, we see that atoms emit light when they are heated, for example, with a flame. However, the spectrum emitted is completely different from the spectrum of a blackbody (Sect. 10.5.2.1). While the blackbody spectrum is continuous and the
Fig. 11.5 (a) Bohr atomic model: The Coulomb force $F_C$, with which the positively charged atomic nucleus attracts the electron, forces the electron to follow a circular path with radius $r$ around the nucleus. The coherence condition requires that the electron’s matter wave constructively interferes with itself after one round-trip in order for the electron’s orbit to be stable. Discrete orbital radii (shells) with corresponding energy levels, numbered by the principal quantum number $n$, follow from this condition. In (b) these energy levels are sketched for the hydrogen atom. The lowest, most strongly bound state corresponds to $n = 1$; at the so-called ionization edge for $n \to \infty$ the levels lie closely together. The electron can jump between the levels by absorbing or emitting a photon of the corresponding energy $E = h \cdot f$. This gives rise to the characteristic hydrogen spectrum with its series
colour of the spectrum depends on the temperature, atoms have a very characteristic spectrum, that is, they emit light of very specific wavelengths, which depend on the element. If you increase the temperature, more light is emitted, but still at the same wavelength. For example, sodium emits orange light at a wavelength of λ = 590 nm. This is why sodium vapor lamps, many of which are used as streetlights, emit orange light. To understand why atoms can only emit certain wavelengths, it is necessary to describe the electron as a quantum object. In the following we will get to know two atomic models: the semiclassical atomic model according to Bohr and the quantum mechanical atomic model according to Schrödinger.
11.2.1 Atomic Model According to Bohr
In Bohr’s atomic model, the electron is assumed to travel around the atomic nucleus with velocity $v$ on a circular path with radius $r$ (Fig. 11.5a). Depending on the element, the atomic nucleus contains $Z$ protons; the positive charge of the atomic nucleus is therefore $Q = Z \cdot e$. The centripetal force $F_Z$ (Sect. 2.2.4.2) necessary for the orbit of the electron is provided by the Coulomb force $F_C$ (Sect. 6.1.1) between the nucleus and the electron. By equating the two forces one obtains
$$\frac{m_e \cdot v^2}{r} = \frac{1}{4\pi\varepsilon_0} \cdot \frac{Z \cdot e^2}{r^2}. \tag{11.37}$$
This condition establishes a connection between the velocity of the electron and the radius of the circular path and is based on a purely classical consideration. Quantum physics now enters through the so-called coherence condition. This means that the electron is regarded as a matter wave that travels in a circular path around the atomic nucleus. In order to prevent the matter wave from annihilating itself after one orbit, it is required that the circumference of the orbit be an integer multiple of the de Broglie wavelength $\lambda_{dB}$, which according to (11.5) depends on the mass $m_e$ and the velocity $v$ of the electron. The coherence condition is thus
$$2\pi r = n \cdot \frac{h}{m_e \cdot v}, \tag{11.38}$$
with an integer $n = 1, 2, 3, \ldots$. Substituting the velocity from (11.38) into (11.37) yields discrete orbit radii that depend on the number $n$:
$$r_n = a_{\text{Bohr}} \cdot \frac{n^2}{Z}, \tag{11.39}$$
with the Bohr radius
$$a_{\text{Bohr}} = \frac{\varepsilon_0 h^2}{\pi m_e e^2} \approx 0.5 \cdot 10^{-10}\,\text{m} = 0.5\,\text{Å}, \tag{11.40}$$
where the unit ångström, $1\,\text{Å} = 10^{-10}\,\text{m}$, was introduced. The number $n$ designates the shell on which the electron moves; in the Schrödinger model it is called the principal quantum number. The orbital radius increases quadratically with the quantum number $n$. The energy of the electron on a given orbit $n$ is composed of its kinetic energy and its potential energy in the Coulomb potential (6.23):
$$E_n = E_{\text{kin}} + E_{\text{pot}} = \frac{1}{2} m_e v_n^2 - \frac{1}{4\pi\varepsilon_0} \cdot \frac{Z e^2}{r_n} = -\frac{1}{2} \cdot \frac{1}{4\pi\varepsilon_0} \cdot \frac{Z e^2}{r_n}, \tag{11.41}$$
as can be shown by substituting $v_n$. By further substituting $r_n$ and $a_{\text{Bohr}}$ one can write the energy as
$$E_n = -E_{\text{Ry}} \cdot \frac{Z^2}{n^2}, \tag{11.42}$$
with the so-called Rydberg energy
$$E_{\text{Ry}} = \frac{m_e \cdot e^4}{8 \varepsilon_0^2 h^2} \approx 13.6\,\text{eV}. \tag{11.43}$$
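The substitution steps behind (11.39) to (11.43) are only mentioned in the text. One way to carry them out, using nothing but (11.37) and (11.38) and writing the electron mass as $m_e$ throughout, is sketched here:

```latex
\begin{align*}
% Solve the coherence condition (11.38) for the velocity:
v_n &= \frac{n h}{2\pi m_e r_n} \\
% Insert v_n into the force balance (11.37) and solve for r_n:
\frac{m_e}{r_n}\left(\frac{n h}{2\pi m_e r_n}\right)^{2}
  &= \frac{1}{4\pi\varepsilon_0}\cdot\frac{Z e^2}{r_n^2}
  \quad\Longrightarrow\quad
  r_n = \frac{\varepsilon_0 h^2}{\pi m_e e^2}\cdot\frac{n^2}{Z}
      = a_{\text{Bohr}}\cdot\frac{n^2}{Z} \\
% Multiply (11.37) by r_n/2 to express the kinetic energy:
\frac{1}{2} m_e v_n^2 &= \frac{1}{2}\cdot\frac{1}{4\pi\varepsilon_0}\cdot\frac{Z e^2}{r_n}
  \quad\Longrightarrow\quad
  E_n = -\frac{1}{2}\cdot\frac{1}{4\pi\varepsilon_0}\cdot\frac{Z e^2}{r_n} \\
% Insert r_n to obtain the Rydberg form:
E_n &= -\frac{Z e^2}{8\pi\varepsilon_0}\cdot\frac{\pi m_e e^2 Z}{\varepsilon_0 h^2 n^2}
     = -\frac{m_e e^4}{8\varepsilon_0^2 h^2}\cdot\frac{Z^2}{n^2}
     = -E_{\text{Ry}}\cdot\frac{Z^2}{n^2}
\end{align*}
```

The kinetic energy is exactly half the magnitude of the potential energy, which is why the total energy in (11.41) equals half of the (negative) potential energy.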
The energy levels for hydrogen whose nucleus consists of a single proton, that is, Z = 1, are sketched in Fig. 11.5b. The smallest energy is in the so-called ground state
with $n = 1$. In this state, the electron is most strongly bound to the nucleus. The radius of the ground state orbit is $r_1 = a_{\text{Bohr}}$. The Bohr radius $a_{\text{Bohr}}$ thus indicates the radius of a hydrogen atom. The larger the principal quantum number $n$, the larger the energy. In the limit $n \to \infty$, the energy levels lie closer and closer together, and the energy tends towards zero. Electrons that have an energy $E \geq 0$ are no longer bound to the atomic nucleus. The energy $E = 0$ is therefore also called the ionization edge. The Rydberg energy $E_{\text{Ry}}$ is exactly the amount of energy needed to ionize a hydrogen atom, that is, to lift the electron from its ground state with $n = 1$ to the ionization edge, from where it can leave the range of influence of the atomic nucleus.
Written Test: Rydberg Atoms
Atoms in which the valence electron is excited into a shell with a very large principal quantum number are called Rydberg atoms. The radius of the electron orbit can become very large in this case. Calculate the orbital radius of the electron in the hydrogen atom with principal quantum number n = 100. What is the energy of this state measured from the ionization edge in electron volts?
Solution
The orbital radius is given by (11.39):
$$r_{100} = a_{\text{Bohr}} \cdot \frac{n^2}{Z} = 0.5 \cdot 10^{-10}\,\text{m} \cdot \frac{100^2}{1} = 0.5\,\mu\text{m}. \tag{11.44}$$
The Rydberg atom is thus half a micrometer in size, huge compared to normal ground state atoms, which are more on the order of 1 Å. According to (11.42), the energy of the Rydberg atom is
$$E_{100} = -E_{\text{Ry}} \cdot \frac{Z^2}{n^2} = -13.6\,\text{eV} \cdot \frac{1}{100^2} = -1.36\,\text{meV}. \tag{11.45}$$
The electron in the Rydberg atom is therefore rather weakly bound, with an energy of $E_{100} = -1.36$ meV.
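The Bohr-model formulas (11.39), (11.40), (11.42) and (11.43) are easy to evaluate numerically. The following sketch uses CODATA values for the constants instead of the rounded numbers of the worked example; the function names are hypothetical helpers introduced only for this illustration.

```python
import math

# Physical constants in SI units (CODATA 2018 values)
EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
H = 6.62607015e-34           # Planck constant in J s
M_E = 9.1093837015e-31       # electron mass in kg
E_CHARGE = 1.602176634e-19   # elementary charge in C

def bohr_radius():
    """Bohr radius according to (11.40), in metres."""
    return EPS0 * H**2 / (math.pi * M_E * E_CHARGE**2)

def rydberg_energy_ev():
    """Rydberg energy according to (11.43), converted from joule to eV."""
    return M_E * E_CHARGE**4 / (8 * EPS0**2 * H**2) / E_CHARGE

def orbit_radius(n, Z=1):
    """Orbit radius r_n according to (11.39), in metres."""
    return bohr_radius() * n**2 / Z

def level_energy_ev(n, Z=1):
    """Energy E_n according to (11.42), in eV (negative for bound states)."""
    return -rydberg_energy_ev() * Z**2 / n**2

print(f"a_Bohr = {bohr_radius():.3e} m")
print(f"E_Ry   = {rydberg_energy_ev():.2f} eV")
print(f"r_100  = {orbit_radius(100) * 1e6:.2f} micrometres")
print(f"E_100  = {level_energy_ev(100) * 1e3:.2f} meV")
```

With unrounded constants the Rydberg atom with n = 100 comes out at roughly 0.53 μm instead of 0.5 μm, while the binding energy stays at about 1.36 meV.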
11.2.1.1 Absorption and Emission
Atoms can absorb light and also emit it if the energy of the photons corresponds to the transition between two energy levels (Fig. 11.5b). When a photon is absorbed, the electron takes up its energy and is thus lifted from shell $m$ to a higher shell $n > m$. If, on the other hand, an electron makes the transition from a higher shell $n$ to a lower shell $m$, the corresponding energy is released and a photon is emitted. This process can occur by spontaneous emission. By equating the photon energy $E = h \cdot f$ with the energy difference $\Delta E = E_n - E_m$ of the levels, one obtains a condition for the photon frequency
$$f = \frac{\Delta E}{h}, \tag{11.46}$$
with
$$\Delta E = E_{\text{Ry}} \cdot Z^2 \cdot \left(\frac{1}{m^2} - \frac{1}{n^2}\right). \tag{11.47}$$
Here you have to convert the Rydberg energy from the unit eV to the SI unit joule:
$$E_{\text{Ry}}[\text{J}] = E_{\text{Ry}}[\text{eV}] \cdot e, \tag{11.48}$$
with the elementary charge $e$. The corresponding light wavelength $\lambda$ is given via (10.5) by
$$\lambda = \frac{h \cdot c}{\Delta E}, \tag{11.49}$$
with the speed of light $c$. For each choice $m$ of the lower level, there is an infinite number of possibilities to choose the upper level $n > m$. For fixed $m$, the corresponding transition frequencies usually lie in a certain range (Fig. 11.5b). Therefore, for hydrogen, the different transitions with the same $m$ are called series, for example, the Lyman series with m = 1 in the ultraviolet range, the Balmer series with m = 2 in the optically visible range, or the Paschen series with m = 3 in the infrared range.
Examination Task: Balmer-α-Line
The so-called Balmer-α-line designates the light wavelength that corresponds to the transition in hydrogen between the levels with m = 2 and n = 3. What is the corresponding wavelength of light λ?
Solution
The energy difference between the levels is
$$\Delta E = E_{\text{Ry}} \cdot Z^2 \cdot \left(\frac{1}{m^2} - \frac{1}{n^2}\right) \tag{11.50}$$
$$= 13.6\,\text{V} \cdot 1.6 \cdot 10^{-19}\,\text{C} \cdot 1^2 \cdot \left(\frac{1}{2^2} - \frac{1}{3^2}\right) \tag{11.51}$$
$$= 3.022 \cdot 10^{-19}\,\text{J}. \tag{11.52}$$
The corresponding wavelength is then
$$\lambda = \frac{h \cdot c}{\Delta E} = \frac{2\pi \cdot 1.05 \cdot 10^{-34}\,\text{Js} \cdot 3 \cdot 10^{8}\,\frac{\text{m}}{\text{s}}}{3.022 \cdot 10^{-19}\,\text{J}} = 655\,\text{nm}. \tag{11.53}$$
A wavelength of 655 nm is red light.
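The calculation of the Balmer-α line generalizes to any transition via (11.47) and (11.49). The sketch below assumes the rounded Rydberg energy of 13.6 eV from (11.43) together with unrounded values for h, c and e; the function name `transition_wavelength_nm` is a hypothetical helper.

```python
H = 6.62607015e-34           # Planck constant in J s
C = 2.99792458e8             # speed of light in m/s
E_CHARGE = 1.602176634e-19   # elementary charge in C
E_RY_EV = 13.6               # Rydberg energy in eV, rounded as in (11.43)

def transition_wavelength_nm(m, n, Z=1):
    """Wavelength in nm for the transition between lower shell m and upper shell n."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    delta_e_joule = E_RY_EV * E_CHARGE * Z**2 * (1 / m**2 - 1 / n**2)  # (11.47), (11.48)
    return H * C / delta_e_joule * 1e9                                 # (11.49)

# First line of each series named in the text
for name, m in [("Lyman", 1), ("Balmer", 2), ("Paschen", 3)]:
    print(f"{name}-alpha: {transition_wavelength_nm(m, m + 1):.1f} nm")
```

With these constants the Balmer-α line comes out near 656 nm; the 655 nm of the worked solution reflects the rounded values of ħ and c used there. The Lyman-α line lands in the ultraviolet and the Paschen-α line in the infrared, as stated above.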
11.2.2 Atomic Model According to Schrödinger
The atomic model according to Bohr assumes a classical circular orbit of the electrons around the nucleus. Unfortunately, there is a problem with this idea. A charge moving on a circular path can be seen as a Hertzian dipole. However, since a Hertzian dipole emits light, the electron would lose energy along its orbit. This would cause the electron to sink in the Coulomb potential of the nucleus and to move closer and closer to the nucleus. Thus, the electron would collide with the nucleus after a short time. Since this is obviously not the case in reality (the world is still stable), we conclude that Bohr’s atomic model cannot be correct. In fact, the problem lies in the classical assumption of an orbital curve along which the electron moves. In quantum mechanics, the electron is to be described as a wave with a corresponding wave function $\Psi(\vec{r})$. The electron is therefore not at a fixed location, in particular not at a fixed distance from the nucleus, but is everywhere at the same time according to the probability $|\Psi(\vec{r})|^2$. Thus, there are also no fixed paths on which
the electron moves; in particular, there is no circular path and no dipole radiation. The atom is therefore stable. If we calculate the mean distance of the electron from the nucleus from the wave function, we obtain approximately the orbital radii of Bohr’s atomic model. So Bohr’s model is a kind of classical approximation of the Schrödinger atom. This is also the reason why this (wrong) model gives at least partially correct results for the energy levels. Now the question is how to calculate the wave function of the electron in the atom. This is the great achievement of Erwin Schrödinger, who in 1926 established a general differential equation for calculating wave functions. This differential equation is called the Schrödinger equation in his honor. For the wave function $\Psi(\vec{r})$ of the electron in the hydrogen atom, the Schrödinger equation is:
$$\underbrace{-\frac{\hbar^2}{2 m_e} \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) \Psi(\vec{r})}_{\text{kinetic energy}} + \underbrace{\frac{-e^2}{4\pi\varepsilon_0 r} \cdot \Psi(\vec{r})}_{\text{Coulomb potential}} = E \cdot \Psi(\vec{r}). \tag{11.54}$$
The Schrödinger equation is the quantum mechanical version of the law of conservation of energy. In (11.54), the total energy E of the electron is composed of the kinetic energy of the electron and its potential energy in the Coulomb potential of the nucleus. The special feature of Eq. (11.54) is that its solutions are described by the three parameters n, l, and m, which can take only discrete values. The solutions
are called states or orbitals, and the corresponding wave function is called $\Psi_{n,l,m}$. The parameters are the quantum numbers of the wave function. One can give analytical formulas for the wave functions in hydrogen. However, these are too complicated for this book. Instead, the individual quantum numbers are described below.
11.2.2.1 Principal Quantum Number n
The principal quantum number $n$ corresponds to the shell in Bohr’s atomic model. It can take any positive integer value,
$$n \in \{1, 2, 3, \ldots\}. \tag{11.55}$$
Here, the radius $r_n$ of Bohr’s circular orbit as described by (11.39) gets a new meaning: It now corresponds to the mean radial distance $\langle r \rangle$ of the electron from the atomic nucleus,
$$r_n \approx \langle r \rangle = \int r \cdot \left|\Psi_{n,l,m}(\vec{r})\right|^2 \, dV, \tag{11.56}$$
given by the probability $|\Psi(\vec{r})|^2$. The quantum mechanical mean value is generally also called the expectation value, since one expects this value on average in a repeated measurement. The shells are traditionally also designated with letters. The principal quantum numbers n = 1, 2 and 3 correspond to the so-called K-, L- and M-shell, respectively.
11.2.2.2 Angular Momentum Quantum Number l
Although in the quantum mechanical model the electron is not on a circular orbit around the atomic nucleus, one can assign a classical orbital angular momentum $\vec{L}_l$ to the wave function, which depends on $l$, by calculating the mean value of the angular momentum, similar to the radius of Bohr’s orbit:
$$\vec{L}_l = \langle \vec{L} \rangle = \int \vec{L} \cdot \left|\Psi_{n,l,m}(\vec{r})\right|^2 \, dV. \tag{11.57}$$
Thus, the absolute value of the expectation value of the angular momentum is given by
$$\left|\vec{L}_l\right| = \sqrt{l(l+1)} \cdot \hbar. \tag{11.58}$$
The angular momentum quantum number $l$ can take on integer values in the range between zero and $n - 1$:
$$l \in \{0, 1, 2, \ldots, (n-1)\}. \tag{11.59}$$
The maximum possible value for l depends on the principal quantum number n. For a fixed principal quantum number n, there are a total of n different possibilities for l. The value of l influences the angular distribution of the wave function. In the case l = 0 the wave function is radially symmetric; in this case one speaks of a so-called s-orbital, which looks like a sphere. At l = 1 the wave function has the form of two opposite lobes. This function is also called a p-orbital. Depending on the direction in which the two lobes point, one speaks of the px-, py- or pz-orbital. Higher angular momentum quantum numbers (l = 2 is called d-orbital and l = 3 is called f-orbital) lead to crossed lobes and even more complicated angular distributions. If one wishes to describe a particular orbital including the principal quantum number, one inserts the number n before the name of the orbital. For example, the wave function with quantum numbers n = 2 and l = 1 is called a 2p orbital. A selection of sections through the simplest orbitals is sketched in Fig. 11.6a.
11.2.2.3 Magnetic Quantum Number m
The angular momentum quantum number l determines the absolute value of the classical angular momentum. The magnetic quantum number m, on the other hand,
Fig. 11.6 Simplest wave functions $\Psi_{n,l,m}$ of the hydrogen atom. The color value indicates the absolute square of the wavefunction and corresponds to the residence probability of the electron, with yellow indicating a high and blue a low probability. The four sketches in (a) are sections through the xz-plane and correspond to the 1s orbital, the 2s orbital, the 2pz orbital and the 3f orbital. In (b), the sections through the xy-plane are sketched. The top row shows the wave functions with quantum numbers n = 2, l = 1 and m = ±1. These have the shape of a donut. While the residence probability for the two cases m = ±1 is identical, the wavefunction rotates clockwise and counterclockwise about the z-axis, respectively. The px- or py-orbital (bottom row) is formed by the superposition of the two wave functions with + or -, respectively. The sketched area in all figures is ±20 Bohr radii, that is, approx. ±10 Å
describes the projection of the orbital angular momentum onto an arbitrary, previously determined axis, for example, onto the z-axis. The expectation value of this projection is
$$L_{z,m} = \langle \vec{L} \cdot \vec{e}_z \rangle = \int \vec{L} \cdot \vec{e}_z \cdot \left|\Psi_{n,l,m}(\vec{r})\right|^2 \, dV = m \cdot \hbar. \tag{11.60}$$
Here the magnetic quantum number can assume integer values in the range between $-l$ and $l$:
$$m \in \{-l, -l+1, \ldots, -1, 0, 1, \ldots, l-1, l\}. \tag{11.61}$$
Altogether, for a fixed value of $l$, there are $2l + 1$ different possibilities for $m$. In a p-orbital, for example, there are three possibilities, namely m = -1, m = 0 and m = 1, with correspondingly different shapes of the orbital. The wavefunction with m = 0 has the typical lobe shape of a p-orbital with orientation along the z-axis; thus, it is the pz-orbital (Fig. 11.6a). In contrast, the other two wavefunctions with m = ±1 have the shape of a donut, with the z-axis pointing through the center of the hole of the donut (Fig. 11.6b). They differ in the sense of rotation of the wavefunction about the z-axis. The missing p-orbitals (the px and the py orbitals) are superpositions of these two donut wavefunctions:
$$\Psi_{p_x} = \Psi_{n,l=1,m=+1} + \Psi_{n,l=1,m=-1}, \tag{11.62}$$
$$\Psi_{p_y} = \Psi_{n,l=1,m=+1} - \Psi_{n,l=1,m=-1}. \tag{11.63}$$
This is another example of the fact that the superposition of wave functions is an essential element of quantum physics.
Exercise: Quantum Numbers According to Schrödinger
Which of the following combinations of quantum numbers exist in the Schrödinger atom, and what is the name of the orbital?
1. n = 50, l = 0, m = 0
2. n = 1, l = 1, m = 1
3. n = 2, l = 1, m = 2
4. n = 3, l = 1, m = 0
5. n = 3, l = 2, m = 1
Solution
In each case one has to check whether, for the given $n$, which may be any positive integer, the condition (11.59) for $l$, $l \in \{0, 1, 2, \ldots, (n-1)\}$, and the condition (11.61) for $m$, $m \in \{-l, -l+1, \ldots, -1, 0, 1, \ldots, l-1, l\}$, are fulfilled.
1. Both conditions are fulfilled; it is the 50s orbital.
2. Condition (11.59) is not fulfilled.
3. Condition (11.61) is not fulfilled.
4. Both conditions are fulfilled; it is the 3p orbital.
5. Both conditions are fulfilled; it is the 3d orbital.
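The check carried out in this exercise is mechanical and can be written as a short function. The sketch below applies the conditions (11.55), (11.59) and (11.61); the function name `orbital_name` is hypothetical, and only the orbital letters s, p, d and f mentioned in the text are used.

```python
ORBITAL_LETTERS = "spdf"  # letters for l = 0, 1, 2, 3 as introduced in the text

def orbital_name(n, l, m):
    """Return the orbital name (e.g. '3p') or None if the combination does not exist."""
    if n < 1:                 # (11.55): n must be a positive integer
        return None
    if not 0 <= l <= n - 1:   # (11.59): l ranges from 0 to n - 1
        return None
    if not -l <= m <= l:      # (11.61): m ranges from -l to +l
        return None
    # Assumption of this sketch: orbitals beyond f are labelled by their l value.
    letter = ORBITAL_LETTERS[l] if l < len(ORBITAL_LETTERS) else f"(l={l})"
    return f"{n}{letter}"

exercise = [(50, 0, 0), (1, 1, 1), (2, 1, 2), (3, 1, 0), (3, 2, 1)]
for n, l, m in exercise:
    print((n, l, m), orbital_name(n, l, m))
```

The function returns `None` for the second and third combination and the orbital names 50s, 3p and 3d for the others, in agreement with the solution.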
11.2.2.4 Corrections to Schrödinger’s Atomic Model
Even Schrödinger’s atomic model does not describe reality completely. For example, the spectral lines of the elements sometimes deviate strongly from Schrödinger’s predictions. The reason lies in relativistic corrections, which are included in the atomic model according to Paul Dirac (1928). The expectation value of the velocity of the electron in the atom is extremely large, on the order of a percent of the speed of light. Thus the kinetic energy of the electron is no longer given by the well-known classical formula, but by the relativistic energy, which we cannot discuss further here. Moreover, if the orbital has angular momentum, this creates a magnetic field at the location of the electron, in which the spin of the electron (Sect. 11.2.3.1) can align itself. This is known as spin-orbit coupling. It leads to a shift of the energy levels, the so-called fine structure of the spectrum. A similar effect is obtained by coupling the total angular momentum (orbital angular momentum + spin) to the nuclear spin. This is then the hyperfine structure. In addition, due to the so-called Zitterbewegung (an effect of spatial uncertainty), there is a correction to the potential energy of the electron in the Coulomb potential. This correction is called the Darwin term. Another correction is the so-called quantum defect, which is due to the fact that the inner electrons partially shield the Coulomb potential of the nucleus.
11.2.3 Elements and Periodic Table
After introducing the atomic orbitals in Sect. 11.2.2, the following section deals with how these orbitals are actually occupied by electrons in the various elements. The elements differ in that the atomic nucleus contains an increasing number Z of positively charged protons, starting with Z = 1 for hydrogen. Atoms, considered as a whole object, are uncharged, so the electron shell of an atom must also contain Z electrons. We will address the question of how these Z electrons are distributed
among the infinite combinations of wavefunctions with quantum numbers n, l, and m. Here, the so-called spin of the electrons plays a major role.
11.2.3.1 Spin Quantum Number $m_s$ and Pauli Principle
Spin has nothing to do with atomic wave functions, but is an internal property of every quantum particle. Without going into more detail, the mathematical structure of spin looks as if the particle had an intrinsic angular momentum and rotated around its own axis like a small ball. In fact, however, spin exists even without intrinsic angular momentum in the classical sense. The electron, for example, cannot have a classical intrinsic angular momentum because, as a point particle, it has no spatial extent, that is, the radius of the electron is zero. The spin quantum number of the electron can have one of the two values
$$m_s \in \left\{-\frac{1}{2}, +\frac{1}{2}\right\}. \tag{11.64}$$
According to the notion of an intrinsic angular momentum pointing either up or down, that is, of a particle rotating either counterclockwise or clockwise, one speaks of either spin up or spin down. All particles, by the way, can be divided into two classes. Particles whose spin is half-integer ($m_s = \pm\frac{1}{2}, \pm\frac{3}{2}, \ldots$), like the electron, are called fermions, and particles whose spin is integer ($m_s = 0, \pm 1, \pm 2, \ldots$) are called bosons. Photons, incidentally, are an example of particles with integer spin. Atoms can be both fermions and bosons, depending on their composition. Important for the occupation of atomic orbitals is the Pauli principle, named after its discoverer Wolfgang Pauli: It states that two or more fermions must never occupy the same state. Two fermions must differ in at least one quantum number. Applied to the electrons in the atomic orbitals, this means that any combination of n, l and m can be occupied by a maximum of two electrons, which then differ by their spin quantum number.
11.2.3.2 Occupation of the Orbitals
Starting from hydrogen, we now build up the elements, one after the other, by adding one electron at a time to the atomic orbitals. The newly added electron occupies the orbital which, firstly, is not yet fully occupied and, secondly, has the smallest possible energy. The order in which the orbitals are occupied essentially follows the so-called Aufbau principle, according to which the energy of an atomic orbital is proportional to the sum of n and l:
$$E \propto n + l. \tag{11.65}$$
If the sum is the same, as for example in the case of the 2p orbital with n = 2, l = 1 and the 3s orbital with n = 3, l = 0, the orbital with the smaller principal quantum number is occupied first, that is, the 2p orbital in the example. The order can be illustrated graphically, as sketched in Fig. 11.7a.
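The ordering rule just described, sorting by n + l and breaking ties by the smaller n, can be written down directly. The following is a minimal sketch; the function name `aufbau_order` is hypothetical, and the exceptions for individual elements mentioned in the text are not modelled.

```python
LETTERS = "spdf"  # orbital letters for l = 0, 1, 2, 3

def aufbau_order(max_n):
    """Orbitals of the shells 1..max_n, sorted by n + l and, for equal sums, by n."""
    orbitals = [(n, l) for n in range(1, max_n + 1)
                for l in range(min(n, len(LETTERS)))]
    orbitals.sort(key=lambda nl: (nl[0] + nl[1], nl[0]))
    return [f"{n}{LETTERS[l]}" for n, l in orbitals]

# Only shells up to max_n are listed, so towards the end of the list
# orbitals of higher shells (for example 5s before 4d) are missing.
print(aufbau_order(4))
```

In the resulting list 2p appears before 3s and 4s appears before 3d, which matches the tie-breaking rule and the delayed filling of the d orbitals.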
Fig. 11.7 (a) The Aufbau principle indicates the order in which the atomic orbitals are occupied by electrons. Here one goes through the table in diagonals from top right to bottom left. (b) Occupation of the orbitals of the first and second shell from hydrogen to oxygen and corresponding electron configuration. Depending on whether the arrow points up or down, the electron has spin-up or spin-down. (c) Occupation of the orbitals in sodium (third shell)
However, for some elements there are exceptions due to relativistic corrections, which we do not discuss further here. In the following we carry out the occupation of the orbitals up to sodium as an example:
• Hydrogen (H) has one electron, which occupies the orbital with the smallest energy. This is the 1s orbital. It is often sketched as a small box in which an arrow pointing up or down is drawn to indicate the spin of the electron (Fig. 11.7b). The so-called electron configuration describes the state of the atom by adding the number of electrons as a superscript after the orbital. In the case of hydrogen, the electron configuration is thus 1s¹.
• Helium (He) has two electrons. The second electron can also be filled into the 1s orbital; however, its spin must be different from the spin of the first electron. Therefore, draw a second arrow with opposite direction in the corresponding box. The electron configuration is 1s². With helium, the first shell with n = 1 is complete.
• Lithium (Li) has three electrons. Since the 1s orbital is already full, the third electron is filled into the 2s orbital. For the 2s orbital, draw a new box above the 1s box. The electron configuration is 1s²2s¹. Full shells can also be abbreviated by the corresponding element in square brackets, for example, [He] can be written instead of 1s². The electron configuration of lithium is then written as [He]2s¹.
• Beryllium (Be), [He]2s², has four electrons. The fourth electron makes the 2s orbital full.
• Boron (B) has five electrons. After the 2s orbital comes the 2p orbital. There are now three possibilities, corresponding to the px, py and pz orbitals. You sketch this by three adjacent boxes right above the 2s box. The electron configuration is [He]2s²2p¹.
• Carbon (C), [He]2s²2p², has six electrons. The 2p orbital must therefore be occupied twice. Here, according to the so-called Hund’s rule, orbitals with the same energy are first all singly occupied with the same spin, that is, the second arrow in the p-orbital goes into the second box and has the same orientation as the first arrow.
• Nitrogen (N), [He]2s²2p³, has seven electrons. According to Hund’s rule, the third p-orbital is occupied by a single electron.
• Oxygen (O), [He]2s²2p⁴, has eight electrons. Since all three p-orbitals are now already occupied by an electron, the new electron is filled back into the first p-orbital with reversed spin.
• Fluorine (F), [He]2s²2p⁵, has nine electrons. The second p-orbital is filled.
• Neon (Ne), [He]2s²2p⁶, has ten electrons. The p-orbital is now completely filled with six electrons, making the second shell complete.
• Sodium (Na), [Ne]3s¹, has eleven electrons. The electron configuration corresponds to that of neon, plus one electron in the 3s orbital (Fig. 11.7c).
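The box diagrams described in this list follow two simple rules: fill the orbitals in the given order, and within a set of equal-energy boxes place one spin-up arrow per box before pairing (Hund's rule). Here is a sketch for the elements up to sodium; the orbital order is hard-coded for this range, and the names `fill_subshell` and `box_diagram` are hypothetical.

```python
# (orbital name, number of boxes) in filling order, valid up to sodium (Z = 11)
SUBSHELLS = [("1s", 1), ("2s", 1), ("2p", 3), ("3s", 1)]

def fill_subshell(electrons, boxes):
    """Hund's rule: first one spin-up electron per box, then pair up with spin-down."""
    ups = min(electrons, boxes)
    downs = electrons - ups
    return ["↑↓" if i < downs else "↑" if i < ups else "" for i in range(boxes)]

def box_diagram(Z):
    """Map each occupied orbital to its list of boxes for an atom with Z electrons."""
    if not 1 <= Z <= 11:
        raise ValueError("this sketch only covers hydrogen (Z = 1) to sodium (Z = 11)")
    remaining = Z
    diagram = {}
    for name, boxes in SUBSHELLS:
        if remaining == 0:
            break
        electrons = min(remaining, 2 * boxes)  # Pauli principle: at most two per box
        diagram[name] = fill_subshell(electrons, boxes)
        remaining -= electrons
    return diagram

for symbol, Z in [("C", 6), ("O", 8), ("Na", 11)]:
    print(symbol, box_diagram(Z))
```

For carbon the two 2p electrons end up in separate boxes with parallel spin, and for oxygen the fourth 2p electron pairs up in the first box, as described in the list.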
11.2.3.3 Periodic Table
With the help of these rules, the electron configuration of all but a few elements can be determined. The elements are then displayed in a table, the so-called periodic table (Fig. 11.8). Elements whose orbitals are filled up to the same shell are in the same row. The order from left to right corresponds to the order in which the orbitals are filled. The first row of the periodic table contains the elements of the first shell and therefore contains only hydrogen and helium. The second row contains all elements of the second shell. These are the eight elements from lithium to neon. The elements of the different rows are arranged in such a way that the elements with only one valence electron (e.g., hydrogen and lithium) are on the left and the elements with shells so full that a new shell has to be started (e.g., helium and neon) are on the right. Here you have to take into account that some orbitals are filled with a delay, for example, the 3d orbital is filled after the 4s orbital. Elements that are in the same column have similar spectroscopic and chemical properties; the columns are called main groups. For example, all elements that are below hydrogen (these are the so-called alkali metals lithium, sodium, potassium, rubidium, cesium, and francium) have a spectrum that has similarities to that of hydrogen. All elements below helium (these are the noble gases neon, argon, krypton, xenon and radon) have a closed shell and are therefore extremely unreactive, that is, they do not form chemical bonds with other atoms. The further down the periodic table you go, the more orbitals there are in the corresponding shell. Therefore, the rows get longer and longer towards the bottom. This creates gaps in the upper rows. Figure 11.8 shows the periodic table up to the fifth shell. In addition to the chemical symbol of the element and its name, the atomic number Z, which indicates the number of protons in the atomic nucleus, is also listed in the individual table positions.

Fig. 11.8 Periodic table up to the fifth shell. Elements lying below each other belong to the same main group and have similar chemical properties. The atomic number Z indicates the number of protons in the nucleus of the element. Identical orbitals are marked in color as blocks. The width of the blocks corresponds to the number of electrons that the respective orbital can accept

Written Test: Electrons in the M-Shell
How many electrons fit into the M-shell with principal quantum number n = 3?
Solution
If the principal quantum number is n = 3, the angular momentum quantum number l can take the values l = 0, l = 1 and l = 2. So there is an s orbital, a p orbital and a d orbital. The s orbital can accept 2 electrons, and the p orbital can accept 2 · 3 = 6 electrons, since for l = 1 the magnetic quantum number can take one of the three values m = 0 and m = ±1. Correspondingly, in the d orbital with l = 2 the magnetic quantum number can take 5 different values, namely m = 0, m = ±1 and m = ±2. Thus, the d orbital can accommodate 2 · 5 = 10 electrons. Therefore, in total, the M shell can accept 2 + 6 + 10 = 18 electrons.
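The counting argument of the written test can be repeated for any shell with a few lines of Python. This is only a minimal sketch; the function names `orbital_capacity` and `shell_capacity` are illustrative and do not appear in the text.

```python
def orbital_capacity(l: int) -> int:
    """Number of electrons that fit into the orbitals with quantum number l."""
    # m takes the 2l + 1 values -l, ..., +l; each state holds two spin orientations
    return 2 * (2 * l + 1)


def shell_capacity(n: int) -> int:
    """Number of electrons that fit into the shell with principal quantum number n."""
    # l runs over 0, 1, ..., n - 1
    return sum(orbital_capacity(l) for l in range(n))


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        per_orbital = [orbital_capacity(l) for l in range(n)]
        print(f"n = {n}: orbitals hold {per_orbital}, shell holds {shell_capacity(n)}")
```

For n = 3 the sum is 2 + 6 + 10 = 18, in agreement with the solution above; in general the sum equals 2n².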
Exercise: Electron Configuration
What is the electron configuration of tin (Sn)? Use the periodic table in Fig. 11.8 and note the structure principle in Fig. 11.7a.
Solution
Tin is an element of the fifth shell. The electron configuration is therefore based on that of krypton, which is in the fourth row on the far right. The 5s orbital is doubly occupied. This is followed by a d orbital, which is fully occupied with ten electrons. Because of the structure principle, this is the 4d orbital. Tin is second in the 5p orbital, so this orbital is occupied by two electrons. The electron configuration is

$$[\mathrm{Kr}]\,4d^{10}\,5s^{2}\,5p^{2}. \tag{11.66}$$

Here we ordered the orbitals according to the principal quantum number. Therefore the 4d orbital appears to the left of the 5s orbital, although it is occupied after the 5s orbital.
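The filling procedure used in this exercise can be sketched in Python. The sketch assumes that the filling order of Fig. 11.7a coincides with the common n + l rule (orbitals are filled in order of increasing n + l and, for equal n + l, increasing n); this rule is an assumption brought in here, not a statement of the text. It does not reproduce the exceptions mentioned above, and all names are illustrative.

```python
LETTERS = "spdfghi"  # orbital letters for l = 0, 1, 2, ...


def filling_order(max_n: int = 7):
    """Orbitals (n, l) sorted by the assumed n + l rule, ties broken by smaller n."""
    orbitals = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    return sorted(orbitals, key=lambda nl: (nl[0] + nl[1], nl[0]))


def electron_configuration(z: int) -> str:
    """Configuration predicted by the simple filling rule for atomic number z."""
    occupied = []
    remaining = z
    for n, l in filling_order():
        if remaining <= 0:
            break
        electrons = min(remaining, 2 * (2 * l + 1))  # capacity of the orbital
        occupied.append((n, l, electrons))
        remaining -= electrons
    # For display, order by principal quantum number as in (11.66)
    occupied.sort()
    return " ".join(f"{n}{LETTERS[l]}{k}" for n, l, k in occupied)


if __name__ == "__main__":
    print("Na:", electron_configuration(11))
    print("Sn:", electron_configuration(50))
```

For tin (Z = 50) the last three entries are 4d10, 5s2 and 5p2, which matches (11.66) once the krypton core is abbreviated.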
11.3 Nuclear Physics
While atomic physics is primarily concerned with the structure of the electron shell, nuclear physics deals with the structure of the atomic nucleus and with the processes that can take place in the nucleus, such as radioactive decay or nuclear fusion. As things stand today, not all questions in nuclear physics have been answered. It is possible to calculate energy levels in the nucleus using a quantum mechanical approach similar to that of the Schrödinger atom, the so-called shell model. However, the nuclear potential – unlike the Coulomb potential, which acts on the electrons of the atomic shell – is not known. In theory, various phenomenological model potentials are used instead, whose shape can be derived from scattering experiments. The energy levels are then occupied by the elementary particles that make up the nucleus, the nucleons. The fact that there are two different types of particles here, the positively charged protons and the electrically neutral neutrons, makes the theoretical description more difficult than for the Schrödinger atom. In addition to the shell model, there are a number of other models, each describing different aspects of nuclear physics; unfortunately, there is no single unified model. We therefore concentrate in the following on a description of the structure of the atomic nucleus from its elementary particles.
11.3.1 Structure of the Atomic Nucleus
The atomic nucleus consists, as already mentioned, of a number Z of protons with the positive elementary charge e. In addition, there are a number N of electrically neutral neutrons in the atomic nucleus (Fig. 11.9a). Protons and neutrons are related and, as we will see in Sect. 11.3.2 on nuclear decays, can convert into each other. This is because protons and neutrons are made of the same elementary particles, called quarks. The proton consists of two so-called up quarks with the charge $q_{up} = +2/3 \cdot e$ and one down quark with the charge $q_{down} = -1/3 \cdot e$. The charge of the proton is thus

$$q_p = \frac{2}{3}e + \frac{2}{3}e - \frac{1}{3}e = e. \tag{11.67}$$
Fig. 11.9 (a) An atomic nucleus consists of Z positively charged protons and N electrically neutral neutrons. In the sketched example, Z = 3 and N = 4. Thus, it is the nucleus of lithium with mass number A = Z + N = 7 and designation ${}^{7}_{3}\mathrm{Li}_{4}$. (b) Protons and neutrons consist of up and down quarks with charges $q_{up} = +2/3 \cdot e$ and $q_{down} = -1/3 \cdot e$
The neutron consists of one up quark and two down quarks. Accordingly, the charge of the neutron is

$$q_n = \frac{2}{3}e - \frac{1}{3}e - \frac{1}{3}e = 0. \tag{11.68}$$
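The two charge sums (11.67) and (11.68) can be checked with exact fractions. The following is a minimal sketch; the dictionary `QUARK_CHARGE` and the function `charge` are illustrative names, and charges are given in units of the elementary charge e.

```python
from fractions import Fraction

# Quark charges in units of the elementary charge e, as given in the text
QUARK_CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}


def charge(quarks: str) -> Fraction:
    """Total charge (in units of e) of a particle given its quark content, e.g. 'uud'."""
    return sum((QUARK_CHARGE[q] for q in quarks), Fraction(0))


if __name__ == "__main__":
    print("proton  (uud):", charge("uud"))
    print("neutron (udd):", charge("udd"))
```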
The three quarks are held together by the so-called strong nuclear force. It is impossible to detach one of the quarks from this three-quark bound state. This would cost so much energy that new quarks would be created in the attempt and those would attach themselves to the individual components. The atomic nucleus as a whole is also held together by the strong nuclear force. On a very short length scale this force is stronger than the repulsion of the protons due to the Coulomb force. As a result, an atomic nucleus has a typical diameter of approx.

$$d_{nuc} \approx 10^{-14}\,\mathrm{m} = 10^{-4}\,\text{Å} \tag{11.69}$$

and is thus four orders of magnitude smaller than the diameter of the atomic shell. The mass of the atomic nucleus results from the sum of the masses of the Z protons and the N neutrons as

$$m_{nuc} = Z \cdot m_p + N \cdot m_n - \Delta m_K, \tag{11.70}$$

with the so-called nuclear mass defect $\Delta m_K$. The mass defect is related to the binding energy $E_{b,K}$ with which the nucleons are bound in the nucleus. The connection results from Einstein's famous formula $E = m \cdot c^2$ as

$$\Delta m_K = \frac{E_{b,K}}{c^2}. \tag{11.71}$$
Since the masses of a proton with $m_p = 1.673 \cdot 10^{-27}$ kg and a neutron with $m_n = 1.675 \cdot 10^{-27}$ kg are almost equal and the mass defect is typically small compared to the total mass of the nucleus, the nuclear mass can be approximately calculated with the help of the mass number A = Z + N as

$$m_{nuc} \approx A \cdot u, \quad \text{with } u = 1.661 \cdot 10^{-27}\,\mathrm{kg}, \tag{11.72}$$

where u is the so-called atomic mass unit. Note that the atomic mass unit u is much larger than the mass $m_e = 9.11 \cdot 10^{-31}$ kg of an electron. Therefore, almost the complete mass of an atom is in its nucleus, which is tiny compared to the size of the atom. So we can say that atoms are mostly made of vacuum. The atomic number Z, the neutron number N and the mass number A now determine which type of nucleus (which nuclide) we are dealing with. A certain nuclide is written with the notation

$${}^{A}_{Z}\mathrm{X}_{N}. \tag{11.73}$$
Here the X stands as a placeholder for the chemical symbol of the element. Since neutral atoms always contain the same number of electrons as protons, the atomic number Z indicates which element it is. For example, the element rubidium has Z = 37 protons. In the most common case, a rubidium nucleus also has N = 48 neutrons, making the mass number A = Z + N = 85. So the name of the corresponding nucleus is

$${}^{85}_{37}\mathrm{Rb}_{48}, \tag{11.74}$$

one also speaks of rubidium 85 or Rb 85. A certain element with a fixed atomic number Z can have different neutron numbers N, and thus also different mass numbers. These are therefore different nuclei of the same element. These are called isotopes. An example is given by the hydrogen isotopes, which even have their own names:

$$\text{Hydrogen } {}^{1}_{1}\mathrm{H}_{0}, \tag{11.75}$$

$$\text{Deuterium } {}^{2}_{1}\mathrm{H}_{1}, \tag{11.76}$$

$$\text{Tritium } {}^{3}_{1}\mathrm{H}_{2}. \tag{11.77}$$
Deuterium and tritium are also referred to as heavy or superheavy hydrogen. Atomic nuclei that have the same neutron number but differ in atomic number, and thus also in mass number, are called isotones. An example of isotones are the following nuclei

$${}^{3}_{1}\mathrm{H}_{2} \quad \text{and} \quad {}^{4}_{2}\mathrm{He}_{2} \tag{11.78}$$

of tritium and helium. If both the atomic number and the neutron number differ such that the mass number of two nuclei is the same, we speak of isobars (not to be confused with the isobaric change of state in thermodynamics!). There is also an example

$${}^{3}_{1}\mathrm{H}_{2} \quad \text{and} \quad {}^{3}_{2}\mathrm{He}_{1} \tag{11.79}$$
Fig. 11.10 (a) In an isotope table, all types of nuclei are tabulated depending on the proton number Z and the neutron number N. Depending on Z and N, the nuclei are either stable or decay into other nuclei by various processes. (b) Section of the isotope table of the lightest nuclei
with tritium and the helium isotope ${}^{3}\mathrm{He}$. In total, there are more than 3300 known isotopes, most of which are not stable, but decay into other isotopes. All the different types of nuclei are listed in a so-called isotope table (Fig. 11.10). Here the neutron number N is plotted on the horizontal axis and the proton number Z on the vertical axis (sometimes also vice versa). Nuclei of the same element (isotopes) are in a row, nuclei with the same neutron number (isotones) are in a column, and nuclei with the same mass number (isobars) are on the diagonals from top left to bottom right. Stable nuclei range from hydrogen (${}^{1}_{1}\mathrm{H}_{0}$) to lead (${}^{208}_{82}\mathrm{Pb}_{126}$), provided that the ratio between the number of neutrons and the number of protons is in a certain range. All heavier nuclei and all nuclei (including the lighter ones) that have either too many or too few neutrons compared to the proton number are unstable and decay into other nuclei. There are different types of decay, which are colour-coded in the isotope table. The most important decay types are introduced in Sect. 11.3.2.
Exercise: Isotopes
What are the neighbouring isotopes, isotones and isobars of the beryllium nucleus Be 9 with mass number A = 9? Give the correct name of the nucleus in each case.
Solution
We use the section of the isotope table in Fig. 11.10b. The beryllium nucleus with mass number A = 9 has the designation ${}^{9}_{4}\mathrm{Be}_{5}$. To the left and right of it are the two beryllium isotopes

$${}^{8}_{4}\mathrm{Be}_{4} \quad \text{and} \quad {}^{10}_{4}\mathrm{Be}_{6}. \tag{11.80}$$

One row higher and lower are the two isotones boron 10 and lithium 8, respectively:

$${}^{10}_{5}\mathrm{B}_{5} \quad \text{and} \quad {}^{8}_{3}\mathrm{Li}_{5}. \tag{11.81}$$

On the diagonal are the two isobars boron 9 and lithium 9, respectively:

$${}^{9}_{5}\mathrm{B}_{4} \quad \text{and} \quad {}^{9}_{3}\mathrm{Li}_{6}. \tag{11.82}$$
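The bookkeeping of this exercise can be sketched in Python: isotopes share Z, isotones share N, and isobars share A = Z + N. The class `Nuclide`, the function `neighbours` and the short symbol list are illustrative assumptions, and the sketch only does arithmetic on Z and N; it does not know whether a given nuclide actually appears in the isotope table.

```python
from typing import NamedTuple

# Chemical symbols indexed by atomic number Z (only the lightest elements)
SYMBOLS = ["n", "H", "He", "Li", "Be", "B", "C", "N", "O"]


class Nuclide(NamedTuple):
    z: int  # number of protons
    n: int  # number of neutrons

    @property
    def a(self) -> int:
        """Mass number A = Z + N."""
        return self.z + self.n

    def label(self) -> str:
        return f"{SYMBOLS[self.z]}-{self.a}"


def neighbours(nuc: Nuclide) -> dict:
    """Nearest neighbours in a table with N on the horizontal and Z on the vertical axis."""
    z, n = nuc.z, nuc.n
    return {
        "isotopes": [Nuclide(z, n - 1), Nuclide(z, n + 1)],      # same row
        "isotones": [Nuclide(z + 1, n), Nuclide(z - 1, n)],      # same column
        "isobars": [Nuclide(z + 1, n - 1), Nuclide(z - 1, n + 1)],  # same diagonal
    }


if __name__ == "__main__":
    for kind, pair in neighbours(Nuclide(4, 5)).items():
        print(kind, [x.label() for x in pair])
```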
11.3.2 Decays of Atomic Nuclei
In a nuclear decay, a heavier atomic nucleus X decays into a lighter nucleus Y:

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A'}_{Z'}\mathrm{Y}_{N'}, \tag{11.83}$$

whereby further nuclear particles are emitted. Depending on the type of particle emitted, a distinction is made between different types of decay. The most important decays are listed below.
11.3.2.1 α-Decay
An α particle is emitted. This is a helium 4 nucleus ${}^{4}_{2}\mathrm{He}_{2}$. The mass number of the nucleus X is thus reduced by four, the number of protons and neutrons by two each. This can also be written in the form of a reaction equation:

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A-4}_{Z-2}\mathrm{Y}_{N-2} + {}^{4}_{2}\mathrm{He}_{2}. \tag{11.84}$$

In the isotope table, the nucleus Y is therefore shifted from the nucleus X by two fields along the diagonal to the lower left. An example of an α-decay is the decay of uranium 238 to thorium 234. Here, the new thorium nucleus is also unstable and decays in a so-called β⁻ decay.
11.3.2.2 β⁻-Decay
This type of decay always occurs when a nucleus has too many neutrons compared to the number of protons. In the isotope table, these nuclei are on the right below the stable range. The excess of neutrons is removed by the conversion of a neutron in the nucleus into a proton. In this process an electron (the β⁻ particle) is emitted. Thus the charge is also conserved in this process. This is necessary because the charge is one of the fundamental conserved quantities in physics. Another conserved quantity is the so-called lepton number. Leptons are a class of elementary particles that includes the electron with lepton number L = +1, but not the proton and the neutron. Therefore, this process must produce another particle, called an electron antineutrino $\bar{\nu}_e$ with lepton number L = -1, which is uncharged and balances the lepton number of the electron. The corresponding reaction equation for the neutron is

$$n \rightarrow p + e^{-} + \bar{\nu}_e \tag{11.85}$$

or for the entire nucleus

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A}_{Z+1}\mathrm{Y}_{N-1} + e^{-} + \bar{\nu}_e, \tag{11.86}$$

that is, the mass number of the nucleus remains the same. In the case of β⁻ decay, one moves one field up and to the left in the isotope table along the isobars. An example of a β⁻ decay is the decay of thorium 234 into a protactinium 234 nucleus, which itself decays again via a β⁻ decay into the radioactive isotope uranium 234. This is followed by a whole series of further α and β⁻ decays until the stable isotope lead 206 is reached. This is called a decay series. Analogous to the β⁻ decay, there is also the β⁺ decay.
11.3.2.3 β⁺-Decay
This type of decay always occurs when a nucleus has too many protons compared to the number of neutrons. In the isotope table, these nuclei are on the left above the stable range. The excess of protons is removed by a proton in the nucleus converting into a neutron. In the process, a positron (the β⁺ particle) is emitted. This is the antiparticle of the electron. Antiparticles are characterized by the fact that they are an exact copy of the original particle, with the difference that the antiparticle has opposite electric charge and lepton number. Thus, the positron has the properties of an electron, but has a positive elementary charge and lepton number L = -1. Because of lepton number conservation, another electrically neutral lepton with lepton number L = +1 is emitted, the electron neutrino $\nu_e$. The corresponding reaction equation for the proton is

$$p \rightarrow n + e^{+} + \nu_e \tag{11.87}$$

or for the entire nucleus

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A}_{Z-1}\mathrm{Y}_{N+1} + e^{+} + \nu_e, \tag{11.88}$$

that is, as in the case of β⁻ decay, the mass number remains the same. In the case of β⁺ decay, however, one moves one field down and to the right in the isotope table along the isobars in the direction of the stable range. An example of a β⁺ decay is the decay of a nitrogen 13 nucleus into a carbon 13 nucleus. In addition to the most common decay processes (α, β⁻ and β⁺), there are several others that are less common, such as proton or neutron emission.
11.3.2.4 Proton and Neutron Emission
Similar to β⁺/β⁻ decay, an excess of protons/neutrons in the nucleus is removed. Here, however, this happens by the emission of a proton/neutron from the nucleus rather than by a conversion of the particles in the nucleus, which reduces the mass number by one. The reaction equation for the proton emission is

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A-1}_{Z-1}\mathrm{Y}_{N} + p \tag{11.89}$$

and that for the neutron emission is correspondingly

$${}^{A}_{Z}\mathrm{X}_{N} \rightarrow {}^{A-1}_{Z}\mathrm{Y}_{N-1} + n. \tag{11.90}$$

Here one moves vertically downwards (proton emission) or horizontally to the left (neutron emission) in the isotope table.
11.3.2.5 γ-Decay
It is often the case that after a nuclear decay, the new nucleus is initially in an excited state, with protons and/or neutrons in a higher nuclear shell. After a short time, the nucleons fall back to the ground state. This causes a photon with very high energy to be emitted, a so-called γ quantum, after which this decay is named. An example of this is the β⁻ decay of the caesium 137 nucleus into a barium 137 nucleus. The barium nucleus is initially in an excited state. After a short time, it returns to its ground state by emitting a γ quantum with an energy of E = 0.6 MeV.

Written Test: C 14 Decay
The radiocarbon method for dating carbonaceous organic materials exploits the fact that in living organisms there is an equilibrium between the carbon isotopes C 12 and C 14. After the death of the organism, the amount of stable C 12 isotope remains the same, while the proportion of radioactive C 14 isotope decreases due to nuclear decay. What decay is involved and what is the reaction equation? Use the isotope table in Fig. 11.10b.
Solution
According to the isotope table, carbon C 14 decays to nitrogen N 14 by a β⁻ decay. The reaction equation is

$${}^{14}_{6}\mathrm{C}_{8} \rightarrow {}^{14}_{7}\mathrm{N}_{7} + e^{-} + \bar{\nu}_e. \tag{11.91}$$
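The decay types of this section differ only in how they change Z and N, so they can be summarised as shifts in the isotope table. The following Python sketch collects the shifts stated in the reaction equations (11.84) to (11.90); the names `DECAY_SHIFT` and `daughter` are illustrative and not part of the text.

```python
# Change (delta Z, delta N) of the nucleus for each decay type
DECAY_SHIFT = {
    "alpha": (-2, -2),   # emits a helium 4 nucleus
    "beta-": (+1, -1),   # neutron converts into a proton
    "beta+": (-1, +1),   # proton converts into a neutron
    "proton": (-1, 0),   # proton emission
    "neutron": (0, -1),  # neutron emission
    "gamma": (0, 0),     # only the excitation energy changes
}


def daughter(z: int, n: int, decay: str) -> tuple:
    """Return (Z, N) of the nucleus that remains after the given decay."""
    dz, dn = DECAY_SHIFT[decay]
    return z + dz, n + dn


if __name__ == "__main__":
    print("C 14, beta-:", daughter(6, 8, "beta-"))      # nitrogen 14
    # Start of the decay series mentioned above: U 238 -> Th 234 -> Pa 234 -> U 234
    z, n = 92, 146
    for step in ("alpha", "beta-", "beta-"):
        z, n = daughter(z, n, step)
        print(step, "->", (z, n), "A =", z + n)
```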
11.3.2.6 Energy in Nuclear Decay
In all nuclear decays, energy is released, either in the form of electromagnetic γ radiation or in the form of the kinetic energy of the reaction products. This energy comes from the mass defect, more precisely from the difference between the mass of the original nucleus and the sum of the masses of the decay products. Thus, the products together have less mass than the original nucleus. The difference in mass is released as energy. This energy is used, for example, in nuclear power plants by controlled nuclear fission to generate electricity. We will calculate how large the energy is by means of an example.

Exercise: Energy in U 238 Decay
Uranium 238 with a mass of $m_U = 238.0003\,u$ decays in an α decay into a thorium 234 nucleus with a mass of $m_{Th} = 233.9942\,u$ and a helium 4 nucleus with a mass of $m_{He} = 4.0015\,u$, with

$$u = 1.661 \cdot 10^{-27}\,\mathrm{kg}. \tag{11.92}$$

What is the energy released per nucleus in this process?
Solution
The difference of the masses is

$$\Delta m = m_U - m_{Th} - m_{He} = 0.0046\,u = 7.64 \cdot 10^{-30}\,\mathrm{kg}. \tag{11.93}$$

The energy released per nuclear decay is given by the Einstein relation

$$\Delta E = \Delta m \cdot c^2 = 7.64 \cdot 10^{-30}\,\mathrm{kg} \cdot \left(3 \cdot 10^{8}\,\frac{\mathrm{m}}{\mathrm{s}}\right)^{2} \tag{11.94}$$

$$= 6.88 \cdot 10^{-13}\,\mathrm{J} = 4.3\,\mathrm{MeV}. \tag{11.95}$$
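The arithmetic of this exercise can be reproduced with a short Python sketch. It uses the rounded constants of the text (u, c and the elementary charge for the conversion from joule to electron volt); the function name `decay_energy_joule` is an illustrative choice.

```python
U_KG = 1.661e-27    # atomic mass unit in kg
C = 3.0e8           # speed of light in m/s (rounded)
E_CHARGE = 1.6e-19  # elementary charge in C, i.e. joule per electron volt


def decay_energy_joule(parent_mass_u: float, product_masses_u: list) -> float:
    """Energy released when the parent decays into the given products (masses in u)."""
    delta_m_u = parent_mass_u - sum(product_masses_u)
    return delta_m_u * U_KG * C**2


if __name__ == "__main__":
    energy_j = decay_energy_joule(238.0003, [233.9942, 4.0015])
    energy_mev = energy_j / E_CHARGE / 1e6
    print(f"Delta E = {energy_j:.3g} J = {energy_mev:.2f} MeV")
```

With the masses from the exercise this gives roughly 6.9 · 10⁻¹³ J, or about 4.3 MeV, per decay.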
11.3.2.7 Decay Law and Activity
The decay of an atomic nucleus is a quantum mechanical process. Radioactive nuclei are classically stable in the sense that the nucleons are tightly bound in the nuclear potential. This means that the energy of the nucleons is classically insufficient to leave the nucleus. To do so, they must overcome a potential barrier. However, from a quantum mechanical point of view, the nucleons in the nuclear potential have a wave function that extends into the barrier at the edge of the nucleus. This is analogous to the electronic wave function in the double well potential in Fig. 11.3. While electrons in the double well potential tunnel back and forth between the wells, the nucleons can tunnel out of the nuclear potential and are then no longer bound to the nucleus. The trigger for nuclear decay is thus a quantum mechanical tunneling process. We denote the probability that the nucleus has not yet decayed by P(t). This probability decreases exponentially (without proof):

$$P(t) = e^{-t/\tau}, \tag{11.96}$$

where the 1/e decay time τ is characteristic for a particular type of nucleus, since it depends on the details of the nuclear potential and its filling with neutrons and protons. If we consider not only a single nucleus, but a sample of the radioactive material consisting of many nuclei, the number N(t) of nuclei that have not yet decayed behaves in the same way as the probability P(t). It holds that

$$N(t) = N_0 \cdot e^{-t/\tau}, \tag{11.97}$$

with the initial number of nuclei $N_0$ at the time t = 0. Often the half-life $T_{1/2}$ is given instead of the 1/e decay time τ. The following relationship connects the two quantities:

$$T_{1/2} = \tau \cdot \ln(2). \tag{11.98}$$

For the assessment of how radioactive a certain amount of a substance is, the number of nuclear decays per second is decisive. The number of decays per second is called activity. It is equal to the decrease per unit time of the number of nuclei that have not yet decayed. Therefore:

$$A(t) = -\frac{dN}{dt} = \frac{N_0}{\tau} \cdot e^{-t/\tau}, \quad [A] = \frac{1}{\mathrm{s}} = \mathrm{Bq}. \tag{11.99}$$
The activity has the unit 1/s, which in this context is called becquerel (Bq).

Exercise: Activity of Rubidium and Thorium
We consider in each case a number of $N_0 = 10^{23}$ nuclei of the (a) weakly radioactive rubidium isotope Rb 87 with a half-life of $T_{1/2} = 48 \cdot 10^{9}$ years and of the (b) strongly radioactive thorium isotope Th 234 with a half-life of $T_{1/2} = 24$ days. Calculate in each case the activity at the time t = 0, when all nuclei are still present.
Solution
According to (11.99), the activities at time t = 0 are given by

$$A(0) = \frac{N_0}{\tau} = \frac{N_0 \cdot \ln(2)}{T_{1/2}}. \tag{11.100}$$

We insert the corresponding half-lives and obtain activities of

$$A_{Rb} = \frac{10^{23} \cdot \ln(2)}{48 \cdot 10^{9} \cdot 365 \cdot 24 \cdot 3600\,\mathrm{s}} = 45\,790\,\mathrm{Bq}, \tag{11.101}$$

using that a year consists of 365 days with 24 h each and 3600 s/h. So there are about 46 000 rubidium nuclear decays per second. For thorium there are

$$A_{Th} = \frac{10^{23} \cdot \ln(2)}{24 \cdot 24 \cdot 3600\,\mathrm{s}} = 3.3 \cdot 10^{16}\,\mathrm{Bq}. \tag{11.102}$$

So thorium has a much greater activity than rubidium because it decays much faster.
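The decay law (11.97), the half-life relation (11.98) and the activity (11.99) translate directly into code. The following Python sketch recomputes the two activities of the exercise; the function names are illustrative, and the year is taken as 365 days as in the solution.

```python
import math

YEAR = 365 * 24 * 3600  # seconds per year, as assumed in the exercise
DAY = 24 * 3600         # seconds per day


def decay_time(half_life: float) -> float:
    """1/e decay time tau from the half-life, tau = T_1/2 / ln(2)."""
    return half_life / math.log(2)


def remaining_nuclei(n0: float, t: float, half_life: float) -> float:
    """Number of nuclei that have not yet decayed at time t, N(t) = N0 * exp(-t/tau)."""
    return n0 * math.exp(-t / decay_time(half_life))


def activity(n0: float, t: float, half_life: float) -> float:
    """Activity in Bq at time t, A(t) = N(t) / tau (times in seconds)."""
    return remaining_nuclei(n0, t, half_life) / decay_time(half_life)


if __name__ == "__main__":
    n0 = 1e23
    print(f"Rb 87:  {activity(n0, 0.0, 48e9 * YEAR):.0f} Bq")
    print(f"Th 234: {activity(n0, 0.0, 24 * DAY):.2e} Bq")
```

After one half-life, `remaining_nuclei` returns half of the initial number, which is a quick consistency check of (11.98).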
11.3.2.8 Biological Effects of Radioactivity
Since radioactive decay products are dangerous to all living organisms, including humans, it is important to be aware of the dangers of radioactivity. Here, both immediate damage and late damage can occur, which only becomes noticeable after years. Damage to the genetic material is particularly dangerous for offspring. The extent of the damage caused depends primarily on how much radiation energy a body or part of the body, such as an organ, absorbs. The (energy) dose $D_T$ is defined here as the amount of energy E absorbed per mass m of the absorbing body part T:

$$D_T = \frac{E}{m}, \quad [D_T] = \frac{\mathrm{J}}{\mathrm{kg}} = \mathrm{Gy}, \tag{11.103}$$

whereby the new designation gray (Gy) is introduced for the unit J/kg. Here, the time span in which the amount of energy is absorbed also plays a role for the damage. If a certain dose is absorbed over a short period of time, the damage is much more severe than if the dose is absorbed over a long period of time. In addition, the biological effect depends on the type of radiation. This is taken into account by the so-called radiation weighting factor $w_R$, which quantifies the biological effect of a certain type of radiation R compared to the effect of γ quanta (photons).

Radiation type                      Radiation weighting factor wR
γ quanta                            1
Electrons                           1
Neutrons (depending on energy)      5–20
Protons                             5
α particles                         20
The biological effect of a radiation type R on a body part T is then taken into account by the organ dose (dose equivalent) $H_{R,T}$. It is given by

$$H_{R,T} = w_R \cdot D_T, \quad [H_{R,T}] = \frac{\mathrm{J}}{\mathrm{kg}} = \mathrm{Sv}. \tag{11.104}$$

The unit of equivalent dose is also J/kg. In order to be able to distinguish the dose equivalent H from the absorbed dose D, its unit is called sievert (Sv). In addition, one can account for the different sensitivity of different parts of the body to radiation by tissue weighting factors. For example, the internal organs in humans such as the stomach, lungs, large intestine or bone marrow are much more affected than the skin or the brain. The biological effect averaged over all parts of the body is then described by the effective dose, which we will not go into here. The causes of radiation exposure are manifold. On the one hand, there is natural exposure caused by the radioactive rocks of the earth and by radiation from space (the cosmic rays) falling on the earth. This natural exposure in Europe is about 1 mSv per year. It should be noted that in some places of the world the natural exposure can be much higher, for example, at the beach of Guarapari in Brazil it is 175 mSv per year due to the thorium-containing sand. Medical examinations are another source of radiation exposure. These vary from about 10 μSv for a simple X-ray examination to 10 mSv for a computer tomography of the pelvic area. In the case of radiotherapy, in which tumours are specifically irradiated, a much higher equivalent dose can occur locally. Nuclear power plants are another source, although under normal conditions the radiation exposure outside the power plant does not exceed the natural level. In Europe, workers inside the power plant may be exposed to a maximum equivalent dose of 20 mSv per year. However, much more radiation can be released during catastrophic events. When the Chernobyl nuclear power plant exploded in 1986, workers were exposed to a maximum dose of 13 Sv. During the explosion of the atomic bomb in Hiroshima in 1945, the radiation exposure at a distance of 2 km from the explosion site was 1 Sv. Exposure to 5 Sv leads to death in 50% of cases within 30 days.
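The definitions (11.103) and (11.104) can be combined into a small calculation. The following Python sketch uses the weighting factors from the table above; neutrons are left out because their factor depends on energy. The function names and the numbers in the example (0.07 J absorbed by a mass of 70 kg) are purely illustrative assumptions.

```python
# Radiation weighting factors w_R from the table (neutrons omitted: energy dependent)
W_R = {"gamma": 1, "electron": 1, "proton": 5, "alpha": 20}


def absorbed_dose(energy_j: float, mass_kg: float) -> float:
    """Absorbed dose D_T = E / m in gray."""
    return energy_j / mass_kg


def equivalent_dose(radiation: str, dose_gy: float) -> float:
    """Dose equivalent H_RT = w_R * D_T in sievert."""
    return W_R[radiation] * dose_gy


if __name__ == "__main__":
    dose = absorbed_dose(0.07, 70.0)  # hypothetical example values
    for radiation in W_R:
        h_msv = equivalent_dose(radiation, dose) * 1e3
        print(f"{radiation:9s} D = {dose * 1e3:.1f} mGy -> H = {h_msv:.0f} mSv")
```

The same absorbed dose of 1 mGy thus corresponds to 1 mSv for γ quanta but to 20 mSv for α particles.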
11.3.3 Nuclear Fusion
We have seen that an atomic nucleus can decay if the end products of the reaction have less mass than the original nucleus. In this case, the difference in mass is released as energy. There is also the reverse case of nuclear fusion, where two light nuclei combine to form a heavier nucleus. This is possible if the resulting nucleus has less mass than the two original nuclei together. Energy is also released in this case. In contrast to nuclear fission, however, nuclear fusion only takes place under extreme conditions. In order for the nuclei to combine, the wave functions of the nucleons involved must tunnel through a potential barrier. This process only takes place when the two initial nuclei are close enough to each other. For this to happen, however, the strong Coulomb repulsion of the positively charged nuclei must be overcome.
Nuclear fusion therefore only takes place when the atomic nuclei are pressed together by enormous pressure at very high temperatures, such as in the interior of stars. In the centre of the sun, where nuclear fusion takes place, there is a temperature of 15 million kelvin and a pressure of 200 billion bar. All the light energy that we receive on earth from the sun has been released by these nuclear fusion processes. Typically, the hydrogen that makes up all stars is first fused into helium, and when the hydrogen is used up, the helium nuclei are fused into heavier nuclei such as beryllium and carbon. On Earth, too, attempts are being made to generate energy with the help of nuclear fusion. The most promising approach is to fuse deuterium and tritium into a helium nucleus:

$${}^{2}_{1}\mathrm{H}_{1} + {}^{3}_{1}\mathrm{H}_{2} \rightarrow {}^{4}_{2}\mathrm{He}_{2} + n + 17.6\,\mathrm{MeV}. \tag{11.105}$$

The deuterium required for this is available on earth as a raw material in huge quantities. However, the tritium must be produced in the reactor itself. For this purpose, the neutron released during the fusion reaction, which has a large part of the released energy as kinetic energy, and a reservoir of lithium are used:

$$n + {}^{6}_{3}\mathrm{Li}_{3} \rightarrow {}^{4}_{2}\mathrm{He}_{2} + {}^{3}_{1}\mathrm{H}_{2} + 4.8\,\mathrm{MeV}. \tag{11.106}$$
However, it is technically difficult to produce the necessary temperatures and densities. At the moment, the effort required for this still consumes more energy than is ultimately released during fusion. Therefore, nuclear fusion will only become technologically interesting when it is possible to obtain a net energy from it. The future will show whether energy generation through nuclear fusion can be established.
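The energies quoted in (11.105) and (11.106) can be estimated from the mass difference between the initial and final particles, in the same way as for nuclear decay. The following Python sketch does this with rounded atomic masses in units of u; these mass values are not given in the text and are inserted here as assumptions, and the function name `released_energy_mev` is illustrative. The rounded constants u, c and e are those of the text.

```python
U_KG = 1.661e-27    # atomic mass unit in kg
C = 3.0e8           # speed of light in m/s (rounded)
E_CHARGE = 1.6e-19  # elementary charge in C, i.e. joule per electron volt

# Rounded atomic masses in u (assumed values, not taken from the text)
MASS_U = {
    "n": 1.008665,
    "H-2": 2.014102,
    "H-3": 3.016049,
    "He-4": 4.002603,
    "Li-6": 6.015123,
}


def released_energy_mev(reactants: list, products: list) -> float:
    """Energy released in MeV, from the mass difference of reactants and products."""
    delta_m_u = sum(MASS_U[x] for x in reactants) - sum(MASS_U[x] for x in products)
    return delta_m_u * U_KG * C**2 / E_CHARGE / 1e6


if __name__ == "__main__":
    print(f"D + T    : {released_energy_mev(['H-2', 'H-3'], ['He-4', 'n']):.1f} MeV")
    print(f"n + Li-6 : {released_energy_mev(['n', 'Li-6'], ['He-4', 'H-3']):.1f} MeV")
```

Atomic rather than nuclear masses can be used here because the number of electrons is the same on both sides of each reaction, so the electron masses cancel in the difference.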
11.4 Quantum Physics: Compact
Here are the most important formulas of quantum physics:

De Broglie wavelength and frequency of a matter wave (11.5):
$$\lambda_{dB} = \frac{h}{p} = \frac{h}{m \cdot v}, \quad f = \frac{E_{kin}}{h}.$$

Copenhagen interpretation (11.14):
$$W(\vec{r}) = \left|\Psi_0(\vec{r})\right|^2.$$

Heisenberg's uncertainty principle (11.22):
$$\Delta x \cdot \Delta p \gtrsim h.$$

Orbital radii in the Bohr atomic model (11.39):
$$r_n = a_{Bohr} \cdot \frac{n^2}{Z}.$$

Energies in the Bohr atomic model (11.42):
$$E_n = -E_{Ry} \cdot \frac{Z^2}{n^2}.$$

Transition wavelength in the Bohr atomic model (11.49) and (11.47):
$$\lambda = \frac{h \cdot c}{\Delta E}, \quad \Delta E = E_{Ry} \cdot Z^2 \cdot \left(\frac{1}{m^2} - \frac{1}{n^2}\right).$$

Quantum numbers in the Schrödinger model of the atom (11.55), (11.59) and (11.61):
$$n \in \{1, 2, 3, \ldots\}, \quad l \in \{0, 1, 2, \ldots, (n-1)\}, \quad m \in \{-l, -l+1, \ldots, -1, 0, 1, \ldots, l-1, l\}.$$

Spin quantum numbers of an electron (11.64):
$$m_s \in \left\{-\frac{1}{2}, +\frac{1}{2}\right\}.$$

Nuclear mass defect (11.71):
$$\Delta m_K = \frac{E_{b,K}}{c^2}.$$

Radioactive decay law (11.97) and (11.98):
$$N(t) = N_0 \cdot e^{-t/\tau}, \quad T_{1/2} = \tau \cdot \ln(2).$$

Activity (11.99):
$$A(t) = -\frac{dN}{dt} = \frac{N_0}{\tau} \cdot e^{-t/\tau}, \quad [A] = \mathrm{Bq}.$$

Dose and dose equivalent (11.103) and (11.104):
$$D_T = \frac{E}{m}, \quad [D_T] = \mathrm{Gy}, \qquad H_{R,T} = w_R \cdot D_T, \quad [H_{R,T}] = \mathrm{Sv}.$$
12 Appendix: Physical Constants
Gravitational constant $G = 6.67 \cdot 10^{-11}\ \mathrm{N\,m^2/kg^2}$,
Speed of light $c = 299\,792\,458\ \mathrm{m/s} \approx 3 \cdot 10^{8}\ \mathrm{m/s}$,
Boltzmann constant $k_B = 1.38 \cdot 10^{-23}\ \mathrm{J/K}$,
Avogadro constant $N_A = 6.02 \cdot 10^{23}\ \mathrm{1/mol}$,
General gas constant $R = 8.31\ \mathrm{J/(mol\,K)}$,
Electric field constant $\varepsilon_0 = 8.85 \cdot 10^{-12}\ \mathrm{As/(Vm)}$,
Elementary charge $e = 1.6 \cdot 10^{-19}\ \mathrm{C}$,
Magnetic field constant $\mu_0 = 4\pi \cdot 10^{-7}\ \mathrm{Tm/A}$,
Stefan-Boltzmann constant $\sigma = 5.67 \cdot 10^{-8}\ \mathrm{W/(m^2\,K^4)}$,
Planck's constant $h = 2\pi\hbar = 6.63 \cdot 10^{-34}\ \mathrm{Js}$,
Bohr radius $a_B = 0.5 \cdot 10^{-10}\ \mathrm{m}$,
Rydberg energy $E_{Ry} = 13.6\ \mathrm{eV}$,
Mass of the electron $m_e = 9.1 \cdot 10^{-31}\ \mathrm{kg}$,
Mass of the proton $m_p = 1.673 \cdot 10^{-27}\ \mathrm{kg}$,
Mass of the neutron $m_n = 1.675 \cdot 10^{-27}\ \mathrm{kg}$,
Atomic mass unit $u = 1.661 \cdot 10^{-27}\ \mathrm{kg}$.

© Springer-Verlag GmbH Germany, part of Springer Nature 2023 S. Slama, Experimental Physics Compact for Scientists, https://doi.org/10.1007/978-3-662-67895-4_12
Index
A Abbe criterion, 278–280, 291 Absorption, 248–250, 310–312 Acceleration angular, 22, 23, 52, 54 due to gravity, 16, 19, 27, 66, 76, 99 tangential, 22, 23 AC resistance, 209–212, 230 Activity, 329, 333 Adhesion, 75 Adiabatic process/equation, 135, 143 Aerostatics, 75–77, 134 α particle, 325 Alternating current (AC), 195, 203–204, 209– 214, 219, 220, 225–228, 230, 236, 237, 239 Alternating voltage, 204, 209–211, 213, 224, 227 Ampère’s law, 181–187, 196, 206 Amplitude of an oscillation, 32, 95, 101–103, 109, 220, 258 Angular momentum theorem, 45, 58 Antiparticle, 326 Atomic model Bohr’s, 308, 312, 313, 332, 333 Schrödinger’s, 308, 309, 312–316 Atom in the box, 302 Aufbau principle, 317, 318 Avogadro constant and mol, 127
B Bandpass filter, 236 Barometric formula, 76–77, 92 Beatings, 109–110, 268 Beer foam, 10–11
Bell’s inequality, 300, 306 Bending, 61–63, 91 Bernoulli effect, 80–84 Bernoulli’s equation, 80–84 β- particle, 325 β+ particle, 326 Binding energy and binding distance in a molecule, 59 Birefringence, 263–265 Black-body radiation, 282–284 Bohr radii, 309, 310 Bohr’s atomic model, 308 Boyle-Mariotte, law of, 76, 92, 134 Brewster angle, 251, 262, 263 Brewster reflection, 262–263 Buoyant force, 69, 70, 103
C Capacitance, 160, 161, 168, 211–213, 217, 218, 224, 226, 229, 230, 234 Capillary, 75 Carnot process, 148 Cathode ray, 164 Celsius, 120 Center of mass, 13, 27, 50, 52 Centrifugal force, 31, 57 Centripetal acceleration, 22, 23, 31, 56 Coercivity, 190 Coherence length, 269, 270, 291 time, 268, 269, 291 Cohesion, 75 Coil, solenoid, 184 Collapse of the wave function, 297 Collision central, 42, 58
elastic, 42, 58 inelastic, 42, 43, 58 Compressibility, 68, 110, 111 Compression, 27, 60–62, 68–69, 74, 75, 91, 92, 130–133, 135, 136, 143 Compton scattering, 286–288, 292 Compton wavelength, 287 Conductivity electric, 163 thermal, 122, 123 Conservation of angular momentum, 45, 46, 306 charge, 215 energy, 35, 37, 42, 55, 81, 227 lepton number, 325, 326 momentum, 25, 40–43, 83, 286, 288, 289 Continuity equation, 78–81, 83, 92 Copenhagen interpretation, 295–300, 332 Coriolis force, 31–32, 57 Coulomb’s law, 150–151, 169 Current, 162 Current/current density, 2, 144, 149, 174, 196, 209
D De Broglie wavelength, 294, 295 Decay law, 328–330, 333 Decay, nuclear, 321, 325, 327–330 Degrees of freedom, 128–131 γ-particle, 327 Density, 51, 52, 59, 66, 70, 71, 73, 75–80, 82, 86, 87, 89, 90, 103, 110, 111, 122, 145, 151, 156, 160, 163, 168, 180, 181, 185–188, 190, 249 Diamagnetism, 188, 189 Dielectric constant, 168 Dielectricity, 167–169 Diffraction at double slit, 275, 276 grating, 276–277, 291 single slit, 274–276, 278 Diffraction limit, 277–280 Dipole dipole radiation, 238, 262, 312 Dipole moment electrical, 166, 167 magnetic, 183, 187, 192, 238 Dispersion, 247, 248 Distributions, 5–7 distribution functions, 6, 7 Doppler effect, 113–115, 117
Dose, 330, 331, 333 Dose equivalent, 331, 333 Double slit experiment, 297–299 Double well potential, 302, 303 Dynamo, 195, 203–204, 209
E Eddy current, 201–203 Elastic stress/elasticity, 60–62 Electric current, 2, 162, 173, 174, 181, 188 energy density, 161 flux density, 167, 168, 190 force, 151, 152, 164, 169, 175, 180, 194 inductance, 162 potential, 157–159, 166, 187 resonator, 233 voltage, 157, 161, 198 Electric charge, 149 Electric dipole, 166–167 Electric field/field lines, 151–152 of a charged plate, 152, 156 of a long straight charged wire, 155, 156 of a plate capacitor, 152, 156, 180 of a point charge, 151, 152, 170 Electric flux, 152 Electric force, 151 Electromagnetic waves, 113, 206, 232, 237–239, 243–250, 256–258, 268, 282, 290 Electron, 149, 164, 165, 180, 238, 281, 284–287, 293, 297–299, 307–310, 312–314, 316–321, 323, 325, 326, 328, 330, 333, 336 Electron configuration of an atom, 318 Electron diffraction, 297–299 Elementary charge, 149, 165 Elongation, 60–62 Emission, spontaneous, 310 Energy of a charge in electric field, 154, 160 in a coil, 205 elastic, 36, 37, 57, 60, 130 of an electric dipole, 152, 166 of the electric field, 186 internal, 119, 128–131, 134, 136, 146 kinetic, 34, 36, 37, 42, 43, 54, 57, 80, 81, 119, 120, 124, 126, 128, 165 of the magnetic field, 186–187, 192 photon, 285 in a plate capacitor, 161 potential, 36–38, 55, 57, 80 rotational, 54–55, 57, 128, 129
thermal, 120, 128–130, 142, 164, 282 Entanglement, 304–307 Entropy, 134, 136–144, 148 EPR experiment, 305–306 Equation of motion, 33–34, 97, 98, 100, 103 Equilibrium of forces, 47, 49 stable/unstable/indifferent, 50 thermal, 120, 128, 141, 282 Equipartition theorem, 128 Ergodic hypothesis, 138 Error propagation, 7–9, 12 Expansion coefficient, 123, 124
F Faraday cage, 162 Faraday’s law of induction, 198–199, 203, 204, 206, 207 Ferromagnetism, 173, 188–191 Field, electrical, 150–152, 154–168, 170, 173, 176, 180, 184–187, 193–199, 245, 248, 249, 256–258, 267, 272, 280 Field emission, 164, 165 Focal length, 252 Forces flow field, 77 laminar, 86 turbulent, 89, 90, 93 volume flow rate, 79, 86, 88, 92 Frequency, 21 Frictional force Newtonian friction force, 85 rolling friction, 32, 33 sliding friction, 32 static friction, 32 Stokes’ law of friction, 86–87, 92, 103, 104
G Gas constant, general, 127 Gas equation, ideal, 133 Gas, ideal, 76, 124–135 Gaussian distribution, 6, 7 Gauss’ theorem, 152, 186 Gradient, 10, 85, 158, 187 Gravitation force, 25–27, 31, 34, 56, 86, 90, 96, 98, 103 potential, 39
H Hagen-Poiseuille, law of, 86, 88, 92 Hall effect, 179–180 Harmonic oscillator, 95–104
damped, 100 driven, 99–104 Harmonic waves, 104–116, 243, 267, 268 Hearing threshold, 113 Heat, 3, 37, 42, 119–123, 127, 130–136, 139–147, 164, 186, 203, 211–213 Heat capacity at constant pressure, 133 at constant volume, 131 Heat engine/efficiency, 141–143, 148 Heat flows, 120, 122, 123, 135, 143, 147 Heisenberg uncertainty principle, 301, 302 Hertzian dipole, 237–239, 261, 262 Histogram, 5–7 Hooke’s law, 27, 28, 36, 56, 60, 62–64, 96 Hund’s rule, 318 Huygens principle, 273, 274 Hydrodynamics, 77–90 Hydrogen spectrum, 308 Hydrostatics, 65–77 Hysteresis, magnetic, 190
I Ideal gas law, 125, 133–135, 147 Impedance, 210, 219–225, 227, 239 Inclined plane, 32, 33, 55 Inductance, 162, 204, 205, 211, 213, 214, 217, 218, 225, 226, 232, 234 Induction of current, 199–201 of an electric field, 198–199 of a magnetic field, 196–198 self-induction, 204, 207, 212, 238 of voltage, 198, 199, 201, 203, 204, 212, 232 Induction cooker, 202, 203 Inertia force, 30–31 Inertial systems, 30 Intensity of an electromagnetic wave, 244, 290 of a sound wave, 112 Interference of light, 267–273, 275 in a thin film, 271 Interferometer Fabry-Perot, 271, 272 Mach-Zehnder, 269–271, 291 Ionization, 310 Irreversible expansion, 136–138 Isentropic exponent, 135 Isobaric process, 133, 147 Isobars, 323 Isochoric process, 132, 133, 135 Isothermal processes, 136
Isotons, 323 Isotope chart, 324–327 Isotopes, 2, 323 Isotope table, 324
K Kepler orbits, 34 Kinematics, 13–23, 33 Kirchhoff’s laws, 214, 215
L Lambert-Beer law, 248, 290 Law of Malus, 259 Law of Snellius, 250 Lens law, 255 Lens, optical, 252–256, 277, 278 Lenz’s rule, 199–203 Leptons, 325 Leverage law, 49 Light intensity, 244 Light rays, 250 Linear regression, 9–12 Lorentz force, 175 on current-carrying conductors, 178, 180 on a moving charge in, 176, 178, 191 Lotus effect, 75 Loudness, 112
M Mach number, 114, 115 Mach-Zehnder interferometer, 269–271, 291 Macrostates, 138 Magnetic coercivity, 190 energy density, 186, 187 field strengths, 176, 188 flux/flux density, 185, 190 hysteresis, 190 induction, 193, 195, 196, 198–201, 203, 207, 218 remanence field, 190 Magnetic field of a coil, 184, 186, 187, 213 of a straight wire, 182, 191 Magnetic field constant, 182 Magnetic flux, 185 Magnetism diamagnetism, 188 ferromagnetism, 173, 188–190 paramagnetism, 188–189 Magnetization, 187–191 Magnet, permanent, 173–174
Magnus effect, 81, 83 Malus, law of, 259, 260, 291 Many worlds theory, 300 Mass defect, 322, 328, 333 Mass unit, atomic, 323 Matter waves, 294, 296 Maxwell-Boltzmann distribution, 125 Maxwell’s equations, 193, 205 Mean values, 4–7, 10, 111, 119, 211, 269, 313 quantum mechanical, 313 Microstates, 138 Mie scattering, 248, 249 Mol, 2, 127 Molecular potential, 59 Moment of inertia, 52–55, 58, 129 Momentum, 25, 35, 40–46, 57, 58, 119, 281, 288, 291 Motion oblique, 18–20 in three-dimensions, 17–20 uniform circular, 20, 21, 56
N Neutron, 176, 293, 321, 322 Neutron emission, 327 Newtonian fluids, 85 Newton’s axioms, 24–26 Nuclear decays, 325, 327–330 α-decay, 325 β+-decay, 326 β--decay, 325–326 γ-decay, 327 energy, 328 283 uranium, 328 Nuclear fusion, 331–332 Nuclear physics, 321–332 Nucleus, of an atom, 59, 149, 166, 188, 321
O Ohm’s law, 163–164, 170, 199, 203, 214, 216, 217, 223, 229, 231, 232 Optical activity, 265–267, 291 Orbital, 21, 34, 45, 177, 308, 309, 313–315, 318 Orbital angular momentum, 313
P Paradox hydrodynamic, 83 hydrostatic, 66 Parallel connection in an electric circuit, 218–220, 240
Paramagnetism, 184, 188–189 Partial derivative, 7, 39 Path integral, 39, 157, 160, 181, 182, 184, 185, 187, 195, 196, 198 Pauli principle, 59, 317 Pendulum clock, 99 Periodic table, 316–321 Permanent magnets, 173–176, 188 Permeability, 188, 189 Phase diagram, 144, 145 Phase transitions, 144–146 Photoelectric effect, 284–286 Photon, 280, 285, 288, 293 Photon statistics, 282, 288–289 Pitot tubes, 81, 82 Planck’s law of radiation, 282, 283 Plate capacitor, 152, 155–157, 160, 161, 165, 168, 170, 180, 196, 197, 211 Polarimetry, 265, 266 Polarization, 167, 169, 252, 256–267 Polarizers, 259–261, 265, 266 Potential electrical, 158, 160 equipotential lines/surface, 39, 158–160 gravitational, 39 Power electrical, 164, 186, 210–212, 227 mechanical, 37–38 reactive, 212, 213 Pressure, 65–69 gravity, 66, 67, 69, 81, 83, 84, 91 plunger, 66–69, 91, 136, 141 Principal quantum numbers, 308, 309, 313 Proton, 149, 293, 316, 321 Proton emission, 327
Q Quantum numbers, 313, 333 magnetic, 314 momentum, 313 principal, 308, 309, 313 spin, 317 Quarks, 321, 322
R Radiation pressure, 245, 290 Radioactivity, 330–331 Radiocarbon method, 327 Rayleigh criterion, 278, 291 Rayleigh scattering, 248, 249, 261 RC circuit, 228–230 filter, 224–225
Reactive power, 212, 213 Realism, local, 299, 300 Reflection of light at an interface Brewster angle, 251, 262, 263 total internal reflection, 251, 252, 290 Refraction of light, 290 Refractive index, 246–248, 250–254, 262–265, 271–273, 279, 280, 290 Refractive power, 253 Regression, 9–12 Remanence field, 190 Resistance, electric capacitive AC resistance, 212 inductive AC resistance, 213 specific, 163 Resolving power, of optical images, 278 Resonance catastrophe, 103 Resonator, electric, 233 Reversibility, 136–144 Reynolds number, 89, 92 RL circuit, 230–232 filter, 225–226 Rockets, 41 Rotation of extended bodies, 52–55 Rydberg atoms, 310 Rydberg energy, 309, 310
S Schrödinger, atomic model, 309, 312 Schrödinger equation, 312 Schrödinger’s cat, 306–307 Series connection in an electric circuit, 217–218, 240 Shear, 61, 63–65, 72, 85, 91 Shear force/modulus, 63–65, 85 SI system, 2 Snellius law, 262 Sound intensity, 112, 113, 117 Sound particle velocity, 111, 112 Sound waves, 105, 110–114, 268 Speed of light, 5, 9, 165, 194, 237, 243, 247, 281 in a medium, 247 Speed of sound, 106, 111, 112, 114–116 Spin quantum numbers, 317, 333 Spontaneous emission, 310 Spring force, 27–30, 56, 97 Spring pendulum, 96–97, 100, 116 Standard deviation, 4–7, 12 Statics of rotation, 46, 48–51, 58 of translation, 46–48, 58 Statistical uncertainty, 3–5
Stefan-Boltzmann law, 283, 284, 292 Steiner’s theorem, 53–54, 58 Stokes’ law of friction, 86–87, 92, 103, 104 Streamlines, 77 Stress limit/tensile strength, 64 Strong nuclear force, 322 Superposition principle, 17, 18, 106 superposition of wave functions, 302 Supersonic cone, 114, 115 Surface tension/surface energy, 72, 73, 75, 91
T Tangential velocity, 22, 23 Temperature, 2, 3, 5, 77, 106, 111, 119–126, 129–135, 140–147, 269, 271, 282–284, 293 Thermal conductivity, 122, 123 Thermal expansion, 123–124, 147, 271 Thermionic emission, 164 Thermodynamics, 119–124 1st law of, 130–131, 134 2nd law of, 139, 142 3rd law of, 140 Thread pendulum, 96, 98–99 Torque, 44–46, 48–50, 52, 54, 58, 64, 166, 167, 170, 184, 192 Torsion, 61, 64, 91 Torsional modulus, 63, 64 Total internal reflection, 251, 252, 290 Transformer, 226–228, 241 Transverse contraction, 61 Turbulence, 77, 86, 89–90, 92
U Uncertainty principle, 300–302, 332 Units, 1–4, 21, 24, 35, 43, 65, 99, 100, 102, 120, 122, 127, 144, 149, 150, 152, 162, 163, 165, 173, 176, 185, 220, 221, 224, 253, 266, 280, 285, 286, 323, 330 U-tube, 67–68 barometer, 67 manometer, 67
V Variance, 5 Vector potential, 187 Velocity, 1, 9, 13–24, 31, 32, 36, 38, 40–45, 52, 54–56, 77–90, 99–101, 111–113, 116, 125–127, 147, 163, 165, 175–177, 180, 193, 194, 202, 247, 287, 288, 290 angular, 20 Virtual image, 255 Viscosity, 85–87, 89, 90, 103 Voltage effective electrical, 210, 211 electrical, 198 voltage drop, 215, 217, 223, 225, 231, 235 Volume flow rate, 79
W Wave electromagnetic, 113, 206, 232, 243, 244, 246, 247, 249, 250, 257, 258, 268, 282, 286, 290 harmonic, 104–116, 267, 268 matter wave, 294, 296, 297, 300, 332 sound wave, 105, 110–114, 268 standing, 107–109, 116 transverse/longitudinal, 105, 110, 243, 256 Wave functions, 296, 301, 314 Wavelengths, 105, 108, 113, 114, 116, 238, 247–249, 265, 268, 270–276, 279, 281–283, 286, 288 in a medium, 246 Wave number, 105 Wave particle duality, 294–295, 297 Wave plates, 265 Wave vectors, 105–107, 264, 288 Weiss domains, 189, 190 Wien’s displacement law, 283 Wien’s law of displacement, 292 Works, 24, 35–39, 52, 57, 68, 69, 90, 91, 122, 130–134, 136, 139, 141–143, 157, 160, 161, 164, 165, 169, 174, 181, 184, 212, 219, 285, 286