Hibernate in action [illustrated edition] 9781932394153, 1932394-15-X

Hibernate practically exploded on the Java scene. Why is this open-source tool so popular? Because it automates a tedious


English Pages 431 Year 2005


Table of contents:
contents
foreword
preface
acknowledgments
Roadmap
Code conventions and downloads
About the authors
about Hibernate3 and EJB 3
author online
About the cover illustration
Understanding object/relational persistence
1.1.1 Relational databases
1.1.2 Understanding SQL
1.1.4 Persistence in object-oriented applications
1.2 The paradigm mismatch
1.2.1 The problem of granularity
1.2.2 The problem of subtypes
1.2.3 The problem of identity
1.2.4 Problems relating to associations
1.2.5 The problem of object graph navigation
1.2.6 The cost of the mismatch
1.3 Persistence layers and alternatives
1.3.1 Layered architecture
1.3.2 Hand-coding a persistence layer with SQL/JDBC
1.3.3 Using serialization
1.3.4 Considering EJB entity beans
1.3.5 Object-oriented database systems
1.4 Object/relational mapping
1.4.1 What is ORM?
1.4.2 Generic ORM problems
1.4.3 Why ORM?
1.5 Summary
Introducing and integrating Hibernate
2.1 “Hello World” with Hibernate
2.2 Understanding the architecture
2.2.1 The core interfaces
2.2.3 Types
2.3 Basic configuration
2.3.1 Creating a SessionFactory
2.3.2 Configuration in non-managed environments
2.3.3 Configuration in managed environments
2.4.1 Using XML-based configuration
2.4.2 JNDI-bound SessionFactory
2.4.3 Logging
2.4.4 Java Management Extensions (JMX)
2.5 Summary
Mapping persistent classes
3.1 The CaveatEmptor application
3.1.2 The CaveatEmptor domain model
3.2.1 Addressing leakage of concerns
3.2.2 Transparent and automated persistence
3.2.3 Writing POJOs
3.2.4 Implementing POJO associations
3.2.5 Adding logic to accessor methods
3.3.1 Metadata in XML
3.3.2 Basic property and class mappings
3.3.3 Attribute-oriented programming
3.3.4 Manipulating metadata at runtime
3.4.1 Identity versus equality
3.4.2 Database identity with Hibernate
3.4.3 Choosing primary keys
3.5 Fine-grained object models
3.5.2 Using components
3.6.1 Table per concrete class
3.6.2 Table per class hierarchy
3.6.3 Table per subclass
3.6.4 Choosing a strategy
3.7 Introducing associations
3.7.2 Multiplicity
3.7.3 The simplest possible association
3.7.4 Making the association bidirectional
3.7.5 A parent/child relationship
3.8 Summary
Working with persistent objects
4.1 The persistence lifecycle
4.1.1 Transient objects
4.1.2 Persistent objects
4.1.3 Detached objects
4.1.4 The scope of object identity
4.1.5 Outside the identity scope
4.1.6 Implementing equals() and hashCode()
4.2.1 Making an object persistent
4.2.2 Updating the persistent state of a detached instance
4.2.5 Making a persistent object transient
4.2.6 Making a detached object transient
4.3.1 Persistence by reachability
4.3.2 Cascading persistence with Hibernate
4.3.3 Managing auction categories
4.3.4 Distinguishing between transient and detached instances
4.4 Retrieving objects
4.4.1 Retrieving objects by identifier
4.4.2 Introducing HQL
4.4.3 Query by criteria
4.4.5 Fetching strategies
4.4.6 Selecting a fetching strategy in mappings
4.4.7 Tuning object retrieval
4.5 Summary
Transactions, concurrency, and caching
5.1 Understanding database transactions
5.1.1 JDBC and JTA transactions
5.1.2 The Hibernate Transaction API
5.1.3 Flushing the Session
5.1.4 Understanding isolation levels
5.1.5 Choosing an isolation level
5.1.7 Using pessimistic locking
5.2 Working with application transactions
5.2.1 Using managed versioning
5.2.2 Granularity of a Session
5.2.3 Other ways to implement optimistic locking
5.3 Caching theory and practice
5.3.1 Caching strategies and scopes
5.3.2 The Hibernate cache architecture
5.3.3 Caching in practice
5.4 Summary
Advanced mapping concepts
6.1 Understanding the Hibernate type system
6.1.1 Built-in mapping types
6.1.2 Using mapping types
6.2.1 Sets, bags, lists, and maps
6.3.1 One-to-one associations
6.3.2 Many-to-many associations
6.4.1 Polymorphic many-to-one associations
6.4.2 Polymorphic collections
6.4.3 Polymorphic associations and table-per-concrete-class
6.5 Summary
Retrieving objects efficiently
7.1.1 The query interfaces
7.1.2 Binding parameters
7.1.3 Using named queries
7.2.1 The simplest query
7.2.3 Polymorphic queries
7.2.4 Restriction
7.2.5 Comparison operators
7.2.6 String matching
7.2.7 Logical operators
7.2.8 Ordering query results
7.3 Joining associations
7.3.1 Hibernate join options
7.3.2 Fetching associations
7.3.3 Using aliases with joins
7.3.4 Using implicit joins
7.3.5 Theta-style joins
7.3.6 Comparing identifiers
7.4 Writing report queries
7.4.1 Projection
7.4.2 Using aggregation
7.4.3 Grouping
7.4.4 Restricting groups with having
7.4.5 Improving performance with report queries
7.5.1 Dynamic queries
7.5.2 Collection filters
7.5.3 Subqueries
7.5.4 Native SQL queries
7.6.1 Solving the n+1 selects problem
7.6.2 Using iterate() queries
7.6.3 Caching queries
7.7 Summary
Writing Hibernate applications
8.1 Designing layered applications
8.1.1 Using Hibernate in a servlet engine
8.1.2 Using Hibernate in an EJB container
8.2 Implementing application transactions
8.2.1 Approving a new auction
8.2.2 Doing it the hard way
8.2.3 Using detached persistent objects
8.2.4 Using a long session
8.2.5 Choosing an approach to application transactions
8.3.1 Legacy schemas and composite keys
8.3.2 Audit logging
8.4 Summary
Using the toolset
9.1 Development processes
9.1.4 Meet in the middle
9.2 Automatic schema generation
9.2.1 Preparing the mapping metadata
9.2.2 Creating the schema
9.2.3 Updating the schema
9.3.1 Adding meta-attributes
9.3.2 Generating finders
9.3.3 Configuring hbm2java
9.3.4 Running hbm2java
9.4.1 Starting Middlegen
9.4.2 Restricting tables and relationships
9.4.3 Customizing the metadata generation
9.4.4 Generating hbm2java and XDoclet metadata
9.5.1 Setting value type attributes
9.5.2 Mapping entity associations
9.5.3 Running XDoclet
9.6 Summary
SQL fundamentals
ORM implementation strategies
B.1 Properties or fields?
B.2.1 Inheritance from generated code
B.2.4 Reflection
B.2.5 Runtime bytecode generation
B.2.6 “Generic” objects
Back in the real world
C.1 The strange copy
C.3 We don’t need primary keys
C.5 Dynamically unsafe
C.6 To synchronize or not?
C.7 Really fat client
C.8 Resuming Hibernate
references
index
Hibernate in Action

CHRISTIAN BAUER
GAVIN KING

MANNING
Greenwich (74° w. long.)

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact:

Special Sales Department
Manning Publications Co.
209 Bruce Park Avenue
Greenwich, CT 06830
Fax: (203) 661-9018
email: [email protected]

©2005 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books they publish printed on acid-free paper, and we exert our best efforts to that end.

Manning Publications Co.
209 Bruce Park Avenue
Greenwich, CT 06830

Copyeditor: Tiffany Taylor
Typesetter: Dottie Marsico
Cover designer: Leslie Haimes

ISBN 1932394-15-X
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 – VHG – 07 06 05 04

contents

foreword
preface
acknowledgments
about this book
about Hibernate3 and EJB 3
author online
about the title and cover

1 Understanding object/relational persistence
1.1 What is persistence?
    Relational databases
    Understanding SQL
    Using SQL in Java
    Persistence in object-oriented applications
1.2 The paradigm mismatch
    The problem of granularity
    The problem of subtypes
    The problem of identity
    Problems relating to associations
    The problem of object graph navigation
    The cost of the mismatch
1.3 Persistence layers and alternatives
    Layered architecture
    Hand-coding a persistence layer with SQL/JDBC
    Using serialization
    Considering EJB entity beans
    Object-oriented database systems
    Other options
1.4 Object/relational mapping
    What is ORM?
    Generic ORM problems
    Why ORM?
1.5 Summary

2 Introducing and integrating Hibernate
2.1 “Hello World” with Hibernate
2.2 Understanding the architecture
    The core interfaces
    Callback interfaces
    Types
    Extension interfaces
2.3 Basic configuration
    Creating a SessionFactory
    Configuration in non-managed environments
    Configuration in managed environments
2.4 Advanced configuration settings
    Using XML-based configuration
    JNDI-bound SessionFactory
    Logging
    Java Management Extensions (JMX)
2.5 Summary

3 Mapping persistent classes
3.1 The CaveatEmptor application
    Analyzing the business domain
    The CaveatEmptor domain model
3.2 Implementing the domain model
    Addressing leakage of concerns
    Transparent and automated persistence
    Writing POJOs
    Implementing POJO associations
    Adding logic to accessor methods
3.3 Defining the mapping metadata
    Metadata in XML
    Basic property and class mappings
    Attribute-oriented programming
    Manipulating metadata at runtime
3.4 Understanding object identity
    Identity versus equality
    Database identity with Hibernate
    Choosing primary keys
3.5 Fine-grained object models
    Entity and value types
    Using components
3.6 Mapping class inheritance
    Table per concrete class
    Table per class hierarchy
    Table per subclass
    Choosing a strategy
3.7 Introducing associations
    Managed associations?
    Multiplicity
    The simplest possible association
    Making the association bidirectional
    A parent/child relationship
3.8 Summary

4 Working with persistent objects
4.1 The persistence lifecycle
    Transient objects
    Persistent objects
    Detached objects
    The scope of object identity
    Outside the identity scope
    Implementing equals() and hashCode()
4.2 The persistence manager
    Making an object persistent
    Updating the persistent state of a detached instance
    Retrieving a persistent object
    Updating a persistent object
    Making a persistent object transient
    Making a detached object transient
4.3 Using transitive persistence in Hibernate
    Persistence by reachability
    Cascading persistence with Hibernate
    Managing auction categories
    Distinguishing between transient and detached instances
4.4 Retrieving objects
    Retrieving objects by identifier
    Introducing HQL
    Query by criteria
    Query by example
    Fetching strategies
    Selecting a fetching strategy in mappings
    Tuning object retrieval
4.5 Summary

5 Transactions, concurrency, and caching
5.1 Understanding database transactions
    JDBC and JTA transactions
    The Hibernate Transaction API
    Flushing the Session
    Understanding isolation levels
    Choosing an isolation level
    Setting an isolation level
    Using pessimistic locking
5.2 Working with application transactions
    Using managed versioning
    Granularity of a Session
    Other ways to implement optimistic locking
5.3 Caching theory and practice
    Caching strategies and scopes
    The Hibernate cache architecture
    Caching in practice
5.4 Summary

6 Advanced mapping concepts
6.1 Understanding the Hibernate type system
    Built-in mapping types
    Using mapping types
6.2 Mapping collections of value types
    Sets, bags, lists, and maps
6.3 Mapping entity associations
    One-to-one associations
    Many-to-many associations
6.4 Mapping polymorphic associations
    Polymorphic many-to-one associations
    Polymorphic collections
    Polymorphic associations and table-per-concrete-class
6.5 Summary

7 Retrieving objects efficiently
7.1 Executing queries
    The query interfaces
    Binding parameters
    Using named queries
7.2 Basic queries for objects
    The simplest query
    Using aliases
    Polymorphic queries
    Restriction
    Comparison operators
    String matching
    Logical operators
    Ordering query results
7.3 Joining associations
    Hibernate join options
    Fetching associations
    Using aliases with joins
    Using implicit joins
    Theta-style joins
    Comparing identifiers
7.4 Writing report queries
    Projection
    Using aggregation
    Grouping
    Restricting groups with having
    Improving performance with report queries
7.5 Advanced query techniques
    Dynamic queries
    Collection filters
    Subqueries
    Native SQL queries
7.6 Optimizing object retrieval
    Solving the n+1 selects problem
    Using iterate() queries
    Caching queries
7.7 Summary

8 Writing Hibernate applications
8.1 Designing layered applications
    Using Hibernate in a servlet engine
    Using Hibernate in an EJB container
8.2 Implementing application transactions
    Approving a new auction
    Doing it the hard way
    Using detached persistent objects
    Using a long session
    Choosing an approach to application transactions
8.3 Handling special kinds of data
    Legacy schemas and composite keys
    Audit logging
8.4 Summary

9 Using the toolset
9.1 Development processes
    Top down
    Bottom up
    Middle out (metadata oriented)
    Meet in the middle
    Roundtripping
9.2 Automatic schema generation
    Preparing the mapping metadata
    Creating the schema
    Updating the schema
9.3 Generating POJO code
    Adding meta-attributes
    Generating finders
    Configuring hbm2java
    Running hbm2java
9.4 Existing schemas and Middlegen
    Starting Middlegen
    Restricting tables and relationships
    Customizing the metadata generation
    Generating hbm2java and XDoclet metadata
9.5 XDoclet
    Setting value type attributes
    Mapping entity associations
    Running XDoclet
9.6 Summary

appendix A: SQL fundamentals
appendix B: ORM implementation strategies
    B.1 Properties or fields?
    B.2 Dirty-checking strategies
appendix C: Back in the real world
    C.1 The strange copy
    C.2 The more the better
    C.3 We don’t need primary keys
    C.4 Time isn’t linear
    C.5 Dynamically unsafe
    C.6 To synchronize or not?
    C.7 Really fat client
    C.8 Resuming Hibernate
references
index

foreword

Relational databases are indisputably at the core of the modern enterprise. While modern programming languages, including Java™, provide an intuitive, object-oriented view of application-level business entities, the enterprise data underlying these entities is heavily relational in nature. Further, the main strength of the relational model—over earlier navigational models as well as over later OODB models—is that by design it is intrinsically agnostic to the programmatic manipulation and application-level view of the data that it serves up.

Many attempts have been made to bridge relational and object-oriented technologies, or to replace one with the other, but the gap between the two is one of the hard facts of enterprise computing today. It is this challenge—to provide a bridge between relational data and Java™ objects—that Hibernate takes on through its object/relational mapping (ORM) approach. Hibernate meets this challenge in a very pragmatic, direct, and realistic way.

As Christian Bauer and Gavin King demonstrate in this book, the effective use of ORM technology in all but the simplest of enterprise environments requires understanding and configuring how the mediation between relational data and objects is performed. This demands that the developer be aware and knowledgeable both of the application and its data requirements, and of the SQL query language, relational storage structures, and the potential for optimization that relational technology offers.

Not only does Hibernate provide a full-function solution that meets these requirements head on, it is also a flexible and configurable architecture. Hibernate’s developers designed it with modularity, pluggability, extensibility, and user customization in mind. As a result, in the few years since its initial release, Hibernate has rapidly become one of the leading ORM technologies for enterprise developers—and deservedly so.

This book provides a comprehensive overview of Hibernate. It covers how to use its type mapping capabilities and facilities for modeling associations and inheritance; how to retrieve objects efficiently using the Hibernate query language; how to configure Hibernate for use in both managed and unmanaged environments; and how to use its tools. In addition, throughout the book the authors provide insight into the underlying issues of ORM and into the design choices behind Hibernate. These insights give the reader a deep understanding of the effective use of ORM as an enterprise technology.

Hibernate in Action is the definitive guide to using Hibernate and to object/relational mapping in enterprise computing today.

LINDA DEMICHIEL
Lead Architect, Enterprise JavaBeans
Sun Microsystems

preface

Just because it is possible to push twigs along the ground with one’s nose does not necessarily mean that that is the best way to collect firewood.
—Anthony Berglas

Today, many software developers work with Enterprise Information Systems (EIS). This kind of application creates, manages, and stores structured information and shares this information between many users in multiple physical locations. The storage of EIS data involves massive usage of SQL-based database management systems. Every company we’ve met during our careers uses at least one SQL database; most are completely dependent on relational database technology at the core of their business.

In the past five years, broad adoption of the Java programming language has brought about the ascendancy of the object-oriented paradigm for software development. Developers are now sold on the benefits of object orientation. However, the vast majority of businesses are also tied to long-term investments in expensive relational database systems. Not only are particular vendor products entrenched, but existing legacy data must be made available to (and via) the shiny new object-oriented web applications.

However, the tabular representation of data in a relational system is fundamentally different than the networks of objects used in object-oriented Java applications. This difference has led to the so-called object/relational paradigm mismatch. Traditionally, the importance and cost of this mismatch have been underestimated, and tools for solving the mismatch have been insufficient. Meanwhile, Java developers blame relational technology for the mismatch; data professionals blame object technology.

Object/relational mapping (ORM) is the name given to automated solutions to the mismatch problem. For developers weary of tedious data access code, the good news is that ORM has come of age. Applications built with ORM middleware can be expected to be cheaper, more performant, less vendor-specific, and more able to cope with changes to the internal object or underlying SQL schema. The astonishing thing is that these benefits are now available to Java developers for free.

Gavin King began developing Hibernate in late 2001 when he found that the popular persistence solution at the time—CMP Entity Beans—didn’t scale to nontrivial applications with complex data models. Hibernate began life as an independent, noncommercial open source project.

The Hibernate team (including the authors) has learned ORM the hard way—that is, by listening to user requests and implementing what was needed to satisfy those requests. The result, Hibernate, is a practical solution, emphasizing developer productivity and technical leadership. Hibernate has been used by tens of thousands of users and in many thousands of production applications.

When the demands on their time became overwhelming, the Hibernate team concluded that the future success of the project (and Gavin’s continued sanity) demanded professional developers dedicated full-time to Hibernate. Hibernate joined jboss.org in late 2003 and now has a commercial aspect; you can purchase commercial support and training from JBoss Inc. But commercial training shouldn’t be the only way to learn about Hibernate.

It’s obvious that many, perhaps even most, Java projects benefit from the use of an ORM solution like Hibernate—although this wasn’t obvious a couple of years ago! As ORM technology becomes increasingly mainstream, product documentation such as Hibernate’s free user manual is no longer sufficient. We realized that the Hibernate community and new Hibernate users needed a full-length book, not only to learn about developing software with Hibernate, but also to understand and appreciate the object/relational mismatch and the motivations behind Hibernate’s design.

The book you’re holding was an enormous effort that occupied most of our spare time for more than a year. It was also the source of many heated disputes and learning experiences. We hope this book is an excellent guide to Hibernate (or, “the Hibernate bible,” as one of our reviewers put it) and also the first comprehensive documentation of the object/relational mismatch and ORM in general. We hope you find it helpful and enjoy working with Hibernate.

acknowledgments

Writing (in fact, creating) a book wouldn’t be possible without help. We’d first like to thank the Hibernate community for keeping us on our toes; without your requests for the book, we probably would have given up early on.

A book is only as good as its reviewers, and we had the best. J. B. Rainsberger, Matt Scarpino, Ara Abrahamian, Mark Eagle, Glen Smith, Patrick Peak, Max Rydahl Andersen, Peter Eisentraut, Matt Raible, and Michael A. Koziarski. Thanks for your endless hours of reading our half-finished and raw manuscript. We’d like to thank Emmanuel Bernard for his technical review and Nick Heudecker for his help with the first chapters.

Our team at Manning was invaluable. Clay Andres got this project started, Jackie Carter stayed with us in good and bad times and taught us how to write. Marjan Bace provided the necessary confidence that kept us going. Tiffany Taylor and Liz Welch found all the many mistakes we made in grammar and style. Mary Piergies organized the production of this book. Many thanks for your hard work. Any others at Manning whom we’ve forgotten: You made it possible.

about this book

We introduce the object/relational paradigm mismatch in this book and give you a high-level overview of current solutions for this time-consuming problem. You’ll learn how to use Hibernate as a persistence layer with a richly typed domain object model in a single, continuing example application. This persistence layer implementation covers all entity association, class inheritance, and special type mapping strategies.

We teach you how to tune the Hibernate object query and transaction system for the best performance in highly concurrent multiuser applications. The flexible Hibernate dual-layer caching system is also an important topic in this book. We discuss Hibernate integration in different scenarios and also show you typical architectural problems in two- and three-tiered Java database applications. If you have to work with an existing SQL database, you’ll also be interested in Hibernate’s legacy database integration features and the Hibernate development toolset.
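As a rough illustration of the object/relational paradigm mismatch mentioned above, the following minimal sketch contrasts a small network of objects with the flat rows a relational table would hold. Every class, field, and value here (`User`, `BillingDetails`, `BillingRow`, `toRows`) is a hypothetical name chosen for this sketch and is an assumption, not code from the book or part of Hibernate.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every class and field name here is hypothetical.
final class MismatchSketch {

    // Object side: a user holds direct references to its billing details.
    static final class User {
        final long id;
        final String name;
        final List<BillingDetails> billingDetails = new ArrayList<>();

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static final class BillingDetails {
        final long id;
        final String accountNumber;

        BillingDetails(long id, String accountNumber) {
            this.id = id;
            this.accountNumber = accountNumber;
        }
    }

    // Relational side: a row cannot hold a reference, only a foreign key value.
    static final class BillingRow {
        final long id;
        final String accountNumber;
        final long userId; // foreign key pointing at the owning user's row

        BillingRow(long id, String accountNumber, long userId) {
            this.id = id;
            this.accountNumber = accountNumber;
            this.userId = userId;
        }
    }

    // Flattens the object network into rows, the kind of translation
    // a hand-written persistence layer has to repeat for every class.
    static List<BillingRow> toRows(User user) {
        List<BillingRow> rows = new ArrayList<>();
        for (BillingDetails details : user.billingDetails) {
            rows.add(new BillingRow(details.id, details.accountNumber, user.id));
        }
        return rows;
    }

    public static void main(String[] args) {
        User user = new User(1L, "johndoe");
        user.billingDetails.add(new BillingDetails(10L, "1234-5678"));
        user.billingDetails.add(new BillingDetails(11L, "8765-4321"));
        for (BillingRow row : toRows(user)) {
            System.out.println(row.id + " | " + row.accountNumber + " | " + row.userId);
        }
    }
}
```

The object model navigates from a user to its billing details through references, while the rows only record the relationship as a key value. Translating between the two by hand is the repetitive work that an ORM tool aims to automate.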

Roadmap

Chapter 1 defines object persistence. We discuss why a relational database with a SQL interface is the system for persistent data in today’s applications, and why hand-coded Java persistence layers with JDBC and SQL code are time-consuming and error-prone. After looking at alternative solutions for this problem, we introduce object/relational mapping and talk about the advantages and downsides of this approach.

Chapter 2 gives an architectural overview of Hibernate and shows you the most important application-programming interfaces. We demonstrate Hibernate configuration in managed (and non-managed) J2EE and J2SE environments after looking at a simple “Hello World” application.

Chapter 3 introduces the example application and all kinds of entity and relationship mappings to a database schema, including uni- and bidirectional associations, class inheritance, and composition. You’ll learn how to write Hibernate mapping files and how to design persistent classes.

Chapter 4 teaches you the Hibernate interfaces for read and save operations; we also show you how transitive persistence (persistence by reachability) works in Hibernate. This chapter is focused on loading and storing objects in the most efficient way.

Chapter 5 discusses concurrent data access, with database and long-running application transactions. We introduce the concepts of locking and versioning of data. We also cover caching in general and the Hibernate caching system, which are closely related to concurrent data access.

Chapter 6 completes your understanding of Hibernate mapping techniques with more advanced mapping concepts, such as custom user types, collections of values, and mappings for one-to-one and many-to-many associations. We briefly discuss Hibernate’s fully polymorphic behavior as well.

Chapter 7 introduces the Hibernate Query Language (HQL) and other object-retrieval methods such as the query by criteria (QBC) API, which is a typesafe way to express an object query. We show you how to translate complex search dialogs in your application to a query by example (QBE) query. You’ll get the full power of Hibernate queries by combining these three features; we also show you how to use direct SQL calls for the special cases and how to best optimize query performance.

Chapter 8 describes some basic practices of Hibernate application architecture. This includes handling the SessionFactory, the popular ThreadLocal Session pattern, and encapsulation of the persistence layer functionality in data access objects (DAO) and J2EE commands. We show you how to design long-running application transactions and how to use the innovative detached object support in Hibernate. We also talk about audit logging and legacy database schemas.

Chapter 9 introduces several different development scenarios and tools that may be used in each case. We show you the common technical pitfalls with each approach and discuss the Hibernate toolset (hbm2ddl, hbm2java) and the integration with popular open source tools such as XDoclet and Middlegen.
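The roadmap entry for chapter 8 mentions the ThreadLocal Session pattern. As a hedged sketch of the general idea only, the example below keeps one session object per thread in a `java.lang.ThreadLocal`. `FakeSession`, `currentSession`, and `closeSession` are hypothetical names invented for this sketch; `FakeSession` is a stand-in and not the Hibernate `Session` interface, and the book's own implementation may differ.

```java
// Sketch of a per-thread session holder. FakeSession is a hypothetical
// stand-in for a real persistence session; it is NOT a Hibernate class.
final class ThreadLocalSessionSketch {

    static final class FakeSession {
        private boolean open = true;

        boolean isOpen() {
            return open;
        }

        void close() {
            open = false;
        }
    }

    // Each thread sees its own value in this holder.
    private static final ThreadLocal<FakeSession> CURRENT = new ThreadLocal<>();

    // Returns the calling thread's session, creating it on first use.
    static FakeSession currentSession() {
        FakeSession session = CURRENT.get();
        if (session == null) {
            session = new FakeSession();
            CURRENT.set(session);
        }
        return session;
    }

    // Closes and forgets the calling thread's session, if there is one.
    static void closeSession() {
        FakeSession session = CURRENT.get();
        CURRENT.remove();
        if (session != null) {
            session.close();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FakeSession first = currentSession();
        System.out.println("same thread reuses session: " + (first == currentSession()));

        final FakeSession[] fromOtherThread = new FakeSession[1];
        Thread worker = new Thread(() -> fromOtherThread[0] = currentSession());
        worker.start();
        worker.join();
        System.out.println("other thread gets its own: " + (first != fromOtherThread[0]));

        closeSession();
        System.out.println("still open after close: " + first.isOpen());
    }
}
```

The point of the design is that code anywhere in a request's call chain can ask for the current session without passing it around as a parameter, while separate threads never share one. In a real application the holder would typically be cleared at the end of each request so that pooled threads do not reuse a stale session.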

Who should read this book?

Readers of this book should have basic knowledge of object-oriented software development and should have used this knowledge in practice. To understand the application examples, you should be familiar with the Java programming language and the Unified Modeling Language.

Our primary target audience consists of Java developers who work with SQL-based database systems. We’ll show you how to substantially increase your productivity by leveraging ORM.

If you’re a database developer, the book could be part of your introduction to object-oriented software development.

If you’re a database administrator, you’ll be interested in how ORM affects performance and how you can tune the performance of the SQL database management system and persistence layer to achieve performance targets. Since data access is the bottleneck in most Java applications, this book pays close attention to performance issues. Many DBAs are understandably nervous about entrusting performance to tool-generated SQL code; we seek to allay those fears and also to highlight cases where applications should not use tool-managed data access. You may be relieved to discover that we don’t claim that ORM is the best solution to every problem.

Code conventions and downloads

This book provides copious examples, which include all the Hibernate application artifacts: Java code, Hibernate configuration files, and XML mapping metadata files. Source code in listings or in text is in a fixed-width font like this to separate it from ordinary text. Additionally, Java method names, component parameters, object properties, and XML elements and attributes in text are also presented using fixed-width font.

Java, HTML, and XML can all be verbose. In many cases, the original source code (available online) has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page space in the book. In rare cases, even this was not enough, and listings include line-continuation markers. Additionally, comments in the source code have been removed from the listings.


Code annotations accompany many of the source code listings, highlighting important concepts. In some cases, numbered bullets link to explanations that follow the listing.

Hibernate is an open source project released under the GNU Lesser General Public License (LGPL). Directions for downloading Hibernate, in source or binary form, are available from the Hibernate web site: www.hibernate.org/.

The source code for all CaveatEmptor examples in this book is available from http://caveatemptor.hibernate.org/. The CaveatEmptor example application code is available on this web site in different flavors: for example, for servlet and for EJB deployment, with or without a presentation layer. However, only the standalone persistence layer source package is the recommended companion to this book.

About the authors

Christian Bauer is a member of the Hibernate developer team and is also responsible for the Hibernate web site and documentation. Christian is interested in relational database systems and sound data management in Java applications. He works as a developer and consultant for JBoss Inc. and lives in Frankfurt, Germany.

Gavin King is the founder of the Hibernate project and lead developer. He is an enthusiastic proponent of agile development and open source software. Gavin is helping integrate ORM technology into the J2EE standard as a member of the EJB 3 Expert Group. He is a developer and consultant for JBoss Inc., based in Melbourne, Australia.

about Hibernate3 and EJB 3

The world doesn’t stop turning when you finish writing a book, and getting the book into production takes more time than you could believe. Therefore, some of the information in any technical book becomes quickly outdated, especially when new standards and product versions are already on the horizon.

Hibernate3, an evolutionary new version of Hibernate, was in the early stages of planning and design while this book was being written. By the time the book hits the shelves, there may be an alpha release available. However, the information in this book is valid for Hibernate3; in fact, we consider it to be an essential reference even for the new version. We discuss fundamental concepts that will be found in Hibernate3 and in most ORM solutions. Furthermore, Hibernate3 will be mostly backward compatible with Hibernate 2.1. New features will be added, of course, but you won’t have problems picking them up after reading this book.

Inspired by the success of Hibernate, the EJB 3 Expert Group used several key concepts and APIs from Hibernate in its redesign of entity beans. At the time of writing, only an early draft of the new EJB specification was available; hence we don’t discuss it in this book. However, after reading Hibernate in Action, you’ll know all the fundamentals that will let you quickly understand entity beans in EJB 3.

For more up-to-date information, see the Hibernate road map: www.hibernate.org/About/RoadMap.


author online

Purchase of Hibernate in Action includes free access to a private web forum where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum and subscribe to it, point your web browser to www.manning.com/bauer. This page provides information on how to get on the forum once you are registered, what kind of help is available, and the rules of conduct on the forum. It also provides links to the source code for the examples in the book, errata, and other downloads.

Manning’s commitment to our readers is to provide a venue where a meaningful dialog between individual readers and between readers and the authors can take place. It is not a commitment to any specific amount of participation on the part of the authors, whose contribution to the AO remains voluntary (and unpaid). We suggest you try asking the authors some challenging questions lest their interest stray!


about the title and cover

By combining introductions, overviews, and how-to examples, Manning’s In Action books are designed to help learning and remembering. According to research in cognitive science, the things people remember are things they discover during self-motivated exploration.

Although no one at Manning is a cognitive scientist, we are convinced that for learning to become permanent it must pass through stages of exploration, play, and, interestingly, re-telling of what is being learned. People understand and remember new things, which is to say they master them, only after actively exploring them. Humans learn in action. An essential part of an In Action guide is that it is example-driven. It encourages the reader to try things out, to play with new code, and explore new ideas.

There is another, more mundane, reason for the title of this book: our readers are busy. They use books to do a job or solve a problem. They need books that allow them to jump in and jump out easily and learn just what they want, just when they want it. They need books that aid them in action. The books in this series are designed for such readers.

About the cover illustration

The figure on the cover of Hibernate in Action is a peasant woman from a village in Switzerland, “Paysanne de Schwatzenbourg en Suisse.” The illustration is taken from a French travel book, Encyclopedie des Voyages by J. G. St. Saveur, published in 1796. Travel for pleasure was a relatively new phenomenon at the time and travel guides such as this one were popular, introducing both the tourist and the armchair traveler to the inhabitants of other regions of France and abroad.


The diversity of the drawings in the Encyclopedie des Voyages speaks vividly of the uniqueness and individuality of the world’s towns and provinces just 200 years ago. This was a time when the dress codes of two regions separated by a few dozen miles identified people uniquely as belonging to one or the other. The travel guide brings to life a sense of isolation and distance of that period and of every other historic period except our own hyperkinetic present.

Dress codes have changed since then and the diversity by region, so rich at the time, has faded away. It is now often hard to tell the inhabitant of one continent from another. Perhaps, trying to view it optimistically, we have traded a cultural and visual diversity for a more varied personal life. Or a more varied and interesting intellectual and technical life.

We at Manning celebrate the inventiveness, the initiative, and the fun of the computer business with book covers based on the rich diversity of regional life two centuries ago brought back to life by the pictures from this travel book.

Understanding object/relational persistence

This chapter covers

■ Object persistence with SQL databases
■ The object/relational paradigm mismatch
■ Persistence layers in object-oriented applications
■ Object/relational mapping basics


The approach to managing persistent data has been a key design decision in every software project we’ve worked on. Given that persistent data isn’t a new or unusual requirement for Java applications, you’d expect to be able to make a simple choice among similar, well-established persistence solutions. Think of web application frameworks (Jakarta Struts versus WebWork), GUI component frameworks (Swing versus SWT), or template engines (JSP versus Velocity). Each of the competing solutions has advantages and disadvantages, but they at least share the same scope and overall approach. Unfortunately, this isn’t yet the case with persistence technologies, where we see some wildly differing solutions to the same problem.

For several years, persistence has been a hot topic of debate in the Java community. Many developers don’t even agree on the scope of the problem. Is “persistence” a problem that is already solved by relational technology and extensions such as stored procedures, or is it a more pervasive problem that must be addressed by special Java component models such as EJB entity beans? Should we hand-code even the most primitive CRUD (create, read, update, delete) operations in SQL and JDBC, or should this work be automated? How do we achieve portability if every database management system has its own SQL dialect? Should we abandon SQL completely and adopt a new database technology, such as object database systems? Debate continues, but recently a solution called object/relational mapping (ORM) has met with increasing acceptance. Hibernate is an open source ORM implementation.

Hibernate is an ambitious project that aims to be a complete solution to the problem of managing persistent data in Java. It mediates the application’s interaction with a relational database, leaving the developer free to concentrate on the business problem at hand. Hibernate is a non-intrusive solution. By this we mean you aren’t required to follow many Hibernate-specific rules and design patterns when writing your business logic and persistent classes; thus, Hibernate integrates smoothly with most new and existing applications and doesn’t require disruptive changes to the rest of the application.

This book is about Hibernate. We’ll cover basic and advanced features and describe some recommended ways to develop new applications using Hibernate. Often, these recommendations won’t be specific to Hibernate; sometimes they will be our ideas about the best ways to do things when working with persistent data, explained in the context of Hibernate. Before we can get started with Hibernate, however, you need to understand the core problems of object persistence and object/relational mapping. This chapter explains why tools like Hibernate are needed.


First, we define persistent data management in the context of object-oriented applications and discuss the relationship of SQL, JDBC, and Java, the underlying technologies and standards that Hibernate is built on. We then discuss the so-called object/relational paradigm mismatch and the generic problems we encounter in object-oriented software development with relational databases. As this list of problems grows, it becomes apparent that we need tools and patterns to minimize the time we have to spend on the persistence-related code of our applications. After we look at alternative tools and persistence mechanisms, you’ll see that ORM is the best available solution for many scenarios. Our discussion of the advantages and drawbacks of ORM gives you the full background to make the best decision when picking a persistence solution for your own project.

The best way to learn isn’t necessarily linear. We understand that you probably want to try Hibernate right away. If this is how you’d like to proceed, skip to chapter 2, section 2.1, “Getting started,” where we jump in and start coding a (small) Hibernate application. You’ll be able to understand chapter 2 without reading this chapter, but we also recommend that you return here at some point as you circle through the book. That way, you’ll be prepared and have all the background concepts you need for the rest of the material.

1.1 What is persistence?

Almost all applications require persistent data. Persistence is one of the fundamental concepts in application development. If an information system didn’t preserve data entered by users when the host machine was powered off, the system would be of little practical use. When we talk about persistence in Java, we’re normally talking about storing data in a relational database using SQL. We start by taking a brief look at the technology and how we use it with Java. Armed with that information, we then continue our discussion of persistence and how it’s implemented in object-oriented applications.

1.1.1 Relational databases

You, like most other developers, have probably worked with a relational database. In fact, most of us use a relational database every day. Relational technology is a known quantity. This alone is sufficient reason for many organizations to choose it. But to say only this is to pay less respect than is due. Relational databases are so entrenched not by accident but because they’re an incredibly flexible and robust approach to data management.


A relational database management system isn’t specific to Java, and a relational database isn’t specific to a particular application. Relational technology provides a way of sharing data among different applications or among different technologies that form part of the same application (the transactional engine and the reporting engine, for example). Relational technology is a common denominator of many disparate systems and technology platforms. Hence, the relational data model is often the common enterprise-wide representation of business entities.

Relational database management systems have SQL-based application programming interfaces; hence we call today’s relational database products SQL database management systems or, when we’re talking about particular systems, SQL databases.

1.1.2 Understanding SQL

To use Hibernate effectively, a solid understanding of the relational model and SQL is a prerequisite. You’ll need to use your knowledge of SQL to tune the performance of your Hibernate application. Hibernate will automate many repetitive coding tasks, but your knowledge of persistence technology must extend beyond Hibernate itself if you want to take advantage of the full power of modern SQL databases. Remember that the underlying goal is robust, efficient management of persistent data.

Let’s review some of the SQL terms used in this book. You use SQL as a data definition language (DDL) to create a database schema with CREATE and ALTER statements. After creating tables (and indexes, sequences, and so on), you use SQL as a data manipulation language (DML). With DML, you execute SQL operations that manipulate and retrieve data. The manipulation operations include insertion, update, and deletion. You retrieve data by executing queries with restriction, projection, and join operations (including the Cartesian product). For efficient reporting, you use SQL to group, order, and aggregate data in arbitrary ways. You can even nest SQL statements inside each other; this technique is called subselecting.

You have probably used SQL for many years and are familiar with the basic operations and statements written in this language. Still, we know from our own experience that SQL is sometimes hard to remember and that some terms vary in usage. To understand this book, we have to use the same terms and concepts; so, we advise you to read appendix A if any of the terms we’ve mentioned are new or unclear.

SQL knowledge is mandatory for sound Java database application development. If you need more material, get a copy of the excellent book SQL Tuning by Dan Tow [Tow 2003]. Also read An Introduction to Database Systems [Date 2004] for the theory, concepts, and ideals of (relational) database systems.
Although the relational database is one part of ORM, the other part, of course, consists of the objects in your Java application that need to be persisted to the database using SQL.

1.1.3 Using SQL in Java

When you work with an SQL database in a Java application, the Java code issues SQL statements to the database via the Java DataBase Connectivity (JDBC) API. The SQL itself might have been written by hand and embedded in the Java code, or it might have been generated on the fly by Java code. You use the JDBC API to bind arguments to query parameters, initiate execution of the query, scroll through the query result table, retrieve values from the result set, and so on. These are low-level data access tasks; as application developers, we’re more interested in the business problem that requires this data access. It isn’t clear that we should be concerning ourselves with such tedious, mechanical details.

What we’d really like to be able to do is write code that saves and retrieves complex objects (the instances of our classes) to and from the database, relieving us of this low-level drudgery.

Since the data access tasks are often so tedious, we have to ask: Are the relational data model and (especially) SQL the right choices for persistence in object-oriented applications? We answer this question immediately: Yes! There are many reasons why SQL databases dominate the computing industry. Relational database management systems are the only proven data management technology and are almost always a requirement in any Java project.

However, for the last 15 years, developers have spoken of a paradigm mismatch. This mismatch explains why so much effort is expended on persistence-related concerns in every enterprise project. The paradigms referred to are object modeling and relational modeling, or perhaps object-oriented programming and SQL. Let’s begin our exploration of the mismatch problem by asking what persistence means in the context of object-oriented application development. First we’ll widen the simplistic definition of persistence stated at the beginning of this section to a broader, more mature understanding of what is involved in maintaining and using persistent data.

1.1.4 Persistence in object-oriented applications

In an object-oriented application, persistence allows an object to outlive the process that created it. The state of the object may be stored to disk and an object with the same state re-created at some point in the future. This application isn’t limited to single objects; entire graphs of interconnected objects may be made persistent and later re-created in a new process. Most objects aren’t persistent; a transient object has a limited lifetime that is bounded by the life of the process that instantiated it. Almost all Java applications contain a mix of persistent and transient objects; hence we need a subsystem that manages our persistent data.

Modern relational databases provide a structured representation of persistent data, enabling sorting, searching, and aggregation of data. Database management systems are responsible for managing concurrency and data integrity; they’re responsible for sharing data between multiple users and multiple applications. A database management system also provides data-level security. When we discuss persistence in this book, we’re thinking of all these things:

■ Storage, organization, and retrieval of structured data
■ Concurrency and data integrity
■ Data sharing

In particular, we’re thinking of these problems in the context of an object-oriented application that uses a domain model. An application with a domain model doesn’t work directly with the tabular representation of the business entities; the application has its own, object-oriented model of the business entities. If the database has ITEM and BID tables, the Java application defines Item and Bid classes.

Then, instead of directly working with the rows and columns of an SQL result set, the business logic interacts with this object-oriented domain model and its runtime realization as a graph of interconnected objects. The business logic is never executed in the database (as an SQL stored procedure); it’s implemented in Java. This allows business logic to make use of sophisticated object-oriented concepts such as inheritance and polymorphism. For example, we could use well-known design patterns such as Strategy, Mediator, and Composite [GOF 1995], all of which depend on polymorphic method calls.

Now a caveat: Not all Java applications are designed this way, nor should they be. Simple applications might be much better off without a domain model. SQL and the JDBC API are perfectly serviceable for dealing with pure tabular data, and the new JDBC RowSet (Sun JCP, JSR 114) makes CRUD operations even easier. Working with a tabular representation of persistent data is straightforward and well understood.

However, in the case of applications with nontrivial business logic, the domain model helps to improve code reuse and maintainability significantly. We focus on applications with a domain model in this book, since Hibernate and ORM in general are most relevant to this kind of application.
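As a rough illustration of what working with a domain model rather than rows and columns can look like, the following sketch defines Item and Bid classes for the ITEM and BID tables mentioned above. Only the class names come from the text. The fields, the placeBid() and getHighestBid() methods, and the rule that a new bid must exceed the current highest bid are assumptions made for this example; they aren’t taken from the CaveatEmptor code.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical domain class for the BID table.
class Bid {
    private final Item item;
    private final BigDecimal amount;

    Bid(Item item, BigDecimal amount) {
        this.item = item;
        this.amount = amount;
    }

    Item getItem() { return item; }
    BigDecimal getAmount() { return amount; }
}

// Hypothetical domain class for the ITEM table.
class Item {
    private final String name;
    private final List<Bid> bids = new ArrayList<>();

    Item(String name) { this.name = name; }

    String getName() { return name; }

    List<Bid> getBids() { return Collections.unmodifiableList(bids); }

    // Business logic lives in the domain model, not in SQL.
    // Assumed rule: a new bid must be strictly higher than the highest bid so far.
    Bid placeBid(BigDecimal amount) {
        Bid highest = getHighestBid();
        if (highest != null && amount.compareTo(highest.getAmount()) <= 0) {
            throw new IllegalArgumentException("Bid too low: " + amount);
        }
        Bid bid = new Bid(this, amount);
        bids.add(bid);
        return bid;
    }

    // Returns the bid with the largest amount, or null if there are no bids.
    Bid getHighestBid() {
        Bid highest = null;
        for (Bid bid : bids) {
            if (highest == null || bid.getAmount().compareTo(highest.getAmount()) > 0) {
                highest = bid;
            }
        }
        return highest;
    }
}
```

The business logic here navigates a graph of objects (an Item and its Bids) and never touches a result set; how that graph gets to and from the database is a separate concern.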


If we consider SQL and relational databases again, we finally observe the mismatch between the two paradigms. SQL operations such as projection and join always result in a tabular representation of the resulting data. This is quite different from the graph of interconnected objects used to execute the business logic in a Java application! These are fundamentally different models, not just different ways of visualizing the same model.

With this realization, we can begin to see the problems (some well understood and some less well understood) that must be solved by an application that combines both data representations: an object-oriented domain model and a persistent relational model. Let’s take a closer look.

1.2 The paradigm mismatch

The paradigm mismatch can be broken down into several parts, which we’ll examine one at a time. Let’s start our exploration with a simple example that is problem free. Then, as we build on it, you’ll begin to see the mismatch appear.

Suppose you have to design and implement an online e-commerce application. In this application, you’d need a class to represent information about a user of the system, and another class to represent information about the user’s billing details, as shown in figure 1.1.

Figure 1.1 A simple UML class diagram of the user and billing details entities [class diagram: User, 1..* BillingDetails]

Looking at this diagram, you see that a User has many BillingDetails. You can navigate the relationship between the classes in both directions. To begin with, the classes representing these entities might be extremely simple:

    import java.util.Set;

    public class User {
        private String userName;
        private String name;
        private String address;
        private Set billingDetails;

        // accessor methods (get/set pairs), business methods, etc.
        ...
    }

    public class BillingDetails {
        private String accountNumber;
        private String accountName;
        private String accountType;
        private User user;

        // methods, get/set pairs...
        ...
    }

Note that we’re only interested in the state of the entities with regard to persistence, so we’ve omitted the implementation of property accessors and business methods (such as getUserName() or billAuction()). It’s quite easy to come up with a good SQL schema design for this case:

    create table USER (
        USERNAME VARCHAR(15) NOT NULL PRIMARY KEY,
        NAME VARCHAR(50) NOT NULL,
        ADDRESS VARCHAR(100)
    )

    create table BILLING_DETAILS (
        ACCOUNT_NUMBER VARCHAR(10) NOT NULL PRIMARY KEY,
        ACCOUNT_NAME VARCHAR(50) NOT NULL,
        ACCOUNT_TYPE VARCHAR(2) NOT NULL,
        USERNAME VARCHAR(15) FOREIGN KEY REFERENCES USER
    )

The relationship between the two entities is represented as the foreign key, USERNAME, in BILLING_DETAILS. For this simple object model, the object/relational mismatch is barely in evidence; it’s straightforward to write JDBC code to insert, update, and delete information about user and billing details.

Now, let’s see what happens when we consider something a little more realistic. The paradigm mismatch will be visible when we add more entities and entity relationships to our application.

The most glaringly obvious problem with our current implementation is that we’ve modeled an address as a simple String value. In most systems, it’s necessary to store street, city, state, country, and ZIP code information separately. Of course, we could add these properties directly to the User class, but since it’s highly likely that other classes in the system will also carry address information, it makes more sense to create a separate Address class. The updated object model is shown in figure 1.2.

Figure 1.2 The User has an Address. [class diagram: Address, User, 1..* BillingDetails]

Should we also add an ADDRESS table? Not necessarily. It’s common to keep address information in the USER table, in individual columns. This design is likely to perform better, since we don’t require a table join to retrieve the user and address in a single query. The nicest solution might even be to create a user-defined SQL data type to represent addresses and to use a single column of that new type in the USER table instead of several new columns.

Basically, we have the choice of adding either several columns or a single column (of a new SQL data type). This is clearly a problem of granularity.

1.2.1 The problem of granularity

Granularity refers to the relative size of the objects you’re working with. When we’re talking about Java objects and database tables, the granularity problem means persisting objects that can have various kinds of granularity to tables and columns that are inherently limited in granularity.

Let’s return to our example. Adding a new data type to store Address Java objects in a single column to our database catalog sounds like the best approach. After all, a new Address type (class) in Java and a new ADDRESS SQL data type should guarantee interoperability. However, you’ll find various problems if you check the support for user-defined column types (UDT) in today’s SQL database management systems.

UDT support is one of a number of so-called object-relational extensions to traditional SQL. Unfortunately, UDT support is a somewhat obscure feature of most SQL database management systems and certainly isn’t portable between different systems. The SQL standard supports user-defined data types, but very poorly. For this reason and (whatever) other reasons, use of UDTs isn’t common practice in the industry at this time, and it’s unlikely that you’ll encounter a legacy schema that makes extensive use of UDTs. We therefore can’t store objects of our new Address class in a single new column of an equivalent user-defined SQL data type. Our solution for this problem has several columns, of vendor-defined SQL types (such as boolean, numeric, and string data types). Considering the granularity of our tables again, the USER table is usually defined as follows:

    create table USER (
        USERNAME VARCHAR(15) NOT NULL PRIMARY KEY,
        NAME VARCHAR(50) NOT NULL,
        ADDRESS_STREET VARCHAR(50),
        ADDRESS_CITY VARCHAR(15),
        ADDRESS_STATE VARCHAR(15),
        ADDRESS_ZIPCODE VARCHAR(5),
        ADDRESS_COUNTRY VARCHAR(15)
    )

This leads to the following observation: Classes in our domain object model come in a range of different levels of granularity, from coarse-grained entity classes like User, to finer grained classes like Address, right down to simple String-valued properties such as zipcode.
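The range of granularity on the Java side can be sketched as follows: a coarse-grained User entity, a finer-grained Address class, and plain String properties, all of which would end up in the single USER table shown above. The field names mirror the ADDRESS_* columns; the constructors and accessors are assumptions added for illustration.

```java
// Fine-grained class: it has no table of its own. Its fields would be
// stored in the ADDRESS_* columns of the USER table.
class Address {
    private final String street;
    private final String city;
    private final String state;
    private final String zipcode;
    private final String country;

    Address(String street, String city, String state,
            String zipcode, String country) {
        this.street = street;
        this.city = city;
        this.state = state;
        this.zipcode = zipcode;
        this.country = country;
    }

    String getStreet() { return street; }
    String getCity() { return city; }
    String getState() { return state; }
    String getZipcode() { return zipcode; }
    String getCountry() { return country; }
}

// Coarse-grained entity class: one instance corresponds to one USER row.
class User {
    private final String userName;
    private final String name;
    private Address address; // finer-grained than User, coarser than a String

    User(String userName, String name) {
        this.userName = userName;
        this.name = name;
    }

    String getUserName() { return userName; }
    String getName() { return name; }
    Address getAddress() { return address; }
    void setAddress(Address address) { this.address = address; }
}
```

Three levels are visible here (User, Address, String), and because Address is a class of its own, any other class that carries address information could reuse it.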

In contrast, just two levels of granularity are visible at the level of the database: tables such as USER, along with scalar columns such as ADDRESS_ZIPCODE. This obviously isn’t as flexible as our Java type system. Many simple persistence mechanisms fail to recognize this mismatch and so end up forcing the less flexible representation upon the object model. We’ve seen countless User classes with properties named zipcode!

It turns out that the granularity problem isn’t especially difficult to solve. Indeed, we probably wouldn’t even list it, were it not for the fact that it’s visible in so many existing systems. We describe the solution to this problem in chapter 3, section 3.5, “Fine-grained object models.”

A much more difficult and interesting problem arises when we consider domain object models that use inheritance, a feature of object-oriented design we might use to bill the users of our e-commerce application in new and interesting ways.

1.2.2 The problem of subtypes

In Java, we implement inheritance using super- and subclasses. To illustrate why this can present a mismatch problem, let’s continue to build our example. Let’s add to our e-commerce application so that we now can accept not only bank account billing, but also credit and debit cards. We therefore have several methods to bill a user account. The most natural way to reflect this change in our object model is to use inheritance for the BillingDetails class.

We might have an abstract BillingDetails superclass along with several concrete subclasses: CreditCard, DirectDebit, Cheque, and so on. Each of these subclasses will define slightly different data (and completely different functionality that acts upon that data). The UML class diagram in figure 1.3 illustrates this object model.

We notice immediately that SQL provides no direct support for inheritance. We can’t declare that a CREDIT_CARD_DETAILS table is a subtype of BILLING_DETAILS by writing, say, CREATE TABLE CREDIT_CARD_DETAILS EXTENDS BILLING_DETAILS (...).

Figure 1.3 Using inheritance for different billing strategies
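The hierarchy in figure 1.3 might be sketched in Java as follows. Only the class names BillingDetails, CreditCard, and DirectDebit come from the text; the fields and the describe() method are hypothetical stand-ins for the slightly different data and the different functionality each subclass defines.

```java
// Abstract superclass holding the state shared by all billing strategies.
abstract class BillingDetails {
    private final String accountNumber;
    private final String accountName;

    protected BillingDetails(String accountNumber, String accountName) {
        this.accountNumber = accountNumber;
        this.accountName = accountName;
    }

    String getAccountNumber() { return accountNumber; }
    String getAccountName() { return accountName; }

    // Hypothetical method: each subclass supplies its own behavior.
    abstract String describe();
}

// Concrete subclass with its own additional data (assumed field).
class CreditCard extends BillingDetails {
    private final String expiry;

    CreditCard(String accountNumber, String accountName, String expiry) {
        super(accountNumber, accountName);
        this.expiry = expiry;
    }

    @Override
    String describe() {
        return "Credit card " + getAccountNumber() + " expiring " + expiry;
    }
}

// Another concrete subclass with different data (assumed field).
class DirectDebit extends BillingDetails {
    private final String bankCode;

    DirectDebit(String accountNumber, String accountName, String bankCode) {
        super(accountNumber, accountName);
        this.bankCode = bankCode;
    }

    @Override
    String describe() {
        return "Direct debit from account " + getAccountNumber()
                + " at bank " + bankCode;
    }
}
```

Code that holds a reference of type BillingDetails can call describe() without knowing which subclass it is dealing with; a plain SQL schema has no comparable construct for declaring that one table extends another.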


In chapter 3, section 3.6, “Mapping class inheritance,” we discuss how object/relational mapping solutions such as Hibernate solve the problem of persisting a class hierarchy to a database table or tables. This problem is now quite well understood in the community, and most solutions support approximately the same functionality. But we aren’t quite finished with inheritance: as soon as we introduce inheritance into the object model, we have the possibility of polymorphism.

The User class has an association to the BillingDetails superclass. This is a polymorphic association. At runtime, a User object might be associated with an instance of any of the subclasses of BillingDetails. Similarly, we’d like to be able to write queries that refer to the BillingDetails class and have the query return instances of its subclasses. This feature is called polymorphic queries.

Since SQL databases don’t provide a notion of inheritance, it’s hardly surprising that they also lack an obvious way to represent a polymorphic association. A standard foreign key constraint refers to exactly one table; it isn’t straightforward to define a foreign key that refers to multiple tables. We might explain this by saying that Java (and other object-oriented languages) is less strictly typed than SQL. Fortunately, two of the inheritance mapping solutions we show in chapter 3 are designed to accommodate the representation of polymorphic associations and efficient execution of polymorphic queries.

So, the mismatch of subtypes is one in which the inheritance structure in your Java model must be persisted in an SQL database that doesn’t offer an inheritance strategy. The next aspect of the mismatch problem is the issue of object identity. You probably noticed that we defined USERNAME as the primary key of our USER table. Was that a good choice? Not really, as you’ll see next.

1.2.3 The problem of identity

Although the problem of object identity might not be obvious at first, we’ll encounter it often in our growing and expanding example e-commerce system. This problem can be seen when we consider two objects (for example, two Users) and check if they’re identical. There are three ways to tackle this problem, two in the Java world and one in our SQL database. As expected, they work together only with some help.

Java objects define two different notions of sameness:

■ Object identity (roughly equivalent to memory location, checked with a==b)

■ Equality as determined by the implementation of the equals() method (also called equality by value)
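The two Java notions can be seen side by side in a small sketch. The User class below is a simplified, hypothetical version of the example class: its id field, the choice to compare usernames in equals(), and the sameRow() helper are assumptions made for illustration, and the id anticipates the database identity that the text turns to next.

```java
import java.util.Objects;

// Hypothetical persistent class; the "id" field is an assumed surrogate identifier.
final class User {
    private final Long id;          // would hold the primary key value of the row
    private final String username;

    User(Long id, String username) {
        this.id = id;
        this.username = username;
    }

    Long getId() { return id; }

    // Equality by value: in this sketch, two users are equal if their usernames match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof User)) return false;
        return username.equals(((User) other).username);
    }

    @Override
    public int hashCode() {
        return username.hashCode();
    }
}

class IdentityDemo {
    // Database identity: do both objects carry the same primary key value?
    static boolean sameRow(User a, User b) {
        return Objects.equals(a.getId(), b.getId());
    }

    public static void main(String[] args) {
        User first = new User(123L, "jdoe");
        User second = new User(123L, "jdoe"); // a second object for the same row

        System.out.println(first == second);        // false: two distinct objects
        System.out.println(first.equals(second));   // true: equal by value
        System.out.println(sameRow(first, second)); // true: same key value
    }
}
```

The three checks can disagree for the same pair of objects, which is why the notions have to be reconciled deliberately rather than assumed to coincide.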


On the other hand, the identity of a database row is expressed as the primary key value. As you’ll see in section 3.4, “Understanding object identity,” neither equals() nor == is naturally equivalent to the primary key value. It’s common for several (nonidentical) objects to simultaneously represent the same row of the database. Furthermore, some subtle difficulties are involved in implementing equals() correctly for a persistent class.

Let’s discuss another problem related to database identity with an example. In our table definition for USER, we’ve used USERNAME as a primary key. Unfortunately, this decision makes it difficult to change a username: We’d need to update not only the USERNAME column in USER, but also the foreign key column in BILLING_DETAILS. So, later in the book, we’ll recommend that you use surrogate keys wherever possible. A surrogate key is a primary key column with no meaning to the user. For example, we might change our table definitions to look like this:

    create table USER (
        USER_ID BIGINT NOT NULL PRIMARY KEY,
        USERNAME VARCHAR(15) NOT NULL UNIQUE,
        NAME VARCHAR(50) NOT NULL,
        ...
    )
    create table BILLING_DETAILS (
        BILLING_DETAILS_ID BIGINT NOT NULL PRIMARY KEY,
        ACCOUNT_NUMBER VARCHAR(10) NOT NULL UNIQUE,
        ACCOUNT_NAME VARCHAR(50) NOT NULL,
        ACCOUNT_TYPE VARCHAR(2) NOT NULL,
        USER_ID BIGINT FOREIGN KEY REFERENCES USER
    )

The USER_ID and BILLING_DETAILS_ID columns contain system-generated values. These columns were introduced purely for the benefit of the relational data model. How (if at all) should they be represented in the object model? We’ll discuss this question in section 3.4 and find a solution with object/relational mapping.

In the context of persistence, identity is closely related to how the system handles caching and transactions. Different persistence solutions have chosen various strategies, and this has been an area of confusion. We cover all these interesting topics, and show how they’re related, in chapter 5.

The skeleton e-commerce application we’ve designed and implemented has served our purpose well. We’ve identified the mismatch problems with mapping granularity, subtypes, and object identity. We’re almost ready to move on to other parts of the application. But first, we need to discuss the important concept of associations, that is, how the relationships between our classes are mapped and handled. Is the foreign key in the database all we need?


1.2.4 Problems relating to associations

In our object model, associations represent the relationships between entities. You remember that the User, Address, and BillingDetails classes are all associated. Unlike Address, BillingDetails stands on its own. BillingDetails objects are stored in their own table. Association mapping and the management of entity associations are central concepts of any object persistence solution.

Object-oriented languages represent associations using object references and collections of object references. In the relational world, an association is represented as a foreign key column, with copies of key values in several tables. There are subtle differences between the two representations.

Object references are inherently directional; the association is from one object to the other. If an association between objects should be navigable in both directions, you must define the association twice, once in each of the associated classes. You’ve already seen this in our object model classes:

    public class User {
        private Set billingDetails;
        ...
    }
    public class BillingDetails {
        private User user;
        ...
    }
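Because the association is defined twice, both ends have to be kept consistent by application code. The following is a minimal sketch of one way to do that; the addBillingDetails() convenience method and the accessors are our own hypothetical additions, and the sketch uses generic collections for brevity where the listing above uses a raw Set.

```java
import java.util.HashSet;
import java.util.Set;

class User {
    private final Set<BillingDetails> billingDetails = new HashSet<>();

    Set<BillingDetails> getBillingDetails() { return billingDetails; }

    // Hypothetical convenience method: sets both ends of the association together.
    void addBillingDetails(BillingDetails details) {
        billingDetails.add(details); // User -> BillingDetails direction
        details.setUser(this);       // BillingDetails -> User direction
    }
}

class BillingDetails {
    private User user;

    User getUser() { return user; }
    void setUser(User user) { this.user = user; }
}

class AssociationDemo {
    public static void main(String[] args) {
        User user = new User();
        BillingDetails details = new BillingDetails();
        user.addBillingDetails(details);

        System.out.println(user.getBillingDetails().contains(details)); // true
        System.out.println(details.getUser() == user);                  // true
    }
}
```

If only one of the two assignments were made, the object graph would be navigable in one direction only, even though a single foreign key value would describe the same relationship in the database.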

On the other hand, foreign key associations aren’t by nature directional. In fact, navigation has no meaning for a relational data model, because you can create arbitrary data associations with table joins and projection.

Actually, it isn’t possible to determine the multiplicity of a unidirectional association by looking only at the Java classes. Java associations may have many-to-many multiplicity. For example, our object model might have looked like this:

    public class User {
        private Set billingDetails;
        ...
    }
    public class BillingDetails {
        private Set users;
        ...
    }

Table associations, on the other hand, are always one-to-many or one-to-one. You can see the multiplicity immediately by looking at the foreign key definition. The following is a one-to-many association (or, if read in that direction, a many-to-one):

    USER_ID BIGINT FOREIGN KEY REFERENCES USER

These are one-to-one associations:

    USER_ID BIGINT UNIQUE FOREIGN KEY REFERENCES USER
    BILLING_DETAILS_ID BIGINT PRIMARY KEY FOREIGN KEY REFERENCES USER

If you wish to represent a many-to-many association in a relational database, you must introduce a new table, called a link table. This table doesn’t appear anywhere in the object model. For our example, if we consider the relationship between a user and the user’s billing information to be many-to-many, the link table is defined as follows:

    CREATE TABLE USER_BILLING_DETAILS (
        USER_ID BIGINT FOREIGN KEY REFERENCES USER,
        BILLING_DETAILS_ID BIGINT FOREIGN KEY REFERENCES BILLING_DETAILS,
        PRIMARY KEY (USER_ID, BILLING_DETAILS_ID)
    )
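To make the role of the link table concrete, the sketch below flattens a many-to-many association held in Java collections into the (USER_ID, BILLING_DETAILS_ID) pairs that such a table would contain. It is an illustration under our own assumptions: the LinkTableDemo class, the toLinkRows() method, and the use of plain identifier values in place of full objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LinkTableDemo {
    // Turns "user id -> set of billing details ids" into one row per pair,
    // which is the shape of a link table such as USER_BILLING_DETAILS.
    static List<long[]> toLinkRows(Map<Long, Set<Long>> billingDetailsIdsByUserId) {
        List<long[]> rows = new ArrayList<>();
        for (Map.Entry<Long, Set<Long>> entry : billingDetailsIdsByUserId.entrySet()) {
            for (Long billingDetailsId : entry.getValue()) {
                rows.add(new long[] { entry.getKey(), billingDetailsId });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> association = new LinkedHashMap<>();
        association.put(1L, new LinkedHashSet<>(Arrays.asList(10L, 11L)));
        association.put(2L, new LinkedHashSet<>(Arrays.asList(10L))); // shared with user 1

        for (long[] row : toLinkRows(association)) {
            System.out.println("USER_ID=" + row[0] + ", BILLING_DETAILS_ID=" + row[1]);
        }
    }
}
```

Each pair appears once, which matches the composite primary key in the table definition above; nothing in the Java classes themselves corresponds to these rows.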

We’ll discuss association mappings in great detail in chapters 3 and 6.

So far, the issues we’ve considered are mainly structural. We can see them by considering a purely static view of the system. Perhaps the most difficult problem in object persistence is a dynamic one. It concerns associations, and we’ve already hinted at it when we drew a distinction between object graph navigation and table joins in section 1.1.4, “Persistence in object-oriented applications.” Let’s explore this significant mismatch problem in more depth.

1.2.5 The problem of object graph navigation

There is a fundamental difference in the way you access objects in Java and in a relational database. In Java, when you access the billing information of a user, you call aUser.getBillingDetails().getAccountNumber(). This is the most natural way to access object-oriented data and is often described as walking the object graph. You navigate from one object to another, following associations between instances. Unfortunately, this isn’t an efficient way to retrieve data from an SQL database.

The single most important thing to do to improve performance of data access code is to minimize the number of requests to the database. The most obvious way to do this is to minimize the number of SQL queries. (Other ways include using stored procedures or the JDBC batch API.)

Therefore, efficient access to relational data using SQL usually requires the use of joins between the tables of interest. The number of tables included in the join determines the depth of the object graph you can navigate. For example, if we need to retrieve a User and aren’t interested in the user’s BillingDetails, we use this simple query:


    select * from USER u where u.USER_ID = 123

On the other hand, if we need to retrieve the same User and then subsequently visit each of the associated BillingDetails instances, we use a different query:

    select *
    from USER u
    left outer join BILLING_DETAILS bd on bd.USER_ID = u.USER_ID
    where u.USER_ID = 123
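One rough way to compare a join with navigating the graph object by object is to count the statements each style sends. The sketch below does this for several users rather than one. It is neither JDBC nor Hibernate code: CountingDatabase is a hypothetical in-memory stand-in whose methods merely represent the SQL statements named in the comments, and all class and method names are our assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a database; it only counts statements.
class CountingDatabase {
    private final Map<Long, List<String>> accountNumbersByUserId = new LinkedHashMap<>();
    private int statementCount = 0;

    void addUser(long userId, String... accountNumbers) {
        accountNumbersByUserId.put(userId, Arrays.asList(accountNumbers));
    }

    // Stands in for: select USER_ID from USER
    List<Long> selectUserIds() {
        statementCount++;
        return new ArrayList<>(accountNumbersByUserId.keySet());
    }

    // Stands in for: select ACCOUNT_NUMBER from BILLING_DETAILS where USER_ID = ?
    List<String> selectAccountNumbers(long userId) {
        statementCount++;
        return accountNumbersByUserId.get(userId);
    }

    // Stands in for a single statement that joins USER and BILLING_DETAILS
    Map<Long, List<String>> selectUsersWithAccountNumbers() {
        statementCount++;
        return new LinkedHashMap<>(accountNumbersByUserId);
    }

    int getStatementCount() { return statementCount; }
    void resetStatementCount() { statementCount = 0; }
}

class FetchDemo {
    // Piecemeal access: one statement for the users, then one more per user.
    static int statementsForPiecemealAccess(CountingDatabase db) {
        db.resetStatementCount();
        for (Long userId : db.selectUserIds()) {
            db.selectAccountNumbers(userId);
        }
        return db.getStatementCount();
    }

    // Join-based access: everything arrives with a single statement.
    static int statementsForJoin(CountingDatabase db) {
        db.resetStatementCount();
        db.selectUsersWithAccountNumbers();
        return db.getStatementCount();
    }

    public static void main(String[] args) {
        CountingDatabase db = new CountingDatabase();
        db.addUser(1L, "A-100", "A-101");
        db.addUser(2L, "B-200");
        db.addUser(3L, "C-300");

        System.out.println(statementsForPiecemealAccess(db)); // 4: one plus three users
        System.out.println(statementsForJoin(db));            // 1
    }
}
```

With three users the piecemeal style issues four statements where the join issues one, and the gap widens with every additional user.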

As you can see, we need to know what portion of the object graph we plan to access when we retrieve the initial User, before we start navigating the object graph!

On the other hand, any object persistence solution provides functionality for fetching the data of associated objects only when the object is first accessed. However, this piecemeal style of data access is fundamentally inefficient in the context of a relational database, because it requires execution of one select statement for each node of the object graph. This is the dreaded n+1 selects problem.

This mismatch in the way we access objects in Java and in a relational database is perhaps the single most common source of performance problems in Java applications. Yet, although we’ve been blessed with innumerable books and magazine articles advising us to use StringBuffer for string concatenation, it seems impossible to find any advice about strategies for avoiding the n+1 selects problem. Fortunately, Hibernate provides sophisticated features for efficiently fetching graphs of objects from the database, transparently to the application accessing the graph. We discuss these features in chapters 4 and 7.

We now have a quite elaborate list of object/relational mismatch problems, and it will be costly to find solutions, as you might know from experience. This cost is often underestimated, and we think this is a major reason for many failed software projects.

1.2.6 The cost of the mismatch

The overall solution for the list of mismatch problems can require a significant outlay of time and effort. In our experience, the main purpose of up to 30 percent of the Java application code written is to handle the tedious SQL/JDBC and the manual bridging of the object/relational paradigm mismatch. Despite all this effort, the end result still doesn’t feel quite right. We’ve seen projects nearly sink due to the complexity and inflexibility of their database abstraction layers.

One of the major costs is in the area of modeling. The relational and object models must both encompass the same business entities. But an object-oriented purist will model these entities in a very different way than an experienced relational data modeler. The usual solution to this problem is to bend and twist the object model until it matches the underlying relational technology.

This can be done successfully, but only at the cost of losing some of the advantages of object orientation. Keep in mind that relational modeling is underpinned by relational theory. Object orientation has no such rigorous mathematical definition or body of theoretical work. So, we can’t look to mathematics to explain how we should bridge the gap between the two paradigms: there is no elegant transformation waiting to be discovered. (Doing away with Java and SQL and starting from scratch isn’t considered elegant.)

The domain modeling mismatch problem isn’t the only source of the inflexibility and lost productivity that lead to higher costs. A further cause is the JDBC API itself. JDBC and SQL provide a statement- (that is, command-) oriented approach to moving data to and from an SQL database. A structural relationship must be specified at least three times (Insert, Update, Select), adding to the time required for design and implementation. The unique dialect for every SQL database doesn’t improve the situation.

Recently, it has been fashionable to regard architectural or pattern-based models as a partial solution to the mismatch problem. Hence, we have the entity bean component model, the data access object (DAO) pattern, and other practices to implement data access. These approaches leave most or all of the problems listed earlier to the application developer. To round out your understanding of object persistence, we need to discuss application architecture and the role of a persistence layer in typical application design.

1.3 Persistence layers and alternatives

In a medium- or large-sized application, it usually makes sense to organize classes by concern. Persistence is one concern. Other concerns are presentation, workflow, and business logic. There are also the so-called “cross-cutting” concerns, which may be implemented generically, by framework code, for example. Typical cross-cutting concerns include logging, authorization, and transaction demarcation.

A typical object-oriented architecture comprises layers that represent the concerns. It’s normal, and certainly best practice, to group all classes and components responsible for persistence into a separate persistence layer in a layered system architecture.

In this section, we first look at the layers of this type of architecture and why we use them. After that, we focus on the layer we’re most interested in, the persistence layer, and some of the ways it can be implemented.


1.3.1 Layered architecture

A layered architecture defines interfaces between code that implements the various concerns, allowing a change to the way one concern is implemented without significant disruption to code in the other layers. Layering also determines the kinds of interlayer dependencies that occur. The rules are as follows:

■ Layers communicate top to bottom. A layer is dependent only on the layer directly below it.

■ Each layer is unaware of any other layers except for the layer just below it.

Different applications group concerns differently, so they define different layers. A typical, proven, high-level application architecture uses three layers, one each for presentation, business logic, and persistence, as shown in figure 1.4.

Figure 1.4 A persistence layer is the basis in a layered architecture. (Diagram elements: Presentation Layer, Business Layer, Persistence Layer, Database, Utility and Helper Classes.)

Let’s take a closer look at the layers and elements in the diagram:

■ Presentation layer: The user interface logic is topmost. Code responsible for the presentation and control of page and screen navigation forms the presentation layer.

■ Business layer: The exact form of the next layer varies widely between applications. It’s generally agreed, however, that this business layer is responsible for implementing any business rules or system requirements that would be understood by users as part of the problem domain. In some systems, this layer has its own internal representation of the business domain entities. In others, it reuses the model defined by the persistence layer. We revisit this issue in chapter 3.

■ Persistence layer: The persistence layer is a group of classes and components responsible for data storage to, and retrieval from, one or more data stores. This layer necessarily includes a model of the business domain entities (even if it’s only a metadata model).

■ Database: The database exists outside the Java application. It’s the actual, persistent representation of the system state. If an SQL database is used, the database includes the relational schema and possibly stored procedures.

■ Helper/utility classes: Every application has a set of infrastructural helper or utility classes that are used in every layer of the application (for example, Exception classes for error handling). These infrastructural elements don’t form a layer, since they don’t obey the rules for interlayer dependency in a layered architecture.
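The dependency rules can be sketched in a few lines of Java. In this illustration each layer holds a reference only to the layer directly below it, and the persistence layer is hidden behind an interface so that its implementation can change without disturbing the layers above. All type names (UserDao, InMemoryUserDao, UserService, UserPage) are hypothetical, and the in-memory map merely stands in for a real data store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Persistence layer: the only layer that knows how users are stored.
interface UserDao {
    Optional<String> findNameByUsername(String username);
}

// In-memory stand-in for a real data store (an assumption for this sketch).
class InMemoryUserDao implements UserDao {
    private final Map<String, String> namesByUsername = new HashMap<>();

    void save(String username, String name) {
        namesByUsername.put(username, name);
    }

    @Override
    public Optional<String> findNameByUsername(String username) {
        return Optional.ofNullable(namesByUsername.get(username));
    }
}

// Business layer: depends only on the persistence layer's interface.
class UserService {
    private final UserDao userDao;

    UserService(UserDao userDao) {
        this.userDao = userDao;
    }

    String greetingFor(String username) {
        return userDao.findNameByUsername(username)
                .map(name -> "Welcome back, " + name)
                .orElse("Unknown user");
    }
}

// Presentation layer: depends only on the business layer.
class UserPage {
    private final UserService userService;

    UserPage(UserService userService) {
        this.userService = userService;
    }

    String render(String username) {
        return "<p>" + userService.greetingFor(username) + "</p>";
    }
}

class LayeringDemo {
    public static void main(String[] args) {
        InMemoryUserDao dao = new InMemoryUserDao();
        dao.save("jdoe", "John Doe");
        UserPage page = new UserPage(new UserService(dao));
        System.out.println(page.render("jdoe")); // <p>Welcome back, John Doe</p>
    }
}
```

UserPage never mentions UserDao, so replacing InMemoryUserDao with a different implementation would leave the presentation code untouched.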

Let’s now take a brief look at the various ways the persistence layer can be implemented by Java applications. Don’t worry: we’ll get to ORM and Hibernate soon. There is much to be learned by looking at other approaches.

1.3.2 Hand-coding a persistence layer with SQL/JDBC

The most common approach to Java persistence is for application programmers to work directly with SQL and JDBC. After all, developers are familiar with relational database management systems, understand SQL, and know how to work with tables and foreign keys. Moreover, they can always use the well-known and widely used DAO design pattern to hide complex JDBC code and nonportable SQL from the business logic.

The DAO pattern is a good one, so good that we recommend its use even with ORM (see chapter 8). However, the work involved in manually coding persistence for each domain class is considerable, particularly when multiple SQL dialects are supported. This work usually ends up consuming a large portion of the development effort. Furthermore, when requirements change, a hand-coded solution always requires more attention and maintenance effort.

So why not implement a simple ORM framework to fit the specific requirements of your project? The result of such an effort could even be reused in future projects. Many developers have taken this approach; numerous homegrown object/relational persistence layers are in production systems today. However, we don’t recommend this approach. Excellent solutions already exist, not only the (mostly expensive) tools sold by commercial vendors but also open source projects with free licenses. We’re certain you’ll be able to find a solution that meets your requirements, both business and technical. It’s likely that such a solution will do a great deal more, and do it better, than a solution you could build in a limited time.

Development of a reasonably full-featured ORM may take many developers months. For example, Hibernate is 43,000 lines of code (some of which is much more difficult than typical application code), along with 12,000 lines of unit test code. This might be more than your application. A great many details can easily be overlooked, as both the authors know from experience! Even if an existing tool doesn’t fully implement two or three of your more exotic requirements, it’s still probably not worth creating your own. Any ORM will handle the tedious common cases, the ones that really kill productivity. It’s okay that you might need to hand-code certain special cases; few applications are composed primarily of special cases.

Don’t fall for the “Not Invented Here” syndrome and start your own object/relational mapping effort just to avoid the learning curve associated with third-party software. Even if you decide that all this ORM stuff is crazy, and you want to work as close to the SQL database as possible, other persistence frameworks exist that don’t implement full ORM. For example, the iBATIS database layer is an open source persistence layer that handles some of the more tedious JDBC code while letting developers handcraft the SQL.

1.3.3 Using serialization

Java has a built-in persistence mechanism: Serialization provides the ability to write a graph of objects (the state of the application) to a byte-stream, which may then be persisted to a file or database. Serialization is also used by Java’s Remote Method Invocation (RMI) to achieve pass-by-value semantics for complex objects. Another usage of serialization is to replicate application state across nodes in a cluster of machines.

Why not use serialization for the persistence layer? Unfortunately, a serialized graph of interconnected objects can only be accessed as a whole; it’s impossible to retrieve any data from the stream without deserializing the entire stream. Thus, the resulting byte-stream must be considered unsuitable for arbitrary search or aggregation. It isn’t even possible to access or update a single object or subgraph independently. Loading and overwriting an entire object graph in each transaction is no option for systems designed to support high concurrency.

Clearly, given current technology, serialization is inadequate as a persistence mechanism for high concurrency web and enterprise applications. It has a particular niche as a suitable persistence mechanism for desktop applications.
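A short sketch shows what “accessed as a whole” means in practice. It uses the standard java.io serialization classes; the Customer and Account classes and the helper methods are hypothetical names chosen for this illustration, and a byte array stands in for the file or database column that would hold the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    final String number;
    Account(String number) { this.number = number; }
}

class Customer implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final List<Account> accounts = new ArrayList<>();
    Customer(String name) { this.name = name; }
}

class SerializationDemo {
    // Writes the whole graph reachable from the root to a byte array.
    static byte[] write(Customer root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    // Reads the whole graph back; there is no way to ask for just one Account.
    static Customer read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Customer) in.readObject();
        }
    }

    // Convenience wrapper that converts checked exceptions for the demo.
    static Customer roundTrip(Customer root) {
        try {
            return read(write(root));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Customer original = new Customer("John Doe");
        original.accounts.add(new Account("A-100"));
        original.accounts.add(new Account("A-101"));

        Customer copy = roundTrip(original);
        System.out.println(copy.accounts.get(1).number); // A-101
        System.out.println(copy == original);            // false: a new graph was built
    }
}
```

Even to read a single account number, the sketch has to rebuild the entire Customer graph, and changing one account means writing the whole graph again.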


1.3.4 Considering EJB entity beans

In recent years, Enterprise JavaBeans (EJBs) have been a recommended way of persisting data. If you’ve been working in the field of Java enterprise applications, you’ve probably worked with EJBs and entity beans in particular. If you haven’t, don’t worry: entity beans are rapidly declining in popularity. (Many of the developer concerns will be addressed in the new EJB 3.0 specification, however.)

Entity beans (in the current EJB 2.1 specification) are interesting because, in contrast to the other solutions mentioned here, they were created entirely by committee. The other solutions (the DAO pattern, serialization, and ORM) were distilled from many years of experience; they represent approaches that have stood the test of time. Unsurprisingly, perhaps, EJB 2.1 entity beans have been a disaster in practice. Design flaws in the EJB specification prevent bean-managed persistence (BMP) entity beans from performing efficiently. A marginally more acceptable solution is container-managed persistence (CMP), at least since some glaring deficiencies of the EJB 1.1 specification were rectified.

Nevertheless, CMP doesn’t represent a solution to the object/relational mismatch. Here are six reasons why:

■ CMP beans are defined in one-to-one correspondence to the tables of the relational model. Thus, they’re too coarse grained; they may not take full advantage of Java’s rich typing. In a sense, CMP forces your domain model into first normal form.

■ On the other hand, CMP beans are also too fine grained to realize the stated goal of EJB: the definition of reusable software components. A reusable component should be a very coarse-grained object, with an external interface that is stable in the face of small changes to the database schema. (Yes, we really did just claim that CMP entity beans are both too fine grained and too coarse grained!)

■ Although EJBs may take advantage of implementation inheritance, entity beans don’t support polymorphic associations and queries, one of the defining features of “true” ORM.

■ Entity beans, despite the stated goal of the EJB specification, aren’t portable in practice. Capabilities of CMP engines vary widely between vendors, and the mapping metadata is highly vendor-specific. Some projects have chosen Hibernate for the simple reason that Hibernate applications are much more portable between application servers.

■ Entity beans aren’t serializable. We find that we must define additional data transfer objects (DTOs, also called value objects) when we need to transport data to a remote client tier. The use of fine-grained method calls from the client to a remote entity bean instance is not scalable; DTOs provide a way of batching remote data access. The DTO pattern results in the growth of parallel class hierarchies, where each entity of the domain model is represented as both an entity bean and a DTO.

■ EJB is an intrusive model; it mandates an unnatural Java style and makes reuse of code outside a specific container extremely difficult. This is a huge barrier to unit test-driven development (TDD). It even causes problems in applications that require batch processing or other offline functions.

We won’t spend more time discussing the pros and cons of EJB 2.1 entity beans. After looking at their persistence capabilities, we’ve come to the conclusion that they aren’t suitable for a full object mapping. We’ll see what the new EJB 3.0 specification can improve. Let’s turn to another object persistence solution that deserves some attention.

1.3.5 Object-oriented database systems

Since we work with objects in Java, it would be ideal if there were a way to store those objects in a database without having to bend and twist the object model at all. In the mid-1990s, new object-oriented database systems gained attention.

An object-oriented database management system (OODBMS) is more like an extension to the application environment than an external data store. An OODBMS usually features a multitiered implementation, with the backend data store, object cache, and client application coupled tightly together and interacting via a proprietary network protocol.

Object-oriented database development begins with the top-down definition of host language bindings that add persistence capabilities to the programming language. Hence, object databases offer seamless integration into the object-oriented application environment. This is different from the model used by today’s relational databases, where interaction with the database occurs via an intermediate language (SQL).

Analogously to ANSI SQL, the standard query interface for relational databases, there is a standard for object database products. The Object Data Management Group (ODMG) specification defines an API, a query language, a metadata language, and host language bindings for C++, Smalltalk, and Java. Most object-oriented database systems provide some level of support for the ODMG standard, but to the best of our knowledge, there is no complete implementation. Furthermore, a number of years after its release, and even in version 3.0, the specification feels immature and lacks a number of useful features, especially in a Java-based environment. The ODMG is also no longer active.

More recently, the Java Data Objects (JDO) specification (published in April 2002) opened up new possibilities. JDO was driven by members of the object-oriented database community and is now being adopted by object-oriented database products as the primary API, often in addition to the existing ODMG support. It remains to be seen if this new effort will see object-oriented databases penetrate beyond CAD/CAM (computer-aided design/manufacturing), scientific computing, and other niche markets.

We won’t bother looking too closely into why object-oriented database technology hasn’t been more popular; we’ll simply observe that object databases haven’t been widely adopted and that it doesn’t appear likely that they will be in the near future. We’re confident that the overwhelming majority of developers will have far more opportunity to work with relational technology, given the current political realities (predefined deployment environments).

1.3.6 Other options

Of course, there are other kinds of persistence layers. XML persistence is a variation on the serialization theme; this approach addresses some of the limitations of byte-stream serialization by allowing tools to access the data structure easily (but is itself subject to an object/hierarchical impedance mismatch). Furthermore, there is no additional benefit from the XML, because it’s just another text file format. You can use stored procedures (even write them in Java using SQLJ) and move the problem into the database tier. We’re sure there are plenty of other examples, but none of them are likely to become popular in the immediate future.

Political constraints (long-term investments in SQL databases) and the requirement for access to valuable legacy data call for a different approach. ORM may be the most practical solution to our problems.

1.4 Object/relational mapping

Now that we’ve looked at the alternative techniques for object persistence, it’s time to introduce the solution we feel is the best, and the one we use with Hibernate: ORM. Despite its long history (the first research papers were published in the late 1980s), the terms for ORM used by developers vary. Some call it object relational mapping, others prefer the simple object mapping. We exclusively use the term object/relational mapping and its acronym, ORM. The slash stresses the mismatch problem that occurs when the two worlds collide.

In this section, we first look at what ORM is. Then we enumerate the problems that a good ORM solution needs to solve. Finally, we discuss the general benefits that ORM provides and why we recommend this solution.

1.4.1 What is ORM?

In a nutshell, object/relational mapping is the automated (and transparent) persistence of objects in a Java application to the tables in a relational database, using metadata that describes the mapping between the objects and the database. ORM, in essence, works by (reversibly) transforming data from one representation to another.

This implies certain performance penalties. However, if ORM is implemented as middleware, there are many opportunities for optimization that wouldn’t exist for a hand-coded persistence layer. A further overhead (at development time) is the provision and management of metadata that governs the transformation. But again, the cost is less than equivalent costs involved in maintaining a hand-coded solution. And even ODMG-compliant object databases require significant class-level metadata.

FAQ: Isn’t ORM a Visio plugin? The acronym ORM can also mean object role modeling, and this term was invented before object/relational mapping became relevant. It describes a method for information analysis, used in database modeling, and is primarily supported by Microsoft Visio, a graphical modeling tool. Database specialists use it as a replacement or as an addition to the more popular entity-relationship modeling. However, if you talk to Java developers about ORM, it’s usually in the context of object/relational mapping.

An ORM solution consists of the following four pieces:

■ An API for performing basic CRUD operations on objects of persistent classes

■ A language or API for specifying queries that refer to classes and properties of classes

■ A facility for specifying mapping metadata

■ A technique for the ORM implementation to interact with transactional objects to perform dirty checking, lazy association fetching, and other optimization functions
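The role of mapping metadata can be illustrated with a toy generator that derives SQL from a description of how one class maps to one table. This is a deliberately small sketch and not how Hibernate or any other product is implemented: ClassMapping, SqlGenerator, and their methods are hypothetical names, and a real tool would read the metadata from a mapping document rather than from method calls.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical mapping metadata: one class mapped to one table,
// with each property mapped to one column.
class ClassMapping {
    final String tableName;
    final Map<String, String> columnsByProperty = new LinkedHashMap<>();

    ClassMapping(String tableName) {
        this.tableName = tableName;
    }

    ClassMapping map(String property, String column) {
        columnsByProperty.put(property, column);
        return this;
    }
}

class SqlGenerator {
    // Builds a parameterized insert statement from the metadata alone.
    static String insertFor(ClassMapping mapping) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner parameters = new StringJoiner(", ");
        for (String column : mapping.columnsByProperty.values()) {
            columns.add(column);
            parameters.add("?");
        }
        return "insert into " + mapping.tableName
                + " (" + columns + ") values (" + parameters + ")";
    }

    // Builds a select-by-identifier statement for the named identifier property.
    static String selectByIdFor(ClassMapping mapping, String idProperty) {
        StringJoiner columns = new StringJoiner(", ");
        for (String column : mapping.columnsByProperty.values()) {
            columns.add(column);
        }
        return "select " + columns + " from " + mapping.tableName
                + " where " + mapping.columnsByProperty.get(idProperty) + " = ?";
    }

    public static void main(String[] args) {
        ClassMapping user = new ClassMapping("USER")
                .map("id", "USER_ID")
                .map("username", "USERNAME")
                .map("name", "NAME");

        // insert into USER (USER_ID, USERNAME, NAME) values (?, ?, ?)
        System.out.println(insertFor(user));
        // select USER_ID, USERNAME, NAME from USER where USER_ID = ?
        System.out.println(selectByIdFor(user, "id"));
    }
}
```

Both statements come from a single description of the mapping, so adding a column means changing the metadata in one place instead of editing every hand-written statement.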


We’re using the term ORM to include any persistence layer where SQL is autogenerated from a metadata-based description. We aren’t including persistence layers where the object/relational mapping problem is solved manually by developers hand-coding SQL and using JDBC. With ORM, the application interacts with the ORM APIs and the domain model classes and is abstracted from the underlying SQL/JDBC. Depending on the features or the particular implementation, the ORM runtime may also take on responsibility for issues such as optimistic locking and caching, relieving the application of these concerns entirely.

Let’s look at the various ways ORM can be implemented. Mark Fussel [Fussel 1997], a researcher in the field of ORM, defined the following four levels of ORM quality.

Pure relational

The whole application, including the user interface, is designed around the relational model and SQL-based relational operations. This approach, despite its deficiencies for large systems, can be an excellent solution for simple applications where a low level of code reuse is tolerable. Direct SQL can be fine-tuned in every aspect, but the drawbacks, such as lack of portability and maintainability, are significant, especially in the long run. Applications in this category often make heavy use of stored procedures, shifting some of the work out of the business layer and into the database.

Light object mapping

Entities are represented as classes that are mapped manually to the relational tables. Hand-coded SQL/JDBC is hidden from the business logic using well-known design patterns. This approach is extremely widespread and is successful for applications with a small number of entities, or applications with generic, metadata-driven data models. Stored procedures might have a place in this kind of application.

Medium object mapping

The application is designed around an object model. SQL is generated at build time using a code generation tool, or at runtime by framework code. Associations between objects are supported by the persistence mechanism, and queries may be specified using an object-oriented expression language. Objects are cached by the persistence layer. A great many ORM products and homegrown persistence layers support at least this level of functionality. It’s well suited to medium-sized applications with some complex transactions, particularly when portability between different database products is important. These applications usually don’t use stored procedures.

Full object mapping

Full object mapping supports sophisticated object modeling: composition, inheritance, polymorphism, and “persistence by reachability.” The persistence layer implements transparent persistence; persistent classes do not inherit any special base class or have to implement a special interface. Efficient fetching strategies (lazy and eager fetching) and caching strategies are implemented transparently to the application. This level of functionality can hardly be achieved by a homegrown persistence layer; it’s equivalent to months or years of development time. A number of commercial and open source Java ORM tools have achieved this level of quality.

This level meets the definition of ORM we’re using in this book. Let’s look at the problems we expect to be solved by a tool that achieves full object mapping.

1.4.2 Generic ORM problems

The following issues, which we’ll call the O/R mapping problems, are the fundamental problems solved by a full object/relational mapping tool in a Java environment. Particular ORM tools may provide extra functionality (for example, aggressive caching), but this is a reasonably exhaustive list of the conceptual issues that are specific to object/relational mapping:

1. What do persistent classes look like? Are they fine-grained JavaBeans? Or are they instances of some (coarser granularity) component model like EJB? How transparent is the persistence tool? Do we have to adopt a programming model and conventions for classes of the business domain?

2. How is mapping metadata defined? Since the object/relational transformation is governed entirely by metadata, the format and definition of this metadata is a centrally important issue. Should an ORM tool provide a GUI to manipulate the metadata graphically? Or are there better approaches to metadata definition?

3. How should we map class inheritance hierarchies? There are several standard strategies. What about polymorphic associations, abstract classes, and interfaces?

4. How do object identity and equality relate to database (primary key) identity? How do we map instances of particular classes to particular table rows?

5. How does the persistence logic interact at runtime with the objects of the business domain? This is a problem of generic programming, and there are a number of solutions including source generation, runtime reflection, runtime bytecode generation, and build-time bytecode enhancement. The solution to this problem might affect your build process (but, preferably, shouldn’t otherwise affect you as a user).

6. What is the lifecycle of a persistent object? Does the lifecycle of some objects depend upon the lifecycle of other associated objects? How do we translate the lifecycle of an object to the lifecycle of a database row?

7. What facilities are provided for sorting, searching, and aggregating? The application could do some of these things in memory. But efficient use of relational technology requires that this work sometimes be performed by the database.

8. How do we efficiently retrieve data with associations? Efficient access to relational data is usually accomplished via table joins. Object-oriented applications usually access data by navigating an object graph. Two data access patterns should be avoided when possible: the n+1 selects problem, and its complement, the Cartesian product problem (fetching too much data in a single select).
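The n+1 selects problem named in the last item is easier to see with numbers. The following is a minimal sketch in plain Java, assuming a toy in-memory stand-in for the database that only counts the selects it receives. The class names, the item/bid data, and the SQL in the comments are hypothetical and aren’t part of any ORM API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NPlusOneSketch {

    /** Toy stand-in for a database; it only counts the selects it receives. */
    static class CountingDb {
        // Hypothetical data: item id -> bid amounts for that item.
        private final Map<Integer, List<Integer>> bidsByItem = new LinkedHashMap<>();
        int selectCount = 0;

        CountingDb() {
            bidsByItem.put(1, Arrays.asList(10, 12));
            bidsByItem.put(2, Arrays.asList(5));
            bidsByItem.put(3, Collections.<Integer>emptyList());
        }

        /** Stands for something like: select ITEM_ID from ITEM */
        List<Integer> selectItemIds() {
            selectCount++;
            return new ArrayList<>(bidsByItem.keySet());
        }

        /** Stands for something like: select AMOUNT from BID where ITEM_ID = ? */
        List<Integer> selectBidsFor(int itemId) {
            selectCount++;
            return bidsByItem.get(itemId);
        }

        /** Stands for a single select that joins ITEM and BID. */
        Map<Integer, List<Integer>> selectItemsJoinBids() {
            selectCount++;
            return new LinkedHashMap<>(bidsByItem);
        }
    }

    /** Navigates item by item: one select for the items plus one per item. */
    static int totalByNavigation(CountingDb db) {
        int total = 0;
        for (int itemId : db.selectItemIds()) {
            for (int amount : db.selectBidsFor(itemId)) {
                total += amount;
            }
        }
        return total;
    }

    /** Fetches items and bids together in a single joined select. */
    static int totalByJoin(CountingDb db) {
        int total = 0;
        for (List<Integer> bids : db.selectItemsJoinBids().values()) {
            for (int amount : bids) {
                total += amount;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        CountingDb first = new CountingDb();
        int byNavigation = totalByNavigation(first);
        CountingDb second = new CountingDb();
        int byJoin = totalByJoin(second);
        System.out.println(byNavigation + " using " + first.selectCount + " selects");
        System.out.println(byJoin + " using " + second.selectCount + " select");
    }
}
```

With three items, navigation costs four selects (one plus one per item) while the joined variant costs one, and both compute the same total. The sketch doesn’t model the complementary Cartesian product problem, where a single select returns far more rows than the application needs.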

In addition, two issues are common to any data-access technology. They also impose fundamental constraints on the design and architecture of an ORM:

■ Transactions and concurrency
■ Cache management (and concurrency)

As you can see, a full object-mapping tool needs to address quite a long list of issues. We discuss the way Hibernate manages these problems and data-access issues in chapters 3, 4, and 5, and we broaden the subject later in the book. By now, you should be starting to see the value of ORM. In the next section, we look at some of the other benefits you gain when you use an ORM solution.

1.4.3 Why ORM?

An ORM implementation is a complex beast: less complex than an application server, but more complex than a web application framework like Struts or Tapestry. Why should we introduce another new complex infrastructural element into our system? Will it be worth it?


It will take us most of this book to provide a complete answer to those questions. For the impatient, this section provides a quick summary of the most compelling benefits. But first, let’s quickly dispose of a non-benefit.

A supposed advantage of ORM is that it “shields” developers from “messy” SQL. This view holds that object-oriented developers can’t be expected to understand SQL or relational databases well and that they find SQL somehow offensive. On the contrary, we believe that Java developers must have a sufficient level of familiarity with, and appreciation of, relational modeling and SQL in order to work with ORM. ORM is an advanced technique to be used by developers who have already done it the hard way. To use Hibernate effectively, you must be able to view and interpret the SQL statements it issues and understand the implications for performance.

Let’s look at some of the benefits of ORM and Hibernate.

Productivity

Persistence-related code can be perhaps the most tedious code in a Java application. Hibernate eliminates much of the grunt work (more than you’d expect) and lets you concentrate on the business problem. No matter which application development strategy you prefer (top-down, starting with a domain model; or bottom-up, starting with an existing database schema), Hibernate used together with the appropriate tools will significantly reduce development time.

Maintainability

Having fewer lines of code (LOC) makes the system more understandable, because the code that remains emphasizes business logic rather than plumbing. Most important, a system with less code is easier to refactor. Automated object/relational persistence substantially reduces LOC. Of course, counting lines of code is a debatable way of measuring application complexity.

However, there are other reasons that a Hibernate application is more maintainable. In systems with hand-coded persistence, an inevitable tension exists between the relational representation and the object model implementing the domain. Changes to one almost always involve changes to the other. And often the design of one representation is compromised to accommodate the existence of the other. (What almost always happens in practice is that the object model of the domain is compromised.) ORM provides a buffer between the two models, allowing more elegant use of object orientation on the Java side, and insulating each model from minor changes to the other.


Performance

A common claim is that hand-coded persistence can always be at least as fast, and can often be faster, than automated persistence. This is true in the same sense that it’s true that assembly code can always be at least as fast as Java code, or a handwritten parser can always be at least as fast as a parser generated by YACC or ANTLR. In other words, it’s beside the point. The unspoken implication of the claim is that hand-coded persistence will perform at least as well in an actual application. But this implication will be true only if the effort required to implement at-least-as-fast hand-coded persistence is similar to the amount of effort involved in utilizing an automated solution. The really interesting question is, what happens when we consider time and budget constraints?

Given a persistence task, many optimizations are possible. Some (such as query hints) are much easier to achieve with hand-coded SQL/JDBC. Most optimizations, however, are much easier to achieve with automated ORM. In a project with time constraints, hand-coded persistence usually allows you to make some optimizations, some of the time. Hibernate allows many more optimizations to be used all the time. Furthermore, automated persistence improves developer productivity so much that you can spend more time hand-optimizing the few remaining bottlenecks.

Finally, the people who implemented your ORM software probably had much more time to investigate performance optimizations than you have. Did you know, for instance, that pooling PreparedStatement instances results in a significant performance increase for the DB2 JDBC driver but breaks the InterBase JDBC driver? Did you realize that updating only the changed columns of a table can be significantly faster for some databases but potentially slower for others? In your handcrafted solution, how easy is it to experiment with the impact of these various strategies?

Vendor independence

An ORM abstracts your application away from the underlying SQL database and SQL dialect. If the tool supports a number of different databases (most do), then this confers a certain level of portability on your application. You shouldn’t necessarily expect write once/run anywhere, since the capabilities of databases differ and achieving full portability would require sacrificing some of the strength of the more powerful platforms. Nevertheless, it’s usually much easier to develop a cross-platform application using ORM. Even if you don’t require cross-platform operation, an ORM can still help mitigate some of the risks associated with vendor lock-in. In addition, database independence helps in development scenarios where developers use a lightweight local database but deploy for production on a different database.

1.5 Summary

In this chapter, we’ve discussed the concept of object persistence and the importance of ORM as an implementation technique. Object persistence means that individual objects can outlive the application process; they can be saved to a data store and be re-created at a later point in time. The object/relational mismatch comes into play when the data store is an SQL-based relational database management system. For instance, a graph of objects can’t simply be saved to a database table; it must be disassembled and persisted to columns of portable SQL data types. A good solution for this problem is ORM, which is especially helpful if we consider richly typed Java domain models.

A domain model represents the business entities used in a Java application. In a layered system architecture, the domain model is used to execute business logic in the business layer (in Java, not in the database). This business layer communicates with the persistence layer beneath in order to load and store the persistent objects of the domain model. ORM is the middleware in the persistence layer that manages the persistence.

ORM isn’t a silver bullet for all persistence tasks; its job is to relieve the developer of 95 percent of object persistence work, such as writing complex SQL statements with many table joins and copying values from JDBC result sets to objects or graphs of objects. A full-featured ORM middleware might provide database portability, certain optimization techniques like caching, and other viable functions that aren’t easy to hand-code in a limited time with SQL and JDBC.

It’s likely that a better solution than ORM will exist some day. We (and many others) may have to rethink everything we know about SQL, persistence API standards, and application integration. The evolution of today’s systems into true relational database systems with seamless object-oriented integration remains pure speculation. But we can’t wait, and there is no sign that any of these issues will improve soon (a multibillion-dollar industry isn’t very agile). ORM is the best solution currently available, and it’s a timesaver for developers facing the object/relational mismatch every day.

Introducing and integrating Hibernate

This chapter covers

■ Hibernate in action with “Hello World”
■ The Hibernate core programming interfaces
■ Integration with managed and non-managed environments
■ Advanced configuration options


It’s good to understand the need for object/relational mapping in Java applications, but you’re probably eager to see Hibernate in action. We’ll start by showing you a simple example that demonstrates some of its power.

As you’re probably aware, it’s traditional for a programming book to start with a “Hello World” example. In this chapter, we follow that tradition by introducing Hibernate with a relatively simple “Hello World” program. However, simply printing a message to a console window won’t be enough to really demonstrate Hibernate. Instead, our program will store newly created objects in the database, update them, and perform queries to retrieve them from the database.

This chapter will form the basis for the subsequent chapters. In addition to the canonical “Hello World” example, we introduce the core Hibernate APIs and explain how to configure Hibernate in various runtime environments, such as J2EE application servers and stand-alone applications.

2.1 “Hello World” with Hibernate

Hibernate applications define persistent classes that are “mapped” to database tables. Our “Hello World” example consists of one class and one mapping file. Let’s see what a simple persistent class looks like, how the mapping is specified, and some of the things we can do with instances of the persistent class using Hibernate.

The objective of our sample application is to store messages in a database and to retrieve them for display. The application has a simple persistent class, Message, which represents these printable messages. Our Message class is shown in listing 2.1.

Listing 2.1 Message.java: A simple persistent class

    package hello;

    public class Message {
        private Long id;               // Identifier attribute
        private String text;           // Message text
        private Message nextMessage;   // Reference to another Message

        private Message() {}

        public Message(String text) {
            this.text = text;
        }

        public Long getId() {
            return id;
        }
        private void setId(Long id) {
            this.id = id;
        }

        public String getText() {
            return text;
        }
        public void setText(String text) {
            this.text = text;
        }

        public Message getNextMessage() {
            return nextMessage;
        }
        public void setNextMessage(Message nextMessage) {
            this.nextMessage = nextMessage;
        }
    }

Our Message class has three attributes: the identifier attribute, the text of the message, and a reference to another Message. The identifier attribute allows the application to access the database identity (the primary key value) of a persistent object. If two instances of Message have the same identifier value, they represent the same row in the database. We’ve chosen Long for the type of our identifier attribute, but this isn’t a requirement. Hibernate allows virtually anything for the identifier type, as you’ll see later.

You may have noticed that all attributes of the Message class have JavaBean-style property accessor methods. The class also has a constructor with no parameters. The persistent classes we use in our examples will almost always look something like this.

Instances of the Message class may be managed (made persistent) by Hibernate, but they don’t have to be. Since the Message class doesn’t extend or implement any Hibernate-specific classes or interfaces, we can use it like any other Java class:

    Message message = new Message("Hello World");
    System.out.println( message.getText() );

This code fragment does exactly what we’ve come to expect from “Hello World” applications: It prints "Hello World" to the console. It might look like we’re trying to be cute here; in fact, we’re demonstrating an important feature that distinguishes Hibernate from some other persistence solutions, such as EJB entity beans. Our persistent class can be used in any execution context at all; no special container is needed. Of course, you came here to see Hibernate itself, so let’s save a new Message to the database:

    Session session = getSessionFactory().openSession();
    Transaction tx = session.beginTransaction();

    Message message = new Message("Hello World");
    session.save(message);

    tx.commit();
    session.close();

This code calls the Hibernate Session and Transaction interfaces. (We’ll get to that getSessionFactory() call soon.) It results in the execution of something similar to the following SQL:

    insert into MESSAGES (MESSAGE_ID, MESSAGE_TEXT, NEXT_MESSAGE_ID)
    values (1, 'Hello World', null)

Hold on: the MESSAGE_ID column is being initialized to a strange value. We didn’t set the id property of message anywhere, so we would expect it to be null, right? Actually, the id property is special: It’s an identifier property, which holds a generated unique value. (We’ll discuss how the value is generated later.) The value is assigned to the Message instance by Hibernate when save() is called.

For this example, we assume that the MESSAGES table already exists. In chapter 9, we’ll show you how to use Hibernate to automatically create the tables your application needs, using just the information in the mapping files. (There’s some more SQL you won’t need to write by hand!) Of course, we want our “Hello World” program to print the message to the console. Now that we have a message in the database, we’re ready to demonstrate this. The next example retrieves all messages from the database, in alphabetical order, and prints them:

    Session newSession = getSessionFactory().openSession();
    Transaction newTransaction = newSession.beginTransaction();

    List messages =
        newSession.find("from Message as m order by m.text asc");

    System.out.println( messages.size() + " message(s) found:" );

    for ( Iterator iter = messages.iterator(); iter.hasNext(); ) {
        Message message = (Message) iter.next();
        System.out.println( message.getText() );
    }

    newTransaction.commit();
    newSession.close();

The literal string "from Message as m order by m.text asc" is a Hibernate query, expressed in Hibernate’s own object-oriented Hibernate Query Language (HQL). This query is internally translated into the following SQL when find() is called:

    select m.MESSAGE_ID, m.MESSAGE_TEXT, m.NEXT_MESSAGE_ID
    from MESSAGES m
    order by m.MESSAGE_TEXT asc

The code fragment prints

    1 message(s) found:
    Hello World


If you’ve never used an ORM tool like Hibernate before, you were probably expecting to see the SQL statements somewhere in the code or metadata. They aren’t there. All SQL is generated at runtime (actually at startup, for all reusable SQL statements).

To allow this magic to occur, Hibernate needs more information about how the Message class should be made persistent. This information is usually provided in an XML mapping document. The mapping document defines, among other things, how properties of the Message class map to columns of the MESSAGES table. Let’s look at the mapping document in listing 2.2.

Listing 2.2 A simple Hibernate XML mapping



Note that Hibernate 2.0





The mapping document tells Hibernate that the Message class is to be persisted to the MESSAGES table, that the identifier property maps to a column named MESSAGE_ID, that the text property maps to a column named MESSAGE_TEXT, and that the property named nextMessage is an association with many-to-one multiplicity that maps to a column named NEXT_MESSAGE_ID. (Don’t worry about the other details for now.)

As you can see, the XML document isn’t difficult to understand. You can easily write and maintain it by hand. In chapter 3, we discuss a way of generating the XML file from comments embedded in the source code. Whichever method you choose, Hibernate has enough information to completely generate all the SQL statements that would be needed to insert, update, delete, and retrieve instances of the Message class. You no longer need to write these SQL statements by hand.

NOTE Many Java developers have complained of the “metadata hell” that accompanies J2EE development. Some have suggested a movement away from XML metadata, back to plain Java code. Although we applaud this suggestion for some problems, ORM represents a case where text-based metadata really is necessary. Hibernate has sensible defaults that minimize typing and a mature document type definition that can be used for auto-completion or validation in editors. You can even automatically generate metadata with various tools.
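To make the mapping described above concrete, here is a hedged sketch of what such a document could look like. It’s held in a Java string and parsed with the JDK’s DOM parser, so the snippet runs without Hibernate on the classpath. The table and column names come from the description above. The DOCTYPE is left out, and the generator class ("increment") and the cascade setting ("all") are assumptions that the text doesn’t state. Treat it as an illustration of the shape of a mapping, not as a copy of listing 2.2.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class MessageMappingSketch {

    // Reconstructed from the prose description, not copied from the listing.
    // Assumptions: no DOCTYPE; generator class and cascade value are guesses.
    static final String MAPPING_XML =
          "<hibernate-mapping>\n"
        + "  <class name=\"hello.Message\" table=\"MESSAGES\">\n"
        + "    <id name=\"id\" column=\"MESSAGE_ID\">\n"
        + "      <generator class=\"increment\"/>\n"
        + "    </id>\n"
        + "    <property name=\"text\" column=\"MESSAGE_TEXT\"/>\n"
        + "    <many-to-one name=\"nextMessage\" cascade=\"all\"\n"
        + "                 column=\"NEXT_MESSAGE_ID\"/>\n"
        + "  </class>\n"
        + "</hibernate-mapping>\n";

    /** Parses the mapping with the JDK DOM parser (Hibernate is not involved). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("mapping is not well-formed XML", e);
        }
    }

    /** Returns the column attribute of the first element with the given tag name. */
    static String columnOf(Document doc, String tag) {
        Element element = (Element) doc.getElementsByTagName(tag).item(0);
        return element.getAttribute("column");
    }

    public static void main(String[] args) {
        Document doc = parse(MAPPING_XML);
        Element clazz = (Element) doc.getElementsByTagName("class").item(0);
        System.out.println(clazz.getAttribute("name") + " -> " + clazz.getAttribute("table"));
        System.out.println("id          -> " + columnOf(doc, "id"));
        System.out.println("text        -> " + columnOf(doc, "property"));
        System.out.println("nextMessage -> " + columnOf(doc, "many-to-one"));
    }
}
```

Running the sketch only confirms that the document is well-formed and that each property names the column given in the text. It doesn’t validate the document against Hibernate’s DTD.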

Now, let’s change our first message and, while we’re at it, create a new message associated with the first, as shown in listing 2.3.

Listing 2.3 Updating a message

    Session session = getSessionFactory().openSession();
    Transaction tx = session.beginTransaction();

    // 1 is the generated id of the first message
    Message message =
        (Message) session.load( Message.class, new Long(1) );
    message.setText("Greetings Earthling");
    Message nextMessage = new Message("Take me to your leader (please)");
    message.setNextMessage( nextMessage );

    tx.commit();
    session.close();

This code results in three SQL statements being executed inside the same transaction:

    select m.MESSAGE_ID, m.MESSAGE_TEXT, m.NEXT_MESSAGE_ID
    from MESSAGES m
    where m.MESSAGE_ID = 1

    insert into MESSAGES (MESSAGE_ID, MESSAGE_TEXT, NEXT_MESSAGE_ID)
    values (2, 'Take me to your leader (please)', null)

    update MESSAGES
    set MESSAGE_TEXT = 'Greetings Earthling', NEXT_MESSAGE_ID = 2
    where MESSAGE_ID = 1

Notice how Hibernate detected the modification to the text and nextMessage properties of the first message and automatically updated the database. We’ve taken advantage of a Hibernate feature called automatic dirty checking: This feature saves us the effort of explicitly asking Hibernate to update the database when we modify the state of an object inside a transaction. Similarly, you can see that the new message was made persistent when a reference was created from the first message. This feature is called cascading save: It saves us the effort of explicitly making the new object persistent by calling save(), as long as it’s reachable by an already-persistent instance. Also notice that the ordering of the SQL statements isn’t the same as the order in which we set property values. Hibernate uses a sophisticated algorithm to determine an efficient ordering that avoids database foreign key constraint violations but is still sufficiently predictable to the user. This feature is called transactional write-behind.

If we run “Hello World” again, it prints

    2 message(s) found:
    Greetings Earthling
    Take me to your leader (please)
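Automatic dirty checking can be pictured as a comparison between an object’s current state and a snapshot taken when it was loaded. The following plain-Java sketch shows that idea under simplifying assumptions: ToySession, Msg, and the generated SQL text are hypothetical stand-ins, only one property is tracked, and nothing here claims to describe how Hibernate implements the feature internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class DirtyCheckingSketch {

    /** Minimal stand-in for the Message class; it tracks only the text property. */
    static class Msg {
        final long id;
        String text;

        Msg(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    /** Pairs a loaded object with the state it had when it was last synchronized. */
    static class Tracked {
        final Msg object;
        String snapshotText;

        Tracked(Msg object) {
            this.object = object;
            this.snapshotText = object.text;
        }
    }

    /** Hypothetical unit of work that remembers every object it has loaded. */
    static class ToySession {
        private final List<Tracked> tracked = new ArrayList<>();

        /** Pretends to read a row and remembers a snapshot of its state. */
        Msg load(long id, String textFromDatabase) {
            Msg message = new Msg(id, textFromDatabase);
            tracked.add(new Tracked(message));
            return message;
        }

        /** Compares current state with the snapshot; one UPDATE per changed object. */
        List<String> flush() {
            List<String> statements = new ArrayList<>();
            for (Tracked t : tracked) {
                if (!Objects.equals(t.object.text, t.snapshotText)) {
                    statements.add("update MESSAGES set MESSAGE_TEXT = '" + t.object.text
                            + "' where MESSAGE_ID = " + t.object.id);
                    t.snapshotText = t.object.text; // the object is clean again
                }
            }
            return statements;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        Msg message = session.load(1L, "Hello World");
        message.text = "Greetings Earthling"; // no explicit update call
        for (String sql : session.flush()) {
            System.out.println(sql);
        }
    }
}
```

The application never tells the session that the object changed; the session finds out at flush time. Because statements are produced only at flush time, the sketch also hints at why the SQL can be issued in a different order from the property assignments, although it doesn’t model cascading save or statement ordering.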

This is as far as we’ll take the “Hello World” application. Now that we finally have some code under our belt, we’ll take a step back and present an overview of Hibernate’s main APIs.

2.2 Understanding the architecture

The programming interfaces are the first thing you have to learn about Hibernate in order to use it in the persistence layer of your application. A major objective of API design is to keep the interfaces between software components as narrow as possible. In practice, however, ORM APIs aren’t especially small. Don’t worry, though; you don’t have to understand all the Hibernate interfaces at once. Figure 2.1 illustrates the roles of the most important Hibernate interfaces in the business and persistence layers. We show the business layer above the persistence layer, since the business layer acts as a client of the persistence layer in a traditionally layered application. Note that some simple applications might not cleanly separate business logic from persistence logic; that’s okay, because it merely simplifies the diagram.

The Hibernate interfaces shown in figure 2.1 may be approximately classified as follows:

■ Interfaces called by applications to perform basic CRUD and querying operations. These interfaces are the main point of dependency of application business/control logic on Hibernate. They include Session, Transaction, and Query.
■ Interfaces called by application infrastructure code to configure Hibernate, most importantly the Configuration class.
■ Callback interfaces that allow the application to react to events occurring inside Hibernate, such as Interceptor, Lifecycle, and Validatable.
■ Interfaces that allow extension of Hibernate’s powerful mapping functionality, such as UserType, CompositeUserType, and IdentifierGenerator. These interfaces are implemented by application infrastructure code (if necessary).

Hibernate makes use of existing Java APIs, including JDBC, Java Transaction API (JTA), and Java Naming and Directory Interface (JNDI). JDBC provides a rudimentary level of abstraction of functionality common to relational databases, allowing almost any database with a JDBC driver to be supported by Hibernate. JNDI and JTA allow Hibernate to be integrated with J2EE application servers.

In this section, we don’t cover the detailed semantics of Hibernate API methods, just the role of each of the primary interfaces. You can find most of these interfaces in the package net.sf.hibernate. Let’s take a brief look at each interface in turn.

Figure 2.1 High-level overview of the Hibernate API in a layered architecture


2.2.1 The core interfaces

The five core interfaces are used in just about every Hibernate application. Using these interfaces, you can store and retrieve persistent objects and control transactions.

Session interface

The Session interface is the primary interface used by Hibernate applications. An instance of Session is lightweight and is inexpensive to create and destroy. This is important because your application will need to create and destroy sessions all the time, perhaps on every request. Hibernate sessions are not threadsafe and should by design be used by only one thread at a time.

The Hibernate notion of a session is something between connection and transaction. It may be easier to think of a session as a cache or collection of loaded objects relating to a single unit of work. Hibernate can detect changes to the objects in this unit of work. We sometimes call the Session a persistence manager because it’s also the interface for persistence-related operations such as storing and retrieving objects. Note that a Hibernate session has nothing to do with the web-tier HttpSession. When we use the word session in this book, we mean the Hibernate session. We sometimes use user session to refer to the HttpSession object.

We describe the Session interface in detail in chapter 4, section 4.2, “The persistence manager.”

SessionFactory interface

The application obtains Session instances from a SessionFactory. Compared to the Session interface, this object is much less exciting.

The SessionFactory is certainly not lightweight! It’s intended to be shared among many application threads. There is typically a single SessionFactory for the whole application, created during application initialization, for example. However, if your application accesses multiple databases using Hibernate, you’ll need a SessionFactory for each database.

The SessionFactory caches generated SQL statements and other mapping metadata that Hibernate uses at runtime. It also holds cached data that has been read in one unit of work and may be reused in a future unit of work (only if class and collection mappings specify that this second-level cache is desirable).
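The division of labor between a heavyweight, shared factory and lightweight, short-lived sessions can be sketched in a few lines of plain Java. ToySessionFactory and ToySession below are hypothetical classes written for this illustration; they borrow only the roles described above (SQL generated once and cached in the factory, a per-session collection of loaded objects) and none of Hibernate’s actual API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class FactoryAndSessionSketch {

    /** Built once at startup; immutable afterward, so safe to share between threads. */
    static final class ToySessionFactory {
        private final Map<String, String> insertSqlByClass;

        ToySessionFactory(Map<String, String> tableByClass) {
            Map<String, String> sql = new HashMap<>();
            for (Map.Entry<String, String> e : tableByClass.entrySet()) {
                // The reusable SQL is "generated" once here, not on every request.
                sql.put(e.getKey(), "insert into " + e.getValue() + " values (?)");
            }
            this.insertSqlByClass = Collections.unmodifiableMap(sql);
        }

        /** Cheap: allocates a small object, performs no SQL generation. */
        ToySession openSession() {
            return new ToySession(this);
        }

        String insertSqlFor(String className) {
            return insertSqlByClass.get(className);
        }
    }

    /** One per unit of work; holds its own loaded objects; not threadsafe. */
    static final class ToySession {
        private final ToySessionFactory factory;
        private final Map<Long, Object> loaded = new HashMap<>();

        ToySession(ToySessionFactory factory) {
            this.factory = factory;
        }

        void remember(long id, Object object) {
            loaded.put(id, object);
        }

        boolean contains(long id) {
            return loaded.containsKey(id);
        }

        String insertSqlFor(String className) {
            return factory.insertSqlFor(className);
        }
    }

    public static void main(String[] args) {
        Map<String, String> tables = new HashMap<>();
        tables.put("hello.Message", "MESSAGES");
        ToySessionFactory factory = new ToySessionFactory(tables);

        ToySession first = factory.openSession();
        ToySession second = factory.openSession();
        first.remember(1L, "Hello World");

        System.out.println(first.contains(1L));   // object is known to the first session
        System.out.println(second.contains(1L));  // but not to the second
        System.out.println(second.insertSqlFor("hello.Message"));
    }
}
```

Both sessions reuse the SQL string cached by the factory, while each keeps its own set of loaded objects. The sketch leaves out the optional second-level cache mentioned above.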


Configuration interface

The Configuration object is used to configure and bootstrap Hibernate. The application uses a Configuration instance to specify the location of mapping documents and Hibernate-specific properties and then create the SessionFactory.

Even though the Configuration interface plays a relatively small part in the total scope of a Hibernate application, it’s the first object you’ll meet when you begin using Hibernate. Section 2.3 covers the problem of configuring Hibernate in some detail.

Transaction interface

The Transaction interface is an optional API. Hibernate applications may choose not to use this interface, instead managing transactions in their own infrastructure code. A Transaction abstracts application code from the underlying transaction implementation, which might be a JDBC transaction, a JTA UserTransaction, or even a Common Object Request Broker Architecture (CORBA) transaction. This allows the application to control transaction boundaries via a consistent API and helps to keep Hibernate applications portable between different kinds of execution environments and containers.

We use the Hibernate Transaction API throughout this book. Transactions and the Transaction interface are explained in chapter 5.

Query and Criteria interfaces

The Query interface allows you to perform queries against the database and control how the query is executed. Queries are written in HQL or in the native SQL dialect of your database. A Query instance is used to bind query parameters, limit the number of results returned by the query, and finally to execute the query.

The Criteria interface is very similar; it allows you to create and execute object-oriented criteria queries.

To help make application code less verbose, Hibernate provides some shortcut methods on the Session interface that let you invoke a query in one line of code. We won’t use these shortcuts in the book; instead, we’ll always use the Query interface.

A Query instance is lightweight and can’t be used outside the Session that created it. We describe the features of the Query interface in chapter 7.


CHAPTER 2

Introducing and integrating Hibernate

2.2.2 Callback interfaces

Callback interfaces allow the application to receive a notification when something interesting happens to an object (for example, when an object is loaded, saved, or deleted). Hibernate applications don’t need to implement these callbacks, but they’re useful for implementing certain kinds of generic functionality, such as creating audit records.

The Lifecycle and Validatable interfaces allow a persistent object to react to events relating to its own persistence lifecycle. The persistence lifecycle is encompassed by an object’s CRUD operations. The Hibernate team was heavily influenced by other ORM solutions that have similar callback interfaces. Later, they realized that having the persistent classes implement Hibernate-specific interfaces probably isn’t a good idea, because doing so pollutes our persistent classes with nonportable code. Since these approaches are no longer favored, we don’t discuss them in this book.

The Interceptor interface was introduced to allow the application to process callbacks without forcing the persistent classes to implement Hibernate-specific APIs. Implementations of the Interceptor interface are passed the persistent instances as parameters. We’ll discuss an example in chapter 8.

2.2.3 Types

A fundamental and very powerful element of the architecture is Hibernate’s notion of a Type. A Hibernate Type object maps a Java type to a database column type (actually, the type may span multiple columns). All persistent properties of persistent classes, including associations, have a corresponding Hibernate type. This design makes Hibernate extremely flexible and extensible.

There is a rich range of built-in types, covering all Java primitives and many JDK classes, including types for java.util.Currency, java.util.Calendar, byte[], and java.io.Serializable. Even better, Hibernate supports user-defined custom types. The interfaces UserType and CompositeUserType are provided to allow you to add your own types. You can use this feature to allow commonly used application classes such as Address, Name, or MonetaryAmount to be handled conveniently and elegantly. Custom types are considered a central feature of Hibernate, and you’re encouraged to put them to new and creative uses! We explain Hibernate types and user-defined types in chapter 6, section 6.1, “Understanding the Hibernate type system.”
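To make the idea of a type that spans several columns more concrete, here is a minimal plain-Java sketch. It deliberately doesn’t use Hibernate’s UserType or CompositeUserType interfaces; the ColumnMapper interface, the MonetaryAmount class, and the AMOUNT and CURRENCY column names are hypothetical stand-ins invented for this illustration.

```java
import java.math.BigDecimal;
import java.util.Currency;
import java.util.LinkedHashMap;
import java.util.Map;

class TypeMappingSketch {
    public static void main(String[] args) {
        MonetaryAmountMapper mapper = new MonetaryAmountMapper();
        MonetaryAmount price =
            new MonetaryAmount(new BigDecimal("9.99"), Currency.getInstance("USD"));

        // One Java value becomes two column values
        Map<String, Object> row = mapper.toColumns(price);
        System.out.println(row); // prints {AMOUNT=9.99, CURRENCY=USD}

        // ...and two column values become one Java value again
        MonetaryAmount restored = mapper.fromColumns(row);
        System.out.println(restored.currency.getCurrencyCode()); // prints USD
    }
}

/** Hypothetical value class, standing in for an application class such as MonetaryAmount. */
final class MonetaryAmount {
    final BigDecimal value;
    final Currency currency;

    MonetaryAmount(BigDecimal value, Currency currency) {
        this.value = value;
        this.currency = currency;
    }
}

/** Hypothetical contract (not a Hibernate interface): maps one Java type to one or more columns. */
interface ColumnMapper<T> {
    Map<String, Object> toColumns(T javaValue);

    T fromColumns(Map<String, Object> row);
}

/** Maps a MonetaryAmount to two columns, so a single type spans multiple columns. */
final class MonetaryAmountMapper implements ColumnMapper<MonetaryAmount> {
    @Override
    public Map<String, Object> toColumns(MonetaryAmount amount) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("AMOUNT", amount.value);
        row.put("CURRENCY", amount.currency.getCurrencyCode());
        return row;
    }

    @Override
    public MonetaryAmount fromColumns(Map<String, Object> row) {
        BigDecimal value = (BigDecimal) row.get("AMOUNT");
        Currency currency = Currency.getInstance((String) row.get("CURRENCY"));
        return new MonetaryAmount(value, currency);
    }
}
```

The point of the sketch is only the shape of the contract: the persistent class stays free of mapping code, and the conversion logic lives in one reusable place.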


2.2.4 Extension interfaces

Much of the functionality that Hibernate provides is configurable, allowing you to choose between certain built-in strategies. When the built-in strategies are insufficient, Hibernate will usually let you plug in your own custom implementation by implementing an interface. Extension points include:

■ Primary key generation (IdentifierGenerator interface)
■ SQL dialect support (Dialect abstract class)
■ Caching strategies (Cache and CacheProvider interfaces)
■ JDBC connection management (ConnectionProvider interface)
■ Transaction management (TransactionFactory, Transaction, and TransactionManagerLookup interfaces)
■ ORM strategies (ClassPersister interface hierarchy)
■ Property access strategies (PropertyAccessor interface)
■ Proxy creation (ProxyFactory interface)

Hibernate ships with at least one implementation of each of the listed interfaces, so you don’t usually need to start from scratch if you wish to extend the built-in functionality. The source code is available for you to use as an example for your own implementation.

By now you can see that before we can start writing any code that uses Hibernate, we must answer this question: How do we get a Session to work with?

2.3 Basic configuration

We’ve looked at an example application and examined Hibernate’s core interfaces. To use Hibernate in an application, you need to know how to configure it. Hibernate can be configured to run in almost any Java application and development environment. Generally, Hibernate is used in two- and three-tiered client/server applications, with Hibernate deployed only on the server. The client application is usually a web browser, but Swing and SWT client applications aren’t uncommon. Although we concentrate on multitiered web applications in this book, our explanations apply equally to other architectures, such as command-line applications. It’s important to understand the difference in configuring Hibernate for managed and non-managed environments:

■ Managed environment: Pools resources such as database connections and allows transaction boundaries and security to be specified declaratively (that is, in metadata). A J2EE application server such as JBoss, BEA WebLogic, or IBM WebSphere implements the standard (J2EE-specific) managed environment for Java.
■ Non-managed environment: Provides basic concurrency management via thread pooling. A servlet container like Jetty or Tomcat provides a non-managed server environment for Java web applications. A stand-alone desktop or command-line application is also considered non-managed. Non-managed environments don’t provide automatic transaction or resource management or security infrastructure. The application itself manages database connections and demarcates transaction boundaries.

Hibernate attempts to abstract the environment in which it’s deployed. In the case of a non-managed environment, Hibernate handles transactions and JDBC connections (or delegates to application code that handles these concerns). In managed environments, Hibernate integrates with container-managed transactions and datasources. Hibernate can be configured for deployment in both environments.

In both managed and non-managed environments, the first thing you must do is start Hibernate. In practice, doing so is very easy: You have to create a SessionFactory from a Configuration.

2.3.1 Creating a SessionFactory

In order to create a SessionFactory, you first create a single instance of Configuration during application initialization and use it to set the location of the mapping files. Once configured, the Configuration instance is used to create the SessionFactory. After the SessionFactory is created, you can discard the Configuration object. The following code starts Hibernate:

    Configuration cfg = new Configuration();
    cfg.addResource("hello/Message.hbm.xml");
    cfg.setProperties( System.getProperties() );
    SessionFactory sessions = cfg.buildSessionFactory();

The location of the mapping file, Message.hbm.xml, is relative to the root of the application classpath. For example, if the classpath is the current directory, the Message.hbm.xml file must be in the hello directory. XML mapping files must be placed in the classpath. In this example, we also use the system properties of the virtual machine to set all other configuration options (which might have been set before by application code or as startup options).

METHOD CHAINING

Method chaining is a programming style supported by many Hibernate interfaces. This style is more popular in Smalltalk than in Java and is considered by some people to be less readable and more difficult to debug than the more accepted Java style. However, it’s very convenient in most cases. Most Java developers declare setter or adder methods to be of type void, meaning they return no value. In Smalltalk, which has no void type, setter or adder methods usually return the receiving object. This would allow us to rewrite the previous code example as follows:

    SessionFactory sessions = new Configuration()
        .addResource("hello/Message.hbm.xml")
        .setProperties( System.getProperties() )
        .buildSessionFactory();

Notice that we didn’t need to declare a local variable for the Configuration. We use this style in some code examples; but if you don’t like it, you don’t need to use it yourself. If you do use this coding style, it’s better to write each method invocation on a different line. Otherwise, it might be difficult to step through the code in your debugger.
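If you’d like to see what makes chaining possible, the following stand-alone sketch shows the mechanism in plain Java. The ChainedSettings class is hypothetical and isn’t part of Hibernate; the only point is that each adder or setter returns the receiving object instead of void.

```java
import java.util.ArrayList;
import java.util.List;

class MethodChainingSketch {
    public static void main(String[] args) {
        // Each call returns the same object, so no local variable is needed
        String summary = new ChainedSettings()
            .addResource("hello/Message.hbm.xml")
            .setDialect("PostgreSQL")
            .describe();
        System.out.println(summary); // prints [hello/Message.hbm.xml] using PostgreSQL
    }
}

/** Hypothetical settings holder (not a Hibernate class) whose adder and setter return this. */
final class ChainedSettings {
    private final List<String> resources = new ArrayList<>();
    private String dialect = "";

    ChainedSettings addResource(String path) {
        resources.add(path);
        return this; // returning the receiver is what enables chaining
    }

    ChainedSettings setDialect(String name) {
        this.dialect = name;
        return this;
    }

    String describe() {
        return resources + " using " + dialect;
    }
}
```

Writing one invocation per line, as in the main method here, keeps each call on its own line for the debugger.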

By convention, Hibernate XML mapping files are named with the .hbm.xml extension. Another convention is to have one mapping file per class, rather than have all your mappings listed in one file (which is possible but considered bad style). Our “Hello World” example had only one persistent class, but let’s assume we have multiple persistent classes, with an XML mapping file for each. Where should we put these mapping files?

The Hibernate documentation recommends that the mapping file for each persistent class be placed in the same directory as that class. For instance, the mapping file for the Message class would be placed in the hello directory in a file named Message.hbm.xml. If we had another persistent class, it would be defined in its own mapping file. We suggest that you follow this practice. The monolithic metadata files encouraged by some frameworks, such as the struts-config.xml found in Struts, are a major contributor to “metadata hell.” You load multiple mapping files by calling addResource() as often as you have to. Alternatively, if you follow the convention just described, you can use the method addClass(), passing a persistent class as the parameter:

    SessionFactory sessions = new Configuration()
        .addClass(org.hibernate.auction.model.Item.class)
        .addClass(org.hibernate.auction.model.Category.class)
        .addClass(org.hibernate.auction.model.Bid.class)
        .setProperties( System.getProperties() )
        .buildSessionFactory();
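The one-mapping-file-per-class convention can be sketched in a few lines of plain Java. This is only an illustration of the naming rule, written under the assumption that the mapping file sits next to the class and carries the .hbm.xml suffix; the MappingResourceName class and its methods are hypothetical helpers, not Hibernate API.

```java
class MappingResourceName {
    /** Derives the classpath resource name a mapping file would have under the convention. */
    static String forClass(Class<?> persistentClass) {
        return persistentClass.getName().replace('.', '/') + ".hbm.xml";
    }

    /** Reports whether such a resource is actually present on the application classpath. */
    static boolean isDeployed(Class<?> persistentClass) {
        return ClassLoader.getSystemResource(forClass(persistentClass)) != null;
    }

    public static void main(String[] args) {
        // java.lang.String is used only because it is always available
        System.out.println(forClass(String.class)); // prints java/lang/String.hbm.xml
        System.out.println(isDeployed(String.class));
    }
}
```

A check like isDeployed() can be handy in a startup helper: a missing mapping file is then reported by name rather than surfacing later as a configuration error.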


The addClass() method assumes that the name of the mapping file ends with the .hbm.xml extension and is deployed along with the mapped class file.

We’ve demonstrated the creation of a single SessionFactory, which is all that most applications need. If another SessionFactory is needed (if there are multiple databases, for example), you repeat the process. Each SessionFactory is then available for one database and ready to produce Sessions to work with that particular database and a set of class mappings.

Of course, there is more to configuring Hibernate than just pointing to mapping documents. You also need to specify how database connections are to be obtained, along with various other settings that affect the behavior of Hibernate at runtime. The multitude of configuration properties may appear overwhelming (a complete list appears in the Hibernate documentation), but don’t worry; most define reasonable default values, and only a handful are commonly required. To specify configuration options, you may use any of the following techniques:

■ Pass an instance of java.util.Properties to Configuration.setProperties().
■ Set system properties using java -Dproperty=value.
■ Place a file called hibernate.properties in the classpath.
■ Include <property> elements in hibernate.cfg.xml in the classpath.
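The first three techniques all end up as java.util.Properties entries, which the following sketch illustrates with the standard library only. The class and method names are hypothetical; the property keys are the ones used in this chapter, and nothing here calls Hibernate itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class ConfigurationSourcesSketch {
    /** Parses text in the properties file format, as it would appear in hibernate.properties. */
    static Properties parse(String fileContent) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(fileContent));
        return props;
    }

    /** Builds settings in code, the form that can be handed to Configuration.setProperties(). */
    static Properties programmatic() {
        Properties props = new Properties();
        props.setProperty("hibernate.dialect", "net.sf.hibernate.dialect.PostgreSQLDialect");
        props.setProperty("hibernate.show_sql", "true");
        return props;
    }

    public static void main(String[] args) throws IOException {
        // File-style source: whitespace around '=' is ignored by the parser
        Properties fromFile = parse("hibernate.connection.username = auctionuser\n");
        System.out.println(fromFile.getProperty("hibernate.connection.username")); // prints auctionuser

        // Programmatic source
        System.out.println(programmatic().getProperty("hibernate.show_sql")); // prints true

        // System properties are set at launch with java -Dproperty=value;
        // this yields null unless the option was given on the command line
        System.out.println(System.getProperty("hibernate.dialect"));
    }
}
```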

The first and second options are rarely used except for quick testing and prototypes, but most applications need a fixed configuration file. Both the hibernate.properties and the hibernate.cfg.xml files provide the same function: to configure Hibernate. Which file you choose to use depends on your syntax preference. It’s even possible to mix both options and have different settings for development and deployment, as you’ll see later in this chapter.

A rarely used alternative option is to allow the application to provide a JDBC Connection when it opens a Hibernate Session from the SessionFactory (for example, by calling sessions.openSession(myConnection)). Using this option means that you don’t have to specify any database connection properties. We don’t recommend this approach for new applications that can be configured to use the environment’s database connection infrastructure (for example, a JDBC connection pool or an application server datasource).

Of all the configuration options, database connection settings are the most important. They differ in managed and non-managed environments, so we deal with the two cases separately. Let’s start with non-managed.


2.3.2 Configuration in non-managed environments

In a non-managed environment, such as a servlet container, the application is responsible for obtaining JDBC connections. Hibernate is part of the application, so it’s responsible for getting these connections. You tell Hibernate how to get (or create new) JDBC connections. Generally, it isn’t advisable to create a connection each time you want to interact with the database. Instead, Java applications should use a pool of JDBC connections. There are three reasons for using a pool:

■ Acquiring a new connection is expensive.
■ Maintaining many idle connections is expensive.
■ Creating prepared statements is also expensive for some drivers.
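The reasons above boil down to reuse under an upper bound, which the following toy pool sketches. It is a hypothetical illustration in plain Java, not how C3P0 or any other real pool is implemented; production pools add connection validation, idle timeouts, and statement caching. Throwing on exhaustion is a simplification chosen for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

class PoolSketch {
    /** Hypothetical bounded pool: reuses released resources and caps how many are created. */
    static final class SimplePool<T> {
        private final Deque<T> idle = new ArrayDeque<>();
        private final Supplier<T> factory;
        private final int maxSize;
        private int created = 0;

        SimplePool(Supplier<T> factory, int maxSize) {
            this.factory = factory;
            this.maxSize = maxSize;
        }

        synchronized T acquire() {
            if (!idle.isEmpty()) {
                return idle.pop(); // reuse instead of paying the creation cost again
            }
            if (created >= maxSize) {
                throw new IllegalStateException("pool exhausted");
            }
            created++;
            return factory.get();
        }

        synchronized void release(T resource) {
            idle.push(resource);
        }

        synchronized int createdCount() {
            return created;
        }
    }

    public static void main(String[] args) {
        // StringBuilder stands in for an expensive resource such as a JDBC connection
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 2);
        StringBuilder first = pool.acquire();
        pool.release(first);
        StringBuilder second = pool.acquire();
        System.out.println(first == second);     // prints true
        System.out.println(pool.createdCount()); // prints 1
    }
}
```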

Figure 2.2 shows the role of a JDBC connection pool in a web application runtime environment. Since this non-managed environment doesn’t implement connection pooling, the application must implement its own pooling algorithm or rely upon a third-party library such as the open source C3P0 connection pool. Without Hibernate, the application code usually calls the connection pool to obtain JDBC connections and execute SQL statements.

With Hibernate, the picture changes: It acts as a client of the JDBC connection pool, as shown in figure 2.3. The application code uses the Hibernate Session and Query APIs for persistence operations and only has to manage database transactions, ideally using the Hibernate Transaction API.

Using a connection pool

Hibernate defines a plugin architecture that allows integration with any connection pool. However, support for C3P0 is built in, so we’ll use that. Hibernate will set up the connection pool for you with the given properties. An example of a hibernate.properties file using C3P0 is shown in listing 2.4.

Figure 2.2 JDBC connection pooling in a non-managed environment (application code in JSPs, servlets, or main() obtains user-managed JDBC connections from the connection pool, which connects to the database)


Figure 2.3 Hibernate with a connection pool in a non-managed environment (application code in JSPs, servlets, or main() uses Hibernate’s Session, Transaction, and Query; Hibernate obtains connections from the connection pool, which connects to the database)

Listing 2.4 Using hibernate.properties for C3P0 connection pool settings

    hibernate.connection.driver_class = org.postgresql.Driver
    hibernate.connection.url = jdbc:postgresql://localhost/auctiondb
    hibernate.connection.username = auctionuser
    hibernate.connection.password = secret
    hibernate.dialect = net.sf.hibernate.dialect.PostgreSQLDialect
    hibernate.c3p0.min_size=5
    hibernate.c3p0.max_size=20
    hibernate.c3p0.timeout=300
    hibernate.c3p0.max_statements=50
    hibernate.c3p0.idle_test_period=3000

This code’s lines specify the following information, beginning with the first line:

■ The name of the Java class implementing the JDBC Driver (the driver JAR file must be placed in the application’s classpath).
■ A JDBC URL that specifies the host and database name for JDBC connections.
■ The database user name.
■ The database password for the specified user.
■ A Dialect for the database. Despite the ANSI standardization effort, SQL is implemented differently by various database vendors. So, you must specify a Dialect. Hibernate includes built-in support for all popular SQL databases, and new dialects may be defined easily.
■ The minimum number of JDBC connections that C3P0 will keep ready.
■ The maximum number of connections in the pool. An exception will be thrown at runtime if this number is exhausted.
■ The timeout period (in this case, 5 minutes or 300 seconds) after which an idle connection will be removed from the pool.
■ The maximum number of prepared statements that will be cached. Caching of prepared statements is essential for best performance with Hibernate.
■ The idle time in seconds before a connection is automatically validated.

Specifying properties of the form hibernate.c3p0.* selects C3P0 as Hibernate’s connection pool (you don’t need any other switch to enable C3P0 support). C3P0 has even more features than we’ve shown in the previous example, so we refer you to the Hibernate API documentation. The Javadoc for the class net.sf.hibernate.cfg.Environment documents every Hibernate configuration property, including all C3P0-related settings and settings for other third-party connection pools directly supported by Hibernate.

The other supported connection pools are Apache DBCP and Proxool. You should try each pool in your own environment before deciding between them. The Hibernate community tends to prefer C3P0 and Proxool.

Hibernate also ships with a default connection pooling mechanism. This connection pool is only suitable for testing and experimenting with Hibernate: You should not use this built-in pool in production systems. It isn’t designed to scale to an environment with many concurrent requests, and it lacks the fault tolerance features found in specialized connection pools.

Starting Hibernate

How do you start Hibernate with these properties? You declared the properties in a file named hibernate.properties, so you need only place this file in the application classpath. It will be automatically detected and read when Hibernate is first initialized when you create a Configuration object.

Let’s summarize the configuration steps you’ve learned so far (this is a good time to download and install Hibernate, if you’d like to continue in a non-managed environment):

1. Download and unpack the JDBC driver for your database, which is usually available from the database vendor web site. Place the JAR files in the application classpath; do the same with hibernate2.jar.
2. Add Hibernate’s dependencies to the classpath; they’re distributed along with Hibernate in the lib/ directory. See also the text file lib/README.txt for a list of required and optional libraries.
3. Choose a JDBC connection pool supported by Hibernate and configure it with a properties file. Don’t forget to specify the SQL dialect.
4. Let the Configuration know about these properties by placing them in a hibernate.properties file in the classpath.
5. Create an instance of Configuration in your application and load the XML mapping files using either addResource() or addClass(). Build a SessionFactory from the Configuration by calling buildSessionFactory().
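Before running through these steps, it can help to confirm that the properties file is really visible on the classpath. The sketch below uses only the standard library; the StartupCheckSketch class is hypothetical, and treating the three listed keys as mandatory is an assumption made for this example (they are the driver, URL, and dialect entries from listing 2.4).

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class StartupCheckSketch {
    /** Keys taken from listing 2.4; treating them as required is an assumption of this sketch. */
    static final List<String> EXPECTED_KEYS = Arrays.asList(
        "hibernate.connection.driver_class",
        "hibernate.connection.url",
        "hibernate.dialect");

    /** Loads hibernate.properties from the classpath root; empty if the file is absent. */
    static Properties loadFromClasspath() throws IOException {
        Properties props = new Properties();
        try (InputStream in = ClassLoader.getSystemResourceAsStream("hibernate.properties")) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    /** Returns the expected keys that have no value in the given properties. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            if (props.getProperty(key) == null) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // The output depends on whether hibernate.properties is on your classpath
        System.out.println("Missing settings: " + missingKeys(loadFromClasspath()));
    }
}
```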

Unfortunately, you don’t have any mapping files yet. If you like, you can run the “Hello World” example or skip the rest of this chapter and start learning about persistent classes and mappings in chapter 3. Or, if you want to know more about using Hibernate in a managed environment, read on.

2.3.3 Configuration in managed environments

A managed environment handles certain cross-cutting concerns, such as application security (authorization and authentication), connection pooling, and transaction management. J2EE application servers are typical managed environments. Although application servers are generally designed to support EJBs, you can still take advantage of the other managed services provided, even if you don’t use EJB entity beans.

Hibernate is often used with session or message-driven EJBs, as shown in figure 2.4. EJBs call the same Hibernate APIs as servlets, JSPs, or stand-alone applications: Session, Transaction, and Query. The Hibernate-related code is fully portable between non-managed and managed environments. Hibernate handles the different connection and transaction strategies transparently.

Figure 2.4 Hibernate in a managed environment with an application server (EJBs use Hibernate’s Session, Transaction, and Query; the application server provides the transaction manager and the resource manager in front of the database)


An application server exposes a connection pool as a JNDI-bound datasource, an instance of javax.sql.DataSource. You need to tell Hibernate where to find the datasource in JNDI, by supplying a fully qualified JNDI name. An example Hibernate configuration file for this scenario is shown in listing 2.5.

Listing 2.5 Sample hibernate.properties for a container-provided datasource

    hibernate.connection.datasource = java:/comp/env/jdbc/AuctionDB
    hibernate.transaction.factory_class = \
        net.sf.hibernate.transaction.JTATransactionFactory
    hibernate.transaction.manager_lookup_class = \
        net.sf.hibernate.transaction.JBossTransactionManagerLookup
    hibernate.dialect = net.sf.hibernate.dialect.PostgreSQLDialect

This file first gives the JNDI name of the datasource. The datasource must be configured in the J2EE enterprise application deployment descriptor; this is a vendor-specific setting. Next, you enable Hibernate integration with JTA. Now Hibernate needs to locate the application server’s TransactionManager in order to integrate fully with the container transactions. No standard approach is defined by the J2EE specification, but Hibernate includes support for all popular application servers. Finally, of course, the Hibernate SQL dialect is required.

Now that you’ve configured everything correctly, using Hibernate in a managed environment isn’t much different than using it in a non-managed environment: Just create a Configuration with mappings and build a SessionFactory. However, some of the transaction environment–related settings deserve some extra consideration.

Java already has a standard transaction API, JTA, which is used to control transactions in a managed environment with J2EE. This is called container-managed transactions (CMT). If a JTA transaction manager is present, JDBC connections are enlisted with this manager and under its full control. This isn’t the case in a non-managed environment, where an application (or the pool) manages the JDBC connections and JDBC transactions directly.

Therefore, managed and non-managed environments can use different transaction methods. Since Hibernate needs to be portable across these environments, it defines an API for controlling transactions. The Hibernate Transaction interface abstracts the underlying JTA or JDBC transaction (or, potentially, even a CORBA transaction). This underlying transaction strategy is set with the property hibernate.transaction.factory_class, the same property used in listing 2.5, and it can take one of the following two values:

■ net.sf.hibernate.transaction.JDBCTransactionFactory delegates to direct JDBC transactions. This strategy should be used with a connection pool in a non-managed environment and is the default if no strategy is specified.
■ net.sf.hibernate.transaction.JTATransactionFactory delegates to JTA. This is the correct strategy for CMT, where connections are enlisted with JTA. Note that if a JTA transaction is already in progress when beginTransaction() is called, subsequent work takes place in the context of that transaction (otherwise a new JTA transaction is started).
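The difference between the two strategies can be sketched without any transaction infrastructure at all. The following plain-Java model is hypothetical throughout: TxStrategy, DirectStrategy, JoiningStrategy, and select() are invented names, and a thread-local flag stands in for a real transaction. It illustrates only the selection by name and the join-if-in-progress rule described above.

```java
class TransactionStrategySketch {
    /** Hypothetical strategy contract, loosely modeled on the idea of a transaction factory. */
    interface TxStrategy {
        /** Returns true if this call started a new transaction, false if it joined one. */
        boolean begin();
    }

    /** Stand-in for the direct JDBC strategy: every begin() starts its own transaction. */
    static final class DirectStrategy implements TxStrategy {
        @Override
        public boolean begin() {
            return true;
        }
    }

    /** Stand-in for the JTA strategy: joins a transaction already in progress on this thread. */
    static final class JoiningStrategy implements TxStrategy {
        private final ThreadLocal<Boolean> inProgress = ThreadLocal.withInitial(() -> false);

        @Override
        public boolean begin() {
            if (inProgress.get()) {
                return false; // work continues in the existing transaction
            }
            inProgress.set(true);
            return true;
        }

        void complete() {
            inProgress.set(false);
        }
    }

    /** Picks a strategy by name, the way a configuration property selects a factory class. */
    static TxStrategy select(String name) {
        if ("jta".equals(name)) {
            return new JoiningStrategy();
        }
        return new DirectStrategy(); // the default when no strategy is specified
    }

    public static void main(String[] args) {
        TxStrategy jta = select("jta");
        System.out.println(jta.begin());          // prints true  (new transaction started)
        System.out.println(jta.begin());          // prints false (joined the one in progress)
        System.out.println(select(null).begin()); // prints true  (default direct strategy)
    }
}
```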

For a more detailed introduction to Hibernate’s Transaction API and the effects on your specific application scenario, see chapter 5, section 5.1, “Transactions.” Just remember the two steps that are necessary if you work with a J2EE application server: Set the factory class for the Hibernate Transaction API to JTA as described earlier, and declare the transaction manager lookup specific to your application server. The lookup strategy is required only if you use the second-level caching system in Hibernate, but it doesn’t hurt to set it even without using the cache.

HIBERNATE WITH TOMCAT

Tomcat isn’t a full application server; it’s just a servlet container, albeit a servlet container with some features usually found only in application servers. One of these features may be used with Hibernate: the Tomcat connection pool. Tomcat uses the DBCP connection pool internally but exposes it as a JNDI datasource, just like a real application server. To configure the Tomcat datasource, you’ll need to edit server.xml according to instructions in the Tomcat JNDI/JDBC documentation. You can configure Hibernate to use this datasource by setting hibernate.connection.datasource. Keep in mind that Tomcat doesn’t ship with a transaction manager, so this situation is still more like a non-managed environment as described earlier.

You should now have a running Hibernate system, whether you use a simple servlet container or an application server. Create and compile a persistent class (the initial Message, for example), copy Hibernate and its required libraries to the classpath together with a hibernate.properties file, and build a SessionFactory.

The next section covers advanced Hibernate configuration options. Some of them are recommended, such as logging executed SQL statements for debugging or using the convenient XML configuration file instead of plain properties. However, you may safely skip this section and come back later once you have read more about persistent classes in chapter 3.


2.4 Advanced configuration settings

When you finally have a Hibernate application running, it’s well worth getting to know all the Hibernate configuration parameters. These parameters let you optimize the runtime behavior of Hibernate, especially by tuning the JDBC interaction (for example, using JDBC batch updates). We won’t bore you with these details now; the best source of information about configuration options is the Hibernate reference documentation. In the previous section, we showed you the options you’ll need to get started.

However, there is one parameter that we must emphasize at this point. You’ll need it continually whenever you develop software with Hibernate. Setting the property hibernate.show_sql to the value true enables logging of all generated SQL to the console. You’ll use it for troubleshooting, performance tuning, and just to see what’s going on. It pays to be aware of what your ORM layer is doing; that’s why ORM doesn’t hide SQL from developers.

So far, we’ve assumed that you specify configuration parameters using a hibernate.properties file or an instance of java.util.Properties programmatically. There is a third option you’ll probably like: using an XML configuration file.

2.4.1 Using XML-based configuration

You can use an XML configuration file (as demonstrated in listing 2.6) to fully configure a SessionFactory. Unlike hibernate.properties, which contains only configuration parameters, the hibernate.cfg.xml file may also specify the location of mapping documents. Many users prefer to centralize the configuration of Hibernate in this way instead of adding parameters to the Configuration in application code.

Listing 2.6 Sample hibernate.cfg.xml configuration file

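    <?xml version='1.0' encoding='utf-8'?>
    <!-- B: document type declaration -->
    <!DOCTYPE hibernate-configuration
        PUBLIC "-//Hibernate/Hibernate Configuration DTD//EN"
        "http://hibernate.sourceforge.net/hibernate-configuration-2.0.dtd">
    <hibernate-configuration>
        <!-- C: name attribute -->
        <session-factory name="...">
            <!-- D: property specifications -->
            <property name="show_sql">true</property>
            <property name="connection.datasource">java:/comp/env/jdbc/AuctionDB</property>
            <property name="dialect">net.sf.hibernate.dialect.PostgreSQLDialect</property>
            <property name="transaction.manager_lookup_class">net.sf.hibernate.transaction.JBossTransactionManagerLookup</property>
            <!-- E: mapping document specifications -->
            <mapping resource="..."/>
        </session-factory>
    </hibernate-configuration>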

B The document type declaration is used by the XML parser to validate this document against the Hibernate configuration DTD.

C The optional name attribute is equivalent to the property hibernate.session_factory_name and used for JNDI binding of the SessionFactory, discussed in the next section.

D Hibernate properties may be specified without the hibernate prefix. Property names and values are otherwise identical to programmatic configuration properties.

E Mapping documents may be specified as application resources or even as hardcoded filenames. The files used here are from our online auction application, which we’ll introduce in chapter 3.

Now you can initialize Hibernate using

    SessionFactory sessions = new Configuration()
        .configure().buildSessionFactory();

Wait: how did Hibernate know where the configuration file was located? When configure() was called, Hibernate searched for a file named hibernate.cfg.xml in the classpath. If you wish to use a different filename or have Hibernate look in a subdirectory, you must pass a path to the configure() method:

    SessionFactory sessions = new Configuration()
        .configure("/hibernate-config/auction.cfg.xml")
        .buildSessionFactory();

Using an XML configuration file is certainly more comfortable than a properties file or even programmatic property configuration. The fact that you can have the class mapping files externalized from the application’s source (even if it would be only in a startup helper class) is a major benefit of this approach. You can, for example, use different sets of mapping files (and different configuration options), depending on your database and environment (development or production), and switch them programmatically.


If you have both hibernate.properties and hibernate.cfg.xml in the classpath, the settings of the XML configuration file will override the settings used in the properties. This is useful if you keep some base settings in properties and override them for each deployment with an XML configuration file.

You may have noticed that the SessionFactory was also given a name in the XML configuration file. Hibernate uses this name to automatically bind the SessionFactory to JNDI after creation.

2.4.2 JNDI-bound SessionFactory

In most Hibernate applications, the SessionFactory should be instantiated once during application initialization. The single instance should then be used by all code in a particular process, and any Sessions should be created using this single SessionFactory. A frequently asked question is where this factory must be placed and how it can be accessed without much hassle.

In a J2EE environment, a SessionFactory bound to JNDI is easily shared between different threads and between various Hibernate-aware components. Of course, JNDI isn’t the only way that application components might obtain a SessionFactory. There are many possible implementations of this Registry pattern, including use of the ServletContext or a static final variable in a singleton. A particularly elegant approach is to use an application scope IoC (Inversion of Control) framework component. However, JNDI is a popular approach (and is exposed as a JMX service, as you’ll see later). We discuss some of the alternatives in chapter 8, section 8.1, “Designing layered applications.”

NOTE

The Java Naming and Directory Interface (JNDI) API allows objects to be stored to and retrieved from a hierarchical structure (directory tree). JNDI implements the Registry pattern. Infrastructural objects (transaction contexts, datasources), configuration settings (environment settings, user registries), and even application objects (EJB references, object factories) may all be bound to JNDI.
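For readers who haven’t met the Registry pattern before, here is a minimal plain-Java sketch of the idea: a well-known place where an object is bound under a name once and looked up by any component later. The Registry class below is hypothetical and flat; JNDI provides the same service with a hierarchical namespace and pluggable providers. The name hibernate/HibernateFactory is borrowed from the example later in this section, and a plain Object stands in for the SessionFactory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RegistrySketch {
    /** Hypothetical flat registry; JNDI offers the same idea with a directory tree. */
    static final class Registry {
        private static final Map<String, Object> BINDINGS = new ConcurrentHashMap<>();

        /** Binds a value under a name; rebinding an existing name is treated as an error. */
        static void bind(String name, Object value) {
            if (BINDINGS.putIfAbsent(name, value) != null) {
                throw new IllegalStateException("already bound: " + name);
            }
        }

        /** Returns the object bound under the given name. */
        static Object lookup(String name) {
            Object value = BINDINGS.get(name);
            if (value == null) {
                throw new IllegalArgumentException("not bound: " + name);
            }
            return value;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Object factory = new Object(); // stand-in for the single SessionFactory
        Registry.bind("hibernate/HibernateFactory", factory);

        // Another thread obtains the very same instance by name
        Thread worker = new Thread(() ->
            System.out.println(Registry.lookup("hibernate/HibernateFactory") == factory)); // prints true
        worker.start();
        worker.join();
    }
}
```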

The SessionFactory will automatically bind itself to JNDI if the property hibernate.session_factory_name is set to the name of the directory node. If your runtime environment doesn’t provide a default JNDI context (or if the default JNDI implementation doesn’t support instances of Referenceable), you need to specify a JNDI initial context using the properties hibernate.jndi.url and hibernate.jndi.class.


Here is an example Hibernate configuration that binds the SessionFactory to the name hibernate/HibernateFactory using Sun’s (free) file system–based JNDI implementation, fscontext.jar:

hibernate.connection.datasource = java:/comp/env/jdbc/AuctionDB
hibernate.transaction.factory_class = \
    net.sf.hibernate.transaction.JTATransactionFactory
hibernate.transaction.manager_lookup_class = \
    net.sf.hibernate.transaction.JBossTransactionManagerLookup
hibernate.dialect = net.sf.hibernate.dialect.PostgreSQLDialect
hibernate.session_factory_name = hibernate/HibernateFactory
hibernate.jndi.class = com.sun.jndi.fscontext.RefFSContextFactory
hibernate.jndi.url = file:/auction/jndi

Of course, you can also use the XML-based configuration for this task. This example also isn’t realistic, since most application servers that provide a connection pool through JNDI also have a JNDI implementation with a writable default context. JBoss certainly has, so you can skip the last two properties and just specify a name for the SessionFactory. All you have to do now is call Configuration.configure().buildSessionFactory() once to initialize the binding.

NOTE Tomcat comes bundled with a read-only JNDI context, which isn’t writable from application-level code after the startup of the servlet container. Hibernate can’t bind to this context; you have to either use a full context implementation (like the Sun FS context) or disable JNDI binding of the SessionFactory by omitting the session_factory_name property in the configuration.
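The properties file shown earlier in this section relies on two mechanics of the standard properties format: a trailing backslash continues a value on the next line, and one set of settings can be layered over another. The following sketch uses only java.util.Properties to show both. It is an analogy for the idea of base settings plus per-deployment overrides, not a description of how Hibernate merges its configuration internally; the value hibernate/OtherFactory is a made-up override.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertiesLayeringDemo {
    // Parses properties text; keys missing from the text fall back to 'defaults'.
    static Properties parse(String text, Properties defaults) {
        Properties props = new Properties(defaults);
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Base settings; the backslash joins the two physical lines into one value.
        String base =
            "hibernate.transaction.factory_class = \\\n"
            + "    net.sf.hibernate.transaction.JTATransactionFactory\n"
            + "hibernate.session_factory_name = hibernate/HibernateFactory\n";

        // Hypothetical per-deployment override of a single setting.
        String override = "hibernate.session_factory_name = hibernate/OtherFactory\n";

        Properties baseProps = parse(base, null);
        Properties effective = parse(override, baseProps);

        System.out.println(effective.getProperty("hibernate.transaction.factory_class"));
        System.out.println(effective.getProperty("hibernate.session_factory_name"));
    }
}
```

The first lookup falls through to the base settings, while the second returns the overriding value.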

Let’s look at some other very important configuration settings that log Hibernate operations.

2.4.3 Logging

Hibernate (and many other ORM implementations) executes SQL statements asynchronously. An INSERT statement isn’t usually executed when the application calls Session.save(); an UPDATE isn’t immediately issued when the application calls Item.addBid(). Instead, the SQL statements are usually issued at the end of a transaction. This behavior is called write-behind, as we mentioned earlier.

This fact is evidence that tracing and debugging ORM code is sometimes nontrivial. In theory, it’s possible for the application to treat Hibernate as a black box and ignore this behavior. Certainly the Hibernate application can’t detect this asynchronicity (at least, not without resorting to direct JDBC calls). However, when you find yourself troubleshooting a difficult problem, you need to be able to see exactly what’s going on inside Hibernate. Since Hibernate is open source, you can easily step into the Hibernate code. Occasionally, doing so helps a great deal! But, especially in the face of asynchronous behavior, debugging Hibernate can quickly get you lost. You can use logging to get a view of Hibernate’s internals.

We’ve already mentioned the hibernate.show_sql configuration parameter, which is usually the first port of call when troubleshooting. Sometimes the SQL alone is insufficient; in that case, you must dig a little deeper.

Hibernate logs all interesting events using Apache commons-logging, a thin abstraction layer that directs output to either Apache log4j (if you put log4j.jar in your classpath) or JDK1.4 logging (if you’re running under JDK1.4 or above and log4j isn’t present). We recommend log4j, since it’s more mature, more popular, and under more active development.

To see any output from log4j, you’ll need a file named log4j.properties in your classpath (right next to hibernate.properties or hibernate.cfg.xml). This example directs all log messages to the console:

### direct log messages to stdout ###
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n

### root logger option ###
log4j.rootLogger=warn, stdout

### Hibernate logging options ###
log4j.logger.net.sf.hibernate=info

### log JDBC bind parameters ###
log4j.logger.net.sf.hibernate.type=info

### log PreparedStatement cache activity ###
log4j.logger.net.sf.hibernate.ps.PreparedStatementCache=info

With this configuration, you won’t see many log messages at runtime. Replacing info with debug for the log4j.logger.net.sf.hibernate category will reveal the inner workings of Hibernate. Make sure you don’t do this in a production environment: writing the log will be much slower than the actual database access.

Finally, you have the hibernate.properties, hibernate.cfg.xml, and log4j.properties configuration files. There is another way to configure Hibernate, if your application server supports the Java Management Extensions.

2.4.4 Java Management Extensions (JMX)

The Java world is full of specifications, standards, and, of course, implementations of these. A relatively new but important standard is in its first version: the Java Management Extensions (JMX). JMX is about the management of systems components or, better, of system services.

Where does Hibernate fit into this new picture? Hibernate, when deployed in an application server, makes use of other services like managed transactions and pooled database connections. But why not make Hibernate a managed service itself, which others can depend on and use? This is possible with the Hibernate JMX integration, making Hibernate a managed JMX component.

The JMX specification defines the following components:

■ The JMX MBean: A reusable component (usually infrastructural) that exposes an interface for management (administration)

■ The JMX container: Mediates generic access (local or remote) to the MBean

■ The (usually generic) JMX client: May be used to administer any MBean via the JMX container
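The three roles listed above can be tried out with the JDK alone. The following is a minimal sketch, not the Hibernate JMX service: AuctionService and the object names are hypothetical, and the JVM’s platform MBeanServer stands in for the JMX container that an application server would provide. The helper method plays the part of a generic client, because it talks to the MBean only through the server and never through a direct Java reference.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {
    // Management interface. For a standard MBean the interface name is the
    // implementation class name plus "MBean", and the interface must be public.
    public interface AuctionServiceMBean {
        boolean isStarted();   // exposed as the read-only attribute "Started"
        void start();          // exposed as an operation
        void stop();           // exposed as an operation
    }

    // The managed component itself (hypothetical service used for illustration).
    public static class AuctionService implements AuctionServiceMBean {
        private volatile boolean started;

        public boolean isStarted() { return started; }
        public void start() { started = true; }
        public void stop() { started = false; }
    }

    // Registers the MBean, starts it through the server, reads its state back,
    // and unregisters it again so the name can be reused.
    static boolean startViaServer(String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(new AuctionService(), objectName);
            try {
                server.invoke(objectName, "start", new Object[0], new String[0]);
                return (Boolean) server.getAttribute(objectName, "Started");
            } finally {
                server.unregisterMBean(objectName);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(startViaServer("example.auction:service=AuctionService"));
    }
}
```

Note that start() and stop() are ordinary operations defined by this sketch’s own interface; nothing here implies a standardized lifecycle contract.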

An application server with support for JMX (such as JBoss) acts as a JMX container and allows an MBean to be configured and initialized as part of the application server startup process. It’s possible to monitor and administer the MBean using the application server’s administration console (which acts as the JMX client). An MBean may be packaged as a JMX service, which is not only portable between application servers with JMX support but also deployable to a running system (a hot deploy).

Hibernate may be packaged and administered as a JMX MBean. The Hibernate JMX service allows Hibernate to be initialized at application server startup and controlled (configured) via a JMX client. However, JMX components aren’t automatically integrated with container-managed transactions. So, the configuration options in listing 2.7 (a JBoss service deployment descriptor) look similar to the usual Hibernate settings in a managed environment.

Listing 2.7 Hibernate jboss-service.xml JMX deployment descriptor

jboss.jca:service=RARDeployer jboss.jca:service=LocalTxCM,name=DataSource

auction/Item.hbm.xml, auction/Bid.hbm.xml


java:/hibernate/HibernateFactory

java:/comp/env/jdbc/AuctionDB

net.sf.hibernate.dialect.PostgreSQLDialect

net.sf.hibernate.transaction.JTATransactionFactory

net.sf.hibernate.transaction.JBossTransactionManagerLookup

java:/UserTransaction



The HibernateService depends on two other JMX services: service=RARDeployer and service=LocalTxCM,name=DataSource, both in the jboss.jca service domain name.

The Hibernate MBean may be found in the package net.sf.hibernate.jmx. Unfortunately, lifecycle management methods like starting and stopping the JMX service aren’t part of the JMX 1.0 specification. The methods start() and stop() of the HibernateService are therefore specific to the JBoss application server.

NOTE If you’re interested in the advanced usage of JMX, JBoss is a good open source starting point: All services (even the EJB container) in JBoss are implemented as MBeans and can be managed via a supplied console interface.

We recommend that you try to configure Hibernate programmatically (using the Configuration object) before you try to run Hibernate as a JMX service. However, some features (like hot-redeployment of Hibernate applications) may be possible only with JMX, once they become available in Hibernate. Right now, the biggest advantage of Hibernate with JMX is the automatic startup; it means you no longer have to create a Configuration and build a SessionFactory in your application code, but can simply access the SessionFactory through JNDI once the HibernateService has been deployed and started.


2.5 Summary

In this chapter, we took a high-level look at Hibernate and its architecture after running a simple “Hello World” example. You also saw how to configure Hibernate in various environments and with various techniques, even including JMX.

The Configuration and SessionFactory interfaces are the entry points to Hibernate for applications running in both managed and non-managed environments. Hibernate provides additional APIs, such as the Transaction interface, to bridge the differences between environments and allow you to keep your persistence code portable.

Hibernate can be integrated into almost every Java environment, be it a servlet, an applet, or a fully managed three-tiered client/server application. The most important elements of a Hibernate configuration are the database resources (connection configuration), the transaction strategies, and, of course, the XML-based mapping metadata.

Hibernate’s configuration interfaces have been designed to cover as many usage scenarios as possible while still being easy to understand. Usually, a single file named hibernate.cfg.xml and one line of code are enough to get Hibernate up and running.

None of this is much use without some persistent classes and their XML mapping documents. The next chapter is dedicated to writing and mapping persistent classes. You’ll soon be able to store and retrieve persistent objects in a real application with a nontrivial object/relational mapping.

Mapping persistent classes

This chapter covers

■ POJO basics for rich domain models

■ Mapping POJOs with Hibernate metadata

■ Mapping class inheritance and fine-grained models

■ An introduction to class association mappings


The “Hello World” example in chapter 2 introduced you to Hibernate; however, it isn’t very useful for understanding the requirements of real-world applications with complex data models. For the rest of the book, we’ll use a much more sophisticated example application (an online auction system) to demonstrate Hibernate.

In this chapter, we start our discussion of the application by introducing a programming model for persistent classes. Designing and implementing the persistent classes is a multistep process that we’ll examine in detail.

First, you’ll learn how to identify the business entities of a problem domain. We create a conceptual model of these entities and their attributes, called a domain model. We implement this domain model in Java by creating a persistent class for each entity. (We’ll spend some time exploring exactly what these Java classes should look like.)

We then define mapping metadata to tell Hibernate how these classes and their properties relate to database tables and columns. This involves writing or generating XML documents that are eventually deployed along with the compiled Java classes and used by Hibernate at runtime. This discussion of mapping metadata is the core of this chapter, along with the in-depth exploration of the mapping techniques for fine-grained classes, object identity, inheritance, and associations. This chapter therefore provides the beginnings of a solution to the first four generic problems of ORM listed in section 1.4.2, “Generic ORM problems.”

We’ll start by introducing the example application.

3.1 The CaveatEmptor application

The CaveatEmptor online auction application demonstrates ORM techniques and Hibernate functionality; you can download the source code for the entire working application from the web site http://caveatemptor.hibernate.org. The application will have a web-based user interface and run inside a servlet engine like Tomcat. We won’t pay much attention to the user interface; we’ll concentrate on the data access code. In chapter 8, we discuss the changes that would be necessary if we were to perform all business logic and data access from a separate business-tier implemented as EJB session beans.

But, let’s start at the beginning. In order to understand the design issues involved in ORM, let’s pretend the CaveatEmptor application doesn’t yet exist, and that we’re building it from scratch. Our first task would be analysis.


3.1.1 Analyzing the business domain

A software development effort begins with analysis of the problem domain (assuming that no legacy code or legacy database already exists).

At this stage, you, with the help of problem domain experts, identify the main entities that are relevant to the software system. Entities are usually notions understood by users of the system: Payment, Customer, Order, Item, Bid, and so forth. Some entities might be abstractions of less concrete things the user thinks about (for example, PricingAlgorithm), but even these would usually be understandable to the user. All these entities are found in the conceptual view of the business, which we sometimes call a business model. Developers of object-oriented software analyze the business model and create an object model, still at the conceptual level (no Java code). This object model may be as simple as a mental image existing only in the mind of the developer, or it may be as elaborate as a UML class diagram (as in figure 3.1) created by a CASE (Computer-Aided Software Engineering) tool like ArgoUML or TogetherJ.

Figure 3.1 A class diagram of a typical online auction object model (entities Category, Item, and User; a User sells Items)

This simple model contains entities that you’re bound to find in any typical auction system: Category, Item, and User. The entities and their relationships (and perhaps their attributes) are all represented by this model of the problem domain. We call this kind of model (an object-oriented model of entities from the problem domain, encompassing only those entities that are of interest to the user) a domain model. It’s an abstract view of the real world. We refer to this model when we implement our persistent Java classes.

Let’s examine the outcome of our analysis of the problem domain of the CaveatEmptor application.

3.1.2 The CaveatEmptor domain model

The CaveatEmptor site auctions many different kinds of items, from electronic equipment to airline tickets. Auctions proceed according to the “English auction” model: Users continue to place bids on an item until the bid period for that item expires, and the highest bidder wins.

In any store, goods are categorized by type and grouped with similar goods into sections and onto shelves. Clearly, our auction catalog requires some kind of hierarchy of item categories. A buyer may browse these categories or arbitrarily search by category and item attributes. Lists of items appear in the category browser and search result screens. Selecting an item from a list will take the buyer to an item detail view.

An auction consists of a sequence of bids. One particular bid is the winning bid. User details include name, login, address, email address, and billing information.

A web of trust is an essential feature of an online auction site. The web of trust allows users to build a reputation for trustworthiness (or untrustworthiness). Buyers may create comments about sellers (and vice versa), and the comments are visible to all other users.

A high-level overview of our domain model is shown in figure 3.2. Let’s briefly discuss some interesting features of this model.

Each item may be auctioned only once, so we don’t need to make Item distinct from the Auction entities. Instead, we have a single auction item entity named Item. Thus, Bid is associated directly with Item. Users can write Comments about other users only in the context of an auction; hence the association between Item and Comment. The Address information of a User is modeled as a separate class, even though the User may have only one Address. We do allow the user to have multiple BillingDetails. The various billing strategies are represented as subclasses of an abstract class (allowing future extension).

A Category might be nested inside another Category. This is expressed by a recursive association, from the Category entity to itself. Note that a single Category may have multiple child categories but at most one parent category. Each Item belongs to at least one Category.

The entities in a domain model should encapsulate state and behavior. For example, the User entity should define the name and address of a customer and the logic required to calculate the shipping costs for items (to this particular customer). Our domain model is a rich object model, with complex associations, interactions, and inheritance relationships.
An interesting and detailed discussion of object-oriented techniques for working with domain models can be found in Patterns of Enterprise Application Architecture [Fowler 2003] or in Domain-Driven Design [Evans 2004].

However, in this book, we won’t have much to say about business rules or about the behavior of our domain model. This is certainly not because we consider this an unimportant concern; rather, this concern is mostly orthogonal to the problem of persistence. It’s the state of our entities that is persistent. So, we concentrate our discussion on how to best represent state in our domain model, not on how to represent behavior. For example, in this book, we aren’t interested in how tax for sold items is calculated or how the system might approve a new user account. We’re more interested in how the relationship between users and the items they sell is represented and made persistent.

Figure 3.2 Persistent classes of the CaveatEmptor object model and their relationships

FAQ Can you use ORM without a domain model? We stress that object persistence with full ORM is most suitable for applications based on a rich domain model. If your application doesn’t implement complex business rules or complex interactions between entities (or if you have few entities), you may not need a domain model. Many simple and some not-so-simple problems are perfectly suited to table-oriented solutions, where the application is designed around the database data model instead of around an object-oriented domain model, often with logic executed in the database (stored procedures). However, the more complex and expressive your domain model, the more you will benefit from using Hibernate; it shines when dealing with the full complexity of object/relational persistence.

Now that we have a domain model, our next step is to implement it in Java. Let’s look at some of the things we need to consider.

3.2 Implementing the domain model

Several issues typically must be addressed when you implement a domain model in Java. For instance, how do you separate the business concerns from the cross-cutting concerns (such as transactions and even persistence)? What kind of persistence is needed: Do you need automated or transparent persistence? Do you have to use a specific programming model to achieve this? In this section, we examine these types of issues and how to address them in a typical Hibernate application.

Let’s start with an issue that any implementation must deal with: the separation of concerns. The domain model implementation is usually a central, organizing component; it’s reused heavily whenever you implement new application functionality. For this reason, you should be prepared to go to some lengths to ensure that concerns other than business aspects don’t leak into the domain model implementation.

3.2.1 Addressing leakage of concerns

The domain model implementation is such an important piece of code that it shouldn’t depend on other Java APIs. For example, code in the domain model shouldn’t perform JNDI lookups or call the database via the JDBC API. This allows you to reuse the domain model implementation virtually anywhere. Most importantly, it makes it easy to unit test the domain model (in JUnit, for example) outside of any application server or other managed environment.
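To make the testability argument concrete, here is a minimal sketch of domain code that depends on nothing but the JDK. The simplified Item and Bid classes and the bidding rule are assumptions made for this illustration; they are not the CaveatEmptor implementation, which is developed later in the book.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified domain classes: no JNDI, JDBC, or persistence API is used.
class Bid {
    private final BigDecimal amount;

    Bid(BigDecimal amount) {
        this.amount = amount;
    }

    BigDecimal getAmount() {
        return amount;
    }
}

class Item {
    private final List<Bid> bids = new ArrayList<>();

    // Business rule (assumed for illustration): a new bid must beat the current highest bid.
    Bid placeBid(BigDecimal amount) {
        Bid highest = getHighestBid();
        if (highest != null && amount.compareTo(highest.getAmount()) <= 0) {
            throw new IllegalArgumentException("Bid too low");
        }
        Bid bid = new Bid(amount);
        bids.add(bid);
        return bid;
    }

    // Accepted bids are strictly increasing, so the last one is the highest.
    Bid getHighestBid() {
        return bids.isEmpty() ? null : bids.get(bids.size() - 1);
    }

    List<Bid> getBids() {
        return Collections.unmodifiableList(bids);
    }
}

class ItemCheck {
    public static void main(String[] args) {
        Item item = new Item();
        item.placeBid(new BigDecimal("10.00"));
        item.placeBid(new BigDecimal("12.50"));
        System.out.println(item.getHighestBid().getAmount()); // prints 12.50
    }
}
```

Because the rule lives in plain Java, it can be exercised from a main method or a unit test with no application server, datasource, or transaction manager in place.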


We say that the domain model should be “concerned” only with modeling the business domain. However, there are other concerns, such as persistence, transaction management, and authorization. You shouldn’t put code that addresses these cross-cutting concerns in the classes that implement the domain model. When these concerns start to appear in the domain model classes, we call this an example of leakage of concerns.

The EJB standard tries to solve the problem of leaky concerns. Indeed, if we implemented our domain model using entity beans, the container would take care of some concerns for us (or at least externalize those concerns to the deployment descriptor). The EJB container prevents leakage of certain cross-cutting concerns using interception. An EJB is a managed component, always executed inside the EJB container. The container intercepts calls to your beans and executes its own functionality. For example, it might pass control to the CMP engine, which takes care of persistence. This approach allows the container to implement the predefined cross-cutting concerns (security, concurrency, persistence, transactions, and remoteness) in a generic way.

Unfortunately, the EJB specification imposes many rules and restrictions on how you must implement a domain model. This in itself is a kind of leakage of concerns; in this case, the concerns of the container implementor have leaked! Hibernate isn’t an application server, and it doesn’t try to implement all the cross-cutting concerns mentioned in the EJB specification. Hibernate is a solution for just one of these concerns: persistence. If you require declarative security and transaction management, you should still access your domain model via a session bean, taking advantage of the EJB container’s implementation of these concerns. Hibernate is commonly used together with the well-known session façade J2EE pattern.
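Interception, as described above, can be approximated with a JDK dynamic proxy. The sketch below is only an analogy for what a container does: AuctionFacade and its implementation are hypothetical, and the recorded "begin", "commit", and "rollback" entries stand in for real transaction demarcation. No EJB API is involved.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class InterceptionSketch {
    // Business interface and implementation; neither knows about transactions.
    public interface AuctionFacade {
        String closeAuction(String itemName);
    }

    static class AuctionFacadeImpl implements AuctionFacade {
        public String closeAuction(String itemName) {
            return "closed " + itemName;
        }
    }

    // Wraps the target in a proxy that records a begin/commit (or rollback)
    // entry around every call, keeping that concern out of the business class.
    static AuctionFacade intercept(AuctionFacade target, List<String> log) {
        return (AuctionFacade) Proxy.newProxyInstance(
            AuctionFacade.class.getClassLoader(),
            new Class<?>[] { AuctionFacade.class },
            (proxy, method, args) -> {
                log.add("begin " + method.getName());
                try {
                    Object result = method.invoke(target, args);
                    log.add("commit " + method.getName());
                    return result;
                } catch (InvocationTargetException e) {
                    log.add("rollback " + method.getName());
                    throw e.getCause();
                }
            });
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        AuctionFacade facade = intercept(new AuctionFacadeImpl(), log);
        System.out.println(facade.closeAuction("lamp"));
        System.out.println(log);
    }
}
```

The caller sees only the AuctionFacade interface, so the cross-cutting behavior is added or removed without touching AuctionFacadeImpl.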
Much discussion has gone into the topic of persistence, and both Hibernate and EJB entity beans take care of that concern. However, Hibernate offers something that entity beans don’t: transparent persistence.

3.2.2 Transparent and automated persistence

Your application server’s CMP engine implements automated persistence. It takes care of the tedious details of JDBC ResultSet and PreparedStatement handling. So does Hibernate; indeed, Hibernate is a great deal more sophisticated in this respect. But Hibernate does this in a way that is transparent to your domain model.

We use transparent to mean a complete separation of concerns between the persistent classes of the domain model and the persistence logic itself, where the persistent classes are unaware of, and have no dependency on, the persistence mechanism.


Our Item class, for example, will not have any code-level dependency on any Hibernate API. Furthermore:

■ Hibernate doesn’t require that any special superclasses or interfaces be inherited or implemented by persistent classes. Nor are any special classes used to implement properties or associations. Thus, transparent persistence improves code readability, as you’ll soon see.

■ Persistent classes may be reused outside the context of persistence, in unit tests or in the user interface (UI) tier, for example. Testability is a basic requirement for applications with rich domain models.

■ In a system with transparent persistence, objects aren’t aware of the underlying data store; they need not even be aware that they are being persisted or retrieved. Persistence concerns are externalized to a generic persistence manager interface; in the case of Hibernate, the Session and Query interfaces.

Transparent persistence fosters a degree of portability; without special interfaces, the persistent classes are decoupled from any particular persistence solution. Our business logic is fully reusable in any other application context. We could easily change to another transparent persistence mechanism.

By this definition of transparent persistence, you see that certain non-automated persistence layers are transparent (for example, the DAO pattern) because they decouple the persistence-related code with abstract programming interfaces. Only plain Java classes without dependencies are exposed to the business logic. Conversely, some automated persistence layers (including entity beans and some ORM solutions) are non-transparent, because they require special interfaces or intrusive programming models.

We regard transparency as required. In fact, transparent persistence should be one of the primary goals of any ORM solution. However, no automated persistence solution is completely transparent: Every automated persistence layer, including Hibernate, imposes some requirements on the persistent classes. For example, Hibernate requires that collection-valued properties be typed to an interface such as java.util.Set or java.util.List and not to an actual implementation such as java.util.HashSet (this is a good practice anyway). (We discuss the reasons for this requirement in appendix B, “ORM implementation strategies.”)

You now know why the persistence mechanism should have minimal impact on how you implement a domain model and that transparent and automated persistence are required. EJB isn’t transparent, so what kind of programming model should you use? Do you need a special programming model at all? In theory, no; in practice, you should adopt a disciplined, consistent programming model that is well accepted by the Java community. Let’s discuss this programming model and see how it works with Hibernate.

3.2.3 Writing POJOs

Developers have found entity beans to be tedious, unnatural, and unproductive. As a reaction against entity beans, many developers started talking about Plain Old Java Objects (POJOs), a back-to-basics approach that essentially revives JavaBeans, a component model for UI development, and reapplies it to the business layer. (Most developers are now using the terms POJO and JavaBean almost synonymously.) [1]

[1] POJO is sometimes also written as Plain Ordinary Java Objects; this term was coined in 2002 by Martin Fowler, Rebecca Parsons, and Josh Mackenzie.

Hibernate works best with a domain model implemented as POJOs. The few requirements that Hibernate imposes on your domain model are also best practices for the POJO programming model. So, most POJOs are Hibernate-compatible without any changes. The programming model we’ll introduce is a non-intrusive mix of JavaBean specification details, POJO best practices, and Hibernate requirements. A POJO declares business methods, which define behavior, and properties, which represent state. Some properties represent associations to other POJOs.

Listing 3.1 shows a simple POJO class; it’s an implementation of the User entity of our domain model.

Listing 3.1 POJO implementation of the User class

    import java.io.Serializable;

    public class User implements Serializable {        // B: Implementation of Serializable

        private String username;
        private Address address;

        public User() {}                                // C: Class constructor

        public String getUsername() {                   // D: Accessor methods
            return username;
        }
        public void setUsername(String username) {
            this.username = username;
        }
        public Address getAddress() {
            return address;
        }
        public void setAddress(Address address) {
            this.address = address;
        }

        public MonetaryAmount calcShippingCosts(Address fromLocation) {    // E: Business method
            ...
        }
    }

B  Hibernate doesn’t require that persistent classes implement Serializable. However, when objects are stored in an HttpSession or passed by value using RMI, serialization is necessary. (This is very likely to happen in a Hibernate application.)

C  Unlike the JavaBeans specification, which requires no specific constructor, Hibernate requires a constructor with no arguments for every persistent class. Hibernate instantiates persistent classes using Constructor.newInstance(), a feature of the Java reflection API. The constructor may be non-public, but it should be at least package-visible if runtime-generated proxies will be used for performance optimization (see chapter 4).

D  The properties of the POJO implement the attributes of our business entities; for example, the username of User. Properties are usually implemented as instance variables, together with property accessor methods: a method for retrieving the value of the instance variable and a method for changing its value. These methods are known as the getter and setter, respectively. Our example POJO declares getter and setter methods for the private username instance variable and also for address.

The JavaBean specification defines the guidelines for naming these methods. The guidelines allow generic tools like Hibernate to easily discover and manipulate the property value. A getter method name begins with get, followed by the name of the property (the first letter in uppercase); a setter method name begins with set. Getter methods for Boolean properties may begin with is instead of get.

Hibernate doesn’t require that accessor methods be declared public; it can easily use private accessors for property management.

Some getter and setter methods do something more sophisticated than simple instance variable access (validation, for example). Trivial accessor methods are common, however.

E  This POJO also defines a business method that calculates the cost of shipping an item to a particular user (we left out the implementation of this method).
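The constructor and accessor conventions described in these notes are what make generic, reflection-based tools possible. The following sketch shows the idea with plain java.lang.reflect calls. Member and ReflectiveAccess are hypothetical names, and the code is not Hibernate’s implementation; it only demonstrates instantiating through a non-public no-argument constructor and deriving accessor names from a property name.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;

// Hypothetical POJO with a package-visible no-argument constructor
// and JavaBean-style accessors.
class Member {
    private String username;

    Member() {}

    public String getUsername() {
        return username;
    }

    public void setUsername(String username) {
        this.username = username;
    }
}

class ReflectiveAccess {
    // Derives an accessor name from a property name: "username" -> "getUsername".
    static String accessorName(String prefix, String property) {
        return prefix + Character.toUpperCase(property.charAt(0)) + property.substring(1);
    }

    // Instantiates through the no-argument constructor, even if it isn't public.
    static <T> T instantiate(Class<T> type) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls the setter; this simple lookup needs the value's exact class
    // to match the declared parameter type.
    static void set(Object target, String property, Object value) {
        try {
            Method setter = target.getClass()
                .getDeclaredMethod(accessorName("set", property), value.getClass());
            setter.setAccessible(true);
            setter.invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Calls the getter and returns its result.
    static Object get(Object target, String property) {
        try {
            Method getter = target.getClass().getDeclaredMethod(accessorName("get", property));
            getter.setAccessible(true);
            return getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Member member = instantiate(Member.class);
        set(member, "username", "johndoe");
        System.out.println(get(member, "username")); // prints johndoe
    }
}
```

The call to setAccessible(true) is what lets such a tool work with non-public constructors and accessors, which is why the visibility of these members is a design choice left to the class author.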


Now that you understand the value of using POJO persistent classes as the programming model, let’s see how you handle the associations between those classes.

3.2.4 Implementing POJO associations

You u se p rop erties to exp ress association s between PO JO classes, and you use accessor meth ods to n avigate th e object 0..* name : String graph at run time. Let’s con sider th e association s defin ed by t h e Category cla ss. T h e fir st a sso cia t io n is sh o wn in figure 3.3. Figure 3 .3 Diagram of As with all our diagrams, we left out th e association the Category class related attributes ( parentCategory an d childCategories) with an association because th ey would clutter the illustration. These attributes an d th e meth ods th at man ipulate th eir values are called scaffolding code. Let’s implement the scaffolding code for the one-to-many self-association of Category: Category

public class Category implements Serializable { private String name; private Category parentCategory; private Set childCategories = new HashSet(); public Category() { } ... }

To allow bidirectional navigation of the association, we require two attributes. The parentCategory attribute implements the single-valued end of the association and is declared to be of type Category. The many-valued end, implemented by the childCategories attribute, must be of collection type. We choose a Set, since duplicates are disallowed, and initialize the instance variable to a new instance of HashSet.

Hibernate requires interfaces for collection-typed attributes. You must use java.util.Set rather than HashSet, for example. At runtime, Hibernate wraps the HashSet instance with an instance of one of Hibernate’s own classes. (This special class isn’t visible to the application code.) It is good practice to program to collection interfaces, rather than concrete implementations, so this restriction shouldn’t bother you.

We now have some private instance variables but no public interface to allow access from business code or property management by Hibernate. Let’s add some accessor methods to the Category class:

    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public Set getChildCategories() {
        return childCategories;
    }
    public void setChildCategories(Set childCategories) {
        this.childCategories = childCategories;
    }
    public Category getParentCategory() {
        return parentCategory;
    }
    public void setParentCategory(Category parentCategory) {
        this.parentCategory = parentCategory;
    }

Again, these accessor methods need to be declared public only if they’re part of the external interface of the persistent class, the public interface used by the application logic.

The basic procedure for adding a child Category to a parent Category looks like this:

    Category aParent = new Category();
    Category aChild = new Category();
    aChild.setParentCategory(aParent);
    aParent.getChildCategories().add(aChild);

Whenever an association is created between a parent Category and a child Category, two actions are required:

■ The parentCategory of the child must be set, effectively breaking the association between the child and its old parent (there can be only one parent for any child).

■ The child must be added to the childCategories collection of the new parent Category.

MANAGED RELATIONSHIPS IN HIBERNATE

Hibernate doesn’t “manage” persistent associations. If you want to manipulate an association, you must write exactly the same code you would write without Hibernate. If an association is bidirectional, both sides of the relationship must be considered. Programming models like EJB entity beans muddle this behavior by introducing container-managed relationships. The container automatically changes the other side of a relationship if one side is modified by the application. This is one of the reasons why code that uses entity beans can’t be reused outside the container.

If you ever have problems understanding the behavior of associations in Hibernate, just ask yourself, “What would I do without Hibernate?” Hibernate doesn’t change the usual Java semantics.

It’s a good idea to add a convenience method to the Category class that groups these operations, allowing reuse and helping ensure correctness:

    public void addChildCategory(Category childCategory) {
        if (childCategory == null)
            throw new IllegalArgumentException("Null child category!");
        if (childCategory.getParentCategory() != null)
            childCategory.getParentCategory().getChildCategories()
                         .remove(childCategory);
        childCategory.setParentCategory(this);
        childCategories.add(childCategory);
    }
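To see the effect of such a convenience method in isolation, here is a small self-contained sketch. DemoCategory is a hypothetical, trimmed-down stand-in for the Category class, and it uses generics for brevity, which the book's pre-JDK 1.5 code does not. It moves a child from one parent to another and prints both sides of the association.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for Category, reduced to the association code.
class DemoCategory {
    private final String name;
    private DemoCategory parentCategory;
    private final Set<DemoCategory> childCategories = new HashSet<>();

    DemoCategory(String name) {
        this.name = name;
    }

    String getName() { return name; }
    DemoCategory getParentCategory() { return parentCategory; }
    Set<DemoCategory> getChildCategories() { return childCategories; }

    // Groups both required actions: detach the child from its old
    // parent (if any), then link it to this category on both sides.
    void addChildCategory(DemoCategory child) {
        if (child == null)
            throw new IllegalArgumentException("Null child category!");
        if (child.parentCategory != null)
            child.parentCategory.childCategories.remove(child);
        child.parentCategory = this;
        childCategories.add(child);
    }
}

class AddChildCategoryDemo {
    public static void main(String[] args) {
        DemoCategory computers = new DemoCategory("Computers");
        DemoCategory electronics = new DemoCategory("Electronics");
        DemoCategory laptops = new DemoCategory("Laptops");

        computers.addChildCategory(laptops);
        electronics.addChildCategory(laptops); // re-parents the child

        System.out.println(computers.getChildCategories().size());   // 0
        System.out.println(electronics.getChildCategories().size()); // 1
        System.out.println(laptops.getParentCategory().getName());   // Electronics
    }
}
```

Because the method removes the child from its previous parent before linking it to the new one, the child never appears in two childCategories collections at once.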

The addChildCategory() method not only reduces the lines of code when dealing with Category objects, but also enforces the cardinality of the association. Errors that arise from leaving out one of the two required actions are avoided. This kind of grouping of operations should always be provided for associations, if possible.

Because we would like addChildCategory() to be the only externally visible mutator method for the child categories, we make the setChildCategories() method private. Hibernate doesn’t care if property accessor methods are private or public, so we can focus on good API design.

A different kind of relationship exists between Category and Item: a bidirectional many-to-many association (see figure 3.4). In the case of a many-to-many association, both sides are implemented with collection-valued attributes. Let’s add the new attributes and methods to access the Item class to our Category class, as shown in listing 3.2.

Figure 3.4 Category and the associated Item

Listing 3.2 Category to Item scaffolding code

    public class Category {
        ...
        private Set items = new HashSet();
        ...
        public Set getItems() {
            return items;
        }
        public void setItems(Set items) {
            this.items = items;
        }
    }

The code for the Item class (the other end of the many-to-many association) is similar to the code for the Category class. We add the collection attribute, the standard accessor methods, and a method that simplifies relationship management (you can also add this to the Category class; see listing 3.3).

Listing 3.3 Item to Category scaffolding code

    public class Item {
        private String name;
        private String description;
        ...
        private Set categories = new HashSet();
        ...
        public Set getCategories() {
            return categories;
        }
        private void setCategories(Set categories) {
            this.categories = categories;
        }
        public void addCategory(Category category) {
            if (category == null)
                throw new IllegalArgumentException("Null category");
            category.getItems().add(this);
            categories.add(category);
        }
    }


The addCategory() method of the Item class is similar to the addChildCategory() convenience method of the Category class. It’s used by a client to manipulate the relationship between an Item and a Category. For the sake of readability, we won’t show convenience methods in future code samples and assume you’ll add them according to your own taste.

Convenience methods for association handling are, however, not the only way to improve a domain model implementation. You can also add logic to your accessor methods.

3.2.5 Adding logic to accessor methods

One of the reasons we like to use JavaBeans-style accessor methods is that they provide encapsulation: The hidden internal implementation of a property can be changed without any changes to the public interface. This allows you to abstract the internal data structure of a class (the instance variables) from the design of the database.

For example, if your database stores the name of the user as a single NAME column, but your User class has firstname and lastname properties, you can add the following persistent name property to your class:

    public class User {
        private String firstname;
        private String lastname;
        ...
        public String getName() {
            return firstname + ' ' + lastname;
        }
        public void setName(String name) {
            StringTokenizer t = new StringTokenizer(name);
            firstname = t.nextToken();
            lastname = t.nextToken();
        }
        ...
    }
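The following runnable sketch exercises the same splitting logic outside Hibernate. NamedUser is a hypothetical class name chosen so that it doesn't clash with the User class above; the getFirstname() and getLastname() accessors are added only so the result can be inspected.

```java
import java.util.StringTokenizer;

// Hypothetical class that keeps two fields but exposes one "name" property.
class NamedUser {
    private String firstname;
    private String lastname;

    String getFirstname() { return firstname; }
    String getLastname() { return lastname; }

    // Joins the two internal fields into the single persistent value.
    String getName() {
        return firstname + ' ' + lastname;
    }

    // Splits the single persistent value on whitespace.
    void setName(String name) {
        StringTokenizer t = new StringTokenizer(name);
        firstname = t.nextToken();
        lastname = t.nextToken();
    }
}

class NameDemo {
    public static void main(String[] args) {
        NamedUser user = new NamedUser();
        user.setName("John Doe");
        System.out.println(user.getFirstname()); // John
        System.out.println(user.getLastname());  // Doe
        System.out.println(user.getName());      // John Doe
    }
}
```

Note that this simple tokenizing approach has limits: a value with fewer than two whitespace-separated tokens makes nextToken() throw NoSuchElementException, and any tokens after the second are silently dropped.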

Later, you’ll see that a Hibernate custom type is probably a better way to handle many of these kinds of situations. However, it helps to have several options.

Accessor methods can also perform validation. For instance, in the following example, the setFirstname() method verifies that the name is capitalized:

    public class User {
        private String firstname;
        ...
        public String getFirstname() {
            return firstname;
        }
        public void setFirstname(String firstname)
                throws InvalidNameException {
            if ( !StringUtil.isCapitalizedName(firstname) )
                throw new InvalidNameException(firstname);
            this.firstname = firstname;
        }
        ...
    }

However, Hibernate will later use our accessor methods to populate the state of an object when loading the object from the database. Sometimes we would prefer that this validation not occur when Hibernate is initializing a newly loaded object. In that case, it might make sense to tell Hibernate to directly access the instance variables (we map the property with access="field" in Hibernate metadata), forcing Hibernate to bypass the setter method and access the instance variable directly.

Another issue to consider is dirty checking. Hibernate automatically detects object state changes in order to synchronize the updated state with the database. It’s usually completely safe to return a different object from the getter method than the object passed by Hibernate to the setter. Hibernate will compare the objects by value, not by object identity, to determine if the property’s persistent state needs to be updated. For example, the following getter method won’t result in unnecessary SQL UPDATEs:

    public String getFirstname() {
        return new String(firstname);
    }
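To make the direct field access mentioned above more concrete, the sketch below writes a private instance variable through the Java reflection API, so that a validating setter is never called. ValidatedUser and the setField() helper are hypothetical names for this illustration, and the capitalization check is a simplified stand-in for StringUtil.isCapitalizedName(); none of this is Hibernate's own implementation.

```java
import java.lang.reflect.Field;

// Hypothetical class whose setter validates its argument.
class ValidatedUser {
    private String firstname;

    String getFirstname() { return firstname; }

    void setFirstname(String firstname) {
        // Simplified capitalization check (an assumption for this sketch).
        if (firstname == null || firstname.isEmpty()
                || !Character.isUpperCase(firstname.charAt(0)))
            throw new IllegalArgumentException("Not capitalized: " + firstname);
        this.firstname = firstname;
    }
}

class FieldAccessDemo {
    // Writes a private field directly, bypassing any setter method.
    static void setField(Object target, String fieldName, Object value)
            throws Exception {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);
        field.set(target, value);
    }

    public static void main(String[] args) throws Exception {
        ValidatedUser user = new ValidatedUser();
        try {
            user.setFirstname("john"); // rejected by the validating setter
        } catch (IllegalArgumentException expected) {
            System.out.println("setter rejected value");
        }
        setField(user, "firstname", "john"); // no validation takes place
        System.out.println(user.getFirstname());
    }
}
```

The trade-off is visible here: field access lets stored state be restored as is, but any invariant enforced only in the setter is no longer checked on that path.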

However, there is one very important exception. Collections are compared by identity! For a property mapped as a persistent collection, you should return exactly the same collection instance from the getter method as Hibernate passed to the setter method. If you don’t, Hibernate will update the database, even if no update is necessary, every time the session synchronizes state held in memory with the database. This kind of code should almost always be avoided in accessor methods:

    public void setNames(List namesList) {
        names = (String[]) namesList.toArray(new String[0]);
    }
    public List getNames() {
        return Arrays.asList(names);
    }
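The difference between value comparison and identity comparison can be checked with plain Java, without Hibernate. In this sketch, NamesHolder is a hypothetical class whose getters follow the two patterns discussed above: a String getter that returns a copy, and a collection getter that builds a new List on every call.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class whose getters create new objects on every call.
class NamesHolder {
    private String firstname = "John";
    private String[] names = { "John", "Johnny" };

    // Returns a copy: equal by value, but a different object each time.
    String getFirstname() {
        return new String(firstname);
    }

    // Returns a new List wrapper around the array on every call.
    List<String> getNames() {
        return Arrays.asList(names);
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        NamesHolder holder = new NamesHolder();

        // Value comparison: the copies are equal, so a by-value
        // dirty check sees no change.
        System.out.println(holder.getFirstname().equals(holder.getFirstname())); // true

        // Identity comparison: two calls yield two distinct instances,
        // so a by-identity check always sees a "new" collection.
        System.out.println(holder.getNames() == holder.getNames());      // false
        System.out.println(holder.getNames().equals(holder.getNames())); // true
    }
}
```

The last two lines show why such a getter is a problem for collections: the contents are equal, but an identity check cannot see that.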


You can see that Hibernate doesn’t unnecessarily restrict the JavaBeans (POJO) programming model. You’re free to implement whatever logic you need in accessor methods (as long as you keep the same collection instance in both getter and setter). If absolutely necessary, you can tell Hibernate to use a different access strategy to read and set the state of a property (for example, direct instance field access), as you’ll see later. This kind of transparency guarantees an independent and reusable domain model implementation.

Now that we’ve implemented some persistent classes of our domain model, we need to define the ORM.

3.3 Defining the mapping metadata

ORM tools require a metadata format for the application to specify the mapping between classes and tables, properties and columns, associations and foreign keys, Java types and SQL types. This information is called the object/relational mapping metadata. It defines the transformation between the different data type systems and relationship representations.

It’s our job as developers to define and maintain this metadata. We discuss various approaches in this section.

3.3.1 Metadata in XML

Any ORM solution should provide a human-readable, easily hand-editable mapping format, not only a GUI mapping tool. Currently, the most popular object/relational metadata format is XML. Mapping documents written in and with XML are lightweight, are human readable, are easily manipulated by version-control systems and text editors, and may be customized at deployment time (or even at runtime, with programmatic XML generation).

But is XML-based metadata really the best approach? A certain backlash against the overuse of XML can be seen in the Java community. Every framework and application server seems to require its own XML descriptors. In our view, there are three main reasons for this backlash:

■ Many existing metadata formats weren’t designed to be readable and easy to edit by hand. In particular, a major cause of pain is the lack of sensible defaults for attribute and element values, requiring significantly more typing than should be necessary.

■ Metadata-based solutions were often used inappropriately. Metadata is not, by nature, more flexible or maintainable than plain Java code.

■ Good XML editors, especially in IDEs, aren’t as common as good Java coding environments. Worst, and most easily fixable, a document type declaration (DTD) often isn’t provided, preventing auto-completion and validation. Another problem is DTDs that are too generic, where every declaration is wrapped in a generic “extension” or “meta” element.

There is no getting around the need for text-based metadata in ORM. However, Hibernate was designed with full awareness of the typical metadata problems. The metadata format is extremely readable and defines useful default values. When attribute values are missing, Hibernate uses reflection on the mapped class to help determine the defaults. Hibernate comes with a documented and complete DTD. Finally, IDE support for XML has improved lately, and modern IDEs provide dynamic XML validation and even an auto-complete feature. If that’s not enough for you, in chapter 9 we demonstrate some tools that may be used to generate Hibernate XML mappings.

Let’s look at the way you can use XML metadata in Hibernate. We created the Category class in the previous section; now we need to map it to the CATEGORY table in the database. To do that, we use the XML mapping document in listing 3.4.

Listing 3.4 Hibernate XML mapping of the Category class

[Listing callouts: B; C Mapping declaration; D Category class mapped; E Identifier; F Name property mapped to NAME column]

B  The Hibernate mapping DTD should be declared in every mapping file; it’s required for syntactic validation of the XML.

C  Mappings are declared inside a <hibernate-mapping> element. You can include as many class mappings as you like, along with certain other special declarations that we’ll mention later in the book.

D  The class Category (in the package org.hibernate.auction.model) is mapped to the table CATEGORY. Every row in this table represents one instance of type Category.

E  We haven’t discussed the concept of object identity, so you may be surprised by this mapping element. This complex topic is covered in section 3.4. To understand this mapping, it’s sufficient to know that every record in the CATEGORY table will have a primary key value that matches the object identity of the instance in memory. The <id> mapping element is used to define the details of object identity.

F  The property name of type String is mapped to a database column NAME. Note that the type declared in the mapping is a built-in Hibernate type (string), not the type of the Java property or the SQL column type. Think about this as the “mapping data type.” We take a closer look at these types in chapter 6, section 6.1, “Understanding the Hibernate type system.”

We’ve intentionally left the association mappings out of this example. Association mappings are more complex, so we’ll return to them in section 3.7.

TRY IT  Starting Hibernate with your first persistent class: After you’ve written the POJO code for the Category and saved its Hibernate mapping to an XML file, you can start up Hibernate with this mapping and try some operations. However, the POJO code for Category shown earlier wasn’t complete: You have to add an additional property named id of type java.lang.Long and its accessor methods to enable Hibernate identity management, as discussed later in this chapter. Creating the database schema with its tables for such a simple class should be no problem for you. Observe the log of your application to check for a successful startup and creation of a new SessionFactory from the Configuration shown in chapter 2. If you can’t wait any longer, check out the save(), load(), and delete() methods of the Session you can obtain from the SessionFactory. Make sure you correctly deal with transactions; the easiest way is to get a new Transaction object with Session.beginTransaction() and commit it with its commit() method after you’ve made your calls. See the code in section 2.1, “Hello World with Hibernate,” if you’d like to copy some example code for your first test.


Although it’s possible to declare mappings for multiple classes in one mapping file by using multiple <class> elements, the recommended practice (and the practice expected by some Hibernate tools) is to use one mapping file per persistent class. The convention is to give the file the same name as the mapped class, appending an hbm suffix: for example, Category.hbm.xml.

Let’s discuss basic class and property mappings in Hibernate. Keep in mind that we still need to come back later in this chapter to the problem of mapping associations between persistent classes.

3.3.2 Basic property and class mappings

A typical Hibernate property mapping defines a JavaBeans property name, a database column name, and the name of a Hibernate type. It maps a JavaBean-style property to a table column. The basic declaration provides many variations and optional settings. It’s often possible to omit the type name. So, if description is a property of (Java) type java.lang.String, Hibernate will use the Hibernate type string by default (we discuss the Hibernate type system in chapter 6). Hibernate uses reflection to determine the Java type of the property. Thus, the following mappings are equivalent:

You can even omit the column name if it’s the same as the property name, ignoring case. (This is one of the sensible defaults we mentioned earlier.) For some cases you might need to use a <column> element instead of the column attribute. The <column> element provides more flexibility; it has more optional attributes and may appear more than once. The following two property mappings are equivalent:



The <property> element (and especially the <column> element) also defines certain attributes that apply mainly to automatic database schema generation. If you aren’t using the hbm2ddl tool (see section 9.2, “Automatic schema generation”) to generate the database schema, you can safely omit these. However, it’s still preferable to include at least the not-null attribute, since Hibernate will then be able to report illegal null property values without going to the database:

Detection of illegal null values is mainly useful for providing sensible exceptions at development time. It isn’t intended for true data validation, which is outside the scope of Hibernate.

Some properties don’t map to a column at all. In particular, a derived property takes its value from an SQL expression.

Using derived properties

The value of a derived property is calculated at runtime by evaluation of an expression. You define the expression using the formula attribute. For example, we might map a totalIncludingTax property without having a single column with the total price in the database:

The given SQL formula is evaluated every time the entity is retrieved from the database. The property doesn’t have a column attribute (or sub-element) and never appears in an SQL INSERT or UPDATE, only in SELECTs. Formulas may refer to columns of the database table, call SQL functions, and include SQL subselects. This example, mapping a derived property of item, uses a correlated subselect to calculate the average amount of all bids for an item:

Notice that unqualified column names refer to table columns of the class to which the derived property belongs.

As we mentioned earlier, Hibernate doesn’t require property accessor methods on POJO classes, if you define a new property access strategy.

Property access strategies

The access attribute allows you to specify how Hibernate should access property values of the POJO. The default strategy, property, uses the property accessors (get/set method pair). The field strategy uses reflection to access the instance variable directly. The following “property” mapping doesn’t require a get/set pair:

Access to properties via accessor methods is considered best practice by the Hibernate community. It provides an extra level of abstraction between the Java domain model and the data model, beyond what is already provided by Hibernate. Properties are more flexible; for example, property definitions may be overridden by persistent subclasses.

If neither accessor methods nor direct instance variable access is appropriate, you can define your own customized property access strategy by implementing the interface net.sf.hibernate.property.PropertyAccessor and naming it in the access attribute.

Controlling insertion and updates

For properties that map to columns, you can control whether they appear in the INSERT statement by using the insert attribute and whether they appear in the UPDATE statement by using the update attribute.

The following property never has its state written to the database:

The property name of the JavaBean is therefore immutable and can be read from the database but not modified in any way. If the complete class is immutable, set mutable="false" in the class mapping.

In addition, the dynamic-insert attribute tells Hibernate whether to include unmodified property values in an SQL INSERT, and the dynamic-update attribute tells Hibernate whether to include unmodified properties in the SQL UPDATE:

...

These are both class-level settings. Enabling either of these settings will cause Hibernate to generate some SQL at runtime, instead of using the SQL cached at startup time. The performance cost is usually small. Furthermore, leaving out columns in an insert (and especially in an update) can occasionally improve performance if your tables define many columns.


Using quoted SQL identifiers

By default, Hibernate doesn’t quote table and column names in the generated SQL. This makes the SQL slightly more readable and also allows us to take advantage of the fact that most SQL databases are case insensitive when comparing unquoted identifiers. From time to time, especially in legacy databases, you’ll encounter identifiers with strange characters or whitespace, or you may wish to force case-sensitivity.

If you quote a table or column name with backticks in the mapping document, Hibernate will always quote this identifier in the generated SQL. The following property declaration forces Hibernate to generate SQL with the quoted column name "Item Description". Hibernate will also know that Microsoft SQL Server needs the variation [Item Description] and that MySQL requires `Item Description`.
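The dialect-dependent quoting described here can be pictured with a small standalone sketch. The QuotingSketch class, its Dialect enum, and the toSql() method are hypothetical names invented for this illustration; they show the idea of translating a backtick-quoted mapping name into a dialect's quoting style and are not Hibernate's actual Dialect API.

```java
// Hypothetical illustration of dialect-specific identifier quoting.
class QuotingSketch {
    enum Dialect { ANSI, SQL_SERVER, MYSQL }

    // Translates a name as written in a mapping document into the form
    // used in generated SQL. Names without backticks pass through unquoted.
    static String toSql(String mappedName, Dialect dialect) {
        boolean quoted = mappedName.length() > 1
                && mappedName.startsWith("`")
                && mappedName.endsWith("`");
        if (!quoted)
            return mappedName;
        String bare = mappedName.substring(1, mappedName.length() - 1);
        switch (dialect) {
            case SQL_SERVER: return "[" + bare + "]";
            case MYSQL:      return "`" + bare + "`";
            default:         return "\"" + bare + "\""; // double quotes
        }
    }

    public static void main(String[] args) {
        String mapped = "`Item Description`";
        System.out.println(toSql(mapped, Dialect.ANSI));       // "Item Description"
        System.out.println(toSql(mapped, Dialect.SQL_SERVER)); // [Item Description]
        System.out.println(toSql(mapped, Dialect.MYSQL));      // `Item Description`
        System.out.println(toSql("DESCRIPTION", Dialect.ANSI)); // DESCRIPTION
    }
}
```

The point of the design is that the mapping document stays database-independent: the backticks only mark the identifier as quoted, and the choice of quote characters is left to the dialect.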

There is no way, apart from quoting all table and column names in backticks, to force Hibernate to use quoted identifiers everywhere.

Naming conventions

You’ll often encounter organizations with strict conventions for database table and column names. Hibernate provides a feature that allows you to enforce naming standards automatically.

Suppose that all table names in CaveatEmptor should follow the pattern CE_<table name>. One solution is to manually specify a table attribute on all <class> and collection elements in our mapping files. This approach is time-consuming and easily forgotten. Instead, we can implement Hibernate’s NamingStrategy interface, as in listing 3.5.

Listing 3.5 NamingStrategy implementation

    public class CENamingStrategy implements NamingStrategy {
        public String classToTableName(String className) {
            return tableName(
                StringHelper.unqualify(className).toUpperCase() );
        }
        public String propertyToColumnName(String propertyName) {
            return propertyName.toUpperCase();
        }
        public String tableName(String tableName) {
            return "CE_" + tableName;
        }
        public String columnName(String columnName) {
            return columnName;
        }
        public String propertyToTableName(String className,
                                          String propertyName) {
            return classToTableName(className) + '_' +
                   propertyToColumnName(propertyName);
        }
    }

The classToTableName() method is called only if a <class> mapping doesn’t specify an explicit table name. The propertyToColumnName() method is called if a property has no explicit column name. The tableName() and columnName() methods are called when an explicit name is declared.

If we enable our CENamingStrategy, this class mapping declaration

will result in CE_BANKACCOUNT as the name of the table. The classToTableName() method was called with the fully qualified class name as the argument. However, if a table name is specified

then CE_BANK_ACCOUNT will be the name of the table. In this case, BANK_ACCOUNT was passed to the tableName() method.

The best feature of the NamingStrategy is the potential for dynamic behavior. To activate a specific naming strategy, we can pass an instance to the Hibernate Configuration at runtime:

    Configuration cfg = new Configuration();
    cfg.setNamingStrategy( new CENamingStrategy() );
    SessionFactory sessionFactory =
        cfg.configure().buildSessionFactory();

This will allow us to have multiple SessionFactory instances based on the same mapping documents, each using a different NamingStrategy. This is extremely useful in a multiclient installation where unique table names (but the same data model) are required for each client.
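The naming rules of listing 3.5 can be tried out without a Hibernate installation. The sketch below copies the string logic into a standalone class; CENamingSketch is a hypothetical name, it deliberately does not implement the NamingStrategy interface, and its unqualify() helper is a local stand-in for StringHelper.unqualify(). The example class name org.hibernate.auction.model.BankAccount is assumed for illustration.

```java
import java.util.Locale;

// Standalone copy of the naming rules, independent of Hibernate.
class CENamingSketch {
    // Local stand-in for StringHelper.unqualify(): strips the package prefix.
    static String unqualify(String qualifiedName) {
        return qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
    }

    // Used when a class mapping declares no explicit table name.
    String classToTableName(String className) {
        return tableName(unqualify(className).toUpperCase(Locale.ROOT));
    }

    // Used when a property declares no explicit column name.
    String propertyToColumnName(String propertyName) {
        return propertyName.toUpperCase(Locale.ROOT);
    }

    // Used when an explicit table name is declared.
    String tableName(String tableName) {
        return "CE_" + tableName;
    }

    // Used when an explicit column name is declared.
    String columnName(String columnName) {
        return columnName;
    }

    String propertyToTableName(String className, String propertyName) {
        return classToTableName(className) + '_'
                + propertyToColumnName(propertyName);
    }

    public static void main(String[] args) {
        CENamingSketch naming = new CENamingSketch();
        System.out.println(
            naming.classToTableName("org.hibernate.auction.model.BankAccount")); // CE_BANKACCOUNT
        System.out.println(naming.tableName("BANK_ACCOUNT"));                    // CE_BANK_ACCOUNT
        System.out.println(naming.propertyToColumnName("description"));          // DESCRIPTION
    }
}
```

The two printed table names correspond to the two cases discussed above: a class mapping without an explicit table name, and one that declares BANK_ACCOUNT explicitly.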


However, a better way to handle this kind of requirement is to use the concept of an SQL schema (a kind of namespace).

SQL schemas

You can specify a default schema using the hibernate.default_schema configuration option. Alternatively, you can specify a schema in the mapping document. A schema may be specified for a particular class or collection mapping:

...

It can even be declared for the whole document:

..

This isn’t the only thing the root <hibernate-mapping> element is useful for.

Declaring class names

All the persistent classes of the CaveatEmptor application are declared in the Java package org.hibernate.auction.model. It would become tedious to specify this package name every time we named a class in our mapping documents. Let’s reconsider our mapping for the Category class (the file Category.hbm.xml):



...


We don’t want to repeat the full package name whenever this or any other class is named in an association, subclass, or component mapping. So, instead, we’ll specify a package:



...

Now all unqualified class names that appear in this mapping document will be prefixed with the declared package name. We assume this setting in all mapping examples in this book.

If writing XML files by hand (using the DTD for auto-completion, of course) still seems like too much work, attribute-oriented programming might be a good choice. Hibernate mapping files can be automatically generated from attributes directly embedded in the Java source code.

3.3.3 Attribute-oriented programming

The innovative XDoclet project has brought the notion of attribute-oriented programming to Java. Until JDK 1.5, the Java language had no support for annotations; so XDoclet leverages the Javadoc tag format (@attribute) to specify class-, field-, or method-level metadata attributes. (There is a book about XDoclet from Manning Publications: XDoclet in Action [Walls/Richards, 2004].)

XDoclet is implemented as an Ant task that generates code or XML metadata as part of the build process. Creating the Hibernate XML mapping document with XDoclet is straightforward; instead of writing it by hand, we mark up the Java source code of our persistent class with custom Javadoc tags, as shown in listing 3.6.

Listing 3.6 Using XDoclet tags to mark up Java properties with mapping metadata

    /**
     * The Category class of the CaveatEmptor auction site domain model.
     *
     * @hibernate.class
     *  table="CATEGORY"
     */
    public class Category {
        ...
        /**
         * @hibernate.id
         *  generator-class="native"
         *  column="CATEGORY_ID"
         */
        public Long getId() {
            return id;
        }
        ...
        /**
         * @hibernate.property
         */
        public String getName() {
            return name;
        }
        ...
    }

With the annotated class in place and an Ant task ready, we can automatically generate the same XML document shown in the previous section (listing 3.4).

The downside to XDoclet is the requirement for another build step. Most large Java projects are using Ant already, so this is usually a non-issue. Arguably, XDoclet mappings are less configurable at deployment time. However, nothing is stopping you from hand-editing the generated XML before deployment, so this probably isn’t a significant objection. Finally, support for XDoclet tag validation may not be available in your development environment. However, JetBrains IntelliJ IDEA and Eclipse both support at least auto-completion of tag names. (We look at the use of XDoclet with Hibernate in chapter 9, section 9.5, “XDoclet.”)

NOTE  XDoclet isn’t a standard approach to attribute-oriented metadata. A new Java specification, JSR 175, defines annotations as extensions to the Java language. JSR 175 is already implemented in JDK 1.5, so projects like XDoclet and Hibernate will probably provide support for JSR 175 annotations in the near future.

Both of the approaches we have described so far, XML and XDoclet attributes, assume that all mapping information is known at deployment time. Suppose that some information isn’t known before the application starts. Can you programmatically manipulate the mapping metadata at runtime?

3.3.4 Manipulating metadata at runtime

It’s sometimes useful for an application to browse, manipulate, or build new mappings at runtime. XML APIs like DOM, dom4j, and JDOM allow direct runtime manipulation of XML documents. So, you could create or manipulate an XML document at runtime, before feeding it to the Configuration object.

However, Hibernate also exposes a configuration-time metamodel. The metamodel contains all the information declared in your XML mapping documents. Direct programmatic manipulation of this metamodel is sometimes useful, especially for applications that allow for extension by user-written code.

For example, the following code adds a new property, motto, to the User class mapping:

    // Get the existing mapping for User from Configuration
    PersistentClass userMapping = cfg.getClassMapping(User.class);

    // Define a new column for the USER table
    Column column = new Column();
    column.setType(Hibernate.STRING);
    column.setName("MOTTO");
    column.setNullable(false);
    column.setUnique(true);
    userMapping.getTable().addColumn(column);

    // Wrap the column in a Value
    SimpleValue value = new SimpleValue();
    value.setTable( userMapping.getTable() );
    value.addColumn(column);
    value.setType(Hibernate.STRING);

    // Define a new property of the User class
    Property prop = new Property();
    prop.setValue(value);
    prop.setName("motto");
    userMapping.addProperty(prop);

    // Build a new session factory, using the new mapping
    SessionFactory sf = cfg.buildSessionFactory();

A PersistentClass object represents the metamodel for a single persistent class; we retrieve it from the Configuration. Column, SimpleValue, and Property are all classes of the Hibernate metamodel and are available in the package net.sf.hibernate.mapping. Keep in mind that adding a property to an existing persistent class mapping as shown here is easy, but programmatically creating a new mapping for a previously unmapped class is quite a bit more involved.

Once a SessionFactory is created, its mappings are immutable. In fact, the SessionFactory uses a different metamodel internally than the one used at configuration time. There is no way to get back to the original Configuration from the SessionFactory or Session. However, the application may read the SessionFactory’s metamodel by calling getClassMetadata() or getCollectionMetadata(). For example:

    Category category = ...;
    ClassMetadata meta = sessionFactory.getClassMetadata(Category.class);
    String[] metaPropertyNames = meta.getPropertyNames();
    Object[] propertyValues = meta.getPropertyValues(category);

This code snippet retrieves the names of persistent properties of the Category class and the values of those properties for a particular instance. This helps you write generic code. For example, you might use this feature to label UI components or improve log output.

Now let’s turn to a special mapping element you’ve seen in most of our previous examples: the identifier property mapping. We’ll begin by discussing the notion of object identity.
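Before moving on, here is a sketch of the kind of generic code that metadata access enables: a formatter that builds a log line for any persistent object. To stay runnable without Hibernate on the classpath, it’s written against a hypothetical stand-in interface, PropertyReader, that mirrors only the two ClassMetadata methods used above; EntityLogFormatter is likewise an invented name, not part of Hibernate.

```java
// Hypothetical stand-in that mirrors the two ClassMetadata methods shown above.
interface PropertyReader {
    String[] getPropertyNames();
    Object[] getPropertyValues(Object entity);
}

// Formats any persistent object as "Label[name=value, ...]" for log output.
final class EntityLogFormatter {

    static String describe(String label, PropertyReader meta, Object entity) {
        String[] names = meta.getPropertyNames();
        Object[] values = meta.getPropertyValues(entity);
        StringBuilder sb = new StringBuilder(label).append('[');
        for (int i = 0; i < names.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(names[i]).append('=').append(values[i]);
        }
        return sb.append(']').toString();
    }
}
```

In an application, an adapter around the object returned by getClassMetadata() could play the role of PropertyReader, so the formatter never needs to know the concrete persistent class.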

3.4 Understanding object identity

It’s vital to understand the difference between object identity and object equality before we discuss terms like database identity and how Hibernate manages identity. We need these concepts if we want to finish mapping our CaveatEmptor persistent classes and their associations with Hibernate.

3.4.1 Identity versus equality

Java developers understand the difference between Java object identity and equality. Object identity, ==, is a notion defined by the Java virtual machine. Two object references are identical if they point to the same memory location.

On the other hand, object equality is a notion defined by classes that implement the equals() method, sometimes also referred to as equivalence. Equivalence means that two different (nonidentical) objects have the same value. Two different instances of String are equal if they represent the same sequence of characters, even though they each have their own location in the memory space of the virtual machine. (We admit that this is not entirely true for Strings, but you get the idea.)

Persistence complicates this picture. With object/relational persistence, a persistent object is an in-memory representation of a particular row of a database table. So, along with Java identity (memory location) and object equality, we pick up database identity (location in the persistent data store). We now have three methods for identifying objects:




■ Object identity: Objects are identical if they occupy the same memory location in the JVM. This can be checked by using the == operator.

■ Object equality: Objects are equal if they have the same value, as defined by the equals(Object o) method. Classes that don’t explicitly override this method inherit the implementation defined by java.lang.Object, which compares object identity.

■ Database identity: Objects stored in a relational database are identical if they represent the same row or, equivalently, share the same table and primary key value.
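The first two notions can be tried out in plain Java. The following is a minimal sketch; the class name IdentityVersusEquality and its helper methods are ours, chosen for illustration. It also shows why the String example is only approximately true: we assume the caveat refers to string literals, which the JVM interns so that equal literals share one instance.

```java
// Demonstrates Java object identity (==) versus object equality (equals()).
final class IdentityVersusEquality {

    // True if both references point to the same object in the JVM.
    static boolean identical(Object a, Object b) {
        return a == b;
    }

    // True if the objects have the same value, as defined by equals().
    static boolean equal(Object a, Object b) {
        return a == null ? b == null : a.equals(b);
    }

    public static void main(String[] args) {
        String a = new String("hibernate");
        String b = new String("hibernate");
        System.out.println(identical(a, b)); // false: two distinct instances
        System.out.println(equal(a, b));     // true: same character sequence

        // String literals are interned, so both references share one instance.
        String c = "hibernate";
        String d = "hibernate";
        System.out.println(identical(c, d)); // true
    }
}
```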

You need to understand how database identity relates to object identity in Hibernate.

3.4.2 Database identity with Hibernate

Hibernate exposes database identity to the application in two ways:

■ The value of the identifier property of a persistent instance

■ The value returned by Session.getIdentifier(Object o)

The identifier property is special: Its value is the primary key value of the database row represented by the persistent instance. We don’t usually show the identifier property in our domain model; it’s a persistence-related concern, not part of our business problem. In our examples, the identifier property is always named id. So if myCategory is an instance of Category, calling myCategory.getId() returns the primary key value of the row represented by myCategory in the database.

Should you make the accessor methods for the identifier property private scope or public? Well, database identifiers are often used by the application as a convenient handle to a particular instance, even outside the persistence layer. For example, web applications often display the results of a search screen to the user as a list of summary information. When the user selects a particular element, the application might need to retrieve the selected object. It’s common to use a lookup by identifier for this purpose; you’ve probably already used identifiers this way, even in applications using direct JDBC. It’s therefore usually appropriate to fully expose the database identity with a public identifier property accessor.

On the other hand, we usually declare the setId() method private and let Hibernate generate and set the identifier value. The exceptions to this rule are classes with natural keys, where the value of the identifier is assigned by the application before the object is made persistent, instead of being generated by Hibernate. (We discuss natural keys in the next section.) Hibernate doesn’t allow you to change the identifier value of a persistent instance after it’s first assigned.


Remember, part of the definition of a primary key is that its value should never change. Let’s implement an identifier property for the Category class:

    public class Category {
        private Long id;
        ...
        public Long getId() {
            return this.id;
        }
        private void setId(Long id) {
            this.id = id;
        }
        ...
    }
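One practical consequence of a public getId() accessor is that two in-memory instances can be checked for representing the same database row. The sketch below is illustrative only: CategorySketch is a simplified stand-in for Category, its constructor argument plays the role of Hibernate assigning the identifier, and sameDatabaseIdentity() is a hypothetical helper, not a Hibernate method.

```java
// Simplified stand-in for the Category class; the constructor argument
// plays the role of Hibernate assigning the identifier.
final class CategorySketch {
    private final Long id;
    private final String name;

    CategorySketch(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }

    // Hypothetical helper: true if both instances represent the same row.
    // Instances without an identifier (not yet persistent) never match.
    static boolean sameDatabaseIdentity(CategorySketch a, CategorySketch b) {
        return a.getId() != null && a.getId().equals(b.getId());
    }
}
```

Two distinct Java objects holding the identifier 42 are different by ==, yet they share database identity; two instances with a null identifier don’t, because neither corresponds to a row yet.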

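The property type depends on the primary key type of the CATEGORY table and the Hibernate mapping type. This information is determined by the <id> element in the mapping document:

    <class name="Category" table="CATEGORY">
        <id name="id" column="CATEGORY_ID" type="long">
            <generator class="native"/>
        </id>
        ...
    </class>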

The identifier property is mapped to the primary key column CATEGORY_ID of the table CATEGORY. The Hibernate type for this property is long, which maps to a BIGINT column type in most databases and which has also been chosen to match the type of the identity value produced by the native identifier generator. (We discuss identifier generation strategies in the next section.)

So, in addition to operations for testing Java object identity (a == b) and object equality ( a.equals(b) ), you may now use a.getId().equals( b.getId() ) to test database identity.

An alternative approach to handling database identity is to not implement any identifier property, and let Hibernate manage database identity internally. In this case, you omit the name attribute in the mapping declaration:



Hibernate will now manage the identifier values internally. You may obtain the identifier value of a persistent instance as follows:

    Long catId = (Long) session.getIdentifier(category);


This technique has a serious drawback: You can no longer use Hibernate to manipulate detached objects effectively (see chapter 4, section 4.1.6, “Outside the identity scope”). So, you should always use identifier properties in Hibernate. (If you don’t like them being visible to the rest of your application, make the accessor methods private.)

Using database identifiers in Hibernate is easy and straightforward. Choosing a good primary key (and key generation strategy) might be more difficult. We discuss these issues next.

3.4.3 Choosing primary keys

You have to tell Hibernate about your preferred primary key generation strategy. But first, let’s define primary key.

The candidate key is a column or set of columns that uniquely identifies a specific row of the table. A candidate key must satisfy the following properties:

■ The value or values are never null.

■ Each row has a unique value or values.

■ The value or values of a particular row never change.

For a given table, several columns or combinations of columns might satisfy these properties. If a table has only one identifying attribute, it is by definition the primary key. If there are multiple candidate keys, you need to choose between them (candidate keys not chosen as the primary key should be declared as unique keys in the database). If there are no unique columns or unique combinations of columns, and hence no candidate keys, then the table is by definition not a relation as defined by the relational model (it permits duplicate rows), and you should rethink your data model.

Many legacy SQL data models use natural primary keys. A natural key is a key with business meaning: an attribute or combination of attributes that is unique by virtue of its business semantics. Examples of natural keys might be a U.S. Social Security Number or Australian Tax File Number. Distinguishing natural keys is simple: If a candidate key attribute has meaning outside the database context, it’s a natural key, whether or not it’s automatically generated.

Experience has shown that natural keys almost always cause problems in the long run. A good primary key must be unique, constant, and required (never null or unknown). Very few entity attributes satisfy these requirements, and some that do aren’t efficiently indexable by SQL databases. In addition, you should make absolutely certain that a candidate key definition could never change throughout the lifetime of the database before promoting it to a primary key. Changing the definition of a primary key and all foreign keys that refer to it is a frustrating task.

For these reasons, we strongly recommend that new applications use synthetic identifiers (also called surrogate keys). Surrogate keys have no business meaning; they are unique values generated by the database or application. There are a number of well-known approaches to surrogate key generation.

Hibernate has several built-in identifier generation strategies. We list the most useful options in table 3.1.

Table 3.1 Hibernate’s built-in identifier generator modules

Generator name    Description

native            The native identity generator picks other identity generators like identity, sequence, or hilo depending on the capabilities of the underlying database.

identity          This generator supports identity columns in DB2, MySQL, MS SQL Server, Sybase, HSQLDB, Informix, and Hypersonic SQL. The returned identifier is of type long, short, or int.

sequence          A sequence in DB2, PostgreSQL, Oracle, SAP DB, McKoi, Firebird, or a generator in InterBase is used. The returned identifier is of type long, short, or int.

increment         At Hibernate startup, this generator reads the maximum primary key column value of the table and increments the value by one each time a new row is inserted. The generated identifier is of type long, short, or int. This generator is especially efficient if the single-server Hibernate application has exclusive access to the database but shouldn’t be used in any other scenario.

hilo              A high/low algorithm is an efficient way to generate identifiers of type long, short, or int, given a table and column (by default hibernate_unique_key and next_hi, respectively) as a source of hi values. The high/low algorithm generates identifiers that are unique only for a particular database. See [Ambler 2002] for more information about the high/low approach to unique identifiers.

uuid.hex          This generator uses a 128-bit UUID (an algorithm that generates identifiers of type string, unique within a network). The IP address is used in combination with a unique timestamp. The UUID is encoded as a string of hexadecimal digits of length 32. This generation strategy isn’t popular, since CHAR primary keys consume more database space than numeric keys and are marginally slower.
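To make the high/low idea from table 3.1 more concrete, here is a minimal in-memory sketch. It is not Hibernate’s implementation: the class HiLoSketch is hypothetical, and an AtomicLong stands in for the database table that supplies hi values. Each hi value reserves a block of maxLo + 1 identifiers, so the shared source is consulted only once per block.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a high/low identifier generator.
final class HiLoSketch {
    private final AtomicLong hiSource; // stands in for the next_hi column
    private final int maxLo;           // each block holds maxLo + 1 identifiers
    private long hi;
    private int lo;

    HiLoSketch(AtomicLong hiSource, int maxLo) {
        this.hiSource = hiSource;
        this.maxLo = maxLo;
        this.lo = maxLo + 1; // forces a hi fetch on first use
    }

    // Returns the next identifier, fetching a new hi value when the block is used up.
    synchronized long next() {
        if (lo > maxLo) {
            hi = hiSource.getAndIncrement(); // one trip to the shared source per block
            lo = 0;
        }
        return hi * (maxLo + 1) + lo++;
    }
}
```

Two generators that share one hi source never hand out the same value, because each works inside its own reserved block. That is the property that lets several application instances generate identifiers against the same database without a round trip per insert.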

You aren’t limited to these built-in strategies; you may create your own identifier generator by implementing Hibernate’s IdentifierGenerator interface. It’s even possible to mix identifier generators for persistent classes in a single domain model, but for non-legacy data we recommend using the same generator for all classes.

The special assigned identifier generator strategy is most useful for entities with natural primary keys. This strategy lets the application assign identifier values by setting the identifier property before making the object persistent by calling save(). This strategy has some serious disadvantages when you’re working with detached objects and transitive persistence (both of these concepts are discussed in the next chapter). Don’t use assigned identifiers if you can avoid them; it’s much easier to use a surrogate primary key generated by one of the strategies listed in table 3.1.

For legacy data, the picture is more complicated. In this case, we’re often stuck with natural keys and especially composite keys (natural keys composed of multiple table columns). Because composite identifiers can be more difficult to work with, we only discuss them in the context of chapter 8, section 8.3.1, “Legacy schemas and composite keys.”

The next step is to add identifier properties to the classes of the CaveatEmptor application. Do all persistent classes have their own database identity? To answer this question, we must explore the distinction between entities and value types in Hibernate. These concepts are required for fine-grained object modeling.

3.5 Fine-grained object models

A major objective of the Hibernate project is support for fine-grained object models, which we isolated as the most important requirement for a rich domain model. It’s one reason we’ve chosen POJOs. In crude terms, fine-grained means “more classes than tables.” For example, a user might have both a billing address and a home address. In the database, we might have a single USER table with the columns BILLING_STREET, BILLING_CITY, and BILLING_ZIPCODE along with HOME_STREET, HOME_CITY, and HOME_ZIPCODE. There are good reasons to use this somewhat denormalized relational model (performance, for one).

In our object model, we could use the same approach, representing the two addresses as six string-valued properties of the User class. But we would much rather model this using an Address class, where User has the billingAddress and homeAddress properties. This object model achieves improved cohesion and greater code reuse and is more understandable. In the past, many ORM solutions haven’t provided good support for this kind of mapping.

Hibernate emphasizes the usefulness of fine-grained classes for implementing type-safety and behavior. For example, many people would model an email address as a string-valued property of User. We suggest that a more sophisticated approach is to define an actual EmailAddress class that could add higher level semantics and behavior. For example, it might provide a sendEmail() method.

3.5.1 Entity and value types

This leads us to a distinction of central importance in ORM. In Java, all classes are of equal standing: All objects have their own identity and lifecycle, and all class instances are passed by reference. Only primitive types are passed by value.

We’re advocating a design in which there are more persistent classes than tables. One row represents multiple objects. Because database identity is implemented by primary key value, some persistent objects won’t have their own identity. In effect, the persistence mechanism implements pass-by-value semantics for some classes. One of the objects represented in the row has its own identity, and others depend on that.

Hibernate makes the following essential distinction:

■ An object of entity type has its own database identity (primary key value). An object reference to an entity is persisted as a reference in the database (a foreign key value). An entity has its own lifecycle; it may exist independently of any other entity.

■ An object of value type has no database identity; it belongs to an entity, and its persistent state is embedded in the table row of the owning entity (except in the case of collections, which are also considered value types, as you’ll see in chapter 6). Value types don’t have identifiers or identifier properties. The lifespan of a value-type instance is bounded by the lifespan of the owning entity.
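In plain Java, the distinction might look like the following sketch, which uses the User and Address classes discussed earlier. The details are our assumptions for illustration: the fields are reduced to a minimum, Address is made immutable, and its equals() compares state. None of this is required by Hibernate; the point is only that the entity carries an identifier while the value type does not.

```java
import java.util.Objects;

// Value type: no identifier property; instances are compared by their state.
final class Address {
    private final String street;
    private final String zipCode;
    private final String city;

    Address(String street, String zipCode, String city) {
        this.street = street;
        this.zipCode = zipCode;
        this.city = city;
    }

    String getStreet() { return street; }
    String getZipCode() { return zipCode; }
    String getCity() { return city; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return Objects.equals(street, other.street)
                && Objects.equals(zipCode, other.zipCode)
                && Objects.equals(city, other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(street, zipCode, city);
    }
}

// Entity type: carries its own database identity and owns its addresses.
final class User {
    private Long id; // null until the object is made persistent
    private final String username;
    private Address homeAddress;
    private Address billingAddress;

    User(String username) { this.username = username; }

    Long getId() { return id; }
    String getUsername() { return username; }
    Address getHomeAddress() { return homeAddress; }
    void setHomeAddress(Address address) { this.homeAddress = address; }
    Address getBillingAddress() { return billingAddress; }
    void setBillingAddress(Address address) { this.billingAddress = address; }
}
```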

The most obvious value types are simple objects like Strings and Integers. Hibernate also lets you treat a user-defined class as a value type, as you’ll see next. (We also come back to this important concept in chapter 6, section 6.1, “Understanding the Hibernate type system.”)

3.5.2 Using components

So far, the classes of our object model have all been entity classes with their own lifecycle and identity. The User class, however, has a special kind of association with the Address class, as shown in figure 3.5.

Figure 3.5 Relationships between User and Address using composition. [Diagram: User (firstname, lastname, username, password, email : String; ranking : int; created : Date) is composed of two Address instances (street, zipCode, city : String) in the roles billing and home.]

In object modeling terms, this association is a kind of aggregation, a “part of” relationship. Aggregation is a strong form of association: It has additional semantics with regard to the lifecycle of objects. In our case, we have an even stronger form, composition, where the lifecycle of the part is dependent on the lifecycle of the whole.

Object modeling experts and UML designers will claim that there is no difference between this composition and other weaker styles of association when it comes to the Java implementation. But in the context of ORM, there is a big difference: a composed class is often a candidate value type.

We now map Address as a value type and User as an entity. Does this affect the implementation of our POJO classes?

Java itself has no concept of composition; a class or attribute can’t be marked as a component or composition. The only difference is the object identifier: A component has no identity, hence the persistent component class requires no identifier property or identifier mapping. The composition between User and Address is a metadata-level notion; we only have to tell Hibernate that the Address is a value type in the mapping document.

Hibernate uses the term component for a user-defined class that is persisted to the same table as the owning entity, as shown in listing 3.7. (The use of the word component here has nothing to do with the architecture-level concept, as in software component.)

Listing 3.7 Mapping the User class with a component Address





B  Declare persistent attributes
C  Reuse component class

...

B  We declare the persistent attributes of Address inside the <component> element. The property of the User class is named homeAddress.

C  We reuse the same component class to map another property of this type to the same table.


Figure 3.6 shows how the attributes of the Address class are persisted to the same table as the User entity.

Figure 3.6 Table attributes of User

Notice that in this example, we have modeled the composition association as unidirectional. We can’t navigate from Address to User. Hibernate supports both unidirectional and bidirectional compositions; however, unidirectional composition is far more common. Here’s an example of a bidirectional mapping:





The <parent> element maps a property of type User to the owning entity; in this example, the property is named user. We then call Address.getUser() to navigate in the other direction.

A Hibernate component may own other components and even associations to other entities. This flexibility is the foundation of Hibernate’s support for fine-grained object models. (We’ll discuss various component mappings in chapter 6.)

However, there are two important limitations to classes mapped as components:

■ Shared references aren’t possible. The component Address doesn’t have its own database identity (primary key) and so a particular Address object can’t be referred to by any object other than the containing instance of User.

■ There is no elegant way to represent a null reference to an Address. In lieu of an elegant approach, Hibernate represents null components as null values in all mapped columns of the component. This means that if you store a component object with all null property values, Hibernate will return a null component when the owning entity object is retrieved from the database.
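The second limitation is easier to see in code. The following sketch imitates, by hand, how a component could be rebuilt from the mapped columns of one row. It is an illustration of the semantics described above, not Hibernate code; NullComponentSketch and fromColumns() are hypothetical names.

```java
// Hypothetical illustration of how an all-null set of component columns
// is read back as a null component.
final class NullComponentSketch {

    // Minimal value class standing in for the Address component.
    static final class Address {
        final String street;
        final String zipCode;
        final String city;

        Address(String street, String zipCode, String city) {
            this.street = street;
            this.zipCode = zipCode;
            this.city = city;
        }
    }

    // Builds an Address from the three mapped column values of one row.
    static Address fromColumns(String street, String zipCode, String city) {
        if (street == null && zipCode == null && city == null) {
            // The row can't distinguish "no address" from "an address whose
            // properties are all null", so the component comes back as null.
            return null;
        }
        return new Address(street, zipCode, city);
    }
}
```

As soon as one column holds a value, a component instance is returned; only the all-null case collapses to a null reference.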

Support for fine-grained classes isn’t the only ingredient of a rich domain model. Class inheritance and polymorphism are defining features of object-oriented models.


3.6 Mapping class inheritance

A simple strategy for mapping classes to database tables might be “one table for every class.” This approach sounds simple, and it works well until you encounter inheritance.

Inheritance is the most visible feature of the structural mismatch between the object-oriented and relational worlds. Object-oriented systems model both “is a” and “has a” relationships. SQL-based models provide only “has a” relationships between entities.

There are three different approaches to representing an inheritance hierarchy. These were catalogued by Scott Ambler [Ambler 2002] in his widely read paper “Mapping Objects to Relational Databases”:

■ Table per concrete class: Discard polymorphism and inheritance relationships completely from the relational model.

■ Table per class hierarchy: Enable polymorphism by denormalizing the relational model and using a type discriminator column to hold type information.

■ Table per subclass: Represent “is a” (inheritance) relationships as “has a” (foreign key) relationships.
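Before looking at each strategy, it may help to see the object-oriented side of the mismatch in code. The sketch below uses the class names from this chapter’s billing example (BillingDetails, CreditCard, BankAccount) with the fields cut down to a minimum; PolymorphicQuerySketch and its query() method are hypothetical and work on an in-memory list, not on a database. It shows what a polymorphic query is expected to return: every instance that “is a” member of the queried class.

```java
import java.util.ArrayList;
import java.util.List;

// Abstract superclass: every billing option "is a" BillingDetails.
abstract class BillingDetails {
    private final String owner;

    BillingDetails(String owner) { this.owner = owner; }

    String getOwner() { return owner; }
}

final class CreditCard extends BillingDetails {
    private final int type;

    CreditCard(String owner, int type) {
        super(owner);
        this.type = type;
    }

    int getType() { return type; }
}

final class BankAccount extends BillingDetails {
    private final String bankName;

    BankAccount(String owner, String bankName) {
        super(owner);
        this.bankName = bankName;
    }

    String getBankName() { return bankName; }
}

final class PolymorphicQuerySketch {

    // Returns all instances of the given class, including instances of its subclasses.
    static <T> List<T> query(List<?> all, Class<T> type) {
        List<T> result = new ArrayList<T>();
        for (Object candidate : all) {
            if (type.isInstance(candidate)) {
                result.add(type.cast(candidate));
            }
        }
        return result;
    }
}
```

A query for BillingDetails returns credit cards and bank accounts alike, while a query for CreditCard returns only credit cards. Each of the three mapping strategies has to reproduce this behavior with tables and SQL.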

This section takes a top down approach; it assumes that we’re starting with a domain model and trying to derive a new SQL schema. However, the mapping strategies described are just as relevant if we’re working bottom up, starting with existing database tables.

3.6.1 Table per concrete class

Suppose we stick with the simplest approach: We could use exactly one table for each (non-abstract) class. All properties of a class, including inherited properties, could be mapped to columns of this table, as shown in figure 3.7.

Figure 3.7 Mapping a composition bidirectional

The main problem with this approach is that it doesn’t support polymorphic associations very well. In the database, associations are usually represented as foreign key relationships. In figure 3.7, if the subclasses are all mapped to different tables, a polymorphic association to their superclass (abstract BillingDetails in this example) can’t be represented as a simple foreign key relationship. This would be problematic in our domain model, because BillingDetails is associated with User; hence both tables would need a foreign key reference to the USER table.

Polymorphic queries (queries that return objects of all classes that match the interface of the queried class) are also problematic. A query against the superclass must be executed as several SQL SELECTs, one for each concrete subclass. We might be able to use an SQL UNION to improve performance by avoiding multiple round trips to the database. However, unions are somewhat nonportable and otherwise difficult to work with. Hibernate doesn’t support the use of unions at the time of writing, and will always use multiple SQL queries. For a query against the BillingDetails class (for example, restricting to a certain date of creation), Hibernate would use the following SQL:

    select CREDIT_CARD_ID, OWNER, NUMBER, CREATED, TYPE, ...
    from CREDIT_CARD
    where CREATED = ?

    select BANK_ACCOUNT_ID, OWNER, NUMBER, CREATED, BANK_NAME, ...
    from BANK_ACCOUNT
    where CREATED = ?

Notice that a separate query is needed for each concrete subclass. On the other hand, queries against the concrete classes are trivial and perform well:

    select CREDIT_CARD_ID, TYPE, EXP_MONTH, EXP_YEAR
    from CREDIT_CARD
    where CREATED = ?

(Note that here, and in other places in this book, we show SQL that is conceptually identical to the SQL executed by Hibernate. The actual SQL might look superficially different.)

A further conceptual problem with this mapping strategy is that several different columns of different tables share the same semantics. This makes schema evolution more complex. For example, a change to a superclass property type results in changes to multiple columns. It also makes it much more difficult to implement database integrity constraints that apply to all subclasses.

This mapping strategy doesn’t require any special Hibernate mapping declaration: Simply create a new <class> declaration for each concrete class, specifying a different table attribute for each. We recommend this approach (only) for the top level of your class hierarchy, where polymorphism isn’t usually required.

3.6.2 Table per class hierarchy

Alternatively, an entire class hierarchy could be mapped to a single table. This table would include columns for all properties of all classes in the hierarchy. The concrete subclass represented by a particular row is identified by the value of a type discriminator column. This approach is shown in figure 3.8.

Figure 3.8 Table per class hierarchy mapping. [Diagram: the whole hierarchy maps to the single table BILLING_DETAILS with the columns BILLING_DETAILS_ID, BILLING_DETAILS_TYPE, OWNER, NUMBER, CREATED, CREDIT_CARD_TYPE, CREDIT_CARD_EXP_MONTH, CREDIT_CARD_EXP_YEAR, BANK_ACCOUNT_BANK_NAME, and BANK_ACCOUNT_BANK_SWIFT.]

This mapping strategy is a winner in terms of both performance and simplicity. It’s the best-performing way to represent polymorphism (both polymorphic and nonpolymorphic queries perform well), and it’s even easy to implement by hand. Ad hoc reporting is possible without complex joins or unions, and schema evolution is straightforward.

There is one major problem: Columns for properties declared by subclasses must be declared to be nullable. If your subclasses each define several non-nullable properties, the loss of NOT NULL constraints could be a serious problem from the point of view of data integrity.

In Hibernate, we use the <subclass> element to indicate a table-per-class hierarchy mapping, as in listing 3.8.


Listing 3.8 Hibernate <subclass> mapping

B  Root class, mapped to table
C  Discriminator column
D  Property mappings
E  CreditCard subclass

...

B  The root class BillingDetails of the inheritance hierarchy is mapped to the table BILLING_DETAILS.

C  We have to use a special column to distinguish between persistent classes: the discriminator. This isn’t a property of the persistent class; it’s used internally by Hibernate. The column name is BILLING_DETAILS_TYPE, and the values will be strings, in this case "CC" or "BA". Hibernate will automatically set and retrieve the discriminator values.

D  Properties of the superclass are mapped as always, with a <property> element.

E  Every subclass has its own <subclass> element. Properties of a subclass are mapped to columns in the BILLING_DETAILS table. Remember that not-null constraints aren’t allowed, because a CreditCard instance won’t have a bankSwift property and the BANK_ACCOUNT_BANK_SWIFT field must be null for that row.

The <subclass> element can in turn contain other <subclass> elements, until the whole hierarchy is mapped to the table. A <subclass> element can’t contain a <joined-subclass> element. (The <joined-subclass> element is used in the specification of the third mapping option: one table per subclass. This option is discussed in the next section.) The mapping strategy can’t be switched anymore at this point.

Hibernate would use the following SQL when querying the BillingDetails class:

    select BILLING_DETAILS_ID, BILLING_DETAILS_TYPE, OWNER, ..., CREDIT_CARD_TYPE
    from BILLING_DETAILS
    where CREATED = ?

To query the CreditCard subclass, Hibernate would use a condition on the discriminator:

    select BILLING_DETAILS_ID, CREDIT_CARD_TYPE, CREDIT_CARD_EXP_MONTH, ...
    from BILLING_DETAILS
    where BILLING_DETAILS_TYPE='CC' and CREATED = ?

How could it be any simpler than that?

3.6.3 Table per subclass

The third option is to represent inheritance relationships as relational foreign key associations. Every subclass that declares persistent properties (including abstract classes and even interfaces) has its own table.

Unlike the strategy that uses a table per concrete class, the table here contains columns only for each non-inherited property (each property declared by the subclass itself) along with a primary key that is also a foreign key of the superclass table. This approach is shown in figure 3.9.

If an instance of the CreditCard subclass is made persistent, the values of properties declared by the BillingDetails superclass are persisted to a new row of the BILLING_DETAILS table. Only the values of properties declared by the subclass are persisted to the new row of the CREDIT_CARD table. The two rows are linked together by their shared primary key value. Later, the subclass instance may be retrieved from the database by joining the subclass table with the superclass table.


Figure 3.9 Table per subclass mapping. [Diagram: class BillingDetails (owner, number : String; created : Date) with subclasses CreditCard (type : int; expMonth, expYear : String) and BankAccount (bankName, bankSwift : String). Tables: BILLING_DETAILS (BILLING_DETAILS_ID, OWNER, NUMBER, CREATED), CREDIT_CARD (CREDIT_CARD_ID, TYPE, EXP_MONTH, EXP_YEAR), and BANK_ACCOUNT (BANK_ACCOUNT_ID, BANK_NAME, BANK_SWIFT).]
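The way one object is split across two rows can be imitated with a small in-memory sketch. It is not Hibernate code: TablePerSubclassSketch is a hypothetical class in which each map stands in for a table, keyed by primary key, and only a few of the columns from figure 3.9 are used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the table-per-subclass layout.
final class TablePerSubclassSketch {
    // Each map stands in for a table: primary key -> row (column name -> value).
    final Map<Long, Map<String, Object>> billingDetails = new HashMap<Long, Map<String, Object>>();
    final Map<Long, Map<String, Object>> creditCard = new HashMap<Long, Map<String, Object>>();

    // Persists a credit card as two rows that share the same primary key value.
    void saveCreditCard(long id, String owner, String number, int type) {
        Map<String, Object> base = new HashMap<String, Object>();
        base.put("OWNER", owner);
        base.put("NUMBER", number);
        billingDetails.put(id, base); // properties declared by the superclass

        Map<String, Object> sub = new HashMap<String, Object>();
        sub.put("TYPE", type);
        creditCard.put(id, sub);      // properties declared by the subclass
    }

    // "Joins" the two rows on the shared key; returns null if id is not a credit card.
    Map<String, Object> loadCreditCard(long id) {
        Map<String, Object> sub = creditCard.get(id);
        Map<String, Object> base = billingDetails.get(id);
        if (sub == null || base == null) {
            return null;
        }
        Map<String, Object> joined = new HashMap<String, Object>(base);
        joined.putAll(sub);
        return joined;
    }
}
```

Saving one credit card produces one row per table, and loading it needs both rows. In SQL, that lookup becomes a join on the shared primary key.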

The primary advantage of this strategy is that the relational model is completely normalized. Schema evolution and integrity constraint definition are straightforward. A polymorphic association to a particular subclass may be represented as a foreign key pointing to the table of that subclass.

In Hibernate, we use the <joined-subclass> element to indicate a table-per-subclass mapping (see listing 3.9).

Listing 3.9 Hibernate <joined-subclass> mapping

B  Mapped to BILLING_DETAILS table
C  <joined-subclass> element
D  Primary/foreign key

...

B  Again, the root class BillingDetails is mapped to the table BILLING_DETAILS. Note that no discriminator is required with this strategy.

C  The new <joined-subclass> element is used to map a subclass to a new table (in this example, CREDIT_CARD). All properties declared in the joined subclass will be mapped to this table. Note that we intentionally left out the mapping example for BankAccount, which is similar to CreditCard.

D  A primary key is required for the CREDIT_CARD table; it will also have a foreign key constraint to the primary key of the BILLING_DETAILS table. A CreditCard object lookup will require a join of both tables.

A <joined-subclass> element may contain other <joined-subclass> elements but not a <subclass> element. Hibernate doesn’t support mixing of these two mapping strategies.

Hibernate will use an outer join when querying the BillingDetails class:

    select BD.BILLING_DETAILS_ID, BD.OWNER, BD.NUMBER, BD.CREATED,
           CC.TYPE, ..., BA.BANK_SWIFT, ...
           case
               when CC.CREDIT_CARD_ID is not null then 1
               when BA.BANK_ACCOUNT_ID is not null then 2
               when BD.BILLING_DETAILS_ID is not null then 0
           end as TYPE
    from BILLING_DETAILS BD
        left join CREDIT_CARD CC on
            BD.BILLING_DETAILS_ID = CC.CREDIT_CARD_ID
        left join BANK_ACCOUNT BA on
            BD.BILLING_DETAILS_ID = BA.BANK_ACCOUNT_ID
    where BD.CREATED = ?

The SQL case statement uses the existence (or nonexistence) of rows in the subclass tables CREDIT_CARD and BANK_ACCOUNT to determine the concrete subclass for a particular row of the BILLING_DETAILS table.

To narrow the query to the subclass, Hibernate uses an inner join instead:

    select BD.BILLING_DETAILS_ID, BD.OWNER, BD.CREATED, CC.TYPE, ...
    from CREDIT_CARD CC
        inner join BILLING_DETAILS BD on
            BD.BILLING_DETAILS_ID = CC.CREDIT_CARD_ID
    where BD.CREATED = ?
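The row-existence test performed by the case expression in the outer-join query can be mirrored in plain Java. The following is only a minimal sketch of that logic, not Hibernate code: the class SubclassResolver and its resolve() method are hypothetical names, and the three arguments stand for the key columns of one result row (null when the joined subclass row is absent).

```java
// Hypothetical illustration; Hibernate evaluates this test in SQL, not in Java.
final class SubclassResolver {
    static final int BILLING_DETAILS = 0;
    static final int CREDIT_CARD = 1;
    static final int BANK_ACCOUNT = 2;

    /**
     * Mirrors the branches of the case expression, in the same order:
     * a non-null subclass key means a row exists in that subclass table.
     */
    static int resolve(Long creditCardId, Long bankAccountId, Long billingDetailsId) {
        if (creditCardId != null) {
            return CREDIT_CARD;
        }
        if (bankAccountId != null) {
            return BANK_ACCOUNT;
        }
        if (billingDetailsId != null) {
            return BILLING_DETAILS;
        }
        // The SQL expression would yield NULL here; the sketch rejects the input instead.
        throw new IllegalArgumentException("no row in any table");
    }

    public static void main(String[] args) {
        // A row that has a matching CREDIT_CARD row but no BANK_ACCOUNT row.
        System.out.println(resolve(7L, null, 7L));
    }
}
```

The order of the checks matters only in that the superclass key is tested last, since it is non-null for every row of the result.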

As you can see, this mapping strategy is more difficult to implement by hand; even ad hoc reporting will be more complex. This is an important consideration if you plan to mix Hibernate code with handwritten SQL/JDBC. (For ad hoc reporting, database views provide a way to offset the complexity of the table-per-subclass strategy. A view may be used to transform the table-per-subclass model into the much simpler table-per-hierarchy model.)

Furthermore, even though this mapping strategy is deceptively simple, our experience is that performance may be unacceptable for complex class hierarchies. Queries always require either a join across many tables or many sequential reads. Our problem should be recast as how to choose an appropriate combination of mapping strategies for our application’s class hierarchies. A typical domain model design has a mix of interfaces and abstract classes.

3.6.4 Choosing a strategy

You can apply all mapping strategies to abstract classes and interfaces. Interfaces may have no state but may contain accessor method declarations, so they can be treated like abstract classes. You can map an interface using <class>, <subclass>, or <joined-subclass>; and you can map any declared or inherited property using <property>. Hibernate won’t try to instantiate an abstract class, however, even if you query or load it.

Here are some rules of thumb:

■ If you don’t require polymorphic associations or queries, lean toward the table-per-concrete-class strategy.
■ If you require polymorphic associations (an association to a superclass, hence to all classes in the hierarchy with dynamic resolution of the concrete class at runtime) or queries, and subclasses declare relatively few properties (particularly if the main difference between subclasses is in their behavior), lean toward the table-per-class-hierarchy model.
■ If you require polymorphic associations or queries, and subclasses declare many properties (subclasses differ mainly by the data they hold), lean toward the table-per-subclass approach.
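The rules of thumb above can be written down as a small decision function. This is a sketch for illustration only; the class MappingStrategyAdvisor, its enum, and the two boolean inputs are hypothetical simplifications of judgments that in practice are matters of degree.

```java
// Hypothetical helper that encodes the three rules of thumb; not part of Hibernate.
final class MappingStrategyAdvisor {
    enum Strategy {
        TABLE_PER_CONCRETE_CLASS,
        TABLE_PER_CLASS_HIERARCHY,
        TABLE_PER_SUBCLASS
    }

    /**
     * @param needsPolymorphism true if polymorphic associations or queries are required
     * @param subclassesDeclareManyProperties true if subclasses differ mainly by the data they hold
     */
    static Strategy suggest(boolean needsPolymorphism, boolean subclassesDeclareManyProperties) {
        if (!needsPolymorphism) {
            return Strategy.TABLE_PER_CONCRETE_CLASS;
        }
        return subclassesDeclareManyProperties
                ? Strategy.TABLE_PER_SUBCLASS
                : Strategy.TABLE_PER_CLASS_HIERARCHY;
    }

    public static void main(String[] args) {
        // Polymorphic queries needed, subclasses differ mostly in behavior.
        System.out.println(suggest(true, false));
    }
}
```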

By default, choose table-per-class-hierarchy for simple problems. For more complex cases (or when you’re overruled by a data modeler insisting upon the importance of nullability constraints), you should consider the table-per-subclass strategy. But at that point, ask yourself whether it might be better to remodel inheritance as delegation in the object model. Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. Hibernate acts as a buffer between the object and relational models, but that doesn’t mean you can completely ignore persistence concerns when designing your object model.

Note that you may also use <subclass> and <joined-subclass> mapping elements in a separate mapping file (as a top-level element, instead of <class>). You then have to declare the class that is extended (for example, <subclass name="CreditCard" extends="BillingDetails">), and the superclass mapping must be loaded before the subclass mapping file. This technique allows you to extend a class hierarchy without modifying the mapping file of the superclass.

You have now seen the intricacies of mapping an entity in isolation. In the next section, we turn to the problem of mapping associations between entities, which is another major issue arising from the object/relational paradigm mismatch.

3.7 Introducing associations

Managing the associations between classes and the relationships between tables is the soul of ORM. Most of the difficult problems involved in implementing an ORM solution relate to association management.

The Hibernate association model is extremely rich but is not without pitfalls, especially for new users. In this section, we won’t try to cover all the possible combinations. What we’ll do is examine certain cases that are extremely common. We return to the subject of association mappings in chapter 6, for a more complete treatment. But first, there’s something we need to explain up front.

3.7.1 Managed associations?

If you’ve used CMP 2.0/2.1, you’re familiar with the concept of a managed association (or managed relationship). CMP associations are called container-managed relationships (CMRs) for a reason. Associations in CMP are inherently bidirectional: A change made to one side of an association is instantly reflected at the other side. For example, if we call bid.setItem(item), the container automatically calls item.getBids().add(bid).

Transparent POJO-oriented persistence implementations such as Hibernate do not implement managed associations. Contrary to CMR, Hibernate associations are all inherently unidirectional. As far as Hibernate is concerned, the association from Bid to Item is a different association than the association from Item to Bid.

To some people, this seems strange; to others, it feels completely natural. After all, associations at the Java language level are always unidirectional, and Hibernate claims to implement persistence for plain Java objects. We’ll merely observe that this decision was made because Hibernate objects, unlike entity beans, are not assumed to be always under the control of a container. In Hibernate applications, the behavior of a non-persistent instance is the same as the behavior of a persistent instance.

Because associations are so important, we need a very precise language for classifying them.

3.7.2 Multiplicity

In describing and classifying associations, we’ll almost always use the association multiplicity. Look at figure 3.10. For us, the multiplicity is just two bits of information:

■ Can there be more than one Bid for a particular Item?
■ Can there be more than one Item for a particular Bid?

[Figure 3.10: Item (1..1) associated with Bid (0..*)]

Figure 3.10  Relationship between Item and Bid

After glancing at the object model, we conclude that the association from Bid to Item is a many-to-one association. Recalling that associations are directional, we would also call the inverse association from Item to Bid a one-to-many association. (Clearly, there are two more possibilities: many-to-many and one-to-one; we’ll get back to these possibilities in chapter 6.) In the context of object persistence, we aren’t interested in whether “many” really means “two” or “maximum of five” or “unrestricted.”

3.7.3 The simplest possible association

The association from Bid to Item is an example of the simplest possible kind of association in ORM. The object reference returned by getItem() is easily mapped to a foreign key column in the BID table. First, here’s the Java class implementation of Bid:

    public class Bid {
        ...
        private Item item;

        public void setItem(Item item) {
            this.item = item;
        }

        public Item getItem() {
            return item;
        }
        ...
    }

Next, here’s the Hibernate mapping for this association:

...

This mapping is called a unidirectional many-to-one association. The column ITEM_ID in the BID table is a foreign key to the primary key of the ITEM table.

We have explicitly specified the class, Item, that the association refers to. This specification is usually optional, since Hibernate can determine this using reflection. We specified the not-null attribute because we can’t have a bid without an item. The not-null attribute doesn’t affect the runtime behavior of Hibernate; it exists mainly to control automatic data definition language (DDL) generation (see chapter 9).

3.7.4 Making the association bidirectional

So far so good. But we also need to be able to easily fetch all the bids for a particular item. We need a bidirectional association here, so we have to add scaffolding code to the Item class:

    public class Item {
        ...
        private Set bids = new HashSet();

        public void setBids(Set bids) {
            this.bids = bids;
        }

        public Set getBids() {
            return bids;
        }

        public void addBid(Bid bid) {
            bid.setItem(this);
            bids.add(bid);
        }
        ...
    }
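The effect of the addBid() scaffolding can be checked without any persistence layer at all. The sketch below is a self-contained variant of the two classes, reduced to the association and written with generics (an assumption; the listings in this chapter use raw collections). The demo class BidirectionalLinkDemo is a hypothetical name introduced for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Reduced version of Bid: only the reference to Item is kept.
class Bid {
    private Item item;

    public void setItem(Item item) {
        this.item = item;
    }

    public Item getItem() {
        return item;
    }
}

// Reduced version of Item: only the bids collection is kept.
class Item {
    private Set<Bid> bids = new HashSet<>();

    public Set<Bid> getBids() {
        return bids;
    }

    // Convenience method: sets both ends of the association.
    public void addBid(Bid bid) {
        bid.setItem(this);
        bids.add(bid);
    }
}

class BidirectionalLinkDemo {
    public static void main(String[] args) {
        Item item = new Item();

        Bid linked = new Bid();
        item.addBid(linked);            // both ends are set
        Bid oneSided = new Bid();
        item.getBids().add(oneSided);   // only the collection end is set

        System.out.println(linked.getItem() == item);
        System.out.println(oneSided.getItem() == null);
    }
}
```

Adding a Bid directly to the collection leaves its item reference unset, which is why the convenience method is worth having.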

You can think of the code in addBid() (a convenience method) as implementing a managed association in the object model. A basic mapping for this one-to-many association would look like this:

...



The column mapping defined by the <key> element is a foreign key column of the associated BID table. Notice that we specify the same foreign key column in this collection mapping that we specified in the mapping for the many-to-one association. The table structure for this association mapping is shown in figure 3.11.




[Figure 3.11: tables ITEM (ITEM_ID, NAME, DESCRIPTION, INITIAL_PRICE, ...) and BID (BID_ID, ITEM_ID, AMOUNT, ...)]

Figure 3.11  Table relationships and keys for a one-to-many/many-to-one mapping

Now we have two different unidirectional associations mapped to the same foreign key, which poses a problem. At runtime, there are two different in-memory representations of the same foreign key value: the item property of Bid and an element of the bids collection held by an Item. Suppose our application modifies the association by, for example, adding a bid to an item in this fragment of the addBid() method:

    bid.setItem(item);
    bids.add(bid);

This code is fine, but in this situation, Hibernate detects two different changes to the in-memory persistent instances. From the point of view of the database, just one value must be updated to reflect these changes: the ITEM_ID column of the BID table. Hibernate doesn’t transparently detect the fact that the two changes refer to the same database column, since at this point we’ve done nothing to indicate that this is a bidirectional association.

We need one more thing in our association mapping to tell Hibernate to treat this as a bidirectional association: The inverse attribute tells Hibernate that the collection is a mirror image of the many-to-one association on the other side:

...




Without the inverse attribute, Hibernate would try to execute two different SQL statements, both updating the same foreign key column, when we manipulate the association between the two instances. By specifying inverse="true", we explicitly tell Hibernate which end of the association it should synchronize with the database. In this example, we tell Hibernate that it should propagate changes made at the Bid end of the association to the database, ignoring changes made only to the bids collection. Thus if we only call item.getBids().add(bid), no changes will be made persistent. This is consistent with the behavior in Java without Hibernate: If an association is bidirectional, you have to create the link on two sides, not just one.

We now have a working bidirectional many-to-one association (which could also be called a bidirectional one-to-many association, of course).

One final piece is missing. We explore the notion of transitive persistence in much greater detail in the next chapter. For now, we’ll introduce the concepts of cascading save and cascading delete, which we need in order to finish our mapping of this association.

When we instantiate a new Bid and add it to an Item, the bid should become persistent immediately. We would like to avoid the need to explicitly make a Bid persistent by calling save() on the Session interface.

We make one final tweak to the mapping document to enable cascading save:

...



The cascade attribute tells Hibernate to make any new Bid instance persistent (that is, save it in the database) if the Bid is referenced by a persistent Item.

The cascade attribute is directional: It applies to only one end of the association. We could also specify cascade="save-update" for the many-to-one association declared in the mapping for Bid, but doing so would make no sense in this case because Bids are created after Items.

Are we finished? Not quite. We still need to define the lifecycle for both entities in our association.

3.7.5 A parent/child relationship

With the previous mapping, the association between Bid and Item is fairly loose. We would use this mapping in a real system if both entities had their own lifecycle and were created and removed in unrelated business processes. Certain associations are much stronger than this; some entities are bound together so that their lifecycles aren’t truly independent. In our example, it seems reasonable that deletion of an item implies deletion of all bids for the item. A particular bid instance references only one item instance for its entire lifetime. In this case, cascading both saves and deletions makes sense.

If we enable cascading delete, the association between Item and Bid is called a parent/child relationship. In a parent/child relationship, the parent entity is responsible for the lifecycle of its associated child entities. This is the same semantics as a composition (using Hibernate components), but in this case only entities are involved; Bid isn’t a value type. The advantage of using a parent/child relationship is that the child may be loaded individually or referenced directly by another entity. A bid, for example, may be loaded and manipulated without retrieving the owning item. It may be stored without storing the owning item at the same time. Furthermore, we reference the same Bid instance in a second property of Item, the single successfulBid (see figure 3.2, page 63). Objects of value type can’t be shared.

To remodel the Item to Bid association as a parent/child relationship, the only change we need to make is to the cascade attribute:

...



We used cascade="all-delete-orphan" to indicate the following:

■ Any newly instantiated Bid becomes persistent if the Bid is referenced by a persistent Item (as was also the case with cascade="save-update").
■ Any persistent Bid should be deleted if it’s referenced by an Item when the item is deleted.
■ Any persistent Bid should be deleted if it’s removed from the bids collection of a persistent Item. (Hibernate will assume that it was only referenced by this item and consider it an orphan.)
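One way to picture the third rule is as a comparison between the contents of the bids collection when the Item was loaded and its contents when changes are synchronized: anything that has disappeared is an orphan. The following toy model makes that assumption explicit. It is not Hibernate code, and the class OrphanDetectionSketch and its method are hypothetical names; bids are represented by plain strings to keep the sketch self-contained.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of orphan detection by snapshot comparison (an assumption, not Hibernate's API).
final class OrphanDetectionSketch {

    /** Returns the elements that were in the snapshot but are no longer in the current collection. */
    static <T> Set<T> findOrphans(Set<T> snapshot, Set<T> current) {
        Set<T> orphans = new LinkedHashSet<>(snapshot);
        orphans.removeAll(current);
        return orphans;
    }

    public static void main(String[] args) {
        Set<String> snapshot = new LinkedHashSet<>(Arrays.asList("bid-1", "bid-2", "bid-3"));

        // The application removes one bid and adds a new one.
        Set<String> current = new LinkedHashSet<>(snapshot);
        current.remove("bid-2");
        current.add("bid-4");

        // Only the removed element is reported; the new element is not an orphan.
        System.out.println(findOrphans(snapshot, current));
    }
}
```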

We have achieved the following with this mapping: A Bid is removed from the database if it’s removed from the collection of Bids of the Item (or it’s removed if the Item itself is removed).

The cascading of operations to associated entities is Hibernate’s implementation of transitive persistence. We look more closely at this concept in chapter 4, section 4.3, “Using transitive persistence in Hibernate.”

We have covered only a tiny subset of the association options available in Hibernate. However, you already have enough knowledge to be able to build entire applications. The remaining options are either rare or are variations of the associations we have described.

We recommend keeping your association mappings simple, using Hibernate queries for more complex tasks.

3.8 Summary

In this chapter, we have focused on the structural aspect of the object/relational paradigm mismatch and have discussed the first four generic ORM problems. We discussed the programming model for persistent classes and the Hibernate ORM metadata for fine-grained classes, object identity, inheritance, and associations.

You now understand that persistent classes in a domain model should be free of cross-cutting concerns such as transactions and security. Even persistence-related concerns shouldn’t leak into the domain model. We no longer entertain the use of restrictive programming models such as EJB entity beans for our domain model. Instead, we use transparent persistence, together with the unrestrictive POJO programming model, which is really a set of best practices for the creation of properly encapsulated Java types.

Hibernate requires you to provide metadata in XML text format. You use this metadata to define the mapping strategy for all your persistent classes (and tables). We created mappings for classes and properties and looked at class association mappings. You saw how to implement the three well-known inheritance-mapping strategies in Hibernate.

You also learned about the important differences between entities and value-typed objects in Hibernate. Entities have their own identity and lifecycle, whereas value-typed objects are dependent on an entity and are persisted with by-value semantics. Hibernate allows fine-grained object models with fewer tables than persistent classes.

Finally, we have implemented and mapped our first parent/child association between persistent classes, using database foreign key fields and the cascading of operations.

In the next chapter, we investigate the dynamic aspects of the object/relational mismatch, including a much deeper study of the cascaded operations we introduced and the lifecycle of persistent objects.

Working with persistent objects

This chapter covers

■ The lifecycle of objects in a Hibernate application
■ Using the session persistence manager
■ Transitive persistence
■ Efficient fetching strategy

You now have an understanding of how Hibernate and ORM solve the static aspects of the object/relational mismatch. With what you know so far, it’s possible to solve the structural mismatch problem, but an efficient solution to the problem requires something more. We must investigate strategies for runtime data access, since they’re crucial to the performance of our applications. You need to learn how to efficiently store and load objects.

This chapter covers the behavioral aspect of the object/relational mismatch, listed in chapter 1 as the last four O/R mapping problems described in section 1.4.2. We consider these problems to be at least as important as the structural problems discussed in chapter 3. In our experience, many developers are only aware of the structural mismatch and rarely pay attention to the more dynamic behavioral aspects of the mismatch.

In this chapter, we discuss the lifecycle of objects (how an object becomes persistent, and how it stops being considered persistent) and the method calls and other actions that trigger these transitions. The Hibernate persistence manager, the Session, is responsible for managing object state, so you’ll learn how to use this important API.

Retrieving object graphs efficiently is another central concern, so we introduce the basic strategies in this chapter. Hibernate provides several ways to specify queries that return objects without losing much of the power inherent to SQL. Because network latency caused by remote access to the database can be an important limiting factor in the overall performance of Java applications, you must learn how to retrieve a graph of objects with a minimal number of database hits.

Let’s start by discussing objects, their lifecycle, and the events that trigger a change of persistent state. These basics will give you the background you need when working with your object graph, so you’ll know when and how to load and save your objects. The material might be formal, but a solid understanding of the persistence lifecycle is essential.

4.1 The persistence lifecycle

Since Hibernate is a transparent persistence mechanism (classes are unaware of their own persistence capability), it’s possible to write application logic that is unaware of whether the objects it operates on represent persistent state or temporary state that exists only in memory. The application shouldn’t necessarily need to care that an object is persistent when invoking its methods.

However, in any application with persistent state, the application must interact with the persistence layer whenever it needs to propagate state held in memory to the database (or vice versa). To do this, you call Hibernate’s persistence manager and query interfaces. When interacting with the persistence mechanism that way, it’s necessary for the application to concern itself with the state and lifecycle of an object with respect to persistence. We’ll refer to this as the persistence lifecycle.

Different ORM implementations use different terminology and define different states and state transitions for the persistence lifecycle. Moreover, the object states used internally might be different from those exposed to the client application. Hibernate defines only three states, hiding the complexity of its internal implementation from the client code. In this section, we explain these three states: transient, persistent, and detached.

Let’s look at these states and their transitions in a state chart, shown in figure 4.1. You can also see the method calls to the persistence manager that trigger transitions. We discuss this chart in this section; refer to it later whenever you need an overview.

In its lifecycle, an object can transition from a transient object to a persistent object to a detached object. Let’s take a closer look at each of these states.

4.1.1 Transient objects

In Hibernate, objects instantiated using the new operator aren’t immediately persistent. Their state is transient, which means they aren’t associated with any database table row, and so their state is lost as soon as they’re dereferenced (no longer referenced by any other object) by the application. These objects have a lifespan that effectively ends at that time, and they become inaccessible and available for garbage collection.

[Figure 4.1: state chart with the states Transient, Persistent, and Detached. Transition labels: new; get(), load(), find(), iterate(), etc.; save(), saveOrUpdate(); delete(); evict(), close()*, clear()*; update(), saveOrUpdate(), lock(); garbage. (* affects all instances in a Session)]

Figure 4.1  States of an object and transitions in a Hibernate application

Hibernate considers all transient instances to be nontransactional; a modification to the state of a transient instance isn’t made in the context of any transaction. This means Hibernate doesn’t provide any rollback functionality for transient objects. (In fact, Hibernate doesn’t roll back any object changes, as you’ll see later.)

Objects that are referenced only by other transient instances are, by default, also transient. For an instance to transition from transient to persistent state requires either a save() call to the persistence manager or the creation of a reference from an already persistent instance.

4.1.2 Persistent objects

A persistent instance is any instance with a database identity, as defined in chapter 3, section 3.4, “Understanding object identity.” That means a persistent instance has a primary key value set as its database identifier.

Persistent instances might be objects instantiated by the application and then made persistent by calling the save() method of the persistence manager (the Hibernate Session, discussed in more detail later in this chapter). Persistent instances are then associated with the persistence manager. They might even be objects that became persistent when a reference was created from another persistent object already associated with a persistence manager. Alternatively, a persistent instance might be an instance retrieved from the database by execution of a query, by an identifier lookup, or by navigating the object graph starting from another persistent instance. In other words, persistent instances are always associated with a Session and are transactional.

Persistent instances participate in transactions: their state is synchronized with the database at the end of the transaction. When a transaction commits, state held in memory is propagated to the database by the execution of SQL INSERT, UPDATE, and DELETE statements. This procedure might also occur at other times. For example, Hibernate might synchronize with the database before execution of a query. This ensures that queries will be aware of changes made earlier during the transaction.

We call a persistent instance new if it has been allocated a primary key value but has not yet been inserted into the database. The new persistent instance will remain “new” until synchronization occurs.

Of course, you don’t update the database row of every persistent object in memory at the end of the transaction. ORM software must have a strategy for detecting which persistent objects have been modified by the application in the transaction.
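A common way to detect modifications is to keep a copy of each object’s property values as they were loaded and to compare it with the current values at synchronization time. The sketch below illustrates that idea under stated assumptions: object state is modeled as a map from property name to value, and the class DirtyCheckSketch and its method are hypothetical names. It is not a description of Hibernate’s internals.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy snapshot comparison; property maps stand in for real persistent objects.
final class DirtyCheckSketch {

    /** Returns the names of properties whose current value differs from the loaded snapshot. */
    static List<String> dirtyProperties(Map<String, Object> snapshot, Map<String, Object> current) {
        List<String> dirty = new ArrayList<>();
        for (Map.Entry<String, Object> entry : current.entrySet()) {
            // Objects.equals handles null values on either side.
            if (!Objects.equals(snapshot.get(entry.getKey()), entry.getValue())) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        Map<String, Object> snapshot = new LinkedHashMap<>();
        snapshot.put("name", "Laptop");
        snapshot.put("initialPrice", 100);

        // The application changes one property after loading.
        Map<String, Object> current = new LinkedHashMap<>(snapshot);
        current.put("initialPrice", 120);

        System.out.println(dirtyProperties(snapshot, current)); // prints [initialPrice]
    }
}
```

An object with an empty result needs no UPDATE at all; a non-empty result identifies both the object and the properties that changed.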

We call this strategy automatic dirty checking (an object with modifications that haven’t yet been propagated to the database is considered dirty). Again, this state isn’t visible to the application. We call this feature transparent transaction-level write-behind, meaning that Hibernate propagates state changes to the database as late as possible but hides this detail from the application.

Hibernate can detect exactly which attributes have been modified, so it’s possible to include only the columns that need updating in the SQL UPDATE statement. This might bring performance gains, particularly with certain databases. However, it isn’t usually a significant difference, and, in theory, it could harm performance in some environments. So, by default, Hibernate includes all columns in the SQL UPDATE statement (hence, Hibernate can generate this basic SQL at startup, not at runtime). If you only want to update modified columns, you can enable dynamic SQL generation by setting dynamic-update="true" in a class mapping. (Note that this feature is extremely difficult to implement in a handcoded persistence layer.) We talk about Hibernate’s transaction semantics and the synchronization process (known as flushing) in more detail in the next chapter.

Finally, a persistent instance may be made transient via a delete() call to the persistence manager API, resulting in deletion of the corresponding row of the database table.

4.1.3 Detached objects

When a transaction completes, the persistent instances associated with the persistence manager still exist. (If the transaction was successful, their in-memory state will have been synchronized with the database.) In ORM implementations with process-scoped identity (see the following sections), the instances retain their association to the persistence manager and are still considered persistent.

In the case of Hibernate, however, these instances lose their association with the persistence manager when you close() the Session. We refer to these objects as detached, indicating that their state is no longer guaranteed to be synchronized with database state; they’re no longer under the management of Hibernate. However, they still contain persistent data (that may possibly soon be stale). It’s possible (and common) for the application to retain a reference to a detached object outside of a transaction (and persistence manager). Hibernate lets you reuse these instances in a new transaction by reassociating them with a new persistence manager. (After reassociation, they’re considered persistent.) This feature has a deep impact on how multitiered applications may be designed. The ability to return objects from one transaction to the presentation layer and later reuse them in a new transaction is one of Hibernate’s main selling points. We discuss this usage in the next chapter as an implementation technique for long-running application transactions. We also show you how to avoid the DTO (anti-) pattern by using detached objects in chapter 8, in the section “Rethinking data transfer objects.”

Hibernate also provides an explicit detachment operation: the evict() method of the Session. However, this method is typically used only for cache management (a performance consideration). It’s not normal to perform detachment explicitly. Rather, all objects retrieved in a transaction become detached when the Session is closed or when they’re serialized (if they’re passed remotely, for example). So, Hibernate doesn’t need to provide functionality for controlling detachment of subgraphs. Instead, the application can control the depth of the fetched subgraph (the instances that are currently loaded in memory) using the query language or explicit graph navigation. Then, when the Session is closed, this entire subgraph (all objects associated with a persistence manager) becomes detached.

Let’s look at the different states again but this time consider the scope of object identity.

4.1.4 The scope of object identity

As application developers, we identify an object using Java object identity (a==b). So, if an object changes state, is its Java identity guaranteed to be the same in the new state? In a layered application, that might not be the case. In order to explore this topic, it’s important to understand the relationship between Java identity, a==b, and database identity, a.getId().equals( b.getId() ). Sometimes both are equivalent; sometimes they aren’t. We refer to the conditions under which Java identity is equivalent to database identity as the scope of object identity. For this scope, there are three common choices:

■ A primitive persistence layer with no identity scope makes no guarantees that if a row is accessed twice, the same Java object instance will be returned to the application. This becomes problematic if the application modifies two different instances that both represent the same row in a single transaction (how do you decide which state should be propagated to the database?).

■ A persistence layer using transaction-scoped identity guarantees that, in the context of a single transaction, there is only one object instance that represents a particular database row. This avoids the previous problem and also allows for some caching to be done at the transaction level.

■ Process-scoped identity goes one step further and guarantees that there is only one object instance representing the row in the whole process (JVM).
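The difference between having an identity scope and having none can be made concrete with a few lines of plain Java. The following is only a minimal sketch and not how Hibernate is implemented; the names ScopedContext and Row are invented for the illustration. Each ScopedContext keeps its own identity map, so two lookups of the same identifier in one context yield the same instance, while a second context yields a different one.

```java
import java.util.HashMap;
import java.util.Map;

class IdentityScopeDemo {
    public static void main(String[] args) {
        ScopedContext first = new ScopedContext();
        ScopedContext second = new ScopedContext();

        Row a = first.load(1234L);
        Row b = first.load(1234L);
        Row c = second.load(1234L);

        System.out.println(a == b);       // true: same scope, same instance
        System.out.println(a == c);       // false: different scopes
        System.out.println(a.id == c.id); // true: same database identity
    }
}

// Hypothetical stand-in for a persistent class.
class Row {
    final long id;
    Row(long id) { this.id = id; }
}

// Toy persistence context that guarantees Java identity only within itself.
class ScopedContext {
    private final Map<Long, Row> identityMap = new HashMap<>();

    // Returns the cached instance if this context has already seen the id;
    // otherwise simulates reading the row by creating a new instance.
    Row load(long id) {
        return identityMap.computeIfAbsent(id, Row::new);
    }
}
```

A process-scoped variant would share one such map across all threads, which is where the synchronization cost comes from.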


For a typical web or enterprise application, transaction-scoped identity is preferred. Process-scoped identity offers some potential advantages in terms of cache utilization and the programming model for reuse of instances across multiple transactions; however, in a pervasively multithreaded application, the cost of always synchronizing shared access to persistent objects in the global identity map is too high a price to pay. It’s simpler, and more scalable, to have each thread work with a distinct set of persistent instances in each transaction scope.

Speaking loosely, we would say that Hibernate implements transaction-scoped identity. Actually, the Hibernate identity scope is the Session instance, so identical objects are guaranteed if the same persistence manager (the Session) is used for several operations. But a Session isn’t the same as a (database) transaction; it’s a much more flexible element. We’ll explore the differences and the consequences of this concept in the next chapter. Let’s focus on the persistence lifecycle and identity scope again.

If you request two objects using the same database identifier value in the same Session, the result will be two references to the same in-memory object. The following code example demonstrates this behavior, with several load() operations in two Sessions:

    Session session1 = sessions.openSession();
    Transaction tx1 = session1.beginTransaction();

    // Load Category with identifier value "1234"
    Object a = session1.load(Category.class, new Long(1234) );
    Object b = session1.load(Category.class, new Long(1234) );

    if ( a==b ) {
        System.out.println("a and b are identical.");
    }

    tx1.commit();
    session1.close();

    Session session2 = sessions.openSession();
    Transaction tx2 = session2.beginTransaction();

    Object b2 = session2.load(Category.class, new Long(1234) );

    if ( a!=b2 ) {
        System.out.println("a and b2 are not identical.");
    }

    tx2.commit();
    session2.close();

Object references a and b not only have the same database identity, they also have the same Java identity since they were loaded in the same Session. Once outside this boundary, however, Hibernate doesn’t guarantee Java identity, so a and b2 aren’t identical and the message is printed on the console. Of course, a test for database identity, a.getId().equals( b2.getId() ), would still return true.

To further complicate our discussion of identity scopes, we need to consider how the persistence layer handles a reference to an object outside its identity scope. For example, for a persistence layer with transaction-scoped identity such as Hibernate, is a reference to a detached object (that is, an instance persisted or loaded in a previous, completed session) tolerated?

4.1.5 Outside the identity scope

If an object reference leaves the scope of guaranteed identity, we call it a reference to a detached object. Why is this concept useful?

In web applications, you usually don’t maintain a database transaction across a user interaction. Users take a long time to think about modifications, but for scalability reasons, you must keep database transactions short and release database resources as soon as possible. In this environment, it’s useful to be able to reuse a reference to a detached instance. For example, you might want to send an object retrieved in one unit of work to the presentation tier and later reuse it in a second unit of work, after it’s been modified by the user.

You don’t usually wish to reattach the entire object graph in the second unit of work; for performance (and other) reasons, it’s important that reassociation of detached instances be selective. Hibernate supports selective reassociation of detached instances. This means the application can efficiently reattach a subgraph of a graph of detached objects with the current (“second”) Hibernate Session. Once a detached object has been reattached to a new Hibernate persistence manager, it may be considered a persistent instance, and its state will be synchronized with the database at the end of the transaction (due to Hibernate’s automatic dirty checking of persistent instances).

Reattachment might result in the creation of new rows in the database when a reference is created from a detached instance to a new transient instance. For example, a new Bid might have been added to a detached Item while it was on the presentation tier. Hibernate can detect that the Bid is new and must be inserted in the database. For this to work, Hibernate must be able to distinguish between a “new” transient instance and an “old” detached instance.
Transient instances (such as the Bid) might need to be saved; detached instances (such as the Item) might need to be reattached (and later updated in the database). There are several ways to distinguish between transient and detached instances, but the nicest approach is to look at the value of the identifier property. Hibernate can examine the identifier of a transient or detached object on reattachment and treat the object (and the associated graph of objects) appropriately. We discuss this important issue further in section 4.3.4, “Distinguishing between transient and detached instances.”

If you want to take advantage of Hibernate’s support for reassociation of detached instances in your own applications, you need to be aware of Hibernate’s identity scope when designing your application, that is, the Session scope that guarantees identical instances. As soon as you leave that scope and have detached instances, another interesting concept comes into play.

We need to discuss the relationship between Java equality (see chapter 3, section 3.4.1, “Identity versus equality”) and database identity. Equality is an identity concept that you, as a class developer, control and that you can (and sometimes have to) use for classes that have detached instances. Java equality is defined by the implementation of the equals() and hashCode() methods in the persistent classes of the domain model.

4.1.6 Implementing equals() and hashCode()

The equals() method is called by application code or, more importantly, by the Java collections. A Set collection, for example, calls equals() on each object you put in the Set, to determine (and prevent) duplicate elements.

First let’s consider the default implementation of equals(), defined by java.lang.Object, which uses a comparison by Java identity. Hibernate guarantees that there is a unique instance for each row of the database inside a Session. Therefore, the default identity equals() is appropriate if you never mix instances, that is, if you never put detached instances from different sessions into the same Set. (Actually, the issue we’re exploring is also visible if detached instances are from the same session but have been serialized and deserialized in different scopes.) As soon as you have instances from multiple sessions, however, it becomes possible to have a Set containing two Items that each represent the same row of the database table but don’t have the same Java identity. This would almost always be semantically wrong. Nevertheless, it’s possible to build a complex application with identity (default) equals as long as you exercise discipline when dealing with detached objects from different sessions (and keep an eye on serialization and deserialization). One nice thing about this approach is that you don’t have to write extra code to implement your own notion of equality.

However, if this concept of equality isn’t what you want, you have to override equals() in your persistent classes. Keep in mind that when you override equals(), you always need to also override hashCode() so the two methods are consistent (if two objects are equal, they must have the same hash code). Let’s look at some of the ways you can override equals() and hashCode() in persistent classes.
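The duplicate-row situation is easy to reproduce without Hibernate. In the following minimal sketch, PlainItem is a hypothetical class that inherits equals() and hashCode() from java.lang.Object, and the two instances simply stand in for copies of the same row obtained in two different sessions.

```java
import java.util.HashSet;
import java.util.Set;

class DefaultEqualsDemo {
    public static void main(String[] args) {
        // Two in-memory copies of the same row, as if loaded in two sessions.
        PlainItem fromFirstSession = new PlainItem(42L);
        PlainItem fromSecondSession = new PlainItem(42L);

        Set<PlainItem> items = new HashSet<>();
        items.add(fromFirstSession);
        items.add(fromSecondSession);

        // Identity-based equals() treats the copies as distinct elements.
        System.out.println(items.size()); // 2
    }
}

// Hypothetical persistent class that does not override equals() or hashCode().
class PlainItem {
    final Long id;
    PlainItem(Long id) { this.id = id; }
}
```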


Using database identifier equality

A clever approach is to implement equals() to compare just the database identifier property (usually a surrogate primary key) value:

    public class User {
        ...

        public boolean equals(Object other) {
            if (this==other) return true;
            if (id==null) return false;
            if ( !(other instanceof User) ) return false;
            final User that = (User) other;
            return this.id.equals( that.getId() );
        }

        public int hashCode() {
            return id==null ?
                System.identityHashCode(this) :
                id.hashCode();
        }
    }

Notice how this equals() method falls back to Java identity for transient instances (if id==null) that don’t have a database identifier value assigned yet. This is reasonable, since they can’t have the same persistent identity as another instance.

Unfortunately, this solution has one huge problem: Hibernate doesn’t assign identifier values until an entity is saved. So, if the object is added to a Set before being saved, its hash code changes while it’s contained by the Set, contrary to the contract of java.util.Set. In particular, this problem makes cascade save (discussed later in this chapter) useless for sets. We strongly discourage this solution (database identifier equality).

Comparing by value

A better way is to include all persistent properties of the persistent class, apart from any database identifier property, in the equals() comparison. This is how most people perceive the meaning of equals(); we call it by value equality.

When we say “all properties,” we don’t mean to include collections. Collection state is associated with a different table, so it seems wrong to include it. More important, you don’t want to force the entire object graph to be retrieved just to perform equals(). In the case of User, this means you shouldn’t include the items collection (the items sold by this user) in the comparison. So, this is the implementation you could use:

    public class User {
        ...

        public boolean equals(Object other) {
            if (this==other) return true;
            if ( !(other instanceof User) ) return false;
            final User that = (User) other;
            if ( !this.getUsername().equals( that.getUsername() ) )
                return false;
            if ( !this.getPassword().equals( that.getPassword() ) )
                return false;
            return true;
        }

        public int hashCode() {
            int result = 14;
            result = 29 * result + getUsername().hashCode();
            result = 29 * result + getPassword().hashCode();
            return result;
        }
    }

However, there are again two problems with this approach:

■ Instances from different sessions are no longer equal if one is modified (for example, if the user changes his password).

■ Instances with different database identity (instances that represent different rows of the database table) could be considered equal, unless there is some combination of properties that are guaranteed to be unique (the database columns have a unique constraint). In the case of User, there is a unique property: username.
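The first of these problems can be seen in a small stand-alone sketch. ValueUser below is a hypothetical stand-in for a user class compared by value on username and password; the two objects play the role of detached copies of the same row.

```java
import java.util.Objects;

class ByValueEqualityDemo {
    public static void main(String[] args) {
        // Two detached copies of the same row, e.g. obtained in different sessions.
        ValueUser copyA = new ValueUser("johndoe", "secret");
        ValueUser copyB = new ValueUser("johndoe", "secret");
        System.out.println(copyA.equals(copyB)); // true

        // The password is changed through one of the copies.
        copyB.setPassword("changed");
        System.out.println(copyA.equals(copyB)); // false, although both represent the same row
    }
}

// Hypothetical user class with by-value equality.
class ValueUser {
    private final String username;
    private String password;

    ValueUser(String username, String password) {
        this.username = username;
        this.password = password;
    }

    String getUsername() { return username; }
    String getPassword() { return password; }
    void setPassword(String password) { this.password = password; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ValueUser)) return false;
        ValueUser that = (ValueUser) other;
        return getUsername().equals(that.getUsername())
            && getPassword().equals(that.getPassword());
    }

    @Override
    public int hashCode() {
        return Objects.hash(getUsername(), getPassword());
    }
}
```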

To get to the solution we recommend, you need to understand the notion of a business key.

Using business key equality

A business key is a property, or some combination of properties, that is unique for each instance with the same database identity. Essentially, it’s the natural key you’d use if you weren’t using a surrogate key. Unlike a natural primary key, it isn’t an absolute requirement that the business key never change; as long as it changes rarely, that’s enough.

We argue that every entity should have a business key, even if it includes all properties of the class (this would be appropriate for some immutable classes). The business key is what the user thinks of as uniquely identifying a particular record, whereas the surrogate key is what the application and database use.


Business key equality means that the equals() method compares only the properties that form the business key. This is a perfect solution that avoids all the problems described earlier. The only downside is that it requires extra thought to identify the correct business key in the first place. But this effort is required anyway; it’s important to identify any unique keys if you want your database to help ensure data integrity via constraint checking.

For the User class, username is a great candidate business key. It’s never null, it’s unique, and it changes rarely (if ever):

    public class User {
        ...

        public boolean equals(Object other) {
            if (this==other) return true;
            if ( !(other instanceof User) ) return false;
            final User that = (User) other;
            return this.username.equals( that.getUsername() );
        }

        public int hashCode() {
            return username.hashCode();
        }
    }

For some other classes, the business key might be more complex, consisting of a combination of properties. For example, candidate business keys for the Bid class are the item ID together with the bid amount, or the item ID together with the date and time of the bid. A good business key for the BillingDetails abstract class is the number together with the type (subclass) of billing details. Notice that it’s almost never correct to override equals() on a subclass and include another property in the comparison. It’s tricky to satisfy the requirements that equality be both symmetric and transitive in this case; and, more important, the business key wouldn’t correspond to any well-defined candidate natural key in the database (subclass properties may be mapped to a different table).

You might have noticed that the equals() and hashCode() methods always access the properties of the other object via the getter methods. This is important, since the object instance passed as other might be a proxy object, not the actual instance that holds the persistent state. This is one point where Hibernate isn’t completely transparent, but it’s a good practice to use accessor methods instead of direct instance variable access anyway.

Finally, take care when modifying the value of the business key properties; don’t change the value while the domain object is in a set.
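To see why business key equality behaves well with sets, consider the following plain-Java sketch. KeyedUser is a hypothetical class, and setId() merely simulates the moment at which a surrogate identifier would be assigned on save. Because the hash code depends only on the username, it stays stable when the identifier is assigned, and a copy with the same username is recognized as a duplicate.

```java
import java.util.HashSet;
import java.util.Set;

class BusinessKeyDemo {
    public static void main(String[] args) {
        KeyedUser user = new KeyedUser("johndoe");
        Set<KeyedUser> users = new HashSet<>();
        users.add(user);                 // added while still transient (id is null)

        user.setId(1L);                  // simulates identifier assignment on save
        System.out.println(users.contains(user)); // true: the hash code did not change

        KeyedUser otherCopy = new KeyedUser("johndoe"); // e.g. loaded in another session
        otherCopy.setId(1L);
        System.out.println(users.add(otherCopy));  // false: recognized as a duplicate
    }
}

// Hypothetical user class whose equality is based on the business key only.
class KeyedUser {
    private Long id;                 // surrogate key, deliberately ignored by equals()
    private final String username;   // business key

    KeyedUser(String username) { this.username = username; }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getUsername() { return username; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof KeyedUser)) return false;
        KeyedUser that = (KeyedUser) other;
        return getUsername().equals(that.getUsername());
    }

    @Override
    public int hashCode() {
        return getUsername().hashCode();
    }
}
```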


We’ve talked about the persistence manager in this section. It’s time to take a closer look at the persistence manager and explore the Hibernate Session API in greater detail. (We’ll come back to detached objects with more details in the next chapter.)

4.2 The persistence manager

Any transparent persistence tool includes a persistence manager API, which usually provides services for

■ Basic CRUD operations

■ Query execution

■ Control of transactions

■ Management of the transaction-level cache

The persistence manager can be exposed by several different interfaces (in the case of Hibernate, Session, Query, Criteria, and Transaction). Under the covers, the implementations of these interfaces are coupled tightly.

The central interface between the application and Hibernate is Session; it’s your starting point for all the operations just listed. For most of the rest of this book, we’ll refer to the persistence manager and the session interchangeably; this is consistent with usage in the Hibernate community.

So, how do you start using the session? At the beginning of a unit of work, a thread obtains an instance of Session from the application’s SessionFactory. The application might have multiple SessionFactorys if it accesses multiple datasources. But you should never create a new SessionFactory just to service a particular request; creation of a SessionFactory is extremely expensive. On the other hand, Session creation is extremely inexpensive; the Session doesn’t even obtain a JDBC Connection until a connection is required.

After opening a new session, you use it to load and save objects.

4.2.1 Making an object persistent

The first thing you want to do with a Session is make a new transient object persistent. To do so, you use the save() method:

    User user = new User();
    user.getName().setFirstname("John");
    user.getName().setLastname("Doe");

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    session.save(user);

    tx.commit();
    session.close();

First, we instantiate a new transient object user as usual. Of course, we might also instantiate it after opening a Session; they aren’t related yet. We open a new Session using the SessionFactory referred to by sessions, and then we start a new database transaction.

A call to save() makes the transient instance of User persistent. It’s now associated with the current Session. However, no SQL INSERT has yet been executed. The Hibernate Session never executes any SQL statement until absolutely necessary.

The changes made to persistent objects have to be synchronized with the database at some point. This happens when we commit() the Hibernate Transaction. In this case, Hibernate obtains a JDBC connection and issues a single SQL INSERT statement. Finally, the Session is closed and the JDBC connection is released.

Note that it’s better (but not required) to fully initialize the User instance before associating it with the Session. The SQL INSERT statement contains the values that were held by the object at the point when save() was called. You can, of course, modify the object after calling save(), and your changes will be propagated to the database as an SQL UPDATE.

Everything between session.beginTransaction() and tx.commit() occurs in one database transaction. We haven’t discussed transactions in detail yet; we’ll leave that topic for the next chapter. But keep in mind that all database operations in a transaction scope either completely succeed or completely fail. If one of the UPDATE or INSERT statements made on tx.commit() fails, all changes made to persistent objects in this transaction will be rolled back at the database level. However, Hibernate does not roll back in-memory changes to persistent objects; this is reasonable since a failure of a database transaction is normally nonrecoverable and you have to discard the failed Session immediately.

4.2.2 Updating the persistent state of a detached instance

Modifying the user after the session is closed will have no effect on its persistent representation in the database. When the session is closed, user becomes a detached instance. It may be reassociated with a new Session by calling update() or lock().


The update() method forces an update to the persistent state of the object in the database, scheduling an SQL UPDATE. Here’s an example of detached object handling:

    user.setPassword("secret");

    Session sessionTwo = sessions.openSession();
    Transaction tx = sessionTwo.beginTransaction();

    sessionTwo.update(user);
    user.setUsername("jonny");

    tx.commit();
    sessionTwo.close();

It doesn’t matter if the object is modified before or after it’s passed to update(). The important thing is that the call to update() is used to reassociate the detached instance to the new Session (and current transaction) and tells Hibernate to treat the object as dirty (unless select-before-update is enabled for the persistent class mapping, in which case Hibernate will determine if the object is dirty by executing a SELECT statement and comparing the object’s current state to the current database state).

A call to lock() associates the object with the Session without forcing an update, as shown here:

    Session sessionTwo = sessions.openSession();
    Transaction tx = sessionTwo.beginTransaction();

    sessionTwo.lock(user, LockMode.NONE);

    user.setPassword("secret");
    user.setUsername("jonny");

    tx.commit();
    sessionTwo.close();

In this case, it does matter whether changes are made before or after the object is associated with the session. Changes made before the call to lock() aren’t propagated to the database; you only use lock() if you’re sure that the detached instance hasn’t been modified.

We discuss Hibernate lock modes in the next chapter. By specifying LockMode.NONE here, we tell Hibernate not to perform a version check or obtain any database-level locks when reassociating the object with the Session. If we specified LockMode.READ or LockMode.UPGRADE, Hibernate would execute a SELECT statement in order to perform a version check (and to set an upgrade lock).
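The behavioral difference between the two reattachment styles can be modeled in a few lines. The following toy model is an assumption-laden sketch, not Hibernate code: ToyContext, Account, and flush() are invented names, and the model only captures the idea that one style marks the instance dirty unconditionally while the other starts tracking changes from the moment of reattachment.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class ReattachDemo {
    public static void main(String[] args) {
        Account viaUpdate = new Account("old");
        viaUpdate.password = "changed-before";   // modified while detached
        ToyContext first = new ToyContext();
        first.update(viaUpdate);
        System.out.println(first.flush());       // [UPDATE]

        Account viaLock = new Account("old");
        viaLock.password = "changed-before";     // modified while detached
        ToyContext second = new ToyContext();
        second.lock(viaLock);
        System.out.println(second.flush());      // []: the earlier change goes unnoticed

        viaLock.password = "changed-after";      // modified after reattachment
        System.out.println(second.flush());      // [UPDATE]
    }
}

// Hypothetical persistent class with a single mutable property.
class Account {
    String password;
    Account(String password) { this.password = password; }
}

// Toy persistence context: remembers a snapshot per attached instance.
class ToyContext {
    private final List<Account> attached = new ArrayList<>();
    private final Map<Account, String> snapshots = new IdentityHashMap<>();
    private final List<Account> forcedDirty = new ArrayList<>();

    // Reattach and treat the instance as dirty regardless of its state.
    void update(Account a) { attach(a); forcedDirty.add(a); }

    // Reattach without forcing an update; later changes are detected by comparison.
    void lock(Account a) { attach(a); }

    private void attach(Account a) {
        attached.add(a);
        snapshots.put(a, a.password);
    }

    // Returns the statements that would be scheduled for the attached instances.
    List<String> flush() {
        List<String> statements = new ArrayList<>();
        for (Account a : attached) {
            boolean changed = !a.password.equals(snapshots.get(a));
            if (forcedDirty.contains(a) || changed) {
                statements.add("UPDATE");
                snapshots.put(a, a.password);
            }
        }
        forcedDirty.clear();
        return statements;
    }
}
```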

4.2.3 Retrieving a persistent object

The Session is also used to query the database and retrieve existing persistent objects. Hibernate is especially powerful in this area, as you’ll see later in this chapter and in chapter 7. However, special methods are provided on the Session API for the simplest kind of query: retrieval by identifier. One of these methods is get(), demonstrated here:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    int userID = 1234;
    User user = (User) session.get(User.class, new Long(userID));

    tx.commit();
    session.close();

The retrieved object user may now be passed to the presentation layer for use outside the transaction as a detached instance (after the session has been closed). If no row with the given identifier value exists in the database, get() returns null.

4.2.4 Updating a persistent object

Any persistent object returned by get() or any other kind of query is already associated with the current Session and transaction context. It can be modified, and its state will be synchronized with the database. This mechanism is called automatic dirty checking, which means Hibernate will track and save the changes you make to an object inside a session:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    int userID = 1234;
    User user = (User) session.get(User.class, new Long(userID));

    user.setPassword("secret");

    tx.commit();
    session.close();

First we retrieve the object from the database with the given identifier. We modify the object, and these modifications are propagated to the database when tx.commit() is called. Of course, as soon as we close the Session, the instance is considered detached.

4.2.5 Making a persistent object transient

You can easily make a persistent object transient, removing its persistent state from the database, using the delete() method:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    int userID = 1234;
    User user = (User) session.get(User.class, new Long(userID));

    session.delete(user);

    tx.commit();
    session.close();

The SQL DELETE will be executed only when the Session is synchronized with the database at the end of the transaction.

After the Session is closed, the user object is considered an ordinary transient instance. The transient instance will be destroyed by the garbage collector if it’s no longer referenced by any other object. Both the in-memory object instance and the persistent database row will have been removed.

4.2.6 Making a detached object transient

Finally, you can make a detached instance transient, deleting its persistent state from the database. This means you don’t have to reattach (with update() or lock()) a detached instance to delete it from the database; you can directly delete a detached instance:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    session.delete(user);

    tx.commit();
    session.close();

In this case, the call to delete() does two things: It associates the object with the Session and then schedules the object for deletion, executed on tx.commit().

You now know the persistence lifecycle and the basic operations of the persistence manager. Together with the persistent class mappings we discussed in chapter 3, you can create your own small Hibernate application. (If you like, you can jump to chapter 8 and read about a handy Hibernate helper class for SessionFactory and Session management.) Keep in mind that we didn’t show you any exception-handling code so far, but you should be able to figure out the try/catch blocks yourself. Map some simple entity classes and components, and then store and load objects in a stand-alone application (you don’t need a web container or application server, just write a main method). However, as soon as you try to store associated entity objects (that is, when you deal with a more complex object graph), you’ll see that calling save() or delete() on each object of the graph isn’t an efficient way to write applications.

You’d like to make as few calls to the Session as possible. Transitive persistence provides a more natural way to force object state changes and to control the persistence lifecycle.
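As a starting point for the exception handling mentioned above, here is one possible shape for a unit of work. It's only a sketch: StubSession is a hypothetical stand-in that records calls, not a Hibernate type, and the real API, in particular the exception types it throws, may differ between versions. What the sketch shows is the ordering: commit on success, roll back on failure, and always close.

```java
import java.util.ArrayList;
import java.util.List;

class UnitOfWorkDemo {
    public static void main(String[] args) {
        StubSession session = new StubSession();
        try {
            runInTransaction(session, () -> {
                throw new IllegalStateException("constraint violated");
            });
        } catch (RuntimeException e) {
            System.out.println("failed: " + e.getMessage()); // failed: constraint violated
        }
        System.out.println(session.log); // [begin, rollback, close]
    }

    // Runs the work between begin and commit; rolls back on failure and
    // closes the session in every case.
    static void runInTransaction(StubSession session, Runnable work) {
        session.begin();
        try {
            work.run();
            session.commit();
        } catch (RuntimeException e) {
            session.rollback();
            throw e; // let the caller decide how to report the failure
        } finally {
            session.close();
        }
    }
}

// Hypothetical stand-in that only records which operations were called.
class StubSession {
    final List<String> log = new ArrayList<>();
    void begin()    { log.add("begin"); }
    void commit()   { log.add("commit"); }
    void rollback() { log.add("rollback"); }
    void close()    { log.add("close"); }
}
```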

4.3 Using transitive persistence in Hibernate

Real, nontrivial applications work not with single objects but rather with graphs of objects. When the application manipulates a graph of persistent objects, the result may be an object graph consisting of persistent, detached, and transient instances. Transitive persistence is a technique that allows you to propagate persistence to transient and detached subgraphs automatically.

For example, if we add a newly instantiated Category to the already persistent hierarchy of categories, it should automatically become persistent without a call to Session.save(). We gave a slightly different example in chapter 3 when we mapped a parent/child relationship between Bid and Item. In that case, not only were bids automatically made persistent when they were added to an item, but they were also automatically deleted when the owning item was deleted.

There is more than one model for transitive persistence. The best known is persistence by reachability, which we’ll discuss first. Although some basic principles are the same, Hibernate uses its own, more powerful model, as you’ll see later.

4.3.1 Persistence by reachability

An object persistence layer is said to implement persistence by reachability if any instance becomes persistent when the application creates an object reference to the instance from another instance that is already persistent. This behavior is illustrated by the object diagram (note that this isn’t a class diagram) in figure 4.2.

Figure 4.2 Persistence by reachability with a root persistent object (object diagram of Category instances: Electronics and Cell Phones are transient, Computer is persistent, Desktop PCs and Monitors are persistent by reachability)


In this example, “Computer” is a persistent object. The objects “Desktop PCs” and “Monitors” are also persistent; they’re reachable from the “Computer” Category instance. “Electronics” and “Cell Phones” are transient. Note that we assume navigation is only possible to child categories, and not to the parent; for example, we can call computer.getChildCategories(). Persistence by reachability is a recursive algorithm: All objects reachable from a persistent instance become persistent either when the original instance is made persistent or just before in-memory state is synchronized with the data store.

Persistence by reachability guarantees referential integrity; any object graph can be completely re-created by loading the persistent root object. An application may walk the object graph from association to association without worrying about the persistent state of the instances. (SQL databases have a different approach to referential integrity, relying on foreign key and other constraints to detect a misbehaving application.)

In the purest form of persistence by reachability, the database has some top-level, or root, object from which all persistent objects are reachable. Ideally, an instance should become transient and be deleted from the database if it isn’t reachable via references from the root persistent object.

Neither Hibernate nor other ORM solutions implement this form; there is no analog of the root persistent object in an SQL database and no persistent garbage collector that can detect unreferenced instances. Object-oriented data stores might implement a garbage-collection algorithm similar to the one implemented for in-memory objects by the JVM, but this option isn’t available in the ORM world; scanning all tables for unreferenced rows won’t perform acceptably.

So, persistence by reachability is at best a halfway solution. It helps you make transient objects persistent and propagate their state to the database without many calls to the persistence manager. But (at least, in the context of SQL databases and ORM) it isn’t a full solution to the problem of making persistent objects transient and removing their state from the database. This turns out to be a much more difficult problem. You can’t simply remove all reachable instances when you remove an object; other persistent instances may hold references to them (remember that entities can be shared). You can’t even safely remove instances that aren’t referenced by any persistent object in memory; the instances in memory are only a small subset of all objects represented in the database. Let’s look at Hibernate’s more flexible transitive persistence model.
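The recursive nature of persistence by reachability can be sketched in plain Java. The example below is an illustration under assumptions, not code from any persistence layer: Node and Reachability are invented names, and the parent/child layout of the five categories is our reading of the object diagram, with navigation from parent to children only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    public static void main(String[] args) {
        Node electronics = new Node("Electronics");
        Node cellPhones = new Node("Cell Phones");
        Node computer = new Node("Computer");
        Node desktops = new Node("Desktop PCs");
        Node monitors = new Node("Monitors");
        electronics.children.add(cellPhones);
        electronics.children.add(computer);
        computer.children.add(desktops);
        computer.children.add(monitors);

        // "Computer" is the persistent instance; everything reachable from it
        // would become persistent as well.
        Set<Node> persistent = Reachability.from(computer);
        System.out.println(persistent.size());                // 3
        System.out.println(persistent.contains(monitors));    // true
        System.out.println(persistent.contains(electronics)); // false
    }
}

// Hypothetical category node that can only navigate to its children.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class Reachability {
    // Collects the start instance and all instances reachable from it.
    static Set<Node> from(Node start) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        collect(start, visited);
        return visited;
    }

    private static void collect(Node node, Set<Node> visited) {
        if (!visited.add(node)) {
            return; // already seen; also guards against cycles
        }
        for (Node child : node.children) {
            collect(child, visited);
        }
    }
}
```

The reverse direction, deciding which instances are no longer reachable and may be deleted, has no equally simple counterpart, because it would require knowledge of every reference stored in the database.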

4.3.2 Cascading persistence with Hibernate

Hibernate’s transitive persistence model uses the same basic concept as persistence by reachability: object associations are examined to determine transitive state. However, Hibernate allows you to specify a cascade style for each association mapping, which offers more flexibility and fine-grained control for all state transitions. Hibernate reads the declared style and cascades operations to associated objects automatically.

By default, Hibernate does not navigate an association when searching for transient or detached objects, so saving, deleting, or reattaching a Category won’t affect the child category objects. This is the opposite of the persistence-by-reachability default behavior. If, for a particular association, you wish to enable transitive persistence, you must override this default in the mapping metadata.

You can map entity associations in metadata with the following attributes:

■ cascade="none", the default, tells Hibernate to ignore the association.

■ cascade="save-update" tells Hibernate to navigate the association when the transaction is committed and when an object is passed to save() or update() and save newly instantiated transient instances and persist changes to detached instances.

■ cascade="delete" tells Hibernate to navigate the association and delete persistent instances when an object is passed to delete().

■ cascade="all" means to cascade both save-update and delete, as well as calls to evict and lock.

■ cascade="all-delete-orphan" means the same as cascade="all" but, in addition, Hibernate deletes any persistent entity instance that has been removed (dereferenced) from the association (for example, from a collection).

■ cascade="delete-orphan" means Hibernate will delete any persistent entity instance that has been removed (dereferenced) from the association (for example, from a collection).

This association-level cascade style model is both richer and less safe than persistence by reachability. Hibernate doesn’t make the same strong guarantees of referential integrity that persistence by reachability provides. Instead, Hibernate partially delegates referential integrity concerns to the foreign key constraints of the underlying relational database. Of course, there is a good reason for this design decision: It allows Hibernate applications to use detached objects efficiently, because you can control reattachment of a detached object graph at the association level.
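The effect of a per-association cascade style can be pictured with a small toy model. Everything in the following sketch is hypothetical (the Cascade enum, Cat, and ToySaver are invented for the illustration and cover only two of the styles); it simply shows that the save operation follows the association only when the style asks for it. For brevity, the traversal assumes a tree without cycles.

```java
import java.util.ArrayList;
import java.util.List;

class CascadeDemo {
    public static void main(String[] args) {
        Cat computer = new Cat("Computer");
        computer.children.add(new Cat("Laptops"));

        // With no cascading, only the instance passed in is saved.
        System.out.println(ToySaver.save(computer, Cascade.NONE));        // [Computer]
        // With save-update cascading, the children association is navigated.
        System.out.println(ToySaver.save(computer, Cascade.SAVE_UPDATE)); // [Computer, Laptops]
    }
}

// Hypothetical cascade styles for the single "children" association.
enum Cascade { NONE, SAVE_UPDATE }

// Hypothetical category with a one-to-many association to its children.
class Cat {
    final String name;
    final List<Cat> children = new ArrayList<>();
    Cat(String name) { this.name = name; }
}

class ToySaver {
    // Returns the names of the instances that would be saved.
    static List<String> save(Cat root, Cascade childrenStyle) {
        List<String> saved = new ArrayList<>();
        visit(root, childrenStyle, saved);
        return saved;
    }

    private static void visit(Cat node, Cascade style, List<String> saved) {
        saved.add(node.name);
        if (style == Cascade.SAVE_UPDATE) {
            for (Cat child : node.children) {
                visit(child, style, saved);
            }
        }
    }
}
```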


Let’s elaborate on the cascading concept with some example association mappings. We recommend that you read the next section in one sitting, because each example builds on the previous one. Our first example is straightforward; it lets you save newly added categories efficiently.

4.3.3 Managing auction categories

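System administrators can create new categories, rename categories, and move subcategories around in the category hierarchy. This structure can be seen in figure 4.3.

Figure 4.3 Category class with association to itself (Category has a name : String property and a 0..* association back to Category)

Now, we map this class and the association: the parentCategory end is mapped with cascade="none", and the childCategories collection is mapped with cascade="save-update".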

This is a recursive, bidirectional, one-to-many association, as briefly discussed in chapter 3. The one-valued end is mapped with the <many-to-one> element and the Set typed property with the <set> element. Both refer to the same foreign key column: PARENT_CATEGORY_ID.

Suppose we create a new Category as a child category of “Computer” (see figure 4.4). We have several ways to create this new “Laptops” object and save it in the database. We could go back to the database and retrieve the “Computer” category to which our new “Laptops” category will belong, add the new category, and commit the transaction:

Figure 4.4 Adding a new Category to the object graph (nodes shown: Electronics, Cell Phones, Computer, Desktop PCs, Monitors, and the new Laptops category)

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    Category computer = (Category) session.get(Category.class, computerId);
    Category laptops = new Category("Laptops");
    computer.getChildCategories().add(laptops);
    laptops.setParentCategory(computer);

    tx.commit();
    session.close();

The computer instance is persistent (attached to a session), and the childCategories association has cascade save enabled. Hence, this code results in the new laptops category becoming persistent when tx.commit() is called, because Hibernate cascades the dirty-checking operation to the children of computer. Hibernate executes an INSERT statement.

Let’s do the same thing again, but this time create the link between “Computer” and “Laptops” outside of any transaction (in a real application, it’s useful to manipulate an object graph in a presentation tier, for example, before passing the graph back to the persistence layer to make the changes persistent):

    Category computer = ... // Loaded in a previous session
    Category laptops = new Category("Laptops");
    computer.getChildCategories().add(laptops);
    laptops.setParentCategory(computer);


The detached computer object and any other detached objects it refers to are now associated with the new transient laptops object (and vice versa). We make this change to the object graph persistent by saving the new object in a second Hibernate session:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    // Persist one new category and the link to its parent category
    session.save(laptops);

    tx.commit();
    session.close();

Hibernate will inspect the database identifier property of the parent category of laptops and correctly create the relationship to the “Computer” category in the database. Hibernate inserts the identifier value of the parent into the foreign key field of the new “Laptops” row in CATEGORY.

Since cascade="none" is defined for the parentCategory association, Hibernate ignores changes to any of the other categories in the hierarchy (“Computer”, “Electronics”). It doesn’t cascade the call to save() to entities referred to by this association. If we had enabled cascade="save-update" on the mapping of parentCategory, Hibernate would have had to navigate the whole graph of objects in memory, synchronizing all instances with the database. This process would perform badly, because a lot of useless data access would be required. In this case, we neither needed nor wanted transitive persistence for the parentCategory association.

Why do we have cascading operations? We could have saved the laptops object, as shown in the previous example, without any cascade mapping being used. Well, consider the following case:

    Category computer = ... // Loaded in a previous Session

    Category laptops = new Category("Laptops");
    Category laptopAccessories = new Category("Laptop Accessories");
    Category laptopTabletPCs = new Category("Tablet PCs");

    laptops.addChildCategory(laptopAccessories);
    laptops.addChildCategory(laptopTabletPCs);

    computer.addChildCategory(laptops);

(Notice that we use the convenience method addChildCategory() to set both ends of the association link in one call, as described in chapter 3.)

It would be undesirable to have to save each of the three new categories individually. Fortunately, because we mapped the childCategories association with cascade="save-update", we don’t need to. The same code we used before to save the single “Laptops” category will save all three new categories in a new session:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    // Persist all three new Category instances
    session.save(laptops);

    tx.commit();
    session.close();
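To make the direction of the cascade concrete, here is a toy model in plain Java. It is a sketch under stated assumptions, not Hibernate’s implementation: ToyCategory and CascadingSession are hypothetical names, “saving” simply assigns an identifier and records a simulated INSERT, and the cascade is hard-wired to follow the children (as with cascade="save-update") but never the parent (as with cascade="none").

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified stand-in for the Category class used in the text. */
class ToyCategory {
    Long id;                                   // null until the instance is "saved"
    final String name;
    ToyCategory parent;                        // models the cascade="none" end
    final Set<ToyCategory> children = new LinkedHashSet<>(); // models cascade="save-update"

    ToyCategory(String name) {
        this.name = name;
    }

    /** Sets both ends of the link, like the convenience method in the text. */
    void addChildCategory(ToyCategory child) {
        child.parent = this;
        children.add(child);
    }
}

/** Hypothetical in-memory session that records simulated INSERTs in order. */
class CascadingSession {
    private long nextId = 1;
    final List<String> inserted = new ArrayList<>();

    void save(ToyCategory category) {
        if (category.id == null) {             // only new instances get an INSERT
            category.id = nextId++;
            inserted.add(category.name);
        }
        // Cascade along the children only; the parent link is never followed.
        for (ToyCategory child : category.children) {
            save(child);
        }
    }
}
```

Calling save() on a “Laptops” instance that has two new children records three simulated INSERTs, while the detached parent it points to is left alone.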

You’re probably wondering why the cascade style is called cascade="save-update" rather than cascade="save". Having just made all three categories persistent previously, suppose we made the following changes to the category hierarchy in a subsequent request (outside of a session and transaction):

    laptops.setName("Laptop Computers");
    laptopAccessories.setName("Accessories & Parts");
    laptopTabletPCs.setName("Tablet Computers");

    Category laptopBags = new Category("Laptop Bags");
    laptops.addChildCategory(laptopBags);

We have added a new category as a child of the “Laptops” category and modified all three existing categories. The following code propagates these changes to the database:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    // Update three old Category instances and insert the new one
    session.update(laptops);

    tx.commit();
    session.close();

Specifying cascade="save-update" on the childCategories association accurately reflects the fact that Hibernate determines what is needed to persist the objects to the database. In this case, it will reattach/update the three detached categories (laptops, laptopAccessories, and laptopTabletPCs) and save the new child category (laptopBags).

Notice that the last code example differs from the previous two session examples only in a single method call. The last example uses update() instead of save() because laptops was already persistent. We can rewrite all the examples to use the saveOrUpdate() method. Then the three code snippets are identical:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    // Let Hibernate decide what's new and what's detached
    session.saveOrUpdate(laptops);

    tx.commit();
    session.close();

The saveOrUpdate() method tells Hibernate to propagate the state of an instance to the database by creating a new database row if the instance is a new transient instance or updating the existing row if the instance is a detached instance. In other words, it does exactly the same thing with the laptops category as cascade="save-update" did with the child categories of laptops.

One final question: How did Hibernate know which children were detached and which were new transient instances?

4.3.4 Distinguishing between transient and detached instances

Since Hibernate doesn’t keep a reference to a detached instance, you have to let Hibernate know how to distinguish between a detached instance like laptops (if it was created in a previous session) and a new transient instance like laptopBags. A range of options is available. Hibernate will assume that an instance is an unsaved transient instance if:

■ The identifier property (if it exists) is null.

■ The version property (if it exists) is null.

■ You supply an unsaved-value in the mapping document for the class, and the value of the identifier property matches.

■ You supply an unsaved-value in the mapping document for the version property, and the value of the version property matches.

■ You supply a Hibernate Interceptor and return Boolean.TRUE from Interceptor.isUnsaved() after checking the instance in your code.

In our domain model, we have used the nullable type java.lang.Long as our identifier property type everywhere. Since we’re using generated, synthetic identifiers, this solves the problem. New instances have a null identifier property value, so Hibernate treats them as transient. Detached instances have a non-null identifier value, so Hibernate treats them properly too.

However, if we had used the primitive type long in our persistent classes, we would have needed to declare unsaved-value="0" on the identifier mapping in all our classes.

The unsaved-value attribute tells Hibernate to treat instances of Category with an identifier value of 0 as newly instantiated transient instances. The default value for the attribute unsaved-value is null; so, since we’ve chosen Long as our identifier property type, we can omit the unsaved-value attribute in our auction application classes (we use the same identifier type everywhere).

UNSAVED ASSIGNED IDENTIFIERS

This approach works nicely for synthetic identifiers, but it breaks down in the case of keys assigned by the application, including composite keys in legacy systems. We discuss this issue in chapter 8, section 8.3.1, “Legacy schemas and composite keys.” Avoid application-assigned (and composite) keys in new applications if possible.
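The identifier-based checks from this section can be sketched in a few lines of plain Java. This is only an illustration of the decision, not Hibernate’s code: the class UnsavedCheck and its methods are hypothetical, the version property and the Interceptor hook are left out, and a null unsavedValue argument stands for the default unsaved-value of null.

```java
/** Toy version of the identifier-based "is this instance unsaved?" decision. */
final class UnsavedCheck {
    private UnsavedCheck() { }

    /**
     * @param identifier   current identifier value (null for an unset Long)
     * @param unsavedValue configured unsaved-value; null models the default
     * @return true if the instance should be treated as new and transient
     */
    static boolean isTransient(Long identifier, Long unsavedValue) {
        return identifier == null || identifier.equals(unsavedValue);
    }

    /** What a saveOrUpdate-style call would do with such an instance. */
    static String saveOrUpdate(Long identifier, Long unsavedValue) {
        return isTransient(identifier, unsavedValue) ? "INSERT" : "UPDATE";
    }
}
```

With a Long identifier and no unsaved-value, a null identifier leads to an INSERT and any other value to an UPDATE. With a primitive long mapped through unsaved-value 0, the call saveOrUpdate(0L, 0L) picks INSERT in the same way.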

You now have the knowledge to optimize your Hibernate application and reduce the number of calls to the persistence manager if you want to save and delete objects. Check the unsaved-value attributes of all your classes and experiment with detached objects to get a feeling for the Hibernate transitive persistence model.

We’ll now switch perspectives and look at another important concept: how to get a graph of persistent objects out of the database (that is, how to load objects).

4.4 Retrieving objects

Retrieving persistent objects from the database is one of the most interesting (and complex) parts of working with Hibernate. Hibernate provides the following ways to get objects out of the database:

■ Navigating the object graph, starting from an already loaded object, by accessing the associated objects through property accessor methods such as aUser.getAddress().getCity(). Hibernate will automatically load (or preload) nodes of the graph while you navigate the graph if the Session is open.

■ Retrieving by identifier, which is the most convenient and performant method when the unique identifier value of an object is known.

■ Using the Hibernate Query Language (HQL), which is a full object-oriented query language.

■ Using the Hibernate Criteria API, which provides a type-safe and object-oriented way to perform queries without the need for string manipulation. This facility includes queries based on an example object.

■ Using native SQL queries, where Hibernate takes care of mapping the JDBC result sets to graphs of persistent objects.

In your Hibernate applications, you’ll use a combination of these techniques. Each retrieval method may use a different fetching strategy, that is, a strategy that defines what part of the persistent object graph should be retrieved. The goal is to find the best retrieval method and fetching strategy for every use case in your application while at the same time minimizing the number of SQL queries for best performance.

We won’t discuss each retrieval method in much detail in this section; instead we’ll focus on the basic fetching strategies and how to tune Hibernate mapping files for best default fetching performance for all methods. Before we look at the fetching strategies, we’ll give an overview of the retrieval methods. (We mention the Hibernate caching system but fully explore it in the next chapter.)

Let’s start with the simplest case, retrieval of an object by giving its identifier value (navigating the object graph should be self-explanatory). You saw a simple retrieval by identifier earlier in this chapter, but there is more to know about it.

4.4.1 Retrieving objects by identifier

The following Hibernate code snippet retrieves a User object from the database:

    User user = (User) session.get(User.class, userID);

The get() method is special because the identifier uniquely identifies a single instance of a class. Hence it’s common for applications to use the identifier as a convenient handle to a persistent object. Retrieval by identifier can use the cache when retrieving an object, avoiding a database hit if the object is already cached. Hibernate also provides a load() method:

    User user = (User) session.load(User.class, userID);

The load() method is older; get() was added to Hibernate’s API due to user request. The difference is trivial:

■ If load() can’t find the object in the cache or database, an exception is thrown. The load() method never returns null. The get() method returns null if the object can’t be found.

■ The load() method may return a proxy instead of a real persistent instance. A proxy is a placeholder that triggers the loading of the real object when it’s accessed for the first time; we discuss proxies later in this section. On the other hand, get() never returns a proxy.
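The two differences listed above can be imitated with an in-memory map. The following is a hedged sketch in plain Java, not Hibernate’s API: ToyUserStore and UserProxy are hypothetical names, the map stands in for the database, and NoSuchElementException stands in for whatever exception Hibernate actually throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/** Placeholder that looks up the real row only on first access to its state. */
class UserProxy {
    private final Map<Long, String> table;
    private final Long id;
    private String name;            // loaded state, unset until initialized
    private boolean initialized;

    UserProxy(Map<Long, String> table, Long id) {
        this.table = table;
        this.id = id;
    }

    /** The identifier is known without touching the "database". */
    Long getId() {
        return id;
    }

    /** First call triggers the lookup; a missing row surfaces only here. */
    String getName() {
        if (!initialized) {
            if (!table.containsKey(id)) {
                throw new NoSuchElementException("No row with id " + id);
            }
            name = table.get(id);
            initialized = true;
        }
        return name;
    }
}

/** Hypothetical in-memory lookup that mimics the two retrieval styles. */
class ToyUserStore {
    final Map<Long, String> table = new HashMap<>();

    /** Like get(): looks up at once and returns null if nothing is found. */
    String get(Long id) {
        return table.get(id);
    }

    /** Like load(): never returns null; the lookup is deferred to the proxy. */
    UserProxy load(Long id) {
        return new UserProxy(table, id);
    }
}
```

In this model, load() for an unknown identifier still hands back a usable reference, and the failure appears only when getName() is called on it.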

Choosing between get() and load() is easy: If you’re certain the persistent object exists, and nonexistence would be considered exceptional, load() is a good option. If you aren’t certain there is a persistent instance with the given identifier, use get() and test the return value to see if it’s null. Using load() has a further implication: The application may retrieve a valid reference (a proxy) to a persistent instance without hitting the database to retrieve its persistent state. So load() might not throw an exception when it doesn’t find the persistent object in the cache or database; the exception would be thrown later, when the proxy is accessed.

Of course, retrieving an object by identifier isn’t as flexible as using arbitrary queries.

4.4.2 Introducing HQL

The Hibernate Query Language is an object-oriented dialect of the familiar relational query language SQL. HQL bears close resemblances to ODMG OQL and EJB-QL; but unlike OQL, it’s adapted for use with SQL databases, and it’s much more powerful and elegant than EJB-QL. (However, EJB-QL 3.0 will be very similar to HQL.) HQL is easy to learn with basic knowledge of SQL.

HQL isn’t a data-manipulation language like SQL. It’s used only for object retrieval, not for updating, inserting, or deleting data. Object state synchronization is the job of the persistence manager, not the developer.

Most of the time, you’ll only need to retrieve objects of a particular class and restrict by the properties of that class. For example, the following query retrieves a user by first name:

    Query q = session.createQuery("from User u where u.firstname = :fname");
    q.setString("fname", "Max");
    List result = q.list();

After preparing query q, we bind the first-name value to a named parameter, fname. The result is returned as a List of User objects.

HQL is powerful, and even though you may not use the advanced features all the time, you’ll need them for some difficult problems. For example, HQL supports the following:

■ The ability to apply restrictions to properties of associated objects related by reference or held in collections (to navigate the object graph using query language).

■ The ability to retrieve only properties of an entity or entities, without the overhead of loading the entity itself in a transactional scope. This is sometimes called a report query; it’s more correctly called projection.

■ The ability to order the results of the query.

■ The ability to paginate the results.

■ Aggregation with group by, having, and aggregate functions like sum, min, and max.

■ Outer joins when retrieving multiple objects per row.

■ The ability to call user-defined SQL functions.

■ Subqueries (nested queries).

We discuss all these features in chapter 7, together with the optional native SQL query mechanism.

4.4.3 Query by criteria

The Hibernate query by criteria (QBC) API lets you build a query by manipulating criteria objects at runtime. This approach lets you specify constraints dynamically without direct string manipulations, but it doesn’t lose much of the flexibility or power of HQL. On the other hand, queries expressed as criteria are often less readable than queries expressed in HQL.

Retrieving a user by first name is easy using a Criteria object:

    Criteria criteria = session.createCriteria(User.class);
    criteria.add( Expression.like("firstname", "Max") );
    List result = criteria.list();

A Criteria is a tree of Criterion instances. The Expression class provides static factory methods that return Criterion instances. Once the desired criteria tree is built, it’s executed against the database.

Many developers prefer QBC, considering it a more object-oriented approach. They also like the fact that the query syntax may be parsed and validated at compile time, whereas HQL expressions aren’t parsed until runtime.

The nice thing about the Hibernate Criteria API is the Criterion framework. This framework allows extension by the user, which is difficult in the case of a query language like HQL.
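The idea of a criteria tree built from factory methods, and extended with user-written nodes, can be shown with a small in-memory analogy. This is a sketch only: ToyCriterion, ToyExpression, and ToyCriteria are hypothetical names, rows are plain maps, and the tree is evaluated in memory, whereas Hibernate would translate it to SQL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One node of the tree; hypothetical, evaluated in memory rather than as SQL. */
interface ToyCriterion {
    boolean matches(Map<String, Object> row);
}

/** Static factories in the spirit of the Expression class. */
final class ToyExpression {
    private ToyExpression() { }

    static ToyCriterion eq(String property, Object value) {
        return row -> value.equals(row.get(property));
    }

    static ToyCriterion and(ToyCriterion left, ToyCriterion right) {
        return row -> left.matches(row) && right.matches(row);
    }

    static ToyCriterion or(ToyCriterion left, ToyCriterion right) {
        return row -> left.matches(row) || right.matches(row);
    }
}

/** Collects criteria; a row is returned only if every added criterion matches. */
class ToyCriteria {
    private final List<ToyCriterion> criteria = new ArrayList<>();

    ToyCriteria add(ToyCriterion criterion) {
        criteria.add(criterion);
        return this;
    }

    List<Map<String, Object>> list(List<Map<String, Object>> rows) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            boolean allMatch = true;
            for (ToyCriterion criterion : criteria) {
                if (!criterion.matches(row)) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                result.add(row);
            }
        }
        return result;
    }
}
```

Because ToyCriterion is an interface, user code can supply its own node (for example, a lambda that compares a numeric property) and combine it with the built-in ones, which is the kind of extension the Criterion framework is praised for.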

4.4.4 Query by example

As part of the QBC facility, Hibernate supports query by example (QBE). The idea behind QBE is that the application supplies an instance of the queried class with certain property values set (to nondefault values). The query returns all persistent instances with matching property values. QBE isn’t a particularly powerful approach, but it can be convenient for some applications. The following code snippet demonstrates a Hibernate QBE:

    User exampleUser = new User();
    exampleUser.setFirstname("Max");
    Criteria criteria = session.createCriteria(User.class);
    criteria.add( Example.create(exampleUser) );
    List result = criteria.list();
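The matching rule behind QBE (compare only the properties that were set on the example) can be illustrated without a database. The following plain-Java sketch makes some assumptions: ToyUser and ToyExample are hypothetical, only two String properties exist, and null is taken to mean “not set”.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal stand-in for the User class; null means "property not set". */
class ToyUser {
    final String firstname;
    final String lastname;

    ToyUser(String firstname, String lastname) {
        this.firstname = firstname;
        this.lastname = lastname;
    }
}

final class ToyExample {
    private ToyExample() { }

    /** Returns the candidates that equal the example on every non-null property. */
    static List<ToyUser> findByExample(ToyUser example, List<ToyUser> candidates) {
        List<ToyUser> result = new ArrayList<>();
        for (ToyUser user : candidates) {
            boolean firstOk = example.firstname == null
                    || example.firstname.equals(user.firstname);
            boolean lastOk = example.lastname == null
                    || example.lastname.equals(user.lastname);
            if (firstOk && lastOk) {
                result.add(user);
            }
        }
        return result;
    }
}
```

An example with only the first name set matches every user with that first name; setting more properties narrows the result, which is what makes the approach a natural fit for search forms.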

A typical use case for QBE is a search screen that allows users to specify a range of property values to be matched by the returned result set. This kind of functionality can be difficult to express cleanly in a query language; string manipulations would be required to specify a dynamic set of constraints.

Both the QBC API and the example query mechanism are discussed in more detail in chapter 7.

You now know the basic retrieval options in Hibernate. We focus on the strategies for fetching object graphs in the rest of this section. A fetching strategy defines what part of the object graph (or, what subgraph) is retrieved with a query or load operation.

4.4.5 Fetching strategies

In traditional relational data access, you’d fetch all the data required for a particular computation with a single SQL query, taking advantage of inner and outer joins to retrieve related entities. Some primitive ORM implementations fetch data piecemeal, with many requests for small chunks of data in response to the application’s navigating a graph of persistent objects. This approach doesn’t make efficient use of the relational database’s join capabilities. In fact, this data access strategy scales poorly by nature. One of the most difficult problems in ORM (probably the most difficult) is providing for efficient access to relational data, given an application that prefers to treat the data as a graph of objects.

For the kinds of applications we’ve often worked with (multi-user, distributed, web, and enterprise applications), object retrieval using many round trips to/from the database is unacceptable. Hence we argue that tools should emphasize the R in ORM to a much greater extent than has been traditional.


The problem of fetching object graphs efficiently (with minimal access to the database) has often been addressed by providing association-level fetching strategies specified in metadata of the association mapping. The trouble with this approach is that each piece of code that uses an entity requires a different set of associated objects, so a single strategy fixed in metadata isn’t enough. We argue that what is needed is support for fine-grained runtime association fetching strategies. Hibernate supports both: it lets you specify a default fetching strategy in the mapping file and then override it at runtime in code.

Hibernate allows you to choose among four fetching strategies for any association, in association metadata and at runtime:

■ Immediate fetching: The associated object is fetched immediately, using a sequential database read (or cache lookup).

■ Lazy fetching: The associated object or collection is fetched “lazily,” when it’s first accessed. This results in a new request to the database (unless the associated object is cached).

■ Eager fetching: The associated object or collection is fetched together with the owning object, using an SQL outer join, and no further database request is required.

■ Batch fetching: This approach may be used to improve the performance of lazy fetching by retrieving a batch of objects or collections when a lazy association is accessed. (Batch fetching may also be used to improve the performance of immediate fetching.)
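A rough way to compare the four strategies is to count database round trips. The following back-of-the-envelope sketch is plain Java with hypothetical names (FetchCost and its methods), and it rests on simplifying assumptions: the owning objects are loaded by one query, every owner’s association is eventually accessed, and nothing is found in a cache.

```java
/** Rough query counts for loading some owners plus one association of each. */
final class FetchCost {
    private FetchCost() { }

    /**
     * Immediate or lazy fetching: one query for the owners, then one more
     * query per owner whose association is read.
     */
    static int selectPerAssociation(int owners) {
        return 1 + owners;
    }

    /** Eager fetching: owners and associated rows arrive in one outer-join query. */
    static int outerJoin(int owners) {
        return 1;
    }

    /** Batch fetching: associations are loaded in groups of at most batchSize. */
    static int batched(int owners, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // 1 query for the owners + ceil(owners / batchSize) batch queries
        return 1 + (owners + batchSize - 1) / batchSize;
    }
}
```

Under these assumptions, 11 owners cost 12 queries with per-association selects, 3 queries with a batch size of 9, and 1 query with an outer join. Lazy fetching only comes out ahead of these numbers when some associations are never accessed at all.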

Let’s look more closely at each fetching strategy.

Immediate fetching

Immediate association fetching occurs when you retrieve an entity from the database and then immediately retrieve another associated entity or entities in a further request to the database or cache. Immediate fetching isn’t usually an efficient fetching strategy unless you expect the associated entities to almost always be cached already.

Lazy fetching

When a client requests an entity and its associated graph of objects from the database, it isn’t usually necessary to retrieve the whole graph of every (indirectly) associated object. You wouldn’t want to load the whole database into memory at once; for example, loading a single Category shouldn’t trigger the loading of all Items in that category.

Lazy fetching lets you decide how much of the object graph is loaded in the first database hit and which associations should be loaded only when they’re first accessed. Lazy fetching is a foundational concept in object persistence and the first step to attaining acceptable performance.

We recommend that, to start with, all associations be configured for lazy (or perhaps batched lazy) fetching in the mapping file. This strategy may then be overridden at runtime by queries that force eager fetching to occur.

Eager (outer join) fetching

Lazy association fetching can help reduce database load and is often a good default strategy. However, it’s a bit like a blind guess as far as performance optimization goes.

Eager fetching lets you explicitly specify which associated objects should be loaded together with the referencing object. Hibernate can then return the associated objects in a single database request, utilizing an SQL OUTER JOIN. Performance optimization in Hibernate often involves judicious use of eager fetching for particular transactions. Hence, even though default eager fetching may be declared in the mapping file, it’s more common to specify the use of this strategy at runtime for a particular HQL or criteria query.

Batch fetching

Batch fetching isn’t strictly an association fetching strategy; it’s a technique that may help improve the performance of lazy (or immediate) fetching. Usually, when you load an object or collection, your SQL WHERE clause specifies the identifier of the object or of the object that owns the collection. If batch fetching is enabled, Hibernate looks to see what other proxied instances or uninitialized collections are referenced in the current session and tries to load them at the same time by specifying multiple identifier values in the WHERE clause.

We aren’t great fans of this approach; eager fetching is almost always faster. Batch fetching is useful for inexperienced users who wish to achieve acceptable performance in Hibernate without having to think too hard about the SQL that will be executed. (Note that batch fetching may be familiar to you, since it’s used by many EJB2 engines.)

We’ll now declare the fetching strategy for some associations in our mapping metadata.
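The “multiple identifier values in the WHERE clause” idea behind batch fetching can be sketched as follows. This is illustrative plain Java, not the SQL Hibernate generates: BatchSql is a hypothetical helper, the ITEM table and ITEM_ID column are assumed names, and the identifiers are written inline only for readability (real code would use bind parameters). The sketch also renders every batch up front, whereas a batch would normally be loaded only when one of its members is first accessed.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSql {
    private BatchSql() { }

    /**
     * Splits the identifiers of not-yet-loaded instances into groups of at most
     * batchSize and renders one illustrative SELECT per group.
     */
    static List<String> selects(List<Long> pendingIds, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> statements = new ArrayList<>();
        for (int start = 0; start < pendingIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, pendingIds.size());
            StringBuilder in = new StringBuilder();
            for (Long id : pendingIds.subList(start, end)) {
                if (in.length() > 0) {
                    in.append(", ");
                }
                in.append(id);
            }
            // Hypothetical table and column names
            statements.add("select * from ITEM where ITEM_ID in (" + in + ")");
        }
        return statements;
    }
}
```

With 11 pending identifiers and a batch size of 9, the helper produces two statements: one listing nine identifiers and one listing the remaining two.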

4.4.6 Selecting a fetching strategy in mappings

Hibernate lets you select default association fetching strategies by specifying attributes in the mapping metadata. You can override the default strategy using features of Hibernate’s query methods, as you’ll see in chapter 7. A minor caveat: You don’t have to understand every option presented in this section immediately; we recommend that you get an overview first and use this section as a reference when you’re optimizing the default fetching strategies in your application.

A wrinkle in Hibernate’s mapping format means that collection mappings function slightly differently than single-point associations; so, we’ll cover the two cases separately. Let’s first consider both ends of the bidirectional association between Bid and Item.

Single point associations

For a <many-to-one> or <one-to-one> association, lazy fetching is possible only if the associated class mapping enables proxying. For the Item class, we enable proxying by specifying lazy="true" on its class mapping.

Now, remember the association from Bid to Item. When we retrieve a Bid from the database, the association property may hold an instance of a Hibernate-generated subclass of Item that delegates all method invocations to a different instance of Item that is fetched lazily from the database (this is the more elaborate definition of a Hibernate proxy).

Hibernate uses two different instances so that even polymorphic associations can be proxied: when the proxied object is fetched, it may be an instance of a mapped subclass of Item (if there were any subclasses of Item, that is). We can even choose any interface implemented by the Item class as the type of the proxy. To do so, we declare it using the proxy attribute on the class mapping, instead of specifying lazy="true".

As soon as we declare the proxy or lazy attribute on Item, any single-point association to Item is proxied and fetched lazily, unless that association overrides the fetching strategy by declaring the outer-join attribute. There are three possible values for outer-join:

■ outer-join="auto": The default. When the attribute isn’t specified, Hibernate fetches the associated object lazily if the associated class has proxying enabled, or eagerly using an outer join if proxying is disabled (the default).

■ outer-join="true": Hibernate always fetches the association eagerly using an outer join, even if proxying is enabled. This allows you to choose different fetching strategies for different associations to the same proxied class.

■ outer-join="false": Hibernate never fetches the association using an outer join, even if proxying is disabled. This is useful if you expect the associated object to exist in the second-level cache (see chapter 5). If it isn’t available in the second-level cache, the object is fetched immediately using an extra SQL SELECT.

So, if we wanted to reenable eager fetching for the association, now that proxying is enabled, we would specify outer-join="true" on the association mapping.

For a one-to-one association (discussed in more detail in chapter 6), lazy fetching is conceptually possible only when the associated object always exists. We indicate this by specifying constrained="true". For example, if an item can have only one bid, the mapping for the Bid would declare its one-to-one association to Item with constrained="true".

The constrained attribute has a slightly similar interpretation to the not-null attribute of a <many-to-one> mapping. It tells Hibernate that the associated object is required and thus cannot be null.

To enable batch fetching, we specify the batch-size attribute in the mapping for Item. The batch size limits the number of items that may be retrieved in a single batch. Choose a reasonably small number here.

You’ll meet the same attributes (outer-join, batch-size, and lazy) when we consider collections, but the interpretation is slightly different.

Collections

In the case of collections, fetching strategies apply not just to entity associations, but also to collections of values (for example, a collection of strings could be fetched by outer join).

Just like classes, collections have their own proxies, which we usually call collection wrappers. Unlike classes, the collection wrapper is always there, even if lazy fetching is disabled (Hibernate needs the wrapper to detect collection modifications).

Collection mappings may declare a lazy attribute, an outer-join attribute, neither, or both (specifying both isn’t meaningful). The meaningful options are as follows:

■ Neither attribute specified: This option is equivalent to outer-join="false" lazy="false". The collection is fetched from the second-level cache or by an immediate extra SQL SELECT. This option is the default and is most useful when the second-level cache is enabled for this collection.

■ outer-join="true": Hibernate fetches the association eagerly using an outer join. At the time of this writing, Hibernate is able to fetch only one collection per SQL SELECT, so it isn’t possible to declare multiple collections belonging to the same persistent class with outer-join="true".

■ lazy="true": Hibernate fetches the collection lazily, when it’s first accessed.

We don’t recommend eager fetching for collections, so we’ll map the item’s collection of bids with lazy="true". This option is almost always used for collection mappings (it should be the default, and we recommend that you consider it as a default for all your collection mappings).



We can even enable batch fetching for the collection. In this case, the batch size doesn’t refer to the number of bids in the batch; it refers to the number of collections of bids. Suppose we set batch-size="9" on the mapping of the bids collection.



This mapping tells Hibernate to load up to nine collections of bids in one batch, depending on how many uninitialized collections of bids are currently present in the items associated with the session. In other words, if there are five Item instances with persistent state in a Session, and all have an uninitialized bids collection, Hibernate will automatically load all five collections in a single SQL query if one is accessed. If there are 11 items, only 9 collections will be fetched. Batch fetching can significantly reduce the number of queries required for hierarchies of objects (for example, when loading the tree of parent and child Category objects).

Let’s talk about a special case: many-to-many associations (we discuss this mapping in more detail in chapter 6). You usually use a link table (some developers also call it relationship table or association table) that holds only the key values of the two associated tables and therefore allows a many-to-many multiplicity. This additional table has to be considered if you decide to use eager fetching. Consider a straightforward many-to-many example, which maps the association from Category to Item as a collection with outer-join="true" and CATEGORY_ITEM as the link table.



In this case, the eager fetching strategy refers only to the association table CATEGORY_ITEM. If we load a Category with this fetching strategy, Hibernate will automatically fetch all link entries from CATEGORY_ITEM in a single outer join SQL query, but not the item instances from ITEM! The entities contained in the many-to-many association can of course also be fetched eagerly with the same SQL query. The <many-to-many> element allows this behavior to be customized:



Hibernate will now fetch all Items in a Category with a single outer join query when the Category is loaded. However, keep in mind that we usually recommend lazy loading as the default fetching strategy and that Hibernate is limited to one eagerly fetched collection per mapped persistent class.

Setting the fetch depth

We'll now discuss a global fetching strategy setting: the maximum fetch depth. This setting controls the number of outer-joined tables Hibernate will use in a single SQL query. Consider the complete association chain from Category to Item, and from Item to Bid. The first is a many-to-many association and the second is a one-to-many; hence both associations are mapped with collection elements. If we declare outer-join="true" for both associations (don't forget the special <many-to-many> declaration) and load a single Category, how many queries will Hibernate execute? Will only the Items be eagerly fetched, or also all the Bids of each Item?


You probably expect a single query, with an outer join operation including the CATEGORY, CATEGORY_ITEM, ITEM, and BID tables. However, this isn't the case by default.

Hibernate's outer join fetch behavior is controlled with the global configuration option hibernate.max_fetch_depth. If you set this to 1 (also the default), Hibernate will fetch only the Category and the link entries from the CATEGORY_ITEM association table. If you set it to 2, Hibernate executes an outer join that also includes the Items in the same SQL query. Setting this option to 3 joins all four tables in one SQL statement and also loads all Bids.

Recommended values for the fetch depth depend on the join performance and the size of the database tables; test your applications with low values (less than 4) first, and decrease or increase the number while tuning your application. The global maximum fetch depth also applies to single-ended associations (<many-to-one>, <one-to-one>) mapped with an eager fetching strategy.

Keep in mind that eager fetching strategies declared in the mapping metadata are effective only if you use retrieval by identifier, use the criteria query API, or navigate through the object graph manually. Any HQL query may specify its own fetching strategy at runtime, thus ignoring the mapping defaults. You can also override the defaults (that is, not ignore them) with criteria queries. This is an important difference, and we cover it in more detail in chapter 7, section 7.3.2, "Fetching associations."

However, you may sometimes simply like to initialize a proxy or a collection wrapper manually with a simple API call.

Initializing lazy associations

A proxy or collection wrapper is automatically initialized when any of its methods are invoked (except the identifier property getter, which may return the identifier value without fetching the underlying persistent object). However, it's only possible to initialize a proxy or collection wrapper if it's currently associated with an open Session. If you close the session and try to access an uninitialized proxy or collection, Hibernate throws a runtime exception.

Because of this behavior, it's sometimes useful to explicitly initialize an object before closing the session. This approach isn't as flexible as retrieving the complete required object subgraph with an HQL query, using arbitrary fetching strategies at runtime.

We use the static method Hibernate.initialize() for manual initialization:

    Session session = sessions.openSession();
    Transaction tx = session.beginTransaction();

    Category cat = (Category) session.get(Category.class, id);
    Hibernate.initialize( cat.getItems() );

    tx.commit();
    session.close();

    Iterator iter = cat.getItems().iterator();
    ...

Hibernate.initialize() may be passed a collection wrapper, as in this example, or a proxy. You may also, in similar rare cases, check the current state of a property by calling Hibernate.isInitialized(). (Note that initialize() doesn't cascade to any associated objects.)

Another solution for this problem is to keep the session open until the application thread finishes, so you can navigate the object graph whenever you like and have Hibernate automatically initialize all lazy references. This is a problem of application design and transaction demarcation; we discuss it again in chapter 8, section 8.1, "Designing layered applications." However, your first choice should be to fetch the complete required graph in the first place, using HQL or criteria queries, with a sensible and optimized default fetching strategy in the mapping metadata for all other cases.
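To make the rule about open sessions concrete, here is a minimal sketch in plain Java. It is a toy model, not Hibernate's implementation: ToySession and LazyList are hypothetical stand-ins for a Session and a collection wrapper, and the loader function stands in for the SQL SELECT. It only illustrates why initializing before the session closes matters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Toy model of a session-bound lazy collection; NOT Hibernate's implementation. */
class LazyInitSketch {

    /** Hypothetical stand-in for a Session: it only tracks whether it is open. */
    static class ToySession {
        private boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    /** Hypothetical collection wrapper that loads its elements on first access. */
    static class LazyList<T> {
        private final ToySession session;
        private final Supplier<List<T>> loader; // stands in for the SQL SELECT
        private List<T> elements;               // null until initialized

        LazyList(ToySession session, Supplier<List<T>> loader) {
            this.session = session;
            this.loader = loader;
        }

        boolean isInitialized() { return elements != null; }

        /** Loads the elements if needed; fails when the session is already closed. */
        void initialize() {
            if (elements != null) {
                return;
            }
            if (!session.isOpen()) {
                throw new IllegalStateException("session closed, cannot initialize");
            }
            elements = new ArrayList<>(loader.get());
        }

        /** Any access triggers initialization first. */
        List<T> get() {
            initialize();
            return elements;
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        LazyList<String> bids = new LazyList<>(session, () -> List.of("bid-1", "bid-2"));
        bids.initialize();                 // force loading while the session is open
        session.close();
        System.out.println(bids.get());    // still usable after the session is closed

        ToySession other = new ToySession();
        LazyList<String> items = new LazyList<>(other, () -> List.of("item-1"));
        other.close();
        try {
            items.get();                   // uninitialized, and the session is closed
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

The second half of main shows the failure mode the text warns about: the wrapper was never touched while its session was open, so the first access after close() has nothing to load from.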

4.4.7 Tuning object retrieval

Let's look at the steps involved when you're tuning the object retrieval operations in your application:

1. Enable the Hibernate SQL log, as described in chapter 2. You should also be prepared to read, understand, and evaluate SQL queries and their performance characteristics for your specific relational model: Will a single join operation be faster than two selects? Are all the indexes used properly, and what is the cache hit ratio inside the database? Get your DBA to help you with the performance evaluation; only she will have the knowledge to decide which SQL execution plan is the best.

2. Step through your application use case by use case and note how many and what SQL statements Hibernate executes. A use case can be a single screen in your web application or a sequence of user dialogs. This step also involves collecting the object-retrieval methods you use in each use case: walking the graph, retrieval by identifier, HQL, and criteria queries. Your goal is to bring down the number (and complexity) of SQL queries for each use case by tuning the default fetching strategies in metadata.

3. You may encounter two common issues:

   ■ If the SQL statements use join operations that are too complex and slow, set outer-join to false for associations (this is enabled by default). Also try to tune with the global hibernate.max_fetch_depth configuration option, but keep in mind that this is best left at a value between 1 and 4.

   ■ If too many SQL statements are executed, use lazy="true" for all collection mappings; by default, Hibernate will execute an immediate additional fetch for the collection elements (which, if they're entities, can cascade further into the graph). In rare cases, if you're sure, enable outer-join="true" and disable lazy loading for particular collections. Keep in mind that only one collection property per persistent class may be fetched eagerly. Use batch fetching with values between 3 and 10 to further optimize collection fetching if the given unit of work involves several "of the same" collections or if you're accessing a tree of parent and child objects.

4. After you set a new fetching strategy, rerun the use case and check the generated SQL again. Note the SQL statements, and go to the next use case.

5. After you optimize all use cases, check every one again and see if any optimizations had side effects for others. With some experience, you'll be able to avoid any negative effects and get it right the first time.
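When weighing a batch size in step 3, it can help to estimate how many SELECTs a use case would need. The following is only a back-of-the-envelope sketch under a simplifying assumption: each access to an uninitialized collection loads up to batchSize collections in one query, as in the earlier example with five and eleven items. The class and method names are hypothetical, and the real number of statements should always be read from the SQL log.

```java
/** Simplified arithmetic for batch fetching; a model, not Hibernate's loader. */
class BatchFetchSketch {

    /** Collections loaded by one query when a single uninitialized collection is accessed. */
    static int loadedPerAccess(int uninitialized, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        return Math.min(uninitialized, batchSize);
    }

    /** Queries needed if every collection is eventually accessed. */
    static int queriesToLoadAll(int collections, int batchSize) {
        int queries = 0;
        int remaining = collections;
        while (remaining > 0) {
            remaining -= loadedPerAccess(remaining, batchSize);
            queries++;
        }
        return queries;
    }

    public static void main(String[] args) {
        System.out.println(loadedPerAccess(5, 9));    // 5: all five load in one query
        System.out.println(loadedPerAccess(11, 9));   // 9: two collections stay uninitialized
        System.out.println(queriesToLoadAll(11, 9));  // 2
        System.out.println(queriesToLoadAll(11, 1));  // 11: one query per collection
    }
}
```

Under this model, the saving comes from replacing one query per collection with one query per batch, which is why the benefit shows up only when a unit of work touches several collections of the same kind.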

This optimization technique isn't only practical for the default fetching strategies; you can also use it to tune HQL and criteria queries, which can ignore and override the default fetching for specific use cases and units of work. We discuss runtime fetching in chapter 7.

In this section, we've started to think about performance issues, especially issues related to association fetching. Of course, the quickest way to fetch a graph of objects is to fetch it from the cache in memory, as shown in the next chapter.

4.5 Summary

The dynamic aspects of the object/relational mismatch are just as important as the better known and better understood structural mismatch problems. In this chapter, we were primarily concerned with the lifecycle of objects with respect to the persistence mechanism. Now you understand the three object states defined by Hibernate: persistent, detached, and transient. Objects transition between states when you invoke methods of the Session interface or create and remove references from a graph of already persistent instances. This latter behavior is governed by the configurable cascade styles, Hibernate's model for transitive persistence. This model lets you declare the cascading of operations (such as saving or deletion) on an association basis, which is more powerful and flexible than the traditional persistence by reachability model. Your goal is to find the best cascading style for each association and therefore minimize the number of persistence manager calls you have to make when storing objects.

Retrieving objects from the database is equally important: You can walk the graph of domain objects by accessing properties and let Hibernate transparently fetch objects. You can also load objects by identifier, write arbitrary queries in the HQL, or create an object-oriented representation of your query using the query by criteria API. In addition, you can use native SQL queries in special cases.

Most of these object-retrieval methods use the default fetching strategies we defined in mapping metadata (HQL ignores them; criteria queries can override them). The correct fetching strategy minimizes the number of SQL statements that have to be executed by lazily, eagerly, or batch-fetching objects. You optimize your Hibernate application by analyzing the SQL executed in each use case and tuning the default and runtime fetching strategies.

Next we explore the closely related topics of transactions and caching.

Transactions, concurrency, and caching

This chapter covers

■ Database transactions and locking
■ Long-running application transactions
■ The Hibernate first- and second-level caches
■ The caching system in practice with CaveatEmptor


Now that you understand the basics of object/relational mapping with Hibernate, let's take a closer look at one of the core issues in database application design: transaction management. In this chapter, we examine how you use Hibernate to manage transactions, how concurrency is handled, and how caching is related to both aspects. Let's look at our example application.

Some application functionality requires that several different things be done together. For example, when an auction finishes, our CaveatEmptor application has to perform four different tasks:

1. Mark the winning (highest amount) bid.
2. Charge the seller the cost of the auction.
3. Charge the successful bidder the price of the winning bid.
4. Notify the seller and the successful bidder.

What happens if we can't bill the auction costs because of a failure in the external credit card system? Our business requirements might state that either all listed actions must succeed or none must succeed. If so, we call these steps collectively a transaction or unit of work. If only one step fails, the whole unit of work must fail. We say that the transaction is atomic: Several operations are grouped together as a single indivisible unit.

Furthermore, transactions allow multiple users to work concurrently with the same data without compromising the integrity and correctness of the data; a particular transaction shouldn't be visible to and shouldn't influence other concurrently running transactions. Several different strategies are used to implement this behavior, which is called isolation. We'll explore them in this chapter.

Transactions are also said to exhibit consistency and durability. Consistency means that any transaction works with a consistent set of data and leaves the data in a consistent state when the transaction completes. Durability guarantees that once a transaction completes, all changes made during that transaction become persistent and aren't lost even if the system subsequently fails. Atomicity, consistency, isolation, and durability are together known as the ACID criteria.

We begin this chapter with a discussion of system-level database transactions, where the database guarantees ACID behavior. We'll look at the JDBC and JTA APIs and see how Hibernate, working as a client of these APIs, is used to control database transactions.

In an online application, database transactions must have extremely short lifespans. A database transaction should span a single batch of database operations, interleaved with business logic. It should certainly not span interaction with the user. We'll augment your understanding of transactions with the notion of a long-running application transaction, where database operations occur in several batches, alternating with user interaction. There are several ways to implement application transactions in Hibernate applications, all of which are discussed in this chapter.

Finally, the subject of caching is much more closely related to transactions than it might appear at first sight. In the second half of this chapter, armed with an understanding of transactions, we explore Hibernate's sophisticated cache architecture. You'll learn which data is a good candidate for caching and how to handle concurrency of the cache. We'll then enable caching in the CaveatEmptor application.

Let's begin with the basics and see how transactions work at the lowest level, the database.

5.1 Understanding database transactions

Databases implement the notion of a unit of work as a database transaction (sometimes called a system transaction). A database transaction groups data-access operations. A transaction is guaranteed to end in one of two ways: it's either committed or rolled back. Hence, database transactions are always truly atomic. In figure 5.1, you can see this graphically.

If several database operations should be executed inside a transaction, you must mark the boundaries of the unit of work. You must start the transaction and, at some point, commit the changes. If an error occurs (either while executing operations or when committing the changes), you have to roll back the transaction to leave the data in a consistent state. This is known as transaction demarcation, and (depending on the API you use) it involves more or less manual intervention.

Figure 5.1 System states during a transaction (Initial State, begin, Transaction; commit leads to Transaction Succeeded, rollback to Transaction Failed)


You may already have experience with two transaction-handling programming interfaces: the JDBC API and the JTA.

5.1.1 JDBC and JTA transactions

In a non-managed environment, the JDBC API is used to mark transaction boundaries. You begin a transaction by calling setAutoCommit(false) on a JDBC connection and end it by calling commit(). You may, at any time, force an immediate rollback by calling rollback(). (Easy, huh?)

FAQ What auto commit mode should you use? A magical setting that is often a source of confusion is the JDBC connection's auto commit mode. If a database connection is in auto commit mode, the database transaction will be committed immediately after each SQL statement, and a new transaction will be started. This can be useful for ad hoc database queries and ad hoc data updates.

Auto commit mode is almost always inappropriate in an application, however. An application doesn't perform ad hoc or any unplanned queries; instead, it executes a preplanned sequence of related operations (which are, by definition, never ad hoc). Therefore, Hibernate automatically disables auto commit mode as soon as it fetches a connection (from a connection provider, that is, a connection pool). If you supply your own connection when you open the Session, it's your responsibility to turn off auto commit!

Note that some database systems enable auto commit by default for each new connection, but others don't. You might want to disable auto commit in your global database system configuration to ensure that you never run into any problems. You may then enable auto commit only when you execute ad hoc queries (for example, in your database SQL query tool).
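The three JDBC calls named above (setAutoCommit(false), commit(), and rollback()) fit together as in the following sketch. The Work callback and the recordingConnection() stub are hypothetical helpers introduced only so the example runs without a database; the stub is a dynamic proxy that records method names instead of talking to a real driver. In a real application the Connection would come from a driver or a connection pool.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Sketch of manual JDBC transaction demarcation. */
class JdbcDemarcationSketch {

    /** Hypothetical callback for the statements executed inside the transaction. */
    interface Work {
        void execute(Connection connection) throws SQLException;
    }

    /** Runs the work in one JDBC transaction: commit on success, rollback on failure. */
    static void inTransaction(Connection connection, Work work) throws SQLException {
        connection.setAutoCommit(false);   // begin: leave auto commit mode
        try {
            work.execute(connection);
            connection.commit();           // end the transaction, keeping the changes
        } catch (SQLException | RuntimeException e) {
            connection.rollback();         // end the transaction, discarding the changes
            throw e;
        }
    }

    /** Stub Connection that only records method names (an assumption for this demo). */
    static Connection recordingConnection(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
                JdbcDemarcationSketch.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, methodArgs) -> {
                    calls.add(method.getName());
                    return null;           // every method called above returns void
                });
    }

    public static void main(String[] args) throws SQLException {
        List<String> calls = new ArrayList<>();
        inTransaction(recordingConnection(calls), c -> { /* SQL statements go here */ });
        System.out.println(calls);         // prints [setAutoCommit, commit]
    }
}
```

Note that if rollback() itself throws, this sketch lets that exception replace the original one; production code would usually log both.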

In a system that stores data in multiple databases, a particular unit of work may involve access to more than one data store. In this case, you can't achieve atomicity using JDBC alone. You require a transaction manager with support for distributed transactions (two-phase commit). You communicate with the transaction manager using the JTA.

In a managed environment, JTA is used not only for distributed transactions, but also for declarative container managed transactions (CMT). CMT allows you to avoid explicit transaction demarcation calls in your application source code; rather, transaction demarcation is controlled by a deployment-specific descriptor. This descriptor defines how a transaction context propagates when a single thread passes through several different EJBs.


We aren't interested in the details of direct JDBC or JTA transaction demarcation. You'll be using these APIs only indirectly.

Hibernate communicates with the database via a JDBC Connection; hence it must support both APIs. In a stand-alone (or web-based) application, only the JDBC transaction handling is available; in an application server, Hibernate can use JTA. Since we would like Hibernate application code to look the same in both managed and non-managed environments, Hibernate provides its own abstraction layer, hiding the underlying transaction API. Hibernate allows user extension, so you could even plug in an adaptor for the CORBA transaction service.

Transaction management is exposed to the application developer via the Hibernate Transaction interface. You aren't forced to use this API; Hibernate lets you control JTA or JDBC transactions directly, but this usage is discouraged, and we won't discuss this option.

5.1.2 The Hibernate Transaction API

The Transaction interface provides methods for declaring the boundaries of a database transaction. See listing 5.1 for an example of the basic usage of Transaction.

Listing 5.1 Using the Hibernate Transaction API

    Session session = sessions.openSession();
    Transaction tx = null;
    try {
        tx = session.beginTransaction();

        concludeAuction();

        tx.commit();
    } catch (Exception e) {
        if (tx != null) {
            try {
                tx.rollback();
            } catch (HibernateException he) {
                //log he and rethrow e
            }
        }
        throw e;
    } finally {
        try {
            session.close();
        } catch (HibernateException he) {
            throw he;
        }
    }


The call to session.beginTransaction() marks the beginning of a database transaction. In the case of a non-managed environment, this starts a JDBC transaction on the JDBC connection. In the case of a managed environment, it starts a new JTA transaction if there is no current JTA transaction, or joins the existing current JTA transaction. This is all handled by Hibernate; you shouldn't need to care about the implementation.

The call to tx.commit() synchronizes the Session state with the database. Hibernate then commits the underlying transaction if and only if beginTransaction() started a new transaction (in both managed and non-managed cases). If beginTransaction() did not start an underlying database transaction, commit() only synchronizes the Session state with the database; it's left to the responsible party (the code that started the transaction in the first place) to end the transaction. This is consistent with the behavior defined by JTA.

If concludeAuction() threw an exception, we must force the transaction to roll back by calling tx.rollback(). This method either rolls back the transaction immediately or marks the transaction for "rollback only" (if you're using CMTs).

FAQ Is it faster to roll back read-only transactions? If code in a transaction reads data but doesn't modify it, should you roll back the transaction instead of committing it? Would this be faster? Apparently some developers found this approach to be faster in some special circumstances, and this belief has now spread through the community. We tested this with the more popular database systems and found no difference. We also failed to discover any source of real numbers showing a performance difference. There is also no reason why a database system should be implemented suboptimally, that is, why it shouldn't use the fastest transaction cleanup algorithm internally. Always commit your transaction and roll back if the commit fails.

It's critically important to close the Session in a finally block in order to ensure that the JDBC connection is released and returned to the connection pool. (This step is the responsibility of the application, even in a managed environment.)

NOTE The example in listing 5.1 is the standard idiom for a Hibernate unit of work; therefore, it includes all exception-handling code for the checked HibernateException. As you can see, even rolling back a Transaction and closing the Session can throw an exception. You don't want to use this example as a template in your own application, since you'd rather hide the exception handling with generic infrastructure code. You can, for example, use a utility class to convert the HibernateException to an unchecked runtime exception and hide the details of rolling back a transaction and closing the session. We discuss this question of application design in more detail in chapter 8, section 8.1, "Designing layered applications."

However, there is one important aspect you must be aware of: the Session has to be immediately closed and discarded (not reused) when an exception occurs. Hibernate can't retry failed transactions. This is no problem in practice, because database exceptions are usually fatal (constraint violations, for example) and there is no well-defined state to continue after a failed transaction. An application in production shouldn't throw any database exceptions either.
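The idea of hiding this exception handling behind generic infrastructure code can be sketched as a small template method. Everything in this sketch is hypothetical: ToySession stands in for a Session plus its Transaction, CheckedPersistenceException stands in for the checked HibernateException, and InfrastructureException is an illustrative name for the unchecked wrapper. It shows the shape of such a utility, not the one developed later in the book.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy template that hides rollback and close handling; all types are hypothetical. */
class UnitOfWorkSketch {

    /** Stand-in for a checked persistence exception. */
    static class CheckedPersistenceException extends Exception {
        CheckedPersistenceException(String message) { super(message); }
    }

    /** Unchecked exception that application code sees instead. */
    static class InfrastructureException extends RuntimeException {
        InfrastructureException(Throwable cause) { super(cause); }
    }

    /** Minimal stand-in for a session together with its transaction. */
    interface ToySession {
        void begin() throws CheckedPersistenceException;
        void commit() throws CheckedPersistenceException;
        void rollback() throws CheckedPersistenceException;
        void close() throws CheckedPersistenceException;
    }

    /** The business logic of one unit of work. */
    interface Work {
        void run(ToySession session) throws CheckedPersistenceException;
    }

    /** Runs the work, rolls back on any failure, and always closes the session. */
    static void execute(ToySession session, Work work) {
        try {
            session.begin();
            work.run(session);
            session.commit();
        } catch (CheckedPersistenceException | RuntimeException e) {
            try {
                session.rollback();
            } catch (CheckedPersistenceException rollbackFailure) {
                e.addSuppressed(rollbackFailure); // the original failure stays primary
            }
            throw new InfrastructureException(e);
        } finally {
            try {
                session.close();                  // the session is never reused
            } catch (CheckedPersistenceException closeFailure) {
                // Assumption of this sketch: a real helper would at least log this.
            }
        }
    }

    /** Session stub that records which methods were called. */
    static ToySession recording(List<String> calls) {
        return new ToySession() {
            public void begin() { calls.add("begin"); }
            public void commit() { calls.add("commit"); }
            public void rollback() { calls.add("rollback"); }
            public void close() { calls.add("close"); }
        };
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        execute(recording(calls), s -> { /* concludeAuction() would go here */ });
        System.out.println(calls);   // prints [begin, commit, close]
    }
}
```

With a helper like this, calling code contains only the business logic and never sees a checked exception, while the rule that a failed session is closed and discarded is enforced in one place.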

We've noted that the call to commit() synchronizes the Session state with the database. This is called flushing, a process you automatically trigger when you use the Hibernate Transaction API.

5.1.3 Flushing the Session

The Hibernate Session implements transparent write behind. Changes to the domain model made in the scope of a Session aren't immediately propagated to the database. This allows Hibernate to coalesce many changes into a minimal number of database requests, helping minimize the impact of network latency.

For example, if a single property of an object is changed twice in the same Transaction, Hibernate only needs to execute one SQL UPDATE. Another example of the usefulness of transparent write behind is that Hibernate can take advantage of the JDBC batch API when executing multiple UPDATE, INSERT, or DELETE statements.

Hibernate flushes occur only at the following times:

■ When a Transaction is committed
■ Sometimes before a query is executed
■ When the application calls Session.flush() explicitly
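The coalescing effect of write behind can be illustrated with a toy model. This is not how Hibernate tracks changes internally; the class below is a hypothetical stand-in that remembers the latest value per column of a single row in an assumed ITEM table and emits one UPDATE only when flush() is called.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of write behind for one row of a hypothetical ITEM table. */
class WriteBehindSketch {
    private final Map<String, Object> dirty = new LinkedHashMap<>(); // column -> latest value
    private final List<String> executedSql = new ArrayList<>();

    /** Changes a column value in memory only; no SQL is issued yet. */
    void set(String column, Object value) {
        dirty.put(column, value);   // a second change to the same column replaces the first
    }

    /** Sends all pending changes as one UPDATE with bind parameters, then clears them. */
    void flush() {
        if (dirty.isEmpty()) {
            return;                 // nothing changed, nothing to write
        }
        String assignments = String.join(" = ?, ", dirty.keySet()) + " = ?";
        executedSql.add("UPDATE ITEM SET " + assignments + " WHERE ITEM_ID = ?");
        dirty.clear();
    }

    List<String> executedSql() {
        return executedSql;
    }

    public static void main(String[] args) {
        WriteBehindSketch session = new WriteBehindSketch();
        session.set("DESCRIPTION", "first draft");
        session.set("DESCRIPTION", "final text");   // same property changed twice
        session.flush();                             // for example, at commit time
        System.out.println(session.executedSql());   // one UPDATE, not two
    }
}
```

Because nothing is sent until flush(), the intermediate value never reaches the database, which is the saving the text describes.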

Flushing the Session state to the database at the end of a database transaction is required in order to make the changes durable and is the common case. Hibernate doesn't flush before every query. However, if there are changes held in memory that would affect the results of the query, Hibernate will, by default, synchronize first.

You can control this behavior by explicitly setting the Hibernate FlushMode via a call to session.setFlushMode(). The flush modes are as follows:

■ FlushMode.AUTO: The default. Enables the behavior just described.
■ FlushMode.COMMIT: Specifies that the session won't be flushed before query execution (it will be flushed only at the end of the database transaction). Be aware that this setting may expose you to stale data: modifications you made to objects only in memory may conflict with the results of the query.
■ FlushMode.NEVER: Lets you specify that only explicit calls to flush() result in synchronization of session state with the database.

We don't recommend that you change this setting from the default. It's provided to allow performance optimization in rare cases. Likewise, most applications rarely need to call flush() explicitly. This functionality is useful when you're working with triggers, mixing Hibernate with direct JDBC, or working with buggy JDBC drivers. You should be aware of the option but not necessarily look out for use cases.

Now that you understand the basic usage of database transactions with the Hibernate Transaction interface, let's turn our attention more closely to the subject of concurrent data access.

It seems as though you shouldn't have to care about transaction isolation; the term implies that something either is or is not isolated. This is misleading. Complete isolation of concurrent transactions is extremely expensive in terms of application scalability, so databases provide several degrees of isolation. For most applications, incomplete transaction isolation is acceptable. It's important to understand the degree of isolation you should choose for an application that uses Hibernate and how Hibernate integrates with the transaction capabilities of the database.

5.1.4 Understanding isolation levels

Databases (and other transactional systems) attempt to ensure transaction isolation, meaning that, from the point of view of each concurrent transaction, it appears that no other transactions are in progress.

Traditionally, this has been implemented using locking. A transaction may place a lock on a particular item of data, temporarily preventing access to that item by other transactions. Some modern databases such as Oracle and PostgreSQL implement transaction isolation using multiversion concurrency control, which is generally considered more scalable. We'll discuss isolation assuming a locking model (most of our observations are also applicable to multiversion concurrency).

This discussion is about database transactions and the isolation level provided by the database. Hibernate doesn't add additional semantics; it uses whatever is available with a given database. If you consider the many years of experience that database vendors have had with implementing concurrency control, you'll clearly see the advantage of this approach. Your part, as a Hibernate application developer, is to understand the capabilities of your database and how to change the database isolation behavior if needed in your particular scenario (and by your data integrity requirements).


Isolation issues

First, let's look at several phenomena that break full transaction isolation. The ANSI SQL standard defines the standard transaction isolation levels in terms of which of these phenomena are permissible:

■ Lost update: Two transactions both update a row and then the second transaction aborts, causing both changes to be lost. This occurs in systems that don't implement any locking. The concurrent transactions aren't isolated.
■ Dirty read: One transaction reads changes made by another transaction that hasn't yet been committed. This is very dangerous, because those changes might later be rolled back.
■ Unrepeatable read: A transaction reads a row twice and reads different state each time. For example, another transaction may have written to the row, and committed, between the two reads.
■ Second lost updates problem: A special case of an unrepeatable read. Imagine that two concurrent transactions both read a row, one writes to it and commits, and then the second writes to it and commits. The changes made by the first writer are lost.
■ Phantom read: A transaction executes a query twice, and the second result set includes rows that weren't visible in the first result set. (It need not necessarily be exactly the same query.) This situation is caused by another transaction inserting new rows between the execution of the two queries.
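The second lost updates problem from the list above is easy to reproduce on paper. The sketch below replays the interleaving deterministically in plain Java, with no database and no threads; ItemRow and the two "transactions" are hypothetical stand-ins that only show the order of reads and writes.

```java
/** Deterministic replay of the second lost updates problem on a toy row. */
class SecondLostUpdateSketch {

    /** Hypothetical row with a single column. */
    static class ItemRow {
        int price;
        ItemRow(int price) { this.price = price; }
    }

    /** Both transactions read first; then each writes a value based on its own read. */
    static int interleave(ItemRow row, int raiseByFirst, int raiseBySecond) {
        int seenByFirst = row.price;               // first transaction reads
        int seenBySecond = row.price;              // second transaction reads the same state
        row.price = seenByFirst + raiseByFirst;    // first writes and commits
        row.price = seenBySecond + raiseBySecond;  // second writes from its stale read and commits
        return row.price;
    }

    public static void main(String[] args) {
        // Starting at 100, raises of 10 and 5 should add up to 115 if applied one after the other.
        System.out.println(interleave(new ItemRow(100), 10, 5)); // prints 105
    }
}
```

The final value reflects only the second writer's change; the first writer's raise has silently disappeared even though both transactions committed successfully.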

Now that you understand all the bad things that could occur, we can define the various transaction isolation levels and see what problems they prevent.

Isolation levels

The standard isolation levels are defined by the ANSI SQL standard but aren't particular to SQL databases. JTA defines the same isolation levels, and you'll use these levels to declare your desired transaction isolation later:

■ Read uncommitted: Permits dirty reads but not lost updates. One transaction may not write to a row if another uncommitted transaction has already written to it. Any transaction may read any row, however. This isolation level may be implemented using exclusive write locks.
■ Read committed: Permits unrepeatable reads but not dirty reads. This may be achieved using momentary shared read locks and exclusive write locks. Reading transactions don't block other transactions from accessing a row. However, an uncommitted writing transaction blocks all other transactions from accessing the row.
■ Repeatable read: Permits neither unrepeatable reads nor dirty reads. Phantom reads may occur. This may be achieved using shared read locks and exclusive write locks. Reading transactions block writing transactions (but not other reading transactions), and writing transactions block all other transactions.
■ Serializable: Provides the strictest transaction isolation. It emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. Serializability may not be implemented using only row-level locks; there must be another mechanism that prevents a newly inserted row from becoming visible to a transaction that has already executed a query that would return the row.
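The four levels and the read phenomena they permit can be summarized as a small lookup table. The sketch below pairs each level with the corresponding constant from java.sql.Connection, which is how JDBC names these levels; the enum and its method names are hypothetical and exist only for this illustration. It covers the three read phenomena and leaves out the two lost-update cases.

```java
import java.sql.Connection;
import java.util.EnumSet;
import java.util.Set;

/** Lookup table: which read phenomena each isolation level still permits. */
class IsolationLevelsSketch {

    enum Phenomenon { DIRTY_READ, UNREPEATABLE_READ, PHANTOM_READ }

    enum Level {
        READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED,
                EnumSet.allOf(Phenomenon.class)),
        READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED,
                EnumSet.of(Phenomenon.UNREPEATABLE_READ, Phenomenon.PHANTOM_READ)),
        REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ,
                EnumSet.of(Phenomenon.PHANTOM_READ)),
        SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE,
                EnumSet.noneOf(Phenomenon.class));

        final int jdbcConstant;            // value accepted by Connection.setTransactionIsolation()
        final Set<Phenomenon> permitted;   // phenomena that may still occur at this level

        Level(int jdbcConstant, Set<Phenomenon> permitted) {
            this.jdbcConstant = jdbcConstant;
            this.permitted = permitted;
        }

        boolean permits(Phenomenon phenomenon) {
            return permitted.contains(phenomenon);
        }
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " (JDBC " + level.jdbcConstant + ") permits " + level.permitted);
        }
    }
}
```

Reading the table top to bottom, each level removes one phenomenon from the permitted set, which is the sense in which the levels are ordered from weakest to strictest.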

It's nice to know how all these technical terms are defined, but how does that help you choose an isolation level for your application?

5.1.5 Choosing an isolation level

Developers (ourselves included) are often unsure about what transaction isolation level to use in a production application. Too great a degree of isolation will harm performance of a highly concurrent application. Insufficient isolation may cause subtle bugs in our application that can’t be reproduced and that we’ll never find out about until the system is working under heavy load in the deployed environment.

Note that we refer to caching and optimistic locking (using versioning) in the following explanation, two concepts explained later in this chapter. You might want to skip this section and come back when it’s time to make the decision for an isolation level in your application. Picking the right isolation level is, after all, highly dependent on your particular scenario. The following discussion contains recommendations; nothing is carved in stone.

Hibernate tries hard to be as transparent as possible regarding the transactional semantics of the database. Nevertheless, caching and optimistic locking affect these semantics. So, what is a sensible database isolation level to choose in a Hibernate application?

First, you eliminate the read uncommitted isolation level. It’s extremely dangerous to use one transaction’s uncommitted changes in a different transaction. The rollback or failure of one transaction would affect other concurrent transactions. Rollback of the first transaction could bring other transactions down with it, or perhaps even cause them to leave the database in an inconsistent state. It’s possible that changes made by a transaction that ends up being rolled back could be committed anyway, since they could be read and then propagated by another transaction that is successful!

Second, most applications don’t need serializable isolation (phantom reads aren’t usually a problem), and this isolation level tends to scale poorly. Few existing applications use serializable isolation in production; rather, they use pessimistic locks (see section 5.1.7, “Using pessimistic locking”), which effectively force a serialized execution of operations in certain situations.

This leaves you a choice between read committed and repeatable read. Let’s first consider repeatable read. This isolation level eliminates the possibility that one transaction could overwrite changes made by another concurrent transaction (the second lost updates problem) if all data access is performed in a single atomic database transaction. This is an important issue, but using repeatable read isn’t the only way to resolve it.

Let’s assume you’re using versioned data, something that Hibernate can do for you automatically. The combination of the (mandatory) Hibernate first-level session cache and versioning already gives you most of the features of repeatable read isolation. In particular, versioning prevents the second lost update problem, and the first-level session cache ensures that the state of the persistent instances loaded by one transaction is isolated from changes made by other transactions. So, read committed isolation for all database transactions would be acceptable if you use versioned data.

Repeatable read provides a bit more reproducibility for query result sets (only for the duration of the database transaction), but since phantom reads are still possible, there isn’t much value in that. (It’s also not common for web applications to query the same table twice in a single database transaction.)

You also have to consider the (optional) second-level Hibernate cache. It can provide the same transaction isolation as the underlying database transaction, but it might even weaken isolation. If you’re heavily using a cache concurrency strategy for the second-level cache that doesn’t preserve repeatable read semantics (for example, the read-write and especially the nonstrict-read-write strategies, both discussed later in this chapter), the choice for a default isolation level is easy: You can’t achieve repeatable read anyway, so there’s no point slowing down the database. On the other hand, you might not be using second-level caching for critical classes, or you might be using a fully transactional cache that provides repeatable read isolation. Should you use repeatable read in this case? You can if you like, but it’s probably not worth the performance cost.


Setting the transaction isolation level allows you to choose a good default locking strategy for all your database transactions. How do you set the isolation level?

5.1.6 Setting an isolation level

Every JDBC connection to a database uses the database’s default isolation level, usually read committed or repeatable read. This default can be changed in the database configuration. You may also set the transaction isolation for JDBC connections using a Hibernate configuration option:

    hibernate.connection.isolation = 4

Hibernate will then set this isolation level on every JDBC connection obtained from a connection pool before starting a transaction. The sensible values for this option are as follows (you can also find them as constants in java.sql.Connection):

■ 1: Read uncommitted isolation
■ 2: Read committed isolation
■ 4: Repeatable read isolation
■ 8: Serializable isolation
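As a small illustration of the values listed above, the following sketch maps each number to the matching java.sql.Connection constant. It is not part of Hibernate; the class name IsolationLevels and the method describe are made up for this example, and only the JDK constants themselves are real.

```java
import java.sql.Connection;

// Illustrative helper (hypothetical, not a Hibernate class): translates the numeric
// values accepted by hibernate.connection.isolation into readable names by way of
// the constants defined in java.sql.Connection.
class IsolationLevels {

    static String describe(int level) {
        switch (level) {
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "read uncommitted"; // 1
            case Connection.TRANSACTION_READ_COMMITTED:   return "read committed";   // 2
            case Connection.TRANSACTION_REPEATABLE_READ:  return "repeatable read";  // 4
            case Connection.TRANSACTION_SERIALIZABLE:     return "serializable";     // 8
            default:
                throw new IllegalArgumentException("Not a sensible isolation level: " + level);
        }
    }

    public static void main(String[] args) {
        int[] levels = {1, 2, 4, 8};
        for (int level : levels) {
            System.out.println(level + " = " + describe(level));
        }
    }
}
```

With this helper, the configuration value 4 shown earlier resolves to "repeatable read", and a value such as 3 is rejected because it matches none of the four constants.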

Note that Hibernate never changes the isolation level of connections obtained from a datasource provided by the application server in a managed environment. You may change the default isolation using the configuration of your application server.

As you can see, setting the isolation level is a global option that affects all connections and transactions. From time to time, it’s useful to specify a more restrictive lock for a particular transaction. Hibernate allows you to explicitly specify the use of a pessimistic lock.

5.1.7 Using pessimistic locking

Locking is a mechanism that prevents concurrent access to a particular item of data. When one transaction holds a lock on an item, no concurrent transaction can read and/or modify this item. A lock might be just a momentary lock, held while the item is being read, or it might be held until the completion of the transaction. A pessimistic lock is a lock that is acquired when an item of data is read and that is held until transaction completion.

In read-committed mode (our preferred transaction isolation level), the database never acquires pessimistic locks unless explicitly requested by the application. Usually, pessimistic locks aren’t the most scalable approach to concurrency. However, in certain special circumstances, they may be used to prevent database-level deadlocks, which result in transaction failure.

Some databases (Oracle and PostgreSQL, for example) provide the SQL SELECT...FOR UPDATE syntax to allow the use of explicit pessimistic locks. You can check the Hibernate Dialects to find out if your database supports this feature. If your database isn’t supported, Hibernate will always execute a normal SELECT without the FOR UPDATE clause.

The Hibernate LockMode class lets you request a pessimistic lock on a particular item. In addition, you can use the LockMode to force Hibernate to bypass the cache layer or to execute a simple version check. You’ll see the benefit of these operations when we discuss versioning and caching.

Let’s see how to use LockMode. If you have a transaction that looks like this

    Transaction tx = session.beginTransaction();
    Category cat = (Category) session.get(Category.class, catId);
    cat.setName("New Name");
    tx.commit();

then you can obtain a pessimistic lock as follows:

    Transaction tx = session.beginTransaction();
    Category cat =
        (Category) session.get(Category.class, catId, LockMode.UPGRADE);
    cat.setName("New Name");
    tx.commit();

With this mode, Hibernate will load the Category using a SELECT...FOR UPDATE, thus locking the retrieved rows in the database until they’re released when the transaction ends.

Hibernate defines several lock modes:

■ LockMode.NONE: Don’t go to the database unless the object isn’t in either cache.
■ LockMode.READ: Bypass both levels of the cache, and perform a version check to verify that the object in memory is the same version that currently exists in the database.
■ LockMode.UPGRADE: Bypass both levels of the cache, do a version check (if applicable), and obtain a database-level pessimistic upgrade lock, if that is supported.
■ LockMode.UPGRADE_NOWAIT: The same as UPGRADE, but use a SELECT...FOR UPDATE NOWAIT on Oracle. This disables waiting for concurrent lock releases, thus throwing a locking exception immediately if the lock can’t be obtained.
■ LockMode.WRITE: Is obtained automatically when Hibernate has written to a row in the current transaction (this is an internal mode; you can’t specify it explicitly).

By default, load() and get() use LockMode.NONE. LockMode.READ is most useful with Session.lock() and a detached object. For example:

    Item item = ... ;
    Bid bid = new Bid();
    item.addBid(bid);
    ...
    Transaction tx = session.beginTransaction();
    session.lock(item, LockMode.READ);
    tx.commit();

This code performs a version check on the detached Item instance to verify that the database row wasn’t updated by another transaction since it was retrieved, before saving the new Bid by cascade (assuming that the association from Item to Bid has cascading enabled).

By specifying an explicit LockMode other than LockMode.NONE, you force Hibernate to bypass both levels of the cache and go all the way to the database. We think that most of the time caching is more useful than pessimistic locking, so we don’t use an explicit LockMode unless we really need it. Our advice is that if you have a professional DBA on your project, let the DBA decide which transactions require pessimistic locking once the application is up and running. This decision should depend on subtle details of the interactions between different transactions and can’t be guessed up front.

Let’s consider another aspect of concurrent data access. We think that most Java developers are familiar with the notion of a database transaction and that is what they usually mean by transaction. In this book, we consider this to be a fine-grained transaction, but we also consider a more coarse-grained notion. Our coarse-grained transactions will correspond to what the user of the application considers a single unit of work. Why should this be any different than the fine-grained database transaction?

The database isolates the effects of concurrent database transactions. It should appear to the application that each transaction is the only transaction currently accessing the database (even when it isn’t). Isolation is expensive. The database must allocate significant resources to each transaction for the duration of the transaction. In particular, as we’ve discussed, many databases lock rows that have been read or updated by a transaction, preventing access by any other transaction, until the first transaction completes. In highly concurrent systems, these locks can prevent scalability if they’re held for longer than absolutely necessary. For this reason, you shouldn’t hold the database transaction (or even the JDBC connection) open while waiting for user input. (All this, of course, also applies to a Hibernate Transaction, since it’s merely an adaptor to the underlying database transaction mechanism.)

If you want to handle long user think time while still taking advantage of the ACID attributes of transactions, simple database transactions aren’t sufficient. You need a new concept, long-running application transactions.
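To make the cost of held locks concrete, here is an in-memory analogy, offered only as a sketch. A java.util.concurrent ReentrantLock stands in for a database row lock; the class RowLockDemo and its methods are hypothetical names chosen for this example and have nothing to do with the Hibernate API or with real SQL execution. It shows that a lock taken when the row is read stays in force until the owning transaction ends, and that a NOWAIT-style attempt from another transaction fails immediately in the meantime.

```java
import java.util.concurrent.locks.ReentrantLock;

// In-memory analogy only: a ReentrantLock plays the role of a database row lock.
class RowLockDemo {

    private final ReentrantLock rowLock = new ReentrantLock();

    // Comparable to SELECT ... FOR UPDATE: wait until the lock is free, then hold it.
    void lockForUpdate() {
        rowLock.lock();
    }

    // Comparable to SELECT ... FOR UPDATE NOWAIT: give up at once if the lock is taken.
    boolean lockForUpdateNoWait() {
        return rowLock.tryLock();
    }

    // Comparable to commit or rollback: the lock is released when the transaction ends.
    void endTransaction() {
        rowLock.unlock();
    }

    // Runs a second "transaction" on its own thread and reports whether it got the lock.
    static boolean otherTransactionCanLock(RowLockDemo row) {
        boolean[] acquired = new boolean[1];
        Thread other = new Thread(() -> {
            acquired[0] = row.lockForUpdateNoWait();
            if (acquired[0]) {
                row.endTransaction();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        RowLockDemo row = new RowLockDemo();
        row.lockForUpdate();                               // first transaction locks the row
        System.out.println(otherTransactionCanLock(row));  // false: lock is still held
        row.endTransaction();                              // first transaction completes
        System.out.println(otherTransactionCanLock(row));  // true: lock was released
    }
}
```

If the first transaction stayed open while a user thought about what to type, every other transaction would keep getting the first result for that entire time, which is why the lock should never outlive a single request.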

5.2 Working with application transactions

Business processes, which might be considered a single unit of work from the point of view of the user, necessarily span multiple user client requests. This is especially true when a user makes a decision to update data on the basis of the current state of that data.

In an extreme example, suppose you collect data entered by the user on multiple screens, perhaps using wizard-style step-by-step navigation. You must read and write related items of data in several requests (hence several database transactions) until the user clicks Finish on the last screen. Throughout this process, the data must remain consistent and the user must be informed of any change to the data made by any concurrent transaction. We call this coarse-grained transaction concept an application transaction, a broader notion of the unit of work.

We’ll now restate this definition more precisely. Most web applications include several examples of the following type of functionality:

1. Data is retrieved and displayed on the screen in a first database transaction.
2. The user has an opportunity to view and then modify the data, outside of any database transaction.
3. The modifications are made persistent in a second database transaction.

In more complicated applications, there may be several such interactions with the user before a particular business process is complete. This leads to the notion of an application transaction (sometimes called a long transaction, user transaction or business transaction). We prefer application transaction or user transaction, since these terms are less vague and emphasize the transaction aspect from the point of view of the user.

Since you can’t rely on the database to enforce isolation (or even atomicity) of concurrent application transactions, isolation becomes a concern of the application itself, perhaps even a concern of the user.


Let’s discuss application transactions with an example.

In our CaveatEmptor application, both the user who posted a comment and any system administrator can open an Edit Comment screen to delete or edit the text of a comment. Suppose two different administrators open the edit screen to view the same comment simultaneously. Both edit the comment text and submit their changes. At this point, we have three ways to handle the concurrent attempts to write to the database:

■ Last commit wins: Both updates succeed, and the second update overwrites the changes of the first. No error message is shown.
■ First commit wins: The first modification is persisted, and the user submitting the second change receives an error message. The user must restart the business process by retrieving the updated comment. This option is often called optimistic locking.
■ Merge conflicting updates: The first modification is persisted, and the second modification may be applied selectively by the user.
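The difference between the first two outcomes can be sketched without a database. The following example is an assumption-laden illustration, not Hibernate code: CommentStore, Row, updateUnchecked, and updateIfUnchanged are hypothetical names, the "table" is a plain map, and the code is single-threaded. The version comparison mirrors the general idea of an UPDATE whose WHERE clause includes the version that was originally read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table of comments; not Hibernate code.
class CommentStore {

    // Immutable snapshot of one row: the comment text plus its version number.
    static final class Row {
        final String text;
        final int version;

        Row(String text, int version) {
            this.text = text;
            this.version = version;
        }
    }

    private final Map<Long, Row> rows = new HashMap<>();

    void insert(long id, String text) {
        rows.put(id, new Row(text, 0));
    }

    // First database transaction of an application transaction: read the current state.
    Row load(long id) {
        return rows.get(id);
    }

    // Last commit wins: write unconditionally, ignoring anything that happened since the read.
    void updateUnchecked(long id, String newText) {
        Row current = rows.get(id);
        rows.put(id, new Row(newText, current.version + 1));
    }

    // First commit wins: write only if the row still carries the version that was read.
    // Returns false when another writer committed in between.
    boolean updateIfUnchanged(long id, String newText, int versionRead) {
        Row current = rows.get(id);
        if (current.version != versionRead) {
            return false;
        }
        rows.put(id, new Row(newText, versionRead + 1));
        return true;
    }

    public static void main(String[] args) {
        CommentStore store = new CommentStore();
        store.insert(1L, "original");

        // Two administrators open the edit screen and see the same version.
        Row seenByA = store.load(1L);
        Row seenByB = store.load(1L);

        System.out.println(store.updateIfUnchanged(1L, "edit by A", seenByA.version)); // true
        System.out.println(store.updateIfUnchanged(1L, "edit by B", seenByB.version)); // false
        System.out.println(store.load(1L).text);                                       // edit by A
    }
}
```

When updateIfUnchanged returns false, the application can either show an error and make the user start over (first commit wins) or reload the row and let the user reconcile the two texts (merge conflicting updates). Calling updateUnchecked instead gives last commit wins.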

The first option, last commit wins, is problematic; the second user overwrites the changes of the first user without seeing the changes made by the first user or even knowing that they existed. In our example, this probably wouldn’t matter, but it would be unacceptable for some other kinds of data. The second and third options are usually acceptable for most kinds of data. From our point of view, the third option is just a variation of the second: instead of showing an error message, we show the message and then allow the user to manually merge changes. There is no single best solution. You must investigate your own business requirements to decide among these three options.

The first option happens by default if you don’t do anything special in your application; so, this option requires no work on your part (or on the part of Hibernate). You’ll have two database transactions: The comment data is loaded in the first database transaction, and the second database transaction saves the changes without checking for updates that could have happened in between.

On the other hand, Hibernate can help you implement the second and third strategies, using managed versioning for optimistic locking.

5.2.1 Using managed versioning

Managed versioning relies on either a version number that is incremented or a timestamp that is updated to the current time, every time an object is modified. For Hibernate managed versioning, we must add a new property to our Comment class and map it as a version number using the <version> tag. First, let’s look at the changes to the Comment class:

    public class Comment {
        ...
        private int version;
        ...
        void setVersion(int version) {
            this.version = version;
        }
        int getVersion() {
            return version;
        }
    }

You can also use a public scope for the setter and getter methods. The property mapping must come immediately after the identifier property mapping in the mapping file for the Comment class: