UNIMARC & Friends: Charting the New Landscape of Library Standards: Proceedings of the International Conference Held in Lisbon, 20-21 March 2006 9783598440342, 9783598242793

With the expansion of the World Wide Web during the last decade, libraries and their standards face an ever-complex envi


English Pages 133 [136] Year 2007

Table of contents :
Frontmatter
Contents
List of Contributors
Preface
Welcome address
IFLA’s Evolving Programme to Produce and Promote the International Standard Bibliographic Descriptions: Past, Present, and Future
The Future of Cataloguing Codes and Systems: IME ICC, FRBR, and RDA
Modelling Bibliographic Information: Purposes, Prospects, Potential
Extending FRBR Concepts to Authority Data
IFLA UBCIM Working Group on FRANAR Recommendations for Potential Changes in the UNIMARC Authorities Format
BiblioML and AuthoritiesML. An XML Application for Bibliographic and Authority Data Records, Based on the UNIMARC Bibliographic and Authorities Formats
UNIMARC’s Embedded Fields and MarcXchange: Unexpected Scenarios
UNIMARC and XML
A new OPAC for BNCF Using Open Source Software, XML and UNIMARC
Evolving Standards: IFLA/ICABS and ISO/TC46
ICABS – Umbrella for Multifaceted Activities in the Area of Bibliographic and Resource Control
UNIMARC – Future Perspectives


Marie-France Plassard (editor)


IFLA Series on Bibliographic Control Vol 30

International Federation of Library Associations and Institutions Fédération Internationale des Associations de Bibliothécaires et des Bibliothèques Internationaler Verband der bibliothekarischen Vereine und Institutionen Международная Федерация Библиотечных Ассоциаций и Учреждений Federación Internacional de Asociaciones de Bibliotecarios y Bibliotecas

About IFLA

www.ifla.org

IFLA (The International Federation of Library Associations and Institutions) is the leading international body representing the interests of library and information services and their users. It is the global voice of the library and information profession.

IFLA provides information specialists throughout the world with a forum for exchanging ideas and promoting international cooperation, research, and development in all fields of library activity and information service. IFLA is one of the means through which libraries, information centres, and information professionals worldwide can formulate their goals, exert their influence as a group, protect their interests, and find solutions to global problems.

IFLA’s aims, objectives, and professional programme can only be fulfilled with the cooperation and active involvement of its members and affiliates. Currently, over 1,700 associations, institutions and individuals, from widely divergent cultural backgrounds, are working together to further the goals of the Federation and to promote librarianship on a global level. Through its formal membership, IFLA directly or indirectly represents some 500,000 library and information professionals worldwide.

IFLA pursues its aims through a variety of channels, including the publication of a major journal, as well as guidelines, reports and monographs on a wide range of topics. IFLA organizes workshops and seminars around the world to enhance professional practice and increase awareness of the growing importance of libraries in the digital age. All this is done in collaboration with a number of other non-governmental organizations, funding bodies and international agencies such as UNESCO and WIPO. IFLANET, the Federation’s website, is a prime source of information about IFLA, its policies and activities: www.ifla.org

Library and information professionals gather annually at the IFLA World Library and Information Congress, held in August each year in cities around the world.

IFLA was founded in Edinburgh, Scotland, in 1927 at an international conference of national library directors. IFLA was registered in the Netherlands in 1971. The Koninklijke Bibliotheek (Royal Library), the national library of the Netherlands, in The Hague, generously provides the facilities for our headquarters. Regional offices are located in Rio de Janeiro, Brazil; Dakar, Senegal; and Singapore.

IFLA Series on Bibliographic Control Vol 30

UNIMARC & Friends: Charting the New Landscape of Library Standards Proceedings of the International Conference Held in Lisbon, 20–21 March 2006

Edited by Marie-France Plassard

K · G · Saur München 2007

IFLA Series on Bibliographic Control edited by Sjoerd Koopman The “IFLA Series on Bibliographic Control” continues the former “UBCIM Publications – New Series”.

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet at http://dnb.d-nb.de.

Printed on acid-free paper / Gedruckt auf säurefreiem Papier

© 2007 by International Federation of Library Associations and Institutions, The Hague, The Netherlands

Alle Rechte vorbehalten / All Rights Strictly Reserved
K.G.Saur Verlag, München
An Imprint of Walter de Gruyter GmbH & Co. KG

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system of any nature, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher.

Printed in the Federal Republic of Germany by Strauss GmbH, Mörlenbach

ISBN 978-3-598-24279-3

Contents

List of Contributors
Preface (Fernanda Maria Campos)
Welcome address (José Afonso Furtado)

Session 1 – Cataloguing Standards: Challenges and Future Directions
IFLA’s Evolving Programme to Produce and Promote the International Standard Bibliographic Descriptions: Past, Present, and Future (John D. Byrum)
The Future of Cataloguing Codes and Systems: IME ICC, FRBR, and RDA (Barbara B. Tillett)
Modelling Bibliographic Information: Purposes, Prospects, Potential (Patrick Le Boeuf)
Extending FRBR Concepts to Authority Data (Glenn E. Patton)
IFLA UBCIM Working Group on FRANAR Recommendations for Potential Changes in the UNIMARC Authorities Format (Mirna Willer)

Session 2 – MARC Portability and Reuse in the open Web Environment
BiblioML and AuthoritiesML. An XML Application for Bibliographic and Authority Data Records, Based on the UNIMARC Bibliographic and Authorities Formats (Michel Bottin)
UNIMARC’s Embedded Fields and MarcXchange: Unexpected Scenarios (Vladimir Skvortsov)
UNIMARC and XML (José Borbinha, Hugo Manguinhas, Nuno Freire)
A new OPAC for BNCF Using Open Source Software, XML and UNIMARC (Giovanni Bergamin)

Session 3 – Round Table: Evolving Standards for Bibliographic Data Handling: the IFLA’s Role
Evolving Standards: IFLA/ICABS and ISO/TC46 (Sally H. McCallum)
ICABS – Umbrella for Multifaceted Activities in the Area of Bibliographic and Resource Control (Renate Gömpel)
UNIMARC – Future Perspectives (Alan Hopkinson)

List of Contributors

Giovanni Bergamin
Responsible for the Information and Technology Services, Biblioteca Nazionale Centrale, Florence, Italy

José Borbinha
Researcher, Instituto de Engenharia de Sistemas e Computadores (INESC-ID), Lisbon, Portugal

Michel Bottin
President of the Association pour la Documentation Numérique en XML, Paris, France

John Byrum
Chair, ISBD Review Group; Former Chief, Regional & Cooperative Cataloging, Library of Congress, USA

Nuno Freire
Researcher, Instituto de Engenharia de Sistemas e Computadores (INESC-ID), Lisbon, Portugal

Renate Gömpel
Head of Acquisitions, Cataloguing and Standardization, Deutsche Nationalbibliothek, Frankfurt am Main, Germany

Alan Hopkinson
Chair, Permanent UNIMARC Committee; Head of Library Systems, Middlesex University, London, UK

Patrick le Bœuf
Curator, Office for Documentary Standardization (bureau de normalisation documentaire), Bibliothèque nationale de France, Paris, France

Sally McCallum
Chief, Network Development and Standards, Library of Congress, USA

Hugo Manguinhas
Researcher, Instituto de Engenharia de Sistemas e Computadores (INESC-ID), Lisbon, Portugal

Glenn Patton
Director, OCLC WorldCat Quality Management, Dublin, Ohio, USA

Vladimir Skvortsov
Head of the Department of Information Technology, National Library of Russia, St. Petersburg, Russia

Barbara B. Tillett
Chief, Cataloging Policy & Support Office, Library of Congress, USA

Mirna Willer
Consultant for Library Automation, Croatian Institute for Librarianship, Zagreb, Croatia

Preface

When the National Library of Portugal took over the International MARC part of UBCIM (Universal Bibliographic Control and International MARC Core Activity) in 2003, UNIMARC became a separate Core Activity of IFLA. In former days, the principle underlying UBCIM was the promotion and use of bibliographic control tools so as to make bibliographic records available throughout the world, preferably in machine-readable form but always in a standard form. UNIMARC was developed, under the umbrella of IFLA, to act as a universal format to and from which records in different national MARC formats, prepared under diverse cataloguing rules, could be converted for import and export, thus creating the perfect tool for compatibility and reusability of bibliographic data.

Over the years, UNIMARC also became the format (or the basis) for national usage, especially in countries that had not developed a specific national MARC format. That was, indeed, the case of Portugal, which adopted UNIMARC as its national format and implemented it not only within the National Library but also amongst the great majority of Portuguese libraries.

Taking care of UNIMARC and assuming the responsibilities of this new IFLA Core Activity was therefore a natural task as well as a great challenge. It is true that the scope and context were broader and the responsibility quite different from what had been undertaken in the country concerning the implementation of the format. The fact that a Permanent UNIMARC Committee had been in place since 1991, with the purpose of maintaining and updating the format, was a great advantage. Furthermore, UBCIM had been remarkably well conducted until 2003, which did much to smooth the transfer and aided the planning of the new UNIMARC Core Activity.

However, standards are never static. There is a constant need for development and adaptation to new types of information and/or resources, and for continuous promotion and harmonization.
IFLA has been, over the years, officially responsible for library standards. From the 1960s, IFLA notably promoted the standardized cataloguing principles known as the Paris Principles and developed the standardized bibliographic description (ISBD), as well as more recent initiatives in the field of standardized models: FRBR (Functional Requirements for Bibliographic Records) and FRANAR (Functional Requirements and Numbering of Authority Records). Last but not least, we are now participating in a universal movement towards an International Cataloguing Code which will revise and update the Paris Principles.

Therefore, the library community is applying these bibliographic standards in traditional libraries, recognizing IFLA’s role in the maintenance and update of such standards but expecting from its bodies an active role in the new networked information environment. Indeed, if we check the Profession Pillar with which IFLA bodies should comply, we find the following statement: “IFLA will assist libraries and information services to fulfil their purposes and shape responses to the needs of clients in a rapidly changing information environment” by means of taking “the lead in collaborative efforts to establish guidelines and standards for the organization of information for access across international boundaries”.

The IFLA UNIMARC Core Activity decided to organize this International Conference in order to actively contribute to this important discussion on challenges and future directions of bibliographic standards, thus following IFLA strategic directions. “UNIMARC & Friends: charting the new landscape of library standards” was the name given to the conference so as to strengthen the harmonization factor that these standards imply. The rationale behind this theme is the complex environment libraries (and their standards) face today. The expansion of the WWW in the last decade (new types, genres and forms of information resources and information services, a changing network infrastructure, the emergence of cheap content-based retrieval approaches) leaves us with a number of questions in charting a future for the development of bibliographic standards. Disclosing the current state of the art of new conceptual models and data specifications, especially the implications for the future of MARC standards, was the more practical goal we wanted to achieve with this conference.
The programme was organized in three sessions:

Session 1 – Cataloguing standards: challenges and future directions;
Session 2 – MARC portability and reuse in the open Web environment;
Session 3 – Round Table: Evolving standards for bibliographic data handling: the IFLA’s role.

The conference, held at the Calouste Gulbenkian Foundation, Lisbon, on 20 and 21 March 2006, was jointly organized by the National Library of Portugal and the Calouste Gulbenkian Foundation, with the support of Novabase (Portugal), Saur Verlag and OCLC. It was attended by 151 participants from 22 countries, reflecting the wide international interest in the theme. Eleven excellent papers were presented by a panel of highly reputed experts on subjects ranging from the ISBDs, the International Cataloguing Code, FRBR and FRANAR (the intellectual standards and their models) to the ways in which bibliographic data is encoded, transmitted and reused; they especially addressed UNIMARC/XML implementations and finally presented and discussed the evolution of standards and IFLA’s role in that domain.

There was a general sense of a continuing need for presenting and discussing the way bibliographic standards are evolving. In general, the positive reactions from the delegates gave the IFLA UNIMARC Core Activity very good feedback. UNIMARC & Friends is still a very sound topic for discussion and, contrary to those who say that in the new technological environment there is no further need for these old traditional standards, the Conference was able to show that libraries not only need trustworthy, reliable, high-quality standards but must also be prepared to understand how to connect and apply them to the infrastructure that is being driven by the broader developments in the network environment.

I should like to express, on behalf of the IFLA UNIMARC Core Activity, my thanks to all those who made this Conference possible, notably the organizers, the National Library of Portugal and the Calouste Gulbenkian Foundation (especially its Art Library), the speakers and chairs of the sessions, the sponsors and, of course, the participants. The publication of these Proceedings makes the content of the Conference widely available and promotes information on essential bibliographic tools.

Clearly there are immediate opportunities and challenges in this Web environment, but the future of bibliographic standards should not be seen as a doomed business, as long as we understand the changing context and improve our abilities to join efforts and collaborate.

September 2006
Fernanda Maria Campos
Director, IFLA UNIMARC Core Activity

Welcome address

José Afonso Furtado, Director, Art Library, Calouste Gulbenkian Foundation

Dear Mrs Campos, Ladies and Gentlemen,

I would like to welcome you all to this International Conference, on behalf of the Calouste Gulbenkian Foundation. As its Library Director, I am particularly honoured to co-sponsor this event with the National Library of Portugal. It is not often that we can host such a large and international audience devoted to advancing the foundations of the library field.

The topics of this conference all converge on the subject of standards, and these are in fact the cornerstone of librarianship. Indeed, for several years now we have seen a turmoil of new developments and discussions around the standards, old and new, upon which we build the services we provide to our user communities, to the information environment in general and also, of course, the services we share as institutions with a long history in building consensus and co-operation.

It is especially significant that nowadays the international agenda involving library standards encompasses more than just the latest topics of IT-related standards, such as the application of XML to MARC data. While these are unquestionably important, it is also true that a much larger wave of activity is taking place, in rethinking the conceptual and normative foundations of libraries:

• major principles that for forty years have governed the philosophy of catalogues are being discussed internationally;
• new models, such as FRBR, have emerged to redefine bibliographic entities, their functions and articulation;
• long-established bibliographic standards, like the ISBDs, have been under revision at an unusually rapid pace; and, finally,
• cataloguing codes are now the subject of an extensive re-examination and renewal.

Altogether, this set of topics shows the different layers of a normative body that is undergoing a multitude of changes and enhancements, in order to address current and future needs, as a coherent whole. We all know that the future of the information environment is essentially open to the unpredictable. But we also recognize that the future of professional information services such as those of libraries will depend essentially on the coherence and flexibility of the standards we build today. Let this Conference be a step furthering this understanding, and these two days will be worth remembering.

I wish you every success in your meetings.

SESSION 1 CATALOGUING STANDARDS: CHALLENGES AND FUTURE DIRECTIONS

Chairs: Ia McIlwaine, Marie-France Plassard

IFLA’s Evolving Programme to Produce and Promote the International Standard Bibliographic Descriptions: Past, Present, and Future

John D. Byrum

The concept of the International Standard Bibliographic Description has now endured for 35 years and represents IFLA’s most successful effort at promoting the cause of cataloguing standardization. One reason why the ISBDs have flourished and remain essentially intact after nearly two generations is the continuing influence of the forces that prompted their formulation in the first place. These include demands and opportunities arising from the automation of bibliographic control as well as the economic necessity of sharing cataloguing. The ISBDs were also intended to serve as a principal component of IFLA’s programme to promote Universal Bibliographic Control, the ideal of which in Dorothy Anderson’s words is “to make universally and promptly available, in a form which is internationally acceptable, basic bibliographic data for all publications issued in all countries.”

The ISBDs seek to serve three primary purposes. First, and of greatest importance, they are intended to make it possible to interchange records from different sources. As subsidiary purposes, the ISBDs, secondly, assist in the interpretation of records across language barriers, so that records produced for users of one language can be interpreted by users of other languages. Thirdly, they facilitate the conversion of bibliographic records to electronic form.

The first of the ISBDs to be published was the International Standard Bibliographic Description for Monographic Publications (ISBD(M)), which appeared in 1971. There followed projects to produce ISBDs for Serials, Non-book materials, Cartographic materials, Rare books, Printed music, and, most recently, Continuing and Electronic resources. For article-level publications, Guidelines for the Application of the ISBDs to the Description of Component Parts were issued.

Along the way, the need was felt for a general framework to which all the ISBDs would conform, resulting in the production of ISBD(G); the primary purpose of ISBD(G) is to ensure harmony among the other ISBDs. The entire inventory of the ISBDs in all their editions is listed on IFLANET; in every case, at least the latest version of each ISBD is now freely available in HTML or PDF format.1

Schedule and procedures for issuance of new or revised ISBDs

Procedures are essential in all standardization work in order to ensure that the steps by which a document becomes a new or revised standard are well known and consistently followed. The ISBDs are no exception to this rule. As a result, at the 1989 IFLA Conference, the Section on Cataloguing adopted a schedule and established policies for development and distribution of new and revised ISBDs. In 2002, these policies were updated to take advantage of the opportunity for electronic publication of texts, both in draft and final form. The changes were also intended to speed up the review process by using email to announce the availability of drafts for review, and to enable quicker return of comments and suggestions regarding these drafts to the ISBD Review Group.

Originally it was thought that each ISBD should be considered for updating on a five-year cycle. More pragmatically, they have been revised as the need has arisen, either to implement generally applicable changes or in response to the evolution of library materials, such as the changes that resulted in publication of ISBD(ER) and, more recently, the ISBD for continuing resources.

There are essentially five phases in the development of a new or revised ISBD.

Creation of draft text

During this phase, a working group may be appointed comprising cataloguing experts and, when appropriate, format specialists from both within and outside of IFLA, unless the Review Group believes that it itself possesses sufficient expertise to accomplish the objectives of the revision. Typically for every project, an editor is designated to prepare the text according to the decisions of the working group.

Worldwide review

Once a draft text is completed, it is ready for worldwide review and comment. At this point, the text is forwarded for posting on IFLANET. Thereupon, an announcement is sent to IFLA-L and other appropriate electronic networks. Normally, two months are allowed for review of an ISBD undergoing revision and usually an additional month if the text is entirely new.

Final revision

All comments arising from worldwide review are considered. In accordance with the group decisions, the editor revises the draft. At this point, special attention is given to provision of examples in a variety of languages in the text and appendices. When a final text is determined, the ISBD Review Group as a whole goes over the text, primarily to ensure conformance with ISBD(G).

Balloting

The final version of the new or revised ISBD is then sent to the Cataloguing Section’s Standing Committee and any co-sponsoring section for voting. The ballot provides only two options: to approve or to disapprove. However, editorial and sometimes more substantive comments are conveyed and are accommodated, if possible. Ballots not returned by close of voting are considered to be affirmative votes. One month is allowed for this phase.

Publication and workshop

If the outcome is a vote of approval, as is typically the case, the text is scheduled for publication. Today, in all cases, the text is issued electronically, although the e-text may be delayed at the request of the publisher if the text is also to be published in print. As the final step in the process in the case of new ISBDs or those extensively revised, a workshop may be held in conjunction with an IFLA conference to promote understanding and use of the publication.

Although some ISBDs have been developed or revised to meet particular needs, there have been two overall revision campaigns affecting the entire Family of ISBDs. Such campaigns occur when changes are identified that have an across-the-board effect.

First General Review Project

The initial overall revision resulted in the creation of the ISBD Review Committee, which first met in August 1981. The Committee set out three major objectives for the first general review project:

1. to harmonize provisions, achieving increased consistency,
2. to improve examples, and,
3. to make the provisions more applicable to cataloguers working with materials published in non-roman scripts.

By the end of the decade, the ISBDs had been re-published in “Revised editions”. In addition, a separate ISBD was created for Computer Files, which, because of rapid advances in technology, was soon superseded by an ISBD for Electronic Resources.

Second General Review Project

In the early 1990s, IFLA’s Cataloguing Section, in cooperation with other Sections, set up the Study Group on the Functional Requirements for Bibliographic Records (FRBR). One immediate consequence of this development was the decision to suspend most revision work on the ISBDs while the FRBR Group pursued its charge to “recommend a basic level of functionality and basic data requirements for records created by national bibliographic agencies.” This decision resulted in permanent suspension of a project to identify the components of a “Concise ISBD (M)”, because it was expected that FRBR’s findings would in effect provide such a baseline.

In 1998, the FRBR Study Group published its Final Report2, and the ISBD Review Group was reconstituted to initiate a full-scale review of the ISBDs in order to implement FRBR’s recommendations for a basic level national bibliographic record.

In the ISBDs, national bibliographic agencies are called upon to “prepare the definitive description containing all the mandatory elements set out in the relevant ISBD insofar as the information is applicable to the publication being described.” To facilitate implementation of this principle, the ISBDs designate as “optional” those data elements that are not mandatory when applicable. Therefore, the main task in pursuing the second general review has entailed a close look at the ISBD data elements that are now mandatory, in order to make optional any that are optional in FRBR. The ISBD Review Group has completed work on three of the ISBDs: ISBD(M), ISBD(CR) and ISBD(G).
The Review Group had begun updating three more (the ISBDs for Cartographic Materials, for Antiquarian books, and for Electronic Resources) when it was decided that these projects should be put on hold, pending development of the Consolidated Version of the ISBDs, a project described below. The exception to this decision was to continue the work to revise ISBD(A).

Thus, for more than three decades, IFLA’s ISBD programme has yielded standards for representing bibliographic data for all types of library materials and maintained these standards through one or more revisions. The ISBDs have been officially translated into 25 languages, including several that use non-Roman scripts. They have guided the work of national cataloguing committees in updating their codes to foster internationally accepted practices, a point underscored by the compilations of practices under various rules and AACR that were prepared for IFLA’s ongoing meetings of Cataloguing Experts3.

Current priorities and activities

Let us turn next to the current priorities and activities of the ISBD Review Group. First, there is the matter of terminology used in the ISBDs in contrast to that used in FRBR, which has raised the question as to whether such terms as “work,” “expression,” “manifestation,” and “item” should be introduced. On the one hand, such changes would be a logical extension of the Review Group’s charge to implement FRBR to the largest extent practicable. But, on the other hand, as Patrick Le Boeuf argued in his paper on “Brave new FRBR world”: “FRBR terminology should not be merely incorporated such as it stands into the ISBDs and cataloguing rules, but [these] should keep their own specific terminology, and provide accurate definitions showing how each term in this specific terminology is conceptually related to the FRBR terminology”.

Accepting this advice, the ISBD Review Group concluded that it was essential to clarify the relationship between the ISBDs and the FRBR model. The Group decided to sponsor a project to create a table detailing the relationship of each of the elements specified in the ISBDs to its corresponding entity attribute or relationship as defined in the FRBR model. As a result, in 2004, a document entitled “Mapping ISBD Elements to FRBR Entity Attributes and Relationships” was produced and is available on IFLA’s Website.4

Nevertheless, the ISBD Review Group did decide to introduce some changes in terminology, beginning with the recently revised ISBD(G). Among them is the use of the term “resource” rather than “item” or “publication”.
The use of the former term “item” is different from the term “item” as used in FRBR, but it is not difficult to confuse them. This led to the decision to use “resource.”

As another current activity, the ISBD Review Group has been working to provide improved guidance regarding the use of the ISBDs for bibliographic description of publications in multiple formats, for example, an e-book or serially issued maps. Three issues were of particular concern:
1. use of multiple ISBDs and use of multiple general material designations (GMDs),
2. the order in which elements for multiple formats should be treated, and
3. the number of bibliographic records to be created for multiple versions.

It turned out that the last of these would be the easiest to resolve. The Review Group concluded that the ISBDs should urge national bibliographic agencies and libraries participating in networks to create separate bibliographic descriptions for works issued in multiple formats. This practice would facilitate record exchange, one of the basic purposes of the ISBDs. Other libraries would be authorized to select a single-record approach when they wish. This recommendation also addressed a recommendation emanating from Working Group 4 at the first International Meeting of Experts for an International Cataloguing Code.

As a result of these initial discussions, the Review Group set up a Material Designation Study Group with Lynne Howarth as chair, to address two issues that had been identified for further work, namely:
• placement of the general material designation [GMD]
• identification, clarification, and definition of content and nomenclature of the GMD, area 3, area 5, and area 7.

It soon became clear that the Study Group’s work on terminology and nomenclature would need to parallel and complement the work of the Study Group on the Future Directions of the ISBDs (to be discussed next) as it prepares, first, the harmonized text, and, subsequently, the consolidated ISBD. The Study Group decided that, as individual areas of the harmonized text are completed, it would examine and evaluate terminology used currently in the authorized ISBDs and make recommendations for the content and terminology to be used in the GMD, and areas 3, 5, and 7 as appropriate in the proposed consolidated ISBD. The Study Group has also addressed the troublesome problem of where to locate the general material designation within the bibliographic description.
Group members agreed on the importance of the GMD as an “early warning device” for catalogue users, and after considering options, the Study Group issued the following policy statement, which was approved by the Review Group at its 2005 meeting in Oslo: “Recognizing the ongoing difficulties with the current optionality, terminology, and location/placement of the general material designation [GMD], and anticipating that the Future Directions Study Group may be working towards producing a consolidated ISBD for which a Document Type Definition (DTD) can then be developed, the Material Designation Study Group proposes the creation of a separate, unique, high level component (not a numbered ISBD area) – a content/carrier or content/medium designation that would be mandatory, i.e., not optional as with the current GMD – for recording in bibliographic records”.

“The Material Designation Study Group emphasizes that this component is independent of system displays – that is, different systems can display the recorded content of the “content/carrier” or “content/medium” designation as each system vendor or client institution determines appropriate, and particularly if the component is a part of the DTD that a style sheet will interpret for display (or not, as a library and/or system vendor determines).”

On another front, just as the Joint Steering Committee for Revision of AACR is undertaking a strategic reexamination of the organization and presentation of cataloguing rules for Resource Description and Access, the Review Group decided that it too should consider the possibility of combining the ISBDs into a single document. Historically, the ISBDs have been revised and published at various times, with no method for incorporating generally applicable changes made in newer texts into the older texts. The results have been confusing to many users of the ISBDs. It was therefore decided in 2003 to set up a Study Group on Future Directions of the ISBDs, chaired by Dorothy McGarry, to determine the feasibility of consolidating the ISBDs, merging the stipulations that applied to all resources and providing for additional stipulations for resources that needed them.

To provide a focus for its work, the Study Group developed the following Objectives and Principles. The Objectives are:
• To prepare a consolidated, updated ISBD from the specialized ISBDs in order to meet the needs of cataloguers and other users of bibliographic information.
• To provide consistent stipulations for description of all types of resources to the extent that uniformity is possible, with specific stipulations for specific types of resources as required to describe those resources.
The Principles include:
• The primary purpose of the ISBD is to provide the stipulations for compatible descriptive cataloguing worldwide in order to aid the international exchange of bibliographic records throughout the international library and information community (including, e.g., producers and publishers).
• Different levels of cataloguing will be accommodated, including those needed by national bibliographic agencies, national bibliographies, universities and other research collections.
• The descriptive elements needed to identify and select a resource are to be specified.

• The set of elements of information rather than their display or use in a specific automated system will provide the focus.
• Cost effective practices must be considered in developing the stipulations.

The Study Group met for several days in April 2005 and held meetings during the IFLA conference in Oslo, reaching agreement on the general outline to be followed for each area. In addition, the Group is recommending that:
• The areas should be restructured. The purpose and a brief description of the area will be set out first, with a definition and a statement on the source of information and the language of the element, followed by the stipulations, rearranged from the current order as needed.
• General stipulations that apply to all materials should be given first, followed by any exceptions or additional stipulations that are needed for specific types of resources.
• All definitions of terms should be brought together in one place, including those listed in appendices in some of the specialized ISBDs. Some definitions will be changed, and those terms for which definitions are no longer considered necessary will be deleted.
• Some of the choices of source of information for the areas should be changed as needed, and a distinction will be made between “transcribed” information and “recorded” information. This means that areas 1 and 6 will be the only ones to be “transcribed”. Information for the other areas will be taken from anywhere in the resource without putting the information within square brackets. Therefore, elements needing a prescribed source of information are primarily areas 1 and 6.
• Area 3 will be limited to mathematical data for cartographic resources, to music specific information, and to numbering for serials. Area 3 will be omitted for types of electronic resources.
• In area 6, the ISSN will be mandatory for all materials, including monographic series.
• Full examples will be published separately in a supplement.
The work plan and time-line for this project are as follows: First, the Study Group prepared a merged text for the ISBDs as they were published. This text was presented, side-by-side with a column containing suggestions for changes from the published stipulations in addition to those made during the merger of the individual texts. Primary problems and suggestions were highlighted for the ISBD Review Group to consider. This phase was completed in 2005. Next, the Study Group has worked on the stipulations, taking into consideration responses from Review Group members, in order to have a text ready for a
meeting in April 2006 at Die Deutsche Bibliothek. Coming out of this meeting will be a text ready for worldwide review in June or July 2006. The Study Group will then revise the text and forward it to the Review Group for approval. If all goes as currently planned, in early 2007 the consolidated version should go to the Cataloguing Section’s Standing Committee for balloting. The question of whether this text will replace the individual ISBDs or will be issued in addition to them has yet to be decided. Surveys comparing existing national and multinational cataloguing codes taken in preparation for the various meetings of experts on an international cataloguing code have demonstrated conclusively that the ISBDs are used extensively as the basis for bibliographic description and usually with very little modification.5 The Review Group has worked with the authors of these national cataloguing codes whenever there are concerns that we might address by way of improving the ISBDs. In particular, we have established an effective working relationship with the Joint Steering Committee for Revision of AACR on matters of mutual interest. Further developments are announced on the ISBD Review Group’s Web page at http://www.ifla.org/VII/s13/isbd-rg.htm Finally, some personal thoughts regarding the role of the ISBDs in today’s changing information environment. First, on the positive side, publication patterns are changing, largely as a result of the electronic environment in which we increasingly function. As interest in metadata to promote control and access to electronic resources increases, the ISBDs should find new opportunities to influence content and use of these schemas, since most of them will define data elements already familiar to the ISBDs. On the other hand, not only are there new bibliographic situations to consider, but also not every bibliographic practice already in place continues to be as useful now as it was formerly. 
Today there is growing concern about the relevance of most existing standards in relation to the changing bibliographic environment, with its proliferation of electronic resources of all types. Some colleagues have even challenged the long-term viability of online catalogues, given changing user behaviour and expectation for direct access to information itself. Others are predicting the imminent or eventual demise of MARC formats. Many believe that in today’s technological environment key-word searching is more effective and much cheaper than traditional subject cataloguing and indexing. The ISBDs have not escaped this increasing scepticism with which 20th century standards are now being reassessed. For example, the Joint Steering Committee has already determined that ISBD punctuation per se will not be required in RDA.

For the future, not only IFLA’s ISBD Review Group, but indeed the cataloguing profession as a whole, will need to be keenly aware that descriptive cataloguing as we know it now is coming under high-level scrutiny by library administrators determined to redirect resources and cut cataloguing costs. In addition, automation continues to offer new applications like ONIX6, for example, that may provide opportunities for preparing bibliographic descriptions programmatically, and these developments in turn may necessitate a more flexible approach than is currently permitted by the ISBDs. Nevertheless, it seems clear to me, as we meet here in March 2006, that the ISBDs and the purposes they serve continue to be relevant to libraries. In my view, they will continue to enjoy widespread application, just as will MARC21 and UNIMARC until such time as they are overtaken by events. Meanwhile, it is necessary for IFLA to continue to keep the Family of ISBDs abreast of current requirements and to continue doing so in cooperation with national bibliographic agencies and national and multi-national cataloguing committees.

On this guardedly optimistic note, I am concluding my term as chair of the ISBD Review Group at the end of this month, after a term of service extending for more than 15 years. I am pleased to have this opportunity to acknowledge the successful efforts of the Review Group members who contribute their expertise on a voluntary basis to maintain and promote IFLA’s most successful effort in furthering cataloguing standardization.

References

1. http://www.ifla.org/VI/3/nd1/isbdlist.htm
2. http://www.ifla.org/VII/s13/frbr/frbr.htm
3. http://www.ddb.de/news/pdf/code_comp_2003_europe_2.pdf and http://www.loc.gov/loc/ifla/imeicc/source/code-comparisons_final-summary.pdf
4. http://www.ifla.org/VII/s13/pubs/ISBD-FRBR-mappingFinal.pdf
5. http://www.ddb.de/news/pdf/code_comp_2003_europe_2.pdf
6. ONIX is an XML (extensible mark-up language) DTD (document type definition). For more information on ONIX, visit the EDItEUR home page at http://www.editeur.org. EDItEUR is the agency responsible for coordinating the various national ONIX groups and distributing the ONIX standard.

The Future of Cataloguing Codes and Systems: IME ICC, FRBR, and RDA ∗

Barbara B. Tillett

∗ A version of this presentation was originally given on 28 October 2005 for the Potomac Technical Processing Librarians.

First of all, I would like to thank the meeting organizers for inviting me to participate in this important conference. I have been invited to cover some of the initiatives going on just now within IFLA and among the world’s cataloguing rule makers that are changing the direction of cataloguing as we know it. Some of these things, like FRBR – Functional Requirements for Bibliographic Records, are also having an impact on the development of new systems to help our cataloguers and our users. Others of these are directly targeted to improve how we catalogue, to take better advantage of today’s digital environment, and to capture metadata from various resources and build on our strong controlled vocabularies that help users through more precise and guided searching – initiatives like the Virtual International Authority File (VIAF). So, I will be talking mostly about cataloguing principles, specifically the new principles being developed in IFLA through the IME ICC meetings – IFLA Meeting of Experts on an International Cataloguing Code; the IFLA conceptual model of FRBR, and work now under way on a replacement for the Anglo-American Cataloguing Rules (AACR2) – known as RDA – Resource Description and Access. They, after all, are the content standards for what goes into a UNIMARC record – bibliographic or authority record. Let me start with a short step back into history.

Background – Anglo-American Cataloguing Rules

The Anglo-American Cataloguing Rules have an interesting history of development, ranging back at least to the 91 rules that were printed in the British Museum’s catalogue in 1841 by Panizzi, then the “Keeper of the Books”. On the other side of the ocean, Charles Ammi Cutter completed his study of cataloguing practices in the United States and issued his rules in 1876, which gave guidance about the objectives of cataloguing (finding and collocating in particular) that still hold today. Cutter’s rules went through 4 editions1 and were the basis for the British and American attempts to collaboratively create a set of rules. Around the turn of the previous century, the American Library Association and the Library Association in the United Kingdom worked together to devise rules but found they could not agree on every point and ended up issuing separate rules in 1902 and again in 1908. The Library of Congress was very much involved with ALA at the time and also had its own rules and later issued supplementary rules to augment the ALA rules. The British and American Library Associations, along with the Library of Congress, continued to work together to develop rules, but by 1941, the American Library Association decided to publish its own updated code, so there continued to be separate codes. By 1949 the ALA rules for author and title entries were accompanied by the Library of Congress Rules for descriptive cataloging.

And then during the 1950’s there were cries for more principle-based rules. Seymour Lubetzky was commissioned to study the rules, and he developed some basic principles in the process that were later taken to IFLA for their famous conference in 1961. The resulting “Paris Principles,” as we know them today, then formed the foundation of nearly all of the major cataloguing codes used worldwide. This was an incredible step towards global harmonization of cataloguing practices, which still remains a worthy goal.

After the 1961 Paris Principles, attempts once again were made to create a unified Anglo-American Cataloguing code, but again there were enough disagreements that two “texts” were published in 1967 – one the British text and the other a “North American text”. A lot of this was caused by large libraries in the United States that did not want to change their practices for entry of some corporate names under place – imposing what was called “superimposition” of old practices on headings made under the new rules.
The British took a more principled approach in their edition of the rules. At the end of the 1960’s, IFLA held another meeting of experts to develop the International Standard Bibliographic Description, which is used worldwide today for basic descriptive elements arranged in a prescribed order with prescribed punctuation. A decade later in 1978, following further agreements after 1969 on the International Standard for Bibliographic Description (ISBD) and the desire for the English-speaking countries to agree on rules, AACR2 was issued. AACR2 incorporated the ISBDs and came closer to the Paris Principles, making it even closer to other cataloguing codes used throughout the world.

But for libraries that formerly followed the old “North American text,” it was a very traumatic time of big and very expensive changes, resulting in closing or splitting card catalogues or abandoning card catalogues and starting new online systems – particularly because the rules for headings for corporate names were drastically different in the new rules. That second edition was then the first time that both sides of the Atlantic, the US/Canada and the UK shared the same rules, although indeed there were differences in some choices regarding options allowed in the rules, such as with application of the GMDs – General Material Designators. Then we saw revisions to AACR2 in 1988, 1998, and 2002 – they all basically followed the same structure as AACR2 with revised rules to reflect the incremental changes over time, such as a new perspective on electronic resources and serials and integrating resources. IFLA Cataloguing Section The IFLA Cataloguing Section has been the centre of major international standards for cataloguing for nearly 50 years. After the Paris Principles of 1961 and the International Standard for Bibliographic Description of 1969, we saw the creation of the FRBR conceptual model in the 1990’s (the Functional Requirements for Bibliographic Records published in 1998), and a new view of Universal Bibliographic Control (UBC) with regard to controlled vocabularies that has fostered the tests of virtual international authority files, and now the current activities to update and expand the Paris Principles through the IFLA Meetings of Experts on an International Cataloguing Code. I do not have time today to talk about the Virtual International Authority File except to mention it offers a user-focused approach to displaying the user-preferred language and script for controlled names and subjects. All of these IFLA initiatives are directly impacting changes to the new cataloguing codes. 
That ends our short history lesson, so now let us start with the IME ICC – and the Statement of International Cataloguing Principles.

Cataloguing principles

The 1961 Paris Principles mostly covered “entry” and “forms of headings”. Figure 1 shows the topics included in those principles – the function and structure of the catalogue, kinds of entry and use of multiple entries (here we have main and added entries). Then the principles go into the choice and form of headings used at the top of the catalogue cards – today we would say access
points to the bibliographic record. In today’s world we are not limited to a single linear card file, as they were in 1961, so how to enter a bibliographic record in a card catalogue is not of as much importance as it once was.

Paris Principles (1961)

• Scope
• Function
• Structure of the Catalogue
• Kinds of Entry
• Use of Multiple Entries
• Choice of Uniform Heading
• Single Personal Author
• Entry under Corporate Bodies
• Multiple Authorship
• Works Entered under Title, Uniform Headings for Works, etc.
• Entry Word for Personal Names

Figure 1. Outline of Paris Principles, 1961

IME ICC

In 2003 IFLA launched a series of international meetings again to review underlying principles that should govern us in cataloguing, but for the current digital world. Starting in December 2003 and revised again in September 2005 and again in April 2006, IFLA produced a draft statement of international cataloguing principles that is being reviewed by cataloguing rule makers and cataloguing experts worldwide. This new statement updates and reaffirms many of the 1961 Paris Principles, but is now bringing in the FRBR concepts and focusing on the current environment of online catalogues and planning for future systems that take better advantage of system capabilities. The new systems offer users better tools for resource discovery and for better navigation through the bibliographic universe.

The goal of this series of IFLA regional meetings is to increase the ability to share cataloguing information worldwide by promoting standards for the content of bibliographic and authority records used in library catalogues. Objectives were to develop and then later to review and update the 2003 draft
Statement of Principles from the Frankfurt meeting and also to see if we can get closer together in cataloguing practices and to make recommendations for a possible future International Cataloguing Code. This would be a code for code makers – to identify the rules that we can agree should be in all cataloguing codes. To date, we have held three of the regional meetings. The first was in Frankfurt, Germany for the European rule makers and cataloguing experts. It brought together 54 experts from 32 European countries, as well as representatives for the Anglo-American Cataloguing Rules from the United Kingdom, Australia, and the United States. The reports of the meeting and background papers from each of the regional meetings are available at their Web sites2 and a published print report from each meeting is available from K.G. Saur as part of the IFLA Bibliographic Control series. The 2nd regional meeting was held in Buenos Aires, Argentina in August 2004. The Web site and report of the meeting are in English and Spanish. The 3rd meeting was in Cairo, Egypt in December 2005, and the Web site is in English and Arabic. The report from that meeting will also be in English and Arabic. Next meetings are for August 2006 in Seoul, Korea for the Asian countries hosted by the National Library of Korea, and then a final meeting in 2007 for the African countries hosted by the National Library of South Africa before the IFLA meeting in Durban. This is a very exciting process, and we hope it will provide guidance to simplify cataloguing practices and improve the users’ experience in finding information they need. Draft Statement of International Cataloguing Principles Let us take a quick look now at what is in this draft Statement of International Cataloguing Principles. I mostly want to show you the influence of IFLA’s conceptual model, FRBR on the principles. 
First, the Introduction indicates that the principles are intended to apply to the description and access for all types of materials – unlike the Paris Principles that were basically just for texts. Also these new principles cover access, not
just choice and form of headings and not just bibliographic records, but also now for authority records. It states that the principles are built on the great cataloguing traditions of the world and on the conceptual models of FRBR, FRAR (Functional Requirements for Authority Records), and the future FRSAR (Functional Requirements for Subject Authority Records) – those are the foundations, and we intend to keep the basics for organizing information and providing controlled access and bibliographic relationships. Figure 2 shows the outline of topics covered in this new draft statement.

Statement of International Cataloguing Principles (2003+)

1. Scope
2. Entities, Attributes, Relationships
3. Functions of the Catalogue
4. Bibliographic Description
5. Access Points
6. Authority Records
7. Foundations for Search Capabilities

Figure 2 – Outline for IFLA’s draft Statement of International Cataloguing Principles

In the scope, besides reminding us that it is for all kinds of resources and meant to guide the development of cataloguing codes, the principles also state that the highest principle for constructing cataloguing codes is the convenience of the users. It is recognized that sometimes there are other principles that must be followed and that sometimes convenience for one user may differ from what would be convenient for other users, but keeping the user at the centre of focus should always be our guide.

The terminology for the draft Statement of Principles follows FRBR – entities, relationships, and attributes. FRBR terminology is followed even to the point of indicating that a separate bibliographic record is usually made for each manifestation, but that a record can be at the level of a collection or an individual work or a component of a work. The entities are those described or identified in not only bibliographic records, but also entities covered in authority records; and the FRBR entities are actually listed. The attributes of entities and relationships are described – to focus on attributes that identify the entity (identifying is one of the primary FRBR user tasks); and to limit relationships to those considered bibliographically significant.

The Statement of International Cataloguing Principles goes on to list the functions of the catalogue – again in FRBR “user task” terms of find, identify, select, obtain, and also to navigate. It makes it clear that we want to take advantage of both controlled and uncontrolled access points and states why we would want to offer controlled vocabularies. Section 6 of the draft Statement of International Cataloguing Principles goes on to build on FRAR – Functional Requirements for Authority Records – and plans for the FRSAR for subject authority records.

Functional Requirements for Bibliographic Records (FRBR)

I have already been talking about FRBR, but it is probably one of the really major breakthroughs of this past decade in the development of a new view of the bibliographic universe. I know you will be hearing more about this model today, so I will only briefly remind you of some major points. From 1992–1996 an IFLA Study Group developed the conceptual model called “FRBR,” which was published in 1998. The Functional Requirements for Bibliographic Records reinforce the basic objectives of catalogues for finding and collocating information and the importance of relationships to enable users to fulfil basic tasks with respect to the catalogue – enabling them to find, identify, select, and obtain information they want. These user tasks are key to why we catalogue, why we offer bibliographic information to help users find resources.
FRBR also offers us a structure to meet these basic user tasks, including ways to collocate records at the level of works and expressions, to show relationships.

So, what is the FRBR model? It is a generalized view of the bibliographic universe and is intended to be independent of any cataloguing code or implementation. It is a conceptual model and is not an application or an implementation, which makes it difficult for some of us to understand how it might really be applied to our real world. It is not a data model, it is not a metadata scheme, it is not a system design, but rather a conceptual model that can be used as the foundation for development of systems.

The FRBR report itself includes a description of the conceptual model of the bibliographic universe: that is, the entities, relationships, and attributes (or as we would call them today, the metadata or data elements) associated with each of the entities and relationships, and it proposes a national level bibliographic record for all of the various types of materials. It also sets out the elements needed in national level bibliographic records. Rather than being tied to any particular communication format or data structure, the FRBR conceptual model instead identifies attributes that would be needed in national-level bibliographic records – which elements are mandatory and which are optional.

This model opens up new possibilities for structuring the bibliographic description and access points and is serving as a guide in the development of rules that are more principle-based, more consistent, less redundant – and thereby cost-saving and easier to apply. For example, information we now provide redundantly in bibliographic records for names of persons and corporate bodies or names of works and expressions might be done once through different structures – somewhat like our current authority records for uniform titles and linked to the package that describes manifestations and items. We could also see making links for subject headings and classification numbers to the work and expression “records” so those attributes could then be inherited by the linked records for the associated manifestations and items – again eliminating the redundancy of putting that information in each bibliographic record as we do now.
We intend for this to be explained in RDA – the new cataloguing code – Resource Description and Access. Figure 3 is a scenario for the future, where we would make use of authority records for works and expressions and do more linking directly at the authority record level for the creators of works and classification and subject headings that are appropriate to the work. Those authority records would also be available to display for each linked bibliographic record, and we could save cataloguers’ time by not needing to classify and provide subject headings for all the manifestations of that same work/expression combination. I really like this model, but we need to experiment to see if this is best or perhaps there is a better implementation model for FRBR. This model will require some changes to the authority record – to include fields for the associated subject headings – we already have a field for classification. So there is a possible direction for both UNIMARC and MARC 21.

[Diagram: Future Scenario. Labels shown: Authority (Person/Corporate body; Work/Expression, Uniform Title; Series (work/expression), Uniform Title; Concept/Subject), Bibliographic (Manifestation), Holding (Item).]

Figure 3 – Future scenario
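The linking idea behind this future scenario can be illustrated with a minimal sketch in Python. It is only one possible reading of the scenario, not a UNIMARC or MARC 21 structure: the class names, field names, and sample values below are all hypothetical and chosen for illustration. Subject headings and classification are recorded once on a work/expression authority record, and each manifestation record reads them through a link instead of carrying its own copy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkExpressionAuthority:
    """Hypothetical authority record for a work/expression combination."""
    uniform_title: str
    creators: List[str] = field(default_factory=list)
    subject_headings: List[str] = field(default_factory=list)
    classification: List[str] = field(default_factory=list)


@dataclass
class ManifestationRecord:
    """Hypothetical bibliographic record describing one manifestation."""
    title: str
    publisher: str
    year: int
    work: WorkExpressionAuthority  # a link, not copied access points

    def display(self) -> dict:
        # Creator, subject, and classification data are read through the
        # link, so they are recorded only once per work/expression.
        return {
            "title": self.title,
            "publisher": self.publisher,
            "year": self.year,
            "uniform_title": self.work.uniform_title,
            "creators": list(self.work.creators),
            "subjects": list(self.work.subject_headings),
            "classification": list(self.work.classification),
        }


# Illustrative sample data only.
hamlet = WorkExpressionAuthority(
    uniform_title="Hamlet",
    creators=["Shakespeare, William, 1564-1616"],
    subject_headings=["Hamlet (Legendary character) -- Drama"],
    classification=["822.33"],
)

paperback = ManifestationRecord("Hamlet", "Example Press", 2001, hamlet)
ebook = ManifestationRecord(
    "Hamlet [electronic resource]", "Example Digital", 2005, hamlet
)

# A heading added once at the authority level shows up on every linked record.
hamlet.subject_headings.append("Princes -- Denmark -- Drama")

for record in (paperback, ebook):
    print(record.title, record.display()["subjects"])
```

In this sketch the cataloguer classifies and assigns subject headings once, and both linked manifestations display the same values. Whether a real system would resolve the link at display time or copy the data at indexing time is an implementation choice that the scenario leaves open.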

Future of Anglo-American Cataloguing Rules Just before the FRBR Report was published, many people involved in developing the FRBR model were also actively trying to make changes for the future of the cataloguing rules – AACR2. In 1997, the Joint Steering Committee for Revision of the Anglo-American Cataloguing Rules held the International Conference on the Principles & Future Development of AACR in Toronto. We invited experts from around the world to share in developing an action plan for the future of AACR. Some of the recommendations from that meeting have guided the thinking about new directions, such as the desire to document the basic principles that underlie the rules and explorations into content versus carrier and the logical structure of AACR; and some have already been implemented, like the new views of seriality – with continuing resources and harmonization of those cataloguing standards among the ISBD, ISSN, and AACR communities. Other recommendations from that meeting are still dreams, like further internationalization of the rules for their expanded use worldwide as a content standard for bibliographic and authority records. But we now want to make those dreams a reality.

So we envision RDA as a new standard for resource description and access, designed for the digital environment. By digital environment we mean three things:
• a Web-based tool,
• a tool that addresses cataloguing digital and all other types of resources, and
• a tool that results in records that are intended for use in the digital environment – through the Internet, Web-OPACs, etc.

The Joint Steering Committee’s Strategic Plan includes a Statement of Purpose for AACR – now RDA. It says that the code is a multinational content standard for providing bibliographic description and access for all media. While developed for use in English language communities, it can also be used in other language communities. It is independent of the format used to communicate information. Figure 4 is the general outline that was publicly announced in July 2005 for the structure of this new code.3

RDA Structure (Proposed)

General introduction
Part I – Resource description
Part II – Relationships
Part III – Authority control (Access point control)
Appendices: Capitalization, Numerals, Initial articles, Abbreviations; Presentation (ISBD display, OPAC display, etc.)
Glossary
Index

Figure 4. RDA Outline (February 2006)

There will be a general introduction to provide background for teaching the rules and building cataloguers’ judgment. Both Part I and Part II will include
access information. A user only wanting to provide brief description could stop with Part I. Others wanting to show relationships to other works and entities could go on to Part II, and we would expect most libraries to use all three parts, to also include authority control. At the end are appendices about capitalization, numerals, initial articles, abbreviations, how to present descriptive data and authority data, as well as a glossary and an index. For the General Introduction at the start of the new code, we propose to give background information about the purpose and scope of the code, the underlying objectives and principles, and related standards and guidelines. We will refer to the IFLA Statement of International Cataloguing Principles and FRBR. So you can see how all these things interrelate! We want to keep the text of this General Introduction to RDA brief but possibly provide links to the full text or relevant principles and conceptual documents. Figure 5 is the general outline for Part I, which basically covers the same kind of content as Part 1 of AACR2.

RDA – Part I

0. Introduction
1. General guidelines for resource description
2. Identification of the resource
3. Technical description
4. Content description
5. Information on terms of availability
6. Item-specific information

Figure 5 – RDA Part I (proposed)

Part I is arranged by data elements (also called attributes) – things like title, place of publication, date, and so on. There will be an indication of what the source is for the attribute, that is, where to find it on the item, how to record the attribute including recording as notes, as well as information about using the
attribute as a controlled or uncontrolled access point. This new structure will provide more flexibility to describe resources that have multiple characteristics – like many of the new digital resources.

For Part II, we are proposing to address relationships – these are related works, expressions, manifestations, and items, as well as persons, corporate bodies, and families that play some role with respect to the resource being described. The idea of a “primary access point” is being discussed to replace the often criticized term “main entry heading,” but this access point continues to give primary emphasis to the creator of the work contained in the manifestation being catalogued. The principle of authorship is still fundamental to citation, certainly in the Western world, and remains an important device to order displays, either as the primary alphabetical ordering for a set of retrieved records or as a secondary ordering device, say under a subject topic.

Besides the main entry, another criticized feature of AACR is the “rule of three.” This rule limits the identification of authors to three or fewer. When there are more than three authors, the rule says to use only the first one for access to the record. This rule was re-examined by the JSC several years ago and has been widely discussed. It is likely that there will continue to be the option to retain such a rule for cost-saving reasons, yet we recognize the value of enabling the end user to retrieve all the works of an author even if that author is the fourth or fifth or later in a jointly created work.

We propose that Part III will cover authority control to describe controlled access for the precision of searching. We are thinking of calling this “access point management.” We expect this part to cover both authorized forms of names and the variant forms that could be used as references or in clusters for alternative display forms.
It will also cover the construction of authorized names for persons, corporate bodies, families, and citations for works and expressions. We currently plan for several appendices and a Glossary, as I mentioned earlier. The display standards, or how we are to present the data to users, are now in an appendix, rather than being covered in the body of the rules. This is to allow the rules to operate within a variety of displays, such as those now used in OPACs and not just limited to ISBD displays or labelled OPAC displays. So now we come to the proposed timeline for getting from today to RDA as you see in Figure 6.

RDA Timeline (Proposal)

July 2005: Prospectus
Oct. 2005-April 2006: Part I
May-Sept. 2006: Part II
Oct. 2006-Apr. 2007: Part III
May-Sept. 2007: General Introduction, Appendices, and Glossary
2008: Publication (Web and loose-leaf)

Figure 6 – RDA Timeline (proposed)

Some people say this is ambitious, but others say “why will it take you so long?” Given the need to consult with the constituents and other rule making bodies worldwide, I personally feel it is very ambitious. Actually, this timing coincides nicely with the IFLA schedule to complete the worldwide regional meetings on the new Statement of International Cataloguing Principles by 2007. IFLA expects to have completed the consultations with all the world’s rule-making bodies following the 2006 meeting in Asia, and then will consult with the African cataloguing experts in 2007, but the Principles should be in pretty much final shape after we meet with the rule makers and cataloguing experts in Asia in 2006.

Conclusions

So, we have covered a lot – the FRBR and user tasks and new vocabulary and models to take us into the future. Throughout all of this is the increased awareness of how small the world has become with Internet capabilities and how important it is to share bibliographic information globally and also help reduce global costs. Our bibliographic and authority information is being used worldwide and also across different communities. We are updating the underlying principles that support the organization of information and doing it in a way to help build cataloguers’ judgment.

Our new standard for resource description and access will enable us to take descriptive metadata from many sources and give guidance on continuing our controlled vocabularies for names and titles to assure precision of future searches. All of these things are interconnected and leading us into the future of cataloguing, to provide us with updated standards for today’s Web environment while still supporting the traditional collections of our libraries, archives, and museums. I think you will all agree that IFLA’s role has been tremendous in reaching global agreements on standards for cataloguing that enable users everywhere to benefit.

References

1. Cutter: 1876 (1st ed.), 1889 (2nd ed.), 1891 (3rd ed.), 1904 (4th ed., Rules for a Dictionary Catalog).
2. IME ICC1: http://www.ddb.de/standardisierung/afs/imeicc_index.htm ; IME ICC2: http://www.loc.gov/loc/ifla/imeicc/imeicc2/ ; IME ICC3: http://www.loc.gov/loc/ifla/imeicc/
3. The outline was slightly changed to combine Parts I and II following the JSC meeting in April 2006.

Modelling Bibliographic Information: Purposes, Prospects, Potential

Patrick Le Boeuf

Although the phrase “conceptual model” may sound imposing, it seems that FRBR, the conceptual model for bibliographic information developed by IFLA in 1992–1997, has become a familiar element in the librarians’ landscape. The very name – which stands for “Functional Requirements for Bibliographic Records” – is more and more frequently mentioned in peer-reviewed library science literature as well as in individual librarians’ blogs, to the point that it is perhaps no exaggeration to claim that it has become, so to speak, “trendy.” If you google for “FRBR” you come up with about 163,000 hits1 (including much noise, admittedly, as FRBR also stands for Fritillaria brandegeei or Canada’s Fondation de Recherche sur les Blessures de la Route = Traffic Injury Research Foundation). And FRBR is not limited to a merely “conceptual” model, since a number of “FRBR implementations” are currently available or under study, at OCLC, the Library of Congress, VTLS Inc., AustLit Gateway, etc.

But, after all, what was the point of developing a conceptual model for bibliographic information? What was the objective pursued at the time the model was being developed? What sense does it make now? What benefits can be expected from the process of “conceptualizing” such a practical activity as cataloguing? Will future uses – and users – of FRBR still be the ones that FRBR’s developers had in mind? This International Conference devoted to UNIMARC and all its friends is a good opportunity to strive for an answer to such questions.

Purposes

Initially, the FRBR model was designed to2:
• make explicit everything that may remain implicit in cataloguers’ intentions whenever they create a bibliographic record: the deep meaning of each information element they select, the relations they strive to capture, the purpose of recording a given element in a specific form, etc.
• provide a scientific basis for minimal level cataloguing that would help reduce the costs of maintaining library catalogues while ensuring that the basic functions of such catalogues are still performed in a satisfactory manner
• convey a big picture of the information needs associated with all the kinds of materials that are likely to be held and described (or only described) by
libraries, ranging from hand-printed materials to electronic resources online
• address the issue of the emergence of new user expectations and needs, i.e., how to meet at the same time “traditional” information needs and new usages that are linked, for instance, to rights management or the “Web paradigm”.

After FRBR was released, it was often understood (and still is) as providing a clue to new kinds of OPACs; a number of studies and dissertations investigate its potential in that direction:
• “Data mining MARC to find FRBR” (2002) by Knut Hegna and Eeva Murtomaa3
• “Navigating through Voyager” (2002) by Jennifer Bowen4
• “Storage and retrieval of musical documents in a FRBR-based library” (2004) by Marte Brenne5
• “Implementing FRBR: a comparison of two relational models: IFLA’s FRBR model and Taniguchi’s expression-prioritized model” (2004) by Einar Silset Berg6, etc.

FRBR-based OPACs include:
• Virtua (in its versions later than version 41, 2002) by VTLS Inc.
• AustLit Gateway (2002)
• FictionFinder and Open WorldCat by OCLC.

A FRBR-based display of bibliographic information is possible through the “FRBR Display Tool” developed by the Library of Congress (2001). The University of Rochester (Rochester, NY) River Campus Libraries has received a grant from the Andrew W. Mellon Foundation to develop, over the period 2006–2007, an open-source online system, named XC (for “eXtensible Catalog”), the conceptual structure of which will be based on FRBR.

Another way of understanding and using FRBR is to regard it as a help for revising existing standards. In that vein, the Paris Principles (1961) are being revised in the light of FRBR, and RDA (Resource Description and Access), the future successor of AACR2, will be informed by the FRBR concepts. Both the future International Cataloguing Principles and RDA should be finalized by 2008.7

There is also a current trend that envisions FRBR not just as a tool meant to help design a more efficient display of records or as a set of concepts meant to help update cataloguing rules, but as a basis for an ontology of bibliographic
information that would enable one to “plug” library catalogues into the more general area of Semantic Web activities and information retrieval. The evolution between 1990, when the decision was made at the Stockholm Seminar to commission a study to define the functional requirements for bibliographic records, and our current environment, is therefore considerable. The initial intention was to provide a tool to reduce costs or justify such costs that could not be reduced, and to justify most cataloguing practices as they stood; the current interpretation, at least partially, is to take cataloguers out of their “ghetto” and lead them into the general Web arena, so that they may both benefit from tools designed by other communities, and make other communities benefit from their expertise and from the huge work they have been doing for decades or even centuries.

Prospects and Potential

IFLA’s initiatives

IFLA is eager to extend the FRBR conceptualizing effort to other aspects of bibliographic information, i.e., not just bibliographic records but also authority records and the huge complexity of subject cataloguing. To that purpose, two further study groups were formed:
• the FRANAR Group (for Functional Requirements And Numbering of Authority Records) chaired by Glenn Patton (OCLC), the outcome of which will soon be released as either “FRAR” or “FRAD” (for Functional Requirements for Authority Data)8;
• the FRSAR Group (for Functional Requirements for Subject Authority Records), established in 2005 and chaired by Marcia Zeng, Maja Žumer, and Athena Salaba.

The IFLA Division of Bibliographic Control also intends to have the FRBR model maintained and updated. This is the reason why the FRBR Review Group was formed in 2003. It is currently chaired by Pat Riva (McGill University Libraries, Montreal, Canada) and has a couple of Web pages at http://www.ifla.org/VII/s13/wgfrbr/wgfrbr.htm.

The FRBR Review Group has in turn formed some working groups, so that each of them could focus on specific issues:
• The IFLA Working Group on the Expression entity, chaired by Anders Cato (Kungl. biblioteket, Sweden), will provide a revised definition for this entity in 2006.

• The IFLA Working Group on Aggregates, formed in 2005 and chaired by Ed O’Neill (OCLC), will “investigate practical solutions to the specific problems encountered in modelling (a) collections, selections, anthologies…, (b) augmentations, (c) series, (d) journals, (e) integrating resources, (f) multipart monographs.”

A third working group is in charge of the alignment of the FRBR model with another model for cultural heritage information: the CIDOC CRM. That model was developed from 1996 on, on behalf of the International Committee for Documentation (CIDOC), affiliated to the International Council of Museums (ICOM). It is about to be published as ISO standard 21127. CIDOC CRM is a semantic model for information about museum objects (from fine arts museums, archaeological museums, natural history museums…). It was decided in 2003 that a working group, including both representatives from the IFLA FRBR Review Group and the CIDOC CRM community [it is co-chaired by Martin Doerr (ICS-FORTH, Greece) and myself], should be formed in order to harmonize – and even possibly merge – the conceptual model for bibliographic information with the conceptual model for museum information. The purposes of such an endeavour are to lay the basis for mediation tools between libraries and museums, and prepare FRBR for Semantic Web applications involving library and museum materials. The idea is to use the resulting “ontology” in an RDF context9, so that it would be possible to “navigate seamlessly” (so to speak) from library to museum information and vice versa, and to allow for “inferences” and automated reasoning based on the information stored in databases of both types. A first draft of the outcome of the FRBR/CIDOC CRM Harmonization Working Group is to be made publicly available by June 2006.

Outside IFLA

In 2001, the Library of Congress commissioned a study to examine the MARC21 format, among other perspectives, with regard to the FRBR model. This study was carried out by Tom Delsey – who is, as everyone knows, the main designer of the FRBR model – and it resulted in a functional analysis of MARC21, which is available from http://www.loc.gov/marc/marc-functionalanalysis/functional-analysis.html. Based on that analysis, the Library of Congress developed its “FRBR Display Tool,” an XSLT program that transforms lists of bibliographic records into meaningful displays by grouping them into the Work, Expression, and Manifestation entities.
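The grouping idea behind such a display can be illustrated with a short Python sketch. This is not the Library of Congress tool, which is written in XSLT and whose matching logic is not described here. It is a simplified illustration that assumes, hypothetically, that a work can be keyed on author plus uniform title and an expression on language; the record structure and sample data are invented.

```python
from collections import defaultdict


def group_records(records):
    """Group flat records into a work -> expression -> manifestations hierarchy.

    Assumed (hypothetical) keys: a work is identified by author and uniform
    title, an expression by language. Real matching is considerably subtler.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for rec in records:
        work_key = (rec["author"], rec["uniform_title"])
        expression_key = rec["language"]
        tree[work_key][expression_key].append(rec["manifestation"])
    # Convert to plain dicts so the result is easy to print and compare.
    return {work: dict(expressions) for work, expressions in tree.items()}


# Invented sample data, for illustration only.
records = [
    {"author": "Author, Example", "uniform_title": "Sample work",
     "language": "eng", "manifestation": "Publisher A, 1990"},
    {"author": "Author, Example", "uniform_title": "Sample work",
     "language": "eng", "manifestation": "Publisher B, 2005"},
    {"author": "Author, Example", "uniform_title": "Sample work",
     "language": "fre", "manifestation": "Publisher C, 1998"},
    {"author": "Writer, Another", "uniform_title": "Other work",
     "language": "eng", "manifestation": "Publisher A, 2001"},
]

grouped = group_records(records)
for (author, title), expressions in grouped.items():
    print(f"{author}. {title}")
    for language, manifestations in expressions.items():
        print(f"  [{language}]")
        for manifestation in manifestations:
            print(f"    {manifestation}")
```

The user sees one entry per work, with its language versions and editions nested beneath it, instead of four unrelated records.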

As far as I know, no equivalent has been attempted yet for the UNIMARC format. And yet, it is quite possible that UNIMARC has more potential for “FRBRization” than MARC21, at least when it comes to bibliographic relationships at the Work and Expression levels. Of course, UNIMARC does not explicitly stipulate that cataloguers can create authority records for expressions – this would require significant changes in the UNIMARC Authorities format – but at least it is possible to use field 540 in order to create direct work-to-work links. There is therefore the potential for enabling users of UNIMARC catalogues to navigate from the description of one work to the description of its derivations, parent work, and siblings. And this is one of the major features, although frequently overlooked, of FRBR.

In a broader context, there seems to be much interest in making FRBR and RDF work together. I already mentioned that one of the objectives of the FRBR/CIDOC CRM Harmonization Group is to transform the resulting harmonized ontology into an RDF declaration, but other people have analogous projects too, without any reference to CIDOC CRM. For instance, Stefan Gradmann10 advocates expressing FRBR in RDF Schema or OWL in order to implement catalogues using RDF and integrating Semantic Web ontologies. The expected benefits would be that it would enable us to dig our records out of the “hidden Web”; it would enable inferences; and it would allow libraries to benefit from the general market for Web technologies, instead of the expensive, specialized library software market. Also, Ian Davis, Richard Newman and Bruce D’Arcus expressed FRBR concepts and relations in RDF in 2005; the outcome is available in a “core” version at http://vocab.org/frbr/core, and in an extended version at http://vocab.org/frbr/extended.11

Is there a world beyond FRBR?

FRBR is not the only conceptual model in the world. CIDOC CRM, devoted to museum information, has already been mentioned.
Its concepts and structures can apply to bibliographic information as well, except for a few details. I had an opportunity to experience it as I was asked to map UNIMARC to CIDOC CRM for the C2RMF (Centre de recherche et de restauration des musées de France) in the context of the European-funded project SCULPTEUR (2002–2005) which involved, in addition to C2RMF, the Musée de Cherbourg, the Victoria and Albert Museum in London, the National Gallery in London, the Galleria degli Uffizi in Florence, and other participants. One of its objectives was to integrate information from various, heterogeneous museum databases through an ontology-driven query system. The resulting prototype, named Concept
Browser, was developed by the University of Southampton. The idea of integrating bibliographic information as well came only at a later stage. This mapping is about to be made available on the Web. There are only a very small number of UNIMARC elements that do not fit quite well in the CIDOC CRM structure. XOBIS (XML Organic Bibliographic Information Schema)12, should also be mentioned, although it is not, properly speaking, a “conceptual model,” but an XML schema that was developed in 2001–2002 by Lane Medical Library, Stanford University. The reason why I would like to mention it in the present context, at the risk of being inappropriate and of being blamed for mistaking a metadata schema for a conceptual model, is that it is based on a very original underlying conceptualization, which departs radically from the “ISBD paradigm.” Unlike BiblioML and MARCXML, it is not “just” a MARC format with XML tags, but an effort to redefine bibliographic structures, having the XML potential in mind. As (subjectively) major features of that conceptualization, I will mention its “unlimited” possibilities for bibliographic relationships, and the fact that it allows for authority control even of qualifiers within headings. 
Some other related models and initiatives include:
• <indecs> (www.indecs.org, 2000), which deals with bibliographic information, but from the producers’ and publishers’ viewpoint;
• ABC (http://metadata.net/harmony/ABCV2.htm, 2001), which is an ontology to account for changes that affect ALM materials;
• ECHO (2000–2002), which was a European-funded project aiming at developing digital library services for historical films: this project included the development of a conceptual model, based on FRBR, for metadata that describe audiovisual materials;
• MPEG-7 (2004), a standard for describing multimedia content data (it goes far beyond the needs of “classical” catalogues, since it describes the level of individual sequences within a videorecording!); there are current efforts (http://rhizomik.net/ontologies/mpeg7ontos) to transform it into an ontology for Semantic Web usages13;
• and the MarcOnt initiative (www.marcont.org, 2005), which aims to transform MARC21 into an ontology expressed in OWL.

Conclusion

A few statements
• FRBR was a groundbreaking initiative, but it was just a beginning
• FRBR will influence the International Cataloguing Principles and RDA (which are most certainly to become our future environment, at the global level)
• There is a connection with Semantic Web activities and RDF
• There is a connection with the museum community
• FRBR is not the only way to conceptualize bibliographic information
• The more we can share our bibliographic data with other types of institutions, by using common Web technologies and widely accepted tools and standards, the better it will be – and we shall be able to benefit from the data produced by other types of institutions as well.

At least, this is my humble opinion.

References

1. This presentation was drafted before the Conference actually took place in Lisbon. The number of hits has considerably increased since then…
2. The purposes listed here are paraphrased after the FRBR Final Report, p. 1–3 (“1.1. Background”).
3. http://folk.uio.no/knuthe/dok/frbr/ See also: Hegna, Knut, & Murtomaa, Eeva. Data mining MARC to find: FRBR? In: 68th IFLA General Conference and Council, August 18th–24th, Glasgow, Scotland [on line]. The Hague: International Federation of Library Associations and Institutions, 2002 [cited 16 July 2002]. Available from World Wide Web: http://www.ifla.org/IV/ifla68/papers/053-133e.pdf
4. http://www.library.rochester.edu/IN/REF/attachments/MAVUG_files/frame.htm
5. http://home.hio.no/~bagheri/Master_thesis/Music_and_FRBR.pdf
6. http://home.hio.no/~bagheri/Master_thesis/Implementing_FRBR.pdf
7. See Barbara Tillett’s presentation during this International Conference.
8. See Glenn Patton’s presentation during this International Conference.
9. RDF (Resource Description Framework) is a World Wide Web Consortium (W3C) recommendation (1999). One of the motivations for developing RDF was “to allow data to be processed outside the particular environment in which it was created, in a fashion that can work at Internet scale” (http://www.w3.org/TR/rdf-concepts/). RDF makes it possible to express metadata in a way that conforms to a given ontology, i.e., a given explicit, formalized conceptualization of a given domain. Ontologies put concepts in relation, in a way that enables automated reasoning – i.e., that allows machines to make “inferences” by linking one statement to another statement that may have been expressed elsewhere on the Web, in a distinct database. This is, among other things, what constitutes the “Semantic Web.” In that sense, the Semantic Web is often regarded as a successor to Artificial Intelligence (AI).
10. Gradmann, Stefan. “rdfs:frbr – towards an implementation model for library catalogs using semantic Web technology.” In: Le Bœuf, Patrick. ed. Functional Requirements for Bibliographic Records (FRBR): Hype, or Cure-All? [printed text]. Binghamton, NY: the Haworth Press, 2005. ISBN 0-7890-2799-2. Published simultaneously as Cataloging & Classification Quarterly, 2005, Vol. 39, No. 3–4. ISSN 0163-9374.
11. Re Ian Davis’ initiative, see also http://internetalchemy.org/2005/07/frbr-and-rdf
12. Miller, Dick R. “XOBIS – an experimental schema for unifying bibliographic and authority records.” In: Le Bœuf, Patrick. ed. Functional Requirements for Bibliographic Records (FRBR): Hype, or Cure-All? [printed text]. Binghamton, NY: the Haworth Press, 2005. ISBN 0-7890-2799-2. Published simultaneously as Cataloging & Classification Quarterly, 2005, Vol. 39, No. 3–4. ISSN 0163-9374. See also: http://medlane.info/xobis/docs/XOBIS.pdf
13. More generally speaking, there is currently an effort to develop an ontology devoted to the description of multimedia content data (audiovisual electronic files): http://www.acemedia.org/aceMedia/reference/multimedia_ontology/index.html

Extending FRBR Concepts to Authority Data

Glenn E. Patton

The year 1998 seems to have been a point of convergence for several authorities-related activities. First, the publication of the Functional Requirements for Bibliographic Records recognized “the need to extend the model at some future date to cover authority data.”1 Second, the Working Group on Minimal Level Authority Records and ISADN addressed for authority data part of what FRBR does for bibliographic data – the specification of a basic level of data to be included in authority records that are shared. Finally, there were several recommendations related to authorities that came from the International Conference on National Bibliographic Services held in Copenhagen late in 1998. In response, the IFLA Division of Bibliographic Control and the Universal Bibliographic Control and International MARC Programme appointed the IFLA Working Group on Functional Requirements and Numbering of Authority Records.

Members of the FRANAR Working Group are Françoise Bourdon (Bibliothèque nationale de France); Christina Hengel-Dittrich (Die Deutsche Bibliothek, Germany); Olga Lavrenova (Russian State Library); Andrew McEwan (The British Library); Eeva Murtomaa (Helsinki University Library, Finland); Glenn Patton (OCLC, USA); Henry Snyder (University of California, Riverside, USA); Barbara Tillett (Library of Congress, USA); Hartmut Walravens (International ISBN Agency, Germany); and Mirna Willer (National and University Library, Croatia). Françoise Bourdon served as the initial chair of the group, with Glenn Patton taking over that role in January 2002. Marie-France Plassard, UBCIM Programme Director, assisted the group until her retirement in February 2003. In October 2001, Tom Delsey (retired from the National Library of Canada) agreed to join the Working Group as a consultant, bringing the group his long experience with modelling and his service as a consultant to the FRBR Study Group.

The FRANAR Working Group agreed to three terms of reference proposed by Françoise Bourdon:
1. to define functional requirements of authority records, continuing the work that the “Functional requirements for bibliographic records” initiated
2. to study the feasibility of an International Standard Authority Data Number (ISADN), to define possible use and users, to determine for what types of authority records such an ISADN is necessary, to examine the possible structure of the number and the type of management that would be necessary
3. to serve as the official IFLA liaison to, and work with, other interested groups concerning authority files.

This paper concentrates on the first charge. Information about the Working Group’s activities in the other areas has been reported at recent IFLA conferences and is the subject of an article in a recent issue of International Cataloguing and Bibliographic Control.2

All of the Working Group’s activities have been guided by these two objectives:
• to provide an understanding of how authority files function currently
• to clarify the underlying concepts to provide a basis for refining and improving on current practice in the future.

These are similar to the FRBR model objectives of understanding why cataloguers do what they do and how the bibliographic information that is recorded as part of the cataloguing process is actually used by users of online catalogues, in order to provide a rational basis for improving the cataloguing process.

As a step toward understanding how authority data is used currently in the library context, the group has identified five functions of an authority file:

First, the authority file documents decisions made by the cataloguer when choosing the appropriate controlled access points for a new bibliographic record or when formulating new access points.

Second, information in an authority file serves as a reference tool for those same two activities, as well as providing information that can be used in distinguishing one person, corporate body or work from another. It may also serve to help the cataloguer to determine that none of the access points in the authority file is appropriate and that a new access point is needed. It can also serve a broader reference function for other library staff.

Third, the authority file can be used to control the forms of access points in bibliographic records and, in an automated environment, change those access points when the authority record itself is changed.

Fourth, an authority file supports access to bibliographic records by leading the user from the form of name as searched to the form of name used in the bibliographic file.

Finally, an authority file can be used to link bibliographic and authority files in ways that, for example, allow the conversion of data elements into languages and scripts most appropriate to the user’s needs.

The model also defines user tasks and maps the entities, attributes, and relationships to those user tasks. In considering the user tasks, Working Group members first defined two groups of users:
• authority record creators and reference librarians who create, maintain and use authority files directly
• library patrons who use authority information either through direct access to authority files or indirectly through the controlled access points (i.e., authorized forms and references) in library catalogues, national bibliographies, etc.

The group has also defined a list of User Tasks. These are related to the FRBR user tasks but are specific to what cataloguers do in working with authority data. The first three tasks relate to both groups of users while the fourth task relates solely to the first group of users.

Find: Find an entity or set of entities corresponding to stated criteria (i.e., to find either a single entity or a set of entities using an attribute or relationship of the entity as the search criteria).

Identify: Identify an entity (i.e., to confirm that the entity represented corresponds to the entity sought, to distinguish between two or more entities with similar characteristics).

Contextualize: Place a person, corporate body, work, etc. in context; clarify the relationship between two or more persons, corporate bodies, works, etc.; or clarify the relationship between a person, corporate body, etc. and a name by which that person, corporate body, etc. is known.

Justify: Document the authority record creator’s reason for choosing the name or form of name on which a controlled access point is based.


The fundamental basis for the conceptual model of authority data is very simple: Entities in the bibliographic universe (such as those identified in the Functional Requirements for Bibliographic Records) are known by names and/or identifiers. In the cataloguing process (whether it happens in libraries, museums or archives), those names and identifiers are used as the basis for constructing controlled access points (see Figure 1).

[Figure 1 diagram: Bibliographic Entities are “known by” Names and/or Identifiers, which are the “basis for” Controlled Access Points.]

Figure 1: The Fundamental Basis for a Model of Authority Data

It is crucial to emphasize this simple view before moving on to more complex representations of the model since, in comments received during the recent worldwide review of the draft of the model, it became clear that this fundamental basis needed to be more clearly explained. The conceptual model for authority data developed by the Working Group is depicted in Figure 2. [3]


Figure 2: Conceptual Model for Authority Data

Depicted in the upper half of the diagram are the entities on which authority records are focused (that is, the ten entities defined in Functional Requirements for Bibliographic Records (FRBR) – person, corporate body, work, expression, manifestation, item, concept, object, event, and place – plus one additional entity – family, which came out of our involvement with the archival community). [4] The lower half of the diagram depicts the names by which those entities are known, the identifiers assigned to the entities, and the controlled access points based on those names and identifiers that are registered in authority files. The diagram also highlights two entities that are instrumental in determining the content and form of access points – rules and agency.

The relationships depicted in the diagram reflect the inherent associations between the various entity types. The lines and arrows connecting the entities in the upper half of the diagram with those in the lower half represent the relationships between name and identifier and the bibliographic entities with which they are associated (person, family, corporate body, work, expression, manifestation, item, concept, object, event, and place). A specific instance of any of those bibliographic entities may be “known by” one or more names, and conversely any name may be associated with one or more specific instances of any of the bibliographic entities. Similarly, a specific instance of any one of the bibliographic entities may be “assigned” one or more identifiers, but an identifier may be assigned to only one specific instance of a bibliographic entity.

The relationships depicted in the lower half of the diagram represent the associations between the entities name and identifier and the formal or structural entity controlled access point, and the association between that entity and the entities rules and agency. A specific name or identifier may be the “basis for” a controlled access point, and conversely a controlled access point may be based on a name or identifier. A controlled access point may also be based on a combination of two names and/or identifiers, as in the case of a name/title access point representing a work that combines the name of the author with the name (i.e., the title) of the work.

Controlled access points may be “governed by” rules, and those rules in turn may be “applied by” one or more agencies. Likewise, access points may be “created by”, or “modified by” one or more agencies.

It should be emphasized that the Working Group is consciously using the more general term controlled access point, rather than specific terms such as authorized form of name and variant form of name, which might more traditionally be used to describe data elements found in an authority record. The Working Group agreed to this terminology in recognition of authority files in which all forms of name recorded in the authority record are treated as a cluster with none of the forms being designated as an authorized form of name.
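The cardinalities described above (an entity may be known by several names, a name may belong to several entities, but an identifier may be assigned to only one entity) can be sketched in a few lines of Python. This is a minimal illustration and not part of the FRAR draft: the class and method names (Entity, AuthorityModel, add_name, assign_identifier, name_title_access_point) and the identifier value "ID-0001" are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A bibliographic entity: person, family, corporate body, work, etc."""
    kind: str
    label: str


@dataclass
class AuthorityModel:
    # name -> set of entities "known by" that name (many-to-many)
    known_by: dict = field(default_factory=dict)
    # identifier -> the single entity it is "assigned" to
    assigned: dict = field(default_factory=dict)

    def add_name(self, entity, name):
        self.known_by.setdefault(name, set()).add(entity)

    def assign_identifier(self, entity, identifier):
        current = self.assigned.get(identifier)
        if current is not None and current != entity:
            # An identifier may be assigned to only one entity instance.
            raise ValueError(f"{identifier!r} already assigned to {current.label}")
        self.assigned[identifier] = entity

    def names_of(self, entity):
        return sorted(n for n, ents in self.known_by.items() if entity in ents)


def name_title_access_point(author_name, title):
    """Combine two names (author and title) into one access point string."""
    return f"{author_name}. {title}"


model = AuthorityModel()
orwell = Entity("person", "George Orwell")
model.add_name(orwell, "Orwell, George")
model.add_name(orwell, "Blair, Eric Arthur")
model.assign_identifier(orwell, "ID-0001")  # hypothetical identifier
print(model.names_of(orwell))
```

Rules and agencies are left out of the sketch; they would attach to the access point rather than to the name or identifier.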


To relate the more general form of the model to one that is aligned more closely with traditional library authority files and to the IFLA Guidelines for Authority Records and References and UNIMARC Authorities, the group will include a pair of diagrams (and accompanying text) as an Appendix. The first diagram (Figure 3A) is the equivalent of the upper portion of the more general model.

Figure 3A: Entity Names and Identifiers


The second diagram (Figure 3B) expands the lower portion of the more general model and focuses on the formal or structural entities that come into play when a name or identifier is used to formulate an access point and the access point is subsequently registered in an authority file as an authorized heading or a variant heading in an authority record or reference record, or as an explanatory heading in a general explanatory record. Also included in this second diagram are the two entities that are instrumental in determining the content and form of headings, references, and records—rules and agency.

Figure 3B: Access Points and Authority Records in a Library Context


As in the FRBR model, the Working Group has defined a set of attributes for each of the entities. Some of the attributes are carried over from the FRBR model; others represent characteristics of the entity that specifically relate to authority data. Here is a sample of three of the entities with some of their attributes.

Entity: Definition and Selected Attributes

Person
Definition: An individual or a persona established or adopted by an individual or group. [FRBR, modified]
Attributes:
• Dates of person
• Title of person
• Place of birth
• Language of person
• Field of activity
[etc.]

Name
Definition: A character or group of words and/or characters by which an entity is known. [FRBR, modified]
Attributes:
• Type of name
• Scope of usage
• Dates of usage
• Language of name
• Script of name
• Transliteration scheme of name
[etc.]

Controlled access point
Definition: A name, term, code, etc., under which a bibliographic or authority record or reference will be found. [GARR [5], modified]
Attributes:
• Type of access point
• Language of cataloguing
• Script of cataloguing
• Source of access point
[etc.]

The model also describes the relationships that are expressed in authority data between the various entities. Here are some examples of relationships, each with an example expressed using the terminology and style of Guidelines for Authority Records and References.

Entity Type: Selected Relationships and Examples

Work → Work
Relationships:
• Successor relationship
• Adaptation relationship
• Transformation relationship
• Whole/part relationship
[etc.]
Example of a transformation relationship:
Authorized heading: Poe, Edgar Allan, 1809-1849. [Fall of the house of Usher]
See also reference: For a musical composition based on this work search under: >> Debussy, Claude, 1862-1918. Chute de la maison Usher

Person → Name
Relationships:
• Real name relationship
• Pseudonym relationship
• Married name relationship
• Secular name relationship
[etc.]
Example of a real name relationship:
Authorized heading: Orwell, George
See reference tracing: < Blair, Eric Arthur [real name]

Access point → Access point
Relationships:
• Parallel language relationship
• Alternate script relationship
• Different rules relationship
[etc.]
Example of an alternate script relationship:
Access point: Gogol, Nikolaï Vasilievitch
Access point in an alternate script: Гоголь, Николай Васильевич

A draft of Functional Requirements for Authority Records (FRAR) was made available for worldwide review on IFLANET from July through October 2005. The FRANAR Working Group received comments from 12 individuals and 13 institutions (including 6 national libraries and 3 national-level cataloguing committees). The comments received were compiled into a comments log which totalled 145 pages. Seven members of the Working Group met at the Koninklijke Bibliotheek, The Hague, Netherlands, in December 2005, to consider these comments and to start revising the draft to reflect decisions made in response to the comments. The group was able to deal with about two-thirds of the comments during the meeting and is currently conducting a series of conference calls to complete discussions of the remaining comments. The revised draft will then be presented to the sections of the IFLA Division of Bibliographic Control for approval. As part of these revisions, the Working Group is considering changing the name of the document to Functional Requirements of Authority Data (FRAD).

Next, the Working Group must return to the issue of numbering before completing its work. It is the group’s intention to produce a separate document on this issue. That document, and the discussion that will lead to its preparation, will be informed by the recent decision of ISO/TC46/SC9 to begin work on an International Standard Party Identifier as a continuation of the and Interparty projects.

It has also become clear during the Working Group’s discussions that, as a result of the analysis undertaken to create the model, revisions to some existing IFLA publications may be necessary.
Thus far, the group has identified Guidelines for Authority Records and References, Mandatory Data Elements for Internationally Shared Resource Authority Records, and the UNIMARC Manual – Authorities Format, as well as FRBR itself, and there may be others for which changes will be recommended.

The FRANAR Working Group got its start because the FRBR Study Group recognized the need to extend that model to cover authority data. Just as FRBR has changed how we think about bibliographic data, the Working Group’s hope is that our work will bring a clearer understanding of authority data and its relationships to the catalogue.

References

1. Functional Requirements for Bibliographic Records: final report / IFLA Study group on the Functional Requirements for Bibliographic Records. München : K.G. Saur, 1998, p. 5.

2. For more information about the other aspects of the Working Group’s charge, see: Glenn E. Patton, “FRAR: Extending FRBR Concepts to Authority Data,” International Cataloguing and Bibliographic Control, 2006, 35 (2), pp. 41–45.

3. For further discussion of previous versions of the FRAR entity relationship model, see: Glenn E. Patton, “FRANAR: A Conceptual Model for Authority Data,” Cataloging & Classification Quarterly, 2004, 38 (3/4), pp. 91–104, and Glenn E. Patton, “Extending FRBR to Authorities,” Cataloging & Classification Quarterly, 2005, 39 (3/4), pp. 39–48.

4. The description of the entity-relationship models is adapted from text prepared for the Working Group by Tom Delsey.

5. Guidelines for Authority Records and References / revised by the Working Group on GARE Revision. Second edition. München : K.G. Saur, 2001.


IFLA UBCM Working Group on FRANAR Recommendations for Potential Changes in the UNIMARC Authorities Format

Mirna Willer

Introduction

In the process of developing a conceptual model for the Functional Requirements for Authority Records (FRAR) [1], the IFLA UBCIM Working Group on FRANAR examined data elements and structures defined in the following IFLA documents: FRBR: Functional Requirements for Bibliographic Records (1998); GARR: Guidelines for Authority Records and References (2nd edition, 2001); UNIMARC Manual: Authorities Format (2nd edition, 2001) [2]; and Mandatory Data Elements for Internationally Shared Resource Authority Records: Report of the IFLA UBCIM Working Group on Minimal Level Authority Records and ISADN (1998). The Working Group subsequently identified fields for potential changes in those documents.

Recommendations for potential changes in the UNIMARC Authorities format include:
1) the need to survey the relevance of reference entry records;
2) the analysis of the equivalence of the “Entry element” in all the headings fields to the “Base access point” attribute defined for Access points in FRAR;
3) further clarification of the 101 Language of the Entity field in relation to the FRAR attribute of the Work;
4) consideration of attributes regarding the 102 Nationality of the Entity field;
5) the need for a subfield or another indicator for Type of family in the 220 Family name field; and
6) the need to expand coded values to cover different cataloguing rules in the 7XX or 4XX heading fields, and subfield $2 Subject System Code.

The IFLA Permanent UNIMARC Committee (PUC) reviewed these proposals at its annual meeting on 22–23 March 2006 [3].

Reference entry records

The UNIMARC/Authorities format specifies that reference entry records are made only in cases when the authority entry record would not be suitable to convey the complex relationship information between the uniform or authorized heading and variant headings, i.e., would not adequately generate information from see reference tracings in authority entry records. It was noted in the practice of building authority files that in such cases libraries use authority entry records, and not reference entry records. Therefore, the FRANAR Working Group’s recommendation is to consider whether reference entry records are needed at all. The reference information to the headings, the relation between uniform and variant headings, and the notes can be included in the authority entry record. A survey should be performed together with GARR and MARC 21.

The PUC accepted the recommendation.

The Entry element and the Base access point

The FRANAR Working Group raised the issue of whether the Entry element as defined in UNIMARC heading fields is equivalent to the Base access point attribute defined for Access points in FRAR. According to FRAR, the Base access point includes:
• the name element in an access point beginning with the name of a person, family, or corporate body
• the title of the person element in an access point beginning with the title of nobility or ecclesiastical title of a person
• the title element in an access point for a work or expression
• a term designating form at the beginning of an access point for a musical work or expression (e.g., Symphony, Concerto).

In UNIMARC, the Base heading is defined as the opposite of the Qualifier. The function of this distinction is to enable the coding of the language of heading elements by the subfield $8 Language of cataloguing and language of the base heading, which can be added to any of the heading fields. The aim of the introduction of this data element was to meet the needs of multilingual and multiscript authority files for controlling, managing and displaying headings. The concept was introduced in the second edition of the format following the recommendation of the Working Group on Minimal Level Authority Records and ISADN.

The issue of defining what makes a Base heading and a Qualifier can best be illustrated by an example.
In the example from the Bibliothèque Nationale de France

200#1$8freita$aNicolini da Sabio$bDomenico$f15..-160.?$cimprimeur-libraire

the base heading is $aNicolini da Sabio$bDomenico, and the base heading is in the original language, i.e., Italian, according to the AFNOR rules. However, since the language of cataloguing is French, the qualifier, i.e., the data in subfield $c, is expressed in French. One should pay attention to the codes in subfield $8. The question here, however, is: what UNIMARC data elements make up the Base heading in each particular heading type? Or, what can be defined as a Qualifier in different types of headings? Additional examples follow.

Ex 1: Personal Name

200#0$aJohn$dII Comnenus$cEmperor of the East

The definitions of the 200 Personal name subfields $c Additions to names other than dates and $d Roman numerals are, for this purpose, not distinctive enough. The definition of $d is: “if an epithet (or a further forename) is associated with a numeration, this too should be included”. Thus, the question related to the intended function of this distinction is: can $d be defined as a qualifier in addition to $c, which itself includes “titles, epithets or indication of office,” apart from “any additions to names”? And is $d a qualifier at all? The use of language forms in subfields $a and $d in this example is particularly confusing.

Ex 2: Place Access

260##$aItalija$dVenezia

The name of the country in subfield $a is in Croatian, while the name of the city in $d is in Italian, according to the Croatian cataloguing rules.

260##$aItalia$dVenezia

The name of the country in subfield $a is in Italian, as is the name of the city in $d, according to different cataloguing rules intended for users speaking different languages. The same problem can be recognized in the use of subfields for state and county in the same field. Thus, what is the Base heading and what is the Qualifier in this case? Is this question at all relevant for this type of heading?

The FRANAR Working Group’s recommendation is that UNIMARC has to define what data elements make a “qualification” for each particular type of heading.
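The base heading/qualifier question can be made concrete with a small sketch that reads a heading field written in the notation of the examples above (three-character tag, two indicators with "#" for blank, subfields introduced by "$"). The function names and the base_codes parameter are hypothetical and are not part of UNIMARC; the set of subfields that counts as the base heading is passed in by the caller precisely because the format does not yet define it. The sketch also assumes that "$" never occurs inside subfield data.

```python
def parse_field(text):
    """Split 'TTTii$xdata$ydata...' into tag, indicators and subfields."""
    tag, indicators, rest = text[:3], text[3:5], text[5:]
    # Naive split: assumes "$" is only ever a subfield delimiter.
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return tag, indicators, subfields


def describe_heading(text, base_codes):
    """Separate base heading from qualifiers for a caller-supplied code set."""
    tag, indicators, subfields = parse_field(text)
    # $8 carries two three-letter codes: language of cataloguing, then
    # language of the base heading.
    langs = next((value for code, value in subfields if code == "8"), "")
    return {
        "tag": tag,
        "language_of_cataloguing": langs[:3],
        "language_of_base_heading": langs[3:6],
        "base_heading": [v for c, v in subfields if c in base_codes],
        "qualifiers": [v for c, v in subfields
                       if c != "8" and c not in base_codes],
    }


bnf = "200#1$8freita$aNicolini da Sabio$bDomenico$f15..-160.?$cimprimeur-libraire"
print(describe_heading(bnf, base_codes={"a", "b"}))

# Ex 1 under two readings: is $d part of the base heading or a qualifier?
john = "200#0$aJohn$dII Comnenus$cEmperor of the East"
print(describe_heading(john, base_codes={"a"}))
print(describe_heading(john, base_codes={"a", "d"}))
```

The two calls on Ex 1 give different base headings for the same data, which is the ambiguity the recommendation asks UNIMARC to resolve field by field.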


The PUC recognized the need to add explanations of base headings, with definitions of “base heading” and “qualifier”, in each relevant field introduction.

Language of the Entity and Original Language Attribute of Work Entity

The FRANAR Working Group recognized the need for further clarification that 101 Language of the Entity is the same as the FRAR’s Original Language Attribute of the Work Entity. The definition of the 101 field is as follows: “This field contains coded information relating to the language or languages used by the entity identified in 2--. The entity may be an author (i.e., a person, a family, a corporate body) or a work.” Other types of entities as defined in UNIMARC Record Label, Type of Entity (ch. p. 9), like territorial or geographic name, trademark and topical subject, are considered not relevant for this attribute. The definition of the subfield $a Language used by the entity is more precise in stating that it contains “the language in which the author expresses him/herself or the original language of a work”.

FRAR defines the language as an attribute of:
• a person: a language the person uses when writing for publication, broadcasting, etc.
• a corporate body: a language that the corporate body uses in its communications
• a work: the language in which the work was first expressed
• an expression: a language in which the work is expressed; it includes the language(s) of the expression as a whole and of individual components of the expression
• a name: the language in which the name is expressed
• an access point: the language of the base access point and the language of cataloguing.

It can be concluded that both documents approach the issue of language in the same manner as regards coding, or defining an attribute of, the work entity. The language attribute of the expression entity, however, is not dealt with in UNIMARC field 101, but is defined in subfield $m Language (when part of a heading) in Uniform title fields.

Another conclusion of this analysis is that the language attribute of a family should be added to the FRAR document.


The PUC agreed that the definition of the Language of the Entity field needs to be further refined considering the following cases: when the record is a Work uniform title record, it [i.e., Language of the Entity] refers to the language of the Work (being the entity identified in 2--); when the record is an Expression level record, it refers to the language of the Expression (being the entity identified in 2--). In addition, UNIMARC needs to be able to code the distinction between Work and Expression either in the Record Label or in the field 154 Coded Data Field: Uniform Title.

Nationality of the Entity

The FRANAR Working Group’s proposal is to analyse the place attributes of the FRAR document with regard to this type of data in field 102, Nationality of the Entity. UNIMARC field 102 Nationality of the Entity contains coded information representing the country of which “the person or a family is a national or citizen where the corporate body or the trademark is headquartered, or where the work is composed”.

FRAR defines the place as an attribute of:
• a person: place of birth, place of death, country, place of residence
• a family: places associated with family
• a corporate body: place associated with the corporate body
• a work: place of origin of the work
• a manifestation: place of publication/distribution.

Both documents deal with nationality (UNIMARC), i.e., country (FRAR), and citizenship (UNIMARC), i.e., place of residence (FRAR), although some objection could be raised as to what extent these concepts cover the same meaning. The difference in the treatment of this type of data lies in its representation. Namely, in UNIMARC the code represents the nationality, which corresponds to FRAR’s attributes country and place of residence, although how this data is to be represented (coded or in textual form) is not an issue for that document. Other attributes defined in FRAR, like place of birth and death, are recorded in UNIMARC in the note fields if required by cataloguing rules. The FRANAR Working Group recommends that the PUC consider coding of these attributes.

The PUC accepted the recommendation as it conforms to the principle that details need to be expressed at both the coding level and the notes level.

Family name

The FRANAR Working Group noted that a subfield or other indicator for the Type of family may be needed. UNIMARC field 220 defines only two subfields for recording data elements for this type of heading, i.e., $a Entry element and $f Dates. FRAR’s attributes of a Family entity are the following:
• type of family: includes terms such as clan, dynasty, etc.
• dates of family: dates associated with the family
• places associated with family: information pertaining to places where the family resided or had some connection
• history of family: information pertaining to the history of the family.

UNIMARC has fields corresponding to the latter two FRAR attributes: places associated with family are recorded in field 102, and history of family in the 340 Biography and Activity Note field. However, the recommendation is to introduce a new subfield as a kind of “qualification” of the heading, to match the type of family attribute. The following examples show that this recommendation can be accommodated by the addition of a new subfield, e.g., $c Type of Family:

220##$aShah$cdynasty$f1768
220##$aBuchanan$cclan$xHistory$yScotland

The PUC accepted the recommendation and the proposal described here.

Different cataloguing rules heading

The FRANAR Working Group recognized the need to code the access point, i.e., heading, according to different cataloguing rules. The reason for this recommendation is the necessity to record in the same authority file the form of the authorised heading for the same entity formulated according to different rules, as defined by GARR, or formulated for particular user groups. The proposal is to consider the addition of a code for cataloguing rules to the $2 System Code subfield, and to expand the field’s name. Additionally, the name of the format could be coded, as there could also be a need for recording it.

UNIMARC defines the 7-- Linking heading block for recording parallel or alternative script forms of the heading in the 2-- block, which also serves as a link to a separate record in which the 7-- heading is the heading in the 2-- block.

The recommendation, however, also has a bearing on other factors that influence the treatment of parallel information, i.e., how parallel information is treated by different cataloguing rules [4], and whether the bibliographic identity described under different rules is the same one. If it is, then it should be treated in 7-- Linking Heading Block fields; if not, in the 5-- See Also Reference Tracing Block.

The PUC discussed the recommendation along the following lines: currently, the 7-- block is used only for parallel or alternative script forms. Therefore, should the use of the 7-- block be expanded for headings that follow different cataloguing rules, or should the 5-- block be used instead? Should the definition of subfield $2 be expanded beyond its use for subject system codes? It was decided that an analysis of the two heading block alternatives was needed for further discussion.

Relationships between different entities

The FRANAR Working Group recommended that the PUC consider expressing relationships between entities of different types. FRAR defines the following relationships between entities of different types:
• Person to Family: membership relationship – the relationship between a person and a family of which the person is a member
• Person to Corporate Body: membership relationship – the relationship between a person and a corporate body in which the person is a member or with which the person is affiliated
• Person/Family/Corporate Body to Work: creation relationship (this is still under discussion within the FRANAR Working Group).

The proposal is to add a new code for Relationship code in subfield $5 Tracing Control of the 5-- See Also Reference Tracing block fields. The PUC decided to consider this recommendation together with the one mentioned above.

The PUC recognized that the UNIMARC Authorities Format will change radically following the discussions held at its 2006 PUC meeting, along the lines of the FRANAR Working Group report.

References

1. Functional Requirements for Authority Records: A Conceptual Model: Draft 2005-06-15 / IFLA UBCIM Working Group on Functional Requirements and Numbering of Authority Records (FRANAR) [Cited: 2006-07-02]. Accessed at: http://www.ifla.org/VII/d4/FRANAR-Conceptual-M-Draft-e.pdf

2. UNIMARC Manual: Authorities Format. 2nd revised and enlarged ed. München: Saur, 2001. Also available at: http://www.ifla.org/VI/8/projects/UNIMARC-AuthoritiesFormat.pdf

3. IFLA Permanent UNIMARC Committee. Minutes of the 17th Meeting of the Permanent UNIMARC Committee, 2006 March 22–24, National Library of Portugal, Lisbon. See: Jay Weitz, “The 17th Meeting of the Permanent UNIMARC Committee,” International Cataloguing and Bibliographic Control, 2006, 35 (3), pp. 63–64.

4. See discussion on this issue in: Willer, Mirna. UNIMARC Format for Authority Records: Its Scope and Issues for Authority Control. // Authority Control in Organizing and Accessing Information : Definition and International Experience / Arlene G. Taylor, Barbara B. Tillett editors. New York etc.: The Haworth Press, 2004, pp. 163–164.


SESSION 2 MARC PORTABILITY AND REUSE IN THE OPEN WEB ENVIRONMENT

Chair: Maria Inês Cordeiro Rapporteur: Joaquim Ramos de Carvalho

BiblioML and AuthoritiesML
An XML Application for Bibliographic and Authority Data Records, Based on the UNIMARC Bibliographic and Authorities Formats

Michel Bottin

Introduction

BiblioML [1] (and its companion AuthoritiesML) are full XML applications as defined by the World Wide Web Consortium. As such, documents that conform to these schemas can be processed by any XML tool. Unicode encoding, together with the Character Representation Model [2] of the W3C, is used, allowing bibliographic data to be represented in any language.

The formats and their constraints are defined by a Document Type Definition (DTD) and an XML Schema. They use the UNIMARC format as the model of data analysis. They retain all the atomic elements of UNIMARC but organize them in a more accurate way, without the Procrustean bed of the two-level field/subfield representation. Their recursive nature allows a full deployment of all the levels necessary to represent the information in the most precise way. They also allow the link to, or the inclusion of, associated data such as complex text, other data representations, images, sounds, videograms, and so on, directly in the format or by the use of the XML namespace specification.

BiblioML is an open format released under the terms of the GNU General Public License (GPL). In this way, everyone can contribute to its development or the development of associated tools.

Origin

The creation of BiblioML was undertaken on the initiative of the French Ministry of Culture and Communication – Research and Technology Mission and carried out by Martin Sévigny of the AJLSM firm. The basic idea was to offer a rich and accurate bibliographic description for inclusion in other XML documents such as descriptions of heritage sites, artist files, excavation reports, etc. The project is now hosted by the Association for Digital Documentation in XML [3].

Examples

As an example of a BiblioML document, here is the equivalent of Example 1 given in Annex L of the UNIMARC Manual: Bibliographic Format 1994:

[BiblioML record: the XML markup did not survive extraction. The element content follows in document order.]

DE GyFmDB
FR ADNX
3515023550 kart. DM46.0 0
88,A22,0260
76,N46,0054
The phonology of Old High German e. Veroff. in Verbindung mit d. Forschungsinst. fur Dt. Sprache, Dt. Sprachatlas, Marburg, Lahn
by Joseph B. Voyles (Skizzen u. Sonderzeichen: Hans-Jurgen Jenkel. Kt.: Margot Schrey)
Voyles, Joseph B.
Wiesbaden Steiner
1976
1976
XII, 323S. 1 Kt.
24cm
Zeitschrift fur Dialektologie und Linguistik / Beihefte
Literaturverz. S. 321-323
Phonologie
Althochdeutsche o. a. Sprache
Zeitschrift fur Dialektologie und Linguistik / Beihefte

Example of an AuthoritiesML record:

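[AuthoritiesML record: the XML markup did not survive extraction. The element content follows in document order.]

FR ADNX
Достоевский Федор Михайлович 1821-1881 ru
Dostoevskij Fedor Mihajlovič 1821-1881 en
Russian novelist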

The DTDs or XML Schema of the two formats can be freely downloaded from the BiblioML site [4].

Tools

Some useful tools are available or in progress [5]:
• a Java program to convert UNIMARC records directly into BiblioML or AuthoritiesML
• an XSLT stylesheet to transform the output of a translation of a UNIMARC record by MARCXML into BiblioML
• a set of XSLT stylesheets to transform BiblioML documents into a corresponding ISBD representation
• an XSLT stylesheet to transform ISBD records into BiblioML
• another XSLT stylesheet to transform CDS-ISIS records into BiblioML
• many other stylesheets to translate different table data into BiblioML
• an XSLT stylesheet to transform BiblioML documents to a Dublin Core presentation, for example in order to answer an OAI request
• decoding tables giving literal labels in English and French for UNIMARC codes used in BiblioML and AuthoritiesML.

Future developments

The main tasks to be performed are:
• checking the conformity of the extensions of BiblioML (table of contents, indexes, etc.) with the new extensions of UNIMARC and the possible revision of ISBDs and cataloguing rules
• aligning the formats with standard XML practice, for example replacing the ISO Language attribute by the standard xml:lang
• perhaps adding some elements in order to be fully compatible with the Dublin Core scheme, if that is considered more practical than including DC data through namespaces.

This is an open project developed on a cooperative basis and everyone is invited to participate in its development and dissemination.

References

1. http://90plan.ovh.net/~adnx/biblioml/

2. http://www.w3.org/TR/charmod/

3. In French: Association pour la Documentation Numérique en XML (ADNX), http://adnx.org

4. http://90plan.ovh.net/~adnx/biblioml/doku.php?id=en:biblioml051

5. You can look at the progress of these tools at http://90plan.ovh.net/~adnx/biblioml/doku.php?id=en:converters


UNIMARC’s Embedded Fields and MarcXchange: Unexpected Scenarios

Vladimir Skvortsov

XML Slim will be discussed in the light of ISO 25577 DIS, which is currently under development. The standard is intended to be an analogue of the ISO 2709 container in an XML environment. The approach the developers chose, i.e. direct emulation of ISO 2709 in XML, seems obvious and promises no surprises. But, as the title of the paper indicates, it is going to deal with an unexpected scenario, or, to be more precise, with at least two unexpected scenarios, the consequences of which are difficult to predict right now.

First scenario – embedded fields

UNIMARC provides for the use of special constructions, which are called embedded fields. An example of such a construction follows:

451#0$1001BLN6956090$12001#$aPrefaces to the experiences of literature $1210##$aNew York$cHarcourt Brace Jovanovich$d1979

Embedded fields are placed in the ISO 2709 container exactly as standard fields. In this example, we have a standard field 451 whose only subfield is the repeatable subfield $1; all the data included in the $1 subfields are thus the object of semantic analysis inside automated library systems and have no bearing on the syntax of ISO 2709. It seems natural to suppose that, when the syntax of ISO 2709 is translated to that of XML, the situation should be the same, without any unexpected consequences. The following example (reproduced with the kind permission of Tony Curwen, who sent it to me along with two questions related to the example) shows how embedded fields would look in an XML SLIM schema.

001BLN6956090 2001 $aPrefaces to the experiences of literature 210 $aNew York$cHarcourt Brace Jovanovich$d1979


What is there to identify 001, 200 and 210 as tags, and the following digits or spaces as indicator values? I also note (Tony Curwen wrote further) that section 4 refers to IS2, Field separator, and IS3, Record separator, but makes no mention of IS1, Subfield delimiter.

These questions emphasize that certain doubts concerning the approach taken by ISO Technical Committee 46, namely the direct transfer of the ISO 2709 syntax to that of XML, are not groundless. The example illustrates both scenarios that are the subject of this paper. The first question will be dealt with first:

• What is there to identify 001, 200 and 210 as tags, and the following digits or spaces as indicator values?

As with ISO 2709, we have sufficient data for a semantic analysis: subfield $1 shows that the next 7 character positions (except for field 001) should be treated respectively as the tag of the field (3 characters), the indicators (2 characters) and the subfield delimiter (2 characters), followed by the data itself. But if one tries to answer the question in terms of XML syntax, the answer is quite simple: nothing.

So what is the problem? Nothing changes; everything is as it was before. But then why change from ISO 2709 to XML? Unlike ISO 2709, where data loading and indexing were executed by internal tools of the automated system, XML provides standard and (this is important) external means for these operations. Thus a system working in XML might ideally need no special tools of its own for processing data during loading and indexing. But if one works with embedded fields, XML gives only a partial solution, and one has to consider purchasing or developing special software for the semantic analysis that is then necessary for data loading and indexing. Besides, a system working in XML would have to treat the links specified in the embedded fields correctly.
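The semantic analysis in question can be sketched in a few lines. What follows is a minimal illustration in Python, not part of any standard or of the schemas under discussion: the class and function names are hypothetical, '$' stands in for the IS1 delimiter as in the printed examples, '#' denotes a blank indicator, and the simplifying assumption is made that '$1' never occurs inside the data of an embedded field.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddedField:
    tag: str                                        # 3 characters
    indicators: str = ""                            # 2 characters, empty for 00- fields
    subfields: list = field(default_factory=list)   # (code, value) pairs
    data: str = ""                                  # used by control fields only


def split_subfields(text, delim="$"):
    """Split '$aFoo$bBar' into [('a', 'Foo'), ('b', 'Bar')]."""
    parts = [p for p in text.split(delim) if p]
    return [(p[0], p[1:]) for p in parts]


def parse_embedded(field_body, delim="$"):
    """Interpret the $1 subfields of a linking field as embedded fields."""
    result = []
    # Every embedded field is introduced by subfield $1.
    for chunk in field_body.split(delim + "1")[1:]:
        chunk = chunk.strip()
        tag = chunk[:3]
        if tag.startswith("00"):
            # Control fields have neither indicators nor subfields.
            result.append(EmbeddedField(tag, data=chunk[3:]))
        else:
            result.append(
                EmbeddedField(tag, chunk[3:5], split_subfields(chunk[5:], delim))
            )
    return result


body = ("$1001BLN6956090$12001#$aPrefaces to the experiences of literature "
        "$1210##$aNew York$cHarcourt Brace Jovanovich$d1979")
for f in parse_embedded(body):
    print(f.tag, repr(f.indicators), f.subfields or f.data)
```

None of this knowledge (three characters of tag, two of indicators, a delimiter plus a code) is visible to an XML parser when the content of $1 is carried as plain text; it has to live in application code of this kind.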
It should be noted that, with embedded fields, the tools needed to make this work correctly are necessarily non-standard ones, since the basic question under discussion is precisely whether such tools should be introduced into the standard or not. So far the standard does not include them.

This means, in particular, that no standard automated system can be expected to support these tools. And, as already mentioned, if a library, a library community, or even a whole country works with embedded fields, that is its own problem. Under such circumstances the very technique of embedded fields has little chance of survival, and this is quite a probable consequence of the decision that is close to being approved today as ISO 25577.

Well, does one really need embedded fields as such? It seems to us that there are those who do not use them and probably consider them unnecessary, but also those who have realized in practice the opportunities that embedded fields offer; the latter would hardly want to abandon them. In addition, rejecting embedded fields would create huge financial and technological problems for those who have worked with them.

To conclude the first scenario, the above considerations can be summarized as follows. A direct and immediate transfer of the ISO 2709 syntax to that of XML would certainly provide a guarantee, namely the possibility of converting the two transport schemas into each other. But it would also mean that no progress is made in the design of containers. From our point of view, the question of embedded fields is in fact a question of linking technique, and this is something ISO 2709 was definitely not designed to handle. Consequently, XML Slim must be a new step in the development of transport schemas, and not only in the sense of a change of language. On the other hand, this step should not be taken in such a way that the schemas lose their “round-trip ability”, which could really happen.

Second scenario – subfield delimiters

This was the second question from Tony Curwen. If one looks at his example again, one can see that in embedded fields the subfield delimiters are contained in the body of the XML document. The dollar symbol ‘$’ is used only for the graphic representation of records.
As for a transport schema, the hexadecimal code 1F (or IS1) is used as the subfield delimiter. If one looks at the W3C Recommendation regarding the characters that may be used in a text string of any XML document, one finds:

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
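The production can be checked mechanically. The short Python sketch below (function names are ours, chosen for illustration) tests a code point against the ranges of the Char production and, as a cross-check, asks the XML parser of the Python standard library whether it accepts the IS1 character as element content.

```python
import xml.etree.ElementTree as ET


def is_xml_char(cp):
    """True if code point cp matches the XML 1.0 Char production."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def parses(text):
    """True if the parser accepts text as the content of an element."""
    try:
        ET.fromstring("<subfield>" + text + "</subfield>")
        return True
    except ET.ParseError:
        return False


IS1 = "\x1f"  # the ISO 2709 subfield delimiter

print(is_xml_char(ord(IS1)))        # IS1 lies just below #x20
print(is_xml_char(ord("$")))        # the printed stand-in is an ordinary character
print(parses("a" + IS1 + "bTitle")) # raw IS1 inside element content
```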

One can see that the code 1F falls immediately before the range of acceptable codes [#x20-#xD7FF], and so it is not among the characters that the W3C allows in the body of XML documents. This means, in particular, that using embedded fields in XML is at present completely unacceptable.

All the above was supposed to provoke the interest and anxiety of all those who work with embedded fields. It is hoped that the Permanent UNIMARC Committee and the UNIMARC Core Activity as a whole will feel the same. The rest of our respected audience might have sufficient grounds to feel like detached onlookers of these scenarios. To change that and involve them in the second scenario, a field will be mentioned which I believe is sometimes forgotten even by its own creators. In terms of structure this field differs from the 4-- block fields of UNIMARC discussed above, but it is an embedded field by its very nature. The field is present both in UNIMARC and in MARC21 (in fact it came to UNIMARC from MARC21): it is 886 DATA NOT CONVERTED FROM SOURCE FORMAT (UNIMARC) and 886 Foreign MARC Information Field (MARC21). This is what the field looks like in the UNIMARC and MARC21 formats:

8862#$2ukmarc$a083$b00$aRussia. Education$b- Biographies – Collections (Example from UNIMARC Manual)

8862b≠2ukmarc≠a690≠b00≠a00030≠dGreat Britain≠z11030≠abutterflies≠z21030≠alife cycles (Example from USMARC Manual)

So, in the XML Slim Schema it should look like this:

ukmarc 083 00$aRussia. Education$b- Biographies – Collections

ukmarc 690 00≠a00030≠dGreat Britain≠z11030≠abutterflies≠z21030≠alife cycles

In this case too the subfield delimiters are contained in the body of the XML document. This means that the 886 field is lost in any transport by means of the XML schema, both for UNIMARC and for MARC21. Now, after painting such a black picture, it is time to speak about a possible solution.

Solution

At the UNIMARC session of the World Library and Information Congress in Oslo in August 2005 we proposed an XML Slim Schema that covers any MARC format, works with embedded fields and does not conflict with the subfield delimiters described above in the second scenario. Besides, like ISO 25577, the Schema preserves the “round-trip ability” of the transport schemas. The Schema has since been brought into conformity with the draft of ISO 25577, while nonetheless keeping all these capabilities. We can now state proudly that our Schema keeps all the merits of the draft ISO 25577 while having none of its shortcomings.

This does not mean that we have created anything extraordinary. The solution, as you can see, is quite simple and rather obvious. Moreover, it is not the only possible one: see, for example, the Schema proposed by Giovanni Bergamin. Our XML Schema can be accessed at: http://www.rba.ru/rusmarc/soft/UNISlim.xsd

Using our Schema, Tony Curwen’s example would look as follows:

BLN695609

Prefaces to the experiences of literature

New York

Harcourt Brace Jovanovich 1979
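As a rough sketch of what an additional nesting level makes possible, the Python fragment below builds field 451 with its embedded fields expressed as child elements. The element and attribute names used here (datafield, embedded, controlfield, subfield, tag, ind1, ind2, code) are hypothetical and chosen only for illustration; they are not taken from UNISlim.xsd or from the draft ISO 25577.

```python
import xml.etree.ElementTree as ET


def build_linking_field():
    """Build field 451 with embedded fields 001, 200 and 210 one level down."""
    linking = ET.Element("datafield", tag="451", ind1=" ", ind2="0")
    inner = ET.SubElement(linking, "embedded")  # hypothetical wrapper for $1 content

    ET.SubElement(inner, "controlfield", tag="001").text = "BLN6956090"

    title = ET.SubElement(inner, "datafield", tag="200", ind1="1", ind2=" ")
    ET.SubElement(title, "subfield", code="a").text = (
        "Prefaces to the experiences of literature")

    pub = ET.SubElement(inner, "datafield", tag="210", ind1=" ", ind2=" ")
    for code, value in (("a", "New York"),
                        ("c", "Harcourt Brace Jovanovich"),
                        ("d", "1979")):
        ET.SubElement(pub, "subfield", code=code).text = value
    return linking


linking = build_linking_field()
# Tags, indicators and subfield codes are now attributes a parser can address.
publisher = linking.find(
    "./embedded/datafield[@tag='210']/subfield[@code='c']").text
print(publisher)
print(ET.tostring(linking, encoding="unicode"))
```

In a structure of this kind neither '$' nor IS1 appears anywhere in the character data, so both of the questions raised above are answered by the markup itself.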

The Schema defines an additional nesting level corresponding to any embedded field. Following UNIMARC, we define only one nesting level, but in principle nothing prevents us from defining any number of such levels, as is done, for example, in the Schema proposed by Giovanni Bergamin. The example shows part of a record in which all tags can be recognized and unambiguously interpreted by XML parsers. And such a solution, as already mentioned, is not the only possible one. It is now time to take organizational decisions, which often turns out to be the most difficult part.

UNIMARC and XML

José Borbinha, Hugo Manguinhas, Nuno Freire

Introduction

XML represented a revolution for the storage, processing and transport of digital information. The library world could not escape it, and its impact there is already profound, especially in the fields of interoperability and metadata encoding. This paper addresses the impact of XML on the library world from the perspective of UNIMARC. That impact occurs in three areas: the encoding, storage and transport of UNIMARC records; the management and publishing of the UNIMARC standard itself; and the processing of UNIMARC records. The paper addresses these three areas with reference to projects developed at the BN, the National Library of Portugal.

The transport of UNIMARC records in XML

The primary purpose of UNIMARC is to facilitate the international exchange of bibliographic data in machine-readable form between library management systems. UNIMARC belongs to a family of MARC formats with the same purpose, such as MARC21, which are also implementations of ISO 2709, an international standard that specifies the structure for the encoding of bibliographic records. These formats do not stipulate the form, content, or record structure of the data within individual information systems. They provide recommendations only on the form and content of data when it has to be exchanged.

MARCXML is an XML schema developed to replace ISO 2709 as a solution for transporting MARC records1. It was developed initially for the MARC21 format, but it can also be used for UNIMARC. Although it is a practical solution, MARCXML has some limitations that are supposed to be solved by its substitute, the “ISO/DIS 25577 Information and documentation – MarcXchange”2. MarcXchange is a superset of MARCXML, which means that every valid MARCXML file is also a valid MarcXchange file. However, it is closer to ISO 2709 (it allows more than two indicators and subfield codes longer than one character). It also defines extensions to the ISO 2709 specifications for format and type. This is very important for achieving interoperability in heterogeneous environments like the Internet, for two reasons: machines are now able to identify explicitly the format of the encoded records and no longer need to assume it implicitly; and one can now have, in the same XML file, multiple records coexisting in different MARC formats.

Representations of UNIMARC

The textual descriptions of the UNIMARC formats have traditionally been published in print. English has been the reference language, but translations into other languages exist in some cases. The most recent reference version of at least the bibliographic format is accessible online from the IFLA site3. These descriptions of UNIMARC are only textual, with no formal validation. This means that if two contradictory rules are defined in two different places of the text, there is no automatic way to detect the discrepancy. We will now describe a solution, based on XML, that not only manages the maintenance and publishing of the descriptions of the UNIMARC formats but also validates their logical consistency.

Publishing UNIMARC

The solution described in this paper manages the processes of maintenance and publication of the descriptions of the UNIMARC formats. It must provide interfaces for both humans and machines, which is assured by a Model-View-Controller (MVC) architecture4. The Model part of this MVC architecture in fact consists of two complementary models, both expressed in XML:
• a format schema for each UNIMARC format, comprising a grammar with all the rules and relations;
• a schema description, containing the corresponding textual descriptions for the elements of the format as defined in the schema.

Both models use URN5 identifiers for reference and linkage.
Their source is a Metadata Registry, which is also a very important part of our work. The Controller merges these two source models, using DocBook as an intermediate format, appends extra information (unstructured texts complementary to the formal description) and applies the final transformations.
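The merge of the two models can be pictured with a small sketch. It is written in Python rather than in the XSLT used by the actual Controller, and everything in it is assumed for the sake of illustration: the shape of the format schema, the dictionary standing in for the schema description, and the DocBook-like section and title elements of the output.

```python
import xml.etree.ElementTree as ET

# Hypothetical format schema: structure only, each node identified by a URN.
FORMAT_SCHEMA = """
<format urn="urn:unimarc:bibliographic:2.3">
  <field urn="urn:unimarc:bibliographic:2.3:en/100">
    <subfield urn="urn:unimarc:bibliographic:2.3:en/100.a"/>
  </field>
</format>
"""

# Hypothetical schema description: texts only, keyed by the same URNs.
DESCRIPTIONS = {
    "urn:unimarc:bibliographic:2.3:en/100": "General Processing Data",
    "urn:unimarc:bibliographic:2.3:en/100.a": "General Processing Data (subfield)",
}


def merge(schema_node, descriptions):
    """Walk the schema tree and attach the text registered under each URN."""
    urn = schema_node.get("urn")
    section = ET.Element("section", id=urn)
    title = ET.SubElement(section, "title")
    title.text = descriptions.get(urn, "(no description registered)")
    for child in schema_node:
        section.append(merge(child, descriptions))
    return section


doc = merge(ET.fromstring(FORMAT_SCHEMA), DESCRIPTIONS)
titles = [t.text for t in doc.iter("title")]
print(titles)
```

The schema supplies the shape of the document tree and the description supplies its content, with the URN as the only link between the two; a node whose URN has no registered text is easy to detect at this stage.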

With this architecture one can generate any publishing view as output, either for humans (HTML, RTF, PDF, etc.) or for machine processing (XML).

The UNIMARC schema

UNIMARC is a format with a very complex structure. Besides the common syntactic rules for elements, attributes and values, it also defines semantic relations between them. These relations may even determine how a given element or attribute is to be interpreted. UNIMARC also requires rules to be grouped into subsets (aggregation) in order to represent blocks of fields. This requirement is not essential for the validation of records, but it is important for defining the semantic coupling of elements.

A schema language is a formal language for expressing schemas in such a way that software or users can understand them unambiguously. In the XML world, schema languages are used to define the allowed structures of XML documents. An XML schema provides a means for defining the structure, the content and, to some extent, the semantics of XML documents6. Current schema languages like XMLSchema7 and RELAX-NG8 allow the definition of syntactic rules based on elements, attributes and values, but they lack semantic rules for defining relations between them. Schematron9 is the only language that allows these semantic relations to be defined. On the other hand, existing schema languages tend to evolve over time or to be forgotten, while for UNIMARC we need a stable format. It was thus decided not to use any of the existing schema languages, but to develop our own schema language for the purpose of describing the UNIMARC formats.

The UNIMARC Schema is the formal description of UNIMARC, as a family of formats, in a specific schema coded in XML, able to accommodate the formal lexis, the syntax and the multilingual textual descriptions of any format and version, and to be interpreted by both machines and humans.
Accordingly, each UNIMARC format (Bibliographic, Authority, Holdings and Classification), in each of its versions, can have its own format schema file with the corresponding format rules. The most recent version of the bibliographic format, for example, is currently an XML file of 3700 lines of text. Any format schema file can inherit the structural information of another format schema file, making it possible to represent the evolution of a format simply by adding and replacing (overloading) rules in the new versions.
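The inheritance mechanism can be illustrated with a small Python sketch. The data structure, the version labels and the rule contents below are invented for the example and do not reflect the real schema files or the real differences between UNIMARC revisions; only the principle (inherit, then add or overload) comes from the description above.

```python
def resolve(schemas, name):
    """Return the effective rules of a schema, following its 'inherits' chain."""
    schema = schemas[name]
    parent = schema.get("inherits")
    rules = resolve(schemas, parent) if parent else {}
    rules.update(schema["rules"])  # new rules are added, existing ones overloaded
    return rules


# Hypothetical rule sets, for illustration only.
SCHEMAS = {
    "bibliographic:2.2": {
        "inherits": None,
        "rules": {
            "001": {"mandatory": True, "repeatable": False},
            "200": {"mandatory": True, "repeatable": False},
        },
    },
    "bibliographic:2.3": {
        "inherits": "bibliographic:2.2",
        "rules": {
            # overloads the inherited rule for field 200
            "200": {"mandatory": True, "repeatable": False, "note": "revised"},
            # adds a rule that the parent version does not have
            "017": {"mandatory": False, "repeatable": True},
        },
    },
}

effective = resolve(SCHEMAS, "bibliographic:2.3")
print(sorted(effective))
print(effective["200"])
```

Only the differences need to be written down for a new version, and comparing the effective rule sets of two versions gives the list of changes between them.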

The UNIMARC schema description

Each UNIMARC format schema also has a complementary schema description of its terms. This description holds the relations between the URN identifiers of the elements and their corresponding textual descriptions. Each definition of an element of a format schema comprises:
• a label: character data of limited length giving the name of the element;
• a definition: structured data of unlimited length describing the element;
• notes: structured data of unlimited length adding information about the use and purpose of the element;
• examples: structured data of unlimited length illustrating possible uses of the element;
• relationships: structured data of unlimited length pinpointing relations with other elements of the given format.

New schema descriptions can be declared at any time to satisfy new requirements. Current schema description formats involve declarations in RDF (Resource Description Framework, a language for representing information about resources on the World Wide Web) and other ontology description formats. For our current purposes we do not need a complex schema description. To simplify the problem, we decided to take advantage of XML namespaces and build an extension of an XML properties schema that can contain both text and structured information. The description can be multilingual and multi-version, each case being represented by a specific XML file.

The publishing Controller

For the Controller part we decided to use XSLT technology10. This provides great flexibility and makes access and use by third-party applications easy. To support this framework, we use an intermediate representation of the formats expressed in the DocBook schema. DocBook is a collection of widely accepted standards and tools for technical publishing. Figure 1 shows an example of part of the description of the UNIMARC bibliographic format in this representation.

Figure 1: The UNIMARC bibliographic format description coded in XML (example for the second edition, third revision, in English).

This solution enables the execution of a number of tasks needed before the generation of the final publishing view. The Controller starts by merging the two model representations, using the format schema as the source for building the document tree and the schema description as the source of content. This is done by fetching the description of each node of the document tree from the description model, using the URN of the node as the search key, and applying it to that node. The result is the first DocBook/XML representation of the document. Among the tasks then executed are the addition of extra information, such as bibliographic information (title, authority, etc.), preface, introduction, appendices and annexes, and the generation of all kinds of indexes (tables of contents and indexes of annexes), references, lists and glossaries of terms. When transforming from the Model to the View, all URN references to terms are translated into real links (in an HTML document, these will be URL links). Some tasks can be omitted or performed differently, according to the type of publication desired. For example, the concise publication needs less work from the Controller than the complete publication.

The publishing view

Once the Controller workflow is complete, the final publishing view is produced. This is built using an appropriate XSLT or XSL-FO transformation for the intended publishing format. At the moment we are publishing in all currently available human-readable formats for DocBook (HTML, PS, PDF, etc.). Figure 2 shows all publication views currently available. Figure 3 shows an example of the HTML version of the description of one part of the bibliographic format, resulting from this process.

[Figure 2 is a class diagram of the Publication View Model for UNIMARC, with the classes «Publication View», «Machine Readable View», «Human Readable View», «DocBook View», «PDF View», «PS View», «HTML View» and «Text View».]

Figure 2: Publication View Model Class Diagram

Figure 3: The publication of the UNIMARC bibliographic description (the DocBook/XML description is converted into HTML).

Currently, the publication site contains the textual publication of the Portuguese and English versions of the bibliographic, authority and holdings formats, and work is in progress to extend it to more languages. Other human-readable formats (RTF and Doc) can be built from some of the publications already available. We are also using DocBook/XML for the publication of the machine-readable version. All the publications produced are accessible online at http://www.unimarc.info/. Only the UNIMARC publication views can be accessed there. The site uses a simple syntax for the construction of its URLs, based on the conceptual tree structure of the description. This gives external services immediate and permanent access to each format publication as soon as it is available. Third-party services only need to understand the URN syntax to link to this publishing service, which will be registered in the near future with the URN registration agencies (IANA Registry of URN Namespaces11, Info URI Registry12, etc.).
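How little a third-party service needs to know can be shown with a short Python sketch. It assumes that URNs have the layout urn:unimarc:<format>:<version>:<language>/<element>, as in urn:unimarc:bibliographic:2.3:en/100.a, and that the URL path simply repeats those segments; the function name and the error handling are our own choices for the example.

```python
BASE_URL = "http://www.unimarc.info"


def urn_to_url(urn, base=BASE_URL):
    """Translate a UNIMARC URN into the URL of its HTML publishing view."""
    scheme, namespace, rest = urn.split(":", 2)
    if (scheme.lower(), namespace.lower()) != ("urn", "unimarc"):
        raise ValueError("not a UNIMARC URN: " + urn)
    # 'bibliographic:2.3:en/100.a' -> path part and element part
    path, _, element = rest.partition("/")
    segments = path.split(":")
    if element:
        segments.append(element)
    return base + "/" + "/".join(segments)


print(urn_to_url("urn:unimarc:bibliographic:2.3:en/100.a"))
# http://www.unimarc.info/bibliographic/2.3/en/100.a
```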


The UNIMARC Metadata Registry

A formal Metadata Registry compliant with the ISO/IEC 11179 standard13 for Metadata Registries (MDR) is under development to support the processes of registration, maintenance, evolution, access, discovery and interoperation. Part of this work was already done for the purpose of publishing, as described above, but it needs to be addressed again in a more structured way for the purpose of the registry. The work done so far was nevertheless essential, because producing the results described above gave us the experience needed to perform this new task more effectively.

ISO/IEC 11179 is a standard in six parts that defines the concepts behind Metadata Registries, addressing the semantics of data, the representation of data, and the registration of the descriptions of that data:
1. Framework
2. Classification
3. Registry metamodel and basic attributes
4. Formulation of data definitions
5. Naming and identification principles
6. Registration

The first three parts of the standard are concerned with design, implementation and management issues that are outside the scope of this paper. Concerning the fourth part, we are already using the descriptions defined in the format schema description (label, definition, notes, examples and relationships). Concerning the fifth part, the registry uses URN identifiers consistent with those used to identify format schema elements in the data model. The registry also provides persistent URL identifiers (PURL) for immediate access to the registry entry of the most recent version of each element of the format, as well as identifiers for all previous versions. This access is available in both human-readable and machine-readable versions.

Registration, the sixth part, is carried out according to levels of administration. Professionals responsible for the editing and revision of the format can access the register online and edit the format descriptions. These changes are recorded directly in the description XML files. Usually there is one professional responsible for a given language and format, in all its versions. Professionals responsible for managing the evolution of the format, on the other hand, can edit the schema of that format online. These changes are recorded directly in the format schema XML file. Generally there is a group of professionals responsible for all the format versions, involved in a higher-level network of responsibilities and tasks distinct from that of the editors and reviewers.

This Metadata Registry also serves as the primary source for format discovery and provides access to all the available information and services concerning the format. Concerning publication, the registry acts primarily as the source of the models required by the publication process, and secondly as a source for discovery, by providing links to all available published views of the formats (including the formats’ versions). This is made possible by the PURL features provided by the publication site. For example, the URN for field 100, subfield $a, of the second edition, third revision of the UNIMARC Bibliographic Format in English (urn:unimarc:bibliographic:2.3:en/100.a) would be translated, for the HTML publishing view, into the URL “http://www.unimarc.info/bibliographic/2.3/en/100.a”. Changes made to the format, whether to the structure or to the description, can be available online immediately on the publication site in all the views provided.

The Registry has another important feature: it enables the comparison of different versions of the same format. With this we can profile the evolution of the formats and track changes between them. All this information can also appear when the formats are published.

UNIMARC and XML at the BN

The work described in the previous sections has been promoted in the framework of ICABS activities. We will now describe additional examples of effective bibliographic services and processes at the BN in which the multiple XML representations of UNIMARC already play very important roles.

Preservation of UNIMARC records

REPOX is a metadata repository developed according to our concept of Metadata Space. A concept of metadata space was proposed by Watson and Wiley14, but it was defined for a specific scenario of metadata retrieval. Although it shares the same designation, our concept is different and closer to the concept of “dataspace” as defined by Franklin and Maier15. In this sense, a Metadata Space is defined in relation to the concept of Enterprise Architecture (EA)16 and the emergence of computing environments deployed as Service Oriented Architectures (SOA)17. At the BN, this is already part of a new strategic technological view of the architecture of its digital library.

The way the records are coded and managed in REPOX addresses the preservation requirements set out in OAIS18 and in ERPANET and Boudrez19: it provides non-proprietary tools for the management of the digital preservation processes, it is robust and flexible, it provides mechanisms for the self-description and validation of digital resources, and it is not tied to any application or platform.

The REPOX infrastructure at the BN already supports several data collections. It started with PORBASE (the national union catalogue), followed by the Archive of the Contemporary Portuguese Culture (a specific department of the BN). At present it also supports several collections of the National Digital Library, including a collection of Dublin Core records from a deposit of the full Project Gutenberg (http://www.gutenberg.org/).

PORBASE is the largest bibliographic database in the country, with collections from more than 160 libraries. It is maintained by a proprietary information system designed to support the traditional cataloguing and searching processes. The database schema of this system is therefore not always compatible with the requirements of the new services to be developed. Besides, using the server that supports cataloguing and searching by the general public for these services would overload it with tasks that could usually be executed on a different server. Finally, the data need to be replicated, encoded under an open schema and stored in an objective infrastructure, fulfilling the requirements for more generic security and preservation.

REPOX harvests the records directly from the PORBASE integrated library management system on a daily basis and encodes them in MarcXchange. Each record is wrapped in an AIP (Archive Information Package). These packages are stored in the file system as regular files, so that their storage remains independent of any particular storage system. The AIP objects contain information about the provenance of each record and its history, which comprises every harvested version of the record. REPOX therefore also maintains a URN space for the identification of data collections, record types, AIPs and each version of the records. The current PORBASE data collection contains more than 3 million records (both bibliographic and authority records), amounting to 26 GBytes when encoded in XML.

Authenticity is another important issue. The problem of assuring the authenticity of electronic records in the long term has been studied by Gladney20 and Waugh et al.21. In REPOX the records are stored as simple XML files in the local file system. File system protection against changes to the files is not enough to guarantee the level of authenticity that we need. For this purpose, REPOX has a Digital Signature Manager component, responsible for signing the records. A possible future point of failure in this authenticity infrastructure may arise when it becomes necessary to preserve both the digital signature and the validation function, as recognized in ERPANET22. REPOX addresses this problem by following the W3C XML-Signature Syntax and Processing recommendation23, and also by providing mechanisms for the integration of future digital signature algorithms and the migration of digital signatures.

REPOX started its operation with PORBASE on 1 October 2005. Immediately afterwards, an increase could be noticed in the availability and performance of access to the data by several departments of the National Library and by the cooperating libraries. The proprietary library management system was also relieved of many requests for records originating from other automated systems, which had been overloading it considerably. Another factor of great importance is the backup copy of the data, expressed in XML. It now represents a total of 3 million records, thus preserved independently of any specific software or hardware.

Access to PORBASE by Unique Identifiers

Access to PORBASE by Unique Identifiers is a very simple but effective service, located at http://urn.porbase.org. It makes records from PORBASE available to other libraries in several formats via an HTTP interface. Both bibliographic and authority records are available. The records can be retrieved by several unique identifiers (ISBN, ISSN, legal deposit number, call number, etc.); the resolution is supported by access points in REPOX. Besides the native UNIMARC format, the records are also made available in other structural formats, such as Dublin Core, and in several encodings (MARCXML, MarcXchange, HTML, plain text, ISO 2709, etc.). These formats are generated at runtime by the service, by applying XSLT transformations to the XML records retrieved from REPOX. This service is useful mainly for external library management systems, which can import records directly to support local cataloguing work. In this sense it offers functionality partially similar to that of OAI-PMH, but in a much simpler way.
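The service performs these conversions with XSLT. Purely as an illustration of the kind of mapping involved, and using Python in place of XSLT, the sketch below extracts a few Dublin Core elements from a UNIMARC record encoded in MarcXchange. The sample record, the namespace URI as written here, the very small mapping (200 $a to title, 210 $c to publisher, 210 $d to date) and the function name are assumptions made for the example; this is not the mapping used at the BN.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for MarcXchange in this sketch.
NS = {"mx": "info:lc/xmlns/marcxchange-v1"}

RECORD = """
<record xmlns="info:lc/xmlns/marcxchange-v1" format="UNIMARC" type="Bibliographic">
  <controlfield tag="001">BLN6956090</controlfield>
  <datafield tag="200" ind1="1" ind2=" ">
    <subfield code="a">Prefaces to the experiences of literature</subfield>
  </datafield>
  <datafield tag="210" ind1=" " ind2=" ">
    <subfield code="a">New York</subfield>
    <subfield code="c">Harcourt Brace Jovanovich</subfield>
    <subfield code="d">1979</subfield>
  </datafield>
</record>
"""

# (UNIMARC tag, subfield code) -> Dublin Core element; deliberately tiny.
MAPPING = {
    ("200", "a"): "title",
    ("210", "c"): "publisher",
    ("210", "d"): "date",
}


def to_dublin_core(record_xml):
    """Return a dict of Dublin Core element names to lists of values."""
    record = ET.fromstring(record_xml)
    # The explicit format attribute tells us how to interpret the tags.
    if record.get("format") != "UNIMARC":
        raise ValueError("expected a UNIMARC record")
    dc = {}
    for datafield in record.findall("mx:datafield", NS):
        for sub in datafield.findall("mx:subfield", NS):
            element = MAPPING.get((datafield.get("tag"), sub.get("code")))
            if element:
                dc.setdefault(element, []).append(sub.text)
    return dc


print(to_dublin_core(RECORD))
```

The check on the format attribute is the point made earlier about MarcXchange: the same tag numbers mean different things in UNIMARC and MARC21, so a conversion should read the declared format rather than assume it.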


Validation of records and reporting

A set of applications named MANGAS was developed to support quality control processes involving UNIMARC records, such as validation, reporting, filtering and correction. The most important aspect of these applications is the way they use the UNIMARC schema to validate the records automatically.

MANGAS Diag is a tool for analysing a local collection of records. It is suitable for ordinary users who are not familiar with the UNIMARC format and only need a way to learn about their own catalogues. This application is available as a standalone tool or as a web service (located at http://diag.porbase.org).

MANGAS Workstation is a tool for skilled professionals who are familiar with the UNIMARC format and require more complex functionality for their work. It is useful for professionals responsible for quality control procedures who need not only to detect errors but also to act upon them. The tool can produce custom reports (validation and content reports) and different views of the records, support the editing of records, correct errors systematically, etc. This application is available only as a standalone tool.

Finally, MANGAS Batch is a tool developed to enable third-party applications to use the MANGAS functionality. Currently this application is available as a standalone tool that can be called from the command prompt of the local operating system, or embedded as a library in a Java-based application.

An example of a service using MANGAS Batch is Qualicat, an internal service assuring the daily quality control of the bibliographic records in PORBASE. It is defined according to the requirements of the UNIMARC format, the Portuguese Cataloguing Rules, and the special internal rules of PORBASE. It allows cataloguers to validate in real time, in the library management system, the record they are working on.
It also creates detailed error reports of the records already in the collection. This service reports the errors with great detail, providing hyperlinks to the online version of the UNIMARC description available form the registry. In this sense, the cataloguer may read the UNIMARC description of the fields with the errors reported, thus improving his/her knowledge. OAI-PMH Service The OAI-PMH24 service at BN (available at http://oai.bn.pt) gives access to the records stored at REPOX to cooperation projects. 94

A regular client of this service is, for example, the portal TEL – The European Library (the TEL central host regularly harvests the records from PORBASE and makes them available for searching and access). In this case the records are obtained from REPOX in MarcXchange and transformed into the TEL format by an XSLT transformation.

Conclusions and further issues

For exchange purposes, and from the perspective of UNIMARC, MarcXchange is a step in the right direction. Concerning the UNIMARC Registry, its formal functional development is still in progress. The infrastructure used to produce the results reported here is functional, but it needs to be reviewed more carefully against the requirements of the ISO/IEC 11179 standard. Concerning the interoperability between UNIMARC and MARC21 (mappings, etc.), we should be able to express inheritance from a general MARC schema. This is an action that should be taken in partnership between the UNIMARC and the MARC21 communities.

From the specific perspective of the BN, future developments in REPOX will focus on a mechanism to support global queries for the retrieval of records across multiple types and data sources. To better address authenticity issues, we will also consider implementing the XML Advanced Electronic Signatures W3C recommendation25. We are also planning the development of services for cross-mapping the metadata schemas stored in REPOX, to retrieve Dublin Core or MARC21 records created “on the fly” from original UNIMARC records. This will be implemented as a generalization of the concept of the current service URN.PORBASE.ORG (which serves only authority or bibliographic UNIMARC records from PORBASE). Another important service will be a generic OpenURL26 resolving service, which can take into consideration a large number of resource providers (PORBASE, the National Digital Library, the Archive of the Contemporary Portuguese Culture, DiTeD – the Database of Thesis and Dissertations, etc.). This is a long-awaited achievement, which will now be easy to develop within the new SOA environment. Finally, other services will be developed to provide detailed statistics and quality control (including the semantics of the records), as well as the design of a data warehouse to support management decision processes.

References 1

Electronic Resource Preservation and Access Network (ERPANET): Urbino Workshop: XML for Digital Preservation, 2002 http://eprints.erpanet.org/archive/00000002/01/UrbinoWorkshopReport.pdf

2

MarcXchange (http://www.bs.dk/marcxchange/). Information and documentation – MarcXchange: http://www.niso.org/international/SC4/n577.pdf

3

IFLA – UNIMARC Manual: Bibliographic Format, 1994. (http://www.ifla.org/VI/3/p19961/sec-uni.htm)

4

http://en.wikipedia.org/wiki/Model_view_controller

5

Moats, R., “URN Syntax”, RFC 2141, May 1997.

6

http://en.wikipedia.org/wiki/Schema

7

XML Schema (http://www.w3.org/XML/Schema)

8

RELAX-NG (http://relaxng.org)

9

Schematron (http://www.schematron.com)

10

http://en.wikipedia.org/wiki/Schema

11

http://www.iana.org/assignments/urn-namespaces

12

http://info-uri.info/

13

ISO/IEC 11179, Information Technology – Metadata Registries (MDR). (http://metadata-standards.org/11179)

14

Wason, T., Wiley, D.: Structured Metadata Spaces. Metadata and Organizing Educational Resources on the Internet, NY: Haworth Press, 2001. (http://opencontent.org/docs/metadata_spaces.pdf)

15

Franklin, M., Halevy, A., Maier, D.: From Databases to Dataspaces: A New Abstraction for Information Management. ACM SIGMOD Record, 2005.

16

http://en.wikipedia.org/wiki/Enterprise_Architect

17

http://en.wikipedia.org/wiki/Service-oriented_architecture

18

Consultative Committee for Space Data Systems. OAIS – Reference Model for an Open Archival Information System, 2002.

19

Electronic Resource Preservation and Access Network (ERPANET): Urbino Workshop: XML for Digital Preservation, 2002 and Boudrez, F.: XML and electronic record-keeping (2002). (http://www.expertisecentrumdavid.be/davidproject/teksten/XML_erecordkeeping.pdf)

20

Gladney, H.M.: Trustworthy 100-Year Digital Objects: Evidence After Every Witness Is Dead. ACM Transactions on Information Systems, 2004, 22 (3) pp. 406–436.

21

Waugh, A., Wilkinson, R., Hills, B., Dell’oro, J.: Preserving Digital Information Forever. Proceedings of the fifth ACM conference on Digital libraries, 2000.

22

Electronic Resource Preservation and Access Network (ERPANET): Urbino Workshop: XML for Digital Preservation, 2002. http://eprints.erpanet.org/archive/00000002/01/UrbinoWorkshopReport.pdf

23

W3C Consortium: XML-Signature Syntax and Processing (http://www.w3.org/TR/xmldsigcore/#sec-SignatureAlg)

24

The Open Archives Initiative Protocol for Metadata Harvesting (http://www.openarchives.org)

25

W3C Consortium: XML Advanced Electronic Signatures (XAdES) (http://www.w3.org/TR/XAdES/)

26

http://www.niso.org/committees/committee_ax.html

A new OPAC for BNCF Using Open Source Software, XML and UNIMARC

Giovanni Bergamin

As a focus of my presentation, I could use this fashionable slogan: UNIMARC 2.0 as a foundation of OPAC 2.0. As we know, there are different points of view on “2.0” in the library environment: is it “trendy jargon”1 or a radical change where “Web 2.0’s principles and technology offer libraries many opportunities to serve their existing audiences better”2? I do not know if the new OPAC for BNCF could be labelled OPAC 2.0 but I am sure – or at least I hope – that this new tool will serve our users better.3

An interesting new opportunity is now emerging in software for libraries: the open source option. When we started to plan a new OPAC for the BNCF two years ago we spent a considerable amount of time searching for open source solutions ready to be “reused” in our library (of course with some “small adaptations”…). Unfortunately the summary of our findings was:
• there is a lot of valuable open source software for the so-called “digital library infrastructure”, but this infrastructure does not include a “suitable” OPAC for BNCF;
• the available open source solutions for an OPAC
  o offer limited support for MARC records (namely UNIMARC records);
  o usually present “an interface that only a librarian could love”4.

The new OPAC resulting from our project is based on
• open source system and programming software;
• UNIMARC records expressed in XML syntax (based on MARCXML syntax, but with full support of the embedded fields technique);
• a new “post-Google” interface.

In the BNCF we have been using UNIMARC since 1985 for the production of the Italian National Bibliography (BNI – Bibliografia Nazionale Italiana). In 1998 we started to provide a web-based online catalogue using a proprietary information retrieval engine. We are not using an integrated library system if “integrated” means an automated system in which all of the functional modules share a common bibliographic database; OPAC records are, in fact, imported from the cataloguing module using UNIMARC.

Two years ago we recognized the need for a new online catalogue. The architecture and the requirements of the new service were identified as follows:
• Open source: as our information retrieval engine we have selected Lucene5. According to Wikipedia “Lucene is an open source search engine library released by the Apache Software Foundation. It is written in Java and is released under the Apache Software License”6.
• MARCXML: the advantages of using XML syntax instead of ISO 2709 are well known and we all agree that in today’s Information Technology XML is one of the most important components. We tried to use the MARCXML Schema (MARC21slim.xsd7) produced by the Library of Congress for UNIMARC records, but we soon realized that the MARCXML Schema does not take into account the embedded fields technique. At that time we were not aware of the work undertaken by the National Library of Russia (Vladimir Skvortsov8).
• Integration of previously separate online catalogues (manuscripts, maps, theses and dissertations, etc.): this requirement confirms the need for a common XML Schema.
• SRU, OAI and OpenURL interfaces: environments where the availability of XML records is of course essential.

The design of the UNIMARC XML Schema can be summarized in the following three figures9:

This structure is the same as that of MARC21 slim. We have chosen to use abbreviated10 tags:
rec = record
lab = label
cf = control field
df = data field
sf = subfield

This is the real difference between MARC21slim and UNIMARC. The UNIMARC field is represented as a choice between:
• “standard subfields” like $a, $e, etc.
• or “special subfield” $1 that announces embedded fields.

$1 is represented in our schema by a new tag. The tag is an articulation that could embed control fields and data fields.
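The choice between standard subfields and embedded fields can be illustrated with a short sketch that builds a record using the abbreviated tags listed above (rec, lab, cf, df, sf). This is only an illustration under stated assumptions, not the BNCF schema itself: the element name `s1` for the embedded-field container, the attribute names `t`, `i1`, `i2` and `c`, the indicator values and the subfield codes are all hypothetical choices made for this example. The data values are taken from the sample record at the end of the paper.

```python
import xml.etree.ElementTree as ET


def data_field(parent, tag, ind1=" ", ind2=" "):
    """Add a data field (df); attribute names are hypothetical."""
    return ET.SubElement(parent, "df", {"t": tag, "i1": ind1, "i2": ind2})


def subfield(parent, code, text):
    """Add a standard subfield (sf) such as $a or $e."""
    sf = ET.SubElement(parent, "sf", {"c": code})
    sf.text = text
    return sf


rec = ET.Element("rec")
ET.SubElement(rec, "lab").text = "01046nam0 2200277 450"
ET.SubElement(rec, "cf", {"t": "001"}).text = "RMS1134619"

# A field made of standard subfields.
title = data_field(rec, "200", "1", " ")
subfield(title, "a", "La caverna")

# A linking field made of embedded fields: each $1 becomes one
# container element (hypothetically named "s1") that holds either
# a control field or a data field.
series = data_field(rec, "410", " ", "1")
emb_id = ET.SubElement(series, "s1")
ET.SubElement(emb_id, "cf", {"t": "001"}).text = "CFI0163201"
emb_title = ET.SubElement(series, "s1")
inner = data_field(emb_title, "200", "1", " ")
subfield(inner, "a", "Einaudi tascabili")
subfield(inner, "v", "1259")

print(ET.tostring(rec, encoding="unicode"))
```

Representing each $1 as its own container keeps the embedded control fields and data fields as ordinary elements, so the same XSL templates can process them at both levels.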

Finally, I would like to underline the use of UNIMARCXML in a real application, the new OPAC of BNCF:
• records from different sources are converted into UNIMARCXML (using UTF-8 as the character set)11;
• using an XSL, UNIMARCXML is converted into “XML_Index format” (an internal format) for the definition of fields for Lucene;
• using an XSL, UNIMARCXML records are displayed in the OPAC;
• UNIMARCXML, registered as info:srw/schema/8/unimarcxml-v0.1 12, could be a Schema in which records may be transferred in response to an SRU “searchRetrieve” request13;
• using an XSL, UNIMARCXML could be converted into Dublin Core and records may be transferred as a response to an OAI request;
• using an XSL, metadata can be extracted from UNIMARCXML records and used in an OpenURL service (we are using COinS14 for publishing OpenURL references in HTML).

Example15

- 01046nam0 2200277 450 RMS1134619 20060211123928.5 - 88-06-17116-X

- IT 2005-2131

- 20060211d2004 |||||ita|01 ba

- ita

- IT

- |||| |||||

- *B*La *E*caverna José Saramago traduzione di Rita Desti

- Torino Einaudi [2004].

- 335 p. 23 cm

- Einaudi tascabili 1259.

- - CFI0163201

- - Einaudi tascabili 1259.

- *B*A *E*caverna VIA0084410 CFIV042098

- 869.342 21 NARRATIVA PORTOGHESE, 1945-1999

- Saramago , José CFIV042098

- Desti , Rita CFIV041750

- IT BNCF 20060211

- Sousa Saramago , José : de CFIV178457 Saramago, José

- Saramago , Joao MILV221333 Saramago, José

- Bibl. Nazionale Centrale di Firenze 1 v. 1 v. CFGEN B19 02823 CF0058490565 200501071 v. GEN B19 02823 A
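The use of COinS mentioned in the list above can be sketched as follows. This is a minimal illustration, not the BNCF implementation: it assumes the metadata have already been extracted from the record (the paper does this with an XSL), and the function name `coins_span` is hypothetical. The `Z3988` class, the `ctx_ver` value and the `rft.*` keys follow the COinS convention and the OpenURL key/encoded-value format for books; the data values come from the example record.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(metadata):
    """Return an empty HTML span carrying an OpenURL ContextObject (COinS)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
    ]
    # Each descriptive element is prefixed with "rft." (the referent).
    pairs += [("rft." + key, value) for key, value in metadata.items()]
    # URL-encode the key/value pairs, then escape "&" for the HTML attribute.
    title = escape(urlencode(pairs), quote=True)
    return '<span class="Z3988" title="%s"></span>' % title


book = {
    "btitle": "La caverna",
    "au": "Saramago, José",
    "isbn": "88-06-17116-X",
    "pub": "Einaudi",
    "place": "Torino",
    "date": "2004",
}
span = coins_span(book)
print(span)
```

A browser extension or link resolver that recognizes the `Z3988` class can then turn the span into an OpenURL pointing at the user's preferred resolver.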

References 1

E.g.: http://tomroper.typepad.com/tr/2006/02/a_library_20_op.html

2

Miller, Paul. Web 2.0: Building the new library, Ariadne, 45 (2005), http://www.ariadne.ac.uk/issue45/miller/

3

An experimental version is now available at: http://opac.bncf.firenze.sbn.it/opac/controller.jsp?action=search_baseedit

4

Tennant, Roy. Lipstick on a Pig, Library Journal, 2005: http://libraryjournal.com/article/CA516027.html

5

http://lucene.apache.org/

6

http://en.wikipedia.org/wiki/Lucene

7

http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd

8

Skvortsov, Vladimir. UNIMARC XML Slim Schema: Living in a new environment… http://www.ifla.org/IV/ifla71/papers/064e-Skvortsov.pdf

9

The XML Schema is available here: http://www.bncf.firenze.sbn.it/progetti/unimarc/slim/documentation/unimarcslim.html

10

Storage space is no longer as significant an issue as it was a few years ago; even so, this solution allows us to save an average of 200 bytes per record (we have more than 2 million records).

11

See the example at the end.

12

http://www.loc.gov/standards/sru/record-schemas.html

13

http://www.loc.gov/standards/sru/index.html

14

http://ocoins.info/ COinS = Context Object in Spans is a “Convention to Embed Bibliographic Metadata in HTML”

15

Non-sort begin in UTF-8 (hex C298) are here represented as *B*; non-sort end (hex C29C) as *E*.

SESSION 3 ROUND TABLE: EVOLVING STANDARDS FOR BIBLIOGRAPHIC DATA HANDLING: THE IFLA’S ROLE

Moderators: Fernanda Maria Campos Sjoerd Koopman

Evolving Standards: IFLA/ICABS and ISO/TC46

Sally H. McCallum

After a comfortable four centuries spent refining the technology of books and developing standards for their bibliographic control and access, as well as for that of maps, music, audio, and image, the library community now finds these standards challenged to evolve by the new digital media and Internet environment. This paper discusses evolving standards from four perspectives: transformation of current standards to new technical platforms; new standards developed for the new environment; coordination with broader communities; and finally putting the pieces together for digital material.

The landscape for libraries changed in the last decade as digital became a standard form of material almost “overnight”. It is easy to create, has attractive advantages over traditional forms of material, and the volume that has been produced is already enormous. The new media have moved in so fast and have so many different characteristics that librarians who were skillful in building analogue collections have not really been able to articulate clearly what “adding to the collection” means for digital material. Digital, and particularly web, documents are easily changed, and even electronic versions of scholarly journals introduce corrections, so that the bibliographic concept of version is no longer clear. And the community is just beginning to explore preservation requirements. All these issues point to the need for the evolution of old standards and the creation of new ones.

Fortunately the Internet technical environment is producing tools for dealing with these changes, of which an important one is XML. The lack of this markup language in the early Internet days gave the new landscape a chaotic appearance, but with the rapid acceptance of XML many existing standards can be retooled, using XML as the vehicle. This paper focuses on several standards associated with a programme organized jointly by the International Federation of Library Associations and Institutions (IFLA) and the Conference of Directors of National Libraries (CDNL), called the Alliance for Bibliographic Standards (ICABS)1; and on several that are being developed for the information community in the International Organization for Standardization (ISO), Technical Committee 46, Information and Documentation2. Renata Gömpel’s paper presents the ICABS organization and its whole work programme, as the Deutsche Bibliothek has a major coordinating role for that group, while this paper treats only the particular ICABS standards mandate of the Library of Congress within the programme.

The Library of Congress in ICABS

The Library of Congress has four work areas in the ICABS framework, with responsibility for continuing development activities in the library community and for giving them exposure through IFLA: MARC 21 and its derivatives, metadata, search protocols, and identifiers. Each of these areas is at a different stage of change or development within the community, and that evolution is exciting, although that is a term seldom used when talking about standards.

MARC 21 and its derivatives

MARC 21 has been a long-term solution for the library community for the exchange of descriptive metadata, and as a result there are over a billion MARC 21 records in network and local systems worldwide. There are thousands of MARC-based installed systems that carry out complex operations from end user services to a library’s processing needs. There are also thousands of librarians trained in MARC 21 who have responsibility for the organization of library resources. The evolution of MARC 21 into the Internet and XML environment must therefore provide continuity and connectivity between the existing MARC 21 records and any new XML record. The difference in the record structure (ISO 2709, the pre-XML record structure specification used by MARC, vs. XML) is not so critical for an evolutionary approach as is the compatibility of the semantics of the record content with the records already created.

A pathway into XML for MARC was needed for several additional reasons. Newer protocols such as the Open Archives Initiative (OAI) Protocol for Metadata Harvesting and Search/Retrieve via URL (SRU), and format frameworks like the Metadata Encoding and Transmission Standard (METS), either require or strongly prefer XML for record encoding. The computing environment is exploding with tools for manipulating XML and use of these tools requires an XML record.
And finally, the young programmers coming out of school today are highly skilled with XML and less skilled (or patient) with earlier formats for data.

MARCXML. In response to the need to evolve MARC to the new environment, an XML derivative for MARC 21, called MARCXML, was developed3. It is highly compatible with MARC 21 as it maintains the MARC content designation (tags and codes), adapted to the XML data structures. This has proven to be very useful with the newer protocols and for record manipulation. To support the community, the Library of Congress and others have supplied free downloadable transformations that permit easy conversion of MARC 21 records to and from MARCXML, enabling a number of interesting applications4. For example the Library of Congress uses those conversions to offer MARCXML records for OAI harvesting.

ISO TC46 has also taken a role in the evolution of the structural standard behind MARC formats. In 2004, Danish Standards generalized the MARCXML schema and introduced it into ISO as an XML schema, ISO 25577, called MarcXchange5. Since the MARCXML schema itself is defined in a broad and general way, it could be adapted with very little change for MarcXchange. MARC 21, UNIMARC, and all the other MARC formats that are implementations of ISO 2709 should be able to be represented using the XML structure specified in MarcXchange. The Draft International Standard ballot for ISO 25577 was distributed by ISO in February 2006, a final step in the process of standardization.

MODS. Another important standard evolving out of MARC 21 is the XML-based Metadata Object Description Schema (MODS)6. MODS was designed as a MARC 21 derivative, but with clear recognition of the importance of being simpler than MARC 21, especially where ISO 2709 structural constraints forced complications in MARC 21. It also needed to support data constructed with the emerging content models that IFLA has been in the forefront of developing. It was also recognized that while the primary target is compatibility with MARC 21 data, MODS records should accommodate data derived from Dublin Core records, which are extremely simple, from publishers’ ONIX records, which are more complex and less compatible with library data, and from electronic information resources themselves. MODS is less detailed than MARC in content designation, providing a pathway to technician input of records, and it has some special support for electronic resources. MODS is especially useful with the newer protocols that prefer records in XML. MODS was developed in an open manner using the new technology.
An initial XML schema was drafted by the Library of Congress and then turned over to an open listserv where all users can review and comment. Now only in its 4th year, it has been widely employed in digital library initiatives. It is favoured for inclusion as the descriptive metadata part of METS and has been adopted as the preferred format for the Digital Library Federation’s Aquifer project7. Unlike Dublin Core, MODS simplifies without losing the richness of the MARC data. Besides its compatibility with MARC 21, some of the features that make MODS useful are its recursive related item and special linking capabilities. Using MODS, agencies are able to derive metadata from existing MARC 21 records, record descriptive information about the parts or subparts of the digital object to whatever depth or detail is necessary using the recursion feature, and link the parts of the digital resource to the appropriate descriptive data. The recursion and the linking are special capabilities easily accommodated in an XML schema. MODS is thus an important step in the evolution of bibliographic formats. The recent development of the Metadata Authority Description Schema (MADS) as a companion to both MODS and the MARC 21 Authority format is an especially interesting experiment with its different views of authority data.

Metadata

The second area of responsibility for the Library under ICABS is “metadata”. This has been interpreted to include, essentially, non-descriptive metadata, since the descriptive formats are covered under the MARC umbrella. The Library has focused on two important areas: the Metadata Encoding and Transmission Standard (METS) and the Preservation Metadata: Implementation Strategies (PREMIS) data dictionary.

METS. METS is familiar to many working with digital material as a wrapper that bundles the descriptive, technical, rights, preservation, and other metadata associated with a digital item together with the file names and the structural map for the item, and optionally even the digital resource itself8. This standard bundle can then accompany the item when the item is entering a repository or when it moves between systems or repositories. METS is an XML standard that evolved out of 1990s experiments in digital repositories. By trying to accommodate any practice in the wrapper framework, METS has appeared to forfeit interoperability, to an extent, since any two METS packages can use very different internal content format standards. The community has recognized this, however, and efforts to standardize the content through agreement on profiles have gained much support. Important next steps are development of these standards for the METS metadata content.
For example, technical metadata for images has had relative success in being exchangeable between implementations because there is a standard data dictionary and associated schema (MIX) that were developed in parallel with the development of METS9. MODS also appeared at the right time for use with METS and now PREMIS should enable wider standardization of the preservation data in METS. The METS Editorial Board is following the development of MPEG-21 for the evolution of METS. For the present MPEG-21 has little to offer for it is even more open to differences than METS. There may be advantages to harmonization in the future.
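The wrapper role of METS described above can be made concrete with a minimal sketch that assembles a METS document containing a MODS description, a file pointer and a structural map. The element and attribute names (dmdSec, mdWrap, xmlData, fileSec, fileGrp, file, FLocat, structMap, div, fptr) and the namespaces are those of the METS and MODS schemas; the function name `build_bundle`, the identifiers DMD1 and FILE1, the title and the file URL are hypothetical values chosen for illustration, and the result is not claimed to satisfy any particular METS profile.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
XLINK = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS), ("mods", MODS), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def q(namespace, name):
    """Return a namespace-qualified name in ElementTree notation."""
    return "{%s}%s" % (namespace, name)


def build_bundle(title, file_url):
    """Assemble a minimal METS wrapper around one file."""
    mets = ET.Element(q(METS, "mets"))

    # Descriptive metadata: a MODS record wrapped inside the METS document.
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    mods = ET.SubElement(xml_data, q(MODS, "mods"))
    title_info = ET.SubElement(mods, q(MODS, "titleInfo"))
    ET.SubElement(title_info, q(MODS, "title")).text = title

    # File section: where the digital resource itself can be found.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS, "fileGrp"))
    file_el = ET.SubElement(group, q(METS, "file"), {"ID": "FILE1"})
    ET.SubElement(
        file_el,
        q(METS, "FLocat"),
        {"LOCTYPE": "URL", q(XLINK, "href"): file_url},
    )

    # Structural map: ties the description and the file together.
    struct_map = ET.SubElement(mets, q(METS, "structMap"))
    div = ET.SubElement(struct_map, q(METS, "div"), {"DMDID": "DMD1"})
    ET.SubElement(div, q(METS, "fptr"), {"FILEID": "FILE1"})
    return mets


bundle = build_bundle("Example digital object", "http://example.org/objects/1.tif")
print(ET.tostring(bundle, encoding="unicode"))
```

Technical (for example MIX) or preservation (PREMIS) metadata would be wrapped in the same way inside an administrative metadata section, which is where agreement on profiles becomes important for interoperability.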

PREMIS. The PREMIS data dictionary10 was the product of a recent international project that itself evolved from several European and American initiatives from the 1990s. The committee developing PREMIS met via conference call every few weeks for almost 2 years, with the members in the Pacific area joining the calls around midnight their time. The committee took a very pragmatic approach to the actual data format question: they made the data dictionary implementation-independent. They recognized that many repositories are already in production and for those in the planning stage, the systems environment in which they will reside may have special characteristics. Therefore the PREMIS core elements that are to be available to the repository are not necessarily explicitly stored in it. The elements could be stored in auxiliary systems, could be implicit in the business rules used by the repository, or could be stored within a local database or format. The important point is that the core data be available for gathering into a standard schema in the event of interchange. Systems do not need to be reimplemented or specially designed in order to “conform” to the PREMIS core. For system situations where a format is needed for the elements and for the interchange environment, XML schemas for PREMIS data have been created. This enables the elements to be embedded in a METS record or to be placed in some auxiliary repository site. Work is now under way to determine the best way to position the PREMIS elements within a METS package – a component of METS profiling.

Search protocols

SRU. The Library of Congress’s third area of responsibility for ICABS is search protocol standards. The widely implemented search protocol, Z39.50, was developed pre-web and pre-XML, so this is another place where evolution of a standard is taking place.
Search/Retrieve via URL (SRU) is the new derivative protocol service developed for the Internet environment by an international group of Z39.50 experts, incorporating the most useful specifications of Z39.50 into this new XML protocol. SRU provides three service choices: SRU (via URL), which is the basic service that allows users to send a search using title, name, identifier and other parameters using an HTTP GET and receive XML records in response; SRU via POST, which is an alternative to SRU that uses the HTTP POST as carrier in order to ease some length and character set restrictions; and SRU via SOAP (formerly SRW), which supports a web search that operates over the web-based SOAP protocol. These new search protocols are much less widely implemented than Z39.50 so far, but interest is growing rapidly because of their relatively simple implementation. The Library of Congress has SRU via URL and SRU via SOAP available against its bibliographic file and expects to see growing use of them in the future.

SRU not only represents an evolution of a current standard to a new technical platform, but also an attempt to reach a broader community. The SRU Editorial Board is considering taking SRU, which has already been registered with NISO, to OASIS, the Organization for the Advancement of Structured Information Standards, for more official standardization, in order to reach a larger audience. Interest there could have the effect of providing libraries with more off-the-shelf and open source tools to work with when implementing the SRU protocol. One segment of the information community, specifically Amazon, is experimenting with another search protocol, OpenSearch (or A9), and the W3C has been developing XQuery for a number of years. The SRU Board has initiated discussions with the OpenSearch developers and a comparison of these different approaches to information retrieval and their logical applications will be a topic in the ICABS session at IFLA in Seoul.

TC46 initiatives

Three more areas in this tangle of interrelated standards activities are currently active in TC46, the technical committee responsible for information and documentation standards in ISO, which held a plenary meeting in February 2006 in Thailand. One is MarcXchange, ISO DIS 25577, that was described above; another concerns data storage, specifically a web archive file format; and the third area concerns the interoperability of the identifiers that have been standardized in TC46 over the last 20 years. Work on identifiers of a different type is also a part of the Library of Congress tasks for ICABS.

Data storage

WARC. The Web ARChive (WARC) file format provides a valuable missing component for digital archiving as it specifies a standard for concatenating multiple records retrieved in a web crawl, for example, into one file11.
WARC gives us a standard format for archiving web sites, something that libraries are increasingly called upon to do as the legal deposit laws are updated to include the web and as more and more material that should be saved for future generations appears only on the web. The WARC can be thought of as similar to a WAVE audio file format, TEI for text, TIFF for images, or other specialized file format that might contain digital content in an easy to read and retrieve standard format, only WARC is specialized for web site files. Based on a file scheme that has been used for several years by the Internet Archive, it was reworked in the context of an international consortium formed in 2004 by the Bibliothèque nationale de France that has focused on web archiving12. The format came to ISO at the recent TC46 meeting as a well-developed draft standard. The expectation is that it can be progressed through ISO procedures via fast track and the participants in the Thailand meeting approved that strategy. The WARC file would relate to METS in the same way a TIFF file does, in that a WARC resource file may be described by a METS package with rich metadata, technical and administrative metadata, and even a METS structural map that points to the sites in the WARC set of archived web sites. A profile for a standard METS package for web archive files is now under development.

Identifiers

Resource and work numbers. One area in which TC46 has done extensive standardization is for numbers and codes that uniquely identify items in various media and the more abstract concept of works. Initially the development was stock number driven, with numbers for “manifestations” of an item such as the ISBN (books), ISSN (serials), ISMN (notated music), and ISRC (audio or video recordings). But then rights became an equal if not more critical concern for publishers and rights holders and “work” numbers such as the ISWC (musical works), ISAN (audiovisual works), and ISTC (text works, still a draft standard) were developed. Currently additional identifiers are under investigation including the ISCI (collections) and the ISPI (persons and companies). At the recent ISO plenary meeting in Thailand the registration agencies for these identifiers formally launched a cooperative international initiative to explore issues relating to the interoperability of these identifier systems.
Among the interoperability use cases that they envisioned are linking of manifestations to works, which could help to expand access, manage assets and report usage, and linking of manifestations and works to rights holders, which would help to determine permissions and collect royalties13.

URIs. The above numbers and codes are not the only area where identifiers are a concern. The whole Internet community has wrestled for years with the “persistent identifier” question, and there are a number of identifiers that tackle the issue: URN, URL, URI, “info” URI, ARK, HDL, etc. The situation is complex because the persistent identifier may be called upon to be actionable (i.e., lead directly to an electronic resource), or may be simply a pure identifier. As part of its ICABS responsibility, the Library of Congress maintains a URI Resource Pages web site that explains and provides information about the various identifier issues and a summary of any recent activities in the area on the part of the W3C or the IETF14.

Putting the pieces together

The schematic of a logical METS “bundle” indicates how the evolving standards are fitting into the digital archive landscape – helping to supply missing components for storing, preserving, and accessing the new digital media. Evolving current standards for descriptive metadata such as MARCXML and MODS, new standards like MIX and PREMIS, and the WARC data storage format are filling in some of the components. Continuing evolution of descriptive formats is important. The NISO image metadata standard provides a good foundation for image archiving but technical metadata requirements for other media need to be standardized. PREMIS is a good start on the preservation metadata that should exist in a repository, but more extensive technical metadata for preservation is also needed and should be developed in conjunction with the technical metadata projects like MIX.

Logical METS “Bundle”

[Figure: Resource Space: Audio File (WAV, etc.), Video File (MPEG, etc.), Text File (TEI, etc.), Image File (TIFF, etc.), Web File (WARC). METS Information: Metadata: Description (MODS, MARCXML, etc.), Technical (MIX, etc.), Rights (METS Rights, etc.), Preservation (PREMIS); Structure map; File section.]

Rights are also an important area that needs further work. METSRights serves as a placeholder but many believe that more actionable rights metadata than it provides is needed. The publishers may provide some help there as it is in their interest.

And finally a repository built from METS bundles can be accessed via the standards for search and retrieval of resources coming out of the SRU and identifier initiatives.

[Figure: Repository built from METS bundles, accessed through SRU and related protocols and through identifiers.]
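As a rough illustration of SRU access to such a repository, the sketch below builds a searchRetrieve request URL. The parameter names (`operation`, `version`, `query`, `maximumRecords`, `recordSchema`) are those of SRU 1.1; the base URL, the CQL index `dc.title` and the schema short name `mods` are assumptions, since each server advertises its own indexes and schemas in its explain record.

```python
from urllib.parse import urlencode


def sru_search_url(base_url: str, cql_query: str,
                   max_records: int = 10, record_schema: str = "mods") -> str:
    """Return an SRU 1.1 searchRetrieve URL (hypothetical helper).

    `record_schema` is a server-defined short name; "mods" is assumed here.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,              # a CQL query string
        "maximumRecords": str(max_records),
        "recordSchema": record_schema,
    }
    # urlencode percent-encodes the CQL query so that it is safe in a URL.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # The host below is a placeholder, not a real service.
    print(sru_search_url("http://repository.example.org/sru",
                         'dc.title = "unimarc"'))
```

Because the request is an ordinary URL, the same query can be issued from a browser, a script or another repository without protocol-specific client software.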

References

[1] The ICABS main web home, where the total programme is described and activities reported, is at http://www.ifla.org/VI/7/icabs.htm

[2] The ISO TC46 main page on the ISO site may be accessed from the following address: http://www.iso.org/. On that site, under Standards development, access List of technical committees, and on that page access TC46, Information and Documentation. On the TC46 page are links to the two Subcommittees most relevant to this paper, SC4, Technical interoperability, and SC9, Identification and description. Secretariat-maintained websites for those two Subcommittees are found at http://www.niso.org/international/SC4/index.html for SC4 and http://www.collectionscanada.ca/iso/tc46sc9/index.htm for SC9

[3] Links and URL for the MARCXML web site are found at www.loc.gov/standards

[4] Several examples are described in the following paper: McCallum, Sally H., A MARCXML Sampler. International Cataloguing and Bibliographic Control, vol. 35, no. 1 (January/March 2006).

[5] The Draft International Standard may be viewed at http://www.niso.org/international/SC4/n577.pdf

[6] Links and URL for the MODS web site are found at www.loc.gov/standards

[7] See http://www.diglib.org/aquifer/aquiferbkgd.htm for more information

[8] Links and URL for the METS web site are found at www.loc.gov/standards

[9] The image metadata standard, ANSI/NISO Z39.87, Data Dictionary – Technical Metadata for Digital Still Images is in press. The MIX (Metadata for Images in XML) schema that supports the NISO data dictionary is available from the website noted in [3].

[10] Links and URL for the PREMIS web sites are found at www.loc.gov/standards

[11] A pdf of the February 2006 draft is available from http://www.niso.org/international/SC4/N595.pdf

[12] A description of the International Internet Preservation Consortium is at http://netpreserve.org/about/index.php

[13] An article describing this initiative and the prospective uses for interoperable identifiers may be found at http://www.dlib.org/dlib/april06/paskin/04paskin.html

[14] ICABS identifier site http://www.loc.gov/standards/uri/

ICABS – Umbrella for Multifaceted Activities in the Area of Bibliographic and Resource Control

Renate Gömpel

Birth of ICABS

During the World Library and Information Congress in Berlin a new alliance between IFLA and national libraries was established on 7 August 2003. The intention has been to continue and expand the coordination work formerly done by the IFLA Universal Bibliographic Control and International MARC (UBCIM) and by the Universal Dataflow and Telecommunications (UDT) core programme offices. The UBCIM activity was established thirty years ago and came to an end in 2003. It was originally hosted by the British Library (1973–1989) and later by Die Deutsche Bibliothek from 1990 to the beginning of 2003. In early 2003 the Biblioteca Nacional de Portugal took over the responsibility for UNIMARC. The quarterly journal International Cataloguing and Bibliographic Control (ICBC) is now under the direct responsibility of IFLA Headquarters. The former series UBCIM publications, now IFLA Series on Bibliographic Control, is edited at IFLA Headquarters.

Another component of ICABS is a major part of the programme of the former core activity UDT which was hosted at the National Library of Canada (NLC) from its beginning in the late 1980s until it ended in 2001. UDT also developed and maintained IFLANET, hosted for many years at NLC. In 2001 IFLANET was moved to Institut de l’Information Scientifique et Technique (INIST) in France and is not part of the ICABS activity.

And finally, the Conference of Directors of National Libraries (CDNL), which has provided the main support and funding for these core activities over many years, established a committee to monitor digital library developments – the CDNL Committee on Digital Issues (CDI). The Committee’s work on bibliographic standards and digital preservation has been integrated into the ICABS mission, while the Committee’s work on deposit agreements has been continued separately by the National Library of Australia.

The new approach has been that the national libraries participating in the core activity should take over responsibility in those areas of bibliographic and resource control in which they have had experience for a long time.

Background

In early 2002 IFLA reviewed its core activities. The objectives were to monitor the core programmes, to enhance the core programmes, to widen the support for them and to develop new ideas for core programmes. As to the former UBCIM Core Programme, one of the recommendations was to “continue the core bibliographic/metadata activities and publishing programme but in a different structure including XML and Dublin Core”. As a consequence, Die Deutsche Bibliothek, as the host of the UBCIM Office since 1990, sent out a “Questionnaire on national libraries’ needs on an IFLA core activity in the field of bibliographic control, including metadata, persistent identifiers and interoperability standards”. A total of 55 answers from 39 states from all over the world came back and were evaluated. Three groups of activities were rated as very important or important: harmonization of cataloguing codes and data exchange formats, of interfaces and of document/markup languages (XML); promotion of the concept of VIAF, of the IFLA concepts of FRBR and FRANAR, and of new conventions such as metadata schemes and application profiles, numbering systems, and metadata requirements; and maintenance and development of the ISBDs.

ICABS objectives and goals

Out of this feedback the following objectives and goals were established for ICABS. The objectives of ICABS are to coordinate activities aimed at the development of standards and practices for bibliographic and resource control; to support the international exchange of bibliographic resources by supporting, promoting, developing, and testing the maintenance of metadata and format standards; to ensure the promotion of new conventions; to act as a clearinghouse for information on all IFLA endeavours in these fields; to organize and participate in seminars and workshops; and to enhance communication within the community.

These ICABS objectives will be realized through the following goals agreed to during the IFLA Berlin Conference: to maintain, promote, and harmonize existing standards and concepts related to bibliographic and resource control; to develop strategies for bibliographic and resource control and ensure the promotion of new and recommended conventions; and to advance understanding of issues related to long-term archiving of electronic resources. The goals are primarily linked to IFLA professional priority “Promoting standards, guidelines and best practices”. Some may also be linked to “Promoting resource sharing”, “Providing unrestricted access to information”, “Representing libraries in the technological marketplace”, and “Developing library professionals”.

ICABS partners and their fields

The Biblioteca Nacional de Portugal, the British Library, Die Deutsche Bibliothek, IFLA, the Koninklijke Bibliotheek, the Library of Congress, and the National Library of Australia have agreed to participate in this joint alliance together with CDNL. They are partnering to assure ongoing coordination, communication and support for key activities in the areas of bibliographic and resource control for all types of resources and related format and protocol standards. Each of the partners in this alliance has agreed to be the lead support agency for one or more of the actions, thus realizing the objectives. The following is a brief overview of what each of the partners has been responsible for, and states some major outcomes since 2003.

Biblioteca Nacional de Portugal

The National Library of Portugal took over the responsibility for UNIMARC, formerly under the auspices of UBCIM, which ended in early 2003. Since 2003 the library has been maintaining and updating a number of UNIMARC formats such as UNIMARC/Bibliographic, UNIMARC/Authorities and UNIMARC/Holdings. The UNIMARC Manual (Update 5 of the 2nd edition) in printed form was presented in Oslo. Joaquim Carvalho gave an XML presentation of the UNIMARC Manual. The UNIMARC Forum has been launched at http://www.unimarc.net – a dedicated website prepared by the UNIMARC Core Activity to raise awareness of the UNIMARC format; to provide information and documentation on the uses of UNIMARC in several languages; and to promote discussion between UNIMARC users.
The British Library

The British Library supports the work of the IFLA Division IV FRBR Review Group in developing and maintaining the conceptual model and related guidelines for the Functional Requirements for Bibliographic Records (FRBR) and promotes the use of this model. The British Library is responsible for the IFLA Division IV Working Group on Functional Requirements and Numbering of Authority Records (FRANAR) and for the promotion of the use of this model for authority control. It has similar responsibility for the Functional Requirements for Subject Authority Records (FRSAR) Working Group.

The British Library and Die Deutsche Bibliothek jointly funded a project undertaken by Tom Delsey to map ISBDs to FRBR, in accordance with an agreement with the FRBR and ISBD Review groups. The purpose of this work is to reinforce the essential consistency between ISBDs and the FRBR model. Tom Delsey delivered the final version of the mapping before the 2004 IFLA Conference; it is published on IFLANET [1].

In May 2005 a FRBR Workshop took place in Dublin, Ohio, and was partly sponsored by the British Library. The workshop tackled a number of thorny issues within FRBR and was judged very successful. Workshop participants made a number of recommendations to the review group and started new thinking on some long-standing issues. One of the outcomes was that the FRBR Review Group agreed to contact vendors in order to ask them how they think the FRBR concepts could be introduced into library catalogues. The results of the questionnaire for library system vendors have so far been circulated within the FRBR Review Group and the Cataloguing Section, but will surely be spread to a broader audience.

Die Deutsche Bibliothek*

The German national library (called Die Deutsche Bibliothek) has taken over the responsibility to support the work of the IFLA Cataloguing Section’s ISBD Review Group in developing and maintaining the International Standard Bibliographic Descriptions. Die Deutsche Bibliothek encourages the harmonization of national practices to follow these standards and promotes the results of the ISBD revisions. In addition, Die Deutsche Bibliothek and the Library of Congress are partners to support and promote the idea of the Virtual International Authority File (VIAF) in cooperation with the Sections of IFLA’s Division IV: Bibliographic Control and the partners in the current VIAF Proof of Concept project. They also want to explore other VIAF models and promote the testing of prototypes. OCLC is another partner of the VIAF Proof of Concept project, without being a member of the ICABS alliance.

* Editor’s note: called Deutsche Nationalbibliothek since 29 June 2006.


With regard to the International Standard Bibliographic Description (ISBD), Die Deutsche Bibliothek is supporting mainly the work of the ISBD Future Directions Study Group. The group met in 2005 in Frankfurt am Main and again in April 2006. The support of the mapping project has been mentioned already in connection with the British Library. During the first 3-year term Die Deutsche Bibliothek chairs the ICABS Advisory Board and provides the secretary for the group.

IFLA

IFLA Headquarters took over the responsibility for the quarterly journal International Cataloguing and Bibliographic Control (ICBC) – formerly under the auspices of UBCIM. IFLA Headquarters is also the editor of the former series UBCIM publications, now IFLA Series on Bibliographic Control. The first volume, Vol. 26: IFLA Cataloguing Principles: Steps Towards an International Cataloguing Code, containing the proceedings of the First IFLA Meeting of Experts on an International Cataloguing Code, appeared in 2004. Volume 27: Guidelines for Online Public Access Catalogue (OPAC) Displays and volume 28, containing the proceedings of the Second IME ICC in Buenos Aires, appeared later.

Koninklijke Bibliotheek

Another ICABS partner is the Koninklijke Bibliotheek (KB), the Dutch national library. One of the actions the library has been committed to is the “Development of tools to improve preservation planning and content characterization”. KB and the National Library of Australia (NLA) will jointly take up new initiatives in this area. Since both libraries have had an operational digital repository in place for several years now, the activities in digital preservation are now focusing on the development of permanent access solutions. KB will develop several preservation management tools within the framework of the PLANETS (Preservation and Long-term Access NETworked Services) Project funded by the European Union. The two libraries will cooperate by sharing information and by providing input to the specification and testing of tools developed in the abovementioned areas.

Another goal of the KB is the “Improvement of knowledge sharing and knowledge dissemination on emerging topics in long term preservation of and access to digital resources”. In March 2006 the ICABS survey which KB carried out in 2005 on the state of the art in digital preservation and on guidance was published by K.G. Saur as an IFLA/ICABS publication in the IFLA/Saur Greenbacks Series under the title Networking for Digital Preservation. Current practice in 15 National Libraries. That publication, as well as the ICABS survey on digital preservation guidance documents undertaken by the NLA in 2005, has been made available online on the ICABS website and on PADI (Preserving Access to Digital Information), NLA’s subject gateway to international digital preservation resources. The book describes the state of the art of digital repositories, preservation strategies and current projects in 15 national libraries in Australia, Austria, Canada, China, Denmark, France, Germany, Japan, the Netherlands, New Zealand, Portugal, Sweden, Switzerland, the United Kingdom and the United States of America.

Library of Congress

Besides the joint responsibility for the VIAF cooperative mentioned above, the Library of Congress (LC) is responsible for the promotion of the development and use of MARC21 and its XML derivatives. LC promotes the application and use of Z39.50 and cooperates with its users to continue the development of Z39.50 International, Next Generation and its XML-based Search/Retrieve Web Services in order to evolve next generation implementations. Furthermore, LC closely cooperates with the IFLA Information Technology and Cataloguing sections and their working groups to explore metadata requirements. The library collects and communicates information on existing metadata schemes and application profiles and monitors the work on persistent identifiers.

Of the many outcomes of the work of the Library of Congress only MARCXML and MODS/MADS shall be mentioned here. MARCXML, which has been available from the MARC 21 web site since 2002, provides a lossless pathway from MARC 21 to MARC in XML and then back. Work is under way, led by Danish Standards, to create an ISO standard called MarcXchange for the basic underlying structure of MARCXML.

The Metadata Object Description Schema (MODS) draft version 3.1 was published in July 2005. MODS is being used in a number of digital library projects, either by deriving MODS data from MARC 21 records or by creating simple MODS records for uncatalogued items. Version 1.0 of the XML Metadata Authority Description Schema (MADS) was also posted in 2005 on the MADS web site along with a draft mapping of MARC 21 Authority data to MADS.
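The “lossless pathway” claimed for MARCXML above can be illustrated with a short Python sketch. It covers only the XML side of the round trip: a record held in memory as leader, control fields and data fields is written out with the MARCXML element names (`record`, `leader`, `controlfield`, `datafield`, `subfield`) and read back unchanged. The in-memory layout, the function names and the sample record are assumptions made for illustration; the ISO 2709 binary encoding is not handled here.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the MARCXML namespace."""
    return f"{{{MARC_NS}}}{tag}"


# Hypothetical record: (tag, value) control fields and
# (tag, ind1, ind2, [(subfield code, value), ...]) data fields.
SAMPLE_RECORD = {
    "leader": "00000nam a2200000 a 4500",
    "controlfields": [("001", "demo0001")],
    "datafields": [
        ("245", "1", "0", [("a", "UNIMARC & friends :"),
                           ("b", "charting the new landscape")]),
    ],
}


def to_marcxml(record: dict) -> str:
    """Serialize the in-memory record as a MARCXML <record> element."""
    rec = ET.Element(q("record"))
    ET.SubElement(rec, q("leader")).text = record["leader"]
    for tag, value in record["controlfields"]:
        ET.SubElement(rec, q("controlfield"), {"tag": tag}).text = value
    for tag, ind1, ind2, subfields in record["datafields"]:
        field = ET.SubElement(rec, q("datafield"),
                              {"tag": tag, "ind1": ind1, "ind2": ind2})
        for code, value in subfields:
            ET.SubElement(field, q("subfield"), {"code": code}).text = value
    return ET.tostring(rec, encoding="unicode")


def from_marcxml(xml_text: str) -> dict:
    """Parse a MARCXML <record> back into the in-memory layout."""
    rec = ET.fromstring(xml_text)
    return {
        "leader": rec.findtext(q("leader")),
        "controlfields": [(cf.get("tag"), cf.text)
                          for cf in rec.findall(q("controlfield"))],
        "datafields": [
            (df.get("tag"), df.get("ind1"), df.get("ind2"),
             [(sf.get("code"), sf.text) for sf in df.findall(q("subfield"))])
            for df in rec.findall(q("datafield"))
        ],
    }


if __name__ == "__main__":
    xml_text = to_marcxml(SAMPLE_RECORD)
    print(xml_text)
    print(from_marcxml(xml_text) == SAMPLE_RECORD)
```

Because the XML keeps tags, indicators and subfield codes exactly as they stand in the record, nothing has to be interpreted or mapped, which is what distinguishes this pathway from a conversion to a different schema such as MODS.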

After the PREMIS Working Group, an international task force developing core data elements for the preservation of library and archive material, completed its report in early 2005, the Library of Congress became the official home and maintenance center for PREMIS. Since LC established a web site, the PREMIS data dictionary and accompanying schemas have gained recognition as an important standard for preservation metadata.

National Library of Australia

The National Library of Australia is especially responsible for exploring the role and requirements of the catalogue in supporting rapid and easy access by users to information in all formats, identifying enablers and inhibitors associated with, for instance, functionality, international standards support, cataloguing conventions, and other resource discovery infrastructure. The actions NLA wants to undertake encompass exploring and promoting methods to archive web-based publications collected by web harvesting; developing tools to improve preservation planning and implementation of solutions; and improving knowledge sharing and dissemination on emerging topics in long-term preservation of and access to digital resources.

The Commonwealth Metadata Pilot Project aims to improve access to Australian government information published online by automating the contribution of metadata to the national bibliographic database provided through the Kinetica service, and by automating the archiving of content associated with the metadata in PANDORA: Australia’s web archive. Data is converted from its original format to the MODS standard and then to MARC for loading onto the national database [2].

The international conference “Archiving web resources: issues for cultural heritage institutions” was held at the National Library in Canberra in November 2004. Its main objective was to identify significant issues facing cultural heritage institutions in collecting web resources and to explore how the issues are being addressed. Major research programmes and projects were included in the programme. The programme was deemed to be a great success, with 200 delegates and around 30 speakers representing a range of institutions around the world. A detailed report on the conference and information about speakers and the programme are available on the conference web site [3].

By means of PADI, NLA prepared a review of existing guidance documents (such as standards, guidelines and codes) that address digital preservation issues, in order to identify any significant gaps in PADI coverage. The review of existing guidance documents listed in PADI and various other sources was completed in summer 2005. NLA’s report will be archived in the PANDORA archive of online Australian publications. Both the NLA’s review of guidance documents and the KB’s review of recent developments in 15 national libraries will be made available through PADI.

ICABS coordination

ICABS coordinates and communicates its work and activities to enhance cooperation and to avoid overlapping or duplicating work between the alliance partners and IFLA Headquarters, the Governing Board, the Professional Committee, Divisions and Sections, CDNL and the regional groups of CDNL. If necessary, the work will also be coordinated with UNESCO and other funding bodies, ISO, ICA, and other national and international standard-making bodies in the area of bibliographic control.

The ICABS Advisory Board, chaired by one of the participating institutions on a rotating basis, is constituted by one member of each of the participating libraries (plus two members nominated by IFLA). Die Deutsche Bibliothek has agreed to chair the board for the first 3-year term and thus provides a secretary for the group. The secretary handles various arrangements and communications, for example, creating and maintaining a Web page for the programme and maintaining a web view that promotes the results of the programme on IFLANET. The ICABS Advisory Board will review and evaluate the actions of this alliance after the first 3 years. At present the secretary is preparing the evaluation due in summer 2006.

Next steps

To prepare this evaluation, the ICABS Advisory Board will hold an extra meeting in Ottawa, Canada in April 2006. There the work of ICABS so far and future directions will be discussed, and a questionnaire prepared. This meeting will be followed by the evaluation process; we hope to have the results by the annual meeting during the IFLA conference in Seoul to be able to discuss the outcome and the consequences there.

Besides the Advisory Board meeting, ICABS will offer in Seoul a programme session called “The changing role of the catalogue in supporting resource discovery and delivery”. Further information is available at http://www.ifla.org/VI/7/icabs.htm

References

[1] http://www.ifla.org/VII/s13/pubs/ISBD-FRBR-mappingFinal.pdf

[2] http://www.nla.gov.au/ntwkpubs/gw/65/html/p04a01.html

[3] http://www.nla.gov.au/webarchiving/

UNIMARC – Future Perspectives

Alan Hopkinson

Evolving standards

Standards are never static. There is always a tension between stability and innovation. Standards sit at the point of tension. Standards represent stability. They enable systems developers to have a framework in which to develop systems knowing that there are basic building blocks on which they can build. On the other hand, sometimes something new appears which requires the building blocks to be adjusted. Also, standards themselves are affected by other standards and cannot remain static. Sometimes new standards are developed because of advances in the technology and they replace earlier standards, either replacing them formally with the original withdrawn, or co-existing, in which case the old standard may well fade out gradually until everyone uses the new.

UNIMARC is not a formal international standard; it has never been proposed to the International Organization for Standardization as an international standard. If it were, there would be a danger that it would go before committees made up of ISO member standards bodies who might want to make drastic changes. This happened in the late 1970s with the International Standard Bibliographic Descriptions (ISBDs), a set of ‘standards’ that had been developed by delegated working groups of IFLA but which some national libraries felt would gain more acceptance in their countries if they were adopted as international standards. The effort was abandoned because the members of the appropriate standards committees could not agree to adopt a document they had not themselves developed without recommending changes. Matters have probably improved over recent years, as we can see that Dublin Core, a standard developed outside formal standards committee structures, has been adopted without change as an international standard.

Even though the UNIMARC formats do not have the status of international standards, they have the status of de facto standards. They are regarded as standards in many of the countries where they are used because the national library uses them and the majority of libraries, if not all, accept them. Additionally, UNIMARC, in exactly the same manner as a standard, requires the use of other standards. The main one is ISO 2709, Information and Documentation: Format for Data Exchange. As this and other standards evolve, UNIMARC may have to change; if new versions remain backwards compatible with the old, UNIMARC may not need to change. Or it may be advisable for it to change to keep up with new developments in technology.

A new standard has just been published in draft form which is not a replacement for ISO 2709 but a new way of exchanging, in XML, the kind of data which ISO 2709 exchanged in a particular arrangement or format of its data. It is currently a draft standard but will be called ISO 25577 – Information and documentation: MarcXchange, and it will be voted on by 20 June 2006. As a member of the ISO Committee which looks after this standard, I was able to ensure that UNIMARC was taken into account, though our embedded subfield technique has not been completely covered.

Another area which has had an impact on UNIMARC and will have more in the future is UNICODE. UNICODE has meant significant advances for the representation of material in different scripts. UNIMARC is responsive to this. The Permanent UNIMARC Committee expects to make progress there in forthcoming meetings. We must not forget the new 13-digit ISBN, which is having an urgent impact on UNIMARC. A new ISSN is on its way and may be voted in later in 2006.

But that is not all. UNIMARC is a de facto standard but derives many of its definitions from other de facto standards, notably the ISBDs. In consequence, when they change, UNIMARC has to be scoured carefully to see what changes it needs. The move from ISBD(S) to ISBD(CR) has given us some changes. FRBR has of course been feeding into these changes and now we have FRANAR too. So UNIMARC has to change in line with standards to continue to exist in the environments in which it is used.

UNIMARC: a format for data exchange

UNIMARC therefore has to be developed in line with the opportunities and sometimes the constraints imposed by the developments in the outside world on the activities which it supports.
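The 13-digit ISBN is a concrete case of such an outside development. As a small worked example, the Python sketch below converts a 10-digit ISBN to its 13-digit form by prefixing 978 and recomputing the check digit with alternating weights of 1 and 3. The function names are hypothetical, and the sketch deliberately does not validate the incoming ISBN-10 check digit or say anything about how a format should store either form.

```python
def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first 12 digits of an ISBN-13.

    Digits are weighted 1, 3, 1, 3, ... from the left; the check digit
    brings the weighted sum up to a multiple of 10.
    """
    total = sum(int(d) * (1 if i % 2 == 0 else 3)
                for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 (hypothetical helper).

    Hyphens and spaces are ignored. The old check character (which may be
    "X") is dropped and a new one is computed over "978" + first 9 digits.
    """
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10 or not chars[:9].isdigit():
        raise ValueError(f"not a 10-character ISBN: {isbn10!r}")
    core = "978" + chars[:9]
    return core + isbn13_check_digit(core)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))  # prints 9780306406157
```

The conversion itself is mechanical; the work for a format lies in deciding where both forms are recorded and how existing records are migrated.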
UNIMARC has developed consistently since its initial development, which was to establish a bibliographic exchange format to facilitate conversion between other MARC formats. UNIMARC is intended as an international exchange format. Nevertheless, it has been used as a national format, though since national formats based on UNIMARC all have their own requirements built in as implementations, one could argue that they are not UNIMARC but implementations of UNIMARC: UNIMARC was intended to be the format used to send data between different countries where data in the computers may well be stored in a national version of UNIMARC. In line with its role as an exchange format it has a truly international mechanism for its updating and maintenance in the Permanent UNIMARC Committee.

Constraints imposed from outside include the widespread adoption of XML as a carrier format. For many years people have been forecasting that MARC formats would die. They have not yet done so. Around the world there are thousands of libraries obtaining records in MARC formats from national libraries, bibliographic agencies and now more recently book suppliers. The standards that underpin them and the formats themselves are very sturdy. Even if the record structure, ISO 2709, were no longer used, the tags, codes and other identifiers that make up a record would probably remain, though they would change in time along with changes to cataloguing codes as they have done hitherto. UNIMARC has shown itself able to adapt in this area as well, with work done by users on the Permanent UNIMARC Committee, the Corresponding experts and other expert users to ensure that UNIMARC remains consistent with other standards such as ISBDs and cataloguing rules, which are never stationary.

Current State and Future Progress

When talking about the current state of UNIMARC one has to be aware that every time a meeting of the Permanent UNIMARC Committee takes place changes are agreed: nothing is ever static as far as UNIMARC is concerned. We now have UNIMARC Bibliographic 2nd edition, update 5. Our publisher, K.G. Saur, would like to consolidate this. We have a 2nd edition of UNIMARC Authorities on the Internet. UNIMARC for Holdings was published last year. We have the Concise UNIMARC Classification format. There is more work to be done but the UNIMARC family of formats is more stable than it has been for some time and I think that there are probably currently fewer requests for changes. We have Guidelines for many kinds of materials; the latest being worked on is for Continuing Resources. We continue to have requests for translations of the format documents and the guidelines. Translators often cannot understand the precise meaning of something in the original, or they identify ambiguities, which enables us to make improvements to the documentation.

We cannot ignore MARC21 and we will ensure that we remain in line with what they are doing, in the coded fields, in the assignment of tags and in the assignment of indicators and subfields within fields. We need also to watch what they are doing in the realm of UNICODE and other standards that affect us equally, such as XML and Z39.50. UNIMARC is registered as one of the formats in Z39.50. There are new developments on the way such as SRW (Search/Retrieve Web Service) and the Z39.50 Bibliographic Holdings Schema. We must ensure that any requirements of UNIMARC and its users are taken into account.

I said earlier that we were in a period of stability. Not for long. We are awaiting the publication of Functional Requirements for Authority Data so that we can see what effect it will have on the UNIMARC Authorities format. Methodologies need to be established for representing a few other kinds of materials in UNIMARC, such as manuscript and archival materials. If anyone has any ideas we would be pleased to take them into account. It could be another format document or it could be guidelines.

We also have different kinds of links represented by the 856 field. Recently, our cataloguers at Middlesex University imported records from a book supplier with 856 links which led to contents pages rather than the full text of the book. The users expect full text when they click on a link. How can this situation be tidied up?

I spoke earlier of Z39.50, which is a tool for enabling searching and retrieval of bibliographic records: we need to identify channels for record sharing between libraries and related institutions, but more especially links between the commercial world of book vendors and the library catalogue need to be strengthened and consolidated. Much of this is a matter for publicity. I recently encountered a small specialist book supplier who was afraid of losing business if they could not supply MARC records with their books. I also had a request from a book supplier for advice on tools to convert records to UNIMARC to widen their market. These are areas for us to progress. While talking about the commercial sector, we would benefit from stronger links with the library systems developers and vendors. We must not forget new technical developments in producing a UNIMARC manual in XML, again work with which the commercial sector may be involved (see José Borbinha, Hugo Manguinhas, Nuno Freire “UNIMARC and XML” in these proceedings).

Finally, all that I have been describing amounts to very intense and time-consuming work and I must take this opportunity of thanking my colleagues on the Permanent UNIMARC Committee and Corresponding experts for all the work they do. I must not forget the National Library of Portugal, which hosts the UNIMARC Core Activity, and the Director Fernanda Campos. Now, all this is being undertaken with much less support than was available in the past when we had Marie-France Plassard’s excellent support as Director (again thanks are due to her). But we still have the same amount of work to do. In line with many bodies producing standards and de facto standards, there is less funding for standards-making activities. Members of committees like the Permanent UNIMARC Committee need as much support as they can get and still have as much work to do.

One way in which we can make the work easier is by having a document store with places for public and for committee documents. However, these are not easy to maintain, so a UNIMARC Registry is being established within the UNIMARC Forum to store documentation securely. This is something which in the last four or five years has been adopted by standards bodies as a way of developing standards more efficiently and economically.

I must end of course by thanking our users for using the manual and for contributing to our discussions by making their requirements known, often via national committees. This brings me to the end of my paper, which leads into another important development for the future of UNIMARC, the establishment of a UNIMARC Users group which is holding its first meeting following this paper.


K · G · Saur Verlag

IFLA Publications Edited by Sjoerd Koopman

The International Federation of Library Associations and Institutions (IFLA) is the leading international body representing the interests of library and information services and their users. It is the global voice of the information profession.

115
e-Learning for Management and Marketing in Libraries / e-Formation pour le marketing et le management des bibliothèques
Papers presented at the IFLA Satellite Meeting, Section Management & Marketing / Management & Marketing Section, Geneva, Switzerland, July 28–30, 2003
Edited by / Edité par Daisy McAdam
2005. 165 pages. Hardbound € 74.00 (for IFLA members: € 55.50) ISBN 978-3-598-21843-9

116
Continuing Professional Development - Preparing for New Roles in Libraries: A Voyage of Discovery
Sixth World Conference on Continuing Professional Development and Workplace Learning for the Library and Information Professions
Edited by Paul Genoni and Graham Walton
2005. 307 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-21844-6

117
The Virtual Customer: A New Paradigm for Improving Customer Relations in Libraries and Information Services / O cliente virtual: um novo paradigma para melhorar o relacionamento entre clientes e serviços de informação e bibliotecas / L'usager virtuel: un nouveau paradigme pour améliorer le service à la clientèle dans les bibliothèques et services d'information / El cliente virtual: un nuevo paradigma para mejorar el relacionamento entre clientes y servicios de información y biblioteca
Satellite Meeting Sao Paulo, Brazil, August 18-20, 2004
Edited by Sueli Mara Soares Pinto Ferreira and Réjean Savard
2005. XVIII, 385 pages. Hardbound € 128.00 (for IFLA members: € 96.00) ISBN 978-3-598-21845-3

118
International Newspaper Librarianship for the 21st Century
Edited by Hartmut Walravens
2006. 298 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-21846-0

119
Networking for Digital Preservation. Current Practice in 15 National Libraries
Ingeborg Verheul
2006. 269 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-21847-7

120/121
Management, Marketing and Promotion of Library Services. Based on Statistics, Analyses and Evaluation
Edited by Trine Kolderup Flaten
2006. 462 pages. Hardbound € 128.00 (for IFLA members: € 96.00) ISBN 978-3-598-21848-4

122
Newspapers of the World Online: U.S. and International Perspectives. Proceedings of Conferences in Salt Lake City and Seoul, 2006
Edited by Hartmut Walravens
2006. 195 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-21849-1

123
Changing Roles of NGOs in the Creation, Storage, and Dissemination of Information in Developing Countries
Edited by Steve W. Witt
2006. 146 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-22030-2

124
Librarianship as a Bridge to an Information and Knowledge Society in Africa
Edited by Alli Mcharazo and Sjoerd Koopman
2007. 248 pages. Hardbound € 78.00 (for IFLA members: € 58.00) ISBN 978-3-598-22031-9

www.saur.de


IFLA Series on Bibliographic Control
Edited by Sjoerd Koopman

IFLA Series on Bibliographic Control publications provide detailed information on bibliographic standards and norms, the cultivation and development of which has become indispensable to the exchange of national bibliographic information on an international level. The IFLA Series on Bibliographic Control publications also give a comprehensive and accurate overview of a wide range of national bibliographic services on offer.

Volume 26
IFLA Cataloguing Principles: Steps towards an International Cataloguing Code
Report from the 1st Meeting of Experts on an International Cataloguing Code, Frankfurt, 2003
Ed. by Barbara B. Tillett, Renate Gömpel and Susanne Oehlschläger
2004. IV, 186 pages. Hardbound € 78.00 / sFr 134.00 IFLA members € 58.00 / sFr 100.00 ISBN 978-3-598-24275-5

Volume 27
IFLA Guidelines for Online Public Access Catalogue (OPAC) Displays
Final Report May 2005
2005. 61 pages. Hardbound € 34.00 / sFr 59.00 IFLA members € 26.80 / sFr 46.00 ISBN 978-3-598-24276-2

Volume 28
IFLA Cataloguing Principles: Steps towards an International Cataloguing Code, 2 / Principios de Catalogación IFLA: Hacia un Código Internacional de Catalogación, 2
Report from the 2nd Meeting of Experts on an International Cataloguing Code, Buenos Aires, Argentina, 2004
Ed. by Barbara B. Tillett and Ana Lupe Cristán
2005. 229 pages. Hardbound € 78.00 / sFr 134.00 IFLA members € 58.00 / sFr 100.00 ISBN 978-3-598-24277-9

Volume 29
IFLA Cataloguing Principles: Steps towards an International Cataloguing Code, 3 / مبادئ الإفلا للفهرسة: خطوات نحو تقنين دولي للفهرسة، 3
Report from the 3rd Meeting of Experts on an International Cataloguing Code, Cairo, Egypt, 2005
Ed. by Barbara B. Tillett, Khaled Mohamed Reyad and Ana Lupe Cristán
2006. 199 pages. Hardbound € 78.00 / sFr 134.00 IFLA members € 58.00 / sFr 100.00 ISBN 978-3-598-24278-6