The Impact of Digital Technology on Contemporary and Historic Newspapers: Proceedings of the International Newspaper Conference, Singapore, April 1-3 2008, and papers from the IFLA World Library and Information Congress, Québec, Canada, August, 2008 9783598441264, 9783598220418

The papers brought together in this highly topical book are grouped around three themes. Not only the physical and digita

English Pages 234 Year 2008


International Federation of Library Associations and Institutions Fédération Internationale des Associations de Bibliothécaires et des Bibliothèques Internationaler Verband der bibliothekarischen Vereine und Institutionen Международная Федерация Библиотечных Ассоциаций и Учреждений Federación Internacional de Asociaciones de Bibliotecarios y Bibliotecas

About IFLA

www.ifla.org

IFLA (The International Federation of Library Associations and Institutions) is the leading international body representing the interests of library and information services and their users. It is the global voice of the library and information profession.

IFLA provides information specialists throughout the world with a forum for exchanging ideas and promoting international cooperation, research, and development in all fields of library activity and information service. IFLA is one of the means through which libraries, information centres, and information professionals worldwide can formulate their goals, exert their influence as a group, protect their interests, and find solutions to global problems.

IFLA’s aims, objectives, and professional programme can only be fulfilled with the cooperation and active involvement of its members and affiliates. Currently, approximately 1,600 associations, institutions and individuals, from widely divergent cultural backgrounds, are working together to further the goals of the Federation and to promote librarianship on a global level. Through its formal membership, IFLA directly or indirectly represents some 500,000 library and information professionals worldwide.

IFLA pursues its aims through a variety of channels, including the publication of a major journal, as well as guidelines, reports and monographs on a wide range of topics. IFLA organizes workshops and seminars around the world to enhance professional practice and increase awareness of the growing importance of libraries in the digital age. All this is done in collaboration with a number of other non-governmental organizations, funding bodies and international agencies such as UNESCO and WIPO. IFLANET, the Federation’s website, is a prime source of information about IFLA, its policies and activities: www.ifla.org

Library and information professionals gather annually at the IFLA World Library and Information Congress, held in August each year in cities around the world.

IFLA was founded in Edinburgh, Scotland, in 1927 at an international conference of national library directors. IFLA was registered in the Netherlands in 1971. The Koninklijke Bibliotheek (Royal Library), the national library of the Netherlands, in The Hague, generously provides the facilities for our headquarters. Regional offices are located in Rio de Janeiro, Brazil; Pretoria, South Africa; and Singapore.

IFLA Publications 135

The Impact of Digital Technology on Contemporary and Historic Newspapers Proceedings of the International Newspaper Conference, Singapore, 1-3 April 2008, and papers from the IFLA World Library and Information Congress, Québec, Canada, August, 2008

Edited by Hartmut Walravens In collaboration with the National Library of Singapore

K · G · Saur München 2008

IFLA Publications edited by Sjoerd Koopman

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet at http://dnb.d-nb.de.

Printed on permanent paper
The paper used in this publication meets the minimum requirements of American National Standard – Permanence of Paper for Publications and Documents in Libraries and Archives ANSI/NISO Z39.48-1992 (R1997)

© 2008 by International Federation of Library Associations and Institutions, The Hague, The Netherlands Alle Rechte vorbehalten / All Rights Strictly Reserved K.G.Saur Verlag, München An Imprint of Walter de Gruyter GmbH & Co. KG

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system of any nature, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher. Printed in the Federal Republic of Germany by Strauss GmbH, Mörlenbach ISBN 978-3-598-22041-8 ISSN 0344-6891 (IFLA Publications)

CONTENTS

Foreword
Hartmut Walravens, Chair, IFLA Newspapers Section

Preface
Ngian Lek Choh, Director, National Library Singapore

Keynote Address
Hartmut Walravens, Chair, IFLA Newspapers Section

Track 1: Physical and Digital Preservation of Newspapers

The Future of Newspaper Resource: The development of newspaper companies and its impact on libraries in managing newspaper resource, education and information literacy programmes
Idris Rashid Khan Surattee

The Australian Newspaper Plan (ANPlan)
Pam Gatenby

The Importance of Partnerships for Newspaper Preservation
Beth M. Robertson

Parallel Session A: Digitisation of Historic Newspapers: Approaches and Challenges

Digitising Historic Newspapers in Germany: the Case of Bavaria
Klaus Ceynowa

19th Century British Library Newspapers: Utilising the Online Database
Ed King

Newspaper Digitisation in the Netherlands: The Dutch Digital Databank for Newspapers and other initiatives
Astrid Verheusen

Parallel Session B: Challenges and Opportunities in Digitisation Projects

The California Digital Newspaper Project: Canvassing, Cataloging, Preservation, Digitization
Henry L. Snyder

New Access to Old Materials: The Hong Kong Newspaper Literary Supplements Digitisation Project
Leo F. H. Ma & Louise L. M. Chan

Creation of a National Newspaper Repository at the University of Zimbabwe (UZ) Library
Edward Tasikani

Track 1: Physical and Digital Preservation of Newspapers: the Issues of Legal Deposit and Copyright

Widening Access and Legal Issues: Newspapers in Focus
Majlis Bremer-Laamanen

Digital Ingest of Current Newspapers by the Bibliothèque Nationale de France: The Situation End 2007/Beginning 2008
Else Delaunay

Track 2: Service and Access Models of Southeast Asian Newspapers

Cooperative Efforts in the Preservation of and Access to the World's Newspapers
James Simon

The Index to Philippine Newspapers (IPN) Online
Chito N. Angeles

Service and Processing of Newspaper in Supporting Research: A Case Study of Libraries of Universities in Surabaya
Munawaroh

Enhancing Access to the Newspaper Collections: The Lee Kong Chian Reference Library Experience
Gracie Lee and Josephine Yeo

Track 3: Online Newspapers

Online Newspapers: A New Era
Ed King

Newslink 2.0: Major Issues in the Development of the SPH Multimedia News Archives
Tay Sok Cheng, Sebastian Chow and Ben Lim

All News but No Paper – Harvesting Swedish Online Newspapers
Pär Nilsson

Appendix: Papers presented at the IFLA conference «Libraries without borders: Navigating towards global understanding», 10 – 14 August 2008, Québec, Canada

Canadian Inuit Newspapers and Periodicals: Past, Present and Future
Sharon Rankin

Sauvegarder et numériser la presse des immigrations en France à la BnF, XIXème-XXème siècles
Philippe Mezzasalma

Publication, Access and Preservation of Scandinavian Immigrant Press in North America
James Simon and Patricia Finney

Press, Community, and Library - A study of the Chinese-language newspapers published in North America
Tao Yang

FOREWORD

It is part of the concept of the IFLA Newspapers Section to hold its midwinter conferences in different parts of the world to guarantee a good outreach and foster relations with colleagues in other countries. The conference in Singapore was a particularly successful event: not only has the National Library of Singapore been at the forefront of professional development, it also boasts splendid new facilities and energetic staff. In addition, the Singapore Press Holdings control a variety of newspapers in different languages. This was an excellent background for a conference, and the papers collected here attest to the wide scope of topics within the chosen theme of preservation and access of historic newspapers and the digital challenge. Besides Singapore itself, the geographical range reaches from China to the United States, Africa, Australia and Europe. Particularly encouraging is the fact that Indonesia, the Philippines and Zimbabwe were also represented by papers.

As an appendix, papers on the ethnic press of North America, given at the World Library and Information Congress in Québec, Canada (August 2008), were added: the papers of the immigrants were neglected for a long time, and preservation and access as well as digitisation are particular challenges. So while the subject of these presentations seems different, the practical issues line up perfectly with those of the Singapore Conference.

On behalf of the IFLA Newspapers Section, I gratefully acknowledge the enormous support and hospitality provided by the National Library of Singapore to the Newspaper Conference and also its help with editing the proceedings. It has been a pleasure working with Singapore colleagues! The papers were formatted by Sharmini Chellapandi and the final layout was done, as in previous years, by Carolin Unger; both deserve praise for their excellent work. The cooperation with the IFLA Programme Director, Sjoerd Koopman, and Manfred Link, of K. G. Saur Publishing, has been as pleasant as ever.

Hartmut Walravens Chair, IFLA Newspapers Section

PREFACE

Ngian Lek Choh
Director, National Library, Singapore

I am pleased to have the honour of writing the preface for the 2008 IFLA Newspaper Conference proceedings. This conference has been very successful in showing us the different ways libraries approach the digitisation of newspapers, and how access to the digitised content is improved.

Most libraries and memory institutions digitise newspapers for preservation, and to make access much more convenient for their target clientele. Before the age of digitisation, users had to use microfilm to access newspaper content, and the search and retrieval process was very tedious. This mode of access is still common today for newspapers that have yet to be digitised.

The actual process of digitising newspapers has its own unique challenges, which have been well covered in the papers presented at the Conference. I would like to focus on life after digitisation, that is, how we make access convenient for our end-users.

The 2005 OCLC Perceptions of Libraries and Information Resources survey, which covered search and use habits, showed that a considerable 84% of information seekers use internet search engines such as Google, Yahoo and MSN to get information. In stark contrast, only 1% use library websites. These figures indicate that if we only park our digitised content on the library's website, we will miss a large number of our users who do not use library websites to search for information. Most of them start their search using internet search engines, and they are generally happy with what they get from the internet. Good enough information is good enough for these users, and they form the majority of information seekers today.

The National Library of Singapore initiated a project to find new ways to reach our users in the e-spaces that they frequent. For over two years, the library experimented with repackaging its digital content in such a way as to allow Google to easily find content parked on our website. This is so that when users go to Google to search for any topic, especially on or about Singapore, they are likely to find the library's content listed in the first two pages of the search results. As a result of this effort, we were pleasantly surprised to see use of Singapore Infopedia, a database of over 2,000 articles and pathfinders on Singapore, increase from 400 accesses a month to 150,000 a month in less than 18 months. The good news is that usage of this database is still increasing.

I would like to urge all participants to put their efforts into this very critical aspect of the life cycle of digitisation, that is, making digitised content accessible in new ways, while continuing with the actual digitisation work. I strongly recommend this as it takes considerable time to find new ways of reaching our end-users in the e-spaces that they frequent. We need to keep up with the way users look for and use information in this interconnected world, where internet and mobile devices are a way of life for many of our users. If we do not put in a concerted effort in this area, by the time we complete the whole process of digitisation, we may find that use of the content is not as high as we would like it to be. The resources invested may therefore not be so worthwhile.

The internet spaces out there offer us limitless new opportunities to expose our content to users who choose to spend their time in these spaces, whether to connect to their friends or to look for information. I wish you all the best in finding ways to reach our end-users in their chosen e-spaces, to make digitised newspapers a key resource to our users in their pursuit of knowledge and information. Good luck!

KEYNOTE ADDRESS

Hartmut Walravens
Chair, IFLA Newspapers Section

Preservation and Access

If we review the last 20 years of newspaper librarianship, we notice a shift in priorities which signals rapid progress triggered by technical innovation. This is true mainly for a number of highly developed countries, while some of the others leap forward but leave much infrastructural work to be done. Preservation and Access: these have been the big headlines that characterize the newspaper librarian’s responsibility, and in both cases the challenge is greater than with other library materials. Brittle paper, poor printing, lack of comprehensive collection plans, issues of space and filming quality, and the storing of films have been some of the buzzwords in preservation, while bulk, large-size formats, poor film quality, unsatisfactory reading equipment, cost of digitization and text recognition levels have been some in the access field.

Recording collections

The first big step was the collection and cataloguing of extant material, and this work is linked with such names as NewsPlan in the UK, the Bibliographie de la presse française and the U.S. Newspaper Programme. These long-term actions were highly successful: the national newspaper holdings were registered in a comprehensive way, the catalogues were made available, and papers in a particularly bad state of preservation were filmed right away. Some collections were detected in private hands (usually in basements and attics), others saved from destruction (i.e. from publishers that had gone out of business, or filming companies). These projects also created a new awareness of the value of newspapers as a source of detailed local and family information, with a coverage and granularity far beyond what other sources (monographs, archival sources) provided. Newspapers have become the most sought-after kind of library material.

Microfilming

The responsibility for the preservation of the recorded material led to major microfilming efforts, both in order to save the originals from further damage by readers and to have a master from which any number of user copies could be drawn. While microforms seem a good option for long-term preservation and archiving, there is no doubt that going through hundreds of reels of film to retrieve information is time-consuming and frustrating. Small wonder that students looked for material other than newspapers to base their theses on. Nevertheless, the major achievement of these actions was reliable long-term archiving of the most endangered papers. Relatively inexpensive duplication of films enabled major research libraries and institutes to acquire sets of important newspapers and thus allowed scholars to use this material without expensive, repeated travel.

Digitization

The third step made headlines about ten years ago: Digitization! Electronic storage media had improved their capacity and scanning technology had become more efficient, so the Scandinavian TIDEN project in particular won much acclaim. It proved that whole sets of newspapers could be scanned, especially from quality microfilm, run through OCR programmes and mounted on the internet. As the poor quality of paper and printing did not always lead to a highly satisfactory text, the scanned images were linked with the OCRed text, and the researcher was thus able to check the original page and cite the information correctly. This project made newspaper digitization very popular: in many countries both libraries and commercial providers started digitizing newspapers, and newspaper information became an attractive commodity. Text management software improved and made it possible to recognize headlines, subtitles and dates, and to collate parts of articles scattered over several columns. Texts and bibliographic data, especially date of publication, issue and page numbers, were automatically linked. Libraries embarked on mass digitization projects, e.g. in the UK, North America and France. Mexico and China also did a large amount of digitization, while some other countries decided to put the scanned images on the net and do the OCRing at a later stage. This suddenly made newspapers very attractive to researchers: instead of spending weeks going through reel after reel of microfilm, they could just search the OCRed files and collect the results within seconds. If there were shortcomings, they resulted mainly from the rights situation: libraries focused on material in the public domain, and only a few newspaper publishers ventured on digitization projects of their own. And while libraries usually offered their material free of charge, the private vendors, publishers or service companies, had to charge.
Mass digitization of newspapers is in a number of cases being done in parallel with electronic mass storage of internet data. The growing amount of born-digital data required measures to preserve them, similar to the legal deposit system of national libraries. The Scandinavian countries were again among the first to recognize this challenge and find answers to it. The necessary legislation was put in place, and electronic mass storage systems were installed, usually in tandem to avoid the loss or destruction of data. So far the experience collected with the harvesting and preservation of internet material has been encouraging, and thus some newspaper specialists are going back to the traditional issue of newspaper preservation. Is microfilming really the last word in the long-term archiving of newspapers? Or would electronic mass storage be an acceptable alternative?

Digital Newspapers

These questions are reinforced by the issue of how to handle born-digital newspapers. Some providers, like the Fraunhofer Gesellschaft in Germany, were quick to offer computer output on microfilm as a solution. But people would want to use the electronic papers as they are and not go back to microfilm. Would it be worth preparing the film just for safe-keeping?

When electronic newspapers started, they were acclaimed as a huge improvement over the printed ones. Printing and distribution in the traditional way became superfluous, and thus, hopefully, some overhead could be removed. Also, with the spread of the internet, people all over the world could have access to a paper. In countries lacking infrastructure, electronic access would solve issues of bad roads and unreliable circulation systems and do away with time lags. Also, when papers became available electronically, there would be less chance of censorship or monopolies. But instead of becoming easier and more transparent, the newspaper landscape has become more complex and has offered more challenges.

The term Online Newspapers covers both born-digital papers and digital versions of printed papers. Many publishers see their main market in the print world: people want to read their news at breakfast, on the train or on the beach and not stare at a screen. So the version put on the net is sometimes an exact replica of the printed edition. But sometimes it is updated continuously. In a number of cases, the online version is abbreviated: local advertisements may be left out, and news agency reports and pictures may be dropped (because the respective contracts with the rights holders refer only to the printed edition). The case is somewhat easier with newspapers that are only published online. Some cling to the traditional newspaper layout (presentation in page form), but some have adopted a more flexible, complex format with links and hyperlinks.

For librarians the newspaper situation has become less transparent. The printed papers with their many editions, different formats and preservation problems offered challenges enough: Which edition of a paper should be collected? In the case of legal deposit, should all editions be kept? For understandable reasons some libraries decided to collect only the main edition and the individual regional parts (but not the complete issues). Others gave up collecting the paper originals and focused on microform, thus losing the effects of colour (headlines, pictures). Also, it turned out to be difficult later on to fill in missing pages etc., as the originals had been discarded. Most libraries did not bother to pay attention to the different update levels of a paper: they took what they were sent, and their sets are a complete mix.

Cataloguing offered challenges, too. The pure doctrine would call for individual records for all the local and regional editions of a paper.
The result may be rather confusing to the reader, and instead of linking individual records, many libraries prefer one record per title, not per edition. This is much easier from the bibliographical point of view but does not allow easy retrieval of the desired articles. Some union catalogues - if they exist at all - do not accept comprehensive listings of individual holdings (e.g. single numbers) ... As to the keywords Preservation and Access, the offer of digitized and OCRed files does indeed seem to be the big step forward towards a user-friendly presentation of the material.

Additional Challenges

Digital and born-digital papers offer additional challenges. Again, which paper should be collected, also considering that online versions may not match existing paper editions? Instead of just buying the paper or receiving it under the stipulations of the law, libraries usually have to sign licence agreements which define the conditions of use. In many cases, the readers may only read the paper on the premises of the library and may not download it to CD or DVD. In other cases, there is a national licence, covering the academic or library sector of a country. This makes things easier, but the libraries still have to see to it that no unauthorized use is made of the material. Sometimes authorized readers, e.g. the faculty of a university and its registered students, may use material by proxy, with their user IDs and passwords.

There is an increasing number of free internet papers of varying quality; they have their attractions for readers, like the free papers in printed form. But they lack a lot of the information that the established commercial papers offer. On the other hand, some provide easy, even if preliminary, access to areas which are not always in the focus of international media, e.g. English-language papers in Central Asian countries. Such free papers may not be preserved on a systematic basis. Many countries do not yet have legal deposit legislation in place for such digital publications. The origin of some papers is difficult to locate; sometimes such information is not given for tax reasons or for fear of censorship. At any rate, there is usually no rights problem with the free papers, as such restrictions would bar access to the news media for those who either do not have the money or do not want to spend it for this purpose.

Current Development

These are just a few of the many challenges that the libraries and the readers are currently facing. There is no doubt that the current development

- made information accessible much faster worldwide - often for a fee.
- did not change the reading behaviour of many newspaper readers who still want to have their papers in their hands. Whether electronic ink is the solution, as it looked for a while, remains to be seen.
- made rights become more important than in the past. Producers and service companies rediscovered the market value of historical newspaper information.
- made newspaper contents easily and speedily available for a select number of papers. This number is set to increase substantially.
- will establish online papers as an ideal medium for quick information.
- proved the old maxim that there is no such thing as a free lunch - good information has to be paid for.

The Mission of Libraries

What is the mission of libraries in this context? What are they expected to do?

- Make newspapers available on a broad scale
  This is being done by major digitization projects for historical printed material. There remains a time lag of about 70 years for copyrighted material. Only some major papers are offering their own digital products.

- Conclude licence agreements with important papers or negotiate national or international licence agreements.

- Create information tools to point out existing newspaper holdings, digital offers and online papers, and provide navigational help.
  This can be done by creating newspaper portals, or websites with lists of updated links. They would take the role of traditional bibliographies.

- See to a comprehensive preservation of published material.
  That means preservation of as many papers as possible, disregarding the format of publication. Not only «important» publications, but also small local and free papers should be covered. If the newspaper programme cannot take charge of this, perhaps a web-harvesting initiative would include «less relevant» newspapers.

- Improve access to non-digital holdings
  The present digitization euphoria suggests to many people that in a few years «everything» that is needed will be available in digital form, hopefully free of charge. While this vision is certainly understandable, it is already clear that it is only a vision. Or, we have to be generous with the term «everything» and limit it to mainstream and major scientific publications; this would include the view that many publications in the sciences that are older than five years are already classified as «history of science», relatively close to archaeology. There will always be a relation between demand and supply, and small areas of interest will not be covered by digitization because the expense of processing the material and maintaining it on the net would be out of proportion to the expected use, income or publicity. Therefore libraries will still have to deal with the traditional problems and issues.

- Recheck some of the traditional basics of newspaper librarianship: microfilms, quality issues, storage
  Some of the currently accepted axioms of newspaper librarians are under re-examination, depending on the technological development. Should it become obvious that electronic mass storage could be customized to the needs of the newspaper experts and would be safe enough on a long-term basis, then some of the present issues and doubts could be discarded. The best microfilms are of little value if the quality of filming was insufficient, if no checking is done after the filming (to detect gaps, or lacking quality), and if the films are not stored under the necessary climate conditions. But technology is not the only issue: there are access limitations because of the rights situation, for example, and in such cases the microfilm is still, and will remain, a good - even if only second-best - solution.

- Improve cooperation between the communities - libraries, archives, museums, publishers, rights-holders ...
  Newspapers are collected by a number of different institutions - not just libraries. For that reason it is important to get the different stakeholders together to discuss the current issues and contribute to the further development.

- Deal with international initiatives - like Google!
  The digitization boom has alerted some big international providers like Google and Microsoft, who plan to scan millions of items and make them available through their networks. This seems a splendid idea and certainly meets some objectives of the library community. On the other hand, negotiations have shown that one has to pay a lot of attention to the details to ensure that libraries, archives, institutes etc. that furnish the original material also have their share of rights in the products, and also may exert influence on the processing, e.g. quality standards, linking of bibliographic records, third party use ....

- Information rich? Information poor?
  This is a big political issue, and newspapers are but a part of the whole complex situation. The question reduces the matter to an economic factor. But there are also other determinants: keeping one's information and not offering it generally (e.g. archives); library privileges (there are large libraries, especially but not exclusively in the former socialist countries, that offer their materials on different levels, so that only holders of privileges get access to top material or what is considered to be of such kind); lacking infrastructure; lack of reading ability; lack of buying power within the country; lack of education; lack of ... You will be able to come up with more obstacles, depending on the local situation. There is no one-size-fits-all. I am tempted to say that librarianship is like politics: politics is the art of the possible, as a wise saying goes!

IFLA Newspapers Section

What is the role of the IFLA Newspapers Section in this context? Its mission statement says:

The Newspapers Section is concerned with all issues relating to newspapers in libraries and archives, including acquisition and collection development; intellectual and physical access; storage and handling; preservation of newspapers and their contents; preservation of microfilm of newspapers; interlibrary lending; and the impact of digital technologies on all of these.

As it is just a small group of international experts, it can only advise and serve as a catalyst. But its advantage is that it is international, and so it is not focused on a particular library, or a particular country. It tries to monitor developments, evaluate new technologies, spread best practice worldwide and compile guidelines. In addition, one of its main tasks is to foster communication within the newspaper sector. This is partly accomplished within the framework of the World Library and Information Congress (formerly IFLA General Conference); but there is only a time-slot of about two hours available, and so the Section decided to combine its annual midwinter business meetings with an international newspaper conference in order to get in touch with colleagues in other parts of the world, stimulate discussion and alert the public to the importance of newspapers and their preservation. Thus we organized a number of such conferences on an annual basis; actually we covered all continents, and we were surprised how well our initiative was received. Apparently colleagues were just waiting for something like that ...
Section Conferences

2008: IFLA Québec; Singapore Conference
2007: IFLA Durban; Santiago de Chile Conference
2006: Poznan; IFLA Seoul; Salt Lake City conference
2005: IFLA Oslo; Arctic Circle Preservation symposium, Mo i Rana, Norway; Canberra Conference
2004: IFLA Buenos Aires; Shanghai Conference
2003: IFLA Berlin; Central European Newspaper Conference, Berlin; Cape Town conference
2002: IFLA Glasgow; Mikkeli meeting
2001: IFLA Boston; San Francisco meeting
2000: Paris Conference; IFLA Jerusalem; Moscow meeting

Publications

But even successful conferences are not likely to make major inroads: only a fraction of the interested public is able to attend, and so we made it a point to publish the proceedings of our events in order to disseminate the results of our and our colleagues' efforts. IFLA has been very supportive in accepting our books for publication:

- Microfilming for Digitisation and Optical Character Recognition. Published in French, Spanish and Chinese as a Supplement to The Guidelines for the preservation microfilming of newspapers compiled in 1996 and mounted onto the IFLANET.
- Managing the Preservation of Periodicals and Newspapers / Gérer la conservation des périodiques et de la presse. Jennifer Budd (Ed.). Proceedings of the IFLA Symposium / Actes du Symposium IFLA. München: Saur, 2002. 175 p. (IFLA Publications; 103)
- Newspapers in International Librarianship. Papers presented by the Newspapers Section at IFLA General Conferences. Edited by Hartmut Walravens and Edmund King. München: Saur, 2003. 260 p. (IFLA Publications; 107)
- Newspapers in Central and Eastern Europe – Zeitungen in Mittel- und Osteuropa. Edited by Hartmut Walravens in cooperation with Marieluise Schillig. München: Saur, 2005. 251 p. (IFLA Publications; 110)
- Gazety – Newspapers. Resources, processing, preservation, digitization, promotion, information. Conference proceedings Poznan, October 19-21, 2006. Poznan: Uniwersytet im. Adama Mickiewicza, 2006. 478 p. [Polish and English texts]
- International Newspaper Librarianship. Edited by Hartmut Walravens. München: Saur, 2006. 298 p. (IFLA Publications; 118)
- Newspapers of the World Online: U.S. and International Perspectives. Proceedings of Conferences in Salt Lake City and Seoul, 2006. Edited by Hartmut Walravens. München: K. G. Saur, 2006. 195 p. (IFLA Publications; 122)

The proceedings of last year's conference are now almost ready for publication. It will be a 400 p. volume with texts in English and Spanish. Owing to the kind support of sponsors most contributions were translated into the respective other language so that this volume may be a good basis for a lasting exchange of information. Needless to say, plans are under way to publish the proceedings of the present conference right after the fact.

Conclusion

Newspapers will remain a world phenomenon for some time to come. Senior leaders in libraries will be confronted with the need to make them available in printed format or in online format. Consistent planning and resourcing of these activities will be needed, providing access to the content of newspapers, in order that libraries may serve their users most effectively.

THE FUTURE OF NEWSPAPER RESOURCE: The development of newspaper companies and its impact on libraries in managing newspaper resource, education and information literacy programmes

Idris Rashid Khan Surattee

Abstract

In recent years, there has been an expansion in the use of media resource materials for teaching and learning in schools and colleges. As most major newspapers shift their core content from news reports to news features and analysis, the quality of journalism and intellectual discourse changes. There is now a greater demand for feature articles and analytical pieces from schools and institutions of higher learning for their teaching and learning materials. The use of newspaper materials for education poses a great challenge to librarians in preserving contextual information and challenges the librarians' notion of information literacy, which is often equated with technique in information retrieval and the evaluation and selection of databases. This paper will attempt to highlight the relevant trends in the development of feature articles and analytical pieces in newspapers and the use of these materials in education. It will also discuss major issues confronting librarians, especially the way these materials are indexed, annotated and organised to preserve the relevant contextual information, as well as the development of an appropriate information literacy programme for handling media materials, especially those of the newspaper. It proposes to advance the role of the librarian as educator and identifies the appropriate competencies and skills required of a librarian to perform this role.

1 Introduction

The importance of newspapers as a resource for scholars and researchers should not be underestimated. Even in this time of media proliferation, the newspaper is still a very important medium, with a special place in the library's collection. A Library of Congress document states that “As a resource for scholar and researcher, no form of public records captures the day to day life of the community and citizens better than the local newspaper”. “As a primary source of local history information all newspapers, metropolitan dailies, suburb, rural weeklies and rich ethnic press are worthy of retention and preservation. Yet the effort required, due to the number of papers published and the quality of the paper on which they are printed, is tremendous” (Preservation newspapers, n.d.).

At the same time, institutions that are spending huge public resources to preserve historical newspapers are under pressure to make their collections more accessible to the public. The rapid development of information search and retrieval tools and of the free access model has raised public expectations for the availability of, and easy access to, historical newspaper collections. Readers are demanding that the collections be available on the same platform from which they access all other media.

Idris: The Future of Newspaper Resource

Digitisation of historical newspapers has become one of the most important programmes in many national and public libraries. Such programmes, involving newspapers going back more than a century, require huge resources. These libraries often justify the huge public expenditure for digitisation on the grounds of heritage preservation and of making heritage information more accessible to the public. Digitisation of historical papers is also being undertaken by commercial enterprises like the New York Times and Washington Post for purely commercial considerations. These considerations determine the scope of the digitisation programme and the way the resources are organised or repackaged, distributed and used.

The digitisation of historical newspapers going back more than a century opens a new window onto the life of the community of the time as captured by the press. These digitised papers not only provide a rich historical resource for scholars and researchers; the availability of the content to a mass audience also opens new opportunities for its use, which are not necessarily historical in nature. For example, the use of historical content for advertisements and for product design and merchandising creates new information which is detached from its historical context. The creative use of news reports, headlines and advertisements in developing new information products and services poses new challenges for information professionals and users alike.

In addition to the challenge of archiving historical newspapers, librarians are confronted with new challenges brought about by the rapid proliferation of online newspapers and by the changing features of the print version and its associated products. The relevant libraries need to develop new strategies for the long-term management of newspaper content and the accessibility of these new expressions of the newspaper.

We need to examine two important areas relating to the development of newspapers. First, we need to examine the development and trends in newspaper content creation and newspaper products. Secondly, we need to examine the changing pattern of behaviour of the readers and users of newspaper information. The behaviour patterns of readers and users of media information are in turn shaped by factors beyond the newspaper or media industry itself. The proliferation of digital information resources and the way society interacts and mediates through these resources create new cultures and lifestyles, which generate the demand for particular types of news and information and shape how these are consumed and created.

This paper will attempt to identify some key trends in the newspaper industry and in the behaviour of readers and users of newspaper information which are relevant to libraries as repositories of newspapers. These trends will help librarians better understand the changing features of newspapers as an information resource. The paper will also attempt to discuss some strategic options for librarians in handling newspaper content, which challenge our idea of newspaper archives, and how librarians need to review the way they approach their interaction with creators of news content and users of this resource.

In understanding the development of the industry it is no longer sufficient to discuss the problem of migration of newspapers from print to the digital copy or to online newspapers on the web. “This trend is obvious affecting all forms traditional media be it the print or electronic media. All media are going digital. The digital revolution is not just impacting the web; it’s impacting television radio, newspaper, magazines and outdoor. They are all becoming digital in part and will almost certainly be fully digital within a decade.” (Morgan, 2006).

Is the digital newspaper still a newspaper resource? Or is it news content on a newspaper company’s web-site, or news articles in an information database, or news content on social network sites? The categories we use today to differentiate our resources will become meaningless to consumers or users. To them it is just news or information or entertainment or games or a great sale offer, or all of these in a single resource base. Most libraries are not prepared for this and do not have a strategy to manage such a resource and its usage.

2 Newspaper companies in search of a new value proposition

Although we see an increasing number of newspapers available online, the print edition is not dead. In fact the print version is still the main revenue earner, and revenues from print make it possible for newspaper publishers to invest in new innovative products, which include print and online versions. As far as the newspaper business is concerned, it is not about the battle between print and online but the battle for the eyeballs of readers and consumers of information products and services.

For many years now, newspaper companies in developed economies, with the exception of a few countries such as Singapore, have been experiencing a continuous decline in circulation, readership, advertising profit and, for publicly listed companies, share prices. There has been intense discussion on the future of newspapers and newspaper companies. Newspaper companies have been playing catch-up with other media and information service leaders, such as Google and Yahoo, in news gathering and dissemination.

In response to this lag, newspaper companies are shifting their focus from product management to audience management. The audience, both consumers and businesses, are the pillars of the market. The newspaper company now “strives to touch and connect every consumer and serves every business, not just the consumers who read news and businesses that want and afford mass reach advertising”. Attention to the specific needs of consumers regardless of demographic, and of businesses regardless of size, has become the primary focus in attempts to penetrate the “whole market”. By capitalising on them, newspaper companies must reach beyond the limits of newspapers and news, becoming what is described as “local information and connection utilities” (Gray, 2008, p.1). Newspaper companies are reinventing themselves as multimedia companies without preference for any specific platform. Customer demand drives the medium on which information and utilities are delivered.

This approach fundamentally changes their core business. Newspaper companies are taking the leap beyond being newspaper companies to become a larger, more diverse kind of company whose core business is information services. The print newspaper is no longer the only medium for news and information. It is now one of the many products developed to engage customers and businesses. To avoid the products cannibalising one another through competition, each product is designed to complement the others. The print newspaper is designed to complement other print products and online information services and utilities to fill the needs of the whole market.

The role of the print newspaper has shifted. It is no longer the best medium for reporting the day-to-day life of the community and events. Online media portals, mobile news channels and user-generated social networks compete in the social space for news gathering and dissemination. These are more suitable platforms for breaking news and "straight reporting" of events. Even the very idea of what constitutes news is no longer defined by newspaper media companies.

Newspapers are shifting their attention to providing perspective and analysis. News analysis, commentaries, feature articles and opinion pieces are the kind of content which distinguishes the newspaper from the online news portal. These higher level intellectual activities are poised to be the new core value of the newspaper. Although newspapers will continue to carry news reports and other utility and service information, these are peripheral to the core content. They provide the core content with the necessary reference links to the spectrum of information and conversation taking place on other platforms.

This shift in the role of newspapers poses new challenges to our evaluation of newspapers as an information resource. The newspaper is no longer the primary resource that captures the day-to-day life of the community. Instead it represents the discourse on issues that are deemed important by the community at a particular period. These discourses are not divorced from the news reports and conversations that are taking place outside the newspaper. Therefore, as a resource, newspapers must be closely integrated with the other sources of information. Facilitating this integration will be a great challenge for libraries.

3 Localised and customised content

Another major trend taking place in newspaper companies is associated with the idea of “hyper-local”. In the industry it simply means the “idea of parsing and tapping into smaller and smaller audience niches with the Internet or with smaller publications in order to deliver more targeted content and advertising. It is an attempt to better reach people with information and services tailored to their specific geographical and cultural interests, and to use that as a platform for local customised advertising that will be of most interest to these people” (Hirschman, 2008, p. 3). Studies of consumer and business needs show that consumers seek more specific local news content, and that local businesses seek an advertising mode that does not need to reach a large mass market but instead reaches a local or specific audience. Hyper-local projects in newspapers are in part a response to the rise of the internet, which is by nature hyper-local in the sense that it allows people to connect in any way they please.

Newspapers, which over time grew to become high-volume mass circulation products targeting generic readers and businesses, are now being reorganised to serve more localised communities. These local editions carry more local news and advertisements from businesses targeting local consumers. The smaller print run and lower cost of distribution help to reduce the cost of newspaper operations and lower the cost of advertising for local businesses. New printing technology enables publishers to produce high-quality, low-volume and yet low-cost printing of newspapers at a location close to the community they serve. Decentralised newsrooms can now print very localised editions of news content and advertisements and yet retain the editorial content and character of the national paper. Special editions with small print runs allow newspapers to target crowds at a specific location and time, for example by printing a special edition for the lunch-time crowd in the central business district.

Until quite recently, it was rare to see publishers putting out multiple editions of daily papers, because it was too expensive and time-consuming to print a second edition or regional edition unless publishers found it necessary to do so. However, changes on the run are sometimes made to correct errors or to replace some minor stories. Digital pre-press processes and advanced printing technology allow such changes to be executed very easily. Even though some of these changes may be considered too small, from the editorial point of view, to warrant the label of a second edition, they are significant for libraries as repositories of published materials.

Part of the hyper-local project for newspapers is the publication of special topic newspaper magazines, supplements and pull-outs on specialised subjects or lifestyles targeted at specific reader groups. These materials, which are supplementary to the main newspaper, are not necessarily circulated together with every copy of the paper. Some are only available online and accessible to subscribers of the print edition. The online platform also enables these hyper-localised newspaper projects to develop highly customised and personalised content services. This hybrid method of information delivery serves the need for an interactive response to hyper-local demand.

The publication of multiple versions and changes on the run are now part of an established mode of operation for publishers. This is also partly driven by the readers’ and advertisers’ demand for greater localisation of content and targeted advertising. The creation and production of newspaper content are highly responsive to the demands of consumers and businesses. Consumer- and business-generated content is becoming a significant part of the editorial content. Industry-based professionals are often invited to write commentaries and news analysis on issues related to their industry. These experts help to enhance the depth of coverage. However, these industry-based stringers also bring into their writing their professional and industry interests. This is part of the trend in a consumer-centric business world. The media is no longer a one-way communication. Now it is about having a conversation, in some cases with one person at a time (Morgan, 2006). In the newspaper industry this is often referred to as citizen journalism. “Hyper-local projects at newspapers are in part a response to the rise of the internet, which is by nature hyper-local. The internet allows people to connect themselves in any way they please to any information they please so that people increasingly define themselves by the content they send and receive and search for and self published.” (Hirschman, 2008, p. 4)

Tracking, acquiring and archiving localised versions of newspapers and their supplementary products, which are delivered through hybrid channels, pose great challenges for libraries. This is made more difficult if publishers do not see the need to differentiate the variations in content with easy-to-recognise labels.

4 Blurring of editorial and marketing content

Another trend in the newspaper industry is concerned with editorial content. Traditionally, editorial policy dictates a strict separation between editorial content and marketing or advertising content. Editorial content is usually regarded as offering balanced, objective and even critical views when reporting on businesses, products and services, while marketing and advertising are designed to promote the images and brands of businesses and their products and services. This allows readers to exercise greater care in evaluating advertising information while being more trusting of editorial pieces.

We are now seeing the emergence of a hybrid editorial commonly known as the advertorial. This is a marketing feature which takes the form of an editorial. It is designed to help businesses and industries reach out to readers by providing relevant information about the industry, companies and their products and services. The relevant industries and corporations commission these advertorials. These are sometimes written by journalists who write for the editorial pages. This is part of the marketing, branding and public relations exercise of an industry or corporation and is essentially advertising content disguised as editorial pieces to benefit from the editorial tradition of journalistic objectivity.

This new form of marketing is beginning to encroach on editorial content. Many advertorials come in the form of special supplements, pull-outs or special subject magazines. Others come in the form of regular sections, which are often indistinguishable from editorially generated content. Although some newspapers provide labels to distinguish between these sections, most readers would find it difficult to make the distinction, especially for those pieces that pose as service and utility information. These materials are taken as editorial pieces by news aggregators and syndicated services, minus the labels, and are archived in commercial news databases. This development has a significant impact on how librarians manage newspaper content and on the thinking behind user education.

5 Content unbundling and syndication and news archives

The creation of newspaper content has for some time now adopted a fully digital workflow from editorial to pre-press. In the last decade or so, publishers have been able to post newspaper content on the web, initially as HTML web pages, and more recently publishers have been uploading their e-papers to the web. These e-papers are a facsimile of the printed version, usually as PDF or DjVu files. In a sense, this is still a very traditional newspaper, with stories that are posted on the web and are often an exact copy of the printed version. However, the publication of online papers as an exact copy of the printed version has not been able to generate higher advertising rates for print advertisements. Advertising rates are still tied to the traditional circulation count. Revenue from online subscription has yet to become a viable alternative.

Newspaper companies are unbundling newspapers either by sections or by categories such as news, companies and business information, perspectives and analysis, and so on. Each of these sections or categories can have a different revenue model. For example, news and multimedia content, which usually attract higher traffic, may run on an advertising model, while perspective and analysis, which have lower traffic, run on a subscription or pay-per-view model. This unbundling is also applicable to materials supplied to commercial news archives and syndicated content providers. These are content resellers. Content reselling, previously just another source of revenue for newspapers, is now in a position to become one of the major ones. Syndicated content providers like Getty Images, Mochila and AFP Image Forum are offering innovative sales and advertising models, which allow web publishers to use aggregated content for customised services. This content may be used in a context quite different from the original news context. For example, news photos and articles may be used, often with modification, as marketing brochures or as educational materials in a context far removed from the original news context. These may then be archived as a new information resource in commercial databases. Even content aggregators like Factiva and LexisNexis, which traditionally provide reference database services, are beginning to offer licences to their clients to post aggregated content on the clients’ web publications. Moreover, digitised historical newspapers dating back more than one century are being unbundled to suit the business models of various content resellers.

The archives maintained by publishers or commercial database providers are selective, based on the commercial value of the content. Content that is deemed to have no reuse or resale value is often discarded or at best archived on an offline medium. For example, classified advertisements and notices are often not available in commercial databases. Photos and graphical information are not available from the databases which provide textual content. The effect of this unbundling in commercial news archives is significant because these are the new sources of information relied upon by both casual and expert users, including information professionals like librarians.

Yet there are currently no standard guidelines on how these unbundled materials should be described to retain as much as possible of the original context of the information. There is a need to record context data, often referred to as content metadata, at the source of creation and use. There is a need for standard tools and facilities in editorial and news production systems to create and manage these metadata and to enable them to flow across different publishing and archival systems. Currently, there is no standardised way of structuring and handling metadata in editorial content management systems. Some applications structure their metadata elements following the IPTC convention, while others use metadata elements similar to those of Dublin Core. Sharing these data across different information and archival systems is limited by the data structures of the different systems. As a result, only selected metadata from the source are shared when the data is transferred, and a new set of metadata is created for the migrated data, which could depart from the original description.

Content creators and newspaper companies are aware of the value of their content. However, they often evaluate this in terms of the potential commercial value vis-à-vis the cost of managing the content. Commercial considerations rather than heritage preservation often determine their decisions on content description. Newspaper companies are often ready to invest in resources which add commercial value to the content. The need for a comprehensive archive of the intellectual output of society is outside the scope of their business.

Many libraries are already relying on commercial databases to provide access to newspaper content. Unless these libraries are responsible for the comprehensive archiving and preservation of newspapers as a heritage resource, most of their needs are being satisfied by these commercial databases. Unless there is an arrangement for a public institution responsible for heritage preservation, like the national library, to engage newspaper companies, content aggregators and database providers on the basis of mutual benefit, it is unlikely that commercial archives will ever meet the requirements for heritage preservation.
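The selective transfer of metadata described in this section can be illustrated with a minimal sketch. It assumes a hypothetical crosswalk from IPTC-style field names to Dublin Core element names; the source-side field names, the placeholder article record and the `transfer` helper are illustrative assumptions and do not describe any particular editorial or archival system. Only the target element names (title, creator, subject, description, date, rights) are taken from Dublin Core.

```python
from dataclasses import dataclass, field

# Hypothetical crosswalk: IPTC-style source field -> Dublin Core element.
# The pairs are illustrative assumptions, not an official mapping.
IPTC_TO_DC = {
    "headline": "title",
    "byline": "creator",
    "keywords": "subject",
    "caption": "description",
    "date_created": "date",
    "copyright_notice": "rights",
}


@dataclass
class TransferResult:
    record: dict = field(default_factory=dict)   # fields that reached the target
    dropped: dict = field(default_factory=dict)  # context lost in the transfer


def transfer(source: dict) -> TransferResult:
    """Copy mapped fields to the target schema and keep track of the rest."""
    result = TransferResult()
    for name, value in source.items():
        target = IPTC_TO_DC.get(name)
        if target is None:
            # No equivalent element: this piece of context is not carried over.
            result.dropped[name] = value
        else:
            result.record[target] = value
    return result


# Placeholder record; every value here is invented for illustration.
article = {
    "headline": "Sample headline",
    "byline": "Sample writer",
    "keywords": ["sample", "placeholder"],
    "edition": "second",           # newspaper-specific context
    "page": "A3",                  # newspaper-specific context
    "content_type": "advertorial", # editorial vs. marketing label
}

outcome = transfer(article)
print("transferred:", outcome.record)
print("lost context:", outcome.dropped)
```

In this sketch the edition, page and content-type labels have no counterpart in the target schema and end up in `dropped`, which is the kind of loss of original context the text describes. Recording what was dropped, rather than silently discarding it, is one possible design choice for keeping that loss visible.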

6 Newspapers in education

Another major initiative of newspaper companies is the use of its news content in education. They are keen to promote the use of newspapers for education as part of their strategy to cultivate young readers for its long term development of readership. Programmes designed to support education is also part of newspaper companies’ whole

16

Idris: The Future of Newspaper Resource

market approach for new revenue streams. Most recently, newspapers are being actively promoted through the Newspaper in Education (NIE) programme and newspaper school projects through the school media club. One of the projects is to offer journalism training programs directly to students. NIE is a programme designed to promote the use of newspapers in education. Although it is an established programme which started in 1955, it was only recently that it was used extensively as a major part of school teaching resource. Newspaper companies are actively designing lesson plan around their newspaper content to encourage teachers to use its content in teaching every possible subject in schools. It is developing new business models in partnership with educational content providers to push content to schools, training teachers or providing teaching guides and student workbooks. Digital copies of the print version with lesson plan, which are in line with the school curriculum, are available through subscriptions or the licensing model. Newspaper content is also incorporated in course packs and as supplementary materials for case studies used in institutions of higher learning. College and university students are offered a chance for attachment as journalists for the newspapers. The NIE programme is useful in promoting literacy and opens the readers to a range of information resources on current affairs. “Several studies highlight the significant effect using newspapers has on academic achievement, such as students receiving higher test scores, and on the development of civic values…research has found a strong link between a child using a newspaper in the classroom and that child becoming a lifelong newspaper reader”. (McMane, 2007, p.8) However, engagement with newspaper content in NIE programmes often assumes that the media is a repository of facts and wisdom. 
The programme often neglects the evaluation of the very nature of the content itself: the localised social and political backdrop of the discourse, the social networks and interactions which generate the content, and the ideological underpinning of the content source. The blurring of distinctions between editorial and marketing content, and between the output of professional journalists and that of industry-based stringers, affects the skills and competencies students need to critique the content effectively and create knowledge. The NIE programme, which is planned and managed by a newspaper company, faces some inherent problems. Critical information literacy skills and competencies are often not incorporated into the programme itself. It may not be in the interest of a newspaper company to engage students critically on its content, since its reputation as a trusted content provider could be undermined. The challenge for librarians, especially school and public librarians, is therefore to introduce a suitable programme to develop critical media literacy skills and competencies in students who use media resources for education.

7 Content information and information literacy

The preceding discussion of the changing business of newspaper companies and its impact on the newspaper as information undermines the basis of our assumptions about the newspaper. Newspapers are no longer the best public records, of which it could be said that nothing “captures the day to day life of the community and citizens better than the local newspaper” and that “As a primary source of local history information all newspaper… are worthy of retention and preservation”.

The archived newspaper, whether in print or as a digitised copy, will not be the primary source of information for scholars, researchers or library users. Commercial archives, specialised databases and social network spaces will be the main sources of newspaper content. This unbundled content can be presented in a variety of forms depending on the medium and the purpose of usage, driven by the needs of users. These forms are often divorced from the original context of the content. Moreover, libraries will no longer be able to acquire and archive the various expressions of newspaper content across different media and customised channels of delivery. Since libraries can no longer adequately capture the forms, they must now turn to ways of handling the content. There are two feasible ways. One is the development of standard guidelines for content producers to participate in describing the content and the way the content is used. This must necessarily include content creators, publishers, database providers, content aggregators and resellers, and users in the social networking spaces. This will have to involve negotiation and collaboration amongst the stakeholders. Although this may look like a Herculean task, some effort in this direction has already started. For example, a working group of key players in the audio-visual market has developed a practical framework to capture and promote good practice and aims in the provision of content information. Content providers are to offer content information to empower users and allow them to make informed choices about the content they and their families access and consume. This also includes a commitment to promoting and enabling media literacy, as reflected in their approaches to providing content.
It also requires providers to employ editorial policies that reflect the context in which their content is produced and delivered, to present the information in a way that is easy to use and understand, and to give adequate information to enable users to make informed choices (Audiovisual content information: good practice principles, 2008). Although this is still at a preliminary stage of collaboration, it could over time involve media communities and cover other materials too. Another approach is to develop new information literacy standards for libraries, to help them develop appropriate programmes that educate library users in the use and management of news content in the new media environment. Information literacy programmes in libraries are often designed to facilitate information retrieval in the context of an information search strategy. Information literacy is often based on the “Big Six Skills Approach” developed by Eisenberg and Berkowitz. Information literacy as defined by the authors comprises six steps in a “hierarchy of skills” which represent a linear information seeking process. Yet information representation in the new media environment is a dynamic process which is closely tied to the social, political and ideological construction of knowledge. According to Allan Luke and Cushla Kapitzke, the new electronic communications and information technologies represent a radical shift in the way that knowledge is sought and taught, applied and re/produced. They argue that “as a consequence, and because of the rhizomatic character of knowledge and related power relationship, the internet is a medium that enables a great deal of agency and free play amongst users.
This agency entails both new capacities to juxtapose, to ignore, to elide, to silence and to critique information that doesn’t appear to be relevant or valuable or interesting - but as well new capacities to produce, change, alter, relocate and transform these messages” (Luke & Kapitzke, 1999, p. 480). These are new skills, which need to be incorporated into the new information literacy programme.

Luke and Kapitzke further argue that we cannot avoid the central questions facing students, teachers and librarians about the “social construction of and cultural authority of knowledge; the political economies of knowledge ownership and control; the development of local communities and cultures capacities to critique and construct knowledge” (Luke & Kapitzke, 1999, pp. 483-484). They also criticise the American Library Association’s (ALA) “Information Literacy Standards for Student Learning”, which are standards to define “best practice” for librarians and teachers, as inadequate because the learner is conceptualised as the passive recipient of information. No mention is made of students as active agents in the production of knowledge. They further argue that the ALA assumption that a student “who contributes positively to the learning community and to society is information literate and recognises the importance of information to a democratic society” is meaningless, if not dangerous, without explicit recognition of the social, political and ideological construction of knowledge (Luke & Kapitzke, 1999, pp. 480-481). They then propose the need for critical information literacy training that begins with three core assumptions:

• that the texts and knowledge of the new technologies are potentially powerful sources for shaping students’ beliefs, practices and identities, and indeed that students will require critical perspectives and strategies for repositioning themselves in relation to these texts and knowledge;

• these texts and knowledge are not pre-existing, waiting to be discovered and documented through library work. Rather, they can be co-constructed by the student in a mediated dialogue with other times and spaces, texts and identities, both real and virtual. In this way, libraries can be sites where students can use these same technologies to actively and critically construct, shape and negotiate knowledge, practices and identities; and

• in so doing, a critical information literacy can encourage and enable learners to systematically reposition themselves in relation to dominant and non-dominant modes and sources of information. (Luke & Kapitzke, 1999, pp. 486-487)

These assumptions could form the basis for constructing critical information or media literacy programmes, which can be presented to librarians, educators and information service providers who are concerned that users of media information should have the capacity to use information technology and information content in a communicative and active manner that enables them to critique and create knowledge. In order to develop these new critical literacy programmes, librarians must become critical commentators, mediators and mentors, and even nomadic intellectuals and cultural guides, rather than the traditional archivists and monitors.

8 Conclusion

The newspaper is undergoing a major shift in terms of its role and content. Its development is shaped by the changing media landscape, the lifestyle and culture of consumers, the interests of the businesses it serves, and the response of newspaper companies to these changes as they seek new value propositions.

Libraries can no longer manage newspaper resources as repositories that are centrally located and categorised with pre-determined metadata. Users will draw newspaper resources from a variety of sources, together with tools which allow them to juxtapose, produce, change, alter, relocate and transform these resources and in turn archive them back as new resources. Users have the tools to create just-in-time databases using new data mining tools with user- and community-generated indexes for specific tasks in specific problem situations. There is scope for all the stakeholders and users of newspaper content, both content producers and consumers, to collaborate in developing new principles and guidelines for providing content information, which would enable users to make informed choices when using information content. However, the most urgent step is to look into ways of developing the capacity of users to use the media and its resources critically. There is an urgent need to review the concept of information literacy, its standards and programmes. Our information literacy programmes, which are based on the traditional notion of users as passive recipients of information, searching for and retrieving resources as part of a linear information process, need major revision. A new critical literacy must be based on new sets of assumptions, which regard users’ interaction with information and knowledge as a dynamic process.

References

Audiovisual content information: good practice principles. (2008). Retrieved January 15, 2008 from http://www.audiovisualcontent.org/audiovisualcontent.pdf
Envisioning the newspaper 2020. Shaping the Future of the Newspaper, Strategy Report, 7(1), 2007.
Gray, S. T. (2008). Newspaper next 2.0: making the leap beyond ‘newspaper companies’. American Press Institute.
Hirschman, D. (2008). Hyper-local 2.0: forging deeper audience connections. INMA Inc.
Luke, A. & Kapitzke, C. (1999). Literacies and libraries: archives and cybraries. Curriculum Studies, 7(3), 467-491.
McMane, A. (2007). Introduction. In Engaging Young Readers. Shaping the Future of the Newspaper, Strategy Report, 7(2), 7-14.
Morgan, D. (2006). The end of advertising and media as we have known it. Retrieved January 15, 2008 from http://adage.com
The Library of Congress. (n.d.). Preserving newspapers. Retrieved January 15, 2008 from http://www.loc.gov/preserv/care/newspap.html

THE AUSTRALIAN NEWSPAPER PLAN (ANPLAN)

Pam Gatenby
National Library of Australia

Abstract

The Australian Newspaper Plan (ANPlan) (http://www.nla.gov.au/anplan/) is a collaborative undertaking managed by the National Library of Australia and involving the Australian State and Territory libraries. ANPlan aims to preserve all newspapers published in Australia and to provide public access to these important chronicles of Australia’s past. Since the collaboration was established in the early 1990s, considerable progress has been made with developing agreed standards and strategies, with microfilming titles, and with tackling the many issues involved in a national undertaking of this kind. This paper will outline the ANPlan achievements and future work program, including actions to address the impact of digital technologies on newspaper publishing and preservation. It will cover the outcomes of a workshop held in 2007 to discuss newspapers in the digital age and describe the National Library’s Australian Newspapers Digitisation Program (http://www.nla.gov.au/ndp/), a major undertaking to digitise out-of-copyright issues of major capital city dailies from each Australian State.

1 Background to newspaper publishing in Australia

The history of newspapers in Australia reflects the history and development of the country, from its colonial origins to its current day membership of the global community. Australia’s history is short compared to most countries so it is not unreasonable for Australian libraries to aim to preserve a comprehensive record of newspaper publishing from the time of white settlement of the country to the current day. Early Australian newspapers are among the few remaining resources that provide contemporary accounts of how the colonies were governed and of key historic events that shaped the nation such as the first encounters with Indigenous Australians, land settlement and the discovery of gold. They reflect the concerns and circumstances of our ancestors and are heavily used in most Australian research libraries to support historic enquiry.

Australia’s first newspaper, the Sydney Gazette and New South Wales Advertiser, was published on 5 March 1803. It was a government gazette published by authority of the Governor of New South Wales. It distributed official announcements, shipping news, excerpts from foreign newspapers, and local social news. In each of the other Australian colonies, the first publication was also a government gazette.

By the end of the 19th century several metropolitan, provincial and suburban newspapers were being published and weeklies were starting to appear. These played an important role in bridging the distance between city and country and in fostering Australian creative writing at a time when the book publishing industry was still in its infancy. The most famous weekly was the Bulletin, established in 1880, which nurtured a distinctive, radical national literature. (Unfortunately the Bulletin ceased publication this year due to declining sales.)

Gatenby: The Australian Newspaper Plan

Many of today’s main newspaper titles can trace their origins to publications from the colonial era; for instance, Australia’s longest running title, the Sydney Morning Herald, was first published as the Sydney Herald in 1831. However, the history of newspaper publishing in Australia is marked by competition, mergers and “takeovers” which many titles have not survived. Australia now has the most concentrated print media ownership in the western world. The Australian Press Council, in its 2006 report State of the News Print Media in Australia (http://www.presscouncil.org.au/snpma/snpma_index.html), notes that in 1923 there were 26 metropolitan dailies owned by 21 proprietors and that by 1987 there were three major proprietors as well as a small number of independent publishers. Today, four companies own most of the newspapers in Australia, with News Ltd (the biggest) controlling 68 per cent of the market.

The Press Council’s 2007 supplement to its 2006 report (http://www.presscouncil.org.au/snpma/index_snpma2007.html) states that currently there are 52 daily English-language newspapers (two national, 13 metropolitan, 36 regional and one suburban) and eleven metropolitan Sunday newspapers published in Australia. In addition, hundreds of suburban and community newspapers are published at different intervals and many of these contain comprehensive local news and have substantial circulation. Besides the English-language press, there are more than one hundred newspapers in other languages, nine of them dailies.

As in many other countries, the main newspapers published in Australia now have web sites, many of them including multi-media and interactive features. Newspapers report that visits to their news related web sites are increasing at a fast rate.

The historic nature and characteristics of newspaper publishing in any country have implications for how libraries attempt to preserve their newspaper heritage. In Australia, we are taking a national, collaborative approach to addressing the issues through the Australian Newspaper Plan.

2 The Australian Newspaper Plan (ANPlan)

The Australian Newspaper Plan (ANPlan) was established in 1992, when it was called the National Plan for Australian Newspapers (or NPLAN). The name was changed to ANPlan in 2006. The idea for a national plan for newspapers came from the State Library of South Australia, one of the few Australian libraries to run its own microfilming unit. The initiative was taken up and endorsed by the Consortium of Australian State Libraries (CASL), which has also had a name change and is now known as NSLA, or the National and State Libraries of Australasia. (The role of NSLA is to provide a consultative and advocacy forum for state and public library services in Australia and to develop common policies and programs to advance these services.) The broad objective behind establishing a national plan was to coordinate activity at the national level in order to maximise the effectiveness of the limited resources available to preserve access to the country’s newspapers. The ANPlan was managed by the State Library of South Australia until 2001 when, following a review of the future directions of the program, CASL invited the National Library to take over responsibility for managing it.

ANPlan is a national, collaborative undertaking based on shared objectives, clearly defined responsibilities, and practical action. The National Library and each Australian State and Territory library are members of the consortium, and the National Library of New Zealand participates with observer status. Each partner library is responsible for collecting, preserving and providing access to each newspaper title published in its jurisdiction: more specifically, for ensuring that at least one hardcopy of every newspaper published in its jurisdiction is retained in its collection for as long as possible and that a surrogate copy of every title is made to facilitate long-term public access at the national level.

3 Collecting

In the area of collecting, partners are required to:

- collect hardcopies of all newspapers from their area of responsibility as published; and
- identify, locate and collect missing titles and issues.

All member libraries collect current print newspaper titles under legal deposit provisions and have acquisition programs in place to fill important gaps in their collections. Repatriation of missing titles to the library with primary responsibility for their preservation is one way in which gaps can be filled. However, even though repatriation is an agreed strategy that underpins ANPlan objectives, it is unpopular with some partners and can be difficult to manage. Libraries and their users can be reluctant to give up materials from their collections if they are used, and the resources required to de-accession material can be a deterrent. A recent initiative to fill gaps in newspaper collections is a search and rescue campaign which is using specially designed publicity material to draw media and public attention to the search for particular missing titles. Each library will run its own campaign and focus on its priority missing titles, but all will use the same publicity materials to give the campaign a national identity. The campaign started a couple of weeks ago and has already attracted considerable media interest and triggered a pleasing level of interest from the public.

4 Preservation

The second area of responsibility for ANPlan partners is preservation. Partners are required to:

- retain as long as possible one hardcopy of every newspaper published in their jurisdiction;
- create or purchase an archival standard master reproduction and at least one working copy reproduction of every title; and
- provide appropriate housing and management of all copies of every title.

The preservation strategy followed by ANPlan involves microfilming to archival standards and providing appropriate housing and treatment of the original print titles and the different generations of microfilm, to optimise their life expectancy. To help libraries meet microfilming standards, in 1998 the National Library published Guidelines for Preservation Microfilming in Australia and New Zealand. The standards are also available on our web site in the policy document Retention and preservation of Australian newspapers (http://www.nla.gov.au/policy/revnpan.html#microfilming). This document covers standards for preservation storage of print and microfilm versions of newspapers, as well as the microfilming standards that should be followed. (These are based on current versions of recognised international standards.)

To support the ANPlan preservation strategy the National Library provides members who require it with free cold storage of archival standard preservation masters, based on a Deed of Agreement. The Library also provides funding support to member libraries to microfilm titles in their collections. A submission process is used to allocate funds, with priority given to “at risk” titles. Since 2002, we have provided around $1.2 million for this purpose. Additional funding comes from the member libraries’ own budgets and occasionally from government grants.

Australian libraries have been microfilming newspaper titles for almost 50 years and it has long been considered a relatively easy means of capturing a reliable copy that can be managed for the very long term and facilitate access. A couple of libraries have their own microfilming facilities and there are a small number of commercial agencies providing copying services to the others. However, two recent developments have led to a serious challenge to the previously pre-eminent preservation role of microfilming with regard to newspapers, namely:

- the development of increasingly sophisticated digital technology for capturing, organising and presenting content; and
- the increasing costs and difficulties of relying on microfilm as an adequate preservation and access path for newspapers: older film does not comply with current standards, it is expensive to store adequately, it is unpopular with users, and the equipment needed for access is likely to become hard to maintain and acquire.

These developments, as well as the increasing availability of newspaper titles in online form, were the catalyst for ANPlan to organise a workshop to explore the implications of digital technologies for newspaper preservation.

5 Access

The last area of responsibility for ANPlan partners relates to access. Partners are required to:

- catalogue all print and microfilm holdings of newspapers into the Australian National Bibliographic Database (ANBD) on Libraries Australia; and
- provide easy access pathways to the content of each title.

At the heart of the ANPlan is the belief that libraries should make it easy for people to locate and obtain access to their cultural heritage, including newspapers. Australian libraries have a long tradition of collaboration in resource discovery with the most obvious manifestation being Libraries Australia (http://librariesaustralia.nla.gov.au/apps/kss).

Libraries Australia is an online resource discovery service managed by the National Library which provides access to the holdings of around 900 Australian libraries through the Australian National Bibliographic Database (ANBD), as well as access to a range of other Australian and overseas databases. The public can search Libraries Australia free of charge through an easy, Google-like interface, and they can then link to easy “getting” options via copying and document supply services when full text is not available online. Several formats of material, including newspapers, can be searched separately or as part of an integrated search across many material formats. There are currently around 11,200 records for Australian newspapers recorded on Libraries Australia; around 4,600 of these are for microform versions and 500 for online versions. However, it is not known how many might be duplicate records.

By describing the various versions of newspaper holdings held by partner institutions and information about the extent of issues held, the Libraries Australia database serves as a national register of holdings: print, microfilmed and digital. However, this is a complex area of bibliographic control and the extent to which the Libraries Australia database represents the newspaper situation in Australia is unknown, mainly because it is hard to estimate the number of titles ever published in the country. Also, in many cases it is not known who made and holds master copies of microfilmed titles, so this information is missing; holdings information on records is not always up to date; and bibliographic records can be difficult to decipher. Standardised approaches to describing the different formats of newspaper titles have not always been followed: for instance, sometimes different records have been created for the different versions, but at other times all information has been recorded in the record for the original print version.

To address the problem of inconsistent bibliographic control of the different versions of newspaper titles, partners recently developed guidelines with the aim of making it easier to understand records in Libraries Australia and to record information about the different generations of microfilm and intentions to do preservation microfilming. Another way of improving access to newspapers is, of course, to digitise them.

6 ANPlan operations

To coordinate ANPlan activities and to provide a national focus, the National Library contributes a dedicated part-time position to managing the Program and to maintaining the public website. The website provides information about the program and its objectives, the standards and guidelines followed, and new developments relating to newspaper preservation in Australia and overseas. A members’ space and discussion list are also maintained. ANPlan members meet by teleconference twice a year and we aim to have one face-to-face meeting as well. The meetings provide the opportunity to discuss issues and share information and also serve to build commitment to the shared work plan. Collaboration is based on a 5-Year Plan, which sets specific goals for the period, serves as a basis for internal and external funding and provides a framework for reporting progress. An annual report on progress against the Plan is provided to NSLA.

The specific goals of the current 2005-2010 Plan are to:

- acquire and preserve access to 51 missing titles that are considered nationally significant;
- microfilm 12 nationally significant “at risk” titles;
- re-film several titles to a quality which will support digitisation; and
- address a number of particular concerns that include checking the condition of masters, replacing acetate masters, improving storage arrangements for masters and for original newspapers, and reviewing the extent to which Libraries Australia is serving as a national register of Australian newspapers.

The 5-Year Plan also includes several actions that arose from the workshop to explore the impact of digital technologies mentioned earlier.

7 ANPlan achievements

So what has ANPlan achieved so far? During the 16 years that the Australian Newspaper Plan has been in place, considerable progress has been made towards protecting and making accessible Australia’s newspaper heritage. The coordinated, collaborative approach offered by ANPlan has been a great stimulus to sustaining progress against milestones, to developing national strategies aimed at eliminating duplication of effort, and to sharing solutions to common problems. However, it is very difficult to come up with reliable statistical estimates of progress at the national level against the core ANPlan objectives relating to collecting, access, and preservation. This is because some partner libraries do not have the relevant information available and it can be complex and resource intensive to compile. Nevertheless, some general statements can be made about the progress made as well as the issues still to be addressed.

With regard to microfilming, for instance, in June 2006 a national audit of progress with microfilming programs was undertaken by ANPlan in order to help shape planning in this area. It revealed that, while at the national level there is still some way to go in preserving our newspaper heritage, steady and impressive progress is being made, with four out of eight libraries indicating that they had filmed between 70 and 97% of their entire newspaper collection to preservation quality. Access to the country’s newspapers has also definitely improved over the last decade. All partner libraries now catalogue current print newspaper titles onto Libraries Australia and most have catalogued their complete collections online. Also, standards supporting the core ANPlan objectives have been agreed and are, by and large, being followed; we have a much better understanding of the state of newspapers at the national level; and strategies are in place to address issues. The issues that remain to be addressed as a priority are reflected in the 5-Year work plan.
To summarise, the key ones are as follows.

- There are many gaps in the record of microform copies of newspaper holdings in the register on Libraries Australia.
- The location, ownership and condition of some preservation masters are unknown as some key titles have been filmed over the years by commercial bureaux or by publishers.
- Some partners have identified quality control concerns with older microfilm that render it less than useful for preservation, access or digitisation purposes, and some have large collections of older film that still need to be quality assessed.
- All partners are aware of important titles that are missing from their collections.

As well as continuing to deal with a range of issues associated with pursuing microfilming as a preservation strategy over the years, ANPlan partners must now come to grips with the impact of digital technologies on newspaper preservation and access. As mentioned earlier, in June 2007 the National Library held a workshop to address the broad topic and to identify strategies for dealing with specific issues of more immediate concern to ANPlan. A catalyst for the meeting was a discussion paper prepared by Colin Webb, then Director of the National Library’s Preservation Branch, titled Roles of digitisation and microfilming in Newspaper preservation. This paper explored questions such as:

- the preservation potential of digital copies;
- the timeframe for considering microfilm a viable preservation medium; and
- the timeframe for digital copies to become the preferred preservation medium.

The paper reached the following conclusions. (i) Digital copies could serve perfectly well as preservation master copies of newspapers once some concerns are met – for instance, the cost of capturing a high level of fidelity, the cost of storing uncompressed files, and ability to commit to an appropriate digital preservation plan to meet long-term sustainability requirements. (ii) Microfilm is potentially subject to a number of factors which could seriously threaten its continued viability as a preservation medium but it is unclear when this might eventuate. The factors include for instance, withdrawal of suitable film stock and microfilming bureaux services from the market, withdrawal of industry support for access technologies, withdrawal of suitable microfilm storage facilities, and user rejection of the format. (iii) Australian libraries are likely to have a mix of microfilm and digital preservation approaches for the foreseeable future, depending on the situation (or “states of being”) of titles in their collections, their budgets and capacity to manage digital collections for the long-term. The paper raised a number of particular issues for ANPlan members which were the focus of discussion at the workshop. Most of the issues translated into actions for the 5-Year Plan which, once carried out, will enable ANPlan partners to take more informed steps towards shaping the future preservation strategy. The key actions are: -

- develop practical standards and guidelines for digital capture of hard-copy and microform copies of newspapers, and for management, storage and preservation of digital newspaper files;
- develop guidelines on how others can contribute content to the National Newspaper Digitisation Program (I’ll talk about this program in a minute);
- investigate the issues involved in collecting, preserving and providing access to online newspapers and pre-press electronic versions of newspapers;
- develop better understanding of the future viability of microfilming; and
- develop a costing framework for comparing newspaper microfilming and digitisation costs.

Good progress is being made with these actions; in fact the guidelines, except those for user contributions, are now available through the ANPlan website.

As with many other countries, Australia has commenced digitising its newspapers through a number of local projects as well as a major national undertaking, the Australian Newspapers Digitisation Program. This Program is managed by the National Library under the auspices of ANPlan and, like ANPlan, is a collaborative undertaking involving the Australian state and territory libraries. Information about the Program is available from the website (http://www.nla.gov.au/ndp/).

Planning for the Australian Newspapers Digitisation Program commenced in 2006. The Program is staffed with a mixture of permanent, fixed term and casual staff, with staff from our Information Technology area and two senior manager positions dedicated to it. The aim of the program is to build a database containing newspaper content from the first Australian newspaper issue in 1803 through to 1954, when copyright comes into effect. We have started with one major newspaper from each state and territory and intend to extend coverage in the future by the addition of regional newspapers. The aim is to develop one national access point for all digitised newspaper content.

During the first phase of the Newspaper Digitisation Program, which is now underway, we aim to create up to 4.4 million pages of digital newspaper content by June 2011. Titles included in this phase are The Sydney Gazette; The Sydney Morning Herald; The Maitland Mercury; The Argus; The Courier-Mail; The Hobart Town Gazette; The Advertiser; and The West Australian. The Program is funded by the National Library with the assistance of some external donations – for instance, we were very pleased to receive a grant of $1 million from the Vincent Fairfax Family Foundation to support the digitisation of The Sydney Morning Herald.
The actual digitisation of the newspaper titles is carried out under contract by two commercial companies. One company undertakes the initial stage of the digitisation which is to convert newspaper microfilm into digital page images. To date, over 800,000 page images have been created. The most complex stage, carried out by the second company, involves the conversion of the digital page images into text-searchable files through the use of Optical Character Recognition (OCR) technology and other processes including the “zoning” of the newspaper articles. (The zoning process involves defining the border of each article and creating links between parts of an article that may be separated.)

While progress with establishing the newspaper digitisation program has been slow and at times frustrating, significant progress has been made over the last year. Particular challenges we have encountered include microfilm quality, OCR accuracy, zoning and categorisation of text, and quality checking procedures. However, management and workflow procedures, including quality assessment, have been established and specifications for OCR requirements have now been finalised following several months of iterative testing and refinement using sample data.

To provide access to the full text of digitised newspaper articles, a search and delivery system is being developed at the National Library. The search service will be offered free of charge to the public. Development of the database search and delivery interface is well advanced and we hope to release it to the public in late 2008. The interface includes advanced search features such as relevance ranking, clustering of result sets by date span, geographic coverage, article category and size, and title of newspaper. In addition, related resources such as pictures and published works retrieved from other Library discovery services are presented.
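As a rough illustration of what relevance ranking combined with clustering of a result set might look like, the following Python sketch orders a handful of article hits by score and counts them by decade, newspaper title and category. It is not the National Library's implementation: the `ArticleHit` record, its field names, the `score` value and the grouping of dates into decades are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArticleHit:
    """One search hit (hypothetical record; all field names are assumptions)."""
    title: str         # newspaper title
    issue_date: date   # date of the issue containing the article
    category: str      # article category assigned during zoning
    word_count: int    # a simple proxy for article size
    score: float       # relevance score from some search engine


def decade_of(d: date) -> str:
    """Return the date span (here, a decade) that a date falls into."""
    start = d.year - d.year % 10
    return f"{start}-{start + 9}"


def cluster_hits(hits):
    """Rank hits by relevance and count them per facet value."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    facets = {
        "decade": Counter(decade_of(h.issue_date) for h in ranked),
        "title": Counter(h.title for h in ranked),
        "category": Counter(h.category for h in ranked),
    }
    return ranked, facets
```

A user interface could then show each facet value with its count beside the ranked list, so that a reader can narrow a search to, say, one title or one decade.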

8 Conclusion

Through the Australian Newspaper Plan, Australian libraries are working together to preserve and provide ongoing public access to their country’s newspaper heritage. While many issues remain to be tackled, significant progress has been made, with many “at risk” titles saved and a much better understanding of the state of control of the country’s newspapers established. By pursuing agreed standards and strategies over several years and working together to address issues, Australian libraries are now well placed to take on the challenges that digital technologies present to libraries with the responsibility for saving their newspaper heritage.

THE IMPORTANCE OF PARTNERSHIPS FOR NEWSPAPER PRESERVATION

Beth M. Robertson
State Library of South Australia

Abstract

The State Library of South Australia has a substantial record of achievement under The Australian Newspaper Plan (ANPlan) (http://www.nla.gov.au/anplan/), dating from our leadership of the original National Newspaper Plan Working Group formed in 1991. We maintain in-house preservation microfilming facilities and a team of qualified staff that produces consistently excellent results to international standards. Partnerships provide the foundation of all our achievements. The proposed paper will describe these partnerships (with the National Library of Australia, local governments, public libraries, newspaper publishers and the general public) and what has been achieved through them over the last 15 years. The paper will also outline a new kind of partnership that the State Library is exploring with local newspaper publishers that could see electronic pre-press versions of hard-copy newspapers begin to play a role in newspaper preservation.

1 Introduction

Four months is a long time in the life of a library in the digital era, and between writing the abstract for this conference last October and completing the paper in February much had changed at the State Library of South Australia. The focus of the paper remains the importance of partnerships for newspaper preservation but the scope has been expanded to encompass the 45-year history of microfilming newspapers at the Library, including the legacy of microfilm produced between 1960 and 1992 before international preservation standards were adopted.

The stimulus has been three-fold:

1. In 2007 the National and State Libraries of Australasia, the peak body for State and Territory libraries, endorsed The big bang: creating the new library universe. This document calls on members to fundamentally shift their libraries into the digital world. It challenges members to accelerate their digitising efforts and to achieve mass digitisation online through collaboration and resource sharing.
2. This has coincided with my work on a digitisation plan for the State Library that will set out the priorities, strategies and goals for digitising our South Australian collections for preservation and access over the next 10 years.
3. Over Christmas and the New Year Australia’s State Libraries had the opportunity to test the prototype of the National Library’s Australian Newspapers Digitisation Program website, which you have just heard about from Pam Gatenby.


These developments have highlighted the importance of reviewing the progress of our microfilming program to date:

o To determine what proportion of the film produced over 45 years meets the standards required for digitising and optical character recognition.
o To consider how we can maintain a balance between ongoing preservation needs and increasing access imperatives.
o And to reassess the potential of partnerships for newspaper preservation in the digital world.

What follows is an account of the State Library of South Australia’s newspaper preservation activities to date. I trust that a wide-ranging overview of this kind so early in the program will stimulate questions and comments that can be returned to at later stages of the conference.

2 South Australia’s newspaper heritage

South Australia was settled as a British colony in December 1836, but our first newspaper, the South Australian gazette and colonial register, had been printed six months earlier in London, shortly before its publishers sailed for South Australia. They took with them the equipment needed to set up a newspaper in the 'wilderness'. By 1846, just ten years after the Europeans' arrival, there were five newspapers serving the infant colony.1

South Australia is very large – close to one million square kilometres (380,000 square miles) – but it includes a great proportion of the most arid land in Australia. Our population has only recently reached 1.5 million, and it has always been concentrated on the areas of highest rainfall on the south-eastern coast and hinterland, with the remainder of the state much more sparsely settled around pastoral, mining and fishing industries.

Over the years 410 newspapers have been published in South Australia, of which 51 are currently in production.2 The State Library is responsible for preserving this cultural heritage and making it available to the public. South Australian newspapers currently comprise about 11 million pages and represent about 10% of the published and non-government South Australiana collections for which the State Library has preservation responsibility.

3 The first era of microfilming – 1960s

A distinctive feature of newspaper preservation in South Australia has been the very minor role played by commercial microfilming bureaus.

______________
1 From the SA Newspapers section of the State Library of South Australia’s website SA Memory: past and present for the future, www.samemory.sa.gov.au.
2 The figure of 410 was calculated by newspaper specialist Anthony Laube in January 2008; it excludes title changes and titles that have been catalogued as newspapers but do not meet the criteria of that format. It is subject to further refinement.


Reliance instead on in-house microfilming can be traced back to 1955 when Hedley Brideson took over as the State Library’s Principal Librarian.3 He was convinced that technology should play a central role in modern library practices. Newspapers were stored in old, unlined galvanised iron sheds behind the Library building where, Mr Brideson reported, ‘the irreplaceable volumes … are subject to the attack of damp in the winter, dust in the summer and vermin at all times.’4 Having determined that ‘Much of the State’s early history, recorded in the newspapers of the day, will be gone forever soon if it is not microfilmed’,5 Mr Brideson managed to secure an additional £5,500 from the State government for microfilming equipment on the undertaking that it would be used ‘not only by the [Library] but also [to] assist a number of government departments who will send material to be microfilmed by us’.6 I have not found any reference to the equipment being used in that way, but the intent stands as the first example of a partnership furthering the cause of newspaper preservation at the State Library.

Work was divided between historic and current newspapers, including the first 50 years of South Australia’s first newspaper the Register, the first 40 years of the tabloid daily The News and current issues of the broadsheet daily The Advertiser. The momentum of the program slowed during 1966 and 1967 while a new library building was built on the site of the old iron sheds. It did not recover during the remainder of Brideson’s tenure, but the newspapers were now more securely stored.

The standard of microfilm produced by the Library in the 1960s is aesthetically poor; most papers have been filmed in bound volumes with the background pages showing, and the thumbs and fingers of some operators have been captured for posterity. There are no targets on the reels to document technical standards, and it appears that little effort was made to locate missing pages or issues.
Nevertheless, on most of the reels that we have sampled to date the text is clearly legible and unobscured. In recent weeks we have discovered that the Library was being guided by the Association of Research Libraries’ Proposed standard for the microphotographic reproduction of newspapers, which had been drafted by a committee appointed in 1947 in advance of the development of any American Standards Association findings.7 The document includes some recommendations that the Library did not follow, such as routine disbinding of volumes, and includes some standards that are now unacceptable, such as allowing density levels as high as 2.5. It is silent on the matter of allowing foreign objects such as fingers and other implements to intrude into the filmed frame. On the whole, the document certainly helps us to better understand the microfilm we have inherited from the 1960s.

At the same time, The Advertiser’s publisher was paying a commercial bureau to microfilm their newspaper from its inception in 1858 to 1959 with much poorer results. The masters were subsequently badly damaged by being used for reference purposes by Advertiser staff, and over the years researchers have become more and more frustrated by the available access films.

______________
3 Hedley Brideson began his career at the State Library in 1927 as a cadet. He was Principal/State Librarian from 1955 until his retirement in 1970.
4 Annual Report of the Libraries Board of South Australia [hereafter Annual Report] 1960-1961, p.5.
5 Sunday Mail, 5 May 1956, p.20.
6 Public Library of South Australia Librarian’s Report, 21 October 1957.
7 An undated two page typescript titled ‘Outline account of the Public Library of South Australia’s microfilming equipment and technical procedures’ states that ‘The microfilming done by the Public Library of South Australia will meet the requirements of the Standard for Microphotographic Reproduction of Newspapers, which was proposed by a special committee of the Association of Research Libraries.’ A typescript of the latter is also held by the State Library.

4 The second era – 1970s

While the rate of microfilming slowed in the late 1960s the public’s use of newspapers in the Library increased dramatically.8 The general public was becoming interested in family history and local history, and South Australian history was starting to be taught in schools and at university. This increased use was coupled with a ‘marked and rapid deterioration in the condition of many of the newspapers’.9

The man monitoring these trends was newspaper librarian Len Marquis, a truly eccentric character who, between 1962 and 1980, became expert on South Australian newspapers and a tremendous help to a new generation of historians. He was particularly concerned about the growing pressure on historic country newspapers. This prompted the Library to take advantage of a government unemployment relief scheme in 1972, and two men were employed to prepare selected country papers for microfilming.10 Two years later a benefactor donated money to the Library that allowed filming on these titles to proceed.11

In 1979 the Library began to form the kind of partnerships that would become essential to support the preservation of historic newspapers. According to the Annual Report, ‘an approach was made to the Provincial Press Association of South Australia, seeking financial help towards microfilming some of the old files of country newspapers. The Association’s members generously agreed to provide 35 per cent of the cost of microfilming.’12 The same year a country school library ‘raised enough money by an appeal to the residents of their area’ to employ a person to prepare one hundred years of their local newspaper for microfilming.13

The standard of microfilm produced by the Library in the 1970s is aesthetically improved and generally of a high standard. There are still no technical targets, but an older staff member recalls that quality assurance was carried out and reels were re-filmed if missed pages were detected. Also, methylene blue tests were carried out by another government department.14

______________
8 Annual Report, 1971-1972, pp.10-11.
9 Annual Report, 1970-1971, p.13.
10 Annual Report, 1972-1973, p.13.
11 Annual Report, 1974-1975, p.20. The benefactor was Miss Mabel Somerville. It was the first of a series of gifts that were followed by a bequest.
12 Annual Report, 1979-1980, p.7.
13 Ibid. This was the Burra Community School; the newspapers were the Burra News and the Burra Record.

5 The third era – 1980s

In the 1980s newspaper preservation was given a new boost by preparations for South Australia’s 150th anniversary of colonisation. In 1980 the Library set ‘the goal of microfilming all of South Australia’s newspapers from 1836 to 1950’ 15 in time for the State’s Jubilee 150 in 1986. The microfilming project was widely publicised and many newspaper publishers provided both funding and in-kind assistance. The in-kind assistance included lending the Library sets of newspapers and assigning their own staff to prepare the papers for microfilming. The local German Association also helped by ‘providing German-speaking workers to prepare South Australia’s early German language newspapers’.16 Local government councils, businesses and individuals also sponsored the project. Thirty per cent of the costs of the project came from these partnerships. The balance was from the Library’s budget and a substantial bequest managed by the Libraries Board.17

The Library did not achieve the project’s goal, but the output of the previous 20 years was probably doubled in this period. New equipment was installed and efforts were made to apply recognised microfilming standards to the Library’s procedures. Technical targets began to appear at the beginning of reels and most titles were filmed one page per frame. The Library now had a substantial catalogue of newspaper microfilms. Publishers readily authorised the sale of their titles, and the sales program began to contribute significantly to funding the Library’s microfilming program.

6 The fourth era – 1990s and the new Millennium

The rate of filming both historic and current newspapers fell again after 1986. Some partnership commitments still had to be fulfilled and microfilm operators now had duties in other areas of the Library that halved the time spent on preparing and filming newspapers. But while the Library was struggling with present circumstances, it was much involved in planning for the future: Associate Director Liz Ho was the first chair of the National Newspaper Plan Working Group formed in 1991; Heather Brown, known to many IFLA members, was ensuring that Library microfilming practices met international standards; and Mr Marquis’s first-rate successor, Anthony Laube, appointed in 1987, was determining microfilming priorities for historic newspapers, based in part on the continuing deterioration of the condition of so many titles.

______________
14 Information from Daniel Planquart, 2008.
15 Annual Report, 1980-1981, p.9.
16 Ibid.
17 The JT Mortlock bequest.


When the National Library of Australia began offering grants to National Newspaper Plan partners in 1994, the State Library was ready. During the next ten years the Library received over $140,000 from the National Library and filmed 25 titles comprising almost 300 reels. A photocopy of the first cheque received from the National Library is still on file! By this time, The Advertiser’s publisher was also paying the Library to microfilm current issues of that title and its Sunday tabloid stable mate, an arrangement that continues today.

By the late 1990s community expectations about the Library’s capacity to film both historic and current newspapers were very high, although the Library’s annual budget was decreasing in real terms. The Library was frequently asked why particular historic titles weren’t being filmed, and customers of current titles began complaining that they had to wait about three years to buy their copies. In response, we thoroughly reviewed our procedures and determined a way forward that has been ultimately successful:

o Microfilming staff were relieved of other duties and refocused on preparation, filming and quality assurance.
o The Library’s annual microfilming budget was dedicated to current newspapers. This strategy is referred to as ‘capping the backlog’, so that it is confined to historic newspapers.
o We tried outsourcing some of the current titles to commercial bureaus, as a way of assessing the cost effectiveness of our in-house program. However, the level of problems identified through quality assurance was unacceptable, and the Library renewed its commitment to in-house filming.18
o We began asking local organisations to share the responsibility for preserving newspapers most relevant to their areas, and we began charging partners the full cost of preservation microfilming, including preparation and quality assurance. This means that our costs are higher than commercial bureaus, but the strategy is usually successful.
The Library’s way forward was further refined in 2004 when the National Library announced the Australian Newspapers Digitisation Program and asked each State Library to select a major daily newspaper to feature in the development of the program. South Australia’s longest running newspaper The Advertiser was the obvious choice, but the microfilm produced by a bureau in the 1960s is too poor for digitisation. Instead, the National Library has supported the re-filming of The Advertiser through the annual grants program since 2004. We have received over $370,000 to date and there are now only nine years left to film, which we hope to complete next year.

The final element in our current strategies is risk management. The legacy of microfilm produced between 1960 and 1992 was on unstable cellulose acetate film stock. Since 2003 the Libraries Board has allocated $168,000 from bequest funds to copy all acetate reels to polyester film. This copying process has enabled us to reduce the technical deficiencies of the original films in two ways. When possible, we have improved the density of the image and the contrast between the text and the page, thereby enhancing legibility and the potential for successful optical character recognition. Also, polyester masters are eligible for cold storage at the National Library, which maximises their long-term stability.

The impact of partnerships in the last 15 years has been profound. The Library has gained almost $1.2 million 19 to devote to newspaper preservation. This level of funding has enabled us to triple the size of our microfilming team. While this must be on a contract basis because partners can rarely commit funding for more than one year at a time, we have managed to train and retain qualified staff.20 During the last year this has meant that we have been able to take on filming work for other State Libraries while the largest commercial bureau has been working to capacity scanning microfilm for the Australian Newspapers Digitisation Program.

______________
18 Report to Associate Director, State Library of South Australia by Manager, Image Centre, 2 August 1997.

7 The digital future

What is the future of partnerships for preserving historic newspapers in the digital world? We know that the expectations of most of our partners and the public are changing. There is the assumption that digitisation has replaced the need for microfilming, as well as the assumption that high resolution digitisation and online delivery are a quick and cheap solution.

To date, we have found partners receptive to explanations about the real costs of digitisation and about the complementary roles of microfilm and digitisation. Recently, a new councillor from a local government argued that their money should no longer be wasted on microfilming. Yet a long-standing partnership had reduced a 42-year gap in their regional newspaper’s microfilm to just five years. Learning that high quality microfilm is an excellent platform for digitising newspapers, and that the Australian Newspapers Digitisation Program is underway, the council agreed to fund the end of the microfilming program.

A country library has funded the digitisation of their regional newspaper from 1866 to 1974,21 using microfilm produced through a partnership in the 1980s. We have coordinated the project, outsourcing the scanning of the microfilm but maintaining responsibility for quality assurance, post-optimisation, and storage of the master files. The local library soon appreciated the magnitude of electronic storage required to manage files of 22,000 pages. They provide public access from a series of 20 DVD-ROMs. In recent weeks we have begun a trial in which the State Library is converting the scanned pages to PDF in-house and applying optical character recognition without any correction process. The country library’s users are providing us with feedback on the usefulness of this relatively cheap enhancement.
One of the most important questions in my mind is whether the Australian Newspapers Digitisation Program will provide an affordable model for future partnerships, so that local government councils and other organisations that have supported newspaper preservation in the past can eventually share the results online.

______________
19 At February 2008 exchange rates $1.2m Australian dollars is equivalent to $1.53m Singapore dollars, $1.08m US dollars, 740,000 Euro and £550,000.
20 Since 2000 the State Library has supported staff undertaking the nationally accredited Certificate IV in Preservation Microfilming course. The Library partnered with the Adelaide Institute of TAFE (Technical and Further Education) in the late 1990s to develop the course.
21 The titles are Southern Argus and Victor Harbor Times.


Another important question is how the changes in newspaper production of the last 10 years can play a role in the preservation of contemporary hard copy titles. Last year the State Library surveyed 43 newspaper publishers in South Australia about their electronic pre-press production and post-press distribution methods, and whether they are archiving the files. We had already been corresponding with the publisher of the Port Lincoln Times about these subjects, and using his name in the covering letter probably encouraged other publishers to participate.

Of 23 replies, 17 publishers are archiving electronic versions of their newspapers, the earliest since 2001. The average file size per issue is about 50 megabytes, and the publishers’ storage media are CDs, DVDs and external hard drives. Twelve publishers expressed willingness to supply the State Library with their electronic archives immediately. The implications for storage, preservation and public access are complex, but we hope to begin a trial with the Port Lincoln Times later this year. I am looking forward to several papers in the program that address this new kind of partnership with publishers.
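To give a feel for the scale implied by the survey figures, the short Python sketch below works out the response rate, the share of respondents who archive their files, and a rough yearly storage estimate. Only the counts (43, 23, 17, 12) and the 50 megabyte average come from the survey; the weekly publication frequency (52 issues a year) and the use of decimal gigabytes are assumptions made purely for illustration.

```python
# Figures reported from the State Library's survey of publishers.
SURVEYED = 43        # publishers surveyed
REPLIES = 23         # publishers who replied
ARCHIVING = 17       # respondents archiving electronic versions
WILLING = 12         # respondents willing to supply archives immediately
AVG_ISSUE_MB = 50    # average file size per issue, in megabytes


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def yearly_storage_gb(issues_per_year: int, titles: int = 1,
                      mb_per_issue: float = AVG_ISSUE_MB) -> float:
    """Estimate storage per year in gigabytes (assuming 1 GB = 1000 MB)."""
    return issues_per_year * titles * mb_per_issue / 1000


response_rate = share(REPLIES, SURVEYED)      # replies as % of those surveyed
archiving_rate = share(ARCHIVING, REPLIES)    # archiving as % of replies

# Assumption: each title appears weekly, i.e. 52 issues a year.
one_title_gb = yearly_storage_gb(52)
willing_titles_gb = yearly_storage_gb(52, titles=WILLING)
```

Under these assumptions a single weekly title would add a few gigabytes a year, so the raw volume is modest; the harder questions are the ones raised above about preservation and public access.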

8 Conclusion – an exercise in optimism?

In the late 1990s when the Library was struggling to maintain momentum in its microfilming program Associate Director Liz Ho made the ironic comment that continuing to identify newspapers that needed to be microfilmed as a priority was ‘an exercise in optimism.’22 In fact the analysis undertaken for this paper reveals that optimism is not inappropriate. It is difficult to gauge the dimensions of a large newspaper collection, but our calculations and samples to date indicate that we have microfilmed about 60% of our collection, and that the majority of microfilm is of preservation standard and suitable for digitisation. This has been an exciting finding, as our assumption has been that most of the legacy microfilm produced between 1960 and 1992 would need to be recreated. We still have a big task ahead, but the advent of the Australian Newspapers Digitisation Program and our own digitisation planning has given us the opportunity to measure our preservation microfilming in a new way. We will not compromise preservation standards in the future, but decisions about whether an entire title, or a span of years, or only an issue or a single page needs recapturing will be determined by whether or not the existing film can be digitised and OCR’ed successfully. While making these determinations will be time consuming, it should be more cost effective than preparing and re-filming a title again from first principles. And whether or not digital technologies replace microfilm for newspaper preservation, I can see established partnerships energised and potential for new partnerships in being able to promote the fact that the State Library is more than half way there. Of course it is still important to remind everyone we can that a newspaper cannot be preserved if it is not rescued in time.

______________ 22 Email to Director, State Library of South Australia from Associate Director, 15 April 1997.

DIGITISING HISTORIC NEWSPAPERS IN GERMANY – THE CASE OF BAVARIA

Dr. Klaus Ceynowa
Bavarian State Library, Munich, Germany

[PowerPoint presentations are not ideal summaries of conference contributions. In this case, however, the “red thread” that connects the factual information shows clearly, and above all, Germany has so far been more an onlooker than a participant in the active development of newspaper librarianship. Now, the current project of the Bavarian State Library looks like the long-expected breakthrough, and thus this presentation will serve as a welcome source of information. Ed.]


19TH CENTURY BRITISH LIBRARY NEWSPAPERS: UTILISING THE ONLINE DATABASE

Ed King
The British Library

Abstract

Many runs of older newspapers are being converted into digital formats, and made available online. The British Library (BL) has one of the world’s finest collections of newspapers. Legal deposit for newspapers published in the UK has ensured the systematic deposit of newspapers since the 1840s. In recent years, the BL has been planning the digitisation of its older newspaper collections, to bring the texts to a wider audience. The paper will look at the results of the BL’s work in converting 2 million pages of its older 19th century UK newspapers into digital format. The paper will explore briefly issues surrounding the conversion of the texts, and then explore how researchers may utilise the resource now available.

1 Introduction

The British Library has been active in the field of digitisation of older newspapers for several years. 1 Previous papers given at IFLA conferences have outlined the work of the Library with regard to the digitisation of nineteenth century newspapers. 2 3 This work has included the digitisation of one million pages of 18th century newspapers (The Burney Collection); however, that project will not be described further here. The purpose of this paper is to take a brief look at the aims of the British Newspapers 1800-1900 project. The results of this project have now been made publicly available online under the title 19th Century British Library Newspapers. Much of this paper will describe the actual results of the project planning and execution in the years 2004-2007, and how the plans that were made have reached fruition in the online environment.

It can be difficult to grasp the scale of newspaper publishing in the United Kingdom in the 19th century. Taken as a whole, the huge and diverse production of newspapers since 1700 provides an enormous resource for research on all subjects for all of the UK, both urban and rural. For those libraries that have collected newspapers (particularly national libraries), the need to provide ready access to newspaper texts has posed a dilemma, given the often poor quality of the paper the newspaper was printed on. The need to prevent undue wear and tear upon the paper has provided impetus to copying texts, to allow continuous public access. It is in this context that the British Library has pushed forward with its programme for digitisation of these older newspapers.

______________
1 For the story of the British Library’s steps towards mass digitisation, see: Digitisation of Newspapers at the British Library. The Serials Librarian, Vol. 49 (1/2) 2005, pp. 165-181.
2 IFLA Oslo, 2005, 10 Billion Words: The British Library British Newspapers 1800-1900 Project. Some guidelines for large-scale newspaper digitisation. See: http://www.ifla.org/IV/ifla71/papers/154eShaw.pdf
3 International Conference on Newspapers Collection Management: Printed and Digital Challenges, Santiago, Chile, April 3-5, 2007. E.M.B. King. Digital Historic Newspapers online: prospects and challenges. Paper to be published.

2 British Newspapers 1800-1900 Project

Early in 2004, the British Library secured funding of £2 million from JISC. 4 Under the Digitisation Programme, JISC enabled a small number of large-scale digitisation projects that would bring significant benefits to UK Further and Higher Education communities, one of which is the British Newspapers 1800-1900 (BN) project. 5 The overall goal of the project is to provide a mass of historic newspaper content on the web for full-text searching by UK academic and further education communities. The main aims were:

• to digitise up to 2 million pages of out-of-copyright UK printed material
• to select UK regional, local and London newspapers
• to digitise the majority of newspapers from new microfilm
• to offer access to this collection via a sophisticated searching and browsing interface on the web.

Selection issues

The original business plan included a preliminary list of at least 160 titles, split into London national dailies and weeklies; English regional dailies and weeklies; Home Countries newspapers (Scottish national, Scottish regional, Welsh, Northern Irish); and ‘specialist sub-clusters’. A User Panel of Experts, working with the Project staff, identified 48 titles. 6 Surprisingly, very little information was available about how many pages there were in each title. In order to keep to the project schedule, a decision was made to start with a Pilot of a discrete specialist sub-cluster, such as the newspapers which chronicled the Chartist movement, followed by the first work batch, which included titles such as the Examiner, the Morning Chronicle and the Graphic. At the same time, an audit began of the pagination and condition of further likely candidates for selection from the preliminary list.

The project aimed to deliver the following: the microfilming of all the newspapers selected for digitisation; the scanning of all pages of the entire microfilmed content; article zoning for the texts on each page; OCR of the article images; and the production of the required metadata. Online searching was planned to cover names and dates, obituaries, advertisements, regional and local perspectives, and national news.

Throughout 2007, the BL worked with its selected partner, Gale Cengage, to bring the content created into the online environment. The launch of the database to the UK Higher Education Community took place in London in October 2007. Access to the files is free within BL reading rooms, and to any UK Higher Education organisation that signs a licence with Gale Cengage. It is planned to launch a wider service during 2008, which will enable non-UK organisations and individuals anywhere to secure access to the files on payment of a subscription or a fee.

______________
4 The JISC agreed to support the project from April 2004 to December 2006 at a total cost of £2,022,131. See: http://www.jisc.ac.uk
5 For details of the project online, see: http://www.jisc.ac.uk/whatwedo/programmes/programme_digitisation/digitisation_bln.aspx
6 A full list of the titles selected is at: http://www.bl.uk/collections/britishnewspapers1800to1900list.html

Copyright

The Library policy has been to proceed with the agreement of rights holders and their representative bodies. In the case of newspapers, recent discussions with newspaper publishers and updated legal advice to the Library mean that, for this project, the starting point has been that no newspaper less than 100 years old will be digitised for access by UK higher and further education organisations. To cover the situation more fully than in the early stages of the project, the BL carried out further checks relating to newspaper owners in late 2007 and early 2008. The purpose is to establish copyright as precisely as possible. In the event of any claim of infringement of copyright, the BL and Gale are prepared to consider the take down of pages. Whether this happens is likely to depend upon the nature of the infringement claimed by those who say they are the current copyright holders.

The British Newspapers 1800-1900 Project aimed for users to have the following features in the online environment:

• full text searching of 2 million pages of newspaper texts
• searches across one, several, or all newspaper titles to examine any aspect of research possible (e.g. changes in opinion, names history, design and publishing history)
• the ability to search individual newspapers by date
• browsing forwards and backwards through a selected issue
• images of the original pages to read in the usual way
• like-for-like comparisons of the same subject’s treatment by different titles
• the possibility to save and build searches, to aid in course teaching and collaborative working
• display of the results of searches at the article level within the context of the original page
• the ability to search advertisements, obituaries etc.
• the ability to download text versions of the original pages
• an anticipated 80% accuracy on the OCR conversion

3 How each of the objectives works in practice: Using 19th Century British Library Newspapers

To see how the early objectives have been realised, it is useful to understand something of how Gale structure their interface with each user. When one first enters the site, the introductory screen is displayed. This offers users a choice: one can go straight into searching the database, or one can click on the “About” button, which offers background information relating to the creation of the database. Next one can click on the “Topic Guide” button, where the user is offered a great deal of background information on newspapers and topics. There are essays on the history of British newspapers 1800-1900. These amplify the nature of newspaper developments throughout the 19th century, and provide points of reference regarding the significance of the newspapers that have been digitised. It is important that users understand that, significant as this database is, it is only a fraction of what was published in the UK in the 19th century. In the “Topic Guide” section, there are also essays on topics which provide a socio-economic context to 19th century studies, and the role of newspapers within these developments. Examples are essays such as Fact, Fiction and Fun, or The Crystal Palace; one can also view one of a number of biographies, such as that of William Cobbett, a radical thinker, pamphleteer, and publisher of the Weekly Political Register in the early part of the 19th century. All this information allows those less familiar with the 19th century context to read and learn before launching into detailed searches; without it, they might not readily understand the nature or the significance of the results returned. In addition to these general introductions, each newspaper has had a descriptive “Headnote” written for it.
This briefly gives an outline of the newspaper during the 19th century, and points about its significance, which could be: the newspaper’s political orientation; its publishers; its frequency of issue and size of issue; its circulation; the nature of the subjects covered. The format of UK regional newspapers was frequently similar. What sets them apart from each other is their coverage of local or regional news. Users can view these “Headnotes” by clicking on the “Publication Search” button, and then selecting one (or more) titles from the list presented.

So, when one arrives at the “home” page of the website, one can carry out a Basic Search. 7 If one simply keys in a word, and no particular newspaper title is selected, then the user is in effect searching the full 2 million pages of all of the newspapers. For example, searching on the word “Singapore” results in 82,077 hits, a huge number. The results are presented in ascending order by date, so a useful tip if one wants to move around this large body of results is to key in a new number in the results box near the top, such as 80,000. This takes the user to newspaper citations for the year 1899. Using the button at the right hand side on this page, it is also possible to sort results, for example:

- in descending date order
- in alphabetical order of newspaper title

It is sensible to refine such a large body of results in a couple of simple ways: by date, or by newspaper title, or by both. If one searches a London newspaper, The Morning Chronicle (the years 1800-1862 are digitised), with the word “Singapore” and no dates, one receives 3634 results. To refine this further one can revise the search, bringing it back to the screen, and add a date delimitation, such as the single year of 1860 for this one newspaper. This delivers 193 results. Unsurprisingly, there are many items relating to shipping. If one clicks on result 3, one then receives the article, with the word Singapore highlighted.

Another example of searching widely is for the word “shipbuilding” in the Glasgow Herald. (The run for the years 1800-1900 has been digitised.) Glasgow was a great centre for ships and their construction throughout the 19th century. 12,247 results are returned. Jumping to result 5000, and clicking on the article for “The War in Egypt”, brings up the word shipbuilding, which is printed within an article about the Institution of Mechanical Engineers.

______________
7 These searches were done in the period 1-15 March 2008.

i. Search individual newspapers by date

A ready method to search for newspapers is the “Publication Search” option near the top of the first screen. This calls up the next screen, which allows the user to select one, or more, of the 48 newspapers in the database. If one selects, for example, The Daily News for 1858, the next screen displays the Headnote for the newspaper, and below, the dates of the issues that are available for each year and month of its publication. All the user has to do is to key in the year and month of publication, and the response screen shows what dates are available for browsing each issue. If one selects the issue of 30 January 1858, the next screen displays the first page of the issue selected. It is then possible to read this page, or to read just one article within the page.

ii. Browsing forwards and backwards through a selected issue with images of the original pages to read in the usual way

At the Search Results page, it is possible to select the “Browse issue” button, and to have the first page displayed. Then one can choose the “next” button, and page two will be displayed.
At any time, the user can enlarge the image of the pages by selecting the size button displayed. This allows users to read the text better if they so wish, although the text starts to break up when one selects the 200% option. This option is versatile, and allows users to read the newspaper as though they are turning the pages of an original hardcopy issue in a library. The option also encourages systematic viewing by users, who may want to see one page in an issue regularly each day, week or month.

iii. Like-for-like comparisons of the same subject’s treatment by different titles

This can be readily accomplished if a user can be reasonably specific about the search to be done. It is perhaps most readily conducted in relation to events perceived at the time to be of national significance, such as the Battle of Trafalgar in 1805; the Reform Act of 1832; the opening of the Crystal Palace in 1851; the Crimean War of 1853-1856; the purchase of the Suez Canal shares by the British Government in 1875. A search via the “Advanced Search” screen for the three words “Suez”, “Canal”, “Shares” gives 21 results, with reports including these words appearing in thirteen newspapers published between 1875 and 1892.

It is less easy to carry out like-for-like comparisons for events that occurred in each town or region. However, it may be worth attempting, provided one has a specific event in mind, such as a great storm at sea or the sinking of a ship; these may well have been reported in several newspapers in the week or fortnight after they happened. An example is the sinking of a ship. Being an island nation, the UK has many storms at sea and many ships are lost at sea. Here, it is possible to deploy the “Advanced Search” option. Keying the search words “sinking” and “ship” gives 20 results. Nos. 15, 16, and 17 all cover the sinking of an iron ship, The Ganges. The report was reprinted from the London Times in the newspapers: Glasgow Herald, Thursday, August 14, 1862; The Newcastle Courant, Friday, August 15, 1862; and Reynolds's Newspaper, Sunday, August 17, 1862. It was normal practice at this time for regional newspapers to reproduce texts that had appeared in the London daily newspapers. At first glance the text of the article appears the same. However, a closer reading shows that, probably in the interests of saving space and of updating the event itself, the last couple of sentences of the report vary slightly in each newspaper. If one goes to the article originally printed in the Times on August 13, 1862 (via the Times Digital Archive online), one can work out precisely what has been altered by the three other newspapers in the week after the event.

iv. It should be possible to save and build searches

This feature has been enabled by Gale for subscribing organisations. In each working session, each search is stored, and the list of previous searches can be viewed. It is then possible to refine one of the previous searches, and see the results of this additional search step. It is useful for users to be able to send an email to themselves, in order that they may keep a record of searches that have been done.

v. Display of the search results at the article level within the context of the original page

Before the article is opened for reading, users can see the “thumbnail” of the article, and where it was printed on the page. It is normal for each article to be displayed once a result is opened. The search word is highlighted for the user. If one clicks on the “page” option, then the whole page is displayed, together with the article highlighted.
An additional feature at this point is the ability of a user to run the cursor across the list of articles on the right hand side of the screen; a different article is highlighted as each title is selected on the title list.

vi. The ability to search advertisements, obituaries etc.

This is a huge area of potential enquiry. A couple of examples have to suffice here.

vii. Advertisements

All sorts of medicines were promoted throughout the 19th century. A search on Beechams (a well known British medicine company) gives 2,537 results. One can see quickly that many of these appear in the advertisements pages. The default is for the results to appear in ascending order of publication date; however, it is possible to sort the results in other ways, as the drop down box on the right hand side shows. Most usefully in this instance, one can sort by publication title, and bring all the citations of Beecham’s medicines together one newspaper at a time, thus allowing systematic viewing of each citation within that newspaper.

viii. Obituaries

As stated above, one can search across all the newspapers, or in a selected group of newspapers, or again just in one newspaper. A search across all 48 newspapers on the word “Obituaries” gives the huge set of 196,793 results. Searching the same word in two newspapers, Lloyds Weekly Newspaper and the Ipswich Journal, gives us 75,326 results. Searching for this word in just one newspaper, The Aberdeen Journal, gives us 11,330 results. It would be sensible next to refine the search by date, to secure delivery of a smaller number of results. This will allow those who want to establish the date of death (possibly of an ancestor) to do so more quickly.

ix. Marriages and Births

The same logic applies to the Marriages and Births columns of newspapers.
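The refinements used in the searches above, narrowing a large result set by newspaper title and by date and then sorting it, can be sketched in a few lines of Python. This is only an illustration of the logic: the `Hit` record, its fields and the `refine` function are hypothetical names introduced here, and nothing in it reflects how the Gale interface is actually implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Hit:
    """One search result. Hypothetical fields, not the database's schema."""
    title: str       # newspaper title
    published: date  # date of the issue
    heading: str     # heading of the article containing the search word


def refine(hits: Iterable[Hit],
           titles: Optional[Set[str]] = None,
           start: Optional[date] = None,
           end: Optional[date] = None) -> List[Hit]:
    """Keep hits from the chosen titles and date range, oldest first."""
    selected = [
        h for h in hits
        if (titles is None or h.title in titles)
        and (start is None or h.published >= start)
        and (end is None or h.published <= end)
    ]
    # Ascending date order mirrors the default ordering described in the text;
    # the title is used only to break ties between issues of the same date.
    return sorted(selected, key=lambda h: (h.published, h.title))
```

Calling `refine(hits, titles={"The Morning Chronicle"}, start=date(1860, 1, 1), end=date(1860, 12, 31))` corresponds to the single-title, single-year refinement described for the “Singapore” search.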

x. The ability to download text versions of the original pages

After discussion, it was agreed that it would not be possible to download text versions of the original. By this is meant the version that has had OCR applied to it. It is important to emphasise how quick, easy and rewarding these results are, when compared to the traditional method of reading all the advertisements, births, marriages and deaths in each page of each issue of each newspaper.

xi. An anticipated 80% accuracy on the OCR conversion

This matter has also generated much debate, which continues. It is well known that OCR software has been developed in recent years to recognise printed texts that have been published quite recently. On the whole, OCR software works less well with older text fonts. For these, not only are the fonts less familiar, and more varied in their application to newspapers, but one also has to deal with fonts that are worn, resulting in imperfect registration of the character upon the paper, and with fonts that are imperfectly inked, registering on the paper with too heavy or too light inking; both make recognition of the character much harder for software. The problem of binding spine curvature also has yet to be fully dealt with, as the software most readily “recognises” characters on a flat surface, rather than on a paper surface that is curved. These limitations are well recognised in many spheres. Even with these, it is remarkable how much text has been recognised. The ability of separately developed software to carry out “fuzzy” searches at the time the search is being requested by the user also enhances the probability of the character being recognised, along with the word in which that character is contained. At this time, it is important to be positive about the achievements of the software, rather than too critical about how much more there is to be done to raise the percentage of characters and words that are successfully recognised. Here and now, searching is a profitable and rewarding experience for users, despite the limitations of software.
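The paper does not say how the anticipated 80% accuracy was to be measured, nor how the “fuzzy” search software works. As a rough illustration only, the Python sketch below assumes one common convention, character accuracy computed as one minus the edit distance divided by the length of a hand-corrected transcription, and uses the standard library's `difflib.get_close_matches` as a stand-in for fuzzy matching. The function names and the 0.8 cutoff are choices made for this example, not details of the project.

```python
from difflib import get_close_matches
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(truth: str, ocr: str) -> float:
    """Share of characters right, relative to a corrected transcription."""
    if not truth:
        raise ValueError("the reference transcription must not be empty")
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


def fuzzy_lookup(query: str, ocr_words: Sequence[str],
                 cutoff: float = 0.8) -> List[str]:
    """Return OCR words similar to the query, best match first."""
    lowered = [word.lower() for word in ocr_words]
    return get_close_matches(query.lower(), lowered, n=5, cutoff=cutoff)
```

Under these assumptions, an OCR reading of “Sinqapore” for “Singapore” scores eight characters in nine, and a fuzzy lookup for “shipbuilding” would still find a misread form such as “shipbui1ding” while rejecting the merely related word “shipping”.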

4 Conclusion

Two of the concepts that were spoken of a little while ago were ensuring that online databases of older texts were both “readable” and “searchable”. In bringing these files to a wider audience, it is possible to state that these two concepts have been kept in mind, and we can see how well they have been realised. Implementing both ensures that users see the look and layout of the original printed texts, gain the maximum benefit of online searching, and are enabled to work online in whatever way suits them best.

The vast quantities of information now available will greatly enhance research of all kinds, at all levels. It is really difficult to see an end to the research possibilities. One can say with certainty that some of the information that can now be found would have been impossible to find by the conventional means of consulting pages sequentially, reading issue after issue of each newspaper.

British Library Newspapers is planning to digitise a further million pages of 19th century UK newspapers, which, by 2009, will provide 3 million pages of newspaper texts to a far wider public. The BL is actively seeking further funds to create more digital content from older newspapers, to add to the achievements of these two projects. The BL looks forward to the greater fulfilment, via the online availability of these texts, of its strategy to Enrich the User Experience.

NEWSPAPER DIGITISATION IN THE NETHERLANDS The Dutch Digital Databank for Newspapers and other initiatives Astrid Verheusen Koninklijke Bibliotheek (KB)

Abstract The Koninklijke Bibliotheek (KB), the National Library of the Netherlands, initiated the Databank of Digital Daily newspapers project at the end of 2006. The project will realize the large-scale digitization of Dutch national, regional, local and colonial newspapers and make these freely accessible on the Internet. The Databank of Digital Daily newspapers will contain eight million pages, from the first newspaper dated 1618 to the newspapers of the twentieth century. The project involves the selection, digitization, search-and-retrieval and online presentation of the newspapers. One of the greatest challenges for the project is the design of an efficient workflow to handle the selection, preparation, digitisation, processing, long-term storage and presentation of millions of pages. The KB carried out market research to explore the recent technical developments in digitization of historical newspapers. The results of this market research and the new ways in which the KB handles all phases of the digitization process will be presented in the paper. This will include the selection process, copyright issues and co-operation with the publishing sector, research into alternative file formats to reduce the cost of storage, automatic quality control mechanisms, digital preservation of the files and the specifications for the digitisation itself. This paper will also present the KB’s initiative to set up a network of about forty other cultural heritage institutions in the Netherlands involved in the digitisation of local newspapers, to explore possibilities for co-operation, standardize the technical approach for digitisation and build a national portal providing access to all digitised historical newspapers.

1 Introduction

The Koninklijke Bibliotheek (KB), the National Library of the Netherlands, initiated the Databank of Digital Daily newspapers project at the end of 2006. The project will realize the large-scale digitisation of Dutch national, regional, local and colonial newspapers and make these freely accessible on the Internet. The Databank of Digital Daily newspapers will contain eight million pages, from the first newspaper dated 1618 to newspapers of the twentieth century. The project involves the selection, digitisation, search-and-retrieval and online presentation of the newspapers. A web service will be set up with advanced search options for researchers and the general public. The project budget is 12.5 million and is being financed by the 'National Programme for Investments in Large-Scale Research Facilities', a one-off investment in large-scale research facilities.

One of the greatest challenges for the project is the design of an efficient workflow to handle the selection, preparation, digitisation, processing, long-term storage and presentation of millions of pages. The KB carried out market research in May-June 2007 to explore the recent technical developments in digitisation of historical newspapers. The results of this market research and the new ways in which the KB handles all phases of the digitisation process will be presented in this paper. This will include the selection process, copyright issues and co-operation with the publishing sector, research into alternative file formats to reduce the cost of storage, automatic quality control mechanisms, digital preservation of the files and the specifications for the digitisation itself. Apart from the KB, about forty cultural heritage institutions in the Netherlands are involved in the digitisation of local newspapers. More than 150 titles of historical newspapers are being digitised. However, all organisations do things their own way and every organisation builds its own website with specific search functionalities. The KB took the initiative to set up a network of these institutions to explore possibilities for co-operation, standardise the technical approach for digitisation and eventually build a national portal that gives access to all digitised historical newspapers in the Netherlands. The paper will present the way cultural heritage institutions in the Netherlands are trying to achieve this. The project has a duration of five years and will be completed in phases. The first titles will become available at the end of 2008.

2 Selection

Since the publication of the first newspaper in June 1618, more than 7000 newspaper titles have been published in the Netherlands. Only a selection of these can be included in the Databank of Digital Daily newspapers. A responsible choice therefore needs to be made. A Scientific Advisory Committee composed of experts in (press) history was appointed for the selection of the titles on the basis of several criteria. There is no complete or coherent overview of newspapers in the Netherlands and the catalogues and bibliographies available are incomplete as well. Within the project, as complete a list as possible of candidate titles was compiled to facilitate the selection of titles. The following characteristics were applied for the definition of the term newspaper:

• a product from the printing press (i.e. not handwritten);
• periodicity (published, preferably, two or more times per week);
• high content of current affairs;
• universality (all types of news are covered);
• can be purchased by anyone.

For the selection of titles a classification according to periods was made:

• 1618-1800
• 1800-1813
• 1813-1869
• 1869-1914
• 1914-1965
• 1965-1995
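As a small illustration of how such a classification might be applied when sorting candidate titles, the Python sketch below maps a year to one of the six periods. The periods as listed share their boundary years, so the sketch assumes, purely for the sake of the example, that a boundary year such as 1800 belongs to the later period; the paper does not state how the committee treated these years, and the function name is hypothetical.

```python
from bisect import bisect_right
from typing import Optional

# Boundary years of the six selection periods listed above.
BOUNDARIES = [1618, 1800, 1813, 1869, 1914, 1965, 1995]


def period_for(year: int) -> Optional[str]:
    """Return the selection period for a year, or None if out of range.

    Assumption: a shared boundary year counts towards the later period,
    except 1995, which closes the final period.
    """
    if not BOUNDARIES[0] <= year <= BOUNDARIES[-1]:
        return None
    index = min(bisect_right(BOUNDARIES, year) - 1, len(BOUNDARIES) - 2)
    return f"{BOUNDARIES[index]}-{BOUNDARIES[index + 1]}"
```

For example, `period_for(1940)` falls in the period 1914-1965, and a year before 1618 or after 1995 returns `None`.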

For each period, political, social, economic and cultural characteristics have been compiled and the development of the journalistic profession and the newspaper sector in that period has been examined. Selection criteria were drawn up for each period using these characteristics. The available titles will be assessed according to the extent to which they are an instrument and reflection of society at that point in time. A separate set of criteria was drawn up for the colonial newspapers. For the Second World War additional funding was found, which allows for the digitisation of all illegal newspapers from the period 1940-1945 and a selection of the legal newspapers.

Even after the initial selection of titles, practical aspects might mean that a title is still not eligible for selection:

- The collections kept have major gaps and have not been stored in one place. Only in rare cases is the complete title set stored at one location. Sometimes it is not possible to determine the location of certain editions. For each title it will need to be considered how much time should be spent finding and collecting missing copies.
- In the project as much material as possible will be digitised from microfilms. The use of microfilms offers many advantages compared to digitisation from originals. The microfilms, in the possession of the KB or other institutions, are, however, of variable quality, made with differing reduction factors and filmed in high or low contrast. Many of the available microfilms will not be suitable for digitisation. It could also be the case that titles on microfilms are incomplete.
- Copyright applies to newspapers from the twentieth century and therefore these cannot be digitised and made available without permission. When a title is selected for digitisation the rights holder will be tracked down and then contacted. For some titles, the so-called 'orphans', the rights holder might not be found. To prevent problems with respect to copyright, the publishing sector was consulted at the start of the project.
- In the Netherlands many newspapers have been digitised at local and regional archives. For the project, these newspaper titles digitised elsewhere will be identified. The inventory is taken continuously and will be kept as up to date as possible. If such a title occurs in the selection then efforts will be made to reach agreements with the organisations involved.

Of the titles that are ultimately eligible for digitisation, it will be ascertained whether the material, original or microfilm, is available and usable for digitisation. If the title does not meet the requirements it will be removed from the selection. If possible, another title will be selected. However, if the conditions are met, the material will be collected and prepared so that the process of digitisation can start. By now the selection process is almost finished and an estimate can be made of how many pages have been selected for each defined period.

Period                Number of titles   Number of pages   Percentage of total
1618-1800             47                 760,000           10
1800-1813             48                 75,000            1
1813-1869             27                 550,000           7
1869-1995             15                 3,478,802         37
Colonial newspapers   173                3,124,232         39

Copyright issues

From the start of the project, a lot of effort was put into communication with interest groups from the publishing sector and freelancers. The aim was to reach agreements on copyright issues so the project could also include newspaper titles from the twentieth century. Although those negotiations were appreciated by the groups involved, they did not result in permission to digitise 20th century newspapers and make them freely accessible. Both the publishing sector and freelancers are still afraid they will lose revenue. The best results were reached by talking to individual publishers; so far this has led to agreements for the publication of five titles from the twentieth century.

3 Technical developments

The Databank of Digital Daily newspapers is one of the largest digitisation projects of historical material in the Netherlands. Over a period of four years an average of 200,000 pages per month will be selected, prepared, digitised, processed, stored and presented. One of the greatest challenges for the project is the design of an efficient workflow within which this capacity can be achieved. This applies to the workflow both within the KB and at the supplier of the digital content. The market research revealed that only a handful of suppliers have experience in the digitisation of such volumes.

The digitisation has been outsourced by means of a European invitation to tender. The KB published the invitation to tender in November 2007, describing the specifications for digitisation in detail. In preparation for the tender, a market research study was carried out in May 2007 amongst fourteen companies to investigate the current state of affairs in the area of newspaper digitisation.1 The findings of the market research study were incorporated into the specifications of the invitation to tender and are described in the paragraphs below.

______________ 1 See http://www.kb.nl/hrd/digi/ddd/RFIanalyse.pdf for a summary of the outcomes of this request for information.

Work on digitising the first newspapers will begin in April 2008. Prior to digitisation, an extensive analysis of the material will take place, during which gaps will be noted, pages repaired and instructions for digitisation drawn up. For this, a production line has been set up within the KB to process large quantities of newspapers for digitisation in an efficient manner.

The original material

The quality of the digital files depends not only on the quality of the digitisation process but also on the condition of the original newspaper. Original newspapers are often of inferior printing quality (for example, the ink has seeped through the page: ‘bleeding ink’), stained and torn. Some newspapers in collection binders have been bound so firmly in the fold that it is difficult to digitise the separate pages. These problems occur with digitisation both from microfilm and from originals. In the case of microfilm, the quality of the film carrier can also influence the quality of the digital files.

Original versus microfilm

Digitisation will be carried out using both microfilm and original newspapers, from materials owned by the KB itself as well as by other institutions within and outside the Netherlands. New microfilms will not be generated for the purpose of the project. Of all Dutch newspapers published between 1618 and 1995, it is estimated that about 20% are available on microfilm. The quality of these microfilms is variable. Digitisation from microfilm is faster and cheaper, but in general produces digital files of lower quality than digitisation from the original. At the KB, a study is underway into the suitability of different types of microfilm for digitisation and OCR. The results of this research will be made available on the website of the KB.

Specification of the image files

The quality of the image files is determined by the degree to which a scan is a faithful representation of the original. The quality is affected by various factors including bit depth, resolution, storage format and compression. Throughout the project, efforts are being made to attain measurable and 'objective' quality standards. During consultation with the suppliers, agreements are made regarding the optimal fine-tuning and benchmarking between equipment and software. Two types of image file are distinguished in the Databank of Digital Daily newspapers: master files and derivatives. The master files form the basis for all further processing. Derivatives are necessary for presentation on the Internet and as an 'intermediary' for the optical character recognition.
Alternative file formats for masters

If the TIFF format is used for the preservation of the master files, 250 terabytes of storage space will be required to store eight million pages. Because the KB is also involved in other large-scale digitisation projects - the estimate is a production of 40 million images (counting master versions only) in the next four years - a revision of the storage strategy was deemed necessary. Currently, master files are stored in the uncompressed TIFF file format, which is used worldwide. Storing 40 million files in this format would require 650 terabytes of storage space. This was the main reason the KB conducted a study into alternative file formats for the storage of master files from digitisation projects. The aim was to describe alternative file formats that would reduce the necessary storage space. The desired image quality, long-term sustainability and functionality were taken into account during the study.

The KB distinguished three main reasons for wanting to store the master files for a long or even indefinite period:

1. Substitution (the original is susceptible to deterioration and an alternative high-quality carrier - preservation microfilm - is not available)
2. Digitisation has been so costly and time-consuming that redigitisation is not an option
3. The master file is the basis for access, or in other words: the master file is identical to the access file

The conclusion of the research is that for the first two reasons JPEG 2000 lossless is a good alternative file format, while for the third reason JPEG 2000 lossy and JPEG are good alternatives. The use of JPEG 2000 will save about 50 percent of storage space.
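The storage estimates above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 TB = 1,000,000 MB), the function names are hypothetical, and the 50 percent figure is the approximate saving quoted for JPEG 2000, not a measured value.

```python
# Back-of-the-envelope check of the storage figures quoted in the text.
# Assumption: decimal units, 1 terabyte = 1,000,000 megabytes.
MB_PER_TB = 1_000_000


def mb_per_image(total_tb: float, images: int) -> float:
    """Average size of one master file in megabytes."""
    return total_tb * MB_PER_TB / images


def storage_after_saving(total_tb: float, saving: float = 0.5) -> float:
    """Storage in terabytes after applying a fractional saving."""
    return total_tb * (1 - saving)


if __name__ == "__main__":
    # 250 TB for the eight million newspaper pages
    print(mb_per_image(250, 8_000_000))     # 31.25
    # 650 TB for all 40 million master images
    print(mb_per_image(650, 40_000_000))    # 16.25
    # 650 TB reduced by the quoted saving of about 50 percent
    print(storage_after_saving(650))        # 325.0
```

On these assumptions a newspaper master averages roughly 31 MB, while the average over all 40 million masters is roughly 16 MB; the text does not explain the difference between the two averages.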


The results of this research will be published on the website of the KB.

Optical Character Recognition

Digitisation relates not only to the scanning of the material but also to the conversion of the image files into machine-readable text by means of Optical Character Recognition (OCR) and the addition of metadata. The better the quality of the image files of the newspaper pages, the more successful the OCR. The machine-readable text forms the basis of the project, and a good OCR result increases the accessibility of the collection. Considerable attention must therefore be paid to this process.

The KB coordinates the new project IMPACT: IMProving ACcess to Text. IMPACT is funded by the European Commission under the Seventh Framework Programme. It aims to significantly improve access to historical text by improving OCR technology. The project brings together fifteen national and regional libraries, research institutions and commercial suppliers. The start date is 1 January 2008 and the duration of the project is four years. The outcomes of the project are very important for improving the OCR of historical texts such as newspapers.

Layout analysis

The machine-readable text will be supplied in XML, as will the metadata. The relationship between the different files is defined in a concordance table. The identification of newspaper headings, articles and other 'units' on a newspaper page is accomplished by means of an automated layout analysis. So-called zoning tools can recognise and register separate elements including text blocks, images and horizontal and vertical lines. Subsequently, OCR of the separate text blocks and an analysis of the content allow the different segments to be distinguished: articles, advertisements, captions, et cetera. By registering the coordinates of words, and if necessary of separate symbols, search terms can be marked in a picture (‘hit term highlighting’).
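Hit term highlighting as described above amounts to looking up the stored coordinates of every word that matches the search term. The following is a minimal sketch of that idea, not the KB's or the supplier's implementation; the `Word` record, its field names and the pixel values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass(frozen=True)
class Word:
    """One OCR word with its position on the page image (hypothetical layout)."""
    text: str
    x: int
    y: int
    width: int
    height: int


PUNCTUATION = ".,;:!?\"'()"


def highlight_boxes(words: List[Word], term: str) -> List[Box]:
    """Return the rectangles to mark for every word equal to the search term.

    Matching ignores case and surrounding punctuation.
    """
    needle = term.casefold()
    return [
        (w.x, w.y, w.width, w.height)
        for w in words
        if w.text.strip(PUNCTUATION).casefold() == needle
    ]


if __name__ == "__main__":
    page = [
        Word("Amsterdam,", 120, 80, 310, 42),
        Word("den", 450, 80, 90, 42),
        Word("amsterdam", 90, 400, 300, 40),
    ]
    print(highlight_boxes(page, "Amsterdam"))
```

A viewer would then draw these rectangles over the page image at the current zoom level.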
By means of segmented newspaper pages, a collection can be made searchable at article level, whereas the storage of the actual data will remain at page level. Zoning and segmenting of newspaper pages can be carried out semi-automatically. Checking the automatic layout analysis and merging articles that are spread over several pages are important but labour-intensive parts of the process.

Patent

A few days after the KB had concluded the invitation to tender and selected the German company CCS, the KB received a letter from a Dutch company that claimed to hold a patent on the method for automatically extracting articles from a document. After some research this turned out to be true. The company had just acquired this patent, which was issued in 1997 and turned out to be valid in five European countries including the Netherlands.2 All other companies in those five countries that have developed software for automatic segmentation and layout analysis are in fact violating the patent. This patent can therefore seriously hamper newspaper digitisation. For the KB the solution came from CCS, which is now going to do this work manually. At the same time the patent has to be disputed: other companies had already been developing this method since the beginning of the nineties, so the patent should not have been issued in the first place. This strange story shows how the interests of private companies sometimes conflict with those of cultural heritage institutions.

______________ 2 ‘Apparatus and method for extracting articles from a document’, EP 0 753 833 B1. The patent is valid in Germany, France, Great Britain, Italy and the Netherlands.


Quality Assurance

After the content has been supplied, a number of quality and integrity checks will take place. These checks will focus on the quality of the images (carried out by quality managers), the validity of the XML files, the accuracy of the metadata content and the correct correlation between the different files. The checks will be partly automated and partly manual; random samples will be taken for the manual checks. The KB has developed a special application to support these checking activities. The market research revealed that various companies have developed web-based systems for quality control, which give the client the possibility of following and checking the production process online. During the project some checks will be made via the Internet and some after delivery at the KB. Because of the large amount of data, quality checks will be automated as far as possible. During the project, QA will be evaluated on a regular basis to avoid spending too much time on the procedure. The right balance between quality and quantity is essential when dealing with mass digitisation.

Besides quality controls, conversions are carried out during the processing phase to incorporate the files into the data infrastructure of the KB. This includes the conversion to Dublin Core, MPEG21 DIDL and the standards used for the e-Depot, the digital archiving environment of the KB that ensures long-term access to digital objects.3
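The split between automated checks and manual checks on random samples can be sketched as follows. This is an assumption-laden illustration, not the KB's checking application: it only tests that an XML file is well-formed (a weaker condition than validity against a schema), and the function names, the 5 percent sampling fraction and the fixed seed are hypothetical choices.

```python
import random
import xml.etree.ElementTree as ET
from typing import List, Sequence


def is_well_formed(xml_text: str) -> bool:
    """Automated check: does the text parse as XML at all?"""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def pick_sample(page_ids: Sequence[str], fraction: float = 0.05,
                seed: int = 0) -> List[str]:
    """Manual check: draw a reproducible random sample of pages to inspect."""
    if not page_ids:
        return []
    size = min(len(page_ids), max(1, round(len(page_ids) * fraction)))
    return sorted(random.Random(seed).sample(list(page_ids), size))


if __name__ == "__main__":
    print(is_well_formed("<page><title>Courant</title></page>"))  # True
    print(is_well_formed("<page><title>Courant</page>"))          # False
    batch = [f"page-{i:04d}" for i in range(200)]
    print(len(pick_sample(batch)))                                # 10
```

Fixing the seed makes the sample repeatable, so a second reviewer can inspect exactly the same pages.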

4 Local initiatives and national co-operation

Newspaper digitisation is very popular in the Netherlands. About 150 local newspaper titles are currently being digitised by forty local institutions. To avoid duplication of effort, the KB has identified these newspaper titles digitised elsewhere. The inventory is taking place continuously and is kept as up to date as possible. If a title that is being digitised elsewhere occurs in the selection, efforts are made to reach agreements with the organisations involved. So far, agreements have been made for thirteen newspaper titles that have been digitised elsewhere and will also be available in the Databank for Dutch Newspapers.

Even more important are the differences in technical approach between the local digitisation projects. The ideal for the future is a national portal that gives access to all digitised Dutch historical newspapers. This will only be possible if all local digitisation projects conform to certain necessary technical standards. To reach this goal, a core group for newspaper digitisation in the Netherlands was set up at the end of 2006. It consists of several Dutch institutions that are involved in newspaper digitisation. The main aim of the core group is to exchange knowledge about technical standards, copyright issues and all other aspects of the digitisation of newspapers. A mailing list was set up, and a conference was organised in 2007. It is hoped that exchanging knowledge in this manner will make it possible in the future to exchange files and eventually to build a national portal that gives access to all digitised historical newspapers in the Netherlands.

______________ 3 For more information about the e-Depot, see http://www.kb.nl/dnp/e-depot/e-depot-en.html.

THE CALIFORNIA DIGITAL NEWSPAPER PROJECT: CANVASSING, CATALOGING, PRESERVATION, DIGITISATION

Henry L. Snyder
University of California

Abstract

This paper discusses our project to digitise selected California newspaper titles as part of the National Digital Newspaper Program (NDNP) sponsored by the National Endowment for the Humanities (NEH), with additional support from the California State Library (CSL). It begins with the acquisition of a major film archive as part of the California Newspaper Project, also funded by NEH, to inventory and preserve California newspapers. The next step was a request from the California State Library for us to digitise the first California newspaper, the Daily Alta California. That metamorphosed into a major digital newspaper program when we were selected as one of the six test sites for the NDNP. The main focus of the paper is the issues and challenges we faced in embarking upon this program, such as the task set by NEH, which was complicated when the program was expanded to include additional titles with CSL support. Most of all, the challenge has been to create our own webpage and mount all the issues digitised. In the process we are designing our own presentation software, a still further complication. I will go through each step in turn, laying out the choices we faced and the decisions we have made.

1 Background

The California Digital Newspaper Project is an outgrowth of the California Newspaper Project (CNP), a component of the United States Newspaper Project (USNP), a massive effort to record the surviving issues of newspapers published in the United States and ensure their preservation for future generations. Initially the National Endowment for the Humanities (NEH) selected ten major repositories to inventory their collections, thus creating a base file. The records were loaded into CONSER, the national serials database maintained by OCLC. Now, two decades later, projects have been sponsored in all US states and territories and the District of Columbia. All are finished except for New York, Pennsylvania and California. California has the third largest project (after New York and Illinois) in terms of total number of titles recovered and the second largest (after Texas) in terms of geographical area covered.

The main office of the Center for Bibliographical Studies and Research (CBSR), which houses the project, is located at the University of California, Riverside. In addition to the CNP, the Center manages the English Short Title Catalog (ESTC) in North America and CCILA, a union catalog of Latin American imprints to 1851. The ESTC is a joint venture with the British Library and the American Antiquarian Society. As of August 2007, we had created or edited 8,925 records for California titles and 4,524 for out-of-state titles held in California repositories. Our working file, which includes entries that have yet to be verified, some for which no copy survives, and some duplicates, totals 18,510. We have verified a total of 45,070 holdings records to date and visited over 1,400 institutions, some many times.

Unlike many states, California has no one predominant collection of U.S. or California newspapers. The largest collection is in the Bancroft Library at the University of California, Berkeley (UCB), which is why we located one of the offices there. UCB has approximately 5,000 titles. The second largest collection is at the State Library, which has files of approximately 2,000 titles with another 80 titles as sample issues. The State Library undertook an initial survey, published in 1984, to determine the scope of the project, but declined responsibility for its management. UCB also refused. As a native Californian, and recognizing the importance of the project, I agreed to accept responsibility for it at the Center. We applied for funding in 1990 and received our first award in 1991. Microfilming began in 1999. Fortunately, as the bulk of the materials had already been filmed, we have been able to film those titles we have identified to be of interest. Most of the remaining unfilmed titles were small weeklies whose content was of limited interest.

The most interesting development in the CNP, which leads directly to our digitisation efforts, is our acquisition of extensive film stocks. So far as we are aware, this is unprecedented in the USNP. And it was serendipitous. In sum, over a period of half a dozen years we have acquired an archive of some 100,000 reels, totaling more than 50,000,000 pages. To put that in perspective, other than the 20,000 reels stored at Heritage Microfilms in Iowa and an equal or larger number of reels stored at ProQuest vaults in Ohio, we have the only significant archive and far and away the largest.
The reels were cut into 100-foot lengths, viewed to obtain an accurate record of what was on each reel, labeled and boxed, catalogued and entered into the appropriate Local Data Record (LDR) in OCLC, and then shipped to the campus library. The campus library in turn adds the data to its OPAC and then ships the reels to the library storage facility, which enters them in its own OPAC. This series of acquisitions proved to be providential, for it enabled us to become a player in the new National Endowment for the Humanities initiative to digitise a significant number of titles and runs of U.S. newspapers and mount them on the Chronicling America webpage of the Library of Congress for free public access.

2 CBSR’s Role in Chronicling America

When the program was announced in 2004, we applied for and received one of the six initial test awards. We were authorised to digitise, over two years, 100,000 pages of California newspapers published between 1900 and 1910. We chose the San Francisco Call, arguably the best daily of the period and one especially noteworthy for its illustrations, and two small weeklies to represent the diversity of the state.1 We were also fortunate to obtain additional funding from the State Library, which expanded our NDNP project. Several years before, the State Library had approached us about digitizing the earliest California newspaper, the Alta California, and its immediate successors, the most important title for the third quarter of the nineteenth century. We applied for funding to the State Library and received two successive grants, which enable us to digitise not only the complete run of the Daily Alta California and its predecessors from 1846 to 1891 but also the San Francisco Call for the remaining nine years of the century, linking it up to the NEH-funded effort. The result will be to make available free to the public a complete run of a San Francisco newspaper from 1846 to 1910. We have had a third grant and have now begun digitizing the Los Angeles Herald. In the next phase of both the federal and state funded grants we will be extending the run and adding a Sacramento title, as well as smaller regional newspapers. By 2009 we expect to have digitised 200,000 pages under the NDNP program and another 200,000 pages with State support.

Taking on the additional charge from the State Library added another important ingredient, for we agreed to mount what we were doing with State support together with the NDNP titles on our own web page for free access by the public. The NDNP specifications called for page level access. For our own web page we decided to offer article level access.

______________ 1 Its quality is best demonstrated by the fact that the publishers of the two other morning dailies, M H De Young of the San Francisco Chronicle and William Randolph Hearst of the San Francisco Examiner, conspired to buy it secretly in 1913, shut down the printing plant and converted it to an afternoon newspaper, also abandoning its high standards.

3 THE DIGITAL NEWSPAPER PROGRAM: DEALING WITH THE ISSUES

Our digital newspaper program is the most daunting initiative I have undertaken in thirty years of managing online bibliographical projects. The technological challenges are many and diverse. The first round of NDNP awards was intended as a test phase, and the sponsoring agencies encouraged the participants to select different vendors. We had three bids and we selected OCLC working in partnership with CCS. It has been an excellent decision.

The process

We begin with the film itself, from which the digitised images are created. One variable is the quality of the film. A second issue is the filming itself, and the third is the nature of the run. Technical issues aside, the selection of the titles themselves can be a more sensitive issue than it has been for filming. As a large part of the surviving runs and issues had been filmed, especially of the most important titles, the project team was able to film titles with more limited coverage and circulation that were still required to provide geographic coverage of the state. Digitisation, however, is a different matter; it requires careful selection, and the titles digitised will be only a fraction of those filmed. There is also the issue of copyright. The NDNP ends in 1922 to avoid this issue. How would you choose?

After selection, the film is taken to the preservation laboratory at UC Berkeley to have some test copies made. The technician gives us several samples made to different specifications and we choose the best. Once we make the determination, the film is sent to OCLC, which does the digitisation and the raw OCR. But this is only the beginning. At this point the data is transferred to a 450 gigabyte hard disc. OCLC loads the disc together with a check list and ships it to Content Conversion Specialists (CCS) in Germany. CCS completes the digital/OCR process, creating the metadata, and reloads the data with the additional metadata and a new check list. The disc is then sent to our UC Berkeley office.
There the quality check of metadata and images is made by a trained digital specialist in the Bancroft Library. Sometimes there is some error: a missing track, or missing or incorrect data. The defective drive is retained and CCS is notified to send replacements for the defective or corrupt files. The replacements are sent via FTP to Berkeley, and the drive is updated and validated again. The balance of the data is sent on to our UC Riverside office. There it is downloaded to our digital archive and backed up. Again a check is made for completeness, and any discrepancies are noted and reported to be corrected. We then send the disc on to the Library of Congress. Once more the data is validated and, if acceptable, loaded into Chronicling America. At that point, once the presentation software is in place, it is available for public access through either Chronicling America or our California Newspaper Digital webpage at UC Riverside. The images funded by the State Library go directly from CCS to Riverside, where they are checked and loaded.

The technical specifications for the metadata, the digitised images, etc., have been carefully worked out by the Library of Congress for the NDNP project. Because of the size of the files and the poor quality of the paper employed, there are many technical hurdles to be overcome. Moreover, there is no internationally accepted set of standards, though the National Digital Newspaper Program (NDNP) standards are becoming the de facto standard. Briefly, they are:

• Page image: grayscale, 300 or 400 dpi, from microfilm; TIFF 6.0; JPEG 2000 (.jp2); PDF with hidden text
• OCR: XML (ndnp/alto schema); page-level, uncorrected, column zones with “bounding box” mapping coordinates
• Metadata: XML in METS/MODS for digital objects

The metadata specifications are very detailed and complex. For extra large pages the dpi is 300. Validation has been developed for the two levels, which require a considerable investment in themselves. Although the specifications have been carefully worked out, all the participants discover that unresolved problems arise in the process of organizing the materials, submitting the metadata, and then processing the results.

Duplicate pages

This involves both unintentional duplicates and deliberate duplicates, where the page was refilmed to obtain a better image. We can choose to include them or exclude them. At the local level, when we load the data for our own web page, we are concerned that if they are included they will be difficult to remove at a later stage and could affect the metadata.
Missing issues

There are instructions for dealing with these. When we received the files back from the vendor and loaded them, we found that the metadata included citations to missing pages in the page-level data but not at the article level. Given the complicated metadata structure and the difficulty of modifying it once it is loaded, we are still unclear how we can integrate overlapping files of the same title, but we assume that this can be worked out, eliminating duplicates in the process, manually or electronically. The same procedures are followed for the files created solely for the CBSR webpage, but the greater complexity of the data owing to the article-level requirement creates new sets of problems. The LC validation software cannot be employed, as it works only with page-level data.

These are issues related to NDNP specifications. We have a second set of issues in dealing with our vendors. This has been a learning process for all of us. For example, thirteen years of the Daily Alta California were printed on elephant folio paper. Each page was filmed as two segments. We have asked CCS to stitch them together. They had done so with other publications, but with nothing of this size or antiquity. The samples we have seen are marvelous: the join can be detected only by a slight alteration in the line separating the columns, and the text at the join is fully legible.


Corrupt or missing files

Dealing with corrupt or missing files was more problematic. Most batches have some errors, but they are generally minor. For example, the page count does not match the file submitted; the metadata doesn’t match the manifest; there are missing pages; the metadata does not include the missing pages; the pages are filmed out of sequence and hence the numbering is incorrect; the XML linking is incorrect; or there are broken tags. These errors may occur for a variety of reasons, such as corruption on the mailed disks. Errors like these require hand correction. The errors occur mainly in the metadata, more rarely in the images. Our developer told us it was better to wait until the whole file was corrected rather than try to overlay a correction later. Consequently, when an error in a batch is detected, CCS is notified and a corrected file requested. We had requested a prompt return. But CCS, faced with production schedules and systems operating at capacity, wanted to accumulate the requests and process them together on a monthly basis, since inserting them meant halting the production line and they did not want to do so more frequently. Eventually a compromise was worked out.

The issues we faced in readying the data to send to LC were dwarfed by the issues we faced in setting up our own web page. Just the selection of the software required to load, preserve and present our files is a daunting task. LC employs four different packages, three of which are open source and one of which is proprietary. Initially we assumed we would load some existing presentation software, either proprietary or open source, and then adapt it to our needs. We hired a Ph.D. candidate in computer science to do the evaluation. After some study he decided that none of the existing programs provided an ideal response and proposed to design his own. At his request we hired a second graduate student to assist him.
We had to order new equipment for the developer team and had to hire a new systems manager who understood Linux. He discovered that our computers were operating on outmoded software, in versions no longer supported even though we were paying substantial maintenance fees for them, that computers had been improperly installed, and that our systems were a patchwork of ill-fitting pieces with no documentation. He had to convert the data systems gradually to more stable platforms. At the same time he had to evaluate, select and install new servers and programs to meet the ever-increasing demands of our programmers and the processing of the digital files we were receiving on a regular basis.

Creating a new presentation system from scratch is a major task. Relying upon two part-time student workers, however skilled, does not make the task any easier. They have come up with some innovative features that we consider very attractive, and the speed with which files are retrieved and displayed is amazing. But there are many other facets: web page design, user-friendly features, and more. They are also deeply involved in ingestion procedures. They have had to create a validation system for article-level data in order to automate a large part of the processing. They set up protocols for sending updates and corrections. We believe the end product will rival or surpass anything currently available and plan to market it commercially.
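Several of the batch errors described earlier (page counts that do not match, missing pages, pages out of sequence) can be detected automatically by comparing a delivery against its manifest. The sketch below illustrates that kind of check under simple assumptions; it is not the validation system our developers built, and the function name and the page identifiers are hypothetical.

```python
from typing import List


def check_batch(manifest: List[str], delivered: List[str]) -> List[str]:
    """Compare delivered page identifiers with the manifest; list the problems."""
    problems: List[str] = []
    expected, received = set(manifest), set(delivered)

    if len(manifest) != len(delivered):
        problems.append(
            f"page count mismatch: expected {len(manifest)}, got {len(delivered)}"
        )

    missing = [p for p in manifest if p not in received]
    if missing:
        problems.append("missing pages: " + ", ".join(missing))

    unexpected = [p for p in delivered if p not in expected]
    if unexpected:
        problems.append("unexpected pages: " + ", ".join(unexpected))

    # Compare the order of the pages that appear on both sides.
    common_delivered = [p for p in delivered if p in expected]
    common_manifest = [p for p in manifest if p in received]
    if common_delivered != common_manifest:
        problems.append("pages out of sequence")

    return problems


if __name__ == "__main__":
    print(check_batch(["p1", "p2", "p3"], ["p1", "p3"]))
    print(check_batch(["p1", "p2", "p3"], ["p2", "p1", "p3"]))
```

An empty result means the delivery matches the manifest on these points; errors inside the metadata itself, such as broken tags, need separate checks.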

Presentation System Feature List Below is a brief list of features for the METS/ALTO Presentation System. • Full resolution pan-and-scan page viewing Users are able to view the entire page at full resolution, using a click-and-drag method commonly referred to as “pan-and-scan” to navigate over the page in the viewport. Users are also able to zoom in and out of the image in order to

68

Snyder: The California Digital Newspaper Project















improve ease of navigation. Search terms are also highlighted on the image to help users locate them.

• Article clipping
If the ingested data has been logically analyzed and segmented into blocks, the presentation system can query the original, full-resolution image for the blocks that contain a given article. These blocks are clipped out of the image at the time of the request and displayed in a printer-friendly webpage to facilitate printing of individual articles.

• Proprietary fuzzy OCR search technologies
Since OCR is an inherently error-prone process, words are often misread. Whether letters are missing or misinterpreted, expanding the query to include similar letter alternatives (e.g., ‘1’ for ‘l’) has been shown to give more accurate results for OCR data. The presentation system’s search engine provides an interface through which the user can tune the parameters that control this automatic query expansion. Taking a principled approach to fuzzy search, our system performs character replacement based upon OCR-specific metrics.

• Scalable multi-head & single storage ingestion system
The ingestion system is a lightweight, cross-platform application that parses METS/ALTO XML files and builds the data store necessary to interact with the presentation system. Heads of the ingestion system can be run on multiple machines, from anywhere, provided they have access to the raw data repository and to the database. This allows the administrator to run one head for standard processing, while allowing for accelerated processing with multiple heads if the resources are available.

• Automated ingestion
The ingestion system also provides mechanisms by which the administrator can monitor the ingestion status of assets. Similarly, the administrator can roll back or even re-ingest a resource with a single click of the mouse. There is a modest built-in failure recovery process that can handle errors such as a head that gets disconnected in the middle of parsing an asset.

• Collection browsing
The presentation system allows users to browse an entire collection chronologically. The user simply selects the title of the newspaper to browse and can then navigate by date to see the titles of all the articles in the newspaper. Each of the articles can then be viewed in the printable format or with the pan-and-scan image viewer.

• User-defined clipping
As an alternative to printing an entire article, the user can clip a region of interest from the full-resolution image. This is similar to article clipping except that the region is entirely user-defined.

• Proprietary high-performance on-the-fly JP2 splicing
The ingestion system and the presentation system both use a high-performance JPEG 2000 processing library for on-the-fly clipping as well as prepared clipping of the images for faster response.
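The query expansion described under the fuzzy OCR search feature can be illustrated with a minimal sketch. The system's actual method is proprietary and is not documented here, so everything below is an assumption made for illustration: the `CONFUSIONS` table, the `expand_query` function and the `max_variants` parameter are hypothetical names, and a real system would presumably derive its character alternatives from measured OCR error statistics rather than a hand-written table.

```python
from itertools import product

# Hypothetical table of characters that OCR engines commonly confuse.
# Each entry lists the original character first, then its look-alikes.
CONFUSIONS = {
    "l": ["l", "1"],
    "1": ["1", "l"],
    "o": ["o", "0"],
    "0": ["0", "o"],
}


def expand_query(term, max_variants=50):
    """Return the search term plus variants with look-alike characters swapped.

    The original term always comes first. max_variants caps the output,
    because the number of combinations grows quickly with term length.
    """
    choices = [CONFUSIONS.get(ch, [ch]) for ch in term]
    variants = []
    for combo in product(*choices):
        variants.append("".join(combo))
        if len(variants) >= max_variants:
            break
    return variants


print(expand_query("cold"))
```

A search engine could then match any of the returned variants against the OCR text. The cap plays the role of a tunable parameter of the kind the feature list mentions, although how the real interface exposes such parameters is not stated in the source.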

We are planning to have a commercial version available by 1 January 2008.

Let me make very clear one lesson that we have learned through painful experience. Whatever one plans, the costs will eventually be greater, the equipment projections will be found to be inadequate, and the timeline too short. It is not made any easier by the rapidity with which the technology changes. Two years ago, I sought the best advice I could find on the storage needs for the initial 200,000 pages. Nobody really knew how large digitised newspaper files could be. (Parenthetically, at the NDNP meeting in Washington in July we were told by LC programmers that they averaged 56 megabytes a page.) The data as received from CCS included TIFFs, JPEG 2000 files, and metadata. A few months ago, we were informed that we did not have to load the TIFFs on our server and that they could be created on the fly from the JPEG 2000s. Our 24 TB began to look as though it could handle two or three times as much as originally envisaged, or more. Needless to say, this was very welcome news.

Not long before we started the digital project we had purchased a major new server that we thought would take care of all our needs for some years to come. Instead, we have added five new smaller but powerful servers, with the expectation that as our database grows and user searches increase we will be adding new servers regularly, based upon a distributed rather than a centralized system. What we are saving on storage units we are spending on new servers! In all fairness, this storage space and the additional servers must also accommodate the needs of our developers, which are not modest. But it is a caveat to all of us that in a developing field of this kind it is impossible to forecast with finality the resources required.

But it does not end there. The Center is housed in an old motel building that is neither fireproof nor secure, and it is essentially unprotected and unguarded at night. We asked the campus computing center to house our new equipment. The new storage rack and hard disc were installed there and were soon operating without a hitch. But when our developers required additional equipment, we were told that the computing center had neither the space nor a secure power source to supply it. I knew that the rack we had purchased was only half filled. When I inquired, I was told that they had loaded equipment from other departments on it and could not move it until space and power supply were identified.
Bit by bit we have worked these issues out and, though sometimes with a delay, all the equipment is installed and functioning at the computing center.

The cost of the additional storage required was worrisome. If we could store the entire backup files elsewhere, that would reduce our space needs substantially. Given that the area is prone to earthquakes, having the backup files stored in another location was also desirable. Consequently, we contacted the California Digital Library (CDL), operated by the Office of the President of the University of California and specifically designed to provide a university center to store and disseminate data on behalf of the campuses and their libraries. When we approached them we were told that they were just setting up test sites to explore storage needs, management and dissemination. Newspapers were not part of the test scheme. When we pointed out the importance for the scholarly community and the state, and the unique needs that had to be accommodated, we were told to submit the project through our library for consideration. After some months the project was accepted, but the actual transmittal and storage of data was some months away. It was at this stage that I authorised the purchase of an additional fifteen terabytes of storage. The CDL was also concerned about how to transmit data files of the size we were preparing and determined it had to develop a whole new set of procedures. They are doing so in cooperation with the supercomputer center at the University of California, San Diego, and ours is a test project. Once more this illustrates the pioneering nature of the tasks in which we are engaged and the complexity and challenge that newspapers represent.

A recent development about which I am particularly keen is our scheme to develop a program to utilize the digital newspapers as a teaching tool.
We are commissioning a series of lesson plans that link our files to digitised pictures, manuscripts and other printed archives posted on Calisphere, a product of the California Digital Library of the University of California, and on American Memory, an ambitious digital program of the Library of Congress. I have no doubt that the California Digital Newspaper Project (CDNP) will find a wide audience. At the same time I recognize that to attract the audience I know exists, we must launch and maintain a major public relations and publicity effort. We hosted an all-day conference on 19 October 2007, bringing together state officials, publishers, historians, librarians, teachers and researchers to explore the value of digitised newspaper files to a variety of users and how best to exploit them.
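As a rough illustration of the storage planning problem described earlier in this section, the figures quoted there (200,000 initial pages, an average of 56 megabytes a page as reported by the LC programmers, and 24 TB of storage) can be combined in a back-of-the-envelope calculation. This is only a sketch under stated assumptions: it uses decimal units (1 TB = 1,000,000 MB), it treats the 56 MB average as applying to this project's pages, and the source does not say which file types that average covers. The function names are hypothetical.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB


def estimated_storage_tb(pages, mb_per_page):
    """Storage needed for a given number of pages, in terabytes."""
    return pages * mb_per_page / MB_PER_TB


def pages_that_fit(capacity_tb, mb_per_page):
    """Whole number of pages that fit in a given capacity."""
    return int(capacity_tb * MB_PER_TB // mb_per_page)


initial_tb = estimated_storage_tb(200_000, 56)
print(f"Initial 200,000 pages: {initial_tb:.1f} TB")       # 11.2 TB
print(f"Pages fitting in 24 TB: {pages_that_fit(24, 56)}")  # 428571
```

On these assumptions the initial pages would occupy a little under half of the 24 TB, which is broadly in line with the observation above that the storage could hold about twice what was first envisaged. Backups, developer workspace and future growth are not included in this estimate.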

4 REACHING OUR GOAL

Our ultimate goal is to create a California digital newspaper webpage that focuses exclusively on California titles, has links to all existing digital California newspaper projects and, where possible, permits cross searching over databases other than our own. We want to create tools to increase both the usefulness of the data and the ease with which it can be accessed. The lesson plan is one such initiative, aimed at the K-12 population.

We are concerned about retrieval accuracy. We all know that it is less than 100% and that the older the title and the poorer the original or the filming, the lower the retrieval rate. Genealogists, another key user group, are searching mainly for vital statistics, which are invariably set in the smallest type. Moreover, these are not consistently marked off, so we cannot group the sections under a genre heading. We are open to suggestions as to how to improve retrieval for searchers. We did make the decision to create article-level retrieval as opposed to the page-level retrieval dictated by LC. We have also established a limited number of genre headings so that searches can be narrowed to give more focused retrievals.

The California Newspaper Project is a term project. Once it is completed, and funding ends, it will close. But we have become aware that the whole enterprise is too vital for preserving the historical record of the state to simply shut down. For one thing, if there is value in inventorying and preserving what survives of the newspapers published in California, surely that value extends to the future as well as the past. It is over a decade since we surveyed the first counties. What has transpired since that time? New titles have appeared while others have been terminated. Are the physical copies still in existence? Have they been preserved safely and filmed? There is an urgent need to provide for a continuing presence. There is the management of the film stock. Our files include details of issue-by-issue inventories that cannot be entered in the CONSER records. There are all our historical notes, a library of newspaper histories and special issues, and other related material. All are of considerable value to future researchers. It is our intent to create a California Newspaper Archive supported by state funding to ensure the maintenance of this invaluable and unique historical resource.

NEW ACCESS TO OLD MATERIALS: The Hong Kong Newspaper Literary Supplements Digitisation Project

Leo F. H. Ma and Louise L. M. Chan
The Chinese University of Hong Kong

Abstract

Newspapers are an essential resource for teaching and research activities in educational organizations. In many academic libraries, newspapers are stored in microform format to save space. However, readers find it difficult to access this important pool of information efficiently. To meet the increasing demand for newspaper resources on Hong Kong literature in the universities of Hong Kong, the University Library System of The Chinese University of Hong Kong launched the Hong Kong Newspaper Literary Supplements Digitisation Project in 2000. Twenty major newspaper literary supplements in Hong Kong such as Ta Kung Pao Wen Yi 大公報文藝 (1938 - 1941), Lih Pao Yan Lin 立報言林 (1938 - 1941), Sing Tao Daily Xing Zuo 星島日報星座 (1938 - 1941), Wah Kiu Yat Po Wen Yi 華僑日報文藝 (1947 - 1949), New Life Evening Post Xin Qu 新生晚報新趣 (1945 - 1975), Wen Wei Po Wen Yi Zhou Kan 文匯報文藝周刊 (1948 - 1949), etc. were identified and digitised. Currently, more than 170,000 entries of the newspaper literary supplements have been provided through the Hong Kong Literature Database, a common platform for different kinds of resources on Hong Kong literature including monographs, journals, newspaper articles, theses and dissertations, and internet resources. Apart from reviewing the Hong Kong Newspaper Literary Supplements Digitisation Project, this paper also looks at the challenges in digitizing newspaper information and suggests possible solutions to the problems encountered.

1 INTRODUCTION

Recognizing the emerging need for primary resources on Hong Kong literature among academicians, the University Library System of The Chinese University of Hong Kong (hereafter CUHK ULS) launched the Hong Kong Literature Database (hereafter the Database) in June 2000, the first database on Hong Kong literature in the world. The Database provides a common platform for storing and retrieving various types of materials on Hong Kong literature, including monographs, journals, newspaper literary supplements, theses and dissertations, and web resources.1 The Hong Kong Newspaper Literary Supplements Digitisation Project (hereafter the Project), being part of the Database project, was implemented at the same time as the launch of the Database. Before
______________
1 For a detailed description of the objectives, scope and technical specification of the Database, please refer to another article on this topic by the author (Ma, Wong & Lau 1341-1347).


Ma and Chan: New Access to Old Materials

describing the Project, this paper first looks into the increasing research interest, among Chinese scholars in the past two decades, in studying modern Chinese literature through newspaper literary supplements as a supplement to the traditional methodologies and perspectives. The paper then describes the Project in detail, including its objectives, scope and searching capabilities, in order to demonstrate how CUHK ULS responds to readers’ demand for access to newspaper literary supplements. The challenges and outcomes of the Project are also discussed.

2 The Study of Newspaper Literary Supplements

Modern Chinese literature has been developing for more than 90 years since the Literary Revolution (1917) in China. In the past, studies of modern Chinese literature focused mainly on writers, works, sects, literary history, etc. Relatively speaking, there have been few in-depth studies of newspaper literary supplements. Taiwan scholar Lin Qiyang (林淇養) pointed out that literary supplements form a significant part of Chinese newspapers which has long been neglected by researchers (Lin 23). However, a new research methodology which emphasizes interdisciplinary studies of Chinese literature has been emerging in the past two decades. Literary study no longer adheres only to its traditional perspectives and methodologies; instead, it integrates knowledge from different subjects and disciplines. Some researchers now approach their studies from both textual and socio-cultural aspects. Textual study analyses the fundamental composition and features of the genre, while socio-cultural study examines the production and development of literature in relation to its social and cultural context (Zhou & Yang 1). The newspaper literary supplement is also a major publication tool in the mass media. It is both physical and intellectual, since the literary supplement provides not just a physical condition for the production and development of literature but also a fundamental change in the format and contents of literary works. Thus, Chen Pingyuan (陳平原), a scholar from mainland China, stated that the study of newspapers is a study of both the physical and the intellectual, of culture and literature, and of contents and format (Chen 97). Whether in mainland China, Hong Kong, Taiwan or overseas, the study of newspaper literary supplements has gradually become a popular research topic.
A number of publications specializing in newspaper literary supplements have been published.2 Recently, universities in Hong Kong have launched an increasing number of research projects on newspaper literary supplements. For instance, the Department of Chinese Language and Literature of The Chinese University of Hong Kong has undertaken five research projects on literary supplements in Hong Kong newspapers, namely An Investigation of the Literary Supplement “Qianshui Wan” Edited by Liu, Yi Chang in Hong Kong Times (1960.2.15 – 1962.6.30) (劉以鬯主編《香港時報淺水灣》時期 [1960.2.15 – 1962.6.30] 研究) (2002/03), A Study on the Literary Supplement “MEGPAPER” in the Hong Kong Daily News (《新報》副刊MEGPAPER研究) (2002/03), A Study on the Newspaper Literary Supplements of the Overseas Chinese Daily News (Wah Kiu Yat Po) of Hong Kong (《華僑日報》副刊研究) (2003/04) and A Study on Supplements of the New Life Evening Post (《新生晚報》副刊研究) (2005/06).
______________
2 In the past ten years, scholars in mainland China and Taiwan began to recognize the importance and significance of newspaper supplements in their research. In 1997, a major conference on newspaper supplements was held in Taiwan and subsequently the proceedings, Shijie Zhongwen Baozhi Fukanxue Zonglun (《世界中文報紙副刊學綜論》“On the Chinese Newspaper Supplements in the World”) (Taipei: Council for Cultural Affairs, 1997), were published. Also, scholars such as Chen Pingyuan (陳平原), Leo Li Ou-fan (李歐梵), Leung Ping-kwan (梁秉鈞), Lin Qiyang (林淇養), etc. have published papers on the research methodology and significance of newspaper literary supplements.

From the perspective of literary study, studies on literary supplements of Hong Kong newspapers serve three purposes:

(1) Literary supplements in Hong Kong newspapers are major resources on Hong Kong writers. Many prolific Hong Kong writers are either contributors to or editors of newspaper literary supplements. For example, Gao Xiong (高雄) and Hu Juren (胡菊人) were editors of New Life Evening Post and contributors to “Xin Qu” (“New Fun”) of New Life Evening Post (新生晚報新趣). Other famous Hong Kong writers like Shisan Mei (十三妹), Dong Qianli (董千里) and Liu Yichang (劉以鬯) were either contributors to or columnists of “Xin Qu”.

(2) Literary supplements in Hong Kong newspapers provide a channel for writers from mainland China to publish their works, especially in politically turbulent periods. For instance, “Xingzuo” (“Horoscope”) of Sing Tao Daily (星島日報星座) (1938-1941) and “Wenyi” (“Literature and Arts”) of Ta Kung Pao (大公報文藝) (1938-1941), significant literary supplements in Hong Kong, had many notable writers from mainland China, such as Ba Jin (巴金), Dai Wangshu (戴望舒), Shen Congwen (沈從文) and Xiao Qian (蕭乾), as their contributors.

(3) The layout and content of the literary supplements reflect the production mechanism and mode of transmission of literary works in a society, which in turn effectively reflect the literary ecology and societal changes of a particular place at a particular time. Liu Yichang (劉以鬯), a renowned writer and veteran editor in Hong Kong, once mentioned that there is a close relationship between newspapers and the development of Hong Kong literature. Since the 1920s, almost every newspaper has had a literary supplement (Liu, About Hong Kong Literature 132). In addition, the readership for literary works in a highly commercialized place such as Hong Kong is small and it is hard for a literary journal to survive in the market.
Thus, writers in Hong Kong have to rely on newspaper literary supplements to publish their works (Liu, About Hong Kong Literature 143). All in all, literary supplements in Hong Kong newspapers play an important role in Hong Kong literature and are significant research materials for the study of Hong Kong literature.

3 The Hong Kong Newspaper Literary Supplements Digitisation Project

Literary supplements in Hong Kong newspapers have, without doubt, published substantial primary source materials on Hong Kong literature and culture. However, literary supplements are very often published on a daily or weekly basis, which results in a great quantity of paper originals. It is not easy to take good care of this large volume of old newspapers or to store them in proper condition, which is a great hindrance to researchers of Hong Kong literature. The Project is one of the major digitisation projects initiated by CUHK ULS in 2000. The Project has digitised literary works in 20 major literary supplements published in Hong Kong newspapers since the 1930s and makes these literary supplements freely available on the Internet. The Project not only allows readers around the world to access Hong Kong literary works via the Internet, but also provides first-hand materials for researchers to support their research on Hong Kong literature.


At present, the Project includes “Xingzuo” of Sing Tao Daily (1938-1941), “Yan Lin” of Lih Pao (立報言林) (1938-1941), “Wenyi” of Ta Kung Pao (1938-1941), “Wen Xie” of New Life Daily (新生日報文協) (1945-1946), “Xin Yu” (“New Language”) of New Life Daily (新生日報新語) (1945-1946), “Xin Qu” of New Life Evening Post (1945-1975), “Wenyi” of Sing Tao Daily (星島日報文藝) (1947-1953), “Wenyi Zhoukan” of Wah Kiu Yat Po (華僑日報文藝週刊) (1947-1949), “Wenyi” of Ta Kung Pao (1948-1952), “Wenyi Zhoukan” of Wen Wei Po (文匯報文藝周刊) (1948-1949), “Xin Wenyi” of Wen Wei Po (文匯報新文藝) (1950-1951), “Wenyi” of Wen Wei Po (文匯報文藝) (1956-1967), “Qianshui Wan” (“Repulse Bay”) of Hong Kong Times (香港時報淺水灣) (1960-1962), “Xing Hai” of New Evening Post (新晚報星海) (1979-1991), “Da Hui Tang” (“City Hall”) of Sing Tao Evening Post (星島晚報大會堂) (1981-1991), “Wen Lang” of Wah Kiu Yat Po (華僑日報文廊) (1992-1994), “Wenyi Qixiang” of Sing Tao Daily (星島日報文藝氣象) (1992-1993), “Wanfeng” of New Evening Post (新晚報晚風) (1993-1995), “Wenyi” of Wen Wei Po (1993-1998) and “Wenxue” of Ta Kung Pao (大公報文學) (1992-2003). (Figure 1) Each newspaper literary supplement has a home page so that readers can browse the literary supplement by issue and/or date. (Figure 2)

Figure 1: Title List for Hong Kong Newspaper Literary Supplements

Figure 2: Home Page of “Wenyi” of Wen Wei Po

The Database also provides different access points, such as title, author, keyword, source title, etc., to facilitate searching. For those supplements which have been granted permission, both index and full-text are available. Otherwise, only index is provided. For example, full-text articles of “Wenxue” of Ta Kung Pao are available from the first issue onwards. (Figure 3)

Figure 3: An example of a full-text article in “Wenxue” of Ta Kung Pao

In addition, the Database provides a hyperlink to the library catalogue of CUHK to indicate the availability of the paper originals. (Figure 4)

Figure 4: Hyperlink to Library Catalogue

To provide readers with the most relevant primary sources, the Project collaborates with experts and scholars in Hong Kong literature as well as research units in universities. For instance, from July 2004 to February 2006, the Project received a research grant from the Hong Kong Arts Development Council (hereafter HKADC) to compile indices of three literary supplements, namely “Xin Qu” of New Life Evening Post, “Wenxue” of Ta Kung Pao and “Wenyi” of Wen Wei Po. “Xin Qu” of New Life Evening Post, a major literary supplement in Hong Kong, was published daily for 30 years (1945-1975) and carried some 110,000 articles. Notable prolific Hong Kong writers such as Shisan Mei (十三妹) and San Su (三蘇) were its contributors.


In 2005, Fan Sin-pui from the Department of Chinese Language and Literature of CUHK, Principal Investigator of the research grant project A Study on Supplements of the New Life Evening Post funded by the Research Grant Committee of the Hong Kong SAR Government, made use of the index provided in the Database to conduct an in-depth and comprehensive study of “Xin Qu”. On 3rd March, 2006, a press conference was held by the University Library System to publicize the outcomes of the Project. At the press conference, Fan Sin-pui talked about the results of his research project with the media. (Figure 5) Also, Liu Yichang (劉以鬯) delivered a talk on “My Experience as an Editor of Newspaper Literary Supplements in Hong Kong” to share his experience in editing literary supplements such as “Qianshui Wan” of Hong Kong Times and “Da Hui Tang” of Sing Tao Evening Post (Liu, “My Experience” 74-78). (Figure 6)

Figure 5: Press Conference held on 3rd March 2006 (From left: Fan Sin-pui, Rita Wong and Leo Ma)

Figure 6: Talk by Liu Yichang (劉以鬯) on 3rd March 2006

4 Challenges and Outcomes

The Project has encountered various problems and challenges since its inception in 2000. In summary, there are two main types of challenge, namely resources and technical processing issues. Regarding resources, given the huge volume of articles published in literary supplements, a significant amount of resources has to be allocated to the Project so as to better manage the mass of data. A server has to be acquired for data storage, computer experts have to be recruited to create and maintain the Database, and a lot of manpower has to be engaged to collect, input and proof-read the data. In the light of these difficulties, a variety of measures have been taken to secure funding for the Project. First, funding and manpower are allocated to the Project from the recurrent library budget. Secondly, research grants are obtained from outside funding bodies.3 Lastly, the Project proactively engages in collaboration with faculty and uploads their research results to the Database.4 Klijn pointed out that the challenge of a newspaper digitisation project “is to find a balance between accessibility and feasibility within the limitations of available resources.” (Klijn)
_______________
3 As mentioned in section 3 of this paper, the Project received funding from HKADC to digitise the indices of three newspaper literary supplements.
4 As a recent example, permission was granted by Hoyan Hang-fung (何杏楓) and Cheung Wing-mui (張詠梅) of the Department of Chinese Language and Literature of CUHK to upload the index of “Qianshui Wan” of Hong Kong Times, which is part of their research findings, to the Hong Kong Literature Database.


Apart from the resource issue, the Project has to resolve various technical processing problems. First, the availability of a newspaper is vital to digitizing its literary supplement. For instance, the supplements of Sing Pao Daily News (成報), as mentioned by Lo Wai-luen (盧瑋鑾), were very popular and influential in Hong Kong (Tay, Wong & Lo 9). However, there is no complete run of this newspaper in Hong Kong libraries. Second, the output of a digitisation process is often affected by the quality of the preserved newspapers. It is difficult to digitise newspapers because they are sometimes badly printed and the paper becomes fragile and is damaged very easily. Where the digitised images are converted from microform, the fine print of the newspaper may be blurred. (Figures 7 & 8) All these factors affect the quality of the images produced in the digitisation process. Third, inaccurate data in the paper original requires an additional checking process which, of course, lengthens the inputting process.

Figure 7: An example of blurred characters in the title of an article (“Xin Qu” of New Life Evening Post on 1947.1.10)

Figure 8: A symbol “” is used to indicate an unrecognized character (“Xin Qu” of New Life Evening Post on 1947.1.10)

For example, the publication date of “Xin Qu” of New Life Evening Post on March 14, 1948 was wrongly printed as March 13, 1948.5 (Figure 9) It is therefore no surprise that James-Gilboe comments that “[d]igitization is difficult, harder than you ever anticipated, more frustrating than you expect, and full of lessons learned the hard way.” (James-Gilboe 155)

Figure 9: “March 14, 1948” was wrongly printed as “March 13, 1948” (“Xin Qu” of New Life Evening Post)

_______________
5 In fact, the problems encountered by CUHK ULS are quite common to other newspaper digitisation projects worldwide. For example, Bremer-Laamanen stated that the rather late start of newspaper digitisation projects, especially in the Nordic countries, was due to the large holdings of newspapers, the poor print, the poor quality paper and the use of Gothic Fraktur and Roman text in the Nordic countries. (Bremer-Laamanen 169)
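The additional checking step for misprinted dates mentioned above lends itself to a simple automated aid. The following is only a sketch and not the procedure actually used by the Project, which the paper does not describe. It assumes a daily supplement with a complete, gap-free run of issues entered in shelf order, and the function name `find_date_mismatches` is hypothetical. In the sample data, only the 14 March 1948 issue printed as 13 March comes from the paper; the surrounding dates are invented for illustration.

```python
from datetime import date, timedelta


def find_date_mismatches(printed_dates):
    """Compare printed issue dates with the dates expected for a daily run.

    printed_dates: dates as printed on consecutive issues, in shelf order.
    Returns a list of (position, printed_date, expected_date) tuples.
    Assumes one issue per day with no missing issues (an assumption).
    """
    if not printed_dates:
        return []
    first = printed_dates[0]
    mismatches = []
    for position, printed in enumerate(printed_dates):
        expected = first + timedelta(days=position)
        if printed != expected:
            mismatches.append((position, printed, expected))
    return mismatches


# The third issue is really 14 March 1948 but carries the date 13 March.
issues = [date(1948, 3, 12), date(1948, 3, 13), date(1948, 3, 13), date(1948, 3, 15)]
for position, printed, expected in find_date_mismatches(issues):
    print(f"Issue {position}: printed {printed}, expected {expected}")
```

A check of this kind only flags candidates for a human to verify against the paper original. Where a run has genuine gaps, such as days on which no supplement appeared, the expected sequence would have to be adjusted.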


Despite all these challenges, the outcomes of the digitisation project are very encouraging. Readers can now access Hong Kong newspaper literary supplements through the Internet regardless of time and geographical location. Through the Project, readers worldwide can become acquainted with Hong Kong literature, and researchers can access more primary source materials to facilitate their research on Hong Kong literature. Furthermore, the Project helps preserve the paper originals of newspaper literary supplements. It is widely acknowledged that the paper format of newspapers is inconvenient to use, while the microform format is certainly not easy to access. The digitised format, by contrast, is easy to use and access. Since readers can locate the required information through the Database, they no longer have to browse the newspapers page by page. The paper originals of the literary supplements can therefore be better preserved in the long run.

5 Concluding Remarks

Digitisation is an on-going project. Over the past eighty years, far more newspaper literary supplements have been published in Hong Kong than are included in the Project. As a strategic response, major newspaper literary supplements are identified and digitised first. Given the constraints of available resources, it is not difficult to see that there is still a long way to go before the sheer amount of primary source materials contained in newspaper literary supplements can be digitised and made available to readers around the world.

References

Bremer-Laamanen, Majlis. “Connecting to the Past: Newspaper Digitisation in the Nordic Countries.” Journal of Digital Asset Management 2.3/4 (2006): 168-171.

Chen, Pingyuan. Wenxue de Zhoubian. (《文學的周邊》“In the Margin of Literature”) Beijing: New World Press, 2004.

James-Gilboe, Lynda. “The Challenge of Digitisation: Libraries are Finding that Newspaper Projects are not for the Faint of Heart.” Serials Librarian 49.1/2 (2005): 155-163.

Klijn, Edwin. “The Current State-of-Art in Newspaper Digitisation: A Market Perspective.” D-Lib Magazine 14.1/2 (2008): 9 pp. 29 Jan. 2008. http://www.dlib.org/dlib/january08/klijn/01klijn.html

Lin, Qiyang. Writing and Mapping: A Study of Phenomenon of the Literary Communication in Taiwan. (《書寫與拼圖:台灣文學傳播現象硏究》) Taipei: Rye Field Publishing Co., 2001.

Liu, Yichang. Changtan Xianggang Wenxue. (《暢談香港文學》“About Hong Kong Literature”) Hong Kong: Holdery Publishing Enterprises Ltd, 2002.

---. “My Experience as an Editor of Newspaper Literary Supplements in Hong Kong.” (〈我編香港報章文藝副刊的經驗〉) Hong Kong Literature Monthly (《城市文藝》) 1.8 (2006): 74-78.

Ma, Leo, Rita Wong and Paul Lau. “Preserving the Literary Past Looking to the Future: the First Hong Kong Literature Database.” Journal of Zhejiang University SCIENCE 6A.11 (2005): 1341-1347.

Tay, William (鄭樹森), Wong Kai-chee (黃繼持) and Lo Wai-luen. Xianggang xin wenxue nianbiao (1950 – 1969 nian) (《香港新文學年表﹝1950 – 1969年﹞》“The Chronology of Modern Hong Kong Literature [1950-1969]”). Hong Kong: Cosmos Books Ltd., 2000.

Zhou, Haibo (周海波), and Yang Qingdong (楊慶東). Chuanmei yu Xiandai Wenxue Zhijian (《傳媒與現代文學之間》“In Between Media and Modern Literature”). Beijing: China Social Sciences Press, 2004.


CREATION OF A NATIONAL NEWSPAPER REPOSITORY AT THE UNIVERSITY OF ZIMBABWE (UZ) LIBRARY

Edward Tasikani
University of Zimbabwe Main Library

Abstract

The creation of the National Newspapers Repository (NNR) has been necessitated by the absence of any facility providing this information to the nation other than the National Archives. Such a repository can be created for the nine main newspapers by the nation’s premier institution of higher education. The newspaper collection at the UZ library was destroyed because of storage problems. The institutional repositories use DSpace software to run the database, and this could be put to use for a start. The online database would be an open access facility for the academic and research communities throughout the world. Local newspapers provide vital primary information depicting the social and economic activities of the local population in a country.

1 Introduction

Lynch, C. regards a National Newspaper Repository (NNR) as a set of services that a university offers to the members of its community for the management and dissemination of digital materials generated by local newspaper publishers to promote local research. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution of the material. This serves the purpose of an information centre without a physical location for the end user.

2 Background to the Problem

Zimbabwe has about nine popular local newspapers which are used by researchers as primary sources of research information. They are: 1. The Chronicle; 2. The Herald; 3. Financial Gazette; 4. The Independent; 5. Manica Post; 6. The Standard; 7. The Sunday Mail; 8. Sunday News; and 9. Zimbabwe News, along with quite a few titles that have ceased publication. Universities in the country currently do not have a reliable collection of local newspapers. All researchers requiring this resource are referred to the National Archives. The National Archives holds a collection of microfilm editions of the newspapers, which means one has to travel to the National Archives in order to access the collection. An online database stands to benefit researchers in general, in that the metadata can be searched using controlled terms, which saves users time by quickly identifying what they are interested in from any computer connected to the internet. Collections provided by individual publishers tend to be biased and do not promote objective and balanced research, because publisher biases are embedded in the stories they carry. Because of that, there is a need to host local newspaper articles in a neutral database coordinated by the University of Zimbabwe library to promote research. This is more so in an environment


Tasikani: Creation of a National Newspaper Repository at the UZ Library

where some of the papers are state owned while others are privately owned. The views expressed in these papers are directed by editorial policy, and some political biases are deliberately put across to influence readers' views. A neutral database for these papers would provide a fair platform from which readers could sift through the material and draw conclusions based on the available facts. National newspapers provide vital primary information for local researchers on various aspects of the country as a whole. Academic libraries do not adequately provide this valid source of research information to further research on local issues. Digital repositories are a common feature of the information landscape and have advanced information communication systems wherever they have been launched. The objective of this paper is to propose the organisation of a database of all locally published newspaper articles and to make them available to readers and researchers on an open access platform. This would complement the work of the National Archives, which became the sole provider of newspaper research facilities after the University of Zimbabwe (UZ) library erroneously discarded its own collection due to storage problems. According to Ngulube, P., microfilming and digitisation are the two reformatting products widely used in Sub-Saharan African archival institutions. It is in view of this assertion that this paper hopes to unmask the benefits that would accrue to the majority of scholars and researchers.

3 The way forward

The proposed database would hold primary information about Zimbabwe, giving access to newspaper articles on everyday events that portray the reality prevailing in the country at every stage of its history. Publisher biases are neutralised when publications from the right and the left wing are available on the same platform, so that researchers can assess and select information according to their own interpretation of the situation on the ground. The UZ library is on course in harnessing local publications for research through the establishment of the Faculty Institutional Repositories, which use DSpace software. The same concept can be extended to local newspapers as a primary source of vital information for scholars, accessed on an open access platform. This would go a long way towards enhancing local research through the use of information already published in local papers. There are, however, views discouraging the use of ICTs to repackage this resource because of the expense incurred in the process. Traditional archivists tend to look at the expense more than the benefits, even where some of the material is already in electronic form. For such material, all that would be required is to install Acrobat Distiller to convert the text to PDF before loading it into the already established open access institutional repository, which uses DSpace to upload documents. This can be done with all the newspapers that provide online editions of their paper but archive them only for short periods of up to four or six months. This short retention necessitates the creation of an online digital newspaper database available to researchers on an open access platform. The UZ Library has the capacity to handle such a programme, since projects of a similar nature are already operational in the library system.
The institution would be ready to archive the material for a longer period and make the collection available to all those in need of such information. This information needs to be kept in local academic library archives since it reflects the life and culture of the local inhabitants. Apart from mere archiving, this would help augment the depleted library stock, which is greatly affected by the ever-diminishing library book budget. Librarians and archivists have a duty to acquire documentary heritage and make it available to current and future generations by prolonging the life of the information in archival material. The main objective of repackaging the documents through digitisation is to prolong the life of the content, not the life span of the original document. Scholars are more interested in the content than in the actual document. This project would focus on adding value to local research in academic circles. The education system would no longer have to buy back local research produced by foreign authors, who get access to this information and write about Zimbabwe even better than their Zimbabwean counterparts because they have the necessary resources at their disposal.
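The PDF-to-DSpace step described in this section can be illustrated with a minimal sketch. It assumes the library would use DSpace's Simple Archive Format for batch loading, in which each item is a directory holding the file, a `contents` list and a `dublin_core.xml` metadata record. The function name, directory layout and sample metadata values are hypothetical and are not taken from the UZ project.

```python
from pathlib import Path
from xml.etree import ElementTree as ET
import shutil


def build_item(archive_dir, item_name, pdf_path, title, issued, publisher):
    """Create one Simple Archive Format item directory for a newspaper PDF."""
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # Copy the PDF (the "bitstream") into the item directory.
    pdf_path = Path(pdf_path)
    shutil.copy(pdf_path, item_dir / pdf_path.name)

    # The "contents" file lists one bitstream file name per line.
    (item_dir / "contents").write_text(pdf_path.name + "\n", encoding="utf-8")

    # dublin_core.xml carries the descriptive metadata for the item.
    # The three fields chosen here are an illustrative minimum.
    root = ET.Element("dublin_core")
    for element, qualifier, value in [
        ("title", "none", title),
        ("date", "issued", issued),
        ("publisher", "none", publisher),
    ]:
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )
    return item_dir
```

A directory of such items could then be handed to the repository's batch import tool. Which subject terms and further fields to record would depend on the controlled vocabulary the librarians adopt.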

4 Partnership with publishers – win-win situation for both parties

Before undertaking the project, the responsible institution would have to clear copyright issues with the various newspaper publishers, since the project means converting the material from its initial format to digital form. This could be eased by uploading the material retrospectively: current newspapers available on the streets and on the publishers' websites can continue to be accessed there, while the database uploads older information required for research purposes. The National Newspaper Repository would therefore exist to provide information for scholarly research. The microfilming process is more expensive than digitisation of the collection. Digitisation would be done by a single library for the benefit of the whole academic community the world over. Once the material is digitised, original documents would not suffer from overuse, and the need to renew the microform negatives would fall away. The National Archives would retain its position and preserve the original documents from which the materials are derived. Original paper copies need to be kept in environments that are conducive to the preservation of paper. The temperature and humidity need to be closely monitored so that newsprint does not become brown and brittle; to avoid this, newspapers would have to be published on acid-free paper. Apart from temperature control, putting such infrastructure in place costs more than simply repackaging the material. Tsadik, D. G. points out that people in developing countries give archives second priority after other basic needs, at the expense of cultural heritage. Cox concurs and points out the common view of archives and repositories: their holdings are seen as written in indecipherable, ancient scripts on yellowing scraps of paper, either treasures waiting to be discovered or a nuisance dumped in basements.
It is for such reasons that UZ and other major libraries in Zimbabwe abandoned the preservation of national newspapers, or have been late starters in preserving local newspapers and making them available to researchers, citing it as a responsibility of the National Archives. By taking up the project, the University of Zimbabwe will not only complement the work of the National Archives in making local newspapers available to the academic community but will also extend its position nationally as a focal source of information for academics and researchers alike. The metadata will be captured by professionally trained librarians, which will enable easy retrieval of relevant documents from the database. Researchers would be able to access similar documents from different publishing houses and arrive at a logical conclusion on the matter at hand. This would make UZ Library outshine other local information service providers by providing a neutral source of information which knows no creed, colour or politics in its quest to provide information. Other schools of thought dismiss digitisation of newspapers to promote access as unprogressive and unhelpful in the preservation of locally produced information, citing illiteracy with regard to ICTs. Zimbabwe is, however, preparing for a situation in which all school leavers are ICT literate. The President launched the programme “ICT Literacy for All”, under which computers were supplied to schools throughout the country. The University of Zimbabwe reciprocated by introducing Information Literacy Skills (ILS), a programme taught as a foundation course to all first-year students. This is meant to back up the President's efforts, to empower students to exhaust all sources of information when researching their assignments, and to equip them with a skill to fall back on in the world of work as decision makers. Students write an examination in this course at the end of the first semester.

5 Digital newspapers

These are digitised copies made available on the web as a service from the library. As indicated earlier, at the moment there are nine national papers which need to be digitised and put on the web. The ultimate aim is to make all newspapers in the country accessible on the web. Reaching that goal will take some time and will require progress on the Library's part in the area of digital rights. Right now all old newspapers have been shredded due to storage problems. Once this project takes off with current papers, the old papers would be captured gradually, working backwards. If the library manages to set up a fund and dedicate staff to this purpose, a good number of the current issues would be loaded and the metadata captured so that the database returns all relevant articles for the general search terms entered by clients. Clausen, S. refers to a service offered by the newspaper collection of the National Library of Norway through some localised databases. The University of Zimbabwe can also start on a smaller scale by adopting a few such databases which are user friendly, especially for the less computer-literate public in the countryside. The chosen database should be able to handle the needs of historians and the historically interested public, and so should give students of History access to the collections. That means updated and complete catalogues and bibliographies with both simple and sophisticated functionalities, and of course accessibility on the WWW. It also means a high degree of accessibility to the document in full text, directly on a computer screen, for anyone who intends to research Zimbabwe's developmental issues and social life. Old newspaper volumes are not suitable for interlibrary loan or too much copying, and the collection is located in the middle of the country, some distance from major towns or densely populated areas.
The Library has to make the collections digital as fast as possible in order to serve the academic community and researchers as a modern academic library with adequate local resources to promote local research. To ease the process, the library could coordinate the project by placing satellite databases in other university libraries in the country, since this is meant to be an open access database that benefits all would-be searchers. The University of Zimbabwe Library would coordinate the creation of this online database with the help of periodicals staff, since newspapers are part of the serials collection. At this point it is hoped that newspaper publishers can be involved. They could supply monthly cumulations of their publications, downloaded onto compact discs, as part of the subscriptions charged to the library. Some of the publishers visited had no problem with providing the library with such a service. To cover the vast amount of information at large, student librarians could also be involved: the Harare Polytechnic has a library school, and its Head of Department was agreeable to the involvement of student librarians in capturing the information to create an up-to-date database. The database would be available to all researchers, provided they access it through the University of Zimbabwe website, where it would be hosted.

6 Conclusion

The University of Zimbabwe, as the nation's premier institute of higher learning, has to chart a way forward in preserving locally produced information and making it available. This would provide for continuity in research and would also give the international community a window through which to view some of Zimbabwe's artistic and cultural beauty. Libraries in the country have not realised the value of locally produced information as a means of developing local research to greater heights. The involvement of publishers and student librarians is meant to ensure the continuity of this database. If the project were started at the University of Zimbabwe without adequate patronage, it would most likely die a premature death once the initiator of the idea left the institution. Their involvement also reduces the fears expressed by traditionalists that conversion would take too long. This would be circumvented by using the abundant manpower at the Bulawayo and Harare Polytechnics and partially processed material from the publishers, as evidenced by the responses obtained from newspaper publishers in a survey carried out in Harare, the country's capital city.

References

Clausen, Søren. Newspapers at the National Library of Norway. Library Management. vol.26, no.1/2. 2005. pp.57-62 [online]. Retrieved December 11, 2007 from http://www.emeraldinsight.com/Insight/ViewContentServlet;jsessionid=BC6DA3531A2B8C0E9C1E38811292DB20?Filename=Published/EmeraldFullTextArticle/Articles/0150260108.html

Cox, Richard J. Managing institutional archives: foundational principles and practices. Westport: Greenwood Press. 1992.

Lynch, Clifford A. Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age. IN: ARL Bimonthly Report 226, Feb. 2003. p.4. Retrieved October 10, 2007 from http://www.arl.org/newsltr/226/ir.html

Ngulube, Patrick. Preservation reformatting strategies in selected Sub Saharan African Archival Institutions. African Journal of Library, Archives & Information Science. vol.12, no.2. pp.117-132.

Tsadik, Degife Gabre. A National archives for Ethiopia what hopes? African Journal of Library, Archives & Information Science. vol.1, no.2. 1991. pp.117-132.

WIDENING ACCESS AND LEGAL ISSUES – NEWSPAPERS IN FOCUS

Majlis Bremer-Laamanen
The National Library of Finland

Abstract Finland's new Act on Legal Deposit and changes in the Copyright Law came into force in 2008. The Act on Legal Deposit and Preservation of Cultural Material encompasses the archiving and preservation of different categories of cultural material, from traditional print material to newer categories such as web pages, films, and television and radio programmes. The Centre for Preservation and Digitisation in the National Library of Finland has been a cornerstone in enhancing the mass digitisation of a number of collections such as journals, books and sound. Mass digitisation and advanced search were the key issues when planning the digital Historic Newspaper Library in Finland. The library was one of the pioneers in the field when it was launched in 2001 as part of the Nordic Historic Newspaper Library - TIDEN. The search possibilities and technical infrastructure were enhanced for its second launch in November 2007. More than a century of newspapers is now included in the site, from the first Finnish paper in 1771 to 1890. This includes all 160 titles published during the period. During this time use has risen constantly, with 3.3 million searched pages in 2007. A new role for the Centre for Preservation and Digitisation is under way. In the near future the Centre will be a digitisation centre for memory organisations: libraries, archives and museums.

1 Legal Deposit and Copyright Law – Widening Access

1.1. Legal Deposit

The new Act on Legal Deposit and changes in the Copyright Law in Finland came into force on 1 January 2008. The Act on Legal Deposit and Preservation of Cultural Material encompasses the archiving and preservation of different categories of cultural material: traditional print material, web pages and other web material, films, and television and radio programmes. Web material as well as television and radio programmes are new objects for archiving and preservation. The traditional print material, web pages and other web material such as newspapers and sound are retrieved and stored by the National Library. The publisher of web materials is obliged to enable retrieval and saving and, in certain cases, to submit the material to the Library. The other material is retrieved by the Finnish Film Archive. For newspapers, the law enables the Library to receive newspapers published in paper format also in the electronic format used for printing. The publishers have to include bibliographic


Bremer-Laamanen: Widening Access and Legal Issues – Newspapers in Focus

and copyright information with the delivery. The newspapers are to be preserved at the National Library. In reality the web material will not be as inclusive as the paper-based legal deposit. The Ministry of Education will approve a plan for web delivery and access; these in turn depend on technical and financial resources, other needs and so on.

1.2. Copyright

The copyright law enables access to the legal deposit material in electronic format in:
- the legal deposit libraries in Finland (6 around the country), the National Library included
- the Library of Parliament
- the University Library of Tampere
- the Finnish Film Archive

The digital material is available at special self-service desks in these organisations for research and study purposes. This wider access now makes it possible and feasible to digitise 20th century newspapers for a geographically wider range of users than would have been the case if only the National Library had had the right to give access to the material within its premises. The copyright law also makes it possible for one of the copyright bodies to make licence agreements regarding the use of copyright-based material. In the future this law will enhance the use of newspapers and other material on a wider scale in society, based on agreements and payment.

2 Long-Term Preservation

Microfilm has for more than half a century been the medium for the long-term preservation of newspapers, and also of brittle books, in Finland. Microfilm has a lifespan of 500 years. This is also promised by Ilford for colour microfilm today. Still, the originals in the national collection are stored but not used after filming. In February 2007, the Ministry of Education set up a Working Group on Long-term Preservation of Cultural Material with the central governmental bodies: the National Archives Service, the National Library of Finland, the National Board of Antiquities, and the Finnish Film Archive. The report was published in January 2008. Among its commissions were:
- to define principles for long-term preservation and use of electronic materials at the national level
- to draw up a plan for organizing long-term preservation
- to examine benefits of shared technical infrastructure
- to make a proposal for a funding model and administrative structure for a shared technical infrastructure

According to the report, a shared technical infrastructure for long-term preservation could well be combined with access services.


Internationally, microfilm is gaining more importance as a preservation option via
- digital preservation options on microfilm
- the use of colour film
- the possible sound preservation on film
- the possibility of storing as much as 2 terabytes of information on one film (source: current research suggestion at Fraunhofer Institute in Germany)

3 Mass Digitisation – More than a Century of Newspapers Online

Mass digitisation and advanced search were the key issues when planning the digital Historic Newspaper Library in Finland. The library was one of the pioneers in the field when it was launched in 2001 as part of the Nordic Historic Newspaper Library - TIDEN. The search possibilities are still in place and have been enhanced, while the technical infrastructure had undergone remarkable development by the time of its second launch in November 2007. During this time use has risen constantly, reaching 3.3 million searched pages in 2007.

3.1. Enhancement of Search Functions

Advanced search possibilities for the user since 2001:
- full text search for texts in Roman and Gothic (Fraktur) type. This means that all the words in the newspapers are searchable. Technically advanced optical character recognition (OCR technology, ABBYY FineReader) is used. Retrieval software with a fuzzy search engine (Excalibur RetrievalWare) was chosen to identify words as bit strings and thus interpret the words even if 2-3 letters were inaccurate. The reader is shown the actual facsimile page of the newspaper.
- browsing of the facsimile newspapers according to title and date: year, month and day. Specific information on the newspaper is also included.
- an article index arranged in a tree structure with about 450,000 articles. The articles had been manually collected into a huge index in the 19th and early 20th century. This was then digitised and OCRed. You can search by personal names, names of places (cities, villages) and search words. These are then classified with more specific search terms which lead you to the articles in the newspapers.
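The tolerance for two or three misread letters described in the first item can be illustrated with a small sketch. It uses plain edit distance (Levenshtein) as a stand-in; it is not the bit-string pattern matching of the retrieval software named above, and the sample words are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_find(query, ocr_words, max_errors=2):
    """Return OCR words within max_errors edits of the query, ignoring case."""
    target = query.lower()
    return [w for w in ocr_words if edit_distance(target, w.lower()) <= max_errors]


# Invented OCR output: the second word has two misread letters.
words = ["Helsingfors", "Hclsingfovs", "Stockholm"]
matches = fuzzy_find("Helsingfors", words)
```

With `max_errors=2` the misread form is still found, which is the practical benefit for Fraktur text where OCR errors are frequent. Comparing the query against every word is slow on a large corpus, so a production engine would use an index rather than this linear scan.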

The second version of the Historical Newspaper Library was launched in November 2007, with the Minister of Education as a guest speaker (Historical Newspaper Library, version 2: http://digi.lib.helsinki.fi).
- More than a century of newspapers is now included in the site, from the first Finnish paper in 1771 to 1890. This includes all 160 titles published during the period.
- The search possibilities from the first Historical Newspaper Library (above) are included.
- The search words are now highlighted on the facsimile newspaper page. This helps the reader to find the search word when viewing, e.g., a 7-column page on the screen.
- Enlarging text and printing are easy.
- The architecture and appearance of the site are more advanced. Translation into Swedish and English has been improved.

3.2. Technical Development

If we look at the main processes of the Digital Chain for Newspapers, they include three parts:
- the physical object in the process
- the digital object in the process
- the collection of metadata

The technical development in mass digitisation meant that the digital and metadata processes of the Digital Chain changed completely between 2001 and 2007.

3.2.1. The Physical Object Enhancing Access

The use of high-quality microfilm as a platform for digitisation is the key to mass digitisation of newspapers in Finland. Microfilm has been used as an access and storage medium at the National Library since 1951. Since 1997, all newspapers and free papers have been microfilmed immediately after publication at the National Centre for Preservation and Digitisation. Some years earlier a refilming programme for older newspapers had been established at the site. Improvements in microfilming standards, chemistry and film material (from acetate to polyester film), together with better preparation, made it possible to plan for microfilming of better quality, suitable for future mass digitisation. The results are available via the IFLA Newspaper Section in “Guidance on the best practice for microfilming of newspapers in preparation for possible future digitisation” (2003). English, French, Spanish and Chinese versions are available on the IFLA-net. You can also find more information on the TIDEN page, http://tiden.se. The re-filming has been done for the newspapers from 1771-1920 and continues for the years 1920-60. The process is cost-efficient and includes:
• the preservation of the newspapers
• a digitisation chain for the future

That is why we are technically able to continue to widen access to all newspapers up to the 1920s in the coming years, and up to the 1950s within the decade.

3.2.2. The Digital Object and Access

The main process for the digital object consists of the following procedures: scanning, post-processing, access, archiving and back-up. In the Newspaper project we could see a change in the technical infrastructure that made it possible to proceed from the first phase of mass digitisation and post-processing to the second phase:
• More efficient microfilm scanners with grey-scale options have entered the market.
• The post-processing of the images is now integrated into one tool, docWorks by CCS. The tool provides the means to perform structural analyses of the content: titles, numbers and dates are checked, and the material is OCRed with ABBYY FineReader as a component. The coordinates of each word are recorded automatically, which enables us to highlight the words in the facsimile newspaper.
• The use of one tool speeds up the process, and our staff are now able to handle more than ten times as many newspapers as before. The speed, however, depends on the depth of the structural data captured, on the material in question and on process development. We get METS, ALTO, JPEG and TIFF output from docWorks.
• The technical and access environments have to be developed continuously according to new demands, and this has been the case. Automatic import and export of the data to the database have been developed, as has the access user interface (http://digi.nationallibrary.fi).
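The role of word coordinates in highlighting can be sketched as follows. ALTO records each recognised word as a `String` element carrying `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes. The sample fragment and its values are invented for illustration, and the function name is hypothetical; this is not the Library's own interface code.

```python
from xml.etree import ElementTree as ET

# Invented ALTO-like fragment; real files carry a namespace and more structure.
SAMPLE_ALTO = """<alto>
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Helsingfors" HPOS="120" VPOS="340" WIDTH="210" HEIGHT="38"/>
    <String CONTENT="Tidningar" HPOS="345" VPOS="340" WIDTH="180" HEIGHT="38"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def word_boxes(alto_xml, term):
    """Return (hpos, vpos, width, height) for each String matching term."""
    root = ET.fromstring(alto_xml)
    boxes = []
    for elem in root.iter():
        # Compare local names so the sketch works with or without a namespace.
        if elem.tag.rsplit("}", 1)[-1] != "String":
            continue
        if elem.get("CONTENT", "").lower() == term.lower():
            boxes.append(tuple(
                float(elem.get(key)) for key in ("HPOS", "VPOS", "WIDTH", "HEIGHT")
            ))
    return boxes


boxes = word_boxes(SAMPLE_ALTO, "helsingfors")
```

Each returned box can be drawn as a coloured rectangle over the page image, after scaling from the ALTO measurement unit to screen pixels.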

At present the National Centre for Preservation and Digitisation is responsible for the digital working environment and for the accessibility and preservation of the data. We have 94 terabytes of hard disk storage and over 300 terabytes of LTO tapes. In the future we will see nationally integrated systems for electronic access and long-term preservation in Finland. These will naturally influence the handling of the newspapers as well.

3.2.3. Capture of Metadata

The capture of the metadata is not, as we all know, limited to the bibliographical records collected according to international standards and guidelines. Here I will not go into detail, but it is important to collect the metadata for mass digitisation throughout the whole process:
• unique object identifiers
• administrative metadata
• descriptive metadata
• structural metadata
• preservation metadata
• rights management metadata

In mass digitisation we have established the metadata to be collected in accordance with standards and guidelines in the field. Interoperable search functions for museums, archives and libraries are based on the use of agreed metadata. As guidance and standards are developing, this is a field in which the future has much to give.

4 Future

Widening access to newspapers is now possible. The development in our processes has been a cornerstone for enhancing mass digitisation for other collections like journals, books and sound. Legal support enhances access. A new role for the Centre for Preservation and Digitisation is under way. In the near future the Centre will be a digitisation centre for memory organisations: libraries, archives and museums. The Ministry of Education has granted us development funding for two years, for instance for the logistics of the material. Newspapers are the first cornerstone of our activity, and we will enhance access to them with this funding and in the years to come.

DIGITAL INGEST OF CURRENT NEWSPAPERS BY THE BIBLIOTHÈQUE NATIONALE DE FRANCE: The Situation End 2007/Beginning 2008

Else Delaunay
IFLA Newspapers Section

Abstract This paper provides an overview of the situation at the National Library of France (BNF), which is also the national legal deposit library. The author tries to answer some of the questions linked to the legal deposit of digital newspaper files and to the long-term archiving of those files, drawing on the experimentation going on at the National Library. How can the legal deposit of the DTP layout files produced for printing be negotiated with newspaper companies? Which contents of each newspaper issue are included in the digital format compared with the paper format? What about copyright? How far has the experimental development of very long-term archiving progressed, and what are the necessary requirements of digitisation and control for such archiving? Guidelines and preservation standards are still unstable, while newer versions of technical developments steadily turn up. Ongoing discussions within the BNF deal with all these questions.

1 INTRODUCTION

In France, the legal deposit of digitised newspaper files currently concerns a few regional dailies with local editions, and the deposit of the electronic files produced prior to the printing of the issues in paper format. Generally, not all the contents of an issue are included in the digital format of the online newspaper offered by the publisher. The image of a whole page of the issue may not be complete. Further preservation of digitised issues may therefore be compromised, all the more so because the quality of digitisation may not be high enough. Files digitised by newspaper companies do not necessarily meet the quality requirements for long-term archiving, and standards for such archiving are still unstable. On the other hand, the electronic files used for printing the newspaper cover all the contents of each issue, including advertisements. The French law on Legal Deposit¹ stipulates in its article 7 of 13th June 2006 that “by derogation of the first paragraph the Bibliothèque nationale de France (BNF) may require the deposit of the digitised file serving as a substitute for the deposit of the printed, graphic or photographic item. The conditions of such a deposit should be defined in agreement with the depositors.”

¹ Article 7 of Order 2006-696 of 13th June 2006, which modifies Order 93-1429 of 31st December 1993 concerning the law on legal deposit.


Delaunay: Digital Ingest of Current Newspapers by the BNF

Such files can be deposited without any special permission from the publisher but the law does not stipulate that it is obligatory to deposit them. Access is possible only for the users of the Legal Deposit Library which also may be in charge of the long-term archiving of the electronic files.

2 STATE OF THINGS

In 2004, the BNF's Legal Deposit Office wanted to have the possibility of substituting the deposit of a printed item with the deposit of its digital version. The Law was subsequently modified. The procedure of digital deposit should be applicable to all kinds of items for which collecting and preservation are problematic: large size posters, standards, advertising newsprint, etc. The BNF decided to select the regional daily newspapers (PQR = presse quotidienne régionale) in order to experiment with such a deposit. Several problems are linked to regional daily newsprint:
• Difficulty in controlling the collection of a huge number of items: more than 400 issues per day of some 60 newspaper titles
• Preservation of such a bulky item printed on a fragile medium (low quality paper and ink). Until now microfilming and shared preservation with the main regional libraries (BDLI = Bibliothèques du dépôt légal imprimeur) have been the solution, but funding and means are insufficient both at the BNF and in the regional libraries
• Stakes of legal deposit of electronic files of regional dailies:
  o Better coverage of all editions
  o Better preservation of the items
  o Strengthening of the network (access allowed in the regional libraries (BDLI))
  o Reduction of costs

However, there are also some constraints:
• The deposit is based on the depositor's support: it is impossible to dictate strong technical constraints
• The digital version must be strictly identical to the printed version. It must be possible to print the newspaper at scale 1
• The digital version must reflect what has in fact been printed and circulated: there may be changes in the text during printing (e.g. sports results) or changes in the make-up (e.g. merging of two editions). Such changes are made in the printer's workshop and are not preserved electronically
• Long-term durability of the delivered format
• Volume of digital data: vast storage space to be estimated
• Difficulty in setting up a generic deposit

Two solutions are considered in order to remedy this last difficulty:
• Integrating the deposit directly in the publisher’s management software. However, only a few publishers in the market have such a capability. This was tried with La Dépêche du Midi and its software publisher Protec, who were ready to develop a specific module, but one more constraint remained: it would not be possible to separate out the pages that are identical from one edition to another, so storage costs would be too high
• Retrieving the files submitted to the national Trade-Union of regional daily newspapers (SPQR = Syndicat de la presse quotidienne régionale). The Trade-Union has set up a database into which each newspaper enters pdf versions of all its pages every day (mainly in order to prove to the advertisers that their advertising has been published). This gives a single flow covering some 30 titles. But a problem remains: editions cannot be reliably reconstructed because the metadata are not good enough. There is an agreement in principle with the BNF.

The production line process
1) Daily or weekly delivery, on an FTP server or on CD/DVD, of pdf files corresponding to the pages of the newspaper, together with the flat plan (which allows editions and pages to be restored in the right order)
2) Delivery control: files well received or not
3) Integrity control: pdf format, flat plan, coherence between names of files and flat plan
4) Completion control of the form: title strip, column strips, page numbers
5) Manual completion control: articles, photos, etc. All should be present
6) Automatic registration in Millenium (BNF’s electronic registration of items received by legal deposit)
Setting up of UC3 (preservation unit no. 3) for viewing by users in the BNF’s reading rooms or in the BDLI. Archiving in the digital long-term storage space.

The BNF receives 60 regional dailies through legal deposit, of which 40 include several local editions. The library keeps the main edition in paper format of 20 dailies as well as all editions on microfilm. As to the other 20 dailies, the library keeps only the main edition in paper format, the local editions (paper and microfilm) now being taken over by the regional legal deposit libraries (BDLI). Long-term preservation of this huge amount of items is crucial. Because the amended law on Legal Deposit stipulates that a single copy in paper format shall be preserved by shared preservation, the survival of such items may be at risk. The question of access to these vast holdings is also essential.
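Steps 2 and 3 of the production line (delivery control and integrity control) lend themselves to automation. The following is a minimal sketch and not the BNF's actual tooling: it assumes a hypothetical flat plan represented as a list of (edition, page number, file name) entries, and the function names `check_delivery` and `pages_in_order` are invented for illustration. It checks that every file named in the flat plan was delivered, that no unexpected file arrived, and that each delivered file begins with the PDF signature.

```python
from typing import Dict, List, Tuple

# Hypothetical flat-plan entry: (edition, page number, file name).
FlatPlan = List[Tuple[str, int, str]]


def check_delivery(flat_plan: FlatPlan, delivered: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Compare a delivery against its flat plan.

    `delivered` maps file names to file contents (an assumed layout).
    Returns the names of missing, unexpected, and non-PDF files.
    """
    expected = {name for _, _, name in flat_plan}
    received = set(delivered)
    return {
        "missing": sorted(expected - received),
        "unexpected": sorted(received - expected),
        # A PDF file begins with the signature "%PDF-".
        "not_pdf": sorted(n for n in received if not delivered[n].startswith(b"%PDF-")),
    }


def pages_in_order(flat_plan: FlatPlan, edition: str) -> List[str]:
    """Restore the page order of one edition from the flat plan."""
    pages = [(page, name) for ed, page, name in flat_plan if ed == edition]
    return [name for _, name in sorted(pages)]
```

A delivery would pass the two controls only if all three lists in the report are empty; the completion controls of steps 4 and 5 concern page content and are not covered by this sketch.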

1. Legal deposit of digitised newspaper files
• Legal aspects:
1) digitisation of older newspapers must meet legal provisions on copyright (limit of 70 years after the authors’ death or after publication of a joint work). There is no free online access to issues published after 1937
2) for the digitisation of current newspapers, it is necessary, according to the Law, to negotiate with the depositors to solve various problems such as quality requirements, copyright, access, etc. Negotiation seems to be the best way to arrange legal deposit of these files, even though the law already makes such deposit possible. It may be necessary to mask certain passages or illustrations if the author or the illustrator will not permit the online display of their works. Access is for BNF users only.
• Compensation for legal deposit of digitised files of current newspapers:
- The legal deposit library (the BNF) may offer to take over the long-term digital archiving so as to perpetuate the digital information. However, this requires high quality digitisation (number of pixels, high contrast, etc.), which commercial digitising does not always deliver. This point must therefore be included and specified in the agreement with the company.
- Experiments with legal deposit of digitised files have been carried out with 2 newspaper companies: Le Populaire du Centre (until February 2007) and Ouest France (still running).

2. Legal deposit of digitised files of current newspapers in France

1) The case of Ouest France, the leading daily newspaper in France in terms of print run (with 42 local editions and 9 Sunday editions): This newspaper company produces microfilms and digital files using a “hybrid” solution: from born digital data to microfilm2. Unlike most newspapers, Ouest France has always taken a great interest in its own preservation. It is the only French newspaper that has itself microfilmed, and is still microfilming, all its issues since 1899, when its predecessor Ouest Éclair was founded (around 9 million pages in all). In 2002, it decided to microfilm directly the DTP layout files produced for printing: instead of two captures and two processings, there is one continuous graphic chain (from the publisher’s site to the provider’s site, in this case ACRPP, Association for Conservation and Photographic Reproduction of Newsprint)3. The agreement between Ouest France and the BNF in September 2005, as part of the Digitisation Programme of Regional Daily Newspapers, includes 3 directions of cooperation:
a) Digitisation of retrospective files on microfilm (Ouest Éclair, 1899-1944 + Ouest France, 1945-2003) in image mode + OCR, with the BNF in charge and considering only the main edition. Long-term preservation on microfilm by the newspaper company, all the microfilming being funded by the company.

______________ 2 see Chapman, S., P. Conway & A. R. Kenney. Digital Imaging and Preservation Microfilm: The Future of a hybrid approach for the Preservation of Brittle Books. Washington, D.C.: CLIR, 1999 http://www.clir.org/pubs/archives/hybridintro.html 3 see www.acrpp.fr

b) Purchase of current microfilms (all editions) by the BNF, accessible in the Public Library Area.
c) Experimentation with legal deposit of electronic files at the BNF: the files are received on DVDs made from the files prior to printing. Long-term archiving is the responsibility of the BNF, as a kind of compensation for the deposit of the electronic files. According to the Agreement the experimentation will continue until 31st December 2008.

The last point means that the legal deposit of the digitised files will include the 42 daily editions (local and other editions) of Ouest France as well as the 9 Sunday editions, in other words 240 000 pages per year. These editions have been, and will continue to be, regularly acquired on microfilm for preservation purposes, the main edition in paper format also being preserved by the library. The BNF will assume the very long-term archiving of the digital files. But here is the real challenge: how should files be structured to allow for new developments? How can such new technical developments be provided for? Could there be a hybrid approach that ensures free access through the electronic files produced for printing and long-term preservation through microfilming of the DTP layouts in a single continuous chain? The experimental schedule will continue in 2008.

2) The Legal Deposit Office at the BNF has also started another experiment: the regional daily Le Populaire du Centre (Limoges) deposited its digitised files until February 2007. The deposit probably stopped because the newspaper did not get anything in return. The BNF is not yet able to give access to these files as the interface is not ready. In fact the library does not know how to handle this problem at the moment: staff and sufficient funding are not available. However, it is suggested that such experiments should be extended to other newspapers. Online legal deposit may result in agreements with many other regional papers with local editions. The National Trade-Union of the Regional Dailies (SPQR) was contacted with a view to extending this kind of legal deposit of digitised files to all its member newspapers, in order to standardize procedures, formats and specifications.

3) The recent agreement with La Dépêche du Midi will also consider the possibility of legal deposit of its digital files of current issues. But there is some hesitation.
4) Another experiment with the regional dailies L’Union (Reims) and La Montagne (Clermont-Ferrand) did not work for various reasons: the DSI, the IT Division of the BNF, noted that each newspaper uses a different electronic format. If the digitised files of several, if not all, regional dailies with local editions were deposited and used as received by legal deposit (flat plan), a great number of channels would result. Therefore the DSI considers that the situation can progress only if the National Trade-Union of Regional Daily Newspapers decides to carry out its project of setting up a common portal. With such a portal shared by 40 regional dailies, a single format would be available and would thus facilitate the online deposit of these files.

The present situation
The Legal Deposit Office and the Trade-Union of the Regional Dailies insist on their mutual interest in promoting legal deposit of digitised files, but this position remains a statement of principle. The agreement with Ouest France proves the will to continue the experiment in legal deposit of its electronic files. This means that the BNF needs to determine an official position regarding that form of deposit. Such a position could mobilize the internal teams and would send a strong signal to our partners. A timetable of the steps to follow must be set and applied strictly: re-contacting the partners (Ouest France, Le Populaire du Centre, SPQR), having each partner set out its specifications, and setting deadlines for IT developments.

Access: Until now the BNF has not been able to handle the files that have been deposited. The interface for on-site online access by users is not ready yet. This is probably due to lack of funding for specific equipment at the moment.

3. Shared newspaper digitisation
Such projects concern older or current regional and local newspapers with several local editions. French digitisation projects may be shared between the newspaper company, the Regional Centre for Letters (CRL), the regional legal deposit library (generally the main public library) and the BNF. The following is an example: in the case of Ouest Éclair the CRL is in charge of digitising the local edition from Lower Normandy as well as providing free online access to issues from before 1937 (delay of 70 years). The newspaper company will assume the long-term digital archiving in two forms: one in non-compressed TIFF format with a control every 2 years; another in 300 dpi PDF format stored on disc with up to 90-95 % of the text according to BNF’s requirements.

The partners participate in the funding of the project in accordance with the contract.
• Digitisation from microfilms (provided by Ouest France) + OCR is the responsibility of the BNF (Rennes and Nantes editions) and of the CRL for Lower Normandy (Caen edition), as the project is limited to the 3 main editions of the newspaper:
o Caen (1912-1944)
o Nantes (1915-1944)
o Rennes (1899-1944)
• Access will be through Gallica 2 (BNF’s digital library4) for the Rennes and Nantes editions; access to the Caen edition is the responsibility of the CRL
• Stakes linked to the partnership around Ouest Éclair:
o limit duplicate copies
o maintain the homogeneity of technical specifications of the digitisation
o ensure the perenniality of digital data
o ensure the setting up of bridges

4. Other projects
Many other projects have been planned or are ongoing, such as:
• Europeana, launched in 2006 and online since March 2007. It does not include newspapers at present. Europeana is the French prototype of the European Digital Library, Museum and Archive5, to be launched in November 2008 with links to 2 million digital treasures.
• The Bibliothèque numérique francophone, the digital library of French-speaking countries, founded by 6 national libraries in 2006 (among them Library and Archives Canada, the BNF and the Royal Library of Belgium). It includes newspapers. The Bibliotheca Alexandrina has also joined the project. Last year, during the meeting in Paris of the Network of National Digital Libraries from all French-speaking countries (in the North as well as in the South), there was a proposal to build a large common portal of digitised newspaper files in French, with special regard to cooperative digitisation of newspapers. The project is well supported by the OIF (Organisation internationale de la francophonie). A presentation of the first results of the proposed common portal should take place during the Summit of the Network of National Digital Libraries in French in Montreal this autumn.
• The Digital Library of France, a common digital library including collections from all French libraries, as digitising is now going on in many libraries. It is only a pre-project at the moment, but it is very much wanted by libraries. Later, archives and museums should also enter their digitised collections. The Digital Library of France should include printed matter, periodicals, maps, musical matter, newspaper collections, etc. It could be “a rational structuring body of the national digital bids: these are difficult to locate and point out; multiple participants; dispersion of projects and often small projects; risk as to data perenniality.”

______________ 4 see http://gallica.bnf.fr 5 see www.europeana.eu

5. Long-term archiving of electronic files
Storage of huge amounts of digital files (more than 100 terabytes per year) and a great variety of formats.
While the archiving of manuscripts, printed items, etc. has always been considered important, it is now essential to store digital items in a safe and perennial way. It is therefore necessary to rely on a solid, high-performing foundation. After several months of investigation, the BNF launched in 2006 the SPAR system (Système de préservation et d’archivage réparti6), a true digital store and a shared preservation and archiving system. It has been designed according to international standards on the perenniality of digital data and is expected to be operational in 2009. SPAR not only provides safe storage of data but can also make several copies of digital items and continuously monitor the condition of the stored files, which makes it possible to create new copies before any definitive destruction.
Through precise and full identification of the formats of the ingested data, it is possible to guarantee permanent access, as the system will initiate the changes needed when the tools used to render electronic files become technologically obsolete. For instance, if the image format JPEG becomes obsolete, SPAR can convert the images concerned into a new, better performing format. However, such a guarantee requires permanent technological monitoring of formats, as well as prototyping and testing of tools. All this is specifically included in the design of SPAR. At any moment SPAR can also go back so as to restore the items in their original format. SPAR is not only a tool for the BNF. The library opens the system to other partners and institutions by offering a “third-party archiving” service for digital heritage. At present, it seems that the enormous volume of electronic files from the regional daily newspapers (with local editions) has not been sufficiently provided for within SPAR. A very large electronic storage space will have to be arranged for such newsprint.
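The continuous checking of stored files attributed to SPAR above is, in digital preservation practice, commonly done with a fixity check: a digest computed at ingest is compared with digests recomputed later on each stored copy. The paper does not describe SPAR's internals, so the sketch below is only an illustration under stated assumptions: SHA-256 is an assumed choice of digest, and the function names and the mapping of storage locations to file contents are hypothetical.

```python
import hashlib
from typing import Dict, List


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(recorded_digest: str, copies: Dict[str, bytes]) -> List[str]:
    """Return the storage locations whose copy no longer matches the
    digest recorded at ingest (hypothetical data layout)."""
    return sorted(
        location
        for location, data in copies.items()
        if sha256_digest(data) != recorded_digest
    )
```

A copy reported as damaged could then be replaced from one of the intact copies, which is why keeping several copies and checking them regularly go together.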

______________ 6 See http://vds.cnes.fr/pin/presentations/2007/Presentation_SPAR.pdf

6. New tools to facilitate research
Some new tools facilitate searching, such as the digitisation of:
- indexes, tables
- newspaper yearbooks
- newspaper bibliographies
These are also linked to the digitised newspaper collection.

3

CONCLUSION

As you will have noted from the earlier sections, the BNF is still at an experimental stage as regards the legal deposit of digitised newspaper files produced prior to the printing of current issues. At present the major points are:
- setting up of the Trade-Union’s common portal
- adoption of an official position of the BNF as to this new special kind of legal deposit
- development of the long-term preservation system to include the huge amount of electronic files of regional daily newspapers
- handling of internal online access for BNF’s users.

If some regional newspaper companies have the will to continue or start the deposit of their electronic files, the library, for its part, must be able to handle such a deposit, to give access to it and to ensure its long-term preservation through the SPAR system. The near future will be crucial. Technical developments will change the situation, so experimentation will go on.

COOPERATIVE EFFORTS IN PRESERVATION OF AND ACCESS TO THE WORLD’S NEWSPAPERS

James Simon
ICON Project Director, Center for Research Libraries

Abstract
This paper will discuss the worldwide cooperative efforts of the Center for Research Libraries (CRL), the International Coalition on Newspapers (ICON) and the Southeast Asia Microform Project (SEAM) to preserve global newspapers and provide both bibliographic and physical access to them. Also important in this discussion is strategic long-range planning for the continuation and sustainability of such programs. Cultural and language barriers are more easily overcome as these cooperative ventures continue, yet financial and technical barriers are certain to remain. These are the challenges to cooperative ventures going forward. Some possible solutions will be discussed, and an invitation for further dialog will be extended.

1

Long-Term Sustainability of Cooperative Efforts

The International Coalition on Newspapers (ICON), which began nearly ten years ago, originated as a community-driven collaborative effort with broad goals that are international in scope and long-term in nature. It has strongly benefited from the participation of its charter members, advisory committee, and participating institutions (in particular, the Library of Congress and the British Library). The Center for Research Libraries (CRL) has been a key supporter of the initiative, and continues to invest considerable time and effort in its pursuits. Since 1999, CRL has leveraged more than $1.5 million in grant funds and more than $750,000 in cost share and in-kind activities. However, as a long-term sustainable initiative, ICON requires a broader base of support and a diversification of activity to ensure it achieves its goals. ICON has recently issued a call for participation to major international institutions with a mandate to preserve and make accessible news content. Institutions are being asked to augment the financial support provided to ICON by CRL, its member libraries, the National Endowment for the Humanities, the Andrew W. Mellon Foundation, and other funders. As many of us have experienced, the cultivation of long-term collaborative initiatives is far more challenging than it appears on the surface. A look back over the history of cooperative collection management shows as many failures as successes. Several of the major collecting initiatives in the U.S. (the Farmington Plan, the RLG Conspectus) proved unsupportable over time. Shifting local priorities and competition among participating institutions make such broad-based and prescriptive tactics unworkable. However, other national plans, such as the U.S. Newspaper Program and the United Kingdom's NEWSPLAN, have shown remarkable resiliency over time. International cooperation has proven even more challenging to sustain. While some multi-institution efforts have continued successfully for decades (CRL's Global Resources Network programs and area studies microform projects, for example), other transnational programs have not fared so well, especially in times of budget retrenchment. What makes for success in international library cooperative efforts? This paper aims to look at some examples of successful programs, with a particular focus on cooperative efforts to preserve newspapers, and to suggest a model of cooperation for the ongoing preservation of and access to the world’s news content. The paper will pay particular attention to content from or relating to Southeast Asia as a case example of regional cooperation.

2

Center for Research Libraries and Foreign Newspaper Preservation

CRL is at its core an international cooperative effort. It owes its very existence to the ten US Midwestern universities which conceived of the idea of a new library to maintain their rarely-used yet vital primary research material. This idea became what was initially known as the Midwest Inter-Library Center in 1949. During the 1960s and the 1970s, research libraries throughout North America accepted the invitation to participate in an expanded CRL, and today CRL stands as a trusted and enduring framework for institutional cooperation and resource-sharing on an international level. CRL's initial collections were deposited by its participating institutions. A significant portion of these early receipts was foreign (non-U.S.) newspapers. These newspapers, by virtue of their being deposited at the Center, became the shared property of all the member institutions. The Center also began a program of cooperative acquisition, subscribing to forty major foreign newspapers on microfilm to meet the growing demands of scholars. In the meantime, other US institutions were investing in local preservation of U.S. and international newspapers on microfilm. Harvard University's program involved the sale of copies of film to subscribing institutions to fund the ongoing preservation of additional titles. None of these programs was extensive enough to provide the access needed to meet an ever-burgeoning demand for newspapers from all regions of the world. In 1946, the Librarian of Congress recommended to the Association of Research Libraries (ARL) that a national program be created to coordinate the microfilming of “extensive runs of library materials.”1 Out of this call, the Foreign Newspaper Microfilm Project (FNMP) was formed. Representative newspaper titles from some eighty countries were selected, among these being New Times of Burma (Rangoon), Journal d’Extrême-Orient (present-day Hồ Chí Minh City), Times of Indonesia (Jakarta), The Manila Times, The Straits Times (Singapore), and The Bangkok Post. Because of its mission of cooperation as well as its unique distributing infrastructure, the Center was chosen by the ARL to administer the FNMP in 1956. The program was devised to provide for the acquisition of some 100 foreign newspaper titles on microfilm, which would then be available by loan to participating institutions that paid an annual subscription fee. The program would also provide for original filming of titles where none existed; these materials were available for purchase by members (along the same model as Harvard's earlier program). By 1974, the FNMP had almost 200 foreign newspaper titles under its purview and eighty institutional subscribers.

______________ 1 John Y. Cole, “Developing a National Foreign Newspaper Microfilming Program”. Library Resources & Technical Services, Winter 1974, vol. 18, no. 1, p. 6.

The cooperation inherent in the planning of the FNMP allowed participating libraries access to newspaper titles without the added expense of filming titles individually when they were needed. It also provided libraries with a wide variety of readily accessible newspaper titles with the added assurance that the titles which were little used would be maintained by CRL through the program for an unlimited period of time.

3

Southeast Asia Microform Project (SEAM)

International collaborative networking was also the impetus behind the formation of the Southeast Asia Microform Project (SEAM) in 1969. Early that year, at a conference on Southeast Asia documentation held in Chicago, the basic outline was put in place for a cooperative network of librarians, scholars, and others to gather, preserve and make accessible material from Southeast Asia.2 Operating under the auspices of the Center for Research Libraries, in conjunction with the Committee on Research Materials on Southeast Asia (CORMOSEA), SEAM was established to provide subscribing institutions with better coverage of research materials related to the study of Southeast Asia through acquisition or original microfilming of rare or unique resources. Twenty-one North American libraries eagerly joined the project by the first meeting of SEAM in April, 1970.3 The program was modeled largely on its existing sister programs at CRL, the Cooperative Africana Microform Project (CAMP) and the South Asia Microform Project (SAMP). However, unlike these programs, which were formed primarily to serve their North American constituents, SEAM was intended as an international collaboration. Unfortunately, the logistics of sharing collections across large distances ultimately forced SEAM’s founders to move the network in the direction of partnerships external to Southeast Asia. SEAM’s early efforts included the acquisition of the Deli Courant (1885-1940), an important early colonial newspaper (filmed from the holdings at the Koninklijke Bibliotheek in the Netherlands). SEAM also commissioned original filming from the PRO to preserve various Sessional Papers (Borneo, Brunei, Kelantan, Malay States, Malacca, Singapore, and Trengganu) from the Colonial Office records. A third item was the preservation of the Burma Gazette (1875-1927), the official publication of colonial Burma. This major undertaking took several years to accomplish and filled more than 300 reels of film.4 As the program matured, broader collaborative efforts emerged through SEAM. In 1983, Alan Feinstein, a doctoral student at the University of Michigan, proposed microfilming of early Javanese newspapers and periodicals held in the Museum Pusat in Jakarta. The museum was considered a "rich and virtually untouched treasure trove for Javanese court literature," and the proposed materials, including the newspapers Bramartani and Jurumartani, represented the first vernacular newspapers in Indonesia. In cooperation with the National Library, SEAM successfully arranged for filming of dozens of titles with onsite assistance by Feinstein. Later, with the cooperation and support of the Ford Foundation and several Australian universities, other rare documents and manuscripts from the 18th, 19th, and early 20th centuries were filmed, with SEAM as the U.S. depository. SEAM also received grant funding of $180,000 in late 1993 from the Henry Luce Foundation to establish preservation microfilming facilities at the National Library of Viet Nam. The objective of the proposal was to film early newspapers published in the Romanized vernacular quoc ngu script, with a focus on those titles not held in the Bibliothèque nationale de France. From the late 1990s to the present, SEAM has continued its course of identifying materials in need of preservation, both within collections in the U.S. and in the region. SEAM has also collaborated to support preservation of major archives such as the Documentation Center of Cambodia's collection of Khmer Rouge documents. SEAM has increasingly focused on contemporary materials, such as human rights documentation, election returns, and political ephemera. The project has also begun to focus attention on the creation of digital resources, particularly for materials that prove easier to use in electronic format (SEAM has, for example, sponsored the encoding of Philippine election returns at the Institute for Public Policy in Manila).

______________ 2 CORMOSEA Bulletin, Vol. 2, No. 4. 1969. http://www.cormosea.org/bulletin/cormosea-02-04.pdf. (Accessed 8/25/05). The meeting also formalized the existence of the Committee on Research Materials on Southeast Asia (CORMOSEA), formed as a successor to two previous committees. 3 CORMOSEA Bulletin, Vol. 3, No. 4. 1970. http://www.cormosea.org/bulletin/cormosea-03-04.pdf. (Accessed 8/25/05). 4 Simon, James T. “Southeast Asia Microform Project: 35 Years of International Collaboration”, FOCUS Newsletter, Fall, 2005, vol. 25, no. 1. http://www.crl.edu/focus/05FallSEAM.asp?issID=1 (accessed 2/4/08)

4

The International Coalition On Newspapers (ICON)

The increasing attention to the need for preservation of US domestic newspapers shifted priorities away from foreign titles. The United States Newspaper Program, which began in 1982, was a comprehensive and highly successful effort to identify, gather and preserve the newspapers of all fifty U.S. states and the District of Columbia. While CRL was an early participant in the USNP (in fact, CRL helped establish the procedures and rules that now govern newspaper cataloging), the Center through the FNMP continued to advocate for foreign newspaper preservation in the U.S. In 1987, the National Endowment for the Humanities and other organizations sponsored the “First International Symposium on Newspaper Preservation and Access.” The purpose of this gathering was to bring together scholars, librarians and other information professionals to draw on their expertise and to discuss the challenges of newspaper preservation and access within a global context. Ten years later, a follow-on “Symposium on Access to and Preservation of Global Newspapers” was convened by CRL in collaboration with ARL, the Council on Library and Information Resources, and the Library of Congress. Participants were asked to design a course of action that would guarantee acquisition of and access to international newspapers. The culmination of the participants' assessment was a recommendation to form a permanent body to monitor and coordinate an international effort of newspaper acquisition and preservation.5 In 1999, the International Coalition on Newspapers (ICON) was officially established and a steering committee was formed from among the charter participating institutions, including the Library of Congress, the British Library, Library and Archives Canada, the New York Public Library, the University of Illinois, and the University of Washington.

______________
5 For a history of the ICON project, including these reports establishing the project, see the ICON website at: http://icon.crl.edu/history.htm.

The continuing mission of ICON is to increase the availability of international newspaper collections by improving both physical access and bibliographic access. Among ICON’s original and continuing goals are: to amass information on the preservation status of the world’s newspapers; to provide access to an unprecedented array of newspaper holdings; to preserve global, cultural and intellectual resources; and to provide an ongoing forum for discussion of issues related to global newspapers. ICON has honored its commitment to preserve and improve access to international newspapers by the pursuit of several core activities:

● Creation of the ICON Database of International Newspapers: The ICON Database is a freely accessible electronic resource intended to provide reliable information about newspapers published outside of the United States. It includes a bibliographic description of titles as well as specific information on institutions’ holdings of the same. The database serves as a central locus for information about international newspaper collections available in North American libraries and in selected libraries outside North America, providing a tool for resource discovery, access, and collection management. The database is available from ICON’s home page.
● Preservation Microfilming of Important and Underrepresented Titles: ICON’s work involves the preservation of newspapers that exist only in hard copy. ICON has preserved whole runs or significant parts of more than 50 titles from around the globe, including titles such as Národní Listy (Prague’s leading title of the day, for the years 1918-1931) and Eritrean Daily News (published in occupied Asmara during the Second World War). Currently, ICON is filming nine historically important titles from the South American continent.
● Dissemination and Coordination of Substantive Information about International Newspapers: The ICON Web site contains the ICON database as well as a variety of resources, including a “clearinghouse” of international standards for newspaper preservation and bibliographic access. ICON is mounting information relating to project reports and presentations; links to newspaper informational sites; news and developments of current preservation projects; reference resources; and digitized guides. It is ICON’s goal that the Web site will become a vital and comprehensive stopping place for gathering data about accessioning and preserving newspapers.
● Coordinated Cataloging of Partner Institutions’ International Newspaper Holdings: Increasing access to international newspapers depends on the creation and sharing of bibliographic information in local and national catalogs or databases. Bibliographic control of foreign newspapers is a resource-intensive activity that requires language or area specialization in addition to high-quality cataloging skills. ICON’s role as an advocate and support structure for original cataloging of international newspapers was an early goal of the project’s visionaries. To date, ICON has created over 600 original cataloging records of previously uncataloged holdings through its partnerships.

During its history, ICON has received generous support from the National Endowment for the Humanities (NEH). In doing so, the NEH has signaled that the preservation of foreign newspapers and bibliographic access to them is vitally important for scholars and researchers today and in the future.


5 The State of Southeast Asian Newspaper Preservation in US

George Schwegman’s 1948 publication “Newspapers on Microfilm,” published by ARL, listed only nine Southeast Asian newspaper titles preserved on microfilm in the U.S. at that time. These included three titles from the Philippines, one title, the Straits Budget, from Singapore, and three titles from Thailand. With the exception of the Manila Times, filmed by the Library of Congress starting with 1898, all of the Southeast Asian titles were recently filmed issues beginning in the late thirties and early forties. In addition, all of the titles on film were in English. With the addition of several more (still western-language) newspapers through the Foreign Newspaper Microform Project in 1956, Southeast Asian titles were slowly becoming more visible in the US. Early cooperation among universities (particularly Cornell and Yale University) gave rise to microfilm projects producing copies of newspapers, theses, and out-of-print materials. Other institutions such as Ohio University, University of Washington, and University of Wisconsin coordinated filming of specific newspapers from the region.6 The University of Michigan and the University of Hawaii have also contributed filming of their newspaper holdings over time. The Library of Congress (LC) has been a strong supporter of cooperative efforts to microfilm foreign newspapers. LC has been heavily involved in the Foreign Newspaper Microform Project, SEAM, and ICON since their respective foundings. LC continues to be the largest not-for-profit producer of newspapers in microform in the US. In 1963, the Library of Congress opened its field office in Jakarta, Indonesia, to acquire important materials for the library, but also to serve as a cooperative acquisitions service for other libraries that want to acquire publications from countries and regions covered by the office.
The Jakarta office instituted microfilming activities in 1978.7 Today the Library of Congress films nearly 60 current newspapers and periodicals from all countries in the region, either filmed in the Jakarta offices or sent to New Delhi for processing. The office also coordinates digital and microform preservation with Indonesian institutions and other foreign CAPSEA participants in Jakarta, and provides training and quality control services. US efforts to microfilm newspapers have concentrated mainly on current receipts (in early years, microform was used as a distribution mechanism and as a means for saving space, with preservation a secondary benefit). This reflects the depth of holdings and the general strength of collecting of Southeast Asian materials in U.S. libraries, with the spike in interest in the region leading up to and following the Second World War. While institutions hold some old and rare content in their collections, retrospective preservation has been taken up only by a handful of institutions. SEAM, for example, has provided substantial funding to Cornell University to support preservation of its extensive newspaper backfiles (to date, SEAM has supported the filming of nearly 175 titles in long or short runs for the period 1950-1990, including a long run of the Vietnam Press).

______________
6 Riedy, Allen, "Southeast Asia Newspaper Collecting in the United States," 65th IFLA Council and General Conference, Bangkok, Thailand, August 20 - August 28, 1999. http://www.ifla.org/IV/ifla65/papers/136-140e.htm (accessed 2/22/08).
7 Mitchell, Carol, and Armstrong, James. "Understanding the World: The Library's Operations Overseas." Library of Congress Information Bulletin, May 2005. http://www.loc.gov/loc/lcib/0505/overseas.html (accessed 2/22/08).


A survey of newspaper holdings in OCLC's WorldCat database provides interesting information on the makeup of Southeast Asian newspaper collections in North America.

Newspaper holdings listed in OCLC

Country        Records   Unique Titles   Combined Holding Statements   % titles with MF holdings
Burma               67              35                           137                         70%
Brunei              12               4                            18                         75%
Cambodia           101              76                           138                         58%
Indonesia          555             398                         1,065                         83%
Laos                30              21                            65                         81%
Malaysia           110              61                           134                         75%
Philippines        470             272                         1,200                         87%
Singapore          121              77                           240                         77%
Thailand           127              72                           193                         74%
Timor Leste          4               4                             7                        100%
Vietnam            343             263                           802                         87%
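The percentage column of the table can be condensed into a single summary figure. As a minimal sketch (the variable names are illustrative, and treating every country equally in an unweighted mean is an assumption, not a method stated in the survey), the following Python snippet averages the microfilm percentages listed above:

```python
# Share of titles with microfilm (MF) holdings, per country,
# transcribed from the OCLC survey table.
mf_share = {
    "Burma": 70, "Brunei": 75, "Cambodia": 58, "Indonesia": 83,
    "Laos": 81, "Malaysia": 75, "Philippines": 87, "Singapore": 77,
    "Thailand": 74, "Timor Leste": 100, "Vietnam": 87,
}

# Unweighted mean across the eleven countries (assumption: each
# country counts equally, regardless of how many titles it has).
average_share = sum(mf_share.values()) / len(mf_share)
print(f"Average share with MF holdings: {average_share:.0f}%")  # prints 79%
```

A mean weighted by the number of titles per country could differ, since a country with four titles counts here as much as one with several hundred.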

Newspaper holdings tend to conform to general U.S. collecting strengths in Southeast Asia. Indonesia, the Philippines, and Vietnam top the list of country strengths, with Singapore, Thailand, and Cambodia a distant second. The dates of holdings also conform to general subject strengths, with the bulk of holdings in the 1940s-1960s. Some holdings for Malaysia and Singapore extend back to the mid-19th century, and Vietnam to the 1930s as a result of cooperative filming (see below). The greatest concentration of titles is in English, with strongest holdings from the Philippines, Singapore, and Burma. Relatively few English holdings come from Indonesia and Vietnam. Titles also come mainly from the capital cities, rather than from the provinces. The 1966 union list of “Philippine Newspapers in Selected American Libraries”8 declared that provincial holdings were entirely inadequate, with virtually no current subscriptions to these more remote publications. An assessment of holdings from the Philippines shows only modest gains in acquisitions from other areas. The majority of held titles continue to be from Manila. A look at the preservation status of the newspapers held in U.S. institutions shows a very strong representation of preservation microfilming. The average percentage of titles held with at least partial preservation (or holdings exclusively in film) is 79%. This is not to be confused with comprehensive film holdings of every title. From experience, a great number of titles are held in scattered runs in the US, and the film often reflects these gaps. The details of holdings within OCLC records are notoriously inconsistent, but a casual survey of records indicates that holdings in film often do not extend back to the beginning of a title (reflecting perhaps the time at which titles came to libraries' attention in the US), and in many cases do not extend fully to the present.
Absent issue-level holdings information in many catalogs, it is possible only to generalize on this point. However, it bears remembering that summary holding statements in bibliographic records often overlook small, or sometimes sizable, gaps. Where microfilming capacity exists in country, one finds that there has been at least de facto cooperation in filming to reduce duplication.

______________
8 Saito, Shiro, “Philippine Newspapers in Selected American Libraries: a Union List”. East-West Center Library, 1966.

• In the cases of Burma, Laos, and Timor many or most titles on film were produced by US institutions (including LC, Cornell, CRL, Yale, Berkeley, and Washington). For Thailand, limited holdings were acquired from the National Library and Archives.

• In Cambodia, film holdings from the US predominate, but there exists additional filming from ACRPP and the National Archive of Cambodia (particularly titles from the 1990s). For Vietnam, ACRPP played a similarly important role, with the National Library recently filming a wide variety of papers not held elsewhere. This was accomplished in part by cooperative funding from SEAM through a grant from the Luce Foundation.

• In the Philippines, the Ateneo de Manila University has preserved many important resources, reflected in significant holdings in the US. For Malaysia, much film was completed by the National Library of Singapore, with the National Library of Malaysia, National Archives, and University of Malaya Library also playing significant roles. In Indonesia, film holdings are spread across a great number of players, with LC, Cornell and SEAM topping the list, but with appreciable holdings from the Perpustakaan Nasional (National Library) and Arsip Nasional (National Archives). The Library of Congress office in Jakarta, as mentioned above, films a number of current dailies for preservation purposes.

• For Singapore, the bulk of film was acquired from the excellent filming operation of the National Library of Singapore.

The extent of filming of contemporary titles (1990's- ) is considerably less than that of historical titles. This is predictable, given the relatively scarce resources, combined with attention to preservation priorities and the availability of many titles in electronic format through the producer's Web site and/or a commercial aggregator such as LexisNexis. Still, the lack of print and film in our collections may well reveal itself as a collection gap in the future, should the electronic versions no longer be available and print holdings not be retained long term. Without a comprehensive collection and preservation plan for these electronic resources, we are exposing our institutions to great risk.

6 Digitization of News in the US: A Snapshot

The digitization of newspaper backfiles is the challenge of our times. While mass-production digitization facilities have taken hold at numerous institutions globally, these have focused primarily on monographs, with newspapers remaining a distant goal. It is left (for the moment) to our own institutions to consider how to make this content available in the ways to which our scholars are becoming accustomed. As many have amply demonstrated, the challenges are not small, and the scale of the effort is daunting. The availability of digitized versions of news resources is still somewhat spotty to date, with very few complete runs or major collections being made available.


Many institutions are, it seems, still in a "pilot" stage when it comes to making their material accessible online. In the US, most attention has been paid to the country's own historical content. Following on the United States Newspaper Program, the Library of Congress and the National Endowment for the Humanities have embarked on the National Digital Newspaper Program (NDNP). While this has started small, the first results have been encouraging and demonstrate the powerful possibilities of aggregating these widely distributed resources.9 Also on the national level, a number of major vendors such as ProQuest and Gale have expanded the availability of US newspapers. ProQuest Historic Newspapers has digitized backfiles of several major “papers of record,” such as the New York Times, Wall Street Journal, The Los Angeles Times, Chicago Tribune, the Washington Post, The Christian Science Monitor, The Boston Globe, The Atlanta Constitution, and The Hartford Courant. ProQuest has recently announced the digitization of the UK’s Guardian and Observer newspapers to add to the collection. ProQuest has also added components covering Black newspapers and newspapers from the Civil War era. Gale has launched several major projects combining collections of newspapers. Content from the U.S. includes their new and large 19th Century US Newspapers collection. Gale has also digitized numerous UK newspapers, featured in the Times Digital Archive (1785-1985) as well as the 17th and 18th Century Burney Collection Newspapers and the new 19th Century British Library Newspapers. Not to be outdone, Readex (NewsBank) has launched the full-text America's Historical Newspapers project, which incorporates the various phases of their Early American Newspapers Series 1-7, covering the years 1690-1922, and Hispanic American Newspapers, 1812-1980. Civil War-era newspapers are also covered by such companies as Accessible Archives, Alexander Street Press, and NewsBank.
There is considerable overlap of content among many of these, as well as with other educational or non-profit efforts such as those at the University of Richmond, Pennsylvania State University, and others. From the education sector, there has been significant effort as well, though relatively small in comparison to commercial offerings. Major initiatives include the California Digital Newspaper Collection,10 Florida Digital Newspaper Library,11 Utah Digital Newspapers,12 and other institutions that have been recipients of NDNP funding. Smaller programs have focused on digitization of more limited periods of material, with varying methods of access. A number of small community libraries have taken the initiative to seek donor or grant funds to digitize local collections. Every week reveals another small collection of titles from smaller cities and counties around the United States. Some examples include the Boca Raton Archives (Florida),13 Quincy Historical Newspaper Collection (Illinois),14 the Barnstable Patriot archive (Massachusetts),15 Winona Newspaper Project (Minnesota),16 and the Northern New York historical papers.17 Additional efforts have been made on the part of numerous universities and colleges to make their campus publications available. However, content from foreign news sources has remained largely untended in the United States. Some exceptions include the Digital Library of the Caribbean,18 sponsored by the University of Florida and regional and international partners; the Protesta Humana19 collection at the University of California, Los Angeles, featuring rare examples of anarchist, socialist, and communist newspapers published in Buenos Aires during the late nineteenth century; and several examples of US ethnic press, including Dziennik Zwiazkowy (1908-1917), a Polish-language paper still published today and digitized by CRL.

______________
9 See “Chronicling America: Historic American Newspapers”. http://www.loc.gov/chroniclingamerica/ (accessed 3/15/08).
10 California Digital Newspaper Collection, http://cdnc.ucr.edu/ (accessed 3/15/08).
11 Florida Digital Newspaper Library, http://www.uflib.ufl.edu/UFDC/UFDC.aspx?c=fdnl1 (accessed 3/15/08).
12 Utah Digital Newspapers, http://www.digitalnewspapers.org/ (accessed 3/15/08).

7 Cooperative Model, 2008-

Just as cooperation has existed for decades in the microfilming of newspapers, so too should it exist in the digitization of news content. As noted in the 2005 paper "The Importance of 'Yesterday’s News,'"20 numerous reports produced by newspaper digitization efforts have stressed the importance of national and international cooperation among libraries, educational institutions, technology providers, standards organizations, and others. This need is becoming more acute as the number of efforts worldwide expands, in scales both large and small. As with so many other aspects of library collection management, the digitization of the world's news far outstrips the capacity of any one institution alone. At the IFLA Newspaper Section meeting in Salt Lake City (2006), CRL proposed a broad effort for the cooperative digitization of the world's newspapers.21 The proposed initiative brought together nine major repositories of newspapers to combine holdings and leverage appropriate terms with publishing organizations. While focusing initially on newspapers from a single region (Latin America and the Caribbean), the effort assumes that the arrangement with the partner organization(s) will be ongoing and eventually encompass newspapers and news-related materials from other world regions. The affiliated institutions have been investigating the feasibility of the World News Archive with several potential partners over the past several months. As of this presentation, an agreement has not been concluded for this partnership. Not surprisingly, such an effort has its share of challenges, not the least being the costs of mounting such a large-scale effort and equitable returns for the subsidizing partners. Nevertheless, CRL remains optimistic that a model will emerge that will carry this effort forward.

______________
13 Boca Raton Historical Society, http://www.bocahistory.org/boca_history/br_history_newpaper.asp (accessed 3/15/08).
14 Quincy Historical Newspaper Archive, http://quincylibrary.org/library_resources/NewspaperArchive.asp (accessed 3/15/08).
15 Barnstable Patriot Newspaper Archive, http://www.sturgislibrary.org/collections/barnstable-patriot (accessed 3/15/08).
16 Winona Newspaper Project, http://www.winona.edu/library/databases/winonanewspaperproject.htm (accessed 3/5/08).
17 Northern New York Historical Newspapers, http://news.nnyln.org/ (accessed 3/15/08).
18 Digital Library of the Caribbean, http://www.dloc.com/ (accessed 3/15/08).
19 UCLA Digital Collections: Latin American Newspapers, http://digital.library.ucla.edu/newspaper/ (accessed 3/15/08).
20 Jones, Allison, "The importance of 'Yesterday's News': Opportunities & Challenges in Newspaper Digitization." Richmond Daily Dispatch digitization projects, n.d. (2005). http://dlxs.richmond.edu/d/ddr/docs/papers/yesterdaysnews.pdf (accessed 3/15/08).
21 Simon, James T., "Cooperative Digitization and Dissemination of World Newspapers: a Proposal," IFLA 2006 International Newspapers Conference, May 17, 2006.

A number of other useful models have begun to emerge on regional, national, or international scales. The NDNP is providing critical services to US repositories by providing startup funds for state efforts, setting minimum acceptable standards, and offering a centralized portal for content while encouraging interoperability. The TIDEN project in Finland coordinated newspaper digitization for several countries for the eventual creation of a Nordic Digital Newspaper Library. These efforts aside, the ongoing needs of projects and efforts call for additional cooperation on a broad scale. We assert that ICON may be the appropriate locus for these cooperative efforts, and can offer a home for some or all of the following activities:

• Information network: One of the most crucial needs for emerging projects is access to reliable information on the establishment, production, and persistence of news repositories. Much information is available, but it is scattered and therefore difficult to access; a central portal of information on processes, costs, technical specifications, and archival models would be extremely helpful for existing and new efforts. Moreover, a personal network of individuals and institutions familiar with these practices would greatly aid others interested in learning from existing efforts.

• Project / title inventory: A number of studies have called for a central inventory of efforts and titles underway. While this is very much a moving target, ICON has several elements in place, including its registry of retrospective newspaper digitization efforts (http://icon.crl.edu/digitization.htm) and its title-level newspaper database. ICON hopes to integrate data from disparate sources to provide title-level information on holdings in print, microfilm, digital, and "last-copy" preservation status. Contributions from a network of partners are the only way of ensuring this information is comprehensive and current.

• Project evaluation: Not only is it important to know what has been digitized, it is also valuable for information professionals to receive unbiased assessments of the scope, utility, and overall value of commercial and non-commercial products. Peer assessments of products will improve the field overall and enhance the utility of the services we seek to provide. In a related area, CRL has recently partnered with the Charleston Advisor, a publication featuring critical reviews of Web products, to add analyses of the archiving provisions made by publishers for electronic products, including news resources such as Access World News, The Times Digital Archives, LexisNexis, and others.22

______________ 22 Machovec, George, "From Your Managing Editor - New Working Relationship between the Center for Research Libraries and The Charleston Advisor," The Charleston Advisor. Vol. 9, no. 3 (January 2008).




• Rights management: Rights management is a necessity for all projects. While most projects have started their efforts with materials in the public domain, the demands of scholarship for recent content will necessitate moving forward in time. Libraries are finding that they must increasingly depend upon the cooperation of news publishers, micropublishers and digital re-publishers to preserve and make newspapers widely available. In dealing with such organizations libraries must be able to protect the substantial investments they have made, and will continue to make, in the acquisition, storage, preservation, and maintenance of news content. A cooperative partnership can assist libraries to secure from newspaper publishers certain limited digital rights for newspapers or, in the case of national depositories, recommend appropriate rights and terms that advance the custodial library’s preservation and access goals through digitization. Ongoing collaboration can help advance these goals.

• Content provision and exchange: Finally, collaboration may take the form of exchange of files or of equitable access to projects of mutual interest. Certainly there is interest in the United States in acquiring access to news content in Southeast Asia (or Europe or the Pacific, etc.). A collaborative network will open channels to communities of interest in ways that will help support local initiatives, through provision of content ("gap fillers" or complementary resources), subscription or purchasing of resources, publicity or other means. A second component of this cooperation might be in licensing content that is of interest to a broader constituency.

8 Conclusion

In 2008 CRL will bring the ICON program under the umbrella of its Global Resources Network. The Network, largely supported by CRL member institutions, provides legal, financial, strategic, and communications support for a range of programs that preserve and afford access to important knowledge and historical and cultural evidence. Through collaboration in ICON, members will play a role in supporting the efforts of the Global Resources Network. In return, member institutions will participate in the governance and strategic direction of the GRN as well as CRL's efforts, particularly as CRL expands its efforts into digital access and conversion of its news content. Most importantly, though, participation in a broad global collaboration such as ICON will further the collective and individual efforts of those institutions seeking to preserve their heritage and make it available to the widest audience.

THE INDEX TO PHILIPPINE NEWSPAPERS (IPN) ONLINE

Chito N. Angeles
University of the Philippines Diliman

Abstract

In February 2007, the University of the Philippines Diliman Library launched another online system developed in-house, known as the Index to Philippine Newspapers (IPN). The IPN Online is the country’s first free online index to local newspapers, made publicly available through the University of the Philippines Diliman Library’s official website (http://www.mainlib.upd.edu.ph/ipn). This paper traces the development of the IPN from its early years (using printed indexes) to the development and implementation of the online version. It also highlights the salient features of the IPN Online and how this new system successfully addressed the problems and difficulties encountered in the legacy system. The paper also cites important reasons for developing a newspaper index database rather than setting up a digital newspaper archive or e-newspapers. Finally, plans for further development of the system are laid down.

1 Introduction

The Index to Philippine Newspapers (IPN) is the University of the Philippines Diliman Library’s index to nineteen (19) locally published newspapers. The newspaper index, which started in 1981, was originally a guide to eight (8) locally published newspapers, namely: Manila Bulletin; Philippine Daily Express; Manila Chronicle; Philippine Daily Inquirer; Philippine Star; Malaya; Manila Times; and Times Journal. During its early years, index entries were written on 3x5 slips (shown in Figure 1), arranged alphabetically by subject, and kept in drawers arranged chronologically by date. Because the entries written on paper easily became worn, the library adopted two preservation methods: photocopying the 1981-1990 index slips and encoding the 1991-1993 index slips into a CDS/ISIS database (“Filipiniana Serials/Special Collections Section: Proposed Project 1994-1995”).


Figure 1: 3x5 Index Slips

The initial automation of the newspaper index, which started in 1991, continued only until 1995. All records from 1991 to 1995, which were encoded in a CDS/ISIS database, have been checked and revised. After 1995, the project was stopped because no staff were available to encode and verify the records in the database. The indexing of newspaper articles nevertheless continued over the following years, but 3x5 slips were again used to record the entries. As the newspaper collection grew, more and more articles were indexed and more and more entries were added to the drawers, but the slips slowly deteriorated because of changing weather conditions and the nature of the paper on which the entries were written. Producing a hardbound copy of the slips would only occupy more space in the library, and users would still have difficulty reading handwritten entries. Added to this is the time consumed in browsing entries in the drawers arranged by subject. In 2001, there was an attempt to create a new system for the IPN. The author, as head of the Computer Services Division of the University of the Philippines Diliman Library, initiated the development of a new database system to continue encoding indexed newspaper articles from the 3x5 slips of paper. The Computer Services staff started encoding index entries from the slips, beginning with 2001 and moving backwards to the 1996 issues. Records from CDS/ISIS, covering index entries from newspapers published in 1991-1995, were also converted to the new database system. By the end of 2001, a total of 22,500 entries were already available in the database, covering the newspapers published in 2000-2001. Data entry has continued since then, and by 2006 the database had grown to more than 162,000 index entries covering newspapers from 1991 to the present.
In 2006, a full-blown automation project was carried out by the author to include:
1) design of a new newspaper index database;
2) conversion of more than 162,000 electronic records from legacy systems covering index entries from newspapers published in 1991-2006; and
3) development of an online newspaper management and article indexing system, known as the Index to Philippine Newspapers (IPN) Online.

2 The Index to Philippine Newspapers (IPN) Online

The IPN Online database is a web-based, platform-independent system, developed using Free Open-Source Software (FOSS). Because it was developed and implemented using free open-source technologies, there were no expenses incurred during its development and implementation. The most outstanding features of the new system include:

• Search Engine - a robust search interface that supports both simple and complex search expressions, such as Boolean searching, phrase searching, and truncation;

• Issue Browser - lists (and counts) all available issues of a selected newspaper title in a hierarchical structure, from year to month down to specific issue dates;

• Online Request - an interface where users can submit requests for reading, photocopying, or printing on designated request (OPAC) terminals;

• Circulation System - similar to a book circulation system but intended for lending newspapers in print or microfilm;

• Built-in Authority File - for managing subject and name entries. It supports cross-references, such as Use, Use For, and Related Terms;

• Virtual Cart - used to bookmark records for later processing (i.e., printing, sending to email, saving to file);

• Newspaper Management - for recording newspaper titles, daily issues, etc.;

• Report Generator - for generating usage reports and performance output of staff.

Currently, the IPN Online includes article indexes of 19 locally published newspapers from 1981 to the present, namely: Bulletin Today, Daily Express, Dyaryo Pilipino, Kabayan, Malaya, Manila Bulletin, Manila Chronicle, Manila Standard, Manila Standard Today, Manila Times, Philippine Daily Globe, Philippine Daily Inquirer, Philippine Star, Philippines Journal, Sunday Standard, Times Journal, Today, Veritas, and We Forum. As of this writing, there are 186,537 article indexes in 63,333 issues from the 19 newspaper titles mentioned above. Of the 19 newspapers, six are still being indexed by the library, namely: Malaya, Manila Bulletin, Manila Chronicle, Manila Standard Today, Philippine Daily Inquirer, and Philippine Star. Print issues older than one (1) year are converted to microfilm format and the print copies are discarded.

3

The IPN Online vs. Digital Newspaper Archives and E-Newspapers

Why develop a newspaper index database when most libraries abroad have started putting up digital copies of their newspapers in print and microfilm? In addition, a number of local newspaper companies have already developed electronic versions of their newspapers, known as “e-Newspapers”, made accessible via the Web. The most important reasons for pursuing this project are as follows:

• Producing a full-text newspaper article database is expensive, and the workload requires additional staff as well as equipment;
• Digitization of newspapers is laborious because of the relatively different layout of newspapers (i.e., contents are usually distributed across several different pages);
• Online full-text newspaper databases do not include everything published in the paper;
• Not all local newspaper companies with online versions of their dailies maintain archives of their previous issues. Those who do usually keep only the issues from the last 2 to 3 years; and
• Subject indexes to newspapers provide a means to group together articles on the same topic or with similar content.

4

THE IPN ONLINE: A PRODUCT OF AUTOMATION

This section describes the state of the Index to Philippine Newspapers at the University of the Philippines Diliman Library before and after the deployment of the new system. Specifically, it describes the activities and services of the library relating to the management of newspaper titles and issues, the preparation and creation of article indexes, the servicing of newspapers in print and microfilm formats, etc., and how the new system improved these services and addressed the problems of the old system through automation.

- Recording (titles and issues)

Before, as newspapers were delivered to the library, the issues were recorded by the staff on paper. Using the new system, newspaper issues are recorded online under their respective titles and saved into the database (shown in Figure 2). A pop-up calendar makes it easy to select specific issue dates to add under each newspaper title.

Figure 2: Newspaper Management

- Indexing procedures and practice

Before, Indexers read newspapers and selected articles to be indexed. Index entries were written on 3x5 slips and filed in drawers arranged in chronological order. Revisers reviewed the index entries on the slips and updated them as required. Encoders pulled the slips from the drawers and encoded the index entries into the database. Since the entries on the slips were hand-written, the electronic records were prone to human error (e.g., typographical errors). Information such as the newspaper title and issue date (year, month, day) was keyed in repeatedly by Encoders for each index entry. Using the new system, after selecting articles to be indexed, Indexers encode the index entries directly into the database using an online indexing worksheet (shown in Figure 3).

Figure 3: Online Indexing Worksheet

Newly encoded index entries are automatically marked as “In-Process”. Revisers check records marked “In-Process” and perform the necessary revision. Upon revision of index records, Revisers mark the records as “Active”. Records marked “Active” immediately become searchable to the public. In addition, issues associated with an article can be added easily by selecting available issues under each newspaper title (shown in Figure 4) rather than manually encoding the issue dates in every article index.

Figure 4: Adding Issues
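The review workflow described above (new entries start as “In-Process” and become publicly searchable only once a Reviser marks them “Active”) can be sketched in a few lines of Python. This is a minimal illustration and not the IPN Online implementation, whose code the paper does not show; the names `IndexRecord`, `approve` and `public_search` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    IN_PROCESS = "In-Process"
    ACTIVE = "Active"


@dataclass
class IndexRecord:
    """One article index entry (hypothetical, simplified to a title)."""
    title: str
    status: Status = Status.IN_PROCESS  # newly encoded entries start here


def approve(record: IndexRecord) -> None:
    """A Reviser marks a checked record as Active."""
    record.status = Status.ACTIVE


def public_search(records, keyword):
    """Return only Active records whose title contains the keyword."""
    needle = keyword.lower()
    return [
        r for r in records
        if r.status is Status.ACTIVE and needle in r.title.lower()
    ]
```

In this sketch an unrevised record never reaches the public result list, which mirrors the separation between the Indexers' worksheet and the public search interface.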

- Vocabulary control

Before, when Indexers assigned subject headings, they searched for the proper heading in an authority list kept in a CDS/ISIS database. Separate authority lists were available for author names and for subject headings. When the proper headings were found, Indexers updated the index entries on the 3x5 slips. Although an authority list was available, it served only as a look-up tool for the proper headings to use. Index entries were still encoded from 3x5 slips and were therefore prone to human error (e.g., typographical errors), resulting in inconsistencies in the subject headings and author names. Using the new system, Indexers simply search the built-in authority list and select the proper headings to be assigned to each index record (shown in Figure 5).

Figure 5: Authority Control

The built-in authority list maintains only a single list combining author names and subject headings in one authority file. This is due to the fact that author names are also used as subject headings. This way, redundancy is minimized, if not totally eliminated. Finally, the built-in authority list also includes cross references (i.e., USE, USE FOR, RELATED TERM) and SCOPE NOTE to help the indexers choose the right headings to use.
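A combined authority file with USE, USE FOR and RELATED TERM references might be modelled as below. This is only a sketch under stated assumptions: the class names, the method names and the sample headings are all hypothetical, and the paper does not describe how the IPN Online stores its authority records.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Heading:
    term: str
    use: Optional[str] = None  # USE: preferred heading, set on variants only
    use_for: List[str] = field(default_factory=list)  # USE FOR: variants
    related: List[str] = field(default_factory=list)  # RELATED TERM
    scope_note: str = ""  # SCOPE NOTE


class AuthorityFile:
    """Single list holding both names and subjects (hypothetical model)."""

    def __init__(self) -> None:
        self._headings: Dict[str, Heading] = {}

    def add(self, term, related=(), scope_note=""):
        """Register a preferred heading."""
        self._headings[term.lower()] = Heading(
            term, related=list(related), scope_note=scope_note
        )

    def add_use_reference(self, variant, preferred):
        """Register a non-preferred variant that points to a preferred heading."""
        target = self._headings[preferred.lower()]
        target.use_for.append(variant)
        self._headings[variant.lower()] = Heading(variant, use=target.term)

    def resolve(self, term):
        """Return the heading an Indexer should assign for the given term."""
        heading = self._headings[term.lower()]
        return heading.use if heading.use else heading.term

    def get(self, term):
        return self._headings[term.lower()]
```

Because Indexers select a resolved heading instead of retyping it, spelling variants cannot enter the index records, which is the consistency benefit the text describes.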

- Search and retrieval

Before, users looking for newspaper articles published from 1980-1990 had to consult the printed index. For articles published from 1991 onwards, users could search the database. The search results provided only the basic bibliographic data (i.e., title, author, newspaper name, issue date, and page). Using the new system, the user may customize his/her query using the different search options available (shown in Figure 6).

Figure 6: Search Interface

Users can search by subject, title, or author’s name. A search can be narrowed further by limiting the query by date and by specific newspaper title. The search interface also provides options to refine the query: Boolean AND [all the words], exact words [as phrase], or truncation [as rootword]. The record details interface (shown in Figure 7) provides complete bibliographic information and tells the user where to find the specific newspaper article. In addition, links on subject headings and author names provide a means to retrieve articles on the same subject or written by the same author.

Figure 7: Record Details
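The three query refinements offered by the search interface (all the words, exact phrase, and truncation on a root word) can be illustrated with a small matcher over article titles. This is a sketch of the matching rules only, assuming simple lower-cased word tokens; the function names are hypothetical, and the actual search engine used by the IPN Online is not described in the paper.

```python
import re


def tokenize(text):
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(title, query, mode="all"):
    """Check one title against a query.

    mode="all":        every query word occurs in the title (Boolean AND)
    mode="phrase":     the query words occur consecutively, in order
    mode="truncation": every query word is a prefix of some title word
    """
    words = tokenize(title)
    terms = tokenize(query)
    if not terms:
        return False
    if mode == "all":
        return all(t in words for t in terms)
    if mode == "phrase":
        n = len(terms)
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    if mode == "truncation":
        return all(any(w.startswith(t) for w in words) for t in terms)
    raise ValueError(f"unknown mode: {mode}")
```

A production system would normally delegate this to a database full-text index; the sketch only makes the difference between the three options concrete.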

For users who are not sure which keywords to use for searching the database, the new system provides an interface to browse through the available subject headings and author names (shown in Figure 8). Clicking on a heading triggers a search of the database using the selected subject or author’s name.

Figure 8: Headings Browser

The new system also provides an interface for browsing available issues hierarchically beginning from the newspaper title, year, month, down to the specific issue dates (shown in Figure 9). In addition, the number of articles indexed is also summarized per newspaper title, year, month, and issue dates. Clicking on an issue date will retrieve all articles indexed in that issue.

Figure 9: Issue Browser
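The hierarchical browsing described above, with article counts summarized per title, year, month and issue date, amounts to a nested grouping of the index records. The following is a minimal sketch, assuming each indexed article is available as a (newspaper title, issue date) pair; `build_issue_tree` and `count` are hypothetical names, not part of the IPN Online.

```python
from collections import defaultdict
from datetime import date


def build_issue_tree(articles):
    """Group articles as title -> year -> month -> issue date -> count.

    `articles` is an iterable of (newspaper_title, issue_date) pairs,
    one pair per indexed article.
    """
    tree = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    )
    for title, issue_date in articles:
        tree[title][issue_date.year][issue_date.month][issue_date] += 1
    return tree


def count(node):
    """Total number of articles at or below any level of the tree."""
    if isinstance(node, int):
        return node
    return sum(count(child) for child in node.values())
```

Calling `count` on the node for a title, a year or a month gives the summary figure shown beside that level in a browser of this kind.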

- Servicing newspapers in print and microfilm

Before, when a user wished to read newspapers in print or in microfilm, a request slip (shown in Figure 10) was filled in and submitted to the staff. The staff retrieved the newspapers in print from the shelf, or the microfilm reel from the microfilm storage area. The newspapers were then turned over to the borrower. The borrower had to leave his/her identification card while the newspaper was with him/her.

Figure 10: Request Slip

Using the new system, from a designated request (OPAC) terminal, a user simply selects the issues he/she wishes to request on an online request interface, scans or types his/her borrower ID, and waits for his/her request to be acknowledged by the staff (shown in Figure 11).

Figure 11: Online Request Interface

The request is automatically forwarded to the staff terminal. The staff acknowledges receipt of the request by scanning or typing his/her employee ID and retrieves the newspapers in print or microfilm. The interface shows the name of the attending staff for every request submitted from the request terminal. This prevents the same request from being attended to by two or more circulation staff. Before lending the newspapers to the borrower, the staff checks out the newspapers using a newspaper circulation check-out interface (shown in Figure 12).

Figure 12: Circulation Check-out Interface
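The acknowledgement step described above, where the first staff member to scan an ID becomes the attending staff and later attempts are turned away, can be sketched as follows. This is an in-memory illustration with hypothetical names (`RequestBoard`, `acknowledge`); a multi-terminal deployment would presumably need the same check to be performed atomically in the database, which the paper does not discuss.

```python
from typing import Dict, Optional


class RequestBoard:
    """Tracks which staff member, if any, is attending each request."""

    def __init__(self) -> None:
        self._attending: Dict[int, Optional[str]] = {}

    def submit(self, request_id: int) -> None:
        # A newly submitted request has no attending staff yet.
        self._attending[request_id] = None

    def acknowledge(self, request_id: int, staff_id: str) -> bool:
        """Claim a request; return False if it is already being attended."""
        if request_id not in self._attending:
            raise KeyError(f"unknown request {request_id}")
        if self._attending[request_id] is not None:
            return False
        self._attending[request_id] = staff_id
        return True

    def attending_staff(self, request_id: int) -> Optional[str]:
        """Name or ID shown on the staff terminal for this request."""
        return self._attending[request_id]
```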

All newspaper issues currently checked out are displayed on the staff terminal screen (shown in Figure 13). When newspapers are returned, the staff checks in the newspapers using the same circulation interface.

Figure 13: On-Loan Interface

- Generating reports

Before, statistical reports, such as how many articles were indexed for each newspaper and which newspaper titles and issues were frequently requested for reading, photocopying or printing, were generated manually by the staff from log books, used periodical call slips, etc. Using the current system, the staff can simply generate statistical reports from a report generator interface (shown in Figure 14). Reports such as how many articles were indexed or revised by each staff member within a given period and how many requests for reading, photocopying or printing were accommodated by the staff, including which newspapers are most frequently requested, can be generated by the system on demand.

Figure 14: Report Generation
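A report such as “how many articles were indexed or revised by each staff member within a given period” reduces to a filtered count over an activity log. The sketch below assumes the log is available as (staff name, action, date) tuples; that layout and the function name `actions_per_staff` are assumptions made for illustration, not details taken from the IPN Online.

```python
from collections import Counter
from datetime import date


def actions_per_staff(log, start, end, action="indexed"):
    """Count log entries per staff member for one action type.

    `log` is an iterable of (staff_name, action, day) tuples; only
    entries with start <= day <= end are counted.
    """
    return Counter(
        staff
        for staff, logged_action, day in log
        if logged_action == action and start <= day <= end
    )
```

The same pattern, applied to a log of requests keyed by newspaper title, would give the list of most frequently requested newspapers through `Counter.most_common`.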

5

Future Plans

The IPN Online has been in continuous development since its launch to keep the system up to date and in good working condition. Additional features are currently being incorporated, such as:

- Digitization: Digitizing newspapers from print or microfilm and uploading these into the database so that users need not request the print or microfilm copies. This will also include a full-text search engine to enable “data mining” on the actual text of newspaper clippings.

- Inclusion of abstracts: In the absence of a digital copy, an abstract can be added to provide a more detailed description of the actual content, in addition to the title and subject headings assigned.

- MARC export: The MARC export function will make the records of the IPN Online interoperable with other systems that are compliant with the Machine Readable Cataloging format.

6

Conclusion

While access to current and historical information from newspapers is available in varied formats, subject indexes continue to be a vital information need in Philippine libraries and the research community in general. The development and implementation of the Index to Philippine Newspapers (IPN) Online using Free Open-Source Software (FOSS) not only saved the University of the Philippines a considerable amount of money; it also served the research needs of its users by extending access to current and historical information found in Philippine newspapers beyond the four walls of the library. Finally, the IPN Online has successfully addressed the problems and difficulties of the old system used at the University of the Philippines Diliman Library through automation of the major services relating to the management, indexing, and servicing of newspapers.

SERVICE AND PROCESSING OF NEWSPAPER IN SUPPORTING RESEARCH: A Case Study at Libraries of Universities in Surabaya

Munawaroh
STIE Perbanas Surabaya’s Library

Abstract

This paper describes the service and processing of newspapers at university libraries in Surabaya. It looks at the actual conditions of newspaper processing and service in support of research. It also discusses the challenges and issues librarians face in using newspapers as information resources that must be processed.

1

Background

The responsibility of higher education institutes in Indonesia is to implement the Triple Mission of Higher Education, which incorporates education and learning, research, and community service. The library supports this mission by performing its role and functions as defined by the Higher Education Library Guide Book: developing the collection, organising and maintaining library materials, providing customer service and carrying out the administration of the library. The functions of the library are:

• Education: The library is a learning resource for the academic community; its collection therefore supports the aims of learning, with materials organised for every study programme, collections on strategies for studying and teaching, and materials that support the evaluation of learning.
• Information: The library is an information resource that is easily accessible to patrons and information users.
• Research: The library provides the most recent primary and secondary materials for research and the study of knowledge, technology and the arts. Collections in the libraries of higher education institutes are meant to support research that can be applied to the development of the community in various fields.
• Recreation: The library has to provide recreational materials that develop and promote the creativity, innovative interests and capability of its users.
• Publications: The library should assist in publishing works produced by the institute’s academic and non-academic staff.
• Deposit: The library is the central deposit for all the works, contents and knowledge created by the members of its higher education institute.
• Interpretation: The library has to conduct research on, and add value to, its information resources to assist its users in their education and research.

The role and functions of a higher education institute library are manifested through the services to its users. The services provided depend on the size of the university and its library. The general services provided by a higher education institute library are circulation, reference, and subscription to journals and academic materials. These services are provided depending on the needs of the users and the resources the library can acquire. The universities in Surabaya are developing e-library services, as requested by the users, in which all digital information is provided online. Established universities such as Airlangga University, Surabaya University, Widya Mandala University and Kristen Petra University already provide online services in which access to the title index and information, and to online journals, is limited to their academic communities through the use of passwords.

2

Problem

Newspapers contain daily local and national information. The information is typically popular, and sales depend on the title of the newspaper. The title reflects the newspaper’s specialisation and the information it provides. Almost all higher education institutes in Surabaya subscribe to newspapers as part of their libraries’ collections. However, newspaper service and processing are very modest, and the newspapers are located together with collections that are recreational in nature. There is very little research on the advantages of newspapers as sources of information. As a result, librarians are not required to process the information from the newspapers creatively. Studies that require newspapers as a source of information are often conducted by undergraduates at S1 level, especially those on the management and accountancy courses, as the data needed come from the latest transactions on the stock exchanges: for instance, the dates on which stocks are publicly listed and dividends are paid out. These data are often available only in the newspapers, making the newspapers a primary source of information. As the information in the newspapers is not systematically processed, the data search is conducted manually by going through all the physical copies. This makes it hard for users to filter the information, a difficulty exacerbated by the fact that pages are sometimes torn or missing.

3

Newspapers Processing and Service at the Universities in Surabaya

Newspapers are part of the serials collections and are processed through a recording system. According to Sulistyo-Basuki (1991:242), there are six systems for recording serials collections: the registration system, the big book system, the two-card system, the three-card system, Kardex and the in-house system. Newspapers in higher education institutes in Surabaya (Surabaya University, Widya Mandala University, Kristen Petra University, Airlangga University, 10 November Technological Institute of Surabaya and STIE Perbanas Surabaya) are processed using the in-house system, in which each library records its own inventory of newspapers in a way appropriate to the situation and conditions in the library.

The processing of newspapers also changes as the requirements placed on the library change. In general, newspapers are processed in the following order: recording the date of arrival, stamping the library seal and displaying the newspapers on the racks. Processing after display depends on each library’s initiative; some keep the newspapers for a period of 1-2 years and also maintain a collection of newspaper clippings. The newspaper inventory also serves as evidence for claims against agents or publishers when newspapers do not arrive. The following illustrates newspaper processing and services at the universities in Surabaya, based on interviews and surveys conducted by the author:

STIE Perbanas Surabaya Library
• Newspapers subscribed are Bisnis Indonesia, Investor Daily, Kompas and Jawa Pos.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: From 1996-2000, after display, the newspaper articles were indexed and categorised by subject, with call numbers given for retrieval purposes. All indexed newspapers were kept in a bundle on a monthly basis. Initially, the index was not used by the patrons and was not continued. All newspapers were then kept in a bundle on a monthly basis after display, with no research facility provided. Since 2005, taking into account usage by patrons and the efficiency of the practice, only Bisnis Indonesia and Investor Daily are kept in a bundle, as the two titles are well used by S1 undergraduates on the accountancy and management courses.
• Newspapers services include accessing and photocopying current or past issues in bound copies.

Widya Mandala University Library
• Newspapers subscribed are Surya, The Jakarta Post, Kompas and Jawa Pos.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: All titles are kept in a bundle on a monthly basis and kept for a year before being sent for pulping. This is because (1) there are online services of these newspapers, which can be downloaded, and (2) there is a subscribed clippings service provided by the Centre for Strategic and International Studies (CSIS) on a variety of subjects.
• Newspapers services include access in the current newspapers display area.

Kristen Petra University Library
• Newspapers subscribed are Surya, The Jakarta Post, Kompas and Jawa Pos, Seputar Indonesia, Republika, Suara Pembaharuan, Surabaya Post, Bisnis Indonesia and Radar Surabaya.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: Before 1995, the newspapers were kept in a bundle on a monthly basis. Taking into account space constraints and the fact that newspapers now provide online services, all titles are kept in a bundle on a monthly basis and kept for three months before being sent for pulping. Clippings of newspaper articles are limited to those pertaining to Kristen Petra University.
• Newspapers services include access in the current newspapers display area.

Surabaya University Library
• The 17 newspapers subscribed include Surya, The Jakarta Post, Kompas and Jawa Pos, Seputar Indonesia, Republika, Suara Pembaharuan, Surabaya Post, Bisnis Indonesia, Radar Surabaya, Suara Karya, Media Indonesia, Memorandum and others.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: Before 1999, the newspapers were kept in a bundle on a monthly basis. Taking into account space constraints and the fact that newspapers now provide online services, all titles are kept in a bundle on a monthly basis and kept for two years before being sent for pulping. The library provides an electronic newspaper clippings service.
• Newspapers services include access in the current newspapers display area. The electronic newspaper clippings service can be accessed via the Surabaya University Library’s website.

Airlangga University Library
• Newspapers subscribed are Jawa Pos, Bisnis Indonesia, Kompas, Republika, Surya, Surabaya Post and Suara Karya.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: The newspapers are kept in a bundle every month (by title) and arranged on shelves. They are sent for pulping after a certain period of time.
• Newspapers services include access in the current newspapers display area.

10 November Technological Institute Library
• The eight newspapers subscribed are Surya, The Jakarta Post, Kompas, Jawa Pos, Seputar Indonesia, Republika, Surabaya Post and Radar Surabaya.
• Newspapers processing:
  o Newly arrived newspapers are recorded in a newspapers inventory form, stamped with the library’s seal and displayed on the newspapers rack.
  o Newspapers processing after display: Clippings of articles on technological and technical fields such as environment, computer, architecture, information and news on the institute are collated and title-indexed.
• Newspapers services include access in the current newspapers display area. A newspaper clippings service is available in the Reference Section.

4

Newspapers Processing to Support Research

The processing of newspapers as part of the library’s collections has developed and changed, from keeping the newspapers in bundles to using online services provided by the publishers, reflecting the needs of each library. Librarians are expected to make the information in newspapers usable as a source for research in an accurate, easy and affordable manner. The online services provided by the newspaper publishers are not always freely available to the public, and there are access limitations to both new and old information. For instance, the online version of Jawa Pos (www.jawapos.co.id) only provides articles from the present day’s issue, while Kompas has developed the Kompas Information Centre (http://pik.kompas.co.id), which allows access to old and current news and articles through subscription or a printout service at the office of Kompas. With the proliferation of local and national newspapers, not all information is relevant to the libraries. Thus, there is a need to systematically process the information in the newspapers for research and reference so that it can be accessed in an accurate, easy and affordable manner, for instance to obtain data from the Surabaya Stock Exchange. The following steps should be undertaken to select newspapers as part of the library’s collections that can support research:

• Newspapers selected for subscription must be suitable for the library to meet the information needs of its users. For example, the STIE Perbanas Surabaya library, whose scope covers the fields of economy, management, accountancy, business and banking, can subscribe to Bisnis Indonesia and Investor Daily, as these are business newspapers. It can also subscribe to the popular national newspaper, Kompas, and the popular local newspaper, Jawa Pos.
• Select the rubrics most needed by the users for research data or for writing papers, based on an analysis of the users’ needs. For example, at the STIE Perbanas Surabaya library, the data most needed by the undergraduates are on finance, prospectuses, and financial reports. The Investor Daily newspaper provides data and information on industry and bank prospectuses, industry and bank financial reports, transactions on the Indonesian Stock Exchange and other financial data. The Bisnis Indonesia newspaper provides data and information on industry and bank prospectuses, industry and bank financial reports, stock exchanges and other financial data.
• Transfer the selected information to a digital format using a scanner and keep it as digital files.
• Group the data according to type.
• Index the information to facilitate research.
• Librarians, assisted by financial data experts, organise the raw primary data continuously. The result can be marketed to the analysis industry or to other libraries. One example is the Indonesian Banking Indicator and Financial Performance Rating produced by PT Ekofin Konsulindo, a collection of per-semester financial reports of Bank BUMN, Bank Devisa, Bank Non Devisa, Bank Campuran and Bank Asing. Another example is the Indonesian Sekuritas Market Database (ISMD), produced by Pusat Pengembangan Akuntasi Universitas Gadjah Mada, a collection of daily trading data of the Jakarta Stock Exchange (JSE).
• Establish a network of access with newspaper publishers for old articles, news and data already published.

5

Newspapers Service to Support Research

• Information research service: Librarians find or show articles/news for users.
• Consultation service: Librarians advise researchers on which newspapers are suitable for their needs.
• Online, email or fax service: Collections of newspaper data/articles that have been selected and digitised can be provided online, by email or by fax.

6

Conclusion

Newspaper services at the higher education institutes in Surabaya consist mainly of access to current issues as part of the library’s collections. This is shown through:

• Moderate newspaper processing.
• No prioritisation, as the demand for information from articles/news is low.
• The up-to-date nature of information in newspapers is not lasting.
• Newspapers require treatment costs.
• Newspapers require a large space.
• Online newspaper subscriptions are offered by publishers but are not taken up by the libraries of the higher education institutes.
• Provision of facilities (especially sofas) for patrons to use the newspapers.

References

Departemen Pendidikan Nasional Republik Indonesia. 2004. Buku Pedoman Perpustakaan Perguruan Tinggi. Edisi ke-3. Jakarta: Direktorat Jenderal Pendidikan Tinggi.
Sulistyo-Basuki. 1991. Pengantar Ilmu Perpustakaan. Jakarta: Gramedia Pustaka Utama.
Surveys and interviews with librarians.

ENHANCING ACCESS TO THE NEWSPAPER COLLECTIONS: The Lee Kong Chian Reference Library Experience

Gracie Lee and Josephine Yeo
National Library Board, Singapore

Abstract

The Lee Kong Chian Reference Library of the National Library Board (NLB), Singapore has over 200 current and historical Singaporean and Malayan newspapers in its collection. These newspapers are currently accessible through the microfilm collection, which numbers around 18,000 reels. This paper discusses how the library has enhanced access to the newspapers-on-microfilm collection through the streamlining of library processes. In July 2007, the NLB signed landmark agreements with the Singapore Press Holdings (SPH) to digitise The Straits Times (ST) archives dating as far back as 1845. The Straits Times is a long-running newspaper and the most widely read English language newspaper in Singapore. It is a frequently consulted resource known for its reliable information. The digitised content will be made available to patrons for individual research and reference purposes within NLB's network of libraries in Singapore. With this new service, patrons will be able to search for newspaper articles more effectively. The second part of this paper discusses the digitisation process and the challenges faced.

1

Background

Newspapers are one of the primary sources of information for researching a nation’s past and present. Belying Singapore’s small geographical size and relatively short history, around two hundred Singaporean and Malayan newspapers have been published since its founding in 1819. The Lee Kong Chian Reference Library of the National Library Board (NLB) is one of the two institutions with the largest holdings of Singaporean/Malayan newspapers in Singapore, the other being the National University of Singapore’s Central Library. Historic and current Singaporean/Malayan newspapers at the Lee Kong Chian Reference Library are available to members of the public primarily through microfilm. This collection currently stands at 18,000 reels and comprises newspapers in the four official languages of Singapore, namely English, Chinese, Malay and Tamil. This paper discusses how access to the collection has been enhanced through the streamlining of library processes and, more recently, through newspaper digitisation. It also shares the thought processes behind the new service concepts as well as the challenges and issues faced in implementation.

2

Microfilming as a Form of Preservation and Access

Microfilming has been and still is the main means of preserving and providing access to the library’s newspaper collection. During the 1950s, most of the newspapers were deteriorating due to the effects of the humid tropical climate. Hence the library embarked on microfilming in order to save these documents. Since then, all of our newspapers have been comprehensively microfilmed. The library continues to preserve all local papers through microfilming, though a commercial vendor now handles the actual microfilming process. Newspaper issues are usually available on microfilm within three months from the date of publication. During this period, hard copies are used. These are discarded once the surrogate is available. The Legal Deposit Unit continues to store the preservation copy.

3

From Closed to Open Access: The Current Affairs Centre and the Self-Help Microfilm Service

Until the mid-1990s, both the print copies of newspapers and the newspapers on microfilm were kept in closed access areas and were retrieved at patrons’ request at the reference desk. A card catalogue was maintained for patrons to check the microfilm holdings. This practice continued until the reference library in the old library building at Stamford Road underwent renovations in 1997. After the renovations, much of the library’s collection moved from the closed access areas to the open shelves. A Current Affairs Centre was set up to bring all newspaper and periodical reading activities together. The centre housed over seventy local and foreign newspapers as well as the newspapers on microfilm collection. An in-house database, created on Lotus Notes, replaced the card catalogue. With these changes, a self-help microfilm service was introduced. Patrons could check the microfilm holdings in the database, retrieve the microfilm reels from the open cabinets and then proceed to the information counter to register for a microfilm reader. As a result, waiting time was significantly reduced and the staff previously needed to retrieve newspapers and microfilms were redeployed. Though staff were initially concerned that opening the collections would lead to lost or misplaced items, the loss rate was extremely low.

4

Improving Access through a New Newspaper on Microfilm Catalogue and Subscription to News Database

In the following years, access to Singaporean newspapers was further enhanced through a new and improved “Newspapers on Microfilm” holdings database. The new database is available through the internet and allows users to search and browse newspaper holdings in the four official languages: English, Chinese, Malay and Tamil.

The old client-based Newspapers on Microfilm Catalogue

The new web-based Newspapers on Microfilm Catalogue

The library has also subscribed to a number of news databases, including Factiva, LexisNexis, Newsbank and Library PressDisplay, which allow patrons to perform keyword searches on local news content. Though their coverage of local news content may not be comprehensive, these databases have at least enabled our patrons to search for current news. Below is a table showing the usage of the microfilm collection and the newspaper databases from April to December 2007 at the National Library.

Type                    Average Usage Per Month
Microfilm               3,400 reels
Factiva                 18,677 document count
LexisNexis              11,832 searches
Newsbank                4,770 documents viewed and searches performed
Library PressDisplay    1,223 monthly read issues and total users

5

Newspaper Digitisation Project

The most significant progress made in enhancing access to the newspaper collection is the Straits Times newspaper digitisation project, which the library embarked on in 2007 and which is currently in progress. In 2006, the National Library started discussions with the newspaper publishing company Singapore Press Holdings (SPH) to explore the possibility of having the National Library digitise all issues of The Straits Times. The Straits Times is the longest-surviving English language broadsheet in Singapore. It is also the most-read paper in Singapore and is a frequently consulted resource for both current and historical information. After months of negotiation, the final agreement was signed on 31 July 2007. SPH very kindly agreed to let the National Library digitise all back issues of the newspaper from its inception on 15 July 1845 right up to 2006. For issues from 1 January 2007 onwards, SPH would digitally deposit every issue of the paper with the National Library.

A proposed interface design for the digitised newspaper collection

6

Digitisation Process

NLB outsources all digitisation. For the Straits Times project, a Request for Proposal was published, inviting experienced vendors to propose a solution for digitising the newspapers from microfilm and to provide a content delivery system for the digitised newspapers. As NLB’s collection also includes newspapers in Chinese, Malay and Tamil, and there are plans to digitise other Singapore newspapers in future, the proposed solution had to be able to handle Chinese and Tamil scripts. Other criteria were that the solution use open standards as opposed to proprietary formats, and that articles continuing over more than one page be linked.

Several proposals were received, ranging from very simple image-only solutions to custom-developed solutions. The chosen proposal closely followed the Library of Congress’ NDNP specifications, extended to article level, coupled with a customised Greenstone solution for content delivery, very much like the National Library of New Zealand’s Papers Past.

Among the challenges faced was having to work with various parties located in several countries across many time zones. Although we were fortunate to be able to engage a vendor that has an office in Singapore, the expertise was not locally based. The primary vendor, which is headquartered in India, worked in partnership with others in Germany, the USA and New Zealand to provide a total solution. Face-to-face meetings were minimised to avoid escalating travel expenses, and much of the interaction took place by email, Skype and phone calls.

Another challenge was poor-quality microfilm images. Although a fresh set of direct duplicates was used, some of the original images were poor to begin with. One way of dealing with this would be to take stock of the poorly captured pages and explore the possibility of rescanning them from the original hard copy.

The greatest challenge, however, was to implement digital rights management. Access rights had to be applied not only to every issue, but to every article and every photograph as well. All previous newspaper digitisation projects had been for historical newspapers in the public domain, so there was no example to follow.

A small sample of a few issues was processed and checked each time. The checking process combined checking every article on every page with examining the accompanying XML files, making sure that continuations were correctly linked and that the article types were correctly assigned. It took several iterations to get the conversion right, and only then could full-scale production begin. The data was shipped from India to Singapore on hard disks.
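The continuation and article-type checks described above lend themselves to partial automation. The following is a minimal sketch of what such a check might look like, not the tooling actually used in the project. It assumes a simplified, hypothetical XML layout (an `issue` element containing `article` elements with `id`, `type`, `continued-in` and `continued-from` attributes) and an invented list of article types; neither is taken from the NDNP specifications or from the vendor's files.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of article types; the project's real list is not given here.
ALLOWED_TYPES = {"article", "advertisement", "illustration", "letter"}

# Hypothetical sample issue with two deliberate errors (a3 and a4).
SAMPLE_ISSUE = """
<issue date="1845-07-15">
  <article id="a1" type="article" page="1" continued-in="a2"/>
  <article id="a2" type="article" page="3" continued-from="a1"/>
  <article id="a3" type="advert" page="2"/>
  <article id="a4" type="article" page="2" continued-in="a9"/>
</issue>
"""


def check_issue(xml_text):
    """Return a list of human-readable problems found in one issue file."""
    root = ET.fromstring(xml_text.strip())
    articles = {a.get("id"): a for a in root.iter("article")}
    problems = []
    for art_id, art in articles.items():
        # Check 1: the article type must be one of the agreed values.
        if art.get("type") not in ALLOWED_TYPES:
            problems.append(f"{art_id}: unknown article type {art.get('type')!r}")
        # Check 2: a continuation must exist and must link back to its origin.
        target_id = art.get("continued-in")
        if target_id is None:
            continue
        target = articles.get(target_id)
        if target is None:
            problems.append(f"{art_id}: continuation {target_id} not found")
        elif target.get("continued-from") != art_id:
            problems.append(f"{art_id}: {target_id} does not link back")
    return problems


problems = check_issue(SAMPLE_ISSUE)
for line in problems:
    print(line)
```

A script of this kind could only supplement, not replace, the page-by-page visual inspection, since it cannot tell whether a link that is structurally valid points to the right article.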

7

Benefits to Patrons

Access to the digitised newspaper will be made available to our patrons through multimedia terminals located in our network of 23 national and public libraries. This will be a great boon for our users, who will be able to search articles in The Straits Times much more efficiently using keywords. Using Greenstone, users will be able to perform both simple keyword searches and advanced searches. Article titles have been 100% corrected and categorised by article type. Users will also be able to browse the newspapers by date, and browse article titles within each issue. This will be a vast improvement over using cumbersome microfilm. Though the digitisation process is currently only partially completed, it has already been used effectively in pilot cases to answer reference questions and to package information products such as the Singapore Infopedia (http://infopedia.nlb.gov.sg).

8

What Next?

In the immediate future, other historical Singapore newspapers will be digitised, greatly enhancing access for users. We hope, in time, to be able to provide access to all Singapore newspapers through this single delivery system, and that SPH will allow NLB to digitise their other newspapers as well.

ONLINE NEWSPAPERS: A NEW ERA

Ed King
British Library

Abstract

This paper will be in two parts: the first will be a brief survey of current newspapers online throughout the world. It will attempt to:
• provide some information regarding the archiving of information held within online newspapers
• show how far screen technology and software have removed users from the form in which information was traditionally presented in printed newspapers
• explore briefly the role of interactive/dynamic updates to live screen feeds
• look briefly at the rapid development of online newspaper archives throughout the world during the last few years
• consider the conversion of back runs of printed newspapers into digital format and the presentation of pages on the web.

The paper will attempt to discern possible patterns of provision and preservation arising from the information gathered. It will also look briefly at the possible impact of all these developments upon the role of libraries, both public and national, in providing and preserving large quantities of digital data for newspapers.

1

Introduction

The advent of the Internet, and its rapid development in the last few years, has created a new opportunity for newspaper publishers all over the world. The number and variety of online newspapers is enormous. Naturally, the use of screens to present information is very different from the format of newspapers printed on paper. Presenting information on a bright screen, capable of displaying multiple colours, has brought forth a great number of different designs for the presentation of information. The paradox of the new technology is that it delivers information far more quickly and permits far more frequent updates than was previously possible in print, yet normally only one small screen of information can be displayed at any one time. With a conventional printed newspaper, a reader can quickly look over two pages at a time.

2

What We Have Now

It is possible only to give a snapshot of what is available today. Perhaps the most convenient way to assess the scale of online newspaper publishing is through portal sites, which aggregate a large number of titles available worldwide. For example, the site Online Newspapers.com divides the world into thirteen segments on its front page1: the Americas, Asia, the Pacific, etc. A click on any one country calls up a list of titles available. For example, the Dominican Republic has thirty-three titles listed as of November 2007. The Dominican Sun offers news in English.2 El Viajero Digital offers news in Spanish.3 El Mercurio de Valparaiso (Chile) offers the reader (on Saturday 10 November 2007) the statement that it is 180 years old, and gives an issue number of the newspaper in the top left-hand corner of the screen, an echo of the print era.4 The online newspapers of Russia are accessed via “Europe” at this site. Whilst the majority of the hundreds of papers are in Russian, or other languages, the Moscow News has an English version for Financial and General News.5

When looking at the productions of different countries, it is worth bearing in mind that there are now a number of language translation websites, which make it possible to gain an adequate sense of a headline, or of the full text of an article, by translating it from the original language into one’s own. One such website is AltaVista’s Babelfish6, which is easy to use; however, there are many more. One can look for newspapers that are unusual as well as “conventional” newspapers. One such example I enjoy viewing on occasion is 70 South Antarctic News.7 By contrast, a few clicks later, one can travel to the far northern hemisphere, within the Canada portion of the website, to find the Klondike Sun, published in Dawson City, Yukon.8 The very ease of doing this rather blunts the sense of the enormous distance between the two places.

Why would one visit any other website, when one can visit thousands of online newspapers via Onlinenewspapers.com? Perhaps the answer lies in the infinite curiosity of all of us: we wish to try different ways of access, even if we are led to the same online newspapers. There are alternative portal sites to online newspapers. One is Newspapers.com9, which is mainly USA orientated and allows users to find circulation figures quickly for the top ten and top 100 newspapers. Another site is simply entitled Online Newspapers, and it too is a very effective portal to thousands of titles.10 On page 66 of the Google search for Online Newspapers, one finds Chinese Newspapers Online.11 On page 84 of the Google search one finds Major Chinese Newspapers Online.12 Of equal relevance to myself in the UK is the aggregated list of online UK newspapers, which appears on the Wikipedia website.13

______________
1 See: http://www.onlinenewspapers.com/ visited 10.11.2007.

3

Form of Screen Presentation

Screen presentation has obvious limitations compared to the many pages of printed newspapers. However, structural features of printed newspapers have migrated to the online environment. The newspaper banner remains prominent at the head of the front page of very many online newspapers. The type of the newspaper title is often most distinct against a background “banner”, for example in red, blue or yellow. On the front page, very many online newspapers offer a lead story together with a photograph alongside the story’s headline. Many others offer a summary of the main stories and let the reader decide which story to click on to view. Equally common is the feature which offers users the opportunity to go to a particular section of the newspaper, such as sport, advertisements, letters to the Editor, arts, business or news. Foreign news and business are also covered. A typical example of both characteristics is the English language edition of the Buenos Aires Herald.14 There is a summary of the top stories on the first page, plus the ability to jump to other sections of the newspaper. This is done either through headings down the side (frequently the left-hand side of the screen), or by scrolling down the screen to display stories within the other sections of the newspaper. This is the equivalent of “turning the pages” of a conventional printed newspaper.

Conspicuous too is the prominence of advertisements in many online newspapers. Here, new technology scores over print, for advertisements can be continually changed as the user stays at the newspaper website reading stories. So, instead of one fixed advertisement in a printed daily or weekly edition of a newspaper, multiple advertisements increase the exposure of users to the persuasions of companies selling their products. An example of this is the Sydney Morning Herald.15 Another increasingly common feature is the “pop-up” display of a breaking news story or of the latest news story.

It is striking that so many online newspapers now conform to a similar format of page and screen presentation. A couple of examples make the point: The Tripoli Post, in English, has the same look and feel as other newspapers;16 Athinapoli, published in Greek, has the same structure.17

______________
2 See: http://news.drsol.info/ visited 10.11.2007.
3 See: http://www.elviajero.com.do/ visited 10.11.2007.
4 See: http://www.mercuriovalpo.cl/ visited 10.11.2007.
5 See: http://www.mnweekly.ru/ visited 12.11.2007.
6 See: http://babelfish.altavista.com/ visited 10.11.2007.
7 See: http://www.70south.com/news/ visited 10.11.2007.
8 See: http://www.yukonweb.com/community/dawson/klondike_sun/ visited 10.11.2007.
9 See: http://www.newspapers.com/ visited 10.11.2007.
10 See: http://users.rcn.com/virtual.nai/sot/papers.html visited 10.11.2007.
11 See: http://www.cnd.org/China/news/newspapers/ visited 10.11.2007.
12 See: http://mailer.fsu.edu/~flan/chinese/documents/newspaper.htm visited 10.11.2007.
13 See: http://en.wikipedia.org/wiki/List_of_newspapers_in_the_United_Kingdom visited 13.11.2007.

4

Interactive/Live Screen Feeds

In the last two to three years, online newspaper websites have started to incorporate interactive or live feeds. The software behind these presentations is complex, but for the user the result is simple and immediate. Two forms appear common. The first is a one-line feed of headlines in a prominent place on the home screen. One example is La Cronica de Hoy, Mexico City, where Cronica al momento can be seen at the bottom of the front screen.18 The story is updated every few seconds. In the United Arab Emirates, the newspaper The Gulf Today has a running headline at the top of the screen.19

5

Business Models

There are a great number of online newspapers that do not charge for access to current content. However, as a recent study shows, national newspapers in Western Europe attempt to create revenue from part of their online content.20 The New York Times had a subscription service, TimesSelect, in 2006, but the newspaper has now ended this scheme.21 In the UK, another discussion of the issue appears on the website of The Financial Times, which itself runs an online subscription service for access to its content.22 It seems that, whilst advertising revenue for online newspapers continues to grow and remains healthy, the need for newspaper publishers to create a subscription model (over and above revenue from advertising) has diminished. An example of a hybrid approach is The Buenos Aires Herald, where readers must purchase a pass to gain access to full articles in the foreign news category.23

______________
14 See: http://www.buenosairesherald.com/ visited 11.11.2007.
15 See: http://www.smh.com.au/index.html visited 11.11.2007.
16 See: http://www.tripolipost.com/ visited 11.11.2007.
17 See: http://www.athinapoli.gr/athens/ visited 11.11.2007.
18 See: http://www.cronica.com.mx/ visited 11.11.2007.
19 See: http://www.godubai.com/gulftoday/main.asp visited 11.11.2007.

6

Multiple Languages Within Countries

Worldwide migration has been a feature of humankind probably since ancient times. What online newspapers now show us is the manifestation of these mass migrations, in a form immediately available to audiences both within the countries where immigrants have settled and in their countries of origin. The Pakistan Overseas Daily is published in Norway, in Urdu.24 In Toronto, the Russian ethnic publication Nasha Canada is published in Russian and English for the community of 300,000 Russian-speaking immigrants.25

7

Archiving of Online Newspapers

Within the online environment, newspaper publishers regard an archive as a few years of back issues of the newspaper. This is evident at a number of sites. In the UK, the practice of archiving the issues of recent years and making them available is gaining ground. This is partly because a large number of regional newspapers are owned by a few large companies. The Bolton News keeps an online archive of issues back to 1995.26 The Harrogate Advertiser keeps online issues of the newspaper back to 2002.27 In Canada, the Interlake Spectator, published in Gimli, Manitoba, offers only recent stories as an archive.28 The South China Morning Post offers its back files through subscription.29 The Sydney Morning Herald has a “News Store” that allows you to search Fairfax publications and ASX company announcements from 1990 to today. It is free to search the data; however, a small charge applies if you want to view the full text of an article.30

Clearly, newspaper publishers see their back files of texts, pictures, etc. as assets. Some see it as possible to earn revenue from the information previously published online. This is a positive development: the more publishers do this, the longer they do it, and the more the back files are used, the greater the need for publishers to keep an archive of their own to make available to users. Perhaps in the world of libraries and archives we need to be alert to the opposite happening. When a publisher decides to abandon access to its files of older online issues, the future preservation of these files may be in question, and a library or an archive may need to act to take a copy of the files for future public availability.

In a real sense, the creation of intermediary websites now plays a role in finding information about newspapers online. An obvious example is the Wikipedia entry for Online Newspaper Archives.31 This contains quite a lot of information, broken down by country, and the list indicates whether each file is freely available or whether there is a charge to access the data files. The list remains relatively small at present compared to the world newspapers available, and the quality of information is known to be variable, but it is possible for us all to contribute. This may be useful, because we all want to avoid duplication of digitisation effort in creating digital copies of back runs of newspapers published in our country of origin. It is in our interest to make contributions, or alternatively to make links to our catalogues of newspapers and any digital copies of them, to ensure that we all share information. It also costs nothing except the time of the contributor to Wikipedia. By contrast, if no list of newspapers already exists in a library, or the list is rudimentary, then creating lists of digitised newspapers within libraries and maintaining the availability of the lists in-house might cost more over time.

Other organisations, including professional bodies, offer less help in the search for information on archiving newspapers. The World Association of Newspapers has a good website. However, information about archiving newspapers is not readily available at this site. Emphasis is placed instead upon the achievements of printed newspapers. There is some discussion regarding subscriptions and making content pay.32 One might have expected that some sites relating to newspapers would be within the Internet Archive. However, a search of the site shows many references to newspapers but little information on the archiving of newspaper websites.33

______________
20 See: Western European newspapers and their online revenue models: An overview, by Valérie-Anne Bleyen and Leo Van Hove. First Monday, Volume 12, Number 12, 3 December 2007. http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/2014/1899 visited 11.1.2008.
21 See: http://www.buenosairesherald.com/the_world/note.jsp?idContent=439541 visited 11.11.2007.
22 See: http://blogs.ft.com/undercover/2007/10/undercover-ec-1.html/ visited 11.11.2007.
23 See: http://www.buenosairesherald.com/ visited 11.11.2007; and http://www.buenosairesherald.com/the_world/note.jsp?idContent=455750
24 See: http://www.pakistanoverseas.com/ visited 11.11.2007.
25 See: http://www.nashacanada.com/index.htm/ visited 11.11.2007.
26 See: http://archive.theboltonnews.co.uk/ A Gannett Company. visited 13.11.2007.
27 See: http://www.harrogateadvertiser.net/ArticleIndex/Listmonths.aspx Johnston Press Digital. visited 13.11.2007.
28 See: http://cgi.bowesonline.com/pedro.php?id=208&x=archives Published by Bowes Publishers Ltd. visited 13.11.2007.
29 See: http://www.scmp.com/portal/site/SCMP/menuitem.cfae3624a26f436a2455841053a0a0a0/?s=idx_Search&ss=Advance+Search visited 13.11.2007.
30 See: http://newsstore.smh.com.au/apps/newsSearch.ac?/index.html visited 13.11.2007.

8

National Libraries

In recent years, national libraries round the world have instigated work to secure copies of online files, including newspapers. Progress has been made. The PADI initiative of the National Library of Australia provides a useful portal to further information.34 Under the Policies, Strategies, Guidelines page at this website, one can see a large body of information about digital preservation.35 Also, the page containing Projects and Case Studies refers enquirers to many reports on actual experience of digitisation over the last few years.36 The National Library of Australia’s Digital Preservation Policy is available to guide others who may now be starting their own digitisation programme.37 In my own Library, the British Library, there is a statement on Digital Preservation, with related links.38

Information on the digital archiving of older newspaper files digitised from microfilm might be expected on those websites where work has taken place on the digitisation of older newspapers. For the Nordic countries’ TIDEN project, the assumption is that microfilm of newspapers remains the principal preservation medium.39 The US National Digital Newspaper Program has a Preserving Newspapers webpage, but there is as yet no information on how the files of newspapers digitised from microfilm are to be preserved.40 The Digital Preservation Coalition (mainly UK based) provides a focus for the efforts of organisations to ensure longevity for digital files of all kinds. Their Handbook provides a guide for this.41

______________
31 See: http://en.wikipedia.org/wiki/List_of_online_newspaper_archives visited 13.11.2007.
32 See: http://www.wan-press.org/recherche.php3?recherche=archive visited 13.11.2007.
33 See: http://www.archive.org/details/texts visited 13.11.2007.
34 Preserving Access to Digital Information. See: http://www.nla.gov.au/padi/ visited 25.11.2007.
35 See: http://www.nla.gov.au/padi/format/policy.html visited 28.11.2007.
36 See: http://www.nla.gov.au/padi/topics/68.html#case visited 29.11.2007.
37 See: http://www.nla.gov.au/policy/digpres.html visited 29.11.2007.

9

Legal Deposit and Electronic Legal Deposit

Many countries have created laws which provide for the deposit of printed newspapers with national libraries or with archives. In the UK, the statutes of 1911, 1956, 1988 and 2003 have all developed the concept of copyright. The Act of 2003 updates the provisions for the deposit of printed publications and also provides for the future deposit of electronic publications, via the development of future regulations.42 The Working Group of the Conference of Directors of National Libraries (CDNL) published its findings on the Legal Deposit of Electronic Publications, and the report contains a summary of the situation at the time of its finalisation in December 1996. Appendix B gives a summary of the situation worldwide, country by country. The whole report deserves wider attention, as many of the points made in the Appendices remain relevant today.43

10

Content Conversion of Older Newspapers

Initiatives have quickened all over the world in the last few years. The details of projects, frequently undertaken at the national level with public funding, are the subject of many papers, some of them at this conference. The UK, USA, France, Australia, Spain, the Netherlands and many others all have active ongoing projects to convert older newspaper texts into digital format. What is truly remarkable is the ability of current optical character recognition software to capture correctly, at least to some extent, scanned texts that were originally printed with less than “good” quality. What are currently being sought by all are improvements to:
• the scanning process
• segmentation techniques, especially where (as is the case with newspapers) the original layout is complex
• character recognition techniques
• recognition of words
• software that presents the mass of texts to users via web pages
• the searchability of the texts by users

I am confident that such improvements will come about. An example of work that is about to begin is the EU-funded Project IMPACT.44

______________
38 See: http://www.bl.uk/aboutus/stratpolprog/ccare/digpres/index.html visited 29.11.2007.
39 See: http://tiden.kb.se/Project.htm visited 29.11.2007.
40 See: http://www.loc.gov/preserv/care/newspap.html visited 29.11.2007.
41 See: http://www.dpconline.org/graphics/handbook/index.html visited 29.11.2007.
42 See: http://www.england-legislation.hmso.gov.uk/acts/acts2003/ukpga_20030028_en_1 visited 29.12.2007.
43 See: http://www.unesco.org/webworld/memory/legaldep.htm#AppendixB visited 29.12.2007.

11

Conclusion

Whilst the library community has articulated policies for digital archiving, there is but little evidence that there are sufficient public funds available in many countries to carry out as large a programme of digital archiving as would be wished. Specifically with regard to the world of newspapers, events have moved very fast to produce current online newspapers in the digital environment. There has also been significant public and business investment in digitising older runs of newspapers. The impact of all these multifarious developments upon our library community is already significant, in so far as many libraries are having to work hard to keep up with the pace of developments. Overall, there is but little public evidence of archiving of current newspapers being done. Further projects are needed, possibly country by country, to list activity relating to digital preservation of online newspapers, possibly making this information available via websites which themselves can be updated on a regular basis. Above all, doing nothing is no longer an option.

As a result of doing work for two large-scale historic newspaper digitisation projects within the British Library over the last three years, my conviction has grown in a number of ways:
• that we shall all have to work hard in years to come to ensure that the digital objects are preserved, for newspapers, or for any other publication created in a digital format
• that this digital preservation activity will apply both to current online newspapers and to older printed newspapers
• that, where possible, we should always remember the origin of the printed texts, or of other media (e.g. sound, maps), that we have converted; and plan to factor into digital repositories all relevant information relating to these earlier acts of publication
• that we strive to retain the vision that all printed and digital productions are part of the wider cultural heritage that we need to preserve for others to enjoy in future
• that digital repositories based in libraries and archives will need to grow rapidly in size to accommodate the very large quantities of digital information that will come to be stored in them.

______________
44 See: http://impact.gdz-cms.de/ visited 6.1.2008.

NEWSLINK 2.0: MAJOR ISSUES IN THE DEVELOPMENT OF THE SPH MULTIMEDIA NEWS ARCHIVES

Tay Sok Cheng, Sebastian Chow and Ben Lim
Singapore Press Holdings

Abstract

Singapore Press Holdings (SPH) publishes 12 newspapers in Singapore. Since 1990, the text archives of these newspapers have been available for public access from SPH’s commercial database, Newslink. News articles from selected papers are also available from aggregators such as LexisNexis, Factiva and Newsbank. Although the archive is a useful reference database, it does not allow users to view the accompanying images or the “play” of the story in the context of the full page. This contextual information is important for researchers who need to study the way information is presented in the media. SPH recently developed a multimedia archive enabling users to access the full page of the newspaper content, including the display advertisements. Users will be able to search the respective information objects, such as text, photos, infographics, advertisements and full-page PDF files, from a single search interface.

This paper highlights the rationale for the multimedia newspaper archive and the opportunities it offers for future development. It also discusses the flow of information from the time it is created in the newsrooms until it is finally archived. The information flow, which is determined by the editorial workflow, has a great impact on the architecture of the archive and on the way it organises search and access rights. Finally, the paper discusses the major issues faced by the development team, which comprised representatives of the major stakeholders, namely the library, the business unit and the information technology division. These issues include incorporating the varied responsibilities and interests of each stakeholder in the selection and application of technology, the organisation of the information, and the business model for serving the needs of their clients.

1 Introduction

Singapore Press Holdings Limited (SPH) is a media organisation with businesses in print, Internet, new media, radio, outdoor media and property. In Singapore, SPH publishes 14 newspapers in four languages. Besides print, there are online editions of its key newspapers. In keeping with industry trends, SPH continues to expand its online and new media initiatives. This results in the production of intellectual content in different media and formats. The SPH multimedia news archives has its humble origins in the news clippings and photo prints collections. These were organized and housed in the library as early as the 1950s. Advancements in computing technology made it possible to implement systems for the electronic archive of published news stories and photographs and to improve on these systems over the last 20 years.


Cheng, Chow, Lim: Newslink 2.0: Major Issues in the Development ...

This paper documents the development of electronic news archiving at SPH, from a simple text retrieval system for in-house users to multimedia archives for in-house (editorial staff) and external (subscriber) users, and provides insight into the main considerations, issues and concerns at each stage.

2 SPH News Database

In July 1989, the Library implemented its first text archival system, LASR. This was followed by an image archival system, Pictoria, in 1995. These were completely separate systems, each with its own mode of access. Both systems were replaced by NICA (Networked Interactive Content Access) in 2005. NICA, a system developed by IBM, supports the archival of text, images and PDF pages within the same system, requiring only a single access point. NICA has many new features that overcome the main limitations of LASR and Pictoria.

LASR

SPH had acquired an editorial system developed by System Integrated Inc (SII) in 1985. There were no archival systems available then that could integrate with SII. In 1989, SII developed an electronic text database known as LASR (Library Archive, Search and Retrieval). LASR was the Library’s first text database, residing on the SII system. SPH English and Malay newspaper articles were archived in LASR. The Straits Times was the first publication available in LASR, starting from the 1 July 1989 issue. The current day’s articles were archived in LASR only at the end of the day. During the day, Library staff had to meta-tag the articles (e.g. byline, headline, column) before they were uploaded to LASR. The task of meta-tagging at least 4 newspapers was usually completed at the end of the day.

Main limitations of LASR

1) LASR resided on SII, which is a proprietary system. This meant that only internal staff had access to the database. To enable SPH to sell its database of news stories to subscribers, the company embarked on an in-house project to develop a system called Newslink. The current day’s stories from LASR were uploaded to Newslink the next morning and made available to subscribers. Newslink was launched in 1992. It is a web-based system. Its search functions and interface were more user-friendly than those of LASR. It could be accessed by both SPH staff and external subscribers.

2) LASR was not a live system. Amendments to stories with errors could not be dealt with immediately. The story containing the error had to be taken out of LASR, the correction text added, and the story uploaded to LASR again. All corrected stories were batch uploaded to Newslink the following day. There was a delay of at least one day before users were able to view the published correction text added to the story. Meanwhile, users were exposed to the possibility of getting “wrong information”.


PICTORIA

In 1995, SPH worked with an external party, the Information Technology Institute, an arm of NCB, to develop an image database known as Pictoria. Pictoria was used to archive published pictures and selected unpublished pictures. As LASR and Pictoria were completely different systems, a user who wanted to find a photo that was published with a particular story had to use two separate systems. An attempt was made to overcome this limitation by linking the thumbnail image to the news story in Newslink through a semi-automated process. This effort was abandoned after a couple of years as the linking process could not be fully automated and was too time-consuming. The attempt to enhance Newslink to include images failed because the editorial system (SII) was incompatible with the archival systems (LASR and Pictoria). Overall, there was a need for an integrated system that would support story-image links, as well as live updates of corrections made to published stories.

NICA

In 2004, SPH started looking into revamping its editorial and archival systems. Unlike SII, which is mainly for story writing and editing, the new editorial system would have the full suite of functions, from story assignment to story preparation to page layout. Back-end systems had also developed rapidly and could archive different types of resources on the same platform. Most new archival systems can be integrated with editorial systems. Meta-tagging and other inputs previously done by the library staff could be incorporated into the work processes at the front end. All these point to a more efficient workflow and better data integrity in the archives, as there is less room for human error in the flow of editorial data to the archival system. In Oct 2005, NICA was launched. It replaced LASR and Pictoria. NICA was selected as it was able to deliver many new features on top of the existing ones in the old systems.

Some of the new features in NICA that were not possible in the old archival systems are:

• All the archives are available in the same system. Page, text and image archives are now accessible from the same system.
• NICA is integrated with Hermes, the new editorial system. Text metadata properly tagged at editorial is correctly mapped to NICA.
• Page archive is made possible by the integration of Hermes and NICA. Pages are in PDF and the editorial content is free-text searchable. Users can now view the story/page layout at their workstation.
• NICA links the page to the stories and images. It also establishes a link between a story and its relevant images and vice versa.
• NICA is a live system. Users can retrieve the current day’s pages, stories and images on the same day.
• Correction text added to stories is immediately reflected in NICA.
• NICA archives the published version of photos in the photo archive. These versions are archived as sub-records of the original photo.
• NICA removes the high-resolution version of photos where the usage conditions do not allow re-use and retains the thumbnails for reference purposes.
• The metadata fields of the photos are IPTC compatible. This facilitates the selling of photos to resellers as the metadata values can be exported and are readable using applications like Photoshop and Photostation.
• The text in infographics is searchable.
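The page-story-image links described in the feature list can be pictured with a small sketch. This is only an illustration of bidirectional linking in general, not a description of how NICA stores its data; the class name, method names and identifiers such as "page-A1" are hypothetical.

```python
from collections import defaultdict


class LinkIndex:
    """Toy index of page -> story and story <-> image links (hypothetical model)."""

    def __init__(self):
        self.stories_by_page = defaultdict(set)
        self.images_by_story = defaultdict(set)
        self.stories_by_image = defaultdict(set)

    def place_story(self, page_id, story_id):
        # Record that a story appears on a given page.
        self.stories_by_page[page_id].add(story_id)

    def link(self, story_id, image_id):
        # Store the link in both directions so that either the story
        # or the image can be the starting point of a lookup.
        self.images_by_story[story_id].add(image_id)
        self.stories_by_image[image_id].add(story_id)

    def images_for_page(self, page_id):
        # Collect every image linked to any story placed on the page.
        images = set()
        for story_id in self.stories_by_page.get(page_id, set()):
            images |= self.images_by_story.get(story_id, set())
        return images


# Hypothetical identifiers, used only to exercise the sketch.
index = LinkIndex()
index.place_story("page-A1", "story-1")
index.place_story("page-A1", "story-2")
index.link("story-1", "photo-10")
index.link("story-2", "photo-10")
index.link("story-2", "graphic-3")
```

Because every link is written in both directions, a wrong link entered once shows up from both the story side and the image side, which is why the quality of the linking information supplied by editorial matters so much.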

While NICA has been able to provide a number of technical and workflow solutions that the earlier systems could not, 3 main issues remain:

1) Copyright. Regardless of how sophisticated the back-end system is, the accuracy of some values is still dependent on the source, especially in the case of copyright information. The copyright value in NICA is used by Newslink to upload SPH stories, as subscribers to Newslink have access to SPH copyright stories only. Stories with SPH copyright are also sold to aggregators like LexisNexis and Factiva. The editorial staff have to enter the correct value at the front end to indicate whether SPH holds the rights to a particular story. A mistake in tagging at the front end would result in non-SPH copyright materials being made available in Newslink and in third party databases.

2) Data object linking. NICA uses information provided by editorial to establish the link between related stories and images. Any error in the link information will result in the wrong photo or graphic being linked to the story, and this will be reflected when the stories and images are uploaded to the subscription database Newslink 2.0.

3) Not everything published in print is archived electronically. Non-editorial pages such as classifieds and “recruit” pages are not archived in NICA. Decisions on what to archive are essentially driven by demand, either from editorial or from subscribers. Consequently, content for which there is no demand does not get archived. As such, the electronic archive cannot be used as a complete replacement for all that has been published in print.
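The copyright issue can be illustrated with a minimal sketch of an upload filter driven by the copyright value. The record layout, the field name copyright_holder, the function select_for_upload and the sample stories are all assumptions made for illustration; the paper does not describe the actual NICA fields or upload logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    story_id: str
    headline: str
    copyright_holder: str  # value entered by editorial staff at the front end


def select_for_upload(stories, holder="SPH"):
    """Split stories into those tagged with the given holder and the rest."""
    wanted = holder.strip().upper()
    upload, held_back = [], []
    for story in stories:
        # Blank or unknown values are held back rather than published.
        if story.copyright_holder.strip().upper() == wanted:
            upload.append(story)
        else:
            held_back.append(story)
    return upload, held_back


# Hypothetical sample data.
stories = [
    Story("s1", "Local council approves new park", "SPH"),
    Story("s2", "Markets rally overseas", "Wire agency"),
    Story("s3", "Untitled", ""),
    Story("s4", "School sports results", " sph "),
]
upload, held_back = select_for_upload(stories)
```

A filter of this kind can only act on the value it is given. If a wire story is tagged "SPH" by mistake at the front end, it passes the check, which is exactly the risk described under issue 1.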

3 Newslink 2.0

In 2007, the subscription database Newslink was revamped and replaced by Newslink 2.0. Newslink was a text database, whereas Newslink 2.0 offers multimedia content: text articles, infographics, photographs, advertisements and full page PDFs. Newslink 2.0 search technology is powered by FAST Search and the system is built using Java.

Why Newslink 2.0

The new system was needed for the following reasons:

• To replace an obsolete technology
• To enhance search functions
• To enable e-commerce
• To enable the sale of multimedia content
• To enrich subscription models

Data flow from NICA to Newslink 2.0

NICA resides in the corporate network zone whereas Newslink 2.0 is in the internet network zone. The separate zoning is essential for security reasons. In addition, the servers and operating systems used by NICA and Newslink 2.0 are different. The main challenge was the transfer of content, including digital photos and PDFs, from NICA to Newslink 2.0. This had to be done daily within a short time window and with minimal manual effort.

Reasons for choosing FAST Search

Firstly, FAST Search is a listed company and is highly rated on the Gartner Magic Quadrant. It has delivered more than 3500 installations, many at Fortune 500 and Global 2000 companies. To date, none of these customers has dropped out. It has delivered the FAST ESP technology to many media companies like SPH worldwide, which shows that FAST has the experience needed to deliver our project. It has a big research team, which means that the product will continue to be enhanced and is very likely to stay relevant in the industry. Lastly, it provides local support.

Major Considerations in the Design of the System

The major considerations in the design are as follows:

• Catering to potential future business requirements. The system has to be designed so that it already contains functions that can be turned on in the future. This will substantially reduce the cost of development when business plans change.
• Catering to other digital content types such as audio and video files and many other formats. This is a necessary design consideration to allow for future enhancements.

New Features

• Newslink 2.0 was built on the fundamental principle of enhancing the probability of successful searches. Thus, it was designed with a relevancy model in order to return relevant results to users. In addition, a few core navigators were built to help users zoom in on their searches.
• The new system provides search on text content, digital images, full page PDFs and advertisements. Users can either search across all these types of content or zoom in to search on any one of these content types.
• Newslink 2.0 provides a single platform for search on English, Malay and Chinese content.
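The idea of searching across all content types while offering a navigator to narrow the results can be sketched as follows. This is a plain in-memory keyword match written for illustration; it does not use or imitate the FAST ESP API, it has no relevancy model, and the records, field names and the search function are hypothetical.

```python
from collections import Counter

CONTENT_TYPES = ("text", "photo", "infographic", "advertisement", "pdf")

# Hypothetical archive records; "body" stands in for whatever text is indexed.
ARCHIVE = [
    {"id": 1, "type": "text", "body": "Budget speech highlights"},
    {"id": 2, "type": "photo", "body": "Minister delivers budget speech"},
    {"id": 3, "type": "pdf", "body": "Front page: budget special"},
    {"id": 4, "type": "advertisement", "body": "New car promotion"},
]


def search(query, content_type=None):
    """Return (hits, navigator), where navigator counts hits per content type."""
    if content_type is not None and content_type not in CONTENT_TYPES:
        raise ValueError(f"unknown content type: {content_type}")
    term = query.lower()
    matches = [r for r in ARCHIVE if term in r["body"].lower()]
    # The navigator is computed over all matches, so the user can see
    # how many results each content type would give before narrowing.
    navigator = Counter(r["type"] for r in matches)
    if content_type is not None:
        matches = [r for r in matches if r["type"] == content_type]
    return matches, navigator


all_hits, navigator = search("budget")
photo_hits, _ = search("budget", content_type="photo")
```

Computing the navigator before applying the content-type filter is a design choice of this sketch: it keeps the counts stable while the user switches between content types.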

Managing Access

There are a few subscription packages and each one has different access rights to the various types of content. Thus, it was necessary to control access to the different types of content based on packages. In addition, access rights also differ by type of subscriber. As such, it was critical to specify the business rules clearly. Numerous test cases were created to ensure that all the business rules were implemented correctly, that access controls worked as intended, and that accounting and billing were correct and accurate.
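A minimal sketch of package-based access control, together with the kind of table-driven test cases mentioned above, might look like this. The package names and the rights assigned to each are invented for illustration and are not SPH's actual subscription packages or business rules.

```python
# Hypothetical packages mapped to the content types they may access.
PACKAGE_RIGHTS = {
    "text_only": {"text"},
    "schools": {"text", "photo", "infographic"},
    "advertising": {"text", "advertisement", "pdf"},
}


def can_access(package, content_type):
    """Return True if the package grants access to the content type."""
    # Unknown packages get no rights at all (deny by default).
    return content_type in PACKAGE_RIGHTS.get(package, set())


# Each business rule is written down once as (package, content type, expected).
TEST_CASES = [
    ("text_only", "text", True),
    ("text_only", "pdf", False),
    ("schools", "infographic", True),
    ("schools", "advertisement", False),
    ("advertising", "pdf", True),
    ("unknown_package", "text", False),
]

failures = [
    (package, content_type, expected)
    for package, content_type, expected in TEST_CASES
    if can_access(package, content_type) != expected
]
```

Keeping the rules in one table and the expectations in another makes it easy to re-run every case whenever a package changes, which is the purpose the paper gives for its test cases.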


Meta Data

Understanding the metadata relating to the content and how it is used was very important. First, knowing the metadata well helps to define the search capabilities to be built. Second, it helps in designing the system to fulfil business requirements such as those arising from copyright issues. Third, it helps in deciding which metadata is required, rather than taking all the metadata that is available. This helps to prevent redundancy and wastage.

Weaknesses

Newslink 2.0 was built using Java technology. As such, changes in business requirements or future enhancements will require more time compared with development using Perl or PHP.

4 Business Considerations

Besides technical issues, Newslink 2.0 has to address needs and concerns arising primarily from its key groups of users.

Advertisers/Advertising Agencies

Advertisers and advertising agencies need PDF access to advertisements published in the newspapers to keep abreast of marketing developments by their clients, and to monitor advertisements in general for account pitching purposes. Cost considerations are very important to this group in view of the competitive nature of the industry, which has resulted in razor-thin margins. They would like to subscribe to the service as cheaply as possible. However, the cost of archiving high resolution PDFs is very high, and the main issue here is to keep costs down to encourage more retrieval usage while providing an efficient and reliable platform for the advertisers’ use. In response to the advertisers’ need for an advertisement monitoring service, the decision was made to archive PDFs in high resolution rather than low resolution, despite the higher costs. The hi-res PDFs also cater to the advertising agencies’ need to download their clients’ advertisements for billing and monitoring purposes.

Researchers, teachers/schools

For researchers from government agencies, research houses, etc., basic text access with editorial images is sufficient for their daily research needs. These subscribers do not need PDF access and as such, the cost of accessing Newslink 2.0 can be kept low for them. However, there were requests from schools for images and infographics to be included for the purpose of classroom teaching. This was especially so for the lower secondary schools and upper primary schools, where images speak louder than words. Access to a longer period of archived material was preferred for students’ research. As a result, 3 years of image archives were included in Newslink 2.0 in addition to the text articles. The availability of PDFs and photos has enabled schools to use Newslink 2.0 as a resource for classroom teaching, leading to increased subscriptions.

Other measures introduced to make Newslink 2.0 an attractive resource for schools include keeping costs low through a yearly subscription for unlimited usage. Schools are also provided with a seamless login on the school intranet so that students and teachers can conveniently access the service, even from their homes. A translation/voice-over service for selected articles from Chinese to English is another feature that is popular with schools. To ensure that schools maintain a high usage of Newslink 2.0, resellers are enlisted to help monitor school usage. If usage drops, the resellers will try to boost the use of the service by organising games, word puzzles, etc. that encourage students to access Newslink 2.0.

Challenge of “free news”

One of the biggest challenges in marketing Newslink 2.0 is the availability of free news. Hence, additional services such as NewsTracking have been introduced to make subscription to Newslink 2.0 more attractive.

5 Other Issues

Reproductions

To meet the demand for the purchase of photos, online sales enquiry forms are included in Newslink 2.0. Subscribers can run their own keyword searches for photos and submit their selections online. Photos selected for purchase can either be mailed to them or collected in person. Interested parties who wish to reproduce SPH articles online or in print to advertise their products and services can now apply for content licensing online. There is a growing need for this, especially from the medical industry, whose members are not allowed to advertise.

Accuracy in Content Identification

Editorial and archival requirements dictate that SPH provides only its own copyrighted articles and that they are correctly classified and captioned, so as not to fall foul of copyright laws. As such, workshops have been conducted internally to stress the importance of correct classification with the correct headline and caption, especially for photos.

6 Next Phase

To meet the next phase of user needs, files of archived material on topics in popular demand, such as company news, property news and motoring, will be included. Users need not do their own searches, as the files will be compiled for them to subscribe to. Other plans include making available the PDF versions of the other flagship newspapers like The Business Times and Lianhe Zaobao. Currently, only The Straits Times is available in PDF in Newslink 2.0. The archive period of the PDFs, currently two years, may be extended in time to come if there is demand for a longer PDF archive.


As a subscription service, the future development of Newslink 2.0 continues to be very much driven by user demands.

Architecture of Newslink 2.0

ALL NEWS BUT NO PAPER – HARVESTING SWEDISH ONLINE NEWSPAPERS Pär Nilsson National Library of Sweden

Abstract The National Library of Sweden has been harvesting Swedish web sites since 1997 in a project called Kulturarw3. The harvester has so far completed 14 collection cycles. In September 2004 a daily harvesting of the web sites of Swedish newspapers was begun. This paper will examine the results so far of the harvesting process for the Swedish newspaper sites, compare the results with those achieved by the Internet Archive and discuss some ideas about the future development of this work and the long-term research needs which have to be met. As a background for this examination the paper will also give an overview of the online practices of Swedish newspapers, the relationship between the printed newspapers and their websites and the archiving practices of the newspapers themselves. An alternative to both the printed newspapers and their online versions is the use of reading devices based on emerging e-paper technology. No Swedish newspaper is using this alternative today, but seven major newspaper publishers together with the Swedish Newspaper Publishers' Association, Halmstad University and the Royal Institute of Technology cooperated in a nationwide project in this area, examining reader responses, technology issues, business and distribution models, etc. The project was part of the European DigiNews project. This paper will give an overview of the results and discuss the implications of this technology for long-term preservation and access in libraries.

1 Introduction

Sweden has 9 million inhabitants and about 200 newspapers, with circulations ranging from 2000 to 475 000 copies per day.1 The titles represent a broad spectrum, ranging from local, sometimes free, papers to national dailies. Average circulation is quite small, about 22 000 daily copies, but in terms of daily circulation per 1000 adults, Sweden is in second place in the European Union (EU) with 489 copies, surpassed only by Finland with 522 copies. Of the total circulation, 79 per cent are subscribed newspapers, compared to the EU average of about 50 per cent.2 There is a growing number of free newspapers, especially in the city areas. One of the free papers in Sweden, Metro, has actually spread over a large part of the world and now has 70 editions in 23 countries and 19 languages.3 The responsibility for archiving all printed material published in Sweden rests with the National Library of Sweden (NLS). The Swedish legal deposit law4 was last changed for
______________
1 http://www.kb.se/soka/bibliografier/tidningar-tidskrifter/nyalundstedt-tidningar/
2 http://www.sweden.gov.se/content/1/c6/01/90/32/c6a0f7aa.pdf
3 http://www.metro.lu/
4 http://www.notisum.se/rnp/sls/lag/19931392.htm


Nilsson: All News But No Paper

printed newspapers in 1979, when a separate copy to be used for microfilming was introduced and legal deposit of news bills was included. Despite the changes in the newspaper business, it has not been necessary to amend the law since then. The present Swedish legal deposit law includes all editions of the newspaper, news bills and supplements. Since the NLS both keeps the original newspaper and produces a surrogate in the form of microfilm, we can be quite sure that we can offer a complete and authentic picture of the printed newspapers, now and for a long time to come.

The definition of a newspaper used in the Swedish legal deposit law follows that of the International Organisation for Standardisation (ISO), which is: “Newspaper: Serial Publication which contains news on Current events of special or general interest. The individual parts are listed chronologically or numerically and usually appear at least once a week. Note - newspapers usually appear without a cover, with a masthead, are normally larger than A3 (297mm x 420mm) in size.”5

2 From One Channel to Many

The ISO definition is physically oriented and naturally based on the traditional printed version. It is of course still valid for the printed newspaper, although there have been changes to the format, so that most Swedish newspapers today are about A3-size. But at the same time we are well aware that we can no longer limit ourselves to collecting just the printed newspapers, if our aim is to preserve at least samples of everything that the newspaper companies publish today. Newspapers are certainly bound by traditions and user habits, but they are now also, by necessity, on the lookout for new ways to stay in business. It is no doubt the case that many newspaper companies in Sweden and elsewhere have stopped thinking about themselves as only providers of news in printed form. Instead they are investigating new methods for publishing their content and trying to make a profit in doing so. One example of this is the Swedish national daily Dagens Nyheter (Daily News), commonly abbreviated DN, which recently introduced the “DN cell phone”.6 The cell phone is offered only to subscribers of the paper and the price plan lets the user surf the newspaper’s cell phone friendly web pages at a fixed monthly cost using mobile internet, but the phone can of course also be used for viewing other web pages and making calls at the regular rates. According to the editor-in-chief of Dagens Nyheter, the paper wants to give its readers the opportunity of using the newspaper’s content in three different ways: on paper, through traditional web browsing and via the cell phone. These three channels are viewed by the paper as representing three different paces, covering the whole day of a newspaper consumer.

Many of the larger Swedish newspapers publish a light version of their web site for mobile phones. A poll7 conducted by the Swedish daily Aftonbladet showed that only 13 % of the respondents used their cell phones for viewing web pages, but mobile browsing is probably a rapidly growing market with possible advantages for the newspapers in the form of targeted advertisements. If users could be made to register some basic facts about
______________
5 Quoted from “International guidelines for the cataloguing of newspapers” by Hana Komorous and Robert B. Harriman,
6 http://www.dn.se/DNet/jsp/polopoly.jsp?d=147&a=723883
7 http://www.aftonbladet.se/pryl/article1709795.ab


themselves, it would for example be possible to reach potential customers by text messages in a very easy manner. This is also a market with its own problems, as is exemplified by the current conflict8 in Sweden between one service provider (a phone company) and the content providers (newspapers). The service provider in this case uses a web browser which is supposed to adapt the web pages of the newspapers for mobile browsing. The newspapers, however, see this as unwarranted manipulation and are critical of the way in which the service provider inserts advertising not approved by the content providers. 27 Swedish newspapers have decided to block access to their web pages for this service provider. There is a similar conflict in Norway and the struggle between service and content providers in the mobile market will probably continue.

If the newspapers in Sweden have not yet reached as many readers through the mobile platform as they would like to, that does not mean that their web sites are not used. At the end of January 2008 there were 15 newspaper web sites among the 100 most popular web sites in Sweden, if networked sites like msn.se are excluded. Among the 25 most popular sites there were 6 newspaper sites. Number one on the list was Aftonbladet, which had about 4.1 million unique visitors during one week.9

3 Are There Any Online Newspapers?

I have used the expression “newspaper website” instead of “online newspaper” because the latter is a bit misleading. As the aforementioned ISO definition describes it, a newspaper is a periodically printed and issued publication containing news material. The website of a newspaper may contain a large part of the text and image material used in the printed paper, but it is also often used as only a “teaser” for the printed newspaper, containing just some articles, and it is sometimes only a contact page with services for the print subscribers. In the latter case, it would be misleading to call it an online newspaper. I believe it is difficult to define how much material is needed online in order to define the website of a newspaper company as an online newspaper. What factors determine how much of the content of the printed Swedish papers is put online? One important factor seems to be the frequency and circulation of the newspaper. The more frequently issued and larger newspapers often put more material online. They are of course often better equipped when it comes to staff and technical equipment, and so are more able to run a large web site efficiently. On the other hand, the web site is certainly an opportunity for the non-daily newspapers to keep in contact with their readers between issues, but at the same time they need to limit the volume of news they publish, so as not to undermine the paper version of the newspaper. So, although there still are exceptions, such as the daily Jönköpingsposten, most Swedish newspapers publish quite a lot of the material from the printed paper on the web, very often mixed with web-only content, such as video clips, comments on articles, blogs, dynamic or interactive content, etc.
All these new features contribute to the difficulty of viewing the newspaper website as an online newspaper and in many ways make it difficult to define what separates newspaper websites from, for example, the websites of news oriented TV or radio networks.

______________
8 http://www.dagensmedia.se/mallar/dagensmedia_mall.asp?version=149317
9 http://www.kiaindex.org/


The video clips or web TV are in many cases material which comes from international providers and is more or less focused on celebrities and gossip, but there is a growing interest in producing video clips locally at the newspaper and journalists are being trained to handle this medium too. Actual live transmissions from newspapers on the web are still very rare and would in many ways be contradictory to the on-demand character of web publishing. But the newspapers’ interest in traditional broadcast TV is apparent. The large regional daily Norrköpings Tidningar has been broadcasting in the local cable network since October 2006.10 The paper is also reusing the same material for web TV and podcasts. Recently a number of newspaper companies also applied for new TV licenses from the Swedish Radio and TV Authority, to broadcast both national and regional programs in the now completely digital terrestrial network.11

The producers of printed newspapers are still the strongest force in news publishing on the Swedish web and we have so far only seen a handful of new Swedish web-only news sites which come close to the web sites of the printed newspapers. One example is the local news site Jnytt,12 which was started in 2006 and was recently acquired by the aforementioned Jönköpingsposten, probably to make up for its own lack of presence on the web. Some of the other web-only news sites are closely affiliated with newspaper web sites (e.g. www.e24.se) or, in some cases, mostly aggregators of news from other sources, with little or no original news material (e.g. www.dagensps.se).

4 The Future of Newsprint

What, then, will the future be like for the newspapers? In August 1998 Jakob Nielsen predicted that “most current media formats will die and be replaced with an integrated Web medium in five to ten years” and that “around 2008, all computer users will prefer using the Web over reading printed pages”.13 Although many months of 2008 still remain, we can safely assume that some printed newspapers will still be published in December. But, as I mentioned earlier, it can be shown statistically that Nielsen wasn’t completely wrong. Text rich web sites like those of the newspapers are widely used, reluctance to screen reading is probably decreasing rapidly and for longer texts you can always do your own print-on-demand. I believe that Nielsen’s mistake is the thought that one medium or channel completely and dramatically replaces another, what Paul Duguid has called “supersession”.14 In some ways the tasks of a national library would be much easier if a new technology simply superseded the old one in newspaper publishing, but I believe that the next decade will be one of multiple and mixed channels for the press. So if the printed newspaper survives 2008, how long will it be present to some degree? Predictions about the death of the printed newspaper differ widely and are probably nothing more than qualified guesswork. The well known Swedish professor of media economics Karl-Erik Gustafsson predicted in 2006 that the last printed newspaper in Sweden will be published in the first quarter of 2081.15 In his book The Vanishing Newspaper Philip Meyer prophesied that newsprint in America will finally die in the first quarter of 2043.16
______________
10 http://www.24nt.se/
11 http://www.rtvv.se/se/Press/Nyheter/080207/
12 http://www.jnytt.se/
13 http://www.useit.com/alertbox/980823.html
14 http://www2.parc.com/ops/members/brown/papers/mm.html
15 http://blogg.svd.se/reklamochmedier?id=1957
16 Quoted from The Economist 26 August 2006


But for some printed papers the end may be much closer. In an interview with the Israeli newspaper Haaretz in February 2007 Arthur Sulzberger, owner, chairman and publisher of the New York Times, said: "I really don't know whether we'll be printing the Times in five years, and you know what? I don't care either" and he went on to say that the "Internet is a wonderful place to be, and we're leading there". Sulzberger also mentioned the huge investments needed when building a new printing plant, in the case of his paper 1 billion dollars. Compared to this, the development of even very ambitious web sites can be done quite cheaply.17 For one newspaper the future will arrive on April 30, 2008. The Capital Times (Madison, Wisconsin) will be the first daily paper in the USA “to go digital”, according to the paper’s associate editor John Nichols.18 The paper will still print two weekly sections, which will be distributed with another paper and also offered free in newspaper racks, but the daily updates will be online only.

5 Economic Factors

So far no printed Swedish newspaper has decided to stop printing, but the cost of building new printing plants has been mentioned as one factor that could influence newspapers to publish exclusively online. Another area of importance in the Swedish context may be the system of subsidising newspapers. (The market consequences of the system have recently been analyzed by Karl-Erik Gustafsson).19 The system of state subsidies was introduced in the 1960s and 1970s. It has been argued that the subsidies were of a political nature in the beginning, but they have later certainly contributed to safeguarding the diversity of the Swedish newspaper market. The subsidies are handled by The Press Subsidies Council.20 There are two kinds of subsidies, one for distribution and one for production. About 140 Swedish newspapers receive distribution subsidies, which are used for joint distribution, and around 80 papers receive production subsidies, which are usually given to the second largest newspapers in certain areas. In recent years the system of press subsidies has received criticism from the European Commission for unfairly limiting competition in the media market. The Swedish government is negotiating with the Commission and has proposed lowering the subsidies. For some newspapers this might mean drastic cuts. The two papers which stand to lose the most are Svenska Dagbladet, which is the second largest paper in Stockholm, and Skånska Dagbladet, which is one of the smaller papers in the south of Sweden and very dependent on the subsidies.21 For these two papers a change in the subsidy system could possibly mean the end of printing, and web-only publication could then be one solution, since both papers already have content-rich web sites. Even though many newspaper web sites have a lot of visitors, being popular has been one thing and making money another. In Sweden it has formerly been said that just a handful of the printed papers were making money from their web sites.
That was certainly the case some years ago, but with the present close integration between the print publishing and the web publishing at many newspapers, the additional cost of also putting the material on the web has decreased. Some Swedish papers actually view web publication as their primary channel, together with the mobile-friendly web version, putting the printed paper in second place. Revenues from online advertising are on the rise at the same time as there is less money to be made from printed advertisements. In addition, some newspaper companies use their considerable media and web experience to make money from consulting, web hosting and other services. In this way the advertising revenues may be supplemented by other sources of income.

The old newspaper economy in Sweden, with a mix of ad revenue and subscriptions, can’t easily be transferred to the web. The traditional way of creating newspapers, where the ads are often put in place first and the news material produced by the paper is then allowed to fill the remaining “news hole”,22 is certainly not easily applicable in the online newspaper. Some newspapers still seem to hope that they will be able to attract subscribers to the news material by offering the paper as PDF files, but considering the amount of news available for free, this hope seems futile. Charging for additional services and material might be another matter. The Swedish daily Aftonbladet claims that more than 90 per cent of its web content can be had for free, but it still manages to attract 118 400 subscribers to its Aftonbladet Plus service at 19 Swedish crowns per month, making a nice profit of 27 million crowns per year23 on material it has often already produced for the printed version.

______________
17 http://www.haaretz.com/hasen/spages/822775.html
18 http://www.thedailypage.com/daily/article.php?article=21536
19 http://www.sweden.gov.se/sb/d/3011/a/19032
20 http://www.presstodsnamnden.se/english.htm
21 http://blogg.svd.se/reklamochmedier?id=6116

6 From Paper to E-Paper?

The new e-paper technology has been mentioned as an alternative, whereby the newspaper companies could offer a reading experience close to that of the paper version, but which would relieve them from the burden of printing and distributing several tons of paper every day. No Swedish newspaper is using this alternative today, but seven major newspaper publishers together with the Swedish Newspaper Publishers' Association, Halmstad University and the Royal Institute of Technology cooperated during 2004-2006 in a nationwide project, examining reader responses, business and distribution models, and technology issues.24 The project was part of the European DigiNews project.25

E-paper differs from normal computer screens in several ways. It is thin and flexible. It doesn’t contain any light source, but relies on reflected light, like ordinary paper. The energy consumption is very low and the device doesn’t have to be recharged every day. It usually has a higher resolution than normal screens. The screens have so far been quite small, usually about A5 (210x148 mm) and only black and white or grey scale, but LG.Philips recently announced an A4 e-paper display, which handles 16.7 million colours.26 According to a report from the Swedish DigiNews project 79 per cent of the respondents to a survey made in 2004 were willing to change from the print version to e-paper in the future. The conclusion in the report, however, is that the e-paper newspaper will probably become reality, but that it will be only one of several channels for the papers, together with the versions in print and on the web.27

______________
22 http://en.wikipedia.org/wiki/News_hole
23 http://www.aftonbladet.se/plus/article820927.ab?plus=true
24 http://diginews.se/
25 http://www.hitech-projects.com/euprojects/diginews/index.htm
26 http://www.infosyncworld.com/news/n/8787.html
27 http://diginews.se/files/Media%20IT_slutrapport%20DigiNews.pdf

7 Newspapers and Archiving

So, if a large part of the future of the newspapers seems to be on the web and some or most of their material will be available only through this channel, what are the Swedish newspapers doing to preserve their web versions? Most papers have some sort of archive available on their web sites, but these are sometimes available only to the subscribers or limited in time to only the past six months. There are, however, examples of large and freely available archives, like those of Aftonbladet (from 1998) and Norrköpings Tidningar (from 2001). Needless to say, the archiving done by the newspapers themselves is focused on content, not on preserving the past layout or context of the articles. In many cases there have been numerous changes to the web layout over the years and also changes to the underlying publishing systems, so keeping the old look of the pages is probably impossible. Since the papers themselves have a hard time keeping archives, which can guarantee that the material can, in the future, be viewed as it was once presented, it must then certainly be necessary for libraries and archives to collect, preserve and make available as much as possible of the newspaper web sites.

8 Harvesting the Swedish Web

The NLS has been harvesting Swedish websites since 1997. The decision to do so was taken by the library itself and was not based on any type of legislation or formal instruction from the Ministry of Education, to which the library answers. It was simply felt among a group of dedicated people at the library that it was time to do something to preserve at least a part of this new and growing media phenomenon.28 The harvesting began in the form of a project (Kulturarw3 (cultural heritage), where w3 is a play on www), where the goal was “to test methods of collecting, preserving and providing access to Swedish electronic documents”,29 but it is now part of the Digital Library of the NLS.

As any large collection of web pages inevitably will contain a lot of personal information which may come into conflict with the Swedish Personal Data Act,30 the Swedish Data Inspection Board in 2002 proposed that the web harvesting done by the NLS should be regulated in law, so as to control what is stored and in what way the stored material is made available.31 The regulation was passed in May 2002 and clearly permits the library to collect and store the Swedish “national digital cultural heritage” as it is published on the Internet.32 This includes all material which can be classified as Swedish on the grounds of “address, addressee, language, originator or sender”. According to the regulation, information about individuals may be collected and stored in the database “in order to benefit the need for research and information”, even if it is sensitive information as defined in the Personal Data Act, i.e. concerns ethnicity, political views, religion, etc. The information may even be exported on e.g. CD or DVD, but solely for research purposes. Direct access to the database, however, is only allowed on the premises of the library.

______________
28 http://www.bok.hi.is/Apps/WebObjects/HI.woa/swdocument/1009821/%C3%9Eorsteinn_KB-06.pdf
29 http://etjanst.hb.se/bhs/ith//1-00/jm2.htm
30 http://www.riksdagen.se/webbnav/index.aspx?nid=3911&bet=1998:204
31 http://www.datainspektionen.se/nyhetsarkiv/nyheter/2002/juni/2002-06-13.shtml
32 http://www.riksdagen.se/webbnav/index.aspx?nid=3911&bet=2002:287


The harvesting has been done completely for the Swedish top-level domain .se and in a selective way for the generic top-level domains (.com, .org and .net) and the top-level domain for the island nation Niue (.nu), which has been very popular for the simple reason that “nu” means “now” in Swedish. The comprehensive harvesting cycles for the Swedish web have been made once or twice per year and can be thought of as a snapshot of material at the time of collection. There have been 15 comprehensive cycles so far. Because of the time it takes to collect all the pages, different pages in one cycle are by necessity harvested at different points in time, so the snapshot is really extended in time. This can make it difficult to guarantee that a page under another domain linked to from the harvested page will still be available for collection when this domain is harvested and, perhaps more treacherously, the page linked to may have been altered in the meantime, so that the context or association implied by the link may have disappeared. In 2007 the total amount of data archived was more than 300 million files for the comprehensive cycles,33 but is now probably considerably larger after the fifteenth cycle. In 2006 the newspaper archive alone grew by 12 million files and is now larger than 3 TB.34 More than 800 different file types have been identified, although 96 per cent of the total amount consists of the five most common file types. Harvesting static web pages with ordinary images and normal links is a rather straightforward procedure, but as soon as the pages are created in a dynamic way, often from databases and with some kind of user input, things are not so easy anymore. The program harvesting the pages is very good at following normal links, but more and more links are created through scripts. Web pages are built for web browsers, which are nowadays quite complex systems, often with a lot of plug-ins to handle different file types.
They have even been called miniature operating systems. Compared to a web browser the software used for harvesting web pages has so far in many ways been simpler, with limited or no support for many of the formats handled by the browser. One problem for the harvesting software has been links to style sheets (CSS) that are created dynamically through JavaScript, depending on which web browser, and which version of it, is being used. In some cases the harvesting program can collect all the text and images which belong to a page, but not the style sheet containing the rules for the layout. This means that it will be impossible to show the archived page as it was once presented. And even if we were able to collect the style sheet, the changes over time in how web browsers interpret the web pages and their style sheets could make it difficult to guarantee that the intended original layout is retained. There are other media types which are far more difficult or even impossible to harvest. Sound and video contained in static files can easily be collected, but more and more often some sort of streaming protocol is used for these types, so that the user won’t have to download the whole file before viewing or listening to it. So far streamed files have not been harvested, since there is a lack of support for these media types in the harvesting software.
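The style sheet problem can be illustrated with a minimal sketch in Python. It assumes a harvester that reads only the static markup of a page, which is a simplification for illustration and not a description of the software used at the NLS; the sample page, its file names and the class StaticLinkCollector are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: one ordinary link, one static image, and a style sheet
# whose <link> element is only written out by a script at viewing time.
SAMPLE_PAGE = """
<html><head>
<script>
  var css = navigator.userAgent.indexOf("MSIE") >= 0 ? "ie.css" : "other.css";
  document.write('<link rel="stylesheet" href="/css/' + css + '">');
</script>
</head><body>
<a href="/nyheter/artikel1.html">Article</a>
<img src="/bilder/foto1.jpg">
</body></html>
"""


class StaticLinkCollector(HTMLParser):
    """Collects only references that appear as real tags in the markup."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("a", "link") and "href" in attrs:
            self.found.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.found.append(attrs["src"])


collector = StaticLinkCollector()
collector.feed(SAMPLE_PAGE)
# The script body is treated as plain text, so the style sheet is not found.
static_links = collector.found
```

In this sketch static_links holds the article link and the image, but neither of the two style sheets, since the reference to them only comes into existence when a browser runs the script.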

______________
33 http://www.kb.se/soka/internet/sv-webbsidor/om/
34 http://www.kb.se/Dokument/Om/verksamhet/arsredovisningar/arsredovisning2006.pdf

9 Harvesting Online Newspapers

When the NLS started harvesting Swedish web sites, the sites of most newspapers were archived with the same frequency as all other web sites, but this was changed in 2004, when collection on a daily basis was started. The selection of newspaper web sites to be visited was made from a link page for Swedish newspaper sites maintained by the NLS since about 1998. This has for a long time been the single most popular page on the library’s web site, which is a good indicator of the popularity of newspapers on the web. About 140 newspaper web sites are being collected every day. Unfortunately the results for the daily harvesting are not very good. To a large extent only the start pages of the newspaper web sites have been collected and often there are things missing, such as images and layout. There are several reasons for these deficiencies. When the pages are collected a set of rules is used, and this inevitably limits what can be achieved. One rule concerns how deep into the web site the collecting software is supposed to go. Depending on how the web site is structured the results can differ. If the web site is structured in only two levels, so that all the articles are one page below the start page, it would be enough to collect only to this depth, but the structures are often more complex and differ between web sites. Another rule decides how many objects will be saved for each web site. The objects are all the files creating the web pages, including text files, images, style sheets etc. One additional problem is that the archived pages contain links for advertisements that were not collected. We have found that if you use the archive and have a working internet connection, the current advertisements are shown instead of the historic ones.
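How the two rules interact can be shown with a small sketch in Python. It is only an illustration under stated assumptions: the site structure, the function name harvest and the parameters max_depth and max_objects are hypothetical, and the actual harvesting software may apply its rules differently.

```python
from collections import deque

# Hypothetical site: each page maps to the objects it links to.
SITE = {
    "/": ["/nyheter/", "/sport/", "/logo.gif"],
    "/nyheter/": ["/nyheter/a1.html", "/nyheter/a2.html"],
    "/sport/": ["/sport/s1.html"],
    "/nyheter/a1.html": ["/bilder/a1.jpg"],
    "/nyheter/a2.html": [],
    "/sport/s1.html": [],
    "/logo.gif": [],
    "/bilder/a1.jpg": [],
}


def harvest(site, start="/", max_depth=1, max_objects=100):
    """Breadth-first collection limited by link depth and object count."""
    collected = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(collected) < max_objects:
        url, depth = queue.popleft()
        collected.append(url)
        if depth == max_depth:
            continue  # depth rule: do not follow links from this page
        for link in site.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return collected


shallow = harvest(SITE, max_depth=1)                 # section pages only
deeper = harvest(SITE, max_depth=3)                  # reaches the articles
capped = harvest(SITE, max_depth=3, max_objects=4)   # object rule cuts it short
```

With a depth of one, the articles under /nyheter/ and /sport/ are never reached; with a generous depth but a low object limit, the collection stops before it gets to them. Either rule alone is enough to produce an archive that consists mostly of start pages.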

10 Swedish Online Newspapers in the Internet Archive

The NLS has actually not been the only place where Swedish online newspapers have been harvested. In fact the oldest preserved pages are those collected by the Internet Archive (IA) and available through its Wayback Machine.35 The Swedish daily Aftonbladet went online on 25 August 1994. This was an early start, a couple of months before the Netscape web browser reached 1.0 and one year before Microsoft released its Internet Explorer. The first page of www.aftonbladet.se that was preserved by IA was retrieved on 23 October 1996.36 The number of pages collected from this web site by IA has varied over the years and the total number of harvesting cycles is 476 since 1996,37 but the harvesting has never been done daily and no page seems to have been preserved since the beginning of 2006. The overall result for the pages harvested by IA seems to be better, with more of the layout and images preserved, but there are still examples of the same problems as in the NLS archive.
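The retrieval date can be read directly from the address of the archived page, since the Wayback Machine encodes the capture time as a 14-digit timestamp (year, month, day, hour, minute, second) in the path. The following Python sketch decodes the address given in footnote 36; the function name capture_time is my own and the sketch assumes the address has exactly this layout.

```python
from datetime import datetime
from urllib.parse import urlparse


def capture_time(wayback_url):
    """Return the capture time encoded in a web.archive.org address."""
    # The path looks like /web/<YYYYMMDDhhmmss>/<original address>.
    parts = urlparse(wayback_url).path.split("/")
    return datetime.strptime(parts[2], "%Y%m%d%H%M%S")


first_capture = capture_time(
    "http://web.archive.org/web/19961023235430/www.aftonbladet.se"
)
```

The decoded value falls on 23 October 1996, which agrees with the date given above.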

______________
35 http://www.archive.org/web/web.php
36 http://web.archive.org/web/19961023235430/www.aftonbladet.se
37 http://web.archive.org/web/*/www.aftonbladet.se

11 Access to the Harvested Pages

So far, the only way to access the Swedish web archive is through two publicly available PCs at the NLS. The PCs used for browsing the archive are not connected to the internet, as we must make sure that the archived pages are not copied, but you are allowed to print the pages. There is no search facility, by which you could “google” the pages. Instead, for the pages from the complete harvesting cycles, you have to know the URL of the page, enter it in a search box and choose one of the links presented in the result list. There is one link for each time the page was archived. For the harvested newspaper web pages things are a little easier. Here you can choose one of the available URLs from a list. The archived web pages are kept in a tape archive and are fetched to disk on request, which means that you have to wait for about two minutes. This has hardly made the archive attractive to users, but the whole archive will now be put on disk instead.
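The kind of lookup described here can be sketched in a few lines of Python. The records, the example addresses and the function names are hypothetical and say nothing about how the NLS archive is actually implemented; the point is only that an index keyed on the exact URL gives one hit per harvest, while anything short of the full address gives nothing.

```python
from collections import defaultdict

# Hypothetical capture records: (URL, harvest date).
CAPTURES = [
    ("http://www.example.se/", "2005-03-14"),
    ("http://www.example.se/", "2006-09-02"),
    ("http://www.example.se/om/", "2006-09-02"),
]


def build_index(captures):
    """Group harvest dates by the exact URL of the archived page."""
    index = defaultdict(list)
    for url, date in captures:
        index[url].append(date)
    return index


def lookup(index, url):
    """Exact-URL lookup: one result per time the page was archived."""
    return sorted(index.get(url, []))


index = build_index(CAPTURES)
hits = lookup(index, "http://www.example.se/")   # two harvests of this page
misses = lookup(index, "example")                # no free-text matching
```

A full-text search facility would require a separate index of the page contents, which is a different and much larger undertaking than the address list sketched here.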

12 What Can Be Done?

What does all this mean to librarians and archivists? In the traditional newspaper work at the library, i.e. collecting the originals and producing a microfilm surrogate, the NLS has since 1979 been aiming to preserve and make available the whole newspaper, including not only all editions and supplements, but also the news bills, which are certainly an important part in Sweden of the relationship between the papers and their readers. The printed paper is preserved in its entirety, whereas the aim of the microfilm is to preserve all unique pages, not all pages. In preserving and presenting all the different parts of the newspaper, you could say that the goal also has been to preserve the integrity of context of the newspaper, making it possible for future users to view and understand not only the bits and pieces of information, but also the whole environment of the paper, preserving the connections within each issue, between issues over time and between different titles. While microfilming is in many ways cumbersome and expensive, it has not been too difficult to preserve context in this way. The problem has rather been in removing context and providing access to the individual articles through indexing and classification, something which has only been done selectively in Sweden.38 Providing access to the information in the harvested web pages through search engines shouldn’t be a problem. The task in preserving online newspapers in many ways rather lies in the preservation of context, collecting and keeping the different parts of a web page and media types together. Recently the NLS has begun discussions with the Swedish National Archive of Recorded Sound and Moving Images39 about what can be done to collect web sites in a more complete way, including different kinds of sound and video. One of the web sites chosen for analysis was a newspaper web site with a very rich and varied content. 

______________
38 http://www.kb.se/soka/bibliografier/tidningar-tidskrifter/
39 http://www.slba.se/index_english.html

So far no new strategies for this cooperation have been decided, but the working group has concluded that
it is very important to preserve the context of for example video material when harvesting a web page. The video may be very closely linked to text material etc., so that just harvesting the video file separately would not be sufficient. How much of what is published as online newspapers can we expect to collect and preserve? Since newspaper web sites are in many cases updated minute by minute, we can of course never get everything. I believe it is important to find an automated harvesting schedule that is adapted to how most newspaper web sites publish, perhaps with individual modifications for the most important titles. This has to be supplemented by selective and thematic harvesting, which in contrast to the automated harvesting will involve selection issues and an active collection development. An increasingly important channel for the newspapers is their mobile friendly web sites, which should actually be easier to harvest than the ordinary web pages. One other interesting way of keeping record of what the newspapers deem to be the most important news could be to collect the different RSS feeds made available on their web sites. In collecting the newspaper websites daily we certainly run the risk of storing a number of copies of the same page. Of course it could be possible to automatically weed out duplicates, but the need to do so is not the same as with printed material, since disk space is far more inexpensive than shelf space. So, while we can safely expect to have a combination of deficiency and redundancy in our online newspaper archives, this doesn’t mean that this archiving is pointless or impossible. As the newspapers are moving their focus from the printed paper to the web version, the full attention of newspaper librarians and web archivists is certainly needed in this area.
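Automatic weeding of duplicates could, for example, be based on a checksum of the page content. The Python sketch below is one possible approach under simple assumptions, not a description of any existing routine at the NLS: the function name weed_duplicates and the sample records are hypothetical, and two captures count as duplicates only if address and content are byte-for-byte identical.

```python
import hashlib


def weed_duplicates(captures):
    """Keep the first capture of each distinct page content.

    `captures` is a list of (date, url, content_bytes) tuples.
    """
    seen = set()
    kept = []
    for date, url, content in captures:
        key = (url, hashlib.sha256(content).hexdigest())
        if key not in seen:
            seen.add(key)
            kept.append((date, url))
    return kept


# Hypothetical daily captures of one page that changes on the third day.
PAGE = "http://www.example.se/om.html"
daily = [
    ("2008-04-01", PAGE, b"<p>Om tidningen</p>"),
    ("2008-04-02", PAGE, b"<p>Om tidningen</p>"),
    ("2008-04-03", PAGE, b"<p>Om tidningen, ny</p>"),
]
unique = weed_duplicates(daily)
```

Here the unchanged copy from the second day is dropped and the captures from the first and third day are kept. Pages that differ only in an advertisement or a timestamp would still be stored as separate copies, which is one reason why a certain redundancy has to be accepted.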

CANADIAN INUIT NEWSPAPERS AND PERIODICALS: PAST, PRESENT & FUTURE
Sharon Rankin
McGill University Library, Montreal, Quebec, Canada

Abstract

The Canadian Inuit people have lived in the Canadian Arctic for centuries and are one of several aboriginal peoples of Canada. As their culture’s oral tradition shifted and became written, newspapers, newsletters and small magazines became an essential form of communication for communities. This paper surveys the newspapers currently and formerly published, by and about Canadian Inuit, spanning the past half century. It also describes existing collections, abstracts and indexes and paper bibliographies that have significant relevance for locating these Inuit titles. The information collected during this study has been collated into a web resource. Entitled “Caninuit: a comprehensive bibliography of Canadian Inuit periodicals”, this website is a unique resource for students, arctic scholars and Inuit communities to discover Inuit periodical publications when they use Internet search engines for their research.

Introduction

The theme of today’s session is “The North American Ethnic Press” and the goal of my paper today is to introduce you to the newspapers published in Canada’s Arctic by or about Canadian Inuit communities. I will begin by providing brief historical information about one of Canada’s aboriginal peoples, the Inuit and describe the development of Inuit literacy, writing systems and literature. Several bibliographies have informed my study of this media. I will describe the significant collections of these titles in Canadian libraries and cultural centres, sharing search vocabularies for locating information in online catalogues and relevant abstracts and indexes. I will continue with a survey of the newspapers that are currently published or have been published over the past five decades across the Inuit regions of the Canadian Arctic. These publications have significance for Inuit communities and I will outline the reasons why this is so. I will conclude with a description and sample screens from a new web resource that I have been constructing to bring together this information on the Internet, so that it can easily be searched. Entitled “Caninuit: a comprehensive bibliography of Canadian Inuit periodicals”, this website will hopefully become a useful resource for all interested in Canadian Inuit newspapers, newsletters and magazines.

Background information about Canada’s Inuit people

Until the early 1970’s, Canada’s Inuit people were usually referred to as “Eskimos”. This name is a European term, which has its linguistic roots in the French word “Esquimaux”. Professor Louis-Jacques Dorais, a Laval University anthropologist explains in his history of the Inuit that the French word was probably a translation of an Indian (Algonquian) language term meaning “raw meat eaters” or “those who speak a foreign tongue”. Indians are First Nations peoples who also lived in Canada, before Europeans arrived and as this example shows, were a different and distinct culture from the Inuit.


Rankin: Canadian Inuit Newspapers and Periodicals

Today, the term “Eskimo” is viewed as the “non preferred term”. Some Inuit find the term offensive or derogatory. It does remain part of the large body of published literature that documents Canada’s Inuit, and the term must be kept in mind when constructing search strategies in bibliographic tools. The “preferred term” is “Inuit”, meaning “the people” in Inuktitut, the Inuit language. Inuit is the term that Inuit use to refer to themselves. The singular form of Inuit is “Inuk”.

The ancestors of Canada’s Inuit population arrived from Asia over 7,000 years ago. Called Dorset and Thule peoples, they crossed the Bering Strait into North America and migrated across the Canadian Arctic from west to east, settling north of 50 degrees latitude. Living in small nomadic groups, the Inuit had been entirely self-sufficient, dependent upon hunting, fishing and gathering for their survival.

Inuit settlement regions

Figure 1 – Map

On the website of the Makivik Corporation, an Inuit-owned economic development company representing the Inuit in the province of Quebec, there is a useful map of Canada’s Arctic with four coloured regions corresponding to the current Inuit settlement areas. On the “Canadian Inuit Map: Settlement areas and population by region”:

• the orange area in the western arctic is the “Inuvialuit Settlement region”, located in Canada’s Northwest Territories. This region has a population of approximately 3,000 Inuit living in six communities;

• the purple area in the central arctic is called “Nunavut”. This region became a self-governing Canadian territory in 1999. Nunavut has the largest Inuit population in Canada, numbering 22,500. Its territory is divided into three regions: Kitikmeot, Kivalliq and Baffin. Nunavut comprises one fifth of Canada’s land mass and contains twenty-six communities, the largest number of Inuit communities in Canada;

• the yellow area in the eastern arctic is called “Nunavik”, in the province of Quebec. Nunavik is home to 8,700 Inuit who live in fourteen coastal communities;

• the green area on the eastern arctic shore is called “Nunatsiavut”. This region comprises a northern region of Labrador, the western area of the province of Newfoundland & Labrador. Labrador is home to 2,300 Inuit who live in six coastal communities.

Canadian Inuit culture

At the beginning of the 20th century, the vast majority of Canada’s Inuit people still lived a traditional lifestyle based upon the land and a nomadic existence. Trading posts, Christian missions and police detachments altered this nomadic life. After World War II, the Canadian government increased services to the arctic. Schools, nursing stations and government offices were built and Inuit were strongly encouraged to settle in permanent villages. By the 1970’s, nearly all of Canada’s Inuit people lived in one of the small communities in the arctic. This trend has continued into the present day. The Inuit Tapiriit Kanatami (ITK) is the political organization that represents all of Canada’s Inuit population. Their 2007 Inuit Statistical Profile states that: “Of the 45,075 Inuit living in Canada in 2001, 36,640 or 81% lived in one of four Inuit regions in the Arctic.”

A highly readable introduction to Canada’s distinct Inuit culture has been published by Pauktuutit, the Inuit Women’s Association of Canada. Revised in 2006, “The Inuit way: A guide to Inuit culture” describes traditional and modern Inuit life and explains Inuit cultural values. Canadian Inuit now have a foothold in two worlds, the traditional world and the modern one. A constant value has been the important emphasis placed upon the oldest members of the family. “Elder family members are considered wise and essential sources of knowledge about the past. They are often sought out for their story telling and advice on many issues” (Pauktuutit 26).

Oral to written tradition

Inuit culture has always had a very strong and well developed oral tradition. It is the myths, tales and songs that elders have told and sung in Inuktitut at family and community gatherings that have ensured that traditional beliefs, symbols and values were transmitted from one generation to the next. In the late 19th and early 20th century, arctic explorers recorded Inuit poetry and songs in their anthropological reports. Knud Rasmussen, Franz Boas and Diamond Jenness published the first texts of Inuit poetry. Christian missionaries arrived in the arctic in the beginning of the 20th century. They challenged Inuit traditional beliefs as they worked to convert Inuit families to Christianity. As part of this conversion, the missionaries transcribed biblical scripture into a written form of Inuktitut.

Inuktitut - written language

The Moravian missionaries arrived in Labrador from Greenland in 1771 and were the first to write and teach a Canadian Inuit dialect, using Latin characters. This script is referred to as “Inuktitut roman orthography”. In the late 19th century, the Anglican missionary Edmund J. Peck transcribed parts of these Moravian translations using symbols. This syllabic orthography is referred to as “Inuktitut”. In the 1970s, the various regional Inuit associations each standardized a different writing system. In Labrador and Inuvialuit, the versions use roman characters. In Nunavut and Nunavik, syllabics are used. These standardized orthographies are now used to express the spoken Canadian Inuit dialects in written form. In the ensuing decades, as a syllabic character set became available for typewriters, word processors and computers, the handwritten characters in early publications were replaced by typeset characters. The availability of syllabic typeset greatly facilitated the publication of Inuktitut language newspapers.

Canadian Inuit Newspapers – Bibliographies

In the 1980’s to 1990’s, there was a proliferation of newspapers and magazines written and published in Inuit communities. This fact is noted in Pamela Stern’s “Historical Dictionary of the Inuit” in the entry for journalism and broadcasting. “Inuit journalism and broadcasting have been powerful tools in the struggle for land claims, self-government and aboriginal rights…Initial efforts at northern journalism were the work of Christian missionaries, but Inuit quickly participated as writers and reporters…The proliferation of northern newspapers and magazines indicates Inuit interest in public affairs and a desire for Inuktitut reading material” (Stern 87).

Several bibliographies published during this time period were used to locate Inuit newspapers. I have included both those titles that are currently published and those that have ceased publication. A publication has been categorized as a newspaper if it contains community news and is published with some frequency. There seems to be a fluidity concerning the categorization of newsletters. They will often be called newspapers. I have excluded newsletters that report only on the activities of a specific association or interest group.

Albert C. Heinrich’s bibliography was published in a 1973 issue of the academic journal Canadian Ethnic Studies. The entries were based upon questionnaire information sent to publishers and include ten newspaper titles.

Hugh McNaught studied the publishing history of newspapers in the Northwest Territories for his 1980 thesis. Community newspapers published between 1945 and 1978 were divided into six categories: school, government, special interest, adult education, religious and community. The Inuvialuit Settlement region published fourteen titles, recorded in this bibliography.

Robin McGrath’s 1984 thesis entitled “Canadian Inuit Literature: The development of a tradition” is a unique study of how the Inuit oral tradition of literature in Inuktitut shifted to a written tradition of writing in English. McGrath describes in detail the kinds of publications that make up the corpus of Inuit literature and newspapers figure prominently in this review. “The development of Inuit periodical literature, newspapers and magazines by and for Inuit, parallels that of Inuit books, and in some ways is a more important development because these periodicals, although relatively impermanent, encouraged readers, and writers who were or are of only limited proficience” (McGrath 34). McGrath’s appendix attempts to collect together all known information about this “unique body of literature”. It includes a surprising number: almost one hundred titles, twenty-two of which are newspapers.

In Un/covering the north: news, media and aboriginal people, Valerie Alia provides a comprehensive review of aboriginal media in Canada and has a very useful appendix listing by region the newspapers and magazines published in the Canadian North. Twelve titles are Inuit newspaper publications in this 1999 publication.

Rankin: Canadian Inuit Newspapers and Periodicals


Thirty-eight newspaper titles were entered this year into the Caninuit bibliography, a new web resource being constructed as part of my sabbatical research project.

Primary Collections in Canada

Five collections in Canada have been significant for my study of this topic, and it is within these libraries and cultural centres that the paper holdings of Inuit newspapers can be found.

Library and Archives Canada (LAC), Canada’s national library, located in the capital city of Ottawa, Ontario, has significant holdings of Inuit periodicals. LAC provides its National Union Catalogue of over 30 million records online in a system called AMICUS1. The collections of LAC and 1,300 other Canadian libraries can be searched simultaneously in this catalogue.

The University of Alberta Libraries in Edmonton, Alberta, has the most extensive university collection of Inuit periodicals in Canada. The U of A is a member of the Association of Research Libraries, has one of the strongest research collections in Canada and makes its collections accessible via the NEOS Library Consortium Catalogue2.

The University of Calgary Library in Calgary, Alberta, is home to the Arctic Institute of North America (AINA). “Created by an Act of Parliament in 1945, the Arctic Institute of North America is a non-profit membership organization and a multi-disciplinary research institute of the University of Calgary. The institute's mandate is to advance the study of the North American and circumpolar Arctic through the natural and social sciences, the arts and humanities and to acquire, preserve and disseminate information on physical, environmental and social conditions in the North.”3 The AINA collection is housed in the Gallagher Library and part of its contents can be searched online in the University of Calgary Library catalogue.

The Library of Indian and Northern Affairs Canada (INAC), the Canadian federal government department responsible for First Nations and Inuit affairs, headquartered in Hull, Quebec, has the largest special library collection of Inuit periodicals. Its INAC Library Portal4 is searchable on the web and provides a union catalogue for all departmental libraries as well as many special research collections both inside and outside the department.

The fifth collection that was very useful in the preparation of the Caninuit bibliography is held in the Avataq Cultural Institute,5 a documentation centre in Montreal, Quebec. Founded in 1980, this cultural institute is a non-profit organization dedicated to protecting and promoting the language and culture of the Inuit in Nunavik. The Avataq Documentation Centre is publicly accessible by appointment and its periodical database is searchable in the centre only.

______________
1 http://www.collectionscanada.gc.ca/amicus/
2 http://www.library.ualberta.ca/searchcollection/
3 http://www.arctic.ucalgary.ca/
4 http://virtua.ainc-inac.gc.ca/
5 http://www.avataq.qc.ca/spip.php?page=accueil&lang=en


Search methods – locating Canadian Inuit newspapers

Locating bibliographic references for Inuit newspapers, in order to determine where paper and online copies reside, is not a straightforward exercise, because there is no single controlled vocabulary term that can be used to search online catalogues for this type of material.

Some records in the Library and Archives Canada catalogue, AMICUS, have the subject term “Canadian newspapers” followed by their place of publication:

SUBJECTS:
Canadian newspapers (English)--Northwest Territories--Rankin Inlet
Canadian newspapers (Inuktitut)--Northwest Territories--Rankin Inlet

Using this heading, adding the keywords “Inuit” or “Eskimo”, and limiting the results to publication type “serials”, a search of the AMICUS database returns eight records. This result is misleading, as the LAC collection holds more than eight newspaper titles. If the search is reconstructed using keywords, a more representative result of 27 records is obtained: a keyword-anywhere search on the term “newspaper”, truncated to find both singular and plural forms, combined with the keywords “Inuit” or “Eskimo”, returns the larger set.

The University of Alberta Libraries’ NEOS catalogue has one very appropriate-looking subject heading:

650: 0 : Inuit|zCanada, Northern|vNewspapers.
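The subfield structure of a heading like this one can be unpacked mechanically. The following is a minimal Python sketch, assuming that the catalogue display uses “|” followed by a one-letter code to introduce each subfield and that the unlabelled first segment is subfield a. The function name parse_subject_heading is hypothetical and is not part of any catalogue software.

```python
def parse_subject_heading(display: str) -> list[tuple[str, str]]:
    """Split a displayed heading such as
    'Inuit|zCanada, Northern|vNewspapers.' into (code, value) pairs.

    Assumption: '|' plus one letter starts each subfield, and the
    first, unlabelled segment is treated as subfield 'a'.
    """
    segments = display.rstrip(".").split("|")
    pairs = [("a", segments[0].strip())]
    for segment in segments[1:]:
        if segment:  # ignore empty pieces left by a stray '|'
            pairs.append((segment[0], segment[1:].strip()))
    return pairs


heading = parse_subject_heading("Inuit|zCanada, Northern|vNewspapers.")
for code, value in heading:
    print(code, value)
```

In the MARC 21 bibliographic format, subfield z of field 650 carries a geographic subdivision and subfield v a form subdivision, so the heading reads as topic “Inuit”, place “Canada, Northern”, form “Newspapers”.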

Hyper-linking on this term to locate other records with the same heading returns no results. A better result is obtained by using truncated keywords, Boolean logic and a limit to format equal to serials:

(Eskimo or Inuit) and canad$ and newspaper$

The 98 titles retrieved also contain false hits, most noticeably annual report publications.

The Department of Indian & Northern Affairs Canada has used controlled subject vocabulary in its catalogue records, and browsing the subject headings will locate records for Inuit newspapers:

1 Canadian newspapers (Inuit) -- Northwest Territories.
1 Canadian newspapers (Inuit) -- Northwest Territories -- Frobisher Bay.
1 Canadian newspapers (Inuit) -- Northwest Territories -- Pond Inlet.
1 Canadian newspapers (Inuit) -- Northwest Territories -- Rankin Inlet.
1 Canadian newspapers (Inuit) -- Quebec (Province) -- Fort Chimo.
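The logic of the truncated Boolean keyword search used in the NEOS catalogue can be illustrated outside any particular system. The sketch below is a minimal Python approximation, assuming that “$” marks truncation (any word ending) and that matching ignores case. The sample records and the function names are invented for illustration, and the code does not reproduce the actual NEOS or AMICUS search engines.

```python
import re


def term_pattern(term: str) -> re.Pattern:
    """Compile one search term; a trailing '$' means truncation."""
    if term.endswith("$"):
        body = re.escape(term[:-1]) + r"\w*"
    else:
        body = re.escape(term)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matches(record: str, all_of: list[list[str]]) -> bool:
    """all_of is a list of OR-groups that are ANDed together."""
    return all(
        any(term_pattern(term).search(record) for term in group)
        for group in all_of
    )


# (Eskimo or Inuit) and canad$ and newspaper$
query = [["Eskimo", "Inuit"], ["canad$"], ["newspaper$"]]

# Hypothetical catalogue records, invented for illustration only.
records = [
    "Inuit community newspapers of northern Canada",
    "Canadian Eskimo newspaper, weekly",
    "Annual report on Canadian newspapers",
    "Inuit art in Canada",
]

hits = [record for record in records if matches(record, query)]
for hit in hits:
    print(hit)
```

Only the first two sample records satisfy all three conditions. A keyword search of this kind still cannot tell a newspaper from a document that merely mentions newspapers, which is why the format limit to serials and a manual check for false hits remain necessary.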

In summary, using the advanced search option is always recommended, so that several search variables can be entered. Search strategies differ depending upon the record coding practices of the catalogue being queried. As a general approach, it is always best to explore the controlled vocabulary headings first, and then use keywords, Boolean logic and format limiters.

Online periodical indexes & abstracts

Canadian academic libraries normally license several indexes and abstracts of magazines and newspapers to provide indexed and full-text coverage of Canadian titles. Inuit newspapers are absent from all three of the following “mainstream” indexes:
• ProQuest’s “Canadian Business & Current Affairs” (CBCA)
• ProQuest’s “Canadian Newsstand”
• Gale’s “CPI.Q”

Ulrich’s6 online periodical directory, an authoritative source of bibliographic information on more than 300,000 periodicals of all types from around the world, lists one Nunavut newspaper. It is evident that Inuit newspapers remain hidden from the standard periodical directories.

There is some coverage of their contents in the existing northern studies indexes. “The world's largest collection of international polar databases providing comprehensive and multidisciplinary coverage of polar research” is the Arctic and Antarctic Regions (AAR)7 database published by National Information Services Corporation (NISC). AAR is actually a compilation of records from twelve international sources; four of these sources are Canadian and do have some indexing coverage of Inuit newspapers:
• ASTIS (Arctic Institute of North America)
• BOREAL (Canadian Circumpolar Library)
• BOREAL Northern Titles
• INAC (Department of Indian and Northern Affairs Canada)

The Arctic Science and Technology Information System (ASTIS)8 database can also be searched separately and free of charge from the Arctic Institute of North America’s website. ASTIS contains over 63,000 records describing publications and research projects about northern Canada. The BOREAL databases describe the Canadian Circumpolar Collection held at the University of Alberta Library, Edmonton, Alberta. Also searchable as the PolarInfo database9, this index contains over 300,000 records but is no longer being updated.
Another region-specific index, with coverage of Labrador titles, is the Periodical Article Bibliography (PAB)10, a retrospective bibliography of Newfoundland and Labrador publications created and maintained by the Centre for Newfoundland Studies at Memorial University Libraries in St. John’s, Newfoundland & Labrador.

Survey of Inuit newspapers by region

Now let’s turn to a short historical description of newspaper publications by Inuit region.

______________
6 http://www.ulrichsweb.com/ulrichsweb/
7 http://biblioline.nisc.com/scripts/login.dll
8 http://www.aina.ucalgary.ca/astis/
9 http://polarinfo.library.ualberta.ca/
10 http://www.library.mun.ca/qeii/cns/pab.php


Inuvialuit Settlement region

The earliest newspaper in the Inuvialuit Settlement region, located in Canada’s Northwest Territories, was published in the mid-1950s by the Roman Catholic Mission in Aklavik. The Aklavik Journal, self-subtitled “Canada’s most northern newspaper”, was published each month for two years, except during break-up and freeze-up. The influence of climate and of seasonal occupations that take people out onto the land is always evident in the periodicity of Inuit newspaper publishing. This newspaper was printed by offset lithography and had one column of text in English and the other in Inuktitut roman orthography.

A second mission newspaper, The Bank Lands Letter, was published in English for two years beginning in 1968 by the Roman Catholic mission in Sachs Harbour. A third mission family newspaper, Ilavut/Our Family, was published for twelve years by the Anglican Church beginning in 1975.

In 1966 in the community of Inuvik, a weekly newspaper called The Drum began publishing, with 1,400 copies per issue. This paper continues today as the Inuvik Drum. Beginning in 1975, the Government of the Northwest Territories published The Interpreter for three years in English and in Inuktitut syllabics and roman orthography.

In 1983, the Inuvialuit Communications Society began publishing a biweekly newspaper called Tusaayaksat. It was distributed to all 800-odd households in the region. The society lost federal funding in 1990 and the publication transformed itself into a magazine, which is still being published.

As of 2008, the Inuvik Drum11 is the only remaining newspaper in the Inuvialuit Settlement region. It is published in paper and online as one of a group of seven northern newspapers published by Northern News Services Limited. Daily news summaries are available free of charge on its website.

Figure 2 - Inuvik Drum

______________ 11 http://www.nnsl.com/inuvik/


Nunavut

In the central and eastern Canadian Arctic, there was a proliferation of community newspapers beginning in the mid-1960s. In the communities of Baker Lake, Cape Dorset, Cambridge Bay, Coppermine, Eskimo Point, Frobisher Bay, Igloolik, Pangnirtung, Pond Inlet, Rankin Inlet, Resolute Bay and Whale Cove, a total of twenty-three print newspapers with varying life spans were published between 1965 and 1980. McGrath (1991) concluded that one of the major reasons for the short life span of these publications relates to their foundations. “Often spearheaded by a teacher, priest or community worker… the paper folds when he or she is transferred, willingly or otherwise, out of the community” (McGrath 95).

Only one newspaper survived into the next three decades. Inukshuk began publishing in 1973 in the largest community in Nunavut, Frobisher Bay. In 1976, the newspaper’s name was changed to Nunatsiaq News. Nortext Publishing Corporation now publishes weekly print and online editions from Iqaluit. (Frobisher Bay was renamed “Iqaluit” in 1987. Iqaluit means "place of fish" in Inuktitut.) The 2007 Combase12 newspaper readership survey reported that Nunatsiaq News is read each week by 6,200 people over 18 years of age in Nunavut and Nunavik. The online edition of this newspaper has a searchable archive of issues from 1995 to the present day.13

Figure 3 – Nunatsiaq News

There are two other newspapers currently published in Nunavut: Nunavut News, also in Iqaluit, and Kivalliq News in Rankin Inlet. Both of these newspapers are owned and operated by Northern News Services. The company uses the same web format for all of its northern newspaper editions.

______________
12 http://www.nunatsiaq.com/advertising/moreinformation.html
13 http://www.nunatsiaq.com/archives/archives.html


Figure 4 – Nunavut News

Nunavik

The earliest newspaper published in Arctic Quebec is the Northern Star. Based in Fort Chimo, now called Kuujjuaq (meaning “great river” in Inuktitut), this independent community newspaper was published for four years beginning in 1961. In 1974, Big Dipper News from Povungnituk was published for a year. Similarly, the independent newspaper Atuaqnik published only thirteen issues in 1979 before collapsing due to lack of funds and too few trained reporters (McGrath 1991).

Figure 5 - Atuaqnik

The current newspaper of this region is Nunatsiaq News which also serves the population in Nunavut.


Nunatsiavut (Labrador)

The oldest Inuit newspaper, Aglait Illunainortut, was published in Nain, Labrador, by the Moravian Missionaries from 1902 to 1914. Copies of this publication can be found in the Rare Books Collection of McGill University Libraries. In 1970, Kinatuinamot Illengajuk was published weekly in Nain by the Labrador Inuit Association, in Inuktitut roman orthography and English. The current weekly newspaper, The Labradorian14, is published by Transcontinental Media Network in Happy Valley-Goose Bay, the largest community in Labrador. This newspaper reports on community events across the region and covers events in the Inuit coastal communities.

Figure 6 – Aglait Illunainortut

Figure 7 – The Labradorian

______________ 14 http://www.thelabradorian.ca/


Importance of this Media

The decrease in the number of Inuit newspapers in the 1990s has been studied by Alia (1999) and Avison (1996). One significant factor has been the withdrawal of financial support for aboriginal newspapers by the Canadian federal government. In 1990, the federal “Native Communications Program” was cancelled, forcing Inuit newspapers to restructure their finances quickly. Some newspapers were unable to continue publishing.

The importance of newspapers for Canadian Inuit cannot be overestimated. These publications have provided a medium to shape and preserve cultural identity, and they have been an accessible means of sharing information about land claims and government activities. Newspapers provide a venue for Inuit journalists and authors to tell stories, publish photographs and write political commentary. Penny Petrone, author of Northern voices: Inuit writing in English, an anthology of Inuit literature, describes the importance of journalism for Inuit writers: “Acculturated Inuit young people are articulating the feeling of a generation caught in a crisis of identity trying to determine a way of life that will protect their tradition and at the same time cope with the massive outside influences in their lives…Journalism dominates the imagination and absorbs the intellectual energies of many of these talented writers.” (Petrone 201)

In the existing anthologies of Canadian Inuit literature, most of the writings listed in each bibliography were first published in Inuit periodicals (newspapers and magazines). These periodicals contain the original published versions of Inuit writers’ pieces, and many of these articles exist in no other published form. For this reason, preserving and providing access to Inuit newspapers contributes to the preservation of Canadian Inuit writing.

“Community newspapers constitute a valuable primary historical source for observing how people perceive the world, their neighbours and the affairs within their communities”. (McNaught 1) Librarians are well aware of the importance of newspapers as primary research sources.

Caninuit – Web bibliography

One of the motivating factors of my research project, the creation of a web bibliography of Canadian Inuit periodicals, has been to remedy the invisibility of these publications. On the Caninuit website, each periodical has a personality page, which includes a review describing the publication, bibliographic details (publishing history, location information), indexing information and links to digitized issues. Thumbnail cover images are being added to each of the records. The website, located at http://www.libris.ca/inuit/go.exe, can be browsed in several ways:
• by current or ceased
• by format (journal, magazine, newspaper, newsletter, catalogue)
• by Inuit region
• alphabetically by title

The web resource is a work in progress. I plan to update Caninuit over the coming years as new information emerges, new indexing sources become available, new digitization projects provide access to full-text versions of ceased publications, and new titles are published.
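The browse options listed above amount to filtering and sorting a set of periodical records. The following is a minimal Python sketch of that idea, assuming a hypothetical record layout: the class name, field names and browse function are illustrative and do not describe how the Caninuit website is actually implemented. The sample values are taken from titles described in this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Periodical:
    title: str
    kind: str      # journal, magazine, newspaper, newsletter, catalogue
    region: str    # Inuit region
    current: bool  # False means the title has ceased publication


# Sample records based on titles discussed in this paper.
TITLES = [
    Periodical("Nunatsiaq News", "newspaper", "Nunavut", True),
    Periodical("Inuvik Drum", "newspaper", "Inuvialuit Settlement region", True),
    Periodical("Tusaayaksat", "magazine", "Inuvialuit Settlement region", True),
    Periodical("Atuaqnik", "newspaper", "Nunavik", False),
    Periodical("Aglait Illunainortut", "newspaper", "Nunatsiavut", False),
]


def browse(titles, *, current=None, kind=None, region=None):
    """Filter by any combination of facets; return titles alphabetically."""
    selected = [
        p for p in titles
        if (current is None or p.current == current)
        and (kind is None or p.kind == kind)
        and (region is None or p.region == region)
    ]
    return sorted(p.title for p in selected)


ceased_newspapers = browse(TITLES, current=False, kind="newspaper")
print(ceased_newspapers)
```

Leaving a facet unset means “do not filter on it”, so calling browse(TITLES) with no arguments gives the plain alphabetical title list.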


International Polar Year 2007 - 2008

This year marks the fourth International Polar Year, 2007–2008 (IPY)15. The IPY is organized through the International Council for Science (ICSU) and the World Meteorological Organization (WMO). The IPY program comprises over 200 projects, with thousands of international scientists studying the Arctic and the Antarctic from March 2007 to March 2009. Social science projects are also being funded as part of the IPY program. “A special focus for the IPY year is education and outreach activities to demonstrate the truly global significance of contemporary and historical polar issues.”16 The creation of “Caninuit: a comprehensive bibliography of Canadian Inuit periodicals” contributes to this IPY theme by ensuring that students, arctic scholars and Inuit communities will discover Inuit newspapers and magazines when they use Internet search engines for their research.

References

Alia, V., Un/covering the north: news, media and aboriginal people. 1999, Vancouver: UBC Press.
Avison, S., Aboriginal newspapers: their contribution to the emergence of an alternative public sphere in Canada. 1996, Montreal: Concordia University, Department of Communication Studies.
Dorais, Louis-Jacques. Aboriginals: Inuit from Multicultural Canada website. http://www.multiculturalcanada.ca/ecp/content/aboriginals_inuit.html
Gedalof, R. and A. Ipellie, Paper stays put: a collection of Inuit writing. 1980, Edmonton: Hurtig Publishers.
Heinrich, A.C., Periodical publications of Canada's Eskimos: a preliminary checklist. Canadian Ethnic Studies 5(1-2), 1973: p. 47-9, 51-7.
Inuit Tapiriit Kanatami, 2007 Inuit Statistical Profile. http://www.itk.ca/publications/StatisticalProfile_Inuit2007.pdf
Magocsi, P. R., Aboriginal Peoples of Canada: A Short Introduction. 2002, Toronto: University of Toronto Press.
McGrath, R., Canadian Inuit literature: the development of a tradition. Mercury series / National Museum of Man. 1984, [Ottawa]: National Museums of Canada.
McGrath, R. "Atuaqnik: The Duration and Demise of a Native Newspaper." Native Studies Review 7 no. 1, 1991: 94-102.
McNaught, H., Newspapers of the Modern Northwest Territories: A Bibliographic Study of Their Publishing History (1945-1978) and Publishing Record. 1980, Calgary: University of Alberta, Faculty of Library Science.
Pauktuutit Inuit Women of Canada. The Inuit Way: A Guide to Inuit Culture. 2006, Ottawa: Pauktuutit Inuit Women of Canada.


Petrone, P., Northern voices: Inuit writing in English. 1988, Toronto: University of Toronto Press.
Robinson, G. ed., Isuma Inuit Studies Reader: Anthology of Selected Writings by and About Inuit. 2004, Montreal: Isuma Publishing.
Stern, P. R., Historical Dictionary of the Inuit. Historical Dictionaries of Peoples and Cultures; No. 2. 2004, Lanham, Md.: Scarecrow Press.

Figures

Figure 1 – Map - Canadian Inuit Map: Settlement areas and population by region http://www.makivik.org/images/map/11_inuit_settlement_areas.gif
Figure 2 – Inuvik Drum Online, September 4, 2008 http://www.nnsl.com/inuvik/inuvik.html
Figure 3 – Nunatsiaq News, September 5, 2008-09-06 http://www.nunatsiaq.com/index.html
Figure 4 – Nunavut News, September 5, 2008 http://www.nnsl.com/nunavutnews/nunavut.html
Figure 5 – Atuaqnik, vol 1. No. 1, January 1979
Figure 6 – Aglait Illunainortut, 1910
Figure 7 – The Labradorian, September 6, 2008 http://www.thelabradorian.ca/

PRESERVING AND DIGITISING THE IMMIGRANT PRESS IN FRANCE AT THE BNF, 19TH-20TH CENTURIES
Philippe Mezzasalma
Bibliothèque nationale de France

1) A project with a long history

The history of immigration into France reflects more than 150 years of the revolutions, unrest, and economic and political upheavals that have shaken most countries of the world. As the historian René Rémond put it: “There is scarcely a conflict or civil war that has not deposited on our soil a further layer of exiles, outlaws and refugees. (…) By the hundreds of thousands, even by the millions, foreigners have flocked from all over Europe, and then from the whole world, fleeing a country where their lives were in danger, their safety threatened, their freedom condemned.”1 No country in Europe, Great Britain included, has taken in so many foreigners from such varied backgrounds. Only the United States would be comparable, but there the reception and integration of immigrants are founding elements of the nation. Beyond France’s geographical position, the main explanations lie in the country’s attractive wealth, identification with the French Revolution and its emancipatory and egalitarian values, and the relative hospitality of the native population. Despite a certain initial mistrust, which probably owes more to xenophobia than to genuine racism, the foreigners, who were not yet called immigrants, settled in, despite sometimes difficult beginnings, and became part of the French nation together with their children, adopting its traditions while enriching it with their cultural differences. The French model of integration, for all its vicissitudes and the challenges made to it, remains attractive enough to continue drawing recent migrants from Sri Lanka, Kurdistan or Bosnia-Herzegovina. Each language community organized its life in France both by coming together and by turning towards society as a whole.
Information, both from the country of origin and about the host country, made available to new arrivals in their native language or in bilingual form so as to help them learn French, was built around the creation of press organs of every kind: bulletins of associations and political groups, but also genuine general-interest newspapers. This press had a decisive role to play, both in the process of integration and in the preservation of original identities. At times it also played a decisive role in accompanying and interpreting national and international events: anti-fascism, the Spanish Civil War, the German occupation, or decolonization. Immigration, a major historical phenomenon in contemporary France, can be approached from three angles: an economic perspective, a political axis, and finally as
______________


Mezzasalma: Sauvegarder et Numériser la Presse des Immigrations en France

1 René Rémond, preface to the exhibition catalogue France des étrangers, France des libertés ; presse et mémoire, Paris, Editions Ouvrières, 1990.

the product of groups from the former French colonies living in metropolitan France before and after independence. This last question involves an overlap of legal statuses, the “native” becoming the foreigner, amid enduring symbolic representations carried by the press of these communities.

Economic immigration

Economic immigration takes many forms and begins as early as the nineteenth century. The press of these waves of arrivals is characterized by its extreme diversity, a markedly neutral tone, and an ephemeral character, apart from a few dailies per community. Take the Spanish: although there were Spanish political refugees in France as early as 1813, it was with the war of 1914-1918 that the Spanish population doubled, exceeding 255,000 people in 1921, the French government encouraging immigration in order to recruit agricultural labourers to meet the needs of the countryside. In 1936, at the start of the civil war between Republicans and Francoists, the Spanish community was the third largest, behind the Italians and the Poles. Publications had a regional reach, such as the Boletin trimestral, published in Burgundy between 1932 and 1936, or Espana en Tolosa, produced and distributed in the South-West in the same period. These were general-interest titles with a certain neutrality of tone, political titles appearing only after 1936. There was, on the other hand, a trade-union and mutualist press tied to work and living conditions: for example Vida Obrera, published from 1927 in Poissy, or Accion mutualista espanolades Melodia de Francia, distributed in Béziers in the mid-1930s.

The Poles form the second major group of economic immigration, arriving at the end of the nineteenth century and heading for the mines of northern France or for Lorraine. This mass influx was in fact a second wave; the first dated back to the years 1831-1848, but was political in character, numerically smaller and intellectually prestigious, strongly committed against the Russians, democratic and even republican. That nationalist, propagandist press did not call on Poles to remain in exile but rather to build their nation. The economic emigration of the late nineteenth century, and above all that which followed the creation of Poland in 1920, saw the heyday of the Polish press in France: the two main dailies (Wiarus Polski and Narodowiec) had print runs of 20,000 and 40,000 copies respectively in 1936, and as many as four dailies were appearing at the Liberation2.

The third long-established group of economic emigration was the Italians, who arrived in France in large numbers at the end of the 1880s, in Lorraine and Provence. Before the arrival of the anti-fascists, the Italian press was largely inspired by the Catholic Church (the weekly Il Coriere in Agen, with a print run of 5,000 copies in 1926, or la Buona Parole, published in Marseille by the Catholic Mission). Among the many bulletins of an economic character, the moderate daily Don Quichotte stands out; it was founded by Luigi Campolonghi in Marseille in 1920 out of a concern to foster good relations between the French and the Italians after the anti-Italian “pogrom” of Aigues-Mortes in 1893.

The other waves of economic immigration, from outside the sphere of French political and colonial influence, are, as is well known, more recent, and have consequently given rise to fewer and less diverse publications. This is explained by a faster integration of migrants’ children into the French melting pot, a more marked distancing from
______________


2 Halina Florkowska-Francic, « La presse Polonaise, 1918-1984 », Revue du Nord n°4, Université de Lille 3, 1988.

the culture of origin, easier access to international information, which makes the media of the country of origin more accessible and the community newspaper therefore less indispensable, and finally a standardization of the media to an international format. One can nevertheless cite the Portuguese newspapers, published since the 1960s (which saw a surge of titles and renewed interest at the time of the Carnation Revolution in 1974-1975, as a means of decoding the information delivered by the mainstream French media3), Chinese (four dailies), Kurdish (three dailies) and Turkish newspapers since the 1980s, and Tamil (Sri Lanka) newspapers over the past ten years.

Political immigration

France was a land of asylum from 1831, with the repression of the Polish nationalists by Tsarist Russia. Generally speaking, these political emigrations are the work of active political minorities, often made up of intellectuals or men of letters, whose decisive role in the creation of press organs, and whose ideological influence, are inversely proportional to their actual numbers as migrants. These are not waves of arrivals, but these immigrants change the structure and political contours of their communities abroad. This is true of the Poles between 1831 and 1871, who created many journals in France of undeniable influence (120 titles recorded), to the point of opening the doors of the French press to certain Polish journalists. It is true of the Russians, Marxists and anarchists under Tsarism (70 titles), or those fleeing the Bolshevik revolution, monarchists and anti-Semites but also socialists or democrats (more than 500 journals during the interwar period). It is true of the Italian anti-fascists, whose newspapers became the majority of the Italian press in France (179 out of 230, against only 9 fascist titles), around such major titles as Avanti! (socialist, with a print run of 5,000), the weekly Giustizia e Liberta of the Rosselli brothers, or the communist Araldo, with a print run of 15,000 copies. The anti-fascism of the interwar period brought contingents of Jewish refugees from Poland, publishing workers’ newspapers in Yiddish (Der Yidisher Arbayter); of anti-Nazi German refugees (Die Neue Weltbühne, Die Aktion), who joined the Resistance under the occupation and continued to publish clandestinely during the war (Freies Deutschland, Arbeiter und Soldat…); and of Spanish Republican refugees, from 1936 (l’Espagne Nouvelle, l’Espagne Socialiste, Espana), but above all after 1939, despite having to operate underground between 1940 and 1945.4 One cannot conclude without mentioning the Armenian community that fled the genocide, and its major daily, later weekly, Haratch. By contrast, the communities of Chilean and South American refugees of the 1970s published only bulletins, which cannot be described as newspapers.

From colonized to emigrant

The picture is, however, much murkier for the populations from sub-Saharan Africa, the Maghreb or the former Indochina (Vietnam, Cambodia and Laos). Brought to work in metropolitan France from 1914 to replace, in the industrial centres, the men who had left for the front, the colonized, because of their legal status, experienced themselves as foreigners in France. Their numerous publications reflect this, before becoming fiercely nationalist and then pro-independence, such as El Ouma, organ of
______________
3 Gilles Rodrigues, « la Révolution des œillets vue par la presse française », in Revue de la BnF, no. 25, spring 2007.


Mezzasalma: Sauvegarder et Numériser la Presse des Immigrations en France

4 Geneviève Dreyfus-Armand counts 345 of them in her article « Pages d'exil », in the exhibition catalogue France des étrangers, France des libertés ; presse et mémoire, Paris, Editions Ouvrières, 1990.

the Etoile Nord-Africaine founded by Messali Hadj (with a print run of between 2,000 and 5,000 copies), before the FLN press, El Moudjahid and la Voix du travailleur algérien, began to circulate clandestinely. The communities from Africa, for their part, saw the birth from the 1920s, within the communist movement, of the well-known la Voix des Nègres and then le Cri des nègres, while the more assimilationist la Dépêche africaine was supported from 1928 by the CGT. La Revue du Monde Noir, founded in 1931, set out to rehabilitate black civilisation, a cause taken up among students by l'Etudiant noir from 1935. The process was broadly the same for the Indochinese press, although, unlike the African press, it did not wait for decolonisation to split along national lines; it was dominated by the Vietnamese-language press, for the most part pro-communist. After the war, and after a long and painful decolonisation, once the groups that had remained in France had become immigrants, the newspapers gradually disappeared in favour of association bulletins organised by country. Vietnam is an exception, since a diaspora from that country re-formed between 1975 and 1979, fleeing the communist regime. Several monthly publications were then issued.

2) A growing awareness: collecting and cataloguing the sources of memory

A painful decolonisation, with a war in Algeria and military operations in other countries, had long hindered the recognition and acknowledgement of the contribution of immigration in France, since some of the formerly colonised had become immigrants: independence did not halt emigration to France, nor did it ease the wounds of a history that had become one of conflict. The traces of this presence in France were many, but scattered and little known. The first step in this recognition was the creation in the mid-1980s of an association, Génériques, charged with collecting the many elements of memory that would constitute the sources for a reconciled history of immigration. A second milestone came in 1990, with the first exhibition devoted to this memory, centred, most appropriately, on the press. The press then served as a regular basis for the exhibitions that developed in the regions, in particular in Lyon and Grenoble. Launched as early as 1995, the project for a Cité de l'immigration was finally inaugurated in 2002, and actually opened in 2007. Historians then emerged from the restricted circles that had been theirs until then, and their concerns were joined by those of documentation professionals. A guide to sources in France had been launched by the association Génériques in 1999, but it primarily lists public archives, along with a few private collections. The BnF, keen to record the printed publications (which fall within its remit through legal deposit) concerning the history of immigration in France, undertook a Guide des sources de l'Histoire des immigrations en France. This listed publications by community, following a simple plan: description of monographs, then periodicals, in chronological order.
Through these various projects there emerged the considerable volume of documents published not only about the immigrant communities but by them. Their importance in national history now appeared undeniable from this documentary angle as well. These documents, in French or in foreign languages, in Latin, Cyrillic, Chinese or Sanskrit characters, formed part of the collections of the BnF, having entered through legal deposit.

The collections of the BnF

This bibliographic research identified nearly 2,000 periodical titles produced by the various immigrant communities, from the beginning of the 19th century to the present day. Although the set consists


mostly of reviews, several dozen genuine newspapers, dailies and weeklies of general and political news, some of them with a lifespan of several decades, are found across all the communities. Each linguistic community was the subject of a specific bibliographic corpus, intended to be exhaustive. Research work and the help of certain academics made it possible to unearth rare periodicals with uninformative titles, which were consequently difficult to find. They also provided an opportunity to discover the unsuspected richness of certain corpora, such as the English press published in France under the Restoration.⁵ Since the aim was to promote these titles, two questions had to be addressed: their description in the catalogue, on the one hand, and their availability for consultation, on the other. The assessment of their physical condition was thus to determine the appropriate conservation and preservation treatments. The corpora were as follows, in order of numerical size: Italian, Polish, German, Spanish, Russian, Yiddish, Portuguese, Algerian, Armenian, Vietnamese and Kurdish. Very small communities of political exiles, linked to specific upheavals in their countries of origin (Hungarians in 1945, then 1956; Chileans after 1973), even present corpora that are admittedly small but of very high editorial quality, thanks to the participation of refugee intellectuals. The English, Tunisian, Chinese, Turkish and African corpora are still being compiled. Apart from the English newspapers, these latter sets comprised few titles and seemed to be of lower priority. Moreover, in some cases these were more recent and less well-known immigrant groups, whereas the earlier waves (Poles, Italians or Algerians) had been established long enough for their communities to play a role in the national political arena, a role reflected in their publications.
It is worth noting the concern of these publishing bodies to deposit their newspapers and to comply with French law: the collections show a remarkable degree of completeness and gaps are very few, which makes the BnF's holdings the richest collection of the immigrant press in France. The gaps concern newspapers of Spanish anarchists who took refuge in Toulouse after the end of the Spanish Civil War, anarchists having historically shown their distrust of any state institution, even a deposit institution. The same is true of certain Italian revolutionary reviews, and of the FLN newspapers. These incomplete titles entered the collections through donations, or through active collecting by the BN. Most titles were ephemeral, lasting from a few issues to a few years, and are gathered in a few folders or boxes. The frequency of these titles, which are fairly specialised in subject (economic news, trade-union factions by language of origin, cultural associations), is most often monthly. Only a few dailies (three or four per major language community at any one time) had a longer lifespan (rarely exceeding twenty years, the Armenian newspaper Haratch and the Yiddish Naïe Presse being exceptions) and consequently a larger volume: thirty to forty boxes for the largest. They are supplemented by weeklies or fortnightlies in magazine format, particularly among the Germans, who published no daily. Page counts generally remained modest, from 4 to 8 pages for dailies and from 12 to 16 for weeklies. As integration into the French nation deepened with each generation born on metropolitan soil, most of these newspapers died out or became bulletins centred on memory, as with the Armenians and the memory of the Genocide.
In reading habits, the newspaper from the home country is replacing the newspaper published in France in the language of origin. Among the younger generations, these practices are confined to a very small minority, since their identity practices bear the stamp of standardisation and of recourse to the French press, or indeed to the international press in a broader sense. From an editorial point of view, the heyday of this press ran from the
______________


5 Diana Cooper, « la presse anglaise publiée à Paris sous la Restauration et la Monarchie de juillet », paper given at the study day la civilisation du journal, Paris, BnF, autumn 2006.

July Monarchy to the 1960s. Great writers or future great authors (Heinrich Mann, Walter Benjamin, Borges, Gasset, Jorge Semprun, Missak Manouchian) appear in it alongside leading political figures (Messali Hadj, the Italian socialist Pietro Nenni, Arthur London, Zhou Enlai or Ho Chi Minh). Today, the most dynamic dailies are those of the recent communities: the Chinese ones, very much focused on practical pages and classified advertisements, and the Turkish ones, centred on international politics in general and on political news in Turkey in particular.

The method adopted was the same for all the corpora: identifying titles by community made it possible to draw up an exhaustive overall list. The physical condition was then established by sampling the collections, which made it possible to detect defective titles or copies and to decide on the treatment: reboxing, flattening, lining, or microfilming after any minor repairs. On the whole, most of the newspapers were in average condition: poor-quality paper, faint inking. These newspapers were not bound but placed in folders, except for dailies and weeklies of some duration, which were stored in boxes. The layout also reflects the disparities between communities, or even between types of newspaper or between periods: while the Polish reviews of the July Monarchy resemble the best French literary reviews, in content as in form,⁶ the Polish communist newspapers of the interwar period had a clumsy, home-made layout, which is also found, for example, in the clandestine publications of the FLN (Résistances) or in the clandestine newspapers of the Resistance published in Polish or Spanish, often poor mimeographed A4 sheets. Also noteworthy in certain communities is a didactic concern for readerships with weak, recently acquired literacy. Michel Dreyfus thus estimates⁷ that in 1918, 80% of Italian immigrants were illiterate: the very make-up of the newspaper inevitably reflects this.

3) The microfilming stage

At this stage, it became urgent to safeguard by reproduction these publications in foreign languages or bilingual (French and a foreign language), which are French heritage collections by virtue of their entry through legal deposit, and a now-recognised part of the national memory. In 2004 and 2005, the preference would have been to digitise these documents, which are often illustrated and small enough in volume to fit comfortably into a digitisation workflow. As the vagaries of collection policy did not allow this, and reproduction was becoming urgent, the newspapers were directed to a microfilming stream, whose specifications laid down production standards allowing optimal digitisation from the film. This operation formed part of the BnF's systematic preservation plan, under way since the 1950s: coverage of the major national press from 1958 to 1980, then, since 1980, coverage of the PQR (regional daily press) and of the press published in the former French colonies (pro-independence press included) from its origins to 1962. As these programmes were drawing to a close, the immigrant press was fed into these streams, usefully complementing the collections of microfilmed newspapers for North Africa and the countries of former Indochina.
______________


6 I refer here to the work of Janine Ponty, in particular L’immigration dans les textes, 1789-2002, Paris, Belin, 2004.
7 Michel Dreyfus and Pierre Milza, Un siècle d’immigration italienne en France (1850-1950), Paris, Centre d'études et de documentation sur l'émigration italienne, 1987.

Each title undergoes a complete collation, issue by issue, and a leaf-by-leaf examination recording any defects and gaps, a treatment accompanied by minor repairs before dispatch to an external contractor. Each title is then given its own reel, the BnF having chosen, for ease of consultation, not to mix the ephemeral titles of the corpora on the same reels. A small proportion of these titles required special in-house treatment, with flattening and partial lining so that the documents could be handled for filming. To date, the Italian, German, Spanish, Polish and Portuguese newspapers have been completed.

4) A second stage: digitising the microfilms

The mass digitisation launched by the BnF (a three-year programme providing for the digitisation of 100,000 documents per year) made it possible to relaunch the project to digitise the immigrant newspapers, which had been unable to start for lack of a suitable workflow. The aim is therefore to digitise from microfilm, within the framework of the mass digitisation contract, newspapers of up to A1 format. The selection focused first on the German and Italian newspapers, which fitted easily into this framework and offered a prestigious showcase in terms of editorial content. Currently being pulled from the stacks, these reels should be digitised in the second half of 2008. This method should be applied to all microfilmed titles (Spanish, Portuguese and Polish) that match this format. As the budget lines for these microfilming operations have been secured for two years, all the corpora should therefore be digitised in this way for medium-format newspapers. The decision not to digitise directly from paper is therefore not a matter of dogma here, but stems from the constraints of the various digitisation projects and contracts run by the library. This year, then, a few dozen titles will be digitised and made accessible through the catalogue, without at this stage having a dedicated interface, direct access through Gallica 2 (the BnF's digital library), or a critical apparatus accompanying publication. All of these developments are nevertheless planned for the medium term.

From a technical point of view, the newspapers are digitised in high quality at 300 dpi in greyscale, in uncompressed TIFF format for preservation and in JPEG format for delivery. Each page is between 15 and 45 megabytes. Full-text searching will be made possible by running OCR on the digitised data, which will place the text format behind the image format. This is raw OCR, from which a minimum quality level of 95% character recognition is expected. The installation on Gallica 2, from 15 July 2008, of a viewer suited to reading newspapers will provide additional reading comfort through zoom functions. For cataloguing purposes, the unit of the digital document was initially the title itself. With the digitisation of the press, the unit is the issue, which receives its own digital shelfmark. The collation file links the issues to one another. The metadata are generated by extraction from the catalogue. All of these titles will therefore be freely consultable and downloadable remotely. Finally, it should be noted that collections that are incomplete on microfilm could, by contrast, be completed in the course of digitisation by adding missing issues once they have been located in other institutions, so as to arrive at a complete virtual collection.
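As a rough plausibility check on the figures quoted above (300 dpi, greyscale, uncompressed, 15 to 45 megabytes per page), the following Python sketch estimates the raw pixel payload of a scanned page. It rests on assumptions that are not stated in the paper: 8 bits per pixel, resolution expressed relative to the original page size, standard ISO 216 sheet sizes standing in for newspaper formats, and one megabyte taken as 1,000,000 bytes. The function and variable names are purely illustrative.

```python
# ISO 216 sheet sizes in millimetres (width, height); used here only as
# stand-ins for newspaper formats, which the paper does not specify.
ISO_SIZES_MM = {"A1": (594, 841), "A2": (420, 594), "A3": (297, 420), "A4": (210, 297)}

MM_PER_INCH = 25.4


def uncompressed_size_bytes(width_mm, height_mm, dpi=300, bits_per_pixel=8):
    """Raw pixel payload of a scan, ignoring TIFF headers and metadata."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bits_per_pixel // 8


def size_in_megabytes(fmt, dpi=300):
    """Approximate uncompressed size of one page of the given ISO format."""
    width_mm, height_mm = ISO_SIZES_MM[fmt]
    return uncompressed_size_bytes(width_mm, height_mm, dpi) / 1_000_000


if __name__ == "__main__":
    for fmt in ("A4", "A3", "A2", "A1"):
        print(f"{fmt}: about {size_in_megabytes(fmt):.1f} MB per page")
```

Under these assumptions, pages between roughly A3 and A2 fall inside the quoted 15 to 45 megabyte range, while a full A1 page would come out larger, at around 70 megabytes. This is only an arithmetic illustration, not a description of the BnF's actual production settings.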

5) Possible extensions of the project

It is also envisaged that the large formats could be digitised as part of the probable extension of the national press digitisation programme, a programme currently under way for the national dailies of the 19th century. Some titles could thus be digitised from microfilm. For other newspapers, not currently preserved, the question of digitising them directly from paper could arise. One difficulty nevertheless emerges: some communities use a non-Latin script that would not pass through OCR (Arabic, Chinese, Russian or Yiddish), or one with a large number of diacritical marks, such as Vietnamese or Armenian. This raises the question of the type of digitisation, between preservation and promotion for research purposes: should we be content to microfilm them, at the risk of disappointing the researchers concerned? Or should we move towards image-mode digitisation, admittedly less powerful for searching than text mode, but which would nevertheless make these titles accessible remotely? The question is far from settled and will take some time yet to examine. The BnF is also faced with the wish of other documentation bodies, such as the association Génériques, to digitise newspapers of the Algerian immigration or of Portuguese immigrants. Partnerships could thus be envisaged in the future, bringing together unique virtual collections through interoperable access to the various portals. While these collaborations on the shared preservation of digital documents are still at a preliminary stage, they are a strong indication of what researchers and citizens expect in terms of access to sources on the history of immigration. Let us add in conclusion that an agreement appears to be taking shape at the Ministry of Culture for the BnF, through Gallica 2, to be the portal for the whole of the digitised press in France. If such a direction were confirmed, it is likely that the press published in France in foreign languages, the press of immigrant or formerly immigrant communities, would be part of it.

Acknowledgements: thanks to Gilles Rodrigues and Thibaut Baladier, in charge of the preservation of the immigrant newspapers, without whom the digitisation project could not come into being.

PUBLICATION, ACCESS AND PRESERVATION OF SCANDINAVIAN IMMIGRANT PRESS IN NORTH AMERICA

James Simon and Patricia Finney
Center for Research Libraries

Abstract

The 19th century brought more than two and a half million immigrants from the Nordic countries to North America. Whether driven by political upheaval, population explosion, famine, the search for religious freedom, or simply enterprise, the Scandinavian emigrants traveling westward had a measurable impact on the settlement and development of both the United States and Canada. These highly literate populations developed a social infrastructure that was both true to their traditional heritage and uniquely “American.” Communities established religious and social institutions that promoted linguistic and cultural heritage. Thousands of vernacular and hybrid newspapers were founded by local communities, with some of the larger titles becoming de facto national newspapers for the immigrant population. This paper will explore ethnic Scandinavian newspapers published in the Midwestern region of the United States and the western Canadian provinces and territories from the mid-19th to the early 20th centuries. Drawing on examples of press held extensively by the Center for Research Libraries (based in Chicago, Illinois, a hub of immigrant expansion), the authors will describe the type and context of news publications, their reflection of (and impact on) the settled communities, and their collection and retention status by research institutions and other cultural organizations.

Introduction: Emigration, Short History

Sweden

Brief country data (1926):

Area (square miles): 174,000
Population: 6,000,000
Emigration rate at height of emigration: 12,000 (1925)
Government: Constitutional monarchy; monarch bestowed with power to conclude treaties, declare war or peace and veto absolutely any decree of the two chambers of Parliament.
Historical conditions (19th century): 1809 concluded a war with Russia, surrendering Finland; 1814 concluded a war with Denmark by which Norway was ceded to Sweden by Denmark, but Sweden lost all of her European possessions; Norway remained in Swedish hands until 1905.¹
Historical conditions (World War I): Officially “neutral,” with pro-German sentiments.

Emigration from Sweden to America began at a surprisingly early date in the 17th century, continued through the 18th century, gained great momentum in the 19th century, and reached its zenith in the early 20th century. Early emigrants, sponsored by the Swedish crown, founded the colony of New Sweden in what is now the U.S. state of Delaware. The area was ceded to the Dutch in 1655, but the original settlers stayed, preserving their culture and language. Emigration began in earnest in the 1840s with organized groups, primarily farmers, who settled in the U.S. states of Illinois and Iowa. The U.S. census of 1910 cites 1.4 million residents of first- or second-generation Swedish descent, a remarkable number when compared to the 5.5 million population of Sweden at the time. The huge numbers of Swedes who left their homes and traveled so far to America were motivated by lack of opportunity at home, both financial and social. Before the Industrial Revolution, most people were farmers. Farming was under great stress in 19th-century Sweden because of the tradition of dividing inherited farmland between all heirs (a practice resulting in land holdings too small to sustain a family); in addition, good farmland is not common in Sweden. The stress on farming was compounded by a total population that was multiplying due to advances in medicine (such as the advent of vaccination for smallpox), a period without war (from 1814), and the introduction of potatoes. The population of Sweden doubled between the mid-18th and mid-19th centuries. As the Industrial Revolution gained momentum, the rural isolation of the majority of the population lessened: railroads allowed ease of movement, industry encouraged urbanization, and the advent of compulsory elementary education resulted in a population better able to discern opportunities. Literacy was an extremely important factor in overseas emigration from Sweden.
The Lutheran Church had been contributing to literacy for a long time when the Elementary School Act of 1842 essentially eliminated illiteracy in the younger generations. The Swedish emigration of the early 19th century consisted of groups, sometimes of organized settlers, sometimes of religious dissenters, who entered the U.S. through New York and settled in Wisconsin, in New Sweden, IA, and in Bishop Hill, IL. Emigration gained momentum in the mid-19th century when famine, due to crop failure, beset Sweden; 60,000 left during the years 1867-1869. Emigration was also greatly fueled by the U.S. Homestead Act of 1862: easy availability of good farmland brought Swedes to states such as Minnesota, where all of the land in several counties was eventually owned by Swedes. The U.S. census of 1920 cites an amount of farmland owned in America by Swedes that would cover two-thirds of the arable land in Sweden. Emigration was not restricted to rural areas; Swedes also took to the cities, as illustrated by the U.S. census of 1910, which cites 61% of the first generation as living in cities. Chicago and, to a lesser extent, Minneapolis were the cities of choice. In the early 20th century, Chicago’s Swedish population surpassed that of Gothenburg, the second largest city in Sweden. The urbanization process brought immigrant Swedes from the bottom level
______________
1 Note: this is critical to immigration to the U.S., as any Norwegian entering before 1905 would most likely have been recorded as emigrating from Sweden, since Norway was part of Sweden until that time.


of society, with very little professional experience, to lasting achievements in business, the professions, the arts and politics.

Norway

Brief country data (1926):

Area (square miles): 125,000
Population: 3,000,000
Emigration rate at height of emigration: 7,000 (1925)
Government: Constitutional and hereditary monarchy.
Historical conditions (19th century): 1814 Norway was ceded to Sweden by Denmark; Norway remained under Swedish rule until 1905, when Norway declared its union with Sweden to be dissolved.
Historical conditions (World War I): Neutral, with strong anti-German sentiments (due to German submarine warfare resulting in losses of Norwegian lives and shipping).

In the 1850s and 1860s, the port of Quebec was a point of entry for more than 50,000 Nordic emigrants, primarily Norwegian, traveling to the U.S. Middle West, primarily the states of Illinois and Wisconsin. Canadian government officials initiated practices aimed at attracting these emigrants to Canada rather than losing them to the United States. Sherbrooke, in the Eastern Townships, was the first place of settlement, in 1854, by 14 Norwegian families, who were quickly followed by other families settling in the town of Bury. In 1857, a plan was developed to establish a Norwegian colony in the West: in Gaspe District, Ottawa County, St. Maurice or on the eastern shore of Lake Superior. An emigration agent was hired and, in 1857, 90 settlers also arrived at Bury in the Eastern Townships. During the subsequent two years, the numbers of immigrants increased and 3000 acres of land were purchased; in 1859, 15 more families arrived and purchased another 1000 acres. A plan was advanced to provide three townships exclusively for Norwegians: one on Chaleur Bay in the Gaspe Peninsula, another in the Eastern Townships, and one on the north shore of Lake Huron. An emigration pamphlet was published in 1860, centering upon the Eastern Townships and citing 2 million acres of available land. The site for Norwegian colonization was chosen in 1859, at Gaspe, and sanctioned by the Canadian government. Land would be available in plots of 100 acres, priced at $20 and restricted to Norwegians. Norwegian settlement was not to be restricted to Gaspe; other areas were also to be encouraged. In 1860, 7 families began colonization in Gaspe; 100 emigrants followed, sailing from Norway in 1861, with a total increase in settlement of 400 Norwegians and Swedes during that year. Unfortunately, the colony failed; all members left after the winter. The climate, an inability to purchase shoreline land, financial problems, and a lack of supplies led to this failure.
More than 5,500 Norwegians and Swedes used Quebec as their port of entry to North America in 1864; many expressed regret that there were no settlement opportunities in Canada, primarily due to a lack of potential employment, and virtually all continued on and settled in the U.S. Later in the 19th century, many people of Nordic origin emigrated to the Canadian prairie provinces with great success, but these early efforts did not bear fruit and are interesting when compared to Norwegian settlement in the U.S. during the same period.


Westby, WI, in the U.S., is today a town of 1600 residents. The population of the town is still overwhelmingly of Norwegian descent, settlers from Norway having first come to the area in the late 1850s. The land upon which the town was settled was purchased from the U.S. government by a group of emigrants from western Norway; they were attracted to the area by its physical similarity to the hilly country from which they came. Ole Westby was the first storekeeper, and it was by his front door that the Milwaukee Railroad first passed, hence the name of the town. The first and second Lutheran parishes were organized in 1852, with the third added in 1888. Schools were opened in 1880 and 1883. Agriculture, including tobacco, was an important occupation, with cigar filler and cigar binder leaves being the chief crop, and dairy products and white leghorn chickens also prominent. At one time, Westby had a cooperative creamery, feed store and electric company. Today, the Norwegian origins of the town are clearly recognized, with Norwegian commonly spoken, lutefisk dinners at the churches, a Norwegian Independence Day (Syttende Mai) celebration in May, a Fall (Host) festival, an international ski jumping tournament, and the Norskedalen Nature and Heritage Center nearby. Restrictions on immigration severely curtailed the stream of Scandinavians traveling to the United States; in 1925, Norway was awarded an annual quota of only 2,400.

Denmark

Brief country data (1926):

Area (square miles): 17,000
Population: 3,500,000
Emigration rate at height of emigration: 6,000 (1914)
Government: Constitutional monarchy.
Historical conditions (19th century): 1813 compelled to cede Norway to Sweden; 1860’s-1870’s period of conflict with France, German, and Austrian states over governance of Schleswig-Holstein, with Denmark’s power in the region being eventually withdrawn.
Historical conditions (World War I): Neutral

Danish emigration to the United States amounted to 250,000 persons from 1864 to 1914 and was at its highest in the decades of 1880, 1890, 1900, 1910, and 1920, with the number of emigrants per decade being 88,000, 50,000, 65,000, 41,000 and 32,000 respectively. The number of Danish-born persons living in the United States remained well over 100,000 for each U.S. Census of 1890-1930, with that of 1920 registering the highest total of 189,000. The 1920 U.S. Census recorded 7 U.S. states as having more than 10,000 Danish-born residents (in descending order: California, Iowa, Indiana, Minnesota, Wisconsin, New York, and Nebraska) and 5 U.S. states as having more than 5,000 (in descending order: Washington, Michigan, Utah, South Dakota, and New Jersey). The pattern of Danish emigration began with family or group emigration, followed by that of single individuals, including an increasing number of women as time elapsed. Far


greater numbers emigrated from towns than from rural areas. The rate of emigration ranged from 30%-10% of the total population of Denmark, cited above, for 1926. Poor economic conditions in Denmark are given as the most common catalyst for emigration, yet two questions come to mind: why was America so attractive, and through what means was information about America gleaned? Officials of the Danish government had contact with, or visited, America: diplomats, naval officers, and other public officials. By and large, their observations were not widely distributed; in fact, most official reports warned against emigrating, citing robbery and fraud as prevalent. Yet the general populace believed that opportunity lived in America; a Danish world history textbook by H.F.J. Estrup characterized the U.S. thus: “In no country is there greater freedom or faster growing trade and lower taxes…” Beginning in the 1840s, guidebooks for emigrants became available. These books directed potential emigrants to specific locations, explaining natural features of the land, flora and animals, weather, the condition of Native Americans, natural resources, and prospects for success. The second half of the 19th century saw continued publication of such guidebooks, with one, The Little America, by M.A. Sommer, reprinted 10 times from 1864 to 1891. States of the U.S., such as Wisconsin, Minnesota, Iowa and Nebraska, and many U.S. railroad companies contributed to the proliferation of guidebooks, offering separately published pamphlets in addition to advertisements in German and Scandinavian newspapers. Danish emigration to Canada rose after the 1880s, as land in the United States became scarcer and the transcontinental railroad opened up new opportunities in Canada's west. Canada actively campaigned for Scandinavian immigrants and established an Information Bureau for the Trades in Copenhagen. Between 1919 and 1934, 18,645 Danes immigrated to Canada.
They settled primarily in Manitoba, Alberta, British Columbia, and Saskatchewan.

Finland

Brief country data (1926):

Area (square miles): 144,000
Population: 3,500,000
Emigration rate at height of emigration: 20,000 (1913)
Government: Republic (1919+)
Historical conditions, 19th century: From 1809 under repressive Russian rule until independent in 1917.
Historical conditions, World War I: Identification with Germany in a successful effort to overthrow Russian rule.

The Finns, like the Danes, began large-scale emigration later than the Swedes and Norwegians, the peak being between 1899 and World War I. Two locations to which Finnish emigrants migrated in great numbers are discussed here: the Keweenaw Peninsula in the state of Michigan in the United States, and Thunder Bay in the province of Ontario, Canada. The Vaasa lääni (county), Finland, specifically South Ostrobothnia, was the point of origin for one half of the Finnish emigration to the United States. In 1900, 19,000 Finns lived in the state of Michigan, with 78% living in the Upper Peninsula, and 38% living in
the Keweenaw area. The Keweenaw Peninsula is rich in minerals, specifically copper; most early immigrants to the area came to mine. The Finns broke that pattern: initially many were involved in mining, but eventually agriculture, businesses and service industries provided occupation for many. By the late 19th century, the Keweenaw area was primarily “cutover” timber land. Such land is not usually prosperous, but the Finns succeeded where others failed due to their interest in agriculture and its inherent self-sufficiency, their interest in communal organization and assistance, their disinclination to incur debt, their love of dairy products, and their skills with dynamite (acquired in the mines), which let them easily rid the land of stumps. The Finns adapted to their new environment while attempting to perpetuate tradition. To others they seemed different and “socialistic” in that they belonged to various cooperative stores and organizations. Temperance halls stood as social centers; by 1917 there were 16 in the area. Here they maintained native folk songs, games, stories, poetry, music, athletics, and drama, and housed libraries of original language materials. Religious life was rooted in Lutheranism but with many dissenting groups offering differing doctrinal interpretations, so much so that by 1917, 22 different churches served the Finns of the Keweenaw, all but one being of Lutheran origin. Socialism and socialist halls contributed to a view of the Finns as “radical”, an oversimplified description of a group with many values, including fiscal and, at times, social conservatism, while advocating sound public education, shorter working hours, cooperatives, and government aid. They were fiercely independent, freethinking, and champions of “strong minded women”.
They felt that they lived on an island, “Kuparisaari (Copper Island)”, far outside of the American mainland in physical distance, and within a unique cultural life which perpetuated tradition while succeeding in a new environment. Finnish settlers started to arrive in Port Arthur and Fort William (now Thunder Bay), Ontario in 1888. By 1911, 1,643 Finnish immigrants lived in the area. Political and religious unrest in native Finland, along with poor economic conditions, served as the primary motivation for emigration. Finland achieved independence from Russian rule in 1917; a civil war followed, accounting for a strong wave of Finnish immigration to Canada in the period 1920-1930. The aftermath of World War II resulted in another influx of Finns to the area, all contributing to an estimated present population of 15,000 persons of Finnish descent. The Lutheran church was prominent by the early 1890’s, with temperance societies established and followed by the first Finnish workingman’s organization in 1903. Men found employment in lumber, railways and construction; women served as domestics or clerks. Agriculture and subsequent cooperative dairies and stores employed many. Unions and cooperative ventures were extremely important; Finnish business districts still serve the local population and tourists. Interestingly, many rural communities in the area declined in the 1930’s as radical immigrants, disillusioned by the Depression, emigrated to Soviet Karelia or left to serve in the Spanish Civil War.

Scandinavian Immigrant Newspaper Publication

Swedish

The publication of newspapers whose audience was primarily Swedish immigrants in the U.S. began slowly, with only 4 titles published before the U.S. Civil War; accelerated, with 176 titles having been published by the year 1886; and reached its zenith, with 1500+ titles (and tracts of all kinds) having been published by 1910. Of all of these titles, only a few, 10 or more, had national circulation; most were significant only locally. Publications
brought to their readers news from Sweden and from the Swedish settlements in the U.S., with stories dedicated to building the Swedish-American spirit. Editorially, these publications were uniformly in support of the Republican Party during, and for a long period after, the Civil War; they also advanced ideological, doctrinal and personal feuds within the Swedish-American communities, with debate which was at times hostile. The majority of early Swedish dailies did not survive many years, the longest-lived being the Skandinavisk Post (Scandinavian Post) (New York), published from 1867-1875. T.N. Hasselquist was the publisher of the first title to endure: his Hemlandet, Det Gamla och Det Nye (Homeland, Old and New) was published in Galesburg, IL beginning in 1855, moving to Chicago in 1859. He was extremely influential, affecting political, social and religious opinions throughout the 19th century. Publications in opposition to Hasselquist, especially on the topic of religion, arose in Galesburg and Bishop Hill, IL, but were of short duration. Svenska Amerikanaren (Swedish American), 1866+, Illinois Swede/Svenska Tribunen, 1869+, and Nordstjernan (North Star), 1872+, New York, contested Hasselquist’s dominance and views while maintaining allegiance to the Republican Party and conservative policies in general, sentencing to short lives any papers seeking to advance the Democratic Party, such as the Scandinavisk Post, 1863-1875, New York. This solidarity of view persisted through the turbulent period of 1880-1900, with little support for publications which sided with the Democratic Party, even on issues of monetary or tariff policies. In 1899 it was estimated that 99% of Swedes were Republican; by the election of 1912, views had changed. The Progressive movement within the Republican Party, the candidacy of Theodore Roosevelt, and the rise of industrial labor and Socialist groups heralded that change.
Socialist newspapers such as Arbetaren (Worker), 1894-1928, New York, and Svenska Socialisten/Ny Tid (Swedish Socialist/New Times), 1905-1935, Rockford/Chicago, gained importance and persisted. There were 58 weeklies and 290 journals published in the U.S. in the Swedish language in 1910. On the eve of World War I, the most important Swedish newspapers were issued in Chicago: Svenska Amerikanaren and Svenska Tribunen-Nyheter (Swedish Tribune-News), each with 75,000 subscribers, and Svenska Kuriren (Swedish Courier) with 42,000 subscribers; in New York, Nordstjernan (North Star) with 12,500 subscribers; and in Minneapolis, Svenska Amerikanska Posten (Swedish American Post) with 56,000 subscribers. Religious newspapers, primarily Lutheran organs, and regional papers in Worcester, MA, Jamestown, NY, Sioux City, IA, Omaha, NB, San Francisco and Los Angeles, CA, Portland, OR, and Seattle, WA had substantial circulation. Conservatism held firm within these papers during World War I, with feuds and disputes given less importance in the wake of a perceived need to display solidarity against forces seeking to disrupt the community; support of Swedish-American institutions and the Republican Party prevailed. The depression of the 1930’s was instrumental in precipitating a decline in the Swedish-American press; in 1938 it was estimated that there were only 30 newspapers with a total circulation of 300,000-400,000, down from 40 titles in 1931. The neutrality of Sweden during World War II inspired a surge in Swedish-American publication which endured only until the U.S. joined the war effort. By 1942 only 19 titles survived; by the end of the war, fewer than 10, most containing fewer and fewer Swedish-language articles and more and more written in English. The Swedish population in Canada accelerated its growth during the 1880's, making feasible for the first time a Swedish-language press in the country.
The first title, Skandinaviske Canadiensaren, was published in September 1887 in Winnipeg. The paper provided news on both general and Scandinavian issues, and also played a role in promoting immigration to Canada. While this title lasted only a few years, the owner soon
took over another newspaper (Väktaren) published by the Lutheran Church and reissued it as a secular paper under the titles Canada (1895-1907), Svenska Canada Tidningen (1907-1932), and later simply Canada Tidningen (1933-July 1, 1970). This title was an important publication that successfully competed with the largest Swedish-language titles published in the United States. However, as the population of Swedes in Canada tended to be spread across the West rather than centered in urban areas, the title declined in importance until it merged with the Svenska Amerikanaren Tribunen published in Chicago. Other major titles from Canada included Svenska Pressen, published in Vancouver since 1929 and continuing today as the monthly Swedish Press; Vancouver Posten (1930-1947); and Canada Svensken, a Swedish-Finnish paper published in Toronto from 1961 to 1978. The majority of titles were secular in nature, the exception being the newspaper published by the Mission Covenant titled Canada Posten (Winnipeg, May 2, 1904–Feb. 27, 1952).

Norwegian

The first Norwegian newspaper published in America was Nordlyset (Northern Light), which began in the Muskego settlement in Wisconsin in July 1847.² When a cholera epidemic struck the settlement in 1849, the limited-circulation newspaper was incapacitated, transferred ownership, and was discontinued in 1851. Several other early attempts were made to found a newspaper in Norwegian. Between 1850 and 1860, seven Norwegian newspapers were started: five in Wisconsin, two in Illinois. The major successor to Nordlyset (purchasing its equipment to publish its first issues) was Emigranten, founded in 1852 in La Crosse, Wisconsin. Emigranten brought together news and reports from various settlements around the country in addition to news from Norway, fostering a sense of unity among the immigrant settlements. By 1860, its circulation numbered more than 4,000.
The title continued through consolidation as Faedrelandet (1864-1868) and Faedrelandet og emigranten (1868-1892). This title was eventually consolidated into the long-running Minneapolis Tidende in Minnesota. Following the Civil War, Norwegian press publication accelerated. By some accounts, more than 500 Norwegian newspapers and magazines were begun between 1865 and 1914. This press served the more than 750,000 immigrants that had moved to the United States during this time. The most influential titles were produced in large cities with large concentrations of immigrants. Decorah-Posten was founded in Iowa in 1874 and published until 1972. Minneapolis Tidende lasted from 1887 to 1935. In Chicago, Skandinaven was published from June 1866 to 1941. The tale of John Anderson, founder and publisher, is a quintessential American tale. Anderson was brought by his parents to Chicago in 1845, where he attended public school for a year. With the sudden death of his father, Anderson was obliged at the age of 12 to support his family. He peddled apples, worked in a butcher shop, and carried newspapers. Eventually he became a compositor at the Chicago Tribune, where he learned much of the trade. With prudent speculation in local real estate, Anderson acquired enough means to launch Skandinaven as a weekly, and later tri-weekly, publication. The publisher lost everything in the Chicago fire of 1871. However, unbowed, with borrowed type and a small press, Anderson printed a small issue the day following the fire and was soon back in business.

______________ 2 While Nordlyset was once considered the first Scandinavian newspaper in America, evidence points to another title published in New York earlier in 1847. Skandinavia served Norwegians, Danes, and Swedes, and thus published columns in several languages.

Skandinaven soon
became nationally recognized and the largest circulating Norwegian-language newspaper in the world (exceeding any title published in Norway). Decorah-Posten was the longest enduring title, having absorbed Minneapolis Tidende in 1935 and Skandinaven in 1941. A partial key to its success was the avoidance of the controversial political and religious divisions that characterized so many of the other early newspapers of the time. The publisher, B. Anundsen, declared in print in the first issue, "I hope in a very short time to have a subscription list larger than any other paper in the county. The paper will contain NO POLITICS, but local and other news from the new and the old world besides novels and other interesting reading matter."³ It covered local events as well as world news, provided serialized literature, illustrations and cartoons, and editorials designed to appeal to a traditional, if distinctly democratic, audience. Its circulation at its peak (1920's) reached 44,000, and it deployed dozens of correspondents in Norway and across the States.

Norwegian newspapers published in Canada are few, and little about them is published. The majority of early Norwegian immigrants passed through Quebec, but as mentioned above, few settlements reached a level of coherence akin to those in the United States (only 27% of the Norwegian population in Canada lived in urban areas). The earliest recorded title is Norden, published in 1907 in Winnipeg. Nøronna was founded in 1910 in Winnipeg and continued publication for several decades. Several publications were founded in Vancouver, including Canada Scandinaven, Norseman, Norsk Nytt (1942-1955), and B.C. Posten.

Danish

As with other groups, the Danish press was slow to develop in North America. Prior to 1870, only a handful of titles were published.
This is not surprising, given that Danish immigrants were fewer in number, were spread more widely than their Scandinavian counterparts, and tended to mix more readily into other populations in urban centers. The earliest titles were Dano-Norwegian publications or multi-lingual publications that focused on the broader Scandinavian audience. Scandinavia (New York) was published in 1847, featuring news from Denmark, Norway and Sweden. Similarly, Skandinaven in Chicago was published with both Norwegian and Danish news and languages. Den Danske Pioneer (Omaha, Nebraska) was founded in 1872. This title represented a new phase of Danish newspaper publication, in that it was written exclusively in Danish, for Danish Americans. A "scrappy, liberal weekly," the Pioneer was sometimes at odds with its conservative Midwest base (approximately 75% of its subscribers were farmers). However, the paper served as a uniting force for the Danish community, publishing regular news from various settlements in the area and around the country. From 1870-1900, approximately thirty-four Danish and twenty-four Dano-Norwegian newspapers were launched in the U.S., but many of these failed rather quickly. The same was true for the beginning of the 20th century, as additional titles came and went within a year of publication.

______________ 3 Odd S. Lovoll, "Decorah-Posten: The Story of an Immigrant Newspaper". Norwegian-American Studies, Vol. 27, p. 77. http://www.naha.stolaf.edu/pubs/nas/volume27/vol27_5.htm (accessed May 15, 2008).

The Pioneer maintained a strong following among the community, gradually focusing increasing attention on Danish-American interests rather than general
news. The Pioneer reached peak circulation in 1914, with 40,000 subscribers. It continues to be published to this day, one of only two Danish newspapers in the U.S. About a dozen titles managed to succeed in the pre-war period. These included Nordlyset (New York), which published from 1891 through 1953, Chicago Posten (1881-1929), Ugebald (Minnesota, 1881-1959) and Danske Tidende (Chicago, 1895-1952). Bien, a Danish publication from San Francisco, began in 1882 (and is the second Danish newspaper still in publication, serving primarily the Danish community of the Western United States). In the post-War period, the Danish press experienced a gradual and steady decline, as communities assimilated, early settlers (and newspaper editors) passed on, or production costs forced titles to cease or merge with others. As circulation figures dropped, the importance of the newspaper as the source for general information faded, and the papers gradually transitioned to ethnic community publications, focusing on Danish heritage and the Danish-American experience. All told, the number of Danish newspapers published in the U.S. was far smaller than that of the other groups, numbering around 50 titles (95, including Dano-Norwegian titles). In Canada, the Danish press was far smaller yet, with only a dozen or so uniquely Danish, and mostly short-lived, titles. Danebrog was published in Ottawa from 1893-1932. It strongly promoted immigration to Canada, and its distribution was taken over by the Canadian Department of the Interior, which gave it to immigrants in transit to persuade them not to cross into the United States. The most widely-read Canadian publication was Danske Herold (Winnipeg, Manitoba), which published from 1932-1940 and included news from across all of Canada.

Finnish

Finnish publication followed the general trends of immigration.
The first Finnish newspaper in the United States was Amerikan Suomalainen Lehti (America's Finnish Newspaper), published in Hancock, Michigan, April 14, 1876. This title, along with several other early efforts such as Lehtinen (1876), Swen Tuuva (1878-1880), and Kansan Lehti (1889), lasted only a few months. However, as immigration exploded, the viability of published newspapers in the region grew, and several papers were started in Calumet, Hancock, and other cities of the region. Amerikan Suometar was established in 1889 in Hancock as a mouthpiece for the Lutheran church. By 1914, the paper had a circulation of 4,500, and it continues to be published to this day. Paivalehti started in 1901 in Calumet. Its reputation grew quickly in pre-War years, with a series of capable editors, and its circulation grew to 7,410 subscribers by 1912. It moved to Duluth, Minnesota in 1914, where it continued publication until 1948. The first Finnish newspapers in Canada were hand-written papers, known as "fist-press", produced in the absence of vernacular publications focusing on local issues. These were not commercial enterprises, but were often put forth by community organizations espousing such wide-ranging views as temperance, women's emancipation, or socialism. Aika (Time) was the first printed Finnish-language newspaper in Canada, established in 1901 in the Kalevan Kansa colony, a planned utopian socialist community. The newspaper, like the colony, foundered within a few years. Työkansa (The Working Class), the second Finnish paper, was published in Port Arthur in 1907 as the official organ of the Finnish Socialist Branches of the Socialist Party of Canada. Bankruptcy forced its closure in 1915.

The Canadian Finnish press was very politically oriented, as it was in the U.S. The socialist literary weekly Murtava Voima began publication in 1908 under the Finnish Publishing Co. Ltd., the same publisher as Työkansa. Vapaus (Liberty) was one of the most successful publications of the Finnish Organization of Canada, published from 1917. When the Finnish language was declared an "enemy language" under the War Measures Act in 1918, Vapaus was suspended until April 1919. The title soon recovered, and by the 1930's its circulation reached five thousand subscribers.⁴ Vapaus and other socialist papers played an important role in the labor movement in North America. Other titles include Industrialisti (The Industrial Worker), the Finnish language newspaper of the Industrial Workers of the World party, published ca. 1915-1975; Toveritar (The Women’s Comrade), published in Oregon and circulated in Canada; and Vapaa Sana (Free Speech), which was founded in 1931 after the ideological split within the Finnish Organization of Canada. Vapaa Sana aligned itself with the minority Finnish Canadian Workers' and Farmers' Federation, but soon declared its independent status. It continues to be published today.

Current Collection & Preservation Status

The collecting of ethnic newspapers in the United States has been left largely to historical institutions and state archives, rather than to academic institutions. Using Norway as an example of the kinds of resources available, the paper will illustrate the multiplicity of resources at one's disposal. There does not exist one single reliable source of information relating to the holdings and preservation status of Scandinavian newspapers in the U.S. OCLC WorldCat contains the most records of the many resources surveyed. For example, a search of Norwegian language newspapers published in the United States returned 229 entries, far exceeding the 86 titles available in "Chronicling America," the newspaper directory hosted by the Library of Congress. However, OCLC does not contain the most accurate information relating to these titles, either, partly due to partial or inaccurate cataloging. By no means is every title cataloged within WorldCat. Moreover, numerous records do not seem to contain reliable information relating to microfilm holdings of particular titles. A search of Wisconsin newspapers published by Norwegian Americans will produce numerous records with no formal indication of preservation filming. However, a search of the catalog of the Wisconsin Historical Society⁵ provides more detailed holdings, including microfilm information for particular titles. A variety of inventories have been created and/or added to over time, including appendices in books, spreadsheets on the Internet, and newspaper databases from regional organizations. A number of these contain unique information pertaining to newspapers available locally. An excellent example of an inventory exists at the Norwegian American Historical Archive,⁶ which lists 375 records for newspapers published in the United States.
______________ 4 In 1974, Vapaus merged with the literary weekly Liekki to form Vikkosanomat, which continued publication until 1987.
5 http://madcat.library.wisc.edu/index.html (accessed May 15, 2008)
6 http://fusion.stolaf.edu/naha/index.cfm?fuseaction=newspaper (accessed May 15, 2008)

No one inventory appears to be entirely accurate or complete; the NAHA database often contains conflicting statements about publishing locations, starting dates, etc. Thus, it falls to the researcher to conduct a more exhaustive and comparative search of regional and state catalogs, databases, and publication lists. In terms of preservation, it can be stated, generally, that there has been modest success in the preservation of Scandinavian titles on microfilm. Institutions such as state historical archives and regional educational institutions have worked over time to preserve these resources. A number of institutions have included major titles as part of their state-oriented preservation work under the laudable United States Newspapers Project. Thus, states such as Wisconsin, Minnesota, and North Dakota have done remarkable work in preserving numerous Scandinavian titles. Other states have yet to complete the work or did not engage in the preservation of these particular titles. By ethnicity, Finnish-American newspapers are the most comprehensively preserved of all the Scandinavian groups. In 1983, The Finnish American Newspaper Project began microfilming Finnish-American newspapers held in a variety of institutions. Copies of the films were stored with the Immigration History Research Center at the University of Minnesota in St. Paul, MN, with copies stored at the Helsinki University Library. Extensive foreign support was instrumental in the success of the project. The following institutions hold collections dedicated to preserving and demonstrating the Nordic immigrant experience in the United States. Their collections include ethnic newspapers and publications.
Swedish
American Swedish Historical Museum, Philadelphia, PA
American Swedish Institute, Minneapolis, MN
Augustana College, Rock Island, IL
Nordic Heritage Museum, Seattle, WA
Swedish American Museum Center, Chicago, IL

Norwegian
Illinois State Historical Society
Luther College Library
Minnesota Historical Society
Norwegian-American Historical Association, St. Olaf College
North Dakota State Historical Society
South Dakota State Historical Society
University of North Dakota

Danish
Danish Immigrant Archives, Dana College, Blair, NB
Danish Immigrant Archives, Grand View College, Des Moines, IA
Danish Immigrant Museum, Elk Horn, IA
Danish American Heritage Society

Finnish
Bentley Historical Library, University of Michigan
Finlandia University: Finnish American Heritage Center, Finnish American Historical Archive and Museum, Hancock, MI
Immigration History Research Center, UMN (St. Paul, MN)
Suomi College (Hancock, MI)

Within Canada, much of the preservation work has been undertaken by local archives via the Decentralized Program for Canadian Newspapers sponsored by Library and Archives Canada. Due perhaps to the relatively few publications issued by the Canadian immigrants, the proportion of titles preserved on microform in Canada is higher than that in the United States. In every ethnic group, however, it can be asserted that only a portion of the available titles have been preserved. It seems clear that further effort will be required. Numerous unique titles are still held by relatively few institutions without the capacity to preserve their works, and some resources are still being rediscovered as they are donated by first and second-generation immigrants to local historical societies. A good example of this is the discovery of a Norwegian-Canadian paper, Vikingen, from Edmonton, Alberta, published ca. 1911-1915 and brought to light in 1996.

Conclusion: Center for Research Libraries

The Center for Research Libraries' collection of press from the various immigrant communities in the United States and Canada is a trove of information useful for researchers and genealogists. The U.S. Ethnic Press Collection includes newspapers produced by or for ethnic communities from the 1700s through the present day. The papers mirror the lives, values, and concerns of Chinese- and Polish-Americans in nineteenth century Chicago, African-Americans along the Atlantic seaboard, and recently established communities from Southeast Asia and the Middle East. CRL preserves and makes accessible historical runs of more than 110 titles from the Scandinavian countries. The holdings are strongest surrounding the Chicago area, but the collection is representative of the entire Scandinavian experience, with papers from Wisconsin, Iowa, Minnesota, North and South Dakota, Michigan, and select titles from Canada. CRL maintains an ongoing database of more than 2,000 periodicals and newspapers published by or for various ethnic communities. As the work of the U.S. Newspaper Project winds down, and institutions consider digitization of their press holdings, it would be prudent to reassess the preservation status of these minority publications and take action to ensure their longevity. Concerted, cooperative action may be the best means of meeting the ongoing challenges of preservation and access to these resources.

Sources:

Scandinavian (General)

Babcock, Kendric Charles, The Scandinavian element in the United States. New York, Arno Press, 1914
Blegen, Theodore Christian, The historical records of the Scandinavians in America. Minnesota history bulletin. Vol. 2, no. 6 (May 1918).
Encyclopedia Americana, a Library of Universal Knowledge. New York: Americana Corporation, 1929
Hancks, Jeffrey W. Scandinavians in Michigan. East Lansing: Michigan State University Press, 2006
Hoerder, Dirk & Harzig, Christiane, The Immigrant Labor Press in North America, 1840s-1970s: An Annotated Bibliography. Greenwood Publishing Group, 1987.
Hoerder, Dirk. Essays on the Scandinavian-North American radical press, 1880s-1930s. Bremen: Labor Newspaper Preservation Project, Universität Bremen, 1984
Magocsi, Paul R. Encyclopedia of Canada's Peoples, Multicultural History Society of Ontario, University of Toronto Press, 1999, p. 407
Thernstrom, Stephan, Harvard Encyclopedia of American Ethnic Groups. Cambridge, Mass.: Belknap Press of Harvard University, 1980
W.P.A. Writers’ Program. Wisconsin: A Guide to the Badger State. American Guide Series. New York: Duell, Sloan and Pearce, 1941

Danish

Danish Emigration to the U.S.A., ed. Larsen, Birgit Flemming, et al. Aalborg, Denmark: Danes Worldwide Archives, 1992
Danish Immigrant Museum, Elk Horn, IA. Web resource.
Marzolf, Marion. The Danish-language press in America. New York: Arno Press, 1979, ©1972
"The Danish Pioneer: the early years". http://www.thedanishpioneer.com/english/theearly.html (accessed May 15, 2008).

Finnish

Finnish Settlement in Thunder Bay, Ontario Canada. Web resource extracted from The Finnish Experience produced by the Thunder Bay Finnish Canadian Historical Society.
Holmio, Armas Kustaa Ensio & Ryynanen, Ellen M., History of the Finns in Michigan. Wayne State University Press, 2001.
Thurner, Arthur W. Strangers and Sojourners: A History of Michigan’s Keweenaw Peninsula. Detroit: Wayne State University Press, 1994.
Hoglund, A. William. Survey of Finnish-American newspaper holdings. Storrs, Conn.: University of Connecticut, 1984
Hoglund, A. William, Union list of Finnish newspapers published by Finns in the United States and Canada, 1867-1985. Minneapolis: Finnish American Newspapers Microfilm Project, 1985
Kolehmainen, John Ilmari, "Finnish newspapers and periodicals in Michigan". Michigan history magazine. Lansing, Michigan Historical Commission. v. 24, no. 1, Winter number, 1940. p. 119-127. http://www.suku.fi/emi/art/article213e.htm (accessed May 15, 2008).
Kolehmainen, John Ilmari, The Finns in America: a bibliographical guide to their history. [Hancock, Mich.]: Finnish American Historical Library, Suomi College, 1947

Laine, Edward W. Archival Sources for the Study of Finnish Canadians. National Archives of Canada, Ottawa, Ontario. Web resource: http://my.tbaytel.net/bmartin/finnarch.htm (accessed May 15, 2008).

Norwegian

Allwood, Inga Wilhelmsen, The Norwegian-American press and Nordisk tidende, a content analysis. Mullsjö, Sweden, Institutet för samhällsforskning, 1950
Andersen, Arlow W., The immigrant takes his stand; the Norwegian-American press and public affairs, 1847-1872. Northfield, Minn., Norwegian-American Historical Association, 1953
Andersen, Arlow W., Rough road to glory: the Norwegian-American press speaks out on public affairs, 1875 to 1925. Philadelphia: Balch Institute Press; London: Associated University Presses, 1990
Anundsen, B., Decorah-posten, 1867-1897. [Decorah, Iowa: Decorah-posten, 1897]
Barton, Albert O., The beginnings of the Norwegian press in America. [Madison, Wis.]: State Historical Society, 1916
Blegen, Theodore Christian, Norwegian Migration to America, 1825-1860. Ayer Company Publishers, Inc., 1969.
Blegen, Theodore C., Norwegian Migration to America, the American Transition. Northfield, MN: The Norwegian American Historical Association, 1940
Norlie, Olaf Morgan, Norwegian-Americana papers, 1847-1946. Northfield, Minn. Eilron mimeopress, 1946
Semmingsen, Ingrid, Norway to America: A History of the Migration. University of Minnesota Press, 1979.
The Story of Skandinaven, 1866-1916. [Chicago, Ill.: J. Anderson Pub. Co., 1916]
Sundby-Hansen, Harry. Norwegian immigrant contributions to America's making. New York: [The International Press], 1921

Swedish

Backlund, Jonas Oscar, A century of the Swedish American press. Chicago, Swedish American Newspaper Co. 1952
Capps, Finis Herbert. From Isolation to Involvement, The Swedish Immigrant Press in America. Chicago, Swedish Pioneer Historical Society, 1966.
Erickson, E. Walfred, Swedish-American periodicals: a selective and descriptive bibliography. New York: Arno Press, 1979
Setterdahl, Lilly. "Swedish-American newspapers: a guide to the microfilms held by Swenson Swedish Immigration Research Center, Augustana College, Rock Island, Illinois." Rock Island, Ill.: Augustana College Library, 1981
Swedish Emigrant Institute, The House of Emigrants, Vaxjo, Sweden with special thanks to Mr. Yngve Turesson. Web resource. http://www.utvandrarnashus.se/eng/

PRESS, COMMUNITY, AND LIBRARY
A study of the Chinese-language newspapers published in North America

Tao Yang
Rutgers University Libraries

Introduction

“At the entrance to the Princeton branch of the Asian Food Market, half a dozen free Chinese-language newspapers are stacked next to the usual supermarket offerings…” (Kwong & Miscevic, 2005, p. 401). With a vivid account of the Chinese-language newspapers circulating in Princeton, New Jersey, Kwong and Miscevic began the final part of their highly acclaimed book on Chinese American history. They went on to describe the content of the papers and concluded that “(t)he papers serve as focal points for Chinese speakers in the geographic areas they cover, and by dispensing information about the needs of their readers and the services and opportunities offered them, they make the otherwise disconnected Chinese immigrants feel that they in fact belong to a community” (ibid, p. 402).

Today, it is common in the Asian grocery stores of New Jersey to see a number of Chinese newspapers stacked at or near the entrance; the origin of this scene may be traced back to the mid-1990s, when the New York Times first reported that a “press war” was happening among Chinese newspapers in New Jersey (Chen, 1995). I first encountered this scene in the fall of 2007 when I was in the process of relocating to New Jersey. Since then I have had the opportunity to examine these newspapers from the perspectives of both a consumer and an information professional. This study of the North America-based Chinese newspapers also stems from my previous experience working with the English-language newspapers and periodicals published in pre-Communist China (Yang, 2005).

New Jersey is far from unique in seeing a proliferation of Chinese-language newspapers. Currently, several Chinese-language dailies such as World Journal (世界日报) are distributed across Canada and the United States.
The community-based press is being revived in old Chinatowns and is burgeoning in new “ethnic suburbs” into which large numbers of Chinese are moving (Li, 1998). Correspondingly, in English-language scholarship, the study of the Chinese-language press has become a multi-disciplinary endeavor, drawing interest from both social scientists (Zhou, Chen, & Cai, 2006; Lin & Song, 2006) and historians (Lai, 1987, 1990). Since the publication of a union list of Chinese-language newspapers in North America (Lo & Lai, 1977), the library community has made considerable progress in relation to this type of press, but considerable challenges remain ahead, given the proliferation of publications and the growth of research interest.

In the following, I will first provide some general observations about the relationships between the Chinese community and its press. Then, drawing on the existing scholarship, I will outline the historical development of the Chinese-language press in North America. A snapshot of the contemporary newspapers will also be provided, largely based on my own investigations of newspapers circulating in two metropolitan

Tao: Press, Community and Library

areas: the New York City-New Jersey metropolitan area in the U.S. and the Greater Toronto Area in Canada. At the end of the paper, I will summarize the issues pertinent to the library community.

People and Press: General Observations

When we talk about the Chinese living in North America, we have to remind ourselves that this is a diverse group in terms of spoken language, educational background, economic status, geographic origin, and political attachment. Their relationships with the newspapers vary accordingly. Many Chinese Americans or Chinese Canadians who grew up in North America may not be able to read Chinese, so to them the Chinese-language newspapers may be insignificant as a source of information. But to those who work in business or provide professional services, the newspapers provide a venue for reaching potential co-ethnic customers.

Due to the continuous waves of immigration during the last several decades, first-generation immigrants still make up the majority of the Chinese population in North America. New immigrants arrive via different tracks. Professional workers enter the workforce in the U.S. or Canada after finishing college and, more often, postgraduate training, and therefore may have acquired sufficient English-language skills; to them Chinese newspapers may complement or supplement the English-language media they are able to access. Seniors and unskilled workers usually come under family sponsorship, and most of them may have to rely on the Chinese newspapers heavily, or even exclusively, for information about the outside world. Increasingly, affluent immigrants seek permanent residence through business investments, and to them the Chinese newspapers present a unique advertising opportunity.

Chinese speakers who came from different geographic areas often develop subgroups that are separate from each other, due to linguistic and/or political barriers.
Those who came from Hong Kong, a British colony until 1997, and its neighboring Guangdong Province often prefer to speak Cantonese, which is next to impossible for Mandarin speakers to understand. Mandarin is the standard language in both mainland China and Taiwan, but the hostility and distrust between the mainland and Taiwan since 1949 is a barrier that takes time to overcome, even in North America. Ethnic Chinese who came from Southeast Asia form another distinctive subgroup; many of them arrived as refugees in the 1970s and 1980s. Political rifts further divide the Chinese community along fault lines such as the issues of Taiwan independence and Falun Gong. All these divisions create diverse or even conflicting demands for the content of the newspapers.

Given all these divisions, one of the few characteristics that most Chinese immigrants do share is the ability to read Chinese text, but even that is problematic. In Hong Kong and Taiwan, traditional Chinese characters remain the standard writing system, but mainland China started to adopt the so-called “simplified characters” half a century ago. Therefore, the immigrants from Taiwan or Hong Kong can only read the traditional characters, while the mainlanders prefer to use simplified characters even though they can recognize most of the traditional characters with extra effort. So the Chinese newspapers published in North America have to decide which character set to use in order to appeal to the subgroup(s) they want to reach.


Needless to say, the above analysis is necessarily generalized and even overly simplified, since many other variables could also affect an individual’s relationship with a newspaper, but hopefully it has demonstrated the complexity of the relationship between the ethnic Chinese and their newspapers. It may serve as a point of departure when we discuss the development of Chinese-language newspapers as a genre, from its beginning in the 1850s to today.

History of Chinese-language Newspapers in North America: An Outline

Available evidence suggests that the first Chinese newspapers in North America appeared in the 1850s, shortly after the initial wave of Chinese laborers, primarily from Guangdong, entered California during the Gold Rush. There were several hypotheses as to the exact title and year of the very first newspaper, but today most scholars agree that the Golden Hills' News (金山日新录) was the first Chinese-language newspaper in North America. It started in San Francisco in 1854, probably on April 22, as Karl Lo concluded in a painstakingly detailed analysis (Lo, 1971).

The Golden Hills’ News was established by American missionaries working with the Chinese laborers in San Francisco. In the April 29 issue of the newspaper, a note in English, supposedly from the publisher Howard, indicated that the Golden Hills' News was created for a new Chinese Mission Chapel and “(t)he influence of chapel and press is intended to relieve the pressure of religious ignorance, settle and explain our laws, assist the Chinese to provide their wants and soften, dignify and improve their general characters” (quoted in Lai, 1977, p. 28). A digitized copy of the May 27, 1854 issue of the Golden Hills' News is available in the California Digital Library (original copy in the collection of the California Historical Society), so we are able to look into the full content of this issue. It has four pages, printed on a single sheet of paper that is folded in the middle.
On the first page of this issue, there is a small piece written in English titled “Chinese Exodus”, in which the author called on the “(m)erchants, (m)anufacturers, (m)iners, and (a)griculturists” to come forward as the friends of Chinese, “so that they may mingle in the march of the world, and help to open for America an endless vista of future commerce” (Anonymous, 1854). Similarly, the first Chinese piece, written by an author named Li, suggested that this newspaper could provide some small help to Chinese by facilitating business, providing knowledge, conveying popular opinions, and communicating governmental issues (Li, 1854). The other pieces, all written in Chinese, are different types of news items, including ship arrival and departure dates, prices of commodities, local news, and news related to the U.S., China, and other parts of East Asia.

The Golden Hills' News probably existed for only a few months, but it is significant as the first of the ethnic Chinese newspapers (see discussion in Wilson, Gutierrez, & Chao, 2003, pp. 280-285). Among other things, it is believed that the Golden Hills' News established a technical standard for Chinese newspapers that was in use until the turn of the century (Lai, 1990).

Shortly after the demise of the Golden Hills' News, another missionary publication, The Oriental (东涯新录), was started in January 1855 by Reverend William Speer, who happened to belong to the same Chinese Mission Chapel that sponsored the Golden Hills’ News. The Oriental included an English section and ceased publishing at the end of 1856. The first Chinese-run Chinese newspaper, Chinese Daily News (沙架免度新录), began publication in 1856 in Sacramento, California and continued until 1858, though not much else is known about it (Lo & Lai, 1977).

Several papers published in the 1870s and 1880s and sampled by William E. Huntzicker (1995) are remarkably similar to the Golden Hills’ News in terms of content. Taking San


Francisco China News (旧金山唐人新闻纸, 1874-1875) as an example: it was published weekly and covered prices of commodities, shipping news, statistics of imports and exports, advertisements, editorials, tabloid-type stories, and China-related news (Huntzicker, 1995).

The end of the 19th century saw an increase of interest in Chinese politics in the Chinese newspapers in North America, as the various political factions in China competed for influence over, and the support of, overseas Chinese through publishing. In fact, the first Chinese-language newspaper in Canada, Chinese Reform Gazette (日新报 or 新报, Vancouver 1903-1911), belongs to this category; it was established by the Chinese Reform Association, which pursued the goal of reforming China while preserving the Manchu monarchy until the revolution of 1911 overthrew the Manchu government. The revolution of 1911 was led by Dr. Sun Yat-sen and his Revolutionary Party. Dr. Sun built a large following among the Chinese living in North America and might have had direct influence over the revolutionary organ The Youth (later Young China, 少年中国晨报) in San Francisco (Wen, 2005). Both the reformists and the revolutionaries published various newspapers to promote their own agendas and argue against the other side.

In 1927, the Chinese Nationalist Party, the successor of the Revolutionary Party, established a national government in Nanjing that ruled China until 1949. During this period, most Chinese newspapers in North America that had previously favored the opponents of the Nationalists either ceased publishing or transferred control to various Nationalist factions. The most notable exception was the Chinese Times (大汉公报) in Canada, which was controlled by the Chinese Free Masons, a quasi-political party, and stayed in business from 1914 to 1992, making it one of the longest-running Chinese newspapers in North America. In the late 1940s, the Nationalists engaged in the civil war with the Chinese Communists.
When the defeat of the Nationalists seemed inevitable, pro-communist voices started to surface in the Chinatown newspapers, but they were subsequently suppressed and prosecuted by the Nationalists and by the U.S. government, which supported the Nationalists (Lai, 1990).

Ironically, while the political forces in China dominated the Chinatown newspapers in North America during the first half of the 20th century, one of the most popular newspapers in this period was Chung Sai Yat Po (中西日报), a newspaper that was unaffiliated with any Chinese political party for most of its life. The early success of Chung Sai Yat Po may be attributed to its focus on community issues and its promotion of the integration of Chinese immigrants into American society (Sun, 1998). Later, Chinese Pacific Weekly (太平洋周报, San Francisco 1946-1979), another influential community paper, adopted a similar non-partisan stance (Zhao, 2006). It is also worth noting that Chinese-language dailies such as Chung Sai Yat Po and Mong Hing Yat Po (文星日报, San Francisco, 1891-1969) published quality literary works written by immigrants; these works, when rediscovered in the 1970s, became important sources for studying early Chinese American literature (Kwong & Miscevic, 2005, pp. 304-305).

Generally speaking, the Chinese-language press was in decline from the end of World War II through the 1960s, for both political and demographic reasons. The hostility between Communist China and the West and the anti-communist sentiment in North America (particularly the McCarthyism of the 1950s in the U.S.) might have silenced a generation of ethnic Chinese. In terms of demography, many of the younger generation of Chinese born in North America were unable to read Chinese, while only a very limited number of


new immigrants arrived during this period. At that time, the Chinese newspapers appeared to be “doomed to eventual extinction”, as observed by Lai (1977).

In the 1960s, at the same time as the established Chinatown newspapers were in decline, a new type of Chinese press entered North America from Hong Kong and later Taiwan and gradually achieved nationwide prominence in both the U.S. and Canada. Sing Tao Daily (星岛日报), a Hong Kong-based paper, made one of the earliest efforts to enter North America (Leung, 2007). In 1961, Sing Tao began daily distribution in San Francisco by airmail. Two years later, it started printing in San Francisco. In the subsequent years, Sing Tao developed local editions in New York, San Francisco, Los Angeles, and Vancouver, and it is now distributed across much of the U.S. and Canada. The most important paper from Taiwan is the World Journal (世界日报), which was established in 1976 by Wang Tih-Wu, the owner of United Daily (联合日报) and a high-ranking member of the Nationalist Party in Taiwan.

Joseph Leung (2007), Sing Tao’s current Editor in Chief in San Francisco, differentiated three periods in the development of Sing Tao in North America: the nostalgia period (1960s-70s), the localization period (1980s-1990s), and the global Chinese network period (21st century). This periodization may also be applied to the World Journal and other similar publications. Leung (2007) also attributed the success of Sing Tao in North America to its advantages over the older Chinatown presses in terms of business model, available capital, and content coverage. But the ultimate reason for the success of this new type of transnational press may be the steady increase of Chinese immigrants during the last several decades, due to the favorable changes of the immigration laws in the U.S.
and Canada in the 1960s, the improved relations with China in the subsequent years, and the economic and political developments in China, Hong Kong, and Taiwan that generally facilitate out-migration.

Contemporary Chinese Newspapers in North America: A Snapshot

Today, the Chinese press in North America mainly consists of two types of newspapers: (a) major dailies that are distributed nationwide and (b) numerous community papers that serve local Chinese populations; a few papers do not fall easily into either category. All of the major papers are distributed via paid subscriptions and sold at retail outlets, while the community papers are usually free to consumers.

Table 1 lists the names and web addresses of four major Chinese dailies, their circulation numbers, their transnational affiliations, and the year in which each established a North American edition (Sing Tao and Ming Pao had begun distributing and printing in North America long before their North American editions).


Table 1: Major Chinese-language Daily Newspapers in North America (adapted from Zhou et al., 2006)

| Title | Circulation in the U.S. | Circulation in Canada | Transnational Connections | Year Establishing 1st North American Editions | Local Editions |
|---|---|---|---|---|---|
| World Journal 世界日报 www.worldjournal.com | 298,500 | 25,000 | Affiliated with United Daily in Taiwan | 1976 | New York, Los Angeles, San Francisco, New Jersey/Philadelphia, Boston, Chicago, Washington, D.C., Atlanta, Toronto, Vancouver |
| Sing Tao Daily 星岛日报 www.singtaousa.com www.singtao.ca | 181,000 | 40,000 | Affiliated with Sing Tao Group in Hong Kong | 1975 | San Francisco, New York, Los Angeles, Toronto, Vancouver, Calgary |
| China Press 侨报 www.usqiaobao.com | 120,000 | N/A | Maintaining close ties with the mainland China | 1990 | Los Angeles, New York, San Francisco |
| Ming Pao Daily News 明报 www.mingpaona.com | 100,000 | 35,000 | Affiliated with the Ming Pao Enterprise in Hong Kong | 1993 | New York, San Francisco, Toronto, Vancouver |

Among these four papers, World Journal is unquestionably the leader in terms of both total circulation and geographic coverage. Sing Tao and Ming Pao appeal to immigrants from Hong Kong; historically Sing Tao was pro-Taiwan while Ming Pao was more sympathetic toward mainland China (Lai, 1990), but today that difference may be diminishing. China Press claims a niche market with extensive coverage of mainland China, but its presence in Canada seems to be minimal.

All of these major dailies are either formally affiliated with media organizations in Taiwan (World Journal) or Hong Kong (Ming Pao and Sing Tao) or have maintained informal ties with the authorities in mainland China (China Press). This has been an issue of contention for other newspapers with no such ties: for example, Lai (1990) described the protests against the World Journal by the editors of smaller Chinese newspapers when it was first launched in 1976. The transnational connection is also a source of confusion for outside observers. There is an interesting case in point: in early 2007, when U.S. Senator


Hillary Clinton was in San Francisco to raise money for her failed presidential campaign, her campaign staff excluded reporters from China Press, Sing Tao, and World Journal from the fund-raising event (but they did let the Ming Pao journalist in). When the journalist from the World Journal, who had been working in the San Francisco area for two decades, complained, a Clinton staffer responded that she was considered to be from “foreign media”, of which only a single pool reporter could be admitted. After the World Journal and other Chinese media publicized the story, the Clinton campaign apologized and promised to update its press list (Hua, 2007). My own perspective on this issue is that these major dailies should be considered part of the ethnic Chinese press, given their extensive efforts to cover issues of interest to the Chinese communities.

In addition to these four major dailies, some other newspapers also strive for nationwide distribution. International Daily News (国际日报, Los Angeles, 1981-) is sometimes regarded as a major paper, but it does not have a substantial presence outside of California. International Daily News is currently also the publisher of the Overseas Edition of People’s Daily (人民日报), a mouthpiece of the Chinese government, and the North American Edition of Wen Wei Po (文汇报), a pro-PRC paper in Hong Kong, but both papers only reprint selected content from their home editions and have no local news, so they should not be confused with the majority of Chinese newspapers published in North America.

Some special-interest publications published in newspaper format are also able to reach audiences across North America, but their value as news sources is not the same as that of the major papers mentioned above. Herald Monthly (号角月报), a Christian publication by the New York-based Chinese Christian Herald Crusades, is widely distributed in the U.S. and Canada, but it has minimal coverage of current affairs.
Epoch Times (大纪元时报), a weekly publication from Falun Gong with numerous local editions, does report current news, but usually through the lens of Falun Gong. Both are freely available in retail outlets.

In contrast to the major papers, the local community papers are too numerous to list in full. Instead, I will use the papers from two metropolitan areas as examples. Table 2 lists the papers from the New York-New Jersey metropolitan area and Table 3 lists the papers from the Greater Toronto Area. I must acknowledge that the information about papers in Toronto is incomplete due to the difficulty of collecting information remotely.


Table 2: Local Chinese-Language Newspapers and Periodicals (New York-New Jersey Metropolitan Area)

Title | Frequency | Distribution Method | Year Founded | Circulation | Office | Source

Asian American Times 亚美时报

Weekly

Free

1987

20,000 Flushing, NY

a

Brooklyn Chinese Monthly 号外月刊

Monthly Free

1999

35,000 Brooklyn, NY

a

Chinese Consumer Weekly 今周刊

Weekly

Free

1997

Chinese News Weekly 新象周刊

Weekly

Free

1995

Duowei Times Weekly 多维时报 www.duoweotimes.com

Free

1999

Edison, NJ a & b

Global Chinese Times 新州周報

Weekly

Free

1991

Edison, NJ b

Liberty Times 自由時報

Daily

Free in NYC, Newsstand in NJ, Subscription

1995 (ceased publishing in June 2008)

30,000 Flushing, NY

a

2003

Flushing, NY

b

South Plainfield, NJ

b

New York Community Times 纽约社区报 NJ Chinese Living 新泽西生活报

Weekly

Free

Edison, NJ b

15,000 Metuchen, NJ

a

Sino Monthly 汉新月刊 www.sino-monthly.com

Monthly

Free, Subscription

1991

17,000 Edison, NJ

a & b

The SinoAmerican Times 侨报周刊 www.sinovision.com/newspaper.php

Weekly

Free

30,000 (NYC); 20,000 (NJ) New York, NY

b

2007

(Sources: a. Scher 2004. b. This author’s investigation)


Table 3: Regional Chinese-language Newspapers (Greater Toronto Area)

Title | Frequency | Distribution Method | Year Founded | Circulation

Canada China News 中华导报 http://www.canadachinanews.com

Chinese Canadian Times 加中时报 http://www.cctimes.ca

1995

Weekly

Free

2002

Chinese News 大中报 http://www.chinesenewsgroup.com

Three times a week

Free

1993

Global Chinese Press 环球华报 http://www.gcpnews.com

Twice a week

Free

2000

New Star Weekly 星星生活 http://news.newstarnet.com

Weekly

Free

2002

Today Daily News 现代日报 http://www.todaydailynews.com/

Daily

Subscription & Newsstand

2005

13,000

60,000

Very Good News 华报 http://www.verygood.ca

Weekly

Free

2004

11,000

(Based on the information provided on the web site of each respective newspaper)

The two tables show considerable similarities between the newspapers published in these two areas: most of the publications are weeklies and are relatively new. There is also a noticeable difference: the Toronto newspapers are more likely to have their own web sites than the papers in NY-NJ, which may be due to regional variation in the availability of affordable information technology talent.

Within each metropolitan area, the papers differ by content and geographic coverage. I will illustrate this point with papers from the NY-NJ region. Some papers in this area can reach all the communities within the metropolitan area and even beyond, because they have little community-specific content. Their content consists either of material recycled from other sources or of advertisements. The better ones in this group tend to be affiliated with a larger media company; for example, Duowei Times and Chinese Consumer Weekly are affiliated with Chinese Media Net (多维媒体公司), which owns several other Chinese-language media outlets, while SinoAmerican Times is a product of SinoVision (美国中文电视), a New York-based Chinese-language TV network. The other type of paper focuses on covering a smaller community in the region and can therefore be considered purely community press. These papers tend to provide more unique or original content; examples are Asian American Times, Brooklyn Chinese Monthly, and New York Community Times in New York, as well as Chinese News Weekly, Global Chinese Times, and Sino Monthly in New Jersey.


Issues for Libraries

The above information about the historical and contemporary Chinese-language newspapers in North America can help us address the issues facing the library community, particularly with respect to locating, collecting, preserving, and digitizing these newspapers. In the following, I will discuss both the accomplishments in these areas and the challenges ahead.

Locating newspapers

To access the newspapers, a researcher needs to know what newspapers are available and where they can be found. One of the traditional ways for the library community to provide access is to compile a union list of newspapers. The first union list of Chinese newspapers published in North America came out in 1976, as a result of the collaboration between a librarian, Karl Lo, and a pioneering scholar in Chinese American studies, Him Mark Lai. Altogether, they found 252 Chinese-language newspapers and periodicals published in the U.S. and Canada in the period from the 1850s to 1975 and indicated the holding libraries as well (Lo & Lai, 1977). More recently, the Overseas Chinese Documentation Center at Ohio University developed an online searchable database for overseas Chinese newspapers and journals (http://132.235.47.66/opac), which includes publishing information for 402 titles from North America (342 from the U.S. and 60 from Canada) but not the holding libraries, so a researcher has to use other tools to find where a title is held.

Union catalogs such as WorldCat from OCLC would be an ideal tool for identifying the holding libraries of a particular title. But with respect to the Chinese newspapers, the efficacy of library catalogs is compromised by the existence of variant titles and transliterations and the lack of cross-references in the cataloging records.
Most of the Chinese newspapers published in North America have a parallel English name, which may or may not be the exact translation of the Chinese name: for example, World Journal is the direct translation of 世界日报, but China Press is not the exact translation of 侨报. To English speakers, a newspaper may be known by its English name or by its romanized (or transliterated) form. When we take into account the differences in pronunciation between Chinese dialects (particularly Cantonese vs. Mandarin) and the changes of Chinese romanization systems since the 19th century, things become even more complicated.

A case in point is the Golden Hills’ News, the first Chinese newspaper published in North America. It has various names and transliterations that are used in different contexts:

• 金山日新录, the title in Chinese characters, usually appearing in Chinese-language writings.
• Golden Hills’ News, the parallel English title, used in many English-language writings.
• Kim Shan Jit San Luk, the older romanization rendered from Cantonese pronunciation, used in some English writings, such as Karl Lo (1971).
• Jin shan ri xin lu, the contemporary pinyin romanization rendered from Mandarin pronunciation, used in the library cataloging records (but quite a few libraries mistakenly use “jin san ri xin lu” instead).

If one variant title or romanization is missing or misspelled in the cataloging records, researchers trying to retrieve the title via the online catalog are bound to run into problems. A test with Golden Hills’ News in WorldCat will confirm that.
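The retrieval problem described above is essentially one of missing cross-references. As a minimal sketch in Python, and only as an illustration (it does not reflect how WorldCat or any library catalog is implemented), the variant titles of the Golden Hills’ News can be folded into a single lookup table. The record key "golden-hills-news" and the helper names normalize, build_index, and find_record are hypothetical; the titles themselves are the ones listed above.

```python
import unicodedata

# Variant titles for one newspaper, taken from the list above.
# The record key and all helper names are hypothetical illustrations.
VARIANTS = {
    "golden-hills-news": [
        "金山日新录",
        "Golden Hills' News",
        "Kim Shan Jit San Luk",
        "Jin shan ri xin lu",
        "jin san ri xin lu",  # misspelling noted above, kept as a cross-reference
    ],
}


def normalize(title: str) -> str:
    """Fold case and drop spaces and punctuation so near-identical forms match."""
    folded = unicodedata.normalize("NFKC", title).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


def build_index(variants: dict) -> dict:
    """Map every normalized variant title to its record key."""
    index = {}
    for key, titles in variants.items():
        for title in titles:
            index[normalize(title)] = key
    return index


INDEX = build_index(VARIANTS)


def find_record(query: str):
    """Return the record key for any known variant, or None if it is unknown."""
    return INDEX.get(normalize(query))
```

With such a table, a search for the Cantonese romanization, the pinyin form, the English title, or the Chinese characters resolves to the same record, whereas a variant that was never entered still fails; this is the gap that enhanced cataloging records would close.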


To help researchers locate the newspapers, the near-term solution is to enhance the cataloging records, particularly by adding variant titles or romanizations to the records (and correcting mistakes). In the long run, we may need to refine specialized reference tools.

Keeping track of contemporary publications

Most newspapers published since the 1970s have not made their way into library collections yet. Keeping track of these newspapers is therefore both a challenge, since there is no single source of information, and a necessity, because librarians need to know what has been published to pursue collection development opportunities.

With regard to contemporary Chinese newspapers in the United States, Yan Ma, a librarian turned library science professor, surveyed editors of Chinese American newspapers and periodicals twice: once in 1987 and again in 1995/96 (Ma, 1989 & 1999). In 1987, 38 newspaper and periodical editors responded to the 66 questionnaires sent; in 1995/96, 51 editors responded to the 137 questionnaires sent (42 were returned undeliverable). Because Ma was interested in “Chinese American” publications, her surveys covered both English and Chinese publications, ranging from daily newspapers to general interest periodicals to academic journals. Ma’s surveys probably covered only a small percentage of the Chinese-language press available at that time, but in each survey, Ma identified the newspaper's title, location, editor, starting date, circulation number, frequency, and distribution method, and therefore captured snapshots of some Chinese-language publications at two different points in time. Based on her 1995/96 survey, Ma also tried to examine the web presence of these newspapers (Ma, 2003).

One of the lessons I learned from this study is that information about the Chinese press, especially the smaller papers, is more likely to be captured by a specialized regional directory of newspapers (e.g. Scher, 2004) than by a national directory.
Also, in the case of currently circulating newspapers, a physical copy of the newspaper or its web site may confirm the starting year and publishing frequency. To keep track of contemporary newspapers, especially the local or regional publications that tend to be unstable, we may need to employ a variety of sources: questionnaires to the publishers or editors, newspaper web sites, newspaper directories, and direct observation. An ideal solution would be to have librarians working on each metropolitan area and then pool the information at the national level.

Collection and Preservation

On the newspaper preservation front in the U.S., the most significant effort has been undertaken by the United States Newspaper Program (USNP), funded by the National Endowment for the Humanities. California, given its large number of Chinese newspapers, has microfilmed many historical titles through the USNP (Chiu, 1997). Other states might also have microfilmed some Chinese newspapers, but probably not to the extent of California. Some titles might have been microfilmed independently by their holding libraries. Because print copies of the historical newspapers are not widely available, purchasing the microfilms appears to be the only way to build a comprehensive collection.

One of the main challenges is that there is no systematic effort to collect and preserve the Chinese-language newspapers published in recent years. Major Chinese newspapers such as the World Journal may be found in libraries throughout the U.S., but they are normally not kept permanently. In New Jersey, my institution and quite a few other libraries have

subscriptions to the Sino Monthly (Edison, NJ), a monthly publication devoted to covering the Chinese community. Because the Sino Monthly is in magazine format and is published only once a month, it is easier to keep as part of the permanent collection than dailies or weeklies. But this is certainly inadequate even for the Chinese community in New Jersey, so what should we do? As described above, since the 1970s the Chinese-language press has developed into a two-tiered system: nationwide newspapers and regional/local ones. Each nationwide newspaper usually represents a Chinese subgroup with a different geographic origin (Hong Kong, Taiwan, or mainland China), so all of them need to be collected and preserved. On the other hand, the regional and local newspapers that cover the same area tend to overlap considerably in both community news coverage and advertisements, so it may not be cost-effective to preserve them all. Therefore, a two-tiered arrangement seems appropriate for collecting Chinese newspapers published in recent years: the national libraries and research libraries of national prominence would be responsible for the nationwide newspapers, while the state/provincial libraries and research libraries in the region would take care of the regional and local newspapers. For the regional or local newspapers, we need to carefully examine the content and reputation of each title in relation to its competitors and try to select the better ones for the library collections.

Digital Issues

Libraries in both Canada and the United States have taken steps to digitize historical Chinese-language newspapers. In Canada, Simon Fraser University, in collaboration with the University of British Columbia Libraries, digitized a significant portion of The Chinese Times 大汉公报 (Minkus, 2004).
In the United States, libraries in the University of California system digitized the 1900-1904 issues of the Chung Sai Yat Po 中西日报 (“Guide to the Chung Sai Yat Po Newspaper Collection”). Sporadic issues of other titles may also have been digitized, such as the Golden Hills' News mentioned before. Text recognition is one of the biggest technical challenges in digitizing historical newspapers. The Chinese newspaper digitization projects mentioned above have not overcome this obstacle; they provide only scanned images, so users cannot conduct full-text searches, which limits the utility of the digital version. Understandably, text recognition technology for Chinese is most advanced in China, where Chinese is the native language. International collaboration may therefore be necessary to improve the digitization of these Chinese-language publications. Separate from the library digitization projects, major Chinese-language dailies are providing more and more digital content on their web sites. The World Journal is a good example. On its web site (worldjournal.com), users can browse the content from the last seven days, and a significant amount of earlier content is archived and can be retrieved via web search engines. It also provides the digital version of the same day’s newspaper online for free. Preserving the digital content of major dailies may therefore be an area that the library community and the newspapers can work on together.

(Acknowledgement: I wish to thank my colleagues in the Rutgers University Libraries for many stimulating conversations about this paper. Rutgers also provided funding for my travel to the 2008 IFLA conference in Quebec City, Canada.)

References

Anonymous. (1854). Chinese Exodus. Golden Hills' News, p. 1, May 27, 1854.
Chen, D. W. (1995). With Affluent Chinese Moving In, A Press War Begins to Heat Up. New York Times, April 16, 1995. Retrieved January 3, 2008, from http://query.nytimes.com/gst/fullpage.html?res=990CE4DE103AF935A25757C0A963958260.
Chiu, K. (1997). Access to the past of a nation of immigrants: Asian language newspapers in the United States. Journal of East Asian Libraries, No. 112: 1-8.
Guide to the Chung Sai Yat Po Newspaper Collection. Retrieved April 14, 2008, from http://content.cdlib.org/ark:/13030/kt0g5016h6/.
Hua, V. (2007). Clinton staff's gaffe with local ethnic papers. San Francisco Chronicle, p. B3, February 27, 2007.
Huntzicker, W. (1995). Chinese-American Newspapers. In Outsiders in 19th-Century Press History: Multicultural Perspectives. Bowling Green, OH: Bowling Green State University Popular Press.
Kwong, P., & Miscevic, D. (2005). Chinese America: The Untold Story of America's Oldest New Community. New York: New Press.
Lai, H. M. (1977). A short history of Chinese journalism in the U.S. and Canada. In Chinese newspapers published in North America, 1854-1975 (pp. 1-16). Center for Chinese Research Materials, Association of Research Libraries.
Lai, H. M. (1987). In The Ethnic Press in the United States: A Historical Analysis and Handbook (pp. 27-43). Westport, Conn.: Greenwood Press.
Lai, H. M. (1990). The Chinese Press in the United States and Canada since World War II: A Diversity of Voices. Chinese America: History and Perspectives, v.4: 107-155.
Leung, J. (2007). Sing Tao Daily's Overseas Edition and the Globalization of Chinese Language Newspapers (in Chinese). Chinese America: History & Perspectives, v.21: 181-182.
Li. (1854, May 27). Untitled. Golden Hills' News, p. 1, May 27, 1854.
Li, W. (1998). Anatomy of an ethnic settlement. Urban Studies, 35(3): 479-501.
Lin, W. Y., & Song, H. (2006). Geo-ethnic storytelling: An examination of ethnic media content in contemporary immigrant communities. Journalism, 7(3): 362.
Lo, K. (1971). Kim Shan Jit San Luk: The First Chinese Paper Published in America. Bulletin of the Chinese Historical Society of America, 6(8).
Lo, K., & Lai, H. M. (Eds.). (1977). Chinese newspapers published in North America, 1854-1975. Center for Chinese Research Materials, Association of Research Libraries.
Ma, Y. (1989). Chinese American Newspapers and Periodicals in the United States. Ethnic Forum, 9(1-2): 100-121.
Ma, Y. (1999). Chinese-American Newspapers and Periodicals in the United States: An Analysis of a National Survey. The Serials Librarian, 35(4): 63-69.
Ma, Y. (2003). Chinese American Newspapers and Periodicals in the United States and Their Web Presence. Serials Review, 29(3): 179-198.
Miller, S. M. (1987). Introduction. In The Ethnic Press in the United States: A Historical Analysis and Handbook (pp. x-xxii). New York: Greenwood Press.
Minkus, K. (2004). History Online. aq: The Magazine of Simon Fraser University, April issue. Retrieved May 1, 2008, from http://www.sfu.ca/aq/archives/april04/features/history.html.
Scher, A. (2004). Many Voices, One City: The IPA Guide to the Ethnic Press of New York and New Jersey Metropolitan Area. San Francisco, CA: Independent Press Association.
Sun, Y. (1998). San Francisco's Chung Sai Yat Po and the Transformation of Chinese Consciousness, 1900-1920. In Print Culture in a Diverse America. Urbana: University of Illinois Press.
Wen, X. (2005). Founding of the Chinese Revolutionary League in America. Chinese America: History & Perspectives, v.19: 21-42.
Wilson, C., Gutierrez, F., & Chao, L. M. (2003). Racism, Sexism, and the Media: The Rise of Class Communication in Multicultural America. Thousand Oaks, CA: Sage Publications.
Yang, T. (2005). English-language Serials in Pre-revolution China: Final Report. New Haven, CT: Yale University Library. Retrieved April 14, 2008, from http://www.library.yale.edu/scopa/grants/2004fin2.htm.
Zhao, X. (2006). Disconnecting transnational ties: the Chinese Pacific Weekly and the transformation of Chinese American community after the Second World War. In Media and the Chinese diaspora: community, communications, and commerce (pp. 26-41). London; New York: Routledge.
Zhou, M., Chen, W., & Cai, G. (2006). Chinese Language Media and Immigrant Life in the United States and Canada. In Media and the Chinese diaspora: community, communications, and commerce (pp. 42-74). New York: Routledge.