WEB DELIVERY FOR RESEARCH IN THE HUMANITIES AND SOCIAL SCIENCES: NEW DEVELOPMENTS FROM CHADWYCK-HEALEY

by
Tony O'Rourke
Chadwyck-Healey

Chadwyck-Healey was formed in 1973 by Sir Charles Chadwyck-Healey. It now has offices in France, Spain, the United States of America and the United Kingdom (two offices), with headquarters in Cambridge. Chadwyck-Healey employs more than 200 staff around the world.

Chadwyck-Healey specialises in publishing research collections and other reference material for all areas of the humanities and social sciences in electronic formats, such as CD-ROM and the World-Wide Web, as well as on more traditional media such as microfilm, microfiche and even print.

In this short talk, I would like to illustrate how our publishing programme is responding to changes in the global information market and to give examples of new titles which will benefit researchers in the future.

For electronic use, CD-ROM is still considered to be the most convenient way of delivering large amounts of data to the "information centre", such as a library or other information point. In recent years, hundreds of millions of dollars have been invested (and continue to be invested) by institutions around the world in CD-ROM technology, from CD-ROM drives and CD-ROM workstations to sophisticated, multi-site CD-ROM networks. It is clear that, despite technological advances in other directions, CD-ROM will be with us for a long time to come.

In spite of their convenience, however, CD-ROMs do have certain drawbacks.

a) The fastest CD-ROM drives, although they are getting faster all the time, are still no match for the average PC hard disc, let alone a Unix workstation.

b) The sheer volume of discs available to research libraries means that libraries are faced with managing ever-growing CD-ROM collections, which require dedicated drives or large CD-ROM jukeboxes if they are to be accessed on demand. It is not unusual for libraries to hold more than 500 different CD-ROM titles, as is the case with one British public library.

Indeed, the explosion in the number of databases available on CD-ROM (more than 10,000 titles according to the latest TFPL directory) has led to growing resistance among research libraries to acquiring new CD-ROM titles and thereby adding to the problem. Only recently, one librarian at a major East Coast university in the United States of America told a Chadwyck-Healey representative that she had instructed her acquisitions staff not to buy any more databases on CD-ROM. If the information and software cannot be delivered in any other electronic format, then that library will do without it!

Now this is a somewhat extreme point of view, but one that is symptomatic of the desire for other electronic delivery media. With the Internet, for example, there is generally no need for the library to manage large collections of CD-ROMs, and little need to acquire new technology: the existing base of workstations can be used to access the Internet. For a relatively low investment, libraries can access resources which may have cost millions of dollars to develop. The one main disadvantage of accessing information across the Internet is that there is normally no transfer of ownership rights, unlike with many CD-ROMs or other traditional media.

In recent months, recognising this new demand from their customers, research publishers have launched WWW-compatible versions of their databases or, in some cases, completely repackaged databases for delivery across the Internet. For example, last year Silverplatter launched Web-SPIRS, a WWW version of their retrieval engine, and Knight-Ridder have Science Base.

In April 1996, Chadwyck-Healey launched PCI Web, an Internet version of the Periodicals Contents Index, which was previewed at the Infomedia 95 conference in Prague last year.

Periodicals Contents Index, or PCI, started life as a CD-ROM and magnetic tape database, providing access to millions of articles from scholarly journals in the humanities and social sciences. No science, technology or medicine journals are included. The journals selected for PCI represent the most important scholarly titles in their fields published between 1801 and 1990. Over 3,500 scholarly journals from around the world have so far been chosen for PCI, and many more will be added.

It is estimated that when PCI is complete in 1998 or 1999, the CD-ROM edition will comprise some 18 to 20 discs containing more than 12 million article records. Clearly this number of discs, although not the largest disc collection for a single title, could cause access problems for certain libraries. This is why we have launched PCI Web. This new Internet version of PCI will give researchers around the world access to all of PCI from their Internet workstation, whether in the library, in their office or at home. They simply need to obtain the relevant password from the purchasing institution. It also means that libraries can access all of PCI, or just selected parts of it, for a fixed annual cost.

The current version of PCI Web contains the contents of the seven PCI CD-ROM discs published to date.
Series I 1801-1960
Segment 1	756,611 articles
Segment 2	798,421 articles
Segment 3	756,389 articles
Segment 4	726,484 articles
Segment 5	681,415 articles

Series II 1961-1990
Segment 1	639,861 articles
Segment 2	618,328 articles

Total (as of 20 May 1996)  4,977,509 articles

New records will be added regularly to PCI Web. In fact, PCI Web will be updated more frequently than the CD-ROM version. Between 1.7 and 2 million articles will be added to PCI Web each year, so that 12 million article records will be available by the end of the century.
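
As a rough check of these figures, the short Python sketch below sums the segment counts listed above and estimates how long it would take to reach 12 million records at the quoted rates. It assumes a constant yearly rate of additions, which is a simplification, and the names used (series_1, series_2, years_to_target) are purely illustrative and not part of any PCI software.

```python
# Segment sizes as listed in the talk (articles per CD-ROM segment).
series_1 = [756_611, 798_421, 756_389, 726_484, 681_415]  # Series I, 1801-1960
series_2 = [639_861, 618_328]                              # Series II, 1961-1990

TARGET = 12_000_000  # planned size of the complete index


def years_to_target(current: int, per_year: int, target: int = TARGET) -> float:
    """Years needed to grow from `current` to `target` records,
    assuming a constant number of additions per year (an assumption)."""
    return (target - current) / per_year


total = sum(series_1) + sum(series_2)
remaining = TARGET - total

print(f"Articles to date: {total:,}")
print(f"Still to be added: {remaining:,}")
for rate in (1_700_000, 2_000_000):
    print(f"At {rate:,} per year: {years_to_target(total, rate):.1f} years")
```

The sum agrees with the stated total of 4,977,509 articles. At the quoted rates the remaining records would take roughly three and a half to just over four years from May 1996, which is broadly consistent with reaching 12 million records around the end of the century.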

Here are some sample screens from PCI Web.

1) PCI - Search and Browse

In this screen you can enter search terms in one or more of the following categories: Title, Author, Heading, Language, Journal Title, Journal Subject and Year of Publication. In this particular search, I am looking for scholarly articles which link the words Holinshed and Shakespeare. Holinshed refers to Holinshed's Chronicles, which are reputed to have been one of the most important sources Shakespeare used for his plays.

2) PCI - Search Results

Once you have entered the search, the results are displayed in this way. The user is given the title and author of the article, the headings assigned to that article, and the citation or source of the article, including the page number, volume number and year of issue. This list may then be downloaded to disc or sent to a printer. As you can see from this first screen, the articles retrieved come from journals published between 1882 (Englische Studien) and 1968 (Etudes Anglaises). Without PCI, the contents of these journals would remain inaccessible.

3) PCI - Bibliographic Records

In this screen the user receives full bibliographic information on the journals selected in PCI, including the title, publisher, frequency, publishing history, journal subject terms, ISSN, Library of Congress number and Dewey classification.

One of the advantages of the CD-ROM version of PCI is that libraries can create a record of local holdings by marking the records of the journals held in their institution, so that users may restrict a search to journals that are available locally. PCI records are also supplied on magnetic tape if required, so that libraries may integrate the records with their own OPAC.

PCI Web is available for a fixed annual fee per institution. Once the institution (a university, for example) acquires a licence to PCI Web, any member of that institution is allowed access.

PCI Web is currently available in two editions:

PCI Web is available on a number of servers around the world. Negotiations are currently taking place with international academic institutions to add more servers to improve local access to PCI Web.

1996 is the year for accessing PCI on the World-Wide Web. Not only has PCI Web been launched, but in the United Kingdom, for example, PCI is being mounted on the EDINA server at Edinburgh University for access across the Internet by every institution of higher education in the United Kingdom. This is the result of a national licence for PCI agreed by Chadwyck-Healey and the Combined Higher Education Software Team (CHEST), which is part of the Joint Information Systems Committee of the Higher Education Funding Council. Another test project to access PCI on the Internet is taking place at an Italian university, for possible access by all Italian institutions of higher education.

PCI Web is Chadwyck-Healey's first database to be delivered on the World-Wide Web, and other titles are planned for release later in the year.

If you are interested in PCI or PCI Web and you would like to receive an evaluation copy of the CD-ROM or a temporary licence for access on the Internet then please contact either Albertina Icome in Prague or Chadwyck-Healey.

Ladies and gentlemen, I hope that you have found my talk of interest. In the time remaining, I am happy to answer any questions you may have.


Back to INFOMEDIA 96.