This is a preprint version of the following paper:

Networks, standards and end-user information services. Vine, December, 1993.

In any citation please refer to the printed version.


 

 

 

Networks, standards and end-user information services

Lorcan Dempsey

UKOLN : The Office for Library and Information Networking

University of Bath, Bath BA2 7AY

 

 

'Networking' is now seen as crucial because in the near future 'networking' and 'automation' will be largely coextensive: most automated activities will be networked in some way. One difficulty in talking about so inclusive a topic to a general library audience is that it will contain different levels of experience and expectation. For example, for some in the academic community, with experience of Network Information Retrieval and Resource Discovery Tools (Gopher, Archie, WorldWideWeb, etc.), and some acquaintance with ongoing discussions about integrated methods for the description, naming, discovery and use of network resources, 'the network' is seen as a virtual place in which new patterns of communication, publishing and information use are being formed, uninhibited by per-use charging. For some, the primary experience of networks may be the use of public metered services to connect to commercial online hosts. For another group, the notion evokes a private closed network which supports shared cataloguing and access to a shared resource. And so on. These groups will have different views of the likely impact and potential of network services. [1]

In this short piece I focus on use in public sector libraries to provide end-user information services, and sketch some areas in which standards are now being designed and implemented. I first look at aspects of the technology and service environments, to establish a context. A rather reductive historical schema orients discussion, in which three stages of library automation are identified: centralised, local and distributed.

The technological environment - some notes

Characteristic: System
  Stage 1 (Centralised): Utilities, online hosts
  Stage 2 (Local): ILS, CD-ROM
  Stage 3 (Distributed): "Virtual library"; integrative end-user environments


These three stages emerged successively and continue to exist in concert. The first is the stage of shared central systems, characterised by the bibliographic utilities created within the library community [2], and the online hosts, largely developed without. Various library operations were automated: the successors of these central systems now mainly concentrate on services supported by a union catalogue (shared cataloguing and interlibrary loan) although they are diversifying into reference and other services. In the UK, and in some other countries, these organisations have also become system vendors. The second stage saw the automation of local operations, and the emergence of the integrated library system which is currently the dominant feature of library automation. There have also been other system investments to support particular areas: ILL, slide management, CD-ROM networks, other information services, and so on. We are now on the threshold of the third stage, which will be distributed.

In this stage, the challenge for end-user services will be the creation of integrated information access and handling environments, which present a seamless service interface to the user. The effective integration of diverse services for identifying, locating, requesting, delivering, paying for and using items of interest is a major task which will be explored below. The traditional library system is likely to be only one among a number of communicating components of the information environment, which will be rich in electronic information services.

 

Characteristic: Topology
  Stage 1 (Centralised): Terminal access to central host; dedicated dumb terminals; offline delivery; Host
  Stage 2 (Local): Terminal access to multiple systems; emulation/comms packages; uploading, downloading, etc.; Systems
  Stage 3 (Distributed): Application to application; client/server; exchange of data between applications; Services

 

Stage two is characterised by terminal access to multiple self-standing systems, between which there is little communication. The network infrastructure enables widespread connection; there is rather less connectedness, whether between applications or between applications and user programs. What this means is that there is little sharing of information directly between applications; most information applications work by connecting people to computers; users have to shift data around manually, often losing any structure it has; they have to negotiate problems of connection, logon and use and have to locate services for themselves; display and exchange are largely in plain text. Applications tend to be monolithic and enclosed. For example, in pursuit of a reference, a library user may have to search separately the OPAC, one or more CD-ROMs, and as many other systems as are required. They may have to write down the results from a reference resource and repeat the search on the OPAC, or they may have to fill in an ILL form. They will have to grapple with several different search and display interfaces, and maybe move from terminal to terminal. The ILL department may rekey details. This is wasteful of time; it will become increasingly so as these resources multiply, and as more aspects of the process are automated (e.g. end user document ordering). Essentially, in the current environment people interact separately with diverse applications. Of course, there have long been 'gateway' and 'front-end' programs which provide valuable assistance in various ways, but these do not fundamentally alter this mode of operation.

What is likely to become more common is that applications will be unbundled into communicating components which can be distributed as appropriate. For example, it is widely anticipated that end-user information services will increasingly be operated along client-server lines. Typically, request formulation, display and presentation, and consolidation of results might be handled by a client application; a server would run requests against particular database services and return results. Client and server communicate via a protocol. Of course the protocol may be proprietary; standard protocols also exist for a number of applications and more are in development. These include such common applications as e-mail, file transfer and remote login, as well as more domain-specific applications. Z39.50 (the coming buzzword) is a standard application protocol for information retrieval. It provides facilities for the open interconnection of client and server applications. Importantly, it allows users to use the local command and dialogue interface to communicate with diverse, distributed database services. (In theory anyway, if not yet widely in practice!). The system component of the IRIS service, an Irish document supply and current awareness service, provides an interesting example of emergent library practice using Z39.50. As part of the service, users will be able to search the OPACs of six Irish libraries, and, in some cases, to request items from them (there are also other service elements). The user interacts with a user application which hides the different OPAC interfaces and connection details and which provides integrated search and request functionality. The user may search one or more OPACs in the same operation. The user application communicates with the OPAC servers through Z39.50.
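The division of labour described above can be sketched in a few lines of Python. This is a minimal illustration only, not Z39.50 itself and not the IRIS software: the class names (OpacServer, SearchClient), the sample holdings and the substring matching are all assumptions made for the example. The "servers" are in-memory stand-ins that share one search interface; the client formulates the request, sends it to each server, and consolidates the results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    author: str
    library: str


class OpacServer:
    """Hypothetical stand-in for one library's OPAC behind a common protocol."""

    def __init__(self, library, holdings):
        self.library = library
        self._holdings = holdings  # list of (title, author) tuples

    def search(self, term):
        # A real server would run the request against its own database.
        term = term.lower()
        return [
            Record(title, author, self.library)
            for title, author in self._holdings
            if term in title.lower()
        ]


class SearchClient:
    """Client side: sends one request to many servers and merges the hits."""

    def __init__(self, servers):
        self.servers = servers

    def search(self, term):
        merged = {}
        for server in self.servers:
            for rec in server.search(term):
                # Group hits on the same work, keeping every holding library.
                merged.setdefault((rec.title, rec.author), []).append(rec.library)
        return merged


# Invented sample data for two libraries.
servers = [
    OpacServer("Library A", [("Network Standards", "Smith"), ("Irish Poetry", "Byrne")]),
    OpacServer("Library B", [("Network Standards", "Smith")]),
]
results = SearchClient(servers).search("network")
for (title, author), libraries in results.items():
    print(f"{title} / {author}: held by {', '.join(libraries)}")
```

The point of the sketch is that the user sees one search operation and one result list; the differences between the underlying systems are confined to the server side of the shared interface.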

These developments mark a fundamental shift in the way we interact with systems. Typically, access to resources will be mediated by a program which will invoke one or more client programs as appropriate. The existence of publicly-defined client and server interfaces, and data interchange formats, will provide the basis for a range of distributed applications which hide the complexity of the interacting systems of service provision from the user. Users of the Internet Gopher or of X.500 services, for example, will be familiar with simple versions of this scenario. In this way, the power of desktop machines can be more fully exploited to improve the type of interface presented, to further process structured data, to display images, and so on.

Standard application interfaces will also facilitate the construction of the services which are now required to support emerging linked applications discussed in the next section. For example, links may be created between union catalogues and local circulation systems to determine availability, items identified in a search may be used to initiate document supply or ILL requests, and so on. New automated services are doing much to reduce the tedium of aspects of library use, the end-user literature search for example; however without such links the full benefits of automation will not be realised.

 

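Characteristic: Network
  Stage 1 (Centralised): Dedicated
  Stage 2 (Local): Fragmented (dedicated, public switched, academic, LAN)
  Stage 3 (Distributed): Pervasive and interconnected (CableTV/Internet/SMDS/..)

Characteristic: Protocols
  Stage 1 (Centralised): In-house
  Stage 2 (Local): X.25/TCP/IP; LAN
  Stage 3 (Distributed): TCP/IP ?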

The trend to distributed computing is supported by, and further encourages the development of, the network infrastructure. One can identify a general trend away from dedicated library infrastructure, towards use of general purpose infrastructure. This has been particularly the case in the academic community where the use of academic and research networks by libraries is common. Indeed, recent debate in the UK has tended to be dominated by consideration of JANET, which has given it a very partial character. The library community would benefit from a more general discussion, in which the relationship between the existing private network funded for research and education, JANET, and emerging public infrastructure, is one important component. We are in a period of rapid growth, with potential convergences between a number of strands: public data and telephone networks, cable TV networks, Internet services. Libraries do not have a major influence on these developments, which are being driven by business, entertainment, research and other interests.

The library community was an early and enthusiastic supporter of the OSI enterprise, recognising the importance for library purposes of its aims. A technically open environment for information interchange has always been an important library project also, and, indeed, is the rationale for the development of MARC. However, the overall slow development of OSI, and the emergence of TCP/IP as the de facto open communications protocols throughout the global Internet, have now complicated issues. It seems likely that the size and strength of the TCP/IP applications and user base will increasingly make it the environment of choice for distributed library applications. This article does not further consider these lower protocols, but focuses on the applications which will run over them.

The service environment

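Characteristic: Activity
  Stage 1 (Centralised): Cataloguing
  Stage 2 (Local): Collection management
  Stage 3 (Distributed): Information management

Characteristic: Public access services
  Stage 1 (Centralised): None
  Stage 2 (Local): Limited access - OPAC, plus emerging (unintegrated) information services
  Stage 3 (Distributed): Integrated discovery and request systems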

 

Much automation until recently has been 'backroom', with little overt impact on users. This was certainly the case in Stage one, where the major output has been print-based, and later COM, catalogues, and also until recently in Stage two, which has been dominated by the automation of library routines, and collection management. Essentially the dominant activity has been data processing in support of various library operations. Reviewing the impact of IT in public libraries, Chris Batt notes [3]:

Customers entering a public library for the first time in 20 years are unlikely to be astounded by the changes that have taken place; certainly not changes driven by IT. They are more likely to be surprised by changes in the layout of stock than the impact that computers have made on services.

During this period the main focus for public access has been the OPAC. Yet from a number of points of view, the OPAC is a very limited instrument. Typically it provides shallow access to the monograph stock, partial access to the whole stock, and a very limited service interface: shallow because much of the content of books remains unrevealed; partial because much of the collection (e.g. journal articles) is unrepresented in it; limited because, until recently, there has been no link to requesting, reservation or other services. Moreover, typically, in the UK, the user does not have unmediated access to remote resources. Of course other information resources exist. However, commercial online services have not been heavily used and, largely, have not been end-user oriented.

In stage two, then, public information services have been dominated by the local OPAC. More reference services have emerged recently, but their penetration is uneven, and, as discussed above, they remain unintegrated with each other or into users' work environments. Automation is largely confined to secondary services; apart from several exceptional projects [4] there is little penetration of automation into the management and delivery of primary materials.

The third stage will be driven by two related technical challenges. The first is the need to provide genuine end-user access to existing library resources. The second is the need to remain relevant to users' information requirements by providing organised, selective access to a range of newer information and learning materials, which will be routinely exchanged in electronic form.

One can identify many elements under this second challenge (which is really a somewhat arbitrary category of convenience). These include:

- network resources and the information retrieval and resource discovery systems which make them visible;

- on demand and custom publishing; bespoke learning packages; support for distance learning;

- entertainment and leisure materials;

- reference, factual and business resources; large scientific databases; government, census and other survey data;

- new electronic journals; interactive documents; ...

These services will require technical support in a number of ways: further processing and selective output of structured documents; tools for data handling and manipulation; format conversion, presentation and display capabilities; and so on. Some of these will be introduced more quickly than others; some will be more difficult than others. Libraries are already experimenting in preliminary ways with some, through the development of Gopher servers for example.

Much recent discussion has focused on these exciting aspects of the 'electronic' library, but it is critical that the first challenge is not neglected, given the extent of what remains to be done. It is not easy to mine the collective resources of libraries and related organisations for what they contain. Typically the current library user has no easy way of knowing just which articles are hidden on the shelves. Users have no easy way of knowing what books are potentially available to them: they do not have unmediated access to the utility union catalogues, to the Document Supply Centre's Monographs file, or to the Viscount database. They have no guarantee that requested items will arrive; nor do they have any direct indication of when an item will arrive. The popularity of CD-ROM, BIDS and emerging table of contents services, coupled with end-user document ordering facilities, highlights the extent of underprovision of access to journal literature. However, coverage is still partial, and crucial links are missing: into local holdings, into request or ILL systems, and so on. The introduction of integrated services which allow a user to search for what they want, to locate it (in their own library, in another library, or from some other resource), to request it, and to pay for it or have it electronically delivered where appropriate, is now required.

Internationally this 'empowerment' of the end-user is evident in the strategic direction of significant organisations. OCLC, for example, is developing a range of system, information and publishing products to support end-user access. In the Netherlands, Pica is pioneering a range of technical developments in support of its service aims [5]. They are putting in place an organisational and technical infrastructure which will bring a range of integrated resources to the user's desktop: the ability to identify and request books from any library in the system; a table of contents service which allows the identification and request of journal articles from libraries in the system; links to foreign lending and searching systems; an electronic document delivery facility. These developments go under the general name of the Open Library Network concept, and signal a marked shift in emphasis towards the provision of unmediated end-user services, and a technical architecture which facilitates integrated 'desktop' access.

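Characteristic: Buzz-word
  Stage 1 (Centralised): MARC
  Stage 2 (Local): OPAC
  Stage 3 (Distributed): Z39.50

Characteristic: Picture
  Stage 1 (Centralised): Strand
  Stage 2 (Local): Multiple strands; 'connection'
  Stage 3 (Distributed): Fabric; 'connectedness'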

Each stage has an associated buzz-word, arising from what was perceived as its main technical challenge. It is too soon to say what the buzz-word of the third is. It could be client-server; but looking for a more specific library inflection one could suggest that it will be Z39.50. Z39.50 will be one of the major technologies which promote 'connectedness', the transformation of multiple strands of information use into a unified fabric. Of course, there will be a gap between the aspiration and the achievement. Certainly, examination of the current situation with regard to the open international exchange of bibliographic records, or the effectiveness of the public access provided to library collections, makes one cautious about prediction!

 

Standards

Information systems of the future, then, will rely on a variety of information flows and application to application communication. These requirements are now being addressed, and a variety of customised solutions is being put in place. But this will limit the flexibility and sharability of future solutions. An open architecture with standard communications, applications and data interchange services will facilitate the interaction between system components, and the extension and addition of services. (This section draws heavily on a paper written for the IT Sub-Committee of the HEFCs' Libraries Review [6], to which readers are referred for a fuller treatment of these issues. What follows is necessarily terse and elliptical.)

The advantage of standards is that they reduce the variety of interactions that need to be supported. Ideally, one would like to be able to identify a number of types of required transactions and exchanges, and to implement, in each case, a standard approach.

Of course, this is not possible, or maybe desirable, for a number of reasons. The communications requirements of libraries are still evolving and there is no general coordinating framework in which standards are developed. Increasingly, the library community will be looking to solutions which transcend purely library interests, and are outside its control. Document interchange standards are an important example here. So, even where a requirement is identified there may be a number of more or less incompatible approaches, and selection criteria may be unclear. This is especially the case in a period of fundamental transition where complexity and compromise are inevitable. A major example, mentioned above, is the so-called protocol wars (OSI versus TCP/IP).

In the absence of an agreed reference model one could suggest the following broad framework in which to begin to consider library communication requirements. The focus is on emerging communications applications, rather than on traditional bibliographic standards, or underlying network protocols.

Communication services

A communications infrastructure (e-mail, file transfer, etc.) will provide the basis for other services. This area is a victim of the protocol wars: there are OSI and Internet standards for file transfer and electronic mail. For example, X.400 has been developed in the OSI environment; a different approach has been taken in the Internet environment. Recently, MIME (Multipurpose Internet Mail Extensions), which defines the content of messages for use with the Internet mail protocol, SMTP, has become a strong contender to support the transfer requirements of many future multimedia applications, electronic document delivery for example. More applications will also appear which are mail-enabled: it will be as easy to mail a document as it is to send it to a printer. In this environment the use of competing mail protocols is a problem, at best adding the overhead of conversion. These standards, and necessary conversion tools between them, will become increasingly important as general purpose carriers, for EDI, for document delivery, for ILL messages and so on.
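As a rough illustration of MIME as a carrier for document delivery, the following Python sketch builds a mail message with a text part and an attached page image, using the standard library's email package. The addresses, subject line and file name are invented for the example, and the image bytes are a placeholder rather than a real TIFF file.

```python
from email.message import EmailMessage

# Placeholder bytes standing in for a scanned page image (not a valid TIFF).
page_image = b"II*\x00placeholder-image-data"

msg = EmailMessage()
msg["From"] = "supply@library-a.example"    # hypothetical sender
msg["To"] = "reader@library-b.example"      # hypothetical recipient
msg["Subject"] = "Requested article, page 1"

# The plain-text part becomes the message body.
msg.set_content("The requested page is attached as an image.")

# Adding an attachment turns the message into a multipart MIME message,
# with the image labelled by its content type.
msg.add_attachment(page_image, maintype="image", subtype="tiff", filename="page1.tif")

# The serialised form is what a mail transport such as SMTP would carry.
wire_form = msg.as_bytes()
print(msg.get_content_type())
```

Because each part carries its own content type, a receiving application can decide how to handle it (display, print, store) without prior agreement about the particular message.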

Application services

Information retrieval services. Currently, the protocol of choice in this area seems to be Z39.50, or its ISO equivalent, Search and Retrieve (SR ISO 10162/3). It is anticipated that this will allow services to be built which provide transparent user access to diverse resources, and communication between database applications. There are now some production services in place which use Z39.50, and there is a very active group of North American developers. There is also a Pre-Implementors' Group in the UK, which is gradually becoming more of an Implementors' Group. The British Library is carrying out a feasibility study to see whether it ought to develop a Z39.50 interface to its OPAC, and SLS, BLCMP, Fretwell Downing and others are actively working on implementation.

Request services. A standard way of managing request transactions is required. The library community has developed an ILL protocol (ISO 10160/1) which models the multiple transactions involved in interlibrary loan. It is anticipated that the protocol will be implemented in two modes: interactive and mail-based. It has not yet been widely implemented, and effort has so far focused on the latter mode. However, implementation in a real-time environment will be required if it is to support online requesting of materials with instant feedback, as in a current awareness or 'table-of-contents' service. Other approaches are also being investigated.
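One way to picture what it means for a protocol to model a request transaction is as a small state machine: each message is only valid in certain states, and moves the transaction to a new state. The Python sketch below is a simplified illustration under that assumption. The state names, message names and transitions are invented for the example and are not taken from ISO 10160/1.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"
    RECEIVED = "received"
    RETURNED = "returned"
    CLOSED = "closed"
    UNFILLED = "unfilled"


# Illustrative transitions only; the real protocol defines its own.
TRANSITIONS = {
    (State.PENDING, "shipped"): State.SHIPPED,
    (State.PENDING, "unfilled"): State.UNFILLED,
    (State.SHIPPED, "received"): State.RECEIVED,
    (State.RECEIVED, "returned"): State.RETURNED,
    (State.RETURNED, "checked_in"): State.CLOSED,
}


class IllTransaction:
    """Tracks one loan request as messages arrive from either party."""

    def __init__(self, item):
        self.item = item
        self.state = State.PENDING
        self.history = []

    def handle(self, message):
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"'{message}' is not valid in state {self.state.value}")
        self.state = TRANSITIONS[key]
        self.history.append(message)
        return self.state


loan = IllTransaction("Journal article")
for message in ["shipped", "received", "returned", "checked_in"]:
    loan.handle(message)
print(loan.state.value)
```

The same model serves both modes mentioned above: messages could arrive by mail or interactively, and in either case both parties can tell at any moment what state the request is in.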

Electronic Data Interchange. The communication of trade data - invoices, orders, etc. - is increasingly being automated. Work in this area is proceeding along a number of channels, though future convergence is expected. One important line of development is within the Edilibe project of the European Union, where UK and other partners are implementing agreements based on the international EDIFACT syntax.

Remote database access. It is likely that various cooperative arrangements will require communication between library housekeeping systems. Library systems will also need to communicate and share data with other financial and administrative systems within their organisations. Requirements in these areas are not yet clear and experience is limited.

Document interchange services

Applications will need a common understanding of the representation of the information objects they exchange. Libraries are beginning to exchange scanned page images, but the exchange of structured documents and multimedia and hypermedia objects will become increasingly important. Standards in three areas are of immediate interest: image (TIFF, GIF, fax, ...), page display (PostScript, Acrobat,...) and structured documents (SGML, HYTIME, ...). The first is important in the context of electronic document delivery and preservation. The Group on Electronic Document Interchange, comprising several document suppliers and major library organisations, has defined a profile for document delivery and their agreements are being implemented in a number of projects.

Most library 'document delivery' applications will create an image of existing paper products, because that is all they currently have to work with. New publisher-based document delivery services may do likewise: it is likely to be the easiest machine-readable version that many of them could produce, given the state of their type-setting systems. They may also want to ship a non-revisable final version, not something that can be further edited or changed. Structured document formats will become of greater interest. For example, it is likely that several of the proposals of the Follett Report [7] in the area of electronic books and journals will store the underlying files using SGML. SGML is a 'metalanguage' for defining the syntaxes, known as Document Type Definitions (DTDs), of specific classes of document. Systems for the interchange of documents, which allow for the further manipulation and processing of information for selective output to different types of media, for browsing, for on-demand publishing, etc., will benefit from agreed DTDs.

In the future libraries may find it useful to receive materials in SGML format, which would facilitate manipulation, selective output, searching, indexing, display, integration of bibliographic records with text, input into various database applications, and so on. This method of distribution may be resisted by some publishers for obvious reasons.
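The benefit of structured documents over page images can be shown with a short Python sketch. It uses XML syntax as a stand-in for an SGML document instance, because Python's standard library has no SGML parser; the element names (article, title, author, section, heading, para) are invented for the example and do not come from any agreed DTD. Because the structure is explicit, the same file can yield a bibliographic record and a table of contents without rekeying.

```python
import xml.etree.ElementTree as ET

# A tiny structured document; element names are hypothetical.
ARTICLE = """
<article>
  <title>Networks and standards</title>
  <author>A. N. Author</author>
  <section><heading>Background</heading><para>First paragraph.</para></section>
  <section><heading>Standards</heading><para>Second paragraph.</para></section>
</article>
"""

root = ET.fromstring(ARTICLE.strip())


def bibliographic_record(doc):
    """Selective output: just the fields a catalogue entry needs."""
    return {"title": doc.findtext("title"), "author": doc.findtext("author")}


def table_of_contents(doc):
    """Selective output: section headings, in document order."""
    return [section.findtext("heading") for section in doc.findall("section")]


print(bibliographic_record(root))
print(table_of_contents(root))
```

None of this is possible with a scanned page image, where title, author and headings are only patterns of pixels.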

Meta information services

Services are required which assist in the identification and use of other services. This is an area of intense activity on the Internet, and one can identify a number of levels of activity. Network information retrieval and resource discovery tools (Gopher, WorldWideWeb, Archie, etc.) are now very heavily used, though it is acknowledged that they represent a first generation and are the focus for significant development work. At the same time, various Internet Engineering Task Force Working Groups and other organisations are working on standard approaches to the naming of network resources, and are producing specifications for Uniform Resource Names (persistent unique resource identifiers) and Uniform Resource Locators (access method and location identifiers). Related initiatives are looking at standard ways of describing resources. The challenge will be to integrate these various strands to produce user-oriented resource discovery systems which help the user to identify and use resources of interest. Internationally, one can point to much diffuse activity, and to varying levels of library involvement in ongoing work, some of which is described elsewhere in this issue.
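The distinction between a persistent name and a locator can be illustrated as follows. In this Python sketch the resolver table, the name "urn:example:report-42" and the host names are all hypothetical; the only real API used is urlsplit from the standard library, which separates a locator into its access method (the scheme) and its location.

```python
from urllib.parse import urlsplit

# Hypothetical resolver: one persistent name, several current locations.
RESOLVER = {
    "urn:example:report-42": [
        "ftp://ftp.site-a.example/pub/report-42.txt",
        "gopher://gopher.site-b.example/0/reports/report-42",
    ],
}


def resolve(urn):
    """Map a persistent name to whatever locations are currently known."""
    return RESOLVER.get(urn, [])


def describe(url):
    """Split a locator into access method and location."""
    parts = urlsplit(url)
    return {"access_method": parts.scheme, "host": parts.hostname, "path": parts.path}


for location in resolve("urn:example:report-42"):
    print(describe(location))
```

If a copy moves to another host, only the resolver table changes; any catalogue record or citation that holds the name remains valid.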

Authentication, charging and related services

A well-understood protocol framework for these operations will be critical for the development of significant information services on the networks. Some services now accept credit card payments over the network.

 

Conclusion - information infrastructure

The information infrastructure has at least three components: connection, connectedness, and content. Connection is the enabling network infrastructure; connectedness is the flow of information facilitated by application protocols and data interchange services; content is the services and resources in which we are interested. Without connectedness we will be overwhelmed by the volume and diversity of content. (An interesting contrast could be made between library and Internet information resources in these terms. Libraries are high on content, but low on connectedness. The emergence of Gopher, WorldWideWeb and other tools in recent years has improved connectedness on the Internet, but content, while rapidly improving, is very uneven.)

One can then reformulate the two technical challenges faced by libraries as follows:

1. To modernise access to library resources through the construction of a libraries information infrastructure. (Systems are not now in place which permit the full exploitation of library resources, or which allow users to easily do what they want to do.)

2. To further develop library services through participation in the wider information infrastructure.

Of course these are continuous and overlapping, especially so as libraries move towards more electronic resources. This somewhat glib formulation is not meant to obscure the real complexity of what is required, and in conclusion, it might be useful to briefly highlight some difficulties.

The first is to do with the building blocks themselves. Many of the standards mentioned do not support routine production services. Some are still in development; some may not be accepted in the marketplace. It will be some time before some are sufficiently mature and tried to be widely implemented to support real-world requirements. Of course individual standards are not important in themselves. They are important as the enabling 'plumbing' which will allow the modernisation of bibliographic and other services. For example, Z39.50 itself is only important in as much as distributed bibliographic applications are now needed; it is the current lead candidate to build such applications.

The real advantages will be seen when these blocks begin to be put together, in support of transformed services. How do you support access to multiple regional OPACs, a table of contents service, BL files etc., in such a way that a user is provided with a seamless interface which allows them to identify, locate and request items they are looking for, subject to whatever are in place? How do you manage routine electronic document delivery services? How do you create appropriate links between existing library services and the Internet world of resource discovery tools, ftp sites, etc.? For example, one can imagine a service which runs a user's search against Archie indexes, and against a table of contents service, and consolidates the results. One can extend this to suggest that the user can select items for delivery, and that the service takes the appropriate action: gets the file from an anonymous ftp archive, or sends a request to a document supply centre. It is not too difficult to imagine which particular building blocks would be used to put these services together. Many other scenarios can also be suggested. However, the organisation of a number of these scenarios in an institutional setting is less easy to discern. In order to satisfy routine information requirements it is likely that the user may need to use several applications, and to have access to programs which convert formats, to printing facilities, to graphics services and so on. It is unrealistic to imagine that everything will sit on the desktop, though this seems to be a common view. Again, experience of organisation and integration will have to be gained in pilot or demonstrator projects.
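The scenario of a consolidated search followed by an appropriate delivery action can be sketched as below. This is an illustration under stated assumptions, not a description of any existing service: the two search functions are in-memory stand-ins for an index of ftp archives and a table of contents service, the sample entries are invented, and the "actions" are only described as text rather than carried out.

```python
def search_archive_index(term):
    """Stand-in for a query against an index of anonymous ftp archives."""
    files = [
        {"title": "Z39.50 implementors notes", "host": "ftp.example.org", "path": "/pub/z3950.txt"},
    ]
    return [dict(f, source="ftp") for f in files if term.lower() in f["title"].lower()]


def search_contents_service(term):
    """Stand-in for a query against a table of contents service."""
    articles = [
        {"title": "Z39.50 in practice", "journal": "Example Journal", "issue": "1(2)"},
    ]
    return [dict(a, source="toc") for a in articles if term.lower() in a["title"].lower()]


def consolidated_search(term):
    """Run one search against both services and merge the hits by title."""
    hits = search_archive_index(term) + search_contents_service(term)
    return sorted(hits, key=lambda hit: hit["title"].lower())


def delivery_action(hit):
    """Choose the action appropriate to where the item was found."""
    if hit["source"] == "ftp":
        return f"fetch ftp://{hit['host']}{hit['path']}"
    return f"send document supply request for '{hit['title']}', {hit['journal']} {hit['issue']}"


hits = consolidated_search("z39.50")
actions = [delivery_action(hit) for hit in hits]
for action in actions:
    print(action)
```

The user sees one list and one "deliver" operation; which building block is used behind it depends only on the source recorded with each hit.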


[1] This paper is based on a presentation given at a joint UKOLN/JUGL seminar, The importance of standards, London, 23 November, 1993. UKOLN: the Office for Library and Information Networking is funded by the BL R&D Department and by the Joint Information Systems Committee of the Higher Education Funding Councils. The views expressed in this article are the author's own.

[2] Dempsey, L., ed. Library bibliographic networks in Europe: a LIBER directory. 2nd edition. The Hague: NBLC, 1992.

[3] Batt, Chris. Public libraries and new technology. (available from Comedia, The Round, Bournes Green, Near Stroud, Glos GL6 7NL, £5)

[4] Arnold, K et al. Electronic library for higher education - an experiment at De Montfort University Milton Keynes. Journal of Information Networking, 1(2), 1993.

[5] Pica. OBN final report: from project to library user. Leiden: Pica, 1992.

[6] Dempsey, L., Mumford, A. and Tuck, B. Standards of relevance to networked library services. In: Libraries and IT: working papers of the HEFCs' Libraries Review IT Sub-committee. Bath: UKOLN, 1993. pp.131-155.

[7] Joint Higher Education Funding Councils (UK). Joint funding councils' Libraries Review Group: report. Bristol: HEFCE, 1993.