Please use this identifier to cite or link to this item: http://hdl.handle.net/1880/46201
Title: COMPRESSING THE DIGITAL LIBRARY
Authors: Bell, Timothy C.
Moffat, Alistair
Witten, Ian H.
Keywords: Computer Science
Issue Date: 1-Mar-1994
Abstract: The prospect of digital libraries presents the challenge of storing vast amounts of information efficiently and in a way that facilitates rapid search and retrieval. Storage space can be reduced by appropriate compression techniques, and searching can be enabled by constructing a full-text index. But these two requirements are in conflict: the need for decompression increases access time, and the need for an index increases space requirements. This paper resolves the conflict by showing how (a) large bodies of text can be compressed and indexed into less than half the space required by the original text alone, (b) full-text queries (Boolean or ranked) can be answered in small fractions of a second, and (c) documents can be decoded at the rate of approximately one megabyte a second. Moreover, a document database can be compressed and indexed at the rate of several hundred megabytes an hour.
URI: http://hdl.handle.net/1880/46201
Appears in Collections: Witten, Ian

Files in This Item:
File: 1994-537-06.pdf
Size: 1.48 MB
Format: Adobe PDF

