Updated from Wikipedia, last check: June 01, 2012 15:14 UTC

From Wikipedia, the free encyclopedia

DjVu
Filename extension: .djvu, .djv
Internet media type: image/vnd.djvu
Type code: DJVU
Developed by: AT&T Research
Initial release: 1996
Latest release: Version 27[1] / July 2006
Type of format: Image file format
Website: www.djvu.org

DjVu (pronounced like déjà vu) is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal (monochrome) images. This allows for high-quality, readable images to be stored in a minimum of space, so that they can be made available on the web.

DjVu has been promoted as an alternative[2] to PDF, as it gives smaller files than PDF for most scanned documents. The DjVu developers report[3] that color magazine pages compress to 40–70 kB, black and white technical papers compress to 15–40 kB, and ancient manuscripts compress to around 100 kB; a satisfactory JPEG image typically requires 500 kB. Like PDF, DjVu can contain an OCR text layer, making it easy to perform cut and paste and text search operations.
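As a rough illustration of what these figures imply, the short Python sketch below turns the sizes quoted above into size ratios against the 500 kB JPEG figure. The numbers come straight from the paragraph; the function and variable names are made up for this example, and the result is only as reliable as the developers' reported sizes.

```python
# Sizes in kB as quoted in the paragraph above (developer-reported figures).
JPEG_KB = 500

# (smallest, largest) reported DjVu size for each document type.
DJVU_RANGES_KB = {
    "color magazine page": (40, 70),
    "black-and-white technical paper": (15, 40),
    "ancient manuscript": (100, 100),
}


def size_ratio(jpeg_kb, djvu_kb):
    """Return how many times smaller the DjVu file is than the JPEG."""
    return jpeg_kb / djvu_kb


def ratio_range(jpeg_kb, low_kb, high_kb):
    """Return (smallest ratio, largest ratio) for a range of DjVu sizes."""
    # The largest DjVu file gives the smallest ratio, and vice versa.
    return size_ratio(jpeg_kb, high_kb), size_ratio(jpeg_kb, low_kb)


for name, (low, high) in DJVU_RANGES_KB.items():
    worst, best = ratio_range(JPEG_KB, low, high)
    print(f"{name}: {worst:.1f}x to {best:.1f}x smaller than a {JPEG_KB} kB JPEG")
```

Under these figures a color magazine page comes out roughly 7 to 12.5 times smaller than the JPEG, and a manuscript page about 5 times smaller.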

History

The DjVu technology was originally developed[3] by Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard at AT&T Laboratories in 1996.

Because of the high compression ratio and the ease with which large volumes of text can be converted into .djvu format, many of the academic texts circulated on the warez scene are in .djvu format, with PDF files a close second[citation needed].

Release history

DjVu version   Release date     Notes
1–19[1]        1996–1999        Developmental versions by AT&T Labs preceding the sale of the format to LizardTech.
20[1]          April 1999       DjVu version 3. DjVu changed from a single-page format to a multipage format.
21[1]          September 1999   Indirect storage format replaced. The searchable text layer was added.
22[1]          April 2001       Page orientation, color JB2.
23[1]          July 2002        CID chunk.
24[1]          February 2003    LTAnno chunk.
25[1]          May 2003         NAVM chunk. Support for DjVu bookmarks (outlines) was added. Changes made by versions 23 and 24 were made obsolete.
26[1]          April 2005       Text / line annotations.
27[1]          July 2006        "SDjVu" (secure DjVu) support added.

Compression

DjVu divides a single image into many different images, then compresses them separately. To create a DjVu file, the initial image is first separated into three images: a background image, a foreground image, and a mask image. The background and foreground images are typically lower-resolution color images (e.g., 100dpi); the mask image is a high-resolution bilevel image (e.g., 300dpi) and is typically where the text is stored. The background and foreground images are then compressed using a wavelet-based compression algorithm named IW44[3]. The mask image is compressed using a method called JB2 (similar to JBIG2). The JB2 encoding method identifies nearly-identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.
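The shape-sharing idea behind JB2 can be sketched in a few lines of Python. This is a toy model under loud assumptions, not the actual JB2 codec: it only merges shapes whose bitmaps are exactly identical (JB2 also matches nearly identical ones), it does no entropy coding, and every name in it (encode_shapes, decode_shapes, the tuple layouts) is invented for the illustration.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]            # rows of '0'/'1' characters (toy bilevel image)
Glyph = Tuple[Bitmap, int, int]     # (bitmap, x, y) as found on the page
Placement = Tuple[int, int, int]    # (shape_id, x, y) as stored after encoding


def encode_shapes(glyphs: List[Glyph]) -> Tuple[List[Bitmap], List[Placement]]:
    """Store each distinct bitmap once and record every place it occurs."""
    shape_ids: Dict[Bitmap, int] = {}
    shapes: List[Bitmap] = []
    placements: List[Placement] = []
    for bitmap, x, y in glyphs:
        if bitmap not in shape_ids:
            # First occurrence: add the bitmap to the shape dictionary.
            shape_ids[bitmap] = len(shapes)
            shapes.append(bitmap)
        # Every occurrence, new or repeated, is just an id plus a position.
        placements.append((shape_ids[bitmap], x, y))
    return shapes, placements


def decode_shapes(shapes: List[Bitmap], placements: List[Placement]) -> List[Glyph]:
    """Rebuild the list of positioned bitmaps from the dictionary."""
    return [(shapes[shape_id], x, y) for shape_id, x, y in placements]


# Two hand-drawn 4x5 glyphs standing in for the letters "e" and "t".
E = ("0110", "1001", "1111", "1000", "0111")
T = ("0100", "1110", "0100", "0100", "0011")

page = [(E, 0, 0), (T, 5, 0), (E, 10, 0), (E, 15, 0)]
shapes, placements = encode_shapes(page)
print(len(shapes), "unique shapes for", len(page), "glyphs")
print(placements)
```

In this toy page the letter "e" appears three times but its bitmap is stored once; the two later occurrences cost only a shape id and a position each.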

Optionally, these shapes may be mapped to ASCII codes (either by hand or potentially by a text recognition system), and stored in the DjVu file. If this mapping exists, it is possible to select and copy text.
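A minimal Python sketch of why such a mapping makes text selectable: if each shape occurrence is stored as a (shape id, x, y) triple, and some shape ids have been assigned a character, the text can be read back by walking the occurrences in reading order. The data layout and the names extract_text and shape_to_char are assumptions made for this example; this is not how a DjVu file encodes its text layer.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, int, int]  # (shape_id, x, y); hypothetical layout


def extract_text(placements: List[Placement], shape_to_char: Dict[int, str]) -> str:
    """Return mapped characters in top-to-bottom, left-to-right order.

    Shapes without a mapping (for example, ones no recognizer has
    labelled yet) are skipped rather than guessed.
    """
    ordered = sorted(placements, key=lambda p: (p[2], p[1]))  # sort by y, then x
    return "".join(
        shape_to_char[shape_id]
        for shape_id, _x, _y in ordered
        if shape_id in shape_to_char
    )


# Shape 0 has been labelled "e" and shape 1 "t"; shape 2 is unlabelled.
placements = [(0, 0, 0), (1, 5, 0), (0, 10, 0), (2, 0, 12)]
shape_to_char = {0: "e", 1: "t"}
print(extract_text(placements, shape_to_char))
```

Without the mapping the same page is still displayable, but there is nothing to select or search, which is the distinction the paragraph above draws.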

Format licensing

DjVu is a free file format[2].

In 2002, the DjVu file format was chosen by the Internet Archive as the format in which its Million Book Project provides scanned public domain books online (along with TIFF and PDF).[4]

The file format specification is published, as is the source code for the reference library[citation needed].

The ownership rights to the commercial development of the encoding software have been transferred to different companies over the years, including AT&T, LizardTech, Celartem and Caminova.

The original authors maintain a GPL-licensed implementation named "DjVuLibre".

References

  1. Jim Rile: "DjVu File Format Version". PlanetDjVu. Posted 2007-02-23.
  2. "What is DjVu - DjVu.org" (in English). DjVu.org. http://djvu.org/resources/whatisdjvu.php. Retrieved 2009-03-05.
  3. Léon Bottou, Patrick Haffner, Paul G. Howard, Patrice Simard, Yoshua Bengio and Yann Le Cun: High Quality Document Image Compression with DjVu, Journal of Electronic Imaging, 7(3):410-425, 1998. http://leon.bottou.org/publications/pdf/jei-1998.pdf
  4. "Image file formats - OLPC". Wiki.laptop.org. http://wiki.laptop.org/go/DJVU. Retrieved 2008-09-09.

Simple English

DjVu
File extension: .djvu, .djv
MIME type: image/vnd.djvu
Type code: DJVU
Developed by: AT&T Research
Type of format: Image file format

DjVu (pronounced like déjà vu) is a computer file format. It is made mostly to store scanned documents. It is especially used for things with a mix of words, line drawings, and photographs inside. DjVu has been promoted as an alternative[1] to PDF, as it gives smaller files than PDF for most scanned documents. The DjVu developers report[2] that color magazine pages can be made as small as 40–70 kB. Black and white technical papers can be made as small as 15–40 kB. Old manuscripts can be made as small as about 100 kB. A satisfactory JPEG image usually needs 500 kB. Like PDF, DjVu can have an OCR text layer. This makes it easy to cut and paste, and to search for text.

References

  1. "What is DjVu - DjVu.org" (in English). DjVu.org. http://djvu.org/resources/whatisdjvu.php. Retrieved 2009-03-05. 
  2. Léon Bottou, Patrick Haffner, Paul G. Howard, Patrice Simard, Yoshua Bengio and Yann Le Cun: High Quality Document Image Compression with DjVu, Journal of Electronic Imaging, 7(3):410-425, 1998 http://leon.bottou.org/publications/pdf/jei-1998.pdf
