The previous release of the DjVu plug-in already supported multipage DjVu documents in two formats: INDEXED and BUNDLED. We found them slightly inefficient and error-prone to use. These formats have therefore been made obsolete (and are now called OLD_INDEXED and OLD_BUNDLED) and are replaced by two new formats: INDIRECT and BUNDLED. Here is a brief comparison of the four formats:
Format | Description |
---|---|
OLD_INDEXED format | Every page was stored in a separate file, which made this format good for web publishing. Every file referenced (through the INCL chunk) a directory (index) file containing the list of all pages composing the document. No matter which page you loaded, you had access to all other pages because every page included the directory file. |
OLD_BUNDLED format | Same as OLD_INDEXED except that all page files and the directory file are packaged into one bundle (archive). |
INDIRECT format | In this format every page is also stored in one or more files, so it replaces the OLD_INDEXED format. It is great for web publishing, since users only need to download the pages that they view. No directory (index) file is included in every page; instead, a top-level document index file references every page. To load an INDIRECT document, load that top-level file. Document pages and files can be shared between more than one document. This format supports shared dictionaries. |
BUNDLED format | Same as INDIRECT format except for the fact that all page files and the directory file are packaged into one bundle (archive), which makes this format ideal for archiving images and sending them via email. |
The DjVu plug-in 3.0 supports all four formats and can be used to convert documents from any of them to INDIRECT or BUNDLED.
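As a rough illustration of how a tool might tell the two new formats apart, the sketch below inspects the first bytes of a file. The byte layout is an assumption taken from the publicly available DjVu3 file format specification, not from this text: an `AT&T` magic, an IFF `FORM` header, a form type of `DJVM` for multipage documents, and a leading `DIRM` chunk whose first payload byte carries a "bundled" flag in its top bit. The function name `djvu_container_kind` is hypothetical, and the obsolete OLD_INDEXED and OLD_BUNDLED layouts are deliberately not handled.

```python
def djvu_container_kind(data: bytes) -> str:
    """Classify a DjVu file from its leading bytes (illustrative sketch only).

    Assumed layout (DjVu3 specification, not stated in this document):
      bytes 0-3   b"AT&T" magic
      bytes 4-7   b"FORM"
      bytes 8-11  big-endian length of the form
      bytes 12-15 form type: b"DJVU" (single page) or b"DJVM" (multipage)
      bytes 16-19 first chunk id, expected to be b"DIRM" in a DJVM form
      bytes 20-23 big-endian chunk size
      byte  24    flags: top bit set means BUNDLED, clear means INDIRECT
    """
    if len(data) < 16 or data[:4] != b"AT&T" or data[4:8] != b"FORM":
        return "not-djvu"
    form_type = data[12:16]
    if form_type == b"DJVU":
        # A single page, which may also be one component file of an INDIRECT document.
        return "single-page"
    if form_type != b"DJVM":
        return "unknown"
    if len(data) < 25 or data[16:20] != b"DIRM":
        # For example an older directory layout that this sketch does not cover.
        return "unknown"
    flags = data[24]
    return "BUNDLED" if flags & 0x80 else "INDIRECT"
```

In practice the function only needs the first few dozen bytes, for example `djvu_container_kind(open(path, "rb").read(32))`, where `path` would point at the bundle or at the top-level index file of an INDIRECT document.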
The DjVu 3.0 plug-in supports so-called Pseudo-DjVu files, in which the background and/or foreground layers are encoded using the standard JPEG algorithm and/or the mask is stored in G4/MMR format. This format allows for extremely rapid creation of DjVu-compatible content.
To compress multipage documents even better, DjVu 3.0 now supports shared dictionaries. The DjVu compressor can be made to scan the whole document for repeating shapes and store them in a file shared by all the pages. This shared file is called a shared dictionary. Storing repeating shapes only once makes it possible to achieve even better compression ratios for multipage documents. The DjVu plug-in 3.0 supports this new, efficient format.
Note that this feature is completely independent of the OCR and searching capabilities of DjVu 3.0.
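The shared-dictionary idea can be illustrated with a small sketch. This is not the DjVu compressor's actual algorithm: shapes are stood in for by plain string identifiers, and the function name `split_shared_shapes` is hypothetical. It only shows the bookkeeping of moving shapes that occur on more than one page into a single shared set, so that each is stored once.

```python
from collections import defaultdict


def split_shared_shapes(pages):
    """Split per-page shape lists into a shared set and page-local sets.

    `pages` maps a page name to the list of shape identifiers found on it.
    A shape used by two or more pages goes into the shared dictionary;
    everything else stays with its page.
    """
    pages_using = defaultdict(set)
    for page, shapes in pages.items():
        for shape in shapes:
            pages_using[shape].add(page)

    shared = {shape for shape, users in pages_using.items() if len(users) > 1}
    local = {page: set(shapes) - shared for page, shapes in pages.items()}
    return shared, local


# Hypothetical two-page document: "a" and "e" repeat across pages.
pages = {"p1": ["a", "b", "e"], "p2": ["a", "e", "x"]}
shared, local = split_shared_shapes(pages)

# Shapes stored without sharing vs. with a shared dictionary.
stored_without = sum(len(set(shapes)) for shapes in pages.values())
stored_with = len(shared) + sum(len(s) for s in local.values())
```

With the sample input, six shape records shrink to four, because the two repeated shapes are stored once instead of once per page.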
The previous releases of the DjVu plug-in supported some flags that could be specified in the EMBED tag in order to customize the plug-in's behavior. It was possible to make the plug-in completely passive, to stretch the image to fit the EMBED area precisely, etc. In 3.0, we added more flags and we made some of the existing flags obsolete. Please refer to the section on EMBED flags for the list of all the flags that are currently supported.
Creators of DjVu documents can now run an OCR (Optical Character Recognition) engine on every page and store the resulting text in a special text chunk (TXTz) inside the page. If this has been done, the viewer and the plug-in will allow you to search for a string in a page or in the whole document.
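The difference between searching one page and searching the whole document can be sketched as follows. The example assumes the text of each page has already been extracted from its text chunk into a plain string; decoding TXTz itself is not shown, and the function name `search_text` is hypothetical rather than part of the viewer or plug-in.

```python
def search_text(page_texts, needle, page=None):
    """Find case-insensitive matches of `needle` in per-page text.

    `page_texts` is a list of strings, one per page, in page order.
    `page` is a 1-based page number to restrict the search to, or None
    to search the whole document.
    Returns a list of (page_number, offset) pairs; offsets refer to the
    lowercased page text.
    """
    if not needle:
        return []
    needle = needle.lower()
    indices = range(len(page_texts)) if page is None else [page - 1]

    hits = []
    for i in indices:
        text = page_texts[i].lower()
        start = text.find(needle)
        while start != -1:
            hits.append((i + 1, start))
            start = text.find(needle, start + 1)
    return hits


# Hypothetical OCR output for a two-page document.
texts = ["DjVu is compact.", "Search the DjVu text; djvu again."]
whole_document = search_text(texts, "djvu")
first_page_only = search_text(texts, "djvu", page=1)
```

A page without a text chunk would simply contribute an empty string here, which is why searching only works for pages on which OCR has been run.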