Second Edition, Emacs-w3 Version 2.2.0
June 1995
William M. Perry
wmperry@spry.com
Copyright © 1993, 1994, 1995 William M. Perry
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Emacs-w3 is an Emacs subsystem that allows the user to browse the wonderful World Wide Web.
The World Wide Web was started at the CERN physics institute in Switzerland in 1991. The project was initially started by Tim Berners-Lee (timbl@w3.org) for distributing data between different research groups effectively.
The Web has since grown into the most advanced information system currently on the internet. It is now a global hypertext system, with servers and browsers (programs written to interpret the hypertext language, display it correctly, and allow the user to follow links) available for all major platforms (VMS, Windows, DOS, Unix, VM, NeXTstep, Amiga, and Macintosh).
The basic concepts used in the Web are hypertext and hypermedia. Hypertext is the same as regular text, with one exception—it can contain links (cross-references) to other textual documents. Hypermedia is slightly different—it can contain links to other forms of media (movies, sounds, interactive programs, etc.).
WWW also allows searches of indices that are located anywhere on the network; in this respect, it mirrors certain capabilities found in both WAIS and Gopher. The WWW world consists of documents and links. Indexes are special documents which, rather than being read, may be searched. The result of such a search is another virtual document containing links to the documents found. A simple protocol, HTTP, is used to allow a browser program to request a keyword search by a remote information server.
The web contains documents in many formats. Those documents which are hypertext (real or virtual) contain links to other documents, or places within documents. All documents, whether real, virtual or indexes, look similar to the reader and are contained within the same addressing scheme. WWW browsers can access many existing data systems via existing protocols (FTP, NNTP) or via HTTP and a gateway. In this way, the critical mass of data is quickly exceeded, and increasing use of the system by readers and by information suppliers each encourages the other.
Providing information is as simple as running the WWW server and pointing it at an existing directory structure. The server automatically generates a hypertext view of the files to guide the user around.
To personalize it, a few SGML hypertext files can be written to give an even more friendly view. Also, any file available by anonymous FTP, or any internet newsgroup can be immediately linked into the web. The very small start-up effort is designed to allow small contributions. At the other end of the scale, large information providers may provide an HTTP server with full text or keyword indexing. This may allow access to a large existing database without changing the way that database is managed. Such gateways have already been made into Oracle(tm), WAIS, and Digital’s VMS/Help systems, to name but a few.
The WWW model gets over the frustrating incompatibilities of data format between suppliers and readers by allowing negotiation of format between a smart browser and a smart server. This should provide a basis for extension into multimedia, and allow those who share application standards to make full use of them across the web.
Several different markup languages, and various extensions to those languages, are supported by Emacs-w3. HTML is composed of a set of elements that define a document and guide its display. An HTML element may include a name, some attributes and some text or hypertext, and appears in an HTML document as <tag_name>text</tag_name>, <tag_name attribute_name=argument>text</tag_name>, or just <tag_name>.
For example: ‘<title> My Useful Document </title>’, and ‘<pre width=60> A lot of text here. </pre>’.
An HTML document is composed of a single element: <html>...</html>, that is, in turn, composed of head and body elements: <head>...</head>, and <body>...</body>. To allow older HTML documents to remain readable, <html>, <head>, and <body> are actually optional within HTML documents.
All the tags and attributes of HTML are fully supported in Emacs-w3.
The full HTML 2.0 specification is available via the web at http://www.hal.com/users/connolly/html-spec/HTML_TOC.html, or via anonymous ftp to www.ics.uci.edu:/pub/ietf/html/.
The HTML 3.0 language is an extension of HTML, with a large degree of backwards compatibility with HTML 2.0. The latest revision of HTML 3.0 can be retrieved via anonymous ftp to ftp.w3.org; the file is /www/arena/html3-dtd.txt. The following HTML 3.0 language elements are supported by Emacs-w3:
HTML 3.0 extends the power of the BASE tag by allowing multiple BASE tags in a single document, each with a NAME or ID attribute. Any other tag that requires either the HREF or SRC attribute can also have the BASE attribute. If this matches the NAME/ID of a BASE tag, then that URL is used as the basis for resolving the relative link. If no BASE tag matches, a BASE tag with no NAME/ID is used. If no such BASE tag exists, the URL of the document is used.
The default alignment. Causes text to be flush left, with a ragged right edge.
Causes text to be flush right, with a ragged left margin.
Causes text to be centered between the left and right margins (this can be affected by list indentation, blockquotes, and a few other tags).
Causes text to be fully justified (both right and left margins smooth).
Causes text to be indented from the left margin.
For unordered lists (<UL>), you can specify that the browser should not insert any bullets on list items by using the new PLAIN attribute.
I hate to say it, but I broke down and actually included some of the Netscape extensions in Emacs-w3.
You can change the font size. Valid values range from 0 to 7. The default font size is 3. The value given can optionally have a ’+’ or ’-’ character in front of it to specify that it is relative to the document basefont.
This ugly, ill-thought-out alternative to the HTML 3.0 align attribute on headers and paragraphs was included for compatibility, and as an example of how not to do things.
The isindex tag can now take a prompt attribute, to get rid of the default ’This is a searchable index’ label.
You can now control the width of a horizontal rule, in relation to the total window size. The WIDTH attribute specifies how wide the rule should be, as a percentage of the window width.
The ALIGN attribute specifies where the horizontal rule is placed. Valid values are left, right, center, and indent.
You can specify various colors and formatting options on a document-wide basis. This is done with various attributes on the BODY tag. Style sheets are really a better way to do this, and are recommended; this is just for compatibility. See section Style Sheets.
Specifies a graphic to tile in the background of the document. This only works in XEmacs 19.12.
Specifies the background of the document, as a color instead of a graphic. color should be either an RGB specification (like ‘"#FF00BB"’), or a logical color name (like ‘"PaleGoldenrod"’). The logical color names supported are system dependent.
Specifies the color of text on the page. color should be either an RGB specification (like ‘"#FF00BB"’), or a logical color name (like ‘"PaleGoldenrod"’). The logical color names supported are system dependent.
Specifies the color of hypertext links on the page. color should be either an RGB specification (like ‘"#FF00BB"’), or a logical color name (like ‘"PaleGoldenrod"’). The logical color names supported are system dependent.
Specifies the color of hypertext links that have been visited already. color should be either an RGB specification (like ‘"#FF00BB"’), or a logical color name (like ‘"PaleGoldenrod"’). The logical color names supported are system dependent.
There are several different markup elements that are not officially part of HTML or HTML 3.0 that Emacs-w3 supports. These are either items that were dropped from HTML 3.0 after I had implemented them, or experimental parts of HTML that should not be counted as "official" or long lived.
<hr label="testing" textalign="right"> yields
----------------------------------------------------------testing-
<hr label="testing" textalign="center"> yields
-----------------------------testing------------------------------
<hr label="testing" textalign="left"> yields
-testing----------------------------------------------------------
Renders the enclosed text in a suitably ugly font/color combination. If no default has been set up by the user, this is the default font, with red text on a yellow background.
When selected, the enclosed text runs and hides under the nearest window. OR, giggles a lot and demands nachos, depending on your definition of "roach." (the formal definition, of course, to be determined by the Official Honorary Internet Standards Committee For Moving Really Slowly.)
Should anyone be foolish enough to think that HTML is still SGML and try to run a Netscape-HTML document through an SGML editor, processor, or other tool, this tag causes an immediate core dump, erases anything on your disk with "DTD" in the name, and emails a randomly-selected insult to Tim Pierce.
Emacs-w3 just inserts a rude comment.
Inserts "zippyisms" into the enclosed text. Perfect for those professional documents. This is sure to be a favorite of mine!
In order to read the enclosed text, you have to have secret spy decoder glasses (available direct from Mcom for a reasonable fee). You can also read it by holding your computer in front of a full moon during the autumn solstice.
In Emacs-w3, this displays the text using rot13 encoding.
Causes Marc Andreessen to magically appear and grant you an interview (whether you want one or not). Please use this tag sparingly.
So you want more control over screen layout in HTML? Well, here ya go.
Actually, <peek> could almost be considered useful. The VARIABLE attribute can be used to insert the value of an Emacs variable into the current document. Things like ’Welcome to my page, <peek variable=user-mail-address>’ can be useful in freaking people out.
Summons the elder gods to suck away your immortal soul. Or Bill Gates, if the elder gods are busy. Unpredictable (but amusing) results occur when the <YOGSOTHOTH> and <HYPE> tags are used in close proximity.
Causes the enclosed text to .... ooops that one made it in.
Emacs-w3 supports the following protocols:
Can either display an entire newsgroup or specific articles by Message-ID: header. This supports a unix-style .newsrc file, so the user does not see articles they have read using another newsreader, but due to how news URLs work, Emacs-w3 cannot update the user’s .newsrc after they have read news.
To be more in line with the other URL schemes, Emacs-w3 lets you specify the hostname and port of an NNTP server in a news URL. URLs of the form news://hostname:port/messageID work, but will not work in the majority of other browsers (yet).
Supports both the HTTP/0.9 and HTTP/1.0 protocols. Fully MIME-compliant with regard to HTTP/1.0. See section HTTP/1.0 Support.
Support for all gopher types, including CSO queries.
Support for Gopher+ retrievals. Support for converting ASK blocks into HTML 3.0 FORMS and submitting them back to the server.
FTP is handled by either ange-ftp or efs; the choice is up to the individual user.
Local files are handled, and MIME content-types are derived from the file extensions.
Telnet is handled by running the Emacs Lisp function telnet, or spawning an xterm running telnet.
TN3270 is handled by running a tn3270 program in an Emacs buffer, or spawning an xterm running tn3270.
Causes a mail message to be started to a specific address.
A more powerful version of mailto, which allows the author to specify the subject and body text of the mail message. This type of link is never fully executed without user confirmation, because it is possible to insert insulting or threatening (and possibly illegal) data into the message. The mail message is displayed, and the user must type ’yes’ to send it.
A URL can cause a local executable to be run, and its output interpreted as if it had come from an HTTP server. This is very useful, but is still an experimental protocol, hence the X- prefix.
SSL requires a set of patches to the Emacs C code and SSLRef 2.0, or an external program to run in a subprocess (similar to the ‘tcp.el’ package that comes with GNUS). See section SSL.
Work is in progress to add support for the Secure HTTP specification from Enterprise Information Technologies. The specification for SHTTP can be found on EIT’s web server at http://www.commerce.net/information/standards/drafts/shttp.txt.
This section of the manual deals with getting, compiling, and configuring Emacs-w3.
If using Lucid Emacs 19.9 or later, skip to the section on starting Emacs-w3. Emacs-w3 comes standard with all versions of Lucid Emacs from 19.9 onwards.
Three files are required when first installing Emacs-w3. All of them can be found via anonymous ftp to ftp.cs.indiana.edu:/pub/elisp/w3. The files are ‘w3-xxx.tar.gz’, where xxx is the version number of Emacs-w3 being retrieved; ‘icons.tar.gz’, which contains approximately 50 small icons that Emacs-w3 requires and that are also available to HTML authors; and ‘extras.tar.gz’, which contains ange-ftp, html-mode, and nntp.
After retrieving the files, unpack them with the following commands:
zcat w3-xxx.tar.gz | tar xvvf -
zcat icons.tar.gz | tar xvvf -
zcat extras.tar.gz | tar xvvf -
This unpacks the distribution into three subdirectories: ‘w3’, ‘icons’, and ‘extras’. To compile and install the packages in the extras directory, please see the comments at the top of each lisp file.
To install Emacs-w3, go into the ‘w3’ subdirectory and edit the ‘Makefile’. These variables might need to be changed:
EMACS
This variable controls what version of Emacs is used to compile the programs. It should be the full path to the Emacs executable on the system. The default is to use GNU Emacs (‘emacs’).
LISPDIR
This variable controls where the lisp code is copied to when it is installed (with make install). This is usually the user’s personal lisp code directory (I prefer ‘~/lisp’). The value is run through expand-file-name and then added to the load-path.
DOTEMACS
This variable points to the Emacs customization file, usually ‘~/.emacs’.
INFODIR
This variable points to the local info directory (usually ‘/usr/local/info’). This can be any valid directory, as long as it is in Info-default-directory-list so that info-mode can find it.
MAKEINFO
This variable controls how the info files are built. Possible values are makeinfo or emacs -batch -q -f batch-texinfo-format.
Once the ‘Makefile’ has been modified, several different targets can be built.
make w3
This compiles all the .el files into the much faster .elc files. If Jamie Zawinski’s optimizing byte compiler (standard in GNU Emacs 19 and Lucid Emacs) is used, then a few compilation warnings are displayed (not many, hopefully). These can be safely ignored as long as everything finishes compiling. This is the default target for make with no arguments.
make install
Compiles all the .el files and copies the .el and .elc files into the directory specified by LISPDIR.
make emacs
Modifies the file specified by DOTEMACS. A statement modifying the load-path variable and several autoload statements are added to the end of the file.
make all
Compiles and installs the .el files, and also modifies or creates the DOTEMACS file.
make w3.info
Creates the Emacs-readable info files. The info files are created in the directory specified by INFODIR. The makefile variable MAKEINFO determines how the info file is built.
make w3.dvi
Creates the printable documentation, using tex and texindex to properly generate the indices. A ‘w3.dvi’ file is left in the current directory.
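As a rough illustration of what the make emacs target described above adds, the lines appended to the DOTEMACS file might look something like the following. This is only a sketch: the manual does not show the exact text the Makefile writes, ‘~/lisp’ stands in for whatever LISPDIR is set to, and the autoloaded command name w3 and its docstring are assumptions.

```elisp
;; Illustrative sketch only; the exact text written by `make emacs' is
;; not shown in this manual.  "~/lisp" stands in for the LISPDIR value.
(setq load-path (cons (expand-file-name "~/lisp") load-path))

;; Assumed entry point: an interactive command `w3' defined in w3.el(c).
(autoload 'w3 "w3" "Start the Emacs-w3 World Wide Web browser." t)
```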
There are a few variables that almost all people need to change.
url-bad-port-list
List of ports to warn the user about connecting to. Defaults to just the mail and NNTP ports so a malicious HTML author cannot spoof mail or news to other people.
url-be-asynchronous
Controls whether document retrievals over HTTP should be done in the background. This allows Emacs to keep working in other windows while large downloads occur. Defaults to nil. This variable is buffer-local, so use setq-default after loading Emacs-w3 to globally modify it, or set it in your ‘~/.emacs’ file.
url-confirmation-func
What function to use for asking yes or no questions. Possible values are 'yes-or-no-p or 'y-or-n-p, or any function that takes a single argument (the prompt) and returns t only if a positive answer is given. Defaults to 'yes-or-no-p.
w3-default-action
A lisp symbol specifying what action to take for files with extensions that are not in the mm-mime-extensions assoc list. This is useful in case Emacs-w3 ever runs across files with weird extensions (.foo, .README, .READMEFIRST, etc.). In most circumstances, this should not be required anymore.
Possible values: any lisp symbol. It should be a function that takes no arguments. The return value is ignored. Some examples are 'w3-prepare-buffer or 'indented-text-mode.
w3-default-homepage
The URL to open at startup. It can be any valid URL. This defaults to the environment variable WWW_HOME if it is not set in the user’s ‘.emacs’ file. If WWW_HOME is undefined, then it defaults to the hypertext documentation for Emacs-w3 at Indiana University.
w3-delay-image-loads
Controls the loading of inlined images. If non-nil, images are not loaded. For slow network connections, this is usually set to t. Defaults to nil.
w3-delimit-emphasis
Whether to use characters at the start and end of each bold/italic region. Types of characters are specified in w3-style-tags-assoc. w3-style-tags-assoc is an assoc list keyed by style name; the cdr of each element is a cons cell. The car of this cell is the string to insert at the beginning of the emphasis, and the cdr is the string to insert at the end of the emphasis. The default value is a good example.
((b "*" . "*")
 (address "*" . "*")
 (byline "_" . "_")
 (cite "_" . "_")
 (cmd "*" . "*")
 (dfn "*" . "*")
 (em "~" . "~")
 (i "~" . "~")
 (q "\"" . "\"")
 (removed "" . "")
 (s "" . "")
 (strong "*" . "*")
 (sub "" . "")
 (sup "" . "")
 (u "_" . "_"))
w3-delimit-links
Controls how hypertext links are displayed. If this variable is eq to 'linkname, then the NAME or ID of the link is inserted after the link text. If nil, then nothing is done. If it is non-nil and not eq to 'linkname, then [[ & ]] is inserted around the entire text of the link. It is initially set to t in normal Emacs and nil in Epoch or Lucid Emacs, since links there should be in different colors/fonts.
url-global-history-file
The global history file used by both Mosaic/X and Emacs-w3. This file contains a list of all the URLs that have been visited. This file is parsed at startup and used to provide URL completion. Emacs-w3 can read and write Mosaic/X or Netscape/X style history files, or use its own internal format (faster). The file type is determined automatically, or prompted for if the file does not exist.
w3-hotlist-file
Hotlist filename. This should be the name of a file that is stored in NCSA’s Mosaic/X or Netscape’s format. It is used to keep a listing of commonly accessed URLs without having to go through 20 levels of menus to get to them.
w3-personal-annotation-directory
The directory where Emacs-w3 looks for personal annotations. This is a directory that should hold the personal annotations stored in a Mosaic-compatible format. (ncsa-mosaic-personal-annotation-log-format-1)
url-pgp/pem-entity
The name by which the user is known to PGP and/or PEM entities. If this is not set when w3-do-setup is run, it defaults to (user-real-login-name)@(system-name), which can often be wrong.
url-personal-mail-address
Your full email address. This is what is sent to HTTP/1.0 servers as the FROM field. If this is not set when w3-do-setup is run, then it defaults to the value of url-pgp/pem-entity.
w3-right-border
Amount of space to leave on the right margin of WWW buffers. This amount is subtracted from (window-width) for each new WWW buffer and used as the new fill-column.
w3-track-mouse
Controls whether to track the mouse and display the URL under the mouse. If this is non-nil, then a description of the hypertext area under the mouse is shown in the minibuffer. This shows what type of link (inlined image, form entry area, delayed image, delayed MPEG, or hypertext reference) is under the cursor, and the destination. This only works in Emacs-19, Lucid Emacs, or XEmacs.
w3-use-forms-index
Non-nil means translate <ISINDEX> tags into a hypertext form. A single text entry box is drawn where the ISINDEX tag appears. If t, the isindex handling is the same as Mosaic for X.
url-use-hypertext-gopher
Controls how gopher documents are retrieved. If non-nil, the gopher pages are converted into HTML and parsed just like any other page. If nil, the requests are passed off to the ‘gopher.el’ package by Scott Snyder. Using the ‘gopher.el’ package loses the gopher+ support and inlined searching.
url-wais-gateway-port
The port # of the WAIS gateway to pass all wais:// requests to. See section Native WAIS Support
url-wais-gateway-server
The machine name where the WAIS gateway lives. See section Native WAIS Support
url-xterm-command
Command used to start a windowed shell, similar to an xterm. This string is passed through format, and should expect four strings: the title of the window, the program name to execute, and the server and port number. The default is for xterm, which is very unix-centric, but is the most common case.
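To pull several of the variables above together, here is a minimal sketch of what a ‘~/.emacs’ fragment might look like. The variable names come from the descriptions above; the particular values, the example.org URL, the integer type assumed for w3-right-border, and the xterm template are illustrative assumptions rather than shipped defaults.

```elisp
;; url-be-asynchronous is buffer-local, so change its default value.
(setq-default url-be-asynchronous t)

(setq url-confirmation-func 'y-or-n-p      ; shorter yes/no prompts
      ;; Hypothetical home page; any valid URL can be used here.
      w3-default-homepage "http://www.example.org/"
      w3-delay-image-loads t               ; do not load inlined images
      w3-delimit-links 'linkname           ; NAME/ID shown after link text
      w3-right-border 2                    ; assumed to be an integer
      url-use-hypertext-gopher t           ; gopher pages rendered as HTML
      ;; Assumed template: the four %s slots are filled, in order, with
      ;; the window title, the program to run, the server, and the port.
      url-xterm-command "xterm -title %s -e %s %s %s")
```

With w3-right-border set to 2 as above, a window 80 columns wide would get a fill-column of 78, following the subtraction described for that variable.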
There are several different reasons why you may need to use the gateway support in Emacs-w3.
NOTE: Emacs 19.22 has patches to enable native TERM networking. To enable it, #define TERM in the appropriate s/*.h file for your operating system, then change the SYSTEM_LIBS define to include the ‘termnet’ library that comes with the latest versions of TERM.
If for some reason your system administrator will not recompile Emacs with the ‘-lresolv’ library or dynamic linking, you need to act as if you were behind a firewall. Another alternative is to set the variable url-broken-resolution; this will use the support in ange-ftp or EFS to use ‘nslookup’ in a subprocess to do all hostname resolving. See the variables efs-nslookup-program, efs-nslookup-on-connect, and efs-nslookup-threshold if you are using EFS, or ange-ftp-nslookup-program if you are using Ange-Ftp.
exit or closed. This causes retrieval of HTTP and gopher pages to hang indefinitely, with Emacs chewing up large amounts of your CPU time.
NOTE: You do not need to use gateways anymore for this problem.
With the release of Lucid Emacs 19.10, this problem is fixed if you #define FIX_SOLARIS_W3_BUG somewhere in your config.h or s/*.h configuration file. This will be in the stock 19.11 release of XEmacs.
Emacs-w3 has support for using the gateway mechanism for certain domains, and directly connecting to others. To use this, you must change the value of url-gateway-local-host-regexp. This should be a regular expression that matches local hosts that do not require the use of a gateway. If nil, then all connections are made through the gateway.
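A short sketch of such a setting, assuming a hypothetical site where every host under cs.indiana.edu can be reached directly. How Emacs-w3 applies the pattern internally (for example, exactly which form of the host name it matches against) is not described here, so the string-match calls below only check the regular expression itself.

```elisp
;; Hypothetical site policy: hosts under cs.indiana.edu are local and
;; are contacted directly; everything else goes through the gateway.
(setq url-gateway-local-host-regexp "\\.cs\\.indiana\\.edu$")

;; Checking the pattern by hand with `string-match':
(string-match url-gateway-local-host-regexp "moose.cs.indiana.edu") ; non-nil
(string-match url-gateway-local-host-regexp "ftp.w3.org")           ; nil
```

Note the doubled backslashes: inside a Lisp string, "\\." is needed to produce the regular expression \. that matches a literal dot.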
Emacs-w3 supports several methods of getting around gateways. The variable url-gateway-method controls which of these methods is used. This variable can have several values (use these as symbol names, not strings):
Run a program in a subprocess to connect to remote hosts (examples are itelnet, an expect script, etc.).
This allows you to log into another local computer that has access to the internet, and run a telnet-like program from there.
Masanobu UMEDA (umerin@mse.kyutech.ac.jp) has written a very nice replacement for the standard networking in Emacs. This does basically the same thing that a method of program does, but is slightly more transparent to the user.
This means that Emacs-w3 should use the builtin networking code of Emacs. This should be used only if there is no firewall, or someone at your site has already hacked the Emacs source to get around your firewall.
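For instance, selecting the subprocess-based method might look like this. The symbol program is the one this section refers to as "a method of program"; the value is a quoted symbol, not a string.

```elisp
;; Use a helper program in a subprocess for all outside connections.
;; Note the quote: the value is the symbol `program', not "program".
(setq url-gateway-method 'program)
```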
Two of these need a bit more explanation than that:
If you are running a program in a subprocess to emulate a network connection, you need to set a few extra variables. The variable url-gateway-telnet-program should point to an executable that accepts a hostname and port # as its arguments, and passes standard input to the remote host. This can be either the full path to the executable or just the basename. The variable url-gateway-telnet-ready-regexp controls how long Emacs-w3 should wait after spawning the subprocess to start sending to its standard input. This gets around a bug where telnet would miss the beginning of requests because it did not buffer its input before opening a connection. This should be a regular expression to watch for that signifies the end of the setup of url-gateway-telnet-program. The default should work fine for telnet.
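A minimal sketch of the program method, assuming plain telnet is the helper. The ready regexp shown is an assumption based on the banner telnet prints once connected; it is not necessarily the default value, which the manual does not list.

```elisp
(setq url-gateway-method 'program
      ;; Basename or full path of a program taking HOST and PORT arguments.
      url-gateway-telnet-program "telnet"
      ;; Assumed pattern: wait for telnet's connection banner before
      ;; sending the request to the subprocess.
      url-gateway-telnet-ready-regexp "Escape character is .*")
```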
If you are using the host-based gateway method, things get a bit more complicated. This is basically my attempt to do some of the basic stuff of expect within elisp. First off, set the variable url-gateway-host to be the name of your gateway machine.
The variable url-gateway-connect-program controls how the host is reached. The easiest way is to have a program that does not require a username and password to allow you to login. The most common of these is the rsh command.
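In the simple rsh case, the setup might be as small as the following sketch. The host name gateway.example.com is a placeholder, and whether further variables are needed depends on the local setup.

```elisp
(setq url-gateway-method 'host
      url-gateway-host "gateway.example.com"   ; hypothetical gateway machine
      url-gateway-connect-program "rsh")       ; no password prompt expected
```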
If you do not have rsh, then things get very ugly. First, set the variable url-gateway-program-interactive to non-nil. Then you need to define the variables url-gateway-host-username and url-gateway-host-password to be the username and password necessary to log into the gateway machine. The regular expressions in the variables url-gateway-handholding-login-regexp and url-gateway-handholding-password-regexp should match the login and password prompts on the gateway system respectively. For example:
(setq url-gateway-connect-program "telnet"
      url-gateway-host-program "telnet"
      url-gateway-program-interactive t
      url-gateway-host-username "wmperry"
      url-gateway-host-password "yeahrightkeepdreaming"
      url-gateway-host "moose.cs.indiana.edu"
      url-gateway-host-program-ready-regexp "Escape character is .*"
      url-gateway-handholding-login-regexp "ogin:"
      url-gateway-handholding-password-regexp "ord:")
This should take care of you logging into the remote system. The variable url-gateway-host-prompt-pattern should contain a regular expression that matches the shell prompt on the remote machine. This should appear nowhere in the login banner/setup, or things could get very confused.
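As an illustration, a prompt pattern for a remote shell whose prompt ends in one of #, $, % or > might be written as below. The pattern and the sample prompt "moose% " are assumptions; the right value depends entirely on the prompt the gateway machine actually prints.

```elisp
;; Assumed prompt style: some text followed by one of # $ % > and
;; optional spaces, e.g. "moose% ".
(setq url-gateway-host-prompt-pattern "^[^#$%>\n]*[#$%>] *")

;; A quick check against a sample prompt:
(string-match url-gateway-host-prompt-pattern "moose% ")   ; returns 0
```

A pattern this loose would also match any line of the login banner that happens to contain one of those characters, which is exactly the confusion warned about above, so a more specific pattern is safer when the banner is noisy.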
Now you are ready to actually get off of your local network! The variable url-gateway-host-program-ready-regexp should contain a regular expression that matches the end of the setup of url-gateway-host-program when it tries to make a connection to an off-firewall machine. (Basically the same as url-gateway-telnet-ready-regexp.)
Now you should be all set up to get outside your local network. If none of this makes sense, it’s probably my fault. Please check with your network administrators to see if they have a program that does most of this for you already, since somebody somewhere at your company has probably been through something similar to this before, and would be much more helpful/knowledgeable about your local setup than I would be. But feel free to mail me as a last resort.
In late January 1993, Kevin Altis and Lou Montulli proposed and implemented a new proxy service. This service requires the use of environment variables to specify a gateway server/port # to send protocol requests to. Each protocol (http, wais, gopher, ftp, etc.) can have a different gateway server. The environment variables are PROTOCOL_proxy, where PROTOCOL is one of gopher, file, http, ftp, or wais.
The main thing to understand about the proxy gateway is that a client must send it a full URL (http://..., gopher://..., ftp://...), instead of the partial URL that a client sends today when talking directly to an HTTP server. The rest of the HTTP message is the same. For gopher and ftp, the proxy gateway server returns the data to the client encapsulated as a MIME content type, like a normal HTTP message. HTTP MIME content types are returned for all URL requests, regardless of the protocol type of the URL. FTP directories, Gopher directories, etc. are returned as text/html.
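The difference between a direct request and a proxied one can be sketched in a few lines of Lisp. The function my-http-request-line is a hypothetical helper written for this illustration, not part of Emacs-w3, and the proxy address in the setenv call is a placeholder; whether the environment variable is picked up after Emacs has started is also an assumption.

```elisp
;; The environment variables are named PROTOCOL_proxy, e.g. http_proxy.
;; Hypothetical proxy address:
(setenv "http_proxy" "http://proxy.example.com:8080/")

(defun my-http-request-line (host path &optional via-proxy)
  "Return the HTTP/1.0 request line for PATH on HOST.
With VIA-PROXY non-nil, send the full URL, as a proxy gateway expects;
otherwise send only the partial URL (the path)."
  (format "GET %s HTTP/1.0\r\n"
          (if via-proxy
              (concat "http://" host path)
            path)))

(my-http-request-line "www.w3.org" "/index.html")
;; => "GET /index.html HTTP/1.0\r\n"
(my-http-request-line "www.w3.org" "/index.html" t)
;; => "GET http://www.w3.org/index.html HTTP/1.0\r\n"
```

Only the request line changes; any headers that follow it are the same in both cases.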
Emacs-w3 is similar to the Info package all Emacs users hold near and dear to their hearts (See Info in The Info Manual, for a description of Info). Basically, space and backspace control scrolling, and return or mouse2 follows a hypertext link. The f and b keys maneuver around the various links on the page.
NOTE: To enter data into a form entry area, you must select it using return or mouse2 just like a hypertext link.
On non-graphic terminals (vt100, DOS, etc.), or with a graphics terminal and old versions (18.xx) of Emacs, hypertext links are surrounded by ’[[’ and ’]]’ by default. On a graphics terminal with newer versions of Emacs (Epoch, Lucid, or FSF 19), the links are in bold print. See section Controlling Formatting for information on how to change this, or for help on getting the highlighting to work on graphics terminals.
There are approximately 50 keys bound to special Emacs-w3 functions. The basic rule of thumb regarding keybindings in Emacs-w3 is that a lowercase key takes an action on the current document, and an uppercase key takes an action on the document pointed to by the hypertext link under the cursor.
There are several areas that the keybindings fall into: movement, information, action, and miscellaneous.
Scroll downward in the buffer. With prefix arg, scroll down that many screenfuls.
Scroll upward in the buffer. With prefix arg, scroll up that many screenfuls.
Attempts to move backward one link area in the current document. Signals an error if no previous links are found.
A hypertext listing of the items in the hotlist is generated on the fly, and the links can be followed normally.
Possibly go to a link in the hotlist. A new buffer is created for the new document.
Choose a link from the current buffer and follow it. A completing-read is done on all the links, so space and TAB can be used for completion.
Attempts to move forward one link area in the current document. Signals an error if no more links are found.
These functions relate information about one or more links on the current document.
This shows the URL of the current document in the minibuffer.
This shows the URL of the hypertext link under point in the minibuffer. If there is not a hypertext link under point, then it shows the type of form entry area under point. If there is no form entry area under point, then it shows the inlined image’s URL that is under point, if any.
Shows miscellaneous information about the currently displayed document. This includes the URL, the last modified date, MIME headers, the HTTP response code, and the LINK tags found in the document (that describe relations between this and other documents).
Shows information about the URL at point. If it is an HTTP link, the HTTP/1.0 HEAD method is used to retrieve information, and this is what is displayed. If it is a file or an ftp link, information about the file is shown, including whether it is a directory, who owns it, last access, modified, and changed times, the size, and the file type that Emacs-w3 thinks it is (text/plain, image/gif, etc).
This shows the HTML source of the current document in a separate buffer. The buffer-name is based on the document’s URL.
Shows the HTML source of the hypertext link under point in a separate buffer. The buffer-name is based on the document’s URL.
This stores the current document’s URL in the kill ring, and also in the current window-system’s clipboard, if possible.
Stores the URL of the document under point in the kill ring, and also in the current window-system’s clipboard, if possible.
First, here are the keys and functions that bring up a new hypertext page, usually creating a new buffer.
Pressing return when over a hyperlink attempts to follow the link under the cursor. With a prefix argument (C-u), this forces the file to be saved to disk instead of being passed off to other viewers or being parsed as HTML.
Pressing return when over a form input field will prompt in the minibuffer for the data to insert into the input field. Type checking is done, and the data is only entered into the form when data of the correct type is entered (i.e., you can’t enter 44 for a ’date’ field).
This function expects to be bound to a mouse button. It moves to the point under mouse and tries to fetch the link that was clicked on. If no link is found, a message is displayed in the minibuffer.
This function tries to retrieve the inlined image that is under point. It ignores any form entry areas or hyperlinks, and blindly follows any inlined image. Useful for seeing images that are meant to be used as hyperlinks when not on a terminal capable of displaying graphics.
Prints out the current buffer in a variety of formats, including Postscript, HTML source, or formatted text.
Prints out the URL under point in a variety of formats, including Postscript, HTML source, or formatted text.
Choose from a list of all the hyperlinks in the current buffer. Use space and tab to complete on the links.
Reload the current document—the current buffer is killed, and the URL it was visiting is fetched and redisplayed. The position within the buffer remains the same (unless the document has changed since it was last retrieved, in which case it should be relatively close).
This function prompts for a URL in the minibuffer, and attempts to fetch it. If there are any errors, or Emacs-w3 cannot understand the type of link requested, the errors are displayed in a hypertext buffer.
Find a local file, interactively. This prompts for a local file name to open. The file must exist, and may be a directory. If the requested file is a directory and url-use-hypertext-dired is nil, then a dired-mode buffer is displayed. If non-nil, then Emacs-w3 automatically generates a hypertext listing of the directory. The hypertext mode is the default, so that all the keys and functions remain the same.
Perform a search, if this is a searchable index. This sends a string of the type ‘URL?search-terms’ to the server this document was retrieved from. Searching requires a server; Emacs-w3 cannot do local file searching, as there are too many possible types of searches people could want to do. Generally, the only URL types that allow searching are HTTP, gopher, and X-EXEC.
If url-keep-history is non-nil, then Emacs-w3 keeps track of all the URLs visited in an Emacs session. This function takes all the links that are in that internal list, and formats them as hypertext links in a list.
And here are the commands to move around between Emacs-w3 buffers:
Quits WWW mode. This kills the current buffer and goes to the most recently visited buffer.
This is similar to w3-quit, but the buffer is not killed; it is moved to the bottom of the buffer list (so it is the least likely to show up as the default with switch-to-buffer). This is different from w3-goto-last-buffer in that it does not return to the last WWW page visited. It is the same as using switch-to-buffer: the buffer left in the window is fairly random.
Take one step back along the path in the current history. Has no effect if at the beginning of the history list.
Take one step forward along the path in the current history. Has no effect if at the end of the history list.
Mails the current document to someone. Choose from several different formats to mail: formatted text, HTML source, PostScript, or LaTeX source. When the HTML source is mailed, then an appropriate <base> tag is inserted at the beginning of the document so that relative links may be followed correctly by whoever receives the mail.
Mails the document pointed to by the hypertext link under point to someone. Choose from several different formats to mail: formatted text, HTML source, PostScript, or LaTeX source. When the HTML source is mailed, then an appropriate <base> tag is inserted at the beginning of the document so that relative links may be followed correctly by whoever receives the mail.
Prints the current document. Choose from several different formats to print: formatted text, HTML source, or PostScript (with ps-print, or by using LaTeX and dvips).
When the formatted text is printed, the normal lpr-buffer function is called, and the variables lpr-command and lpr-switches control how the document is printed.
When the HTML source is printed, an appropriate <base> tag is inserted at the beginning of the document.
When PostScript is printed, the HTML source of the document is converted into LaTeX source. If the variable w3-use-html2latex is non-nil, then the program specified by w3-html2latex-prog is run in a subprocess with the arguments in w3-html2latex-args. The w3-html2latex-prog must accept HTML source on its standard input and send the LaTeX output to standard output. If w3-use-html2latex is nil, then an Emacs Lisp function uses regular expressions to replace the HTML code with LaTeX markup. The variable w3-latex-docstyle controls how the document is laid out in this case, and PostScript figures are printed as well.
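As a rough sketch of the printing setup described above, something like the following could go in ‘~/.emacs’. The printer name myprinter is a placeholder, and the example assumes lpr-switches takes a list of strings, as it does in standard Emacs; the values are illustrative and are not taken from the Emacs-w3 sources.

```elisp
;; Formatted-text printing goes through lpr-buffer, so the standard
;; lpr variables apply.  "myprinter" is a hypothetical printer name.
(setq lpr-command "lpr")
(setq lpr-switches '("-Pmyprinter"))

;; Use the built-in Emacs Lisp HTML-to-LaTeX conversion instead of the
;; external program named by w3-html2latex-prog.
(setq w3-use-html2latex nil)
```

Setting w3-use-html2latex to a non-nil value instead would hand the conversion to the external filter, which has to read HTML on standard input and write LaTeX to standard output.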
Prints the document pointed to by the hypertext link under point. Please see the documentation for w3-print-this-url directly above for more information.
Insert a fully formatted HTML link into another buffer. This gets the name and URL of either the current buffer, or, with a prefix arg, of the link under point, constructs the appropriate <a...>...</a> markup, and inserts it into the desired buffer.
Inserts the URL of the current document into another buffer. Buffer is prompted for in the minibuffer. With prefix arg, uses the URL of the link under point.
Select one of the <LINK> tags from this document and fetch it. Links are attributes of a specific document, and can tell such things as who made the document, where a table of contents is located, etc.
Link tags specify relationships between documents in two ways: normal (forward) relationships (where the link has a REL="xxx" attribute), and reverse relationships (where the link has a REV="xxx" attribute). This first asks what type of link to follow (Normal or Reverse), then does a completing-read on only the links that have that type of relationship.
Since NCSA Mosaic for Xwindows or Netscape is the de-facto hypertext browser at most sites, Emacs-w3 is compatible with them in several ways.
In order to avoid having to traverse many documents to get to the same document over and over, Emacs-w3 supports a “hotlist” like Mosaic. This is a file that contains URLs and aliases. Hotlists allow quick access to any document in the Web, provided it has been visited and added to the hotlist.
The variable w3-hotlist-file determines where this information is saved. The structure of the file is compatible with Mosaic’s hotlist file, so this defaults to ‘~/.mosaic-hotlist-default’.
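To keep the Emacs-w3 hotlist separate from Mosaic’s, the variable can be pointed at another file. A minimal sketch, assuming the setting is made in ‘~/.emacs’ before Emacs-w3 reads the hotlist; the file name ‘~/.w3-hotlist’ is a made-up example:

```elisp
;; Store the hotlist in a private file instead of sharing Mosaic's.
;; "~/.w3-hotlist" is a hypothetical name chosen for illustration.
(setq w3-hotlist-file (expand-file-name "~/.w3-hotlist"))
```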
Hotlist commands are:
Converts a Netscape bookmark file into Emacs-w3’s internal hotlist format.
This adds the current document to the hotlist, with the buffer name as its identifier. Modifies the file specified by w3-hotlist-file. If this is given a prefix-argument (via C-u), the title is prompted for instead of automatically defaulting to the buffer-name.
This rereads the default hotlist file specified by w3-hotlist-file.
Prompts for the alias of the entry to kill. Pressing the spacebar or tab will list out partial completions. The internal representation of the hotlist and the file specified by w3-hotlist-file are updated.
Some hotlist item names can be very unwieldy (‘Mosaic for X level 2 fill out form support’), or uninformative (‘Index of /’). If you are not satisfied with how a specific item is labeled, you may change it by typing M-x w3-rename-hotlist-entry. Prompts for the item to rename in the minibuffer; use the spacebar or tab key for completion. After having chosen an item to rename, prompts for a new title until a unique title is entered. Modifies the file specified by w3-hotlist-file.
Prompts for the alias to jump to. Pressing the spacebar or tab key shows partial completions.
This converts the hotlist into HTML and displays it.
NCSA Mosaic keeps track of the URLs followed from a page, so that it can provide forward and back buttons to keep a path of URLs that can be traversed easily.
If the variable url-keep-history is t, then Emacs-w3 keeps a list of all the URLs visited in a session.
To view a listing of the history for this session of Emacs-w3, use M-x w3-show-history from any buffer. Emacs-w3 generates an HTML document showing every URL visited since Emacs started (or since the history list was last cleared), and then formats it. Any of the links can be chosen and followed to the original document. To clear the history list, choose ’Clear History’ from the ’Options’ menu.
Another twist on the history list mechanism is that every Emacs-w3 buffer remembers what URL, buffer, and buffer position you were at before jumping to this document, and also keeps track of where you jump to from that buffer. This means that you can go forwards and backwards very easily along the path you took to reach a particular document. To go forward, use the function w3-forward-in-history; to go backward, use the function w3-backward-in-history. These are fairly stable functions, but may not work as expected all the time. First, the buffer-list is used to look at the URL of every buffer, and if it matches the item in the history list you are looking for, then it is brought forward. If no buffer containing the desired URL is found, then the URL is fetched. Then the desired position in the buffer is searched for.
Mosaic and Netscape support the idea of a “history” of URLs the user has visited, and display them in a different style than normal URLs.
If the variable url-keep-history is t, then Emacs-w3 keeps a list of all the URLs visited in a session. The list is automatically written to disk when exiting Emacs. Its entries are added to those already in the file specified by url-global-history-file, which defaults to ‘~/.mosaic-global-history’.
If any URL in the list is found in the file, it is not saved, but new ones are added at the end of the file.
The function that saves the global history list is smart enough to notice what style of history list you are using (Netscape or XMosaic), and writes out the new additions appropriately.
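A minimal sketch of enabling the global history from ‘~/.emacs’, using only the two variables described above. The file name shown is simply the documented default, spelled out for clarity:

```elisp
;; Record every URL visited during the session.
(setq url-keep-history t)

;; File that new URLs are appended to when Emacs exits.  This is the
;; documented default; point it elsewhere to keep a separate history.
(setq url-global-history-file
      (expand-file-name "~/.mosaic-global-history"))
```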
One of the nice things about keeping a global history file is that Emacs-w3 can use it as a completion table. When doing M-x w3-fetch, pressing the tab or space keys will show all completions for a partial URL. This is very useful, especially for very long URLs that are not in a hotlist, or for seeing all the pages from a particular web server before choosing which to retrieve.
Mosaic can annotate documents. Annotations are comments about the current document, and these annotations appear as a link to the comments at the end of the document when you browse it in Mosaic. The original file is not changed. There are two types of annotations supported in Mosaic, and both are supported by Emacs-w3 as well.
NOTE: The group annotation experiment has been terminated. It will be replaced with support on the server side for adding <LINK> tags to documents.
If you do not want to share your musings about a particular document with the entire network, you can add a personal annotation that only you can see. Personal annotations are stored in a subdirectory in the user’s account on the local disk, with a log file that contains information about what URLs have been annotated and which files contain the annotations.
Emacs-w3 looks in the directory specified by w3-personal-annotation-directory (defaults to ‘~/.mosaic-personal-annotations’). Any personal annotations for a document are automatically appended when it is retrieved.
To add a new personal annotation, type M-x w3-add-personal-annotation. This creates a new buffer, in the mode specified by w3-annotation-mode. This defaults to html-mode. If this variable is nil, or it points to an undefined function, then default-major-mode is consulted.
A minor mode redefines C-c C-c to complete the annotation and store it on the local disk.
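The two annotation variables can be set from ‘~/.emacs’. The following is only a sketch: the directory is the documented default written out explicitly, and text-mode is an arbitrary choice of an existing major mode to use in place of html-mode.

```elisp
;; Directory holding the annotation files and their log.
(setq w3-personal-annotation-directory
      (expand-file-name "~/.mosaic-personal-annotations"))

;; Major mode function used for the annotation editing buffer.
;; text-mode is just an example; any defined mode function should do.
(setq w3-annotation-mode 'text-mode)
```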
To delete a personal annotation, it must be the current page. While reading the annotation, M-x w3-delete-personal-annotation will remove it. This deletes the file containing the annotation, and any references to it in the annotation log file.
Editing personal annotations is not yet supported.
How Emacs-w3 formats a document is very customizable. How a document is displayed depends on whether the user is on a terminal capable of graphics and a few variables.
The following sections describe in more detail how to change the formatting of a document.
Each time a document is parsed, the fill-column is recalculated using window-width and w3-right-border. w3-right-border is an integer specifying how much room at the right edge of the screen to leave blank. The fill-column is set to (- (window-width) w3-right-border).
If the variable w3-delimit-links is non-nil (the default for text-terminals), then hypertext links are surrounded by text specified by the user. The variables w3-link-start-delimiter and w3-link-end-delimiter control what text is at the start and end of a hypertext link. These variables are cons-pairs of two strings.
If a link has never been visited before (it is not in the global history), then the car of these variables is inserted at the start and end of the link. If the link has been visited before, then the cdr is inserted. So, links look like:
[[This is a hypertext link]] that has never been visited. {{This one, however}} has been seen before at some point in time.
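Settings along the following lines would reproduce the link appearance shown above. This is a sketch for ‘~/.emacs’; the border width of 2 is an arbitrary illustrative value.

```elisp
;; Leave two blank columns at the right edge.  In an 80-column window
;; the fill-column then becomes (- 80 2), i.e. 78.
(setq w3-right-border 2)

;; Surround links with delimiter text.
(setq w3-delimit-links t)

;; Each variable is a cons of (UNVISITED . VISITED) strings: the car is
;; used for links never seen before, the cdr for links in the history.
(setq w3-link-start-delimiter '("[[" . "{{"))
(setq w3-link-end-delimiter   '("]]" . "}}"))
```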
There are several different ways to control the formatting of lists. The most obvious is how deeply they are indented relative to the rest of the paragraphs in the document. To control this, set the variable w3-indent-level. This is the number of spaces to indent lists and other items requiring special margins.
Another thing that is easy to change about lists is the bullet character put at the front of each list item. This is controlled by the variable w3-list-chars-assoc, which is an assoc list. This is a list of lists, each sublist describing what to put at the start of each particular list type. The car of this list should be a symbol (not a string) representing the type of list (e.g., ‘ul’). The rest of the list should consist of strings to insert at certain levels of lists. The nth element of this list is used when the list is nested n + 1 levels. If the list is not long enough to define a string for a certain nesting level, then it defaults to either a ’*’ or a ’.’.
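For illustration, the list settings might be customized as follows. The bullet strings and the indent of 4 are arbitrary choices. Note that this setq replaces the whole assoc list, so list types other than ul would fall back to the default characters mentioned above.

```elisp
;; Indent lists (and other specially indented items) by four spaces.
(setq w3-indent-level 4)

;; Bullets for unordered lists.  The key is a symbol; the strings are
;; indexed by nesting depth: element 0 for top-level items, element 1
;; for lists nested two levels deep, and so on.
(setq w3-list-chars-assoc
      '((ul "*" "-" "+")))
```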
When Emacs-w3 encounters a link to a directory (whether by local file access or via ftp), it can either create an HTML document on the fly, or use dired-mode to peruse the listing. The variable url-use-hypertext-dired controls this behavior.
If the value is t, Emacs-w3 uses directory-files to list the files and transform the directory into a hypertext document, then passes it through the parser like any other document.
If the value is nil, Emacs-w3 just passes the directory off to dired using find-file. Using this option loses all the hypertext abilities of Emacs-w3, and the user is unable to load documents in the directory directly into Emacs-w3 by clicking with the mouse, etc.
A new option in the 2.2 series is url-forms-based-ftp. This is still in the experimental stages, but can be useful. If url-forms-based-ftp is t, then all automatically generated directory listings will have a form mixed in with the file listing. Each file will have a checkbox next to it, and a row of buttons at the bottom of the screen. Selecting one of the buttons at the bottom of the screen will take the designated action on all the marked files. Currently, only deleting and copying marked files is supported.
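A short sketch of the two directory-listing switches as they might appear in ‘~/.emacs’; both values simply follow the descriptions above:

```elisp
;; Render directories as hypertext rather than handing them to dired.
(setq url-use-hypertext-dired t)

;; Experimental: add checkboxes and action buttons to generated
;; directory listings.
(setq url-forms-based-ftp t)
```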
There are two different ways of viewing gopher links: the built-in support that converts gopher directories into HTML, or the ‘gopher.el’ package by Scott Snyder (snyder@fnald0.fnal.gov).
The variable that controls this is w3-use-hypertext-gopher. If set to nil, then ‘gopher.el’ is used. Any other value causes Emacs-w3 to use its internal gopher support. If using ‘gopher.el’, all the hypertext capabilities of Emacs-w3 are lost. All the functionality of ‘gopher.el’ is now available in the hypertext version, and the hypertext version supports Gopher+ and ASK blocks.
The main way to control the display of gopher directories is by the variable w3-gopher-labels. This variable controls the text that is inserted at the front of each item. This is an assoc list of gopher types (as one character strings), and a string to insert just after the list item. All the normal gopher types are defined. Entries should be similar to: ‘("0" . "(TXT)")’. I have tried to keep all the tags to three characters plus two parentheses.
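As a sketch, the gopher settings could be adjusted like this once Emacs-w3 has been loaded. The boundp guard is an assumption made so the form is harmless if evaluated before the variable exists; the entry itself is the sample from the text, where "0" is the gopher type for a plain text file.

```elisp
;; Use the built-in hypertext gopher support rather than gopher.el.
(setq w3-use-hypertext-gopher t)

;; Put a label for gopher type "0" at the front of the assoc list, so
;; it takes precedence over any existing entry for that type.
(if (boundp 'w3-gopher-labels)
    (setq w3-gopher-labels
          (cons '("0" . "(TXT)") w3-gopher-labels)))
```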
Horizontal rules (<HR> tags in HTML[+]) are used to separate chunks of a document, and are meant to be rendered as a solid line across the page. Some terminals display characters differently, so the variable w3-horizontal-rule-char controls which character is used to draw a horizontal bar. This variable must be the ASCII value of the character, not a string. The variable is passed through make-string whenever a horizontal rule of a certain width is necessary.
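Because the value is a character code rather than a string, the ?c read syntax is the natural way to write it. A small sketch; the choice of a hyphen and the width of 20 are only examples:

```elisp
;; ?- reads as the character code of a hyphen, not as a string.
(setq w3-horizontal-rule-char ?-)

;; make-string builds a rule of the requested width from that code;
;; here, a string of twenty hyphens.
(make-string 20 w3-horizontal-rule-char)
```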
On character based terminals, there is no easy way to show that a certain range of text is in bold or italics. If the variable w3-delimit-emphasis is non-nil, then Emacs-w3 can insert characters before and after character formatting commands in HTML documents. The default value of w3-delimit-emphasis is automatically set based on the type of window system and version of Emacs being used.
Two variables control what text is inserted around different markup tags. w3-header-chars-assoc controls what characters are inserted around header items, and w3-style-chars-assoc controls what characters are inserted around most other markup (italics, addresses, etc.).
w3-header-chars-assoc is an assoc list of header tags and a list of formatting instructions. The car of the list is the level of the header (1–6). The rest of the list should contain three items. The first item is text to insert before the header. The second item is text to insert after the header. Both should have reserved characters converted to their HTML[+] entity definitions. The third item is a function to call on the area the header is in. This function is called with arguments specifying the start and ending character positions of the header. The starting point is always first. To convert a region to upper case, please use w3-upcase-region instead of upcase-region, so that URLs within the region are not corrupted.
w3-style-chars-assoc is an assoc list of style tags and a list of strings. The car of the list is the type of style tag it specifies (DFN, B, I, etc.). The rest of the list should contain two items. The car is text to insert before the stylized text. The cdr is text to insert after the stylized text. Both should have reserved characters converted to their HTML[+] entity definitions.
If using Emacs 19.2x on a VT100 compatible terminal, Emacs-w3 can show links, headers, and various other types of emphasis in bold or underlined text.
To do this, set the variable w3-emacs19-hack-faces-p to non-nil in your ‘~/.emacs’ file. Also make sure that the environment variable TERM is set to the correct terminal type.
If there is a function called w3-emacs19-hack-TERMINAL, then this is used to set up the special characters that turn on bold and underlined text. If this function does not exist, it is fairly easy to write one from scratch, using the terminal’s entry in the ‘/etc/termcap’ file.
Each function should use the standard-display-table to replace ^A, ^B, ^C, and ^D with escape sequences that turn on highlighting. When reading the ‘/etc/termcap’ file, be on the lookout for these codes:
us
Code to turn on underlining
ue
Code to turn off underlining
mb
Code to turn on boldface type
se
Code to turn off all attributes
Here is an example for creating the VT100 control sequences:
(defun w3-emacs19-hack-vt100 ()
  "Hack 'faces' for ttys (vt100)"
  ;; Create the display table if Emacs has not made one yet.
  (or standard-display-table
      (setq standard-display-table (make-vector 261 nil)))
  ;; ^A and ^B: the us and ue sequences (underlining on and off).
  (aset standard-display-table 1 (vector (create-glyph "\e[4m")))
  (aset standard-display-table 2 (vector (create-glyph "\e[m")))
  ;; ^C and ^D: the mb and se sequences (emphasis on, attributes off).
  (aset standard-display-table 3 (vector (create-glyph "\e[5m")))
  (aset standard-display-table 4 (vector (create-glyph "\e[m"))))
To turn off the highlighting features, set the variable w3-emacs19-hack-faces-p to nil and execute the function w3-emacs19-unhack-faces.
NOTE: This highlighting is not perfect and could cause some odd display glitches, especially when Emacs does a smart redisplay and doesn’t redraw the whole screen. C-l usually fixes these problems.
When running in a graphic environment (Xwindows or NeXTstep, for example), the fonts and colors used by Emacs-w3 to display text can be controlled by setting a few resources. To specify these resources:
The resources for each version of Emacs are the same. For each style of text that Emacs-w3 uses, you can specify any of the following resources by replacing <style> with the actual style name.
Emacs-w3 uses these special style names:
w3-node-style
For links to other documents
w3-visited-node-style
For displaying hypertext links that have been viewed before
All other styles can be specified by using the tag name as the <style> section of the resource. For example: ‘Emacs*h1.attributeForeground’, or ‘Emacs*address.attributeForeground’.
When running in Lucid Emacs 19.10 or XEmacs 19.11 and higher, Emacs-w3 can display inlined images and MPEG movies. There are several variables that control how and when the images are displayed.
Since Lucid/XEmacs only natively understands XPixmaps and XBitmaps, GIFs and other image types must first be converted to one of these formats. To do this, the netpbm utilities are normally used. This is a suite of freeware image conversion tools. The variable w3-graphic-converter-alist controls how each image type is converted. This is an assoc list, keyed on the MIME content-type. The car is the content-type, and the cdr is a string suitable to pass to format. A %s in this string will be replaced with a converter from the ppm image format to an XPixmap (or XBitmap, if being run on a monochrome display). By default, the Emacs-w3 browser has converters for:
Since most displays are (sadly) not 24-bit, Emacs-w3 can automatically dither an image, so that it does not fill up the application’s colormap too quickly. If w3-color-use-reducing is non-nil, then the images will use reduced colors. If w3-color-filter is eq to 'ppmquant, then the ppmquant program will be used. If eq to 'ppmdither, then the ppmdither program will be used. The ppmdither program tends to give better results. The values of w3-color-max-red, w3-color-max-blue, and w3-color-max-green control how many colors the inlined images can use. If using ppmquant, then the product of these three variables is used as the maximum number of colors per image. If using ppmdither, then only the set number of color cells can be allocated per image. See the man pages for ppmdither and ppmquant for more information on how the dithering is actually done. w3-color-filter may also be a string, specifying exactly what external filter to use. An example is: ‘ppmquant -fs -map ~/pixmaps/colormap.ppm’.
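The color-reduction variables might be combined as in the sketch below. The limits of 4 per channel are arbitrary illustrative values, not recommended defaults.

```elisp
;; Reduce the number of colors used by inlined images.
(setq w3-color-use-reducing t)

;; Choose the filter by symbol: 'ppmquant or 'ppmdither.
(setq w3-color-filter 'ppmdither)

;; Per-channel limits.  With 'ppmquant the product of the three
;; (4 * 4 * 4 = 64 here) would be the maximum colors per image.
(setq w3-color-max-red   4
      w3-color-max-green 4
      w3-color-max-blue  4)

;; Alternatively, name an external filter command as a string:
;; (setq w3-color-filter "ppmquant -fs -map ~/pixmaps/colormap.ppm")
```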
When running in XEmacs 19.11 or XEmacs 19.12, Emacs-w3 can insert an MPEG movie in the middle of a buffer. This utilizes a now deprecated feature of HTML 3.0, so its use should be limited to pages you do not mind modifying once the standard way to do this is nailed down.
The basic syntax is:
<embed href="somevideo.mpg" type="video/mpeg">
This requires a special version of the standard ‘mpeg_play’ mpeg player. Patches against the 2.0 version are available at ftp://ftp.cs.indiana.edu/pub/elisp/w3/mpeg_patch. The variable w3-mpeg-program should point to this executable, and w3-mpeg-args should be a list of any additional arguments to be passed to the player. By default, this includes -loop, so the mpeg plays continuously.
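A sketch of pointing Emacs-w3 at a patched player. The installation path is hypothetical, and the example assumes that the arguments are given as a list of strings, which the text above does not state explicitly.

```elisp
;; Location of the patched mpeg_play binary (hypothetical path).
(setq w3-mpeg-program "/usr/local/bin/mpeg_play")

;; Extra arguments for the player, assumed here to be strings.
;; Keeping "-loop" preserves the default continuous playback.
(setq w3-mpeg-args '("-loop"))
```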
Because images and movies can take up an incredible amount of bandwidth, it is useful to be able to control whether they are loaded or not. By default, images and movies are loaded automatically, but the variables w3-delay-image-loads and w3-delay-mpeg-loads control this. If set to non-nil, then the images or movies are not loaded until explicitly requested by the user.
To load any delayed images, use the function w3-load-delayed-images. Its counterpart for delayed movies is w3-load-delayed-mpegs.
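On a slow link, both kinds of loading could be deferred from ‘~/.emacs’. This sketch uses only the two variables named above:

```elisp
;; Do not fetch inlined images or movies until asked to.
(setq w3-delay-image-loads t)
(setq w3-delay-mpeg-loads t)
```

With these set, a page is displayed without its media, and the two loading functions named above fetch the images or movies for the current document on request.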
The new revision of the HTTP specification adds much more functionality to the server side of a transaction. Access authorization has been added, and several types of redirection can occur. All of this negotiation and redirection should take place before the user ever sees the first requested page—this avoids the overhead of parsing any error messages or old documents the server may have returned with the redirection or authorization message. The new protocol is also MIME (Multipurpose Internet Mail Extensions, see RFC 1341) compliant.
One of the most useful aspects of HTTP/1.0 is the ability to transparently move files between different servers (perhaps even different protocols). Most of the WWW browsers support redirection in some form or another. The Emacs browser supports all three types of redirection in the HTTP/1.0 specification (response codes 301, 302, and 303).
Whenever a redirection response is detected, the URL specified by the Location: header is retrieved. All relative references are resolved before requesting the new URL.
An HTML editor that is tightly integrated with Emacs-w3 is planned, and will include the ability to edit documents to change their links if a permanent relocation is seen.
Lots of information is useful to a group of people within an organization, or a group working on a project, but it is not always wise to distribute this information to the world at large.
The HTTP/1.0 protocol adds the capability to have authentication based on usernames and passwords. If an incorrect username/password pair is sent to the server, an error code of 401 (Unauthorized) is returned by the server.
The browser has a very extensible interface to its authentication handling.
When a 401 error code is received, the WWW-Authenticate header is checked; this header field should be a space-separated list of suitable authorization schemes for the requested URL. The value of this header is read into a lisp symbol by way of Emacs’s read-string function. This lisp symbol looks like url-authtype-auth, where authtype is replaced by the correct authorization type. The authorization types defined in the latest HTTP/1.0 specification include user, basic, public key, and kerberos (versions 4 and 5). If a function of this name is currently defined, then the function is called via funcall with several parameters:
This interface was chosen for its flexibility and extensibility. The main routine that does the MIME parsing and the building of the Authorization header does not need to know how to handle each type of authentication, and the addition of a new method for authentication is simply a matter of defining one function that conforms to a simple interface.
The Chargeto: header will be used to pay for services offered over the World Wide Web. Such things as electronic magazines and commercial databases all need a way to restrict access to only authorized subscribers. The format of the header has yet to be specified, but the interface and storage techniques will be similar to the Authorization section.
MIME is an emerging standard for multimedia mail. It offers a very flexible typing mechanism. The type of a file/message is specified in two parts, separated by a ’/’. The first part is a general category of data (text, application, image, etc.). The second part is the specific type of data (postscript, gif, jpeg, etc.). So ‘text/html’ specifies an HTML document, whereas ‘image/x-xwindowdump’ specifies an image of an Xwindow taken with the xwd program.
This typing allows much more flexibility in naming files. HTTP/1.0 servers can now send back content-type headers in response to a request, and not have the client second-guess it based on file extensions. HTML files can now be named ‘something.gif’ (not a great idea, but doable).
For some protocols, however, it is still necessary to guess the content of a file based on the file extension. This type of guess-work should only be needed when accessing files via FTP, local file access, or old HTTP/0.9 servers.
Instead of specifying how to view things twice, once based on content-type and once based on the file extension, it is easier to map file extensions to MIME content-types. The variable that controls this is mm-mime-extensions.
This variable is an assoc list of file extensions and the corresponding MIME content-types. A sample entry looks like: ‘(".movie" . "video/x-sgi-movie")’. This makes all files that end in ‘.movie’ (‘foo.movie’ and ‘bar.movie’) be interpreted as SGI animation files. If the server supplies a content-type for the document, that content-type takes precedence over this mapping. Regular expressions can NOT be used.
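As a rough illustration of the exact-match lookup described above, the following sketch maps a file name to a content-type through an alist shaped like mm-mime-extensions. The names my-mime-extensions and my-extension-to-mime are hypothetical stand-ins, not part of Emacs-w3, and the entries other than ‘.movie’ are assumed examples.

```elisp
;; Hypothetical alist with the same shape as mm-mime-extensions.
(defvar my-mime-extensions
  '((".movie" . "video/x-sgi-movie")
    (".html"  . "text/html")
    (".gif"   . "image/gif"))
  "Illustrative mapping of file extensions to MIME content-types.")

(defun my-extension-to-mime (filename)
  "Return the MIME type for the extension of FILENAME, or nil if unknown.
The extension is compared literally; no regular expressions are involved."
  (let ((ext (and (string-match "\\.[^./]+\\'" filename)
                  (match-string 0 filename))))
    (cdr (assoc ext my-mime-extensions))))
```

With these definitions, (my-extension-to-mime "foo.movie") yields "video/x-sgi-movie", and a file name with no known extension yields nil.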
Both Mosaic and the NCSA HTTP daemon rely on a separate file for mapping file extensions to MIME types. Instead of having the users of Emacs-w3 duplicate this in lisp, this file can be parsed using the url-parse-mimetypes function. This function is called each time w3 is loaded. It tries to locate mimetype files in several places. If the environment variable MIMETYPES is nonempty, then this is assumed to specify a UNIX-like path of mimetype files (this is a colon separated string of pathnames). If the MIMETYPES environment variable is empty, then Emacs-w3 looks for these files:
Each line contains information for one HTTP type. These types resemble MIME types. To add new ones, use subtypes beginning with x-, such as application/x-myprogram. Lines beginning with # are comment lines and are ignored. Each line consists of:
type/subtype ext1 ext2 ... extn
type/subtype is the MIME-like type of the document. ext* is any number of space-separated filename extensions which correspond to the MIME type.
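The line format above can be read with a few lines of Emacs Lisp. This is only a sketch of the idea, not the code used by url-parse-mimetypes; the function name my-parse-mimetypes-line is hypothetical, and it assumes a modern Emacs where split-string accepts an omit-nulls argument.

```elisp
(defun my-parse-mimetypes-line (line)
  "Parse one LINE of a mimetypes file.
Return a list of (\".ext\" . \"type/subtype\") pairs, or nil for
comment lines and blank lines."
  (if (string-match "\\`[ \t]*\\(#\\|\\'\\)" line)
      nil
    (let* ((fields (split-string line "[ \t]+" t))
           (type (car fields)))
      ;; Every remaining field is an extension for TYPE.
      (mapcar (lambda (ext) (cons (concat "." ext) type))
              (cdr fields)))))
```

The pairs it returns have the same shape as the entries of mm-mime-extensions, so a parsed file could be appended to such a list.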
In order to avoid having to specify viewers for gopher in a different way, Emacs-w3 converts gopher types to MIME media types and uses the standard mailcap viewers. The variable url-gopher-to-mime determines how this mapping of gopher types to MIME is done. This is an assoc list. The car of each element should be a character (not a string) specifying the gopher type. The cdr of each element should be a string, specifying what MIME media type the gopher object should be treated as.
The default value for this should be sufficient for most uses, but if any gopher types have been left out, or mislabeled, please let wmperry@spry.com know.
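To make the shape of this variable concrete, here is a small sketch with a hypothetical alist my-gopher-to-mime. The three entries are illustrative assumptions and are not claimed to be the default value of url-gopher-to-mime.

```elisp
;; Hypothetical alist with the same shape as url-gopher-to-mime:
;; the car is a character, the cdr a MIME media type string.
(defvar my-gopher-to-mime
  '((?0 . "text/plain")
    (?g . "image/gif")
    (?h . "text/html"))
  "Illustrative mapping of gopher type characters to MIME media types.")

(defun my-gopher-type-to-mime (type)
  "Return the MIME media type for gopher TYPE (a character), or nil."
  ;; Characters are integers in Emacs Lisp, so assq is sufficient.
  (cdr (assq type my-gopher-to-mime)))
```

Because the keys are characters, an entry written as ("g" . "image/gif") would never match.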
Not all files look as they should when parsed as an HTML document (whitespace is stripped, paragraphs are reformatted, and many other small changes can make the document unrecognizable). Such files may be passed to external programs or Emacs Lisp functions to be viewed.
Not all files can be viewed accurately from within an Emacs session (GIF files for example, or audio files). For this reason, the user can specify file "viewers" based on MIME content-types. This is done with the standard mailcap file. See section Mailcap File.
NCSA Mosaic and almost all other WWW browsers rely on a separate file for mapping MIME types to external viewing programs. This takes some of the burden off of browser developers, so each browser does not have to support all image formats, or postscript, etc. Instead of having the users of Emacs-w3 duplicate this in lisp, this file can be parsed using the mm-parse-mailcaps function. This function is called each time w3 is loaded. It tries to locate mailcap files in several places. If the environment variable MAILCAPS is nonempty, then this is assumed to specify a UNIX-like path of mailcap files (this is a colon separated string of pathnames). If the MAILCAPS environment variable is empty, then Emacs-w3 looks for these files:
The format of this file is specified in RFC 1343, but a brief synopsis follows (this is taken verbatim from sections of RFC 1343).
Each mailcap file consists of a set of entries that describe the proper handling of one media type at the local site. For example, one line might tell how to display a message in Group III fax format. A mailcap file consists of a sequence of such individual entries, separated by newlines (according to the operating system’s newline conventions). Blank lines and lines that start with the "#" character (ASCII 35) are considered comments, and are ignored. Long entries may be continued on multiple lines if each non-terminal line ends with a backslash character (’\’, ASCII 92), in which case the multiple lines are to be treated as a single mailcap entry. Note that for such "continued" lines, the backslash must be the last character on the line to be continued.
Each mailcap entry consists of a number of fields, separated by semi-colons. The first two fields are required, and must occur in the specified order. The remaining fields are optional, and may appear in any order.
The first field is the content-type, which indicates the type of data this mailcap entry describes how to handle. It is to be matched against the type/subtype specification in the "Content-Type" header field of an Internet mail message. If the subtype is specified as "*", it is intended to match all subtypes of the named content-type.
The second field, view-command, is a specification of how the message or body part can be viewed at the local site. Although the syntax of this field is fully specified, the semantics of program execution are necessarily somewhat operating system dependent.
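The field layout just described can be sketched as follows. This is a simplified illustration, not the parser used by mm-parse-mailcaps: the function name my-parse-mailcap-entry is hypothetical, the sample entry in the usage note is an assumed example, and the sketch ignores backslash-escaped semicolons and continuation lines. It relies on the four-argument form of split-string found in recent versions of Emacs.

```elisp
(defun my-parse-mailcap-entry (entry)
  "Split mailcap ENTRY into a list of its fields.
The first element is the content-type, the second the view-command,
and any remaining elements are optional fields.  Return nil for
blank lines and comment lines."
  (unless (string-match "\\`[ \t]*\\(#\\|\\'\\)" entry)
    ;; Split on semicolons and trim whitespace around each field.
    (split-string entry ";" nil "[ \t]+")))
```

For example, the entry "image/gif; xv %s; description=GIF image" splits into the content-type "image/gif", the view-command "xv %s", and one optional field.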
The optional fields, which may be given in any order, are as follows:
There are an increasing number of ways to authenticate yourself to a web service. Emacs-w3 tries to support as many as possible.
This is the weakest authentication available, and is not recommended if you are at all serious about security on your web site. The credential is simply a string of the form ‘user:password’ that has been Base64 encoded, as defined in RFC 1421. It is given as an example of how to write an authorization module. Storing, retrieving, and overwriting the cached authorization information should all be handled by one function (although it would be perfectly acceptable to have a stub function that passed off to three larger functions based on its parameters). The most efficient way to store the cached information is as an assoc-list of assoc-lists. The top-level assoc-list is keyed on the name of the server. The secondary assoc-list is keyed on the full path of the file that is protected. Thus, a sample authorization cache would look like this:
(("info.cern.ch" .
  (("/foo" . "d21wZXJyeTp0ZXN0aW5n")
   ("/bar" . "amtvbnJhdGg6ZGlzbWVtYmVy")
   ("/foo/x.html" . "dmlvbGV0dDpvcGVuZ2w=")))
 ("cs.indiana.edu" .
  (("/elisp/w3/" . "dGxvb3M6Y29ucXVlcg==")
   ("/" . "bXZhbmhleW46a2lsbGh1bGljaw=="))))
The structure consists of two assoc-lists for the sake of speed. The list of cached information could conceivably hold several thousand links (if the user does not exit Emacs for long periods of time). If the list were keyed on a full URL, the assoc function would have to search through every link before failing to find a new URL. With the current scheme, assoc only has to search through a few items (at most the number of HTTP servers, which should always be much, much smaller than the number of distinct URLs). Even with a 3:1 ratio of URLs to servers, this is a big win.
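The encoding and the two-level lookup can be sketched in a few lines. The names my-basic-auth-token, my-auth-cache, and my-auth-lookup are hypothetical and are not the functions Emacs-w3 uses. The sketch also assumes an Emacs that provides base64-encode-string, which is built into modern versions.

```elisp
(defun my-basic-auth-token (user password)
  "Return the Base64 encoding of \"USER:PASSWORD\"."
  ;; The second argument suppresses line breaks in the output.
  (base64-encode-string (concat user ":" password) t))

;; Hypothetical cache: server name -> (path -> encoded credential).
(defvar my-auth-cache
  '(("info.cern.ch" . (("/foo" . "d21wZXJyeTp0ZXN0aW5n"))))
  "Illustrative authorization cache, an assoc-list of assoc-lists.")

(defun my-auth-lookup (server path)
  "Return the cached credential for PATH on SERVER, or nil."
  ;; The outer assoc scans servers; the inner one scans only
  ;; the paths known for that one server.
  (cdr (assoc path (cdr (assoc server my-auth-cache)))))
```

A lookup for an unknown server fails after scanning only the server names, which is the speed argument made above.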
Jeffery L. Hostetler, John Franks, Philip Hallam-Baker, Ari Luotonen, Eric W. Sink, and Lawrence C. Stewart have an internet draft for a new authentication mechanism. For the complete specification, please see draft-ietf-http-digest-aa-01.txt in your nearest internet drafts archive(6). What follows is mainly taken from the March 24, 1995 version of the internet draft.
The protocol referred to as "HTTP/1.0" includes specification for a Basic Access Authentication scheme. This scheme is not considered to be a secure method of user authentication, as the user name and password are passed over the network in an unencrypted form. A specification for a new authentication scheme is needed for future versions of the HTTP protocol. This document provides specification for such a scheme, referred to as "Digest Access Authentication".
The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the World Wide Web. This scheme provides no encryption of object content. The intent is simply to facilitate secure access authentication.
Like Basic Access Authentication, the Digest scheme is based on a simple challenge-response paradigm. The Digest scheme challenges using a nonce value. A valid response contains the MD5 checksum of the password and the given nonce value. In this way, the password is never sent in the clear. Just as with the Basic scheme, the username and password must be prearranged in some fashion.
If a server receives a request for an access-protected object, and an acceptable Authorization header is not sent, the server responds with:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest realm="<realm>",
                         domain="<domain>",
                         nonce="<nonce>",
                         opaque="<opaque>",
                         stale="<TRUE | FALSE>"
The meanings of the identifiers used above are as follows:
<realm>
A name given to users so they know which username and password to send.
<domain> OPTIONAL
A comma separated list of URIs, as specified for HTTP/1.0. The intent is that the client could use this information to know the set of URIs for which the same authentication information should be sent. The URIs in this list may exist on different servers. If this keyword is omitted or empty, the client should assume that the domain consists of all URIs on the responding server.
<nonce>
A server-specified integer value which may be uniquely generated each time a 401 response is made. Servers may defend themselves against replay attacks by refusing to reuse nonce values. The nonce should be considered opaque by the client.
<opaque> OPTIONAL
A string of data, specified by the server, which should be returned by the client unchanged. It is recommended that this string be base64 or hexadecimal data. Specifically, since the string is passed in the header lines as a quoted string, the double-quote character is not allowed.
<stale> OPTIONAL
A flag, indicating that the previous request from the client was rejected because the nonce value was stale. If stale is TRUE, the client may wish to simply retry the request with a new encrypted response, without reprompting the user for a new username and password.
The client is expected to retry the request, passing an Authorization header line as follows:
Authorization: Digest username="<username>",        -- required
                      realm="<realm>",              -- required
                      nonce="<nonce>",              -- required
                      uri="<requested-uri>",        -- required
                      response="<digest>",          -- required
                      message="<message-digest>",   -- OPTIONAL
                      opaque="<opaque>"             -- required if provided by server

where <digest> := H( H(A1) + ":" + N + ":" + H(A2) )
and <message-digest> := H( H(A1) + ":" + N + ":" + H(<message-body>) )

where:
  A1 := U + ':' + R + ':' + P
  A2 := <Method> + ':' + <requested-uri>

with:
  N -- nonce value
  U -- username
  R -- realm
  P -- password
  <Method> -- from header line 0
  <requested-uri> -- uri sans proxy/routing
Where H() is the RSA Data Security, Inc. MD5 Message-Digest Algorithm (7).
Upon receiving the Authorization information, the server may check its validity by looking up its known password which corresponds to the submitted <username>. Then, the server must perform the same MD5 operation performed by the client, and compare the result to the given <response>.
Note that the HTTP server does not actually need to know the user’s clear text password. As long as H(A1) is available to the server, the validity of an Authorization header may be verified.
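The <digest> formula quoted above translates almost directly into Emacs Lisp. The sketch below follows the formula as given in this draft, not any later revision of the scheme. The function names are hypothetical, and it assumes an Emacs with the built-in md5 function, which returns a hexadecimal string.

```elisp
(defun my-digest-from-ha1 (ha1 nonce method uri)
  "Compute <digest> from HA1, the precomputed H(A1), plus NONCE, METHOD and URI.
This is all a server needs, so it never has to store the clear text password."
  (md5 (concat ha1 ":" nonce ":" (md5 (concat method ":" uri)))))

(defun my-digest-response (user realm password nonce method uri)
  "Compute <digest> on the client side from the user's credentials."
  (my-digest-from-ha1 (md5 (concat user ":" realm ":" password))
                      nonce method uri))
```

A server holding only H(A1) for a user can call my-digest-from-ha1 and compare the result with the response value sent by the client.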
All keyword-value pairs must be expressed in characters from the US-ASCII character set, excluding control characters.
SSL is the Secure Sockets Layer interface developed by Netscape Communications (8).
In order to use SSL in Emacs-w3, you will need one of the reference implementations of SSL that are publicly available. These are the implementations that I am aware of:
SSLRef 2.0
Available from Netscape Communications at http://www.netscape.com/newsref/std/sslref.html. This requires the RSARef library, which is not exportable. The RSARef library is available from ftp://ftp.rsa.com/rsaref/
SSLeay 0.4
An implementation by Eric Young (eay@mincom.oz.au) that is free for commercial or noncommercial use, and was developed completely outside the US by a non-US citizen. More information can be found at ftp://ftp.psy.uq.oz.au/pub/Crypto/SSL/
Whichever reference implementation you choose to download (I recommend the SSLeay distribution, just to thumb a nose at the NSA :), you must have a program you can run in a subprocess that takes a hostname and port number on the command line, and reads/writes to standard input/output (the Netscape implementation comes with one of these by default). Once you have this program, set the variable ssl-program-name to point to the executable.
This should be all you need to do. In the future, I will be distributing a set of patches to Emacs 19.xx and XEmacs 19.xx to SSL-enable them, for the sake of speed.
NOTE: This implementation does not support the use of client certificates, but then nobody else supports that area of the protocol either, so I’m not too worried about it.
Most of this section was taken from the documentation written by Rob McCool (robm@ncsa.uiuc.edu), gratefully reproduced here with his permission (9).
RIPEM is ’Riordan’s Internet Privacy Enhanced Mail’, and is currently on version 1.2b3. US citizens can ftp it from ripem.msu.edu:/pub/crypt/ripem.
PGP is ’Pretty Good Privacy’, and is currently on version 2.6. The legal controversies that plagued earlier versions have been resolved, so this is a completely legal program now. There is also a legal version for European users, called 2.6ui (the Unofficial International version).
PGP and PEM are programs that allow you and a second party to exchange messages in a way that prevents third parties from reading them, and that certifies that the person who sent a message is really who they claim to be.
PGP and PEM both use RSA encryption. The U.S. government has strict export controls over foreign use of this technology, so people outside the U.S. may have a difficult time finding programs which perform the encryption.
You will need a working copy of either Pretty Good Privacy or RIPEM to begin with. You should be familiar with the program and have generated your own public/private key pair. You should be able to use the TIS/PEM program with the PEM authorization type. I haven’t tried it. This tutorial is written assuming that you are using RIPEM.
Currently, the protocol has been implemented with PEM and PGP using local key files on the server side, and on the client side with PEM using finger to retrieve the server’s public key.
As you can tell, parties who wish to use Emacs-w3 and httpd with PEM or PGP encryption will need to communicate beforehand and find a tamper-proof way to exchange their public keys.
Pioneers get shot full of arrows. This work is currently in the experimental stages and thus may have some problems that I have overlooked. The only problem that I know about is that the messages are currently not timestamped. This means that a malicious user could record your encrypted message with a packet sniffer and repeat it back to the server ad nauseam. Although they would not be able to read the reply, if the request was something you were being charged for, you may have a large bill to pay by the time they’re through.
This protocol is almost word-for-word a copy of Tony Sanders’ RIPEM based scheme, generalized a little. Below, wherever you see PEM you can replace it with PGP and get the same thing.
*Client:*

GET /docs/protected.html HTTP/1.0
UserAgent: Emacs-W3/2.1.x

*Server:*

HTTP/1.0 401 Unauthorized
WWW-Authenticate: PEM entity="webmaster@hoohoo.ncsa.uiuc.edu"
Server: NCSA/1.1

*Client:*

GET / HTTP/1.0
Authorization: PEM entity="robm@ncsa.uiuc.edu"
Content-type: application/x-www-pem-request

--- BEGIN PRIVACY-ENHANCED MESSAGE ---
this is the real request, encrypted
--- END PRIVACY-ENHANCED MESSAGE ---

*Server:*

HTTP/1.0 200 OK
Content-type: application/x-www-pem-reply

--- BEGIN PRIVACY-ENHANCED MESSAGE ---
this is the real reply, encrypted
--- END PRIVACY-ENHANCED MESSAGE ---

That's it.
Emacs-w3 uses the excellent mailcrypt package written by Jin S Choi (jsc@mit.edu) (10). This package takes care of calling ripem and/or pgp with the correct arguments. Please see the documentation at the top of mailcrypt.el for instructions on using mailcrypt. All bug reports about mailcrypt should go to Jin S Choi, but bugs about how I use it in Emacs-w3 should of course be directed to me.
Emacs-w3 currently supports the experimental style sheet mechanism proposed by Håkon W. Lie of the W3 Consortium. This allows for the author to specify what a document should look like, and yet allow the end user to override any of the stylistic changes. This allows for people with special needs (most notably the visually impaired) to override style bindings that could make a document totally unreadable.
A stylesheet consists of comments and directives. A comment is any line starting with a #, and is terminated by the end of the line. A directive includes the tag name, an attribute name, and a value. A sample stylesheet is:
<style notation="experimental">
# This line is a comment
# These will be ignored, up to the terminating end-of-line
#
h1: align=center
h1: color.text=yellow
h1: color.background=red
h1: font.size *= 2
</style>
Below is a comprehensive list of the attribute names.
color.text
Specifies the foreground color of the text for this item.
color.background
Specifies the background color of the text for this item.
background.bitmap
Specifies a bitmap to be used as the background for this item.
font.size
Specifies the font size. This can be specified with the +=, -=, /=, or *= operator, signifying a change from the default font size. For example, font.size *= 2 would mean a font twice as large as the default font.
font.style
Specifies the font style. This controls whether a font is bold, italic, underlined, or any combination of these. The value can be a comma or ampersand (&) separated list of values.
font.family
Specifies the font family - this is the basic type of font. Note that not all font families will be available on all platforms, or even the same platform in a slightly different configuration. If the specified font family cannot be found on the machine, the default font is used instead.
align
Specifies how the text contained within the item is to be aligned. Possible values are left, right, justify, center, or indent.
width
Specifies how wide the item should be. This is currently used only for horizontal rule (<HR>) tags.
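To show how a directive breaks down into a tag name, an attribute name, an operator, and a value, here is a small sketch. It is not the parser used by Emacs-w3. The function name my-parse-style-directive is hypothetical, and the regular expression is an assumption based only on the sample stylesheet and the operators listed under font.size.

```elisp
(defun my-parse-style-directive (line)
  "Parse stylesheet LINE into a list (TAG ATTRIBUTE OPERATOR VALUE).
Return nil for comment lines and anything that is not a directive."
  (when (string-match
         (concat "\\`[ \t]*\\([^#: \t]+\\)[ \t]*:"   ; tag name, then a colon
                 "[ \t]*\\([a-z.]+\\)"               ; attribute name
                 "[ \t]*\\([-+*/]?=\\)"              ; =, +=, -=, *= or /=
                 "[ \t]*\\(.*?\\)[ \t]*\\'")         ; value, trimmed
         line)
    (list (match-string 1 line) (match-string 2 line)
          (match-string 3 line) (match-string 4 line))))
```

For the line "h1: font.size *= 2" this returns ("h1" "font.size" "*=" "2"), and for a comment line it returns nil.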
To include a stylesheet into your document, simply use the <style> tag. You can use the notation attribute to specify what language the stylesheet is specified in. The default is experimental. The data between the <style> and </style> tags is the stylesheet proper. No HTML parsing is done to this data; it is treated similarly to an <XMP> section of text. To reference an external stylesheet, you should use the <link> tag.
<link rel="stylesheet" href="/bill.style">
If these two mechanisms are mixed, then the URL is resolved first, and the contents of the <style> tag take precedence if there are any conflicting directives.
In the future, DSSSL and DSSSL-lite will be supported as valid stylesheet languages, but not in this release.
A cache stores the information on a page on your local machine. When requesting a page that is in the cache, Emacs-w3 can retrieve the page from the cache more quickly than retrieving the page again from its location out on the network. With a well-populated cache, the speed of browsing the web is dramatically increased.
The first time a page is requested, Emacs-w3 retrieves the page from the network. When requesting a page that is in the cache, Emacs-w3 checks to see if the page has changed since it was last retrieved from the remote machine. If it has not changed, the local copy is used, saving the transmission of the file over the network.
To turn on disk caching, set the variable url-automatic-caching to non-nil, or choose the ‘Caching’ menu item (under ‘Options’).
That is all there is to it. It is recommended that you use the clean-cache shell script first, to allow for future cleaning of the cache. This shell script will remove all files that have not been accessed since it was last run. To keep the cache pared down, it is recommended that this script be run from at or cron (see the manual pages for crontab(5) or at(1) for more information).
A large cache of documents on the local disk can be very handy when traveling, or any other time the network connection is not active (a laptop with a dial-on-demand PPP connection, etc). Emacs-w3 can rely solely on its cache, and avoid checking to see if the page has changed on the remote server. In the case of a dial-on-demand PPP connection, this will keep the phone line free as long as possible, only bringing up the PPP connection when asking for a page that is not located in the cache. This is very useful for demonstrations as well. To turn this feature on, set the variable url-standalone-mode to non-nil, or choose the ‘Use Cache Only’ menu item (under ‘Options’).
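In a ‘.emacs’ file, the two settings described above might be sketched as follows. The command my-toggle-standalone-mode is a hypothetical convenience and is not part of Emacs-w3; only the two variable names come from this manual.

```elisp
;; Cache every retrieved document on disk.
(setq url-automatic-caching t)

(defun my-toggle-standalone-mode ()
  "Toggle url-standalone-mode, for example before going offline."
  (interactive)
  ;; bound-and-true-p treats an unset variable as nil.
  (setq url-standalone-mode (not (bound-and-true-p url-standalone-mode)))
  (message "Use cache only: %s" (if url-standalone-mode "on" "off")))
```

Calling the command once turns cache-only operation on; calling it again turns it off.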
Emacs-w3 caches files under the temporary directory specified by url-temporary-directory, in a user-specific subdirectory (determined by the user-real-login-name function). The cache files are stored under their original names, so a URL like http://www.spry.com/foo/bar/baz.html would be stored in a cache file named /tmp/wmperry/com/spry/www/foo/bar/baz.html. Sometimes, especially with gopher links, there will be name conflicts, and an error will be signalled. This cannot be avoided while still having reasonable performance at startup time (reading in an index file of all the cached pages can take a long time on slow machines, or even fast machines with large caches). If you are running XEmacs 19.12 or later, you can use an alternate naming scheme that avoids name conflicts, but loses the human readability of the cache file names. The cache files will look like /tmp/wmperry/acbd18db4cc2f85cedef654fccc4a4d8, which is certainly unique, but not very user-friendly. To turn this on, add this to your ‘.emacs’ file:
(add-hook 'w3-load-hooks
          '(lambda ()
             (fset 'url-create-cached-filename
                   'url-create-cached-filename-using-md5)))
If you will not be using other Emacs variants, I highly recommend this method of creating the cache filename.
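The two naming schemes can be illustrated with a short sketch. These are not the actual url-create-cached-filename functions. The names my-cache-filename and my-cache-filename-md5 are hypothetical, the first handles only plain http URLs, and what exactly the real MD5 variant hashes is an assumption (here, the whole URL string).

```elisp
(defun my-cache-filename (url cache-dir)
  "Map an http URL to a human-readable path under CACHE-DIR.
The host name components are reversed, as in the example above."
  (when (string-match "\\`http://\\([^/:]+\\)\\(/.*\\)?\\'" url)
    (let ((host (match-string 1 url))
          (path (or (match-string 2 url) "/")))
      (concat (file-name-as-directory cache-dir)
              (mapconcat #'identity (reverse (split-string host "\\.")) "/")
              path))))

(defun my-cache-filename-md5 (url cache-dir)
  "Map URL to a flat file name under CACHE-DIR, named by an MD5 hash."
  (concat (file-name-as-directory cache-dir) (md5 url)))
```

With a cache directory of /tmp/wmperry, the first function maps http://www.spry.com/foo/bar/baz.html to /tmp/wmperry/com/spry/www/foo/bar/baz.html, matching the example in the text.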
In the file ‘w3-search.el’ is a function that some may find handy. It is not 100% completed yet, so if you run into any problems with it, please try to fix it, not just say it’s broken.
The function is w3-do-search. It must be called with at least one argument. All others are optional. Arguments are TERM, BASE, HOPS-LIMIT, and RESTRICTION. This recursively descends all the child links of the current document for TERM. TERM may be a string, in which case it is treated as a regular expression, and re-search-forward is used, or a symbol, in which case it is funcalled with 1 argument, the current URL being searched.
BASE is the URL to start searching from.
HOPS-LIMIT is the maximum number of nodes to descend before the search dies out.
RESTRICTION is a regular expression or function to call with one argument, a URL that could be searched. If RESTRICTION returns non-nil, then the URL is added to the queue, otherwise it is discarded. This is useful for restricting searching to either certain types of URLs (only search ftp links), or restricting searching to one domain (only search stuff in the indiana.edu domain).
You may check several variables from the main w3-do-search routine in any functions passed to it (as RESTRICTION or TERM). QUEUE is the queue of links to be searched, HOPS is the current number of hops from the root document, RESULTS is an assoc list of (URL . RETVAL), where RETVAL is the value returned from previous calls to the TERM function (or point if searching for a regular expression).
The function returns a list of the form: ((URL . RETVAL)...)
Please note that there is no interactive use for this function yet—it was designed for non-interactive, batch-mode processing. However, if anyone wants to write a wrapper function for it, please feel free.
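As an illustration of a RESTRICTION function, the sketch below accepts only URLs whose host lies in the indiana.edu domain. The function name my-indiana-only-p and the search term in the commented call are hypothetical. The call itself is shown only as a comment because it needs ‘w3-search.el’ to be loaded.

```elisp
(defun my-indiana-only-p (url)
  "Return non-nil if URL points at a host in the indiana.edu domain."
  (string-match
   "\\`[a-z]+://\\([^/:]*\\.\\)?indiana\\.edu\\([:/]\\|\\'\\)"
   url))

;; With w3-search.el loaded, a batch search might then look like this,
;; using the argument order TERM, BASE, HOPS-LIMIT, RESTRICTION:
;;
;; (w3-do-search "emacs" "http://cs.indiana.edu/elisp/w3/" 2 'my-indiana-only-p)
```

Anchoring the pattern to the host part keeps a URL such as http://www.spry.com/indiana.edu/ from slipping through.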
More and more people are including URLs in their signatures, and within the body of mail messages. It can get quite tedious to type these into the minibuffer to follow one.
To access URLs with VM, the following in your ‘~/.emacs’ or ‘~/.vm’ files should do the trick. It adds two keybindings to the main VM message window. The middle mouse button now tries to follow a hypertext link.
(add-hook 'vm-mode-hook
          (function
           (lambda ()
             (define-key vm-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
             (define-key vm-mode-map "\r" 'w3-maybe-follow-link))))
To access URLs with RMAIL, the following in your ‘~/.emacs’ file should do the trick.
(add-hook 'rmail-mode-hook
          (function
           (lambda ()
             (define-key rmail-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
             (define-key rmail-mode-map "\r" 'w3-maybe-follow-link))))
To access URLs with GNUS, the following in your ‘~/.emacs’ file should do the trick.
(add-hook 'gnus-article-mode-hook
          (function
           (lambda ()
             (define-key gnus-article-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
             (define-key gnus-article-mode-map "\r" 'w3-maybe-follow-link))))
NOTE: XEmacs 19.12 has a special version of VM and GNUS that does the highlighting of URLs automatically. All that is required to follow one of these links is clicking the middle mouse button on the highlighted text.
If you are feeling adventurous, or are just as anal as I am about people writing valid HTML, you can set the variable w3-debug-html to t and see what happens.
If Emacs-w3 thinks it has encountered invalid HTML, then a debugging message is logged to the buffer specified by w3-debug-buffer. This can be a buffer object, or the name of a buffer.
NOTE: This has not yet been reintegrated into the new display engine and parser.
This version of Emacs-W3 supports native WAIS querying (earlier versions required the use of a gateway program). In order to use the native WAIS support, a working waisq binary is required. I recommend the distribution from think.com - ftp://think.com/wais/wais-8-b6.1.tar.Z is a good place to start.
The variable url-waisq-prog must point to this executable, and one of url-wais-gateway-server or url-wais-gateway-port should be nil.
When a WAIS URL is encountered, a form will be automatically generated and displayed. After typing in your search term, the query will be sent to the server by running the url-waisq-prog in a subprocess. The results will be converted into HTML and displayed.
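A ‘.emacs’ fragment for native WAIS support might look like the following sketch. The path to waisq is a hypothetical example and depends on where the binary was installed.

```elisp
;; Hypothetical install location of the waisq binary.
(setq url-waisq-prog "/usr/local/bin/waisq")
;; Leave the gateway unset so that the native support is used.
(setq url-wais-gateway-server nil)
```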
The w3-link-delimiter-info variable can be used to ’rate’ a URL when it shows up in an HTML page. If non-nil, then this should be a function, given either as a list or as a symbol naming the function. This function should expect one argument, a fully specified URL, and should return a string. This string is inserted after the link text.
If a user has decided that all links served from blort.com are too laden with images, and wants to be warned that a link points at this host, they could do something like this:
(defun check-url (url)
  ;; Match blort.com itself and any host name that ends in blort.com.
  (if (string-match "://[^/]*blort\\.com" url)
      "[SLOW!]"
    ""))
(setq w3-link-delimiter-info 'check-url)
With this in place, every link pointing to any site at blort.com shows up as "Some link[SLOW!]" instead of just "Some link".
The gopher+ support in Emacs-w3 is limited to the conversion of ASK blocks into HTML 3.0 forms, and the usage of the content-length given by the gopher+ server to give a nice status bar on the bottom of the screen.
This will hopefully be extended to include the Gopher+ method of content-type negotiation, but this may be a while.
These are the various hooks that can be used to customize some of Emacs-w3’s behavior. They are arranged in the order in which they would happen when retrieving a document. All of these are functions (or lists of functions) that are called consecutively.
w3-load-hooks
These hooks are run by w3-do-setup the first time a URL is fetched. All the w3 variables are initialized before this hook is run.
w3-file-done-hooks
These hooks are run by w3-prepare-buffer after all parsing on a document has been done. All url-current-* and w3-current-* variables are initialized when this hook is run. This is run before the buffer is shown, and before any inlined images are downloaded and converted.
w3-file-prepare-hooks
These hooks are run by w3-prepare-buffer before any parsing is done on the HTML file. The HTTP/1.0 headers specified by w3-show-headers have been inserted, the syntax table has been set to w3-parse-args-syntax-table, and any personal annotations have been inserted by the time this hook is run.
w3-mode-hooks
These hooks are run after a buffer has been parsed and displayed, but before any inlined images are downloaded and converted.
w3-source-file-hooks
These hooks are run after displaying a document’s source.
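Functions are attached to any of these hooks with add-hook, in the same way as the mail reader examples earlier in this manual. The sketch below uses a hypothetical function my-w3-mode-setup; what it does is only an example.

```elisp
(defun my-w3-mode-setup ()
  "Hypothetical per-buffer setup, run after a document is displayed."
  ;; Truncate long lines instead of wrapping them in this buffer.
  (setq truncate-lines t))

(add-hook 'w3-mode-hooks 'my-w3-mode-setup)
```

Using a named function rather than a lambda makes it easy to remove the entry later with remove-hook.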
There are lots of variables that control the real nitty-gritty of Emacs-w3 that the beginning user probably shouldn’t mess with. Here they are.
w3-icon-directory-list
A list of directories to look in for the w3 standard icons. Each entry must end in a /. If the directory data-directory/w3 exists, then this is automatically added to the default value of http://cs.indiana.edu/elisp/w3/icons/.
w3-keep-old-buffers
Whether to keep old buffers around when following links. If you do not like having lots of buffers in one Emacs session, you should set this to nil. I recommend setting it to t, so that backtracking from one link to another is faster.
url-passwd-entry-func
This is a symbol indicating which function to call to read in a password. It is set up depending on whether you are running EFS or ange-ftp at startup if it is nil. This function should accept the prompt string as its first argument, and the default value as its second argument.
w3-reuse-buffers
Determines what happens when w3-fetch is called on a document that has already been loaded into another buffer. Possible values are: nil, yes, and no. A value of nil asks the user whether Emacs-w3 should reuse the buffer (this is the default value). A value of yes means assume the user always wants to reuse the buffer. A value of no means assume the user always wants to re-fetch the document.
w3-show-headers
This is a list of HTTP/1.0 headers to show at the end of a buffer. All the headers should be in lowercase. They are inserted at the end of the buffer in a <UL> list. Alternatively, if this is simply t, then all the HTTP/1.0 headers are shown. The default value is nil.
w3-show-status, url-show-status
Whether to show progress messages in the minibuffer. w3-show-status controls whether messages about the parsing are displayed, and url-show-status controls whether a running total of the number of bytes transferred is displayed. These can cause a large performance hit if using a remote X display over a slow link, or a terminal with a slow modem.
mm-content-transfer-encodings
An assoc list of Content-Transfer-Encodings or Content-Encodings and the appropriate decoding algorithms for each.
If the cdr of a node is a list, then this specifies the decoder is an external program, with the program as the first item in the list, and the rest of the list specifying arguments to be passed on the command line. If using an external decoder, it must accept its input from stdin and send its output to stdout.
If the cdr of a node is a symbol whose function definition is non-nil, then that encoding can be handled internally. The function is called with 2 arguments, buffer positions bounding the region to be decoded. The function should completely replace that region with the unencoded information.
Currently supported transfer encodings are: base64, x-gzip, 7bit, 8bit, binary, x-compress, x-hqx, and quoted-printable.
url-uncompressor-alist
An assoc list of file extensions and the appropriate uncompression programs for each. This is used to build the Accept-encoding header for HTTP/1.0 requests.
url-waisq-prog
Name of the waisq executable on this system. This should be the ‘waisq’ program from think.com’s wais8-b5.1 distribution.
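The following sketch shows how a few of these variables might be set from an init file, together with an internal decoder that follows the two-argument contract described for mm-content-transfer-encodings. It rests on several assumptions that this manual does not confirm: that the yes and no values of w3-reuse-buffers are Lisp symbols, that the keys of mm-content-transfer-encodings are lowercase strings, and that rot13-region is available in your Emacs. The encoding name x-rot13 and the function my-w3-decode-rot13 are made up for the example.

```elisp
;; Keep visited buffers so that backtracking is fast.
(setq w3-keep-old-buffers t)

;; Assumption: the documented values yes and no are symbols.
(setq w3-reuse-buffers 'yes)

;; Header names must be lowercase.
(setq w3-show-headers '("content-type" "last-modified"))

;; Turn off progress messages, e.g. on a slow terminal link.
(setq w3-show-status nil)
(setq url-show-status nil)

;; Hypothetical internal decoder: called with two buffer positions,
;; it replaces the text between them with the decoded form.
(defun my-w3-decode-rot13 (start end)
  "Replace the region from START to END with its rot13 decoding."
  (rot13-region start end))

;; Register the decoder only if the variable is already defined.
;; Assumption: keys are lowercase strings naming the encoding.
(if (boundp 'mm-content-transfer-encodings)
    (setq mm-content-transfer-encodings
          (cons '("x-rot13" . my-w3-decode-rot13)
                mm-content-transfer-encodings)))
```

An external decoder would instead use a list as the cdr, with the program name first and its command-line arguments after it.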
If you need more help on Emacs-w3, please send me mail (wmperry@spry.com). Several discussion lists have also been created for Emacs-w3. To subscribe, send mail to majordomo@indiana.edu, with the body of the message ‘subscribe listname <your email address>’. All other mail should go to <listname>@indiana.edu.
If you need more help on the World Wide Web in general, please refer to the newsgroup comp.infosystems.www. There are also several discussion lists concerning the Web. Send mail to listserv@w3.org with a subject line of ‘subscribe <listname>’. All mail should go to <listname>@w3.org. Administrative mail should go to www-admin@w3.org. The lists are:
As a last resort, you may always mail me. I’ll try to answer as quickly as I can.
Changes are constantly being made to the Emacs browser (hopefully all for the better). This is a list of the things that are being worked on right now.
This chapter attempts to explain some of the internal workings of Emacs-w3 and various data structures that are used. It also details some functions that are useful for using some of the Emacs-w3 functionality from within your own programs, or extending the current capabilities of Emacs-w3.
Due to the many different flavors of Emacs in existence, the addition of data and font information to arbitrary regions of text has been generalized. The following functions are defined for using/manipulating these zones of data.
w3-add-zone (start end style data &optional highlight)
This function creates a zone between buffer positions start and end, with font information specified by style, and a data segment of data. If the optional argument highlight is non-nil, then the region highlights when the mouse moves over it.
w3-zone-at (point)
Returns the zone at point. Preference is given to hypertext links, then to form entry areas, then to inlined images. So if an inlined image was part of a hypertext link, this would always return the hypertext link.
w3-zone-data (zone)
Returns the zone’s data segment. The data structures used in Emacs-w3 are relatively simple. They are just list structures that follow a certain format. The main data types are form objects, link objects, and inlined images. All the information for these types of links is stored as lists.
w3-zone-hidden-p (zone)
w3-hide-zone (start end)
w3-unhide-zone (start end)
w3-zone-start (zone)
Returns an integer that is the start of zone, as a buffer position. In Emacs 18.xx, this returns a marker instead of an integer, but it can be used just like an integer.
w3-zone-end (zone)
Returns an integer that is the end of zone, as a buffer position. In Emacs 18.xx, this returns a marker instead of an integer, but it can be used just like an integer.
w3-zone-eq (zone1 zone2)
Returns t if and only if zone1 and zone2 represent the same region of text in the same buffer, with the same properties and data.
w3-delete-zone (zone)
Removes zone from its buffer (or current buffer). The return value is irrelevant, and varies for each version of Emacs.
w3-all-zones ()
Returns a list of all the zones contained in the current buffer. Useful for extracting information about hypertext links or form entry areas. Programs should not rely on this list being sorted, as the order varies with each version of Emacs.
w3-zone-at (pt)
This returns the zone at character position pt in the current buffer that is either a link or a forms entry area. Returns nil if there is no link at point.
These data structures are what is generally returned by w3-zone-data.
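As a rough sketch of how the zone functions fit together, the following collects the start, end, and data segment of every zone in the current buffer. It uses only the functions described in this section and assumes that Emacs-w3 is loaded and that the current buffer was produced by it. The name my-w3-describe-zones is hypothetical.

```elisp
;; Hypothetical helper built on the zone functions documented above.
(defun my-w3-describe-zones ()
  "Return a list of (START END DATA) entries, one per zone.
The order of the entries is whatever `w3-all-zones' returns, which
varies between versions of Emacs and should not be relied upon."
  (mapcar (lambda (zone)
            (list (w3-zone-start zone)
                  (w3-zone-end zone)
                  (w3-zone-data zone)))
          (w3-all-zones)))
```

Under Emacs 18.xx the START and END values would be markers rather than integers, but they can be used in the same way.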
There are also some variables that may be useful if you are writing a program or function that interacts with Emacs-w3. All of the w3-current-* variables are local to each buffer.
url-current-mime-headers
An assoc list of all the MIME headers for the current document. Keyed on the lowercase MIME header (e.g., ‘content-type’ or ‘content-encoding’).
url-current-server
url-current-file
url-current-type
A string representing what network protocol was used to retrieve the current buffer’s document. Can be one of http, gopher, file, ftp, news, or mailto.
url-current-port
w3-current-last-buffer
w3-running-FSF19
w3-running-epoch
w3-running-xemacs
This is t if and only if we are running in Lucid Emacs, WinEmacs, or XEmacs.
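For instance, a program could look up a single header in url-current-mime-headers with assoc. This is only a sketch: it assumes the keys are lowercase strings and that the header's value is stored in the cdr of each entry, neither of which is spelled out above. The function name my-w3-current-content-type is hypothetical.

```elisp
;; Hypothetical helper: look up one header of the current document.
(defun my-w3-current-content-type ()
  "Return the content-type header of the current document, or nil."
  ;; Assumption: entries look like ("content-type" . VALUE).
  (cdr (assoc "content-type" url-current-mime-headers)))
```

The function should be called from within the buffer holding the document of interest.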
Form objects are used to store information about a FORM data entry area.
'w3form
nil, specifying the ID attribute on this input tag.
A new development in the World Wide Web is the concept of collapsible areas of text. If a zone controls one of these regions, it is marked with the w3expandlist property. The format of this structure is:
'w3expandlist
A zone with the w3graphic property is a link to an inlined image’s source file.
'w3graphic
w3-follow-inlined-image is invoked.
A zone with the w3 property is a full-fledged hypertext link to another document.
'w3
I have done quite a bit of work trying to make a clean interface to the internals of Emacs-w3. Here is a list of functions that you can use to take advantage of the World Wide Web.
url-clear-tmp-buffer
Sets the current buffer to be url-working-buffer, creating it if necessary, and erases it. This should usually be called before retrieving URLs.
w3-convert-html-to-latex
Takes a buffer of HTML markup (which should be in w3-working-buffer), and converts it into LaTeX. This is an adaptation of the simple sed scripts from CERN. Does as good a job as the html2latex program, and I usually prefer its formatting over html2latex’s.
w3-fetch
This function takes a URL as its only argument. It then attempts to retrieve the URL. For example: ‘(w3-fetch "http://cs.indiana.edu/")’ would retrieve the Indiana University CS home page and parse it as HTML.
w3-fix-entities-in-string
This function takes a string, and removes all HTML[+] entity references from it, replacing them with the correct character(s). It consults the variable w3-html-entities for the entity names and translations. For example, ‘(w3-fix-entities-in-string "&gt;testing&lt;&amp;")’ would return ‘">testing<&"’.
url-generate-new-buffer-name
This function takes a string, and returns the first unique buffer name using that string as a base. For example, ‘(url-generate-new-buffer-name "new-buff")’ would return ‘"new-buff<1>"’ if buffer new-buff already existed.
url-generate-unique-filename
This function returns a string that represents a unique filename in the /tmp directory. For example, ‘(url-generate-unique-filename)’ would return ‘"/tmp/url-tmp129440"’. The filename is arrived at by using a unique prefix (url-tmp), the uid of the current user (12944 in my case), and a number that is incremented if a file already exists.
url-buffer-visiting (url)
Return the name of a buffer (if any) that is visiting URL.
url-create-mime-request (fname ref-url)
Create a MIME request for the file fname. The Referer: field of the HTTP/1.0 request is set to the value of ref-url if necessary. Returns a string that can be sent to an HTTP server. The request uses several variables that control how the request looks.
If the value of url-request-extra-headers is non-nil, then it is used as extra MIME headers when an HTTP/1.0 request is created.
url-get-url-at-point
This function returns the URL at a point specified by an optional argument. If no argument is given, (point) is used. It tries to find the URL closest to that point, but does not change the user’s position in the buffer. It has a preference for looking backward when not directly on a URL.
url-hexify-string
This function takes a string and replaces any characters that are not acceptable in a URL with the "escaped" encoding that is standard for URLs (replaces the character with a % followed by the hexadecimal representation of the ASCII value of the character). For example, ‘(url-hexify-string "this is a test")’ would return ‘"this%20is%20a%20test"’.
url-open-stream
This function takes the same parameters as open-network-stream, and functions similarly. It takes a process name, a buffer name, a host name, and a port number or server name. It attempts to open a network connection to the remote host on the specified port/service name, with output going to the buffer. It returns the process object that is the network connection.
url-retrieve
This function takes 3 arguments, a URL, a method type, and a data block. It then attempts to retrieve the URL using the specified method, using data (if any) as the body of the MIME request. For example: ‘(url-retrieve "http://cs.indiana.edu/")’ would retrieve the Computer Science home page from Indiana University. This function does no parsing of the retrieved page, and leaves you in the buffer containing the document you requested. Any HTTP/1.0 redirection/authorization is done before this function exits.
url-unhex-string
This is the opposite of url-hexify-string. It decodes any %XX encoded characters in a string. For example, ‘(url-unhex-string "this%20is%20a%20test")’ would return ‘"this is a test"’.
w3-view-this-url
This function returns the URL of the zone under point (if no zone is under point, then it returns nil). If the optional argument is nil, then the URL is also displayed in the minibuffer.
url-view-url
This function returns the URL of the current document. If the optional argument is nil, then the URL is also displayed in the minibuffer.
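To illustrate how url-hexify-string and url-unhex-string complement each other, here is a small round-trip sketch built on the examples given above. A current GNU Emacs provides functions with these names in its url-util library, which is what the require line assumes; with Emacs-w3 2.2 itself, adjust that line to however your installation is loaded. The name my-url-round-trip is hypothetical.

```elisp
(require 'url-util)

;; Hypothetical helper: escape a string for use in a URL, then decode
;; it again, returning both forms.
(defun my-url-round-trip (string)
  "Return a list (ESCAPED DECODED) for STRING."
  (let* ((escaped (url-hexify-string string))
         (decoded (url-unhex-string escaped)))
    (list escaped decoded)))

(my-url-round-trip "this is a test")
;; => ("this%20is%20a%20test" "this is a test")
```

The decoded string matching the original is a quick check that nothing was lost in the escaping.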
mm-compose-type(TYPE)
Compose a body section of MIME-type TYPE. This uses the compose field of a mailcap entry to generate the data, and returns a string that contains the data, with a correct content-type header.
mm-extension-to-mime(EXTN)
Return the MIME content-type of the file extension EXTN.
mm-mime-info(ST ND REQUEST)
Get the mime viewer command for a specific MIME type.
If ST is a number, then the MIME type is the buffer-substring between ST and ND, otherwise ST should be a string specifying the MIME type and associated data. Returns nil if the specified type is not found.
Expects a complete content-type header line as its argument. This can be simple like text/html, or complex like text/plain; charset=blah; foo=bar.
Third argument REQUEST specifies what information to return. If it is nil or the empty string, the viewer (second field of the mailcap entry) is returned. If it is a string, then the mailcap field corresponding to that string is returned (print, description, whatever). If a number, then all the information for this specific viewer is returned.
mm-parse-mailcap(FILE)
Parse the mailcap file specified by FILE.
mm-parse-mailcaps(PATH)
Parse the default mailcap files. Optional argument PATH specifies a UNIX-style path of where to find the mailcap files. This function must be run before the rest of the mm-* functions.
mm-parse-mimetype-file(FILE)
Parse out a mime-types file specified by FILE.
mm-parse-mimetypes(PATH)
Parse the default mimetypes files. Optional argument PATH specifies a UNIX-style path of where to find the mimetypes files.
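The following sketch shows one way these functions might be combined: parse the default mailcap files, then ask for the viewer and the print field for text/html. It assumes that mm-parse-mailcaps may be called without its optional PATH argument and that ND may be passed as nil when ST is a string; the description above suggests both but does not state them outright. The name my-mm-html-commands is hypothetical.

```elisp
;; Hypothetical helper built on the mm-* functions documented above.
(defun my-mm-html-commands ()
  "Return a list (VIEWER PRINT-FIELD) for the MIME type text/html.
Either element is nil if the mailcap files have no matching entry."
  ;; The mailcap files must be parsed before other mm-* functions
  ;; are used.
  (mm-parse-mailcaps)
  (list (mm-mime-info "text/html" nil nil)       ; viewer command
        (mm-mime-info "text/html" nil "print"))) ; mailcap print field
```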
TERM is a user-level protocol for emulating IP over a serial line. More information is available at sunsite.unc.edu:/pub/Linux/apps/comm/term
Please see the full Emacs distribution for a description of regular expressions.
Itelnet is a standard name for a telnet executable that is capable of escaping the firewall. Check with your system administrators to see if you have anything similar.
Expect is a scripting language that allows you to control interactive programs (like telnet) very easily. It is available from gatekeeper.dec.com:/pub/GNU/expect-3.24.0.tar.gz
Available via anonymous ftp from ftp.x.org:/R5contrib/netpbm-1mar1994.tar.gz, and most large ftp sites.
One is ftp://ds.internic.net/internet-drafts
RFC 1321. R.Rivest, "The MD5 Message-Digest Algorithm", http://ds.internic.net/rfc/rfc1321.txt, April 1992.
http://www.netscape.com/
See http://hoohoo.ncsa.uiuc.edu/docs/PEMPGP.html
Available via anonymous ftp to archive.cis.ohio-state.edu in /pub/gnu/emacs/elisp-archive/interfaces/mailcrypt.el.Z
This document was generated on December 6, 2024 using texi2html 5.0.