24 June 1999
SRE-http and Content Negotiation
I) Introduction
Content negotiation refers to the choosing of a best representation
of a web resource from several alternatives.
In general, content negotiation is used to choose between resources
created with one of several languages, or one of several mimetypes.
Content negotiation can also be used to choose between resources
using alternate character sets, and to choose shorter documents.
In the future, content negotiation may also be used to choose documents
using certain features (such as documents using different versions of html).
There are basically two forms of content negotiation:
1)Server-side.
Server-side content negotiation (which is defined in http/1.0)
is performed by the server -- the server uses ACCEPT request
headers (provided by the client) to automatically choose
and return one of several variants.
2)Client-side.
Client-side content negotiation (which is new to http/1.1)
is accomplished by the client automatically requesting one of
several variants:
the choice is based on a "variants list", containing
URLs and descriptive information, that the server returns
in response headers.
SRE-http supports both server-side and client-side content negotiation.
---------------------
II) Specifying a negotiable resource
To specify a "negotiable" resource (a web resource with several possible
variants) requires 3 steps:
1) Create several "variants" of a resource
2) Create a file, in your web-space, containing a "variant list"
3) Create a special !NEGOTIATE alias pointing to this file
These steps are described in the next three sections.
Notes:
* SRE-http's implementation of content negotiation is loosely based on
Apache 1.3. You might want to examine
http://www.apache.org/docs/content-negotiation.html
for a different description.
* For a technical discussion of content-negotiation, see RFC 2295
"Transparent Content Negotiation in HTTP" at:
http://gewis.win.tue.nl/~koen/conneg/
* The rationale for client-side content negotiation is that the browser
is best equipped to choose a variant (given descriptive information
on the variant's mimetype, language, etc.). Furthermore, it is thought that
it is wasteful of bandwidth to provide the full range of Accept: headers
on every request (since most resources will not be subject to content
negotiation). On the other hand, client-side negotiation does require
two requests...
* Note that SRE-http does NOT completely implement the "transparent content
negotiation" protocol. In particular, some of the required (variant
specific) http/1.1 caching is not attempted.
-------------------------
II.1) Creating variants
Basically, a variant is any server resource. This includes documents, images,
and even cgi-bin scripts and sre-http addons. The notion is that variants
all represent variations of the same information. For example,
you may have several translations of the same document (say, an English,
Finnish and Korean version) which you'd like to automatically send
to the appropriate clients.
There are a few constraints:
1) variants must be on the same server as the "variant list" (discussed next).
It is considered good practice (from a security standpoint) for
each variant to be in the same directory as the variant list.
2) variants must be retrievable via GET requests. That is, each
variant should be accessible with a standard URL; which also means that
content negotiation will NOT work with POST requests.
-------------------------
II.2) Creating a Variant List
The variant list is at the heart of SRE-http's implementation of content
negotiation. The variant list is a simple (text) file containing
several multi-line records. This file should be accessible from the web.
That is: it must be placed in (or under) the GoServe data directory, or in a
virtual directory.
Each record in the variant list must specify a URI (a selector),
and several pieces of identifying information. This information
is used to specify up to five "dimensions of negotiation": mimetype,
charset, language, encoding, and length.
The syntax of these multi-line records is (note that a wildcarded form
of these records can also be specified, as discussed in section II.4 below):
URI: a_selector
Content-type: type/subtype ; charset=a_charset ; qs=m.mm
Content-language: l1, l2
Content-encoding: enctype
Content-length: nnnn
Where:
URI is required.
URI should be a valid selector; that is, site information can not be
included -- the resource must be on the same site as the variant list.
Note that relative URIs are interpreted relative to the location
of the "variant list".
Content-type: Content-type is required. It contains 3 sub-fields.
type/subtype: Only the type/subtype subfield is required.
It identifies the mime-type of this variant.
charset: Optional. Identifies the character set. If not specified,
ISO-8859-1 (latin1) is assumed.
qs: Optional. The "selection quality". Must be between 0.0 and 1.0.
0.0 means "unacceptable", 1.0 means "perfect representation".
If not specified, a value of 1.0 is assumed. All else equal,
variants with higher values are preferentially chosen. Note that
qs is used, while "q" is used in accept headers.
Content-language: Optional.
A comma delimited list of 2-character language codes
(e.g., EN for English, FR for French, DE for German).
Note that the 2-2 letter codes, such as En-US, are
shortened (only the first two characters are used).
Content-encoding: Optional.
A comma delimited list of content encoding types (e.g., identity,
gzip, and compress).
Content-length: Optional.
The length of the resource. If not specified, a length of 0 is
assumed. Note that "longer" resources are less likely to be chosen
(all else equal).
Description: Optional
An optional description of this variant.
Features: Optional
Features allows you to note special "features" of the variant.
Although not widely supported, in the future browsers may use
such information to choose a variant (SRE-http does NOT use features
in its variant selection algorithm).
For details on the structure of the feature, please see RFC2295.
Notes:
* Appendix A contains an example of a variant list file.
* When a variant is selected, its Content-Type, Content-Language,
and Content-Encoding are returned as response headers. Thus,
in most cases the Content-Type specified in a variant list will
override the "default" Content-Type (i.e.; the mimetype derived from
the file's extension). The exception to this rule is when
the variant is a CGI script or an SRE-http addon, in which
case this information is ignored.
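The record syntax above can be illustrated with a small parser. This is a minimal sketch in Python, not SRE-http's own code (SRE-http is written in REXX). The function names and the dictionary keys are hypothetical. It assumes that a record starts at each URI: line and ends at a blank line, that lines starting with ; are comments, and that field names are case-insensitive. It does not treat a leading one-line URI: record specially.

```python
def new_record(uri):
    # Defaults follow the text above: latin1 charset, qs of 1.0, length of 0.
    return {"uri": uri, "type": None, "charset": "iso-8859-1", "qs": 1.0,
            "languages": [], "encodings": [], "length": 0}


def split_list(value):
    """Split a comma delimited field into lowercase items."""
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def parse_variant_list(text):
    """Return (pattern, records) for the text of a variant list file."""
    pattern, records, current = None, [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(";"):
            continue                      # comment line
        if not line:
            current = None                # a blank line ends the record
            continue
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name == "pattern":
            pattern = value
        elif name == "uri":
            current = new_record(value)   # each URI: starts a new record
            records.append(current)
        elif current is None:
            continue                      # field outside a record: ignored here
        elif name == "content-type":
            parts = [p.strip() for p in value.split(";")]
            current["type"] = parts[0].lower()
            for part in parts[1:]:
                key, _, val = part.partition("=")
                key = key.strip().lower()
                if key == "qs":
                    current["qs"] = float(val)
                elif key == "charset":
                    current["charset"] = val.strip().lower()
        elif name == "content-language":
            # Codes such as fr-ca are shortened to their first two characters.
            current["languages"] = [code[:2] for code in split_list(value)]
        elif name == "content-encoding":
            current["encodings"] = split_list(value)
        elif name == "content-length":
            current["length"] = int(value)
        else:
            current[name] = value         # description, features, etc.
    return pattern, records
```

Calling parse_variant_list on the text of a variant list gives one dictionary per record, with the documented defaults filled in for any omitted field.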
-------------------------
II.3) Identifying a negotiable resource.
SRE-http uses a special "alias" to identify variant lists. That is,
unless explicitly identified as a variant list, a request for a variant
list will be treated in the normal fashion-- i.e.; the file would be
returned verbatim (say, as a text document).
To identify a variant list, you must add special lines to your ALIASES
file. This can be done with the configurator, or by hand.
Note that there are two forms of this special alias -- a simple form
and a wildcard form:
a) Simple form:
The syntax of the "simple" form is:
a_selector !NEGOTIATE
where a_selector is a selector pointing to the variant list.
For example:
/VARTEST/VAR1.LST !NEGOTIATE
could mean: "VAR1.LST file in the VARTEST subdirectory of the GoServe
data directory is a variant list"
Note that, unlike most SRE-http aliases, there is no explicit
"replacement". Actually, you can think of the variant list itself
as an extension of SRE-http's aliasing -- it contains instructions used
to decide which (of several) replacements to use.
b) Wildcarded form
The syntax of the wildcarded form is:
wild_sel !NEGOTIATE variant_sel
where
wild_sel is a "target" selector containing a * (wildcard)
variant_sel is a valid selector that points to a variant list
For example:
/MANUALS/*.HTM !NEGOTIATE /MANUALS/DOCS.LST
In this case, all request selectors that match MANUALS/*.HTM (note that the
leading / is ignored) will use the variant list specified in
/MANUALS/DOCS.LST. Please see the next section for details on how
variants are resolved when this wild_sel form is used.
Once you've accomplished these steps, all you need to do is put a URL
pointing to "a_selector" (or that will alias-match "wild_sel"), and hope
that your client's browser either provides useful ACCEPT headers, or knows how
to do client side content negotiation.
-------------------------
II.4) Wildcarded variants
As outlined above, one must create a unique variant list (and a unique
alias) for each negotiable resource. This may become quite tedious,
especially when you have multiple sets of documents. For example, if you
have a 10 chapter manual in 3 languages (hence, 30 files), it could be
advantageous (that is, a lot less trouble) to use some wildcarded "variant
list" for all 10 chapters.
In recognition of this possibility, SRE-http supports a special form of
variant list (and ALIAS) that supports such "multiple sets of negotiable
resources". The specification of these sets requires two changes to
the simple case.
The first difference is discussed above -- the use of the "wildcarded form"
of an alias. The second involves modifications to the variant list file.
Recollect that the wildcarded form directs many possible "request selectors"
to a single variant list. Thus, the variant list must contain information that
allows the request selector to influence the value of the URI: field of each
record in the variant list. To do this, two steps are required.
a) A
PATTERN: wild_sel
entry should be put at the top of the variant list.
For example:
Pattern: /manuals/*.HTM -- (note that a leading / is ignored)
** The value of "wild_sel" used in a PATTERN: entry should be the same
as the "wild_sel" used as the "target" portion of the wildcarded alias
(that is used to identify the variant list).
b) The URI: fields may contain *s.
As with other features, SRE-http will replace these * (in the URI:
entries) with corresponding portions in the request selector.
For example:
i) if the request selector is: manual/chap1.htm
ii) the alias is: manual/* !negotiate manual/docs.lst
iii) manual/docs.lst contains
pattern: manual/*
URI: de/*
Content-type: text/html
Content-Language: de
Uri: en/*
Content-type: text/html
Content-language: en
iv) If the request contains an accept-language: en request header, then
en/chap1.htm would be used
If the request contains an accept-language: de request header, then
de/chap1.htm would be used
That is, the * in the Pattern: (in manual/*) "corresponds to"
chap1.htm, which is then used as a substitute for the * in the
various URI: entries.
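The substitution described above can be sketched in a few lines of Python. This is an illustration only, not SRE-http's code. The function name is hypothetical, and the sketch assumes a single * per pattern, case-insensitive matching, and that a leading / is ignored on both the pattern and the request selector.

```python
def resolve_wildcard(pattern, uri_template, request_selector):
    """Replace * in uri_template with the text that * matched in pattern.

    Returns None when the request selector does not match the pattern.
    """
    pat = pattern.lstrip("/").lower()
    sel = request_selector.lstrip("/")
    head, star, tail = pat.partition("*")
    low = sel.lower()
    if not star:
        # No wildcard in the pattern: only an exact match counts.
        return uri_template if low == pat else None
    if len(low) < len(head) + len(tail):
        return None
    if not (low.startswith(head) and low.endswith(tail)):
        return None
    # The part of the selector that lines up with the * in the pattern.
    matched = sel[len(head):len(sel) - len(tail)]
    return uri_template.replace("*", matched)
```

With the pattern manual/* and the request selector manual/chap1.htm, the URI: entry en/* resolves to en/chap1.htm, as in the example above.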
-------------------------
III) The content negotiation algorithm.
The following sketches the algorithm used by SRE-http. Note that this
is used both for "server side negotiation" (when the client does not
include a Negotiate: request header), and as a "remote variant selection
algorithm" (when the client includes a Negotiate: * request header).
1) First, SRE-http checks for a "Negotiate:" request header. If no
such header exists, then server-side negotiation is always
attempted. If this header does exist, then server-side negotiation
may be attempted (see the notes for details).
2) If server-side negotiation is to be attempted, by default
the following selection algorithm is used.
If the client allows a "remote variant selection algorithm"
(by including a Negotiate: n.n request header), then a custom
procedure can be used instead (see Appendix B for the details).
Note that this is a "leave as soon as a definitive answer is found"
method -- later steps are only used if earlier steps yield ties.
Furthermore, variants eliminated in earlier steps are NOT
available -- they are NOT considered in later steps.
Lastly, if all variants are eliminated, a suitable "could not find
representation" response is immediately returned.
a) Accept: headers are read. Accept: headers contain information
on acceptable mime-types. This information can contain "selection
quality" (q) information.
The variant with the best "combined" quality is used. Combined quality
is determined by multiplying the variant-list quality (qs) by the accept: header
"q" factor.
If there are ties (i.e.; several mimetypes have a combined q of 1.0),
then move to step b. If there is no accept: header (most browsers
send some form of Accept: header) then skip this step.
b) Accept-language: headers are read. These headers can also contain
"language specific q modifiers". The variant (of those surviving
step a) with the highest language "q" factor is used.
If there are ties, move to step c. If there is no accept-language
header, this step is skipped. Note that the content-language entries
in the variant list should NOT include "q" factors.
c) Accept-Encoding: headers are read. These headers can also contain
"encoding specific q modifiers". The variant (of those surviving
step b) with the highest encoding "q" factor is used.
If there are ties, move to step d. If there is no accept-encoding
header, this step is skipped. Note that the content-encoding entries
in the variant list should NOT include "q" factors.
d) An accept-charset: header is read. Variants that do not match this
charset are dropped.
If there are ties then move to step e. If there is no accept-charset
header then skip this step.
e) Use the variant with the smallest content-length (as pulled from the
variant list).
If there are ties, move to step f.
f) Use the first of the remaining variants.
3) If client-side negotiation is to be used, SRE-http returns a special
"300" return code. The body of the response contains a <UL> list
containing links to each variant. Furthermore, the variant list
(suitably formatted) is returned in an Alternates: response header.
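The "leave as soon as a definitive answer is found" selection in step 2 can be sketched as follows. This is a simplified Python illustration under stated assumptions, not SRE-http's implementation. Variants are plain dictionaries with the hypothetical keys uri, type, qs, languages, charset and length. Step c (encodings) is left out for brevity, the special 0.01 and 0.02 defaults for */* and xxx/* are not modelled, and it is assumed that a variant scoring 0 in a step is eliminated.

```python
def parse_accept(header):
    """Turn 'a;q=0.5, b' into {'a': 0.5, 'b': 1.0}."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for part in parts[1:]:
            key, _, val = part.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        prefs[parts[0].lower()] = q
    return prefs


def type_q(prefs, mimetype):
    """Accept-header q for a mimetype: exact match, then xxx/*, then */*."""
    major = mimetype.split("/")[0]
    for key in (mimetype, major + "/*", "*/*"):
        if key in prefs:
            return prefs[key]
    return 0.0


def keep_best(variants, score):
    """Drop variants scoring 0, then keep those tied for the top score."""
    scored = [(score(v), v) for v in variants]
    scored = [(s, v) for s, v in scored if s > 0]
    if not scored:
        return []
    top = max(s for s, _ in scored)
    return [v for s, v in scored if s == top]


def select_variant(variants, headers):
    """headers maps lowercase header names to values; returns a variant or None."""
    steps = []
    if headers.get("accept"):                      # step a: qs times q
        types = parse_accept(headers["accept"])
        steps.append(lambda v: v["qs"] * type_q(types, v["type"]))
    if headers.get("accept-language"):             # step b: best language q
        langs = {}
        for code, q in parse_accept(headers["accept-language"]).items():
            langs[code[:2]] = max(q, langs.get(code[:2], 0.0))
        # Records with no language get the 0.01 default noted in this section.
        steps.append(lambda v: max((langs.get(c, 0.0) for c in v["languages"]),
                                   default=0.01))
    if headers.get("accept-charset"):              # step d: charset must match
        charsets = parse_accept(headers["accept-charset"])
        steps.append(lambda v: 1.0 if v["charset"] in charsets
                     or "*" in charsets else 0.0)
    steps.append(lambda v: 1.0 / (1 + v["length"]))  # step e: smaller is better
    survivors = list(variants)
    for score in steps:
        survivors = keep_best(survivors, score)
        if len(survivors) <= 1:
            break                                  # definitive answer, or none
    return survivors[0] if survivors else None     # step f: first survivor
```

A return value of None corresponds to the "could not find representation" case.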
Note:
* SRE-http recognizes several values in the Negotiate: header (any
combination of them can appear in a comma delimited list)
trans: client supports client side content negotiation
vlist: always send Alternates: response header (implies trans)
*: server should attempt to choose best variant (implies trans).
guess-small: always send Alternates: response header (implies trans)
This is minimal support -- with full support, sometimes
a sufficiently small "best variant" guess is returned.
n.n : A decimal number, such as 4.2.
This specifies the major and minor versions of a customized
"remote variant selection algorithm" (RVSA) -- a custom
procedure to replace SRE-http's built-in variant selection
algorithm. See APPENDIX B for details on how to
add a procedure to implement your favorite RVSA!
In the following cases, server-side negotiation is not attempted:
Negotiate: trans
Negotiate: vlist
Negotiate: vlist,trans
Negotiate: guess-small
In these cases, SRE-http returns a "300 Multiple Choices" response.
This response always includes an Alternates: header (containing a suitably
formatted version of the variant list). In addition, a <UL> list
of links to each variant is returned in the body of the response.
The assumption is that (in most cases) the user-agent
(the browser) will use the variant list to choose a best
variant; but if it can't, the list will be displayed and the
human can manually choose.
When * is included, then server-side content negotiation will be
attempted. In particular:
Negotiate: *, vlist
means "attempt to find best match, but always return the variant list"
In contrast
Negotiate: *
means "return the best match, only return a variant list if no best
match found".
When a "n.n" RVSA token is included, and the appropriate procedure
has been defined, then server-side content negotiation
will be attempted. See Appendix B for the details.
In either case (that is, whenever * or an n.n appears in the
Negotiate: header), failing to find a best match causes a
"406 No acceptable representation" response to be returned.
This 406 response is similar to the 300 response described above --
it contains an Alternates: header and a <UL> list of links.
* To repeat: if no Negotiate: request header exists, then server-side
content negotiation is attempted. If no best-match can be found, then
a 404 response, containing a <UL> list of links to each variant,
is returned (but no Alternates: header).
* Whenever server-side negotiation (either pure server side, or
due to a Negotiate: * header) is successful, a Vary: header
will be included (Vary headers are used by proxy servers).
* When a quality value is not specified (either as a qs factor in the
Content-type field in the variant list, or as a "q" factor in an Accept:
or Accept-Language request header), a value of 1.0 is assumed.
There are a few exceptions:
i)if a */* or xxx/* appear (in the Accept: header), and if no other
mimetypes have a q modifier, then */* is assigned a value of 0.01,
and xxx/* is given a value of 0.02.
ii)records in the variant list with no content-language field are given
a language quality factor of 0.01.
Both these exceptions are tricks that cause these "defaults" to be used
when no better match exists.
* You can specify multiple languages in the Content-language entry
(in a variant list record). However, you can NOT specify a "q" modifier
for these languages. Note that the accept-language header MAY contain
several languages, each of which may have a "q" modifier.
* If no charset is specified (in a variant list record), then
ISO-8859-1 (latin1) is assumed.
* If no content-length is specified (in a variant list record), then
a length of 0 is assumed. That is, content-length is really a final
"quality check", with lower values (and unspecified values) preferred.
* Hint: the Options-General-Language tab of Netscape can be used to
automatically generate Accept-language headers.
* WARNING: the variant returned by server side negotiation should NEVER
be a negotiable resource. If this should happen, SRE-http
will return a 506 response (Variant Also Negotiates).
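The way the Negotiate: header steers the response, as described in the notes above, can be summarized in a short Python sketch. It is illustrative only. The function name and its arguments are hypothetical, and one case is an assumption that the text does not cover: an n.n token with no matching procedure defined (and no *) is treated here like a header without *.

```python
import re


def plan_response(negotiate, best_found, rvsa_defined=False):
    """Return (status_code, send_alternates_header).

    negotiate    -- value of the Negotiate: request header, or None if absent
    best_found   -- True if server-side selection found a best variant
    rvsa_defined -- True if a procedure exists for a requested n.n version
    """
    if negotiate is None:
        # Pure server-side negotiation: no Alternates: header either way.
        return (200, False) if best_found else (404, False)
    tokens = [t.strip().lower() for t in negotiate.split(",") if t.strip()]
    has_rvsa = rvsa_defined and any(re.fullmatch(r"\d+\.\d+", t) for t in tokens)
    if "*" in tokens or has_rvsa:
        if not best_found:
            return (406, True)
        # vlist and guess-small both ask for Alternates: on every response.
        return (200, "vlist" in tokens or "guess-small" in tokens)
    # trans, vlist or guess-small without *: the client chooses.
    return (300, True)
```

For example, Negotiate: vlist,* with a best match found gives a 200 response that also carries the Alternates: header, while Negotiate: trans always gives a 300 response.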
-------------------------
Appendix A) Sample Variants file.
---------------------- Start example -----------
; this is sample variant file
URI: foo
Uri:foo.fr.de.html
content-type: text/html
content-language: sp,fr-ca,de
content-length: 2005
Uri:foo.txt
content-type: text/plain ; q=0.5
content-length: 2005
content-language: en
description: this is the english text version
Uri:foo.gz
content-type: text/plain ; q=0.5
content-length: 2005
content-encoding: gzip
content-language: en
uri: /status?
content-type: application/octet-stream
uri: foo.en.html
content-type: text/html ; charset=iso-8859-1 ; q=1
content-language: en
features: tables frames
uri:foo.default
---------------------- End of example -----------
Notes:
* Lines beginning with ; are comments, and are ignored
* Each (typically multi-line) record contains a
URI:, and optionally any combination of content-type,
content-language, and content-length
* Blank lines are treated as "record delimiters"
* The first "one line record" URI: entry is optional -- it's skipped
* Note that content-encoding and content-features are NOT supported
* Content-length is the estimated size of the file,
it does NOT have to be the actual size of the file.
All else equal, smaller "lengths" are preferentially returned.
* Content-language is a comma delimited set of languages
* Content-encoding is an encoding type, with "identity" meaning
"no encoding". Only one encoding-type can be specified.
* Content-type contains 3 fields. The type/subtype field is required,
the charset= field is optional (iso-8859-1 is the default), and q
is the (optional) "quality" measure used to weight the variants.
* Note the last entry is a "fall back" variant (the client's user-agent
can choose this if all else fails).
-------------------------
Appendix B: Specifying a Remote Variant Selection Algorithm
SRE-http provides a simple hook by which a custom remote variant
selection algorithm (RVSA) can be implemented. The process is:
a) obtain the RVSA.
b) write a rexx procedure that calls this rvsa
c) save this procedure into macrospace using a name of
SREF_RVSA_n, where "n" is the major version number
of the rvsa.
For example, if the request contains:
Negotiate: 2.3,vlist
then SREF_RVSA_2 is the appropriate "macrospace" procedure name.
In the simplest case, the rvsa procedure will be a rexx procedure;
in which step a and b are combined.
The rvsa procedure (say, SREF_RVSA_2) will be called with
two arguments, the version number, and the list of alternates.
For example, to read these arguments the procedure could use:
parse arg version,altlist
In the above example (Negotiate: 2.3,vlist) the version number would
be "2.3" (without the quotes). The list of alternates is simply a
copy of the Alternates: header (that would be returned to the client).
The rvsa procedure should use this information, along with request
specific information (such as the various Accept headers, which
can be read using the GoServe reqfield function) to determine which
variant is best. The number of this best variant should be
returned; or a 0 should be returned if there is no best match.
For example, if the second variant (in the list of alternates) is
best, then SREF_RVSA_2 should return a "2" (without the quotes).
Notes:
* You can use RXU, REXXLIB, or other DLLs to create macrospace
procedures
* For details on the structure of the alternates list, see RFC2295.
(see Appendix C for an example).
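The naming rule for the macrospace procedure can be illustrated with a short Python sketch (the real procedure itself would be written in REXX). The function name is hypothetical. It assumes the first n.n token in the Negotiate: header is the one that counts.

```python
import re


def rvsa_procedure_name(negotiate):
    """Return (procedure_name, version) for the first n.n token, or None."""
    for token in negotiate.split(","):
        token = token.strip()
        match = re.fullmatch(r"(\d+)\.(\d+)", token)
        if match:
            # Only the major version number is used in the procedure name.
            return "SREF_RVSA_" + match.group(1), token
    return None
```

For Negotiate: 2.3,vlist this gives the name SREF_RVSA_2 and the version 2.3, which is the first argument passed to the procedure.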
-------------------------
Appendix C: Example of a TCN Request and Response
The following is a simple example of a TCN request, and the response
from SRE-http.
Assuming that:
TSTHTM/TSTHTM.NEG
is a negotiable resource, and TSTHTM.NEG has the following structure:
;---- TSTHTM.NEG is a negotiable resource
uri:tsthtm
uri:tst.1
content-type: text/plain; qs=0.8
content-language: en
uri: tst.2
content-type: text/plain ; qs=0.3
content-language: fr
description: The French Version
features: tables [abc def]
uri: /gene_test?
content-type: application/octet-stream; charset=cyrillic
content-language: ru
;---- End of TSTHTM.NEG
The URL:
http://foo.bar.net/tsthtm/tsthtm.neg
which yields the request:
GET /tsthtm/tsthtm.neg HTTP/1.1
HOST:foo.bar.net
Negotiate: vlist,*
Accept: text/plain
Accept-language: fr
would yield the response:
HTTP/1.1 200 Ok
Date: Sat, 26 Jun 1999 13:49:10 GMT
Accept-Ranges: bytes
Connection: close
Last-Modified: Thu, 24 Jun 1999 05:52:38 GMT
ETag: "990624015238_18;7964E38D"
Server: GoServe for OS/2, version 2.52; SRE-http 130607.1
Content-Type: text/plain
Tcn: choice
Content-location: tsthtm/tst.1
Vary: negotiate,accept
Alternates: {"tst.1" 0.8 {type text/plain } {language en }},
{"tst.2" 0.3 {type text/plain } {language fr } {features tables [abc def] }
{description "The French Version"}},
{"/gene_test?" 1 {type application/octet-stream } {language ru }
{charset cyrillic }}
Content-Language: en
Content-length: 18
Cache-control: public
Notes:
* Since * appears in the Negotiate: header, SRE-http will
attempt the default server side negotiation. The Accept: text/plain
is the only information used, and the explicit qs=0.8 of the first
entry beats the explicit qs=0.3 of the second entry. The implicit
qs=1.0 of the third entry is irrelevant, since the third entry's
content-type is not text/plain.
* the Alternates: header has been reformatted (it is actually on one
long line).
* The contents of tsthtm/tst.1, which has a size of 18 characters,
is sent as the body of the response.
Alternatively, if the client does not want the server to resolve the
variant:
GET /tsthtm/tsthtm.neg HTTP/1.1
HOST:foo.bar.net
Negotiate: vlist
Accept: text/plain
Accept-language: fr
which yields:
HTTP/1.1 300 Multiple Choices
Date: Sat, 26 Jun 1999 14:01:20 GMT
Server: GoServe/2.52 ;130607.1
Tcn: list
Etag: 990626095912_324;7964E38D
Vary: *
Alternates: {"tst.1" 0.8 {type text/plain } {language en }},
{"tst.2" 0.3 {type text/plain } {language fr } {features tables [abc def] }
{description "The French Version"}},
{"/gene_test?" 1 {type application/octet-stream } {language ru }
{charset cyrillic }}
Cache-Control: public
Note that the portion of the etag following the ; (the 7964E38D)
is the same for both requests -- it's the "variant list validator"
which, since the Alternates: header is the same, is also the same.
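The layout of the Alternates: value shown in these examples can be reproduced with a small Python formatter. This is a sketch that mimics the output above, not SRE-http's own code. The function name and dictionary keys are hypothetical, the ", " separator is an assumption, and a charset is only written when the record sets one explicitly.

```python
def format_alternates(variants):
    """Build an Alternates: header value from a list of variant dictionaries."""
    items = []
    for v in variants:
        attrs = ["{type %s }" % v["type"]]
        if v.get("languages"):
            attrs.append("{language %s }" % ",".join(v["languages"]))
        if v.get("charset"):
            attrs.append("{charset %s }" % v["charset"])
        if v.get("features"):
            attrs.append("{features %s }" % v["features"])
        if v.get("description"):
            attrs.append('{description "%s"}' % v["description"])
        # The "g" format writes 1.0 as 1 and 0.8 as 0.8, as in the examples.
        quality = format(v.get("qs", 1.0), "g")
        items.append('{"%s" %s %s}' % (v["uri"], quality, " ".join(attrs)))
    return ", ".join(items)
```

Passing the three TSTHTM.NEG records through this function gives one long line with the same entries as the Alternates: header in the responses above.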