27 June 1999: SRE-http and caching.
SRE-http ver 1.3g supports several forms of caching. This document outlines
what levels of caching may apply to a request, and what you can do
to increase (or decrease) the extent to which caches answer requests.
There are several different sorts of caches that may apply. In order of
decreasing universality, these are:
1) Proxy server caches.
For purposes of this discussion, a "proxy server" is any intermediate
site, somewhere on the web, that may handle a request issued by a client.
These sites may store responses, and use these cached responses the next
time the same request is received. When such a stored response is used,
the origin server is typically not contacted (the origin server does not
know that the proxy delivered content to a client).
** Perhaps the principal advantage of http/1.1 (over http/1.0) is the
** attention given to making the web proxy-cache friendly.
2) The GoServe cache.
The GoServe cache consists of a list that matches selectors (the local
portion of a URI) to filenames. When a request for the same selector
arrives, GoServe can resolve the request by sending the matched file
(and a few http/1.0 response headers). As an option, the GoServe cache
can "run the filter anyways", which allows the filter to perform post-filter
actions (such as auditing).
3) The SREPROXY cache.
SREPROXY is a front-end to SRE-http. SREPROXY maintains a cache that matches
selectors to files. These files may be temporary files (say, as generated
by adding SSI's to an HTML document). In addition, SREPROXY can resolve a
few "dynamic" SSIs (such as the current time), and can do a limited amount
of access control.
4) The SSI and !DIR caches.
SREFILTR (the main filter) maintains a cache for SSI documents (that contains
"partially compiled" server side includes) and a cache for !DIR requests
(that contains directory listings). These are used when a matching selector
is received. Note that the SSI cache is often used as a base to which
dynamic SSIs are added, where "dynamic SSIs" refers to information that
changes on a request-specific basis (e.g., the current time, the client's
IP address, and output from INTERPRET SSIs).
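The idea of a cached "base" to which dynamic SSIs are added can be sketched
as follows. This is only an illustration in Python, not SRE-http's own code:
the marker strings and function names are hypothetical, and the real
SSI syntax and cache format are not described in this document.

```python
import time

# Hypothetical markers for dynamic SSIs that stay unresolved in the cached base.
DYNAMIC_TIME = "<!--#dynamic time -->"
DYNAMIC_ADDR = "<!--#dynamic client_addr -->"

def build_cached_base(source_html, static_includes):
    """Resolve the static includes once; leave the dynamic markers in place."""
    base = source_html
    for name, text in static_includes.items():
        base = base.replace("<!--#include %s -->" % name, text)
    return base

def finish_response(cached_base, client_addr, now=None):
    """Fill in the request-specific values on top of the cached base."""
    stamp = time.ctime(now if now is not None else time.time())
    return (cached_base
            .replace(DYNAMIC_TIME, stamp)
            .replace(DYNAMIC_ADDR, client_addr))
```

The expensive step (build_cached_base) runs once per document; only the
cheap substitutions in finish_response run on every request.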
The basic notion behind the use of a cache is to reduce processing
requirements and bandwidth demands. Proxy caches are highly effective at
both -- when successful, no communication with the origin server is
necessary. The GoServe cache does not save bandwidth, but can reduce
server load considerably (by skipping the "call the filter to resolve this
request" step). SREPROXY is similar -- although it is a filter that has
to be called, it's much smaller and faster than the regular (SREFILTR)
filter. Lastly, the SSI and !DIR caches can save a lot of processing for
"processor intensive" resources that include SSIs or produce directory listings.
Each of these caches has advantages and disadvantages.
Proxy Caches:
Advantages:
* Very fast response times
* Can completely eliminate load on your server
* Helps reduce internet traffic
Disadvantages:
* Should not be used with actively changing, or access controlled,
resources
* Should not be used when accurate auditing is important
GoServe cache:
Advantages:
* Response times are very fast (compared with SREFILTR)
* Minimizes load on your server
Disadvantages:
* Should not be used with actively changing, or access controlled,
resources
* Currently, the GoServe cache is http/1.0, but not http/1.1,
compliant.
SREPROXY cache:
Advantages:
* Response times are fast
* Can reduce load (since SREPROXY is smaller than SREFILTR)
* Can be used with changing and access controlled resources
* No loss of functionality -- when in doubt, SREFILTR is used
Disadvantages:
* Introduces another round of processing -- if a request does not
match a cached entry, the net result is a slower response.
* On occasion, a stale response may be returned
SSI and !DIR caches:
Advantages:
* Fully functional -- changes are immediately detected
* Greatly reduces processing for a subset of otherwise processing
intensive requests.
Disadvantages:
* On rare occasions, stale responses may be returned
It should be stressed that these caches are not mutually exclusive. In fact,
a typical scenario would have the three higher caches (proxy servers, GoServe,
and SREPROXY) examining a request, which may then be resolved via the use of
the SSI (or !DIR) cache. Thus, optimal performance is achieved by using each
cache in a complementary fashion.
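The layering described above can be pictured as a chain of lookups, each
tried in turn before falling through to the full filter. The sketch below
is a simplification in Python with hypothetical names: in reality a proxy
cache lives on another machine, and each layer has its own rules about
what it may store.

```python
def resolve(selector, layers, run_srefiltr):
    """Try each cache layer in order; fall through to the full filter.

    layers       -- list of (name, dict) pairs, outermost cache first
    run_srefiltr -- callable that builds the response when no layer hits
                    (it may itself draw on the SSI or !DIR cache)
    """
    for name, cache in layers:
        if selector in cache:
            return name, cache[selector]
    return "SREFILTR", run_srefiltr(selector)
```

For example, with layers named "proxy", "GoServe" and "SREPROXY", a request
for an icon held in the GoServe table never reaches SREPROXY or SREFILTR.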
The following discusses some tricks and techniques you can use.
Proxy Servers:
* If you have a very dynamic site of non-access controlled resources,
transparency concerns may override the desire for faster throughput.
That is, you might want to suppress all proxy caching.
This can be accomplished by setting proxy_cache=0 (in INIT_STA.80).
Alternatively, you can use proxy_cache to "force revalidation"
(see the description of the PROXY_CACHE variable in INITFILT.DOC for
more details).
* SRE-http will automatically suppress proxy caching whenever access controls
(such as CHECKLOG and ALLOW_ACCESS), or dynamic SSIs, apply to the resource.
If desired, you can explicitly allow these resources to be cached -- just
include a CACHE (or CACHE*) "permission" in a selector-specific entry
in ACCESS.IN (or in ATTRIBS.CFG). Alternatively, resources listed
as PUBLIC URLS (using PUBURLS.IN or ATTRIBS.CFG) are assumed to
be cachable by proxy caches.
* See HITMETER.DOC for hints on how to resolve problems associated
with accurate metering of hits when proxy servers may be active.
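The rules above amount to a small decision procedure. The Python sketch
below expresses it with standard http/1.1 Cache-Control directives. It is
an illustration only: the function and parameter names are hypothetical,
and the headers SRE-http actually sends for each PROXY_CACHE setting are
described in INITFILT.DOC, not here.

```python
def proxy_cache_headers(proxy_caching_enabled, access_controlled=False,
                        dynamic_ssi=False, cache_permission=False,
                        force_revalidate=False):
    """Pick a Cache-Control value for a response (illustrative policy only)."""
    if not proxy_caching_enabled:
        # Site-wide suppression: shared (proxy) caches must not store it.
        return {"Cache-Control": "private"}
    if (access_controlled or dynamic_ssi) and not cache_permission:
        # Automatic suppression unless a CACHE permission overrides it.
        return {"Cache-Control": "private"}
    if force_revalidate:
        # Caches may store the response but must revalidate before reuse.
        return {"Cache-Control": "no-cache"}
    return {"Cache-Control": "public"}
```

Note that http/1.0 proxies may not understand Cache-Control at all.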
GoServe cache:
* If you do enable the GoServe cache, be aware that it uses an http/1.0
response algorithm. Thus, your site will sometimes return http/1.1
responses, and sometimes http/1.0 responses. Although this is not
fatal, it may have strange impacts (and it's somewhat aesthetically
displeasing).
Therefore, SRE-http will only use the GoServe cache (that is, allow a
request to be cached by GoServe) when a CACHE* permission exists.
Alternatively, resources listed as LITERAL_NORECORD PUBLIC URLS
(in PUBURLS.IN) are assumed to be cachable by the GoServe cache.
* In general, we recommend using the GoServe cache only for resources that
you do not care to audit (such as backgrounds and icons). In this vein,
we recommend checking the "do not call filter" GoServe caching option.
* Future releases of GoServe may upgrade the GoServe cache, so that it
returns appropriate http/1.1 response headers.
* The GoServe cache ignores TE: request headers.
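To make the GoServe points concrete, here is a toy selector-to-filename
table in Python. It is not GoServe's interface: the class, its methods and
the example paths are all hypothetical. It only mirrors the two rules
stated above (an entry needs a CACHE* permission or LITERAL_NORECORD
public URL status, and the filter may optionally be called on a hit).

```python
class SelectorCache:
    """Toy selector -> filename table, loosely following the text above."""

    def __init__(self, call_filter_on_hit=False, post_filter=None):
        self.entries = {}
        self.call_filter_on_hit = call_filter_on_hit
        self.post_filter = post_filter  # e.g. an auditing callback

    @staticmethod
    def eligible(permissions, literal_norecord_public=False):
        # A CACHE* permission, or LITERAL_NORECORD public URL status,
        # is required before an entry may be stored.
        return "CACHE*" in permissions or literal_norecord_public

    def store(self, selector, filename, permissions=(),
              literal_norecord_public=False):
        if self.eligible(permissions, literal_norecord_public):
            self.entries[selector] = filename
            return True
        return False

    def lookup(self, selector):
        filename = self.entries.get(selector)
        if filename is not None and self.call_filter_on_hit and self.post_filter:
            self.post_filter(selector)  # post-filter action, such as auditing
        return filename
```

Leaving call_filter_on_hit off corresponds to the "do not call filter"
recommendation: hits are fastest, but nothing is audited.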
SREPROXY:
* If your site is highly access controlled, or consists primarily of dynamic
HTML documents (with lots of SSIs) or addons/cgi-bin scripts, then use
of SREPROXY may hurt (increase) response times.
* NUSTATUS contains an option that will display simple statistics on
the proportion of requests satisfied by SREPROXY.
* SREPROXY.DOC contains a detailed discussion on how to use SREPROXY.
* If SREPROXY detects a TE: GZIP request header, it will NOT resolve
the request.
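Whether SREPROXY helps or hurts depends on the proportion of requests it
satisfies (the statistic NUSTATUS can display). A rough model, assuming
every request pays the SREPROXY cost and only misses also pay the SREFILTR
cost, is sketched below in Python. The timings are hypothetical inputs, not
measurements.

```python
def expected_time(hit_rate, t_sreproxy, t_srefiltr):
    """Average time per request when SREPROXY sits in front of SREFILTR."""
    return t_sreproxy + (1.0 - hit_rate) * t_srefiltr

def break_even_hit_rate(t_sreproxy, t_srefiltr):
    """Hit rate above which the front-end lowers the average time.

    Derived from: t_sreproxy + (1 - h) * t_srefiltr < t_srefiltr
    which simplifies to: h > t_sreproxy / t_srefiltr
    """
    return t_sreproxy / t_srefiltr
```

Under this model, if SREPROXY costs one fifth as much as SREFILTR, it pays
off once more than one request in five is satisfied from its cache.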
SSI and !DIR caches:
* There is almost no reason not to use these caches.
The exceptions are:
i) You have lots of HTML documents, and not much extra disk space.
ii) Your documents change rapidly (have lots of dynamic SSIs).
iii) HTML files are constantly being edited, added, and removed.