/*
* Arcweb
*
* Copyright (C) Stewart Brodie, 1994, 1995
*
* This product is supplied under the terms laid down in the `Terms' file.
*
* This file: Protocol
*
* For Arcweb versions 0.12 and later (until superseded)
*
*
*
* Changes from protocol in version 0.11
* Extended URL protocol defined
* - in *Request message blocks
* - notes about ownership of file handles
* - in comments about the Flags word
* - comments about how to handle Location: header from HTTP servers
* Message_ArcwebEMailRequest/Done added
* Handling of bit 27 in non-extended URL messages
* In Other Notes: comment about expiry dates updated
* Warning about setting type bits in Done messages
* Renderer protocol warnings added
* Sender protocol renamed to Poster protocol
* EMail protocol added (uses a new bit in flags)
* Suggestion about intelligent HTTP fetchers prefetching images
* Message_ArcwebAbortRequest added + description
* Message_ArcwebTransferStatus added + description
*
*/
This file describes the protocols used by Arcweb. There are seven protocols
planned, although only the first three are fully implemented so far.
* Fetcher Protocol
* Renderer Protocol
* Quit Protocol
* Extended URL Protocol
* Poster Protocol
* Email Protocol
* Expire Protocol
Programmers supporting these protocols must be extremely careful with their
memory, message and protocol management. The protocols are allowed to be
concurrent and nested.
The obvious example is that a renderer may find an inline image in a
document, and will hence want to issue a request to fetch it. A less
obvious example is an intelligent HTTP fetcher which scans the returned
document (if it was HTML) to pick out <img> tags and kicks off requests for
these inlined images, so that when ArcWeb comes to ask for them a swifter
reply can be made. The more images there are inlined in a document, the
more noticeable the impact of such a system would be.
General
=======
Each protocol consists of 2 messages, a "Request" and a "Done". The
Request is sent with event code 18 to force an acknowledgement. The
fetcher/renderer should acknowledge with the Done message to prevent the
Wimp returning the message to the originator. Should the fetcher/renderer
wish to call Wimp_Poll before it is ready to send the Done reply, it MUST ack
the message (ie. fill in the ref fields and reply with event type 19 to
prevent the Wimp returning it to the originator), and then send the Done
message when it is ready (having remembered to fill in the your_ref field
from the my_ref of the original Request).
Instead of sending a Wimp acknowledgement with event type 19 to prevent
delivery failure, as mentioned above, it would be much better to actually
reply with a Message_ArcwebTransferStatus, which doubles both as the ack and
as a useful message.
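The reference handling described above can be sketched in C as follows. This is only an illustration, assuming the standard Wimp message header layout (size at buf+0, sender task handle at buf+4, my_ref at buf+8, your_ref at buf+12, action code at buf+16). The names prepare_reply, get_word and put_word are invented for this sketch and are not part of requester.h. The actual call to Wimp_SendMessage is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets of the standard Wimp message header fields. */
enum { MSG_SIZE = 0, MSG_SENDER = 4, MSG_MY_REF = 8, MSG_YOUR_REF = 12, MSG_ACTION = 16 };

enum { MESSAGE_ARCWEB_FETCH_REQUEST = 0x4A240, MESSAGE_ARCWEB_FETCH_DONE = 0x4A241 };

/* Read and write one 32-bit word in the poll buffer. */
static int32_t get_word(const unsigned char *buf, size_t off)
{
    int32_t v;
    memcpy(&v, buf + off, sizeof v);
    return v;
}

static void put_word(unsigned char *buf, size_t off, int32_t v)
{
    memcpy(buf + off, &v, sizeof v);
}

/* Hypothetical helper: turn a received Request into the start of a reply.
 * It copies my_ref into your_ref, sets the new action code, and returns
 * the task handle the reply should be sent to. */
int32_t prepare_reply(unsigned char *buf, int32_t reply_action)
{
    assert(buf != NULL);
    put_word(buf, MSG_YOUR_REF, get_word(buf, MSG_MY_REF));
    put_word(buf, MSG_ACTION, reply_action);
    return get_word(buf, MSG_SENDER);
}
```

For a plain acknowledgement, the same helper could be called with the unchanged action code and the block sent back with event type 19. If the Done message is to be sent later, the my_ref value must be saved before the next Wimp_Poll overwrites the buffer.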
C programmers will want to use requester.h to define the structures used.
The types have been divorced from any Wimp library, only char, void and int
types have been used.
For an example of a Fetcher, see the !ArcwebLcl application supplied. For an
example of a Renderer, see the !ArcwebImg application supplied. ArcWeb
broadcasts a RenderRequest even for HTML and plain text files, as it listens
for RenderRequest messages itself. This allows ArcWeb's own rendering to be
overridden (once I have put the configuration bit in) by having ArcWeb
ignore the message.
Messages
========
buf is the buffer returned from Wimp_Poll (&e->data in RISC_OSLib). The
obvious fields have been omitted (ie. buf+0 to buf+12).
The 'Private Arcweb handle' must always be preserved, as this is the only
way Arcweb can deliver replies to the request originator.
The types of each entry are given: str0 = zero terminated string; strN =
string whose length is defined elsewhere; str0/N = string zero terminated
unless it is maximum length (defined elsewhere), in which case the terminator
is missing; date5 = 5 byte date string (in UTC) as returned by OS_Word 14,3.
The buffers are all defined together and then explained below. requester.h
contains C structures for these messages.
Message_ArcwebFetchRequest (msg_Arcweb__base + 0)
         buf+16  int    Message_ArcwebFetchRequest (0x4A240)
         buf+20  void*  Private Arcweb handle
         buf+24  int    Flags
         buf+28  int    RISC OS file handle (source)
  EITHER buf+32  str0   Uniform Resource Locator (fully qualified)
  OR     buf+32  int    RISC OS file handle (URL) (if flags bit 22 set)
Message_ArcwebFetchDone (msg_Arcweb__base + 1)
         buf+12  int    my_ref from Message_ArcwebFetchRequest
         buf+16  int    Message_ArcwebFetchDone (0x4A241)
         buf+20  void*  Private Arcweb handle
         buf+24  int    Flags
  EITHER buf+28  str0   Error message
  OR     buf+28  int    Zero=this item has an expiry date
         buf+32  int    Non-zero=use default expiry delay
         buf+36  date5  Expiry date in UTC (if buf+32 zero, and buf+28 zero)
Message_ArcwebRenderRequest (msg_Arcweb__base + 2)
         buf+16  int    Message_ArcwebRenderRequest (0x4A242)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
         buf+28  int    RISC OS file handle (source)
         buf+32  int    RISC OS file handle (temporary)
         buf+36  int    RISC OS file handle (diagram)
         buf+40  int    RISC OS file handle (link)
         buf+44  int    size of data following
         buf+48  ---    (contents of buf+44) bytes of source file
Message_ArcwebRenderDone (msg_Arcweb__base + 3)
         buf+16  int    Message_ArcwebRenderDone (0x4A243)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
         buf+28  str0   error message
Message_ArcwebPostRequest (msg_Arcweb__base + 4)
         buf+16  int    Message_ArcwebPostRequest (0x4A244)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
         buf+28  int    RISC OS file handle (source)
         buf+32  int    RISC OS file handle (form)
  EITHER buf+36  str0   Uniform Resource Locator (if flags bit 22 clear)
  OR     buf+36  int    RISC OS file handle (URL) (if flags bit 22 set)
Message_ArcwebPostDone (msg_Arcweb__base + 5)
         buf+12  int    my_ref from Message_ArcwebPostRequest
         buf+16  int    Message_ArcwebPostDone (0x4A245)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
  EITHER buf+28  str0   Error message
  OR     buf+28  int    Zero=this item has an expiry date
         buf+32  int    Non-zero=use default expiry delay
         buf+36  date5  Expiry date in UTC (if buf+32 zero, and buf+28 zero)
Message_ArcwebEMailRequest (msg_Arcweb__base + 6)
         buf+16  int    Message_ArcwebEMailRequest (0x4A246)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
         buf+28  int    RISC OS file handle (form file containing e-mail)
Message_ArcwebEMailDone (msg_Arcweb__base + 7)
         buf+12  int    my_ref from Message_ArcwebEMailRequest
         buf+16  int    Message_ArcwebEMailDone (0x4A247)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
Message_ArcwebQuit (msg_Arcweb__base + 32)
         buf+16  int    Message_ArcwebQuit (0x4A260)
Message_ArcwebExpire (msg_Arcweb__base + 33)
         buf+16  int    Message_ArcwebExpire (0x4A261)
         buf+20  void*  Private Arcweb handle
         buf+24  int    flags
         buf+28  int    RISC OS file handle (source)
Message_ArcwebAbortRequest (msg_Arcweb__base + 34)
         buf+16  int    Message_ArcwebAbortRequest (0x4A262)
         buf+20  void*  Private Arcweb handle
         buf+24  int    reserved
         buf+28  str0   reason for abort (could log this)
Message_ArcwebTransferStatus (msg_Arcweb__base + 35)
         buf+16  int    Message_ArcwebTransferStatus (0x4A263)
         buf+20  void*  Private Arcweb handle from ..Request
         buf+24  int    reserved, must be zero
         buf+28  int    status flags
         buf+32  int    total size of data to be transmitted
         buf+36  int    size of data transmitted so far
         buf+40  int    total size of data to be received (if known)
         buf+44  int    size of data received so far
         buf+48  str0   current status of communication
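As an illustration of how one of these layouts maps onto a C structure, here is a sketch for Message_ArcwebTransferStatus. It is not the definition from requester.h; the type and function names are made up for this example. It assumes a full Wimp message block of 256 bytes, a block size rounded up to a whole number of words, and it stores the private handle as a 32-bit integer (a void* is 32 bits on the ARM) so that the offsets also hold when the sketch is compiled elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout for Message_ArcwebTransferStatus. */
typedef struct {
    int32_t  size;       /* buf+0  total size of the block in bytes       */
    int32_t  sender;     /* buf+4  filled in by the Wimp                  */
    int32_t  my_ref;     /* buf+8  filled in by the Wimp                  */
    int32_t  your_ref;   /* buf+12 my_ref of the Request, if replying     */
    int32_t  action;     /* buf+16 0x4A263                                */
    uint32_t handle;     /* buf+20 Private Arcweb handle from ..Request   */
    int32_t  reserved;   /* buf+24 must be zero                           */
    int32_t  status;     /* buf+28 status flags                           */
    int32_t  tx_total;   /* buf+32 total size of data to be transmitted   */
    int32_t  tx_done;    /* buf+36 size of data transmitted so far        */
    int32_t  rx_total;   /* buf+40 total size to be received, -1 unknown  */
    int32_t  rx_done;    /* buf+44 size of data received so far           */
    char     text[208];  /* buf+48 str0 current status of communication   */
} transfer_status_msg;

/* Fill in a status message for a reception in progress. Over-long text is
 * truncated to fit. Returns the block size, rounded up to a whole word. */
size_t make_transfer_status(transfer_status_msg *m, uint32_t handle,
                            int32_t status, int32_t rx_total, int32_t rx_done,
                            const char *text)
{
    size_t len;
    assert(m != NULL && text != NULL);
    memset(m, 0, sizeof *m);          /* also zeroes reserved and tx fields */
    len = strlen(text);
    if (len > sizeof m->text - 1)
        len = sizeof m->text - 1;
    memcpy(m->text, text, len);       /* memset already left a terminator */
    m->action   = 0x4A263;
    m->handle   = handle;
    m->status   = status;
    m->rx_total = rx_total;
    m->rx_done  = rx_done;
    m->size = (int32_t)((offsetof(transfer_status_msg, text) + len + 1 + 3)
                        & ~(size_t)3);
    return (size_t)m->size;
}
```

The caller would still need to set your_ref when the message doubles as the acknowledgement of a Request.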
Ownership of File Handles
=========================
Some people have expressed concern at the apparent capability to lose track
of which task 'owns' the RISC OS file handles passed in the messages. This
section will clarify this.
The task which sends the FetchRequest/RenderRequest/PostRequest message will
have called OS_Find to open the file, and this task (normally ArcWeb itself)
owns the returned file handle UNTIL it calls Wimp_Poll. At this point, the
Wimp will start delivering the Request message to each task in turn. While
the Wimp is doing this the file handle is in limbo (in effect, it is
'lost').
There are two possible continuations:
i) As soon as a task receives the Request message and decides that it wishes
to reply to it, it will either send the corresponding Done
message, OR just ack it with the intention of replying properly later. It
is upon the sending of this Done message, or Ack, (the call to
Wimp_SendMessage) that the task becomes the owner of the file handle, and
hence responsible for closing the file handle when it has finished with it.
ii) The other scenario is when *no* task wants to reply to the Request
message. In this case, the Wimp will deliver an event code 19 copy of the
message to the originator. The originator now regains ownership of the file
handle and is responsible for it (ArcWeb closes the file(s) and generates
an error if this is sensible).
In general, for any task, if it receives a Request message, it is guaranteed
to have exclusive use of the file handle until it next calls Wimp_Poll.
This system ensures that ownership of the file is guaranteed to be allocated
to one particular task in the system at any time, and that a task has to
'claim' ownership explicitly, as shown in (i) above.
Flags
=====
The Flags word contains several useful bits of information:
  bits 0-15   RISC OS file type/ArcWeb file type (see bits 16,30)
  bit 16      If set, bits 0-15 are ArcWeb file type, else RISC OS
  bits 17-19  Reserved (must preserve in replies)
  bit 20      If set, the user has disabled image fetching
  bit 21      If set, the email message is already complete
  bit 22      If set, an extended URL is in use
  bit 23      If set, ArcWeb will not display the results of the render
  bit 24      If set (+bit 31 set), ArcWeb will not raise an error
  bit 25      If clear, this is a 'page' fetch, else inline data
  bit 26      If set, ArcWeb will close the window which issued request
  bit 27      If set, extended URL has changed
  bit 28      If set, symbolic link request/confirm
  bit 29      If set, ArcWeb will not attempt to render the file
  bit 30      If set, bits 0-16 are valid, else they are invalid
  bit 31      If set, buffer contains an error message
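The bit assignments in the table can be written down as C masks. The macro and function names below are invented for this sketch and do not come from requester.h. The two helper functions simply restate the rules for bits 16 and 30 (file type) and bits 24 and 31 (error reporting).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the bits of the Flags word. */
#define AW_TYPE_MASK       0x0000FFFFu   /* bits 0-15: file type            */
#define AW_TYPE_IS_ARCWEB  (1u << 16)    /* type is ArcWeb, not RISC OS     */
#define AW_NO_IMAGES       (1u << 20)    /* user has disabled image fetches */
#define AW_EMAIL_COMPLETE  (1u << 21)
#define AW_EXTENDED_URL    (1u << 22)
#define AW_NO_DISPLAY      (1u << 23)
#define AW_ERROR_HANDLED   (1u << 24)    /* only meaningful with bit 31     */
#define AW_INLINE          (1u << 25)
#define AW_CLOSE_WINDOW    (1u << 26)
#define AW_URL_CHANGED     (1u << 27)
#define AW_SYMLINK         (1u << 28)
#define AW_NO_RENDER       (1u << 29)
#define AW_TYPE_VALID      (1u << 30)
#define AW_ERROR           (1u << 31)

/* Extract the file type. Returns 0 if bit 30 says the type is invalid,
 * otherwise 1 with *type and *is_arcweb_type filled in. */
int aw_file_type(uint32_t flags, uint32_t *type, int *is_arcweb_type)
{
    assert(type != 0 && is_arcweb_type != 0);
    if (!(flags & AW_TYPE_VALID))
        return 0;
    *type = flags & AW_TYPE_MASK;
    *is_arcweb_type = (flags & AW_TYPE_IS_ARCWEB) != 0;
    return 1;
}

/* An error is raised to the user only if bit 31 is set and bit 24 is clear. */
int aw_should_raise_error(uint32_t flags)
{
    return (flags & AW_ERROR) != 0 && (flags & AW_ERROR_HANDLED) == 0;
}
```

With these definitions the XBM value &10104 quoted later in this file decodes as ArcWeb file type &104, because bit 16 is set.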
The error bit (31) determines which of the options is in force in the two
messages above that offer a choice (FetchDone and PostDone). If the error
bit is set, ArcWeb will raise an error to
the user UNLESS bit 24 is also set (indicating that the other application
has already handled this - this is the preferred way of doing it).
When a request to fetch a new page is issued, ArcWeb remembers which window
(if any) launched the request. Once a successful RenderDone has been
delivered to the window, it will self-destruct if bit 26 is set. ArcWeb
will set this bit itself according to the user's choice, and so this bit
should always be preserved, unless you absolutely want to set it (see bit 23
below). That window is not allowed to launch further requests until the
whole transaction is completed. This blocked state is indicated to the user
by the web animation in the information window for a busy page.
If you find that bit 22 is set upon receipt of a Request involving a URL,
then the buffer will not contain a URL, but a file handle of a file that
does. The file will have been opened in R/W mode with OS_Find. You should
read the whole of the file to obtain the URL. It may be terminated by a
whitespace character, or by EOF. You must close the file as you would
normally with a ..Request message.
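Reading the URL out of an extended URL file might look something like the sketch below. It uses a C stdio stream to stand in for the RISC OS file handle, which is an assumption made purely to keep the example portable; a real fetcher would read through the handle it was given. The function name is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Read a URL from the start of the file into out. The URL ends at the
 * first whitespace character or at EOF. Returns the length of the URL,
 * or -1 if it does not fit in out_size bytes (including the terminator). */
long read_extended_url(FILE *f, char *out, size_t out_size)
{
    size_t n = 0;
    int c;
    assert(f != NULL && out != NULL && out_size > 0);
    rewind(f);                      /* do not trust the file pointer */
    while ((c = fgetc(f)) != EOF && !isspace(c)) {
        if (n + 1 >= out_size)
            return -1;              /* URL too long for the caller's buffer */
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return (long)n;
}
```

Since the whole point of the extended protocol is that URLs may be longer than a message block, the caller should size the buffer from the file length rather than from a fixed guess.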
If, during a conversation with an HTTP server, you receive a Location:
directive indicating a document relocation, then you should amend the
contents of the extended URL file to represent the new target URL. You must
set flag bit 27 in the Done message (which you don't send immediately - you
pretend as though you had received the new URL from the Request message).
The transparent obedience to Location: directives is not strictly required
(and the change cannot be signalled to ArcWeb if the extended URL protocol
is not being used), but is MOST HIGHLY ENCOURAGED.
If bit 23 is set in RenderDone, Arcweb will not display the results of the
render. This is to allow applications such as image renderers to open their
own window to display the picture. This must NOT be done if bit 25 is set
(as someone else is trying to render an inline image). If bit 25 is set,
the application must convert the image into a sprite file and store it in
the 'diagram' file. If you set bit 23, it is a good idea to clear bit 26 in
order not to confuse the user.
Upon receipt of a FetchRequest message, the application should check bit 28.
If bit 28 is set, then the application may opt to retrieve the file to its
own private directory and store the name of the file in the 'source' file.
If it declines to do this (or bit 28 is clear anyway), bit 28 should be
cleared on the response and the fetched file placed in 'source'.
If bit 21 is set on receipt of an EMailRequest message, then this indicates
that the form file contains a complete e-mail message ready to be mailed
off (excluding From: header). If bit 21 is clear, then this indicates that
the helper application needs to open some kind of editor to allow the user
to complete the e-mail. The editor should be initialised with the contents
of the form file. If it is going to do anything at all, then the helper
application must reply with an EMailDone message.
Bit 20 is used by ArcWeb to indicate to fetchers that the user has
specifically requested that images are not fetched and rendered. This
matters to the 'intelligent' HTTP fetchers mentioned at the start of this
document, which parse the HTML responses to their fetches and immediately
kick off the transfers for the inlined images before the render of the
document begins. If bit 20 is set, then the fetcher should not do this.
Message_ArcwebAbortRequest
==========================
Upon receipt of this message, the fetcher application retrieving a URL under
the given Arcweb Private handle, should terminate the fetch with an error.
It should acknowledge this message with Message_ArcwebFetchDone, after
setting the flags correctly indicating some kind of error. All active
fetches associated with the handle should be stopped. This allows
intelligent fetchers to stop any automatic fetches that they might have
issued, to fetch inlined images.
The FETCHER may choose to ignore the AbortRequest and
continue fetching data. The most obvious situation in which this is useful
is if the fetch is large and nearly complete anyway - say less than 1 second
estimated to completion. You must still acknowledge the AbortRequest with a
FetchDone, in the usual manner (ie. you can acknowledge the message first,
then send a reply later). The amount of time may be configurable in the
fetcher, but should not exceed a few seconds as this may confuse the user.
Message_ArcwebTransferStatus
============================
A fetcher which wishes to inform the user of the status of the communication
between itself and the remote information server (FTP, HTTP, Gopher, etc..)
may use this message to inform the browser of its progress. The 4 size
elements in the message may be used by the browser to create a graphical
representation of the communication status. Values of -1 should be used
for unknown quantities (0 is reserved for indicating no bytes yet
transferred or expected). The browser may choose to ignore these values and
rely on the description string, which should say something like:
    1000 bytes out of 2000 byte document read
    0 bytes out of 16384 bytes of inlined image read
    Looking up louis.ecs.soton.ac.uk ...
or whatever it feels is appropriate. Fetcher writers must recognise that
the browser may attempt to display these messages in an icon, and will hence
be issuing redraw requests (even if only via Wimp_SetIconState) for each
message received. The fetcher should provide a facility for disabling these
messages, and not send them too frequently - two or three times per second
is quite enough.
If you are going to provide byte counts as in the examples above, I HIGHLY
RECOMMEND using OS_Convert(Fixed)FileSize to give a useful display. It
isn't terribly useful to know that 65539 bytes out of 268435456 bytes have
been received. Far better, even if slightly inaccurate, to say '64K out of
256M' and maybe show the percentage that represents.
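On RISC OS the OS_Convert(Fixed)FileSize calls do this job. Purely to illustrate the rounding involved, here is a portable C sketch with hypothetical function names. Its output format is chosen to match the '64K out of 256M' example above and is not claimed to match what the OS calls produce.

```c
#include <assert.h>
#include <stdio.h>

/* Write a short, rounded-down description of a byte count into out,
 * e.g. "64K" or "256M". Counts below 1024 are written out in full. */
void format_size(unsigned long bytes, char *out, size_t out_size)
{
    static const char units[] = { 'K', 'M', 'G' };
    int u = -1;
    assert(out != NULL && out_size > 0);
    while (bytes >= 1024 && u < 2) {
        bytes /= 1024;              /* inaccurate on purpose: rounds down */
        u++;
    }
    if (u < 0)
        snprintf(out, out_size, "%lu bytes", bytes);
    else
        snprintf(out, out_size, "%lu%c", bytes, units[u]);
}

/* Percentage of the transfer completed, or -1 if the total is unknown. */
int percent_done(long done, long total)
{
    if (total <= 0 || done < 0)
        return -1;
    return (int)(((long long)done * 100) / total);
}
```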
Returned messages should typically be less than about 64 characters long.
The browser may ignore these messages, and so the fetcher should not send
them recorded delivery (Wimp event type 18).
The flags for Message_ArcwebTransferStatus:
  bit 0        If set, transmission is in progress
  bit 1        If set, reception is in progress
  bit 2        If set, transmission is complete
  bit 3        If set, reception is complete
  bit 4        If set, other comms. are in progress eg. DNS lookup
  bits 5 - 31  Reserved
Other Notes
===========
You must preserve the Arcweb Private Handle. This is a private pointer to a
part of Arcweb's application memory used to distribute the replies which
Arcweb receives.
When specifying an expiry date in FetchDone messages, if you intend to
specify a date rather than allow the default to be used, then you must
specify this as an exact date, ie. not relative to the current time: you
must add it up yourself. HTTP servers may supply expiry dates on documents
which they return. These should be honoured (see documentation of the
Territory module's handling of Time & Date).
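To show what 'adding it up yourself' amounts to, the following sketch builds an absolute 5 byte date from a time in seconds plus a lifetime. It relies on the RISC OS 5 byte UTC format being a count of centiseconds since 00:00 on 1 January 1900, stored low byte first. It also assumes, for portability, that the starting time is a Unix-style count of seconds since 1970; a native fetcher would more likely start from the value returned by OS_Word 14,3. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Seconds between 1 Jan 1900 and 1 Jan 1970. */
#define SECONDS_1900_TO_1970 2208988800ULL

/* Convert seconds since 1970 (assumed) to a 5 byte centisecond count
 * since 1900, low byte first. */
void unix_to_date5(time_t t, unsigned char out[5])
{
    uint64_t cs;
    int i;
    assert(t >= 0);
    cs = ((uint64_t)t + SECONDS_1900_TO_1970) * 100u;
    for (i = 0; i < 5; i++) {
        out[i] = (unsigned char)(cs & 0xFF);
        cs >>= 8;
    }
}

/* Absolute expiry date: the given time plus a lifetime in seconds. */
void expiry_date5(time_t now, long lifetime_seconds, unsigned char out[5])
{
    unix_to_date5(now + lifetime_seconds, out);
}
```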
The 'source', 'temporary', 'diagram' and 'link' words in the FetchRequest
and RenderRequest messages are RISC OS file handles to the appropriate files
in Arcweb's cache. If you acknowledge either of the Request messages, YOU
are responsible for closing the files (ie. passing the handle to OS_Find 0).
You must close ALL the file handles in the message even if you don't use
them all. The reason for passing handles around is that filenames won't
necessarily fit in the message buffer (particularly when you want to pass 4
of them). If just having the handle is inconvenient, you should use OS_Args
7 to convert the handle into a filename (as !ArcwebLcl does), close the file
and reopen it how you want to do it. BUT BEWARE: only do this if you are
going to acknowledge the message (otherwise other recipients of the message
will find that they have an illegal file handle in the message). For
FetchRequest, 'source' is opened as read/write (OS_Find 0x83 - [I always
thought that was for output only, but my RO3 PRMs say otherwise]). For
RenderRequest, 'source' is opened read only (OS_Find 0x43), and the others
are read/write (OS_Find 0x83). For PostRequests, the source is as for
FetchRequests, and the form file is read only. For EMailRequests, the form
file is read-only.
[An example of using OS_Args 7 in BASIC is in !ArcwebLcl which uses OS_CLI
to issue a straight *COPY command when the symlink bit is clear]
You do not need to fill in the file type bits (0-15, 16 and 30) when sending
the FetchDone message, but it will assist the renderers in deciding what to
do with the document if you do. The most common types returned are plain
text (&FFF), HTML (&FAF), GIF (&695), JPEG (&C85) and XBM (&10104).
WARNING: Most HTTP servers are configured with a default type to return in
the case that they cannot work out the type of a file from its name or a
specific directive telling them the file type. This default type is usually
text/plain. This is a problem when you wish to store binary files (eg.
arcweb.arc) on HTTP servers, as louis.ecs.soton.ac.uk's server will tell you
that this is a text/plain file, when obviously it isn't. ArcWeb 0.12 and
later will examine the data in the RenderRequest message* to ensure that it
does not contain non-printable characters before attempting to render it.
* see renderer protocol discussion below
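A check of this kind could be sketched as below. This is not ArcWeb's actual test, which is not described here. The definition of 'printable' is an assumption: tab, line feed, form feed and carriage return are allowed, other control characters and DEL are rejected, and top-bit-set characters are accepted as text.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every byte looks like text, 0 as soon as one does not. */
int looks_like_text(const unsigned char *data, size_t len)
{
    size_t i;
    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        unsigned char c = data[i];
        if (c == '\t' || c == '\n' || c == '\f' || c == '\r')
            continue;               /* ordinary layout characters */
        if (c < 32 || c == 127)
            return 0;               /* control character: treat as binary */
    }
    return 1;
}
```

A renderer could apply the same check to the bytes at buf+48 before trusting a text/plain type.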
Renderer Protocol
=================
The Renderer Protocol is used to transform a file into something which can
be viewed in the Arcweb window. Arcweb generates a DrawFile (type &AFF)
to represent the HTML document. I have written my own parser to read the
HTML file (conforms with the HTML 2.0 specification, 13 June 1994) and
display it. Renderers should convert images into sprites and store them in
the 'diagram' file (but only if it is an inline data fetch - see discussion
of bits 23 and 25 above). ArcWeb will be expecting a sprite file to be
there which it can then import. It will take the first sprite out of the
file. Arcweb will handle RISC OS sprite files (&FF9) itself (although it
will broadcast the message with the expectation of receiving it back). You
should not acknowledge render requests for files of type &FF9.
Renderers should inspect the file type bits in the RenderRequest to see
whether the file is of a type which they can handle. If the type is not
known* (bit 30 is clear), then the renderer may inspect the first few bytes
of the file. Before sending the RenderRequest message, Arcweb will have
loaded in as much of the file as possible to the message (to fill it up to
256 bytes). The amount of data in the message is in buf+44, and the data
starts at buf+48. Thus up to 208 bytes of data will be present (currently
it is 208 unless the file is not that long, in which case the whole file is
there). If 208 bytes is not enough, the renderer may use the file handle
given in the message to read more of the file in. The file pointer should
be reset to zero if it decides that it can't handle it. As a safeguard,
Renderers MUST NOT assume that the file pointer will be zero, and should set
it explicitly.
* The renderer may decide that it doesn't 'trust' the given file type. If
that is the case, then it is allowed to ignore bit 30 and attempt to
recognise the file type from the data in the message buffer and act on that
alone.
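As a sketch of recognising a file from the data in the message buffer, the code below checks for the GIF signatures ("GIF87a" and "GIF89a") and the two byte JPEG start marker (&FF &D8), and returns the RISC OS file types quoted earlier in this file. The function names are hypothetical, and the clamp to 208 bytes follows the limit described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TYPE_UNKNOWN = -1, TYPE_GIF = 0x695, TYPE_JPEG = 0xC85 };

/* Guess a file type from its first few bytes. */
int sniff_type(const unsigned char *data, size_t len)
{
    assert(data != NULL);
    if (len >= 6 && (memcmp(data, "GIF87a", 6) == 0 ||
                     memcmp(data, "GIF89a", 6) == 0))
        return TYPE_GIF;
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xD8)
        return TYPE_JPEG;
    return TYPE_UNKNOWN;
}

/* Apply sniff_type to a RenderRequest: length at buf+44, data at buf+48. */
int sniff_render_request(const unsigned char *buf)
{
    int32_t n;
    assert(buf != NULL);
    memcpy(&n, buf + 44, sizeof n);
    if (n < 0)
        n = 0;
    if (n > 208)
        n = 208;                    /* never read past the 256 byte block */
    return sniff_type(buf + 48, (size_t)n);
}
```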
Quit Protocol
=============
Fetchers and Renderers should have a configurable option to allow them to
quit when Arcweb dies. This provides a clean way of shutting Arcweb and all
its related programs down in one go. Upon receipt of a Message_ArcwebQuit
message, store the task handle (in buf+4) and wait for the Wimp task
closedown message (Message_TaskCloseDown 0x400C3) to arrive with the given
task handle. [To prevent Arcweb closing down, acknowledge the
Message_ArcwebQuit message - not in version 0.08. Arcweb will exit anyway.]
Using an extra message avoids the need for the auxiliary apps to
look up the task name from the handle to check whether it was Arcweb or not,
and this also guards against the task name changing.
Do not acknowledge Message_ArcwebQuit unless you wish to prevent the
shutdown (treat it like you would treat Message_PreQuit)
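The bookkeeping this needs is small. The sketch below uses invented names and assumes that the task handle of interest is the sender field (buf+4) of both messages. It remembers the handle from Message_ArcwebQuit and reports when the matching Message_TaskCloseDown arrives.

```c
#include <assert.h>
#include <stdint.h>

enum { MESSAGE_ARCWEB_QUIT = 0x4A260, MESSAGE_TASK_CLOSE_DOWN = 0x400C3 };

/* Hypothetical state kept by a fetcher or renderer. */
typedef struct {
    int     have_handle;   /* non-zero once Message_ArcwebQuit was seen */
    int32_t arcweb_task;   /* task handle taken from that message       */
} quit_tracker;

/* Feed every incoming message through this function.
 * Returns 1 when this task should now exit, otherwise 0. */
int quit_tracker_message(quit_tracker *qt, int32_t action, int32_t sender)
{
    assert(qt != 0);
    if (action == MESSAGE_ARCWEB_QUIT) {
        qt->have_handle = 1;
        qt->arcweb_task = sender;
        return 0;                   /* do not acknowledge, just remember */
    }
    if (action == MESSAGE_TASK_CLOSE_DOWN &&
        qt->have_handle && sender == qt->arcweb_task)
        return 1;
    return 0;
}
```

The return value would only be acted upon if the user has enabled the 'quit with Arcweb' option.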
Expire Protocol
===============
If a fetcher has used a symbolic link when fetching a file then that file
should be expired upon receipt of Message_ArcwebExpire. Arcweb will pass you
a handle to the file in the cache. If the symbolic link bit (bit 28) is set
in the flags, then the file to be removed is really the file whose name is
in the file whose handle you have been given. !ArcwebLcl ignores this
message completely (as it should). This protocol is not fixed and should
not yet be used (there is a particular problem with extended URLs), but
fetcher authors should be aware of its future existence. ArcWeb does not
currently send Message_ArcwebExpire messages.
Poster Protocol
===============
This protocol will be used when ArcWeb wants to send information to a URL
rather than retrieve it, ie. when an HTML form with a method of POST has
been completed and the SUBMIT button has been pressed. If the form did not
override the default (GET) or chose the GET method, then the FetchRequest
protocol is used, and the fetcher is not aware of the fact that the data
transfer has happened.
PostRequest should be treated exactly like Message_ArcwebFetchRequest,
except that the 'form' file handle is a RISC OS file handle pointing to a
read-only file containing data to tag onto the end of the request.
Obviously, the fetcher should issue a POST command to the HTTP server
instead of a GET. The data in the form file will be suitable for tacking on
IMMEDIATELY after the POST and other usual headers, as it will contain a
Content-Length: header followed by a blank line, followed by the form data
with all the appropriate escape characters already substituted in. This
data should be passed transparently to the HTTP server. The fetcher should
respond to ArcWeb with a Message_ArcwebPostDone in exactly the same format
as a response to Message_ArcwebFetchRequest. The buffer for PostDone is the
same as the buffer for FetchDone. This is because the target script on the
server responds with a new virtual document, usually confirming that the
data entry was successful/failed etc.
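Assembling the request could be sketched as follows. The function name, the HTTP/1.0 request line and the CRLF line endings are assumptions made for the example; the only thing taken from the description above is that the form data is appended untouched, directly after the fetcher's own headers, with no blank line added in between.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "POST <path> HTTP/1.0", then the fetcher's usual headers, then the
 * form file contents exactly as supplied. headers must end with a line
 * terminator but must NOT include the blank line, because the form data
 * supplies Content-Length:, the blank line and the body itself.
 * Returns the total length, or -1 if the result does not fit in out. */
long build_post_request(const char *path, const char *headers,
                        const char *form, size_t form_len,
                        char *out, size_t out_size)
{
    int n;
    assert(path != NULL && headers != NULL && form != NULL && out != NULL);
    n = snprintf(out, out_size, "POST %s HTTP/1.0\r\n%s", path, headers);
    if (n < 0 || (size_t)n >= out_size || form_len > out_size - (size_t)n)
        return -1;
    memcpy(out + n, form, form_len);    /* passed through transparently */
    return (long)((size_t)n + form_len);
}
```

The result is a counted block rather than a C string, so the caller should send exactly the returned number of bytes.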
EMail Protocol
==============
This protocol will be used when ArcWeb wants to send some e-mail. ArcWeb
will send this message when the user attempts to follow a mailto: hypertext
link, or when a form with a mailto: method is submitted. In the former
case, the message remains to be constructed, so the contents of the form
file should be taken as the initial text to be placed in the editor. ArcWeb
guarantees to make the first line of the file the To: header. Furthermore,
bit 21 will be clear indicating that the message is not yet complete. In
the latter case (the invoked form), bit 21 will be set, indicating a
complete message is in the buffer and ready for mailing. ArcWeb will still
guarantee that the first line in the form file is the To: header, but will
also add an X-Mailer: header, a Date: header and any other headers as
directed by MH tags in the form (eg. Subject: headers)
Upon receipt of the EMailRequest, the e-mail application should take over
the process of sending e-mail and return an EMailDone, after taking a copy
of the form file contents and closing the form file. ArcWeb is not
concerned with what happens to the e-mail next: as far as it is concerned,
the email is sent.
There is no virtual document returned from an EMailRequest. The e-mail
application may flag an error in the usual way (and choose whether to report
the error itself or not). ArcWeb will open a dialogue box indicating the
status of the e-mail, rather than generating a complete new page.
==END==