The Content Manager Service (CMS) allows JXTA applications to share and retrieve content within a peer group. Each item of shared content is represented by a unique content id and a content advertisement, which provides meta-information about the content such as its name, length, MIME type, and description. The CMS also provides a protocol based on JXTA pipes for transferring content between peers. Unlike some other P2P systems, peers running CMS are not required to use HTTP in order to exchange content.
The CMS is a JXTA service that supports the sharing and retrieval of content within a peer group. The CMS manages the shared content for a local peer, and allows applications to browse and download content from remote peers.
Each piece of shared content is referenced by a unique content identifier, using a 128-bit MD5 checksum generated from the content data. The advantage of using MD5 for the content identifier is that typically in a file sharing system content gets downloaded and reshared by many peers on the network. By using MD5, it is easy to determine if two files shared by different peers are the same rather than relying on the content name or description. Thus, it should be possible to search for any content by its MD5 id so that a peer which has the content can be found closest to the requesting peer.
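As an illustration of how such an identifier could be derived, here is a minimal Java sketch that hashes the content bytes with the standard java.security.MessageDigest class. The class name ContentIds and the method md5ContentId are hypothetical and are not part of the CMS API; the "md5:" prefix follows the form of the cid field in the example advertisement.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the CMS API.
final class ContentIds {
    private ContentIds() {}

    /** Returns "md5:" followed by the 32-digit lowercase hex MD5 of the data. */
    static String md5ContentId(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder("md5:");
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes format as two hex digits.
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

Two peers sharing byte-identical files obtain the same id regardless of what each peer names the file, which is the property the paragraph above relies on.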
In addition, each shared content item has an associated content advertisement which provides meta-information describing the content, including the content name, length, MIME type, id, and description. If the content name and MIME type are not specified explicitly when the content is shared, then a default name and media type based on the content file name will be chosen. There is no default content description. Content advertisements are stored as XML documents. Here is an example advertisement:
<?xml version="1.0"?>
<!DOCTYPE jxta:contentAdvertisement>
<jxta:contentAdvertisement>
  <name>index.html</name>
  <cid>md5:1a8baf7ab82c8fee8fe2a2d9e7ecb7a83</cid>
  <type>text/html</type>
  <length>23983</length>
  <description>Web site index</description>
</jxta:contentAdvertisement>
The cid (content id) field contains the unique 128-bit MD5 checksum of the content. This content id is used when requesting the content data. Both the name and cid fields are mandatory, but all other fields are optional in a content advertisement.
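The mandatory and optional fields can be made concrete with a small value class. The following Java sketch is only an illustration: the class ContentAdvertisementSketch and its toXml method are hypothetical, it does not use the real JXTA advertisement classes, and it omits the XML declaration and doctype line for brevity.

```java
import java.util.Objects;

// Hypothetical value class; the real CMS advertisement type may differ.
final class ContentAdvertisementSketch {
    final String name;        // mandatory
    final String cid;         // mandatory
    final String type;        // optional, null when absent
    final Long length;        // optional, null when absent
    final String description; // optional, null when absent

    ContentAdvertisementSketch(String name, String cid, String type,
                               Long length, String description) {
        this.name = Objects.requireNonNull(name, "name is mandatory");
        this.cid = Objects.requireNonNull(cid, "cid is mandatory");
        this.type = type;
        this.length = length;
        this.description = description;
    }

    /** Renders the advertisement body, skipping optional fields that are absent. */
    String toXml() {
        StringBuilder sb = new StringBuilder("<jxta:contentAdvertisement>");
        element(sb, "name", name);
        element(sb, "cid", cid);
        if (type != null) element(sb, "type", type);
        if (length != null) element(sb, "length", length.toString());
        if (description != null) element(sb, "description", description);
        return sb.append("</jxta:contentAdvertisement>").toString();
    }

    private static void element(StringBuilder sb, String tag, String value) {
        sb.append('<').append(tag).append('>')
          .append(escape(value))
          .append("</").append(tag).append('>');
    }

    // Minimal escaping for element text; '&' must be replaced first.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Rejecting a missing name or cid in the constructor means an invalid advertisement cannot be built in the first place, rather than being detected later by a receiving peer.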
The CMS interface currently allows applications to share content stored on the local file system. The CMS manages a persistent store which includes references to the locally shared files as well as their associated advertisements. The persistent store maintains only references to shared files rather than copies of their contents. This can save significant disk space when sharing large media files such as MP3s, since the files do not need to be copied. When content is shared, its MD5 checksum is computed and stored along with the reference to the actual content. When the content is subsequently retrieved by another peer, it is verified against the stored checksum to make sure that it has not changed since it was last shared.
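The reference-plus-checksum idea can be sketched as follows. This Java example is an assumption-laden illustration, not the CMS implementation: the class SharedContentStoreSketch is hypothetical, the store is an in-memory map rather than a persistent one, and files are read fully into memory for simplicity.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the CMS persistent store.
final class SharedContentStoreSketch {
    // Maps content id to the location of the file; the file is never copied.
    private final Map<String, Path> pathsByContentId = new HashMap<>();

    /** Records a reference to the file and returns its content id. */
    String share(Path file) throws IOException {
        String cid = contentId(Files.readAllBytes(file));
        pathsByContentId.put(cid, file.toAbsolutePath());
        return cid;
    }

    /** Reads the shared file and fails if it no longer matches its content id. */
    byte[] retrieve(String cid) throws IOException {
        Path file = pathsByContentId.get(cid);
        if (file == null) {
            throw new IOException("no content shared under " + cid);
        }
        byte[] data = Files.readAllBytes(file);
        if (!contentId(data).equals(cid)) {
            throw new IOException("content changed since it was shared: " + file);
        }
        return data;
    }

    private static String contentId(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            // %032x left-pads with zeros so the id always has 32 hex digits.
            return "md5:" + String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

A real store would persist the map across restarts and would probably stream large files through the digest instead of loading them whole.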
The CMS uses JXTA pipes for remote content request and retrieval. Each instance of CMS manages a single input pipe for receiving both content requests and responses. Request and response pipe advertisements are passed in each CMS message, so once the initial content request pipe advertisement is discovered for a peer, subsequent pipe advertisements can be obtained from the messages themselves. This allows CMS implementations the option of utilizing separate pipes for handling different message types, since the initial pipe is only needed to send the first request.
CMS messages are encoded as JXTA pipe messages using tags to separate each field. A LIST_REQ request message is sent to a peer to obtain the list of content shared by the peer. The peer will respond with a LIST_RES response message including one or more content advertisements for the content shared by the peer. The requesting peer can then send a GET_REQ request message using the content id contained in the advertisement in order to fetch the content. The responding peer will then send one or more GET_RES response messages including the data for the content requested. The CMS does not specify the routing of content search requests. Instead, it relies on another distributed search mechanism such as JXTA Search to provide this support. The CMS itself only supports local searching of content.
The following is a more detailed description of each CMS message type, as well as the individual message tags.
A LIST_REQ request message is sent to a peer to obtain the list of content advertisements for all the content shared by the peer. Each LIST_REQ message includes a unique request id, the input pipe advertisement for sending LIST_RES responses, and an optional search string for filtering the results. The following is a description of the LIST_REQ message tags:
After receiving a LIST_REQ message, the peer will send a single LIST_RES response message containing one or more advertisements for the content that is shared by the peer. If a QSUBSTR tag was present in the LIST_REQ message, only the advertisements whose content name contains the specified search string will be returned.
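The filtering rule can be sketched in a few lines of Java. The class ListRequestFilterSketch and its filter method are hypothetical, and the case-sensitive match is an assumption, since the text does not say how case is treated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing the QSUBSTR filtering rule on content names.
final class ListRequestFilterSketch {
    private ListRequestFilterSketch() {}

    /**
     * Returns the names containing the search string, or every name when the
     * search string is null (no QSUBSTR tag in the request).
     * Assumption: the match is case-sensitive.
     */
    static List<String> filter(List<String> contentNames, String searchSubstring) {
        List<String> result = new ArrayList<>();
        for (String name : contentNames) {
            if (searchSubstring == null || name.contains(searchSubstring)) {
                result.add(name);
            }
        }
        return result;
    }
}
```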
Each LIST_RES message includes the request id of the corresponding LIST_REQ message, the input pipe advertisement for sending content requests, and one or more content advertisements. Here is a description of the LIST_RES tags:
The requesting peer can then send a GET_REQ message to download the content data, using the request input pipe specified in the LIST_RES message. Each GET_REQ message includes a unique request id, the id of the content requested, and the input pipe advertisement for sending content response messages. The following are the GET_REQ message tags:
After receiving a GET_REQ message, the responding peer will first check if it is sharing content for the specified content id. If the content was found, the peer will then respond by sending one or more GET_RES messages to the response input pipe specified in the GET_REQ message. Since JXTA pipes impose limitations on the maximum size of a JXTA message, content is transferred using multiple GET_RES messages, each of which contains a block of bytes within the content data. The requesting peer must wait until all data blocks have been received for the content, which can be verified against the total length of the content specified in the content advertisement.
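The block-wise transfer can be illustrated with a short Java sketch that cuts a byte array into offset-tagged blocks. The class ContentBlocksSketch, its nested Block type, and the maxBlockSize parameter are hypothetical; the actual block size used by CMS and the encoding of blocks into GET_RES messages are not described here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of splitting content into GET_RES-sized blocks.
final class ContentBlocksSketch {
    private ContentBlocksSketch() {}

    /** One block of content bytes together with its offset in the content. */
    static final class Block {
        final int offset;
        final byte[] data;

        Block(int offset, byte[] data) {
            this.offset = offset;
            this.data = data;
        }

        int length() {
            return data.length;
        }
    }

    /** Splits content into consecutive blocks of at most maxBlockSize bytes. */
    static List<Block> split(byte[] content, int maxBlockSize) {
        if (maxBlockSize <= 0) {
            throw new IllegalArgumentException("maxBlockSize must be positive");
        }
        List<Block> blocks = new ArrayList<>();
        int offset = 0;
        while (offset < content.length) {
            // The last block may be shorter than maxBlockSize.
            int len = Math.min(maxBlockSize, content.length - offset);
            blocks.add(new Block(offset, Arrays.copyOfRange(content, offset, offset + len)));
            offset += len;
        }
        return blocks;
    }
}
```

The block lengths always sum to the content length, which is what lets the requesting peer check completeness against the length given in the content advertisement.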
Each GET_RES response message includes a block of content bytes, the number of bytes in the block, and the offset of the block. The request id of the corresponding GET_REQ message is also included. Here are the GET_RES message tags:
Since CMS messages are sent using JXTA pipes, a peer receiving GET_RES messages cannot assume that the messages will be received in the order sent, and must also be prepared to deal with dropped messages. For this reason, the interface provided by CMS for content retrieval allows the specification of a file into which to store the retrieved content. The blocks of the file will be filled in as the GET_RES messages are received. An application can register a listener to be notified when the content has been fully retrieved. If the application does not receive such notification within some specified period of time, then it should assume that some messages have been lost and should abort the transfer. At this time, it is not possible to request only a range of content bytes, so all of the content will have to be requested again.
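Filling in a file from blocks that arrive in any order can be sketched with java.io.RandomAccessFile. The class BlockAssemblerSketch below is hypothetical and is not the CMS retrieval interface; it tracks received bytes in a BitSet, so it assumes content smaller than 2 GB, and it leaves listeners and timeouts to the caller.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.BitSet;

// Hypothetical sketch: writes out-of-order blocks into a file and
// reports when every byte of the content has been received.
final class BlockAssemblerSketch implements AutoCloseable {
    private final RandomAccessFile file;
    private final int totalLength;
    private final BitSet received; // one bit per content byte

    BlockAssemblerSketch(Path target, int totalLength) throws IOException {
        this.file = new RandomAccessFile(target.toFile(), "rw");
        this.file.setLength(totalLength);
        this.totalLength = totalLength;
        this.received = new BitSet(totalLength);
    }

    /** Stores one block at its offset; returns true once the content is complete. */
    boolean accept(int offset, byte[] block) throws IOException {
        if (offset < 0 || block.length > totalLength - offset) {
            throw new IllegalArgumentException("block outside content bounds");
        }
        file.seek(offset);
        file.write(block);
        // Duplicate or overlapping blocks simply set the same bits again.
        received.set(offset, offset + block.length);
        return isComplete();
    }

    boolean isComplete() {
        return received.cardinality() == totalLength;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Because completeness is tracked per byte rather than per message, a retransmitted block does not cause the transfer to be reported as complete early.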
Several enhancements are currently in the planning stages for CMS. Here is a partial list of some proposed enhancements.
Currently, there is no support for automatic propagation of content requests, so a requesting peer will need to send content requests directly to each peer. A better search strategy could be implemented using a distributed model, where peers are configured in a "mesh" and search requests are propagated between them. The CMS already supports most of what is required to implement this: each LIST_REQ and GET_REQ message includes the input pipe advertisement for sending responses, so peers can forward requests on behalf of other peers and the pipe id will still resolve to the original requesting peer's pipe. An implementation would have to include some sort of TTL (time-to-live) in order to set an upper bound on the number of nodes to which a search request can be forwarded.
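Since this enhancement is only proposed, the following Java sketch is purely speculative. It simulates peers in memory rather than over JXTA pipes, treats the TTL as a count of remaining hops, and uses the request id to suppress duplicates; the class MeshPeerSketch and all of its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Speculative in-memory simulation of TTL-limited request forwarding.
final class MeshPeerSketch {
    final String name;
    final List<MeshPeerSketch> neighbors = new ArrayList<>();
    private final Set<String> seenRequestIds = new HashSet<>();

    MeshPeerSketch(String name) {
        this.name = name;
    }

    /** Links two peers in both directions. */
    static void connect(MeshPeerSketch a, MeshPeerSketch b) {
        a.neighbors.add(b);
        b.neighbors.add(a);
    }

    /**
     * Handles a request and forwards it while hops remain.
     * Each peer that handles the request appends its name to visited.
     */
    void handle(String requestId, int ttl, List<String> visited) {
        if (!seenRequestIds.add(requestId)) {
            return; // already handled: prevents loops in the mesh
        }
        visited.add(name);
        if (ttl <= 0) {
            return; // no hops left, do not forward
        }
        for (MeshPeerSketch neighbor : neighbors) {
            neighbor.handle(requestId, ttl - 1, visited);
        }
    }
}
```

Without the check on request ids, a cycle in the mesh would cause the same request to be handled repeatedly until the TTL ran out. One limitation of this depth-first sketch is that a peer first reached over a long path will not forward again when the same request later arrives over a shorter one.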
It should be possible to allow a peer to share dynamic as well as static content. At least in Java, the CMS could support the sharing of "content request handlers" which would be much like servlets. Additional user-defined parameters could be included in each GET_REQ message that would be passed to the content handler when making a request. For dynamic content, a different type of content id would have to be chosen since MD5 only really applies to static content. Some sort of generated UUID (Universally Unique Identifier) seems an appropriate choice for dynamically generated content.
It is not always necessary to store the content and associated advertisement on the same peer. A peer should be able to store advertisements on behalf of another peer, which would allow greater flexibility in building more efficient search mechanisms. For example, a peer might be able to cache the advertisements for another peer based on search query results. To support this, the input pipe advertisement for sending content requests would be included in the content advertisement, rather than sent in the LIST_RES message. This would allow advertisements to refer to other peers sharing content.
JXTA Search could be used to search content advertisements. The CMS would register its content advertisements with a JXTA Search node. When a JXTA Search node gets a request, the request would be forwarded to the appropriate CMS peer, which would then handle the local search. After receiving the results, a requesting peer could then retrieve the content directly from the peer with a content request message. This would allow applications to use JXTA Search to locate content, but then use CMS itself to transfer the content.