%X Proteins are sequences of amino acids that fold into secondary and tertiary structure, which plays an important role in their function. As biologists have yet to discover the rules that govern how a protein folds in nature from its underlying sequence, this thesis tries a new approach to secondary structure prediction using dynamic programming on the input protein sequence. The sequence is broken into short words, where each word has a probability of folding into each of the three types of secondary structure. By combining word probabilities with an abstraction called contexts, which model a run of the same secondary structure type up to a bounded length, the optimal prediction for an entire sequence can be computed via dynamic programming. The structure probabilities for words are learned from a training set of sequences with known secondary structure using linear programming. The combined approach to prediction using linear and dynamic programming achieves high accuracy on protein sequences whose words were observed in the training set, but is far less accurate on sequences containing words not seen in the training set. The challenge for future work lies in interpolating probabilities for unobserved words to achieve improved generalization.
%K keywords
%Y
%A Huang, Huilong
%T Efficient Routing in Wireless Ad Hoc Networks
%D August 12, 2008
%Z Mon, 03 Jan 08 00:00:00 GMT
%R TR08-05
%I The Department of Computer Science, University of Arizona
%X We describe a new file system that simultaneously provides
both name-based and content-based access to files. To make this possible,
we introduce the concept of a semantic directory. Every
semantic directory has a query associated with it. When a user
creates a semantic directory, the file system automatically creates
a set of pointers to the files in the file system that satisfy
the query associated with the directory. This set of pointers is
called the query-result of the directory. To access the files
that satisfy the query, users just need to de-reference the
appropriate pointers. Users can also create files and sub-directories
within semantic directories in the usual way. Hence, users can
organize files in a hierarchy and access them by specifying path names,
and, at the same time, retrieve files by posing queries that
describe their content.
Our file system also provides facilities for query-refinement and customization. When a user creates a new semantic sub-directory within a semantic directory, the file system ensures that the query-result of the sub-directory is a subset of the query-result of its parent. Hence, users can create a hierarchy of semantic directories to refine their queries. Users can also edit the set of pointers in a semantic directory, and thereby modify its query-result without modifying its query or the files in the file system. In this way, users can customize the results of queries according to their personal tastes, and use customized results to refine queries in the future. That is, users do not have to depend solely on the query language to achieve these objectives.
Our file system has many other features, including semantic mount-points that allow users to access information in other file systems by content. The file system does not depend on the query language used for content-based access. Hence, it is possible to integrate any content-based access mechanism into our file system.
%K dissertation
%Y
%A Coffman, E.G., Jr.
%A Downey, Peter
%A Winkler, Peter
%T Packing Rectangles in a Strip
%D April 8, 1997
%Z Wed, 08 Jan 97 00:00:00 GMT
%R TR97-04
%I The Department of Computer Science, University of Arizona