home *** CD-ROM | disk | FTP | other *** search
- .sh 1 "POSTQUEL" 2
- .pp
- The next two subsections will motivate the different issues that the
- optimizer must consider through several examples.
- Then, it will indicate how these new features affect the optimizer.
- .sh 2 "An Extendible Type System"
- .pp
- One of the features that POSTGRES supports are abstract data types
- (ADTs).
- An ADT facility allow users to define their own data types,
- simplifying representation of complex information.
- For example, a user who must store box coordinates in his database
- can define a data type called \*(lqboxcoordinates.\*(rq
- From here, he can define a relation
- BOX with a coordinates field of type \*(lqboxcoordinates.\*(rq
- The unit square box shown in figure 2.1,
- therefore, would be represented as shown in figure 2.2.
- This is in contrast to figure 2.3.
- .(z
- .hl
- .GS
- file figure2.1
- .GE
- .sp
- .ce 2
- \fBFigure 2.1\fR
- .sp
- .sz \n(ep
- .TS
- center;
- |c||c|c|
- c||c|c|.
- _
- BOX boxid coordinates
- _
- 1 ((-1,0), (0,1), (1,0), (0,-1))
- _ _
- .TE
- .sp
- .ce 2
- .sz \n(pp
- \fBFigure 2.2\fR
- Relation with user-defined type \*(lqboxcoordinates\*(rq
- .sp 2
- .sz \n(ep
- .TS
- center;
- |c||c|c|c|c|c|c|c|c|c|
- c||c|c|c|c|c|c|c|c|c|.
- _
- BOX boxid x1 y1 x2 y2 x3 y3 x4 y4
- _
- 1 -1 0 0 1 1 0 0 -1
- _ _ _ _ _ _ _ _ _
- .TE
- .sz \n(pp
- .sp
- .ce 2
- \fBFigure 2.3\fR
- Relation without user-defined types
- .hl
- .)z
- .pp
- POSTGRES also allows users to define operators
- to be used in conjunction with user-defined data types.
- By defining an operator \*(lqAREA,\*(rq a query to compute the area of the
- above box would be expressed as:
- .(l
- \fBretrieve\fR (a = AREA(BOX.coordinates)) \fBwhere\fR BOX.boxid = 1
- .)l
- rather than:
- .(l
- \fBretrieve\fR (a = sqrt (sqr (BOX.x1 - BOX.x2) + sqr (BOX.y1 - BOX.y2)) +
- sqrt (sqr (BOX.x1 - BOX.x4) + sqr (BOX.y1 - BOX.y4)))
- \fBwhere\fR BOX.boxid = 1.
- .)l
- .sp
- In addition, an operator \*(lqAREAEQ\*(rq (area equals) can be defined
- to find all boxes whose area is equal to some desired value.
- For example,
- .(l
- \fBretrieve\fR (BOX.all) \fBwhere\fR BOX.coordinate AREAEQ 10
- .)l
- finds all boxes whose area equals ten.
- Similarly, operators like \*(lqAREALT\*(rq (area less than) and
- \*(lqAREAGT\*(rq (area greater than) can also be defined.
- .pp
- The operators, AREAEQ, AREAGT, and AREALT, are quite similar
- to the conventional relational operators, =, >, and <.
- Therefore, in the same way that users can specify access methods like
- B-trees and hash tables to provide efficient access when used
- with conventional operators,
- users should also be able to specify access methods to efficiently
- scan relations when non-standard operators
- appear within restriction clauses.
- One solution, which POSTGRES supports, is to allow users to extend
- existing access methods so they can be used with a new set of operators.
- For example, by maintaining coordinate records within a B-Tree sorted
- in \*(lqAREALT\*(rq order rather than a more conventional \*(lq<\*(rq
- or \*(lq>\*(rq order,
- a user has an access method that
- provides efficient access to queries whose restrictions contain
- the AREAEQ, AREAGT, or AREALT operators.
- Another example is a hash table that can be constructed using the operator
- \*(lqAREAEQ\*(rq rather than \*(lq=.\*(rq
- .pp
- The technique just described is suitable provided there exists an
- appropriate access method upon which extensions can be made.
- However, suppose a user defines an operator contained-in, \*(lq$<<$,\*(rq that
- returns true if the left operand is spatially contained within the
- right operand.
- To provide efficient access to this operator,
- two-dimensional access methods are required, e.g.
- an R-tree [GUTT84] or a K-D-B Tree [ROBI81].
- Since conventional databases do not support two-dimensional access methods,
- an extension of an existing access method is not possible.
- To alleviate this problem,
- POSTGRES allows users to define their own access methods.
- Details on user-defined access methods are discussed in [STON86a].
- .pp
- In summary,
- with an extendible type system, users can build data types to suit their
- application needs, operators to manipulate the new types, and access methods
- to provide efficient access to queries containing these new operators.
- .sh 2 "Procedural Data Fields"
- .pp
- Existing relational databases do not provide good support for storage of
- complex objects.
- For example, if a complex object consists of a single box, circle, and
- triangle, this information would be represented as shown in
- figure 2.4.
- As a result, three join queries must be executed to
- retrieve all information about subobjects within this
- complex object.
- .(z
- .hl
- .sz \n(ep
- .in +2.5i
- .nf
- BOX (bid, bdata)
- CIRCLE (cid, cdata)
- TRIANGLE (tid, tdata)
- .fi
- .sp
- .in -2.5i
- .TS
- center;
- |c||c|c|c|
- c||c|c|c|.
- _
- COBJECT coid objtype oid
- _
- 1 box 2
- 1 circle 3
- 1 triangle 4
- _ _ _
- .TE
- .sp
- .in +2i
- .nf
- \fBretrieve\fR (BOX.all) \fBwhere\fR
- BOX.bid = COBJECT.oid \fBand\fR
- COBJECT.objtype = \*(lqbox\*(rq \fBand\fR
- COBJECT.coid = 1
-
- \fBretrieve\fR (CIRCLE.all) \fBwhere\fR
- CIRCLE.cid = COBJECT.oid \fBand\fR
- COBJECT.objtype = \*(lqcircle\*(rq \fBand\fR
- COBJECT.coid = 1
-
- \fBretrieve\fR (TRIANGLE.all) \fBwhere\fR
- TRIANGLE.tid = COBJECT.oid \fBand\fR
- COBJECT.objtype = \*(lqtriangle\*(rq \fBand\fR
- COBJECT.coid = 1
- .fi
- .in -2i
- .sz \n(pp
- .sp
- .ce 2
- \fBFigure 2.4\fR
- Storage of complex objects in a relational system
- .hl
- .)z
- In the more general case where a complex object is composed of up to \fIn\fR
- different subobject types, a user would have to execute \fIn\fR join queries.
- Without extra information indicating which subobject types are actually
- contained within a desired object, the user has no choice but to execute all
- \fIn\fR queries.
- This is quite inefficient, particularly when \fIn\fR is large, because
- as indicated in the previous sentence, many of the join queries
- are unnecessarily executed.
- .pp
- The basic problem here is that the relational model is not well-suited for
- representing hierarchical relationships.
- As a solution, Stonebraker has proposed embedding queries within data
- fields and using these queries to express the hierarchical relationship
- between the corresponding tuple and information elsewhere in the database
- [STON84].
- Using this idea, which POSTGRES supports, our complex object example is
- now represented as shown in figure 2.5.
- .(z
- .hl
- .sz \n(ep
- .TS
- center;
- |c||c|c|
- c||c|l|.
- _
- COBJECT coid components
- _
- 1 \fBretrieve\fR (BOX.all) \fBwhere\fR BOX.bid = 2
- \fBretrieve\fR (CIRCLE.all) \fBwhere\fR CIRCLE.cid = 3
- \fBretrieve\fR (TRIANGLE.all) \fBwhere\fR TRIANGLE.tid = 4
- _ _
- .TE
- .sz \n(pp
- .sp
- .ce 2
- \fBFigure 2.5\fR
- Storage of complex objects with procedural data fields
- .hl
- .)z
- To retrieve information executed by the queries embedded within this data field,
- the user would issue the following query:
- .(l
- \fBexecute\fR (COBJECT.components) \fBwhere\fR COBJECT.coid = 1.
- .)l
- Thus, \fIn\fR join queries reduce to a single \fBexecute\fR query.
- In addition, users can selectively retrieve information linked through
- these hierarchies by nesting attributes in a manner similar to the proposal
- in GEM [ZANI83].
- For example, to retrieve triangle information for a particular complex object,
- a user would nest \*(lqtdata\*(rq within \*(lqcomponents\*(rq as shown below:
- .(l
- \fBretrieve\fR (COBJECT.components.tdata) \fBwhere\fR COBJECT.coid = 1.
- .)l
- In general, attributes can be nested to an arbitrary number of levels.
- .sh 2 "The POSTGRES Optimizer"
- .pp
- Query optimization decisions are made based upon the characteristics of
- operators appearing within queries as well as the index types defined
- on relations.
- In a conventional optimizer, information about
- operators and access methods can be \*(lqhardwired\*(rq into the
- optimization code because there are only a fixed number of
- operators and access methods within the system.
- Such a solution would not suffice in a system like POSTGRES where
- an arbitrary number of operators and access methods are at a
- user's disposal.
- Consequently, this was an issue that had to be considered when designing the
- POSTGRES optimizer.
- .pp
- The optimizer also must consider queries containing nested attributes.
- As section 3.5 will describe, there is a clean and simple solution
- to this problem, which only requires that the optimizer apply
- the basic planning algorithm once for each nesting level.
- .pp
- Rules and triggers will be processed using
- query modification [STON75].
- This aspect of POSTGRES will not be discussed in this report because query
- modification is being implemented in a module separate from the optimizer.
- For details, see [STON86d].
- .pp
- Sophisticated algorithms have been proposed to optimize transitive closure
- queries as well as sets of queries.
- This is done by transforming sequences of database operations to
- equivalent sequences that execute more efficiently.
- This report, however, will not discuss these techniques any further because
- the topic is outside the scope of this project.
- For details, see [SELL85], [SELL86].
-