home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!nntp1.radiomail.net!fernwood!autodesk!tropic!tchild
- From: tchild@autodesk.com (Timothy Child)
- Newsgroups: comp.databases
- Subject: Re: Hierarchical Structures in RDBMS
- Message-ID: <18335@autodesk.COM>
- Date: 7 Jan 93 23:10:08 GMT
- References: <1993Jan6.094340.5870@devon.co.uk>
- Sender: news@Autodesk.COM
- Reply-To: tchild@autodesk.com
- Organization: Autodesk, Inc.
- Lines: 140
-
- In article 5870@devon.co.uk, don@duncan (Don Radvan) writes:
- >
- > I have an interesting problem. I need to represent hierarchical data
- > structures within a relational db, but I need to do it in a way that
- > permits the user to create his own hierarchy. That is, I do not know the
- > structure of the hierarchy before hand.
- >
- > For example:
- > A user wants to organize various objects. These objects will
- > appear as leaf nodes in the user's own hierarchy. The user may choose many
- > different ways of organizing these objects, indeed, s/he may have many
- > simultaneous hierarchies of organization for each set of objects. What
- > results is a sort of directory tree. Each directory is a category of
- > objects and each "file" or non-directory entry in a directory is an
- > object. Given the objects of apples and oranges, we may get the following:
- >
- > Stuff
- > |
- > --------------------------------
- > Food Non-Food
- > |
- > ---------------------- . . .
- > Fruit Veggies
- > |
- > -------------
- > Apples Oranges
- >
- > And so when s/he does a query such as the following:
- > select name where Food = "Fruit"
- > or
- > select name where Stuff = "Food" AND Food = "Fruit"
- >
- > we would get Apples and Oranges.
- >
-
- It's possible to model this problem with an RDBMS, but I don't know of
- a STANDARD way to express the appropriate SQL to return the information
- and I don't think there is one!
-
- What you have in the hierarchy above is a specialized form of directed
- acyclic graph (DAG). The solution proposed below applies cyclic and
- acyclic graphs so the system also must contain extra logic to assure the
- appropriate specialized form of graph.
-
- A graph can be modeled in an RDBMS with two tables Nodes and Links.
- Nodes are the entities of the hierarchy like: Stuff, Fruit, Food, etc,
- . . .
-
- A node relation would be something like:
-
- Node id (prime key), node attributes, . . .
-
- Links are the connections between Nodes and contain the keys of the parent
- and sibling nodes. A link relation would be something like . . .
-
- Parent Node id, Sibling Node id, (together making the primary key),
- link attributes, etc . . .
-
- In the above example the tables would look like, . . .
-
- Nodes Links
-
- Stuff Stuff, Food
- Food Stuff, Non_Food
- Non_Food Food, Fruit
- Fruit Food, Veggies
- Veggies Fruit, Apples
- Apples Fruit, Oranges
- Oranges
-
- The query strategy is to start with a given Node id and search
- the Link table for all entires that contain that Node id as the
- parent. The parents sibling node ids can be used to search the Links
- table for their siblings. The search continues recursively until the
- desired number of levels has been reached.
-
- Standard SQL queries have no way of dealing with this recursive
- search. Some vendors provide work arounds with non standard
- features. Another solution is to have specific queries
- for a specific number of levels.
-
- There is one piece of semantics this relational approach doesn't
- capture and that is sub-entity synonyms. Because all entries in the
- hierarchy are nodes they appear in the node table as unique names, no
- duplicates are allowed. It is possible to have a hierarchy with two
- sub-entities with the same name that are different, my above solution
- doesn't support this.
-
- For example consider the hierarchy ....
-
-
- Stuff
- |
- -------------------
- | |
- Food Non-Food
- | |
- ------- ---------
- | | | |
- Fruit Paint
- | |
- -------- -----------
- | | | | |
- Orange Apple Blue Red Orange
-
-
- Orange is both a kind of fruit and a paint. To support this sort of
- arrangement you need to introduce an extra key field to identify node
- types. Let's call it the type id. The Node and Link relations now
- look like:
-
- Node Id, Type Id, (a compound primary key) ...
-
- Parent Node Id, Parent Type Id, Sibling Node Id, Sibling Type Id,
- (all 4 fields are the primary key) . . .
-
- I would also recommend adding an additional table of Type Ids.
- The type table would allow for referential integrity checks
- on the type id.
-
- This solution only works if the nodes are of different types. It
- wouldn't work if the hierarchy was a UNIX directory structure
- for example.
-
- Another problem with this approach and the relation model is the
- issue that a Node is a Node and only a Node. Therefore all nodes have
- an identical number and kinds of attributes. This problem can be
- dealt with in a couple of ways, including extra tables which contain
- specific attributes for each type. It is considered bad practice to
- need to add new tables because of new subtype instances.
-
- In general graph like structures are one of the toughest things to
- work with in the relational model. A network or an object oriented
- data base has a more natural affinity for solving this kind of
- problem.
-
- -Tim Child
-
- Autodesk Inc. (Exercising an individual first amendment right.)
-
-