A suitable database schema for a semantic benchmark satisfies four
criteria.
- The schema should be natural. That is, it should correspond to
a reasonable, though possibly greatly simplified, segment of the
real world. This both reduces the need to explain the model and
enhances the ability to recognize verbal pitfalls in the path to
the query instances.
- The schema should be simple. This will aid in making the
benchmark easy to understand. This criterion restricts the number of
relation schemas and the number of attributes of the individual
schemas. Additionally, the names of the relations and of the
attributes should be short, as they will be referenced repeatedly.
When an expansion is proposed, the benefits should be carefully
compared with the added complexity.
- The schema should allow for comprehensiveness within the
chosen scope. Using the schema, it should be possible to formulate
queries of all the types that appear reasonable.
This indicates a need for at least two related relation schemas (for
natural-join queries).
- A schema that has already been used frequently is preferred
over a new schema. This guarantees that many existing queries can be
adapted easily to the benchmark.
- For clarity, schema and attribute names must start with
capital letters.
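
As an illustration of these criteria, the following minimal sketch builds a schema with two related relations and runs a natural-join query over it. The relations Emp and Dept, their attributes (Name, Salary, DName, Mgr), and the sample tuples are hypothetical and chosen only for this example; they are not taken from the benchmark itself. The sketch uses Python's standard sqlite3 module with an in-memory database.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with two small, related relations.

    Emp and Dept are hypothetical relations used only for illustration.
    Names are short and capitalized, and the shared attribute DName
    makes natural-join queries possible.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Dept (DName TEXT PRIMARY KEY, Mgr TEXT);
        CREATE TABLE Emp (
            Name   TEXT PRIMARY KEY,
            Salary INTEGER,
            DName  TEXT REFERENCES Dept(DName)
        );
        INSERT INTO Dept VALUES ('Toy', 'Di'), ('Book', 'Ed');
        INSERT INTO Emp VALUES
            ('Ed', 30, 'Book'),
            ('Di', 40, 'Toy'),
            ('Al', 20, 'Toy');
        """
    )
    return conn


def managers_of_employees(conn):
    """Return (employee, manager) pairs via a natural join on DName."""
    query = "SELECT Name, Mgr FROM Emp NATURAL JOIN Dept ORDER BY Name"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_example_db()
    for name, mgr in managers_of_employees(conn):
        print(name, mgr)
```

Two relations with a handful of attributes each keep the example easy to follow, while the shared attribute is enough to express join queries in addition to single-relation selections and projections.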