Permission to reprint or excerpt is granted only if the
following line appears at the top of the article:
ANTIC PUBLISHING INC., COPYRIGHT 1986. REPRINTED BY PERMISSION.
Professional GEM by Tim Oren
Column #8 - User Interfaces, Homily #1
AND NOW FOR SOMETHING COMPLETELY DIFFERENT!
In response to a number of requests, this installment of ST PRO GEM
will be devoted to examining a few of the principles of computer/human
interface design, or "religion" as some would have it. I'm going to
start with basic ergonomic laws, and try to draw some conclusions
which are fairly specific to designing for the ST. If this article
meets with general approval, further "homilies" may appear at
irregular intervals as part of the ST PRO GEM series.
For those who did NOT ask for this topic, it seems fair to explain
why your diet of hard-core technical information has been interrupted
by a sermon! As a motivator, we might consider why some programs are
said by reviewers to have a "hot" feel (and hence sell well!) while
others are "confusing" or "boring".
Alan Kay has said that "user interface is theatre". I think we may
be able to take it further, and suggest that a successful program
works a bit of magic, persuading the user to suspend his disbelief and
enter an imaginary world behind the screen, whether it is the
mathematical world of a spreadsheet, or the land of Pacman pursued by
ghosts.
A reader of a novel or science fiction story also suspends
disbelief to participate in the work. Bad grammar and clumsy plotting
by the author are jarring, and break down the illusion. Similarly, a
programmer who fails to pay attention to making his interface fast and
consistent will annoy the user, and distract him from whatever care
has been lavished on the functional core of the program.
CREDIT WHERE IT'S DUE
Before launching into the discussion of user interface, I should
mention that the general treatment and many of the specific research
results are drawn from Card, Newell, and Moran's landmark book on the
topic, which is cited at the end of the article. Any errors in
interpretation and application to GEM and the ST are entirely my own,
however.
FINGERTIPS
We'll start right at the user's fingers with the basic equation
governing positioning of the mouse, Fitts's Law, which is given as
T = I * LOG2( D / S + .5)
where T is the amount of time to move to a target, D is the distance
of the target from the current position, and S is the size of the
target, stated in equivalent units. LOG2 is the base 2 (binary)
logarithm function, and I is a proportionality constant, about 100
milliseconds per bit, which corresponds to the human's "clock rate"
for making incremental movements.
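As a minimal sketch, the formula can be tried out in a few lines of Python. The function name and the two sample targets are hypothetical, chosen only for illustration. The clamp to zero is also an assumption of this sketch, not part of the law: the raw formula goes negative when the pointer is already practically on the target.

```python
import math

# Proportionality constant from the text: about 100 milliseconds per bit.
I_MSEC_PER_BIT = 100.0

def fitts_time_msec(distance, size, i_msec_per_bit=I_MSEC_PER_BIT):
    """Estimated time to move to a target: T = I * LOG2(D / S + .5).

    distance and size must be stated in the same units (pixels, say).
    """
    if distance <= 0 or size <= 0:
        raise ValueError("distance and size must be positive")
    raw = i_msec_per_bit * math.log2(distance / size + 0.5)
    # Assumption of this sketch: clamp to zero, because the raw formula
    # is negative when D / S is below one half.
    return max(0.0, raw)

# Hypothetical targets, both 200 pixels from the pointer.
small_box = fitts_time_msec(200, 8)     # an 8-pixel check box
big_button = fitts_time_msec(200, 40)   # a 40-pixel button
print(f"small box: {small_box:.0f} msec, big button: {big_button:.0f} msec")
```

With the distance held fixed, the larger target comes out markedly faster to hit, which is the whole argument for big buttons.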
We can squeeze an amazing amount of information out of this formula
when attempting to speed up an interface. Since motion time goes up
with distance, we should arrange the screen with the usual working
area near the center, so the mouse will have to move a smaller
distance on average from a selected object to a menu or panel.
Likewise, any items which are usually used together should be placed
together.
The most common operations will have the greatest impact on speed,
so they should be closest to the working area and perhaps larger than
other icons or menu entries. If you want to have all other operations
take about the same time, then the targets farthest from the working
area should be larger, and those closer may be proportionately
smaller.
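To see what "proportionately" means here, the movement-time formula T = I * LOG2( D / S + .5) can be solved for the target size S. The sketch below does that under the same 100 msec/bit constant; the function name, the 300 msec budget, and the sample distances are assumptions made for illustration.

```python
def size_for_time(distance, time_msec, i_msec_per_bit=100.0):
    """Target size S that gives movement time T at distance D.

    Solves T = I * LOG2(D / S + .5) for S:
        S = D / (2 ** (T / I) - .5)
    """
    if distance <= 0 or time_msec < 0:
        raise ValueError("distance must be positive, time non-negative")
    return distance / (2.0 ** (time_msec / i_msec_per_bit) - 0.5)

# Hypothetical time budget: every target should take about 300 msec.
BUDGET_MSEC = 300.0
for distance in (100, 200, 400):
    size = size_for_time(distance, BUDGET_MSEC)
    print(f"{distance:3d} pixels away -> about {size:.0f} pixels wide")
```

For a fixed time budget the required size grows in direct proportion to the distance, so a target twice as far away needs to be twice as large.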
Consider also the implications for dialogs. Small check boxes are
out. Large buttons which are easy to hit are in. There should be
ample space between selectable items to allow for positioning error.
Dangerous options should be widely separated from common selections.
MUSCLES
Anyone who has used the ST Desktop for any period of time has
probably noticed that his fingers now know where to find the File
menu. This phenomenon is sometimes called "muscle memory", and its
rate of onset is given by the Power Law of Practice:
T(n) = T(1) * n ** (-a)
where T(n) is the time on the nth trial, T(1) is the time on the first
trial, and a is approximately 0.4. (I have appropriated ** from
Fortran as an exponentiation operator, since C lacks one.)
The first thing to note about the Power Law is that it only works
if a target stays in the same place! This should be a potent argument
against rearranging icons, menus, or dialogs without some explicit
request by the user. The time to hit a target which moves around
arbitrarily will always be T(1)!
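A short sketch makes the size of the effect concrete. It uses the exponent of 0.4 given above; the function name and the two-second first trial are hypothetical.

```python
def practice_time(first_trial_time, n, a=0.4):
    """Power Law of Practice: T(n) = T(1) * n ** (-a)."""
    if n < 1:
        raise ValueError("trial number n starts at 1")
    return first_trial_time * n ** (-a)

# Hypothetical target that takes 2 seconds to find on the first try.
FIRST_TRIAL_SEC = 2.0
for trial in (1, 2, 8, 32):
    seconds = practice_time(FIRST_TRIAL_SEC, trial)
    print(f"trial {trial:2d}: {seconds:.2f} sec")
```

A target that moves resets the trial count to one, so it never descends this curve.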
In many cases, the Power Law will also work for sequences of
operations to even greater effect. If you are a touch typist, you can
observe this effect by comparing how fast you can enter "the" in
comparison to three random letters. We'll come back shortly to
consider what we can do to encourage this phenomenon.
EYES
Just as fingers are the way the user sends data to the computer, so
the eyes are his channel from the machine. The rate at which
information may be passed to the user is determined by the "cycle
time" of his visual processor. Experimental results show that this
time ranges between 50 and 200 milliseconds.
Events separated by 50 milliseconds or less are always perceived as
a single event. Those separated by more than 200 milliseconds are
always seen as separate. We can use these facts in optimizing use of
the computer's power when driving the interface.
Suppose your application's interface contains an icon which should
be inverted when the mouse passes over it. We now know that flipping
it within one twentieth of a second is necessary and sufficient.
Therefore, if a "first cut" at the program achieves this performance,
there is no need for further optimization, unless you want to
interleave other operations. If it falls short, it will be necessary
to do some assembly coding to achieve a smooth feel.
On the other hand, two actions which you want to appear distinct or
convey two different pieces of information must be separated by an
absolute minimum of a fifth of a second, even assuming that they
occur in an identical location on which the user's attention is
already focused.
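The two thresholds can be written down as a simple check, for instance to test the measured gap between two screen updates. This is only a sketch: the function name and the result strings are made up here, and the middle band is reported as undecided because it depends on the strength of the stimulus.

```python
# Thresholds from the text, in milliseconds.
FUSE_MSEC = 50        # at or below this, two events look like one
SEPARATE_MSEC = 200   # above this, two events always look distinct

def perceived_as(gap_msec):
    """Classify how two events gap_msec apart are likely to be seen."""
    if gap_msec < 0:
        raise ValueError("gap must not be negative")
    if gap_msec <= FUSE_MSEC:
        return "single event"
    if gap_msec > SEPARATE_MSEC:
        return "separate events"
    return "depends on the stimulus"

for gap in (30, 120, 250):
    print(f"{gap} msec apart: {perceived_as(gap)}")
```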
We are able to influence the visual processing rate within the 50
to 200 millisecond range by changing the intensity of the stimulus
presented. This can be done with color, by flashing a target, or by
more subtle enhancements such as bold face type. For instance, most
people using GEM soon become accustomed to the "paper white"
background of most windows and dialogs. A dialog which uses a reverse
color scheme, white letters on black, is visually shocking in its
starkness, and will immediately draw the user's eyes.
It should be quickly added that stimulus enhancement will only work
when it unambiguously draws attention to the target. Three or four
blinking objects scattered around the screen are confusing, and worse
than no enhancement at all!
SHORT-TERM MEMORY
Both the information gathered by the eyes and movement commands on
their way to the hand pass through short-term memory (also called
working memory). The amount of information which can be held in
short-term memory at any one time is limited. You can demonstrate
this limit on yourself by attempting to type a sheet of random numbers
by looking back and forth from the numbers to the screen. If you are
like most people, you will be able to remember between five and nine
numbers at a time. So universal is this finding that it is sometimes
called "the magic number seven, plus or minus two".
This short-term capacity sets a limit on the number of choices
which the user can be expected to grasp at once. It suggests that the
number of independent choices in a menu, for instance, should be
around seven, and never exceed nine. If this limit is violated, then
the user will have to take several glances, with pauses to think, in
order to make a choice.
CHUNKING
The effective capacity of short-term memory can be increased when
several related items are mentally grouped as a "chunk". Humans
automatically adopt this strategy to save themselves time. For
instance, random numbers had to be used instead of text in the example
above, because people do not type their native language as individual
characters. Instead, they combine the letters into words and remember
these chunks instead. Put another way, the characters are no longer
considered as individual choices.
A well designed interface should promote the use of chunking as a
strategy by the user. One easy way is to gather together related
options in a single place. This is one reason that like commands are
grouped into a single menu which is hidden except for its title. If
all of the menu options were "in the open", the user would be
overwhelmed with dozens of alternatives at once. Instead, a "Show
Info" command, for instance, becomes two chunks: pick File menu, then
pick Show.
Sometimes the interface can accomplish the chunking for the user.
Consider the difference between a slider bar in a GEM program, and a
three digit entry field in a text mode application. Obviously, the
GEM user has fewer decisions to make in order to set the associated
variable.
THINK!
While we are puttering around trying to speed up the keyboard, the
mouse, and the screen, the user is actually trying to get some work
done. We need to back off now, and look at the ways of thinking, or
cognitive processes, that go into accomplishing the job.
The user's goal may be to enter and edit a letter, to retrieve
information from a database, or simply draw a picture, but it probably
has very little to do with programming. In fact, the Problem Space
Principle says that the task can be described as a set of states of
knowledge, a set of operators and associated constraints for changing
the states, and the knowledge to choose the appropriate operator,
which resides in the user's head.
Those with a background in systems theory can consider this as a
somewhat abstract, but straightforward, statement in terms of state
variables and operators. A programmer might compare the knowledge
states to the values of variables, the operators to arithmetic and
logic operations, the constraints to the rules of syntax, and the
user's knowledge to the algorithm embodied by a program.
ARE WE NOT MEN?
A rational person will try to attain his goals (get the job done)
by changing the state of his problem space from its initial state to
the goal state. The initial state, for instance, might be a blank
word processor screen. The desired final state is to have a completed
business letter on the screen.
The Rationality Principle says that the user's behavior in typing,
mousing, and so on, can be explained by considering the tasks required
to achieve the goal, the operators available to carry out the tasks,
and the limitations on the user's knowledge, observations, and
processing capacity. This sounds like the typical user of a computer
program must spend a good deal of time scratching his head and
wondering what to do next. In fact, one of Card and Moran's key
results is that this is NOT what takes place.
What happens, in fact, is that the trained user strikes a sort of
"modus vivendi" with his tool and adopts a set of repetitive, trained
behavior patterns as the best way to get the job done. He may go so
far as to ignore some functions of the program in order to set up a
reliable pattern. What we are looking for is a way of measuring and
predicting the "quality" of this trained behavior. Since using
computers is a human endeavor, we should consider not only the speed
with which the task is completed, but the degree of annoyance or
pleasure associated with the process.
Card and Moran constructed a series of behavioral models which they
called GOMS models, for Goals-Operators-Methods-Selection. These
models suggested that in the training process the user learned to
combine the basic operators in sequences (chunks!) which then became
methods for reaching the goals. Then these first level methods might
be combined again into second level methods, and so forth, as the
learning progressed.
The GOMS models were tested in a lengthy series of trials at Xerox
PARC using a variety of word processing software. (Among the subjects
of these experiments were the inventors of the windowing methods used
in GEM!) The results were again surprising: the level of detail in
the models was really unimportant!
It turned out to be sufficient to merely count up the number of
keystrokes, mouse movements, and thought intervals required by each
task. After summing up all of the tasks, any extra time for the
computer to respond, or the user to move his hands from keyboard to
mouse, or eyes from screen to printed page is added in. This
simplified version is called the Keystroke-Level Model.
As an example of the Keystroke Model, consider the task of changing
a mistyped letter on the screen of a GEM word processor. This might
be broken down as follows: 1) find the letter on the screen; 2) move
hand to mouse; 3) point to letter; 4) click mouse button; 5) move hand
to keyboard; 6) strike "Delete" key; 7) strike key for new character.
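The seven steps above can be tallied in a few lines. Note that the operator times in this sketch are NOT given in this article: they are round figures often quoted for the Keystroke-Level Model and should be treated as assumptions, as should the mapping of each step to an operator letter.

```python
# Assumed operator times in seconds (not from the article).
OPERATOR_SECONDS = {
    "M": 1.35,  # mental preparation, a "thought interval"
    "H": 0.40,  # move hand between keyboard and mouse
    "P": 1.10,  # point at a target with the mouse
    "K": 0.20,  # press a key or a mouse button
}

# The fix-a-letter task, one (description, operator) pair per step.
FIX_A_LETTER = [
    ("find the letter on the screen", "M"),
    ("move hand to mouse", "H"),
    ("point to letter", "P"),
    ("click mouse button", "K"),
    ("move hand to keyboard", "H"),
    ('strike "Delete" key', "K"),
    ("strike key for new character", "K"),
]

def task_time(steps, operator_seconds=OPERATOR_SECONDS):
    """Sum the operator times for a list of (description, operator) steps."""
    return sum(operator_seconds[op] for _, op in steps)

total = task_time(FIX_A_LETTER)
print(f"fix-a-letter: {len(FIX_A_LETTER)} actions, about {total:.2f} sec")
```

Under these assumed figures the thought interval and the pointing step dominate the total, while the three key presses contribute comparatively little.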
The sufficiency of the Keystroke Model is great news for our
attempt to design faster interfaces. It says we can concentrate our
efforts on minimizing the number of total actions to be taken, and
making sure that each action is as fast as possible. We have already
discussed some ways to speed up the mouse and keyboard actions, so
let's now consider how to speed up the thought intervals, and cut the
number of actions.
One way to cut down "think time" is to make sure that the capacity
of short-term memory is not exceeded during the course of a task. For
example, the fix-a-letter task described above required the user to
remember 1) his place in the overall job of typing the document; 2)
the task he is about to perform; 3) where the bad character appeared,
and 4) what the new character was. When this total of items creeps
toward seven, the user often loses his place and commits errors.
You can appreciate the ubiquity of this problem by considering how
many times you have made mistakes nesting parentheses, or had to go
back to count them, because too many things happened while typing the
line to remember the nesting levels. The moral is that operations with
long strings of operands should be avoided when designing an
interface.
The single most important factor in making an interface comfortable
to use is increasing its predictability, and decreasing the amount of
indecision present at each step during a task. There is (inevitably)
an Uncertainty Principle which relates the number of choices at each
step to the associated time for thought:
T = I * LOG2 ( N + 1)
where LOG2 is the binary logarithm function, N is the number of
equally probable choices, and I is a constant of approximately 140
msec/bit. When the alternatives are not equally probable, the function
is more complex:
T = I * SUM-FOR-i-FROM-1-TO-N (P(i) * LOG2( 1 / P(i) + 1) )
where the P(i) are the probabilities of each of the choices (which
must sum to one). (SUM-FOR-i... is the best I can do for a sigma
operator on-line!) Those of you with some information theory
background will recognize this formula as the entropy of the decision;
we'll come back to that later.
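Both forms of the formula are easy to evaluate. The sketch below uses the 140 msec/bit constant from the text; the function names and the sample probabilities are hypothetical. It also checks that the general form reduces to the simple one when all choices are equally likely.

```python
import math

# Constant from the text: approximately 140 msec per bit.
I_MSEC_PER_BIT = 140.0

def decision_time_equal(n, i_msec_per_bit=I_MSEC_PER_BIT):
    """T = I * LOG2(N + 1) for N equally probable choices."""
    if n < 1:
        raise ValueError("need at least one choice")
    return i_msec_per_bit * math.log2(n + 1)

def decision_time(probabilities, i_msec_per_bit=I_MSEC_PER_BIT):
    """T = I * SUM(P(i) * LOG2(1 / P(i) + 1)) over all choices."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    bits = sum(p * math.log2(1.0 / p + 1.0) for p in probabilities if p > 0)
    return i_msec_per_bit * bits

# Four choices: equally likely, versus one strongly favored.
even = decision_time([0.25, 0.25, 0.25, 0.25])
skewed = decision_time([0.7, 0.1, 0.1, 0.1])
print(f"even: {even:.0f} msec, skewed: {skewed:.0f} msec")
```

Making one choice much more likely than the rest shortens the predicted thinking time even though the number of choices is unchanged.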
So what can we learn from this hash? It turns out, as we might
expect, that we can decrease the decision time by making some of the
user's choices more probable than others. We do that by means of
feedback cues from the interface.
The importance of reliable, continuous, meaningful feedback cannot be
emphasized enough. It helps the beginner learn the system, and its
predictability makes the program comfortable for the expert. Programs
with no feedback, or unreliable cues, produce confusion, dissonance,
and frustration in the user.
This principle is so important that I am going to give several
examples from common GEM practice. The Desktop provides several
instances. When an object is selected and a menu drops down, only
those choices which are legal for the object are in black. The others
are dimmed to grey, and are therefore removed from the decision. When
a pick is made from the menu, the bar entry remains black until the
operation is complete, reassuring the user that the correct choice was
made. In both the Desktop and the RCS, items which are double-clicked
open up with a "zoom box" from the object, again showing that the
right object was picked.
Other techniques are useful when operator icons are exposed on the
screen. When an object is picked, the legal operations might be
outlined, or the bad choices might be dimmed. If the screen flashing
produced by this is objectionable, the legal icons can be made mouse
sensitive, so they will "light up" when the cursor passes over - again
showing the user which choices are legal.
The desire for feedback is so strong that it should be provided
even while the computer is doing an operation on its own. The hour
glass mouse form is a primitive example of this. More sophisticated
are "progress indicators" such as animated thermometer bars, clocks,
or text displays of the processing steps. The ST Desktop provides
examples in the Format and Disk Copy functions. The purpose of all of
these is to reassure the user that the operation is progressing
normally. Their lack can lead to amusing spectacles such as
secretaries leaning over to hear if their disk drives are working!
Another commonly overlooked feature is error prevention and
correction. Card and Moran's results showed that in order to go
faster, people will tolerate error rates of up to 30% in their work.
Any program which does not give a fast way to fix mistakes will be
frustrating indeed!
The best way to cope with an error is to "make it didn't happen",
to quote a common child's phrase. The same feedback methods discussed
above are also effective in preventing the user from picking
inappropriate combinations of objects and operations. Replacement of
numeric type-ins with sliders or other visual controls eliminates the
common "Range Error". The use of radio buttons prevents the user from
picking incompatible options. When such techniques are used
consistently, the beginner also gains confidence that he may explore
the program without blundering into errors.
Once an error has occurred, the best solution is to have an "inverse
operation" immediately available. For instance, the way to fix a bad
character is to hit the backspace key. If a line is inadvertently
deleted, there should be a way to restore it.
Sometimes the mechanics of providing true inverses are impractical,
or end up cluttering the interface themselves. In these cases, a
global "Undo" command should be provided to reverse the effect of the
last operation, no matter what it was.
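One common way to build such a command is to record an inverse for every operation as it is performed. The following is a minimal sketch of that idea, not a description of any GEM facility: the class and method names are hypothetical, and the stack of inverses goes one step further than the single-level Undo described above by allowing repeated undos.

```python
class UndoableText:
    """A list of text lines in which every edit records its own inverse."""

    def __init__(self):
        self.lines = []
        self._undo_stack = []   # inverse operations, most recent last

    def _do(self, action, inverse):
        # Perform the edit, then remember how to reverse it.
        action()
        self._undo_stack.append(inverse)

    def insert_line(self, index, text):
        self._do(lambda: self.lines.insert(index, text),
                 lambda: self.lines.pop(index))

    def delete_line(self, index):
        text = self.lines[index]   # saved so the line can be restored
        self._do(lambda: self.lines.pop(index),
                 lambda: self.lines.insert(index, text))

    def undo(self):
        """Reverse the last operation, whatever it was."""
        if not self._undo_stack:
            return False           # nothing to undo
        self._undo_stack.pop()()
        return True

doc = UndoableText()
doc.insert_line(0, "Dear Sir,")
doc.insert_line(1, "Thank you for your letter.")
doc.delete_line(0)   # the inadvertent deletion
doc.undo()           # the deleted line is restored
print(doc.lines)
```

The caller never needs to know which operation came last. Undo simply runs whatever inverse is on top of the stack.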
OF MODES AND BANDWIDTH
Now I am going to depart from the Card, Newell and Moran thread of
discussion to consider how we can minimize the number of operations in
a task by altering the modes of the interface. Although "no modes"
has been a watchword of Macintosh developers, the term may need
definition for Atarians.
Simply stated, a mode exists any time you cannot get to all of the
capabilities of the program without taking some intermediate step.
Familiar examples are old-style "menu-driven" programs, in which the user
must make selections from a number of nested menus in order to perform
any operation. The options of any one menu are unavailable from the
others.
Recall that the user is trying to accomplish work in his own
problem space, by altering its states. A mode in the program adds
additional states to the problem space, which he is forced to consider
in order to get the job done. We might call an interface which is
completely modeless "transparent", because it adds no states between
the user and his work. One of the best examples of a transparent
program is the 15-puzzle in the Macintosh desk accessory set. The
problem space of rearranging the tiles is identical between the
program and a physical puzzle.
Unfortunately, most programmers find themselves forced to put modes
of some sort into their programs. These often arise due to
technological limitations, such as memory space, screen "real estate",
or performance limitations of peripherals. The question is how the
modes can be made least offensive.
I will make the general claim that the frustration which a mode
produces is directly proportional to the amount of the user's
bandwidth which it consumes. In other words, we need to consider how
many keystrokes, mouse clicks, eye movements, and so on, are going
into manipulating the true problem states, and how many are being
absorbed by the modes of the program. If the interface is wasting a
large amount of the user's effort, it will be perceived as slow and
annoying.
Here we can consider again the hierarchy of goals and methods which
the user employs. When the mode is low in the hierarchy, and close to
the user's "fingertips", it is encountered the most frequently. For
instance, consider how frustrating it would be to have to hit a
function key before typing in each character!
The "menu-driven" style of program mentioned above is almost as
bad, since usually only one piece of information is collected at each
menu. Such a program becomes a labyrinth of states better suited to
an adventure game!
The least offensive modes are found at the higher, goal related
levels of the hierarchy. The better they align with changes in the
state of the original problem, the more they are tolerated. For
example, a word processing program might have one screen layout for
program editing, another for writing letters, and yet another while
printing the documents. A multi-function business package might have
one set of menus for the spreadsheet, another for a graphing module,
and a third for a database.
In some cases the problem solved by the program has convenient
"fracture lines" which can be used to define the modes. An example in
my own past is the RCS, where the editing of each type of resource
tree forms its own mode, with each of the modes nested within the
overall mode and problem of composing the entire resource tree.
TO DO IS TO BE!
Any narrative description of user interface is bound to be lacking.
There is no way text can convey the vibrancy and tactile pleasure of a
good interface, or the sullen boredom of a bad one. Therefore, I
encourage you to experiment. Get out your favorite arcade game and
see if you can spot some of the elements I have described. Dig into
your slush pile for the most annoying program you have ever seen, run
it and see if you can see mistakes. How would you fix them? Then...
go do it to your own program!
AMEN...
This concludes the sermon. I'd like some Feedback as to whether
you found this Boring Beyond Belief or Really Hot Stuff. If enough
people are interested, homily number two will appear a few episodes
from now. The very next installment of ST PRO GEM will go back to
basics to explore VDI drawing primitives. In the meantime, you might
investigate some of the Good Books on interface design referenced
below.
REFERENCES
Stuart K. Card, Thomas P. Moran, and Allen Newell, THE PSYCHOLOGY
OF HUMAN-COMPUTER INTERACTION, Lawrence Erlbaum Associates, Hillsdale,
New Jersey, 1983. (Fundamental and indispensable. The volume of
experimental results makes it weighty. The Good Parts are at the
beginning and end.)
"Macintosh User Interface Guidelines", in INSIDE MACINTOSH, Apple
Computer, Inc., 1984. (Yes, Atarians, we have something to learn
here. Though not everything "translates", this is a fine piece of
principled design work. Read and appreciate.)
James D. Foley, Victor L. Wallace, and Peggy Chan, "The Human
Factors of Computer Graphics Interaction Techniques", IEEE Computer
Graphics (CG & A), November 1984, pp. 13-48. (A good overview,
including higher level topics which I have postponed to a later
article. Excellent bibliography.)
J. D. Foley and A. Van Dam, FUNDAMENTALS OF INTERACTIVE COMPUTER
GRAPHICS, Addison Wesley, 1984, Chapters 5 and 6. (If you can't get
the article above, read this. If you are designing graphics apps, buy
the whole book! Staggering bibliography.)
Ben Shneiderman, "Direct Manipulation: A Step Beyond
Programming Languages", IEEE Computer, August 1983, pp. 57-69. (What
do Pacman and Visicalc have in common? Shneiderman's analysis is
vital to creating hot interfaces.)