------------------------------------------------------------
TRAINER
(From Passive Pupil to Self-Correcting Scholar)
------------------------------------------------------------
INDEX
TRAINER . . . . . . . . . . . . . . . . . . . . 1
Sample Answer Card . . . . . . . . . . . . . . . 2
Sample Test Instructions and Selected Questions . 3
Short Form Item Analysis . . . . . . . . . . . . 4
Converting Raw Scores to Grades . . . . . . . . 5
TRAINER Programs and Files . . . . . . . . . . . 6
Bibliography and Support . . . . . . . . . . . . 7
------------------------------------------------------------
------------------------------------------------------------
TRAINER 1/7
(From Passive Pupil to Self-Correcting Scholar)
------------------------------------------------------------
     TRAINER was started in 1981 to score tests for both
quality and quantity for underprepared college students.
The test grades needed to reward the development of:
     1. Good study habits (as required by an essay test).
     2. The sense of responsibility needed to learn at
        higher levels of thinking.
     3. The self-judgment required of a self-correcting
        scholar.
     The qualitative score (percent right) is the feel-good
score, the you-are-on-the-right-track score. It indicates
the extent to which students know their own minds, that is,
their self-judgment. The combined qualitative and
quantitative score is the test grade.
     The answer sheet had to offer a minimum of three options:
     A. GUESS: The traditional guess-test style (answer
        every question, guessing at random if necessary),
        which encourages the use of lower levels of thinking.
     B. KNOW: Mark an answer only when confident it is an
        acceptable report of what is known or reasoned,
        which encourages the use of higher levels of
        thinking.
     C. Both A and B: A concrete demonstration that each
        student can voluntarily perform, a comparison
        between the familiar, low-performance pupil and
        the unfamiliar, high-performance student.
        (When given only the KNOW option, students
        complained, "Why can't we guess here as we do on
        tests all over campus?")
     This concrete comparison was absolutely essential for
students to evaluate the two test styles. By the third hour
test each semester, over 90% selected only the KNOW style:
I can spend my time on the questions I know something
about.
It is honest, I am not forced to guess (to lie).
I try to master what I am studying rather than
memorize everything.
I get better grades now with less study time.
----------------------------------------
Sample Answer Card 2/7
----------------------------------------
NAME _______________________________
COURSE _____________________________
I [ ] 0 1 2 3 4 5 6 7 8 9
D [ ] 0 1 2 3 4 5 6 7 8 9
[ ] 0 1 2 3 4 5 6 7 8 9
N [ ] 0 1 2 3 4 5 6 7 8 9
U [ ] 0 1 2 3 4 5 6 7 8 9
M [ ] 0 1 2 3 4 5 6 7 8 9
B [ ] 0 1 2 3 4 5 6 7 8 9
E [ ] 0 1 2 3 4 5 6 7 8 9
R [ ] 0 1 2 3 4 5 6 7 8 9
ANSWERS
1 [A][B][C][D][E] 51 [A][B][C][D][E]
2 A B C D E 52 A B C D E
3 A B C D E 53 A B C D E
4 A B C D E 54 A B C D E
5 A B C D E 55 A B C D E
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
46 A B C D E 96 A B C D E
47 A B C D E 97 A B C D E
48 A B C D E 98 A B C D E
49 A B C D E 99 A B C D E
50 A B C D E 100 A B C D E
----------------------------------------
Based on card F-4000-9 printed by
the Clearview Printing Co. Inc.
----------------------------------------------------------------------
Sample Test Instructions and Selected Questions * 3/7
----------------------------------------------------------------------
GENERAL BIOLOGY 102, SECTION 1, TEST #1
12-Jan-89 09:43 AM
VALID ANSWER CARD: Name, Student Number and Seat Number
You have the option of (A) answering all questions (GUESSing
if you do not know), (B) reporting (marking) only what you
KNOW or can reason, or (C) marking both methods.
Use the left side 1-49 for (A). Use the right side 51-99 for (B).
(A) SCORING: 0% for self-judgment, +2 for right, 0 for wrong
(GUESS if you do not know.)
(B) SCORING: 50% for self-judgment, +1 for right, -1 for wrong
(Report only if you are confident of being right.)
(C) SCORING: Mark (A) on the left AND (B) on the right side.
MARK ANSWER 50: A) GUESSing B) Reporting what I KNOW or can
reason C) Average of A and B
4. Ponds do not freeze from the bottom up because: A) ice has a
greater volume than the equivalent amount of water B) the
specific heat of water is low C) ice is more dense than liquid
water D) of the high surface tension of water
6. How is the arrangement of water molecules in ice different from
their arrangement in liquid water? The arrangement in ice is:
A) more random and more open than that in water B) more regular
and more open than that in water C) more regular and more
compact than that in water D) more random and more compact than
that in water
49. Next test in ( ) weeks. A) 1 B) 2 C) 3 D) Instructor's choice
----------------------------------------------------------------------
     Classes tend to select two weeks. Pupils at the concrete level
of thinking do not like weekly tests, as they require "too much
studying", and tri-weekly tests require "too much to study".
Informational questions can be inserted at any point in the test.
They are only tabulated. They have no effect on scores.
     Normally a 50-minute "hour" test is limited to 49 or fewer graded
questions. There must be time to think if higher levels of thinking
are to apply. If more questions are needed, the program will accept
a card marked with up to 99 answers, with position 100 marked A for
GUESS or B for KNOW.
* Test Bank by A. C. Monroe, D. J. Fox, and J. J. Cockerill, 1985,
Worth Publishers, Inc.
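     The scoring rules above reduce to simple arithmetic. The sketch
below, in Python rather than the original PDS BASIC, shows one way
to compute them. Putting both styles on the same two-points-per-
question scale (so the 50% self-judgment credit equals the number of
questions) is an assumption added for illustration, as is the percent
base of the qualitative score.

    # Sketch of the GUESS and KNOW scoring rules from the sample
    # instructions above (an illustration, not the TRAINER code).

    def guess_score(n_right):
        # GUESS style: 0% for self-judgment, +2 per right, 0 per wrong.
        return 2 * n_right

    def know_score(n_questions, n_right, n_wrong):
        # KNOW style: 50% of the assumed 2-points-per-question maximum
        # for self-judgment, then +1 per right and -1 per wrong mark.
        # Unmarked questions neither add nor subtract.
        return n_questions + n_right - n_wrong

    def quality(n_right, n_marked):
        # Qualitative score: percent right of the items actually marked.
        return 100.0 * n_right / n_marked if n_marked else 0.0

     For a 49-question test, a student who marks 30 items and gets 27
right scores 49 + 27 - 3 = 73 of a possible 98 under KNOW, with a
qualitative score of 27/30 = 90%.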
----------------------------------------------------------------------
Short Form Item Analysis 4/7
----------------------------------------------------------------------
1...5...10 ....5...20 ....5...30 ....5...40 ....5...50
ANSWER KEY: CBBAABABCC CBCBCDCAAB ACEEBCAEBA ADBDCBADDE ADBDDCDD
ITEM WEIGHTS: 1111111111 1111111111 1111111111 1111111111 111111110
DIFFICULTY2: 5 4C C 35 5C664 5 a b 0
DISCRIMINATION: T T T 1 1551 111 1 1 T111 11TT1 1
----------------------------------------------------------------------
DIFFICULTY2 (Questions that failed to perform well.)
               A question is flagged when its total performance value
               is less than 75% (the sum of the difficulty value,
               K x E, for those who answered and the difficulty value,
               Item Score, for the entire class).
Upper Case = the most popular wrong answer, marked by more than half
               of the class. These items need to be reviewed in class.
               In the above example, questions 4 and 6 usually fail to
               perform well because pupils associate DENSE and COMPACT
               with HARD even though they know ice floats on water.
Lower case = the most popular wrong answer, marked by less than half
               of the class.
Number x 10 = the percent of the class marking the item when the
               right answer is the most popular.
No one answered item 48. There were 15 items of
questionable validity among the 48 scorable items. The
TRGRADES program can assign grades on the basis of the
33 (48 - 15) questions that make up the true test out of
the 48 items presented to the class, unless the
instructor has some justification for doing otherwise.
               No test had a set of questions that all performed well
               during an eight-year period. Lacking a functional
item analysis for each test, most faculty members will
not admit that a number of questions on each test are not
valid for each class. Testing services use expert
inference based on multiple tests to determine validity.
This design fails to address the instructional validity
of items on any one test in any one class.
To be instructionally useful, an analysis must respond to
each class and each test. The analysis is further
improved when based on student reports of what they know
or can do rather than on the random guessing encouraged
by the traditional use of multiple-choice questions.
DISCRIMINATION (Questions that differentiate between those students
               who did well on the test and those who did not.)
               Discrimination is expressed as probabilities: results
               that could have happened by chance alone at or less
               than 1%, 5%, and 10% of the time (best, better, and good).
These questions generate most of the score distribution.
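     A compact sketch of the two flags, in Python rather than the
original PDS BASIC. This file does not define how the K x E term is
computed or which statistic produces the chance probabilities, so the
inputs below are assumed to be pre-computed values; only the 75%
threshold and the symbol mapping come from the text above.

    # Sketch of the short-form item analysis flags (an illustration,
    # not the TRAINER code).

    def difficulty_flag(performance_value):
        # Flag a question whose total performance value (K x E for
        # those who answered plus Item Score for the whole class)
        # falls below the 75% threshold.
        return performance_value < 75.0

    def discrimination_symbol(p_chance):
        # Map a chance probability to the report symbols:
        # '1' (best, at or under 1%), '5' (better, 5%), 'T' (good, 10%).
        if p_chance <= 0.01:
            return "1"
        if p_chance <= 0.05:
            return "5"
        if p_chance <= 0.10:
            return "T"
        return " "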
----------------------------------------------------------------------
Converting Raw Scores to Grades 5/7
----------------------------------------------------------------------
A score distribution can be changed in two ways:
     1. Shift:   Sliding the score distribution to the right or
                 left on a grade scale.
     2. Stretch: Expanding or contracting the score distribution
                 on a grade scale.
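     Both adjustments are linear transforms of the raw scores. A
minimal sketch follows, in Python rather than the original PDS BASIC.
TRGRADES is interactive, and the pivot about which the distribution
is stretched is not specified in this file, so the pivot parameter
here is an assumption.

    # Sketch of shift and stretch on a grade scale (an illustration,
    # not the TRGRADES code).

    def adjust(score, shift=0.0, stretch=1.0, pivot=0.0):
        # Shift slides every score by the same amount; stretch
        # expands (>1) or contracts (<1) the distribution about
        # the pivot point.
        return (score - pivot) * stretch + pivot + shift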
TRGRADES permits an instructor to repeatedly modify the score
distribution until the desired grade distribution is obtained. The
validity of the modifications is related to the way they are done.
     1. Norm-Referenced (Automatic): The weight of all "bad" items
        can be used (1/3 shift and 2/3 stretch for KNOW tests; one
        possible reading is sketched after this list). This tends
        to produce a grade distribution similar to the normal
        curve, with the exception of a few students receiving
        scores of over 100%. These are the students who correctly
        answered more items than the class, as a whole, determined
        were valid. This system rewards outstanding performance at
        no penalty to others in the class (almost the reverse of
        curving guess-test scores).
     2. Criterion-Referenced: "Bad" items are inspected and then
        either accepted as bad or considered items that the class
        is held responsible for within the instructional system
        (lecture, reading assignment, laboratory, homework,
        projects, etc.). These items need to be discussed with
        the class.
     3. Inspection: Almost any favorite grade distribution can be
        obtained by a combination of shift and stretch.
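     One possible mechanical reading of the automatic option, building
on the adjust() sketch above: the summed weight of the "bad" items is
applied one third as a flat shift and two thirds as a stretch scaled
by the maximum raw score. The exact formula TRGRADES uses is not given
in this file, so this is strictly an illustration of the idea, not the
program's algorithm.

    # One assumed reading of "1/3 shift and 2/3 stretch" (not the
    # TRGRADES algorithm).

    def auto_adjust(score, bad_weight, max_score):
        shift = bad_weight / 3.0
        stretch = 1.0 + (2.0 * bad_weight / 3.0) / max_score
        return adjust(score, shift=shift, stretch=stretch)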
     Option 2 is the most valid use of the program. Using the short-
form item analysis and a copy of the test, a determination can be
made in about 10 minutes for a 48-question test.
The program also removes two common faculty worries:
1. The test will be too hard or too easy. There is no need to
attempt the feat of selecting questions with the goal of
obtaining a raw score distribution that will also be the
grade distribution.
2. Bad questions that will require adjusting scores. This system
actually needs a few such items just to keep students using
higher levels of thinking. The idea that all questions are
equally valid for each student and for each class to answer is
an academic delusion. Bad items will always be there, in
part, due to the missmatch of student, teacher, and evaluator
operating at different levels of thinking. There is no need
to intentionally create them.
----------------------------------------------------------------------
TRAINER Programs and Files 6/7
----------------------------------------------------------------------
PROGRAMS: TR main menu
TRAINER scores for quality and quantity
TRGRADES converts raw scores to grades
TRREF referees independent marking
TRERROR corrects card reader errors
BRT71EFR PDS BASIC run-time module 7.1
FILES: TRAINER
Input: Answer Data File, positions 1-9 = I.D. field
positions 11-100 = answers
(See TRSAMPLE.DOC for an example)
Output: PRINT .FIL: Individual score slips
Ranked scores
Histogram
Class plot (quality & quantity)
Item analysis
GUESSCOR.FIL: GUESS test scores
KNOWSCOR.FIL: KNOW test scores
TRGRADES
Input: GUESSCOR.FIL and/or KNOWSCOR.FIL
Output: GRADES .FIL
TRREF
Input: Answer Data File
Output: BARGUESS.FIL Answer bar graphs that
BARKNOW .FIL supplement the item analysis.
SIMGUESS.FIL Similarity check.
SIMKNOW .FIL
CONGUESS.FIL Uniqueness check and collation
CONKNOW .FIL of similarity and uniqueness to
confirm independent marking.
INFORMATION FILES: TRSAMPLE.DOC Set of 37 answer cards.
TRINST .DOC This instruction file.
WARRANTY.DOC Warranty and distribution.
REGISTER.DOC Registration of use.
     File names are designed to allow one DELETE *.FIL command to
remove all temporary files from the directory. Any file to be saved
must first be renamed with an extension other than .FIL.
Also see README.DOC and UPDATE.DOC.
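     For readers writing their own tools, a minimal sketch of reading
the Answer Data File layout described above, in Python rather than
the original PDS BASIC. Column 10 is not described in this file and
is simply skipped here, and the input file name is made up for the
example; see TRSAMPLE.DOC for real records.

    # Sketch of the fixed-width Answer Data File record (an
    # illustration, not the TRAINER code).

    def parse_record(line):
        # Positions 1-9 are the I.D. field; positions 11-100 are
        # the answers, one column per item (blank = not marked).
        line = line.rstrip("\n").ljust(100)
        return line[0:9], line[10:100]

    with open("ANSWERS.DAT") as f:   # file name assumed for example
        for record in f:
            student_id, answers = parse_record(record)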
------------------------------------------------------------
BIBLIOGRAPHY and SUPPORT 7/7
------------------------------------------------------------
Hart, Richard A. 1981. Evaluating and rewarding student
initiative and judgement or an alternative to "sitting
through" a course if you did not test out. Pages 75-76
in Directory of Teaching Innovations in Biology.
Meeth, L. R. and Dean S. Gregory, Ed. Studies in
Higher Education: Arlington, Virginia. 252 pages.
Hart, Richard and Kenneth Minter. 1985. Using a
computer to manage typical classroom problems.
National Science Teachers Association Annual Meeting,
Cincinnati, Ohio 18-21 April.
Minter, Kenneth and Richard Hart. 1986. Essay testing
using multiple choice questions. Missouri Academy of
Science Annual Meeting, Warrensburg, MO 25-26 April.
Hart, Richard and Kenneth Minter. 1988. Diagnostic
Testing Using Multi-Choice and Matching Questions.
National Science Teachers Association Annual Meeting,
St. Louis, MO 7-10 April.
Minter, Kenneth and Richard Hart. 1989. Student Choice
in Computer Graded Tests. National Science Teachers
Association Annual Meeting, Seattle, Washington
6-9 April.
Hart, Richard and Kenneth Minter. 1991. Student Choice
in Multiple-Choice Testing. National Science Teachers
Association Annual Meeting, Houston, Texas
27-30 March.
------------------------------------------------------------
Program support is available from Nine-Patch Software,
315 South Alco Ave., Maryville, MO 64468-2033 for registered
users. (Otherwise, include a stamped, self-addressed envelope.)
Phone 816-582-8589 CIS 71222,3565
Assistance in adapting higher levels of thinking to
existing instructional programs is available. Of interest
are student and teacher workshops and demonstrations in
which the participants experience the concepts as well as
learn about them.
Richard A. Hart, Ph.D.
315 South Alco Avenue, Maryville, MO 64468-2033
------------------------------------------------------------