I would like to thank the WTC for their generous support of this research. I would also like to thank Dr. Tom Furness and Dr. William Winn for their support and direction of this project.
This report outlines the final results of the Zengo Sayu Project funded by the WTC. Zengo Sayu is an immersive virtual environment created to teach basic spoken Japanese to beginning students. This research set out to answer the following research questions: 1) Can students learn spoken Japanese using the Zengo Sayu VR system? 2) How does Zengo Sayu compare to other teaching methods in terms of learning gains? 3) Does the Zengo Sayu approach have a positive effect on students' motivation and attitude toward learning Japanese?
The findings of this study are positive for research questions one and two, but no significant differences were measured for student motivation and attitude in question three. This study has also given us greater insight into creating educational virtual environments, interface design issues and the potential of VR technology in general. This research is also the topic of my master's thesis, which details the supporting theory and design principles underlying the construction of Zengo Sayu. (My thesis is now available on the HITL web site for public distribution at http://www.hitl.washington.edu/publications/tech-reports/tr-95-1-rose/home.html.)
The Zengo Sayu virtual environment simulates a Japanese-style room. The essence of the learning activity is an interactive construction game that uses building blocks to teach colors, nouns, sentence structure and five prepositions of place. Zengo Sayu uses a whole language approach which allows the user to progress at her own pace through an environment imbued with knowledge. Touching objects makes them talk. When one object is placed atop another, the objects describe their interrelationship. As knowledge builds, so does the complexity of the environment. The design and application of the Zengo Sayu environment are described in greater detail in my master's thesis referenced above.
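The interaction rules described above (touching an object makes it talk; stacking objects makes them describe their relationship) can be illustrated with a minimal sketch. This is not the Zengo Sayu implementation: the `Block` class, the `describe_stack` function, and the romanized vocabulary are all assumptions chosen for illustration, and only one preposition of place ("on top of") is modeled.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical representation of a building block; not from the report.
    color: str  # romanized color adjective (illustrative vocabulary)
    noun: str   # romanized noun (illustrative vocabulary)

    def touch(self) -> str:
        """Return the phrase the block would 'say' when touched."""
        return f"{self.color} {self.noun}"


def describe_stack(top: Block, bottom: Block) -> str:
    """Return the sentence spoken when `top` is placed atop `bottom`."""
    return f"{top.touch()} wa {bottom.touch()} no ue ni arimasu"


red_box = Block(color="akai", noun="hako")  # "red box"
blue_box = Block(color="aoi", noun="hako")  # "blue box"

print(red_box.touch())                    # akai hako
print(describe_stack(red_box, blue_box))  # akai hako wa aoi hako no ue ni arimasu
```

In the actual system the objects speak aloud; the strings here merely stand in for the audio a learner would hear.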
This study compared subjects' performance in three treatment groups: VR using Zengo Sayu, instruction that mimicked the virtual environment but was taught in the real world (RW), and a more conventional approach closer to the text-based instruction used in most foreign language programs (TB). The VR and RW groups used a direct method of language instruction in which all lessons were conducted in the target language and students did not receive explicit grammatical explanations. Both the RW and TB treatments used the same props and manipulatives, with the primary distinction being that the TB group received some instruction and explicit grammatical explanations in English.
The dependent variables for this experiment were: 1) Listening comprehension, measured with the Test of Listening Comprehension (TLC), 2) Speaking ability, measured with the Test of Oral Production (TOP), and 3) Motivation and attitude, measured with an exit survey.
Subjects attended two 90-minute classes spaced one week apart. Total N for this study was 43 for the first session and 25 for the second session, owing to subject attrition. The n for each of the groups is shown in Table 1.
Table 1: Number of Subjects per Treatment
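The overall attrition implied by the session totals reported above (43 subjects in Session 1, 25 in Session 2) can be worked out directly. The sketch below uses only those two totals; the `attrition_rate` helper is a hypothetical name introduced for illustration, and per-group rates are not computed because they depend on the per-treatment counts in Table 1.

```python
def attrition_rate(n_start: int, n_end: int) -> float:
    """Fraction of subjects lost between two sessions."""
    if n_start <= 0:
        raise ValueError("n_start must be positive")
    return (n_start - n_end) / n_start


session1_n = 43  # total N reported for the first session
session2_n = 25  # total N reported for the second session

lost = session1_n - session2_n
rate = attrition_rate(session1_n, session2_n)
print(f"{lost} of {session1_n} subjects lost ({rate:.1%})")
# prints: 18 of 43 subjects lost (41.9%)
```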
Three experienced, native speakers of Japanese were randomly
assigned to the three groups. Instructors were informed that this
was a comparison of various teaching methods, but none was informed
of the study's research hypotheses.
It was anticipated that both the RW and the VR treatments would yield higher listening comprehension (TLC) scores than the TB treatment, based on their immersive characteristics, high levels of interactivity and whole language approach. It was expected that the VR TLC scores would equal or exceed those of the RW treatment in learning gains, though the positive effects of the VR treatment might be diminished by a number of factors such as hardware interface problems, users' adverse reactions to VR and unfamiliarity with virtual interfaces.
A summary of the results for the three dependent variables is as follows.
Three listening comprehension tests were administered: P1 at the end of Session 1, P2 as a retention measure before Session 2, and P3 as a final test at the end of Session 2. The mean percentages for the three respective groups are shown in Table 2. An ANOVA showed no statistically significant differences between any of the groups for P1 (F=1.55) or P2 (F=2.57). Results for the P3 test show a statistically significant difference between RW and TB (F=5.85), with no significant differences measured for the VR group.
Table 2: Test Of Listening Comprehension Results (Mean % Scores/s)
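For readers unfamiliar with the F values quoted above, the following is a minimal sketch of how a one-way ANOVA F statistic is computed across treatment groups. It is not the analysis code used in this study, and the scores are placeholder values invented purely to make the example runnable; they are not data from the experiment. The function name `one_way_anova_f` is likewise an assumption.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder percentage scores, NOT results from this study.
scores = {
    "VR": [60.0, 70.0, 80.0],
    "RW": [70.0, 80.0, 90.0],
    "TB": [80.0, 90.0, 100.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints: F(2, 6) = 3.00
```

The resulting F value is compared against the critical value of the F distribution for the given degrees of freedom to decide whether the group means differ significantly. A significant overall F does not by itself identify which pair of groups differs; that requires a follow-up pairwise comparison.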
An oral production test (TOP) was administered to each subject at the end of the second session. The tests were videotaped and analyzed, but the results were deemed too subjective for a meaningful quantitative analysis. However, the researchers and instructors informally agreed that subjects' oral production performance was generally lower than their listening performance, and that the performance of the respective groups ranked from high to low as follows: TB, VR, RW.
All subjects completed an exit survey about attitudes toward the instructional treatment and learning Japanese in general. The results of the survey reveal the same trends for both attitude and anxiety for all three treatments. No significant differences were found between treatments.
Our analysis focused on the results of the TLC combined with the results of qualitative observations. In Session 1, the VR group was required to first learn to manipulate the virtual interface before they could get to the task of learning Japanese. All VR subjects developed a basic ability to move and manipulate things in the virtual environment in the allotted 30-minute warm-up period (each subject was given between 7 and 10 minutes to fly around an English version of the Zengo Sayu environment). Yet we believe that the added cognitive load of dealing with the unfamiliar VR interface hampered their progress in the language learning task. Thus the relatively lower TLC scores for the VR group are not surprising.
The TLC P2 scores, which measure retention, show a slight gain for the TB group, a minimal drop for the VR group, and a drop of nearly 18% for the RW group. While this sharp drop was not verified to be statistically significant due to the low number of cases in Session 2 of the RW group, this finding suggests the potential for treatment effects relating to retention and recall. The slight gains for the TB group could be attributable to a positive influence of individuals who studied their notes during the period between sessions, or to a possible ceiling effect for some subjects on the P1 test. The stability of scores between P1 and P2 for the VR group suggests that there is indeed some aspect of this treatment which enhances retention compared with the same instructional approach applied in the RW treatment.
The TLC P3 test was substantially more difficult than both P1 and P2. Mean scores show that the TB group performed the best, with the VR group only 6% behind. Scores for the RW group were the lowest, and a statistically significant difference was found between TB and RW in spite of the low number of subjects in the RW group (n=6). While the difference between VR and RW scores was not significant, the fact that the VR scores closely resemble those of the TB treatment suggests that the low number of subjects for VR (n=8) was not enough to obtain measurable differences.
It is important to keep in mind the uniqueness and limitations of this study when interpreting these results. First, this study was done under 'laboratory' conditions, and not as part of an actual Japanese language course. All treatment class sizes, which ranged from 2 to 9 people, were significantly smaller than what is normally found in the majority of foreign language programs at any level. Inconsistencies in class sizes were due to subjects who did not appear for classes to which they had committed, and to attrition. The RW group suffered the most from both of these problems, and experienced far more attrition than either of the other groups (see Table 1). The high attrition in the RW group may be attributable to outside causes or coincidence, but it may also suggest that many people are not comfortable or successful with direct method, immersive instruction.
In addition to class size, the experience and training of the instructors who participated in this study are also worth mentioning. All the instructors had extensive experience teaching beginning Japanese. However, two of these instructional methods (VR and RW) are highly specialized and experimental, while one (TB) is far closer to the forms of instruction to which all three teachers are accustomed. Thus in the case of the TB treatment, both the instructor and possibly also the students benefited from the fact that their treatment was closer to known and expected conventions. This effect may have been amplified by the characteristics of the subject group, which was recruited from the University of Washington campus community. These subjects' academic backgrounds and experience could be an external cause of the relative strength of the TB treatment.
This study is susceptible to a number of other potential sources
of confounding including: subjects' preconceptions and attitudes
toward technology, gender and age differences among subjects,
and excessive variance between treatments due to the nature of
the respective teaching methods.
The Zengo Sayu Project took a solid step towards understanding how VR may be a useful educational tool. While this study was unable to produce statistically conclusive evidence showing VR to be superior to the other methods (particularly the TW method), I am able to answer our first research question with great certainty: subjects are able to learn Japanese from Zengo Sayu. This opportunity to test and compare Zengo Sayu has substantially enriched our understanding of human factors and of how the virtual environment could be improved and made more accessible to our users.
I am encouraged by these initial findings, and look forward to
enhancing the virtual environment based on this study's findings.
I also believe that the results of this study warrant continuing
this line of research within a more authentic context of a complete
language program.