ABSTRACT
This article reports the findings of a study on the impact of learner control on the error correction process within a web-based Intelligent Language Tutoring System (ILTS). During three one-hour grammar practice sessions, 33 students used an ILTS for German that provided error-specific and individualized feedback. In addition to receiving detailed error reports, students had the option of peeking at the correct answer, even before submitting a sentence (browsing). The results indicate that the majority of students (85%) sought to correct errors on their own most of the time, and that 18% of students abstained entirely from looking up answers. Furthermore, the results identify language skill as a predictor for students belonging to the group of Browsers, Frequent Peekers, Sporadic Peekers, and Adamants.
KEYWORDS
Intelligent Language Tutoring Systems, Intelligent and Individualized Feedback, Learner Control in CALL, Web-Based Language Instruction, Grammar Practice
INTRODUCTION
In early CALL programs, based on behaviorist principles, students worked within a strict framework: navigation was generally hard-wired into the program (students were often trapped in an exercise unless they provided the correct answer), and help options were limited or nonexistent. In contrast, modern CALL programs emphasize student control or, following Higgins (1987), the Pedagogue Role of the computer. In practical terms, users navigate more freely through the program, can terminate
the program at any point, and have a number of options while working on different tasks. Despite this extra user control, learners do not, of course, always use each and every option available. For example, Cobb and Stevens (1996) discovered that students did not make use of help options although they knew that such use could improve their learning outcome (see also Steinberg, 1977, 1989; Chapelle, Jamieson, & Park, 1996; Bland, Noblitt, Armstrong, & Gray, 1990).
More than a decade ago, Chapelle and Mizuno (1989) found in a study on learner-controlled CALL grammar lessons that there is a need for teachers and researchers alike to observe students' use of CALL. Ideally, programs should be developed, tested, and then revised to reflect student preferences and instructors' guidance towards appropriate use (see also Hubbard, 1996). While the past decade has contributed to a better understanding of learner control in CALL, a number of issues remain outstanding, and more research in this area is needed.
This article reports on a study of student-computer interaction. In particular, it examines how learner control in grammar practice affects error correction strategies while using an Intelligent Language Tutoring System (ILTS).
The ILTS tested provided error-specific and individualized feedback. In the event of an error, users could resubmit the sentence, request the correct answer, or skip an exercise altogether. The data for the project were collected from 33 students in an introductory German course who used the ILTS during three one-hour grammar practice sessions. A computer log recorded the interaction, and a total of 4,456 sentences were analyzed.
BACKGROUND
In her article "Parsers in Tutors: What are they Good for," Holland (1991) explored the role of Intelligent Computer-Assisted Language Learning (ICALL) and concluded that ICALL is useful in three areas: (a) in form-focused instruction, (b) for students of at least intermediate proficiency, and (c) in research.
ICALL systems inherently provide more learner control than traditional CALL programs due to their sophisticated answer processing mechanisms. Unlike the more traditional drill and practice programs, ICALL software employs Natural Language Processing (NLP) which overcomes the rigidity of the response requirements of traditional CALL. The programs generally consist of a grammar and a parser which performs a linguistic analysis on the written language input. When learner errors are discovered by the system, the program generates error-specific feedback explaining the source of error.
Over the past decade, a number of NLP systems have been implemented
(Labrie & Singh, 1991; Levin & Evans, 1995; Loritz, 1995; Hagen, 1994; Holland, Kalan, & Sama, 1995; Sanders, 1991; Schwind, 1995; Wang & Garigliano, 1992; Yang & Akahori, 1997, 1999; Heift & Nicholson, 2000a). Additionally, a number of studies have focused on comparisons of CALL programs. For example, Nagata (1993, 1995, 1996) compared the effectiveness of error-specific (or metalinguistic) versus traditional feedback with students learning Japanese. In all studies, Nagata found that intelligent computer feedback based on NLP can explain the source of an error and, thus, is more effective than traditional feedback (see also Yang & Akahori, 1997, 1999; Brandl, 1995).
The studies above focused on students' learning outcomes (results) and confirm Holland's conclusion that ICALL is effective and useful, in particular, for form-focused instruction. However, it would be equally instructive to examine the learning process while students work with such systems (Heift, 2001). As Chapelle and Mizuno (1989) state, "… when low-ability students perform poorly on a criterion measure, it remains unclear how their work with the courseware may have failed to facilitate their eventual achievement."
In terms of error correction, van der Linden (1993) found that, when comparing learner strategies in programs with different levels of feedback, feedback about the type of error encouraged students to correct their work themselves. The question arises whether such a feedback strategy would apply given a learner-controlled grammar practice environment in which the student can access correct answers or even skip exercises. A study by Cobb and Stevens (1996) showed that students "who rely excessively on program-supplied help are not learning as much as those who try to solve problems through their own self-generated trial-and-error feedback." For this reason, while CALL programs should provide a degree of learner control (Steinberg, 1989), it is important that students not overuse quick routes to correct answers.
The current study focuses on whether students correct themselves in a learner-controlled practice environment of an ILTS. Moreover, it examines whether language skill level influences students' error correction behavior.
In the following sections, we will describe German Tutor, the web-based ILTS for German which was used for this study. We will then describe the participants and outline the tasks and methodology used. Finally, we will summarize the results by providing examples of students' output during the practice sessions, and conclude with suggestions for further research.
AN INTELLIGENT LANGUAGE TUTORING SYSTEM FOR GERMAN
The German Tutor contains a grammar and a parser that analyzes sentences entered by students and detects grammatical and other errors. The
feedback modules of the system correlate the detailed output of the parser with error-specific feedback messages.
Feedback is also individualized using an adaptive Student Model which keeps a record of a student's strengths and weaknesses. The user's performance over time is monitored across different grammatical constructs; the information is used to tailor feedback messages suited to learner expertise within a framework of guided discovery learning. Feedback messages for beginners are explicit, while the instructional messages for the advanced learner merely hint at the error (see Elsom-Cook, 1988). The feedback aimed at beginning learners also contains less technical terminology than that for the intermediate and advanced learner (Heift & McFetridge, 1999).
For instance, (1c) below shows the feedback message for an intermediate student who made a mistake with an auxiliary verb.
(1a) *Bianca hat ohne er gegangen.
(1b) Bianca ist ohne ihn gegangen.
(1c) Hier stimmt das Hilfsverb HAT nicht.
(The auxiliary HAT is wrong here.)
In contrast to (1c), the feedback message for the beginning student simply states that HAT is incorrect without referring to the word auxiliary. The beginning-level message also stipulates that GEHEN requires IST. For the advanced learner, the feedback does not identify the word HAT but simply displays the message "The auxiliary is wrong in this sentence."
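The three levels of explicitness described for example (1) can be illustrated with a minimal sketch. The function name `auxiliary_feedback`, its parameters, and the English wording of the beginner message are assumptions made for illustration; the article gives only the intermediate message verbatim and paraphrases the other two, and it does not show how the German Tutor implements this.

```python
def auxiliary_feedback(level, wrong_aux, verb, right_aux):
    """Return a feedback message for a wrong auxiliary, scaled to learner level.

    Hypothetical helper: the wording for beginners is paraphrased from the
    article, not taken from the actual system.
    """
    if level == "beginner":
        # Explicit: names the wrong word and the required form,
        # without using the grammatical term "auxiliary".
        return f"{wrong_aux} is wrong here. {verb} requires {right_aux}."
    if level == "intermediate":
        # Names both the error type and the offending word.
        return f"The auxiliary {wrong_aux} is wrong here."
    if level == "advanced":
        # Only hints at the error type; the word itself is not identified.
        return "The auxiliary is wrong in this sentence."
    raise ValueError(f"unknown learner level: {level}")


for level in ("beginner", "intermediate", "advanced"):
    print(level, "->", auxiliary_feedback(level, "HAT", "GEHEN", "IST"))
```

The point of the sketch is only that the same detected error maps to different messages; how the system selects the level is described in the Results section.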
In addition to tailoring feedback messages suited to learner expertise, the system also recommends remedial tasks. At the end of each chapter, the system displays learner results and suggests additional exercises according to the number and kind of mistakes that have occurred.
Finally, in the case of multiple errors, the system prioritizes student errors and displays one message at a time so as not to overwhelm the student with excessive error reports. For instance, once the student corrects the error with the auxiliary in (1a) above, the feedback message will then indicate that the case of the pronoun ER of the prepositional phrase OHNE ER is not correct. Previous studies (van der Linden, 1993) have found that lengthy error messages tend to distract the student from the task. Error prioritization also follows pedagogical principles by considering the salience of an error and/or the focus of a particular exercise (Heift & McFetridge, 1999). It is the purpose of this study to identify whether students indeed work through the iterative error correction process to correct errors or whether they rely overly on system help.
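As a rough sketch of the one-message-at-a-time behavior, the following code picks a single error to report from a list of detected errors. The `PRIORITY` table and the function `next_feedback` are hypothetical; the article states that errors are prioritized by salience and exercise focus but does not give the actual ordering, so the numbers here are chosen only to reproduce the order seen in example (1a).

```python
# Hypothetical priority order (lower number = reported first). The real
# ordering used by the German Tutor is not given in this article.
PRIORITY = {"auxiliary": 0, "case": 1, "spelling": 2}


def next_feedback(errors):
    """Return the single highest-priority error to report, or None if none remain."""
    if not errors:
        return None
    # Unknown error types sort after all known ones.
    return min(errors, key=lambda e: PRIORITY.get(e["type"], len(PRIORITY)))


# Example (1a): two errors are present, but only one is reported per submission.
errors = [
    {"type": "case", "word": "ER"},
    {"type": "auxiliary", "word": "HAT"},
]
first = next_feedback(errors)   # the auxiliary error is reported first
errors.remove(first)            # the student fixes it and resubmits
second = next_feedback(errors)  # now the case error is reported
print(first["type"], "then", second["type"])
```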
PARTICIPANTS AND PROCEDURE
During the spring semester 2000, the ILTS was used with 33 students of two introductory classes of German. The data were collected during three one-hour class sessions. For the study described here, students worked on the "Build a Sentence" exercise in which words are provided in their base forms and students are asked to construct a sentence (see Figure 1).
Figure 1
Build A Sentence Exercise
In the event of an error, students have a number of options in the exercise. They can either correct the error and resubmit the sentence by clicking the Prüfen 'check' button, peek at the correct answer(s) with the Lösung 'answer' button, or go on to the next exercise with the Weiter 'next' button. If students choose to correct the sentence, it is checked again for further errors. The iterative correction process continues until the sentence is correct.
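The three options can be pictured as a small loop over button presses. The sketch below is illustrative only: `run_exercise` is a hypothetical function, and a plain string comparison stands in for the parser-based analysis that the actual system performs.

```python
def run_exercise(actions, correct_answers):
    """Replay (button, text) actions for one exercise.

    Returns (outcome, checks), where checks counts Prüfen submissions.
    Exact string matching is an assumption standing in for the parser.
    """
    checks = 0
    for button, text in actions:
        if button == "Prüfen":      # check: submit or resubmit the sentence
            checks += 1
            if text in correct_answers:
                return "correct", checks
        elif button == "Lösung":    # answer: peek at the correct answer(s)
            return "peeked", checks
        elif button == "Weiter":    # next: move on to the next exercise
            return "skipped", checks
        else:
            raise ValueError(f"unknown button: {button}")
    return "unfinished", checks


answers = {"Bianca ist ohne ihn gegangen."}
corrected = run_exercise(
    [("Prüfen", "Bianca hat ohne er gegangen."),
     ("Prüfen", "Bianca ist ohne er gegangen."),
     ("Prüfen", "Bianca ist ohne ihn gegangen.")],
    answers,
)
peeked = run_exercise(
    [("Prüfen", "Bianca hat ohne er gegangen."), ("Lösung", "")],
    answers,
)
print(corrected, peeked)
```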
During the three one-hour sessions, students worked on six chapters with a total of 120 exercises. Each practice session covered two chapters, but not all students finished all 40 exercises of the two chapters during the given practice time. Also, not all students were present in all three practice sessions.
The grammatical structures present in the exercises were: gender and number agreement of noun phrases, subject-verb agreement, present tense of regular and irregular verbs, accusative and dative objects/prepositions, two-way prepositions, present perfect, auxiliaries, word order of finite and nonfinite verbs, modals, and separable prefix verbs. The linguistic
structures had all been practiced in communicative class activities prior to the computer sessions. Students were also familiar with the grammatical terminology used in the system feedback.
For data collection, we implemented a computer log to collect detailed information on the student-computer interaction (see Heift & Nicholson, 2000b). Students were aware that the computer logs were collecting data, but they were not shown an example. Students chose an anonymous login ID which they used consistently across all three sessions.
RESULTS
Table 1 provides a general summary of student interactions with the program.
Table 1
Submission Types
A total of 4,456 server requests were made during the three one-hour practice sessions, that is, an average of 135 requests per student during their total practice time. Students did not provide any input for 51 sentences (1%); they simply requested the correct answer(s) and moved on to the next exercise. Forty per cent of the submitted sentences were correct on first submission, while 59% required retries. In 11% of the retries, students peeked at the correct answer at some point during the error correction process; in the remaining 89%, learners corrected their mistakes and eventually submitted a correct answer.
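The per-student average and the share of no-input requests follow directly from the reported totals. The short check below uses only figures stated in the text; the variable names are ours.

```python
total_requests = 4456   # server requests over all three sessions
students = 33           # participants
no_input = 51           # answer requests made without any student input

per_student = total_requests / students
no_input_share = 100 * no_input / total_requests

print(f"requests per student: {per_student:.1f}")
print(f"no-input share: {no_input_share:.1f}%")

# The three reported submission categories account for all requests.
assert 1 + 40 + 59 == 100
```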
Analyzing the data with respect to learner-system interaction, we found four distinct interaction types: (a) Browsers, (b) Frequent Peekers, (c) Sporadic Peekers, and (d) Adamants (see Table 2).
Table 2
Interaction Types
Table 2 shows that 18% of the students browsed through the exercises without providing any input at some point during the three practice sessions, that is, they did not attempt to answer an exercise. The remaining three interaction types were determined by two factors: (a) the number of retries for an exercise and (b) the number of peeks. Fifteen per cent of the students were Frequent Peekers who requested the correct answer(s) from the system more often than they corrected their errors. Sixty-seven per cent, the Sporadic Peekers, used system help options less often than they corrected themselves. Eighteen per cent, the Adamants, corrected their errors and peeked at the correct answer not more than once during total practicing time. The four distinct interaction types will be discussed in the following sections.
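The grouping criteria can be restated as a simple decision rule. The sketch below is one possible reading, not the procedure actually used in the study: the function names are hypothetical, the order in which the rules are checked is an assumption, and the text does not say how a student with equal numbers of peeks and corrections would be classified. Browsing is treated as a separate flag because Browsers overlap with the other groups.

```python
def interaction_type(corrections, peeks):
    """Classify a student from total self-corrections and total peeks.

    Assumed rule order: Frequent Peeker is tested before Adamant.
    """
    if peeks > corrections:
        return "Frequent Peeker"   # peeked more often than corrected
    if peeks <= 1:
        return "Adamant"           # peeked not more than once in total
    return "Sporadic Peeker"       # peeked, but less often than corrected


def is_browser(no_input_requests):
    """A Browser requested an answer at least once without typing anything."""
    return no_input_requests > 0


# Student D7: 12 exercises corrected, 18 peeks, 6 exercises browsed.
print(interaction_type(12, 18), is_browser(6))
```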
Browsers
Table 2 indicates that six students (D21 [11], D16 [9], D33 [9], D8 [8], D18 [8], and D7 [6]) tended to browse through the exercises, sometimes requesting the answer without providing any input. Student D21 skipped the most exercises (11), while D7 browsed through six exercises.
There are a number of possibilities why students might have chosen this strategy during the practice sessions. First, students might have thought that they knew the answers to the exercises they skipped and chose to not type them in. Second, students may have been curious to see all possible answers for certain exercises. (If students type in an answer, they are informed whether or not their specific answer is correct, but they do not get to see other possible answers.) Third, students may have wanted to complete the two chapters of each practice session in the time allotted and decided to skip some exercises.
To address this question, we examined students' language skill level, which we determined by (a) the percentage of initially correct submissions and (b) the system's assessment for each student during practice. The number of initially correct submissions was above average for three of the Browsers (D18, D7, D33) who achieved 79.4%, 74.5%, and 70%, respectively. The remaining three students (D21, D8, D16) were below average with 37.9%, 35%, and 10.4%, respectively. The Browsers' group average was 54.5%, compared to 40.2% for all students (see Table 3).
Table 3
Language Skill Level for Browsers
With respect to student assessment during practice, the system keeps a detailed record of student performance. When a sentence is submitted, the value for each linguistic element in the student input (e.g., direct object, gender, subject-verb agreement, etc) is incremented or decremented depending on whether it was correct or not. In subsequent retries of the same exercise, only the values of the linguistic structures which are still incorrect are updated. The values correspond to one of three learner levels: beginner, intermediate, or advanced. As a result, the student is assessed over time, and the values reflect cumulative performance for each
linguistic structure of the exercises completed (see Heift & Nicholson, 2000a).
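A minimal sketch of such a student model is given below. Several details are assumptions made only for illustration: the class name `StudentModel`, the step size of 1, the direction of the update (up for correct, down for incorrect), and the thresholds that map a value to a learner level. The article specifies none of these; see Heift and Nicholson (2000a) for the actual design.

```python
from collections import defaultdict

# Hypothetical thresholds: the article does not give the real cut-off values.
BEGINNER_MAX = -2   # value at or below this -> beginner
ADVANCED_MIN = 2    # value at or above this -> advanced


class StudentModel:
    """Keeps one running value per linguistic structure for a single student."""

    def __init__(self):
        self.scores = defaultdict(int)

    def record(self, results, retry=False):
        """Update values from one submission.

        results maps a structure name to True (correct) or False (incorrect).
        """
        for structure, correct in results.items():
            if retry and correct:
                # On retries, only structures that are still incorrect are updated.
                continue
            self.scores[structure] += 1 if correct else -1

    def level(self, structure):
        score = self.scores[structure]
        if score <= BEGINNER_MAX:
            return "beginner"
        if score >= ADVANCED_MIN:
            return "advanced"
        return "intermediate"


model = StudentModel()
model.record({"auxiliary": False, "direct object": True})              # first try
model.record({"auxiliary": False, "direct object": True}, retry=True)  # retry
print(model.level("auxiliary"), model.level("direct object"))
```

In this run the auxiliary value drops twice, while the direct object value rises only once because the retry does not count a structure that was already correct.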
Table 3 above shows two distinct profiles for Browsers: predominantly intermediate to advanced and beginner to intermediate. However, the Browsers also overlap with the three remaining groups. D7, D8, D16, and D18 were also Frequent Peekers, while D21 and D33 belonged to the group of the Sporadic Peekers. These groups are discussed in the following sections.
Frequent Peekers
Students in the group of Frequent Peekers are characterized by the very low number of resubmissions in the same exercise. They request correct answers more often than they revise sentences and resubmit them. That is, they take advantage of the learner-controlled environment, using system help options more frequently than immediately trying to correct their errors.
Table 4 summarizes the number of retries of the Frequent Peekers.
Table 4
Frequent Peekers
As Table 4 indicates, the Frequent Peekers peeked a total of 146 times (32.8% of all exercises for these students). Student D7, for example, submitted 118 exercises, 88 of which were initially correct. Ten exercises were correct after one retry. In 14 other exercises, the student requested the correct answer after the system indicated a mistake. In exercises in which the student submitted two retries, he/she provided the correct answer twice and peeked at the answer four times. In total, the student corrected 12 exercises and requested the correct answer in 18 others.
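The figures given for student D7 can be checked against each other. The tally below uses only the counts stated in the text; the dictionary keys are descriptive labels of our own.

```python
d7 = {
    "initially_correct": 88,
    "correct_after_one_retry": 10,
    "peeked_after_first_error": 14,
    "correct_in_two_retry_exercises": 2,
    "peeked_in_two_retry_exercises": 4,
}

corrected = d7["correct_after_one_retry"] + d7["correct_in_two_retry_exercises"]
peeked = d7["peeked_after_first_error"] + d7["peeked_in_two_retry_exercises"]
total = d7["initially_correct"] + corrected + peeked

print(corrected, peeked, total)
# D7 peeked more often than he/she corrected, which is the defining
# property of a Frequent Peeker.
print(peeked > corrected)
```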
Also, Frequent Peekers tried no more than twice in any given exercise. Once the system flagged an error, all of these students peeked at the correct answer more often than they corrected their mistakes. Moreover, all Frequent Peekers had fewer second than first retries suggesting that Frequent Peekers generally correct a sentence once and, if unsuccessful, tend to request the correct answer.
We also determined the language skill level of the Frequent Peekers and found that with respect to initially correct submissions, two of the students (D7, D18) were performing above average with 74.6% and 72.6%, respectively. The remaining three students (D8, D12, D16) were below average with 35.1%, 33.3%, and 10.4%, respectively, for initially correct submissions. The group average was 48.8%.
Examining the log with respect to student skill level during practice, Table 5 shows that the two students, D7 and D18, were mostly at the intermediate level, never at the beginning level, during total practice time.
Table 5
Skill Level of Frequent Peekers During Practice
In contrast, while student D8 was at the advanced level twice, D12 and D16 were always at the beginning or intermediate level. The group average was 29.9% at the beginner and 64.6% at the intermediate level.
It is possible that mid to high performers (D7 and D18) had more
confidence in their own work than in the accuracy of a computer program. Consequently, if the system reported an error, they may have tended to look up the correct answer. Moreover, these students might have felt that they could have learned more from reading the correct answer than from the iterative error correction process. As for the weaker students (D8, D12, and D16), they probably found it more frustrating to work through their errors due to the number of mistakes they made and preferred to look up the correct answer.
While the Frequent Peekers habitually peeked at the answers, the Sporadic Peekers and Adamants corrected their mistakes and resubmitted their answers. In fact, they either requested the correct answer only very rarely or worked through an exercise until the bitter end. These two groups are discussed in the following sections.
Sporadic Peekers
The majority of students (66.7%) were Sporadic Peekers. These students generally corrected their errors, requesting the correct answer once in a while but significantly less often than the Frequent Peekers. Table 6 shows the error correction pattern for student D2, a typical error correction pattern for students belonging to this group.
Table 6
Error Correction Pattern for Sporadic Peeker D2
In contrast to the Frequent Peekers, Sporadic Peekers corrected their errors far more often than they peeked at the correct answer. They also repeated an exercise more often than the Frequent Peekers: up to six iterations for a single exercise in some cases.
We also considered the language skill level of the Sporadic Peekers and found that the percentages for initially correct submissions ranged between 32% and 87%, with a group average of 62% (see Table 7).
Table 7
Language Skill Level for Sporadic Peekers
The computer log further showed that the majority of these students were predominantly at an intermediate level during practice. The percentages for beginning and advanced students were nearly balanced with 13.6% and 10.7%, respectively.
From a pedagogical point of view, the correction strategy employed by the Sporadic Peekers seemed very favorable. While students generally corrected their mistakes, they did not work to the point of frustration nor let the correction process turn into a guessing game. In the group below, the Adamants, students tended to correct their answers to the bitter end, even after the corrections turned into what amounted to random guesses.
Adamants
The Adamants were similar to Sporadic Peekers in that they generally preferred to correct their errors, but they were even more persistent than Sporadic Peekers. They were the users who requested the correct answer only once or never during all three practice sessions and made little use of the help options of the ILTS. Table 8 shows the number of total exercises completed and the number of peeks for the six Adamants.
Table 8
Adamants
The data demonstrate that the six students requested the correct answer once or not at all during the total practice time. It is, therefore, not surprising that students in this group submitted the greatest number of retries: up to 10 times in several cases. Moreover, this group accounted for all instances exceeding six retries.
Considering the number of retries, it is also not surprising that some of the corrections became random; students possibly did not remember which changes they had already made. For example, we noticed that in some instances students resubmitted an identical sentence. Consider (2a)-(2j) below which illustrate the corrections a student applied before attaining the correct answer. The error types flagged by the system are given in parentheses:
(2a) Ich esse keinem Fleisch. (direct object)
(2b) Ich esse keinen Fleisch. (direct object)
(2c) Ich esse keinen\s Fleisch. (spelling)
(2d) Ich esse keinens Fleisch. (spelling)
(2e) Ich esse keinenes Fleisch. (spelling)
(2f) Ich esse keinenen Fleisch. (spelling)
(2g) Ich esse keinen Fleisch. (direct object)
(2h) Ich esse keine Fleisch. (direct object)
(2i) Ich esse keinem Fleisch. (direct object)
(2j) Ich esse kein Fleisch. (correct)
The sentence submissions given in (2a)-(2j) indicate that, in all instances, the errors occurred with the inflection of the negation kein. It should also be noted that sentences (2g) and (2i) are identical to the earlier submissions (2b) and (2a), respectively.
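Repeated submissions of this kind are easy to detect in a log. The sketch below runs over the ten submissions of example (2) exactly as printed and reports each sentence that had already been submitted earlier; the function `find_repeats` is a hypothetical helper, not part of the logging software used in the study.

```python
# Submissions (2a)-(2j), reproduced as printed in the example.
submissions = [
    ("2a", "Ich esse keinem Fleisch."),
    ("2b", "Ich esse keinen Fleisch."),
    ("2c", r"Ich esse keinen\s Fleisch."),
    ("2d", "Ich esse keinens Fleisch."),
    ("2e", "Ich esse keinenes Fleisch."),
    ("2f", "Ich esse keinenen Fleisch."),
    ("2g", "Ich esse keinen Fleisch."),
    ("2h", "Ich esse keine Fleisch."),
    ("2i", "Ich esse keinem Fleisch."),
    ("2j", "Ich esse kein Fleisch."),
]


def find_repeats(submissions):
    """Return (label, earlier_label) pairs for sentences submitted before."""
    first_seen = {}
    repeats = []
    for label, sentence in submissions:
        if sentence in first_seen:
            repeats.append((label, first_seen[sentence]))
        else:
            first_seen[sentence] = label
    return repeats


repeats = find_repeats(submissions)
print(repeats)
```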
We also considered the language skill level of the Adamants and found that they were mid to high performers. The data show that all six students scored above average in entering the correct answer at initial submission. For example, the score for correct answers entered by student D30 on the first try was 85.7%. The scores for the remaining five students ranged between 70% and 82.5%, with a group average of 75.6%.
With respect to the students' language skill level during practice, Table 9 shows that students were at the intermediate and advanced levels across most grammatical constructs (92.8%). In a few instances (7.2%), students were assessed at the beginning level (see Table 9).
Table 9
Skill Level of Adamants During Practice
It could have been expected that the Adamants were mid to high performers. Students at the beginning level may have found it too frustrating to correct sentences without any expectation of success. However, individual learner differences may have also played a role: some students may have simply refused to give up.
Language Skill Level
In comparing the language skill levels of all four interaction types, Figure 2 shows that low to mid performers tended to be Browsers and/or Frequent Peekers.
Figure 2
Skill Profile across all Constructs for Each Interaction Type
In contrast, mid to high performers tended to be Adamants, while Sporadic Peekers consisted mainly of students with intermediate language level skills. The number of beginning and advanced students among the Sporadic Peekers is fairly balanced at 13.5% and 10.6%, respectively.
Given these results we speculate that beginning learners take more advantage of system help options. First, they make more errors than learners at other levels and thus find it more frustrating to correct exercises independently. Second, students who make a lot of errors accomplish fewer exercises in the time allotted; peeking at the answer is an expedient way to advance through an exercise set. Intermediate students achieve a higher number of initially correct responses and, even in the case of errors, require fewer tries. Finally, high performers get more sentences initially correct and find working through many retries once in a while more of a challenge than a nuisance.
CONCLUSIONS AND FURTHER RESEARCH
In this article, we investigated learner control and error correction in a web-based ILTS for German. The data show that 85% of the participants
revised their sentences far more often than they peeked at the correct answer(s). The remaining students corrected their errors on occasion but relied more often on system help. The data indicate that students skipped exercises in only 1% of the total server requests.
We further identified four interaction types among our participants: Browsers, Frequent Peekers, Sporadic Peekers, and Adamants. The majority of students were Sporadic Peekers, preferring to work through the error correction process and to peek at the correct answer only occasionally. Language proficiency also seemed to be a determiner: lower performers generally made more use of system help (peeks and skips), with the exception of high performers who may have skipped what they considered to be trivial material.
The study offers three important findings. First, in an ILTS environment, students tended overwhelmingly to correct their errors and to make appropriate and effective use of the system capabilities. Second, in a student-controlled learning situation, the quick route to the correct answer was not overused and, in fact, was shunned by one fifth of the students. Most of the time, students opted to work through the iterative correction process. Obtaining a correct answer on demand, however, served to moderate student frustration, especially for low to mid performers who tended to peek more frequently than students at the other language skill levels. Third, students showed distinct interaction patterns depending on their language skill level, pointing to the need for CALL programs to allow for individualization of the learning process. The German Tutor achieved this goal in a learner-controlled environment by adjusting feedback messages suited to learner expertise.
Additional research is required for more definitive conclusions. First of all, the groups in this study were limited in size so the findings are, to a degree, tentative. It would also be interesting to compare students' error correction patterns in an ILTS to those in a less sophisticated CALL program, one that does not provide error-specific and individualized feedback.