Wednesday, 22 December 2010

CALL for education 10

Abstract:
This paper was written as a report to the British Council's English Teaching Advisory Committee, July 1983.
Despite current feelings among many that testing has come of age in CAI, there are several areas where we are still in need of better understanding. There are a number of testing techniques which can be effective if matched to our needs. One effective way to describe testing purposes and techniques is to describe the relationships between the population of all possible test items, the actual sample of this population to be used, and the learner's control of this sample. Only when correct principles of traditional testing are used can we tap the usefulness of the computer's speed and memory to assist either the teacher or the learner.

KEYWORDS: testing, interlanguages, cloze tests, achievement tests, proficiency tests, aptitude tests, diagnostic tests, test samples.
Language testing is an obvious computer application, too obvious perhaps. Computer Assisted Language Learning (CALL) covers more, however, than just testing and drilling (Last, 1982). At the moment, CALL exercises appear to be more of a testing type than of a teaching type (Page, 1982). The climate of opinion today seems to be that we know about testing in CALL and it is time to move on to other applications. In this paper I question both the assumption that we know what to do with CALL testing and the obviousness of testing for CALL. I don't think we really do know what to do. The published evidence is largely programmatic and, where data-based, very sketchy and quite vague. Nor am I convinced about the general benefits of the computer for language testing. It is true, I think, that there are some benefits, but these are limited and need to be carefully spelled out.
A Look at Current Thinking
Wise (1983) makes a distinction between computer managed instruction (CMI) and computer assisted instruction (CAI). In CMI, testing is one of the major areas of use (the others are record keeping and student branching). The advantages of having students tested on the computer revolve mainly around the ease of evaluation and the subsequent assigning of suggested course activities to the student as a result of the evaluation. Under CAI, Wise deals with what he regards as a major attribute of mainline CAI, which is its ability to account for individual differences by helping individual students. Weible (1980a) agrees that the computer's chief advantage over all other instructional media (is) its potential, when properly programmed, to adapt its presentation of material to the specific needs of the individual learner. In a paper dealing with the teaching of reading skills, Weible (1980b) describes the development of a computer program which teaches high frequency vocabulary, basic syntax, and reading skills entirely through the inferring of meaning from contextual indicators (inferencing). This approach is made accessible to students wholly unfamiliar with the target language by initial reliance on a largely native language context. What is interesting here is the connection of successive approximations (i.e. interlanguages) to the target language with linguistic redundancy, since it must be the case that at the point of each successive approximation the link (e.g. target language word) must be redundant, i.e. its meaning (of whatever kind) conveyed by context. If that were not the case the learner's gradual change from one interlanguage to another would be random. There are interesting implications here for the understanding of second language learning.
Hughett (1981) considers that it is the students (more than the teachers) who will benefit most from CALL: CAI gives instantaneous feedback and reinforcement to the learner on quizzes and it is capable of providing on-the-spot special instructions on difficult points or concepts. Such flexibility is not available through conventional audio language laboratory programs.
Computer-Assisted Testing
One of the few papers to deal specifically with CALL in testing is Jones (1981). Jones distinguishes two types of testing: exam testing and class testing. Jones quite properly argues that class testing is as much about teaching as it is about testing and he gives an example of class testing in which there is limited computer interaction as the learner gradually approximates to the correct answer. This example, Jones claims, shows how a computer program could deal with a fairly complex area of language, where a range of different answers might be possible. But for our purposes what Jones is talking about here is really teaching rather than testing. Teaching and testing can be of help to one another but they are not identical.
Higgins and Johns (1983) are very much interested in and concerned about testing. It is not surprising that in their chapter 24, which deals with testing, they begin with well established uses: in statistical operations and with optical scanners. In addition Higgins and Johns note the value of computers in the grading of materials, especially in autograding, and in the matching of materials to individual students, two points to which I will return. Higgins and Johns also point to the danger, legally and psychologically, of long term memory storage in computers, making it possible to recall unwelcome memories about past performances.
In other chapters Higgins and Johns refer to the insertion and substitution facilities of CALL for, e.g., cloze tests and reading speed tests. As with the Weible example above and Harrison (1977), the learner (or teacher) can maintain a highly sophisticated control over which words to omit, at what deletion rate and so on, as well as over which words to add and where. Texts can be made and remade at will so as to fit specific group or individual requirements. Higgins and Johns also point to the application of CALL to a restricted form of interaction, as in the game of Twenty Questions. Higgins and Johns (1982) view their approach as based more on a cognitive than a behavioristic view of language learning and, they would like to think, as more humanistic than mechanistic. Roach (1981) describes a specialized form of CALL testing, "in the form of tests of perceptual ability we have produced and tried out an experimental system for presenting tape-recorded test items for a multiple choice perception test...students are given individual print-outs of their results immediately on completion of the test." Also mentioned are computer produced and marked vocabulary tests based on the Longman Dictionary of Contemporary English, using, e.g., headword deletion with the target task of headword retrieval.
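The control over deletion rate described above can be illustrated with a minimal sketch. It assumes a fixed-ratio cloze (every nth word deleted after an untouched lead-in) with exact-word scoring. The function names `make_cloze` and `score_exact`, the default rate of 7, and the blank marker are illustrative assumptions, not taken from Higgins and Johns or Harrison.

```python
import re

def make_cloze(text, deletion_rate=7, start=2, blank="_____"):
    """Blank every `deletion_rate`-th word after the first `start` words.

    Returns the gapped text and the list of deleted words (the answer key).
    All parameter names and defaults are hypothetical.
    """
    tokens = text.split()
    answers = []
    gapped = []
    for i, token in enumerate(tokens):
        position = i - start  # position counted after the untouched lead-in
        if position >= 0 and (position + 1) % deletion_rate == 0:
            # Separate the word from trailing punctuation so the
            # punctuation survives in the gapped text.
            match = re.match(r"^(\w[\w'-]*)(\W*)$", token)
            if match:
                answers.append(match.group(1))
                gapped.append(blank + match.group(2))
                continue
        gapped.append(token)
    return " ".join(gapped), answers

def score_exact(answers, responses):
    """Exact-word scoring, ignoring case and surrounding whitespace."""
    return sum(
        1 for answer, response in zip(answers, responses)
        if answer.lower() == response.strip().lower()
    )

if __name__ == "__main__":
    passage = "The quick brown fox jumps over the lazy dog near the river bank today."
    gapped_text, key = make_cloze(passage, deletion_rate=3, start=0)
    print(gapped_text)
    print(key)
```

Changing `deletion_rate` or `start` remakes the same text as a different test, which is the kind of regrading for a group or an individual that the paragraph describes. Other scoring methods (acceptable-word scoring, clozentropy) would need a different `score_*` function.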
Test Areas Emphasized
Areas that are frequently referred to in CAI literature are those of insertion/deletion (usually with reference to cloze testing, less often to testing of reading speed), autograding (learners essentially creating their own test syllabus), and interaction (whereby the testee and the machine engage in some form of communicative game or activity, usually of the information gap variety).
Areas that do not seem to be referred to, or only very little, are those of interlanguage (in which testing has the role of an elicitation technique for an evaluation and description of movement along an interlanguage continuum) and the testing of writing, a much more orthodox testing interest. Here surely is an area of considerable scope in which sophisticated syntactic and rhetorical programming (if that is feasible) can certainly be used interactively for teaching, through the techniques of drafting and redrafting, and also for testing appropriate syntactic and rhetorical written responses to a series of instructions (discrete or sequenced) by the computer. If we cannot write programs of such subtlety for testing writing it may be, perhaps, that it is not our programming skills that are lacking but that we do not know enough about language, about writing, and about style. It may be that we will learn more by attempting to write programs; it may also be that we should not overreach ourselves in claiming for CALL testing more than we know about testing in general.
Types of Tests
There are different ways of describing tests. One convenient way is in terms of test use, viz: achievement, proficiency, aptitude, diagnostic, etc. Achievement tests are used to determine how much of a known syllabus learners control; "how much" may be norm or criterion referenced. Proficiency tests are used to determine control of an unknown syllabus. Aptitude tests are used to predict future language learning success. Diagnostic tests are used to determine how much of a known or an unknown syllabus learners do not control. In all cases there are three necessary bodies of data: one, the population of all possible test items; two, the sample of test items actually under test; and three, the number of successful test items within the learner's control. In the case of most tests (certain limited achievement tests at an elementary level are an exception) there is inevitably a gap between one and two. In the case of all tests (except perhaps certain criterion referenced tests) there must be a gap between two and three. Tests by definition are not exercises, i.e. they are not within the grasp of all learners: tests establish a rank order.
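The three bodies of data and the two gaps between them can be sketched in a few lines. This is an illustration under stated assumptions only: the item population is treated as a finite list, the sample is drawn at random, and each gap is expressed as a simple proportion. The names `sample_items` and `gaps` are hypothetical.

```python
import random

def sample_items(population, k, seed=0):
    """Draw the sample actually under test from the item population.

    A fixed seed keeps the draw repeatable; random sampling is an
    assumption here, not a claim about how tests are normally built.
    """
    rng = random.Random(seed)
    return rng.sample(population, k)

def gaps(population, sample, correct):
    """Express the two gaps as proportions.

    population_sample_gap: share of the population left untested.
    sample_control_gap:    share of the sample outside the learner's control.
    """
    return {
        "population_sample_gap": (len(population) - len(sample)) / len(population),
        "sample_control_gap": (len(sample) - len(correct)) / len(sample),
    }

if __name__ == "__main__":
    population = list(range(200))   # one: all possible test items
    sample = sample_items(population, 50)   # two: items under test
    correct = sample[:30]           # three: items the learner got right
    print(gaps(population, sample, correct))
```

On this reading, an exercise is the limiting case in which `sample_control_gap` is zero for every learner, which is why it cannot establish a rank order.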
Any CALL test use that can close the gap between the population of all possible test items and the sample under test would make an important contribution to testing. Most tests are group tests; this is a description of their construction not of their administration. In certain cases what is needed (certain criterion referenced uses perhaps) is a test of an individual against himself. I am not entirely convinced that this is so different a use that it requires a non norm-referenced method of construction, but it is the case that a swift reordering and reselecting of items appropriate to an individual's present needs would be of use. Here again CALL seems to be of interest for testing.
Limitations and Assets
It is important that we do not take a magic-cure view of testing in CALL, believing that we can test areas of language use that we cannot describe or categorize. It is fallacious to pretend to ourselves that interaction, communicative activities, rhetorical and stylistic control will become available to us because we are using computers. On the contrary, what we need to do is to consider what we can describe and categorize in language, what is unique about CALL, and how the two can be combined to assist us in testing language.
The two features of CALL that make it a potential asset to testing are speed and memory. Speed enables the most appropriate item array to be presented for any given group of learners or any individual learner. Memory provides for the closure of the population-sample item gap mentioned above.
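A minimal sketch of how speed and memory might combine: the item bank is held in memory, and the items closest to a learner's (or group's) estimated level are picked out on demand. The `Item` class, the 0.0 to 1.0 difficulty scale, and `select_item_array` are assumptions made for illustration; the paper does not specify how items are graded.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    item_id: str
    difficulty: float  # hypothetical scale: 0.0 (easiest) to 1.0 (hardest)

def select_item_array(bank, target, size):
    """Return the `size` items whose difficulty lies closest to `target`.

    Ties are broken by item_id so the result is deterministic; the chosen
    items are returned in order of increasing difficulty.
    """
    ranked = sorted(
        bank,
        key=lambda item: (abs(item.difficulty - target), item.item_id),
    )
    chosen = ranked[:size]
    return sorted(chosen, key=lambda item: item.difficulty)

if __name__ == "__main__":
    bank = [
        Item("a", 0.1), Item("b", 0.3), Item("c", 0.5),
        Item("d", 0.7), Item("e", 0.9),
    ]
    for item in select_item_array(bank, target=0.5, size=3):
        print(item.item_id, item.difficulty)
```

The larger the bank held in memory, the closer the selected array can come to the population of possible items, which is the sense in which memory narrows the population-sample gap.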
Summary
The three areas then where CALL seems of present interest to us are:
1. Self access/learner related testing in which entry points to a given population of items (e.g. a grammar test or a set of reading comprehension tests) can be swiftly determined.
2. The use of redundancy (both deletion and addition) in which a range of cloze and reading speed tests can be called up with built-in within-text as well as between-text grading. Clozentropy, the group related method of cloze scoring, is a special case of this type of testing.
3. Diagnostic testing (of grammar, vocabulary, common errors etc.) combining delicate grading as well as the repeated testing of parallel item tokens of the same type. Diagnostic testing has to date been highly desired but somehow unattainable—a sort of pseudo procedure. It does seem now that through CALL a large enough item bank memory and a swift enough access can produce reliable and valid diagnostic instruments.
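The repeated testing of parallel item tokens in the third area can be sketched as follows. The idea is that a single wrong answer says little, whereas a low success rate across several tokens of the same type is better evidence of a gap. The function name `diagnostic_profile`, the 0.5 threshold, and the item type labels are illustrative assumptions.

```python
from collections import defaultdict

def diagnostic_profile(responses, threshold=0.5):
    """Summarize performance per item type from parallel item tokens.

    responses: iterable of (item_type, is_correct) pairs, with several
               tokens per type.
    Returns {item_type: (proportion_correct, flagged)}, where `flagged`
    is True when the proportion falls below the hypothetical threshold.
    """
    totals = defaultdict(lambda: [0, 0])  # item_type -> [right, attempted]
    for item_type, is_correct in responses:
        totals[item_type][0] += int(is_correct)
        totals[item_type][1] += 1
    profile = {}
    for item_type, (right, attempted) in totals.items():
        proportion = right / attempted
        profile[item_type] = (proportion, proportion < threshold)
    return profile

if __name__ == "__main__":
    responses = [
        ("article", True), ("article", False),
        ("article", False), ("article", False),
        ("past_tense", True), ("past_tense", True),
        ("past_tense", True), ("past_tense", False),
    ]
    for item_type, (proportion, flagged) in diagnostic_profile(responses).items():
        print(item_type, proportion, "needs work" if flagged else "controlled")
```

How many parallel tokens per type are needed for a reliable diagnosis is an empirical question that this sketch does not address.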
