To begin, I feel I should make a distinction between an assessment and a test. I am constantly assessing my students through activities, discussions, etc. In fact, I place more stock in assessing students’ reading comprehension through short, low-stakes writing assignments than in exams. However, I do plan to teach in countries like South Korea, Saudi Arabia, or China that place a high premium on students’ abilities to pass exams. Most of my students there will have to pass an English proficiency test in some form or another.
In some ways, I would like to help prepare them for such exams without simply handing them generic test-taking strategies, because tests are important to my students. I don’t think they should be, and I take every opportunity to minimize the relation of exams to their ability to communicate in English. Some of my students with the highest TOEFL® scores are almost incomprehensible when engaging in mundane chit chat. Likewise, some of the lowest scorers could function perfectly well if they didn’t have to write research papers. I stress that test-taking is a skill and that their scores mostly reflect how well they take tests rather than how well they communicate their ideas in English. I began to wonder if I could design a test that would be more useful as an assessment tool rather than simply training students for future tests. What I am looking for is a cross between what I do in my current class and the traditional multiple choice tests that students are familiar with.
First, let me tell you what I do in my current class. I think it is an improvement on multiple choice tests, but is not without its problems. The methods that I have been using so far seem more appropriate for testing students’ retention of the skills that I teach them. For example, the first exam that I give in my reading and writing class consists of giving students a passage to read, asking them to annotate it as they read it for the first time. To show their use of reading strategies, they are explicitly instructed to mark the passage as they read it. For example, students are told to highlight or underline the parts of the text that they read first. Furthermore, they are told to make notes about any questions that come to mind as they read the text. “What does this word mean?” “I don’t understand this.” “Maybe this is important.” These are all things that students are encouraged to make note of during their reading.
The stated goal of this exam is to try to make the tacit explicit; that is, to show the processes that students are going through as they read. You may have noticed by now that during the first part of this exam, no mention is made of the traditional measure of reading assessment: comprehension. In fact, the traditional comprehension questions are nowhere to be found in this class. Reading comprehension is checked through the second part of the exam, where students are asked to write a summary of the article that accurately represents the author’s ideas while (mostly) using the students’ own words. All parts of this exam are completed in class, although the summary writing occurs on a subsequent day. This allows students the chance to take the article home, read it more deeply, look up vocabulary, and plan their writing.
I feel that the format of this exam succeeds in focusing on reading as a skill rather than focusing on students’ abilities to pull information from an article. This is important for two reasons. First, it provides a model from which students can develop their own reading habits, which they are encouraged to carry over into their other classes. Second, it discourages the strategies that students have learned to help them succeed at multiple choice tests such as the TOEFL®. I must mention that I have worked as a writer of test items for Prometric, a subsidiary of ETS, the company that designed and maintains the TOEFL®, SAT® and other tests.
Before I accepted this job, I thought about the implications of working on a standardized test and how I felt about participating in creating an assessment that I feel fails to accurately evaluate students’ skills and abilities. Ultimately, pragmatism won out over my idealism, and I decided that standardized tests are a part of students’ lives. I wanted to participate in the process so that I could better understand the nature of the demands that students face. What I learned during the process has informed my own practices of assessment in the classroom. I don’t think anyone in our ENG 701 class would advocate using multiple choice tests to assess students’ reading, but I think it is worth discussing the reasons why.
First, multiple choice tests are decontextualized. Regardless of whether the questions are about short passages that students read in class or extensive reading done outside of class, the student is only asked about information that fits within the space of a sentence or two. This relies on students’ memory as much as it does on reading comprehension. Because the questions are so decontextualized, many things can lead students astray. A key word that appears in the right position may persuade students to check the wrong answer. In fact, in designing the test we didn’t call them wrong answers; we called them “plausible distractors”. To be sure, the plausible distractors needed to be clearly wrong, but they were designed to confuse students.
To illustrate what I mean, here is an example of a vocabulary item. Please note that ETS vigorously defends its copyrights, so this example isn’t the actual item we discussed, but it does accurately represent the type of item and the issues that were raised in our discussion.
I enjoy watching the sunset over the lake in the afternoon. It’s always so serene at that time of day.
SERENE probably means:
In the above example, the issues raised were primarily with the distractors. Originally, two of the distractors had been nouns; however, reviewers felt that the student population would likely be able to eliminate them based on syntax alone. We then changed those two distractors to the adjectives FUN and LOUD, but it was decided that LOUD stood out too much because it didn’t end with a nasal as the three others did. It was thought that students would either select it or disregard it based on this distinction alone. WARM was ultimately selected because it fit the syntactic and phonological patterns, but also because it had the added distinction of introducing slightly more complexity. The stem mentions sunset, the sun is warm, therefore students might select WARM. And here’s the logic: this will help us determine if students know the meaning of the word SERENE. Really, I think it tests whether students can spot the wrong answers, rather than testing whether they know the right one.
I’m using the example of a vocabulary question, but the principles involved apply to other types of questions, including reading comprehension. The basic idea is that multiple choice questions have a set of strategies that can help students be successful. Test writers know this, and so they write questions and distractors to confound students as much as to measure how much of a reading passage students comprehend. I don’t really know if it’s possible to design a hybrid test that provides students some familiarity with the format while still truly measuring how well students are reading. Students have a lot of anxiety about the tests in my class, in part because they don’t know what to expect. There is no real way for students to study for them, except perhaps by practicing their reading and summary writing skills. While I want them to be ready for the types of writing tasks that they will have to do in the US, I also want them to improve their ability to score well on proficiency exams, because they’ll likely have to take the TOEFL® or similar tests throughout their academic careers. I hope to find some articles that can help me strike a balance.