Evaluating the Reliability and Validity of an English Achievement Test for Third-year Non-major students at the University of Technology, Ho Chi Minh National University and some suggestions for chan


CHAPTER 1: INTRODUCTION

1.1 Rationale for choosing this topic

English plays an especially important role in the development of science, technology and international relations, which has created a growing need for English language learning and teaching in many parts of the world. English has become a compulsory subject in national education in many countries. Among them, Vietnam regards the learning and teaching of English as a major strategic tool for developing human resources and keeping up with other countries. Therefore, at every level of education, from primary school to university and postgraduate study, learners either must learn English as a compulsory subject or want to learn it in order to access information technology and to find a good job. English teaching and learning is thus essential for job training.

Fully aware of the importance of the English language, the University of Technology, Ho Chi Minh National University has required its students to learn it as a compulsory subject during the first three academic years. English has been taught at the University of Technology since it was established, with the aim of equipping students with an essential tool for engaging more deeply with the wider world. However, little attention is paid to evaluating how much students acquire when they learn a foreign language, how well they use what they have been taught and what level of English they have reached. Evaluation is used only to calculate the percentage of students who pass the English tests, which says nothing about the validity, reliability or discrimination of the tests. The results of the English tests are therefore not fully exploited.

In addition, during the time I have worked as a teacher of English at the University of Technology, I have heard teachers and learners complain about the content and structure of the English achievement test. As a result, the English section has decided to renew the item bank in order to make it more valid and more reliable.

For these reasons, the author undertook this study, entitled “Evaluating the Reliability and Validity of an English Achievement Test for Third-year Non-major students at the University of Technology, Ho Chi Minh National University and some suggestions for changes”, with the intention of finding out how valid and reliable the test is. More importantly, the writer hopes that the results of the study can then be applied to improve current testing practice and to create a new, truly reliable item bank. The study is also intended to encourage both teachers and learners in their teaching and learning.

1.2 Scope of study

The scope of this thesis is limited to examining the validity and reliability of the existing achievement test for third-year non-English major students at the University of Technology, Ho Chi Minh National University. The study presents a statistical analysis of the test currently in use and proposes practical suggestions to improve it. Due to limitations of time and research conditions, it is impossible for the author to cover all the achievement tests used for third-year students. Instead, only one test is studied.

1.3 Aims of study

The major aim of the study is to evaluate the achievement test currently used for third-year non-English major students, with a special focus on test reliability and validity. The specific aims of the research are:

• To evaluate the test validity and reliability through initial score statistics obtained from the achievement test results of third-year students,
• To pinpoint the strengths and weaknesses of the test, and
• To provide practical suggestions for improving the test.

1.4 Methods of study

In order to achieve the above-mentioned aims, the study was carried out using the following methods. First, the author drew on the theory and principles of language testing, the major characteristics of a good test (especially test validity and reliability), achievement tests and the statistical methods used in interpreting test results. Through critical reading, the writer gathered, analyzed and synthesized a range of reference materials to establish a theoretical basis for evaluating the validity and reliability of the achievement test used for third-year students. Then, a quantitative methodology was used to collect and analyze data. After collecting the data, the author used statistical software to interpret it and to present the findings.

1.5 Research questions

This study seeks answers to the following research questions:

1. Is the achievement test for third-year non-English major students at the University of Technology, Ho Chi Minh National University reliable?
2. Is the achievement test for third-year non-English major students at the University of Technology, Ho Chi Minh National University valid?
3. Is it necessary to make some changes to the test? If yes, what are the changes?
1.6 Design of study

The thesis is organized into four major chapters:

Chapter 1 - Introduction presents basic information: the rationale, the aims, the methods, the research questions and the design of the study.
Chapter 2 - Literature Review reviews the theoretical background on evaluating a test, which includes language testing, the criteria of good tests, theoretical ideas on test reliability and validity, and achievement tests.
Chapter 3 - The Study is the main part of the thesis. It describes the context of the study and gives the detailed results obtained from the collected tests, together with findings in response to the research questions.
Chapter 4 - Conclusion offers conclusions and practical implications for improving the test. In this part, the author also proposes some suggestions for further research on the topic.

CHAPTER 2: LITERATURE REVIEW

This chapter provides an overview of the theoretical background of the study. It includes four main sections. Section 2.1 discusses the importance of testing in education. Section 2.2 is about language testing. It is followed by Section 2.3, in which the author provides a brief review of the major characteristics of a good test, with the main focus on test reliability and validity. Finally, in Section 2.4, the achievement test and its types are explored.

2.1 The importance of testing in education

Testing is an important part of every teaching and learning experience. Testing is a tool to measure learners’ ability. It may create positive or negative attitudes toward the teaching and learning process. Testing reflects the teaching process and the overall training objectives. Through testing, administrators can make important decisions on the course, the syllabus, the course book, teachers, learners and administration. Testing plays a very important part in the teaching/learning process. It is the last stage in educational technology. Therefore, to take advantage of testing as a measure of the quality of education, administrators must build a sound and suitable testing technology. Its purpose is to evaluate learners’ ability, the suitability of teaching methods, teaching/learning materials and teaching/learning conditions, and the suitability of the training objectives that have been set.

Testing and Teaching

Testing and teaching are closely related because it is impossible to work in either field without being constantly concerned with the other (Heaton, 1998: 5). In other words, Heaton implies that teaching and learning provide a rich source of language material for testing to draw on. In turn, testing reinforces, encourages and perfects the teaching/learning process. Hughes (1989: 2) summarizes the relationship as follows: “The proper relationship between teaching and testing is surely that of partnership”. To explain this, Hughes describes the effect of testing on teaching and learning as backwash. If testing has a good effect on teaching, the backwash is said to be beneficial. However, there may be occasions when the teaching is good and appropriate and the testing is not; we are then likely to suffer from harmful backwash. Test results give both teachers and learners information for their future action, such as improving knowledge and skills, revising knowledge, or applying a new teaching method. Brown (1994: 375) shares this idea, describing testing as “what teachers measure or judge learners’ competence all the time and, ideally, learners measure and judge themselves”.

In short, it is undeniable that testing is an integral part of teaching and cannot be separated from the program or from the course goals. Testing has both positive and negative impacts on teaching. Testing provides the teacher with information on how effective his teaching has been, and the teacher can use tests to diagnose his own efforts as well as those of his students.

Testing and Learning

Testing is a tool to “pinpoint strengths and weaknesses in the learned abilities of the student” (Henning, 1987: 1). That is, through testing, learners can find out what level they have reached and what difficulties they face. As a result, they can adjust their learning and explore more effective ways of learning. At the same time, the teacher can rely on test results to understand learners’ ability better and can then improve his teaching methods or revise what has been taught. Thus, Read (1982: 2) says that “a test can help both teachers and learners to clarify what the learners really need to know”. It is clear that learners as well as the teacher can benefit from testing.

To sum up, tests can benefit students, teachers and even administrators by confirming the progress that has been made and showing how they can best redirect their future efforts. In addition, good tests can sustain or enhance class morale and aid learning.

2.2 Language Testing

Language testing is one form of testing and also one form of measurement. Its importance in English learning is described as follows: “properly made English tests can help create positive attitudes toward instruction by giving students a sense of accomplishment and a feeling that the teacher’s evaluation of them matches what he has taught them. Good English tests also help students learn the language by requiring them to study hard, emphasizing course objectives, and showing them where they need to improve” (Davies, 1996: 5).

McNamara (2000) presents three main roles of language testing, which apply not only in education but in other fields as well. Firstly, language testing is considered a key to success, since it plays a decisive role in recruitment. Secondly, it serves educational goals. According to McNamara, tests are used to place learners in a suitable course. The third role of language testing is to support research. Every researcher who wishes to do research on a language needs to evaluate standard tests or to design tests in that language.

Henning suggests six purposes of language tests, as follows:
• Diagnosis and Feedback: to explore the strengths and weaknesses of the learners
• Screening and Selection: to assist in the decision of who should be allowed to participate in a particular program of instruction
• Placement: to identify a particular performance level of the student and to place him at an appropriate level of instruction
• Program Evaluation: to provide information about the effectiveness of programs of instruction
• Providing Research Criteria: to provide a standard of judgment in a variety of other research contexts based on language test scores
• Assessment of Attitudes and Sociopsychological Differences: to determine the nature, direction, and intensity of attitudes related to language acquisition

(Henning, 1987: 1)

2.3 Major characteristics of a good test

In order to make a well-designed test, teachers have to take into account a variety of factors, such as the purpose of the test, the content of the syllabus, the pupils’ background and the goals of administrators. Moreover, test characteristics play a very important role in constructing a good test. The most important consideration in determining whether a test is good or not is the use for which it is intended. That is to say, the most important quality of a test is its usefulness. Test usefulness is believed to provide a kind of metric by which test developers can evaluate not only the tests that they develop and use, but also all aspects of test development and use. Generally speaking, usefulness includes six components: reliability, construct validity, authenticity, interactiveness, impact and practicality. It should be pointed out, however, that rather than emphasizing the tension among the different qualities, test developers need to recognize their complementarity.

Bachman and Palmer (1996) consider these criteria to be qualities of test usefulness rather than individual factors. Their idea of usefulness can be presented visually as in Figure 2.1:

Usefulness = reliability + validity + impact + authenticity + interactiveness + practicality

[Diagram: test usefulness at the centre, surrounded by practicality, impact, reliability, validity, authenticity and interactiveness]

Figure 2.1 Usefulness (Bachman and Palmer, 1996)

Henning (1987) added more test characteristics and summarized them in a table called A Checklist for Test Evaluation. The checklist is for rating the adequacy of a test for any given purpose.

Table 2.1 A checklist for test evaluation

Name of test:
Purpose intended:

Test characteristic      Rating (0 = highly inadequate, 10 = highly adequate)
1  Validity              _
2  Difficulty            _
3  Reliability           _
4  Applicability         _
5  Relevance             _
6  Replicability         _
7  Interpretability      _
8  Economy               _
9  Availability          _
10 Acceptability         _
Total

(Adapted from Henning, 1987: 14)

Other leading scholars in testing share the views on test characteristics of the scholars mentioned above. Among these test characteristics, they all agree that reliability and validity are essential to the interpretation and use of measures of language abilities and are the primary qualities to be considered in developing and using tests. For this reason, in this study, the author employs these essential measurement qualities to evaluate the test taken by a large number of third-year non-English major students at the University of Technology. A brief discussion of reliability and validity follows.

2.3.1 Test Reliability

Reliability has been defined in different ways by different authors. Perhaps the best way to look at reliability is as the extent to which the measurements resulting from a test are the result of characteristics of those being measured. For example, reliability has elsewhere been defined as "the degree to which test scores for a group of test takers are consistent over repeated applications of a measurement procedure and hence are inferred to be dependable and repeatable for an individual test taker" (Berkowitz, Wolkowitz, Fitch, and Kopriva, 2000). This definition will be satisfactory if the scores are indicative of properties of the test takers; otherwise they will vary unsystematically and not be repeatable or dependable.

Test reliability refers to the consistency of the scores students would receive on alternate forms of the same test. Due to differences in the exact content being assessed on the alternate forms, environmental variables such as fatigue or lighting, or student error in responding, no two tests will consistently produce identical results. This is true regardless of how similar the two tests are. For example, a test that includes a translation part would probably produce different scores from one administration to another because it is subjective, and it would thus be unreliable.

Henning (1987: 10) claims that all tests are subject to inaccuracies. The final scores gained by the test-takers provide only approximate estimates of their true abilities. While some measurement error is unavoidable, it is possible to quantify it and to reduce it greatly. A test is said to be reliable if the scores obtained are generally similar when it is administered to the same students, with the same ability, at a different time. Since test reliability is related to test length, with longer tests tending to be more reliable than shorter ones, the importance of the decision to be based on the examination results can guide the number of items a test should contain.

Test reliability is considered “a quality of test score” by Bachman (1990: 24). He makes the further point that if a student receives a low score on a test one day and a high score on the same test two days later, the test does not yield consistent results, and the score cannot be considered a reliable indicator of the individual’s ability. Reliability can also be viewed as an indicator of the absence of random error when the test is administered. When random error is minimal, scores can be expected to be more consistent from administration to administration.

Sources of Error

According to Bachman (1990: 165), there are four factors that affect language test scores. The effects of these factors on a test score can be illustrated as in Figure 2.2.

[Diagram: communicative language ability, test method facets, personal attributes and random factors, each acting on the test score]

Figure 2.2 Factors that affect language test scores

We can infer from the figure that a score on a language test reflects communicative language ability. The score is also affected by factors other than communicative language ability. They are:

• Test method facets: these are systematic to the extent that they are uniform from one test administration to another (Appendix 1).
• Personal attributes: these include individual characteristics such as cognitive style and knowledge of particular content areas, and group characteristics such as sex, race and ethnic background. They are also systematic.
• Random factors: these are unsystematic factors, including unpredictable and largely temporary conditions such as the test taker’s mental alertness or emotional state.

Thus, a test is considered to be reliable if it meets conditions such as the following:

• The results that the same candidate achieves on one test at two different times are consistent with each other.
• Candidates are not allowed too much freedom.
• Clear and explicit instructions are provided.

...

Is the achievement test for third-year non-English major students at the University of Technology, Ho Chi Minh National University reliable? Is the achievement test for third-year non-English ... background for the study and is an overview of English teaching, learning and testing at the University of Technology, Ho Chi Minh National University. More importantly, it presents data analysis of the ... the chosen test and findings drawn from the analysis.

3.1 English learning and teaching at the University of Technology, Ho Chi Minh National University

3.1.1 Students and their backgrounds

Students ...
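Two of the reliability ideas discussed in Section 2.3.1 can be illustrated numerically: the consistency of one candidate group's scores across two administrations, and the tendency of longer tests to be more reliable. The sketch below is not taken from the study. It assumes that consistency is estimated with a Pearson correlation between two sittings and that the effect of test length is estimated with the Spearman-Brown formula, a standard psychometric formula that the text above does not name. The function names and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariance / sqrt(spread_x * spread_y)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is made length_factor times as long.

    Assumption: the added items are comparable to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical scores of six candidates on the same test at two sittings.
first_sitting = [12, 15, 9, 20, 17, 14]
second_sitting = [13, 14, 10, 19, 18, 13]

test_retest = pearson_r(first_sitting, second_sitting)
doubled = spearman_brown(test_retest, 2)

print(f"Test-retest estimate: {test_retest:.2f}")
print(f"Predicted estimate for a test twice as long: {doubled:.2f}")
```

A correlation close to 1 would suggest that candidates keep roughly the same rank order across sittings. The Spearman-Brown prediction holds only under the stated assumption about the added items, so it should be read as a rough guide rather than a guarantee.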

Date posted: 07/11/2012, 15:05
