A study on validity of 45 minute tests for the 11th grade = Nghiên cứu tính giá trị của bài kiểm tra 45 phút tiếng Anh lớp 11


VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF LANGUAGES AND INTERNATIONAL STUDIES
POST-GRADUATE DEPARTMENT

HOÀNG HỒNG TRANG

A STUDY ON VALIDITY OF 45 MINUTE TESTS FOR THE 11TH GRADE
NGHIÊN CỨU TÍNH GIÁ TRỊ CỦA BÀI KIỂM TRA 45 PHÚT TIẾNG ANH LỚP 11

M.A. COMBINED PROGRAMME THESIS
Major: Methodology
Major code: 60.14.10
Supervisor: Assoc. Prof. Dr. VÕ ĐẠI QUANG

HANOI - 2009

TABLE OF CONTENTS

DECLARATION
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
LIST OF TABLES

INTRODUCTION
  Rationale for the study
  Significance of the study
  Aims of the study
  Scope of the study
  Research questions
  Organization of the study

CHAPTER 1: LITERATURE REVIEW
  1.1 Language testing as part of applied linguistics
    1.1.1 Language testing – a brief history and its characteristics
    1.1.2 Purposes of language testing
    1.1.3 Validity in language testing
      1.1.3.1 Definition and types of validity
      1.1.3.2 Content validity
      1.1.3.3 Construct validity
  1.2 Class progress tests
    1.2.1 Language tests – definition and types
    1.2.2 Class progress tests as a type of achievement tests
  1.3 Testing techniques

CHAPTER 2: METHODOLOGY OF THE STUDY
  2.1 Type of research: a qualitative research
  2.2 Techniques
    2.2.1 Data type and data collection
    2.2.2 Data analysis

CHAPTER 3: THE STUDY
  3.1 The context of teaching and testing English at high schools in Vietnam
    3.1.1 The methodological innovation
    3.1.2 The testing innovation
  3.2 An overview of the teaching and testing of English language in the 11th grade
    3.2.1 English textbook for the 11th grade
    3.2.2 Syllabus for 11th grade English language subject
    3.2.3 45 minute English language tests

CHAPTER 4: MAJOR FINDINGS
  4.1 Phonetics section in 45-minute tests
    4.1.1 Data concerning construct validity
    4.1.2 Data concerning content validity
  4.2 Grammar section in 45-minute tests
    4.2.1 Data concerning construct validity
    4.2.2 Data concerning content validity
  4.3 Vocabulary section in 45-minute tests
    4.3.1 Data concerning construct validity
    4.3.2 Data concerning content validity

CONCLUSION
  Discussion of findings and recommendations
    1.1 On pronunciation testing
    1.2 On grammar testing
    1.3 On vocabulary testing
  Conclusion

REFERENCES
APPENDICES: Copies of test papers collected

LIST OF ABBREVIATIONS

MCQ  Multiple-choice question
GF   Gap-filling
ER   Error recognition
ST   Sentence transformation
SB   Sentence building
C.V  Construct Validity
V    Validity

LIST OF TABLES

Table 1: Bookmap of the English 11 textbook
Table 2: Recommended structure of a 45 minute test
Table 3: Number of pronunciation test items having their underlined parts dissimilar in letter format
Table 4: No correct answer
Table 5: Apparent correct answer
Table 6: Underlined letter(s) not corresponding to the sounds tested
Table 7: Content validity of phonetics section in Group 1 tests
Table 8: Content validity of phonetics section in Group 2 tests
Table 9: Content validity of phonetics section in Group 3 tests
Table 10: Content validity of phonetics section in Group 4 tests
Table 11: Summary of content validity of pronunciation test items of test groups
Table 12: Summary of techniques for grammar testing in 30 tests
Table 13: Construct validity of grammar items of Group 1 tests
Table 14: Construct validity of grammar items of Group 2 tests
Table 15: Construct validity of grammar items of Group 3 tests
Table 16: Construct validity of grammar items of Group 4 tests
Table 17: Content of grammar component of Group 1 tests compared to the syllabus
Table 18: Content of grammar component of Group 2 tests compared to the syllabus
Table 19: Content of grammar component of Group 3 tests compared to the syllabus
Table 20: Content of grammar component of Group 4 tests compared to the syllabus
Table 21: Summary of techniques for vocabulary testing in 30 tests
Table 22: Tests having topic-relevant reading or cloze test passages
Table 23: Content validity of vocabulary test items of Group 1 tests
Table 24: Content validity of vocabulary test items of Group 2 tests
Table 25: Content validity of vocabulary test items of Group 3 tests
Table 26: Content validity of vocabulary test items of Group 4 tests

INTRODUCTION

Rationale for the study

Language testing, a branch of applied linguistics, has witnessed robust development over the last forty (nearly fifty) years in terms of professionalization, internationalization, cooperation and collaboration (Stansfield, 2008, p. 319). Along the course of this development, validity, together with fairness, has become a matter of increasing concern, and it is predicted that research into validity will form "the prominent paradigm for language testing in the next 20 years" (Bachman, 2000, p. 25). In discussions of validity, much has been said about the validation of standardised tests, especially large-scale EFL tests such as TOEFL, IELTS and TOEIC (Stoynoff, 2009; Bachman et al., 1995, cited in Stansfield, 2008), since decisions based on the scores of these tests are usually of prime importance to test takers in both their career and life prospects. Teacher-produced tests, by contrast, receive much less attention. Studies have shown that designing a good test is a "demanding" task for teachers (Davidson and Lynch, 2002, p. 65, cited in Coniam, 2009, p. 227), since in a language test "language is both the instrument and the object of measurement" (Bachman, 1990), which makes the careful choice of linguistic elements in a language test difficult, and because of teachers' lack of time and resources (Popham, 1990, p. 200, cited in Coniam, 2009, p. 227). Also, teachers
are "unlikely to be skilled in test construction techniques" (Popham, 2001, p. 26, cited in Coniam, 2009, p. 227). This explains why the item quality of teacher-produced tests is often lower than that of standardised tests in terms of reliability (Cunningham, 1998, p. 171, cited in Coniam, 2009, p. 227), which in turn lowers the validity of test score interpretations. Nevertheless, however inferior teacher-produced tests are said to be compared to standardised tests (according to several studies), little factual evidence has been found to support this (Coniam, 2009, p. 227). Soranastaporn et al. (2005) (cited in Coniam, 2009) compared the concurrent validity of achievement tests designed by Thai language teachers with standardised tests like TOEFL and IELTS and found low correlations between the two. Another study, conducted by Coniam into the reliability and validity of teacher-produced tests for EFL students at a university in Hong Kong, reported poor reliability and validity results despite a rather long process of test design and analysis compared to the time teachers normally spend on designing a test (Coniam, 2009, p. 238). In the Vietnamese context of educational reform, textbooks at primary and secondary level have all been redesigned in structure and content to keep pace with current changes and developments in society as well as in pedagogy. English language textbooks, following this trend, began to be replaced in 2004, and the replacement process was completed in the 2008-2009 school year. Although techniques and guidelines for assessment are provided with the new textbook set, there has been no investigation into the quality of the actual tests that teachers produce and use for their students at school, or into whether teachers follow these guidelines closely. This situation calls for research into the quality of English language tests used at secondary schools so as to gain a
clearer and more accurate picture of language testing in Vietnam.

Significance of the study

English is learnt by over 90% of school pupils and university students in Vietnam, not counting the number of people learning English outside schools and universities. Assessment of the quality of teacher-produced tests will therefore lay the foundation for a valid interpretation of the quality of language education at schools, which in turn helps form directions and guidelines for further instruction and assessment at tertiary level and at other language education centres and institutions. On a narrower scale, the results of such quality assessment will assist in improving test item quality, creating more reliable and valid tests.

Aims of the study

Within the small scope of an M.A. thesis, this study aims at investigating only two aspects of validity of a common type of English test used in schools in Vietnam. In particular, the research investigates the content and construct validity of the language components of forty-five-minute English tests used for the 11th grade in some high schools in northern Vietnam.

Test no. 2:

1. Do you object to _____ the door?
   A. my opening  B. open  C. that I open  D. to have opened
2. He works _____ the government.
   A. to  B. in  C. for  D. with
3. I like him very much; he is _____.
   A. quite and intelligent  B. quite an intelligent boy  C. a quite intelligent boy  D. a boy quite intelligent
4. "What happened?" "We _____ for an hour when the bus finally came."
   A. have been waiting  B. have waited  C. had been waiting  D. waited
5. Do school-leavers give _____ thought for their future work?
   A. a great deal of  B. a few  C. a large number of  D. many

It is a shared belief that grammar and vocabulary are related and somehow intertwined; there is hardly any clear-cut distinction between the two. However, test items focussing more on lexical knowledge are taken to be vocabulary items, while the others fall into the grammar category. One of the questions quoted above tests the use of linking words, which is a grammar area. Of the items quoted, the first tests the use of the gerund, the second prepositions, the third word order (the relative order among adverbs, adjectives, articles and nouns), the fourth verb tenses, and the last quantity words. All of these require grammar knowledge rather than lexical knowledge. Therefore, these items fail to have construct validity.

4.3.2 Data concerning content validity

As vocabulary is a language component spread throughout the test, it would be impractical to examine every sentence in the test to decide whether its words bear any relation to the vocabulary focussed on in the textbook. Investigation was therefore restricted to the vocabulary test items and the topic of the reading passage or cloze test passage in each test. For the reading or cloze passage, the researcher examined its content to decide whether the topic is relevant to the theme tested, and also examined its questions (if any). Table 22 identifies the tests whose reading passage or cloze test passage has content relevant to the theme tested in the English 11 coursebook.

Table 22: Tests having topic-relevant reading or cloze test passages
[table rows garbled in extraction]

Table 22 reveals a low percentage of tests having reading or cloze passages relevant in content to the themes tested: 9 of the 30 tests collected, or 30%. The blank grey cells mark tests having no reading and/or cloze passage, or passages whose content is irrelevant. With
regard to content validity of vocabulary test items, the following tables look at this issue more closely.

Table 23: Content validity of vocabulary test items of Group 1 tests
[table rows garbled in extraction]

Table 23 shows that the tests of this group vary greatly in the percentage of items with relevant content over the total number of vocabulary items in the test, from 0% to 100%. The content validity of vocabulary test items of Group 2 tests is presented in Table 24.

Table 24: Content validity of vocabulary test items of Group 2 tests
[table rows garbled in extraction]

From Table 24 it appears that, despite the small number of vocabulary test items in each test, few of those items test the vocabulary they are supposed to. Two or three vocabulary items in a whole test of about 35 items on average is generally too small a number for a 100% content-relevance figure to be significant in terms of content validity. Moreover, it is disappointing to see tests such as those of C Province, D Province and E Province contain quite a large number of vocabulary items of which few (or none) are relevant to the vocabulary emphasized at that stage in the textbook. Tables 25 and 26 give the corresponding results for test groups 3 and 4 respectively.

Table 25: Content validity of vocabulary test items of Group 3 tests
[table rows garbled in extraction]

Table 26: Content validity of vocabulary test items of Group 4 tests
[table rows garbled in extraction]

Throughout Tables 25 and 26, the number of vocabulary test items remains small, with no more than 10 per test. The tests of Province E continue to stay at the bottom in terms of content validity, as none of their vocabulary items seems to test the vocabulary taught in the syllabus. The tests of Province C appear the most commendable in terms of both the number of vocabulary items (except for C1) and the content relevance of those items.

CONCLUSION

DISCUSSION OF FINDINGS AND RECOMMENDATIONS

1.1 On pronunciation testing

The data above reveal that phonetic test items do not correlate closely with the sounds taught in the syllabus. Normally just one sound in the list is tested, and many times the phonetic items in the test are not present in the syllabus for that section at all. This situation is understandable where the sounds in the syllabus are consonants, as consonants often have only one pronunciation and there is usually no difference between the way a consonant is pronounced and the pronunciation of its representative letter in the alphabet. For example, consonant clusters like /pl/, /bl/, /pr/, /br/, /tr/, /dr/, or /sl/, /sm/, /sn/, /sw/, /spl/, /spr/, or /str/, /skr/ and /skw/ (each set belonging to a different test group) exist, in my opinion, in the syllabus to serve a teaching function only. They can hardly be used for testing, since the differences between them are rather obvious: the letter "p" is pronounced /p/ and is still pronounced so in clusters such as /pl/, /pr/ or /spr/. There is usually no change in the pronunciation of these consonant letters, and often each letter is represented by only one sound. As a result, it is hard for teachers to design phonetic test items based on these sounds, and this may be why the content validity of the phonetics section of such tests is not very high. However, this cannot explain the absence of other sounds of the
syllabus in the tests. This is especially true of the Group 4 tests, which focus on the pronunciation of the endings -s and -ed: only a few of the tests in this group have items of relevant content. The investigation also showed that sounds which should appear in one test are sometimes present in another, while the test that is supposed to cover them focuses on some other sound and neglects what it should test. For example, questions on the sounds /h/ or /dʒ/ have many times appeared in Group 2 tests rather than in Group 1 tests, where they belong. This suggests that teachers are not really aware that the sounds they test should be the sounds taught at that stage of the syllabus. The situation may result from the fact that phonetics and pronunciation only became official content in the curriculum with the new set of English textbooks, while phonetic test items have existed in English tests in Vietnam for several years. Before the appearance of the new coursebook, teachers designing phonetic questions simply focused on what they knew or felt was important to their students, on what their students often make mistakes with in their experience, or sometimes chose certain sounds merely because of the availability of materials: test items on those sounds are already available in some reference book, and teachers just copy them into their tests. This saves time and energy, and is also a safe choice, as what has been published is often considered correct and "standard". Such habits of designing phonetic test items result in many sounds being tested and retested over and over again while other sounds which should have been assessed are ignored. Pronunciation testing is just a small part of the structure of every progress test, with often no more than five pronunciation items per test, whereas the number of sounds taught in every group of lessons is quite big, usually no fewer than six (at least two sounds per unit). This fact should have
resulted in a high level of content validity for the pronunciation section of every progress test; nevertheless, the opposite has been shown in reality. Another point to consider is the construct validity of phonetic test items, where teachers' inability to identify the boundary between sounds and letters, first in phonetic transcriptions and then in words, emerges as the most severe problem. This reveals teachers' weak knowledge of phonetics, which leads to students' wrong perceptions of sounds and the letters representing them. It is undeniable that phonetics (including phonetic transcription) is a difficult subject in any linguistics course, and not many teacher-trainees comprehend it thoroughly and precisely. Several teachers have been teaching English for many years without ever touching phonetics in their instruction; thus, the little phonetic knowledge they acquired at university can easily become rusty. Therefore, I recommend organizing a small phonetics training session for teachers in every province, in which teachers are re-introduced to fundamental issues of phonetics and phonology that are beneficial to their everyday instruction. In these sessions, teachers should also be presented with common mistakes in designing phonetic test items, so that they can avoid them in their future tests. Previous training sessions conducted when the new set of textbooks was introduced were mostly restricted to methodological matters, for example how to teach pronunciation, assuming that all teachers had already acquired the necessary knowledge for their teaching. This appears insufficient: without an appropriate amount of phonetic knowledge, how can teachers instruct their students even to pronounce correctly?
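The content check running through this chapter, namely whether each sound a test item targets belongs to the set of sounds taught at that stage of the syllabus, amounts to a simple relevant-over-total tally. The sketch below illustrates it; the sound lists are invented for the example and are not taken from the actual syllabus:

```python
# Content-validity tally for a pronunciation section: the proportion of test
# items whose target sound is among the sounds taught at that stage.
# The sound lists below are illustrative only, not the actual syllabus.

def content_validity(items_tested, sounds_taught):
    """Return (relevant_count, total, percentage) for a phonetics section."""
    taught = set(sounds_taught)
    relevant = [s for s in items_tested if s in taught]
    total = len(items_tested)
    pct = 100 * len(relevant) / total if total else 0.0
    return len(relevant), total, pct

# Hypothetical Group 1 syllabus sounds, and the sounds one collected test targets
syllabus_group1 = ["/i:/", "/ɪ/", "/e/", "/æ/", "/h/", "/dʒ/"]
test_items = ["/h/", "/dʒ/", "/ɔ:/", "/u:/"]  # two of the four match this stage

relevant, total, pct = content_validity(test_items, syllabus_group1)
print(f"{relevant}/{total} relevant = {pct:.1f}%")  # 2/4 relevant = 50.0%
```

The same relevant-over-total percentage underlies the content validity figures reported for the grammar and vocabulary sections as well.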
Also, teachers' awareness should be raised in terms of testing practice; that is, they should both "test what they teach" in class and "test what is taught in the coursebook".

1.2 On grammar testing

From the data analysis of the grammar component of the tests collected, it is clear that grammar still holds a significant place in every language test. However, it can be calculated that only half of the tests gathered have more than one third of their grammar items relevant to the corresponding part of the syllabus. The proportion of tests having rather high content validity (compared to other tests) varies greatly across the four test groups. This situation is influenced by the number of grammar points to be taught in the corresponding part of the syllabus and by the weight of those points in the whole grammar course, which explains why certain test groups show a much higher content validity level than others. Still, no matter whether M is more important than N, or H is more realistic than K, as long as M, N, H and K are in the syllabus, teachers have to try their best to include them in the test. It is this feature that distinguishes achievement tests from proficiency tests. Progress testing becomes counter-productive if testing practice depends too much on teachers' own preferences or personal judgements. Since a test should not and cannot restrict its grammar content to only the grammatical points in the syllabus (especially when the number of grammar points in the syllabus is small), and since language competence develops along a spiral path, it would be unreasonable and unrealistic for a progress test to include all and nothing but the specific grammar contents of the syllabus. Tests serve not only the function of assessment and evaluation but also of consolidation and review. The question is whether there should be guidelines or recommendations on the relative proportion of newly-taught knowledge
items over the old ones, at least for the new set of textbooks, based on the difficulty of those items, their frequency of use in everyday communication and their significance in the English grammar course. Teachers would then be reminded not to wander too far from the focus of the test.

Regarding the construct validity of the grammar component of the 45 minute tests, the first point to discuss is the dominance of multiple-choice questions over other techniques. Although the advantages of this testing technique are obvious, overusing it in any test means that fewer abilities are tested. Progress tests are encouraged to target the first three levels of Bloom's taxonomy (knowledge, comprehension and application), and MCQ is clearly more effective at testing the first level (knowledge) than the others; in order to vary the abilities tested, teachers as test designers should therefore pay attention to varying the testing techniques they use. This applies to testing vocabulary as well. Additionally, out of 632 grammar test items, only 4 were diagnosed with construct validity problems, which accounts for about 0.6%. Indeed, the construct validity problems of those items are not very serious, as grammar seems to be the strongest part of most high-school English teachers' knowledge: grammar teaching and grammar testing have been with them ever since English was first taught in Vietnam. Such problems, in my opinion, are mostly due to quick and careless test design; with careful editing and revision of the test before use, teachers can eliminate them quite easily. Moreover, through close examination of the tests collected, I found that the test as a whole (including the language focus section, that is, grammar and vocabulary, plus the reading, writing and listening sections) does not consistently aim at reinforcing students' grammar knowledge alongside other linguistic competences. Grammar testing and reviewing is the objective of the grammar section
only; nothing is done with the other sections of the test. My suggestion is that clear test specifications be written for each test in the course (from test no. 1 to test no. 4), providing detailed guidelines on the use of different testing techniques, the proportion of newly taught items over old ones, the content/topic of the reading or cloze test passage, the types of reading questions, and so on, so as to maximise the number of test items contributing to the enhancement of grammar (or other linguistic) knowledge relevant to the corresponding section of the syllabus.

1.3 On vocabulary testing

As for the lexical component, the number of vocabulary test items in all 30 tests collected is generally too small in proportion to grammar items to be of any importance to the whole test in terms of either content or construct validity. With such a small number of vocabulary items, the vocabulary section inevitably cannot do much to reinforce and build up students' lexical competence, and cannot even test students on the vocabulary they have just learnt. Moreover, it seems that the teachers who develop these tests have not figured out clearly which aspects of vocabulary should be tested. For high school students, word formation, word choice (word usage) and collocations tend to be the most important lexical points, as they frequently appear in English vocabulary tests. Teachers designing vocabulary test items should pay attention to these, along with thematic vocabulary, in order to test their students' lexical knowledge effectively. Besides, the construct validity of vocabulary items is affected by a problem rather common in tests that follow a coursebook: testing students' background knowledge rather than lexical knowledge. Because that background knowledge is mentioned in the coursebook, teachers often assume it is what students must know, and therefore design a "vocabulary"
item which in fact tests students' social knowledge, not their lexical knowledge. This is a problem, but not a very serious one; as long as teachers are reminded of it somewhere in their teaching material or in a language testing handbook, I am sure they will not make the same mistake. As for the techniques used to test vocabulary, MCQ plays a dominant role, just as in grammar testing, and my suggestion is again to vary the testing techniques. This not only varies the abilities students are tested on but also reduces the boredom of doing the same question type exercise after exercise. Last but not least, like grammar, vocabulary does not seem to be developed consistently in the sections of the test outside the vocabulary section itself. Indeed, grammar items can use relevant vocabulary in the wording of their stems; reading passages can develop around relevant themes and carry questions on the vocabulary in the passage; every test item can help enhance and review students' thematic vocabulary. However, none of this has been achieved in any test of the collection. Detailed test specifications with regulations on the number of relevant vocabulary items would solve this problem to some extent.

In short, there are two points I would like to emphasize. Firstly, what we have on the table of contents pages of the official language coursebooks we are currently using is just the book map; it is not the syllabus. However, we are treating it as if it were. That is why, looking at the book, we cannot decide clearly which level (according to international standards) our students at school are at, or what they need to be able to do at that level. We divide our language levels at school by grade, but we have not decided where our grade 12 (the final grade at Vietnamese schools) stands compared to internationally recognized language levels. If we are not clear where we are, how can we design tests that truly reflect students' level?
How do we know what would be too easy or too difficult for students at this grade or level? How do we know what to take and what to leave when deciding the content of our tests? As mentioned above, language competence develops along a spiral path, and it is essential for us to know where we have been and where we should go, and whether the destination is too far if we are still too young and too weak. If only the Ministry of Education and Training cooperated with textbook writers and language educators to build up a common syllabus for language education adhering to international standards, teachers' work of test design would certainly be both less demanding and more productive. Secondly, each test in the course marks the end of a group of lessons on a specific theme with certain phonetic, grammatical and lexical points. The content of each progress achievement test should therefore represent itself well by having unique features that other tests cannot have, and the best way to do this is to start with vocabulary: a test built around environmental issues will certainly differ from one on relationships such as friendship. If a forty-five minute test can do some of the things suggested above, it will certainly make a strong impression on students, helping them not only learn grammar in context but also develop and strengthen their thematic vocabulary. And if the phonetics section can take advantage of the vocabulary of each theme, so much the better, since it can assist students in pronouncing that vocabulary correctly. As "practice makes perfect", repeated encounters with the vocabulary of the selected theme will certainly be of great help to students. To achieve this, teachers designing tests need regulations to follow, and this can only be done with the active support of textbook writers and testing experts. Clear and detailed test specifications are needed, with firm basic requirements and some room for creativity and modification.
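As an illustration of what such a specification might regulate, here is a hypothetical sketch; the section names, minimum item counts and proportions are invented for the example and are not taken from any official document:

```python
# A hypothetical 45-minute-test specification and a checker that flags
# deviations, in the spirit of the recommendations above. All numbers
# are invented for illustration.

SPEC = {
    "pronunciation": {"min_items": 4},
    "grammar":       {"min_items": 10, "min_new_ratio": 0.5},  # newly taught vs. review
    "vocabulary":    {"min_items": 8},
    "listening":     {"min_items": 5},
}

def check_test(test):
    """Return a list of human-readable violations of SPEC."""
    problems = []
    for section, rules in SPEC.items():
        info = test.get(section)
        if info is None:
            problems.append(f"missing section: {section}")
            continue
        if info["items"] < rules["min_items"]:
            problems.append(f"{section}: only {info['items']} items "
                            f"(minimum {rules['min_items']})")
        if "min_new_ratio" in rules:
            ratio = info["new_items"] / info["items"]
            if ratio < rules["min_new_ratio"]:
                problems.append(f"{section}: newly taught ratio {ratio:.0%} "
                                f"below {rules['min_new_ratio']:.0%}")
    return problems

# A test resembling those collected: no listening section, few vocabulary items
sample = {
    "pronunciation": {"items": 5},
    "grammar":       {"items": 20, "new_items": 6},
    "vocabulary":    {"items": 3},
}
for p in check_test(sample):
    print(p)
```

A checker of this kind could flag, for instance, a missing listening section or an undersized vocabulary section before the test reaches students.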
If we want to raise the quality of language education, we must also raise the quality of language testing. Unanimous efforts should be put into drawing up a common testing framework for teachers, just like the framework already built for teaching methodology.

CONCLUSION

This research has to some extent uncovered the current testing practice in high schools in Vietnam following the new set of English textbooks. Although testing is professional work and test design a strenuous task, teachers must all work up to the standards of the profession. The investigation has shown not very high content and construct validity in the language components of the forty-five minute tests collected, and has also revealed an inappropriate proportion of lexical test items in comparison to others. This results from both subjective and objective causes; however, with practical recommendations proposed both to teachers, who design the tests for their students, and to textbook writers and the Ministry of Education and Training, who direct teachers' instruction and test design, the quality of progress tests at high schools will hopefully be significantly improved. The study also reveals that the recommended structure of 45 minute tests is not taken seriously by teachers: among the 30 tests collected, only one has a listening section. And because no specific structure is followed, the number of lexical items in proportion to other items turns out to be inadequate for developing lexical competence. Following the recommended structure as well as the suggested test specifications (if they exist) will certainly require tremendous effort from teachers; however, the results will be rewarding and worth every effort. To reduce the workload of designing tests, teachers of the same grade in the same school can get together and take turns designing and reviewing one another's tests, so that each school year each teacher has to design only one
test at most, and the quality of the tests will also be enhanced through the meticulous revision of their working partners.

Despite rigorous attention to detail in the analysis process, errors and mistakes are still unavoidable, especially at the stage of deciding whether a test item belongs to the grammatical sphere, the lexical sphere, or a combination of both. Moreover, given the number of tests collected and the number of provinces investigated, this study acts only as preliminary research, without generalizations of any kind about any aspect of test construction and design. The scope of an M.A. thesis allows the researcher to look only at the content and construct validity of the tests, while neglecting other types of validity such as face or criterion-related validity.

The limitations of the research, defined by its scope and its methodology, have certainly left a lot of room for further study. Since the research has been carried out on 45-minute tests from five provinces only, further research should be done on other test types in more provinces, with different methods of data collection, in order to arrive at generalizations about the content and construct validity of English tests in the north of Vietnam or nation-wide. Besides, this research mostly uses the method of comparison with theory to analyse the construct validity of tests, which is rather subjective. It is suggested that other methods of construct validation, such as internal correlations or multitrait-multimethod analysis, be used to arrive at a more comprehensive view of the construct validity of tests. Relatedly, prospective studies can be conducted on writing specifications for the progress tests in the textbook, so as to standardize the English tests used in high schools in Vietnam and, as a result, improve the quality of tests and testing practice.

REFERENCES

In Vietnamese:

1. Hoàng Văn Vân (chủ biên) (2004), Đổi mới phương pháp dạy học tiếng Anh lớp 11, Hà Nội.
2. Hoàng Văn Vân (chủ biên) (2009), Tiếng Anh 11, Nxb Giáo dục, Hà Nội.
3. Vũ Thị Lợi (chủ biên) (2008), Kiểm tra đánh giá thường xuyên và định kỳ môn Tiếng Anh lớp 11, Nxb Giáo dục, Hà Nội.
4. Chương trình giáo dục phổ thông môn Tiếng Anh, ban hành kèm theo Quyết định số 16/2006/QĐ-BGDĐT ngày 05 tháng năm 2006 của Bộ trưởng Bộ Giáo dục và Đào tạo, Nxb Giáo dục, Hà Nội.

In English:

5. Alderson, J. C., Clapham, C. & Wall, D. (1995), Language Test Construction and Evaluation, CUP, United Kingdom.
6. Bachman, L. F. & Palmer, A. S. (1997), Language Testing in Practice, OUP, Hong Kong.
7. Bachman, L. F. (1990), Fundamental Considerations in Language Testing, OUP, UK.
8. Bachman, L. F. (2000), "Modern language testing at the turn of the century: assuring that what we count counts", Language Testing 17 (1), pp. 1-42.
9. Borg, W. R. & Gall, M. D. (1974), Educational Research: An Introduction, David McKay Company, Inc., New York.
10. Cohen, A. D. (1994), Assessing Language Ability in the Classroom, 2nd ed., Newbury House / Heinle & Heinle, Boston.
11. Coniam, D. (2009), "Investigating the quality of teacher-produced tests for EFL students and the effects of training in test development principles and practices on improving test quality", System 37, pp. 226-242.
12. ELT Methodology: "Issues in language testing", downloaded at http://ieas.unideb.hu/admin/file_1401.pdf
13. Harrison, A. (1983), A Language Testing Handbook, Macmillan Press Ltd., Hong Kong.
14. Heaton, J. B. (1988), Writing English Language Tests, Longman Inc., USA.
15. Heaton, J. B. (1997), Classroom Testing, Longman, England.
16. Henning, G. (1987), A Guide to Language Testing: Development, Evaluation, Research, Heinle & Heinle Publishers, Cambridge.
17. Hughes, A. (1989), Testing for Language Teachers, Cambridge University Press, Cambridge.
18. McNamara & Roever (n.d.), "Chapter 2: Validity and the social dimension of language testing", Language Learning, Blackwell Publishing Limited.
19. McNamara, T. (2000), Language Testing, OUP, Hong Kong.
20. Nunan, D. (1992), Research Methods in Language Learning, CUP.
21. Corder, S. P. (1973), Introducing Applied Linguistics, Penguin Education, Baltimore.
22. Kunnan, A. J. (2000), Fairness and Validation in Language Assessment: Selected Papers from the 19th Language Testing Research Colloquium, Orlando, Florida, Cambridge University Press, Great Britain.
23. Read, J. & Chapelle, C. A. (2001), "A framework for L2 vocabulary assessment", Language Testing 18 (1), pp. 1-32, Arnold Publishing House.
24. Richards, K. (2003), Qualitative Inquiry in TESOL, Palgrave Macmillan, Great Britain.
25. Shiken: JALT Testing & Evaluation SIG Newsletter, (2), October 2000, pp. 8-12, downloaded at http://www.jalt.org/test/bro_8.htm
26. Stansfield, C. W. (2008), "Where we have been and where we should go", Language Testing 25 (3), pp. 311-326.
27. Ta, Thi Hien (2005), The Pros and Cons of the Multiple-choice Testing Technique with Reference to Methodological Innovation as Perceived by Secondary School English Language Teachers and Students, M.A. Minor Thesis, Post-graduate Department, Hanoi University of Languages and International Studies, Vietnam National University.
28. Thrasher, R. (2000), Language Testing: Test Theory and Design, International Christian University, downloaded at http://subsite.icu.ac.jp/people/randy/Test%20text%20grammar%20mcw.pdf
29. Tran, Thi Hieu Thuy (2008), Evaluation of an End-term Listening Test for First-year Mainstream Students of English Department, College of Foreign Languages, Vietnam National University, M.A. Minor Thesis, Post-graduate Department, Hanoi University of Languages and International Studies, Vietnam National University.
30. Trochim, W. M. (2006), The Research Methods Knowledge Base, 2nd Edition, downloaded at URL: (version current as of October 20, 2006).
31. Weir, C. J. (1990), Communicative Language Testing, Prentice Hall, Great Britain.
32. Wright, B. & Stone, M. (1999), Measurement Essentials, Wide Range, Inc., Wilmington.
