Designing & evaluating an English reading test for the non-majors of Civil Engineering at Haiphong private university


Nguyen Thi Phuong Thu, August 2005

Vietnam National University, Hanoi
College of Foreign Languages

Designing & evaluating an English reading test for the non-majors of Civil Engineering at Haiphong Private University

Thiết kế đánh giá kiểm tra tiếng Anh chuyên ngành cho sinh viên xây dựng dân dụng trường Đại học Dân lập Hải Phòng

M.A. minor thesis
Field: Methodology
Code: 50702
Course: K11
By: Nguyen Thi Phuong Thu
Supervisor: Tran Hoai Phuong, MEd
Hanoi - August 2005

Acknowledgements

During the process of further study and of conducting this research, I was honored to receive guidance, assistance, and encouragement from various lecturers and supervisors. Among them, I would like to express my sincere thanks to the leaders of the College of Foreign Languages, who gave me permission and created favorable conditions for my study and research.

I would also like to thank my supervisor, Mrs. Tran Hoai Phuong, MEd, who really sympathized with me and gave me great help as well as invaluable guidance and encouragement from the very start to the end of my research.

It is also my pleasure to give special thanks to the students of classes XD 501, XD 502 and XD 503 at Haiphong Private University, who enthusiastically took the test and helped me collect its results.

I also benefited greatly from talks and discussions with my colleagues, so let me thank all of them for what they have directly or indirectly contributed.

And finally, I really want to thank my beloved husband, who always gives great support to my further study.

Nguyen Thi Phuong Thu

List of abbreviations

1. HPU: Haiphong Private University
2. CE: Civil Engineering
3. CEE: Civil Engineering English
4. ESP: English for Specific Purposes
5. MCQ: Multiple Choice Question
6. T: True
7. F: False
8. M: Mean
9. Σ: Sum of
10. N: The number of the scores
11. x: The raw score
12. f: The frequency with which a score occurs
13. H: The highest value
14. L: The lowest value
15. SD: Standard Deviation
16. FV: Item difficulty
17. R: The number of the correct answers
18. ve: very easy
19. e: easy
20. d: difficult
21. vd: very difficult
22. D: Item discrimination
23. CU: The number of the correct answers of the upper half
24. CL: The number of the correct answers of the lower half
25. gd: good discrimination
26. md: medium discrimination
27. bi: bad item
28. ρ: Spearman rho correlation coefficient
29. SU: Score on the upper half
30. SL: Score on the lower half

Table of contents

Acknowledgements
List of abbreviations
Part I: Introduction
  1. Rationale
  2. Aims of the study
  3. Scope of the study
  4. Methods of the study
  5. Design of the study
Part II: Development
Chapter one: Literature review
  1.1. Language testing
  1.2. Communicative language tests
  1.3. Testing reading skills
    1.3.1. Multiple choice questions
    1.3.2. Short answer questions
    1.3.3. Cloze
    1.3.4. Selective deletion gap filling
    1.3.5. C tests
    1.3.6. Cloze elide
    1.3.7. Information transfer
    1.3.8. Jumbled sentences
    1.3.9. Matching
    1.3.10. Jumbled paragraphs
  1.4. Major characteristics of a good test
    1.4.1. Reliability
    1.4.2. Validity
      1.4.2.1. Content validity
      1.4.2.2. Face validity
      1.4.2.3. Criterion-related validity
      1.4.2.4. Construct validity
    1.4.3. Practicality
    1.4.4. Discrimination
  1.5. Achievement tests
    1.5.1. Class progress test
    1.5.2. Final achievement test
  Summary
Chapter two: Methodology
  2.1. A quantitative study
  2.2. The selection of participants
  2.3. The materials
  2.4. Methods of data collection and data analysis
  2.5. Limitations of the research
  Summary
Chapter three: Discussion
  3.1. The content area of the test
  3.2. The relative weights of the different parts of the test
  3.3. Constructing the test
  3.4. Administering the test
  3.5. Marking the test
  3.6. Test scores interpreting and evaluation
    3.6.1. The frequency distribution
    3.6.2. The central tendency
      3.6.2.1. The mode
      3.6.2.2. The median
      3.6.2.3. The mean
    3.6.3. The dispersion
      3.6.3.1. The low-high
      3.6.3.2. The range
      3.6.3.3. The standard deviation
  3.7. Test item analysis and evaluation
    3.7.1. Item difficulty
    3.7.2. Item discrimination
  3.8. Estimating reliability
  Summary
Part III: Conclusion and recommendations
References
Appendices

Part I: Introduction

1. Rationale

Testing is a matter of concern to all teachers, whether we are in the classroom or engaged in syllabus/materials, administration or research. We know quite well that good tests can improve our teaching and stimulate student learning. Although we may not want to become measurement experts, we may have to periodically evaluate student performance and prepare reports on student progress.

Haiphong Private University (HPU) is a university in which there are a number of classes of Civil Engineering (CE) for students of the Construction Department. Generally speaking, non-majors, especially the students of this department, lack background knowledge of English. The non-majors of CE have the chance to learn General English (GE) during their first three terms to prepare for their 120 periods of English for Specific Purposes (ESP) in the fourth term. In fact, this type of English is quite demanding for them, and many have admitted that they could not learn it well. As a result, many students failed each final examination.

The causes of this situation are various. It might be because some students are either too hesitant or too lazy to learn a new subject. It might also be because some students could not overcome the difficulties they usually meet during their study: for example, their ESP is too new or too demanding for them, or they have to study many periods per week to leave time for other subjects. However, a reason which is no less important and which needs taking into account is the matter of testing.

In general, teachers at HPU are well qualified, and they teach enthusiastically and with good methodology. However, the
results of their students' tests are not always satisfactory: the scores they gained were often lower than expected. Moreover, we teachers cannot deny the fact that sometimes the test results do not accurately reflect the testees' language competence. According to Brown (1994a: 373) and Hughes (1989: 1), "A great deal of language testing is of very poor quality. Too often language testing has a harmful effect on teaching and learning and too often they fail to measure accurately whatever it is they are intended to measure."

For all the above reasons, the author of this research study would like to take this opportunity to undertake the study entitled "Designing a reading test for the non-majors of Civil Engineering at Haiphong Private University", with a view to evaluating the students' reading ability after one term's study in the last school year (2004-2005), as well as to gaining some knowledge and experience of foreign language testing for herself.

2. Aims of the study

The minor thesis is aimed at designing an achievement test of ESP reading to be conducted in a class of Civil Engineering English at HPU. The test was treated as a final examination. The results of the test are then analysed, evaluated, and interpreted. The test takers are non-English majors. The specific aims of the research are:

- to assess the learners' achievement in improving their reading skill in the English of Civil Engineering after a 120-period reading course
- to measure their aptitude for the reading skill
- to diagnose their strengths and weaknesses in reading the subject matter
- to find out whether or not the test satisfies the qualities of a good test

From there, the test will measure the effectiveness of the teacher's teaching. If the test is not a good one, some suggestions will be made for a better test form.

3. Scope of the study

"Not all language tests are of the same kinds. They differ with respect to how they are designed, and what
they are for; in other words, in respect to test method and test purpose." (McNamara, 2000: 5). For example, in terms of method, there are paper-and-pencil language tests, performance tests, etc. In terms of purpose, there are achievement tests, proficiency tests, and so on. In fact, the same form of test may be used for different purposes, although in other cases the purpose may affect the form.

Due to the limitations of time and ability, it is impossible for the author to design tests of all these types or of all four language skills (speaking, writing, listening and reading). Therefore, this minor thesis is limited to designing and evaluating an achievement test of ESP reading for the non-majors at HPU, and the reading tested was for communicative purposes.

4. Methods of the study

In this minor thesis the author designed an achievement test of reading, administered it and then evaluated it, so the method adopted is quantitative. The data are collected by testing the students' reading ability in Civil Engineering English.

5. Design of the study

The study is composed of three parts:

*Part I presents basic information: the rationale, the aims of the study, the scope of the study, the methods of the study and finally the design of the study.

*Part II includes three chapters:
+ Chapter one is the literature review, which presents the literature related to language testing and the major characteristics of a good reading test.
+ Chapter two is concerned with research methodology, including the methods adopted in doing the research, the selection of participants, the materials, and the methods of data collection and data analysis.
+ Chapter three is the discussion, which is the main part of the study. This chapter reviews how a reading test of Civil Engineering for the non-majors at HPU was designed, administered, and then evaluated.

*Part III includes the conclusion and recommendations for further research
on the topic. Following these parts are the references and appendices.

[...]

answers on a test. The frequency distribution of the reading test that the author conducted is presented in the diagram below:

[Figure: Raw marks on reading skill. Vertical axis: number of students; horizontal axis: scores.]

(It is essential to remember that the total score of the test is 50; however, after marking, the total score each student got was divided by 5 to suit the 0-10 scale previously approved by the board of examiners.)

The diagram above is largely self-explanatory: the vertical dimension indicates the number of candidates scoring within a particular range of scores; the horizontal dimension shows what these ranges are. Looking at the diagram, it is clear that the students got different marks ranging from 1 to 9, i.e. the lowest score was 1 and the highest was 9. The chart also shows that the set of scores was distributed quite unevenly: for example, no student got marks of 1.5, 2.5 or 8.5, and the most frequent score was 5.5. It also shows the outcome of the test clearly: students with marks of 5, 5.5, 6, 6.5, 7, 7.5, 8 and 9 would pass, and those with marks under 5 would fail.

3.6.2. The central tendency

A convenient way of summarizing data is to find a single statistic, called the CENTRAL TENDENCY, which represents an entire set of numbers. Central tendency can be defined as 'the propensity of a set of numbers to cluster around a particular value' (Brown and Rodgers, 2002: 128). Three statistics are often used to find central tendency: the mode, the median, and the mean.

3.6.2.1. The mode

The MODE is the value in a set of numbers that occurs most frequently. In a way, the mode is the simplest of the three central tendency statistics discussed here because it requires no computation. In this case the mode is 5.5 because it is the most frequent value.

3.6.2.2. The median

The MEDIAN is the point
in the distribution below which 50% of the values lie and above which 50% lie. To find the median for this case, first place the values in order from low to high. Then find the value above and below which 50% of the marks lie. Here the median is 5.

3.6.2.3. The mean

The most widely used measure of central tendency is the MEAN, which is more commonly called the AVERAGE. The mean is the sum of all the values in a distribution divided by the total number of values (50). The formula for the mean is:

$$M = \frac{\sum fx}{N}$$

where:
M = the mean
∑ = sum of (or add up)
N = the number of the scores
x = the raw score
f = the frequency with which a score occurs

Using the formula above we have:

[Table: raw score x (1 to 9 in steps of 0.5), frequency f, and product xf for each score; ∑xf = 276.5, ∑f = N = 50.]

$$M = \bar{x} = \frac{\sum xf}{\sum f} = \frac{276.5}{50} \approx 5.5$$

From the above analysis we have the mean ≈ 5.5 and the median = 5. As a result, there is a fairly close correspondence between the mean and the median. Compared with the results the students got in previous terms, this is acceptable, because when studying General English the scores they got in their exams were a little higher (the median and the mean generally ranged from ... to 7). There are two reasons for this. Firstly, they had a longer time to get in touch with General English (at least 225 periods). Secondly, General English was not so hard. Therefore, with a mean of 5.5 and a median of 5, the test results are quite satisfactory.

3.6.3. The dispersion

Knowing about the central tendency of a set of numbers is a highly helpful way of characterizing the most typical behavior in a group. It does not, however, tell us anything about the way the numbers spread out around that central or typical behavior. To know this we need to find the dispersion, which can be defined as 'the degree to which the individual numbers vary away from the central tendency' (Brown and Rodgers, 2002: 130). There are three primary ways of examining dispersion: the
low-high, the range, and the standard deviation.

3.6.3.1. The low-high

The LOW-HIGH involves finding the lowest value and the highest value in a set of numbers. Looking at the marks the testees got and putting the numbers in order from high to low, we can see immediately that the lowest value was 1 and the highest value was 9. Thus, the low-high is 1-9.

3.6.3.2. The range

The RANGE is the difference between the highest and the lowest scores, i.e. it is the highest value minus the lowest. The formula for the range of the reading test results is written as follows:

$$\text{Range} = H - L$$

where:
H = highest value
L = lowest value

→ the range of the test results = 9 - 1 = 8

A large range indicates that there was a wide range of abilities among the testees.

3.6.3.3. The standard deviation (SD)

The best overall indicator of dispersion of the reading test is the STANDARD DEVIATION. It is the degree to which the group of scores deviates from the mean. Brown (1988: 69) defined it as 'a sort of average of the differences of all scores from the mean'. The standard deviation is 'a sort of average' because you are averaging some values by adding them up and dividing by the number of values, just as you did in calculating the mean. So the equation for the standard deviation starts with adding up the squared differences between each value and the mean (5.5) and dividing by the number of test takers (50):

$$SD = \sqrt{\frac{\sum D^2}{N}} = \sqrt{\frac{\sum (X - M)^2}{N}}$$

where:
SD: standard deviation
X: values
M: the mean of the values
N: the number of the values

| Values (X) | Mean (M) | Difference (D) | Squared difference (D²) |
|---|---|---|---|
| 1 | 5.5 | -4.5 | 20.25 |
| 1.5 | 5.5 | -4 | 16 |
| 2 | 5.5 | -3.5 | 12.25 |
| 2.5 | 5.5 | -3 | 9 |
| 3 | 5.5 | -2.5 | 6.25 |
| 3.5 | 5.5 | -2 | 4 |
| 4 | 5.5 | -1.5 | 2.25 |
| 4.5 | 5.5 | -1 | 1 |
| 5 | 5.5 | -0.5 | 0.25 |
| 5.5 | 5.5 | 0 | 0 |
| 6 | 5.5 | 0.5 | 0.25 |
| 6.5 | 5.5 | 1 | 1 |
| 7 | 5.5 | 1.5 | 2.25 |
| 7.5 | 5.5 | 2 | 4 |
| 8 | 5.5 | 2.5 | 6.25 |
| 8.5 | 5.5 | 3 | 9 |
| 9 | 5.5 | 3.5 | 12.25 |
| | | | ∑D² = 106.25 |

$$SD = \sqrt{\frac{\sum D^2}{N}} = \sqrt{\frac{\sum (X - M)^2}{N}} = \sqrt{\frac{106.25}{50}} \approx 1.46$$

As seen above, the Standard Deviation is
the square root of the variance. The standard deviation (SD) is a very powerful measure of 'dispersion'. In this case we have a large standard deviation (1.46), which shows the following:
-the score distribution of the test was wide
-the test has spread the students out
-there was a wide range of ability among the testees

3.7. Test item analysis and interpretation

The results obtained from the test can be used to provide valuable information concerning:
+ the performance of the students as a group,
+ the performance of individual students,
+ the performance of each of the items comprising the test → the difficulty level and the level of discrimination

Therefore all 34 items of the reading test were analysed in terms of item difficulty and item discrimination as follows:

3.7.1. The item difficulty

The item difficulty (the index of difficulty or facility value = FV) of an item 'shows how easy or difficult the particular item proved in the test' (Heaton, 1988: 175). The formula of item difficulty (FV) is:

$$FV = \frac{R}{N}$$

where:
R: the number of correct answers
N: the number of the testees

i.e. level of difficulty = proportion of students getting it right = the average score on this item

*Note: the FV value does not tell us who got the item right. It tells us nothing about discrimination.

The scales for item difficulty are:
ve (very easy) with FV = 0.81-1 (i.e. 81 to 100% of students got it right)
e (easy) with FV = 0.61-0.8 (i.e. 61 to 80% of students got it right)
ok with FV = 0.41-0.6 (i.e. 41 to 60% of students got it right)
d (difficult) with FV = 0.21-0.4 (i.e. 21 to 40% of students got it right)
vd (very difficult) with FV = 0-0.2 (i.e. 0 to 20% of students got it right)

The calculation for the item difficulty is presented in the table below:

| Item | R | FV | Conclusion |
|---|---|---|---|
| 1 | 42 | 0.82 | ve |
| 2 | 32 | 0.64 | e |
| 3 | 44 | 0.88 | ve |
| 4 | 34 | 0.68 | e |
| 5 | 32 | 0.64 | e |
| 6 | 45 | 0.90 | ve |
| 7 | 25 | 0.50 | ok |
| 8 | 41 | 0.82 | ve |
| 9 | 22 | 0.44 | ok |
| 10 | 33 | 0.66 | e |
| 11 | 23 | 0.46 | ok |
| 12 | 32 | 0.64 | e |
| 13 | 20 | 0.40 | d |
| 14 | 32 | 0.64 | e |
| 15 | 49 | 0.98 | ve |
| 16 | 31 | 0.62 | e |
| 17 | 30 | 0.60 | ok |
| 18 | 24 | 0.48 | ok |
| 19 | 15 | 0.30 | d |
| 20 | 41 | 0.82 | ve |
| 21 | 16 | 0.32 | d |
| 22 | 46 | 0.92 | ve |
| 23 | 32 | 0.64 | e |
| 24 | 20 | 0.40 | d |
| 25 | 24 | 0.48 | ok |
| 26 | 16 | 0.32 | d |
| 27 | 31 | 0.62 | e |
| 28 | 8 | 0.16 | vd |
| 29 | 19 | 0.38 | d |
| 30 | 7 | 0.14 | vd |
| 31 | 7 | 0.14 | vd |
| 32 | 6 | 0.12 | vd |
| 33 | 12 | 0.24 | d |
| 34 | 7 | 0.14 | vd |

From the results in the table above it is clear that items 1, 3, 6, 8, 15, 20 and 22 were very easy, since their index of difficulty was more than 0.8, or 80%. In these cases, at least 81% of the students taking the test answered correctly. Items 2, 4, 5, 10, 12, 14, 16, 23 and 27 could be seen as easy, since their index of difficulty ranged from 0.61 to 0.8. With FV ranging from 0.41 to 0.6, items 7, 9, 11, 17, 18 and 25 were all right for the students. A few items were difficult (items 13, 19, 21, 24, 26, 29 and 33, with FV ranging from 0.21 to 0.4) or very difficult (items 28, 30, 31, 32 and 34, with FV ranging from 0 to 0.2).

3.7.2. The item discrimination

Item discrimination (D) indicates the extent to which the item discriminates between the testees, separating the more able testees from the less able. The formula of item discrimination is:

$$D = \frac{CU - CL}{N}$$

where:
CU: the number of the correct answers of the upper half
CL: the number of the correct answers of the lower half

The scales for item discrimination are:
gd (good discrimination) with D = 0.6-1
md (medium discrimination) with D = 0.3-0.59
bd (bad discrimination) with D = 0-0.29
bi (bad item) with D

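The item analysis in sections 3.7.1 and 3.7.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's own procedure: the function names are hypothetical, the band boundaries are taken from the scales quoted above, and the divisor of the discrimination formula is left as a parameter because the text does not restate what N stands for there. The threshold for a "bad item" is cut off in the text, so the sketch leaves such values unlabelled.

```python
N_TESTEES = 50  # number of testees reported in the thesis


def facility_value(correct, testees=N_TESTEES):
    """FV = R / N: proportion of testees who answered the item correctly."""
    return round(correct / testees, 2)


def difficulty_label(fv):
    """Map an FV to the thesis's bands: ve, e, ok, d, vd."""
    if fv > 0.8:
        return "ve"   # 0.81 - 1
    if fv > 0.6:
        return "e"    # 0.61 - 0.8
    if fv > 0.4:
        return "ok"   # 0.41 - 0.6
    if fv > 0.2:
        return "d"    # 0.21 - 0.4
    return "vd"       # 0 - 0.2


def discrimination_index(correct_upper, correct_lower, n):
    """D = (CU - CL) / N, with the divisor passed in explicitly (assumption)."""
    return (correct_upper - correct_lower) / n


def discrimination_label(d):
    """Map D to gd / md / bd; returns None below 0, where the source is cut off."""
    if d >= 0.6:
        return "gd"
    if d >= 0.3:
        return "md"
    if d >= 0:
        return "bd"
    return None


if __name__ == "__main__":
    # R values for items 15, 7 and 19, taken from the item difficulty table.
    for item, correct in [(15, 49), (7, 25), (19, 15)]:
        fv = facility_value(correct)
        print(item, fv, difficulty_label(fv))
```

Comparing against the lower bound of the next band (for example `fv > 0.6` rather than `fv >= 0.61`) keeps boundary values such as 0.60 in the "ok" band, which matches the scale as quoted.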
Date posted: 07/11/2012, 14:12
