2. Design your tests so as to encourage and enable test takers to perform at their highest level of ability.
   Q: Do test takers need to be told in advance when the question-type structure changes?
3. Build considerations of fairness into test design.
   Q: How is the principle of fairness reflected in item writing?
4. Humanize the testing process: seek ways in which to involve test takers more directly in the testing process; treat test takers as responsible individuals; provide them with as complete information about the entire testing procedure as possible.
– Bachman & Palmer: Language Testing in Practice

Understanding the test: test types (no two tests are alike)
* By test purpose:
  Achievement test: graduation exams; midterm and final exams
  Proficiency test: NMET, TOEFL
  Aptitude test: SAT (Scholastic Aptitude Test)
  Diagnostic test: in-class exercises and quizzes
  Placement test
* By how test scores are interpreted:
  Criterion-referenced test
  NR-testing (norm-referenced): interest in seeing how candidates perform by comparison with each other
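* Do everything possible to understand the English level of candidates in the provinces that use the paper, and design a paper that fits actual needs;
* Building on the testing of language knowledge, focus on testing candidates' ability to use the language;
* As far as possible, provide candidates with authentic, context-rich language materials.
Studying the test: whole-paper difficulty
The gaokao is a norm-referenced exam, and provinces still decide admission by summing raw scores into a total. A subject's difficulty determines its weight in the total score.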
New difficulties: expanded vocabulary; review scope; ability requirements
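Teaching reflection: listening testing and training
Q: Where does the listening section sit within the whole paper?
Key elements of listening training: text-selection features + testing requirements + item-writing principles
Where does the difficulty lie? Text selection + item design + speech rate
How can training be made more effective? What to listen for beyond multiple-choice items
What is the foundation? How listening and reading are related and how they differ
Distraction and resisting distraction: curriculum-standard requirements
“Can overcome interference from common accents while listening.” – Senior High School English Curriculum Standards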
Pie chart (three dimensions of a grammar structure):
Form: How is the grammar structure formed?
Meaning: What does the grammar structure mean?
Use: When or why is the grammar structure used?
Item analysis: a look back at “multiple-choice gap-fill” items
- How about eight o’clock outside the cinema?
- That _____ me fine. (P=0.23; D=0.035) 2004 National Paper 2, item 26
A. fits 42.5%
B. meets 17.6%
C. satisfies 17.0%
D. suits 22.8%
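Design stage: draw up content specifications, publish the exam syllabus, and specify the content, the test structure and item types, and the scoring system.
Define the language ability to be measured: what to test, and how?
Operational stage: item writing, item review, pretesting, item analysis, paper production, administration, marking ……
Quality assurance system
Post-test stage: score release, statistics, data analysis, item analysis, correlations between sections, reliability of subjective and objective items.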
4. Humanize the testing process (“humanizing testing”): seek ways in which to involve test takers more directly in the testing process; treat test takers as responsible individuals; provide them with as complete information about the entire testing procedure as possible.
5. Demand accountability for test use; hold yourself, as well as any others who use your test, accountable for the way your test is used.
6. Recognize that decisions based on test scores are fraught with dilemmas (for test taker, test maker, & test marker), and that there are no universal answers to these.
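Paper    Provinces                                           Difficulty / Reliability
Paper 1  Guangxi, Shaanxi, Hainan, Tibet, Inner Mongolia     0.45 / 0.89
Paper 2  Henan, Hebei, Jiangxi, Anhui, Shandong, Shanxi      0.47 / 0.87
Paper 3  Sichuan, Jilin, Heilongjiang, Yunnan                0.55 / 0.88
Paper 4  Xinjiang, Ningxia, Gansu, Qinghai                   0.47 / 0.87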
Our philosophy of language testing
1. Relate language testing to language teaching and language use.
   The relationship between testing and teaching: a bottleneck for curriculum reform? “Testing should stand at the forefront of teaching reform”; “A small boat turns around easily.”
2. Design your tests so as to encourage and enable test takers to perform at their highest level of ability.
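Because a subject's difficulty determines its weight in the total score, the Ministry of Education stipulates that each subject's difficulty should be about 0.55, to keep the subjects roughly balanced.
– National Education Examinations Authority: Analysis of Gaokao Test Items (2007 edition); Analysis of Gaokao Test Items (2005 edition)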
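Overall stability; tentative steps forward
2004 - 2007 Hubei English paper: difficulty comparison
Mean score (science) | Difficulty (science) | Mean score (arts) | Difficulty (arts)
Province-set papers:
2004: Beijing, Shanghai + Tianjin, Liaoning, Jiangsu, Zhejiang, Fujian, Hunan, Hubei, Guangdong, Chongqing
2005, added: Shandong, Jiangxi, Anhui
2006, added: Sichuan, Shaanxi
2007: 19 English papers in total; candidate coverage 65%; 16 provinces set their own papers + 3 national papers: National 1 & 2 + the curriculum-reform paper for Ningxia & Hainan (= National 3); most papers adjusted their item-type structure
2004: four national papers, taken by different candidate populations, so not comparable: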
have had in that language. The content of a proficiency
test, therefore, is not based on the content or objectives of language courses which people taking the test may have followed. Rather, it is based on a specification of what candidates have to be able to do in the language in
Testing Tests
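Helps secondary schools implement quality-oriented education; helps expand universities' autonomy in running their institutions.
Evaluation criteria: 22 items, adapted to the characteristics of each subject
– National Education Examinations Authority, Ministry of Education
Scoring principles; scoring criteria; scoring procedure
Understanding the nature of the exam: “The English exam is designed according to the requirements of standardized testing.”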
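on testing form? What is the “three-dimensional model”? What is the benchmark for province-set papers? How does your own province's or city's paper reflect it? What can data analysis tell us? How do changes in item types reflect the item-writing style?
Language communicative competence test
Foreground discourse; stress application; attend to real-world use
Discourse: recordings of naturally occurring samples of language within their communicative context - David Nunan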
The 2007 gaokao in review & outlook for 2008
Oct 17, 2007, Dalian
Understanding the test: review and analysis
Studying the test: reflections and suggestions
Year: 2004 2005 2006 2007
Mean score (science): 88.8 86.6 86.5
Difficulty (science): 0.59 0.58 0.58
Mean score (arts): 76.6 71.2 77.1
Difficulty (arts): 0.51 0.47 0.52
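The difficulty figures in the Hubei comparison appear to be the mean score divided by the paper's full marks. The sketch below checks that reading against the numbers on the slide. The full-marks value of 150 and the function name `difficulty_from_mean` are assumptions made for illustration; the slide itself does not state how the figures were computed.

```python
# Assumption: the English paper is scored out of 150 marks.
FULL_MARKS = 150


def difficulty_from_mean(mean_score: float, full_marks: float = FULL_MARKS) -> float:
    """Whole-paper difficulty read as a scoring rate: mean score / full marks."""
    if full_marks <= 0:
        raise ValueError("full_marks must be positive")
    return mean_score / full_marks


# (mean score, difficulty) pairs as they appear on the slide.
science = [(88.8, 0.59), (86.6, 0.58), (86.5, 0.58)]
arts = [(76.6, 0.51), (71.2, 0.47), (77.1, 0.52)]

# Largest gap between the recomputed scoring rate and the reported difficulty.
max_gap = max(
    abs(difficulty_from_mean(mean) - reported)
    for mean, reported in science + arts
)

if __name__ == "__main__":
    for mean, reported in science + arts:
        print(f"mean {mean:5.1f} -> {difficulty_from_mean(mean):.3f} (reported {reported})")
```

Under this assumption the recomputed values stay within 0.01 of the reported ones, which suggests that a higher difficulty index here means an easier paper.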
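Getting to know the test: the nature of the NMET
Overall style + text-selection features + item-writing approach
From macro to micro: analyze the test items to get a feel for the style
Q: What are the positioning and style of the NMET? Is the emphasis on testing meaning, or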
Grammar Dimensions
Understanding the test, item analysis: listening
Q: What makes listening difficult? Text selection? Speech rate? Item design?
3. What is said about the woman? (P=0.37; D=0.11)
A. She spends more than she earns. 37.20%
B. She saves a lot each month. 16.30%
C. She has a tight budget. 46.43%
M: How do you spend your income?
W: About 30 percent for shelter, 30% for clothing, 40% for food and 20% for entertainment.
M: But that adds up to 120%.
W: That’s right.
(2007 Hubei paper)
Approach to items 1-5: Matching!
order to be considered proficient.”
Arthur Hughes: Testing for Language Teachers
Teaching reflection: the role of textbooks
Getting to know the test: functions of the gaokao & item-writing requirements
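Appropriate difficulty (P): pass rate (proportion answering correctly) / scoring rate (mean score)
Good discrimination (D): groups G1 - G5: D = (Hi - Lo) / n
Sound validity: the effectiveness of the measurement, i.e. whether the test tests what it is meant to test, and how effectively it achieves its purpose
Factors affecting validity: the item types used; the difficulty of the paper
Pursuit of reliability: the dependability of test results (the degree of consistency of the measurement results)
Factors affecting reliability: the number of items in the paper; marking accuracy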
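The two item statistics above can be sketched in a few lines. This is a minimal illustration, not the examination authority's procedure: it assumes that Hi and Lo are the numbers of correct answers in the highest- and lowest-scoring of five equal-sized groups (G1 and G5) and that n is the size of one group. The function names and the sample data are hypothetical.

```python
from typing import Sequence


def item_difficulty(item_scores: Sequence[int]) -> float:
    """P: proportion of candidates who answered the item correctly (1) rather than not (0)."""
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    return sum(item_scores) / len(item_scores)


def item_discrimination(
    total_scores: Sequence[float],
    item_scores: Sequence[int],
    n_groups: int = 5,
) -> float:
    """D = (Hi - Lo) / n, comparing the top and bottom of n_groups equal-sized groups.

    Assumption: candidates are ranked by total score and split into n_groups
    groups of n candidates each; Hi and Lo count correct answers to the item
    in the top and bottom groups.
    """
    if len(total_scores) != len(item_scores):
        raise ValueError("total_scores and item_scores must have the same length")
    ranked = sorted(zip(total_scores, item_scores), key=lambda pair: pair[0], reverse=True)
    n = len(ranked) // n_groups
    if n == 0:
        raise ValueError("not enough candidates for the requested number of groups")
    hi = sum(item for _, item in ranked[:n])   # correct answers in the top group
    lo = sum(item for _, item in ranked[-n:])  # correct answers in the bottom group
    return (hi - lo) / n


# Hypothetical data: ten candidates' total scores and their result on one item.
totals = [95, 90, 85, 80, 75, 70, 65, 60, 55, 50]
item = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

if __name__ == "__main__":
    print("P =", item_difficulty(item))
    print("D =", item_discrimination(totals, item))
```

With five groups of two candidates each, both top candidates and neither bottom candidate answer correctly, so the item separates the groups as sharply as it can. Real items fall well below that, as the D values quoted on the item-analysis slides show.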
The background and consequences of the surge in high scores in Beijing, 2006
Norm-referenced test
Should the gaokao test textbook content? Why?
“Proficiency tests are designed to measure people’s
ability in a language regardless of any training they may