English Language Testing Courseware: Validity
If we wish to use this test score for a particular purpose, we must justify this by considering not only construct validity and value implications, but also relevance or utility of the particular use and the social consequences of using this test in this particular way.
Definition of validity
Validity: testing the test
Does a test measure what it is supposed to measure? This is the most important question in language testing, and it concerns the validity of the test. Traditionally, validity has been classified into different types, such as content, criterion, and construct validity. However, measurement specialists have come to view these as aspects of a unitary concept of validity that subsumes all of them. Messick (1989), for example, describes validity as ‘an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationale support the adequacy and appropriateness of inferences and actions based on test scores’. There are four points about this definition that we need to pay attention to.
Defining construct
the ‘something’ we want to measure ↓
a construct of mental ability ↓
a postulated attribute of people, assumed to be reflected in test performance
Construct Validity
“A measure estimates how much of something an individual displays. The basic question [of construct validity] is, what is the nature of that something?” (Messick 1975:957)
Thus, for justifying a particular interpretation of a test score, we must gather evidence for construct validity and consider the value implications of this interpretation.
Content relevance and content coverage
The investigation of content relevance requires not only the specification of the ability domain but also the specification of the test method facets.
1. Validity is a unitary concept.
2. Validity requires both empirical evidence and theoretical rationale.
3. We are not examining the validity of the test content or even of the test scores themselves, but rather the validity of the way we interpret or use the information gathered through the testing procedure.
4. Test validation is an ongoing process, and the interpretations we make of test scores can never be considered absolutely valid.
The unified framework of validity can be illustrated as a table that crosses the source of justification with the function of the outcome of testing.
Evidence supporting validity
Content relevance and content coverage
Analysis of the internal structure of the test
Criterion relatedness
Experimental evidence
Analysis of the processes underlying test performance
Consequential or ethical basis of validity
Logical analysis, expert judgment
Content relevance and content coverage
Limitations
1. We seldom have a domain definition that clearly and unambiguously identifies the set of language use tasks from which possible test tasks can be sampled.
2. Content relevance and content coverage focus on tests rather than on test scores, yet virtually all test use inevitably involves the interpretation of test scores as indicators of ability.
Defining construct
When we operationally define constructs as measures of language ability, we are making hypotheses about the relationship between these constructs and test scores, which can thus be viewed as behavioral manifestations of the constructs.
Evidence supporting validity
Content relevance and content coverage
Analysis of the internal structure of the test
Criterion relatedness
Experimental evidence
Analysis of the processes underlying test performance
Consequential or ethical basis of validity
Analysis of the internal structure of the test
Investigating the correlations among items or subscales of a test.
Correlation analysis
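As a rough illustration of this kind of correlation analysis, the sketch below computes Pearson correlations among the subscale scores of a test. The subscale names and the scores are hypothetical and are not taken from the courseware; the helper names `pearson` and `correlation_matrix` are likewise assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a subscale has no variance")
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores):
    """Correlate every subscale with every other subscale."""
    names = list(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


# Hypothetical subscale scores for six test takers (illustration only).
scores = {
    "listening": [12, 15, 9, 18, 14, 11],
    "reading": [14, 16, 10, 19, 13, 12],
    "grammar": [20, 22, 15, 25, 21, 17],
}

matrix = correlation_matrix(scores)
for row_name, row in matrix.items():
    cells = "  ".join(f"{value:5.2f}" for value in row.values())
    print(f"{row_name:10s} {cells}")
```

High correlations among subscales would be consistent with the hypothesis that they reflect a common ability, while low correlations would suggest distinct abilities. Either pattern is evidence to be interpreted against the theoretical definition of the construct, not a verdict on validity by itself.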
Facets of validity (rows: source of justification; columns: function of outcome of testing)

                      Test interpretation                        Test use
Evidential basis      Construct validity                         Construct validity + Relevance/utility
Consequential basis   Construct validity + Value implications    Construct validity + Relevance/utility + Social consequences
Messick sees this as a progressive matrix, with construct validity an essential component in each cell.
Validity
Definition of validity
Validity as a unitary concept
Construct validity
Defining construct
Defining construct validity
Evidence supporting validity
Content coverage refers to the extent to which the tasks required in the test adequately represent the behavioral domain in question.
Specific approaches to get evidence of content validity
Construct validation
Construct validation requires logical analysis and empirical investigation.
Logical analysis is involved in defining the constructs theoretically and operationally. This comprises the first two steps in measurement and provides the means for relating the theoretical definitions of the constructs to observations of behavior, that is, scores on language tests.
Defining construct validity
Construct validity concerns the extent to which performance on tests is consistent with predictions that we make on the basis of a theory of abilities.
Construct validity is thus seen as a unifying concept that integrates criterion and content considerations into a common framework for testing rational hypotheses about theoretically relevant relationships.