Basic Methods of Corpus Research
The Nature of Corpus Linguistics
☺ The Wax Argument: Therefore, in order to properly grasp the nature of the wax, he cannot use the senses. He must use his mind. Descartes concludes:
Basic Methods of Corpus Research
☺ Types of comparison
☺ Across genres ☺ Across users ☺ Across different times ☺ Across (varieties of) language(s)
Basic Methods of Corpus Research
☺ Corpus comparability
Some Common Terms
☺ Corpus ☺ Corpus linguistics
Some Common Terms
☺ Token, type, lemma
The little boy looked at the other boys.
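The three counts can be worked through on the example sentence. The sketch below is a minimal illustration, not a standard tool: it assumes that case is folded (so "The" and "the" count as one type) and uses a hypothetical hand-made lemma table that covers only this sentence.

```python
import re

sentence = "The little boy looked at the other boys."

# Tokens: every running word (lowercased here, punctuation ignored).
tokens = re.findall(r"[a-z]+", sentence.lower())

# Types: the distinct word forms among the tokens.
types = set(tokens)

# Hypothetical lemma table for this sentence only; a real study
# would rely on a lemmatizer or a published lemma list instead.
LEMMA_TABLE = {"looked": "look", "boys": "boy"}
lemmas = {LEMMA_TABLE.get(token, token) for token in tokens}

token_count = len(tokens)
type_count = len(types)
lemma_count = len(lemmas)
type_token_ratio = type_count / token_count

print(token_count, type_count, lemma_count)  # prints: 8 7 6
```

Under these assumptions the sentence has 8 tokens, 7 types ("the" occurs twice) and 6 lemmas ("boy" and "boys" share the lemma BOY).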
Some Common Terms
☺ Collocation is defined as a sequence of words which co-occur more often than would be expected by chance.
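One way to make "more often than would be expected by chance" concrete is to compare the observed count of a word pair with the count expected if the two words were independent. The sketch below does this on a made-up toy text and reports pointwise mutual information (PMI) as one possible association measure; the text, the function names and the choice of PMI are illustrative assumptions, not part of the original slides.

```python
import math
from collections import Counter

# Made-up toy text, for illustration only.
text = (
    "he is a heavy smoker . she was a heavy smoker too . "
    "the heavy rain stopped . a heavy smoker coughs . the box is heavy ."
)
tokens = text.split()
n = len(tokens)

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))


def expected_bigram_count(w1, w2):
    """Count expected by chance if w1 and w2 occurred independently.

    Uses n as an approximation of the number of bigram positions.
    """
    return unigrams[w1] * unigrams[w2] / n


def pmi(w1, w2):
    """Pointwise mutual information: log2(observed / expected)."""
    observed = bigrams[(w1, w2)]
    if observed == 0:
        return float("-inf")
    return math.log2(observed / expected_bigram_count(w1, w2))


observed = bigrams[("heavy", "smoker")]
expected = expected_bigram_count("heavy", "smoker")
print(observed, round(expected, 2), round(pmi("heavy", "smoker"), 2))
```

A positive PMI means the pair occurs more often than chance would predict. On corpora this small the figures are only illustrative.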
☺ However, it seems that it is still the same thing: it is still a piece of wax, even though the data of the senses inform him that all of its characteristics are different.
The Nature of Corpus Linguistics
☺ Even within the corpus linguistics camp
☺ Corpus-driven: minimum theory-reliance. Exclusive reliance on corpus data for all theories
☺ Corpus-based: Reliance on corpus data for hypothesis-testing
☺ It is a fundamental part of the scientific method that all hypotheses and theories must be tested against observations of the natural world, rather than resting solely on reasoning and intuition.
☺ Similarly, a man could be described as beautiful, but this would usually imply that he had feminine features.
Some Common Terms
☺ Colligation is defined as a sequence of grammatical categories which co-occur more often than would be expected by chance.
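Sequences of grammatical categories can be counted in the same way as sequences of words once the text carries part-of-speech tags. The sketch below counts tag bigrams in a single hand-tagged sentence. The tags and the simplified tagset are assumptions made for illustration, and a real study would use an automatic tagger over a full corpus.

```python
from collections import Counter

# Hand-assigned tags from a simplified, hypothetical tagset.
tagged_sentence = [
    ("The", "DET"), ("little", "ADJ"), ("boy", "NOUN"),
    ("looked", "VERB"), ("at", "PREP"),
    ("the", "DET"), ("other", "ADJ"), ("boys", "NOUN"),
]

# Keep only the grammatical categories, then count adjacent pairs.
tags = [tag for _, tag in tagged_sentence]
tag_bigrams = Counter(zip(tags, tags[1:]))

for (first, second), count in tag_bigrams.most_common():
    print(first, second, count)
```

Whether a frequent tag sequence counts as a colligation still depends on comparing its observed frequency with the frequency expected by chance.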
Basic Methods of Corpus Research
☺ Corpus-based approach: a hypothesis-testing approach
☺ Corpus-driven approach: with as “few preconceived ideas” as possible, “keeping the amount of theory-reliance to a minimum in order not to hinder the process of discovering new phenomena” (Römer 2005)
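Basic Methods of Corpus Research
Liang Maocheng (梁茂成), National Research Centre for Foreign Language Education (中国外语教育研究中心)
Outline
☻ The nature of corpus linguistics
☻ Some common terms
☻ Basic methods of corpus research
The Nature of Corpus Linguistics
☺ Rationalism and empiricism
☺ Rationalism: I think therefore I am. ☺ Empiricism: My mind is a ‘blank slate’. Seeing is believing.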
The Nature of Corpus Linguistics
☺ The Wax Argument: He considers a piece of wax; his senses inform him that it has certain characteristics, such as shape, texture, size, color, smell, and so forth. When he brings the wax towards a flame, these characteristics change completely.
☺ Corpus-referenced/informed: Occasionally resorting to corpus data for illustrations
The Nature of Corpus Linguistics
☺ We firmly oppose any claim that disregards the facts of language
☺ No introspection can claim credence without verification through real language data (Teubert 2005).
Basic Methods of Corpus Research
☺ Linguistic features in corpus comparison
☺ Lexical ☺ Lexico-grammatical ☺ Syntactic ☺ Discoursal
Basic Methods of Corpus Research
☺ Statistical tests in corpus comparison
☺ However, this implies that she is not beautiful at all in the traditional sense of female beauty, but rather that she is mature in age, has large features and a certain strength of character.
☺ “And so something which I thought I was seeing with my eyes is in fact grasped solely by the faculty of judgment which is in my mind.”
The Nature of Corpus Linguistics
☺ Empiricism: Empiricism emphasizes those aspects of scientific knowledge that are closely related to evidence, especially as discovered in experiments.
☺ Introspective data: rationalism ☺ Experimental data: empiricism ☺ Authentic data: empiricism
The Nature of Corpus Linguistics
☺ Corpus linguistics advocates authentic data ☺ We do not exclude other data types
☺ Simple:
☺ Relationship (correlation, etc.) ☺ Difference (chi-square, log-likelihood, etc.)
☺ Complicated: regression analysis, factor analysis, cluster analysis, correspondence analysis
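The two simple difference tests can be sketched for the common case of one word counted in two corpora. The code below is a minimal sketch under stated assumptions: the log-likelihood function uses the two-term form often applied to word-frequency comparison, the chi-square function is Pearson's statistic on the 2×2 table without continuity correction, and the function names and all frequencies are made up for illustration.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-term log-likelihood for one word in corpora A and B."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    total = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            total += observed * math.log(observed / expected)
    return 2 * total


def chi_square(freq_a, size_a, freq_b, size_b):
    """Pearson chi-square on the 2x2 table (word vs. all other words)."""
    table = [
        [freq_a, size_a - freq_a],
        [freq_b, size_b - freq_b],
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (table[i][j] - expected) ** 2 / expected
    return statistic


# Made-up counts: 100 hits in 10,000 words vs. 50 hits in 10,000 words.
ll = log_likelihood(100, 10_000, 50, 10_000)
chi2 = chi_square(100, 10_000, 50, 10_000)
print(round(ll, 2), round(chi2, 2))
```

With one degree of freedom, a value above 3.84 corresponds to p < 0.05. Both statistics are zero when the word has the same relative frequency in the two corpora.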
Basic Methods of Corpus Research
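[Figure: flowchart of the corpus research process. Labels: research question; research design; software; comparison; corpus; reference corpus; results (words, phrases, collocations, semantic prosody, colligation, sentence patterns, etc.); data presentation; statistical tests; data analysis, interpretation and discussion; conclusion]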
Thank you.
Some Common Terms
☺ Semantic prosody is instantiated when a word such as CAUSE co-occurs regularly with words that share a given meaning or meanings, and then acquires some of the meaning(s) of those words as a result. This acquired meaning is known as semantic prosody. (Stewart 2010)
Basic Methods of Corpus Research
☺ Both approaches almost always involve a comparison of some kind.
Basic Methods of Corpus Research
☺ Sizes of corpora in comparison (Rayson 2003)
☺ Small <=> big ☺ Equal sizes
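When the two corpora differ in size, raw frequencies are not directly comparable. A common remedy, offered here as an illustration and not as the method the slide prescribes, is to normalize counts to a shared base such as one million words. The function name and the figures below are hypothetical.

```python
def per_million(raw_frequency, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return raw_frequency * 1_000_000 / corpus_size


# Made-up counts for one word in a small and a big corpus.
small_corpus_rate = per_million(150, 500_000)
big_corpus_rate = per_million(1_200, 10_000_000)

print(small_corpus_rate, big_corpus_rate)
```

In this made-up case the word has the higher raw count in the big corpus but the higher relative frequency in the small one.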
☺ a big smoker ☺ a strong smoker ☺ a hard smoker ☺ a heavy smoker ☺ a furious smoker
Some Common Terms
☺ It is quite possible, in fact, to describe a woman as handsome.
The Nature of Corpus Linguistics
☺ Science is considered to be methodologically empirical in nature.
☺ Corpus linguistics is empirical in nature.
The Nature of Corpus Linguistics
☺ Data types in language research