A Brief History of Statistics and Data Science (Zhongnan University of Economics and Law)
Population and trade statistics
1346 The Florentine historian Giovanni Villani’s Nuova Cronica gives statistical information on the population and trade of Florence.
Error graph
1644 The Dutch astronomer Michael van Langren draws the first known graph of statistical data that shows the size of possible errors. It plots different estimates of the distance between Toledo in Spain and Rome in Italy.
Population census
1188 Gerald of Wales completes the first population census of Wales.
Binomial coefficients
1303 A Chinese diagram entitled “The Old Method Chart of the Seven Multiplying Squares” shows the binomial coefficients up to the eighth power. These numbers are fundamental to the mathematics of probability, and appeared five hundred years later in the west as Pascal’s triangle. In China the diagram is known as the Yang Hui (1261) triangle, and Jia Xian described it earlier still; Pascal’s (1662) triangle came later.
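The coefficients shown in the 1303 diagram can be regenerated in a few lines of Python. This is a minimal sketch for illustration only: the function name `binomial_rows` is an assumption and does not come from the talk. It builds each row from the previous one by adding adjacent entries, which is the rule the triangle encodes.

```python
def binomial_rows(max_power):
    """Return the rows of binomial coefficients for powers 0..max_power."""
    rows = [[1]]
    for _ in range(max_power):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries directly above it.
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows


# Print the triangle up to the eighth power, as in the 1303 diagram.
for power, row in enumerate(binomial_rows(8)):
    print(power, row)
```

The last row printed holds the coefficients of the expansion of (a + b) to the eighth power.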
I. Early Beginnings
II. Mathematical Foundations
III. Modern Era
I. Early Beginnings (450 BC to the 15th century)
Use of the mean
450 BC Hippias of Elis uses the average length of a king’s reign (the mean) to work out the date of the first Olympic Games, some 300 years before his time.
Census
AD 2 A Chinese census under the Han dynasty finds 57.67 million people in 12.36 million households. It is the first census from which data survives, and scholars still consider it to have been accurate.
Sampling inference
400 BC In the Indian epic the Mahabharata, King Rtuparna estimates the number of fruit and leaves (2095 fruit and 50 000 000 leaves) on two great branches of a vibhitaka tree by counting the number on a single twig, then multiplying by the number of twigs. The estimate is found to be very close to the actual number. This is the first recorded example of sampling, “but this knowledge is kept secret”, says the account.
II. Mathematical Foundations (16th century to the end of the 19th century)
Early probability
1560 The Italian Renaissance scholar Gerolamo Cardano calculates probabilities of different dice throws for gamblers.
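1086 Domesday Book: a survey made for William the Conqueror, king of England, of the villages and livestock in his new kingdom. It is the earliest record of official statistics in England (England then had about 1.5 million people, 90% of them farmers).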
Random sampling
1150 The Trial of the Pyx, an annual test of the purity and quality of coins from the Royal Mint, begins. Coins are drawn at random, in fixed proportions to the number minted. It continues to this day.
1693 Edmund Halley prepares the first mortality tables statistically relating death rates to age – the foundation of life insurance. He also drew a stylised map of the path of a solar eclipse over England – one of the first data visualisation maps.
Frequency analysis
840 Islamic mathematician Al-Kindi uses frequency analysis to break secret codes: the most common symbols in a coded message will stand for the most common letters. Al-Kindi also introduces Arabic numerals to Europe.
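The counting step behind frequency analysis can be sketched in Python. This is an illustrative sketch under stated assumptions, not Al-Kindi's own procedure: the function name `rank_symbols` and the sample ciphertext are hypothetical. It only ranks the symbols of a message by how often they occur; a codebreaker would then match the top-ranked symbols against the most common letters of the language.

```python
from collections import Counter


def rank_symbols(ciphertext):
    """Return the letters of the ciphertext, most frequent first."""
    # Ignore spaces and punctuation; treat upper and lower case alike.
    counts = Counter(ch for ch in ciphertext.lower() if ch.isalpha())
    return [symbol for symbol, _ in counts.most_common()]


# Hypothetical ciphertext: an English sentence with each letter shifted by 3.
ciphertext = "wkh txlfn eurzq ira mxpsv ryhu wkh odcb grj"
print(rank_symbols(ciphertext)[:3])
```

On a message this short the ranking is only a rough guide; the method becomes reliable as the amount of ciphertext grows.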
Yuan Wei
2016.12.10, Zhongnan University of Economics and Law
Francis Bacon (England): Histories make men wise.
Schlözer (Germany): Statistics is static history, and history is statistics in motion.
First book on probability
1657 The Dutch scientist Huygens completes On Reasoning in Games of Chance, the first book on probability theory. He also invented the pendulum clock.
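Hippias was born in Elis, in the north-west of the Peloponnese in Greece, and was a contemporary of Plato; he is said to be the first historian of mathematics. In 450 BC he used the mean length of the reigns of earlier kings to work out that the first Olympic Games had been held in 776 BC, more than 300 years before his time.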
Use of the mode
431 BC Attackers besieging Plataea in the Peloponnesian war calculate the height of the wall by counting the number of bricks. The count is repeated several times by different soldiers, and the most frequent value (the mode) is taken to be the most likely. Multiplying it by the height of one brick gives the height of the wall, and so the length of the ladders needed to scale it.
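The calculation described for the siege of Plataea can be sketched in Python. The brick counts and brick height below are hypothetical values chosen for illustration, not historical data, and the function name `wall_height` is an assumption.

```python
from statistics import mode


def wall_height(brick_counts, brick_height):
    """Estimate wall height as (most frequent count) x (height of one brick)."""
    likely_count = mode(brick_counts)  # the value reported most often
    return likely_count * brick_height


# Hypothetical counts of brick layers reported by different soldiers.
counts = [52, 51, 52, 53, 52, 50, 52]
# Hypothetical height of one brick, in centimetres.
print(wall_height(counts, 20))
```

Taking the mode rather than the mean means that a single badly wrong count has no effect on the estimate, as long as most soldiers agree.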
Census
AD 7 A census by Quirinius, governor of the Roman province of Judea, is mentioned in Luke’s Gospel as causing Joseph and Mary to travel to Bethlehem, the home of Joseph’s ancestral house of David, to be registered and taxed.
Mathematical foundations of probability
1654 In France, Pascal and Fermat correspond about dividing stakes in gambling games and together create the mathematical theory of probability.
Graph
10th century The earliest known graph, in a commentary on a book by Cicero, shows the movements of the planets through the zodiac. It is apparently intended for use in monastery schools.
Official statistics
1086 Domesday Book: a survey for William the Conqueror of farms, villages and livestock in his new kingdom. It marks the start of official statistics in England.
Mean and error
1570 The Danish astronomer Tycho Brahe uses the arithmetic mean to reduce errors in his estimates of the locations and movements of stars and planets.
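Population statistics
1663 The Englishman John Graunt uses London parish records, such as records of baptisms and other church services, to analyse and estimate the population of London. He also gives the first sex ratio for newborn babies, 52:48.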
First mortality table