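2014 MathorCup Outstanding Paper, Problem B
2014 Graduate Mathematical Modeling Contest, Problem B, Outstanding Paper
3. Notation
Target radial distance
Target azimuth angle
Target elevation angle
Range measurement error in radar polar coordinates
Azimuth error in radar polar coordinates
Elevation error in radar polar coordinates
Standard deviation of the radar along the x axis of the Earth rectangular frame
Standard deviation of the radar along the y axis of the Earth rectangular frame
Standard deviation of the radar along the z axis of the Earth rectangular frame
Motion state of the target
1. Restatement of the Problem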
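Target tracking means continuously estimating a target's motion state from the measurements obtained by radar and other sensors, and from that estimate inferring the target's movement pattern and intent. Target maneuvering means that the magnitude and direction of the target's velocity change within a short time; acceleration is usually used as the measure. Tracking and maneuvering stand in a "spear and shield" relationship, which raises the question of how a radar can track accurately while the target maneuvers.
Maneuvering target tracking is difficult for the following reasons: (1) The model of the target's motion, that is, the state equation, is hard to establish accurately. Tracked targets are usually non-cooperative, and how the magnitude and direction of their velocity change is hard to describe. (2) Sensor accuracy is limited and external interference is present, so measurements such as range and angle contain random errors, and the measurement equation that describes the sensor cannot fully reflect the motion of the real target. (3) When several maneuvering targets are present, problems (1) and (2) must be solved and, in addition, each measurement must be assigned to a target; this is data association. Because of these challenges, and because maneuvering gives the target the tactical initiative, maneuvering target tracking has become a hot and difficult topic in tracking theory in recent years [1].
The processing pipeline is usually divided into track initiation, plot-to-track association (data association), and track filtering. In addition, different types of targets have different maneuvering capabilities, so the tracking model must be chosen according to the target type [2].
Using the three sets of maneuvering-target measurement data provided with the problem, this paper addresses the following problems:
Problem 1: Using the data in Data1.txt, determine the time intervals in which the target maneuvers and compile statistics on the magnitude and direction of its acceleration. Build a tracking model for the target and estimate its track from the measurements of several radars. Online tracking is encouraged.
Problem 2: The data in Data2.txt correspond to the flight envelopes of two targets in an actual flight inspection (flight inspection: the military sets specific flight routes according to national military standards to assess radar performance, so the envelopes are of practical combat relevance). Complete the data association for each target, form the corresponding tracks, and explain the criteria you adopt or devise (innovation is encouraged). A sequential real-time method is more meaningful. If the radar receives only one echo plot for a period of time, how can the track be kept from being lost? Give the processing results.
Problem 3: Using the data in Data3.txt, analyze how the maneuver of the space target changes (target acceleration as a function of time). If the tracking model of Problem 1 is applied, how do the results change?
Problem 4: Predict the trajectory of the target in Problem 3 in real time, estimate the coordinates of its landing point, give detailed results, and analyze the complexity of the algorithm.
Problem 5: The two targets in Data2.txt have been locked and tracked by the radar. Suppose the targets learn promptly whether they are being tracked, and the radar's measurement accuracy is given as a beam width of 3°, that is, any target inside the cone whose apex is the radar, whose axis is the line from the radar to the target, and whose half apex angle is 1.5° can be detected; the minimum interval between two successive radar scans is 0.5 s. What escape strategy should the targets adopt against your tracking model? Conversely, how should the tracking strategy change in order to keep tracking the targets?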
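The detection condition in Problem 5 (a cone with its apex at the radar, its axis along the radar-to-target line, and a half apex angle of 1.5°) can be sketched as a simple angle test. This is only an illustration, assuming all positions are given as Cartesian coordinates in one common frame; the function name `in_beam` and its parameters are hypothetical and do not come from the paper.

```python
import math

def in_beam(radar, tracked, other, half_angle_deg=1.5):
    """Return True if `other` lies inside the cone whose apex is `radar`
    and whose axis points from `radar` to `tracked` (assumed geometry)."""
    axis = [t - r for t, r in zip(tracked, radar)]
    vec = [o - r for o, r in zip(other, radar)]
    norm_axis = math.sqrt(sum(a * a for a in axis))
    norm_vec = math.sqrt(sum(v * v for v in vec))
    if norm_axis == 0 or norm_vec == 0:
        raise ValueError("points must differ from the radar position")
    cos_angle = sum(a * v for a, v in zip(axis, vec)) / (norm_axis * norm_vec)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) <= half_angle_deg
```

For example, a point displaced 20 m sideways at a range of 1000 m is about 1.15° off the axis and stays inside the beam, while a 30 m displacement (about 1.72°) falls outside it.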
MCM (Mathematical Contest in Modeling) 2014, Problem B
Team # 26254
2. The AHP
2.1 The hierarchical structure establishment
2.2 Constructing the AHP pair-wise comparison matrix
2.3 Calculate the eigenvalues and eigenvectors and check consistency
2.4 Calculate the combination weights vector
3. Choosing Best All Time Baseball College Coach via AHP and Fuzzy Comprehensive Evaluation
3.1 Factor analysis and hierarchy relation construction
3.2 Fuzzy comprehensive evaluation
3.3 Calculating the eigenvectors and eigenvalues
3.3.1 Construct the pair-wise comparison matrix
3.3.2 Construct the comparison matrix of the alternatives to the criteria hierarchy
3.4 Ranking the coaches
4. Evaluate the performance of other two sports coaches, basketball and football
5. Discuss the generality of the proposed method for Choosing Best All Time College Coach
6. The strengths and weaknesses of the proposed method to solve the problem
7. Conclusions
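2014 China Undergraduate Mathematical Contest in Modeling, Problem B
85.19, 93.02, 98.74, 103.02, 106.22, 108.59, 110.25, 111.31, 111.84
Slot length of the table legs (cm): 4.0903, 7.1384, 9.7455, 11.8915, 13.5746, 14.9417, 15.9603, 16.6140, 16.8944
Mathematical description of the table-leg edge curve:
First we compute the three-dimensional coordinates of the points on the table-leg edge curve. As shown in the figure, on each leg we take the point of its inner edge nearest the center of the tabletop and number these points from the outside inward: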
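% z-coordinates (cm) of the ten sampled points
z=[0 3.37 6.55 9.44 11.94 12.45 14.14 16.28 16.78 17.36];
% 100 evenly spaced x-values from -5 to 25 at which the curve is evaluated
xx=linspace(-5,25);
% cubic-spline interpolation of y and z as functions of x
yy=spline(x,y,xx);
zz=spline(x,z,xx);
% fitted curve in red, sampled points as circles
plot3(xx,yy,zz,'r',x,y,z,'o');
hold on;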
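Leg number: 2, 3, 4, 5, 6, 7, 8, 9, 10
Distance from the top of the slot to the top of the leg (cm): 20.7, 17.4, 14.9, 13, 11.6, 10.5, 9.7, 9.2, 8.9
Using expressions (1) and (2), we obtain, for the fully unfolded table, the rotation angle of each leg and the distance the steel rod slides inside each leg's slot; this sliding distance is the slot length (see Program 3 in the appendix).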
MathorCup Mathematical Modeling Contest Outstanding Paper
Key words: improved circle algorithm, TSP, reverse thinking, multi-objective linear programming, travel package
The Design of Family Summer Travelling Plan
3. The average speed of our car is 50 km per hour, and the average cost is 0.3 yuan per kilometer.
4. When we travel from place A to place B, we do not visit any attraction along the way.
5. Within one trip, the family members start from Chengdu and end in Chengdu.
6. Each day has 12 hours for travelling and 12 hours for rest.
7. No accident occurs during the trip.
8. After considering the surrounding tourist attractions, we choose the following national 5A and 4A attractions as potential destinations: Chengdu, Jiuzhaigou, Huanglong, Leshan, Emeishan, Siguniangshan, Danba, Dujiangyan, Qingchengshan, Hailuogou, Kangding.
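MathorCup Problem B
MathorCup is a global mathematics competition, and Problem B is one of its test problems of moderate difficulty.
Let us look at the problem:
There are n points on a number line; point i has position x_i and weight a_i.
Choose two points i and j (i ≠ j) whose distance is exactly k such that the sum of their weights is as large as possible.
Design an algorithm with time complexity no worse than O(n log n) to solve this problem.
Approach:
Suppose we fix a point i. A point j can form the optimal pair with i only if |x_i - x_j| = k.
Among all such points j we then only need the one with the largest weight.
We sort the points by position in ascending order and scan each point i from front to back.
For each i, we binary-search for a point j on the number line with |x_i - x_j| = k and keep track of the largest weight sum found so far.
The time complexity is O(n log n).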
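The sort-and-binary-search idea can be sketched as follows. This is an illustrative sketch, not a reference solution: it assumes each point is a (position, weight) pair with integer positions and k > 0, and it first keeps only the heaviest point at each position so that one binary search per position suffices. The function name is hypothetical.

```python
from bisect import bisect_left

def best_pair_at_distance(points, k):
    """points: iterable of (position, weight); returns the largest weight
    sum over pairs exactly k apart, or None if no such pair exists."""
    if k <= 0:
        raise ValueError("this sketch assumes k > 0")
    heaviest = {}
    for pos, weight in points:
        if pos not in heaviest or weight > heaviest[pos]:
            heaviest[pos] = weight
    positions = sorted(heaviest)            # O(n log n)
    best = None
    for pos in positions:                   # n binary searches
        target = pos + k
        idx = bisect_left(positions, target)
        if idx < len(positions) and positions[idx] == target:
            total = heaviest[pos] + heaviest[target]
            if best is None or total > best:
                best = total
    return best
```

Searching only for `pos + k` is enough, because a pair at distance k is always found from its left endpoint.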
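Summary:
This problem involves basic algorithmic knowledge such as binary search and sorting.
Practicing on it deepens our understanding of how these algorithms work and further improves our programming skills.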
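2014 National Contest in Mathematical Modeling, Problem B, Outstanding Paper
h tan r 2 l 2 r Substituting a table height of 70 cm and a tabletop diameter of 80 cm into the formula above gives an angle of 27.13° and a longest leg length of l1 = 78.65 cm. The board size is then 181.3 cm × 80 cm. From the functional relationship between the slot length Ri and the angle obtained in Model 2, we solve for the slot-length matrix of the wooden strips; for ease of manufacture, all slot lengths are set equal to the longest slot length, 34.87 cm. Using MATLAB simulation (see the appendix), we draw the dynamic unfolding process of the folding table (Figure 9).
5. Model Construction and Solution
5.1 Problem 1 5.1.1 Construction and solution of Model 1
The rectangular board measures 120 cm × 50 cm × 3 cm and is to be cut into a folding table with a circular top. Given the symmetry of the circular top and the known strip width, we assume that each set of legs has 19 strips. Considering the actual cutting process, we remove the two pieces 120 cm long and 1.25 cm wide at the two sides of the board (the shaded parts in Figure 1). From Figure 1 we abstract the rectangle that each strip occupies inside the semicircle, which gives Figure 2. Let the radius of the circular top be r and the strip width d = 2.5 cm; then by the Pythagorean theorem: $l^2 + d^2 = r^2$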
2014 Higher Education Press Cup China Undergraduate Mathematical Contest in Modeling
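Our assumptions specify that all 11 attractions are visited starting from Chengdu. We therefore use $F(i,j)_{11\times 11}$ as $w_2(i,j)_{11\times 11}$ to construct the undirected graph $UG_2$, and then run the improved circle algorithm once in MATLAB (see the appendix for the algorithm), which gives the following best route through the 11 attractions:
The Design of a Family Summer Travel Package
1. Restatement of the Problem
Summer vacation is one of the best times for family travel. When children are on summer break, many parents choose this time to take them on trips, broaden their horizons and strengthen family bonds. However, every family has its own requirements for a trip, such as the number of family members, the travel cost the family can afford, and the limit on travel time set by the length of the parents' leave. Choose a tourist city and, by considering the route, cost, time and other important factors, design different family summer travel packages for families with different requirements.
time for miscellaneous matters. 7. No unexpected event occurs during the whole trip, such as a sudden change in weather, a traffic accident or the closure of an attraction. In our model the trip goes entirely smoothly. 8. After considering the popular attractions around Chengdu, we selected the following 11 popular attractions from the national 5A attractions and some 4A attractions as candidate destinations for all families: Chengdu, Jiuzhaigou, Huanglong, Leshan, Emeishan, Siguniangshan, Danba, Dujiangyan, Qingchengshan, Hailuogou and Kangding.
return to the starting city, a route must be determined that minimizes the total path length. This is the travelling salesman problem (TSP). In graph-theoretic terms, it asks for a Hamilton cycle C of minimum weight in a weighted complete graph; such a cycle is called an optimal cycle. Unlike the shortest path problem and the connector problem, no efficient algorithm is currently known for the travelling salesman problem. A workable approach, however, is to find some Hamilton cycle and then modify it appropriately to obtain another Hamilton cycle of smaller weight. This modification method is called the improved circle algorithm. Let the initial cycle be $C = v_1 v_2 \cdots v_n v_1$. (1) For $1 \le i < i+1 < j \le n$, construct the new Hamilton cycle $C_{ij} = v_1 v_2 \cdots v_i v_j v_{j-1} v_{j-2} \cdots v_{i+1} v_{j+1} v_{j+2} \cdots v_n v_1$, which is obtained from C by deleting the edges $v_i v_{i+1}$ and $v_j v_{j+1}$ and adding the edges $v_i v_j$ and $v_{i+1} v_{j+1}$. If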
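The edge-exchange step described above can be sketched in a few lines. The excerpt breaks off before stating when the new cycle is accepted, so the acceptance test below (accept when the two added edges weigh less than the two deleted ones, as in the standard 2-opt move) is an assumption, and the function names are illustrative rather than taken from the paper's appendix.

```python
def cycle_length(tour, w):
    """Total weight of the closed tour under the symmetric weight matrix w."""
    n = len(tour)
    return sum(w[tour[t]][tour[(t + 1) % n]] for t in range(n))

def improve_cycle(tour, w):
    """Repeat the edge exchange until no exchange shortens the cycle."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 2):
            for j in range(i + 2, n):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if a == d:          # the two edges share a vertex; skip
                    continue
                # assumed acceptance rule: new edges (a,c),(b,d) are shorter
                if w[a][c] + w[b][d] < w[a][b] + w[c][d]:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour
```

The result is a local optimum with respect to this exchange, not necessarily the optimal cycle, which matches the text's description of the method as a practical heuristic.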
2014 National Mathematical Modeling Contest Paper, Problem B, Detailed Reference Solution
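Problem selected (one of A/B/C/D): B
Registration number (if the division assigns one):
School (full name): 农业大学
Team members (printed and signed): 1. 富顺 2. 安明梅 3. 熊万丹
Advisor or head of the advising group (printed and signed): advising group
Date: 10 September 2014
Design of a Solar Cabin
Abstract
Buildings are the main focus of solar energy use, which includes using solar energy to supply buildings with heat and electricity. When designing the photovoltaic cells it is therefore very important to consider how solar radiation intensity, the angle of incidence of the light, the environment, the geographic latitude of the building, the regional climate and weather conditions, and the mounting position and method (attached or elevated) affect the electricity output of the cells.
For Problem 1, starting from the data given in the problem and the material we collected, we process all the data and obtain the total radiation intensity on each face of the cabin, rank the faces to obtain the ratio of their radiation intensities, and use fuzzy comprehensive evaluation and MATLAB simulation to obtain the optimal value for the roof. Over its 35-year service life the cabin generates 343139.88 kWh of electricity, the economic benefit is 320,000 yuan, and the payback period of the investment is 14.33 years.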
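2014 Graduate Mathematical Modeling Contest Outstanding Paper, Problem B
1. Restatement of the Problem
Consider a spacecraft in planar motion subject only to the Earth's gravitation and the thrust of its own engine; treating the Earth and the spacecraft as point masses, build a mathematical model of the spacecraft's motion.
Such a model is clearly far from accurate enough for practical needs. In other high-technology problems that demand precision guidance and the like, we face the same issue: we must build high-precision mathematical models and estimate their many parameters with high precision, because only such models can solve real problems without a tiny error leading to a huge deviation in the result.
Because the spacecraft problem is too complex, this problem considers only the simpler task of determining high-precision parameters.
Suppose an ecosystem contains two species, A and B, where A is the predator and B is the prey.
Let the number of predators A at time t be $x(t)$ and the number of prey B be $y(t)$. They satisfy
$$\begin{cases} x'(t) = x(t)\,[\alpha_1 + \alpha_2\, y(t)] \\ y'(t) = y(t)\,[\alpha_3 + \alpha_4\, x(t)] \end{cases}$$
with initial conditions
$$\begin{cases} x(t_0) = \alpha_5 \\ y(t_0) = \alpha_6 \end{cases}$$
where $\alpha_k$ ($1 \le k \le 6$) are the undetermined parameters of the model.
Observation of this ecosystem yields the relevant data.
Using these data, solve the following problems: 1) If the observations are error-free and $\alpha_2$ is known, find the other five parameters $\alpha_k$ ($k = 1, 3, 4, 5, 6$). 2) If $\alpha_2$ is also unknown, how many sets of observations are needed at least to determine the parameters $\alpha_k$ ($1 \le k \le 6$)? 3) If the observations contain errors (the time variable is error-free), determine the optimal values of $\alpha_k$ ($1 \le k \le 6$) in some sense, compare them with simulation results, and improve the mathematical model accordingly.
4) If the time variable of the observations also contains errors, determine the optimal values of $\alpha_k$ in some sense.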
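Any parameter-fitting approach to these questions needs the forward model, that is, a way to compute x(t) and y(t) for given α1 to α6. The sketch below integrates the two equations with the classical fourth-order Runge-Kutta method; the integrator, the step count and the function names are illustrative choices, not something the problem statement prescribes.

```python
def derivatives(x, y, a1, a2, a3, a4):
    """Right-hand sides x' = x(a1 + a2*y), y' = y(a3 + a4*x)."""
    return x * (a1 + a2 * y), y * (a3 + a4 * x)

def simulate(alpha, t_end, steps):
    """Integrate from t0 = 0 to t_end; alpha = (a1, ..., a6),
    where a5 and a6 are the initial values x(t0) and y(t0)."""
    a1, a2, a3, a4, x, y = alpha
    h = t_end / steps
    for _ in range(steps):
        k1x, k1y = derivatives(x, y, a1, a2, a3, a4)
        k2x, k2y = derivatives(x + 0.5 * h * k1x, y + 0.5 * h * k1y, a1, a2, a3, a4)
        k3x, k3y = derivatives(x + 0.5 * h * k2x, y + 0.5 * h * k2y, a1, a2, a3, a4)
        k4x, k4y = derivatives(x + h * k3x, y + h * k3y, a1, a2, a3, a4)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y
```

With α2 = α4 = 0 the equations decouple into two exponentials, which gives a convenient sanity check of the integrator before it is used inside a least-squares fit.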
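2. Building the Spacecraft Motion Model
Consider a spacecraft in planar motion subject only to the Earth's gravitation and the thrust of its own engine, with the Earth and the spacecraft treated as point masses. From theoretical mechanics, the motion of a rigid body in space can be regarded as the motion of its center of mass, so the theorem of motion of the center of mass can be used to study how the center of mass moves.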
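2014 MCM Problem B Paper
(4.1) Using the expression above, the values $a_{ij}$ are set according to the 1-9 scale method.
● Compute the eigenvalues and eigenvectors. The largest eigenvalue of matrix A is $\lambda_{\max}$, and the corresponding eigenvector is $u = (u_1, u_2, u_3, \ldots, u_n)^T$. We then normalize the eigenvector by
$$w_i = \frac{u_i}{\sum_{j=1}^{n} u_j}$$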
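The eigenvector step can be illustrated with a short power-iteration sketch. It assumes a positive reciprocal judgment matrix; the consistency index CI = (λmax - n)/(n - 1) is the usual AHP definition and is added here as an assumption, since the excerpt does not show its formula. The function name and iteration count are illustrative.

```python
def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max, consistency_index) for a positive
    pairwise comparison matrix, using power iteration."""
    n = len(matrix)
    u = [1.0] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * u[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        u = [value / total for value in v]   # normalize so weights sum to 1
    au = [sum(matrix[i][j] * u[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(au[i] / u[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)          # assumed standard definition
    return u, lambda_max, ci
```

For a perfectly consistent matrix the largest eigenvalue equals n and the index is zero, which makes a convenient check of the routine.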
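2. Symbol Definitions and Assumptions
1. Symbol definitions
(1) Evaluation criteria
Symbol: Description
a_i: wins in year i
b_i: losses in year i
R: average SRS
O: average SOS
n_k: number of playoff places at each level
k_i: weight of each award
c_i: score of each contribution
(2) Analytic hierarchy process
Symbol: Description
A: judgment matrix
Largest eigenvalue of matrix A
Consistency index
Random consistency index
Weight set of the criteria level
Weight set of the alternatives level
Y: evaluation score of Model 1
2. Assumptions
1. We assume that all factors important to the evaluation have been considered.
2. We assume that coach-related factors not considered do not affect the ranking.
3. We assume that the data we collected are diverse and accurate and that the quantification is accurate.
4. We assume that an objective and accurate ranking of coaches exists and that the rankings given by the media reflect it accurately to some extent.
3. Presenting Our Indicators
3.1 Specifying the evaluation criteria
Players are evaluated mainly on five aspects [9]: strength, speed, skill, defense and attack. Similarly, a coach can be evaluated on five aspects: historical record, quality of the games, playoff performance, honors, and contribution to the sport. The following sections explain these five aspects.
● Historical record: The team's record undoubtedly takes the largest share in the evaluation of a coach. We compiled team records using mainstream statistical indicators and found wins and losses to be highly significant. A team's historical record directly reflects coaching ability. Total wins can be calculated as follows: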
O
i
SOSi t
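MathorCup Mathematical Modeling Award-Winning Paper
Research on Book Recommendation Based on the Analytic Hierarchy Process and a BP Neural Network
1. Restatement of the Problem
With the development of information technology and the Internet, people have gradually moved from an era of information scarcity into an era of information overload.
Both information consumers and information producers now face great challenges: for consumers, finding the information they are interested in among a mass of information is very difficult; for producers, making their information stand out and attract the attention of a wide audience is also very difficult.
Recommendation is an important tool for resolving this tension and is widely used in Internet products and applications, including the familiar related search, topic recommendation, product recommendation in e-commerce, and friend recommendation in social networks.
We have obtained user behavior data from a well-known online bookstore, including book ratings, book tag information and users' social relationships. Use the data to complete the following tasks.
1. Analyze the factors that influence users' ratings of books; 2. build a model to predict the ratings that the users in the attachment give to books; 3. for the users in the attachment, recommend to each user 3 books they have not read.
2. Analysis of the Problem
Rating and recommending books rests mainly on processing a large amount of statistical data.
Solving the problem therefore requires identifying the key useful data and transforming, filtering, analyzing and summarizing them; analyzing the factors that influence users' ratings; and on that basis building a rating model that supports both rating prediction and book recommendation.
Analysis of Problem 1
Problem 1 asks for the factors that influence users' ratings and calls for a comprehensive analysis of the attachment data. We first filter the raw data to obtain the records in which users rated books from 1 to 5. Then, considering the different influencing factors, we filter and analyze the other data to obtain a preliminary picture of how the rating at each level relates to the number of tags, to social friends, and to the number of views of the book.
Finally, we analyze and summarize the results to obtain the factors that influence users' ratings.
Analysis of Problem 2
Problem 2 asks for a model that predicts the ratings given by the users in the attachment.
We first study three aspects: the number of tags, social relationships, and book views. This is a multi-objective decision problem.
For this problem, the YAAHP analytic hierarchy software can be used to build a two-level hierarchy model (overall rating and criteria level). The analytic hierarchy process determines the weight of each indicator in the overall rating and yields a composite rating formula, which gives the rating model used for prediction.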
(Full version) Complete data for 2014 MCM Problem B
1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956
29 BAA
WSC
30 BAA
WSC
31 BAA
WSC
32 NBA
三
33 NBA
BOS
34 NBA
BOS
35 NBA
BOS
36 NBA
BOS
37 NBA
BOS
38 NBA
BOS
39 NBA
BOS
34 0.585
56
21
35 0.375
82
39
43 0.476
82
26
56 0.317
16
8
8
0.5
70
12
58 0.171
Stan Albeck Stan Albeck Stan Albeck Stan Albeck Bob Bass
1979 1980 1981 1982 1983
K.C. Jones
41 NBA
CAP
46 ABA
SAA
40 NBA
NOJ
42 NBA
WSB
47 ABA
SAA
43 NBA
WSB
42 NBA
NOJ
43 NBA
NOJ
44 NBA
NOJ
51 NBA
SAS
42 NBA
DET
79
5722 0.722806020
0.75
80
58
22 0.725
80
59
21 0.738
80
62
18 0.775
Coach Season
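MathorCup Contest Outstanding Paper
2048 Based on Monte Carlo Position Evaluation and UCT Game-Tree Search
1. Problem Statement
2048 is a recently very popular puzzle game; many players say that "once you start playing you simply cannot stop".
The rules of 2048 are simple: each move slides all tiles in the same direction; two tiles with the same number that collide merge into one tile carrying their sum; after each move a 2 or a 4 is generated at random in an empty cell; and the player wins on obtaining a "2048" tile.
If all 16 cells are filled and no two adjacent cells hold the same number, so that no move is possible, the game ends.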
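The rules above translate directly into two small routines. The sketch below handles a single row sliding left (other directions follow by reversing or transposing the board) and the end-of-game test; it assumes, as in the commonly played version of the game, that a tile produced by a merge does not merge again in the same move. Function names are illustrative.

```python
def merge_left(row):
    """Slide one row to the left and merge equal neighbours once per move."""
    tiles = [value for value in row if value != 0]
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)   # two equal tiles become their sum
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))

def is_game_over(board):
    """True when the board is full and no two adjacent cells are equal."""
    n = len(board)
    for r in range(n):
        for c in range(n):
            if board[r][c] == 0:
                return False
            if c + 1 < n and board[r][c] == board[r][c + 1]:
                return False
            if r + 1 < n and board[r][c] == board[r + 1][c]:
                return False
    return True
```

A Monte Carlo player can be built on top of these by applying random moves to a copy of the board until `is_game_over` returns True and recording the outcome.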
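This paper builds mathematical models to answer the following questions: 1. How can 2048 be reached? Give a general model and validate it with two indicators, the number of moves needed to finish the game and the probability of success. 2. After 2048 is reached the game can continue; what is the largest value that can be reached? If the grid is extended to N×N cells, what is the largest number that can be reached?
2. Problem Analysis
We first implement an artificial intelligence (AI) for 2048 based on Random-Max-Trees and the Alpha-beta pruning algorithm.
We regard 2048 as a game between a human and the computer: the human slides all tiles in one direction and merges them, and the computer places a "2" or "4" tile at random in an empty cell.
In the AI setting, however, both players are computers and neither is rational, so the more conservative Random-Max-Trees strategy is more appropriate than Mini-Max-Tree.
If the current position is taken as the parent node of the game tree and each position produced by a possible next move as a child node, then continuing with Random-Max-Trees alone is very inefficient and performs many unnecessary steps.
Every child node has children of its own, and the possibilities multiply until 2048 is reached; yet not every node has to be searched, and some are unnecessary.
To solve this problem, we adopt the Alpha-beta pruning algorithm.
For the first question, reaching 2048, Monte Carlo evaluation is a good solution: it runs a large number of simulations for each available move in the current position to obtain win/loss statistics, and in simple cases the move with the higher win rate can be regarded as the better move and selected.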
2014 MCM Problem B (Best Coach)
B
Summary
2014 Mathematical Contest in Modeling (MCM) Summary Sheet
This paper builds a model for selecting the five "best all time college coaches" in basketball, football and hockey respectively. The model is divided into six sub-problems: 1) selecting the best coach in one NCAA basketball season; 2) screening the top 20 to top 30 college coaches of the past 100 years; 3) further analyzing the selected coaches and ranking the top 5; 4) applying the model to football and hockey; 5) analyzing the impact of gender and of time; 6) analyzing why some famous coaches selected by magazines and media are not on our ranking list.
For sub-problem 1: We identify four indicators and determine the weight of each by the Analytic Hierarchy Process, then transform the indicator data into scores and sum the scores multiplied by their weights to get the final score. The best coach of a season is the one with the highest score.
For sub-problem 2: We score all coaches of the past 100 years with the method above and select the top 20. To reduce the influence of subjective factors, we use Principal Component Analysis to obtain another ranking list and again select the top 20. We then take the union of the two top-20 lists.
For sub-problem 3: We survey the coaches in the union to learn how many NBA players each has developed, and treat this as an additional indicator. All indicators are divided into three aspects (experience, leadership ability and player cultivation ability) and weighted by multilevel hierarchical analysis. Each coach is then scored by fuzzy comprehensive evaluation and ranked by score. To reduce the influence of subjective factors, we use Grey Correlation Analysis to calculate the grey correlation degree between a "perfect coach" and these coaches. A t-test then judges whether the two results differ significantly. The conclusion is that there is no significant difference, so the method is general.
For sub-problem 4: We fine-tune some of the indicators and weights and then apply the model to football and hockey.
For sub-problem 5: Using one-way analysis of variance, we conclude that gender does not affect the overall score or the ranking of coaches, but female coaches have unique advantages in the women's basketball league. Time has no significant effect on the overall score, although the most excellent coaches were active in the 1970s. Considering the history of NCAA development, we think it is necessary to raise the weight of the indicator for recently won championships in order to improve the model.
For sub-problem 6: We compare the career data of these coaches with our standards to find the reasons.
2014 MCM Problem B data: college coaches, hockey
93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
To Yrs 1976 16 1947 20 2007 1 1969 2 1996 1 1975 1 2010 2 1984 3 2008 23 1989 1 2012 2 2014 11 2002 2 1976 3 1978 1 1980 2 1982 3 1990 10 1994 11 2014 1 1996 1 1981 2 1970 3 1968 13 1976 3 1978 2 1954 11 1950 4 2013 3 2014 7 2002 30 2004 9 2000 7 1989 3 1995 1 1975 2 2004 14 1976 1 2014 6 1998 4 2014 4 2009 3 2014 9 1991 6 1919 2
46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92
Frank Carroll Wayne Cashman Bruce Cassidy Dave Chambers Art Chapman Guy Charron Gerry Cheevers* Don Cherry King Clancy* Dit Clapper* Odie Cleghorn Sprague Cleghorn* Cory Clouston Neil Colville* Charlie Conacher Lionel Conacher* Kevin Constantine Bill Cook* Jon Cooper Marc Crawford Pierre Creamer Fred Creighton Terry Crisp Joe Crozier Roger Crozier Randy Cunneyworth John Cunniff Alex Curry Leo Dandurand* Hap Day* Billy Dea Peter DeBoer Alex Delvecchio* Jacques Demers Cy Denneny* Bill Dineen Kevin Dineen Rick Dudley Dick Duff* Jules Dugal Art Duncan Red Dutton* Dallas Eakins Frank Eddolls Phil Esposito* Jack Evans John Ferguson
MathorCup Outstanding Paper, Problem B
Collaborative Filtering Model of a Book Recommendation System
Problem I: Identify the factors that influence users' scores of books. This requires mining the text and database information given in the problem, analyzing and screening the data reasonably, identifying factors that might affect book scores, and, through modeling, selecting the factors that actually affect how users rate books.
For Problem III: Based on collaborative filtering technology, we establish a collaborative filtering recommendation model with which the system can recommend to users the books they may be interested in. First, the coefficient matrix is established from the scoring matrix of Problem II. Second, by comparing the MAE values obtained with the modified cosine similarity matrix and with the Pearson similarity matrix, we find the users most similar to the target users in the annex. Finally, the system searches the books read by the most similar users and recommends suitable books to the target users.
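MathorCup Past Problems
MathorCup is a global mathematics competition covering mathematics at different grade levels and in different courses. Its past problems are wide-ranging and vary in difficulty, and they are well liked by mathematics enthusiasts.
This article introduces MathorCup problems from past years.
1. Overview of past MathorCup problems
There are several hundred past MathorCup problems, mainly in algebra, geometry, permutations and combinations, and probability and statistics.
Some focus on basic mathematical concepts and skills, while others involve more advanced knowledge and skills.
2. Examples of past MathorCup problems
(1) Algebra
1. Algebra problem, 1st MathorCup: Given the equations 2x+2y=10 and y-x=2, find x and y.
2. Algebra problem, 5th MathorCup: Solve the system 5x-4y=-15 and -x-4y=-9.
(2) Geometry
1. Geometry problem, 2nd MathorCup: As shown in the figure, in right triangle ABC, ∠BAC=90°, AD is the perpendicular to BC, AD=3 cm, AC=5 cm, BC=4 cm. Find the sine, cosine, tangent and cotangent values.
2. Geometry problem, 6th MathorCup: As shown in the figure, in square ABCD, E, F, G and H are points on AB, BC, CD and DA respectively. Given that EF=3 and that FE and GH intersect at point P, find the length of PG.
(3) Permutations and combinations
1. Permutations and combinations problem, 3rd MathorCup: Choose 3 digits from 0, 1, 2, ..., 9, without repetition, to form a 3-digit number. How many such 3-digit numbers are there?
2. Permutations and combinations problem, 7th MathorCup: There are 6 singers, numbered 1, 2, 3, 4, 5, 6. Three of them are chosen to take part in a performance; find the number of different selections.
(4) Probability and statistics
1. Probability and statistics problem, 4th MathorCup: A class has 4 boys and 6 girls. Three students are chosen at random; what is the probability that at least one of them is a boy?
2. Probability and statistics problem, 8th MathorCup: A group of manufacturers produced 1000 television sets in total, 200 of which are defective. One set is drawn at random; find the probability that it is not defective.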
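The counting and probability examples above can be worked out with a few lines of exact arithmetic. The sketch assumes that a 3-digit number may not start with 0 and that the three singers are chosen without regard to order; both are readings of the problem text rather than something it states explicitly. Variable names are illustrative.

```python
from fractions import Fraction
from math import comb, perm

# 3-digit numbers with distinct digits: 9 choices for the leading digit
# (1-9), then an ordered pair from the remaining 9 digits.
three_digit_numbers = 9 * perm(9, 2)

# Unordered selections of 3 singers out of 6.
singer_selections = comb(6, 3)

# P(at least one boy) = 1 - P(all three are girls), 4 boys and 6 girls.
p_at_least_one_boy = 1 - Fraction(comb(6, 3), comb(10, 3))

# P(a randomly drawn television set is not defective).
p_not_defective = 1 - Fraction(200, 1000)
```

Using `Fraction` keeps the probabilities exact, which makes the results easy to compare with hand calculation.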
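3. Conclusion
Past MathorCup problems cover topics from all areas of mathematics and test students' mathematical knowledge, computation and analytical skills.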
2014 MathorCup Outstanding Paper, Problem B
Collaborative Filtering Model of a Book Recommendation System

1. Restatement and Analysis of the Problem

With the continuous development of information technology and the Internet, a huge amount of information has emerged in front of us. Faced with this information, users find it hard to locate the content they are really interested in, and information providers find it hard to deliver quality information accurately to interested users. Studying book scores in order to recommend quality books to users is therefore of great value to information providers.

Problem I: Identify the factors that influence users' scores of books. This requires mining the text and database information given in the problem, analyzing and screening the data reasonably, identifying factors that might affect book scores, and, through modeling, selecting the factors that actually affect how users rate books.

Problem II: Predict the scores users would give to books they have not read, based on the annex predict.txt. The influencing factors identified in Problem I are used as the independent variables of an item-based score prediction model, from which the scores are obtained.

Problem III: Recommend to each user three books that the user has not read. From the user's point of view, the books read by people with similar interests are the relevant ones, and the books with relatively high scores among them become the final recommendations. The problem to solve is therefore how to identify the other users most similar to a given user.

2. Hypotheses of the Model

2.1 Through data mining, we consider only three possible factors: the number of labels, the attention, and the number of times a book has been read. Other factors are ignored.
2.2 Friend relationships are unidirectional.
2.3 Items that a user has not rated are assumed to take the average score of the row.
2.4 Missing values in the original data are not considered.

3. Illustration of Symbols

Sign: Illustration
$R_{ij}$: correlation coefficient
$x_i$: i = 1 denotes book label 1, i = 2 denotes book label 2
MAE: the mean absolute error
$sim_{xy}$: the similarity between $i_x$ and $i_y$
$P_{a,y}$: the predicted score of the target user for an item not yet rated
$NBS_u$: the nearest-neighbor set of user u
$\bar{R}_n$: the average score given by user n

4. Establishment and Solution of the Model

4.1 The decorrelation model based on principal component analysis

4.1.1 Relevant theory

Principal component analysis [1] is a statistical method that uses dimension reduction to transform many variables into a few principal components (integrated variables). Each principal component is a linear combination of the original variables, and the components are mutually uncorrelated, so a few components reflect most of the information in the original variables. The method overcomes the weakness that a single index cannot reflect the overall score: it takes in a wide range of indicators and attributes the complicated factors to a few principal components, which simplifies the problem and yields more scientific and accurate factors affecting book evaluation.

First, according to the data, we identify several factors that could affect book scores: (1) the frequency with which the book is read; (2) indirect attention (the book relationship data embodied in the users' social network); (3) the number of labels on the book.

Second, we test the three factors as a whole, that is, we analyze whether every single index is feasible, valid and authentic.
Feasibility refers to whether the correct value of the index can be obtained; an index whose data cannot be obtained accurately, or only at very high cost, is not feasible. Correctness means that the calculation method, scope and content of the index should be scientific. Authenticity mainly concerns the quality of the specific evaluation data, which must suit the particular comprehensive evaluation method.

Finally, the comprehensive evaluation index system of the measured object is divided into several parts or sides (subsystems) and subdivided step by step until every part and side can be described by concrete statistical indicators. To exclude the interference of irrelevant information, this paper applies principal component analysis to remove the overlapping factors among the indexes, which gives the factors that influence users' book scores.

4.1.2 Establishment and solution of the model

Based on the label data, the relationship data and the book data, we use MATLAB for data mining; the program is given in Appendix I. To exclude the interference of irrelevant information, principal component analysis is used to eliminate strongly correlated indexes and obtain the final evaluation indexes.

First, calculate the correlation coefficient matrix:

$$R = \begin{pmatrix} R_{11} & R_{12} & \cdots & R_{1p} \\ R_{21} & R_{22} & \cdots & R_{2p} \\ \vdots & \vdots & & \vdots \\ R_{p1} & R_{p2} & \cdots & R_{pp} \end{pmatrix} \quad (1)$$

In formula (1), $R_{ij}$ ($i, j = 1, 2, \ldots, p$) is the correlation coefficient of the original variables $x_i$ and $x_j$, calculated as follows:

$$R_{ij} = \frac{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)}{\sqrt{\sum_{k=1}^{n} (x_{ki} - \bar{x}_i)^2 \sum_{k=1}^{n} (x_{kj} - \bar{x}_j)^2}} \quad (2)$$

Because R is a real symmetric matrix ($R_{ij} = R_{ji}$), we only need to calculate the upper or the lower triangular elements, as shown in Table 1.

Table 1 Correlation Coefficients Matrix

According to the correlation analysis in Table 1, the correlation between the reading frequency of a book and its number of labels is relatively large, so the reading frequency is eliminated. The factors that finally affect users' book scores are shown in Figure 1.

4.2 For each BookID, the Booktag labels it belongs to can be understood as the reader's preferences for types of books, giving the reader's reading feature $x_i = (x_1, x_2)$. The reading feature contains implicit information about the reader's attitude to books, and mining the associated data yields the relationship between books and user ratings. Many data mining methods exist, such as linear regression, machine learning system design, and support vector machines. In the literature, the machine learning process can be regarded as solving for the optimal parameters of a mathematical model, so in a broad sense learning can be turned into an optimization problem. Three elements affect the efficiency and effectiveness of learning: the hypothesis function, the cost function, and the gradient descent function.

Weighing the advantages and disadvantages of each method, we use optimized multivariable linear regression [3] to model the relationship between book features and readers' scores. After data processing we obtain the users' book score table, a two-dimensional array, as shown in Table 2.

Table 2 List of Being Predicted with Known Data

The scoring matrix is processed column by column, and the linear regression models are optimized to obtain the model parameters $\theta'$. In the multivariate case the output is determined by multi-dimensional input features.
Multiple linear regression model: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$. We select two features for regression prediction. To enhance the accuracy of the model, for each reader j a constant feature $x_0$ and a parameter vector $\theta_j \in \mathbb{R}^{n+1}$ are introduced, and a separate $\theta_j$ is trained for each user. The optimization model is as follows:

Gradient descent update:

For the univariate case, the gradient descent parameter updates are:

$$\begin{cases} \theta_0 := \theta_0 - \alpha \dfrac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\ \theta_1 := \theta_1 - \alpha \dfrac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} \end{cases} \quad (3)$$

4.2.2 Solution of the model

The parameters $\theta'$ are obtained by training the linear regression in MATLAB; the optimization program is given in Appendix II. The solution process and results for the six books of the user with ID 7625225 are given in Tables 3 and 4, and the remaining five predicted scores in Table 3.

Table 3 Preliminary Treatment of the Data

Table 4 Rating Prediction of User ID 7625225 on 6 Books

4.2.3 Test of the model

Figure 2 compares the predicted scores of user ID 7625225 with the known scores for six books.

Fig. 2 Contrast between Predictive Value and the True Value

The chart shows that the predicted values fluctuate around the actual values given in the problem, and the absolute error, calculated by SPSS, is a relatively small 0.015.
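The update rule of formula (3) extends to several features by applying the same step to every parameter. The following sketch shows batch gradient descent for a multiple linear regression; it is an illustration only, since the paper's own MATLAB program in Appendix II is not reproduced in this excerpt, and the learning rate, iteration count and function names are assumed values.

```python
def predict(theta, x):
    """h_theta(x) = theta_0 + theta_1*x_1 + ... + theta_n*x_n."""
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))

def gradient_descent(xs, ys, alpha=0.1, iterations=5000):
    """Batch gradient descent on the mean squared error (assumed settings)."""
    m = len(xs)
    n = len(xs[0])
    theta = [0.0] * (n + 1)
    for _ in range(iterations):
        errors = [predict(theta, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grads = [sum(e * x[j] for e, x in zip(errors, xs)) / m for j in range(n)]
        theta[0] -= alpha * grad0            # intercept update
        for j in range(n):
            theta[j + 1] -= alpha * grads[j]  # feature-weight updates
    return theta
```

All parameters are updated from errors computed before the step, which is the simultaneous update that the rule requires. In practice the learning rate must be small enough for the iteration to converge on the data at hand.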
Therefore, the scores predicted by the model are fairly accurate.

4.3 Collaborative filtering recommendation model

4.3.1 Recommendation principles

To recommend to each user 3 books they have never read, we follow the principle of item-based CF [2,4]. Neighbors are computed from the items themselves rather than from the user's point of view: similar items are found from users' preferences on items, and then similar items are recommended to a user according to that user's historical preferences. Computationally, all users' preferences for an item form a vector, the similarity between items is computed from these vectors, and once the similar items are known, the user's historical preferences are used to predict the items for which the user has not yet expressed a preference, yielding a ranked list as the recommendation. For example, for item A, the historical preferences of all users show that users who like item A also like item C, so item A and item C are quite similar; user A likes item A, so we can infer that user A may also like item C. The book recommendation flow chart is as follows:

Fig. 3 The Books Recommended Flow Chart

The flow chart shows that item-based collaborative filtering assumes that if most users give similar scores to some items, the current user will also give similar scores to these items. Finding the similarity is therefore especially important in this paper. The collaborative filtering algorithm in the flow chart of Figure 3 starts from the score matrix, obtains the similarity relationships, and thereby finds the books. For example, users 1, 2 and 3 score items A and B as {3,4,5} and {3,4,5}, and a fourth user scores item A as 1. Because the scores of A and B are very similar, A and B are very similar, and we can expect the fourth user's score on item B to be close to that user's score on item A. Collaborative filtering is therefore an appropriate recommendation algorithm.

Fig. 4 Flow Chart of Collaborative Filtering Recommendation Algorithms

4.3.2 Establishment of the model

According to the work flow chart above [5], the item-based method requires three steps:
(1) obtaining the user-item score data;
(2) searching for the nearest neighbors of the target items, that is, calculating the similarity;
(3) generating the recommendation.

First, the score data are provided by the second model. Then the nearest-neighbor method with the Pearson, cosine and modified cosine similarity algorithms is used to calculate the similarity between users.

(I) Pearson similarity algorithm:

(II) Cosine similarity algorithm:

In the score matrix, all the scores of an item can be regarded as a column vector, and the similarity of two items is represented by the cosine of the angle between their column vectors. $sim_{xy}$ indicates the similarity between $i_x$ and $i_y$, $I_{xy}$ represents the item set rated by both $u_x$ and $u_y$, and $r_{xi}$ and $r_{yi}$ represent the scores given by $u_x$ and $u_y$.
The modified cosine similarity algorithm:

$$sim_{xy} = \frac{\sum_{i \in I_{xy}} (r_{ix} - \bar{r}_i)(r_{iy} - \bar{r}_i)}{\sqrt{\sum_{i \in U_x} (r_{ix} - \bar{r}_i)^2} \times \sqrt{\sum_{i \in U_y} (r_{iy} - \bar{r}_i)^2}} \quad (5)$$

Cosine similarity does not account for the different rating scales of different users: some users tend to score low and others tend to score high. The modified cosine similarity corrects this defect by subtracting the user's average score; $\bar{r}_i$ represents the average score of user $u_i$. The MATLAB procedures that build the user-item matrix and calculate the modified cosine similarity matrix are given in Appendix III.

Fig. 5 Calculation Diagram of the Collaborative Filtering Algorithm Based on Item Similarity (steps: enter a rating matrix, similarity calculation, CF recommendation algorithm)

According to the similarity calculation method, the user-item neighbors are selected on the principle of a similarity threshold. MATLAB is used for data filtering and calculation [6,8,9], and the data are segmented into book labels, user ratings, readers' historical data and social relationship data. Mining with MATLAB extracts the user factor matrix and the item factor matrix and uncovers the implicit information and relationships in the massive data: the correspondence between book IDs and tag numbers, the user rating matrix, the social network relations between users and books, and the similarity matrix between users. Books are also classified, which provides technical support for algorithm design and data processing. The two-dimensional point-set diagram is shown in Figure 6.

Neighbor selection by similarity threshold restricts the maximum distance of a neighbor: all points that fall within the region of radius K centered on the current point are taken as neighbors. The number of neighbors obtained this way is not fixed, but the similarity does not suffer large errors. In B of Figure 6, starting from point 1, the points within the K neighborhood are points 2, 3, 4 and 7. Neighbors computed this way have better similarity than a fixed number of neighbors, especially when isolated points are handled.

Fig. 6 Similar Neighbors Calculation Sketch

After the nearest neighbors of the target user have been found, the next step is to generate the corresponding recommendation. Let the nearest-neighbor set of user u be $NBS_u$. The predicted score $P_{u,i}$ of user u for item i is obtained from the scores that the nearest neighbors in $NBS_u$ give to the item, calculated as follows:

$$P_{u,i} = \bar{R}_u + \frac{\sum_{n \in NBS_u} sim(u,n) \times (R_{n,i} - \bar{R}_n)}{\sum_{n \in NBS_u} |sim(u,n)|} \quad (6)$$

Here $sim(u,n)$ indicates the similarity between user u and user n, $R_{n,i}$ is the score of user n for item i, and $\bar{R}_u$ and $\bar{R}_n$ are the average scores given by user u and user n.

4.3.3 Test and solution of the model

Evaluation criteria for recommendation include statistical accuracy measures and decision-support accuracy measures.
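The prediction step of formula (6) can be sketched as follows. The sketch assumes ratings are stored as a dictionary mapping each user to a dictionary of item scores and that the similarities to the neighbors have already been computed by one of the similarity measures above; the data layout and function names are illustrative and do not come from the paper's MATLAB code.

```python
def mean_rating(ratings, user):
    """Average score the user has given over all rated items."""
    scores = ratings[user].values()
    return sum(scores) / len(scores)

def predict_score(ratings, similarities, user, item):
    """Mean-centred weighted average over neighbours who rated `item`.
    similarities: dict neighbour -> sim(user, neighbour)."""
    numerator = 0.0
    denominator = 0.0
    for neighbour, sim in similarities.items():
        if item in ratings.get(neighbour, {}):
            deviation = ratings[neighbour][item] - mean_rating(ratings, neighbour)
            numerator += sim * deviation
            denominator += abs(sim)
    base = mean_rating(ratings, user)
    if denominator == 0:
        return base          # no usable neighbour: fall back to user mean
    return base + numerator / denominator
```

Centering each neighbor's score on that neighbor's own mean serves the same purpose as the mean subtraction in the modified cosine similarity: it removes the effect of users who rate systematically high or low.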
The mean absolute error (MAE) is a statistical precision measure that is easy to understand and measures recommendation quality directly. It is one of the most commonly used measures of recommendation quality, so we adopt MAE as the metric. MAE measures accuracy through the deviation between the predicted user ratings and the actual user ratings: the smaller the MAE, the higher the quality of recommendation. If the set of predicted user ratings is $\{p_1,p_2,\ldots,p_N\}$ and the corresponding set of actual user ratings is $\{q_1,q_2,\ldots,q_N\}$, then MAE is defined as

$$MAE=\frac{\sum_{i=1}^{N}\lvert p_i-q_i\rvert}{N} \quad (7)$$

We calculate the MAE value of the recommendation and analyze the experimental error. First we calculate the category-attribute similarity between every new item and the other items, obtain the nearest neighbors of each new item by category similarity, make the prediction, calculate the MAE value and draw the MAE line graphs shown in Figure 7 and Figure 8.

Fig. 7 MAE Contrast Chart between the Improved Cosine and Pearson Prediction Formulas

Fig. 8 MAE Contrast Line Chart between the Former and Modified Cosine Formulas

Figures 7 and 8 compare three similarity formulas: cosine, adjusted cosine and Pearson. The MAE of the improved prediction formula is lower than that of the Pearson formula and of the former formula. This indicates that the improved formula raises the precision of the recommendation system and, to a certain extent, improves recommendation accuracy when the rating data are sparse.

Using MATLAB [7] to search and match, we finally recommend 3 books to each of the six users. The results are shown in Table 5.

Table 5 Each User Recommended 3 Books

5. Improvement of the Model

From the given data file user_book_score.txt, the item-to-item similarity processing in MATLAB yields a 3557 × 7757 sparse matrix. Because the rating matrix is sparse, the set of users who rated both of two items is small or even empty, so the calculated similarity is likely to be very small or even zero. However, an empty co-rating set does not mean that two items are completely dissimilar. Although there are many users, few of them rate the same pair of items, and the view of such a minority does not represent everyone. The general similarity algorithm does not treat items with few common ratings separately, so the similarity computed for them is inaccurate. This reduces the accuracy of the similarity calculation and, ultimately, of the recommendation.

From this we conclude that the more users have rated two items together, the more reliable the calculated similarity; conversely, the fewer users have rated them together, the more questionable it is. The general similarity calculation should therefore be improved by adding a factor, the size of the co-rating intersection, and using it to damp similarities that were amplified by small intersections. This gives the similarity formula:

$$sim_{new}(i_a,i_b)=f(num_{ab})\times sim_{old} \quad (8)$$

Here $sim_{new}(i_a,i_b)$ is the similarity of items $i_a$ and $i_b$ obtained by the improved algorithm, $num_{ab}$ is the number of users who rated both items, $f(num_{ab})$ is a function of $num_{ab}$ whose output expresses how strongly the intersection size influences the general similarity, and $sim_{old}$ is the similarity obtained by the ordinary method. We select

$$f(num_{ab})=\begin{cases}\dfrac{num_{ab}}{D}, & num_{ab}<D\\[4pt] 1, & num_{ab}\ge D\end{cases}$$

D is the boundary of the intersection size. When at least D users have rated both items, the similarity from the traditional formula is reliable. When fewer than D users have rated both, it is unreliable, so it is multiplied by the weighting factor $f(num_{ab})$.

Class properties (item categories) also need to be considered when computing item similarity, because they alleviate the new-item problem:

$$sim(i,j)=\begin{cases}sim_{new}(i,j), & U\neq\varnothing\\ sim_{sort}(i,j), & U=\varnothing\end{cases} \quad (9)$$

Compared with the traditional algorithm, one advantage of the improved algorithm is that the similarity between items can always be calculated, whatever the state of the recommendation system. Many new items have no rating data, but their nearest neighbors can very likely be found through category similarity. The system can then predict and recommend through the nearest neighbors of new items, which alleviates the new-item problem to a certain extent.

6. Evaluation and Promotion of the Model

6.1 Advantages

1. All the analysis is based on the obtained data, so the prediction results are convincing.
2. Principal component analysis is applied to inspect the selected factors as a whole, and it identifies the influence of each factor on the score in a sound and reasonable way.
3. The linear regression is optimized, and the prediction precision is improved.
6.2 Disadvantages

We put forward an approach to the problem of sparse data, but due to time constraints we did not fully solve the sparsity problem.

6.3 Promotion of the model

With the volume of information now soaring, recommendation systems are particularly important and receive more and more attention from academic circles. This paper selects the most widely used recommendation technology, collaborative filtering, as the model for studying scores and book recommendation, and draws reasonable conclusions. The mathematical model established in this paper has strong "portability": it can be applied widely to networks, media, film and other areas.

7. References

[1] Gaoxiangbao, Donghanqing. SPSS data analysis and application. Beijing: Tsinghua University Press, 2007.
[2] DENG Ai-Lin, ZHU Yang-Yong, SHI Bai-Le. A Collaborative Filtering Recommendation Algorithm Based on Item Rating Prediction. Journal of Software, 1000-9825/2003/14(09)1621:1624-1626, 2003.
[3]
[4] Jijun. The Film Website Construction Based on a Collaborative Filtering Recommendation Algorithm. Harbin Institute of Technology, 2009, 12.
[5] Yaozhong, Weijia, Wuyue. China Journal of Information Systems, l22, Oct. 2008: 78-96.
[6] Donglin. Bingjing: Publishing House of Electronics Industry, 2009.1.
[7] Chenjie. MATLAB Collection. Beijing: Publishing House of Electronics Industry, 2011.
[8]
[9]

Appendix

Appendix I

1.
    load('peoplebeviewdinuser_social.mat')
    % count how often each value occurs in the sorted column vector
    x = peoplebeviewd(:); x = sort(x);
    d = diff([x; max(x)+1]);
    count = diff(find([1; d]));
    frenqency_of_people_be_viewd = [x(find(d)) count];

2.
    load('booktag.mat')
    load('booktable.mat')
    B = sort(booktabel(:));
    a = B(B~=0);
    x = a(:); x = sort(x);
    d = diff([x; max(x)+1]);
    count = diff(find([1; d]));
    y = [x(find(d)) count];

3.
    load('usersid of user_read_history.mat')
    load('books which are read of user_read_history.mat')
    user_read_history = [VarName4 VarName5];
    sorted_user_read_history_by_bookid = sortrows(user_read_history, 2);
    B = sort(VarName5(:));
    x = B(:); x = sort(x);
    d = diff([x; max(x)+1]);
    count = diff(find([1; d]));
    frenqency_of_bookid = [x(find(d)) count];

Appendix II

    b = regress(y,X)
    regras()
    B1X = zeros(row_a, row_a);
    for i = 1:row_a
        for j = 1:row_a
            if (j ~= i)
                B1X(i,i) = B1X(i,i) - B1X(i,j);
            end
        end
    end
    V1 = zeros(row_a, row_a);
    for i = 1:row_a
        for j = 1:row_a
            if (i ~= j)
                V1(i,j) = -Wa(i,j);
                V1(i,i) = V1(i,i) + Wa(i,j);
            end
        end
    end
    V1a = inv(V1 + ones(row_a)) - 1/(row_a^2)*ones(row_a);
    a1 = V1a*B1X*a;
    segma = 0;
    for i = 1:row_a
        for j = i+1:row_a
            % accumulate the weighted squared distances
            segma = segma + Wa(i,j)*Da(i,j)*Da(i,j);
        end
    end
    Theta = segma + trace(a'*V1*a) - 2*trace(a'*B1X*a);

Appendix III

    User's ID   Book's ID   Prediction remark   User's ID   Book's ID   Prediction remark
    7245481     794171      4.05                7625225     473690      4.17
    7245481     381060      4.27                7625225     929118      4.14
    7245481     776002      4.36                7625225     235338      4.25
    7245481     980705      4.13                7625225     424691      4.31
    7245481     354292      4.17                7625225     916469      4.45
    7245481     738735      4.23                7625225     793936      4.73
    4156658     175031      4.03                5997834     346935      4.20
    4156658     422711      4.26                5997834     144718      4.34
    4156658     585783      4.01                5997834     827305      4.14
    4156658     412990      4.01                5997834     219560      4.20
    4156658     134003      4.14                5997834     242057      4.32
    4156658     443948      3.98                5997834     803508      4.15
    9214078     310411      4.19                2515537     900197      4.16
    9214078     727635      4.11                2515537     680158      4.11
    9214078     724917      4.12                2515537     770309      4.32
    9214078     325721      4.22                2515537     424691      4.31
    9214078     105962      4.19                2515537     573732      4.07
    9214078     235338      3.60                2515537     210973      4.08

Appendix IV

1.
    s = find(bookidnumbertrans(1,:)==user_book_score1(1,2))
    load('user_book_score.mat')
    a = user_book_score(:,1);
    x = a(:); x = sort(x);
    d = diff([x; max(x)+1]);
    count = diff(find([1; d]));
    useridnumber = [x(find(d)) count];
    b = user_book_score(:,2);
    x = b(:); x = sort(x);
    d = diff([x; max(x)+1]);
    count = diff(find([1; d]));
    bookidnumber = [x(find(d)) count];
    user_book_score1 = sortrows(user_book_score, 1);
    bookidnumbertrans = bookidnumber';
    for i = 1:3557
        flag = useridnumber(i,2);
        while flag
            location = find(bookidnumbertrans(1,:)==user_book_score1(flag,2));
            bookid(i,location) = user_book_score1(flag,3);
            flag = flag-1;
        end
    end

2.
    load('C:\Users\chang\Desktop\jiyuyonghuxiangsixingjuzhen.mat')
    load('C:\Users\chang\Desktop\predict.mat')
    load('useridnumber.mat')
    load('user_read_history.mat')
    for i = 2:7
        location = find(useridnumber(:,1)==predict(i,1));
        [rowmax,order] = max(g(location,:)')
        l = useridnumber(order,1);
        location1 = find(user_read_history(:)==l)
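1. Model assumptions

(1) A user's reading interests do not change during the whole rating process;
(2) Users read the books they are interested in whenever possible;
(3) Different tags of a book represent different categories;
(4) Users repeatedly read books they are interested in.

2. Main notation

3. Problem analysis

3.1 Analysis of the first problem

The rating of a book depends objectively on the quality of the book itself, and it is also influenced subjectively by factors such as the user's rating preference. Based on the data given in the attachments, we first define five factors: book quality Q, book popularity P, user rating preference IP, user reading preference BP, and social-circle rating influence SI. Starting from these factors, we then analyze how each of them affects user ratings.

3.2 Analysis of the second problem

Leaving subjective factors aside, the same user should give similar ratings to books with the same book quality Q. First, we use a collaborative filtering recommendation algorithm based on book similarity to find the neighbor set of books similar to the book to be predicted. Then we use the neighbor set to produce the predicted rating for that book.

3.3 Analysis of the third problem

The books recommended to a user should match the user's reading preference as closely as possible. First, we compute the user's attention to each book; second, we compute the user's reading preference; then we compute the recommendation index of each book that matches the preference; finally, we recommend the books with the highest recommendation index to the user.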
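4. Model building and solution

4.1 Model building and solution for the first problem

4.1.1 Influence of the book quality factor Q

We define the average score of a book as its book quality Q:

$$Q_i=\frac{\sum_{j,\;bookID=i} score(j,i,3)}{N_i}$$

where $score(j,i,3)$ denotes the rating that user j gave to book i, and $N_i$ denotes the total number of users who rated book i.

We randomly select 100 users from the attachment user_book_score.txt and process each user as follows:
Step 1: find all books the user has rated and the corresponding scores;
Step 2: compute the book quality Q of these books;
Step 3: compute the correlation coefficient and confidence level between the user's ratings and the book quality.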
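Steps 1 to 3 above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `(user, book, score)` tuple layout, the function names and the toy data are assumptions, and only the Pearson correlation coefficient is computed (the confidence level mentioned in Step 3 is left out).

```python
from collections import defaultdict
from math import sqrt


def book_quality(ratings):
    """Mean score per book: Q_i = (sum of scores given to book i) / N_i."""
    total, count = defaultdict(float), defaultdict(int)
    for _user, book, score in ratings:
        total[book] += score
        count[book] += 1
    return {book: total[book] / count[book] for book in total}


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def user_quality_correlation(ratings, user):
    """Correlation between one user's scores and the quality Q of the rated books."""
    q = book_quality(ratings)
    pairs = [(score, q[book]) for u, book, score in ratings if u == user]
    return pearson([s for s, _ in pairs], [v for _, v in pairs])


# Hypothetical toy data: (user, book, score)
toy = [("u1", "a", 5), ("u2", "a", 3), ("u1", "b", 2),
       ("u2", "b", 2), ("u1", "c", 4), ("u3", "c", 4)]
print(book_quality(toy))
print(user_quality_correlation(toy, "u1"))
```

A correlation close to 1 for a user would support the claim that the user's ratings follow book quality.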
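User 7245481 rated 517 books; the results for the first 10 of them are shown in Table 4-1.

Table 4-1 Ratings of user 7245481 and book quality

The book ratings are significantly correlated with book quality. For the first 10 of the 100 randomly selected users, the correlation between book ratings and book quality is shown in Table 4-2.

Table 4-2 Correlation between book ratings and book quality

4.1.2 Influence of the book popularity factor P

The number of tags a book carries reflects its popularity to some extent, so we define the book popularity P as:

$$P_i=\sum_{j} tag(i,j)$$

where $tag(i,j)$ denotes the j-th tag of book i.

We randomly select 100 users from the attachment user_book_score.txt and process each user as follows:
Step 1: find all books the user has rated and the corresponding scores;
Step 2: compute the book popularity P of these books;
Step 3: compute the correlation coefficient and confidence level between the user's ratings and the book popularity.

User 7245481 rated 517 books; the results for the last 10 of them are shown in Table 4-3.

Table 4-3 Book ratings and number of book tags

The ratings are strongly correlated with the number of book tags. We therefore conclude that the popularity of a book has a certain influence on user ratings.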
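4.1.3 Influence of the user rating preference factor IP

From simple statistics on the data in the attachments user_read_history.txt and user_book_score.txt, we found that:
(1) some users also rated books they had not read;
(2) some users gave the same score to most books.

We therefore consider the user rating preference factor IP, which is defined as follows:

()∑=⎪⎪⎭⎫ ⎝⎛⨯⎪⎭⎫ ⎝⎛-⨯+⨯⎪⎭⎫⎝⎛⨯-+=521111)(i i i S N N S N N N N IP αα

where

$$\alpha=\frac{N_1-N_2}{N_1}\times 100\%,\qquad N=\sum_{i=1}^{5}N_i$$

N: the number of all scored books; α: the adjustment factor; $N_i$: the number of ratings of the score value whose frequency ranks i-th, so that $N_1\ge N_2\ge N_3\ge N_4\ge N_5$.

We randomly select 100 users from the attachment user_book_score.txt and process each user as follows:
Step 1: find all books the user has rated and the corresponding scores;
Step 2: compute the user's rating preference IP;
Step 3: examine how far all of the user's ratings deviate from the rating preference IP.

The user with ID 7245481 rated 517 books; the results are shown in Table 4-4.

Table 4-4 Book ratings and their numbers of occurrences

Statistical analysis shows that all of the user's ratings are close to the rating preference IP. We therefore conclude that user ratings are influenced by the rating preference IP.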
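4.1.4 Influence of the user reading preference factor BP

Statistical analysis of the tables user_book_score.txt and book_tag.txt in the attachments shows that most users have a reading preference BP: they tend to read books that carry a certain tag and are likely to give them high scores. Among the books rated by user 7245481, the most frequent tag is 6391, which appears 266 times, and the scores of these books are all higher than the user's average score over all books. Among the books rated by user 7625225, the most frequent tag is 6391, which appears 140 times, and the scores of these books are likewise all higher than that user's average score over all books. We therefore conclude that the user reading preference BP has a certain influence on user ratings.

4.1.5 Influence of the social-circle rating influence factor SI

The ratings that a user's friends give to a book influence the user's own rating to some extent. However, in the several groups of data we selected at random, most of the users' friends did not take part in book rating, so we do not analyze this factor further here.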
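4.2 Model building and solution for the second problem

We use a collaborative filtering recommendation algorithm based on book similarity to predict, for each user in the attachment predict.txt, the ratings of the books the user has not read.

(1) Define the similarity BS between books used in the algorithm.

Find the users who rated both book i and book j, and suppose there are n of them. The ratings of books i and j in this n-dimensional space are then the vectors $\rho_i$ and $\rho_j$, and the similarity $BS(i,j)$ between books i and j is:

$$BS(i,j)=\cos(\rho_i,\rho_j)=\frac{\rho_i\cdot\rho_j}{\lvert\rho_i\rvert\,\lvert\rho_j\rvert} \quad (1)$$

where $\lvert\rho_i\rvert$ denotes the norm of $\rho_i$.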
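The book similarity BS of formula (1) can be sketched as follows. This is a minimal sketch, assuming each book's ratings are stored as a hypothetical `user -> score` dictionary; the function name is also an assumption. Returning 0 when no user rated both books follows the convention the paper states in Step 2 of the prediction algorithm.

```python
from math import sqrt


def book_similarity(ratings_i, ratings_j):
    """Cosine similarity BS(i, j) over the users who rated both books.

    ratings_i, ratings_j: dicts mapping user -> score for one book each.
    """
    common = ratings_i.keys() & ratings_j.keys()
    if not common:
        return 0.0  # no co-rating users: similarity defined as 0
    dot = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = sqrt(sum(ratings_i[u] ** 2 for u in common))
    norm_j = sqrt(sum(ratings_j[u] ** 2 for u in common))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return dot / (norm_i * norm_j)


# Hypothetical ratings: users "a" and "b" rated both books
print(book_similarity({"a": 4, "b": 2}, {"a": 2, "b": 1, "c": 5}))
```

Because only co-rating users enter the vectors, two books rated proportionally by those users get a similarity of 1 regardless of how other users scored them.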
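(2) Construct the rating matrix S used in the algorithm.

The first six columns of S represent the books to be predicted for user $u_i$ in the attachment predict.txt, denoted $Pb_i$. The remaining n columns represent all books b that user $u_i$ has rated, n books in total, so S has 6+n columns. The first row of S represents user $u_i$, and the following m rows represent the users who rated any one of the books in $Pb_i$, m users in total, so S has 1+m rows.

(3) Predict the rating of user $u_i$ for the first book $Pb_{i1}$.

The prediction algorithm is as follows:
Step 1: take the elements of column 1 and of column j of S, $j=7,8,\ldots,n$, and denote them $s_1$ and $s_j$;
Step 2: take the rows in which neither $s_1$ nor $s_j$ is 0 and denote them $s_1'$ and $s_j'$. If there are no such rows, define the similarity between $Pb_{i1}$ and $b_j$ as 0; otherwise compute the similarity $BS(Pb_{i1},b_j)$ with formula (1);
Step 3: take the 10 books with the highest similarity $BS(Pb_{i1},b_j)$ to form the neighbor set $B=\{b_1,b_2,\ldots,b_{10}\}$ of $Pb_{i1}$, where book $b_1$ has the highest similarity $BS(Pb_{i1},b_1)$ to $Pb_{i1}$, book $b_2$ has the second highest similarity $BS(Pb_{i1},b_2)$, and so on;
Step 4: compute the predicted rating of user $u_i$ for $Pb_{i1}$:

$$S(1,1)=\frac{\sum_{b\in B}\bigl(BS(Pb_{i1},b)\times S(1,b)\bigr)}{\sum_{b\in B}BS(Pb_{i1},b)}$$

(4) Use the algorithm above to predict the ratings of all the books requested for all users in the attachment predict.txt.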
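Steps 3 and 4 above amount to a similarity-weighted average over the most similar rated books. The sketch below assumes the similarities to the target book have already been computed and are passed in as a hypothetical `book -> BS` dictionary; the function name, the `k` parameter (10 in the paper) and the choice to return `None` when all neighbor similarities are zero are illustrative assumptions.

```python
def predict_score(similarities, user_scores, k=10):
    """Predict a rating as the BS-weighted mean of the user's scores
    on the k most similar books the user has rated.

    similarities: dict book -> BS(target book, book)
    user_scores:  dict book -> the user's actual score for that book
    """
    rated = [(sim, user_scores[b]) for b, sim in similarities.items()
             if b in user_scores]
    # Step 3: keep the k books with the highest similarity
    neighbours = sorted(rated, key=lambda pair: pair[0], reverse=True)[:k]
    weight = sum(sim for sim, _ in neighbours)
    if weight == 0:
        return None  # assumption: no usable neighbour, no prediction
    # Step 4: similarity-weighted average of the user's own scores
    return sum(sim * score for sim, score in neighbours) / weight


# Hypothetical example with k = 2 neighbours
sims = {"a": 1.0, "b": 0.5, "c": 0.2}
scores = {"a": 4, "b": 2, "c": 5}
print(predict_score(sims, scores, k=2))
```

With `k=2` only books "a" and "b" contribute, so the low-similarity book "c" does not pull the prediction towards its score of 5.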
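Table 4-5 Predicted ratings of the books requested for user 7245481
Table 4-6 Predicted ratings of the books requested for user 7625225
Table 4-7 Predicted ratings of the books requested for user 4156658
Table 4-8 Predicted ratings of the books requested for user 5997834
Table 4-9 Predicted ratings of the books requested for user 9214078
Table 4-10 Predicted ratings of the books requested for user 2515537

To check the predicted ratings of the algorithm above, we define the mean absolute error MAE. Take n books rated by user u, let the set of actual ratings of user u for these n books be $\{s_1,s_2,\ldots,s_n\}$ and the set of predicted ratings be $\{ps_1,ps_2,\ldots,ps_n\}$. Then

$$MAE=\frac{\sum_{i=1}^{n}\lvert ps_i-s_i\rvert}{n}$$

Table 4-11 Actual and predicted ratings of user 7245481
Table 4-12 Actual and predicted ratings of user 7625225
Table 4-13 Actual and predicted ratings of user 4156658
Table 4-14 Actual and predicted ratings of user 5997834
Table 4-15 Actual and predicted ratings of user 9214078
Table 4-16 Actual and predicted ratings of user 2515537

4.3 Model building and solution for the third problem

We use a book recommendation algorithm based on tag similarity to recommend, to each user in the attachment predict.txt, 3 books the user has not read.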
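(1) Define the tag co-occurrence similarity TCS used in the algorithm.

Let the "user-tag" matrix be $S_{z\times n}$; then

$$S_{z\times n}=R_{z\times m}Q_{m\times n}=(s_1,s_2,\ldots,s_z)^T$$

where the non-zero values of the row vector $s_i$ correspond to the tags of the same book or of different books that the user has read, and $s_{ij}$ denotes the number of times tag $t_j$ appears in the set of books read by user $c_i$.

Let the tag co-occurrence matrix be $V_{n\times n}$; then

$$V_{n\times n}=S^T_{n\times z}S_{z\times n}=(v_1,v_2,\ldots,v_n)^T$$

We use cosine similarity to compute the tag co-occurrence similarity TCS of books, so TCS is defined as:

$$TCS_{ij}=\cos(v_i,v_j)=\frac{\sum_{k=1}^{n}v_{ik}v_{jk}}{\sqrt{\sum_{k=1}^{n}v_{ik}^2}\,\sqrt{\sum_{k=1}^{n}v_{jk}^2}}$$

(2) Define the user similarity CS used in the algorithm.

Let the attention of user $c_x$ to book $p_s$ be $pw_{xs}$; then

$$pw_{xs}=\sigma_{x1}+\sigma_{x2}$$

where

$$\sigma_{x1}=\begin{cases}1, & p\in T_x\\ 0, & p\notin T_x\end{cases}$$

and $T_x$ denotes the books that user $c_x$ has reviewed;

$$\sigma_{x2}=n$$

where n is the number of times book $p_s$ appears in the reading record of user $c_x$.

Let the book attention vector be $b_x$; then

$$b_x=\bigl((p_{c_{x1}},pw_{c_{x1}}),(p_{c_{x2}},pw_{c_{x2}}),\ldots,(p_{c_{xk}},pw_{c_{xk}})\bigr)$$

where $p_{c_{xi}}$ is a book read by user $c_x$ and $pw_{c_{xi}}$ is the attention of user $c_x$ to book $p_{c_{xi}}$.

Let the preference weight of a tag for user $c_x$ be $w^i_{c_x}$; then

$$w^i_{c_x}=\frac{tw^*_{xi}}{\sum_{t^j_{c_x}\in ct_x}tw^*_{xj}}$$

where

$$tw^*_{xi}=\sum_{p_s\in D_{xi}}pw_{xs}$$

$D_{xi}$ is the set of books that user $c_x$ has read and that contain tag $t^i_{c_x}$, and $ct_x$ is the set of tags of the books read by user $c_x$.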
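The attention and tag preference weights defined above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and data layouts (a set of reviewed books, a list of read books with repeats, a `book -> tags` dictionary) are hypothetical, and books that were reviewed but never read are excluded from the tag weights, following the definition of the set D of read books.

```python
from collections import Counter


def attention(reviewed, read_history):
    """pw = sigma_1 + sigma_2 for every book the user reviewed or read.

    sigma_1 is 1 if the book was reviewed, else 0;
    sigma_2 is the number of times the book appears in the reading record.
    """
    read_counts = Counter(read_history)
    books = set(reviewed) | set(read_counts)
    return {b: (1 if b in reviewed else 0) + read_counts[b] for b in books}


def tag_weights(pw, book_tags, read_books):
    """Normalised tag preference weights w_i = tw_i / sum_j tw_j,
    where tw_i sums the attention over read books that carry tag i."""
    tw = Counter()
    for book, weight in pw.items():
        if book not in read_books:
            continue  # only books the user has read belong to D
        for tag in book_tags.get(book, ()):
            tw[tag] += weight
    total = sum(tw.values())
    return {tag: value / total for tag, value in tw.items()} if total else {}


# Hypothetical user: reviewed b1, read b1 twice and b2 once
pw = attention({"b1"}, ["b1", "b1", "b2"])
weights = tag_weights(pw, {"b1": ["t1", "t2"], "b2": ["t2"]}, {"b1", "b2"})
print(pw, weights)
```

The weights sum to 1 by construction, so they can be used directly as the coefficients in the user similarity and matching-degree sums.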
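The similarity $CS_{xy}$ between users $c_x$ and $c_y$ is defined as:

$$CS_{xy}=\sum_{j=1}^{h}\sum_{i=1}^{k}\Bigl(w^i_{c_x}\,w^j_{c_y}\,TCS(t^i_{c_x},t^j_{c_y})\Bigr)$$

(3) Define the matching degree CPS between a user and a book used in the algorithm.

Define the matching degree between user $c_x$ and book $p_s$ as $CPS_{xs}$; then

$$CPS_{xs}=\frac{\sum_{j=1}^{h}\sum_{i=1}^{k}\Bigl(w^i_{c_x}\,TCS(t^i_{c_x},t^j_{c_x})\Bigr)}{h}$$

where h denotes the number of tags of book $p_s$.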
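The tag co-occurrence similarity TCS and the matching degree CPS can be sketched together in plain Python. This is an illustrative sketch, not the authors' code: tags are identified by integer column indices, the function names are hypothetical, and the inner index j of the CPS sum is assumed to run over the h tags of the candidate book.

```python
from math import sqrt


def cooccurrence(S):
    """V = S^T S for a user-by-tag count matrix S given as a list of rows."""
    n = len(S[0])
    return [[sum(row[a] * row[b] for row in S) for b in range(n)]
            for a in range(n)]


def cosine(u, v):
    """Cosine of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def tcs_matrix(S):
    """TCS_ij = cos(v_i, v_j), where v_i is row i of the co-occurrence matrix."""
    V = cooccurrence(S)
    n = len(V)
    return [[cosine(V[i], V[j]) for j in range(n)] for i in range(n)]


def matching_degree(user_weights, book_tags, tcs):
    """CPS = (sum over book tags j and user tags i of w_i * TCS(i, j)) / h.

    user_weights: dict tag index -> preference weight of the user
    book_tags:    list of tag indices of the candidate book (h = its length)
    """
    h = len(book_tags)
    if h == 0:
        return 0.0
    total = sum(w * tcs[i][j] for j in book_tags for i, w in user_weights.items())
    return total / h


# Hypothetical 2-user, 2-tag matrix: each user reads books with one tag only
tcs = tcs_matrix([[1, 0], [0, 2]])
print(matching_degree({0: 0.75, 1: 0.25}, [0], tcs))
```

Ranking unread books by this matching degree and taking the top three would give the per-user recommendations the section describes.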