U.S. Mathematical Contest in Modeling (MCM) Papers
2014 MCM Problem B Outstanding Winner Paper
For office use only: T1 T2 T3 T4. Team Control Number: 24857. Problem Chosen: B. For office use only: F1 F2 F3 F4.

2014 Mathematical Contest in Modeling (MCM) Summary Sheet (Attach a copy of this page to each copy of your solution paper.)

Abstract

The evaluation and selection of the 'best all time college coach' is the problem to be addressed. We capture the essentials of an evaluation system by reducing the dimensions of the attributes through factor analysis, and we divide our modeling process into three phases: data collection, attribute clarification, and factor model evaluation together with model generalization.

Firstly, we collect the data from official databases. Then, two bottom lines are determined, respectively, by the number of participating games and by win-loss percentage; with these bottom lines we anchor a pool of 30 to 40 candidates, which greatly reduces the data volume, and the final top 5 coaches should reasonably come from this pool. Attribute clarification is treated at length in the body of the model; note that we endeavor to design an attribute that effectively evaluates the improvement of a team before and after the coach's arrival.

In phase three, we analyse the problem following the traditional method of the factor model. With three common factors, indicating coaches' guiding competency, strength of the guided team, and competition strength, we obtain a final integrated score with which to evaluate coaches. We also take the time line horizon into account in two respects. On the one hand, the numbers of participating games are adjusted on the basis of time. On the other hand, we put forward a potential sub-model in our 'further attempts' concerning the overlapping tenures of two different coaches. What's more, a 'pseudo-rose diagram' method is tried to show coaches' performance in different areas.

Model generalization is examined on three different sports: Football, Basketball, and Softball. Besides, our model can also be applied to all possible ball games under the frame of the NCAA, with slight modifications according to specific regulations. The stability of our model is also tested by sensitivity analysis.

Who's Who in College Coaching Legends: A Generalized Factor Analysis Approach

Contents

1 Introduction
  1.1 Restatement of the problem
  1.2 NCAA Background and its coaches
  1.3 Previous models
2 Assumptions
3 Analysis of the Problem
4 The first round of sample selection
5 Attributes for evaluating coaches
6 Factor analysis model
  6.1 A brief introduction to factor analysis
  6.2 Steps of factor analysis by SPSS
  6.3 Result of the model
7 Model generalization
8 Sensitivity analysis
9 Strengths and Weaknesses
  9.1 Strengths
  9.2 Weaknesses
10 Further attempts
Appendices
Appendix A: An article for Sports Illustrated

1 Introduction

1.1 Restatement of the problem

The 'best all time college coach' is to be selected by Sports Illustrated, a magazine for sports enthusiasts. This is an open-ended problem, with no limitation on the method of performance appraisal, gender, or sports type. The following research points should be noted:

• whether the time line horizon that we use in our analysis makes a difference;
• the metrics for assessment are to be articulated;
• discuss how the model can be applied in general across both genders and all possible sports;
• we need to present our model's Top 5 coaches in each of 3 different sports.

1.2 NCAA Background and its coaches

The National Collegiate Athletic Association (NCAA) is an association of 1281 institutions, conferences, organizations, and individuals that organizes the athletic programs of many colleges and universities in the United States and Canada.[1] In our
model, only coaches in the NCAA are considered and ranked.

So, why evaluate coaching performance? The identity of a college football program is shaped by its head coach. Given their impact, it is no wonder that high-profile athletic departments are shelling out millions of dollars per season for the services of coaches. Nick Saban's 2013 total pay was $5,395,852, and in the same year Coach K earned $7,233,976 in total.[2][3] Indeed, every athletic director wants to hire the next legendary coach.

1.3 Previous models

Traditionally, evaluation in athletics has been based on the single criterion of wins and losses. Years later, in order to evaluate coaches reasonably, many researchers implemented coaching evaluation models, such as the 7 criteria proposed by Adams [1]: (1) the coach in the profession, (2) knowledge of and practice of medical aspects of coaching, (3) the coach as a person, (4) the coach as an organizer and administrator, (5) knowledge of the sport, (6) public relations, and (7) application of kinesiological and physiological principles.

[1] Wikipedia: /wiki/National_Collegiate_Athletic_Association#NCAA_sponsored_sports
[2] USA Today: /sports/college/salaries/ncaaf/coach/
[3] USA Today: /sports/college/salaries/ncaab/coach/

Such models focused relatively more on subjective and difficult-to-quantify attributes, which makes it quite hard for sports fans to judge coaches. Therefore, we established an objective and quantified model to produce a list for the 'best all time college coach'.

2 Assumptions

• The sample for our model is restricted to NCAA sports. That is to say, the coaches we discuss are those serving in the NCAA alone;
• We do not take into account the inborn talent varying from one player to another; that is, we take the teams' wins or losses to be purely associated with the coach;
• The difference between games in different NCAA Divisions is ignored;
• We take no account of errors or amendments in the NCAA game records.

3 Analysis of the Problem

Our main goal is to build and analyze a mathematical model to choose the 'best all time college coach' for the previous century, i.e. from 1913 to 2013. Objectively, numerous attributes are required to judge and specify whether a coach is 'the best', while many of the indicators are deemed hard to quantify. However, first of all, we consider that a 'best coach' is, and is supposed to be, in line with several basic conditions, which are the prerequisites. Those prerequisites incorporate attributes such as the number of games the coach has ever participated in and the overall win-loss percentage. For instance, if either the number of participating games is below 100 or the win-loss percentage is less than 0.5, we assume this coach cannot be credited as the 'best', regardless of his/her other facets.

Therefore, an attempt was made to screen out the coaches we want, thus narrowing the range in our first stage. At the very beginning, we ignore those whose guiding sessions or win-loss percentage is below a certain level, and then we determine a candidate pool for 'the best coach' of 30-40 in scale, according to merely two indicators: participating games and win-loss percentage. It should be reasonably reliable to draw the top 5 best coaches from this candidate pool, regardless of any other aspects.

One point worth mentioning is that we take the time line horizon as one of the inputs, because the number of participating games has been changing all the time over the previous century. Hence, it would be unfair to treat this problem using absolute values, especially for those coaches who lived in the earlier
ages, when sports were less popular and games were comparatively sparse.

4 The first round of sample selection

College Football is the first item in our research. We obtain data concerning all possible coaches since the sport was initiated, including the coaches' tenures, participating games, and win-loss percentages. As a result, we get a sample of 2053 in scale. The first 10 candidates' information is as below:

Table 1: The first 10 candidates' information; here Pct means win-loss percentage

Coach           From  To    Years  Games  Wins  Losses  Ties  Pct
Eli Abbott      1902  1902  1      8      4     4       0     0.5
Earl Abell      1928  1930  3      28     14    12      2     0.536
Earl Able       1923  1924  2      18     10    6       2     0.611
George Adams    1890  1892  3      36     34    2       0     0.944
Hobbs Adams     1940  1946  3      27     4     21      2     0.185
Steve Addazio   2011  2013  3      37     20    17      0     0.541
Alex Agase      1964  1976  13     135    50    83      2     0.378
Phil Ahwesh     1949  1949  1      9      3     6       0     0.333
Jim Aiken       1946  1950  5      50     28    22      0     0.56
Fred Akers      1975  1990  16     186    108   75      3     0.589
...

Firstly, we employ Excel to rule out those who began their coaching careers earlier than 1913. Next, considering the impact of the time line horizon mentioned in the problem statement, we import our raw data into MATLAB, with an attempt to calculate each coach's average games per year versus time, as delineated in Figure 1 below.

[Figure 1: Diagram of the coaches' average games per year versus time]

It can be drawn from the figure above, clearly, that each coach's average number of games is related to the era in which he worked. With the passing of time and the increasing popularity of sports, coaches' yearly participating games ascend from 8 to 12 or so; that is, the maximum exceeds the minimum by around 50%. To further refine the evaluation method, we make the following adjustment to coaches' participating games, and we define the result as each coach's adjusted participating games:

$$G_i' = \frac{\max_j(G_j^m)}{G_i^m} \times G_i$$

where
• $G_i$ is coach i's participating games;
• $G_i^m$ is coach i's average participating games per year over his/her career; and
• $\max_j(G_j^m)$ is the maximum, over the previous century, of the coaches' average participating games per year.

Subsequently, we output the adjusted data and return them to the Excel table.

Obviously, directly using all these data would make our research a mess, and economy of description would be hard to achieve. Logically, we propose the following method to narrow the sample range.

In general, the most essential attributes for evaluating a coach are his/her guiding experience (which can be shown by participating games) and guiding results (shown by win-loss percentage). Fortunately, these two factors are ones that can be quantified, thus providing feasibility for our modeling. Based on our common sense and selected information from sports magazines and associated programs, we find the winning coaches almost all bear the same characteristics: a high level in both participating games and win-loss percentage. Thus we may arbitrarily enact two bottom lines for these two essential attributes, so as to nail down a pool of 30 to 40 candidates.
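This adjustment and the screening that follows are easy to script. Below is a minimal Python sketch (the paper itself used Excel and MATLAB for this step); the two toy records and Joe Paterno's raw game count are illustrative only, and the thresholds are the football bottom lines discussed next:

```python
# Sketch of the two-step screening: adjust each coach's game count for era,
# then keep only coaches above both bottom lines. Toy records; the paper
# does this in Excel/MATLAB over the full sample of 2053 coaches.

coaches = [
    # (name, total games G_i, career years, win-loss percentage)
    ("Fred Akers", 186, 16, 0.589),
    ("Joe Paterno", 548, 46, 0.749),   # raw game count is illustrative
]

rate = {n: g / y for n, g, y, _ in coaches}   # average games per year, G^m_i
max_rate = max(rate.values())                 # max(G^m) across the pool

def adjusted_games(name, games):
    # G'_i = max(G^m) / G^m_i * G_i
    return max_rate / rate[name] * games

MIN_GAMES, MIN_PCT = 200, 0.7                 # football bottom lines
pool = [n for n, g, y, p in coaches
        if adjusted_games(n, g) >= MIN_GAMES and p >= MIN_PCT]
print(pool)   # -> ['Joe Paterno']
```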
Those who do not meet our prerequisites should not be credited as the best in any case.

Logically, we expect the model to yield insight into how the bottom lines are determined. The matter is that sports types vary, and thus the corresponding features differ. However, the bottom lines should be reasonably consistent with sports fans' and commentators' perceptual intuition. Take football as an example: a win-loss percentage exceeding 0.75 should be viewed as rather high, and the college football coaches of all time who meet this standard are specifically listed in Wikipedia.[4] Consequently, we are able to fix upon a rational pool of candidates according to those enacted bottom lines and, meanwhile, may adjust the conditions according to the total number of coaches.

Still using football to articulate further: to determine a pool of candidates for the best coaches, we first plot the figure below to present the distribution of all the coaches. From Figure 2, we find that once the number of games exceeds 200 or the win-loss percentage exceeds 0.7, the distribution of the coaches drops significantly. We can thus view this group of coaches as comparatively outstanding, meeting the prerequisites to be the best coaches.

[4] Wikipedia: /wiki/List_of_college_football_coaches_with_a_.750_winning_percentage

[Figure 2: Histograms of the football coaches' number of games and win-loss percentage]

Hence, we nail down the bottom lines for both the number of games and the win-loss percentage: 200 for the former and 0.7 for the latter. These two bottom lines are used as the measure for our first round of selection. After round one, merely 35 coaches qualify to remain in the pool of candidates. Since this is a first round of sifting, rather than a direct and ultimate determination, we believe that the subjectivity involved, to some extent, in the choice of bottom lines will not cloud the final results for the best coaches.

5 Attributes for evaluating coaches

Anchored upon the 35 candidates selected, we elaborate our coach evaluation system based on 8 attributes. In the indicator-selection process, we endeavored to examine the tradeoffs between data availability and the difficulty of quantification. Coaches' pay, for example, though it could serve as a measure for coaching evaluation, has only limited corresponding data. The situation is similar for attributes such as the number of players a coach has cultivated for higher-level tournaments. Ultimately, we determined the 8 attributes shown in the table below:

Table 2: Symbols and attributes

Symbol  Attribute
Yrs     years
G'      adjusted overall games
Pct     win-loss percentage
P'      adjusted percentage ratio
SRS     Simple Rating System
SOS     Strength of Schedule
Blp'    adjusted Bowls participated
Blw'    adjusted Bowls won

Further explanation:

• Yrs: guiding years of a coach over his/her whole career.
• G': the adjusted participating games $G_i' = \frac{\max_j(G_j^m)}{G_i^m} \times G_i$; see the last section.
• Pct: $Pct = \frac{\text{wins} + \text{ties}/2}{\text{wins} + \text{losses} + \text{ties}}$
• SRS: a rating that takes into account average point differential and strength of schedule. The rating is denominated in points above/below average, where zero is the average. Note that the bigger this value, the stronger the team's performance.
• SOS: a rating of strength of schedule, also denominated in points above/below average, where zero is the average. Note that the bigger this value, the more powerful the team's rivals, namely the fiercer the competition. Sports-Reference provides official statistics for SRS and SOS.[5]
• P' is a new attribute designed in our model. It is the coach's whole-career win-loss percentage divided by the average win-loss
percentage (weighted by the number of games at the different colleges the coach has served) of all the college teams the coach has ever been with. We bear in mind that the function of a great coach is not merely manifested in the team's pure win-loss percentage; it is even more crucial to consider the improvement of the team's win-loss record with the coach's participation, or say, the gap between the 'after' and 'before' periods of this team (the dividing line between 'after' and 'before' being the day the coach took office). This is because a coach who builds a comparatively weak team into a much more competitive one would definitely receive more respect and honor from sports fans. To measure and specify this attribute, we collect the key official data from Sports-Reference, which includes, for each candidate, the independent win-loss percentage for each college during the time he/she was with the team, and the weighted all-time average win-loss percentage of all the college teams the coach has been with, regardless of whether the coach was on the team or not.

To articulate this attribute, here is a simple worked example: Ike Armstrong (placed first when sorted in alphabetical order), whose data can be obtained from the Sports-Reference website.[6] We can easily get the records we need, namely 141 wins, 55 losses, 15 ties, and a 0.704 win-loss percentage. Further, the specific wins, losses, and ties for the team he was with (Utah) can also be obtained; respectively, they are 602, 419, and 30, giving 0.587. Consequently, the P' value of Ike Armstrong should be 0.704/0.587 = 1.199, according to our definition.

• Bowl games are a special event in the field of football. In North America, a bowl game is one of a number of post-season college football games primarily played by teams from the Division I Football Bowl Subdivision. The number of times a coach has participated in bowl games is an important indicator for evaluating a coach. Note, however, that the total number of bowl games held each year changes from year to year, which should be taken into consideration in the model. Other sports events, such as the NCAA basketball tournament, are also expanding. For this reason, it is irrational to use the absolute number of appearances in bowl games (or the NCAA basketball tournament, etc.) and the number of wins as the evaluation measure. Whereas the development histories and regulations of different sports vary from one to another (and the differences can be fairly large), we are unable to find a generalized method to eliminate this discrepancy; instead, an independent method for each sport provides a way out. Due to the time limitation of our research and the need for model generalization, here we only take the square roots of Blp and Blw to attenuate the differences, i.e.

$$Blp' = \sqrt{Blp}, \qquad Blw' = \sqrt{Blw}$$

For different sports, we use the same attributes, except that Blp' and Blw' may be changed according to the specific sport. For instance, we can use CREG (number of regular-season conference championships won) and FF (number of NCAA Final Four appearances) to replace Blp and Blw in basketball.

[5] Sports-Reference: /cfb/coaches/
[6] Sports-Reference: /cfb/coaches/ike-armstrong-1.html

With all the attributes determined, we organized the data, shown in Table 3.

In addition, before further analysis the data need to be preprocessed, owing to the diverse dimensions of these indicators. There are many methods for data preprocessing; here we adopt the standard score (z-score) method. In statistics, the standard score is the (signed) number of standard deviations an observation or datum is above the mean. Thus, a positive standard score
represents a datum above the mean, while a negative standard score represents a datum below the mean. It is a dimensionless quantity obtained by subtracting the population mean from an individual raw score and then dividing the difference by the population standard deviation.[7] The standard score of a raw score x is:

$$z = \frac{x - \mu}{\sigma}$$

It is easy to complete this process with the statistical software SPSS.

[7] Wikipedia: /wiki/Standard_score

Table 3: Summarized data for the best college football coaches' candidates

Coach            From  To    Yrs  G'   Pct    Blp'  Blw'  P'     SRS    SOS
Ike Armstrong    1925  1949  25   281  0.704  1     1     1.199  4.15   -4.18
Dana Bible       1915  1946  31   386  0.715  2     1.73  1.078  9.88   1.48
Bernie Bierman   1925  1950  24   278  0.711  1     0     1.295  14.36  6.29
Red Blaik        1934  1958  25   294  0.759  0     0     1.282  13.57  2.34
Bobby Bowden     1970  2009  40   523  0.74   5.74  4.69  1.103  14.25  4.62
Frank Broyles    1957  1976  20   257  0.7    3.16  2     1.188  13.29  5.59
Bear Bryant      1945  1982  38   508  0.78   5.39  3.87  1.18   16.77  6.12
Fritz Crisler    1930  1947  18   208  0.768  1     1     1.083  17.15  6.67
Bob Devaney      1957  1972  16   208  0.806  3.16  2.65  1.255  13.13  2.28
Dan Devine       1955  1980  22   280  0.742  3.16  2.65  1.226  13.61  4.69
Gilmour Dobie    1916  1938  22   237  0.709  0     0     1.2    7.66   -2.09
Bobby Dodd       1945  1966  22   296  0.713  3.61  3     1.184  14.25  6.6
Vince Dooley     1964  1988  25   325  0.715  4.47  2.83  1.097  14.53  7.12
Gus Dorais       1922  1942  19   232  0.719  1     0     1.229  6      -3.21
Pat Dye          1974  1992  19   240  0.707  3.16  2.65  1.192  9.68   1.51
LaVell Edwards   1972  2000  29   392  0.716  4.69  2.65  1.243  7.66   -0.66
Phillip Fulmer   1992  2008  17   215  0.743  3.87  2.83  1.083  13.42  4.95
Woody Hayes      1951  1978  28   329  0.761  3.32  2.24  1.031  17.41  8.09
Frank Kush       1958  1979  22   271  0.764  2.65  2.45  1.23   8.21   -2.07
John McKay       1960  1975  16   207  0.749  3     2.45  1.058  17.29  8.59
Bob Neyland      1926  1952  21   286  0.829  2.65  1.41  1.208  15.53  3.17
Tom Osborne      1973  1997  25   334  0.836  5     3.46  1.181  19.7   5.49
Ara Parseghian   1956  1974  19   225  0.71   2.24  1.73  1.153  17.22  8.86
Joe Paterno      1966  2011  46   595  0.749  6.08  4.9   1.089  14.01  5.01
Darrell Royal    1954  1976  23   297  0.749  4     2.83  1.089  16.45  7.09
Nick Saban       1990  2013  18   239  0.748  3.74  2.83  1.123  13.41  3.86
Bo Schembechler  1963  1989  27   346  0.775  4.12  2.24  1.104  14.86  3.37
Francis Schmidt  1922  1942  21   267  0.708  0     0     1.192  8.49   0.16
Steve Spurrier   1987  2013  24   316  0.733  4.36  3     1.293  13.53  4.64
Bob Stoops       1999  2013  15   207  0.804  3.74  2.65  1.117  16.66  4.74
Jock Sutherland  1919  1938  20   255  0.812  2     1     1.376  13.88  1.68
Barry Switzer    1973  1988  16   209  0.837  3.61  2.83  1.163  20.08  6.63
John Vaught      1947  1973  25   321  0.745  4.24  3.16  1.338  14.7   5.26
Wallace Wade     1923  1950  24   307  0.765  2.24  1.41  1.349  13.53  3.15
Bud Wilkinson    1947  1963  17   222  0.826  2.83  2.45  1.147  17.54  4.94

6 Factor analysis model

6.1 A brief introduction to factor analysis

Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in four observed variables mainly reflect the variations in two unobserved variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors, plus 'error' terms.
The information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset. Computationally, this technique is equivalent to a low-rank approximation of the matrix of observed variables.[8]

Why carry out factor analysis? If we can summarise a multitude of measurements with a smaller number of factors without losing too much information, we have achieved some economy of description, which is one of the goals of scientific investigation. It is also possible that factor analysis will allow us to test theories involving variables which are hard to measure directly. Finally, at a more prosaic level, factor analysis can help us establish that sets of questionnaire items (observed variables) are in fact all measuring the same underlying factor (perhaps with varying reliability) and so can be combined to form a more reliable measure of that factor.

[8] Wikipedia: /wiki/Factor_analysis

6.2 Steps of factor analysis by SPSS

First we import the prepared dataset of 8 attributes into SPSS, and the results below are obtained after the software processing. [2-3]

[Figure 3: Table of total variance explained]
[Figure 4: Scree plot]

The first table and the scree plot show the eigenvalues and the amount of variance explained by each successive factor. The remaining 5 factors have small eigenvalues. Once the top 3 factors are extracted, the variance they explain adds up to 84.3%, indicating great explanatory power for the original information. To reflect the quantitative analysis of the model, we obtain the factor loading matrix; the loadings correspond to the weights $(\alpha_{i1}, \alpha_{i2}, \ldots)$ in the set of equations

$$x_i = \alpha_{i1} f_1 + \alpha_{i2} f_2 + \ldots + \alpha_{im} f_m + \varepsilon_i$$

and the relative strength between the common factors and the original attributes can thus be manifested.

[Figure 5: Rotated Component Matrix]

Then, from the Rotated Component Matrix above, we find that the common factor F1 mainly expresses four attributes, namely G', Yrs, P', and SRS; logically, we define the common factor generated from those four attributes as the guiding competency of the coach. Similarly, the common factor F2 mainly expresses two attributes, Pct and Blp', which can be defined as the integrated strength of the guided team. The common factor F3 mainly expresses two attributes, SOS and Blw', which can be summarized into a 'latent attribute' named competition strength.

In order to obtain the quantitative relation, we get the following Component Score Coefficient Matrix processed by SPSS. The functions relating the common factors to the original attributes are listed as below:

$$F_1 = 0.300x_1 + 0.312x_2 + 0.023x_3 + 0.256x_4 + 0.251x_5 + 0.060x_6 - 0.035x_7 - 0.053x_8$$
$$F_2 = -0.107x_1 - 0.054x_2 + 0.572x_3 + 0.103x_4 + 0.081x_5 + 0.280x_6 + 0.372x_7 + 0.142x_8$$
$$F_3 = -0.076x_1 - 0.098x_2 - 0.349x_3 + 0.004x_4 + 0.027x_5 - 0.656x_6 + 0.160x_7 + 0.400x_8$$

Finally, we calculate the integrated factor scores, each being the average of the factor scores weighted by the corresponding proportion of variance contribution of each common factor in the total variance contribution:

$$F = 0.477F_1 + 0.284F_2 + 0.239F_3$$

[Figure 6: Component Score Coefficient Matrix]

6.3 Result of the model

We rank all the coaches in the candidate pool by the integrated score F. See Table 4:

Table 4: Integrated scores for best college football coach (15 rows shown due to the limitation of space)

Rank  Coach            F1      F2      F3      Integrated factor
1     Joe Paterno      3.178   -0.315  0.421   1.362
2     Bobby Bowden     2.51    -0.281  0.502   1.111
3     Bear Bryant      2.142   0.718   -0.142  1.099
4     Tom Osborne      0.623   1.969   -0.239  0.820
5     Woody Hayes      0.14    0.009   1.613   0.484
6     Barry Switzer    -0.705  2.036   0.247   0.403
7     Darrell Royal    0.046   0.161   1.268   0.401
8     Vince Dooley     0.361   -0.442  1.373   0.374
9     Bo Schembechler  0.481   0.143   0.304   0.329
10    John Vaught      0.606   0.748   -0.87   0.265
11    Steve Spurrier   0.518   0.326   -0.538  0.182
12    Bob Stoops       -0.718  1.085   0.523   0.171
13    Bud Wilkinson    -0.718  1.413   0.105   0.165
14    Bobby Dodd       0.08    -0.208  0.739   0.162
15    John McKay       -0.962  0.228   1.87    0.151
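For readers without SPSS, the scoring step is easy to reproduce once the coefficient matrix is in hand. Below is a minimal Python sketch; it assumes the rows of X follow Table 3 and the columns follow the order x1..x8 = (Yrs, G', Pct, P', SRS, SOS, Blp', Blw') suggested by the loadings described above (the excerpt does not pin that order down exactly). The coefficients and weights are the values reported above, not re-estimated:

```python
import numpy as np

# Sketch of the scoring pipeline: z-score the 8 attributes, apply the
# Component Score Coefficient Matrix reported above, then combine the
# three factor scores with the variance-proportion weights.

def integrated_scores(X):
    # X: (n_coaches, 8) array of raw attributes, column order assumed
    Z = (X - X.mean(axis=0)) / X.std(axis=0)            # z-scores
    C = np.array([
        [ 0.300,  0.312,  0.023,  0.256,  0.251,  0.060, -0.035, -0.053],  # F1
        [-0.107, -0.054,  0.572,  0.103,  0.081,  0.280,  0.372,  0.142],  # F2
        [-0.076, -0.098, -0.349,  0.004,  0.027, -0.656,  0.160,  0.400],  # F3
    ])
    F = Z @ C.T                                         # factor scores F1..F3
    w = np.array([0.477, 0.284, 0.239])                 # variance proportions
    return F @ w                                        # integrated score F
```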
Based on this model, we can make a scientific rank list for US college football coaches; the Top 5 coaches of our model are Joe Paterno, Bobby Bowden, Bear Bryant, Tom Osborne, and Woody Hayes. In order to confirm our result, we took an official list of best college football coaches from Bleacher Report.[9]

[9] Bleacher Report: /articles/890705-college-football-the-top-50-co

Table 5: The result of our model in football; the last column is the official college football ranking from Bleacher Report

Rank  Our model     Integrated score  Bleacher Report
1     Joe Paterno   1.362             Bear Bryant
2     Bobby Bowden  1.111             Knute Rockne
3     Bear Bryant   1.099             Tom Osborne
4     Tom Osborne   0.820             Joe Paterno
5     Woody Hayes   0.484             Bobby Bowden

By comparing those two ranking lists, we find that four of our Top 5 coaches appear in the official Top 5 list, which shows that our model is reasonable and effective.

7 Model generalization

Our coach evaluation system model generalizes satisfyingly: it can be accommodated to any possible NCAA sport with slight modifications for the specific regulations. Besides, this method has nothing to do with the coach's gender; both male and female coaches can be rationally evaluated by this system. We would therefore also like to generalize this model to softball.

Further, we take into account the time line horizon, making the corresponding adjustment to the indicator of number of participating games so as to stipulate that the evaluation measure for 1913 and 2013 would be the same.

To further generalize the model, let us first test it on basketball, for which the available data is as adequate as for football. The specific steps are as follows:

1. Obtain data from Sports-Reference[10] and rule out the coaches who began their coaching careers earlier than 1913.
2. Calculate each coach's adjusted number of participating games, and adjust the attribute FF (number of NCAA Final Four appearances).
3. Determine the bottom lines for the first round of selection to get a pool of candidates according to the coaches' participating games and win-loss percentage; the ideal volume of the pool should be 30 to 40. The histograms are as below: we determine 800 as the bottom line for the adjusted participating games and 0.7 for the win-loss percentage. Coincidentally, we get a candidate pool of 35 in scale.
4. Next, we collect the corresponding data of the candidate coaches (P', SRS, SOS, etc.), as presented in Table 6.
5. Processing the 8 attributes and the data above with the z-score method and factor analysis, we get three common factors and final integrated scores. Among our top 5 candidates, Mike Krzyzewski, Adolph Rupp, Dean Smith, and Bob Knight also appear in the official statistics from Bleacher Report.[11] We can say the effectiveness of the model is pretty good. See Table 5.

[10] Sports-Reference: /cbb/coaches/

[Figure 7: Histograms of the basketball coaches' number of games and win-loss percentage]

We also apply a similar approach to college softball. Perhaps because the popularity of softball is not that high, the available data is not adequate for our first model. How can our model function in such a situation? First and foremost, specialized magazines like Sports Illustrated and their commentators would have more internal and confidential databases, which are
not exposed publicly. As long as the data is adequate, the original model is completely feasible; in a situation of data deficit, we can reasonably simplify the model.

The softball data derive from the NCAA's official website; here we only extract data from the All-Division part.[12]

Softball is a comparatively young sport, hence we may arbitrarily neglect the restriction to '100 years'. Subsequently, because of the data deficit, it is hard to adjust the number of participating games. We may as well determine 10 as the bottom line for participating games and 0.74 for win-loss percentage, producing a candidate pool of 33 in scale.

Owing to the inadequacy of the attribute data, it is not convenient to use factor analysis as the assessment model as before. Therefore, here we employ only the two most important attributes to evaluate a coach: participating games and win-loss percentage over the coach's whole career. Specifically, we first adopt the z-score to normalize all the data, because of the different dimensions involved, and then the integrated score of the coach can be reached by the weighted

[11] Bleacher Report: /articles/1341064-10-greatest-coaches-in-ncaa-b
[12] NCAA softball coaching records: /Docs/stats/SB_Records/2012/coaches.pdf
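The excerpt truncates mid-sentence here, but the intended construction (a weighted sum of the two z-scored attributes) is clear enough to sketch in Python; the weights below are hypothetical placeholders, since the text cuts off before giving them:

```python
import numpy as np

# Simplified evaluation when only two attributes are available:
# participating games (experience) and win-loss percentage (results).
# The weights w are hypothetical; the excerpt truncates before giving them.

def softball_scores(games, pct, w=(0.5, 0.5)):
    games, pct = np.asarray(games, float), np.asarray(pct, float)
    z_games = (games - games.mean()) / games.std()   # z-score each attribute
    z_pct = (pct - pct.mean()) / pct.std()
    return w[0] * z_games + w[1] * z_pct             # weighted integrated score
```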
2010 MCM Outstanding Paper
3 Center of Minimum Distance Model
2007 MCM Problem B Outstanding Winner Paper
American Airlines' Next Top Model

Sara J. Beck, Spencer D. K'Burg, Alex B. Twist
University of Puget Sound, Tacoma, WA
Advisor: Michael Z. Spivey

Summary

We design a simulation that replicates the behavior of passengers boarding airplanes of different sizes according to procedures currently implemented, as well as a plan not currently in use. Variables in our model are deterministic or stochastic and include walking time, stowage time, and seating time. Boarding delays are measured as the sum of these variables. We physically model and observe common interactions to accurately reflect boarding time.

We run 500 simulations for various combinations of airplane sizes and boarding plans. We analyze the sensitivity of each boarding algorithm, as well as the passenger movement algorithm, for a wide range of plane sizes and configurations. We use the simulation results to compare the effectiveness of the boarding plans. We find that for all plane sizes, the novel boarding plan Roller Coaster is the most efficient. The Roller Coaster algorithm essentially modifies the outside-in boarding method. The passengers line up before they board the plane and then board the plane by letter group. This allows most interferences to be avoided. It loads a small plane 67% faster than the next best option, a midsize plane 37% faster than the next best option, and a large plane 35% faster than the next best option.

Introduction

The objectives in our study are:
* To board (and deboard) various sizes of plane as quickly as possible.
* To find a boarding plan that is both efficient (fast) and simple for the passengers.

With this in mind:
* We investigate the time for a passenger to stow their luggage and clear the aisle.
* We investigate the time for a passenger to clear the aisle when another passenger is seated between them and their seat.
* We review the current boarding techniques used by airlines.
* We study the floor layouts of planes of three different sizes to compare any difference in the efficiency of a given boarding plan as plane size increases and layouts vary.
* We construct a simulator that mimics typical passenger behavior during the boarding processes under different techniques.
* We realize that there is not very much time savings possible in deboarding while maintaining customer satisfaction.
* We calculate the time elapsed for a given plane to load under a given boarding plan by tracking and penalizing the different types of interferences that occur during the simulations.
* As an alternative to the boarding techniques currently employed, we suggest an alternative plan and assess it using our simulator.
* We make recommendations regarding the algorithms that proved most efficient for small, midsize, and large planes.

Interferences and Delays for Boarding

There are two basic causes for interference: someone blocking a passenger in an aisle, and someone blocking a passenger in a row. Aisle interference is caused when the passenger ahead of you has stopped moving and is preventing you from continuing down the aisle towards the row with your seat. Row interference is caused when you have reached the correct row but already-seated passengers between the aisle and your seat are preventing you from immediately taking your seat.
A major cause of aisle interference is a passenger experiencing row interference.

We conducted experiments, using lined-up rows of chairs to simulate rows in an airplane and a team member with outstretched arms to act as an overhead compartment, to estimate parameters for the delays caused by these actions. The times that we found through our experimentation are given in Table 1. We use these times in our simulation to model the speed at which a plane can be boarded. We model separately the delays caused by aisle interference and row interference. Both are simulated using a mixed distribution defined as follows:

Y = max{2, X},

where X is a normally distributed random variable whose mean and standard deviation are fixed in our experiments. We opt for the distribution being partially normal with a minimum of 2 after reasoning that other alternative and common distributions (such as the exponential) are too prone to throw a small value, which is unrealistic. We find that the average row interference time is approximately 4 s with a standard deviation of 2 s, while the average aisle interference time is approximately 7 s with a standard deviation of 4 s. These values are slightly adjusted based on our team's cumulative experience on airplanes.

Typical Plane Configurations

Essential to our model are industry standards regarding common layouts of passenger aircraft of varied sizes. We use an Airbus 320 plane to model a small plane (85-210 passengers) and the Boeing 747 for a midsize plane (210-330 passengers). Because of the lack of large planes available on the market, we modify the Boeing 747 by eliminating the first-class section and extending the coach section to fill the entire plane. This puts the Boeing 747 close to its maximum capacity. This modified Boeing 747 has 55 rows, all with the same dimensions as the coach section in the standard Boeing 747. Airbus is in the process of designing planes that can hold up to 800 passengers. The Airbus A380 is a double-decker with occupancy of 555 people in three different classes; but we exclude double-decker models from our simulation because it is the larger, bottom deck that is the limiting factor, not the smaller upper deck.

Current Boarding Techniques

We examine the following industry boarding procedures:
* random-order
* outside-in
* back-to-front (for several group sizes)

Additionally, we explore this innovative technique not currently used by airlines:
* "Roller Coaster" boarding: Passengers are put in order before they board the plane, in a style much like those used by theme parks in filling roller coasters. Passengers are ordered from the back of the plane to the front, and they board in seat-letter groups. This is a modified outside-in technique, the difference being that passengers in the same group are ordered before boarding. Figure 1 shows how this ordering could take place. By doing this, most interferences are avoided.

Current Deboarding Techniques

Planes are currently deboarded in an aisle-to-window and front-to-back order. This deboarding method comes out of the passengers' desire to be off the plane as quickly as possible. Any modification of this technique could lead to customer dissatisfaction, since passengers may be forced to wait while others seated behind them on the plane are deboarding.

Boarding Simulation

We search for the optimal boarding technique by designing a simulation that models the boarding process and running the simulation under different plane configurations and sizes along with different boarding algorithms.
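The delay model is simple to sample from. Here is a minimal Python sketch using the parameters reported above (row interference: mean 4 s, sd 2 s; aisle interference: mean 7 s, sd 4 s) and the 2 s floor:

```python
import random

# Interference delays: Y = max(2, X), X ~ Normal(mu, sigma), as described
# above; the 2 s floor rules out unrealistically small delays.

def interference_delay(kind):
    mu, sigma = {"row": (4, 2), "aisle": (7, 4)}[kind]
    return max(2.0, random.gauss(mu, sigma))

delays = [interference_delay("aisle") for _ in range(500)]
print(sum(delays) / len(delays))  # close to 7 s on average
```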
We then compare which algorithms yielded the most efficient boarding process.

Assumptions

The environment within a plane during the boarding process is far too unpredictable to be modeled accurately. To make our model more tractable, we make the following simplifying assumptions:
* There is no first-class or special-needs seating. Because the standard industry practice is to board these passengers first, and because they generally make up a small portion of the overall plane capacity, any changes in the overall boarding technique will not apply to these passengers.
* All passengers board when their boarding group is called. No passengers arrive late or try to board the plane early.
* Passengers do not pass each other in the aisles; the aisles are too narrow.
* There are no gaps between boarding groups. Airline staff call a new boarding group before the previous boarding group has finished boarding the plane.
* Passengers do not travel in groups. Often, airlines allow passengers boarding with groups, especially with younger children, to board in a manner convenient for them rather than in accordance with the boarding plan. These events are too unpredictable to model precisely.
* The plane is full. A full plane would typically cause the most passenger interferences, allowing us to view the worst-case scenario in our model.
* Every row contains the same number of seats. In reality, the number of seats in a row varies due to engineering reasons or to accommodate luxury-class passengers.

Implementation

We formulate the boarding process as follows:
* The layout of a plane is represented by a matrix, with the rows representing rows of seats, and each column describing whether a row is next to the window, aisle, etc. The specific dimensions vary with each plane type. Integer parameters track which columns are aisles.
* The line of passengers waiting to board is represented by an ordered array of integers that shrinks appropriately as they board the plane.
* The boarding technique is modeled in a matrix identical in size to the matrix representing the layout of the plane. This matrix is full of positive integers, one for each passenger, assigned to a specific submatrix representing each passenger's boarding group location. Within each of these submatrices, seating is assigned randomly to represent the random order in which passengers line up when their boarding groups are called.
* Interferences are counted in every location where they occur within the matrix representing the plane layout. These interferences are then cast into our probability distribution defined above, which gives a measurement of time delay.
* Passengers wait for interferences around them before moving closer to their assigned seats; if an interference is found, the passenger will wait until the time delay has finished counting down to 0.
* The simulation ends when all delays caused by interferences have counted down to 0 and all passengers have taken their assigned seats.

Strengths and Weaknesses of the Model

Strengths
* It is robust for all plane configurations and sizes. The boarding algorithms that we design can be implemented on a wide variety of planes with minimal effort. Furthermore, the model yields reasonable results as we adjust the parameters of the plane; for example, larger planes require more time to board, while planes with more aisles can load more quickly than similarly sized planes with fewer aisles.
* It allows for reasonable amounts of variance in passenger behavior.
While with more thorough experimentation a superior stochastic distribution describing the delays associated with interferences could be found, our simulation can be readily altered to incorporate such advances.
* It is simple. We made an effort to minimize the complexity of our simulation, allowing us to run more simulations during a greater time period and minimizing the risk of exceptions and errors occurring.
* It is fairly realistic. Watching the model execute, we can observe passengers boarding the plane, bumping into each other, taking time to load their baggage, and waiting around as passengers in front of them move out of the way. Its ability to incorporate such complex behavior and reduce it are key to completing our objective.

Weaknesses
* It does not account for passengers other than economy-class passengers.
* It cannot simulate structural differences in the boarding gates which could possibly speed up the boarding process. For instance, some airlines in Europe board planes from two different entrances at once.
* It cannot account for people being late to the boarding gate.
* It does not account for passenger preferences or satisfaction.

Results and Data Analysis

For each plane layout and boarding algorithm, we ran 500 boarding simulations, calculating mean time and standard deviation. The latter is important because the reliability of plane loading is important for scheduling flights. We simulated the back-to-front method for several possible group sizes. Because of the difference in the number of rows in the planes, not all group size possibilities could be implemented on all planes.

Small Plane

For the small plane, Figure 2 shows that all boarding techniques except for the Roller Coaster slowed the boarding process compared to the random boarding process. As more and more structure is added to the boarding process, while passenger seat assignments continue to be random within each of the boarding groups, passenger interference backs up more and more. When passengers board randomly, gaps are created between passengers as some move to the back while others seat themselves immediately upon entering the plane, preventing any more from stepping off of the gate and onto the plane. These gaps prevent passengers who board early and must travel to the back of the plane from causing interference with many passengers behind them. However, when we implement the Roller Coaster algorithm, seat interference is eliminated, with the only passenger causing aisle interference being the very last one to board from each group.

Interestingly, the small plane's boarding times for all algorithms are greater than their respective boarding times for the midsize plane! This is because the number of seats per row per aisle is greater in the small plane than in the midsize plane.

Midsize Plane

The results from the simulations of the midsize plane are shown in Figure 3 and are comparable to those experienced by the small plane. Again, the Roller Coaster method proved the most effective.

Large Plane

Figure 4 shows that the boarding time for a large aircraft, unlike the other plane configurations, drops off when moving from the random boarding algorithm to the outside-in boarding algorithm.
Observing the movements of the passengers in the simulation, it is clear that because of the greater number of passengers in this plane, gaps are more likely to form between passengers in the aisles, allowing passengers to move unimpeded by those already on board. However, both instances of back-to-front boarding created too much structure to allow these gaps to form again. Again, because of the elimination of row interference it provides, Roller Coaster proved to be the most effective boarding method.

Overall

The Roller Coaster boarding algorithm is the fastest algorithm for any plane configuration. Compared to the next fastest boarding procedure, it is 35% faster for a large plane, 37% faster for a midsize plane, and 67% faster for a small plane. The Roller Coaster boarding procedure also has the added benefit of very low standard deviation, thus allowing airlines a more reliable boarding time. The boarding time for the back-to-front algorithms increases with the number of boarding groups and is always slower than a random boarding procedure. The idea behind a back-to-front boarding algorithm is that interference at the front of the plane is avoided until passengers in the back sections are already on the plane. A flaw in this procedure is that having everyone line up in the plane can cause a bottleneck that actually increases the loading time. The outside-in ("Wilma," or window, middle, aisle) algorithm performs better than the random boarding procedure only for the large plane. The benefit of the random procedure is that it evenly distributes interferences throughout the plane, so that they are less likely to impact very many passengers.

Validation and Sensitivity Analysis

We developed a test plane configuration with the sole purpose of implementing our boarding algorithms on planes of all sizes, varying from 24 to 600 passengers with either one or two aisles. We also examined capacities as low as 70%; the trends that we see at full capacity are reflected at these lower capacities. The back-to-front and outside-in algorithms do start to perform better; but this increase in performance is relatively small, and the Roller Coaster algorithm still substantially outperforms them. Under all circumstances, the algorithms we test are robust. That is, they assign passengers to seats in accordance with the intention of the boarding plans used by airlines and move passengers in a realistic manner.

Recommendations

We recommend that the Roller Coaster boarding plan be implemented for planes of all sizes and configurations for boarding non-luxury-class and non-special-needs passengers. As planes increase in size, its margin of success in comparison to the next best method decreases; but we are confident that the Roller Coaster method will prove robust. We recommend boarding groups that are traveling together before boarding the rest of the plane, as such groups would cause interferences that slow the boarding. Ideally, such groups would be ordered before boarding.

Future Work

It is inevitable that some passengers will arrive late and not board the plane at their scheduled time. Additionally, we believe that the amount of carry-on baggage permitted would have a larger effect on the boarding time than the specific boarding plan implemented; modeling this would prove insightful. We also recommend modifying the simulation to reflect groups of people traveling (and boarding) together; this is especially important to the Roller Coaster boarding procedure, and is why we recommend boarding groups before boarding the rest of the plane.
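The Roller Coaster ordering described above amounts to a two-key sort: outside-in by seat letter, then back-to-front within each letter group. Below is a minimal Python sketch, assuming a generic 6-abreast cabin with A/F as window seats; the letter-to-priority map is illustrative, not taken from the paper:

```python
# Roller Coaster boarding order: group passengers by seat letter
# (outside-in: windows first, aisles last) and, within each letter group,
# line them up from the back row to the front row.
# The priority map below is illustrative for a 6-abreast cabin.

LETTER_PRIORITY = {"A": 0, "F": 0, "B": 1, "E": 1, "C": 2, "D": 2}

def roller_coaster_order(seats):
    # seats: iterable of (row, letter) pairs; returns the boarding sequence
    return sorted(seats, key=lambda s: (LETTER_PRIORITY[s[1]], -s[0]))

plane = [(r, l) for r in range(1, 26) for l in "ABCDEF"]
print(roller_coaster_order(plane)[:5])  # back-row window seats board first
```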
English Writing for the MCM
Writing requirements: 1. Be concise. A paper title should generally be within 10 words, and never more than 15.
2. Use compound words, such as self-design, cross-sectional, dust-free, water-proof, input-orientation, piece-wise linear. Use abbreviations, such as e.g., i.e., vs. (as opposed to), ibid. (same source), etc., cit. (in the work cited), et al. (and others), viz. (namely), DEA (data envelopment analysis), OLS (ordinary least squares).
3. Avoid redundant stock phrases such as "Investigation on …", "Observation on …", "The Method of …", "Some Thoughts on …", and "A Research on …".
4. Avoid question-style titles. 5. Avoid mixing nouns and gerunds: a title such as "The Treatment of Heating and Eutechticum of Steel" is better written as "Heating and Eutechticuming of Steel". 6. Avoid non-standard abbreviations. A paper title should be concise, but abbreviations are generally not used in titles, and non-standard abbreviations must never be.
Keywords
Basic functions: conveying the content at a glance; facilitating retrieval. Language features: mostly nouns; limited in number (4-6); of clear origin. Writing requirements: keywords are generally listed below the author and affiliation and above the abstract, though some papers list them below the abstract. Except for the first letter, keywords are generally not capitalized. Keywords are separated by commas, semicolons, or wide spaces, and the last keyword generally takes no comma, semicolon, or period.
MCM Award-Winning Papers
2010 Mathematical Contest in Modeling (MCM) Summary Sheet
(Attach a copy of this page to each copy of your solution paper.)
Keywords: simple harmonic motion system, differential equations model, collision system
Common Sentence Patterns for MCM Papers
• The expression of ... can be expanded as: ...
• A is exponentially smaller than B, so it can be neglected.
• Equation (1) is reduced to: ...
• Substituting the values into equation (3), we get ...
• According to our first assumption on Page 1, ...
• Thus we arrive at the conclusion: ...
• From the model of ..., we find that theoretically, it is almost true that ...
• That is the theoretical basis for ... in many application areas.
• To quantitatively analyze the different requirements of the two applications, we introduce two measures: ...
• We give the criterion that ...
• According to the criterion of ...
• So its expression can be derived from equation (3) with a small change.
MathorCup Mathematical Modeling Contest Outstanding Paper
Keywords: refined circle, TCP, reverse thought, multi-objective linear programming, traveling package
The Design of a Family Summer Travelling Plan
3. The average speed of our cab is 50 km per hour, while the average cost is 0.3 yuan per kilometer.
4. When we go from place A to place B, we visit no other places during the trip.
5. During a time period, family members start from Chengdu and end in Chengdu.
6. In a day, 12 hours are for traveling and 12 hours for rest.
7. There are no accidents during our travel.
8. After considering the surrounding tourist attractions, we choose the following national 5A and 4A attractions as our potential destinations: Chengdu, Jiuzhaigou, Huanglong, Leshan, Emeishan, Siguniangshan, Danba, Dujiangyan, Qingchengshan, Hailuogou, Kangding.
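Assumptions 3 and 6 pin down the travel arithmetic for any leg of the trip. A minimal Python helper, with an illustrative 300 km leg:

```python
# Under assumption 3 (50 km/h, 0.3 yuan/km) and assumption 6
# (12 travel hours per day), compute a leg's time, cost, and whether
# it fits within one day of travel.

SPEED_KMH, COST_PER_KM, HOURS_PER_DAY = 50, 0.3, 12

def leg(distance_km):
    hours = distance_km / SPEED_KMH
    return hours, distance_km * COST_PER_KM, hours <= HOURS_PER_DAY

print(leg(300))  # (6.0, 90.0, True): 6 h, 90 yuan, fits in one day
```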
Writing MCM Contest Papers Correctly
3.1) Assumptions and their justification
A reasonable mathematical model should rest on reasonable assumptions, so before describing the model the team should list every assumption used in its design and explain each of them clearly. Do not leave assumptions unstated, lest readers guess at them and misunderstand. In addition, the motivation behind the modeling approach should be discussed where appropriate.
3. The importance of writing
Writing should start as early as possible. Experience shows that many teams underestimate the time writing takes and fail to produce a well-organized paper in time. A team may therefore consider starting to write on the second day of the contest, and agreeing on a time to stop the modeling work so that everyone can concentrate on the paper.
Part Two: Paper Structure
1. Dividing the paper into sections
A paper should be divided by content into sections and subsections with appropriate headings, so that the judges can grasp its essentials without reading the details. In line with the judging criteria, the MCM contest committee suggests that teams structure their papers into the following sections:
Outstanding Winner (0.5%), Finalist (0.5%), Meritorious Winner (10-15%), Honorable Mention (25-30%), Successful Participant (60%), Unsuccessful Participant.
2. Paper judging
The judging process:
Papers are judged blind. Every paper is identified solely by a unique assigned number, called the control number; the authors' names and their university's name must not appear in the paper. Judging proceeds in two stages:
1. Dividing the paper into sections
The following are that paper's sections and headings:

Summary
1 Restatement of the Problem
2 Assumptions
3 Justification of Our Approach
4 The Model
  4.1 Dissatisfaction of a passenger needing a connection
  4.2 Dissatisfaction of a passenger not needing a connection
  4.3 Total dissatisfaction on an aircraft
5 Testing the Model
6 Results
7 Strengths and Weaknesses
References
Collected 2002 MCM Outstanding Papers
(Domestic) #2230 $140
(Outside U.S.) #2231 $160
To order, send a check or money order to COMAP, or call toll-free
1-800-77-COMAP (1-800-772-6627). The UMAP Journal is published quarterly by the Consortium for Mathematics and Its Applications (COMAP), Inc., Suite 210, 57 Bedford Street, Lexington, MA, 02420, in cooperation with the American Mathematical Association of Two-Year Colleges (AMATYC), the Mathematical Association of America (MAA), the National Council of Teachers of Mathematics (NCTM), the American Statistical Association (ASA), the Society for Industrial and Applied Mathematics (SIAM), and The Institute for Operations Research and the Management Sciences (INFORMS). The Journal acquaints readers with a wide variety of professional applications of the mathematical sciences and provides a forum for the discussion of new directions in mathematical education (ISSN 0197-3622). Second-class postage paid at Boston, MA and at additional mailing offices. Send address changes to: The UMAP Journal COMAP, Inc. 57 Bedford Street, Suite 210, Lexington, MA 02420 © Copyright 2002 by COMAP, Inc. All rights reserved.
2016 MCM Problem A Award-Winning Paper: A Hot Bath
The first part has five sections: the air's heat radiation, the bathtub wall's heat radiation, a person entering, hot water being added, and bubbles being present. We discuss some factors that affect the water temperature, such as the shape and volume of the bathtub and of the person, and especially the motions made by the person in the bathtub, since the temperature in the bathtub is strongly tied to the person. Finally, we obtain the water temperature variation and distribution model.
In this article, we establish two models. One is the water temperature variation and distribution model; the other is a model for finding the best strategy. We put forward some acceptable hypotheses to simplify the models. What's more, we make clear the meaning of the word "noticeably".
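The excerpt gives no equations, but a lumped Newtonian-cooling balance with an inflow term is a common way to realize the "hot water being added" mechanism it lists. The Python sketch below is only an illustration of that standard setup, not the authors' actual model, and every parameter value is made up:

```python
# Lumped-parameter sketch: dT/dt = -k*(T - T_air) + q*(T_in - T)/V,
# where k is a heat-loss coefficient (1/s), q the hot-water inflow rate
# (m^3/s), and V the tub volume (m^3). All values are illustrative.

def simulate(T0=40.0, T_air=25.0, T_in=45.0, k=0.002, q=0.0, V=0.3,
             dt=1.0, steps=3600):
    T = T0
    for _ in range(steps):  # forward-Euler integration over one hour
        T += dt * (-k * (T - T_air) + q * (T_in - T) / V)
    return T

print(simulate(q=0.0))      # bath cools toward room temperature
print(simulate(q=0.0001))   # a steady trickle of hot water slows the drop
```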
2016 MCM Problem C Outstanding Winner Paper (original), Control Number 42939, Tsinghua University, China
For office use only: T1 T2 T3 T4. Team Control Number: 42939. Problem Chosen: C. For office use only: F1 F2 F3 F4.

2016 Mathematical Contest in Modeling (MCM) Summary Sheet (Attach a copy of this page to each copy of your solution paper.)

Summary

In order to determine the optimal donation strategy, this paper proposes a data-motivated model based on an original definition of return on investment (ROI) appropriate for charitable organizations.

First, after addressing missing data, we develop a composite index, called the performance index, to quantify students' educational performance. The performance index is a linear composition of several commonly used performance indicators, like graduation rate and graduates' earnings, and their weights are determined by principal component analysis.

Next, to deal with problems caused by high-dimensional data, we employ a linear model and a selection method called post-LASSO to select the variables that statistically significantly affect the performance index and to determine their effects (coefficients). We call them performance contributing variables. In this case, 5 variables are selected. Among them, tuition & fees in 2010 and the Carnegie High-Research-Activity classification are insusceptible to donation amount. Thus we only consider the percentage of students who receive a Pell Grant, the share of students who are part-time, and the student-to-faculty ratio.

Then, a generalized additive model is adopted to estimate the relation between these 3 variables and donation amount. We fit the relation across all institutions and get a fitted function from donation amount to the values of the performance contributing variables. We then divide the impact of donation amount into 2 parts: a homogenous and a heterogenous one. The homogenous influence is modeled as the change in fitted values of the performance contributing variables over the increase in donation amount, which can be predicted from the fitted curve. The heterogenous one is modeled as a tuning parameter which adjusts the homogenous influence based on deviation from the fitted curve. Their product is the increase in true values of performance over the increase in donation amount.

Finally, we calculate ROI, defined as the increase in the performance index over the increase in donation amount. This ROI is institution-specific and depends on the increase in donation amount. By adopting a two-step ROI maximization algorithm, we determine the optimal investment strategy.

Also, we propose an extended model to handle problems caused by the time duration and geographical distribution of donations.

A Letter to the CFO of the Goodgrant Foundation

Dear Chiang,

Our team has proposed a performance index quantifying the students' educational performance at each institution, and has defined the return on investment (ROI) appropriately for a charitable organization like the Goodgrant Foundation. A mathematical model is built to help predict the return on investment after identifying the mechanism through which a donation generates its impact on performance. The optimal investment strategy is determined by maximizing the estimated return on investment.

More specifically, the composite performance index is developed after taking all the possible performance indicators into consideration, like graduation rate and graduates' earnings.
The performance index is constructed to represent the performance of the school as well as the positive effect that a college brings to students and the community. From this point of view, our definition manages to capture the social benefits of donation.

We then adopt a variable selection method to find the performance contributing variables, which are variables that strongly affect the performance index. Among all the performance contributing variables we select, three variables which can be directly affected by your generous donation are kept to predict ROI: the percentage of students who receive a Pell Grant, the share of students who are part-time, and the student-to-faculty ratio.

We fitted a relation between these three variables and the donation amount to predict the change in the value of each performance contributing variable over your donation amount. And we calculate ROI, defined as the increase in the performance index over your donation amount, by multiplying the change in the value of each performance contributing variable over your donation amount by that variable's effect on the performance index, and then summing the products over all performance contributing variables. The optimal investment strategy is decided by maximizing the return on investment according to a selection algorithm.

In conclusion, our model successfully produced an investment strategy, including a list of target institutions and an investment amount for each institution (the list for year 1 is attached at the end of the letter). The time duration of the investment can also be determined based on our model. Since the model as well as the evaluation approach is fully data-motivated, with no arbitrary criteria included, it is rather adaptable for solving future philanthropic educational investment problems.

We strongly believe that our model can effectively enhance the efficiency of philanthropic educational investment and provides an appropriate as well as feasible way to best improve the educational performance of students.

UNITID  Name                                                               ROI     Donation
197027  United States Merchant Marine Academy                              21.85%  2500000
102711  AVTEC-Alaska's Institute of Technology                             21.26%  7500000
187745  Institute of American Indian and Alaska Native Culture             20.99%  2000000
262129  New College of Florida                                             20.69%  6500000
216296  Thaddeus Stevens College of Technology                             20.66%  3000000
229832  Western Texas College                                              20.26%  10000000
196158  SUNY at Fredonia                                                   20.24%  5500000
234155  Virginia State University                                          20.04%  10000000
196200  SUNY College at Potsdam                                            19.75%  5000000
178615  Truman State University                                            19.60%  3000000
199120  University of North Carolina at Chapel Hill                        19.51%  3000000
101648  Marion Military Institute                                          19.48%  2500000
187912  New Mexico Military Institute                                      19.31%  500000
227386  Panola College                                                     19.28%  10000000
434584  Ilisagvik College                                                  19.19%  4500000
199184  University of North Carolina School of the Arts                    19.15%  500000
413802  East San Gabriel Valley Regional Occupational Program              19.09%  6000000
174251  University of Minnesota-Morris                                     19.09%  8000000
159391  Louisiana State University and Agricultural & Mechanical College   19.07%  8500000
403487  Wabash Valley College                                              19.05%  1500000

Yours sincerely,
Team #42939

An Optimal Strategy of Donation for Educational Purpose
Control Number: #42939
February, 2016

Contents

1 Introduction
  1.1 Statement of the Problem
  1.2 Baseline Model
  1.3 Detailed Definitions & Assumptions
    1.3.1 Detailed Definitions
    1.3.2 Assumptions
  1.4 The Advantages of Our Model
2 Addressing the Missing Values
3 Determining the Performance Index
  3.1 Performance Indicators
  3.2 Performance Index via Principal-Component Factors
1 Introduction
1.1 Statement of the Problem
There is no doubt about the significance of postsecondary education to the development of society, especially given the rising need for skilled employees capable of complex work. Nevertheless, the U.S. ranks only 11th worldwide in higher education attainment, which makes financial support from large charitable organizations necessary.
Since it is essential for charitable organizations to maximize the effectiveness of donations, an objective and systematic assessment model is needed to develop appropriate investment strategies. To achieve this goal, several large foundations such as the Gates Foundation and the Lumina Foundation have developed different evaluation approaches, mainly focusing on specific indexes like attendance and graduation rate. In other empirical literature, a Forbes approach (Shifrin and Chen, 2015) proposes a new indicator called the Grateful Graduates Index, using the median amount of private donations per student over a 10-year period to measure the return on investment. Performance funding indicators (Burke, 2002; Cave, 1997; Serban and Burke, 1998; Banta et al., 1996), which include but are not limited to external indicators like graduates' employment rate and internal indicators like teaching quality, are among the most prevalent methods for evaluating the effectiveness of educational donations.
However, these methods have also raised widely acknowledged concerns (Burke, 1998). Most of them require a subjective choice of indexes and are arbitrary rather than data-based, and they perform badly in a data environment with miscellaneous cross-section data but scarce time-series data. Besides, they lack quantified analysis for precisely predicting or measuring the social benefits and the positive effects that an investment can generate, which is one of the targets of the Goodgrant Foundation.
In accordance with the Goodgrant Foundation's request, this paper provides a prudent definition of return on investment (ROI) for charitable organizations, and develops an original data-motivated model, feasible even when faced with tangled cross-section data and absent time-series data, to determine the optimal funding strategy. The strategy contains the selection of institutions and the distribution of investment across institutions, time and regions.
1.2 Baseline Model
Our definition of ROI is similar to its usual meaning: the increase in students' educational performance over the amount the Goodgrant Foundation donates (holding other donations fixed, this is also the increase in total donation amount).
First we cope with data missingness. Then, to quantify students' educational performance, we develop an index called the performance index, which is a linear composition of commonly used performance indicators. Our major task is to build a model that predicts the change of this index given a distribution of the Goodgrant Foundation's $100m donation. However, donation does not directly affect the performance index, and we would encounter endogeneity problems, or neglect the effects of other variables, if we solely focused on the relation between the performance index and the donation amount.
Instead, we select the variables that are pivotal in predicting the performance index from many potential candidates, and determine their coefficients/effects on the performance index. We call these variables performance contributing variables.
Due to the absence of time-series data, it is difficult to figure out how the performance contributing variables are affected by the donation amount for each institution individually. Instead, we fit the relation between the performance contributing variables and the donation amount across all institutions, obtaining a fitted function from donation amount to the values of the performance contributing variables.
Then we divide the impact of donation amount into two parts: a homogenous one and a heterogenous one. The homogenous influence is modeled as the change in the fitted values of the performance contributing variables over the increase in donation amount (we call these quotients the fitted ROIs of the performance contributing variables). The heterogenous one is modeled as a tuning parameter, which adjusts the homogenous influence based on the deviation from the fitted function. Their product is the institution-specific increase in the true values of the performance contributing variables over the increase in donation amount (we call these values the ROIs of the performance contributing variables).
The next step is to calculate the ROI of the performance index by adding up the products of the ROIs of the performance contributing variables and their coefficients on the performance index. This ROI is institution-specific and depends on the increase in donation amount. By adopting a two-step ROI maximization algorithm, we determine the optimal investment strategy.
Also, we propose an extended model to handle problems caused by the time duration and geographical distribution of donations.
Note: we only use data from the provided excel table and those mentioned in the pdf file.
Table 1: Data Source
Variable  Dataset
Performance index  Excel table
Performance contributing variables  Excel table and pdf file
Donation amount  Pdf file
The flow chart of the whole model is presented below in Fig. 1.
Figure 1: Flow Chart Demonstration of the Model
1.3 Detailed Definitions & Assumptions
1.3.1 Detailed Definitions:
1.3.2 Assumptions:
A1. Stability. We assume the data of any institution are stable in the absence of outside impact. Specifically, key quantities like the donation amount and the performance index remain unchanged if the college does not receive new donations.
A2. The Goodgrant Foundation's donation (the increase in donation amount) is discrete rather than continuous. This is reasonable because each donation is usually an integer multiple of a minimum amount, like $1m. After referring to the data of other foundations such as the Lumina Foundation, we require the donation amount to be one value in the set {500000, 1000000, 1500000, ..., 10000000}.
A3. The performance index is a linear composition of all given performance indicators.
A4. Performance contributing variables linearly affect the performance index.
A5. An increase in donation amount affects the performance index through the performance contributing variables.
A6. The impact of an increase in donation amount on the performance contributing variables contains two parts: a homogenous one and a heterogenous one. The homogenous influence is represented by a smooth function from donation amount to the performance contributing variables; the heterogenous one is represented by the deviation from that function.
1.4 The Advantages of Our Model
Our model exhibits many advantages in application:
• The evaluation model is fully data-based, with few subjective or arbitrary decision rules.
• Our model identifies the underlying mechanism instead of merely focusing on the relation between donation amount and the performance index.
• Our model takes both homogeneity and heterogeneity into consideration.
• Our model makes full use of the cross-section data and does not need time-series data to produce reasonable outcomes.
2 Addressing the Missing Values
The provided datasets suffer from severe data missingness, which could undermine the reliability and interpretability of any results. To cope with this problem, we adopt different methods for data with different missing rates.
For variables with a missing rate over 50%, any currently prevailing method would fall victim to under- or over-randomization, so we omit such data for simplicity's sake.
For variables with a missing rate between 10% and 50%, we use imputation techniques (Little and Rubin, 2014), where a missing value is imputed from a randomly selected similar record, together with model-based analysis, where missing values are substituted using distributional information.
For variables with a missing rate under 10%, we address missingness by simply replacing each missing value with the mean of the existing values. A minimal sketch of this three-tier policy follows.
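To make the policy concrete, here is a minimal Python sketch. The 50% and 10% thresholds come from the text; treating "a randomly selected similar record" as a random draw from the observed values of the same column is our simplification, and the toy data frame is hypothetical.

```python
# Three-tier missing-data policy: drop very sparse columns, hot-deck
# impute moderately sparse ones, mean-impute mildly sparse ones.
import numpy as np
import pandas as pd

def impute(df: pd.DataFrame, rng=np.random.default_rng(0)) -> pd.DataFrame:
    out = df.copy()
    for col in df.columns:
        miss = df[col].isna().mean()
        if miss > 0.50:                      # too sparse: drop the column
            out = out.drop(columns=col)
        elif miss > 0.10:                    # moderate: hot-deck imputation
            donors = df[col].dropna().to_numpy()
            mask = out[col].isna()
            out.loc[mask, col] = rng.choice(donors, size=mask.sum())
        elif miss > 0:                       # mild: mean imputation
            out[col] = out[col].fillna(df[col].mean())
    return out

# toy example: "a" is 25% missing (hot-deck), "b" is 75% missing (dropped)
demo = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0], "b": [np.nan] * 3 + [2.0]})
print(impute(demo))
```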
3 Determining the Performance Index
In this section we derive a composite index, called the performance index, to evaluate the educational performance of students at every institution.
3.1 Performance Indicators
First, we need to determine which variables among the various institutional performance data are direct indicators of the Goodgrant Foundation's major concern: enhancing students' educational performance. In practice, other charitable foundations such as the Gates Foundation place their focus on core indexes like attendance and graduation rate. Accordingly, we select performance indicators on the basis of their correlation with these core indexes. With this method, the miscellaneous performance data from the excel table boil down to 4 crucial variables. C150_4_POOLED_SUPP and C200_L4_POOLED_SUPP, the completion rates for different types of institutions, are directly correlated with graduation rate, so we combine them into one variable. md_earn_wne_p10 and gt_25k_p6, two measures of graduates' earnings, have been shown in empirical studies (Ehrenberg, 2004) to depend strongly on educational performance, and RPY_3YR_RT_SUPP, the repayment rate, is considered valid in the same sense. Let these be Y_1, Y_2, Y_3 and Y_4. For easy calculation and interpretation of the performance index, we rescale all 4 variables onto the same scale (from 0 to 100).
3.2 Performance Index via Principal-Component Factors
As the model assumes the performance index is a linear composition of all performance indicators, all we need to do is determine the weights of these variables. Here we apply the method of the Customer Satisfaction Index model (Rogg et al., 2001), where principal-component factors (pcf) are employed to determine the weights of all aspects.
The pcf procedure uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal-component factors, each of which carries part of the total variance. If the cumulative proportion of the variance exceeds 80%, it is viable to use the corresponding pcfs (usually the first two) to determine the weights of the original variables.
In this case we get 4 pcfs (named PCF_1, PCF_2, PCF_3 and PCF_4). First, the procedure provides the linear coefficients of Y_m in the expressions of PCF_1 and PCF_2:

PCF_1 = a_{11} Y_1 + a_{12} Y_2 + a_{13} Y_3 + a_{14} Y_4
PCF_2 = a_{21} Y_1 + a_{22} Y_2 + a_{23} Y_3 + a_{24} Y_4

(a_{km} is calculated as the corresponding factor loading divided by the square root of factor k's eigenvalue.) Then we calculate the rough weights c_m for Y_m. Let the variance proportions represented by PCF_1 and PCF_2 be N_1 and N_2. We get

c_m = (a_{1m} N_1 + a_{2m} N_2) / (N_1 + N_2).

This formulation is justified because the variance proportions can be viewed as the significance of the pcfs: if we let the performance index be (PCF_1 N_1 + PCF_2 N_2)/(N_1 + N_2), then c_m is indeed the rough weight of Y_m in terms of variance. Next, we obtain the final weights by normalizing the rough weights to sum to 1:

c_m := c_m / (c_1 + c_2 + c_3 + c_4).

Finally, we get the performance index, which is the weighted sum of the 4 performance indicators:

Performance index = Σ_m c_m Y_m.

Table 2 presents the institutions with the largest values of the performance index. This ranking is highly consistent with widely acknowledged rankings, like the QS ranking, which indicates the validity of the performance index.
Table 2: The Top 10 Institutions in Terms of Performance Index
Institution  Performance index
Los Angeles County College of Nursing and Allied Health  79.60372162
Massachusetts Institute of Technology  79.06066895
University of Pennsylvania  79.05044556
Babson College  78.99269867
Georgetown University  78.90468597
Stanford University  78.70586395
Duke University  78.27719116
University of Notre Dame  78.15843964
Weill Cornell Medical College  78.14334106
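A minimal numpy sketch of this weighting scheme, under the assumption that the pcfs are the principal components of the correlation matrix of the standardized indicators and that pcfs are kept until their cumulative variance share reaches 80%; the synthetic indicator matrix stands in for the real data.

```python
# Principal-component-factor weights for the performance index.
import numpy as np

def performance_weights(Y: np.ndarray) -> np.ndarray:
    Z = (Y - Y.mean(0)) / Y.std(0)                  # standardize indicators
    eigval, eigvec = np.linalg.eigh(np.corrcoef(Z.T))
    order = np.argsort(eigval)[::-1]                # sort pcfs by variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    share = eigval / eigval.sum()                   # variance proportions N_k
    k = np.searchsorted(np.cumsum(share), 0.80) + 1 # keep pcfs up to 80%
    A = eigvec[:, :k].T                             # a_km = loading / sqrt(eig)
    c = (share[:k] @ A) / share[:k].sum()           # rough weights c_m
    return c / c.sum()                              # normalize to sum to 1

Y = np.random.default_rng(1).uniform(0, 100, (500, 4))  # 4 indicators, 0-100
w = performance_weights(Y)
index = Y @ w                                       # performance index per school
print(w.round(3), index[:3].round(2))
```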
4 Identifying Performance Contributing Variables via post-LASSO
The next step of our model is to identify, among the variables mentioned in the excel table and the pdf file (108 in total, some of which are dummy variables converted from categorical variables), the factors that may influence students' educational performance. To achieve this purpose, we use a model called LASSO.
A linear model is adopted to describe the relationship between the endogenous variable, the performance index, and all variables that are potentially influential to it. We assign an appropriate coefficient to each variable so as to minimize the squared error between the model prediction and the actual value when fitting the data:

min_β (1/J) Σ_{j=1}^{J} (y_j − x_j^T β)²,  where J = 2881 and x_j = (1, x_{1j}, x_{2j}, ..., x_{pj})^T.

However, as the number of variables included in the model increases, the cost function naturally decreases, so the problem of overfitting arises, which makes the resulting model poor at predicting the students' future performance. Also, since there are hundreds of potential candidate variables, we need a method to identify the variables that truly matter and have a strong effect on the performance index. Here we take advantage of a method named post-LASSO (Tibshirani, 1996). LASSO, the least absolute shrinkage and selection operator, is a method used for variable selection and shrinkage in medium- or high-dimensional environments, and post-LASSO applies ordinary least squares (OLS) to the model selected by the first-step LASSO procedure.
In the LASSO procedure, instead of using a cost function that focuses only on the squared error between prediction and actual value, a penalty term is included in the objective function. We wish to minimize

min_β (1/J) Σ_{j=1}^{J} (y_j − x_j^T β)² + λ‖β‖_1,

where λ‖β‖_1 is the penalty term. The penalty term takes the number of variables into consideration by penalizing the absolute values of the coefficients, forcing the coefficients of less important variables to shrink to zero. The penalty coefficient λ determines the degree of penalty for including variables in the model. After minimizing the cost function plus the penalty term, we can figure out which variables are essential enough to include in the model.
We use the LARS algorithm to implement the LASSO procedure, and cross-validated MSE minimization (Usai et al., 2009) to determine the optimal penalty coefficient (represented by the shrinkage factor in the LARS algorithm). OLS is then employed to complete the post-LASSO method.
Figure 2: LASSO path: coefficients as a function of the shrinkage factor s
Figure 3: Cross-validated MSE
Fig. 2 displays the results of the LASSO procedure and Fig. 3 displays the cross-validated MSE for different shrinkage factors. As specified above, the cross-validated MSE reaches its minimum with a shrinkage factor between 0.4 and 0.8. We choose 0.6 and find in Fig. 2 that six variables have nonzero coefficients via the LASSO procedure, thus being selected as the performance contributing variables. Table 3 shows these variables and the corresponding post-LASSO results.

Table 3: Post-LASSO results
Dependent variable: performance_index
PCTPELL  -26.453*** (0.872)
PPTUG_EF  -14.819*** (0.781)
StudentToFaculty_ratio  -0.231*** (0.025)
Tuition & Fees 2010  0.0003*** (0.00002)
Carnegie_HighResearchActivity  5.667*** (0.775)
Constant  61.326*** (0.783)
Observations  2,880
R²  0.610
Adjusted R²  0.609
Note: PCTPELL is the percentage of students who receive a Pell Grant; PPTUG_EF is the share of students who are part-time; Carnegie_HighResearchActivity is the Carnegie basic classification: High Research Activity.

The results presented in Table 3 are consistent with common sense. For instance, the positive coefficient of the High Research Activity Carnegie classification implies that active research activity helps students' educational performance, and the negative coefficient of the student-to-faculty ratio suggests that a decrease in faculty quantity undermines students' educational performance. Along with the large R-squared value and the small p-value of each coefficient, the post-LASSO procedure proves to select a valid set of performance contributing variables and to describe well their contribution to the performance index.
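A sketch of the two-step procedure with synthetic data. We let scikit-learn's coordinate-descent LassoCV stand in for the LARS implementation named in the text, choosing the penalty by cross-validated MSE as described; the data and coefficients below are illustrative, not the paper's estimates.

```python
# Post-LASSO: LASSO for variable selection, then OLS on the support.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(2881, 108))            # 108 candidate variables (paper's count)
beta_true = np.zeros(108)
beta_true[:5] = [-26.5, -14.8, -0.2, 3e-4, 5.7]   # 5 truly relevant variables
y = X @ beta_true + rng.normal(size=2881)   # synthetic performance index

lasso = LassoCV(cv=5, random_state=0).fit(X, y)   # step 1: selection by CV-MSE
support = np.flatnonzero(lasso.coef_)             # performance contributing vars
post = LinearRegression().fit(X[:, support], y)   # step 2: OLS refit (post-LASSO)
print(support, post.coef_.round(3))
```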
5 Determining Investment Strategy based on ROI
We have identified 5 performance contributing variables via post-LASSO. Among them, tuition & fees in 2010 and the Carnegie High-Research-Activity classification are quite insusceptible to donation amount, so we only consider the effects of an increase in donation amount on the percentage of students who receive a Pell Grant, the share of students who are part-time, and the student-to-faculty ratio. We denote these by F_1, F_2 and F_3, and their post-LASSO coefficients by β_1, β_2 and β_3.
In this section, we first introduce the procedure used to fit the relation between the performance contributing variables and the donation amount. Then we provide the model employed to calculate the fitted ROIs of the performance contributing variables (the homogenous influence of an increase in donation amount) and the tuning parameter (the heterogenous influence of an increase in donation amount). Next, we introduce how to calculate ROI. Lastly, we show how the maximization determines the investment strategy, including the selection of institutions and the distribution of investments.
5.1 Fitted Curve between Performance Contributing Variables and Donation Amount
Since we have already approximated the linear relation between the performance index and the 3 performance contributing variables, we want to know how an increase in donation changes them. In this paper, we use a Generalized Additive Model (GAM) to fit the relations smoothly. A Generalized Additive Model is a generalized linear model in which the dependent variable depends linearly on unknown smooth functions of the independent variables. The fitted curve of the percentage of students who receive a Pell Grant is depicted below in Fig. 4 (see the other two fitted curves in the Appendix).
Figure 4: GAM Approximation
A Pell Grant is money the U.S. federal government provides directly to students who need it to pay for college. Intuitively, if the amount of donation an institution receives from other sources, such as private donation, increases, the institution is likely to use these donations to alleviate students' financial stress, resulting in a lower percentage of students who receive a Pell Grant. Thus it is reasonable that the fitted curve slopes downward over most of its range. Also, in common sense, an increase in donation amount should increase the performance index; this downward-sloping curve is consistent with the negative post-LASSO coefficient of the percentage of students who receive a Pell Grant (as two negatives make a positive).
5.2 ROI (Return on Investment)
5.2.1 Model of Fitted ROIs of Performance Contributing Variables fROI_i
Figure 5: Demonstration of fROI_1
Again, we use the fitted curve of the percentage of students who receive a Pell Grant as an example. The blue fitted curve models the homogeneous relation between the percentage of students who receive a Pell Grant and the donation amount.
Recall that the fitted ROI of the percentage of students who receive a Pell Grant (fROI_1) is the change in fitted values (Δf) over the increase in donation amount (ΔX):

fROI_1 = Δf / ΔX.

According to assumption A2, the amount of each Goodgrant Foundation donation falls into a pre-specified set, namely {500000, 1000000, 1500000, ..., 10000000}, so we get a set of possible fitted ROIs of the percentage of students who receive a Pell Grant. Clearly, fROI_1 depends on both the donation amount (X) and the increase in donation amount (ΔX). The calculation of the fitted ROIs of the other performance contributing variables is similar.
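The computation of fROI_1 can be sketched as follows, with a smoothing spline standing in for the GAM smoother and synthetic institution data; the donation grid is the one given in assumption A2.

```python
# Fit a smooth curve of the Pell-grant share against donation amount,
# then take fROI = (f(X + dX) - f(X)) / dX on the A2 donation grid.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(3)
donation = np.sort(rng.uniform(1e6, 2e8, 800))           # X: per institution
pell = 0.6 - 0.065 * np.log10(donation / 1e6)            # downward-sloping trend
pell += rng.normal(0, 0.03, donation.size)               # institution heterogeneity

f = UnivariateSpline(donation, pell, s=len(pell) * 0.03**2)  # fitted curve f(X)

grid = np.arange(5e5, 1e7 + 1, 5e5)      # A2's admissible increments dX

def fROI(X, dX):
    return (f(X + dX) - f(X)) / dX       # change in fitted value per dollar

print(fROI(5e7, grid[:3]))               # fROI_1 at X = $50m for small dX
```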
5.2.2 Model of the tuning parameter P_i
Although we have identified the homogenous influence of an increase in donation amount, we must not neglect the fact that institutions utilize donations differently. A proportion of donations might be appropriated by the university's administration, and different institutions allocate donations differently. For example, a university with a more convenient and well-maintained system for identifying students who need financial aid might be willing to use a larger portion of donations to aid students directly, resulting in a lower percentage of undergraduate students receiving a Pell Grant. Likewise, a university facing lower costs of identifying and hiring suitable faculty members might be inclined to use a larger portion of donations in that direction, resulting in a lower student-to-faculty ratio.
These reasons make institutions deviate from the homogenous fitted function, and they represent the heterogeneous influence of an increase in donation amount. Thus, while the homogenous influence depends only on the donation amount and its increase, the heterogeneous influence is institution-specific.
To account for this heterogeneous influence, we utilize a tuning parameter P_i to adjust the homogenous influence. By multiplying by the tuning parameter, the fitted ROIs of the performance contributing variables (fitted value changes) convert into the ROIs of the performance contributing variables (true value changes):

ROI_i = fROI_i · P_i.

We then argue that P_i can be summarized by a function of the deviation from the fitted curve (Δh), with the shape shown in Fig. 6. The value of P_i ranges from 0 to 2, because P_i can be viewed as an amplification or shrinkage of the homogenous influence: P_i = 2 means that the homogeneous influence is amplified greatly, while P_i = 0 means that the homogeneous influence is entirely wiped out.
The function has the shape shown in Fig. 6 for the following reasons. Intuitively, if an institution lies above the fitted line, then while the deviation is small, the larger it is, the larger P_i is, because the institution is more inclined to utilize donations to change that factor. However, when the deviation becomes even larger, the institution grows less willing to invest in this factor, because marginal utility decreases. The discussion is similar if the institution initially lies below the fitted line. Thus we assume the function mapping deviation to P_i is as in Fig. 6, with the deviation on the x-axis and P_i on the y-axis.
Figure 6: Function from Deviation to P_i
In order to simplify calculation, and without loss of generality, we approximate the function.
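The excerpt ends before the approximating function is written down, so its exact form is unknown to us. As a purely illustrative stand-in, the function below has the properties stated above: P_i(0) = 1, a range of [0, 2], growth for small positive deviations and decay for large ones, with mirror-image behaviour below the fitted curve. The scale parameter s is our assumption, not the paper's.

```python
# One plausible bump-shaped deviation-to-P_i mapping (illustrative only).
import numpy as np

def tuning_parameter(dev: np.ndarray, s: float = 1.0) -> np.ndarray:
    u = dev / s                                   # deviation on scale s
    # u * exp((1 - u^2)/2) peaks at +1 for u = 1 and at -1 for u = -1,
    # so P ranges over [0, 2] and equals 1 at zero deviation.
    return np.clip(1.0 + u * np.exp((1.0 - u**2) / 2.0), 0.0, 2.0)

dev = np.linspace(-4, 4, 9)
print(dict(zip(dev.round(1), tuning_parameter(dev).round(2))))
# ROI_i = fROI_i * tuning_parameter(deviation_i) then feeds section 5.2.3.
```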
2021 MCM Problem A (Fungi): six sample papers (with Matlab source code)
How do the different fungi interact and decompose ground litter in a fixed patch of land in different environments?
Problem A: Fungi
Include predictions about the relative advantages and disadvantages for each species and combinations of species likely to persist, and do so for different environments including arid, semi-arid, temperate, arboreal, and tropical rain forests.
Figure caption (fragment): (difference of each isolate's competitive ranking and their moisture niche width, both scaled to [0,1]) of various fungi and the resulting wood decomposition rate (% mass loss).
Your complete solution.
Award-winning paper, American high school mathematical contest in modeling
Abstract
In this paper, we undertake a search-and-find problem. The two search scenarios use differently designed models, but the same algorithm computes the main solution. In Part 1, we assume that the probability of finding the ring differs across paths. We weight each path according to the probability of finding the ring on it, which reduces the question to passing as much weight as possible within a limited distance. To simplify the calculation, we use a greedy algorithm with an approximate optimal solution, defining the value of each path according to its weight. We compute the overall probability from the weight of the chosen route relative to the total weight of all paths on the map. In Part 2, we first bound the jogger's possible movement area using the information on the map, then use Dijkstra's algorithm to analyze the specific area the jogger may be in (a minimal sketch follows), and finally apply the greedy algorithm with an approximate optimal solution to obtain the answer.
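The abstract leans on two standard ingredients, Dijkstra's algorithm and a greedy weighted search. A minimal sketch of the former on a toy path graph; the graph and its edge lengths are hypothetical.

```python
# Dijkstra's shortest paths bound how far the jogger can have moved.
import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbor, edge_length), ...]}
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, skip
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

park = {"A": [("B", 2), ("C", 5)], "B": [("C", 1), ("D", 4)],
        "C": [("D", 1)], "D": []}
print(dijkstra(park, "A"))   # shortest distances from the start point
```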
2023 American Collegiate Sociological Modeling Contest, Problem E (Chinese-version paper, translated)
Introduction
This paper addresses Problem E of the 2023 American collegiate sociological modeling contest. The contest is an opportunity to practice sociology through simulation: entrants must use relevant data and models to propose a solution to a social problem. Problem E concerns ................................. (briefly introduce the background of Problem E).
Methods
To solve Problem E, we adopted the following approach:
1. Data collection: we first collected data on .............................................. (state the collection methods and sources).
2. Data cleaning: we cleaned and organized the collected data, removing outliers and ensuring the data's accuracy and consistency.
3. Model selection: before analyzing the data, we compared several sociological modeling methods and chose the most suitable model for Problem E.
4. Model building: following the requirements and theoretical basis of the chosen model, we built the corresponding mathematical model and used software tools for computation and simulation.
5. Result analysis: by analyzing and interpreting the model's output, we drew conclusions and recommendations.
Results
From the analysis and simulation experiments, we reached the following conclusions:
1. ....................................... (conclusion 1)
2. ....................................... (conclusion 2)
3. ....................................... (conclusion 3)
Discussion
We recognize several limitations of this study, mainly:
1. ....................................... (limitation 1)
2. ....................................... (limitation 2)
3. ....................................... (limitation 3)
Nevertheless, our results retain practical significance and offer one possible route to solving Problem E.
Outstanding paper, 2015 Mathematical Contest in Modeling
Team #35943
Contents
1 Introduction and restatement
1.1 Background
1.2 Restatement of problems
1.3 An overview of sustainable development
2015 Mathematical Contest in Modeling (MCM/ICM) Summary Sheet
Problem Chosen: D
In order to measure a country's level of sustainable development precisely, we establish an evaluation system based on AHP (the Analytic Hierarchy Process). We classify the many influential factors into three parts: economic development, social development, and the situation of resources and environment. We then select 6 to 8 significant indicators for each part. With regard to the practical situation of the country we focus on, we build judgment matrices and obtain the weight of every indicator. Via a linear weighting method, we define a comprehensive index of sustainable development. Referring to classifications given by experts, we can then judge precisely whether a country is sustainable.
In task 2, we choose Cambodia as our target nation and obtain detailed data from the World Bank. Using this country's standardized data and the process above, we find that the comprehensive index is 0.48, which means its capacity for sustainable development is weak. We notice that industrial value added, the enrollment rate in institutions of higher learning, and five other indicators contribute most to sustainable development according to our model, so we make policies and plans focused on improving these aspects, including developing industry and improving social security. We also recommend that the ICM assist Cambodia in these areas in order to optimize its development.
To solve task 3, we consider several unpredictable factors that may influence sustainable development, such as politics and climate change. Taking all of these factors into consideration, we predict the value of every indicator in 2020, 2030 and 2034 under our plans. After calculating, we are delighted to find that the comprehensive index grows to 0.91, meaning the country would be quite sustainable. This also reflects that our model and plans are reasonable and correct.
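A minimal sketch of the AHP step described above: a pairwise judgment matrix over the three parts, the principal eigenvector as the weight vector, a consistency check (CR < 0.1), and linear weighting into a comprehensive index. The judgment values and sub-scores are illustrative, not those of the paper.

```python
# AHP: judgment matrix -> eigenvector weights -> consistency ratio.
import numpy as np

def ahp_weights(J: np.ndarray):
    eigval, eigvec = np.linalg.eig(J)
    k = np.argmax(eigval.real)                  # principal eigenvalue index
    w = np.abs(eigvec[:, k].real)
    w /= w.sum()                                # normalized weight vector
    n = J.shape[0]
    ci = (eigval.real[k] - n) / (n - 1)         # consistency index
    ri = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41}[n]
    return w, ci / ri                           # weights, consistency ratio

# economy vs society vs resources/environment (hypothetical judgments)
J = np.array([[1, 2, 3], [1/2, 1, 2], [1/3, 1/2, 1]], float)
w, cr = ahp_weights(J)
index = w @ np.array([0.5, 0.4, 0.55])          # linear weighting of sub-scores
print(w.round(3), round(cr, 3), round(index, 3))
```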
MCM/ICM writing template (covering abstract, format, summary, tables, formulas, figures, and assumptions)
Reference format, explained (translation of the Chinese commentary)
General requirements:
1. The references cited in the text and the reference list at the end must be fully consistent: every reference cited in the text can be found in the list, and every reference in the list must be cited in the text.
2. Every entry in the reference list must be accurate and complete.
3. Order of the reference list: entries are arranged alphabetically by author surname; for identical surnames, alphabetically by given name; for identical authors, by year of publication. Different works by the same author in the same year are distinguished by appending a, b, c, d, ... to the year, ordered alphabetically by title, e.g.:
Wang, M. Y. (2008a). Emotional...
Wang, M. Y. (2008b). Monitor...
Wang, M. Y. (2008c). Weakness...
4. Abbreviations:
chap. = chapter; ed. = edition; Rev. ed. = revised edition; 2nd ed. = second edition; Ed. (Eds.) = Editor (Editors); Trans. = Translator(s); n.d. = no date; p. (pp.) = page (pages); Vol. = Volume (as in Vol. 4); vols. = volumes (as in 4 vols.); No. = Number; Pt. = Part; Tech. Rep. = Technical Report; Suppl. = Supplement.
5. Citing studies in a meta-analysis: the research reports used in a meta-analysis are placed directly in the reference list, with an asterisk (*) added before each such entry, and a note at the head of the list explaining that * marks the studies included in the meta-analysis.
In-text citation marks: in the author-date system, the citation mark consists of the author and the year of publication, in two main forms. (1) It can act as a component of the sentence, e.g., "Dell (1986) proposed a phonological encoding model based on the analysis of speech errors...", or "Chinese lexical research includes the study by Zhuang Jie and Zhou Xiaolin (2001)". (2) It can also be placed in parentheses at the end of the citing sentence, e.g., "In linguistics, the syllable is the basic unit of phonological structure and the smallest speech segment that people naturally perceive."
Outstanding paper from the 4th MathorCup Mathematical Modeling Challenge: Book Recommendation
MathorCup Global College Student Mathematical Modeling Challenge: Commitment Letter
We have carefully read the rules of the MathorCup Global College Student Mathematical Modeling Challenge. We fully understand that, once the contest begins, team members may not research or discuss the contest problem in any way (including by telephone, e-mail, or online consultation) with anyone outside the team, including advisors.
We understand that plagiarizing others' work violates the contest rules; if others' results or other public material (including material found online) is used, it must be clearly acknowledged, in the prescribed citation format, both at the point of use in the text and in the references.
We solemnly promise to abide strictly by the contest rules so as to guarantee the fairness of the competition. If we violate the rules, we will accept serious consequences.
Problem chosen (one of A/B/C): B. We agree that the organizing committee may publish our paper on the Xiaoyuan modeling website: Yes. Team number: 10352. Team members: 1. Xing Yunfei, 2. Zhang Lina, 3. Song Yingzhao. Advisor (or head of the advising group): Liao Chuanrong. Date: 28 May 2014.
1. Problem Restatement
1.1 Background
With the spread of the Internet, book publishing has boomed. Readers face an ever-growing amount of information and an ever-wider choice of books, so selecting a satisfying book is no longer easy. In response to the demands of the era, personalized recommendation has emerged: it discovers a user's "interests" from historical data and social behavior and presents information to the user proactively, so that the user can find interesting books as quickly as possible in a sea of information.
However, research on book evaluation at home and abroad remains relatively backward, both in theory and in practice. At present, book evaluation and book recommendation still rest on qualitative analysis. It is therefore necessary to predict book ratings from user profiles and historical behavior, and to build a reasonably accurate book recommendation system.
1.2 Problems
From the data and requirements given in the statement, the task reduces to three problems:
1. Mine the internal relationships in the data, observe the relation between ratings and the data, and identify the factors that influence users' ratings.
2. Based on the influencing factors from problem 1, build an appropriate prediction model to rate the books in the table that each user has not yet rated.
3. Use the users' social data and collaborative filtering to recommend books that match their interests (a minimal sketch follows this list).
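A minimal sketch of the user-based collaborative filtering named in problem 3, assuming cosine similarity between rating vectors and a similarity-weighted average as the prediction rule; the tiny rating matrix is hypothetical.

```python
# User-based CF: cosine-similar neighbours vote on an unrated book.
import numpy as np

R = np.array([[5, 3, 0, 1],      # rows: users, cols: books, 0 = unrated
              [4, 0, 0, 1],
              [1, 1, 5, 4],
              [0, 1, 5, 4]], float)

def predict(R, user, book):
    rated = np.flatnonzero(R[:, book] > 0)      # users who rated this book
    sims = []
    for v in rated:
        num = R[user] @ R[v]
        den = np.linalg.norm(R[user]) * np.linalg.norm(R[v])
        sims.append((num / den, R[v, book]))    # (similarity, their rating)
    s = np.array(sims)
    return (s[:, 0] @ s[:, 1]) / s[:, 0].sum()  # similarity-weighted average

print(round(predict(R, user=1, book=2), 2))     # predicted rating of book 2
```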
Outstanding entry, 2011 Mathematical Contest in Modeling
Abstract
This paper presents a case study illustrating how probability distributions, genetic algorithms and the geographical analysis of serial crime, conducted within a geographic information system, can assist crime investigation. Techniques are illustrated for predicting the location of future crimes and for determining the possible residence of offenders, based on the geographical pattern of the existing crimes and a quantitative method, PSO. We find that such methods are relatively easy to implement within GIS given appropriate data, but rely on many assumptions regarding offenders' behaviour. While some success has been achieved in applying the techniques, we conclude that the methods are essentially theory-less and lack evaluation. Future research into the evaluation of such methods and into the geographic behaviour of serial offenders is required before they can be applied to investigations with confidence in their reliability.
1. Introduction
This series of armed robberies occurred in Phoenix, Arizona between 13 September and 5 December 1999 and included 35 robberies of fast-food restaurants, hotels and retail businesses. The offenders were named the "Supersonics" by the Phoenix Police Department Robbery Detail, as the first two robberies were of Sonic Drive-In restaurants. After the 35th robbery the offenders appear to have desisted from their activity, and at present the case remains unsolved. The MO was for the offenders to target businesses where they could easily gain entry, pull on a ski mask or bandanna, confront employees with a weapon, order them to the ground, empty the cash from a safe or cash register into a bag, and flee on foot, most likely to a vehicle waiting nearby. While it appears that the offenders occasionally worked alone or in pairs, the MO, weapons and witness descriptions tend to suggest a group of at least three offenders.
The objective of the analysis was to use the geographic distribution of the crimes to predict the location of the next crime, within an area small enough for the Robbery Detail to conduct stakeouts and surveillance. After working with a popular crime analysis manual (Gottleib, Arenberg and Singh, 1994), it was found that the prescribed method produced target areas so large that they were not operationally useful. However, the approach was attractive, as it required only basic information and relied on simple statistical analysis. To identify areas more useful to the Robbery Detail, it was decided to use a similar approach combined with other measurable aspects of the spatial distribution of the crimes. As this was a "live" case, new crimes and information were integrated into the analysis as they came to hand.
2. Assumptions
In order to modify the existing model, we apply several new assumptions so that our rectified model is more practical:
1. Criminals prefer something about the locations where previous crimes were committed. We suppose the criminals have a greater opportunity to run away if they offend at a site they are familiar with. In addition, criminals probably choose previous crime sites near where their potential victims live and work.
2. Offenders regard it as safer to offend at a previous crime site as time goes by. A site is heavily monitored by police shortly after a crime, so the criminal would run a risk of being arrested there; as noted above, the police reduce the frequency of examining previous crime sites as time passes.
3. Criminals are likely to choose sites at an optimal distance. This is reasonable: it is probably insecure to offend at a site far away, which costs energy to escape from and adds the risk of arrest in unfamiliar terrain; it is equally unwise to offend too near, which increases the probability of being recognized or trapped. As a result, we can measure an optimal distance across serial perpetrations.
4. Crimes are committed by an individual. We assume all cases in the model are committed by individuals rather than organized groups, so the criminal is subject to the assumptions above due to his limited preparation.
5. Criminals' movements are unconstrained. Because of the difficulty of finding real-world distance data, we invoke the "Manhattan assumption": there are enough streets and sidewalks in a sufficiently grid-like pattern that movement along real-world routes is the same as straight-line movement in a space discretized into city blocks. It has been demonstrated across several types of serial crime that Euclidean and Manhattan distances are essentially interchangeable in predicting anchor points.
3. The prediction of the next crime site
3.1 The measure of the optimal distance
Because the criminal's mental optimal distance depends on how careful a person he is, it cannot be a fixed constant, and it changes from moment to moment; it should, however, be reflected in the distances between the former crime sites. Presume the coordinates of the n crime sites are (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), and define the distance between the i-th and j-th crime sites as D_{i,j}.
We first take D_{i,j} to be the Euclidean distance:

D_{i,j} = sqrt((x_i - x_j)² + (y_i - y_j)²).

With this, we can measure the distance between the n-th crime site and each earlier one. According to assumption 2, the criminal believes that the earlier crime sites have become safer for him to revisit, so we define his mental optimal distance by giving the sites weights from little to much according to when the offenses happened in time sequence:

SD = Σ_{i=1}^{n-1} w_i D_{i,n},

satisfying w_1 < w_2 < ... < w_{n-1} and Σ_{i=1}^{n-1} w_i = 1. Presuming the i-th crime happens at time t_i, measured in weeks, we take w_k = t_k / Σ_{i=1}^{n-1} t_i.

SD reflects the criminal's mental condition to some extent, so we use it to predict his mental optimal distance for the (n+1)-th case. Relative to the n-th crime site, the criminal uses SD to estimate the next optimal distance, while for the earlier sites the optimal distances shrink as time goes back. Thus the optimal distance attached to the i-th crime site is

SD_i = (t_i / t_n) · SD.

3.2 The measure of the probability distribution
Given the crime sites and locations, we can tentatively estimate the probability density distribution of future crimes: we add a small normal distribution around every crime scene to produce a probability density estimate. Each small normal distribution uses the SD_i above as its mean:

f(x, y) = (1/n) Σ_{i=1}^{n} (1 / (sqrt(2π) σ)) exp(-(r_i - SD_i)² / (2σ²)),

where r_i is the Euclidean distance from the point (x, y) to the i-th crime site, and σ is the standard difference of the deviation of the criminal's mental optimal distance. σ reflects the uncertainty of this deviation; it involves the impact of many factors and cannot be measured quantitatively, so it is discussed below.
3.3 The quantization of the standard difference
The standard difference is identified according to the following goal: every prediction of the next crime site from the previous crime sites should have the highest success rate. With such an optimization objective, direct analysis and exhaustive search are impossible; instead we use an optimized-solution search algorithm, the genetic algorithm.
Figure 1: The Distribution of the Population of the Last Generation
According to the figure, the population of the last generation is mostly concentrated near 80, which is used as the standard difference and substituted into the formula above. With that formula, we can predict the probability density of whether a zone will contain the next crime site.
Case analysis:
Figure 2: The prediction of the 5th crime site from the 4 previous ones
Figures 3-4: Predictions of the 6th crime site from the 5 previous ones
Judging from these predictions, the outputs of the model are relatively accurate and can serve as references for criminal investigations to some extent. However, as the number of such crimes increases, the predicted outputs deviate from the actual sites more and more, as in the prediction of the 23rd crime site from the 22 previous ones:
Figure 5: The prediction of the 23rd crime site from the 22 previous ones
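A minimal sketch putting the pieces above together: time-increasing weights w_i, the mental optimal distance SD, the per-site distances SD_i, and one normal bump per site. The site coordinates and σ are synthetic stand-ins (the paper's genetic algorithm tunes σ to about 80 on its own map scale).

```python
# Probability surface for the next crime site.
import numpy as np

rng = np.random.default_rng(4)
sites = rng.uniform(0, 100, (6, 2))            # (x_i, y_i) of past crimes
t = np.arange(1, 7, dtype=float)               # week of each crime
w = t[:-1] / t[:-1].sum()                      # w_1 < ... < w_{n-1}, sum = 1

D_last = np.linalg.norm(sites[:-1] - sites[-1], axis=1)
SD = w @ D_last                                # mental optimal distance
SD_i = (t / t[-1]) * SD                        # per-site optimal distances

def density(x, y, sigma=8.0):
    r = np.linalg.norm(np.stack([x, y], -1)[..., None, :] - sites, axis=-1)
    bumps = np.exp(-(r - SD_i) ** 2 / (2 * sigma**2)) / (np.sqrt(2*np.pi)*sigma)
    return bumps.mean(-1)                      # average of the n bumps

gx, gy = np.meshgrid(np.linspace(0, 100, 101), np.linspace(0, 100, 101))
surface = density(gx, gy)                      # evaluate over the city grid
peak = np.unravel_index(surface.argmax(), surface.shape)
print(gx[peak], gy[peak])                      # most likely next-crime cell
```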
Conclusion from the analysis: we may not be able to predict the next crime site accurately if we use Euclidean distance to measure the probability directly, so the actual conditions should be brought into the analysis. For example, we can consider commuting routes comprehensively, based on the convenience of escape (such as the expressway network and tunnels); regarding the concealment of the crime, we should consider the population of the area and the distance from the police department. Thus more weight should be given to commute convenience, concealment, and low population. In addition, as the number of crimes increases, the accuracy of the model may decrease, because a more experienced criminal chooses his next crime sites more randomly.
4. Problems and further improvements
With 23 crimes in the series, the predictions tended to provide large areas that included the target crime but were too large to be useful given the limited resources at the police's disposal. At this stage, a more detailed look was taken at the directionality of, and distances between, crimes. No significant trends could be found in the sequential distance between crimes, so an attempt was made to better quantify the relationship between crimes in terms of directionality. The methodology began by calculating the geographic center of the existing crimes: a derived point that identifies the position at which the distance to each crime is minimized (a computational sketch follows at the end of this section). Once constructed, the angle of each crime from the north point of the geographic center was calculated, from which the change in direction between sequential crimes could be computed. It was found that the offenders tended to pattern their crimes by switching direction away from the last crime. It appears the offenders were trying to create a random pattern to avoid detection, but unwittingly created a uniform pattern through their choice of locations. This relationship was quantified, and a simple linear regression was used to predict the next direction. The analysis was once again applied to the data. While the identified area was reduced from previous versions and was prioritized into sub-segments, the problem remained that the predicted areas were still too large to be used as more than a general guide to resource deployment.
A major improvement to the methodology was to include individual targets. By this stage of the series, hotels and auto-parts retailers had become the targets of choice. A geocoded data set became available that allowed hotels and retail outlets to be plotted and compared with the predicted target areas. Ideally, businesses falling within the target areas could be prioritized as more likely targets. However, in some cases the distribution of likely businesses appeared to contradict the predicted area; for example, few target hotels appeared in the target zone identified by the geographic analysis. In such cases, more reliance was placed on the locations of individual targets.
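The geographic center used above, the point minimizing the summed distance to all crime sites, is the geometric median. Below is a sketch of Weiszfeld's iteration for it, plus the bearing of each crime from north at that center; the coordinates are made up.

```python
# Geometric median (Weiszfeld) and bearings from north.
import numpy as np

def geographic_center(pts, iters=100, eps=1e-9):
    c = pts.mean(axis=0)                      # start from the centroid
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(pts - c, axis=1), eps)
        w = 1.0 / d                           # inverse-distance weights
        c_new = (pts * w[:, None]).sum(0) / w.sum()
        if np.linalg.norm(c_new - c) < eps:
            break
        c = c_new
    return c

crimes = np.array([[1.0, 2.0], [3.5, 0.5], [2.0, 4.0], [4.0, 3.0]])
center = geographic_center(crimes)
bearings = np.degrees(np.arctan2(crimes[:, 0] - center[0],
                                 crimes[:, 1] - center[1]))  # angle from north
print(center.round(3), bearings.round(1))
```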
From this analysis it was possible to identify a prioritized list of individual commercial targets, which was of more use operationally. Maps were also provided to give an indication of target areas; Figure 6 demonstrates a map created using this methodology. It is apparent from the above discussion that the target areas identified were often too large to be used as more than a general guide by the Robbery Detail. By including the individual targets, however, it was possible to restrict the possible target areas to smaller, more useful areas and a few prioritized targets. Such an approach risks being overly restrictive, and it is not the purpose of the analysis to restrict police operations but to suggest priorities. This problem was partly dealt with by involving investigators in the analysis and presenting the results objectively, so that investigators could make their own judgments about them.
To be more confident in using this kind of analysis, a stronger theoretical background to the methods is required. What has been applied here simply exploits the spatial relationships in the available information, without considering the connection to the actual behaviour of the offenders. For example, what is the reason behind a particular trend observed in the distance between crimes? Why would such a trend be expected between crimes that occur on different days and possibly involve different individuals? While some consideration was given to the reason behind the pattern of directionality, and while it seems reasonable to expect offenders to look for freeway access, such reasoning has tended to follow the analysis rather than substantiate it. Without a theoretical background, the analysis rests only on untested statistical relationships that do not answer the basic question: why this pattern? So next we apply a quantitative method with a theoretical background, PSO, to locate the criminal's residence.
5. The prediction of the residence
Particle Swarm Optimization (PSO) is an evolutionary computation technique invented by Dr. Eberhart and Dr. Kennedy. It is an iterative optimization tool inspired by research on the behavior of bird predation. Initialized with a series of random values, PSO converges to the optimum through iteration. In our residence-search problem, the serial crime sites are abstracted into 23 particles without volume or weight, placed in 2-D space. Like a bird, the criminal is presumed to go directly home after committing a crime, so we imagine 23 criminals who commit the crimes at the 23 sites mentioned before and then go straight home. Each criminal is represented as a vector, as is his velocity. Every criminal has a fitness decided by the functions to be optimized, and a velocity that decides his direction and distance. Every criminal knows the best position discovered so far by himself (pbest, the residence known by the individual) and where he is now; he also knows the best position found by the group (gbest, the residence known by the group), which can be regarded as the experience of the other criminals. The criminals locate the residence using both their own experience and that of the whole group.
The PSO computation initializes the 23 criminals, who then pursue the current optimum to search the space; in other words, they find the
optimized solution by iteration.
Presume that in 2-D space the position and velocity of the i-th particle are X_i = (x_{i,1}, x_{i,2}) and V_i = (v_{i,1}, v_{i,2}). In every iteration, the particles pursue two best positions to update themselves: the individual best (pbest), P_i = (p_{i,1}, p_{i,2}), found by the particle itself, and the group best (gbest), P_g, found by the whole group so far. Having found these two optima, each particle updates its velocity and position by

v_{i,j}(t+1) = w·v_{i,j}(t) + c_1 r_1 [p_{i,j} - x_{i,j}(t)] + c_2 r_2 [p_{g,j} - x_{i,j}(t)],
x_{i,j}(t+1) = x_{i,j}(t) + v_{i,j}(t+1),  j = 1, 2,

where w is the inertia weight factor, c_1 and c_2 are positive learning factors, and r_1 and r_2 are random numbers uniformly distributed between 0 and 1. The learning factors give the criminals the ability to learn from themselves and from others; we set both to 2, as is usual in PSO. The inertia weight w decides how much of the current velocity is inherited; an appropriate choice balances searching and exploring ability. To balance the global search ability and the local refinement ability of the algorithm, we adopt a self-adaptive method, the Non-linear Dynamic Inertia Weight Coefficient:

w = w_max,  if f > f_avg;
w = w_min + (w_max - w_min)(f - f_min)/(f_avg - f_min),  if f ≤ f_avg,

where w_max and w_min are the maximum and minimum of w, f is the current objective value of the particle, and f_avg and f_min are the average and minimum objective values of all current particles. The inertia weight changes automatically with the objective value, hence the name self-adaptive: when the final values (estimations of the criminal's residence) become consistent, the inertia weight increases; when they become sparser, it decreases. Meanwhile, particles whose objective values are worse than the average get a smaller inertia weight, which protects the site, while particles whose values are better than the average get a larger one, moving them nearer the search zone.
With this self-adaptive PSO, we calculate the minimum value of

Σ_{j=1}^{23} R_j,  R_j = sqrt((x - x_j)² + (y - y_j)²),

where (x, y) is the criminal's residence. The output (x, y) is (2.368260870656715, 3.031739124610613); the residence is shown in Figure 7.
Figure 7: The residence in the map
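A sketch of the PSO residence search described above, with c_1 = c_2 = 2 and the adaptive inertia weight as reconstructed; the 23 crime-site coordinates here are synthetic stand-ins, so the output will not reproduce the paper's (2.368, 3.032).

```python
# PSO minimizing the total distance from a candidate residence to all sites.
import numpy as np

rng = np.random.default_rng(5)
crimes = rng.normal([2.4, 3.0], 1.5, (23, 2))        # stand-in crime sites

def fitness(p):                                       # total distance to sites
    return np.linalg.norm(crimes - p, axis=1).sum()

P = rng.uniform(-5.0, 10.0, (23, 2))                  # particle positions
V = np.zeros((23, 2))                                 # particle velocities
pbest = P.copy()
pval = np.array([fitness(p) for p in P])              # personal best values
fcur = pval.copy()                                    # current fitness values
g = pbest[pval.argmin()].copy()                       # global best position
w_min, w_max, c1, c2 = 0.4, 0.9, 2.0, 2.0

for _ in range(300):
    favg, fmin = fcur.mean(), fcur.min()
    w = np.where(fcur <= favg,                        # adaptive inertia weight
                 w_min + (w_max - w_min) * (fcur - fmin) / (favg - fmin + 1e-12),
                 w_max)
    r1, r2 = rng.random((2, 23, 1))
    V = w[:, None] * V + c1 * r1 * (pbest - P) + c2 * r2 * (g - P)
    P = P + V
    fcur = np.array([fitness(p) for p in P])
    better = fcur < pval
    pbest[better], pval[better] = P[better], fcur[better]
    g = pbest[pval.argmin()].copy()

print(g.round(3))                                     # estimated residence (x, y)
```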
6. Conclusion
This paper has presented a case study illustrating how probability distributions and the geographical analysis of serial crime can assist crime investigation. Unfortunately, in the Supersonic armed robbery investigation, the areas identified were too large to be of much use to investigators. Further, because of the number of assumptions applied, the method does not inspire enough confidence to dedicate resources to comparing its results with the enormous amount of suspect data collected on the case. While the predicted target areas tended to be large, the mapping of individual commercial targets appears to offer a significant improvement to the method. However, as they stand, these methods lack a theoretical basis that would allow the results to be judged and applied in investigations. Such limitations can be offset to some degree by involving investigators in the analysis. In the end, we used a quantitative method to locate the criminal's residence and thus shrink the identified areas. Given the advantages and drawbacks of the above methods, we suggest combining different methods to fight crime comprehensively.
Approach to the 2019 MCM/ICM Problem E
Problem: What is the cost of environmental degradation? Approach: build a multiple-regression statistical model of the cost of environmental degradation, and use it to evaluate the environmental cost of land-use development projects.
Abstract for a mathematical model of the cost of environmental degradation
"What is the cost of environmental degradation?" is the mathematical question this paper sets out to answer. To clarify it, we analyzed and modeled the question, studied the references, established the corresponding models, derived the computational formulas, wrote the computing programs, and, after running the programs, obtained the computational results. Specifically:
For problem one, the most important part of the question: following the statement, we analyzed the problem, consulted existing material, established the mathematical model for problem one, derived its formula, wrote its program, and obtained its results.
For problem two, which is more complex than problem one and forms the core of the question, with more to analyze and more to compute: building on problem one, we analyzed the problem, consulted existing material, established the mathematical model for problem two, derived its formula, wrote its program, obtained its results, and presented them in charts.
For problem three, a deepening of problems one and two: building on them, we analyzed the problem, consulted existing material, established its mathematical model, derived its formula, wrote its program, obtained its results, presented them in charts, and discussed them.
For problem four, an extension of problems one, two and three. (A minimal regression sketch follows.)
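The outline above names multiple regression as its one concrete tool. Below is a minimal OLS sketch with synthetic data; the regressors (developed land area, pollutant load, habitat loss) are illustrative placeholders, not the paper's variables.

```python
# OLS pricing of environmental degradation from project attributes.
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n),                 # intercept
                     rng.uniform(0, 50, n),      # land area developed (km^2)
                     rng.uniform(0, 10, n),      # pollutant load index
                     rng.uniform(0, 1, n)])      # habitat-loss fraction
beta = np.array([1.0, 0.8, 2.5, 40.0])           # "true" cost coefficients
cost = X @ beta + rng.normal(0, 2, n)            # degradation cost ($m)

beta_hat, *_ = np.linalg.lstsq(X, cost, rcond=None)   # OLS fit
project = np.array([1.0, 12.0, 3.0, 0.2])             # a candidate project
print(beta_hat.round(2), round(project @ beta_hat, 1)) # predicted env. cost
```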
Summary
This paper discusses schemes to protect a stunt person with a stack of boxes. We develop two models for the problem: Model I involves 3-D simulation, while Model II is a 2-D model. Both treat boxes as the basic elements of computation.
In Model I, we analyze the physical properties of a box. A box has three states: uncompressed, partly compressed, and fully compressed. We record the size, position, mass, velocity, and state of each object, and calculate the motion and collisions of the objects at each time step. In order to study the stability of the model, we discuss different entry points of the motorcycle.
In Model II, each box is treated as a sphere in the collision computation. This model is simple, but is suitable for testing modifications to the boxes.
We calculate the appropriate size and number of boxes to use, discuss the effects of different ways of stacking boxes, and finally discuss generalizations and further improvements of our models.
The Stunt Person
The Problem
An exciting action scene in a movie is going to be filmed, and you are the stunt coordinator! A stunt person on a motorcycle will jump over an elephant and land in a pile of cardboard boxes to cushion their fall. You need to protect the stunt person, and also use relatively few cardboard boxes (lower cost, not seen by camera, etc.). Your job is to:
* determine what size boxes to use
* determine how many boxes to use
* determine how the boxes will be stacked
* determine if any modifications to the boxes would help
* generalize to different combined weights (stunt person & motorcycle) and different jump heights
Note that, in "Tomorrow Never Dies", the James Bond character on a motorcycle jumps over a helicopter.
Restatement of the Problem
Hardly a movie made today is without some kind of amazing stunt work. We are all held breathless when a person falls out of a 20-story building or when a heart-stopping car chase ends in a spectacular crash. To reduce the chance of damage or injury, stunt designers use devices that stretch out the time it takes to stop a body's momentum: the longer the period of time used in changing the momentum, the less force is released upon impact. Cardboard boxes are one of the devices used. Our task is to:
① determine what size boxes to use
② determine how many boxes to use
③ determine how the boxes will be stacked
④ determine if any modifications to the boxes would help
⑤ generalize to different combined weights (stunt person & motorcycle) and different jump heights
Basic Assumptions
(1) The area where the cardboard boxes are stacked is flat and without obstacles.
(2) During the entire process, the stunt person & motorcycle are considered as a single object.
(3) The cardboard boxes are approximately considered as cuboids.
(4) Compared with the resistance of the boxes, air resistance is very slight and can be neglected.
Symbols
μ: the bounce coefficient of the box
v_x, v_y: the components of velocity along the x and y axes
S: compression of the box
E_0: the kinetic energy a box absorbs when it turns into a compact body
Analysis of the Problem and Model Design
When the motorcycle crashes into the boxes, it collides with them and forces them into motion. The boxes, in return, exert forces on the motorcycle, thus cushioning its movement. Though the interactions between the boxes and the motorcycle are complicated, they can mainly be divided into two kinds: friction and elastic force.
The objects are acted on by friction, bounce, and gravity, and, inside a box, by tension, pull and so on. Since this problem involves discontinuous mechanical behavior, we adopt the Discrete Element Method in our model design.
Model I:
We analyze what happens when a cardboard box is compressed. If the compression is slight, the cardboard box maintains its structure and strength, and the compression is restorable. As the compression increases, the resistance of the box gets stronger. When the compression gets so big that the box can no longer hold its structure, the structure breaks and the box begins to collapse. A collapsing box has only negligible strength to resist further compression until most of the air in the box has been expelled and the box turns into a compact body. Then the resistance-compression curve becomes so steep that the box cannot be further compressed. The relation between the resisting force and the compression of a box is illustrated in the graph below.
Figure 1: The relation of the resisting force and the compression of a cardboard box
A to B: restorable compression; B to C: the box begins to collapse; C to D: collapsing; beyond D: the box turns into a compact solid.
Let E_0 be the kinetic energy the box absorbs by the time it turns into a compact body. We can calculate the energy absorbed by the box from point A to C, which is close to E_0, using the formula

E_0 ≈ W = ∫_A^C F ds.   (1)

Note that this is the area of the shaded region:

W = ∫_A^C F ds = S_ABC.   (2)

Thus a box absorbs a definite amount of kinetic energy when it is compressed from point A to point C. Compared with the duration of the whole process, the time from A to C is so short that it can be neglected. When the compression reaches point D, the resisting force increases so sharply that we can treat the box as a rigid body. Therefore, we design the following model.
A box is modeled as a three-state cuboid that is compressible in the "uncompressed" and "partly compressed" states, and becomes a rigid body in the "fully compressed" state. An uncompressed box maintains its original shape and volume; it turns "partly compressed" when pressed by another object and absorbs a fixed amount of the object's kinetic energy. A partly compressed box can be staved in and compressed into a smaller cuboid without exerting any friction or elastic force on other objects, and it turns into the fully compressed state when its volume decreases to a certain limit. The stunt man and the motorcycle are modeled as one rigid cuboid that crashes into the boxes. When it collides with a box, it compresses the box until the box reaches the fully compressed state, a rigid cuboid, which then interacts with the stunt man & motorcycle according to the collision law. The box decelerates the man & motorcycle, while the man & motorcycle forces the box into movement, and the moving box then collides with other boxes. As the man & motorcycle moves on, it collides with more boxes, decelerating along the way, and eventually stops. The whole process of the man & motorcycle colliding with a box is illustrated in Fig. 2.
Figure 2: The man & motorcycle colliding with a box.
(a) The man & motorcycle moves towards the uncompressed box.
(b) The man & motorcycle contacts the box and begins to compress it, turning it from uncompressed into partly compressed; the box absorbs a definite amount of the kinetic energy of the man & motorcycle.
(c) The man & motorcycle continues to compress the box; the box is in the partly compressed state and exerts no force on the man & motorcycle.
(d) The volume of the box decreases to the limit; the box turns into the fully compressed state (a rigid body) and collides with the man & motorcycle.
(e) The compressed box is forced into movement by the collision, while the man & motorcycle is decelerated by the box.
In this model, the movement of the man & motorcycle is influenced by gravity and by collisions with boxes. The movement of a fully compressed box is influenced by gravity, collision with the man & motorcycle, collision with other boxes, and ground support & friction if it touches the ground, while an uncompressed box is influenced only by gravity and by support from the boxes below or from the ground, until it turns into the compressed state. A computer program calculates all of these influences and simulates the cushioning process of the boxes.
In the computer program, the stunt man and motorcycle are represented as one object, and all the boxes are represented as objects too. Each object holds its size, position, mass and velocity, and each box object has a variable representing its current state. The motion of each object is calculated at each time step: every object is moved and checked for collision with other objects. If no collision happens to the current object, its state is advanced by

v_x(t) = v_x(t - Δt),
v_y(t) = v_y(t - Δt) + g·Δt,
x(t) = x(t - Δt) + v_x(t - Δt)·Δt,
y(t) = y(t - Δt) + v_y(t - Δt)·Δt.   (3)

If a collision happens to the current object, the program calculates the reaction of the colliding objects based on their mass, speed and state. This divides into three conditions.
Condition I: the current object collides with an uncompressed box. If the collision is not hard enough, the box remains uncompressed and the standard collision law applies; otherwise, the uncompressed box turns partly compressed and absorbs a certain amount of energy. The strength of a cardboard box is usually calculated with the McKee formula:

BCT = FC · ECT · Caliper^0.5076 · BP^0.4924,
Lab Compression = BCT · SF · LWRF · HFF · PF,   (4)

where FC = flute constant, ECT = edge crush test, BP = box perimeter, SF = shape factor, LWRF = length-to-width ratio factor, HFF = horizontal flute factor, PF = printing factor. We use this formula to estimate the absorbed energy E_0 in our program.
Condition II: the object collides with a partly compressed box. Because a partly compressed box exerts no force on other objects, the current object moves exactly as if no collision were happening, so formula (3) applies.
Condition III: the object collides with a fully compressed box or with the man & motorcycle. If the current object itself is an uncompressed or partly compressed box, then Condition I or Condition II applies to the fully compressed box or the man & motorcycle. If the current object is also a fully compressed box or the man & motorcycle, a solid collision happens and the formulas below apply.
Figure 3: The position of the two objects.

v̄_x = (m_1 v_{1x} + m_2 v_{2x}) / (m_1 + m_2),
v'_{1x} = v̄_x + μ·(v̄_x - v_{1x}),
v'_{2x} = v̄_x + μ·(v̄_x - v_{2x}),   (5)

where μ is the bounce coefficient, and

if |v_{1y} - v_{2y}| ≥ Const·|v_{1x} - v_{2x}|:  v'_{1y} = v_{1y}, v'_{2y} = v_{2y};
otherwise:  v'_{1y} = v'_{2y} = (m_1 v_{1y} + m_2 v_{2y}) / (m_1 + m_2).   (6)
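A sketch of the collision update of formulas (5)-(6) as reconstructed above; the threshold Const and the example masses (a 150 kg man & motorcycle hitting a fully compressed 30 cm box of mass 6·0.68·0.3² ≈ 0.37 kg, per formula (7) below) are illustrative.

```python
# Rigid-body collision step: x-components relax toward the common
# velocity with bounce coefficient mu; y-components merge only when the
# vertical relative speed is small compared with the horizontal one.
import numpy as np

def collide(m1, v1, m2, v2, mu=0.2, const=1.0):
    v1, v2 = np.asarray(v1, float), np.asarray(v2, float)
    vbar_x = (m1 * v1[0] + m2 * v2[0]) / (m1 + m2)     # common x-velocity
    v1x = vbar_x + mu * (vbar_x - v1[0])               # formula (5)
    v2x = vbar_x + mu * (vbar_x - v2[0])
    if abs(v1[1] - v2[1]) >= const * abs(v1[0] - v2[0]):
        v1y, v2y = v1[1], v2[1]                        # keep y-components
    else:                                              # formula (6): merge
        v1y = v2y = (m1 * v1[1] + m2 * v2[1]) / (m1 + m2)
    return np.array([v1x, v1y]), np.array([v2x, v2y])

# motorcycle + rider (150 kg) hits a resting fully compressed box (0.37 kg)
v_motor, v_box = collide(150.0, [20.0, -8.0], 0.37, [0.0, 0.0])
print(v_motor.round(2), v_box.round(2))
```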
If a collision does happen to the current object, the program computes the reaction of the colliding objects from their masses, velocities and states (compressed or not). Three conditions are distinguished:

Condition I: the current object collides with an uncompressed box. If the collision is not hard enough, the box remains uncompressed and the standard collision law applies; otherwise the box turns partly compressed and absorbs a definite amount of energy. The strength of a cardboard box is usually estimated with the McKee formula:

\text{McKee Formula} = FC \cdot ECT \cdot Caliper^{0.5076} \cdot BP^{0.4924}
\text{Lab Compression} = \text{McKee Formula} \cdot SF \cdot LWRF \cdot HFF \cdot PF \qquad (4)

where
FC = flute constant, ECT = edge crush test,
BP = box perimeter, SF = shape factor,
LWRF = length-to-width ratio factor,
HFF = horizontal flute factor, PF = printing factor.

We use this formula to estimate the absorbed energy E_0 in our program.

Condition II: the current object collides with a partly compressed box. Because a partly compressed box does not exert any force on other objects, the current object moves exactly as if no collision were happening, so formula (3) applies.

Condition III: the current object collides with a fully compressed box or with the man & motorcycle. If the current object itself is an uncompressed or a partly compressed box, Condition I or Condition II applies with the roles exchanged. If the current object is also a fully compressed box or the man & motorcycle, a solid collision happens (the geometry of the two objects is shown in Fig.3) and the formulas below apply.

Fig.3 The position of the two objects

\bar v_x = (m_1 v_{1x} + m_2 v_{2x}) / (m_1 + m_2)
v'_{1x} = \bar v_x + \mu (\bar v_x - v_{1x})
v'_{2x} = \bar v_x + \mu (\bar v_x - v_{2x}) \qquad (5)

where \mu is the bounce coefficient. In the vertical direction,

if |v_{1y} - v_{2y}| \ge Const \cdot |v_{1x} - v_{2x}|: \quad v'_{1y} = v_{1y}, \quad v'_{2y} = v_{2y}
else: \quad v'_{1y} = v'_{2y} = (m_1 v_{1y} + m_2 v_{2y}) / (m_1 + m_2) \qquad (6)
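The solid-collision update of formulas (5) and (6) can be sketched in C++ as follows; the values of the bounce coefficient and of the sliding threshold are illustrative assumptions:

#include <cmath>
#include <cstdio>

struct Body { double m, vx, vy; };  // mass and velocity of a rigid object

const double MU    = 0.2;  // bounce coefficient, assumed
const double CONST = 1.0;  // sliding threshold of formula (6), assumed

// Resolve a solid collision between two rigid objects, formulas (5) and (6).
void solidCollision(Body& a, Body& b) {
    // horizontal direction: common velocity plus a bounce term
    double vbar = (a.m * a.vx + b.m * b.vx) / (a.m + b.m);
    double ax = vbar + MU * (vbar - a.vx);
    double bx = vbar + MU * (vbar - b.vx);
    // vertical direction: slide if the vertical difference dominates,
    // otherwise move together with the momentum-conserving velocity
    if (std::fabs(a.vy - b.vy) < CONST * std::fabs(a.vx - b.vx)) {
        double vy = (a.m * a.vy + b.m * b.vy) / (a.m + b.m);
        a.vy = vy;
        b.vy = vy;
    }
    a.vx = ax;
    b.vx = bx;
}

int main() {
    Body moto{150.0, 20.0, 5.0}, box{1.2, 0.0, 0.0};
    solidCollision(moto, box);
    printf("moto: %.2f %.2f   box: %.2f %.2f\n", moto.vx, moto.vy, box.vx, box.vy);
}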
Based on the analysis above, we designed a program in Visual C++ to simulate the whole process. The program consists of two parts: (1) tracing and calculating the positions and velocities of the boxes and of the motorcycle; (2) graphic visualization.

Results of Model I:

1. The best size of boxes.

According to the Chinese standards for corrugated boxes, the surface density of box board is 0.68 kg/m^2, so the mass of a cubic box of side length a is

m = 6 \times 0.68 \times a^2 \qquad (7)

The energy absorbed when a box begins to collapse grows with a. We simulated the whole process with boxes of different sizes and recorded the deceleration of the man & motorcycle.

When the boxes are small (a = 15 cm), the vertical deceleration is very fast, so only the upper part of the stack is affected by the man & motorcycle; the other boxes are untouched and contribute nothing to the deceleration. Fig.4 illustrates the process.

Fig.4 15 cm boxes cushion the man & motorcycle

In this figure, the large box represents the man & motorcycle, which weighs 150 kg and has jumped from a height of 6 m (over an elephant) with a horizontal speed of 20 m/s (from right to left). At the moment shown, the vertical speed has been reduced to 4 m/s, but the horizontal speed is still 15 m/s and cannot be reduced below 12 m/s before the man & motorcycle lands, which is very dangerous.

When the boxes are big (a = 50 cm), the vertical deceleration is not enough to provide a safe landing. Fig.5 illustrates the process.

Fig.5 50 cm boxes cushion the man & motorcycle

Here the same man & motorcycle jumps from the same height with the same speed as in Fig.4, but the box size is changed from 15 cm to 50 cm. At the moment shown, the horizontal speed has been reduced to 9 m/s, but the vertical speed is as large as 6 m/s; the man & motorcycle strikes the ground with a vertical speed equal to a fall from a height of 2 m, which is not safe.

We simulated various box sizes and found that 30 cm boxes provide the best landing: 8 m/s horizontal and 3 m/s vertical speed. A horizontal speed of 8 m/s is comparable to jumping off a running bicycle, which is quite safe. The whole process is illustrated in Fig.6.

Fig.6 30 cm boxes cushion the man & motorcycle

The decelerating process is as follows:

Time (s)    Horizontal speed (m/s)   Vertical speed (m/s)   State
0.100250    20.000000                0.959881               Free fall
0.200500    20.000000                1.948995
0.300750    20.000000                2.936112
0.401000    20.000000                3.923900
0.501000    20.000000                4.902236
0.601000    20.000000                5.884290
0.701000    20.000000                6.868512
0.801000    20.000000                7.847865
0.901056    19.747498                8.465098               Plunge into the boxes
1.001070    17.868766                7.035899
1.101088    16.061376                6.159321
1.201099    14.805625                5.939486
1.301102    13.972497                4.851302
1.401119    12.883934                4.626583
1.501120    12.268939                3.938590
1.601121    11.454120                3.371359
1.701121    11.008360                3.519011
1.801122    10.906639                1.552844
1.901125    10.310166                1.631726
2.001126    10.281974                2.417273
2.101131    10.272263                0.404983
2.201131    10.251879                0.846761

List.1 Deceleration process of the man & motorcycle

The deceleration is very smooth; the boxes cushion the man & motorcycle very well. We conclude that 30 cm boxes provide the best all-round (horizontal and vertical) deceleration, so they are the boxes of choice.

2. How many boxes should be used?

(1) Height of the box stack. We simulated stacks of different heights; the vertical landing speed decreases as the stack height increases. The results show that a stack height of 5 boxes is enough to decelerate the man & motorcycle in the vertical direction, and increasing the height further does not provide significantly more safety.

(2) Length of the box stack. We simulated stacks of different lengths; the relation between the horizontal landing speed and the stack length is listed below:

Stack length (boxes)   Horizontal landing speed (m/s)
15                     12.549519
20                     10.557690
25                     9.248788
40                     7.696714
75                     4.733869

The horizontal landing speed decreases as the stack length increases. The list shows that at a stack length of 25 the speed is already safe for the man & motorcycle to land, and the improvement beyond 25 is only slight. So we chose 25 as the stack length.

(3) Width of the box stack. In the calculations above we assumed the stack to be wide enough. Taking the width of the motorcycle to be 80 cm, it can push 4 boxes simultaneously, so the stack must be at least 4 boxes wide.

(4) Further analysis. Based on the analysis above, we need 5 × 25 × 4 = 500 boxes. But this assumes that the man & motorcycle enters the stack exactly at the upper-right corner. In reality we should specify an area of possible landing points and stack the boxes to cover this area, as illustrated in Fig.7.

Fig.7 Area covered by boxes (viewed from above)

So the number of boxes actually needed is more than 500 and depends largely on the size of the possible landing area.

3. In the simulations above we stacked the boxes in regular rows with a 10 cm interval between rows. This interval should be chosen according to the ratio of the horizontal to the vertical speed of the motorcycle: it should increase when the ratio increases and decrease when it decreases, so as to fit the motion track of the man & motorcycle.

We now give a theoretical analysis of stacking schemes. The horizontal deceleration of the motorcycle is caused by its collisions with boxes in the horizontal direction. Suppose the motorcycle collides with n boxes during a time interval dt; let m_i be the mass of the i-th box, dv_i the change in velocity of the i-th box, and f the total force on the motorcycle. Then

f \cdot dt = \sum_i m_i \, dv_i
dv_i = (1 + \mu) \, v_{motor}, \quad \text{where } \mu \text{ is the elastic coefficient}
\sum_i m_i \, dv_i = (1 + \mu) \, v_{motor} \cdot v_{motor} \, dt \cdot S_{motor} \cdot \rho
f = (1 + \mu) \, v_{motor}^2 \cdot S_{motor} \cdot \rho

where S_{motor} is the frontal area of the motorcycle and \rho is the mass density of the stack. Hence f is proportional to v_{motor}^2, so f is largest when the motorcycle has just entered the stack. To make the motorcycle decelerate smoothly, we should place fewer boxes at the front of the stack.
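As a quick numerical illustration of f = (1 + \mu) v_{motor}^2 S_{motor} \rho, the following sketch tabulates the resist force; all parameter values are assumed for illustration:

#include <cstdio>

// Resist force on the motorcycle inside the stack, f = (1+mu) * v^2 * S * rho.
int main() {
    const double mu  = 0.2;   // elastic coefficient, assumed
    const double S   = 0.8;   // frontal area of the motorcycle (m^2), assumed
    const double rho = 15.0;  // mass density of the box stack (kg/m^3), assumed
    for (double v = 20.0; v >= 5.0; v -= 5.0) {
        double f = (1.0 + mu) * v * v * S * rho;
        printf("v = %5.1f m/s  ->  f = %7.1f N\n", v, f);
    }
    // f drops with v^2, which is why fewer boxes are needed at the front.
}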
4. Filling the boxes with soft materials increases the mass of a box without changing its other attributes very much. Doing so makes the horizontal deceleration faster and the horizontal landing speed lower. But if the mass of a box is too large, the deceleration becomes so great that the human body cannot endure it. In any case, unmodified boxes cushion the man & motorcycle well and cost less; only if the man & motorcycle becomes very heavy (or the motorcycle is replaced with a car) does filling the boxes to increase their weight become a must.

5. If the weight of the man & motorcycle or the jump height varies, more or fewer boxes are needed. Because each box absorbs the same amount of kinetic energy when it begins to collapse, the required stack height should be proportional to the potential energy of the man & motorcycle at the highest point of the jump. This is verified by simulating falls from different heights with different weights, which leads to the following formula:

N = 5 \times (W \cdot H) / (150\,\mathrm{kg} \times 6\,\mathrm{m}) \qquad (8)

where N is the required stack height (in boxes), W is the combined weight of the man & motorcycle, and H is the jump height. The scaling follows from our earlier result that a 150 kg man & motorcycle combination needs a stack height of 5.

If the weight of the man & motorcycle increases greatly, the boxes should be filled to increase their mass, so that they decelerate the man & motorcycle efficiently. The mass of a box should be proportional to the combined weight of the man & motorcycle in order to provide exactly the same deceleration effect:

w = w_0 \cdot W / 150\,\mathrm{kg} \qquad (9)

where w is the weight of a filled box, w_0 is the weight of an empty box, and W is the combined weight of the man & motorcycle. This is because empty boxes decelerate a 150 kg man & motorcycle combination effectively.
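A small worked example of formulas (8) and (9); rounding the stack height up to whole boxes is our own choice, and the input values are hypothetical:

#include <cmath>
#include <cstdio>

// Required stack height (in boxes) from formula (8), rounded up.
int stackHeight(double weightKg, double jumpHeightM) {
    return (int)std::ceil(5.0 * weightKg * jumpHeightM / (150.0 * 6.0));
}

// Weight of a filled box from formula (9).
double filledBoxWeight(double emptyBoxKg, double weightKg) {
    return emptyBoxKg * weightKg / 150.0;
}

int main() {
    // e.g. a 300 kg combination jumping from 4 m (hypothetical inputs)
    printf("stack height: %d boxes\n", stackHeight(300.0, 4.0));     // 7 boxes
    printf("filled box:   %.2f kg\n", filledBoxWeight(1.2, 300.0));  // 2.40 kg
}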
Model II:

In fact there are several rows of boxes at the same altitude, and boxes at the same altitude press one another; in the following model it is therefore necessary to study how the boxes interact with each other.

When a moving object (a cardboard box, or the stunt person & motorcycle as a whole) is not heading straight ahead, it presses the boxes at its lateral sides into lateral movement. This compression may be very slight. Experiments show that when the compression is slight, the relation between the resist force and the compression of a cardboard box can be represented by the curve in Fig.8.

Fig.8 The relation of the resist force and the compression of a cardboard box when the compression is slight (from /industry/wd_02.asp)

There are several formulas for calculating the resist force: the K.O. Kellicutt formula, the Maltenfort formula, the Wolf formula, the McKee formula and so on. But all of these formulas involve many indeterminate factors. To avoid a complex calculation of the process itself, we study only its effect: every box 'close' to the moving object receives an impulse. But what does 'close' mean, and how are the direction and the magnitude of the impulse determined?

In the model below, the stunt person & motorcycle as a whole is modeled as a sphere. According to the analysis above, it sets its lateral boxes in motion even when they do not touch, so its diameter R is set a little larger than its actual width. The boxes are also treated as spheres, whose diameter r is determined from the compression characteristics of a box (Fig.1). The whole process is assumed to take place in a horizontal plane.

When two spheres meet, they collide with each other (Fig.9), and their new velocities obey at least three laws:

Law I: the normal components (the y* direction) of the velocities satisfy

v'_{1y*} - v'_{2y*} = -\mu (v_{1y*} - v_{2y*}) \qquad (10)

where \mu is the bounce coefficient.

Law II: momentum is conserved along the y* direction:

m_1 v'_{1y*} + m_2 v'_{2y*} = m_1 v_{1y*} + m_2 v_{2y*} \qquad (11)

Law III: the shear components (the x* direction) of the velocities do not change.

Fig.9 The two balls collide with each other

The components of the velocities in the x*-y* reference frame are

\alpha = \arctan \frac{x_1 - x_2}{y_1 - y_2}
v_{1x*} = v_{1x} \cos\alpha + v_{1y} \sin\alpha
v_{1y*} = -v_{1x} \sin\alpha + v_{1y} \cos\alpha
v_{2x*} = v_{2x} \cos\alpha + v_{2y} \sin\alpha
v_{2y*} = -v_{2x} \sin\alpha + v_{2y} \cos\alpha \qquad (12)

After the collision, the new components are determined by formulas (10) and (11):

\bar v = (m_1 v_{1y*} + m_2 v_{2y*}) / (m_1 + m_2)
v'_{1y*} = \bar v + \mu (\bar v - v_{1y*})
v'_{1x*} = v_{1x*}
v'_{2y*} = \bar v + \mu (\bar v - v_{2y*})
v'_{2x*} = v_{2x*} \qquad (13)

The new velocity vectors are then recovered from the x*-y* components:

v'_{1x} = v'_{1x*} \cos\alpha - v'_{1y*} \sin\alpha
v'_{1y} = v'_{1x*} \sin\alpha + v'_{1y*} \cos\alpha
v'_{2x} = v'_{2x*} \cos\alpha - v'_{2y*} \sin\alpha
v'_{2y} = v'_{2x*} \sin\alpha + v'_{2y*} \cos\alpha \qquad (14)

In our computer simulation, time is divided into intervals of 0.005 s. The position and velocity vector of each object are traced and refreshed at every time step: after each interval we compute the new position of an object from its last position and velocity, then examine whether it collides with another object and compute the new velocity vectors if a collision happens. The program is written in Matlab.
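Our Model II program is written in Matlab; purely for illustration, the same rotated-frame update of formulas (12)-(14) can be sketched in C++ as follows, with an assumed bounce coefficient and hypothetical inputs:

#include <cmath>
#include <cstdio>

struct Ball { double x, y, m, vx, vy; };

const double MU = 0.2;  // bounce coefficient, assumed

// Resolve a sphere-sphere collision with formulas (12)-(14):
// rotate into the x*-y* frame, apply restitution along y*, rotate back.
void collide(Ball& a, Ball& b) {
    double alpha = std::atan2(a.x - b.x, a.y - b.y);  // formula (12)
    double c = std::cos(alpha), s = std::sin(alpha);
    // components in the x*-y* frame
    double a_xs =  a.vx * c + a.vy * s, a_ys = -a.vx * s + a.vy * c;
    double b_xs =  b.vx * c + b.vy * s, b_ys = -b.vx * s + b.vy * c;
    // formula (13): y* (normal) components change, x* (shear) unchanged
    double vbar = (a.m * a_ys + b.m * b_ys) / (a.m + b.m);
    double a_ys2 = vbar + MU * (vbar - a_ys);
    double b_ys2 = vbar + MU * (vbar - b_ys);
    // formula (14): rotate back to the x-y frame
    a.vx = a_xs * c - a_ys2 * s;  a.vy = a_xs * s + a_ys2 * c;
    b.vx = b_xs * c - b_ys2 * s;  b.vy = b_xs * s + b_ys2 * c;
}

int main() {
    Ball moto{0.0, 0.0, 150.0, 5.0, 5.0}, box{0.3, 0.3, 1.2, 0.0, 0.0};
    collide(moto, box);
    printf("moto: (%.2f, %.2f)   box: (%.2f, %.2f)\n",
           moto.vx, moto.vy, box.vx, box.vy);
}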
Results of Model II:

We take a set of boxes with all parameters fixed and simulate the process; Fig.10 demonstrates it (the motorcycle is represented as a flat, the boxes as small balls).

Fig.10 The process of the deceleration of the motorcycle using Model II

At the same time, the program records the acceleration of the motorcycle in the two perpendicular directions, shown in Fig.11.

Fig.11 The two accelerations as time goes on
Red curve: the acceleration in the lateral direction
Blue curve: the acceleration in the moving direction

Strengths and Weaknesses of the Models

Model I:

Strengths
(1) The box is treated as an object with three possible states. This is a reasonable simplification, consistent with experimental results on cardboard boxes, and it greatly accelerates the computation.
(2) The model involves 3-D simulation. We used an improved collision-detection algorithm, which reduces the time cost to about 30 seconds for a simulation of the complete process.
(3) The model is applicable to different parameter values, such as combined weights and jump heights.
(4) The model is versatile: it employs a discrete simulation method that can easily be adapted to other applications.
(5) Individual parameters of the process, such as the size and the number of the boxes, can be studied separately.

Weaknesses
(1) The rotation of the boxes is not considered. The rotational energy of the boxes would come out of the initial kinetic energy of the motorcycle, so we can expect the motorcycle to actually decelerate faster than our calculation predicts.
(2) Collisions involving more than two boxes at once are not considered.
(3) The boxes are treated as exact cuboids, but in reality they may change shape under the impact of the motorcycle. Neglecting this effect introduces some inaccuracy.

Further Development

Since rotation absorbs kinetic energy, which contributes to the total energy balance, the rotation of the objects could be added in further work.

The change in the shape of the boxes is not considered in our models. To address it, we could decompose each box into many parts, but this would substantially increase the computational cost.

In our models we always assume that no three objects collide at exactly the same time. This is not true for complex objects such as boxes: interactions between boxes take time, and such time intervals quite often overlap. If we treat a collision as a process rather than an instantaneous event, collisions of many objects can be added to the model.

Model II:

All the objects are treated as incompressible spheres, which makes it easy to refresh the positions and velocities of the objects when they collide with each other. Although the simulation does not involve gravity, it is applicable whenever gravity can be neglected. The greatest strength of this model is that it shows how a collision takes place when the objects do not meet center-to-center. But the model is quite simple and neglects many factors; although each of these factors alone has little effect on the result, together they have a great effect.

Comparison of Model I and Model II:

Model I treats the objects as cuboids while Model II treats them as spheres. In some sense, spheres represent the boxes better than cuboids do. In Model I the objects remain cuboids even when compressed, and they can only collide in the horizontal or vertical direction; in reality they can rotate and change shape, an effect that is especially apparent for small boxes. Model II therefore provides a platform on which different box shapes can be tested. Though we did not test many modifications of the box shape, it is already clear that spheres and cuboids produce different results; for example, in Model II more boxes near the entry point of the motorcycle are set in motion by the impact than in Model I.

Another difference is that Model II is a 2-D model and does not involve gravity. It is suitable for studying the nature of the stack of boxes, which can be viewed as a kind of granular matter. We suggest using a 2-D model of this kind to learn about the dynamic properties of the boxes first; this saves time and makes it easier to establish 3-D models.