Constructing Diverse Classifier Ensembles using Artificial Training Examples
Gaokao English First-Round General Review: Grammar Topic Breakthroughs, Topic 3: Predicate Verbs
6. (2021, Zhejiang paper) The little home was painted (paint) white. It was sweet and fresh. Mary loved it.
7. (2021, National Paper A) We hired (hire) our bikes from the rental place at the
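[Usage notes] 1. Formation of the future-in-the-past tense. Active voice: would + base form of the verb. Passive voice: would be + past participle. 2. Main uses of the future-in-the-past tense. The future-in-the-past tense expresses an action or state that, seen from a point in the past, was expected to happen at some later time.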
I thought that Jack was going to write a letter to his father.
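Topic 3: Predicate Verbs
Build Competence: Targeted Breakthroughs / Check Results: In-Class Assessment
Build Competence: Targeted Breakthroughs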
1. (2023, National Paper B) The remarkable development of this city, which is consciously designed to protect the past while stepping into the modern world,
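It only indicates that an action was in progress at a certain point in the past; what the result was is unknown.
Whenever an explicit past time is given, use the simple past rather than the perfect, for example with ago, last year, just now, the other day.
The emphasis is on the action having happened in the "past"; it has no connection with the present.
The emphasis is on the effect and result for the "present"; the action has just been completed or is still continuing.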
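Sophisticated Classifiers - Reply
What Are Sophisticated Classifiers
Sophisticated Classifiers (complex classifiers) are a technique widely used in machine learning to assign data points to different classes.
Sophisticated Classifiers are widely applied in many fields, such as natural language processing, image recognition, and financial forecasting.
Common Sophisticated Classifiers Algorithms
There are many common Sophisticated Classifiers algorithms; which one to choose depends on the characteristics of the application and the data.
Below are several common algorithms:
1. Support Vector Machines (SVM): SVM is a supervised learning algorithm based on statistical learning theory that can be used for binary or multi-class classification problems.
2. Random Forests: a random forest is an ensemble algorithm based on decision trees.
3. Deep Learning Neural Networks: a deep learning neural network is a complex classifier built on a neural network architecture.
4. Gradient Boosting: gradient boosting is an iterative ensemble learning algorithm.
5. Convolutional Neural Networks (CNN): a CNN is a neural network particularly suited to image processing and recognition.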
English Essay: Things I Yearn to Understand

The world is an intricate tapestry woven with threads of knowledge, both known and unknown. While I find myself fascinated by the vast amount of information we’ve accumulated as a species, I am acutely aware of the vast, uncharted territories of understanding that lie before me. There are several key areas that spark a deep curiosity within me, areas I yearn to explore and grasp with greater clarity.

Firstly, I am captivated by the complex workings of the human mind. The brain, a three-pound universe contained within our skulls, is a marvel of intricate networks and electrochemical signals that give rise to consciousness, emotion, and behavior. How do neurons fire in symphony to create our perceptions of the world? What are the mechanisms behind memory formation and retrieval? How does our unique blend of genetics and environment shape our personalities and predispositions? Unraveling the mysteries of the mind holds the key to understanding the very essence of what makes us human.

The vast universe, with its swirling galaxies, enigmatic black holes, and the tantalizing possibility of life beyond Earth, also ignites my imagination. I long to understand the fundamental laws that govern the cosmos, from the delicate dance of subatomic particles to the majestic movements of celestial bodies. What is the true nature of dark matter and dark energy, the unseen forces shaping the universe's evolution? Are we alone in this vast cosmic expanse, or does life, in all its wondrous forms, exist elsewhere? The pursuit of answers to these questions is a quest to understand our place in the grand scheme of existence.

Closer to home, the interconnected web of life on our planet fascinates me. The intricate ecosystems teeming with biodiversity, the delicate balance of predator and prey, the intricate cycles of energy and nutrients - these are all testament to the awe-inspiring power of evolution and adaptation.
I yearn to understand the complex interactions within these ecosystems, the delicate balance that sustains them, and the impact of human activities on this delicate web. Understanding these complexities is crucial for our responsible stewardship of the planet and the preservation of its irreplaceable biodiversity.

Furthermore, I am drawn to the intricacies of human history and its impact on our present reality. From the rise and fall of civilizations to the struggles for freedom and equality, history offers a lens through which we can examine the triumphs and failures of humankind. I crave a deeper understanding of the forces that have shaped our social, political, and economic systems, the ideologies that have fueled conflicts and cooperation, and the enduring legacies of past events. By studying history, we can learn from our ancestors' mistakes and successes, equipping ourselves to navigate the challenges of the present and build a better future.

The ever-evolving world of technology, with its rapid advancements in artificial intelligence, biotechnology, and space exploration, also holds a powerful allure. I am driven to understand the principles behind these innovations, their potential to address global challenges, and the ethical implications that accompany them. How can we harness the power of artificial intelligence for the betterment of society while mitigating potential risks? What are the ethical considerations surrounding genetic engineering and its impact on future generations? How can space exploration contribute to scientific advancements and inspire future generations? Exploring these frontiers of technology is essential for shaping a future where innovation serves humanity and the planet.

Finally, I yearn to understand the very essence of creativity and its power to inspire, challenge, and transform.
From the evocative brushstrokes of a painter to the soaring melodies of a composer, creativity speaks a universal language that transcends cultural boundaries. What are the cognitive processes that underpin artistic expression? How does creativity foster innovation and problem-solving across disciplines? How can we nurture and cultivate our own creative potential to contribute to the world in meaningful ways? Understanding the nature of creativity is key to unlocking our own potential and enriching the human experience.

In conclusion, the pursuit of knowledge is a lifelong journey, an insatiable thirst for understanding that fuels my curiosity and motivates my exploration. From the inner workings of the human mind to the vast expanses of the cosmos, from the intricate web of life on Earth to the enduring legacies of human history, from the frontiers of technology to the power of creative expression - these are the areas I yearn to understand with greater depth and clarity. This quest for knowledge is not merely an academic pursuit but a fundamental aspect of what makes us human - the desire to learn, grow, and contribute to the betterment of ourselves and the world around us.
Proceedings of the IJCAI-2003, pp. 505-510, Acapulco, Mexico, August 2003

Constructing Diverse Classifier Ensembles using Artificial Training Examples

Prem Melville and Raymond J. Mooney
Department of Computer Sciences
University of Texas
1 University Station, C0500
Austin, TX 78712
melville@, mooney@

Abstract

Ensemble methods like bagging and boosting that combine the decisions of multiple hypotheses are some of the strongest existing machine learning methods. The diversity of the members of an ensemble is known to be an important factor in determining its generalization error. This paper presents a new method for generating ensembles that directly constructs diverse hypotheses using additional artificially-constructed training examples. The technique is a simple, general meta-learner that can use any strong learner as a base classifier to build diverse committees. Experimental results using decision-tree induction as a base learner demonstrate that this approach consistently achieves higher predictive accuracy than both the base classifier and bagging (whereas boosting can occasionally decrease accuracy), and also obtains higher accuracy than boosting early in the learning curve when training data is limited.

1 Introduction

One of the major advances in inductive learning in the past decade was the development of ensemble or committee approaches that learn and retain multiple hypotheses and combine their decisions during classification [Dietterich, 2000]. For example, boosting [Freund and Schapire, 1996], an ensemble method that learns a series of “weak” classifiers, each one focusing on correcting the errors made by the previous one, has been found to be one of the currently best generic inductive classification methods [Hastie et al., 2001].
Constructing a diverse committee in which each hypothesis is as different as possible (decorrelated with other members of the ensemble) while still maintaining consistency with the training data is known to be a theoretically important property of a good committee [Krogh and Vedelsby, 1995]. Although all successful ensemble methods encourage diversity to some extent, few have focused directly on the goal of maximizing diversity. Existing methods that focus on achieving diversity [Opitz and Shavlik, 1996; Rosen, 1996] are fairly complex and are not general meta-learners like bagging [Breiman, 1996] and boosting that can be applied to any base learner to produce an effective committee [Witten and Frank, 1999].

We present a new meta-learner (DECORATE, Diverse Ensemble Creation by Oppositional Relabeling of Artificial Training Examples) that uses an existing “strong” learner (one that provides high accuracy on the training data) to build an effective diverse committee in a fairly simple, straightforward manner. This is accomplished by adding different randomly constructed examples to the training set when building new committee members. These artificially constructed examples are given category labels that disagree with the current decision of the committee, thereby easily and directly increasing diversity when a new classifier is trained on the augmented data and added to the committee.

Boosting and bagging provide diversity by sub-sampling or re-weighting the existing training examples. If the training set is small, this limits the amount of ensemble diversity that these methods can obtain. DECORATE ensures diversity on an arbitrarily large set of additional artificial examples.
Therefore, one hypothesis is that it will result in higher generalization accuracy when the training set is small. This paper presents experimental results on a wide range of UCI data sets comparing boosting, bagging, and DECORATE, all using J48 decision-tree induction (a Java implementation of C4.5 [Quinlan, 1993] introduced in [Witten and Frank, 1999]) as a base learner. Cross-validated learning curves support the hypothesis that “DECORATEd trees” generally result in greater classification accuracy for small training sets.

2 Ensembles and Diversity

In an ensemble, the combination of the output of several classifiers is only useful if they disagree on some inputs [Krogh and Vedelsby, 1995]. We refer to the measure of disagreement as the diversity of the ensemble. There have been several methods proposed to measure ensemble diversity [Kuncheva and Whitaker, 2002], usually dependent on the measure of accuracy. For regression, where the mean squared error is commonly used to measure accuracy, variance can be used as a measure of diversity. So the diversity of the $i$-th classifier on example $x$ can be defined as $d_i(x) = [C_i(x) - C^*(x)]^2$, where $C_i(x)$ and $C^*(x)$ are the predictions of the $i$-th classifier and the ensemble respectively. For this setting Krogh et al. [1995] show that the generalization error, $E$, of the ensemble can be expressed as $E = \bar{E} - \bar{D}$, where $\bar{E}$ and $\bar{D}$ are the mean error and diversity of the ensemble respectively.

For classification problems, where the 0/1 loss function is most commonly used to measure accuracy, the diversity of the $i$-th classifier can be defined as:

$$d_i(x) = \begin{cases} 0 & \text{if } C_i(x) = C^*(x) \\ 1 & \text{otherwise} \end{cases} \qquad (1)$$

However, in this case the above simple linear relationship does not hold between $E$, $\bar{E}$ and $\bar{D}$. But there is still strong reason to believe that increasing diversity should decrease ensemble error [Zenobi and Cunningham, 2001]. The underlying principle of our approach is to build ensembles of classifiers that are consistent with the training data and maximize diversity as defined in (1).

3 DECORATE: Algorithm Definition

In DECORATE (see Algorithm 1), an ensemble is generated iteratively, learning a classifier
at each iteration and adding it to the current ensemble. We initialize the ensemble to contain the classifier trained on the given training data. The classifiers in each successive iteration are trained on the original training data and also on some artificial data. In each iteration artificial training examples are generated from the data distribution, where the number of examples to be generated is specified as a fraction, $R_{size}$, of the training set size. The labels for these artificially generated training examples are chosen so as to differ maximally from the current ensemble’s predictions. The construction of the artificial data is explained in greater detail in the following section. We refer to the labeled artificially generated training set as the diversity data. We train a new classifier on the union of the original training data and the diversity data. If adding this new classifier to the current ensemble increases the ensemble training error, then we reject this classifier, else we add it to the current ensemble. This process is repeated until we reach the desired committee size or exceed the maximum number of iterations.

To classify an unlabeled example, $x$, we employ the following method. Each base classifier, $C_i$, in the ensemble $C^*$ provides probabilities for the class membership of $x$. If $P_{C_i,y}(x)$ is the probability of example $x$ belonging to class $y$ according to the classifier $C_i$, then we compute the class membership probabilities for the entire ensemble as:

$$P_y(x) = \frac{\sum_{C_i \in C^*} P_{C_i,y}(x)}{|C^*|}$$

Algorithm 1 The DECORATE algorithm
(Here $T$ is the training set of $m$ examples $(x_j, y_j)$, $C_{size}$ is the desired ensemble size, and $I_{max}$ is the maximum number of iterations.)
1. $i = 1$
2. $trials = 1$
3. $C_i = \mathrm{BaseLearn}(T)$
4. Initialize ensemble, $C^* = \{C_i\}$
5. Compute ensemble error, $\epsilon = \frac{1}{m} \sum_{x_j \in T : C^*(x_j) \neq y_j} 1$
6. While $i < C_{size}$ and $trials < I_{max}$
7.   Generate $R_{size} \times |T|$ training examples, $R$, based on distribution of training data
8.   Label examples in $R$ with probability of class labels inversely proportional to $C^*$’s predictions
9.   $T = T \cup R$
10.  $C' = \mathrm{BaseLearn}(T)$
11.  $C^* = C^* \cup \{C'\}$
12.  $T = T - R$, remove the artificial data
13.  Compute training error, $\epsilon'$, of $C^*$ as in step 5
14.  If $\epsilon' \leq \epsilon$
15.    $i = i + 1$
16.    $\epsilon = \epsilon'$
17.  otherwise,
18.    $C^* = C^* - \{C'\}$
19.  $trials = trials + 1$

... Quinlan, 1996]. We compared the performance of DECORATE to that of AdaBoost, Bagging and J48, using J48 as the base learner for the ensemble methods and using the Weka implementations of these methods [Witten and Frank, 1999].
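The loop described in Section 3 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the Weka implementation used in the experiments. The 1-nearest-neighbour base learner, the per-feature Gaussian sampling of artificial points, the `eps` floor used when inverting probabilities, and all function names are choices made here for illustration. The defaults for `c_size`, `i_max` and `r_size` mirror the settings reported in the methodology (15, 50 and 1).

```python
import math
import random


def learn_1nn(data):
    """Hypothetical 'strong' base learner: 1-nearest-neighbour.
    Returns a function mapping x to a dict of class probabilities."""
    data = list(data)  # private copy, so later changes to the caller's list do not matter
    labels = sorted({y for _, y in data})

    def predict_proba(x):
        _, nearest_label = min(data, key=lambda ex: math.dist(ex[0], x))
        return {y: 1.0 if y == nearest_label else 0.0 for y in labels}

    return predict_proba


def ensemble_proba(ensemble, x, classes):
    """Average the members' class-membership probabilities for x."""
    outputs = [member(x) for member in ensemble]
    return {y: sum(o.get(y, 0.0) for o in outputs) / len(outputs) for y in classes}


def ensemble_predict(ensemble, x, classes):
    probs = ensemble_proba(ensemble, x, classes)
    return max(classes, key=lambda y: probs[y])


def training_error(ensemble, data, classes):
    wrong = sum(ensemble_predict(ensemble, x, classes) != y for x, y in data)
    return wrong / len(data)


def generate_artificial(data, n, rng):
    """Assumption: draw each numeric feature from a Gaussian fitted to the training data."""
    columns = list(zip(*(x for x, _ in data)))
    stats = []
    for col in columns:
        mu = sum(col) / len(col)
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / len(col))
        stats.append((mu, sd))
    return [tuple(rng.gauss(mu, sd) for mu, sd in stats) for _ in range(n)]


def oppositional_label(probs, rng, eps=1e-3):
    """Draw a label with probability inversely proportional to the ensemble's prediction."""
    classes = list(probs)
    weights = [1.0 / max(probs[y], eps) for y in classes]  # eps avoids division by zero
    return rng.choices(classes, weights=weights, k=1)[0]


def decorate(train, c_size=15, i_max=50, r_size=1.0, base_learn=learn_1nn, seed=0):
    rng = random.Random(seed)
    classes = sorted({y for _, y in train})
    ensemble = [base_learn(train)]                      # first member: trained on real data only
    err = training_error(ensemble, train, classes)
    trials = 1
    while len(ensemble) < c_size and trials < i_max:
        points = generate_artificial(train, max(1, round(r_size * len(train))), rng)
        diversity_data = [
            (x, oppositional_label(ensemble_proba(ensemble, x, classes), rng))
            for x in points
        ]
        ensemble.append(base_learn(train + diversity_data))   # train on real + diversity data
        new_err = training_error(ensemble, train, classes)
        if new_err <= err:
            err = new_err          # keep the new member
        else:
            ensemble.pop()         # reject: it increased the ensemble training error
        trials += 1
    return ensemble, classes


toy = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
members, labels = decorate(toy, c_size=5, i_max=20)
print(len(members), ensemble_predict(members, (0.05, 0.1), labels))
```

Because the candidate is rejected whenever it raises the ensemble's training error, the training error of the returned committee never exceeds that of the first member, whichever base learner is plugged in.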
For the ensemble methods, we set the ensemble size to 15. Note that in the case of DECORATE, we only specify a maximum ensemble size; the algorithm terminates if the number of iterations exceeds the maximum limit even if the desired ensemble size is not reached. For our experiments, we set the maximum number of iterations in DECORATE to 50. We ran experiments varying the amount of artificially generated data, $R_{size}$, and found that the results do not vary much for the range 0.5 to 1. However, values lower than 0.5 do adversely affect DECORATE, because there is insufficient artificial data to give rise to high diversity. The results we report are for $R_{size}$ set to 1, i.e. the number of artificially generated examples is equal to the training set size.

The performance of each learning algorithm was evaluated using 10 complete 10-fold cross-validations. In each 10-fold cross-validation each data set is randomly split into 10 equal-size segments and results are averaged over 10 trials. For each trial, one segment is set aside for testing, while the remaining data is available for training. To test performance on varying amounts of training data, learning curves were generated by testing the system after training on increasing subsets of the overall training data. Since we would like to summarize results over several data sets of different sizes, we select different percentages of the total training-set size as the points on the learning curve.

To compare two learning algorithms across all domains we employ the statistics used in [Webb, 2000], namely the win/draw/loss record and the geometric mean error ratio. The win/draw/loss record presents three values, the number of data sets for which algorithm $A$ obtained better, equal, or worse performance than algorithm $B$ with respect to classification accuracy. We also report the statistically significant win/draw/loss record, where a win or loss is only counted if the difference in values is determined to be significant at the 0.05 level by a paired t-test. The geometric mean error ratio
is defined as $\sqrt[n]{\prod_{i=1}^{n} \frac{E_A}{E_B}}$, where $E_A$ and $E_B$ are the mean errors of algorithm $A$ and $B$ on the same domain. If the geometric mean error ratio is less than one it implies that algorithm $A$ performs better than $B$, and vice versa. We compute error ratios so as to capture the degree to which algorithms out-perform each other in win or loss outcomes.

4.2 Results

Our results are summarized in Tables 1-3. Each cell in the tables presents the accuracy of DECORATE versus another algorithm. If the difference is statistically significant, then the larger of the two is shown in bold. We varied the training set sizes from 1-100% of the total available data, with more points lower on the learning curve since this is where we expect to see the most difference between algorithms. The bottom of each table provides summary statistics, as discussed above, for each of the points on the learning curve.

DECORATE has more significant wins than losses over Bagging for all points along the learning curve (see Table 2). DECORATE also outperforms Bagging on the geometric mean ratio. This suggests that even in cases where Bagging beats DECORATE the improvement is less than DECORATE’s improvement on Bagging on the rest of the cases.

DECORATE outperforms AdaBoost early on the learning curve both on significant win/draw/loss record and geometric mean ratio; however, the trend is reversed when given 75% or more of the data. Note that even with large amounts of training data, DECORATE’s performance is quite competitive with AdaBoost: given 100% of the data DECORATE produces higher accuracies on 6 out of 15 data sets.

It has been observed in previous studies [Webb, 2000; Bauer and Kohavi, 1999] that while AdaBoost usually significantly reduces the error of the base learner, it occasionally increases it, often to a large extent. DECORATE does not have this problem, as is clear from Table 1.

On many data sets, DECORATE achieves the same or higher accuracy as Bagging and AdaBoost with many fewer training examples. Figure 1 shows learning curves that clearly demonstrate this
point. Hence, in domains where little data is available or acquiring labels is expensive, DECORATE has an advantage over other ensemble methods.

We performed additional experiments to analyze the role that diversity plays in error reduction. We ran DECORATE at 10 different settings of $R_{size}$ ranging from 0.1 to 1.0, thus varying the diversity of ensembles produced. We then compared the diversity of ensembles with the reduction in generalization error. Diversity of an ensemble is computed as the mean diversity of the ensemble members (as given by Eq. 1). We compared ensemble diversity with the ensemble error reduction, i.e. the difference between the average error of the ensemble members and the error of the entire ensemble (as in [Cunningham and Carney, 2000]). We found that the correlation coefficient between diversity and ensemble error reduction is 0.6225 (1), which is fairly strong. Furthermore, we compared diversity with the base error reduction, i.e. the difference between the error of the base classifier and the ensemble error. The base error reduction gives a better indication of the improvement in performance of an ensemble over the base classifier. The correlation of diversity versus the base error reduction is 0.1552 (). We note that even though this correlation is weak, it is still a statistically significant positive correlation. These results reinforce our belief that increasing ensemble diversity is a good approach to reducing generalization error.

To determine how the performance of DECORATE changes with ensemble size, we ran experiments with increasing sizes. We compared results for training on 20% of available data, since the advantage of DECORATE is most noticeable low on the learning curve. Due to lack of space, we do not include the results for all 15 datasets, but present five representative datasets (see Figure 2). The performance on other datasets is similar. We note, in general, that the accuracy of DECORATE increases with ensemble size, though on most datasets the performance levels out
with an ensemble size of 10 to 25.

[Figure 1: DECORATE compared to AdaBoost and Bagging. Panels: BREAST-W, IRIS, LABOR, HEART-C. x-axis: Number of Training Examples; y-axis: Percent correct.]

[Figure 2: DECORATE at different ensemble sizes. x-axis: Ensemble Size; y-axis: Percent correct.]

5 Related Work

There have been some other attempts at building ensembles that focus on the issue of diversity. Liu et al. [1999] and Rosen [1996] simultaneously train neural networks in an ensemble using a correlation penalty term in their error functions. Opitz and Shavlik [1996] use a genetic algorithm to search for a good ensemble of networks. To guide the search they use an objective function that incorporates both an accuracy and diversity term. Zenobi et al. [2001] build ensembles based on different feature subsets, where feature selection is done using a hill-climbing strategy based on classifier error and diversity. A classifier is rejected if the improvement of one of the metrics leads to a “substantial” deterioration of the other, where “substantial” is defined by a pre-set threshold.

In all these approaches, ensembles are built attempting to simultaneously optimize the accuracy and diversity of individual ensemble members. However, in DECORATE, our goal is to minimize ensemble error by increasing diversity. At no point does the training accuracy of the ensemble go below that of the base classifier; however, this is a possibility with previous methods. Furthermore, none of the previous studies compared their methods with the standard ensemble approaches such as Boosting and Bagging ([Opitz and Shavlik, 1996] compares with Bagging, but not Boosting).

Table 1: DECORATE vs J48
Data set       1%           2%           5%           10%          20%          30%          40%          50%          75%          100%
anneal         16.66/16.66  23.73/23.07  41.72/41.17  55.42/51.67  64.09/60.59  67.62/64.84  70.46/68.11  72.82/70.77  77.8/75.15   82.1/77.22
autos          92.38/74.73  94.12/87.34  95.06/89.42  95.64/92.21  95.55/93.09  95.91/93.36  96.2/93.85   96.01/94.24  96.28/94.65  96.31/95.01
credit-a       31.69/31.69  35.86/32.96  44.5/38.34   55.4/46.62   61.77/54.16  66.01/60.63  68.07/61.38  68.85/63.69  72.73/67.53  72.77/67.77
heart-c        52.33/52.33  71.95/65.93  76.59/72.75  78.85/78.25  80.28/78.61  81.14/78.63  81.53/79.35  81.68/79.57  82.37/79.04  82.43/79.22
colic          33.33/33.33  50.87/33.33  80.67/59.33  91.27/84.33  93.07/91.33  94.4/92.73   95.07/93     94.07/93.33  94.67/94.07  94.93/94.73
labor          48.39/48.39  53.49/46.64  65.73/60.39  72.79/68.21  74.57/70.79  78.84/73.58  78.37/74.53  78.31/73.34  78.06/75.63  78.74/76.06
segment        19.37/13.69  32.12/22.32  55.55/42.94  73.51/59.04  84.63/74.49  88.52/81.59  90.37/84.78  91.35/86.89  92.85/89.44  93.81/91.76
splice
Win/Draw/Loss  7/8/0        10/3/2       11/4/0       10/5/0       11/4/0       12/3/0       13/2/0       12/2/1       10/4/1       10/4/1
GM error ratio

Table 2: DECORATE vs Bagging
Data set       1%           2%           5%           10%          20%          30%          40%          50%          75%          100%
anneal         16.66/12.98  23.73/23.68  41.72/38.55  55.42/51.34  64.09/61.76  67.62/66.9   70.46/70.29  72.82/73.07  77.8/77.32   82.1/80.71
autos          92.38/76.74  94.12/88.07  95.06/90.88  95.64/93.41  95.55/94.42  95.91/94.95  96.2/94.95   96.01/95.55  96.28/96.07  96.31/96.3
credit-a       31.69/24.85  35.86/31.47  44.5/40.87   55.4/49.6    61.77/58.9   66.01/64.35  68.07/66.3   68.85/68.44  72.73/72     72.77/74.67
heart-c        52.33/52.33  72.14/63.18  76.8/75.2    79.48/78.64  80.7/80.42   81.81/81.07  81.65/81.22  83.19/81.06  82.99/80.87  82.62/81.34
colic          33.33/33.33  50.27/33.33  80.67/60.47  91.53/81.4   93.2/90.67   94.2/92.33   94.73/92.87  94.4/93.6    94.53/94.47  94.67/94.73
labor          48.39/48.39  53.62/47.11  65.06/60.12  71.2/69.68   76.74/73.6   78.84/76.58  78.17/77.68  78.99/76.98  79.14/76.8   79.08/77.97
segment        19.51/14.56  32.4/24.58   55.36/47.46  73.06/65.45  85.14/79.29  88.27/85.05  90.22/87.89  91.4/89.22   92.75/91.56  93.89/92.71
splice
Win/Draw/Loss  8/7/0        10/3/2       10/3/2       9/5/1        10/2/3       8/4/3        6/7/2        8/5/2        5/7/3        4/9/2
GM error ratio
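The two summary statistics from [Webb, 2000] are simple to compute from paired accuracies. The sketch below is illustrative only. The function names are invented here. The input is the 100% column of the seven DECORATE vs Bagging rows that survive in this copy, not the full 15 data sets, so the result is not expected to match the summary rows of the table. The statistically significant variant needs per-fold results for a paired t-test and is therefore omitted.

```python
import math


def win_draw_loss(acc_a, acc_b):
    """Count data sets where A is more accurate than, equal to, or less accurate than B."""
    wins = sum(a > b for a, b in zip(acc_a, acc_b))
    draws = sum(a == b for a, b in zip(acc_a, acc_b))
    losses = sum(a < b for a, b in zip(acc_a, acc_b))
    return wins, draws, losses


def geometric_mean_error_ratio(acc_a, acc_b):
    """n-th root of the product of per-domain error ratios E_A / E_B.
    Accuracies are percentages, so error = 100 - accuracy.
    Assumes neither algorithm reaches 100% accuracy on any domain."""
    log_sum = sum(math.log((100.0 - a) / (100.0 - b)) for a, b in zip(acc_a, acc_b))
    return math.exp(log_sum / len(acc_a))  # computed in log space for numerical stability


# 100% column, DECORATE / Bagging, for the rows present in this copy of the table.
decorate_acc = [82.1, 96.31, 72.77, 82.62, 94.67, 79.08, 93.89]
bagging_acc = [80.71, 96.3, 74.67, 81.34, 94.73, 77.97, 92.71]

print(win_draw_loss(decorate_acc, bagging_acc))
print(geometric_mean_error_ratio(decorate_acc, bagging_acc))
```

A ratio below one favours the first algorithm. Unlike the win/draw/loss counts, it also reflects how large each difference is.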
Compared to boosting, which requires a “weak” base learner that does not completely fit the training data (boosting terminates once it constructs a hypothesis with zero training error), DECORATE requires a strong learner, otherwise the artificial diversity training data may prevent it from adequately fitting the real data. When applying boosting to strong base learners, they must first be appropriately weakened in order to benefit from boosting. Therefore, DECORATE may be a preferable ensemble meta-learner for strong learners.

To our knowledge, the only other ensemble approach to utilize artificial training data is the active learning method introduced in [Cohn et al., 1994]. The goal of the committee here is to select good new training examples rather than to improve accuracy using the existing training data. Also, the labels of the artificial examples are selected to produce hypotheses that more faithfully represent the entire version space rather than to produce diversity. Cohn’s approach labels artificial data either all positive or all negative to encourage, respectively, the learning of more general or more specific hypotheses.

6 Future Work and Conclusion

In our current approach, we are encouraging diversity using artificial training examples. However, in many domains, a large amount of unlabeled data is already available. We could exploit these unlabeled examples and label them as diversity data. This would allow DECORATE to act as a form of semi-supervised learning that exploits both labeled and unlabeled data [Nigam et al., 2000].

Our current study has used J48 as a base learner; however, we would expect similarly good results with other base learners. Decision-tree induction has been the most commonly used base learner in other ensemble studies, but there has been some work using neural networks and naive Bayes [Bauer and Kohavi, 1999; Opitz and Maclin, 1999]. Experiments on “DECORATing” other learners are another area for future work.

By manipulating artificial training examples, DECORATE is
able to use a strong base learner to produce an effective, diverse ensemble. Experimental results demonstrate that the approach is particularly effective at producing highly accurate ensembles when training data is limited, outperforming both bagging and boosting low on the learning curve. The empirical success of DECORATE raises the issue of developing a sound theoretical understanding of its effectiveness. In general, the idea of using artificial or unlabeled examples to aid the construction of effective ensembles seems to be a promising approach worthy of further study.

Table 3: DECORATE vs AdaBoost
Data set       1%           2%           5%           10%          20%          30%          40%          50%          75%          100%
anneal         16.66/16.66  23.73/23.41  41.72/40.24  55.42/52.7   64.09/64.15  67.62/68.91  70.46/73.07  72.82/75.92  77.8/81.74   82.1/84.52
autos          92.38/74.73  94.12/87.84  95.06/91.15  95.64/93.75  95.55/94.85  95.91/95.72  96.2/95.84   96.01/95.87  96.28/96.3   96.31/96.47
credit-a       31.69/31.69  35.86/32.93  44.5/40.71   55.4/49.78   61.77/58.03  66.01/64.33  68.07/66.93  68.85/68.69  72.73/74.69  72.77/76.06
heart-c        52.33/52.33  72.14/65.93  76.8/73.01   79.48/76.95  80.7/79.44   81.81/79.22  81.65/81.27  83.19/82.63  82.99/83.24  82.62/82.71
colic          33.33/33.33  50.27/33.33  80.67/66.2   91.53/84.53  93.2/90.73   94.2/93      94.73/93.33  94.4/93.53   94.53/94.2   94.67/94.2
labor          48.39/48.39  53.62/46.64  65.06/60.54  71.2/69.57   76.74/74.16  78.84/78.62  78.17/80.35  78.99/79.88  79.14/80.96  79.08/81.75
segment        19.51/14.26  32.4/23.36   55.36/49.37  73.06/69.49  85.14/85.01  88.27/88.37  90.22/90.04  91.4/90.89   92.75/92.57  93.89/92.88
splice
Win/Draw/Loss  7/7/1        8/6/1        11/2/2       10/3/2       7/6/2        4/9/2        5/5/5        5/6/4        3/6/6        3/6/6
GM error ratio
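To make the diversity analysis concrete, the following sketch computes the ensemble diversity of Eq. (1) and the ensemble error reduction from lists of predictions. It is a simplified illustration, not the experimental code. The function names and the tiny prediction lists are hypothetical, and the ensemble decision is taken by majority vote here for brevity, whereas DECORATE averages class-membership probabilities.

```python
from collections import Counter


def majority_vote(member_preds):
    """Ensemble prediction per example: the label most members agree on."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_preds)]


def error_rate(preds, truth):
    return sum(p != t for p, t in zip(preds, truth)) / len(truth)


def ensemble_diversity(member_preds, ensemble_preds):
    """Mean over members and examples of d_i(x): 1 if member i disagrees with the ensemble, else 0."""
    per_member = [
        sum(m != e for m, e in zip(preds, ensemble_preds)) / len(ensemble_preds)
        for preds in member_preds
    ]
    return sum(per_member) / len(per_member)


def ensemble_error_reduction(member_preds, ensemble_preds, truth):
    """Average member error minus the error of the whole ensemble."""
    mean_member_error = sum(error_rate(p, truth) for p in member_preds) / len(member_preds)
    return mean_member_error - error_rate(ensemble_preds, truth)


# Hypothetical predictions of three classifiers on four examples.
truth = [1, 1, 0, 0]
members = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]
combined = majority_vote(members)
print(ensemble_diversity(members, combined))
print(ensemble_error_reduction(members, combined, truth))
```

In this toy case each member errs on a different example, so the vote corrects every individual mistake. The correlation analysis in Section 4.2 measures how often diversity translates into error reduction in this way.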
Acknowledgments

This work was supported by DARPA EELD Grant F30602-01-2-0571.

References

[Bauer and Kohavi, 1999] E. Bauer and R. Kohavi. An empirical comparison of voting classification algorithms: Bagging, boosting and variants. Machine Learning, 36, 1999.
[Blake and Merz, 1998] C. L. Blake and C. J. Merz. UCI repository of machine learning databases. /~mlearn/MLRepository.html, 1998.
[Breiman, 1996] Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
[Cohn et al., 1994] D. Cohn, L. Atlas, and R. Ladner. Improving generalization with active learning. Machine Learning, 15(2):201–221, 1994.
[Cunningham and Carney, 2000] P. Cunningham and J. Carney. Diversity versus quality in classification ensembles based on feature selection. In 11th European Conference on Machine Learning, pages 109–116, 2000.
[Dietterich, 2000] T. Dietterich. Ensemble methods in machine learning. In J. Kittler and F. Roli, editors, First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, pages 1–15. Springer-Verlag, 2000.
[Freund and Schapire, 1996] Yoav Freund and Robert E. Schapire. Experiments with a new boosting algorithm. In Proceedings of the 13th International Conference on Machine Learning, July 1996.
[Hastie et al., 2001] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. Springer Verlag, New York, August 2001.
[Krogh and Vedelsby, 1995] A. Krogh and J. Vedelsby. Neural network ensembles, cross validation and active learning. In Advances in Neural Information Processing Systems 7, 1995.
[Kuncheva and Whitaker, 2002] L. Kuncheva and C. Whitaker. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Submitted, 2002.
[Liu and Yao, 1999] Y. Liu and X. Yao. Ensemble learning via negative correlation. Neural Networks, 12, 1999.
[Nigam et al., 2000] K. Nigam, A. K. McCallum, S. Thrun, and T. Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning, 39:103–134, 2000.
[Opitz and Maclin, 1999] David Opitz and Richard Maclin. Popular ensemble methods: An empirical study. Journal of Artificial Intelligence Research, 11:169–198, 1999.
[Opitz and Shavlik, 1996] D. Opitz and J. Shavlik. Actively searching for an effective neural-network ensemble. Connection Science, 8, 1996.
[Quinlan, 1993] J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
[Quinlan, 1996] J. Ross Quinlan. Bagging, boosting, and C4.5. In Proceedings of the 13th National Conference on Artificial Intelligence, August 1996.
[Rosen, 1996] B. Rosen. Ensemble learning using decorrelated neural networks. Connection Science, 8, 1996.
[Webb, 2000] G. Webb. Multiboosting: A technique for combining boosting and wagging. Machine Learning, 40, 2000.
[Witten and Frank, 1999] Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco, 1999.
[Zenobi and Cunningham, 2001] G. Zenobi and P. Cunningham. Using diversity in preparing ensembles of classifiers based on different feature subsets to minimize generalization error. In Proceedings of the European Conference on Machine Learning, 2001.