A Comparison among Support Vector Machine and other Machine Learning Classification Algorithms
Support vector machine reference manual
snsv
ascii2bin bin2ascii
The rest of this document will describe these programs. To nd out more about SVMs, see the bibliography. We will not describe how SVMs work here. The rst program we will describe is the paragen program, as it speci es all parameters needed for the SVM.
sv
- the main SVM program - program for generating parameter sets for the SVM - load a saved SVM and classify a new data set
paragen loadsv
rm sv
- special SVM program for image recognition, that implements virtual support vectors BS97]. - program to convert SN format to our format - program to convert our ASCII format to our binary format - program to convert our binary format to our ASCII format
Support Vector Machines and Kernel Methods
Slack variables
4 3.5 3 2.5 2 1.5 1 0.5 0 −0.5 −3
−2
−1
0
1
2
3
If not linearly separable, add slack variable s ≥ 0 y (x · w + c) + s ≥ 1 Then
i si is total amount by which constraints are violated i si as small as possible
So try to make
Perceptron as convex program
The final convex program for the perceptron is: min
i si subject to
(y i x i ) · w + y i c + s i ≥ 1 si ≥ 0 We will try to understand this program using convex duality
10 8
6
4
2
0
−2
−4
−6
−8
−10 −10
−8
−6
−4
−2
0
2
4
6
8
10
Classification problem
100
10
% Middle & Upper Class
. . .
95
8
6
90
4
85
2
80
0
75
−2
70
−4
−6
65
X
Using support vector machines for lane change detection
USING SUPPORT VECTOR MACHINES FOR LANE-CHANGE DETECTIONHiren M. Mandalia Dario D. SalvucciDrexelUniversityDrexel UniversityPhiladelphia, PA Philadelphia, PADriving is a complex task that requires constant attention, and intelligent transportation systems thatsupport drivers in this task must continually infer driver intentions to produce reasonable, safe responses.In this paper we describe a technique for inferring driver intentions, specifically the intention to changelanes, using support vector machines (SVMs). The technique was applied to experimental data from aninstrumented vehicle that included both behavioral data and environmental data. Comparing these resultsto recent results using a novel “mind-tracking” technique, we found that SVMs outperformed earlieralgorithms and proved especially effective in early detection of driver lane changes.INTRODUCTIONIntelligent transportation systems (ITS) have been evolving to support drivers perform the vast array of typical driving tasks. Even with the many types of ITS systems in existence and under development, one commonality among these systems is the need for inferring driver intentions — that is, detecting what a driver is trying to do and helping them achieve that goal more safely and easily. One particular driver intention is that of changing lanes, which occurs ubiquitously in common driving environments — for instance, in highway driving, which accounts for approximately 70% of vehicle miles on American roadways (Federal Highway Administration, 1998). Much of previous work has focused on the decision-making and gap acceptance aspects of lane changing (e.g., Ahmed et al, 1996; Gipps, 1986), while other work has focused on development of real-world lane-change warning systems (e.g., Talmadge, Chu & Riney, 2000).Several recent studies have examined the behavioral nature of lane changes. Suzanne Lee et al. (2004) evaluated lane changes to provide a valuable insight into the severity of lane changes under naturalistic driving conditions, while Tijerina (1999) discussed operational and behavioral issues in evaluating lane change crash avoidance systems (CAS). Other studies (e.g., Tijerina, 2005) focused in particular on eye-glance behaviors during lane changes. Very little work has been done on recognizing driving maneuvers, especially critical ones like lane changing.The few studies on recognizing, or detecting, lane-change maneuvers include work from Pentland and Liu (1999), Kuge et al. (2000), Olsen (2003), and Salvucci (2004). The first two approaches were based on the concept that human behavior is made up of a sequence of internal ‘mental’ states that was not directly observable. These approaches used a technique called hidden Markov models (common in speech recognition), probabilistic models powered by robust expectation-maximization methods. Pentland and Liu (1999) reported that their recognition system could successfully recognize lane changes 1.5 seconds into the maneuver. Kuge et al. (2000) reported results with a “continuous” (point-by-point) recognition system; however, their system only uses steering-based features and has no knowledge of the surrounding environment. Olsen (2003) performed comprehensive analysis of lane changes in which a slow lead vehicle was present and proposed logistic regression models to predict lane changes. However his study only provides descriptive results and an insight into the variables that influence lane changes.Salvucci (2004) proposed a mind-tracking approach that uses a model of driver behavior to infer driver intentions. The mind tracking system essentially isolates a temporal window of driver data and extracts its similarity to several virtual drivers that are created probabilistically using a cognitive model. All of these approaches left something to be desired for purposes of a point-by-point detection system that could generate accurate detection with each data point.In this paper we describe a method of detecting lane change intentions using a technique known as support vector machines (SVMs). The paper begins with an overview of SVMs including details of how they are applied to the problem of detecting lane changes. The paper continues with an application study in which the SVM technique was applied to data collected from an instrumented vehicle in a real-world driving task, demonstrating that SVMs can successfully detect lane changes with high accuracy and low false-alarm rates, and that the technique performs very well in comparison with previously developed techniques for this problem.LANE-CHANGE DETECTIONUSING SUPPORT VECTOR MACHINES Support vector machines represent learning algorithms that can perform binary classification (pattern recognition) and real valued function approximation (regression estimation) tasks (see, e.g., Cortes & Vapnik, 1995). SVMs have been widely used for isolated handwritten digit recognition, object recognition, speaker identification, and face detection in images and text categorization. This section reviews the basic functioning of SVMs, motivation for using SVMs for lane change detection, and training of lane changes. The next section reports results in terms of the prediction accuracy (true positive rates and false positive rates) and other measures.Brief Overview of Support Vector MachinesSupport vector machines are based on statistical learning theory that uses supervised learning. In supervised learning, amachine is trained instead of programmed using a number of training examples of input-output pairs. The objective of training is to learn a function which best describes the relation between the inputs and the outputs. In general, any learning problem in statistical learning theory will lead to a solution of the typeEq (1) where the x i, i = 1,…, l are the input examples, K a certain symmetric positive definite function named kernel, and c i a set of parameters to be determined from the examples. For details on the functioning of SVM, readers are encouraged to refer to (Cortes & Vapnik, 1995). In short, the working of SVM can be described as statistical learning machines that map points of different categories in n-dimensional space into a higher dimensional space where the two categories are more separable. The paradigm tries to find an optimal hyperplane in that high dimensional space that best separates the two categories of points. Figure 1 shows an example of two categories of points separated by a hyperplane. Essentially, the hyperplane is learned by the points that are located closest to the hyperplane which are called ‘support vectors’. There can be more than one support vector on each side of the plane.Figure 1: Points separated by a hyperplane.(Source: )Motivation for using SVMsAssessing driver state is a substantial task, complicated by the various nuances and idiosyncrasies that characterize human behavior. Measurable components indicative of driver state may often reside in some high dimensional feature space (see, e.g., Wipf & Rao, 2003). Researchers have found that SVM have been particularly useful for binary classification problems. SVMs offer a robust and efficient classification approach for the problem of lane change detection because they map the driving data to high dimensional feature space where linear hyperplanes are sufficient to separate two categories of data points. A correct choice of kernel and data representation can lead to good solutions.Issues Relevant to Lane-Change DetectionKernel selection. A key issue in using the learning techniques is the choice of the kernel K in Eq (1). The kernel K(x i, x j) defines a dot product between projections of the two inputs x i and x j, in the feature space, the features been {Φ1(x), Φ2(x),…, ΦN(x} with N the dimensionality of the Reproducing Kernel Hilbert Space (see, e.g., Cortes & Vapnik, 1995). Therefore the choice is closely related to the choice of the “effective” representation of the data, e.g. the image representation in a vision application. The problem of choosing the kernel for the SVM, and more generally the issue of finding appropriate data representations for learning, is an important one (e.g., Evgeniou, Pontil, & Poggio, 2000). The theory does not provide a general method for finding “good” data representations, but suggests representations that lead to simple solutions. Although there is no general solution to this problem, several experimental and theoretical works provides insights for specific applications (see, e.g., Evgeniou, Pontil, & Poggio, 2000; Vapnik, 1998). However recent work from researchers (e.g., Lanckriet et al., 2004) has shown that for a given data representation there is a systematic method for kernel estimation using semi-definite programming. Estimating the right kind of kernel remains an important segment of the future work with this kind of work. At this point of time the available data was tested against different types of kernel functions to know the performance of each of them experimentally. Some of the kernel-types tested were Linear, Polynomial, Exponential and Gaussian. However, it was observed that all the kernels performed as good as or worse than Linear kernel. All the final results with SVM classification are therefore analyzed with Linear kernel.Choosing a time window. One issue with using SVM for lane change detection is that lane changes do not have fixed time length. Thus, longer lane changes see a smooth transition in feature values like steering angle, lane position, acceleration, etc. whereas shorter ones have a relatively abrupt transition. Moreover, one feature may be a function of one or many other features. The exact inter-dependency between features or within features themselves is often unclear. In the domain of detecting driving maneuvers like lane changes, the change in the features with respect to time and their inter-dependency is more critical than the individual values of the features. For example, studies have shown that during a lane change drivers exhibit an expected sine-wave steering pattern except for a longer and flatter second peak as they straightened the vehicle (e.g., Salvucci & Liu, 2002). The figure below shows such a pattern of steering against time.Drivers first steer toward the destination and then back to center to level the vehicle on a steady path to the destination lane. Then, in a continual smooth movement, drivers steer in the other direction and back to center to straighten the vehicle.Figure 2: Steering angle displays a sine-wave like pattern(Source: Salvucci & Liu, 2002)Such patterns can only be observed (by humans) or learned (by machines) when a reasonable sized window of samples is observed. Thus for all practical purposes when using the SVM for a lane change or lane keeping, an entire window of samples is input instead of a single sample. A ‘sample’ refers to the set of values of features at that instance of time. The data stream is broken down into fixed-size smaller windows. The size (time length) of the window that adequately captures the patterns within the features is a free parameter and therefore left to experimentation. Various window sizes were analyzed between 1 second to 5 seconds. The results with different window sizes are reported in the results. About 2/3rd of driving data was used for training SVMs and remaining 1/3rd for testing purposes.Another issue in training is assigning the correct label (LC or LK) to each window. The lane change definition used to classify training data was that a lane change starts when the vehicle moves towards the other lane boundary and never returns i.e. the final shift of the driver towards the other lane. Any reversal cancels the lane change. Using the definition of lane change, each sample within a window can be labeled positive (LC) or negative (LK) but not the entire window. The last sample within the window is used to label the entire window, since the last sample offers the latest information about the driving state. Also in order to predict the driving state at any time only the preceding samples can be used.Figure 3: Moving window of constant sizeAs shown in Figure 3 a single window of size N is defined by N samples at times {t 0, t 1,…, t N-1}. The label of sample at t N-1 is assigned to the entire window. A moving window is used as shown in the figure i.e. whenever a new sample is obtained, it is added to the moving window and the last sample is dropped, maintaining a constant size window.Choosing a data representation . The general problem of finding an optimal, or even reasonable, data representation is an open one. One approach to data representation is to input the entire training window to the SVM with the actual values of thefeatures (e.g., Wipf & Rao, 2003) and feed them to Relevance Vector Machines (RVM). In such an approach, an input vector corresponding to a single window of size N samples looks like[steerangle(t 0),…,steerangle(t N-1),speed(t 0),…,speed(t N-1),…]which in general is equivalent to[F 1(t 0),…,F 1(t N-1), F 2(t 0),…,F 2(t N-1),…,F M (t 0),…,F M (t N-1)]where F x (t i ) represents the value of feature F x at time t i . Such a vector is used to train Relevance Vector Machines (RVM). RVM is a probabilistic sparse kernel model identical in functional form to the SVM (e.g., Cortes & Vapnik, 1995).Embedded in this formulation is the fact that temporal variations in maneuver execution are handled implicitly by RVMs. However, inherent functionalities of RVMs or SVMs would fail to observe any dependencies or relationship between values of a feature over a period of time which could be critical. Also this formulation results in abnormally long sized input vector leading to additional computational complexity.An alternative approach is suggested to explicitly include some form of dependency/relationship measure between feature values rather than the original values. As argued previously it is the change pattern in the feature values which is more critical than the values themselves. Variance of a feature over all the samples in the block was used to replace the original values of that feature. Variance of a feature is given byEq (2)where N is the window size (number of samples), µx is the mean of the feature F x within the window and x i is the feature value of the i th sample. Thus variance effectively captures the change in the feature values which is very critical to learn specific patterns. Variance of features is particularly useful in reducing the effects of noise in the data. Another reason that encouraged the use of variance is the reduced size of input vectors that were used for final training as explained in the following section.Figure 4 explains the two data representations that were experimented using variance. A single window of size N is shown on the left hand side of the two representations where each feature F x has N values. In Data Representation I (non-overlapping), a single window is divided into two equal halves and the variance of features from each half is used. Thus for every N values of a feature only two values of variances from the first and second half of the window contributes to the input vector. The rationale for splitting a single window in two halves is the need to capture multiple change patterns within a window. For example, features like lane position or steer angle might change multiple times within a window but a single variance across the window will reflect only the overall change.Data Representation II (overlapping) shown in the figure uses a similar structure with the difference that the two halvesFigure 4: Data Representationoverlap with each other. A window of size N is divided into three equal parts say a, b, c. The first half will consist of the first two parts (a, b) and the second half will consist of last two parts. Overlapping structure was tested to account for the fact that the division of a lane change may not be equal and the changes may also happen in the middle of the window. Experiments were performed with each representation.Choosing a feature set. While training the SVMs it is very important that only the most effective set of features, rather than set of all possible features, is used. Features that display significant differences during a lane change against normal driving are the critical ones. Features that do not show enough predictability and vary randomly irrespective of a lane change should be avoided as they degrade the discrimination power of SVM.With no prior preference to any feature, initial experiments included all features. Later, only selected combinations were employed to choose the minimal feature set that would produce the best classification. Various combinations of features were tested. However, only few selected combinations generated good results. Best classification results were obtained with only four features with lane positions at different distances. Such an outcome was expected since lane position demonstrated the most consistent pattern among all the other features. One can argue that steering angle should also be a strong candidate. However, steering angle displays similar patterns both during lane change and while driving through a curved road which led to high number of false positives.Time to collision (TTC) (see, e.g., Olsen, 2003) could be an important feature however; the available data did not have speed values of the lead vehicles. Another limitation was the lack of eye movement data which could prove critical for lane change detection (see, e.g., Olsen, 2003). The current method of feature selection is based purely on experiments. However, as a future study use of more systematic stochastic techniques like t-tests, recursive feature elimination, and maximum likelihood test are planned.The feature sets in Table 1 were selected such that it includes different combinations of features that would generate significant patterns in their variances. Set 1 contains the most basic relevant features. Set 2 measures the effect of lead car distance. Set 3 includes longitudinal and latitudinal information of the car while Set 4 measures the significance of steering angle. Set 5 contains only lane position values since experiments indicated that they were most significant. Note that the lane position features indicate the longitudinal distance from the driver’s vehicle in m — e.g., “Lane position 30” represents the lane position for the point 30 m directly ahead of the vehicle.Generating continuous recognition. To simulate a continuous recognition scheme we use a moving window of size N (block-size) as shown in Figure 3. The window is slided one sample at a time across the data. The classification label predicted by SVM for each window is used as label for the last sample in the window. Consistency among classification scores is one important advantage of this scheme. That is if the previous and next few samples are classified positive the probability of the current sample to be classified negative is very low.Table 1: Feature setsSet Features1 Acceleration, Lane position 0, Lane position 30,Heading2 Acceleration, Lane position 0, Lane position 30,Heading, Lead Car distance3 Acceleration, Lane position 0, Lane Position 20,Lane position 30, Heading, Longitudinalacceleration, Lateral acceleration4 Acceleration, Lane position 0, Lane position 30,Heading, Steering Angle5 Lane position 0, Lane position 10, Lane position 20,Lane position 30EVALUATION STUDYWe performed an evaluation study applying the SVM technique to lane-change detection and, at the same time, exploring a subset of the space of possible data representations and feature sets. For this study, we used the real-world data set collected by T. Yamamura and N. Kuge at Nissan Research Center in Oppama, Japan (see Salvucci, Mandalia, Kuge, & Yamamura, in preparation). Four driver subjects were asked to drive on a Japanese multi-lane highway environment for one hour each through dense and smooth traffic. The drivers were given no specific goals or instructions and they were allowed to drive on their own.The results for the various combinations of window size, feature sets, and non-overlapping vs. overlapping representations are shown in Tables 2 and 3. Average true positive rate at 5% false alarm rate1 among all feature sets is very high, and results improve with decrease in window size. The overlapping representation with all the lane position features (Set 5) generates the best recognition result with 97.9% accuracy2. The system generates a continuous recognition which means it marks each sample with a positive (LC) or negative (LK) label. Thus out of every 100 actual lane-changing samples about 98 are detected correctly, whereas out of every 100 lane-keeping samples only 5 are incorrectly predicted.The recognition system was also analyzed for accuracy with respect to time to calculate how much time is elapsed from the start of a lane change until the point of detection. The system was able to detect about 87% of all true positives within the first 0.3 seconds from the start of the maneuver.DISCUSSIONThe SVM approach to detecting lane changes worked well with real-world vehicle data. A richer data set with features like lead-car velocity, eye movements should lead to even better1 False alarm rate = No. of False Positives * 100 / Total No. of Negatives (LK samples)2 Accuracy = No. of True Positives * 100 / Total No. of Positives (LC samples)accuracy. Comparing the results to previous algorithms, Pentland and Liu (1999) achieved accuracy of 95%, but only after 1.5 seconds into the maneuver whereas our system classifies all data points from the start of the maneuver with 97.9% accuracy. Kuge et al. (2000) achieved 98% accuracy of recognizing entire lane changes (as opposed to point-by-point accuracy). Salvucci’s (2004) results are more directly comparable: his mind-tracking algorithm achieved approximately 87% accuracy at 5% false alarm rate. Thus, the SVM approach outperforms previous approaches and offers great promise as a lane-change detection algorithm for intelligent transportation systems.Table 2: Accuracy by window size (non-overlapping) Window Set 1 Set 2 Set 3 Set 4 Set 5 5s 83.5 90.0 91.2 90.0 91.14s 88.1 91.3 92.5 91.5 92.22s 89.3 93.0 97.7 94.0 97.41.5s 85.2 93.7 96.3 93.2 97.71.2s 96.8 96.0 96.0 96.0 96.70.8s 86.3 91.8 90.0 86.6 94.6Table 3: Accuracy by window size (overlapping) Window Set 1 Set 2 Set 3 Set 4 Set 5 5s 87.1 86.9 89.0 88.1 87.84s 89.9 90.7 91.5 89.5 91.12s 96.2 96.2 97.8 95.8 97.31.5s 94.5 94.6 97.6 93.3 97.51.2s 93.8 93.7 95.0 93.2 97.90.8s 97.0 95.5 96.0 96.4 96.7ACKNOWLEDGMENTSThis work was supported by National Science Foundation grant#IIS-0133083.REFERENCESAhmed, K. I., Ben-Akiva, M. E., Koutsopoulos, H. N., & Mishalani, R. G. (1996). Models of freeway lane changingand gap acceptance behavior. In J.-B. Lesort (Ed.), Transportation and Traffic Theory. New York: Elsevier. Cortes, C. and Vapnik, V. (1995). Support vector networks. Machine Learning, 20:273-295.Evgeniou, T., Pontil, M., & Poggio, T. (2000). Statistical Learning Theory: A Primer. International Journal ofComputer Vision, 38, 9-13.Federal Highway Administration (1998). Our nation’s highways: Selected facts and figures (Tech. Rep. No. FHWA-PL-00-014). Washington, DC: U.S. Dept. of Transportation. Gipps, P. G. (1986). A model for the structure of lane-changing decisions. Transportation Research – Part B, 5, 403-414. Kuge, N., Yamamura, T. and Shimoyama, O. (2000). A Driver Behavior Recognition Method Based on a Driver Model Framework. Intelligent Vehicle Systems (SP-1538). Lanckriet, G. R. G., Cristianini, N., Barlett, P., Ghaoui, L. E., & Jordan, M. I. (2004). Learning the Kernel Matrix with Semidefinite Programming. Journal of Machine Learning Research 5, pp. 27-72.Lee, S. E., Olsen, E. C. B., & Wierwille, W. W. (2004). Naturalistic lane change field data reduction, analysis, and archiving: A comprehensive examination of naturalistic lane-changes (Tech. Rep. No. DOT-HS-809-702). National Highway Traffic Safety Administration.Olsen, E. C. B., (2003). Modeling slow lead vehicle lane changing. Doctoral Dissertation, Department of Industrial and Systems Engineering, Virginia Polytechnic Institute. Pentland, A., & Liu, A. (1999). Modeling and prediction of human behavior. Neural Computation, 11, 229-242. Salvucci, D. D. (2004). Inferring driver intent: A case study in lane-change detection. In Proceedings of the Human Factors Ergonomics Society 48th Annual Meeting.Salvucci, D. D., & Liu, A. (2002). The time course of a lane change: Driver control and eye-movement behavior. Transportation Research Part F, 5, 123-132.Salvucci, D.D., Mandalia, H.M., Kuge, N., & Yamamura, T. (in preparation). Comparing lane-change detection algorithms on driving-simulator and instrumented-vehicle data. In preparation for submission to Human Factors. Talmadge, S., Chu, R. Y., & Riney, R. S. (2000). Description and preliminary data from TRW’s lane change collision avoidance testbed. In Proceedings of the Intelligent Transportation Society of America’s Tenth Annual Meeting. Tijerina, L. (1999). Operational and behavioral issues in the comprehensive evaluation of lane change crash avoidance systems. J. of Transportation Human Factors, 1, 159-176. Tijerina, L., Garott W. R., Stoltzfus, D., Parmer, E., (2005). Van and Passenger Car Driver Eye Glance Behavior During the Lane Change Decision Phase. In Proceedings of Transportation Research Board Annual Meeting 2005. Vapnik, V.N. (1998). Statistical Learning Theory. Wiley: NY. Wipf, D. & Rao, B. (2003). Driver Intent Inference Annual Report, University of California, San Diego.。
基于Codebook背景建模的视频行人检测
基于Codebook背景建模的视频行人检测黄成都;黄文广;闫斌【摘要】针对视频序列,Codebook背景建模算法能检测出其中的运动物体,但却无法识别行人.而大部分基于支持向量机(SVM)训练的行人分类器,需要通过滑动窗口遍历图像检测行人.为加快行人检测的速度,提出将传统的行人分类器融入到Codebook背景建模算法中,通过背景建模算法为行人检测提供候选区域,减少搜索范围,降低了行人误检率;并根据行人的特点,构建临时块模型定期将满足条件的前景区域更新到背景模型中,解决了Codebook背景建模算法不能应对光照突变的问题.实验结果表明:所提算法能应对光照突变所带来的干扰,实现视频行人实时检测.%As for video sequences,Codebook background modeling algorithm can detect moving objects,but cannot recognize pedestrians. Meanwhile,most pedestrians classifiers are based on support vector machine(SVM) training has to traverse through the whole image,by sliding window to detect pedestrians. To speed up pedestrian detection process,algorithm of traditional pedestrian classification device fused in Codebook background modeling algorithm is proposed provide candidate regionals by background modeling algorithm for pedestrian detection reduce the search range and error rate of the pedestrian. According to features of pedestrians,temporary block model is built to regularly update into background model,which solve the problem that Codebook background modeling algorithm cannot suit the illumination abrupt variation. Experimental results demonstrate that the proposed algorithm can dealwith the interference caused by sudden light variation,it can achieve the real-time pedestrian detection in video.【期刊名称】《传感器与微系统》【年(卷),期】2017(036)003【总页数】3页(P144-146)【关键词】视频;Codebook背景建模;支持向量机;行人检测【作者】黄成都;黄文广;闫斌【作者单位】电子科技大学自动化工程学院,四川成都611731;国家电网四川省电力公司乐山电业局,四川乐山614000;电子科技大学自动化工程学院,四川成都611731【正文语种】中文【中图分类】TP391.4行人检测对于一个智能监控系统至关重要,是异常行为识别[1,2]、行人识别与跟踪[3]、步态识别[4,5]、行人计数[6,7]等智能应用的基础。
Support vector machine_A tool for mapping mineral prospectivity
Support vector machine:A tool for mapping mineral prospectivityRenguang Zuo a,n,Emmanuel John M.Carranza ba State Key Laboratory of Geological Processes and Mineral Resources,China University of Geosciences,Wuhan430074;Beijing100083,Chinab Department of Earth Systems Analysis,Faculty of Geo-Information Science and Earth Observation(ITC),University of Twente,Enschede,The Netherlandsa r t i c l e i n f oArticle history:Received17May2010Received in revised form3September2010Accepted25September2010Keywords:Supervised learning algorithmsKernel functionsWeights-of-evidenceTurbidite-hosted AuMeguma Terraina b s t r a c tIn this contribution,we describe an application of support vector machine(SVM),a supervised learningalgorithm,to mineral prospectivity mapping.The free R package e1071is used to construct a SVM withsigmoid kernel function to map prospectivity for Au deposits in western Meguma Terrain of Nova Scotia(Canada).The SVM classification accuracies of‘deposit’are100%,and the SVM classification accuracies ofthe‘non-deposit’are greater than85%.The SVM classifications of mineral prospectivity have5–9%lowertotal errors,13–14%higher false-positive errors and25–30%lower false-negative errors compared tothose of the WofE prediction.The prospective target areas predicted by both SVM and WofE reflect,nonetheless,controls of Au deposit occurrence in the study area by NE–SW trending anticlines andcontact zones between Goldenville and Halifax Formations.The results of the study indicate theusefulness of SVM as a tool for predictive mapping of mineral prospectivity.&2010Elsevier Ltd.All rights reserved.1.IntroductionMapping of mineral prospectivity is crucial in mineral resourcesexploration and mining.It involves integration of information fromdiverse geoscience datasets including geological data(e.g.,geologicalmap),geochemical data(e.g.,stream sediment geochemical data),geophysical data(e.g.,magnetic data)and remote sensing data(e.g.,multispectral satellite data).These sorts of data can be visualized,processed and analyzed with the support of computer and GIStechniques.Geocomputational techniques for mapping mineral pro-spectivity include weights of evidence(WofE)(Bonham-Carter et al.,1989),fuzzy WofE(Cheng and Agterberg,1999),logistic regression(Agterberg and Bonham-Carter,1999),fuzzy logic(FL)(Ping et al.,1991),evidential belief functions(EBF)(An et al.,1992;Carranza andHale,2003;Carranza et al.,2005),neural networks(NN)(Singer andKouda,1996;Porwal et al.,2003,2004),a‘wildcat’method(Carranza,2008,2010;Carranza and Hale,2002)and a hybrid method(e.g.,Porwalet al.,2006;Zuo et al.,2009).These techniques have been developed toquantify indices of occurrence of mineral deposit occurrence byintegrating multiple evidence layers.Some geocomputational techni-ques can be performed using popular software packages,such asArcWofE(a free ArcView extension)(Kemp et al.,1999),ArcSDM9.3(afree ArcGIS9.3extension)(Sawatzky et al.,2009),MI-SDM2.50(aMapInfo extension)(Avantra Geosystems,2006),GeoDAS(developedbased on MapObjects,which is an Environmental Research InstituteDevelopment Kit)(Cheng,2000).Other geocomputational techniques(e.g.,FL and NN)can be performed by using R and Matlab.Geocomputational techniques for mineral prospectivity map-ping can be categorized generally into two types–knowledge-driven and data-driven–according to the type of inferencemechanism considered(Bonham-Carter1994;Pan and Harris2000;Carranza2008).Knowledge-driven techniques,such as thosethat apply FL and EBF,are based on expert knowledge andexperience about spatial associations between mineral prospec-tivity criteria and mineral deposits of the type sought.On the otherhand,data-driven techniques,such as WofE and NN,are based onthe quantification of spatial associations between mineral pro-spectivity criteria and known occurrences of mineral deposits ofthe type sought.Additional,the mixing of knowledge-driven anddata-driven methods also is used for mapping of mineral prospec-tivity(e.g.,Porwal et al.,2006;Zuo et al.,2009).Every geocomputa-tional technique has advantages and disadvantages,and one or theother may be more appropriate for a given geologic environmentand exploration scenario(Harris et al.,2001).For example,one ofthe advantages of WofE is its simplicity,and straightforwardinterpretation of the weights(Pan and Harris,2000),but thismodel ignores the effects of possible correlations amongst inputpredictor patterns,which generally leads to biased prospectivitymaps by assuming conditional independence(Porwal et al.,2010).Comparisons between WofE and NN,NN and LR,WofE,NN and LRfor mineral prospectivity mapping can be found in Singer andKouda(1999),Harris and Pan(1999)and Harris et al.(2003),respectively.Mapping of mineral prospectivity is a classification process,because its product(i.e.,index of mineral deposit occurrence)forevery location is classified as either prospective or non-prospectiveaccording to certain combinations of weighted mineral prospec-tivity criteria.There are two types of classification techniques.Contents lists available at ScienceDirectjournal homepage:/locate/cageoComputers&Geosciences0098-3004/$-see front matter&2010Elsevier Ltd.All rights reserved.doi:10.1016/j.cageo.2010.09.014n Corresponding author.E-mail addresses:zrguang@,zrguang1981@(R.Zuo).Computers&Geosciences](]]]])]]]–]]]One type is known as supervised classification,which classifies mineral prospectivity of every location based on a training set of locations of known deposits and non-deposits and a set of evidential data layers.The other type is known as unsupervised classification, which classifies mineral prospectivity of every location based solely on feature statistics of individual evidential data layers.A support vector machine(SVM)is a model of algorithms for supervised classification(Vapnik,1995).Certain types of SVMs have been developed and applied successfully to text categorization, handwriting recognition,gene-function prediction,remote sensing classification and other studies(e.g.,Joachims1998;Huang et al.,2002;Cristianini and Scholkopf,2002;Guo et al.,2005; Kavzoglu and Colkesen,2009).An SVM performs classification by constructing an n-dimensional hyperplane in feature space that optimally separates evidential data of a predictor variable into two categories.In the parlance of SVM literature,a predictor variable is called an attribute whereas a transformed attribute that is used to define the hyperplane is called a feature.The task of choosing the most suitable representation of the target variable(e.g.,mineral prospectivity)is known as feature selection.A set of features that describes one case(i.e.,a row of predictor values)is called a feature vector.The feature vectors near the hyperplane are the support feature vectors.The goal of SVM modeling is tofind the optimal hyperplane that separates clusters of feature vectors in such a way that feature vectors representing one category of the target variable (e.g.,prospective)are on one side of the plane and feature vectors representing the other category of the target variable(e.g.,non-prospective)are on the other size of the plane.A good separation is achieved by the hyperplane that has the largest distance to the neighboring data points of both categories,since in general the larger the margin the better the generalization error of the classifier.In this paper,SVM is demonstrated as an alternative tool for integrating multiple evidential variables to map mineral prospectivity.2.Support vector machine algorithmsSupport vector machines are supervised learning algorithms, which are considered as heuristic algorithms,based on statistical learning theory(Vapnik,1995).The classical task of a SVM is binary (two-class)classification.Suppose we have a training set composed of l feature vectors x i A R n,where i(¼1,2,y,n)is the number of feature vectors in training samples.The class in which each sample is identified to belong is labeled y i,which is equal to1for one class or is equal toÀ1for the other class(i.e.y i A{À1,1})(Huang et al., 2002).If the two classes are linearly separable,then there exists a family of linear separators,also called separating hyperplanes, which satisfy the following set of equations(KavzogluandFig.1.Support vectors and optimum hyperplane for the binary case of linearly separable data sets.Table1Experimental data.yer A Layer B Layer C Layer D Target yer A Layer B Layer C Layer D Target1111112100000 2111112200000 3111112300000 4111112401000 5111112510000 6111112600000 7111112711100 8111112800000 9111012900000 10111013000000 11101113111100 12111013200000 13111013300000 14111013400000 15011013510000 16101013600000 17011013700000 18010113811100 19010112900000 20101014010000R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]]2Colkesen,2009)(Fig.1):wx iþb Zþ1for y i¼þ1wx iþb rÀ1for y i¼À1ð1Þwhich is equivalent toy iðwx iþbÞZ1,i¼1,2,...,nð2ÞThe separating hyperplane can then be formalized as a decision functionfðxÞ¼sgnðwxþbÞð3Þwhere,sgn is a sign function,which is defined as follows:sgnðxÞ¼1,if x400,if x¼0À1,if x o08><>:ð4ÞThe two parameters of the separating hyperplane decision func-tion,w and b,can be obtained by solving the following optimization function:Minimize tðwÞ¼12J w J2ð5Þsubject toy Iððwx iÞþbÞZ1,i¼1,...,lð6ÞThe solution to this optimization problem is the saddle point of the Lagrange functionLðw,b,aÞ¼1J w J2ÀX li¼1a iðy iððx i wÞþbÞÀ1Þð7Þ@ @b Lðw,b,aÞ¼0@@wLðw,b,aÞ¼0ð8Þwhere a i is a Lagrange multiplier.The Lagrange function is minimized with respect to w and b and is maximized with respect to a grange multipliers a i are determined by the following optimization function:MaximizeX li¼1a iÀ12X li,j¼1a i a j y i y jðx i x jÞð9Þsubject toa i Z0,i¼1,...,l,andX li¼1a i y i¼0ð10ÞThe separating rule,based on the optimal hyperplane,is the following decision function:fðxÞ¼sgnX li¼1y i a iðxx iÞþb!ð11ÞMore details about SVM algorithms can be found in Vapnik(1995) and Tax and Duin(1999).3.Experiments with kernel functionsFor spatial geocomputational analysis of mineral exploration targets,the decision function in Eq.(3)is a kernel function.The choice of a kernel function(K)and its parameters for an SVM are crucial for obtaining good results.The kernel function can be usedTable2Errors of SVM classification using linear kernel functions.l Number ofsupportvectors Testingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0.2580.00.00.0180.00.00.0 1080.00.00.0 10080.00.00.0 100080.00.00.0Table3Errors of SVM classification using polynomial kernel functions when d¼3and r¼0. l Number ofsupportvectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0.25120.00.00.0160.00.00.01060.00.00.010060.00.00.0 100060.00.00.0Table4Errors of SVM classification using polynomial kernel functions when l¼0.25,r¼0.d Number ofsupportvectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)11110.00.0 5.010290.00.00.0100230.045.022.5 1000200.090.045.0Table5Errors of SVM classification using polynomial kernel functions when l¼0.25and d¼3.r Number ofsupportvectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0120.00.00.01100.00.00.01080.00.00.010080.00.00.0 100080.00.00.0Table6Errors of SVM classification using radial kernel functions.l Number ofsupportvectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0.25140.00.00.01130.00.00.010130.00.00.0100130.00.00.0 1000130.00.00.0Table7Errors of SVM classification using sigmoid kernel functions when r¼0.l Number ofsupportvectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0.25400.00.00.01400.035.017.510400.0 6.0 3.0100400.0 6.0 3.0 1000400.0 6.0 3.0R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]]3to construct a non-linear decision boundary and to avoid expensive calculation of dot products in high-dimensional feature space.The four popular kernel functions are as follows:Linear:Kðx i,x jÞ¼l x i x j Polynomial of degree d:Kðx i,x jÞ¼ðl x i x jþrÞd,l40Radial basis functionðRBFÞ:Kðx i,x jÞ¼exp fÀl99x iÀx j992g,l40 Sigmoid:Kðx i,x jÞ¼tanhðl x i x jþrÞ,l40ð12ÞThe parameters l,r and d are referred to as kernel parameters. The parameter l serves as an inner product coefficient in the polynomial function.In the case of the RBF kernel(Eq.(12)),l determines the RBF width.In the sigmoid kernel,l serves as an inner product coefficient in the hyperbolic tangent function.The parameter r is used for kernels of polynomial and sigmoid types. The parameter d is the degree of a polynomial function.We performed some experiments to explore the performance of the parameters used in a kernel function.The dataset used in the experiments(Table1),which are derived from the study area(see below),were compiled according to the requirementfor Fig.2.Simplified geological map in western Meguma Terrain of Nova Scotia,Canada(after,Chatterjee1983;Cheng,2008).Table8Errors of SVM classification using sigmoid kernel functions when l¼0.25.r Number ofSupportVectorsTestingerror(non-deposit)(%)Testingerror(deposit)(%)Total error(%)0400.00.00.01400.00.00.010400.00.00.0100400.00.00.01000400.00.00.0R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]]4classification analysis.The e1071(Dimitriadou et al.,2010),a freeware R package,was used to construct a SVM.In e1071,the default values of l,r and d are1/(number of variables),0and3,respectively.From the study area,we used40geological feature vectors of four geoscience variables and a target variable for classification of mineral prospec-tivity(Table1).The target feature vector is either the‘non-deposit’class(or0)or the‘deposit’class(or1)representing whether mineral exploration target is absent or present,respectively.For‘deposit’locations,we used the20known Au deposits.For‘non-deposit’locations,we randomly selected them according to the following four criteria(Carranza et al.,2008):(i)non-deposit locations,in contrast to deposit locations,which tend to cluster and are thus non-random, must be random so that multivariate spatial data signatures are highly non-coherent;(ii)random non-deposit locations should be distal to any deposit location,because non-deposit locations proximal to deposit locations are likely to have similar multivariate spatial data signatures as the deposit locations and thus preclude achievement of desired results;(iii)distal and random non-deposit locations must have values for all the univariate geoscience spatial data;(iv)the number of distal and random non-deposit locations must be equaltoFig.3.Evidence layers used in mapping prospectivity for Au deposits(from Cheng,2008):(a)and(b)represent optimum proximity to anticline axes(2.5km)and contacts between Goldenville and Halifax formations(4km),respectively;(c)and(d)represent,respectively,background and anomaly maps obtained via S-Afiltering of thefirst principal component of As,Cu,Pb and Zn data.R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]]5the number of deposit locations.We used point pattern analysis (Diggle,1983;2003;Boots and Getis,1988)to evaluate degrees of spatial randomness of sets of non-deposit locations and tofind distance from any deposit location and corresponding probability that one deposit location is situated next to another deposit location.In the study area,we found that the farthest distance between pairs of Au deposits is71km,indicating that within that distance from any deposit location in there is100%probability of another deposit location. However,few non-deposit locations can be selected beyond71km of the individual Au deposits in the study area.Instead,we selected random non-deposit locations beyond11km from any deposit location because within this distance from any deposit location there is90% probability of another deposit location.When using a linear kernel function and varying l from0.25to 1000,the number of support vectors and the testing errors for both ‘deposit’and‘non-deposit’do not vary(Table2).In this experiment the total error of classification is0.0%,indicating that the accuracy of classification is not sensitive to the choice of l.With a polynomial kernel function,we tested different values of l, d and r as follows.If d¼3,r¼0and l is increased from0.25to1000,the number of support vectors decreases from12to6,but the testing errors for‘deposit’and‘non-deposit’remain nil(Table3).If l¼0.25, r¼0and d is increased from1to1000,the number of support vectors firstly increases from11to29,then decreases from23to20,the testing error for‘non-deposit’decreases from10.0%to0.0%,whereas the testing error for‘deposit’increases from0.0%to90%(Table4). In this experiment,the total error of classification is minimum(0.0%) when d¼10(Table4).If l¼0.25,d¼3and r is increased from 0to1000,the number of support vectors decreases from12to8,but the testing errors for‘deposit’and‘non-deposit’remain nil(Table5).When using a radial kernel function and varying l from0.25to 1000,the number of support vectors decreases from14to13,but the testing errors of‘deposit’and‘non-deposit’remain nil(Table6).With a sigmoid kernel function,we experimented with different values of l and r as follows.If r¼0and l is increased from0.25to1000, the number of support vectors is40,the testing errors for‘non-deposit’do not change,but the testing error of‘deposit’increases from 0.0%to35.0%,then decreases to6.0%(Table7).In this experiment,the total error of classification is minimum at0.0%when l¼0.25 (Table7).If l¼0.25and r is increased from0to1000,the numbers of support vectors and the testing errors of‘deposit’and‘non-deposit’do not change and the total error remains nil(Table8).The results of the experiments demonstrate that,for the datasets in the study area,a linear kernel function,a polynomial kernel function with d¼3and r¼0,or l¼0.25,r¼0and d¼10,or l¼0.25and d¼3,a radial kernel function,and a sigmoid kernel function with r¼0and l¼0.25are optimal kernel functions.That is because the testing errors for‘deposit’and‘non-deposit’are0%in the SVM classifications(Tables2–8).Nevertheless,a sigmoid kernel with l¼0.25and r¼0,compared to all the other kernel functions,is the most optimal kernel function because it uses all the input support vectors for either‘deposit’or‘non-deposit’(Table1)and the training and testing errors for‘deposit’and‘non-deposit’are0% in the SVM classification(Tables7and8).4.Prospectivity mapping in the study areaThe study area is located in western Meguma Terrain of Nova Scotia,Canada.It measures about7780km2.The host rock of Au deposits in this area consists of Cambro-Ordovician low-middle grade metamorphosed sedimentary rocks and a suite of Devonian aluminous granitoid intrusions(Sangster,1990;Ryan and Ramsay, 1997).The metamorphosed sedimentary strata of the Meguma Group are the lower sand-dominatedflysch Goldenville Formation and the upper shalyflysch Halifax Formation occurring in the central part of the study area.The igneous rocks occur mostly in the northern part of the study area(Fig.2).In this area,20turbidite-hosted Au deposits and occurrences (Ryan and Ramsay,1997)are found in the Meguma Group, especially near the contact zones between Goldenville and Halifax Formations(Chatterjee,1983).The major Au mineralization-related geological features are the contact zones between Gold-enville and Halifax Formations,NE–SW trending anticline axes and NE–SW trending shear zones(Sangster,1990;Ryan and Ramsay, 1997).This dataset has been used to test many mineral prospec-tivity mapping algorithms(e.g.,Agterberg,1989;Cheng,2008). More details about the geological settings and datasets in this area can be found in Xu and Cheng(2001).We used four evidence layers(Fig.3)derived and used by Cheng (2008)for mapping prospectivity for Au deposits in the yers A and B represent optimum proximity to anticline axes(2.5km) and optimum proximity to contacts between Goldenville and Halifax Formations(4km),yers C and D represent variations in geochemical background and anomaly,respectively, as modeled by multifractalfilter mapping of thefirst principal component of As,Cu,Pb,and Zn data.Details of how the four evidence layers were obtained can be found in Cheng(2008).4.1.Training datasetThe application of SVM requires two subsets of training loca-tions:one training subset of‘deposit’locations representing presence of mineral deposits,and a training subset of‘non-deposit’locations representing absence of mineral deposits.The value of y i is1for‘deposits’andÀ1for‘non-deposits’.For‘deposit’locations, we used the20known Au deposits(the sixth column of Table1).For ‘non-deposit’locations(last column of Table1),we obtained two ‘non-deposit’datasets(Tables9and10)according to the above-described selection criteria(Carranza et al.,2008).We combined the‘deposits’dataset with each of the two‘non-deposit’datasets to obtain two training datasets.Each training dataset commonly contains20known Au deposits but contains different20randomly selected non-deposits(Fig.4).4.2.Application of SVMBy using the software e1071,separate SVMs both with sigmoid kernel with l¼0.25and r¼0were constructed using the twoTable9The value of each evidence layer occurring in‘non-deposit’dataset1.yer A Layer B Layer C Layer D100002000031110400005000061000700008000090100 100100 110000 120000 130000 140000 150000 160100 170000 180000 190100 200000R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]] 6training datasets.With training dataset1,the classification accuracies for‘non-deposits’and‘deposits’are95%and100%, respectively;With training dataset2,the classification accuracies for‘non-deposits’and‘deposits’are85%and100%,respectively.The total classification accuracies using the two training datasets are97.5%and92.5%,respectively.The patterns of the predicted prospective target areas for Au deposits(Fig.5)are defined mainly by proximity to NE–SW trending anticlines and proximity to contact zones between Goldenville and Halifax Formations.This indicates that‘geology’is better than‘geochemistry’as evidence of prospectivity for Au deposits in this area.With training dataset1,the predicted prospective target areas occupy32.6%of the study area and contain100%of the known Au deposits(Fig.5a).With training dataset2,the predicted prospec-tive target areas occupy33.3%of the study area and contain95.0% of the known Au deposits(Fig.5b).In contrast,using the same datasets,the prospective target areas predicted via WofE occupy 19.3%of study area and contain70.0%of the known Au deposits (Cheng,2008).The error matrices for two SVM classifications show that the type1(false-positive)and type2(false-negative)errors based on training dataset1(Table11)and training dataset2(Table12)are 32.6%and0%,and33.3%and5%,respectively.The total errors for two SVM classifications are16.3%and19.15%based on training datasets1and2,respectively.In contrast,the type1and type2 errors for the WofE prediction are19.3%and30%(Table13), respectively,and the total error for the WofE prediction is24.65%.The results show that the total errors of the SVM classifications are5–9%lower than the total error of the WofE prediction.The 13–14%higher false-positive errors of the SVM classifications compared to that of the WofE prediction suggest that theSVMFig.4.The locations of‘deposit’and‘non-deposit’.Table10The value of each evidence layer occurring in‘non-deposit’dataset2.yer A Layer B Layer C Layer D110102000030000411105000060110710108000091000101110111000120010131000140000150000161000171000180010190010200000R.Zuo,E.J.M.Carranza/Computers&Geosciences](]]]])]]]–]]]7classifications result in larger prospective areas that may not contain undiscovered deposits.However,the 25–30%higher false-negative error of the WofE prediction compared to those of the SVM classifications suggest that the WofE analysis results in larger non-prospective areas that may contain undiscovered deposits.Certainly,in mineral exploration the intentions are notto miss undiscovered deposits (i.e.,avoid false-negative error)and to minimize exploration cost in areas that may not really contain undiscovered deposits (i.e.,keep false-positive error as low as possible).Thus,results suggest the superiority of the SVM classi-fications over the WofE prediction.5.ConclusionsNowadays,SVMs have become a popular geocomputational tool for spatial analysis.In this paper,we used an SVM algorithm to integrate multiple variables for mineral prospectivity mapping.The results obtained by two SVM applications demonstrate that prospective target areas for Au deposits are defined mainly by proximity to NE–SW trending anticlines and to contact zones between the Goldenville and Halifax Formations.In the study area,the SVM classifications of mineral prospectivity have 5–9%lower total errors,13–14%higher false-positive errors and 25–30%lower false-negative errors compared to those of the WofE prediction.These results indicate that SVM is a potentially useful tool for integrating multiple evidence layers in mineral prospectivity mapping.Table 11Error matrix for SVM classification using training dataset 1.Known All ‘deposits’All ‘non-deposits’TotalPrediction ‘Deposit’10032.6132.6‘Non-deposit’067.467.4Total100100200Type 1(false-positive)error ¼32.6.Type 2(false-negative)error ¼0.Total error ¼16.3.Note :Values in the matrix are percentages of ‘deposit’and ‘non-deposit’locations.Table 12Error matrix for SVM classification using training dataset 2.Known All ‘deposits’All ‘non-deposits’TotalPrediction ‘Deposits’9533.3128.3‘Non-deposits’566.771.4Total100100200Type 1(false-positive)error ¼33.3.Type 2(false-negative)error ¼5.Total error ¼19.15.Note :Values in the matrix are percentages of ‘deposit’and ‘non-deposit’locations.Table 13Error matrix for WofE prediction.Known All ‘deposits’All ‘non-deposits’TotalPrediction ‘Deposit’7019.389.3‘Non-deposit’3080.7110.7Total100100200Type 1(false-positive)error ¼19.3.Type 2(false-negative)error ¼30.Total error ¼24.65.Note :Values in the matrix are percentages of ‘deposit’and ‘non-deposit’locations.Fig.5.Prospective targets area for Au deposits delineated by SVM.(a)and (b)are obtained using training dataset 1and 2,respectively.R.Zuo,E.J.M.Carranza /Computers &Geosciences ](]]]])]]]–]]]8。
南航大论文写作要求(最新版)
南京航空航天大学研究生学位论文撰写要求(2015年4月修订)一、学位论文的构成学位论文由三部分组成:学位论文前置部分、学位论文主体部分、学位论文附录部分。
二、各部分内容的要求1、学位论文前置部分包括封面、承诺书、中英文摘要、目录、图表清单、注释表。
(1) 封面按规定的格式、颜色统一到校印刷厂印制,详见附1。
论文编号:由学校代码,学院编号,年份(后两位),学生类别(S代表硕士,B代表博士)及三位序号组成。
示例:1028701 11-B026(南京航空航天大学航空宇航学院2011年第026号博士学位论文) 分类号:根据论文中主题内容,对照分类法选取中图分类号、学科分类号,著录在左上角。
中图分类号一般选取1~2个,学科分类号标注1个。
中图分类号参照《中国图书资料分类法》、《中国图书馆分类法》,学科分类号参照国务院学位委员会办公室颁布的《学术学位授予信息采集学科代码标准》。
题目:要能概括整个论文最重要的内容,具体、切题、不能太笼统,但要引人注目;题名力求简短,严格控制在25字以内。
学科专业:以国务院学位委员会批准的专业目录中的学科专业为准,填写二级学科名称;申请自主设置二级学科的学位,须填写一级学科名称,并同时填写自主设置二级学科名称,在自主设置二级学科名称上加括号,如力学(纳米力学)。
指导老师:指导老师的署名一律以批准招生的为准,如有变动要正式提出申请并报研究生院备案,且只能写一名指导教师,如有其他正式批准备案的导师,写在联合指导教师一项中(限一名)。
(2) 英文封面(详见附2)(3) 承诺书单设一页,排在英文封面后,请认真阅读承诺书内容,全面审视自己的论文,是否严格遵守《中华人民共和国著作权法》,对他人享有著作权的内容是否都进行了明确的标注,慎重签名。
详见附3。
(4) 中文摘要在论文的第一页,是学位论文内容的不加注释和评论的简短陈述,简要说明研究工作的目的、方法、创新性的成果和结论等。
硕士论文摘要中文字数400~600个字,博士论文摘要中文字数800~1000个字。
基于数字信号处理器的异步电机参数辨识实现
控制与应用技术基于数字信号处理器的异步电机参数辨识实现*刘述喜1, 王明渝2, 陈新岗1, 杨绍荣3, 贺晓蓉1(1.重庆工学院电子信息与自动化学院,重庆 400050;2.重庆大学电气工程学院,重庆 400044;3.重庆三峡水力电力(集团)股份有限公司,重庆 400400)摘 要:介绍了异步电机各种参数的辨识方法。
在不改变硬件系统的前提下,通过电流控制技术,向电机注入单相交流或直流电流,检测其响应,从而实现异步电机的参数辨识。
对一台未知参数的异步电机做仿真和试验研究,并将辨识出来的参数用于矢量控制系统,得到了良好的效果,证明了参数辨识的正确性。
关键词:异步电机;参数辨识;数字信号处理器中图分类号:TM 301.2:TM 343 文献标识码:A 文章编号:1673 6540(2006)10 0021 05I mple m ent of Para m eter Identification ofA synchronousM achineB ased on DSPLIU Shu xi 1, WANG M ing yu 2, C HEN X i n gang 1, Y ANG Shao rong 3, HE X iao rong1(1.Co llege of E lectr on ic I nfor m ation&Auto m ati o n ,Chongq i n g I nstitute ofTec hno l o gy ,Chongqi n g 400050,China ;2.Co llege of E lectrica lEngineeri n g ,Chongqing University ,Chongqi n g 400044,Chi n a ;3.Chongq i n g Three GorgeW ater Conservanly and E lectri c Po wer Co .,Ltd .,Chongqi n g 400400,Chi n a)Abstract :T he sche m e how to esti m a te param eters of asynchronous m ach i ne is introduced .T hen under t he pre m ise o f no chang i ng t he hard w are structure o f vecto r contro l o fm otor ,explaini ng the rea lization of i dentifica ti on of mo tor para m eters t hrough current contro l techno l ogy ,.i e .i nputti ng sing le phase alternati ng current or DC current to the mo tor ,t hen detecti ng t he response .A nd for an unknown para m ete rs mo tor ,the si m ulati on and exper i m ent st udies are perfor m ed ,the i dentified param eters are used to a vector contro l system,be tter effect i s ach i eved wh ich verify i ng the feasi b ilit y o f t he pa rame ters esti m ati on m ethod .K ey word s :asyn chronou s mach i n e ;para m eter identif ication ;digital signal processor*重庆市科委自然科学基金资助项目(C STC 2005BB6076)0 引 言异步电机的矢量控制具有动态性能好、调速范围宽等优点,但不足之处是它的控制性能严重依赖于电机参数的准确性[1]。
飞行术语中英对照整理
Aair superiority 制空权ambush 伏击assistance 援助、协助attitude 飞行姿态(高度)AW ACS(Airborne Warning and Control System) link 空中预警引导aye-aye, sir 遵命,长官(同affirmative)AB 加力ACM 空战机动ACT 空战战术AI 空中拦截Angles 用千英尺表示高度的快捷方式,例如,“Angles 20”表示高度20,000英尺。
Angle-off 敌机航向和你的航向之间的角度差,也称为航向交叉角或HCAArmour star hands 肥厚,笨拙的手,驾驶飞机时容易出错Aspect angle 你的飞机相对于敌机尾部的角度Attack geometry 进攻战机的追击路径Bbail out (爬出座舱)跳伞bearing 航向航向的划分方法是:以飞行员自己的飞机为圆心的一个与地平面平行的虚拟圆,分为360度,正北方为零度,正东方为90度,正南方为180度,正西方为270度,依此类推。
black out 由于指向头部方向的加速度过大而导致的大脑缺血,引起的暂时性失明和意识丧失。
bogie 原指妖怪或可怕的人,空战中指敌机。
bandit 原指强盗,空战中指敌机(同bad guy)bombing run 炸弹攻击brake 阻力板(降落时塔台呼叫call the ball 就是指放下阻力板)bug out 撤出战区bull's eye 原指圆形箭靶的中心,空战中的意思是“正中!”,类似的短语还有:direct hit, great balls of fun, splash one, one down等。
bingo 原指一种赌博游戏或烈酒,空战中指出乎意料地打中了敌机。
只够返回基地的燃料,其它已耗尽的状态。
Bar 雷达波的一次扫描Basic Fighter Maneuvers(BFM)在一对一空战环境中的基本空战机动Belly check 执行180度的翻滚,检查机身下方空间Boresight mode 雷达束固定指向机首前方,第一个进入雷达束区域的目标将被自动锁定Butterfly setup 一个战斗训练进入计划,两架战机开始以一字并列编队飞行,然后互相转离45度。
Fuzzy Support Vector Machines for Pattern Classification
In Section 2, we summarize support vector machines for
pattern classification. And in Section 3 we discuss the problem of the multiclass support vector machines. In Section 4,we discuss the method of defining the membership functions using the SVM decision functions. Finally, in Section 5, we evaluate our method for three benchmark data sets and demonstrate the superiority of the FSVM over the SVM.
two decision functions are positive or all the values are negative. To avoid this, in [3], a pairwise classification method, in which n(n - l ) / 2 decision functions are determined, is proposed. By this method, however unclassifiable regions remain. In this paper, to overcome this problem, we propose fuzzy support vector machines (FSVMs). Using the decision functions obtained by training the SVM, we define truncated polyhedral pyramidal membership functions [4] and resolve unclassifiable regions.
基于支持向量机的数控机床空间误差辨识与补偿(精)
基于支持向量机的数控机床空间误差辨识与补偿)))吴雄彪姚鑫骅何振亚等基于支持向量机的数控机床空间误差辨识与补偿吴雄彪1 姚鑫骅2 何振亚2 傅建中21.金华职业技术学院,金华,3210172.浙江大学,杭州,310027摘要:为了精确地建立数控机床空间误差预测模型,提出基于支持向量机的空间误差辨识与补偿方法。
通过训练学习,优化了最小二乘支持向量机参数,采用激光矢量分步对角线法在一台数控铣床上进行了空间误差辨识和补偿实验研究。
结果表明:支持向量机预测模型能有效辨识数控机床的空间误差。
与神经网络预测模型进行了比较,表明支持向量机空间误差预测模型补偿精度高、建模速度快,通过其补偿可有效提高数控机床的加工精度。
关键词:数控机床;空间误差;支持向量机;误差补偿中图分类号:TG659 文章编号:1004)132X(2010)12)1397)04 IdentificationandCompensationforVolumetricErrorsof CNCMachineToolsBasedonSVM1222WuXiongbiao YaoXinhua HeZhenya FuJianzhong1.JinhuaCollegeofProfessionandTechnology,Jinhua,Zhejiang,3210172.ZhejiangUniversity,Hangzhou,310027Abstract:Anovelmethodofidentificationandcompensationvolumetricerrorsformachinetoo lsbasedonSVMwaspresentedherein.Trainingandlearningtheerrorsamplesandoptimizingth eSVMparameters,thevolumetricerrormodelswereobtainedforcompensation.Bylaserseque ntialstepdiagonalmeasurement,thevolumetricerrorswereidentifiedandcompensated.Thee xperimentalresultsillustratethattheSVMmodelisadaptivetoforecastvolumetricerrorsofma paringtheneutralnetworkmodel,theSVMmodelofvolumetricerrorismorep ingtheSVMvolumetricerrormodel,itissucceededtoenhancetheprecision forCNCmachinetools.Keywords:CNCmachinetool;volumetricerror;supportvectormachine(SVM);errorcompen sation0 引言在数控机床上实现高精度加工的关键是提高机床的空间定位精度。
基于不确定性采样的自训练代价敏感支持向量机研究
第43卷第2期中南大学学报(自然科学版)、,01.43 N o.2 2012年2月Jou rn al o f C en tr al So uth Un ive rsit y(S cien ce and Tec hnology)Fe b.2012基于不确定性采样的自训练代价敏感支持向量机研究江彤1,唐明珠2,阳春华2 (1.湖南人文科技学院计算机科学技术系,湖南娄底,417000;2.中南大学信息科学与工程学院,湖南长沙,410083)摘要:针对样本集中的类不平衡性和样本标注代价昂贵问题,提出基于不确定性采样的自训练代价敏感支持向量机。
不确定性采样通过支持向量数据描述评价未标注样本的不确定性,对不确定性高的未标注样本进行标注,同时利用自训练方法训练代价敏感支持向量,代价敏感支持向量机利用代价参数和核参数对未标注样本进行预测。
实验结果表明:该算法能有效地降低平均期望误分类代价,减少样本集中样本需要标注次数。
关键词:主动学习;代价敏感支持向量机;自训练方法;不确定性采样:支持向量数据描述中图分类号:T18 文献标志码:A文章编号:1672-7207(2012)02-0561---06Self-training cost—sensitive support vector machine withuncertainty based on samplingJIANG Ton91,TANG Ming-zhu2,YANG Chun-hua2(1.Department ofComputer Science an d T ec hn ol og y,Hu na n Institute ofHu man iti es,Sc ie nc e an d T ec hn ol og y,Lo u di417000,C h i n a;2.S c h o o l o f I n f o r m a t i o n Science&Engineering,Central South Un ive rsi ty,Cha ngs ha 410083,China)Ab s tr ac t:S el f-t ra i ni ng c os t-s en s it i ve su p po r t v e c t o r machine wi t h u n c e r t a i n t y b as e d s a m p l i n g fscu)Ⅵ邯propose d to solve t w o di衔c ul ti es of class-im balanced d at ase t and ex p e n s i v e l ab e l e d cost.111e u n c e r t a in t y o f unlabeled sample wag evaluat ed using suppor t vec to r data des c r i p ti o n in u nc e rt a i nt y based sampling.111e unlabeled sample with hi gh u nc e r t a in t y w a s s e le c t e d to be label ed.Cost-se nsit ive su ppo rt vect or mac hi n e wa s trai ne d using self-training ap pr oa ch.C ost parameters and kernel parameters of c o s t-s e n si t i v e s up p o r t vector mach i ne were employed to predict a class labelfor an unl a b el e d s a m pl e.11h e results show that SC U e f fe ct i ve ly re du c es both averag e expected misclassification co sts and l ab e l ed times.Key w ord s:acti ve l ea rn in g;co st-s en si ti ve s u p p or t v e c to r ma c h i ne;s e l f-tr a i n in g approach;uncert ainty base d s am pl in g;support vector data des c r ip t io n复杂工业过程一旦发生故障,有可能直接影响产息化普及,生产过程中积累了丰富的数据,数据中蕴品数量和质量,甚至会发生系统瘫痪或灾难性事故,藏着反映生产规律的信息。
Vector machines, Gaussian processes and
Bayesian model selection for Support Vector machines, Gaussian processes and other kernel classi ers
Institute for Adaptive and Neural Computation University of Edinburgh 5 Forrest Hill, Edinburgh EH1 2QL seeger@
Matthiariational Bayesian method for model selection over families of kernels classi ers like Support Vector machines or Gaussian processes. The algorithm needs no user interaction and is able to adapt a large number of kernel parameters to given data without having to sacri ce training cases for validation. This opens the possibility to use sophisticated families of kernels in situations where the small \standard kernel" classes are clearly inappropriate. We relate the method to other work done on Gaussian processes and clarify the relation between Support Vector machines and certain Gaussian process models.
基于分形维数的离线图像分割方法_叶小岭
文章编号:1674-7070(2013)01-0069-04基于分形维数的离线图像分割方法叶小岭1刘太磊1胡凯1摘要为了能够提高图像分割的精度,提出了一种基于统计学和分形维数的图像分割方法,能够对自然景物中树木、道路和天空进行分割,并且可以应用于机器人导航的视觉系统.该方法首先通过统计大量的道路和树木和天空的分形维数(LFD),分析三者对应的LFD值分布特点,然后利用该特点对图像进行分割,最后对分割后的图像进行平滑处理,得到分割结果.实验结果表明:利用统计结果进行图像分割能够提高分割速率,而且使用分形维数作为特征能够得到比较精确的分割效果.关键词统计学;分形维数;毯子覆盖法;图像分割中图分类号TP391.4文献标志码A收稿日期2012-02-20资助项目江苏省产学研联合创新资金(BY20 11111);江苏省高校优势学科建设工程项目作者简介叶小岭,女,教授,硕士生导师,主要研究领域为系统优化与控制、智能仪器仪表、计算机应用等.xyz.nim@163.com1南京信息工程大学信息与控制学院,南京,2100440引言图像中不同的物体表面会呈现出不同的纹理特性.为了能够区分出兴趣目标与背景,纹理分割已经成为数字图像处理的一个非常重要的研究方向.已有的纹理分割方法多数是基于传统欧氏几何理论的空间域和频域内的分割方法,其主要弊端是欧氏几何理论不能够描述形状复杂的自然场景.Mandelbrot等[1]在20世纪70年代创立了区别于传统几何理论的分形几何理论,提出了“分形”、“分形维数”等影响深远的概念.分形维数是分形几何理论的基本概念之一,目前已得到广泛重视,成为描述自然现象的重要参数,并应用于图像分割、图像压缩以及计算机视觉领域.Peleg等[2]将分形维数作为图像的纹理特征,分析了不同分辨率下图像的纹理特性,实现Brodatz纹理图像库中各种纹理图像的归类.Novianto等[3]提出了一种计算图像分形维数(LFD)的算法,并结合聚类算法应用于自然景色的图像分割.Mavroforakis等[4]将分形维数特征结合神经网络及支持向量机运用于医学图像中肿块的检测.Yoshida等[5]提出了一种新的图像二值化方法,可以将分形维数较高部分与较低部分分割开来,但是只适用于目标区域与背景区域分形维数相差较大的情况,而且需要对256个阈值分割的结果分别进行LFD计算,计算量较大,运行时间较长.本文提出了一种基于统计学和分形维数的图像分割方法,将分形维数作为图像特征,通过统计学方法对其归类,从而达到图像分割的目的.本文分别对自然景物中树木、道路和天空进行分割,可以应用于机器人导航的视觉系统.其优点在于,在统计结果的基础上可以提高图像分割的速度,另外,采用LFD算法计算图像的分形维数可以提高后期图像分割的精度.1LFD算法介绍1.1毯子覆盖法Peleg等[2]于1984年提出了覆盖法求分形维数.假设图像的表面g(i,j)被毯子所覆盖,上毯子表示为uε,下毯子表示为bε,则毯子表面计算公式为uε(i,j)={max uε-1(i,j)+1,max|(m,n)-(i,j)|≤1uε-1(m,n}),(1)b ε(i ,j )={max b ε-1(i ,j )+1,max|(m ,n )-(i ,j )|≤1b ε-1(m ,n }),(2)其中(m ,n )表示与像素坐标(i ,j )距离小于1的邻域像素坐标.定义上下毯子的初始值为u 0(i ,j )=b 0(i ,j )=g (i ,j ),g (i ,j )表示坐标(i ,j )处对应的图像灰度值.假设ε表示毯子数(毯子厚度),那么毯子的面积A (ε)为A (ε)=∑i ,j (u ε(i ,j )-b ε(i ,j ))2ε.(3)已知文献[1]定义的分形表面公式为A (ε)=F ε2-D ,F 是常数,D 是图像表面的分形维数.对等式两边取对数,得到ln {A (ε)}=ln F +(2-D )ln ε,那么可以通过ln {A (ε)}与ln ε的线性关系计算得到分形维数.1.2LFD map 生成LFD [6]的概念是与全局分形维数(GFD )的概念相对应的,GFD 是指使用毯子法覆盖整幅灰度图像计算得到的分形维数,而LFD 的计算方法为:假设图像大小为M ˑN ,以像素坐标(i ,j )为中心,取窗口大小为w 的区域,使用上述毯子覆盖法计算该区域的分形维数,将计算的结果赋给坐标(i ,j )对应像素.对于二维灰度图像.分形维数的计算结果应该在2.03.0之间.LFD map 图像计算公式为G (i ,j )=255ˑD (i ,j )-D minD max -D min,(4)其中G (i ,j )是像素坐标(i ,j )处LFD map 图像的灰度值,D max 和D min 分别表示LFD 值的最大值和最小值.最后生成的LFD map 如图1b 所示,其中浅色像素表示分形维数较高部分,深色表示分形维数较低部分.由此,可以看出树木、草地以及边缘部分分形维数较高.图1LFD map 的生成Fig.1Generation of LFD map2实验与仿真2.1统计数据的获得本文的图像分割方法是建立在统计学的基础上进行的,需要大量统计数据的支持.1)拍摄天空、道路和树木图片各1000张(如图2所示),形成一个统计数据库.2)对图片库中所有图片进行预处理,包括图像灰度化和直方图均衡化,将图片调整统一大小(本文为360像数ˑ480像数).图2统计图像库示例Fig.2Examples of image database3)按照前文所述方法,计算每幅图像LFD 的平均值(毯子的厚度ε=44,窗口大小取3ˑ3[3]),即得到树木、道路和天空各1000个平均值,统计各向量中平均值出现的次数,最后将其拟合为曲线,如图3所示,可以得到天空、道路和树木对应的3个分形区域.分析图3可知,树木的分形维数较高,天空的分形维数较低,而道路的分形维数与前两者均有重叠,这是由于在采集道路的图片库时,一部分是在距离较远处拍摄,分辨率较低,道路表面粗糙程度不明显,相应的分形维数较低,而另一部分拍摄距离较近,分辨率较高,道路表面较粗糙,相应分形维数较高.由此可以看出,仅仅使用分形维数这一特征无法将道路提取出来,针对这一情况,后期可以引入颜色作为辅助特征进行判断.2.2图像分割由图3可以看出,在不考虑道路的情况下,树木和天空存在明显的阈值(2.20),故可以根据将图像分割为两个部分:Bw (i ,j )=0,D LF (i ,j )≥2.20,1,DLF(i ,j )<2.20{.(5)7叶小岭,等.基于分形维数的离线图像分割方法.YE Xiaoling ,et al.A new image segmentation method based on statistics and local fractal dimension.图3LFD均值统计结果曲线拟合Fig.3Curve fitting results of the average of LFD2.3后期处理图4为道路分割和平滑效果.由图4b和4e可以看出:距离相机近的路面,由于分辨率较高,分形维数相应较高,而由于阳光照射角度不同,树木在地面有时会有阴影,同样也会提高该区域道路的分形维数.有鉴于此,本文使用了平滑处理.1)利用数学形态学的方法对图4b、4e中杂小区域进行处理.为了保留道路的两条中线(以便后期机器人导航使用),本文使用垂直方向的直线对图4b和4e先膨胀后腐蚀.2)在第一步的基础上去除图像中面积较小区域.平滑后的效果分别如图4c和4f所示.图4道路分割及平滑效果Fig.4Segmentation and smoothingresults of the proposed method2.4实验结果本文提出的图像分割方法最终效果如图4c和4f 所示.分别使用Otsu方法、Niblack方法以及文献[5]中的方法对图4a和4d进行图像分割,得到的结果如图5所示,运行时间如表1所示.由此可以看出,Otsu 方法虽然运行时间最快,但无法精确地将树木在道路上的投影与树木分割开来,Niblack的方法分割效果并不理想,文献[5]中的方法可以得到精确的分割结果,但是需要对256个阈值分割的结果分别进行LFD 计算,计算量较大,运行时间较长.而使用本文所提出的方法是在统计结果的基础上进行图像分割,提高了运行速度,并且可以将树木、道路和天空的交界处精确地描述出来.如图6所示为K均值聚类方法与本文方法在不同的图像尺寸下运行时间的比较.另外,可以保留一些有用的边缘信息(例如本例中道路中央黄线,可以用于后期机器人导航).图5相关方法分割效果比较Fig.5Segmentation result comparision of different methods表1图像分割不同算法的运行时间比较Table1Runtime comparision of4image segmentation methodss 原始图像Otsu方法Niblack方法文献5方法本文方法图4a0.177411.253031.2682 5.3761图4b0.097910.005130.8824 5.2652图6本文方法与K均值聚类方法运行时间比较Fig.6Runtime comparision between the K-meansclustering method and the proposed method3结论本文提出的这种图像分割方法具有以下几个优17学报:自然科学版,2013,5(1):69-72Journal of Nanjing University of Information Science and Technology:Natural Science Edition,2013,5(1):69-72点:1)本文采用统计学的方法实现图像分割能够提高速度;2)采用LFD 算法计算图像的分形维数可以提高后期图像分割的精度,天空、树木的交界处以及道路、树木的交界处能够被精确描述,还能够保留图像中一些重要的边界信息,方便后期应用.本文方法的缺点是图像像素的分形维数容易受到光线角度的影响,需要以后进一步研究.参考文献References[1]Mandelbrot B B ,Passoja D E ,Paullay A J.Fractal char-acter of fracture surfaces of metals [J ].Nature ,1984,308(5961):721-722[2]Peleg S ,Naor J ,Hartley R ,et al.Multiple resolution tex-ture analysis and classification [J ].IEEE Trans PatternAnal Mach Intell ,1984,6(4):518-523[3]Novianto S ,Suzuki Y ,Maeda J.Near optimum estimationof local fractal dimension for image segmentation [J ].Pattern Recognition Letters ,2003,24(1/2/3):365-374[4]Mavroforakis M E ,Georgiou H V ,Dimitropoulos N ,et al.Mammographic masses characterization bases on localized texture and dataset fractal analysis using linear ,neuraland support vector machine classifiers [J ].Artificial In-telligence in Medicine ,2006,37(2):145-162[5]Yoshida H ,Tanaka N.A new binarization method for a sign board image with the blanket method [C ]∥ChineseConference on Pattern Recognition ,2009:1-4[6]Chen F ,Ji G R ,Cheng J N.Image edge detection based on improved local fractal dimension [C ]∥Proceedings of the IEEE International Conference on Natural Computa-tion ,2008:640-643[7]Ma L ,Shan Y J.Integration of fractal and grey-level fea-tures for texture segmentation [C ]∥Congress on Image and Signal Processing ,2008:687-691[8]Zhao H X ,Wang S C.Application of fractal mathematics in mode recognition and image processing [C ]∥Interna-tional Conference on Measuring Technology and Mecha-tronics Automation ,2011:536-538[9]Theiler J.Estimating fractal dimension [J ].Journal of the Optical Society of America ,1990,7(6):1055-1073[10]梁东方,李玉梁,江春波.测量分维的“数盒子”算法研究[J ].中国图像图形学报:A 辑,2002,7(3):40-44LIANG Dongfang ,LI Yuliang ,JIANG Chunbo.Research on the Box Counting algorithm in fractal dimension meas-urement [J ].Journal of Image and Graphics :A ,2002,7(3):40-44A new image segmentation method based onstatistics and local fractal dimensionYE Xiaoling 1LIU Tailei 1HU Kai 11School of Information and Control ,Nanjing University of Information Science &Technology ,Nanjing 210044Abstract To improve the precision of image segmentation ,a new method based on statistics and fractal dimensionis proposed.This method can be used to segment natural images like tree ,road and sky ,furthermore ,it can also be used in vision system of robot.The proposed method can be roughly divided into three steps ,firstly ,collect vast a-mount of Local Fractal Dimension (LFD )of tree ,road and sky ;secondly ,analyze the LFD distribution characteristics of the three textures ,which are mentioned above ;finally ,segment those three textures according to the distribution characteristics ,and then ,smooth and filter the image by using mathematical morphological methods.Compared with some other methods ,experimental data show that this proposed method can improve the precision of segmentation by using LFD as the feature.Key wordsstatistics ;local fractal dimension (LFD );the blanket method ;image segmentation27叶小岭,等.基于分形维数的离线图像分割方法.YE Xiaoling ,et al.A new image segmentation method based on statistics and local fractal dimension.。
交通标志论文:自然场景下交通标志图像识别方法研究
交通标志论文:自然场景下交通标志图像识别方法研究【中文摘要】随着社会进步和经济的发展,我国的公路交通行业得到了持续、快速地发展。
高度发达的现代交通为人类的生活带来了便利,但同时交通安全、交通拥挤等问题也变得越来越严重。
为了解决这些问题,智能交通系统ITS (Intelligent Traffic System) 这一研究领域应运而生。
道路交通标志识别系统TSR (Traffic Sig ns Recognition)作为智能交通系统研究方向,已成为国内外学者研究的热点之一,它通过安装在机动车辆上的摄像机摄取自然场景图像,并将图像送至系统的图像处理模块进行图像理解、交通标志检测与识别最后将识别结果告知驾驶员,以达到增强道路交通安全、降低交通拥挤的。
道路交通标志中,警告标志、禁令标志和指示标志是三种最重要、也是最常见的交通标志,它们均具有特定的颜色和形状用以区分其他物体,以达到提醒驾驶员或行人的。
十几年以来,道路交通标志识别研究有了很好的进展,并取得了一定的研究成果,但因背景复杂性以及光照等各种影响因素的存在,导致了它比非自然场景下的目标识别更具有挑战性,其影响因素主要表现在以下方面:光照条件时常变换且不可控、车辆震动导致摄取的图像模糊、交通标志被损坏、被污染或被遮挡、交通标志颜色褪色、雨雾等恶劣天气的存在,以及投影失真、尺度变换、倾斜、相同颜色背景等。
从交通标志的颜色信息和形状特征出发,研究一种交通标志的智能检测算法。
该算法主要包括基于HSV(Hue-Saturation-Values) 颜色空间的交通标志图像分割和基于颜色与形状的交通标志检测两部分。
首先将RGB(Red-Green-Blue) 图像转换至受光照影响较小的HSV颜色空间,通过提取不同颜色的阈值范围来定位目标区域;再根据目标区域的几何形状来划分警告、禁令和指示三种不同类别的交通标志,完成交通标志图像的检测。
针对自然场景下影响交通标志检测效果的不利因素,研究了基于多尺度Retinex 的交通标志图像增强和仿射变换的三角形交通标志校正算法,以及规范化圆形交通标志和矩形交通标志的算法。
宝赛菌株和质粒目录-2014
pET-3a(+)pET-3c-sumo pET-3d(+)pET-11a(+)pET-12a(+)pET-15b(+)pET-16b(+)pET-17b(+)pET-19b(+)pET-20b(+)pET-21a(+)pET-21b(+)pET-21d(+)pET-22b(+)pET-23a(+)pET-23b(+)pET-24a(+)pET-25b(+)pET-26b(+)pET-27b(+)pET-28a(+)pET-28b(+)pET-28c(+)pET-28a(+)-sumo pET-28a(+)-GFP pET-28a(+)-SMT3 pET-29a(+)pET-30a(+)pET-30c(+)pET-31b(+)pET-32a(+)pET-32a(+)pelB pET32a-LC3pET-32a(+)LicpET-35b(+)pET-38b(+)pET-39b(+)pET-40b(+)pET-41a(+)pET-42a(+)pET-43.1a(+)pET-44a(+)pET-49b(+)pET-50b(+)pET-52b(+)pMCSF1pMSCF3pRSFDuet1 pCDFduet1pColA duetE.coli EPI300Ecoli MC1061BL21BL21goldBL21(DE3)BL21(DE3)plysSBL21(DE3)GoldplysS BL21AIBL21SIBL21codonplusRIPL BL21codonplus(DE3) BL21TrxB(DE3)BL21(DE3)RDBL21Star(DE3)BL21RPBL21RILRosetta(DE3)Rosetta(DE5)Rosetta Blue(DE3) Rosetta(DE3)plysS Rosetta2(DE3)Rosetta2(DE3)plysS Rosetta-gami(DE3) Origami(DE3)Origami2(DE3) OrigamiB(DE3)Origami(DE3)plysS Rosetta-gami 2(DE3)placI Rosetta-gami 2(DE3)pLysS Rosetta-gami B(DE3)pLysS BLR(DE3)Novablue(DE3)B834(DE3)JM83AD494(DE3)HM174(DE3)HM174(DE3)plysSC41(DE3)C41(DE3)plysSC43(DE3)Turner(DE3)Turner(DE3)plysSJM101JM105JM109JM109(DE3)DH5aλpirK12MG1655XL1blueXL10 goldTop10Top10F’BM25.8BW23473SG1117TurboT1TG1TB1M15ER2566C2566ER2529ER2738HB2151S-17SM10JF1125BJ5183HB101K802C600JM110TH1AM1W3110DH10bacY1089Y1090Antarctic Express Antarctic Express(DE3) Antarctic Express(DE3)RP Antarctic Express(DE3)RILDH5αJM110TOP10XL2-Blue MRF’Mach1 T1OmniMAX2 T1 Phage-Resistant Cells EZ10DH10BSUREpubs 520DB3.1pCold IpCold IIpColdIIIpCold-sumopCold-TFpPin point xa1pPin point xa2pPin point xa3pPin point CATpMAL-c2xpMAL-p2xpMAL-p5xpMAL-c5xpMAL-c4xpMAL-p4xpMAL-p5epTWIN1pTWIN2pLLP-OmpApLLP-STIIpMBP-PpMBP-CpET-TrxpET-HispTrc-CKSpET-DsbApET-MBPpET32M3CpGEX-4T-3pGEX-5x-1pGEX-6p-1pGEX4T-1/Gst-His-HA pGEX3XpGEX-1λT-6His-GST-PP pGEX2tkpkk223-3pkk232-8pBV220pBV221pBV222pRsetApRsetBpRsetCpRsetB td Tomato pBBRMCS pBBRMCS2 pBBRMCS3 pBBRMCS4 pBBRMCS5pTrcHisApTrcHisBpTrcHisCpBADHis ApBADHis BpBADHis CpBADMycHisA pQE2pQE9pQE30lacIqpQE30pQE31pQE32pQE40pQE60pQE70pQE80L pQETrisystem pEZZ18pSE380pWHM3pQBI63pTrc99A pTrcMECTpTrcpTYB1pTYB2pTYB11pTYB12pCYB1pBAD18pBAD24pBAD33pBAD43pBAD24-GFP pDest15-N-throbin pDest22pDest32pED-Trx-pp-air pED-pppED-DsbA-pp-airpED-GST-pp-airpCWori+pITG TrxpT7tspTT5pALEXpUC18pUC19-GFPpUC19pUC18-p43pUC118pGEM3ZF+pEGM-7ZF(+)pEGM-11ZF(+)pKT100pME6032SuperCosmidpBR322pACYC184pACYC177pBluescript II SK(+) pBluescript SK(+) pBluescript II KS(+)pG-KJE8pGro7pKJE7pG-Tf2pTf16pEC86pET-28a-fabppETcoco-1酵母杂交系统pACTpACT-MyoDpBIND-IdpG5 lucpCMV-BDpCMV-ADpBD-p53pBD-NF−κBpAD-SV40TpAD-TRAFpFR-lucpGBKT7pGADT7pCL1pGBKT7-53pGADT7-TpGBKT7-LampACT2 ADY187AH109pSospMyrpSos MAFBpHis2pHisSi-1p53bluepGAG424pLaczip53hispGAD53mpBridgeY2H Gold Yeast strain Y187pINDpOPRSVIpOPI3CATpCMVLacIpSwitchpGene v5-His BpcDNA4/TO/Myc-His A pcDNA4/TO/Myc-His B pcDNA4/TO/Myc-His C pcDNA4/TO/Myc-His/LacZ pcDNA6/TRpTet-OnpTet-OffpTRE2pTRE2 hygropTK-hygpRevTet-OnpRevTet-OffpRevTREpTet on advancedpTRE TightpTRE Tight Luciferase pCMV-Tet3GpTRE3GpTRE3G-LucpNI v2pNG v2pNN v2pHD v2pTALETF v2 (NN) pTALETF v2 (NG) pTALETF v2 (NI) pTALETF v2 (HD) pTALEN v2 (NI) pTALEN v2 (NG)pTALEN v2 (NN)pTALEN v2 (HD)pX335pX260pX320pShuttle-CMVpShuttlepAdTrack-CMVpAdTrackpAdEasy-1BJ5183AdEasy-1pAd/BLOCK-iT-DEST RNAi Gateway Vector pDC315pBHGloxdelE13crepShuttle-IRES-hrGFP2pAAV-MCSpAAV-RC2pAAV-RCpHelperpAAV-LacZpAAV-IRES-hrGFPpCMV-MCSRNAi-Ready pSIREN-RetroQRNAi-Ready pSIREN-RetroQ-DsRed-Express RNAi-Ready pSIREN-RetroQ-ZsGreen pSilencer5.1-U6-RetropSilencer5.1-H1-RetropRetroX-IRES-ZsGreen1pRetroX-IRES-DsRed ExpresspLNCX2pQCXIHplvx-RNA1plvx-RNA2pMSCVpuropSuperpSuper retro GFP-neo pSupeior puropBabe puropCMV-Gag-polpCMV-VSV-GpCL AmphopSuper retro puropLVX-shRNA1pLVX-shRNA2pSicoRpSicoR pGK puro pLentilox 3.7pLKO.1 puroPLKO.1pLKO.3GpTet-PLKO-puro pPRIME-TET-GFP-FF3 pPRIME-TREX-GFP-FF3 pSIH1-H1-CoGFPpGIPZ emptypGIPZ controlpTRIPZ emptypTRIPZ controlpLVX-DsRed-Monomer-N1 pLVX-AcGFP1-N1pLVX-IRES-ZsGreen1 pLVX-IRES-td Tomato pLVX-IRES-mCherry pLVX-IRES-neopLVX-IRES-puropLVX-puropLVX-Tight-PuropLVX-Tet-On-AdvancedpLVX-TRE 3GpLVX-Tet 3Gplvx-i-TDpLenti4 TO/V5 DESTpLenti6 V5 DESTpLenti6-UbC-V5-DESTpCDH-CMV-MCS-EF1-copGFP pCDH-CMV-MCS-EF1-puro pCDH-CMV-MCS-EF1-GFP+Puro pCDH-EF1-MCS-T2A-Puro pWPXLFUGWpLVTHpLVTHMpLp1pLp2pLp VSV GpCgpVpVSV GpCMV-dR8.91pCMV-VSV-GpSPAX2pMD2.GpMDLg pRREpRSV-revpCMV-VSV-GpLentG-KOSMTetO-FUW-OSKMFUW-M2rtTAFUW-tetO-hOCT4FUW-tetO-hSOX2FUW-tetO-hKLF4FUW-tetO-hMYCpCDF1-MCS2-EF1-copGFPpFP93LeGO-iC2pLOX-CW-CREpLOX-CWBmi1pLOX-TERT-iresTK pLOX-Ttag-iresTK pGensil-1Stbl3Stbl4Stbl4PiggyBac Dual promoter Super PiggyBac Transposase pGAS-TA-LucpSTAT3-TA-LucpISRE-TA-LucpTA-LucpIκB-EGFPpNFAT-TA-Luc pCaspase3-sensorpAP1(PMA)-TA-Luc pCRE-LucpGRE-LucpHSE-Lucp53-LucpAP-1-LucpNF-κB-Lucp38-betap38-бp38-γπp-JNKK2pCDNA3.1-flag-TRAP2 pCDNA3.1-flag-TAK1HA-ASK1pCDNA3-myc-MEKK3pCMV-MDMZpEGFP-N1-raf通路pSRE-LucpGL4.10pGL4.13pGL4.19pGL4.26pGL4.27pGL4.29pGL4.30pGL4.75TOPFlashFOPFlash SuperTopFlashpE2F-TA-LucpSilencer1.0pSilencer 2.1-U6 hygro pSilencer3.0-H1 pSilencer 3.1-H1 hygro pSilencer 3.1-H1 neo pSilencer 4.1-CMV neo pSilencer 4.1-CMV puro pMIR-REPORT Luciferase pSV-β-galacosidase psiCHECK-2pmir GLOpGenesil-1pmR mCherrypmiRZip anti-microRNA pRNATin-H1.2/Neo pRNATin-H1.2/Hygro pRNATin-H1.2/Retro pRNA-H1.1/Adeno pYES2pYES2-EGFPpYES2-KanpYES2-flagpYES3/CTpYES6/CTpYES2 NTApYES2 NTBpYES2 NTCpFA6a-GFP(S65T)-His3MX6_1x pFA6a-GFPS65T-KanMX6 pESC-LeupESC-TrppESC-HispESC-UrapRS316pRS316HApRS426pRS426galpRS41HpRS41H-启动子pYCP211pYIP5pYRP7pYX212pYIP211pADH2PACT2-ADpDR195p416GFDpYEplac112pYEplac195Ycp22lac-EGFPYcplac33pAUR123pUG6pUG35pSH65pSH63pSH47 INVSc1S288cW303-1ABY4741BY4742FY837YPH499YS58YM4271 pPIC9pPIC9kpPIC9khis pPIC9kmut pPICZalphaA pPICZalphaB pPICZalphaC pPICZA pPICZB pPICZC pGAPZalphaA pGAPZalphaB pGAPZalphaC pGAPZA pGAPZB pGAPZC pPIC.5k pPIC3.5pAO815 pPink HCpPink LCpPink HC alphapMET αBpMET αCpPIC ZαGBpPIC ZαFCpPIC ZαDpHIC-PILA503ZαGApFLD-catp334BFD15pink1pink2pink3pink4SMD1168SMD1168HSMD1163KM71KM71HGS115GS190GS200JC308JC220pENTR 1ApENTR3CpDONR221pDONR ZeopcDNA6.2-GWEmGFP-miR negative pLenti6/TRDB3.1pYD1EBY100pDisplay植物载体pcambia35s-EGFP pcambia2301-101 pcambia35s-EYFP pcambia35s-ECFPpTCK303pBI101pBI221-GFPpBI121pBI121-GFPpcambia1300-221Super1300pcambia1300GFP pcambia 1200pcambia1201pcambia12811Zpcambia1300pcambia1301pcambia1302pcambia1303pcambia1304pcambia1305.1pcambia1305.2pcambia1380pcambia1381xcpcambia1381xapcambia1381pcambia1381xbpcambia1381Zpcambia1390pcambia1391pcambia1391Zpcambia1391xapcambia1391xcpcambia1391xbpcambia2200pcambia2300pcambia2301pcambia0380pcambia0390pcambia2201pcambia1291ZATCC15834 发根农杆菌LBA9402pKANNIBALpHANNIBALpGreen029pSPYCE(M)pSPYCE(MR)pspYCEMpspYCEMRpspYNE173pspYNE173RpSAT4 nVenus-CpSAT1 CCFP-NpSAT1 CCFP-CpPZp-RCS2-BarpSAT6 nCerulean C(A+)p416GFDpDF15枯草芽孢杆菌表达系统pMUTIN4pMUTIN4GFPpCSN44pMA5pMA5MCSpMA-H3pMA09-JpMA09-HpHY300pHY300plkpXMJ19pIL253pZL507pDG1363pSG1154pHT304pHCMC04 pHCMC05pAD43-25pMA5pMA5MCSpGFP315pGFP22pHT01pHT43BCL10501A75WB600BS168WB600WB800WB800NpHIS1525pK18mobSacB巨大芽孢菌 WH320巨大芽孢菌YYB pHT1谷氨酸棒杆菌ATCC13032其他酵母表达系统pKLAC1GG499pSEP1pSEP2pSEP3粟酒酵母SP01粟酒酵母SPQ01T载体系列pCXSNpCUXNpXDGpXDR细胞通路载体p38-betap38-бp38-γπp-JNKK2pCDNA3.1-flag-TRAP2 pCDNA3.1-flag-TAK1HA-ASK1pCDNA3-myc-MEKK3 pCMV-MDMZpEGFP-N1-raf通路JM109 MEKK1 fl pcDNA3-HA-TAB1ERK (WT)Pgex4t1-GST-IKB pSicheck2pQCXINpDONR221Top10 pROSEJM109 MEKK1 fl ERK (WT)pBabc-puroHA A8F1HAcdc37pcMV flag MEKK1 pBARK1pBD67其他表达系统pOS7001pHGF9050pIJ8860pRK2013pVLT33pJLA 502pIL253PSM-402pAP-B03pECX99EpXJM19pCS2-HApDG148pMK3S17-lamda pir pMG36ePYCTpNZ8048MG1363NZ9000pBIG2R-HPH2pAN7-1pAN8-1pKC1139哺乳动物真核表达质粒pSR-GFP/NeopVAX1pBudCE4.1pSFV1pCEP4pUB6/V5 His ApUB6/V5 His BpUB6/V5 His CpUB6/V5 His lacZpCMV5pCMV-Tag 2 BpCMV-Tag 5BpSecTag2 ApNTAP-BpBK-CMVpTracer CMV2pSG5pCI-neopCIpSIpCMV-mycpCMV-HApIRESpIRES neopIRES neo3pIRES hyg3pIRES puro3pIRES2-EGFPpIRES-hrGFP-1a pIRES-Dsred-Express pBI CMV1pFLAG CMV2pVitro2-neo-mcs pCAGGSpWHEREpBC1pcDNA4T/0pcDNA4/V5-His A pcDNA6TRpCDNA3.0-GFPpCDNA3.0pcDNA3 flagpcDNA3 EGFPpcDNA3.1HApcDNA3.1MycpcDNA 3.1-c-His/myc pcDNA3.1(+)/Myc HisB pcDNA3.1(zeo)+ pcDNA3.1/GSpcDNA3.1(-)/myc-His A pcDNA3.1-V5-HisA pcDNA3.1-CT-GFP pcDNA3mito RFP pcDNA4/HisMax B pcDNA6-Myc/His B pIE1pIEX-bac3pCDNA6 myc His C pfastbacIpFastBacIIIpFastBacI-Gus pFastBacHT ApFastBacHT B pFastBacHT C pFastBacHT-CAT pFastBacDualpFastBac-c-His-TEV pFastBac-N-GST-TEV pBlueBacHis2 A pBlueBacHis2 B pBlueBacHis2 CpIEXBac-1pIEXBac-c-EGFP-3 pIEXBac-c-EGFP-1 pIEXBac-c-EGFP-4pVL1392pVL1393pVL1392-XyIE control vector pCo-blastpAc5.1apAc5.1bpAc5.1 V5-His BpMT/V5 HispMT-Bip-V5-HisApEF1/myc-His BpEF1/V5 HisApEF4/V5 HisApEF6/Myc-HisCpMIB v5-His BpIZT/V5-Hissf9细胞High Five细胞DH10BacpFBDMpUCDMpBADZ-His6CreDH10MultiBac pDsRed2-N1 pDsRed2-C1 pDsRED-Monomer-N1 pDsred2-4T-1 pDsred2-32a pscherry1pmCherry-N1 pmCherry-C1pRFP-N3pEYFP-N1pEYFP-C1pEGFP-N1pEGFP-C1pEGFP-N3pEGFP-C3pECFP-N1pECFP-C1pEGFP-1pAcGFP1-1pDsRed2-1 pZsYellow-N1 pAmCyan-N1 pLEGFP-N1 pLEGFP-C1 pDsRED2-ER pDsRED2-Mito pDsRED2-Nuc pDsRED2-Peroxi pDsred-Express C1 pEYFP-GolgipEYFP-MempEYFP-ERpEYFP actinpEYFP tubpEGFP-ActinpECFP-ERpECFP-Mito pPAmCherry-Tubulin mRFP-ERDkeima fluorescent protein mAmetrinepDsRed-Monomer pEBFP-N1pEBFP-C1pEBFP-C2pAcGFP1-1pGreen0029pSicor EGFPpCDNA3.1(-)-MRFP pGL3-basicpGL3-controlpGL3-enhancerpGL3-promoterPGL3-SV40PGL3-CM 转DH6αpRL-TKpRL-CMVpRL-SV40pGluc-basicpSIM5pSIM6pKD3pKD4pKD20pKD46PCP20基因野生型P53myc-hTERTSV40 Large TTaqklentaqPfuKODT4 DNA ligaseT4 PNKTEVSUMOSUMO protease 2pP-αSUMO3GST-3CLYZ-pPICZα(溶菌酶表达毕赤酵母载体)Vaccinia viru Topoisomerase Imouse RITNFTGFLysozymeM-MLVM-MLV(-)CreCCDB-TsfGFP大肠杆菌DNA聚合酶I大片段(klenow)大肠杆菌DNA聚合酶I普通微生物菌种红发夫酵母(自己分离)酿酒酵母S288c(自己分离)葡萄牙假丝酵母(自己分离)多形汉逊酵母 NRRL Y-11170白色念珠菌ATCC 10231酿酒酵母 S.c197管囊酵母(自己分离)原毛黄孢平革菌 ATCC 34540热带假丝酵母 NRRL Y-12968马其顿假丝酵母CICC 1765巴斯德假丝酵母 NRRL Y1603三角酵母(自己分离)南极假丝酵母NRRL Y-7954 Ash bya gossypii NRRL Y-1056酿酒酵母 CGMCC2.1多形汉逊酵母 NRRL Y-11170出芽短梗霉 CICC40333黑曲霉(自己分离)简青霉CGMCC3.4402热带假丝酵母 CICC 1272树干毕赤酵母 NRRL Y-11545光滑假丝酵母NRRL Y-65长娄德酵母CGMCC2.1589季也蒙毕赤酵母CGMCC2.1801克鲁维酵母K.aestuarii Y-5370出芽短梗霉 NRRL Y-2311-2出芽短梗霉(自己分离)出芽短梗霉 ATCC 62921美季假丝酵母CGMCC2.1919安格斯毕赤酵母 NRRL Y-2214德巴利汉逊酵母NRRL Y-2021解脂亚罗酵母 CICC1441白色假丝酵母CGMCC 2.538卡尔斯博厄酵母 CICC1746赭色掷孢酵母CGMCC2.2113赭色掷孢酵母NRRL Y-5483简青霉(自己分离)乳酸克鲁维酵母CICC1773近平滑假丝酵母 CGMCC2.1786构巢曲霉ATCC38163博伊丁假丝酵母 NRRL Y-2332-2荚膜毕赤酵母 NRRL Y-1842博伊丁假丝酵母 NRRL Y-2332安格斯毕赤酵母 NCYC 495绿色木霉 NRRL3652大毛霉 NRR3635黑曲霉 NRR566米根霉(自己分离)嗜热脂肪地衣芽孢杆菌 NRRL B-1102嗜热脂肪地衣芽孢杆菌 NRRL B-1172嗜热脂肪地衣芽孢杆菌 NRRL B-14316嗜热脂肪地衣芽孢杆菌 NRRL B-14318解淀粉芽孢杆菌 CGMCC1.1099地衣芽孢杆菌 ATCC33632巨大芽孢杆菌NRRL B-14308巨大芽孢杆菌CGMCC 1.223蜡状芽孢杆菌 NRRL B-3711嗜碱芽孢杆菌ATCC BAA-125蜡状芽孢杆菌 CICC10040地衣芽孢杆菌 ATCC 14580巨大芽孢杆菌 ATCC12872芽孢杆菌 NRRL B-11291紫红红球菌 NICMB 11216紫红红球菌DSM11097紫红红球菌DSM44541赤红红球菌 CGMCC4.1187红串红球菌 CGMCC 1.2362红串红球菌 DSM 43297紫红红球菌 CGMCC 1.2348紫红红球菌 DSM6263紫红红球菌 DSM46022紫红红球菌 DSM363紫红红球菌DSM43002嗜吡啶红球菌JCM10940恶臭假单胞菌 NRRL-18668荧光假单胞菌 DSM7155荧光假单胞菌CGMCC 1.758洋葱假单胞菌(自己分离)绿叶假单胞菌 CICC10216洋葱假单胞菌(自己分离)野油菜黄单胞菌 DSM3586丁香假单胞菌 ATCC BAA-978粪产碱杆菌CGMCC1.767粪产碱杆菌ATCC8750反硝化产碱杆菌N4谷氨酸棒杆菌ATCC13032金黄节杆菌ATCC BAA-1386藤黄节杆菌 CGMCC1.1525柠檬黄节杆菌CGMCC1.1893植物乳杆菌 NCIMB8826鲍曼不动杆菌 CICC22934鲍曼不动杆菌CICC22934醋酸钙不动杆菌NCIMB9871糖多孢菌 NRRL 2338四联球菌 NRRL B-108肺炎链球菌ATCC 49619肺炎克雷伯氏菌(自己分离)敏捷食酸菌72w睾丸酮丛毛单胞菌 ATCC55746运动发酵单胞菌 ZM4嗜水气单胞菌2(自己分离)嗜水气单胞菌1(自己分离)睾丸酮丛毛单胞菌 ACCC10192睾丸酮丛毛单胞菌 ATCC 55744聚团斯塔普氏菌JCM 20685耐辐射甲烷菌JCM2831四联球菌NRRL B-108金黄色葡萄球菌 ATCC 6538金黄色葡萄球菌ATCC29213臭鼻克雷氏菌CGMCC1.1734哈维氏弧菌ATCC33842河流弧菌CGMCC1.1608副溶血弧菌(自己分离)嗜热假诺卡氏菌CGMCC10280沉积物希瓦氏菌 DSM17055脆壁希瓦氏菌 NCIMB 400农杆菌 NRRL B-11291铅青链霉菌阿维链霉菌 NRRL 8165茎瘤固氮根瘤菌 DSM 5975放射根瘤菌ATCC 33970粪肠球菌ATCC29212粪肠球菌 ATCC 7080洋葱伯克霍尔德菌ACCC 10506根瘤菌ATCC BAA-1182结核分枝杆菌 JCM13017 Prosthecochloris vibrioformis DSM263 Oceanicola granulosus ATCC863质控标准菌株人伤寒沙门菌(自己分离)猪伤寒沙门菌(自己分离)大肠杆菌ETEC大肠杆菌EIEC大肠杆菌EHEC大肠杆菌EPEC苏云金芽孢杆菌ATCC10792枸橼酸杆菌CMCC48001小肠耶尔森菌ATCC23715黄色微球菌ATCC58166宋内志贺ATCC9290阴沟肠杆菌ATCC45031溶藻胶弧菌ATCC17749枯草芽孢杆菌ATCC63501白色葡萄球菌8032鲍曼不动杆菌ATCC19606肺炎克雷伯氏菌ATCC13883甲型副伤寒ATCC9250拟志弧菌ATCC33847河弧菌NCTC11218大肠杆菌ATCC25922创伤弧菌ATCC27526铜绿假单胞菌ATCC27853金黄色葡萄球菌ATCC25923鼠伤寒杆菌ATCC14028单增李斯特菌CMCC54002变形杆菌CMCC49027白色念珠菌ATCC10231副溶血弧菌ATCC17082福氏志贺菌ATCC12022鲍氏志贺菌ATCC9207蜡样芽孢杆菌ATCC11778阪琦肠杆菌ATCC51329表皮葡萄球菌ATCC12228粪链球菌CMCC3200马红球菌ATCC6939大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达宿主菌大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌报告基因质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌表达质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒大肠杆菌克隆质粒分子伴侣分子伴侣分子伴侣分子伴侣分子伴侣大肠杆菌克隆质粒大肠杆菌表达质粒大肠杆菌表达质粒哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统哺乳动物双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母双杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母单杂交系统酿酒酵母三杂交系统酿酒酵母三杂交系统酿酒酵母三杂交系统蜕皮激素诱导表达系统蜕皮激素诱导表达系统LacSwith II哺乳动物诱导表达系统LacSwith II哺乳动物诱导表达系统LacSwith II哺乳动物诱导表达系统GeneSwith 哺乳动物诱导系统GeneSwith 哺乳动物诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统四环素诱导系统Tale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kitTale ToolBox kit Tale ToolBox kit CAS9系统CAS9系统CAS9系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺病毒系统腺相关病毒系统腺相关病毒系统腺相关病毒系统腺相关病毒系统腺相关病毒系统腺相关病毒系统腺相关病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统逆病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统慢病毒系统PiggyBac转座子系统PiggyBac转座子系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统信号通路报告系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统RNAi系统酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒酿酒酵母表达质粒。
Semi-supervised support vector machines
1Байду номын сангаас
INTRODUCTION
In this work we propose a method for semi-supervised support vector machines (S3 VM). S3 VM are constructed using a mixture of labeled data (the training set) and unlabeled data (the working set). The objective is to assign class labels to the working set such that the “best” support vector machine (SVM) is constructed. If the working set is empty the method becomes the standard SVM approach to classification [20, 9, 8]. If the training set is empty, then the method becomes a form of unsupervised learning. Semi-supervised learning occurs when both training and working sets are nonempty. Semi-supervised learning for problems with small training sets and large working sets is a form of semi-supervised clustering. There are successful semi-supervised algorithms for k-means and fuzzy c-means clustering [4, 18]. Clustering is a potential application for S3 VM as well. When the training set is large relative to the working set, S3 VM can be viewed as a method for solving the transduction problem according to the principle of overall risk minimization (ORM) posed by Vapnik at the NIPS 1998 SVM Workshop and in [19, Chapter 10]. S3 VM for ORM is the focus of this paper. In classification, the transduction problem is to estimate the class of each given point in the unlabeled working set. The usual support vector machine (SVM) approach estimates the entire classification function using the principle of statistical risk minimization (SRM). In transduction, one estimates the classification function at points within the working set using information from both the training and working set data. Theoretically, if there is adequate training data to estimate the function satisfactorily, then SRM will be sufficient. We would expect transduction to yield no significant improvement over SRM alone. If, however, there is inadequate training data, then ORM may improve generalization on the working set. Intuitively, we would expect ORM to yield improvements when the training sets are small or when there is a significant deviation between the training and working set subsamples of the total population. Indeed,the theoretical results in [19] support these hypotheses. In Section 2, we briefly review the standard SVM model for structural risk minimization. According to the principles of structural risk minimization, SVM minimize both the empirical misclassification rate and the capacity of the classification function [19, 20] using the training data. The capacity of the function is determined by margin of separation between the two classes based on the training set. ORM also minimizes the both the empirical misclassification rate and the function capacity. But the capacity of the function is determined using both the training and working sets. In Section 3, we show how SVM can be extended to the semi-supervised case and how mixed integer programming can be used practically to solve the resulting problem. We compare support vector machines constructed by structural risk minimization and overall risk minimization computationally on eleven problems in Section 4. Our computational results support past theoretical results that improved generalization can be obtained by incorporating working set information during training when there is a deviation between the working set and training set sample distributions. In three of ten real-world problems the semi-supervised approach, S3 VM , achieved a significant increase in generalization. In no case did S3 VM ever obtain a significant decrease in generalization. We conclude with a discussion of more general S3 VM algorithms.
support-vecttor networks
支持向量网络Corinna Cortes and Vladimir VapnikAT&T Labs-Research, USA摘要. 支持向量网络是一种针对两类问题的新学习机器.它的实现基于以下思想:将输入向量非线性地映射到一个很高维的特征空间.并在该特征空间中构造一个线性决策平面.该决策平面的特殊性质保证了学习机器具有很好的推广能力.支持向量网络的思想已在完全可分的训练数据集上得以实现,这里我们将它扩展到不完全可分的训练数据集.利用多项式输入变换的支持向量网络被证明具有很好的推广能力.我们以光学字体识别为实验将支持向量网络和其他不同的经典学习算法进行了性能比较.关键词:模式识别, 有效的学习算法, 神经网路, 径向基函数分类器, 多项式分类器1 介绍60多年前,R.A.Fisher[7]提出了模式识别领域的第一个算法.该模型考虑n 维向量x 正态分布N(m 1,∑1)和N(m 2,∑2), m 1 和m 2为各个分布的均值向量, ∑1和∑2为各个分布的协方差矩阵,并给出最优解为如下二次决策函数:2111112221|11||()[()()()()ln ]22|T T sq F x sign x m x m x m x m --∑=-∑---∑-+∑ . (1) 当∑1 = ∑2 = ∑时该二次决策函数(1)退化为一个线性函数:1111211221()[()()]2T T T lin F x sign m m x m m m m ---=-∑-∑-∑ . (2) 评估二次决策函数需要确定n(n+3)/2个自由参数,而评估线性函数只需要n 个自由参数.在观测数目较小(小于10n 2)的情况下评估O(n 2)个参数是不可靠的.Fisher 因此提出以下建议,在∑1 ≠∑2 时也采用线性判别函数(2),其中的∑采用如下形式:12(1)ττ∑=∑+-∑ , (3) 这里τ是某个常数.Fisher 也对两个非正态分布的线性决策函数给出了建议.因此模式识别的算法最开始是和构建线性决策平面相关联的.1962年,Rosenblatt[11]提出了一种不同的学习机器:感知器(或神经网络).感知器由相关联的神经元构成,每个神经元实现一个分类超平面,因此整个感知器完成了一个分段线性分类平面.如图1.Fig 1: A simple feed-forward perceptron with 8 input units, 2 layers of hidden units, and 1 output unit. The gray-shading of the Vector entries reflects their numeric value.Rosenblatt 没有提出通过调整网络的所有权值来最小化向量集上误差的算法,而提出了一种自适应仅仅改变输出节点上的权值的方法.由于其他权值都是固定的,输入向量被非线性地映射到最后一层节点的特征空间Z.在该空间的线性决策函数如下:()(())i iiI x sign z x α=∑ (4) 通过调整第i 个节点到输出节点的权值i α来最小化定义在训练集上的某种误差.Rosenblatt 的方法,再次将构建决策规则归结为构造某个空间的线性超平面.1986年,针对模式识别问题出现了通过调整神经网络所有权值来局部最小化向量集上误差的算法[12,13,10,8],即后向传播算法.算法对神经网络的数学模型有微小的改动.至此,神经网络实现了分段线性决策函数.本文提出了一种全新的学习机器,支持向量网络.它基于以下思想:通过事先选择的非线性映射将输入向量映射到一个高维特征空间Z.在空间Z 里构建一个线性决策面,该决策面的一些性质保证支持向量网络具有好的推广能力.例如:要构造一个与二阶多项式对应的决策面,我们可以构造一个特征空间Z,它有如下的N=n(n+3)/2个坐标:11,,n n z x z x == , n coordinates,22112,,n n nz x z x +== , n coordinates, 21121,,n N n n z x x z x x +-== , n(n-1)/2 coordinates ,其中,x = 1(,,)n x x .分类超平面便是在该空间中构造的.以上方法存在两个问题:一个是概念上的,另一个是技术上的.(1)概念上的问题:怎样找到一个推广性很好的分类超平面?特征空间的维数将会很高,能将数据分开的超平面不一定都具有很好的推广性.(2)技术上的问题:怎样在计算上处理如此高维的空间?要在一个200维的空间中构建一个4或5阶的多项式,需要构造一个上十亿的特征空间.概念上的问题在1965年[14]通过完全可分情况下的最优超平面得以解决.最优超平面是指使两类向量具有最大间隔的线性决策函数,如图2所示.可以发现,构造最优超平面只需考虑训练集中决定分类隔间的少量数据,即所谓的支持向量.如果训练集被最优超平面完全无错地分开,则一个测试样例被错判的期望概率以支持向量的期望数目与训练集向量数目比值为上界,即: [number of support vectors][Pr()]number of training vectorsE E error ≤. (5) 注意,这个界与分类空间的维数无关.并且由此可知,如果支持向量的个数相对与整个训练集很小,则构建出来的分类超平面将具有很好的推广性,即便是在一个无限维空间.第5节中,通过实际问题我们验证了比值(5)能够小到0.03而相应的最优超平面在一个十亿的特征空间依然具有很好的推广能力.Fig 2. An example of a separable problem in a 2 dimensional space. The support vectors, marked with grey squares, define the margin of largest separation between the two classes.令000b ⋅+=w z为特征空间里的最优超平面.我们将看到,特征空间中最优超平面的权值0w 可以写成支持向量的某个线性组合0support vectors i i α=∑w z . (6)从而特征空间里的线性决策函数I(z )为如下形式: 0support vectors ()sign()i i I b α=⋅+∑z z z , (7) 其中i z ⋅z 表示支持向量i z 和向量z 在特征空间里的内积.因此,该决策函数可以通过一个两层的网络来描述.如图3.尽管最优超平面保证了好的推广性,但如何处理高维特征空间这个技术上的问题依然存在.1992年,在文献[3]中证明构造决策函数的步骤可以交换顺序:不必先将输入向量通过某种非线性变换映射到特征空间再与特征空间中的支持向量做内积;而可以先在输入空间通过内积或者某种别的距离进行比较,再对比较的值进行非线性变化.如图 4.这就允许我们构造足够好的分类决策面,比如任意精度的多项式决策面.称这种类型的学习机器为支持向量网络.支持向量网络的技术首先针对能够完全无错地分开的数据集.本文我们将支持向量网络推广到不能完全无错分类的数据集.通过该扩展,作为一种全新的学习机器的支持向量网络将和神经网络一样的强大和通用.第5节将展示它在256维的高维空间中针对高达7阶的多项式决策面的推广性.并将它和其他经典的算法比如线性分类器、k 近邻分类器和神经网络做了性能上的比较.第2、3、4节着重引出算法并讨论了算法的一些性质.算法的一些重要细节参见附录.Fig 3. Classification by a support-vector network of an unknown pattern is conceptually done by first transforming the pattern into some high-dimensional feature space. An optimal hyperplane constructed in this feature space determines the output. The similarity to a two-layer perceptron can be seen by comparison to Fig 1.Fig 4. Classification of an unknown pattern by a support-vector network. The pattern is in input space compared to support vectors. The resulting values are non-linearly transformed. A linear function of these transformed values determines the output of the classifier.2 最优超平面本节回顾文献[14]中针对能被完全无错分开的训练数据的最优超平面方法.下一节介绍软间隔的概念,用来处理训练集不完全可分情况下的学习问题.2.1 最优超平面算法训练样本集11(,),,(,)l l y y x x , {1,1}i y ∈- (8) 是线性可分的如果存在向量w 和标量b 使得以下不等式1 if 1,1 if 1,i i i i b y b y ⋅+≥=⋅+≤-=-w x w x (9)对(8)中所有元素都成立.我们将不等式(9)写成如下形式:(w x +)1, 1,.i i y b i l ⋅≥= (10) 最优超平面00+0b ⋅=w x (11) 是指将训练数据以最大间隔分开的那个平面:它决定了向量w/|w|的方向,该方向上两不同类别训练向量间的距离最大.可回顾图2.记这个距离为(,)b ρw ,它由下式给定: 00{:1}{:1}(,)||||max min y y b ρ==-⋅⋅=-x w x w w w w x x . (12) 最优超平面(w 0,b 0)便是使得距离(12)取最大值的那组参数.根据(12)和(10)可得0002(,)||b ρ==w w . (13) 这表明,最优超平面就是满足条件(10)并且使得⋅w w 最小化的超平面.因此,构建一个最有超平面其实就是一个二次规划问题.满足()1i i y b ⋅+=w x 的向量i x 即为支持向量.附录中将证明决定最优超平面的向量0w 可以写成所有训练向量的一个线性组合:001l ii i i y α==∑w x , (14)其中0i α≥0.由于只有支持向量处的系数α>0(参考附录),表达式(14)只是0w 的一种简写形式.要求出参数向量i α:0001(,,)Tl αα= Λ,需要求解以下二次规划问题: 1()2T T W =1-D ΛΛΛΛ (15) 其中T Λ=(1,,l αα ),且满足限制:Λ≥0, (16) T ΛY = 0, (17) T 1 = (1,...,1)是 维的单位向量, 1(,...,)T l y y =Y 是 维的类标签向量,D 是 × 的对称矩阵其元素, ,1,...,ij i j i j D y y i j l =⋅=x x . (18) 不等式(16)表示非负象限.因此,问题变为在非负象限中最大化二次式(15),并服从约束条件(17).当训练数据可完全无错地分开时,在附录A 中我们证明了在最大泛函(15)、00(,)b Λ和式(13)的最大间隔0ρ之间有以下关系:0202()W ρ=Λ. (19)如果存在某个*Λ和某个较大的常数0W 使得不等式*0()W W >Λ (20) 成立,则所有把训练集(8)分开的超平面其间隔都满足ρ<. 如果训练集(8)不能被超平面完全分开,则两个类的样本间的间隔变的任意小,使得泛函()W Λ的值变得任意大.因此,要在约束条件(16)和(17)下最大化泛函(15),要么能求得最大值(这种情况需要构造最大间隔为0ρ的最优超平面),要么求出超过某个给定的常数0W 的最大值(的间隔分开训练数据).在约束条件(16)和(17)下最大化泛函(15)的问题如下方法可有效地解决.将训练数据集划分为几部分,使每部分占有合理的少量数据.先求解由第一部分训练数据决定的二次规划问题.对于该问题,会有两个结果:其一,这部分数据不能被任何超平面分开(这种情况下整个数据集都不能被任何超平面分开);其二,找到了能分开这部分数据的最优超平面.假设对第一部分数据泛函(15)的最大化时对应的向量为1Λ.该向量某些维上取值为零,它们是和该部分数据中非支持向量相关的.将第一部分数据中的支持向量和第二部分中不满足约束条件(10)的向量组成一个新的训练集,其中w 将由1Λ决定.对于这个新的训练集,构建泛函2()W Λ并假设其在2Λ最大化.持续递增地构造出覆盖全部训练数据的解向量*Λ,此时,或者会发现无法找到将整个训练集无错分开的超平面,或者构建出了整个训练集的最优超平面,且*Λ=0Λ.值得注意的是,在这个过程中泛函()W Λ的值是单调递增的,因为越来越多的训练向量被考虑到优化过程中来,使得两类样本之间的间隔越来越小.3 软间隔超平面考虑训练数据不能完全无错地分开的情况,此时,我们希望能以最小的错误将训练集分开.为形式化进行表示,引入非负变量0, 1,...,i i l ξ≥=.现在,最小化泛函1()li i σξξ=Φ=∑ (21)参数σ>0,约束条件为(+)1, 1,...,i i i y b i l ξ⋅≥-=w x , (22)0, 1,...,i i l ξ≥=. (23)对于足够小的σ>0,泛函(21)描述了训练错误数.最小化泛函(21)可获得被错分样本的某个最小子集:11(,),...,(,)k k i i i i y y x x .如果不考虑这些样本,其他的样本组成的训练集可以被无错地分开.可以构造一个将其他样本组成的训练集完全分开的最优超平面.该思想形式化表述如下:最小化泛函211()2l i i C F σξ=+⋅∑w (24) 约束条件为(22)和(23),其中F(u)是一个单调凸函数,C 是常数.对于足够大的C 和足够小的σ,在约束条件(22)和(23)下最大化泛函(24)的向量w 0和常数b 0决定了最优超平面,而该超平面使得训练集上错误数最小并将未分错部分以最大间隔分开.然而,在训练集上构建一个超平面使得错误数最小一般说来是NP 完全的.为避免出现NP 完全性,我们考虑σ=1的情况(使得最优问题(15)具有唯一解的最小σ).此时,泛函(24)描述了如何构建一个分类超平面使得训练错误的偏差之和最小且被正确分类的样本的间隔最大的问题(对于足够大的C).如果整个训练集都能被正确地分开,则此时构建出来的超平面就是最优间隔超平面.相比于σ<1的情况,σ=1的情况存在有效的方法寻找(24)的解.称这个解为软间隔超平面.在附录A 我们考虑在约束条件(22)和(23)下最小化泛函211()2l i i C F ξ=+⋅∑w (25) 其中F(u)是个单调凸函数且满足F(0)=0.为了表述的简洁,本节只考虑F(u)=u 2的情况.此时,最优化问题依然是个二次规划问题.附录A 中,我们证明了最优超平面算法里的向量w 可以写成支持向量的一个线性组合:001li i i i y α==∑w x .为求向量1(,...,)T l αα=Λ,需要求解以下双重二次规划问题,最大化泛函21(,)2TT W C δδ⎡⎤=⎢⎥⎣⎦1-D ΛΛΛΛ+ (26) 约束条件为=0TY Λ, (27) 0δ≥, (28) δ≤≤01Λ, (29) 其中1、Λ、Y 和D 与构建最优超平面的最优化问题中的相同,δ是一个标量,(29)描述了坐标取值范围.根据(29),可以发现,泛函(26)中所允许的最小的δ应该为 max 1max(,...,)l δααα==.因此,为求解软间隔分类器我们求解出在约束条件Λ≥0和(27)下使泛函 2max 1()2TT W C α⎡⎤=⎢⎥⎣⎦1-D ΛΛΛΛ+ (30) 最大化的向量Λ.这个问题不同于在泛函(30)中增加max α这一项来构造最优间隔分类器的问题.由于多了这一项,构造软件间隔分类器的问题对于任何数据集都是唯一的且有解的. 因为max α这一项的存在,泛函(30)不再是二次的.在约束条件Λ≥0和(27)下最大化泛函(30)属于凸规划问题的范畴.因此,要构建软间隔分类器,我们可以在 维的Λ参数空间解凸规划问题,也可以在 +1维的(Λ,δ)参数空间解二次规划问题.在我们的实验里,采用的是后者.4 特征空间中内积的回旋方法前文描述了在输入空间构建分类超平面的算法.为建立特征空间里的超平面,首先要通过选择一个N 维向量函数φ::n N φℜ→ℜ将n 维输入向量x 映射成N 维特征向量.于是,一个N 维的线性分类器(w ,b)的构造将针对这些被转换的向量:12()(),(),...,()i i i N i φφφφ=x x x x , 1,...,i = .对一个未知向量x 进行分类,需要先将它转换到分类空间(φx (x) ),再考虑以下函数的符号 ()()+f b φ=⋅x w x . (31)根据软间隔分类方法的性质,向量w 可写成特征空间中支持向量的线性组合.即1=()li i ii y αφ=∑w x . (32) 由于内积的线性特点,分类函数(31)对于一个未知向量x 的判别仅依赖于内积运算: 1()()()()l i i ii f b y b φαφφ==+=⋅+∑x x x x . (33) 构造支持向量网络的思想来自于对Hilbert 空间[2]中内积一般形式的考虑:()()(,)K φφ⋅≡u v u v . (34) 根据Hilbert-Schmidt 理论[6],任意的对称函数(,)K u v ∈2L 都能展开成以下形式1(,)()()i i ji K λφφ∞==⋅∑u v u v , (35) 其中i λ∈ℜ为特征值,i φ为特征函数,由核(,)K u v 的如下积分式定义:(,)()()i i iK d φλφ=⎰u v u u v . 保证(34)描述了在某个特征空间中的一个内积的一个充分条件是展开式(35)中的所有特征值都为正数.要保证所有的这些系数都为正,充分必要条件是,对使得2()gd <∞⎰u u成立的所有g,条件 (,)()()K g g d d ⎰⎰u v u v u v >0成立(Merser ’s 定理).因此,满足Merser ’s 定理的函数就可以用作内积.Aizerman, Braverman 和Rozonoer[1]提出了一种特征空间里内积的回旋形式|-|(,)exp()K σ=-u v u v , (36)称之为势函数.事实上,特征空间中内积的回旋可以用满足Merser ’s 条件的任何函数,特别地,要在n 维输入空间构造d 次多项式分类器,我们可以利用以下函数(,)(1)d K =⋅u v u v +. (37) 使用不同的内积(,)K u v 我们以任意类型的决策平面来构造不同的学习机器[3].这些学习机器的决策平面具有如下形式1()(,)li i i i f y K α==∑x x x ,其中i x 是支持向量在输入空间的像,i α是支持向量在特征空间的权值.求解支持向量i x 和它们的权值i α可采用和原来的最优间隔分类器或者软间隔分类器类似的方法.唯一的不同点是矩阵D(在(18)中定义)的元素变为(,)ij i j i j D y y K =x x , ,1,...,i j l =.5 支持向量网络的一般特征5.1 支持向量网络建立的决策规则是有效的要构建支持向量网络的决策规则,需要求解如下二次优化问题:21(,)2TT W C δδ⎡⎤=⎢⎥⎣⎦1-D ΛΛΛΛ+, 约束条件为:=0T Y Λ,δ≤≤01Λ,其中矩阵(,)ij i j i j D y y K =x x ,,1,...,i j l =.由训练数据集决定, (,)K u v 是决定内积回旋的函数.这个最优化问题可通过训练数据决定的一个中间优化问题来求解,这个解是由支持向量组成的.相关技术在第3节已经论述过.求得的最优决策函数是唯一的.每一个最优化问题都有其相应的标准解法.5.2 支持向量网络是一种通用的学习机器通过选择不同的核函数(,)K u v 来内积回旋,可以实现不同的支持向量网络.下一节我们将考虑多项式决策面的支持向量网络.为实现多项式的不同阶数,选用如下核函数用于内积回旋(,)(1)d K =⋅u v u v +.具有如下决策函数形式221||()sign(exp )ni i f σ=⎧⎫=⎨⎬⎩⎭∑x -x x 的径向基函数的学习机器可通过以下核函数实现22|-|(,)exp{}K σ=-u v u v .此时,支持向量机将会构建出近似函数的中心i x 以及相应的权值.也可以结合已有问题的先验知识来构建一个特殊的回旋函数.因此,支持向量网络是非常通用的一种学习机器,利用不同的核函数就能实现不同的决策函数集合.5.3 支持向量网络与推广能力的控制控制一个学习机器的推广能力需要控制两个因素:训练集上的错误率和学习机器的容量VC 维[14].测试集上的误差概率存在这样一个界:不等式Pr(test error)Frequency(training error)+Confidence Interval ≤ (38) 以概率1-η成立.界(38)中的置信区间决定于学习机器的VC 维,训练集数据个数和η的值.(38)中的两个因素形成了一对矛盾:学习机器的VC 维越小,置信区间越小,但是错误的频率越大.结构风险最小化原则用来解决这个问题:对于给定的数据集,找到一个使置信区间与错误频率之和最小的解.结构风险最小化原则的一个特例是奥科玛-剃刀原则:保持第一项为零最小化第二项.我们知道,对于阈值b 固定的线性指示函数()sign(+), ||I b C =⋅≤x x w x x其VC 维等于输入空间的维数.然而,对于函数集(权值有界)()sign(+), ||, ||I b C C =⋅≤≤w x w x x w其VC 维可小于输入空间的维数且依赖于C w .可以看到,最优间隔分类器方法运用了奥科玛-剃刀原则.将(38)的第一项保持为零(通过满足不等式(9)),再最小化第二项(通过最小化函数)⋅w w .这样的最小化避免了过拟合问题.然而,即使是在训练数据完全可分的情况下,也可以通过以训练集上的错误为代价来最小化(38)中的置信区间那一项进一步获得更好的推广性.在软间隔分类方法中,这可以通过选择恰当的参数C 来实现.在支持向量网络算法中,可以通过调节参数C 来控制决策规则的复杂度和错误频率这一对矛盾,即使是对于训练集无法完全分开这种一般情况.因此,支持向量网络能够控制影像学习机器推广能力的两个因素.6 实验分析为验证支持向量网络方法,我们进行了两组实验.(1)构建了平面上的人工数据集并用2阶多项式决策面进行了实验;(2)针对数字识别这个实际问题进行了实验.6.1 平面上的实验利用核函数(,)(1)du v u v+(39)K=⋅其中d=2,构建平面上不同样本集的决策规则.实验结果证实了该算法的强大.图5给出了一些例子.黑白两种子弹代表两类.在图中,双圆圈表示支持向量,十字表示错分的向量.可以发现,支持向量的个数相对于训练样本的个数是很少的.Fig 5. Examples of the dot-product (39) with d=2. Support patterns are indicated with double circles, errors with a cross.Fig 6. Examples of patterns with labels from the US Postal Service digit database.6.2 数字识别实验针对位图数字识别构建支持向量网络的实验用到了一大一小两个数据库.小的是美国邮政服务器上的数据库,包含7300个训练样本和2000个测试样本.数据的分辨率是16×16像素,图6给出了一些例子.在这个数据库上,我们研究不同阶数的多项式的实验结果.大的数据库包含60,000训练样本和10,000测试样本,是NIST上训练集和测试集50-50的混合.28×28像素的分辨率导致输入维数是784.在这个数据上我们只构建了一个四阶的多项式分类器.该分类器的性能和其他不同学习机器在基准学习[4]上进行了比较.实验中共构建了十个分类器,每个类别一个.每个超平面使用相同的内积和数据预处理过程.对未知样本的判别结果为十个分类器中输出结果最大的一个.Table 1. Performance of various classifiers collected from publications and own experiments. For reference see text.Table 2. Results obtained for dot products of polynomials of various degree. The number “support vectors”is a mean value per classifier.Fig 7. Labeled examples of errors on the training set for the 2nd degree polynomial support-vector classifier.6.2.1 美国邮政服务数据库上的实验美国邮政服务数据库的数据是从现实生活中的邮政编码采集的,很多研究者在此之上做过实验.表1列出了公共实验和我们的实验中不同分类器的性能.人工表现的结果由J.Bromley和E.Sackinger给出[5].CART的结果是贝尔实验室的Daryl Pregibon和Michael D.Riley以及NJ 的Murray Hill完成的.C4.5的结果是C.Cortes得到的,最优两层神经网络的结果是B.Scholkopf 得到的.专门为此用途的五层神经网络结构(LeNet1)结果是由Y.LeCun等人取得的[9].本实验我们利用预处理技术(居中,倾斜,平滑)来合并实验中相关变量的知识.数据平滑作为支持向量网络的一项预处理技术在文献[3]中进行了研究.本实验与文献[3]一致,选择 =0.75的平滑高斯核函数.实验基于数据库构建了以(39)的内积的多项式指示函数.输入空间的维数是256,多项式阶数范围从1到7.表2描述了实验结果.实验的训练数据不是线性可分的.可以发现,支持向量的个数增加非常缓慢.7阶多项式只比3阶多项式多了30%的支持向量,甚至比一阶多项式还少.但是特征空间的维数7阶多项式却比3阶多项式的1010倍还多.除此以外还能看到,随着空间维数的增加性能并没有多大的改变,可见没有过拟合的问题.线性分割器的支持向量数目较多是由于数据集的不可分导致的:200个向量包括了支持向量和ξ值非零的训练向量.如果ξ>1则训练向量被错分;每个线性分类器在训练集上平均有34个样本被错分.2阶分类器训练集上被错分的总数下降到了4.这4个样本如图7所示.值得注意的是,当我们考虑所获得的支持向量个数并不是期望的数目时,实验中推广能力的界保持不变.各种不同情况错误概率的上界不超过3%(事实上在测试集上对单个分类器错误概率不超过1.5%).构建多项式分类器的时间不依赖于多项式的阶数,而仅依赖于支持向量的个数.即使是最坏情况,也快于专门为此用途设计的最优性能神经网络(LeNet1[9]).神经网络的性能是5.1%的粗错误率.2阶或更高阶的多项式性能好于LeNet1.6.2.2 NIST数据库上的实验NIST数据库作为基准学习引入刚刚两周.时间的限制只允许构建一种类型的分类器,我们选择4阶多项式,且未进行预处理.该选择是基于美国邮政服务数据库上的实验.表3列出了10个分类器的支持向量数目和训练集与测试集上的性能.可以看到,即使是4阶的多项式(多于108的自由参数)也会在训练集上出错.平均训练误差是0.02%,即平均每类12个.分类器1错分的14个样本如图8所示.再次注意对于所得的不同支持向量数目上界(5)是如何保持的.Table 3. Result obtained for a 4th degree polynomial classifier on the NIST database. The size of the training set is 60,000, and the size of the test size is 10,000 patterns.Fig 8. The 14 misclassified test patterns with labels for classifier 1. Patterns with label “1”are false negative. Patterns with other labels are false positive.Fig 9. Results from the benchmark study.十个分类器在测试集上的联合误差是1.1%.这个结果应该在学习基准上和其他的分类器进行比较.包括一个线性分类器,一个具有60,000个原型的k (=3)最近邻分类器,两个专门为数字识别而设计的神经网络(LeNet1和LeNet4).作者只给出了支持向量网络的实验结果.学习基准的比较结果如图9所示.我们引用文献[4]对学习基准的描述来作为本节的总结:“很长一段时间LeNet1被认为是发展水平…通过一系列在结构和错误特征分析的实验,LeNet4被认为是…支持向量网络具有很高的精度,很令人瞩目,因为和其他高性能的分类器不一样,它不包括问题的几何学知识.事实上,即使是对图像像素进行加密,比如通过一个固定随机的置换,它依然可以工作得很好.”最后要说的是,支持向量网络性能的进一步提高可以通过构建一个反应已有问题先验信K u v.息的内积函数(,)7 总结本文介绍了一种针对两类分类问题的新型学习机器支持向量网络.支持向量网络包含3个思想:求解最优超平面的技术(允许将解向量在支持向量上展开),内积回旋的思想(将决策面求解从线性扩展到非线性),和软间隔的概念(允许训练集上出现错误).该算法已经过测试并和其他分类算法进行过性能比较.尽管它的决策面设计简单但是在比较学习上它展现了很好的性能.其他特征例如容量控制能力和决策面变换的简易性证实了支持向量网络是一种及其强大和通用的学习机器.。
基于小孔模型的输气管道流量系数修正
采输技术DOI :10.3969/j.issn.1001-2206.2024.02.005基于小孔模型的输气管道流量系数修正高晓楠1,姚佳杉1,钱薇如2,郄晓敏2,陈喜鸿3,梁昌晶11.中国石油华北油田分公司第一采油厂,河北任丘0625522.华北石油管理局有限公司河北储气库分公司,河北廊坊0650003.华港燃气集团有限公司,河北任丘062552摘要:流量系数是管道泄漏速率计算过程中的不确定因素,目前流量系数的选取存在随意性和盲目性。
针对这一问题,在小孔模型理论分析的基础上,结合缩比实验和CFD 数值模拟,考察管道内压、泄漏口形状、泄漏面积、管壁粗糙度对泄漏速率及流量系数的影响,通过速度矢量及马赫数对泄漏口附近进行流场分析,利用多元非线性拟合手段回归流量系数。
结果表明,不同工况下的理论值均高于模拟值和实验值,而实验值和模拟值的吻合性较好;在相同压力下,不同形状泄漏口的泄漏速率与泄漏面积均呈正线性相关;在相同压力和泄漏面积下,矩形泄漏口泄漏速率最大,圆形泄漏口泄漏速率最小,说明矩形泄漏口的流量系数最大,其次为三角形和圆形泄漏口;流量系数修正方程的相关系数分别达到0.9951、0.9964、0.9925,可以用于不同工况下泄漏速率的计算。
关键词:泄漏速率;流量系数;修正;相似实验;CFDFlow coefficient correction of gas pipeline based on small hole modelGAO Xiaonan 1,YAO Jiashan 1,QIAN Weiru 2,QIE Xiaomin 2,CHEN Xihong 3,LIANG Changjing 11.No.1Oil Production Plant of Huabei Oilfield Company,CNPC,Renqiu 062552,China2.Hebei Gas Storage Branch of North China Petroleum Administration Co.,Ltd.,Langfang 065000,China3.Huagang Gas Group Co.,Ltd.,Renqiu 062552,ChinaAbstract:As an uncertainty in the calculation of pipeline leakage rate,flow coefficient is haphazardly selected at present.To solve this problem,based on the theoretical analysis of the small-hole model,combined with the subscale test and CFD numerical simulation,the influences of internal pressure of the pipeline,leakage port shape,leakage area and roughness of the pipe wall on the leakage rate and flow coefficient were investigated.The flow field near the leakage port was analyzed by the velocity vector and Mach number,with the flow coefficient returned by multivariate nonlinear fitting method.The results show that the theoretical values under different conditions are higher than the simulated and experimental values,and there is good agreement between the experimental and simulated values.Under the same pressure,and with different leakage port shapes,the leakage rate is linearly correlated with the leakage area.Under the same pressure and with the same leakage area,the rectangle port shows the largest leakage rate,and the circle port the smallest,which indicates that the rectangular leakage port has the largest flow coefficient,followed by triangle and circle ports.The correlation coefficients of the modified flow coefficient equation reach up to 0.9951,0.9964and 0.9925respectively,which can be used to calculate the leakage rate under different working conditions.Keywords:leakage rate;flow coefficient;correction;similarity experiment;CFD为尽快实现我国“碳中和、碳达峰”的目标,天然气已成为取代石油和煤炭的主要清洁能源。
基于支持向量机的混凝土中钢筋锈蚀量预测
1 前言
工程实践中, 有大量产生锈蚀裂缝却仍 在使用的钢筋混凝土构件。这些构件的钢
. s t 币 一 幻一 ,+。: 少 万 .毕 要 , t 二, 之0 ,= 1 八 二 , 夯
得该问题的解 :
“刀尸 名
{‘x,、 “+气 1办 W ‘代 , ‘“ , 一 !
, t
1 . 1、, 工
(3 )
核函数K( , ); x, x,
工 业 技 术
S C IE I以 〔 & TEC } NOLO GY IN厂 日 〕 邵 C MA T{ N O
基 于 支 持 向量机 的 混 凝 土 中钢 筋 锈 蚀 量 预 测 1
程 赞朱建国
(淮海工学院
江苏连云港 22 200 5 )
摘 要: 对胡筋份性1 进行 了 研究,提出了 一种街的垂于支持向全机理论的混提土中钥筋份性全的孩侧方法。该方法通过与传统的 BP 神经网络比软的实脸表明, 达到了 较好的预侧效果。 关键词: 钢筋姆性 支持向1 机 预浏 中图分类号: Tu 375 Tu 761. 1+ 3 文献标识码: A 文章编号: 1‘ 37, 72一 1(2007) 12(。 004卜 02 )一
求解该问题, 构造线性回归函数, 这种算法为 线性 一支持向量回归机。
其中按下列方式计算: 选择位于开区间 中的任一个或。
若 到 是 ,‘ 一 一K,二 选 的 弓 则一 客 。 动 二‘ ( x
对于非线性回归问题, 由Mer cer 定理
若选 是 则万 一; 一切勺‘ 到的 式, 二 身 动 ‘一 为 》
(2 构造并求解最优化问题 )
其中 C 为惩罚因子 ,E 为回归允许最大
筋 蚀 较大, 于腐 介质 H20 和 锈 量已 且由 蚀 ( 0 2) 更易到达钢筋表面, 其锈蚀速度也明
微波光子矢量信号处理技术
I
ABSTRACT
ABSTRACT
In recent years, with the development of communication technology, people’s demand for large-capacity and high-rate information transmission is increasing. In order to meet the needs of society, it is necessary to use vector signals of different modulation formats for information transmission, which can improve communication transmission rate and spectrum utilization. Therefore, vector signal processing is a key technology for many communication systems such as satellite communication and mobile communication. However, the use of electronic system to process vector signals faces electronic bottlenecks such as small bandwidth and poor frequency tuning. Microwave photonics, as an interdisciplinary combination of microwave technology and photonics technology, has many advantages such as large bandwidth, good frequency tuning and anti-electromagnetic interference. It can solve the electronic bottlenecks of traditional electronic systems. Compared with traditional electronic systems, vector signal processing technology based on microwave photonics has significant advantages. In this thesis, we study vector signal processing technology based on microwave photonics. The main work is as follows:
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
2
3
ABSTterrorist groups that responsible of terrorist attacks is a challenging task and a promising research area. There are many methods that have been developed to address this challenge ranging from supervised to unsupervised methods. The main objective of this research is to conduct a detailed comparative study among Support Vector Machine as one of the successful prediction classifiers that proved highly performance and other supervised machine learning classification and hybrid classification algorithms. Whereas most promising methods are based on support vector machine (SVM); so there is a need for a comprehensive analysis on prediction accuracy of supervised machine learning algorithms on different experimental conditions, and hence in this research we compare predictive accuracy and comprehensibility of explicit, implicit, and hybrid machine learning models and algorithms. This research based on predicting terrorist groups responsible of attacks in Middle East & North Africa from year 2004 up to 2008 by comparing various standard, ensemble, hybrid, and hybrid ensemble machine learning methods and focusing on SVM. The compared classifiers are categorized into main four types namely; Standard Classifiers, Hybrid Classifiers, Ensemble Classifiers, and Hybrid Ensemble Classifiers. In our study we conduct three different experiments on the used real data, afterwards we compare the obtained results according to four different performance measures. Experiments were carried out using real world data represented by Global terrorism Database (GTD) from National Consortium for the study of terrorism and Responses of Terrorism (START).
IPASJ International Journal of Computer Science (IIJCS)
A Publisher for Research Motivation ........
Volume 3, Issue 5, May 2015
Web Site: /IIJCS/IIJCS.htm Email: editoriijcs@ ISSN 2321-5992
1
Department of Operations Research & Decision Support, Faculty of Computers & Information, Cairo University, 5 Dr. Ahmed Zoweil St.- Orman - Postal Code 12613 - Giza – Egypt Department of Operations Research & Decision Support, Faculty of Computers & Information, Cairo University, 5 Dr. Ahmed Zoweil St.- Orman - Postal Code 12613 - Giza – Egypt Department of Operations Research & Decision Support, Faculty of Computers & Information, Cairo University, 5 Dr. Ahmed Zoweil St.- Orman - Postal Code 12613 - Giza – Egypt
Keywords: Hybrid Models, Machine Learning, Predictive Accuracy, Supervised Learning.
1. INTRODUCTION
Machine learning (ML) is the process of estimating unknown dependencies or structures in a system using a limited number of observations [1]. ML algorithms are used in data mining applications to retrieve hidden information. Machine learning methods are rote learning, learning by being told, learning by analogy, and inductive learning, which includes methods of learning by examples and learning by experimentation and discovery [1] [2]. Numerous machine learning methods and different knowledge representation models can be used for predicting different pattern in data set [3]. For example, classification, and regression methods can be used for learning decision trees, rules, Bayes networks, artificial neural networks and support vector machines. Supervised Machine learning classification is one of the tasks most frequently carried out by so called Intelligent Systems. Thus, a large number of techniques have been developed based on Artificial Intelligence (Logic-based techniques, Perceptron-based techniques) and Statistics (Bayesian Networks, Instance-based techniques). The concept of combing classifiers is proposed as a new direction for the improvement of the performance of individual machine learning algorithms. Hybrid and ensemble methods in machine learning have attracted a great attention of the scientific community over the last years [1]. Multiple, ensemble learning models have been theoretically and empirically shown to provide significantly better performance than single weak learners, especially while dealing with high dimensional, complex regression and classification problems [2]. Adaptive hybrid systems has become essential in computational intelligence and soft computing, a main reason for being popular is the high complementary of its components. The integration of the basic technologies into hybrid machine learning solutions [4] facilitate more intelligent search and reasoning methods that match various domain knowledge with empirical data to solve advanced and complex problems [5]. Both ensemble models and hybrid methods make use of the information fusion concept but in slightly different way. In case of ensemble classifiers, multiple but homogeneous, weak models are combined [6], typically at the level of their individual output, using various merging methods, which can be grouped into fixed (e.g., majority voting), and