Representational issues in machine learning of user profiles
A survey on sentiment detection of reviews

Huifeng Tang, Songbo Tan *, Xueqi Cheng
Information Security Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, PR China

Article info
Keywords: Sentiment detection; Opinion extraction; Sentiment classification

Abstract
Sentiment detection of texts has attracted booming interest in recent years, due to the increased availability of online reviews in digital form and the ensuing need to organize them. To date, four problems predominate in this research community: subjectivity classification, word sentiment classification, document sentiment classification, and opinion extraction. These problems are inherently related. Subjectivity classification can prevent the sentiment classifier from considering irrelevant or even potentially misleading text. Document sentiment classification and opinion extraction have often involved word sentiment classification techniques. This survey discusses related issues and the main approaches to these problems.
© 2009 Published by Elsevier Ltd.

1. Introduction

Today, a very large number of reviews are available on the web, and weblogs are growing fast in the blogosphere. Product reviews exist in a variety of forms on the web: sites dedicated to a specific type of product (such as digital cameras), sites for newspapers and magazines that may feature reviews (like Rolling Stone or Consumer Reports), sites that couple reviews with commerce (like Amazon), and sites that specialize in collecting professional or user reviews in a variety of areas (like ). Less formal reviews are available on discussion boards and mailing list archives, as well as in Usenet via Google Groups. Users also comment on products in their personal web sites and blogs, which are then aggregated by sites such as , , and .

The information mentioned above is a rich and useful source for marketing intelligence, social psychologists, and others interested in
extracting and mining opinions, views, moods, and attitudes. Examples include whether a product review is positive or negative, what the moods among bloggers are at a given time, and how the public reacts to a political affair.

To achieve this goal, a core and essential job is to detect the subjective information contained in texts, including viewpoint, fancy, attitude, and sensibility. This is so-called sentiment detection.

A challenging aspect of this task, one that seems to distinguish it from traditional topic-based detection (classification), is that while topics are often identifiable by keywords alone, sentiment can be expressed in a much subtler manner. For example, the passage “What a bad picture quality that digital camera has! ... Oh, this new type camera has a good picture, long battery life and beautiful appearance!” compares a negative experience of one product with a positive experience of another product. It is difficult to separate out the core assessment that should actually be correlated with the document. Thus, sentiment seems to require more understanding than the usual topic-based classification.

Sentiment detection dates back to the late 1990s (Argamon, Koppel, & Avneri, 1998; Kessler, Nunberg, & Schütze, 1997; Spertus, 1997), but only in the early 2000s did it become a major subfield of the information management discipline (Chaovalit & Zhou, 2005; Dimitrova, Finn, Kushmerick, & Smyth, 2002; Durbin, Neal Richter, & Warner, 2003; Efron, 2004; Gamon, 2004; Glance, Hurst, & Tomokiyo, 2004; Grefenstette, Qu, Shanahan, & Evans, 2004; Hillard, Ostendorf, & Shriberg, 2003; Inkpen, Feiguina, & Hirst, 2004; Kobayashi, Inui, & Inui, 2001; Liu, Lieberman, & Selker, 2003; Raubern & Muller-Kogler, 2001; Riloff & Wiebe, 2003; Subasic & Huettner, 2001; Tong, 2001; Vegnaduzzo, 2004; Wiebe & Riloff, 2005; Wilson, Wiebe, & Hoffmann, 2005). Until the early 2000s, the two main approaches to sentiment detection, especially in real-world applications, were based on machine learning techniques and on semantic analysis techniques. After that, shallow
natural language processing techniques were widely used in this area, especially in document sentiment detection. Current-day sentiment detection is thus a discipline at the crossroads of NLP and IR, and as such it shares a number of characteristics with other tasks such as information extraction and text mining.

Although several international conferences, such as ACL, AAAI, WWW, EMNLP, and CIKM, have devoted special issues to this topic, there is no systematic treatment of the subject: there are neither textbooks nor journals entirely devoted to sentiment detection yet.

0957-4174/$ - see front matter © 2009 Published by Elsevier Ltd. doi:10.1016/j.eswa.2009.02.063
* Corresponding author. E-mail addresses: tanghuifeng@ (H. Tang), tansongbo@ (S. Tan), cxq@ (X. Cheng).
Expert Systems with Applications 36 (2009) 10760–10773

This paper first introduces the definitions of several problems that pertain to sentiment detection. Then we present some applications of sentiment detection. Section 4 discusses the subjectivity classification problem. Section 5 introduces the semantic orientation method. The sixth section examines the effectiveness of applying machine learning techniques to document sentiment classification.
The seventh section discusses the opinion extraction problem. The eighth part covers the evaluation of sentiment detection. The last section concludes with challenges and a discussion of future work.

2. Sentiment detection

2.1. Subjectivity classification

Subjectivity in natural language refers to aspects of language used to express opinions and evaluations (Wiebe, 1994). Subjectivity classification is stated as follows: let $S = \{s_1, \ldots, s_n\}$ be a set of sentences in document $D$. The problem of subjectivity classification is to distinguish sentences used to present opinions and other forms of subjectivity (the subjective sentence set $S_s$) from sentences used to objectively present factual information (the objective sentence set $S_o$), where $S_s \cup S_o = S$. This task is especially relevant for news reporting and Internet forums, in which opinions of various agents are expressed.

2.2. Sentiment classification

Sentiment classification takes two forms: binary sentiment classification and multi-class sentiment classification. Given a document set $D = \{d_1, \ldots, d_n\}$ and a predefined category set $C = \{\text{positive}, \text{negative}\}$, binary sentiment classification is to classify each $d_i$ in $D$ with a label from $C$. If we set $C^* = \{\text{strong positive}, \text{positive}, \text{neutral}, \text{negative}, \text{strong negative}\}$ and classify each $d_i$ in $D$ with a label from $C^*$, the problem becomes multi-class sentiment classification.

Most prior work on learning to identify sentiment has focused on the binary distinction of positive vs. negative. But it is often helpful to have more information than this binary distinction provides, especially if one is ranking items by recommendation or comparing several reviewers’ opinions. Koppel and Schler (2005a, 2005b) show that it is crucial to use neutral examples in learning polarity for a variety of reasons. Learning from negative and positive examples alone will not permit accurate classification of neutral examples. Moreover, the use of neutral training examples in learning facilitates better distinction between
positive and negative examples.

3. Applications of sentiment detection

In this section, we describe some emerging applications of sentiment detection.

3.1. Product comparison

It is a common practice for online merchants to ask their customers to review the products that they have purchased. With more and more people using the Web to express opinions, the number of reviews that a product receives grows rapidly. Most of the research on these reviews has focused on automatically classifying the products as “recommended” or “not recommended” (Pang, Lee, & Vaithyanathan, 2002; Ranjan Das & Chen, 2001; Terveen, Hill, Amento, McDonald, & Creter, 1997). But every product has several features, and people may be interested in only some of them. Moreover, a product with shortcomings in one aspect probably has merits in another (Morinaga, Yamanishi, Tateishi, & Fukushima, 2002; Taboada, Gillies, & McFetridge, 2006).

The aim is to analyze online reviews and provide a visual way to compare consumers’ opinions of different products, so that with a single glance the user can clearly see the advantages and weaknesses of each product in the minds of consumers. A potential customer can see a visual side-by-side and feature-by-feature comparison of consumer opinions on these products, which helps him/her to decide which product to buy. For a product manufacturer, the comparison makes it easy to gather marketing intelligence and product benchmarking information.

Liu, Hu, and Cheng (2005) proposed a novel framework for analyzing and comparing consumer opinions of competing products. A prototype system called Opinion Observer was implemented. To enable the visualization, two tasks were performed: (1) identifying product features that customers have expressed their opinions on, based on language pattern mining techniques (such features form the basis for the comparison); (2) for each feature, identifying whether the opinion from each reviewer is positive or negative, if any. Different users can
visualize and compare opinions of different products using a user interface. The user simply chooses the products that he/she wishes to compare, and the system then retrieves the analyzed results of these products and displays them in the interface.

3.2. Opinion summarization

The number of online reviews that a product receives grows rapidly, especially for some popular products. Furthermore, many reviews are long and have only a few sentences containing opinions on the product. This makes it hard for a potential customer to read them to make an informed decision on whether to purchase the product. The large number of reviews also makes it hard for product manufacturers to keep track of customer opinions of their products, because many merchant sites may sell their products and the manufacturer may produce many kinds of products.

Opinion summarization (Ku, Lee, Wu, & Chen, 2005; Philip et al., 2004) summarizes the opinions of articles by reporting sentiment polarities, degree, and the correlated events. With opinion summarization, a customer can easily see how existing customers feel about a product, and the product manufacturer can learn why people with different stances like it or what they complain about.

Hu and Liu (2004a, 2004b) conducted work of this kind. Given a set of customer reviews of a particular product, the task involves three subtasks: (1) identifying features of the product that customers have expressed their opinions on (called product features); (2) for each feature, identifying review sentences that give positive or negative opinions; and (3) producing a summary using the discovered information.

Ku, Liang, and Chen (2006) investigated both news and web blog articles. In their research, TREC, NTCIR and articles collected from web blogs serve as the information sources for opinion extraction. Documents related to the issue of animal cloning are selected as the experimental materials. Algorithms for opinion extraction at word, sentence and document level are proposed.
The issue of relevant sentence selection is discussed, and then topical and opinionated information is summarized. Opinion summarizations are visualized by representative sentences. Finally, an opinionated curve showing the degree of support and non-support along the timeline is illustrated by an opinion tracking system.

3.3. Opinion reason mining

In the opinion analysis area, finding the polarity of opinions, or aggregating and quantifying the degree assessment of opinions scattered throughout web pages, is not enough. We can address a more critical part of in-depth opinion assessment, such as finding reasons in opinion-bearing texts. For example, in film reviews, information such as “found 200 positive reviews and 150 negative reviews” may not fully satisfy the information needs of different people. More useful information would be “This film is great for its novel originality” or “Poor acting, which makes the film awful”.

Opinion reason mining tries to identify one of the critical elements of online reviews to answer the question, “What are the reasons that the author of this review likes or dislikes the product?” To answer this question, we should extract not only sentences that contain opinion-bearing expressions, but also sentences with reasons why an author of a review writes the review (Cardie, Wiebe, Wilson, & Litman, 2003; Clarke & Terra, 2003; Li & Yamanishi, 2001; Stoyanov, Cardie, Litman, & Wiebe, 2004).

Kim and Hovy (2005) proposed a method for detecting opinion-bearing expressions. In their subsequent work (Kim & Hovy, 2006), they collected a large set of ⟨review text, pros, cons⟩ triplets from , which explicitly state pros and cons phrases in their respective categories by each review’s author along with the review text. Their automatic labeling system first collects phrases in the pro and con fields and then searches the main review text in order to collect sentences corresponding to those phrases. Then the system annotates this sentence with the
appropriate “pro” or “con” label. All remaining sentences with neither label are marked as “neither”. After labeling all the data, they use it to train their pro and con sentence recognition system.

3.4. Other applications

Thomas, Pang, and Lee (2006) try to determine from the transcripts of US Congressional floor debates whether the speeches represent support of or opposition to proposed legislation. Mullen and Malouf (2006) describe a statistical sentiment analysis method on political discussion group postings to judge whether there is an opposing political viewpoint to the original post. Moreover, there are some potential applications of sentiment detection, such as online message sentiment filtering, e-mail sentiment classification, web-blog author attitude analysis, sentiment web search engines, etc.

4. Subjectivity classification

Subjectivity classification is the task of investigating whether a paragraph presents the opinion of its author or reports facts. In fact, most of the research shows a very tight relation between subjectivity classification and document sentiment classification (Pang & Lee, 2004; Wiebe, 2000; Wiebe, Bruce, & O’Hara, 1999; Wiebe, Wilson, Bruce, Bell, & Martin, 2002; Yu & Hatzivassiloglou, 2003). Subjectivity classification can prevent the polarity classifier from considering irrelevant or even potentially misleading text.
Pang and Lee (2004) find that subjectivity detection can compress reviews into much shorter extracts that still retain polarity information at a level comparable to that of the full review.

Much of the research in automated opinion detection has addressed discriminating between subjective and objective text at the document and sentence levels (Bruce & Wiebe, 1999; Finn, Kushmerick, & Smyth, 2002; Hatzivassiloglou & Wiebe, 2000; Wiebe, 2000; Wiebe et al., 1999; Wiebe et al., 2002; Yu & Hatzivassiloglou, 2003). In this section, we discuss some approaches used to automatically label a document as objective or subjective.

4.1. Similarity approach

The similarity approach to classifying sentences as opinions or facts explores the hypothesis that, within a given topic, opinion sentences will be more similar to other opinion sentences than to factual sentences (Yu & Hatzivassiloglou, 2003). It measures sentence similarity based on shared words, phrases, and WordNet synsets (Dagan, Shaul, & Markovitch, 1993; Dagan, Pereira, & Lee, 1994; Leacock & Chodorow, 1998; Miller & Charles, 1991; Resnik, 1995; Zhang, Xu, & Callan, 2002).

To measure the overall similarity of a sentence to the opinion or fact documents, we go through three steps. First, use an IR method to acquire the documents that are on the same topic as the sentence in question. Second, calculate its similarity scores with each sentence in those documents and take the average. Third, assign the sentence to the category (opinion or fact) for which the average value is the highest. Alternatively, in the frequency variant, instead of averaging the similarity scores we count, for each category, how many of them exceed a predetermined threshold.

4.2. Naive Bayes classifier

The Naive Bayes classifier is a commonly used supervised machine learning algorithm. This approach presupposes that all sentences in opinion or factual articles are opinion or fact sentences, respectively. Naive Bayes uses the sentences in opinion and fact documents as the examples of the
two categories. The features include words, bigrams, and trigrams, as well as the parts of speech in each sentence. In addition, the presence of semantically oriented (positive and negative) words in a sentence is an indicator that the sentence is subjective. Therefore, the features can include the counts of positive and negative words in the sentence, as well as counts of the polarities of sequences of semantically oriented words (e.g., “++” for two consecutive positively oriented words). They also include the counts of parts of speech combined with polarity information (e.g., “JJ+” for positive adjectives), as well as features encoding the polarity (if any) of the head verb, the main subject, and their immediate modifiers.

Generally speaking, Naive Bayes assigns a document $d_j$ (represented by a vector $\vec{d}_j$) to the class $c_i$ that maximizes $P(c_i \mid \vec{d}_j)$ by applying Bayes’ rule as follows:

$$P(c_i \mid \vec{d}_j) = \frac{P(c_i)\,P(\vec{d}_j \mid c_i)}{P(\vec{d}_j)} \tag{1}$$

where $P(\vec{d}_j)$ is the probability that a randomly picked document $d$ has vector $\vec{d}_j$ as its representation, and $P(c)$ is the probability that a randomly picked document belongs to class $c$.

To estimate the term $P(\vec{d}_j \mid c)$, Naive Bayes decomposes it by assuming that all the features in $\vec{d}_j$ (represented by $f_k$, $k = 1$ to $m$) are conditionally independent, i.e.,

$$P(c_i \mid \vec{d}_j) = \frac{P(c_i)\,\prod_{k=1}^{m} P(f_k \mid c_i)}{P(\vec{d}_j)} \tag{2}$$

4.3. Multiple Naive Bayes classifier

The assumption that all sentences in opinion or factual articles are opinion or fact sentences is an approximation. To address this, the multiple Naive Bayes classifier approach applies an algorithm using multiple classifiers, each relying on a different subset of features. The goal is to reduce the training set to the sentences that are most likely to be correctly labeled, thus boosting classification accuracy.

Given separate sets of features $F_1, F_2, \ldots, F_m$, it trains separate Naive Bayes classifiers $C_1, C_2, \ldots, C_m$ corresponding to each feature set.
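As a concrete illustration of the single-classifier decision rule of Section 4.2, the following is a minimal sketch rather than the implementation used in the cited work. It assumes word unigrams as the only features, add-one smoothing, and log-space scoring; none of these choices is stated in the text, and the function names and toy sentences are hypothetical. The denominator of Eq. (2) is dropped because it is the same for every class.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_sentences):
    """Collect the counts needed to estimate P(c) and P(f | c)."""
    class_counts = Counter()               # number of training sentences per class
    feature_counts = defaultdict(Counter)  # feature occurrences per class
    vocabulary = set()
    for tokens, label in labeled_sentences:
        class_counts[label] += 1
        feature_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, feature_counts, vocabulary


def classify(tokens, model):
    """Return the class maximizing log P(c) + sum over features of log P(f | c)."""
    class_counts, feature_counts, vocabulary = model
    total_sentences = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_sentences)
        # Add-one smoothing: an assumption of this sketch, not of the survey.
        denominator = sum(feature_counts[label].values()) + len(vocabulary)
        for token in tokens:
            if token in vocabulary:  # features never seen in training are skipped
                score += math.log((feature_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data: sentences inherit the label of their document.
training = [
    ("i love this camera".split(), "opinion"),
    ("the picture quality is awful".split(), "opinion"),
    ("the camera weighs 300 grams".split(), "fact"),
    ("the battery lasts ten hours".split(), "fact"),
]
model = train_naive_bayes(training)
print(classify("this camera is awful".split(), model))
```

Richer feature sets (bigrams, part-of-speech tags, polarity counts) would plug in by changing how each sentence is turned into a token list, leaving the scoring loop unchanged.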
Assuming as ground truth the information provided by the document labels, and that all sentences inherit the status of their document as opinions or facts, it first trains $C_1$ on the entire training set, then uses the resulting classifier to predict labels for the training set. The sentences that receive a label different from the assumed truth are then removed, and $C_2$ is trained on the remaining sentences. This process is repeated iteratively until no more sentences can be removed. Yu and Hatzivassiloglou (2003) report results using five feature sets, starting from words alone and adding in bigrams, trigrams, part-of-speech, and polarity.

4.4. Cut-based classifier

The cut-based classifier approach puts forward the hypothesis that text spans (items) occurring near each other (within discourse boundaries) may share the same subjectivity status (Pang & Lee, 2004). Based on this hypothesis, Pang and Lee supplied their algorithm with pair-wise interaction information, e.g., to specify that two particular sentences should ideally receive the same subjectivity label. This algorithm uses an efficient and intuitive graph-based formulation relying on finding minimum cuts.

Suppose there are $n$ items $x_1, x_2, \ldots, x_n$ to divide into two classes $C_1$ and $C_2$, with access to two types of information:

$ind_j(x_i)$: individual scores, the non-negative estimates of each $x_i$’s preference for being in $C_j$ based on just the features of $x_i$ alone;
$assoc(x_i, x_k)$: association scores, the non-negative estimates of how important it is that $x_i$ and $x_k$ be in the same class.

The problem then becomes one of maximizing each item’s net score, i.e., its individual score for the class it is assigned to minus its individual score for the other class, while penalizing the assignment of tightly associated items to different classes.
Thus, after some algebra, it arrives at the following optimization problem: assign the $x_i$ to $C_1$ and $C_2$ so as to minimize the partition cost:

$$\sum_{x \in C_1} ind_2(x) + \sum_{x \in C_2} ind_1(x) + \sum_{x_i \in C_1,\; x_k \in C_2} assoc(x_i, x_k) \tag{3}$$

This situation can be represented in the following manner. Build an undirected graph $G$ with vertices $\{v_1, \ldots, v_n, s, t\}$; the last two are, respectively, the source and sink. Add $n$ edges $(s, v_i)$, each with weight $ind_1(x_i)$, and $n$ edges $(v_i, t)$, each with weight $ind_2(x_i)$. Finally, add $\binom{n}{2}$ edges $(v_i, v_k)$, each with weight $assoc(x_i, x_k)$. A cut $(S, T)$ of $G$ is a partition of its nodes into sets $S = \{s\} \cup S'$ and $T = \{t\} \cup T'$, where $s \notin S'$ and $t \notin T'$. Its cost $cost(S, T)$ is the sum of the weights of all edges crossing from $S$ to $T$. A minimum cut of $G$ is one of minimum cost. Finding a solution to this problem thus reduces to finding a minimum cut of $G$.

5. Word sentiment classification

The task of document sentiment classification has usually involved the manual or semi-manual construction of semantic orientation word lexicons (Hatzivassiloglou & McKeown, 1997; Hatzivassiloglou & Wiebe, 2000; Lin, 1998; Pereira, Tishby, & Lee, 1993; Riloff, Wiebe, & Wilson, 2003; Turney & Littman, 2002; Wiebe, 2000), which are built by word sentiment classification techniques. For instance, Das and Chen (2001) used a classifier on investor bulletin boards to see if apparently positive postings were correlated with stock price, in which several scoring methods were employed in conjunction with a manually crafted lexicon. Classifying the semantic orientation of individual words or phrases, such as whether they are positive or negative or have different intensities, generally uses a pre-selected set of seed words, sometimes with linguistic heuristics (for example, Lin (1998) and Pereira et al. (1993) used linguistic co-locations to group words with similar uses or meanings).

Some studies showed that restricting features to adjectives for word sentiment classification would improve performance (Andreevskaia & Bergler, 2006; Turney & Littman, 2002; Wiebe, 2000). However, more
research showed that most adjectives and adverbs, and a small group of nouns and verbs, possess semantic orientation (Andreevskaia & Bergler, 2006; Esuli & Sebastiani, 2005; Gamon & Aue, 2005; Takamura, Inui, & Okumura, 2005; Turney & Littman, 2003).

Automatic methods of sentiment annotation at the word level can be grouped into two major categories: (1) corpus-based approaches and (2) dictionary-based approaches. The first group includes methods that rely on syntactic or co-occurrence patterns of words in large texts to determine their sentiment (e.g., Hatzivassiloglou & McKeown, 1997; Turney & Littman, 2002; Yu & Hatzivassiloglou, 2003 and others). The second group uses WordNet (/) information, especially synsets and hierarchies, to acquire sentiment-marked words (Hu & Liu, 2004a; Kim & Hovy, 2004) or to measure the similarity between candidate words and sentiment-bearing words such as good and bad (Kamps, Marx, Mokken, & de Rijke, 2004).

5.1. Analysis by conjunctions between adjectives

This method attempts to predict the orientation of subjective adjectives by analyzing pairs of adjectives (conjoined by and, or, but, either-or, or neither-nor) which are extracted from a large unlabelled document set. The underlying intuition is that the act of conjoining adjectives is subject to linguistic constraints on the orientation of the adjectives involved (e.g., and usually conjoins two adjectives of the same orientation, while but conjoins two adjectives of opposite orientation). This is shown in the following three sentences (where the first two are perceived as correct and the third is perceived as incorrect) taken from Hatzivassiloglou and McKeown (1997):

“The tax proposal was simple and well received by the public”.
“The tax proposal was simplistic but well received by the public”.
“The tax proposal was simplistic and well received by the public”.

To infer the orientation of adjectives from analysis of conjunctions, a supervised learning algorithm can be performed in the following steps:

1. All conjunctions of adjectives are extracted from a set
of documents.
2. Train a log-linear regression classifier and then classify pairs of adjectives as having either the same or different orientation. The hypothesized same-orientation or different-orientation links between all pairs form a graph.
3. A clustering algorithm partitions the graph produced in step 2 into two clusters. By using the intuition that positive adjectives tend to be used more frequently than negative ones, the cluster containing the terms of higher average frequency in the document set is deemed to contain the positive terms.

The log-linear model offers an estimate of how good each prediction is, since it produces a value $y$ between 0 and 1, in which 1 corresponds to same-orientation, and one minus the produced value $y$ corresponds to dissimilarity. Same- and different-orientation links between adjectives form a graph. To partition the graph nodes into subsets of the same orientation, the clustering algorithm calculates an objective function $\Phi$ scoring each possible partition $P$ of the adjectives into two subgroups $C_1$ and $C_2$ as

$$\Phi(P) = \sum_{i=1}^{2} \left( \frac{1}{|C_i|} \sum_{x, y \in C_i,\; x \neq y} d(x, y) \right) \tag{4}$$

where $|C_i|$ is the cardinality of cluster $i$, and $d(x, y)$ is the dissimilarity between adjectives $x$ and $y$.

In general, because the model was unsupervised, it required an immense word corpus to function.

5.2. Analysis by lexical relations

This method presents a strategy for inferring semantic orientation from semantic association between words and phrases. It follows the hypothesis that two words tend to have the same semantic orientation if they have a strong semantic association. Therefore, it focuses on the use of lexical relations defined in WordNet to calculate the distance between adjectives.

Generally speaking, we can define a graph on the adjectives contained in the intersection between a term set (for example, the TL term set (Turney & Littman, 2003)) and WordNet, adding a link between two adjectives whenever WordNet indicates the presence of
a synonymy relation between them, and defining a distance measure using elementary notions from graph theory. In more detail, this approach can be realized in the following steps:

1. Construct relations at the level of words. The simplest approach here is just to collect all words in WordNet and relate words that can be synonymous (i.e., that occur in the same synset).
2. Define a distance measure $d(t_1, t_2)$ between terms $t_1$ and $t_2$ on this graph, which amounts to the length of the shortest path that connects $t_1$ and $t_2$ (with $d(t_1, t_2) = +\infty$ if $t_1$ and $t_2$ are not connected).
3. Calculate the orientation of a term by its relative distance (Kamps et al., 2004) from the two seed terms good and bad, i.e.,

$$SO(t) = \frac{d(t, \text{bad}) - d(t, \text{good})}{d(\text{good}, \text{bad})} \tag{5}$$

4. Obtain the result by the following rule: the adjective $t$ is deemed positive if $SO(t) > 0$ and negative if $SO(t)$ is negative, and the absolute value of $SO(t)$ determines, as usual, the strength of this orientation (the constant denominator $d(\text{good}, \text{bad})$ is a normalization factor that constrains all values of SO to the $[-1, 1]$ range).

5.3. Analysis by glosses

The characteristic of this method lies in the fact that it exploits the glosses (i.e., textual definitions) that a term has in an online “glossary”, or dictionary. Its basic assumption is that if a word is semantically oriented in one direction, then the words in its gloss tend to be oriented in the same direction (Esuli & Sebastiani, 2005; Esuli & Sebastiani, 2006a, 2006b). For instance, the glosses of good and excellent will both contain appreciative expressions, while the glosses of bad and awful will both contain derogative expressions.

Generally, this method can determine the orientation of a term based on the classification of its glosses. The process is composed of the following steps:

1. A seed set $(S_p, S_n)$, representative of the two categories positive and negative, is provided as input.
2. Search for new terms to enrich $S_p$ and $S_n$. Use lexical relations (e.g., synonymy) with the terms contained in $S_p$ and $S_n$ from a thesaurus, or online dictionary, to find these new
terms, and then append them to $S_p$ or $S_n$, yielding the enriched sets $S'_p$ and $S'_n$.
3. For each term $t_i$ in $S'_p \cup S'_n$ or in the test set (i.e., the set of terms to be classified), a textual representation of $t_i$ is generated by collating all the glosses of $t_i$ as found in a machine-readable dictionary. Each such representation is converted into a vector by standard text indexing techniques.
4. A binary text classifier is trained on the terms in $S'_p \cup S'_n$ and then applied to the terms in the test set.

5.4. Analysis by both lexical relations and glosses

This method determines the sentiment of words and phrases by relying on both lexical relations (synonymy, antonymy and hyponymy) and glosses provided in WordNet.

Andreevskaia and Bergler (2006) proposed an algorithm named “STEP” (Semantic Tag Extraction Program). This algorithm starts with a small set of seed words of known sentiment value (positive or negative) and implements the following steps:

1. Extend the small set of seed words by adding synonyms, antonyms and hyponyms of the seed words supplied in WordNet. This step brings on average a 5-fold increase in the size of the original list, with the accuracy of the resulting list comparable to manual annotations.
2. Go through all WordNet glosses, identify the entries that contain in their definitions the sentiment-bearing words from the extended seed list, and add these head words to the corresponding category: positive, negative or neutral.
3. Disambiguate the glosses with a part-of-speech tagger, and eliminate errors of some words acquired in step 1 and from the seed list. At this step, it also filters out all those words that have been assigned contradicting sentiment values.

In this algorithm, for each word we need to compute a Net Overlap Score by subtracting the total number of runs assigning this word a negative sentiment from the total number of runs that consider it positive. In order to make the Net Overlap Score measure usable in sentiment tagging of texts and phrases, the absolute values of this score should be normalized and mapped onto a standard [0,1]
interval. STEP accomplishes this normalization by using the value of the Net Overlap Score as a parameter in the standard fuzzy membership S-function (Zadeh, 1987). This function maps the absolute values of the Net Overlap Score onto the interval from 0 to 1, where 0 corresponds to the absence of membership in the category of sentiment (in this case, these will be the neutral words) and 1 reflects the highest degree of membership in this category. The function can be defined as follows:

$$S(u; a, b, c) = \begin{cases} 0 & \text{if } u \le a \\ 2\left(\dfrac{u - a}{c - a}\right)^{2} & \text{if } a \le u \le b \\ 1 - 2\left(\dfrac{u - c}{c - a}\right)^{2} & \text{if } b \le u \le c \\ 1 & \text{if } u \ge c \end{cases} \tag{6}$$

where $u$ is the Net Overlap Score for the word and $a$, $b$, $c$ are the three adjustable parameters: $a$ is set to 1, $c$ is set to 15, and $b$, which represents a crossover point, is defined as $b = (a + c)/2 = 8$. Defined this way, the S-function assigns the highest degree of membership (= 1) to words that have a Net Overlap Score $u \ge 15$.

The Net Overlap Score can be used as a measure of a word’s degree of membership in the fuzzy category of sentiment: the core adjectives, which had the highest Net Overlap Score, were identified most accurately both by STEP and by human annotators, while the words on the periphery of the category had the lowest scores and were associated with low rates of inter-annotator agreement.

5.5. Analysis by pointwise mutual information

The general strategy of this method is to infer semantic orientation from semantic association. The underlying assumption is that a phrase has a positive semantic orientation when it has good associations (e.g., “romantic ambience”) and a negative semantic orientation when it has bad associations (e.g., “horrific events”) (Turney, 2002).
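The passage on pointwise mutual information names the measure without stating it. As a minimal sketch, assuming the standard definition PMI(a, b) = log2(p(a, b) / (p(a) p(b))) and, following Turney (2002), the reference words “excellent” and “poor”, the orientation of a phrase can be taken as its PMI with the positive word minus its PMI with the negative word. All counts below are hypothetical toy values, the function and variable names are illustrative, and the small smoothing constant that guards against zero co-occurrence counts is an assumption of this sketch.

```python
import math


def pmi(joint_count, count_a, count_b, total):
    """Pointwise mutual information (base 2) estimated from raw counts."""
    p_joint = joint_count / total
    p_a = count_a / total
    p_b = count_b / total
    return math.log2(p_joint / (p_a * p_b))


def semantic_orientation(phrase, counts, joint_counts, total,
                         pos_seed="excellent", neg_seed="poor", smoothing=0.01):
    """PMI(phrase, positive seed) minus PMI(phrase, negative seed)."""
    # The smoothing term keeps log2 defined when a pair never co-occurs.
    pos_joint = joint_counts.get((phrase, pos_seed), 0) + smoothing
    neg_joint = joint_counts.get((phrase, neg_seed), 0) + smoothing
    pos = pmi(pos_joint, counts[phrase], counts[pos_seed], total)
    neg = pmi(neg_joint, counts[phrase], counts[neg_seed], total)
    return pos - neg


# Hypothetical toy statistics over a collection of 10,000 text windows.
total_windows = 10_000
counts = {
    "romantic ambience": 50,
    "horrific events": 40,
    "excellent": 500,
    "poor": 400,
}
joint_counts = {
    ("romantic ambience", "excellent"): 20,
    ("romantic ambience", "poor"): 2,
    ("horrific events", "excellent"): 1,
    ("horrific events", "poor"): 16,
}

for phrase in ("romantic ambience", "horrific events"):
    score = semantic_orientation(phrase, counts, joint_counts, total_windows)
    print(phrase, "positive" if score > 0 else "negative")
```

With these toy counts the first phrase co-occurs mostly with the positive seed and the second mostly with the negative seed, so their scores take opposite signs.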
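Pattern Recognition and Machine Learning: Review Materials (Instructor: Wen Wen)
Wen Wen, School of Computer Science, Guangdong University of Technology
Some issues worth mentioning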
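Complexity of a pattern recognition system: An Example
“Use an optical sensor to capture information and automatically sort the fish on a conveyor belt by species.” Fish Classification: Sea Bass / Salmon
An example
Distinguishing sea bass from salmon. Summary of the problem (in abstract terms): • Pattern recognition system • Design process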
Preprocessing involves:
Overlap in the histograms is small compared to the length feature
Decision boundary
Cost of misclassification
Model complexity
Generalization
Partition the feature space into two regions by finding the decision boundary that minimizes the error.
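For a single feature, the decision boundary reduces to a threshold. The sketch below scans candidate thresholds and keeps the one with the fewest training errors. It is only an illustration: the function name and the measurements are hypothetical, and it assumes that sea bass tend to have larger feature values than salmon.

```python
def best_threshold(salmon, sea_bass):
    """Return (threshold, errors) for the rule: value > threshold -> sea bass."""
    values = sorted(set(salmon) | set(sea_bass))
    # Candidates: one below all samples, midpoints between neighbours, one above.
    candidates = [values[0] - 1.0]
    candidates += [(lo + hi) / 2.0 for lo, hi in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)

    best = None
    for t in candidates:
        # Salmon above the threshold and sea bass at or below it are misclassified.
        errors = sum(x > t for x in salmon) + sum(x <= t for x in sea_bass)
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Made-up feature values with a small overlap between the two classes.
salmon = [2.0, 2.5, 3.0, 3.5, 4.5]
sea_bass = [4.0, 5.0, 5.5, 6.0, 6.5]
threshold, errors = best_threshold(salmon, sea_bass)
```

Counting both error types equally corresponds to equal misclassification costs; weighting the two sums differently would shift the threshold toward the cheaper error.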
Optical Character Recognition (typography)
A new kind of human-computer interaction system. Can you see pattern recognition in it?
Vision
Mechanical Engineering: English Literature with Translation

English original

High Productivity – A Question of Shearer Loader Cutting Sequences

K. Nienhaus, A. K. Bayer & H. Haut, Aachen University of Technology, GER

1 Abstract

Recently, the focus in underground longwall coal mining has been on increasing the installed motor power of shearer loaders and armoured face conveyors (AFC), more sophisticated support control systems and longer face length, in order to reduce costs and achieve higher productivity. These efforts have resulted in higher output and previously unseen face advance rates. The trend towards “bigger and better” equipment and layout schemes, however, is rapidly nearing the limitations of technical and economical feasibility. To realise further productivity increases, organisational changes of longwall mining procedures look like the only reasonable answer. The benefits of optimised shearer loader cutting sequences, leading to better performance, are discussed in this paper.

2 Introduction

Traditionally, in underground longwall mining operations, shearer loaders produce coal using one of the following cutting sequences: uni-directional or bi-directional cycles. Besides these predominant methods, alternative mining cycles have also been developed and successfully applied in underground hard coal mines all over the world. The half-web cutting cycle, as utilized e.g. in RAG Coal International’s Twentymile Mine in Colorado, USA, and the “Opti-Cycle” of Matla’s South African shortwall operation must be mentioned in this context. Other mines have also tested similar but modified cutting cycles resulting in improved output; e.g. productivity increases of up to 40 % are thought possible.
Mechanical Engineering Papers: Chinese-English Parallel Texts

Part 1: Gearbox Noise Correlation with Transmission Error and Influence of Bearing Preload

ABSTRACT

The five appended papers all deal with gearbox noise and vibration. The first paper presents a review of previously published literature on gearbox noise and vibration.

The second paper describes a test rig that was specially designed and built for noise testing of gears. Finite element analysis was used to predict the dynamic properties of the test rig, and experimental modal analysis of the gearbox housing was used to verify the theoretical predictions of natural frequencies.

In the third paper, the influence of gear finishing method and gear deviations on gearbox noise is investigated in what is primarily an experimental study. Eleven test gear pairs were manufactured using three different finishing methods. Transmission error, which is considered to be an important excitation mechanism for gear noise, was measured as well as predicted. The test rig was used to measure gearbox noise and vibration for the different test gear pairs. The measured noise and vibration levels were compared with the predicted and measured transmission error. Most of the experimental results can be interpreted in terms of measured and predicted transmission error. However, it does not seem possible to identify one single parameter, such as measured peak-to-peak transmission error, that can be directly related to measured noise and vibration. The measurements also show that disassembly and reassembly of the gearbox with the same gear pair can change the levels of measured noise and vibration considerably. This finding indicates that other factors besides the gears affect gear noise.

In the fourth paper, the influence of bearing endplay or preload on gearbox noise and vibration is investigated. Vibration measurements were carried out at torque levels of 140 Nm and 400 Nm, with 0.15 mm and 0 mm bearing endplay, and with 0.15 mm bearing preload. The results show that the bearing endplay and
preload influence the gearbox vibrations. With preloaded bearings, the vibrations increase at speeds over 2000 rpm and decrease at speeds below 2000 rpm, compared with bearings with endplay. Finite element simulations show the same tendencies as the measurements.

The fifth paper describes how gearbox noise is reduced by optimizing the gear geometry for decreased transmission error. Robustness with respect to gear deviations and varying torque is considered in order to find a gear geometry giving low noise in an appropriate torque range despite deviations from the nominal geometry due to manufacturing tolerances. Static and dynamic transmission error, noise, and housing vibrations were measured. The correlation between dynamic transmission error, housing vibrations and noise was investigated in speed sweeps from 500 to 2500 rpm at constant torque. No correlation was found between dynamic transmission error and noise. Static loaded transmission error seems to be correlated with the ability of the gear pair to excite vibration in the gearbox dynamic system.

Keywords: gear, gearbox, noise, vibration, transmission error, bearing preload.

ACKNOWLEDGEMENTS

This work was carried out at Volvo Construction Equipment in Eskilstuna and at the Department of Machine Design at the Royal Institute of Technology (KTH) in Stockholm. The work was initiated by Professor Jack Samuelsson (Volvo and KTH), Professor Sören Andersson (KTH), and rs Bråthe (Volvo). The financial support of the Swedish Foundation for Strategic Research and the Swedish Agency for Innovation Systems (VINNOVA) is gratefully acknowledged. Volvo Construction Equipment is acknowledged for giving me the opportunity to devote time to this work. Professor Sören Andersson is gratefully acknowledged for excellent guidance and encouragement. I also wish to express my appreciation to my colleagues at the Department of Machine Design, and especially to Dr. Ulf Sellgren for performing simulations and contributing to the writing of Paper D, and
Dr. Stefan Björklund for performing surface finish measurements. The contributions to Paper C by Dr. Mikael Pärssinen are highly appreciated. All contributions to this work by colleagues at Volvo are gratefully appreciated.

1 INTRODUCTION

1.1 Background

Noise is increasingly considered an environmental issue. This belief is reflected in demands for lower noise levels in many areas of society, including the working environment. Employees spend a lot of time in this environment and noise can lead not only to hearing impairment but also to decreased ability to concentrate, resulting in decreased productivity and an increased risk of accidents. Quality, too, has become increasingly important. The quality of a product can be defined as its ability to fulfill customers’ demands. These demands often change over time, and the best competitors in the market will set the standard.

Noise concerns are also expressed in relation to construction machinery such as wheel loaders and articulated haulers. The gearbox is sometimes the dominant source of noise in these machines. Even if the gear noise is not the loudest source, its pure high frequency tone is easily distinguished from other noise sources and is often perceived as unpleasant. The noise creates an impression of poor quality. In order not to be heard, gear noise must be at least 15 dB lower than other noise sources, such as engine noise.

1.2 Gear noise

This dissertation deals with the kind of gearbox noise that is generated by gears under load. This noise is often referred to as “gear whine” and consists mainly of pure tones at high frequencies corresponding to the gear mesh frequency and multiples thereof, which are known as harmonics. A tone with the same frequency as the gear mesh frequency is designated the gear mesh harmonic, a tone with a frequency twice the gear mesh frequency is designated the second harmonic, and so on. The term “gear mesh harmonics” refers to all multiples of the gear mesh frequency. Transmission error (TE) is
considered an important excitation mechanism for gear whine. Welbourn [1] defines transmission error as “the difference between the actual position of the output gear and the position it would occupy if the gear drive were perfectly conjugate.” Transmission error may be expressed as angular displacement or as linear displacement at the pitch point. Transmission error is caused by deflections, geometric errors, and geometric modifications.

In addition to gear whine, other possible noise-generating mechanisms in gearboxes include gear rattle from gears running against each other without load, and noise generated by bearings. In the case of automatic gearboxes, noise can also be generated by internal oil pumps and by clutches. None of these mechanisms are dealt with in this work, and from now on “gear noise” or “gearbox noise” refers to “gear whine”.

MackAldener [2] describes the noise generation process from a gearbox as consisting of three parts: excitation, transmission, and radiation. The origin of the noise is the gear mesh, in which vibrations are created (excitation), mainly due to transmission error. The vibrations are transmitted via the gears, shafts, and bearings to the housing (transmission). The housing vibrates, creating pressure variations in the surrounding air that are perceived as noise (radiation).

Gear noise can be affected by changing any one of these three mechanisms. This dissertation deals mainly with excitation, but transmission is also discussed in the section of the literature survey concerning dynamic models, and in the modal analysis of the test gearbox in Paper B. Transmission of vibrations is also investigated in Paper D, which deals with the influence of bearing endplay or preload on gearbox noise. Differences in bearing preload influence a bearing’s dynamic properties like stiffness and damping. These properties also affect the vibration of the gearbox housing.

1.3 Objective

The objective of this dissertation is to contribute to knowledge about gearbox
noise. The following specific areas will be the focus of this study:

1. The influence of gear finishing method and gear modifications and errors on noise and vibration from a gearbox.
2. The correlation between gear deviations, predicted transmission error, measured transmission error, and gearbox noise.
3. The influence of bearing preload on gearbox noise.
4. Optimization of gear geometry for low transmission error, taking into consideration robustness with respect to torque and manufacturing tolerances.

2 AN INDUSTRIAL APPLICATION: TRANSMISSION NOISE REDUCTION

2.1 Introduction

This section briefly describes the activities involved in reducing gear noise from a wheel loader transmission. The aim is to show how the optimization of the gear geometry described in Paper E is used in an industrial application. The author was project manager for the “noise work team” and performed the gear optimization.

One of the requirements when developing a new automatic power transmission for a wheel loader was improving the transmission gear noise. The existing power transmission was known to be noisy. When driving at high speed in fourth gear, a high frequency gear whine could be heard. Thus there were now demands for improved sound quality. The transmission is a typical wheel loader power transmission, consisting of a torque converter, a gearbox with four forward speeds and four reverse speeds, and a dropbox partly integrated with the gearbox. The dropbox is a chain of four gears transferring the power to the output shaft. The gears are engaged by wet multi-disc clutches actuated by the transmission hydraulic and control system.

2.2 Gear noise target for the new transmission

Experience has shown that the high frequency gear noise should be at least 15 dB below other noise sources such as the engine in order not to be perceived as disturbing or unpleasant. Measurements showed that if the gear noise could be decreased by 10 dB, this criterion should be satisfied with some margin. Frequency analysis of
the noise measured in the driver's cab showed that the dominant noise from the transmission originated from the dropbox gears. The goal for transmission noise was thus formulated as follows: “The gear noise (sound pressure level) from the dropbox gears in the transmission should be decreased by 10 dB compared to the existing transmission in order not to be perceived as unpleasant.” It was assumed that it would be necessary to make changes to both the gears and the transmission housing in order to decrease the gear noise sound pressure level by 10 dB.

2.3 Noise and vibration measurements

In order to establish a reference for the new transmission, noise and vibration were measured for the existing transmission. The transmission is driven by the same type of diesel engine used in a wheel loader. The engine and transmission are attached to the stand using the same rubber mounts that are used in a wheel loader in order to make the installation as similar as possible to the installation in a wheel loader. The output shaft is braked using an electrical brake.

2.4 Optimization of gears

Noise-optimized dropbox gears were designed by choosing macro- and microgeometries giving lower transmission error than the original (reference) gears. The gear geometry was chosen to yield a low transmission error for the relevant torque range, while also taking into consideration variations in the microgeometry due to manufacturing tolerances. The optimization of one gear pair is described in more detail in Paper E.

Transmission error is considered an important excitation mechanism for gear whine. Welbourn [1] defines it as “the difference between the actual position of the output gear and the position it would occupy if the gear drive were perfectly conjugate.” In this project the aim was to reduce the maximum predicted transmission error amplitude at gear mesh frequency (first harmonic of gear mesh frequency) to less than 50% of the value for the reference gear pair. The first harmonic of transmission error
is the amplitude of the part of the total transmission error that varies with a frequency equal to the gear mesh frequency. A torque range of 100 to 500 Nm was chosen because this is the torque interval in which the gear pair generates noise in its design application. According to Welbourn [1], a 50% reduction in transmission error can be expected to reduce gearbox noise by 6 dB (sound pressure level, SPL).

Transmission error was calculated using the LDP software (Load Distribution Program) developed at the Gear Laboratory at Ohio State University [3]. The “optimization” was not strictly mathematical. The design was optimized by calculating the transmission error for different geometries, and then choosing a geometry that seemed to be a good compromise, considering not only the transmission error, but also factors such as strength, losses, weight, cost, axial forces on bearings, and manufacturing.

When choosing microgeometric modifications and tolerances, it is important to take manufacturing options and cost into consideration. The goal was to use the same finishing method for the optimized gears as for the reference gears, namely grinding using a KAPP VAS 531 and CBN-coated grinding wheels.

For a specific torque and gear macrogeometry, it is possible to define a gear microgeometry that minimizes transmission error. For example, at no load, if there are no pitch errors and no other geometrical deviations, the shape of the gear teeth should be true involute, without modifications like tip relief or involute crowning. For a specific torque, the geometry of the gear should be designed in such a way that it compensates for the differences in deflection related to stiffness variations in the gear mesh. However, even if it is possible to define the optimal gear microgeometry, it may not be possible to manufacture it, given the limitations of gear machining. Consideration must also be given to how to specify the gear geometry in drawings and how to measure the gear in an inspection
machine. In many applications there is also a torque range over which the transmission error should be minimized. Given that manufacturing tolerances are inevitable, and that a demand for smaller tolerances leads to higher manufacturing costs, it is important that gears be robust. In other words, the important characteristics, in this case transmission error, must not vary much when the torque is varied or when the microgeometry of the gear teeth varies due to manufacturing tolerances. LDP [3] was used to calculate the transmission error for the reference and optimized gear pair at different torque levels. The robustness function in LDP was used to analyze the sensitivity to deviations due to manufacturing tolerances. The “min, max, level” method involves assigning three levels to each parameter.

2.5 Optimization of transmission housing

Finite element analysis was used to optimize the transmission housing. The optimization was not performed in a strictly mathematical way, but was done by calculating the vibration of the housing for different geometries and then choosing a geometry that seemed to be a good compromise. Vibration was not the sole consideration; weight, cost, available space, and casting were also considered.

A simplified shell element model was used for the optimization to decrease computational time. This model was checked against a more detailed solid element model of the housing to ensure that the simplification had not changed the dynamic properties too much. Experimental modal analysis was also used to find the natural frequencies of the real transmission housing and to ensure that the model did not deviate too much from the real housing. Gears, shafts, and bearings were modeled as point masses and beams. The model was excited at the bearing positions by applying forces in the frequency range from 1000 to 3000 Hz. The force amplitude was chosen as 10% of the static load from the gears. This choice could be justified because only relative differences are of
interest, not absolute values. The finite element analysis was performed by Torbjörn Johansen at Volvo Technology. The author’s contribution was the evaluation of the results of different housing geometries.

A number of measuring points were chosen in areas with high vibration velocities. At each measuring point the vibration response due to the excitation was evaluated as a power spectral density (PSD) graph. The goal of the housing redesign was to decrease the vibrations at all measuring points in the frequency range 1000 to 3000 Hz.

2.6 Results of the noise measurements

The noise and vibration measurements described in section 2.3 were performed after optimizing the gears and transmission housing. The total sound power level decreased by 4 dB.

2.7 Discussion and conclusions

It seems to be possible to decrease the gear noise from a transmission by decreasing the static loaded transmission error and/or optimizing the housing. In the present study, it is impossible to say how much of the decrease is due to the gear optimization and how much to the housing optimization. Answering this question would have required at least one more noise measurement, but time and cost issues precluded this. It would also have been interesting to perform the noise measurements on a number of transmissions, both before and after optimizing the gears and housing, in order to determine the scatter of the noise of the transmissions. Even though the goal of decreasing the gear noise by 10 dB was not reached, the goal of reducing the gear noise in the wheel loader cab to 15 dB below the overall noise was achieved. Thus the noise optimization was successful.

3 SUMMARY OF APPENDED PAPERS

3.1 Paper A: Gear Noise and Vibration – A Literature Survey

This paper presents an overview of the literature on gear noise and vibration. It is divided into three sections dealing with transmission error, dynamic models, and noise and vibration measurement. Transmission error is an important excitation mechanism for gear noise
and vibration. It is defined as “the difference between the actual position of the output gear and the position it would occupy if the gear drive were perfectly conjugate” [1]. The literature survey revealed that while most authors agree that transmission error is an important excitation mechanism for gear noise and vibration, it is not the only one. Other possible time-varying noise excitation mechanisms include friction and bending moment. Noise produced by these mechanisms may be of the same order of magnitude as that produced by transmission error, at least in the case of gears with low transmission error [4].

The second section of the paper deals with dynamic modeling of gearboxes. Dynamic models are often used to predict gear-induced vibrations and investigate the effect of changes to the gears, shafts, bearings, and housing. The literature survey revealed that dynamic models of a system consisting of gears, shafts, bearings, and gearbox casing can be useful in understanding and predicting the dynamic behavior of a gearbox. For relatively simple gear systems, lumped parameter dynamic models with springs, masses, and viscous damping can be used. For more complex models that include such elements as the gearbox housing, finite element modeling is often used.

The third section of the paper deals with noise and vibration measurement and signal analysis, which are used when experimentally investigating gear noise. The survey shows that these are useful tools in experimental investigation of gear noise because gears create noise at specific frequencies related to the number of teeth and the rotational speed of the gear.

3.2 Paper B: Gear Test Rig for Noise and Vibration Testing of Cylindrical Gears

Paper B describes a test rig for noise testing of gears. The rig is of the recirculating power type and consists of two identical gearboxes, connected to each other with two universal joint shafts. Torque is applied by tilting one of the gearboxes around one of its axles. This tilting is
made possible by bearings between the gearbox and the supporting brackets. A hydraulic cylinder creates the tilting force. Finite element analysis was used to predict the natural frequencies and mode shapes for individual components and for the complete gearbox. Experimental modal analysis was carried out on the gearbox housing, and the results showed that the FE predictions agree with the measured frequencies (error less than 10%). The FE model of the complete gearbox was also used in a harmonic response analysis. A sinusoidal force was applied in the gear mesh and the corresponding vibration amplitude at a point on the gearbox housing was predicted.

3.3 Paper C: A Study of Gear Noise and Vibration

Paper C reports on an experimental investigation of the influence of gear finishing methods and gear deviations on gearbox noise and vibration. Test gears were manufactured using three different finishing methods and with different gear tooth modifications and deviations. Table 3.3.1 gives an overview of the test gear pairs. The surface finishes and geometries of the gear tooth flanks were measured. Transmission error was measured using a single flank gear tester. LDP software from Ohio State University was used for transmission error computations. The test rig described in Paper B was used to measure gearbox noise and vibration for the different test gear pairs.

The measurements showed that disassembly and reassembly of the gearbox with the same gear pair might change the levels of measured noise and vibration. The rebuild variation was sometimes of the same order of magnitude as the differences between different tested gear pairs, indicating that other factors besides the gears affect gear noise. In a study of the influence of gear design on noise, Oswald et al. [5] reported rebuild variations of the same order of magnitude.

Different gear finishing methods produce different surface finishes and structures, as well as different geometries and deviations of the gear tooth flanks, all of
which influence the transmission error and thus the noise level from a gearbox. Most of the experimental results can be explained in terms of measured and computed transmission error. The relationship between predicted peak-to-peak transmission error and measured noise at a torque level of 500 Nm is shown in Figure 3.3.1. There appears to be a strong correlation between computed transmission error and noise for all cases except gear pair K. However, this correlation breaks down in Figure 3.3.2, which shows the relationship between predicted peak-to-peak transmission error and measured noise at a torque level of 140 Nm. The final conclusion is that it may not be possible to identify a single parameter, such as peak-to-peak transmission error, that can be directly related to measured noise and vibration.

3.4 Paper D: Gearbox Noise and Vibration – Influence of Bearing Preload

The influence of bearing endplay or preload on gearbox noise and vibrations is investigated in Paper D. Measurements were carried out on a test gearbox consisting of a helical gear pair, shafts, tapered roller bearings, and a housing. Vibration measurements were carried out at torque levels of 140 Nm and 400 Nm with 0.15 mm and 0 mm bearing endplay and with 0.15 mm bearing preload. The results show that the bearing endplay or preload influences gearbox vibrations. Compared with bearings with endplay, preloaded bearings show an increase in vibrations at speeds over 2000 rpm and a decrease at speeds below 2000 rpm. Figure 3.4.1 is a typical result showing the influence of bearing preload on gearbox housing vibration. After the first measurement, the gearbox was not disassembled or removed from the test rig. Only the bearing preload/endplay was changed from 0 mm endplay/preload to 0.15 mm preload. Therefore the differences between the two measurements are solely due to different bearing preload. FE simulations performed by Sellgren and Åkerblom [6] show the same trend as the measurements here. For the test gearbox, it seems that
bearing preload, compared with endplay, decreased the vibrations at speeds below 2000 rpm and increased vibrations at speeds over 2000 rpm, at least at a torque level of 140 Nm.

3.5 Paper E: Gear Geometry for Reduced and Robust Transmission Error and Gearbox Noise

In Paper E, gearbox noise is reduced by optimization of gear geometry for decreased transmission error. The optimization was not performed strictly mathematically. It was done by calculating the transmission error for different geometries and then choosing a geometry that seemed to be a good compromise considering not only the transmission error, but also other important characteristics. Robustness with respect to gear deviations and varying torque was considered in order to find gear geometry with low transmission error in the appropriate torque range despite deviations from the nominal geometry due to manufacturing tolerances. Static and dynamic transmission error as well as noise and housing vibrations were measured. The correlation between dynamic transmission error, housing vibrations, and noise was investigated in a speed sweep from 500 to 2500 rpm at constant torque. No correlation was found between dynamic transmission error and noise.

4 DISCUSSION AND CONCLUSIONS

Static loaded transmission error seems to be strongly correlated to gearbox noise. Dynamic transmission error does not seem to be correlated to gearbox noise in speed

Part 2: Chinese-English Glossary of Mechanical Engineering Terms

机床: machine tool; 金属工艺学: technology of metals; 刀具: cutter; 摩擦: friction; 联结: link; 传动: drive/transmission; 轴: shaft; 弹性: elasticity
频率特性: frequency characteristic; 误差: error; 响应: response; 定位: positioning; 机床夹具: jig; 动力学: dynamics; 运动学: kinematics; 静力学: statics; 分析力学: analytical mechanics
拉伸: tension; 压缩: compression; 剪切: shear; 扭转: torsion; 弯曲应力: bending stress; 强度: strength
三相交流电: three-phase AC; 磁路: magnetic circuit; 变压器: transformer; 异步电动机: asynchronous motor; 几何形状: geometrical shape; 精度: precision; 正弦形的: sinusoidal; 交流电路: AC circuit
机械加工余量: machining allowance; 变形力: deforming force; 变形: deformation; 应力: stress; 硬度: hardness
热处理: heat treatment; 退火: annealing; 正火: normalizing; 脱碳: decarburization; 渗碳: carburization
电路: circuit; 半导体元件: semiconductor element; 反馈: feedback; 发生器: generator; 直流电源: DC electrical source; 门电路: gate circuit; 逻辑代数: logic algebra
外圆磨削: external grinding; 内圆磨削: internal grinding; 平面磨削: plane grinding; 变速箱: gearbox; 离合器: clutch; 绞孔: reaming; 绞刀: reamer; 螺纹加工: thread processing; 螺钉: screw; 铣削: milling; 铣刀: milling cutter
功率: power; 工件: workpiece; 齿轮加工: gear machining; 齿轮: gear
主运动: main movement; 主运动方向: direction of main movement; 进给方向: direction of feed; 进给运动: feed movement; 合成进给运动: resultant movement of feed; 合成切削运动: resultant movement of cutting; 合成切削运动方向: direction of resultant movement of cutting; 切削深度: cutting depth
前刀面: rake face; 刀尖: nose of tool; 前角: rake angle; 后角: clearance angle; 龙门刨削: planing
主轴: spindle; 主轴箱: headstock; 卡盘: chuck; 加工中心: machining center; 车刀: lathe tool; 车床: lathe; 钻削镗削: drilling and boring; 车削: turning; 磨床: grinder; 基准: benchmark
钳工: fitter; 锻: forge; 压模: stamping; 焊: weld; 拉床: broaching machine; 拉孔: broaching; 装配: assembling; 铸造: casting
流体动力学: fluid dynamics; 流体力学: fluid mechanics; 加工: machining; 液压: hydraulic pressure; 切线: tangent; 机电一体化: mechatronics (mechanical-electrical integration); 气压: air pressure (pneumatic pressure); 稳定性: stability; 介质: medium
液压驱动泵: fluid clutch; 液压泵: hydraulic pump; 阀门: valve; 失效: failure; 强度: strength; 载荷: load; 应力: stress; 安全系数: safety factor; 可靠性: reliability
螺纹: thread; 螺旋: helix; 键: key; 销: pin; 滚动轴承: rolling bearing; 滑动轴承: sliding bearing; 弹簧: spring; 制动器: brake; 十字结联轴节: crosshead; 联轴器: coupling; 链: chain; 皮带: belt
精加工: finish machining; 粗加工: rough machining; 变速箱体: gearbox casing; 腐蚀: corrosion; 氧化: oxidation; 磨损: wear; 耐用度: durability
随机信号: random signal; 离散信号: discrete signal; 超声传感器: ultrasonic sensor

Part 3: Mechanical Engineering Thesis Abstract (Chinese and English)

Abstract: This paper mainly discusses the design approach and design process of a PLC-based control system for a steel pipe bundling machine.
Chapter 01 Introduction: Design of Machine Elements (English lecture slides)
Other Purposes
Develop competence in creative design and in solving practical problems
Normal load transmitters: sliding bearings, rolling-element bearings
Torque transmitters: gears, traction drives, chain drives, belt drives, power screws
Nonengineering decisions regarding marketability, product liability, ethics, politics, etc., must be integrated into the design process early.
4. Contents and Purpose of the Course
In addition, strength, reliability, deformation, and tribology (friction, wear, and lubrication) also need to be considered.
2) Economic Demands
The objective is to produce a machine that not only functions properly for a reasonable time but is also economically feasible.
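An Analysis of the Empathy Problem in Human-Robot Interaction
Experience Exchange, Digital Communication World (DCW), 2023.01

1 The empathy phenomenon in social robots

As artificial intelligence technology continues to develop, social robots capable of communicating with people have appeared in society.
A social robot is a robot that interacts and communicates with humans or other autonomous entities while maintaining its own identity and complying with ethical norms.
When a person communicates with a social robot, the robot uses affective artificial intelligence technology to recognize the user's emotions, performs the corresponding affective computing, and then produces an emotional output for the user.
Empathy therefore inevitably arises in the interaction between people and social robots.
As the name suggests, empathy serves to convey emotion: it can objectify a person's subjective feelings, will, actions, and other behavior. It is an emotional projection that arises when subject and object each act as subject for the other; it is the objectification of the person and the humanization of the object, a deepening process that moves from objective physicality through sensation and perception to the mental.
Scholars generally hold that empathy also plays an important role in shaping people's responses to designed social robots.
Some researchers even point out that it is one of the key factors in establishing and maintaining social relationships between people, as well as between people and social robots.
However, in the interaction between people and social robots, the social grounding of the term “empathy” makes it hard to apply to research on social robots: we have to ask whether it is theoretically sound to use social terms and explanations for these inanimate machines.
Empathy is usually regarded in theory as a process related to both social cognition and social emotion, but this view is problematic for human-robot interaction, because social relationships are traditionally understood as interactions between two conscious, rational entities.
Social robots, however, do not meet the standard of conscious and rational subjectivity, and therefore cannot take part in social relationships understood in this way.
Nevertheless, some theories do provide grounds for speaking of social robots as participants in social relationships.
The development of social robotics increasingly focuses on placing the robot in an interaction loop, that is, giving it social characteristics so that, during interaction, it creates social situations in which people feel emotion, express emotion, and complete the associated emotional responses.
Empathy should be understood in an intersubjective, processual, and social way.
Luisa Damiano argues that emotion is a phenomenon shared and co-created by the many participants in a particular social situation [1].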
English Essay: Discussing the Issue of Artificial Intelligence

Artificial Intelligence: The Future of Humanity

The rapid advancements in technology have led to the emergence of a new and revolutionary field: artificial intelligence (AI). AI, the ability of machines to exhibit human-like intelligence and perform tasks that typically require human cognition, has the potential to transform our world in ways we have never imagined. As we delve deeper into this captivating realm, it is essential to explore the multifaceted implications of AI and its impact on our society.

One of the most significant advantages of AI is its ability to streamline and optimize various processes, leading to increased efficiency and productivity. In the realm of healthcare, AI-powered systems can analyze vast amounts of medical data, identify patterns, and assist in the early detection of diseases, ultimately improving patient outcomes. Similarly, in the financial sector, AI algorithms can analyze market trends, make informed investment decisions, and detect fraudulent activities with remarkable accuracy.

Furthermore, AI has the potential to revolutionize the way we approach education. Intelligent tutoring systems can adapt to the individual learning styles and needs of students, providing personalized instruction and feedback. This not only enhances the learning experience but also enables educators to allocate their time and resources more effectively, catering to the unique requirements of each student.

However, the integration of AI into our lives is not without its challenges and concerns. One of the primary issues is the potential displacement of human labor by automation. As AI-powered systems become more sophisticated, they may replace certain jobs, leading to widespread unemployment and economic disruption.
This raises important questions about the future of work and the need for new educational and training programs to equip individuals with the skills necessary to thrive in an AI-driven economy.
Another significant concern is the ethical implications of AI. As these systems become more advanced, they may be required to make complex decisions that have far-reaching consequences. For instance, autonomous vehicles may need to make split-second decisions in the event of an accident, potentially prioritizing the safety of the passengers over pedestrians. Addressing these ethical dilemmas and ensuring that AI systems are aligned with human values and principles is crucial.
Additionally, the issue of data privacy and security is of paramount importance. AI systems rely on vast amounts of data to function effectively, and this data can be vulnerable to breaches and misuse. Ensuring the protection of personal information and preventing the exploitation of data by malicious actors is a pressing challenge that must be addressed.
Furthermore, the potential for AI to be used for nefarious purposes, such as the development of autonomous weapons or the spread of misinformation, is a significant concern. Responsible development and deployment of AI are essential to mitigate these risks and ensure that the technology is used for the betterment of humanity.
Despite these challenges, the potential benefits of AI are immense. In the field of scientific research, AI can accelerate the pace of discovery by analyzing vast datasets, identifying patterns, and generating hypotheses that human researchers may have overlooked. This can lead to groundbreaking advancements in fields such as medicine, climate change, and space exploration.
Moreover, AI can play a crucial role in addressing global issues, such as climate change and sustainable development.
By leveraging AI-powered systems to monitor environmental data, optimize energy usage, and develop innovative solutions, we can work towards a more sustainable future for our planet.
As we navigate the complex landscape of AI, it is essential to strike a balance between harnessing its immense potential and addressing the ethical, social, and economic challenges that come with it. This will require a collaborative effort involving policymakers, researchers, technologists, and the general public to ensure that the development and deployment of AI are guided by principles of transparency, accountability, and the betterment of humanity.
In conclusion, artificial intelligence is a transformative force that holds the power to reshape our world in profound ways. While the challenges and concerns surrounding AI are significant, the potential benefits are equally remarkable. By embracing the responsible development and implementation of AI, we can unlock new frontiers of innovation, problem-solving, and human progress, ultimately shaping a future that is brighter and more sustainable for all.
English essay: views on machinery
Title: A Perspective on Machinery
Machinery has been an integral part of human civilization for centuries, revolutionizing the way we live, work, and interact with the world around us. From the simplest tools to the most advanced robotics, machinery has continually shaped our societies and economies. In this essay, we will explore various perspectives on machinery, considering its benefits, drawbacks, and future implications.
Firstly, machinery has undoubtedly brought about immense benefits to human society. One of the most significant advantages is the increase in efficiency and productivity. Machines can perform tasks with speed and precision that surpass human capabilities, leading to higher output and economic growth. For example, in manufacturing industries, automated assembly lines have streamlined production processes, resulting in lower costs and higher quality goods.
Moreover, machinery has played a crucial role in improving safety and reducing labor-intensive work. Dangerous or physically demanding tasks can now be undertaken by machines, minimizing the risk of accidents and injuries to workers. This has not only protected human lives but also enhanced the overall well-being of workers by freeing them from hazardous conditions.
Additionally, machinery has facilitated innovation and technological advancement. The development of machinery has spurred breakthroughs in various fields, from healthcare to transportation. Medical equipment, such as MRI machines and robotic surgery systems, has revolutionized healthcare delivery, enabling precise diagnosis and treatment of illnesses. Similarly, advancements in transportation machinery, such as airplanes and automobiles, have transformed travel, connecting people and cultures across the globe.
However, despite its numerous benefits, machinery also presents certain drawbacks and challenges. One of the primary concerns is the displacement of human labor.
As machines become increasingly sophisticated, there is a growing fear that they will replace human workers, leading to unemployment and economic inequality. This issue is particularly salient in industries where automation is rapidly replacing traditional jobs, such as manufacturing and agriculture.
Furthermore, the overreliance on machinery can have adverse environmental consequences. Many machines rely on fossil fuels for power, contributing to air and water pollution, as well as climate change. Additionally, the extraction of resources for manufacturing machinery can deplete natural resources and harm ecosystems. To mitigate these impacts, there is a pressing need for the development of sustainable technologies and practices in the design and use of machinery.
Looking ahead, the future of machinery holds both promise and challenges. With advancements in artificial intelligence and robotics, machines are becoming increasingly autonomous and intelligent. While this presents opportunities for innovation and efficiency, it also raises ethical and societal questions regarding the role of machines in our lives. It is crucial to ensure that the development and deployment of machinery are guided by ethical principles and considerations of human welfare.
In conclusion, machinery has been a driving force behind human progress, offering numerous benefits in terms of efficiency, safety, and innovation. However, it also poses challenges related to unemployment, environmental sustainability, and ethical implications. By harnessing the potential of machinery responsibly and ethically, we can continue to reap its benefits while addressing its drawbacks, ultimately shaping a more prosperous and sustainable future for generations to come.
This essay aims to provide a comprehensive overview of the various perspectives on machinery, considering its impacts on society, economy, and the environment.
Through critical analysis and reflection, we can gain a deeper understanding of the role of machinery in our lives and work towards harnessing its potential for the greater good.
automation in construction endnote - Reply
Automation in Construction
Automation in construction refers to the use of advanced technology and robotics to perform tasks and processes that were previously done manually in the construction industry. This includes tasks such as site preparation, material handling, building erection, and even maintenance and repair. The goal of automation in construction is to increase efficiency, productivity, and safety while reducing costs and reliance on human labor. In this article, we will explore the various aspects of automation in construction and its implications for the industry.
1. Introduction to Automation in Construction
Construction is traditionally a labor-intensive industry, requiring a significant amount of manual work. However, with advancements in technology, automation is becoming increasingly prevalent in the field. Automation in construction involves the use of machinery, robots, and computer-controlled systems to carry out construction tasks. These technologies can be used for various purposes, such as excavation, handling and transportation of materials, building assembly, and quality control.
2. Advantages of Automation in Construction
Automation in construction offers several advantages, both for construction companies and the overall industry. Firstly, automation can significantly improve productivity. Machines and robots can carry out tasks quickly and efficiently, reducing the time required to complete projects. This leads to cost savings and increased profitability for construction companies.
Secondly, automation can enhance safety in construction. Construction sites are inherently dangerous, and automation can help reduce the risk of accidents. By replacing manual labor in hazardous activities, such as heavy lifting or working at heights, automation can protect workers from injury and even save lives.
Furthermore, automation can improve the quality of construction projects.
By using computer-controlled systems, construction processes can be closely monitored and controlled, leading to higher precision and accuracy. This results in buildings and infrastructure that are built to exact specifications and meet stringent quality standards.
3. Technologies in Automation
There are several key technologies that are driving automation in construction. One of these technologies is Building Information Modeling (BIM). BIM is a digital representation of a building or infrastructure project that contains information about its design, materials, and construction. BIM allows for better coordination and communication among stakeholders, reducing errors and rework. It also enables the simulation and optimization of construction processes before they are implemented, ensuring efficiency and cost-effectiveness.
Another technology is robotic systems. Robots can perform tasks such as bricklaying, concrete pouring, and even welding. These systems are designed to be highly accurate and efficient, allowing for faster construction and improved quality. They can also work in hazardous or challenging environments, reducing risks to human workers.
Additionally, automation in construction includes the use of drones for surveying and inspection purposes. Drones can capture high-resolution images and videos of construction sites, providing valuable data for decision-making and monitoring progress. This technology allows for faster and more accurate surveys, reducing costs and improving project management.
4. Challenges and Future Trends
Despite the numerous benefits of automation in construction, there are also challenges and concerns that need to be addressed. One of the main concerns is the displacement of workers. As automation replaces manual labor, there is a risk of job losses in the construction industry.
However, it is important to note that automation also creates new job opportunities, particularly in the areas of technology maintenance and operation.
Another challenge is the initial investment required for automation technology. Implementing automated systems can be expensive, and smaller construction companies may find it difficult to adopt these technologies. However, as the technology continues to evolve and become more accessible, the initial costs are gradually decreasing, making it more feasible for a wider range of companies.
In terms of future trends, automation in construction is expected to continue growing and evolving. Advancements in artificial intelligence, robotics, and 3D printing are likely to lead to even more sophisticated automation systems in the future. These technologies will enable faster construction, increased customization, and further improvements in safety and quality.
In conclusion, automation in construction is transforming the industry by increasing productivity, enhancing safety, and improving quality. Through the use of advanced technologies such as BIM, robotics, and drones, construction processes are becoming more efficient, cost-effective, and precise. While there are challenges to overcome, automation in construction is set to become even more widespread in the future, driving innovation and reshaping the way buildings and infrastructure are built.
Machinery reply: English essay
Title: The Impact of Artificial Intelligence on Society
Introduction:
Artificial Intelligence (AI) has emerged as one of the most transformative technologies of the 21st century. Its applications span various domains, revolutionizing industries, economies, and even social structures. This essay explores the multifaceted impact of AI on society, delving into its benefits, challenges, and ethical implications.
Benefits of AI:
One of the primary benefits of AI lies in its ability to augment human capabilities and enhance productivity. Through automation and machine learning algorithms, AI streamlines repetitive tasks, allowing humans to focus on more complex and creative endeavors. For instance, in the healthcare sector, AI-powered diagnostic tools can analyze medical images with greater accuracy and efficiency than human professionals, leading to quicker diagnoses and improved patient outcomes.
Moreover, AI facilitates personalized experiences and services across different sectors. Recommendation systems powered by AI algorithms provide tailored suggestions to users based on their preferences and past behavior, enhancing user satisfaction and engagement. In e-commerce, for example, AI-driven recommendation engines enable platforms to offer personalized product recommendations, thereby increasing sales and customer loyalty.
Challenges of AI:
However, the widespread adoption of AI also presents significant challenges, particularly concerning job displacement and economic inequality. As AI technologies automate routine tasks, there is a growing concern about the loss of jobs in various industries. Workers whose jobs are susceptible to automation may face unemployment or the need for retraining to adapt to new roles.
Furthermore, AI exacerbates existing disparities in access to resources and opportunities. Wealthier nations and corporations with greater financial resources tend to have more extensive access to AI technologies, widening the gap between developed and developing countries.
Additionally, there are concerns about data privacy and security, as the collection and analysis of vast amounts of personal data by AI systems raise ethical and legal questions regarding consent and surveillance.
Ethical Implications:
The ethical implications of AI extend beyond economic considerations to encompass issues of bias, accountability, and transparency. AI algorithms are susceptible to biases inherent in the data used to train them, leading to discriminatory outcomes, particularly in areas such as hiring, lending, and criminal justice. Addressing these biases requires careful attention to data collection, algorithm design, and ongoing monitoring to ensure fairness and equity.
Moreover, as AI systems become more autonomous and capable of making decisions without human intervention, questions arise regarding accountability and responsibility for their actions. In cases where AI systems cause harm or make erroneous decisions, determining liability and recourse becomes challenging, raising concerns about legal frameworks and ethical oversight.
Conclusion:
In conclusion, while AI offers tremendous potential to transform society positively, its widespread adoption also poses significant challenges and ethical dilemmas. Balancing the benefits of AI with its potential risks requires thoughtful consideration of its impact on employment, inequality, and ethics. By fostering collaboration between policymakers, technologists, and ethicists, we can harness the power of AI to create a more equitable and prosperous future for all.
English essay discussing intelligent machines
The Evolution and Impact of Intelligent Machines in the English-Speaking World
In the rapidly evolving landscape of technology, intelligent machines have become an integral part of our daily lives. From automated assistants to complex robots, these machines are not just tools; they are agents of change, reshaping our world in ways that we are still coming to understand. This essay delves into the development, applications, and societal implications of intelligent machines in the English-speaking world.
The Dawn of Artificial Intelligence
The origins of artificial intelligence (AI) can be traced back to the mid-20th century, when pioneering thinkers like Alan Turing and John McCarthy explored the possibilities of machines that could think and learn. Their visionary ideas laid the foundation for what we now know as AI. In the English-speaking world, these early concepts took root in academic institutions, with research laboratories and universities leading the way in developing AI technologies.
The Evolution of Intelligent Machines
Over the decades, intelligent machines have evolved from simple rule-based systems to complex learning algorithms. Today, they are capable of performing tasks that were once thought to be solely within the realm of human intelligence. These machines can analyze vast amounts of data, make.
Advancements in Machine Learning
Machine learning has been a revolutionary field in the realm of artificial intelligence, continuously advancing and evolving to solve complex problems and improve efficiency in various industries. With the rapid growth of data collection and processing capabilities, machine learning algorithms have become increasingly sophisticated, enabling machines to learn from data and make predictions or decisions without being explicitly programmed. This has led to significant advancements in areas such as healthcare, finance, transportation, and many others, transforming the way tasks are performed and problems are solved.
One of the key advantages of machine learning is its ability to analyze massive amounts of data quickly and accurately, identifying patterns and trends that may not be apparent to human analysts. This has proven to be particularly valuable in the field of healthcare, where machine learning algorithms can analyze medical images, genetic data, and patient records to diagnose diseases, predict outcomes, and personalize treatment plans. For example, machine learning models have been developed to detect early signs of diseases such as cancer, diabetes, and Alzheimer's, significantly improving the chances of successful treatment and recovery.
In the financial sector, machine learning algorithms are used to analyze market trends, predict stock prices, detect fraudulent activities, and automate trading strategies. These algorithms can process vast amounts of financial data in real time, making split-second decisions to optimize investment portfolios and minimize risks. This has led to the development of algorithmic trading systems that can execute trades at speeds and frequencies beyond human capabilities, revolutionizing the way financial markets operate.
In the transportation industry, machine learning is being used to optimize route planning, improve traffic flow, and enhance the safety of autonomous vehicles.
By analyzing data from sensors, cameras, and GPS devices, machine learning algorithms can predict traffic patterns, identify potential hazards, and make real-time decisions to avoid accidents. This technology has the potential to revolutionize the way we travel, making transportation safer, more efficient, and more environmentally friendly.
Despite the numerous benefits of machine learning, there are also concerns about its ethical implications and potential biases. Machine learning algorithms are only as good as the data they are trained on, and if the data is biased or incomplete, the algorithms may produce biased or inaccurate results. This is particularly concerning in areas such as criminal justice, where machine learning algorithms are used to predict recidivism rates and make sentencing decisions. If the training data is biased against certain demographic groups, the algorithms may perpetuate existing inequalities and injustices.
Another challenge facing machine learning is the lack of transparency and interpretability in some algorithms. Deep learning models, in particular, are often referred to as "black boxes" because it is difficult to understand how they arrive at their decisions. This lack of transparency can be problematic in critical applications such as healthcare and finance, where decisions have significant consequences. Researchers are working on developing more interpretable machine learning models that can explain their reasoning and provide insights into how decisions are made.
In conclusion, machine learning has made remarkable advancements in recent years, revolutionizing industries and transforming the way tasks are performed. From healthcare to finance to transportation, machine learning algorithms are being used to analyze data, make predictions, and automate decision-making processes.
While the benefits of machine learning are undeniable, it is crucial to address ethical concerns, biases, and transparency issues to ensure that these technologies are used responsibly and ethically. By continuing to research and develop more advanced and interpretable machine learning models, we can harness the full potential of this technology to improve lives and create a more equitable and sustainable future.
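The essay's point that a model trained on biased data can treat demographic groups differently can be made concrete with a simple check. The sketch below is one common way to quantify such a gap (often called the demographic parity difference); it is an illustration under stated assumptions, not a method the essay itself prescribes. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Return the share of positive (1) predictions for each group label."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = model predicts "high risk", 0 = "low risk".
toy_predictions = [1, 1, 1, 0, 1, 0, 0, 0]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap = demographic_parity_gap(toy_predictions, toy_groups)
print(f"positive-rate gap between groups: {gap:.2f}")
```

A gap near zero means the groups receive positive predictions at similar rates. A large gap does not prove unfairness on its own, but it flags a model for closer review.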
Machine Tools Join the I (mechanical engineering English article, with translation)
Machine Tools Join the Internet Age
Visitors to Chicago's upcoming International Manufacturing Technology Show (IMTS 2008) will have an opportunity to witness an industry-changing exhibit billed as "the most exciting development since numerical control." The reference is to MTConnect, a new open communication protocol standard for interconnectability between machines, independent systems, devices and higher-level applications, says Paul Warndorf, AMT's vice president, technology.
The numerical control (NC) concept first emerged as a "hot" discussion topic way back in 1955, but it wasn't until the 1972 IMTS show that NC resulted in exhibited, commercial products that proceeded to revolutionize the manufacturing industry's productivity and quality.
Now, decades later, a concept of equal potential has been launched. AMT compares MTConnect's standardization significance with an initiative from the 1860s: standardizing screw threads for industrial usage. Warndorf says all three are initiatives of far-reaching significance in the manufacturing world.
In facilitating, coordinating and showcasing the industrywide effort, AMT is attempting to mirror the success occurring in the information technology world, says Peter Eelman, IMTS vice president, exhibitions. Although the standard's initial developmental thrust is in the machine tool sector, MTConnect is really an effort to solve data connectivity across all of discrete manufacturing, adds Warndorf.
The idea is to allow devices, equipment and systems to output data in an independent format that any other device using the same standard can read. He says MTConnect will enable everyone in the production supply chain to be part of making the manufacturing enterprise more productive. MTConnect will be open and royalty-free to ensure the widest possible acceptance and utility.
MTConnect will have a pervasive presence at IMTS.
For example, in the Emerging Technology Center visitors and exhibitors can either view a video explaining the concept or try it out via MTConnect-equipped computerized kiosks. Warndorf says the computerized kiosks are networked to the machine tools of participating exhibitors on the show floor. Activating the touchscreens delivers status reports from the machine in the specified booth. Warndorf says that data reporting is just the beginning of how far the information presentation could go in the future. "We want to tantalize the user community. We want to start them thinking about the future possibilities of expanding the concept."
Warndorf admits further inspiration might also come from the process industries, where communication standards have both an established history and established service providers.
One example is SmartSignal Corp. Data communication standards in the process industries enable its services business to deliver advanced asset analytics. "We provide detailed, real-time monitoring of 2,000 power plants," says David Bell, vice president, application engineering.
Through its Internet collaboration features, SmartSignal works with plant personnel until the problems are investigated and resolved. The company's services solution is designed to provide the intelligence and guidance needed to predict, diagnose and prioritize problems even before traditional, often-difficult-to-maintain, condition-based monitoring tools detect them, says Bell.
With MTConnect in discrete manufacturing, Warndorf sees future usage evolving to include control functionality as well as data access. Possibilities include leveraging MTConnect to control power consumption. Typically, data collection and control will proceed together, says Warndorf. "After all, effective control depends on knowing what's going on, and developing that knowledge base means data access."
Warndorf says machine tool users recognize that MTConnect relates to significant competitive issues.
To grow that theme even further, AMT has an ongoing university competition focusing on new ways to strengthen that value. "The challenge: How can we get applications to now utilize real data to improve their output?" IMTS 2008 marks the midway point in this competition, notes Eelman. The finalists will be awarded their prizes in October 2009 at the EMO 2009 exhibition in Milan, Italy.
Immediate Data Access
Warndorf's vision for the XML-based middleware standard is to create a whole new future for manufacturing managers, where immediate data access will provide advantages that won't stop growing.
Warndorf says comments by AMT members indicate their enthusiastic acceptance of MTConnect. For example: "You're not moving fast enough!" Machine tool manufacturer Mazak, for example, has already conceptually incorporated MTConnect approaches in its e-Tower machine tool communication nodes, says Chuck Birkle, vice president, sales and marketing.
Haas is another instance of enthusiastic support, says Kurt Zierhut, director of electrical engineering at the company.
AMT's development budget for MTConnect was originally set at $1 million, but the investment has already reached $2 million, adds Warndorf.
To Brian Papke, president of Mazak Corp., the strongest justification for MTConnect is boldly evident among all of the IMTS exhibits. "It is the growing intelligence of machine tools and the growing need to access it."
Papke says Mazak's exhibit of 20 machine tools (booth A-8101) exemplifies the ever-increasing machine intelligence trend. One example, says Papke, is Mazak's Integrex i-150 Multi-Tasking Center for small, complex parts. With an ultra-compact footprint, the product is designed for medical appliance manufacturers for high-precision component machining involving round, square or angular characteristics.
With its Active Vibration Control, the machine optimizes acceleration and deceleration parameters during changes in direction through the extensive look-ahead capabilities in the Mazatrol Matrix CNC control.
Also contributing to accuracy is the Intelligent Thermal Shield. The comprehensive algorithm differentiates dynamic thermal impact (for example, spindle rotation) from static thermal impact (base or column structure growth) to arrive at the proper displacement adjustment. Multiple sensors retrieve the thermal data for processing.
Sensors are also involved with the i-150's Intelligent Performance Spindle Monitoring feature. Performance analysis is accomplished through a series of sensors located in the spindle. Machining errors and maintenance intervention are avoided by analyzing such things as temperature, vibration and displacement.
Machine intelligence also handles safety and reliability on the i-150. For example, an Intelligent Safety Shield feature is a dynamic, 3-D simulation of machine components, tooling, fixturing and workpiece, all displayed on the CNC control screen. Manual stepping of the part program identifies any interference situations, halts machine movements and allows safe corrections to be made in advance of production.
While conversations with the i-150 are not possible, the machining center does include Mazak Voice Advisor. It allows the CNC to "talk" with the operator during setup, verbalizing machine settings and issuing safety advice for secure operations.
The i-150 also helps operators monitor maintenance issues with an Intelligent Maintenance Support function. This feature monitors the status of perishable items and logs the history of major machine subassemblies.
Pop-up windows alert the operator to specific required maintenance and allow shop management to develop a tailored preventive maintenance program.
Is the increased presence of machine intelligence simply a sign of a machine tool maker attempting to compensate for the continuing, chronic shortage of skilled, trained machine operators? Not in Mazak's thinking, insists Papke. He points out that machine intelligence, as it increases the performance and capability of a machine tool, actually makes it imperative to have a better-trained and skilled operator.
"Consider how intelligence and MTConnect make each machine a more significant contributor to the overall output of a manufacturing operation," Papke says. "You're doing more with fewer, more capable machines and therefore operator training becomes a more critical performance issue."
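The article describes MTConnect as an XML-based standard through which machines report data in a format that any other device can read, with kiosks pulling status reports from machines on the show floor. A minimal Python sketch of that idea follows. The XML layout and every name in it (`MachineStatus`, `Device`, `Item`, `read_status`, the device names and booths) are invented for illustration and are not the actual MTConnect schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified status document. The element and attribute names
# are invented for illustration and are NOT the real MTConnect schema.
SAMPLE_XML = """
<MachineStatus>
  <Device name="mill-1" booth="booth-1">
    <Item type="spindle_speed" units="rpm">4200</Item>
    <Item type="execution">ACTIVE</Item>
  </Device>
  <Device name="lathe-2" booth="booth-2">
    <Item type="spindle_speed" units="rpm">0</Item>
    <Item type="execution">STOPPED</Item>
  </Device>
</MachineStatus>
"""


def read_status(xml_text):
    """Parse the status document into {device name: {item type: value}}."""
    root = ET.fromstring(xml_text.strip())
    report = {}
    for device in root.findall("Device"):
        items = {
            item.get("type"): item.text.strip()
            for item in device.findall("Item")
        }
        report[device.get("name")] = items
    return report


status = read_status(SAMPLE_XML)
for name, items in status.items():
    print(name, items)
```

Because every machine emits the same structure, the reader needs no vendor-specific code, which is the kind of interoperability the article attributes to a shared standard.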
Applied Machine Learning
Machine learning is an incredibly powerful tool that has the potential to revolutionize countless industries and improve the way we live and work. However, despite its promise, there are a number of challenges and issues that must be addressed in order to fully realize the potential of machine learning. In this response, we will explore some of the key problems facing applied machine learning, and consider potential solutions and ways forward.
One of the most pressing issues in applied machine learning is the lack of transparency and interpretability in many machine learning models. This lack of transparency can make it difficult for users to understand how a model is making its predictions, which is a significant barrier to trust and adoption. Additionally, without transparency, it can be difficult to identify and rectify biases or errors in the model. This is particularly problematic in high-stakes applications such as healthcare or criminal justice, where the consequences of a flawed model can be significant. Addressing this issue will require the development of new techniques for explaining and interpreting machine learning models, as well as a greater emphasis on transparency and accountability in the development and deployment of these models.
Another significant challenge in applied machine learning is the issue of bias and fairness. Machine learning models are often trained on historical data, which can reflect and perpetuate existing biases and inequalities. This can result in models that systematically disadvantage certain groups of people, perpetuating and even exacerbating existing social injustices. Addressing this issue will require a concerted effort to develop and implement techniques for detecting and mitigating bias in machine learning models, as well as a commitment to using machine learning in ways that promote fairness and equity.
In addition to issues of transparency and fairness, there are also practical challenges that must be addressed in order to fully realize the potential of applied machine learning. One such challenge is the need for large, high-quality datasets for training machine learning models. Building and maintaining these datasets can be a significant undertaking, particularly in domains where data is scarce or difficult to collect. Additionally, there are often privacy concerns associated with the collection and use of large datasets, which must be carefully navigated. Addressing this challenge will require collaboration between researchers, industry, and policymakers to develop strategies for responsibly and ethically collecting and sharing data for machine learning.
Furthermore, the deployment and integration of machine learning models into real-world systems can be a significant challenge. In many cases, machine learning models must be integrated with existing systems and processes, which can be complex and time-consuming. Additionally, there are often concerns about the robustness and reliability of machine learning models in real-world settings, particularly in safety-critical applications such as autonomous vehicles or medical devices. Addressing these challenges will require a concerted effort to develop best practices and standards for the deployment of machine learning models, as well as ongoing research into techniques for improving the robustness and reliability of these models.
Finally, there is the challenge of ensuring that the benefits of machine learning are equitably distributed. As machine learning becomes increasingly prevalent in various industries, there is a risk that the benefits will accrue primarily to those who are already privileged, exacerbating existing inequalities.
Addressing this challenge will require a commitment to using machine learning in ways that promote equity and inclusion, as well as efforts to ensure that the benefits of machine learning are accessible to all. In conclusion, while applied machine learning holds immense promise, there are a number of challenges and issues that must be addressed in order to fully realize this potential. From issues of transparency and fairness to practical challenges of data collection and deployment, there are many complex problems that must be navigated. However, with concerted effort and collaboration, it is possible to overcome these challengesand harness the power of machine learning for the benefit of all.。
Computers: English Essay
Computers have become an integral part of our daily lives, transforming the way we work, learn, and communicate. They are powerful tools that have revolutionized various industries and have made our lives more efficient and convenient.

The Evolution of Computers

The journey of computers began with the invention of the abacus, an ancient calculating device. Over time, the development of computers has seen significant milestones, from mechanical calculators to the electronic computers of today. The invention of the transistor in the mid-20th century paved the way for smaller, faster, and more efficient computers.

Functionality and Uses

Modern computers are versatile machines capable of performing a multitude of tasks. They are used for:
1. Business and Productivity: Computers are essential in the corporate world for tasks such as data management, financial analysis, and project management.
2. Education: They are used for research, online learning, and digital collaboration among students and educators.
3. Entertainment: Computers provide a platform for gaming, streaming media, and social networking.
4. Communication: Email, social media, and video conferencing are facilitated by computers, connecting people across the globe.

Impact on Society

The widespread use of computers has had profound effects on society:
1. Access to Information: The internet, accessed through computers, has made information more accessible than ever before.
2. Globalization: Computers have played a crucial role in connecting the world, facilitating international trade and cultural exchange.
3. Job Market: The demand for computer skills has increased, creating new job opportunities in technology-related fields.

Challenges and Considerations

Despite their benefits, computers also present challenges:
1. Digital Divide: Not everyone has equal access to computers and the internet, leading to disparities in education and economic opportunities.
2. Cybersecurity: With the increased reliance on computers, the risk of cyber threats and data breaches has grown.
3. Health Concerns: Prolonged computer use can lead to physical and mental health issues if not managed properly.

The Future of Computers

As technology advances, computers are expected to become even more powerful and integrated into our lives. Developments in artificial intelligence and machine learning are set to make computers smarter and more capable of assisting in complex tasks.

In conclusion, computers are not just machines; they are an extension of our capabilities, offering endless possibilities for innovation and progress. As we continue to embrace this technology, it is important to address the challenges it presents and ensure that its benefits are accessible to all.
Machine Vision Advances and Applications
Machine vision, also known as computer vision, is a field of technology that enables machines to visually perceive and interpret the world around them. This technology has made significant advances in recent years, with applications in various industries such as manufacturing, healthcare, agriculture, and autonomous vehicles. Machine vision systems can analyze and process visual data to make decisions, identify objects, and detect anomalies with a high level of accuracy and speed. These advancements have the potential to revolutionize many aspects of our lives, but they also raise important ethical and societal considerations.

One of the most significant advances in machine vision is the development of deep learning algorithms and convolutional neural networks (CNNs). These algorithms have greatly improved the ability of machines to recognize and classify objects in images and videos. As a result, machine vision systems can now perform tasks such as facial recognition, object detection, and image segmentation with a level of precision that was previously unattainable. This has opened up new possibilities for automation and efficiency in various industries, leading to increased productivity and cost savings.

In the manufacturing industry, machine vision has been widely adopted for quality control and inspection processes. By using cameras and image processing algorithms, manufacturers can detect defects in products with a high degree of accuracy, leading to improved product quality and reduced waste. Machine vision systems can also be used to guide robotic arms in assembly processes, further increasing efficiency and precision. In the healthcare sector, machine vision has been utilized for medical imaging analysis, disease diagnosis, and surgical assistance, leading to improved patient outcomes and more personalized treatment plans.

Another area where machine vision has shown great promise is in agriculture, where it can be used for crop monitoring, yield prediction, and automated harvesting. By analyzing drone or satellite imagery, machine vision systems can provide farmers with valuable insights into the health and growth of their crops, allowing for more targeted and efficient use of resources. In the field of autonomous vehicles, machine vision plays a critical role in enabling cars to perceive and navigate the world around them, ensuring the safety of passengers and pedestrians.

Despite the numerous benefits of machine vision, there are also concerns regarding its ethical implications and potential societal impact. One of the main ethical considerations is the issue of privacy, particularly in the context of facial recognition technology. The widespread use of facial recognition in public spaces and surveillance systems has raised concerns about the infringement of individual privacy and civil liberties. There are also concerns about the potential for bias and discrimination in machine vision algorithms, particularly in areas such as law enforcement and hiring processes.

Furthermore, the increasing automation of jobs through the implementation of machine vision systems has raised concerns about the displacement of human workers and the widening of economic inequality. While machine vision has the potential to increase productivity and efficiency, it may also lead to job losses in certain industries, particularly those that rely heavily on manual labor. This could exacerbate existing social and economic disparities, leading to greater inequality and social unrest.

In conclusion, machine vision has made significant advances in recent years, with applications in various industries that have the potential to revolutionize many aspects of our lives. From manufacturing and healthcare to agriculture and autonomous vehicles, machine vision systems have the ability to improve efficiency, accuracy, and safety. However, it is important to consider the ethical and societal implications of these advancements, particularly in terms of privacy, bias, and the impact on the workforce. As machine vision continues to evolve, it is crucial to address these concerns and ensure that its benefits are realized in a responsible and equitable manner.
English Essay: Machinery Stoppages and Malfunctions
When machinery comes to a halt due to a malfunction, it can be a significant disruption to the workflow and productivity of a business or operation. Here's a detailed look at how to address such an issue in an English composition:

Title: Addressing Machinery Breakdowns

In the fast-paced industrial world, the unexpected breakdown of machinery can be a daunting experience. It not only hampers the production process but also affects the morale of the workforce. This essay delves into the causes of machinery malfunctions, the impact on operations, and the steps that can be taken to mitigate such occurrences.

Introduction

Machinery is the backbone of industrial production. When it fails to operate, it can lead to significant losses, both in terms of time and revenue. The sudden cessation of a machine's function is often referred to as a stoppage or malfunction.

Causes of Machinery Malfunctions

1. Wear and Tear: Over time, mechanical parts can wear out due to continuous use, leading to inefficiencies and eventual failure.
2. Lack of Maintenance: Regular maintenance is crucial to keep machinery in optimal condition. Neglecting this can result in unanticipated breakdowns.
3. Human Error: Incorrect operation or oversight by operators can cause machinery to malfunction.
4. Design Flaws: Sometimes, the root cause of a breakdown can be traced back to inherent design issues in the machinery.
5. External Factors: Environmental conditions such as extreme temperatures, humidity, or dust can also affect machinery performance.

Impact on Operations

Production Delays: A halted machine means a paused production line, leading to missed deadlines and potential penalties for delayed delivery.
Financial Losses: The cost of downtime can be substantial, including lost revenue and the expenses associated with repair or replacement of parts.
Employee Morale: Frequent breakdowns can lead to frustration and decreased motivation among workers.

Strategies for Mitigation

1. Preventive Maintenance: Scheduling regular checks and servicing can identify potential issues before they cause a breakdown.
2. Training: Ensuring that operators are well-trained in the correct use of machinery can reduce the risk of human error.
3. Quality Control: Investing in high-quality machinery and parts can decrease the likelihood of wear and tear.
4. Emergency Plans: Having a contingency plan in place for when machinery fails can minimize the impact of the downtime.
5. Technological Upgrades: Implementing advanced monitoring systems can provide real-time data on machinery health, allowing for proactive measures.

Conclusion

While machinery breakdowns are an unfortunate reality in the industrial sector, they can be managed effectively with the right strategies in place. By understanding the causes, impacts, and mitigation strategies, businesses can minimize the occurrence of such stoppages and maintain a smooth and efficient production process.

This composition provides a comprehensive overview of machinery malfunctions, offering insights into why they happen and how they can be addressed to ensure minimal disruption to industrial operations.
Translation of a Mechanical Engineering Paper (English Version) (doc, 7 pages)
Welding Applications and New Technologies in Automobile Manufacturing, and the Overall Development Trend

Welding is an essential process in modern mechanical manufacturing and is widely used in automobile manufacturing. The six major assemblies of an automobile (engine, gearbox, axle, frame, body, and compartment) all depend on welding. In the manufacture of automotive components, many welding methods are used: spot welding, projection welding, seam welding, roll welding, covered-electrode arc welding, CO2 gas shielded arc welding, argon arc welding, gas welding, brazing, friction welding, electron beam welding, and laser welding. Because spot welding, gas shielded arc welding, and brazing suit very large production volumes and offer a high degree of automation, high speed, low power consumption, small welding deformation, and ease of use, they are particularly suitable for auto body sheet metal cover parts and are therefore the most widely used methods in automobile production. Spot welding accounts for about 75% of investment costs; other welding methods account for only 25%.

With the development of the automobile industry, auto-body welding production lines are gradually moving toward full automation. To catch up with international standards, manufacturers must improve the quality of their automobiles while raising productivity.

As is well known, a prerequisite for automation is that parts and components be manufactured to high precision. Welding distortion should be as small as possible and welded areas should have a clean appearance, so the demands placed on welding technology keep rising.
Facing our country's accession to WTO opportunities and challenges, welding area of the popularization and application of new technologies for the automotive industry to enhance the brand has an extremely important role.First, the auto industry used welding methods and the application of parts andcomponents Automotive applications are welding industry noodle one of the most widely used method of welding a wide range of applications as follows:1. Resistance Welding(1) Spot welding used mainly for body assembly, flooring, doors, side Wai Wai after, the former bridge and small parts.(2) Multi-spot welding for body baseboard containing boxcar doors, engine cover and trunk cover.(3) Projection welding and roll welding for body parts, shock absorber stem, brake shoe, screws, nuts and small, such as stents.(4) Seam welding the body cap for rain canopies, shock absorber head, fuel tank, muffler and oil, such as disk.(5) For butt welding steel, was discharged into the air stem, cutting tools, etc..2. Electric arc welding(1) CO2 arc welding for the car, rear axle, frame, shock absorber stem, beam, shell and tube rear axle, drive shaft, hydraulic cylinders and jacks, such as welding.(2) Welding for the oil plate, aluminum alloy parts for welding and repair welding.(3) Electric arc welding electrode for thick parts such as stents, spare tire racks, trailers and so on.(4) Submerged for half-bridge casing, flange, and natural gas vehicles, such as pressure vessels.3. Special welding(1) Friction welding for automotive stem, rear axle, axle, steering rod and the attendant tools.(2) For electron beam welding gear, after the bridges.(3) Laser Welding for Body backplanes, gear, materials and spare parts under the edge and so on.4. Oxyacetylene weldingRepair welding for body assembly.5. 
BrazingFor radiators, copper and steel parts, hard alloy welding.Two, the automobile industry welding application of new technologies Nowadays, the automobile industry a lot of advanced welding technology, only to set out the relevant body welding and welding of new technologies.1. Resistance and control of energy-saving technology(1) Conjoined hanging spot welding in the automotive industry the most widely used are hanging spot welding machine, a workshop is often dozens or hundreds of units, mostly in its capacity above 100kV A, automotive sheet metal at the welding has been widely applications.(2) Of resistance welding machine current resistance welding machine communicate substantial use of the single-phase AC 50Hz power supply, large capacity, low power factor. The development of three-phase low-frequency resistance welding machine, three-phase secondary rectification welding contacts (in ordinary type spot welder, seam welder, welding machine application convex) and the IGBT inverter resistance welding machine, you can solve the imbalance and improve power grid power factor (up to 0 9 above) the problem. At the same time can be further save power, the realization of microcomputer control parameters can be better applied to welding of aluminum alloy, stainless steel and other difficult to weld metal welding. It could also further reduce the equipment weight.(3) The control of resistance against the Southwest Jiaotong University 1 Rim Factory Alloy Rim Butt Welding development of PLC (programmable controller) Intelligent controller, on the original machine had a transformation, to solve the aluminum alloy car lap welding quality problems, improve welding productivity. Factory has been developed after the same seam welding PLC controller to solve the general clean-up requirements parts of the seam welding problem. 
These two controllers through the development, proved more than single-chip microprocessor controller PLC anti-interference ability, high reliability; IPC controller than small size, low cost, use of a common single-phase AC power frequency resistance welding machine achieve a high degree of difficulty of the butt welding and seam welding job.2. Gas shielded welding technology(1) The surface tension of the transition waveform control method, the key is to use two current pulse achieve a droplet transition, the first current pulse to form a droplet and up until a short circuit droplet and the workpiece; No. 2 current pulse is a short-term narrow pulse and continuous detection of its di / dt, at the same time to control the value of current pulse to generate the appropriate electromagnetic contractile force, so that droplet contraction thin neck, and finally by the pool surface tension pull off, finish a droplet transfer without spatter.(2) Inverter power inverter power source waveform to control the use of dynamic characteristics of a good and flexible controllability, the use of waveform control, in the short-circuit current increased by inhibiting the early stages to reduce the electromagnetic force when the little bridge in the newly formed droplet transfer obstacles and blasting off to reduce the large particles flying, and in favor of droplet spread in the pool; when droplet spread out in the pool after making a rapid increase in current in order to speed up the formation of necking later slow rise to a lower peak so that when the little bridge blasting off to reduce spatter.industry on the application of metal plates welded problem, the relevant experts have pointed out that welding galvanized steel sheet most promising method is laser welding. However, because of the degree of automation of the process is now not enough, we must improve the automobile industry widely used method of contact spot welding. 
The main disadvantage of this method are Weldment at electrode contacts with substantial transition of zinc electrode burn faster. It is suggested that the use of Al2O3 particle dispersion strengthened Cu-Zn and Cu-Cr alloy electrode can be at a minimum under the electric resistance welding, in order to guarantee access to the smallest size welded joints.2. High-strength steel plateIn order to realize lightweight cars, improve vehicle safety performance, high strength steel in the automotive application is increasing year by year, the current emergence of a new generation of high strength steel materials - ultra-fine grained steel. The steel major means to further enhance economic indicators, based on iron and steel materials, strength, toughness than the existing steel doubled. A new generation of ultra-fine grained steel at the organizational structure with ultra-fine grain, high purity and high uniformity of properties.Currently used to study a new generation of ultra-fine grained steel major has 400MPa and 800MPa grade two level.At a new generation of research on iron and steel, our country and the international level and there is no gap, almost simultaneously started, with Japan and South Korea between the two countries living side by side the world's leading position. However, due to appear shorter, so the unit participated in the study is not much, and is based on Iron & Steel Research Institute and the Department of Mechanical Engineering, Tsinghua University-based. In the ultra-fine grained steel welding technology for the study must have achieved a number of advanced results.3. Aluminum AlloyAluminum alloy with light weight, high strength and corrosion resistance, etc., are good building materials, automobile industry is gradually Materials in the use of aluminum alloy parts. 
Aluminum alloy welding has five main features: (1) Aluminum alloy surface has a layer of dense oxide film (melting point of about 2050 ℃), welding if the latter fails to clear, it will affect the quality of basic metals melting, forming a mixture of such quality issues.(2) Thermal conductivity of large (approximately 4 times that of steel), electrical conductivity, and welding and steel to achieve the same welding speed, the welding heat input than when welding steel large 2 ~ 4 times.(3) Coefficient of linear expansion, and there is Weldment have a greater thermal stress, deformation and the tendency to crack.(4) Stomatal easily.(5) Aluminum alloy to reduce the strength of welded joints.Aluminum alloy welding of these characteristics, it is in the development ofour welding equipment and welding technology should be taken seriously the problem, the only way to develop a suitable aluminum alloy welding equipment, materials and welding technology.Four, the automobile industry's general trend of development of welding1. The development of flexible automated production systemsTaking an overall view of the welding automotive industry status quo, it is not difficult to analyze the development trend of the automotive industry for welding: the development of flexible automated production systems. And industrial robots, automated production because of set production features and flexibility all in one, so large-scale car production in recent years, the rapid use of the robot. In welding, the main use is to spot welding robot and welding robot.Welding want highly automated production line, widely used in the 6-DOF robot and welding robot with forceps repository can be made under different welding parts welding product requirements or changes from the repository automatically grasping forceps for welding required. Transmission devices have been developed into a more flexible use of unmanned sensors and car-oriented. 2. 
The development of lightweight modular intelligent automatic welding machineThe level of the domestic automobile welding gap compared with foreign countries. In recent years, the domestic automobile factory attaches great importance to the automation of welding. Such as the introduction of the Jetta FAW body welding workshop 13 production line automation rate of more than 80%. Each line by computer (programmable logic controller PLC-3) control, auto-complete the transmission of the workpiece and welding. By R30-type welding robot in polar coordinates and G60 toggle-type robot 61, the robot driven by a microcomputer control, numbers and characters revealed that the tape recorder input and output procedures. Robot moves step-by-step sequence using point-to-point trajectory, with a high level of automation of welding, both to improve the working conditions, improve product quality and productivity and reducing material consumption.Similar high level of production line in Shanghai, Wuhan and other places have a joint venture and introducing, including Germany, the United States, France and Japan's advanced automotive manufacturing technology. But these are after all still far from our country can not adapt to the rapid development of the national auto industry needs, we must insist on technological innovation, and vigorously accelerate the development of welding energy efficient new materials and new technologies and new equipment, development of applications for robot technology, the development of portable intelligent devices smarter establish a highly efficient economy welding automation systems, we must use computers and information technology to transform traditional industries, improve grades. Believe that in the near future, through our joint efforts, the domestic auto industry's welding technology will be gradually shortened with advanced welding technology levels, the auto industry to meet WTO brings opportunities and challenges.。
Representational Issues in Machine Learning of User Profiles

+*Eric Bloedorn, +Inderjeet Mani, and +T. Richard MacMillan

+Artificial Intelligence Technical Center
The MITRE Corporation, Z401
7525 Colshire Drive, McLean, VA 22102
{bloedorn,imani,macmilla}@

*Machine Learning and Inference Laboratory
George Mason University, Fairfax, VA 22030

Abstract

As more information becomes available electronically, tools for finding information of interest to users become increasingly important. Building tools for assisting users in finding relevant information is often complicated by the difficulty in articulating user interest in a form that can be used for searching. The goal of the research described here is to build a system for generating comprehensible user profiles that accurately capture user interest with minimum user interaction. Machine learning methods offer a promising approach to solving this problem. The research described here focuses on the importance of a suitable generalization hierarchy and representation for learning profiles which are predictively accurate and comprehensible. In our experiments using AQ15c and C4.5 we evaluated both traditional features based on weighted term vectors as well as subject features corresponding to categories which could be drawn from a thesaurus. Our experiments, conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web (the IDD News Browser), demonstrate the importance of a generalization hierarchy in obtaining high predictive accuracy, precision and recall, and stability of learning.

Introduction

As more information becomes available on the Internet, the need for effective personalized information filters becomes critical. In particular, there is a need for tools to capture profiles of users’ information needs, and to find articles relevant to these needs, as these needs change over time.
Information filtering, as (Belkin and Croft 92) and (Foltz and Dumais 92) point out, is an information access activity similar to information retrieval, but where the profiles represent evolving interests of users over a long-term period, and where the filters are applied to dynamic streams of incoming data. The research described here automates the task of building and adapting accurate and comprehensible individualized user profiles and focuses on the importance of a suitable generalization hierarchy and representation for learning.

Our research builds on two particular traditions involving the application of machine learning to information access: empirical research on relevance feedback within the information retrieval community, and interdisciplinary work involving the construction of personalized news filtering agents. We will now introduce these briefly, to better motivate and distinguish our work.

Relevance feedback approaches are a form of supervised learning where a user indicates which retrieved documents are relevant or irrelevant. These approaches, e.g., (Rocchio 1971), (Robertson & Sparck-Jones 1976), (Belew 1989), (Salton & Buckley 1990), (Harman 1992), (Haines & Croft 1993), (Buckley, Salton, & Allan 1994), have investigated techniques for automatic query reformulation based on user feedback, such as term reweighting and query expansion. While this body of work is not necessarily focused exclusively on the information filtering problem, it demonstrates effectively how learning can be used to improve queries.

Work on the application of machine learning techniques for constructing personalized information filters has gained momentum in recent years. Some early MIT Media Lab work used a genetic algorithm approach to generate new profiles, which were evaluated based on user feedback (Sheth & Maes 1993), (Sheth 1993). One of the goals of that approach was “exploratory behavior....
so as to explore newer domains that might be of interest to the user.” (Sheth & Maes 1993). Since that time, a number of other systems for personalized information filtering have appeared on the scene, such as NewT (Maes 1994), Webhound (Lashkari, Metral, & Maes 1994), WebWatcher (Armstrong et al. 1995), WebLearner (Pazzani et al. 1995) and NewsWeeder (Lang 1995).

One of the motivations for our approach was the discovery that the above research had paid little attention to learning generalizations about users’ interests. For example, if a user likes articles on scuba, whitewater rafting, and kayaking, a system with the ability to generalize could infer that the user is interested in water sports, and could communicate this inference to the user. Not only would this be a natural suggestion to the user, but it might also be useful in quickly capturing their real interest and suggesting what additional information might be of interest. Such an approach could exploit a concept hierarchy or network to perform the generalizations. While thesauri and other conceptual representations have been the subject of extensive investigation in both query formulation and expansion (e.g., see (Jones et al. 1995) for detailed references), they have not been used to learn generalized profiles.

In order to investigate this further, we decided to use features which would allow us to exploit categories for generalization, where the categories could be drawn from a thesaurus. One well-known problem which arises here is that of word-sense disambiguation, in this case deciding which of several thesaurus categories are the most likely ones for a term. We decided to apply the approach used by (Liddy & Paik 1992), (Liddy & Myaeng 1992), which exploits evidence from local context and large-scale statistics.
This led us to use the Subject Field Coder (SFC) (Liddy and Myaeng 1992), (Liddy and Paik 1992) (from TextWise, Inc.), which produces a vector representation of a text's subject categories, based on a thesaurus of 124 subject categories (the SFC is discussed in more detail in the next section). We therefore decided to use a vector of subject categories in our document representation, with the SFC thesaurus being used for generalization. To compare the influence of these features on learning with that of more traditional features based on weighted term vectors, we developed a hybrid representation which combined the two types of features.

A personalized news filtering agent which engages in exploratory behavior must gain the confidence of the user. In many practical situations, a human may need to validate or edit the system’s learnt profiles; as (Mitchell et al. 1994) point out, intelligibility of profiles to humans is important in such situations. We speculated that the use of such a hybrid representation, which exploits summary-level features such as subject categories, would increase the intelligibility of profiles. To further strengthen profile intelligibility, we also decided to include other summary-level features in our document representation, involving terms relating to people, organizations, and places (along with their respective attributes). These features were provided by a name tagger (discussed in the next section). That such features could help profile learning was suggested in part by some recent query reformulation research (Broglio & Croft 1993), which had shown improved retrieval performance on TIPSTER queries using such features.

In summary, our experiments evaluated the effects of different subsets of features on the learning of intelligible profiles. Our experiments were conducted in the context of a content-based profiling system for on-line newspapers on the World Wide Web, the IDD News Browser (Mani et al. 1995).
In this system, which is in use at MITRE, the user can set up and edit profiles, which are periodically run against various collections built from live Internet newspaper and USENET feeds, to generate matches in the form of personalized newspapers. These personalized newspapers provide multiple views of the information space in terms of summary-level features. When reading their personalized newspapers, users provide positive or negative feedback to the system, which is then used by a learner to induce new profiles. These system-generated profiles can be used to make recommendations to the user about new articles and collections. The experiments reported here investigate the effect of different representations on learning new profiles.

Text Representation

As mentioned earlier, we used a hybrid representation with three different sources of features. We now describe these in turn.

The Subject Field Coder (SFC) (Liddy & Myaeng 1992), (Liddy & Paik 1992) (from TextWise, Inc.) produces a summary-level semantic representation of a text's contents, based on a thesaurus of 124 subject categories. Text summaries are represented by vectors in 124-dimensional space, with each vector's projection along a given dimension corresponding to the salience in the text of that subject category. The overall vector is built up from sentence-level vectors, which are constructed by combining the evidence from local context (e.g., unambiguous words) with evidence from large-scale statistics (e.g., pairwise correlations of subject categories). An earlier version of the SFC, which used subject codes from Longman’s Dictionary of Contemporary English (LDOCE), was tested on 166 sentences from the Wall Street Journal (1638 words).
It gave the right category on 87% of the words (Liddy & Myaeng 1992).

The second extraction system we used was the IDD POL Tagger (Mani et al. 1993) (Mani & MacMillan 1995), which classifies names in unrestricted newswire text in terms of a hierarchy of different types of people (military officers, corporate officers, etc.), organizations (drug companies, government organizations, etc.), and places (cities, countries, etc.), along with their attributes (e.g., a person’s title, an organization’s business, a city’s country, etc.). The tagger combines evidence from multiple knowledge sources, each of which uses patterns based on lexical items, parts of speech, etc., to contribute evidence towards a particular classification. In trials against hand-tagged documents, the tagger showed an average precision-recall accuracy (the average of precision and recall at a particular cutoff) of approximately 85%, where precision is the ratio of the number of correct program tags to the number of program tags, and recall is the ratio of the number of correct program tags to the number of hand tags.

The statistical features we used were generated by a term-frequency inverse-document-frequency (tf.idf) calculation (Salton & McGill 1983) (Sparck-Jones 1972), which is a well-established technique in information retrieval. The weight of term k in document i is represented as:

$$dw_{ik} = tf_{ik} \cdot (\log_2 n - \log_2 df_k + 1)$$

where
$tf_{ik}$ = frequency of term k in document i
$df_k$ = number of documents in which term k occurs
$n$ = total number of documents in the collection

Given these three sources of features, we developed a hybrid document representation (Figure 1), described as follows. Features describe subjects (x1..x5), people (x6..x59), organizations (x60..x104) and locations (x105..x140) present in each news article. The top n statistical keywords are also included in the vector describing the article (x141..x141+n), where n was varied from 5 to 200.
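As a rough illustration of the tf.idf weighting and the top-n keyword features described above, the following Python sketch computes the weights for a small tokenized collection. The function names, the toy input format, and the rule for ranking terms (summed weight over all articles) are assumptions made for illustration; the text does not specify how the top n terms were ranked.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists (tokenization is assumed to be done elsewhere).
    """
    n = len(docs)                      # total number of documents
    df = Counter()                     # df[k] = number of documents containing term k
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)           # tf[k] = frequency of term k in this document
        weights.append({
            term: freq * (math.log2(n) - math.log2(df[term]) + 1)
            for term, freq in tf.items()
        })
    return weights


def top_n_terms(weights, n_terms):
    """Pick the n_terms terms with the largest summed weight over all articles.

    Ranking by summed weight is an assumption, not the documented procedure.
    """
    totals = Counter()
    for doc_weights in weights:
        totals.update(doc_weights)
    return [term for term, _ in totals.most_common(n_terms)]


def keyword_features(doc_weights, terms):
    """Position k holds the tf.idf weight of term t_k in the article (0 if absent)."""
    return [doc_weights.get(term, 0.0) for term in terms]
```

With this layout, the keyword part of an article's vector is simply `keyword_features(weights[i], top_n_terms(weights, n))`, appended after the subject and POL features.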
For convenience, x6..x140 are referred to as POL features.

Generalization Hierarchy

The hierarchy came to us from TextWise Inc.’s thesaurus. The SFC subject vectors describing individual articles use terms from the lowest level (terminals) of the hierarchy, which initially consisted of 124 categories. Although this thesaurus covers a fairly wide set of subjects, as required in our newswire application, it only has three levels, and as such does not have a great deal of depth. We extended the set of terminal categories under medicine to include another 16 lowest level categories. In Figure 2, we show a fragment of the extended hierarchy under sci+tech (scientific and technical).

Features        Description
x1..x5          Top 5 subject categories as computed by the SFC text classifier.
x6..x59         POL people tags as computed by the IDD POL tagger. For each person identified, the vector contains the following string features: (name, gender, honorific, title, occupation, age). 9 people (each with these subfields) are identified for each article.
x60..x104       POL organization tags as computed by the IDD POL tagger. For each organization identified, the vector contains the following string features: (name, type, acronym, country, business). 9 organizations (each with these subfields) are identified for each article.
x105..x140      POL location tags as computed by the IDD POL tagger. For each location identified, the vector contains the following string features: (name, type, country, state). 9 locations (each with these subfields) are identified for each article.
x141..x141+n    The top n ranked tf.idf terms t1...tn are selected over all articles. For each article, position k in t1...tn has the tf.idf weight of term tk in that article.

Figure 1. A description of the features used to represent text

Figure 2. A fragment of the generalization hierarchy used in the experiments

Learning Method

Our representational decisions suggested some constraints on the learning method.
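To make the generalization hierarchy concrete before turning to the learners, here is a minimal Python sketch of how a terminal subject category can be matched against a more general node. The tree fragment below is hypothetical: the node names are borrowed from categories mentioned in the paper, but the parent links and the helper names `ancestors` and `covers` are assumptions for illustration, not the actual TextWise thesaurus.

```python
# Hypothetical hierarchy fragment, stored as child -> parent links.
# The shape of this tree is an assumption made for illustration only.
PARENT = {
    "contagious": "medical.sci",
    "genetic": "medical.sci",
    "abortion": "medical.policy",
    "medical.sci": "medicine",
    "medical.policy": "medicine",
    "medicine": "sci+tech",
}


def ancestors(category):
    """Return the chain of nodes from the category's parent up to the root."""
    chain = []
    while category in PARENT:
        category = PARENT[category]
        chain.append(category)
    return chain


def covers(rule_value, category):
    """True if a rule condition on rule_value matches an article's terminal category.

    A condition matches when it names the category itself or any of its ancestors.
    """
    return rule_value == category or rule_value in ancestors(category)
```

Under these assumed links, a condition on `medical.sci` matches articles tagged `contagious` or `genetic`, which is the kind of generalization a hierarchy-aware learner can exploit.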
We wanted to use learning methods which performed inductive generalization, where the SFC generalization hierarchy could be exploited. Also, we required a learning algorithm whose learnt rules could be made easily intelligible to users. We decided to try both AQ15c (Wnek, Bloedorn, & Michalski 1994) and C4.5-Rules (Quinlan, 1992) because they meet these requirements (the generalization hierarchy is made available to C4.5 by extending the attribute set), are well-known in the field and are readily available.

AQ15c is based on the Aq algorithm for generating disjunctive normal form (DNF) expressions with internal disjunction from examples. In the Aq algorithm, rule covers are generated by iteratively generating stars from randomly selected seeds. A star is a set of most general alternative rules that cover the seed example but do not cover any negative examples. A single 'best' rule is selected from this star based on the user's preference criterion (e.g. maximal coverage of new examples, minimal number of references, minimal cost, etc.). The positive examples covered by the best rule are removed from consideration and the process is repeated until all positive examples are covered.

C4.5-Rules, which is part of the C4.5 system of programs, generates rules based on decision trees learned by C4.5. In C4.5 a decision tree is built by repeatedly splitting the set of given examples into smaller sets based on the values of the selected attribute. An attribute is selected based on its ability to maximize an expected information gain ratio. In our experiments we found the pruned decision rules produced the most accurate predictions.

Another advantage of experimenting with these two learning methods is that we get to see if the representation we have developed for this problem is truly providing useful information, or if it is just well-matched to the bias of the selected learning algorithm.
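The gain ratio criterion used by C4.5 to select a splitting attribute can be sketched as follows. This is a simplified illustration for nominal attributes only, with hypothetical function names; it is not the C4.5 implementation, which also handles continuous attributes, missing values, and pruning.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the distribution of values in items."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on an attribute, divided by its split information.

    values[i] is the attribute value of example i, labels[i] is its class.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)       # entropy of the attribute's own value distribution
    # An attribute with a single value cannot split the data (assumed convention: ratio 0).
    return gain / split_info if split_info > 0 else 0.0
```

An attribute that separates relevant from irrelevant articles perfectly into two equal groups scores 1.0, while an attribute whose values are unrelated to the class scores 0.0.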
The learning preference in AQ15c is controlled by the preference criterion, which by default is to learn simple rules. The preference for C4.5 is to select attributes which maximize the information gain ratio. These differing biases can sometimes lead to different hypotheses. Because of its ability to learn rules with internal disjunction, AQ15c can easily learn rules which are conjunctions of many internal disjunctions. This type of concept is not easily represented in decision trees and is thus not likely to be found by C4.5-Rules. We thought this might give an advantage to AQ15c in this domain, but based on our experimental results described below, it appears our representation is well suited to either learning bias.

Experimental Design

The goal of these experiments was to evaluate the influence of different sets of features on profile learning. In particular, we wanted to test the hypothesis that semantic features used for generalization were useful in profile learning. Each of the experiments involved selecting a source of documents, vectorizing them, selecting a profile, partitioning the source documents into documents relevant to the profile (positive examples) and irrelevant to the profile (negative examples), and then running a training and testing procedure. The training involved induction of a new profile based on feedback from the pre-classified training examples. The induced profile was then tested against each of the test examples. One procedure used 10 runs, in each of which the examples were split into 70% training and 30% test (70/30-split). Another procedure used 10-fold cross-validation, where the test examples in each of the 10 runs were disjoint (10-fold-cross).

The metrics we used to measure learning on the USMED and T122 problems include predictive accuracy, precision, and recall. These metrics are defined as shown in Figure 3. Precision and recall are standard metrics in the IR community, and predictive accuracy is standard in the ML community.
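The definitions in Figure 3 translate directly into code. The sketch below is a minimal Python illustration with a hypothetical function name; returning 0.0 when a denominator is zero is an assumed convention, since the paper does not say how such cases were handled.

```python
def evaluate(predicted, actual):
    """Compute predictive accuracy, precision, and recall for one test run.

    predicted[i] and actual[i] are booleans: True means 'relevant to the profile'.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)              # positives classified correctly
    fp = sum(p and not a for p, a in pairs)          # negatives classified as positive
    fn = sum(not p and a for p, a in pairs)          # positives that were missed
    tn = sum(not p and not a for p, a in pairs)      # negatives classified correctly

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

Averaged precision and recall are then the means of these values over the 10 runs of a 70/30 split or the 10 folds of cross-validation.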
Predictive accuracy is a reasonable metric when the user's objective function assigns the same cost to false positives and false negatives. When the numbers of false positives, true positives, false negatives, and true negatives are about equal, predictive accuracy tends to agree with precision and recall, but when false negatives predominate there can be large disagreements.

Metric                        Definition
Predictive Accuracy           # examples classified correctly / total number of test examples
Precision                     # positive examples classified correctly / # examples classified positive, during testing
Recall                        # positive examples classified correctly / # known positive, during testing
Precision Learning Curve      Graph of average precision vs. % of examples used in training
Recall Learning Curve         Graph of average recall vs. % of examples used in training
Averaged Precision (Recall)   Average of Precision (Recall) over all test runs

Figure 3. Metrics used to measure learning performance

Our first experiment exploited the availability of users of the IDD News Browser. A user with a “real” information need was asked to set up an initial profile. The articles matching his profile were then presented in his personalized newspaper. The user then offered positive and negative feedback on these articles. The sets of positive and negative examples were then reviewed independently by the authors to check whether they agreed in terms of relevance judgments, but no corrections needed to be made. In order to ensure that a relevant generalization hierarchy would be available for the learner, we extended the broad-subject thesaurus of the SFC to include several nodes under medicine. This involved adding terms for medicine to the thesaurus. The details of the test are:

Source: Colorado Springs Gazette Telegraph (Oct. through Nov. 1994)
Profile: "Medicine in the US" (USMED)
Relevance Assessment: users, machine aided
Size of collection: 442
Positive Examples: 18
Negative Examples: 20
Validation: “70/30-split”

Our next experiment exploited the availability of a standard test collection, the TREC-92 collection. The same generalization hierarchy used in the previous experiment was used here too. The idea was to study the effect that these changes in the hierarchy would have on the learning of other topics. The details of the test are:

Source: Wall Street Journal (1987-92)
Profile: “RDT&E of New Cancer Fighting Drugs” (T122)
Relevance Assessment: provided by TREC
Size of collection: 203
Positive Examples: 73
Negative Examples: 130
Validation: “10-fold cross”

Experimental Results

In our first set of experiments we applied AQ15c and C4.5-Rules to the USMED and T122 datasets. Here AQ15c has the hierarchy available to it in the form of hierarchical domain definitions for attributes x1 through x5. C4.5 has the hierarchy available to it through an extended attribute set. In this extension, based on a pointer from Quinlan (Quinlan, 1995), we extended the attribute set to include attributes which describe nodes higher up on the generalization hierarchy. A total of eighteen additional attributes were added (six for each non-null subject attribute), which provided the values of the subject attributes at each of the six levels higher in the tree from the leaf node. Because the tree was unbalanced, some of the additional attributes took dummy values for some examples.

Predictive Accuracy

The predictive accuracy results (Table 1) show that the most predictively accurate profiles generated (boldface, outlined in thick lines) come from either the SFC or ALL feature sets, and the poorest profiles (italics, outlined in double lines) come from the POL or the TFIDF featureset. The TFIDF scores are shown for n=5; there was no appreciable difference for n=200.
All differences between the best and worst predictive accuracies are significant at the 90% level, as calculated using Student's t-test. From this we can infer that, for topics such as these, profile learning using summary-level features (POL or SFC) alone can sometimes be more accurate in terms of predictive accuracy than using term-level features (TF.IDF) alone. In particular, having a generalization hierarchy available and relevant (tuned to the topic) is useful, as witnessed by the superior performance of the SFC in the USMED problem. Also, as shown above, the use of a combination of all the features (ALL) was significantly better for the T122 problem. This was true of both C4.5-Rules and AQ15c, which performed best with the ALL featureset. Our general conclusion is that the hybrid representation can be useful in profile learning.

Learning    Learning   Predictive Accuracy           Average Precision / Average Recall
Method      Problem    TFIDF   POL    SFC    ALL     TFIDF       POL         SFC         ALL
AQ15c       USMED      0.58    0.48   0.78   0.55    0.51/1.00   0.45/0.45   0.78/0.73   0.52/0.34
AQ15c       T122       0.39    0.59   0.59   0.76    0.36/0.88   0.43/0.66   0.50/0.33   0.79/0.48
C4.5-Rules  USMED      0.39    0.74   0.79   0.76    0.07/0.30   0.89/0.60   0.97/0.60   0.90/0.60
C4.5-Rules  T122       0.64    0.65   0.68   0.76    0.0/0.0     0.64/0.22   0.58/0.55   0.70/0.67

Table 1. Predictive Accuracy, Average Precision, and Average Recall of learned profiles for a given feature set (averaged over 10 runs). (Best profiles generated are in boldface, outlined in thick lines. Worst profiles generated are in italics, outlined in double lines.)

Precision and Recall

The precision and recall results (Table 1) correspond fairly well with the predictive accuracy results. The best results (calculated as the sum of precision and recall) occur for the same feature sets as were found for predictive accuracy. The poorest profiles, however, were quite varied, with all of the featuresets except POL giving the worst result at some point.
The USMED SFC result shows in a rather dramatic way how the presence of a relevant generalization hierarchy was able to improve performance. To the extent that such comparisons are possible, it is worth noting that our scores for T122 can be compared with scores on T122 reported in the literature: (Schutze, Hull & Pedersen 1995, p. 235) report Non-Interpolated Average (NIA) Precision for T122 of 0.524 (using a non-linear neural net) and 0.493 (using a linear neural net). However, average precision is a different metric from NIA-Precision, and we did not compute the latter.

Learning Curves

An examination of the learning curves also revealed some interesting results. Normally one expects a learning curve to show a steady increase in performance as the percentage of training examples increases. However, except for the learning curve for the SFC dataset shown in Figure 4 [1], the learning curves for profiles learned by AQ15c in the USMED problem are very unstable [2]. The presence of a generalization hierarchy while learning results in profiles which are predictively accurate and more stable than profiles learned from other featuresets. This suggests that the generalization hierarchy is providing a deeper understanding of the needs of the user and is more robust to the particular set of training examples currently used. Stability of learned profile performance is extremely important in achieving user trust in the automatically generated profiles.

Intelligibility of learnt profiles

A system which discovers generalizations about a user’s interests can use these generalizations to suggest new articles. However, as mentioned earlier, in many practical situations, a human may need to validate or edit the system’s learnt profiles. Intelligibility to humans then becomes an important issue. The following profile induced by AQ illustrates the intelligibility property.
It shows a generalization (see Figure 2) from terminal vector categories contagious and genetic present in the training examples to medical.sci (i.e., medical science), and from the terminal category abortion up to medical.policy (medical policy).

IF subject1 = nature or physical science &
   subject2 = nature or medical science or medical policy or human body
THEN article is of interest

Figure 4. Precision (dotted line) and Recall (dark line) Learning Curve. (AQ15c using SFC features on the USMED dataset)

[1] Note that Figure 4 shows a graph of average precision and average recall versus the percentage of examples used in training. This is not to be confused with the typical precision/recall curves found in the information retrieval literature (e.g., (Harman 94, p. A5-A13)), which might, for example, measure precision and recall at different cutoffs.

[2] For reasons of space, the entire set of learning curves (Precision, Recall, and Predictive Accuracy learning curves for each of AQ15c and C4.5 on T122 and USMED, for each of POL, ALL, TF.IDF, and SFC) is not shown here.

Although intelligibility is hard to pin down, there are various coarse measures of rule intelligibility that one can use. For one thing, one might assume that users prefer more concise rules. We examined profile length, measured as the number of terms on the left hand side of a learnt rule. Here we observed that using ALL the features led to more complex profiles over time, whereas using only subsets of features other than POL leveled off quickly at profiles with well under 10 terms. The SFC profiles, which exploited generalization, were typically short and succinct.
The tf.idf profiles were also quite short, but given their low overall performance they would not be useful.

Effect of Generalization Hierarchy

In our next set of experiments we tried to isolate the effects of the generalization hierarchy on the C4.5 learning algorithm by evaluating the performance of profiles learned from C4.5 with the hierarchical information (in the form of the extended attribute set) against C4.5 without the hierarchy (with the original x1..x5 attributes, but without the additional 18 attributes).

We found that the extension improved the performance significantly (99% confidence) for the USMED dataset and SFC feature set: predictive accuracy improved from 0.46 to 0.79, while precision/recall improved from 0.47/0.23 to 0.97/0.60. However, it did little to improve the performance for the other problem sets. These results are detailed in Table 2. With these additional attributes, the best USMED results for both AQ and C4.5 were obtained with the SFC-generated attributes and with the background knowledge of a generalization hierarchy.

The best results for the T122 problem were obtained when all the generated features were available.
This reinforces (with evidence from two learning algorithms) our earlier conclusion that the hybrid representation is useful in profile learning, and that having a generalization hierarchy available and relevant (tuned to the topic) is useful.

Comparison with word-level Relevance Feedback Learning

Although our previous experiments had shown that machine learning methods learning from a hybrid document representation resulted in profiles which were predictively accurate and intelligible, they did not reveal whether the traditional relevance feedback approach might work just as well.

In order to compare our results with a traditional relevance feedback method, we applied a modified Rocchio algorithm to the two information retrieval tasks (USMED and T122) described earlier.

The modified Rocchio algorithm is a standard relevance feedback learning algorithm which searches for the best set of weights to associate with individual terms (e.g. tf-idf features or keywords) in a retrieval query. In these experiments individual articles are represented as vectors of 30,000 tf-idf features.

Our Rocchio method is based on the procedure described in (Buckley, Salton, & Allan 1994). As before, the training involved induction of a new profile based on feedback from the pre-classified training examples, as follows. To mimic the effect of a user’s initial selection of relevant documents matching her query, an initial profile was set to the average of all the vectors for the (ground-truth) relevant training documents for a topic. This average was converted from a tf.idf measure to a tf measure by dividing each tf.idf value by the idf. The profile was then reweighted using the modified Rocchio formula below.
This formula transforms the weight of a profile term k from p-old to p-new as follows (Buckley, Salton, & Allan 1994):

$$p\text{-new}_k = (\alpha \cdot p\text{-old}_k) + \left(\frac{\beta}{r} \sum_{i=1}^{r} dw_{ik}\right) - \left(\frac{\gamma}{s} \sum_{i=1}^{s} dw_{ik}\right)$$

where the first sum runs over the relevant documents, the second over the non-relevant documents, and
$r$ = number of relevant documents
$s$ = number of non-relevant documents (all non-relevant documents)
$dw_{ik}$ = tf weight of term k in document i
$\alpha = 8$, $\beta = 16$, $\gamma = 4$ (tuning parameters)

During testing, the test documents were compared against the new profile using the following cosine similarity metric for calculating the degree of match between a profile j (with the tf weights converted back to tf.idf weights) and a test document i (with tf.idf weights) (Salton & McGill 1983):

Learning   Generalization hierarchy   Predictive Accuracy   Average Precision / Average Recall
Problem    attributes present?        SFC     ALL           SFC          ALL
USMED      No                         0.46    0.76          0.47/0.23    0.89/0.67
USMED      Yes                        0.79    0.76          0.97/0.60    0.90/0.60
T122       No                         0.68    0.73          0.58/0.55    0.64/0.74
T122       Yes                        0.68    0.76          0.58/0.55    0.70/0.67

Table 2. The effect of generalization hierarchy attributes on predictive accuracy, precision and recall performance for C4.5-learned rules. Significant changes are boxed in thick lines, with the significant effect of generalization shown in boldface.

Learning            Predictive Accuracy   Average Precision / Average Recall
Method              USMED    T122         USMED        T122
Rocchio             0.49     0.51         0.52/0.53    0.39/0.27
Best AQ15c (SFC)    0.78     0.76         0.78/0.73    0.79/0.48
Best C4.5 (ALL)     0.76     0.73         0.90/0.60    0.64/0.74

Table 3. Comparing Predictive Accuracy, Average Precision / Average Recall for tf.idf terms.
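The Rocchio reweighting step described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the sparse-dictionary representation and the function names are assumptions, the conversions between tf and tf.idf weights are omitted, and the `cosine` function is the standard textbook cosine measure, which may differ in detail from the exact form used in the experiments.

```python
import math

# Tuning parameters as given in the text.
ALPHA, BETA, GAMMA = 8.0, 16.0, 4.0


def rocchio_update(profile, relevant, nonrelevant,
                   alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Reweight a profile from relevant and non-relevant training documents.

    profile and each document are sparse {term: tf weight} dicts (assumed layout).
    Negative weights are kept as computed; the text does not say whether they
    were clipped.
    """
    terms = set(profile)
    for doc in relevant + nonrelevant:
        terms.update(doc)
    new_profile = {}
    for k in terms:
        # Mean weight of term k over relevant (r) and non-relevant (s) documents.
        pos = sum(d.get(k, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(k, 0.0) for d in nonrelevant) / len(nonrelevant) if nonrelevant else 0.0
        new_profile[k] = alpha * profile.get(k, 0.0) + beta * pos - gamma * neg
    return new_profile


def cosine(u, v):
    """Standard cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

A test document would then be scored with `cosine(new_profile, doc)` and judged relevant above some threshold; the choice of threshold is not specified in this section.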