
LTE Signaling Procedures Explained

Overview

By first explaining key concepts, this document lays the groundwork for parsing signaling procedures. It then walks through the major LTE signaling procedures so that readers become familiar with how each physical process is realized; next, it interprets abnormal signaling to strengthen the reader's judgment of abnormal signaling flows; and it then parses the system information messages so that readers understand their characteristics and contents.

Finally, it explains signaling captured in live tests, pointing out the important information element (IE) fields of each message.

Chapter 1: Protocol Layers and Concepts

1.1 Control Plane and User Plane

In a wireless communication system, the protocols responsible for carrying and processing user data flows are called the user plane; the protocols responsible for carrying and processing the signaling that coordinates the system are called the control plane.

The user plane is like the dock workers who do the carrying, while the control plane is the foreman who directs them. When the two planes are not separated, one party both carries and directs, which is ill-suited to handling large cargo; once the duties are divided, efficiency can multiply. In the LTE network, the user plane and the control plane are explicitly separated.

1.2 Interfaces and Protocols

An interface is the point at which different network elements exchange information. Each interface carries its own protocols; the mutually understood language used by network elements on the same interface to exchange information is called the interface protocol, and the architecture formed by the interface protocols is called the protocol stack.

LTE has both air interfaces and terrestrial interfaces, each with corresponding protocols and protocol stacks.

[Figure 1: Sublayers, protocol stack, and the signaling and data flows. Figure 2: How the sublayers operate.]

The data processing of the LTE system is decomposed into distinct protocol layers.

It can be divided simply into a three-layer structure: the physical layer, the data link layer (L2), and the network layer.

Figure 1 illustrates the overall protocol architecture of LTE transmission and the paths and directions of user-plane and control-plane data.

User data flows and signaling flows are carried as IP packets. Before transmission over the air interface, an IP packet is processed by the entities of several protocol layers; upon reaching the eNodeB, it is processed in reverse through those layers and then flows over the S1/X2 interfaces to the appropriate EPS entities. The characteristics and functions of each protocol sublayer along this path are as follows:

1.2.1 NAS protocol (Non-Access Stratum protocol)

Handles the transfer of information between the UE and the MME. The content carried can be user information or control information (such as establishment and release of services, or mobility management information).

It is independent of the access information: the access-stratum signaling exchange merely establishes a signaling path between the UE and the MME, over which the non-access-stratum signaling procedures can then be carried out.
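To make the layering concrete, the sketch below (our illustration, not part of the original text; it assumes the standard PDCP/RLC/MAC/PHY sublayering of the LTE user plane) walks an IP packet down the stack on the UE side; the eNodeB performs the same steps in reverse, as described above.

```python
# Toy model of the LTE user-plane protocol stack; the per-layer
# "framing" is a placeholder, not real PDCP/RLC/MAC/PHY processing.
UE_USER_PLANE_STACK = [
    ("PDCP", "header compression and ciphering"),
    ("RLC",  "segmentation and ARQ retransmission"),
    ("MAC",  "scheduling, multiplexing and HARQ"),
    ("PHY",  "coding and modulation onto the air interface"),
]

def send_down_stack(ip_packet: bytes) -> bytes:
    """Pass an IP packet down through each sublayer in turn."""
    pdu = ip_packet
    for layer, role in UE_USER_PLANE_STACK:
        print(f"{layer}: {role}")
        pdu = layer.encode() + b"|" + pdu   # stand-in for real framing
    return pdu

frame = send_down_stack(b"IP payload")
```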

LSEG World-Check One: Customer and Counterparty Risk Screening System Brochure

of Politically Exposed Person (PEP) relationships and networks, and is customisable to identify a variety of specific third-party risks. The platform enables:

- Advanced name-matching algorithms
- Rich data
- Secondary matching
- Fewer false positives
- Faster match resolution
- Batch upload
- Ongoing rescreening
- Superior relevant media content screening

Leverage LSEG World-Check, software and services

World-Check One combines World-Check with the next generation of screening software. The software is built to maximise our proprietary World-Check data, capitalising on the power of multiple secondary identifiers and additional information fields. With Enhanced Due Diligence reports and our Screening Resolution Service, organisations can focus on the records that matter most.

Screening software designed for World-Check

- LSEG World-Check
- Single name checks for manual name checking
- Initial and ongoing screening of millions of records
- Batch Screening
- API with Zero Footprint Screening
- A user interface available in multiple languages
- Watchlist Screening that enables the user to upload in-house and third-party lists to screen against
- Media Check: AI-powered negative media screening tool, which pinpoints the media content most relevant to helping you meet your regulatory and legislative compliance requirements; also available as an optional add-on via the World-Check One API
- Identify Ultimate Beneficial Ownership: powered by market-leading Dun & Bradstreet UBO data, our opt-in feature UBO Check lets you search and screen for regulatory and reputational risk with World-Check Risk Intelligence, all on one platform
- Improved workflow: our enhanced case management functionality facilitates better visibility and improved breakdown of records, to help speed up the remediation process
- Vessel due diligence: with IHS Maritime data, check vessels for ownership structure and IMOs, and screen for any sanction and/or regulatory risk with World-Check, all on our Vessel Check feature
- Identity Verification: Identity Check enables organisations to verify the identity of individuals and businesses through a data-based identity verification approach, utilising independent and authoritative identity data sources.

World-Check One benefits

More precision: World-Check One enables greater customisation and control at name-matching level to screen against specific lists and data sets, or specific fields within those data sets, such as gender, nationality and date of birth.

Lowering false positives: When combined with the configurable name-matching algorithms and filtering technology in World-Check One, multiple secondary identifiers in World-Check help to reduce false positives.

Intelligent teamwork: The case management tool enables managers to define customised workflows to route cases to the right individuals and specialist teams, thereby reducing cycle times, promoting speed and efficiency, and giving teams more time to focus on investigations of the highest concern.

Get more done with less: World-Check One is designed to reduce the burden of daily customer screening. Its customisable searches, reduced false positives, ongoing screening capability and improved workflow reduce cycle times.

Streamline the screening process: Our World-Check One API facilitates the integration of large volumes of information and advanced functionalities into existing workflows and internal systems.
This increases the operational efficiency of the screening process for onboarding, Know Your Customer (KYC) and third-party risk due diligence.

One solution to screen multiple lists: Watchlist Screening allows users to upload internal and third-party lists to World-Check One and apply the matching logic to all data sets, ensuring minimisation of false positives and consistency of results.

More precise media screening: Negative media forms part of a best-practice approach to customer due diligence and ongoing risk assessment. AI delivers next-generation media screening of unstructured media, along with improved relevancy of results and workflow integration, to enable better decision-making.

Audit trail* and reporting capabilities: World-Check One provides an extensive auditing capability, with date-stamped actions for the match resolution process. It includes detailed reports that can be used as part of management reporting and regulatory proof of due diligence.

World-Check One delivers a more efficient approach: Balancing the regulatory and operational burden requires organisations to take a more targeted approach towards customer due diligence. Firms often have to do more with less. They require more efficiency from the tools, technology and operations that support customer due diligence.

*Not applicable for clients that do not require an audit trail.

World-Check One leverages World-Check: Find hidden risk in business relationships and human networks. World-Check provides trusted information to help businesses comply with regulations and identify potential financial crime. Since its inception, World-Check has served the KYC and third-party screening needs of the world's largest firms, simplifying day-to-day onboarding and monitoring decisions, and helping businesses comply with anti-money laundering and countering financing of terrorism legislation. World-Check data is sourced from the public domain, deduplicated, structured into individual reports and linked to associations or human networks. Each action is underpinned by a meticulous, regulated research process. In addition to 100 percent sanctions coverage, additional risk-based information is sourced from extensive global media research by more than 400 research analysts working in over 65 languages, covering 240 countries. Information is collated from an extensive network of thousands of reputable sources, including 700+ sanction, watch, regulatory and law enforcement lists; local and international government records; country-specific data sources; international adverse electronic and physical media searches; and English and foreign-language data sources.

Sophisticated software: A unified platform approach to customer due diligence. Our highly scalable solution is built for single users or large teams to support a carefully targeted approach to screening during KYC onboarding, ongoing monitoring and rescreening cycles. The system makes remediation quicker and more intelligent, and is adaptable to meet regulation changes.

Additional services: We help organisations to optimise their resources and reduce operational cost. Screening Resolution Service – our service highlights positive and possible matches for any customer identification programme, detecting heightened-risk individuals and entities screened against World-Check. By using a managed service like Screening Resolution Service, you can reduce your cost of compliance and free up departments to focus their efforts on activities such as tracking and implementing regulatory change.
LSEG Due Diligence Reports – Use our Due Diligence reports to help you comply with anti-money laundering, anti-bribery and corruption regulations, or ahead of a merger, acquisition or joint venture. You can also use them for third-party risk assessment, onboarding decision-making and identifying beneficial ownership structures. Using only ethical and non-intrusive research methods, LSEG is committed to principles of integrity and accountability. Subjects are not aware when we carry out an investigation, and we never misrepresent our activities. LSEG has a dedicated risk and control team that performs regular audits of the service, and external accreditation to the ISAE 3000 standard through Schellman & Company, LLC.

LSEG World-Check One for Salesforce – World-Check One for Salesforce connects your customer and third-party data from Salesforce with our proprietary World-Check to help you decide whether to onboard the vast majority of entities being screened or use further due diligence.

Collaboration tools: Enhanced enterprise-level case management capabilities facilitate work on cases with assigned colleagues and teams when investigating risk, to ensure all decisions and discussions are captured as part of your audit trail.

Secondary matching: Apply secondary matching rules at list level based on your approach. Greater control enables reduced false positives.

User experience: Proven user interface promotes minimum user interaction.

Cross-team communication: Language capabilities, ideal for multinational companies and team remediation.

Prove due diligence: Each step of the screening process is tracked and saved for auditing purposes. To satisfy regulatory demands, organisations can retrieve a detailed report showing the decision-making process and the individuals involved during every stage of the remediation process.

World-Check One's easy-to-use interface helps compliance teams work more efficiently.

An Identity-Based Threshold Signature Scheme Resistant to Conspiracy Attacks

Abstract: Based on the Paterson signature scheme and a distributed key generation protocol, a novel ID-based threshold signature scheme is proposed. The scheme can effectively resist conspiracy attacks and forgery attacks; furthermore, it can trace the signers. The group public key is ...

Keywords: Threshold signature; Conspiracy attack; ID-based; Traceability

0 Introduction

Reference [1] proposed identity-based cryptography, which lowers the cost of key distribution and management found in certificate-based public-key systems: a user's public key is computed from the user's identity by a public algorithm, while the user's private key is generated centrally by the PKG (Private Key Generator). A threshold signature scheme can prevent the PKG from holding excessive power: it splits the group signing key into shares. A conspiracy attack is one in which secret-share holders whose number is greater than or equal to the threshold collude to recover the group signing key with high probability, and then irresponsibly forge the group's signature on arbitrary messages.
In the scheme, each member $P_i$ computes

$$Y_i = g^{k_i} \bmod p$$

where $k_i$ is the partial key inherent to $P_i$, and $Y_i$ is $P_i$'s partial public key. $Y_i$ must satisfy the following condition: the products $Y = \prod Y_i$ formed by the $Y_i$ of any $t$ members must all be distinct ...
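For intuition, the sketch below (ours, not the paper's scheme; the prime and key values are purely illustrative) shows the (t, n) threshold mechanism such schemes build on: a secret split with Shamir sharing can be reconstructed by any t share holders via Lagrange interpolation, which is exactly the capability a conspiracy attack abuses to recover the group signing key.

```python
import random

P = 2**127 - 1  # a Mersenne prime, chosen here only for illustration

def make_shares(secret, t, n):
    # Random polynomial f of degree t-1 with f(0) = secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers f(0), the secret.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = 123456789
shares = make_shares(key, t=3, n=5)
assert reconstruct(shares[:3]) == key  # any 3 of the 5 shares suffice
```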

Insight Problem Solving: A Critical Examination of the Possibility of Formal Theory

The Journal of Problem Solving • volume 5, no. 1 (Fall 2012)

William H. Batchelder and Gregory E. Alexander
Department of Cognitive Sciences, University of California Irvine
doi: 10.7771/1932-6246.1143

Abstract

This paper provides a critical examination of the current state and future possibility of formal cognitive theory for insight problem solving and its associated "aha!" experience. Insight problems are contrasted with move problems, which have been formally defined and studied extensively by cognitive psychologists since the pioneering work of Alan Newell and Herbert Simon. To facilitate our discussion, a number of classical brainteasers are presented along with their solutions and some conclusions derived from observing the behavior of many students trying to solve them. Some of these problems are interesting in their own right, and many of them have not been discussed before in the psychological literature. The main purpose of presenting the brainteasers is to assist in discussing the status of formal cognitive theory for insight problem solving, which is argued to be considerably weaker than that found in other areas of higher cognition such as human memory, decision-making, categorization, and perception. We discuss theoretical barriers that have plagued the development of successful formal theory for insight problem solving. A few suggestions are made that might serve to advance the field.

Keywords: Insight problems, move problems, modularity, problem representation

1. Introduction

This paper discusses the current state and a possible future of formal cognitive theory for insight problem solving and its associated "aha!" experience. Insight problems are contrasted with so-called move problems defined and studied extensively by Alan Newell and Herbert Simon (1972). These authors provided a formal, computational theory for such problems called the General Problem Solver (GPS), and this theory was one of the first formal information processing theories to be developed in cognitive psychology. A move problem is posed to solvers in terms of a clearly defined representation consisting of a starting state, a description of the goal state(s), and operators that allow transitions from one problem state to another, as in Newell and Simon (1972) and Mayer (1992). A solution to a move problem involves applying operators successively to generate a sequence of transitions (moves) from the starting state through intermediate problem states and finally to a goal state. Move problems will be discussed more extensively in Section 4.6.

In solving move problems, insight may be required for selecting productive moves at various states in the problem space; however, for our purposes we are interested in the sorts of problems that are described often as insight problems. Unlike Newell and Simon's formal definition of move problems, there has not been a generally agreed upon definition of an insight problem (Ash, Jee, and Wiley, 2012; Chronicle, MacGregor, and Ormerod, 2004; Chu and MacGregor, 2011). It is our view that it is not productive to attempt a precise logical definition of an insight problem, and instead we offer a set of shared defining characteristics in the spirit of Wittgenstein's (1958) definition of 'game' in terms of family resemblances.
Problems that we will treat as insight problems share many of the following defining characteristics: (1) They are posed in such a way as to admit several possible problem representations, each with an associated solution search space. (2) Likely initial representations are inadequate in that they fail to allow the possibility of discovering a problem solution. (3) In order to overcome such a failure, it is necessary to find an alternative productive representation of the problem. (4) Finding a productive problem representation may be facilitated by a period of non-solving activity called incubation, and also it may be potentiated by well-chosen hints. (5) Once obtained, a productive representation leads quite directly and quickly to a solution. (6) The solution involves the use of knowledge that is well known to the solver. (7) Once the solution is obtained, it is accompanied by a so-called "aha!" experience. (8) When a solution is revealed to a non-solver, it is grasped quickly, often with a feeling of surprise at its simplicity, akin to an "aha!" experience.

It is our position that very little is known empirically or theoretically about the cognitive processes involved in solving insight problems. Furthermore, this lack of knowledge stands in stark contrast with other areas of cognition such as human memory, decision-making, categorization, and perception. These areas of cognition have a large number of replicable empirical facts, and many formal theories and computational models exist that attempt to explain these facts in terms of underlying cognitive processes. The main goal of this paper is to explain the reasons why it has been so difficult to achieve a scientific understanding of the cognitive processes involved in insight problem solving.

There have been many scientific books and papers on insight problem solving, starting with the seminal work of the Gestalt psychologists Köhler (1925), Duncker (1945), and Wertheimer (1954), as well as the English social psychologist, Wallas (1926). Since the contributions of the early Gestalt psychologists, there have been many journal articles, a few scientific books, such as those by Sternberg and Davidson (1996) and Chu (2009), and a large number of books on the subject by laypersons. Most recently, two excellent critical reviews of insight problem solving have appeared: Ash, Cushen, and Wiley (2009) and Chu and MacGregor (2011).

The approach in this paper is to discuss, at a general level, the nature of several fundamental barriers to the scientific study of insight problem solving. Rather than criticizing particular experimental studies or specific theories in detail, we try to step back and take a look at the area itself. In this effort, we attempt to identify principled reasons why the area of insight problem solving is so resistant to scientific progress. To assist in this approach we discuss and informally analyze eighteen classical brainteasers in the main sections of the paper. These problems are among many that have been posed to hundreds of upper divisional undergraduate students in a course titled "Human Problem Solving" taught for many years by the senior author. Only the first two of these problems can be regarded strictly as move problems in the sense of Newell and Simon, and most of the rest share many of the characteristics of insight problems as described earlier.

The paper is divided into five main sections.
After the Introduction, Section 2 describes the nature of the problem solving class. Section 3 poses the eighteen brainteasers that will be discussed in later sections of the paper. The reader is invited to try to solve these problems before checking out the solutions in the Appendix. Section 4 lays out six major barriers to developing a deep scientific theory of insight problem solving that we believe are endemic to the field. We argue that these barriers are not present in other, more theoretically advanced areas of higher cognition such as human memory, decision-making, categorization, and perception. These barriers include the lack of many experimental paradigms (4.1), the lack of a large, well-classified set of stimulus material (4.2), and the lack of many informative behavioral measures (4.3). In addition, it is argued that insight problem solving is difficult to study because it is non-modular, both in the sense of Fodor (1983) but more importantly in several weaker senses of modularity that admit other areas of higher cognition (4.4), the lack of theoretical generalizations about insight problem solving from experiments with particular insight problems (4.5), and the lack of computational theories of human insight (4.6). Finally, in Section 5, we suggest several avenues that may help overcome some of the barriers described in Section 4. These include suggestions for useful classes of insight problems (5.1), suggestions for experimental work with expert problem solvers (5.2), and some possibilities for a computational theory of insight.

2. Batchelder's Human Problem Solving Class

The senior author, William Batchelder, has taught an upper divisional undergraduate course called "Human Problem Solving" for over twenty-five years to classes ranging in size from 75 to 100 students. By way of background, his active research is in other areas of the cognitive sciences; however, he maintains a long-term hobby of studying classical brainteasers. In the area of complex games, he achieved the title of Senior Master from the United States Chess Federation, he was an active duplicate bridge player throughout undergraduate and graduate school, and he also achieved a reasonable level of skill in the game of Go.

The content of the problem-solving course is split into two main topics. The first topic involves encouraging students to try their hand at solving a number of famous brainteasers drawn from the sizeable folklore of insight problems, especially the work of Martin Gardner (1978, 1982), Sam Loyd (1914), and Raymond Smullyan (1978). In addition, games like chess, bridge, and Go are discussed. The second topic involves presenting the psychological theory of thinking and problem solving, and in most cases the material is organized around developments in topics that are covered in the first eight chapters of Mayer (1992). These topics include work of the Gestalt psychologists on problem solving, discussion of experiments and theories concerning induction and deduction, presenting the work on move problems, including the General Problem Solver (Newell & Simon, 1972), showing how response time studies can reveal mental architectures, and describing theories of memory representation and question answering.

Despite efforts, the structure of the course does not reflect a close overlap between its two main topics.
The principal reason for this is that in our view the level of theoretical and empirical work on insight problem solving is at a substantially lower level than is the work in almost any other area of cognition dealing with higher processes. The main goal of this paper is to explain our reasons for this pessimistic view. To assist in this goal, it is helpful to get some classical brainteasers on the table. While most of these problems have not been used in experimental studies, the senior author has experienced the solution efforts and post solution discussions of over 2,000 students who have grappled with these problems in class.

3. Some Classic Brainteasers

In this section we present eighteen classical brainteasers from the folklore of problem solving that will be discussed in the remainder of the paper. These problems have delighted brainteaser connoisseurs for years, and most are capable of giving the solver a large dose of the "aha!" experience. There are numerous collections of these problems in books, and many collections of them are accessible through the Internet. We have selected these problems because they, and others like them, pose a real challenge to any effort to develop a deep and general formal theory of human or machine insight problem solving. With the exception of Problems 3.1 and 3.2, and arguably 3.6, the problems are different in important respects from so-called move problems of Newell and Simon (1972) described earlier and in Section 4.6.

Most of the problems posed in this section share many of the defining characteristics of insight problems described in Section 1. In particular, they do not involve multiple steps, they require at most a very minimal amount of technical knowledge, and most of them can be solved by one or two fairly simple insights, albeit insights that are rarely achieved in real time by problem solvers. What makes these problems interesting is that they are posed in such a way as to induce solvers to represent the problem information in an unproductive way. Then the main barrier to finding a solution to one of these problems is to overcome a poor initial problem representation. This may involve such things as a re-representation of the problem, the dropping of an implicit constraint on the solution space, or seeing a parallel to some other similar problem. If the solver finds a productive way of viewing the problem, the solution generally follows rapidly and comes with a burst of insight, namely the "aha!" experience. In addition, when non-solvers are given the solution they too may experience a burst of insight.

What follows next are statements of the eighteen brainteasers. The solutions are presented in the Appendix, and we recommend that after whatever problem solving activity a reader wishes to engage in, that the Appendix is studied before reading the remaining two sections of the paper. As we discuss each problem in the paper, we provide authorship information where authorship is known. In addition, we rephrased some of the problems from their original sources.

Problem 3.1. Imagine you have an 8-inch by 8-inch array of 1-inch by 1-inch little squares. You also have a large box of 2-inch by 1-inch rectangular shaped dominoes. Of course it is easy to tile the 64 little squares with dominoes in the sense that every square is covered exactly once by a domino and no domino is hanging off the array. Now suppose the upper right and lower left corner squares are cut off the array. Is it possible to tile the new configuration of 62 little squares with dominoes allowing no overlaps and no overhangs?
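(For readers who want to check the counting argument behind Problem 3.1 after attempting it, the short script below, our addition rather than the paper's, colours the squares like a checkerboard and counts each colour once the two corners are removed; note that a domino always covers one square of each colour.)

```python
# Checkerboard-colour the 8x8 array, drop the upper-right and
# lower-left corners, and count squares of each colour.
removed = {(0, 7), (7, 0)}          # the two cut-off corners
counts = {0: 0, 1: 0}               # colour -> number of squares
for row in range(8):
    for col in range(8):
        if (row, col) not in removed:
            counts[(row + col) % 2] += 1
print(counts)  # {0: 32, 1: 30}: unequal counts, so no tiling exists
```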
Problem 3.2. A 3-inch by 3-inch by 3-inch cheese cube is made of 27 little 1-inch cheese cubes of different flavors so that it is configured like a Rubik's cube. A cheese-eating worm devours one of the top corner cubes. After eating any little cube, the worm can go on to eat any adjacent little cube (one that shares a wall). The middlemost little cube is by far the tastiest, so our worm wants to eat through all the little cubes finishing last with the middlemost cube. Is it possible for the worm to accomplish this goal? Could he start with eating any other little cube and finish last with the middlemost cube as the 27th?

[Figure 1. The cheese eating worm problem.]

Problem 3.3. You have ten volumes of an encyclopedia numbered 1, . . . , 10 and shelved in a bookcase in sequence in the ordinary way. Each volume has 100 pages, and to simplify suppose the front cover of each volume is page 1 and numbering is consecutive through page 100, which is the back cover. You go to sleep and in the middle of the night a bookworm crawls onto the bookcase. It eats through the first page of the first volume and eats continuously onwards, stopping after eating the last page of the tenth volume. How many pieces of paper did the bookworm eat through?

[Figure 2. Bookcase setup for the Bookworm Problem.]

Problem 3.4. Suppose the earth is a perfect sphere, and an angel fits a tight gold belt around the equator so there is no room to slip anything under the belt. The angel has second thoughts and adds an inch to the belt, and fits it evenly around the equator. Could you slip a dime under the belt?

Problem 3.5. Consider the cube in Figure 1 and suppose the top and bottom surfaces are painted red and the other four sides are painted blue. How many little cubes have at least one red and at least one blue side?

Problem 3.6. Look at the nine dots in Figure 3. Your job is to take a pencil and connect them using only three straight lines. Retracing a line is not allowed and removing your pencil from the paper as you draw is not allowed. Note the usual nine-dot problem requires you to do it with four lines; you may want to try that stipulation as well.

[Figure 3. The setup for the Nine-Dot Problem.]

Problem 3.7. You are standing outside a light-tight, well-insulated closet with one door, which is closed. The closet contains three light sockets each containing a working light bulb. Outside the closet, there are three on/off light switches, each of which controls a different one of the sockets in the closet. All switches are off. Your task is to identify which switch operates which light bulb. You can turn the switches off and on and leave them in any position, but once you open the closet door you cannot change the setting of any switch. Your task is to figure out which switch controls which light bulb while you are only allowed to open the door once.

[Figure 4. The setup of the Light Bulb Problem.]

Problem 3.8. We know that any finite string of symbols can be extended in infinitely many ways depending on the inductive (recursive) rule; however, many of these ways are not 'reasonable' from a human perspective.
With this in mind, find a reasonable rule to continue the following series:

A BCD EF G HI JKLM . . .

Problem 3.9. You have two quart-size beakers labeled A and B. Beaker A has a pint of coffee in it and beaker B has a pint of cream in it. First you take a tablespoon of coffee from A and pour it in B. After mixing the contents of B thoroughly you take a tablespoon of the mixture in B and pour it back into A, again mixing thoroughly. After the two transfers, which beaker, if either, has a less diluted (more pure) content of its original substance - coffee in A or cream in B? (Forget any issues of chemistry such as miscibility.)

[Figure 5. The setup of the Coffee and Cream Problem.]

Problem 3.10. There are two large jars, A and B. Jar A is filled with a large number of blue beads, and Jar B is filled with the same number of red beads. Five beads from Jar A are scooped out and transferred to Jar B. Someone then puts a hand in Jar B and randomly grabs five beads from it and places them in Jar A. Under what conditions after the second transfer would there be the same number of red beads in Jar A as there are blue beads in Jar B?

Problem 3.11. Two trains A and B leave their train stations at exactly the same time, and, unaware of each other, head toward each other on a straight 100-mile track between the two stations. Each is going exactly 50 mph, and they are destined to crash. At the time the trains leave their stations, a SUPERFLY takes off from the engine of train A and flies directly toward train B at 100 mph. When he reaches train B, he turns around instantly, continuing at 100 mph toward train A. The SUPERFLY continues in this way until the trains crash head-on, and on the very last moment he slips out to live another day. How many miles does the SUPERFLY travel on his zigzag route by the time the trains collide?

Problem 3.12. George lives at the foot of a mountain, and there is a single narrow trail from his house to a campsite on the top of the mountain. At exactly 6 a.m. on Saturday he starts up the trail, and without stopping or backtracking arrives at the top before 6 p.m. He pitches his tent, stays the night, and the next morning, on Sunday, at exactly 6 a.m., he starts down the trail, hiking continuously without backtracking, and reaches his house before 6 p.m. Must there be a time of day on Sunday where he was exactly at the same place on the trail as he was at that time on Saturday? Could there be more than one such place?

Problem 3.13. You are driving up and down a mountain that is 20 miles up and 20 miles down. You average 30 mph going up; how fast would you have to go coming down the mountain to average 60 mph for the entire trip?

Problem 3.14. During a recent census, a man told the census taker that he had three children. The census taker said that he needed to know their ages, and the man replied that the product of their ages was 36. The census taker, slightly miffed, said he needed to know each of their ages. The man said, "Well the sum of their ages is the same as my house number." The census taker looked at the house number and complained, "I still can't tell their ages." The man said, "Oh, that's right, the oldest one taught the younger ones to play chess." The census taker promptly wrote down the ages of the three children. How did he know, and what were the ages?
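(As with Problem 3.1, a brute-force check of Problem 3.14 is easy to write once the problem is understood; the script below is our addition and reveals the answer, so run it only after trying the problem yourself.)

```python
from itertools import combinations_with_replacement
from math import prod

# Enumerate nondecreasing age triples with product 36, find those whose
# sum is shared with another triple (the census taker's ambiguity), then
# keep the one in which a strict oldest child exists.
triples = [t for t in combinations_with_replacement(range(1, 37), 3)
           if prod(t) == 36]
ambiguous = [t for t in triples
             if sum(1 for u in triples if sum(u) == sum(t)) > 1]
answer = [t for t in ambiguous if t[2] > t[1]]
print(answer)  # [(2, 2, 9)]
```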
Problem 3.15. A closet has two red hats and three white hats. Three participants and a Gamesmaster know that these are the only hats in play. Man A has two good eyes, man B only one good eye, and man C is blind. The three men sit on chairs facing each other, and the Gamesmaster places a hat on each man's head, in such a way that no man can see the color of his own hat. The Gamesmaster offers a deal, namely if any man correctly states the color of his hat, he will get $50,000; however, if he is in error, then he has to serve the rest of his life as an indentured servant to the Gamesmaster. Man A looks around and says, "I am not going to guess." Then Man B looks around and says, "I am not going to guess." Finally Man C says, "From what my friends with eyes have said, I can clearly see that my hat is _____." He wins the $50,000, and your task is to fill in the blank and explain how the blind man knew the color of his hat.

Problem 3.16. A king dies and leaves an estate, including 17 horses, to his three daughters. According to his will, everything is to be divided among his daughters as follows: 1/2 to the oldest daughter, 1/3 to the middle daughter, and 1/9 to the youngest daughter. The three heirs are puzzled as to how to divide the horses among themselves, when a probate lawyer rides up on his horse and offers to assist. He adds his horse to the king's horses, so there will be 18 horses. Then he proceeds to divide the horses among the daughters. The oldest gets 1/2 of the horses, which is 9; the middle daughter gets 6 horses, which is 1/3 of the horses; and the youngest gets 2 horses, 1/9 of the lot. That's 17 horses, so the lawyer gets on his own horse and rides off with a nice commission. How was it possible for the lawyer to solve the heirs' problem and still retain his own horse?

Problem 3.17. A logical wizard offers you the opportunity to make one statement: if it is false, he will give you exactly ten dollars, and if it is true, he will give you an amount of money other than ten dollars. Give an example of a statement that would be sure to make you rich.

Problem 3.18. Discover an interesting sense of the claim that it is in principle impossible to draw a perfect map of England while standing in a London flat; however, it is not in principle impossible to do so while living in a New York City pad.

4. Barriers to a Theory of Insight Problem Solving

As mentioned earlier, our view is that there are a number of theoretical barriers that make it difficult to develop a satisfactory formal theory of the cognitive processes in play when humans solve classical brainteasers of the sort posed in Section 3. Further, these barriers seem almost unique to insight problem solving in comparison with the more fully developed higher process areas of the cognitive sciences such as human memory, decision-making, categorization, and perception. Indeed it seems uncontroversial to us that neither human nor machine insight problem solving is well understood, and compared to other higher process areas in psychology, it is the least developed area both empirically and theoretically.

There are two recent comprehensive critical reviews concerning insight problem solving by Ash, Cushen, and Wiley (2009) and Chu and MacGregor (2011). These articles describe the current state of empirical and theoretical work on insight problem solving, with a focus on experimental studies and theories of problem restructuring.
In our view, both reviews are consistent with our belief that there has been very little sustainable progress in achieving a general scientific understanding of insight. Particularly striking is that there are no established general, formal theories or models of insight problem solving. By a general formal model of insight problem solving we mean a set of clearly formulated assumptions that lead formally or logically to precise behavioral predictions over a wide range of insight problems. Such a formal model could be posed in terms of a number of formal languages including information processing assumptions, neural networks, computer simulation, stochastic assumptions, or Bayesian assumptions.

Since the groundbreaking work by the Gestalt psychologists on insight problem solving, there have been theoretical ideas that have been helpful in explaining the cognitive processes at play in solving certain selected insight problems. Among the earlier ideas are Luchins' concept of einstellung (blind spot) and Duncker's functional fixedness, as in Mayer (1992). More recently, there have been two developed theoretical ideas: (1) Criterion for Satisfactory Progress theory (Chu, Dewald, & Chronicle, 2007; MacGregor, Ormerod, & Chronicle, 2001), and (2) Representational Change Theory (Knoblich, Ohlsson, Haider, & Rhenius, 1999). We will discuss these theories in more detail in Section 4. While it is arguable that these theoretical ideas have done good work in understanding in detail a few selected insight problems, we argue that it is not at all clear how these ideas can be generalized to constitute a formal theory of insight problem solving at anywhere near the level of generality that has been achieved by formal theories in other areas of higher process cognition.

The dearth of formal theories of insight problem solving is in stark contrast with other areas of problem solving discussed in Section 4.6, for example move problems discussed earlier and the more recent work on combinatorial optimization problems such as the two-dimensional traveling salesman problem (MacGregor and Chu, 2011). In addition, most other higher process areas of cognition are replete with a variety of formal theories and models. For example, in the area of human memory there are currently a very large number of formal, information processing models, many of which have evolved from earlier mathematical models, as in Norman (1970). In the area of categorization, there are currently several major formal theories along with many variations that stem from earlier theories discussed in Ashby (1992) and Estes (1996). In areas ranging from psycholinguistics to perception, there are a number of formal models based on brain-style computation stemming from Rumelhart, McClelland, and the PDP Research Group's (1987) classic two-volume book on parallel distributed processing. Since Daniel Kahneman's 2002 Nobel Memorial Prize in the Economic Sciences for work jointly with Amos Tversky developing prospect theory, as in Kahneman and Tversky (1979), psychologically based formal models of human decision-making are a major theoretical area in cognitive psychology today. In our view, there is nothing in the area of insight problem solving that approaches the depth and breadth of formal models seen in the areas mentioned above.

In the following subsections, we will discuss some of the barriers that have prevented the development of a satisfactory theory of insight problem solving.
Some of the barriers will be illustrated with references to the problems in Section 3. Then, in Section 5 we will assuage our pessimism a bit by suggesting how some of these barriers might be removed in future work to facilitate the development of an adequate theory of insight problem solving.

4.1 Lack of Many Experimental Paradigms

There are not many distinct experimental paradigms to study insight problem solving. The standard paradigm is to pick a particular problem, such as one of the ones in Section 3, and present it to several groups of subjects, perhaps in different ways. For example, groups may differ in the way a hint is presented, a diagram is provided, or an instruction

A Forward-Secure Undeniable Digital Signature Scheme Based on Zero-Knowledge Proof
Computer Engineering, Vol. 33, No. 8, April 2007
Doctoral Dissertation column · Document code: A
(College of Science, Xi'an University of Technology, Xi'an 710054)
Abstract: This paper proposes a forward-secure undeniable digital signature scheme based on zero-knowledge proof. By combining the undeniable digital signature with the forward-secure digital signature, the new scheme keeps the general properties of undeniable digital signatures while adding forward security, so that if the signing key is leaked the loss can be reduced to a minimum. The scheme features unforgeability and undeniability of signatures, together with short signature and key lengths. The key-update protocol uses the idea of zero-knowledge proof to guarantee the security of key evolution. Under standard hard-problem assumptions, the scheme is secure.

Keywords: digital signature; undeniability; forward security; zero-knowledge proof
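The forward-security idea the abstract describes can be sketched with a simple one-way key chain (our illustration under generic assumptions, not the paper's actual construction, which additionally involves zero-knowledge proofs in the update protocol):

```python
import hashlib

def evolve(sk: bytes) -> bytes:
    """One-way key update: sk_{j+1} = H(sk_j)."""
    return hashlib.sha256(sk).digest()

sk = b"initial secret key for period 0"   # illustrative value only
history = []
for period in range(3):
    history.append(sk)                    # (a real signer would erase this)
    sk = evolve(sk)

# An attacker holding the current key cannot invert SHA-256 to obtain
# earlier keys, so signatures from past periods remain trustworthy.
assert evolve(history[-1]) == sk
```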

The use of the area under the ROC curve in the evaluation of machine learning algorithms

In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare and discuss the use of AUC to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence from the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine learning algorithms. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.

Keywords: The ROC curve; Cross-validation; The area under the ROC curve (AUC); Wilcoxon statistic; Standard error; Accuracy measures
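As a pointer to how these quantities are computed in practice, the sketch below is ours rather than the paper's; it follows the Wilcoxon/Mann-Whitney view of AUC and the Hanley-McNeil (1982) standard error formula, whose shrinkage with growing AUC and sample size matches the property noted in the abstract.

```python
from math import sqrt

def auc_wilcoxon(pos_scores, neg_scores):
    """AUC as the Wilcoxon/Mann-Whitney statistic: the probability that
    a random positive case scores above a random negative (ties = 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil standard error of the estimated AUC."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc) + (n_pos - 1) * (q1 - auc * auc)
           + (n_neg - 1) * (q2 - auc * auc)) / (n_pos * n_neg)
    return sqrt(var)

a = auc_wilcoxon([0.9, 0.8, 0.7], [0.6, 0.8, 0.2])
print(a, auc_standard_error(a, 3, 3))  # 0.8333... and its std. error
```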

ANSI/TIA/EIA-568-B.2-1 (Category 6), June 20, 2002
Standards and Publications are adopted by TIA/EIA in accordance with the American National Standards Institute (ANSI) patent policy. By such action, TIA/EIA does not assume any liability to any patent owner, nor does it assume any obligation whatever to parties adopting the Standard or Publication.
To order publications, contact Global Engineering Documents: USA and Canada (1-800-854-7179), International (303-397-7956).
All rights reserved. Printed in U.S.A.
PLEASE! DON'T VIOLATE THE LAW!
This Standard does not purport to address all safety problems associated with its use or all applicable regulatory requirements. It is the responsibility of the user of this Standard to establish appropriate safety and health practices and to determine the applicability of regulatory limitations before its use.
Published by

Traffic Flow

Network impacts of a road capacity reduction: Empirical analysis and model predictions

David Watling a,*, David Milne a, Stephen Clark b
a Institute for Transport Studies, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
b Leeds City Council, Leonardo Building, 2 Rossington Street, Leeds LS2 8HD, UK
Transportation Research Part A 46 (2012) 167–189. doi:10.1016/j.tra.2011.09.010

Article history: Received 24 May 2010; received in revised form 15 July 2011; accepted 7 September 2011.
Keywords: Traffic assignment; Network models; Equilibrium; Route choice; Day-to-day variability

Abstract

In spite of their widespread use in policy design and evaluation, relatively little evidence has been reported on how well traffic equilibrium models predict real network impacts. Here we present what we believe to be the first paper that together analyses the explicit impacts on observed route choice of an actual network intervention and compares this with the before-and-after predictions of a network equilibrium model. The analysis is based on the findings of an empirical study of the travel time and route choice impacts of a road capacity reduction. Time-stamped, partial licence plates were recorded across a series of locations, over a period of days both with and without the capacity reduction, and the data were 'matched' between locations using special-purpose statistical methods. Hypothesis tests were used to identify statistically significant changes in travel times and route choice, between the periods of days with and without the capacity reduction. A traffic network equilibrium model was then independently applied to the same scenarios, and its predictions compared with the empirical findings. From a comparison of route choice patterns, a particularly influential spatial effect was revealed of the parameter specifying the relative values of distance and travel time assumed in the generalised cost equations. When this parameter was 'fitted' to the data without the capacity reduction, the network model broadly predicted the route choice impacts of the capacity reduction, but with other values it was seen to perform poorly. The paper concludes by discussing the wider practical and research implications of the study's findings.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

It is well known that altering the localised characteristics of a road network, such as a planned change in road capacity, will tend to have both direct and indirect effects. The direct effects are imparted on the road itself, in terms of how it can deal with a given demand flow entering the link, with an impact on travel times to traverse the link at a given demand flow level. The indirect effects arise due to drivers changing their travel decisions, such as choice of route, in response to the altered travel times. There are many practical circumstances in which it is desirable to forecast these direct and indirect impacts in the context of a systematic change in road capacity. For example, in the case of proposed road widening or junction improvements, there is typically a need to justify economically the required investment in terms of the benefits that will likely accrue. There are also several examples in which it is relevant to examine the impacts of road capacity reduction. For example, if one proposes to reallocate road space between alternative modes, such as increased bus and cycle lane provision or a pedestrianisation scheme, then typically a range of alternative designs exist which may differ in their ability to accommodate efficiently the new traffic and routing patterns.
Through mathematical modelling, the alternative designs may be tested in a simulated environment and the most efficient selected for implementation. Even after a particular design is selected, mathematical models may be used to adjust signal timings to optimise the use of the transport system. Road capacity may also be affected periodically by maintenance to essential services (e.g. water, electricity) or to the road itself, and often this can lead to restricted access over a period of days and weeks. In such cases, planning authorities may use modelling to devise suitable diversionary advice for drivers, and to plan any temporary changes to traffic signals or priorities. Berdica (2002) and Taylor et al. (2006) suggest more of a pro-active approach, proposing that models should be used to test networks for potential vulnerability, before any reduction materialises, identifying links which if reduced in capacity over an extended period[1] would have a substantial impact on system performance.

There are therefore practical requirements for a suitable network model of travel time and route choice impacts of capacity changes. The dominant method that has emerged for this purpose over the last decades is clearly the network equilibrium approach, as proposed by Beckmann et al. (1956) and developed in several directions since. The basis of using this approach is the proposition of what are believed to be 'rational' models of behaviour and other system components (e.g. link performance functions), with site-specific data used to tailor such models to particular case studies. Cross-sectional forecasts of network performance at specific road capacity states may then be made, such that at the time of any 'snapshot' forecast, drivers' route choices are in some kind of individually-optimum state. In this state, drivers cannot improve their route selection by a unilateral change of route, at the snapshot travel time levels.
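The equilibrium state just described is Wardrop's first principle. In conventional notation (our formalisation for the reader; the paper states the idea verbally), with $f_r$ the flow and $C_r$ the generalised cost on route $r$ serving origin-destination pair $w$ with route set $R_w$:

$$ f_r > 0 \;\Rightarrow\; C_r = \min_{s \in R_w} C_s, \qquad \text{e.g. } C_r = t_r + \lambda\, d_r, $$

where $t_r$ is the route travel time, $d_r$ the route distance, and $\lambda$ a distance/time weighting; $\lambda$ plays the role of the "parameter specifying the relative values of distance and travel time" whose fitted value the abstract highlights as particularly influential.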
The accepted practice is to 'validate' such models on a case-by-case basis, by ensuring that the model, when supplied with a particular set of parameters, input network data and input origin–destination demand data, reproduces current measured mean link traffic flows and mean journey times, on a sample of links, to some degree of accuracy (see for example, the practical guidelines in TMIP (1997) and Highways Agency (2002)). This kind of aggregate level, cross-sectional validation to existing conditions persists across a range of network modelling paradigms, ranging from static and dynamic equilibrium (Florian and Nguyen, 1976; Leonard and Tough, 1979; Stephenson and Teply, 1984; Matzoros et al., 1987; Janson et al., 1986; Janson, 1991) to micro-simulation approaches (Laird et al., 1999; Ben-Akiva et al., 2000; Keenan, 2005). While such an approach is plausible, it leaves many questions unanswered, and we would particularly highlight two:

1. The process of calibration and validation of a network equilibrium model may typically occur in a cycle. That is to say, having initially calibrated a model using the base data sources, if the subsequent validation reveals substantial discrepancies in some part of the network, it is then natural to adjust the model parameters (including perhaps even the OD matrix elements) until the model outputs better reflect the validation data.[2] In this process, then, we allow the adjustment of potentially a large number of network parameters and input data in order to replicate the validation data, yet these data themselves are highly aggregate, existing only at the link level. To be clear here, we are talking about a level of coarseness even greater than that in aggregate choice models, since we cannot even infer from link-level data the aggregate shares on alternative routes or OD movements. The question that arises is then: how many different combinations of parameters and input data values might lead to a similar link-level validation, and even if we knew the answer to this question, how might we choose between these alternative combinations? In practice, this issue is typically neglected, meaning that the 'validation' is a rather weak test of the model.

2. Since the data are cross-sectional in time (i.e. the aim is to reproduce current base conditions in equilibrium), then in spite of the large efforts required in data collection, no empirical evidence is routinely collected regarding the model's main purpose, namely its ability to predict changes in behaviour and network performance under changes to the network/demand. This issue is exacerbated by the aggregation concerns in point 1: the 'ambiguity' in choosing appropriate parameter values to satisfy the aggregate, link-level, base validation strengthens the need to independently verify that, with the selected parameter values, the model responds reliably to changes. Although such problems (of fitting equilibrium models to cross-sectional data) have long been recognised by practitioners and academics (see, e.g., Goodwin, 1998), the approach described above remains the state-of-practice.

Having identified these two problems, how might we go about addressing them? One approach to the first problem would be to return to the underlying formulation of the network model, and instead require a model definition that permits analysis by statistical inference techniques (see for example, Nakayama et al., 2009). In this way, we may potentially exploit more information in the variability of the link-level data, with well-defined notions (such as maximum likelihood) allowing a systematic basis for selection between alternative parameter value combinations.

However, this approach is still using rather limited data and it is natural not just to question the model but also the data that we use to calibrate and validate it. Yet this is not altogether straightforward to resolve. As Mahmassani and Jou (2000) remarked: 'A major difficulty . . . is obtaining observations of actual trip-maker behaviour, at the desired level of richness, simultaneously with measurements of prevailing conditions'. For this reason, several authors have turned to simulated gaming environments and/or stated preference techniques to elicit information on drivers' route choice behaviour (e.g. Mahmassani and Herman, 1990; Iida et al., 1992; Khattak et al., 1993; Vaughn et al., 1995; Wardman et al., 1997; Jou, 2001; Chen et al., 2001).

[1] Clearly, more sporadic and less predictable reductions in capacity may also occur, such as in the case of breakdowns and accidents, and environmental factors such as severe weather, floods or landslides (see for example, Iida, 1999), but the responses to such cases are outside the scope of the present paper.
[2] Some authors have suggested more systematic, bi-level type optimization processes for this fitting process (e.g. Xu et al., 2004), but this has no material effect on the essential points above.
This provides potentially rich information for calibrating complex behavioural models, but has the obvious limitation that it is based on imagined rather than real route choice situations.

Aside from its common focus on hypothetical decision situations, this latter body of work also signifies a subtle change of emphasis in the treatment of the overall network calibration problem. Rather than viewing the network equilibrium calibration process as a whole, the focus is on particular components of the model; in the cases above, the focus is on that component concerned with how drivers make route decisions. If we are prepared to make such a component-wise analysis, then certainly there exists abundant empirical evidence in the literature, with a history across a number of decades of research into issues such as the factors affecting drivers' route choice (e.g. Wachs, 1967; Huchingson et al., 1977; Abu-Eisheh and Mannering, 1987; Duffell and Kalombaris, 1988; Antonisse et al., 1989; Bekhor et al., 2002; Liu et al., 2004), the nature of travel time variability (e.g. Smeed and Jeffcoate, 1971; Montgomery and May, 1987; May et al., 1989; McLeod et al., 1993), and the factors affecting traffic flow variability (Bonsall et al., 1984; Huff and Hanson, 1986; Ribeiro, 1994; Rakha and Van Aerde, 1995; Fox et al., 1998).

While these works provide useful evidence for the network equilibrium calibration problem, they do not provide a framework in which we can judge the overall 'fit' of a particular network model in the light of uncertainty, ambient variation and systematic changes in network attributes, be they related to the OD demand, the route choice process, travel times or the network data. Moreover, such data does nothing to address the second point made above, namely the question of how to validate the model forecasts under systematic changes to its inputs. The studies of Mannering et al. (1994) and Emmerink et al. (1996) are distinctive in this context in that they address some of the empirical concerns expressed in the context of travel information impacts, but their work stops at the stage of the empirical analysis, without a link being made to network prediction models. The focus of the present paper therefore is both to present the findings of an empirical study and to link this empirical evidence to network forecasting models.

More recently, Zhu et al. (2010) analysed several sources of data for evidence of the traffic and behavioural impacts of the I-35W bridge collapse in Minneapolis. Most pertinent to the present paper is their location-specific analysis of link flows at 24 locations; by computing the root mean square difference in flows between successive weeks, and comparing the trend for 2006 with that for 2007 (the latter with the bridge collapse), they observed an apparent transient impact of the bridge collapse. They also showed there was no statistically-significant evidence of a difference in the pattern of flows in the period September–November 2007 (a period starting 6 weeks after the bridge collapse), when compared with the corresponding period in 2006. They suggested that this was indicative of the length of a 're-equilibration process' in a conceptual sense, though did not explicitly compare their empirical findings with those of a network equilibrium model.
The structure of the remainder of the paper is as follows. In Section 2 we describe the process of selecting the real-life problem to analyse, together with the details and rationale behind the survey design. Following this, Section 3 describes the statistical techniques used to extract information on travel times and routing patterns from the survey data. Statistical inference is then considered in Section 4, with the aim of detecting statistically significant explanatory factors. In Section 5 comparisons are made between the observed network data and those predicted by a network equilibrium model. Finally, in Section 6 the conclusions of the study are highlighted, and recommendations made for both practice and future research.

2. Experimental design

The ultimate objective of the study was to compare actual data with the output of a traffic network equilibrium model, specifically in terms of how well the equilibrium model was able to correctly forecast the impact of a systematic change applied to the network. While a wealth of surveillance data on link flows and travel times is routinely collected by many local and national agencies, we did not believe that such data would be sufficiently informative for our purposes. The reason is that while such data can often be disaggregated down to small time step resolutions, the data remains aggregate in terms of what it informs about driver response, since it does not provide the opportunity to explicitly trace vehicles (even in aggregate form) across more than one location. This has the effect that observed differences in link flows might be attributed to many potential causes: it is especially difficult to separate out, say, ambient daily variation in the trip demand matrix from systematic changes in route choice, since both may give rise to similar impacts on observed link flow patterns across recorded sites. While methods do exist for reconstructing OD and network route patterns from observed link data (e.g. Yang et al., 1994), these are typically based on the premise of a valid network equilibrium model: in this case then, the data would not be able to give independent information on the validity of the network equilibrium approach.

For these reasons it was decided to design and implement a purpose-built survey. However, it would not be efficient to extensively monitor a network in order to wait for something to happen, and therefore we required advance notification of some planned intervention. For this reason we chose to study the impact of urban maintenance work affecting the roads, which UK local government authorities organise on an annual basis as part of their 'Local Transport Plan'. The city council of York, a historic city in the north of England, agreed to inform us of their plans and to assist in the subsequent data collection exercise.

Based on the interventions planned by York CC, the list of candidate studies was narrowed by considering factors such as its propensity to induce significant re-routing and its impact on the peak periods. Effectively the motivation here was to identify interventions that were likely to have a large impact on delays, since route choice impacts would then likely be more significant and more easily distinguished from ambient variability. This was notably at odds with the objectives of York CC, in that they wished to minimise disruption, and so where possible York CC planned interventions to take place at times of day and of the year where impacts were minimised; therefore our own requirement greatly reduced the candidate set of studies to monitor.
of the year where impacts were minimised; therefore our own requirement greatly reduced the candidate set of studies to monitor. A further consideration in study selection was its timing in the year, for scheduling before/after surveys so as to avoid confounding effects of known significant 'seasonal' demand changes, e.g. the impact of the change between school semesters and holidays. A further consideration was York's role as a major tourist attraction, which is also known to have a seasonal trend. However, the impact on car traffic is relatively small due to the strong promotion of public transport and restrictions on car travel and parking in the historic centre. We felt that we further mitigated such impacts by subsequently choosing to survey in the morning peak, at a time before most tourist attractions are open.

Aside from the question of which intervention to survey was the issue of what data to collect. Within the resources of the project, we considered several options. We rejected stated preference survey methods as, although they provide a link to personal/socio-economic drivers, we wanted to compare actual behaviour with a network model; if the stated preference data conflicted with the network model, it would not be clear which we should question most. For revealed preference data, options considered included (i) self-completion diaries (Mahmassani and Jou, 2000), (ii) automatic tracking through GPS (Jan et al., 2000; Quiroga et al., 2000; Taylor et al., 2000), and (iii) licence plate surveys (Schaefer, 1988). Regarding self-completion surveys, from our own interview experiments with self-completion questionnaires it was evident that travellers find it relatively difficult to recall and describe complex choice options such as a route through an urban network, giving the potential for significant errors to be introduced. The automatic tracking option was believed to be the most attractive in this respect, in its potential to accurately map a given individual's journey, but the negative side would be the potential sample size, as we would need to purchase/hire and distribute the devices; even with a large budget, it is not straightforward to identify in advance the target users, nor to guarantee their cooperation. Licence plate surveys, it was believed, offered the potential for compromise between sample size and data resolution: while we could not track routes to the same resolution as GPS, by judicious location of surveyors we had the opportunity to track vehicles across more than one location, thus providing route-like information. With time-stamped licence plates, the matched data would also provide journey time information. The negative side of this approach is the well-known potential for significant recording errors if large sample rates are required. Our aim was to avoid this by recording only partial licence plates, and employing statistical methods to remove the impact of 'spurious matches', i.e. where two different vehicles with the same partial licence plate occur at different locations. Moreover, extensive simulation experiments (Watling, 1994) had previously shown that these latter statistical methods were effective in recovering the underlying movements and travel times, even if only a relatively small part of the licence plate were recorded, in spite of giving a large potential for spurious matching. We believed that such an approach reduced the opportunity for recorder error to such a level to suggest that a 100% sample rate of vehicles passing may be feasible. This was tested in a pilot study conducted by the project team, with
dictaphones used to record a 100% sample of time-stamped, partial licence plates. Independent, duplicate observers were employed at the same location to compare error rates; the same study was also conducted with full licence plates. The study indicated that 100% surveys with dictaphones would be feasible in moderate traffic flow, but only if partial licence plate data were used in order to control observation errors; for higher flow rates or to obtain full number plate data, video surveys should be considered. Other important practical lessons learned from the pilot included the need for clarity in terms of vehicle types to survey (e.g. whether to include motorcycles and taxis), and of the phonetic alphabet used by surveyors to avoid transcription ambiguities.

Based on the twin considerations above of planned interventions and survey approach, several candidate studies were identified. For a candidate study, detailed design issues involved identifying: likely affected movements and alternative routes (using local knowledge of York CC, together with an existing network model of the city), in order to determine the number and location of survey sites; feasible viewpoints, based on site visits; the timing of surveys, e.g. visibility issues in the dark, winter evening peak period; the peak duration from automatic traffic flow data; and specific survey days, in view of public/school holidays. Our budget led us to survey the majority of licence plate sites manually (partial plates by audio-tape or, in low flows, pen and paper), with video surveys limited to a small number of high-flow sites. From this combination of techniques, a 100% sampling rate was feasible at each site. Surveys took place in the morning peak due both to visibility considerations and to minimise conflicts with tourist/special event traffic. From automatic traffic count data it was decided to survey the period 7:45–9:15 as the main morning peak period. This design process led to the identification of two studies:

2.1. Lendal Bridge study (Fig. 1)

Lendal Bridge, a critical part of York's inner ring road, was scheduled to be closed for maintenance from September 2000 for a duration of several weeks. To avoid school holidays, the 'before' surveys were scheduled for June and early September. It was decided to focus on investigating a significant southwest-to-northeast movement of traffic, the river providing a natural barrier which suggested surveying the six river crossing points (C, J, H, K, L, M in Fig. 1). In total, 13 locations were identified for survey, in an attempt to capture traffic on both sides of the river as well as a crossing.

Fig. 1. Intervention and survey locations for Lendal Bridge study.

2.2. Fishergate study (Fig. 2)

The partial closure (capacity reduction) of the street known as Fishergate, again part of York's inner ring road, was scheduled for July 2001 to allow repairs to a collapsed sewer. Survey locations were chosen in order to intercept clockwise movements around the inner ring road, this being the direction of the partial closure. A particular aim was to capture Fulford Road (site E in Fig. 2), the main radial affected, with F and K monitoring local diversion and I, J capturing wider-area diversion. In both studies, the plan was to survey the selected locations in the morning peak over a period covering the three periods before, during and after the intervention, with the days selected so as to avoid holidays or special events.

Fig. 2. Intervention and survey locations for Fishergate study.

In the Lendal Bridge study, while the 'before' surveys proceeded as planned, the bridge's actual first day of closure on September 11th 2000 also
marked the beginning of the UK fuel protests (BBC, 2000a; Lyons and Chaterjee, 2002). Traffic flows were considerably affected by the scarcity of fuel, with congestion extremely low in the first week of closure, to the extent that any changes could not be attributed to the bridge closure; neither had our design anticipated how to survey the impacts of the fuel shortages. We thus re-arranged our surveys to monitor more closely the planned re-opening of the bridge. Unfortunately these surveys were hampered by a second unanticipated event, namely the wettest autumn in the UK for 270 years and the highest level of flooding in York since records began (BBC, 2000b). The flooding closed much of the centre of York to road traffic, including our study area, as the roads were impassable, and therefore we abandoned the planned 'after' surveys. As a result of these events, the useable data we had (not affected by the fuel protests or flooding) consisted of five 'before' days and one 'during' day.

In the Fishergate study, fortunately no extreme events occurred, allowing six 'before' and seven 'during' days to be surveyed, together with one additional day in the 'during' period when the works were temporarily removed. However, the works over-ran into the long summer school holidays, when it is well-known that there is a substantial seasonal effect of much lower flows and congestion levels. We did not believe it possible to meaningfully isolate the impact of the link fully re-opening while controlling for such an effect, and so our plans for 'after re-opening' surveys were abandoned.

3. Estimation of vehicle movements and travel times

The data resulting from the surveys described in Section 2 is in the form of (for each day and each study) a set of time-stamped, partial licence plates, observed at a number of locations across the network. Since the data include only partial plates, they cannot simply be matched across observation points to yield reliable estimates of vehicle movements, since there is ambiguity in whether the same partial plate observed at different locations was truly caused by the same vehicle.
Indeed, since the observed system is 'open'—in the sense that not all points of entry, exit, generation and attraction are monitored—the question is not just which of several potential matches to accept, but also whether there is any match at all. That is to say, an apparent match between data at two observation points could be caused by two separate vehicles that passed no other observation point. The first stage of analysis therefore applied a series of specially-designed statistical techniques to reconstruct the vehicle movements and point-to-point travel time distributions from the observed data, allowing for all such ambiguities in the data. Although the detailed derivations of each method are not given here, since they may be found in the references provided, it is necessary to understand some of the characteristics of each method in order to interpret the results subsequently provided. Furthermore, since some of the basic techniques required modification relative to the published descriptions, then in order to explain these adaptations it is necessary to understand some of the theoretical basis.

3.1. Graphical method for estimating point-to-point travel time distributions

The preliminary technique applied to each data set was the graphical method described in Watling and Maher (1988). This method is derived for analysing partial registration plate data for unidirectional movement between a pair of observation stations (referred to as an 'origin' and a 'destination'). Thus in the data studied here, it must be independently applied to given pairs of observation stations, without regard for the interdependencies between observation station pairs. On the other hand, it makes no assumption that the system is 'closed'; there may be vehicles that pass the origin that do not pass the destination, and vice versa. While limited in considering only two-point surveys, the attraction of the graphical technique is that it is a non-parametric method, with no assumptions made about the arrival time distributions at the observation points (they may be non-uniform in particular), and no assumptions made about the journey time probability density. It is therefore very suitable as a first means of investigative analysis for such data. The method begins by forming all pairs of possible matches in the data, of which some will be genuine matches (the pair of observations were due to a single vehicle) and the remainder spurious matches. Thus, for example, if there are three origin observations and two destination observations of a particular partial registration number, then six possible matches may be formed, of which clearly no more than two can be genuine (and possibly only one or zero are genuine). A scatter plot may then be drawn for each possible match of the observation time at the origin versus that at the destination. The characteristic pattern of such a plot is as that shown in Fig. 4a, with a dense 'line' of points (which will primarily be the genuine matches) superimposed upon a scatter of points over the whole region (which will primarily be the spurious matches). If we were to assume uniform arrival rates at the observation stations, then the spurious matches would be uniformly distributed over this plot; however, we shall avoid making such a restrictive assumption. The method begins by making a coarse estimate of the total number of genuine matches across the whole of this plot. As part of this analysis we then assume knowledge of, for any randomly selected vehicle, the probabilities:
$h_k = \Pr(\text{vehicle is of the } k\text{th type of partial registration plate}) \quad (k = 1, 2, \ldots, m), \qquad \text{where } \sum_{k=1}^{m} h_k = 1.$
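As a concrete illustration of the matching step (ours, not the paper's code): the sketch below forms all candidate origin–destination pairs for a set of partial-plate sightings, the raw material for the scatter plot just described. The data layout ((partial_plate, timestamp) tuples), the function name and the time-ordering filter are all our own assumptions.

    from collections import defaultdict

    def candidate_matches(origin_obs, dest_obs):
        """Form every possible origin/destination pairing of a partial plate.
        Genuine and spurious matches are not yet distinguished here; that is
        the job of the graphical method applied to the resulting scatter."""
        by_plate = defaultdict(list)
        for plate, t in dest_obs:
            by_plate[plate].append(t)
        pairs = []
        for plate, t_origin in origin_obs:
            for t_dest in by_plate.get(plate, []):
                if t_dest > t_origin:  # a vehicle cannot arrive before it departs
                    pairs.append((t_origin, t_dest))
        return pairs

    # Three origin and two destination sightings of partial plate "A12" give
    # up to six possible matches, of which at most two can be genuine.
    origin = [("A12", 100.0), ("A12", 160.0), ("A12", 300.0)]
    dest = [("A12", 180.0), ("A12", 420.0)]
    print(candidate_matches(origin, dest))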

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection


Ron Kohavi
Computer Science Department
Stanford University
Stanford, CA 94305
ronnyk@CS.Stanford.EDU
http://robotics.stanford.edu/~ronnyk

Abstract

We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment—over half a million runs of C4.5 and a Naive-Bayes algorithm—to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.

1 Introduction

It can not be emphasized enough that no claim whatsoever is being made in this paper that all algorithms are equivalent in practice, in the real world. In particular, no claim is being made that one should not use cross-validation in the real world. — Wolpert (1994a)

Estimating the accuracy of a classifier induced by supervised learning algorithms is important not only to predict its future prediction accuracy, but also for choosing a classifier from a given set (model selection), or combining classifiers (Wolpert 1992). For estimating the final accuracy of a classifier, we would like an estimation method with low bias and low variance. To choose a classifier or to combine classifiers, the absolute accuracies are less important and we are willing to trade off bias for low variance, assuming the bias affects all classifiers similarly (e.g. estimates are 5% pessimistic). (A longer version of the paper can be retrieved by anonymous ftp to starry.stanford.edu: pub/ronnyk/accEst-long.ps.)

In this paper we explain some of the assumptions made by the different estimation methods and present concrete examples where each method fails. While it is known that no accuracy estimation can be correct all the time (Wolpert 1994b, Schaffer 1994), we are interested in identifying a method that is well suited for the biases and trends in typical real-world datasets. Recent results, both theoretical and experimental, have shown that it is not
always the case that increasing the computational cost is beneficial, especially if the relative accuracies are more important than the exact values. For example, leave-one-out is almost unbiased, but it has high variance, leading to unreliable estimates (Efron 1983). For linear models, using leave-one-out cross-validation for model selection is asymptotically inconsistent, in the sense that the probability of selecting the model with the best predictive power does not converge to one as the total number of observations approaches infinity (Zhang 1992, Shao 1993).

This paper is organized as follows. Section 2 describes the common accuracy estimation methods and ways of computing confidence bounds that hold under some assumptions. Section 3 discusses related work comparing cross-validation variants and bootstrap variants. Section 4 discusses the methodology underlying our experiment. The results of the experiments are given in Section 5 with a discussion of important observations. We conclude with a summary in Section 6.

2 Methods for Accuracy Estimation

A classifier is a function that maps an unlabelled instance to a label using internal data structures. An inducer, or an induction algorithm, builds a classifier from a given dataset. CART and C4.5 (Breiman, Friedman, Olshen & Stone 1984, Quinlan 1993) are decision tree inducers that build decision tree classifiers. In this paper we are not interested in the specific method for inducing classifiers, but assume access to a dataset and an inducer of interest.

Let V be the space of unlabelled instances and Y the set of possible labels, let X = V × Y be the space of labelled instances, and let D = {x_1, x_2, ..., x_n} be a dataset (possibly a multiset) consisting of n labelled instances, where x_i = ⟨v_i ∈ V, y_i ∈ Y⟩. A classifier C maps an unlabelled instance v ∈ V to a label y ∈ Y, and an inducer I maps a given dataset D into a classifier C. The notation I(D, v) will denote the label assigned to an unlabelled instance v by the classifier built by inducer I on dataset D. We assume that there exists a distribution on the set of labelled instances and that our dataset consists of i.i.d. (independently and identically distributed) instances. We consider equal misclassification costs using a 0/1 loss function, but the accuracy estimation methods can easily be extended to other loss functions.

The accuracy of a classifier C is the probability of correctly classifying a randomly selected instance, i.e. acc = Pr(C(v) = y) for a randomly selected instance ⟨v, y⟩ ∈ X, where the probability distribution over the instance space is the same as the distribution that was used to select instances for the inducer's training set. Given a finite dataset, we would like to estimate the future performance of a classifier induced by the given inducer and dataset. A single accuracy estimate is usually meaningless without a confidence interval; thus we will consider how to approximate such an interval when possible. In order to identify weaknesses, we also attempt to identify cases where the estimates fail.

2.1 Holdout

The holdout method, sometimes called test sample estimation, partitions the data into two mutually exclusive subsets called a training set and a test set, or holdout set. It is common to designate 2/3 of the data as the training set and the remaining 1/3 as the test set. The training set is given to the inducer, and the induced classifier is tested on the test set. Formally, let D_h, the holdout set, be a subset of D of size h, and let D_t = D \ D_h. The holdout estimated accuracy is defined as

$\mathrm{acc}_h = \frac{1}{h} \sum_{\langle v_i, y_i \rangle \in D_h} \delta(\mathcal{I}(D_t, v_i), y_i)$

where $\delta(i, j) = 1$ if $i = j$ and 0 otherwise. Assuming that the inducer's accuracy increases as more instances are seen, the holdout method is a
pessimistic estimator because only a portion of the data is given to the inducer for training. The more instances we leave for the test set, the higher the bias of our estimate; however, fewer test set instances means that the confidence interval for the accuracy will be wider, as shown below.

Each test instance can be viewed as a Bernoulli trial: correct or incorrect prediction. Let S be the number of correct classifications on the test set; then S is distributed binomially (sum of Bernoulli trials). For reasonably large holdout sets, the distribution of S/h is approximately normal with mean acc (the true accuracy of the classifier) and a variance of acc · (1 − acc)/h. Thus, by the De Moivre–Laplace limit theorem, we have

$\Pr\left(-z < \frac{\mathrm{acc}_h - \mathrm{acc}}{\sqrt{\mathrm{acc}(1 - \mathrm{acc})/h}} < z\right) \approx \gamma \qquad (3)$

where z is the (1 + γ)/2 quantile point of the standard normal distribution. To get a 100γ percent confidence interval, one determines z and inverts the inequalities. Inversion of the inequalities leads to a quadratic equation in acc, the roots of which are the low and high confidence points. The above equation is not conditioned on the dataset D; if more information is available about the probability of the given dataset, it must be taken into account.

The holdout estimate is a random number that depends on the division into a training set and a test set. In random subsampling, the holdout method is repeated k times, and the estimated accuracy is derived by averaging the runs. The standard deviation can be estimated as the standard deviation of the accuracy estimations from each holdout run.

The main assumption that is violated in random subsampling is the independence of instances in the test set from those in the training set. If the training and test sets are formed by a split of an original dataset, then an over-represented class in one subset will be under-represented in the other. To demonstrate the issue, we simulated a 2/3, 1/3 split of Fisher's famous iris dataset and used a majority inducer that builds a classifier predicting the prevalent class in the training set. The iris dataset describes iris plants using four continuous features, and the task is to classify each instance (an iris) as Iris Setosa, Iris Versicolour, or Iris Virginica. For each class label there are exactly one third of the instances with that label (50 instances of each class from a total of 150 instances); thus we expect 33.3% prediction accuracy. However, because the test set will always contain less than 1/3 of the instances of the class that was prevalent in the training set, the accuracy predicted by the holdout method is 21.68%, with a standard deviation of 0.13% (estimated by averaging 500 holdouts).

In practice, the dataset size is always finite, and usually smaller than we would like it to be. The holdout method makes inefficient use of the data: a third of the dataset is not used for training the inducer.
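The inversion of the inequalities in Equation (3) can be carried out in closed form. The sketch below is our own illustration rather than anything from the paper: it returns the two roots of the quadratic in acc (the Wilson score interval); the function name and the 85-correct-out-of-100 example are invented.

    from statistics import NormalDist

    def holdout_interval(s, h, gamma=0.95):
        """Confidence interval for the true accuracy given s correct
        predictions on a holdout set of size h, obtained by inverting the
        normal approximation to the binomial."""
        acc_h = s / h                                  # observed holdout accuracy
        z = NormalDist().inv_cdf((1 + gamma) / 2)      # two-sided quantile point
        centre = acc_h + z * z / (2 * h)
        half = z * (acc_h * (1 - acc_h) / h + z * z / (4 * h * h)) ** 0.5
        denom = 1 + z * z / h
        return (centre - half) / denom, (centre + half) / denom

    print(holdout_interval(85, 100))  # roughly (0.77, 0.91)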
2.2 Cross-Validation, Leave-one-out, and Stratification

In k-fold cross-validation, sometimes called rotation estimation, the dataset D is randomly split into k mutually exclusive subsets (the folds) D_1, D_2, ..., D_k of approximately equal size. The inducer is trained and tested k times; each time t ∈ {1, 2, ..., k}, it is trained on D \ D_t and tested on D_t. The cross-validation estimate of accuracy is the overall number of correct classifications, divided by the number of instances in the dataset.

The cross-validation estimate is a random number that depends on the division into folds. Complete cross-validation is the average of all possibilities for choosing m/k instances out of m, but it is usually too expensive. Except for leave-one-out (n-fold cross-validation), which is always complete, k-fold cross-validation is estimating complete k-fold cross-validation using a single split of the data into the folds. Repeating cross-validation multiple times using different splits into folds provides a better Monte Carlo estimate to the complete cross-validation, at an added cost. In stratified cross-validation, the folds are stratified so that they contain approximately the same proportions of labels as the original dataset.

An inducer is stable for a given dataset and a set of perturbations if it induces classifiers that make the same predictions when it is given the perturbed datasets.

Proposition 1 (Variance in k-fold CV). Given a dataset and an inducer, if the inducer is stable under the perturbations caused by deleting the instances for the folds in k-fold cross-validation, the cross-validation estimate will be unbiased and the variance of the estimated accuracy will be approximately acc_cv · (1 − acc_cv)/n, where n is the number of instances in the dataset.

Proof. If we assume that the k classifiers produced make the same predictions, then the estimated accuracy has a binomial distribution with n trials and probability of success equal to the accuracy of the classifier.

For large enough n, a confidence interval may be computed using Equation 3 with h equal to n, the number of instances.

In reality, a complex inducer is unlikely to be stable for large perturbations unless it has reached its maximal learning capacity. We expect the perturbations induced by leave-one-out to be small, and therefore the classifier should be very stable. As we increase the size of the perturbations, stability is less likely to hold: we expect stability to hold more in 20-fold cross-validation than in 10-fold cross-validation, and both should be more stable than holdout of 1/3. The proposition does not apply to the resubstitution estimate because it requires the inducer to be stable when no instances are given in the dataset.

The above proposition helps understand one possible assumption that is made when using cross-validation: if an inducer is unstable for a particular dataset under the set of perturbations introduced by cross-validation, the accuracy estimate is likely to be unreliable. If the inducer is almost stable on a given dataset, we should expect a reliable estimate. The next corollary takes the idea slightly further and shows a result that we have observed empirically: there is almost no change in the variance of the cross-validation estimate when the number of folds is varied.

Corollary 2 (Variance in cross-validation). Given a dataset and an inducer, if the inducer is stable under the perturbations caused by deleting the test instances for the folds in k-fold cross-validation for various values of k, then the variance of the estimates will be the same.

Proof. The variance of k-fold cross-validation in Proposition 1 does not depend on k.

While some inducers are likely to be inherently more stable, the following example shows that one must also take into account the dataset and the actual perturbations.

Example 1 (Failure of leave-one-out). Fisher's iris dataset contains 50 instances of each class, leading one to expect that a majority inducer should have accuracy about 33%. However, the combination of this dataset with a majority inducer is unstable for the small perturbations performed by leave-one-out. When an instance is deleted from the dataset, its label is a minority in the training set; thus the majority inducer predicts one of the other two classes and always errs in classifying the test instance. The leave-one-out estimated accuracy for a majority inducer on the iris dataset is therefore 0%. Moreover, all
folds have this estimated accuracy; thus the standard deviation of the folds is again 0%, giving the unjustified assurance that the estimate is stable.

The example shows an inherent problem with cross-validation that applies to more than just a majority inducer. In a no-information dataset, where the label values are completely random, the best an induction algorithm can do is predict majority. Leave-one-out on such a dataset with 50% of the labels for each class and a majority inducer (the best possible inducer) would still predict 0% accuracy.
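Example 1 is easy to reproduce. The following is our own minimal demonstration (the function name and label strings are invented), confirming that leave-one-out with a majority inducer scores 0% on perfectly balanced labels:

    from collections import Counter

    def loo_majority_accuracy(labels):
        """Leave-one-out accuracy of a majority inducer: deleting an instance
        always makes its class a minority in the training set, so the
        prediction is always wrong on balanced data."""
        correct = 0
        for i, y in enumerate(labels):
            train = labels[:i] + labels[i + 1:]
            majority = Counter(train).most_common(1)[0][0]
            correct += (majority == y)
        return correct / len(labels)

    # 50 instances of each of three classes, as in the iris dataset
    labels = ["setosa"] * 50 + ["versicolour"] * 50 + ["virginica"] * 50
    print(loo_majority_accuracy(labels))  # 0.0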
estimates of the risks of two rules tends to be much more accurate than the two estimates themselves "Jain, Dubes fa Chen (1987) compared the performance of the t0 bootstrap and leave-one-out cross-validation on nearest neighbor classifiers Using artificial data and claimed that the confidence interval of the bootstrap estimator is smaller than that of leave-one-out Weiss (1991) followed similar lines and compared stratified cross-validation and two bootstrap methods with near-est neighbor classifiers His results were that stratified two-fold cross validation is relatively low variance and superior to leave-one-outBreiman fa Spector (1992) conducted a feature sub-set selection experiments for regression, and compared leave-one-out cross-validation, A:-fold cross-validation for various k, stratified K-fold cross-validation, bias-corrected bootstrap, and partial cross-validation (not discussed here) Tests were done on artificial datasets with 60 and 160 instances The behavior observed was (1) the leave-one-out has low bias and RMS (root mean square) error whereas two-fold and five-fold cross-validation have larger bias and RMS error only at models with many features, (2) the pessimistic bias of ten-fold cross-validation at small samples was significantly re-duced for the samples of size 160 (3) for model selection, ten-fold cross-validation is better than leave-one-out Bailey fa E lkan (1993) compared leave-one-out cross-ahdation to 632 bootstrap using the FOIL inducer and four synthetic datasets involving Boolean concepts They observed high variability and little bias in the leave-one-out estimates, and low variability but large bias in the 632 estimatesWeiss and Indurkyha (Weiss fa Indurkhya 1994) con-ducted experiments on real world data Lo determine the applicability of cross-validation to decision tree pruning Their results were that for samples at least of size 200 using stratified ten-fold cross-validation to choose the amount of pruning yields unbiased trees (with respect to their optimal size) 4 M e t h o d o l o g yIn order to conduct a large-scale experiment we decided to use 04 5 and a Naive Bayesian classifier The C4 5 algorithm (Quinlan 1993) is a descendent of ID3 that builds decision trees top-down The Naive-Bayesian clas-sifier (Langley, Iba fa Thompson 1992) used was the one implemented in (Kohavi, John, Long, Manley fa Pfleger 1994) that uses the observed ratios for nominal features and assumes a Gaussian distribution for contin-uous features The exact details are not crucial for this paper because we are interested in the behavior of the accuracy estimation methods more than the internals of the induction algorithms The underlying hypothe-sis spaces—decision trees for C4 5 and summary statis-tics for Naive-Bayes—are different enough that we hope conclusions based on these two induction algorithms will apply to other induction algorithmsBecause the target concept is unknown for real-world1140 LEARNINGconcepts, we used the holdout method to estimate the quality of the cross-validation and bootstrap estimates To choose & set of datasets, we looked at the learning curves for C4 5 and Najve-Bayes for most of the super-vised classification dataaets at the UC Irvine repository (Murphy & Aha 1994) that contained more than 500 instances (about 25 such datasets) We felt that a min-imum of 500 instances were required for testing While the true accuracies of a real dataset cannot be computed because we do not know the target concept, we can esti mate the true accuracies using the 
holdout method The "true' accuracy estimates in Table 1 were computed by taking a random sample of the given size computing the accuracy using the rest of the dataset as a test set, and repeating 500 timesWe chose six datasets from a wide variety of domains, such that the learning curve for both algorithms did not flatten out too early that is, before one hundred instances We also added a no inform a tion d l stt, rand, with 20 Boolean features and a Boolean random label On one dataset vehicle, the generalization accu-racy of the Naive-Bayes algorithm deteriorated hy morethan 4% as more instances were g;iven A similar phenomenon was observed on the shuttle dataset Such a phenomenon was predicted by Srhaffer and Wolpert (Schaffer 1994, Wolpert 1994), but we were surprised that it was observed on two real world datasetsTo see how well an Accuracy estimation method per forms we sampled instances from the dataset (uniformly without replacement) and created a training set of the desired size We then ran the induction algorihm on the training set and tested the classifier on the rest of the instances L E I the dataset This was repeated 50 times at points where the lea rning curve wa s sloping up The same folds in cross-validation and the same samples m bootstrap were used for both algorithms compared5 Results and DiscussionWe now show the experimental results and discuss their significance We begin with a discussion of the bias in the estimation methods and follow with a discussion of the variance Due to lack of space, we omit some graphs for the Naive-Bayes algorithm when the behavior is ap-proximately the same as that of C 4 5 5 1 T h e B i a sThe bias of a method to estimate a parameter 0 is de-fined as the expected value minus the estimated value An unbiased estimation method is a method that has zero bias Figure 1 shows the bias and variance of k-fold cross-validation on several datasets (the breast cancer dataset is not shown)The diagrams clearly show that k-fold cross-validation is pessimistically biased, especially for two and five folds For the learning curves that have a large derivative at the measurement point the pessimism in k-fold cross-Figure ] C'4 5 The bias of cross-validation with varying folds A negative K folds stands for leave k-out E rror bars are 95% confidence intervals for (he mean The gray regions indicate 95 % confidence intervals for the true ac curaries Note the different ranges for the accuracy axis validation for small k s is apparent Most of the esti-mates are reasonably good at 10 folds and at 20 folds they art almost unbiasedStratified cross validation (not shown) had similar be-havior, except for lower pessimism The estimated accu-racy for soybe an at 2 fold was 7% higher and at five-fold, 1 1% higher for vehicle at 2-fold, the accuracy was 2 8% higher and at five-fold 1 9% higher Thus stratification seems to be a less biased estimation methodFigure 2 shows the bias and variance for the b32 boot-strap accuracy estimation method Although the 632 bootstrap is almost unbiased for chess hypothyroid, and mushroom for both inducers it is highly biased for soy-bean with C'A 5, vehicle with both inducers and rand with both inducers The bias with C4 5 and vehicle is 9 8%5 2 The VarianceWhile a given method may have low bias, its perfor-mance (accuracy estimation in our case) may be poor due to high variance In the experiments above, we have formed confidence intervals by using the standard de-viation of the mea n a ccura cy We now switch to the standard deviation of 
the population, i.e., the expected standard deviation of a single accuracy estimation run. In practice, if one does a single cross-validation run, the expected accuracy will be the mean reported above, but the standard deviation will be higher by a factor of √50, the number of runs we averaged in the experiments.

Table 1. True accuracy estimates for the datasets using C4.5 and Naive-Bayes classifiers at the chosen sample sizes.

Figure 2. C4.5: The bias of bootstrap with varying samples. Estimates are good for mushroom, hypothyroid, and chess, but are extremely biased (optimistically) for vehicle and rand, and somewhat biased for soybean.

In what follows, all figures for standard deviation will be drawn with the same range for the standard deviation: 0 to 7.5%. Figure 3 shows the standard deviations for C4.5 and Naive-Bayes using varying numbers of folds for cross-validation. The results for stratified cross-validation were similar, with slightly lower variance. Figure 4 shows the same information for the .632 bootstrap.

Cross-validation has high variance at 2 folds on both C4.5 and Naive-Bayes. On C4.5, there is high variance at the high ends too—at leave-one-out and leave-two-out—for three files out of the seven datasets. Stratification reduces the variance slightly, and thus seems to be uniformly better than cross-validation, both for bias and variance.

Figure 3. Cross-validation: standard deviation of accuracy (population). Different line styles are used to help differentiate between curves.

Figure 4. .632 bootstrap: standard deviation of accuracy (population).

6 Summary

We reviewed common accuracy estimation methods including holdout, cross-validation, and bootstrap, and showed examples where each one fails to produce a good estimate. We have compared the latter two approaches on a variety of real-world datasets with differing characteristics.

Proposition 1 shows that if the induction algorithm is stable for a given dataset, the variance of the cross-validation estimates should be approximately the same, independent of the number of folds. Although the induction algorithms are not stable, they are approximately stable. k-fold cross-validation with moderate k values (10–20) reduces the variance while increasing the bias. As k decreases (2–5) and the sample sizes get smaller, there is variance due to the instability of the training sets themselves, leading to an increase in variance; this is most apparent for datasets with many categories, such as soybean. In these situations, stratification seems to help, but repeated runs may be a better approach.

Our results indicate that stratification is generally a better scheme, both in terms of bias and variance, when compared to regular cross-validation. Bootstrap has low variance but extremely large bias on some problems. We recommend using stratified ten-fold cross-validation for model selection.

Acknowledgments

We thank David Wolpert for a thorough reading of this paper and many interesting discussions. We thank Tom Bylander, Brad Efron, Jerry Friedman, Rob Holte, George John, Pat Langley, Rob Tibshirani, and Sholom Weiss for their helpful comments and suggestions. Dan Sommerfield implemented the bootstrap method in MLC++. All experiments were conducted using MLC++, partly funded by ONR grant N00014-94-1-0448 and NSF grants IRI-9116399 and IRI-941306.

References

Bailey, T. L. & Elkan, C. (1993), Estimating the accuracy of learned concepts, in 'Proceedings of the International Joint Conference on Artificial Intelligence', Morgan Kaufmann Publishers, pp. 895–900.
Breiman, L. & Spector, P. (1992), 'Submodel selection and evaluation in regression: the X-random case', International Statistical Review 60(3), 291–319.
Breiman, L., Friedman, J. H., Olshen, R. A. & Stone, C. J. (1984), Classification and Regression Trees, Wadsworth International Group.
Efron, B. (1983), 'Estimating the error rate of a prediction rule: improvement on cross-validation', Journal of the American Statistical Association 78(382), 316–330.
Efron, B. & Tibshirani, R. (1993), An Introduction to the Bootstrap, Chapman & Hall.
Jain, A. K., Dubes, R. C. & Chen, C. (1987), 'Bootstrap techniques for error estimation', IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9(5), 628–633.
Kohavi, R., John, G., Long, R., Manley, D. & Pfleger, K. (1994), MLC++: A machine learning library in C++, in 'Tools with Artificial Intelligence', IEEE Computer Society Press, pp. 740–743. Available by anonymous ftp from starry.stanford.edu: pub/ronnyk/mlc/toolsmlc.ps.
Langley, P., Iba, W. & Thompson, K. (1992), An analysis of Bayesian classifiers, in 'Proceedings of the Tenth National Conference on Artificial Intelligence', AAAI Press and MIT Press, pp. 223–228.
Murphy, P. M. & Aha, D. W. (1994), UCI repository of machine learning databases. For information contact ml-repository@ics.uci.edu.
Quinlan, J. R. (1993), C4.5: Programs for Machine Learning, Morgan Kaufmann, Los Altos, California.
Schaffer, C. (1994), A conservation law for generalization performance, in 'Machine Learning: Proceedings of the Eleventh International Conference', Morgan Kaufmann, pp. 259–265.
Shao, J. (1993), 'Linear model selection via cross-validation', Journal of the American Statistical Association 88(422), 486–494.
Weiss, S. M. (1991), 'Small sample error rate estimation for k-nearest neighbor classifiers', IEEE Transactions on Pattern Analysis and Machine Intelligence 13(3), 285–289.
Weiss, S. M. & Indurkhya, N. (1994), Decision tree pruning: biased or optimal?, in 'Proceedings of the Twelfth National Conference on Artificial Intelligence', AAAI Press and MIT Press, pp. 626–632.
Wolpert, D. H. (1992), 'Stacked generalization', Neural Networks 5, 241–259.
Wolpert, D. H. (1994a), Off-training-set error and a priori distinctions between learning algorithms, Technical Report SFI TR 94-12-121, The Santa Fe Institute.
Wolpert, D. H. (1994b), The relationship between PAC, the statistical physics framework, the Bayesian framework, and the VC framework, Technical report, The Santa Fe Institute, Santa Fe, NM.
Zhang, P. (1992), 'On the distributional properties of model selection criteria', Journal of the American Statistical Association 87(419), 732–737.

k-Fold Cross-Validation


Abstract:
1. Introduction to cross-validation
2. The principle of cross-validation
3. Applications of cross-validation in machine learning
4. Advantages and limitations of cross-validation
5. Summary

Main text:

Cross-validation is a data analysis method widely used in statistics and machine learning. Its main idea is to divide a dataset into a training set and a validation set and, through repeated rounds of training and validation, evaluate model performance and select the best model. Cross-validation helps us evaluate a model's generalisation ability more accurately and thereby avoid overfitting.

The principle of cross-validation is to divide the dataset D into K non-overlapping subsets, each of which is called a fold. In each round, one fold is chosen as the validation set and the remaining K-1 folds serve as the training set. In this way we obtain K models, each evaluated on a different validation set. Finally, based on the performance metrics of these K models (such as accuracy and recall), we select the best-performing model as the final model.

Cross-validation is used very widely in machine learning, especially in the model selection and parameter tuning stages. Through cross-validation, we can evaluate how different models and parameter combinations perform on the validation sets, and thereby find the best model and parameters. In addition, cross-validation can be used to compare the performance of different algorithms, providing a point of reference for practical application scenarios.
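One common way to automate such a search is a grid search over cross-validated scores. The sketch below is again our own illustration under the same scikit-learn assumption; the estimator and parameter grid are arbitrary:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Every (C, kernel) combination is scored by 5-fold cross-validation;
    # the best combination is then refit on the full dataset.
    param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
    search = GridSearchCV(SVC(), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_, search.best_score_)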

Cross-validation has the following advantages:
1. It reduces the influence of dataset size on model performance evaluation, improving the accuracy of the evaluation.
2. It helps to detect overfitting, improving the model's generalisation ability.
3. It allows model performance to be compared across different data sets, improving the reliability of model selection and parameter tuning.

However, cross-validation also has certain limitations:
1. The computational cost is high; especially with large datasets and complex models, the amount of computation may become a limiting factor.
2. The choice of K is somewhat a matter of experience; different values of K may lead to different evaluation results, affecting the accuracy of model selection and parameter tuning.

In summary, cross-validation is a practical and effective data analysis method with broad application value in the field of machine learning.

MDA: A Formal Approach to Game Design and Game Research

AI coders and researchers are no exception. Seemingly inconsequential decisions about data, representation, algorithms, tools, vocabulary and methodology will trickle upward, shaping the final gameplay. Similarly, all desired user experience must bottom out, somewhere, in code. As games continue to generate increasingly complex agent, object and system behavior, AI and game design merge.
Abstract
In this paper we present the MDA framework (standing for Mechanics, Dynamics, and Aesthetics), developed and taught as part of the Game Design and Tuning Workshop at the Game Developers Conference, San Jose 2001-2004.
The MDA framework formalizes the consumption of games by breaking them into their distinct components:
Rules
System
“Fun”
…and establishing their design counterparts:

Mechanics

Dynamics

Aesthetics

IEEE 1222-2004 IEEE Standard for All-Dielectric Self-Supporting Fiber Optic Cable


IEEE Std 1222™-2004 — IEEE Standard for All-Dielectric Self-Supporting Fiber Optic Cable

Sponsor: Power System Communications Committee of the IEEE Power Engineering Society

Approved 10 December 2003 by the IEEE-SA Standards Board; approved 31 March 2004 by the American National Standards Institute (ANSI); published 30 July 2004.

Abstract: Construction, mechanical, electrical, and optical performance, installation guidelines, acceptance criteria, test requirements, environmental considerations, and accessories for an all-dielectric, nonmetallic, self-supporting fiber optic (ADSS) cable are covered in this standard. The ADSS cable is designed to be located primarily on overhead utility facilities. This standard provides both construction and performance requirements that ensure, within the guidelines of the standard, that the dielectric capabilities of the cable components and maintenance of optical fiber integrity and optical transmissions are proper. This standard may involve hazardous materials, operations, and equipment. It does not purport to address all of the safety issues associated with its use, and it is the responsibility of the user to establish appropriate safety and health practices and to determine the applicability of regulatory limitations prior to use.

Keywords: aeolian vibration, aerial cables, all-dielectric self-supporting (ADSS), buffer, cable reels, cable safety, cable thermal aging, dielectric, distribution lines, electric fields, electrical stress, fiber optic cable, galloping, grounding, hardware, high voltage, optical ground wire (OPGW), plastic cable, sag and tension, self-supporting, sheave test, span length, stringing procedures, temperature cycle test, tracking, transmission lines, ultraviolet (UV) deterioration
Introduction

(This introduction is not a part of IEEE Std 1222-2003, IEEE Standard for All-Dielectric Self-Supporting Fiber Optic Cable.)

All-dielectric self-supporting (ADSS) fiber optic cables are being installed throughout the power utility industry. Because of the unique service environment and design of these cables, many new requirements are necessary to ensure proper design and application of these cables. In order to develop an industry-wide set of requirements and tests, the Fiber Optic Standards Working Group, under the direction of the Fiber Optic Subcommittee of the Communications Committee, brought together the expertise of key representatives from throughout the industry. These key people are from each manufacturer of ADSS cables and a cross section of the end users. All manufacturers and all known users were invited to participate in preparing this standard.

The preparation of this standard occurred over a period of several years, and participation changed throughout that time as companies and individuals changed interests and positions. Effort was always made to include key individuals from each and every manufacturing concern, major user groups, and consulting firms. Membership and participation was open to everyone who had an interest in the standard, and all involvement was encouraged. This worldwide representation helps to ensure that this standard reflects the entire industry.

As ADSS fiber optic cables are a new and changing technology, the working group is continuing to work on new revisions to this standard as the need arises.
Contents

1. Overview
   1.1 Scope
2. ADSS cable and components
   2.1 Description
   2.2 Support systems
   2.3 Fiber optic cable core
   2.4 Optical fibers
   2.5 Buffer construction
   2.6 Color coding
   2.7 Jackets
3. Test requirements
   3.1 Cable tests
   3.2 Fiber tests
4. Test methods
   4.1 Cable tests
   4.2 Fiber tests
5. Sag and tension list
6. Field acceptance testing
   6.1 Fiber continuity
   6.2 Attenuation
   6.3 Fiber length
7. Installation recommendations
   7.1 Installation procedure for ADSS
   7.2 Electric field strength
   7.3 Span lengths
   7.4 Sag and tension
   7.5 Stringing sheaves
   7.6 Maximum stringing tension
   7.7 Handling
   7.8 Hardware and accessories
   7.9 Electrical stress
8. Cable marking and packaging requirements
   8.1 Reels
   8.2 Cable end requirements
   8.3 Cable length tolerance
   8.4 Certified test data
   8.5 Reel tag
   8.6 Cable marking
   8.7 Cable remarking
   8.8 Identification marking
   8.9 SOCC
Annex A (informative) Electrical test
Annex B (informative) Aeolian vibration test
Annex C (informative) Galloping test
Annex D (informative) Sheave test (ADSS)
Annex E (informative) Temperature cycle test
Annex F (informative) Cable thermal aging test
Annex G (informative) Bibliography
Overview1.1 ScopeThis standard covers the construction, mechanical, electrical, and optical performance, installation guidelines, acceptance criteria, test requirements, environmental considerations, and accessories for an all-dielectric, nonmetallic, self-supporting fiber optic (ADSS) cable. The ADSS cable is designed to be located primarily on overhead utility facilities.The standard provides both construction and performance requirements that ensure within the guidelines of the standard that the dielectric capabilities of the cable components and maintenance of optical fiber integ-rity and optical transmissions are proper.This standard may involve hazardous materials, operations, and equipment. This standard does not purport to address all of the safety issues associated with its use. It is the responsibility of the user of this standard to establish appropriate safety and health practices and to determine the applicability of regulatory limitations prior to use.2. ADSS cable and components2.1 DescriptionThe ADSS cable shall consist of coated glass optical fibers contained in a protective dielectric fiber optic unit surrounded by or attached to suitable dielectric strength members and jackets. The cable shall not con-tain metallic components. The cable shall be designed to meet the design requirements of the optical cable under all installation conditions, operating temperatures, and environmental loading.2.2 Support systemsa)ADSS cable shall contain support systems that are integral to the cable. The purpose of the supportsystem is to ensure that the cable meets the optical requirements under all specified installation con-ditions, operating temperatures, and environmental loading for its design life. This standard excludes any “lashed” type of cables.Copyright © 2004 IEEE. All rights reserved.1IEEEStd 1222-2003IEEE STANDARD FOR ALL-DIELECTRICb)The basic annular construction may have aramid or other dielectric strands or a channeled dielectricrod as a support structure. In addition, other cable elements, such as central members, may be load bearing.c)Figure-8 constructions may have a dielectric messenger and a fiber optic unit, both of which share acommon outer jacket. In addition, other cable elements, such as central members, may be load bearing.d)Helically stranded cable systems may consist of a dielectric optical cable prestranded around adielectric messenger.e)The design load of the cable shall be specified so that support hardware can be manufactured to per-form under all environmental loading conditions. For zero fiber strain cable designs, the design load is defined as the load at which the optical fibers begin to elongate. For other cable designs, the design load is defined as the load at which the measured fiber strain reaches a predetermined level.f)Other designs previously not described are not excluded from this specification.2.3 Fiber optic cable coreThe fiber optic cable core shall be made up of coated glass optical fibers housed to protect the fibers from mechanical, environmental, and electrical stresses. Materials used within the core shall be compatible with one another, shall not degrade under the electrical stresses to which they may be exposed, and shall not evolve hydrogen sufficient to degrade optical performance of fibers within the cable.2.3.1 Fiber strain allowanceThe cable core shall be designed such that fiber strain does not exceed the limit allowed by the cable manu-facturer under the operational design limits of the cable. 
Maximum allowable fiber strain will generally be a function of the proof test level and strength and fatigue parameters of the coated glass fiber.2.3.2 Central structural elementIf a central structural element is necessary, it shall be of reinforced plastic, epoxiglass, or other dielectric material. If required, this element shall provide the necessary tensile strength to limit axial stress on the fibers and minimize fiber buckling due to cable contraction at low temperatures.2.3.3 Buffer tube filling compoundLoose buffer tubes shall be filled with a suitable compound compatible with the tubing material, fiber coat-ing, and coloring to protect the optical fibers and prevent moisture ingress.2.3.4 Cable core filling/flooding compoundThe design of the cable may include a suitable filling/flooding compound in the interstices to prohibit water migration along the fiber optic cable core. The filling compound shall be compatible with all components with which it may come in contact.2.3.5 Binder/tapeA binder yarn(s) and/or a layer(s) of overlapping nonhygroscopic tape(s) may be used to hold the cable core elements in place during application of the jacket.2Copyright © 2004 IEEE. All rights reserved.IEEE SELF-SUPPORTING FIBER OPTIC CABLE Std 1222-20032.3.6 Inner jacketA protective inner jacket or jackets of a suitable material may be applied over the fiber optic cable core, iso-lating the cable core from any external strength elements and the cable outer jacket.2.4 Optical fibersSingle-mode fibers, dispersion-unshifted, dispersion-shifted, or nonzero dispersion-shifted, and multimode fibers with 50/125 mm or 62.5/125 mm core/clad diameters are considered in this standard. The core and the cladding shall consist of glass that is predominantly silica (SiO2). The coating, usually made from one or more plastic materials or compositions, shall be provided to protect the fiber during manufacture, handling, and use.2.5 Buffer constructionThe individually coated optical fiber(s) or fiber ribbon(s) may be surrounded by a buffer for protection from physical damage during fabrication, installation, and performance of the ADSS. Loose buffer or tight buffer construction are two types of protection that may be used to isolate the fibers. The fiber coating and buffer shall be strippable for splicing and termination.2.5.1 Loose bufferLoose buffer construction shall consist of a tube or channel that surrounds each fiber or fiber group. The inside of the tube or channel shall be filled with a filling compound.2.5.2 Tight buffer constructionTight buffer construction shall consist of a suitable material that comes in contact with the coated fiber. 2.6 Color codingColor coding is essential for identifying individual optical fibers and groups of optical fibers. The colors shall be in accordance with TIA/EIA 598-A-1995 [B43].12.6.1 Color performanceThe original color coding system shall be discernible and permanent, in accordance with EIA359-A-1985[B3], throughout the design life of the cable, when cleaned and prepared per manufacturer’s recommendations.2.7 JacketsThe outer jacket shall be designed to house and protect the inner elements of the cable from damage due to moisture, sunlight, environmental, thermal, mechanical, and electrical stresses.a)The jacket material shall be dielectric, non-nutrient to fungus, and meet the requirements of3.1.1.13. 
The jacket material may consist of a polyethylene that shall contain carbon black and anantioxidant.b)The jacket shall be extruded over the underlying element and shall be of uniform diameter to prop-erly fit support hardware. The extruded surface shall be smooth for minimal ice buildup.1The numbers in brackets correspond to those of the bibliography in Annex G.Copyright © 2004 IEEE. All rights reserved.3Std 1222-2003IEEE STANDARD FOR ALL-DIELECTRICc)The cable jacket shall be suitable for application in electrical fields as defined in this clause anddemonstrated in 3.1.1.3.Class A: Where the level of electrical stress on the jacket does not exceed 12 kV spacepotential.Class B: Where the level of electrical stress on the jacket may exceed 12 kV space potential. NOTE—See 7.9 for additional deployment details.23. Test requirementsEach requirement in this clause is complementary to the corresponding paragraph in Clause4 that describesa performance verification or test procedure.3.1 Cable tests3.1.1 Design testsAn ADSS cable shall successfully pass the following design tests. However, design tests may be waived at the option of the user if an ADSS cable of identical design has been previously tested to demonstrate the capability of the manufacturer to furnish cable with the desired performance characteristics.3.1.1.1 Water blocking testA water block test for cable shall be performed in accordance with 4.1.1.1. No water shall leak through the open end of the 1 m sample. If the first sample fails, one additional 1 m sample, taken from a section of cable adjacent to the first sample, may be tested for acceptance.3.1.1.2 Seepage of filling/flooding compoundFor filled/flooded fiber optic cable, a seepage of filling/flooding compound test shall be performed in accor-dance with 4.1.1.2. The filling and flooding compound shall not flow (drip or leak) at 65 o C.3.1.1.3 Electrical testsElectrical tests shall be performed for Class B cables in accordance with 4.1.1.3. Tracking on the outside of the sheath resulting in erosion at any point that exceeds more than 50% of the wall thickness shall constitutea failure.3.1.1.4 Aeolian vibration testAn aeolian vibration test shall be carried out in accordance with 4.1.1.4. Any damage that will affect the mechanical performance of the cable or causes permanent or temporary increase in optical attenuation greater than 1.0 dB/km of the tested fibers at 1550 nm for single-mode fibers and at 1300 nm for multimode fibers shall constitute failure.2Notes in text, tables, and figures are given for information only and do not contain requirements needed to implement the standard.3.1.1.5 Galloping testA galloping test shall be carried out in accordance with 4.1.1.5. Any damage that will affect the mechanical performance of the cable or causes permanent or temporary increase in optical attenuation greater than 1.0dB/km of the tested fibers at 1550 nm for single-mode fibers and at 1300 nm for multimode fibers shall constitute failure.3.1.1.6 Sheave testA sheave test shall be carried out in accordance with 4.1.1.6. Any significant damage to the ADSS cable shall constitute failure. 
A permanent increase in optical attenuation greater than 1.0 dB/km of the tested fibers at 1550nm for single-mode fibers and at 1300 nm for multimode fibers shall constitute failure.Or successful completion of the following three tests may be a substitute for the sheave test:a)Tensile strength of a cable: The maximum increase in attenuation shall not be greater than 0.10 dBfor single-mode and 0.20 dB for multimode fibers when the cable is subjected to the maximum cable rated tensile load.b)Cable twist: The cable shall be capable of withstanding mechanical twisting without experiencingan average increase in attenuation greater than 0.10 dB for single-mode and 0.20 dB for multimode fibers.c)Cable cyclic flexing: The cable sample shall be capable of withstanding mechanical flexing withoutexperiencing an average increase in attenuation greater than 0.10 dB for single-mode and 0.20 dB for multimode fibers.3.1.1.7 Crush test and impact test3.1.1.7.1 Crush testA crush test shall be performed in accordance with 4.1.1.7.1. A permanent or temporary increase in optical attenuation value greater than 0.2 dB change in sample at 1550 nm for single-mode fibers and 0.4 dB at 1300nm for multimode fibers shall constitute failure.3.1.1.7.2 Impact testAn impact test shall be performed in accordance with 4.1.1.7.2. A permanent increase in optical attenuation value greater than 0.2 dB change in sample at 1550 nm for single-mode and 0.4 dB at 1300 nm for multi-mode fibers shall constitute failure.3.1.1.8 Creep testA creep test shall be carried out in accordance with 4.1.1.8. Values shall correspond with the manufacturer’s recommendations.3.1.1.9 Stress/strain testA stress/strain test shall be carried out in accordance with 4.1.1.9. The maximum rated cable load (MRCL), maximum rated cable strain (MRCS), and maximum axial fiber strain specified by the manufacturer for their cable design shall be verified. Any visual damage to the cable or permanent or temporary increase in optical attenuation greater than 0.10 dB at 1550 nm for single-mode fiber and 0.20 dB at 1300 nm for multimode fibers shall constitute failure.Std 1222-2003IEEE STANDARD FOR ALL-DIELECTRIC 3.1.1.10 Cable cutoff wavelength (single-mode fiber)The cutoff wavelength of the cabled fiber, λcc, shall be less than 1260 nm.3.1.1.11 Temperature cycle testOptical cables shall maintain mechanical and optical integrity when exposed to the following temperature extremes: –40 o C to +65 o C.The change in attenuation at extreme operational temperatures for single-mode fibers shall not be greater than 0.20 dB/km, with 80% of the measured values no greater than 0.10 dB/km. For single-mode fibers, the attenuation change measurements shall be made at 1550 nm.For multimode fibers, the change shall not be greater than 0.50 dB/km, with 80% of the measured values no greater than 0.25 dB/km. 
The multimode fiber measurements shall be made at 1300 nm unless otherwise specified.A temperature cycle test shall be performed in accordance with 4.1.1.11.3.1.1.12 Cable aging testThe cable aging test shall be a continuation of the temperature cycle test.The change in attenuation from the original values observed before the start of the temperature cycle test shall not be greater than 0.40 dB/km, with 80% of the measured values no greater than 0.20 dB/km for sin-gle-mode fibers.For multimode fibers, the change in attenuation shall not be greater than 1.00 dB/km, with 80% of the mea-sured values no greater than 0.50 dB/km.There shall be no discernible difference between the jacket identification and length marking colors of the aged sample relative to those of an unaged sample of the same cable. The fiber coating color(s) and unit/bun-dle identifier color(s) shall be in accordance with TIA/EIA 598-A-1992 [B43].A cable aging test shall be performed in accordance with 4.1.1.12.3.1.1.13 Ultraviolet (UV) resistance testThe cable and jacket system is expected to perform satisfactorily in the user-specified environment into which the cable is being placed into service. Because of the numerous possible environmental locations available, it is the user’s and supplier’s joint responsibility to provide the particular performance requirements of each installation location. These performance criteria are for nonsevere environments. The IEC 60068-2-1[B12] performance standards should be used to define particular environmental testing requirements for each unique location.The cable jacket shall meet the following requirements:Where carbon black is used as a UV damage inhibitor, the cable shall have a minimum absorption coeffi-cient of 0.32 per meter.Where the other cable UV blocking systems are being employed, the cable shalla)Meet the equivalent UV performance of carbon black at 0.32 per meterb)Meet the performance requirements as stated in 4.1.1.13 for IEC 60068-2-1 [B12] testing。

Metacognitive instruction for helping less-skilled listeners

Jeremy Cross

ELT Journal Volume 65/4 October 2011; doi:10.1093/elt/ccq073. © The Author 2010. Published by Oxford University Press; all rights reserved. Advance Access publication December 23, 2010.

This article reports on a small-scale study of the effect of metacognitive instruction on listeners' comprehension. Twenty adult, Japanese, advanced-level EFL learners participated in a task sequence, or 'pedagogical cycle', of predicting, monitoring, problem identification, and evaluating in each of five listening lessons aimed at promoting their comprehension of television news items. A comparison of pre-test and post-test scores illustrated that three of four less-skilled listeners made notable gains across the five lessons, whereas only one of four more-skilled listeners improved. Findings add support to the view that metacognitive instruction utilizing a pedagogical cycle may help less-skilled listeners to develop their listening ability, though there seems to be a threshold for higher skill levels beyond which effects are minimal.

Introduction

Teaching and learning about listening comprehension[1] in the language classroom is not an easy undertaking. Modern coursebooks often recommend the explicit teaching of listening strategies as a way of facilitating less-skilled listeners' understanding. While this approach may be of benefit, it has a narrow focus on strategy awareness and use alone. As such, strategy instruction does not really go far enough in providing learners with adequate knowledge about the nature of L2 listening, associated challenges, and the cognitive and emotional factors involved. If less-skilled listeners are provided with guidance and regular opportunities to find out about and explore these key aspects, they can become more capable of controlling and evaluating their own listening development (Goh and Taib 2006).

Teaching that focuses on actively eliciting and promoting learners' knowledge of themselves as L2 listeners and their understanding of the characteristics and demands of listening in an L2, and which provides them with direction about ways to discover how to manage their listening comprehension, is called 'metacognitive instruction' (ibid.).

Metacognitive instruction in L2 listening

Before further considering metacognitive instruction in L2 listening, it is helpful to briefly explain the term 'metacognition'. Flavell (1976) states that metacognition essentially refers to thinking about one's own cognitive processes. It has two key aspects: the deliberate or conscious orchestration of cognitive functions, and knowledge and beliefs about cognitive processes. Both of these aspects are associated with a concrete objective. Moreover, Flavell differentiates knowledge and beliefs according to:

1 person factors (for example one's strengths and weaknesses as a learner),
2 task factors (for example the task nature and purpose), and
3 strategic factors (for example strategy types to facilitate task completion).
Wenden (1998) initially pointed to the usefulness of encouraging and guiding metacognitive behaviour in L2 learning. Benefits to learners include the development of knowledge about how to actively achieve success in language learning, and greater awareness of ways to operate as more self-directed learners. However, it is only more recently that metacognitive instruction and its usefulness in enhancing listeners' comprehension ability has been explored. There is a growing interest in implementing metacognitive instruction as it seems not all learners have an understanding of what listening in an L2 involves (Vandergrift 2003).

Two kinds of techniques for metacognitive instruction in listening lessons are suggested by Goh (2008). The first type involves learners reflecting on their listening in diaries and questionnaires, which encourages the development of new knowledge about listening. The second type refers to ways to help listeners systematically experience extracting information from a text and creating meaning. One such way is the use of a task sequence that engages learners in predicting, monitoring, problem identification, and evaluating, and this task sequence is known as a 'pedagogical cycle' (Vandergrift 2004).

Pedagogical cycle

Vandergrift (2007) presents a pedagogical cycle that encourages learners to actively create and check predictions, establish and address gaps in their understanding, and monitor and reflect on their performance. It also provides for plentiful listening practice and can be used with any level of learners, including advanced listeners who are exposed to appropriately challenging and realistic texts (Vandergrift 2004). Furthermore, associated strategies may be practised individually without necessarily requiring their use in combination with other strategies (Field 2000), thus possibly making their use less complicated for learners. Also, the repeated use of this cycle with a variety of texts allows listeners of different levels of ability and rates of learning in a class to make progress and fine-tune their comprehension in their own way and at their own pace.

Two previous classroom-based studies have illustrated the potential benefits of metacognitive instruction using a pedagogical cycle. Goh and Taib (op. cit.) conducted a small-scale study of the development of a group of ten primary school ESL learners over eight listening lessons. The results from a pre-test and post-test showed that only one of the learners (the most-skilled listener in the group) failed to improve across the study, with less-skilled listeners in particular making the greatest improvements. In another study, Vandergrift and Tafaghodtari (2010) assessed the listening comprehension of 106 tertiary-level high-beginner and lower-intermediate learners of French as an L2 before and after a course of 13 weeks. Pre- and post-test scores indicated that learners who participated in a pedagogical cycle in each lesson made modest but significant gains in performance. Also, it was the less-skilled listeners who made the greatest improvements.

The study

To see if metacognitive instruction benefits less-skilled listeners' comprehension in my teaching context, I conducted a small-scale study with a group of 20 EFL learners studying at a language school in central Japan. I guided the learners through five lessons based on listening to television news. One-way rather than interactive listening was the focus of the lessons, as this mode of listening is predominant in the given EFL context.
The procedure and listening material through the five lessons were consistent to enable learners to develop familiarity and confidence with comprehending television news items. All interaction between learners was carried out in the L2, reflecting the required use of the L2 at the given language school. The learners were briefly interviewed at the end of the study about their perceptions of the lessons.

Participants

The 20 learners were Japanese females aged between 22 and 55 years. All were attending an advanced-level English language course, and according to in-house documentation, this level was equivalent to approximately IELTS 7. The learners worked in self-selected pairs. No strategy instruction was provided beforehand. All names shown are pseudonyms.

Materials

The listening material used in each of the lessons was a different BBC TV News item. The choice of BBC TV News items was based on the fact that the advanced learners participating in the study were exposed to this authentic material on a regular basis. All the news items were approximately two minutes long and presented on a television screen in short segments. The segmentation was determined by the shift in audio and visual focus of the content. A pre-test and post-test were also developed using different BBC TV News items.

Lesson procedure

Each of the five lessons was 90 minutes long and involved a pedagogical cycle (based on Vandergrift 2007: 199). The task sequence also included the explicit sharing, discussion, and evaluation of strategies by learners as recommended by Goh and Taib (op. cit.). The pedagogical cycle is shown in Table 1.

Table 1: Steps and related procedures of the pedagogical cycle

Step 1. Once learners know the topic and text type, they predict types of information they may see and hear (based on a short written text) and discuss ideas together.
Step 2. Learners listen to confirm or correct initial predictions and note down content understood.
Step 3. Learners share and compare what they have understood with each other, share strategies used, and discuss strategies to help address identified gaps in their understanding.
Step 4. Learners listen and attend to points of disagreement or information not/partially understood, check their previous understanding, make corrections, and note down additional content.
Step 5. Learners share and compare what they have understood with each other and modify as required, discuss strategies used, and together reconstruct the main points.
Step 6. Learners compare the aural form of the complete news item with a transcription. Students watch the complete news item.
Step 7. Learners evaluate their performance and strategic approach and consider strategies for the next lesson.

In Step 1, learners read a short text related to the topic of the news item and made notes about what they thought they might see or hear in the news item. The learners then shared their ideas for several minutes. The use of the short text aimed to facilitate and encourage the subsequent use of strategies such as inferencing and elaboration.

Next, segments of the news item were played on a television screen and the sequence of Steps 2–5 was completed by learners in their own time for each segment. Learners watched the first segment as it was played and subsequently made notes of any words or phrases they had comprehended. After this, learners discussed what they had understood from the segment. They then talked about the strategies they had used to try to understand the segment and considered together strategies they could employ to deal with identified gaps in their understanding.

The segment was then shown again to allow for the possible effect of short-term memory constraints on the amount of content they were able to recall. It was also an opportunity for the learners to resolve discrepancies, confirm hypotheses, increase comprehension, and adapt their strategies. Once again learners wrote down words or phrases they comprehended when the segment ended, shared understanding, and discussed the strategies they had used the second time. Following this, learners worked together to write a summary representing an agreed account, based on their notes and discussion, of the main points in the segment which they had just viewed. On finishing their summary of content in the first segment, the learners followed the same note taking and summary writing procedure for the subsequent segments.

In Step 6, after listening to all the segments and completing the associated tasks, the complete news item was played and learners were given the transcript to read simultaneously. This may assist the development of word recognition skills and form–meaning relationships (Vandergrift 2007). The whole news item was played once more and this time learners just looked at the screen and listened without referring to the transcript. This was to enable them to consolidate understanding using both the audio and the visual channels. The last step of the cycle (Step 7) involved learners discussing how successful their listening and strategy use had been and sharing possible strategies they could try in the future to help deal with problems they encountered.

Listening comprehension tests

The pre- and post-tests developed for measuring improvements in listening comprehension across the study each consisted of two BBC TV News items presented in the same way as in the lessons. The procedure for both tests involved the learners individually reading a short text before the news item was played (as in the lessons). The learners then listened twice and made notes for each segment, before writing a summary of the given segment's main points. There was no interaction between the learners.

The responses for the two tests were scored according to a partial scoring system that was based on Bonk (2000). Where it was evident that learners could illustrate partial understanding of main points in the segments (for example word chunks), but they were not able to fully comprehend the main point, they were awarded marks ranging from one (isolated words) to four (a coherent and complete main point). A colleague was trained to use the scoring system, and inter-rater reliability for the two tests was 82 per cent. Differences were resolved through discussion.
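The article reports the 82 per cent figure without naming the agreement statistic. One common reading is simple exact agreement: the proportion of scored main points on which the two raters awarded the same mark. The Python sketch below illustrates only that calculation; the function name and the score vectors are hypothetical and are not data from the study.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters awarded the same mark."""
    if len(rater_a) != len(rater_b):
        raise ValueError("raters must score the same items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Hypothetical marks on the 1-4 partial-credit scale (0 = nothing scorable)
# for ten main points; these are illustrative, not the study's data.
rater_a = [4, 3, 1, 0, 2, 4, 4, 1, 3, 2]
rater_b = [4, 3, 2, 0, 2, 4, 3, 1, 3, 2]

print(f"Agreement: {percent_agreement(rater_a, rater_b):.0f}%")  # -> 80%
```

Here eight of the ten hypothetical marks coincide, giving 80 per cent; the study's 82 per cent would have been taken over all scored main points in the two tests.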
Results

Participant scores as a percentage for the pre-test and post-test are presented in Table 2.

Table 2: Pre-test and post-test percentages

Name      Pre-test   Post-test
Noriko       23          41
Masami       26          38
Yuko         27          23
Azusa        29          46
Miyuki       32          40
Rie          32          36
Koto         34          39
Kana         34          42
Naomi        42          47
Hiromi       43          50
Keiko        43          50
Manami       43          43
Madoka       43          50
Minori       45          58
Noriko       45          49
Yasuko       50          56
Jun          56          54
Kaori        56          54
Tomoko       62          55
Nami         62          70

From the pre-test, learners whose raw scores were more than one standard deviation lower than the mean were classified as less-skilled listeners. There were four less-skilled listeners: Noriko, Masami, Yuko, and Azusa. Similarly, learners whose raw scores were greater than one standard deviation above the mean were classified as more-skilled listeners. There were also four more-skilled listeners: Jun, Kaori, Tomoko, and Nami.
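As a quick check, the grouping can be reproduced from the rounded percentages in Table 2. The sketch below is illustrative rather than the author's actual procedure: it assumes the percentages are an adequate proxy for the raw scores that were used, and it applies the population standard deviation (the article does not say which form was applied, though both forms yield the same two groups on these figures). The second learner named Noriko is relabelled only so the dictionary keys stay unique.

```python
from statistics import mean, pstdev

# Pre-test percentages transcribed from Table 2; "Noriko (2)" is the
# second learner of that name, relabelled to keep the keys unique.
pre = {
    "Noriko": 23, "Masami": 26, "Yuko": 27, "Azusa": 29, "Miyuki": 32,
    "Rie": 32, "Koto": 34, "Kana": 34, "Naomi": 42, "Hiromi": 43,
    "Keiko": 43, "Manami": 43, "Madoka": 43, "Minori": 45,
    "Noriko (2)": 45, "Yasuko": 50, "Jun": 56, "Kaori": 56,
    "Tomoko": 62, "Nami": 62,
}

m, sd = mean(pre.values()), pstdev(pre.values())
less_skilled = [name for name, s in pre.items() if s < m - sd]
more_skilled = [name for name, s in pre.items() if s > m + sd]

print(f"mean = {m:.1f}, sd = {sd:.1f}")
print("less-skilled:", less_skilled)  # Noriko, Masami, Yuko, Azusa
print("more-skilled:", more_skilled)  # Jun, Kaori, Tomoko, Nami
```

On these figures the one-standard-deviation rule recovers exactly the eight learners named in the text.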
Table 3 shows the less-skilled listeners' scores as percentages for the pre- and post-tests.

Table 3: Percentage scores for less-skilled listeners for the pre- and post-tests

Name      Pre-test (%)   Post-test (%)
Noriko        23              41
Masami        26              38
Yuko          27              23
Azusa         29              46

The scores shown in Table 3 illustrate that three of the four less-skilled listeners made increases of over 10 per cent across the study, with Noriko making a gain of 18 per cent, Azusa 17 per cent, and Masami 12 per cent. Yuko actually scored four per cent worse in the post-test compared to the pre-test. Table 4 shows the more-skilled listeners' scores as percentages for the pre- and post-tests.

Table 4: Percentage scores for more-skilled listeners for the pre- and post-tests

Name      Pre-test (%)   Post-test (%)
Jun           56              54
Kaori         56              54
Tomoko        62              55
Nami          62              70

Of the four more-skilled listeners shown in Table 4, only Nami increased her score (by eight per cent). The other three listeners, Jun, Kaori, and Tomoko, all scored comparatively worse in the post-test (by two per cent, two per cent, and seven per cent respectively).

In summary, the findings illustrate that three of four of the less-skilled listeners in the group made noteworthy gains across the study. In contrast, only one of four more-skilled listeners scored higher in the post-test than the pre-test. These findings are discussed in the next section.
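Before turning to the discussion, note that the per-listener changes quoted above are plain differences between the post-test and pre-test columns. The following self-contained sketch re-derives them from the (pre, post) pairs in Tables 3 and 4; the group means are my own summary arithmetic, not figures reported in the article.

```python
# (pre, post) percentage pairs from Tables 3 and 4.
less = {"Noriko": (23, 41), "Masami": (26, 38), "Yuko": (27, 23), "Azusa": (29, 46)}
more = {"Jun": (56, 54), "Kaori": (56, 54), "Tomoko": (62, 55), "Nami": (62, 70)}

for label, group in (("less-skilled", less), ("more-skilled", more)):
    gains = {name: post - pre for name, (pre, post) in group.items()}
    mean_gain = sum(gains.values()) / len(gains)
    print(f"{label}: {gains} (mean {mean_gain:+.1f} points)")

# less-skilled gains: 18, 12, -4, 17 -> mean +10.8
# more-skilled gains: -2, -2, -7, 8 -> mean -0.8
```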
Tafaghodtari op.cit.).However,such progress may Metacognitive instruction for helping listeners 413 at City University of Hong Kong on March 8, 2012/Downloaded frominitially be‘fragile’in nature,and encouraging learners to engage in the conscious and consistent effort of explicitly sharing and reflectingon their strategic behaviour may facilitate and strengthen skills development.The majority of the more-skilled listeners did not make improvements across thefive lessons,with three of these learners’percentages being very slightly worse for the post-test and the fourth improving by eight per cent. This was possibly because these skilled listeners had already reacheda comparatively solid level of understanding and orchestration of bottom-up and top-down skills and strategies,so that the impact of participating in the pedagogical cycle made little difference to their comprehension.Moreover, the difficulty level of either or both the text and task may set a threshold beyond which more-skilled listeners are unable to show progress from participating in the pedagogical cycle.Irrespective of the lack of improvement of most of the more-skilled listeners,all four were positive about the lessons in their interview responses,for example,Tomoko and Kaori.Throughout these lessons I think that I improve my way of understanding the news so I got a lot of benefit.Before this I didn’t consciously think about strategies to understand the news.(Tomoko) Well,this kind of activity was sort of summing up using our own words.It was good,and it was good to make me realize my weak points.Because when you listen to the news just as it is,you have no chance to realize what and how much you missed.(Kaori)With respect to the12learners whose pre-scores placed them within thefirststandard deviation on either side of the mean,the mean percentageincrease in scores was approximately seven per cent.This improvementmay certainly have been due to task and text familiarity or time spent onconcurrent general language learning.It seems,then,there was possiblya general tendency towards a reduction in the effects of the pedagogical cycleon performance as listeners increased in level of ability.Thefindings of this study support those of Goh and Taib(op.cit.)andVandergrift and Tafaghodtari(op.cit.)who also noted that the pedagogicalcycle can be effective for improving the listening comprehension of less-skilled listeners.This study,however,also differs in a number of respects.Firstly,Goh and Taib’s study involved young ESL learners and Vandergriftand Tafaghodtari’s study focused on high-beginner/low-intermediate adultlearners of French as a second language.In contrast,this study exploredadvanced-level,adult EFL learners.It appears then that the effects of thepedagogical cycle are evident across a range of abilities and contexts.Moreover,Goh and Taib and also Vandergrift and Tafaghodtari exposedlearners to a variety of texts,while this study only utilized one text typethroughout:television news items.It is not clear what impact this may havehad on the performance of three of the four less-skilled listeners who madegood gains,but it seems it had little effect on most of the more-skilled ones.Furthermore,as mentioned,an added feature of the pedagogical cycle usedin this study compared to those Goh and Taib and Vandergrift andTafaghodtari employed was that the learners explicitly shared,discussed, 414Jeremy Cross at City University of Hong Kong on March 8, 2012 / Downloaded fromand evaluated their listening strategies.Research has shown 
Research has shown that more-skilled listeners utilize a wider variety of strategies with greater flexibility, frequency, sophistication, and appropriateness to meet task demands (for example Goh 2002) and employ superior configurations of strategies compared to less-proficient listeners (for example Vandergrift 2003). As such, the strategy-focused interaction in the pedagogical cycle may have benefited the weaker listeners, as they were often observed engaging in discussions with more-skilled listeners. Peer learning of strategies was therefore a possible factor in some of the less-skilled listeners making notable gains compared to the more-skilled listeners.

In real-world listening contexts, one is rarely presented with a pre-listening text and segmented television news items or verbalizes strategies. However, the value of the pedagogical cycle illustrated is that it provides plentiful opportunities for listeners at a range of skill levels to reflect on, explore, practise, and develop the kind of listening comprehension abilities that can facilitate listening in real-life environments. Of course, it is not suggested that listening lessons should only encompass the pedagogical cycle, because the findings illustrated that the majority of learners did not improve much in the study, only three of four less-skilled listeners. As mentioned, there may also be a threshold beyond which the text and task complexity possibly inhibit the positive effects of metacognitive instruction for more-skilled listeners. It appears, then, that metacognitive instruction can only go so far in enhancing learners' listening comprehension. Therefore, the pedagogical cycle could usefully be supplemented with bottom-up skills instruction (see Field 2008). Similarly, strategy instruction may enable less strategy-aware learners to explore their strategies in a more informed manner in the pedagogical cycle.

Conclusions

While this was a small-scale study that specifically focused on comprehension of television news items and examined only four less-skilled and four more-skilled listeners, the findings provide some additional empirical support for the notion that metacognitive instruction using a pedagogical cycle of predicting, monitoring, problem identification, and evaluating can be useful for guiding and helping less-skilled listeners towards developing their listening comprehension ability. The implications of this study for teachers in other contexts are that metacognitive instruction appears to offer a practical pedagogical approach that can be exploited for skills development in listening lessons. However, it seems that metacognitive instruction may not necessarily be equally beneficial to all learners in a class, and teachers should consider how to best implement it in combination with other types of listening instruction in order to improve the listening comprehension skills of learners of different abilities.

Final revised version received September 2010

Note

1 Listening comprehension is defined as 'an active process in which listeners select and interpret information that comes from auditory and visual clues' (Rubin 1995: 7).

References

Bonk, W. 2000. 'Second language lexical knowledge and listening comprehension'. International Journal of Listening 14: 14–31.
Field, J. 2000. 'Finding one's way in the fog: listening strategies and second-language learners'. Modern English Teacher 9/1: 29–34.
Field, J. 2008. Listening in the Language Classroom. Cambridge: Cambridge University Press.
Flavell, J. 1976. 'Metacognitive aspects of problem solving' in L. Resnick (ed.). The Nature of Intelligence. Hillsdale, NJ: Erlbaum.
Goh, C. 2002. 'Exploring listening comprehension tactics and their interaction patterns'. System 30/2: 185–206.
Goh, C. 2008. 'Metacognitive instruction for second language listening development: theory, practice and research implications'. RELC Journal 39/2: 188–213.
Goh, C. and Y. Taib. 2006. 'Metacognitive instruction in listening for young learners'. ELT Journal 60/3: 222–32.
Rubin, J. 1995. 'An overview to A Guide for the Teaching of Second Language Listening' in D. Mendelsohn and J. Rubin (eds.). A Guide for the Teaching of Second Language Listening. San Diego, CA: Dominie Press.
Vandergrift, L. 2003. 'Orchestrating strategy use: toward a model of the skilled second language listener'. Language Learning 53/3: 463–96.
Vandergrift, L. 2004. 'Listening to learn or learning to listen?'. Annual Review of Applied Linguistics 24: 3–25.
Vandergrift, L. 2007. 'Recent developments in second and foreign language listening comprehension research'. Language Teaching 40/3: 191–210.
Vandergrift, L. and M. Tafaghodtari. 2010. 'Teaching L2 learners how to listen does make a difference: an empirical study'. Language Learning 60/2: 470–97.
Wenden, A. 1998. 'Metacognitive knowledge and language learning'. Applied Linguistics 19/4: 515–37.

The author

Jeremy Cross works in Linguistics at the National Institute of Education of the Nanyang Technological University, Singapore. His primary research interest is L2 listening development. He mainly teaches postgraduate courses on ELT methodology for listening and speaking.
Email: jertzy7@
