Preconditioning of Elliptic Problems by Approximation in the
1. Numerical linear algebra
1. Methods for initial value problems for ordinary differential equations: Runge-Kutta and Adams methods.
2. Methods with automatic step-size control for Runge-Kutta and Adams methods.
3. Basic concepts of stability of the multistep methods for ODEs and systems.
7. Finite element method
(Joix)
1. Weak (variational) formulation and characterization of the energy space: essential and natural boundary condition.
3. Eigenvalues and eigenvectors of matrices (minimax methods for symmetric matrices, power method, QR method). Singular value decomposition and its basic properties.
estimates (Courant condition). 3. Error estimates
6. Numerical methods for elliptic problems
(Ames, Strikwerda)
1. Finite differences and finite volumes: approximation of the equation and the boundary conditions, higher order schemes.
Liouville Theorems and Harnack Inequalities for Semilinear Elliptic Equations
(4)
Theorem 1.2 ([42]) A $C^2$ solution of (4) is of the form (3) for some $\mu > 0$, $\bar{x} \in \mathbb{R}^{n-1}$, and $\bar{t} = \frac{c}{(n-2)\mu}$.

Under an additional hypothesis $u(x) = O(|x|^{2-n})$ for large $|x|$, the result was established earlier by Escobar ([28]). The proof of Escobar is along the lines of the proof of Obata, while the proof of Li and Zhu is by the method of moving spheres, a variant of the method of moving planes. Liouville type theorems in dimension $n = 2$ were established in [22], [27], [42], and the references therein. Analogues for systems were established in [14]. Improvements
Partially supported by a National Science Foundation Grant and a Rutgers University Research Council Grant. † Partially supported by a Graduate School Dissertation Fellowship of Rutgers University
Anhui 2024 National College Entrance Examination (Gaokao): English Paper
What is the main purpose of the first paragraph of the passage?
A. To introduce a famous person.
B. To present a controversial topic.
C. To describe a historical event.
D. To explain a scientific concept.

The author mentions "global warming" in the text to _______.
A. argue against environmental policies
B. illustrate the severity of climate change
C. promote a new energy source
D. compare different weather patterns

Which of the following best summarizes the relationship between Paragraph 3 and Paragraph 4?
A. Cause and effect
B. Comparison and contrast
C. Problem and solution
D. Thesis and supporting details

The word "ubiquitous" in Line 5 of the passage most closely means _______.
A. rare
B. everywhere present
C. recently discovered
D. hardly noticeable

According to the passage, which factor contributes least to the decline in biodiversity?
A. Habitat destruction
B. Pollution
C. Overpopulation
D. Genetic engineering

The tone of the author in discussing the future of artificial intelligence is _______.
A. pessimistic
B. cautiously optimistic
C. indifferent
D. openly skeptical

What is the primary argument the author makes in the last paragraph?
A. The importance of cultural exchange.
B. The need for stricter immigration laws.
C. The benefits of multiculturalism.
D. The challenges of language barriers.

The phrase "tipping point" in the context of the passage refers to _______.
A. a moment of crisis
B. a point of no return
C. a minor inconvenience
D. a temporary setback

Which of the following statements about the character in the story is NOT true?
A. She was born into a wealthy family.
B. She faced numerous challenges in life.
C. She eventually achieved her dreams.
D. She never received any support from others.
Hierarchical Error Estimator for Eddy Current Computation
R. Beck, R. Hiptmair, B. Wohlmuth
October 27, 1999

Abstract. We consider the quasi-magnetostatic eddy current problem discretized by means of lowest order H(curl)-conforming finite elements (edge elements) on tetrahedral meshes. Bounds for the discretization error in the finite element solution are desirable to control adaptive mesh refinement. We propose a local a-posteriori error estimator based on higher order edge elements: The residual equation is approximately solved in the space of p-hierarchical surpluses. Provided that a saturation assumption holds, we show that the estimator is both reliable and efficient.

Key words. Edge elements, a posteriori error estimator, hierarchical error estimator

MSC 1991. 65N15, 65N30, 65N5

1 Introduction

The eddy current model arises from Maxwell's equations as a magneto-quasistatic approximation by dropping the displacement current [3, 22]. This is reasonable for low-frequency, high-conductivity applications like electrical machines. A wealth of different formulations have been proposed [1, 8]. We single out one that zeros in on the magnetic vector potential as primary unknown [18]. Then we end up with the degenerate parabolic initial-boundary value problem (1). Here, stands for a connected bounded polyhedral computational domain.
Though the equations are initially posed on the entire space, we can switch to a bounded domain by introducing an artificial boundary sufficiently removed from the region of interest. This is commonplace in engineering simulations [8, 18]. Further, denotes the bounded uniformly positive inverse of the magnetic permeability (magnetic susceptibility). We confine ourselves to linear isotropic media, i.e. is a scalar function of the spatial variable only. Hence, for some a.e. in. We rule out anisotropy also for the conductivity, for which holds. Usually, there is a crisp distinction between conducting regions, where is bounded away from zero, and insulating regions, where. We will take for granted that, wherever. Often, the material parameters vary only moderately inside and outside the conductor. The right hand side is a time-dependent vector field in, which represents the source current. For physical reasons a.e. in and for all times. We remark that in many applications the exciting current, for instance the current in a coil, is provided through an analytic expression. A typical arrangement is depicted in figure 1.

Figure 1: A model problem for eddy current computation (cf. [14, Ch. 8])

It is important to note that the vector potential lacks physical meaning. The really interesting quantity is the magnetic induction. This is why we can use an ungauged formulation as in (1), which does not impose a constraint on outside. Obviously, this forfeits uniqueness of the solution in parts of the domain where, but the solution for remains unique everywhere. Inside the conductor, where, we get a unique.

For the sake of stability, timestepping schemes for (1) have to be L-stable [26]. This requirement can only be met by implicit schemes like SDIRK-methods. In each timestep they entail the solution of a degenerate elliptic boundary value problem of the form (2). In this context, denotes the new approximation to to be computed in the current timestep, and depends on and the approximation of in the previous timestep.
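The L-stability requirement mentioned above can be illustrated on the scalar test equation y' = λy with Re λ < 0, which is not part of the original text and is used here only as a stand-in for the stiff components of a degenerate parabolic problem. The sketch below compares the stability functions R(z), z = λ·dt, of three textbook one-step schemes. Implicit Euler is used as the simplest L-stable representative; the function names are illustrative assumptions, not taken from the paper.

```python
# Stability functions R(z) for the test equation y' = lambda * y,
# where one timestep maps y_n to y_{n+1} = R(z) * y_n with z = lambda * dt.
# Illustrative sketch only; names are hypothetical.

def r_explicit_euler(z: complex) -> complex:
    # Explicit Euler: y_{n+1} = (1 + z) y_n
    return 1.0 + z

def r_crank_nicolson(z: complex) -> complex:
    # Trapezoidal rule: A-stable, but |R(z)| -> 1 as z -> -infinity
    return (1.0 + 0.5 * z) / (1.0 - 0.5 * z)

def r_implicit_euler(z: complex) -> complex:
    # Implicit Euler: A-stable and R(z) -> 0 as z -> -infinity (L-stable)
    return 1.0 / (1.0 - z)

if __name__ == "__main__":
    for z in (-1e1, -1e3, -1e6):
        print(
            f"z = {z:9.1e}  "
            f"|R_EE| = {abs(r_explicit_euler(z)):.3e}  "
            f"|R_CN| = {abs(r_crank_nicolson(z)):.6f}  "
            f"|R_IE| = {abs(r_implicit_euler(z)):.3e}"
        )
```

For very stiff modes the trapezoidal rule damps almost nothing, while the implicit Euler amplification factor tends to zero, which is the behaviour that L-stability asks for.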
Note that we can still assume outside. The coefficient agrees with 2 except for a scaling by the length of the current timestep; Accordingly, …tion and present the concrete hierarchical error estimator. It turns out that it is essential to pay special attention to curl-free vector fields in the process of hierarchical decoupling and localization. Finally, we report on a number of numerical experiments that examine the performance of the hierarchical error estimator for some model problems.

2 Edge elements

We employ Nédélec's H(curl)-conforming finite elements [31] on families of shape-regular simplicial triangulations (in the sense of [15]). Usually, the meshes are generated by repeated refinement of an initial coarse triangulation. We may use the standard "red-green" refinement process (cf. [7, 11]). We require that the boundary of the conductor is resolved by all meshes, that is, is always the union of elements. We denote by the global space of edge elements of order built upon a simplicial mesh. Compliance with homogeneous Dirichlet boundary conditions is taken for granted. For any tetrahedron the local spaces are given by where designates the spaces of polynomials of degree over. For the lowest order case this leads to the representation. Then the global space is obtained by prescribing degrees of freedom (d.o.f.) that ensure tangential continuity of the discrete vector fields. In the lowest order case the d.o.f. are given by path integrals along oriented edges of elements [31]: edge of (4). There is an explicit representation of the related local basis functions on a tetrahedron using the barycentric coordinate functions of [14]: (5) where is associated with edge, with endpoints and. The local degrees of freedom for second order edge elements involve additional integrals over faces of tetrahedra and can be stated as edge of, basis of, face of (6) where is some basis of the tangential space of. By (6) the local basis functions are not entirely fixed, since they depend on the choice of "test polynomials" and of tangential
vectors. Based on the degrees of freedom we can introduce conventional nodal projection operators that can easily be extended to non-continuous vector fields, for which the degrees of freedom are well defined. The exceptional feature of the nodal projectors is that they preserve irrotational vector fields (7). It will be crucial that edge elements form affine equivalent families of finite elements in the sense of [15]. That is, if is the reference tetrahedron and the affine mapping taking it onto some, then where is the mapping (8). Moreover, if the weighting polynomials are suitably chosen, the local degrees of freedom remain invariant under this transformation. The transformation has the following impact on norms [16]: (9) with meaning equivalence up to constants that only depend on, the shape-regularity measure of the element (i.e., the ratio of its diameter and the radius of the largest inscribed ball), and the variation of the material parameters in individual elements. We have written for the diameter of.

Affine equivalence is the key to establishing the following simultaneous approximation estimates [16] that are valid for quasiuniform, shape-regular meshes of meshwidth: (10) (11) where stands for a generic constant and the vector fields are to be sufficiently smooth. The estimates (10) and (11) directly translate into error estimates for the finite element discretization of (3). More precisely, if the solution is sufficiently regular, we get for edge elements of order. In addition, the same orders of convergence can be expected for.

3 Hierarchical error estimator

Hierarchical error estimators invariably target the energy norm of the error. However, in the current context we have to deal with an energy-seminorm on as we dispensed with a gauge condition outside. As usual, the theoretical analysis of the hierarchical error estimator starts with a saturation assumption, which is to some extent justified by the a-priori error estimates (10) and (11). We assume that there is a sequence, belonging to the
shape-regular family of meshes such that (13). Here, is the discrete solution of (3) in, whereas is that in. As, is guaranteed, but (13) will eventually hinge on extra smoothness of the continuous solution. In fact, if the solution is smooth enough, we expect from (10) and (11) that as the meshwidth tends to 0. Admittedly, the discontinuity of the conductivity at the edge of the conducting zone will spawn singularities of the solution [20] and destroy the regularity required for (10) and (11). On the other hand, we merely demand that is uniformly bounded away from 1, which is much weaker than. Using (13) and Galerkin orthogonality, the following lemma is readily established:

Lemma 1 [4] If (13) holds, we can conclude where the hierarchical surplus space is given by (16)

Lemma 3 Define the symmetric positive semidefinite bilinear form by. Then for all.

Proof. Pick some and an arbitrary and set. We decompose. As (15) is a direct sum, we can resort to the equivalence of all norms on finite dimensional spaces to see. When desired, the constants of this equivalence can be swiftly computed as the eigenvalues of a small matrix [4]. It is important to realize that thanks to (7), we know that if, then and. As a consequence, the same argument as above shows. Be aware that the splitting (15) is based on degrees of freedom. Hence, it can be done element by element and is respected by the transformation. In sum, using (9), we locally (and globally) get the asserted equivalence. The crucial observation is that the equivalence holds separately for and, so that it does not matter, whether on.

It is essential that from the previous lemma fully decouples and: As for, the defect equation (14) for can be restricted to the hierarchical surplus space: Seek such that (17). However, solving (17) still encounters a large linear problem in the space of hierarchical surpluses. Therefore, we exploit lemma 2 once more to perform localization. For that sake we pick a suitable basis of. On each element with barycentric coordinate functions, the basis functions are given by The
function is "associated" with the edge connecting vertices and, whereas and belong to the face spanned by vertices, and. By straightforward computations we see that, and. A global basis of then reads (edge of, face of) and it defines a direct decomposition of the hierarchical surplus space (18) (edge, face) with

Figure 2: Location of degrees of freedom related to the basis functions involved in the localization procedure

Two facts ought to be mentioned: First, for all edges, and, second, and are linearly independent. This means, if some is curl-free, it will be split into curl-free localized contributions. As in the proof of lemma 2, this is the key for separately getting the equivalence for the (semi)-norms and. Thus, using affine equivalence techniques as before, we can prove the following lemma:

Lemma 4 For, set (edges, faces). Then is a symmetric positive definite bilinear form, which fulfills on. Hence, in (17) we can replace by and still get an, whose energy norm is equivalent to that of the true error. The gain is that solving, involves only small local problems: According to the dimensions of the subspaces occurring in the decomposition (18), we face a scalar equation for each edge of the mesh and a linear system for each face. In sum, edges

4 Numerical experiments

Throughout the numerical experiments we use lowest order edge elements on an unstructured tetrahedral grid. The stiffness matrix and load vector corresponding to (3) are computed using Gaussian quadrature of order 5. Interpolation of boundary values is of the same order. Iteration errors can be neglected as numerous multigrid steps are carried out to compute. To gauge the quality of the error estimator we rely on the effectivity index, which gives the ratio between the estimated and the true energy norm of the discretization error. Here, and. This quantity reflects the quality of the global estimate. For a good error estimator, the effectivity index is to approach a constant rapidly as refinement proceeds. We point out that, since we can only expect equivalence of the
estimated energy of the error and its true energy, the effectivity index may be far off the ideal value 1.

In our first experiment the coefficients and are kept constant all over the domain; is always set to 1. The smooth analytic solution is enforced by the choice of the right hand side and Dirichlet boundary values. The computations are conducted on uniformly refined meshes. The resulting effectivity indices are reported in table 1.

Level
0.45 0.63 0.67 0.69 0.69 0.70
0.46 0.67 0.72 0.75 0.76 0.77
0.74 1.27 1.40 1.43 1.33 1.19
Gauß-Seidel-based hierarchical error estimator
0.55 0.82 0.92 0.96 0.98 0.99
0.71 0.87 0.92 0.91 0.90 0.90

Our second experiment is again carried out on the unit cube with, but we enforce a vanishing zero-order term on part of the domain. As far as the coefficients are concerned, this experiment comes fairly close to the arrangements in realistic eddy current computations. In particular, we choose as follows: max, elsewhere. We impose homogeneous Dirichlet boundary conditions on the boundary and a smooth right hand side. To estimate the true errors, we carried out two more refinement steps than reported in tables 2 and 3, respectively, and compare the discrete solutions to those obtained on the finest levels. The results are collected in table 2 for uniform refinement.

We did the same experiment on meshes generated by adaptive red-green refinement. The latter relies on an averaging strategy, which marks elements in the set

Level 0 1 2 3
Jacobi-based 0.83 0.93 0.96 0.97
Table 2: Effectivity indices for the hierarchical error estimators in Exp. 2.

Level
0.56 0.71 0.73 0.72 0.72
Gauß-Seidel-based

References

, Analysis of three dimensional electromagnetic fields using edge elements, p. Phys., 108 (1993), pp. 236–245.
[3] H. Ammari, A. Buffa, and J.-C. Nédélec, A justification of eddy currents model for the Maxwell equations, tech. rep., IAN, University of Pavia, Pavia, Italy, 1998.
[4] R. Bank, Hierarchical bases and the finite element method, Acta Numerica, 5 (1996), pp. 1–43.
[5] R. Bank and A. Weiser, Some a posteriori error estimators for elliptic partial
differential equations, p., 44 (1985), pp. 283–301.
[6] R. Beck, R. Hiptmair, R. Hoppe, and B. Wohlmuth, Residual based a-posteriori error estimators for eddy current computation, Tech. Rep. 112, SFB 382, Universität Tübingen, Tübingen, Germany, March 1999. To appear in.
[7] J. Bey, Tetrahedral grid refinement, Computing, 55 (1995), pp. 355–378.
[8] O. Biro and K. Richter, CAD in electromagnetism, in Advances in Electronics and Electron Physics, P. Hawkes, ed., vol. 82, Academic Press, 1991, pp. 1–96.
[9] F. Bornemann, An adaptive multilevel approach to parabolic equations I. General theory and 1D-implementation, IMPACT Comput. Sci. Engrg., 2 (1990), pp. 279–317.
[10] , Solving Maxwell's equations in a closed cavity and the question of spurious modes, IEEE Trans. Mag., 26 (1990), pp. 702–705.
[14]
[23] P. Dular, J.-Y. Hody, A. Nicolet, A. Genon, and W. Legros, Mixed finite elements associated with a collection of tetrahedra, hexahedra and prisms, IEEE Trans Magnetics, MAG-30 (1994), pp. 2980–2983.
[24] R. Duran and R. Rodriguez, On the asymptotic exactness of Bank-Weiser's estimator, Numer. Math., 62 (1992), pp. 297–303.
[25] K. Erikson, D. Estep, P. Hansbo, and C. Johnson, Introduction to adaptive methods for differential equations, Acta Numerica, 4 (1995), pp. 105–158.
[26] E. Hairer and G. Wanner, Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin, Heidelberg, New York, 1991.
[27] R. Hiptmair, Multigrid method for Maxwell's equations, SIAM J. Numer. Anal., 36 (1999), pp. 204–225.
[28] P. Monk, A mixed method for approximating Maxwell's equations, SIAM J. Numer. Anal., 28 (1991), pp. 1610–1634.
[29] , A posteriori error indicators for Maxwell's equations, put. Appl. Math., (1999). To appear.
[31] J. Nédélec, Mixed finite elements in, Numer. Math., 35 (1980), pp. 315–341.
[32] R. Verfürth, A Review of A Posteriori Error Estimation and Adaptive Mesh-Refinement Techniques, Wiley-Teubner, Chichester, Stuttgart, 1996.
W^{2,1}_p Solvability for Parabolic Poincaré Problem
L.G. SOFTOVA
coefficients allowing discontinuity in $t$. The vector field $\ell(x,t)$ generating $B$ is defined on $S = \partial\Omega \times (0,T)$ and is tangential to it in some subset $E$. The kind of contact is of neutral type and we suppose that $\gamma(x,t) = (\ell(x,t) \cdot \nu(x)) \ge 0$ on $S$. It means that the boundary value problem under consideration is of Fredholm type, i.e. both the kernel and cokernel are of finite dimension.

We are interested in strong solvability of our problem in $W^{2,1}_p(Q)$, $p \in (1,\infty)$. Because of the loss of regularity of the solution near the set of tangency $E$, we impose higher regularity of the data in $E$. The study is based on Winzell's original idea to extend $\ell$ into $\Omega$ so as to obtain an explicit representation of the solution through the integral curves of that extension. Thus the problem is reduced to obtaining suitable a priori estimates for the solution and its derivatives on an expanding family of cylinders. Further, the solvability is proved using a regularization technique which, roughly speaking, means perturbing the vector field $\ell$ by adding a small $\varepsilon$ times $\nu$, solving the regular ODP thus obtained, and then passing to the limit as $\varepsilon \to 0$. The perturbed problem involves a linear uniformly parabolic operator $P$ with VMO coefficients and a boundary operator $B$ with $(\ell_\varepsilon \cdot \nu) > 0$. In this case a unique solvability result in $W^{2,1}_p(Q)$, $p \in (1,\infty)$, is available, supposing $Pu \in L^p(Q)$ and initial and boundary data belonging to the corresponding Besov spaces (see [15], [8]).

The Poincaré problem for linear uniformly parabolic operators with Hölder continuous coefficients is studied in [11] (see also [13]), where unique solvability in the corresponding Hölder spaces is obtained. Moreover, the linear results were applied to the study of a semilinear parabolic problem in Hölder spaces. A tangential ODP for second-order uniformly elliptic operators with Lipschitz continuous coefficients was studied in [9] (see also [8]). Strong solvability in $W^{2,p}(\Omega)$ is obtained there, but only for $p > n/2$.
In our case the parabolic structure of the equation permits us to obtain an a priori estimate for the solution in $W^{2,1}_p(Q)$ only through the data of the problem. Thus we are able to prove unique solvability for all $p \in (1,\infty)$, avoiding the use of the maximum principle and omitting any additional conditions on the vector field.

2. Statement of the problem and main results

Let $\Omega \subset \mathbb{R}^n$, $n \ge 3$, be a bounded domain with $\partial\Omega \in C^{2,1}$ and let $Q = \Omega \times (0,T)$ be a cylinder in $\mathbb{R}^{n+1}$. Set $\ell(x,t) = (\ell^1(x,t), \dots, \ell^n(x,t), 0)$ for a unit vector field defined on the lateral boundary $S = \partial\Omega \times (0,T)$. We consider the following oblique derivative problem

$$
\begin{cases}
Pu \equiv u_t - a^{ij}(x,t) D_{ij} u = f(x,t) & \text{in } Q,\\
Iu \equiv u(x,0) = \psi(x) & \text{on } \Omega,\\
Bu \equiv \dfrac{\partial u}{\partial \ell} = \ell^i(x,t) D_i u = \varphi(x,t) & \text{on } S.
\end{cases}
\tag{2.1}
$$

Denote by $\nu(x) = (\nu^1(x), \dots, \nu^n(x))$ the unit outward normal to $\partial\Omega$. Then we can write $\ell(x,t) = \tau(x,t) + \gamma(x,t)\nu(x)$ where $\tau(x,t)$ is tangential projection
A Calculus for Predicative Programming, Emil Sekerinski
Emil Sekerinski
Forschungszentrum Informatik Karlsruhe, Haid-und-Neu Strasse 10-14, 7500 Karlsruhe, Germany, sekerinski@fzi.de

Abstract. A calculus for developing programs from specifications written as predicates that describe the relationship between the initial and final state is proposed. Such specifications are well known from the specification language Z. All elements of a simple sequential programming notation are defined in terms of predicates. Hence programs form a subset of specifications. In particular, sequential composition is defined by 'demonic composition', nondeterministic choice by 'demonic disjunction', and iteration by fixed points. Laws are derived which allow proving equivalence and refinement of specifications and programs. The weakest precondition is expressed by sequential composition. The approach is compared to the predicative programming approach of E. Hehner and to other refinement calculi.

1 Introduction

We view a specification as a predicate which describes the admissible final state of a computing machine with respect to some initial state. A program is a predicate restricted to operators which can be implemented efficiently. Hence the task of a programmer is to transform a specification written in the rich mathematical notation into a corresponding one expressed in the restricted programming notation, perhaps by a series of transformation steps. In this report, the programming notation consists of assignment, sequential composition, conditional, nondeterministic choice, variable declaration, and iteration.

The predicative programming approach was originally proposed by Eric Hehner in [5] for both a sequential and concurrent programming notation, and later refined in [6], [7], and [8]. The benefit of this approach is that there are no separated worlds for specification and programming with cumbersome proof rules for the transition from specification to programs; rather the laws can be used for development of programs from specifications, for the
transformation of programs to equivalent, perhaps more efficient ones, and for deriving properties of programs. By using programming operators like the conditional for specifying as well, we can also get clearer specifications. Another benefit is that when presenting the calculus, we can start with the specification notation and gradually introduce the programming operators by their definition.

For specifications we use a style which is similar to the specification language Z [16]. Specifications are basically predicates relating the initial to the final state, given by the values of unprimed and primed symbols respectively. We will use the Z notation whenever appropriate, for example for the predicates and for the refinement relation. However, as our aim is compact calculations, we refrain from using the graphical conventions for the layout. We also ignore the typing problems and rather concentrate on the definitions and properties of the 'control structures'. It should also be noted that we prefer to view predicates as boolean valued functions and will make use of quantifications over predicates.

The calculus presented here allows stating the equivalence of two specifications (or programs), not just refinement. When developing a program from a specification by a series of transformations, we like to state for each intermediate result whether it is equivalent to or a refinement of the previous one, even though only refinement is required (similarly as when proving that one real expression is less than another real expression). Stating equivalence expresses that no premature design decisions have been made.

However, it should be clarified what the equivalence of an executable program with a specification means: For some initial state, a specification is either defined, in which case it relates to some final state, or is undefined. A program, when executed, either terminates with some valid result, performs some undefined operation (like indexing out of range), or does not terminate at
all. From our point of view nontermination is as undesirable as an undefined operation. Hence we do not distinguish them and represent both by undefinedness.

This allows us to define sequential composition by 'demonic composition' and nondeterministic choice by 'demonic disjunction'. When comparing relational composition and disjunction with sequential composition and nondeterministic choice, it turns out that the former two 'deliver a result' whenever a sensible result exists. They are therefore called angelic operators. However, they cannot be implemented effectively. In contrast, the latter two are undefined whenever the possibility of failure exists. They are called demonic operators: if several possibilities for execution exist, the implementor is free to choose one arbitrarily (and we have to be prepared that always the worst one is chosen).

This approach leads to a nice way of expressing Dijkstra's [2] weakest preconditions wp P b: Let b be a condition, i.e. a specification over the initial state only. Here P is generalized to the case that it is a specification, not necessarily a program. The meaning of wp P b is given by sequential composition P;b. This is made possible as the condition b is just a specification and sequential composition is defined for any specification.
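The contrast between angelic and demonic operators described above can be sketched on a small finite model. The following is a minimal illustration under stated assumptions, not the paper's formal definitions: a specification is modelled as a set of (initial, final) state pairs over a hypothetical three-element state space, "defined" means "relates to at least one final state", and a condition is modelled as a relation that ignores the final state. All function names are invented for this sketch.

```python
# Finite-relation sketch of angelic vs. demonic operators (assumed model).
STATES = (0, 1, 2)  # hypothetical state space

def dom(p):
    """Initial states in which specification p is defined."""
    return {s for (s, _) in p}

def angelic(p, q):
    """Relational composition: a result exists if some path exists."""
    return {(s, u) for (s, t) in p for (t2, u) in q if t == t2}

def demonic(p, q):
    """Sequential composition: defined at s only if every intermediate
    state reachable from s via p lies in the domain of q."""
    dq = dom(q)
    safe = {s for s in dom(p) if all(t in dq for (s2, t) in p if s2 == s)}
    return {(s, u) for (s, u) in angelic(p, q) if s in safe}

def demonic_choice(p, q):
    """Binary choice: defined only where both operands are defined."""
    both = dom(p) & dom(q)
    return {(s, t) for (s, t) in p | q if s in both}

def condition(b):
    """A condition as a relation: constrains the initial state only."""
    return {(s, t) for s in b for t in STATES}

def wp(p, b):
    """Weakest precondition computed directly on the model."""
    return {s for s in dom(p) if all(t in b for (s2, t) in p if s2 == s)}

P = {(0, 1), (0, 2), (1, 1)}
Q = {(1, 0)}

if __name__ == "__main__":
    print(sorted(angelic(P, Q)))         # [(0, 0), (1, 0)]
    print(sorted(demonic(P, Q)))         # [(1, 0)]
    print(sorted(demonic_choice(P, Q)))  # [(1, 0), (1, 1)]
    # wp P b coincides with the domain of the sequential composition P;b
    print(wp(P, {1}) == dom(demonic(P, condition({1}))))  # True
```

From initial state 0, P may lead to state 2, where Q is undefined; the angelic composition still delivers a result there, whereas the demonic one is undefined. The last line mirrors the remark that wp P b is given by P;b.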
As a consequence, the definition of the sequential composition and the conditional by weakest preconditions

wp (P;Q) b = wp P (wp Q b)
wp (if c then P else Q end) b = (c ⇒ wp P b) ∧ (¬c ⇒ wp Q b)

correspond to the associativity of sequential composition and distributivity of sequential composition over the conditional.

(P;Q);b = P;(Q;b)
if c then P else Q end; b = if c then P;b else Q;b end

An obvious advantage is that we saved introducing a new function. However, a deeper advantage is that by mere notation we save applications of theorems like associativity and make theorems look simpler and easier to memorize.

The following section introduces all programming operators except iteration. Each definition is followed by a number of properties of the operator in question. The third section introduces the refinement ordering and states relationships between the programming operators and the refinement ordering. The fourth section discusses weakest preconditions. Iteration is treated in the fifth section: First the definition in terms of fixed points is given, its soundness established, and the main iteration theorem derived. The main iteration theorem makes use of weakest preconditions. The last section presents a small example of the use of the calculus, in particular of the main iteration theorem.

2 Straight Line Programming Operators

Basic Notation ("Basics")

The boolean values are written as and, the boolean operators are written as, , , , with their usual meaning, binding strongest, and binding weaker, and binding weakest. Let P Q stand for predicates. For the universal and existential quantifiers, we will also use the "restricted" forms:

d P Q d P Q
d P Q d P Q

where d is the dummy over which the quantification ranges, with the possibility of quantifying over a list of dummies. The substitution of a free symbol s in P by some expression e is written as P s:e, and the simultaneous substitution of symbols of the list S by the corresponding ones of the list of expressions E as P S:E. The everywhere operator P stands for the universal
quantification of the symbols of interest in P. We write and over predicates for the universal equivalence and universal implication:

P Q P Q
P Q P Q

They bind weaker than all other boolean operators.

The context of a specification is a list of symbols, called the variables. For a variable v we allow priming by v′, and similarly for lists of variables. A specification describes how the initial values of the variables relate to their final values. Hence in the context V, a specification is written as a predicate over V, V′, and possibly some (unprimed) constants. Note that, although a specification can only be understood with respect to a context, we will usually leave the context implicit in this paper.

A condition is a specification in which no primed symbols occur, i.e. which does not say anything about the final values of the variables. For a condition b, b′ stands for that condition with all the free (unprimed) variables of the context primed. Throughout the paper, P Q R X Y stand for specifications, b c for conditions, V for the current context, v for a variable (element of the context V), and x y for symbols not in V.

An example of a condition is the domain ∆P of a specification P:

∆P V P

The domain of a condition is the condition itself. The domain operator distributes over disjunctions but not over conjunctions. However, if one of the conjuncts is a condition, it can be moved out of the domain:

(1) ∆b P b∆P

The void specification II (also called skip) leaves all variables of the context unmodified:

II V V

Relational composition of specifications is defined by identifying the final state of the first component with the initial state of the second component:

P Q V P V:V Q V:V

By convention, ∆ binds as strong as and weaker than and but stronger than = and. Relational composition has zero, identity II, is associative, and distributes through disjunction in both directions. The domain of the relational composition can be "pushed" into the second operand:

(2) ∆P Q P∆Q

For conjunctions with conditions, the following laws hold:

(3) b
P Q b P Q
(4) P b Q P b Q

For the purpose of defining sequential composition, we introduce the condition P b. Informally, we can think of P b as characterizing those initial states which only relate to final states satisfying b, if they relate to any state at all.

P b V P b

The subsequent laws about can be proved by predicate calculus:

(5) P b P b P b∆P
(6) P b c P b P c

In the sequel, we will visit each programming operator in turn and uncover many laws holding for each operator.

Assignment (":=")

The assignment v := e changes variable v to e and leaves all other variables of the context unmodified. v must be an element of the context.

v:e II v:e

We assume that the expression e is everywhere defined, therefore:

(1) ∆v:e

Sequential Composition (";")

The sequential composition P;Q means that first P is executed, then Q. If P is nondeterministic, Q must be defined whatever choice in P is taken. Hence P;Q behaves like the relational composition restricted to those initial states which only lead to intermediate states for which Q is defined.

P;Q P∆Q P Q

Sequential composition is assigned the same binding power as "". P;Q is defined if P is defined and P leads to a state for which Q is defined.

(1) ∆P;Q ∆P P∆Q

Proof.
L.H.S.
∆P∆Q P Q by def. ";"
P∆Q∆P Q by (1) under "Basics"
P∆Q P∆Q by (2) under "Basics"
R.H.S. by (5) under "Basics"

Sequential composition has zero, identity II, and is associative.

(2) P;;P
(3) P; II = II; P = P
(4) (P;Q);R = P;(Q;R)

Associativity is proved in the appendix. In general, sequential composition distributes neither through disjunction nor through conjunction. However, if one of the conjuncts is a condition, the following theorems hold:

(5) b P;Q b P;Q

Proof.
L.H.S.
b P∆Q b P Q by def. ";"
b P∆Q b P Q by (3) under "Basics"
P∆Q b P Q as for any c: b P c b P c
= R.H.S. by def. ";"

(6) P;b Q P b P;Q

Proof.
L.H.S.
P∆b Q P b Q by def. ";"
P b∆Q P b Q by (1) and (4) under "Basics"
P∆Q P b P b Q by (6) and (3) under "Basics"
P∆Q P b P Q as for any b: P b P b
R.H.S. by (3) under "Basics", def. ";"

Sequential compositions with assignments can be simplified as follows:

(7) v:e;P P
v:eConditional(”if”)The conditional”if b then P else Q end”behaves like P if b initially holds,and as Q if b does not initially hold.The choice between P and Q depends only on b.if b then P else Q end b P b Q(1)∆if b then P else Q end if b then∆P else∆Q endThere are many simple laws for manipulating conditionals.We give some which will be needed later on.(2)b if b then P else Q end b P(3)b if b then P else Q end b Q(4)b if c then P else Q end if c then b P else b Q end(5)if b then P else Q end if b then b P else b Q end(6)if b then P else P end P(7)if b then P else Q end;R if b then P;R else Q;R endLaws(2)to(6)are best proved by direct manipulations in the predicate calculus,law(7)by case analysis,which is expressed as follows:(8)”Principle of case analysis”P Q R P R Q R P R QFor the proof of law(7),we distinguish the cases”b”and”b”.We consider thefirst case only,as the second follows the same pattern.Proof b if b then P else Q end;R b if b then P;R else Q;R endb if b then P else Q end;R b if b then P;R else Q;R endby(5)under”;”b P;R b P;R by(2)by(5)under”;”We will also use the shorthand”if b then P end”:if b then P end if b then P else II end6Choice(””)The binary choice P Q means that initially either P or Q is chosen for execution. 
As we have no control over the choice,we must be prepared for the worst case in the following sense:P Q is only defined if both P and Q are defined,and if defined,the result is either that of P or Q.P Q∆P∆Q P Q(1)∆P Q∆P∆QProof L.H.S.∆∆P∆Q P Q by def.””∆P∆Q∆P Q by(1)under”Basics”∆P∆Q∆P∆Q∆distributes over=w of absorption By convention,””binds as strong as””and””.Binary choice has zero,is idem-potent,symmetric(which all follows immediately from the definition),and associative.(2)P P(3)P P P(4)P Q Q P(5)P Q R P Q RProof L.H.S.∆P∆Q R P Q R by def.””∆P∆Q∆R P∆Q∆R Q R by(1),def.””∆P∆Q∆R P Q R by pred.calc.=R.H.S.by repeating the argument Furthermore,restricting one of the operands by a condition is the same as restricting the whole choice by the condition.(6)b P Q b P QProof L.H.S.∆b P∆Q b P Q by def.””b∆P∆Q b P Q by(1)under”Basics”R.H.S.by pred.calc.,def.””Sequential composition distributes through binary choice in both directions.The proofs are given in the appendix.(7)P;Q R P;Q P;R(8)P Q;R P;R Q;RThe following two laws express that choosing between P and Q under a condition is the same asfirst choosing and then evaluating the condition.7(9)if b then P Q else R end if b then P else R end if b then Q else R end(10)if b then P else Q R end if b then P else Q end if b then P else R end Finally,binary choice distributes through the conditional.(11)if b then P else Q end R if b then P R else Q R endThe last three laws are most easily proved by case analysis.The unrestricted choice x P chooses the initial value of x in P arbitrarily.We assume that x is not in the current context.x P x∆P x PThe symbol x is bound by the choice.The rules for renaming bound symbols of quan-tifiers apply here as well.The choice x P is defined only in those initial states for which P will accept any value of x.(12)∆x P x∆PWe give some laws about unrestricted choice without proof.The order by which the value of two symbols are chosen does not matter.(13)x y P y x PChoosing the value of x and then 
choosing between P and Q is the same asfirst choosing between P and Q and then the value of x.(14)x P Q x P x QChoosing a value for x and accepting any(final)value of y in P is equivalent to accept-ing any value of y and choosing a value for x.(15)x y P y x PFinally if x is not free in P,following laws for sequential composition hold:(16)x P;Q P;x Q(17)x Q;P x Q;PVariable Declaration(”var”)The variable declaration var x P extends the current context V by x.We assume that x is not in the current context.The initial value of x is arbitrary and anyfinal value is acceptable.var x P x x PBoth x and x are bound by the variable declaration.var x P is defined if,for any initial value of x,P is defined.8(1)∆var x P x∆PThe order,in which variables are declared,does not matter:(2)var x var y P var y var x PProof L.H.S.x x y y P by def.”var”x y x y P by(15)under””y x y x P by(13)under””,pred.calc.y y x x P by(15)under””=R.H.S.by def.”var”We give some laws about variable declarations without proof.Let R be a specification which does not depend on x being in the current context.(3)var x P Q var x P var x P(4)var x R;P R;var x P(5)var x P;R var x P;R(6)var x x:e II3The Refinement Relation(””)We define an ordering relation on specifications with the intention that P Q holds if Q is an acceptable replacement for P:When expecting P,we cannot tell whether Q has been executed in place of P.P Q∆P∆Q Q P∆P∆Q states that Q has to be defined where P is and∆P Q P states that within the domain of P,Q is at least as deterministic as P.Outside the domain of P, Q may behave arbitrarily.We assign the same lowest binding power as and. 
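The operators introduced so far can be exercised on a small finite state space. The following is a minimal sketch rather than the paper's own formalization. It assumes a context with a single variable x ranging over 0..3, represents a specification as a set of (initial, final) pairs, and reads the definitions off the prose: the domain is the set of initial states related to some final state; P;Q is relational composition restricted to initial states all of whose intermediate states lie in the domain of Q; binary choice is defined only where both operands are defined; and Q refines P when Q is defined wherever P is and, inside the domain of P, allows only outcomes that P allows. All function names are illustrative.

```python
# Specifications over one variable x in STATES, modelled as sets of
# (initial, final) pairs. The encoding and all names are assumptions
# made for illustration only.
STATES = range(4)

def dom(p):
    """Initial states that p relates to at least one final state."""
    return {s for (s, _) in p}

def compose(p, q):
    """Relational composition: the final state of p is the initial state of q."""
    return {(s, u) for (s, t) in p for (t2, u) in q if t == t2}

def seq(p, q):
    """Sequential composition: relational composition restricted to initial
    states from which every choice made by p leads into the domain of q."""
    dq = dom(q)
    safe = {s for s in dom(p) if all(t in dq for (s2, t) in p if s2 == s)}
    return {(s, u) for (s, u) in compose(p, q) if s in safe}

def choice(p, q):
    """Binary choice: defined only where both operands are defined."""
    both = dom(p) & dom(q)
    return {(s, t) for (s, t) in p | q if s in both}

def refines(p, q):
    """True if q is an acceptable replacement for p."""
    dp = dom(p)
    return dp <= dom(q) and all((s, t) in p for (s, t) in q if s in dp)

# P: "x increases" (nondeterministic); Q: "x increases by exactly one".
P = {(s, t) for s in STATES for t in STATES if t > s}
Q = {(s, s + 1) for s in STATES if s + 1 in STATES}

print(refines(P, Q))             # True: Q is more deterministic than P
print(refines(Q, P))             # False: P allows outcomes that Q forbids
print(refines(choice(P, Q), P))  # True: either operand refines the choice
print(sorted(seq(Q, P)))         # [(0, 2), (0, 3), (1, 3)]
print(sorted(seq(P, Q)))         # []: P may always reach 3, where Q is undefined
```

The last line shows the worst-case reading of sequential composition: because P may move to the state 3 from every initial state in its domain, and Q is undefined there, P;Q is undefined everywhere in this model.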
Our confidence in the above definition is reassured by the following basic law about, which we could have taken as the definition as well.(1)P Q P Q PThis means that Q serves all the purposes P does,but it may serve more.Hence we call Q better than P and P worse than Q.The refinement relation is a partial order in that it is reflexive,antisymmetric,transitive,and it has as bottom element.(2)P P(3)P Q Q P P Q(4)P Q Q R P R(5)P9The proofs are straightforward using theorem(1)and the laws given under.There are many laws about,some of which are given below.Their use in program development is for restricting the nondeterminism and for enlarging the domains of specifications.(6)b P P(7)P Q P(8)P Q R P Q P RAgain,they are most easily proved using(1).All the programming operators introduced in the last section are monotonic with respect to in all their specification operands:(9)P QP;R Q;RR;P R;Qif b then P else R end if b then Q else R endif b then R else P end if b then R else Q endP R Q RR P R Qx P x Qvar x P var x QThe proof is most easily carried out using following theorem.Let R X be a specifica-tion over X.(10)R P Q R P R Q R X monotonic in XProof Assuming R distributes over,we calculate:X YX Y X by(1) R X Y R X by Leibnitz R X R Y R X by assumption R X R Y by(1)Theorem(9)follows from the distributivity of over all the programming operators as shown in the last section.4Weakest Precondition(”wp”)Informally,the weakest precondition of a specification P with respect to a condition b characterizes those initial states,for which P is defined and which only relate to states within b.The following theorem supports the informal claim that P;b is the weakest precondition of P with respect to b.(1)P;b P b∆P10Proof L.H.S.P∆b P b by def.”;”P b P b as∆b b =R.H.S.by(5)under”Basics”Before further elaborating the properties of weakest preconditions,we give two small theorems about conditions,which will be useful in proofs.(2)b c b c(3)b c b cWe further underpin the correspondence of 
P;b and wp P b by investigating Dijkstras ”healthiness”properties.The”law of the excluded miracle:wp P”translates to P;,which follows from the fact that is zero of”;”.The monotonicity of wp P b in b with respect to implication is expressed as follows:(4)b c P;b P;cProof L.H.S.b c by(2) P;b P;c as”;”-monotonicR.H.S.by(2) Now consider the property”wp P b c wp P b wp P c”,which translates to (5)P;b c P;b P;cRather than proving this theorem we prove the following generalization,which ex-presses that P;b Q behaves as P;Q restricted to the weakest precondition of P with respect to b.(6)P;b Q P;b P;QProof L.H.S.P b P;Q by(6)under”;”P b∆P P;Q as P;Q∆P =R.H.S.by(1) For practical program development,it is important to be able to calculate the weakest precondition of a given specification with respect to some postcondition.Here we give the laws for weakest preconditions of the programming operators.We assume that x is not in the current context and x is not free in b.(7);b(8)II;b b(9)x:e;b b x:e(10)P;Q;b P;Q;b(11)if c then P else Q end;b if c then P;b else Q;b end11(12)P Q;b P;b Q;b(13)x P;b x P;b(14)var x P;b x P;bLaws(7)to(11)follow directly from laws given in earlier w(12)follows from the distributivity of;over and property(3).For(13)we observe:Proof L.H.S.x P;b by(17)under ∆x P;b for any b:∆b b=R.H.S.by(12)under The proof of(14)follows the same pattern.Finally we state that the weakest precondi-tion with respect to is the domain:(15)∆P P;Our use of weakest preconditions will be for proving properties of iterations.5IterationIn this section we define the iteration”while b do P end”,show that it is well defined, and give the main iteration theorem.The next section applies the main iteration theorem to the linear search.For the definition of the iteration we observe,that the body P may be any specification,and hence unboundedly nondeterministic.It is known that unbounded nondeterminism leads to noncontinuity,which rules out the definition of iteration as a limit of a 
countable sequence of specifications.Hence,we define the iteration as afixed point based on the refinement relation as a partial order with bottom element.We give a self-contained account of the theory.Let W stand for”while b do P end”.As”W if b then P;W end”is an essential property of W,we like to define the iteration W as the solution of the recursive equation (*)Y if b then P;Y endHowever,in case the iteration does not terminate for some states,there may be many solutions.Consider for example”while do II end”:any Y will solve(*).Having de-cided that nontermination is represented by undefinedness,we have to choose the worst (least with respect to the refinement ordering)solution:Q worst Y such that r Y r Q X r X Q Xwhile b do P end worst Y such that Y if b then P;Y endwhere r X is for any specification X either or.We are obviously faced with the question of existence and uniqueness of the solutions,i.e.with the well-definedness of iteration.We do so by giving an equivalent,but explicit definition of iteration.To this end,we introduce the choice over a range of specifications.Let r be for any specification X either or:12X r P X r∆P X r PWe state some consequences of this definition.The next theorem states a relationship between universal quantification and choice over ranges.It can be seen as a generaliza-tion of(8)under””and is used in the proof of the subsequent theorem.We assume that X is not free in Q.(1)Q X r P X r Q PProof L.H.S.∆Q∆X r P X r P Q by def.””∆Q X r∆P X r∆P X r P Qby def.””∆Q X r∆P X r P Q by pred.calc.∆Q X r∆P P Q by pred.calc.X r Q∆P P Q by pred.calc.R.H.S.by def.””(2)Q worst Y such that r Y r Q Q X r X XThe implication from left to right states that if Q is the worst solution of r Q,then r Q holds(trivially),and Q is given by some unique expression.Hence,if a worst solution exists,it is unique.However,there does not necessarily exist a solution Y for r Y,but if one Q is a solution of r Q and Q is X r X X,then the converse implication states 
that it is the worst one.Here is the proof:Proof L.H.S.r Q X r X Q X by def.”worst”r Q Q X r X X by(1)r Q Q X r X X as r Q X r X X QR.H.S.This establishes the uniqueness of worst solutions of r Y.Next we turn our attention to the existence of solutions.We show the existence of solutions only for r Y of the form P Y Y,which suffices our purposes.We do so in two steps,first establishing the existence of solutions for P Y Y,then for P Y Y.We need some properties of the restricted choice:(3)X r P Q X r P X r QThe proof of this theorem is done by simple manipulations in the predicate calculus.It is used for the justification of the following theorem.(4)X r P Q X r P X r QProof Assuming X r P Q,we calculateX r P X r QX r P X r Q X r P by(1)under””X r P Q X r P property of””by assumption and(3)with Q:P Q13(5)P X monotonic in X P X r X X r P XProof Assuming P X monotonic in X,we calculateP X r X X r P XX r P X r X P X by(1)X r X r X X P X monotonic in Xproperty of””(6)P X monotonic in X there exists worst Y such that P Y YProof According to(2)it suffices to show that X P X X X is a solution.P X P X X XX P X X P X by assumption and(5)X P X X X by(4)with r P Q:P X X P X X This result establishes the existence and uniqueness of the worst solution for P Y Y, but iteration is defined by the worst solution of P Y Y.The theorem of Knaster and Tarski does the rest:(7)”Theorem of Knaster-Tarski”If P X monotonic in X,then(a)worst Y such that P Y Y,and(b)worst Y such that P Y Yhave the same solution.Proof According to(6),(b)has a solution.Let it be denoted by Q.Now we show that Q is also the solution of(a),i.e.(c)Q P Q,and(d)X X P X Q XFor(c)we observe:Q P QQ P Q P Q Q as””antisymmetric Q P Q as Q solution of(b) P P Q P Q as X P X X Q X with X:P QP Q Q P X monotonic in Xas Q solution of(b) For(d)we observe for any X:X P XP X X property of””Q X as X P X X X Q hence X X P X Q X.For iteration we observe,that both sequential composition and conditional are mono-tonic,hence”if b then P;X 
end”is monotonic in X,and”worst Y such that Y if b then P;Y end”has a unique solution according to Knaster-Tarski.This establishes the well-definedness of iterations.Next we turn to the question of how to prove properties about iterations.To this end,we need induction over well-founded sets.14(set C partially ordered by is well-founded)(all decreasing chains in C arefinite)(mathematical induction over C is valid)(for any predicate p c over c C:c c C p c c c Cd d C d c p d p c(8)”Principle of mathematical induction”(set C is well-founded)(mathematical induction over C is valid)The definitions above are taken from[3],where also a proof of the principle of mathe-matical induction is given.This givesfinally all the prerequisites for the main theorem about iterations.(9)”Main Iteration Theorem”Let D be a set partially ordered by C a well founded subset of D t a function from the context V to D P and Q specifications and b a condition.Then(a)∆Q t C b(b)c c C∆Q t c b P;t c(c)if b then P;Q end Q(d)while b do P end QThe function t is known as the termination rmally,if t is not in C(but in D), then according to requirement(a)the iteration is sure to terminate as b,the condition for termination holds.Otherwise,if t is in C,then the iteration might or might not terminate.However,in case it does not,requirement(b)ensures that t will be decreased by the body of the iteration,and hence eventually terminate:as C is well founded,all decreasing chains in C arefinite.Finally,requirement(c)states that the iteration does indeed compute the desired result:Q is a solution of equation defining the iteration. 
Proof For the purpose of the proof we define W byW while b do P endand conclude by”unfolding”the iteration:(e)W if b then P;W end by def.”while”The proof is carried out by showing separately(f)∆Q t C W t C Q(g)∆Q t C W t C Q(h)∆Q Wwhich together,by the principle of case analysis,establish(d).(f)caters for the case that the iteration terminates immediately,(g)for the case that it will terminate sometime,and (g)for the case that it is undefined.For(f)we observe:t C∆Q W t C Qt C∆Q b W t C∆Q b Q by(a) t C∆Q b II t C∆Q b II by(c),(e),and(3)under”if”15For the proof of(g)and(h)we need a couple of lemmas:(i)b P;∆Q∆Q(j)P;∆Q∆Q(k)b∆QThey are all consequences of requirement(c):if b then P;Q end Q∆if b then P;Q end∆Q rule of Leibnitz b∆P;Q b∆II∆Q by(1)under”if”,def.”if”b P;∆Q b∆Q by(15),(10)under”wp”P;∆Q b∆Q law of absorption P;∆Q∆Q b∆Q pred.calc.(i)is equivalent to the second last line.Now,for(g)we observet C∆Q W t C Qc c C t c∆Q W t c Q by pred.calc. According to the principle of mathematical induction,this is proved by deriving (l)t c∆Q W t c Qfrom(m)d d C d c t d∆Q W t d Qfor any c C.To this end,we calculate:d d C d c t d∆Q W t d Qt c t C∆Q W t c t C Q by pred.calc.t c∆Q W t c Q by(f) t c∆Q if b then P;t c∆Q W endt c∆Q if b then P;t c Q end by rule of Leibnitz t c∆Q if b then P;t c P;∆Q P;W endt c∆Q if b then P;t c P;Q end by(6)under”wp”t c∆Q if b then P;W endt c∆Q if b then P;Q end by(b),(i),and(4),(5)under”if”t c∆Q W t c Q by(e)and(c) For(h)we calculate:∆Q WW∆Q by pred.calc.∆W∆Q as for any P b:P b∆P b ∆X X if b then P;X end X∆Q property of”while”X X if b then P;X end∆X∆Q property of””X X if b then P;X end∆X∆Q by pred.calc.b II if b then P;b II end as∆b II∆Q(k)b II if b then P;b P end by(6)under”wp”b II if b then∆Q P;b P end as P;b P;∆Q∆Q by(k),(j)b II b∆Q P;b P b II by def.”if”as b∆Q by(k) This concludes the proof of the main iteration theorem.16。
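The weakest precondition laws and the fixed-point definition of iteration can be tried out on a finite model. The sketch below is an illustration under stated assumptions, not the construction used in the proofs. It assumes a single variable x ranging over 0..4, specifications as sets of (initial, final) pairs, and conditions as sets of states. On a finite state space nondeterminism is bounded, so the worst solution of Y = if b then P;Y end can be reached by iterating from the everywhere-undefined specification (the empty set); that shortcut is not available for the unboundedly nondeterministic bodies the text allows. All names are hypothetical.

```python
# Finite model: one variable x in STATES; a specification is a set of
# (initial, final) pairs, a condition is a set of states. Illustrative only.
STATES = range(5)
SKIP = {(s, s) for s in STATES}

def dom(p):
    """Initial states that p relates to at least one final state."""
    return {s for (s, _) in p}

def seq(p, q):
    """Sequential composition with the worst-case treatment of choices in p."""
    dq = dom(q)
    safe = {s for s in dom(p) if all(t in dq for (s2, t) in p if s2 == s)}
    return {(s, u) for (s, t) in p for (t2, u) in q if t == t2 and s in safe}

def cond(b, p, q):
    """if b then p else q end."""
    return {(s, t) for (s, t) in p if s in b} | {(s, t) for (s, t) in q if s not in b}

def wp(p, b):
    """Initial states where p is defined and relates only to states in b."""
    return {s for s in dom(p) if all(t in b for (s2, t) in p if s2 == s)}

def refines(p, q):
    """True if q is an acceptable replacement for p."""
    dp = dom(p)
    return dp <= dom(q) and all((s, t) in p for (s, t) in q if s in dp)

def while_loop(b, body):
    """Worst solution of Y = if b then body;Y end, by iteration from bottom."""
    y = set()  # the everywhere-undefined specification
    while True:
        nxt = cond(b, seq(body, y), SKIP)
        if nxt == y:
            return y
        y = nxt

POSITIVE = {s for s in STATES if s > 0}
DECREASE = {(s, t) for s in STATES for t in STATES if t < s}  # nondeterministic body
TARGET = {(s, 0) for s in STATES}                             # "x' = 0"

W = while_loop(POSITIVE, DECREASE)
print(refines(TARGET, W))                      # True
print(wp(DECREASE, {0}))                       # {1}
print(wp(DECREASE, set(STATES)) == dom(DECREASE))  # True
print(while_loop(set(STATES), SKIP))           # set()
```

Here x itself plays the role of the termination function: the body strictly decreases it while x > 0. The last line reproduces the "while true do II end" example from the definition of iteration: every specification solves the recursive equation, and the worst solution is the one that is undefined everywhere.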
Insight Problem Solving A Critical Examination of the Possibility
The Journal of Problem Solving • volume 5, no. 1 (Fall 2012)

Insight Problem Solving: A Critical Examination of the Possibility of Formal Theory

William H. Batchelder 1 and Gregory E. Alexander 1

Abstract

This paper provides a critical examination of the current state and future possibility of formal cognitive theory for insight problem solving and its associated “aha!” experience. Insight problems are contrasted with move problems, which have been formally defined and studied extensively by cognitive psychologists since the pioneering work of Allen Newell and Herbert Simon. To facilitate our discussion, a number of classical brainteasers are presented along with their solutions and some conclusions derived from observing the behavior of many students trying to solve them. Some of these problems are interesting in their own right, and many of them have not been discussed before in the psychological literature. The main purpose of presenting the brainteasers is to assist in discussing the status of formal cognitive theory for insight problem solving, which is argued to be considerably weaker than that found in other areas of higher cognition such as human memory, decision-making, categorization, and perception. We discuss theoretical barriers that have plagued the development of successful formal theory for insight problem solving. A few suggestions are made that might serve to advance the field.

Keywords: Insight problems, move problems, modularity, problem representation

1 Department of Cognitive Sciences, University of California Irvine

/10.7771/1932-6246.1143

1. Introduction

This paper discusses the current state and a possible future of formal cognitive theory for insight problem solving and its associated “aha!” experience. Insight problems are contrasted with so-called move problems defined and studied extensively by Allen Newell and Herbert Simon (1972). 
These authors provided a formal, computational theory for such problems called the General Problem Solver (GPS), and this theory was one of the first formal information processing theories to be developed in cognitive psychology. A move problem is posed to solvers in terms of a clearly defined representation consisting of a starting state, a description of the goal state(s), and operators that allow transitions from one problem state to another, as in Newell and Simon (1972) and Mayer (1992). A solution to a move problem involves applying operators successively to generate a sequence of transitions (moves) from the starting state through intermediate problem states and finally to a goal state. Move problems will be discussed more extensively in Section 4.6.

In solving move problems, insight may be required for selecting productive moves at various states in the problem space; however, for our purposes we are interested in the sorts of problems that are described often as insight problems. Unlike Newell and Simon’s formal definition of move problems, there has not been a generally agreed upon definition of an insight problem (Ash, Jee, and Wiley, 2012; Chronicle, MacGregor, and Ormerod, 2004; Chu and MacGregor, 2011). It is our view that it is not productive to attempt a precise logical definition of an insight problem, and instead we offer a set of shared defining characteristics in the spirit of Wittgenstein’s (1958) definition of ‘game’ in terms of family resemblances. Problems that we will treat as insight problems share many of the following defining characteristics: (1) They are posed in such a way as to admit several possible problem representations, each with an associated solution search space. (2) Likely initial representations are inadequate in that they fail to allow the possibility of discovering a problem solution. (3) In order to overcome such a failure, it is necessary to find an alternative productive representation of the problem. 
(4) Finding a productive problem representation may be facilitated by a period of non-solving activity called incubation, and also it may be potentiated by well-chosen hints. (5) Once obtained, a productive representation leads quite directly and quickly to a solution. (6) The solution involves the use of knowledge that is well known to the solver. (7) Once the solution is obtained, it is accompanied by a so-called “aha!” experience. (8) When a solution is revealed to a non-solver, it is grasped quickly, often with a feeling of surprise at its simplicity, akin to an “aha!” experience.

It is our position that very little is known empirically or theoretically about the cognitive processes involved in solving insight problems. Furthermore, this lack of knowledge stands in stark contrast with other areas of cognition such as human memory, decision-making, categorization, and perception. These areas of cognition have a large number of replicable empirical facts, and many formal theories and computational models exist that attempt to explain these facts in terms of underlying cognitive processes. The main goal of this paper is to explain the reasons why it has been so difficult to achieve a scientific understanding of the cognitive processes involved in insight problem solving.

There have been many scientific books and papers on insight problem solving, starting with the seminal work of the Gestalt psychologists Köhler (1925), Duncker (1945), and Wertheimer (1954), as well as the English social psychologist, Wallas (1926). Since the contributions of the early Gestalt psychologists, there have been many journal articles, a few scientific books, such as those by Sternberg and Davidson (1996) and Chu (2009), and a large number of books on the subject by laypersons. 
Most recently, two excellent critical reviews of insight problem solving have appeared: Ash, Cushen, and Wiley (2009) and Chu and MacGregor (2011).

The approach in this paper is to discuss, at a general level, the nature of several fundamental barriers to the scientific study of insight problem solving. Rather than criticizing particular experimental studies or specific theories in detail, we try to step back and take a look at the area itself. In this effort, we attempt to identify principled reasons why the area of insight problem solving is so resistant to scientific progress. To assist in this approach we discuss and informally analyze eighteen classical brainteasers in the main sections of the paper. These problems are among many that have been posed to hundreds of upper divisional undergraduate students in a course titled “Human Problem Solving” taught for many years by the senior author. Only the first two of these problems can be regarded strictly as move problems in the sense of Newell and Simon, and most of the rest share many of the characteristics of insight problems as described earlier.

The paper is divided into five main sections. After the Introduction, Section 2 describes the nature of the problem solving class. Section 3 poses the eighteen brainteasers that will be discussed in later sections of the paper. The reader is invited to try to solve these problems before checking out the solutions in the Appendix. Section 4 lays out six major barriers to developing a deep scientific theory of insight problem solving that we believe are endemic to the field. We argue that these barriers are not present in other, more theoretically advanced areas of higher cognition such as human memory, decision-making, categorization, and perception. These barriers include the lack of many experimental paradigms (4.1), the lack of a large, well-classified set of stimulus material (4.2), and the lack of many informative behavioral measures (4.3). 
In addition, it is argued that insight problem solving is difficult to study because it is non-modular, not only in the sense of Fodor (1983) but, more importantly, in several weaker senses of modularity that admit other areas of higher cognition (4.4), because of the lack of theoretical generalizations about insight problem solving from experiments with particular insight problems (4.5), and because of the lack of computational theories of human insight (4.6). Finally, in Section 5, we suggest several avenues that may help overcome some of the barriers described in Section 4. These include suggestions for useful classes of insight problems (5.1), suggestions for experimental work with expert problem solvers (5.2), and some possibilities for a computational theory of insight.

2. Batchelder’s Human Problem Solving Class

The senior author, William Batchelder, has taught an upper divisional undergraduate course called “Human Problem Solving” for over twenty-five years to classes ranging in size from 75 to 100 students. By way of background, his active research is in other areas of the cognitive sciences; however, he maintains a long-term hobby of studying classical brainteasers. In the area of complex games, he achieved the title of Senior Master from the United States Chess Federation, he was an active duplicate bridge player throughout undergraduate and graduate school, and he also achieved a reasonable level of skill in the game of Go.

The content of the problem-solving course is split into two main topics. The first topic involves encouraging students to try their hand at solving a number of famous brainteasers drawn from the sizeable folklore of insight problems, especially the work of Martin Gardner (1978, 1982), Sam Loyd (1914), and Raymond Smullyan (1978). In addition, games like chess, bridge, and Go are discussed. 
The second topic involves presenting the psychological theory of thinking and problem solving, and in most cases the material is organized around developments in topics that are covered in the first eight chapters of Mayer (1992). These topics include work of the Gestalt psychologists on problem solving, discussion of experiments and theories concerning induction and deduction, presenting the work on move problems, including the General Problem Solver (Newell & Simon, 1972), showing how response time studies can reveal mental architectures, and describing theories of memory representation and question answering.

Despite efforts, the structure of the course does not reflect a close overlap between its two main topics. The principal reason for this is that in our view the level of theoretical and empirical work on insight problem solving is at a substantially lower level than is the work in almost any other area of cognition dealing with higher processes. The main goal of this paper is to explain our reasons for this pessimistic view. To assist in this goal, it is helpful to get some classical brainteasers on the table. While most of these problems have not been used in experimental studies, the senior author has experienced the solution efforts and post solution discussions of over 2,000 students who have grappled with these problems in class.

3. Some Classic Brainteasers

In this section we present eighteen classical brainteasers from the folklore of problem solving that will be discussed in the remainder of the paper. These problems have delighted brainteaser connoisseurs for years, and most are capable of giving the solver a large dose of the “aha!” experience. There are numerous collections of these problems in books, and many collections of them are accessible through the Internet. We have selected these problems because they, and others like them, pose a real challenge to any effort to develop a deep and general formal theory of human or machine insight problem solving. With the exception of Problems 3.1 and 3.2, and arguably 3.6, the problems are different in important respects from so-called move problems of Newell and Simon (1972) described earlier and in Section 4.6.

Most of the problems posed in this section share many of the defining characteristics of insight problems described in Section 1. In particular, they do not involve multiple steps, they require at most a very minimal amount of technical knowledge, and most of them can be solved by one or two fairly simple insights, albeit insights that are rarely achieved in real time by problem solvers. What makes these problems interesting is that they are posed in such a way as to induce solvers to represent the problem information in an unproductive way. Then the main barrier to finding a solution to one of these problems is to overcome a poor initial problem representation. This may involve such things as a re-representation of the problem, the dropping of an implicit constraint on the solution space, or seeing a parallel to some other similar problem. If the solver finds a productive way of viewing the problem, the solution generally follows rapidly and comes with a burst of insight, namely the “aha!” experience. In addition, when non-solvers are given the solution they too may experience a burst of insight.

What follows next are statements of the eighteen brainteasers. The solutions are presented in the Appendix, and we recommend that, after whatever problem solving activity a reader wishes to engage in, the Appendix be studied before reading the remaining two sections of the paper. As we discuss each problem in the paper, we provide authorship information where authorship is known. In addition, we rephrased some of the problems from their original sources.

Problem 3.1. Imagine you have an 8-inch by 8-inch array of 1-inch by 1-inch little squares. 
You also have a large box of 2-inch by 1-inch rectangular shaped dominoes. Of course it is easy to tile the 64 little squares with dominoes in the sense that every square is covered exactly once by a domino and no domino is hanging off the array. Now suppose the upper right and lower left corner squares are cut off the array. Is it possible to tile the new configuration of 62 little squares with dominoes allowing no overlaps and no overhangs?

Problem 3.2. A 3-inch by 3-inch by 3-inch cheese cube is made of 27 little 1-inch cheese cubes of different flavors so that it is configured like a Rubik’s cube. A cheese-eating worm devours one of the top corner cubes. After eating any little cube, the worm can go on to eat any adjacent little cube (one that shares a wall). The middlemost little cube is by far the tastiest, so our worm wants to eat through all the little cubes finishing last with the middlemost cube. Is it possible for the worm to accomplish this goal? Could he start with eating any other little cube and finish last with the middlemost cube as the 27th?

Figure 1. The cheese eating worm problem.

Problem 3.3. You have ten volumes of an encyclopedia numbered 1, . . . ,10 and shelved in a bookcase in sequence in the ordinary way. Each volume has 100 pages, and to simplify suppose the front cover of each volume is page 1 and numbering is consecutive through page 100, which is the back cover. You go to sleep and in the middle of the night a bookworm crawls onto the bookcase. It eats through the first page of the first volume and eats continuously onwards, stopping after eating the last page of the tenth volume. How many pieces of paper did the bookworm eat through?

Figure 2. Bookcase setup for the Bookworm Problem.

Problem 3.4. Suppose the earth is a perfect sphere, and an angel fits a tight gold belt around the equator so there is no room to slip anything under the belt. 
The angel has second thoughts and adds an inch to the belt, and fits it evenly around the equator. Could you slip a dime under the belt?

• volume 5, no. 1 (Fall 2012)
W. H. Batchelder and G. E. Alexander

Problem 3.5. Consider the cube in Figure 1 and suppose the top and bottom surfaces are painted red and the other four sides are painted blue. How many little cubes have at least one red and at least one blue side?

Problem 3.6. Look at the nine dots in Figure 3. Your job is to take a pencil and connect them using only three straight lines. Retracing a line is not allowed and removing your pencil from the paper as you draw is not allowed. Note the usual nine-dot problem requires you to do it with four lines; you may want to try that stipulation as well.

Figure 3. The setup for the Nine-Dot Problem.

Problem 3.7. You are standing outside a light-tight, well-insulated closet with one door, which is closed. The closet contains three light sockets each containing a working light bulb. Outside the closet, there are three on/off light switches, each of which controls a different one of the sockets in the closet. All switches are off. Your task is to identify which switch operates which light bulb. You can turn the switches off and on and leave them in any position, but once you open the closet door you cannot change the setting of any switch. Your task is to figure out which switch controls which light bulb while you are only allowed to open the door once.

Figure 4. The setup of the Light Bulb Problem.

Problem 3.8. We know that any finite string of symbols can be extended in infinitely many ways depending on the inductive (recursive) rule; however, many of these ways are not ‘reasonable’ from a human perspective. With this in mind, find a reasonable rule to continue the following series:

A BCD EF G HI JKLM.............

Problem 3.9. You have two quart-size beakers labeled A and B. Beaker A has a pint of coffee in it and beaker B has a pint of cream in it. First you take a tablespoon of coffee from A and pour it in B. After mixing the contents of B thoroughly you take a tablespoon of the mixture in B and pour it back into A, again mixing thoroughly. After the two transfers, which beaker, if either, has a less diluted (more pure) content of its original substance: coffee in A or cream in B? (Forget any issues of chemistry such as miscibility.)

Figure 5. The setup of the Coffee and Cream Problem.

Problem 3.10. There are two large jars, A and B. Jar A is filled with a large number of blue beads, and Jar B is filled with the same number of red beads. Five beads from Jar A are scooped out and transferred to Jar B. Someone then puts a hand in Jar B and randomly grabs five beads from it and places them in Jar A. Under what conditions after the second transfer would there be the same number of red beads in Jar A as there are blue beads in Jar B?

Problem 3.11. Two trains A and B leave their train stations at exactly the same time, and, unaware of each other, head toward each other on a straight 100-mile track between the two stations. Each is going exactly 50 mph, and they are destined to crash. At the time the trains leave their stations, a SUPERFLY takes off from the engine of train A and flies directly toward train B at 100 mph. When he reaches train B, he turns around instantly, continuing at 100 mph toward train A. The SUPERFLY continues in this way until the trains crash head-on, and at the very last moment he slips out to live another day. How many miles does the SUPERFLY travel on his zigzag route by the time the trains collide?

Problem 3.12. George lives at the foot of a mountain, and there is a single narrow trail from his house to a campsite on the top of the mountain. At exactly 6 a.m. 
on Saturday he starts up the trail, and without stopping or backtracking arrives at the top before 6 p.m. He pitches his tent, stays the night, and the next morning, on Sunday, at exactly 6 a.m., he starts down the trail, hiking continuously without backtracking, and reaches his house before 6 p.m. Must there be a time of day on Sunday when he was exactly at the same place on the trail as he was at that time on Saturday? Could there be more than one such place?

Problem 3.13. You are driving up and down a mountain that is 20 miles up and 20 miles down. You average 30 mph going up; how fast would you have to go coming down the mountain to average 60 mph for the entire trip?

Problem 3.14. During a recent census, a man told the census taker that he had three children. The census taker said that he needed to know their ages, and the man replied that the product of their ages was 36. The census taker, slightly miffed, said he needed to know each of their ages. The man said, “Well the sum of their ages is the same as my house number.” The census taker looked at the house number and complained, “I still can’t tell their ages.” The man said, “Oh, that’s right, the oldest one taught the younger ones to play chess.” The census taker promptly wrote down the ages of the three children. How did he know, and what were the ages?

Problem 3.15. A closet has two red hats and three white hats. Three participants and a Gamesmaster know that these are the only hats in play. Man A has two good eyes, man B only one good eye, and man C is blind. The three men sit on chairs facing each other, and the Gamesmaster places a hat on each man’s head, in such a way that no man can see the color of his own hat. The Gamesmaster offers a deal, namely if any man correctly states the color of his hat, he will get $50,000; however, if he is in error, then he has to serve the rest of his life as an indentured servant to the Gamesmaster. 
Man A looks around and says, “I am not going to guess.” Then Man B looks around and says, “I am not going to guess.” Finally Man C says, “From what my friends with eyes have said, I can clearly see that my hat is _____”. He wins the $50,000, and your task is to fill in the blank and explain how the blind man knew the color of his hat.

Problem 3.16. A king dies and leaves an estate, including 17 horses, to his three daughters. According to his will, everything is to be divided among his daughters as follows: 1/2 to the oldest daughter, 1/3 to the middle daughter, and 1/9 to the youngest daughter. The three heirs are puzzled as to how to divide the horses among themselves, when a probate lawyer rides up on his horse and offers to assist. He adds his horse to the king’s horses, so there will be 18 horses. Then he proceeds to divide the horses among the daughters. The oldest gets 1/2 of the horses, which is 9; the middle daughter gets 6 horses, which is 1/3 of the horses, and the youngest gets 2 horses, 1/9 of the lot. That’s 17 horses, so the lawyer gets on his own horse and rides off with a nice commission. How was it possible for the lawyer to solve the heirs’ problem and still retain his own horse?

Problem 3.17. A logical wizard offers you the opportunity to make one statement: if it is false, he will give you exactly ten dollars, and if it is true, he will give you an amount of money other than ten dollars. Give an example of a statement that would be sure to make you rich.

Problem 3.18. Discover an interesting sense of the claim that it is in principle impossible to draw a perfect map of England while standing in a London flat; however, it is not in principle impossible to do so while living in a New York City Pad.

4. 
Barriers to a Theory of Insight Problem Solving

As mentioned earlier, our view is that there are a number of theoretical barriers that make it difficult to develop a satisfactory formal theory of the cognitive processes in play when humans solve classical brainteasers of the sort posed in Section 3. Further, these barriers seem almost unique to insight problem solving in comparison with the more fully developed higher process areas of the cognitive sciences such as human memory, decision-making, categorization, and perception. Indeed it seems uncontroversial to us that neither human nor machine insight problem solving is well understood, and compared to other higher process areas in psychology, it is the least developed area both empirically and theoretically.

There are two recent comprehensive critical reviews concerning insight problem solving by Ash, Cushen, and Wiley (2009) and Chu and MacGregor (2011). These articles describe the current state of empirical and theoretical work on insight problem solving, with a focus on experimental studies and theories of problem restructuring. In our view, both reviews are consistent with our belief that there has been very little sustainable progress in achieving a general scientific understanding of insight. Particularly striking is that there are no established general, formal theories or models of insight problem solving. By a general formal model of insight problem solving we mean a set of clearly formulated assumptions that lead formally or logically to precise behavioral predictions over a wide range of insight problems. 
Such a formal model could be posed in terms of a number of formal languages including information processing assumptions, neural networks, computer simulation, stochastic assumptions, or Bayesian assumptions.

Since the groundbreaking work by the Gestalt psychologists on insight problem solving, there have been theoretical ideas that have been helpful in explaining the cognitive processes at play in solving certain selected insight problems. Among the earlier ideas are Luchins’ concept of einstellung (blind spot) and Duncker’s functional fixedness, as in Maher (1992). More recently, there have been two developed theoretical ideas: (1) Criterion for Satisfactory Progress theory (Chu, Dewald, & Chronicle, 2007; MacGregor, Ormerod, & Chronicle, 2001), and (2) Representational Change Theory (Knoblich, Ohlsson, Haider, & Rhenius, 1999). We will discuss these theories in more detail in Section 4. While it is arguable that these theoretical ideas have done good work in understanding in detail a few selected insight problems, we argue that it is not at all clear how these ideas can be generalized to constitute a formal theory of insight problem solving at anywhere near the level of generality that has been achieved by formal theories in other areas of higher process cognition.

The dearth of formal theories of insight problem solving is in stark contrast with other areas of problem solving discussed in Section 4.6, for example move problems discussed earlier and the more recent work on combinatorial optimization problems such as the two-dimensional traveling salesman problem (MacGregor and Chu, 2011). In addition, most other higher process areas of cognition are replete with a variety of formal theories and models. For example, in the area of human memory there are currently a very large number of formal, information processing models, many of which have evolved from earlier mathematical models, as in Norman (1970). 
In the area of categorization, there are currently several major formal theories along with many variations that stem from earlier theories discussed in Ashby (1992) and Estes (1996). In areas ranging from psycholinguistics to perception, there are a number of formal models based on brain-style computation stemming from Rumelhart, McClelland, and the PDP Research Group’s (1987) classic two-volume book on parallel distributed processing. Since Daniel Kahneman’s 2002 Nobel Memorial Prize in the Economic Sciences for work jointly with Amos Tversky developing prospect theory, as in Kahneman and Tversky (1979), psychologically based formal models of human decision-making have been a major theoretical area in cognitive psychology. In our view, there is nothing in the area of insight problem solving that approaches the depth and breadth of formal models seen in the areas mentioned above.

In the following subsections, we will discuss some of the barriers that have prevented the development of a satisfactory theory of insight problem solving. Some of the barriers will be illustrated with references to the problems in Section 3. Then, in Section 5 we will assuage our pessimism a bit by suggesting how some of these barriers might be removed in future work to facilitate the development of an adequate theory of insight problem solving.

4.1 Lack of Many Experimental Paradigms

There are not many distinct experimental paradigms to study insight problem solving. The standard paradigm is to pick a particular problem, such as one of the ones in Section 3, and present it to several groups of subjects, perhaps in different ways. For example, groups may differ in the way a hint is presented, a diagram is provided, or an instruction
Splitting the Curvature of the Determinant Line Bundle
geometry in the case of a family of elliptic boundary value problems in dimension one.

2. Grassmann Sections and the Determinant Bundle

Let $\pi : Z \to B$ be a smooth fibration of manifolds with fibre diffeomorphic to a closed connected even-dimensional manifold $M$. The tangent bundle along the fibres $T(Z/B)$ is taken to be oriented, spin and endowed with a Riemannian metric $g_{T(Z/B)}$. Let $S(Z/B)$ be the vertical spinor bundle and $E$ a Hermitian coefficient bundle. Associated to this data one has a smooth elliptic family of Dirac operators $D = \{D_b : b \in B\} : H \to H$, where $H = \pi_*(S(Z/B) \otimes E)$ is the infinite-dimensional Hermitian vector bundle on $B$ whose fibre at $b$ is the Fréchet space of smooth sections $H_b = C^\infty(M_b, S_b \otimes E_b)$. The $\mathbb{Z}_2$ bundle grading $H = F^+ \oplus F^-$ into positive and negative chirality fields defines families of chiral Dirac operators $D^\pm : F^\pm \to F^\mp$ and hence two families of finite-dimensional vector spaces $\ker(D^\pm) = \{\ker(D_b^\pm) : b \in B\}$. The Quillen determinant line bundle $\mathrm{Det\,Ind}\,D^+$ is a complex line bundle over $B$ with fibre at $b$ canonically isomorphic to the complex line $\wedge^{\max}(\mathrm{Ker}\,D_b^+)^{-1} \otimes \wedge^{\max}\mathrm{Ker}(D_b^-)$ (see [2, 8]). A connection on HY is defined as follows [1, 2]. A connection on the fibration means a splitting

$$TZ = T(Z/B) \oplus T^H Z \tag{1}$$

and specifies an isomorphism $T^H Z \cong \pi^* TB$. A choice of metric $g_{TB}$ on $TB$ determines a metric $g_{TZ} = g_{T(Z/B)} \oplus \pi^* g_{TB}$ on $TZ$ with Levi-Civita connection $\nabla^{TZ}$. We define a connection on $T(Z/B)$ by $\nabla^{T(Z/B)} = P_{Z/B} \nabla^{TZ} P_{Z/B} + c$, where $P_{Z/B}$ is the orthogonal projection on $TZ$ with range $T(Z/B)$ and $c$ is the 1-form on $Z$ defined by $[\tfrac{1}{2}\, i_\xi\, d(\mathrm{vol}_{Z/B})]_V = c(\xi)\,\mathrm{vol}_{Z/B}$. Here $\mathrm{vol}_{Z/B}$ is the volume form defined by $g_{T(Z/B)}$ regarded as an $n$-form on $Z$ and $[\,\cdot\,]_V$ means the vertical component. 
The connection $\nabla^{T(Z/B)}$ lifts to a connection on the vertical spinor bundle, and endowing $E$ with a connection compatible with its metric we obtain a connection $\nabla^{S(Z/B)}$ on each of the bundles $S(Z/B) \otimes E$, $S^\pm(Z/B) \otimes E$. Because a section of the infinite-dimensional bundle $H$ is identified as a section of $S(Z/B) \otimes E$ we have a unitary connection $\nabla^Z$ on $H$, $F^\pm$ (2) ∇Z ξ s = ∇ξH
NUMERICAL EXPERIMENTS WITH PARALLEL ORDERINGS FOR ILU PRECONDITIONERS
MICHELE BENZI, WAYNE JOUBERT, AND GABRIEL MATEESCU

Abstract. Incomplete factorization preconditioners such as ILU, ILUT and MILU are well-known robust general-purpose techniques for solving linear systems on serial computers. However, they are difficult to parallelize efficiently. Various techniques have been used to parallelize these preconditioners, such as multicolor orderings and subdomain preconditioning. These techniques may degrade the performance and robustness of ILU preconditionings. The purpose of this paper is to perform numerical experiments to compare these techniques in order to assess what are the most effective ways to use ILU preconditioning for practical problems on serial and parallel computers.

Key words. Krylov subspace methods, preconditioning, incomplete factorizations, sparse matrix orderings, additive Schwarz methods, parallel computing.

AMS subject classifications.

1. Introduction.

1.1. Motivation and focus. Krylov subspace methods [21] are customarily employed for solving linear systems arising from modeling large-scale scientific problems. The convergence of these iterative methods can be improved by preconditioning the linear system $Ax = b$. The preconditioned system $M^{-1}Ax = M^{-1}b$ (for left preconditioning) can be solved faster than the original system if the preconditioner $M$ is an efficient and good approximation of $A$; efficient, in the sense that the cost of solving $Mu = v$ is much smaller than the cost of solving $Au = v$, and good in the sense that the convergence rate for the preconditioned iteration is significantly faster than for the unpreconditioned one.

Incomplete factorization techniques [24], [17], [20] provide a good preconditioning strategy for solving linear systems with Krylov subspace methods. Usually, however, simply applying this strategy to the full naturally ordered linear system leads to a method with little parallelism. Incomplete factorization is also useful as an approximate subdomain solver in domain decomposition-based 
preconditioners, such as Additive Schwarz Method (ASM) preconditioning [23].

In this paper we study the effect of the following algorithm parameters on the convergence of preconditioned Krylov subspace methods:
Symmetric reorderings of the matrix: this applies to incomplete factorization (ILU) preconditioners;
Subdomain overlap: this applies to Additive Schwarz preconditioners.

Symmetric permutations of the linear system were first used in direct factorization solution methods for reducing the operation count and memory requirements. For example, the Minimum Degree ordering is effective for direct solvers in that it tends to reduce the number of nonzeros of the $L$ and $U$ factors [12]. Other reorderings can have a beneficial effect on incomplete factorizations employed as preconditioners, e.g., by providing a parallel preconditioner [21]; the storage for the preconditioner is typically controlled by the incomplete factorization scheme. However, parallel orderings may degrade the convergence rate, and allowing fill may diminish the parallelism of the solver.

In this paper, we consider structurally symmetric matrices arising from partial differential equations (PDEs) discretized on structured grids using finite differences. We focus on symmetric permutations as they represent similarity transformations and preserve the spectrum of the linear system matrix. Furthermore, symmetric permutations preserve the structural symmetry of a matrix and the set of diagonal elements.

Additive Schwarz methods [23] derive a preconditioner by decomposing the problem domain into a number of overlapping subdomains, (approximately) solving each subdomain, and summing the contributions of the subdomain solves. Variants of ASM are obtained by varying the amount of overlap and the subdomain solvers. When the subdomains are solved approximately using an incomplete factorization, the resulting preconditioner can be thought of as a “parallel ILU” strategy. Like block Jacobi, ASM has good parallelism 
and locality, but these advantages could be offset by a high iteration count of the underlying Krylov subspace method.

1.2. Related work. A lexicographic ordering of the grid points in a regular two- or three-dimensional grid is obtained by scanning the grid nodes and assigning numbers to the nodes in the order in which they are seen in the scanning. A widely used ordering is the Natural Order (NO), which is the order induced by labeling the grid nodes from the bottom up, one horizontal line at a time and (for three-dimensional grids) scanning consecutive vertical planes.

Several reorderings have been considered in the literature as alternatives to the natural order. Among these are Minimum Degree (MD), Multiple Minimum Degree (MMD), Reverse Cuthill-McKee (RCM), Nested Dissection (ND), and Multicoloring (MCL). For a description of these reorderings see [6], [11], [12], [16], [21]. MMD, MD, and ND reorderings attempt to minimize the fill in the factors, while RCM reduces the bandwidth of the matrix.

The degree of parallelism (DOP) of a preconditioning algorithm is the number of processors that can work simultaneously on constructing or applying the preconditioner. MCL provides large grain parallelism for no-fill incomplete LU factorization [21]. With multicoloring, the linear system is partitioned into subsets of equations that do not have any internal dependencies between unknowns.

For direct solvers, fill-reducing (ND, MD, MMD) and bandwidth-reducing (RCM) reorderings are superior to NO [12]. The usefulness of these reorderings for incomplete factorization preconditioners is not well established.

A simple incomplete factorization of a matrix $A$ is $A = LU + R$, where the triangular matrices $L$ and $U$ have the same nonzero structure as the lower and upper triangular parts of $A$, respectively, and $R$ is the residual matrix. This strategy is known as no-fill ILU, and is denoted by ILU(0). For symmetric positive definite (SPD) problems, $U = L^T$ and the factorization is called no-fill incomplete Cholesky, denoted by IC(0). One can attempt to improve the 
effectiveness of an incomplete LU factorization by allowing fill-in in the triangular factors $L$, $U$. The ILU(1) method allows nonzero entries for the elements with level of fill at most one (see [21], pp. 278–281); the corresponding factorization for SPD problems is denoted by IC(1). A more sophisticated preconditioner is a dual-dropping ILU preconditioner [20], denoted by ILUT($\tau$, $p$), where $\tau$ is the dropping threshold and $p$ is the maximum number of nonzeros of fill allowed in a row above those present in the original matrix.

The effect of reorderings on the performance of ILU-type preconditioners has been studied by Duff and Meurant [7], Benzi et al. [2], and Saad [19], among others. Duff and Meurant have studied the impact of reorderings on incomplete factorization preconditioning for SPD problems. The sources of inaccuracy in incomplete factorizations and the effect of reorderings on accuracy and stability have been analyzed by Chow and Saad [4] and Benzi et al. [2].

Let $\bar{A} = LU$. The residual matrix $R = A - \bar{A}$ measures the accuracy of the incomplete factorization. Let $\|\cdot\|_F$ denote the Frobenius norm of a matrix. Chow and Saad [4] and Benzi et al. [2] have shown that for nonsymmetric problems $\|R\|_F$ may be an insufficient characterization of the convergence of the preconditioned iterative methods. This is in contrast with symmetric positive definite problems where $\|R\|_F$ provides a good measure of convergence. These authors have shown that the stability of the triangular solves is gauged by the Frobenius norm of the deviation from identity matrix, $E = I - A\bar{A}^{-1}$. For ILU(0), $\|R\|_F$ can be small, while $\|E\|_F$ may be very large, i.e., $\bar{A}$ can be very ill-conditioned. Note that $E = -R\bar{A}^{-1}$.

Benzi et al. [2] have shown that RCM and MD reorderings can be beneficial for problems that are highly nonsymmetric and far from diagonally dominant. Specifically, MD has the effect of stabilizing the ILU(0) triangular factors, while RCM improves the accuracy of incomplete factorizations with fill.

Saad [19] has shown that improving the accuracy of the preconditioner, by replacing 
ILU(0) with dual dropping ILU preconditioning ILUT($\tau$, $p$), greatly improves the performance of red-black (RB) ordering, so that by increasing $p$ the ILUT preconditioner induced by RB will eventually outperform the one induced by NO, as measured by iterations or floating point operations. RB ordering is the simplest variant of MCL, in which the grid points are partitioned into two independent subsets (see Subsection 2.1 and [21], page 366).

1.3. Contributions of the paper. Our main contribution is to show that parallel orderings can perform well even in a sequential environment, producing solvers requiring a lower wall-clock time (and in some cases reduced storage needs) than NO, especially for ILUT preconditioning of two-dimensional problems. We also show that for problems which are highly nonsymmetric and far from diagonally dominant these orderings are still better than NO, but they are outperformed by RCM. For such problems we also observe that parallel orderings can have a stabilizing effect on the ILU(0) preconditioner.

We propose and investigate a new MCL ordering, which allows for parallel ILU(1) and IC(1) preconditioners. We perform numerical experiments with multicoloring for nonsymmetric problems and incomplete factorizations with fill-in, extending the work done by Poole and Ortega [18] and by Jones and Plassmann [15] who studied the effect of MCL reorderings on IC(0) preconditioners for SPD problems. Our experiments suggest that for nonsymmetric problems, the loss in convergence rate caused by switching from NO to multicolorings for ILU(0) and ILU(1) preconditioners is compensated by the large DOP of the multicolorings.

We extend the study [7] in two ways: first, we show that RB applied to symmetric ILUT can outperform NO; second, we look at RB applied to nonsymmetric problems. We further the study of Benzi et al. [2] on orderings for nonsymmetric problems by considering MCL and ND orderings.

We further the work of Saad by considering the performance of RB on larger problems (Saad considers problems with up to 
unknowns, while we consider up to unknowns) and comparing RB with RCM and ND, in addition to NO.

Finally, we assess the scalability of one-level overlapping ASM preconditioning for the set of model problems defined below.

1.4. Model problems. Although in practical experience it is often desirable to solve problems with complex physics on possibly unstructured grids, a minimal requirement of an effective parallel scheme is that it work well on simple, structured problems. For this reason, we will focus on a small set of model problems on structured grids which are “tunable” in terms of problem size and difficulty.

The numerical experiments throughout the paper use the following three model PDE problems, all of them with homogeneous Dirichlet boundary conditions. Below we denote by $\Delta$ the Laplace operator in two or three dimensions.

Problem 1. Poisson’s equation:
(1.1)
Problem 2. Convection-diffusion equation with convection in the xy-plane:
(1.2)
Problem 3. Convection-diffusion equation with convection in the z-direction:

toolkit uses left-preconditioning and the residual used in the stopping test corresponds to the preconditioned problem.

The code for the experiments in Sections 3 and 5 has been compiled using the compiler options -pfa (automatic parallelization of do-loops), -n32 (new 32-bit objects), and -O0 (turn off optimization). The timing data are obtained using the Fortran intrinsic function dtime(), from which the user CPU time is extracted. The code uses right preconditioning. The runs in Section 3 use one processor, while those in Section 5 use eight processors.

2. Background.

2.1. Multicoloring orderings. Given a graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, the graph coloring problem is to construct a partition of the set $V$ such that all vertices in the same part form an independent set, i.e., vertices in the same subset are not connected by any edge. The minimum number of colors necessary for a graph $G$, $\chi(G)$, is the chromatic number of the 
graph. The relevance of the chromatic number from a computational point of view is that all unknowns in the same subset can be solved in parallel. Thus, the number of inherent sequential steps is greater than or equal to the chromatic number of the graph.

With each matrix $A$ we can associate a graph $G(A) = (V, E)$ such that $V = \{1, \ldots, n\}$ and $E = \{(i, j) : a_{ij} \neq 0\}$. For arbitrary $A$, finding the chromatic number of $G(A)$ is NP-hard, but in practice a suboptimal coloring suffices. For the 5-point stencil discretization, it is easy to find a 2-coloring of the graph, commonly called red-black coloring. For red-black coloring, the degree of parallelism in applying an IC(0) or ILU(0) preconditioner is $n/2$ ($n$ is the number of unknowns), which justifies considering this ordering strategy. The problem with this approach is that a no-fill preconditioner obtained with RB may result in poor convergence.

For SPD matrices arising from finite difference discretizations of two-dimensional elliptic problems, Duff and Meurant [7] have shown, by way of experiments, that RB reordering has a negative effect on the convergence of conjugate gradient preconditioned with IC(0). 
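The red-black coloring of the 5-point stencil described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names (`red_black_coloring`, `five_point_neighbors`, `is_valid_coloring`, `red_black_permutation`) and the 0-based natural-order index `k = j * nx + i` are assumptions made for the example.

```python
def red_black_coloring(nx, ny):
    """Color node (i, j) of an nx-by-ny grid: 0 = red, 1 = black (assumed convention)."""
    return {(i, j): (i + j) % 2 for j in range(ny) for i in range(nx)}


def five_point_neighbors(i, j, nx, ny):
    """Yield the grid neighbors coupled to (i, j) by the 5-point stencil."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a, b = i + di, j + dj
        if 0 <= a < nx and 0 <= b < ny:
            yield a, b


def is_valid_coloring(color, nx, ny):
    """True if no two stencil-coupled nodes share a color (each color is an independent set)."""
    return all(
        color[(i, j)] != color[nb]
        for (i, j) in color
        for nb in five_point_neighbors(i, j, nx, ny)
    )


def red_black_permutation(nx, ny):
    """Natural-order indices (bottom up, one horizontal line at a time), reds first, then blacks."""
    color = red_black_coloring(nx, ny)
    natural = [(i, j) for j in range(ny) for i in range(nx)]
    reds = [k for k, node in enumerate(natural) if color[node] == 0]
    blacks = [k for k, node in enumerate(natural) if color[node] == 1]
    return reds + blacks
```

For a 4-by-4 grid the permutation lists the 8 red nodes before the 8 black nodes. Since no red node is coupled to another red node, all unknowns of one color can be processed simultaneously in a no-fill triangular solve, which is where the degree of parallelism of half the number of unknowns comes from.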
Poole and Ortega [18] have observed similar convergence behavior for multicolorings.

2.2. Scalability. Let be the time complexity of executing an iterative linear solver on a parallel computer with processors, where is a parameter equal to the number of subproblems (for example, the number of subdomains in the case of ASM) into which the solver divides the original problem of size , and is the subproblem size, . Following Gustafson [13], an algorithm is scalable if the time complexity stays constant when the subproblem size is kept constant while the number of processors and the number of subproblems both increase times, i.e., . The scalability of an iterative linear solver can be decomposed into two types of scalability (see, for example, Cai et al. [3]): (i) algorithmic or numerical scalability, i.e., the number of iterations is independent of the problem size; (ii) parallel scalability, i.e., the parallel efficiency remains constant when grows.

In the case of linear solvers for problems arising from elliptic PDEs, it is likely that the two conditions for scalability cannot be simultaneously achieved; see, for example, Worley [26]. Hence, in practice a method is considered scalable if the number of iterations depends only weakly on the number of subproblems and the parallel efficiency decreases only slightly as increases.

3. Performance of serial implementations. In this section we are concerned with the serial performance of reorderings for ILU preconditioners. We look at the effect of RB, RCM, and ND on several ILU preconditioners.

If fill is allowed, the advantage of RB providing a parallel preconditioner is greatly diminished; however, by allowing fill the accuracy of the RB-induced preconditioner may improve so much that it outperforms NO, thereby making RB an attractive option for uniprocessor computers. Indeed, Saad has observed [19] that, for a given problem and dropping tolerance $\tau$, there is an amount of fill $p_0$, such that if $p > p_0$ then the preconditioner ILUT($\tau$, $p$) induced by RB outperforms 
the corresponding preconditioner induced by NO. On the other hand, it has been observed by Benzi et al. [2] that, for highly nonsymmetric problems, RCM outperforms NO when fill in the preconditioner is allowed. We further these studies by performing numerical experiments on NO, RB, and RCM, thereby comparing RB to RCM, and considering larger problems (more than 100,000 unknowns).

For the drop tolerance-based solvers we use global scaling, i.e., we divide all the matrix elements by the element of largest magnitude in the matrix. Thus, all entries in the scaled matrix are in the interval $[-1, 1]$. This scaling has no effect on the spectral properties of , but it helps in the choice of the (absolute) drop tolerance, which is a number . As an alternative, we tried the relative drop tolerance approach suggested in [20], where fill-ins at step $i$ are dropped whenever they are smaller in absolute value than $\tau$ times the 2-norm of the $i$th row of $A$. For the problems considered in this paper, the two drop strategies gave similar results. For 2D problems, we let , and for 3D problems, we let .

The number of iterations (I), the solver times (T) in seconds, and the memory requirements of the preconditioners are shown in Tables 3.1–3.6. Here denotes the amount of fill-in (in addition to the nonzero entries in the matrix) in the fill-controlled preconditioners. 
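The global scaling and the two dropping rules described above can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the experiments: the sparse-row representation (one `dict` of column index to value per row) and the function names `global_scale`, `row_norm2`, `drop_absolute` and `drop_relative` are hypothetical, and the sketch ignores the special treatment an actual ILUT implementation gives to the diagonal entry.

```python
import math


def global_scale(rows):
    """Divide every stored entry by the entry of largest magnitude in the matrix."""
    largest = max(abs(v) for row in rows for v in row.values())
    return [{j: v / largest for j, v in row.items()} for row in rows]


def row_norm2(row):
    """Euclidean norm of one sparse row."""
    return math.sqrt(sum(v * v for v in row.values()))


def drop_absolute(candidates, tau):
    """Keep fill-in candidates whose magnitude is at least the absolute tolerance tau."""
    return {j: v for j, v in candidates.items() if abs(v) >= tau}


def drop_relative(candidates, tau, original_row):
    """Keep candidates whose magnitude is at least tau times the 2-norm of the original row."""
    threshold = tau * row_norm2(original_row)
    return {j: v for j, v in candidates.items() if abs(v) >= threshold}
```

After `global_scale`, every entry lies in $[-1, 1]$, so a single absolute tolerance has a comparable meaning across different matrices; the relative rule achieves a similar effect row by row without rescaling the matrix.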
We denote as SILUT a modification of the ILUT algorithm which gives rise to a symmetric preconditioner whenever the original matrix is symmetric. Symmetry is exploited by storing only the upper triangular part of . Throughout the paper we use as unit for storage measurement. The solver time reported is the sum of the time to construct the preconditioner and the time taken by the iterations of the preconditioned Krylov subspace method. The bold fonts indicate the smallest iteration count, time, and fill for each preconditioner.

3.1. Symmetric positive definite problems. We consider symmetric ILUT preconditioning and incomplete Cholesky preconditioners with no fill and with level of fill one, denoted respectively by IC(0) and IC(1). We examine the effect of RB and ND orderings on the convergence of preconditioned CG. As already mentioned, allowing fill in the preconditioner largely limits the parallelism provided by RB ordering. Except for the IC(0) preconditioner, for which the degree of parallelism of applying the preconditioner is $n/2$ for $n$ unknowns, the other preconditioners have modest parallelism.

The results are reported in Tables 3.1–3.2. We may draw several conclusions from these results. First, nested dissection ordering is not competitive with other orderings for IC(0), IC(1) and SILUT, thus we will exclude this ordering from the subsequent discussion. Second, our numerical results for IC(0) indicate the same performance for NO and RCM; this is in accordance with theoretical results [27] which show that the ILU(0) preconditioner induced by RCM is just a permutation of that induced by NO. Third, while NO and RCM are the best orderings for IC(0), the best ordering for IC(1) is RB, followed by RCM and NO. This is true for both 2D and 3D problems.

The best reordering for SILUT from an iteration count standpoint is RCM, followed by RB and NO. In the 2D case, notice that RB, even though requiring slightly more iterations than RCM, leads to a preconditioner that has a fill of roughly 2/3 of that of the RCM preconditioner. This has 
an effect on the solver time:for small problem sizes(, not shown)the best reordering is RCM,but for the larger problems RB is the best reordering, in terms of CPU time.Here,the higher iteration count of RB as compared to RCM is out-M.Benzi,W.Joubert,and G.Mateescu10007T ABLE3.1Iterations(I),time(T),andfill versus problem size(),for Problem1,2D domain,with different reorderings and preconditionersIC(1)SILUT(.001,10)I T T T(sec)(sec)(sec)NO57 3.30 2.74 2.65RB94 2.68 2.31 1.78RCM57 3.32 2.33 2.03ND110 4.91 3.87 2.8727.96763.5541262462845.452127411582731627.86763.5391882149944.61081148616341323NO12633.828.724.0RB21426.822.517.4RCM12633.523.317.9ND19951.547.229.4105102155823103215441797831061388367751041021555946328123016314926811243264826T ABLE3.2Iterations(I),time(T),andfill versus problem size(),for Problem1,3D domain,with different reorderings and preconditionersIC(1)SILUT(.001,10)I T T T(sec)(sec)(sec)NO140.440.390.55RB160.490.430.39RCM140.460.360.47ND210.560.460.524.031790.116119143136.16151352779.3141584.131790.115119113176.01231222797.918209NO2810.99.6413.5RB389.2113.19.37RCM2810.99.3811.8ND4515.114.813.722.5243512246618120629.520527373031960722.3243512246616121634.5334773937627804 weighed by the lower cost of applying the RB preconditioner,so that overall RB becomes thebest reordering for large problems with a moderately high level offill such as5or10.In the3D case the best ordering for SILUT depends on the level offill allowed:for SILUT(.005,5)the fastest solution is obtained with RCM,whereas RB is the best ordering for SILUT(.001,10).This appears to be due to the fact that RB does a better job at preserving sparsity in the incomplete factors,while the convergence rate is only marginally worse than with RCM.10008Parallel ILU Preconditionings3.2.Convection-diffusion problems.As we move from SPD problems to problems which are nonsymmetric and lack diagonal dominance,the relative merits of RB and RCM observed in the previous subsection change.In this 
subsection,we consider two-and three-dimensional instances of Problem2,by setting and,and compare the performance of the RB,RCM,NO,and ND reorderings.T ABLE3.3Iterations(I),time(T),andfill versus problem size(),for Problem2,2D domain,,with different reorderings and preconditionersILU(1)ILUT(.001,10)I T T T(sec)(sec)(sec)NO32 2.34 2.55 2.07RB106 2.11 1.84 1.74RCM32 1.85 1.47 1.10ND90 4.99 3.18 2.4140.052127413242277194.634254302861852839.442127263091359283.6832285529628532NO10646.145.733.8RB23935.132.424.6RCM10640.330.320.6ND20183.358.038.517390310669243719883366662156700321313175833105477626155531014853696766501357T ABLE3.4Iterations(I),time(T),andfill versus problem size(),for Problem2,3D domain,,with different reorderings and preconditionersILU(1)ILUT(.001,10)I T T T(sec)(sec)(sec)NO100.350.420.66RB190.350.450.53RCM100.350.330.45ND150.500.550.723.1681806193640113.17270615653083.1751805205341010.61224491787348NO1010.19.4811.9RB4810.39.519.91RCM10 5.65 5.83 6.75ND3317.314.116.123.6127031158710133283.5101054115578111423.7670366485110362.51995415602111246The effects of RB,RCM,and ND on the iteration count and execution time are illustrated in Tables3.3–3.6.Notice that,as for the incomplete Cholesky factorization,ND gives poorM.Benzi,W.Joubert,and G.Mateescu10009 performance under all aspects(iterations,time,storage),in all test cases except for the3D domain,,with ILU(0).First,consider the mildly nonsymmetric problems,(Tables3.3and3.4).In the two-dimensional case(Table3.3),RCM and NO are equivalent(up to round-off)for ILU(0) and are the best orderings.However,for ILU(1)for larger problems,RB is the best ordering from a time and iteration count standpoint.The second best ordering is RCM,followed by NO.Notice that RCM and NO induce the same size of the ILU(1)preconditioner,which has about half the size of thefill-in induced by RB.For ILUT,the time,iteration,and storage cost of RB and RCM are close,with RCM being somewhat better for.For larger problems RCM is generally 
the least expensive in terms of time and iterations,while RB induces the smallestfill.Notice that for largefill the relative storage saving obtained with RB as compared to RCM is less than the corresponding saving for the SPD case.In the three-dimensional case(Table3.4),RCM is the best ordering(or nearly so)for all preconditioners,often by a wide margin.RB,which is much worse than NO with ILU(0), gives a performance that is close to that of NO with ILU(1)and ILUT(.005,5)and is some-what better than NO with ILUT(.001,10).Next,consider the moderately nonsymmetric problems,(see Tables3.5and 3.6),where a indicates failure to converge in5000iterations.T ABLE3.5Iterations(I),time(T),andfill versus problem size(),for Problem2,2D domain,,with different reorderings and preconditionersILU(1)ILUT(.001,10)I T T T(sec)(sec)(sec)NO44 1.4 1.8 1.8RB106 1.0 1.0 1.3RCM46 1.40.90.8ND82 5.3 2.2 1.73.012127122149482114.8254819953533.25127597322285.1792282225510457NO2011.914.512.8RB27610.49.68.7RCM20 5.0 5.1 5.0ND19672.122.415.152.82831027738141495355.186211868911112653.912310114625781275.12053644728191271 For two-dimensional domains(Table3.5),NO and RCM are still the best orderings for ILU(0),while RB is much worse than these two.For ILU(1),with the only exception of the case,the best ordering is RCM;the performance of RB is about midway between RCM and NO.RCM outperforms RB by all criteria:iterations,time,and preconditioner size. 
This differs from the mildly nonsymmetric case,for which RB for ILU(1)typically wins in terms of CPU time.The results for ILUT are qualitatively similar to those for ILU(1), with RCM being the clear winner,and RB being better than NO.Notice that the size of the ILUT(.001,10)preconditioner induced by RB is always larger than that of the RCM-induced preconditioner.This behavior is different from that observed for Problem1and the instance of Problem2and indicates that the advantage of RB leading to a smaller ILUT10010Parallel ILU Preconditioningspreconditioner size than RCM for large enough is problem-dependent.T ABLE3.6Iterations(I),time(T),andfill versus problem size(),for Problem2,3D domain,,with different reorderings and preconditionersILU(1)ILUT(.001,10)I T T T(sec)(sec)(sec) NO 3.050.75 1.05RB56 1.01 1.190.93RCM47920.820.400.62ND6480.7 1.580.96171807313563014.01227013159831991804255353320.431244131946366NO8515.711.017.6RB4613.215.215.8RCM6810.3 5.879.67ND3925.215.819.252.316703711965240475.581054116096121852.9970349773206978.4219541174481409The results for the3D domain(see Table3.6)are similar to those for2D,with two ex-ceptions.First,for ILU(0),RB and ND are better than NO and RCM for; note that ILU(0)with the natural ordering(as well as RCM)is unstable for. 
Second, for ILUT, the minimum size of the preconditioner is induced by RB; this is similar to the behavior observed in Tables 3.2 and 3.4. A salient characteristic of the ILUT preconditioners for the 3D case is that the iteration counts are almost insensitive to the problem size, which suggests that the incomplete factorization is very close to a complete one.

3.3. Summary of results. The experiments above suggest that RB and RCM reorderings can be superior to NO: they may lead to a reduction in the number of iterations and, for threshold-based ILU, to a smaller amount of fill-in. For SPD problems and for mildly nonsymmetric problems, RB and RCM are the best reorderings for the set of preconditioners considered. The winner is either RB or RCM, depending on the type of preconditioner and on the problem size. On the other hand, RCM is the best reordering for highly nonsymmetric problems, for almost all preconditioners and problem sizes covered in this section; for the few cases where RCM is not the best choice, its performance is very close to the best choice. Therefore, the robustness of RCM makes it the best choice for problems with strong convection. We mention that similar conclusions hold when more adequate discretization schemes are used; see the experiments with locally refined meshes in [2]. While it was already known that RB can outperform NO provided enough fill is allowed (Saad [19]), here we have found that RCM is even better.

We observe that, for the preconditioners with fill, the convergence is faster for the convection-diffusion problems than for the Poisson’s equation corresponding to the same problem size, preconditioner, and order.

We should make a comment on the methodology of these experiments. It should be pointed out that these experiments are not exhaustive, and ideally one would like to say that for a given problem, for any fixed , one method is better than another, which would imply that the given method for its best is better than the other method for its best , i.e., the choice of the
best ordering is not an artifact of the parameterization of the knobs. However, such a test would require a very large number of experiments with different values which would not be practical. Furthermore, we feel the results we have given do give a general sense of the comparative performance of the methods.

In this Section we have not been concerned with the issue of parallelism, and the experiments above were meant to assess the effect of different orderings on the quality of ILU preconditionings in a sequential setting. The remainder of the paper is devoted to an evaluation of different strategies for parallelizing ILU-type preconditioners.

4. Additive Schwarz preconditioning. In this section we examine the effect of the problem size and number of subdomains (blocks the matrix is split into) on the convergence of ASM. The subdomain size, denoted by , is chosen such that the subdomain boundaries do not cross the z-axis. The experiments in this section use Problems 1 and 3. The number of subdomains is always a divisor (or a multiple) of , the number of grid points along one dimension. Note that this is not always a power of 2.

The ASM preconditioner divides the matrix in overlapping blocks (corresponding to subdomains of the PDE problem), each of which is approximately solved with IC(0) in the SPD case, and ILU(0) in the nonsymmetric case. We call these variants ASM.IC0 and ASM.ILU0, respectively. The amount of overlap is denoted by . Since Additive Schwarz preconditioners can be improved by employing subdomain overlapping, we consider three levels of overlaps: ; a 0-overlap gives an approximate (since subdomain solves are ILU(0)) Block Jacobi preconditioner.

We have employed the ASM preconditioner provided by the PETSc library. Conceptually, the preconditioner is formed by summing the (approximate) inverse of each block, where for each block a restriction operator extracts the coefficients corresponding to that block. A limited interpolation is employed, in which the off-block values for each block are
ignored (this is PETSc’s PC RESTRICT preconditioner). The number of processors is ; more processors caused a performance degradation for the problem sizes considered here.

We perform two kinds of scalability experiments. In the first kind, we fix the problem size and increase the number of subdomains. In the second, we fix the subdomain size, , and increase . We monitor the iteration counts and the execution times. The grid nodes are numbered row-wise in each xy-plane, so each matrix block corresponds to a subdomain in an xy-plane.

4.1. Scalability for constant problem size. For the first kind of scalability experiments, we measure for Problems 1 and 3 the number of iterations and solution times versus the number of subdomains.

The results for Problem 1 solved with CG preconditioned with ASM are shown in Figures 4.1 through 4.4. The best wall-clock time is reached for about subdomains, for both 2D and 3D problems. From a running-time standpoint, is the best choice, even though the iteration count is smaller (for large enough ) when . The improvement in convergence brought by the overlap is not large enough to compensate for the higher cost of iterations as compared to zero-overlap. On the other hand, an overlap prevents the iteration count from increasing significantly with , and this effect is more pronounced in the 2D case. Notice, however, that the number of iterations does not decrease monotonically with increasing overlap: for example, increasing from 1 to 4 for 3D, , increases the number of iterations by 10%. Moreover, for a zero-overlap gives the smallest number of iterations.
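The conceptual description of the ASM preconditioner (a sum of block inverses, each combined with a restriction operator) can be sketched in a few lines. The sketch below is a simplification under stated assumptions: it uses a small dense 1D Laplacian as a stand-in test matrix, exact subdomain solves instead of IC(0)/ILU(0), and the standard additive variant rather than the restricted interpolation used in the PETSc experiments. The function names are hypothetical and this is not the PETSc implementation.

```python
import numpy as np


def laplacian_1d(n):
    """Dense 1D Laplacian, used here only as a small SPD test matrix."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def overlapping_blocks(n, n_sub, overlap):
    """Split range(n) into n_sub contiguous pieces, each extended by `overlap`."""
    bounds = np.linspace(0, n, n_sub + 1).astype(int)
    return [np.arange(max(lo - overlap, 0), min(hi + overlap, n))
            for lo, hi in zip(bounds[:-1], bounds[1:])]


def asm_apply(a, blocks, r):
    """Return z = sum_i R_i^T (R_i A R_i^T)^{-1} R_i r with exact local solves."""
    r = np.asarray(r, dtype=float)
    z = np.zeros_like(r)
    for idx in blocks:
        a_local = a[np.ix_(idx, idx)]                   # R_i A R_i^T
        z[idx] += np.linalg.solve(a_local, r[idx])      # add the local correction
    return z


a = laplacian_1d(8)
r = np.ones(8)
blocks = overlapping_blocks(8, n_sub=2, overlap=1)
z = asm_apply(a, blocks, r)
```

With `overlap=0` the index sets are disjoint and the routine reduces to a Block Jacobi preconditioner, which matches the remark about 0-overlap above.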
Lecture notes on Numerical Analysis of Partial Differential Equations
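\[
\int_D cu(x, t_2)\,dx - \int_D cu(x, t_1)\,dx
= \int_{t_1}^{t_2} \frac{\partial}{\partial t} \int_D cu\,dx\,dt
= \int_{t_1}^{t_2} \int_D \frac{\partial (cu)}{\partial t}(x, t)\,dx\,dt.
\]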
Now, by conservation of energy, any change of heat in D must be accounted for by heat flowing in or out of D through its boundary or by heat entering from external sources (e.g., if the body were in a microwave oven). The heat flow is measured by a vector field σ(x, t) called the heat flux, which points in the direction in which heat is flowing, with magnitude equal to the rate of energy flowing across a unit area per unit time. If we have a surface S embedded in D with normal n, then the heat flowing across S in the direction pointed to by n in unit time is $\int_S \sigma \cdot n \, ds$. Therefore the heat that flows out of D, i.e., across its boundary, in the time interval [t1, t2], is given by
Numerical Methods for Incompressible Viscous Flow
Hans Petter Langtangen∗ Kent-Andre Mardal Dept. of Scientific Computing, Simula Research Laboratory and Dept. of Informatics, University of Oslo Ragnar Winther Dept. of Informatics, University of Oslo and Dept. of Mathematics, University of Oslo
Re = U d/ν, where U is a characteristic velocity of the flow and d is a characteristic length of the involved geometries. The basic Navier-Stokes equations describe both laminar and turbulent flow, but the spatial resolution required to resolve the small (and important) scales in turbulent flow makes direct solution of the Navier-Stokes equations too computationally demanding on today’s computers. As an alternative, one can derive equations for the average flow and parameterize the effects of turbulence. Such common models for turbulent flow normally consist of two parts: one part modeling the average flow, with equations very similar to (1)–(2), and one part modeling the turbulent fluctuations. These two parts can at each time level be solved sequentially or in a fully coupled fashion. In the former case, one needs methods and software for the system (1)–(2) also in turbulent flow applications. Even in the fully coupled case the basic ideas regarding discretization of (1)–(2) are reused. We also mention that simulation of turbulence by solving the basic Navier-Stokes equations on very fine grids, referred to as Direct Numerical Simulation (DNS), is of increasing importance in turbulence research, as these solutions provide reference databases for fitting parameterized models. In more complex physical flow phenomena, laminar or turbulent viscous flow is coupled with other processes, such as heat transfer, transport of pollution, and deformation of structures. Multi-phase/multi-component fluid flow models often involve equations of the type (1)–(2) for the total flow coupled with advection-diffusion-type equations for the concentrations of each phase or component. Many numerical strategies for complicated flow problems employ a splitting of the compound model, resulting in the need to solve (1)–(2) as one subset of equations in a possibly larger model involving many partial differential equations.
Hence, it is evident that complex physical flow phenomena also demand software for solving (1)–(2) in a robust fashion. Viscous flow models have important applications within the area of water resources. The common Darcy-type models for porous media flow are based on averaging viscous flow in a network of pores. However, the averaging introduces the permeability parameter, which must be measured experimentally, often with significant uncertainty. For multi-phase flow, the ad hoc extensions of the permeability concept to relative permeabilities are insufficient for satisfactory modeling of many flow phenomena. Moreover,
Efficient Finite Element Simulation of Crack Propagation
Sonderforschungsbereich 393
Parallele Numerische Simulation für Physik und Kontinuumsmechanik

Arnd Meyer, Frank Rabold, Matthias Scherzer

Preprint SFB393/04-01

Abstract

The preprint delivers an efficient solution technique for the numerical simulation of crack propagation of 2D linear elastic formulations based on finite elements together with the conjugate gradient method in order to solve the corresponding linear equation systems. The developed iterative numerical approach using hierarchical preconditioners has the interesting feature that the hierarchical data structure will not be destroyed during crack propagation. Thus, one gets the possibility to simulate crack advance in a very effective numerical manner including adaptive mesh refinement and mesh coarsening. Test examples are presented to illustrate the efficiency of the given approach. Numerical simulations of crack propagation are compared with experimental data.

Preprintreihe des Chemnitzer SFB 393
ISSN 1619-7178 (Print), ISSN 1619-7186 (Internet)
SFB393/04-01, February 2004

Contents

1 Introduction
2 Formulation of the Problem
3 Numerical solution based on efficient adaptive Finite Elements and iterative solver techniques
3.1 Adaptive Finite Element solution for fixed crack
3.2 Adaptive Finite Element solution for the crack propagation
4 Interaction integral
5 Numerical test examples of crack propagation
5.1 Crack propagation of a symmetrically loaded specimen
5.2 Transverse force bending test
6 Conclusions
References

Authors' addresses:
Arnd Meyer, TU Chemnitz, Fakultät für Mathematik, D-09107 Chemnitz
Matthias Scherzer, Frank Rabold, TU Bergakademie Freiberg, Fakultät für Maschinenbau, Verfahrens- und Energietechnik, D-09596 Freiberg
http://www.tu-chemnitz.de/sfb393/

1 Introduction

Fracture of materials and of several components in high technology engineering is one of the central problems in modern strength analysis. Today, fracture mechanics forms an autonomous research
area of solid mechanics in order to explain phenomena of fracture, fatigue, and strength of materials, to develop materials that are more failure-resistant than the conventional materials, and to develop design methods that are more failure-safe than the conventional ones. Although the fracture mechanics approaches cannot all be ascribed to crack-type problems, the consideration and examination of the essential conditions leading to crack propagation, crack deflection and crack arrest is of high practical and theoretical interest. Generally, analytical solutions for most of these crack problems in finite body domains are not attainable. Thus, it is necessary to develop numerical techniques for strength analysis of cracked structures subjected to various kinds of loads.

Three well known numerical methods having the ability to solve crack propagation boundary value problems for finite body domains can be mentioned [35]:
• the Finite Element Method,
• the Boundary Element Method,
• the Meshless Galerkin Method.

The Finite Element Method (FEM) has been used to solve crack and crack propagation problems for over 30 years. On the one hand, one can nowadays find a large number of publications concerning even dynamic crack growth, as follows from the papers [11, 24, 35] and the references therein as well as from the reviews [6, 25, 26]. On the other hand, crack propagation modeling still represents a complicated problem and different specific definitions of crack propagation are used in the literature.

• First, it is difficult to describe the changeover from a continuous solid medium having strong continuity requirements on the displacements and their derivatives to final displacement jumps with new free surfaces inside the solid. Generally, the modeling of this procedure should be based on continuum approaches [27] leading to well-posed boundary value problems including crack growth. For this purpose, one needs improved cohesive models following from the realized critical stress and
deformation states surrounding the advancing crack tips, in general, for non-linear material behavior. It is well known that classical plastic inviscid material models containing strain softening lead to ill-posed problems, and thus, are not useful to model crack propagation. It’s quite plain that the well-posedness of such formulations cannot be restored by means of numerical techniques. The problem formulations have to prevent ill-posedness in order to get stable solutions depending continuously on the initial parameters in connection with the application of stable numerics.

• Second, an essential technical effort is necessary to include the algorithmic extensions of the FEM technology for the crack advance procedures [20] if the constitutive definition of the fracture process (point one) is established. In this context, Belytschko and co-workers have worked out the extended Finite Element Method (X-FEM) [4, 5] by adding enrichment functions to the approximation containing discontinuities for the crack propagation. This method was already expanded to 3D crack propagation problems in [14, 23] and induced an essential progress in crack advance simulation.

• Third, the numerical solution procedure for the corresponding boundary value problem with changing boundaries has to be made as efficient as possible, otherwise more complicated, that is, more realistic crack growth assemblies are not practicable to calculate. Thus, it is necessary to have excellent solutions surrounding the tips, including the asymptotic behavior, where the fracture process occurs. Away from the cracks, the numerical solution does not require very high resolution. In fact, one needs adaptivity of the solution based on a posteriori error estimation [12, 28, 34] together with effective capable solvers for the discrete solution system at each step of crack propagation.

The Boundary Element Method (BEM), which reduces the dimensionality of the problem by one degree can be applied effectively to crack advance problems for linear elastic
formulations. On the one hand, the large effort of remeshing during crack propagation is not necessary by means of BEM. On the other hand, BEM encounters difficulties in connection with artificial boundaries which must be introduced repeatedly for each increment of crack extension. The publications [1, 2] and the references therein can be used for more details about BEM applications in crack advance problems.

Meshfree methods [18, 19, 21] were used in the past for numerical simulation of non-linear boundary value problems including elastic-plastic and strain localization problems. They do not need to generate a connectivity matrix and thus, they are especially suited for adaptive refinements and for discontinuous field problems or solutions with high gradients [16]. Meshless methods are capable of reproducing crack growth by means of special enrichments for the nodal shape functions [13, 37]. On the other hand, the main difficulties of these methods consist in the imposition of the essential boundary conditions [8] as well as in the complicated structure of the shape functions [9] leading to difficult integration procedures.
Thus, it seems challenging to combine the advantages of the meshless methods with the advantages of FEM in order to achieve optimal solutions in the sense of efficiency and accuracy [16].

In this paper, we focus on efficient solution techniques for the numerical simulation of crack propagation in 2D linear elastic formulations based on FEM together with the preconditioned conjugate gradient method (PCGM) solving the corresponding linear equation systems. The developed iterative numerical technique using hierarchical preconditioners has the interesting feature that the hierarchical data structure will not be destroyed during crack propagation. Thus, one gets the possibility to simulate crack advance in a very effective numerical manner including adaptive mesh refinement and mesh coarsening. Test examples are presented to illustrate the efficiency of the given approach.

2 Formulation of the Problem

Let us consider the governing equations of a linear two-dimensional boundary value problem for isotropic elasto-statics with an advancing crack as shown in Figure 1. The boundary of the studied region $\Omega$ is subdivided into the following parts: The displacements $\mathbf{u}^*$ are prescribed on $S_u$ of the outer boundary $S = S_u \cup S_T$ and stress loads $\mathbf{T}^*$ are given on $S_T$.
Throughout the paper, bold typed variables denote vectors and tensors of second degree in $\mathbb{R}^2$. Our considerations will be restricted to two-dimensional problems embedded in $\mathbb{R}^3$. In addition to $S$, the crack surfaces $a^+$ and $a^-$ represent an additional boundary inside $\Omega$ together with the crack advance length scales $\Delta a^+$ and $\Delta a^-$ ($|\Delta a^+| = |\Delta a^-|$). The direction and the magnitude of $\Delta a^+$ follow from corresponding crack advance models used in fracture mechanics. Thus, the strong equilibrium equations and the boundary conditions have the form:
\[
\nabla \cdot \boldsymbol{\sigma} = 0 \ \text{in } \Omega, \qquad
\boldsymbol{\sigma} \cdot \mathbf{n} = \mathbf{T}^* \ \text{on } S_T, \qquad
\mathbf{u} = \mathbf{u}^* \ \text{on } S_u, \tag{2.1}
\]
\[
\boldsymbol{\sigma} \cdot \mathbf{n} = 0 \ \text{on } a^+,\ a^-,\ \Delta a^+ \text{ and } \Delta a^-. \tag{2.2}
\]
In (2.1) and (2.2), $\boldsymbol{\sigma}$, $\mathbf{u}$ and $\mathbf{n}$ denote the stress tensor, the displacement vector and the unit normal vector corresponding to the given surfaces, respectively. $\nabla$ represents the Nabla operator which is defined through
\[
\nabla = \mathbf{e}^i \frac{\partial}{\partial x^i}. \tag{2.3}
\]
Recurring indices mean the application of Einstein’s summation convention. The vectors $\mathbf{e}^i$ ($i = 1, 2$) span the dual basis regarding $\mathbf{e}_i$ as shown in Figure 1 with respect to the coordinates $x^i$. The symbol $\cdot$ stands for the scalar (inner) product of two vectors.
Throughout our considerations, we will use the geometrically linear fashion of continuum mechanics, that is, the deformation tensor $\boldsymbol{\varepsilon}$ is related to the displacements $\mathbf{u}$ by means of
\[
\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}(\mathbf{u}) = \frac{1}{2}\left( \nabla \mathbf{u} + (\nabla \mathbf{u})^T \right). \tag{2.4}
\]
$(\nabla \mathbf{u})^T$ denotes the transpose of $\nabla \mathbf{u}$. To complete the description of the boundary value problem, the material equations relating $\boldsymbol{\sigma}$ to $\boldsymbol{\varepsilon}$ have to be established. They will be applied in the form of an isotropic elastic solid
\[
\boldsymbol{\sigma} = \mathbf{C} : \boldsymbol{\varepsilon} = \lambda I_1(\boldsymbol{\varepsilon})\,\mathbf{I} + 2\mu\,\boldsymbol{\varepsilon} \tag{2.5}
\]
for plane strain conditions ($\mathbf{u} = u_1(x_1, x_2)\,\mathbf{e}_1 + u_2(x_1, x_2)\,\mathbf{e}_2$, $u_3 = 0$). $\lambda$ and $\mu$ denote Lamé’s coefficients, which follow from Young’s modulus $E$ and Poisson’s ratio $\nu$ by means of
\[
\lambda = \frac{\nu E}{(1 + \nu)(1 - 2\nu)}, \qquad \mu = \frac{E}{2(1 + \nu)}. \tag{2.6}
\]

[...] for the equilibrium at the end of the crack propagation step as a consequence of the linear quasi-static behavior. This circumstance simplifies the whole solution procedure and will be used in the following.

At each load level, it is necessary to check the critical crack advance conditions at the crack tips. If these conditions are fulfilled, the crack extends to $a + \Delta a$ and the solid gets the new surfaces $\Delta a^+$ and $\Delta a^-$. In continuum mechanics, the length scale $\Delta a$ has to be assumed to be a material parameter depending on the local stress state of the crack tip, and in general, on constitutive parameters influencing the fracture process [10]. Throughout the paper, we will suppose $|\Delta a^+| = |\Delta a^-|$ to be a given constant material parameter for simplicity.

The crack prolongation conditions are formulated in the classical way given, for instance, in [10] by means of two approaches. Based on the applied elastic material behavior, these conditions can be expressed through the two-dimensional J-integrals or through the corresponding stress intensity factors in an equivalent manner. The J-integral vector $\mathbf{J}$ is defined along $\Gamma$-contours surrounding the crack tip as shown in Figure 2 (the outer normals of these contours are labeled by $\mathbf{n}$, coinciding with the normal vector label on $S$):
\[
\mathbf{J} = \int_{\Gamma^- + \Gamma + \Gamma^+} \left[ \left( \tfrac{1}{2}\,\boldsymbol{\sigma} : \boldsymbol{\varepsilon} \right) \mathbf{n} - \nabla \mathbf{u} \cdot \boldsymbol{\sigma} \cdot \mathbf{n} \right] ds. \tag{2.9}
\]
For the considered material behavior, $\mathbf{J}$ does not depend on the specific choice of the
integration contour surrounding the crack tip. Thus, it characterizes the energy flux to the tip and renders possible the production of the new surfaces $\Delta a$ in the case of crack advance with an extreme amount of power. The components of $\mathbf{J}$ with respect to the local xy-coordinate system at the crack tip are shown in Figure 2:
\[
J_x = \mathbf{e}_x \cdot \mathbf{J}, \qquad J_y = \mathbf{e}_y \cdot \mathbf{J}. \tag{2.10}
\]
They are connected to the stress intensity factors $K_I$ and $K_{II}$ for the applied plane strain model by means of
\[
J_x = \frac{1 - \nu^2}{E}\left( K_I^2 + K_{II}^2 \right), \qquad
J_y = -2\,\frac{1 - \nu^2}{E}\, K_I K_{II}. \tag{2.11}
\]
$K_I$ and $K_{II}$ provide the force criteria basis of fracture mechanics [17]. Applying the energetic approach [10], that is, the crack propagates along the direction of the J-integral vector $\mathbf{J}$ (Figure 2) if the magnitude $J = \sqrt{J_x^2 + J_y^2}$ of $\mathbf{J}$ reaches the critical value $J_c$ ($J = J_c$), the failure surface $F(K_I, K_{II}) = 0$ is defined through
\[
F(K_I, K_{II}) = K_{Ic} - K_I \left[ (1 + \rho^2)^2 + 4\rho^2 \right]^{1/4} = 0, \qquad
K_{Ic}^2 = \frac{J_c E}{1 - \nu^2}, \qquad
\rho = \frac{K_{II}}{K_I}. \tag{2.12}
\]
$K_{Ic}$ and $\rho$ denote the fracture toughness and the ratio of $K_{II}$ and $K_I$, respectively. For a standing crack, (2.12) defines crack advance $\Delta a$ along the J-direction with the angle $\theta$ (Figure 2):
\[
\theta = \arctan\frac{-2\rho}{1 + \rho^2}. \tag{2.13}
\]

At first, the complicated structure of the solution around the crack tip (and its singularities) demands an adaptive mesh refinement. Although the position of the actual crack tip is known, we use an error controlled refinement procedure together with possible coarsening which is very attractive for later stages where the crack tip has moved on. Here, we apply locally the well-known residual error estimator [38] as error indicator on each existing finite element $T$
\[
\eta_T^2 = \sum_{E \subset T} h_E \left\| [\boldsymbol{\sigma} \cdot \mathbf{n}] \right\|_{L_2(E)}^2, \tag{3.1}
\]
which measures the size of the stress jumps $[\boldsymbol{\sigma} \cdot \mathbf{n}]$ over the edges $E$ of $T$. In the case of isotropic finite elements the weight $h_E$ denotes the edge size divided by Young’s modulus.
Then,
\[
\eta = \Big( \sum_{\forall T} \eta_T^2 \Big)^{1/2} \tag{3.2}
\]
gives an approximation to the total $H^1$-error size up to an unknown constant. Hence, for an element $T$ with
\[
\eta_T^2 > \alpha_{\mathrm{refine}} \cdot \eta^2, \tag{3.3}
\]
we should refine $T$, and if
\[
\eta_T^2 < \alpha_{\mathrm{coarse}} \cdot \eta^2, \tag{3.4}
\]
we should coarsen the mesh around $T$. (We choose $\alpha_{\mathrm{refine}} \approx 0.8$ and reduce it down to 0.05 if not enough elements are refined, and $\alpha_{\mathrm{coarse}} = 10^{-3}$.)

Second, this adaptive procedure requires a series of linear system solutions on each actual mesh, respectively. The finite element solutions on these meshes mainly have to guarantee that the error indicators are precise enough for a good mesh control in the refinement procedure. Hence, we look for a very time efficient solution process enabling restricted accuracy on coarser meshes for domains away from cracks and crack tips. This is done with the help of the following ingredients:

• We store only the element matrices of each element $T$ together with the element data. The assembly of any total stiffness matrix is not necessary. The solver, applying the preconditioned conjugate gradient method (PCGM), multiplies easily element by element. Thus, we can use the advantage only to generate element matrices on the new elements emerging from the refinement process. The elements away from cracks and crack tips do not need refinement and remain unchanged.

• The preconditioner uses the hierarchical data structure, which is contained in the history of subdividing the edges of the mesh. Such hierarchical data are necessary for all modern Multi-Level preconditioners such as the “Hierarchical Basis Preconditioner” (HB, see [39]), the BPX-preconditioner (see [7]) or all the well-known Multi-grid algorithms. For simulation in 2D, the most simple HB-preconditioner leads to a very quick solver, especially for the following reasons:

1. All information for implementing HB is contained in the “edge-subdivision-tree” from the mesh refinement used.
2. The number of arithmetic operations is about 3N for N unknowns in the actual mesh.
3. On the one hand, the condition number of the preconditioned stiffness
matrix (equal to the condition number of the Finite Element stiffness matrix with respect to the hierarchical basis of the ansatz space) grows as $(\log N)^2$. On the other hand, we have a very good starting vector embedded into the adaptive loop on each new solution (only “high frequency error”) leading to near constant number of iterations during the refinement.

We demonstrate the efficiency of the technique described above at a simple example with a crack of constant length (“slit–domain”). We start with a coarse mesh of 8 triangles (17 coarse edges, 10 nodes on vertices) for the domain
\[
\Omega = (-1, 1)^2 \setminus [0, 1) \times \{0\}.
\]
The node (1, 0) occurs twice, as well as we have 2 distinct edges from (0, 0) (the crack tip) to both nodes (1, 0). The two adjacent triangles refer to the different edges. The coarse mesh solution shape and two finer solution shapes with about 1000 and 8000 nodes are presented in Figure 3.

Figure 3: Mesh refinement on “Slit Domain” (here: prescribed displacements at bottom and top)

The following table contains some information about the adaptive refinement until 30000 nodes. Each row belongs to one linear system solution with the times for generating the new elements and for the PCGM. Additionally, the number of elements to refine and to coarsen are given. Mainly, we watch the finite number of about 20 iterations on each mesh. The computations were realized by means of a Pentium IV with optimized FORTRAN.
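The element-by-element multiplication used inside the PCGM, which avoids assembling a global stiffness matrix, can be sketched as follows. This is a minimal illustration rather than the authors' FORTRAN code: the function name, the data layout (a list of dense element matrices plus a list of global degree-of-freedom indices per element), and the 1D two-node test mesh are assumptions made for the example.

```python
import numpy as np


def element_matvec(element_matrices, connectivity, x):
    """Compute y = K x from element matrices only, without assembling K."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k_e, dofs in zip(element_matrices, connectivity):
        # Gather the local values, multiply, and scatter-add the result.
        # The dofs of one element are distinct, so += on a fancy index is safe.
        y[dofs] += k_e @ x[dofs]
    return y


# Hypothetical 1D mesh: 3 two-node elements of unit length on 4 nodes.
k_local = np.array([[1.0, -1.0], [-1.0, 1.0]])
element_matrices = [k_local, k_local, k_local]
connectivity = [[0, 1], [1, 2], [2, 3]]
x = np.array([1.0, 2.0, 4.0, 8.0])
y = element_matvec(element_matrices, connectivity, x)
```

With this layout, refining the mesh only requires generating matrices for the new elements and updating the connectivity list; the entries of untouched elements are reused as they are.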
The very short running times reported in Table 1 are a challenge for the moving crack situation: we should maintain most of the features described above, although the mesh connectivity will change in the case of an advancing crack.

Mesh-Info:              | Gener.  | PCGM          | #Elements    | est.Err.
#Nodes/#Elem's/#Edges   | time[s] | #It   time[s] | to ref/coars | (square)
------------------------|---------|---------------|--------------|---------
27/8/17                 | 0.000   |  2    0.000   | 10           | 1.1E+01
85/32/75                | 0.001   | 14    0.001   | 30           | 5.2E+00
297/128/287             | 0.005   | 18    0.005   | 62           | 2.6E+00
643/254/633             | 0.008   | 20    0.011   | 1084         | 3.5E+00
750/272/621             | 0.000   | 24    0.015   | 1556         | 2.0E+00
1099/416/955            | 0.009   | 23    0.022   | 16144        | 1.1E+00
1299/467/1040           | 0.006   | 24    0.027   | 24115        | 6.2E-01
1702/626/1389           | 0.012   | 25    0.040   | 27157        | 3.6E-01
....
6695/2822/6233          | 0.032   | 16    0.239   | 21515        | 5.1E-03
8786/3803/8324          | 0.160   | 15    0.307   | 65113        | 3.0E-03
9438/4088/8941          | 0.024   | 16    0.441   | 34839        | 2.2E-03
12536/5534/12039        | 0.241   | 15    0.387   | 28437        | 1.1E-03
15313/6845/14816        | 0.236   | 15    0.605   | 44797        | 6.3E-04
19455/8747/18950        | 0.273   | 15    0.705   | 284206       | 4.3E-04
22302/10088/21791       | 0.093   | 15    1.075   | 105993       | 3.0E-04
31961/14477/31450       | 0.574   | 15    1.451   | 230820       | 2.0E-04

Table 1: The adaptive history: number of nodes, elements and edges, solver times and error estimators for quadratic triangular elements

3.2 Adaptive Finite Element solution for the crack propagation

If the refinement procedure of the previous section is applied until some small error criterion is met, we end up with a good Finite Element approximation of the true H1–solution belonging to any corresponding load level of the problem under consideration. The J–integrals introduced in Chapter 2 yield the propagation information of the crack movement.
Thus, we obtain the direction and the length $|\Delta a^+| = |\Delta a^-|$ defining the new crack tip. From this information, the segment L (a straight line) from P (actual crack tip) to P_new (new crack tip) fixes the new crack surfaces, which have to be incorporated into the existing mesh as shown in Figure 4. In the usual manner of Finite Element routines, the new mesh opening can be calculated by introducing nodes (hence edges of elements) along L with doubled degrees of freedom. At a node x, we define $u^+(x)$ and $u^-(x)$ as the displacements on the two crack flanks. We call an element T a “–”-element if it contains “–”-degrees of freedom. “+”-elements belong to the other new crack flank and contain “+”-degrees of freedom, respectively.

The first step of the crack propagation modeling is the construction of these “–”- and “+”-elements by means of an (unusual) subdivision of all elements cut by L. Embedded into the adaptive mesh subdivision routine, this is done easily by considering the edges E of the actual fine mesh and performing the following algorithm:

For each edge E do:
1. Calculate the intersection Q of L with E.
2. If there is no intersection: continue with the next edge.
3. Let E = (a, b) with end nodes $a, b \in \mathbb{R}^2$, so that $Q = \alpha a + (1-\alpha) b$.
   If α is near 1 ⟹ correct a to Q and continue with the next edge (footnote 1).
   If α is near 0 ⟹ correct b to Q and continue with the next edge.
   Else: subdivide the edge E, producing E1 = (a, Q) and E2 = (Q, b).

Figure 4: Mesh handling along section L, element 1 with “green”, elements 2 and 3 with “red” and 4, 5 with “green” subdivision after node move

After this procedure, the usual mesh refinement routine subdivides every element T surrounding L in “red” manner if two adjacent edges of T are intersected by L. The other surrounding elements are subdivided “green” if they are intersected by L at one edge only. All other elements remain unchanged. This simple algorithm introduces the segment L as a sequence of edges into our mesh. See Figure 4 for an example and Figure 5 for the explanation of “red” and “green” subdivision. This way, it is possible to define the “+” and “–” degrees of freedom at the new nodes along
L. In this context, we emphasize an important fact regarding the following efficient solver. Up to now, the hierarchical data structure is given within the subdivision tree of the edges. If we doubled the edges along L, this hierarchy would be destroyed and we could not use an efficient hierarchical preconditioner anymore. Thus, we define only one edge, as created by the algorithm above, which refers to the “–”-nodes. Each “–”-node has a copy as “+”-node in the nodal list, and there is a reference from each “–”-node to its “+”-partner (and back) in the nodal list. Obviously, the “–”-elements refer to degrees of freedom of usual or “–”-nodes, the “+”-elements to those of usual or “+”-nodes.

Footnote 1: These tests avoid very distorted elements.

Figure 5: “red” (left) and “green” (right) subdivision

In order to construct an efficient preconditioner, we first append the new unknowns. By definition of “–”- and “+”-nodes we have:

n nodes with n usual nodal degrees of freedom (footnote 2),
d “–”-nodes with d nodal degrees of freedom, and
d “+”-nodes with additionally d nodal degrees of freedom.

For this actual mesh, the linear system to solve has N = n + d + d nodal degrees of freedom, but the hierarchical preconditioner $C^{-1}$ available from the information of the edge tree acts on n + d nodal degrees of freedom only. As explained above, we do not want to destroy this hierarchy information, in order to construct an efficient preconditioner for crack growth problems. Thus, we introduce a restriction operator R from a fictitious space with 2n + 2d nodal degrees of freedom onto our actual space of n + 2d nodal degrees of freedom by means of

$$R = \begin{pmatrix} \tfrac12 I & O & \tfrac12 I & O \\ O & I & O & O \\ O & O & O & I \end{pmatrix}. \tag{3.5}$$

Finally, we take as the resulting preconditioner, heuristically, a twofold application of $C^{-1}$ in the form

$$R \begin{pmatrix} C^{-1} & O \\ O & C^{-1} \end{pmatrix} R^T. \tag{3.6}$$

This results in the following procedure: Let $r = (r_0^T, r_-^T, r_+^T)^T$ be the residual vector in the k–th iteration of the conjugate gradient algorithm, with the parts $r_0$, $r_-$ and $r_+$ referring to usual nodes, “–”-nodes and “+”-nodes, respectively. Then, the preconditioner has to produce the
preconditioned residual w as

$$w = R \begin{pmatrix} C^{-1} & O \\ O & C^{-1} \end{pmatrix} R^T r, \tag{3.7}$$

which contains two preconditioning calls:

$$\begin{pmatrix} w_{0,-} \\ w_- \end{pmatrix} := C^{-1} \begin{pmatrix} \tfrac12 r_0 \\ r_- \end{pmatrix} \tag{3.8}$$

and (after copying $r_+$ to the “–”-data)

$$\begin{pmatrix} w_{0,+} \\ w_+ \end{pmatrix} := C^{-1} \begin{pmatrix} \tfrac12 r_0 \\ r_+ \end{pmatrix}. \tag{3.9}$$

This calculates different values on the usual nodes, which are averaged for defining w:

$$w = \begin{pmatrix} \tfrac12 (w_{0,+} + w_{0,-}) \\ w_- \\ w_+ \end{pmatrix}. \tag{3.10}$$

The two solvers in (3.8) and (3.9) are the cheapest hierarchical basis preconditioners acting on the nodal degrees of freedom that are referenced from the edge data, and they keep the hierarchical data structure intact during crack propagation. Thus, they should produce an iteration behavior as efficient as that presented in the previous section for a stationary crack.

Footnote 2: We write ‘n usual nodal degrees of freedom’ to indicate two degrees of freedom per node.

4 Interaction integral

Based on the crack advance criteria (2.12, 2.13) or (2.14) and a given ∆a, quasi-static crack prolongation can be simulated by means of the numerical technique introduced in Section 3. For this, the K-factors $K_I$ and $K_{II}$ must be determined at each load level. In general, it is not possible to extract $K_I$ and $K_{II}$ from the usual numerical near-tip solution without taking into consideration the special asymptotic behavior in the solution procedure and in the postprocessing stage. This holds even when a very fine finite element net has been reached after appropriate adaptive mesh refinement at the crack tip. Therefore, we apply the J-integrals defined in Section 2 for the numerical determination of the K-factors, exploiting their path independence in order to execute the numerical calculations away from the tip. For a general curved crack propagation this is possible only when the increments ∆a may be assumed to be straight lines, in which case the so-called two-dimensional interaction integral technique [33] can be used. The essential point of this approach is that the integrals along the straight ∆a-contours $\Gamma^-$ and $\Gamma^+$ (Figure 2), which include the near-tip field, are excluded from numerical
calculations. The method considers two states of the cracked region:

• State 1 $(\sigma^{(1)}, \varepsilon^{(1)}, u^{(1)})$ represents the current FEM-solution, for which the K-factors $K_I = K_I^{(1)}$ and $K_{II} = K_{II}^{(1)}$ have to be calculated at a given load level and crack length, and

• State 2 $(\sigma^{(2)}, \varepsilon^{(2)}, u^{(2)})$ is an auxiliary near-tip field, chosen either as the Mode I eigenfunction with the associated factor $K_I^{(2)}$ or as the Mode II eigenfunction [15] with the associated factor $K_{II}^{(2)}$.

The $J_x$-integral for the sum of these two states follows from (2.9) and (2.10) and the state definitions above in the form

$$J_x^{(1+2i)} = J_x^{(1)} + J_x^{(2i)} + J_x^{(1*2i)}, \quad i = I, II, \tag{4.1}$$

with the notation

$$J_x^{(1*2i)} = \int_\Gamma \Big[ (\sigma^{(1)} : \varepsilon^{(2i)})\, e_x \cdot n - e_x \cdot \big( \nabla u^{(1)} \cdot \sigma^{(2i)} + \nabla u^{(2i)} \cdot \sigma^{(1)} \big) \cdot n \Big] \, ds, \quad i = I, II. \tag{4.2}$$

The index i in (4.1) and (4.2) denotes the use of the Mode I eigenfunction (i = I) or the Mode II eigenfunction (i = II) for state 2. Note that the integration contour Γ does not contain any lines of the crack faces, including the parts directly connected to the crack tip, as shown in Figure 6 for an example net. This follows from the fact that only the $J_x$-components of the J-integral vector are used in the calculations; these components have no contributions from the straight crack faces ∆a. At the same time, Γ reaches the crack faces at a distance from the tip which must be less than $|\Delta a^+| = |\Delta a^-|$. The integrals $J_x^{(1*2i)}$ are proportional to $K_I^{(2)}$ (i = I) and $K_{II}^{(2)}$ (i = II).

On the other hand, applying the superposition principle for the two states defined above to relation (2.11) results in the formulae

$$J_x^{(1+2I)} = J_x^{(1)} + J_x^{(2I)} + \frac{2}{E'} K_I^{(1)} K_I^{(2)}, \qquad J_x^{(1+2II)} = J_x^{(1)} + J_x^{(2II)} + \frac{2}{E'} K_{II}^{(1)} K_{II}^{(2)} \tag{4.3}$$

with $E' = E/(1-\nu^2)$ (for plane strain). Comparing (4.1), (4.2) and (4.3) gives the necessary relations for the K-factors:

$$K_I = K_I^{(1)} = \frac{E'}{2 K_I^{(2)}} J_x^{(1*2I)}, \qquad K_{II} = K_{II}^{(1)} = \frac{E'}{2 K_{II}^{(2)}} J_x^{(1*2II)}. \tag{4.4}$$

Thus, the K-factors can be calculated numerically by means of (4.4), based on the numerical solution of the current load step and the corresponding singular eigenfunctions along Γ.
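The final step (4.4) is plain arithmetic once the interaction integrals are known. The following minimal Python sketch shows only this postprocessing step. The contour integration (4.2) itself is not shown, and the function names `effective_modulus` and `k_factor` are hypothetical names chosen for the example.

```python
def effective_modulus(E, nu):
    """E' = E / (1 - nu^2), the plane-strain modulus used in (4.3)."""
    return E / (1.0 - nu ** 2)

def k_factor(J_interaction, K_aux, E, nu):
    """K-factor of state 1 from (4.4).

    J_interaction : value of the interaction integral J_x^(1*2i)
    K_aux         : factor K_i^(2) of the auxiliary eigenfunction (state 2)
    """
    return effective_modulus(E, nu) * J_interaction / (2.0 * K_aux)

if __name__ == "__main__":
    # Consistency check against (4.3): build the interaction term
    # (2 / E') * K1 * K2 from a known K1 and recover K1 again.
    E, nu = 2.0e5, 0.3
    K1, K2 = 450.0, 1.0
    J = 2.0 / effective_modulus(E, nu) * K1 * K2
    print(k_factor(J, K2, E, nu))
```

Calling the function once with the Mode I and once with the Mode II auxiliary field yields $K_I$ and $K_{II}$, respectively.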
5 Numerical test examples of crack propagation

5.1 Crack propagation of a symmetrically loaded specimen

The first test example (Figure 7) represents the crack propagation simulation of a symmetrically loaded tension specimen ($K_{Ic} = 450$ MPa√mm, $|\Delta a^+| = |\Delta a^-| = 2.5$ mm, $E = 2 \cdot 10^5$ MPa, ν = 0.3). Figure 7 shows the geometry together with the initial mesh. Throughout the following, all geometrical descriptions are given in the length unit mm. The specimen is loaded by means of uniform displacements u = 0.05 mm at the left side of the specimen (Figure 7). The given load induces the crack propagation immediately, and the crack propagates continuously up to its arrest. Figure 8 shows the adaptively refined meshes for different crack lengths. The numbers at the bottom right of the pictures represent the accumulated numbers of PCGM-solutions carried out up to the current crack length. The pictures indicate mesh refinement and mesh coarsening during crack growth. The mesh of the region surrounding the old crack tip is coarsened once a certain distance to the current crack tip is exceeded. The fourth picture in Figure 8 shows the final position of the crack tip for the given material properties and load conditions, i.e., the crack stops. In this context, the dependence of the stress intensity factor on the crack length a is given in Figure 9. Because of the symmetric loading conditions with respect to the crack trajectory, resulting in ρ = 0, the crack advance conditions (2.12) together with (2.13) coincide with the hoop

Figure 7: Symmetric tension specimen
Approximate Reasoning for Solving Fuzzy Linear Programming Problems
element of the set $\{MAX(x) \mid x \in \mathbb{R}^n\}$ (in the sense of the given inequality relation). We interpret the FLP problem

$$\text{maximize } \tilde c_1 x_1 + \cdots + \tilde c_n x_n \quad \text{subject to } \tilde a_{i1} x_1 + \cdots + \tilde a_{in} x_n \lesssim \tilde b_i, \quad i = 1, \dots, m,$$

as Multiple Fuzzy Reasoning schemes of the form

Antecedent 1
...
Antecedent m
Fact
Consequence

We consider LP problems in which all of the coefficients are fuzzy quantities (i.e. fuzzy sets of the real line $\mathbb{R}$), of the form

$$\text{maximize } \tilde c_1 x_1 + \cdots + \tilde c_n x_n \quad \text{subject to } \tilde a_{i1} x_1 + \cdots + \tilde a_{in} x_n \lesssim \tilde b_i, \quad i = 1, \dots, m. \tag{1}$$
Abstract

We interpret fuzzy linear programming (FLP) problems (where some or all coefficients can be fuzzy sets and the inequality relations between fuzzy sets can be given by a certain fuzzy relation) as multiple fuzzy reasoning schemes (MFR), where the antecedents of the scheme correspond to the constraints of the FLP problem and the fact of the scheme is the objective of the FLP problem. Then the solution process consists of two steps: first, for every decision variable $x \in \mathbb{R}^n$, we compute the maximizing fuzzy set, $MAX(x)$, via sup-min convolution of the antecedents/constraints and the fact/objective; then an (optimal) solution to the FLP problem is any point which produces a maximal element of the set $\{MAX(x) \mid x \in \mathbb{R}^n\}$ (in the sense of the given inequality relation). We show that our solution process for a classical (crisp) LP problem results in a solution in the classical sense, and (under well-chosen inequality relations and objective function) coincides with those suggested by [Buc88, Del87, Ram85, Ver82, Zim76]. Furthermore, we show how to extend the proposed solution principle to non-linear programming problems with fuzzy coefficients.
Civil Engineering English Translation 1
* Corresponding author. Tel.: +852-2766-7820; fax: +852-23654703.
E-mail address: mmlhyam@.hk (L.H. Yam).
0263-8223/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/pstruct.2003.09.038
Z. Wei et al. / Composite Structures 64 (2004) 377–387
Preconditioners for non-Hermitian Toeplitz systems
In this paper, we introduce circulant, respectively ω-circulant, preconditioners related to $|f|^2$ for the normal equation (1.2), even if $f \in W$ has zeros. These preconditioners can be applied with fewer arithmetic operations than the combined preconditioners in [3]. We show that the singular values of the preconditioned matrix are clustered at 1 and that, if the spectral condition number of $A_N(f)$ is $O(N)$, the PCG method applied to (1.2) converges in $O(\log N)$ iteration steps. We are also interested in Krylov space methods like GMRES or BICGSTAB, which do not require the translation of (1.1) to the normal equation. Here we suggest circulant, respectively ω-circulant, preconditioners related to f. Unfortunately, the convergence of these methods no longer depends on the singular values but on the eigenvalues of the preconditioned system. We cannot prove clustering results for arbitrary generating functions $f \in W$. However, for rational functions f, we show that the preconditioned matrices have only a finite number (independent of N) of eigenvalues which are not equal to 1, so that preconditioned GMRES converges in a finite number of steps independent of the dimension of the problem. This paper is organized as follows: In Section 2, we introduce our circulant, respectively ω-circulant, preconditioners and prove corresponding clustering results. Section 3 modifies these results for trigonometric preconditioners. Finally, Section 4 contains numerical examples for various iterative methods and preconditioners. In particular, we apply our preconditioners to the queueing network problem with batch arrivals examined in [3].
AND JOSÉ M. CELA
$$\|A M^{-1} - I\|_F < \varepsilon$$

and applying the definition of the Frobenius norm, we obtain

$$\sum_{j=1}^{n} \|A m_j^{-1} - e_j\|_2^2 < \varepsilon^2 \tag{6}$$

where $e_j$ is the j-th vector of the canonical basis. The problem naturally becomes a parallel computation of each one of the columns of $M^{-1}$. In [2] an adaptive algorithm is proposed to compute the entries of $m_j^{-1}$, and only the most significant ones are kept. The SPAI preconditioners derived from this technique will be called LSQ-SPAI (Least SQuare SPAI) in the following. For the second technique there are two different methodologies, one due to Saad et al. [4] and another due to Benzi et al. [1]. The methodology proposed by Saad is based on a dropping strategy applied to the orthonormal basis generated in the GMRES algorithm. The methodology proposed by Benzi is based on the decomposition of A as a matrix product, derived from the nonsymmetric Lanczos method, i.e.

$$A = W D Z^T \tag{7}$$
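The column-wise least-squares problem behind (6) can be illustrated with a small dense example. The sketch below is a simplification under stated assumptions: it uses a fixed, user-supplied sparsity pattern instead of the adaptive selection of entries described in [2], it works on dense NumPy arrays, and the function name `spai_fixed_pattern` is hypothetical.

```python
import numpy as np

def spai_fixed_pattern(A, pattern):
    """Approximate inverse with a prescribed sparsity pattern.

    For every column j, minimise ||A m_j - e_j||_2 over all vectors m_j
    whose nonzeros are restricted to the rows flagged in pattern[:, j].
    The columns are independent, so the loop could run in parallel.
    """
    A = np.asarray(A, dtype=float)
    pattern = np.asarray(pattern, dtype=bool)
    n = A.shape[0]
    M_inv = np.zeros((n, n))
    for j in range(n):
        rows = np.flatnonzero(pattern[:, j])   # allowed nonzeros of column j
        if rows.size == 0:
            continue
        e_j = np.zeros(n)
        e_j[j] = 1.0
        # Small least-squares problem using only the allowed columns of A.
        coeffs, *_ = np.linalg.lstsq(A[:, rows], e_j, rcond=None)
        M_inv[rows, j] = coeffs
    return M_inv

if __name__ == "__main__":
    A = np.array([[2.0, -1.0, 0.0, 0.0],
                  [-1.0, 2.0, -1.0, 0.0],
                  [0.0, -1.0, 2.0, -1.0],
                  [0.0, 0.0, -1.0, 2.0]])
    M_inv = spai_fixed_pattern(A, A != 0)      # reuse the pattern of A
    print(np.linalg.norm(A @ M_inv - np.eye(4)))  # Frobenius norm of the residual
```

Enlarging the pattern can only decrease the Frobenius residual, which is the motivation for choosing the entries adaptively.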
Improved average of synthetic exact filters for precise 2013
Published in IET Biometrics
Received on 14th December 2011
Revised on 11th December 2012
Accepted on 9th January 2013
doi: 10.1049/iet-bmt.2011.0006
ISSN 2047-4938

Improved average of synthetic exact filters for precise eye localisation under realistic conditions

Esteban Vazquez-Fernandez, Daniel Gonzalez-Jimenez, Long Long Yu
GRADIANT, Galician R&D Center in Advanced Telecommunications, Spain
E-mail: evazquez@

Abstract: Precise eye localisation is a crucial step for many applications, including face recognition, gaze tracking and blink detection. In this study, the authors propose several improvements to the original average of synthetic exact filters (ASEF) formulation, demonstrating that its accuracy can be enhanced if adequate illumination correction, spatial priors and cross-filter responses are exploited for eye localisation. The so-called improved ASEF (iASEF) was tested on the well-known BioID database and other more challenging datasets comprising real world face imagery: labelled faces in the wild (LFW) and the very recent labelled face parts in the wild. The iASEF provides state-of-the-art results, ranking first on the BioID database and second on a 2000-image LFW subset. In addition, the authors propose a novel, much more challenging benchmark for eye localisation using the whole LFW and a standard protocol initially designed for face verification.
Improvements over the original ASEF were also confirmed on this difficult test, although with a significant drop in performance. They point out the necessity of adopting these realistic validation scenarios, in order to evaluate the actual state-of-the-art and fairly compare eye localisation methods in unconstrained settings, where localisation accuracy is still far from perfect.

1 Introduction

Face recognition continues to be an active and attractive field of research, with a current trend towards unconstrained and complex scenarios [1–5]. Within the related research community, it is well known that the accuracy in localising eye positions strongly affects the performance of automatic face recognition algorithms [6–8], mainly those that heavily rely on the quality of the geometrically normalised face image (e.g. subspace projection methods such as PCA, LDA etc.) but also others which are supposed to be more robust to misalignments [9]. Precise eye localisation is also a crucial step for other systems such as gaze tracking, blink detection and iris recognition at a distance. Therefore the search for accurate (and efficient) eye localisation methods that can work in real world settings is a must, and has attracted the attention of the computer vision and face analysis research communities [10–21].

Overall, the main problems for precise eye localisation arise from the drastic appearance variations provoked by several factors, starting with the normal behaviour of eyes (e.g. blinking), interpersonal differences and important external factors including pose changes, lighting conditions, specular reflections by glasses, partial and total occlusions (e.g. sunglasses), and low-resolution and low-quality images, for instance those captured with mobile phone cameras and webcams. Examples of ‘difficult’ face images from the LFW database can be seen in Fig. 1. Despite the large amount of research on this topic, eye localisation is still not solved, especially for real world imagery.
Within this specific scenario, a new trend has emerged in the face analysis community, focusing on the development of datasets and standard protocols for benchmarking face processing systems under realistic conditions: the labelled faces in the wild (LFW) [1] for face recognition, Gallagher’s database for demographics estimation [22], the face detection database for face detection, and the dynamic facial expressions in the wild for automatic expression analysis are just some examples of this new tendency. Moreover, initiatives such as the one launched by the facial image processing and analysis group (FIPA, http://fi/) in adapting existing datasets to new challenges (e.g. using the LFW dataset for benchmarking gender classification) are fostering fair competition in the research community.

In the specific task of eye localisation, different image databases have been used for testing purposes. One of the most popular is the annotated BioID face database [23], which has been extensively used since 2001. More recent works have been tested on different subsets of LFW [16, 21] and other datasets collected from the web [13, 15].
However, in general, there is a lack of a common benchmark under realistic conditions for fairly analysing the performance of different approaches, which is one of the issues addressed in this paper. Very recently, the labelled face parts in the wild (LFPW) database was proposed [5] for assessing the performance of multiple facial point localisation on difficult imagery.

1.1 Contributions

The main contributions of this paper are listed below:

1. Revisit average of synthetic exact filters (ASEF) accuracy on different datasets, highlighting the decrease in performance when tested on difficult imagery. To the best of our knowledge, ASEF implementations have already been tested on FERET [10], and on subsets of images gathered from the web [13, 21]. We assess the performance of our own ASEF-based eye localiser on the well-known BioID face database (a common dataset for evaluating eye localisation accuracy, with many approaches already tested [20]), and on the 2000-face-image subset of the LFW database initially proposed by Tan et al. [16]. We demonstrate, in agreement with Štruc et al. [21], that the original ASEF degrades significantly on real world imagery, performing considerably worse than in the more controlled conditions of Bolme et al. [10].

2. Propose several modifications to the original formulation and assess the improvements in performance on the selected datasets. More concisely:

† We show that illumination correction (both in the filter learning and eye localisation stages) can lead to improved accuracy. In this sense, we test several state-of-the-art illumination normalisation techniques provided by Štruc and Pavešić [24, 25], and evaluate their impact on the eye localisation accuracy.

† We show that the use of spatial priors learned during filter training, combined with cross-filter responses, improves eye localisation accuracy (compared to the unconstrained search and the 20×20 search region of [10]).

Finally, we demonstrate that the method combining all the aforementioned contributions (so-called
improved ASEF, or iASEF) improves eye localisation on all considered scenarios:

1. Compare the proposed approach against the state-of-the-art on the BioID and a 2000-face-image subset of the LFW database initially proposed in [16]. We demonstrate that our approach is very competitive on these databases, ranking first on the BioID and second on the LFW subset (being outperformed only by the enhanced pictorial structure (PS) of Tan et al. [16]).

2. Design a challenging benchmark for eye localisation. In an attempt to propose a much more challenging benchmark for eye localisation, we took advantage of the manual annotations for the whole LFW provided by Degtyarev [26] and adapted an existing, standard face verification protocol [1] for eye localisation using a ten-fold cross validation scheme. While improvements over the original ASEF were also confirmed in this difficult test, the performance drop is significant, and we point out the necessity of establishing challenging and standard protocols on realistic and large datasets, in order to evaluate the actual state-of-the-art and fairly compare eye localisation methods in unconstrained settings.

3. We further evaluate our approach on the recent LFPW and compare the results with those obtained over the whole LFW.

The paper is organised as follows. Related work on precise eye localisation is presented in Section 2. The different datasets and benchmarks used in this paper are described in Section 3, whereas a review of ASEF filters is presented in Section 4. Our contributions to the original ASEF formulation are described in detail in Section 5. Results and analysis of iASEF performance in the considered benchmarks are presented in Section 6, including a comparison against the state-of-the-art on the BioID and LFW-2000 subset databases. Finally, some conclusions and future lines are drawn in Section 7.

2 Related work

Bolme et al. [10] proposed the ASEF paradigm for eye localisation, demonstrating its good performance on relatively constrained images (i.e. FERET
database [27]), outperforming both Gabor jets [28] and a Haar-based Adaboost cascade [29]. ASEF learns a correlation filter for each training image, and the whole set of filters is averaged together. Since it uses the whole (face) image in training rather than a subset of patches, the configuration and appearance of different face parts (not only the eye patch) are learned in the final filter. Furthermore, ASEF is very fast in training and testing because correlation can be computed very efficiently in the Fourier domain. Li et al. [13] pointed out that, since it is based on a single linear classifier, ASEF cannot cope with drastic appearance variations caused by considerable pose and lighting changes and facial expressions. To overcome such drawbacks, the authors propose an additive logistic model to localise facial keypoints, make use of context (i.e. head pose estimation) to train a set of filters rather than a single average filter, and take advantage of a context dependent PS to improve localisation results. Their proposed scheme outperforms ASEF on a subset of the UCL database comprising web images [30] where, in turn, ASEF outperforms a Haar-based Adaboost cascade [29] and a Bayesian approach [17]. The simple Bayesian method has been shown to perform better on eye localisation than several classical appearance-based approaches such as regression methods, boosting-based methods and SVM-based methods [17]. Štruc et al. [21] proposed a variation of ASEF, the so-called principal directions of synthetic exact filters (PSEF). Results on FERET and on a subset of the LFW database demonstrate better performance than the original ASEF and Haar-based classifiers. In [16], Tan et al. proposed an enhanced PS [31] combined with SVM-based classification for precise eye localisation under uncontrolled environments, demonstrating very good results on a subset of the LFW database [1]. Their approach is shown to outperform the Bayesian approach and the traditional PS. In [20], Yang et al. presented a new eye
localisation method based on multiscale sparse dictionaries. The experiments performed on the BioID database showed that their method outperformed the state-of-the-art on that database. Another recent approach for eye localisation on difficult sets is the work by Qian and Xu [15], where Gabor filtering and K-means clustering are used, reporting results on FERET and a subset of the LFW (in principle not the same subset as in [16]). Kroon et al. [11] proposed a probabilistic method based on multiscale local binary patterns (LBPs), testing the impact of eye localisation accuracy on face recognition performance. Scheirer et al. [18] compared a commercial system with an SVM-based and several correlation-filter-based eye localisation methods on re-imaged faces from the FERET and CMU PIE databases, focusing on the performance of the systems in conditions of low light, large distances and blurring. Other approaches for eye localisation include the works by Valenti et al. [14], using the so-called isophote curvature with tests on the Yale B and BioID databases, Campadelli et al. [32], who trained SVMs on properly selected Haar wavelet coefficients, and Monzo et al. [33], using HOG descriptors and SVM classification.

Fig. 1 Examples of ‘difficult’ face images taken from LFW [1]

On the one hand, a large number of papers have dealt with eye localisation over the past years. On the other hand, we would like to highlight the heterogeneity of training and testing datasets when assessing eye localisation performance. Even when the same database is used [15, 16], it is not clear whether the same protocol has been followed, which makes it very difficult to fairly compare different systems. Towards the objective of defining a standardised benchmark in unconstrained settings, a first attempt was proposed by Tan et al. [16], with a simple division of 2000 images into training and testing sets.
Moreover, the authors provide manual annotations for both eyes and nose. Recently, Degtyarev [26] shared the set of manual annotations for the whole LFW, which constitutes a significant leap towards the definition of a complete and challenging protocol. Finally, the LFPW has emerged as an interesting alternative for facial point localisation, with a quite simple protocol consisting of 1132 training images and 300 testing images [5]. However, the problem with LFPW is that, since only URLs are provided, the content of the database may change as some image files may no longer exist or become corrupted after some time. Therefore it seems that LFPW will not be an optimal choice as a standardised benchmark for evaluation.

3 Proposed benchmark

To evaluate eye localisation accuracy, we adopt the normalised (scale-independent) measure proposed in [23]. This error measure is defined in terms of the eye centre positions according to

$$D = \frac{\max(d_l, d_r)}{\|M_l - M_r\|} \tag{1}$$

where $d_l$ and $d_r$ stand for the Euclidean distances between the localised eye centres and their corresponding ground-truth positions $M_l$ and $M_r$. Based on this error measure D, we will also use cumulative correct localisation curves for evaluation. These curves plot, for a given D, the fraction of testing images with a normalised error distance below D.
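The error measure (1) and the cumulative curve built from it can be sketched in a few lines of Python. This is an illustrative sketch only; the function names `normalised_eye_error` and `cumulative_curve` and the example coordinates are assumptions, and points are given as (x, y) pixel pairs.

```python
import numpy as np

def normalised_eye_error(pred_left, pred_right, gt_left, gt_right):
    """Error D of (1): worst eye-centre error divided by the
    ground-truth inter-ocular distance ||M_l - M_r||."""
    pred_left, pred_right = np.asarray(pred_left, float), np.asarray(pred_right, float)
    gt_left, gt_right = np.asarray(gt_left, float), np.asarray(gt_right, float)
    d_l = np.linalg.norm(pred_left - gt_left)
    d_r = np.linalg.norm(pred_right - gt_right)
    return max(d_l, d_r) / np.linalg.norm(gt_left - gt_right)

def cumulative_curve(errors, thresholds):
    """Fraction of test images whose error lies below each threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([(errors < t).mean() for t in thresholds])

if __name__ == "__main__":
    # Hypothetical coordinates: ground-truth eyes 40 px apart, left eye off by 5 px.
    D = normalised_eye_error((33, 54), (70, 52), (30, 50), (70, 50))
    print(D)
    print(cumulative_curve([0.05, D, 0.3], thresholds=[0.1, 0.25]))
```

The strict comparison `errors < t` follows the wording "below D" used above; a non-strict comparison would be an equally plausible convention.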
In order to provide a complete evaluation under different conditions, we will use four benchmarks: the BioID database, two different configurations of the LFW image database, and the LFPW dataset. The characteristics of each of these benchmarks are described below.

3.1 BioID benchmark

The BioID database [23] comprises 1521 frontal face images from 23 different subjects. The images were taken under various lighting conditions and with cluttered backgrounds. This database, for which manual annotations of eye positions are provided, is one of the most used datasets for eye detection tasks. A two-fold cross validation protocol is used to compare our results with previous methods tested on BioID [19, 20, 23, 32, 34–41]. All the proposed contributions will be tested (both in isolation and combined) on the BioID dataset, assessing the benefits over the original ASEF formulation.

3.2 LFW benchmarks

We have also considered the LFW database [1], which was originally designed for studying the problem of unconstrained face recognition, for testing eye localisation performance. This dataset contains 13233 images of faces collected from the web, automatically detected by the OpenCV version of the Viola–Jones face detector [42].
Within the LFW, we consider two different benchmarks: the first one using the LFW subset proposed in [16] (2000 images), and the second one using the whole LFW database, with a protocol designed by ourselves based on the original face verification protocol.

3.2.1 LFW 2000 subset: In order to compare our proposed approach with other methods under unconstrained conditions [15, 16, 21], we have chosen the LFW subset proposed in [16]. It contains 2000 randomly selected images from the whole database, where the coordinates of the eyes and nose have been manually annotated. The images have been split into two groups with 1000 images each, one for training and the other for testing. The training set has been divided into five equally sized subsets for development purposes, with a five-fold cross validation scheme used for parameter adjustment.

3.2.2 Extended LFW benchmark: As stated above, we chose the 2000-face-image subset from LFW for comparing our approach against some methods tested in similar conditions [15, 16, 21]. In the so-called extended LFW benchmark we propose to use the whole LFW database and the manual annotations provided by Degtyarev [26], in order to design a more challenging testbench for eye localisation. The aim of this benchmark is two-fold: on the one hand, to show the degradation of a state-of-the-art method (our iASEF) in a really challenging benchmark and, on the other hand, to propose a standard, difficult protocol for fair comparison of eye localisation algorithms in uncontrolled scenarios.

The LFW database was originally intended for assessing face verification performance, and it is organised into two ‘Views’. View 1 is for development purposes and general experimentation, that is, model selection or validation. View 2 is organised in ten disjoint folds, and should be used only for the final performance evaluation of a method, to avoid ‘fitting to the test data’. For our purposes, we respect the division and meaning of LFW into view 1 and view 2, but do not take into account the face pairs originally considered
for face verification. Instead, we use the whole list of face images that comprise each of these ‘views’ (and folds) (available in /lfw/#views). We maintain the ten-fold cross validation scheme of [1]: training sets are formed using nine of the ten sets, with the held-out set as the test set.

3.3 LFPW benchmark

The LFPW [5] consists of 1432 faces from images downloaded from the web using simple text queries on sites such as ,fl and . A total of 29 fiducial points were labelled for each image, including eye centres. The dataset is divided into two files, one for training (1132 images) and one for testing (300 images). We included this benchmark because of its novelty, and also to evaluate whether our iASEF performed similarly on the extended LFW and LFPW testbenches. As discussed in the Introduction, the main drawback of the LFPW database is that only links to the images are provided, which can lead to changes in the database over time (e.g. some image files may be deleted or become corrupted). From this point of view, we consider that LFPW is not an optimal choice as a standard dataset for fairly comparing different methods.

4 ASEF fundamentals

ASEF filters were recently proposed in [10] and successfully applied to the task of precise eye localisation. ASEF filters are trained using response images that specify the desired output at every pixel, so a complete exact filter is determined for every training image. This differs from prior correlation filters, which only specified a single output value per training image, as stated in [10]. Overfitting to the training images is avoided by averaging all the single filters generated from the training dataset, which in turn provides great generalisation capabilities. In addition, since most of the computations are done in the Fourier domain, the computational cost of the algorithm is low.

Formally, the construction of ASEF filters can be described as follows. The (ASEF) filter $\hat{h}$ is learnt from a set of images $f_i$ and their corresponding set of synthetic outputs $g_i$, which are specified by generating two-dimensional (2D) Gaussians centred at the desired positions in the training images (e.g. the eyes). For a single image $f(x,y)$ and a desired output $g(x,y)$, the synthetic filter $h(x,y)$ can be computed in the Fourier domain by using the convolution theorem

$$g(x,y) = (f \otimes h)(x,y) = \mathcal{F}^{-1}\bigl(F(\omega,\nu)\,H(\omega,\nu)\bigr) \qquad (2)$$

where $F$ and $H$ are the respective 2D Fourier transforms of $f$ and $h$. Correlation is obtained by taking the complex conjugate of $H$

$$G(\omega,\nu) = F(\omega,\nu)\,H^{*}(\omega,\nu) \qquad (3)$$

For each training image $f_i$, the exact filter is given by

$$H_i^{*}(\omega,\nu) = \frac{G_i(\omega,\nu)}{F_i(\omega,\nu)} \qquad (4)$$

The ASEF $\hat{H}$ is obtained by averaging the filters $H_i$ computed for every training image. In the localisation stage, the eyes are detected by the cross-correlation of the input image (the face region) with the previously computed average filter. The correlation response is examined for possible correlation peaks. In the simple approach of Bolme et al. [10], this is done by simply searching for a maximum.

ASEF filters have been tested for eye localisation in [10] using the FERET database [10]. The evaluation methodology consists of a (scale invariant) distance measure between the manually selected eye coordinates and those located by the ASEF filter

$$D_l = \frac{\lVert P_l - M_l \rVert}{\lVert M_l - M_r \rVert} \qquad (5)$$

where $D_l$ is the normalised error measure for the left eye, $P_l$ is the predicted left eye co-ordinates, and $M_l$ and $M_r$ are the manually annotated co-ordinates for the left eye and right eye, respectively. The experiments of Bolme et al. [10] have shown good performance for eye localisation, achieving a correct localisation rate of 98.5% on the FERET dataset at the operating point of D<0.10. However, results are only provided for the left eye (following a ‘single eye’ criterion). Therefore we do not have the performance results for the more restrictive error measure proposed in [23], where a ‘worst eye’ criterion is used (see (1)). In addition, FERET images were acquired under quite controlled conditions and do not reflect the complexity of real world face imagery, so these tests are not a reliable indicator
for the performance of the ASEF detectors in uncontrolled conditions. More recent works have assessed the performance of ASEF filters for eye localisation in difficult imagery [13, 21].

In the following, we give some details regarding our ASEF (iASEF) filter generation.

4.1 Details on filter generation

Bolme et al. [10] pointed out that a large training set is important for accurate localisation. Following Bolme et al. [10], we apply various transformations to the original images for augmenting the training set and generate the ASEF (or iASEF) filters. Such transformations comprise rotation and mirroring. The databases used for our experiments (BioID, LFW and LFPW) involve pose variations, lighting, image compression, noise, different resolution etc., so other variations such as blurring or contrast modification have not been applied to the training datasets, in contrast to [16]. The size of the training images is fixed to 150×150.

5 Proposed improvements to the original ASEF: their impact on the BioID database

As already discussed in the introduction, we propose several modifications to the original ASEF formulation for the task of eye localisation. These contributions are listed in the following:

- Illumination normalisation: We show that illumination correction (both in filter learning and eye localisation stages) can lead to improved eye localisation accuracy. In this sense, we test several state-of-the-art illumination normalisation techniques provided by [24, 25], and evaluate their impact on the eye localisation accuracy.
- Spatial priors and cross-filter responses: Using training data with manual eye annotations, we learn priors for eye positions and generate a ‘Gaussian mask’ based on the relative positions w.r.t. the detected face bounding box. Contrary to [11, 16] and other approaches, we do not use binary search regions but data-driven Gaussian masks (that weigh the ASEF response) instead. Moreover, experimental results demonstrate that our trained filter for right eye localisation also produces a ‘peak’ at the left eye
position and vice versa. Based on this finding, we show that the simple sum of both filter responses improves eye localisation results. The combination of these contributions (Gaussian masks and cross-filter responses) significantly improves the performance in comparison with coarse search regions (e.g. binary masks).

We describe each of these contributions in the next sections (5.1 and 5.2), evaluating their (isolated) impact on the BioID database with respect to the original ASEF formulation. The results obtained with the final system (combining all the aforementioned contributions) are presented in Section 6 on the four proposed benchmarks.

5.1 Illumination normalisation

Illumination variations remain one of the major challenges for robust face recognition systems, considerably degrading their performance. We hypothesise that proper illumination normalisation of face images (both during filter training and eye localisation) will also lead to improvements in eye localisation (as in face recognition).

With this goal, we evaluated different illumination correction algorithms. Since one of the major advantages of ASEF-based eye localisation is its low computational cost, some computationally efficient illumination normalisation algorithms have been considered: [43–48], all of them included in the INface toolbox v2.1 [24, 25]. Other state-of-the-art methods included in the INface toolbox have been discarded because of their higher computational cost. In the following, we briefly describe the different illumination correction algorithms considered:

- The single scale retinex (SSR) algorithm was proposed by Jobson et al. in [43]. It is a photometric normalisation technique based on the so-called retinex theory.
- The multiscale retinex (MSR) is an extension of the SSR algorithm, also proposed by Jobson et al. [44].
- The discrete cosine transform-based normalisation technique (DCT) was proposed by Chen et al. in [45]. This technique sets a number of DCT coefficients corresponding to low frequencies to zero and hence tries to achieve
illumination invariance.
- The retina modelling approach [46] proposed by Vu and Caplier has been used with LBP-based face recognition, obtaining state-of-the-art results on the FERET and Yale B datasets. First, two adaptive nonlinear filters are applied for light adaptation. This reduces the effect of bright and dark regions. Then, a difference of Gaussian (DoG) filtering is performed. Finally, a truncation is used to enhance the global image contrast.
- The Tan and Triggs method [47] has also been applied with LBP and Gabor wavelet-based face recognition, achieving good results on illumination datasets, including Extended Yale-B, CAS-PEAL-R1 and FRGC-204. This technique normalises the input image through a processing chain that first applies gamma correction to the input image, then performs DoG filtering and finally uses robust post-processing to output the final result.
- The single scale Weberfaces normalisation technique was proposed by Wang et al. in [48]. The method computes the relative gradient in the form of a modified Weber contrast and uses the computed face representation as an illumination invariant version of the input image.

Fig. 2 shows performance plots when the different illumination normalisation methods are combined with ASEF filtering on the BioID database. Table 1 summarises these performances for the D<0.05, D<0.1 and D<0.25 operating points. Overall, the best results for BioID are obtained using the Weberfaces normalisation (followed by Tan and Triggs [47] and retina modelling [46]). We must also highlight that all normalisation techniques outperform the original ASEF at the three operating points (with the exception of DCT at D<0.25), indicating that the illumination correction proposed in [10] does not properly deal with the lighting variations present in BioID images, and that robust illumination compensation indeed leads to significant improvements (as initially hypothesised).

5.2 Spatial priors and cross-filter responses

Most eye localisation techniques constrain the search to a
specified region of interest, which is appropriately set in relation to the bounding box provided by a face detection module [11, 16, 29]. Bolme et al. [10] also presented results with and without search constraints, with the correct localisation rate of 98.5% at D<0.1 corresponding to the former approach (unconstrained search also provided good results on the FERET database, clearly outperforming Gabor jets and Haar-based Adaboost cascade).

The previous comparison of illumination normalisation techniques (Section 5.1) used rough search regions for each eye. The face image was divided into four regions and the

Table 1 Localisation accuracy (%) for different illumination normalisation techniques on BioID

Illumination normalisation    D<0.05    D<0.1    D<0.25
original ASEF                 85.3      89.89    94.01
DCT                           86.56     90.62    93.48
SSR                           87.82     93.35    96.07
MSR                           87.62     93.48    95.87
retina model                  88.49     94.08    97.01
Tan                           89.29     95.14    99.2
Weberfaces                    90.15     96.21    98.47

Fig. 2 Performance of different illumination normalisation algorithms on the BioID face database
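The normalised error of equation (5) and the operating points used in Table 1 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the ‘worst eye’ value is taken here to be the larger of the two per-eye errors, which is how that criterion is commonly written; the exact form of equation (1) is not reproduced in this section.

```python
import math


def normalised_errors(pred_left, pred_right, true_left, true_right):
    """Return (D_l, D_r, D_worst) for one face.

    Each per-eye error is the distance between the predicted and the
    annotated eye centre, divided by the annotated inter-ocular distance,
    as in equation (5). D_worst = max(D_l, D_r) is an assumed form of the
    'worst eye' criterion.
    """
    inter_ocular = math.dist(true_left, true_right)
    d_left = math.dist(pred_left, true_left) / inter_ocular
    d_right = math.dist(pred_right, true_right) / inter_ocular
    return d_left, d_right, max(d_left, d_right)


def accuracy_at(errors, threshold):
    """Percentage of faces whose error is strictly below the threshold."""
    return 100.0 * sum(e < threshold for e in errors) / len(errors)


# Hypothetical annotations and predictions for a single face (pixels).
d_l, d_r, d_worst = normalised_errors(
    pred_left=(32.0, 50.0), pred_right=(70.0, 53.0),
    true_left=(30.0, 50.0), true_right=(70.0, 50.0),
)
```

Applying `accuracy_at` to a list of per-face errors at the thresholds 0.05, 0.1 and 0.25 gives figures of the kind reported in Table 1. Using the single-eye value `d_l` rather than `d_worst` corresponds to the less restrictive ‘single eye’ criterion.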
cPCG Package User Guide
Package ‘cPCG’
October 12, 2022

Type Package
Title Efficient and Customized Preconditioned Conjugate Gradient Method for Solving System of Linear Equations
Version 1.0
Date 2018-12-30
Author Yongwen Zhuang
Maintainer Yongwen Zhuang <******************>
Description Solves system of linear equations using (preconditioned) conjugate gradient algorithm, with improved efficiency using Armadillo templated 'C++' linear algebra library, and flexibility for user-specified preconditioning method. Please check <https:///styvon/cPCG> for latest updates.
Depends R (>= 3.0.0)
License GPL (>= 2)
Imports Rcpp (>= 0.12.19)
LinkingTo Rcpp, RcppArmadillo
RoxygenNote 6.1.1
Encoding UTF-8
Suggests knitr, rmarkdown
VignetteBuilder knitr
NeedsCompilation yes
Repository CRAN
Date/Publication 2019-01-11 17:00:10 UTC

R topics documented:
    cPCG-package
    cgsolve
    icc
    pcgsolve
    Index

cPCG-package    Efficient and Customized Preconditioned Conjugate Gradient Method for Solving System of Linear Equations

Description
Solves system of linear equations using (preconditioned) conjugate gradient algorithm, with improved efficiency using Armadillo templated 'C++' linear algebra library, and flexibility for user-specified preconditioning method. Please check <https:///styvon/cPCG> for latest updates.

Details
Functions in this package serve the purpose of solving for x in Ax = b, where A is a symmetric and positive definite matrix, b is a column vector.
To improve scalability of conjugate gradient methods for larger matrices, the Armadillo templated C++ linear algebra library is used for the implementation. The package also provides flexibility to have user-specified preconditioner options to cater for different optimization needs.

The DESCRIPTION file:
Package: cPCG
Type: Package
Title: Efficient and Customized Preconditioned Conjugate Gradient Method for Solving System of Linear Equati
Version: 1.0
Date: 2018-12-30
Author: Yongwen Zhuang
Maintainer: Yongwen Zhuang <******************>
Description: Solves system of linear equations using (preconditioned) conjugate
gradient algorithm, with improved effic
Depends: R (>= 3.0.0)
License: GPL (>= 2)
Imports: Rcpp (>= 0.12.19)
LinkingTo: Rcpp, RcppArmadillo
RoxygenNote: 6.1.1
Encoding: UTF-8
Suggests: knitr, rmarkdown
VignetteBuilder: knitr

Index of help topics:
cPCG-package    Efficient and Customized Preconditioned Conjugate Gradient Method for Solving System of Linear Equations
cgsolve         Conjugate gradient method
icc             Incomplete Cholesky Factorization
pcgsolve        Preconditioned conjugate gradient method

Author(s)
Yongwen Zhuang

References
[1] Roger Fletcher and Colin M. Reeves. “Function minimization by conjugate gradients”. In: The Computer Journal 7.2 (1964), pp. 149–154.
[2] David S. Kershaw. “The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations”. In: Journal of Computational Physics 26.1 (1978), pp. 43–65.
[3] Yousef Saad. Iterative methods for sparse linear systems. Vol. 82. SIAM, 2003.
[4] David Young. “Iterative methods for solving partial difference equations of elliptic type”. In: Transactions of the American Mathematical Society 76.1 (1954), pp. 92–111.

Examples
    # generate test data
    test_A <- matrix(c(4,1,1,3), ncol=2)
    test_b <- matrix(1:2, ncol=1)
    # conjugate gradient method solver
    cgsolve(test_A, test_b, 1e-6, 1000)
    # preconditioned conjugate gradient method solver,
    # with incomplete Cholesky factorization as preconditioner
    pcgsolve(test_A, test_b, "ICC")

cgsolve    Conjugate gradient method

Description
Conjugate gradient method for solving system of linear equations Ax = b, where A is symmetric and positive definite, b is a column vector.

Usage
    cgsolve(A, b, tol = 1e-6, maxIter = 1000)

Arguments
A          matrix, symmetric and positive definite.
b          vector, with same dimension as number of rows of A.
tol        numeric, threshold for convergence, default is 1e-6.
maxIter    numeric, maximum iteration, default is 1000.

Details
The idea of conjugate gradient method is to find a set of mutually conjugate directions for the unconstrained problem
    $\arg\min_x f(x)$
where $f(x) = \tfrac{1}{2} x^T A x - b^T x + z$ and z is a constant. The problem is equivalent to solving Ax = b. This
function implements an iterative procedure to reduce the number of matrix-vector multiplications [1]. The conjugate gradient method improves memory efficiency and computational complexity, especially when A is relatively sparse.

Value
Returns a vector representing solution x.

Warning
Users need to check that input matrix A is symmetric and positive definite before applying the function.

References
[1] Yousef Saad. Iterative methods for sparse linear systems. Vol. 82. SIAM, 2003.

See Also
pcgsolve

Examples
    ## Not run:
    test_A <- matrix(c(4,1,1,3), ncol=2)
    test_b <- matrix(1:2, ncol=1)
    cgsolve(test_A, test_b, 1e-6, 1000)
    ## End(Not run)

icc    Incomplete Cholesky Factorization

Description
Incomplete Cholesky factorization method to generate preconditioning matrix for conjugate gradient method.

Usage
    icc(A)

Arguments
A    matrix, symmetric and positive definite.

Details
Performs incomplete Cholesky factorization on the input matrix A, the output matrix is used for preconditioning in pcgsolve() if "ICC" is specified as the preconditioner.

Value
Returns a matrix after incomplete Cholesky factorization.

Warning
Users need to check that input matrix A is symmetric and positive definite before applying the function.

See Also
pcgsolve

Examples
    ## Not run:
    test_A <- matrix(c(4,1,1,3), ncol=2)
    out <- icc(test_A)
    ## End(Not run)

pcgsolve    Preconditioned conjugate gradient method

Description
Preconditioned conjugate gradient method for solving system of linear equations Ax = b, where A is symmetric and positive definite, b is a column vector.

Usage
    pcgsolve(A, b, preconditioner = "Jacobi", tol = 1e-6, maxIter = 1000)

Arguments
A                 matrix, symmetric and positive definite.
b                 vector, with same dimension as number of rows of A.
preconditioner    string, method for preconditioning: "Jacobi" (default), "SSOR", or "ICC".
tol               numeric, threshold for convergence, default is 1e-6.
maxIter           numeric, maximum iteration, default is 1000.

Details
When the condition number for A is large, the conjugate gradient (CG) method may fail to converge in a reasonable number of iterations. The Preconditioned Conjugate
Gradient (PCG) Method applies a preconditioning matrix C and approaches the problem by solving:
    $C^{-1} A x = C^{-1} b$
where the symmetric and positive-definite matrix C approximates A, so that $C^{-1}A$ has a smaller condition number than A.
Common choices for the preconditioner include: Jacobi preconditioning, symmetric successive over-relaxation (SSOR), and incomplete Cholesky factorization [2].

Value
Returns a vector representing solution x.

Preconditioners
Jacobi: The Jacobi preconditioner is the diagonal of the matrix A, with an assumption that all diagonal elements are non-zero.
SSOR: The symmetric successive over-relaxation preconditioner, implemented as $M = (D + L) D^{-1} (D + L)^T$. [1]
ICC: The incomplete Cholesky factorization preconditioner. [2]

Warning
Users need to check that input matrix A is symmetric and positive definite before applying the function.

References
[1] David Young. “Iterative methods for solving partial difference equations of elliptic type”. In: Transactions of the American Mathematical Society 76.1 (1954), pp. 92–111.
[2] David S. Kershaw. “The incomplete Cholesky-conjugate gradient method for the iterative solution of systems of linear equations”. In: Journal of Computational Physics 26.1 (1978), pp. 43–65.

See Also
cgsolve

Examples
    ## Not run:
    test_A <- matrix(c(4,1,1,3), ncol=2)
    test_b <- matrix(1:2, ncol=1)
    pcgsolve(test_A, test_b, "ICC")
    ## End(Not run)

Index
∗ methods: cgsolve, icc, pcgsolve
∗ optimize: cgsolve, pcgsolve
∗ package: cPCG-package
Topics: cgsolve, cPCG (cPCG-package), cPCG-package, icc, pcgsolve, preconditioner (pcgsolve)
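To make the preconditioned iteration described under pcgsolve concrete, here is a minimal sketch in plain Python. It is not the package's Armadillo/C++ implementation: the function name `pcg_jacobi`, the dense list-of-lists matrix format and the stopping test on the residual 2-norm are all assumptions made for illustration. The preconditioner is the Jacobi choice described above, i.e. the diagonal of A.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def pcg_jacobi(A, b, tol=1e-6, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (illustrative sketch).

    Jacobi preconditioning: C = diag(A), so applying C^{-1} is an
    elementwise division by the diagonal entries (assumed non-zero).
    """
    n = len(b)
    x = [0.0] * n
    r = [float(bi) for bi in b]          # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [d * ri for d, ri in zip(inv_diag, r)]   # z = C^{-1} r
    p = list(z)                          # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:       # assumed convergence criterion
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)          # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz               # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Same 2x2 system as in the manual's examples: A = [[4, 1], [1, 3]], b = (1, 2).
x = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

For this system the exact solution is x = (1/11, 7/11), which the iteration reaches in two steps up to rounding. Replacing the division by the diagonal with a solve against another symmetric positive definite matrix C gives the other preconditioners listed above.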
TR-CS-97-12
Preconditioning of elliptic problems by approximation in the transform domain
Michael K. Ng
July 1997
Joint Computer Science Technical Report Series
Department of Computer Science
Faculty of Engineering and Information Technology
Computer Sciences Laboratory
Research School of Information Sciences and Engineering
This technical report series is published jointly by the Department of Computer Science, Faculty of Engineering and Information Technology, and the Computer Sciences Laboratory, Research School of Information Sciences and Engineering, The Australian National University.
Please direct correspondence regarding this series to:
Technical Reports
Department of Computer Science
Faculty of Engineering and Information Technology
The Australian National University
Canberra ACT 0200
Australia
or send email to:
Technical.Reports@.au
A list of technical reports, including some abstracts and copies of some full reports, may be found at:
.au/techreports/
Recent reports in this series:
TR-CS-97-11 Richard P. Brent, Richard E. Crandall, and Karl Dilcher. Two new factors of Fermat numbers. May 1997.
TR-CS-97-10 Andrew Tridgell, Richard Brent, and Brendan McKay. Parallel integer sorting. May 1997.
TR-CS-97-09 M. Manzur Murshed and Richard P. Brent. Constant time algorithms for computing the contour of maximal elements on the Reconfigurable Mesh. May 1997.
TR-CS-97-08 Xun Qu, Jeffrey Xu Yu, and Richard P. Brent. A mobile TCP socket. April 1997.
TR-CS-97-07 Richard P. Brent. A fast vectorised implementation of Wallace’s normal random number generator. April 1997.
TR-CS-97-06 M. Manzur Murshed and Richard P. Brent. RMSIM: a serial simulator for reconfigurable mesh parallel computers. April 1997.
[Figure: plot with rows labelled I, MINV, C, M_0, M_1, M_3 and M_7]
[Figure: time (in seconds) to reach convergence versus grid size n]