Analysis of heuristic synergies
Richard J. Wallace
Cork Constraint Computation Centre and Department of Computer Science
University College Cork, Cork, Ireland
email: r.wallace@4c.ucc.ie

Abstract. “Heuristic synergy” refers to improvements in search performance when the decisions made by two or more heuristics are combined. This paper considers combinations based on products and quotients, and a less familiar form of combination based on weighted sums of ratings from a set of base heuristics, some of which result in definite improvements in performance. Then, using recent results from a factor analytic study of heuristic performance, which had demonstrated two main effects of heuristics involving either buildup of contention or look-ahead-induced failure, it is shown that heuristic combinations are effective when they are able to balance these two actions. In addition to elucidating the basis for heuristic synergy (or lack thereof), this work suggests that the task of understanding heuristic search depends on the analysis of these two basic actions.

1 Introduction

Combining variable ordering heuristics that are based on different features sometimes results in better performance than can be obtained by either heuristic working in isolation. Perhaps the best-known instance of this is the domain/degree heuristic of [1].
Recently, further examples have been found based on weighted sums of rated selections produced by a set of heuristics [2].

As yet, we do not have a good understanding of the basis for such heuristic synergies. Nor can we predict in general which heuristics will synergise. In fact, until now there has been no proper study of this phenomenon, and perhaps not even a proper recognition that it is a phenomenon. The present paper initiates a study of heuristic synergies. A secondary purpose is to test the weighted sum strategy in a setting that is independent of its original machine learning context.

The failure to consider this phenomenon stems in part from our inability to classify heuristic strategies beyond citing the problem features used by a heuristic. However, recent work has begun to show how to delineate basic strategies, and with this work we can begin to understand how heuristics work in combination. Although the work is still in its early stages, it is already possible to predict which heuristics will synergise in combination and to understand to some extent why this occurs.

The analysis to be presented depends heavily on the factor analysis of heuristic performance, i.e. of the efficiency of search when a given heuristic is used to order the variables. This approach is based on inter-problem variation. If the action of two heuristics is due to a common strategy, then the pattern of variation should be similar.
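The idea of comparing heuristics through inter-problem variation can be illustrated with a small sketch. It computes pairwise Pearson correlations between the search effort of several heuristics over the same problems; heuristics sharing a common strategy should show similar patterns and hence high correlations. The heuristic names `h1`, `h2`, `h3` and the node counts are hypothetical values invented for illustration, and this is only the correlation step that a factor analysis would start from, not the analysis itself.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def correlation_matrix(effort):
    """effort maps heuristic name -> list of node counts, one per problem."""
    names = sorted(effort)
    return {(a, b): pearson(effort[a], effort[b]) for a in names for b in names}


# Hypothetical node counts for three heuristics on the same five problems.
effort = {
    "h1": [100, 400, 250, 900, 120],
    "h2": [210, 790, 520, 1800, 260],   # varies across problems much like h1
    "h3": [500, 150, 800, 200, 650],    # varies in a different way
}
corr = correlation_matrix(effort)
for (a, b), r in sorted(corr.items()):
    print(f"{a} vs {b}: {r:+.2f}")
```

In this toy data set `h1` and `h2` would be candidates for a shared underlying factor, while `h3` would not.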
Using this method, it has been possible to show that, for certain simple problem classes, such variation can be ascribed to only two factors, which can in turn be interpreted as basic heuristic actions [3].

The next section describes the basic methodology. Section 3 gives results for heuristic combinations involving products and quotients. Section 4 gives results for weighted sums of heuristic ratings. Section 5 gives a brief overview of factor analysis as well as methodological details for the present work and gives the basic results of a factor analysis of variable ordering heuristics and their interpretation. Section 6 uses these results to predict successes and failures of heuristic combinations. Section 7 considers extensions using more advanced heuristics. Section 8 gives conclusions.

2 Description of Methods

The basic analyses in this paper used a set of well-known variable ordering heuristics that could be combined in various ways. These are listed below together with the abbreviations used in the rest of the paper.

Minimum domain size (dom, dm). Choose a variable with the smallest current domain size.
Maximum forward degree (fd). Choose a variable with the largest number of neighbors (variables whose nodes are adjacent to the chosen variable in the constraint graph) within the set of uninstantiated variables.
Maximum backward degree (bkd). Choose the variable with the largest number of neighbors in the set of instantiated variables.
Maximum static degree (stdeg, dg). Choose a variable with the largest number of neighbors (i.e. the variable of highest degree).

In most cases, ties were broken according to the lexical order of the variable labels. In such cases, max forward and max backward degree are both fixed-order heuristics, as is max static degree.

The initial tests were done with homogeneous random CSPs because these are easy to generate according to different parameter patterns. Problems were generated according to a probability-of-inclusion model for adding constraints, domain elements and constraint tuples, but where selection was repeated until the number of elements matched the expected value for the given probability. In all cases generation began with a spanning tree to ensure that the constraint graph was connected; however, densities given in this paper are simple graph densities. Typically, there were 100 problems in a set, although similar results were found in some cases for sets of 500 problems. Problem parameters were chosen so that problems were in a critical complexity region of the parameter space.

Some further tests were based on geometric problems, which are random problems with small-world characteristics. Geometric problems are generated by choosing points at random within the unit square to represent the variables, and then connecting all pairs of variables whose points lie within a threshold distance. In this case, connectivity was ensured by checking for connected components, and if there was more than one, adding an edge between the two variables in different components separated by the shortest distance.

Unless otherwise noted, the tests in this paper were based on the MAC-3 algorithm.
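The construction of a geometric constraint graph described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the generator used for the experiments: the function names, the parameter values and the choice to repair connectivity one edge at a time are all hypothetical, and the sketch builds only the constraint graph (no domains or constraint tuples).

```python
import math
import random
from itertools import combinations


def components(n, edges):
    """Return the connected components of an undirected graph on nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def geometric_graph(n, threshold, seed=None):
    """Random points in the unit square; connect pairs closer than `threshold`."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]

    def dist(i, j):
        return math.dist(points[i], points[j])

    edges = {(i, j) for i, j in combinations(range(n), 2) if dist(i, j) <= threshold}

    # Repair connectivity: while several components remain, join the two
    # variables in different components that are closest to each other.
    comps = components(n, edges)
    while len(comps) > 1:
        i, j = min(
            ((i, j) for a, b in combinations(comps, 2) for i in a for j in b),
            key=lambda pair: dist(*pair),
        )
        edges.add((min(i, j), max(i, j)))
        comps = components(n, edges)
    return points, edges


points, edges = geometric_graph(20, 0.05, seed=1)
print(len(edges), "edges,", len(components(20, edges)), "component")
```

Each repair step merges two components, so the loop terminates after at most n-1 added edges.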
The basic measures of performance were (i) nodes visited during search and (ii) constraint checks. In these experiments, both measures produced similar patterns of differences, so for brevity the results in this paper are restricted to search nodes.

Synergy was evaluated for two kinds of strategy. The first type of strategy was to take products and quotients of the basic heuristics, as is done in the well-known domain/degree heuristics. (For quotients and products involving backward degree, when this component was zero, a value of one was used instead.) The second was to combine evaluations of individual heuristics into weighted sums. This strategy was derived from one used in a contemporary learning system [2]; to my knowledge there has been no examination of its efficacy outside this context. In addition to being an alternative strategy for obtaining improved heuristics, this method is useful in the present context because it may allow more quantitative assessment of synergistic effects.

1 ... forward degree, the weighted sums for Variables 1 and 2 are 32 and 31, respectively; in this case, therefore, Variable 1 would be chosen over Variable 2.

3 Heuristic Combinations Based on Quotients and Products

The next two sections present a number of empirical results, some of which are fairly striking, that constitute a body of findings that must be accounted for by any explanation of heuristic synergies. At the same time, these results provide a number of hints about the nature of the variables that may underlie synergistic and non-synergistic effects.

Several examples of quotients and products based on the four simple heuristics described in the last section are presented in Table 1. Among them only two exhibit synergistic effects, and the only marked effect is produced by the well-known domain/forward degree heuristic. Interestingly, for this set of problems domain/static degree did not give better performance than static degree alone, in contrast to domain/forward degree.

Table 1. Results for Products and Quotients

simple heuristic   nodes   combination      nodes
                           min dom/stdeg     2076
fd                  2625   min dom/bkwd     15081
stdeg               2000   max stdeg*fd      2417

Mean nodes per problem. 50,10,0.184,0.37 problems. Bold entries show results superior to either heuristic alone.

4 Heuristic Combinations Based on Weighted Sums

Table 2 gives results, in terms of nodes searched, for six “individual” heuristics and for combinations of these heuristics using the technique of weighted sums described in Section 2. For these tests, heuristics were given equal weights. These data include examples of heuristic synergy as well as non-synergy.

Note that in these tests the domain/degree quotients were used as components of the weighted sums in addition to the four simple heuristics, and that this form of combination sometimes gave better results than the quotient alone. At the same time, such combinations were not superior to the best weighted sums based on the simpler heuristics.

There are a number of significant findings in this table:

Some combinations do better, in terms of number of search nodes, than any heuristic used by itself. Some do even better than the best heuristic-quotient tested, which was min domain/forward-degree.
Simply combining heuristics is not sufficient to obtain synergy; only certain combinations are effective.
The effectiveness of heuristic combinations does not correlate well with the effectiveness of the individual components.
The effectiveness of heuristic combinations is not related to the inclusion of any particular component in and of itself.
The best results for combinations of two heuristics were as good as the best results for combinations of more than two heuristics.

Table 2. Selected Results for Weighted Sums

heuristic   nodes   combination       nodes   combination    nodes
                    dm/dg+dm/fd        1800   dm/fd+fd        1304
                    dom+dm/fd          1890   dom+fd          1317
                    fd+stdeg           2344   bkwd+stdeg      1876
                    dom+dm/dg+stdeg    1654   dom+fd+stdeg    1374

Mean nodes per problem. 50,10,0.184,0.37 problems. Bold entries show results that are better than any individual heuristic. In these tests component heuristics were
given equal weights.

It is important to note in this connection that weighted sums gave better results than tie-breaking strategies based on the same heuristics. For comparison, here are results on the same set of problems with four tie-breaking strategies:

min domain, ties broken by max forward degree: 3101
min domain, ties broken by max static degree: 3155
forward degree, ties broken by min domain: 2239
static degree, ties broken by min domain: 1606

Naturally, tie-breaking does reduce the size of the search tree in comparison with the primary heuristic when used alone, but not as much as some heuristic combinations.

Another significant result is that, when combinations of two heuristics showed a high degree of synergy, equal weights gave better results than unequal weights, and in these cases performance deteriorated as a function of the difference in weights. This is shown in Table 3. In cases in which weight combinations did not synergise or synergised weakly in comparison with the best individual heuristic in the combination, unequal weights sometimes gave some improvement, although the effect was never marked.

An additional finding is that when weights were unequal, there were sometimes marked asymmetries or biases in the effect of weighting one heuristic more than the other.
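As a rough illustration of how a weighted sum differs from a tie-breaking strategy, the sketch below selects a variable both ways from the same two heuristics. The rating scale is an assumption made here (the best distinct score receives 10, the next 9, and so on, with a floor of 1); the function names and the example domain sizes and forward degrees are likewise hypothetical and are not taken from the experiments reported in this paper.

```python
def ratings(scores, top=10):
    """Convert raw scores (higher is better) to ratings: the best distinct
    score gets `top`, the next best top-1, and so on (never below 1)."""
    distinct = sorted(set(scores.values()), reverse=True)
    rating_of = {s: max(top - k, 1) for k, s in enumerate(distinct)}
    return {var: rating_of[s] for var, s in scores.items()}


def weighted_sum_choice(score_tables, weights):
    """Pick the variable with the largest weighted sum of ratings.
    Ties are broken by lexical order of the variable labels."""
    totals = {}
    for scores, weight in zip(score_tables, weights):
        for var, rating in ratings(scores).items():
            totals[var] = totals.get(var, 0) + weight * rating
    best = max(totals.values())
    return min(v for v, t in totals.items() if t == best), totals


def tie_break_choice(primary, secondary):
    """Pick by the primary score; use the secondary score only on ties,
    then lexical order of the labels."""
    return min(primary, key=lambda v: (-primary[v], -secondary[v], v))


# Hypothetical state of three uninstantiated variables.
domain_size = {"x1": 3, "x2": 2, "x3": 5}
forward_degree = {"x1": 4, "x2": 2, "x3": 4}

dom = {v: -d for v, d in domain_size.items()}   # min domain: smaller is better
fd = dict(forward_degree)                        # max forward degree

print(weighted_sum_choice([dom, fd], [1, 1]))    # equal weights
print(weighted_sum_choice([dom, fd], [2, 1]))    # favour min domain
print(tie_break_choice(dom, fd))                 # dom, ties broken by fd
```

With a tie-breaking strategy the secondary heuristic influences the choice only when the primary one cannot discriminate, whereas in the weighted sum every component contributes to every decision, and changing the weight ratio can change which variable is selected.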
Evidence of this can be seen in each of the three columns of data to the right in the table. In the other pair (dom+fd), the effects of weights were highly symmetric, so that the increase in search effort rose in concert with the degree of difference in the weights regardless of which heuristic was more highly weighted.

Table 3. Two-Heuristic Combinations with Different Weights

wt ratio   dom+fd   dom+stdeg   stdeg+bkwd   fd+stdeg
1:1                   1427                     2344
1:2         1433      1420        2471         2455
2:1         1405      1620        1852         2235
1:3         1652      1454        3054         2458
3:1         1651      1885        1812         2223
1:5         2033      1557        3960         2458
5:1         2368      2504        1816         2223

...duction in average time. Evidently, then, for some problems the effects can scale up, so if efficient means can be found for computing weighted sums (or even approximations), this technique may be of practical importance.

Table 4. Three Heuristics with Different Weights

wt ratio   dom+stdeg+fd   dom+stdeg+dm/dg   dom+stdeg+bkwd

Mean nodes per problem. 50,10,0.184,0.37 problems. Other conventions as in Table 3.

Table 5. Five Heuristics with Different Weights

wt ratio (dom stdeg dm/dg fd bkwd)   nodes

50,10,0.184,0.37 problems. Other conventions as in Table 3.

5 Factor Analysis of Heuristic Performance

We turn now to the task of determining why search performance is sometimes improved (and sometimes worsened) by combining heuristics. To this end, a statistical technique called “factor analysis” was employed. Factor analysis is a technique for determining whether a set of measurements can be accounted for by a smaller number of “factors”.

Strictly speaking, the notion of a factor is solely statistical and refers either to a repackaging of the original patterns of variation (variance) across individuals and measurements or to a set of linear relations that can account for the original statistical results. However, since variation must have causes, this technique if used carefully can yield considerable insight into the causes underlying the measurements obtained. In other words, the factors may be closely related to underlying variables that are sufficient to account for much of the
variance.

Table 6. Results for Weighted Sums with Geometric Problems

                      combinations
heuristic   nodes     dom+fd          fd+stdeg
dom         11,222    1:1    399      1:1    570
dm/dg          368    1:3    541      1:3    554
dm/fd          372    3:1    337      3:1    586
fd             642    1:5    552      1:5    555
stdeg          550    5:1    337      5:1    592

... problems, and each measurement is an efficiency measure, such as search nodes or constraint checks, for a given heuristic. A factor extraction process is applied, based on a standard method of approximation. The present work uses the method of maximum likelihood, which starts from a hypothesis of common factors and determines maximum-likelihood estimates of them using the original correlation matrix [5].

Factor analysis methods such as the maximum likelihood method obtain factors that are uncorrelated with each other. In this case, each loading is identical to the correlation coefficient holding between the corresponding measure and factor [4].

Once obtained, the factors (which constitute a basis for a space with one dimension per factor) can be rotated according to various criteria. Here the varimax rotation was used; this method tries to eliminate negative loadings while producing maximal loadings on the smallest possible set of measures.

The interpretation of patterns of differences cannot assume that causal factors behave additively, only that patterns of variation can be derived from additive combinations. Factor analysis, therefore, can only identify common sources of variation whose interpretation requires further investigation.

5.2 Methodology

The software used in these analyses was System R. In this package, the factanal function was used for the factor analysis.

As already noted, maximum likelihood methods require the number of factors as input. Since the number of significant factors was not known beforehand, various numbers of factors were tested: first, to determine at what point factor extraction ceased to account for any significant part of the variance, and second, to determine which of these factors gave strong, reliable results. The first kind of test can be taken as setting an upper bound on the number of
useful factors. If there are other sources of variation than the ones emphasized here, since they are less important in their effects and less reliable across experiments, they are likely to be related to features of specific problem sets interacting with vagaries of the search process. In addition, the possible existence of further factors does not necessarily diminish the importance of the ones demonstrated here. (In other words, the explanatory process may have to be extended, but it will not need to backtrack if the arguments for the factors described here are cogent.)

5.3 Factor Patterns for CSP Heuristics and Their Interpretation

Table 7 shows selected results for an analysis based on 12 heuristics (described more fully in [3]). (The heuristics not included in the table were mainly diagnostic pseudo-heuristics: the FFx series of [6] and a variable ordering heuristic based on maximizing the summed “promise” across the values of a domain, derived from [7].) On the left are results from the basic experiment with the same set of random problems used in the present work. In this case, the analysis indicated that there were two major factors, but that min domain and max backward degree had idiosyncratic patterns of variation, reflected in their high uniqueness.

Further experiments showed that the latter were due to random choices made at the top of the search tree; this occurs because for these heuristics there is no distinction among variables at the start of search. The results of one of these experiments are shown on the right hand side of the table. In this test, for each measurement the first three choices were in lexical order; thereafter, a particular heuristic was used. In this experiment, therefore, the effect of initial random selections was equalized. As a result, all heuristics had moderate to high loadings on two major factors, and the proportion of the total variance accounted for by these two factors was 0.95.

Table 7. Factor Analysis for CSP heuristics

            heuristic alone                        3 lexical, heuristic
heuristic   nodes   factor1   factor2   unique     nodes   factor1   factor2   unique
                                                   19587   0.804     0.565     0.034
fd           2625   0.443     0.873     0.042      37536   0.708     0.488     0.261
stdeg        2000   0.486     0.835     0.067       7712   0.752     0.652     0.010
dm/fd        1621   0.909     0.404     0.010       8567   0.626     0.775     0.008

... weights, this rule was verified for all fifteen pairings of the original set of six heuristics (Table 8).

Table 8. Predicted and Actual Synergies for Equal-Weight Pairs

heuristics
first     second    compound
11,334     2625
11,334    27,391
11,334     2000
11,334     2076
11,334     1621
 2625     27,391
 2625      2000
 2625      2076
 2625      1621
27,391     2000
27,391     2076
27,391     1621
 2000      2076
 2000      1621
 2076      1621

Mean nodes per problem. 50,10,0.184,0.37 problems. Italicised entries are predicted synergy based on factor loadings. Bold entries show actual synergistic effects.

This rule is also consistent with the two cases of synergy among the products and quotients (cf. Table 2). However, for quotients there appears to be a further (reasonable) condition: that both the numerator and denominator favor selections consistent with those favored by the original heuristic. This consideration accounts for the failure to find synergy with the max forward degree/backward degree heuristic, although the components load most heavily on different factors. In this case, choosing according to this heuristic will favor variables with larger forward degrees, which accords with this component heuristic, but it will also favor variables with smaller backward degrees, which is counter to this component heuristic.

Evidence that the balance between the two factors is important can be found in Tables 3-5. This, in fact, seems to explain both the decline in quality for the two-heuristic combinations when the weights are made unequal (Table 3) and the distinguished heuristic phenomenon that was noted in the data shown in Table 4. In each case, the single heuristic that loaded on a different factor from the other two heuristics was the one that needed to be weighted more strongly. A similar phenomenon seems to be involved in the pattern of results shown in Table 5.

6.2 Synergies with FC

Striking instances of synergy can be obtained by using forward checking instead of MAC. For forward checking with random CSPs, it has been found that look-ahead heuristics show a marked fall-off in efficiency. Naturally, there is an increase in the number of search nodes for all heuristics in comparison with the results for MAC, but while the increase is by an order of magnitude for the contention heuristics, it is by three to four orders of magnitude for the look-ahead heuristics, as shown in Table 9.

Table 9. Performance of Forward Checking

heuristic   nodes

Notes. Basic heuristics and selected quotients. Means for 50,10,0.184,0.37 problems.

Despite this drastic loss in efficiency for one class of heuristics, the rule of combination presented above continues to hold, as shown in Table 10. Thus, despite the fact that for these problems forward checking with max forward degree required more than ... nodes on average, when this heuristic was combined with min domain, it sometimes produced an order-of-magnitude improvement with respect to the latter, which was the best individual heuristic in this combination. In this case synergy based on weighted sums of paired heuristics is most marked when the weights are unequal so as to favor the contention heuristic.

Table 10. Two-Heuristic Combinations with Forward Checking

wt ratio   dom+fd   dom+stdeg   dm/dg+fd

Notes. Mean nodes per problem. 50,10,0.184,0.37 problems.

6.3 Assessment of weighted-sum strategies in terms of heuristic policies

A more detailed analysis of heuristic quality can be made by assessing heuristic performance in terms of adherence to optimal policies. In addition to the overall policy of minimizing effort, there are two basic sub-policies, depending on whether search is currently on a solution path or in an insoluble subtree. In the former case, the optimal policy is to maximize the likelihood of remaining on the solution path (“promise” policy); in the latter, the optimal policy is to find a refutation of the original mistake as quickly as
possible (“fail-first” policy).

Although the two policies cannot be realized in practice (nor can the policy of finding a solution after a minimum number of search nodes), we can still measure adherence to these policies and thereby compare heuristics in these terms. For the promise policy, we have an absolute measure when probabilities can be reasonably assigned to alternative assignments, obtained by summing probability products at each level of search (described in [9]); for the fail-first policy, we can obtain a relative measure by averaging the sizes of all the insoluble subtrees (cf. [10]). The latter measure can be obtained for either the entire search tree or the part of the tree explored before finding the first solution. Although the promise measure necessarily involves the entire search tree, a rough measure can be obtained for the part of the tree explored by counting the number of “mistakes” (the number of times an insoluble subtree was entered).

For measures of overall efficiency we use, as before, the number of search nodes. We also include the number of failures, which has been suggested recently as an alternative measure of overall performance [11].

Table 11 gives data for promise and fail-first measures, together with measures of overall efficiency, and some descriptive measures of heuristic performance. This analysis involved max forward degree, min domain and the weighted sum of the two (with equal weights; cf. Table 2). These data indicate that this heuristic combination shows better adherence to both the fail-first and promise policies than the component heuristics acting alone.

The descriptive measures suggest that the heuristic combination tends to compromise on the beneficial effects of the two components, since the measures for the former always fall between those for the latter. This is particularly interesting in connection with the consistent superiority demonstrated in the quality measures.

7 Limits to Synergy?

Now that some understanding of heuristic combination has been
obtained, an important question is whether this knowledge can be used to even greater effect than in the tests reported in earlier sections.

Two kinds of strategies have been tried, using weighted sums. The first is based on the finding that if results for weighted sums are added to the factor analysis, they are usually found to load more heavily on one factor than the other. Hence, according to the rule for combining heuristics to produce synergies, it should be possible by appropriate weighting to combine a given weighted sum that loads most heavily on one factor with a basic heuristic (or weighted sum) that loads most heavily on the other. The second strategy was to combine more powerful heuristics that show differences in loadings, to see if synergistic effects can be obtained that are greater than those found by combining simpler heuristics.

Although heuristic combinations tend to load more heavily on one factor than another (in most cases on the contention factor), it was not possible to combine them with other heuristics to obtain greater synergy. This finding was, in fact, anticipated in the results shown in Tables 2-5.

The work with more advanced heuristics is still in its preliminary stages. To date, only one such combination has been tested, which involved the min domain/weighted-degree heuristic of [12] and the min kappa heuristic of [13]. These were chosen for combination because for the 50-variable problems the former loaded more heavily on the contention factor while the latter was more heavily correlated with the look-ahead factor. When used individually, the mean nodes searched was 1575 for min kappa and 1517 for min domain/weighted-degree. Both are superior to the best single heuristic or quotient in the previous tests. When these were combined into a weighted sum, with equal weight for each heuristic, the mean nodes was 1309. Therefore, synergy did occur as predicted, but the result was no better than the best results in the earlier experiments.

Table 11. Policy-Adherence and Descriptive Measures for Heuristics and Weighted Sum

measure   max forward degree   min domain   dom+fd

Note. Means for 50,10,0.184,0.37 problems.

8 Conclusions

This paper presents a study of the effects of combining heuristics. It begins by presenting a collection of data showing significant cases of synergy, as well as striking patterns of synergistic and non-synergistic effects. Some of these effects are counter-intuitive, since the heuristics in a synergistic combination may result in mediocre or even dreadful performance when used alone. The work also shows that weighted sums are in fact quite good at improving search performance by reducing the amount of search; this must be because this strategy improves the quality of variable selection.

Some insight was gained into the basis for this improvement, using the recent discovery that there are two basic types of heuristic action, here labeled “contention” and “look-ahead” [3]. This, in turn, led to the formulation of a simple rule for predicting success on the basis of the factor loadings of component heuristics. The success of this rule suggests that heuristic combinations work to improve search performance by balancing the two basic actions. Conversely, when the two are not well-balanced, as in some of the cases in Tables 1-6, performance is not improved and can even deteriorate in comparison with the component heuristics. Preliminary analysis of the features of search based on effective combinations supports these hypotheses.

This work raises a number of important questions to address in further research:

How general are the present synergistic principles with respect to problem classes? (So far they seem quite general with respect to algorithms.)
To what degree are more advanced heuristics managing to balance the two basic heuristic factors? Is this why they are effective?
Are there other principles underlying improvements in heuristic performance, such as the degree to which alternative choices can be discriminated based on different amounts of
information?
How does the kind of balancing in evidence here serve to restrict search?

Acknowledgment. This work was supported by Science Foundation Ireland under Grant 00/PI.1/C075.

References

1. Bessière, C., Régin, J.C.: MAC and combined heuristics: Two reasons to forsake FC (and CBJ?) on hard problems. In Freuder, E.C., ed.: Principles and Practice of Constraint Programming - CP’96. LNCS No. 1118, Berlin, Springer (1996) 61–75
2. Epstein, S.L., Freuder, E.C., Wallace, R., Morozov, A., Samuels, B.: The adaptive constraint engine. In van Hentenryck, P., ed.: Principles and Practice of Constraint Programming - CP2002. LNCS No. 2470, Berlin, Springer (2002) 525–540
3. Wallace, R.J.: Factor analytic studies of CSP heuristics. In van Beek, P., ed.: Principles and Practice of Constraint Programming - CP’05. LNCS No. 3709, Berlin, Springer (2005) 712–726
4. Harman, H.H.: Modern Factor Analysis. 2nd edn. University of Chicago, Chicago and London (1967)
5. Lawley, D.N., Maxwell, A.E.: Factor Analysis as a Statistical Method. 2nd edn. Butterworths, London (1971)
6. Smith, B.M., Grant, S.A.: Trying harder to fail first. In: Proc. Thirteenth European Conference on Artificial Intelligence - ECAI’98, John Wiley & Sons (1998) 249–253
7. Geelen, P.A.: Dual viewpoint heuristics for binary constraint satisfaction problems. In: Proc. Tenth European Conference on Artificial Intelligence - ECAI’92. (1992) 31–35
8. Wallace, R.J.: CSP heuristics categorized with factor analytic. In Creaney, N., ed.: Proc. Sixteenth Irish Conference on Artificial Intelligence and Cognitive Science, Coleraine, NI, University of Ulster (2005) 213–222
9. Beck, J.C., Prosser, P., Wallace, R.J.: Variable ordering heuristics show promise. In: Principles and Practice of Constraint Programming - CP’04. LNCS No. 3258. (2004) 711–715
10. Beck, J.C., Prosser, P., Wallace, R.J.: Trying again to fail-first. In: Recent Advances in Constraints. Papers from the 2004 ERCIM/CologNet Workshop - CSCLP 2004. LNAI No. 3419, Berlin, Springer (2005) 41–55
11. Bessière, C., Zanuttini, B., Fernández, C.: Measuring search trees. In: ECAI 2004 Workshop on Modelling and Solving Problems with Constraints. (2004) 31–40
12. Boussemart, F., Hemery, F., Lecoutre, C., Sais, L.: Boosting systematic search by weighting constraints. In: Proc. Sixteenth European Conference on Artificial Intelligence - ECAI’04. (2004) 146–150
13. Gent, I., MacIntyre, E., Prosser, P., Smith, B., Walsh, T.: An empirical study of dynamic variable ordering heuristics for the constraint satisfaction problem. In: Principles and Practice of Constraint Programming - CP’96. LNCS No. 1118. (1996) 179–193