Introduction to Algorithms: Chapter 27 Solutions
Introduction to Algorithms Documents
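1 Lecture 1: Course details; introduction: analysis of algorithms, insertion sort, merge sort. Reading: Chapters 1-2. Quiz 0 out.
2 Recitation 1: Correctness of algorithms. Problem Set 1 out.
3 Lecture 2: Asymptotic notation. Recurrences: substitution method, iteration method, master method. Reading: Chapters 3-4, except §4.4.
4 Lecture 3: Divide and conquer: Strassen's algorithm, Fibonacci numbers, polynomial multiplication. Reading: Chapter 28 Section 2, Chapter 30 Section 1.
5 Recitation 2: Recurrences, sloppiness. Reading: Akra-Bazzi handout.
6 Lecture 4: Quicksort, randomized algorithms. Reading: Chapter 5 Sections 1-3, Chapter 7. Problem Set 1 due; Problem Set 2 out.
7 Recitation 3: Sorting: heapsort, dynamic sets, priority queues. Reading: Chapter 6.
8 Lecture 5: Linear-time sorting: lower bounds, counting sort, radix sort. Reading: Chapter 8 Sections 1-3. Problem Set 2 due; Problem Set 3 out.
9 Lecture 6: Order statistics, median. Reading: Chapter 9.
10 Recitation 4: Applications of the median, bucket sort. Reading: Chapter 8 Section 4.
11 Lecture 7: Hashing, universal hashing. Reading: Chapter 11 Sections 1-3. Problem Set 3 due; Problem Set 4 out.
12 Lecture 8: Hash functions, perfect hashing. Reading: Chapter 11 Section 5.
13 Recitation 5: Quiz 1 review. Problem Set 4 due.
14 Graded Problem Set 4 available at noon.
15 Quiz 1.
16 Recitation 6: Binary search trees, tree walks. Reading: Chapter 12 Sections 1-3.
17 Lecture 9: Relation between binary search trees and quicksort; analysis of random binary search trees. Reading: Chapter 12 Section 4. Problem Set 5 out.
18 Lecture 10: Red-black trees, rotations, insertion, deletion. Reading: Chapter 13.
19 Recitation 7: 2-3 trees, B-trees. Reading: Chapter 18 Sections 1-2.
20 Lecture 11: Augmenting data structures, dynamic order statistics, interval trees. Reading: Chapter 14. Problem Set 5 due; Problem Set 6 out.
21 Lecture 12: Computational geometry, range queries. Reading: Chapter 33 Sections 1-2.
22 Recitation 8: Convex polygons. Reading: Chapter 33 Section 3.
23 Lecture 13: van Emde Boas trees, priority queues. Reading: van Emde Boas handout. Problem Set 6 due; Problem Set 7 out.
24 Lecture 14: Amortized analysis, table doubling, potential method. Reading: Chapter 17.
25 Recitation 9: Competitive analysis, self-organizing lists.
26 Lecture 15: Dynamic programming, longest common subsequence, optimal binary search trees. Reading: Chapter 15. Problem Set 7 due; Problem Set 8 out.
27 Lecture 16: Greedy algorithms, minimum spanning trees. Reading: Chapter 16 Sections 1-3, Chapter 23.
28 Recitation 10: Examples of greedy algorithms and dynamic programming.
29 Lecture 17: Shortest paths 1, Dijkstra's algorithm, breadth-first search. Reading: Chapter 22 Sections 1-2; pages 580-587; Chapter 24 Section 3. Problem Set 8 due; Problem Set 9 out.
30 Recitation 11: Depth-first search, topological sort. Reading: Chapter 22 Sections 3-5.
31 Lecture 18: Shortest paths 2, Bellman-Ford algorithm, DAG shortest paths, difference constraints. Reading: Chapter 24 Sections 1, 2, 4, 5.
32 Lecture 19: All-pairs shortest paths, Floyd-Warshall, Johnson's algorithm. Reading: Chapter 25. Problem Set 9 due.
33 Lecture 20: Data structures for disjoint sets. Reading: Chapter 21.
34 Graded Problem Set 9 available at noon.
35 Lecture 21: Take-home Quiz 2 handed out; ethics, problem solving (attendance mandatory). Quiz 2 out.
36 No recitation: work on Quiz 2!
37 No lecture. Algorithm programming contest begins (optional). Quiz 2 due.
38 Lecture 22: Network flow, max-flow min-cut theorem. Reading: Chapter 26 Sections 1-2. Problem Set 10 out (optional).
39 Recitation 12: Graph matching algorithms (note: maximum bipartite matching). Reading: Chapter 26 Section 3.
40 Lecture 23: Network flow, Edmonds-Karp algorithm. Contest entries due.
41 Lecture 24: In-class quiz; contest awards; discussion of follow-on courses. Problem Set 10 solutions.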
Introduction to Algorithms Exercise Solutions (5)
Three-hole punch your paper on submissions. You will often be called upon to “give an algorithm” to solve a certain problem. Your write-up should take the form of a short essay. A topic paragraph should summarize the problem you are solving and what your results are. The body of the essay should provide the following:
(a) Argue that this problem exhibits optimal substructure.
Solution: First, notice that linecost(i, j) is defined to be ∞ if the words i through j do not fit on a line, to guarantee that no lines in the optimal solution overflow. (This relies on the assumption that the length of each word is not more than M.) Second, notice that linecost(i, j) is defined to be 0 when j = n, where n is the total number of words; only the actual last line has zero cost, not the recursive last lines of subproblems, which, since they are not the last line overall, have the same cost formula as any other line.
Introduction to Algorithms: Course Assignment Solutions
Introduction to Algorithms
Massachusetts Institute of Technology 6.046J/18.410J
Singapore-MIT Alliance SMA5503
Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson
Handout 10
Diagnostic Test Solutions

Problem 1
Consider the following pseudocode:
ROUTINE(n)
    if n = 1
        then return 1
        else return n + ROUTINE(n − 1)
(a) Give a one-sentence description of what ROUTINE(n) does. (Remember, don't guess.)
Solution: The routine gives the sum from 1 to n.
(b) Give a precondition for the routine to work correctly.
Solution: The value n must be greater than 0; otherwise, the routine loops forever.
(c) Give a one-sentence description of a faster implementation of the same routine.
Solution: Return the value n(n + 1)/2.

Problem 2
Give a short (1-2 sentence) description of each of the following data structures:
(a) FIFO queue
Solution: A dynamic set where the element removed is always the one that has been in the set for the longest time.
(b) Priority queue
Solution: A dynamic set where each element has an associated priority value. The element removed is the element with the highest (or lowest) priority.
(c) Hash table
Solution: A dynamic set where the location of an element is computed using a function of the element's key.

Problem 3
Using Θ-notation, describe the worst-case running time of the best algorithm that you know for each of the following:
(a) Finding an element in a sorted array. Solution: Θ(log n)
(b) Finding an element in a sorted linked list. Solution: Θ(n)
(c) Inserting an element in a sorted array, once the position is found. Solution: Θ(n)
(d) Inserting an element in a sorted linked list, once the position is found. Solution: Θ(1)

Problem 4
Describe an algorithm that locates the first occurrence of the largest element in a finite list of integers, where the integers are not necessarily distinct. What is the worst-case running time of your algorithm?
Solution: The idea is as follows: go through the list, keeping track of the largest element found so far and its index. Update whenever necessary. Running time is Θ(n).

Problem 5
How does the height h of a balanced binary search tree relate to the number of nodes n in the tree?
Solution: h = O(lg n)

Problem 6
Does an undirected graph with 5 vertices, each of degree 3, exist? If so, draw such a graph. If not, explain why no such graph exists.
Solution: No such graph exists by the Handshaking Lemma. Every edge adds 2 to the sum of the degrees. Consequently, the sum of the degrees must be even.

Problem 7
It is known that if a solution to Problem A exists, then a solution to Problem B exists also.
(a) Professor Goldbach has just produced a 1,000-page proof that Problem A is unsolvable. If his proof turns out to be valid, can we conclude that Problem B is also unsolvable? Answer yes or no (or don't know).
Solution: No
(b) Professor Wiles has just produced a 10,000-page proof that Problem B is unsolvable. If the proof turns out to be valid, can we conclude that Problem A is unsolvable as well? Answer yes or no (or don't know).
Solution: Yes

Problem 8
Consider the following statement: If 5 points are placed anywhere on or inside a unit square, then there must exist two that are no more than √2/2 units apart. Here are two attempts to prove this statement.
Proof (a): Place 4 of the points on the vertices of the square; that way they are maximally separated from one another. The 5th point must then lie within √2/2 units of one of the other points, since the furthest from the corners it can be is the center, which is exactly √2/2 units from each of the four corners.
Proof (b): Partition the square into 4 squares, each with a side of 1/2 unit. If any two points are on or inside one of these smaller squares, the distance between these two points will be at most √2/2 units. Since there are 5 points and only 4 squares, at least two points must fall on or inside one of the smaller squares, giving a set of points that are no more than √2/2 apart.
Which of the proofs are correct: (a), (b), both, or neither (or don't know)?
Solution: (b) only

Problem 9
Give an inductive proof of the following statement: For every natural number n > 3, we have $n! > 2^n$.
Solution: Base case: True for n = 4. Inductive step: Assume $n! > 2^n$. Then, multiplying both sides by (n + 1), we get $(n+1)\,n! > (n+1)\,2^n > 2 \cdot 2^n = 2^{n+1}$.

Problem 10
We want to line up 6 out of 10 children. Which of the following expresses the number of possible line-ups? (Circle the right answer.)
(a) 10!/6!
(b) 10!/4!
(c) $\binom{10}{6}$
(d) $\binom{10}{4} \cdot 6!$
(e) None of the above
(f) Don't know
Solution: (b) and (d) are both correct

Problem 11
A deck of 52 cards is shuffled thoroughly. What is the probability that the 4 aces are all next to each other? (Circle the right answer.)
(a) 4! 49!/52!
(b) 1/52!
(c) 4!/52!
(d) 4! 48!/52!
(e) None of the above
(f) Don't know
Solution: (a)

Problem 12
The weather forecaster says that the probability of rain on Saturday is 25% and that the probability of rain on Sunday is 25%. Consider the following statement: The probability of rain during the weekend is 50%. Which of the following best describes the validity of this statement?
(a) If the two events (rain on Sat/rain on Sun) are independent, then we can add up the two probabilities, and the statement is true. Without independence, we can't tell.
(b) True, whether the two events are independent or not.
(c) If the events are independent, the statement is false, because the probability of no rain during the weekend is 9/16. If they are not independent, we can't tell.
(d) False, no matter what.
(e) None of the above.
(f) Don't know.
Solution: (c)

Problem 13
A player throws darts at a target. On each trial, independently of the other trials, he hits the bull's-eye with probability 1/4. How many times should he throw so that his probability is 75% of hitting the bull's-eye at least once?
(a) 3
(b) 4
(c) 5
(d) 75% can't be achieved.
(e) Don't know.
Solution: (c), assuming that we want the probability to be ≥ 0.75, not necessarily exactly 0.75.

Problem 14
Let X be an indicator random variable. Which of the following statements are true? (Circle all that apply.)
(a) Pr{X = 0} = Pr{X = 1} = 1/2
(b) Pr{X = 1} = E[X]
(c) E[X] = E[X²]
(d) E[X] = (E[X])²
Solution: (b) and (c) only.
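Several of the diagnostic answers above can be checked numerically. The following is a minimal sketch, not part of the handout: the function names `routine` and `min_throws` are illustrative. It checks Problem 1 (the recursive sum against the closed form), Problem 10 (that options (b) and (d) agree), and Problem 13 (the smallest number of throws reaching 75%).

```python
from math import comb, factorial


def routine(n):
    """Direct transcription of ROUTINE(n); requires n >= 1."""
    if n == 1:
        return 1
    return n + routine(n - 1)


def min_throws(p_hit, target):
    """Smallest number of independent throws whose probability of
    at least one hit is >= target."""
    throws = 0
    p_all_miss = 1.0
    while 1.0 - p_all_miss < target:
        p_all_miss *= 1.0 - p_hit
        throws += 1
    return throws


# Problem 1: the recursion agrees with n(n + 1)/2.
print(routine(10), 10 * 11 // 2)
# Problem 10: 10!/4! equals C(10, 4) * 6!.
print(factorial(10) // factorial(4), comb(10, 4) * factorial(6))
# Problem 13: throws needed with hit probability 1/4.
print(min_throws(0.25, 0.75))
```

The loop in `min_throws` mirrors the reasoning in the solution: the probability of at least one hit after k throws is 1 − (3/4)^k.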
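藏书阁 - Introduction to Algorithms: Summary of Common Algorithms
Summary of common algorithms

Divide and conquer
The idea of the divide-and-conquer strategy: as the name suggests, divide and conquer splits an original problem into several subproblems that have the same form as the original problem but are smaller in size. Once the subproblems are solved, the solution to the original problem follows naturally.
In summary, it can be broken into three steps:
Divide: split the original problem into subproblems of the same form. Their sizes may differ, for example a half/half split or a 2/3 to 1/3 split.
Conquer: solve the subproblems recursively; once a subproblem is small enough, stop recursing and solve it directly.
Combine: merge the solutions of the subproblems into a solution of the original problem.
This raises the question of how the subproblems are solved; the answer is recursive calls on the call stack.
Recurrences and divide and conquer are therefore closely linked: a recurrence is a natural way to describe the running time of a divide-and-conquer algorithm.
So if you ask me how divide and conquer relates to recursion, I would answer: divide and conquer relies on recursion. Divide and conquer is an idea, recursion is a means, and a recurrence describes the time complexity of a divide-and-conquer algorithm.
This leads to the focus of this chapter: how do we solve recurrences?

When divide and conquer applies
Problems that can be solved by divide and conquer generally have the following characteristics:
1. The problem becomes easy to solve once its size is reduced far enough.
2. The problem can be decomposed into several smaller problems of the same kind, that is, the problem has the optimal-substructure property.
3. The solutions of the subproblems can be combined into a solution of the original problem.
4. The subproblems are independent of one another, that is, they share no common sub-subproblems.
The first characteristic holds for the vast majority of problems, because computational cost generally grows with problem size. The second characteristic is the precondition for applying divide and conquer; most problems satisfy it as well, and it reflects the use of recursive thinking. The third characteristic is the key one: whether divide and conquer can be used depends entirely on it. If a problem has the first two characteristics but not the third, consider a greedy method or dynamic programming instead.
The fourth characteristic concerns efficiency. If the subproblems are not independent, divide and conquer does a lot of unnecessary work by solving shared subproblems repeatedly; divide and conquer can still be used in that case, but dynamic programming is usually the better choice.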
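The three steps described above (divide, conquer, combine) can be sketched with merge sort, which the course schedule lists as a Lecture 1 topic. This is a minimal illustration, assuming plain Python lists; the name `merge_sort` is chosen here and does not come from the notes.

```python
def merge_sort(items):
    """Sort a list by divide and conquer and return a new sorted list."""
    # Base case: a list of length 0 or 1 is small enough to solve directly.
    if len(items) <= 1:
        return list(items)

    # Divide: split into two halves of (almost) equal size.
    mid = len(items) // 2

    # Conquer: solve each half recursively.
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Combine: merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))
```

The two recursive calls on halves plus a linear-time merge correspond to the recurrence T(n) = 2T(n/2) + Θ(n), which is the kind of recurrence the notes refer to when they say a recurrence describes the running time.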
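----------------------------------------
Max-heaps and min-heaps
1. Heaps
A heap looks like a binary tree, but in essence it is an array object: when we operate on a heap we view it as a complete binary tree, and each node of the tree corresponds to the array element that stores that node's value.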
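The correspondence between tree nodes and array slots can be made concrete with index arithmetic. The sketch below assumes 0-based Python lists (textbook pseudocode usually numbers from 1), and the helper names `parent`, `left`, `right`, `max_heapify`, and `build_max_heap` are illustrative rather than taken from these notes.

```python
def parent(i):
    """Index of the parent of node i (0-based array layout)."""
    return (i - 1) // 2


def left(i):
    """Index of the left child of node i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2


def max_heapify(a, i, size):
    """Sift a[i] down until the subtree rooted at i is a max-heap.
    Assumes both child subtrees already satisfy the max-heap property."""
    while True:
        largest = i
        l, r = left(i), right(i)
        if l < size and a[l] > a[largest]:
            largest = l
        if r < size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn an arbitrary list into a max-heap in place."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(data)
print(data[0])  # the maximum sits at the root
```

No pointers are stored anywhere: the tree shape is implied entirely by the positions in the list, which is what the paragraph means by calling the heap an array object.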
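Introduction to Algorithms (2nd Edition, Chinese version): Exercise Answers
Introduction to Algorithms (2nd Edition) Reference Answers
    do z ← y                                ▷ save the result before the call
       y ← INTERVAL-SEARCH-SUBTREE(y, i)
    ▷ If the loop ended because y has no left subtree, we return y;
    ▷ otherwise we return z, which means no overlapping interval was found in z's left subtree.
    if y ≠ nil[T] and i overlaps int[y]
        then return y
        else return z
15.1-5 From the FASTEST-WAY algorithm we know: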
15
lg n
2 lg n1 1 2cn 2 cn (n 2 ) 2 1
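4.3-1 a) $n^2$ b) $n^2 \lg n$ c) $n^3$
4.3-4 $n^2 \lg^2 n$
7.1-2
(1) Using the PARTITION function on p. 146 gives q = r. Note that i increases by 1 on every iteration. Its initial value is p − 1 and the loop runs (r − 1) − p + 1 times, so the value finally returned is i + 1 = (p − 1) + (r − 1) − p + 1 + 1 = r.
(2) The exercise requires q = ⌊(p + r)/2⌋, so the variables i and j in PARTITION should both move inside the loop:
Partition(A, p, r)
    x = A[p];
    i = p - 1;
    j = r + 1;
    while (TRUE)
        repeat j--; until A[j] <= x;
        repeat i++; until A[i] >= x;
        if (i < j)
            Swap(A, i, j);
        else
            return j;
7.3-2
(1) From the worst-case analysis of QuickSort: the n elements are split into n − 1 and 1 every time. Because the call is made only when p < r, the count is Θ(n).
(2) In the best case every split falls exactly in the middle, so the recurrence is N(n) = 1 + 2N(n/2), and it is not hard to see that N(n) = Θ(n).
7.4-2
T(n) = 2T(n/2) + Θ(n) gives T(n) = Θ(n lg n). By Theorem 3.1 on p. 46 this yields Ω(n lg n).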
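The two-pointer partition given for Exercise 7.1-2 (2) can be tried directly. The sketch below is a Python transcription under the assumption of 0-based inclusive indices; the name `hoare_partition` is a label chosen here, since the answer itself only calls the routine Partition. Each `repeat ... until` loop becomes one unconditional step followed by a `while`.

```python
def hoare_partition(a, p, r):
    """Partition a[p..r] (inclusive) around the pivot x = a[p].
    Returns j such that every element of a[p..j] is <= every
    element of a[j+1..r]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        # repeat j-- until a[j] <= x
        j -= 1
        while a[j] > x:
            j -= 1
        # repeat i++ until a[i] >= x
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


# With all elements equal, both pointers advance one step per iteration
# and meet in the middle, which is the behaviour the exercise asks for.
equal = [7] * 8
print(hoare_partition(equal, 0, len(equal) - 1))
```

When every element equals the pivot, neither inner `while` loop runs, so i and j each move exactly one position per outer iteration and the returned index is ⌊(p + r)/2⌋.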
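Algorithm Answers
Algorithm review

Q: What is a basic operation?
A: A basic operation is the operation that dominates the work of solving the problem (usually one kind, occasionally two). When judging how good an algorithm is, only the number of times the basic operation is executed is considered.

Q: What is the time complexity of an algorithm?
A: The time complexity of an algorithm expresses the number of basic operations as a function of the input size, for example T(n) = 4n^3.

Q: What is the asymptotic time complexity of an algorithm?
A: The time complexity as the input size tends to the limiting case (becomes very large).

Q: What are the precise definitions of the three notations for asymptotic time complexity?
A:
1. T(n) = O(f(n)): there exist c > 0 and a positive integer n0 ≥ 1 such that T(n) ≤ c·f(n) for all n ≥ n0. (This gives an upper bound on the time complexity: it cannot be larger than c·f(n).)
2. T(n) = Ω(f(n)): there exist c > 0 and a positive integer n0 ≥ 1 such that, among n ≥ n0, T(n) ≥ c·f(n) holds for infinitely many n. (This gives a lower bound on the time complexity: it cannot be smaller than c·f(n).)
3. T(n) = Θ(f(n)): there exist c1, c2 > 0 and a positive integer n0 ≥ 1 such that T(n) ≤ c1·f(n) for all n ≥ n0, and T(n) ≥ c2·f(n) holds for infinitely many n; that is, both T(n) = O(f(n)) and T(n) = Ω(f(n)) hold. (This gives both an upper and a lower bound on the time complexity.)

Q: What is worst-case time complexity? What is average-case time complexity?
A: Worst-case time complexity is the time complexity of the input, among all inputs of size n, on which the basic operation is executed the most times.
Average-case time complexity is the average of the time complexity over all inputs of size n (it is generally assumed that every input occurs with equal probability).

Q: What is generally meant by an algorithm? What is a computational procedure?
A: An algorithm is generally taken to be a finite sequence of instructions with five properties: a. definiteness (no ambiguity); b. effectiveness (every instruction can be carried out); c. input; d. output; e. finiteness (every instruction is executed a finite number of times). A finite sequence of instructions that satisfies the first four properties but not the fifth is usually called a computational procedure.

Q: What are the main steps in studying an algorithm? By which main criteria are algorithms evaluated?
A: The main steps are 1) design, 2) representation, 3) validation, including the handling of legal and illegal inputs, 4) analysis, 5) testing. The evaluation criteria are 1) correctness, 2) robustness, 3) simplicity, 4) efficiency, 5) optimality.

Q: What conclusions hold about polynomial time versus exponential time?
A: 1. Polynomial-time algorithms differ from one another, but they are generally acceptable.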
Introduction to Algorithms (2nd Edition) Exercise Solutions (English version)
Last update: December 9, 2002
1.2-2 Insertion sort beats merge sort when $8n^2 < 64n \lg n \Rightarrow n < 8 \lg n \Rightarrow 2^{n/8} < n$. This is true for $2 \le n \le 43$ (found by using a calculator). Rewrite merge sort to use insertion sort for input of size 43 or less in order to improve the running time.
1-1 We assume that all months are 30 days and all years are 365.
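The range quoted for Exercise 1.2-2 was "found by using a calculator"; the same search can be scripted. This is a small sketch that takes the two cost functions $8n^2$ and $64n \lg n$ from the solution as given; the function name `insertion_beats_merge` and the search limit of 200 are arbitrary choices made here.

```python
from math import log2


def insertion_beats_merge(n):
    """True when 8n^2 steps (insertion sort) is strictly fewer than
    64 n lg n steps (merge sort)."""
    return 8 * n * n < 64 * n * log2(n)


winners = [n for n in range(1, 200) if insertion_beats_merge(n)]
print(winners[0], winners[-1])
```

For n = 1 the right-hand side is 0, so the inequality fails there, which is why the range starts at 2.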
$\Theta\left(\sum_{i=1}^{n} i\right) = \Theta(n^2)$
This holds for both the best- and worst-case running time.
2.2-3 Given that each element is equally likely to be the one searched for and the element searched for is present in the array, a linear search will on the average have to search through half the elements. This is because half the time the wanted element will be in the first half and half the time it will be in the second half. Both the worst-case and average-case of LINEAR-SEARCH is Θ(n).
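Introduction to Algorithms Reference Answers
Chapter 2: Getting Started
Because of time constraints some problems are not written up carefully, and there are probably quite a few inaccuracies here.
Also, for Problem 2-3 on Horner's rule some parts are unfinished, so I have not included the solution. I have doubts about part (c) and would welcome suggestions from anyone who has a solution.
The code given here is for now only meant to solve the problem and has not been optimized; please bear with me, I will revise it properly when I have time.

Insertion sort pseudocode:
INSERTION-SORT(A)
    for j ← 2 to length[A]
        do key ← A[j]
           ▷ Insert A[j] into the sorted sequence A[1..j-1]
           i ← j - 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i − 1
           A[i+1] ← key

A C# implementation of insertion sort:
public static void InsertionSort<T>(T[] Input) where T : IComparable<T>
{
    T key;
    int i;
    for (int j = 1; j < Input.Length; j++)
    {
        key = Input[j];
        i = j - 1;
        // Shift larger elements one slot to the right.
        for (; i >= 0 && Input[i].CompareTo(key) > 0; i--)
            Input[i + 1] = Input[i];
        Input[i + 1] = key;
    }
}

Insertion sort is designed with the incremental approach: having sorted the subarray A[1..j-1], we insert the element A[j] to form the sorted subarray A[1..j].
Note that arrays in most programming languages start at index 0, which differs from the pseudocode's convention that the first element has index 1, so several key values are generally 1 smaller than in the pseudocode. Following the convention of most programming languages, the pseudocode becomes:
INSERTION-SORT(A)
    for j ← 1 to length[A] − 1
        do key ← A[j]
           i ← j - 1
           while i ≥ 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i − 1
           A[i+1] ← key
The loop invariant is an important tool for proving that an algorithm is correct.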
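The loop invariant mentioned above can be checked at run time. The sketch below is a Python version of the 0-based pseudocode with the invariant written as assertions; the variable `original` and the exact form of the assertions are additions made here for illustration, not part of the original answer.

```python
def insertion_sort(a):
    """Sort list a in place (0-based indices) and return it."""
    original = list(a)  # kept only so the invariant can be checked
    for j in range(1, len(a)):
        # Loop invariant: a[0..j-1] consists of the elements originally
        # in positions 0..j-1, now in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        # Shift elements larger than key one slot to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: the invariant with j = len(a) says the whole list is sorted.
    assert a == sorted(original)
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))
```

The assertion at the top of the loop corresponds to the initialization and maintenance steps of a loop-invariant proof, and the final assertion to the termination step.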
Introduction to Algorithms Exercise Solutions, Chapter 26
Solution to Exercise 26.2-11
For any two vertices $u$ and $v$ in $G$, we can define a flow network $G_{uv}$ consisting of the directed version of $G$ with $s = u$, $t = v$, and all edge capacities set to 1. (The flow network $G_{uv}$ has $V$ vertices and $2|E|$ edges, so that it has $O(V)$ vertices and $O(E)$ edges, as required. We want all capacities to be 1 so that the number of edges of $G$ crossing a cut equals the capacity of the cut in $G_{uv}$.) Let $f_{uv}$ denote a maximum flow in $G_{uv}$. We claim that for any $u \in V$, the edge connectivity $k$ equals $\min_{v \in V - \{u\}} \{|f_{uv}|\}$. We'll show below that this claim holds. Assuming that it holds, we can find $k$ as follows:

EDGE-CONNECTIVITY(G)
    k = ∞
    select any vertex u ∈ G.V
    for each vertex v ∈ G.V − {u}
        set up the flow network G_uv as described above
        find the maximum flow f_uv on G_uv
        k = min(k, |f_uv|)
    return k

The claim follows from the max-flow min-cut theorem and how we chose capacities so that the capacity of a cut is the number of edges crossing it. We prove that $k = \min_{v \in V - \{u\}} \{|f_{uv}|\}$, for any $u \in V$ by showing separately that $k$ is at least this
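The EDGE-CONNECTIVITY procedure above can be sketched in Python. This is an illustration under stated assumptions, not the solution's own code: vertices are numbered 0..n−1, the graph is given as a list of undirected edges, and the maximum flow is found with a BFS-based augmenting-path search (Edmonds-Karp) on an adjacency matrix. The names `max_flow_unit` and `edge_connectivity` are chosen here.

```python
from collections import deque


def max_flow_unit(n, edges, s, t):
    """Maximum s-t flow in the directed version of an undirected graph
    where every directed edge has capacity 1 (BFS augmenting paths)."""
    cap = [[0] * n for _ in range(n)]
    for a, b in edges:
        cap[a][b] += 1  # each undirected edge becomes two directed edges
        cap[b][a] += 1
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        pred = [-1] * n
        pred[s] = s
        queue = deque([s])
        while queue and pred[t] == -1:
            x = queue.popleft()
            for y in range(n):
                if cap[x][y] > 0 and pred[y] == -1:
                    pred[y] = x
                    queue.append(y)
        if pred[t] == -1:
            return flow  # no augmenting path left
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        y = t
        while y != s:
            bottleneck = min(bottleneck, cap[pred[y]][y])
            y = pred[y]
        y = t
        while y != s:
            x = pred[y]
            cap[x][y] -= bottleneck
            cap[y][x] += bottleneck
            y = x
        flow += bottleneck


def edge_connectivity(n, edges):
    """Minimum over v != u of the max flow from a fixed vertex u to v.
    Assumes n >= 2."""
    u = 0
    k = float("inf")
    for v in range(1, n):
        k = min(k, max_flow_unit(n, edges, u, v))
    return k


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(edge_connectivity(4, cycle))
```

Only |V| − 1 maximum-flow computations are needed because u is fixed, as in the pseudocode. The adjacency matrix keeps the sketch short; an adjacency-list residual graph would be the usual choice for sparse graphs.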
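Introduction to Algorithms, Third Edition: New Chapter 27 (Chinese Edition)
Multithreaded Algorithms (complete version): the new Chapter 27 of Introduction to Algorithms, 3rd edition
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Translated by 邓辉 (Deng Hui)
Original: /sites/products/documentation/cilk/book_chapter.pdf

The main algorithms in this book are serial algorithms, suited to running on a uniprocessor computer that can execute only one instruction at a time.
In this chapter we turn the algorithmic model toward parallel algorithms, which can run on multiprocessor computers that execute several instructions simultaneously.
We focus on exploring the elegant model of dynamic multithreaded algorithms, which supports algorithm design and analysis and is also amenable to efficient implementation.
Parallel computers (computers with multiple processing units) have become increasingly common, and they span a wide range of prices and performance.
Relatively inexpensive are desktop and laptop chip multiprocessors, which contain a single multicore integrated-circuit chip housing multiple processing "cores", each of which is a full-fledged processor that can access a common memory.
At an intermediate price and performance point are clusters built from multiple individual computers (often just PC-class machines) connected by a dedicated network.
The highest-priced machines are supercomputers, which often use custom architectures and networks to deliver the highest performance (instructions executed per second).
Multiprocessor computers have existed in one form or another for decades.
The computing community settled on the random-access machine model for serial computing early in the history of computer science, but no single model for parallel computing has gained wide acceptance.
A major reason is that vendors have not agreed on an architectural model for parallel computers.
For example, some parallel computers use shared memory, where each processor can directly access any location of memory.
Other parallel computers use distributed memory, where each processor's memory is private and an explicit message must be sent to another processor in order to access its memory.
With the advent of multicore technology, however, every new laptop and desktop is now a shared-memory parallel computer, and the trend appears to be toward shared-memory multiprocessing.
Although time will tell, that is the approach we take in this chapter.
For chip multiprocessors and other shared-memory parallel computers, one common way of programming is static threading, which provides a software abstraction of "virtual processors", or threads, that share a common memory.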
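Algorithms, Fourth Edition: Exercise Answers Explained

1.1.1 Give the value of each of the following expressions:
a. ( 0 + 15 ) / 2
b. 2.0e-6 * 100000000.1
c. true && false || true && true
Answer: a. 7  b. 200.0000002  c. true

1.1.2 Give the type and value of each of the following expressions:
a. (1 + 2.236)/2
b. 1 + 2 + 3 + 4.0
c. 4.1 >= 4
d. 1 + 2 + "3"
Answer: a. double, 1.618  b. double, 10.0  c. boolean, true  d. String, "33"

1.1.3 Write a program that takes three integer command-line arguments. If they are all equal it prints equal; otherwise it prints not equal.
public class TestUqual
{
    public static void main(String[] args)
    {
        int a, b, c;
        a = b = c = 0;
        StdOut.println("Please enter three numbers");
        a = StdIn.readInt();
        b = StdIn.readInt();
        c = StdIn.readInt();
        if (equals(a, b, c) == 1)
        {
            StdOut.print("equal");
        }
        else
        {
            StdOut.print("not equal");
        }
    }

    // Returns 1 if all three values are equal, 0 otherwise.
    public static int equals(int a, int b, int c)
    {
        if (a == b && b == c)
        {
            return 1;
        }
        else
        {
            return 0;
        }
    }
}

1.1.4 What (if anything) is wrong with each of the following statements?
a. if (a > b) then c = 0;
b. if a > b { c = 0; }
c. if (a > b) c = 0;
d. if (a > b) c = 0 else b = 0;
Answer: a. then is not a Java keyword; write if (a > b) c = 0;  b. the condition needs parentheses; write if (a > b) { c = 0; }  c. nothing is wrong  d. a semicolon is missing after c = 0; write if (a > b) c = 0; else b = 0;

1.1.5 Write a code fragment that prints true if the double variables x and y are both strictly between 0 and 1, and false otherwise.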
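Introduction to Algorithms (Chinese Edition) Answers
24.2-3
24.2-4
24.3-1 See Figure 24-6. 24.3-2
24.3-3
24.3-4 24.3-5 24.3-6
24.3-7
24.3-8 In this case the distances of vertices that have already been updated are not disturbed. 24.4**** 24.5****
25.1-1 See Figure 25-1. 25.1-2 To guarantee the correctness of recurrence (25.2). 25.1-3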
8.3-3 8.3-4
8.3-5(*) 8.4-1 See Figure 8-4. 8.4-2
8.4-3 3/2,1/2 8.4-4(*) 8.4-5(*)
9.1-1
9.1-2 9.2-1 9.3-1
Chapter 9
9.3-2 9.3-3
9.3-4 9.3-5
9.3-6 9.3-7
9.3-8
9.3-9
15.1-1
6.4-4
6.4-5
6.5-1 See Figure 6-5. 6.5-2
6.5-3 6.5-4 6.5-5
6.5-6 6.5-7
6.5-8
7.1-1 See Figure 7-1. 7.1-2
7.1-3 7.1-4 7.2-1 7.2-2
7.2-3 7.2-4 7.2-5
Chapter 7
7.2-6 7.3-1
7.3-2
7.4-1 7.4-2
5.3-6
6.1-1 6.1-2 6.1-3 6.1-4 6.1-5 6.1-6
Chapter 6
6.1-7
6.2-1 See Figure 6-2. 6.2-2
6.2-3
6.2-4
6.2-5 Use a loop over the nodes of the subtree rooted at i instead of recursion. 6.2-6
6.3-1
See Figure 6-3. 6.3-2
6.3-3
6.4-1 See Figure 6-4. 6.4-2 HEAPSORT is still correct, because the MAX-HEAPIFY procedure still runs in every iteration of the loop. 6.4-3
Introduction to Algorithms, Third Edition, Chapter 27 Solutions (English)
Chapter 27
Michelle Bodnar, Andrew Lohr
April 12, 2016

Exercise 27.1-1
This modification is not going to affect the asymptotic values of the span, work, or parallelism. All it will do is add an amount of overhead that wasn't there before. This is because as soon as FIB(n − 2) is spawned, the spawning thread just sits there and waits; it does not accomplish any work while it is waiting. It will be done waiting at the same time as it would have been before because the FIB(n − 2) call will take less time, so it will still be limited by the amount of time that the FIB(n − 1) call takes.

Exercise 27.1-2
The computation dag is given in the image below. The blue numbers by each strand indicate the time step in which it is executed. The work is 29, span is 10, and parallelism is 2.9.

Exercise 27.1-3
Suppose that there are $x$ incomplete steps in a run of the program. Since each of these steps causes at least one unit of work to be done, we have that there is at most $T_1 - x$ units of work done in the complete steps. Then, we suppose by contradiction that the number of complete steps is strictly greater than $\lfloor (T_1 - x)/P \rfloor$. Then, we have that the total amount of work done during the complete steps is at least
$P \cdot (\lfloor (T_1 - x)/P \rfloor + 1) = P \lfloor (T_1 - x)/P \rfloor + P = (T_1 - x) - ((T_1 - x) \bmod P) + P > T_1 - x$.
This is a contradiction because there are only $T_1 - x$ units of work done during complete steps, which is less than the amount we would be doing. Notice that since $T_\infty$ is a bound on the total number of both kinds of steps, it is a bound on the number of incomplete steps, $x$, so
$T_P \le \lfloor (T_1 - x)/P \rfloor + x \le \lfloor (T_1 - T_\infty)/P \rfloor + T_\infty$,
where the second inequality comes by noting that the middle expression, as a function of $x$, is monotonically increasing, and so is bounded by the largest value of $x$ that is possible, namely $T_\infty$.

Exercise 27.1-4
The computation is given in the image below. Let vertex $u$ have degree $k$, and assume that there are $m$ vertices in each vertical chain. Assume that this is executed on $k$ processors. In one execution, each strand from among the $k$ on the left is executed concurrently, and then the $m$
strands on the right are executed one at a time. If each strand takes unit time to execute, then the total computation takes $2m$ time. On the other hand, suppose that on each time step of the computation, $k - 1$ strands from the left (descendants of $u$) are executed, and one from the right (a descendant of $v$) is executed. If each strand takes unit time to execute, the total computation takes $m + m/k$. Thus, the ratio of times is $2m/(m + m/k) = 2/(1 + 1/k)$. As $k$ gets large, this approaches 2 as desired.

Exercise 27.1-5
The information from $T_{10}$ applied to equation (27.5) gives us that
$42 \le \frac{T_1 - T_\infty}{10} + T_\infty$,
which tells us that
$420 \le T_1 + 9T_\infty$.
Subtracting these two equations, we have that $100 \le 8T_\infty$. If we apply the span law to $T_{64}$, we have that $10 \ge T_\infty$. Applying the work law to our measurement for $T_4$ gets us that $320 \ge T_1$. Now, looking at the result of applying (27.5) to the value of $T_{10}$, we get that
$420 \le T_1 + 9T_\infty \le 320 + 90 = 410$,
a contradiction. So, one of the three numbers for runtimes must be wrong. However, computers are complicated things, and it's difficult to pin down what can affect runtime in practice. It is a bit harsh to judge Professor Karan too poorly for something that may have been outside her control (maybe there was just a garbage collection happening during one of the measurements, throwing it off).

Exercise 27.1-6
We'll parallelize the for loop of lines 6-7 in a way which won't incur races. With the algorithm P-PROD given below, it will be easy to rewrite the code.
For notation, let $a_i$ denote the $i$th row of the matrix $A$.

Algorithm 1 P-PROD(a, x, j, j′)
    if j == j′ then
        return a[j] · x[j]
    end if
    mid = ⌊(j + j′)/2⌋
    a′ = spawn P-PROD(a, x, j, mid)
    x′ = P-PROD(a, x, mid + 1, j′)
    sync
    return a′ + x′

Algorithm 2 MAT-VEC(A, x)
    n = A.rows
    let y be a new vector of length n
    parallel for i = 1 to n do
        y_i = 0
    end
    parallel for i = 1 to n do
        y_i = P-PROD(a_i, x, 1, n)
    end
    return y

Exercise 27.1-7
The work is unchanged from the serial programming case. Since it is flipping $\Theta(n^2)$ many entries, it does $\Theta(n^2)$ work. The span of it is $\Theta(\lg n)$; this is because each of the parallel for loops can have its children spawned in time $\lg n$, so the total time to get all of the constant-work tasks spawned is $2 \lg n \in \Theta(\lg n)$. Since the work of each task is $o(\lg n)$, that doesn't affect the $T_\infty$ runtime. The parallelism is equal to the work over the span, so it is $\Theta(n^2/\lg n)$.

Exercise 27.1-8
The work is $\Theta\left(1 + \sum_{j=2}^{n} (j - 1)\right) = \Theta(n^2)$. The span is $\Theta(n)$ because in the worst case, when $j = n$, the for-loop of line 3 will need to execute $n$ times. The parallelism is $\Theta(n^2)/\Theta(n) = \Theta(n)$.

Exercise 27.1-9
We solve for $P$ in the following equation obtained by setting $T_P = T'_P$:
$\frac{T_1}{P} + T_\infty = \frac{T'_1}{P} + T'_\infty$
$\frac{2048}{P} + 1 = \frac{1024}{P} + 8$
$\frac{1024}{P} = 7$
$\frac{1024}{7} = P$
So we get that there should be approximately 146 processors for them to have the same runtime.

Exercise 27.2-1
See the computation dag in the image below. Assuming that each strand takes unit time, the work is 13, the span is 6, and the parallelism is 13/6.

Exercise 27.2-2
See the computation dag in the image below. Assuming each strand takes unit time, the work is 30, the span is 16, and the parallelism is 15/8.

Exercise 27.2-3
We perform a modification of the P-SQUARE-MATRIX-MULTIPLY algorithm. Basically, as hinted in the text, we will parallelize the innermost for loop in such a way that there aren't any data races formed. To do this, we will just define a parallelized dot product procedure. This means that lines 5-7 can be replaced by a single call to this procedure. P-DOT-PRODUCT computes the dot product of the two lists between the two bounds on indices. Using
this, we can modify P-SQUARE-MATRIX-MULTIPLY. The runtime of the inner loop is $O(\lg n)$, which is the depth of the recursion, and the parallel for loops also take $O(\lg n)$ time. Since the runtimes are additive here, the total span of this procedure is $\Theta(\lg n)$. The total work is still just $O(n^3)$, since all the spawning and recursing can be replaced with the normal serial version once there aren't enough free processors to handle all of the spawned calls to P-DOT-PRODUCT.

Algorithm 3 P-DOT-PROD(v, w, low, high)
    if low == high then
        return v[low] · w[low]
    end if
    mid = ⌊(low + high)/2⌋
    x = spawn P-DOT-PROD(v, w, low, mid)
    y = P-DOT-PROD(v, w, mid + 1, high)
    sync
    return x + y

Algorithm 4 MODIFIED-P-SQUARE-MATRIX-MULTIPLY
    n = A.rows
    let C be a new n × n matrix
    parallel for i = 1 to n do
        parallel for j = 1 to n do
            c_{i,j} = P-DOT-PROD(A_{i,·}, B_{·,j}, 1, n)
        end
    end
    return C

Exercise 27.2-4
Assume that the input is two matrices $A$ and $B$ to be multiplied. For this algorithm we use the function P-PROD defined in Exercise 27.1-6. For notation, we let $A_i$ denote the $i$th row of $A$ and $A^i$ denote the $i$th column of $A$. Here, $C$ is assumed to be a $p$ by $r$ matrix. The work of the algorithm is $\Theta(pqr)$, since this is the runtime of the serialization. The span is $\Theta(\log p + \log r + \log q) = \Theta(\log(pqr))$. Thus, the parallelism is $\Theta(pqr/\log(pqr))$, which remains highly parallel even if any of $p$, $q$, or $r$ are 1.

Algorithm 5 MATRIX-MULTIPLY(A, B, C, p, q, r)
    parallel for i = 1 to p do
        parallel for j = 1 to r do
            C_{ij} = P-PROD(A_i, B^j, 1, q)
        end
    end
    return C

Exercise 27.2-5
Split up the region into four sections. Then, this amounts to finding the transpose of the upper left and lower right of the two submatrices. In addition to that, you also need to swap the elements in the upper right with their transpose position in the lower left. Dealing with the upper right swapping only takes time $O(\lg(n^2)) = O(\lg n)$. In addition, there are two subproblems, each of half the size. This gets us the recursion:
$T_\infty(n) = T_\infty(n/2) + \lg n$
Summing the $\lg n$ term over the $\lg n$ levels of the recursion, we get that the total span of this procedure is $T_\infty(n) \in O(\lg^2 n)$. The
total work is still the usual $O(n^2)$.

Exercise 27.2-6
Since $D^{(k)}$ cannot be computed without $D^{(k-1)}$, we cannot parallelize the for loop of line 3 of Floyd-Warshall. However, the other two loops can be parallelized. The work is $\Theta(n^3)$, as in the serial case. The span is $\Theta(n \lg n)$. Thus, the parallelism is $\Theta(n^2/\lg n)$. The algorithm is as follows:

Algorithm 6 P-FLOYD-WARSHALL(W)
    n = W.rows
    D^(0) = W
    for k = 1 to n do
        let D^(k) = (d^(k)_ij) be a new n × n matrix
        parallel for i = 1 to n do
            parallel for j = 1 to n do
                d^(k)_ij = min(d^(k−1)_ij, d^(k−1)_ik + d^(k−1)_kj)
            end
        end
    end for
    return D^(n)

Exercise 27.3-1
To coarsen the base case of P-MERGE, just replace the condition on line 2 with a check that n < k for some base case size k. And instead of just copying over the particular element of A to the right spot in B, you would call a serial sort on the remaining segment of A and copy the result of that over into the right spots in B.

Exercise 27.3-2
By a slight modification of Exercise 9.3-8 we can find the median of all elements in two sorted arrays of total length $n$ in $O(\lg n)$ time. We'll modify P-MERGE to use this fact. Let MEDIAN(T, p1, r1, p2, r2) be the function which returns a pair, q, where q.pos is the position of the median of all the elements of T which lie between positions p1 and r1, and between positions p2 and r2, and q.arr is 1 if the position is between p1 and r1, and 2 otherwise. The first 8 lines of code are identical to those in P-MERGE given on page 800, so we omit them here.

Algorithm 7 P-MEDIAN-MERGE(T, p1, r1, p2, r2, A, p3)
    Run lines 1 through 8 of P-MERGE
    q = MEDIAN(T, p1, r1, p2, r2)
    if q.arr == 1 then
        q2 = BINARY-SEARCH(T[q.pos], T, p2, r2)
        q3 = p3 + q.pos − p1 + q2 − p2
        A[q3] = T[q.pos]
        spawn P-MEDIAN-MERGE(T, p1, q.pos − 1, p2, q2 − 1, A, p3)
        P-MEDIAN-MERGE(T, q.pos + 1, r1, q2 + 1, r2, A, p3)
        sync
    else
        q2 = BINARY-SEARCH(T[q.pos], T, p1, r1)
        q3 = p3 + q.pos − p2 + q2 − p1
        A[q3] = T[q.pos]
        spawn P-MEDIAN-MERGE(T, p1, q2 − 1, p2, q.pos − 1, A, p3)
        P-MEDIAN-MERGE(T, q2 + 1, r1, q.pos + 1, r2, A, p3)
        sync
    end if

The work is characterized by the recurrence $T_1(n) = O(\lg n) + 2T_1(n/2)$, whose
solution tells us that T1(n)=O(n).The work is at leastΩ(n)since we need to examine each element,so the work isΘ(n).The span satisfies the recur-rence T∞(n)=O(lg n)+O(lg n/2)+T∞(n/2)=O(lg n)+T∞(n/2)=Θ(lg2n), by exercise4.6-2.Exercise27.3-3Suppose that there are c different processors,and the array has length n and you are going to use its last element as a pivot.Then,look at each chunkof size nc of entries before the last element,give one to each processor.Then,each counts the number of elements that are less than the pivot.Then,we com-pute all the running sums of these values that are returned.This can be done easily by considering all of the subarrays placed along the leaves of a binary tree,and then summing up adjacent pairs.This computation can be done in time lg(min{c,n})since it’s the log of the number of leaves.From there,we can compute all the running sums for each of the subarrays also in logarithmic time. This is by keeping track of the sum of all more left cousins of each internal node, which is found by adding the left sibling’s sum vale to the left cousin value of the parent,with the root’s left cousin value initiated to0.This also just takes time the depth of the tree,so is lg(min{c,n}).Once all of these values are computed at the root,it is the index that the subarray’s elements less than the pivot should be put.Tofind the position where the subarray’s elements larger9than the root should be put,just put it at twice the sum value of the root minusthe left cousin value for that subarray.Then,the time taken is just O(nc ).Bydoing this procedure,the total work is just O(n),and the span is O(lg(n)),andso has parallelization of O(nlg(n)).This whole process is split across the severalalgoithms appearing here.Algorithm8PPartition(L)c=min{c,n}pivot=L[n]let Count be an array of length clet r1,...r c+1be roughly evenly spaced indices to L with r1=1and r c+1=n for i=1...c doCount[i]=spawn countnum(L[r i,r i+1−1],pivot)end forsynclet T be a nearly complete binary 
tree whose leaves are the elements of Count whose vertices have the attributes sum and lcfor all the leaves,let their sum value be the corresponding entry in Count ComputeSums(T.root)T.root.lc=0ComputeCousins(T.root)Let Target be an array of length n that the elements will be copied intofor i=1...c dolet cousin be the lc value of the node in T that corresponds to ispawn CopyElts(L,Target,cousin,r i,r i+1−1)end forTarget[n]=Target[T.root.sum]Target[T.root.sum]=L[n]return TargetAlgorithm9CountNum(L,x)ret=0for i=1...L.length doif L[i]<x thenret++end ifend forreturn retExercise27.3-4See the algorithm P-RECURSIVE-FFT.it parallelized over the two recursive calls,having a parallel for works because each of the iterations of the for loop10Algorithm10ComputeSums(v)if v is an internal node thenx=spawn ComputeSums(v.left)y=ComputeSums(v.right)syncv.sum=x+yend ifreturn v.sumAlgorithm11ComputeCousins(v)if v=NIL thenv.lc=v.p.lvif v=v.p.right thenv.lc+=c.p.left.sumend ifspawn ComputeCousins(v.left)ComputeCousins(v.right)syncend ifAlgorithm12CopyElts(L1,L2,lc,lb,ub) counter1=lc+1counter2=lbfor i=lb...ub doif L1[i]<x thenL2[counter1++]=L1[i]elseL2[counter2++]=L1[i]end ifend for11touch independent sets of variables.The span of the procedure is onlyΘ(lg(n)) giving it a parallelization ofΘ(n)Algorithm13P-RECURSIVE-FFT(a)n=a.lengthif n==1thenreturn aend ifωn=e2πi/nω=1a[0]=(a0,a2,...,a n−2)a[1]=(a1,a3,...,a n−1)y[0]=spawn P-RECURSIVE-FFT(a[0])y[1]=P-RECURSIVE-FFT(a[1])syncparallel for k=0,...,n/2−1doy k=y[0]k +ωy[1]ky k+(n/2)=y[0]k −ωy[1]kω=ωωnendreturn yExercise27.3-5Randomly pick a pivot element,swap it with the last element,so that it is in the correct format for running the procedure described in27.3-3.Run partition from problem27.3−3.As an intermediate step,in that procedure,we compute the number of elements less than the pivot(T.root.sum),so keep track of that value after the end of PPartition.Then,if we have that it is less than k, recurse on the subarray that was greater than or 
equal to the pivot,decreasing the order statistic of the element to be selected by T.root.sum.If it is larger than the order statistic of the element to be selected,then leave it unchanged and recurse on the subarray that was formed to be less than the pivot.A lot of the analysis in section9.2still applies,except replacing the timer needed for partitioning with the runtime of the algorithm in problem27.3-3.The work is unchanged from the serial case because when c=1,the algorithm reduces to the serial algorithm for partitioning.For span,the O(n)term in the equation half way down page218can be replaced with an O(lg(n))term.It can be seen with the substitution method that the solution to this is logarithmicE[T(n)]≤2nn−1k= n/2C lg(k)+O(lg(n))≤O(lg(n))So,the total span of this algorithm will still just be O(lg(n)).12Exercise27.3-6Let MEDIAN(A)denote a brute force method which returns the median element of the array A.We will only use this tofind the median of small arrays, in particular,those of size at most5,so it will always run in constant time.We also let A[i..j]denote the array whose elements are A[i],A[i+1],...,A[j].The function P-PARTITION(A,x)is a multithreaded function which partitions A around the input element x and returns the number of elements in A which are less than or equal to ing a parallel for-loop,its span is logarithmic in the number of elements in A.The work is the same as the serialization,which isΘ(n) according to section9.3.The span satisfies the recurrence T∞(n)=Θ(lg n/5)+ T∞(n/5)+Θ(lg n)+T∞(7n/10+6)≤Θ(lg n)+T∞(n/5)+T∞(7n/10+6).Using the substitution method we can show that T∞(n)=O(nε)for someε<1.In particular,ε=.9works.This gives a parallelization ofΩ(n.1).Algorithm14P-SELECT(A,i)1:if n==1then2:return A[1]3:end if4:Initialize a new array T of length n/55:parallel for i=0to n/5 −1do6:T[i+1]=MEDIAN(A[i n/5 ..i n/5 +4])7:end8:if n/5is not an integer then9:T[ n/5 ]=MEDIAN(A[5 n/5 ..n)10:end if11:x=P-SELECT(T, n/5 )12:k=P-PARTITION(A,x)13:if k==i 
        then return x
    else if i < k then
        return P-SELECT(A[1..k−1], i)
    else
        return P-SELECT(A[k+1..n], i−k)
    end if

Problem 27-1

a. See the algorithm Sum-Arrays(A, B, C). When subarrays are passed as views rather than copies, the work satisfies T1(n) = 2T1(n/2) + Θ(1), so the work is Θ(n). The span is Θ(lg n), so the parallelism is Θ(n/lg n).

Algorithm 15 Sum-Arrays(A, B, C)
    n = ⌊A.length/2⌋
    if n == 0 then
        C[1] = A[1] + B[1]
    else
        spawn Sum-Arrays(A[1..n], B[1..n], C[1..n])
        Sum-Arrays(A[n+1..A.length], B[n+1..A.length], C[n+1..A.length])
        sync
    end if

b. If the grain size is 1, each call of Add-Subarray just sums a single pair of numbers. Since the for loop on line 4 will then run n times, both the span and the work will be Θ(n). So, the parallelism is just Θ(1).

c. Let g be the grain size. The running time of the function that spawns all the other functions is n/g. The running time of any particular spawned task is g. So, we want to minimize n/g + g. To do this we pull out our freshman calculus hat and take a derivative with respect to g, which gives

$$0 = 1 - \frac{n}{g^2}.$$

To solve this, we set g = √n. This minimizes the quantity and makes the span O(n/g + g) = O(√n), resulting in a parallelism of O(√n).

Problem 27-2

a. Our algorithm P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C, A, B) multiplies A and B, and adds their product to the matrix C. It is assumed that C contains all zeros when the function is first called.

Algorithm 16 P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C, A, B)
    n = A.rows
    if n == 1 then
        c11 = c11 + a11 b11
    else
        Partition A, B, and C into n/2 × n/2 submatrices
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C11, A11, B11)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C12, A11, B12)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C21, A21, B11)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C22, A21, B12)
        sync
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C11, A12, B21)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C12, A12, B22)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C21, A22, B21)
        spawn P-MATRIX-MULTIPLY-RECURSIVE-SPACE(C22, A22, B22)
        sync
    end if

b. The work is the same as the serialization, which is Θ(n³). It can also be found by solving the recurrence T1(n) = Θ(n²) + 8T1(n/2) where T1(1) = 1. By the master theorem, T1(n) = Θ(n³). The span is T∞(n) = Θ(1) + T∞(n/2) + T∞(n/2) with T∞(1) = Θ(1). By the master theorem, T∞(n) = Θ(n).

c. The parallelism is Θ(n²). Ignoring the constants in the Θ-notation, the parallelism of the algorithm on 1000 × 1000 matrices is 1,000,000. Using P-MATRIX-MULTIPLY-RECURSIVE, the parallelism is 10,000,000, which is only about 10 times larger.

Problem 27-3

a. For the algorithm LU-DECOMPOSITION(A) on page 821, the inner for loops can be parallelized, since they never update values that are read on later runs of those loops. However, the outermost for loop cannot be parallelized, because each iteration uses the changes made to the matrices by the previous iterations. This means that the span will be Θ(n lg n), the work will still be Θ(n³), and so the parallelism will be Θ(n³/(n lg n)) = Θ(n²/lg n).

b. The for loop on lines 7-10 takes the maximum of a set of values while recording the index at which that maximum occurs. This for loop can therefore be replaced with a parallel procedure of span lg n, in which we arrange the n elements into the leaves of an almost balanced binary tree and let each internal node be the maximum of its two children. The span is then just the depth of this tree. This procedure scales gracefully with the number of processors, approaching a linear span when few are available; even a cost of Θ(n lg n) here would be less than the Θ(n²) work done later. The for loop on lines 14-15 and the implicit for loop on line 15 have no concurrent editing, and so can be made parallel with a span of lg n. While the for loop on lines 18-19 can be made parallel, the one containing it cannot without creating data races. Therefore, the total span of the naive parallelized algorithm will be Θ(n² lg n), with a work of Θ(n³). So, the parallelism will be Θ(n/lg n). This is not as parallel as part (a), but it is still a significant improvement.

c. We can parallelize the computing of the sums on lines 4 and 6, but cannot also parallelize the for loops containing them without concurrently modifying data that we are reading. This
means that the span will be Θ(n lg n), the work will still be Θ(n²), and so the parallelism will be Θ(n/lg n).

d. The recurrence governing the amount of work of implementing this procedure is given by

I(n) ≤ 2I(n/2) + 4M(n/2) + O(n²).

However, the two inversions that we need to do are independent, and the span of parallelized matrix multiply is just O(lg n). Also, the n² work of having to take a transpose and subtract and add matrices has a span of only O(lg n). Therefore, the span satisfies the recurrence

I∞(n) ≤ I∞(n/2) + O(lg n).

This recurrence has the solution I∞(n) ∈ Θ(lg² n) by Exercise 4.6-2. Therefore, the span of the inversion algorithm obtained from the procedure detailed on page 830 is Θ(lg² n). This makes its parallelism equal to Θ(M(n)/lg² n), where M(n) is the time to compute matrix products.

Problem 27-4

a. The algorithm below has Θ(n) work because its serialization satisfies the recurrence T1(n) = 2T1(n/2) + Θ(1) with T1(1) = Θ(1). It has span T∞(n) = Θ(lg n) because it satisfies the recurrence T∞(n) = T∞(n/2) + Θ(1) with T∞(1) = Θ(1).

Algorithm 17 P-REDUCE(x, i, j)
    if i == j then
        return x[i]
    else
        mid = ⌊(i+j)/2⌋
        lhs = spawn P-REDUCE(x, i, mid)
        rhs = P-REDUCE(x, mid+1, j)
        sync
        return lhs ⊗ rhs
    end if

b. The work of P-SCAN-1 is T1(n) = Θ(n²). The span is T∞(n) = Θ(n). The parallelism is Θ(n).

c. We'll prove correctness by induction on the number of recursive calls made to P-SCAN-2-AUX. If a single call is made then n = 1, and the algorithm sets y[1] = x[1], which is correct. Now suppose we have an array which requires m + 1 recursive calls. The elements in the first half of the array are accurately computed since they require one fewer recursive call. For the second half of the array,

y[i] = x[1] ⊗ x[2] ⊗ ... ⊗ x[i] = (x[1] ⊗ ... ⊗ x[k]) ⊗ (x[k+1] ⊗ ... ⊗ x[i]) = y[k] ⊗ (x[k+1] ⊗ ... ⊗ x[i]).

Since we have correctly computed the parenthesized term with P-SCAN-2-AUX, line 8 ensures that we have correctly computed y[i].

The work is T1(n) = Θ(n lg n) by the master theorem. The span is T∞(n) = Θ(lg² n) by Exercise 4.6-2. The parallelism is Θ(n/lg n).

d. Line 8 of P-SCAN-UP should be filled in by
t[k] ⊗ right. Lines 5 and 6 of P-SCAN-DOWN should be filled in by v and v ⊗ t[k] respectively.

Now we prove correctness. First I claim that if line 5 is accessed after l recursive calls, then

t[k] = x[k−⌊n/2^l⌋+1] ⊗ ... ⊗ x[k−1] ⊗ x[k]

and

right = x[k+1] ⊗ x[k+2] ⊗ ... ⊗ x[k+⌊n/2^l⌋].

If n = 2 we make a single call, but no recursive calls, so we start our base case at n = 3. In this case, we set t[2] = x[2], and 2 − ⌊3/2⌋ + 1 = 2. We also have right = x[3] = x[2+1], so the claim holds. In general, on the l-th recursive call we set t[k] = P-SCAN-UP(x, t, i, k), which is t[⌊(i+k)/2⌋] ⊗ right. By our induction hypothesis,

t[k] = x[(i+k)/2 − ⌊n/2^(l+1)⌋ + 1] ⊗ ... ⊗ x[(i+k)/2] ⊗ x[(i+k)/2 + 1] ⊗ ... ⊗ x[(i+k)/2 + ⌊n/2^(l+1)⌋].

This is equivalent to our claim since (k−i)/2 = ⌊n/2^(l+1)⌋. A similar proof shows the result for right.

With this in hand, we can verify that the value v passed to P-SCAN-DOWN(v, x, t, y, i, j) satisfies v = x[1] ⊗ x[2] ⊗ ... ⊗ x[i−1]. For the base case, if a single recursive call is made then i = j = 2, and we have v = x[1]. In general, for the call on line 5 there is nothing to prove because i doesn't change. For the call on line 6, we replace v by v ⊗ t[k]. By our induction hypothesis, v = x[1] ⊗ ... ⊗ x[i−1]. By the previous paragraph, if we are on the l-th recursive call, t[k] = x[i] ⊗ ... ⊗ x[k], since the first index k − ⌊n/2^l⌋ + 1 equals i on that call. Hence v ⊗ t[k] = x[1] ⊗ ... ⊗ x[k], which is the prefix required by the call on the range k+1..j. Thus, the claim holds. Since we set y[i] = v ⊗ x[i], the algorithm yields the correct result.

e. The work of P-SCAN-UP satisfies T1(n) = 2T1(n/2) + Θ(1) = Θ(n). The work of P-SCAN-DOWN is the same. Thus, the work of P-SCAN-3 satisfies T1(n) = Θ(n). The span of P-SCAN-UP is T∞(n) = T∞(n/2) + O(1) = Θ(lg n), and similarly for P-SCAN-DOWN. Thus, the span of P-SCAN-3 is T∞(n) = Θ(lg n). The parallelism is Θ(n/lg n).

Problem 27-5

a. Note that in this algorithm, the first call will be SIMPLE-STENCIL(A, A), and when a matrix is indexed by ranges, what is returned is a view of the original matrix, not a copy. That is, changes made to the view will show up in the original.

Algorithm 18 SIMPLE-STENCIL(A, A2)
    let n1 × n2 be the size of A2
    let m_i = ⌊n_i/2⌋ for i = 1, 2
    if m1 == 0 then
        if m2 == 0 then
            compute the value for the only position in A2 based on the current values in A
        else
            SIMPLE-STENCIL(A, A2[1, 1..m2])
            SIMPLE-STENCIL(A, A2[1, m2+1..n2])
        end if
    else
        if m2 == 0 then
            SIMPLE-STENCIL(A, A2[1..m1, 1])
            SIMPLE-STENCIL(A, A2[m1+1..n1, 1])
        else
            SIMPLE-STENCIL(A, A2[1..m1, 1..m2])
            spawn SIMPLE-STENCIL(A, A2[m1+1..n1, 1..m2])
            SIMPLE-STENCIL(A, A2[1..m1, m2+1..n2])
            sync
            SIMPLE-STENCIL(A, A2[m1+1..n1, m2+1..n2])
        end if
    end if

We can set up a recurrence for the work, which is just

W(n) = 4W(n/2) + Θ(1),

which by the master theorem has the solution Θ(n²). For the span, the two middle subproblems run at the same time, so

S(n) = 3S(n/2) + Θ(1),

which has the solution Θ(n^(lg 3)), also by the master theorem.

b. Just use the implementation from part (c) with b = 3. The work has the same solution of Θ(n²) because it has the recurrence

W(n) = 9W(n/3) + Θ(1).

The span has the recurrence

S(n) = 5S(n/3) + Θ(1),

which has the solution Θ(n^(log_3 5)).

c. Block (i, j) of the b × b grid of subarrays lies on anti-diagonal k = i + j. Blocks on the same anti-diagonal do not depend on one another, so they are spawned together, and the 2b − 1 anti-diagonals are processed in order.

Algorithm 19 GEN-SIMPLE-STENCIL(A, A2, b)
    let n × m be the size of A2
    if n ≠ 0 and m ≠ 0 then
        if n == 1 and m == 1 then
            compute the value at the only position in A2
        else
            let n_i = ⌊in/b⌋ and m_i = ⌊im/b⌋ for i = 0, ..., b
            for k = 2, ..., 2b do
                lo = max(1, k − b); hi = min(b, k − 1)
                for i = lo, ..., hi − 1 do
                    spawn GEN-SIMPLE-STENCIL(A, A2[n_(i−1)+1..n_i, m_(k−i−1)+1..m_(k−i)], b)
                end for
                GEN-SIMPLE-STENCIL(A, A2[n_(hi−1)+1..n_hi, m_(k−hi−1)+1..m_(k−hi)], b)
                sync
            end for
        end if
    end if

The recurrences we get are

W(n) = b² W(n/b) + Θ(1)
S(n) = (2b − 1) S(n/b) + Θ(1).

So, the work is Θ(n²), and the span is Θ(n^(log_b(2b−1))). This means that the parallelism is Θ(n^(2 − log_b(2b−1))). So, to show the desired claim, we only need to show that 2 − log_b(2b − 1) < 1:

$$\begin{aligned}
2 - \log_b(2b-1) &< 1 \\
\log_b(b^2) - \log_b(2b-1) &< 1 \\
\log_b \frac{b^2}{2b-1} &< 1 \\
\frac{b^2}{2b-1} &< b \\
b^2 &< 2b^2 - b \\
0 &< b^2 - b \\
0 &< b(b-1)
\end{aligned}$$

This is clearly true because b is an integer with b ≥ 2, and the right-hand side has zeros only at 0 and 1 and is positive for larger b.

d. The entries whose indices sum to k form one anti-diagonal, and they depend only on entries of earlier anti-diagonals, so each anti-diagonal can be handled by a parallel loop.

Algorithm 20 BETTER-STENCIL(A)
    for k = 2, ..., 2n do
        parallel for i = max(1, k − n), ..., min(n, k − 1) do
            compute and update the entry at A[i, k − i]
        end for
    end for

There are 2n − 1 anti-diagonals, processed one after another, and the parallel loop over a diagonal of length at most n has span Θ(lg n). So the span of this procedure is O(n) with a factor of lg n thrown in, that is, Θ(n lg n). The work is Θ(n²), so the parallelism is Θ(n²/(n lg n)) = Θ(n/lg n).
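The diagonal-by-diagonal schedule of BETTER-STENCIL in Problem 27-5(d) can be sketched in Python. This is a minimal serial sketch under stated assumptions, not the solution's own code: the names `stencil_by_diagonals`, `stencil_row_major`, and `add3`, the boundary row and column of ones, and the sample combining function are all hypothetical choices made for illustration. The inner loop stands in for the parallel loop; it is safe to parallelize because each cell reads only cells from the two preceding anti-diagonals.

```python
from typing import Callable, List

Combine = Callable[[int, int, int], int]


def stencil_by_diagonals(n: int, combine: Combine) -> List[List[int]]:
    """Fill cells (1..n, 1..n) one anti-diagonal at a time.

    Cell (i, j) depends on its up, left, and up-left neighbours.
    Row 0 and column 0 are an assumed boundary filled with ones.
    """
    grid = [[1] * (n + 1) for _ in range(n + 1)]
    for k in range(2, 2 * n + 1):  # k = i + j identifies one anti-diagonal
        lo, hi = max(1, k - n), min(n, k - 1)
        # These iterations are independent of each other, so in a
        # multithreaded setting this loop could be a parallel for.
        for i in range(lo, hi + 1):
            j = k - i
            grid[i][j] = combine(grid[i - 1][j], grid[i][j - 1], grid[i - 1][j - 1])
    return grid


def stencil_row_major(n: int, combine: Combine) -> List[List[int]]:
    """Reference version: the usual serial row-by-row order."""
    grid = [[1] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            grid[i][j] = combine(grid[i - 1][j], grid[i][j - 1], grid[i - 1][j - 1])
    return grid


def add3(up: int, left: int, diag: int) -> int:
    """Sample stencil: sum of the three neighbours."""
    return up + left + diag
```

Both orders respect the same dependencies, so `stencil_by_diagonals(n, add3)` and `stencil_row_major(n, add3)` produce the same grid; only the diagonal order exposes the independent cells that a parallel loop can exploit.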
Introduction to Algorithms - Problem Set 9
Introduction to Algorithms, Day 32
Massachusetts Institute of Technology, 6.046J/18.410J
Singapore-MIT Alliance, SMA5503
Professors Erik Demaine, Lee Wee Sun, and Charles E. Leiserson
Handout 29

Exercise 9-1. Do exercise 22.2-7 on page 539 of CLRS.
Exercise 9-2. Do exercise 22.3-12 on page 549 of CLRS.
Exercise 9-3. Do exercise 22.4-3 on page 552 of CLRS.
Exercise 9-4. Do exercise 24.1-4 on page 591 of CLRS.
Exercise 9-5. Do exercise 24.3-6 on page 600 of CLRS.
Exercise 9-6. Do exercise 24.5-7 on page 614 of CLRS.

Let […], where […] ranges over all directed cycles in […]. A cycle […] for which […] is called a minimum mean-weight cycle. This problem investigates an efficient algorithm for computing […].

Assume without loss of generality that every vertex […] is reachable from a source vertex […]. Let […] be the weight of a shortest path from […] to […], and let […] be the weight of a shortest path from […] to […] consisting of exactly […] edges. If there is no path from […] to […] with exactly […] edges, then […].

(a) Show that if […], then […] contains no negative-weight cycles and […] for all vertices […].

(b) Show that if […], then […]

(c) Let […] be a […]-weight cycle, and let […] and […] be any two vertices on […]. Suppose that the weight of the path from […] to […] along the cycle is […]. Prove that […]. (Hint: The weight of the path from […] to […] along the cycle is […].)

(d) Show that if […], then there exists a vertex […] on the minimum mean-weight cycle such that […]

(f) Show that if we add a constant […] to the weight of each edge of […], then […] is increased […]. Use this to show that