数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第23章
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第22章
C H A P T E R22Object-Based DatabasesPractice Exercises22.1A car-rental company maintains a database for all vehicles in its cur-rentfleet.For all vehicles,it includes the vehicle identification number,license number,manufacturer,model,date of purchase,and color.Spe-cial data are included for certain types of vehicles:•Trucks:cargo capacity.•Sports cars:horsepower,renter age requirement.•Vans:number of passengers.•Off-road vehicles:ground clearance,drivetrain(four-or two-wheel drive).Construct an SQL schema definition for this e inheritancewhere appropriate.Answer:For this problem,we use table inheritance.We assume thatMyDate,Color and DriveTrainType are pre-defined types.create type Vehicle(vehicle id integer,license number char(15),manufacturer char(30),model char(30),purchase date MyDate,color Color)create table vehicle of type Vehiclecreate table truck(cargo capacity integer)under vehiclecreate table sportsCar12Chapter22Object-Based Databases(horsepower integerrenter age requirement integer)under vehiclecreate table van(num passengers integer)under vehiclecreate table offRoadVehicle(ground clearance realdriveTrain DriveTrainType)under vehicle22.2Consider a database schema with a relation Emp whose attributes areas shown below,with types specified for multivalued attributes.Emp=(ename,ChildrenSet multiset(Children),SkillSet multiset(Skills))Children=(name,birthday)Skills=(type,ExamSet setof(Exams))Exams=(year,city)a.Define the above schema in SQL,with appropriate types for eachattribute.ing the above schema,write the following queries in SQL.i.Find the names of all employees who have a child born on orafter January1,2000.ii.Find those employees who took an examination for the skilltype“typing”in the city“Dayton”.iii.List all skill types in the relation Emp.Answer:a.No Answer.b.Queries in SQL.i.Program:select enamefrom emp as e,e.ChildrenSet as cwhere’March’in(select birthday.monthfrom c)ii.Program:Practice Exercises3select e.enamefrom emp as e,e.SkillSet as s,s.ExamSet as xwhere s.type=’typing’and x.city=’Dayton’iii.Program:select distinct s.typefrom emp as e,e.SkillSet as s22.3Consider the E-R diagram in Figure22.5,which contains composite,multivalued,and derived attributes.a.Give an SQL schema definition corresponding to the E-R diagram.b.Give constructors for each of the structured types defined above.Answer:a.The corresponding SQL:1999schema definition is given below.Note that the derived attribute age has been translated into amethod.create type Name(first name varchar(15),middle initial char,last name varchar(15))create type Street(street name varchar(15),street number varchar(4),apartment number varchar(7))create type Address(street Street,city varchar(15),state varchar(15),zip code char(6))create table customer(name Name,customer id varchar(10),address Adress,phones char(7)array[10],dob date)method integer age()b.create function Name(f varchar(15),m char,l varchar(15))returns Namebeginsetfirst name=f;set middle initial=m;set last name=l;endcreate function Street(sname varchar(15),sno varchar(4),ano varchar(7))4Chapter22Object-Based Databasesreturns Streetbeginset street name=sname;set street number=sno;set apartment number=ano;endcreate function Address(s Street,c varchar(15),sta varchar(15),zip varchar(6))returns Addressbeginset street=s;set city=c;set state=sta;set zip code=zip;end22.4Consider the relational schema shown in Figure22.6.a.Give a schema definition in SQL corresponding to the relationalschema,but using references to express foreign-key relationships.b.Write each of the queries given in Exercise6.13on the aboveschema,using SQL.Answer:a.The schema definition is given below.Note that backward ref-erences can be addedbut they are not so important as in OODBSbecause queries can be written in SQL and joins can take care ofintegrity constraints.create type Employee(person name varchar(30),street varchar(15),city varchar(15))create type Company(company name varchar(15),(city varchar(15))create table employee of Employeecreate table company of Companycreate type Works(person ref(Employee)scope employee,comp ref(Company)scope company,salary int)create table works of Workscreate type Manages(person ref(Employee)scope employee,(manager ref(Employee)scope employee)create table manages of Managesb.i.select comp−>namePractice Exercises5from worksgroup by comphaving count(person)≥all(select count(person)from worksgroup by comp)ii.select comp−>namefrom worksgroup by comphaving sum(salary)≤all(select sum(salary)from worksgroup by comp)iii.select comp−>namefrom worksgroup by comphaving avg(salary)>(select avg(salary)from workswhere comp−>company name="First Bank Corporation")22.5Suppose that you have been hired as a consultant to choose a databasesystem for your client’s application.For each of the following appli-cations,state what type of database system(relational,persistent pro-gramming language–based OODB,object relational;do not specify acommercial product)you would recommend.Justify your recommen-dation.a.A computer-aided design system for a manufacturer of airplanes.b.A system to track contributions made to candidates for publicoffice.c.An information system to support the making of movies.Answer:a.A computer-aided design system for a manufacturer of airplanes:An OODB system would be suitable for this.That is because CADrequires complex data types,and being computation oriented,CAD tools are typically used in a programming language envi-ronment needing to access the database.b.A system to track contributions made to candidates for publicoffice:A relational system would be apt for this,as data types are ex-pected to be simple,and a powerful querying mechanism is es-sential.c.An information system to support the making of movies:Here there will be extensive use of multimedia and other complexdata types.But queries are probably simple,and thus an objectrelational system is suitable.6Chapter22Object-Based Databases22.6How does the concept of an object in the object-oriented model differfrom the concept of an entity in the entity-relationship model?Answer:An entity is simply a collection of variables or data items.An object is an encapsulation of data as well as the methods(code)tooperate on the data.The data members of an object are directly visibleonly to its methods.The outside world can gain access to the object’sdata only by passing pre-defined messages to it,and these messagesare implemented by the methods.。
数据库系统概念英文精编版第六版教学设计
数据库系统概念英文精编版第六版教学设计介绍在本教学设计中,我们将介绍《Database System Concepts》(数据库系统概念)英文精编版第六版,旨在帮助学生掌握数据库系统的关键概念、基本结构以及数据库的设计和实现。
此教学设计将涵盖以下主题:•数据库基础知识•数据模型和ER图•SQL语言•数据库设计•事务管理和并发控制•数据库安全本教学设计旨在为初学者提供一个实践性的课程,旨在帮助学生掌握现代数据库系统的基础知识,并增强他们在实际生活和职业中利用数据库系统进行数据管理的能力。
教学内容数据库基础知识本节将首先介绍数据库系统的一些基础概念,如数据、数据库、数据库管理系统和关系数据库等。
学习者将学会如何使用SQL语句来完成基本的数据检索和修改。
数据模型和ER图本章将介绍抽象概念对于数据库设计的重要性。
学生们将学会如何使用实体关系(ER)图来表达数据库中各个实体之间的关系,从而方便设计和管理数据库。
SQL语言SQL是处理和查询关系数据的标准语言。
在本节中,我们将介绍SQL语言的不同方面,如数据定义、数据操作和嵌套查询等。
数据库设计数据库设计是一项关键性的工作,我们需要通过设计来确保数据库可以支持我们所需要的数据存储和管理。
本节将介绍关于设计和开发数据库的一些基本技术,如关系模式、规范化和冗余等。
事务管理和并发控制并发控制是数据库系统设计中最具挑战性的问题之一。
在本节中,我们将介绍什么是事务,事务管理和并发控制如何确保数据库在多个用户同时访问时仍然保持正确性。
数据库安全数据安全是建立在数据库的完整性和可用性之上的。
在本节中,我们将介绍一些关于数据库安全的基本概念,如授权和认证、数据加密等。
教学方法本教学设计采用电子教学的方式,其中,我们将提供录制的视频课程、示例问题、以及针对每个主题的在线测验。
本课程将使用不同的教学方法:•通过开放式问题和讨论区域,用以解释不同的概念。
•提供示例代码来增强实践性学习。
数据库系统概念(database system concepts)英文第六版 PPT 第四章
Database System Concepts - 6th Edition
4.12
©Silberschatz, Korth and Sudarshan
View Definition
A view is defined using the create view statement which has
logical model (that is, all the actual relations stored in the database.)
Consider a person who needs to know an instructors name
and department, but not the salary. This person should see a relation described, in SQL, by select ID, name, dept_name from instructor
©Silberschatz, Korth and Sudarshan
Joined Relations
Join operations take two relations and return as a result
another relation.
A join operation is a Cartesian product which requires that
Join type – defines how tuples in each relation that do not
match any tuple in the other relation (based on the join condition) are treated.
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第7章
7.2 Answer: Note: the name of the relationship "course offering" needs to be changed to "section".
a. The E-R diagram is shown in Figure 7.2. Note that an alterantive is to model examinations as weak entities related to a section, rather than as a strong entity. The marks relationship would then be a binary relationship between student and exam, without directly involving section.
7.7 Answer: The primary key of a weak entity set can be inferred from its relationship with the strong entity set. If we add primary key attributes to the weak entity set, they will be present in both the entity set and the relationship set and they have to be the same. Hence there will be redundancy.
b. As indicated in the answer to the previous part, a path in the graph between a pair of entity sets indicates a (possibly indirect) relationship between the two entity sets. If there is a cycle in the graph then every pair of entity sets on the cycle are related to each other in at least two distinct ways. If the E-R diagram is acyclic then there is a unique path between every pair of entity sets and, thus, a unique relationship between every pair of entity sets.
数据库系统概念(database system concepts)英文第六版 PPT 第11章
11.8
©Silberschatz, Korth and Sudarshan
Sparse Index Files (Cont.)
Compared to dense indices:
Less space and less maintenance overhead for insertions and deletions. Generally slower than dense index for locating records.
Database System Concepts - 6th Edition
11.3
©Silberschatz, Korth and Sudarshan
Index Evaluation Metrics
Access types supported efficiently. E.g.,
records with a specified value in the attribute or records with an attribute value falling in a specified range of values.
actual records with that particular search-key value.
Secondary indices have to be dense
Database System Concepts - 6th Edition 11.10 ©Silberschatz, Korth and Sudarshan
value in the file.
E.g. index on ID attribute of instructor relation
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第11章
2Chapter11Indexing and Hashingb.c.11.4Answer:•With structure11.3.a:Insert9:10:InsertExercises3Delete23:Delete19:•With structure11.3.b:Insert9:Insert4Chapter 11Indexing and HashingDelete23:Delete 19:•With structure 11.3.c:Insert9:Insert10:Insert8:Delete 23:Delete 19:Exercises511.5Answer:If there are K search-key values and m −1siblings are involvedin the redistribution,the expected height of the tree is:log ⌊(m −1)n /m ⌋(K )11.6Answer:Extendable hash structure000 001010 011 100101 110 11111.7Answer:a.Delete 11:From the answer to Exercise 11.6,change the third bucketto:At this stage,it is possible to coalesce the second and third buckets.Then it is enough if the bucket address table has just four entriesinstead of eight.For the purpose of this answer,we do not do the coalescing.b.Delete 31:From the answer to 11.6,change the last bucket to:6Chapter11Indexing and Hashingc.Insert1:From the answer to11.6,change thefirst bucket to:d.Insert15:From the answer to11.6,change the last bucket to:11.8Answer:The pseudocode is shown in Figure11.1.11.9Answer:Let i denote the number of bits of the hash value used in thehash table.Let bsize denote the maximum capacity of each bucket.Thepseudocode is shown in Figure11.2.Note that we can only merge two buckets at a time.The common hashprefix of the resultant bucket will have length one less than the two bucketsmerged.Hence we look at the buddy bucket of bucket j differing from itonly at the last bit.If the common hash prefix of this bucket is not i j,thenthis implies that the buddy bucket has been further split and merge is notpossible.When merge is successful,further merging may be possible,which is handled by a recursive call to coalesce at the end of the function.11.10Answer:If the hash table is currently using i bits of the hash value,thenmaintain a count of buckets for which the length of common hash prefixis exactly i.Consider a bucket j with length of common hash prefix i j.If the bucketis being split,and i j is equal to i,then reset the count to1.If the bucketis being split and i j is one less that i,then increase the count by1.It thebucket if being coalesced,and i j is equal to i then decrease the count by1.If the count becomes0,then the bucket address table can be reduced insize at that point.However,note that if the bucket address table is not reduced at that point,then the count has no significance afterwards.If we want to postpone thereduction,we have to keep an array of counts,i.e.a count for each value ofExercises7 functionfindIterator(value V){/*Returns an iterator for the search on the value V*/Iterator iter();Set iter.v alue=V;Set C=root nodewhile(C is not a leaf node)beginLet i=samllest number such that V<=C.K iif there is no such number i then beginLet P m=last non-null pointer in the nodeSet C=C.P m;endelse Set C=C.P i;end/*C is a leaf node*/Let i be the least value such that K i=Vif there is such a value i then beginSet iter.index=i;Set iter.page=C;Set iter.acti v e=T RUE;endelse if(V is the greater than the largest value in the leaf)then beginif(C.P n.K1=V)then beginSet iter.page=C.P n;Set iter.index=1;Set iter.acti v e=T RUE;endelse Set iter.acti v e=F AL SE;endelse Set iter.acti v e=F AL SE;return(iter)}Class Iterator{variables:value V/*The value on which the index is searched*/boolean active/*Stores the current state of the iterator(TRUE or FALSE)*/int index/*Index of the next matching entry(if active is TRUE)*/PageID page/*Page Number of the next matching entry(if active is TRUE)*/ function next(){if(active)then beginSet ret Page=page;Set retI ndex=index;if(index+1=page.size)then beginpage=page.P nindex=0endelse index=index+1;if(page.K index=V)then acti v e=F AL SE;return(ret Page,retI ndex)endelse return null;}}Figure11.1Pseudocode forfindIterator and the Iterator class8Chapter11Indexing and Hashingdelete(value K l)beginj=first i high-order bits of h(K l);delete value K l from bucket j;coalesce(bucket j);endcoalesce(bucket j)begini j=bits used in bucket j;k=any bucket withfirst(i j−1)bits same as thatof bucket j while the bit i j is reversed;i k=bits used in bucket k;if(i j=i k)return;/*buckets cannot be merged*/if(entries in j+entries in k>bsize)return;/*buckets cannot be merged*/move entries of bucket k into bucket j;decrease the value of i j by1;make all the bucket-address-table entries,which pointed to bucket k,point to j;coalesce(bucket j);endFigure11.2Pseudocode for deletioncommon hash prefix.The array has to be updated in a similar fashion.Thebucket address table can be reduced if the i th entry of the array is0,wherei is the number of bits the table is using.Since bucket table reduction isan expensive operation,it is not always advisable to reduce the table.Itshould be reduced only when sufficient number of entries at the end ofcount array become0.11.11Answer:We reproduce the instructor relation below.Exercises9 ID dept salary10101Comp.Sci.Wu9000015151MusicEinstein9500032343HistoryGold8700045565Comp.Sci.Califieri6200076543FinanceCrick7200083821Comp.Sci.Kim80000a.Bitmap for salary,with S1,S2,S3and S4representing the given inter-vals in the same orderS1000000000000S3010*********b.The question is a bit trivial if there is no bitmap on the deptname attribute is:Comp.Sci010********* Music000101000000 History000000000100 Elec.Eng.010********* Finance010*********10Chapter11Indexing and HashingScan on these records with salary80000or more gives Wu and Singhas the instructors who satisfy the given query.11.12Answer:If the index entries are inserted in ascending order,the newentries get directed to the last leaf node.When this leaf node getsfilled,it is split into two.Of the two nodes generated by the split,the left nodeis left untouched and the insertions takes place on the right node.Thismakes the occupancy of the leaf nodes to about50percent,except the lastleaf.If keys that are inserted are sorted in descending order,the above situationwould still occur,but symmetrically,with the right node of a split nevergetting touched again,and occupancy would again be50percent for allnodes other than thefirst leaf.11.13Answer:a.The cost to locate the page number of the required leaf page foran insertion is negligible since the non-leaf nodes are in memory.On the leaf level it takes one random disk access to read and onerandom disk access to update it along with the cost to write onepage.Insertions which lead to splitting of leaf nodes require anadditional page write.Hence to build a B+-tree with n r entries ittakes a maximum of2∗n r random disk accesses and n r+2∗(n r/f)page writes.The second part of the cost comes from the fact that inthe worst case each leaf is halffilled,so the number of splits thatoccur is twice n r/f.The above formula ignores the cost of writing non-leaf nodes,sincewe assume they are in memory,but in reality they would also bewritten eventually.This cost is closely approximated by2∗(n r/f)/f,which is the number of internal nodes just above the leaf;we canadd further terms to account for higher levels of nodes,but these aremuch smaller than the number of leaves and can be ignored.b.Substituting the values in the above formula and neglecting the costfor page writes,it takes about10,000,000∗20milliseconds,or56hours,since each insertion costs20milliseconds.Exercises11c.function insert leaf(value K,pointer P)if(tree is empty)create an empty leaf node L,which is also the rootelse Find the last leaf node in the leaf nodes chain Lif(L has less than n−1key values)then insert(K,P)at thefirst available location in Lelse beginCreate leaf node L1Set L.P n=L1;Set K1=last value from page Linsert parent(1,L,K1,L1)insert(K,P)at thefirst location in L1endfunction insert parent(level l,pointer P,value K,pointer P1)if(level l is empty)then beginCreate an empty non-leaf node N,which is also the rootinsert(P,K,P1)at the starting of the node Nreturnelse beginFind the right most node N at level lif(N has less than n pointers)then insert(K,P1)at thefirst available location in Nelse beginCreate a new non-leaf page N1insert(P1)at the starting of the node Ninsert parent(l+1,pointer N,value K,pointer N1)endendThe insert leaf function is called for each of the value,pointerpairs in ascending order.Similar function can also be build for de-scending order.The search for the last leaf or non-leaf node at anylevel can be avoided by storing the current last page details in anarray.The last node in each level might be less than halffilled.To makethis index structure meet the requirements of a B+-tree,we can re-distribute the keys of the last two pages at each level.Since the lastbut one node is always full,redistribution makes sure that both ofthen are at least halffilled.11.14Answer:In a B+-tree index orfile organization,leaf nodes that areadjacent to each other in the tree may be located at different places ondisk.When afile organization is newly created on a set of records,it ispossible to allocate blocks that are mostly contiguous on disk to leafsnodes that are contiguous in the tree.As insertions and deletions occur12Chapter11Indexing and Hashingon the tree,sequentiality is increasingly lost,and sequential access has towait for disk seeks increasingly often.a.One way to solve this problem is to rebuild the index to restoresequentiality.b.i.In the worst case each n-block unit and each node of the B+-treeis halffilled.This gives the worst case occupancy as25percent.ii.No.While splitting the n-block unit thefirst n/2leaf pages areplaced in one n-block unit,and the remaining in the second n-block unit.That is,every n-block split maintains the order.Hence,the nodes in the n-block units are consecutive.iii.In the regular B+-tree construction,the leaf pages might not besequential and hence in the worst case,it takes one seek per leafing the block at a time method,for each n-node block,we will have at least n/2leaf nodes in it.Each n-node block canbe read using one seek.Hence the worst case seeks comes downby a factor of n/2.iv.Allowing redistribution among the nodes of the same block,doesnot require additional seeks,where as,in regular B+-tree werequire as many seeks as the number of leaf pages involvedin the redistribution.This makes redistribution for leaf blocksefficient with this scheme.Also the worst case occupancy comesback to nearly50percent.(Splitting of leaf nodes is preferredwhen the participating leaf nodes are nearly full.Hence nearly50percent instead of exact50percent)。
数据库系统概念(英文精编版.第六版)
Atomicity of updates
Failures
may lead to inconsistencies (1) account_A = account_A – 100 (2) account_B = account_B + 100
Example:
Concurrent access by multiple users
Exercises
Computer users interacts with data in the _______ level A. physical B. logical C. view D. all of the above Application users interact with data in the _______ level. A. physical B. logical C. view D. all of the above How the data are actually stored is called _______ A. Physical level B. Logical level C. View level D. Conceptual level
property is called ( )
A. Data inconsistency C. Data isolation B. Data redundancy D. Data integrity
1.3 View of Data
Hierarchy of Abstraction Levels
Three Abstraction Levels of Data
机械工业出版社
本课程学习内容
关系数据模型 关系数据库语言
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第12章
C H A P T E R12Query ProcessingPractice Exercises12.1Assume(for simplicity in this exercise)that only one tuplefits in a blockand memory holds at most3blocks.Show the runs created on each passof the sort-merge algorithm,when applied to sort the following tuples onthefirst attribute:(kangaroo,17),(wallaby,21),(emu,1),(wombat,13),(platypus,3),(lion,8),(warthog,4),(zebra,11),(meerkat,6),(hyena,9),(hornbill,2),(baboon,12).Answer:We will refer to the tuples(kangaroo,17)through(baboon,12)using tuple numbers t1through t12.We refer to the j th run used by the i thpass,as r i j.The initial sorted runs have three blocks each.They are:r11={t3,t1,t2}r12={t6,t5,t4}r13={t9,t7,t8}r14={t12,t11,t10}Each pass merges three runs.Therefore the runs after the end of thefirstpass are:r21={t3,t1,t6,t9,t5,t2,t7,t4,t8}r22={t12,t11,t10}At the end of the second pass,the tuples are completely sorted into onerun:r31={t12,t3,t11,t10,t1,t6,t9,t5,t2,t7,t4,t8}12.2Consider the bank database of Figure12.13,where the primary keys areunderlined,and the following SQL query:12Chapter12Query Processingselect T.branch namefrom branch T,branch Swhere T.assets>S.assets and S.branch city=“Brooklyn”Write an efficient relational-algebra expression that is equivalent to thisquery.Justify your choice.Answer:Query:T.branch name(( branch name,assets(T(branch)))1T.assets>S.assets( assets((branch city=’Brooklyn’)(S(branch)))))This expression performs the theta join on the smallest amount of datapossible.It does this by restricting the right hand side operand of the jointo only those branches in Brooklyn,and also eliminating the unneededattributes from both the operands.12.3Let relations r1(A,B,C)and r2(C,D,E)have the following properties:r1has20,000tuples,r2has45,000tuples,25tuples of r1fit on one block,and30tuples of r2fit on one block.Estimate the number of block transfers andseeks required,using each of the following join strategies for r11r2:a.Nested-loop join.b.Block nested-loop join.c.Merge join.d.Hash join.Answer:r1needs800blocks,and r2needs1500blocks.Let us assume M pagesof memory.If M>800,the join can easily be done in1500+800diskaccesses,using even plain nested-loop join.So we consider only the casewhere M≤800pages.a.Nested-loop join:Using r1as the outer relation we need20000∗1500+800=30,000,800disk accesses,if r2is the outer relation we need45000∗800+1500=36,001,500disk accesses.b.Block nested-loop join:If r1is the outer relation,we need⌈800M−1⌉∗1500+800disk accesses,if r2is the outer relation we need⌈1500M−1⌉∗800+1500disk accesses.c.Merge-join:Assuming that r1and r2are not initially sorted on the join key,the to-tal sorting cost inclusive of the output is B s=1500(2⌈log M−1(1500/M)⌉+Exercises32)+800(2⌈log M−1(800/M)⌉+2)disk accesses.Assuming all tupleswith the same value for the join attributesfit in memory,the totalcost is B s+1500+800disk accesses.d.Hash-join:We assume no overflow occurs.Since r1is smaller,we use it as thebuild relation and r2as the probe relation.If M>800/M,i.e.no needfor recursive partitioning,then the cost is3(1500+800)=6900diskaccesses,else the cost is2(1500+800)⌈log M−1(800)−1⌉+1500+800disk accesses.12.4The indexed nested-loop join algorithm described in Section12.5.3can beinefficient if the index is a secondary index,and there are multiple tuples with the same value for the join attributes.Why is it inefficient?Describea way,using sorting,to reduce the cost of retrieving tuples of the innerrelation.Under what conditions would this algorithm be more efficient than hybrid merge join?Answer:If there are multiple tuples in the inner relation with the same value for the join attributes,we may have to access that many blocks of the inner relation for each tuple of the outer relation.That is why it is inefficient.To reduce this cost we can perform a join of the outer relation tuples with just the secondary index leaf entries,postponing the inner relation tuple retrieval.The resultfile obtained is then sorted on the inner relation addresses,allowing an efficient physical order scan to complete the join.Hybrid merge–join requires the outer relation to be sorted.The above algorithm does not have this requirement,but for each tuple in the outer relation it needs to perform an index lookup on the inner relation.If the outer relation is much larger than the inner relation,this index lookup cost will be less than the sorting cost,thus this algorithm will be more efficient.12.5Let r and s be relations with no indices,and assume that the relationsare not sorted.Assuming infinite memory,what is the lowest-cost way (in terms of I/O operations)to compute r1s?What is the amount of memory required for this algorithm?Answer:We can store the entire smaller relation in memory,read the larger relation block by block and perform nested loop join using the larger one as the outer relation.The number of I/O operations is equal to b r+b s,and memory requirement is min(b r,b s)+2pages.12.6Consider the bank database of Figure12.13,where the primary keys areunderlined.Suppose that a B+-tree index on branch city is available on relation branch,and that no other index is available.List different ways to handle the following selections that involve negation:a.¬(branch city<“Brooklyn”)(branch)4Chapter12Query Processingb.¬(branch city=“Brooklyn”)(branch)c.¬(branch city<“Brooklyn”∨assets<5000)(branch)Answer:e the index to locate thefirst tuple whose branch cityfield hasvalue“Brooklyn”.From this tuple,follow the pointer chains till theend,retrieving all the tuples.b.For this query,the index serves no purpose.We can scan thefilesequentially and select all tuples whose branch cityfield is anythingother than“Brooklyn”.c.This query is equivalent to the query(branch city≥′Brooklyn′∧assets<5000)(branch)Using the branch-city index,we can retrieve all tuples with branch-cityvalue greater than or equal to“Brooklyn”by following the pointerchains from thefirst“Brooklyn”tuple.We also apply the additionalcriteria of assets<5000on every tuple.12.7Write pseudocode for an iterator that implements indexed nested-loopjoin,where the outer relation is pipelined.Your pseudocode must definethe standard iterator functions open(),next(),and close().Show what stateinformation the iterator must maintain between calls.Answer:Let outer be the iterator which returns successive tuples fromthe pipelined outer relation.Let inner be the iterator which returns suc-cessive tuples of the inner relation having a given value at the join at-tributes.The inner iterator returns these tuples by performing an indexlookup.The functions IndexedNLJoin::open,IndexedNLJoin::close andIndexedNLJoin::next to implement the indexed nested-loop join iteratorare given below.The two iterators outer and inner,the value of the lastread outer relation tuple t r and aflag done r indicating whether the end ofthe outer relation scan has been reached are the state information whichneed to be remembered by IndexedNLJoin between calls.IndexedNLJoin::open()beginouter.open();inner.open();done r:=false;if(outer.next()=false)move tuple from outer’s output buffer to t r;elsedone r:=true;endExercises5IndexedNLJoin::close()beginouter.close();inner.close();endboolean IndexedNLJoin::next()beginwhile(¬done r)beginif(inner.next(t r[JoinAttrs])=false)beginmove tuple from inner’s output buffer to t s;compute t r1t s and place it in output buffer;return true;endelseif(outer.next()=false)beginmove tuple from outer’s output buffer to t r;rewind inner tofirst tuple of s;endelsedone r:=true;endreturn false;end12.8Design sort-based and hash-based algorithms for computing the relationaldivision operation(see Practise Exercises of Chapter6for a definition of the division operation).Answer:Suppose r(T∪S)and s(S)be two relations and r÷s has to be computed.For sorting based algorithm,sort relation s on S.Sort relation r on (T,S).Now,start scanning r and look at the T attribute values of thefirst tuple.Scan r till tuples have same value of T.Also scan s simultaneously and check whether every tuple of s also occurs as the S attribute of r,ina fashion similar to merge join.If this is the case,output that value of Tand proceed with the next value of T.Relation s may have to be scanned multiple times but r will only be scanned once.Total disk accesses,after6Chapter12Query Processingsorting both the relations,will be|r|+N∗|s|,where N is the number ofdistinct values of T in r.We assume that for any value of T,all tuples in r with that T valuefit inmemory,and consider the general case at the end.Partition the relation ron attributes in T such that each partitionfits in memory(always possiblebecause of our assumption).Consider partitions one at a time.Build ahash table on the tuples,at the same time collecting all distinct T valuesin a separate hash table.For each value of T,Now,for each value V T ofT,each value s of S,probe the hash table on(V T,s).If any of the values isabsent,discard the value V T,else output the value V T.In the case that not all r tuples with one value for Tfit in memory,partitionr and s on the S attributes such that the condition is satisfied,run thealgorithm on each corresponding pair of partitions r i and s i.Output theintersection of the T values generated in each partition.12.9What is the effect on the cost of merging runs if the number of bufferblocks per run is increased,while keeping overall memory available forbuffering runsfixed?Answer:Seek overhead is reduced,but the the number of runs that canbe merged in a pass decreases potentially leading to more passes.A valueof b b that minimizes overall cost should be chosen.。
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第19章
C H A P T E R19Distributed DatabasesPractice Exercises19.1How might a distributed database designed for a local-area network differfrom one designed for a wide-area network?Answer:Data transfer on a local-area network(LAN)is much faster thanon a wide-area network(WAN).Thus replication and fragmentation willnot increase throughput and speed-up on a LAN,as much as in a WAN.But even in a LAN,replication has its uses in increasing reliability andavailability.19.2To build a highly available distributed system,you must know what kindsof failures can occur.a.List possible types of failure in a distributed system.b.Which items in your list from part a are also applicable to a central-ized system?Answer:a.The types of failure that can occur in a distributed system includei.Site failure.ii.Disk failure.munication failure,leading to disconnection of one or moresites from the network.b.Thefirst two failure types can also occur on centralized systems.19.3Consider a failure that occurs during2PC for a transaction.For each pos-sible failure that you listed in Practice Exercise19.2a,explain how2PCensures transaction atomicity despite the failure.Answer:A proof that2PC guarantees atomic commits/aborts inspite ofsite and link failures,follows.The main idea is that after all sites replywith a<ready T>message,only the co-ordinator of a transaction canmake a commit or abort decision.Any subsequent commit or abort by a12Chapter19Distributed Databasessite can happen only after it ascertains the co-ordinator’s decision,eitherdirectly from the co-ordinator,or indirectly from some other site.Let usenumerate the cases for a site aborting,and then for a site committing.a.A site can abort a transaction T(by writing an<abort T>log record)only under the following circumstances:i.It has not yet written a<ready T>log-record.In this case,theco-ordinator could not have got,and will not get a<ready T>or<commit T>message from this site.Therefore only an abortdecision can be made by the co-ordinator.ii.It has written the<ready T>log record,but on inquiry it foundout that some other site has an<abort T>log record.In thiscase it is correct for it to abort,because that other site wouldhave ascertained the co-ordinator’s decision(either directly orindirectly)before actually aborting.iii.It is itself the co-ordinator.In this case also no site could havecommitted,or will commit in the future,because commit deci-sions can be made only by the co-ordinator.b.A site can commit a transaction T(by writing an<commit T>logrecord)only under the following circumstances:i.It has written the<ready T>log record,and on inquiry it foundout that some other site has a<commit T>log record.In thiscase it is correct for it to commit,because that other site wouldhave ascertained the co-ordinator’s decision(either directly orindirectly)before actually committing.ii.It is itself the co-ordinator.In this case no other participatingsite can abort/would have aborted,because abort decisions aremade only by the co-ordinator.19.4Consider a distributed system with two sites,A and B.Can site A distin-guish among the following?•B goes down.•The link between A and B goes down.•B is extremely overloaded and response time is100times longer than normal.What implications does your answer have for recovery in distributedsystems?Answer:Site A cannot distinguish between the three cases until communicationhas resumed with site B.The action which it performs while B is inacces-sible must be correct irrespective of which of these situations has actuallyPractice Exercises3 occurred,and must be such that B can re-integrate consistently into the distributed system once communication is restored.19.5The persistent messaging scheme described in this chapter depends ontimestamps combined with discarding of received messages if they are too old.Suggest an alternative scheme based on sequence numbers instead of timestamps.Answer:We can have a scheme based on sequence numbers similar to the scheme based on timestamps.We tag each message with a sequence number that is unique for the(sending site,receiving site)pair.The num-ber is increased by1for each new message sent from the sending site to the receiving site.The receiving site stores and acknowledges a received message only if it has received all lower numbered messages also;the message is stored in the received-messages relation.The sending site retransmits a message until it has received an ack from the receiving site containing the sequence number of the transmitted message, or a higher sequence number.Once the acknowledgment is received,it can delete the message from its send queue.The receiving site discards all messages it receives that have a lower sequence number than the latest stored message from the sending site.The receiving site discards from received-messages all but the(number of the) most recent message from each sending site(message can be discarded only after being processed locally).Note that this scheme requires afixed(and small)overhead at the receiving site for each sending site,regardless of the number of messages received.In contrast the timestamp scheme requires extra space for every message.The timestamp scheme would have lower storage overhead if the number of messages received within the timeout interval is small compared to the number of sites,whereas the sequence number scheme would have lower overhead otherwise.19.6Give an example where the read one,write all available approach leadsto an erroneous state.Answer:Consider the balance in an account,replicated at N sites.Let the current balance be$100–consistent across all sites.Consider two trans-actions T1and T2each depositing$10in the account.Thus the balance would be$120after both these transactions are executed.Let the transac-tions execute in sequence:T1first and then T2.Let one of the sites,say s, be down when T1is executed and transaction t2reads the balance from site s.One can see that the balance at the primary site would be$110at the end.19.7Explain the difference between data replication in a distributed systemand the maintenance of a remote backup site.Answer:In remote backup systems all transactions are performed at the primary site and the data is replicated at the remote backup site.The4Chapter19Distributed Databasesremote backup site is kept synchronized with the updates at the primarysite by sending all log records.Whenever the primary site fails,the remotebackup site takes over processing.The distributed systems offer greater availability by having multiplecopies of the data at different sites whereas the remote backup systemsoffer lesser availability at lower cost and execution overhead.In a distributed system,transaction code runs at all the sites whereas ina remote backup system it runs only at the primary site.The distributedsystem transactions follow two-phase commit to have the data in con-sistent state whereas a remote backup system does not follow two-phasecommit and avoids related overhead.19.8Give an example where lazy replication can lead to an inconsistent databasestate even when updates get an exclusive lock on the primary(master)copy.Answer:Consider the balance in an account,replicated at N sites.Let thecurrent balance be$100–consistent across all sites.Consider two trans-actions T1and T2each depositing$10in the account.Thus the balancewould be$120after both these transactions are executed.Let the trans-actions execute in sequence:T1first and then T2.Suppose the copy of thebalance at one of the sites,say s,is not consistent–due to lazy replicationstrategy–with the primary copy after transaction T1is executed and lettransaction T2read this copy of the balance.One can see that the balanceat the primary site would be$110at the end.19.9Consider the following deadlock-detection algorithm.When transactionT i,at site S1,requests a resource from T j,at site S3,a request message withtimestamp n is sent.The edge(T i,T j,n)is inserted in the local wait-forgraph of S1.The edge(T i,T j,n)is inserted in the local wait-for graph ofS3only if T j has received the request message and cannot immediatelygrant the requested resource.A request from T i to T j in the same site ishandled in the usual manner;no timestamps are associated with the edge(T i,T j).A central coordinator invokes the detection algorithm by sendingan initiating message to each site in the system.On receiving this message,a site sends its local wait-for graph to the coordinator.Note that such a graph contains all the local information thatthe site has about the state of the real graph.The wait-for graph reflectsan instantaneous state of the site,but it is not synchronized with respectto any other site.When the controller has received a reply from each site,it constructs a graph as follows:•The graph contains a vertex for every transaction in the system.•The graph has an edge(T i,T j)if and only if:◦There is an edge(T i,T j)in one of the wait-for graphs.Practice Exercises5◦An edge(T i,T j,n)(for some n)appears in more than one wait-forgraph.Show that,if there is a cycle in the constructed graph,then the system isin a deadlock state,and that,if there is no cycle in the constructed graph,then the system was not in a deadlock state when the execution of thealgorithm began.Answer:Let us say a cycle T i→T j→···→T m→T i exists in the graphbuilt by the controller.The edges in the graph will either be local edgesof the from(T k,T l)or distributed edges of the form(T k,T l,n).Each localedge(T k,T l)definitely implies that T k is waiting for T l.Since a distributededge(T k,T l,n)is inserted into the graph only if T k’s request has reachedT l and T l cannot immediately release the lock,T k is indeed waiting for T l.Therefore every edge in the cycle indeed represents a transaction waitingfor another.For a detailed proof that this imlies a deadlock refer to Stuartet al.[1984].We now prove the converse implication.As soon as it is discovered thatT k is waiting for T l:a.a local edge(T k,T l)is added if both are on the same site.b.The edge(T k,T l,n)is added in both the sites,if T k and T l are ondifferent sites.Therefore,if the algorithm were able to collect all the local wait-for graphsat the same instant,it would definitely discover a cycle in the constructedgraph,in case there is a circular wait at that instant.If there is a circu-lar wait at the instant when the algorithm began execution,none of theedges participating in that cycle can disappear until the algorithmfin-ishes.Therefore,even though the algorithm cannot collect all the localgraphs at the same instant,any cycle which existed just before it startedwill anyway be detected.19.10Consider a relation that is fragmented horizontally by plant number:employee(name,address,salary,plant number)Assume that each fragment has two replicas:one stored at the New Yorksite and one stored locally at the plant site.Describe a good processingstrategy for the following queries entered at the San Jose site.a.Find all employees at the Boca plant.b.Find the average salary of all employees.c.Find the highest-paid employee at each of the following sites:Toronto,Edmonton,Vancouver,Montreal.d.Find the lowest-paid employee in the company.6Chapter19Distributed DatabasesAnswer:a.i.Send the query name(employee)to the Boca plant.ii.Have the Boca location send back the answer.pute average at New York.ii.Send answer to San Jose.c.i.Send the query tofind the highest salaried employee to Toronto,Edmonton,Vancouver,and Montreal.pute the queries at those sites.iii.Return answers to San Jose.d.i.Send the query tofind the lowest salaried employee to New York.pute the query at New York.iii.Send answer to San Jose.19.11Compute r⋉s for the relations of Figure19.9.Answer:The result is as follows.r⋉s=A B C12353219.12Give an example of an application ideally suited for the cloud and anotherthat would be hard to implement successfully in the cloud.Explain youranswer.Answer:Any application that is easy to partition,and does not needstrong guarantees of consistency across partitions,is ideally suited to thecloud.For example,Web-based document storage systems(like Googledocs),and Web based email systems(like Hotmail,Yahoo!mail or GMail),are ideally suited to the cloud.The cloud is also ideally suited to certainkinds of data analysis tasks where the data is already on the cloud;forexample,the Google Map-Reduce framework,and Yahoo!Hadoop arewidely used for data analysis of Web logs such as logs of URLs clicked byusers.Any database application that needs transactional consistency wouldbe hard to implement successfully in the cloud;examples include bankrecords,academic records of students,and many other types of organiza-tional records.19.13Given that the LDAP functionality can be implemented on top of a databasesystem,what is the need for the LDAP standard?Answer:The reasons are:a.Directory access protocols are simplified protocols that cater to alimited type of access to data.Practice Exercises7b.Directory systems provide a simple mechanism to name objects ina hierarchical fashion which can be used in a distributed directorysystem to specify what information is stored in each of the directoryservers.The directory system can be set up to automatically forwardqueries made at one site to the other site,without user intervention.19.14Consider a multidatabase system in which it is guaranteed that at mostone global transaction is active at any time,and every local site ensures local serializability.a.Suggest ways in which the multidatabase system can ensure thatthere is at most one active global transaction at any time.b.Show by example that it is possible for a nonserializable globalschedule to result despite the assumptions.Answer:a.We can have a special data item at some site on which a lock willhave to be obtained before starting a global transaction.The lockshould be released after the transaction completes.This ensures thesingle active global transaction requirement.To reduce dependencyon that particular site being up,we can generalize the solution byhaving an election scheme to choose one of the currently up sites tobe the co-ordinator,and requiring that the lock be requested on thedata item which resides on the currently elected co-ordinator.b.The following schedule involves two sites and four transactions.T1and T2are local transactions,running at site1and site2respectively.T G1and T G2are global transactions running at both sites.X1,Y1aredata items at site1,and X2,Y2are at site2.T1T2T G1T G2write(Y1)read(Y1)write(X2)read(X2)write(Y2)read(Y2)write(X1)read(X1)In this schedule,T G2starts only after T G1finishes.Within each site,there is local serializability.In site1,T G2→T1→T G1is a serializ-ability order.In site2,T G1→T2→T G2is a serializability order.Yetthe global schedule schedule is non-serializable.19.15Consider a multidatabase system in which every local site ensures localserializability,and all global transactions are read only.8Chapter19Distributed Databasesa.Show by example that nonserializable executions may result in sucha system.b.Show how you could use a ticket scheme to ensure global serializ-ability.Answer:a.The same system as in the answer to Exercise19.14is assumed,except that now both the global transactions are read-only.Considerthe schedule given below.Though there is local serializability in both sites,the global scheduleis not serializable.b.Since local serializability is guaranteed,any cycle in the system wideprecedence graph must involve at least two different sites,and twodifferent global transactions.The ticket scheme ensures that when-ever two global transactions access data at a site,they conflict ona data item(the ticket)at that site.The global transaction managercontrols ticket access in such a manner that the global transactionsexecute with the same serializability order in all the sites.Thus thechance of their participating in a cycle in the system wide precedencegraph is eliminated.。
数据库系统概念(database system concepts)英文第六版 第一章
Databa se Sy stem Concept s - 6th Edition
1 .3
©Silber schatz , Korth and S u dar
n Relational model (Chapter 2) n Example of tabular data in the relational model Columns
_____ Rows
Databa se Sy stem Concept s - 6th Edition
1 .10
©Silber schatz , Korth and S u dar
n Physical Data Independence – the ability to modify the physical schema without changing the logical schema l Applications depend on the logical schema l In general, the interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
1 .5
©Silber schatz , Korth and S u dar
n Phys ical level : describes how a record (e.g., customer) is stored. n Logical level : describes data stored in database, and the relationships among the data. type instructor = record ID : string;
数据库系统概念(database system concepts)英文第六版 PPT 第17章
Database System Concepts - 6th Edition
17.3
© Silberschatz, Korth and Sudarshan
A Centralized Computer System
Database System Concepts - 6th Edition
17.4
© Silberschatz, Korth and Sudarshan
computer systems.
General-purpose computer system: one to a few CPUs and a number
of device controllers that are connected through a common bus that provides access to shared memory.
Checkpoint process
Performs periodic checkpoints
Monitors other processes, and takes recovery actions if any of the other processes fail
Process monitor process
Chapter 17: Database System Architectures
Centralized and Client-Server Systems Server System Architectures Parallel Systems Distributed Systems
better functionality for the cost flexibility in locating resources and expanding facilities better user interfaces easier maintenance
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第25章
C H A P T E R25Advanced Data Types and New ApplicationsPractice Exercises25.1What are the two types of time,and how are they different?Why doesit make sense to have both types of time associated with a tuple?Answer:A temporal database models the changing states of someaspects of the real world.The time intervals related to the data storedin a temporal database may be of two types-valid time and transactiontime.The valid time for a fact is the set of intervals during which the factis true in the real world.The transaction time for a data object is the set oftime intervals during which this object is part of the physical database.Only the transaction time is system dependent and is generated by thedatabase system.Suppose we consider our sample bank database to be bitemporal.Only the concept of valid time allows the system to answer queries suchas-“What was Smith’s balance two days ago?”.On the other hand,queries such as-“What did we record as Smith’s balance two daysago?”can be answered based on the transaction time.The differencebetween the two times is important.For example,suppose,three daysago the teller made a mistake in entering Smith’s balance and correctedthe error only yesterday.This error means that there is a differencebetween the results of the two queries(if both of them are executedtoday).25.2Suppose you have a relation containing the x,y coordinates and namesof restaurants.Suppose also that the only queries that will be askedare of the following form:The query specifies a point,and asks if thereis a restaurant exactly at that point.Which type of index would bepreferable,R-tree or B-tree?Why?Answer:The given query is not a range query,since it requires onlysearching for a point.This query can be efficiently answered by a B-treeindex on the pair of attributes(x,y).12Chapter25Advanced Data Types and New Applications25.3Suppose you have a spatial database that supports region queries(withcircular regions)but not nearest-neighbor queries.Describe an algo-rithm tofind the nearest neighbor by making use of multiple regionqueries.Answer:Suppose that we want to search for the nearest neighbor of apoint P in a database of points in the plane.The idea is to issue multipleregion queries centered at P.Each region query covers a larger area ofpoints than the previous query.The procedure stops when the result ofa region query is non-empty.The distance from P to each point withinthis region is calculated and the set of points at the smallest distance isreported.25.4Suppose you want to store line segments in an R-tree.If a line segment isnot parallel to the axes,the bounding box for it can be large,containinga large empty area.•Describe the effect on performance of having large bounding boxes on queries that ask for line segments intersecting a given region.•Briefly describe a technique to improve performance for such queries and give an example of its benefit.Hint:You can divide segmentsinto smaller pieces.Answer:Large bounding boxes tend to overlap even where the regionof overlap does not contain any information.The followingfigure:Rshows a region R within which we have to locate a segment.Notethat even though none of the four segments lies in R,due to the largebounding boxes,we have to check each of the four bounding boxes toconfirm this.A significant improvement is observed in the follwoingfigure:Practice Exercises3Rwhere each segment is split into multiple pieces,each with its own bounding box.In the second case,the box R is not part of the boxes indexed by the R-tree.In general,dividing a segment into smaller pieces causes the bounding boxes to be smaller and less wasteful of area. 25.5Give a recursive procedure to efficiently compute the spatial join of tworelations with R-tree indices.(Hint:Use bounding boxes to check if leaf entries under a pair of internal nodes may intersect.)Answer:Following is a recursive procedure for computing spatial join of two R-trees.SpJoin(node n1,node n2)beginif(the bounding boxes of n1and n2do not intersect)return;if(both n1and n2are leaves)output all pairs of entries(e1,e2)such thate1∈n1and e2∈n2,and e1and e2overlap;if(n1is not a leaf)NS1=set of children of n1;elseNS1={n1};if(n1is not a leaf)NS1=set of children of n1;elseNS1={n1};for each ns1in NS1and ns2in NS2;SpJoin(ns1,ns2);end25.6Describe how the ideas behind the RAID organization(Section10.3)canbe used in a broadcast-data environment,where there may occasionally be noise that prevents reception of part of the data being transmitted.4Chapter25Advanced Data Types and New ApplicationsAnswer:The concepts of RAID can be used to improve reliability ofthe broadcast of data over wireless systems.Each block of data that isto be broadcast is split into units of equal size.A checksum value iscalculated for each unit and appended to the unit.Now,parity data forthese units is calculated.A checksum for the parity data is appendedto it to form a parity unit.Both the data units and the parity unit arethen broadcast one after the other as a single transmission.On reception of the broadcast,the receiver uses the checksums to verify whether each unit is received without error.If one unit is foundto be in error,it can be reconstructed from the other units.The size of a unit must be chosen carefully.Small units not only requiremore checksums to be computed,but the chance that a burst of noisecorrupts more than one unit is also higher.The problem with usinglarge units is that the probability of noise affecting a unit increases;thus there is a tradeoff to be made.25.7Define a model of repeatedly broadcast data in which the broadcastmedium is modeled as a virtual disk.Describe how access time anddata-transfer rate for this virtual disk differ from the correspondingvalues for a typical hard disk.Answer:We can distinguish two models of broadcast data.In the caseof a pure broadcast medium,where the receiver cannot communicatewith the broadcaster,the broadcaster transmits data with periodic cy-cles of retransmission of the entire data,so that new receivers can catchup with all the broadcast information.Thus,the data is broadcast in acontinuous cycle.This period of the cycle can be considered akin to theworst case rotational latency in a disk drive.There is no concept of seektime here.The value for the cycle latency depends on the application,but is likely to be at least of the order of seconds,which is much higherthan the latency in a disk drive.In an alternative model,the receiver can send requests back to the broadcaster.In this model,we can also add an equivalent of disk accesslatency,between the receiver sending a request,and the broadcasterreceiving the request and responding to it.The latency is a function ofthe volume of requests and the bandwidth of the broadcast medium.Further,queries may get satisfied without even sending a request,sincethe broadcaster happened to send the data either in a cycle or basedon some other receivers request.Regardless,latency is likely to be atleast of the order of seconds,again much higher than the correspondingvalues for a hard disk.A typical hard disk can transfer data at the rate of1to5megabytes persecond.In contrast,the bandwidth of a broadcast channel is typicallyonly a few kilobytes per second.Total latency is likely to be of the orderof seconds to hundreds or even thousands of seconds,compared to afew milliseconds for a hard disk.25.8Consider a database of documents in which all documents are keptin a central database.Copies of some documents are kept on mobilePractice Exercises5 computers.Suppose that mobile computer A updates a copy of docu-ment1while it is disconnected,and,at the same time,mobile computer B updates a copy of document2while it is disconnected.Show how the version-vector scheme can ensure proper updating of the central database and mobile computers when a mobile computer reconnects. Answer:Let C be the computer onto which the central database is loaded.Each mobile computer(host)i stores,with its copy of each document d,a version-vector–that is a set of version numbers V d,i,j, with one entry for each other host j that stores a copy of the document d,which it could possibly update.Host A updates document1while it is disconnected from C.Thus, according to the version vector scheme,the version number V1,A,A is incremented by one.Now,suppose host A re-connects to C.This pair exchanges version-vectors andfinds that the version number V1,A,A is greater than V1,C,A by1,(assuming that the copy of document1stored host A was updated most recently only by host A).Following the version-vector scheme, the version of document1at C is updated and the change is reflected by an increment in the version number V1,C,A.Note that these are the only changes made by either host.Similarly,when host B connects to host C,they exchange version-vectors,and host Bfinds that V1,B,A is one less than V1,C,A.Thus,the version number V1,B,A is incremented by one,and the copy of docu-ment1at host B is updated.Thus,we see that the version-vector scheme ensures proper updating of the central database for the case just considered.This argument can be very easily generalized for the case where multiple off-line updates are made to copies of document1at host A as well as host B and host C.The argument for off-line updates to document2is similar.。
数据库系统概念(database system concepts)英文第六版 PPT 第24章
Solving above equation with current disk and memory prices leads to:
5-minute rule: if a page that is randomly accessed is used more frequently than once in 5 minutes it should be kept in memory
E.g., 80% of the code may take up 20% of time and 20% of code takes up 80% of time
Worth spending most time on 20% of code that take 80% of time
Bottlenecks may be in hardware (e.g., disks are very busy, CPU is
idle), or in software
Removing one bottleneck often exposes another
De-bottlenecking consists of repeatedly finding bottlenecks, and
removing them
This is a heuristic
24.2
©Silberschatz, Korth and Sudarshan
Performance Tuning
Adjusting various parameters and design choices to improve system performance for a specific application.
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第8章
C H A P T E R8Relational Database DesignExercises8.1Suppose that we decompose the schema R=(A,B,C,D,E)into(A,B,C)(A,D,E).Show that this decomposition is a lossless-join decomposition if thefollowing set F of functional dependencies holds:A→BCCD→EB→DE→AAnswer:A decomposition{R1,R2}is a lossless-join decomposition ifR1∩R2→R1or R1∩R2→R2.Let R1=(A,B,C),R2=(A,D,E),and R1∩R2=A.Since A is a candidate key(see PracticeExercise8.6),Therefore R1∩R2→R1.8.2List all functional dependencies satisfied by the relation of Figure8.17.Answer:The nontrivial functional dependencies are:A→B andC→B,and a dependency they logically imply:AC→B.There are19trivial functional dependencies of the form␣→,where⊆␣.Cdoes not functionally determine A because thefirst and third tuples havethe same C but different A values.The same tuples also show B does notfunctionally determine A.Likewise,A does not functionally determineC because thefirst two tuples have the same A value and different Cvalues.The same tuples also show B does not functionally determine C.8.3Explain how functional dependencies can be used to indicate the fol-lowing:910Chapter8Relational Database Design•A one-to-one relationship set exists between entity sets student andinstructor.•A many-to-one relationship set exists between entity sets studentand instructor.Answer:Let Pk(r)denote the primary key attribute of relation r.•The functional dependencies Pk(student)→Pk(instructor)andPk(instructor)→Pk(student)indicate a one-to-one relationshipbecause any two tuples with the same value for student must havethe same value for instructor,and any two tuples agreeing oninstructor must have the same value for student.•The functional dependency Pk(student)→Pk(instructor)indicates amany-to-one relationship since any student value which is repeatedwill have the same instructor value,but many student values mayhave the same instructor value.8.4Use Armstrong’s axioms to prove the soundness of the union rule.(Hint:Use the augmentation rule to show that,if␣→,then␣→␣.Apply theaugmentation rule again,using␣→␥,and then apply the transitivityrule.)Answer:To prove that:if␣→and␣→␥then␣→␥Following the hint,we derive:␣→given␣␣→␣augmentation rule␣→␣union of identical sets␣→␥given␣→␥augmentation rule␣→␥transitivity rule and set union commutativity8.5Use Armstrong’s axioms to prove the soundness of the pseudotransitiv-ity rule.Answer:Proof using Armstrong’s axioms of the Pseudotransitivity Rule:if␣→and␥→␦,then␣␥→␦.␣→given␣␥→␥augmentation rule and set union commutativity␥→␦given␣␥→␦transitivity rule8.6Compute the closure of the following set F of functional dependenciesfor relation schema R=(A,B,C,D,E).Exercises 11A →BCCD →EB →DE →AList the candidate keys for R .Answer:Note:It is not reasonable to expect students to enumerate all of F +.Some shorthand representation of the result should be acceptable as long as the nontrivial members of F +are found.Starting with A →BC ,we can conclude:A →B and A →C .Since A →B and B →D ,A →D (decomposition,transitive)Since A →C D and C D →E ,A →E (union,decom-position,transi-tive)Since A →A ,we have (reflexive)A →ABC DE from the above steps (union)Since E →A ,E →ABC DE (transitive)Since C D →E ,C D →ABC DE (transitive)Since B →D and BC →C D ,BC →ABC DE (augmentative,transitive)Also,C →C ,D →D ,B D →D ,etc.Therefore,any functional dependency with A ,E ,BC ,or C D on the left hand side of the arrow is in F +,no matter which other attributes appear in the FD.Allow *to represent any set of attributes in R ,then F +is B D →B ,B D →D ,C →C ,D →D ,B D →B D ,B →D ,B →B ,B →B D ,and all FDs of the form A ∗→␣,BC ∗→␣,C D ∗→␣,E ∗→␣where ␣is any subset of {A ,B ,C ,D ,E }.The candidate keys are A ,BC ,C D ,and E .8.7Using the functional dependencies of Practice Exercise 8.6,compute thecanonical cover F c .Answer:The given set of FDs F is:-A →BCCD →EB →DE →AThe left side of each FD in F is unique.Also none of the attributes in the left side or right side of any of the FDs is extraneous.Therefore the canonical cover F c is equal to F .12Chapter8Relational Database Design8.8Consider the algorithm in Figure8.18to compute␣+.Show that thisalgorithm is more efficient than the one presented in Figure8.8(Sec-tion8.4.2)and that it computes␣+correctly.Answer:The algorithm is correct because:•If A is added to result then there is a proof that␣→A.To see this,observe that␣→␣trivially so␣is correctly part of result.IfA∈␣is added to result there must be some FD→␥such thatA∈␥andis already a subset of result.(Otherwise f dcountwould be nonzero and the if condition would be false.)A full proofcan be given by induction on the depth of recursion for an executionof addin,but such a proof can be expected only from students witha good mathematical background.•If A∈␣+,then A is eventually added to result.We prove this byinduction on the length of the proof of␣→A using Armstrong’saxioms.First observe that if procedure addin is called with someargument,all the attributes inwill be added to result.Also if aparticular FD’s fdcount becomes0,all the attributes in its tail willdefinitely be added to result.The base case of the proof,A∈␣⇒A∈␣+,is obviously true because thefirst call to addinhas the argument␣.The inductive hypotheses is that if␣→A canbe proved in n steps or less then A∈result.If there is a proof inn+1steps that␣→A,then the last step was an application ofeither reflexivity,augmentation or transitivity on a fact␣→proved in n or fewer steps.If reflexivity or augmentation was usedin the(n+1)st step,A must have been in result by the end of the n thstep itself.Otherwise,by the inductive hypothesis⊆result.Therefore the dependency used in proving→␥,A∈␥willhave f dcount set to0by the end of the n th step.Hence A will beadded to result.To see that this algorithm is more efficient than the one presented inthe chapter note that we scan each FD once in the main program.Theresulting array a ppears has size proportional to the size of the givenFDs.The recursive calls to addin result in processing linear in the sizeof a ppears.Hence the algorithm has time complexity which is linear inthe size of the given FDs.On the other hand,the algorithm given in thetext has quadratic time complexity,as it may perform the loop as manytimes as the number of FDs,in each loop scanning all of them once.8.9Given the database schema R(a,b,c),and a relation r on the schema R,write an SQL query to test whether the functional dependency b→cholds on relation r.Also write an SQL assertion that enforces the func-tional dependency.Assume that no null values are present.(Althoughpart of the SQL standard,such assertions are not supported by anydatabase implementation currently.)Answer:Exercises13a.The query is given below.Its result is non-empty if and only ifb→c does not hold on r.select bfrom rgroup by bhaving count(distinct c)>1b.create assertion b to c check(not exists(select bfrom rgroup by bhaving count(distinct c)>1))8.10Our discussion of lossless-join decomposition implicitly assumed thatattributes on the left-hand side of a functional dependency cannot take on null values.What could go wrong on decomposition,if this property is violated?Answer:The natural join operator is defined in terms of the cartesian product and the selection operator.The selection operator,gives unknown for any query on a null value.Thus,the natural join excludes all tuples with null values on the common attributes from thefinal result.Thus, the decomposition would be lossy(in a manner different from the usual case of lossy decomposition),if null values occur in the left-hand side of the functional dependency used to decompose the relation.(Null values in attributes that occur only in the right-hand side of the functional dependency do not cause any problems.)8.11In the BCNF decomposition algorithm,suppose you use a functional de-pendency␣→to decompose a relation schema r(␣,,␥)into r1(␣,) and r2(␣,␥).a.What primary and foreign-key constraint do you expect to holdon the decomposed relations?b.Give an example of an inconsistency that can arise due to anerroneous update,if the foreign-key constraint were not enforcedon the decomposed relations above.c.When a relation is decomposed into3NF using the algorithm inSection8.5.2,what primary and foreign key dependencies wouldyou expect will hold on the decomposed schema?14Chapter8Relational Database DesignAnswer:a.␣should be a primary key for r1,and␣should be the foreign keyfrom r2,referencing r1.b.If the foreign key constraint is not enforced,then a deletion of atuple from r1would not have a corresponding deletion from thereferencing tuples in r2.Instead of deleting a tuple from r,thiswould amount to simply setting the value of␣to null in sometuples.c.For every schema r i(␣)added to the schema because of a rule␣→,␣should be made the primary key.Also,a candidate key␥for the original relation is located in some newly created relationr k,and is a primary key for that relation.Foreign key constraints are created as follows:for each relationr i created above,if the primary key attributes of r i also occur inany other relation r j,then a foreign key constraint is created fromthose attributes in r j,referencing(the primary key of)r i.8.12Let R1,R2,...,R n be a decomposition of schema U.Let u(U)be a rela-(u).Show thattion,and let r i= RIu⊆r11r21···1r nAnswer:Consider some tuple t in u.(u)implies that t[R i]∈r i,1≤i≤n.Thus,Note that r i= Rit[R1]1t[R2]1...1t[R n]∈r11r21...1r nBy the definition of natural join,t[R1]1t[R2]1...1t[R n]= ␣((t[R1]×t[R2]×...×t[R n]))where the conditionis satisfied if values of attributes with the samename in a tuple are equal and where␣=U.The cartesian productof single tuples generates one tuple.The selection process is satisfiedbecause all attributes with the same name must have the same valuesince they are projections from the same tuple.Finally,the projectionclause removes duplicate attribute names.By the definition of decomposition,U=R1∪R2∪...∪R n,which meansthat all attributes of t are in t[R1]1t[R2]1...1t[R n].That is,t is equalto the result of this join.Since t is any arbitrary tuple in u,u⊆r11r21...1r n8.13Show that the decomposition in Practice Exercise8.1is not a dependency-preserving decomposition.Answer:The dependency B→D is not preserved.F1,the restrictionof F to(A,B,C)is A→ABC,A→AB,A→AC,A→BC,Exercises 15A →B ,A →C ,A →A ,B →B ,C →C ,AB →AC ,AB →ABC ,AB →BC ,AB →AB ,AB →A ,AB →B ,AB →C ,AC (same as AB ),BC (same as AB ),ABC (same as AB ).F 2,the restriction of F to (C ,D ,E )is A →ADE ,A →AD ,A →AE ,A →DE ,A →A ,A →D ,A →E ,D →D ,E (same as A ),AD ,AE ,DE ,ADE (same as A ).(F 1∪F 2)+is easily seen not to contain B →D since the only FD in F 1∪F 2with B as the left side is B →B ,a trivial FD .We shall see in Practice Exercise 8.15that B →D is indeed in F +.Thus B →D is not preserved.Note that C D →ABC DE is also not preserved.A simpler argument is as follows:F 1contains no dependencies with D on the right side of the arrow.F 2contains no dependencies withB on the left side of the arrow.Therefore for B →D to be preserved theremustbe an FD B →␣in F +1and ␣→D in F +2(so B →D would follow by transitivity).Since the intersection of the two schemes is A ,␣=A .Observe that B →A is not in F +1since B +=B D .8.14Show that it is possible to ensure that a dependency-preserving decom-position into 3NF is a lossless-join decomposition by guaranteeing that at least one schema contains a candidate key for the schema being decom-posed.(Hint :Show that the join of all the projections onto the schemas of the decomposition cannot have more tuples than the original relation.)Answer:Let F be a set of functional dependencies that hold on a schema R .Let ={R 1,R 2,...,R n }be a dependency-preserving 3NF decompo-sition of R .Let X be a candidate key for R .Consider a legal instance r of R .Let j = X (r )1 R 1(r )1 R 2(r ) (1)R n (r ).We want to prove that r =j .We claim that if t 1and t 2are two tuples in j such that t 1[X ]=t2[X ],then t 1=t 2.To prove this claim,we use the following inductive argument –Let F ′=F 1∪F 2∪...∪F n ,where each F i is the restriction of F to the schema R i in .Consider the use of the algorithm given in Figure 8.8to compute the closure of X under F ′.We use induction on the number of times that the f or loop in this algorithm is executed.•Basis :In the first step of the algorithm,result is assigned to X ,and hence given that t 1[X ]=t 2[X ],we know that t 1[result ]=t 2[result ]is true.•Induction Step :Let t 1[result ]=t 2[result ]be true at the end of thek th execution of the f or loop.Suppose the functional dependency considered in the k +1th execution of the f or loop is →␥,and that ⊆result .⊆result implies that t 1[]=t 2[]is true.The facts that →␥holds for some attribute set Ri in ,and that t 1[R i ]and t 2[R i ]are inR i (r )imply that t 1[␥]=t 2[␥]is also true.Since ␥is now added to result by the algorithm,we know that t 1[result ]=t 2[result ]is true at theend of the k +1th execution of the f or loop.16Chapter8Relational Database DesignSinceis dependency-preserving and X is a key for R,all attributes in Rare in result when the algorithm terminates.Thus,t1[R]=t2[R]is true,that is,t1=t2–as claimed earlier.Our claim implies that the size of X(j)is equal to the size of j.Notealso that X(j)= X(r)=r(since X is a key for R).Thus we haveproved that the size of j equals that of ing the result of PracticeExercise8.12,we know that r⊆j.Hence we conclude that r=j.Note that since X is trivially in3NF,∪{X}is a dependency-preservinglossless-join decomposition into3NF.8.15Give an example of a relation schema R′and set F′of functional depen-dencies such that there are at least three distinct lossless-join decompo-sitions of R′into BCNF.Answer:Given the relation R′=(A,B,C,D)the set of functionaldependencies F′=A→B,C→D,B→C allows three distinctBCNF decompositions.R1={(A,B),(C,D),(B,C)}is in BCNF as isR2={(A,B),(C,D),(A,C)}R2={(A,B),(C,D),(A,C)}R3={(B,C),(A,D),(A,B)}8.16Let a prime attribute be one that appears in at least one candidate key.Let␣andbe sets of attributes such that␣→holds,but→␣does not hold.Let A be an attribute that is not in␣,is not in,and forwhich→A holds.We say that A is transitively dependent on␣.Wecan restate our definition of3NF as follows:A relation schema R is in3NF with respect to a set F of functional dependencies if there are nononprime attributes A in R for which A is transitively dependent on akey for R.Show that this new definition is equivalent to the original one.Answer:Suppose R is in3NF according to the textbook definition.Weshow that it is in3NF according to the definition in the exercise.Let A bea nonprime attribute in R that is transitively dependent on a key␣forR.Then there exists⊆R such that→A,␣→,A∈␣,A∈,and→␣does not hold.But then→A violates the textbookdefinition of3NF since•A∈implies→A is nontrivial•Since→␣does not hold,is not a superkey•A is not any candidate key,since A is nonprimeExercises17 Now we show that if R is in3NF according to the exercise definition,it is in3NF according to the textbook definition.Suppose R is not in3NF according the the textbook definition.Then there is an FD␣→that fails all three conditions.Thus•␣→is nontrivial.•␣is not a superkey for R.•Some A in−␣is not in any candidate key.This implies that A is nonprime and␣→A.Let␥be a candidate key for R.Then␥→␣,␣→␥does not hold(since␣is not a superkey), A∈␣,and A∈␥(since A is nonprime).Thus A is transitively dependent on␥,violating the exercise definition.8.17A functional dependency␣→is called a partial dependency if thereis a proper subset␥of␣such that␥→.We say thatis partially dependent on␣.A relation schema R is in second normal form(2NF)if each attribute A in R meets one of the following criteria:•It appears in a candidate key.•It is not partially dependent on a candidate key.Show that every3NF schema is in2NF.(Hint:Show that every partial dependency is a transitive dependency.)Answer:Referring to the definitions in Practice Exercise8.16,a relation schema R is said to be in3NF if there is no non-prime attribute A in R for which A is transitively dependent on a key for R.We can also rewrite the definition of2NF given here as:“A relation schema R is in2NF if no non-prime attribute A is partially dependent on any candidate key for R.”To prove that every3NF schema is in2NF,it suffices to show that if a non-prime attribute A is partially dependent on a candidate key␣,thenA is also transitively dependent on the key␣.Let A be a non-prime attribute in R.Let␣be a candidate key for R.Suppose A is partially dependent on␣.•From the definition of a partial dependency,we know that for someproper subsetof␣,→A.•Since⊂␣,␣→.Also,→␣does not hold,since␣is acandidate key.•Finally,since A is non-prime,it cannot be in eitheror␣.Thus we conclude that␣→A is a transitive dependency.Hence we have proved that every3NF schema is also in2NF.8.18Give an example of a relation schema R and a set of dependencies suchthat R is in BCNF but is not in4NF.18Chapter8Relational Database DesignAnswer:R(A,B,C)A→→BExercises19result:=∅;/*fdcount is an array whose i th element contains the number of attributes on the left side of the i th FD that arenot yet known to be in␣+*/for i:=1to|F|dobeginlet→␥denote the i th FD;fdcount[i]:=||;end/*appears is an array with one entry for each attribute.The entry for attribute A is a list of integers.Each integeri on the list indicates that A appears on the left sideof the i th FD*/for each attribute A dobeginappears[A]:=NI L;for i:=1to|F|dobeginlet→␥denote the i th FD;if A∈then add i to appears[A];endendaddin(␣);return(result);procedure addin(␣);for each attribute A in␣dobeginif A∈result thenbeginresult:=result∪{A};for each element i of appears[A]dobeginfdcount[i]:=fdcount[i]−1;if fdcount[i]:=0thenbeginlet→␥denote the i th FD;addin(␥);endendendendFigure8.18.An algorithm to compute␣+.。
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第26章
©Silberschatz, Korth and Sudarshan See for conditions on re-use
With the growth of networks, and the existence of multiple autonomous
database systems, workflows provide a convenient way of carrying out tasks that involve multiple systems.
Provide infrastructure for building and administering complex transaction
processing systems with a large number of clients and multiple servers.
Provide services such as:
Some commercial TP monitors: CICS from IBM, Pathway from Tandem,
Top End from NCR, and Encina from Transarc
Database System Concepts - 6th Edition
26.3
©Silberschatz, Korth and Sudarshan
TP Monitor Architectures
Database System Concepts - 6th Edition
26.4
©Silberschatz, Korth and Sudarshan
数据库系统概念(database system concepts)英文第六版 PPT 第15章
requesting and releasing locks. Locking protocols restrict the set of possible schedules.
Database System Concepts - 6th Edition
15.5
©Silberschatz, Korth and Sudarshan
The Two-Phase Locking Protocol (Cont.)
There can be conflict serializable schedules that cannot be obtained if
two-phase locking is used.
However, in the absence of extra information (e.g., ordering of access
to data), two-phase locking is needed for conflict serializability in the following sense: Given a transaction Ti that does not follow two-phase locking, we can find a transaction Tj that uses two-phase locking, and a schedule for Ti and Tj that is not conflict serializable.
©Silberschatz, Korth and Sudarshan
Lock-Based Protocols (Cont.)
Lock-compatibility matrix
数据库系统概念(英文精编版.第六版)ch7E-RModel
instructor and student may have the attribute date which tracks when the instructor becomes the advisor of the student
Degree of a Relationship Set
Degree of the relationship set: Refers to number of eies in B can be associated with one entity in A? How many entities in A can be associated with one entity in B?
A
R
B
Mapping Cardinality(映射基数)
advisor relationship
instructor entity
A relationship set is a set of relationships of the same type.
Formally, it is a subset of
{(e1, e2, …, en) | e1 E1, e2 E2, …, en En}
entity in A is associated with any number (zero or more) of entities in B
An
entity in B is associated with any number (zero or more) of entities in A
Participation
An
entity in A is associated with any number (zero or more) (任意个,包括零个)of entities in B
数据库系统概念(英文精编版.第六版)
Attributes
Attribute Domain
The set of allowed values for each attribute is called the domain of the attribute
The special value null is a member of every domain
Instructor_schema = (ID, name, dept_name, salary)
r(R) is a relation on the relation schema R
We use lowercase names for relations.
Example: instructor (Instructor _schema)
2.5 Relational Query Languages
2.6 Relational Operations
Chapter 2: Relational Model
教学目的:
熟悉关系数据模型
教学重点:
简单属性、多值属性、复合属性
关系、关系模式、码等概念 关系模式图 关系代数的操作
?简单属性多值属性复合属性?关系关系模式码等概念?关系模式图?关系代数的操作?教学难点
Chapter 2: Relational Model
2.1 Structure of Relational Databases 2.2 Database Schema 2.3 Keys 2.4 Schema Diagrams
referenced relation
Exercise: 找出左图里面
存在的其他外码、参照关 系和被参照关系
数据库系统概念(database system concepts)英文第六版 课后练习题 答案 第17章
C H A P T E R17Database System ArchitecturesPractice Exercises17.1Instead of storing shared structures in shared memory,an alternativearchitecture would be to store them in the local memory of a specialprocess,and access the shared data by interprocess communicationwith the process.What would be the drawback of such an architecture?Answer:The drawbacks would be that two interprocess messageswould be required to acquire locks,one for the request and one toconfirm grant.Interprocess communication is much more expensivethan memory access,so the cost of locking would increase.The processstoring the shared structures could also become a bottleneck.The benefit of this alternative is that the lock table is protected betterfrom erroneous updates since only one process can access it.17.2In typical client–server systems the server machine is much more pow-erful than the clients;that is,its processor is faster,it may have multipleprocessors,and it has more memory and disk capacity.Consider in-stead a scenario where client and server machines have exactly thesame power.Would it make sense to build a client–server system insuch a scenario?Why?Which scenario would be better suited to adata-server architecture?Answer:With powerful clients,it still makes sense to have a client-server system,rather than a fully centralized system.If the data-serverarchitecture is used,the powerful clients can off-load all the long andcompute intensive transaction processing work from the server,freeingit to perform only the work of satisfying read-write requests.even ifthe transaction-server model is used,the clients still take care of theuser-interface work,which is typically very compute-intensive.A fully distributed system might seem attractive in the presence ofpowerful clients,but client-server systems still have the advantage ofsimpler concurrency control and recovery schemes to be implemented1718Chapter17Database System Architectureson the server alone,instead of having these actions distributed in allthe machines.17.3Consider a database system based on a client–server architecture,withthe server acting as a data server.a.What is the effect of the speed of the interconnection betweenthe client and the server on the choice between tuple and pageshipping?b.If page shipping is used,the cache of data at the client can beorganized either as a tuple cache or a page cache.The page cachestores data in units of a page,while the tuple cache stores data inunits of tuples.Assume tuples are smaller than pages.Describeone benefit of a tuple cache over a page cache.Answer:a.We assume that tuples are smaller than a page andfit in a page.If the interconnection link is slow it is better to choose tuple ship-ping,as in page shipping a lot of time will be wasted in shippingtuples that might never be needed.With a fast interconnectionthough,the communication overheads and latencies,not the ac-tual volume of data to be shipped,becomes the bottle neck.Inthis scenario page shipping would be preferable.b.Two benefits of an having a tuple-cache rather than a page-cache,even if page shipping is used,are:i.When a client runs out of cache space,it can replace objectswithout replacing entire pages.The reduced caching granu-larity might result in better cache-hit ratios.ii.It is possible for the server to ask clients to return some ofthe locks which they hold,but don’t need(lock de-escalation).Thus there is scope for greater concurrency.If page caching isused,this is not possible.17.4Suppose a transaction is written in C with embedded SQL,and about80percent of the time is spent in the SQL code,with the remaining20percent spent in C code.How much speedup can one hope to attain ifparallelism is used only for the SQL code?Explain.Answer:Since the part which cannot be parallelized takes20%of thetotal running time,the best speedup we can hope for has to be less than5.17.5Some database operations such as joins can see a significant differencein speed when data(for example,one of the relations involved in ajoin)fits in memory as compared to the situation where the data doesnotfit in memory.Show how this fact can explain the phenomenonof superlinear speedup,where an application sees a speedup greaterthan the amount of resources allocated to it.Answer:FILLPractice Exercises19 17.6Parallel systems often have a network structure where sets of n pro-cessors connect to a single Ethernet switch,and the Ethernet switches themselves connect to another Ethernet switch.Does this architecture correspond to a bus,mesh or hypercube architecture?If not,how would you describe this interconnection architecture?Answer:FILL。
数据库系统概念(database system concepts)英文第六版 PPT 第五章
"select * from instructor where name = ‟" + name + "‟" Suppose the user, instead of entering a name, enters: X‟ or ‟Y‟ = ‟Y then the resulting statement becomes: "select * from instructor where name = ‟" + "X‟ or ‟Y‟ = ‟Y" + "‟" which is: select * from instructor where name = ‟X‟ or ‟Y‟ = ‟Y‟ User could have even used X‟; update instructor set salary = salary + 10000; - Prepared statement internally uses: "select * from instructor where name = ‟X\‟ or \‟Y\‟ = \‟Y‟ Always use prepared statements, with user inputs as parameters
How is this useful?
Database System Concepts - 6th Edition
5.10
©Silberschatz, Korth and Sudarshan
Metadata (Cont)
Database metadata DatabaseMetaData dbmd = conn.getMetaData();
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
C H A P T E R23XMLPractice Exercises23.1Give an alternative representation of university information contain-ing the same data as in Figure23.1,but using attributes instead ofsubelements.Also give the DTD or XML S chema for this representation.Answer:a.The XML representation of data using attributes is shown in Figure23.100.b.The DTD for the bank is shown in Figure23.101.12Chapter23XML<university><department dept name=“Comp.Sci.”building=“Taylor”budget=“100000”></department><department dept name=“Biology”building=“Watson”budget=“90000”></department><course course id=“CS-101”title=“Intro.to Computer Science”dept name=“Comp.Sci.”credits=“4”></course><course course id=“BIO-301”title=“Genetics”dept name=“Biology.”credits=“4”></course><instructor IID=“10101”name=“Srinivasan”dept name=“Comp.Sci.”salary=“65000”></instructor><instructor IID=“83821”name=“Brandt”dept name=“Comp.Sci”salary=“92000”></instructor><instructor IID=“76766”name=“Crick”dept name=“Biology”salary=“72000”></instructor><teaches IID=“10101”course id=“CS-101”></teaches><teaches IID=“83821”course id=“CS-101”></teaches><teaches IID=“76766”course id=“BIO-301”></teaches></university>Figure23.100XML representation.23.2Give the DTD or XML S chema for an XML representation of the followingnested-relational schema:Emp=(ename,ChildrenSet setof(Children),SkillsSet setof(Skills))Children=(name,Birthday)Birthday=(day,month,year)Skills=(type,ExamsSet setof(Exams))Exams=(year,city)Practice Exercises3 Answer:Query:<!DOCTYPE db[<!ELEMENT emp(ename,children*,skills*)><!ELEMENT children(name,birthday)><!ELEMENT birthday(day,month,year)><!ELEMENT skills(type,exams+)><!ELEMENT exams(year,city)><!ELEMENT ename(#PCDATA)><!ELEMENT name(#PCDATA)><!ELEMENT day(#PCDATA)><!ELEMENT month(#PCDATA)><!ELEMENT year(#PCDATA)><!ELEMENT type(#PCDATA)><!ELEMENT city(#PCDATA)>]><!DOCTYPE university[<!ELEMENT department><!ATTLIST departmentdept name ID#REQUIREDbuilding CDATA#REQUIREDbudget CDATA#REQUIRED><!ELEMENT instructor><!ATTLIST instructorIID ID#REQUIREDname CDATA#REQUIREDdept name IDREF#REQUIRED>salary CDATA#REQUIRED><!ELEMENT course><!ATTLIST coursecourse id ID#REQUIREDtitle CDATA#REQUIREDdept name IDREF#REQUIRED>credits CDATA#REQUIRED><!ELEMENT teaches><!ATTLIST teachesIID IDREF#REQUIRED>course id IDREF#REQUIRED]>Figure23.101The DTD for the university.4Chapter23XML23.3Write a query in XP ath on the schema of Practice Exercise23.2to list allskill types in Emp.Answer:Code:/db/emp/skills/type23.4Write a query in XQ uery on the XML representation in Figure23.11tofind the total salary of all instructors in each department.Answer:Query:for$b in distinct(/university/department/dept name)return<dept-total><dept name>$b/text()</dept name>let$s:=sum(/university/instructor[dept name=$b]/salary)return<total-salary>$s</total-salary></dept-total>23.5Write a query in XQ uery on the XML representation in Figure23.1tocompute the left outer join of department elements with course ele-ments.(Hint:Use universal quantification.)Answer:Query:<lojoin>for$d in/university/department,$c in/university/coursewhere$c/dept name=$d/dept namereturn<dept-course>$d$c</dept-course>|for$d in/university/department,where every$c in/university/course satisfies(not($c/dept name=$d/dept name))return<dept-course>$c</dept-course></lojoin>23.6Write queries in XQ uery to output course elements with associatedinstructor elements nested within the course elements,given the uni-versity information representation using ID and IDREFS in Figure23.11.Practice Exercises5 Answer:The answer in XQuery is<university-2>for$c in/university/coursereturn<course><course id>$c/*</course id>for$a in$c/id(@instructors)return$a</course></university-2>23.7Give a relational schema to represent bibliographical information spec-ified according to the DTD fragment in Figure23.16.The relational schema must keep track of the order of author elements.You can as-sume that only books and articles appear as top-level elements in XML documents.Answer:Relation schema:book(bid,title,year,publisher,place)article(artid,title,journal,year,number,volume,pages)book author(bid,first name,last name,order)article author(artid,first name,last name,order)23.8Show the tree representation of the XML data in Figure23.1,and therepresentation of the tree using nodes and child relations described in Section23.6.2.Answer:The answer is shown in Figure23.102.nodes(1,element,university,–)nodes(2,element,department,–)nodes(3,element,department,–)nodes(4,element,course,–)nodes(5,element,course,–)nodes(6,element,instructor,–)nodes(7,element,instructor,–)nodes(8,element,instructor,–)nodes(9,element,teaches,–)nodes(10,element,teaches,–)nodes(11,element,teaches,–)child(2,1)child(3,1)child(4,1)child(5,1)child(6,1)child(7,1)child(8,1)child(9,1)Continued in Figure23.103Figure23.102Relational Representation of XML Data as Trees.6Chapter23XML23.9Consider the following recursive DTD:<!DOCTYPE parts[<!ELEMENT part(name,subpartinfo*)><!ELEMENT subpartinfo(part,quantity)><!ELEMENT name(#PCDATA)><!ELEMENT quantity(#PCDATA)>]>a.Give a small example of data corresponding to this DTD.b.Show how to map this DTD to a relational schema.You can as-sume that part names are unique;that is,wherever a part ap-pears,its subpart structure will be the same.c.Create a schema in XML S chema corresponding to this DTD.Answer:a.The answer is shown in Figure23.104.b.Show how to map this DTD to a relational schema.part(partid,name)subpartinfo(partid,subpartid,qty)Attributes partid and subpartid of subpartinfo are foreign keys topart.c.The XML S chema for the DTD is as follows:<xs:schema xmlns:xs=“/2001/XMLSchema”><xs:element name=“parts”type=“partsType”/><xs:complexType name=“partType”><xs:sequence><xs:element name=“name”type=“xs:string”/><xs:element name=“subpartinfo”type=“subpartinfoType”minOccurs=“0”maxOccurs=“unbounded”/></xs:sequence><xs:complexType name=“subpartinfoType”/><xs:sequence><xs:element name=“part”type=“partType”/><xs:element name=“quantity”type=“xs:string”/></xs:sequence></xs:schema>Practice Exercises7 child(10,1)child(11,1)nodes(12,element,dept name,Comp.Sci.)nodes(13,element,building,Taylor)nodes(14,element,budget,100000)child(12,2)child(13,2)child(14,2)nodes(15,element,dept name,Biology)nodes(16,element,building,Watson)nodes(17,element,budget,90000)child(15,3)child(16,3)child(17,3)nodes(18,element,course id,CS-101)nodes(19,element,title,Intro.to Computer Science)nodes(20,element,dept name,Comp.Sci.)nodes(21,element,credits,4)child(18,4)child(19,4)child(20,4)child(21,4)nodes(22,element,course id,BIO-301)nodes(23,element,title,Genetics)nodes(24,element,dept name,Biology)nodes(25,element,credits,4)child(22,5)child(23,5)child(24,5)child(25,5)nodes(26,element,IID,10101)nodes(27,element,name,Srinivasan)nodes(28,element,dept name,Comp.Sci.)nodes(29,element,salary,65000)child(26,6)child(27,6)child(28,6)child(29,6)nodes(30,element,IID,83821)nodes(31,element,name,Brandt)nodes(32,element,dept name,Comp.Sci.)nodes(33,element,salary,92000)child(30,7child(31,7)child(32,7)child(33,7)nodes(34,element,IID,76766)nodes(35,element,dept name,Biology)nodes(36,element,salary,72000)child(34,8)child(35,8)child(36,8)nodes(37,element,IID,10101)nodes(38,element,course id,CS-101)child(37,9)child(38,9)nodes(39,element,IID,83821)nodes(40,element,course id,CS-101)child(39,10)child(40,10)nodes(41,element,IID,76766)nodes(42,element,course id,BIO-301)child(41,11)child(42,11)Figure23.103Continuation of Figure23.102.8Chapter23XML<parts><part><name>bicycle</name><subpartinfo><part><name>wheel</name><subpartinfo><part><name>rim</name></part><qty>1</qty></subpartinfo><subpartinfo><part><name>spokes</name></part><qty>40</qty></subpartinfo><subpartinfo><part><name>tire</name></part><qty>1</qty></subpartinfo></part><qty>2</qty></subpartinfo><subpartinfo><part><name>brake</name></part><qty>2</qty></subpartinfo><subpartinfo><part><name>gear</name></part><qty>3</qty></subpartinfo><subpartinfo><part><name>frame</name></part><qty>1</qty></subpartinfo></part></parts>Figure23.104Example Parts Data in XML.。