Formal Languages and Automata (by Wang Bai and Yang Juan), BUPT Press: Answers to After-Class Exercises


Formal Languages and Automata, ch4.2

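Purpose: to make grammar design easier and to help normalize grammars. Effect: once ε-productions are eliminated, the grammar can no longer generate the string ε, but the generation of every other string in the language of the original grammar is unaffected.
Nullable symbol
For a CFG G = (N, T, P, S), a symbol A ∈ N is called nullable if and only if A ⇒* ε.
(3) Eliminate the useless symbols in G.
Note the order of the simplification steps above. Conclusion: suppose the language of a CFG G contains at least one non-ε string, and G1 is constructed from G by the steps above; then L(G1) = L(G) - {ε}.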
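The definition of a nullable symbol and the effect of removing ε-productions can be sketched as follows. This is a minimal illustration, not the textbook's own algorithm text: the dictionary encoding of productions and the example grammar (S → AB, A → aAA | ε, B → bBB | ε) are assumptions made for the sketch.

```python
from itertools import product

def nullable_symbols(productions):
    """Return the set of nonterminals A with A =>* ε.

    `productions` maps each nonterminal to a list of bodies; a body is a
    tuple of symbols and the empty tuple () stands for ε.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # A is nullable if some body consists only of nullable symbols
            # (this covers the empty body, i.e. A -> ε).
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def eliminate_epsilon(productions):
    """Build ε-free productions generating L(G) - {ε}."""
    nullable = nullable_symbols(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may independently be kept or dropped.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                new_body = tuple(s for part in choice for s in part)
                if new_body:  # never emit an empty body
                    new_bodies.add(new_body)
        result[head] = new_bodies
    return result

# Hypothetical example grammar: S -> AB, A -> aAA | ε, B -> bBB | ε
grammar = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
print(nullable_symbols(grammar))
print(eliminate_epsilon(grammar)["S"])
```

In this example every nonterminal is nullable, so S keeps the bodies AB, A and B; the unit productions S → A and S → B that appear are what the later unit-production step removes, which is one reason the order of the steps matters.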
College of Computer Science & Technology, BUPT
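§ 4.2 Transformations of context-free grammars
Simplifying a CFG: eliminate useless symbols, eliminate ε-productions, eliminate unit productions. Standardizing the form of productions.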

Standard forms of productions
Chomsky normal form (CNF): productions have the form A→BC or A→a, with A, B, C∈N and a∈T (it will be proved later that every context-free grammar has a CNF grammar).
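Algorithm 1: finding the useful nonterminals (diagram)
N0 = ∅
N' = { A | A→ω and ω ∈ T* }
N'' = N0 ∪ { B | B→α and α ∈ (T∪N')* }
[Figure: nested layers N0, N', N'' containing the nonterminals A1, A2, A3 and B1, B2]
Expand outward layer by layer until the two outermost layers are equal. The resulting set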
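The layer-by-layer expansion can be sketched as a fixed-point computation. The list-of-pairs encoding and the small grammar below are assumptions for illustration only; they do not come from the slides.

```python
def generating_layers(productions, terminals):
    """Return the successive layers N0, N', N'', ... of Algorithm 1.

    `productions` is a list of (head, body) pairs, where body is a string
    of one-character symbols; `terminals` is a set of characters.
    """
    layers = [set()]  # N0 = ∅
    while True:
        known = layers[-1]
        # A head joins the next layer if one of its bodies uses only
        # terminals and nonterminals already in the current layer.
        nxt = {head for head, body in productions
               if all(sym in terminals or sym in known for sym in body)}
        if nxt == known:  # the two outermost layers are equal: stop
            return layers
        layers.append(nxt)

# Hypothetical grammar: S -> AB, A -> a, B -> Ab, C -> Cc
example = [("S", "AB"), ("A", "a"), ("B", "Ab"), ("C", "Cc")]
for layer in generating_layers(example, set("abc")):
    print(sorted(layer))
```

Here C never enters any layer, because its only body contains C itself, so C cannot derive a terminal string.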

Homework reference answers 8

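Formal Languages and Automata: homework reference answers. Homework 8 (textbook pp. 182-183, problems 6, 8, 9, 10)

6. Let the grammar be G = ({S, C, D}, {0, 1}, P, S), where the productions P are:
S -> 0C | 1D
D -> 0 | 0S | 1DD
C -> 1 | 1S | 0CC
For the string 0001110011, find
(1) a leftmost derivation;
(2) a rightmost derivation;
(3) a derivation tree.

Answer:
(1) Leftmost derivation: S => 0C => 00CC => 000CCC => 0001CC => 00011C => 000111S => 0001110C => 00011100CC => 000111001C => 0001110011
(2) Rightmost derivation: S => 0C => 00CC => 00C1S => 00C10C => 00C100CC => 00C100C1 => 00C10011 => 000CC10011 => 000C110011 => 0001110011
(3) Derivation tree (each node is followed by its children in parentheses): S( 0, C( 0, C( 0, C(1), C(1) ), C( 1, S( 0, C( 0, C(1), C(1) ) ) ) ) )

8. Transform each of the following grammars G1 and G2 into an equivalent context-free grammar without useless symbols.
(1) G1:
S -> DC | ED
C -> CE | DC
D -> a
E -> aC | b
(2) G2:
S -> D | C
D -> aC | bS | b
C -> DC | Ca
E -> DS | b

Answer:
(1) G1: Algorithm 1 gives the useful symbols N1 = {S, D, E}; applying Algorithm 2 on that basis gives the final useful symbols N1 = {S, D, E}, T1 = {a, b}.
So an equivalent grammar without useless symbols is G = ({S, D, E}, {a, b}, P, S), where P is:
S -> ED
D -> a
E -> b
(2) G2: Algorithm 1 gives the useful symbols N1 = {S, D, E}; applying Algorithm 2 on that basis gives the final useful symbols N1 = {S, D}, T1 = {b}.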
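The two-step elimination used in problem 8 can be checked mechanically. The sketch below assumes that Algorithm 1 keeps the nonterminals that derive a terminal string and that Algorithm 2 keeps the symbols reachable from the start symbol; the function names and the dictionary encoding are choices made for this illustration.

```python
def generating(productions, terminals):
    """Algorithm 1: nonterminals that derive some terminal string."""
    gen = set()
    while True:
        new = {h for h, bodies in productions.items()
               if any(all(s in terminals or s in gen for s in b) for b in bodies)}
        if new == gen:
            return gen
        gen = new

def reachable(productions, start):
    """Algorithm 2: symbols reachable from the start symbol."""
    seen, todo = {start}, [start]
    while todo:
        head = todo.pop()
        for body in productions.get(head, []):
            for sym in body:
                if sym not in seen:
                    seen.add(sym)
                    todo.append(sym)
    return seen

def remove_useless(productions, terminals, start):
    gen = generating(productions, terminals)
    # Step 1: keep only productions whose head and body are generating.
    step1 = {h: [b for b in bodies if all(s in terminals or s in gen for s in b)]
             for h, bodies in productions.items() if h in gen}
    # Step 2: keep only what is still reachable from the start symbol.
    reach = reachable(step1, start)
    return {h: bodies for h, bodies in step1.items() if h in reach}

G1 = {"S": ["DC", "ED"], "C": ["CE", "DC"], "D": ["a"], "E": ["aC", "b"]}
G2 = {"S": ["D", "C"], "D": ["aC", "bS", "b"], "C": ["DC", "Ca"], "E": ["DS", "b"]}
print(remove_useless(G1, set("ab"), "S"))
print(remove_useless(G2, set("ab"), "S"))
```

For G2 the surviving productions are S -> D and D -> bS | b, which agrees with the symbol sets N1 = {S, D} and T1 = {b} stated in the answer.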

Application of finite automata in the analysis of automatic control software

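bay units) and the control system exchange information. With this protocol, in substations and plant stations, different relay protection devices (or bay units) and the control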
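Zhao Limin, female, master's student and lecturer, teaches in the Department of Computer Science at Dezhou University; main research interests: database technology and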
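In the design analysis, techniques from formal languages and automata were used. After a study of the whole IEC 60870-5-103 protocol, the communication flow between the master station and the substation is described by a state transition diagram, that is, a finite automaton, as shown in the figure below, where S denotes the states and f denotes the transition conditions.
[Figure: state transition diagram of the master station and substation communication]
First, the finite automaton obtained above is made deterministic and then minimized. The work done at each state node is implemented as one program segment, and the transfers between segments are handled by a "state discriminator", which forms the branches of the program. The finite automaton is then implemented in a programming language. The program design for this project, whose main functions are remote signaling, telemetry, remote control and remote regulation, follows this method.
This corresponds to branching and looping, the two most important program structures in program design. The approach keeps the whole software design process clear, and the resulting programs also conform to the principles of structured design.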
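The scheme described above, one program segment per state plus a single "state discriminator" that selects the next segment, might be sketched as follows. The state names, events and handler actions are entirely hypothetical; the article's actual IEC 60870-5-103 state diagram is not reproduced in this text.

```python
# Hypothetical transition table: (state, condition) -> next state.
TRANSITIONS = {
    ("IDLE", "link_ok"): "POLLING",
    ("POLLING", "data_ready"): "READING",
    ("POLLING", "timeout"): "IDLE",
    ("READING", "done"): "POLLING",
}

# One program segment per state (placeholders that only record an action).
def on_idle(log):
    log.append("reset link")

def on_polling(log):
    log.append("request data")

def on_reading(log):
    log.append("process telemetry")

HANDLERS = {"IDLE": on_idle, "POLLING": on_polling, "READING": on_reading}

def run(events, start="IDLE"):
    """Drive the automaton over a sequence of transition conditions."""
    state, log = start, []
    HANDLERS[state](log)
    for event in events:
        # The "state discriminator": pick the next state from the table;
        # an event with no entry leaves the state unchanged.
        state = TRANSITIONS.get((state, event), state)
        HANDLERS[state](log)
    return state, log

final_state, actions = run(["link_ok", "data_ready", "done", "timeout"])
print(final_state, actions)
```

Keeping the transitions in a table rather than in nested conditionals means the deterministic, minimized automaton can be changed without touching the per-state segments.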

Formal Languages and Automata, ch3.4-3.6 materials

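Let an ε-NFA be given, with δ: Q × (T ∪ {ε}) → 2^Q.
Extended definition δ': Q × T* → 2^Q. For any q ∈ Q, define:
1. δ'(q, ε) = ECLOSE(q)
2. δ'(q, ωa) = ε-CLOSURE(P), where P = { p | there exists r ∈ δ'(q, ω) with p ∈ δ(r, a) }
Note: in this case δ(q, a) ≠ δ'(q, a).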
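The two defining clauses translate almost directly into code. The sketch below is an illustration under assumed conventions: transitions are stored in a dictionary keyed by (state, symbol), the empty string stands for ε, and the four-transition automaton is made up for the example.

```python
EPS = ""  # label used for ε-moves in this sketch

def eclose(delta, states):
    """ε-closure of a set of states."""
    closure, todo = set(states), list(states)
    while todo:
        q = todo.pop()
        for p in delta.get((q, EPS), set()):
            if p not in closure:
                closure.add(p)
                todo.append(p)
    return closure

def delta_hat(delta, q, w):
    """Extended transition function δ'(q, w)."""
    current = eclose(delta, {q})            # δ'(q, ε) = ECLOSE(q)
    for a in w:
        moved = set()
        for r in current:                   # P = {p | r ∈ δ'(q, ω), p ∈ δ(r, a)}
            moved |= delta.get((r, a), set())
        current = eclose(delta, moved)      # δ'(q, ωa) = ε-CLOSURE(P)
    return current

# Hypothetical ε-NFA: q0 -ε-> q1, q1 -a-> q1, q1 -ε-> q2, q2 -b-> q2
delta = {
    ("q0", EPS): {"q1"},
    ("q1", "a"): {"q1"},
    ("q1", EPS): {"q2"},
    ("q2", "b"): {"q2"},
}
print(delta.get(("q0", "a"), set()))   # δ(q0, a)
print(delta_hat(delta, "q0", "a"))     # δ'(q0, a)
```

In this example δ(q0, a) is empty while δ'(q0, a) is {q1, q2}, which illustrates why the two functions must be kept apart.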
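Regular expression
Regular expressions are defined inductively as follows.
Basis: ε, ∅ and a (a∈T) are regular expressions (atomic regular expressions); the corresponding regular sets are {ε}, ∅ and {a}.
Induction: if A and B are regular expressions denoting the sets L(A) and L(B) respectively, then (A+B), (A.B) and A* are also regular expressions, denoting the following regular sets:
L(A) ∪ L(B): union (the strings of language A or of language B)
L(A).L(B): concatenation (concatenations of strings from the two languages)
L(A)*: (star) closure (repeated concatenation of strings of language A)
Only expressions obtained by applying the two steps above a finite number of times are regular expressions over the alphabet T. The sets of strings denoted by these regular expressions are the regular sets over T.
2020/5/25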
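Closure of a language
The closure of a language L:
L* = { w_1 w_2 … w_n | w_i ∈ L, n ≥ 0 }, the concatenations of any n strings of L,
or L* = L^0 ∪ L^1 ∪ L^2 ∪ … = ∪_{i≥0} L^i, where L^0 = {ε} and L^i = L^(i-1).L for i ≥ 1.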
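Since L* is usually infinite, a program can only enumerate it up to a length bound. The sketch below, with function names chosen for this illustration, builds L^0, L^1, L^2, … and keeps the strings of length at most max_len.

```python
def concat(l1, l2):
    """L1.L2: every string of L1 followed by every string of L2."""
    return {u + v for u in l1 for v in l2}

def closure_upto(lang, max_len):
    """All strings of L* whose length is at most max_len."""
    result = {""}      # L^0 = {ε}
    frontier = {""}
    while frontier:
        # Append one more string of L, dropping anything too long or already seen.
        frontier = {w for w in concat(frontier, lang) if len(w) <= max_len} - result
        result |= frontier
    return result

print(sorted(closure_upto({"a", "bb"}, 3)))
```

Union needs no helper because it is ordinary set union; the loop stops once a round adds no new string within the bound.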

Formal Languages and Automata, Chapter 7: reference solutions to exercises

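Chapter 7: reference solutions to exercises

Exercise 7.1.3 Begin with the grammar:
S → 0A0 | 1B1 | BB
A → C
B → S | A
C → S | ε
a) Are there any useless symbols? If so, eliminate them.
b) Eliminate ε-productions.
c) Eliminate unit productions.
d) Put the grammar into Chomsky normal form.

Reference solution:
a) There are no useless symbols.
b) All of the symbols S, A, B, C are nullable. Eliminating ε-productions gives the new set of productions:
S → 0A0 | 1B1 | BB | B | 00 | 11
A → C
B → S | A
C → S
c) The unit pairs are: (A,A), (B,B), (C,C), (S,S), (A,C), (A,S), (A,B), (B,A), (B,C), (B,S), (C,A), (C,B), (C,S), (S,A), (S,B), (S,C). Eliminating unit productions gives the new set of productions:
S → 0A0 | 1B1 | BB | 00 | 11
A → 0A0 | 1B1 | BB | 00 | 11
B → 0A0 | 1B1 | BB | 00 | 11
C → 0A0 | 1B1 | BB | 00 | 11
d) First eliminate the useless symbol C, which gives:
S → 0A0 | 1B1 | BB | 00 | 11
A → 0A0 | 1B1 | BB | 00 | 11
B → 0A0 | 1B1 | BB | 00 | 11
Introduce nonterminals C and D with the productions C → 0 and D → 1, which gives:
S → CAC | DBD | BB | CC | DD
A → CAC | DBD | BB | CC | DD
B → CAC | DBD | BB | CC | DD
C → 0
D → 1
Introduce nonterminals E and F with the productions E → CA and F → DB, which gives a set of productions in Chomsky normal form:
S → EC | FD | BB | CC | DD
A → EC | FD | BB | CC | DD
B → EC | FD | BB | CC | DD
E → CA
F → DB
C → 0
D → 1

Exercise 7.2.1(b) Use the CFL pumping lemma to show that the following language is not context-free: b) {a^n b^n c^i | i ≤ n}.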

Formal Languages and Automata, ch4.1

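Reduction and derivation
Example of a derivation
For the CFG Gexp = ({E, O}, {(, ), +, *, v, d}, P, E), P is:
(1) E → EOE  (2) E → (E)  (3) E → v  (4) E → d  (5) O → +  (6) O → *
A derivation of v*(v+d), with the production used at each step (read from right to left, the same sequence is a reduction of v*(v+d) to E):
E ⇒(1) EOE ⇒(3) vOE ⇒(2) vO(E) ⇒(1) vO(EOE) ⇒(5) vO(E+E) ⇒(3) vO(v+E) ⇒(6) v*(v+E) ⇒(4) v*(v+d)
[Figure: derivation tree of v*(v+d)]

Ambiguity
Definition: a type-2 grammar is ambiguous if and only if, for some sentence ω∈L(G), there exist two different derivation trees whose yield is ω. (That is, if a grammar is ambiguous, then some sentence it generates can be obtained by different leftmost (rightmost) derivations.)
Example (textbook p. 124, Example 1): the sentence (a*a+a) has two different derivation trees (one corresponds to multiplying first, the other to adding first).
Note: there can be two grammars, one ambiguous and one unambiguous, that generate the same language.
Can ambiguity be eliminated by a transformation? There is no general algorithm!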
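Ambiguity of the expression grammar with productions E → EOE | (E) | v | d and O → + | * can be made concrete by counting derivation trees. The sketch below is an illustration written for this grammar only (the function name and the one-character token encoding are assumptions, and the second operator is taken to be *); it counts trees span by span.

```python
from functools import lru_cache

def count_trees(sentence):
    """Number of distinct derivation trees of `sentence` from E."""
    @lru_cache(maxsize=None)
    def trees_e(i, j):
        # Trees deriving sentence[i:j] from E.
        span = sentence[i:j]
        total = 0
        if span in ("v", "d"):                       # E -> v, E -> d
            total += 1
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":
            total += trees_e(i + 1, j - 1)           # E -> (E)
        for k in range(i + 1, j - 1):                # E -> E O E, O at position k
            if sentence[k] in "+*":
                total += trees_e(i, k) * trees_e(k + 1, j)
        return total
    return trees_e(0, len(sentence))

print(count_trees("v*(v+d)"))
print(count_trees("v+v*v"))
```

A sentence with a count of 2 or more witnesses the ambiguity: for v+v*v one tree groups the addition first and the other groups the multiplication first, whereas the parentheses in v*(v+d) leave only one tree.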

Formal Languages and Automata, Chapter 5: reference solutions to exercises

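Chapter 5: reference solutions to exercises

Exercise 5.1.2(c) The following grammar generates the language of the regular expression 0*1(0+1)*:
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Give a leftmost and a rightmost derivation of the following string: c) 00011.

Reference solution:
A leftmost derivation: S ⇒lm A1B ⇒lm 0A1B ⇒lm 00A1B ⇒lm 000A1B ⇒lm 0001B ⇒lm 00011B ⇒lm 00011
A rightmost derivation: S ⇒rm A1B ⇒rm A11B ⇒rm A11 ⇒rm 0A11 ⇒rm 00A11 ⇒rm 000A11 ⇒rm 00011

! Exercise 5.1.3 Show that every regular language is a context-free language.
Hint: construct a CFG by induction on the number of operators in the regular expression.

Reference solution: for any regular expression R, construct by induction on the number n of operators in R the following set of productions P(R), with start symbol S(R).
Basis: n = 0.
(1) R is ε: choose any nonterminal A, let P(R) contain only A → ε, and let S(R) be A;
(2) R is φ: let P(R) be the empty set;
(3) R is a: choose any nonterminal A, let P(R) contain only A → a, and let S(R) be A.
Induction: n > 0.
(1) R is R1+R2: rename nonterminals as needed so that P(R1) and P(R2) have no nonterminal name in common, choose a nonterminal A that does not occur in P(R1) ∪ P(R2), let P(R) = P(R1) ∪ P(R2) ∪ { A → S(R1), A → S(R2) }, and let S(R) be A;
(2) R is R1R2: rename nonterminals as needed so that P(R1) and P(R2) have no nonterminal name in common, choose a nonterminal A that does not occur in P(R1) ∪ P(R2), let P(R) = P(R1) ∪ P(R2) ∪ { A → S(R1)S(R2) }, and let S(R) be A;
(3) R is R1*: choose a nonterminal A that does not occur in P(R1), let P(R) = P(R1) ∪ { A → AS(R1), A → ε }, and let S(R) be A.
Let L be a regular language and R a regular expression with L = L(R). Let the context-free grammar G have as its productions the set P(R) obtained by the induction above, and S(R) as its start symbol. It can be proved by induction that L(G) = L(R) = L.

! Exercise 5.1.4 (optional) If every production body of a CFG contains at most one variable, and that variable is always at the right end, the CFG is called right-linear.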

Formal Languages and Automata, Second Edition (English edition): Answers to After-Class Exercises

Exercise 2.3.1
Here are the sets of NFA states represented by each of the DFA states A through H: A = {p}; B = {p,q}; C = {p,r}; D = {p,q,r}; E = {p,q,s}; F = {p,q,r,s}; G = {p,r,s}; H = {p,s}.

Exercise 2.3.4(a)
The idea is to use a state qi, for i = 0,1,...,9, to represent the idea that we have seen an input i and guessed that this is the repeated digit at the end. We also have state qs, the initial state, and qf, the final state. We stay in state qs all the time; it

Solutions for Section 2.4

Exercise 2.4.1(a)
We'll use q0 as the start state. q1, q2, and q3 will recognize abc; q4, q5, and q6 will recognize abd, and q7 through q10 will recognize aacd. The transition table is:

Exercise 2.4.2(a)
The subset construction gives us the following states, each representing the subset of the NFA states indicated: A = {q0}; B = {q0,q1,q4,q7}; C = {q0,q1,q4,q7,q8}; D = {q0,q2,q5}; E = {q0,q9}; F = {q0,q3}; G = {q0,q6}; H = {q0,q10}. Note that F, G and H can be combined into one accepting state, or we can use these three states to signal the recognition of abc, abd, and aacd, respectively.

Solutions for Section 2.5

Exercise 2.5.1
For part (a): the closure of p is just {p}; for q it is {p,q}, and for r it is {p,q,r}.
For (b), begin by noticing that a always leaves the state unchanged. Thus, we can think of the effect of strings of b's and c's only. To begin, notice that the only ways to get from p to r for the first time, using only b, c, and ε-transitions, are bb, bc, and c. After getting to r, we can return to r reading either b or c. Thus, every string of length 3 or less, consisting of b's and c's only, is accepted, with the exception of the string b. However, we have to allow a's as well. When we try to insert a's in these strings, yet keeping the length to 3 or less, we find that every string of a's, b's, and c's with at most one a is accepted. Also, the strings consisting of one c and up to 2 a's are accepted; other strings are rejected.
There are three DFA states accessible from the initial state, which is the ε closure of p, or {p}. Let A = {p}, B = {p,q}, and C = {p,q,r}. Then the transition table is:

Solutions for Section 3.1

Exercise 3.1.1(a)
The simplest approach is to consider those strings in which the first a precedes the first b separately from those where the opposite occurs. The expression:
c*a(a+c)*b(a+b+c)* + c*b(b+c)*a(a+b+c)*

Exercise 3.1.2(a)
(Revised 9/5/05) The trick is to start by writing an expression for the set of strings that have no two adjacent 1's. Here is one such expression: (10+0)*(ε+1)
To see why this expression works, the first part consists of all strings in which every 1 is followed by a 0. To that, we have only to add the possibility that there is a 1 at the end, which will not be followed by a 0. That is the job of (ε+1).
Now, we can rethink the question as asking for strings that have a prefix with no adjacent 1's followed by a suffix with no adjacent 0's. The former is the expression we developed, and the latter is the same expression, with 0 and 1 interchanged. Thus, a solution to this problem is (10+0)*(ε+1)(01+1)*(ε+0). Note that the ε+1 term in the middle is actually unnecessary, as a 1 matching that factor can be obtained from the (01+1)* factor instead.

Exercise 3.1.4(a)
This expression is another way to write "no adjacent 1's." You should compare it with the different-looking expression we developed in the solution to Exercise 3.1.2(a). The argument for why it works is similar. (00*1)* says every 1 is preceded by at least one 0. 0* at the end allows 0's after the final 1, and (ε+1) at the beginning allows an initial 1, which must be either the only symbol of the string or followed by a 0.

Exercise 3.1.5
The language of the regular expression ε.
Note that ε* denotes the language of strings consisting of any number of empty strings, concatenated, but that is just the set containing the empty string.

Solutions for Section 3.2

Exercise 3.2.1
Part (a): The following are all R^(0) expressions; we list only the subscripts. R11 = ε+1; R12 = 0; R13 = phi; R21 = 1; R22 = ε; R23 = 0; R31 = phi; R32 = 1; R33 = ε+0.
Part (b): Here all expression names are R^(1); we again list only the subscripts. R11 = 1*; R12 = 1*0; R13 = phi; R21 = 11*; R22 = ε+11*0; R23 = 0; R31 = phi; R32 = 1; R33 = ε+0.
Part (e): Here is the transition diagram:
If we eliminate state q2 we get:
Applying the formula in the text, the expression for the ways to get from q1 to q3 is: [1 + 01 + 00(0+10)*11]*00(0+10)*

Exercise 3.2.4(a)

Exercise 3.2.6(a)
(Revised 1/16/02) LL* or L+.

Exercise 3.2.6(b)
The set of suffixes of strings in L.

Exercise 3.2.8
Let R^(k)_ijm be the number of paths from state i to state j of length m that go through no state numbered higher than k. We can compute these numbers, for all states i and j, and for m no greater than n, by induction on k.
Basis: R^(0)_ij1 is the number of arcs (or more precisely, arc labels) from state i to state j. R^(0)_ii0 = 1, and all other R^(0)_ijm's are 0.
Induction: R^(k)_ijm is the sum of R^(k-1)_ijm and the sum over all lists (p1,p2,...,pr) of positive integers that sum to m, of R^(k-1)_ikp1 * R^(k-1)_kkp2 * R^(k-1)_kkp3 * ... * R^(k-1)_kkp(r-1) * R^(k-1)_kjpr. Note r must be at least 2.
The answer is the sum of R^(k)_1jn, where k is the number of states, 1 is the start state, and j is any accepting state.

Solutions for Section 3.4

Exercise 3.4.1(a)
Replace R by {a} and S by {b}. Then the left and right sides become {a} union {b} = {b} union {a}. That is, {a,b} = {b,a}. Since order is irrelevant in sets, both languages are the same: the language consisting of the strings a and b.

Exercise 3.4.1(f)
Replace R by {a}. The right side becomes {a}*, that is, all strings of a's, including the empty string. The left side is ({a}*)*, that is, all strings consisting of the concatenation of strings of a's. But that is just the set of strings of a's, and is therefore equal to the right side.

Exercise 3.4.2(a)
Not the same. Replace R by {a} and S by {b}. The left side becomes all strings of a's and b's (mixed), while the right side consists only of strings of a's (alone) and strings of b's (alone). A string like ab is in the language of the left side but not the right.

Exercise 3.4.2(c)
Also not the same. Replace R by {a} and S by {b}. The right side consists of all strings composed of zero or more occurrences of strings of the form a...ab, that is, one or more a's ended by one b. However, every string in the language of the left side has to end in ab. Thus, for instance, ε is in the language on the right, but not on the left.

Solutions for Section 4.1

Exercise 4.1.1(c)
Let n be the pumping-lemma constant (note this n is unrelated to the n that is a local variable in the definition of the language L). Pick w = 0^n 1 0^n. Then when we write w = xyz, we know that |xy| <= n, and therefore y consists of only 0's. Thus, xz, which must be in L if L is regular, consists of fewer than n 0's, followed by a 1 and exactly n 0's. That string is not in L, so we contradict the assumption that L is regular.

Exercise 4.1.2(a)
Let n be the pumping-lemma constant and pick w = 0^(n^2), that is, n^2 0's. When we write w = xyz, we know that y consists of between 1 and n 0's. Thus, xyyz has length between n^2 + 1 and n^2 + n. Since the next perfect square after n^2 is (n+1)^2 = n^2 + 2n + 1, we know that the length of xyyz lies strictly between the consecutive perfect squares n^2 and (n+1)^2. Thus, the length of xyyz cannot be a perfect square.
But if the language were regular, then xyyz would be in the language, which contradicts the assumption that the language of strings of 0's whose length is a perfect square is a regular language.

Exercise 4.1.4(a)
We cannot pick w from the empty language.

Exercise 4.1.4(b)
If the adversary picks n = 3, then we cannot pick a w of length at least n.

Exercise 4.1.4(c)
The adversary can pick an n > 0, so we have to pick a nonempty w. Since w must consist of pairs 00 and 11, the adversary can pick y to be one of those pairs. Then whatever i we pick, xy^i z will consist of pairs 00 and 11, and so belongs in the language.

Solutions for Section 4.2

Exercise 4.2.1(a)
aabbaa.

Exercise 4.2.1(c)
The language of regular expression a(ab)*ba.

Exercise 4.2.1(e)
Each b must come from either 1 or 2. However, if the first b comes from 2 and the second comes from 1, then they will both need the a between them as part of h(2) and h(1), respectively. Thus, the inverse homomorphism consists of the strings {110, 102, 022}.

Exercise 4.2.2
Start with a DFA A for L. Construct a new DFA B, that is exactly the same as A, except that state q is an accepting state of B if and only if δ(q,a) is an accepting state of A. Then B accepts input string w if and only if A accepts wa; that is, L(B) = L/a.

Exercise 4.2.5(b)
We shall use D_a for "the derivative with respect to a." The key observation is that if epsilon is not in L(R), then the derivative of RS will always remove an a from the portion of a string that comes from R. However, if epsilon is in L(R), then the string might have nothing from R and will remove a from the beginning of a string in L(S) (which is also a string in L(RS)). Thus, the rule we want is:
If epsilon is not in L(R), then D_a(RS) = (D_a(R))S. Otherwise, D_a(RS) = D_a(R)S + D_a(S).

Exercise 4.2.5(e)
L may have no string that begins with 0.

Exercise 4.2.5(f)
This condition says that whenever 0w is in L, then w is in L, and vice-versa. Thus, L must be of the form L(0*)M for some language M (not necessarily a regular language) that has no string beginning with 0.
In proof, notice first that D_0(L(0*)M) = D_0(L(0*))M union D_0(M) = L(0*)M. There are two reasons for the last step. First, observe that D_0 applied to the language of all strings of 0's gives all strings of 0's, that is, L(0*). Second, observe that because M has no string that begins with 0, D_0(M) is the empty set [that's part (e)].
We also need to show that every language N that is unchanged by D_0 is of this form. Let M be the set of strings in N that do not begin with 0. If N is unchanged by D_0, it follows that for every string w in M, 00...0w is in N; thus, N includes all the strings of L(0*)M. However, N cannot include a string that is not in L(0*)M. If x were such a string, then we can remove all the 0's at the beginning of x and get some string y that is also in N. But y must also be in M.

Exercise 4.2.8
Let A be a DFA for L. We construct DFA B for half(L). The state of B is of the form [q,S], where:
∙ q is the state A would be in after reading whatever input B has read so far.
∙ S is the set of states of A such that A can get from exactly these states to an accepting state by reading any input string whose length is the same as the length of the string B has read so far.
It is important to realize that it is not necessary for B to know how many inputs it has read so far; it keeps this information up-to-date each time it reads a new symbol. The rule that keeps things up to date is: δB([q,S],a) = [δA(q,a),T], where T is the set of states p of A such that there is a transition from p to any state of S on any input symbol. In this manner, the first component continues to simulate A, while the second component now represents states that can reach an accepting state following a path that is one longer than the paths represented by S.
To complete the construction of B, we have only to specify:
∙ The initial state is [q0,F], that is, the initial state of A and the accepting states of A. This choice reflects the situation when A has read 0 inputs: it is still in its initial state, and the accepting states are exactly the ones that can reach an accepting state on a path of length 0.
∙ The accepting states of B are those states [q,S] such that q is in S. The justification is that it is exactly these states that are reached by some string of length n, and there is some other string of length n that will take state q to an accepting state.

Exercise 4.2.13(a)
Start out by complementing this language. The result is the language consisting of all strings of 0's and 1's that are not in 0*1*, plus the strings in L0n1n. If we intersect with 0*1*, the result is exactly L0n1n. Since complementation and intersection with a regular set preserve regularity, if the given language were regular then so would be L0n1n. Since we know the latter is false, we conclude the given language is not regular.

Exercise 4.2.14(c)
Change the accepting states to be those for which the first component is an accepting state of A_L and the second is a nonaccepting state of A_M. Then the resulting DFA accepts if and only if the input is in L - M.

Solutions for Section 4.3

Exercise 4.3.1
Let n be the pumping-lemma constant. Test all strings of length between n and 2n-1 for membership in L. If we find even one such string, then L is infinite. The reason is that the pumping lemma applies to such a string, and it can be "pumped" to show an infinite sequence of strings are in L.
Suppose, however, that there are no strings in L whose length is in the range n to 2n-1.
We claim there are no strings in L of length 2n or more, and thus there are only a finite number of strings in L. In proof, suppose w is a string in L of length at least 2n, and w is as short as any string in L that has length at least 2n. Then the pumping lemma applies to w, and we can write w = xyz, where xz is also in L. How long could xz be? It can't be as long as 2n, because it is shorter than w, and w is as short as any string in L of length 2n or more. It is at least n, because xz is at most n shorter than w. Thus, xz is of length between n and 2n-1, which is a contradiction, since we assumed there were no strings in L with a length in that range.

Solutions for Section 4.4

Exercise 4.4.1
Revised 10/23/01.
B|x
C|x x
D|x x x
E|x x x
F|x x x x
G| x x x x x
H|x x x x x x x
---------------
  A B C D E F G
Note, however, that state H is inaccessible, so it should be removed, leaving the first four states as the minimum-state DFA.

Solutions for Section 5.1

Exercise 5.1.1(a)
S -> 0S1 | 01

Exercise 5.1.1(b)
S -> AB | CD
A -> aA | ε
B -> bBc | E | cD
C -> aCb | E | aA
D -> cD | ε
E -> bE | b
To understand how this grammar works, observe the following:
∙ A generates zero or more a's.
∙ D generates zero or more c's.
∙ E generates one or more b's.
∙ B first generates an equal number of b's and c's, then produces either one or more b's (via E) or one or more c's (via cD). That is, B generates strings in b*c* with an unequal number of b's and c's.
∙ Similarly, C generates unequal numbers of a's then b's.
∙ Thus, AB generates strings in a*b*c* with an unequal number of b's and c's, while CD generates strings in a*b*c* with an unequal number of a's and b's.

Exercise 5.1.2(a)
Leftmost: S => A1B => 0A1B => 00A1B => 001B => 0010B => 00101B => 00101
Rightmost: S => A1B => A10B => A101B => A101 => 0A101 => 00A101 => 00101

Exercise 5.1.5
S -> S+S | SS | S* | (S) | 0 | 1 | phi | e
The idea is that these productions for S allow any expression to be, respectively, the sum (union) of two expressions, the concatenation of two expressions, the star of an expression, a parenthesized expression, or one of the four basis cases of expressions: 0, 1, phi, and ε.

Solutions for Section 5.2

Exercise 5.2.1(a)
      S
    / | \
   A  1  B
  / |   / |
 0  A  0  B
   / |   / |
  0  A  1  B
     |     |
     e     e
In the above tree, e stands for ε.

Solutions for Section 5.3

Exercise 5.3.2
B -> BB | (B) | [B] | ε

Exercise 5.3.4(a)
Change production (5) to:
ListItem -> <LI> Doc </LI>

Solutions for Section 5.4

Exercise 5.4.1
Here are the parse trees:
    S                S
   / |            / / | \
  a  S           a S  b  S
   / | \ \         | \   |
  a  S  b  S       a  S  e
     |     |          |
     e     e          e
The two leftmost derivations are: S => aS => aaSbS => aabS => aab and S => aSbS => aaSbS => aabS => aab.
The two rightmost derivations are: S => aS => aaSbS => aaSb => aab and S => aSbS => aSb => aaSb => aab.

Exercise 5.4.3
The idea is to introduce another nonterminal T that cannot generate an unbalanced a. That strategy corresponds to the usual rule in programming languages that an "else" is associated with the closest previous, unmatched "then." Here, we force a b to match the previous unmatched a. The grammar:
S -> aS | aTbS | ε
T -> aTbT | ε

Exercise 5.4.6
Alas, it is not. We need to have three nonterminals, corresponding to the three possible "strengths" of expressions:
1. A factor cannot be broken by any operator. These are the basis expressions, parenthesized expressions, and these expressions followed by one or more *'s.
2. A term can be broken only by a *. For example, consider 01, where the 0 and 1 are concatenated, but if we follow it by a *, it becomes 0(1*), and the concatenation has been "broken" by the *.
3. An expression can be broken by concatenation or *, but not by +. An example is the expression 0+1.
Note that if we concatenate (say) 1 or follow by a *, we parse the expression 0+(11) or 0+(1*), and in either case the union has been broken.
The grammar:
E -> E+T | T
T -> TF | F
F -> F* | (E) | 0 | 1 | phi | e

Solutions for Section 6.1

Exercise 6.1.1(a)
(q,01,Z0) |- (q,1,XZ0) |- (q,ε,XZ0) |- (p,ε,Z0) |- (p,1,Z0) |- (p,ε,ε)

Solutions for Section 6.2

Exercise 6.2.1(a)
We shall accept by empty stack. Symbol X will be used to count the 0's on the input. In state q, the start state, where we have seen no 1's, we add an X to the stack for each 0 seen. The first X replaces Z0, the start symbol. When we see a 1, we go to state p, and then only pop the stack, one X for each input 1. Formally, the PDA is ({q,p},{0,1},{X,Z0},δ,q,Z0). The rules:
1. δ(q,0,Z0) = {(q,X)}
2. δ(q,0,X) = {(q,XX)}
3. δ(q,1,X) = {(p,ε)}
4. δ(p,1,X) = {(p,ε)}

Exercise 6.2.2(a)
Revised 6/20/02.
Begin in start state q0, with start symbol Z0, and immediately guess whether to check for:
1. i=j=0 (state q1).
2. i=j>0 (state q2).
3. j=k (state q3).
We shall accept by final state; as seen below, the accepting states are q1 and q3. The rules, and their explanations:
∙ δ(q0,ε,Z0) = {(q1,Z0), (q2,Z0), (q3,Z0)}, the initial guess.
∙ δ(q1,c,Z0) = {(q1,Z0)}. In case (1), we assume there are no a's or b's, and we consume all c's. State q1 will be one of our accepting states.
∙ δ(q2,a,Z0) = {(q2,XZ0)}, and δ(q2,a,X) = {(q2,XX)}. These rules begin case (2). We use X to count the number of a's read from the input, staying in state q2.
∙ δ(q2,b,X) = δ(q4,b,X) = {(q4,ε)}. When b's are seen, we go to state q4 and pop X's against the b's.
∙ δ(q4,ε,Z0) = {(q1,Z0)}. If we reach the bottom-of-stack marker in state q4, we have seen an equal number of a's and b's. We go spontaneously to state q1, which will accept and consume all c's, while continuing to accept.
∙ δ(q3,a,Z0) = {(q3,Z0)}. This rule begins case (3). We consume all a's from the input. Since j=k=0 is possible, state q3 must be an accepting state.
∙ δ(q3,b,Z0) = {(q5,XZ0)}. When b's arrive, we start counting them and go to state q5, which is not an accepting state.
∙ δ(q5,b,X) = {(q5,XX)}. We continue counting b's.
∙ δ(q5,c,X) = δ(q6,c,X) = {(q6,ε)}. When c's arrive, we go to state q6 and match the c's against the b's.
∙ δ(q6,ε,Z0) = {(q3,ε)}. When the bottom-of-stack marker is exposed in state q6, we have seen an equal number of b's and c's. We spontaneously accept in state q3, but we pop the stack so we cannot accept after reading more a's.

Exercise 6.2.4
Introduce a new state q, which becomes the initial state. On input ε and the start symbol of P, the new PDA has a choice of popping the stack (thus accepting ε), or going to the start state of P.

Exercise 6.2.5(a)
Revised 6/6/06.
(q0,bab,Z0) |- (q2,ab,BZ0) |- (q3,b,Z0) |- (q1,b,AZ0) |- (q1,ε,Z0) |- (q0,ε,Z0) |- (f,ε,ε)

Exercise 6.2.8
Suppose that there is a rule that (p,X_1X_2...X_k) is a choice in δ(q,a,Z). We create k-2 new states r_1,r_2,...,r_(k-2) that simulate this rule but do so by adding one symbol at a time to the stack. That is, replace (p,X_1X_2...X_k) in the rule by (r_(k-2),X_(k-1)X_k). Then create new rules δ(r_(k-2),ε,X_(k-1)) = {(r_(k-3),X_(k-2)X_(k-1))}, and so on, down to δ(r_2,ε,X_3) = {(r_1,X_2X_3)} and δ(r_1,ε,X_2) = {(p,X_1X_2)}.

Solutions for Section 6.3

Exercise 6.3.1
({q},{0,1},{0,1,A,S},δ,q,S) where δ is defined by:
1. δ(q,ε,S) = {(q,0S1), (q,A)}
2. δ(q,ε,A) = {(q,1A0), (q,S), (q,ε)}
3. δ(q,0,0) = {(q,ε)}
4. δ(q,1,1) = {(q,ε)}

Exercise 6.3.3
In the following, S is the start symbol, e stands for the empty string, and Z is used in place of Z0.
1. S -> [qZq] | [qZp]
The following four productions come from rule (1).
2. [qZq] -> 1[qXq][qZq]
3. [qZq] -> 1[qXp][pZq]
4. [qZp] -> 1[qXq][qZp]
5. [qZp] -> 1[qXp][pZp]
The following four productions come from rule (2).
6. [qXq] -> 1[qXq][qXq]
7. [qXq] -> 1[qXp][pXq]
8. [qXp] -> 1[qXq][qXp]
9. [qXp] -> 1[qXp][pXp]
The following two productions come from rule (3).
10. [qXq] -> 0[pXq]
11. [qXp] -> 0[pXp]
The following production comes from rule (4).
12. [qXq] -> e
The following production comes from rule (5).
13. [pXp] -> 1
The following two productions come from rule (6).
14. [pZq] -> 0[qZq]
15. [pZp] -> 0[qZp]

Exercise 6.3.6
Convert P to a CFG, and then convert the CFG to a PDA, using the two constructions given in Section 6.3. The result is a one-state PDA equivalent to P.

Solutions for Section 6.4

Exercise 6.4.1(b)
Not a DPDA. For example, rules (3) and (4) give a choice, when in state q, with 1 as the next input symbol, and with X on top of the stack, of either using the 1 (making no other change) or making a move on ε input that pops the stack and going to state p.

Exercise 6.4.3(a)
Suppose a DPDA P accepts both w and wx by empty stack, where x is not ε (i.e., N(P) does not have the prefix property). Then (q0,wx,Z0) |-* (q,x,ε) for some state q, where q0 and Z0 are the start state and symbol of P. It is not possible that (q,x,ε) |-* (p,ε,ε) for some state p, because we know x is not ε, and a PDA cannot have a move with an empty stack.
This observation contradicts the assumption that wx is in N(P).

Exercise 6.4.3(c)
Modify P' in the following ways to create DPDA P:
1. Add a new start state and a new start symbol. P, with this state and symbol, pushes the start symbol of P' on top of the stack and goes to the start state of P'. The purpose of the new start symbol is to make sure P doesn't accidentally accept by empty stack.
2. Add a new "popping state" to P. In this state, P pops every symbol it sees on the stack, using ε input.
3. If P' enters an accepting state, P enters the popping state instead.
As long as L(P') has the prefix property, then any string that P' accepts by final state, P will accept by empty stack.

Solutions for Section 7.1

Exercise 7.1.1
A and C are clearly generating, since they have productions with terminal bodies. Then we can discover S is generating because of the production S -> CA, whose body consists of only symbols that are generating. However, B is not generating. Eliminating B leaves the grammar
S -> CA
A -> a
C -> b
Since S, A, and C are each reachable from S, all the remaining symbols are useful, and the above grammar is the answer to the question.

Exercise 7.1.2
Revised 6/27/02.
a) Only S is nullable, so we must choose, at each point where S occurs in a body, to eliminate it or not. Since there is no body that consists only of S's, we do not have to invoke the rule about not eliminating an entire body. The resulting grammar:
S -> ASB | AB
A -> aAS | aA | a
B -> SbS | bS | Sb | b | A | bb
b) The only unit production is B -> A. Thus, it suffices to replace this body A by the bodies of all the A-productions. The result:
S -> ASB | AB
A -> aAS | aA | a
B -> SbS | bS | Sb | b | aAS | aA | a | bb
c) Observe that A and B each derive terminal strings, and therefore so does S. Thus, there are no useless symbols.
d) Introduce variables and productions C -> a and D -> b, and use the new variables in all bodies that are not a single terminal:
S -> ASB | AB
A -> CAS | CA | a
B -> SDS | DS | SD | b | CAS | CA | a | DD
C -> a
D -> b
Finally, there are bodies of length 3; one, CAS, appears twice. Introduce new variables E, F, and G to split these bodies, yielding the CNF grammar:
S -> AE | AB
A -> CF | CA | a
B -> SG | DS | SD | b | CF | CA | a | DD
C -> a
D -> b
E -> SB
F -> AS
G -> DS

Exercise 7.1.10
It's not possible. The reason is that an easy induction on the number of steps in a derivation shows that every sentential form has odd length. Thus, it is not possible to find such a grammar for a language as simple as {00}.
To see why, suppose we begin with start symbol S and try to pick a first production. If we pick a production with a single terminal as body, we derive a string of length 1 and are done. If we pick a body with three variables, then, since there is no way for a variable to derive epsilon, we are forced to derive a string of length 3 or more.

Exercise 7.1.11(b)
The statement of the entire construction may be a bit tricky, since you need to use the construction of part (c) in (b), although we are not publishing the solution to (c). The construction for (b) is by induction on i, but it needs to be of the stronger statement that if an A_i-production has a body beginning with A_j, then j > i (i.e., we use part (c) to eliminate the possibility that i=j).
Basis: For i = 1 we simply apply the construction of (c) for i = 1.
Induction: If there is any production of the form A_i -> A_1..., use the construction of (a) to replace A_1. That gives us a situation where all A_i production bodies begin with at least A_2 or a terminal. Similarly, replace initial A_2's using (a), to make A_3 the lowest possible variable beginning an A_i-production.
In this manner, we eventually guarantee that the body of each A i-production either begins with a terminal or with A j, for some j >= i. A use of the construction from (c) eliminates the possibility that i = j.Exercise 7.1.11(d)As per the hint, we do a backwards induction on i, that the bodies of A i productions can be made to begin with terminals.Basis: For i = k, there is nothing to do, since there are no variables with index higher than k to begin the body.Induction: Assume the statement for indexes greater than i. If an A i-production begins with a variable, it must be A j for some j > i. By the induction hypothesis, the A j-productions all have bodies beginning with terminals now. Thus, we may use the construction (a) to replace the initial A j, yielding only A i-productions whose bodies begin with terminals.After fixing all the A i-productions for all i, it is time to work on the B i-productions. Since these have bodies that begin with either terminals or A j for some j, and the latter variables have only bodies that begin with terminals, application of construction (a) fixes the B j's.Solutions for Section 7.2Exercise 7.2.1(a)Let n be the pumping-lemma constant and consider string z = a n b n+1c n+2. We may write z = uvwxy, where v and x, may be ``pumped,'' and |vwx| <= n. If vwx does not have c's, then uv3wx3y has at least n+2a's or b's, and thus could not be in the language.If vwx has a c, then it could not have an a, because its length is limited to n. Thus, uwy has n a's, but no more than 2n+2b's and c's in total. Thus, it is not possible that uwy has more b's than a's and also has more c's than b's. We conclude that uwy is not in the language, and now have a contradiction no matter how z is broken into uvwxy.Exercise 7.2.1(d)Let n be the pumping-lemma constant and consider z = 0n1n2. We break Z = uvwxy according to the pumping lemma. If vwx consists only of 0's, then uwy has n2 1's and fewer than n 0's; it is not in the language. 
If vwx has only 1's, then we derive a contradiction similarly. If either v or x has both 0's and 1's, then uv2wx2y is not in 0*1*, and thus could not be in the language.Finally, consider the case where v consists of 0's only, say k 0's, and x consists of m 1's only, where k and m are both positive. Then for all i, uv i+1wx i+1y consists of n + ik 0's and n2 + im 1's. If the number of 1's is always to be the square of the number of 0's, we must have, for some positive k and m: (n+ik)2 = n2+ im, or 2ink + i2k2= im. But the left side grows quadratically in i, while the right side grows linearly, and so this equality for all i is impossible. We conclude that for at least some i, uv i+1wx i+1y is not in the language and have thus derived a contradiction in all cases.Exercise 7.2.2(b)It could be that, when the adversary breaks z = uvwxy, v = 0k and x = 1k. Then, for all i, uv i wx i y is in the language.Exercise 7.2.2(c)The adversary could choose z = uvwxy so that v and x are single symbols, on either side of the center. That is, |u| = |y|, and w is either epsilon (if z is of even length) or the single, middle symbol (if z is of odd length). Since z is a palindrome, v and x will be the same symbol. Then uv i wx i y is always a palindrome.。
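The adversary's decomposition in Exercise 7.2.2(c) can be checked mechanically. The following Python sketch splits a palindrome around its center as described and pumps it; the function names split_palindrome and pump are illustrative and do not come from the text.

```python
def split_palindrome(z):
    """Split z = u v w x y with v and x single symbols around the center."""
    half = len(z) // 2
    if half == 0:
        raise ValueError("z must have length at least 2")
    u, v = z[:half - 1], z[half - 1]
    # w is the middle symbol for odd length, the empty string otherwise.
    w = z[half] if len(z) % 2 else ""
    rest = z[half + len(w):]
    x, y = rest[0], rest[1:]
    return u, v, w, x, y


def pump(parts, i):
    """Return u v^i w x^i y for a decomposition (u, v, w, x, y)."""
    u, v, w, x, y = parts
    return u + v * i + w + x * i + y
```

For a palindrome such as "abcba" the split is u = "a", v = "b", w = "c", x = "b", y = "a", so |u| = |y| and every pumped string reads the same in both directions.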

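Answers to the exercises of 《形式语言与自动机》 (Formal Languages and Automata), edited by Wang Bai (王柏) and Yang Juan (杨娟), BUPT Press

Beijing University of Posts and Telecommunications: Formal Languages and Automata, homework answers

Chapter 2

4. Find a right-linear grammar that generates the strings of 1 to 5 characters that begin with a letter.

Answer: G = (N, T, P, S), where N = {S, A, B, C, D} and T = {x, y}; x stands for any letter and y for any character. P is:
S→x  S→xA
A→y  A→yB
B→y  B→yC
C→y  C→yD
D→y

6. Construct a context-free grammar that generates L = {ω | ω ∈ {a,b}* and the number of a's in ω is twice the number of b's}.

Answer: G = (N, T, P, S), where N = {S} and T = {a, b}. P is:
S→aab  S→aba  S→baa
S→aabS  S→aaSb  S→aSab  S→Saab
S→abaS  S→abSa  S→aSba  S→Saba
S→baaS  S→baSa  S→bSaa  S→Sbaa

7. Find the language generated by each of the following sets of productions (the start symbol is S).
(1) S→SaS  S→b
(2) S→aSb  S→c
(3) S→a  S→aE  E→aS

Answer:
(1) L = {b(ab)^n | n ≥ 0}, or equivalently L = {(ba)^n b | n ≥ 0}
(2) L = {a^n c b^n | n ≥ 0}
(3) L = {a^(2n+1) | n ≥ 0}

Chapter 3

1. Are the following sets regular? If a set is regular, give its regular expression.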

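(1) The set of strings over {a,b} with an even number of a's and an odd number of b's.
(2) The set of strings with equally many a's and b's.
(3) The set of strings over {a,b} that do not contain the substring aba.

Answer:
(1) Regular; the automaton is as follows (figure not included).
(2) Not regular. This can be proved with the pumping lemma; see problem 17(2).
(3) Regular. First consider L', the set of strings over {a,b} that contain the substring aba. L' is clearly regular: a regular expression and an automaton for it can be written down directly (omitted). The set L of strings over {a,b} that do not contain aba is the complement of L'. Regular sets are closed under complement, so L is regular as well.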

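4. For the productions of each of the following grammars, find the regular expression.
(1) G = ({S,A,B,C,D}, {a,b,c,d}, P, S), with productions P:
S→aA  S→B
A→abS  A→bB
B→b  B→cC
C→D
D→bB  D→d
(2) G = ({S,A,B,C,D}, {a,b,c,d}, P, S), with productions P:
S→aA  S→B
A→cC  A→bB
B→bB  B→a
C→D  C→abB
D→d

Answer:
(1) From the productions:
S = aA + B ①
A = abS + bB ②
B = b + cC ③
C = D ④
D = d + bB ⑤
Combining ③④⑤ to eliminate C and D gives B = b + c(d + bB), that is, B = cbB + cd + b, so B = (cb)*(cd+b) ⑥
Substituting ② and ⑥ into ①: S = aabS + ab(cb)*(cd+b) + (cb)*(cd+b), so S = (aab)*(ab+ε)(cb)*(cd+b)

(2) From the productions (b^+ stands for bb*):
S = aA + B ①
A = bB + cC ②
B = a + bB ③
C = D + abB ④
D = d ⑤
From ③: B = b*a ⑥
Substituting ⑤ and ⑥ into ④: C = d + abb*a = d + ab^+a ⑦
Substituting ⑥ and ⑦ into ②: A = b^+a + c(d + ab^+a) ⑧
Substituting ⑥ and ⑧ into ①: S = a(b^+a + c(d + ab^+a)) + b*a = ab^+a + acd + acab^+a + b*a

5. Construct right-linear grammars for the following regular sets:
(1) {a,b}*
(2) The set of all strings of a's and b's that end with abb
(3) The set of strings that begin with b followed by any number of a's
(4) The set of all strings of a's and b's that contain two consecutive a's and two consecutive b's

Answer:
(1) Right-linear grammar G = ({S}, {a,b}, P, S)
P: S→aS  S→bS  S→ε
(2) Right-linear grammar G = ({S}, {a,b}, P, S)
P: S→aS  S→bS  S→abb
(3) The regular set is ba*. Right-linear grammar G = ({S,A}, {a,b}, P, S)
P: S→bA  A→aA  A→ε
(4) The regular set is {a,b}*aa{a,b}*bb{a,b}* + {a,b}*bb{a,b}*aa{a,b}*. Right-linear grammar G = ({S,A,B,C}, {a,b}, P, S)
P: S→aS | bS | aaA | bbB
A→aA | bA | bbC
B→aB | bB | aaC
C→aC | bC | ε

7. Let the regular set be a(ba)*.
(1) Construct a right-linear grammar.
(2) Find the finite automaton for the grammar in (1).

Answer:
(1) Right-linear grammar G = ({S,A}, {a,b}, P, S)
P: S→aA  A→bS  A→ε
(2) The automaton is as follows (p2 is the final state; figure not included).

9. Write the regular expressions for the state transition diagrams in figures (a) and (b).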
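The result of problem 4(1), S = (aab)*(ab+ε)(cb)*(cd+b), can be cross-checked by brute force. The Python sketch below enumerates the strings the grammar derives up to a length bound and compares them with the strings matched by the expression; the names PRODUCTIONS, ANSWER, generated and matched are illustrative, and the check only covers short strings, so it is evidence rather than a proof.

```python
import re
from itertools import product

# Productions of problem 4(1); every body has at most one
# nonterminal, and it is always the last symbol.
PRODUCTIONS = {
    "S": ["aA", "B"],
    "A": ["abS", "bB"],
    "B": ["b", "cC"],
    "C": ["D"],
    "D": ["bB", "d"],
}
# (aab)*(ab+ε)(cb)*(cd+b) written in Python's regex syntax.
ANSWER = re.compile(r"(aab)*(ab)?(cb)*(cd|b)")


def generated(max_len):
    """Terminal strings of length <= max_len derivable from S."""
    sentences, seen, stack = set(), set(), ["S"]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        prefix, last = form[:-1], form[-1]
        if last not in PRODUCTIONS:  # no nonterminal left: a sentence
            sentences.add(form)
            continue
        for body in PRODUCTIONS[last]:
            new_form = prefix + body
            # Terminals are never erased, so longer forms can be pruned.
            if sum(ch.islower() for ch in new_form) <= max_len:
                stack.append(new_form)
    return sentences


def matched(max_len):
    """Strings over {a,b,c,d} of length <= max_len matched by ANSWER."""
    words = ("".join(p) for n in range(max_len + 1)
             for p in product("abcd", repeat=n))
    return {w for w in words if ANSWER.fullmatch(w)}
```

Comparing generated(6) with matched(6) tests both directions at once: every derived string must match the expression, and every matching string must be derivable.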

Formal Languages and Automata: answers to selected exercises

a) A grammar G = (V, Σ, R, S) for the language {ww : w ∈ {a,b}*}:
V = {S, a, b, A, B, C, Ta, Tb, Tc}
Σ = {a, b}
R = { S -> CC, C -> AC | BC | Tc, BTc -> Tcb, ATc -> Tca, Tc -> e }

Or the grammar G[S]:
S→CD
C→aCA  C→bCB  C→ε
AD→aD  BD→bD
Aa→aA  Ab→bA  Ba→aB  Bb→bB
D→ε

Or:
S→aAS | bBS | aAE | bBE
Aa→aA  Ab→bA  Ba→aB
AE→Ea | a  BE→Eb | b

L(G) = {ww | w ∈ {a,b}*}
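The grammar G[S] can be explored by exhaustive search over sentential forms. The Python sketch below does this under one assumption that should be stated loudly: the rule list reads the A-over-a swap rule as Aa→aA, in line with the other three swap rules Ab→bA, Ba→aB and Bb→bB. The names RULES, sentences and squares are illustrative.

```python
from collections import deque
from itertools import product

# Rules of G[S]. Assumption: the A-over-a swap rule is Aa -> aA.
RULES = [
    ("S", "CD"), ("C", "aCA"), ("C", "bCB"), ("C", ""),
    ("AD", "aD"), ("BD", "bD"),
    ("Aa", "aA"), ("Ab", "bA"), ("Ba", "aB"), ("Bb", "bB"),
    ("D", ""),
]


def sentences(max_len):
    """Terminal strings of length <= max_len derivable from S."""
    # Only C -> ε and D -> ε shorten a form, so no form on the way
    # to a sentence of length L is longer than L + 2.
    limit = max_len + 2
    seen, found, queue = {"S"}, set(), deque(["S"])
    while queue:
        form = queue.popleft()
        if form == "" or form.islower():  # only terminals remain
            found.add(form)
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:  # rewrite every occurrence of lhs
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= limit and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return {s for s in found if len(s) <= max_len}


def squares(max_len):
    """All strings ww over {a,b} of length <= max_len."""
    return {"".join(w) * 2 for n in range(max_len // 2 + 1)
            for w in product("ab", repeat=n)}
```

Under that assumption, sentences(6) and squares(6) coincide, which is consistent with L(G) = {ww | w ∈ {a,b}*} for short strings.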
2.2.3
2.2.9
2.3.4
3.5.14
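• (a): Context-sensitive, because two of m, n, p must be equal: once the number of a's is fixed, the number of b's or of c's is fixed as well.
• (b): Context-sensitive, because once the number of a's is fixed, the numbers of b's and c's may not equal it; that is, the values of n and p are restricted.
• (c): Context-sensitive; the number of a's restricts the numbers of b's and c's.
• (d): Context-sensitive; when the numbers of a's and b's are equal, the number of c's may not equal them.
• (e): Context-sensitive; to satisfy |w1| = |w2| = ... = |wn| >= 2, the value of |w| is restricted to numbers that can be divided evenly, that is, numbers that are not prime.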
4.1.10
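• Function: the machine first finds the second and third characters, which must not be blanks, records them, and then writes them into the first and second blank cells that follow.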
4.3.3
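• (1) Move the two heads separately: the first head moves left to the beginning of the tape, and the second head moves right until it finds a blank.
• (2) Move the first head right and record the character it reads. Check whether a blank has been reached: if so, go to step ( ); otherwise go to step (3).
• (3) Move the second head right and write the recorded character, then go to step (2).
• (4) The first head has reached a blank, which means w has been copied completely, so the machine halts.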
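The four steps above can be sketched as a small two-tape simulation in Python. Several details are assumptions made for illustration, not taken from the exercise: the blank symbol is written "_", tape 1 holds w, tape 2 may already hold some non-blank prefix, and the first head reads a cell before advancing. The name copy_tape is hypothetical.

```python
BLANK = "_"  # assumed blank symbol; the input alphabet must not contain it


def copy_tape(w, tape2_prefix=""):
    """Copy w from tape 1 onto tape 2 after the first blank of tape 2."""
    tape1 = list(w) + [BLANK]
    tape2 = list(tape2_prefix) + [BLANK]

    # Step (1): head 1 goes to the left end, head 2 to the first blank.
    head1 = 0
    head2 = 0
    while tape2[head2] != BLANK:
        head2 += 1

    while True:
        # Step (2): record the symbol under head 1, then move right.
        symbol = tape1[head1]
        if symbol == BLANK:
            break  # Step (4): w has been copied, halt.
        head1 += 1
        # Step (3): write the recorded symbol and move head 2 right.
        if head2 == len(tape2):
            tape2.append(BLANK)  # extend the tape on demand
        tape2[head2] = symbol
        head2 += 1

    return "".join(tape2).rstrip(BLANK)
```

With an empty tape 2 the function returns w itself; with a prefix already on tape 2 it returns the prefix followed by w.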

《形式语言与自动机》 (Formal Languages and Automata): final review questions and answers (1)

1. A Turing machine M = (Q, Σ, Γ, δ, q0, B, F) accepts the language {wtw | w ∈ {a,b}*}. Following the figure below, explain how it accepts an input. (15 points)

Answer: the processing of abtab:
[q1,B]abtab
⊢ a[q2,a]btab
⊢ ab[q2,a]tab
⊢ abt[q3,a]ab
⊢ ab[q4,B]tab
⊢ a[q5,B]btab
⊢ [q6,B]abtab
⊢ a[q1,B]btab
⊢ ab[q2,b]tab
⊢ abt[q3,b]ab
⊢ abta[q3,b]b
⊢ abt[q4,B]ab
⊢ a[q5,B]btab
⊢ ab[q7,B]tab
⊢ abt[q8,B]ab
⊢ abta[q8,B]b
⊢ abtab[q8,B]B
⊢ abta[q9,B]b
abtab is accepted.

2. Briefly describe the main content of the course Formal Languages and Automata.

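(10 points) Answer: grammar descriptions of languages; RL (RG, FA, RE, properties of RL); CFL (CFG (CNF, GNF), PDA, properties of CFL); TM (basic TM, construction techniques, modifications of TM); CSL (CSG, LBA).

3. Briefly describe the learning objectives and basic requirements of the course Formal Languages and Automata.

(10 points) Answer: Professionals in this field need four basic abilities: computational thinking; algorithm design and analysis; program design and implementation; and the ability to understand, analyze, design and apply computer hardware and software systems.

Computational thinking includes logical and abstract thinking, building models that describe a problem formally, and understanding and handling formal models.

The course should give students the following knowledge: the grammars, recognition models and basic properties of regular languages and context-free languages, and the basics of Turing machines.

It trains the following abilities: formal description and abstract thinking, and an initial grasp of "problem, formal description, automation (computerization)", the most typical approach to solving problems with computers.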

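Formal Languages and Automata: review summary

For 《形式语言与自动机》 (2nd edition), edited by Yang Juan, Shi Chuan and Wang Bai.

1. Formal language: a formally described set of strings over an alphabet; a language described by agreed-upon symbols and expressions; a general-purpose language.

2. Automaton: a model with discrete inputs and outputs.

a) State: a label that distinguishes the condition of the automaton at different moments.

b) Essence of an automaton: the input and the rules determine the next state.

c) Some common automata:
i. Finite automaton: a finite control with a read head, plus an input tape holding characters.
ii. Pushdown automaton: an input tape, a finite control and a pushdown stack.
iii. Turing machine: a finite control with a read/write head, plus an infinite tape.

3. Some terminology

a) Alphabet: a finite set of characters, denoted … .

b) String: a sequence of characters from the alphabet.

Note: strings are usually denoted by … and single characters by … .

c) Word (string): a string over the alphabet.

d) Empty string: the string that contains no characters, denoted … .

e) Length: the number of characters in a string, denoted … .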

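f) Concatenation: let … be strings, with … and … ; the concatenation of … and … is defined as … .

Properties: i. … ii. … iii. …

g) Reversal of a string: the string written backwards, denoted … or … , where … .

h) Power: let … be an alphabet and … any natural number; define:
i. …
ii. if … , then …
iii. the elements of … are generated by i and ii

i) … closure: …

j) … closure: …

Note: …

4. Language: let … be an alphabet; then any set … is a language over the alphabet … .

a) Product of languages: the product of … and … is written … ; it is the set of strings formed by concatenating a string of … with a string of … .

Note: …

b) Power of a language: … .

Note: the relation between strings and languages is analogous to the relation between the elements of a set and the set.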
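The product and power of languages can be illustrated with Python sets of strings. The formulas are missing from the notes above, so this sketch assumes the usual textbook definitions: the product of L1 and L2 is {xy | x ∈ L1, y ∈ L2}, the zeroth power of L is the set containing only the empty string, and each further power multiplies by L once more. The function names are illustrative.

```python
def language_product(l1, l2):
    """Assumed definition: all concatenations xy with x in l1, y in l2."""
    return {x + y for x in l1 for y in l2}


def language_power(lang, n):
    """Assumed definition: L^0 = {""} and L^n = L^(n-1) L."""
    result = {""}  # the language containing only the empty string
    for _ in range(n):
        result = language_product(result, lang)
    return result
```

For example, the product of {"a", "ab"} and {"b", ""} is {"ab", "a", "abb"}: the string "ab" arises twice, but a set keeps it once, which matches the analogy between strings and set elements.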

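5. Grammar: a mathematical model that defines a language.

a) Enumeration: describes finite sets.

b) Grammar generation system: the language is generated by the defined grammar rules.

c) Machine recognition system: a string that the recognition system accepts is a sentence of the language.

d) BNF: a metalanguage for discussing the syntax of a programming language.
<digit> ::= …
<letter> ::= …
<identifier> ::= <letter> | <identifier><letter> | <identifier><digit>
"::=" means "is defined as", "|" means "or", and <> encloses a required part.

6. Chomsky grammar system: replace the "::=" of BNF with "→" and use characters instead of Chinese words. A grammar comprises two distinct finite sets of symbols, the nonterminals … and the terminals … ; a finite set of formal rules … ; and a start symbol … . The grammar is … , the set of … , … , … .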

Formal Languages and Automata, week 2, ch. 3.1-3.3

DFA represented by a transition table
Q = {q0, q1, q2, q3}
T = {0, 1}
        0     1
q0      q2    q1
q1      q3    q0
q2      q0    q3
q3      q1    q2
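For a DFA: δ'(q, a) = δ(δ'(q, ε), a) = δ(q, a); that is, on a single character δ and δ' agree. For convenience, δ is written in place of δ' from here on whenever no confusion can arise.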
The extended transition function applies to input strings
        0     1
q0      q2    q1
q1      q3    q0
q2      q0    q3
q3      q1    q2
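The extended transition function can be sketched in a few lines of Python. The dictionary DELTA below reads the flattened table as four rows q0 to q3 with one column per input symbol, which is an assumption about the slide's original layout; the names DELTA and delta_hat are illustrative.

```python
# DELTA[state][symbol] -> next state, read off the transition table.
DELTA = {
    "q0": {"0": "q2", "1": "q1"},
    "q1": {"0": "q3", "1": "q0"},
    "q2": {"0": "q0", "1": "q3"},
    "q3": {"0": "q1", "1": "q2"},
}


def delta_hat(state, word):
    """Extended transition function: apply DELTA once per input symbol.

    On the empty word the state is unchanged, and on a single symbol
    the result is the same as a direct lookup in DELTA.
    """
    for symbol in word:
        state = DELTA[state][symbol]
    return state
```

Starting from q0, the input 0110 passes through q2, q3 and q2 and returns to q0.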

T: finite input alphabet
δ: transition function (the set of state transitions), δ: Q × T → Q
q0: initial state, q0 ∈ Q

F: set of final states, F ⊆ Q
DFA represented by a transition diagram
Q = {q0, q1, q2, q3}
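Example: construct an automaton that recognizes all strings made up of an odd number of a's and an odd number of b's. Key idea: there is no need to remember the whole string seen so far, only whether the numbers of a's and b's seen so far are even or odd.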
[Transition diagram: start state q(even a, even b); further states shown include q(odd a, even b) and q(even a, odd b); edges are labeled a and b.]
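The key idea of the example, tracking only the two parities, can be sketched in Python. The state is a pair of booleans instead of four named states; the function name accepts is illustrative.

```python
def accepts(word):
    """Accept strings over {a, b} with an odd number of a's and of b's."""
    odd_a, odd_b = False, False  # start state: even a's, even b's
    for symbol in word:
        if symbol == "a":
            odd_a = not odd_a  # an a flips the parity of a's
        elif symbol == "b":
            odd_b = not odd_b  # a b flips the parity of b's
        else:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
    return odd_a and odd_b  # accepting state: odd a's, odd b's
```

The four combinations of the two booleans correspond to the four states of the DFA, and each input symbol moves between them by flipping one parity.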

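Formal Languages and Automata Theory: reference answers for Chapter 3

Chapter 3 homework answers

1. The DFAs M1 and M2 are shown in Figure 3-18.

(1) Give the sequence of states each of them passes through while processing the string 1011001.

(2) Give their formal descriptions.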

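Figure 3-18: two different DFAs

Answer:
(1) While processing 1011001, M1 passes through the state sequence q0 q3 q1 q3 q2 q3 q1 q3; M2 passes through the state sequence q0 q2 q3 q1 q3 q2 q3 q1.
(2) Describing the machines in natural language does not seem easy, so they are described here by regular expressions obtained by working directly on the diagrams:
M1: [01+(00+1)(11+0)][11+(10+0)(11+0)]*
M2: (01+1+000){(01)*+[(001+11)(01+1+000)]*}

2. Construct DFAs for the following languages.
(1) {0,1}*
(2) {0,1}+
(3) {x | x ∈ {0,1}+ and x does not contain the substring 00} (add a trap state; as soon as the substring 00 is found, enter the trap state)
(4) {x | x ∈ {0,1}* and x does not contain the substring 00} (the empty string is accepted, so the initial state is also an accepting state)
(5) {x | x ∈ {0,1}+ and x contains a substring of the form 10110}
(6) {x | x ∈ {0,1}+ and x does not contain a substring of the form 10110} (add a trap state; as soon as the substring 10110 is found, enter the trap state)
(7) {x | x ∈ {0,1}+ and, when x is read as a binary number, x is congruent to 3 modulo 5; when x is 0, |x| = 1, and when x ≠ 0, the first character of x is 1}
1. Strings that begin with 0 are not accepted, so a trap state is added: if the DFA reads 0 in its start state, it enters the trap state.
2. Seven states are used: the start state qs; q0, the class of numbers with remainder 0 modulo 5; q1, remainder 1; q2, remainder 2; q3, remainder 3; q4, remainder 4; and the accepting state qt.
3. The state transition table is: (table not included)
(8) {x | x ∈ {0,1}+ and the tenth character of x is 1} (add a trap state; if the tenth character of x is 0, enter the trap state)
(9) {x | x ∈ {0,1}+ and x begins with 0 and ends with 1} (add a trap state; if the first character is 1, enter the trap state)
(10) {x | x ∈ {0,1}+ and x contains at least two 1's}
(11) {x | x ∈ {0,1}+ and, if x ends with 1, its length is even; if x ends with 0, its length is odd}
The strings of {0,1}+ can be divided into 4 equivalence classes.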