词法分析部分总结
合集下载
相关主题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
A language class is a set of languages.
We have one example for language class: the regular languages.
任何一个正则表达式都表达了一个语言,所有的 正则表达式构成了语言类:正则语言
Language classes have two important kinds of properties:
(2)And for every automaton, there is a RE defining its language.
From Automata to RE
Arden’s rule
For any sets of strings S and T, the equation X=SX+T has X=S*T as a solution. Moreover, this solution is unique if ε not in S.
RE to ε-NFA: Induction 2 – Concatenation
ε For E1
For E1E2
For E2
RE to ε-NFA: Induction 3 – Closure
ε
ε
ε
For E
ε For E*
正则语言相关内容
如何证明正则表达式和NFA的等价性?
(1)We need to show that for every RE, there is an automaton that accepts the same language.
Example: “Does the protocol terminate?” = “Is the language finite?”
Example: “Can the protocol fail?” = “Is the language nonempty?”
Why Decision Properties – (2)
Assume representation is DFA. Construct the transition graph. Compute the set of states reachable from the
start state. If any final state is reachable, then yes, else no.
q0 a
a,b a
b
(2) X2=aX3+bX3+cX0 q2 (3) X3=aX3+bX3+cX3
c q1
X3=(a+b+c)X3+∅, by Arden’s
c
(0) X0=aX1 (1) X1=bX2+cX0+ε (2) X2=cX0
rule: SubstiXtu3i=ng(a(+0b)+acn)d*∅(2=) ∅in (1): X1=(bc+c)aX1+ε =((bc+c)a)* (by Arden’s rule)
From Automata to RE
Given an automaton A A has states {q0, …, qn} with q0 being the start
state Let Xi denote the set of strings accepted by A
starting in state qi Thus, L(A)=X0 We can write an equation for each Xi, defining it in
εR = Rε = R.
∅ is the annihilator (零元) for concatenation.
∅R = R∅ = ∅.
正则语言相关内容
如何证明正则表达式和NFA的等价性?
(1)We need to show that for every RE, there is an automaton that accepts the same language.
I.e., do two DFA’s define the same language?
You can’t find the “smallest.”
Closure Properties
A closure property (封闭性) of a language class says that given languages in the class, an operator (e.g., union) produces another language in the same class.
1. Decision properties. 2. Closure properties.
Decision Properties
A decision property for a class of languages is an algorithm that takes a formal description of a language (e.g., a DFA) and tells whether or not some property holds.
The Membership Question
Our first decision property is the question: “is string w in regular language L?(成员问题)”
Assume L is represented by a DFA A. Simulate the action of A on the sequence of input
symbols forming w.
Example: Testing Membership
01011
Next symbol
0
A1
0,1
1
B
C
Start
0
Current state
Example: Testing Membership
01011
Next symbol
0
A1
0,1
1
B
C
Start
0
Current state
The Infiniteness Problem
Is a given regular language infinite? Start with a DFA for the language. Key idea: if the DFA has n states, and the
language contains any string of length n or more, then the language is infinite. Otherwise, the language is surely finite.
Example: Is language L empty? Suppose the representation is a DFA. Can you tell if L(A) = for DFA A?
Why Decision Properties?
When we talked about protocols represented as DFA’s, we noted that important properties of a good protocol were related to the language of the DFA.
+ is commutative (可交换的) and associative (可结合的) ➢ a+b=b+a, a+b+c= a+(b+c)
concatenation is associative (可结合的) ➢ a.b.c=a.(b.c)
Concatenation distributes over +(可分配的) ➢ a.(b+c)=a.b+a.c
词法分析部分总结
田聪
构造词法分析器的一般方法和步骤
1. 描述:用正规式对模式进行 描述;
词词词 词词词词
2. 构造NFA:为每个正规式构
词词词词
造一个NFA;
词
词
3. 确定化:将NFA转换成等价 词
词词词词
词
的DFA;
词
4. 最小化:优化DFA,使其状
词 词
词词词词词词
词
态数最少;
词
5. 构造词法分析器:由DFA构
terms of the sets corresponding to its successor states.
From Automata to RE
A0
b, c
q0 a c
a,b,c
q3
a,b
a
q2
b
q1
c
From Automata to RE
A0
b, c
a,b,c q3
(0) X0=aX1+bX3+cX3 (1) X1=aX3+bX2+cX0+ε
X0=a((bc+c)a)*
DFA-to-RE
Another approach Page 93, theorem 3.4. (形式语言与自动机) Induction on k-path.
Decision Properties of Regular Languages
Properties of Language Classes
01011
Next symbol
0
A1
1 B
Start
0
Current state
0,1 C
Example: Testing Membership
01011
Next symbol
0
A1
1 B
0,1 C
Start
0
Current state
The Emptiness Problem
Given a regular language, does the language contain no string at all(判空问题).
Exception: Concatenation is not commutative (可交换的) ➢ a.b ≠b.a
Identities and Annihilators
∅ is the identity (单位元) for +.
R + ∅ = R.
ε is the identity (单位元) for concatenation.
Example: Testing Membership
01011
Next symbol
0
A1
1 B
Start
0
Current state
0,1 C
Example: Testing Membership
01011
Next symbol
0
0,1
A1
1
B
C
Start
0
Current state
Example: Testing Membership
正则语言
正则语言 正则表达式 有限状态自动机
上下文无关语言 上下文无关文法 非确定的下推自动机
上下文有关语言 上下文有关文法 线性有界自动机(特殊的图灵机)
递归可枚举语言 短语结构 图灵机
Algebraic Laws for RE’s
Union and concatenation behave like addition and multiplication.
词词词词
造词法分析器(表驱动,直 接编码,LEX)。
词词词词词词
词词词词
涉及到的形式化概念
正则表达式(正规式)
NFA
DFA
正则语言(正规集)
字符串,abb
A111wenku.baidu.com ….
正则表达式
例如: Char(char|digit)*,…
NFA
DFA
正则语言
递归可枚举语言 上下文有关语言 上下文无关语言
正则语言
We might want a “smallest” representation for a language, e.g., a minimum-state DFA or a shortest RE.
If you can’t decide “Are these two languages the same?”
(2)And for every automaton, there is a RE defining its language.
RE to ε-NFA: Basis
a
Symbol a:
ε
ε:
RE to ε-NFA: Induction 1 – Union
ε
For E1
ε
ε
ε
For E2
For E1 E2
Example: the regular languages are obviously closed under union, concatenation, and (Kleene) closure. (求补?求交?)
ε是正规式 若a是Σ上的字符,则a是正规式 若r和s分别是Σ上的正规式,那么 (a) r|s是正规式 (b) rs是正规式 (c) r*是正规式
We have one example for language class: the regular languages.
任何一个正则表达式都表达了一个语言,所有的 正则表达式构成了语言类:正则语言
Language classes have two important kinds of properties:
(2)And for every automaton, there is a RE defining its language.
From Automata to RE
Arden’s rule
For any sets of strings S and T, the equation X=SX+T has X=S*T as a solution. Moreover, this solution is unique if ε not in S.
RE to ε-NFA: Induction 2 – Concatenation
ε For E1
For E1E2
For E2
RE to ε-NFA: Induction 3 – Closure
ε
ε
ε
For E
ε For E*
正则语言相关内容
如何证明正则表达式和NFA的等价性?
(1)We need to show that for every RE, there is an automaton that accepts the same language.
Example: “Does the protocol terminate?” = “Is the language finite?”
Example: “Can the protocol fail?” = “Is the language nonempty?”
Why Decision Properties – (2)
Assume representation is DFA. Construct the transition graph. Compute the set of states reachable from the
start state. If any final state is reachable, then yes, else no.
q0 a
a,b a
b
(2) X2=aX3+bX3+cX0 q2 (3) X3=aX3+bX3+cX3
c q1
X3=(a+b+c)X3+∅, by Arden’s
c
(0) X0=aX1 (1) X1=bX2+cX0+ε (2) X2=cX0
rule: SubstiXtu3i=ng(a(+0b)+acn)d*∅(2=) ∅in (1): X1=(bc+c)aX1+ε =((bc+c)a)* (by Arden’s rule)
From Automata to RE
Given an automaton A A has states {q0, …, qn} with q0 being the start
state Let Xi denote the set of strings accepted by A
starting in state qi Thus, L(A)=X0 We can write an equation for each Xi, defining it in
εR = Rε = R.
∅ is the annihilator (零元) for concatenation.
∅R = R∅ = ∅.
正则语言相关内容
如何证明正则表达式和NFA的等价性?
(1)We need to show that for every RE, there is an automaton that accepts the same language.
I.e., do two DFA’s define the same language?
You can’t find the “smallest.”
Closure Properties
A closure property (封闭性) of a language class says that given languages in the class, an operator (e.g., union) produces another language in the same class.
1. Decision properties. 2. Closure properties.
Decision Properties
A decision property for a class of languages is an algorithm that takes a formal description of a language (e.g., a DFA) and tells whether or not some property holds.
The Membership Question
Our first decision property is the question: “is string w in regular language L?(成员问题)”
Assume L is represented by a DFA A. Simulate the action of A on the sequence of input
symbols forming w.
Example: Testing Membership
01011
Next symbol
0
A1
0,1
1
B
C
Start
0
Current state
Example: Testing Membership
01011
Next symbol
0
A1
0,1
1
B
C
Start
0
Current state
The Infiniteness Problem
Is a given regular language infinite? Start with a DFA for the language. Key idea: if the DFA has n states, and the
language contains any string of length n or more, then the language is infinite. Otherwise, the language is surely finite.
Example: Is language L empty? Suppose the representation is a DFA. Can you tell if L(A) = for DFA A?
Why Decision Properties?
When we talked about protocols represented as DFA’s, we noted that important properties of a good protocol were related to the language of the DFA.
+ is commutative (可交换的) and associative (可结合的) ➢ a+b=b+a, a+b+c= a+(b+c)
concatenation is associative (可结合的) ➢ a.b.c=a.(b.c)
Concatenation distributes over +(可分配的) ➢ a.(b+c)=a.b+a.c
词法分析部分总结
田聪
构造词法分析器的一般方法和步骤
1. 描述:用正规式对模式进行 描述;
词词词 词词词词
2. 构造NFA:为每个正规式构
词词词词
造一个NFA;
词
词
3. 确定化:将NFA转换成等价 词
词词词词
词
的DFA;
词
4. 最小化:优化DFA,使其状
词 词
词词词词词词
词
态数最少;
词
5. 构造词法分析器:由DFA构
terms of the sets corresponding to its successor states.
From Automata to RE
A0
b, c
q0 a c
a,b,c
q3
a,b
a
q2
b
q1
c
From Automata to RE
A0
b, c
a,b,c q3
(0) X0=aX1+bX3+cX3 (1) X1=aX3+bX2+cX0+ε
X0=a((bc+c)a)*
DFA-to-RE
Another approach Page 93, theorem 3.4. (形式语言与自动机) Induction on k-path.
Decision Properties of Regular Languages
Properties of Language Classes
01011
Next symbol
0
A1
1 B
Start
0
Current state
0,1 C
Example: Testing Membership
01011
Next symbol
0
A1
1 B
0,1 C
Start
0
Current state
The Emptiness Problem
Given a regular language, does the language contain no string at all(判空问题).
Exception: Concatenation is not commutative (可交换的) ➢ a.b ≠b.a
Identities and Annihilators
∅ is the identity (单位元) for +.
R + ∅ = R.
ε is the identity (单位元) for concatenation.
Example: Testing Membership
01011
Next symbol
0
A1
1 B
Start
0
Current state
0,1 C
Example: Testing Membership
01011
Next symbol
0
0,1
A1
1
B
C
Start
0
Current state
Example: Testing Membership
正则语言
正则语言 正则表达式 有限状态自动机
上下文无关语言 上下文无关文法 非确定的下推自动机
上下文有关语言 上下文有关文法 线性有界自动机(特殊的图灵机)
递归可枚举语言 短语结构 图灵机
Algebraic Laws for RE’s
Union and concatenation behave like addition and multiplication.
词词词词
造词法分析器(表驱动,直 接编码,LEX)。
词词词词词词
词词词词
涉及到的形式化概念
正则表达式(正规式)
NFA
DFA
正则语言(正规集)
字符串,abb
A111wenku.baidu.com ….
正则表达式
例如: Char(char|digit)*,…
NFA
DFA
正则语言
递归可枚举语言 上下文有关语言 上下文无关语言
正则语言
We might want a “smallest” representation for a language, e.g., a minimum-state DFA or a shortest RE.
If you can’t decide “Are these two languages the same?”
(2)And for every automaton, there is a RE defining its language.
RE to ε-NFA: Basis
a
Symbol a:
ε
ε:
RE to ε-NFA: Induction 1 – Union
ε
For E1
ε
ε
ε
For E2
For E1 E2
Example: the regular languages are obviously closed under union, concatenation, and (Kleene) closure. (求补?求交?)
ε是正规式 若a是Σ上的字符,则a是正规式 若r和s分别是Σ上的正规式,那么 (a) r|s是正规式 (b) rs是正规式 (c) r*是正规式