Liu Kang: Research Progress in Deep Learning-Based Knowledge Base Question Answering
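[Figure: candidate answer subgraph with entities e1 to e4 and relations r1 to r4]
Matching
$f(q_i) = \sum_{w_k \in q_i} E(w_k)$
$g(t_i) = \sum_{e_k \in t_i} E(e_k)$
Objective:
$L = 0.1 - S(f(q), g(t)) + S(f(q), g(t'))$
$S(f(q), g(t)) = f(q)^{T} g(t)$
$V(q) = \frac{1}{N} \sum_{w \in q} V(w)$
$V(\mathrm{triple}) = \frac{1}{N} \sum_{\mathrm{item} \in \mathrm{triple}} V(\mathrm{item})$, where the items of a triple are its head, its relation r, and its tail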
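The matching step above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the embedding tables, their values, and the example question are made up, and clipping the objective at zero (a hinge) is an assumption, since the slide only gives the unclipped expression.

```python
# Toy embedding tables; all names and values are hypothetical.
WORD_EMB = {
    "who": [0.1, 0.0],
    "founded": [0.0, 1.0],
    "apple": [1.0, 0.2],
}
ENTITY_EMB = {
    "steve_jobs": [0.2, 1.0],
    "cupertino": [0.9, -0.5],
}
MARGIN = 0.1  # the constant 0.1 in the objective


def embed_sum(items, table):
    """Sum the embeddings of the items: f(q) over words, g(t) over entities."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for item in items:
        for i, value in enumerate(table[item]):
            total[i] += value
    return total


def score(q_vec, t_vec):
    """S(f(q), g(t)) = f(q)^T g(t), a plain dot product."""
    return sum(a * b for a, b in zip(q_vec, t_vec))


def margin_loss(q_vec, pos_vec, neg_vec, margin=MARGIN):
    """margin - S(q, t) + S(q, t'), clipped at zero (the clipping is assumed)."""
    return max(0.0, margin - score(q_vec, pos_vec) + score(q_vec, neg_vec))


f_q = embed_sum(["who", "founded", "apple"], WORD_EMB)
g_pos = embed_sum(["steve_jobs"], ENTITY_EMB)   # correct candidate t
g_neg = embed_sum(["cupertino"], ENTITY_EMB)    # corrupted candidate t'
loss = margin_loss(f_q, g_pos, g_neg)
```

With these toy values the correct candidate already outscores the corrupted one by more than the margin, so the loss is zero and no update would be needed.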
• Generalization Component
• Memory
• Output Component
• Response Component
Subgraph of the answers (contextual Info)
$g(t_i) = \sum_{e_k \in \mathrm{Context}(t_i)} E(e_k) + \sum_{r_k \in \mathrm{Context}(t_i)} E(r_k)$
$S(q, a) = f(q)^{T} g(a)$
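As a rough sketch of the contextual representation, the candidate embedding can be built by summing the embeddings of the entities and relations in the answer's surrounding subgraph. The identifiers and vectors below are hypothetical placeholders, and the question vector is given directly rather than computed from words.

```python
# Hypothetical embeddings for entities and relations around a candidate answer.
ENTITY_EMB = {"e2": [1.0, 0.0], "e3": [0.0, 1.0]}
RELATION_EMB = {"r3": [0.5, 0.5], "r4": [-0.5, 0.0]}


def embed_context(entities, relations):
    """g = sum of E(e_k) over context entities + sum of E(r_k) over context relations."""
    total = [0.0, 0.0]
    for name in entities:
        total = [a + b for a, b in zip(total, ENTITY_EMB[name])]
    for name in relations:
        total = [a + b for a, b in zip(total, RELATION_EMB[name])]
    return total


def score(q_vec, a_vec):
    """S(q, a) = f(q)^T g(a)."""
    return sum(a * b for a, b in zip(q_vec, a_vec))


g_a = embed_context(["e2", "e3"], ["r3", "r4"])
s = score([1.0, 2.0], g_a)  # f(q) is assumed to be [1.0, 2.0] here
```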
Results
Multi-Column CNN (Dong et al. 2015)
Question: What states border Texas?
Logical form: λx. state(x) ∧ borders(x, texas)
[Figure: relation mentions and the entity mention in the question aligned to two KB relations and one KB entity]
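To make the logical form concrete, the following sketch evaluates λx. state(x) ∧ borders(x, texas) against a tiny hand-written fact base. The fact tables and function names are illustrative assumptions; a real system would execute the form against the knowledge base.

```python
# A tiny hand-written fact base (illustrative only).
STATES = {"texas", "oklahoma", "new_mexico", "louisiana", "arkansas", "ohio"}
BORDERS = {
    ("oklahoma", "texas"),
    ("new_mexico", "texas"),
    ("louisiana", "texas"),
    ("arkansas", "texas"),
    ("mexico", "texas"),  # borders Texas but is not a state
}


def state(x):
    """Unary predicate state(x)."""
    return x in STATES


def borders(x, y):
    """Binary predicate borders(x, y)."""
    return (x, y) in BORDERS


def answer():
    """Return every x that satisfies state(x) AND borders(x, texas)."""
    candidates = STATES | {a for a, _ in BORDERS}
    return sorted(x for x in candidates if state(x) and borders(x, "texas"))


result = answer()
```

The conjunction filters out both "ohio" (a state that does not border Texas) and "mexico" (a neighbor that is not a state).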
Deep Learning
DL-based
Yih et al. Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base, In Proceedings of ACL 2015 (Outstanding Paper)
Zeng et al. Relation Classification via Convolutional Deep Neural Network, In Proceedings of COLING 2014 (Best Paper)
Zeng et al. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks, In Proceedings of EMNLP 2015
Xu et al. Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths, In Proceedings of EMNLP 2015
SELECT DISTINCT ?x WHERE { ?y ?x. res: ?y. }
λx. spouse(Yao Ming, y) ∧ birthplace(y, x)
Meaning representations: Lambda Calculus, DCS-Tree, FunQL, …
Query languages: SQL, SPARQL, Prolog, FunQL, …
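Entity Information: $E(e_k)$
Path to entities in question:
$g(t_i) = \sum_{e_k \in \mathrm{Path}(t_i)} E(e_k) + \sum_{r_k \in \mathrm{Path}(t_i)} E(r_k)$
[Figure: path (relations r1, r2) linking the entity in the question to the candidate answer, within a graph of entities e1 to e6]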
[Figure: progress of KB-QA systems on WebQuestions. Systems shown: Bao 2014, Bordes 2014a, Bordes 2014b, Berant 2014, Yao 2014b, Dong 2015, Yao+Yang 2015, Yih 2015, Bordes 2015, Zhang 2016. Method labels: MT+Relation+Embedding, Query Graph + Reranking, Attention Model, Memory Network, CNN. Values shown: 45.8, 42.2, 42.6. Legend: Traditional Method; End-to-End DL-based System; Traditional Method Promoted by DL.]
Some materials come from “Min Zhou. Latest Progress of Knowledge-Based Question Answering. in CCIR 2015”
Memory Network
4 Components:
• I (Input feature map)
• G (Generalization)
• O (Output feature map)
• R (Response)
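The four components can be pictured with a deliberately simple toy. The slide lists only the component names, so the roles given to each method below follow the general memory network formulation and are an assumption here; the bag-of-words features and overlap-based lookup are stand-ins for learned embeddings and scoring.

```python
class ToyMemoryNetwork:
    """Toy I/G/O/R pipeline; every design choice here is illustrative."""

    def __init__(self):
        self.memory = []  # list of (features, raw_text) slots

    def I(self, text):
        """Input feature map: turn raw text into a bag of lowercase tokens."""
        return frozenset(text.lower().split())

    def G(self, features, raw_text):
        """Generalization: update memory (here, simply append a new slot)."""
        self.memory.append((features, raw_text))

    def O(self, query_features):
        """Output feature map: pick the slot with the largest token overlap."""
        return max(self.memory, key=lambda slot: len(slot[0] & query_features))

    def R(self, slot):
        """Response: convert the selected slot into the final answer text."""
        return slot[1]

    def store(self, fact):
        self.G(self.I(fact), fact)

    def ask(self, question):
        return self.R(self.O(self.I(question)))


net = ToyMemoryNetwork()
net.store("e1 r1 e2")
net.store("e3 r4 e4")
reply = net.ask("e1 r1")
```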
• Input Component
• Basic System
• Bordes et al. Question Answering with Subgraph Embeddings, In Proceedings of EMNLP 2014
• Contextual Information of Answers
• Yang et al. Joint Relational Embeddings for Knowledge-Based Question Answering, In Proceedings of EMNLP 2014
Multitask learning with paraphrases:
$S_p(f(q), f(q_p)) = f(q)^{T} f(q_p)$
$S(f(q), g(t)) = f(q)^{T} M g(t)$
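The two scores above differ only in whether a matrix M sits between the two vectors. A small sketch under made-up numbers (the vectors and the matrices are hypothetical; in training, M would be learned):

```python
def dot(u, v):
    """Plain inner product."""
    return sum(a * b for a, b in zip(u, v))


def paraphrase_score(q_vec, qp_vec):
    """S_p(f(q), f(q_p)) = f(q)^T f(q_p)."""
    return dot(q_vec, qp_vec)


def bilinear_score(q_vec, M, t_vec):
    """S(f(q), g(t)) = f(q)^T M g(t), computed as f(q) . (M g(t))."""
    Mt = [dot(row, t_vec) for row in M]
    return dot(q_vec, Mt)


q = [1.0, 2.0]
t = [3.0, 4.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]

s_identity = bilinear_score(q, identity, t)  # reduces to the plain dot product
s_swap = bilinear_score(q, swap, t)          # M reorders the dimensions of g(t)
s_para = paraphrase_score(q, [1.0, 2.0])
```

With M set to the identity the bilinear score falls back to the plain dot product used in the basic system, which is one way to read the role of M as a learned mapping between the question space and the KB space.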
Considering More Info (Bordes et al. 2014)
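• Single Relation
• Simple Question
Step 1
• Entity Linking: find the main entity of the question in the KB
• Candidates: entities connected to the main entity
Step 2
[Figure: subgraph around the main entity, with entities e1 to e4 and relations r1 to r4]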
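The first step can be sketched as a lookup followed by a neighborhood expansion. The alias table, the triples, and the exact-token matching below are all hypothetical simplifications; the example question is the one used elsewhere in these slides.

```python
# Hypothetical KB triples (head, relation, tail) and a hypothetical alias table.
TRIPLES = [
    ("main_entity", "r1", "e1"),
    ("main_entity", "r2", "e2"),
    ("e2", "r3", "e4"),
    ("e3", "r4", "e2"),
]
ALIASES = {"avatar": "main_entity"}


def link_main_entity(question, aliases):
    """Entity linking by exact token match (a stand-in for a real linker)."""
    for token in question.lower().split():
        if token in aliases:
            return aliases[token]
    return None


def candidate_answers(main_entity, triples):
    """Collect every entity directly connected to the main entity."""
    found = set()
    for head, _relation, tail in triples:
        if head == main_entity:
            found.add(tail)
        elif tail == main_entity:
            found.add(head)
    return sorted(found)


main = link_main_entity("when did Avatar release in UK", ALIASES)
candidates = candidate_answers(main, TRIPLES)
```

Each candidate would then be embedded and scored against the question in the second step.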
Basic End2End QA System (Bordes et al. 2014)
CYK-like Grammars [Liang, 2011]
Which software has been developed by organizations founded in California, USA?
software
developed by
organizations
founded in
Contextual Information
• Bordes et al. Large-scale Simple Question Answering with Memory Networks, In Proceedings of ICIR 2015
• Memory Network
Basic End2End QA System (Bordes et al. 2014)
KB
Semantic Parsing
• Combinatory Categorial Grammars [Zettlemoyer, 1995]
• Shift-reduce Derivations [Zelle, 1995]
• Synchronous Grammars [Wong, 2007]
• Hybrid Tree [Lu, 2008]
• CFG-like Grammars [Clarke, 2010]
California
dbo:Software
dbr:developer
dbo:Company
dbr:foundationPlace
dbo:California
<dbo:Software, dbr:developer, dbo:Company>
<dbo:Company, dbr:foundationPlace, dbo:California>
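Once the question has been mapped to the two triple patterns, answering it amounts to finding entities that satisfy both. The sketch below runs that conjunctive match over a small in-memory triple set. Only the dbo:/dbr: terms come from the slide; the ex: instances and the use of rdf:type to attach instances to classes are assumptions made for illustration.

```python
# Hypothetical instance data; the ex: identifiers are invented for this example.
TRIPLES = {
    ("ex:AppA", "rdf:type", "dbo:Software"),
    ("ex:AppA", "dbr:developer", "ex:OrgA"),
    ("ex:OrgA", "rdf:type", "dbo:Company"),
    ("ex:OrgA", "dbr:foundationPlace", "dbo:California"),
    ("ex:AppB", "rdf:type", "dbo:Software"),
    ("ex:AppB", "dbr:developer", "ex:OrgB"),
    ("ex:OrgB", "rdf:type", "dbo:Company"),
    ("ex:OrgB", "dbr:foundationPlace", "ex:Elsewhere"),
}


def has(subject, predicate, obj):
    """True if the triple is present in the toy store."""
    return (subject, predicate, obj) in TRIPLES


def software_by_california_companies():
    """Find x: Software developed by some y: Company founded in California."""
    subjects = {s for s, _, _ in TRIPLES}
    answers = set()
    for x in subjects:
        if not has(x, "rdf:type", "dbo:Software"):
            continue
        for y in subjects:
            if (has(x, "dbr:developer", y)
                    and has(y, "rdf:type", "dbo:Company")
                    and has(y, "dbr:foundationPlace", "dbo:California")):
                answers.add(x)
    return sorted(answers)


answers = software_by_california_companies()
```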
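• Answer Type
• Answer Context
• Answer Path
[Figure: KB subgraph showing the path from the entity in the question to the answer, the answer's type, and its neighboring entities (e1 to e6) and relations (r1, r3, r4, r5)]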
Framework
when did Avatar release in UK
Results
Attention-based Neural Model for QA Combining Global Knowledge Information (Zhang et al. 2016)
Problems
• End2End Model: limited to simple questions
• Complex question example: Which songs are performed by a person who was born in New York and played a role in Valentine’s Day?
• Entity Type
• Dong et al. Question Answering over Freebase with Multi-Column Convolutional Neural Networks, In Proceedings of ACL 2015
• Topic Entity
• Relation Path
(DL,End2End)
Black Box
(DL,End2End) IR based Approach
[Figure: KB entities e1 to e12 compared with the question by similarity]
Progress
• Bordes et al. Open Question Answering with Weakly Supervised Embedding Models, In Proceedings of ECML-PKDD 2014
• Multitask Learning
Comparison on the WebQuestions Benchmark
[Figure: bar chart of F1 scores. Method labels: Structured Perceptron, MT + Logistic Regression, NLG + Embedding, Subgraph Embedding, Average Embedding, DCS, Question Graph + LR, Relation + Embedding. System labels: Fader 2014, Yao 2014a, Berant 2013, Yang 2014. F1 values shown: 52.5, 44.3, 41.3, 40.8, 39.9, 39.2, 37.5, 35.7, 35, 33, 32.]
April 17, 2016
[Figure: timeline of QA research from 1960 to 2010. Paradigms: Template-based QA, Expert System, KB-based QA, IR-based QA, Community QA. Systems listed at 1960: BaseBall, LUNAR, MACSYMA, SHRDLU. Systems listed at 1990: MASQUE, TREC.]
Semantic Parsing (SP)
Information Retrieval (IR)
Semantic Parsing
SELECT DISTINCT ?x WHERE { ?y ?x. res: ?y. }
KB-based QA
Knowledge Base
Mention: organizations founded in California
KB: <dbo:Company, dbr:foundationPlace, dbo:California>
• Which software has been developed by organizations founded in California, USA?