Universal Levenshtein Automaton
States
Universal Levenshtein Automaton
input vector
k = 1; P of the form “chold”. Input “child” is translated into the sequence
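The translation of an input word into bit vectors can be sketched in Python as follows. This is a minimal sketch under assumptions: the window for the i-th input symbol is taken to be pattern positions i-k to i+k+1, padded on the left with a symbol `$` that does not occur in the input and clipped at the right end of the pattern. The slides may use a slightly different window or padding convention, and the function names are illustrative only.

```python
def characteristic_vector(symbol, window):
    """Bit string with a 1 wherever `symbol` occurs in `window`."""
    return "".join("1" if c == symbol else "0" for c in window)


def input_vectors(word, pattern, k, pad="$"):
    """Translate `word` into one bit vector per input symbol.

    Assumed window for the i-th symbol (1-based): pattern positions
    i-k .. i+k+1, padded on the left with `pad` and clipped at the
    right end of the pattern.
    """
    # After padding, pattern position j (1-based) sits at index j + k - 1.
    padded = pad * k + pattern
    vectors = []
    for i, symbol in enumerate(word, start=1):
        start = i - 1                            # index of position i-k
        end = min(len(padded), i + 2 * k + 1)    # one past position i+k+1
        vectors.append(characteristic_vector(symbol, padded[start:end]))
    return vectors


if __name__ == "__main__":
    for vec in input_vectors("child", "chold", k=1):
        print(vec)
```

Because each vector records only where the input symbol matches inside the window, the automaton that consumes these vectors does not need to know the pattern itself, only the bound k.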
Implementation(A)
Set of active states after reading wi depends
Input vectors
Triangular areas are highlighted
Basic correction algorithm
Evaluation
Improvement II
Using backwards dictionaries for filtering
Backwards dictionary
Dictionary
Backwards dictionary
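The filtering idea can be sketched in Python for the simplest case, k = 1. Split the pattern into two halves P1 and P2. A single edit can touch only one half, so every word within distance 1 either starts with P1 exactly or ends with P2 exactly. The first group is found by a prefix lookup in the dictionary, the second by a prefix lookup of reversed P2 in the backwards dictionary. This is a simplified sketch under assumptions: both dictionaries are plain sorted lists rather than automata, the surviving candidates are verified with an ordinary distance computation instead of a Levenshtein automaton, and all function names are illustrative.

```python
from bisect import bisect_left


def build_dictionaries(words):
    """Return the sorted dictionary and the sorted backwards dictionary."""
    forward = sorted(words)
    backward = sorted(w[::-1] for w in words)
    return forward, backward


def prefix_range(sorted_words, prefix):
    """All entries of a sorted list that start with `prefix`."""
    lo = bisect_left(sorted_words, prefix)
    hi = lo
    while hi < len(sorted_words) and sorted_words[hi].startswith(prefix):
        hi += 1
    return sorted_words[lo:hi]


def distance(a, b):
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def candidates_k1(pattern, forward, backward):
    """Words within distance 1 of `pattern`, found via the two filters."""
    half = len(pattern) // 2
    left, right = pattern[:half], pattern[half:]
    # Case 1: the edit lies in the right half, so the word starts with `left`.
    pool = set(prefix_range(forward, left))
    # Case 2: the edit lies in the left half, so the word ends with `right`.
    pool.update(w[::-1] for w in prefix_range(backward, right[::-1]))
    return sorted(w for w in pool if distance(pattern, w) <= 1)


if __name__ == "__main__":
    fwd, bwd = build_dictionaries(["child", "cold", "hold", "chord", "world", "scold", "gold"])
    print(candidates_k1("chold", fwd, bwd))
```

The exact half acts as a cheap filter: only words sharing it are ever checked against the error bound.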
Nondeterministic automaton A
Levenshtein distance automaton with bound k = 2 for pattern “chold”
Naive “AssPro” algorithm
P = Input
A = NEA(P, k)    (NEA = nondeterministic finite automaton, NFA)
for w in Dict:
    if A(w) <= k:
        output w
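A runnable Python sketch of this naive loop follows. It simulates the nondeterministic Levenshtein automaton directly as a set of active states (i, e), where i is the number of pattern symbols consumed and e the number of errors used. The names `nea` and `naive_candidates` and the convention that the automaton returns the smallest error count (or `None` on rejection) are assumptions made for illustration.

```python
def nea(pattern, k):
    """Build a matcher for the Levenshtein NFA of `pattern` with bound k.

    The returned function gives the smallest error count of an accepting
    state, or None if the word is rejected.
    """
    m = len(pattern)

    def closure(states):
        # Epsilon moves: skip a pattern symbol at the cost of one error.
        result = set(states)
        stack = list(states)
        while stack:
            i, e = stack.pop()
            if i < m and e < k and (i + 1, e + 1) not in result:
                result.add((i + 1, e + 1))
                stack.append((i + 1, e + 1))
        return result

    def run(word):
        states = closure({(0, 0)})
        for symbol in word:
            nxt = set()
            for i, e in states:
                if i < m and pattern[i] == symbol:
                    nxt.add((i + 1, e))            # match
                if e < k:
                    nxt.add((i, e + 1))            # extra symbol in the word
                    if i < m:
                        nxt.add((i + 1, e + 1))    # substitution
            states = closure(nxt)
        errors = [e for i, e in states if i == m]
        return min(errors) if errors else None

    return run


def naive_candidates(pattern, dictionary, k):
    """Run the automaton on every dictionary word (the naive loop)."""
    automaton = nea(pattern, k)
    return [w for w in dictionary if automaton(w) is not None]


if __name__ == "__main__":
    words = ["child", "cold", "hold", "chord", "world", "zebra", "scold"]
    print(naive_candidates("chold", words, k=1))
```

The cost is one automaton run per dictionary entry, which is what the later improvements aim to avoid.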
Levenshtein Distance
Dynamic programming table, filled top-down and from left to right by:
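A minimal Python sketch of the standard recurrence (Wagner-Fischer), assuming unit costs for insertion, deletion, and substitution. The function names are illustrative and the cost model is an assumption, since weighted variants exist as well.

```python
def levenshtein_table(p, w):
    """Full table d, where d[i][j] is the distance between p[:i] and w[:j]."""
    rows, cols = len(p) + 1, len(w) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete all i symbols of p[:i]
    for j in range(cols):
        d[0][j] = j                      # insert all j symbols of w[:j]
    for i in range(1, rows):             # top-down
        for j in range(1, cols):         # left to right
            cost = 0 if p[i - 1] == w[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
    return d


def levenshtein(p, w):
    """Distance is the bottom-right entry of the table."""
    return levenshtein_table(p, w)[-1][-1]


if __name__ == "__main__":
    for row in levenshtein_table("chold", "child"):
        print(row)
```

Each cell depends only on its upper, left, and upper-left neighbours, which is why the fill order top-down and left to right is sufficient.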
OCR context
Association problem (APro): input(ocr) > dictionary(current) => candidates
Backwards dictionary [k<=2]
Reference
Fast approximate search in large dictionaries (Stoyan Mihov, Klaus U. Schulz; 2003)
Measuring Dialect Pronunciation Differences using Levenshtein Distance (Wilbert Heeringa, 2004)
Precise and Efficient Text Correction using Levenshtein Automata, Dynamic Web Dictionaries and Optimized Correction Models (Mihov, Schulz et al.)
OCR-Korrektur und Bestimmung von Levenshtein-Gewichten (Christoph Ringlstetter)
Einsatz von bitparallelen Algorithmen und Filtern zur approximativen Zeichenkettensuche (K. Stamme, 2002)
Nachkorrektur von Ergebnissen einer optischen Charaktererkennung (Schulz, 2006)
Levenshtein Demo: http://odur.let.rug.nl/~kleiweg/lev/
Seminar: Computerlinguistische Probleme bei der Digitalisierung historischer Texte
f.a.zrenner
Paper: Stoyan Mihov, Klaus U. Schulz, Fast approximate search in large dictionaries
Approximate search in dictionaries using a universal Levenshtein automaton, by linking it with the dictionary automaton
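The linked traversal can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the dictionary automaton is replaced by a trie built from nested dicts, and the state of the Levenshtein automaton is replaced by a row of the dynamic programming table, which carries the same information for pruning. The names `build_trie` and `search` and the end marker `"$end"` are hypothetical.

```python
END = "$end"  # multi-character key, so it cannot collide with a symbol


def build_trie(words):
    """Dictionary as a trie of nested dicts."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True                 # marks a complete dictionary word
    return root


def search(trie, pattern, k):
    """All (word, distance) pairs in the trie with distance <= k."""
    results = []
    first_row = list(range(len(pattern) + 1))

    def visit(node, prefix, row):
        if node.get(END) and row[-1] <= k:
            results.append((prefix, row[-1]))
        for ch, child in node.items():
            if ch == END:
                continue
            # Advance the distance row by one dictionary symbol.
            new_row = [row[0] + 1]
            for j in range(1, len(row)):
                cost = 0 if pattern[j - 1] == ch else 1
                new_row.append(min(new_row[j - 1] + 1, row[j] + 1, row[j - 1] + cost))
            # Follow the branch only while some entry is within the bound.
            if min(new_row) <= k:
                visit(child, prefix + ch, new_row)

    visit(trie, "", first_row)
    return sorted(results)


if __name__ == "__main__":
    trie = build_trie(["child", "cold", "hold", "chord", "world", "scold"])
    print(search(trie, "chold", k=1))
```

Both structures advance on the same symbol at every step, so a dictionary branch is abandoned as soon as no extension of it can stay within the bound k.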