Summary of MATLAB Regular Expressions

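S = REGEXP(STRING,EXPRESSION): searches STRING for substrings that match the regular expression EXPRESSION.

The return value S holds the index of the first character of each matching substring.

EXPRESSION consists of metacharacters and ordinary literal characters.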

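Common metacharacters:
1. The following metacharacters each match exactly one character.
Metacharacter Meaning
--------------- --------------------------------
. Match any single character
[] Match any one of the characters inside the brackets
[^] Match any one character that is not inside the brackets
\w Match any one word character: a-z, A-Z, 0-9, or '_'
\W Match any one non-word character, i.e. [^a-z_A-Z0-9]
\d Match any one digit [0-9]
\D Match any one non-digit character
\s Match any one whitespace character [ \t\r\n\f\v]
\S Match any one non-whitespace character [^ \t\r\n\f\v]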
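As a minimal sketch of the character classes above, the following applies a few of them to a short sample string. The string and variable names are made up for illustration; the 'match' option, described under the REGEXP outputs further on, returns the matched text instead of the start indices.

```matlab
% Hypothetical sample string, chosen only for illustration.
str = 'a_1 b-22';

digitIdx   = regexp(str, '\d');            % start index of every digit      -> [3 7 8]
nonWordIdx = regexp(str, '\W');            % characters outside [a-zA-Z_0-9] -> [4 6]
words      = regexp(str, '\w+', 'match');  % runs of word characters         -> {'a_1','b','22'}
chunks     = regexp(str, '\S+', 'match');  % runs of non-whitespace          -> {'a_1','b-22'}
```

Note that '_' counts as a word character, so 'a_1' stays in one piece, while '-' does not, so 'b-22' is split by \w+.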
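2. The following metacharacters group subexpressions logically or specify a position in the surrounding context.

These metacharacters do not match any characters themselves.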

Metacharacter Meaning
--------------- --------------------------------
() Group subexpression
| Match subexpression before or after the |
^ Match expression at the start of string
$ Match expression at the end of string
\< Match expression at the start of a word
\> Match expression at the end of a word
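The grouping and anchoring metacharacters in this table can be tried out as follows. This is only a sketch on an invented sample string; the variable names are not from any library.

```matlab
% Hypothetical sample string: three words that all contain 'cat'.
str = 'cat concat cats';

atStart   = regexp(str, '^cat');                  % only at the start of the string -> 1
atEnd     = regexp(str, 'cats$');                 % only at the end of the string   -> 12
wordStart = regexp(str, '\<cat');                 % 'cat' beginning a word          -> [1 12]
wordEnd   = regexp(str, 'cat\>');                 % 'cat' ending a word             -> [1 8]
either    = regexp(str, 'con|cats', 'match');     % alternation                     -> {'con','cats'}
optional  = regexp(str, '(con)?cat', 'match');    % group made optional by ?        -> {'cat','concat','cat'}
```

The anchors consume no characters: '\<cat' and 'cat\>' both match just the three letters 'cat', and differ only in where the match is allowed to sit.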
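3. The following metacharacters specify how many times the preceding element is matched.

Metacharacter Meaning
--------------- --------------------------------
* Match zero or more times
+ Match one or more times
? Match zero or one time
{n,m} Match at least n and at most m times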
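To compare the four quantifiers side by side, the sketch below runs each of them against the same invented string, in which 'a' and 'c' are separated by zero to three 'b' characters.

```matlab
% Hypothetical sample string: 'a' and 'c' separated by 0, 1, 2 and 3 b's.
str = 'ac abc abbc abbbc';

zeroOrMore = regexp(str, 'ab*c', 'match');      % -> {'ac','abc','abbc','abbbc'}
oneOrMore  = regexp(str, 'ab+c', 'match');      % -> {'abc','abbc','abbbc'}
zeroOrOne  = regexp(str, 'ab?c', 'match');      % -> {'ac','abc'}
twoToThree = regexp(str, 'ab{2,3}c', 'match');  % -> {'abbc','abbbc'}
```

Each quantifier applies only to the element immediately before it, here the single character 'b'; wrap a longer subexpression in parentheses to repeat it as a unit.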
Basic examples
Example:
str = 'bat cat can car coat court cut ct caoueouat';
pat = 'c[aeiou]+t';
regexp(str, pat)
returns [5 17 28 35]
Example:
str = {'Madrid, Spain' 'Romeo and Juliet' 'MATLAB is great'};
pat = '\s';
regexp(str, pat)
returns {[8]; [6 10]; [7 10]}
Example:
str = {'Madrid, Spain' 'Romeo and Juliet' 'MATLAB is great'};
pat = {'\s', '\w+', '[A-Z]'};
regexp(str, pat)
returns {[8]; [1 7 11]; [1 2 3 4 5 6]}
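REGEXP outputs:
regexp supports the seven kinds of output listed below. Pass the corresponding keyword as an input argument to get that output.

The keywords and their meanings are:
Keyword Result
--------------- --------------------------------
'start' Row vector of starting indices of each match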
'end' Row vector of ending indices of each match
'tokenExtents' Cell array of extents of tokens in each match
'match' Cell array of the text of each match
'tokens' Cell array of the text of each token in each match
'names' Structure array of each named token in each match
'split' Cell array of the text delimited by each match
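Several keywords can be combined in one call, and the outputs then come back in the order in which the keywords were given. A minimal sketch, with illustrative variable names:

```matlab
str = 'regexp helps you relax';
pat = '\w*x\w*';   % any word that contains the letter x

% Request three outputs at once; they are returned in keyword order.
[m, s, e] = regexp(str, pat, 'match', 'start', 'end');
% m -> {'regexp','relax'}
% s -> [1 18]
% e -> [6 22]

% The start and end indices recover the same text as 'match'.
firstWord = str(s(1):e(1));   % -> 'regexp'
```

When no keyword is given, the first two outputs are the start and end indices.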
Example:
str = 'regexp helps you relax';
pat = '\w*x\w*';
m = regexp(str, pat, 'match')
returns
m = {'regexp', 'relax'}
Example:
str = 'regexp helps you relax';
pat = '\s+';
s = regexp(str, pat, 'split')
returns
s = {'regexp', 'helps', 'you', 'relax'}
Example: using tokens
str = 'six sides of a hexagon';
pat = 's(\w*)s';
t = regexp(str, pat, 'tokens')
returns
t = {{'ide'}}
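When the pattern has more than one parenthesized group, each match yields one inner cell holding one token per group, and 'tokenExtents' gives the position of each token. The following is a sketch on an invented key=value string, not part of the original help text:

```matlab
% Hypothetical sample string with two key=value pairs.
str = 'key1=10, key2=20';
pat = '(\w+)=(\d+)';   % token 1: the key, token 2: the numeric value

t = regexp(str, pat, 'tokens');
% t -> {{'key1','10'}, {'key2','20'}}
secondKey = t{2}{1};                 % -> 'key2'
secondVal = str2double(t{2}{2});     % -> 20

ext = regexp(str, pat, 'tokenExtents');
% One matrix per match, one [start end] row per token:
% ext{1} -> [1 4; 6 7]
% ext{2} -> [10 13; 15 16]
```

The outer cell indexes the match and the inner cell indexes the token, which is why the single-group example above returns the doubly nested {{'ide'}}.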
Example: using names; the expression must contain named tokens of the form (?<name>...)
str = 'John Davis; Rogers, James';
pat = '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';
n = regexp(str, pat, 'names')
returns
n(1).first = 'John'
n(1).last = 'Davis'
n(2).first = 'James'
n(2).last = 'Rogers'
To return only the first match:
REGEXP(STRING,EXPRESSION,'once')
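With 'once', the text outputs are no longer wrapped in an outer cell array, which is convenient when only one match is expected. A short sketch, reusing the string from the tokens example; the variable names are illustrative:

```matlab
str = 'six sides of a hexagon';

firstIdx  = regexp(str, 's\w*', 'once');              % scalar index           -> 1
firstText = regexp(str, 's\w*', 'match', 'once');     % char vector, not cell  -> 'six'
firstTok  = regexp(str, 's(\w*)s', 'tokens', 'once'); % one level of nesting   -> {'ide'}

% If nothing matches, 'match' with 'once' returns an empty char array.
noMatch = regexp(str, 'z\w*', 'match', 'once');
hasZ    = ~isempty(noMatch);                          % -> false
```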