Python Modules: The re Module - CFANZ Programming Community

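. matches any character (except a newline, unless re.S is set)
[] matches any one character from the listed set
^ matches at the start of the string
$ matches at the end of the string
[^] matches any one character not in the listed set
* repeats the preceding item 0 or more times
+ repeats the preceding item 1 or more times
? makes the preceding item optional (0 or 1 times)
| matches either the expression on its left or the one on its right
{m,n} repeats the preceding item m to n times
{m} repeats the preceding item exactly m times
\d matches any decimal digit; for ASCII text, equivalent to [0-9]
\D matches any non-digit character; equivalent to [^0-9]
\s matches any whitespace character; equivalent to [ \t\n\r\f\v]
\S matches any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\w matches any letter, digit, or underscore; equivalent to [a-zA-Z0-9_]
\W matches any character that is not a letter, digit, or underscore; equivalent to [^a-zA-Z0-9_]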
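As a quick check of the list above, the following sketch runs a few of the metacharacters through re.findall. The sample strings and variable names are made up for illustration only.

```python
import re

# Hypothetical sample text, chosen only to exercise the metacharacters above.
text = "cat bat rat 2023-01-15 a_b"

char_class = re.findall(r"[cb]at", text)            # [] : one character from the set
digit_runs = re.findall(r"\d+", text)               # +  : one or more digits
four_digits = re.findall(r"\d{4}", text)            # {m}: exactly four digits
optional = re.findall(r"colou?r", "color colour")   # ?  : zero or one 'u'
either = re.findall(r"cat|rat", text)               # |  : alternation
negated = re.findall(r"[^0-9\s]+", "a1 b2")         # [^]: anything but digits/whitespace

print(char_class)   # ['cat', 'bat']
print(digit_runs)   # ['2023', '01', '15']
print(four_digits)  # ['2023']
print(optional)     # ['color', 'colour']
print(either)       # ['cat', 'rat']
print(negated)      # ['a', 'b']
```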
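Example 1: Defining a simple regular expression
Format: re.compile(strPattern[, flag])
strPattern: the pattern, given as a string
flag: optional matching mode; several flags can be combined with | (for example re.I | re.M)
re.I (re.IGNORECASE): ignore case (the full name is in parentheses, same below)
re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' so they also match at the start and end of each line
re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.' so it also matches a newline
re.L (re.LOCALE): makes the predefined classes \w \W \b \B \s \S depend on the current locale (in Python 3, only allowed with bytes patterns)
re.U (re.UNICODE): makes the predefined classes \w \W \b \B \s \S \d \D follow Unicode character properties (already the default for str patterns in Python 3)
re.X (re.VERBOSE): verbose mode; the pattern may span multiple lines, whitespace is ignored, and comments may be added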
import re

pattern = re.compile(r'heLLow', re.I)  # build a pattern object
match = pattern.match('hellow world')  # match the text against the pattern
if match:
    print(match.group())  # print the matched text if there is a match
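The effect of the other flags is easier to see side by side. The sketch below compares the default behavior with re.M, re.S and re.X on a made-up two-line string; the variable names are illustrative, not part of the original example.

```python
import re

text = "first line\nsecond line"

# Without re.M, '^' only matches at the very start of the string.
starts_default = re.findall(r"^\w+", text)
# With re.M, '^' also matches right after each newline.
starts_multiline = re.findall(r"^\w+", text, re.M)

# Without re.S, '.' stops at a newline; with re.S it crosses it.
dot_default = re.match(r".+", text).group()
dot_all = re.match(r".+", text, re.S).group()

# re.X lets the pattern span several lines with comments.
# Flags are combined with '|'.
date_pattern = re.compile(r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
""", re.X | re.I)
date_groups = date_pattern.match("2023-01").groups()

print(starts_default)    # ['first']
print(starts_multiline)  # ['first', 'second']
print(repr(dot_default)) # 'first line'
print(repr(dot_all))     # 'first line\nsecond line'
print(date_groups)       # ('2023', '01')
```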
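# Example 2: Match attributes and methods
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import re
# 'sign' is an assumed name for the named group (?P<...>)
match = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hellow world!')  # match the text against the pattern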
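print("match.string:", match.string)  # the text that was matched against
print("match.re:", match.re)  # the pattern object used for the match
print("match.pos:", match.pos)  # index in the text where the search started
print("match.endpos:", match.endpos)  # index in the text where the search stopped
print("match.lastindex:", match.lastindex)  # number (index) of the last capturing group that matched
print("match.lastgroup:", match.lastgroup)  # name of the last capturing group that matched, or None
print("match.group(1,2):", match.group(1, 2))  # the strings captured by one or more groups
print("match.groups():", match.groups())  # all captured groups as a tuple
print("match.groupdict():", match.groupdict())  # dict of the named groups, keyed by group name
print("match.start(2):", match.start(2))  # start index in the text of the substring captured by the group
print("match.end(2):", match.end(2))  # end index in the text of the substring captured by the group
print("match.span(2):", match.span(2))  # the tuple (start(2), end(2))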
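The same attributes can be read from a match produced by a compiled pattern, which additionally accepts a starting position. This is a minimal sketch; the pattern, the group names `key` and `value`, and the sample text are hypothetical.

```python
import re

# Hypothetical pattern: the group names 'key' and 'value' are illustrative only.
pattern = re.compile(r"(?P<key>\w+)=(?P<value>\d+)")
text = "a=1 width=42 b=3"

# Compiled patterns accept optional pos/endpos arguments, which are
# reported back as match.pos and match.endpos.
m = pattern.search(text, 4)

print(m.group("key"), m.group("value"))  # width 42
print(m.groupdict())                     # {'key': 'width', 'value': '42'}
print(m.pos, m.endpos)                   # 4 16
print(m.span("value"))                   # (10, 12)
print(m.lastindex, m.lastgroup)          # 2 value
```

Named groups can be addressed either by name or by number, so `m.group(2)` and `m.group("value")` return the same string here.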
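# Example 3: Functions in the re module
re.compile   # compile a pattern string into a Pattern object
re.match     # match the regular expression at the start of the string
re.search    # scan the string for the first place where the pattern matches
re.split     # split the string at each match
re.findall   # search the string and return all matching substrings as a list
re.finditer  # search the string and return an iterator over the match objects, in order
re.sub       # replace the matched substrings
re.subn      # replace the matched substrings and return (new string, number of replacements)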
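Some of the functions listed above are easy to confuse, so here is a small sketch contrasting re.match with re.search and showing what re.split and re.subn return. The sample string is made up for illustration.

```python
import re

text = "one1two2three3"

# re.match only succeeds if the pattern matches at the start of the string;
# re.search scans forward for the first position where it matches.
at_start = re.match(r"\d", text)    # None, because the text starts with 'o'
anywhere = re.search(r"\d", text)   # finds the '1' at index 3

# re.split cuts the string wherever the pattern matches.
parts = re.split(r"\d", text)

# re.subn works like re.sub but also reports how many replacements were made.
replaced, count = re.subn(r"\d", "-", text)

print(at_start)                          # None
print(anywhere.group(), anywhere.start())  # 1 3
print(parts)                             # ['one', 'two', 'three', '']
print(replaced, count)                   # one-two-three- 3
```

The trailing empty string in the split result appears because the text ends with a separator.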
# Example 4: Search a string
a = re.compile(r'hello')
b = a.search('hello world')  # search the text for a match of pattern a ('hello')
if b:
    print(b.group())
# Example 5: Find all matching substrings
p = re.compile(r'\d')
print(p.findall('one1two2three3four4five5'))  # ['1', '2', '3', '4', '5']
# Example 6: Iterate over every match
w = re.compile(r'\d')
for m in w.finditer('one1two2three3four4five5'):
    print(m.group())
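# Example 7: Replace substrings
e = re.compile(r'(\w+) (\w+)')
s = 'This is, tong cheng'
print(e.sub(r'\2 \1', s))  # swap each pair of words: 'is This, cheng tong'

def func(m):
    # capitalize both words of the matched pair
    return m.group(1).title() + ' ' + m.group(2).title()

print(e.sub(func, s))  # 'This Is, Tong Cheng'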
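Two further options of sub are worth knowing: named backreferences in the replacement string and the count argument. The sketch below is only an illustration; the group names `first` and `second` and the sample string are hypothetical.

```python
import re

# Hypothetical input; the group names 'first' and 'second' are illustrative.
swap = re.compile(r"(?P<first>\w+) (?P<second>\w+)")
s = "hello world, good morning"

# \g<name> refers to a named group inside the replacement string.
swapped_all = swap.sub(r"\g<second> \g<first>", s)
# count=1 limits the substitution to the first match only.
swapped_once = swap.sub(r"\g<second> \g<first>", s, count=1)

print(swapped_all)   # world hello, morning good
print(swapped_once)  # world hello, good morning
```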