First program
import re
key = r"<h1>hello world <h1>"
p1 = r"<h1>.+<h1>"
pattern = re.compile(p1)
print(pattern.findall(key))
['<h1>hello world <h1>']
key = r"aaa@qq.com"
p = "aaa@qq.com"
pattern = re.compile(p)
print(pattern.findall(key))
['aaa@qq.com']
“.” matches any single character
key = r"aaa@qq.com"
p = "aaa@qq.com"
pattern = re.compile(p)
print(pattern.findall(key))
['aaa@qq.com']
key = r"aaa@qq)com"
p = "aaa@qq.com"
pattern = re.compile(p)
print(pattern.findall(key))
['aaa@qq)com']
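Because an unescaped “.” matches any character, a pattern that is meant to match a literal dot should escape it. The following is a minimal sketch that reuses the sample strings above; the variable name `strict` is illustrative only.

```python
import re

# "\." matches only a literal dot, whereas a bare "." matches any
# character (except a newline, by default).
strict = re.compile(r"aaa@qq\.com")

print(strict.findall("aaa@qq.com"))  # ['aaa@qq.com']
print(strict.findall("aaa@qq)com"))  # []
```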
Use “+” to match one or more occurrences of the preceding element
key = r"aaa@qq)))))com"
p = "aaa@qq.+com"
pattern = re.compile(p)
print(pattern.findall(key))
['aaa@qq)))))com']
Use “*” to match zero or more occurrences of the preceding element
key = r"http://www.baidu.com & https://www.hao123.com"
p = "https*://"
pattern = re.compile(p)
print(pattern.findall(key))
['http://', 'https://']
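Note that “s*” accepts any number of “s” characters, so it is looser than the example suggests. A small sketch of the difference between “*” and the optional quantifier “?” follows; the third URL (`httpsss://example.invalid`) is a made-up string added purely for illustration.

```python
import re

key = "http://www.baidu.com & https://www.hao123.com & httpsss://example.invalid"

# "s*" allows zero or more "s", so the malformed "httpsss://" also matches.
star_matches = re.findall(r"https*://", key)
# "s?" allows zero or one "s" only.
optional_matches = re.findall(r"https?://", key)

print(star_matches)      # ['http://', 'https://', 'httpsss://']
print(optional_matches)  # ['http://', 'https://']
```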
Handling upper and lower case: “[]” matches any one of the characters inside the brackets; “+” means one or more times; “*” means zero or more times
key = r"hahah<hTml>aoisjgoajogj<Html>jalfjla"
p = r"<[hH][tT][mM][lL].+[hH][tT][mM][lL]>"
pattern = re.compile(p)
print(pattern.findall(key))
['<hTml>aoisjgoajogj<Html>']
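As an alternative to spelling out every letter in both cases, Python's `re` module offers the `re.IGNORECASE` flag. This is a sketch of that alternative, not the approach used above; it writes the tags out in full (`<html>`), which is slightly stricter than the bracketed pattern.

```python
import re

key = "hahah<hTml>aoisjgoajogj<Html>jalfjla"

# re.IGNORECASE lets each letter in the pattern match either case.
pattern = re.compile(r"<html>.+<html>", re.IGNORECASE)
result = pattern.findall(key)
print(result)  # ['<hTml>aoisjgoajogj<Html>']
```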
Use "[^characters]" to exclude unwanted characters
key = r"lap map nap rap xap aap ap"
p = r"[^x]ap"
pattern = re.compile(p)
print(pattern.findall(key))
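A negated class still has to consume one character, and a space counts as "not x". The sketch below runs the same pattern and then a stricter variant, `[^x\s]ap`, which is an addition for illustration and also excludes whitespace.

```python
import re

key = "lap map nap rap xap aap ap"

# "[^x]" matches any single character except "x", including a space,
# so the final " ap" (with its leading space) is also returned.
matches = re.findall(r"[^x]ap", key)
print(matches)   # ['lap', 'map', 'nap', 'rap', 'aap', ' ap']

# Excluding whitespace as well drops the " ap" match.
no_space = re.findall(r"[^x\s]ap", key)
print(no_space)  # ['lap', 'map', 'nap', 'rap', 'aap']
```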
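Common regex shorthand
[0-9] any one digit from 0 to 9
[a-z] any one lowercase letter
[A-Z] any one uppercase letter
\d same as [0-9]
\D same as [^0-9]: matches any non-digit
\w same as [a-zA-Z0-9_]: matches upper and lower case letters, digits, and the underscore
\W same as [^a-zA-Z0-9_]: the negation of the previous entry
These equivalences hold for ASCII text; with Python 3 str patterns, \d and \w also match Unicode digits and letters unless re.ASCII is set.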
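The shorthand classes listed above can be tried on a short string. This is a minimal sketch; the sample text is invented for the demonstration.

```python
import re

sample = "user_01 paid $25.50!"

digits = re.findall(r"\d", sample)        # every single digit
words = re.findall(r"\w+", sample)        # runs of letters, digits, underscore
non_words = re.findall(r"\W", sample)     # everything \w does not match
non_digits = re.findall(r"\D+", sample)   # runs of non-digit characters

print(digits)      # ['0', '1', '2', '5', '5', '0']
print(words)       # ['user_01', 'paid', '25', '50']
print(non_words)   # [' ', ' ', '$', '.', '!']
print(non_digits)  # ['user_', ' paid $', '.', '!']
```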
Use “?” and “{}” to set the number of repetitions: lazy versus greedy matching
key = r"heiheihei@imooc.com.cn.aa"
p = r"@.+\."
pattern = re.compile(p)
print(pattern.findall(key))
['@imooc.com.cn.']
key = r"heiheihei@imooc.com.cn.aa"
p = r"@.+?\."
pattern = re.compile(p)
print(pattern.findall(key))
['@imooc.']
key = r"bt&bat&baat&baaat"
p = r"ba{1,2}t"
pattern = re.compile(p)
print(pattern.findall(key))
['bat', 'baat']
key = r"bt&bat&baat&baaat"
p = r"ba{,2}t"
pattern = re.compile(p)
print(pattern.findall(key))
['bt', 'bat', 'baat']
key = r"bt&bat&baat&baaat"
p = r"ba{2,}t"
pattern = re.compile(p)
print(pattern.findall(key))
['baat', 'baaat']
Metacharacters
[Image: metacharacter reference table; the image is not available in the source]
Small example: finding a server's IP address, and the concept of subexpressions
# The key needs a dot after the fourth number for this pattern to match
key = r"bt124.2.3.4.5baaat"
p = r"\d+?\.\d+?\.\d+?\.\d+?\."
pattern = re.compile(p)
print(pattern.findall(key))
['124.2.3.4.']
key = r"bt124.2.3.4.5baaat"
p = r"\d{1,3}?\.\d{1,3}?\.\d{1,3}?\.\d{1,3}"
pattern = re.compile(p)
print(pattern.findall(key))
['124.2.3.4']
Wrap the preceding expression in “()” to turn it into a subexpression
key = r"bt124.2.0.4.5baaat"
p = r"(\d{1,3}?\.){3}\d{1,3}"
pattern = re.compile(p)
print(pattern.findall(key))
['0.']
match = pattern.search(key)
print(match.group(0))
124.2.0.4
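The two results above differ because `findall` returns the text of the capturing group when the pattern contains one, and a repeated group keeps only its last repetition. As a small sketch of this behaviour, `finditer` can be used to collect the whole matches instead; the names `group_only` and `whole_matches` are illustrative.

```python
import re

key = "bt124.2.0.4.5baaat"
pattern = re.compile(r"(\d{1,3}?\.){3}\d{1,3}")

# With one capturing group, findall returns that group's text; the group
# is repeated three times, so only the last repetition ("0.") is kept.
group_only = pattern.findall(key)
print(group_only)     # ['0.']

# finditer yields match objects, and group(0) is always the whole match.
whole_matches = [m.group(0) for m in pattern.finditer(key)]
print(whole_matches)  # ['124.2.0.4']
```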
Use “?:” to make the group a non-capturing group
key = r"bt124.2.0.4.5baaat"
p = r"(?:\d{1,3}?\.){3}\d{1,3}"
pattern = re.compile(p)
print(pattern.findall(key))
['124.2.0.4']
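Deletion: lookahead and lookbehind
Lookahead: exp1(?=exp2) finds exp1 that is followed by exp2
Lookbehind: (?<=exp2)exp1 finds exp1 that is preceded by exp2
Negative lookahead: exp1(?!exp2) finds exp1 that is not followed by exp2
Negative lookbehind: (?<!exp2)exp1 finds exp1 that is not preceded by exp2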
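The four assertions listed above can be compared side by side on one string. This is a minimal sketch; the sample string and variable names are invented for illustration.

```python
import re

key = "win95 win98 winXP mac98"

# Lookahead: "win" only where a digit follows.
ahead = re.findall(r"win(?=\d)", key)
# Negative lookahead: "win" only where no digit follows (the one in "winXP").
not_ahead = re.findall(r"win(?!\d)", key)
# Lookbehind: digits only where "win" comes immediately before them.
behind = re.findall(r"(?<=win)\d+", key)
# Negative lookbehind: "98" only where "win" does not come before it.
not_behind = re.findall(r"(?<!win)98", key)

print(ahead)       # ['win', 'win']
print(not_ahead)   # ['win']
print(behind)      # ['95', '98']
print(not_behind)  # ['98']
```

None of the lookaround parts appear in the results, because assertions check the surrounding text without consuming it.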
import re
key = r"<h1>hello world <h1>"
p1 = r"(?<=<h1>).+(?=<h1>)"
pattern = re.compile(p1)
print(pattern.findall(key))
['hello world ']
import re
key = r"<h1>hello world </h1><h2>hello world </h2><h3>hello error </h3><h1>helll </h2>"
# \1 refers back to the tag name captured inside the lookbehind
p1 = r"(?<=(h[1-6])>)[^<>]+?(?=</\1>)"
pattern = re.compile(p1)
objs = pattern.finditer(key)
# Named "results" so the built-in list type is not shadowed
results = []
for obj in objs:
    results.append(obj.group())
print(results)
['hello world ', 'hello world ', 'hello error ']
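The heading of this part mentions deletion, which can be done by combining the lookarounds with `re.sub`. The following is a hedged sketch of that idea, assuming the goal is to remove the text between the tags while keeping the tags themselves.

```python
import re

key = "<h1>hello world </h1>"

# The lookarounds assert that the tags are present without consuming them,
# so only the text between the tags is replaced by the empty string.
stripped = re.sub(r"(?<=<h1>).+?(?=</h1>)", "", key)
print(stripped)  # <h1></h1>
```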
Hands-on part 1: validation with regular expressions
Validating an email address
key = r'asda@asdfa.com'
pa = r"^\w+@[a-z0-9]+\.[a-z]{2,4}$"
pattern1 = re.compile(pa)
print(pattern1.findall(key))
['asda@asdfa.com']
Validating a mobile phone number
key_phone_num = r"17289899995"
p1 = r"^1\d{10}$"
pattern = re.compile(p1)
print(pattern.findall(key_phone_num))
['17289899995']
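For yes/no validation, `re.fullmatch` is a possible alternative to anchoring with `^` and `$` and inspecting the `findall` result. The sketch below reuses the email and phone patterns above without their anchors; the names `EMAIL`, `PHONE`, and `is_valid` are hypothetical helpers, not part of the original code.

```python
import re

# Same patterns as above, minus ^ and $: fullmatch anchors both ends itself.
EMAIL = re.compile(r"\w+@[a-z0-9]+\.[a-z]{2,4}")
PHONE = re.compile(r"1\d{10}")

def is_valid(pattern, text):
    """Return True only if the entire string matches the pattern."""
    return pattern.fullmatch(text) is not None

print(is_valid(EMAIL, "asda@asdfa.com"))  # True
print(is_valid(PHONE, "17289899995"))     # True
print(is_valid(PHONE, "1728989999"))      # False: only 10 digits
print(is_valid(PHONE, "17289899995\n"))   # False: trailing newline rejected
```

One practical difference: `$` also matches just before a trailing newline, so the anchored `findall` check accepts `"17289899995\n"`, while `fullmatch` rejects it.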
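Validating an ID card number
key = r"42130219550522081X"
p1 = r"^[0-9]\d{5}(18|19|(2\d))\d{2}((0[1-9])|(10|11|12))(([0-2][1-9])|10|20|30|31)\d{3}[0-9xX]$"
pattern1 = re.compile(p1)
# Iterate over the matches of pattern1 and collect each full match
objs = pattern1.finditer(key)
results = []
for obj in objs:
    results.append(obj.group())
print(results)
['42130219550522081X']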
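Hands-on part 2: implementing 24 Points with regular expressions
24 Points is a puzzle game: four integers are combined using addition, subtraction, multiplication, division, and parentheses so that the final result is 24.
Our goal is a program that takes four given integers and automatically lists the formulas that produce 24.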
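One possible way to approach this goal is plain brute force: try every ordering of the numbers, every operator combination, and every way of placing parentheses, then evaluate each expression. The sketch below is an assumption about how such a search could look, not the implementation this article builds; `TEMPLATES`, `solve_24`, and the tolerance value are all hypothetical, and the sketch does not use regular expressions.

```python
import itertools

OPERATORS = ["+", "-", "*", "/"]
# Hypothetical templates: the five ways to parenthesise four operands.
TEMPLATES = [
    "(({a}{x}{b}){y}{c}){z}{d}",
    "({a}{x}({b}{y}{c})){z}{d}",
    "({a}{x}{b}){y}({c}{z}{d})",
    "{a}{x}(({b}{y}{c}){z}{d})",
    "{a}{x}({b}{y}({c}{z}{d}))",
]

def solve_24(numbers, target=24):
    """Return a sorted list of expressions over the four numbers that equal target."""
    solutions = set()
    for a, b, c, d in itertools.permutations(numbers):
        for x, y, z in itertools.product(OPERATORS, repeat=3):
            for template in TEMPLATES:
                expr = template.format(a=a, b=b, c=c, d=d, x=x, y=y, z=z)
                try:
                    # eval is tolerable here only because expr is built
                    # entirely from integers and the four operators above.
                    value = eval(expr)
                except ZeroDivisionError:
                    continue
                # Division produces floats, so compare with a small tolerance.
                if abs(value - target) < 1e-9:
                    solutions.add(expr)
    return sorted(solutions)

print(solve_24([4, 6, 1, 1])[:3])
print(solve_24([1, 1, 1, 1]))  # []
```

The set removes duplicate expressions that arise when the input contains repeated numbers.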
operator = ['+', '-', '*', '/']
format = ['%d%s%d%s%d%s%d',]