Last modified: 2020-10-24 09:21:39             Author: Mango
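A regular expression is a sequence of characters that defines a pattern to search for in a string. The re module provides support for using regular expressions in Python programs. If a regular expression is invalid, the re module raises an exception (re.error).
The re module must be imported before Python's regex functionality can be used.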
import re
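As a minimal sketch of the exception behaviour described above, the snippet below checks whether a pattern compiles. The helper name `is_valid_pattern` is hypothetical and is used only for illustration; `re.compile` and `re.error` are part of the standard re module.

```python
import re

def is_valid_pattern(pattern):
    """Return True if `pattern` compiles as a regular expression.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        # re.error is raised when the pattern is not a valid regex.
        return False

print(is_valid_pattern("[a-z]+"))  # True
print(is_valid_pattern("[a-z"))    # False: the set is never closed
```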
The following regex functions are used in Python.
SN | Function | Description |
---|---|---|
1 | match | It matches the regex pattern only at the beginning of the string, with optional flags. It returns a match object if a match is found; otherwise it returns None. |
2 | search | It scans the whole string and returns a match object for the first location where the pattern matches; otherwise it returns None. |
3 | findall | It returns a list that contains all non-overlapping matches of the pattern in the string. |
4 | split | It returns a list in which the string has been split at each match. |
5 | sub | It replaces one or many matches in the string with a replacement string. |
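The functions in the table can be compared side by side in a short sketch. The sample sentence is the one used in the later examples of this article; the variable names are illustrative.

```python
import re

text = "How are you. How is everything"

# match() only succeeds if the pattern matches at the start of the string.
at_start = re.match("How", text)       # a match object
not_at_start = re.match("are", text)   # None, "are" is not at the start

# search() scans the whole string and returns the first match.
first_are = re.search("are", text)

# split() cuts the string at every match of the pattern.
words = re.split(r"\s+", text)

# sub() replaces every match with the replacement string.
replaced = re.sub("How", "What", text)

print(at_start is not None)  # True
print(not_at_start)          # None
print(first_are.span())      # (4, 7)
print(words)                 # ['How', 'are', 'you.', 'How', 'is', 'everything']
print(replaced)              # What are you. What is everything
```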
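A regular expression can be formed from a mixture of metacharacters, special sequences, and sets.
Metacharacters are characters with a special meaning.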
Metacharacter | Description | Example |
---|---|---|
[] | It represents a set of characters. | “[a-z]” |
\ | It introduces a special sequence, or escapes a special character. | “\d” |
. | It matches any single character (except a newline) at that position. | “Ja.v.” |
^ | It matches the pattern at the beginning of the string. | “^Java” |
$ | It matches the pattern at the end of the string. | “point$” |
* | It matches zero or more occurrences of the preceding element. | “hello*” |
+ | It matches one or more occurrences of the preceding element. | “hello+” |
{} | It matches exactly the specified number of occurrences of the preceding element. | “java{2}” |
| | It matches either the pattern on its left or the pattern on its right. | “java|point” |
() | It groups a sub-pattern and captures the matched text. | “(java)+” |
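The following sketch exercises each metacharacter from the table on small, made-up strings. The input strings are assumptions chosen only to make the behaviour visible.

```python
import re

# "." matches any single character except a newline.
dot = re.findall("Ja.a", "Java Jaba Jaa")            # ['Java', 'Jaba']

# "^" anchors at the start of the string, "$" at the end.
starts = re.search("^Java", "Java point") is not None   # True
ends = re.search("point$", "Java point") is not None    # True

# "*" and "+" apply to the preceding element (here the letter "o").
star = re.findall("hello*", "hell hello helloo")     # ['hell', 'hello', 'helloo']
plus = re.findall("hello+", "hell hello helloo")     # ['hello', 'helloo']

# "{2}" requires exactly two repetitions of the preceding element.
exact = re.findall("a{2}", "a aa aaa")               # ['aa', 'aa']

# "|" chooses between alternatives; "()" captures groups.
either = re.findall("java|point", "javatpoint")      # ['java', 'point']
groups = re.search("(java)t(point)", "javatpoint").groups()  # ('java', 'point')

print(dot, star, plus, exact, either, groups)
```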
A special sequence consists of \ followed by a character.
Character | Description |
---|---|
\A | It matches only at the beginning of the string. |
\b | It matches the empty string at the beginning or the end of a word. |
\B | It matches the empty string where it is not at the beginning or the end of a word. |
\d | It matches any decimal digit, such as [0-9]. |
\D | It matches any character that is not a digit. |
\s | It matches any white space character. |
\S | It matches any character that is not a white space character. |
\w | It matches any word character (letters, digits, and the underscore). |
\W | It matches any character that is not a word character. |
\Z | It matches only at the end of the string. |
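A short sketch of the special sequences in use follows. Raw strings (the `r"..."` prefix) are used so that Python does not interpret the backslashes itself; the sample sentences are made up for illustration.

```python
import re

text = "Room 42 is on floor 3"

digits = re.findall(r"\d+", text)   # ['42', '3']
words = re.findall(r"\w+", text)    # ['Room', '42', 'is', 'on', 'floor', '3']
spaces = re.findall(r"\s", text)    # five single spaces

# \b matches at a word boundary, so only the standalone "is" is found.
whole = re.findall(r"\bis\b", "this is it")   # ['is']

# \B matches where there is no boundary: the "is" inside "this".
inside = re.search(r"\Bis", "this is it")     # starts at index 2

# \A and \Z anchor at the very start and very end of the string.
at_start = re.search(r"\ARoom", text) is not None   # True
at_end = re.search(r"3\Z", text) is not None        # True

print(digits, words, len(spaces), whole, inside.start(), at_start, at_end)
```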
A set is a group of characters given inside a pair of square brackets. It has a special meaning.
SN | Set | Description |
---|---|---|
1 | [arn] | Returns a match if the string contains any of the specified characters in the set. |
2 | [a-n] | Returns a match if the string contains any of the characters between a and n. |
3 | [^arn] | Returns a match if the string contains any character except a, r, and n. |
4 | [0123] | Returns a match if the string contains any of the specified digits. |
5 | [0-9] | Returns a match if the string contains any digit between 0 and 9. |
6 | [0-5][0-9] | Returns a match if the string contains any two-digit number between 00 and 59. |
7 | [a-zA-Z] | Returns a match if the string contains any letter of the alphabet (lower-case or upper-case). |
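The sets in the table can be tried out with `re.findall`, as in this sketch. The input strings are made up for illustration.

```python
import re

# [arn] matches any one of the listed characters.
any_of = re.findall("[arn]", "banana")          # ['a', 'n', 'a', 'n', 'a']

# [^arn] matches any character that is NOT in the set.
none_of = re.findall("[^arn]", "banana")        # ['b']

# [0123] matches only the listed digits.
some_digits = re.findall("[0123]", "2059")      # ['2', '0']

# [0-5][0-9] matches a two-digit number from 00 to 59.
minutes = re.findall("[0-5][0-9]", "time 12:59:73")   # ['12', '59']

# [a-zA-Z]+ matches runs of lower-case or upper-case letters.
letters = re.findall("[a-zA-Z]+", "Python3 Regex")    # ['Python', 'Regex']

print(any_of, none_of, some_digits, minutes, letters)
```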
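The findall() function returns a list containing all matches of the pattern in the string. The matches are returned in the order in which they are found. If there are no matches, an empty list is returned.
Consider the following example.
Example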
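import re
# Use a name other than "str" so the built-in type is not shadowed.
text = "How are you. How is everything"
# findall() returns every occurrence of the pattern as a list.
matches = re.findall("How", text)
print(matches)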
Output:
['How', 'How']
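The empty-list case can be seen in a small sketch that reuses the same sentence. The `re.IGNORECASE` flag is a standard flag of the re module; its use here is an extra illustration and not part of the original example.

```python
import re

text = "How are you. How is everything"

# No occurrence of "Why", so findall() returns an empty list.
no_matches = re.findall("Why", text)
print(no_matches)            # []

# The length of the list gives the number of matches.
count = len(re.findall("How", text))
print(count)                 # 2

# With re.IGNORECASE the lower-case pattern still finds both words.
case_insensitive = re.findall("how", text, flags=re.IGNORECASE)
print(case_insensitive)      # ['How', 'How']
```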
A match object contains information about the search and its result. If no match is found, None is returned instead.
import re
# Use a name other than "str" so the built-in type is not shadowed.
text = "How are you. How is everything"
matches = re.search("How", text)
print(type(matches))
print(matches)  # matches is the match object returned by search()
Output:
<class 're.Match'>
<re.Match object; span=(0, 3), match='How'>
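Because search() returns None when nothing matches, calling a method on the result without checking it first raises an AttributeError. A minimal sketch of a guard follows; the helper name `find_word` is hypothetical, and `re.escape` is used so the word is treated as literal text.

```python
import re

def find_word(word, text):
    """Return the (start, end) span of `word` in `text`, or None.

    Hypothetical helper for illustration only.
    """
    match = re.search(re.escape(word), text)
    if match is None:
        # No match: there is no match object to query.
        return None
    return match.span()

text = "How are you. How is everything"
print(find_word("How", text))   # (0, 3)
print(find_word("Why", text))   # None
```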
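The following methods and attributes are associated with the match object: span() returns a tuple with the start and end positions of the match, group() returns the part of the string that matched, and the string attribute holds the string that was passed to the function.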
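import re
# Use a name other than "str" so the built-in type is not shadowed.
text = "How are you. How is everything"
matches = re.search("How", text)
print(matches.span())    # start and end positions of the match
print(matches.group())   # the text that matched
print(matches.string)    # the string that was searched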
Output:
(0, 3)
How
How are you. How is everything