📜  Python Expressions | Regular Expressions

📅  Last modified: 2020-10-24 09:21:39             🧑  Author: Mango

Python Regular Expressions

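A regular expression is a sequence of characters that defines a pattern to search for in a string. The re module provides support for using regular expressions in Python programs. If an error occurs while using a regular expression (for example, the pattern is malformed), the re module raises an exception.

The re module must be imported before Python's regex functionality can be used.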

import re 
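A minimal sketch of the exception behavior mentioned above: compiling a malformed pattern raises re.error. The helper name is_valid_pattern and the sample patterns are illustrative assumptions, not part of the original article.

```python
import re

def is_valid_pattern(pattern):
    """Return True if `pattern` compiles, False if re raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for malformed patterns such as an unclosed character set
        return False

print(is_valid_pattern("[a-z]"))  # a well-formed set
print(is_valid_pattern("[a-z"))   # the closing bracket is missing
```

Catching re.error is mainly useful when the pattern comes from user input or a configuration file rather than from a literal in the source code.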

Regex functions

The following regex functions are used in Python.

SN  Function  Description
1   match     Checks for a match of the regex pattern only at the beginning of the string, with optional flags. Returns a match object if a match is found, otherwise None.
2   search    Scans the whole string and returns a match object for the first match found, otherwise None.
3   findall   Returns a list containing all the matches of a pattern in the string.
4   split     Returns a list of the pieces obtained by splitting the string at each match.
5   sub       Replaces one or more matches in the string with a replacement.
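The five functions in the table can be sketched side by side as follows. The sample string is the one used later in this article; the variable names and the replacement word "What" are illustrative assumptions.

```python
import re

text = "How are you. How is everything"

# match() only succeeds if the pattern matches at the start of the string
at_start = re.match("How", text)
not_at_start = re.match("are", text)   # "are" is not at position 0

# search() scans the whole string and returns the first match
first = re.search("are", text)

# findall() returns every non-overlapping match as a list of strings
all_hows = re.findall("How", text)

# split() cuts the string at each match of the pattern
parts = re.split(r"\. ", text)

# sub() replaces each match with the replacement string
replaced = re.sub("How", "What", text)

print(at_start is not None, not_at_start is None)
print(first.span())
print(all_hows)
print(parts)
print(replaced)
```

Note that match() and search() return a match object or None, so their results are usually tested with `is None` before use.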

Forming a regular expression

A regular expression can be formed by combining metacharacters, special sequences, and sets.

Metacharacters

Metacharacters are characters with a special, predefined meaning.

Metacharacter  Description                                                              Example
[]             Represents a set of characters.                                          "[a-z]"
\              Introduces a special sequence (or escapes a special character).          "\d"
.              Matches any single character (except a newline) at that position.        "Ja.v."
^              Matches the pattern at the beginning of the string.                      "^Java"
$              Matches the pattern at the end of the string.                            "point$"
*              Zero or more occurrences of the preceding element.                       "hello*"
+              One or more occurrences of the preceding element.                        "hello+"
{}             Exactly the specified number of occurrences of the preceding element.    "java{2}"
|              Matches either the pattern on the left or the pattern on the right.      "java|point"
()             Capture and group.
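As a rough illustration of the metacharacters in the table, the sketch below pairs each pattern with a string it should match. The sample strings and the names samples and results are assumptions chosen for this example.

```python
import re

# (pattern, sample string that the pattern is expected to match)
samples = [
    (r"^Java", "Javatpoint"),    # ^ anchors the match at the start
    (r"point$", "Javatpoint"),   # $ anchors the match at the end
    (r"Ja.a", "Java"),           # . stands for any single character
    (r"hello*", "hell"),         # * allows zero or more "o"
    (r"hello+", "hellooo"),      # + requires one or more "o"
    (r"java{2}", "javaa"),       # {2} requires exactly two "a" after "jav"
    (r"java|point", "point"),    # | accepts either alternative
    (r"(ab)+", "ababab"),        # () groups "ab" so + repeats the pair
]

results = {pattern: bool(re.search(pattern, s)) for pattern, s in samples}
for pattern, matched in results.items():
    print(pattern, matched)

# Quantifiers apply to the preceding element only: "hello+" needs an "o"
no_o = re.search(r"hello+", "hell")
print(no_o)
```

A point that is easy to miss: in "hello*" and "java{2}" the quantifier applies only to the last character, not to the whole word, unless the word is wrapped in a group.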

Special sequences

A special sequence consists of \ followed by a character.

Character  Description
\A         Matches if the specified characters are at the beginning of the string.
\b         Matches if the specified characters are at the beginning or the end of a word.
\B         Matches if the specified characters are present, but not at the beginning or the end of a word.
\d         Matches any digit [0-9].
\D         Matches any character that is not a digit.
\s         Matches any whitespace character.
\S         Matches any character that is not whitespace.
\w         Matches any word character (letters, digits, and the underscore).
\W         Matches any character that is not a word character.
\Z         Matches if the specified characters are at the end of the string.
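The special sequences can be tried out with a short sketch like the one below. The sample sentence and variable names are assumptions made for illustration; raw strings (r"...") are used so that Python does not interpret the backslashes itself.

```python
import re

text = "Order 66 shipped to cat-lovers at 10am"

digits = re.findall(r"\d+", text)            # runs of digits
words = re.findall(r"\w+", text)             # runs of word characters
fields = re.split(r"\s+", "a  b\tc")         # split on any whitespace run
whole_cat = re.search(r"\bcat\b", text)      # "cat" as a whole word
inside = re.search(r"\Bat\B", "category")    # "at" strictly inside a word
starts = re.search(r"\AOrder", text)         # only at the start of the string
ends = re.search(r"10am\Z", text)            # only at the end of the string

print(digits)
print(words)
print(fields)
print(whole_cat.group(), inside.span())
print(starts is not None, ends is not None)
```

Here \b treats the hyphen in "cat-lovers" as a word boundary, which is why "cat" is found as a whole word there.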

Sets

A set is a group of characters given inside square brackets. It carries a special meaning.

SN  Set         Description
1   [arn]       Matches if the string contains any of the specified characters (a, r, or n).
2   [a-n]       Matches if the string contains any character between a and n.
3   [^arn]      Matches if the string contains any character except a, r, and n.
4   [0123]      Matches if the string contains any of the specified digits.
5   [0-9]       Matches if the string contains any digit between 0 and 9.
6   [0-5][0-9]  Matches if the string contains any two-digit number between 00 and 59.
7   [a-zA-Z]    Matches if the string contains any letter of the alphabet (lower-case or upper-case).
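A small sketch of how the sets in the table behave in practice. The sample strings, including the deliberately invalid time "23:75", are assumptions made up for this example.

```python
import re

text = "Train leaves at 09:45, returns 23:75"

in_set = re.findall(r"[arn]", "Mango")          # only a, r, or n
not_in_set = re.findall(r"[^arn]", "ran!")      # anything except a, r, n
minutes = re.findall(r":([0-5][0-9])", text)    # two digits from 00 to 59
letters = re.findall(r"[a-zA-Z]+", "Py3 Regex") # runs of ASCII letters

print(in_set)
print(not_in_set)
print(minutes)
print(letters)
```

The pattern [0-5][0-9] rejects ":75" because the first digit must fall between 0 and 5, which makes it a convenient check for minute or second values.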

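The findall() function

This function returns a list containing all the matches of the pattern in the string. The matches are returned in the order in which they are found. If there is no match, an empty list is returned.

Consider the following example.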

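import re

# "text" is used instead of "str" to avoid shadowing the built-in str type
text = "How are you. How is everything"

# Collect every occurrence of "How" in the string
matches = re.findall("How", text)

print(matches)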

Output:

['How', 'How']
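The two remaining claims about findall(), that results come back in the order found and that a pattern with no match gives an empty list, can be sketched as follows. The patterns "Why" and r"\w+ing|you" are illustrative assumptions.

```python
import re

text = "How are you. How is everything"

# No occurrence of "Why", so the result is an empty list (not None)
no_hits = re.findall("Why", text)

# Matches are listed left to right as they appear in the string,
# regardless of the order of the alternatives in the pattern
in_order = re.findall(r"\w+ing|you", text)

print(no_hits)
print(in_order)
```

Because an empty list is falsy, the result can be tested directly with `if not no_hits:`.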

Match object

A match object contains information about the search and its result. If no match is found, None is returned instead.

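import re

# "text" is used instead of "str" to avoid shadowing the built-in str type
text = "How are you. How is everything"

# search() returns a match object for the first occurrence, or None
matches = re.search("How", text)

print(type(matches))

print(matches)  # matches is the match object returned by search()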

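Output (as printed by Python 3.6 and earlier; Python 3.7 and later show re.Match in place of _sre.SRE_Match):

<class '_sre.SRE_Match'>
<_sre.SRE_Match object; span=(0, 3), match='How'>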
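Since search() gives back None when nothing matches, calling a method on the result without checking it would fail. A minimal sketch of the usual guard, where the helper name first_match is an assumption introduced for this example:

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if there is no match."""
    m = re.search(pattern, text)
    if m is None:
        # search() found nothing, so there is no match object to query
        return None
    return m.group()

text = "How are you. How is everything"
print(first_match("How", text))
print(first_match("Why", text))
```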

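Match object methods

The following methods and attributes are associated with the match object.

  • span(): returns a tuple containing the start and end positions of the match.
  • string: an attribute holding the string that was passed to the function.
  • group(): returns the part of the string where the match was found.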

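import re

# "text" is used instead of "str" to avoid shadowing the built-in str type
text = "How are you. How is everything"

matches = re.search("How", text)

print(matches.span())    # (start, end) positions of the match

print(matches.group())   # the matched text

print(matches.string)    # the original string that was searched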

Output:

(0, 3)
How
How are you. How is everything
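The same methods also work with the capture groups introduced by the () metacharacter. The sketch below is an illustration under that assumption; the pattern r"How (\w+) (\w+)" and the variable names are made up for this example.

```python
import re

text = "How are you. How is everything"

# Two capture groups pick out the two words that follow "How"
m = re.search(r"How (\w+) (\w+)", text)

whole = m.group()        # the entire match
first_word = m.group(1)  # text captured by the first group
both = m.groups()        # all captured groups as a tuple
second_span = m.span(2)  # (start, end) of the second group

print(whole)
print(first_word)
print(both)
print(second_span)
print(m.start(), m.end())
```

Calling group() with no argument (or with 0) returns the whole match, while group(1), group(2), and so on return the individual captured pieces.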