Python Regex Cheat Sheet

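Regular expressions, or regex for short, are an important part of Python and of most other programming languages. They are used to search for, and optionally replace, text that fits a specified pattern. In a regular expression, a sequence of characters together forms the search pattern, also called the regex pattern. The hard part of regex is not learning or understanding it but remembering the syntax and how to build a pattern for a given requirement. This cheat sheet therefore collects the character classes, special characters, modifiers, sets and other constructs used in regular expressions.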

Basic characters:

Expression	Explanation
^	Matches the expression to its right at the start of the string. In multi-line mode it also matches right after each line break.
$	Matches the expression to its left at the end of the string or just before a trailing newline. In multi-line mode it also matches before each line break.
.	Matches any character except a newline
a	Matches exactly one character a
xy	Matches the string xy
a|b	Matches expression a or b. If a matches first, b is left untried.

Example:

Python3

import re

# ^x matches x only at the start of the string
print(re.search(r"^x", "xenon"))
# s$ matches s only at the end of the string
print(re.search(r"s$", "geeks"))

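Output:

<re.Match object; span=(0, 1), match='x'>
<re.Match object; span=(4, 5), match='s'>

Explanation:

First, import the regular expression module with import re.

Then, in the first example, we search for ^x in the word "xenon". The ^ character matches the expression to its right at the start of the string, so ^x looks for the character x at the beginning. Since xenon starts with x, the search finds a match and returns the matched text ('x') and its span (0, 1).

Similarly, in the second example, s$ looks for the character s at the end of the string. Since geeks ends with s, the search finds a match and returns the matched text ('s') and its span (4, 5).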
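The remaining basic characters from the table can be tried in the same way. The following is a small sketch; the sample strings are illustrative choices, not part of the original examples.

```python
import re

# "." matches any single character except a newline.
dot_match = re.search(r"x.n", "xenon")

# "|" tries the alternatives from left to right at each position.
alt_match = re.search(r"gas|xenon", "xenon is a gas")

# "$" also matches just before a trailing newline at the end of the string.
end_match = re.search(r"s$", "geeks\n")

print(dot_match.group())  # xen
print(alt_match.group())  # xenon
print(end_match.span())   # (4, 5)
```

Note that gas|xenon returns xenon here: the search scans from left to right, and xenon matches at position 0 before gas is ever reached.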

Quantifiers:

Expression	Explanation
+	Matches the expression to its left 1 or more times
*	Matches the expression to its left 0 or more times
?	Matches the expression to its left 0 or 1 times
{p}	Matches the expression to its left exactly p times
{p,q}	Matches the expression to its left from p to q times
{p,}	Matches the expression to its left p or more times
{,q}	Matches the expression to its left up to q times

Quantifiers are greedy by default: they match as much text as possible. Adding ? after a quantifier (+, *, ? or a {p,q} form) makes it non-greedy, so it matches as little text as possible. Write the braces without spaces, since Python treats a form such as {p, q} as literal text rather than as a quantifier.
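The difference between greedy and non-greedy matching is easiest to see side by side. This is a minimal sketch on an illustrative string that is not part of the original examples.

```python
import re

text = "<b>bold</b>"

# Greedy: .+ grabs as much as possible, so the match runs to the last ">".
greedy = re.search(r"<.+>", text)

# Non-greedy: .+? grabs as little as possible, so the match stops at the first ">".
lazy = re.search(r"<.+?>", text)

print(greedy.group())  # <b>bold</b>
print(lazy.group())    # <b>
```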

Example:

Python3

import re

# 9+ matches one or more consecutive 9s
print(re.search(r"9+", "289908"))
# \d{3} matches exactly three digits
print(re.search(r"\d{3}", "hello1234"))

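Output:

<re.Match object; span=(2, 4), match='99'>
<re.Match object; span=(5, 8), match='123'>

Explanation:

In the first example, 9+ searches for the digit 9 occurring one or more times. Since 289908 contains two consecutive 9s, the regex matches them and prints the match ('99') and its span (2, 4).

In the second example, \d{3} searches for exactly three digits. Since hello1234 contains digits, it matches the first three digits it encounters, 123, and leaves the 4 out because {3} matches exactly three. So it prints the match ('123') and its span (5, 8).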
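The range forms of the braces can be compared with re.fullmatch, which succeeds only if the whole string fits the pattern. The sketch below uses a made-up list of digit strings and a hypothetical helper named matching.

```python
import re

samples = ["7", "42", "123", "12345"]


def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]


two_to_three = matching(r"\d{2,3}")  # between 2 and 3 digits
at_least_two = matching(r"\d{2,}")   # 2 or more digits
at_most_two = matching(r"\d{,2}")    # 0 to 2 digits

# With a space inside the braces, the braces are read as literal text.
spaced = re.search(r"\d{2, 3}", "12345")

print(two_to_three)  # ['42', '123']
print(at_least_two)  # ['42', '123', '12345']
print(at_most_two)   # ['7', '42']
print(spaced)        # None
```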

Character classes:

Expression	Explanation
\w	Matches word characters, that is a-z, A-Z, 0-9 and underscore (_)
\W	Matches non-word characters, that is anything except a-z, A-Z, 0-9 and _
\d	Matches digits, from 0 to 9
\D	Matches any non-digit
\s	Matches whitespace characters, including space, \t, \n and \r
\S	Matches non-whitespace characters
\A	Matches the expression to its right at the absolute start of the string, in both single-line and multi-line mode
\Z	Matches the expression to its left at the absolute end of the string, in both single-line and multi-line mode
\n	Matches a newline character
\t	Matches a tab character
\b	Matches the empty string at a word boundary, that is at the start or end of a word
\B	Matches where \b does not, that is at a non-word boundary

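Example:

Python3

import re

# \s matches the first whitespace character
print(re.search(r"\s", "xenon is a gas"))
# \D+\d* matches one or more non-digits followed by zero or more digits
print(re.search(r"\D+\d*", "123geeks123"))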

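Output:

<re.Match object; span=(5, 6), match=' '>
<re.Match object; span=(3, 11), match='geeks123'>

Explanation:

In the first example, \s searches for a whitespace character and reports the first one it finds. Since "xenon is a gas" contains spaces, the search stops at the first space and prints the match (' ') and its span (5, 6).

In the second example, \D+\d* searches for one or more non-digit characters followed by zero or more digits. In 123geeks123, the substring geeks123 fits this description: one or more non-digits (geeks) followed by zero or more digits (123). So it prints the match ('geeks123') and its span (3, 11).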
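The anchors \b, \B, \A and \Z match positions rather than characters, which is easier to see in code. The following sketch uses illustrative strings that are not part of the original examples.

```python
import re

text = "cat concat category"

# \b...\b: only the standalone word "cat" has a boundary on both sides.
whole_words = re.findall(r"\bcat\b", text)

# \B: "cat" must be preceded by another word character, as in "concat".
inside_spans = [m.span() for m in re.finditer(r"\Bcat", text)]

lines = "one\ntwo"

# In multi-line mode ^ matches at every line start, \A only at the string start.
caret_hits = re.findall(r"^\w+", lines, flags=re.MULTILINE)
start_hits = re.findall(r"\A\w+", lines, flags=re.MULTILINE)
end_hits = re.findall(r"\w+\Z", lines, flags=re.MULTILINE)

print(whole_words)   # ['cat']
print(inside_spans)  # [(7, 10)]
print(caret_hits)    # ['one', 'two']
print(start_hits)    # ['one']
print(end_hits)      # ['two']
```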

Sets:

Expression	Explanation
[abc]	Matches either a, b, or c. It does not match abc.
[a-z]	Matches any lowercase letter from a to z
[A-Z]	Matches any uppercase letter from A to Z
[a\-p]	Matches a, -, or p. It matches - because \ escapes it.
[-z]	Matches - or z
[a-z0-9]	Matches characters from a to z or from 0 to 9
[(+*)]	Special characters become literal inside a set, so this matches (, +, *, or )
[^ab5]	A leading ^ negates the set. Here, it matches any character that is not a, b, or 5.
\[a\]	Matches [a] because both square brackets [ ] are escaped

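Example:

Python3

import re

# [^abc] matches the first character that is not a, b or c
print(re.search(r"[^abc]", "abcde"))
# [a-p] matches the first character in the range a to p
print(re.search(r"[a-p]", "xenon"))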

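Output:

<re.Match object; span=(3, 4), match='d'>
<re.Match object; span=(1, 2), match='e'>

Explanation:

In the first example, [^abc] searches for anything other than a, b and c, so the regex matches the first character that is not a, b or c. In abcde, the first such character is d, so the match is ('d') and its span is (3, 4).

In the second example, [a-p] searches for a character in the range a to p. In xenon, the first character in that range is e, so the match is ('e') and its span is (1, 2).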
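The table rows on escaped hyphens and literal special characters can be checked with re.findall. This is a small sketch on an illustrative string chosen for this purpose.

```python
import re

text = "a-p (x+y)*z"

# The escaped hyphen is a literal "-", not a range from a to p.
hyphen_set = re.findall(r"[a\-p]", text)

# Inside a set, (, +, * and ) lose their special meaning.
literal_specials = re.findall(r"[(+*)]", text)

# A leading ^ negates the set: everything except lowercase letters and spaces.
negated = re.findall(r"[^a-z ]", text)

print(hyphen_set)        # ['a', '-', 'p']
print(literal_specials)  # ['(', '+', ')', '*']
print(negated)           # ['-', '(', '+', ')', '*']
```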

Groups:

Expression	Explanation
( )	Matches the expression inside the parentheses and groups it so that it can be captured as required
(?#…)	A comment; its contents are ignored
(?P<name>AB)	Matches the expression AB, which can be retrieved by the group name
(?:A)	Matches the expression represented by A, but the group cannot be retrieved afterwards
(?P=group)	Matches the text matched by an earlier group named "group"

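Example:

Python3

import re

# (?:AB) is a non-capturing group, so groups() is empty
example = re.search(r"(?:AB)", "ACABC")
print(example)
print(example.groups())

# Two capturing groups separated by a comma and a space
result = re.search(r"(\w*), (\w*)", "geeks, best")
print(result.groups())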

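Output:

<re.Match object; span=(2, 4), match='AB'>
()
('geeks', 'best')

Explanation:

In the first example, (?:AB) searches for the expression AB and prints the match and its span. Since ACABC contains AB, it prints the match ('AB') and its span (2, 4), but as stated above, a non-capturing group cannot be retrieved afterwards. So printing the groups of the result shows an empty tuple.

In the second example, we capture two groups: one group of zero or more word characters, followed by a comma and a space, and then another group of zero or more word characters. In "geeks, best", geeks and best are captured as the first and second groups. So printing the groups gives ('geeks', 'best').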
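Named groups and named backreferences from the table work along the same lines. In this sketch the group names first, second and word, and the second sample sentence, are illustrative choices.

```python
import re

# (?P<name>...) captures a group that can be retrieved by name.
named = re.search(r"(?P<first>\w+), (?P<second>\w+)", "geeks, best")

# (?P=word) must match the same text that the group "word" matched.
repeated = re.search(r"(?P<word>\w+) (?P=word)", "it is is fine")

print(named.group("first"))  # geeks
print(named.groupdict())     # {'first': 'geeks', 'second': 'best'}
print(repeated.group())      # is is
```

Named groups are still numbered, so named.group(1) returns the same text as named.group("first").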

Assertions:

Expression	Explanation
A(?=B)	Matches the expression A only if it is followed by B (positive lookahead assertion)
A(?!B)	Matches the expression A only if it is not followed by B (negative lookahead assertion)
(?<=B)A	Matches the expression A only if B is immediately to its left (positive lookbehind assertion)
(?<!B)A	Matches the expression A only if B is not immediately to its left (negative lookbehind assertion)
(?(id/name)yes|no)	If-else conditional: tries the yes pattern if the group with the given id or name matched, otherwise the optional no pattern

Example:

Python3

import re

# z(?=a) matches a z that is followed by a
print(re.search(r"z(?=a)", "pizza"))
# z(?!a) matches a z that is not followed by a
print(re.search(r"z(?!a)", "pizza"))

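Output:

<re.Match object; span=(3, 4), match='z'>
<re.Match object; span=(2, 3), match='z'>

Explanation:

In the first example, z(?=a) searches for a z that is followed by an a. In pizza, the second z is immediately followed by a, so there is a match. The regex prints the match ('z'), which does not include the a, and its span (3, 4).

In the second example, z(?!a) searches for a z that is not followed by an a. In pizza, the first z is followed by another z rather than by a, so there is a match. The regex prints the match ('z') and its span (2, 3).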
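Lookbehind assertions and the conditional construct are not covered by the pizza example, so here is a hedged sketch of both. The price string is made up for illustration; the address pattern follows the example given for the conditional syntax in the Python re documentation.

```python
import re

prices = "cost: $42, id: 17"

# (?<=\$) keeps only digits that come directly after a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", prices)

# (?<!\$) rejects digits after a dollar sign; \b stops a match starting mid-number.
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

# (?(1)>|$): if group 1 matched "<", require a closing ">", otherwise the string end.
address = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")
candidates = ["<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"]
accepted = [c for c in candidates if address.match(c)]

print(after_dollar)      # ['42']
print(not_after_dollar)  # ['17']
print(accepted)          # ['<user@host.com>', 'user@host.com']
```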

Flags:

Flag	Explanation
a	Matches ASCII only (re.ASCII)
i	Ignore case (re.IGNORECASE)
L	Locale-dependent character classes (re.LOCALE)
m	^ and $ match at the start and end of each line (re.MULTILINE)
s	Makes . match every character, including newline (re.DOTALL)
u	Matches Unicode character classes (re.UNICODE)
x	Allows whitespace and comments in the pattern (re.VERBOSE)

Example:

Python3

import re

exp = """hello there
I am from
Geeks for Geeks"""

# IGNORECASE: "and" also matches "And"
print(re.search(r"and", "Sun And Moon", flags=re.IGNORECASE))
# MULTILINE: ^ matches at the start of every line
print(re.findall(r"^\w", exp, flags=re.MULTILINE))

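Output:

<re.Match object; span=(4, 7), match='And'>
['h', 'I', 'G']

Explanation:

In the first example, the IGNORECASE flag makes the search disregard case, whether upper or lower, so the pattern and matches And in the string. So it prints the match ('And') and its span (4, 7).

In the second example, the MULTILINE flag makes ^ match at the start of every line, so ^\w matches whenever a line begins with a word character. Every line of the multi-line string (hello there, I am from, Geeks for Geeks) begins with a word character, so each line produces a match and the matches are printed as a list (['h', 'I', 'G']).

Note: with the MULTILINE flag we use re.findall here because there is one match per line, and re.search would return only the first.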
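The other flags in the table can be passed in the same way, combined with |, or written inline at the start of the pattern. The following sketch uses made-up strings and a made-up phone-number pattern for illustration.

```python
import re

text = "first\nsecond"

# Without DOTALL, "." does not match the newline between the two words.
no_flag = re.search(r"first.second", text)
with_dotall = re.search(r"first.second", text, flags=re.DOTALL)

# An inline flag such as (?i) has the same effect as flags=re.IGNORECASE.
inline = re.search(r"(?i)SECOND", text)

# VERBOSE ignores whitespace in the pattern and allows comments.
phone = re.compile(
    r"""
    \d{3}   # first block of digits
    -       # separator
    \d{4}   # second block of digits
    """,
    flags=re.VERBOSE,
)
phone_match = phone.search("call 555-1234 now")

# Flags are combined with the | operator.
combined = re.findall(r"^geeks", "Geeks\ngeeks", flags=re.IGNORECASE | re.MULTILINE)

print(no_flag)              # None
print(with_dotall.span())   # (0, 12)
print(inline.group())       # second
print(phone_match.group())  # 555-1234
print(combined)             # ['Geeks', 'geeks']
```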
