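Perl | Regex Cheat Sheet
A regular expression, or regex, is an important part of Perl programming. It is used to search text for a specified pattern, where a sequence of characters together forms the search pattern. Anyone learning regex will occasionally need a quick reminder of the constructs they do not use often. This cheat sheet collects the classes, characters, modifiers, and other constructs used in Perl regular expressions for that purpose.
Character Classes
A character class matches one character out of a set. Classes let you match any character from a range when you do not know in advance exactly which character will appear.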
Classes | Explanation |
---|---|
[abc.] | Matches exactly one of the listed characters, i.e. 'a', 'b', 'c', or '.' |
[a-j] | Matches any one character from a to j. |
[a-z] | Matches any one lowercase character from a to z. |
[^az] | Matches any one character except a and z. |
\w | Matches a word character, i.e. [a-zA-Z0-9_] |
\d | Matches a digit, i.e. [0-9] |
[ab][^cde] | Matches 'a' or 'b' followed by any one character other than c, d, or e. |
\s | Matches a whitespace character, i.e. space, form feed, tab, newline, or carriage return. |
\W | Complement of \w |
\D | Complement of \d |
\S | Complement of \s |
Example:
# Perl program to demonstrate
# a character class
use strict;
use warnings;
# Actual string
my $str = "45char";
# Prints "Match Found" if $str contains
# at least one word character (\w)
if ($str =~ /\w/)
{
    print "Match Found\n";
}
# Otherwise prints "Match Not Found"
else
{
    print "Match Not Found\n";
}
Output:
Match Found
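The other classes in the table can be tried the same way. The following is a minimal sketch; the helper name `classify` and the sample strings are made up for illustration and are not part of the original example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports the first description that fits the string
sub classify {
    my ($s) = @_;
    return "digits only"         if $s =~ /\A\d+\z/;  # \d : every character is a digit
    return "word chars only"     if $s =~ /\A\w+\z/;  # \w : letters, digits, underscore
    return "contains whitespace" if $s =~ /\s/;       # \s : at least one whitespace character
    return "other";
}

for my $s ("2024", "45char", "a z", "a-z") {
    printf "%-10s => %s\n", "'$s'", classify($s);
}
```

Note that "a-z" falls through to "other" because the hyphen is neither a word character nor whitespace.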
Anchors
Anchors do not match any characters at all. Instead, they match a particular position before, after, or between characters.
Anchors | Explanation |
---|---|
^ | Matches at the beginning of the string (or of each line with the m modifier). |
$ | Matches at the end of the string, or just before a final newline. |
\b | Matches at a word boundary, i.e. between a \w character and a \W character. |
\A | Matches only at the beginning of the string. |
\Z | Matches at the end of the string or just before a final newline. |
\z | Matches only at the very end of the string. |
\G | Matches at the position where the previous g match left off, i.e. at pos(). |
\p{...} | Unicode character class such as IsLower or IsAlpha (a character class, not an anchor). |
\P{...} | Complement of the Unicode character class. |
[:class:] | POSIX character class such as digit, lower, or ascii; used inside brackets, e.g. [[:digit:]]. |
Example:
# Perl program to demonstrate
# a POSIX character class from the table above
use strict;
use warnings;
# Actual string
my $str = "55";
# Prints "Match Found" if $str contains
# at least one alphabetic character
if ($str =~ /[[:alpha:]]/)
{
    print "Match Found\n";
}
# Otherwise prints "Match Not Found"
else
{
    print "Match Not Found\n";
}
Output:
Match Not Found
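The position anchors themselves can be illustrated with a short sketch. The sample string, including its trailing newline, is an assumption chosen to show the difference between $, \Z, and \z.

```perl
use strict;
use warnings;

my $str = "perl regex\n";   # sample text; note the trailing newline

my $starts  = $str =~ /^perl/     ? 1 : 0;  # ^  : beginning of the string
my $ends    = $str =~ /regex$/    ? 1 : 0;  # $  : end, or just before a final newline
my $ends_Z  = $str =~ /regex\Z/   ? 1 : 0;  # \Z : also tolerates a final newline
my $ends_z  = $str =~ /regex\z/   ? 1 : 0;  # \z : very end only, so the newline blocks it
my $word    = $str =~ /\bregex\b/ ? 1 : 0;  # \b : "regex" stands as a whole word
my $partial = $str =~ /\breg\b/   ? 1 : 0;  # "reg" is followed by "e", so no boundary there

print "starts=$starts ends=$ends Z=$ends_Z z=$ends_z word=$word partial=$partial\n";
```

Only the \z and \breg\b checks fail here; everything else matches.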
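Metacharacters
Metacharacters are characters with a special meaning in a Perl regular expression. To match a metacharacter literally, escape it with a backslash.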
Characters | Explanation |
---|---|
^ | Matches the beginning of the string. |
$ | Matches the end of the string. |
. | Matches any character except newline. |
* | Matches 0 or more times. |
+ | Matches 1 or more times. |
? | Matches 0 or 1 time. |
() | Used for grouping. |
\ | Used to quote (escape) special characters. |
[] | Used for a set of characters. |
{} | Used as a repetition modifier. |
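As a rough illustration of why escaping matters, the sketch below compares an unescaped and an escaped dot, and shows \Q...\E, which escapes every metacharacter in an interpolated string. The sample strings are invented for the example.

```perl
use strict;
use warnings;

my $price = "Total: 3.50 (USD)";   # sample text

# Unescaped '.' matches any character, so "3x50" matches too
my $loose  = "3x50" =~ /3.50/  ? 1 : 0;
# Escaped '\.' matches only a literal dot
my $strict = "3x50" =~ /3\.50/ ? 1 : 0;
# Parentheses must be escaped to be matched literally
my $paren  = $price =~ /\(USD\)/ ? 1 : 0;
# \Q...\E quotes all metacharacters in the interpolated variable
my $literal = "3.50 (USD)";
my $quoted  = $price =~ /\Q$literal\E/ ? 1 : 0;

print "loose=$loose strict=$strict paren=$paren quoted=$quoted\n";
```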
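Quantifiers
Quantifiers specify how many times the preceding character or group may occur. The three basic quantifiers are:
- '?' matches 0 or 1 occurrence of the character.
- '+' matches 1 or more occurrences of the character.
- '*' matches 0 or more occurrences of the character.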
Using Quantifiers | Explanation |
---|---|
a? | Checks whether 'a' occurs 0 or 1 time. |
a+ | Checks whether 'a' occurs 1 or more times. |
a* | Checks whether 'a' occurs 0 or more times. |
a{2,6} | Checks whether 'a' occurs 2 to 6 times. |
a{2,} | Checks whether 'a' occurs 2 or more times. |
a{2} | Checks whether 'a' occurs exactly 2 times. |
Example:
# Perl program to demonstrate
# use of quantifiers in regex
use strict;
use warnings;
# Actual string
my $str = "color";
# Prints "Match Found" if $str matches
# with the optional 'u' (quantifier ?)
if ($str =~ /colou?r/)
{
    print "Match Found\n";
}
# Otherwise prints "Match Not Found"
else
{
    print "Match Not Found\n";
}
Output:
Match Found
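The counted forms can be checked in the same spirit. In this sketch the patterns are anchored with \A and \z so that the whole string must consist of the counted run of 'a'; the `%pattern` hash and the test strings are illustrative assumptions.

```perl
use strict;
use warnings;

# Anchored so the entire string must be the counted run of 'a'
my %pattern = (
    'a{2}'   => qr/\Aa{2}\z/,    # exactly 2
    'a{2,}'  => qr/\Aa{2,}\z/,   # 2 or more
    'a{2,6}' => qr/\Aa{2,6}\z/,  # between 2 and 6
);

for my $s ("a", "aa", "aaaaaa", "aaaaaaa") {
    my @hits = grep { $s =~ $pattern{$_} } sort keys %pattern;
    print length($s), " x 'a' matches: ", (@hits ? "@hits" : "none"), "\n";
}
```

The counts are written without spaces inside the braces, which works on every Perl version.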
Modifiers
Modifiers | Explanation |
---|---|
g | Matches globally, i.e. finds (or, with s///, replaces) all occurrences. |
gc | Keeps the current match position after a g match fails, so the search can continue from there. |
s | Treats the string as a single line, so '.' also matches newline. |
i | Turns off case sensitivity. |
x | Ignores whitespace and comments inside the pattern. |
(?#text) | Adds a comment inside the pattern. |
(?:pattern) | Groups the pattern without capturing it. |
(?|pattern) | Branch reset: capture groups are numbered from the same point in each alternative. |
(?=pattern) | Positive lookahead assertion. |
(?!pattern) | Negative lookahead assertion. |
(?<=pattern) | Positive lookbehind assertion. |
(?<!pattern) | Negative lookbehind assertion. |
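A few of these constructs in use, as a hedged sketch: the flags are written after the closing delimiter without a backslash, and the sample strings are invented for illustration.

```perl
use strict;
use warnings;

my $text = "Color, colour, COLOR";   # sample text

# i ignores case; g in list context returns every match
my @all = $text =~ /colou?r/gi;

# g on a substitution replaces every occurrence, not only the first
(my $us = $text) =~ s/colour/color/gi;

# x allows whitespace and comments inside the pattern
my ($y, $m, $d) = "2024-05-17" =~ /
    (\d{4}) -   # year
    (\d{2}) -   # month
    (\d{2})     # day
/x;

# Lookahead and lookbehind assert context without consuming it
my ($cm)  = "42kg 17cm"     =~ /(\d+)(?=cm)/;     # digits followed by "cm"
my ($eur) = "USD42 EUR17"   =~ /(?<=EUR)(\d+)/;   # digits preceded by "EUR"
my ($rest) = "foobar foobaz" =~ /foo(?!bar)(\w+)/; # "foo" not followed by "bar"

print "all=@all\nus=$us\ndate=$y/$m/$d\ncm=$cm eur=$eur rest=$rest\n";
```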
White Space Modifiers
Escape | Explanation |
---|---|
\t | Tab character. |
\r | Carriage return character. |
\n | Newline character. |
\h | Matches horizontal white space. |
\v | Matches vertical white space. |
\L | Lowercases the following characters until \E. |
\U | Uppercases the following characters until \E. |
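The sketch below shows these escapes in double-quoted strings and in patterns. It assumes Perl 5.10 or later, where \h and \v are available; the variable names and sample strings are illustrative only.

```perl
use strict;
use warnings;

my $name = "perl";

# \U and \L change case inside an interpolated string until \E
my $upper = "\U$name\E rocks";     # "PERL rocks"
my $lower = "\LSHOUT\E quietly";   # "shout quietly"

# \h matches horizontal whitespace (space, tab) but not a newline
my @fields = split /\h+/, "a \t b\nc";

# \v matches vertical whitespace such as \n and \r
my @lines = split /\v+/, "one\r\ntwo\nthree";

print "$upper\n$lower\n";
print scalar(@fields), " fields, ", scalar(@lines), " lines\n";
```

Because \h does not match the newline, "b\nc" stays together as a single field in the first split.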
Quantifier Modifiers
Maximal (greedy) | Minimal (non-greedy) | Explanation |
---|---|---|
? | ?? | Can occur 0 or 1 time. |
+ | +? | Can occur 1 or more times. |
* | *? | Can occur 0 or more times. |
{3} | {3}? | Must match exactly 3 times. |
{3,} | {3,}? | Must match at least 3 times. |
{3,7} | {3,7}? | Must match at least 3 times but not more than 7 times. |
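The difference between the two columns is how much text the quantifier takes when several lengths would match: the maximal form takes as much as possible, the minimal form as little as possible. A small sketch, with sample strings chosen for illustration:

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";   # sample text

# Greedy: .+ runs to the last '>' in the string
my ($greedy) = $html =~ /<(.+)>/;    # "b>bold</b> and <i>italic</i"
# Non-greedy: .+? stops at the first '>'
my ($lazy)   = $html =~ /<(.+?)>/;   # "b"

# The same applies to counted quantifiers
my ($max) = "aaaaaaaaa" =~ /(a{3,7})/;    # seven a's
my ($min) = "aaaaaaaaa" =~ /(a{3,7}?)/;   # three a's

print "greedy=$greedy\nlazy=$lazy\nmax=$max min=$min\n";
```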
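Grouping and Capturing
The grouping construct (...) creates a capture group, also called a capture buffer. Inside the regular expression a group is referred to as "\1", and outside it as "$1". The captured groups can also be obtained by assigning the match to variables in list context.
Construct | Explanation |
---|---|
(...) | Used for grouping and capturing. |
\1, \2, \3 | Backreferences: refer to the captured groups from within the same regex. |
$1, $2, $3 | Variables holding the captured text after a successful match. |
(?:...) | Groups without capturing (sets neither $1 nor \1). |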
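A brief sketch of each form follows; the date string and the repeated-word sentence are made-up inputs for illustration.

```perl
use strict;
use warnings;

my $date = "2024-05-17";   # sample text

# List-context assignment returns the captured groups
my ($year, $month, $day) = $date =~ /(\d{4})-(\d{2})-(\d{2})/;

# After a successful match the same groups are in $1, $2, $3
my $joined = $date =~ /(\d{4})-(\d{2})-(\d{2})/ ? "$3/$2/$1" : "";

# \1 inside the regex refers back to the first group: finds a repeated word
my $repeated = "this is is a test" =~ /\b(\w+) \1\b/ ? $1 : "";

# (?:...) groups without capturing, so (bar) is group 1
my ($only) = "foobar" =~ /(?:foo)(bar)/;

print "$year $month $day\n$joined\nrepeated=$repeated only=$only\n";
```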