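Perl | Anchors in Regex
Anchors in Perl regex do not match any characters at all. Instead, they match a particular position before, after, or between characters. They are used to check not the contents of the string but its positional boundaries.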
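The following are the anchors in Perl regex:
'^', '$', '\b', '\A', '\Z', '\z', '\G'
This article also covers '\p{....}', '\P{....}' and '[:class:]'. Strictly speaking these are character classes rather than anchors, because each of them consumes one character.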
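^ or \A : Matches the pattern at the beginning of the string. With the /m flag, '^' also matches right after every embedded newline, whereas '\A' always means the start of the whole string.
Syntax: (/^pattern/, /\Apattern/).
Example: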
#!/usr/bin/perl
use strict;
use warnings;
my $str = "guardians of the galaxy";
# prints 'guardians' because the
# string starts with 'guardians'
print "$&\n" if($str =~ /^guardians/);
# prints 'gua' because the
# string starts with 'gua'
print "$&\n" if($str =~ /\Agua/);
# prints nothing because 'ans'
# does not occur at position 0
print "$&" if($str =~ /^ans/);
Output:
guardians
gua
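The difference between '^' and '\A' only shows up with multi-line input and the /m flag. The following is a minimal sketch of that difference; the two-line string and the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-line input; the second line starts with 'of'.
my $text = "guardians\nof the galaxy";

# Without /m, ^ only matches at the very start of the string.
my $caret_plain = ($text =~ /^of/)   ? 1 : 0;
# With /m, ^ also matches right after an embedded newline.
my $caret_multi = ($text =~ /^of/m)  ? 1 : 0;
# \A ignores /m and still means "start of the whole string".
my $abs_multi   = ($text =~ /\Aof/m) ? 1 : 0;

print "^ without /m: $caret_plain\n";   # 0
print "^ with /m: $caret_multi\n";      # 1
print "\\A with /m: $abs_multi\n";      # 0
```

When a pattern must be tied to the start of the whole input regardless of flags, '\A' is the safer choice.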
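$ or \z : Matches the pattern at the end of the string. '$' also matches just before a newline that ends the string, whereas '\z' matches only at the very end.
Syntax: (/pattern$/, /pattern\z/).
Example: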
#!/usr/bin/perl
use strict;
use warnings;
my $str = "guardians of the galaxy";
# prints nothing because the string
# does not end with 'guardians'
print "$&\n" if($str =~ /guardians$/);
# prints 'y' because the
# string ends with 'y'
print "$&\n" if($str =~ /y\z/);
# prints 'galaxy' because the
# string ends with 'galaxy'
print "$&" if($str =~ /galaxy$/);
Output:
y
galaxy
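'$' and '\z' behave differently when the string ends with a newline, which is common for lines read from a file before chomp. A small sketch, assuming a hypothetical input line with a trailing newline:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical line as it might arrive from a file, newline included.
my $line = "galaxy\n";

# $ matches at the very end or just before a final newline.
my $dollar_ok = ($line =~ /galaxy$/)  ? 1 : 0;
# \z matches only at the very end, so the newline gets in the way.
my $z_ok      = ($line =~ /galaxy\z/) ? 1 : 0;

print "\$ matches: $dollar_ok\n";    # 1
print "\\z matches: $z_ok\n";        # 0
```

For strict validation, where a stray trailing newline must be rejected, '\z' is usually the better fit.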
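\b : Matches a word boundary, that is, a position with a word character (\w) on one side and a non-word character (\W) on the other. The beginning and the end of the string also count as boundaries when the adjacent character is a word character.
Syntax: (/\bpattern\b/).
Example: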
#!/usr/bin/perl
use strict;
use warnings;
my $str = "guardians-of-the-galaxy";
# prints '-galaxy': there is a boundary between
# 'e' and '-', and another after the final 'y'.
print "$&\n" if($str =~ /\b-galaxy\b/);
# prints 'guardians-': there is a boundary at the
# start of the string and between '-' and 'o'.
print "$&\n" if($str =~ /\bguardians-\b/);
# prints nothing: 'e' is preceded by the word
# character 'h', so there is no boundary before it.
print "$&" if($str =~ /\be-galaxy\b/);
# prints 'guardians-of-the-galaxy' as it is
# bounded by the beginning and end of the string.
print "$&" if($str =~ /\bguardians-of-the-galaxy\b/);
Output:
-galaxy
guardians-
guardians-of-the-galaxy
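A common use of '\b' is matching whole words only. The sketch below counts occurrences of 'the' with and without word boundaries; the sample sentence is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample sentence for illustration.
my $text = "the other theme of the galaxy";

# Without \b, 'the' is also found inside 'other' and 'theme'.
my @anywhere = $text =~ /the/g;
# With \b on both sides, only the standalone word 'the' matches.
my @whole    = $text =~ /\bthe\b/g;

print "anywhere: ", scalar(@anywhere), "\n";    # 4
print "whole words: ", scalar(@whole), "\n";    # 2
```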
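\Z : Matches at the end of the string, or just before a newline that ends the string. Both '\z' and '\Z' differ from '$' in that they are not affected by the /m "multiline" flag, which allows '$' to match at the end of any line.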
#!/usr/bin/perl
# Prints one: \z matches at the very end of the string
print "one\n" if ('galaxy' =~ m/galaxy\z/);
# Prints two: \Z also matches at the very end
print "two\n" if('galaxy' =~ m/galaxy\Z/);
# Prints three: \Z also matches just
# before the trailing newline
print "three\n" if ("galaxy\n" =~ m/galaxy\Z/);
# Prints four: the pattern consumes the
# newline, so \z is at the very end
print "four\n" if ("galaxy\n" =~ m/galaxy\n\z/);
# Prints five: once the newline is consumed,
# \Z matches at the very end as well
print "five\n" if("galaxy\n" =~ m/galaxy\n\Z/);
# Prints nothing: a newline follows 'galaxy',
# so \z is not at the very end of the string
print "six" if("galaxy\n" =~ m/galaxy\z/);
Output:
one
two
three
four
five
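The effect of the /m flag on these end anchors can be checked directly. This is a small sketch on a hypothetical three-line string; the hash keys are just labels chosen for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical three-line input ending with a newline.
my $text = "guardians\nof the\ngalaxy\n";

my %result = (
    # $ without /m: 'guardians' is not at the end of the string.
    dollar_plain => ($text =~ /guardians$/)    ? 1 : 0,
    # $ with /m: the end of the first line counts as well.
    dollar_multi => ($text =~ /guardians$/m)   ? 1 : 0,
    # \Z is not affected by /m.
    bigz_multi   => ($text =~ /guardians\Z/m)  ? 1 : 0,
    # \Z still tolerates the single trailing newline.
    bigz_last    => ($text =~ /galaxy\Z/m)     ? 1 : 0,
    # \z does not tolerate it.
    smallz_last  => ($text =~ /galaxy\z/m)     ? 1 : 0,
);

print "$_: $result{$_}\n" for sort keys %result;
```

Only 'dollar_multi' and 'bigz_last' come out as 1; the other three are 0.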
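\G : Matches at the position where the previous /g match on the same string ended (the value of pos()), or at the beginning of the string if there has been no such match. For example, a pattern that is 5 characters long first matches the first 5 positions; if that succeeds, the next attempt is forced to start at the 6th position, and so on until the pattern fails or the string ends. With the /c flag, a failed match leaves the position where it was instead of resetting it to the start.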
#!/usr/bin/perl
use strict;
use warnings;
my $str = "galaxy8222as";
# prints pairs of letters until the pattern
# fails at '8'; /c keeps the position there
print "one: $& " while($str =~ /\G[a-z]{2}/gc);
print "\n";
# prints pairs of digits until the pattern fails at 'a'
print "two: $& " while("1122a44" =~ /\G\d\d/gc);
print "\n";
# The literal is a separate string with its own
# position, so the search starts from the beginning
print "three: $& " while("galaxy8222as" =~ /\G\w{2}/gc);
# Resumes in $str at the position where "one" failed.
# '82' is not a pair of letters, so this prints nothing
# and the position stays where it was.
print "four: $& " while($str =~ /\G[a-z]{2}/gc);
print "\n";
# Resumes from that same position and prints the rest
print "five: $& " while($str =~ /\G\w{2}/gc);
Output:
one: ga one: la one: xy
two: 11 two: 22
three: ga three: la three: xy three: 82 three: 22 three: as
five: 82 five: 22 five: as
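Because '\G' with /gc resumes exactly where the last match stopped, it is often used to build simple tokenizers. The following is a rough sketch of that idea, not a complete lexer; the input string and the token labels (IDENT, NUM, OP) are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input to split into tokens.
my $input = "x1 = 42 + y";
my @tokens;

# Start scanning at offset 0; each successful /gc match advances pos().
pos($input) = 0;
while (pos($input) < length $input) {
    if    ($input =~ /\G\s+/gc)              { next; }                        # skip whitespace
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc)   { push @tokens, "IDENT($1)"; }
    elsif ($input =~ /\G(\d+)/gc)            { push @tokens, "NUM($1)"; }
    elsif ($input =~ /\G([=+])/gc)           { push @tokens, "OP($1)"; }
    else  { die "unexpected character at offset " . pos($input) . "\n"; }
}

print join(" ", @tokens), "\n";
# IDENT(x1) OP(=) NUM(42) OP(+) IDENT(y)
```

The /c flag matters here: when one alternative fails, the position is kept, so the next alternative is tried at the same offset.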
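\p{...} and \P{...} : '\p{...}' matches one character that belongs to a Unicode character class (property) such as IsLower, IsAlpha, etc., while '\P{....}' is the complement and matches one character that does not belong to that class.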
#!/usr/bin/perl
# \p{IsAlpha} matches alphabetic characters only
print "$&" while("guardians!@#%^*123" =~ /\p{IsAlpha}/gc);
print "\n";
# \p{IsAlnum} matches letters and digits
print "$&" while("guardians!@#%^&*123" =~ /\p{IsAlnum}/gc);
print "\n";
# L is the Letter property; \P{L} is its
# complement and matches every non-letter
print "$&" while("guardians!@#%^&*123" =~ /\P{L}/gc);
print "\n";
# \p{L} matches every letter
print "$&" while("guardians!@#%^&*123" =~ /\p{L}/gc);
Output:
guardians
guardians123
!@#%^&*123
guardians
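Unicode properties become useful once the text goes beyond ASCII. The sketch below counts letters in a hypothetical string that mixes Latin and Greek letters; the Greek letters are written as \x{...} escapes and only counts are printed, so no output encoding setup is needed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: 'abc', three Greek letters (alpha, beta, gamma), '123'.
my $text = "abc \x{3B1}\x{3B2}\x{3B3} 123";

# The "() =" idiom forces list context so the match count is returned.
my $ascii_letters = () = $text =~ /[a-z]/g;      # ASCII range only
my $all_letters   = () = $text =~ /\p{L}/g;      # any Unicode letter
my $greek_letters = () = $text =~ /\p{Greek}/g;  # Greek script only
my $non_letters   = () = $text =~ /\P{L}/g;      # spaces and digits

print "ascii letters: $ascii_letters\n";   # 3
print "all letters: $all_letters\n";       # 6
print "greek letters: $greek_letters\n";   # 3
print "non-letters: $non_letters\n";       # 5
```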
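[:class:] : POSIX character classes such as digit, lower, ascii, etc. They are valid only inside a bracketed character class, which is why the syntax uses double brackets.
Syntax: (/[[:class:]]/)
The POSIX character classes are as follows:
alpha, alnum, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit, word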
#!/usr/bin/perl
# prints only alphabetic characters
print "$&" while('guardians!@#%^&*123' =~ /[[:alpha:]]/gc);
print "\n";
# prints letters and digits
print "$&" while("guardians!@#%^&*123" =~ /[[:alnum:]]/gc);
print "\n";
# prints only digits
print "$&" while("guardians!@#%^&*123" =~ /[[:digit:]]/gc);
print "\n";
# prints every visible character, which
# skips the space " " and the newline
print "$&" while("guardians!@#%^& 123\n" =~ /[[:graph:]]/gc);
print "\n";
# prints a 1 for every space " " or
# horizontal tab (one space here)
print "1" while("guardians!@#%^& 123\n" =~ /[[:blank:]]/gc);
print "\n";
# prints lowercase letters
print "$&" while("Guardians!@#%^& 123\n" =~ /[[:lower:]]/gc);
print "\n";
# prints all ASCII characters, including
# the space and the trailing newline
print "$&" while("guardians!@#%^& 123\n" =~ /[[:ascii:]]/gc);
Output:
guardians
guardians123
123
guardians!@#%^&123
1
uardians
guardians!@#%^& 123
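POSIX classes can be combined with other characters inside the same brackets and negated with '^' after the first colon. The sketch below pairs them with the '\A' and '\z' anchors to validate an identifier; the helper name is_identifier and the sample strings are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: a letter or underscore first, then letters,
# digits or underscores, anchored to the whole string.
sub is_identifier {
    my ($s) = @_;
    return $s =~ /\A[[:alpha:]_][[:alnum:]_]*\z/ ? 1 : 0;
}

# [[:^digit:]] is the negated class: every character that is not a digit.
(my $digits_only = "guardians 123") =~ s/[[:^digit:]]//g;

print is_identifier("star_lord7"), "\n";   # 1
print is_identifier("7star"), "\n";        # 0
print "$digits_only\n";                    # 123
```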