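A regular expression is a special sequence of characters that helps you match or find other strings, or sets of strings, using a specialized syntax held in a pattern.
A regular expression literal is a pattern between slashes, or between arbitrary delimiters preceded by %r, as follows:
Syntax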
/pattern/
/pattern/im # option can be specified
%r!/usr/local! # general delimited regular expression
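Example
#!/usr/bin/ruby

line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"

# =~ returns the index of the first match, or nil when there is none
if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Dogs(.*)/
  puts "Line2 contains Dogs"
end
This will produce the following result:
Line1 contains Cats
Line2 contains Dogs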
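Regular-Expression Modifiers
Regular expression literals may include an optional modifier to control various aspects of matching. The modifier is specified after the second slash character, as shown previously, and may be represented by one of these characters: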
Sr.No.  Modifier    Description
1       i           Ignores case when matching text.
2       o           Performs #{} interpolations only once, the first time the regexp literal is evaluated.
3       x           Ignores whitespace and allows comments in regular expressions.
4       m           Treats a newline as a normal character, so that . matches newlines too.
5       u, e, s, n  Interprets the regexp as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding.
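The i, m, and x modifiers are the ones most often seen in practice. The following is a minimal sketch of their effect; the sample strings and the variable names (date, year, month, day) are made up for illustration.

```ruby
# i: case-insensitive matching
puts "RUBY".match?(/ruby/i)                      # true

# m: let . match a newline as well
puts "line one\nline two".match?(/one.line/)     # false
puts "line one\nline two".match?(/one.line/m)    # true

# x: whitespace and comments inside the pattern are ignored
date = /
  (\d{4}) -   # year
  (\d{2}) -   # month
  (\d{2})     # day
/x
year, month, day = "2024-01-31".match(date).captures
puts "#{year} / #{month} / #{day}"               # 2024 / 01 / 31
```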
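Like string literals delimited with %Q, Ruby allows you to begin your regular expressions with %r followed by a delimiter of your choice. This is useful when the pattern you are describing contains many forward slash characters that you do not want to escape: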
# Following matches a single slash character, no escape required
%r|/|
# Flag characters are allowed with this syntax, too
%r[(.*)>]i
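To compare the two literal forms side by side, here is a small sketch. The path and the variable names are assumptions chosen only for the example.

```ruby
path = "/usr/local/bin/ruby"

# With slash delimiters, every / inside the pattern must be escaped.
escaped = /\/usr\/local\//

# With %r and another delimiter, the slashes can be written as-is.
unescaped = %r{/usr/local/}

puts path.match?(escaped)      # true
puts path.match?(unescaped)    # true

# Flags follow the closing delimiter, just as they follow the second slash.
insensitive = %r{/usr/local/}i
puts "/USR/LOCAL/lib".match?(insensitive)   # true
```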
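Regular-Expression Patterns
Except for control characters, (+ ? . * ^ $ ( ) [ ] { } | \), all characters match themselves. You can escape a control character by preceding it with a backslash.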
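As a quick illustration of escaping, the sketch below contrasts an unescaped and an escaped dot. It also uses Regexp.escape, a standard Ruby method that this tutorial does not otherwise cover, to escape text held in a variable; the sample strings are assumptions.

```ruby
# An unescaped . matches any character, so "1x5" matches too.
puts "1x5".match?(/1.5/)     # true

# Escaped with a backslash, . matches only a literal dot.
puts "1x5".match?(/1\.5/)    # false
puts "1.5".match?(/1\.5/)    # true

# Regexp.escape inserts the backslashes when the text comes from a variable.
literal = "a+b (c)"
pattern = Regexp.new(Regexp.escape(literal))
puts "sum: a+b (c)".match?(pattern)   # true
```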
The following table lists the regular expression syntax that is available in Ruby.
Sr.No.  Pattern       Description
1       ^             Matches beginning of line.
2       $             Matches end of line.
3       .             Matches any single character except newline. Using the m option allows it to match newline as well.
4       […]           Matches any single character in brackets.
5       [^…]          Matches any single character not in brackets.
6       re*           Matches 0 or more occurrences of the preceding expression.
7       re+           Matches 1 or more occurrences of the preceding expression.
8       re?           Matches 0 or 1 occurrence of the preceding expression.
9       re{n}         Matches exactly n occurrences of the preceding expression.
10      re{n,}        Matches n or more occurrences of the preceding expression.
11      re{n,m}       Matches at least n and at most m occurrences of the preceding expression.
12      a|b           Matches either a or b.
13      (re)          Groups regular expressions and remembers the matched text.
14      (?imx)        Temporarily toggles on the i, m, or x options within a regular expression. If in parentheses, only that area is affected.
15      (?-imx)       Temporarily toggles off the i, m, or x options within a regular expression. If in parentheses, only that area is affected.
16      (?:re)        Groups regular expressions without remembering the matched text.
17      (?imx:re)     Temporarily toggles on the i, m, or x options within parentheses.
18      (?-imx:re)    Temporarily toggles off the i, m, or x options within parentheses.
19      (?#…)         Comment.
20      (?=re)        Positive lookahead: requires re to match at this position without consuming any characters.
21      (?!re)        Negative lookahead: requires re not to match at this position; consumes no characters.
22      (?>re)        Matches an independent pattern without backtracking.
23      \w            Matches word characters.
24      \W            Matches nonword characters.
25      \s            Matches whitespace. Equivalent to [ \t\n\r\f\v].
26      \S            Matches nonwhitespace.
27      \d            Matches digits. Equivalent to [0-9].
28      \D            Matches nondigits.
29      \A            Matches beginning of string.
30      \Z            Matches end of string. If the string ends with a newline, it matches just before that newline.
31      \z            Matches end of string.
32      \G            Matches the point where the last match finished.
33      \b            Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets.
34      \B            Matches nonword boundaries.
35      \n, \t, etc.  Matches newlines, tabs, carriage returns (\r), etc.
36      \1…\9         Matches the nth grouped subexpression.
37      \10           Matches the nth grouped subexpression if it matched already. Otherwise refers to the octal representation of a character code.
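Two entries in the table are easy to misread: the difference between \Z and \z, and what "without backtracking" means for (?>re). The sketch below is one way to observe both; the sample strings are assumptions.

```ruby
# \Z tolerates one trailing newline; \z does not.
line = "done\n"
puts line.match?(/done\Z/)       # true
puts line.match?(/done\z/)       # false

# (?>re) is atomic: once a+ has consumed every "a" it will not give one back,
# so the "ab" that follows can never match.
puts "aaab".match?(/a+ab/)       # true  (a+ backtracks to leave one "a")
puts "aaab".match?(/(?>a+)ab/)   # false (no backtracking into the group)
```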
Regular-Expression Examples
Literal Characters
Sr.No.  Example  Description
1       /ruby/   Matches “ruby”.
2       ¥        Matches Yen sign. Multibyte characters are supported in Ruby 1.9 and Ruby 1.8.
Character Classes
Sr.No.  Example         Description
1       /[Rr]uby/       Matches “Ruby” or “ruby”.
2       /rub[ye]/       Matches “ruby” or “rube”.
3       /[aeiou]/       Matches any one lowercase vowel.
4       /[0-9]/         Matches any digit; same as /[0123456789]/.
5       /[a-z]/         Matches any lowercase ASCII letter.
6       /[A-Z]/         Matches any uppercase ASCII letter.
7       /[a-zA-Z0-9]/   Matches any of the above.
8       /[^aeiou]/      Matches anything other than a lowercase vowel.
9       /[^0-9]/        Matches anything other than a digit.
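The character classes above can be tried out with String#scan, which returns every match as an array. This is only a sketch; the sample strings are made up.

```ruby
sentence = "Ruby and ruby and rube"

p sentence.scan(/[Rr]uby/)    # ["Ruby", "ruby"]
p sentence.scan(/rub[ye]/)    # ["ruby", "rube"]

droid = "R2-D2"
p droid.scan(/[0-9]/)         # ["2", "2"]
p droid.scan(/[^0-9]/)        # ["R", "-", "D"]
p droid.scan(/[A-Z]/)         # ["R", "D"]
```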
Special Character Classes
Sr.No.  Example  Description
1       /./      Matches any character except newline.
2       /./m     In multi-line mode, matches newline, too.
3       /\d/     Matches a digit: /[0-9]/.
4       /\D/     Matches a non-digit: /[^0-9]/.
5       /\s/     Matches a whitespace character: /[ \t\r\n\f\v]/.
6       /\S/     Matches non-whitespace: /[^ \t\r\n\f\v]/.
7       /\w/     Matches a single word character: /[A-Za-z0-9_]/.
8       /\W/     Matches a non-word character: /[^A-Za-z0-9_]/.
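A short sketch of the shorthand classes in action follows. The string entry is an assumption picked so that each class finds something.

```ruby
entry = "id_42:\tok\n"

p entry.scan(/\d/)     # ["4", "2"]
p entry.scan(/\w+/)    # ["id_42", "ok"]
p entry.scan(/\s/)     # ["\t", "\n"]
p entry.scan(/\W/)     # [":", "\t", "\n"]

# . skips the newline unless the m option is given
two_lines = "a\nb"
p two_lines.scan(/./)    # ["a", "b"]
p two_lines.scan(/./m)   # ["a", "\n", "b"]
```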
Repetition Cases
Sr.No.  Example     Description
1       /ruby?/     Matches “rub” or “ruby”: the y is optional.
2       /ruby*/     Matches “rub” plus 0 or more ys.
3       /ruby+/     Matches “rub” plus 1 or more ys.
4       /\d{3}/     Matches exactly 3 digits.
5       /\d{3,}/    Matches 3 or more digits.
6       /\d{3,5}/   Matches 3, 4, or 5 digits.
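To see how the quantifiers differ, the sketch below runs each pattern from the table over the same input. The strings words and numbers are assumptions for illustration.

```ruby
words = "rub ruby rubyyy"
p words.scan(/ruby?/)       # ["rub", "ruby", "ruby"]
p words.scan(/ruby*/)       # ["rub", "ruby", "rubyyy"]
p words.scan(/ruby+/)       # ["ruby", "rubyyy"]

numbers = "12 345 6789012"
p numbers.scan(/\d{3}/)     # ["345", "678", "901"]
p numbers.scan(/\d{3,}/)    # ["345", "6789012"]
p numbers.scan(/\d{3,5}/)   # ["345", "67890"]
```

Note that /\d{3}/ also matches inside longer runs of digits; add anchors such as \b when a number must be exactly three digits long.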
Nongreedy Repetition
This matches the smallest number of repetitions:
Sr.No.  Example   Description
1       /<.*>/    Greedy repetition: matches “<ruby>perl>”.
2       /<.*?>/   Non-greedy: matches “<ruby>” in “<ruby>perl>”.
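The difference is easiest to see on a string that contains more than one closing bracket. The sketch below assumes the sample string "<ruby>perl>" and uses String#[] with a pattern to extract the matched text.

```ruby
markup = "<ruby>perl>"

greedy = markup[/<.*>/]    # .* runs to the last > in the string
lazy   = markup[/<.*?>/]   # .*? stops at the first > it can

puts greedy   # <ruby>perl>
puts lazy     # <ruby>
```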
Grouping with Parentheses
Sr.No.  Example               Description
1       /\D\d+/               No group: + repeats \d
2       /(\D\d)+/             Grouped: + repeats \D\d pair
3       /([Rr]uby(, )?)+/     Match “Ruby”, “Ruby, ruby, ruby”, etc.
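Here is a small sketch of how grouping changes what + repeats. The input strings are assumptions; the last lines also show that a repeated group remembers only its final repetition.

```ruby
sample = "a1b22"

no_group = sample[/\D\d+/]      # + applies to \d only
grouped  = sample[/(\D\d)+/]    # + applies to the whole \D\d pair
puts no_group   # a1
puts grouped    # a1b2

list = "Ruby, ruby, ruby"[/([Rr]uby(, )?)+/]
puts list       # Ruby, ruby, ruby

# Group 1 holds the text of its last repetition.
last_pair = sample.match(/(\D\d)+/)[1]
puts last_pair  # b2
```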
Backreferences
This matches a previously matched group again:
Sr.No.  Example                   Description
1       /([Rr])uby&\1ails/        Matches ruby&rails or Ruby&Rails.
2       /(['"])(?:(?!\1).)*\1/    Single or double-quoted string. \1 matches whatever the 1st group matched. \2 matches whatever the 2nd group matched, etc.
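The following sketch exercises both patterns, with the quoted-string pattern written using plain ASCII quote characters. The variable names and test strings are assumptions.

```ruby
rails = /([Rr])uby&\1ails/
puts "ruby&rails".match?(rails)   # true
puts "Ruby&Rails".match?(rails)   # true
puts "Ruby&rails".match?(rails)   # false: \1 must repeat the captured "R"

quoted = /(['"])(?:(?!\1).)*\1/
balanced   = %q{say "it's fine" now}
unbalanced = %q{say 'mixed" quotes}

puts balanced[quoted]      # "it's fine"
p unbalanced[quoted]       # nil: the closing quote must equal the opening one
```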
Alternatives
Sr.No.  Example          Description
1       /ruby|rube/      Matches “ruby” or “rube”.
2       /rub(y|le)/      Matches “ruby” or “ruble”.
3       /ruby(!+|\?)/    “ruby” followed by one or more ! or one ?
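Alternation can be tried with String#scan, as sketched below. One caveat: when a pattern contains capturing groups, scan returns the captures rather than the whole match, so the sketch switches to non-capturing (?:...) groups where the full text is wanted. The sample strings are assumptions.

```ruby
coins = "ruby rube ruble"

p coins.scan(/ruby|rube/)       # ["ruby", "rube"]
p coins.scan(/rub(?:y|le)/)     # ["ruby", "ruble"]

# With a capturing group, scan reports only what the group matched.
p coins.scan(/rub(y|le)/)       # [["y"], ["le"]]

shouts = "ruby!!! ruby? ruby"
p shouts.scan(/ruby(?:!+|\?)/)  # ["ruby!!!", "ruby?"]
```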
Anchors
Anchors specify the position at which a match must occur.
Sr.No.  Example        Description
1       /^Ruby/        Matches “Ruby” at the start of a string or internal line.
2       /Ruby$/        Matches “Ruby” at the end of a string or line.
3       /\ARuby/       Matches “Ruby” at the start of a string.
4       /Ruby\Z/       Matches “Ruby” at the end of a string, or just before a trailing newline.
5       /\bRuby\b/     Matches “Ruby” at a word boundary.
6       /\brub\B/      \B is non-word boundary: matches “rub” in “rube” and “ruby” but not alone.
7       /Ruby(?=!)/    Matches “Ruby”, if followed by an exclamation point.
8       /Ruby(?!!)/    Matches “Ruby”, if not followed by an exclamation point.
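The sketch below shows where several of these anchors match, using =~ to obtain the match position. The two-line string poem and the variable names are assumptions.

```ruby
poem = "I like Ruby\nRuby likes me"

line_start   = poem =~ /^Ruby/     # ^ also matches after an embedded newline
string_start = poem =~ /\ARuby/    # \A matches only at the very start
line_end     = poem =~ /Ruby$/     # $ also matches before an embedded newline
p line_start     # 12
p string_start   # nil
p line_end       # 7

# \B requires that "rub" be followed by another word character.
inside = "rub ruby" =~ /\brub\B/
p inside         # 4

# Lookahead checks the next character without including it in the match.
puts "Ruby!"[/Ruby(?=!)/]          # Ruby
puts "Ruby!".match?(/Ruby(?!!)/)   # false
puts "Ruby?".match?(/Ruby(?!!)/)   # true
```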
Special Syntax with Parentheses
Sr.No.  Example            Description
1       /R(?#comment)/     Matches “R”. All the rest is a comment.
2       /R(?i)uby/         Case-insensitive while matching “uby”.
3       /R(?i:uby)/        Same as above.
4       /rub(?:y|le)/      Group only without creating \1 backreference.
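As a rough check of these constructs, the sketch below compares a capturing group with a non-capturing one and confirms that an inline option applies only inside its parentheses. The test strings and variable names are assumptions.

```ruby
# Only "uby" is matched case-insensitively; the leading R stays case-sensitive.
puts "RUBY".match?(/R(?i:uby)/)   # true
puts "rUBY".match?(/R(?i:uby)/)   # false

capturing     = "ruble".match(/rub(y|le)/)
non_capturing = "ruble".match(/rub(?:y|le)/)
p capturing.captures       # ["le"]
p non_capturing.captures   # []

# (?#...) is ignored by the matcher.
puts "R".match?(/R(?#just a note)/)   # true
```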
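Search and Replace
Some of the most important String methods that use regular expressions are sub and gsub, and their in-place variants sub! and gsub!.
All of these methods perform a search-and-replace operation using a Regexp pattern. sub and sub! replace the first occurrence of the pattern, while gsub and gsub! replace all occurrences.
sub and gsub return a new string and leave the original unmodified, whereas sub! and gsub! modify the string on which they are called.
Following is an example: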
#!/usr/bin/ruby

phone = "2004-959-559 #This is Phone Number"

# Delete Ruby-style comments (sub returns a new string; sub! would return nil if nothing matched)
phone = phone.sub(/#.*$/, "")
puts "Phone Num : #{phone}"

# Remove anything other than digits
phone = phone.gsub(/\D/, "")
puts "Phone Num : #{phone}"
This will produce the following result:
Phone Num : 2004-959-559
Phone Num : 2004959559
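The difference between the plain and the in-place variants can be sketched as follows. The variable names are assumptions; the last part shows that sub! and gsub! return nil when nothing was replaced, which matters if their return value is assigned.

```ruby
original = "2004-959-559"

copy = original.sub(/-/, "")       # new string, first hyphen removed
puts copy                          # 2004959-559
puts original                      # 2004-959-559 (unchanged)

digits = original.gsub(/\D/, "")   # new string, every non-digit removed
puts digits                        # 2004959559

buffer = original.dup
buffer.gsub!(/-/, " ")             # modifies buffer in place
puts buffer                        # 2004 959 559

# No "#" is present, so nothing is replaced and sub! returns nil.
result = buffer.sub!(/#.*$/, "")
p result                           # nil
puts buffer                        # 2004 959 559
```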
Following is another example:
#!/usr/bin/ruby

text = "rails are rails, really good Ruby on Rails"

# Change "rails" to "Rails" throughout (plain-string pattern)
text.gsub!("rails", "Rails")

# Capitalize the whole word "rails" throughout; nothing is left to replace
# here, so this call returns nil and leaves text unchanged
text.gsub!(/\brails\b/, "Rails")
puts text
This will produce the following result:
Rails are Rails, really good Ruby on Rails
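The two gsub! calls above behave differently on input where "rails" appears inside a longer word. A minimal sketch, assuming a made-up input string:

```ruby
text = "rails derails trains"

# A plain-string pattern replaces every occurrence, even inside other words.
by_string = text.gsub("rails", "Rails")
puts by_string    # Rails deRails trains

# \b restricts the replacement to the standalone word.
by_word = text.gsub(/\brails\b/, "Rails")
puts by_word      # Rails derails trains
```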