Regular Expression Character Examples in Java (1)

Last modified: 2023-12-03 15:16:33.741000             Author: Mango


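Introduction

Regular expressions are a powerful pattern-matching tool, and Java supports them through the java.util.regex package. A Java regular expression is built from literal characters, metacharacters, and escape sequences, and it is matched against a character sequence. The snippets in this article assume that java.util.regex.Pattern and java.util.regex.Matcher are imported and that the code runs inside a method.

Literal characters

A literal character stands for itself. Letters and digits are literals in a regular expression: the pattern a matches the character a and nothing else.

Literals can also serve as the endpoints of a range inside brackets. For example, the regular expression [a-z] matches any single lowercase letter from a to z.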

String regex = "[a-z]";
String input = "regex example";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 0: r
Match found at index 1 to 1: e
Match found at index 2 to 2: g
Match found at index 3 to 3: e
Match found at index 4 to 4: x
Match found at index 6 to 6: e
Match found at index 7 to 7: x
Match found at index 8 to 8: a
Match found at index 9 to 9: m
Match found at index 10 to 10: p
Match found at index 11 to 11: l
Match found at index 12 to 12: e
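
To show a pattern made only of literals, here is a minimal sketch that searches for the text ex. The class name LiteralDemo and the helper findAll are illustrative names chosen for this example, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralDemo {
    // Returns every match of regex in input as "start-end:text", joined by commas.
    static String findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            if (out.length() > 0) {
                out.append(',');
            }
            out.append(matcher.start()).append('-').append(matcher.end() - 1)
               .append(':').append(matcher.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "ex" contains only literals, so it matches the text "ex" itself.
        System.out.println(findAll("ex", "regex example")); // 3-4:ex,6-7:ex
    }
}
```

Because neither e nor x has a special meaning, the pattern matches only where those two characters appear next to each other in that order.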
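
Metacharacters

A metacharacter has a special meaning instead of matching itself. Metacharacters describe sets, repetitions, positions, and combinations of literal characters. The metacharacters in Java regular expressions include [ ] ( ) { } ^ $ . * + ? \ and |.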

[]

[] matches any single character listed between the brackets.

For example, the regular expression [abc] matches any one of a, b, or c.

String regex = "[abc]";
String input = "regex example";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 8 to 8: a
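
Character classes can also be negated with a leading ^ inside the brackets. The following sketch, with the hypothetical class CharClassDemo and helper names invented for illustration, masks vowels with one class and everything except vowels and spaces with its negation.

```java
class CharClassDemo {
    // Replaces every vowel with '*' using the character class [aeiou].
    static String maskVowels(String input) {
        return input.replaceAll("[aeiou]", "*");
    }

    // A leading ^ negates the class: [^aeiou ] matches any character
    // that is neither a vowel nor a space.
    static String maskOthers(String input) {
        return input.replaceAll("[^aeiou ]", "-");
    }

    public static void main(String[] args) {
        System.out.println(maskVowels("regex example")); // r*g*x *x*mpl*
        System.out.println(maskOthers("regex example")); // -e-e- e-a---e
    }
}
```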
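
()

() groups several characters into a single unit. The content of the parentheses is a subexpression, so a quantifier placed after the group applies to the whole group.

For example, the regular expression (ab)+ matches one or more consecutive occurrences of ab.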

String regex = "(ab)+";
String input = "ababab regex ab";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 5: ababab
Match found at index 13 to 14: ab
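
Parentheses also capture the text they match, which Matcher exposes through group(int). A minimal sketch, assuming a simple key=value input format; the class GroupDemo and the method keyAndValue are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Splits a "key=value" fragment into its parts using two capturing groups.
    static String keyAndValue(String input) {
        Matcher matcher = Pattern.compile("([a-z]+)=([0-9]+)").matcher(input);
        if (!matcher.find()) {
            return "no match";
        }
        // group(0) is the whole match; group(1) and group(2) are the
        // first and second parenthesized subexpressions.
        return "key=" + matcher.group(1) + ", value=" + matcher.group(2);
    }

    public static void main(String[] args) {
        System.out.println(keyAndValue("set size=42 now")); // key=size, value=42
    }
}
```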
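
{}

{} specifies how many times the preceding element must repeat.

For example, the regular expression a{3} matches exactly three consecutive a characters.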

String regex = "a{3}";
String input = "aaabbbccc regex aaa";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 2: aaa
Match found at index 16 to 18: aaa
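
Braces also accept a range: {n,m} means between n and m repetitions, and {n,} means at least n. The sketch below illustrates both forms; BoundedRepeatDemo and findAll are hypothetical names used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedRepeatDemo {
    // Collects the text of every match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "a aa aaa aaaa";
        // Two or three a's; the quantifier is greedy, so "aaaa" yields "aaa".
        System.out.println(findAll("a{2,3}", input)); // [aa, aaa, aaa]
        // Three or more a's.
        System.out.println(findAll("a{3,}", input));  // [aaa, aaaa]
    }
}
```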
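
^

^ matches the start of the input. With the Pattern.MULTILINE flag it also matches at the start of every line.

For example, the regular expression ^a, compiled with Pattern.MULTILINE, matches an a at the start of a line.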

String regex = "^a";
String input = "abc\nacb\ndef";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 0: a
Match found at index 4 to 4: a
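
The effect of Pattern.MULTILINE is easiest to see by running the same pattern with and without it. This is a small sketch; the class LineStartDemo and its count helper are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineStartDemo {
    // Counts how many times regex, compiled with the given flags, matches in input.
    static int count(String regex, int flags, String input) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String input = "abc\nacb\ndef";
        // Without the flag, ^ matches only at the very start of the input.
        System.out.println(count("^a", 0, input));                 // 1
        // With the flag, ^ also matches right after each line terminator.
        System.out.println(count("^a", Pattern.MULTILINE, input)); // 2
    }
}
```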
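
$

$ matches the end of the input (or the position just before a final line terminator). With the Pattern.MULTILINE flag it also matches at the end of every line, just before the line terminator.

For example, the regular expression a$, compiled with Pattern.MULTILINE, matches an a at the end of a line.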

String regex = "a$";
String input = "abc\nacb\ndefa\n";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 11 to 11: a
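
A common use of $ is picking out whatever ends each line. The following sketch extracts the last lowercase word of every line; LineEndDemo and lastWords are hypothetical names, and the input is assumed to contain only lowercase words separated by spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineEndDemo {
    // Returns each run of lowercase letters that is immediately followed by $.
    static List<String> lastWords(String input, int flags) {
        List<String> words = new ArrayList<>();
        Matcher matcher = Pattern.compile("[a-z]+$", flags).matcher(input);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String input = "one two\nthree four";
        // With MULTILINE, $ matches at the end of every line.
        System.out.println(lastWords(input, Pattern.MULTILINE)); // [two, four]
        // Without it, $ matches only at the end of the input.
        System.out.println(lastWords(input, 0));                 // [four]
    }
}
```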
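
.

. matches any single character except a line terminator (unless Pattern.DOTALL is set).

For example, the regular expression a.c matches an a, followed by any one character, followed by a c. Pattern.MULTILINE changes only the behavior of ^ and $, so the flag has no effect on this pattern.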

String regex = "a.c";
String input = "abc abcc abc\ndef\na-c";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 2: abc
Match found at index 4 to 6: abc
Match found at index 9 to 11: abc
Match found at index 17 to 19: a-c
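
Whether . crosses a line break is controlled by Pattern.DOTALL, not by MULTILINE. A minimal sketch, using the made-up class DotAllDemo, compares the two modes on an input where the middle character is a newline.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    // Reports whether "a.c" occurs anywhere in input under the given flags.
    static boolean found(String input, int flags) {
        return Pattern.compile("a.c", flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // By default . does not match the newline between a and c.
        System.out.println(found("a\nc", 0));              // false
        // With DOTALL, . matches any character, including line terminators.
        System.out.println(found("a\nc", Pattern.DOTALL)); // true
    }
}
```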
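
*

* matches zero or more occurrences of the preceding element.

For example, the regular expression a*b matches zero or more a characters followed by a b, so a b on its own also matches.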

String regex = "a*b";
String input = "ab abb aabb aaabbb regex";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

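Output:

Match found at index 0 to 1: ab
Match found at index 3 to 4: ab
Match found at index 5 to 5: b
Match found at index 7 to 9: aab
Match found at index 10 to 10: b
Match found at index 12 to 15: aaab
Match found at index 16 to 16: b
Match found at index 17 to 17: b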
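
By default * is greedy: it consumes as much as it can while still letting the rest of the pattern match. Appending ? makes it reluctant. The sketch below contrasts the two on a short tag-like string; GreedyDemo and firstMatch are names invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns the first match of regex in input, or an empty string if none.
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : "";
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";
        // Greedy: .* runs to the last '>' in the input.
        System.out.println(firstMatch("<.*>", input));  // <b>bold</b>
        // Reluctant: .*? stops at the first '>' it can.
        System.out.println(firstMatch("<.*?>", input)); // <b>
    }
}
```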
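
+

+ matches one or more occurrences of the preceding element.

For example, the regular expression a+b matches one or more a characters followed by a b.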

String regex = "a+b";
String input = "ab abb aabb aaabbb regex";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 1: ab
Match found at index 3 to 4: ab
Match found at index 7 to 9: aab
Match found at index 12 to 15: aaab
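
The + quantifier is handy as a delimiter in String.split, where it treats a run of separators as one. A short sketch with the hypothetical class SplitDemo, assuming the line does not begin with a space:

```java
import java.util.Arrays;

class SplitDemo {
    // Splits a line on runs of one or more spaces.
    static String[] words(String line) {
        return line.split(" +");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("one  two   three"))); // [one, two, three]
    }
}
```

Splitting on a single space instead would produce empty strings between consecutive spaces.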
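
?

? matches zero or one occurrence of the preceding element.

For example, the regular expression a?b matches a b that is optionally preceded by a single a.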

String regex = "a?b";
String input = "ab abb aabb aaabbb regex";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

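Output:

Match found at index 0 to 1: ab
Match found at index 3 to 4: ab
Match found at index 5 to 5: b
Match found at index 8 to 9: ab
Match found at index 10 to 10: b
Match found at index 14 to 15: ab
Match found at index 16 to 16: b
Match found at index 17 to 17: b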
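
A typical use of ? is accepting an optional character, such as a spelling variant. The sketch below uses String.matches, which requires the whole string to match; OptionalDemo and isColor are illustrative names.

```java
class OptionalDemo {
    // Accepts both spellings: the u is optional.
    static boolean isColor(String word) {
        return word.matches("colou?r");
    }

    public static void main(String[] args) {
        System.out.println(isColor("color"));   // true
        System.out.println(isColor("colour"));  // true
        System.out.println(isColor("colouur")); // false: at most one u is allowed
    }
}
```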
\

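\ escapes a metacharacter so that it is matched as an ordinary character. In Java source code the backslash must itself be escaped inside a string literal, so the regular expression \. is written as "\\.".

For example, the regular expression \. matches a literal period.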

String regex = "\\.";
String input = "www.baidu.com example";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 3 to 3: .
Match found at index 9 to 9: .
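
When a whole string should be matched literally, escaping each metacharacter by hand is error-prone; Pattern.quote does it for the entire string. A minimal sketch, with QuoteDemo and matchesLiterally as hypothetical names:

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Checks whether input equals literal when literal is treated as plain text.
    static boolean matchesLiterally(String literal, String input) {
        return input.matches(Pattern.quote(literal));
    }

    public static void main(String[] args) {
        String text = "1+1=2";
        // Unescaped, + is a quantifier, so the pattern means "one or more 1s, then 1=2".
        System.out.println(text.matches("1+1=2"));        // false
        // Escaping the + by hand makes it a literal plus sign.
        System.out.println(text.matches("1\\+1=2"));      // true
        // Pattern.quote escapes the whole string at once.
        System.out.println(matchesLiterally(text, text)); // true
    }
}
```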

|

| matches any one of several alternative expressions.

For example, the regular expression a|b|c matches any one of a, b, or c.

String regex = "a|b|c";
String input = "abc abb cba example";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 0: a
Match found at index 1 to 1: b
Match found at index 2 to 2: c
Match found at index 4 to 4: a
Match found at index 5 to 5: b
Match found at index 6 to 6: b
Match found at index 8 to 8: c
Match found at index 9 to 9: b
Match found at index 10 to 10: a
Match found at index 14 to 14: a
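
Alternation has the lowest precedence of the regex operators, so anchors and neighboring characters bind to a single alternative unless the alternatives are grouped. The sketch below shows the difference; AlternationDemo and found are names made up for this example.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Reports whether regex matches anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Read as (^cat)|(dog$): "hotdog" ends with dog, so it matches.
        System.out.println(found("^cat|dog$", "hotdog"));   // true
        // Grouping applies both anchors to each alternative.
        System.out.println(found("^(cat|dog)$", "hotdog")); // false
        System.out.println(found("^(cat|dog)$", "dog"));    // true
    }
}
```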
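
Escape sequences

An escape sequence is a backslash followed by a character. Placed before a metacharacter, the backslash makes that character literal. Placed before certain letters, it forms a predefined character class such as \d (a digit), \w (a word character), or \s (a whitespace character).

For example, the regular expression \d matches any single digit.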

String regex = "\\d";
String input = "1234567890 regex";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
    System.out.println("Match found at index " + matcher.start() + " to " + (matcher.end()-1) + ": " + matcher.group());
}

Output:

Match found at index 0 to 0: 1
Match found at index 1 to 1: 2
Match found at index 2 to 2: 3
Match found at index 3 to 3: 4
Match found at index 4 to 4: 5
Match found at index 5 to 5: 6
Match found at index 6 to 6: 7
Match found at index 7 to 7: 8
Match found at index 8 to 8: 9
Match found at index 9 to 9: 0
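
The uppercase forms negate the predefined classes: \D matches any non-digit, \S any non-whitespace character. Here is a short sketch of two common clean-up tasks; PredefinedClassDemo and its method names are hypothetical.

```java
class PredefinedClassDemo {
    // Removes every non-digit character (\D is the negation of \d).
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // Trims the ends and collapses each run of whitespace (\s+) into one space.
    static String collapseWhitespace(String input) {
        return input.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("Tel: 555-1234"));           // 5551234
        System.out.println(collapseWhitespace("  a \t b\n c  "));  // a b c
    }
}
```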
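
Conclusion

This article covered the characters used in Java regular expressions: literal characters, metacharacters, and escape sequences. Fluency with regular expressions lets programmers match, search, and replace strings more quickly.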