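Regular Expressions in Scala
A regular expression describes a pattern used to match a range of input data, which makes it useful for pattern matching in many programming languages. In Scala, regular expressions are commonly referred to as Scala Regex.
Regex is a class in the scala.util.matching package, widely used for searching and parsing text. To convert a string into a regular expression, call the r method on the string.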
Example:
// Scala program for regular expressions

// Creating object
object GfG {

    // Main method
    def main(args: Array[String]): Unit = {

        // Applying the r method to turn the string into a Regex
        val portal = "GeeksforGeeks".r
        val CS = "GeeksforGeeks is a CS portal."

        // Displays the first match, wrapped in an Option
        println(portal.findFirstIn(CS))
    }
}
Output:
Some(GeeksforGeeks)
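Here we call the r method on the given string to obtain an instance of the Regex class, which serves as the pattern. The code above uses findFirstIn() to find the first match of the regular expression; it returns an Option, which is why the output is wrapped in Some. To find all matches in the string, use findAllIn() instead.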
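As a minimal sketch of the difference between the two methods, the snippet below runs findFirstIn() and findAllIn() over the same string. The object name FindAllSketch and the sample text are made up for illustration and are not part of the original article.

```scala
import scala.util.matching.Regex

// Hypothetical example object, named only for this sketch.
object FindAllSketch {
  val word: Regex = "Geeks".r
  val text: String = "GeeksforGeeks is a portal for Geeks."

  // findFirstIn returns an Option: Some(match) or None.
  val first: Option[String] = word.findFirstIn(text)

  // findAllIn returns an iterator over every match; toList collects them.
  val all: List[String] = word.findAllIn(text).toList

  // A pattern that does not occur in the text yields None.
  val missing: Option[String] = "Scala".r.findFirstIn(text)

  def main(args: Array[String]): Unit = {
    println(first)   // Some(Geeks)
    println(all)     // List(Geeks, Geeks, Geeks)
    println(missing) // None
  }
}
```

Because findFirstIn() returns an Option, the caller can handle the no-match case with getOrElse or a match expression rather than checking for null.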
Example:
// Scala program for regular expressions
import scala.util.matching.Regex

// Creating object
object GfG {

    // Main method
    def main(args: Array[String]): Unit = {

        // Applying the Regex class
        val x = new Regex("Nidhi")
        val myself = "My name is Nidhi Singh."

        // Replaces the first match with the
        // string given below
        println(x.replaceFirstIn(myself, "Rahul"))
    }
}
Output:
My name is Rahul Singh.
We can also use the Regex constructor instead of the r method. Here, replaceFirstIn() replaces the first match in the given string; to replace every match, use replaceAllIn().
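To make the contrast concrete, here is a small sketch that applies both replacement methods to one string. The object name ReplaceAllSketch and the sample sentence are assumptions chosen for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical example object, named only for this sketch.
object ReplaceAllSketch {
  val name: Regex = "Nidhi".r
  val text: String = "Nidhi Singh wrote this. Thanks, Nidhi!"

  // Only the first occurrence is replaced.
  val onlyFirst: String = name.replaceFirstIn(text, "Rahul")

  // Every occurrence is replaced.
  val every: String = name.replaceAllIn(text, "Rahul")

  def main(args: Array[String]): Unit = {
    println(onlyFirst) // Rahul Singh wrote this. Thanks, Nidhi!
    println(every)     // Rahul Singh wrote this. Thanks, Rahul!
  }
}
```

Note that $ and \ have a special meaning in the replacement string. If the replacement text comes from user input, wrapping it in Regex.quoteReplacement is a reasonable precaution.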
Example:
// Scala program for regular expressions
import scala.util.matching.Regex

// Creating object
object GfG {

    // Main method
    def main(args: Array[String]): Unit = {

        // Applying the Regex class
        val Geeks = new Regex("(G|g)fG")
        val y = "GfG is a CS portal. I like gfG"

        // Displays all the matches, joined
        // by a separator
        println(Geeks.findAllIn(y).mkString(", "))
    }
}
Output:
GfG, gfG
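Here mkString joins all the matches with the given separator. The pipe (|) in the pattern denotes alternation: (G|g) matches either an uppercase G or a lowercase g, so both GfG and gfG are returned.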
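The same result can be reached in more than one way. The sketch below compares the alternation from the example with a character class and with the (?i) case-insensitive flag; the object name AlternationSketch and the helper joined are hypothetical names used only here.

```scala
import scala.util.matching.Regex

// Hypothetical example object, named only for this sketch.
object AlternationSketch {
  val text: String = "GfG is a CS portal. I like gfG"

  // Alternation inside a group: either G or g.
  val withPipe: Regex = "(G|g)fG".r

  // A character class expresses the same single-character choice.
  val withClass: Regex = "[Gg]fG".r

  // (?i) makes the whole pattern case-insensitive, so it is broader:
  // it would also accept spellings such as GFG or gfg.
  val ignoreCase: Regex = "(?i)gfg".r

  // Joins every match of the given pattern with a separator.
  def joined(r: Regex): String = r.findAllIn(text).mkString(", ")

  def main(args: Array[String]): Unit = {
    println(joined(withPipe))   // GfG, gfG
    println(joined(withClass))  // GfG, gfG
    println(joined(ignoreCase)) // GfG, gfG
  }
}
```

For a choice between single characters, a character class is usually the more compact option; alternation is more useful when the alternatives are longer than one character.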
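Syntax of Scala regular expressions
Scala inherits its regular expression syntax from Java, which in turn inherits most of its features from Perl. The metacharacters are listed below; backslashes are doubled, as they would be written inside an ordinary Scala string literal.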
Subexpression | Matches |
---|---|
^ | Matches the start of a line. |
$ | Matches the end of a line. |
. | Matches any single character except a newline. |
[…] | Matches any single character inside the brackets. |
[^…] | Matches any single character that is not inside the brackets. |
\\A | Matches the start of the entire string. |
\\z | Matches the end of the entire string. |
\\Z | Matches the end of the entire string, ignoring a final newline if one exists. |
re* | Matches zero or more occurrences of the preceding expression. |
re+ | Matches one or more occurrences of the preceding expression. |
re? | Matches zero or one occurrence of the preceding expression. |
re{n} | Matches exactly n occurrences of the preceding expression. |
re{n,} | Matches n or more occurrences of the preceding expression. |
re{n,m} | Matches at least n and at most m occurrences of the preceding expression. |
q\|r | Matches either q or r. |
(re) | Groups regular expressions and remembers the matched text. |
(?:re) | Groups regular expressions without remembering the matched text. |
(?>re) | Matches an independent pattern without backtracking. |
\\w | Matches a word character. |
\\W | Matches a non-word character. |
\\s | Matches a whitespace character; equivalent to [ \t\n\x0B\f\r]. |
\\S | Matches a non-whitespace character. |
\\d | Matches a digit, i.e. [0-9]. |
\\D | Matches a non-digit. |
\\G | Matches the point where the previous match ended. |
\\n | Back-reference to capture group number n, where n is a digit (for example \\1). |
\\b | Matches a word boundary when used outside brackets. Perl also accepts it inside brackets as a backspace, but java.util.regex rejects that form. |
\\B | Matches a non-word boundary. |
\\n, \\t, etc. | Match newlines, tabs, etc. |
\\Q | Quotes (escapes) all characters up to \\E. |
\\E | Ends the quoting started by \\Q. |
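A few of the entries above can be tried out in a short sketch. The object name SyntaxSketch, the patterns and the sample inputs are all assumptions made for illustration. The sketch uses triple-quoted strings, where each backslash is written once, next to one ordinary string literal, where it is doubled as in the table.

```scala
import scala.util.matching.Regex

// Hypothetical example object, named only for this sketch.
object SyntaxSketch {
  // \d with the {n} quantifier and three capture groups.
  val date: Regex = """(\d{4})-(\d{2})-(\d{2})""".r

  // Ordinary string literal: the backslash must be doubled.
  val word: Regex = "\\w+".r

  // \b word boundaries, \s+ whitespace, \1 back-reference to group 1.
  val doubled: Regex = """\b(\w+)\s+\1\b""".r

  // \Q...\E quotes the text in between, so + is taken literally.
  val literal: Regex = """\Q1+1=2\E""".r

  // A Regex used as an extractor must match the whole input string.
  def parseDate(s: String): Option[(Int, Int, Int)] = s match {
    case date(y, m, d) => Some((y.toInt, m.toInt, d.toInt))
    case _             => None
  }

  def main(args: Array[String]): Unit = {
    println(parseDate("2024-01-15"))                     // Some((2024,1,15))
    println(parseDate("15/01/2024"))                     // None
    println(word.findAllIn("Scala regex, 2 ways").toList) // List(Scala, regex, 2, ways)
    println(doubled.findFirstIn("this is is a test"))    // Some(is is)
    println(literal.findFirstIn("so 1+1=2 holds"))       // Some(1+1=2)
  }
}
```

Triple-quoted strings tend to keep patterns easier to read because no backslash needs to be escaped for the Scala compiler.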