PHP Regular Expressions

Last modified: 2020-09-29 05:50:06             Author: Mango


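A regular expression, commonly called a regex, is a sequence of characters that describes a search pattern to be matched against text.

Regular expressions let you search for one string inside another. You can also use them to replace one string with another or to split a string into several chunks. Complex patterns are built from operators such as +, - and ^, which look like arithmetic symbols but have their own meaning inside a pattern.

By default, regular expressions are case sensitive.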
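
As a rough illustration of searching, replacing, and splitting, the following minimal sketch uses PHP's preg_* functions on a made-up sample string. The variable names and the sample text are assumptions chosen only for the example.

```php
<?php
// Made-up sample text for the three basic operations.
$text = 'Apples cost 30 cents; Pears cost 45 cents.';

// Search: preg_match() returns 1 when the pattern is found, 0 otherwise.
$found = preg_match('/cost/', $text);               // 1

// Matching is case sensitive by default, so "apples" does not match "Apples"...
$caseSensitive = preg_match('/apples/', $text);     // 0
// ...unless the i modifier is added after the closing delimiter.
$caseInsensitive = preg_match('/apples/i', $text);  // 1

// Replace: swap every run of digits for a placeholder.
$masked = preg_replace('/[0-9]+/', '#', $text);
// "Apples cost # cents; Pears cost # cents."

// Split: cut the string at each semicolon and any spaces that follow it.
$chunks = preg_split('/;\s*/', $text);
// ["Apples cost 30 cents", "Pears cost 45 cents."]

var_dump($found, $caseSensitive, $caseInsensitive, $masked, $chunks);
```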

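Advantages and Uses of Regular Expressions

Regular expressions are used almost everywhere in modern application programming. Some of their advantages and uses are listed below:

  • Regular expressions help programmers validate text strings.
  • They offer a powerful tool for analyzing and searching for patterns and for modifying text strings.
  • Regex functions provide simple solutions for recognizing patterns.
  • Regular expressions help build HTML template systems that recognize tags.
  • They are widely used for browser detection, form validation, spam filtering, and password strength checking.
  • They are helpful for validating user input such as email addresses, mobile numbers, and IP addresses.
  • They help highlight special keywords in a file based on search results or input.
  • Metacharacters allow us to create more complex patterns.

You can create complex search patterns by applying a few basic rules of regular expressions. These patterns combine operators such as +, - and ^, which resemble arithmetic symbols but carry regex-specific meanings.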
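
To make the validation use case concrete, here is a minimal sketch of two input checks. The function names looksLikeIpv4() and isStrongEnough() and the password rules are hypothetical, invented for this example; real applications will likely need stricter or different rules.

```php
<?php
// Hypothetical helper: true when $input looks like a dotted-quad IPv4 address.
function looksLikeIpv4(string $input): bool
{
    // Four groups of 1-3 digits separated by literal dots.
    // The D modifier makes $ match only at the very end of the string.
    $shape = '/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/D';
    if (preg_match($shape, $input, $m) !== 1) {
        return false;
    }
    // The pattern only checks the shape; the numeric range is checked here.
    foreach (array_slice($m, 1) as $octet) {
        if ((int) $octet > 255) {
            return false;
        }
    }
    return true;
}

// Hypothetical rule set (an assumption for this sketch): at least 8
// characters with an uppercase letter, a lowercase letter, and a digit.
function isStrongEnough(string $password): bool
{
    return strlen($password) >= 8
        && preg_match('/[A-Z]/', $password) === 1
        && preg_match('/[a-z]/', $password) === 1
        && preg_match('/[0-9]/', $password) === 1;
}

var_dump(looksLikeIpv4('192.168.0.1'));  // bool(true)
var_dump(looksLikeIpv4('256.1.1.1'));    // bool(false)
var_dump(isStrongEnough('Passw0rdX'));   // bool(true)
var_dump(isStrongEnough('password'));    // bool(false)
```

Splitting the check into a shape test and a separate range test keeps the pattern short; encoding the 0 to 255 range in the regex itself is possible but much harder to read.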

Regular Expression Operators

Operator  Description
^         Indicates the start of the string.
$         Indicates the end of the string.
.         Denotes any single character (except a newline by default).
()        Groups expressions together.
[]        Matches one character from a set, e.g., [abc] means a, b, or c.
[^]       Matches one character that is not in the set, e.g., [^xyz] means NOT x, y, or z.
-         Defines a range between two elements inside brackets, e.g., [a-z] means a through z.
|         Logical OR between elements, e.g., a|b means either a OR b.
?         Indicates zero or one of the preceding character or group.
*         Indicates zero or more of the preceding character or group.
+         Indicates one or more of the preceding character or group.
{n}       Denotes exactly n occurrences of the preceding character or group, e.g., n{3} means exactly 3 of n.
{n,}      Denotes at least n occurrences of the preceding character or group, e.g., n{2,} means 2 or more of n.
{n,m}     Denotes at least n but not more than m occurrences, e.g., n{3,6} means 3 to 6 of n.
\         Denotes the escape character.
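
The operators can be tried out with a small table-driven sketch like the one below. Each pattern is paired with one subject that should match and one that should not; the subjects and the $cases and $results names are assumptions chosen for illustration.

```php
<?php
// Each entry: pattern => [subject expected to match, subject expected not to].
$cases = [
    '/^cat/'      => ['catalog', 'bobcat'],   // ^ anchors at the start
    '/cat$/'      => ['bobcat', 'catalog'],   // $ anchors at the end
    '/c.t/'       => ['cut', 'ct'],           // . needs exactly one character
    '/^[abc]$/'   => ['b', 'd'],              // [] one character from the set
    '/^[^xyz]$/'  => ['a', 'x'],              // [^] one character not in the set
    '/^[a-z]$/'   => ['q', 'Q'],              // - range inside brackets
    '/^(a|b)$/'   => ['b', 'c'],              // | alternation inside a group
    '/^colou?r$/' => ['color', 'colouur'],    // ? zero or one
    '/^go*d$/'    => ['gd', 'gad'],           // * zero or more
    '/^go+d$/'    => ['good', 'gd'],          // + one or more
    '/^n{3}$/'    => ['nnn', 'nn'],           // {n} exactly n
    '/^n{2,}$/'   => ['nnnn', 'n'],           // {n,} at least n
    '/^n{3,6}$/'  => ['nnnn', 'nnnnnnn'],     // {n,m} between n and m
    '/^1\+1$/'    => ['1+1', '11'],           // \ escapes the + operator
];

$results = [];
foreach ($cases as $pattern => [$hit, $miss]) {
    // preg_match() returns 1 for a match and 0 for no match.
    $results[$pattern] = [preg_match($pattern, $hit), preg_match($pattern, $miss)];
    printf("%-13s %-8s => %d   %-8s => %d\n",
        $pattern, $hit, $results[$pattern][0], $miss, $results[$pattern][1]);
}
```

Every row should print 1 for the first subject and 0 for the second.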

Special Characters in Regular Expressions

Special Character  Description
\n                 Indicates a new line.
\r                 Indicates a carriage return.
\t                 Represents a tab.
\v                 Represents a vertical tab.
\f                 Represents a form feed.
\xxx               Represents the character with octal code xxx.
\xhh               Denotes the character with hexadecimal code hh.
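
A short sketch of these escapes in use follows. It assumes single-quoted pattern strings, so the backslash sequences reach the regex engine untouched; the sample data is made up.

```php
<?php
// Double-quoted PHP strings turn \t, \r and \n into real control characters.
$line = "name\tage\r\n";

// In a single-quoted pattern, the regex engine interprets the escapes itself.
$hasTab  = preg_match('/\t/', $line);       // 1
$hasCrLf = preg_match('/\r\n$/', $line);    // 1

// Hexadecimal 41 and octal 101 both denote the letter "A" (decimal 65).
$hexA   = preg_match('/\x41/', 'A');        // 1
$octalA = preg_match('/\101/', 'A');        // 1

// Split a tab-separated record into its fields.
$fields = preg_split('/\t/', "red\tgreen\tblue");
// ["red", "green", "blue"]

var_dump($hasTab, $hasCrLf, $hexA, $octalA, $fields);
```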

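PHP provides two sets of regular expression functions:

  • POSIX regular expressions
  • PERL-style regular expressions

POSIX Regular Expressions

The structure of a POSIX regular expression resembles that of a typical arithmetic expression: several operators and elements are combined to form more complex expressions.

The simplest regular expression is one that matches a single character, for example "g" inside the strings "toggle" or "cage". Let us introduce some of the concepts used in POSIX regular expressions: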

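Brackets

Square brackets [] have a special meaning in a regular expression. They are used to match one character out of the range they enclose.

Expression  Description
[0-9]       Matches any decimal digit from 0 to 9.
[a-z]       Matches any lowercase character from a to z.
[A-Z]       Matches any uppercase character from A to Z.
[a-zA-Z]    Matches any letter, lowercase a to z or uppercase A to Z. (A single range [a-Z] is not valid, because "a" sorts after "Z".)

The ranges above are the commonly used ones. You can use any range you need, for example [0-6] to match any decimal digit from 0 to 6.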
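
The bracket ranges can be checked with a few one-line calls. This is a minimal sketch written with the preg_* functions (the POSIX functions are not available in current PHP); the variable names are assumptions. The last call illustrates that a range written as [a-Z] is rejected, since lowercase "a" comes after uppercase "Z" in character order.

```php
<?php
$digit     = preg_match('/^[0-9]$/', '7');               // 1
$lower     = preg_match('/^[a-z]$/', 'q');               // 1
$upperOnly = preg_match('/^[A-Z]$/', 'q');               // 0: q is lowercase
$letters   = preg_match('/^[a-zA-Z]+$/', 'HelloWorld');  // 1
$narrow    = preg_match('/^[0-6]$/', '7');               // 0: 7 is outside 0-6

// Collect every run of digits in a string.
preg_match_all('/[0-9]+/', 'Room 12, floor 3', $m);
$numbers = $m[0];                                        // ["12", "3"]

// An out-of-order range fails to compile: preg_match() returns false and
// raises a warning (silenced here with @ for the demonstration only).
$badRange = @preg_match('/[a-Z]/', 'x');                 // false

var_dump($digit, $lower, $upperOnly, $letters, $narrow, $numbers, $badRange);
```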

Quantifiers

A special character can express how often, or at which position, a bracketed character sequence or a single character occurs. Each special character has a specific meaning. The symbols +, *, ?, $ and {int. range} all follow a character sequence.

Expression  Description
p+          Matches any string that contains at least one p.
p*          Matches any string that contains zero or more p's.
p?          Matches any string that has zero or one p.
p{N}        Matches any string that has a sequence of N p's.
p{2,3}      Matches any string that has a sequence of two or three p's.
p{2,}       Matches any string that has a sequence of at least two p's.
p$          Matches any string that has p at the end of it.
^p          Matches any string that has p at the start of it.
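
To see how the quantifiers differ, the sketch below runs several of them against the same subject. The helper allMatches() is hypothetical, written only for this example, and the \b word boundary is an addition not covered in the table: it forces each run of p's to be tested as a whole.

```php
<?php
// Hypothetical helper: return every substring of $subject matched by $pattern.
function allMatches(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[0];
}

$subject = 'p pp ppp pppp';

$oneOrMore  = allMatches('/\bp+\b/', $subject);      // ["p", "pp", "ppp", "pppp"]
$exactlyTwo = allMatches('/\bp{2}\b/', $subject);    // ["pp"]
$twoOrThree = allMatches('/\bp{2,3}\b/', $subject);  // ["pp", "ppp"]
$twoOrMore  = allMatches('/\bp{2,}\b/', $subject);   // ["pp", "ppp", "pppp"]

// * accepts zero occurrences, so "a" alone satisfies a followed by p*.
$zeroOrMore = preg_match('/^ap*$/', 'a');            // 1

// ? accepts zero or one occurrence, but not two.
$optional = [
    preg_match('/^ap?$/', 'a'),                      // 1
    preg_match('/^ap?$/', 'ap'),                     // 1
    preg_match('/^ap?$/', 'app'),                    // 0
];

// $ and ^ tie the p to the end or the start of the string.
$endsWithP   = preg_match('/p$/', 'help');           // 1
$startsWithP = preg_match('/^p/', 'help');           // 0

var_dump($oneOrMore, $exactlyTwo, $twoOrThree, $twoOrMore);
```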

PHP Regexp POSIX Functions

PHP provided seven functions for searching strings using POSIX-style regular expressions:

Function         Description
ereg()           Searches for a pattern inside a string; returns a non-zero value if the pattern matches, otherwise false.
ereg_replace()   Searches for a pattern inside a string and replaces the matching text with the replacement string.
eregi()          Searches for a pattern inside a string and returns the length of the matched string if found, otherwise false. It is case insensitive.
eregi_replace()  Works the same as ereg_replace(), except that the pattern search is case insensitive.
split()          Divides a string into an array by a regular expression.
spliti()         Like split(), divides a string into an array by a regular expression, but case insensitively.
sql_regcase()    Builds and returns a regular expression that matches the given string case insensitively.

Note: the functions above were deprecated in PHP 5.3.0 and removed in PHP 7.0.0.
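
Because the POSIX functions no longer exist in current PHP, old code has to be ported to the preg_* family. The following is a rough sketch of the usual substitutions; the old calls appear only in comments, and the sample input is made up. The main changes are the added / delimiters and the i modifier in place of the case-insensitive function variants.

```php
<?php
$input = 'Hello World 2020';

// Was: ereg('[0-9]+', $input)
$hasDigits = preg_match('/[0-9]+/', $input) === 1;     // true

// Was: eregi('hello', $input)
$hasHello = preg_match('/hello/i', $input) === 1;      // true

// Was: ereg_replace('[0-9]+', 'YEAR', $input)
$replaced = preg_replace('/[0-9]+/', 'YEAR', $input);  // "Hello World YEAR"

// Was: split(' +', $input)
$words = preg_split('/ +/', $input);                   // ["Hello", "World", "2020"]

var_dump($hasDigits, $hasHello, $replaced, $words);
```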

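PERL-Style Regular Expressions

Perl-style regular expressions are very similar to their POSIX counterparts, and most POSIX syntax can be used as-is with the Perl-style regex functions. The quantifiers introduced in the POSIX section also work in Perl-style regular expressions. One difference is that a Perl-style pattern must be enclosed in delimiters, usually forward slashes, as in /pattern/.

Metacharacters

A metacharacter is an alphabetic character preceded by a backslash, which gives the combination a special meaning.

For example, you can search for large sums of money using the \d metacharacter: /([\d]+)000/. Here \d matches any string of digit characters.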

The following metacharacters and constructs can be used in PERL-style regular expressions:

Character      Description
.              Matches any single character (except a newline by default).
\s             Matches a whitespace character such as a space, newline, or tab.
\S             Matches a non-whitespace character.
\d             Matches any digit from 0 to 9.
\D             Matches a non-digit character.
\w             Matches a word character: a-z, A-Z, 0-9, or _.
\W             Matches a non-word character.
[aeiou]        Matches any single character in the given set.
[^aeiou]       Matches any single character not in the given set.
(foo|baz|bar)  Matches any of the alternatives specified.
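
The metacharacters can be exercised as in the following sketch, which starts with the "large sums" pattern from the text. All sample strings and variable names are made up for illustration.

```php
<?php
// \d: the pattern from the text captures the digits before a trailing "000".
preg_match('/([\d]+)000/', 'Total: 25000 USD', $m);
$whole     = $m[0];                                        // "25000"
$thousands = $m[1];                                        // "25"

// \s: split on any run of spaces, tabs, or newlines.
$tokens = preg_split('/\s+/', "a b\tc\nd");                // ["a", "b", "c", "d"]

// \w: word characters include the underscore but not the hyphen.
preg_match_all('/\w+/', 'snake_case, kebab-case', $w);
$words = $w[0];                                            // ["snake_case", "kebab", "case"]

// \D: strip everything that is not a digit.
$digitsOnly = preg_replace('/\D/', '', '(555) 010-9999');  // "5550109999"

// [aeiou]: remove the vowels.
$noVowels = preg_replace('/[aeiou]/', '', 'regular');      // "rglr"

// (foo|baz|bar): accept exactly one of the alternatives.
$isKnown = preg_match('/^(foo|baz|bar)$/', 'baz');         // 1

var_dump($whole, $thousands, $tokens, $words, $digitsOnly, $noVowels, $isKnown);
```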

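Modifiers

Several modifiers are available that make working with regular expressions much easier, for example by making a search case insensitive or by searching across multiple lines.

The following modifiers are used in PERL-style regular expressions:

Character  Description
i          Makes the search case insensitive.
m          If the string contains carriage return or newline characters, ^ and $ match at line boundaries rather than only at the string boundaries.
o          Evaluates the expression only once (a Perl modifier; PHP's preg functions do not accept it).
s          Allows the . (dot) to match a newline character.
x          Allows whitespace in the expression for clarity.
g          Globally finds all matches (a Perl modifier; in PHP, use preg_match_all() instead).
cg         Allows the search to continue even after the global match fails (a Perl modifier; PHP's preg functions do not accept it).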
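
The effect of the modifiers that PHP's preg functions accept (i, m, s, x) can be sketched as follows. The sample strings are assumptions. The final lines show that g is rejected by preg_match(), and that preg_match_all() does the job of a global search.

```php
<?php
// i: ignore case.
$insensitive = preg_match('/php/i', 'PHP');              // 1

// m: ^ and $ match at every line boundary, not only at the string ends.
$text = "first\nsecond";
$withoutM = preg_match('/^\w+$/', $text);                // 0
preg_match_all('/^\w+$/m', $text, $lines);
$withM = $lines[0];                                      // ["first", "second"]

// s: let the dot match a newline as well.
$dotNoS = preg_match('/a.b/', "a\nb");                   // 0
$dotS   = preg_match('/a.b/s', "a\nb");                  // 1

// x: whitespace in the pattern is ignored and # starts a comment.
$datePattern = '/
    (\d{4})   # year
    -
    (\d{2})   # month
/x';
preg_match($datePattern, '2020-09', $d);
$year  = $d[1];                                          // "2020"
$month = $d[2];                                          // "09"

// g is not a PHP modifier: the call fails and returns false
// (the warning is silenced with @ for the demonstration only).
$gRejected = @preg_match('/a/g', 'banana');              // false
// preg_match_all() finds every match and returns how many there were.
$count = preg_match_all('/a/', 'banana');                // 3

var_dump($insensitive, $withoutM, $withM, $dotNoS, $dotS, $year, $month, $gRejected, $count);
```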

PHP Regexp PERL-Compatible Functions

PHP currently provides the following functions for searching strings using Perl-compatible regular expressions:

Function          Description
preg_match()      Searches for the pattern inside the string; returns 1 if the pattern matches, 0 if it does not, and false on error.
preg_match_all()  Matches all occurrences of the pattern in the string.
preg_replace()    Works like ereg_replace(), except that it takes a Perl-compatible pattern, and the replacement string may refer back to captured groups.
preg_split()      Works like split(), except that it takes a Perl-compatible regular expression as the pattern. It divides the string by that regular expression.
preg_grep()       Searches all elements of the input array and returns the elements that match the regular expression pattern.
preg_quote()      Escapes the regular expression special characters in a string.
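
The sketch below touches each function in the table once, with attention to return values and captured groups. The sample data and variable names are assumptions made for the example.

```php
<?php
// preg_match(): returns 1 or 0 and fills $m with the captured groups.
$matched = preg_match('/(\w+)@(\w+)\.com/', 'Contact: joe@example.com', $m);
$user = $m[1];                                         // "joe"
$host = $m[2];                                         // "example"

// preg_match_all(): returns the number of matches found.
$total = preg_match_all('/\d+/', '3 apples, 14 pears, 159 plums', $all);
$numbers = $all[0];                                    // ["3", "14", "159"]

// preg_replace(): $1 and $2 in the replacement refer to captured groups.
$swapped = preg_replace('/(\d+)-(\d+)/', '$2-$1', '10-20');  // "20-10"

// preg_split(): split on a comma with optional surrounding whitespace.
$parts = preg_split('/\s*,\s*/', 'red , green,blue');  // ["red", "green", "blue"]

// preg_grep(): keep the array elements that match; original keys are kept.
$numeric = preg_grep('/^\d+$/', ['1', 'a', '22']);     // [0 => "1", 2 => "22"]

// preg_quote(): escape special characters so user text is matched literally.
$literal = preg_quote('1+1=2?', '/');
$exact = preg_match('/^' . $literal . '$/', '1+1=2?'); // 1

var_dump($matched, $user, $host, $total, $numbers, $swapped, $parts, $numeric, $exact);
```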