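📅  Last modified: 2020-12-23 05:14:23    🧑  Author: Mango
Strings are among the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes. Creating a string is as simple as assigning a value to a variable. For example: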
var1 = 'Hello World!'
var2 = "Python Programming"
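Python does not support a character type; single characters are treated as strings of length one, and are therefore also considered substrings.
To access substrings, use square brackets with a single index to get one character, or with a start and end index to slice out a range. For example: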
#!/usr/bin/python
var1 = 'Hello World!'
var2 = "Python Programming"
print "var1[0]: ", var1[0]
print "var2[1:5]: ", var2[1:5]
When the above code is executed, it produces the following result:
var1[0]: H
var2[1:5]: ytho
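As a further illustration of indexing and slicing, the sketch below reuses var1 and var2 from the example above. The negative index and the open-ended slices are additions that the text does not cover, and the single-argument print(...) form is an assumption chosen so that the snippet runs under both Python 2 and Python 3.

```python
var1 = 'Hello World!'
var2 = "Python Programming"

first_char = var1[0]    # index 0 is the first character
last_char = var1[-1]    # negative indices count from the end
middle = var2[1:5]      # characters at positions 1, 2, 3 and 4
prefix = var2[:6]       # an omitted start defaults to 0
suffix = var2[7:]       # an omitted end defaults to len(var2)

print("first_char: " + first_char)
print("last_char: " + last_char)
print("middle: " + middle)
print("prefix: " + prefix)
print("suffix: " + suffix)
```

The end index of a slice is exclusive, which is why var2[1:5] stops before position 5.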
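You can "update" an existing string by (re)assigning a variable to another string. The new value can be related to its previous value or be a completely different string altogether. For example: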
#!/usr/bin/python
var1 = 'Hello World!'
print "Updated String :- ", var1[:6] + 'Python'
When the above code is executed, it produces the following result:
Updated String :- Hello Python
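The word "update" is in quotes because Python strings are immutable: a string cannot be changed in place, so an update always builds a new string. A minimal sketch of that behavior follows; the names assignment_failed and updated are illustrative and do not appear in the original example.

```python
var1 = 'Hello World!'

# Item assignment is not allowed on strings; it raises TypeError.
try:
    var1[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# "Updating" therefore means building a new string from pieces of the old one.
updated = var1[:6] + 'Python'

print("Updated String :- " + updated)
print("Original is unchanged: " + var1)
```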
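The following table lists escape or non-printable characters that can be represented with backslash notation.
An escape character is interpreted in both single-quoted and double-quoted strings.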
Backslash notation | Hexadecimal character | Description |
---|---|---|
\a | 0x07 | Bell or alert |
\b | 0x08 | Backspace |
\f | 0x0c | Formfeed |
\n | 0x0a | Newline |
\nnn | | Octal notation, where each n is in the range 0-7 |
\r | 0x0d | Carriage return |
\t | 0x09 | Tab |
\v | 0x0b | Vertical tab |
\xnn | | Hexadecimal notation, where each n is in the range 0-9, a-f, or A-F |
The notations \cx, \C-x, \M-\C-x, \e and \s, and a generic \x for "character x", appear in similar tables for other languages, but Python does not recognize them as escape sequences; it leaves the backslash in the string.
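To make the table concrete, here is a small sketch showing that an escape sequence occupies a single character and that the octal and hexadecimal notations name a character by its code. The variable names are illustrative, and print(...) is used with a single argument so the snippet runs under both Python 2 and Python 3.

```python
newline_single = '\n'   # escape inside single quotes
newline_double = "\n"   # same escape inside double quotes
tab = '\t'
octal_a = '\101'        # octal 101 is decimal 65, the letter A
hex_a = '\x41'          # hexadecimal 41 is also decimal 65

print(len(newline_single))            # an escape is one character long
print(newline_single == newline_double)
print(ord(tab))                       # character code of TAB
print(octal_a + hex_a)
print(repr("col1\tcol2\n"))           # repr() shows the escapes unexpanded
```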
Assume string variable a holds 'Hello' and variable b holds 'Python'; then:
Operator | Description | Example |
---|---|---|
+ | Concatenation – Adds values on either side of the operator | a + b will give HelloPython |
* | Repetition – Creates new strings, concatenating multiple copies of the same string | a*2 will give HelloHello |
[] | Slice – Gives the character from the given index | a[1] will give e |
[ : ] | Range Slice – Gives the characters from the given range | a[1:4] will give ell |
in | Membership – Returns true if a character exists in the given string | 'H' in a will give True |
not in | Membership – Returns true if a character does not exist in the given string | 'M' not in a will give True |
r/R | Raw String – Suppresses the actual meaning of escape characters. The syntax for raw strings is exactly the same as for normal strings, except for the raw string prefix, the letter "r", which precedes the quotation marks. The "r" can be lowercase (r) or uppercase (R) and must be placed immediately before the first quote mark. | print r'\n' prints \n and print R'\n' prints \n |
% | Format – Performs string formatting | See the next section |
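The rows of the operator table can be checked directly. The sketch below assumes the same a and b as in the table; the result variable names are made up for illustration.

```python
a = 'Hello'
b = 'Python'

concatenated = a + b          # concatenation
repeated = a * 2              # repetition
one_char = a[1]               # single index
range_slice = a[1:4]          # range slice, end index excluded
has_h = 'H' in a              # membership test returns a bool
lacks_m = 'M' not in a
raw = r'\n'                   # raw string: backslash followed by n
formatted = '%s %s' % (a, b)  # format operator

print(concatenated)
print(repeated)
print(one_char)
print(range_slice)
print(has_h)
print(lacks_m)
print(len(raw))
print(formatted)
```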
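One of Python's coolest features is the string format operator %. This operator is unique to strings and makes up for not having the functions of C's printf() family. Following is a simple example: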
#!/usr/bin/python
print "My name is %s and weight is %d kg!" % ('Zara', 21)
When the above code is executed, it produces the following result:
My name is Zara and weight is 21 kg!
Here is the complete list of symbols that can be used with %:
Format Symbol | Conversion |
---|---|
%c | character |
%s | string conversion via str() prior to formatting |
%i | signed decimal integer |
%d | signed decimal integer |
%u | unsigned decimal integer (obsolete; behaves the same as %d) |
%o | octal integer |
%x | hexadecimal integer (lowercase letters) |
%X | hexadecimal integer (UPPERcase letters) |
%e | exponential notation (with lowercase ‘e’) |
%E | exponential notation (with UPPERcase ‘E’) |
%f | floating point real number |
%g | the shorter of %f and %e |
%G | the shorter of %f and %E |
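A few of the conversion symbols above, applied to sample values. This is only a sketch; the numbers 255, 65, 3.5, 12345.678 and 1234567.0 are arbitrary choices for illustration.

```python
value = 255

as_decimal = '%d' % value          # signed decimal
as_octal = '%o' % value            # octal digits
as_hex_lower = '%x' % value        # hexadecimal, lowercase letters
as_hex_upper = '%X' % value        # hexadecimal, uppercase letters
as_char = '%c' % 65                # character with code 65
as_float = '%f' % 3.5              # six digits after the point by default
as_exp = '%e' % 12345.678          # exponential notation
as_general = '%g' % 1234567.0      # picks the shorter of %f and %e

for label, text in [('%d', as_decimal), ('%o', as_octal),
                    ('%x', as_hex_lower), ('%X', as_hex_upper),
                    ('%c', as_char), ('%f', as_float),
                    ('%e', as_exp), ('%g', as_general)]:
    print(label + ' -> ' + text)
```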
The following table lists other supported symbols and functionality:
Symbol | Functionality |
---|---|
* | argument specifies width or precision |
- | left justification |
+ | display the sign |
(space) | leave a blank space before a positive number |
# | add the octal leading zero ('0') or hexadecimal leading '0x' or '0X', depending on whether 'x' or 'X' were used. |
0 | pad from left with zeros (instead of spaces) |
% | '%%' leaves you with a single literal '%' |
(var) | mapping variable (dictionary arguments) |
m.n | m is the minimum total width and n is the number of digits to display after the decimal point (if applicable) |
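The flags in this table combine with the conversion symbols from the previous one. The following sketch tries each flag once; the values and the dictionary keys name and kg are chosen for illustration (they echo the earlier 'Zara' example).

```python
left = '%-6d|' % 42            # left-justify within 6 columns
signed = '%+d' % 42            # always show the sign
spaced = '% d' % 42            # blank before a positive number
zero_padded = '%05d' % 42      # pad with zeros to width 5
alt_hex = '%#x' % 255          # add the leading 0x
percent = '%d%%' % 50          # %% gives a literal percent sign
by_name = '%(name)s weighs %(kg)d kg' % {'name': 'Zara', 'kg': 21}
width_prec = '%8.3f' % 3.14159 # width 8, 3 digits after the point
star = '%*d' % (5, 42)         # width taken from the argument tuple

for text in [left, signed, spaced, zero_padded, alt_hex,
             percent, by_name, width_prec, star]:
    print('[' + text + ']')
```

The square brackets in the printed output are there only to make leading and trailing spaces visible.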
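Python's triple quotes come to the rescue by allowing strings to span multiple lines, including verbatim NEWLINEs, TABs, and any other special characters.
The syntax for triple quotes consists of three consecutive single or double quotes.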
#!/usr/bin/python
para_str = """this is a long string that is made up of
several lines and non-printable characters such as
TAB ( \t ) and they will show up that way when displayed.
NEWLINEs within the string, whether explicitly given like
this within the brackets [ \n ], or just a NEWLINE within
the variable assignment will also show up.
"""
print para_str
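When the above code is executed, it produces the following result. Note how every special character has been converted to its printed form, right down to the last NEWLINE at the end of the string, between "up." and the closing triple quotes. Also note that NEWLINEs occur either as an explicit line break at the end of a line or as the escape code (\n):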
this is a long string that is made up of
several lines and non-printable characters such as
TAB ( ) and they will show up that way when displayed.
NEWLINEs within the string, whether explicitly given like
this within the brackets [
], or just a NEWLINE within
the variable assignment will also show up.
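To see that literal line breaks and the \n escape both end up as the same NEWLINE character, the sketch below counts them in a shorter triple-quoted string. The string contents and variable names are invented for illustration.

```python
short_str = """line one
line two with an escaped TAB:\tand an escaped NEWLINE:\nline three
"""

# Two line breaks are typed literally and one is written as \n.
newline_count = short_str.count('\n')
has_tab = '\t' in short_str
ends_with_newline = short_str.endswith('\n')

print(newline_count)
print(has_tab)
print(ends_with_newline)
```

The final NEWLINE is present because the closing triple quotes sit on their own line.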
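Raw strings do not treat the backslash as a special character at all. Every character you put into a raw string stays the way you wrote it. First, a normal string for comparison: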
#!/usr/bin/python
print 'C:\\nowhere'
When the above code is executed, it produces the following result:
C:\nowhere
Now let's make use of a raw string. We put the expression in r'expression' as follows:
#!/usr/bin/python
print r'C:\\nowhere'
When the above code is executed, it produces the following result:
C:\\nowhere
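The difference is easiest to see by comparing lengths. In this sketch, the Windows-style path C:\new\table is a made-up example chosen because it happens to contain \n and \t.

```python
normal = 'C:\\nowhere'        # \\ collapses to one backslash
raw = r'C:\\nowhere'          # both backslashes are kept

plain_path = 'C:\new\table'   # \n and \t become NEWLINE and TAB
raw_path = r'C:\new\table'    # backslashes are kept literally

print(len(normal))
print(len(raw))
print(len(plain_path))
print(len(raw_path))
print(repr(plain_path))
print(repr(raw_path))
```

This is why raw strings are commonly used for Windows paths and regular expressions, where backslashes are meant literally.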
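In Python 2, normal strings are stored internally as 8-bit byte strings (ASCII by default), while Unicode strings are stored as 16-bit or wider Unicode. This allows for a more varied set of characters, including special characters from most languages in the world. I'll restrict my treatment of Unicode strings to the following: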
#!/usr/bin/python
print u'Hello, world!'
When the above code is executed, it produces the following result:
Hello, world!
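As you can see, Unicode strings use the prefix u, just as raw strings use the prefix r.
Python includes the following built-in methods to manipulate strings: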
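As a small illustration of why the distinction matters, the sketch below encodes a Unicode string to UTF-8 bytes and decodes it back. The sample text (the word "café", written with the \xe9 escape) is an assumption made for this example; the snippet is written to run under both Python 2 and Python 3.

```python
text = u'caf\xe9'                   # 4 characters; the last is U+00E9
encoded = text.encode('utf-8')      # U+00E9 needs two bytes in UTF-8
decoded = encoded.decode('utf-8')   # back to a Unicode string

print(len(text))
print(len(encoded))
print(decoded == text)
```

The character count and the byte count differ, which is the practical difference between a Unicode string and its encoded form.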
Sr.No. | Methods with Description |
---|---|
1 | capitalize() – Capitalizes the first letter of string. |
2 | center(width, fillchar) – Returns a string padded with fillchar (space by default), with the original string centered to a total of width columns. |
3 | count(str, beg=0, end=len(string)) – Counts how many times str occurs in string, or in a substring of string if starting index beg and ending index end are given. |
4 | decode(encoding='UTF-8', errors='strict') – Decodes the string using the codec registered for encoding. encoding defaults to the default string encoding. |
5 | encode(encoding='UTF-8', errors='strict') – Returns an encoded version of string; on error, the default is to raise a ValueError unless errors is given as 'ignore' or 'replace'. |
6 | endswith(suffix, beg=0, end=len(string)) – Determines if string, or a substring of string (if starting index beg and ending index end are given), ends with suffix; returns true if so and false otherwise. |
7 | expandtabs(tabsize=8) – Expands tabs in string to multiple spaces; defaults to 8 spaces per tab if tabsize is not provided. |
8 | find(str, beg=0, end=len(string)) – Determines if str occurs in string, or in a substring of string if starting index beg and ending index end are given; returns the index if found and -1 otherwise. |
9 | index(str, beg=0, end=len(string)) – Same as find(), but raises an exception if str is not found. |
10 | isalnum() – Returns true if string has at least 1 character and all characters are alphanumeric, and false otherwise. |
11 | isalpha() – Returns true if string has at least 1 character and all characters are alphabetic, and false otherwise. |
12 | isdigit() – Returns true if string contains only digits, and false otherwise. |
13 | islower() – Returns true if string has at least 1 cased character and all cased characters are in lowercase, and false otherwise. |
14 | isnumeric() – Returns true if a unicode string contains only numeric characters, and false otherwise. |
15 | isspace() – Returns true if string contains only whitespace characters, and false otherwise. |
16 | istitle() – Returns true if string is properly "titlecased", and false otherwise. |
17 | isupper() – Returns true if string has at least one cased character and all cased characters are in uppercase, and false otherwise. |
18 | join(seq) – Concatenates the strings in sequence seq into a single string, using string as the separator. |
19 | len(string) – Returns the length of the string (a built-in function rather than a method). |
20 | ljust(width[, fillchar]) – Returns a string padded with fillchar (space by default), with the original string left-justified to a total of width columns. |
21 | lower() – Converts all uppercase letters in string to lowercase. |
22 | lstrip() – Removes all leading whitespace in string. |
23 | maketrans() – Returns a translation table to be used in the translate function. |
24 | max(str) – Returns the max alphabetical character from the string str (a built-in function rather than a method). |
25 | min(str) – Returns the min alphabetical character from the string str (a built-in function rather than a method). |
26 | replace(old, new[, max]) – Replaces all occurrences of old in string with new, or at most max occurrences if max is given. |
27 | rfind(str, beg=0, end=len(string)) – Same as find(), but searches backwards in string. |
28 | rindex(str, beg=0, end=len(string)) – Same as index(), but searches backwards in string. |
29 | rjust(width[, fillchar]) – Returns a string padded with fillchar (space by default), with the original string right-justified to a total of width columns. |
30 | rstrip() – Removes all trailing whitespace of string. |
31 | split(str="", num=string.count(str)) – Splits string according to delimiter str (whitespace if not provided) and returns a list of substrings; performs at most num splits if num is given. |
32 | splitlines([keepends]) – Splits string at all NEWLINEs and returns a list of the lines; the NEWLINEs are removed unless keepends is true. |
33 | startswith(str, beg=0, end=len(string)) – Determines if string, or a substring of string (if starting index beg and ending index end are given), starts with substring str; returns true if so and false otherwise. |
34 | strip([chars]) – Performs both lstrip() and rstrip() on string. |
35 | swapcase() – Inverts case for all letters in string. |
36 | title() – Returns a "titlecased" version of string, that is, all words begin with uppercase and the rest are lowercase. |
37 | translate(table, deletechars="") – Translates string according to translation table table (256 chars), removing the characters in deletechars. |
38 | upper() – Converts lowercase letters in string to uppercase. |
39 | zfill(width) – Returns the original string left-padded with zeros to a total of width characters; intended for numbers, zfill() retains any sign given (less one zero). |
40 | isdecimal() – Returns true if a unicode string contains only decimal characters, and false otherwise. |
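A handful of the methods from the table, applied to one sample string. This is a sketch rather than a full tour; the sample text and variable names are invented, and only methods that behave the same in Python 2 and Python 3 are used.

```python
s = '  hello world  '

stripped = s.strip()                    # remove leading/trailing whitespace
capitalized = stripped.capitalize()     # first letter only
titled = stripped.title()               # first letter of every word
shouted = stripped.upper()
o_count = stripped.count('o')           # occurrences of a substring
found_at = stripped.find('world')       # index of first match
not_found = stripped.find('xyz')        # -1 when there is no match
replaced = stripped.replace('world', 'Python')
words = stripped.split()                # split on whitespace
joined = '-'.join(words)                # separator.join(sequence)
centered = 'hi'.center(6, '*')          # pad both sides with fillchar
zero_filled = '-42'.zfill(6)            # sign stays in front of the zeros
starts = stripped.startswith('hello')
all_digits = '2020'.isdigit()

# index() is like find(), but raises ValueError instead of returning -1.
try:
    stripped.index('xyz')
    index_raised = False
except ValueError:
    index_raised = True

print(stripped)
print(titled)
print(joined)
print(centered)
print(zero_filled)
```

Every call returns a new value and leaves s untouched, consistent with strings being immutable.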