Python Strings

Last modified: 2020-10-24 09:03:05   Author: Mango

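So far, we have discussed numbers as a standard Python data type. In this part of the tutorial, we discuss the most popular Python data type: the string.

A Python string is a collection of characters surrounded by single quotes, double quotes, or triple quotes. Computers do not understand characters; internally, they store and manipulate characters as combinations of 0s and 1s.

Each character is encoded according to a character encoding such as ASCII or Unicode. In Python 3, a string (str) is a sequence of Unicode characters.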
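As a small illustration of the encoding point above, the following sketch uses the built-in ord() and chr() functions and the str.encode() method. The sample characters are arbitrary choices made for this illustration, not part of the original text.

```python
# Each character in a Python 3 str is a Unicode code point.
letter = "A"
code_point = ord(letter)        # integer code point of the character
same_letter = chr(code_point)   # back from the code point to a one-character string

# Encoding turns text into bytes. UTF-8 uses one byte for ASCII characters
# and more than one byte for characters outside ASCII.
ascii_bytes = "Hi".encode("utf-8")
non_ascii_bytes = "\u00e9".encode("utf-8")   # e with an acute accent

print(code_point)             # 65
print(same_letter)            # A
print(len(ascii_bytes))       # 2
print(len(non_ascii_bytes))   # 2
```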

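In Python, we create a string by enclosing a character or a sequence of characters in quotes. Single quotes, double quotes, or triple quotes may be used.

Consider the following example, which creates a string in Python.

Syntax: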

str = "Hi Python !"  

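Here, if we check the type of the variable str in a Python script with

print(type(str))

it prints <class 'str'>, which shows that the value is a string.

In Python, a string is treated as a sequence of characters, and there is no separate character data type. A single character written as 'p' is simply a string of length 1.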
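The point about single characters can be checked directly. This is a minimal sketch; the variable names ch and first are chosen only for illustration.

```python
ch = 'p'
print(type(ch))   # <class 'str'>
print(len(ch))    # 1

# Indexing a string also yields a string, not a separate character type.
first = "python"[0]
print(type(first) is str)   # True
```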

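Creating a string in Python

We can create a string by enclosing characters in single quotes or double quotes. Python also provides triple quotes for strings, but these are generally used for multiline strings or docstrings.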

# Using single quotes
str1 = 'Hello Python'
print(str1)

# Using double quotes
str2 = "Hello Python"
print(str2)

# Using triple quotes
str3 = '''Triple quotes are generally used to
    represent multiline strings or
    docstrings'''
print(str3)

Output:

Hello Python
Hello Python
Triple quotes are generally used to
    represent multiline strings or
    docstrings

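String indexing and slicing

As in many other languages, the indexing of Python strings starts at 0. For example, in the string "HELLO", H is at index 0, E at index 1, L at indexes 2 and 3, and O at index 4.

Consider the following example: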

str = "HELLO"
print(str[0])
print(str[1])
print(str[2])
print(str[3])
print(str[4])
# Raises IndexError because index 6 is out of range
print(str[6])

Output:

H
E
L
L
O
IndexError: string index out of range
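The IndexError above can be handled with try/except. The sketch below also shows that slicing, unlike indexing, does not raise for out-of-range bounds. The variable names are illustrative assumptions, not taken from the original example.

```python
word = "HELLO"

# Indexing past the end raises IndexError, which can be caught.
try:
    ch = word[6]
except IndexError:
    ch = None

# Slicing, by contrast, clamps out-of-range bounds instead of raising.
tail = word[3:100]
empty = word[10:20]

print(ch)            # None
print(tail)          # LO
print(repr(empty))   # ''
```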

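As shown above, the index operator [] accesses individual characters of a string. To access a substring, we use the : (colon) inside the brackets, which is known as slicing.

Note that the upper bound of a slice is always exclusive: given str = 'HELLO', str[1:3] contains str[1] = 'E' and str[2] = 'L', and nothing else.

Consider the following example: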

# Given string
str = "JAVATPOINT"
# From index 0 to the end
print(str[0:])
# From index 1 up to index 4
print(str[1:5])
# From index 2 up to index 3
print(str[2:4])
# From index 0 up to index 2
print(str[:3])
# From index 4 up to index 6
print(str[4:7])

Output:

JAVATPOINT
AVAT
VA
JAV
TPO
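The exclusive upper bound has some convenient consequences, sketched below. The variable names and the step example are additions for illustration.

```python
text = "JAVATPOINT"
start, stop = 1, 5

# The slice length equals stop - start because stop is excluded.
piece = text[start:stop]
print(piece)                         # AVAT
print(len(piece) == stop - start)    # True

# Adjacent slices that share a bound join without overlap or gaps.
rejoined = text[:4] + text[4:]
print(rejoined == text)              # True

# An optional third value is the step.
every_other = text[::2]
print(every_other)                   # JVTON
```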

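We can also use negative indexes with strings. Negative indexing starts from the rightmost character, which has index -1; the second character from the right has index -2, and so on.

Consider the following example: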

str = 'JAVATPOINT'
print(str[-1])
print(str[-3])
print(str[-2:])
print(str[-4:-1])
print(str[-7:-2])
# Reversing the given string
print(str[::-1])
print(str[-12])

Output:

T
I
NT
OIN
ATPOI
TNIOPTAVAJ
IndexError: string index out of range
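One way to read negative indexes is as offsets from the length of the string. The following sketch checks that relationship; the names n and matches are illustrative.

```python
text = "JAVATPOINT"
n = len(text)   # 10

# A negative index -k refers to the same character as index n - k.
matches = [text[-k] == text[n - k] for k in (1, 3, 10)]
print(matches)       # [True, True, True]

last_three = text[-3:]   # from the third-last character to the end
print(last_three)    # INT
print(text[-n])      # J (the first character)
```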

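Reassigning strings

Updating the content of a string is as easy as assigning a new string to the variable. A string object does not support item assignment: a string can only be replaced with a new string, because its content cannot be partially changed. Strings are immutable in Python.

Consider the following example.

Example 1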

str = "HELLO"  
str[0] = "h"  
print(str)  

Output:

Traceback (most recent call last):
  File "12.py", line 2, in <module>
    str[0] = "h"
TypeError: 'str' object does not support item assignment

However, the variable str can be assigned entirely new content, as shown in the following example.

Example 2

str = "HELLO"  
print(str)  
str = "hello"  
print(str)  

Output:

HELLO
hello  
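Because strings are immutable, a "modified" string is always a new object built from the old one. The sketch below shows two common ways to do this; the variable names are assumptions made for illustration.

```python
greeting = "HELLO"

# Build a new string from a replacement character plus the remaining slice.
changed = "h" + greeting[1:]
print(changed)     # hELLO
print(greeting)    # HELLO (the original string is unchanged)

# str.replace() also returns a new string rather than modifying in place.
replaced = greeting.replace("L", "l")
print(replaced)    # HEllO
```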

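Deleting a string

As we know, strings are immutable, so we cannot delete or remove individual characters from a string. However, we can delete the entire string using the del keyword.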

str = "JAVATPOINT"
del str[1]

Output:

TypeError: 'str' object doesn't support item deletion

Now we delete the entire string.

str1 = "JAVATPOINT"
del str1
print(str1)

Output:

NameError: name 'str1' is not defined
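Although a character cannot be deleted in place, a new string without it can be built from slices. The sketch below also shows catching the NameError that follows del; the names used are hypothetical.

```python
text = "JAVATPOINT"

# "Remove" the character at index 1 by joining the slices around it.
without_index_1 = text[:1] + text[2:]
print(without_index_1)   # JVATPOINT

# After del, the name no longer exists; using it raises NameError.
temp = "to be deleted"
del temp
try:
    temp
    still_defined = True
except NameError:
    still_defined = False
print(still_defined)     # False
```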

String operators

| Operator | Description |
| --- | --- |
| + | Concatenation operator. It joins the strings on either side of the operator. |
| * | Repetition operator. It concatenates multiple copies of the same string. |
| [] | Index operator. It accesses the character at a given position in a string. |
| [:] | Range slice operator. It accesses the characters in the specified range. |
| in | Membership operator. It returns True if a particular substring is present in the specified string. |
| not in | Also a membership operator; it does the exact reverse of in. It returns True if a particular substring is not present in the specified string. |
| r/R | Specifies a raw string. Raw strings are used when backslashes should be kept as written instead of being interpreted as escape sequences, as in a path such as C:\python. To define a raw string, the character r or R is placed immediately before the opening quote. |
| % | Performs string formatting. It uses format specifiers like those in C, such as %d or %f, to map values into the string. String formatting is discussed later in this tutorial. |

Consider the following example to understand the practical use of these operators.

str = "Hello"
str1 = " world"
print(str*3)  # prints HelloHelloHello
print(str+str1)  # prints Hello world
print(str[4])  # prints o
print(str[2:4])  # prints ll
print('w' in str)  # prints False as w is not present in str
print('wo' not in str1)  # prints False as wo is present in str1
print(r'C:\new\python37')  # prints C:\new\python37 as written; \n is not treated as a newline
print("The string str : %s" % (str))  # prints The string str : Hello

Output:

HelloHelloHello
Hello world
o
ll
False
False
C:\new\python37
The string str : Hello
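The operators can be combined freely. The short sketch below builds a banner with repetition and concatenation and shows that numbers must be converted before concatenation; all names here are illustrative.

```python
word = "Hello"
line = "-" * 10                 # repetition
banner = line + word + line     # concatenation
print(banner)                   # ----------Hello----------
print("ell" in word)            # True
print("xyz" not in word)        # True

# Concatenation requires both operands to be strings; convert numbers first.
count = 3
message = word + " x" + str(count)
print(message)                  # Hello x3
```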

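Python string formatting

Escape sequences

Suppose we need to write the text: They said, "Hello what's going on?". This statement cannot be written directly inside single or double quotes, because it contains both a single quote and double quotes, so it raises a SyntaxError.

Consider the following example.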

str = "They said, "Hello what's going on?""
print(str)

Output:

SyntaxError: invalid syntax

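We can use triple quotes to solve this problem, but Python also provides escape sequences.

The backslash (\) symbol denotes an escape sequence. The backslash is followed by a special character, and the pair is interpreted differently from the character alone. A single quote inside a single-quoted string must be escaped, and the same applies to a double quote inside a double-quoted string.

Example: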

# using triple quotes
print('''They said, "What's there?"''')

# escaping single quotes
print('They said, "What\'s going on?"')

# escaping double quotes
print("They said, \"What's going on?\"")

Output:

They said, "What's there?"
They said, "What's going on?"
They said, "What's going on?"

The list of escape sequences is given below. Each entry gives the escape sequence, its description, and an example. The output shown for the control characters \b, \f, \r, and \v is how a typical terminal renders them and may differ elsewhere.

1. \newline: the backslash and the newline are ignored.

print("Python1 \
Python2 \
Python3")

Output:

Python1 Python2 Python3

2. \\: backslash

print("\\")

Output:

\

3. \': single quote

print('\'')

Output:

'

4. \": double quote

print("\"")

Output:

"

5. \a: ASCII bell (BEL)

print("\a")

6. \b: ASCII backspace (BS)

print("Hello \b World")

Output:

Hello World

7. \f: ASCII form feed (FF)

print("Hello \f World!")

Output:

Hello  World!

8. \n: ASCII line feed (LF)

print("Hello \n World!")

Output:

Hello
 World!

9. \r: ASCII carriage return (CR)

print("Hello \r World!")

Output:

 World!

10. \t: ASCII horizontal tab (TAB)

print("Hello \t World!")

Output:

Hello  World!

11. \v: ASCII vertical tab (VT)

print("Hello \v World!")

Output:

Hello 
 World!

12. \ooo: character with octal value ooo

print("\110\145\154\154\157")

Output:

Hello

13. \xHH: character with hex value HH

print("\x48\x65\x6c\x6c\x6f")

Output:

Hello
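When it is unclear what an escape sequence produced, repr() shows the string with its escapes written out, and len() confirms that each escape is a single character. This is a minimal sketch with an arbitrary sample string.

```python
sample = "Hello\tWorld\n"

# Each escape sequence counts as one character.
print(len(sample))     # 12

# repr() displays the escapes instead of rendering them.
print(repr(sample))    # 'Hello\tWorld\n'

# Octal and hexadecimal escapes can name the same character.
same_char = "\110" == "\x48" == "H"
print(same_char)       # True
```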

Here is a simple example of escape sequences.

print("C:\\Users\\DEVANSH SHARMA\\Python32\\Lib")
print("This is the \n multiline quotes")
print("This is \x48\x45\x58 representation")

Output:

C:\Users\DEVANSH SHARMA\Python32\Lib
This is the 
 multiline quotes
This is HEX representation

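We can ignore the escape sequences in a string by using a raw string, which is written by placing r or R in front of the string. Consider the following example.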

print(r"C:\\Users\\DEVANSH SHARMA\\Python32")

Output:

C:\\Users\\DEVANSH SHARMA\\Python32
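A raw string and an escaped normal string can describe the same text. The sketch below compares the two forms; the path is a made-up example.

```python
normal = "C:\\Users\\name"   # backslashes escaped in a normal string
raw = r"C:\Users\name"       # raw string: backslashes kept as written

print(normal == raw)         # True
print(raw)                   # C:\Users\name

# In a raw string, \n is two characters (a backslash and n), not a newline.
newline_length = len("\n")
raw_length = len(r"\n")
print(newline_length, raw_length)   # 1 2
```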

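The format() method

The format() method is a flexible and useful way to format strings. Curly braces {} are used as placeholders in the string and are replaced by the arguments of the format() method. Let's look at an example: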

# Using curly braces
print("{} and {} both are the best friend".format("Devansh","Abhishek"))

# Positional arguments
print("{1} and {0} best players ".format("Virat","Rohit"))

# Keyword arguments
print("{a},{b},{c}".format(a = "James", b = "Peter", c = "Ricky"))

Output:

Devansh and Abhishek both are the best friend
Rohit and Virat best players 
James,Peter,Ricky 
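Placeholders can also carry a format specification after a colon, for example to set precision or alignment. The sketch below shows a few common cases, plus an f-string, which uses the same specification syntax and requires Python 3.6 or later. The values are arbitrary.

```python
price = 3.14159
name = "Devansh"

rounded = "{:.2f}".format(price)            # two digits after the decimal point
right = "[{:>10}]".format(name)             # right-aligned in a width of 10
left = "[{:<10}]".format(name)              # left-aligned in a width of 10
repeated = "{0} {0} {1}".format("a", "b")   # an index can be reused
fstring = f"{name} pays {price:.2f}"        # f-string (Python 3.6+)

print(rounded)    # 3.14
print(right)      # [   Devansh]
print(left)       # [Devansh   ]
print(repeated)   # a a b
print(fstring)    # Devansh pays 3.14
```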

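Python string formatting using the % operator

Python allows us to use the format specifiers used in C's printf statement. The format specifiers are treated in much the same way as in C. However, Python provides an additional operator, %, which acts as an interface between the format specifiers and their values. In other words, it binds the format specifiers to the values.

Consider the following example.

Integer = 10
Float = 1.290
String = "Devansh"
print("Hi I am Integer ... My value is %d\nHi I am float ... My value is %f\nHi I am string ... My value is %s"%(Integer,Float,String))

Output: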

Hi I am Integer ... My value is 10
Hi I am float ... My value is 1.290000
Hi I am string ... My value is Devansh
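As with printf in C, the specifiers accept a width and a precision. The sketch below shows a few variations, including a dictionary on the right-hand side and a literal percent sign. The values, including the age, are made up for illustration.

```python
value = 1.29

two_places = "%.2f" % value     # precision of two decimal places
padded = "%5d|" % 42            # right-aligned in a width of 5
left = "%-8s|" % "Hi"           # left-aligned in a width of 8
mapped = "%(name)s is %(age)d" % {"name": "Devansh", "age": 30}
percent = "%d%%" % 50           # %% produces a literal percent sign

print(two_places)   # 1.29
print(padded)       #    42|
print(left)         # Hi      |
print(mapped)       # Devansh is 30
print(percent)      # 50%
```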

Python string functions

Python provides various built-in functions and methods for string handling.

| Method | Description |
| --- | --- |
| capitalize() | Returns a copy of the string with its first character capitalized and the rest lowercased. |
| casefold() | Returns a version of the string suitable for caseless comparisons. |
| center(width[, fillchar]) | Returns the string centered in a field of the given width, padded on both sides with fillchar (a space by default). |
| count(string, begin, end) | Counts the number of non-overlapping occurrences of a substring in the string between the begin and end indexes. |
| decode(encoding='UTF8', errors='strict') | Decodes using the codec registered for encoding. In Python 3 this is a method of bytes objects, not of str. |
| encode() | Encodes the string using the codec registered for encoding and returns a bytes object. The default encoding is 'utf-8'. |
| endswith(suffix, begin=0, end=len(string)) | Returns True if the string ends with the given suffix between begin and end, otherwise False. |
| expandtabs(tabsize=8) | Replaces each tab character with spaces up to the next tab stop. The default tab size is 8. |
| find(substring, beginIndex, endIndex) | Returns the lowest index at which substring is found between the begin index and the end index, or -1 if it is not found. |
| format(value) | Returns a formatted version of the string, using the passed values. |
| index(substring, beginIndex, endIndex) | Works like find(), but raises a ValueError if the substring is not found. |
| isalnum() | Returns True if all the characters in the string are alphanumeric (letters or numbers) and there is at least one character, otherwise False. |
| isalpha() | Returns True if all the characters are alphabetic and there is at least one character, otherwise False. |
| isdecimal() | Returns True if all the characters of the string are decimal characters and there is at least one character, otherwise False. |
| isdigit() | Returns True if all the characters are digits and there is at least one character, otherwise False. |
| isidentifier() | Returns True if the string is a valid identifier. |
| islower() | Returns True if all the cased characters of the string are lowercase and there is at least one cased character, otherwise False. |
| isnumeric() | Returns True if the string contains only numeric characters and there is at least one character, otherwise False. |
| isprintable() | Returns True if all the characters of the string are printable or the string is empty, otherwise False. |
| isspace() | Returns True if all the characters of the string are whitespace and there is at least one character, otherwise False. |
| istitle() | Returns True if the string is title-cased, otherwise False. In a title-cased string, each word starts with an uppercase character and its remaining cased characters are lowercase. |
| isupper() | Returns True if all the cased characters of the string are uppercase and there is at least one cased character, otherwise False. |
| join(seq) | Returns a string that concatenates the strings in the sequence seq, using the string on which it is called as the separator. |
| len(string) | A built-in function (not a method) that returns the length of a string. |
| ljust(width[, fillchar]) | Returns the string left-justified in a field of the given width, padded with fillchar (a space by default). |
| lower() | Returns a copy of the string with all cased characters converted to lowercase. |
| lstrip() | Removes all leading whitespace from the string. Given a chars argument, it removes any leading characters in that set instead. |
| partition(sep) | Searches for the separator sep in the string and returns the part before it, the separator itself, and the part after it. If the separator is not found, it returns the string and two empty strings. |
| maketrans() | Returns a translation table to be used by translate(). |
| replace(old, new[, count]) | Returns a copy with occurrences of old replaced by new. If count is given, only the first count occurrences are replaced. |
| rfind(str, beg=0, end=len(str)) | Similar to find(), but returns the highest index at which the substring is found. |
| rindex(str, beg=0, end=len(str)) | Same as index(), but returns the highest index at which the substring is found. |
| rjust(width[, fillchar]) | Returns the string right-justified in a field of the given width, padded with fillchar (a space by default). |
| rstrip() | Removes all trailing whitespace from the string. Given a chars argument, it removes any trailing characters in that set instead. |
| rsplit(sep=None, maxsplit=-1) | Same as split(), but when maxsplit is given the splits are made from the right. It returns the list of words in the string. If sep is not specified, the string is split on whitespace. |
| split(sep=None, maxsplit=-1) | Splits the string at the delimiter sep and returns a list of the substrings. If sep is not provided, the string is split on whitespace. At most maxsplit splits are made when maxsplit is given. |
| splitlines(keepends=False) | Returns a list of the lines in the string, with line breaks removed unless keepends is true. |
| startswith(str, beg=0, end=len(str)) | Returns True if the string starts with the given str between beg and end, otherwise False. |
| strip([chars]) | Performs both lstrip() and rstrip() on the string. |
| swapcase() | Inverts the case of all cased characters in the string. |
| title() | Converts the string to title case; for example, the string meEruT is converted to Meerut. |
| translate(table) | Translates the string according to the translation table passed to the method. |
| upper() | Returns a copy of the string with all cased characters converted to uppercase. |
| zfill(width) | Returns the string left-padded with zeros to a total of width characters. It is intended for numbers; a leading sign is kept in front of the zeros. |
| rpartition(sep) | Like partition(), but splits at the last occurrence of sep. If the separator is not found, it returns two empty strings followed by the string. |
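Several of the methods listed above are often chained when cleaning up text. The following sketch parses a hypothetical "key = value" line; the input string and variable names are assumptions made for illustration.

```python
raw_line = "  name = Devansh Sharma  "

cleaned = raw_line.strip()                  # remove surrounding whitespace
key, sep, value = cleaned.partition("=")    # split at the first "="
key, value = key.strip(), value.strip()
words = value.split()                       # split on whitespace
initials = "".join(w[0] for w in words)     # join with an empty separator

print(key.upper())                  # NAME
print("-".join(words))              # Devansh-Sharma
print(initials)                     # DS
print(value.find("Sharma"))         # 8
print(value.find("xyz"))            # -1
print(value.replace("a", "A", 1))   # DevAnsh Sharma
print("42".zfill(5))                # 00042
print("meEruT".title())             # Meerut
```

All of these calls return new strings or lists; the original raw_line is left unchanged.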