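Splitting a string on a delimiter is a very common task. For example, given a comma-separated list of files, we may want each file name as a separate entry in an array.
Almost every programming language provides a function that splits a string on a delimiter.
In C: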
// Splits str[] according to the given delimiters
// and returns the next token. It needs to be called
// in a loop to get all tokens. It returns NULL
// when there are no more tokens.
char *strtok(char str[], const char *delims);
C
// A C/C++ program for splitting a string
// using strtok()
#include <stdio.h>
#include <string.h>

int main()
{
    char str[] = "Geeks-for-Geeks";

    // Returns the first token
    char *token = strtok(str, "-");

    // Keep printing tokens until strtok() returns NULL,
    // i.e. until no tokens remain in str[].
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "-");
    }
    return 0;
}
Output: Geeks
for
Geeks
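Two details of strtok() are easy to miss: the second argument is a set of delimiter characters rather than one multi-character separator, and runs of consecutive delimiters never produce empty tokens. The sketch below mimics that behavior in Python purely for illustration; strtok_like is a hypothetical helper name, not a library function.

```python
def strtok_like(text, delims):
    """Return the non-empty tokens of text, treating every character
    in delims as a separator (hypothetical helper mimicking strtok())."""
    tokens = []
    current = []
    for ch in text:
        if ch in delims:
            # A delimiter ends the current token; empty tokens are skipped.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the last token if the text did not end with a delimiter.
    if current:
        tokens.append("".join(current))
    return tokens


print(strtok_like("Geeks-for-Geeks", "-"))  # ['Geeks', 'for', 'Geeks']
print(strtok_like("a,,b; c", ",; "))        # ['a', 'b', 'c']
```

Unlike strtok(), this sketch leaves the input untouched; the C function overwrites delimiters in str[] with null characters as it goes.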
In C++:
Note: The main disadvantage of strtok() is that it works only with C-style strings.
We therefore need to explicitly convert a C++ string into a char array.
Many programmers are unaware that C++ has two additional APIs that are more elegant
and work with C++ strings.
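Method 1: Using the C++ stringstream API
Prerequisite: stringstream API
A stringstream object can be initialized with a string object, and extracting from it with operator >> then tokenizes the string on whitespace automatically. Like the cin stream, a stringstream lets you read the string as a stream of words.
Some of the most commonly used functions of stringstream:
clear(): resets the stream's error state flags (it does not erase the contents).
str(): gets or sets the contents of the underlying string object.
operator <<: inserts a string object into the stream.
operator >>: extracts a word from the stream.
The code below demonstrates this.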
C++
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

// A quick way to split strings separated by spaces.
void simple_tokenizer(string s)
{
    stringstream ss(s);
    string word;
    while (ss >> word) {
        cout << word << endl;
    }
}

int main(int argc, char const* argv[])
{
    string a = "How do you do!";
    // Takes only space-separated C++ strings.
    simple_tokenizer(a);
    cout << endl;
    return 0;
}
Output : How
do
you
do!
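Method 2: Using the C++ find() and substr() APIs
Prerequisites: the find() and substr() functions.
This method is more robust and can parse a string using any delimiter, not just spaces (although the default is to split on a space). The logic is easy to follow from the code below.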
C++
#include <iostream>
#include <string>
using namespace std;

void tokenize(string s, string del = " ")
{
    size_t start = 0;
    size_t end = s.find(del);
    // find() returns string::npos when no delimiter is left.
    while (end != string::npos) {
        cout << s.substr(start, end - start) << endl;
        start = end + del.size();
        end = s.find(del, start);
    }
    // Print the final token, which follows the last delimiter.
    cout << s.substr(start);
}

int main(int argc, char const* argv[])
{
    // Takes a C++ string with any separator
    string a = "Hi$%do$%you$%do$%!";
    tokenize(a, "$%");
    cout << endl;
    return 0;
}
Output: Hi
do
you
do
!
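For comparison, the same find-and-slice loop can be sketched in Python, collecting the tokens in a list instead of printing them. This is a minimal illustration, not a replacement for the built-in str.split(); the function name tokenize and the empty-separator check are choices made here for the example.

```python
def tokenize(s, sep=" "):
    """Split s on the multi-character separator sep using find() and slicing."""
    if not sep:
        # An empty separator would match at every position and never advance.
        raise ValueError("empty separator")
    tokens = []
    start = 0
    end = s.find(sep)           # -1 plays the role of string::npos
    while end != -1:
        tokens.append(s[start:end])
        start = end + len(sep)  # skip past the separator
        end = s.find(sep, start)
    tokens.append(s[start:])    # the final token after the last separator
    return tokens


print(tokenize("Hi$%do$%you$%do$%!", "$%"))  # ['Hi', 'do', 'you', 'do', '!']
```

Returning a list rather than printing makes the result reusable. Note that adjacent separators yield empty tokens here, just as they do in the C++ loop.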
In Java:
In Java, split() is a method of the String class.
// regexp is the delimiting regular expression;
// limit (when positive) is the maximum number of strings returned
public String[] split(String regexp, int limit);
// We can also call split() without a limit
public String[] split(String regexp)
Java
// A Java program for splitting a string
// using split()
public class Test
{
    public static void main(String args[])
    {
        String str = "Geeks-for-Geeks";

        // Split the string into at most two strings
        for (String val : str.split("-", 2))
            System.out.println(val);

        System.out.println("");

        // Split str into all possible tokens
        for (String val : str.split("-"))
            System.out.println(val);
    }
}
Output:
Geeks
for-Geeks

Geeks
for
Geeks
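Because Java's split() takes a regular expression, its closest Python counterpart is re.split() rather than str.split(). The sketch below is only a cross-language comparison: a Java limit of 2 (at most two strings) corresponds to maxsplit=1 (at most one split) in Python. The variable names are illustrative.

```python
import re

text = "Geeks-for-Geeks"

# Java: text.split("-", 2) returns at most 2 strings, i.e. 1 split.
at_most_two = re.split("-", text, maxsplit=1)
print(at_most_two)   # ['Geeks', 'for-Geeks']

# Java: text.split("-") returns all tokens.
all_tokens = re.split("-", text)
print(all_tokens)    # ['Geeks', 'for', 'Geeks']

# A regular expression can match several different delimiters at once.
mixed = re.split(r"[-,;]", "a-b,c;d")
print(mixed)         # ['a', 'b', 'c', 'd']

# Python keeps trailing empty strings; Java's split("-") without a
# limit drops them and would return just "a" and "b" here.
trailing = re.split("-", "a-b--")
print(trailing)      # ['a', 'b', '', '']
```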
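In Python:
Python's split() method breaks the given string at the specified separator and returns a list of strings.
# sep is the delimiter string (not a regular expression);
# when omitted, the string is split on runs of whitespace.
# maxsplit limits the number of splits made; -1 means no limit.
str.split(sep=None, maxsplit=-1)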
Python
line = "Geek1 \nGeek2 \nGeek3"
print(line.split())
print(line.split(' ', 1))
Output:
['Geek1', 'Geek2', 'Geek3']
['Geek1', '\nGeek2 \nGeek3']
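The difference between the two calls above comes from how the separator is given: with no separator, split() treats any run of whitespace as one delimiter, while an explicit ' ' splits on every single space. The short sketch below shows this, along with the comma-separated file list scenario from the introduction; the file names are made up for the example.

```python
# Comma-separated list of files (hypothetical names) into a list.
files = "main.c,util.c,README.md"
file_list = files.split(",")
print(file_list)        # ['main.c', 'util.c', 'README.md']

line = "Geek1 \nGeek2 \nGeek3"

# No separator: runs of spaces and newlines count as one delimiter.
by_whitespace = line.split()
print(by_whitespace)    # ['Geek1', 'Geek2', 'Geek3']

# Explicit single space: the newlines stay attached to the tokens.
by_space = line.split(" ")
print(by_space)         # ['Geek1', '\nGeek2', '\nGeek3']

# Explicit separator: adjacent separators produce empty strings.
with_empty = "a,,b".split(",")
print(with_empty)       # ['a', '', 'b']

# maxsplit=1 with whitespace splitting: one split, remainder kept whole.
first_and_rest = line.split(None, 1)
print(first_and_rest)   # ['Geek1', 'Geek2 \nGeek3']
```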