Python NLTK | nltk.tokenize.LineTokenizer

Last modified: 2022-05-13 01:55:34.602000 | Author: Mango

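The nltk.tokenize.LineTokenizer class splits a string into lines and returns each line as a token. Call its tokenize() method on the input string; the blanklines argument ('discard' by default, or 'keep' or 'discard-eof') controls how blank lines are handled.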
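To make the blank-line options concrete, here is a minimal pure-Python sketch of line tokenization. The helper name split_lines is hypothetical and the logic is modeled on the documented behavior of the blanklines options; it is an illustration built on str.splitlines(), not NLTK's actual source.

```python
def split_lines(text, blanklines="discard"):
    """Hypothetical stand-in illustrating line tokenization (not NLTK code)."""
    if blanklines not in ("discard", "keep", "discard-eof"):
        raise ValueError("blanklines must be 'discard', 'keep' or 'discard-eof'")

    # str.splitlines() removes the newline characters themselves
    lines = text.splitlines()

    if blanklines == "discard":
        # Drop every line that is empty or contains only whitespace
        lines = [line for line in lines if line.strip()]
    elif blanklines == "discard-eof":
        # Drop only a blank line at the very end of the input
        if lines and not lines[-1].strip():
            lines.pop()
    return lines


sample = "first line\n\nsecond line\n   \n"

discarded = split_lines(sample)                        # default: 'discard'
kept = split_lines(sample, blanklines="keep")
eof_trimmed = split_lines(sample, blanklines="discard-eof")

print(discarded)
print(kept)
print(eof_trimmed)
```

With 'keep', whitespace-only lines survive as tokens; with 'discard', they are filtered out; with 'discard-eof', only a blank final line is removed.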

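Example #1:
In this example, tokenize() splits a multi-line string into one token per line. Only the newline characters are removed, so leading and trailing spaces within each line are preserved.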

# Import the LineTokenizer class from nltk
from nltk.tokenize import LineTokenizer

# Create a LineTokenizer instance (blank lines are discarded by default)
tk = LineTokenizer()

# Create a multi-line input string
gfg = "GeeksforGeeks...$$&* \nis\n for geeks"

# Split the string into a list of lines
geek = tk.tokenize(gfg)

print(geek)

Output:

['GeeksforGeeks...$$&* ', 'is', ' for geeks']

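Example #2:
Here the tokenizer is created with blanklines='keep', so the empty line between the first two lines is returned as an empty-string token.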

# Import the LineTokenizer class from nltk
from nltk.tokenize import LineTokenizer

# Create a LineTokenizer instance that keeps blank lines
tk = LineTokenizer(blanklines='keep')

# Create a multi-line input string containing a blank line
gfg = "The price\n\n of burger \nin BurgerKing is Rs.36.\n"

# Split the string into a list of lines
geek = tk.tokenize(gfg)

print(geek)

Output:

['The price', '', ' of burger ', 'in BurgerKing is Rs.36.']
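As a side-by-side comparison, the sketch below runs the same input through all three blanklines settings. It assumes NLTK is installed; the variable names text and results are chosen only for illustration.

```python
from nltk.tokenize import LineTokenizer

text = "The price\n\n of burger \nin BurgerKing is Rs.36.\n"

# Tokenize the same string once per blank-line handling mode
results = {
    mode: LineTokenizer(blanklines=mode).tokenize(text)
    for mode in ("discard", "keep", "discard-eof")
}

for mode, tokens in results.items():
    print(f"{mode:12} -> {tokens}")
```

Only 'discard' drops the empty line in the middle of the input. The single trailing newline produces no extra token in any mode, so 'keep' and 'discard-eof' give the same result for this string.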