📜  Text splitter - Python code example

📅  Last modified: 2022-03-11 14:45:04.694000             🧑  Author: Mango

Code example 2
import re
import pandas as pd

def text_splitter(t):
    # Strip punctuation, then split the sentence into words once
    words = re.sub(r'[^\w\s]', '', t).split()
    rows = []
    for i in range(len(words)):
        text = ' '.join(words[:i+1])        # prefix: the first i+1 words
        rest = words[i+1:]                  # words that follow the prefix
        for k in range(len(rest)):
            output = ' '.join(rest[:k+1])   # first k+1 words after the prefix
            rows.append({'text': text, 'output': output})
    # DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
    return pd.DataFrame(rows, columns=['text', 'output'])


t = 'How do I reset a password in windows'
text_splitter(t)

    text                          output
0   How                           do
1   How                           do I
2   How                           do I reset
3   How                           do I reset a
4   How                           do I reset a password
5   How                           do I reset a password in
6   How                           do I reset a password in windows
7   How do                        I
8   How do                        I reset
9   How do                        I reset a
10  How do                        I reset a password
11  How do                        I reset a password in
12  How do                        I reset a password in windows
13  How do I                      reset
14  How do I                      reset a
15  How do I                      reset a password
16  How do I                      reset a password in
17  How do I                      reset a password in windows
18  How do I reset                a
19  How do I reset                a password
20  How do I reset                a password in
21  How do I reset                a password in windows
22  How do I reset a              password
23  How do I reset a              password in
24  How do I reset a              password in windows
25  How do I reset a password     in
26  How do I reset a password     in windows
27  How do I reset a password in  windows
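
The same prefix/continuation pairing can also be produced without pandas. The following is a minimal sketch that assumes only the standard library. The name iter_prefix_pairs is hypothetical and does not come from the original snippet. It yields plain (text, output) tuples in the same order as the table above.

```python
import re

def iter_prefix_pairs(sentence):
    """Yield (prefix, continuation) pairs for every split point in the sentence."""
    # Same punctuation stripping as the original snippet
    words = re.sub(r"[^\w\s]", "", sentence).split()
    # i is the number of words in the prefix
    for i in range(1, len(words)):
        prefix = " ".join(words[:i])
        # j is the exclusive end index of the continuation
        for j in range(i + 1, len(words) + 1):
            yield prefix, " ".join(words[i:j])

pairs = list(iter_prefix_pairs("How do I reset a password in windows"))
print(len(pairs))   # number of (text, output) rows
print(pairs[0])     # first pair
print(pairs[-1])    # last pair
```

A sentence of n words produces n*(n-1)/2 pairs, so the eight-word example gives 28, matching rows 0 through 27 above. Because the function is a generator, the pairs can be streamed straight into a file or a training pipeline without holding a full table in memory.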