Python – Consecutive Character Frequency

Last modified: 2023-12-03 15:34:09.371000 | Author: Mango

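In Python, a few lines of code are enough to find the runs of consecutive identical characters in a string and measure how long each run is. This is useful in areas such as text mining and data analysis.

Implementation

We iterate over the string and compare each character with the one before it. If the two match, we increment a counter; otherwise the current run has ended, so we record it and reset the counter to 1. Runs are stored in a dictionary keyed by run length, where each value is the list of runs of that length. Finally, we sort the dictionary by key in descending order and print the result.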

def freq_of_conseq_chars(string):
    # An empty string has no runs (and string[-1] below would raise IndexError)
    if not string:
        return
    freq_dict = {}  # run length -> list of runs with that length
    count = 1
    for i in range(1, len(string)):
        if string[i] == string[i-1]:
            count += 1
        else:
            # The run ending at index i-1 is complete, so record it
            freq_dict.setdefault(count, []).append(string[i-1] * count)
            count = 1
    # Record the final run, which the loop never closes
    freq_dict.setdefault(count, []).append(string[-1] * count)
    # Print every run, longest first
    for k, runs in sorted(freq_dict.items(), reverse=True):
        for run in runs:
            print(f"{run}: '{run[0]}' occurs {k} time(s) in a row")

Example

Applying the function above to the string "AAABBBCCDDEFFF" produces the following output:

AAA: 'A' occurs 3 time(s) in a row
BBB: 'B' occurs 3 time(s) in a row
FFF: 'F' occurs 3 time(s) in a row
CC: 'C' occurs 2 time(s) in a row
DD: 'D' occurs 2 time(s) in a row
E: 'E' occurs 1 time(s) in a row
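
The same runs can also be collected with the standard library instead of a manual index loop. The following is a minimal sketch, assuming it is more convenient to return the runs as data than to print them; the names consecutive_runs and length_counts are illustrative and do not come from the original article.

```python
from collections import Counter
from itertools import groupby


def consecutive_runs(text):
    """Return (character, run_length) pairs in order of appearance."""
    # With no key function, groupby groups adjacent equal items,
    # so each group is exactly one run of identical characters.
    return [(char, sum(1 for _ in group)) for char, group in groupby(text)]


runs = consecutive_runs("AAABBBCCDDEFFF")
print(runs)
# [('A', 3), ('B', 3), ('C', 2), ('D', 2), ('E', 1), ('F', 3)]

# Count how many runs exist for each run length, longest first.
length_counts = Counter(length for _, length in runs)
print(sorted(length_counts.items(), reverse=True))
# [(3, 3), (2, 2), (1, 1)]
```

Because groupby only merges adjacent equal items, a string such as "AABAA" yields two separate runs of 'A', and an empty string yields an empty list without any special handling.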
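
Conclusion

The code above finds the runs of consecutive identical characters in a string and reports how long each run is. In text mining and data analysis, this kind of run information helps reveal patterns in text data.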