Python – Consecutive Character Frequency (1)

📅 Last modified: 2023-12-03 15:34:09.371000 🧑 Author: Mango

Python – Consecutive Character Frequency

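In Python, a few lines of code are enough to count how many times each character repeats consecutively in a string, that is, to find the length of each run of identical adjacent characters. This is useful in areas such as text mining and data analysis.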

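Implementation

We iterate over the string and compare each character with the one before it. If they match, the counter is incremented; otherwise the run that just ended is recorded and the counter is reset to 1. Each run is stored in a dictionary keyed by run length, whose values are lists of the runs of that length. The final run has to be recorded after the loop, because no differing character follows it. Finally, the dictionary is sorted by run length in descending order and the results are printed.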

def freq_of_conseq_chars(string):
    # Maps each run length to the runs of that length, in order of appearance.
    freq_dict = {}
    if not string:
        return  # an empty string has no runs (and string[-1] would fail)
    count = 1
    for i in range(1, len(string)):
        if string[i] == string[i-1]:
            count += 1
        else:
            # The run ended at index i-1: record it and start a new one.
            freq_dict.setdefault(count, []).append(string[i-1] * count)
            count = 1
    # Record the final run, which the loop never closes.
    freq_dict.setdefault(count, []).append(string[-1] * count)
    # Longest runs first; runs of equal length keep their original order.
    for k, runs in sorted(freq_dict.items(), reverse=True):
        for run in runs:
            print(f"{run}: run of '{run[0]}' with length {k}")

Example

Applying the function above to the string "AAABBBCCDDEFFF" produces the following output:

AAA: run of 'A' with length 3
BBB: run of 'B' with length 3
FFF: run of 'F' with length 3
CC: run of 'C' with length 2
DD: run of 'D' with length 2
E: run of 'E' with length 1
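
As an alternative to the manual loop, the same run lengths can be obtained with `itertools.groupby` from the standard library, which groups equal adjacent items. The sketch below is not part of the original approach; `consecutive_runs` is a hypothetical helper name chosen for illustration, and it returns the runs in order of appearance instead of printing them.

```python
from itertools import groupby


def consecutive_runs(text):
    """Return (character, run_length) pairs in order of appearance."""
    # groupby yields one group per maximal block of equal adjacent characters.
    return [(char, sum(1 for _ in group)) for char, group in groupby(text)]


runs = consecutive_runs("AAABBBCCDDEFFF")
print(runs)
# [('A', 3), ('B', 3), ('C', 2), ('D', 2), ('E', 1), ('F', 3)]
```

Returning a list of pairs leaves sorting and formatting to the caller, for example `sorted(runs, key=lambda r: r[1], reverse=True)` to list the longest runs first. An empty string simply yields an empty list.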

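Conclusion

With the code above, we can compute how often characters repeat consecutively in a string. This is useful in text mining and data analysis, where it helps reveal repetition patterns in text data.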