📅  Last modified: 2023-12-03 15:34:09.371000             🧑  Author: Mango
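In Python, a few lines of code are enough to find the runs of consecutive identical characters in a string and group them by length. This can be useful in areas such as text mining and data analysis.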
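We iterate over the string and compare each character with the previous one. If they match, a counter is incremented; otherwise the run that just ended is recorded and the counter is reset to 1. The runs are stored in a dictionary keyed by run length, where each value is the list of runs of that length. Finally, the dictionary is sorted by run length in descending order and the result is printed.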
def freq_of_conseq_chars(string):
    # Maps each run length to the list of runs of that length, in order of appearance.
    freq_dict = {}
    if not string:
        # An empty string has no runs; string[-1] below would raise IndexError.
        return freq_dict
    count = 1
    for i in range(1, len(string)):
        if string[i] == string[i-1]:
            count += 1
        else:
            # The run ending at index i-1 is complete, so record it.
            freq_dict.setdefault(count, []).append(string[i-1] * count)
            count = 1
    # Record the final run, which the loop never closes.
    freq_dict.setdefault(count, []).append(string[-1] * count)
    # Print every run, longest first.
    for k, v in sorted(freq_dict.items(), reverse=True):
        for run in v:
            print(f"{run}: run of '{run[0]}' with length {k}")
    return freq_dict
Applying the function above to the string "AAABBBCCDDEFFF" produces the following output:
AAA: run of 'A' with length 3
BBB: run of 'B' with length 3
FFF: run of 'F' with length 3
CC: run of 'C' with length 2
DD: run of 'D' with length 2
E: run of 'E' with length 1
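The same grouping can also be sketched with `itertools.groupby` from the standard library, which splits a sequence into maximal runs of equal elements. This is an alternative to the index-based loop, not the approach used above. The function name `runs_by_length` is a hypothetical name chosen for illustration, and the sketch returns the dictionary instead of printing so the result can be reused.

```python
from itertools import groupby


def runs_by_length(text):
    """Group maximal runs of identical characters by their length."""
    runs = {}
    # groupby yields one (char, iterator) pair per maximal run of equal characters.
    for char, group in groupby(text):
        length = sum(1 for _ in group)
        runs.setdefault(length, []).append(char * length)
    return runs


result = runs_by_length("AAABBBCCDDEFFF")
for length in sorted(result, reverse=True):
    print(length, result[length])
# 3 ['AAA', 'BBB', 'FFF']
# 2 ['CC', 'DD']
# 1 ['E']
```

Because `groupby` handles the run boundaries itself, there is no special case for the last run, and an empty string simply yields an empty dictionary.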
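With the code above, we can find the runs of consecutive identical characters in a string and see how long each one is, which helps reveal repetition patterns in text data.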