Python – Substring Suffix Frequency

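Given a string and a substring, count every word that directly follows the substring in the string, i.e. every word that can be used to complete it.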

Input : test_str = "Gfg is good . Gfg is good . Gfg is better . Gfg is good .", substr = "Gfg is"
Output : {'good': 3, 'better': 1}
Explanation : "good" follows the substring 3 times in the string, hence 3; "better" follows it once, hence 1.

Input : test_str = "Gfg is good . Gfg is good . Gfg is good . Gfg is good .", substr = "Gfg is"
Output : {'good': 4}
Explanation : "good" follows the substring 4 times in the string, hence 4.

Method #1: Using regex + defaultdict() + loop

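This is one way to perform the task. We build a regular expression that captures the word following each occurrence of the substring, then use a defaultdict() to count how often each captured word occurs.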

Python3

# Python3 code to demonstrate working of
# Substring substitutes frequency
# Using regex() + defaultdict() + loop
from collections import defaultdict
import re
  
# initializing string
test_str = "Gfg is good . Gfg is best . Gfg is better . Gfg is good ."
  
# printing original string
print("The original string is : " + str(test_str))
  
# initializing substring
substr = "Gfg is"
  
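# finding every word that directly follows the substring;
# re.escape() keeps regex metacharacters in substr literal
temp = re.findall(re.escape(substr) + r" (\w+)", test_str, flags=re.IGNORECASE)
  
# counting how often each following word occurs
res = defaultdict(int)
for word in temp:
    res[word] += 1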
  
# printing result
print("Frequency of replacements : " + str(dict(res)))

Output

The original string is : Gfg is good . Gfg is best . Gfg is better . Gfg is good .
Frequency of replacements : {'good': 2, 'best': 1, 'better': 1}

Method #2: Using Counter() + regex

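This is another way to perform the task. Here the regular expression is the same, but Counter() computes the frequency of each captured word, replacing the explicit loop.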

Python3

# Python3 code to demonstrate working of
# Substring substitutes frequency
# Using Counter() + regex()
import re
from collections import Counter
  
# initializing string
test_str = "Gfg is good . Gfg is best . Gfg is better . Gfg is good ."
  
# printing original string
print("The original string is : " + str(test_str))
  
# initializing substring
substr = "Gfg is"
  
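# finding every word that directly follows the substring;
# re.escape() keeps regex metacharacters in substr literal
temp = re.findall(re.escape(substr) + r" (\w+)", test_str, flags=re.IGNORECASE)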
  
# adding values to form frequencies
res = dict(Counter(temp))
  
# printing result
print("Frequency of replacements : " + str(res))

Output

The original string is : Gfg is good . Gfg is best . Gfg is better . Gfg is good .
Frequency of replacements : {'good': 2, 'best': 1, 'better': 1}
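
As a rough sketch, the Counter() approach can be wrapped in a reusable function and checked against the sample input from the problem statement. The function name suffix_frequency and its ignore_case parameter are hypothetical and not part of the original article. The sketch also makes two assumptions the article does not state: it allows any run of whitespace between the substring and the following word, and, when matching case-insensitively, it lowercases the captured words so that "Good" and "good" share one count.

```python
import re
from collections import Counter


def suffix_frequency(text, substr, ignore_case=True):
    """Count the words that directly follow `substr` in `text`.

    Hypothetical helper for illustration only.
    """
    # re.escape() keeps characters such as "." or "+" in substr literal.
    # Assumption: any run of whitespace may separate substr and the word.
    pattern = re.escape(substr) + r"\s+(\w+)"
    flags = re.IGNORECASE if ignore_case else 0
    words = re.findall(pattern, text, flags=flags)
    if ignore_case:
        # Assumption: fold case so "Good" and "good" are counted together.
        words = [w.lower() for w in words]
    return dict(Counter(words))


sample = "Gfg is good . Gfg is good . Gfg is better . Gfg is good ."
print(suffix_frequency(sample, "Gfg is"))
# {'good': 3, 'better': 1}
```

Without the case folding, re.IGNORECASE alone would match "gfg is Good" but still report "Good" and "good" as separate keys, which is how the listings above behave.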