Python - 从列表中删除非英文字符字符串
给定一个字符串列表,执行删除所有非英文字符的字符串。
Input : test_list = [‘Good| ????’, ‘??Geeks???’]
Output : []
Explanation : Both contain non-English characters
Input : test_list = [“Gfg”, “Best”]
Output : [“Gfg”, “Best”]
Explanation : Both are valid English words.
方法 #1:使用正则表达式 + findall() + 列表理解
在此,我们创建一个 unicode 的正则表达式并检查字符串列表中的出现,使用 findall() 提取每个没有 unicode 的字符串。
Python3
# Python3 code to demonstrate working of
# Remove Non-English characters Strings from List
# Using regex + findall() + list comprehension
import re
# initializing list
test_list = ['Gfg', 'Good| ????', "for", '??Geeks???']
# printing original list
print("The original list is : " + str(test_list))
# using findall() to neglect unicode of Non-English alphabets
res = [idx for idx in test_list if not re.findall("[^\u0000-\u05C0\u2100-\u214F]+", idx)]
# printing result
print("The extracted list : " + str(res))
Python3
# Python3 code to demonstrate working of
# Remove Non-English characters Strings from List
# Using regex + search() + filter() + lambda
import re
# initializing list
test_list = ['Gfg', 'Good| ????', "for", '??Geeks???']
# printing original list
print("The original list is : " + str(test_list))
# using search() to get only those strings with alphabets
res = list(filter(lambda ele: re.search("[a-zA-Z\s]+", ele) is not None, test_list))
# printing result
print("The extracted list : " + str(res))
方法 #2:使用正则表达式 + search() + filter() + lambda
在此,我们仅在 String 中搜索英文字母,并仅提取那些具有这些的英文字母。我们使用 filter() + lambda 来执行传递过滤器功能和迭代的任务。
Python3
# Python3 code to demonstrate working of
# Remove Non-English characters Strings from List
# Using regex + search() + filter() + lambda
import re
# initializing list
test_list = ['Gfg', 'Good| ????', "for", '??Geeks???']
# printing original list
print("The original list is : " + str(test_list))
# using search() to get only those strings with alphabets
res = list(filter(lambda ele: re.search("[a-zA-Z\s]+", ele) is not None, test_list))
# printing result
print("The extracted list : " + str(res))