Python program to remove all control characters

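In telecommunications and computing, control characters are non-printable characters that are part of a character set. They do not represent any written symbol. Instead of adding a symbol to the text, they signal that some effect should take place, such as a line break or a tab stop. Removing them is a common text-cleaning task. This article discusses how to remove all control characters from a string.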

Examples:

Input : test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'

Output : Geeks for geeeks

Explanation : \n, \0, \f, \r, \b, \t being control characters are removed from string.

Input : test_str = 'G\0\r\n\fg'

Output : Gfg

Explanation : \n, \0, \f, \r being control characters are removed from string, giving Gfg as output.

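Method 1: Using translate()

The logic here is that the ASCII control characters occupy the first 32 code points (0 to 31). A translation table that maps each of those code points to None makes translate() delete them and leave every other character untouched. Note that range(32) does not cover DEL (code point 127), which is also a control character.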

Python3

# Python3 code to demonstrate working of
# Remove all control characters
# Using translate()
  
# initializing string
test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'
  
# printing original string
print("The original string is : " + str(test_str))
  
# using translate() and fromkeys():
# map code points 0-31 to None so that translate() deletes them
mapping = dict.fromkeys(range(32))
res = test_str.translate(mapping)
  
# printing result
print("String after removal of control characters : " + str(res))

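Output:

for riginal string is : Geeks
ge    eeks
String after removal of control characters : Geeks for geeeks

The first two lines look scrambled because the terminal acts on the control characters in the original string: the carriage return moves the cursor back to the start of the line, so " for " overwrites the beginning of "The original", and the newline, backspace and tab shift the rest. The exact appearance depends on the terminal.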

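Method 2: Using the unicodedata library

Here, unicodedata.category() returns the Unicode general category of each character. Control characters have the category "Cc", so the code keeps only the characters whose category does not start with "C". Because the test looks at the first letter only, it also drops the other "C" categories, such as format characters ("Cf") and unassigned code points ("Cn").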
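
To make the category codes concrete, here is a small sketch that looks up a few sample characters. The sample list and the variable names are assumptions picked for illustration.

```python
import unicodedata

# Sample characters (an arbitrary selection for illustration)
samples = ["\0", "\n", "\t", "\x7f", " ", "G", "\u200b"]

# Map each character to its Unicode general category
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in categories.items():
    # repr() keeps the control characters visible instead of letting them act
    print(repr(ch), cat)
```

The control characters report "Cc", the space reports "Zs", the letter reports "Lu", and the zero-width space U+200B reports "Cf", which is why a check on the first letter "C" removes it as well.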

Python3

# Python3 code to demonstrate working of
# Remove all control characters
# Using unicodedata library
import unicodedata
  
# initializing string
test_str = 'Geeks\0\r for \n\bge\tee\0ks\f'
  
# printing original string
print("The original string is : " + str(test_str))
  
# skipping every character whose Unicode category
# starts with "C" (this includes the control category "Cc")
res = "".join(char for char in test_str if unicodedata.category(char)[0] != "C")
  
# printing result
print("String after removal of control characters : " + str(res))

Output:

for riginal string is : Geeks
ge    eeks
String after removal of control characters : Geeks for geeeks