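BeautifulSoup – Remove the Contents of a Tag

In this article, we will see how to remove the contents of a tag from HTML using BeautifulSoup. BeautifulSoup is a Python library for extracting data from HTML and XML files.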

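Module needed:

BeautifulSoup (bs4): the main module used here. It parses HTML or XML text into a tree that can be searched and modified. It does not download web pages itself.

To install it, run this command in a terminal: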

pip install bs4

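Approach:

First, we import the required library.
We read the HTML file or text.
We pass the HTML text to a soup object.
Then we find the required tag and clear its contents.

Step-by-step implementation:

Step 1: We start the program by importing the library and reading or creating the HTML document we want to work with.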

Python3

# Importing libraries
from bs4 import BeautifulSoup

# Reading the html text we want to parse
text = "<html><body><h1>Welcome</h1><p>This is a test page</p></body></html>"

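Step 2: We pass the HTML text to the soup object and choose a parser; here we use Python's built-in html.parser. Other parsers, such as lxml or html5lib, can be used if they are installed. Then we access the tag whose contents we want to remove.

Python3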

# creating a soup
soup = BeautifulSoup(text,"html.parser")
  
# printing the content in h1 tag
print(f"Content of h1 tag is: {soup.h1}")

Output:

Content of h1 tag is: <h1>Welcome</h1>
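As a side note on the tag lookup used in this step, the sketch below suggests that soup.h1 behaves like soup.find("h1"): it returns the first matching tag, or None when the document has no such tag. The sample markup and variable names are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption for this sketch, not the article's text)
sample = "<h1>First</h1><h1>Second</h1>"
soup = BeautifulSoup(sample, "html.parser")

# Attribute-style access and find() both return the first <h1> in the tree
first_by_attr = soup.h1
first_by_find = soup.find("h1")
print(first_by_attr is first_by_find)

# A tag that does not exist yields None rather than raising an error
missing = soup.h2
print(missing)
```

Because a missing tag yields None, calling .clear() on it would raise an AttributeError, so checking for None first is a reasonable precaution.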

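Step 3: We use the .clear() method. It removes everything inside the given tag (text and child tags) while keeping the tag itself.

Python3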

# clearing the content of the tag
soup.h1.clear()
  
# printing the content in h1 tag after clearing
print(f"Content of h1 tag after clearing: {soup.h1}")
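To illustrate what "contents" means here, the following minimal sketch applies clear() to a tag that holds nested tags and an attribute. The markup, the class name "box", and the variable names are hypothetical, chosen only for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a <div> with an attribute, nested tags, and loose text
sample = '<div class="box"><p>Hello <b>world</b></p>trailing text</div>'
soup = BeautifulSoup(sample, "html.parser")

# clear() removes every child of the tag: nested tags and text alike
soup.div.clear()

# The tag itself and its attributes remain in the tree
cleared = str(soup.div)
print(cleared)
```

The emptied tag stays in the document, which is the difference between clearing a tag's contents and removing the tag itself.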

Below is the complete implementation:

Python3

# Importing libraries
from bs4 import BeautifulSoup
  
# Reading the html text we want to parse
text = "<html><body><h1>Welcome</h1><p>This is a test page</p></body></html>"
  
# creating a soup
soup = BeautifulSoup(text,"html.parser")
  
# printing the content in h1 tag
print(f"Content of h1 tag is: {soup.h1}")
  
# clearing the content of the tag
soup.h1.clear()
  
# printing the content in h1 tag after clearing
print(f"Content of h1 tag after clearing: {soup.h1}")
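The implementation above clears a single tag. If several tags of the same kind need clearing, one possible variation is to loop over find_all(). This is a sketch under assumed sample markup; the list items and variable names are not from the original article.

```python
from bs4 import BeautifulSoup

# Assumed sample markup for this sketch
html = "<ul><li>One</li><li>Two</li></ul><p>Keep me</p>"
soup = BeautifulSoup(html, "html.parser")

# find_all() returns every matching tag, so each one can be cleared in turn
for item in soup.find_all("li"):
    item.clear()

# Guard against tags that are absent: find() returns None in that case
heading = soup.find("h1")
if heading is not None:
    heading.clear()

result = str(soup)
print(result)
```

Tags that were not selected, such as the paragraph here, are left untouched.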