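How to remove child elements in BeautifulSoup?
Beautiful Soup is a Python library for web scraping. It can also be used to modify HTML documents. This article describes how to remove child elements with Beautiful Soup, using several of the library's methods.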
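Methods used:
- clear(): Tag.clear() removes the contents of a tag (all of its children) and leaves the empty tag itself in the tree.
- decompose(): Tag.decompose() removes a tag from the tree of the HTML document, then completely destroys it and its contents.
- replace_with(): Tag.replace_with() replaces a tag with a new tag or string. replaceWith() is the older, deprecated spelling of the same method.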
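The difference between the three methods is easiest to see side by side. The following is a minimal sketch on a small made-up document (the `html` string and the variable names are illustrative assumptions, not part of the original examples):

```python
from bs4 import BeautifulSoup

html = '<div id="parent"><p>first</p><span>second</span></div>'

# clear(): the div stays in the tree, but all of its children are removed
soup = BeautifulSoup(html, "html.parser")
soup.find("div").clear()
after_clear = str(soup)

# decompose(): the <p> tag and everything inside it are removed and destroyed
soup = BeautifulSoup(html, "html.parser")
soup.find("p").decompose()
after_decompose = str(soup)

# replace_with(): the <p> tag is swapped for a newly created <b> tag
soup = BeautifulSoup(html, "html.parser")
new_tag = soup.new_tag("b")
new_tag.string = "new"
soup.find("p").replace_with(new_tag)
after_replace = str(soup)

print(after_clear)      # <div id="parent"></div>
print(after_decompose)  # <div id="parent"><span>second</span></div>
print(after_replace)    # <div id="parent"><b>new</b><span>second</span></div>
```

Calling clear() on the parent empties it, while decompose() and replace_with() act on the tag they are called on, so they are called on the child that should go.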
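Approach:
- Import the module.
- Scrape the data from a web page.
- Parse the scraped HTML string.
- Find the tag whose child elements are to be removed.
- Use any one of the methods: clear(), decompose(), or replace_with().
- Print the result.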
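The steps above can be sketched as one small function. This is only an illustration: `remove_children` is a hypothetical helper name, and an inline string stands in for the scraped page so that the sketch runs without network access.

```python
from bs4 import BeautifulSoup


def remove_children(html, tag_name):
    """Return html with the children of the first <tag_name> removed."""
    # Parse the HTML string
    soup = BeautifulSoup(html, "html.parser")
    # Find the tag whose children should be removed
    tag = soup.find(tag_name)
    # find() returns None when nothing matches, so guard before calling clear()
    if tag is not None:
        tag.clear()
    return str(soup)


# Stand-in for HTML that would normally come from a scraped page
page = "<body><div><p>child</p></div><p>kept</p></body>"
result = remove_children(page, "div")
print(result)  # <body><div></div><p>kept</p></body>
```

The `None` check matters in practice: if the page has no matching tag, calling a method on the result of find() raises an AttributeError.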
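Example 1:
Python3
# import the module
from bs4 import BeautifulSoup
# sample document; the HTML tags are reconstructed from the text of the example
markup = """
<html>
<head><title>Example</title></head>
<body>
<div id="parent">
<p>
This is child of div with id = "parent".
<span>Child of "P"</span>
</p>
<div>
Another Child of div with id = "parent".
</div>
</div>
<p>Piyush</p>
</body>
</html>
"""
# parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')
# find the tag whose children are to be deleted
div_bs4 = soup.find('div')
# delete all child elements; the div itself stays in the tree
div_bs4.clear()
print(div_bs4)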
Output:
<div id="parent"></div>
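Example 2:
Python3
# import the module
from bs4 import BeautifulSoup
# sample document; the HTML tags are reconstructed from the text of the example
markup = """
<html>
<head><title>Example</title></head>
<body>
<div id="parent">
<p>
This is child of div with id = "parent".
<span>Child of "P"</span>
</p>
<div>
Another Child of div with id = "parent".
</div>
</div>
<p>Piyush</p>
</body>
</html>
"""
# parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')
# find the tag to be deleted together with its children
div_bs4 = soup.find('div')
# remove the tag from the tree and destroy it and its contents
div_bs4.decompose()
# div_bs4 now refers to a destroyed tag, so its printed form is not meaningful
print(div_bs4)
Output (the behavior of a decomposed tag is undefined, so this can vary between Beautiful Soup versions; older releases print the following):
<None></None>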
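Since printing a decomposed tag is not a reliable check, one alternative is to inspect the `decomposed` property and print the document itself. The sketch below assumes a Beautiful Soup release that provides `decomposed` (4.9.1 or later, to the best of my knowledge) and uses a shortened, made-up document:

```python
from bs4 import BeautifulSoup

html = '<body><div id="parent"><p>child</p></div><p>Piyush</p></body>'
soup = BeautifulSoup(html, "html.parser")

div_bs4 = soup.find("div")
div_bs4.decompose()

# The destroyed tag reports that it has been decomposed
print(div_bs4.decomposed)  # True
# The document no longer contains the div or its children
print(soup)                # <body><p>Piyush</p></body>
```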
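Example 3:
Python3
# import the module
from bs4 import BeautifulSoup
# sample document; the HTML tags are reconstructed from the text of the example
markup = """
<html>
<head><title>Example</title></head>
<body>
<div id="parent">
<p>
This is child of div with id = "parent".
<span>Child of "P"</span>
</p>
<div>
Another Child of div with id = "parent".
</div>
</div>
<p>Piyush</p>
</body>
</html>
"""
# parse the string as HTML
soup = BeautifulSoup(markup, 'html.parser')
# find the tag to be replaced together with its children
div_bs4 = soup.find('div')
# replace the tag with an empty string (replaceWith is the deprecated spelling)
div_bs4.replace_with('')
# div_bs4 still refers to the detached tag, so this prints the removed div
print(div_bs4)
Output:
<div id="parent">
<p>
This is child of div with id = "parent".
<span>Child of "P"</span>
</p>
<div>
Another Child of div with id = "parent".
</div>
</div>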
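The output above is the removed element, not the modified document, because replace_with() returns the tag it took out of the tree. To see what is left, print the soup instead. A minimal sketch on a shortened, made-up document (the `html` string and the name `removed` are illustrative assumptions):

```python
from bs4 import BeautifulSoup

html = '<body><div id="parent"><p>child</p></div><p>Piyush</p></body>'
soup = BeautifulSoup(html, "html.parser")

div_bs4 = soup.find("div")
# replace_with() returns the element that was replaced
removed = div_bs4.replace_with("")

print(removed is div_bs4)  # True
print(removed)             # <div id="parent"><p>child</p></div>
print(soup.find("div"))    # None
print(soup)                # <body><p>Piyush</p></body>
```

Unlike decompose(), the replaced tag stays intact after removal, which is useful if it needs to be inspected or inserted somewhere else later.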