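How to extract a div tag and its contents by id using BeautifulSoup?
Beautiful Soup is a Python library for web scraping. It can also be used to modify HTML documents. This article describes how to use Beautiful Soup to extract a div and its contents by its ID. To do this, the module's find() function is used to locate the div by its ID.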
Approach:
- Import the module
- Scrape the data from the web page
- Parse the scraped string as HTML
- Find the div with the given ID
- Print its contents
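The steps above can be sketched as one small function. This is a minimal sketch, not a definitive implementation: the helper name extract_div_by_id is hypothetical, and a hard-coded string stands in for the scraped page so the example runs without network access. In a real script the string would come from an HTTP client (for instance requests.get(url).text, assuming the requests package is installed).

```python
from bs4 import BeautifulSoup


def extract_div_by_id(html, div_id):
    """Hypothetical helper: return the text of the div with the given id, or None."""
    # Parse the scraped string as HTML.
    soup = BeautifulSoup(html, 'html.parser')
    # Find the div whose id attribute matches.
    div = soup.find('div', id=div_id)
    # find() returns None when nothing matches, so guard before reading the tag.
    if div is None:
        return None
    return div.get_text(strip=True)


# Stand-in for the downloaded page (an assumption made for this sketch).
page = '<html><body><div id="container">Div Content</div></body></html>'

result = extract_div_by_id(page, 'container')
missing = extract_div_by_id(page, 'no-such-id')
print(result)
print(missing)
```

Checking for None before touching the result avoids an AttributeError when the page does not contain the expected id.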
Syntax : find(tag_name, **kwargs)
Parameters:
- The tag_name argument tells Beautiful Soup to find only tags with the given name. Text strings are ignored, as are tags whose names don't match.
- The **kwargs arguments filter on tag attributes; here, the id keyword is matched against each tag's 'id' attribute.
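To illustrate how the two parameters interact, here is a short sketch using a made-up two-element document (the markup and variable names are assumptions for illustration only). It shows the keyword form, the equivalent attrs dictionary form, a search by id alone, and what happens when the tag name and id do not belong to the same element.

```python
from bs4 import BeautifulSoup

# Illustrative markup: one div and one paragraph, each with its own id.
html = '<div id="first">A</div><p id="second">B</p>'
soup = BeautifulSoup(html, 'html.parser')

# Keyword form: tag name plus id filter.
by_kwarg = soup.find('div', id='first')

# Equivalent form using the attrs dictionary.
by_attrs = soup.find('div', attrs={'id': 'first'})

# Omitting the tag name matches any tag that carries the id.
any_tag = soup.find(id='second')

# The name and the id must match the same tag; otherwise find() returns None.
wrong_name = soup.find('div', id='second')

print(by_kwarg.string, by_attrs.string, any_tag.name, wrong_name)
```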
Below is the implementation:
Example 1:
Python3
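# importing the module
from bs4 import BeautifulSoup

# HTML to parse: a single div whose id is "container"
markup = '''<html><body><div id="container">Div Content</div></body></html>'''
soup = BeautifulSoup(markup, 'html.parser')

# finding the div with the id
div_bs4 = soup.find('div', id="container")

# printing the text inside the div
print(div_bs4.string)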
Output:
Div Content
Example 2:
Python3
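# importing the module
from bs4 import BeautifulSoup

# HTML to parse: several divs, two of which carry an id
markup = """
<html>
  <head><title>Example</title></head>
  <body>
    <div>
      <div>Nested div</div>
      <div id="first">Div with ID first</div>
    </div>
    <div id="second">Div with id second</div>
  </body>
</html>
"""

# parsing the string as HTML
soup = BeautifulSoup(markup, 'html.parser')

# finding the div with the id
div_bs4 = soup.find('div', id="second")

# printing the text inside the div
print(div_bs4.string)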
Output:
Div with id second
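The examples above print only the text of the div through .string, which works because each div holds a single text node. When the div contains nested tags, .string is None, so other accessors are needed to get the tag itself or its contents. The following is a minimal sketch with illustrative markup (the id "box" and the variable names are assumptions for this example):

```python
from bs4 import BeautifulSoup

# Illustrative markup: the div holds a text node and a nested <b> tag.
markup = '<div id="box">Hello <b>world</b></div>'
soup = BeautifulSoup(markup, 'html.parser')
div = soup.find('div', id='box')

# The whole tag, including the div itself.
whole_tag = str(div)

# Only the inner HTML, without the enclosing div.
inner_html = div.decode_contents()

# All text inside the div, with nested tags stripped.
text = div.get_text()

# .string is None here because the div has more than one child.
only_string = div.string

print(whole_tag)
print(inner_html)
print(text)
print(only_string)
```

In short, str(div) is a reasonable choice when the markup itself is wanted, and get_text() when only the readable text matters.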