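The descendants generator in Python Beautiful Soup
The descendants generator is provided by Beautiful Soup, a Python library for parsing HTML and XML that is widely used for web scraping. Web scraping is the process of extracting data from websites with automated tools, which is much faster than collecting it by hand. The .contents and .children attributes consider only a tag's direct children. The descendants generator instead walks recursively through everything nested inside a tag: its children, their children, and so on. Each descendant is either a Tag object (for an element) or a NavigableString (for a piece of text).
Syntax: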
tag.descendants
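The difference between direct children and descendants is easiest to see side by side. The following is a minimal sketch, assuming a small nested snippet of markup invented here for illustration (the `html` string and the variable names are not from the original article).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, chosen only for illustration
html = "<div><p>Hello <b>world</b></p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

# .children yields only the direct children of <div>
children = list(div.children)

# .descendants walks the whole subtree, depth first
descendants = list(div.descendants)

print(len(children))     # 1: just the <p> element
print(len(descendants))  # 4: <p>, "Hello ", <b>, "world"
for node in descendants:
    print(repr(node))
```

Here .children stops at the `<p>` element, while .descendants also reaches the text and the `<b>` element nested inside it.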
The examples below illustrate the descendants generator in Beautiful Soup.
Example 1: In this example, we get the descendants of an element.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Create the document: a <body> wrapping a single <b> element
doc = "<body><b> Hello world </b></body>"

# Initialize the object with the document
soup = BeautifulSoup(doc, "html.parser")

# Get the body tag
tag = soup.body

# Print all the descendants of tag
for descendant in tag.descendants:
    print(descendant)
Output:
<b> Hello world </b>
 Hello world 
Example 2: In this example, we look at the types of the descendants.
Python3
# Import Beautiful Soup
from bs4 import BeautifulSoup

# Create the document: a <body> wrapping a single <b> element
doc = "<body><b> Hello world </b></body>"

# Initialize the object with the document
soup = BeautifulSoup(doc, "html.parser")

# Get the body tag
tag = soup.body

# Print the type of each descendant of tag
for descendant in tag.descendants:
    print(type(descendant))
Output:
<class 'bs4.element.Tag'>
<class 'bs4.element.NavigableString'>
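Because descendants mix element nodes and text nodes, a common follow-up is to separate the two. The sketch below is one possible way to do that with isinstance checks; the markup and the list names `tag_names` and `texts` are assumptions made for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Illustrative markup (an assumption, not taken from a real page)
html = "<body><b> Hello world </b></body>"
soup = BeautifulSoup(html, "html.parser")

tag_names = []  # names of element descendants
texts = []      # stripped text of string descendants

for node in soup.body.descendants:
    if isinstance(node, Tag):
        tag_names.append(node.name)
    elif isinstance(node, NavigableString):
        texts.append(str(node).strip())

print(tag_names)  # ['b']
print(texts)      # ['Hello world']
```

Checking the node type this way avoids calling tag-only attributes such as .name on plain strings.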