BeautifulSoup object – Python Beautifulsoup

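The BeautifulSoup object is provided by Beautiful Soup, a web-scraping library for Python. Web scraping is the process of extracting data from websites with automated tools, which is much faster than collecting it by hand. The BeautifulSoup object represents the parsed document as a whole. For most purposes, you can treat it as a Tag object.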
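As a rough illustration of treating the object like a Tag, the sketch below checks that it is a Tag instance and calls a Tag method on it. The two-paragraph markup is made up for this example, and the built-in "html.parser" is used here only so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical sample markup; "html.parser" ships with Python.
soup = BeautifulSoup("<p>Hello</p><p>World</p>", "html.parser")

# BeautifulSoup subclasses Tag, so Tag methods work on the whole document.
is_tag = isinstance(soup, Tag)
paragraphs = [p.get_text() for p in soup.find_all("p")]

# The object does not correspond to a real tag, so it carries a placeholder name.
doc_name = soup.name

print(is_tag, paragraphs, doc_name)
```

The placeholder name is the main visible difference from an ordinary tag: searching and navigation behave the same way on the document object as on any tag inside it.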

Syntax: BeautifulSoup(document, parser)

Parameters: The constructor accepts two parameters, as explained below:

document: This parameter contains the XML or HTML document.
parser: This parameter contains the name of the parser to be used to parse the document.
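The two parameters can be sketched as follows. The markup string is invented for illustration, io.StringIO stands in for a real open file, and the fallback to "html.parser" is an assumption about how you might handle a missing lxml install (lxml is a separate package, while "html.parser" comes with Python).

```python
import io

from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample document.
html = "<html><body><h2>Heading 1</h2></body></html>"

# The document can be passed as a string...
soup_from_string = BeautifulSoup(html, "html.parser")

# ...or as an open file-like object (StringIO stands in for a file here).
soup_from_file = BeautifulSoup(io.StringIO(html), "html.parser")

# "lxml" must be installed separately; fall back to the built-in parser if it is missing.
try:
    soup_lxml = BeautifulSoup(html, "lxml")
except FeatureNotFound:
    soup_lxml = BeautifulSoup(html, "html.parser")

# All three objects expose the same h2 text.
headings = [s.h2.string for s in (soup_from_string, soup_from_file, soup_lxml)]
print(headings)
```

Naming the parser explicitly keeps results consistent across machines, since different parsers can build slightly different trees from malformed markup.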

The examples below illustrate the BeautifulSoup object in Beautiful Soup.
Example 1: In this example, we create a document with a BeautifulSoup object and print a tag.

Python3

# Import Beautiful Soup
from bs4 import BeautifulSoup
  
# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")
  
# Get the whole h2 tag
tag = soup.h2
  
# Print the tag
print(tag)

Output:

<h2> Heading 1 </h2>
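Once a tag has been retrieved this way, its name and text can be read directly. The following is a minimal sketch under the assumption of a small made-up page; it uses the built-in "html.parser" so that it runs without lxml, and it also shows what happens when the requested tag does not exist.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page for illustration.
soup = BeautifulSoup(
    "<html><h2> Heading 1 </h2><h1> Heading 2 </h1></html>", "html.parser"
)

# Attribute-style access returns the first matching tag.
tag = soup.h2

tag_name = tag.name                      # the tag's name
raw_text = tag.string                    # inner text, surrounding spaces kept
clean_text = tag.get_text(strip=True)    # inner text with whitespace stripped

# A tag that is not in the document yields None rather than raising an error.
missing = soup.h3

print(tag_name, repr(clean_text), missing)
```

Checking for None before using the result avoids an AttributeError when a page lacks the expected tag.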

Example 2: In this example, we create a document with a BeautifulSoup object and then extract the tag's attributes through its attrs attribute.

Python3

# Import Beautiful Soup
from bs4 import BeautifulSoup
  
# Initialize the object with an HTML page
soup = BeautifulSoup('''
    <html>
        <h2 class="hello"> Heading 1 </h2>
        <h1> Heading 2 </h1>
    </html>
    ''', "lxml")
  
# Get the whole h2 tag
tag = soup.h2
  
# Get the attribute
attribute = tag.attrs
  
# Print the output
print(attribute)

Output:

{'class': ['hello']}
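The class value comes back as a list because class is a multi-valued attribute in HTML. Individual attributes can also be read with dictionary-style access, as in this small sketch; the markup, including the id value, is invented for the example, and "html.parser" is used so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical tag with a multi-valued class and a single-valued id.
soup = BeautifulSoup(
    '<h2 class="hello world" id="title"> Heading 1 </h2>', "html.parser"
)
tag = soup.h2

classes = tag["class"]        # multi-valued attribute: a list of class names
tag_id = tag["id"]            # single-valued attribute: a plain string
missing = tag.get("href")     # get() returns None for an absent attribute
has_id = tag.has_attr("id")   # explicit membership check

print(classes, tag_id, missing, has_id)
```

Using get() instead of square brackets is the safer choice when an attribute may be absent, since tag["href"] would raise a KeyError here.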