Difference Between find and find_all in BeautifulSoup - Python

Last modified: 2021-09-12 10:44:59             Author: Mango

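BeautifulSoup is one of the most widely used Python libraries for navigating, searching, and extracting data from HTML or XML documents. The methods most often used to locate content in a page are find() and find_all(). The two differ in subtle ways, which are discussed in detail below.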

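find() method

The find method returns the first tag that matches the given name or id. The result is a bs4.element.Tag object, or None when nothing matches.

Example:

Consider this simple HTML page with several paragraph tags.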
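As a minimal sketch of searching by name or by id, the snippet below parses a small inline string instead of a file. The markup is an assumption made for illustration; only the id "vinayak" is taken from the article's own code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; only id="vinayak" comes from the article.
html = """
<div>
  <p id="vinayak">King</p>
  <p>Prince</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Search by tag name: the first <p> in document order is returned.
first_p = soup.find("p")

# Search by id only, using the id keyword argument.
by_id = soup.find(id="vinayak")

# Search by tag name and id together, as the article's example does.
by_both = soup.find("p", {"id": "vinayak"})

print(type(first_p).__name__)  # Tag
print(first_p.get_text())      # King

# All three searches land on the same tag in the parse tree.
print(first_p is by_id and by_id is by_both)  # True
```

In this sketch the three calls are interchangeable only because the first paragraph happens to carry the id; on other markup they can return different tags.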

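HTML

<!-- Markup reconstructed from the page text; only id="vinayak" is confirmed by the Python code -->
<!DOCTYPE html>
<html>

<head>
    <title>Geeks For Geeks</title>
</head>

<body>
    <div>
        <p id="vinayak">King</p>
        <p>Prince</p>
        <p>Queen</p>
    </div>
    <p>Princess</p>
</body>

</html>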


To get the text King, we use the find method.

Python

# find() example

import os

from bs4 import BeautifulSoup

# Directory that contains this script
base = os.path.dirname(os.path.abspath(__file__))

# Read and parse gfg.html, which sits next to this script
with open(os.path.join(base, 'gfg.html')) as html:
    soup = BeautifulSoup(html, 'html.parser')

# find() returns the first matching tag; take the text inside it
find_example = soup.find("p", {"id": "vinayak"}).get_text()

# Print the text obtained in the previous step
print(find_example)

Output:

King
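One caveat with the pattern above: when nothing matches, find returns None, so chaining .get_text() directly raises an AttributeError. The following is a small sketch of a guard, using an inline string and a made-up id ("no-such-id") that is assumed not to exist in the markup.

```python
from bs4 import BeautifulSoup

html = '<p id="vinayak">King</p>'
soup = BeautifulSoup(html, "html.parser")

# "no-such-id" is a hypothetical id that does not appear in the markup.
missing = soup.find("p", {"id": "no-such-id"})
print(missing)  # None

# Guard before calling get_text(), because None has no such method.
text = missing.get_text() if missing is not None else "(not found)"
print(text)  # (not found)
```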

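find_all() method

The find_all method returns every tag that matches the given tag name or id. The result is a bs4.element.ResultSet, a list-like collection of Tag objects.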

Example:

Consider the same simple HTML page with several paragraph tags that was used in the find() example.

To get all of the text, i.e. King, Prince, Queen and Princess, we use the find_all method.

Python

# find_all() example

import os

from bs4 import BeautifulSoup

# Directory that contains this script
base = os.path.dirname(os.path.abspath(__file__))

# Read and parse gfg.html, which sits next to this script
with open(os.path.join(base, 'gfg.html')) as html:
    soup = BeautifulSoup(html, 'html.parser')

# find_all() returns every <p> tag in document order
for word in soup.find_all('p'):

    # Take the text inside the current tag
    find_all_example = word.get_text()

    # Print the text obtained in the previous step
    print(find_all_example)

Output:

King
Prince
Queen
Princess
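Because the result behaves like a list, individual matches can also be picked out by index. The sketch below assumes inline markup with the same four paragraphs; apart from id="vinayak", the structure is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mirroring the four paragraphs named in the article.
html = """
<div>
  <p id="vinayak">King</p>
  <p>Prince</p>
  <p>Queen</p>
</div>
<p>Princess</p>
"""

soup = BeautifulSoup(html, "html.parser")

paragraphs = soup.find_all("p")
print(type(paragraphs).__name__)  # ResultSet
print(len(paragraphs))            # 4

# Collect the text of every match.
texts = [p.get_text() for p in paragraphs]
print(texts)  # ['King', 'Prince', 'Queen', 'Princess']

# Index into the result like a list: second and last match.
second = paragraphs[1].get_text()
last = paragraphs[-1].get_text()
print(second, last)  # Prince Princess

# With no match, find_all returns an empty result rather than None.
no_tables = soup.find_all("table")
print(len(no_tables))  # 0
```

Unlike find, an unsuccessful find_all can be looped over safely, since the loop body never runs on an empty result.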

Table of differences between find and find_all

| S.No. | find | find_all |
| 1 | Returns a result as soon as the first matching element is found on the page. | Scans the entire document and returns every match. |
| 2 | Gets only the first tag in the parsed HTML that satisfies the condition. | Gets all tags in the parsed HTML that satisfy the condition. |
| 3 | The return type of find is <class 'bs4.element.Tag'>. | The return type of find_all is <class 'bs4.element.ResultSet'>. |
| 4 | Only the first match can be printed as output. | Any match (second, third, last, etc.) or all matches can be printed as output. |
| 5 | Prototype: find(tag, attributes, recursive, text, keywords) | Prototype: find_all(tag, attributes, recursive, text, limit, keywords) |
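The extra parameters in the prototypes can be sketched as follows. The markup is again an assumption made for illustration, and the example uses the keyword string, which is the name current Beautiful Soup versions use for the argument the table calls text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup mirroring the four paragraphs named in the article.
html = """
<div>
  <p id="vinayak">King</p>
  <p>Prince</p>
  <p>Queen</p>
</div>
<p>Princess</p>
"""

soup = BeautifulSoup(html, "html.parser")

# limit stops the search after the given number of matches.
first_two = [p.get_text() for p in soup.find_all("p", limit=2)]
print(first_two)  # ['King', 'Prince']

# find("p") gives the same tag as the only item of find_all("p", limit=1).
same_tag = soup.find("p") is soup.find_all("p", limit=1)[0]
print(same_tag)  # True

# recursive=False looks only at direct children of the object searched,
# so the paragraphs nested inside <div> are skipped.
top_level = [p.get_text() for p in soup.find_all("p", recursive=False)]
print(top_level)  # ['Princess']

# string filters on the text a tag contains.
queens = [p.get_text() for p in soup.find_all("p", string="Queen")]
print(queens)  # ['Queen']
```

Seen this way, find behaves like find_all with limit=1, except that it hands back the tag itself (or None) instead of a one-item result.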