Count the Number of Paragraph Tags Using BeautifulSoup

When extracting data from an HTML web page, you may want to know how many paragraph tags a given HTML document uses. This article shows how to count them.

Syntax:

print(len(soup.find_all("p")))
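
As a quick illustration of this one-liner, here is a minimal sketch that parses a small HTML string rather than a file. The markup and the names `sample_html` and `paragraph_count` are made up for this illustration and are not part of the original example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration
sample_html = """
<html>
  <body>
    <p>First paragraph</p>
    <div><p>Second paragraph, nested in a div</p></div>
    <span>Not a paragraph</span>
  </body>
</html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# find_all("p") returns every <p> tag at any nesting depth,
# so len() of the result is the paragraph count
paragraph_count = len(soup.find_all("p"))
print(paragraph_count)  # → 2
```

The nested paragraph is counted as well, because `find_all` searches all descendants by default.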

Approach:

Step 1: First, import the libraries BeautifulSoup and os.

from bs4 import BeautifulSoup as bs
import os

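Step 2: Now, get the directory that contains your script. Pass the name of the Python file you are currently working in to os.path.abspath, and os.path.dirname strips the last segment of the resulting path.

base = os.path.dirname(os.path.abspath('#Name of Python file in which you are currently working'))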

Step 3: Then, open the HTML file from which you wish to read the value.

html = open(os.path.join(base, '#Name of HTML file from which you wish to read value'))
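
Steps 2 and 3 together build the path of an HTML file that sits next to the script. The sketch below shows one way this might look. The helper `path_next_to` and the file names `count_tags.py` and `gfg.html` are hypothetical and used only for illustration.

```python
import os


def path_next_to(script_path, file_name):
    """Return the path of file_name in the same directory as script_path."""
    # abspath makes the script path absolute; dirname drops its last
    # segment (the script's file name), leaving the containing directory
    base = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(base, file_name)


# Hypothetical file names, for illustration only
html_path = path_next_to(os.path.join("project", "count_tags.py"), "gfg.html")
print(html_path)
```

In a real script, `__file__` could be passed as `script_path`, so that the HTML file is found regardless of the directory the script is started from.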

Step 4: Next, parse the HTML file with BeautifulSoup.

soup=bs(html, 'html.parser')
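
The BeautifulSoup constructor accepts either an open file handle, as in this step, or a string of markup. The short sketch below compares the two, using `io.StringIO` as a stand-in for an open file. The markup and variable names are assumptions made for this illustration.

```python
import io

from bs4 import BeautifulSoup as bs

# Made-up markup for illustration
markup = "<p>Hello</p><p>World</p>"

# Parse the same content once from a string and once from a
# file-like object (StringIO stands in for an open HTML file)
soup_from_string = bs(markup, "html.parser")
soup_from_file = bs(io.StringIO(markup), "html.parser")

count_from_string = len(soup_from_string.find_all("p"))
count_from_file = len(soup_from_file.find_all("p"))
print(count_from_string, count_from_file)  # → 2 2
```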

Step 5: Next, print a label line, if desired.

print("Number of paragraph tags:")

Step 6: Finally, count and print the number of paragraph tags in the HTML document.

print(len(soup.find_all("p")))
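
The steps above can be combined into a single function. The following is a minimal sketch under stated assumptions: the function name `count_paragraph_tags`, the temporary file `sample.html`, and its contents are all invented here so that the example runs without any existing HTML file.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def count_paragraph_tags(html_path):
    """Return the number of <p> tags in the HTML file at html_path."""
    # The with block closes the file once parsing is done
    with open(html_path, encoding="utf-8") as html_file:
        soup = BeautifulSoup(html_file, "html.parser")
    return len(soup.find_all("p"))


# Write a small hypothetical page to a temporary directory so the
# sketch runs without any existing file
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.html")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("<html><body><p>One</p><p>Two</p><p>Three</p></body></html>")

    count = count_paragraph_tags(sample_path)

print("Number of paragraph tags:")
print(count)  # → 3
```

Opening the file in a `with` block is a design choice of this sketch, not of the steps above; it makes sure the file handle is closed even if parsing fails.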

Implementation:

Example 1

Let us consider a simple HTML web page that has several paragraph tags.

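HTML

<!DOCTYPE html>
<html>

<head>
    <title>Geeks For Geeks</title>
</head>

<body>

    <p>King</p>

    <p>Prince</p>

    <p>Queen</p>

    <p>Princess</p>

</body>

</html>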

To find the number of paragraph tags in the above HTML web page, run the following code.

Python

# Python program to count the paragraph tags
# in a given HTML document with Beautiful Soup

# Import the Beautiful Soup library
from bs4 import BeautifulSoup as bs

# Open the HTML file and parse it with Beautiful Soup;
# the with block closes the file afterwards
with open('gfg.html') as html:
    soup = bs(html, 'html.parser')

# Print a label for the result
print("Number of paragraph tags:")

# Count and print the number of paragraph tags
print(len(soup.find_all("p")))

Output:

Example 2

In the program below, we find the number of paragraph tags on a particular website.

Python

# Python program to count the paragraph tags
# on a given website with Beautiful Soup

# Import Beautiful Soup and requests
from bs4 import BeautifulSoup as bs
import requests

# Assign the URL
URL = 'https://www.geeksforgeeks.org/'

# Fetch the page content from the URL
page = requests.get(URL)

# Parse the page content with Beautiful Soup
soup = bs(page.content, 'html.parser')

# Print a label for the result
print("Number of paragraph tags:")

# Count and print the number of paragraph tags
print(len(soup.find_all("p")))

Output:
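
The website example can be made a little more defensive. The sketch below is one possible variant, not the method used in the program above: the function names `count_paragraph_tags` and `count_paragraph_tags_at` and the default `timeout` of 10 seconds are assumptions made here. The counting step is kept separate from the network request so that it can be checked offline with made-up markup.

```python
import requests
from bs4 import BeautifulSoup


def count_paragraph_tags(markup):
    """Return the number of <p> tags in an HTML string or bytes object."""
    soup = BeautifulSoup(markup, "html.parser")
    return len(soup.find_all("p"))


def count_paragraph_tags_at(url, timeout=10):
    """Fetch url and return the number of <p> tags in the response body."""
    # timeout stops the call from hanging indefinitely;
    # raise_for_status raises requests.HTTPError on 4xx/5xx responses
    page = requests.get(url, timeout=timeout)
    page.raise_for_status()
    return count_paragraph_tags(page.content)


# Offline check of the counting step with made-up markup
offline_count = count_paragraph_tags(b"<p>a</p><div><p>b</p></div>")
print(offline_count)  # → 2
```

Calling `count_paragraph_tags_at('https://www.geeksforgeeks.org/')` would perform the live request; the number it returns depends on the page's content at that moment.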