📜  Count the Number of Paragraph Tags Using BeautifulSoup

📅  Last modified: 2022-05-13 01:55:31.958000             🧑  Author: Mango

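When extracting data from an HTML web page, you may want to know how many paragraph tags the document contains. This article shows how to count them with BeautifulSoup.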

Syntax:

print(len(soup.find_all("p")))

Approach:

Step 1: First, import the libraries BeautifulSoup and os.

from bs4 import BeautifulSoup as bs
import os

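Step 2: Now, get the directory of the Python file you are currently working in by removing the last segment (the file name) from its path.

Step 3: Then, open the HTML file you want to read from.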
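
Steps 2 and 3 could be sketched as follows, assuming the HTML file sits in the same directory as the script. The helper name `open_html_beside` is hypothetical and used only for illustration; `os.path.abspath`, `os.path.dirname` and `os.path.join` are standard library calls.

```python
import os


def open_html_beside(script_path, html_name):
    """Open html_name from the directory that contains script_path."""
    # Step 2: resolve the script's absolute path and drop its last
    # segment (the file name), leaving the containing directory.
    base = os.path.dirname(os.path.abspath(script_path))
    # Step 3: open the HTML file located in that directory.
    return open(os.path.join(base, html_name), encoding="utf-8")
```

For example, `html = open_html_beside("count_tags.py", "gfg.html")` would open `gfg.html` next to a script named `count_tags.py` (a made-up script name). The returned file object can be passed straight to BeautifulSoup in Step 4.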



Step 4: Next, parse the HTML file with BeautifulSoup.

soup = bs(html, 'html.parser')

Step 5: Next, print a label line if you want one.

print("Number of paragraph tags:")

Step 6: Finally, count and print the number of paragraph tags in the HTML document.

print(len(soup.find_all("p")))
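
The parsing and counting steps can also be wrapped in a small function. This is a minimal sketch, not part of the original article: the function name `count_paragraphs` and the inline sample markup are assumptions chosen for illustration.

```python
from bs4 import BeautifulSoup


def count_paragraphs(markup):
    """Return the number of <p> tags found in an HTML string."""
    # Parse with Python's built-in parser, as in Step 4.
    soup = BeautifulSoup(markup, "html.parser")
    # find_all searches the whole tree, so nested <p> tags count too.
    return len(soup.find_all("p"))


sample = "<p>one</p><div><p>two</p></div>"
print("Number of paragraph tags:")
print(count_paragraphs(sample))  # prints 2
```

Passing a string instead of a file object makes the count easy to try out without creating an HTML file first.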

Implementation:

Example 1

Let us consider a simple HTML web page that contains several paragraph tags.

HTML

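<!DOCTYPE html>
<html>

<head>
    <title>Geeks For Geeks</title>
</head>

<body>
    <div>
        <p>King</p>
        <p>Prince</p>
        <p>Queen</p>
    </div>
    <p>Princess</p>
</body>

</html>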

To find the number of paragraph tags in the HTML web page above, run the following code.



Python

# Python program to count the paragraph tags
# of a given HTML document with BeautifulSoup

# Import BeautifulSoup
from bs4 import BeautifulSoup as bs

# Open the HTML file and parse it with Beautiful Soup;
# the with block closes the file afterwards
with open('gfg.html') as html:
    soup = bs(html, 'html.parser')

# Print a label line
print("Number of paragraph tags:")

# Count and print the number of paragraph tags
print(len(soup.find_all("p")))

Output: the label line, followed by the number of <p> tags found in gfg.html.
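
Because `find_all` searches the whole tree by default, nested paragraphs are included in the count. The sketch below, which uses an inline sample string as a stand-in for an HTML file (an assumption made for illustration), shows how the count changes when the search is limited to one element or to direct children only.

```python
from bs4 import BeautifulSoup

# Inline markup standing in for an HTML file (illustrative sample).
markup = """
<body>
  <div>
    <p>King</p>
    <p>Prince</p>
    <p>Queen</p>
  </div>
  <p>Princess</p>
</body>
"""

soup = BeautifulSoup(markup, "html.parser")

# Every <p> in the document, at any nesting depth.
total = len(soup.find_all("p"))

# Only the <p> tags inside the first <div>.
in_div = len(soup.find("div").find_all("p"))

# Only the <p> tags that are direct children of <body>.
top_level = len(soup.body.find_all("p", recursive=False))

print(total, in_div, top_level)  # prints: 4 3 1
```

The `recursive=False` argument is useful when only top-level paragraphs matter, for example when nested paragraphs belong to embedded widgets.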

Example 2

In the program below, we find the number of paragraph tags on a particular website.

Python

# Python program to count the paragraph tags
# of a given website with BeautifulSoup

# Import BeautifulSoup and requests
from bs4 import BeautifulSoup as bs
import requests

# Assign URL
URL = 'https://www.geeksforgeeks.org/'

# Fetch the page content from the URL
page = requests.get(URL)

# Parse the page content with Beautiful Soup
soup = bs(page.content, 'html.parser')

# Print a label line
print("Number of paragraph tags:")

# Count and print the number of paragraph tags
print(len(soup.find_all("p")))

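Output: the label line, followed by the paragraph count. The number depends on the content of the page at the time of the request.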
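
A network request can hang or return an error page, in which case the count would be misleading. The following is a hedged sketch of a more defensive variant; the function name `count_paragraphs_at`, the `fetch` parameter and the 10-second default timeout are assumptions introduced here, while `timeout=` and `raise_for_status()` are existing parts of the requests API.

```python
import requests
from bs4 import BeautifulSoup


def count_paragraphs_at(url, fetch=requests.get, timeout=10):
    """Return the number of <p> tags on the page at url."""
    # Give up after `timeout` seconds instead of waiting indefinitely.
    page = fetch(url, timeout=timeout)
    # Raise requests.HTTPError if the server answered with 4xx or 5xx.
    page.raise_for_status()
    # Parse the response body and count the paragraph tags.
    soup = BeautifulSoup(page.content, "html.parser")
    return len(soup.find_all("p"))
```

Calling `count_paragraphs_at('https://www.geeksforgeeks.org/')` would perform the same count as the program above. The `fetch` parameter exists only so the function can be exercised with a stand-in response instead of a live request.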