How to handle duplicate attributes in BeautifulSoup?

Last modified: 2022-05-13 01:54:53.106000 | Author: Mango

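When scraping a page, have you ever had trouble handling results that come from the same tag with a duplicated attribute, for example several elements sharing one id? If so, this article walks through one way to deal with it.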

To collect every matching element into a list, use the following code.

Syntax:

items = soup.find_all("#Widget Name", {"id": "#Id name of widget in which you want to edit"})

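After that, remove the attributes from the output and print the specific items you want from the list.

Approach:

Import the modules.
Next, strip the last segment of the path by passing the name of the Python file you are currently working in; what remains is the directory that contains it.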

Syntax:

base = os.path.dirname(os.path.abspath("#Name of Python file in which you are currently working"))
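
The path step can be sketched on its own as follows. This is a minimal illustration, not the article's program: the helper name script_dir is hypothetical, and the sketch relies on the fact that os.path.abspath resolves a relative name against the current working directory, so the result is the script's folder only when the script is run from that folder.

```python
import os


def script_dir(filename):
    """Return the directory part of the absolute path built from filename."""
    # abspath joins a relative name onto the current working directory;
    # dirname then drops the last path segment (the file name itself).
    return os.path.dirname(os.path.abspath(filename))


# "gfg4.py" is the example file name used in this article
base = script_dir("gfg4.py")
print(base)
```

If the script may be launched from another directory, passing `__file__` instead of a hard-coded name is a common alternative.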

Then, open the HTML file you want to read values from.

Syntax:

html = open(os.path.join(base, "#Name of HTML file from which you wish to read value"))
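
A small sketch of the file-opening step, written with a with block so the file is closed automatically. The temporary directory and the one-line HTML content are assumptions made only so the example can run by itself; in practice base would be the directory computed in the previous step.

```python
import os
import tempfile

# Assumed setup for illustration: write a tiny HTML file to a temp directory
base = tempfile.mkdtemp()
path = os.path.join(base, "gfg.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<p id='vinayak'>King</p>")

# Open the file for reading; leaving the with block closes it
with open(path, encoding="utf-8") as html:
    content = html.read()

print(content)
```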

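Parse the HTML file with BeautifulSoup.
Also, choose a variable to hold every item that shares the same tag and attribute; find_all returns a list-like ResultSet, so no list needs to be created beforehand.
Next, find all the items that have the same tag and attribute.

Syntax:

items = soup.find_all("#Widget Name", {"id": "#Id name of widget in which you want to edit"})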
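
The search step can be tried on an inline string. The markup below is an assumed, shortened stand-in for the page used in this article; it shows how calling find_all on soup.div limits the search to the first div, while calling it on soup matches the whole document.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: three elements share the id "vinayak"
html = """
<div>
  <p id="vinayak">King</p>
  <p id="vinayak">Prince</p>
</div>
<p id="vinayak">Princess</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Search the whole document
everywhere = soup.find_all("p", {"id": "vinayak"})

# Search only inside the first <div>
inside_div = soup.div.find_all("p", {"id": "vinayak"})

print(len(everywhere))                       # 3
print(len(inside_div))                       # 2
print([p.get_text() for p in inside_div])    # ['King', 'Prince']
```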

After that, remove all attributes from the matched tags.
Finally, print the specific items you need from the list.
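
The last two steps might look like this on a small inline sample. The markup and the class attribute are assumptions for illustration. Assigning an empty dict to a tag's attrs drops every attribute, and get_text() is one way to print only the text when the surrounding tag is not wanted.

```python
from bs4 import BeautifulSoup

# Assumed sample markup with a duplicated id
soup = BeautifulSoup(
    '<div><p id="vinayak" class="royal">King</p><p id="vinayak">Prince</p></div>',
    "html.parser",
)
items = soup.div.find_all("p", {"id": "vinayak"})

# Drop every attribute from each matched tag
for tag in items:
    tag.attrs = {}

first = str(items[0])             # tag without attributes
second_text = items[1].get_text() # text content only

print(first)        # <p>King</p>
print(second_text)  # Prince
```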

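Webpage used:

HTML

<html>
<head>
  <title>Geeks For Geeks</title>
</head>
<body>
  <div>
    <p id="vinayak">King</p>
    <p id="vinayak">Prince</p>
    <p id="vinayak">Queen</p>
  </div>
  <p id="vinayak">Princess</p>
</body>
</html>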

Program:

Python

# Import BeautifulSoup and os
from bs4 import BeautifulSoup
import os

# Remove the last segment of the path to get the directory.
# Replace gfg4.py with the name of your own Python file.
base = os.path.dirname(os.path.abspath("gfg4.py"))

# Open the HTML file to read from and parse it with
# Beautiful Soup; the with block closes the file afterwards
with open(os.path.join(base, "gfg.html")) as html:
    soup = BeautifulSoup(html, "html.parser")

# Find all paragraphs inside the first div
# whose id is "vinayak"
items = soup.div.find_all("p", {"id": "vinayak"})

# Remove all attributes from the matched tags
for tag in items:
    tag.attrs = {}

# Print the second item (Prince)
print(items[1])

# Print the third item (Queen)
print(items[2])

Output:

<p>Prince</p>

<p>Queen</p>
