BeautifulSoup: Search by Text Inside a Tag

Last modified: 2022-05-13 01:54:49.126000 | Author: Mango


Prerequisite: BeautifulSoup

BeautifulSoup is a Python module for parsing HTML, widely used for web scraping. This article shows how to search for specific text inside a given tag.

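Approach

  • Import the modules
  • Provide the URL
  • Request the page
  • Specify the tag to search
  • To search by the text inside a tag, check a condition using the tag's string attribute.
  • The string attribute returns the text inside the tag.
  • While iterating over the tags, compare that text with the search text.
  • Print the matching tags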
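
The steps above can be sketched offline as follows. This is a minimal illustration, not the article's own example: the inline `SAMPLE_HTML` string and the helper name `find_tags_with_text` are hypothetical, introduced so the sketch runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used instead of a live page so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
  <p>The <strong>page table base register (PTBR)</strong> holds the base address.</p>
  <p>A <strong>TLB</strong> caches recent translations.</p>
</body></html>
"""


def find_tags_with_text(html, tag_name, text):
    """Return every tag_name tag whose .string equals text exactly."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag for tag in soup.find_all(tag_name) if tag.string == text]


matches = find_tags_with_text(SAMPLE_HTML, "strong", "page table base register (PTBR)")
for tag in matches:
    print(tag)
```

With a real page, the HTML would come from `requests.get(url).content` instead of the inline string.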

We will look at two ways to search for text inside a tag.

Method 1: Iteration

This method uses a for loop to search for the text.

Example



Python3
from bs4 import BeautifulSoup
import requests

# sample web page
sample_web_page = 'https://www.geeksforgeeks.org/caching-page-tables/'

# request the page with an HTTP GET
page = requests.get(sample_web_page)

# parse the response body with the built-in HTML parser
soup = BeautifulSoup(page.content, "html.parser")

# collect every <strong> tag on the page
child_soup = soup.find_all('strong')

text = 'page table base register (PTBR)'

# print each tag whose text is exactly equal to the given text
for i in child_soup:
    if i.string == text:
        print(i)



Output: each <strong> tag whose text equals the search text is printed, one per line.
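
One caveat with the `.string` comparison used above: `.string` is `None` when a tag has more than one child, so a tag with nested markup never matches an exact comparison. The sketch below shows this on a hypothetical HTML snippet (the markup is an assumption for illustration, not taken from the sample page).

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <strong> wraps a nested <em> tag.
html = "<p><strong>plain text</strong> <strong>mixed <em>nested</em> text</strong></p>"
soup = BeautifulSoup(html, "html.parser")

plain, mixed = soup.find_all("strong")

print(plain.string)      # the single text child of the first tag
print(mixed.string)      # None, because the tag has more than one child
print(mixed.get_text())  # concatenated text of all descendants
```

When tags may contain nested markup, comparing against `get_text()` is one way to avoid silently skipping them.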

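Method 2: Using lambda

This is a one-line alternative to the example above. Note that it matches tags whose text contains the search text, rather than requiring an exact match.

Example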

Python3

from bs4 import BeautifulSoup
import requests

# sample web page
sample_web_page = 'https://www.geeksforgeeks.org/caching-page-tables/'

# request the page with an HTTP GET
page = requests.get(sample_web_page)

# parse the response body with the built-in HTML parser
soup = BeautifulSoup(page.content, "html.parser")

text = 'CS Theory Course'

# find every <strong> tag whose text contains the given text
gfg = soup.find_all(lambda tag: tag.name == "strong" and text in tag.text)

print(gfg)

Output: a list of the matching <strong> tags is printed (an empty list if nothing matches).
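
The two methods are not strictly equivalent: the loop compares `.string` for equality, while the lambda tests whether the search text occurs anywhere in `tag.text`. A small sketch on hypothetical inline HTML (the list markup below is an assumption for illustration) makes the difference visible.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with one exact match, one partial match, and one non-<strong> tag.
html = """
<ul>
  <li><strong>CS Theory Course</strong></li>
  <li><strong>New: <em>CS Theory Course</em> for beginners</strong></li>
  <li><em>CS Theory Course</em></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")
text = "CS Theory Course"

# Method 1 style: exact comparison against .string
exact = [t for t in soup.find_all("strong") if t.string == text]

# Method 2 style: substring test inside a lambda filter
contains = soup.find_all(lambda tag: tag.name == "strong" and text in tag.text)

print(len(exact))     # 1
print(len(contains))  # 2
```

The `<em>` tag in the last list item is skipped by both approaches because its name is not `strong`.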