📜  Get Flight Status Using Python

📅  Last modified: 2022-05-13 01:54:59.501000             🧑  Author: Mango


Prerequisite: Implementing Web Scraping in Python with BeautifulSoup



  • bs4: Beautiful Soup (bs4) is a Python library for extracting data from HTML and XML files. It is not part of the Python standard library. To install it, run the following command in a terminal.
pip install bs4
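  • requests: Requests lets you send HTTP/1.1 requests very easily. It is not part of the Python standard library either. To install it, run the following command in a terminal.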
pip install requests
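
Once the packages are installed, a quick check like the following can confirm that Beautiful Soup imports and parses markup. This is only a minimal sketch: the HTML snippet and the class name `demo` are made up for illustration and do not come from any real page.

```python
from bs4 import BeautifulSoup

# Made-up HTML used only to check that the installation works.
sample_html = '<div class="demo">Hello, <b>bs4</b>!</div>'

soup = BeautifulSoup(sample_html, "html.parser")

# find() returns the first matching tag, or None if nothing matches.
tag = soup.find("div", class_="demo")
text = tag.get_text()
print(text)
```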


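  • Import the modules.
  • Create a function that fetches a URL.
  • Merge the flight information into the URL, pass the URL to the getdata() function, and parse the returned data as HTML.
  • Find the required tags in the HTML and iterate over the results.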
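
The steps above can be sketched on a small scale as follows. This is an illustrative outline under stated assumptions, not the article's full program: `fetch_html` and `extract_texts` are hypothetical helper names, the `timeout` value is arbitrary, and the sample markup and the class name `airport` are invented so that the parsing step can be tried without a network connection.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its body as text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.text


def extract_texts(html, tag_name, class_name):
    """Return the stripped text of every matching tag (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.find_all(tag_name, class_=class_name)
    return [tag.get_text(strip=True) for tag in matches]


# Invented markup standing in for a downloaded page.
sample_html = """
<div class="airport">Airport A</div>
<div class="other">ignored</div>
<div class="airport">Airport B</div>
"""

names = extract_texts(sample_html, "div", "airport")
print(names)
```

For a live page, the string returned by `fetch_html(url)` would be passed to `extract_texts` in place of `sample_html`.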


# import module
import requests
from bs4 import BeautifulSoup
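# Helper that downloads a page
# and returns it as parsed HTML
def get_html(Airline_code, Flight_number, Date, Month, Year):
    def getdata(url):
        r = requests.get(url)
        return r.text
    # Build the URL. The source truncates this
    # expression after Airline_code, so the way
    # Flight_number, Date, Month and Year are
    # appended is not shown here.
    url = "https://www.flightstats.com/v2/flight-tracker/" + Airline_code
    # pass the url
    # into getdata function
    htmldata = getdata(url)
    soup = BeautifulSoup(htmldata, 'html.parser')
    # return the parsed document to the caller
    return soup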
# Get Flight number
# from Html code
def flight_no(soup):
    Flight_no = ""
    # Find div tag with
    # unique class name
    for i in soup.find("div", class_="ticket__FlightNumberContainer-s1rrbl5o-4 hgbvHg"):
        Flight_no = Flight_no + (i.get_text()) + " "
    return (Flight_no)
# Get Airport name
# from HTML code
def airport(soup):
    Airport_name = []
    # Find div tag with
    # unique class name
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 CPamx"):
        Airport_name.append(i.get_text())
    return (Airport_name)
# get status
# from HTML code
def status(soup, Airport_list):
    Time_status = []
    Airport_List = []
    Status_str = []
    Gate = []
    Gate_no = []
    # Find the div tags that hold
    # the "Terminal" / "Gate" labels
    for data in soup.find_all("div", class_="ticket__TGBLabel-s1rrbl5o-15 gcbyEH text-helper__TextHelper-s8bko4a-0 dfeqpK"):
        Gate.append(data.get_text())
    # Find the div tags that hold
    # the terminal and gate values
    for data in soup.find_all("div", class_="ticket__TGBValue-s1rrbl5o-16 icyRae text-helper__TextHelper-s8bko4a-0 cCfBRT"):
        Gate_no.append(data.get_text())
    # Get the status labels and
    # times from the HTML code
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 bcmzUJ"):
        Status_str.append(i.get_text())
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 cCfBRT"):
        Time_status.append(i.get_text())
    # Print the scraped data: items 0-1
    # belong to the first airport and
    # items 2-3 to the second
    for item in range(4):
        if item == 0:
            print(Airport_list[0])
        if item == 2:
            print()
            print(Airport_list[1])
        print(Status_str[item] + " : " + Time_status[item])
        print(Gate[item] + " : " + Gate_no[item])
    for item in range(len(Gate)):
        print(Gate[item] + " : " + Gate_no[item])
# Driver code
if __name__ == '__main__':
    # Input data
    Airline_code = 'G8'
    Flight_number = '134'
    Date = '23'
    Month = '10'
    Year = '2020'
    # Call get_html with
    # the flight details
    soup = get_html(Airline_code, Flight_number, Date, Month, Year)
    print("Flight number : ", flight_no(soup))
    Airport_list = airport(soup)
    status(soup, Airport_list)


Flight number :  G8 134 GoAir 
Jay Prakash Narayan International Airport
Scheduled : 21:00 IST
Terminal : N/A
Estimated : 21:00 IST
Gate : N/A

Indira Gandhi International Airport
Scheduled : 22:40 IST
Terminal : T2
Estimated : 22:40 IST
Gate : 205
Terminal : N/A
Gate : N/A
Terminal : T2
Gate : 205
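
The class names used in the code above, such as `ticket__FlightNumberContainer-s1rrbl5o-4 hgbvHg`, look machine-generated, so they may well change when the site is rebuilt. One possible way to make the lookups less brittle is to match only the readable part of the class name with a CSS attribute selector. The sketch below is an illustration under assumptions: `select_by_class_fragment` is a hypothetical helper, and the markup is invented rather than taken from the FlightStats page.

```python
from bs4 import BeautifulSoup


def select_by_class_fragment(soup, fragment):
    """Return the text of every div whose class attribute contains `fragment`."""
    # [class*="..."] matches when the class attribute contains the substring.
    matches = soup.select(f'div[class*="{fragment}"]')
    return [tag.get_text(" ", strip=True) for tag in matches]


# Invented markup; the suffix after the readable name is arbitrary.
sample_html = """
<div class="ticket__FlightNumberContainer-abc123 xyz"><span>G8 134</span><span>GoAir</span></div>
<div class="ticket__TGBLabel-abc123 qrs">Gate</div>
<div class="ticket__TGBValue-abc123 tuv">205</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
flight = select_by_class_fragment(soup, "FlightNumberContainer")
labels = select_by_class_fragment(soup, "TGBLabel")
values = select_by_class_fragment(soup, "TGBValue")

# zip() pairs labels with values and stops at the shorter list,
# so a missing value cannot raise an IndexError.
pairs = list(zip(labels, values))
print(flight, pairs)
```

Pairing the lists with `zip()` instead of indexing with a fixed `range(4)` also keeps the printing step from failing when the page returns fewer entries than expected.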