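Get Flight Status using Python

Prerequisite: Implementing Web Scraping in Python with BeautifulSoup

In this article, we will write a Python script that fetches the status of a flight.

Modules needed:

bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built in with Python. To install it, type the following command in the terminal.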

pip install bs4
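
As a quick illustration of what Beautiful Soup does, the minimal sketch below parses a small HTML string and reads the text of two tags. The HTML fragment and its class names ("label", "value") are made up for this example and are not taken from any real page.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment; the class names are invented for illustration.
SAMPLE_HTML = """
<div class="flight">
  <div class="label">Scheduled</div>
  <div class="value">21:00 IST</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns the first matching tag, or None if nothing matches.
label = soup.find("div", class_="label")
value = soup.find("div", class_="value")

# get_text() returns the text inside the tag; strip=True trims whitespace.
label_text = label.get_text(strip=True)
value_text = value.get_text(strip=True)
print(label_text, ":", value_text)
```

Because find() returns None for a missing tag, it is worth checking the result before calling get_text() on it.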

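requests: Requests allows you to send HTTP/1.1 requests very easily. This module also does not come built in with Python. To install it, type the following command in the terminal.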

pip install requests
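
A page download can fail for many reasons (bad URL, network error, HTTP error status). The sketch below shows one possible way to wrap requests.get so that failures are reported instead of crashing the script. The function name fetch_html and the 10-second timeout are assumptions chosen for this example.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the response body as text, or None if the request fails.

    fetch_html is a hypothetical helper; the timeout value is arbitrary.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx responses into exceptions.
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        print(f"Request failed: {exc}")
        return None
    return response.text
```

Returning None lets the caller decide whether to retry, skip the flight, or stop.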

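Approach:

Import the modules.
Create a function that fetches a URL.
Combine the flight details into the URL, pass the URL to the getdata() function, and parse the returned data as HTML.
Find the required tags in the parsed HTML and iterate over the results.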
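
The URL-building step can also be sketched with the standard library's urllib.parse.urlencode, which assembles the query string. The helper name build_flight_url is hypothetical; the URL layout is the one used by the script in this article.

```python
from urllib.parse import urlencode

# Base path taken from the URL used in this article's script.
BASE_URL = "https://www.flightstats.com/v2/flight-tracker"


def build_flight_url(airline_code, flight_number, year, month, date):
    """Build the flight-tracker URL (hypothetical helper for illustration)."""
    # urlencode joins the key/value pairs with "&" and escapes them if needed.
    query = urlencode({"year": year, "month": month, "date": date})
    return f"{BASE_URL}/{airline_code}/{flight_number}?{query}"


url = build_flight_url("G8", "134", 2020, 10, 23)
print(url)
```

Unlike plain string concatenation, this version accepts integers as well as strings for the date parts.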

Implementation:


# import module
import requests
from bs4 import BeautifulSoup
 
# Helper that downloads the flight page
# and parses it into a BeautifulSoup object


def get_html(Airline_code, Flight_number, Date, Month, Year):
    def getdata(url):
        # timeout keeps the script from hanging on a stalled connection
        r = requests.get(url, timeout=10)
        return r.text

    # build the URL from the flight details
    url = "https://www.flightstats.com/v2/flight-tracker/"+Airline_code + \
        "/"+Flight_number+"?year="+Year+"&month="+Month+"&date="+Date

    # pass the url
    # into getdata function
    htmldata = getdata(url)
    soup = BeautifulSoup(htmldata, 'html.parser')
    return soup
 
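# Get flight number
# from HTML code


def flight_no(soup):
    Flight_no = ""

    # Find div tag with
    # unique class name
    container = soup.find("div", class_="ticket__FlightNumberContainer-s1rrbl5o-4 hgbvHg")

    # find() returns None when the tag is missing,
    # so return an empty string instead of failing
    if container is None:
        return Flight_no
    for i in container:
        Flight_no = Flight_no + (i.get_text()) + " "
    return Flight_no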
 
# Get Airport name
# from HTML code
 
 
def airport(soup):
    Airport_name = []
    # Find div tag with
    # unique class name
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 CPamx"):
        Airport_name.append(i.get_text())
    return (Airport_name)
 
# get status
# from HTML code
 
 
def status(soup, Airport_list):
    Time_status = []
    Status_str = []
    Gate = []
    Gate_no = []
 
    # Find div tag with
    # unique class name
    # to get Gate number
    for data in soup.find_all("div", class_="ticket__TGBLabel-s1rrbl5o-15 gcbyEH text-helper__TextHelper-s8bko4a-0 dfeqpK"):
        Gate.append(data.get_text())
    for data in soup.find_all("div", class_="ticket__TGBValue-s1rrbl5o-16 icyRae text-helper__TextHelper-s8bko4a-0 cCfBRT"):
        Gate_no.append(data.get_text())
 
    # Get status from
    # html code
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 bcmzUJ"):
        Status_str.append(i.get_text())
    for i in soup.find_all("div", class_="text-helper__TextHelper-s8bko4a-0 cCfBRT"):
        Time_status.append(i.get_text())
 
    # traverse the scraped data: each airport has
    # two status rows and two terminal/gate rows
    for item in range(4):
        if item == 0:
            print(Airport_list[0])
        if item == 2:
            print("")
            print(Airport_list[1])
        print(Status_str[item] + " : " + Time_status[item])
        print(Gate[item] + " : " + Gate_no[item])


# Driver code
if __name__ == '__main__':
    # Sample input
    Airline_code = 'G8'
    Flight_number = '134'
    Date = '23'
    Month = '10'
    Year = '2020'

    # Call get_html with the flight details,
    # then extract and print the results
    soup = get_html(Airline_code, Flight_number, Date, Month, Year)
    print("Flight number : ", flight_no(soup))
    Airport_list = airport(soup)
    status(soup, Airport_list)

Output:

Flight number :  G8 134 GoAir 
Jay Prakash Narayan International Airport
Scheduled : 21:00 IST
Terminal : N/A
Estimated : 21:00 IST
Gate : N/A

Indira Gandhi International Airport
Scheduled : 22:40 IST
Terminal : T2
Estimated : 22:40 IST
Gate : 205
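
The printing logic relies on fixed list positions, so it fails with an IndexError if the page returns fewer entries than expected. One possible alternative, sketched below, separates formatting from scraping and pairs the lists with zip, which stops at the shortest list. The function name format_status and its parameters are hypothetical, and it assumes two status rows and two terminal/gate rows per airport in page order. The sample values are the ones shown in the output above.

```python
def format_status(airports, statuses, times, labels, values):
    """Build report lines from already-scraped lists (hypothetical helper).

    Assumes each airport contributes two consecutive rows.
    """
    lines = []
    # zip stops at the shortest list, so missing entries cannot
    # raise an IndexError.
    rows = list(zip(statuses, times, labels, values))
    for index, (status, time, label, value) in enumerate(rows):
        # Every second row starts a new airport section.
        if index % 2 == 0:
            airport_index = index // 2
            if airport_index < len(airports):
                if lines:
                    lines.append("")
                lines.append(airports[airport_index])
        lines.append(f"{status} : {time}")
        lines.append(f"{label} : {value}")
    return lines


report = format_status(
    ["Jay Prakash Narayan International Airport",
     "Indira Gandhi International Airport"],
    ["Scheduled", "Estimated", "Scheduled", "Estimated"],
    ["21:00 IST", "21:00 IST", "22:40 IST", "22:40 IST"],
    ["Terminal", "Gate", "Terminal", "Gate"],
    ["N/A", "N/A", "T2", "205"],
)
print("\n".join(report))
```

Since the function only takes plain lists, it can be checked without any network access.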