Python | Sort URLs by top-level domain

Last modified: 2022-05-13 01:55:43.434000             Author: Mango

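Given a list of URLs, the task is to sort the URLs in the list by their top-level domain.
A top-level domain (TLD) is one of the domains at the highest level of the Internet's hierarchical Domain Name System. Examples: org, com, edu.
This is mainly useful when pages have been scraped and the collected URLs need to be ordered by top-level domain. It is common in open source projects and serves as a handy code snippet.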

Input :
url = ["https://www.isb.edu", "www.google.com", 
"http://cyware.com", "https://www.gst.in", 
"https://www.coursera.org", "https://www.create.net", 
"https://www.ontariocolleges.ca"]

Output :
['https://www.ontariocolleges.ca', 'www.google.com', 
'http://cyware.com', 'https://www.isb.edu', 
'https://www.gst.in', 'https://www.create.net',
 'https://www.coursera.org']

Explanation:
The TLDs of the URLs in the output list are in sorted order:
['.ca','.com','.com','.edu','.in','.net','.org']
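
To make the explanation concrete, the short sketch below takes the text after the last dot of each URL in the expected output and checks that the resulting list is already in ascending order. The names `expected` and `tlds` are illustrative only and do not appear in the original example.

```python
# Expected output from the example above.
expected = ['https://www.ontariocolleges.ca', 'www.google.com',
            'http://cyware.com', 'https://www.isb.edu',
            'https://www.gst.in', 'https://www.create.net',
            'https://www.coursera.org']

# Treat the text after the last dot of each URL as its TLD.
tlds = ['.' + url.rsplit('.', 1)[-1] for url in expected]

print(tlds)                  # ['.ca', '.com', '.com', '.edu', '.in', '.net', '.org']
print(tlds == sorted(tlds))  # True
```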

Below are some ways to accomplish the task.

Method 1: Using sorted
Split each URL on '.' and pass a key function to sorted that returns the last piece, which is the TLD.

# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Key function: return the text after the last dot, i.e. the TLD
def tld(url):
    return url.split('.')[-1]

# Sort with sorted, using tld as the key
Output = sorted(Input, key=tld)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :

['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
 'https://www.gst.in', 'https://www.coursera.org', 
'https://www.create.net', 'https://www.ontariocolleges.ca']

Sorted list according to TLD is :

['https://www.ontariocolleges.ca', 'www.google.com', 
'http://cyware.com', 'https://www.isb.edu',
 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
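
Splitting the whole string on '.' works for the bare host names used here, but it would pick the wrong piece if a URL carried a path or a port. For example, `https://example.org/index.html` would yield `html`. The following is a minimal sketch, not part of the original method, that uses the standard library's `urllib.parse.urlsplit` to isolate the host name first. The helper name `tld_from_url` and the sample URLs with paths and a port are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def tld_from_url(url):
    """Return the last label of the host name (hypothetical helper)."""
    # urlsplit only recognises the host when the URL has a scheme or a
    # leading "//", so prepend "//" for bare inputs like "www.google.com".
    if '//' not in url:
        url = '//' + url
    host = urlsplit(url).hostname or ''
    return host.rsplit('.', 1)[-1]


# Illustrative URLs with paths and a port; not from the original example.
urls = ["https://www.isb.edu/programs.html", "www.google.com",
        "http://cyware.com:8080/news", "https://www.gst.in"]

print(sorted(urls, key=tld_from_url))
```

This sketch still treats only the last label as the TLD, so multi-part suffixes such as `co.uk` are not handled.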

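Method 2: Using lambda
A concise way to sort the URLs in the list by top-level domain is to pass a lambda as the key, which avoids defining a separate function.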

# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Sort with sorted, using a lambda that returns the TLD as the key
Output = sorted(Input, key=lambda x: x.split('.')[-1])

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :

['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
 'https://www.gst.in', 'https://www.coursera.org', 
'https://www.create.net', 'https://www.ontariocolleges.ca']

Sorted list according to TLD is :

['https://www.ontariocolleges.ca', 'www.google.com', 
'http://cyware.com', 'https://www.isb.edu',
 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
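
Because `sorted` is stable, URLs that share a TLD keep their input order, which is why `www.google.com` stays ahead of `http://cyware.com` in the output above. If a fixed order within each TLD is wanted instead, one option is to return a tuple from the lambda. The sketch below is an illustrative variation. Using the full URL string as the secondary key is an assumption about the desired tie-break, and `urls` and `by_tld_then_url` are hypothetical names.

```python
urls = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
        "https://www.gst.in", "https://www.coursera.org",
        "https://www.create.net", "https://www.ontariocolleges.ca"]

# Primary key: the TLD; secondary key: the whole URL string.
by_tld_then_url = sorted(urls, key=lambda u: (u.rsplit('.', 1)[-1], u))

print(by_tld_then_url)
```

With this key, `http://cyware.com` comes before `www.google.com`, since the two share the TLD `com` and `'h'` sorts before `'w'`.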

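Method 3: Using reversed
Split each URL on '.', reverse the resulting pieces, and sort on that list. The URLs are then ordered by TLD first, and URLs with the same TLD are ordered by the next piece to the left.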

# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Key function: split on '.' and reverse, so the TLD comes first
def internal(string):
    return list(reversed(string.split('.')))

# Sort with sorted, using internal as the key
Output = sorted(Input, key=internal)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :

['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
 'https://www.gst.in', 'https://www.coursera.org', 
'https://www.create.net', 'https://www.ontariocolleges.ca']

Sorted list according to TLD is :

['https://www.ontariocolleges.ca', 'www.google.com', 
'http://cyware.com', 'https://www.isb.edu',
 'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
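
To see how this ordering goes beyond the TLD, it helps to look at the keys themselves. The sketch below repeats the `internal` function and prints the key for the two URLs that share the TLD `com`. Python compares lists element by element, so the second piece decides the tie. The names `key_google` and `key_cyware` are illustrative only.

```python
def internal(string):
    # Same key function as in Method 3.
    return list(reversed(string.split('.')))


key_google = internal("www.google.com")
key_cyware = internal("http://cyware.com")

print(key_google)               # ['com', 'google', 'www']
print(key_cyware)               # ['com', 'http://cyware']
print(key_google < key_cyware)  # True: 'google' sorts before 'http://cyware'
```

Note that the scheme stays attached to the left-most piece, so `http://cyware` is compared as a single string.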