📜  Performance of Binary Search vs. contains() on a Java List

📅  Last modified: 2021-09-15 01:04:00             🧑  Author: Mango

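Java provides two ways to find an element in a list: Collections.binarySearch() and contains(). In ArrayList and LinkedList, contains() is implemented on top of indexOf(). indexOf() walks the List linearly, comparing each element with the key until it finds a match, and returns the index of that match, or -1 if the element is not present; contains() then returns true or false accordingly. So the time complexity of contains() is O(n). The time complexity of Collections.binarySearch() is O(log2(n)), but it only works on a sorted list. If the list is not sorted, we need to sort it before calling Collections.binarySearch(), which takes O(nlog(n)) time.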
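To make the difference in return values concrete, here is a minimal sketch. The class name SearchReturnValues and the helpers linearContains() and sortedIndexOf() are hypothetical names chosen for illustration; only contains(), indexOf(), Collections.sort() and Collections.binarySearch() come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SearchReturnValues {

    // Linear search: works on a list in any order, returns a boolean.
    static boolean linearContains(List<Integer> list, int key) {
        return list.contains(key);
    }

    // Binary search: sorts a copy first, because binarySearch()
    // only gives meaningful results on an ascending list.
    // Returns the index in the SORTED copy, or a negative value
    // equal to -(insertion point) - 1 when the key is absent.
    static int sortedIndexOf(List<Integer> list, int key) {
        List<Integer> sorted = new ArrayList<>(list);
        Collections.sort(sorted);
        return Collections.binarySearch(sorted, key);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(50, 10, 40, 20, 30);

        System.out.println(linearContains(list, 40)); // true
        System.out.println(list.indexOf(40));         // 2
        System.out.println(list.indexOf(99));         // -1

        // Sorted copy is [10, 20, 30, 40, 50]
        System.out.println(sortedIndexOf(list, 40));  // 3
        System.out.println(sortedIndexOf(list, 35));  // -4
    }
}
```

Note that a negative result from binarySearch() still carries information: it encodes where the key would have to be inserted to keep the list sorted.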

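How to choose:

  • If the element to find is near the start of the list, contains() performs better, because it searches linearly from the start of the list.
  • If the list is sorted and relatively large, Collections.binarySearch() is faster, because it needs only O(log2(n)) time.
  • If the list is not sorted, contains() is better for a single lookup, because it needs only O(n) time. If the number of search queries is high, however, Collections.binarySearch() performs better overall: we sort the list only once, before the first search, which takes O(nlog(n)) time, and every search after that takes O(log(n)) time.
  • For lists with relatively few elements, contains() tends to be faster.
  • If we are using a LinkedList, which does not implement the RandomAccess interface and therefore cannot access an element in O(1) time, we should prefer contains() over Collections.binarySearch(), because Collections.binarySearch() then needs O(n) link traversals on top of its O(log(n)) comparisons.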
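The "sort once, search many times" rule and the RandomAccess point can be sketched together as follows. This is only an illustration under stated assumptions: the class RepeatedLookup and the method countHits() are hypothetical names, and copying the data into an ArrayList is one possible way to obtain O(1) index access, not the only one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RepeatedLookup {

    // Counts how many of the given keys occur in data.
    // The sort is paid once; each query is then a binary search.
    static int countHits(List<Integer> data, List<Integer> keys) {
        // Copy into an ArrayList so that index access is O(1)
        // even when data is a LinkedList; the caller's list is untouched.
        List<Integer> sorted = new ArrayList<>(data);
        Collections.sort(sorted);                 // once: O(n log n)

        int hits = 0;
        for (Integer key : keys) {
            // A non-negative result means the key was found.
            if (Collections.binarySearch(sorted, key) >= 0) { // O(log n) each
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>(Arrays.asList(7, 3, 9, 1, 5));
        List<Integer> array = new ArrayList<>(linked);

        System.out.println(linked instanceof RandomAccess); // false
        System.out.println(array instanceof RandomAccess);  // true

        // Keys 1, 3 and 9 are present; 2 is not.
        System.out.println(countHits(linked, Arrays.asList(1, 2, 3, 9))); // 3
    }
}
```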

We will now discuss three cases, depending on the size of the list and whether it is sorted:

  1. Small sorted list
  2. Large sorted list
  3. Unsorted list

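Case 1: A small sorted list

In the code below, we take a sorted list of 100 elements, 0 to 99, and search for the element 40. As noted above, on a small list contains() has the edge over Collections.binarySearch() when it comes to speed.

Example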

Java
// Java program to compare the performance
// of contains() and Collections.binarySearch()
// For a Small List (Case 1)
 
// Importing ArrayList and Collections classes
// from java.util package
import java.util.ArrayList;
import java.util.Collections;
 
// Main class
class GFG {
 
    // Main driver method
    public static void main(String[] args)
    {
 
        // Creating an ArrayList of Integer elements
        // (typed, instead of a raw ArrayList)
        ArrayList<Integer> arr = new ArrayList<>();
 
        // Iterating over object using for loop
        for (int i = 0; i < 100; i++) {
            arr.add(i);
        }
 
        // Calculating and printing the time taken
        // where we are finding 40
        // Using contains() method
        long start = System.nanoTime();
        arr.contains(40);
        long end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time taken to find 40 inside arr using contains() = "
            + (end - start) + " nano seconds");
 
        // Calculating and printing the time taken
        // to find 40
        // Using Collections.binarySearch() method
        start = System.nanoTime();
        Collections.binarySearch(arr, 40);
        end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time taken to find 40 inside arr using binarySearch() = "
            + (end - start) + " nano seconds");
    }
}



Output
Time taken to find 40 inside arr using contains() = 16286 nano seconds
Time taken to find 40 inside arr using binarySearch() = 87957 nano seconds
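Each figure above comes from a single timed call, so it can be dominated by one-time costs such as class loading and JIT compilation, and it may vary considerably between runs. A minimal sketch of a steadier measurement, assuming that a warm-up pass followed by an average over many calls is acceptable, is shown below. The class AveragedTiming, its helpers and the round count are hypothetical choices for illustration; no output is shown because the numbers depend on the machine and JVM.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class AveragedTiming {

    // Accumulates search results so the timed calls have an observable effect.
    static long sink = 0;

    // Builds the sorted list 0, 1, ..., n-1.
    static List<Integer> range(int n) {
        List<Integer> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Average nanoseconds per contains() call over the given number of rounds.
    static long avgContainsNanos(List<Integer> list, int key, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            if (list.contains(key)) {
                sink++;
            }
        }
        return (System.nanoTime() - start) / rounds;
    }

    // Average nanoseconds per binarySearch() call over the given number of rounds.
    static long avgBinarySearchNanos(List<Integer> list, int key, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += Collections.binarySearch(list, key);
        }
        return (System.nanoTime() - start) / rounds;
    }

    public static void main(String[] args) {
        List<Integer> list = range(100);
        int rounds = 100_000;

        // Warm-up pass: results are discarded.
        avgContainsNanos(list, 40, rounds);
        avgBinarySearchNanos(list, 40, rounds);

        System.out.println("contains():     ~"
            + avgContainsNanos(list, 40, rounds) + " ns per call");
        System.out.println("binarySearch(): ~"
            + avgBinarySearchNanos(list, 40, rounds) + " ns per call");
    }
}
```

For results that need to be trusted, a dedicated benchmarking harness is a better fit than hand-rolled loops like this one.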

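Case 2: A large sorted list

In the code below, we build a sorted ArrayList of 100000 elements, 0 to 99999, and look up the element 40000 using contains() and Collections.binarySearch(). Because the list is sorted and relatively large, Collections.binarySearch() outperforms contains().

Example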

Java

// Java program to compare the performance
// of contains() and Collections.binarySearch()
// for a large sorted ArrayList (Case 2)
 
// Importing ArrayList and Collections classes
// from java.util package
import java.util.ArrayList;
import java.util.Collections;
 
// Main class
public class GFG {
 
    // Main driver method
    public static void main(String[] args)
    {
 
        // Creating an ArrayList of Integer elements
        // (typed, instead of a raw ArrayList)
        ArrayList<Integer> arr = new ArrayList<>();
 
        // Iterating over the object
        for (int i = 0; i < 100000; i++) {
 
            // Adding elements using add() method
            arr.add(i);
        }
 
        // Calculating and printing the time taken
        // to find 40000 using contains()
        long start = System.nanoTime();
        arr.contains(40000);
        long end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time taken to find 40000 inside arr "
            + "using contains() = " + (end - start)
            + " nano seconds");
 
        // Calculating and printing the time taken
        // to find 40000 using Collections.binarySearch()
        start = System.nanoTime();
        Collections.binarySearch(arr, 40000);
        end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time taken to find 40000 inside arr "
            + "using binarySearch() = " + (end - start)
            + " nano seconds");
    }
}

Output
Time taken to find 40000 inside arr using contains() = 6651276 nano seconds
Time taken to find 40000 inside arr using binarySearch() = 85231 nano seconds

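Case 3: An unsorted list

In the code below, we create an unsorted ArrayList by storing 100000 random numbers in the range 0 to 99999 in it. Because the list is not sorted, contains() performs better for a single lookup: it needs only O(n) time, whereas with Collections.binarySearch() we first have to sort the list, which costs an extra O(nlog(n)) time, and then spend O(log2(n)) time searching for the element.

Example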

Java

// Java program to compare the performance of contains()
// with Collections.sort() followed by Collections.binarySearch()
// on an unsorted ArrayList (Case 3)
 
// Importing ArrayList and Collections class
// from java.util package
import java.util.ArrayList;
import java.util.Collections;
 
// Main class
class GFG {
 
    // Main driver method
    public static void main(String[] args)
    {
 
        // Creating an ArrayList of Integer elements; the type
        // parameter is needed for "int key = arr.get(30000)" to compile
        ArrayList<Integer> arr = new ArrayList<>();
 
        // Iterating between 0 to 100000 numbers
        for (int i = 0; i < 100000; i++) {
 
            // Generating random numbers as iterated
            // using random() function
            int rand = (int)(Math.random() * 100000);
 
            // Later storing them inside our list
            arr.add(rand);
        }
 
        // Setting the key to be found as the element
        // at index 30000 inside of unsorted list
        int key = arr.get(30000);
 
        // Calculating and printing the time taken
        // to find the key using contains()
        long start = System.nanoTime();
        arr.contains(key);
        long end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time takes to find " + key
            + " inside arr using contains() = "
            + (end - start) + " nano seconds");
 
        // Calculating and printing the time taken to
        // find the key using Collections.binarySearch()
        // after sorting the list using Collections.sort()
        // method
        start = System.nanoTime();
        Collections.sort(arr);
        Collections.binarySearch(arr, key);
        end = System.nanoTime();
 
        // Print statement
        System.out.println(
            "Time takes to find " + key
            + " inside arr using binarySearch() = "
            + (end - start) + " nano seconds");
    }
}

Output
Time takes to find 66181 inside arr using contains() = 8331486 nano seconds
Time takes to find 66181 inside arr using binarySearch() = 140322701 nano seconds
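
The result above covers a single lookup. As a rough worked estimate of when sorting first starts to pay off, assume k lookups on an unsorted list of n elements, count only comparisons, take the worst case of n comparisons per linear search, and ignore constant factors. These are simplifying assumptions, so the figure is an order of magnitude rather than a prediction of measured time.

```latex
\begin{aligned}
T_{\text{linear}}(k) &\approx k \cdot n \\
T_{\text{sorted}}(k) &\approx n \log_2 n + k \log_2 n \\
T_{\text{sorted}}(k) < T_{\text{linear}}(k)
  &\iff k > \frac{n \log_2 n}{n - \log_2 n} \\
n = 100000:\quad \log_2 n \approx 16.6
  &\implies k \gtrsim 17
\end{aligned}
```

Under these assumptions, sorting once and then using Collections.binarySearch() would overtake repeated contains() calls after roughly 17 lookups on a list of this size.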