Thread Pools in Java

Background

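Server programs such as database and web servers repeatedly execute requests from multiple clients and are designed to handle a large number of short tasks. One way to build a server application is to create a new thread each time a request arrives and service the request in that thread. Although this approach seems simple to implement, it has significant drawbacks. A server that creates a new thread for every request can spend more time and consume more system resources creating and destroying threads than it does processing actual requests.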

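Because active threads consume system resources, a JVM that creates too many threads at the same time can cause the system to run out of memory. This makes it necessary to limit the number of threads being created.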

What is a thread pool in Java?

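A thread pool reuses previously created threads to execute current tasks, which addresses both thread life-cycle overhead and resource thrashing. Because a thread already exists when a request arrives, the delay introduced by thread creation is eliminated, making the application more responsive.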

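Java provides the Executor framework, which is centered on the Executor interface, its sub-interface ExecutorService, and the class ThreadPoolExecutor, which implements both interfaces. With an executor, you only have to implement Runnable objects and send them to the executor for execution.
Executors let you take advantage of threading while focusing on the tasks you want the threads to perform rather than on thread mechanics.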

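To use a thread pool, we first create an ExecutorService object and pass a set of tasks to it. The ThreadPoolExecutor class allows the core and maximum pool sizes to be set. The runnables assigned to a particular thread are executed sequentially.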
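As a minimal sketch of setting the core and maximum pool sizes, the snippet below builds a ThreadPoolExecutor directly. The class name PoolSizeDemo, the helper newPool, the 30-second idle timeout, and the queue capacity of 10 are illustrative assumptions, not values taken from this article.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and numeric values are assumptions.
class PoolSizeDemo {
    // Builds a pool with the given core and maximum sizes.
    static ThreadPoolExecutor newPool(int core, int max) {
        return new ThreadPoolExecutor(
                core, max,
                30L, TimeUnit.SECONDS,           // idle timeout for threads above the core size
                new LinkedBlockingQueue<>(10));  // bounded task queue (capacity is an assumption)
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(2, 4);
        System.out.println("core = " + pool.getCorePoolSize());
        System.out.println("max = " + pool.getMaximumPoolSize());

        // Hand a task to the pool, then shut it down and wait for it to finish.
        pool.execute(() ->
                System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With a bounded queue like this one, the pool grows beyond the core size only once the queue is full. That is one reason the factory methods in Executors are often the simpler starting point.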

Figure: Thread pool initialized with size = 3 threads. Task queue = 5 Runnable objects.

Executor thread pool methods

Method                        Description
newFixedThreadPool(int)       Creates a thread pool with a fixed number of threads.
newCachedThreadPool()         Creates a thread pool that creates new threads as
                              needed but reuses previously constructed threads
                              when they are available.
newSingleThreadExecutor()     Creates an executor that uses a single worker thread.
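The three factory methods in the table can be tried side by side. The sketch below submits one task to each kind of pool and prints the name of the worker thread that ran it. FactoryMethodsDemo and runOn are hypothetical names introduced for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the original example.
class FactoryMethodsDemo {
    // Runs one task on the given pool, returns the worker thread's name,
    // and shuts the pool down afterwards.
    static String runOn(ExecutorService pool) throws Exception {
        Callable<String> whoAmI = () -> Thread.currentThread().getName();
        try {
            return pool.submit(whoAmI).get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("fixed:  " + runOn(Executors.newFixedThreadPool(3)));
        System.out.println("cached: " + runOn(Executors.newCachedThreadPool()));
        System.out.println("single: " + runOn(Executors.newSingleThreadExecutor()));
    }
}
```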

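In a fixed thread pool, if all threads are currently busy running tasks, pending tasks are placed in a queue and are executed when a thread becomes idle.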
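The queueing behavior can be observed directly. The sketch below assumes a ThreadPoolExecutor configured the way a fixed pool is (equal core and maximum sizes, unbounded queue) so that its queue can be inspected. QueueingDemo and queuedCount are hypothetical names. Every task blocks on a latch, so the workers stay busy while the queue is measured.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the original example.
class QueueingDemo {
    // Submits `tasks` blocking tasks to a fixed-size pool of `threads` workers
    // and returns how many of them are waiting in the queue.
    static int queuedCount(int threads, int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await();                     // keep the worker busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();  // preserve the interrupt flag
                }
            });
        }
        int queued = pool.getQueue().size();  // tasks that no worker has picked up yet
        release.countDown();                  // let every task finish
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return queued;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("queued = " + queuedCount(3, 5));  // prints "queued = 2"
    }
}
```

With 3 workers and 5 tasks, the first three tasks each go straight to a new worker and the remaining two wait in the queue.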

Thread pool example

The following tutorial walks through a basic example of a thread pool executor: a FixedThreadPool.

Steps to follow

1. Create a task (a Runnable object) to execute
2. Create an executor pool using Executors
3. Pass the tasks to the executor pool
4. Shut down the executor pool

// Java program to illustrate 
// ThreadPool
import java.text.SimpleDateFormat; 
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
  
// Task class to be executed (Step 1)
class Task implements Runnable   
{
    private final String name;
      
    public Task(String s)
    {
        name = s;
    }
      
    // Prints the task name with a timestamp, then sleeps for 1s.
    // The loop runs 6 times: one initialization line, then 5 execution lines.
    @Override
    public void run()
    {
        try
        {
            for (int i = 0; i<=5; i++)
            {
                if (i==0)
                {
                    Date d = new Date();
                    SimpleDateFormat ft = new SimpleDateFormat("hh:mm:ss");
                    System.out.println("Initialization Time for"
                            + " task name - "+ name +" = " +ft.format(d));   
                    //prints the initialization time for every task 
                }
                else
                {
                    Date d = new Date();
                    SimpleDateFormat ft = new SimpleDateFormat("hh:mm:ss");
                    System.out.println("Executing Time for task name - "+
                            name +" = " +ft.format(d));   
                    // prints the execution time for every task 
                }
                Thread.sleep(1000);
            }
            System.out.println(name+" complete");
        }
          
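        catch(InterruptedException e)
        {
            // Restore the interrupt flag so the pool can see the interruption
            Thread.currentThread().interrupt();
            e.printStackTrace();
        }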
    }
}
public class Test
{
     // Maximum number of threads in thread pool
    static final int MAX_T = 3;             
  
    public static void main(String[] args)
    {
        // creates five tasks
        Runnable r1 = new Task("task 1");
        Runnable r2 = new Task("task 2");
        Runnable r3 = new Task("task 3");
        Runnable r4 = new Task("task 4");
        Runnable r5 = new Task("task 5");      
          
        // creates a thread pool with MAX_T no. of 
        // threads as the fixed pool size(Step 2)
        ExecutorService pool = Executors.newFixedThreadPool(MAX_T);  
         
        // passes the Task objects to the pool to execute (Step 3)
        pool.execute(r1);
        pool.execute(r2);
        pool.execute(r3);
        pool.execute(r4);
        pool.execute(r5); 
          
        // pool shutdown ( Step 4)
        pool.shutdown();    
    }
}

Sample execution

Output:
Initialization Time for task name - task 2 = 02:32:56
Initialization Time for task name - task 1 = 02:32:56
Initialization Time for task name - task 3 = 02:32:56
Executing Time for task name - task 1 = 02:32:57
Executing Time for task name - task 2 = 02:32:57
Executing Time for task name - task 3 = 02:32:57
Executing Time for task name - task 1 = 02:32:58
Executing Time for task name - task 2 = 02:32:58
Executing Time for task name - task 3 = 02:32:58
Executing Time for task name - task 1 = 02:32:59
Executing Time for task name - task 2 = 02:32:59
Executing Time for task name - task 3 = 02:32:59
Executing Time for task name - task 1 = 02:33:00
Executing Time for task name - task 3 = 02:33:00
Executing Time for task name - task 2 = 02:33:00
Executing Time for task name - task 2 = 02:33:01
Executing Time for task name - task 1 = 02:33:01
Executing Time for task name - task 3 = 02:33:01
task 2 complete
task 1 complete
task 3 complete
Initialization Time for task name - task 5 = 02:33:02
Initialization Time for task name - task 4 = 02:33:02
Executing Time for task name - task 4 = 02:33:03
Executing Time for task name - task 5 = 02:33:03
Executing Time for task name - task 5 = 02:33:04
Executing Time for task name - task 4 = 02:33:04
Executing Time for task name - task 4 = 02:33:05
Executing Time for task name - task 5 = 02:33:05
Executing Time for task name - task 5 = 02:33:06
Executing Time for task name - task 4 = 02:33:06
Executing Time for task name - task 5 = 02:33:07
Executing Time for task name - task 4 = 02:33:07
task 5 complete
task 4 complete

As the execution shows, task 4 and task 5 are executed only when a thread in the pool becomes idle. Until then, the extra tasks are placed in a queue.

Figure: Thread pool executing the first three tasks

Figure: Thread pool executing tasks 4 and 5

One of the main advantages of this approach appears when you want to process 100 requests but do not want to create 100 threads for them, so as to reduce the load on the JVM. You can create a ThreadPool of 10 threads and submit all 100 requests to it. The ThreadPool creates at most 10 threads and processes up to 10 requests at a time. When any thread finishes its request, the ThreadPool internally assigns the 11th request to that thread, and it keeps doing the same for all the remaining requests.
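The 100-requests scenario can be sketched as follows. The class HundredRequests and its run helper are hypothetical names. Each request here only records which worker thread ran it, standing in for real request handling.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the original example.
class HundredRequests {
    // Runs `requests` short tasks on a fixed pool of `threads` workers and
    // returns {tasks completed, distinct worker threads used}.
    static int[] run(int threads, int requests) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> workers = ConcurrentHashMap.newKeySet();  // thread-safe set
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                workers.add(Thread.currentThread().getName());
                done.incrementAndGet();
            });
        }
        pool.shutdown();                             // accept no new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS); // wait for queued tasks to finish
        return new int[] { done.get(), workers.size() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(10, 100);
        System.out.println("requests completed = " + result[0]);
        System.out.println("distinct threads used = " + result[1]);
    }
}
```

All 100 requests complete, while the number of distinct worker threads never exceeds the pool size of 10.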

Risks of using thread pools

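Deadlock: Although deadlock can occur in any multithreaded program, thread pools introduce another case of it: all the executing threads are waiting for the results of tasks that are still waiting in the queue, and those tasks cannot run because no thread is available.
Thread leakage: Thread leakage occurs when a thread is removed from the pool to execute a task but is not returned to it when the task completes. For example, if the thread throws an exception and the pool class does not catch it, the thread simply exits, reducing the size of the thread pool by one. If this repeats many times, the pool eventually becomes empty and no threads are available to execute other requests.
Resource thrashing: If the thread pool is very large, time is wasted context switching between threads. As explained above, having more threads than the optimal number can cause starvation, which leads to resource thrashing.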
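The pool-specific deadlock can be reproduced in miniature. In the sketch below, a task running on a single-thread pool submits a second task to the same pool and waits for its result. The second task sits in the queue because the only worker is busy waiting for it. A timeout is used so that the demo terminates instead of hanging. StarvationDeadlockDemo and innerTaskStarves are hypothetical names, and the 500 ms timeout is an arbitrary choice.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; not part of the original example.
class StarvationDeadlockDemo {
    // Returns true if the inner task never got a thread to run on.
    static boolean innerTaskStarves() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The only worker is running this task, so the inner task stays queued.
                Future<String> inner = pool.submit(() -> "inner result");
                try {
                    inner.get(500, TimeUnit.MILLISECONDS);
                    return false;   // inner task ran (not expected here)
                } catch (TimeoutException e) {
                    return true;    // without the timeout this would wait forever
                }
            });
            return outer.get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdownNow();     // discard the inner task that is still queued
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("inner task starved: " + innerTaskStarves());
    }
}
```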

Important points

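Do not queue tasks that wait concurrently for the results of other tasks. This can lead to the deadlock described above.
Be careful when using threads for long-lived operations. A thread may end up waiting forever, which eventually leads to a resource leak.
The thread pool must be shut down explicitly at the end. If it is not, the program keeps running and never ends. Call shutdown() on the pool to end the executor. If you try to send another task to the executor after shutdown, it throws a RejectedExecutionException.
To tune a thread pool effectively, you need to understand the tasks it runs. If the tasks differ widely, it makes sense to use different thread pools for different types of tasks so that each can be tuned properly.
You can limit the maximum number of threads that run in the JVM, which reduces the chance of the JVM running out of memory.
If you would otherwise write a loop that creates a new thread for each piece of work, using a ThreadPool helps the work proceed faster, because the ThreadPool does not create new threads after it reaches its maximum limit.
After a thread finishes processing, the ThreadPool can use the same thread for another task, which saves the time and resources needed to create another thread.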
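The shutdown behavior described in the points above can be sketched as follows. ShutdownDemo and rejectsAfterShutdown are hypothetical names introduced for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the original example.
class ShutdownDemo {
    // Returns true if a task submitted after shutdown() is rejected.
    static boolean rejectsAfterShutdown() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("task submitted before shutdown"));

        pool.shutdown();  // already submitted tasks still run; new ones are refused

        boolean rejected = false;
        try {
            pool.execute(() -> System.out.println("this task is never run"));
        } catch (RejectedExecutionException e) {
            rejected = true;
        }

        // Wait for the first task to finish so the JVM can exit cleanly.
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejected after shutdown: " + rejectsAfterShutdown());
    }
}
```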

Tuning thread pools

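The optimal size of a thread pool depends on the number of available processors and the nature of the tasks. On an N-processor system with a queue of purely compute-bound tasks, a maximum pool size of N or N+1 achieves maximum efficiency. Tasks may also wait for I/O, however. In that case we consider the ratio of a request's waiting time (W) to its service time (S), which gives a maximum pool size of N*(1 + W/S) for maximum efficiency.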
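The sizing formula N*(1 + W/S) is easy to turn into a helper. The sketch below is only an illustration: PoolSizing and optimalPoolSize are hypothetical names, rounding to the nearest integer is an assumption, and the wait and service times in main are made-up sample values rather than measurements.

```java
// Hypothetical helper class; not part of the original example.
class PoolSizing {
    // Computes N * (1 + W/S), rounded to the nearest whole thread.
    // waitTime (W) and serviceTime (S) must use the same unit.
    static int optimalPoolSize(int processors, double waitTime, double serviceTime) {
        if (processors <= 0 || waitTime < 0 || serviceTime <= 0) {
            throw new IllegalArgumentException("invalid sizing inputs");
        }
        return (int) Math.round(processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        // Sample values (assumptions): 90 ms waiting on I/O, 10 ms of CPU work.
        System.out.println("processors = " + n);
        System.out.println("I/O-bound pool size = " + optimalPoolSize(n, 90, 10));
        // Compute-bound tasks do not wait, so W = 0 and the result is N.
        System.out.println("compute-bound pool size = " + optimalPoolSize(n, 0, 10));
    }
}
```

For example, with N = 4, W = 90 ms, and S = 10 ms, the formula gives 4 * (1 + 9) = 40 threads. With W = 0 it reduces to N.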

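A thread pool is a useful tool for organizing server applications. It is conceptually simple, but several issues, such as deadlock and resource thrashing, need attention when implementing and using one. The executor service makes it easier to implement.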