📜  Difference between Batch Processing and Stream Processing

📅  Last modified: 2021-09-28 09:26:29             🧑  Author: Mango

Prerequisite: Types of Operating Systems

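1. Batch processing:
Batch processing refers to processing a high volume of data in batches within a specific time span. It handles a large volume of data all at once and is used when the data size is known and finite. Latency is higher than in stream processing, because results become available only after the job completes. A batch processor typically works through the data in multiple passes, and dedicated staff are needed to handle problems when they occur. Batch processing fits cases where data is collected over time and similar records are grouped together into batches.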
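
The multi-pass behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `run_batch_job` and the timesheet records are hypothetical and do not come from any particular batch framework. The point is that the second pass needs a value (the batch-wide average) that is known only after the first pass has seen every record, which is possible because the input is finite.

```python
from collections import defaultdict


def run_batch_job(records):
    """Process a complete, finite batch of (employee, hours) records in two passes."""
    # Pass 1: group similar records together and total the hours per employee.
    totals = defaultdict(float)
    for employee, hours in records:
        totals[employee] += hours

    # Pass 2: compare each total with the batch-wide average, which is only
    # known once the first pass has consumed the whole batch.
    average = sum(totals.values()) / len(totals)
    return {
        employee: {"hours": total, "above_average": total > average}
        for employee, total in totals.items()
    }


# Hypothetical timesheet entries collected over a week, then processed at once.
week = [("ana", 8.0), ("bo", 6.0), ("ana", 9.0), ("bo", 5.0), ("cy", 12.0)]
report = run_batch_job(week)
```

No result is available until `run_batch_job` returns, which mirrors the point that a batch system responds only after the job completes.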

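Challenges with batch processing:

  • Debugging these systems is difficult, because fixing errors requires dedicated professionals.
  • Software and training are expensive up front, even just to understand batch scheduling, triggering, notification, and so on.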

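2. Stream processing:
Stream processing refers to processing a continuous stream of data immediately as it is produced. It analyzes streaming data in real time and is used when the data size is unknown in advance and the stream is continuous and unbounded. Processing takes a few seconds or milliseconds. For this to hold, the output rate has to keep pace with the input rate. A stream processor generally handles the data in a few passes. Stream processing fits cases where data arrives continuously and an immediate response is required.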
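
As a rough illustration of the description above, the following Python sketch processes each value in a single pass and emits an updated result as soon as the value arrives. The generator name `running_average` is hypothetical, and `itertools.count` merely stands in for a real unbounded source such as a sensor feed or a message queue.

```python
from itertools import count, islice


def running_average(stream):
    """Consume an iterator of numbers and yield the updated mean after each one."""
    seen = 0
    mean = 0.0
    for value in stream:
        seen += 1
        # Incremental update: no need to store or revisit earlier values.
        mean += (value - mean) / seen
        yield mean


# count(1) never ends, so the total data size is unknown; we only peek at
# the first three results here so that the example terminates.
first_three = list(islice(running_average(count(1)), 3))
```

Each result depends only on the values seen so far, so the consumer gets an answer immediately instead of waiting for the input to end.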

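Challenges with stream processing:

  • A mismatch between the data input rate and the output rate can cause problems.
  • The system has to cope with a huge volume of data while still responding immediately.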
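
One common way to cope with an input rate that exceeds the processing rate is a bounded buffer that makes the producer wait when the consumer falls behind. The sketch below uses only Python's standard `queue` and `threading` modules. The function `run_pipeline`, its parameters, and the doubling "work" step are hypothetical placeholders, and a bounded buffer is just one possible technique, not the one any specific stream platform necessarily uses.

```python
import queue
import threading
import time


def run_pipeline(events, buffer_size=2, work_seconds=0.001):
    """Feed events through a bounded buffer to a slower consumer."""
    buffer = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel that marks the end of the demo input
    processed = []

    def producer():
        for event in events:
            # put() blocks while the buffer is full, which slows the
            # producer down to the consumer's pace.
            buffer.put(event)
        buffer.put(done)

    def consumer():
        while True:
            event = buffer.get()
            if event is done:
                break
            time.sleep(work_seconds)  # simulate per-event processing cost
            processed.append(event * 2)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


results = run_pipeline(range(10))
```

Because the buffer never holds more than `buffer_size` events, memory use stays bounded even when the source is faster than the processing step. The trade-off is that the source is throttled instead.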

Differences between batch processing and stream processing:

| S.No. | Batch processing | Stream processing |
|-------|------------------|-------------------|
| 01. | Processes a high volume of data in batches within a specific time span. | Processes a continuous stream of data immediately as it is produced. |
| 02. | Processes a large volume of data all at once. | Analyzes streaming data in real time. |
| 04. | The data size is known and finite. | The data size is unknown in advance and potentially unbounded. |
| 05. | The data is processed in multiple passes. | The data is generally processed in a few passes. |
| 06. | Takes a longer time to process data. | Takes a few seconds or milliseconds to process data. |
| 07. | The input graph is static. | The input graph is dynamic. |
| 08. | The data is analyzed on a snapshot. | The data is analyzed continuously. |
| 09. | The response is provided after the job completes. | The response is provided immediately. |
| 10. | Examples are distributed programming platforms such as MapReduce, Spark, and GraphX. | Examples are programming platforms such as Spark Streaming and S4 (Simple Scalable Streaming System). |
| 11. | Used in payroll and billing systems, food processing systems, etc. | Used in stock markets, e-commerce transactions, social media, etc. |