📜  Difference Between Batch Processing and Stream Processing

📅  Last modified: 2021-08-25 17:43:14             🧑  Author: Mango

Prerequisite: Types of Operating Systems

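1. Batch Processing:
Batch processing refers to processing a high volume of data in batches within a specific time span. It processes a large volume of data all at once. Batch processing is used when the data size is known and finite. It takes comparatively longer to process the data, because the response is provided only after the job completes. It requires dedicated staff to handle issues. A batch processor processes data in multiple passes. When data is collected over time and similar data is batched or grouped together, batch processing is used.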
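The "multiple passes over a finite, known-size data set" idea can be illustrated with a minimal Python sketch. The function name `batch_mean_and_stddev` and the sample readings are assumptions made up for this illustration; they do not come from any particular batch framework.

```python
from math import sqrt

def batch_mean_and_stddev(records):
    """Process a finite, fully collected batch in two passes."""
    if not records:
        raise ValueError("batch must contain at least one record")
    # Pass 1: the whole batch is available, so its size is known up front.
    mean = sum(records) / len(records)
    # Pass 2: revisit every record, now using the result of pass 1.
    variance = sum((x - mean) ** 2 for x in records) / len(records)
    return mean, sqrt(variance)

# Hypothetical data collected over a period, e.g. one day's readings.
daily_readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, stddev = batch_mean_and_stddev(daily_readings)
print(mean, stddev)  # 5.0 2.0
```

Note that no result is available until both passes have finished, which is why a batch job responds only after completion.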

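Challenges of batch processing:

  • Debugging these systems is difficult, because it requires dedicated professionals to fix errors.
  • Software and training initially require a large expense just to understand batch scheduling, triggering, notification, and so on.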

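2. Stream Processing:
Stream processing refers to processing a continuous stream of data immediately as it is produced. It analyzes streaming data in real time. Stream processing is used when the data size is unknown in advance, unbounded, and continuous. It takes a few seconds or milliseconds to process data. In stream processing, the data output rate is expected to be as fast as the data input rate. A stream processor processes data in few passes. When the data stream is continuous and requires an immediate response, stream processing is used.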
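As a rough illustration of handling each item as soon as it arrives, the sketch below keeps a running mean in a single pass and emits an updated result after every event. The names `running_mean` and `sensor_feed` are hypothetical, and the short list stands in for a live source that in practice may never end.

```python
from typing import Iterable, Iterator

def running_mean(events: Iterable[float]) -> Iterator[float]:
    """Yield an updated mean immediately after each event, in one pass."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        # The result is emitted right away; nothing waits for the stream to end.
        yield total / count

def sensor_feed() -> Iterator[float]:
    """Hypothetical stand-in for a live source; a real one may be unbounded."""
    yield from [10.0, 20.0, 30.0, 40.0]

for latest_mean in running_mean(sensor_feed()):
    print(latest_mean)  # 10.0, then 15.0, then 20.0, then 25.0
```

Only a counter and a running total are kept in memory, so the same code works whether the stream delivers four events or never stops.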

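Challenges of stream processing:

  • A mismatch between the data input rate and the output rate can sometimes cause problems.
  • It must handle a huge amount of data while still responding immediately.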
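The first challenge can be made concrete with a small simulation. This is only a sketch under simplified assumptions: time advances in discrete ticks, rates are constant, and `simulate_backlog` is a name invented here, not part of any streaming platform.

```python
def simulate_backlog(input_rate, output_rate, ticks):
    """Return the number of unprocessed events after each tick."""
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += input_rate                 # events arriving this tick
        backlog -= min(backlog, output_rate)  # events processed this tick
        history.append(backlog)
    return history

# Input outpaces output: the backlog grows without bound.
print(simulate_backlog(input_rate=3, output_rate=2, ticks=5))  # [1, 2, 3, 4, 5]
# Output keeps up with input: the backlog stays empty.
print(simulate_backlog(input_rate=2, output_rate=3, ticks=5))  # [0, 0, 0, 0, 0]
```

When the backlog keeps growing, responses are delayed more and more, which conflicts with the immediate-response requirement of stream processing.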

Difference between batch processing and stream processing:

| S.No. | Batch Processing | Stream Processing |
|-------|------------------|-------------------|
| 01. | Batch processing refers to processing a high volume of data in batches within a specific time span. | Stream processing refers to processing a continuous stream of data immediately as it is produced. |
| 02. | Batch processing processes a large volume of data all at once. | Stream processing analyzes streaming data in real time. |
| 04. | In batch processing, the data size is known and finite. | In stream processing, the data size is unknown in advance and potentially infinite. |
| 05. | In batch processing, the data is processed in multiple passes. | In stream processing, the data is generally processed in few passes. |
| 06. | A batch processor takes a longer time to process data. | A stream processor takes a few seconds or milliseconds to process data. |
| 07. | In batch processing, the input graph is static. | In stream processing, the input graph is dynamic. |
| 08. | In batch processing, the data is analyzed on a snapshot. | In stream processing, the data is analyzed continuously. |
| 09. | In batch processing, the response is provided after job completion. | In stream processing, the response is provided immediately. |
| 10. | Examples are distributed programming platforms such as MapReduce, Spark, and GraphX. | Examples are programming platforms such as Spark Streaming and S4 (Simple Scalable Streaming System). |
| 11. | Batch processing is used in payroll and billing systems, food processing systems, etc. | Stream processing is used in stock markets, e-commerce transactions, social media, etc. |