Prerequisite: Types of Operating Systems
1. Batch Processing:
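Batch processing refers to processing a high volume of data in batches within a specific time span. It handles a large volume of data all at once. Batch processing is used when the data size is known and finite. It takes comparatively longer to process the data, and it requires dedicated staff to handle issues. A batch processor works through the data in multiple passes. When data is collected over time and similar records are batched or grouped together, batch processing is the appropriate choice.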
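The "multiple passes over a finite, known-size dataset" idea can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `batch_mean_and_variance` and the sample readings are assumptions made up for the example.

```python
from statistics import fmean


def batch_mean_and_variance(records):
    """Process a finite, fully collected batch in two passes."""
    if not records:
        raise ValueError("batch is empty")
    # Pass 1: the whole batch is available, so its mean can be computed up front.
    mean = fmean(records)
    # Pass 2: revisit every record, this time using the result of pass 1.
    variance = sum((x - mean) ** 2 for x in records) / len(records)
    return mean, variance


# Hypothetical batch: readings collected over a period and processed together.
daily_readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance = batch_mean_and_variance(daily_readings)
print(mean, variance)  # prints: 5.0 4.0
```

The second pass is only possible because the batch is complete and can be read again, and no result is available until the whole job has finished.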
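Challenges with Batch Processing:
- Debugging these systems is difficult, because it requires dedicated professionals to fix errors.
- Software and training carry a large initial expense just to understand batch scheduling, triggering, notification, and so on.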
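2. Stream Processing:
Stream processing refers to processing a continuous stream of data immediately as it is produced. It analyzes streaming data in real time. Stream processing is used when the data size is unknown, unbounded, and continuous. Processing takes a few seconds or milliseconds. In stream processing, the data output rate is as fast as the data input rate. A stream processor processes the data in a few passes. When the data stream is continuous and requires an immediate response, stream processing is the appropriate choice.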
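A rough sketch of processing each item as it arrives: the generator below keeps a running mean in a single pass, so a result is available after every event without knowing how many will come. The names `running_mean` and `sensor_events` are hypothetical, and the short list only stands in for an unbounded source.

```python
from typing import Iterable, Iterator, Tuple


def running_mean(stream: Iterable[float]) -> Iterator[Tuple[int, float]]:
    """Yield (events seen so far, current mean) after every incoming event."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        # Incremental update: no earlier events need to be stored or revisited.
        mean += (value - mean) / count
        yield count, mean


def sensor_events() -> Iterator[float]:
    """Hypothetical source; a real stream would keep producing values."""
    for value in [2.0, 4.0, 6.0, 8.0]:
        yield value


for count, mean in running_mean(sensor_events()):
    print(count, mean)
# prints:
# 1 2.0
# 2 3.0
# 3 4.0
# 4 5.0
```

Only a counter and the current mean are kept in memory, which is what lets the same loop run over a stream of unknown length.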
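Challenges with Stream Processing:
- A mismatch between the data input rate and the output rate can sometimes cause problems.
- The system has to cope with a huge amount of data while still responding immediately.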
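The input/output rate problem can be illustrated with a toy, deterministic simulation. Everything here is an assumption made for the sketch: the function name `simulate`, the fixed-size buffer, and the policy of dropping events when the buffer is full (real systems may block, sample, or scale out instead).

```python
from collections import deque


def simulate(ticks, arrivals_per_tick, served_per_tick, capacity):
    """Return (processed, dropped, still_buffered) after the given number of ticks."""
    buffer = deque()
    processed = 0
    dropped = 0
    for _ in range(ticks):
        # Producer side: new events arrive during this tick.
        for _ in range(arrivals_per_tick):
            if len(buffer) < capacity:
                buffer.append(object())
            else:
                dropped += 1  # assumed policy: drop when the buffer is full
        # Consumer side: the processor handles a limited number of events.
        for _ in range(min(served_per_tick, len(buffer))):
            buffer.popleft()
            processed += 1
    return processed, dropped, len(buffer)


# Input faster than output: the buffer fills up and events start to be lost.
print(simulate(ticks=10, arrivals_per_tick=3, served_per_tick=2, capacity=5))
# prints: (20, 7, 3)

# Matched rates: every event is processed and nothing queues up.
print(simulate(ticks=10, arrivals_per_tick=2, served_per_tick=2, capacity=5))
# prints: (20, 0, 0)
```

When the output rate keeps pace with the input rate, nothing accumulates; once input outruns output, either latency grows or data is lost.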
Difference between Batch Processing and Stream Processing:
S.No. | BATCH PROCESSING | STREAM PROCESSING |
---|---|---|
01. | Batch processing refers to processing a high volume of data in batches within a specific time span. | Stream processing refers to processing a continuous stream of data immediately as it is produced. |
02. | Batch processing processes a large volume of data all at once. | Stream processing analyzes streaming data in real time. |
03. | In batch processing, the data size is known and finite. | In stream processing, the data size is not known in advance and is potentially infinite. |
04. | In batch processing, the data is processed in multiple passes. | In stream processing, the data is generally processed in a few passes. |
05. | A batch processor takes a longer time to process data. | A stream processor takes a few seconds or milliseconds to process data. |
06. | In batch processing, the input graph is static. | In stream processing, the input graph is dynamic. |
07. | The data is analyzed on a snapshot. | The data is analyzed continuously as it arrives. |
08. | In batch processing, the response is provided after job completion. | In stream processing, the response is provided immediately. |
09. | Examples are distributed programming platforms such as MapReduce, Spark, and GraphX. | Examples are programming platforms such as Spark Streaming and S4 (Simple Scalable Streaming System). |
10. | Batch processing is used in payroll and billing systems, food processing systems, etc. | Stream processing is used in stock markets, e-commerce transactions, social media, etc. |