Apache Storm - Cluster Architecture

Last modified: 2020-12-02 05:54:49 | Author: Mango


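One of the main highlights of Apache Storm is that it is a fault-tolerant, fast distributed application with no "Single Point of Failure" (SPOF). We can install Apache Storm on as many systems as needed to increase the capacity of the application.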

Let us look at how the Apache Storm cluster is designed and at its internal architecture. The following diagram depicts the cluster design.

[Figure: Apache Storm cluster design; image alt text "ZooKeeper framework"]

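Apache Storm has two types of nodes: Nimbus (master node) and Supervisor (worker node). Nimbus is the central component of Apache Storm. The main job of Nimbus is to run the Storm topology. Nimbus analyzes the topology and gathers the tasks to be executed. It then distributes the tasks to the available supervisors.

A supervisor has one or more worker processes and delegates tasks to them. A worker process spawns as many executors as needed and runs the tasks. Apache Storm uses an internal distributed messaging system for communication between Nimbus and the supervisors.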
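The distribution step described above can be sketched as a toy model. This is not Storm's scheduler: the function `assign_tasks`, the component names, and the round-robin policy are all assumptions chosen only to illustrate "gather tasks, then hand them to available supervisors".

```python
from itertools import cycle


def assign_tasks(tasks, supervisors):
    """Distribute tasks across supervisors in round-robin order (toy model)."""
    if not supervisors:
        raise ValueError("no supervisors available")
    assignment = {name: [] for name in supervisors}
    # cycle() repeats the supervisor list so every task gets an owner.
    for task, name in zip(tasks, cycle(supervisors)):
        assignment[name].append(task)
    return assignment


# Hypothetical topology: component name -> number of tasks.
topology = {"sentence-spout": 2, "split-bolt": 3, "count-bolt": 1}

# "Analyze the topology and gather the tasks to be executed."
tasks = [f"{component}-{i}" for component, n in topology.items() for i in range(n)]

# "Distribute the tasks to the available supervisors."
plan = assign_tasks(tasks, ["supervisor-a", "supervisor-b"])
for supervisor, owned in plan.items():
    print(supervisor, owned)
```

Round-robin is used here only because it is easy to follow; a real scheduler would also account for free worker slots and resources on each node.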

Component: Description
Nimbus: Nimbus is the master node of a Storm cluster. All other nodes in the cluster are called worker nodes. The master node is responsible for distributing code across the worker nodes, assigning tasks to worker nodes, and monitoring failures.
Supervisor: The nodes that follow the instructions given by Nimbus are called supervisors. A supervisor has multiple worker processes and governs them to complete the tasks assigned by Nimbus.
Worker process: A worker process executes tasks related to a specific topology. It does not run a task by itself; instead, it creates executors and asks them to perform particular tasks. A worker process can have multiple executors.
Executor: An executor is a single thread spawned by a worker process. An executor runs one or more tasks, but only for a specific spout or bolt.
Task: A task performs the actual data processing. Each task is an instance of either a spout or a bolt.
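The worker process, executor, and task hierarchy described above can be modeled with a few small data classes. This is a minimal sketch under stated assumptions: the class names, the `make_executors` helper, and the even split of task IDs are illustrative choices, not Storm's API.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    component: str   # the single spout or bolt this thread serves
    task_ids: list   # one or more tasks run by this executor


@dataclass
class WorkerProcess:
    topology: str    # a worker process belongs to one topology
    executors: list = field(default_factory=list)


def make_executors(component, num_tasks, num_executors):
    """Split num_tasks task IDs as evenly as possible over num_executors."""
    if not 1 <= num_executors <= num_tasks:
        raise ValueError("need 1 <= num_executors <= num_tasks")
    base, extra = divmod(num_tasks, num_executors)
    executors, start = [], 0
    for i in range(num_executors):
        size = base + (1 if i < extra else 0)  # first `extra` executors get one more
        executors.append(Executor(component, list(range(start, start + size))))
        start += size
    return executors


# Hypothetical example: 5 tasks of "split-bolt" run by 2 executors in one worker.
worker = WorkerProcess("word-count", make_executors("split-bolt", 5, 2))
for executor in worker.executors:
    print(executor.component, executor.task_ids)
```

Each `Executor` holds tasks for exactly one component, which mirrors the rule that an executor only runs tasks of a specific spout or bolt.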
ZooKeeper framework

Apache ZooKeeper is a service used by a cluster (a group of nodes) to coordinate among its members and to maintain shared data with robust synchronization techniques. Nimbus is stateless, so it depends on ZooKeeper to monitor the status of the worker nodes.

ZooKeeper helps the supervisors interact with Nimbus. It is responsible for maintaining the state of Nimbus and the supervisors.
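One way to picture this coordination is a shared store that supervisors write heartbeats into and that Nimbus reads to spot failed nodes. The sketch below uses an in-memory dictionary as a stand-in for ZooKeeper; the `CoordinationStore` class, the `/supervisors/...` paths, and the timeout value are assumptions for illustration and do not reflect Storm's actual ZooKeeper layout.

```python
class CoordinationStore:
    """In-memory stand-in for ZooKeeper: maps a path to a value."""

    def __init__(self):
        self._nodes = {}

    def write(self, path, value):
        self._nodes[path] = value

    def read(self, path):
        return self._nodes.get(path)

    def children(self, prefix):
        return sorted(p for p in self._nodes if p.startswith(prefix + "/"))


def report_heartbeat(store, supervisor_id, now):
    """Called by a supervisor: record the time it was last alive."""
    store.write(f"/supervisors/{supervisor_id}", now)


def find_dead_supervisors(store, now, timeout):
    """Called by Nimbus: list supervisors whose heartbeat is older than timeout."""
    dead = []
    for path in store.children("/supervisors"):
        if now - store.read(path) > timeout:
            dead.append(path.rsplit("/", 1)[-1])
    return dead


store = CoordinationStore()
report_heartbeat(store, "supervisor-a", now=100)
report_heartbeat(store, "supervisor-b", now=70)

# At time 105 with a 30-second timeout, only supervisor-b is stale.
dead = find_dead_supervisors(store, now=105, timeout=30)
print(dead)
```

Timestamps are passed in explicitly so the example stays deterministic; Nimbus and the supervisors never talk to each other directly here, only through the store.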

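Storm is stateless in nature. Although statelessness has its own disadvantages, it actually helps Storm process real-time data in the best and quickest possible way.

Storm is not entirely stateless, though. It stores its state in Apache ZooKeeper. Since the state is available in Apache ZooKeeper, a failed Nimbus can be restarted and resume work from where it left off. Usually, a service monitoring tool such as monit monitors Nimbus and restarts it if there is any failure.

Apache Storm also has an advanced topology called the Trident topology, which supports state maintenance and provides a high-level API similar to Pig. We will discuss all these features in the coming chapters.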
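The restart behavior can be illustrated with a small model in which the master keeps nothing in its own memory and reads everything from an external store. The `StateStore` and `Nimbus` classes below are hypothetical stand-ins (a dictionary instead of ZooKeeper, a plain object instead of the real daemon), so treat this as a sketch of the idea rather than how Storm implements it.

```python
class StateStore:
    """In-memory stand-in for ZooKeeper; it outlives any single Nimbus instance."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class Nimbus:
    """Toy master that holds no state of its own; all state lives in the store."""

    def __init__(self, store):
        self.store = store

    def assign(self, task, supervisor):
        assignments = dict(self.store.get("assignments", {}))
        assignments[task] = supervisor
        self.store.put("assignments", assignments)

    def assignments(self):
        return dict(self.store.get("assignments", {}))


store = StateStore()

first = Nimbus(store)
first.assign("split-bolt-0", "supervisor-a")
del first  # simulate a Nimbus crash: the process is gone, the store is not

# A monitoring tool would restart Nimbus; the new instance sees the old state.
restarted = Nimbus(store)
restarted.assign("count-bolt-0", "supervisor-b")
print(restarted.assignments())
```

Because the assignment made before the crash was written to the store, the restarted instance continues from it instead of starting over.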