Elasticsearch - Modules

Last modified: 2020-10-30 14:21:45 | Author: Mango


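Elasticsearch is composed of a number of modules, each responsible for part of its functionality. These modules have two types of settings:

  • Static settings: these must be configured in the config file (elasticsearch.yml) before Elasticsearch starts. You need to update every relevant node in the cluster for a change to take effect.

  • Dynamic settings: these can be set on a running Elasticsearch cluster.

The following sections of this chapter discuss the different modules of Elasticsearch.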
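As a rough illustration of a dynamic setting, the sketch below builds the request that would change a setting on a running cluster. It assumes the cluster update settings endpoint (PUT /_cluster/settings) with its "persistent" and "transient" scopes, which this chapter does not name; the helper cluster_settings_request is hypothetical and only prepares the request, it does not send it.

```python
import json


def cluster_settings_request(settings, persistent=True):
    """Build the method, path, and JSON body for a cluster settings update.

    Assumption: the endpoint is PUT /_cluster/settings and the body groups
    settings under either "persistent" or "transient".
    """
    scope = "persistent" if persistent else "transient"
    body = json.dumps({scope: settings}, sort_keys=True)
    return "PUT", "/_cluster/settings", body


# Example: temporarily allow allocation of primary shards only.
method, path, body = cluster_settings_request(
    {"cluster.routing.allocation.enable": "primaries"}, persistent=False
)
print(method, path)
print(body)
```

A static setting has no equivalent request: it has to be written to elasticsearch.yml on each node before that node starts.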

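Cluster-Level Routing and Shard Allocation

Cluster-level settings decide how shards are allocated to different nodes and how they are reallocated to rebalance the cluster. The following settings control shard allocation.

Cluster-Level Shard Allocation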

| Setting | Possible value | Description |
| --- | --- | --- |
| cluster.routing.allocation.enable | all | The default value. Allows shard allocation for all kinds of shards. |
| | primaries | Allows shard allocation only for primary shards. |
| | new_primaries | Allows shard allocation only for primary shards of new indices. |
| | none | Does not allow any shard allocation. |
| cluster.routing.allocation.node_concurrent_recoveries | Numeric value (by default 2) | Restricts the number of concurrent shard recoveries. |
| cluster.routing.allocation.node_initial_primaries_recoveries | Numeric value (by default 4) | Restricts the number of parallel initial primary recoveries. |
| cluster.routing.allocation.same_shard.host | Boolean value (by default false) | Restricts the allocation of more than one replica of the same shard on the same physical node. |
| indices.recovery.concurrent_streams | Numeric value (by default 3) | Controls the number of open network streams per node during shard recovery from peer shards. |
| indices.recovery.concurrent_small_file_streams | Numeric value (by default 2) | Controls the number of open streams per node for small files (less than 5mb) during shard recovery. |
| cluster.routing.rebalance.enable | all | The default value. Allows balancing for all kinds of shards. |
| | primaries | Allows shard balancing only for primary shards. |
| | replicas | Allows shard balancing only for replica shards. |
| | none | Does not allow any kind of shard balancing. |
| cluster.routing.allocation.allow_rebalance | always | The default value. Always allows rebalancing. |
| | indices_primaries_active | Allows rebalancing when all primary shards in the cluster are allocated. |
| | indices_all_active | Allows rebalancing when all primary and replica shards are allocated. |
| cluster.routing.allocation.cluster_concurrent_rebalance | Numeric value (by default 2) | Restricts the number of concurrent shard rebalancing operations in the cluster. |
| cluster.routing.allocation.balance.shard | Float value (by default 0.45f) | Defines the weight factor for shards allocated on every node. |
| cluster.routing.allocation.balance.index | Float value (by default 0.55f) | Defines the weight factor for the number of shards per index allocated on a specific node. |
| cluster.routing.allocation.balance.threshold | Non-negative float value (by default 1.0f) | The minimum optimization value of operations that should be performed. |
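Several of the settings above accept only a fixed set of values. The sketch below shows one way to check a value before using it; the ALLOWED_VALUES table is copied from the table above, while the helper name allocation_setting is hypothetical and not part of any Elasticsearch client.

```python
# Allowed values, taken from the settings table above.
ALLOWED_VALUES = {
    "cluster.routing.allocation.enable": {
        "all", "primaries", "new_primaries", "none"},
    "cluster.routing.rebalance.enable": {
        "all", "primaries", "replicas", "none"},
    "cluster.routing.allocation.allow_rebalance": {
        "always", "indices_primaries_active", "indices_all_active"},
}


def allocation_setting(name, value):
    """Return {name: value} if the value is allowed, else raise."""
    allowed = ALLOWED_VALUES.get(name)
    if allowed is None:
        raise KeyError(f"no enumerated values known for {name}")
    if value not in allowed:
        raise ValueError(f"{value!r} is not valid for {name}")
    return {name: value}


# Example: stop all shard allocation, then allow it again.
disable = allocation_setting("cluster.routing.allocation.enable", "none")
enable = allocation_setting("cluster.routing.allocation.enable", "all")
print(disable, enable)
```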

Disk-Based Shard Allocation

| Setting | Possible value | Description |
| --- | --- | --- |
| cluster.routing.allocation.disk.threshold_enabled | Boolean value (by default true) | Enables or disables the disk allocation decider. |
| cluster.routing.allocation.disk.watermark.low | String value (by default 85%) | The low watermark for disk usage; once a node's disk usage passes this point, no further shards are allocated to it. |
| cluster.routing.allocation.disk.watermark.high | String value (by default 90%) | The high watermark for disk usage; once a node's disk usage passes this point, Elasticsearch tries to move shards from that node to another one. |
| cluster.info.update.interval | String value (by default 30s) | The interval between disk usage checks. |
| cluster.routing.allocation.disk.include_relocations | Boolean value (by default true) | Decides whether shards that are currently being relocated are counted when calculating disk usage. |
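To make the two watermarks concrete, here is a deliberately simplified model of the decision they describe. It is a sketch, not the actual disk allocation decider: the function name, the return labels, and the choice of inclusive comparisons are all assumptions made for illustration.

```python
def disk_decision(used_bytes, total_bytes, low_pct=85.0, high_pct=90.0):
    """Classify a node by disk usage against the low and high watermarks.

    Simplified model (assumption): at or above the high watermark shards
    should be moved away; at or above the low watermark the node receives
    no new shards; otherwise allocation is unrestricted.
    """
    used_pct = 100.0 * used_bytes / total_bytes
    if used_pct >= high_pct:
        return "move_shards_away"
    if used_pct >= low_pct:
        return "no_new_shards"
    return "ok"


for used in (80, 87, 95):
    print(used, disk_decision(used, 100))
```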

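Discovery

This module helps a cluster discover and maintain the state of all of its nodes. The cluster state changes whenever a node is added to or removed from the cluster. The cluster name setting is used to create a logical separation between different clusters. Several discovery modules are available, some of which use the APIs provided by cloud vendors:

  • Azure discovery
  • EC2 discovery
  • Google Compute Engine discovery
  • Zen discovery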

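Gateway

This module maintains the cluster state and the shard data across full cluster restarts. The following are the static settings of this module: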

| Setting | Possible value | Description |
| --- | --- | --- |
| gateway.expected_nodes | Numeric value (by default 0) | The number of nodes expected to be in the cluster before local shards are recovered. |
| gateway.expected_master_nodes | Numeric value (by default 0) | The number of master nodes expected to be in the cluster before recovery starts. |
| gateway.expected_data_nodes | Numeric value (by default 0) | The number of data nodes expected to be in the cluster before recovery starts. |
| gateway.recover_after_time | String value (by default 5m) | The time the recovery process waits before starting, regardless of the number of nodes that have joined the cluster. |
| gateway.recover_after_nodes, gateway.recover_after_master_nodes, gateway.recover_after_data_nodes | Numeric value | The minimum number of nodes (of any kind, master, or data respectively) that must have joined before recovery can start. |
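The interaction between the expected-node count, the recover-after count, and the wait time can be sketched as a small decision function. This is one simplified reading of how these gateway settings combine, not the module's actual code; the function and parameter names are hypothetical, and only the total node count is modelled (not the separate master and data variants).

```python
def should_start_recovery(joined_nodes, expected_nodes, recover_after_nodes,
                          waited_seconds, recover_after_seconds=300):
    """Decide whether local shard recovery may begin.

    Simplified model (assumption):
    1. Never recover before recover_after_nodes nodes have joined.
    2. Recover at once if no expected count is set (0) or it is reached.
    3. Otherwise recover only after waiting recover_after_seconds
       (300 s corresponds to the 5m default of gateway.recover_after_time).
    """
    if joined_nodes < recover_after_nodes:
        return False
    if expected_nodes == 0 or joined_nodes >= expected_nodes:
        return True
    return waited_seconds >= recover_after_seconds


# 3 of 5 expected nodes have joined; at least 3 are required.
print(should_start_recovery(3, 5, 3, waited_seconds=60))
print(should_start_recovery(3, 5, 3, waited_seconds=300))
```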

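HTTP

This module manages the communication between HTTP clients and the Elasticsearch APIs. It can be disabled by setting http.enabled to false.

The following settings, configured in elasticsearch.yml, control this module: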

| No. | Setting | Description |
| --- | --- | --- |
| 1 | http.port | The port used to access Elasticsearch; it ranges from 9200 to 9300. |
| 2 | http.publish_port | The port that HTTP clients use; also useful when the node is behind a firewall. |
| 3 | http.bind_host | The host address the HTTP service binds to. |
| 4 | http.publish_host | The host address published to HTTP clients. |
| 5 | http.max_content_length | The maximum size of the content of an HTTP request. Its default value is 100mb. |
| 6 | http.max_initial_line_length | The maximum size of the URL. Its default value is 4kb. |
| 7 | http.max_header_size | The maximum HTTP header size. Its default value is 8kb. |
| 8 | http.compression | Enables or disables support for compression. Its default value is false. |
| 9 | http.pipelining | Enables or disables HTTP pipelining. |
| 10 | http.pipelining.max_events | Restricts the number of events that can be queued before an HTTP request is closed. |
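The size limits above are written as strings such as 100mb or 4kb. The sketch below parses such values and checks a request against them; it assumes binary units (1kb = 1024 bytes), and the helpers parse_size and within_limit are hypothetical names used only for this example.

```python
import re

# Assumption: size units are binary multiples (1kb = 1024 bytes).
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}

# Default limits as listed in the HTTP settings table.
HTTP_LIMITS = {
    "http.max_content_length": "100mb",
    "http.max_initial_line_length": "4kb",
    "http.max_header_size": "8kb",
}


def parse_size(text):
    """Convert a size string such as '100mb' into a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*(b|kb|mb|gb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def within_limit(setting, size_bytes):
    """Return True if size_bytes does not exceed the limit for setting."""
    return size_bytes <= parse_size(HTTP_LIMITS[setting])


print(parse_size("4kb"))
print(within_limit("http.max_header_size", 9000))
```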

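Indices

This module maintains settings that apply globally to every index. The following settings are mainly related to memory usage.

Circuit Breaker

This is used to prevent operations from causing an OutOfMemoryError. Its settings mainly limit how much of the JVM heap operations may use. For example, the indices.breaker.total.limit setting defaults to 70% of the JVM heap.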
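The idea of a circuit breaker can be sketched as a counter that refuses a reservation once an estimated memory budget would be exceeded. This is a toy model under stated assumptions, not the Elasticsearch implementation: the class names are hypothetical, and the 70% default simply mirrors the limit quoted above.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would exceed the breaker limit."""


class CircuitBreaker:
    def __init__(self, heap_bytes, limit_percent=70):
        # Integer arithmetic avoids floating-point rounding of the limit.
        self.limit = heap_bytes * limit_percent // 100
        self.used = 0

    def reserve(self, nbytes):
        """Account for nbytes, or fail before the memory is actually used."""
        if self.used + nbytes > self.limit:
            raise CircuitBreakingError(
                f"would use {self.used + nbytes} bytes, limit is {self.limit}")
        self.used += nbytes

    def release(self, nbytes):
        """Give back a previous reservation."""
        self.used = max(0, self.used - nbytes)


breaker = CircuitBreaker(heap_bytes=1000)
breaker.reserve(600)
try:
    breaker.reserve(200)  # 800 > 700, so this is rejected
except CircuitBreakingError as err:
    print("rejected:", err)
```

The point of the design is that the request fails early with a recoverable error instead of exhausting the heap.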

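Fielddata Cache

This cache is used mainly when aggregating on a field. It is recommended to have enough memory to allocate to it. The amount of memory used for the fielddata cache can be controlled with the indices.fielddata.cache.size setting.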

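Node Query Cache

This memory is used to cache query results. The cache uses a least recently used (LRU) eviction policy. The indices.queries.cache.size setting controls the memory size of this cache.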
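A least recently used eviction policy can be illustrated with a small cache built on OrderedDict. This is a generic LRU sketch, not the node query cache itself; in particular it bounds the number of entries, whereas the real cache is sized in memory, and the class name LRUCache is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value and mark it as most recently used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("query_a", 1)
cache.put("query_b", 2)
cache.get("query_a")       # query_a is now the most recently used
cache.put("query_c", 3)    # evicts query_b
print(cache.get("query_b"))
```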

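Indexing Buffer

This buffer stores newly created documents in the index and flushes them when the buffer is full. Settings such as indices.memory.index_buffer_size control the amount of heap allocated to this buffer.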
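The flush-when-full behaviour can be sketched with a buffer that hands its contents to a callback once the next document would not fit. This is an illustrative model only; the class IndexingBuffer, its byte-based capacity, and the flush callback are assumptions and do not describe how Elasticsearch manages the buffer internally.

```python
class IndexingBuffer:
    def __init__(self, capacity_bytes, flush):
        self.capacity_bytes = capacity_bytes
        self.flush = flush          # callback receiving a list of documents
        self.docs = []
        self.size = 0

    def add(self, doc):
        """Buffer a document (bytes), flushing first if it would not fit."""
        if self.docs and self.size + len(doc) > self.capacity_bytes:
            self.flush_now()
        self.docs.append(doc)
        self.size += len(doc)

    def flush_now(self):
        """Hand all buffered documents to the callback and empty the buffer."""
        self.flush(self.docs)
        self.docs = []
        self.size = 0


flushed = []
buffer = IndexingBuffer(capacity_bytes=10, flush=flushed.append)
for doc in (b"aaaa", b"bbbb", b"cccc"):
    buffer.add(doc)
print(flushed, buffer.docs)
```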

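Shard Request Cache

This cache stores the local search results of each shard. It can be enabled when an index is created, and it can be disabled for a request with a URL parameter.

Disable cache: ?request_cache=false
Enable cache: "index.requests.cache.enable": true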
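The sketch below shows how the two switches might be used from a client: an index settings body that enables the cache, and a search URL that turns it off for a single request with request_cache=false. The base URL, the index name my_index, and the helper search_url are hypothetical; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Body for creating an index with the request cache enabled.
create_index_body = json.dumps(
    {"settings": {"index.requests.cache.enable": True}})


def search_url(base, index, request_cache=None):
    """Build a search URL, optionally overriding the cache per request."""
    url = f"{base.rstrip('/')}/{index}/_search"
    if request_cache is None:
        return url
    # The parameter takes the lowercase strings "true" or "false".
    flag = "true" if request_cache else "false"
    return url + "?" + urlencode({"request_cache": flag})


print(create_index_body)
print(search_url("http://localhost:9200", "my_index", request_cache=False))
```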

Indices Recovery

This controls the resources used during the recovery process. The settings are as follows:

| Setting | Default value |
| --- | --- |
| indices.recovery.concurrent_streams | 3 |
| indices.recovery.concurrent_small_file_streams | 2 |
| indices.recovery.file_chunk_size | 512kb |
| indices.recovery.translog_ops | 1000 |
| indices.recovery.translog_size | 512kb |
| indices.recovery.compress | true |
| indices.recovery.max_bytes_per_sec | 40mb |
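As a worked example of what these defaults imply, the sketch below estimates how many chunks a file is split into and the shortest time its transfer can take under the throughput cap. It assumes binary units (1kb = 1024 bytes) and ignores compression, concurrency, and network limits; the function recovery_estimate is hypothetical.

```python
KB = 1024
MB = 1024 * KB


def recovery_estimate(file_bytes, chunk_bytes=512 * KB,
                      max_bytes_per_sec=40 * MB):
    """Return (number of chunks, minimum transfer time in seconds)."""
    chunks = -(-file_bytes // chunk_bytes)  # integer ceiling division
    min_seconds = file_bytes / max_bytes_per_sec
    return chunks, min_seconds


chunks, seconds = recovery_estimate(80 * MB)
print(chunks, seconds)
```

With the defaults, an 80mb file is sent as 160 chunks of 512kb and cannot finish in less than 2 seconds.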

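TTL Interval

Time to live (TTL) defines how long a document is kept before it is deleted. The following dynamic settings control this process:

| Setting | Default value |
| --- | --- |
| indices.ttl.interval | 60s |
| indices.ttl.bulk_size | 1000 |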
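A purge pass of the kind these settings govern can be sketched as follows. This is a simplified model, assuming that indices.ttl.interval is how often such a pass runs and that indices.ttl.bulk_size caps how many deletions go into one batch; the function purge_expired and the in-memory dictionary of expiry times are hypothetical stand-ins for real documents.

```python
def purge_expired(expiry_by_id, now, bulk_size=1000):
    """Delete expired documents in batches and return the batches.

    expiry_by_id maps a document id to its expiry timestamp; a document
    is expired once its timestamp is less than or equal to now.
    """
    expired = sorted(doc_id for doc_id, expires_at in expiry_by_id.items()
                     if expires_at <= now)
    batches = [expired[i:i + bulk_size]
               for i in range(0, len(expired), bulk_size)]
    for batch in batches:
        for doc_id in batch:
            del expiry_by_id[doc_id]
    return batches


docs = {"a": 10, "b": 20, "c": 5}
print(purge_expired(docs, now=10, bulk_size=1))
print(docs)
```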

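Node

Each node can choose whether or not to be a data node. You can change this with the node.data setting; setting its value to false means the node is not a data node.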