Elasticsearch - Modules - Mango Docs

📅 Last modified: 2020-10-30 14:21:45 🧑 Author: Mango

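Elasticsearch is made up of a number of modules, each responsible for part of its functionality. These modules have two types of settings:

Static settings - These settings must be configured in the config file (elasticsearch.yml) before Elasticsearch is started. Every concerned node in the cluster has to be updated for a change to these settings to take effect.
Dynamic settings - These settings can be changed on a live Elasticsearch cluster.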
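
Dynamic settings are usually changed through the cluster update settings API (`PUT /_cluster/settings`). The following is a minimal sketch of how such a request could be built in Python. The address `http://localhost:9200` and the helper name `build_settings_request` are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_settings_request(base_url, settings, persistent=True):
    """Build (but do not send) a cluster update settings request.

    `persistent` settings survive a full cluster restart,
    `transient` ones do not.
    """
    scope = "persistent" if persistent else "transient"
    body = json.dumps({scope: settings}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local node; adjust to your own cluster address.
req = build_settings_request(
    "http://localhost:9200",
    {"cluster.routing.allocation.enable": "primaries"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it to a running node. Static settings cannot be changed this way and still require editing elasticsearch.yml and restarting the node.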

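The following sections of this chapter discuss the different modules of Elasticsearch.

Cluster-Level Routing and Shard Allocation

Cluster-level settings decide how shards are allocated to different nodes and how shards are reallocated to rebalance the cluster. The following settings control shard allocation.

Cluster-Level Shard Allocation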

Setting	Possible value	Description
cluster.routing.allocation.enable
	all	This default value allows shard allocation for all kinds of shards.
	primaries	This allows shard allocation only for primary shards.
	new_primaries	This allows shard allocation only for primary shards for new indices.
	none	This does not allow any shard allocations.
cluster.routing.allocation.node_concurrent_recoveries	Numeric value (by default 2)	This restricts the number of concurrent shard recoveries.
cluster.routing.allocation.node_initial_primaries_recoveries	Numeric value (by default 4)	This restricts the number of parallel initial primary recoveries.
cluster.routing.allocation.same_shard.host	Boolean value (by default false)	This restricts the allocation of more than one replica of the same shard on the same physical node.
indices.recovery.concurrent_streams	Numeric value (by default 3)	This controls the number of open network streams per node at the time of shard recovery from peer shards.
indices.recovery.concurrent_small_file_streams	Numeric value (by default 2)	This controls the number of open streams per node for small files (less than 5mb in size) at the time of shard recovery.
cluster.routing.rebalance.enable
	all	This default value allows balancing for all kinds of shards.
	primaries	This allows shard balancing only for primary shards.
	replicas	This allows shard balancing only for replica shards.
	none	This does not allow any kind of shard balancing.
cluster.routing.allocation.allow_rebalance
	always	This default value always allows rebalancing.
	indices_primaries_active	This allows rebalancing when all primary shards in the cluster are allocated.
	indices_all_active	This allows rebalancing when all the primary and replica shards are allocated.
cluster.routing.allocation.cluster_concurrent_rebalance	Numeric value (by default 2)	This restricts the number of concurrent shard rebalances in the cluster.
cluster.routing.allocation.balance.shard	Float value (by default 0.45f)	This defines the weight factor for shards allocated on every node.
cluster.routing.allocation.balance.index	Float value (by default 0.55f)	This defines the weight factor for the number of shards per index allocated on a specific node.
cluster.routing.allocation.balance.threshold	Non-negative float value (by default 1.0f)	This is the minimum optimization value of operations that should be performed; raising it makes the cluster less aggressive about rebalancing.
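
To make the four values of cluster.routing.allocation.enable concrete, the following sketch models which shards each value lets through. It is a simplified illustration of the table above, not Elasticsearch's actual allocation decider; the function name `may_allocate` and its parameters are hypothetical.

```python
def may_allocate(mode, is_primary, is_new_index):
    """Return True if a shard may be allocated under the given `enable` value.

    mode         -- value of cluster.routing.allocation.enable
    is_primary   -- True for a primary shard, False for a replica
    is_new_index -- True if the shard belongs to a newly created index
    """
    if mode == "all":
        return True
    if mode == "primaries":
        return is_primary
    if mode == "new_primaries":
        return is_primary and is_new_index
    if mode == "none":
        return False
    raise ValueError(f"unknown allocation mode: {mode!r}")


# Print a small truth table for a replica and for two kinds of primary.
for mode in ("all", "primaries", "new_primaries", "none"):
    replica = may_allocate(mode, is_primary=False, is_new_index=False)
    old_primary = may_allocate(mode, is_primary=True, is_new_index=False)
    new_primary = may_allocate(mode, is_primary=True, is_new_index=True)
    print(mode, replica, old_primary, new_primary)
```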

Disk-based Shard Allocation

Setting	Possible value	Description
cluster.routing.allocation.disk.threshold_enabled	Boolean value (by default true)	This enables or disables the disk allocation decider.
cluster.routing.allocation.disk.watermark.low	String value (by default 85%)	This denotes the maximum disk usage; beyond this point, no further shards are allocated to that node.
cluster.routing.allocation.disk.watermark.high	String value (by default 90%)	This denotes the maximum disk usage a node may reach; once a node's usage goes above this point, Elasticsearch attempts to relocate shards away from it to another node.
cluster.info.update.interval	String value (by default 30s)	This is the interval between disk usage checks.
cluster.routing.allocation.disk.include_relocations	Boolean value (by default true)	This decides whether shards that are currently being relocated are taken into account when calculating disk usage.
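
The two watermarks can be read as a three-way decision per node. The sketch below illustrates that reading with the default values from the table. It assumes that the low watermark blocks new shard allocations and that the high watermark triggers relocation away from the node; the helper names are hypothetical and the real decider considers more inputs.

```python
def parse_percent(value):
    """Convert a string such as '85%' to the float 85.0."""
    return float(value.strip().rstrip("%"))


def disk_decision(used_percent, low="85%", high="90%"):
    """Classify a node by its disk usage against the two watermarks."""
    if used_percent > parse_percent(high):
        return "relocate shards away"
    if used_percent > parse_percent(low):
        return "no new shards"
    return "ok"


for usage in (50.0, 87.5, 93.0):
    print(usage, "->", disk_decision(usage))
```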

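Discovery

This module helps a cluster discover and maintain the state of all the nodes in it. The state of the cluster changes when a node is added to or removed from it. The cluster name setting is used to create a logical separation between different clusters. Some modules help you use the APIs provided by cloud vendors, as listed below:

Azure discovery
EC2 discovery
Google Compute Engine discovery
Zen discovery

Gateway

This module maintains the cluster state and the shard data across full cluster restarts. The following are the static settings of this module: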

Setting	Possible value	Description
gateway.expected_nodes	Numeric value (by default 0)	The number of nodes expected to be in the cluster before the recovery of local shards starts.
gateway.expected_master_nodes	Numeric value (by default 0)	The number of master nodes expected to be in the cluster before recovery starts.
gateway.expected_data_nodes	Numeric value (by default 0)	The number of data nodes expected to be in the cluster before recovery starts.
gateway.recover_after_time	String value (by default 5m)	This specifies the time for which the recovery process will wait to start, regardless of the number of nodes that have joined the cluster.
gateway.recover_after_nodes	Numeric value	Recovery can start once this many nodes have joined the cluster.
gateway.recover_after_master_nodes	Numeric value	Recovery can start once this many master nodes have joined the cluster.
gateway.recover_after_data_nodes	Numeric value	Recovery can start once this many data nodes have joined the cluster.
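
How the expected_* settings, the recover_after_* settings and gateway.recover_after_time interact can be sketched as a single decision function. This is a simplified model based on the setting descriptions, assuming only total node counts are tracked; the function name and parameters are hypothetical and it is not the gateway's actual implementation.

```python
def should_start_recovery(joined, expected_nodes, recover_after_nodes,
                          waited_seconds, recover_after_time_seconds=300):
    """Decide whether local shard recovery may start.

    joined                     -- nodes that have joined so far
    expected_nodes             -- gateway.expected_nodes
    recover_after_nodes        -- gateway.recover_after_nodes
    waited_seconds             -- time spent waiting since the minimum was met
    recover_after_time_seconds -- gateway.recover_after_time (5m = 300s)
    """
    # The hard minimum has to be met before anything else is considered.
    if joined < recover_after_nodes:
        return False
    # All expected nodes are present: start immediately.
    if joined >= expected_nodes:
        return True
    # Otherwise start only once the waiting time has elapsed.
    return waited_seconds >= recover_after_time_seconds


print(should_start_recovery(joined=2, expected_nodes=5,
                            recover_after_nodes=3, waited_seconds=600))
print(should_start_recovery(joined=5, expected_nodes=5,
                            recover_after_nodes=3, waited_seconds=0))
print(should_start_recovery(joined=4, expected_nodes=5,
                            recover_after_nodes=3, waited_seconds=300))
```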

HTTP

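This module manages the communication between HTTP clients and the Elasticsearch APIs. It can be disabled by changing the value of http.enabled to false.

The following settings (configured in elasticsearch.yml) control this module: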

S.No	Setting & Description
1	http.port This is the port used to access Elasticsearch; it ranges from 9200-9300.
2	http.publish_port This is the port for HTTP clients; it is also useful when the node is behind a firewall.
3	http.bind_host This is the host address for the HTTP service.
4	http.publish_host This is the host address for HTTP clients.
5	http.max_content_length This is the maximum size of the content in an HTTP request. Its default value is 100mb.
6	http.max_initial_line_length This is the maximum size of the URL. Its default value is 4kb.
7	http.max_header_size This is the maximum HTTP header size. Its default value is 8kb.
8	http.compression This enables or disables support for compression. Its default value is false.
9	http.pipelining This enables or disables HTTP pipelining.
10	http.pipelining.max_events This restricts the number of events to be queued before an HTTP connection is closed.
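
As a rough illustration of the three size limits above, the sketch below checks a request against the default values from the table. The helper names are hypothetical, the header size is only approximated, and size units are assumed to be binary (1kb = 1024 bytes).

```python
# Binary multipliers for the size suffixes used in the settings table.
UNITS = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}

DEFAULT_LIMITS = {
    "http.max_content_length": "100mb",
    "http.max_initial_line_length": "4kb",
    "http.max_header_size": "8kb",
}


def parse_size(text):
    """Convert a size string such as '4kb' or '100mb' to a byte count."""
    text = text.strip().lower()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    if text.endswith("b"):
        return int(text[:-1])
    return int(text)


def exceeded_limits(initial_line, headers, body, limits=DEFAULT_LIMITS):
    """Return the names of the settings whose limit the request exceeds."""
    sizes = {
        "http.max_initial_line_length": len(initial_line.encode("utf-8")),
        # Approximation: "Name: value\r\n" per header.
        "http.max_header_size": sum(len(k) + len(v) + 4 for k, v in headers.items()),
        "http.max_content_length": len(body),
    }
    return [name for name, size in sizes.items() if size > parse_size(limits[name])]


print(exceeded_limits("GET /_search HTTP/1.1", {"Host": "localhost"}, b"{}"))
print(exceeded_limits("GET /" + "a" * 5000 + " HTTP/1.1", {"Host": "localhost"}, b""))
```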

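Indices

This module maintains settings that are set globally for every index. The following settings are mainly related to memory usage:

Circuit Breaker

This is used to prevent an operation from causing an OutOfMemoryError. The setting mainly limits how much of the JVM heap can be used. For example, the indices.breaker.total.limit setting defaults to 70% of the JVM heap.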
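
The idea behind a circuit breaker can be sketched in a few lines: track an estimate of the memory in use and reject a request before it pushes the total past the limit. The class below is an illustrative model with hypothetical names, using the 70% default mentioned above; it is not how Elasticsearch implements its breakers.

```python
class CircuitBreakingError(Exception):
    """Raised when a reservation would exceed the breaker limit."""


class CircuitBreaker:
    def __init__(self, heap_bytes, limit_percent=70):
        # Integer arithmetic avoids floating-point rounding on the limit.
        self.limit = heap_bytes * limit_percent // 100
        self.used = 0

    def reserve(self, num_bytes):
        """Account for num_bytes, or fail before memory is actually used."""
        if self.used + num_bytes > self.limit:
            raise CircuitBreakingError(
                f"would use {self.used + num_bytes} bytes, limit is {self.limit}"
            )
        self.used += num_bytes

    def release(self, num_bytes):
        """Return previously reserved bytes to the budget."""
        self.used = max(0, self.used - num_bytes)


breaker = CircuitBreaker(heap_bytes=1000)
breaker.reserve(600)
try:
    breaker.reserve(200)  # 800 > 700, so this is rejected
except CircuitBreakingError as err:
    print("rejected:", err)
breaker.release(100)
breaker.reserve(200)  # 500 + 200 = 700, which fits
print("used:", breaker.used)
```

Failing fast this way turns a potential node-wide out-of-memory crash into an error for a single request.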

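Fielddata Cache

This is used mainly when aggregating on a field. It is recommended to have enough memory available to allocate it. The amount of memory used for the fielddata cache can be controlled with the indices.fielddata.cache.size setting.

Node Query Cache

This memory is used for caching query results. This cache uses a Least Recently Used (LRU) eviction policy. The indices.queries.cache.size setting controls the memory size of this cache.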
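
The LRU eviction policy mentioned above can be illustrated with a small cache built on `collections.OrderedDict`. This is a generic sketch, not the node query cache itself: it bounds the cache by number of entries rather than by memory size, and the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # A hit makes the entry the most recently used one.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.put("query-1", "result-1")
cache.put("query-2", "result-2")
cache.get("query-1")              # query-1 is now the most recent
cache.put("query-3", "result-3")  # evicts query-2
print(cache.keys())
```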

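Indexing Buffer

This buffer stores newly created documents in the index and flushes them when the buffer is full. Settings such as indices.memory.index_buffer_size control the amount of heap allocated for this buffer.

Shard Request Cache

This cache is used to store the local search data for every shard. The cache can be enabled during index creation or disabled for a request by sending a URL parameter.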

Disable cache - ?request_cache=false
Enable cache - "index.requests.cache.enable": true
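
The two switches above appear in different places: the index setting goes into the body of the index creation request, while `request_cache` is a query-string parameter on a search. A minimal sketch of building both follows; the base URL `http://localhost:9200`, the index name `my-index` and the helper names are assumptions for illustration, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def create_index_body(enable_cache=True):
    """JSON body for index creation with the request cache switched on or off."""
    return json.dumps({"settings": {"index.requests.cache.enable": enable_cache}})


def search_url(base_url, index, request_cache):
    """Search URL that overrides the cache setting for a single request."""
    query = urlencode({"request_cache": str(request_cache).lower()})
    return f"{base_url}/{index}/_search?{query}"


print(create_index_body(True))
print(search_url("http://localhost:9200", "my-index", False))
```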

Indices Recovery

This controls the resources used during the recovery process. The following are the settings:

Setting	Default value
indices.recovery.concurrent_streams	3
indices.recovery.concurrent_small_file_streams	2
indices.recovery.file_chunk_size	512kb
indices.recovery.translog_ops	1000
indices.recovery.translog_size	512kb
indices.recovery.compress	true
indices.recovery.max_bytes_per_sec	40mb
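
As a worked example of what the throttle implies, the sketch below estimates a lower bound on the time needed to recover a shard. It assumes that indices.recovery.max_bytes_per_sec caps recovery throughput and that 40mb means 40 * 1024 * 1024 bytes; real recoveries are also affected by network, disk and translog replay, which this ignores. The function name is hypothetical.

```python
def min_recovery_seconds(shard_bytes, max_bytes_per_sec=40 * 1024 * 1024):
    """Lower bound on recovery time when throughput is capped."""
    return shard_bytes / max_bytes_per_sec


ten_gib = 10 * 1024 ** 3
print(min_recovery_seconds(ten_gib))  # 256.0 seconds at the default rate
```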

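TTL Interval

The Time to Live (TTL) interval defines the lifetime of a document, after which the document is deleted. The following are the dynamic settings that control this process: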

Setting	Default value
indices.ttl.interval	60s
indices.ttl.bulk_size	1000

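Node

Each node can choose whether or not to be a data node. You can change this property through the node.data setting. Setting the value to false means that the node is not a data node.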