Reading and writing ES from Java: Elasticsearch read/write consistency

ES write process

1. The client sends the document to a coordinating node, which routes it, based on the document's routing value, to the node that holds the target primary shard.

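2. The document is written to the primary shard. The setting index.write.wait_for_active_shards controls how many shard copies must be active before the write is allowed to proceed: with index.write.wait_for_active_shards=1 only the primary has to be active, while with index.write.wait_for_active_shards=all the primary and every replica must be active, otherwise the request waits until they are or times out. For details see the official documentation: #index-wait-for-active-shards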

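3. With index.write.wait_for_active_shards=1 the write does not wait for any replica shard to be active. The primary forwards the operation to its replica shards, and until a replica has applied and refreshed it, a read served by that replica may not return the latest data.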
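
The routing step and the wait_for_active_shards parameter can be sketched in Java. This is a simplified illustration, not Elasticsearch's implementation: Elasticsearch hashes the routing value (the document ID by default) with Murmur3, whereas String.hashCode() stands in for it here, and the names WriteSketch, shardFor and indexRequestLine are made up for this example. The index and document names are placeholders too.

```java
// Simplified sketch of write routing; class and method names are hypothetical.
final class WriteSketch {

    // Picks the primary shard for a routing value (the document ID by default).
    // Assumption: String.hashCode() stands in for the Murmur3 hash Elasticsearch uses.
    static int shardFor(String routing, int numberOfPrimaryShards) {
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    // Builds the request line of an index call that carries wait_for_active_shards.
    // activeShards is a number such as "1", or "all".
    static String indexRequestLine(String index, String id, String activeShards) {
        return "PUT /" + index + "/_doc/" + id + "?wait_for_active_shards=" + activeShards;
    }

    public static void main(String[] args) {
        int shard = shardFor("order-42", 5);
        System.out.println("document order-42 -> primary shard " + shard);
        System.out.println(indexRequestLine("orders", "order-42", "all"));
    }
}
```

Because the shard is derived from the routing value alone, the same document always lands on the same primary shard as long as the number of primary shards does not change.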

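ES read process

1. The client sends the request to any node, which becomes the coordinating node.

2. The coordinating node uses the query conditions to work out which shards hold the documents and the addresses of the nodes holding the primary and replica copies.

3. It picks one copy of each shard, typically round-robin, so the query may hit the primary or a replica. That node returns its data to the coordinating node.

4. The coordinating node returns the data to the client.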
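
Step 3 is why two identical reads can return different results. A toy round-robin selector in Java makes this concrete. It is a sketch under the assumption of plain round-robin as described above. CopySelector and the copy names are hypothetical, and Elasticsearch's actual selection logic is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of a coordinating node rotating reads across the copies of one shard.
final class CopySelector {
    private final List<String> copies;                       // e.g. "primary", "replica-1"
    private final AtomicInteger next = new AtomicInteger();  // counter shared by all requests

    CopySelector(List<String> copies) {
        this.copies = new ArrayList<>(copies);
    }

    // Returns the copy that serves the next read, cycling through all copies in order.
    String pick() {
        int i = Math.floorMod(next.getAndIncrement(), copies.size());
        return copies.get(i);
    }

    public static void main(String[] args) {
        CopySelector selector = new CopySelector(Arrays.asList("primary", "replica-1", "replica-2"));
        for (int request = 0; request < 4; request++) {
            System.out.println("read " + request + " served by " + selector.pick());
        }
    }
}
```

If one of the copies has not yet made a recent write visible, every read that the rotation sends to that copy misses the write, while reads sent to the other copies see it.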

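Because a replica shard may not yet reflect the latest data written to the primary shard, a client can read stale data. The following adjustments make reads return the latest data.

Option 1: route the search to primary shards only by adding preference=_primary (this preference value was removed in Elasticsearch 7.0):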

GET /_search?preference=_primary
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}

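Option 2: set a refresh policy when executing Update, Delete, Bulk and similar operations. The common choices are:

refresh=true: refresh the affected shards (primary and replicas) immediately after the operation, so the change is visible to search right away.

refresh=wait_for: do not force a refresh. The request waits until a refresh has made the change visible to search before it returns. Refreshes run every index.refresh_interval, which defaults to 1 second.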

enum RefreshPolicy implements Writeable {
    /**
     * Don't refresh after this request. The default.
     */
    NONE("false"),
    /**
     * Force a refresh as part of this request. This refresh policy does not scale for high indexing or search throughput but is useful
     * to present a consistent view for indices with very low traffic. And it is wonderful for tests!
     */
    IMMEDIATE("true"),
    /**
     * Leave this request open until a refresh has made the contents of this request visible to search. This refresh policy is
     * compatible with high indexing and search throughput but it causes the request to wait to reply until a refresh occurs.
     */
    WAIT_UNTIL("wait_for");
    // remainder of the enum is not shown in this excerpt
}
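
The three policies map to the refresh request parameter on write APIs. The Java sketch below shows where that parameter goes, without depending on any client library. RefreshParam, its nested Policy enum and the index and document names are hypothetical, and only the request lines are built, nothing is sent.

```java
// Hypothetical helper showing how the refresh parameter is attached to write requests.
final class RefreshParam {

    // The three values accepted by the refresh request parameter.
    enum Policy {
        NONE("false"), IMMEDIATE("true"), WAIT_UNTIL("wait_for");

        final String value;

        Policy(String value) {
            this.value = value;
        }
    }

    // Request line for deleting one document with the given refresh policy.
    static String deleteRequestLine(String index, String id, Policy policy) {
        return "DELETE /" + index + "/_doc/" + id + "?refresh=" + policy.value;
    }

    // Request line for a bulk call with the given refresh policy.
    static String bulkRequestLine(Policy policy) {
        return "POST /_bulk?refresh=" + policy.value;
    }

    public static void main(String[] args) {
        System.out.println(deleteRequestLine("orders", "order-42", Policy.WAIT_UNTIL));
        System.out.println(bulkRequestLine(Policy.IMMEDIATE));
    }
}
```

Of the two policies that make a write visible before the call returns, wait_for is usually the gentler one: it waits for a refresh rather than forcing one on every request.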

The low-level write path in ES

As shown in the figure:

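1. Incoming data is first written to an in-memory index buffer. Data in the index buffer cannot be searched, which is why ES cannot guarantee 100% real-time search. indices.memory.index_buffer_size sets the size of this buffer.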

2. Each write is also recorded in the translog. The translog logs every operation so that the data can be recovered if it is lost before being persisted to disk.

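3. When the index buffer fills up, or when the interval set by index.refresh_interval elapses, its contents are refreshed into the operating system's filesystem cache as a new segment and become searchable.

4. Data in the filesystem cache is in turn flushed to disk at intervals. Once it has been flushed to disk, the corresponding entries in the translog are deleted.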
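
The four steps above can be modelled with a small in-memory class. This is a toy sketch, not Elasticsearch code: ToyShard and its methods are hypothetical names, plain lists stand in for the index buffer, filesystem cache, disk and translog, and refresh and flush are triggered by hand instead of by timers or buffer size.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of the write path; every name here is hypothetical.
final class ToyShard {
    private final List<String> indexBuffer = new ArrayList<>(); // written, not yet searchable
    private final List<String> fsCache = new ArrayList<>();     // refreshed, searchable, not yet on disk
    private final List<String> disk = new ArrayList<>();        // flushed, durable
    private final List<String> translog = new ArrayList<>();    // operations kept for recovery

    // Accepts a write: it lands in the index buffer and is logged in the translog.
    void index(String doc) {
        indexBuffer.add(doc);
        translog.add(doc);
    }

    // Refresh: moves buffered documents to the filesystem cache, making them searchable.
    void refresh() {
        fsCache.addAll(indexBuffer);
        indexBuffer.clear();
    }

    // Flush: persists everything written so far to disk and empties the translog.
    void flush() {
        refresh();
        disk.addAll(fsCache);
        fsCache.clear();
        translog.clear();
    }

    // Search only sees documents that have been refreshed or flushed.
    boolean isSearchable(String doc) {
        return fsCache.contains(doc) || disk.contains(doc);
    }

    int translogSize() {
        return translog.size();
    }

    public static void main(String[] args) {
        ToyShard shard = new ToyShard();
        shard.index("doc-1");
        System.out.println("searchable after index:   " + shard.isSearchable("doc-1"));
        shard.refresh();
        System.out.println("searchable after refresh: " + shard.isSearchable("doc-1"));
        shard.flush();
        System.out.println("translog entries after flush: " + shard.translogSize());
    }
}
```

The gap between index() and refresh() in this model is the window in which a write has been accepted but a search cannot see it yet.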


About the author: 智云科技
