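ES info: CentOS 7.2, ES 6.2.2. MASTER: 16 cores / 128 GB physical machines × 3. DATA: 16 cores / 128 GB / 12 × 6 TB HDDs in RAID 0, × 40. The JVM heap is 30 GB. There is currently only one index: 10 TB per day (replicas included), 160 shards, 1 replica, retained for 7 days.

Fault description: one node (a random one each time) keeps dropping out of the cluster for no apparent reason. The node's load climbs above 100, even typing a command hangs, and only a forced reboot fixes it. Adding force_merge made it worse.

Background: this used to happen roughly once a month. A while ago I added a scheduled task that starts at 1 a.m. every day and runs force_merge=1. Each run takes about 12 hours, and it made the problem far more frequent: failures mostly occurred between 4 and 6 a.m., at least once and often several times a week. Cluster write throughput dropped severely, leaving the cluster half-unavailable (writes backed up, data no longer real-time). The problem escalated sharply as soon as the merge task was added. Several days of investigation found nothing, and since there is little demand for querying historical data, I turned the scheduled task off. The underlying problem has never been solved, though. I currently have two questions: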
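1. Why do nodes drop out of the cluster? It still happens from time to time, with no pattern in when it occurs.
2. After a single node drops out, why does the throughput of the whole cluster fall so sharply, from a write QPS of 700,000+ to around 300,000?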
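As a rough aid for reasoning about the figures above, the following sketch works out per-shard and per-node data volumes. It assumes that "160 shards" means 160 primary shards, that a new index is created each day, and that shards are spread evenly over the 40 data nodes; none of this is stated explicitly in the post, and the function name is hypothetical.

```python
def shard_sizing(daily_tb, primaries, replicas, data_nodes, retention_days):
    """Back-of-the-envelope sizing from the cluster figures in the post.

    daily_tb is the daily volume INCLUDING replicas, as stated in the post.
    Assumes an even shard distribution and decimal units (1 TB = 1000 GB).
    """
    copies = primaries * (1 + replicas)            # primary + replica shard copies per day
    gb_per_copy = daily_tb * 1000 / copies         # size of one shard copy
    copies_per_node = copies / data_nodes          # new shard copies per node per day
    daily_gb_per_node = daily_tb * 1000 / data_nodes
    retained_tb_per_node = daily_gb_per_node * retention_days / 1000
    return {
        "shard_copies_per_day": copies,
        "gb_per_shard_copy": gb_per_copy,
        "shard_copies_per_node_per_day": copies_per_node,
        "daily_gb_per_node": daily_gb_per_node,
        "retained_tb_per_node": retained_tb_per_node,
    }


if __name__ == "__main__":
    for key, value in shard_sizing(10, 160, 1, 40, 7).items():
        print(f"{key}: {value}")
```

Under these assumptions each data node takes on about 250 GB of new data per day. A force merge has to rewrite the segments it merges, so a nightly force merge would add read and write I/O on that scale on the same HDD array while indexing continues.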
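Hardware problems have been ruled out: a reboot restores the node, and colleagues from the systems team checked and found no hardware alarms. I would appreciate suggestions from anyone who has run into this or has ideas on how to investigate. Below is the information I collected. Information 1: at the time of the problem (22:52), /var/log/messages contained a large number of entries like the following: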
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994585] INFO: task java:104611 blocked for more than 120 seconds.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994630] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994682] java D ffffffffffffffff 0 104611 1 0x00000100
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994685] ffff88013f05fc20 0000000000000082 ffff88001e6ee780 ffff88013f05ffd8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994691] ffff88013f05ffd8 ffff88013f05ffd8 ffff88001e6ee780 ffff88013f05fd68
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994696] ffff88013f05fd70 7fffffffffffffff ffff88001e6ee780 ffffffffffffffff
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994701] Call Trace:
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994706] [<ffffffff8163a909>] schedule+0x29/0x70
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994710] [<ffffffff816385f9>] schedule_timeout+0x209/0x2d0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994715] [<ffffffff8101c829>] ? read_tsc+0x9/0x10
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994720] [<ffffffff810d814c>] ? ktime_get_ts64+0x4c/0xf0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994723] [<ffffffff8112882f>] ? delayacct_end+0x8f/0xb0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994728] [<ffffffff8163acd6>] wait_for_completion+0x116/0x170
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994733] [<ffffffff810b8c10>] ? wake_up_state+0x20/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994737] [<ffffffff8109e7ac>] flush_work+0xfc/0x1c0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994741] [<ffffffff8109a7e0>] ? move_linked_works+0x90/0x90
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994768] [<ffffffffa03a143a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994793] [<ffffffffa039fa7e>] _xfs_log_force_lsn+0x6e/0x2f0 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994798] [<ffffffff81639b12>] ? down_read+0x12/0x30
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994823] [<ffffffffa03824d0>] xfs_file_fsync+0x1b0/0x200 [xfs]
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994829] [<ffffffff8120f975>] do_fsync+0x65/0xa0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994834] [<ffffffff8120fc63>] SyS_fdatasync+0x13/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994839] [<ffffffff81645b12>] tracesys+0xdd/0xe2
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994854] INFO: task java:67513 blocked for more than 120 seconds.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994898] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994951] java D ffff88001f8128a8 0 67513 1 0x00000100
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994954] ffff880054a63c20 0000000000000082 ffff880116971700 ffff880054a63fd8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994959] ffff880054a63fd8 ffff880054a63fd8 ffff880116971700 ffff88001f8128a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994964] ffff88001f8128a4 ffff880116971700 00000000ffffffff ffff88001f8128a8
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994970] Call Trace:
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994975] [<ffffffff8163b9e9>] schedule_preempt_disabled+0x29/0x70
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994979] [<ffffffff816396e5>] __mutex_lock_slowpath+0xc5/0x1c0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994983] [<ffffffff811e8a87>] ? unlazy_walk+0x87/0x140
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994987] [<ffffffff81638b4f>] mutex_lock+0x1f/0x2f
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994992] [<ffffffff8163251e>] lookup_slow+0x33/0xa7
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.994996] [<ffffffff811edf13>] path_lookupat+0x773/0x7a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995001] [<ffffffff811c0e65>] ? kmem_cache_alloc+0x35/0x1d0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995005] [<ffffffff811eec0f>] ? getname_flags+0x4f/0x1a0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995008] [<ffffffff811edf6b>] filename_lookup+0x2b/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995013] [<ffffffff811efd37>] user_path_at_empty+0x67/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995018] [<ffffffff81101072>] ? from_kgid_munged+0x12/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995023] [<ffffffff811e3aef>] ? cp_new_stat+0x14f/0x180
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995027] [<ffffffff811efda1>] user_path_at+0x11/0x20
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995032] [<ffffffff811e35e3>] vfs_fstatat+0x63/0xc0
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995036] [<ffffffff811e3bb1>] SYSC_newlstat+0x31/0x60
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995042] [<ffffffff810222fd>] ? syscall_trace_enter+0x17d/0x220
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995047] [<ffffffff81645ab3>] ? tracesys+0x7e/0xe2
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995052] [<ffffffff811e3e3e>] SyS_newlstat+0xe/0x10
Aug 4 22:52:54 tjtx135-6-226 kernel: [4981123.995056] [<ffffffff81645b12>] tracesys+0xdd/0xe2
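To see at a glance what the blocked tasks were waiting on, a small parser can group each hung-task report with its call-trace frames. This is only a sketch: the function names are hypothetical, and the regular expressions are written against the log format shown above rather than against any kernel specification.

```python
import re

# "INFO: task java:104611 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# "[<ffffffff8163a909>] schedule+0x29/0x70"; a leading "? " marks an unreliable frame
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x")


def parse_hung_tasks(lines):
    """Return one dict per hung-task report with its reliable stack frames."""
    reports = []
    current = None
    for line in lines:
        hung = HUNG_RE.search(line)
        if hung:
            current = {
                "comm": hung.group(1),
                "pid": int(hung.group(2)),
                "seconds": int(hung.group(3)),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Skip frames prefixed with "?", which the kernel could not verify.
        if frame and current is not None and frame.group(1) is None:
            current["frames"].append(frame.group(2))
    return reports


def blocked_in_filesystem(report):
    """True if the trace passes through an fsync or an XFS function."""
    return any("fsync" in f or f.startswith(("xfs_", "xlog_", "_xfs_"))
               for f in report["frames"])


if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as fh:
        for rep in parse_hung_tasks(fh):
            print(rep["comm"], rep["pid"], blocked_in_filesystem(rep), rep["frames"][-3:])
```

Applied to the excerpt above, the first report (pid 104611) is stuck in `xfs_file_fsync` via `SyS_fdatasync`, while the second (pid 67513) is waiting on a mutex during a path lookup for an `lstat` call.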

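Based on the errors above, I searched and found the article "How to fix hung_task_timeout_secs and blocked for more than 120 seconds" (quoted below). I changed the parameters it recommends, but the problem still occurs.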


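How to fix hung_task_timeout_secs and "blocked for more than 120 seconds" on Linux

The Linux system becomes unresponsive, and /var/log/messages contains a large number of '"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.' and "blocked for more than 120 seconds" errors.

Cause:

By default, Linux uses up to 40% of available memory as file system cache. Once this threshold is exceeded, the file system writes all of the cached data to disk, and subsequent I/O requests become synchronous. Flushing the cache to disk is subject to a default timeout of 120 seconds. The problem above occurs because the I/O subsystem is not fast enough to write all of the cached data to disk within 120 seconds. As the I/O system responds slowly, more and more requests pile up until all system memory is in use and the system stops responding.

Solution:

Tune the vm.dirty_ratio and vm.dirty_background_ratio parameters to suit the application. For example, the following settings are recommended: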
# sysctl -w vm.dirty_ratio=10
# sysctl -w vm.dirty_background_ratio=5
# sysctl -p

To make the change permanent, edit /etc/sysctl.conf and add the following two lines:
# vi /etc/sysctl.conf
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
The settings take effect after a reboot.
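To get a feel for what these percentages mean on a 128 GB data node, the sketch below converts the two ratios into approximate sizes and reads the currently active values from /proc/sys/vm. The function names are hypothetical. The kernel applies the ratios to available memory (free plus reclaimable pages), not to total RAM, so figures computed from total RAM are an upper bound.

```python
from pathlib import Path


def dirty_thresholds_gib(mem_gib, dirty_background_ratio, dirty_ratio):
    """Approximate dirty-page thresholds in GiB for a given memory size.

    Returns (background_threshold, blocking_threshold). Using total RAM for
    mem_gib overestimates both values, because the kernel applies the
    percentages to available memory rather than total memory.
    """
    background = mem_gib * dirty_background_ratio / 100
    blocking = mem_gib * dirty_ratio / 100
    return background, blocking


def read_vm_setting(name, root="/proc/sys/vm"):
    """Read an integer sysctl such as dirty_ratio from procfs (Linux only)."""
    return int(Path(root, name).read_text().strip())


if __name__ == "__main__":
    bg_ratio = read_vm_setting("dirty_background_ratio")
    ratio = read_vm_setting("dirty_ratio")
    bg, blocking = dirty_thresholds_gib(128, bg_ratio, ratio)
    print(f"background flush starts near {bg:.1f} GiB of dirty pages")
    print(f"writers are throttled near {blocking:.1f} GiB of dirty pages")
```

With the recommended values of 5 and 10, a 128 GB node would start background writeback at no more than about 6.4 GiB of dirty pages and throttle writers at no more than about 12.8 GiB.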


Elasticsearch log from the failed node:
[2018-08-04T06:49:12,265][WARN ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][young][1013831][93448] duration [1.1s], collections [1]/[7s], total [1.1s]/[1.2h], memory [22.8gb]->[9.4gb]/[29.2gb], all_pools {[young] [6.1gb]->[1.9mb]/[6.4gb]}{[survivor] [633.8mb]->[0b]/[819.1mb]}{[old] [16.1gb]->[9.4gb]/[22gb]}
[2018-08-04T06:49:12,275][INFO ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][old][1013831][1217] duration [5.1s], collections [1]/[7s], total [5.1s]/[4.3m], memory [22.8gb]->[9.4gb]/[29.2gb], all_pools {[young] [6.1gb]->[1.9mb]/[6.4gb]}{[survivor] [633.8mb]->[0b]/[819.1mb]}{[old] [16.1gb]->[9.4gb]/[22gb]}
[2018-08-04T06:49:12,275][WARN ][o.e.m.j.JvmGcMonitorService] [10.135.6.226] [gc][1013831] overhead, spent [6.3s] collecting in the last [7s]
[2018-08-04T22:51:04,451][ERROR][o.e.x.m.c.n.NodeStatsCollector] [10.135.6.226] collector [node_stats] timed out when collecting data
[2018-08-04T22:51:14,468][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.ingest.PipelineExecutionService$2@57621aaa on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@19accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
[2018-08-04T22:51:14,473][WARN ][o.e.x.m.MonitoringService] [10.135.6.226] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.<init>(ExportBulk.java:107) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1.onFailure(ExportBulk.java:105) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:218) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:212) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:108) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:176) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$1(LocalBulk.java:127) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.lambda$processBulkIndexIngestRequest$4(TransportBulkAction.java:503) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService$2.onFailure(PipelineExecutionService.java:79) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:63) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onRejection(ThreadContext.java:662) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:104) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:168) ~[?:?]
... 41 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulk [default_local]
... 40 more
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.ingest.PipelineExecutionService$2@57621aaa on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@19accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
... 31 more
[2018-08-04T22:51:24,430][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.ingest.PipelineExecutionService$2@7cfc78f3 on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@19accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
[2018-08-04T22:51:24,434][WARN ][o.e.x.m.MonitoringService] [10.135.6.226] monitoring execution failed
org.elasticsearch.xpack.monitoring.exporter.ExportException: Exception when closing export bulk
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1$1.<init>(ExportBulk.java:107) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$1.onFailure(ExportBulk.java:105) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:218) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound$1.onResponse(ExportBulk.java:212) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.onResponse(IteratingActionListener.java:108) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:176) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$doFlush$1(LocalBulk.java:127) ~[?:?]
at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.lambda$processBulkIndexIngestRequest$4(TransportBulkAction.java:503) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService$2.onFailure(PipelineExecutionService.java:79) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.onRejection(AbstractRunnable.java:63) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onRejection(ThreadContext.java:662) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:104) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:93) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.ingest.PipelineExecutionService.executeBulkRequest(PipelineExecutionService.java:75) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.processBulkIndexIngestRequest(TransportBulkAction.java:496) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:135) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:86) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:167) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:133) ~[?:?]
at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:165) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:139) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:81) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.client.support.AbstractClient.bulk(AbstractClient.java:482) ~[elasticsearch-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.core.ClientHelper.executeAsyncWithOrigin(ClientHelper.java:73) ~[x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.doFlush(LocalBulk.java:120) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flush(ExportBulk.java:72) ~[?:?]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$1(ExportBulk.java:166) ~[?:?]
at org.elasticsearch.xpack.core.common.IteratingActionListener.run(IteratingActionListener.java:93) [x-pack-core-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.doFlush(ExportBulk.java:182) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.flushAndClose(ExportBulk.java:96) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk.close(ExportBulk.java:86) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.exporter.Exporters.export(Exporters.java:205) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) [x-pack-monitoring-6.2.2.jar:6.2.2]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_66]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) [elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_66]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulks
at org.elasticsearch.xpack.monitoring.exporter.ExportBulk$Compound.lambda$doFlush$0(ExportBulk.java:168) ~[?:?]
... 41 more
Caused by: org.elasticsearch.xpack.monitoring.exporter.ExportException: failed to flush export bulk [default_local]
... 40 more
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.ingest.PipelineExecutionService$2@7cfc78f3 on EsThreadPoolExecutor[name = 10.135.6.226/bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@19accc58[Running, pool size = 32, active threads = 32, queued tasks = 305, completed tasks = 160486966]]
at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:48) ~[elasticsearch-6.2.2.jar:6.2.2]
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_66]
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_66]
at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:98) ~[elasticsearch-6.2.2.jar:6.2.2]
... 31 more
[2018-08-04T22:51:34,430][ERROR][o.e.a.b.TransportBulkAction] [10.135.6.226] failed to execute pipeline for a bulk request
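The rejection messages in the log above embed the state of the bulk thread pool at the moment of rejection. The following sketch pulls those numbers out of an `EsRejectedExecutionException` line so they can be tracked over time. The function name and the `saturated` flag are hypothetical, and the regular expression is written against the message text shown in this log only.

```python
import re

# Matches the executor description embedded in the rejection message.
REJECT_RE = re.compile(
    r"EsThreadPoolExecutor\[name = (\S+)/(\w+), queue capacity = (\d+), "
    r".*?pool size = (\d+), active threads = (\d+), "
    r"queued tasks = (\d+), completed tasks = (\d+)"
)


def parse_rejection(line):
    """Extract thread-pool statistics from a rejection log line, or None."""
    m = REJECT_RE.search(line)
    if m is None:
        return None
    stats = {
        "node": m.group(1),
        "pool": m.group(2),
        "queue_capacity": int(m.group(3)),
        "pool_size": int(m.group(4)),
        "active_threads": int(m.group(5)),
        "queued_tasks": int(m.group(6)),
        "completed_tasks": int(m.group(7)),
    }
    # Every worker busy and the queue at or over capacity: nothing is draining.
    stats["saturated"] = (
        stats["active_threads"] >= stats["pool_size"]
        and stats["queued_tasks"] >= stats["queue_capacity"]
    )
    return stats


if __name__ == "__main__":
    import sys
    for log_line in sys.stdin:
        parsed = parse_rejection(log_line)
        if parsed:
            print(parsed)
```

In the entries above, all 32 bulk threads are active, 305 tasks are queued against a capacity of 200, and the completed-task count stays at 160486966 between 22:51:14 and 22:51:24, which suggests the bulk threads were making no progress during that interval.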