On 01/22/2014 07:33 PM, Vivek Goyal wrote:
> On Mon, Jan 13, 2014 at 01:39:11PM +0100, Marek Grac wrote:
> [..]
>> if you have a lot of memory, you should set fence_kdump to wait
>> longer (default 60 seconds)
>> pcs stonith update myfence pcmk_reboot_timeout=600 --force
> Hi Marek,
> I think this is a problem. How would we know in advance how long
> the dump will take to finish? It will vary depending on so many
> things (size of memory, speed of the network, etc.).
You don't need to know this in advance. This is set on the cluster
side, and the administrator should be able to set this timeout to a
proper value.
> By default, why can't this value be very high? Or this value could
> act more like a watchdog: as long as you keep getting ticks, you keep
> resetting an internal counter. If you don't get a tick (a message
> from the node that is saving the vmcore) for 60 seconds, then you
> assume that something went wrong with the node and power-cycle it.
> Trying to keep an upper limit of 60 seconds and assuming the dump
> will finish in that time will not help.
This is a general fence-agent setting in the cluster, and fence_kdump
is the only agent that uses the 'ticking' mechanism; all the others
should finish in a much more fixed time. Setting this value for the
kdump agent is fine, as fence_kdump itself contains a different
timeout mechanism based on 'ticks'. I agree that this should be
explained in the documentation/kbase, but it is not something that can
be changed at the fence-agent level.
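To illustrate the 'ticking' idea: the overall dump can take far longer
than 60 seconds, because what is bounded is only the *silence* between
messages, not the total duration. A minimal sketch (function and event
names are illustrative, not fence_kdump's real wire protocol):

```python
def dump_watchdog(events, tick_timeout=60):
    """events: sorted list of (timestamp, kind), where kind is "tick"
    (keepalive from the node saving the vmcore) or "done" (dump
    finished). Returns True if the dump completes before any gap
    between messages exceeds tick_timeout; False means the caller
    would give up and power-cycle the node."""
    last = 0  # time of the last message; monitoring starts at t=0
    for ts, kind in events:
        if ts - last > tick_timeout:
            return False          # too long without a tick: node presumed dead
        if kind == "done":
            return True           # dump completed while still ticking
        last = ts                 # got a tick: reset the watchdog
    return False                  # stream ended without completion
```

Note that a 140-second dump still succeeds as long as no single gap
exceeds 60 seconds, which is exactly why a fixed 60-second upper limit
on the whole operation would not be needed.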
The cluster (pacemaker/corosync) accepts that some fence agents are
slower than others, so it is possible to set this timeout value per
instance of an agent with the given command.
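Putting the pieces together, a per-instance configuration might look
like this (resource and host names are hypothetical; the update
command is the one quoted above, and pcmk_reboot_timeout only affects
this stonith resource, not other agents in the cluster):

```shell
# Create a fence_kdump stonith resource for the nodes being monitored
pcs stonith create myfence fence_kdump pcmk_host_list="node1 node2"
# Raise the fencing timeout for this instance only, e.g. for hosts
# with a lot of memory whose dumps take long
pcs stonith update myfence pcmk_reboot_timeout=600 --force
```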