Hi,
On 01/16/2014 06:14 AM, WANG Chao wrote:
On 01/16/14 at 11:35am, Dave Young wrote:
> Hi, Marek
>
>> if you have a lot of memory, you should set fence_kdump to wait
>> longer (default 60 seconds)
>> pcs stonith update myfence pcmk_reboot_timeout=600 --force
> A large-memory system might need hours to finish capturing the vmcore,
> so 60 seconds is not enough. Could you help to increase the default
> value? I think there's no side effect to setting it to INT_MAX?
I'm quite confused by the option "pcmk_reboot_timeout".
In a kdump environment, fence_kdump_send runs in the background and sends
an acknowledgement message every 10 seconds (by default) indefinitely
(by default).
So how does the timeout work?
1. Will it be reset and count down again when it receives a valid
message from the crashed node?
2. Will the fence_kdump agent wait for the timeout after the very first
message is received?
pcmk_reboot_timeout:
* Specifies an alternate timeout to use for reboot actions. If the
command has not finished in time, fencing is considered to have failed.
This option is set at a higher level than the fence agent itself and
it controls all fence agents, so it is not affected by valid messages
from the crashed node.
Such behaviour can be controlled directly in fence_kdump, but the
defaults there are not a problem (if no valid message is received within
60 seconds, fencing fails).
* Default value: 60s (the command quoted above sets it to 600 seconds, which
is more than enough on our testing machines).
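
To make the two levels concrete, here is a rough toy model in Python. It is only a sketch, not Pacemaker or fence_kdump code: the function name and parameters are made up for illustration, and it assumes the agent returns as soon as it sees the first valid message.

```python
def fencing_succeeds(first_message_at, agent_timeout=60, reboot_timeout=600):
    """Toy model of the two timeouts; times are seconds since fencing started.

    first_message_at: arrival time of the first valid message from the
    crashed node, or None if no message ever arrives.
    """
    # Agent level: give up if no valid message arrives within agent_timeout.
    got_message = first_message_at is not None and first_message_at <= agent_timeout
    # Assumption: the agent returns as soon as it sees the first valid message.
    finished_at = first_message_at if got_message else agent_timeout
    # Higher level: a fixed deadline for the whole command. Messages from the
    # crashed node never reset it.
    return got_message and finished_at <= reboot_timeout
```

In this model a first message after 10 seconds succeeds, while one after 90 seconds fails at the agent level with the 60-second default, no matter how large the higher-level timeout is.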
m,