----- Original Message -----
On Mon, Jan 13, 2014 at 06:23:07PM +0800, WANG Chao wrote:
> From: arthur <zzou(a)redhat.com>
> Since kdump already supports dumping in a cluster environment, this patch
> adds a howto file to the RPM package to describe how to configure kdump
> in a cluster environment.
> Signed-off-by: arthur <zzou(a)redhat.com>
> kdump-in-cluster-environment.txt | 56
> kexec-tools.spec | 3 +++
> 2 files changed, 59 insertions(+)
> create mode 100644 kdump-in-cluster-environment.txt
> diff --git a/kdump-in-cluster-environment.txt
> new file mode 100644
> index 0000000..1e6a43a
> --- /dev/null
> +++ b/kdump-in-cluster-environment.txt
> @@ -0,0 +1,56 @@
> +Kdump-in-cluster-environment HOWTO
> +Kdump is a kexec-based crash dumping mechanism for Linux. This document
> +illustrates how to configure kdump in a cluster environment to allow the kdump
> +crash recovery service to complete without being preempted by traditional power
> +fencing methods.
> +Details about Kexec/Kdump are available in the Kexec-Kdump-howto file and will not
> +be described here.
> +fence_kdump is an I/O fencing agent to be used with the kdump crash recovery
> +service. When the fence_kdump agent is invoked, it will listen for a message
> +from the failed node that acknowledges that the failed node is executing the
The fence_kdump agent is invoked by pacemaker (the cluster manager) on every node in
the cluster. It runs like a daemon, listening for a message from the failed node (in
our case, the node that is doing kdump). The failed node should send the message to
the other nodes in the cluster to acknowledge that it has failed. In our case, that means
that when a node is executing the kdump crash kernel (which means it has failed), it should
send the message to the other nodes using the fence_kdump_send command in the second kernel.
> +kdump crash kernel. Note that fence_kdump is not a replacement for traditional
> +fencing methods. The fence_kdump agent can only detect that a node has entered
> +the kdump crash recovery service. This allows the kdump crash recovery service to
> +complete without being preempted by traditional power fencing methods.
Who sends the message that a node is saving a crash dump?
It is the node that is executing the kdump crash kernel.
> +How to configure cluster environment:
> +If we want to use kdump in a cluster environment, fence-agents-kdump should be
> +installed on every node in the cluster. You can achieve this via the following command:
> + # yum install -y fence-agents-kdump
> +Next is to add fence_kdump to the cluster. Assume that the cluster consists
> +of three nodes, node1, node2 and node3, and uses Pacemaker to do
> +resource management and pcs as the CLI configuration tool.
> +With pcs it is easy to add a stonith resource to the cluster. For example, add
> +a stonith resource named mykdumpfence with a fence type of fence_kdump via the
> +following commands:
> + # pcs stonith create mykdumpfence fence_kdump \
> + pcmk_host_check=static-list pcmk_host_list="node1 node2 node3"
> + # pcs stonith update mykdumpfence pcmk_monitor_action=metadata --force
> + # pcs stonith update mykdumpfence pcmk_status_action=metadata --force
> + # pcs stonith update mykdumpfence pcmk_reboot_action=off --force
> +Then enable stonith
> + # pcs property set stonith-enabled=true
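The pcs steps quoted above can be tried out as a dry run first. The sketch below only prints the commands for a given resource name and node list, so they can be reviewed before anything touches the cluster. print_fence_kdump_cmds is a hypothetical helper name, and the pcs arguments are copied from the quoted HOWTO text rather than checked against a live cluster.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the pcs commands from the HOWTO.
# print_fence_kdump_cmds is a hypothetical helper, not part of pcs.
print_fence_kdump_cmds() {
    name=$1          # stonith resource name, e.g. mykdumpfence
    shift
    hosts="$*"       # remaining arguments, joined with single spaces
    printf '%s\n' "pcs stonith create $name fence_kdump pcmk_host_check=static-list pcmk_host_list=\"$hosts\""
    printf '%s\n' "pcs stonith update $name pcmk_monitor_action=metadata --force"
    printf '%s\n' "pcs stonith update $name pcmk_status_action=metadata --force"
    printf '%s\n' "pcs stonith update $name pcmk_reboot_action=off --force"
    printf '%s\n' "pcs property set stonith-enabled=true"
}

# Example: the three-node cluster assumed in the HOWTO.
print_fence_kdump_cmds mykdumpfence node1 node2 node3
```

Once the printed commands look right, they can be run by hand as root on one cluster node.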
> +How to configure kdump:
> +Actually, there is no difference in configuration between normal kdump and
> +cluster environment kdump. So please refer to the Kexec-Kdump-howto file for
I think we need to add some information here on how kdump sends the
notification to the other nodes after a crash, which configuration file is
used to get the node info, etc.
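As a purely illustrative sketch of what such a section might show, assume the crash kernel reads the peer list and the fence_kdump_send arguments from /etc/kdump.conf through directives named fence_kdump_nodes and fence_kdump_args. Neither directive name is confirmed by this patch, and the node names below are placeholders taken from the three-node example:

```
# /etc/kdump.conf fragment on node1 (directive names are an assumption)
# Peers to notify from the crash kernel: every cluster node except this one.
fence_kdump_nodes node2 node3
# Optional arguments passed through to fence_kdump_send; omitted here so the
# tool's defaults apply.
# fence_kdump_args <arguments for fence_kdump_send>
```

Whatever form the configuration takes, the HOWTO should state where the node list comes from and at which point in the second kernel fence_kdump_send is started.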