This patch adds a fadump HOWTO document to kexec-tools. The document was
prepared with reference to the kexec-kdump-howto.txt document.
Signed-off-by: Mahesh Salgaonkar <mahesh(a)linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini(a)linux.vnet.ibm.com>
fadump-howto.txt | 428 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 428 insertions(+)
create mode 100644 fadump-howto.txt
diff --git a/fadump-howto.txt b/fadump-howto.txt
new file mode 100644
@@ -0,0 +1,428 @@
+Firmware assisted dump (fadump) HOWTO
+Firmware assisted dump is a new feature in the 3.4 mainline kernel supported
+only on powerpc architecture. The goal of firmware-assisted dump is to enable
+the dump of a crashed system, and to do so from a fully-reset system, and to
+minimize the total elapsed time until the system is back in production use. A
+complete description of the implementation can be found in
+Documentation/powerpc/firmware-assisted-dump.txt in the upstream Linux kernel
+tree, from version 3.4 onwards.
+Please note that the firmware-assisted dump feature is only available on Power6
+and above systems with recent firmware versions.
+Fadump is a robust kernel crash dumping mechanism that obtains a reliable kernel
+crash dump with assistance from firmware. This approach does not use kexec;
+instead, firmware assists in booting the kdump kernel while preserving memory
+contents.
+Unlike kdump, the system is fully reset, and loaded with a fresh copy of the
+kernel. In particular, PCI and I/O devices are reinitialized and are in a
+clean, consistent state. This second kernel, often called a capture kernel,
+boots with very little memory and captures the dump image.
+The first kernel registers the sections of memory with the Power firmware for
+dump preservation during OS initialization. These registered sections of memory
+are reserved by the first kernel during early boot. When a system crashes, the
+Power firmware fully resets the system, preserves all the system memory
+contents, and saves the low memory (boot memory, sized as the larger of 5% of
+system RAM or 256MB) to the previously registered region. It also saves the
+system registers and hardware PTEs.
+Fadump is supported only on ppc64 platform. The standard kernel and capture
+kernel are one and the same on ppc64.
+If you're reading this document, you should already have kexec-tools
+installed. If not, you can install it with the following command:
+ # yum install kexec-tools
+Fadump Operational Flow:
+Like kdump, fadump also exports the ELF formatted kernel crash dump through
+/proc/vmcore. Hence existing kdump infrastructure can be used to capture fadump
+vmcore. The idea is to keep the functionality transparent to the end user. From
+the user's perspective there is no change in the way the kdump init script
+works. However, unlike kdump, fadump does not pre-load the kdump kernel and
+initrd into reserved memory; instead, it always uses the default OS initrd
+during the second boot after a crash. Hence, for fadump, we rebuild a new kdump
+initrd and replace the default initrd with it. Before replacing the existing
+default initrd, we take a backup of the original, which is restored when the
+user decides to switch back to kdump. The dracut package has been enhanced to
+rebuild the default initrd with the vmcore capture steps as per /etc/kdump.conf.
+The control flow of fadump works as follows:
+01. System panics.
+02. At the crash, kernel informs power firmware that kernel has crashed.
+03. Firmware takes the control and reboots the entire system preserving
+ only the memory (resets all other devices).
+04. The reboot follows the normal booting process (non-kexec).
+05. The boot loader loads the default kernel and initrd from /boot
+06. The default initrd loads and runs /init
+07. dracut-kdump.sh script present in fadump aware default initrd checks if
+ '/proc/vmcore' file exists before executing steps to capture vmcore.
+ (This check will help to bypass the vmcore capture steps during normal boot
+ process.)
+09. Captures dump according to /etc/kdump.conf
+10. Is dump capture successful (yes goto 12, no goto 11)
+11. Perform the default action specified in /etc/kdump.conf (Default action
+ is reboot, if unspecified)
+12. Reboot
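+The check described in step 07 can be sketched as a small shell fragment. This
+is only an illustration, not the actual dracut-kdump.sh code: both function
+names are hypothetical, and the vmcore path is a parameter so the logic can be
+tried without crashing a system.

```shell
# Hypothetical helper: succeeds if a vmcore is exported, i.e. this boot
# follows a crash. Defaults to /proc/vmcore, the file checked in step 07.
vmcore_present() {
    [ -e "${1:-/proc/vmcore}" ]
}

# Hypothetical helper: print which path the initrd would take.
boot_action() {
    if vmcore_present "$1"; then
        echo "capture-vmcore"
    else
        echo "normal-boot"
    fi
}
```

+On a normal boot /proc/vmcore does not exist, so the capture steps are skipped
+and the system continues booting as usual.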
+How to configure fadump:
+Again, we assume that if you're reading this document, you already have
+kexec-tools installed. If not, you can install it with the following command:
+ # yum install kexec-tools
+To be able to do much of anything interesting in the way of debug analysis,
+you'll also need to install the kernel-debuginfo package, of the same arch
+as your running kernel, and the crash utility:
+ # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash
+Next up, we need to modify some boot parameters to enable firmware assisted
+dump. With the help of grubby, it's very easy to append "fadump=on" to the
+end of your kernel boot parameters. Optionally, the user can also append
+'fadump_reserve_mem=X' to the kernel cmdline to specify the size of the memory
+to reserve for boot memory dump preservation.
+ # grubby --args="fadump=on" --update-kernel=/boot/vmlinuz-`uname -r`
+The term 'boot memory' means the size of the low memory chunk that is required
+for a kernel to boot successfully when booted with restricted memory. By
+default, the boot memory size will be the larger of 5% of system RAM or 256MB.
+Alternatively, the user can specify the boot memory size through the boot
+parameter 'fadump_reserve_mem=', which overrides the default calculated size.
+Use this option if the default boot memory size is not sufficient for the
+second kernel to boot.
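+As a rough illustration of the default sizing rule, the following sketch
+computes "the larger of 5% of system RAM or 256MB" for a given amount of RAM in
+MB. The function name is hypothetical, and the kernel's own calculation may
+round or align the value differently.

```shell
# Hypothetical helper: estimate the default fadump boot memory size in MB.
# $1: total system RAM in MB.
fadump_default_boot_mem_mb() {
    total_mb=$1
    five_pct=$(( total_mb * 5 / 100 ))   # integer arithmetic, rounds down
    if [ "$five_pct" -gt 256 ]; then
        echo "$five_pct"
    else
        echo 256
    fi
}

# Possible usage (MemTotal in /proc/meminfo is reported in kB and is
# slightly smaller than the installed RAM):
#   total_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
#   fadump_default_boot_mem_mb "$total_mb"
```

+For a 4GB system the 5% figure falls below the floor, so 256MB applies; for a
+64GB system the 5% figure is used instead.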
+After making said changes, reboot your system, so that the specified memory is
+reserved and left untouched by the normal system. Take note that the output of
+'free -m' will show X MB less memory than without this parameter, which is
+expected. If you see OOM (Out Of Memory) error messages while loading capture
+kernel, then you should bump up the memory reservation size.
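+After the reboot, one way to confirm that the parameter took effect is to look
+for it on the running kernel's command line. The sketch below uses a
+hypothetical function name and accepts the command line as an argument so it
+can be checked against any string.

```shell
# Hypothetical helper: succeeds if "fadump=on" appears as a separate
# argument on the given kernel command line.
# $1: command line text (defaults to the contents of /proc/cmdline).
cmdline_has_fadump() {
    cmdline=${1:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" fadump=on "*) return 0 ;;
    esac
    return 1
}

# Possible usage on the rebooted system:
#   cmdline_has_fadump && echo "fadump=on is set"
```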
+Now that you've got that reserved memory region set up, you want to turn on
+the kdump init script:
+ # chkconfig kdump on
+Then, start up kdump as well:
+ # systemctl start kdump.service
+This should turn on the firmware assisted dump functionality in the kernel by
+echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
+to capture a vmcore upon crashing. To test this out, you can force-crash
+your system by echo'ing a c into /proc/sysrq-trigger:
+ # echo c > /proc/sysrq-trigger
+You should see some panic output, followed by the system resetting and booting
+into a fresh copy of the kernel. When the default initrd loads and runs /init,
+the vmcore should be copied out to disk (by default, in
+/var/crash/<YYYY-MM-DD-HH:MM>/vmcore), and then the system is rebooted back
+into your normal kernel.
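+Before forcing a crash as described above, it may be worth confirming that
+fadump is actually registered. The following sketch reads the sysfs file
+mentioned earlier; the function name is hypothetical, and the file path is a
+parameter only so that the check can be exercised on any machine.

```shell
# Hypothetical helper: succeeds if the given sysfs file contains "1".
# $1: file to read (defaults to /sys/kernel/fadump_registered).
fadump_is_registered() {
    sysfs_file=${1:-/sys/kernel/fadump_registered}
    [ -r "$sysfs_file" ] && [ "$(cat "$sysfs_file")" = "1" ]
}

# Possible usage:
#   fadump_is_registered || echo "fadump is not registered" >&2
```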
+Once back to your normal kernel, you can use the previously installed crash
+utility in conjunction with the previously installed kernel-debuginfo to
+perform postmortem analysis:
+ # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux
+ crash> bt
+and so on...
+Kernel log buffers are among the most important pieces of information
+available in a vmcore. Before the vmcore is saved, the kernel log buffers are
+extracted from /proc/vmcore and saved into a file, vmcore-dmesg.txt. The vmcore
+itself is saved after vmcore-dmesg.txt. The destination disk and directory for
+vmcore-dmesg.txt are the same as for the vmcore. Note that kernel log buffers
+will not be available if the dump target is a raw device.
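+To locate the most recent dump and its vmcore-dmesg.txt on a local file system,
+something like the following sketch could be used. It assumes the default
+layout of timestamped directories under /var/crash, where lexical order matches
+chronological order; the function name is hypothetical.

```shell
# Hypothetical helper: print the newest dump directory under a base path.
# $1: base directory (defaults to /var/crash).
latest_crash_dir() {
    base=${1:-/var/crash}
    latest=
    # Glob results are sorted, so the last directory seen is the newest
    # when directories are named by timestamp.
    for dir in "$base"/*/; do
        [ -d "$dir" ] && latest=${dir%/}
    done
    [ -n "$latest" ] && echo "$latest"
}

# Possible usage:
#   less "$(latest_crash_dir)/vmcore-dmesg.txt"
```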
+Dump Triggering methods:
+This section talks about the various ways, other than a Kernel Panic, in which
+fadump can be triggered. The following methods assume that fadump is configured
+on your system, with the scripts enabled as described in the section above.
+1) AltSysRq C
+Fadump can be triggered with the combination of the 'Alt', 'SysRq' and 'C'
+keyboard keys. Please refer to the following link for more details:
+In addition, on PowerPC boxes, fadump can also be triggered via the Hardware
+Management Console (HMC) using the 'Ctrl', 'O' and 'C' keyboard keys.
+2) Kernel OOPs
+If we want to generate a dump every time the kernel oopses, we can achieve this
+by setting the 'Panic On OOPs' option as follows:
+ # echo 1 > /proc/sys/kernel/panic_on_oops
+3) PowerPC specific methods:
+On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
+XMON is configured). To configure XMON, either compile the kernel with the
+CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
+and boot the kernel with the xmon=on option.
+Following are the ways to remotely issue a soft reset on PowerPC boxes, which
+would drop you to XMON. Pressing 'X' (capital letter X) followed by
+'Enter' here will trigger the dump.
+The Hardware Management Console (HMC), available on Power4 and Power5 machines,
+allows partitions to be reset remotely. This is especially useful in hang
+situations where the system is not accepting any keyboard input.
+Once you have HMC configured, the following steps will enable you to trigger
+fadump via a soft reset:
+ Using GUI
+ * In the right pane, right click on the partition you wish to dump.
+ * Select "Operating System->Reset".
+ * Select "Soft Reset".
+ * Select "Yes".
+ Using HMC Commandline
+ # reset_partition -m <machine> -p <partition> -t soft
+ Using GUI
+ * In the right pane, right click on the partition you wish to dump.
+ * Select "Restart Partition".
+ * Select "Dump".
+ * Select "OK".
+ Using HMC Commandline
+ # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r
+3.2) Blade Management Console for Blade Center
+To initiate a dump operation, go to the Power/Restart option under "Blade Tasks"
+in the Blade Management Console. Select the corresponding blade for which you
+want to initiate the dump and then click "Restart blade with NMI". This issues a
+system reset and invokes the xmon debugger.
+In addition to being able to capture a vmcore to your system's local file
+system, fadump can be configured to capture a vmcore to a number of other
+locations, including a raw disk partition, a dedicated file system, an NFS
+mounted file system, or a remote system via ssh/scp. Additional options
+exist for specifying the relative path under which the dump is captured,
+what to do if the capture fails, and for compressing and filtering the dump
+(so as to produce smaller, more manageable, vmcore files).
+In theory, dumping to a location other than the local file system should be
+safer than fadump's default setup, as it's possible the default setup will try
+dumping to a file system that has become corrupted. The raw disk partition and
+dedicated file system options allow you to still dump to the local system,
+but without having to remount your possibly corrupted file system(s),
+thereby decreasing the chance a vmcore won't be captured. Dumping to an
+NFS server or remote system via ssh/scp also has this advantage, as well
+as allowing for the centralization of vmcore files, should you have several
+systems from which you'd like to obtain vmcore files. Of course, note that
+these configurations could present problems if your network is unreliable.
+Advanced setups are configured via modifications to /etc/kdump.conf,
+which, out of the box, is fairly well documented itself. Any alterations to
+/etc/kdump.conf should be followed by a restart of the kdump service, so
+the changes can be incorporated in the fadump aware default initrd. Restarting
+the kdump service is as simple as '/sbin/systemctl restart kdump.service'.
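+When scripting around these settings, it can help to read back what is
+currently configured. The sketch below prints the value of a given option from
+a kdump.conf-style file, ignoring commented-out lines. The function name is
+hypothetical, and it assumes the simple "option value" line format used by the
+examples in this document.

```shell
# Hypothetical helper: print the value of an option in a kdump.conf-style
# file. If the option appears more than once, the last occurrence wins.
# $1: option name (e.g. nfs, path); $2: file (defaults to /etc/kdump.conf).
kdump_conf_get() {
    key=$1
    conf=${2:-/etc/kdump.conf}
    awk -v k="$key" '
        $1 == k { $1 = ""; sub(/^[ \t]+/, ""); v = $0 }
        END     { if (v != "") print v }
    ' "$conf"
}

# Possible usage:
#   kdump_conf_get path
```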
+Note that kdump.conf is used as a configuration mechanism for capturing dump
+files from the initramfs (in the interests of safety). The root file system is
+mounted, and the init process is started, only as a last resort if the
+initramfs fails to capture the vmcore. As such, configuration made in
+/etc/kdump.conf is only applicable to captures performed in the initramfs. If
+for any reason the init process is started on the root file system, only a
+simple copying of the vmcore from /proc/vmcore to /var/crash/$DATE/vmcore will
+be performed.
+For both local filesystem and NFS dumps, the dump target must be mounted before
+building the fadump aware initramfs. That means one needs to put an entry for
+the dump file system in /etc/fstab so that, after reboot, when the kdump service
+starts, it can find the dump target and build the initramfs instead of failing.
+Usually the dump target should be used only for fadump. If you are worried
+about someone using the filesystem for something other than dumping vmcores,
+you can mount it read-only. Mkdumprd will still remount it read-write
+to create the dump directory and will move it back to read-only afterwards.
+Raw partition dumping requires that a disk partition in the system, at least
+as large as the amount of memory in the system, be left unformatted. Assuming
+/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with
+'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly
+onto partition /dev/vg/lv_kdump. Restart the kdump service via
+'/sbin/systemctl restart kdump.service' to commit this change to your fadump
+aware default initrd. The dump target should be a persistent device name, such
+as an LVM or device mapper canonical name.
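+The size requirement for a raw target can be checked ahead of time. The sketch
+below compares a memory size in kB with a target size in bytes; the function
+name is hypothetical, and how the two numbers are obtained is left to the
+caller (the comments show one possibility).

```shell
# Hypothetical helper: succeeds if the dump target is at least as large
# as system memory.
# $1: memory size in kB, e.g. MemTotal from /proc/meminfo
# $2: target size in bytes, e.g. from: blockdev --getsize64 /dev/vg/lv_kdump
target_fits_memory() {
    mem_kb=$1
    target_bytes=$2
    [ "$target_bytes" -ge $(( mem_kb * 1024 )) ]
}

# Possible usage (as root):
#   mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
#   target_fits_memory "$mem_kb" "$(blockdev --getsize64 /dev/vg/lv_kdump)" \
#       || echo "dump target is smaller than system memory" >&2
```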
+Dedicated file system
+Similar to raw partition dumping, you can format a partition with the file
+system of your choice. Again, it should be at least as large as the amount
+of memory in the system. Assuming /dev/vg/lv_kdump has been
+formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a
+vmcore file will be copied onto the file system after it has been mounted.
+Dumping to a dedicated partition has the advantage that you can dump multiple
+vmcores to the file system, space permitting, without overwriting previous ones,
+as would be the case in a raw partition setup. Restart the kdump service via
+'/sbin/systemctl restart kdump.service' to commit this change to
+your fadump aware default initrd. Note that for local file systems ext4 and
+ext2 are supported as dumpable targets. Kdump will not prevent you from
+specifying other filesystems, and they will most likely work, but their
+operation cannot be guaranteed. For instance specifying a vfat filesystem or
+msdos filesystem will result in a successful load of the kdump service, but
+during crash recovery, the dump will fail if the system has more than 2GB of
+memory (since vfat and msdos filesystems do not support more than 2GB files).
+Be careful of your filesystem selection when using this target.
+It is recommended to use persistent device names or UUID/LABEL for file system
+dumps. One example of a persistent device name is /dev/vg/<devname>.
+Dumping over NFS requires an NFS server configured to export a file system
+with full read/write access for the root user. All operations done within
+the fadump aware default initrd are done as root, and to write out a vmcore
+file, we obviously must be able to write to the NFS mount. Configuring an NFS
+server is outside the scope of this document, but either the no_root_squash
+or anonuid options on the NFS server side are likely of interest to permit
+the fadump aware default initrd operations to write to the NFS mount as root.
+Assuming you're exporting /dump on the machine nfs-server.example.com,
+once the mount is properly configured, specify it in kdump.conf via
+'nfs nfs-server.example.com:/dump'. The server portion can be specified either
+by host name or IP address. Following a system crash, the fadump aware default
+initrd will mount the NFS mount and copy out the vmcore to your NFS server.
+Restart the kdump service via '/sbin/systemctl restart kdump.service' to commit
+this change to your fadump aware initrd.
+Remote system via ssh/scp
+Dumping over ssh/scp requires setting up passwordless ssh keys for every
+machine you wish to have dump via this method. First up, configure kdump.conf
+for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user'
+can be any user on the target system you choose, and 'server' is the host
+name or IP address of the target system. Using a dedicated, restricted user
+account on the target system is recommended, as there will be keyless ssh
+access to this account.
+Once kdump.conf is appropriately configured, issue the command
+'kdumpctl propagate' to automatically set up the ssh host keys and transmit
+the necessary bits to the target server. You'll have to type in 'yes'
+to accept the host key for your target server if this is the first time
+you've connected to it, and then input the target system user's password
+to send over the necessary ssh key file. Restart the kdump service via
+'/sbin/systemctl restart kdump.service' to commit this change to the fadump
+aware default initrd.
+By default, local file system vmcore files are written to /var/crash/%DATE
+on the local system, ssh/scp dumps to /var/crash/%HOST-%DATE on the target
+system, dedicated file system partition dumps to ./var/crash/%DATE, and
+NFS dumps to ./var/crash/%HOST-%DATE, the latter two both relative to
+their respective mount points within the fadump initrd (usually /mnt). The
+'/var/crash' portion of the path can be overridden using kdump.conf's 'path'
+variable, should you wish to write the vmcore out to a different location. For
+example, 'path /data/coredumps' would lead to vmcore files being written to
+/data/coredumps/%DATE if you were dumping to your local file system. Note
+that the path option is ignored if your kdump configuration results in the
+core being saved from the initscripts in the root filesystem.
+Kdump Post-Capture Executable
+It is possible to specify a custom script or binary you wish to run following
+an attempt to capture a vmcore. The executable is passed an exit code from
+the capture process, which can be used to trigger different actions from
+within your post-capture executable.
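+A post-capture executable might branch on that exit code along the following
+lines. This is only a sketch: the function name is hypothetical, and it assumes
+the exit code arrives as the first positional argument, which should be checked
+against the comments in your /etc/kdump.conf.

```shell
# Hypothetical helper: report the outcome of the capture.
# $1: exit code of the capture process (0 is assumed to mean success).
post_capture_report() {
    capture_status=$1
    if [ "$capture_status" -eq 0 ]; then
        echo "vmcore capture succeeded"
    else
        echo "vmcore capture failed with status $capture_status"
    fi
}

# In a standalone script the function would be called with the script's
# own first argument:
#   post_capture_report "$1"
```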
+Kdump Pre-Capture Executable
+It is possible to specify a custom script or binary you wish to run before
+capturing a vmcore. The exit status of this binary is interpreted as follows:
+0 - continue with dump process as usual
+non 0 - reboot the system
+If you have specific binaries or scripts you want to have made available
+within your fadump aware default initrd, you can specify them by their full
+path, and they will be included in your initrd, along with all dependent
+libraries. This may be particularly useful for those running post-capture
+scripts that rely on other binaries.
+By default, only the bare minimum of kernel modules will be included in your
+fadump aware default initrd. Should you wish to capture your vmcore files to a
+non-boot-path storage device, such as an iscsi target disk or clustered file
+system, you may need to manually specify additional kernel modules to load into
+your fadump aware default initrd.
+Default action specifies what to do when dumping to the configured dump target
+fails. By default, the default action is "reboot": the system reboots if the
+attempt to save the dump to the dump target fails.
+There are other default actions available though.
+ * dump_to_rootfs
+ This option tries to mount root and save dump on root filesystem
+ in a path specified by "path". This option will generally make
+ sense when dump target is not root filesystem. For example, if
+ dump is being saved over network using "ssh" then one can specify
+ default to "dump_to_rootfs" to try saving dump to root filesystem
+ if dump over network fails.
+ * shell
+ Drop into a shell session inside initramfs.
+ * halt
+ Halt system after failure.
+ * poweroff
+ Poweroff system after failure.
+Compression and filtering
+Refer to the "Compression and filtering" section in "kexec-kdump-howto.txt".
+Compression and filtering are the same for kdump and fadump.
+Notes on rootfs mount:
+Dracut is designed to mount rootfs by default. If rootfs mounting fails it
+will refuse to go on. So fadump leaves rootfs mounting to dracut currently.
+We make the assumption that a proper root= cmdline is being passed to the dracut
+initramfs for the time being. If you need to modify "KDUMP_COMMANDLINE=" in
+/etc/sysconfig/kdump, you will need to make sure that appropriate root=
+options are copied from /proc/cmdline. In general it is best to append
+command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
+the original command line completely.
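+If the command line does have to be replaced, the root= option of the running
+kernel can be pulled out of /proc/cmdline with something like the sketch below.
+The function name is hypothetical; the command line can also be passed as an
+argument, which makes the helper easy to try on sample strings.

```shell
# Hypothetical helper: print the first root=... argument found on a
# kernel command line, or nothing if there is none.
# $1: command line text (defaults to the contents of /proc/cmdline).
extract_root_arg() {
    cmdline=${1:-$(cat /proc/cmdline)}
    printf '%s\n' "$cmdline" | tr -s ' ' '\n' | grep '^root=' | head -n 1
}

# Possible usage:
#   extract_root_arg
```

+The printed value can then be included by hand in the "KDUMP_COMMANDLINE="
+setting in /etc/sysconfig/kdump.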