This introduces a new sub-command, "kdumpctl estimate --reboot". The
sub-command triggers a kdump run and collects memory usage during that
run. The collected info can be viewed with "kdumpctl estimate"
(without the --reboot).
The estimation supports all regular kdump targets except the raw
target, because a filesystem is needed to store the collected info. The
only way to support a raw dump target would be to include an extra fs in
the initramfs, which consumes extra memory and makes the result unreliable.
To make it work gracefully and stay compatible with most environments, a
temporary file is created as /kdump-estimate. This file holds the
estimation status and progress data. A few new services and a systemd
shutdown hook are introduced for the implementation:
kdump-estimate.service: This is the boot service. It only starts when
the /kdump-estimate file exists. It calls the kdump estimate routine and
automates the whole process.
kdump.shutdown: This is the panic trigger. When /kdump-estimate
indicates that a panic is pending, this systemd shutdown hook triggers
the panic after all filesystems are unmounted or remounted read-only, so
the panic can be triggered safely without losing data.
kdump-estimate-cleanup.service: This is the failure handler. If anything
goes wrong and kdump-estimate.service fails, this cleans up
everything.
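The status file described above can be pictured as a small key=value
store that is written before each reboot and read back by the boot
service. The following is a minimal sketch of that round trip, not the
patch's exact code: it uses a temporary file instead of /kdump-estimate
so it is safe to run, and the shortened names STATUS_FILE, STATUS_KEYS,
save_status and load_status are illustrative only.

```bash
#!/bin/bash
# Sketch only: the real script uses /kdump-estimate; a temp file is used
# here so the example can be run without side effects.
STATUS_FILE=$(mktemp)
STATUS_KEYS=(ESTIMATE_STAGE ESTIMATE_DIR ESTIMATE_MEMORY)

# Dump every known key as KEY=VALUE
save_status()
{
	local _key
	for _key in "${STATUS_KEYS[@]}"; do
		echo "$_key=${!_key}"
	done > "$STATUS_FILE"
}

# Read the file back, accepting only known keys
load_status()
{
	local _key _val
	# A missing or empty file means no estimation is in progress
	[[ -s $STATUS_FILE ]] || return 1
	while IFS="=" read -r _key _val; do
		if [[ " ${STATUS_KEYS[*]} " == *" $_key "* ]]; then
			declare -g "$_key=$_val"
		fi
	done < "$STATUS_FILE"
}

ESTIMATE_STAGE=reboot ESTIMATE_DIR=.abc123 ESTIMATE_MEMORY=523264
save_status
ESTIMATE_STAGE=
load_status && echo "stage: $ESTIMATE_STAGE"   # prints "stage: reboot"
rm -f "$STATUS_FILE"
```

Restricting the loader to a fixed key list keeps stray lines in the file
from defining arbitrary variables.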
An estimation run will progress through three stages:
Reboot stage: The crashkernel is temporarily boosted to a quarter of
all available memory, then the system reboots via kexec to make it take
effect. Rebooting via kexec avoids modifying the actual command line. A
`--hard-reboot` option is also available in case kexec doesn't work.
Panic stage: After the system has rebooted, a panic is triggered. Some
extra kernel command line parameters are passed to the kdump kernel to
indicate that extra info needs to be collected. The kdump scripts store
the info on the dump target temporarily.
Final stage: The system reboots back into its normal state. The kdump
estimate service collects the captured info and the estimation is done.
The result can then be reviewed via "kdumpctl estimate".
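The stages above can be sketched as two small helpers: one that sizes
the temporary crashkernel boost, and one that maps the stage recorded in
the status file to the next action. This is a simplified illustration
under the rules the patch applies (a quarter of RAM, capped at 16384M,
refused below 256M, never below the current reservation). The function
names boosted_crashkernel_mb and next_action are hypothetical and do not
appear in the patch.

```bash
#!/bin/bash
# Sketch only: mirrors the sizing rules and stage transitions described
# in the commit message; function names are illustrative.

# Print the boosted crashkernel size in MB, or fail if RAM is too small.
# $1: total system memory in MB, $2: currently reserved crashkernel in MB
boosted_crashkernel_mb()
{
	local _ck=$(($1 / 4)) _cur=$2

	((_ck > 16384)) && _ck=16384 # cap the boost at 16G
	((_ck < 256)) && return 1    # too little RAM to run an estimation
	((_ck < _cur)) && _ck=$_cur  # never go below the current reservation
	echo "$_ck"
}

# Print what the boot service does for the stage found in the status file.
next_action()
{
	case $1 in
	reboot) echo "trigger panic" ;;
	panic) echo "collect result" ;;
	*) echo "clean up" ;;
	esac
}

boosted_crashkernel_mb 2048 160 # prints 512
next_action reboot              # prints "trigger panic"
```

Keeping the stage in a file rather than in memory is what lets the
process survive the two reboots between the start and the final report.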
Currently, the actual reboot estimation logic is based only on the
output of "rd.memdebug". It can be extended later to use "memstrack"
to track the peak memory usage better.
The info looks like this:
```
Reboot estimation:
Last estimation: Thu Sep 23 03:50:24 AM CST 2021
Kernel version: 5.13.15-200.fc34.x86_64
Estimate took: 34s
Boosted crashkernel: 511M
Cached memory usage: 70M
Uncached memory usage: 92M
Reserved memory: 56M
Average runtime memory usage: 138M
WARNING: /etc/kdump.conf has changed since last estimation, the result might be outdated.
First kernel based estimation:
Reserved crashkernel: 511M
Kernel image size: 34M
Kernel modules size: 9M
Initramfs size: 36M
Runtime reservation: 64M
Large modules:
xfs: 1908736
Recommended crashkernel: 181M
```
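As a rough cross-check of the sample report, the recommendation can be
reproduced in whole megabytes. This is a simplified sketch, not the
script's exact arithmetic: the reboot-based figure is taken as the
average of the cached and uncached usage plus the reserved memory, the
function name recommended_crashkernel_mb is hypothetical, and the
baseline of 0 is an assumption, since the sample report does not show
the arch baseline.

```bash
#!/bin/bash
# Sketch only: works in whole MB using the rounded figures from the sample
# report, so the result can differ slightly from the byte-based output.

# $1: reboot-based estimate, $2: file-based estimate, $3: initramfs size,
# $4: arch baseline (assumed value; not shown in the sample report)
recommended_crashkernel_mb()
{
	local _reboot=$1 _file=$2 _initrd=$3 _baseline=$4 _rec

	# Take the larger of the two estimates
	_rec=$((_reboot > _file ? _reboot : _file))
	# Allow for the peak while the initramfs is being unpacked
	_rec=$((_rec + _initrd))
	# Never recommend less than the baseline
	((_baseline > _rec)) && _rec=$_baseline
	echo "$_rec"
}

reboot_mb=$(((70 + 92) / 2 + 56)) # average usage + reserved memory = 137
file_mb=$((34 + 9 + 36 + 64))     # kernel + modules + initramfs + runtime = 143
recommended_crashkernel_mb "$reboot_mb" "$file_mb" 36 0 # prints 179
```

The sample report shows 181M rather than 179M. The script sums byte
values and each displayed figure is rounded down to whole MB, which
plausibly accounts for the difference.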
Signed-off-by: Kairui Song <kasong(a)redhat.com>
---
.editorconfig | 2 +-
dracut-kdump.sh | 15 +
kdump-estimate-cleanup.service | 8 +
kdump-estimate.service | 11 +
kdump-estimate.sh | 699 ++++++++++++++++++++++++++++++++-
kdump-lib.sh | 72 ----
kdump.shutdown | 13 +
kexec-tools.spec | 21 +-
8 files changed, 747 insertions(+), 94 deletions(-)
create mode 100644 kdump-estimate-cleanup.service
create mode 100644 kdump-estimate.service
mode change 100755 => 100644 kdump-estimate.sh
create mode 100644 kdump.shutdown
diff --git a/.editorconfig b/.editorconfig
index 87c4f983..878a06bd 100644
--- a/.editorconfig
+++ b/.editorconfig
@@ -18,7 +18,7 @@ binary_next_line = false
space_redirects = true
# Some scripts will only run with bash
-[{mkfadumprd,mkdumprd,kdumpctl,kdump-lib.sh}]
+[{mkfadumprd,mkdumprd,kdumpctl,kdump-lib.sh,kdump-estimate.sh}]
shell_variant = bash
# Use dracut code style for *-module-setup.sh
diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index b69bc98a..24f152e2 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -176,6 +176,14 @@ dump_fs()
return 1
fi
+ _estimate_dir=$(getarg kdump_estimate_dir=)
+ if [ -n "$_estimate_dir" ]; then
+ _estimate_dir=$1/$KDUMP_PATH/$_estimate_dir
+ dinfo "saving estimation result to $_estimate_dir/"
+ mkdir -p "$_estimate_dir"
+ ln -sfr "$_dump_fs_path/kexec-dmesg.log" "$_estimate_dir/kexec-dmesg.log"
+ fi
+
# improper kernel cmdline can cause the failure of echo, we can ignore this kind of failure
return 0
}
@@ -427,6 +435,13 @@ dump_ssh()
derror "saving log file failed, _exitcode:$_ret"
fi
+ _estimate_dir=$(getarg kdump_estimate_dir=)
+ if [ -n "$_estimate_dir" ]; then
+ _estimate_dir="$KDUMP_PATH/$_estimate_dir"
+ dinfo "saving estimation result to $2:$_estimate_dir/"
+ ssh $_ssh_opt "$2" "mkdir -p '$_estimate_dir' && ln -sfr '$_ssh_dir/kexec-dmesg.log' '$_estimate_dir/kexec-dmesg.log'"
+ fi
+
return $_ret
}
diff --git a/kdump-estimate-cleanup.service b/kdump-estimate-cleanup.service
new file mode 100644
index 00000000..3b5a0bdf
--- /dev/null
+++ b/kdump-estimate-cleanup.service
@@ -0,0 +1,8 @@
+[Unit]
+Description=Kdump crash memory usage estimation failed
+DefaultDependencies=no
+
+[Service]
+Type=oneshot
+ExecStart=/usr/lib/kdump/kdump-estimate.sh stage-clean
+StandardOutput=journal+console
diff --git a/kdump-estimate.service b/kdump-estimate.service
new file mode 100644
index 00000000..b681d4b9
--- /dev/null
+++ b/kdump-estimate.service
@@ -0,0 +1,11 @@
+[Unit]
+Description=Kdump crash memory usage estimation
+ConditionPathExists=/kdump-estimate
+After=kdump.service network.target network-online.target remote-fs.target basic.target
+OnFailure=kdump-estimate-cleanup.service
+DefaultDependencies=no
+
+[Service]
+Type=oneshot
+ExecStart=/usr/lib/kdump/kdump-estimate.sh stage-check
+StandardOutput=journal+console
diff --git a/kdump-estimate.sh b/kdump-estimate.sh
old mode 100755
new mode 100644
index 421aaaf9..da7a2997
--- a/kdump-estimate.sh
+++ b/kdump-estimate.sh
@@ -1,30 +1,599 @@
#!/bin/bash
+# Kdump memory usage estimation
[[ $dracutbasedir ]] || dracutbasedir=/usr/lib/dracut
. $dracutbasedir/dracut-functions.sh
. /lib/kdump/kdump-lib.sh
. /lib/kdump/kdump-logger.sh
+KEXEC=/sbin/kexec
+KEXEC_ARGS=()
+
+DEFAULT_SSH_KEY_PATH="/root/.ssh/kdump_id_rsa"
+DEFAULT_SAVE_PATH=/var/crash
+
# These two variables will be overridden by prepare_kdump_bootinfo
KDUMP_INITRD=""
KDUMP_KERNEL=""
+ESTIMATE_REBOOT=0
+ESTIMATE_KEXEC_REBOOT=1
+
+ESTIMATE_RESULTS_DIR="/var/lib/kdump/kdump-estimate/"
+ESTIMATE_TMPDIR=""
+ESTIMATE_TMPMNT=""
+
+# File used to track estimate progress
+# File does not exist: not in an estimation process.
+# File is not empty: in an estimation process, and the status is dumped in this file
+ESTIMATE_STATUS_FILE=/kdump-estimate
+
+# Values that will be stored in ESTIMATE_STATUS_FILE:
+# Which stage the estimation process is in (reboot/panic/result)
+ESTIMATE_STAGE=
+# The temporary dir on dump target used to collect info
+ESTIMATE_DIR=
+# Temporary boosted crashkernel value
+ESTIMATE_MEMORY=
+# The kernel being estimated
+ESTIMATE_KERNEL=
+# The size of initramfs used for estimate
+ESTIMATE_INITRD_SIZE=
+# The time when estimating started
+ESTIMATE_START_TIMESTAMP=
+# The time when panic is triggered
+ESTIMATE_PANIC_TIMESTAMP=
+# The time when estimating finished
+ESTIMATE_END_TIMESTAMP=
+# Store original crashkernel= value, only used in hard reboot mode
+ESTIMATE_ORIG_CRASHKERNEL=
+# The kernel boot entry being estimated, only used in hard reboot mode
+ESTIMATE_KERNEL_ENTRY=
+
+ESTIMATE_STATUS_KEYS=(
+ ESTIMATE_STAGE
+ ESTIMATE_DIR
+ ESTIMATE_MEMORY
+ ESTIMATE_KERNEL
+ ESTIMATE_START_TIMESTAMP
+ ESTIMATE_PANIC_TIMESTAMP
+ ESTIMATE_END_TIMESTAMP
+ ESTIMATE_ORIG_CRASHKERNEL
+ ESTIMATE_KERNEL_ENTRY
+)
+
if [[ -f /etc/sysconfig/kdump ]]; then
. /etc/sysconfig/kdump
fi
+# initiate the kdump logger
if ! dlog_init; then
echo "failed to initiate the kdump logger."
exit 1
fi
-do_estimate_simple()
+check_vmlinux()
+{
+ # Use readelf to check if it's a valid ELF
+ readelf -h "$1" &> /dev/null || return 1
+}
+
+get_vmlinux_size()
+{
+ local size=0 _msize
+
+ while read -r _msize; do
+ size=$((size + _msize))
+ done <<< "$(readelf -l -W "$1" | awk '/^ LOAD/{print $6}' 2> /dev/stderr)"
+
+ echo $size
+}
+
+try_decompress()
{
+ # The obscure use of the "tr" filter is to work around older versions of
+ # "grep" that report the byte offset of the line instead of the pattern.
+
+ # Try to find the header ($1) and decompress from here
+ for pos in $(tr "$1\n$2" "\n$2=" < "$4" | grep -abo "^$2"); do
+ if ! type -P "$3" > /dev/null; then
+ ddebug "Signature detected but '$3' is missing, skip this decompressor"
+ break
+ fi
+
+ pos=${pos%%:*}
+ tail "-c+$pos" "$img" | $3 > "$5" 2> /dev/null
+ if check_vmlinux "$5"; then
+ ddebug "Kernel is extracted with '$3'"
+ return 0
+ fi
+ done
+
+ return 1
+}
+
+# Borrowed from linux/scripts/extract-vmlinux
+get_kernel_size()
+{
+ # Prepare temp files:
+ local tmp img=$1
+
+ tmp=$(mktemp /tmp/vmlinux-XXX)
+ trap 'rm -f "$tmp"' 0
+
+ # Try to check if it's a vmlinux already
+ check_vmlinux "$img" && get_vmlinux_size "$img" && return 0
+
+ # That didn't work, so retry after decompression.
+ try_decompress '\037\213\010' xy gunzip "$img" "$tmp" ||
+ try_decompress '\3757zXZ\000' abcde unxz "$img" "$tmp" ||
+ try_decompress 'BZh' xy bunzip2 "$img" "$tmp" ||
+ try_decompress '\135\0\0\0' xxx unlzma "$img" "$tmp" ||
+ try_decompress '\211\114\132' xy 'lzop -d' "$img" "$tmp" ||
+ try_decompress '\002!L\030' xxx 'lz4 -d' "$img" "$tmp" ||
+ try_decompress '(\265/\375' xxx unzstd "$img" "$tmp"
+
+ # Finally check for uncompressed images or objects:
+ [[ $? -eq 0 ]] && get_vmlinux_size "$tmp" && return 0
+
+ # Fallback to use iomem
+ local _size=0 _seg
+ while read -r _seg; do
+ _size=$((_size + 0x${_seg#*-} - 0x${_seg%-*}))
+ done <<< "$(grep -E "Kernel (code|rodata|data|bss)" /proc/iomem | cut -d ":" -f 1)"
+ echo $_size
+}
+
+save_estimate_status()
+{
+ touch "$ESTIMATE_STATUS_FILE"
+ chmod 0600 "$ESTIMATE_STATUS_FILE"
+
+ {
+ for _key in "${ESTIMATE_STATUS_KEYS[@]}"; do
+ echo "$_key=${!_key}"
+ done
+ } > "$ESTIMATE_STATUS_FILE"
+
+ sync
+}
+
+load_estimate_status()
+{
+ local _file=$1 _key _val
+ _file=${_file:-$ESTIMATE_STATUS_FILE}
+
+ if [[ -s $_file ]]; then
+ while IFS="=" read -r _key _val; do
+ if [[ " ${ESTIMATE_STATUS_KEYS[*]} " == *" $_key "* ]]; then
+ declare -g "$_key"="$_val"
+ else
+ derror "Unknown kdump estimate status '$_key'"
+ fi
+ done <<< "$(< "$_file")"
+ else
+ derror "Failed to read estimate status file $_file"
+ return 1
+ fi
+}
+
+clear_estimate_status()
+{
+ mv "$ESTIMATE_STATUS_FILE" "$ESTIMATE_STATUS_FILE.old"
+
+ sync
+}
+
+set_boot_crashkernel()
+{
+ local _kernel_entry=$1 _ck_value=$2
+
+ ESTIMATE_ORIG_CRASHKERNEL=$(grubby --info "$_kernel_entry" | sed -n -e 's/^args=.*\(crashkernel=\S*\).*"$/\1/p')
+ ESTIMATE_KERNEL_ENTRY=$_kernel_entry
+
+ grubby --args crashkernel="$_ck_value" --update-kernel="$_kernel_entry"
+
+ save_estimate_status
+}
+
+# Restore crashkernel value for the default boot kernel
+restore_boot_crashkernel()
+{
+ if ! [[ $ESTIMATE_KERNEL_ENTRY ]]; then
+ return 0
+ fi
+
+ if [[ $ESTIMATE_ORIG_CRASHKERNEL ]]; then
+ dinfo "Restoring crashkernel= kernel parameter to original value: '$ESTIMATE_ORIG_CRASHKERNEL'."
+ grubby --args "$ESTIMATE_ORIG_CRASHKERNEL" --update-kernel "$ESTIMATE_KERNEL_ENTRY"
+ else
+ dinfo "Removing crashkernel= kernel parameter."
+ grubby --remove-args "crashkernel" --update-kernel "$ESTIMATE_KERNEL_ENTRY"
+ fi
+
+ ESTIMATE_ORIG_CRASHKERNEL=
+ ESTIMATE_KERNEL_ENTRY=
+
+ save_estimate_status
+}
+
+is_in_estimate_process()
+{
+ [[ -s $ESTIMATE_STATUS_FILE ]]
+}
+
+ensure_service_enabled()
+{
+ if ! systemctl is-enabled kdump-estimate &> /dev/null; then
+ derror "kdump-estimate.service has to be enabled in systemd."
+
+ clear_estimate_status
+
+ exit 1
+ fi
+}
+
+prepare_tempdir()
+{
+ [[ -n $ESTIMATE_TMPDIR ]] && return 0
+
+ ESTIMATE_TMPDIR="$(mktemp -d -t kdump-estimate.XXXXXX)"
+ [ -d "$ESTIMATE_TMPDIR" ] || perror_exit "kdump-estimate: mktemp -d -t kdump-estimate.XXXXXX failed."
+ ESTIMATE_TMPMNT="$ESTIMATE_TMPDIR/target"
+
+ trap '
+ ret=$?;
+ is_mounted $ESTIMATE_TMPMNT && umount -f $ESTIMATE_TMPMNT;
+ [[ -d $ESTIMATE_TMPDIR ]] && rm --one-file-system -rf -- "$ESTIMATE_TMPDIR";
+ exit $ret;
+ ' EXIT
+
+ # clean up after ourselves no matter how we die.
+ trap 'exit 1;' SIGINT
+}
+
+check_user_confirm()
+{
+ local _confirm
+
+ dwarn "continue? (y/n):"
+ read -r _confirm
+ while [[ $_confirm != 'y' ]]; do
+ if [[ $_confirm == 'n' ]]; then
+ exit 0
+ else
+ echo "Please input (y/n):"
+ read -r _confirm
+ fi
+ done
+
+ return 0
+}
+
+start_staged_reboot_estimate()
+{
+ local _initrd _confirm _memory _ck_memory _cur_ck_memory
+ local _kdump_kernel _kexec_cmdline _def_kernel
+
+ ensure_service_enabled
+
+ dwarn "WARNING: This will reboot current system and trigger a panic to"
+ dwarn " estimate real kdump memory usage, this may take a while."
+ check_user_confirm
+
+ if [[ $(grep -o crashkernel /proc/cmdline | wc -l) -gt 1 ]]; then
+ if [[ $ESTIMATE_KEXEC_REBOOT -ne 1 ]]; then
+ derror "Multiple crashkernel values are in use, hard reboot estimation"
+ derror "is not supported with such a config yet."
+ exit 1
+ fi
+ fi
+
+ if is_raw_dump_target; then
+ dwarn "ERROR: estimate result for raw dump target is not reliable, kdump requires"
+ dwarn " a fs storage for the estimation result in capture kernel."
+ exit 1
+ fi
+
+ if [[ $(get_luks_crypt_dev "$(kdump_get_maj_min "$(get_root_fs_device)")") ]] ||
+ [[ $(get_all_kdump_crypt_dev) ]]; then
+ dwarn "WARNING: encrypted device is in use, you will have to input "
+ dwarn " the password manually after reboot."
+ check_user_confirm
+ fi
+
+ # Check and gather memory info
+ _memory=$(get_system_size)
+ _ck_memory=$((_memory / 1024 / 1024 / 4))
+
+ _cur_ck_memory=$(< /sys/kernel/kexec_crash_size)
+ _cur_ck_memory=$((_cur_ck_memory / 1024 / 1024))
+
+ dinfo "Available system RAM: $((_memory / 1024 / 1024))MB"
+ if [[ $_ck_memory -gt 16384 ]]; then
+ _ck_memory=16384
+ fi
+
+ if [[ $_ck_memory -lt 256 ]]; then
+ derror "System RAM is too small to run an estimation."
+ exit 1
+ fi
+
+ if [[ $_ck_memory -lt $_cur_ck_memory ]]; then
+ _ck_memory=$_cur_ck_memory
+ fi
+
+ dinfo "Will use crashkernel=${_ck_memory}M for estimation."
+
+ # Check and gather kernel and boot info
+ # Following two environment variables are prepared by prepare_kdump_bootinfo
+ _initrd=$DEFAULT_INITRD
+ _kdump_kernel=$KDUMP_KERNEL
+
+ dinfo "Kdump kernel is: $_kdump_kernel"
+ if [[ $ESTIMATE_KEXEC_REBOOT == 1 ]]; then
+
+ _kexec_cmdline=$(sed -e 's/\(\s\+\|^\)crashkernel=\S*\(\s\+\|$\)/ /g' /proc/cmdline)
+ _kexec_cmdline="$_kexec_cmdline crashkernel=${_ck_memory}M"
+
+ if is_secure_boot_enforced; then
+ dinfo "Secure Boot is enabled. Using kexec file based syscall."
+ KEXEC_ARGS+=("-s")
+ fi
+ else
+ _def_kernel=$(grubby --default-kernel)
+ dinfo "Hard reboot mode is enabled, default boot kernel is: $_def_kernel"
+
+ if [[ $_kdump_kernel != "$_def_kernel" ]]; then
+ dwarn "Default boot kernel is not the kdump kernel, estimation might be unreliable."
+ check_user_confirm
+ fi
+ fi
+
+ ESTIMATE_START_TIMESTAMP=$(date +%s%N)
+ ESTIMATE_STAGE="reboot"
+ ESTIMATE_DIR=".$(tr -dc A-Za-z0-9 < /dev/urandom | head -c 12)"
+ ESTIMATE_MEMORY=$((_ck_memory * 1024))
+
+ # rebuild the initramfs
+ kdumpctl rebuild || exit $?
+
+ # ensure kdump service works and fail early if not
+ kdumpctl restart || exit $?
+
+ # Kdump initramfs size
+ ESTIMATE_INITRD_SIZE=$(du -b "$KDUMP_INITRD" | awk '{print $1}')
+
+ save_estimate_status
+
+ # double check, in case user touched the service
+ ensure_service_enabled
+
+ if [[ $ESTIMATE_KEXEC_REBOOT == 1 ]]; then
+ "$KEXEC" --command-line "$_kexec_cmdline" --initrd "$_initrd" --load "$_kdump_kernel" "${KEXEC_ARGS[@]}"
+
+ dinfo "Rebooting with kexec to apply updated crashkernel kernel parameter."
+ systemctl kexec
+ else
+ set_boot_crashkernel "$_def_kernel" "${_ck_memory}M"
+
+ dinfo "Rebooting to apply updated crashkernel kernel parameter."
+ reboot
+ fi
+}
+
+retrive_estimate_result_fs()
+{
+ local _target=$1 _fstype=$2 _opt=$3 _estimate_dir=$4 _dest=$5
+ local _mnt _path
+
+ if ! is_mounted "$_target"; then
+ _mnt="$ESTIMATE_TMPMNT"
+ mkdir -p "$_mnt"
+ mount "$_target" "$_mnt" -t "$_fstype" -o defaults || mount_failure "$_target" "" "$_fstype"
+ else
+ _mnt="$(get_mntpoint_from_target "$_target")"
+ fi
+
+ _path=$(get_save_path)
+
+ # Currently just retrieve the kexec dmesg for analysis
+ cat "$_mnt/$_path/$_estimate_dir/kexec-dmesg.log" > "$_dest/kexec-dmesg.log"
+
+ rm -rf "${_mnt:?}/$_path/$_estimate_dir/"
+}
+
+retrive_estimate_result_ssh()
+{
+ local _target=$1 _estimate_dir=$2 _dest=$3
+ local _key _path _ssh_opt
+
+ _key=$(kdump_get_conf_val sshkey)
+ if ! [[ -f $_key ]]; then
+ _key="/root/.ssh/kdump_id_rsa"
+ if ! [[ -f $_key ]]; then
+ derror "Default SSH key '$_key' doesn't exist, no available key to try, exiting."
+ exit 1
+ fi
+ fi
+
+ _ssh_opt=("-i" "$_key" "-o" "BatchMode=yes" "-o" "StrictHostKeyChecking=yes")
+ _path=$(get_save_path)
+
+ # Currently just retrieve the kexec dmesg for analysis
+ if ! ssh "${_ssh_opt[@]}" "$_target" cat "$_path/$_estimate_dir/kexec-dmesg.log" > "$_dest/kexec-dmesg.log"; then
+ derror "Failed to retrieve estimate result over ssh on '$_target'"
+ exit 1
+ fi
+ ssh "${_ssh_opt[@]}" "$_target" "rm -rf '${_path:?}/$_estimate_dir'"
+}
+
+staged_estimate_panic()
+{
+ dinfo "Preparing to trigger a panic."
+ dinfo "Kdump will generate extra data in '$ESTIMATE_DIR' on the dump target."
+
+ KDUMP_COMMANDLINE_EXTRA="rd.memdebug=4 kdump_estimate_dir=$ESTIMATE_DIR" kdumpctl restart || exit $?
+
+ ESTIMATE_STAGE="panic"
+ ESTIMATE_PANIC_TIMESTAMP=$(date +%s%N)
+ ESTIMATE_KERNEL=$(uname -r)
+ save_estimate_status
+
+ # Restore the boot crashkernel value asap
+ restore_boot_crashkernel
+
+ # The real panic will be triggered by
+ # /usr/lib/systemd/system-shutdown/kdump.shutdown
+ dinfo "Triggering a kernel panic."
+ systemctl halt
+}
+
+analyze_result()
+{
+ local _result_dir=$1
+ local _result_dmesg=$_result_dir/kexec-dmesg.log
+ local _cached_usage _uncached_usage _reserve_usage
+
+ local _memfree _memavail _memtotal
+ local _line _i
+
+ while read -r _line; do
+ case $_line in
+ *MemTotal:*)
+ _memtotal=$(echo "$_line" | awk '{print $(NF-1)}')
+ _reserve_usage=$((ESTIMATE_MEMORY - _memtotal))
+ ;;
+ *MemFree:*)
+ _memfree=$(echo "$_line" | awk '{print $(NF-1)}')
+ if [[ -n $_memtotal ]]; then
+ _i=$((_memtotal - _memfree))
+ if [[ $_i -lt $_cached_usage ]] || [[ -z $_cached_usage ]]; then
+ _cached_usage="$_i"
+ fi
+ fi
+ ;;
+ *MemAvailable:*)
+ _memavail=$(echo "$_line" | awk '{print $(NF-1)}')
+ if [[ -n $_memtotal ]]; then
+ _i=$((_memtotal - _memavail))
+ if [[ $_i -lt $_uncached_usage ]] || [[ -z $_uncached_usage ]]; then
+ _uncached_usage="$_i"
+ fi
+ fi
+ ;;
+ esac
+ done <<< "$(grep -A 3 "\[debug_mem\]" "$_result_dmesg")"
+
+ _datedir=$(date +%Y-%m-%d-%T)
+ _estimate_dir=$ESTIMATE_RESULTS_DIR/$_datedir
+
+ mkdir -p "$_estimate_dir"
+ cp $ESTIMATE_STATUS_FILE "$_estimate_dir/status"
+ cp $KDUMP_CONFIG_FILE "$_estimate_dir/kdump.conf"
+ cat "$_result_dmesg" > "$_estimate_dir/dmesg"
+ echo "$_cached_usage $_uncached_usage $_reserve_usage" > "$_estimate_dir/usage"
+
+ rm -rf $ESTIMATE_RESULTS_DIR/latest
+ ln -sfr "$_estimate_dir" $ESTIMATE_RESULTS_DIR/latest
+}
+
+staged_estimate_collect_result()
+{
+ local _target _fstype _opt _estimate_dir _temp_dest _ret
+
+ ESTIMATE_STAGE="result"
+ ESTIMATE_END_TIMESTAMP=$(date +%s%N)
+ save_estimate_status
+
+ prepare_tempdir
+ _estimate_dir=$ESTIMATE_DIR
+ _temp_dest=$ESTIMATE_TMPDIR
+
+ if is_mount_in_dracut_args; then
+ local _dracut_args
+
+ _dracut_args=$(kdump_get_conf_val dracut_args)
+ _target=$(get_dracut_args_target "$_dracut_args")
+ _fstype=$(get_dracut_args_fstype "$_dracut_args")
+ _opt=$(get_dracut_args_fsopts "$_dracut_args")
+
+ retrive_estimate_result_fs "$_target" "$_fstype" "$_opt" "$_estimate_dir" "$_temp_dest"
+
+ elif is_nfs_dump_target; then
+ _target=$(kdump_get_conf_val "nfs\|nfs4")
+
+ retrive_estimate_result_fs "$_target" nfs defaults "$_estimate_dir" "$_temp_dest"
+ elif is_ssh_dump_target; then
+ _target=$(kdump_get_conf_val "ssh")
+
+ retrive_estimate_result_ssh "$_target" "$_estimate_dir" "$_temp_dest"
+ else
+ _target=$(get_block_dump_target)
+
+ if is_raw_dump_target; then
+ derror "Unexpected error, unsupported dump target."
+ return 1
+ fi
+
+ _opt=$(get_mntopt_from_target "$_target")
+ _fstype=$(get_fs_type_from_target "$_target")
+
+ retrive_estimate_result_fs "$_target" "$_fstype" "$_opt" "$_estimate_dir" "$_temp_dest"
+ fi
+
+ if ! [[ -s "$_temp_dest/kexec-dmesg.log" ]]; then
+ derror "Failed to retrieve kexec-dmesg.log file for estimation."
+ return 1
+ fi
+
+ analyze_result "$_temp_dest"
+ _ret=$?
+
+ cleanup_estiamte_stage
+ return $_ret
+}
+
+cleanup_estiamte_stage()
+{
+ restore_boot_crashkernel
+ clear_estimate_status
+}
+
+progress_staged_estimate()
+{
+ if ! load_estimate_status; then
+ derror "The estimate process is interrupted unexpectedly."
+ cleanup_estiamte_stage
+ fi
+
+ if [[ $ESTIMATE_STAGE == "reboot" ]]; then
+ staged_estimate_panic
+ elif [[ $ESTIMATE_STAGE == "panic" ]]; then
+ staged_estimate_collect_result
+ else
+ derror "Unknown estimate stage: '$ESTIMATE_STAGE'"
+ cleanup_estiamte_stage
+ fi
+
+ return $?
+}
+
+estimate_report()
+{
+ ###
+ ### File based estimation report
+ ###
local kdump_mods
local -A large_mods
local baseline
local kernel_size mod_size initrd_size baseline_size runtime_size reserved_size estimated_size recommended_size
- local size_mb=$((1024 * 1024))
+ local size_mb=$((1024 * 1024)) size_kb=1024
+
+ if ! [[ -f $KDUMP_INITRD ]]; then
+ derror "kdumpctl estimate: kdump initramfs is not built yet."
+ exit 1
+ fi
kdump_mods="$(lsinitrd "$KDUMP_INITRD" -f /usr/lib/dracut/hostonly-kernel-modules.txt | tr '\n' ' ')"
baseline=$(kdump_get_arch_recommend_size)
@@ -33,7 +602,7 @@ do_estimate_simple()
elif [[ ${baseline: -1} == "G" ]]; then
baseline=$((${baseline%G} * 1024))
elif [[ ${baseline: -1} == "T" ]]; then
- baseline=$((${baseline%Y} * 1048576))
+ baseline=$((${baseline%T} * 1048576))
fi
# The default pre-reserved crashkernel value
@@ -71,35 +640,94 @@ do_estimate_simple()
break
done
done
- [[ $crypt_size -ne 0 ]] && echo -e "Encrypted kdump target requires extra memory, assuming using the keyslot with minimun memory requirement\n"
+ [[ $crypt_size -ne 0 ]] && echo "NOTE: Encrypted kdump target requires extra memory, assuming using the keyslot with the minimum memory requirement"
- estimated_size=$((kernel_size + mod_size + initrd_size + runtime_size + crypt_size))
- if [[ $baseline_size -gt $estimated_size ]]; then
- recommended_size=$baseline_size
+ ###
+ ### Reboot based estimation report
+ ###
+ local reboot_estimate_dir=$ESTIMATE_RESULTS_DIR/latest
+ local uncached_usage cached_usage reserve_usage reboot_estimate_size=0
+
+ if ! [[ -d $reboot_estimate_dir ]]; then
+ reboot_estimate_dir=""
+ else
+ load_estimate_status "$reboot_estimate_dir/status"
+
+ read -r uncached_usage cached_usage reserve_usage < "$reboot_estimate_dir/usage"
+ reboot_estimate_size=$(((uncached_usage + cached_usage) / 2 + ESTIMATE_INITRD_SIZE + reserve_usage))
+ reboot_estimate_size=$((reboot_estimate_size * size_kb))
+ fi
+
+ estimated_size=$((kernel_size + mod_size + initrd_size + runtime_size))
+ if [[ $reboot_estimate_size -gt $estimated_size ]]; then
+ recommended_size=$reboot_estimate_size
else
recommended_size=$estimated_size
fi
- echo "Reserved crashkernel: $((reserved_size / size_mb))M"
- echo "Recommended crashkernel: $((recommended_size / size_mb))M"
+ # There will be a peak usage while the initramfs is being unpacked:
+ # two copies of the squashed content will exist at the same time.
+ # This can be removed if there is a way to avoid the peak usage.
+ recommended_size=$(( recommended_size + initrd_size ))
+
+ if [[ $baseline_size -gt $recommended_size ]]; then
+ recommended_size=$baseline_size
+ fi
+
+ # TODO: Remove this after keyring reuse is done, which eliminates the extra overhead for LUKS decryption in the second kernel
+ [[ $crypt_size -ne 0 ]] && recommended_size=$((recommended_size + crypt_size))
+
+ echo "Reboot estimation:"
+ if [[ $reboot_estimate_dir ]]; then
+ echo " Last estimation: $(date -d "@${ESTIMATE_START_TIMESTAMP:0:10}")"
+ echo " Kernel version: $ESTIMATE_KERNEL"
+ echo " Estimate took: $(((ESTIMATE_END_TIMESTAMP - ESTIMATE_PANIC_TIMESTAMP) / 1000 / 1000 / 1000))s"
+ echo " Boosted crashkernel: $((ESTIMATE_MEMORY / size_kb))M"
+ echo
+ echo " Cached memory usage: $((cached_usage / size_kb))M"
+ echo " Uncached memory usage: $((uncached_usage / size_kb))M"
+ echo " Reserved memory: $((reserve_usage / size_kb))M"
+ echo
+ echo " Average runtime memory usage: $(((reboot_estimate_size + size_mb - 1) / size_mb))M"
+ echo
+
+ if [[ -n $(diff $KDUMP_CONFIG_FILE "$reboot_estimate_dir/kdump.conf" -q) ]] ;then
+ echo " WARNING: $KDUMP_CONFIG_FILE has changed since last estimation, the result might be outdated."
+ fi
+ if [[ $(uname -r) != "$ESTIMATE_KERNEL" ]] ;then
+ echo " WARNING: Kernel version has changed since last estimation, the result might be outdated."
+ fi
+
+ else
+ echo " No result available."
+ echo " Use \`kdumpctl estimate --reboot\` to do a reboot estimation."
+ echo
+ fi
+
+ echo "First kernel based estimation:"
+ echo " Reserved crashkernel: $((reserved_size / size_mb))M"
echo
- echo "Kernel image size: $((kernel_size / size_mb))M"
- echo "Kernel modules size: $((mod_size / size_mb))M"
- echo "Initramfs size: $((initrd_size / size_mb))M"
- echo "Runtime reservation: $((runtime_size / size_mb))M"
+ echo " Kernel image size: $((kernel_size / size_mb))M"
+ echo " Kernel modules size: $((mod_size / size_mb))M"
+ echo " Initramfs size: $((initrd_size / size_mb))M"
+ echo " Runtime reservation: $((runtime_size / size_mb))M"
[[ $crypt_size -ne 0 ]] &&
- echo "LUKS required size: $((crypt_size / size_mb))M"
- echo -n "Large modules:"
+ echo " LUKS required size: $((crypt_size / size_mb))M"
+ echo " Large modules:"
if [[ ${#large_mods[@]} -eq 0 ]]; then
- echo " <none>"
+ echo " <none>"
else
- echo ""
for _mod in "${!large_mods[@]}"; do
echo " $_mod: ${large_mods[$_mod]}"
done
fi
- if [[ $reserved_size -le $recommended_size ]]; then
+ echo
+ echo "Recommended crashkernel: $((recommended_size / size_mb))M"
+ echo
+
+ # Leave a 1MB margin
+ if [[ $(( recommended_size - reserved_size )) -gt $size_mb ]]; then
echo "WARNING: Current crashkernel size is lower than recommended size $((recommended_size / size_mb))M."
fi
}
@@ -114,4 +742,37 @@ if is_fadump_capable; then
KDUMP_INITRD="$DEFAULT_INITRD"
fi
-do_estimate_simple
+case $1 in
+estimate)
+ shift
+ while [[ $# -ne 0 ]]; do
+ case $1 in
+ --reboot)
+ ESTIMATE_REBOOT=1
+ ;;
+ --hard-reboot)
+ ESTIMATE_KEXEC_REBOOT=0
+ ;;
+ *)
+ derror "Unrecognized argument $1"
+ exit 1
+ ;;
+ esac
+ shift
+ done
+
+ if [[ $ESTIMATE_REBOOT -eq 1 ]]; then
+ start_staged_reboot_estimate
+ else
+ estimate_report
+ fi
+ ;;
+
+stage-check)
+ progress_staged_estimate || exit $?
+ ;;
+
+stage-clean)
+ cleanup_estiamte_stage || exit $?
+ ;;
+esac
diff --git a/kdump-lib.sh b/kdump-lib.sh
index 04fac86e..f42b0eca 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -878,75 +878,3 @@ get_all_kdump_crypt_dev()
get_luks_crypt_dev "$(kdump_get_maj_min "$_dev")"
done
}
-
-check_vmlinux()
-{
- # Use readelf to check if it's a valid ELF
- readelf -h "$1" &> /dev/null || return 1
-}
-
-get_vmlinux_size()
-{
- local size=0 _msize
-
- while read -r _msize; do
- size=$((size + _msize))
- done <<< "$(readelf -l -W "$1" | awk '/^ LOAD/{print $6}' 2> /dev/stderr)"
-
- echo $size
-}
-
-try_decompress()
-{
- # The obscure use of the "tr" filter is to work around older versions of
- # "grep" that report the byte offset of the line instead of the pattern.
-
- # Try to find the header ($1) and decompress from here
- for pos in $(tr "$1\n$2" "\n$2=" < "$4" | grep -abo "^$2"); do
- if ! type -P "$3" > /dev/null; then
- ddebug "Signiature detected but '$3' is missing, skip this decompressor"
- break
- fi
-
- pos=${pos%%:*}
- tail "-c+$pos" "$img" | $3 > "$5" 2> /dev/null
- if check_vmlinux "$5"; then
- ddebug "Kernel is extracted with '$3'"
- return 0
- fi
- done
-
- return 1
-}
-
-# Borrowed from linux/scripts/extract-vmlinux
-get_kernel_size()
-{
- # Prepare temp files:
- local tmp img=$1
-
- tmp=$(mktemp /tmp/vmlinux-XXX)
- trap 'rm -f "$tmp"' 0
-
- # Try to check if it's a vmlinux already
- check_vmlinux "$img" && get_vmlinux_size "$img" && return 0
-
- # That didn't work, so retry after decompression.
- try_decompress '\037\213\010' xy gunzip "$img" "$tmp" ||
- try_decompress '\3757zXZ\000' abcde unxz "$img" "$tmp" ||
- try_decompress 'BZh' xy bunzip2 "$img" "$tmp" ||
- try_decompress '\135\0\0\0' xxx unlzma "$img" "$tmp" ||
- try_decompress '\211\114\132' xy 'lzop -d' "$img" "$tmp" ||
- try_decompress '\002!L\030' xxx 'lz4 -d' "$img" "$tmp" ||
- try_decompress '(\265/\375' xxx unzstd "$img" "$tmp"
-
- # Finally check for uncompressed images or objects:
- [[ $? -eq 0 ]] && get_vmlinux_size "$tmp" && return 0
-
- # Fallback to use iomem
- local _size=0 _seg
- while read -r _seg; do
- _size=$((_size + 0x${_seg#*-} - 0x${_seg%-*}))
- done <<< "$(grep -E "Kernel (code|rodata|data|bss)" /proc/iomem | cut -d ":" -f 1)"
- echo $_size
-}
diff --git a/kdump.shutdown b/kdump.shutdown
new file mode 100644
index 00000000..19ed7987
--- /dev/null
+++ b/kdump.shutdown
@@ -0,0 +1,13 @@
+#!/bin/sh
+# Trigger a panic for estimation if in an estimation process
+
+ESTIMATE_STATUS_FILE=/kdump-estimate
+
+[ -s "$ESTIMATE_STATUS_FILE" ] || exit 0
+
+. "$ESTIMATE_STATUS_FILE"
+
+if [ "$ESTIMATE_STAGE" = "panic" ]; then
+ echo 1 > /proc/sys/kernel/sysrq
+ echo c > /proc/sysrq-trigger
+fi
diff --git a/kexec-tools.spec b/kexec-tools.spec
index 8065c1af..d95cd3ee 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -45,6 +45,9 @@ Source34: crashkernel-howto.txt
Source35: kdump-migrate-action.sh
Source36: kdump-restart.sh
Source37: kdump-estimate.sh
+Source38: kdump.shutdown
+Source39: kdump-estimate.service
+Source40: kdump-estimate-cleanup.service
#######################################
# These are sources for mkdumpramfs
@@ -174,12 +177,13 @@ mkdir -p -m755 $RPM_BUILD_ROOT%{_mandir}/man8/
mkdir -p -m755 $RPM_BUILD_ROOT%{_mandir}/man5/
mkdir -p -m755 $RPM_BUILD_ROOT%{_docdir}
mkdir -p -m755 $RPM_BUILD_ROOT%{_datadir}/kdump
+mkdir -p -m755 $RPM_BUILD_ROOT%{_sharedstatedir}/kdump
+mkdir -p -m755 $RPM_BUILD_ROOT%{_sharedstatedir}/kdump/kdump-estimate
mkdir -p -m755 $RPM_BUILD_ROOT%{_udevrulesdir}
mkdir -p $RPM_BUILD_ROOT%{_unitdir}
mkdir -p -m755 $RPM_BUILD_ROOT%{_bindir}
mkdir -p -m755 $RPM_BUILD_ROOT%{_libdir}
mkdir -p -m755 $RPM_BUILD_ROOT%{_prefix}/lib/kdump
-mkdir -p -m755 $RPM_BUILD_ROOT%{_sharedstatedir}/kdump
install -m 755 %{SOURCE1} $RPM_BUILD_ROOT%{_bindir}/kdumpctl
install -m 755 build/sbin/kexec $RPM_BUILD_ROOT/usr/sbin/kexec
@@ -200,8 +204,8 @@ install -m 644 %{SOURCE12} $RPM_BUILD_ROOT%{_mandir}/man8/mkdumprd.8
install -m 644 %{SOURCE25} $RPM_BUILD_ROOT%{_mandir}/man8/kdumpctl.8
install -m 755 %{SOURCE20} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-lib.sh
install -m 755 %{SOURCE23} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-lib-initramfs.sh
-install -m 755 %{SOURCE37} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-estimate.sh
install -m 755 %{SOURCE31} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-logger.sh
+install -m 755 %{SOURCE37} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-estimate.sh
%ifarch ppc64 ppc64le
install -m 755 %{SOURCE35} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-migrate-action.sh
install -m 755 %{SOURCE36} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-restart.sh
@@ -219,9 +223,17 @@ install -m 644 %{SOURCE14} $RPM_BUILD_ROOT%{_udevrulesdir}/98-kexec.rules
%endif
install -m 644 %{SOURCE15} $RPM_BUILD_ROOT%{_mandir}/man5/kdump.conf.5
install -m 644 %{SOURCE16} $RPM_BUILD_ROOT%{_unitdir}/kdump.service
+install -m 644 %{SOURCE39} $RPM_BUILD_ROOT%{_unitdir}/kdump-estimate.service
+install -m 644 %{SOURCE40} $RPM_BUILD_ROOT%{_unitdir}/kdump-estimate-cleanup.service
install -m 755 -D %{SOURCE22} $RPM_BUILD_ROOT%{_prefix}/lib/systemd/system-generators/kdump-dep-generator.sh
install -m 755 -D %{SOURCE30} $RPM_BUILD_ROOT%{_prefix}/lib/kernel/install.d/60-kdump.install
install -m 755 -D %{SOURCE33} $RPM_BUILD_ROOT%{_prefix}/lib/kernel/install.d/92-crashkernel.install
+install -m 755 -D %{SOURCE38} $RPM_BUILD_ROOT%{_prefix}/lib/systemd/system-shutdown/kdump.shutdown
+
+mkdir -p $RPM_BUILD_ROOT%{_unitdir}/multi-user.target.wants/
+pushd $RPM_BUILD_ROOT%{_unitdir}/multi-user.target.wants/
+ln -sr ../kdump-estimate.service
+popd
%ifarch %{ix86} x86_64 ppc64 s390x ppc64le aarch64
install -m 755 makedumpfile-%{mkdf_ver}/makedumpfile $RPM_BUILD_ROOT/usr/sbin/makedumpfile
@@ -357,6 +369,7 @@ done
%dir %{_sysconfdir}/kdump/pre.d
%dir %{_sysconfdir}/kdump/post.d
%dir %{_sharedstatedir}/kdump
+%dir %{_sharedstatedir}/kdump/kdump-estimate
%{_mandir}/man8/kdumpctl.8.gz
%{_mandir}/man8/kexec.8.gz
%ifarch %{ix86} x86_64 ppc64 s390x ppc64le aarch64
@@ -366,7 +379,11 @@ done
%{_mandir}/man8/vmcore-dmesg.8.gz
%{_mandir}/man5/*
%{_unitdir}/kdump.service
+%{_unitdir}/kdump-estimate.service
+%{_unitdir}/kdump-estimate-cleanup.service
+%{_unitdir}/multi-user.target.wants/kdump-estimate.service
%{_prefix}/lib/systemd/system-generators/kdump-dep-generator.sh
+%{_prefix}/lib/systemd/system-shutdown/kdump.shutdown
%{_prefix}/lib/kernel/install.d/60-kdump.install
%{_prefix}/lib/kernel/install.d/92-crashkernel.install
%doc News
--
2.31.1