v3:
 - fixes suggested by Philipp
   - fix incorrect usage of kdump_get_arch_recommend_crashkernel
   - s/get_default_crashkernel/get-default-crashkernel
   - no longer depend on grubby --update-kernel=ALL to update all kernels' command line parameters; use a single loop to simplify the code
   - indentation issue fix
   - commit message improvement
 - update crashkernel-howto.txt as suggested by Dave
 - CoreOS support
   - make "kdumpctl reset-crashkernel" work for CoreOS
   - kdumpctl can't be run in an RPM scriptlet, disable it for CoreOS
 - set up kernel crashkernel for osbuild

v2:
 - no longer address the swiotlb memory requirement when SME is enabled
 - automatically reset crashkernel to the default value only when the value was previously set by kexec-tools. So the crashkernel option that was to be added to kdump.conf is replaced by the auto_reset_crashkernel option
 - multiple fixes suggested by Philipp including regex improvement, typo fixes, grubby kernel path check and commit message improvements
 - address the case where a kernel path is not /boot/vmlinuz-{KERNEL_RELEASE}
 - "kdumpctl fadump" dropped. Support fadump via "kdumpctl reset-crashkernel [--fadump=[on|off|nocma]]" instead
The crashkernel=auto implementation in kernel space has been rejected upstream [1]. The current user space implementation [2] [3] ships a crashkernel.default but doesn't support fadump yet. Meanwhile, the crashkernel.default implementation seems to be overly complex:
 - the default kernel crashkernel value rarely changes. There is no need to ship the same crashkernel.default for every kernel package of an architecture;
 - when deciding the value of crashkernel for a new kernel, the crashkernel.default of the installed kernels and of the running kernel is taken into consideration (for the details, check 92-crashkernel.install).
According to Kairui [4], crashkernel.default per kernel package is meant to accommodate kernel differences, for example, different kernels could be built with different configurations and thus need different crashkernel values. But these should be minor cases and may not be sufficient to justify the complexity of 92-crashkernel.install. Currently, we don't know how a kernel debug/feature config would affect the crashkernel value. Even if a kernel config requires a much larger crashkernel, we can address it in kexec-tools later.
There are known cases that could lead to a larger crashkernel, including enabling SME and LUKS encryption. But this patch set puts them aside since they may be taken care of in kernel space instead.
So this patch set simply adds support for fadump and moves the default kernel crashkernel from the kernel package to kexec-tools:
 - provide "kdumpctl get-default-crashkernel" for kdump-anaconda-addon to get the default kernel crashkernel values for a specific architecture (fadump is supported as well)
 - rewrite "kdumpctl reset-crashkernel" to support fadump
 - introduce auto_reset_crashkernel, which determines whether to reset the kernel crashkernel to the new default value when kexec-tools updates the default crashkernel value
Because the kernel hook /usr/lib/kernel/install.d/20-grub.install would make the installed kernel inherit the kernel cmdline of current running kernel i.e. /proc/cmdline, we only need to reset crashkernel when kexec-tools increases the default crashkernel values.
[1] https://lore.kernel.org/linux-mm/20210507010432.IN24PudKT%25akpm@linux-found...
[2] https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1171
[3] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/...
[4] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/...
Coiby Xu (13):
  update default crashkernel value
  factor out kdump_get_arch_recommend_crashkernel
  provide kdumpctl get-default-crashkernel for kdump_anaconda_addon and RPM scriptlet
  add a helper function to read kernel cmdline parameter from grubby --info
  add helper functions to get dump mode
  add helper functions to get kernel path by kernel release and the path of current running kernel
  fix incorrect usage of rpm-ostree to update kernel command line parameters
  rewrite reset_crashkernel to support fadump and to used by RPM scriptlet
  introduce the auto_reset_crashkernel option to kdump.conf
  try to reset kernel crashkernel when kexec-tools updates the default crashkernel value
  reset kernel crashkernel for the special case where the kernel is updated right after kexec-tools
  set up kernel crashkernel for osbuild in kernel hook
  update crashkernel-howto
 92-crashkernel.install | 135 +----------------
 crashkernel-howto.txt  | 123 +++-------------
 kdump-lib.sh           |  60 ++++----
 kdump.conf             |   7 +
 kdump.conf.5           |   6 +
 kdumpctl               | 320 ++++++++++++++++++++++++++++++++++++++---
 kdumpctl.8             |  16 ++-
 kexec-tools.spec       |  15 ++
 8 files changed, 393 insertions(+), 289 deletions(-)
It has been decided to increase default crashkernel value to reduce the possibility of OOM.
Fixes: commit 7b7ddaba88af9bcbbcc3d219e1c7f00b3a61152d ("kdump-lib.sh: kdump_get_arch_recommend_size uses crashkernel.default")
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdump-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kdump-lib.sh b/kdump-lib.sh
index 2e2775c..b8f6c96 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -841,7 +841,7 @@ kdump_get_arch_recommend_size()
 	else
 		arch=$(lscpu | grep Architecture | awk -F ":" '{ print $2 }' | tr '[:lower:]' '[:upper:]')
 		if [[ $arch == "X86_64" ]] || [[ $arch == "S390X" ]]; then
-			ck_cmdline="1G-4G:160M,4G-64G:192M,64G-1T:256M,1T-:512M"
+			ck_cmdline="1G-4G:192M,4G-64G:256M,64G-:512M"
 		elif [[ $arch == "AARCH64" ]]; then
 			ck_cmdline="2G-:448M"
 		elif [[ $arch == "PPC64LE" ]]; then
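To make the effect of the new defaults concrete, here is a minimal sketch of how a range list of the form start-end:size is matched against the amount of system RAM. Both to_mib and pick_crashkernel_size are hypothetical helpers written for this illustration; they are not part of kexec-tools, and the sketch assumes each range includes its start and excludes its end.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of kexec-tools): convert a size such as
# 192M, 4G or 1T to MiB.
to_mib()
{
	local _size=$1 _num=${1%[MGT]}

	case "$_size" in
	*M) echo "$_num" ;;
	*G) echo $((_num * 1024)) ;;
	*T) echo $((_num * 1024 * 1024)) ;;
	esac
}

# Hypothetical helper: print the reservation that a range list selects
# for a given amount of RAM.
# $1: system RAM (e.g. 8G)
# $2: range list (e.g. 1G-4G:192M,4G-64G:256M,64G-:512M)
pick_crashkernel_size()
{
	local _ram _entry _range _size _start _end _entries

	_ram=$(to_mib "$1")
	IFS=',' read -r -a _entries <<< "$2"
	for _entry in "${_entries[@]}"; do
		_range=${_entry%%:*}
		_size=${_entry##*:}
		_start=$(to_mib "${_range%%-*}")
		_end=${_range##*-}
		# An empty end means the range is open-ended.
		if ((_ram >= _start)) && { [[ -z $_end ]] || ((_ram < $(to_mib "$_end"))); }; then
			echo "$_size"
			return 0
		fi
	done
	return 1
}

# Old and new x86_64/s390x defaults for a machine with 8G of RAM
pick_crashkernel_size 8G "1G-4G:160M,4G-64G:192M,64G-1T:256M,1T-:512M" # prints 192M
pick_crashkernel_size 8G "1G-4G:192M,4G-64G:256M,64G-:512M"            # prints 256M
```

A machine below the first range start gets no match, so the helper returns 1 without printing anything.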
Factor out kdump_get_arch_recommend_crashkernel so that other users, for example kdump-anaconda-plugin, can retrieve the default crashkernel value.
Note that support for crashkernel.default is dropped.
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdump-lib.sh | 60 +++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/kdump-lib.sh b/kdump-lib.sh
index b8f6c96..b28db44 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -822,40 +822,52 @@ get_recommend_size()
 	IFS="$OLDIFS"
 }
 
+# get default crashkernel
+# $1 dump mode, if not specified, dump_mode will be judged by is_fadump_capable
+kdump_get_arch_recommend_crashkernel()
+{
+	local _arch _ck_cmdline _dump_mode
+
+	if [[ -z "$1" ]]; then
+		if is_fadump_capable; then
+			_dump_mode=fadump
+		else
+			_dump_mode=kdump
+		fi
+	else
+		_dump_mode=$1
+	fi
+
+	_arch=$(uname -m)
+
+	if [[ $_arch == "x86_64" ]] || [[ $_arch == "s390x" ]]; then
+		_ck_cmdline="1G-4G:192M,4G-64G:256M,64G-:512M"
+	elif [[ $_arch == "aarch64" ]]; then
+		_ck_cmdline="2G-:448M"
+	elif [[ $_arch == "ppc64le" ]]; then
+		if [[ $_dump_mode == "fadump" ]]; then
+			_ck_cmdline="4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G"
+		else
+			_ck_cmdline="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G"
+		fi
+	fi
+
+	_ck_cmdline=${_ck_cmdline//-:/-102400T:}
+	echo -n "$_ck_cmdline"
+}
+
 # return recommended size based on current system RAM size
 # $1: kernel version, if not set, will defaults to $(uname -r)
 kdump_get_arch_recommend_size()
 {
-	local kernel=$1 arch
+	local _ck_cmdline
 
 	if ! [[ -r "/proc/iomem" ]]; then
 		echo "Error, can not access /proc/iomem."
 		return 1
 	fi
-
-	[[ -z $kernel ]] && kernel=$(uname -r)
-	ck_cmdline=$(cat "/usr/lib/modules/$kernel/crashkernel.default" 2> /dev/null)
-
-	if [[ -n $ck_cmdline ]]; then
-		ck_cmdline=${ck_cmdline#crashkernel=}
-	else
-		arch=$(lscpu | grep Architecture | awk -F ":" '{ print $2 }' | tr '[:lower:]' '[:upper:]')
-		if [[ $arch == "X86_64" ]] || [[ $arch == "S390X" ]]; then
-			ck_cmdline="1G-4G:192M,4G-64G:256M,64G-:512M"
-		elif [[ $arch == "AARCH64" ]]; then
-			ck_cmdline="2G-:448M"
-		elif [[ $arch == "PPC64LE" ]]; then
-			if is_fadump_capable; then
-				ck_cmdline="4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G"
-			else
-				ck_cmdline="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G"
-			fi
-		fi
-	fi
-
-	ck_cmdline=${ck_cmdline//-:/-102400T:}
 	sys_mem=$(get_system_size)
-
+	_ck_cmdline=$(kdump_get_arch_recommend_crashkernel)
 	get_recommend_size "$sys_mem" "$ck_cmdline"
 }
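The last step of kdump_get_arch_recommend_crashkernel rewrites every open-ended range so that it carries an explicit upper bound. The sketch below isolates that one parameter expansion; normalize_ranges is a hypothetical wrapper used only for this illustration, and the reading that the bound exists so later range matching can treat every range as closed is an assumption, not something the patch states.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the expansion used in the patch: every
# "N-:" (open-ended range) becomes "N-102400T:".
normalize_ranges()
{
	local _ck_cmdline=$1

	echo -n "${_ck_cmdline//-:/-102400T:}"
}

normalize_ranges "1G-4G:192M,4G-64G:256M,64G-:512M"
# prints 1G-4G:192M,4G-64G:256M,64G-102400T:512M
```

Ranges that already have an end are left untouched because the pattern only matches a dash directly followed by a colon.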
Provide "kdumpctl get-default-crashkernel" for kdump_anaconda_addon so crashkernel.default isn't needed.
When fadump is on, kdump_anaconda_addon would need to specify the dump mode, i.e. "kdumpctl get-default-crashkernel fadump".
This interface would also be used by RPM scriptlet [1] to fetch default crashkernel value.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index 59ec068..1938968 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1299,6 +1299,13 @@ do_estimate()
 	fi
 }
 
+get_default_crashkernel()
+{
+	local _dump_mode=$1
+
+	echo -n "$(kdump_get_arch_recommend_crashkernel "$_dump_mode")"
+}
+
 reset_crashkernel()
 {
 	local kernel=$1 entry crashkernel_default
@@ -1392,6 +1399,9 @@ main()
 	estimate)
 		do_estimate
 		;;
+	get-default-crashkernel)
+		get_default_crashkernel "$2"
+		;;
 	reset-crashkernel)
 		reset_crashkernel "$2"
 		;;
Hi Coiby,
On Thu, 16 Dec 2021 14:36:45 +0800 Coiby Xu coxu@redhat.com wrote:
Provide "kdumpctl get-default-crashkernel" for kdump_anaconda_addon so crashkernel.default isn't needed.
When fadump is on, kdump_anaconda_addon would need to specify the dump mode, i.e. "kdumpctl get-default-crashkernel fadump".
This interface would also be used by RPM scriptlet [1] to fetch default crashkernel value.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index 59ec068..1938968 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1299,6 +1299,13 @@ do_estimate()
 	fi
 }
 
+get_default_crashkernel()
+{
+	local _dump_mode=$1
+
+	echo -n "$(kdump_get_arch_recommend_crashkernel "$_dump_mode")"
You can drop the echo -n here. That's already done in kdump_get_arch_recommend_crashkernel.
Thanks Philipp
+}
+
 reset_crashkernel()
 {
 	local kernel=$1 entry crashkernel_default
@@ -1392,6 +1399,9 @@ main()
 	estimate)
 		do_estimate
 		;;
+	get-default-crashkernel)
+		get_default_crashkernel "$2"
+		;;
 	reset-crashkernel)
 		reset_crashkernel "$2"
 		;;
This helper function will be used to retrieve the value of kernel cmdline parameters such as crashkernel, fadump and swiotlb.
Suggested-by: Philipp Rudo prudo@redhat.com
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index 1938968..7fc3a4d 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1306,6 +1306,17 @@ get_default_crashkernel()
 	echo -n "$(kdump_get_arch_recommend_crashkernel "$_dump_mode")"
 }
 
+# Read kernel cmdline parameter for a specific kernel
+# $1: kernel path, DEFAULT or kernel path, ALL not accepted
+# $2: kernel cmdline parameter
+get_grub_kernel_boot_parameter()
+{
+	local _kernel_path=$1 _para=$2
+
+	[[ $_kernel_path == ALL ]] && derror "kernel_path=ALL invalid for get_grub_kernel_boot_parameter" && return 1
+	grubby --info="$_kernel_path" | sed -En -e "/^args=.*$/{s/^.*(\s|\")${_para}=(\S*).*\"$/\2/p;q}"
+}
+
 reset_crashkernel()
 {
 	local kernel=$1 entry crashkernel_default
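The sed expression can be tried without touching the boot loader by feeding it text in the shape of "grubby --info" output. In the sketch below, sample_info is made-up data and get_parameter_from_info is a hypothetical stand-in that reads from stdin instead of calling grubby; it assumes GNU sed, since \s and \S are GNU extensions.

```bash
#!/usr/bin/env bash

# Made-up text in the shape of "grubby --info" output.
sample_info='index=0
kernel="/boot/vmlinuz-5.14.14-200.fc34.x86_64"
args="ro rhgb quiet crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M fadump=on"
root="/dev/mapper/fedora-root"'

# Hypothetical stand-in for get_grub_kernel_boot_parameter: same sed
# expression, but the grubby output is read from stdin.
# $1: kernel cmdline parameter
get_parameter_from_info()
{
	local _para=$1

	# Only the first args= line is inspected; the parameter must be
	# preceded by whitespace or by the opening quote.
	sed -En -e "/^args=.*$/{s/^.*(\s|\")${_para}=(\S*).*\"$/\2/p;q}"
}

get_parameter_from_info crashkernel <<< "$sample_info" # prints 1G-4G:192M,4G-64G:256M,64G-:512M
get_parameter_from_info fadump <<< "$sample_info"      # prints on
get_parameter_from_info swiotlb <<< "$sample_info"     # prints nothing
```

An absent parameter yields empty output, which is what lets a caller treat an unset fadump as kdump mode.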
Add a helper function to get dump mode. The dump mode would be
 - fadump if fadump=on or fadump=nocma
 - kdump if fadump=off or empty fadump
Otherwise return 1.
Also add another helper function to return a kernel's dump mode.
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index 7fc3a4d..c6046dc 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1317,6 +1317,40 @@ get_grub_kernel_boot_parameter()
 	grubby --info="$_kernel_path" | sed -En -e "/^args=.*$/{s/^.*(\s|\")${_para}=(\S*).*\"$/\2/p;q}"
 }
 
+# get dump mode by fadump value
+# return
+#  - fadump, if fadump=on or fadump=nocma
+#  - kdump, if fadump=off or empty fadump, return kdump
+#  - error if otherwise
+get_dump_mode_by_fadump_val()
+{
+	local _fadump_val=$1
+
+	if [[ -z $_fadump_val ]] || [[ $_fadump_val == off ]]; then
+		echo -n kdump
+	elif [[ $_fadump_val == on ]] || [[ $_fadump_val == nocma ]]; then
+		echo -n fadump
+	else
+		derror "invalid fadump=$_fadump_val"
+		return 1
+	fi
+}
+
+# get dump mode of a specific kernel
+# based on its fadump kernel cmdline parameter
+get_dump_mode_by_kernel()
+{
+	local _kernel_path=$1 _fadump_val _dump_mode
+
+	_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel_path" fadump)
+	if _dump_mode=$(get_dump_mode_by_fadump_val "$_fadump_val"); then
+		echo -n "$_dump_mode"
+	else
+		derror "failed to get dump mode for kernel $_kernel_path"
+		exit
+	fi
+}
+
 reset_crashkernel()
 {
 	local kernel=$1 entry crashkernel_default
grubby --info=kernel-path or --add-kernel=kernel-path accepts a kernel path (e.g. /boot/vmlinuz-5.14.14-200.fc34.x86_64) instead of a kernel release (e.g. 5.14.14-200.fc34.x86_64). So we need to find the kernel path for a given kernel release. Although for Fedora/RHEL the kernel path is "/boot/vmlinuz-<KERNEL_RELEASE>", a kernel path could also be /boot/<machine-id>/<KERNEL_RELEASE>/vmlinuz. So the most reliable way to find the kernel path for a given kernel release is to use "grubby --info".
Note that these helper functions don't support CoreOS/Atomic/Silverblue since grubby isn't used by them.
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index c6046dc..c01f8ae 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1351,6 +1351,36 @@ get_dump_mode_by_kernel()
 	fi
 }
 
+_filter_grubby_kernel_str()
+{
+	local _grubby_kernel_str=$1
+	echo -n "$_grubby_kernel_str" | sed -n -e 's/^kernel="\(.*\)"/\1/p'
+}
+
+_find_kernel_path_by_release()
+{
+	local _release="$1" _grubby_kernel_str _kernel_path
+	_grubby_kernel_str=$(grubby --info ALL | grep "^kernel=.*$_release")
+	_kernel_path=$(_filter_grubby_kernel_str "$_grubby_kernel_str")
+	if [[ ! -e "$_kernel_path" ]]; then
+		derror "kernel $_release doesn't exist"
+		return 1
+	fi
+	echo -n "$_kernel_path"
+}
+
+_get_current_running_kernel_path()
+{
+	local _release _path
+
+	_release=$(uname -r)
+	if _path=$(_find_kernel_path_by_release "$_release"); then
+		echo -n "$_path"
+	else
+		return 1
+	fi
+}
+
 reset_crashkernel()
 {
 	local kernel=$1 entry crashkernel_default
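The release-to-path lookup boils down to a grep for the matching kernel= line followed by stripping the key and the quotes. The following sketch runs that pipeline on made-up text shaped like "grubby --info ALL" output; find_kernel_path_in_info is a hypothetical variant that reads from stdin and skips the check that the file exists.

```bash
#!/usr/bin/env bash

# Made-up text in the shape of "grubby --info ALL" output.
sample_info='index=0
kernel="/boot/vmlinuz-5.14.14-200.fc34.x86_64"
args="ro quiet"
index=1
kernel="/boot/vmlinuz-5.13.12-200.fc34.x86_64"
args="ro quiet"'

# Hypothetical variant of _find_kernel_path_by_release: reads grubby
# output from stdin and does not test whether the path exists.
# $1: kernel release
find_kernel_path_in_info()
{
	local _release=$1

	grep "^kernel=.*$_release" | sed -n -e 's/^kernel="\(.*\)"/\1/p'
}

find_kernel_path_in_info 5.13.12-200.fc34.x86_64 <<< "$sample_info"
# prints /boot/vmlinuz-5.13.12-200.fc34.x86_64
```

Because the release is matched anywhere after kernel=, the same pipeline would also cope with a layout such as /boot/<machine-id>/<KERNEL_RELEASE>/vmlinuz.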
CoreOS/Atomic/Silverblue use "rpm-ostree kargs" to manage kernel command line parameters.
Fixes: commit 86130ec10fda37cc283f717eaacb56a4cbf76418 ("kdumpctl: Add kdumpctl reset-crashkernel")
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index c01f8ae..27971a1 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1396,9 +1396,9 @@ reset_crashkernel()
 
 	if is_atomic; then
 		if rpm-ostree kargs | grep -q "crashkernel="; then
-			rpm-ostree --replace="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --replace="crashkernel=$crashkernel_default"
 		else
-			rpm-ostree --append="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --append="crashkernel=$crashkernel_default"
 		fi
 	else
 		entry=$(grubby --info ALL | grep "^kernel=.*$kernel")
Rewrite
    kdumpctl reset-crashkernel KERNEL_PATH
as
    kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot]

This interface resets a specific kernel to the default crashkernel value given the kernel path. It also supports grubby's syntax, so there are the following special cases:
 - if --kernel is not specified, the current running kernel, i.e. `uname -r`, is chosen
 - if --kernel=DEFAULT, the default boot kernel is chosen
 - if --kernel=ALL, all kernels have their crashkernel reset to the default value and /etc/default/grub is updated as well
--fadump=[on|off|nocma] toggles fadump on/off for the kernel provided in KERNEL_PATH. If --fadump is omitted the dump mode is determined by parsing the kernel command line for the kernel to update.
CoreOS/Atomic/Silverblue need to be treated as a special case because:
 - "rpm-ostree kargs" is used to manage kernel command line parameters, so --kernel doesn't make sense and there is no need to find the current running kernel
 - "rpm-ostree kargs" itself prompts the user to reboot the system after modifying the kernel command line parameters
 - POWER is not supported, so we can assume the dump mode is always kdump
This interface will also be called by kexec-tools RPM scriptlets [1] to reset crashkernel.
Note that support for crashkernel.default is dropped.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl   | 165 +++++++++++++++++++++++++++++++++++++++++++++--------
 kdumpctl.8 |  16 +++---
 2 files changed, 151 insertions(+), 30 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index 27971a1..412420d 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1381,39 +1381,157 @@ _get_current_running_kernel_path()
 	fi
 }
 
-reset_crashkernel()
+_update_grub()
 {
-	local kernel=$1 entry crashkernel_default
-	local grub_etc_default="/etc/default/grub"
-
-	[[ -z $kernel ]] && kernel=$(uname -r)
-	crashkernel_default=$(cat "/usr/lib/modules/$kernel/crashkernel.default" 2> /dev/null)
-
-	if [[ -z $crashkernel_default ]]; then
-		derror "$kernel doesn't have a crashkernel.default"
-		exit 1
-	fi
+	local _kernel_path=$1 _crashkernel=$2 _dump_mode=$3 _fadump_val=$4
 
 	if is_atomic; then
 		if rpm-ostree kargs | grep -q "crashkernel="; then
-			rpm-ostree kargs --replace="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --replace="crashkernel=$_crashkernel"
 		else
-			rpm-ostree kargs --append="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --append="crashkernel=$_crashkernel"
 		fi
 	else
-		entry=$(grubby --info ALL | grep "^kernel=.*$kernel")
-		entry=${entry#kernel=}
-		entry=${entry#\"}
-		entry=${entry%\"}
+		[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
+		grubby --args "crashkernel=$_crashkernel" --update-kernel "$_kernel_path" $zipl_arg
+		if [[ $_dump_mode == kdump ]]; then
+			grubby --remove-args="fadump" --update-kernel "$_kernel_path"
+		else
+			grubby --args="fadump=$_fadump_val" --update-kernel "$_kernel_path"
+		fi
+	fi
+	[[ $zipl_arg ]] && zipl > /dev/null
+}
+
+_valid_grubby_kernel_path()
+{
+	[[ -n "$1" ]] && grubby --info="$1" > /dev/null 2>&1
+}
 
-		if [[ -f $grub_etc_default ]]; then
-			sed -i -e "s/^\(GRUB_CMDLINE_LINUX=.*\)crashkernel=[^\ \"]*\([\ \"].*\)$/\1$crashkernel_default\2/" "$grub_etc_default"
+_get_all_kernels_from_grubby()
+{
+	local _kernels _line _kernel_path _grubby_kernel_path=$1
+
+	for _line in $(grubby --info "$_grubby_kernel_path" | grep "^kernel="); do
+		_kernel_path=$(_filter_grubby_kernel_str "$_line")
+		_kernels="$_kernels $_kernel_path"
+	done
+	echo -n "$_kernels"
+}
+
+# modify the kernel command line parameter in default grub conf
+#
+# $1: the name of the kernel command line parameter
+# $2: new value. If empty, the parameter would be removed
+_update_kernel_cmdline_in_grub_etc_default()
+{
+	local grub_etc_default="/etc/default/grub" _para=$2 _val=$3 _para_val
+
+	[[ -n $val ]] && _para_val="$_para=$_val"
+
+	sed -i -E "s/^(GRUB_CMDLINE_LINUX=.*)([[:space:]\"])crashkernel=[^[:space:]\"]*(.*)$/\1\2$_para_val\3/" "$grub_etc_default"
+}
+
+reset_crashkernel()
+{
+	local _opt _val _dump_mode _fadump_val _reboot _grubby_kernel_path _kernel _kernels
+	local _old_crashkernel _new_crashkernel _new_dump_mode _crashkernel_changed _new_fadump_val
+
+	for _opt in "$@"; do
+		case "$_opt" in
+		--fadump=*)
+			_val=${_opt#*=}
+			if _dump_mode=$(get_dump_mode_by_fadump_val $_val); then
+				_fadump_val=$_val
+			else
+				derror "failed to determine dump mode"
+				exit
+			fi
+			;;
+		--kernel=*)
+			_val=${_opt#*=}
+			if ! _valid_grubby_kernel_path $_val; then
+				derror "Invalid $_opt, please specify a valid kernel path, ALL or DEFAULT"
+				exit
+			fi
+			_grubby_kernel_path=$_val
+			;;
+		--reboot)
+			_reboot=yes
+			;;
+		*)
+			derror "$_opt not recognized"
+			exit 1
+			;;
+		esac
+	done
+
+	# 1. CoreOS uses "rpm-ostree kargs" instead of grubby to manage kernel command
+	#    line. --kernel=ALL doesn't make sense for CoreOS.
+	# 2. CoreOS doesn't support POWER so the dump mode is always kdump.
+	# 3. "rpm-ostree kargs" would prompt the user to reboot the system after
+	#    modifying the kernel command line so there is no need for kexec-tools
+	#    to repeat it.
+	if is_atomic; then
+		_old_crashkernel=$(rpm-ostree kargs | sed -n -E 's/.*(^|\s)crashkernel=(\S*).*/\2/p')
+		_new_dump_mode=kdump
+		_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
+		if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
+			_update_grub "" "$_new_crashkernel" "$_new_dump_mode" ""
+			[[ $_reboot == yes ]] && systemctl reboot
 		fi
+		return
+	fi
 
-		[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
-		grubby --args "$crashkernel_default" --update-kernel "$entry" $zipl_arg
-		[[ $zipl_arg ]] && zipl > /dev/null
+	# Only ppc64le supports fadump. If not specified for ppc64le, the dump
+	# mode would be determined by parsing the kernel command line of the
+	# kernel(s) to be updated
+	if [[ -z $_dump_mode && $(uname -m) != ppc64le ]]; then
+		_dump_mode=kdump
+	fi
+
+	if [[ -n $_dump_mode ]]; then
+		_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_dump_mode")
+		_fadump_val=off
+	fi
+
+	if [[ $_grubby_kernel_path == ALL && -n $_dump_mode ]]; then
+		_update_kernel_cmdline_in_grub_etc_default crashkernel "$_crashkernel"
+		# remove the fadump if fadump is disabled
+		[[ $_fadump_val == off ]] && _fadump_val=""
+		_update_kernel_cmdline_in_grub_etc_default fadump "$_fadump_val"
+	fi
+
+	if [[ -z $_grubby_kernel_path ]]; then
+		if ! _kernel_path=$(_get_current_running_kernel_path); then
+			derror "no running kernel found"
+			exit 1
+		fi
+		_kernels=$_kernel_path
+	else
+		_kernels=$(_get_all_kernels_from_grubby "$_grubby_kernel_path")
 	fi
+
+	for _kernel in $_kernels; do
+		if [[ -z $_dump_mode ]]; then
+			_new_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
+			_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
+			_new_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
+		else
+			_new_dump_mode=$_dump_mode
+			_new_crashkernel=$_crashkernel
+			_new_fadump_val=$_fadump_val
+		fi
+
+		_old_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
+		if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
+			_update_grub "$_kernel" "$_new_crashkernel" "$_new_dump_mode" "$_new_fadump_val"
+			[[ $_reboot != yes ]] && dwarn "For kernel=$_kernel, crashkernel=$_new_crashkernel now. Please reboot the system to take effect."
+			_crashkernel_changed=yes
+		fi
+	done
+
+	[[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot
 }
 
 if [[ ! -f $KDUMP_CONFIG_FILE ]]; then
@@ -1478,7 +1596,8 @@ main()
 		get_default_crashkernel "$2"
 		;;
 	reset-crashkernel)
-		reset_crashkernel "$2"
+		shift
+		reset_crashkernel "$@"
 		;;
 	*)
 		dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
diff --git a/kdumpctl.8 b/kdumpctl.8
index 74be062..fd44bea 100644
--- a/kdumpctl.8
+++ b/kdumpctl.8
@@ -50,13 +50,15 @@ Estimate a suitable crashkernel value for current machine. This is a
 best-effort estimate. It will print a recommanded crashkernel value
 based on current kdump setup, and list some details of memory usage.
 .TP
-.I reset-crashkernel [KERNEL]
-Reset crashkernel value to default value. kdumpctl will try to read
-from /usr/lib/modules/<KERNEL>/crashkernel.default and reset specified
-kernel's crashkernel cmdline value. If no kernel is
-specified, will reset current running kernel's crashkernel value.
-If /usr/lib/modules/<KERNEL>/crashkernel.default doesn't exist, will
-simply exit return 1.
+.I reset-crashkernel [--kernel=path_to_kernel] [--reboot]
+Reset crashkernel to default value recommended by kexec-tools. If no kernel
+is specified, will reset current running kernel's crashkernel value. You can
+also specify --kernel=ALL and --kernel=DEFAULT which have the same meaning as
+grubby's kernel-path=ALL and kernel-path=DEFAULT. ppc64le supports FADump and
+supports an additonal [--fadump=[on|off|nocma]] paramerter to toggle FADump
+on/off. Note that there are many factors would affect the memory requirement,
+there is no guarantee the value recommended by kexec-tools would work for your
+case.
 
 .SH "SEE ALSO"
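The behaviour documented for the /etc/default/grub helper (replace a parameter, or remove it when the new value is empty) can be exercised safely on a scratch file. The sketch below is an illustration under stated assumptions rather than the patch's code: update_cmdline_in_file is a hypothetical variant that takes the parameter name as $1 and the value as $2, as the helper's comment describes, uses the parameter name in the pattern, and accepts the file to edit as an extra $3 so that nothing under /etc is touched. It assumes GNU sed for -i.

```bash
#!/usr/bin/env bash

# Hypothetical variant of _update_kernel_cmdline_in_grub_etc_default.
# $1: name of the kernel command line parameter
# $2: new value; if empty, the parameter is removed
# $3: file to edit (an assumption, so a scratch copy can be used)
update_cmdline_in_file()
{
	local _para=$1 _val=$2 _file=$3 _para_val

	[[ -n $_val ]] && _para_val="$_para=$_val"

	# The parameter must follow whitespace or the opening quote; its old
	# value runs up to the next whitespace or quote.
	sed -i -E "s/^(GRUB_CMDLINE_LINUX=.*)([[:space:]\"])${_para}=[^[:space:]\"]*(.*)$/\1\2${_para_val}\3/" "$_file"
}

tmp=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="rhgb quiet crashkernel=auto fadump=on"' > "$tmp"

update_cmdline_in_file crashkernel "1G-4G:192M,4G-64G:256M,64G-:512M" "$tmp"
update_cmdline_in_file fadump "" "$tmp"
cat "$tmp"
```

Two properties of this approach are worth noting: a parameter that is not already present is not appended, and removing a parameter leaves the separator in front of it in place.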
Hi Coiby,
the patch looks really good. Only a few small nits :)
On Thu, 16 Dec 2021 14:36:50 +0800 Coiby Xu coxu@redhat.com wrote:
Rewrite kdumpctl reset-crashkernel KERNEL_PATH as kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot]
This interface would reset a specific kernel to the default crasherknel
typo s/crasherknel/crashkernel/
value given the kernel path. And it also supports grubby's syntax so there are the following special cases,
- if --kernel not specified, current running kernel, i.e. `uname -r` is chosen
What happens when the user has set KDUMP_KERNELVER in /etc/sysconfig/kdump? My expectation is that it has precedence over uname -r. I.e. if no --kernel is given choose KDUMP_KERNELVER, if that is empty choose uname -r.
- if --kernel=DEFAULT, the default boot kernel is chosen
- if --kernel=ALL, all kernels would have its crashkernel reset to the default value and the /etc/default/grub is updated as well
--fadump=[on|off|nocma] toggles fadump on/off for the kernel provided in KERNEL_PATH. If --fadump is omitted the dump mode is determined by parsing the kernel command line for the kernel to update.
CoreOS/Atomic/Silverblue needs to be treated as a special case because,
- "rpm-ostree kargs" is used to manage kernel command line parameters so --kernel doesn't make sense and there is no need to find current running kernel
- "rpm-ostree kargs" itself would prompt the user to reboot the system after modify the kernel command line parameter
- POWER is not supported so we can assume the dump mode is always kdump
This interface will also be called by kexec-tools RPM scriptlets [1] to reset crashkernel.
Note that support for crashkernel.default is dropped.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl   | 165 +++++++++++++++++++++++++++++++++++++++++++++--------
 kdumpctl.8 |  16 +++---
 2 files changed, 151 insertions(+), 30 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index 27971a1..412420d 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1381,39 +1381,157 @@ _get_current_running_kernel_path()
 	fi
 }
 
-reset_crashkernel()
+_update_grub()
 {
-	local kernel=$1 entry crashkernel_default
-	local grub_etc_default="/etc/default/grub"
-
-	[[ -z $kernel ]] && kernel=$(uname -r)
-	crashkernel_default=$(cat "/usr/lib/modules/$kernel/crashkernel.default" 2> /dev/null)
-
-	if [[ -z $crashkernel_default ]]; then
-		derror "$kernel doesn't have a crashkernel.default"
-		exit 1
-	fi
+	local _kernel_path=$1 _crashkernel=$2 _dump_mode=$3 _fadump_val=$4
 
 	if is_atomic; then
 		if rpm-ostree kargs | grep -q "crashkernel="; then
-			rpm-ostree kargs --replace="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --replace="crashkernel=$_crashkernel"
 		else
-			rpm-ostree kargs --append="crashkernel=$crashkernel_default"
+			rpm-ostree kargs --append="crashkernel=$_crashkernel"
 		fi
 	else
-		entry=$(grubby --info ALL | grep "^kernel=.*$kernel")
-		entry=${entry#kernel=}
-		entry=${entry#\"}
-		entry=${entry%\"}
+		[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
+		grubby --args "crashkernel=$_crashkernel" --update-kernel "$_kernel_path" $zipl_arg
+		if [[ $_dump_mode == kdump ]]; then
+			grubby --remove-args="fadump" --update-kernel "$_kernel_path"
+		else
+			grubby --args="fadump=$_fadump_val" --update-kernel "$_kernel_path"
+		fi
+	fi
+	[[ $zipl_arg ]] && zipl > /dev/null
+}
+
+_valid_grubby_kernel_path()
+{
+	[[ -n "$1" ]] && grubby --info="$1" > /dev/null 2>&1
+}
 
-		if [[ -f $grub_etc_default ]]; then
-			sed -i -e "s/^\(GRUB_CMDLINE_LINUX=.*\)crashkernel=[^\ \"]*\([\ \"].*\)$/\1$crashkernel_default\2/" "$grub_etc_default"
+_get_all_kernels_from_grubby()
+{
+	local _kernels _line _kernel_path _grubby_kernel_path=$1
+
+	for _line in $(grubby --info "$_grubby_kernel_path" | grep "^kernel="); do
+		_kernel_path=$(_filter_grubby_kernel_str "$_line")
+		_kernels="$_kernels $_kernel_path"
+	done
+	echo -n "$_kernels"
+}
+
+# modify the kernel command line parameter in default grub conf
+#
+# $1: the name of the kernel command line parameter
+# $2: new value. If empty, the parameter would be removed
+_update_kernel_cmdline_in_grub_etc_default()
+{
+	local grub_etc_default="/etc/default/grub" _para=$2 _val=$3 _para_val
+
+	[[ -n $val ]] && _para_val="$_para=$_val"
+
+	sed -i -E "s/^(GRUB_CMDLINE_LINUX=.*)([[:space:]\"])crashkernel=[^[:space:]\"]*(.*)$/\1\2$_para_val\3/" "$grub_etc_default"
+}
+
+reset_crashkernel()
+{
+	local _opt _val _dump_mode _fadump_val _reboot _grubby_kernel_path _kernel _kernels
+	local _old_crashkernel _new_crashkernel _new_dump_mode _crashkernel_changed _new_fadump_val
+
+	for _opt in "$@"; do
+		case "$_opt" in
+		--fadump=*)
+			_val=${_opt#*=}
+			if _dump_mode=$(get_dump_mode_by_fadump_val $_val); then
+				_fadump_val=$_val
+			else
+				derror "failed to determine dump mode"
+				exit
+			fi
+			;;
+		--kernel=*)
+			_val=${_opt#*=}
+			if ! _valid_grubby_kernel_path $_val; then
+				derror "Invalid $_opt, please specify a valid kernel path, ALL or DEFAULT"
+				exit
+			fi
+			_grubby_kernel_path=$_val
+			;;
+		--reboot)
+			_reboot=yes
+			;;
+		*)
+			derror "$_opt not recognized"
+			exit 1
+			;;
+		esac
+	done
+
+	# 1. CoreOS uses "rpm-ostree kargs" instead of grubby to manage kernel command
+	#    line. --kernel=ALL doesn't make sense for CoreOS.
+	# 2. CoreOS doesn't support POWER so the dump mode is always kdump.
+	# 3. "rpm-ostree kargs" would prompt the user to reboot the system after
+	#    modifying the kernel command line so there is no need for kexec-tools
+	#    to repeat it.
+	if is_atomic; then
+		_old_crashkernel=$(rpm-ostree kargs | sed -n -E 's/.*(^|\s)crashkernel=(\S*).*/\2/p')
+		_new_dump_mode=kdump
+		_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
+		if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
+			_update_grub "" "$_new_crashkernel" "$_new_dump_mode" ""
+			[[ $_reboot == yes ]] && systemctl reboot
 		fi
+		return
+	fi
 
-		[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
-		grubby --args "$crashkernel_default" --update-kernel "$entry" $zipl_arg
-		[[ $zipl_arg ]] && zipl > /dev/null
+	# Only ppc64le supports fadump. If not specified for ppc64le, the dump
+	# mode would be determined by parsing the kernel command line of the
+	# kernel(s) to be updated
+	if [[ -z $_dump_mode && $(uname -m) != ppc64le ]]; then
+		_dump_mode=kdump
+	fi
I don't think this will work. The way I see it $_dump_mode will always be set after this if-block. But that will make all tests for -n/-z $_dump_mode predetermined. Which is not what you want. E.g. ...
+	if [[ -n $_dump_mode ]]; then
+		_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_dump_mode")
+		_fadump_val=off
+	fi
... this will always turn off fadump, regardless of what the user gave on the command line.
+	if [[ $_grubby_kernel_path == ALL && -n $_dump_mode ]]; then
+		_update_kernel_cmdline_in_grub_etc_default crashkernel "$_crashkernel"
+		# remove the fadump if fadump is disabled
+		[[ $_fadump_val == off ]] && _fadump_val=""
+		_update_kernel_cmdline_in_grub_etc_default fadump "$_fadump_val"
+	fi
+
+	if [[ -z $_grubby_kernel_path ]]; then
+		if ! _kernel_path=$(_get_current_running_kernel_path); then
+			derror "no running kernel found"
+			exit 1
+		fi
+		_kernels=$_kernel_path
+	else
+		_kernels=$(_get_all_kernels_from_grubby "$_grubby_kernel_path")
 	fi
+
+	for _kernel in $_kernels; do
+		if [[ -z $_dump_mode ]]; then
+			_new_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
+			_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
+			_new_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
+		else
+			_new_dump_mode=$_dump_mode
+			_new_crashkernel=$_crashkernel
+			_new_fadump_val=$_fadump_val
+		fi
+
+		_old_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
+		if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
+			_update_grub "$_kernel" "$_new_crashkernel" "$_new_dump_mode" "$_new_fadump_val"
+			[[ $_reboot != yes ]] && dwarn "For kernel=$_kernel, crashkernel=$_new_crashkernel now. Please reboot the system to take effect."
The error message is a little bit clumsy. I would use
Updated crashkernel=$_new_crashkernel parameter for kernel=$_kernel. Please reboot the system for the change to take effect.
_crashkernel_changed=yes
fi
- done
- [[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot
}
 if [[ ! -f $KDUMP_CONFIG_FILE ]]; then
@@ -1478,7 +1596,8 @@ main()
 get_default_crashkernel "$2"
 ;;
 reset-crashkernel)
-reset_crashkernel "$2"
+shift
+reset_crashkernel "$@"
 ;;
 *)
 dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
diff --git a/kdumpctl.8 b/kdumpctl.8
index 74be062..fd44bea 100644
--- a/kdumpctl.8
+++ b/kdumpctl.8
@@ -50,13 +50,15 @@
 Estimate a suitable crashkernel value for current machine. This is a best-effort estimate. It will print a recommanded crashkernel value based on current kdump setup, and list some details of memory usage.
 .TP
-.I reset-crashkernel [KERNEL]
-Reset crashkernel value to default value. kdumpctl will try to read
-from /usr/lib/modules/<KERNEL>/crashkernel.default and reset specified
-kernel's crashkernel cmdline value. If no kernel is
-specified, will reset current running kernel's crashkernel value.
-If /usr/lib/modules/<KERNEL>/crashkernel.default doesn't exist, will
-simply exit return 1.
+.I reset-crashkernel [--kernel=path_to_kernel] [--reboot]
+Reset crashkernel to default value recommended by kexec-tools. If no kernel
+is specified, will reset current running kernel's crashkernel value. You can
+also specify --kernel=ALL and --kernel=DEFAULT which have the same meaning as
+grubby's kernel-path=ALL and kernel-path=DEFAULT. ppc64le supports FADump and
+supports an additonal [--fadump=[on|off|nocma]] paramerter to toggle FADump
typo s/paramerter/parameter/
+on/off. Note that there are many factors would affect the memory requirement,
+there is no guarantee the value recommended by kexec-tools would work for your
+case.
I would rephrase the last sentence. How about
Note: The memory requirements for kdump vary heavily depending on the hardware and system configuration used. Thus the recommended crashkernel value might not work for your specific setup. Please test whether kdump works after resetting the crashkernel value.
Thanks Philipp
.SH "SEE ALSO"
Hi Philipp,
On Wed, Dec 22, 2021 at 05:28:19PM +0100, Philipp Rudo wrote:
Hi Coiby,
the patch looks really good. Only a few small nits :)
Thanks for reviewing the patch!
On Thu, 16 Dec 2021 14:36:50 +0800 Coiby Xu coxu@redhat.com wrote:
Rewrite kdumpctl reset-crashkernel KERNEL_PATH as kdumpctl reset-crashkernel [--fadump=[on|off|nocma]] [--kernel=path_to_kernel] [--reboot]
This interface would reset a specific kernel to the default crasherknel
typo s/crasherknel/crashkernel/
value given the kernel path. And it also supports grubby's syntax so there are the following special cases,
- if --kernel not specified, current running kernel, i.e. `uname -r` is chosen
What happens when the user has set KDUMP_KERNELVER in /etc/sysconfig/kdump? My expectation is that it has precedence over uname -r. I.e. if no --kernel is given choose KDUMP_KERNELVER, if that is empty choose uname -r.
Nice catch for these two issues. I will fix it in v4.
- if --kernel=DEFAULT, the default boot kernel is chosen
- if --kernel=ALL, all kernels would have its crashkernel reset to the default value and the /etc/default/grub is updated as well
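The selection rules listed here, together with the KDUMP_KERNELVER precedence suggested in the review comment, could be sketched roughly as follows. This is only an illustration: `select_kernel_target` and its output format are hypothetical and not part of the patch, and how v4 actually handles KDUMP_KERNELVER is not shown in this thread.

```bash
# Hypothetical helper: decide which kernel reset-crashkernel should act on.
# $1: value given to --kernel (a path, ALL or DEFAULT; may be empty)
# $2: KDUMP_KERNELVER from /etc/sysconfig/kdump (may be empty)
# $3: release of the running kernel, e.g. the output of `uname -r`
# Prints "path <value>" when --kernel was given, "release <version>" otherwise.
select_kernel_target() {
    local opt_kernel=$1 kdump_kernelver=$2 running=$3
    if [ -n "$opt_kernel" ]; then
        # An explicit --kernel always wins and is handed to grubby unchanged
        printf 'path %s\n' "$opt_kernel"
    elif [ -n "$kdump_kernelver" ]; then
        # Assumed precedence: KDUMP_KERNELVER before the running kernel
        printf 'release %s\n' "$kdump_kernelver"
    else
        printf 'release %s\n' "$running"
    fi
}

select_kernel_target "" "" "$(uname -r)"
```

Keeping the decision in one function makes the precedence easy to check without touching the boot loader configuration.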
--fadump=[on|off|nocma] toggles fadump on/off for the kernel provided in KERNEL_PATH. If --fadump is omitted the dump mode is determined by parsing the kernel command line for the kernel to update.
CoreOS/Atomic/Silverblue needs to be treated as a special case because,
- "rpm-ostree kargs" is used to manage kernel command line parameters so --kernel doesn't make sense and there is no need to find current running kernel
- "rpm-ostree kargs" itself would prompt the user to reboot the system after modifying the kernel command line parameter
- POWER is not supported so we can assume the dump mode is always kdump
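On CoreOS the patch reads the current crashkernel value from the output of "rpm-ostree kargs" with a sed expression. The extraction step can be tried in isolation on a sample string; the wrapper name `extract_crashkernel` and the sample command line are made up for illustration, and `\s`/`\S` are assumed to be available as GNU sed extensions.

```bash
# Extract the crashkernel value from a kernel command line string.
# The sed expression is the one used in the quoted patch; the wrapper is hypothetical.
extract_crashkernel() {
    printf '%s\n' "$1" | sed -n -E 's/.*(^|\s)crashkernel=(\S*).*/\2/p'
}

# Made-up command line, shaped like "rpm-ostree kargs" output
extract_crashkernel "root=UUID=abcd ro crashkernel=1G-4G:192M,4G-64G:256M rhgb quiet"
```

If the parameter is absent nothing is printed, so the caller sees an empty old value and treats the recommended value as a change.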
This interface will also be called by kexec-tools RPM scriptlets [1] to reset crashkernel.
Note the support of crashkernel.default is dropped.
[1] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
 kdumpctl   | 165 +++++++++++++++++++++++++++++++++++++++++++++--------
 kdumpctl.8 |  16 +++---
 2 files changed, 151 insertions(+), 30 deletions(-)
diff --git a/kdumpctl b/kdumpctl index 27971a1..412420d 100755 --- a/kdumpctl +++ b/kdumpctl @@ -1381,39 +1381,157 @@ _get_current_running_kernel_path() fi }
-reset_crashkernel()
+_update_grub()
 {
- local kernel=$1 entry crashkernel_default
- local grub_etc_default="/etc/default/grub"
- [[ -z $kernel ]] && kernel=$(uname -r)
- crashkernel_default=$(cat "/usr/lib/modules/$kernel/crashkernel.default" 2> /dev/null)
- if [[ -z $crashkernel_default ]]; then
derror "$kernel doesn't have a crashkernel.default"
exit 1
- fi
local _kernel_path=$1 _crashkernel=$2 _dump_mode=$3 _fadump_val=$4
if is_atomic; then
if rpm-ostree kargs | grep -q "crashkernel="; then
-rpm-ostree kargs --replace="crashkernel=$crashkernel_default"
+rpm-ostree kargs --replace="crashkernel=$_crashkernel"
else
-rpm-ostree kargs --append="crashkernel=$crashkernel_default"
+rpm-ostree kargs --append="crashkernel=$_crashkernel"
fi
else
entry=$(grubby --info ALL | grep "^kernel=.*$kernel")
entry=${entry#kernel=}
entry=${entry#\"}
entry=${entry%\"}
[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
grubby --args "crashkernel=$_crashkernel" --update-kernel "$_kernel_path" $zipl_arg
if [[ $_dump_mode == kdump ]]; then
grubby --remove-args="fadump" --update-kernel "$_kernel_path"
else
grubby --args="fadump=$_fadump_val" --update-kernel "$_kernel_path"
fi
- fi
- [[ $zipl_arg ]] && zipl > /dev/null
+}
+_valid_grubby_kernel_path()
+{
- [[ -n "$1" ]] && grubby --info="$1" > /dev/null 2>&1
+}
if [[ -f $grub_etc_default ]]; then
sed -i -e "s/^\(GRUB_CMDLINE_LINUX=.*\)crashkernel=[^\ \"]*\([\ \"].*\)$/\1$crashkernel_default\2/" "$grub_etc_default"
+_get_all_kernels_from_grubby()
+{
- local _kernels _line _kernel_path _grubby_kernel_path=$1
- for _line in $(grubby --info "$_grubby_kernel_path" | grep "^kernel="); do
_kernel_path=$(_filter_grubby_kernel_str "$_line")
_kernels="$_kernels $_kernel_path"
- done
- echo -n "$_kernels"
+}
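The body of `_filter_grubby_kernel_str` is not shown in this thread. Judging from the removed `entry=${entry#kernel=}` lines, it presumably strips the `kernel=` prefix and the surrounding quotes; a stand-in under that assumption might look like this. The function name and the sample text are hypothetical.

```bash
# Hypothetical stand-in for _filter_grubby_kernel_str:
# turn 'kernel="/boot/vmlinuz-X"' into '/boot/vmlinuz-X'.
strip_grubby_kernel_line() {
    local line=$1
    line=${line#kernel=}   # drop the key
    line=${line#\"}        # drop the leading quote
    line=${line%\"}        # drop the trailing quote
    printf '%s\n' "$line"
}

# Made-up text shaped like "grubby --info ALL" output
sample='index=0
kernel="/boot/vmlinuz-5.15.0"
args="ro crashkernel=256M"
index=1
kernel="/boot/vmlinuz-5.14.0"'

# Reading line by line avoids word splitting on unusual paths
printf '%s\n' "$sample" | grep '^kernel=' | while IFS= read -r line; do
    strip_grubby_kernel_line "$line"
done
```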
+# modify the kernel command line parameter in default grub conf
+#
+# $1: the name of the kernel command line parameter
+# $2: new value. If empty, the parameter would be removed
+_update_kernel_cmdline_in_grub_etc_default()
+{
- local grub_etc_default="/etc/default/grub" _para=$2 _val=$3 _para_val
- [[ -n $val ]] && _para_val="$_para=$_val"
- sed -i -E "s/^(GRUB_CMDLINE_LINUX=.*)([[:space:]"])crashkernel=[^[:space:]"]*(.*)$/\1\2$_para_val\3/" "$grub_etc_default"
+}
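As quoted, this helper reads its arguments from $2 and $3 although its comment and its callers use $1 and $2, tests `$val` rather than `$_val`, and hard-codes `crashkernel` in the sed expression, so the `fadump` call would not do what the comment describes. A parameterized variant might look like the sketch below. It is not the author's fix: the name `update_cmdline_param` and the extra file argument are hypothetical, GNU `sed -i` is assumed, and the value is assumed to contain no `/`, `&` or backslash.

```bash
# Hypothetical variant of the helper, with the file passed in so it can be tried on a copy.
# $1: file to edit, $2: parameter name, $3: new value (empty removes the parameter)
update_cmdline_param() {
    local file=$1 para=$2 val=$3 para_val=""
    if [ -n "$val" ]; then
        para_val="$para=$val"
    fi
    # Only an existing "para=..." inside GRUB_CMDLINE_LINUX is rewritten
    sed -i -E "s/^(GRUB_CMDLINE_LINUX=.*)([[:space:]\"])${para}=[^[:space:]\"]*(.*)$/\1\2${para_val}\3/" "$file"
}

tmp=$(mktemp)
printf '%s\n' 'GRUB_CMDLINE_LINUX="rhgb crashkernel=auto quiet"' > "$tmp"
update_cmdline_param "$tmp" crashkernel "1G-4G:192M,4G-64G:256M"
cat "$tmp"
rm -f "$tmp"
```

Like the quoted version, this sketch does not append a parameter that is missing from the line.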
+reset_crashkernel()
+{
- local _opt _val _dump_mode _fadump_val _reboot _grubby_kernel_path _kernel _kernels
- local _old_crashkernel _new_crashkernel _new_dump_mode _crashkernel_changed _new_fadump_val
- for _opt in "$@"; do
case "$_opt" in
--fadump=*)
_val=${_opt#*=}
if _dump_mode=$(get_dump_mode_by_fadump_val $_val); then
_fadump_val=$_val
else
derror "failed to determine dump mode"
exit
fi
;;
--kernel=*)
_val=${_opt#*=}
if ! _valid_grubby_kernel_path $_val; then
derror "Invalid $_opt, please specify a valid kernel path, ALL or DEFAULT"
exit
fi
_grubby_kernel_path=$_val
;;
--reboot)
_reboot=yes
;;
*)
derror "$_opt not recognized"
exit 1
;;
esac
- done
- # 1. CoreOS uses "rpm-ostree kargs" instead of grubby to manage kernel command
- # line. --kernel=ALL doesn't make sense for CoreOS.
- # 2. CoreOS doesn't support POWER so the dump mode is always kdump.
- # 3. "rpm-ostree kargs" would prompt the user to reboot the system after
- # modifying the kernel command line so there is no need for kexec-tools
- # to repeat it.
- if is_atomic; then
_old_crashkernel=$(rpm-ostree kargs | sed -n -E 's/.*(^|\s)crashkernel=(\S*).*/\2/p')
_new_dump_mode=kdump
_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
_update_grub "" "$_new_crashkernel" "$_new_dump_mode" ""
[[ $_reboot == yes ]] && systemctl reboot
fi
return
- fi
[[ -f /etc/zipl.conf ]] && zipl_arg="--zipl"
grubby --args "$crashkernel_default" --update-kernel "$entry" $zipl_arg
[[ $zipl_arg ]] && zipl > /dev/null
- # Only ppc64le supports fadump. If not specified for ppc64le, the dump
- # mode would be determined by parsing the kernel command line of the
- # kernel(s) to be updated
- if [[ -z $_dump_mode && $(uname -m) != ppc64le ]]; then
_dump_mode=kdump
- fi
I don't think this will work. The way I see it $_dump_mode will always be set after this if-block. But that will make all tests for -n/-z $_dump_mode predetermined. Which is not what you want. E.g. ...
_dump_mode is set to kdump only for non-ppc systems. So this is something I want.
- if [[ -n $_dump_mode ]]; then
_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_dump_mode")
_fadump_val=off
- fi
... this will always turn off fadump, independent of what the user gave on the command line.
- if [[ $_grubby_kernel_path == ALL && -n $_dump_mode ]]; then
_update_kernel_cmdline_in_grub_etc_default crashkernel "$_crashkernel"
# remove the fadump if fadump is disabled
[[ $_fadump_val == off ]] && _fadump_val=""
_update_kernel_cmdline_in_grub_etc_default fadump "$_fadump_val"
- fi
- if [[ -z $_grubby_kernel_path ]]; then
if ! _kernel_path=$(_get_current_running_kernel_path); then
derror "no running kernel found"
exit 1
fi
_kernels=$_kernel_path
- else
_kernels=$(_get_all_kernels_from_grubby "$_grubby_kernel_path")
- fi
- for _kernel in $_kernels; do
if [[ -z $_dump_mode ]]; then
_new_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
_new_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_new_dump_mode")
_new_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
else
_new_dump_mode=$_dump_mode
_new_crashkernel=$_crashkernel
_new_fadump_val=$_fadump_val
fi
_old_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
if [[ $_old_crashkernel != "$_new_crashkernel" ]]; then
_update_grub "$_kernel" "$_new_crashkernel" "$_new_dump_mode" "$_new_fadump_val"
[[ $_reboot != yes ]] && dwarn "For kernel=$_kernel, crashkernel=$_new_crashkernel now. Please reboot the system to take effect."
The error message is a little bit clumsy. I would use
Updated crashkernel=$_new_crashkernel parameter for kernel=$_kernel. Please reboot the system for the change to take effect.
_crashkernel_changed=yes
fi
- done
- [[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot
}
 if [[ ! -f $KDUMP_CONFIG_FILE ]]; then
@@ -1478,7 +1596,8 @@ main()
 get_default_crashkernel "$2"
 ;;
 reset-crashkernel)
-reset_crashkernel "$2"
+shift
+reset_crashkernel "$@"
 ;;
 *)
 dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
diff --git a/kdumpctl.8 b/kdumpctl.8
index 74be062..fd44bea 100644
--- a/kdumpctl.8
+++ b/kdumpctl.8
@@ -50,13 +50,15 @@
 Estimate a suitable crashkernel value for current machine. This is a best-effort estimate. It will print a recommanded crashkernel value based on current kdump setup, and list some details of memory usage.
 .TP
-.I reset-crashkernel [KERNEL]
-Reset crashkernel value to default value. kdumpctl will try to read
-from /usr/lib/modules/<KERNEL>/crashkernel.default and reset specified
-kernel's crashkernel cmdline value. If no kernel is
-specified, will reset current running kernel's crashkernel value.
-If /usr/lib/modules/<KERNEL>/crashkernel.default doesn't exist, will
-simply exit return 1.
+.I reset-crashkernel [--kernel=path_to_kernel] [--reboot]
+Reset crashkernel to default value recommended by kexec-tools. If no kernel
+is specified, will reset current running kernel's crashkernel value. You can
+also specify --kernel=ALL and --kernel=DEFAULT which have the same meaning as
+grubby's kernel-path=ALL and kernel-path=DEFAULT. ppc64le supports FADump and
+supports an additonal [--fadump=[on|off|nocma]] paramerter to toggle FADump
typo s/paramerter/parameter/
+on/off. Note that there are many factors would affect the memory requirement,
+there is no guarantee the value recommended by kexec-tools would work for your
+case.
I would rephrase the last sentence. How about
Note: The memory requirements for kdump vary heavily depending on the hardware and system configuration used. Thus the recommended crashkernel value might not work for your specific setup. Please test whether kdump works after resetting the crashkernel value.
Thanks for the above three improvements!
Thanks Philipp
.SH "SEE ALSO"
On Thu, Dec 23, 2021 at 12:22:54PM +0800, Coiby Xu wrote:
Hi Philipp,
[...]
- # Only ppc64le supports fadump. If not specified for ppc64le, the dump
- # mode would be determined by parsing the kernel command line of the
- # kernel(s) to be updated
- if [[ -z $_dump_mode && $(uname -m) != ppc64le ]]; then
_dump_mode=kdump
- fi
I don't think this will work. The way I see it $_dump_mode will always be set after this if-block. But that will make all tests for -n/-z $_dump_mode predetermined. Which is not what you want. E.g. ...
_dump_mode is set to kdump only for non-ppc systems. So this is something I want.
- if [[ -n $_dump_mode ]]; then
_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_dump_mode")
_fadump_val=off
- fi
... this will always turn off fadump, independent of what the user gave on the command line.
But the above statement is correct. I'll move _fadump_val=off elsewhere, i.e.,

+	# For non-ppc64le systems, the dump mode is always kdump since only ppc64le
+	# has FADump.
+	if [[ -z $_dump_mode && $(uname -m) != ppc64le ]]; then
+		_dump_mode=kdump
+		_fadump_val=off
+	fi
+
+	# If the dump mode is determined, we can also know the default crashkernel value
+	if [[ -n $_dump_mode ]]; then
+		_crashkernel=$(kdump_get_arch_recommend_crashkernel "$_dump_mode")
+	fi
Thanks for catching this problem!
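To see why moving `_fadump_val=off` into the architecture check addresses the concern, the proposed logic can be condensed into a small function and run with a few inputs. This is only a sketch: `resolve_dump_mode` is a hypothetical name, and the architecture is passed as an argument instead of being taken from `uname -m` so that the ppc64le case can be tried on any machine.

```bash
# Hypothetical condensation of the proposed option handling.
# $1: machine architecture (what `uname -m` would print)
# $2: dump mode derived from --fadump, empty if the option was not given
# $3: fadump value from --fadump, empty if the option was not given
# Prints "<dump_mode> <fadump_val>"; "-" means "decided later, per kernel".
resolve_dump_mode() {
    local arch=$1 dump_mode=$2 fadump_val=$3
    # Only ppc64le has FADump, so everything else is forced to kdump
    if [ -z "$dump_mode" ] && [ "$arch" != ppc64le ]; then
        dump_mode=kdump
        fadump_val=off
    fi
    printf '%s %s\n' "${dump_mode:--}" "${fadump_val:--}"
}

resolve_dump_mode x86_64 "" ""
resolve_dump_mode ppc64le fadump on
resolve_dump_mode ppc64le "" ""
```

With this ordering a user-supplied --fadump value on ppc64le is no longer overwritten, and an unset dump mode on ppc64le is still left to be determined from each kernel's command line.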
This option determines whether to reset a kernel's crashkernel to the new default value when kexec-tools updates the default crashkernel value and existing kernels are still using the old default value. It defaults to yes.
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdump.conf   | 7 +++++++
 kdump.conf.5 | 6 ++++++
 2 files changed, 13 insertions(+)
diff --git a/kdump.conf b/kdump.conf
index dea2e94..91e6928 100644
--- a/kdump.conf
+++ b/kdump.conf
@@ -11,6 +11,12 @@
 #
 # Supported options:
 #
+# auto_reset_crashkernel <yes|no>
+# - whether to reset kernel crashkernel to new default value
+# or not when kexec-tools updates the default crashkernel value and
+# existing kernels using the old default kernel crashkernel value.
+# The default value is yes.
+#
 # raw <partition>
 # - Will dd /proc/vmcore into <partition>.
 # Use persistent device names for partition devices,
@@ -170,6 +176,7 @@
 #ssh user@my.server.com
 #ssh user@2001:db8::1:2:3:4
 #sshkey /root/.ssh/kdump_id_rsa
+auto_reset_crashkernel yes
 path /var/crash
 core_collector makedumpfile -l --message-level 7 -d 31
 #core_collector scp
diff --git a/kdump.conf.5 b/kdump.conf.5
index 6e6cafa..e3e9900 100644
--- a/kdump.conf.5
+++ b/kdump.conf.5
@@ -26,6 +26,12 @@ understand how this configuration file affects the behavior of kdump.
.SH OPTIONS
+.B auto_reset_crashkernel <yes|no>
+.RS
+determine whether to reset kernel crashkernel to new default value
+or not when kexec-tools updates the default crashkernel value and
+existing kernels using the old default kernel crashkernel value
+
 .B raw <partition>
 .RS
 Will dd /proc/vmcore into <partition>. Use persistent device names for
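The auto_reset_crashkernel option is treated as enabled unless it is explicitly set to no. A minimal sketch of reading such a "key value" option with a fallback is given here; `conf_val_or_default` is a hypothetical helper written for illustration and is not kexec-tools' `kdump_get_conf_val`.

```bash
# Hypothetical reader for a "key value" option in a kdump.conf-style file.
# $1: config file, $2: option name, $3: default used when the option is absent
conf_val_or_default() {
    local file=$1 key=$2 default=$3 val
    # Take the last uncommented occurrence, cut trailing comments and whitespace
    val=$(sed -n -e "s/^${key}[[:space:]]\{1,\}\([^#]*\).*$/\1/p" "$file" 2> /dev/null |
        tail -n 1 | sed -e 's/[[:space:]]*$//')
    printf '%s\n' "${val:-$default}"
}

tmp=$(mktemp)
printf '%s\n' '#auto_reset_crashkernel no' 'path /var/crash' > "$tmp"
# The only occurrence is commented out, so the default applies
conf_val_or_default "$tmp" auto_reset_crashkernel yes
rm -f "$tmp"
```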
kexec-tools could update the default crashkernel value. When auto_reset_crashkernel=yes, reset kernel to new crashkernel value in the following two cases,
- crashkernel=auto is found in the kernel cmdline
- the kernel crashkernel was previously set by kexec-tools i.e. the kernel is using old default crashkernel value

To tell if the user is using a custom value for the kernel crashkernel or not, we assume the user would never use the default crashkernel value as custom value. When kexec-tools gets updated,
1. save the default crashkernel value of the older package to /tmp/crashkernel (for POWER system, /tmp/crashkernel_fadump is saved as well).
2. If auto_reset_crashkernel=yes, iterate over all installed kernels. For each kernel, compare its crashkernel value with the old default crashkernel and reset it if they match

The implementation makes use of two RPM scriptlets [2],
- %pre is run before a package is installed so we can use it to save old default crashkernel value
- %post is run after a package is installed so we can use it to try to reset kernel crashkernel
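The reset rule described in this commit message can be written down as a small predicate. The sketch below mirrors the conditions in `reset_crashkernel_after_update`, but the function name `should_reset_crashkernel` is hypothetical and the crashkernel values are made up.

```bash
# Hypothetical predicate for the auto-reset rule.
# $1: crashkernel value currently on the kernel command line
# $2: default value of the old package (saved by %pre)
# $3: default value of the newly installed package
# Returns 0 if the kernel should be reset, 1 otherwise.
should_reset_crashkernel() {
    local current=$1 old_default=$2 new_default=$3
    [ -n "$current" ] || return 1          # no crashkernel set: leave it alone
    [ "$current" = auto ] && return 0      # crashkernel=auto is always replaced
    # Otherwise reset only an untouched old default that has actually changed
    [ "$current" = "$old_default" ] && [ "$new_default" != "$old_default" ]
}

for current in auto 192M 512M; do
    if should_reset_crashkernel "$current" 192M 256M; then
        echo "$current: reset"
    else
        echo "$current: keep"
    fi
done
```

A custom value such as 512M is kept because it differs from the saved old default, which is the assumption the commit message relies on.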
There are several problems when running kdumpctl in the RPM scripts for CoreOS/Atomic/Silverblue, for example, the lock can't be acquired by kdumpctl, "rpm-ostree kargs" can't be run, etc. So don't enable this feature for CoreOS/Atomic/Silverblue.
Note latest shellcheck (0.8.0) gives false positives about the associative array as of this commit. And Fedora's shellcheck is 0.7.2 and can't even correctly parse the shell code because of the associative array.
[1] https://github.com/koalaman/shellcheck/issues/2399
[2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
---
 kdumpctl         | 35 +++++++++++++++++++++++++++++++++++
 kexec-tools.spec | 15 +++++++++++++++
 2 files changed, 50 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index 412420d..a6582f8 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1534,6 +1534,36 @@ reset_crashkernel()
 	[[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot
 }
 
+# shellcheck disable=SC2154 # false positive when dereferencing an array
+reset_crashkernel_after_update()
+{
+	local _kernel _crashkernel _dump_mode _fadump_val _old_default_crashkernel _new_default_crashkernel
+	declare -A _crashkernel_vals
+
+	_crashkernel_vals[old_kdump]=$(cat /tmp/old_default_crashkernel 2> /dev/null)
+	_crashkernel_vals[old_fadump]=$(cat /tmp/old_default_crashkernel_fadump 2> /dev/null)
+	_crashkernel_vals[new_kdump]=$(get_default_crashkernel kdump)
+	_crashkernel_vals[new_fadump]=$(get_default_crashkernel fadump)
+
+	for _kernel in $(_get_all_kernels_from_grubby); do
+		_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
+		if [[ $_crashkernel == auto ]]; then
+			reset_crashkernel "--kernel=$_kernel"
+		elif [[ -n $_crashkernel ]]; then
+			_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
+			_old_default_crashkernel=${_crashkernel_vals[old_${_dump_mode}]}
+			_new_default_crashkernel=${_crashkernel_vals[new_${_dump_mode}]}
+			if [[ $_crashkernel == "$_old_default_crashkernel" ]] &&
+				[[ $_new_default_crashkernel != "$_old_default_crashkernel" ]]; then
+				_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
+				if _update_grub "$_kernel" "$_new_default_crashkernel" "$_dump_mode" "$_fadump_val"; then
+					echo "For kernel=$_kernel, crashkernel=$_new_default_crashkernel now."
+				fi
+			fi
+		fi
+	done
+}
+
 if [[ ! -f $KDUMP_CONFIG_FILE ]]; then
 	derror "Error: No kdump config file found!"
 	exit 1
@@ -1599,6 +1629,11 @@ main()
 		shift
 		reset_crashkernel "$@"
 		;;
+	reset-crashkernel-after-update)
+		if [[ $(kdump_get_conf_val auto_reset_crashkernel) != no ]]; then
+			reset_crashkernel_after_update
+		fi
+		;;
 	*)
 		dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
 		exit 1
diff --git a/kexec-tools.spec b/kexec-tools.spec
index ab7f41f..d72044a 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -258,6 +258,14 @@ chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99zz-fadumpini
 mkdir -p $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/
 mv $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/* $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/
 
+%pre
+if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ] && grep "get-default-crashkernel" /usr/bin/kdumpctl &>/dev/null; then
+	kdumpctl get-default-crashkernel kdump > /tmp/old_default_crashkernel 2>/dev/null
+%ifarch ppc64 ppc64le
+	kdumpctl get-default-crashkernel fadump > /tmp/old_default_crashkernel_fadump 2>/dev/null
+%endif
+fi
+
 %post
 # Initial installation
 %systemd_post kdump.service
@@ -290,6 +298,13 @@ then
 	/etc/sysconfig/kdump > /etc/sysconfig/kdump.new
 	mv /etc/sysconfig/kdump.new /etc/sysconfig/kdump
 fi
+if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ]; then
+	kdumpctl reset-crashkernel-after-update
+	rm /tmp/old_default_crashkernel 2>/dev/null || :
+%ifarch ppc64 ppc64le
+	rm /tmp/old_default_crashkernel_fadump 2>/dev/null || :
+%endif
+fi
%postun
Hi Coiby,
On Thu, 16 Dec 2021 14:36:52 +0800 Coiby Xu coxu@redhat.com wrote:
kexec-tools could update the default crashkernel value. When auto_reset_crashkernel=yes, reset kernel to new crashkernel value in the following two cases,
- crashkernel=auto is found in the kernel cmdline
- the kernel crashkernel was previously set by kexec-tools i.e. the kernel is using old default crashkernel value
To tell if the user is using a custom value for the kernel crashkernel or not, we assume the user would never use the default crashkernel value as custom value. When kexec-tools gets updated,
- save the default crashkernel value of the older package to /tmp/crashkernel (for POWER system, /tmp/crashkernel_fadump is saved as well).
- If auto_reset_crashkernel=yes, iterate all installed kernels. For each kernel, compare its crashkernel value with the old default crashkernel and reset it if yes
The implementation makes use of two RPM scriptlets [2],
- %pre is run before a package is installed so we can use it to save old default crashkernel value
- %post is run after a package installed so we can use it to try to reset kernel crashkernel
There are several problems when running kdumpctl in the RPM scripts for CoreOS/Atomic/Silverblue, for example, the lock can't be acquired by kdumpctl, "rpm-ostree kargs" can't be run, etc. So don't enable this feature for CoreOS/Atomic/Silverblue.
Note latest shellcheck (0.8.0) gives false positives about the associative array as of this commit. And Fedora's shellcheck is 0.7.2 and can't even correctly parse the shell code because of the associative array.
[1] https://github.com/koalaman/shellcheck/issues/2399 [2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
kdumpctl | 35 +++++++++++++++++++++++++++++++++++ kexec-tools.spec | 15 +++++++++++++++ 2 files changed, 50 insertions(+)
diff --git a/kdumpctl b/kdumpctl index 412420d..a6582f8 100755 --- a/kdumpctl +++ b/kdumpctl @@ -1534,6 +1534,36 @@ reset_crashkernel() [[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot }
+# shellcheck disable=SC2154 # false positive when dereferencing an array +reset_crashkernel_after_update() +{
- local _kernel _crashkernel _dump_mode _fadump_val _old_default_crashkernel _new_default_crashkernel
- declare -A _crashkernel_vals
- _crashkernel_vals[old_kdump]=$(cat /tmp/old_default_crashkernel 2> /dev/null)
- _crashkernel_vals[old_fadump]=$(cat /tmp/old_default_crashkernel_fadump 2> /dev/null)
- _crashkernel_vals[new_kdump]=$(get_default_crashkernel kdump)
- _crashkernel_vals[new_fadump]=$(get_default_crashkernel fadump)
- for _kernel in $(_get_all_kernels_from_grubby); do
_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
if [[ $_crashkernel == auto ]]; then
reset_crashkernel "--kernel=$_kernel"
elif [[ -n $_crashkernel ]]; then
_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
_old_default_crashkernel=${_crashkernel_vals[old_${_dump_mode}]}
_new_default_crashkernel=${_crashkernel_vals[new_${_dump_mode}]}
if [[ $_crashkernel == "$_old_default_crashkernel" ]] &&
[[ $_new_default_crashkernel != "$_old_default_crashkernel" ]]; then
_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
if _update_grub "$_kernel" "$_new_default_crashkernel" "$_dump_mode" "$_fadump_val"; then
echo "For kernel=$_kernel, crashkernel=$_new_default_crashkernel now."
fi
fi
fi
- done
+}
if [[ ! -f $KDUMP_CONFIG_FILE ]]; then derror "Error: No kdump config file found!" exit 1 @@ -1599,6 +1629,11 @@ main() shift reset_crashkernel "$@" ;;
- reset-crashkernel-after-update)
if [[ $(kdump_get_conf_val auto_reset_crashkernel) != no ]]; then
reset_crashkernel_after_update
fi
;;
*)
dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
exit 1
diff --git a/kexec-tools.spec b/kexec-tools.spec index ab7f41f..d72044a 100644 --- a/kexec-tools.spec +++ b/kexec-tools.spec @@ -258,6 +258,14 @@ chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99zz-fadumpini mkdir -p $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/ mv $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/* $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/
+%pre +if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ] && grep "get-default-crashkernel" /usr/bin/kdumpctl &>/dev/null; then
Do you really need to check whether get-default-crashkernel exists here? What would happen if kdumpctl returns with an error?
I'm just asking myself as there is no corresponding check in the %post scriptlet. Furthermore, I think we can assume that the spec file and kdumpctl are updated together. In particular, using a new spec file with an old kdumpctl that doesn't support get-default-crashkernel is not a supported use case. Or am I missing something?
Thanks Philipp
- kdumpctl get-default-crashkernel kdump > /tmp/old_default_crashkernel 2>/dev/null
+%ifarch ppc64 ppc64le
- kdumpctl get-default-crashkernel fadump > /tmp/old_default_crashkernel_fadump 2>/dev/null
+%endif +fi
%post # Initial installation %systemd_post kdump.service @@ -290,6 +298,13 @@ then /etc/sysconfig/kdump > /etc/sysconfig/kdump.new mv /etc/sysconfig/kdump.new /etc/sysconfig/kdump fi +if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ]; then
- kdumpctl reset-crashkernel-after-update
- rm /tmp/old_default_crashkernel 2>/dev/null || :
+%ifarch ppc64 ppc64le
- rm /tmp/old_default_crashkernel_fadump 2>/dev/null || :
+%endif +fi
%postun
Hi Philipp,
On Wed, Dec 22, 2021 at 05:29:22PM +0100, Philipp Rudo wrote:
Hi Coiby,
On Thu, 16 Dec 2021 14:36:52 +0800 Coiby Xu coxu@redhat.com wrote:
kexec-tools could update the default crashkernel value. When auto_reset_crashkernel=yes, reset kernel to new crashkernel value in the following two cases,
- crashkernel=auto is found in the kernel cmdline
- the kernel crashkernel was previously set by kexec-tools i.e. the kernel is using old default crashkernel value
To tell if the user is using a custom value for the kernel crashkernel or not, we assume the user would never use the default crashkernel value as custom value. When kexec-tools gets updated,
- save the default crashkernel value of the older package to /tmp/crashkernel (for POWER system, /tmp/crashkernel_fadump is saved as well).
- If auto_reset_crashkernel=yes, iterate all installed kernels. For each kernel, compare its crashkernel value with the old default crashkernel and reset it if yes
The implementation makes use of two RPM scriptlets [2],
- %pre is run before a package is installed so we can use it to save old default crashkernel value
- %post is run after a package installed so we can use it to try to reset kernel crashkernel
There are several problems when running kdumpctl in the RPM scripts for CoreOS/Atomic/Silverblue, for example, the lock can't be acquired by kdumpctl, "rpm-ostree kargs" can't be run, etc. So don't enable this feature for CoreOS/Atomic/Silverblue.
Note latest shellcheck (0.8.0) gives false positives about the associative array as of this commit. And Fedora's shellcheck is 0.7.2 and can't even correctly parse the shell code because of the associative array.
[1] https://github.com/koalaman/shellcheck/issues/2399 [2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu coxu@redhat.com
kdumpctl | 35 +++++++++++++++++++++++++++++++++++ kexec-tools.spec | 15 +++++++++++++++ 2 files changed, 50 insertions(+)
diff --git a/kdumpctl b/kdumpctl index 412420d..a6582f8 100755 --- a/kdumpctl +++ b/kdumpctl @@ -1534,6 +1534,36 @@ reset_crashkernel() [[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot }
+# shellcheck disable=SC2154 # false positive when dereferencing an array +reset_crashkernel_after_update() +{
- local _kernel _crashkernel _dump_mode _fadump_val _old_default_crashkernel _new_default_crashkernel
- declare -A _crashkernel_vals
- _crashkernel_vals[old_kdump]=$(cat /tmp/old_default_crashkernel 2> /dev/null)
- _crashkernel_vals[old_fadump]=$(cat /tmp/old_default_crashkernel_fadump 2> /dev/null)
- _crashkernel_vals[new_kdump]=$(get_default_crashkernel kdump)
- _crashkernel_vals[new_fadump]=$(get_default_crashkernel fadump)
- for _kernel in $(_get_all_kernels_from_grubby); do
_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
if [[ $_crashkernel == auto ]]; then
reset_crashkernel "--kernel=$_kernel"
elif [[ -n $_crashkernel ]]; then
_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
_old_default_crashkernel=${_crashkernel_vals[old_${_dump_mode}]}
_new_default_crashkernel=${_crashkernel_vals[new_${_dump_mode}]}
if [[ $_crashkernel == "$_old_default_crashkernel" ]] &&
[[ $_new_default_crashkernel != "$_old_default_crashkernel" ]]; then
_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
if _update_grub "$_kernel" "$_new_default_crashkernel" "$_dump_mode" "$_fadump_val"; then
echo "For kernel=$_kernel, crashkernel=$_new_default_crashkernel now."
fi
fi
fi
- done
+}
if [[ ! -f $KDUMP_CONFIG_FILE ]]; then derror "Error: No kdump config file found!" exit 1 @@ -1599,6 +1629,11 @@ main() shift reset_crashkernel "$@" ;;
- reset-crashkernel-after-update)
if [[ $(kdump_get_conf_val auto_reset_crashkernel) != no ]]; then
reset_crashkernel_after_update
fi
;;
*)
dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
exit 1
diff --git a/kexec-tools.spec b/kexec-tools.spec index ab7f41f..d72044a 100644 --- a/kexec-tools.spec +++ b/kexec-tools.spec @@ -258,6 +258,14 @@ chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99zz-fadumpini mkdir -p $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/ mv $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/* $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/
+%pre +if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ] && grep "get-default-crashkernel" /usr/bin/kdumpctl &>/dev/null; then
Do you really need to check whether get-default-crashkernel exists here? What would happen if kdumpctl returns with an error?
I'm just asking myself as there is no corresponding check in the %post scriptlet. Furthermore, I think we can assume that the spec file and kdumpctl are updated together. In particular, using a new spec file with an old kdumpctl that doesn't support get-default-crashkernel is not a supported use case. Or am I missing something?
Using a new spec file with an old kdumpctl does happen for the %pre RPM scriptlet: this scriptlet is called before the installation, so it calls the old kdumpctl. But there is indeed no need to check if get-default-crashkernel exists, since the case of an empty /tmp/default_crashkernel file is handled correctly. I'll drop this check in v4.
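The empty-file case can be checked with a short experiment. This is a sketch of the reasoning only: `read_old_default`, the path and the crashkernel value are made up, and the comparison is the non-auto branch of `reset_crashkernel_after_update` reduced to its essentials.

```bash
# Read the saved old default; a missing or unreadable file yields an empty string.
read_old_default() {
    cat "$1" 2> /dev/null || :
}

# Simulate an old kdumpctl that never wrote the file
old_default=$(read_old_default /nonexistent/old_default_crashkernel)
current="1G-4G:192M,4G-64G:256M"   # made-up value on the kernel command line

# A non-empty current value can never equal an empty old default
if [ -n "$current" ] && [ "$current" = "$old_default" ]; then
    echo "reset"
else
    echo "keep"
fi
```

Because the quoted function only compares kernels whose crashkernel value is non-empty, an empty saved default matches nothing, and only crashkernel=auto kernels are reset in that situation.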
Thanks Philipp
+  kdumpctl get-default-crashkernel kdump > /tmp/old_default_crashkernel 2>/dev/null
+%ifarch ppc64 ppc64le
+  kdumpctl get-default-crashkernel fadump > /tmp/old_default_crashkernel_fadump 2>/dev/null
+%endif
+fi
+
 %post
 # Initial installation
 %systemd_post kdump.service
@@ -290,6 +298,13 @@ then
     /etc/sysconfig/kdump > /etc/sysconfig/kdump.new
   mv /etc/sysconfig/kdump.new /etc/sysconfig/kdump
 fi
+if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ]; then
+  kdumpctl reset-crashkernel-after-update
+  rm /tmp/old_default_crashkernel 2>/dev/null || :
+%ifarch ppc64 ppc64le
+  rm /tmp/old_default_crashkernel_fadump 2>/dev/null || :
+%endif
+fi


 %postun
On Thu, Dec 23, 2021 at 03:07:16PM +0800, Coiby Xu wrote:
Hi Philipp,
On Wed, Dec 22, 2021 at 05:29:22PM +0100, Philipp Rudo wrote:
Hi Coiby,
On Thu, 16 Dec 2021 14:36:52 +0800 Coiby Xu coxu@redhat.com wrote:
kexec-tools may update the default crashkernel value. When auto_reset_crashkernel=yes, reset the kernel crashkernel to the new default value in the following two cases,
- crashkernel=auto is found in the kernel cmdline
- the kernel crashkernel was previously set by kexec-tools, i.e. the kernel is using the old default crashkernel value
To tell whether the user is using a custom value for the kernel crashkernel, we assume the user would never use the default crashkernel value as a custom value. When kexec-tools gets updated,
- save the default crashkernel value of the older package to /tmp/old_default_crashkernel (for POWER systems, /tmp/old_default_crashkernel_fadump is saved as well).
- if auto_reset_crashkernel=yes, iterate over all installed kernels. For each kernel, compare its crashkernel value with the old default crashkernel and reset it to the new default if they match
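The reset rule described above can be sketched as a small predicate. This is only an illustration: should_reset_crashkernel is a hypothetical name that does not exist in kdumpctl, and the three values are passed in as plain strings instead of being read from grubby or from the saved files.

```bash
# Hypothetical helper (not part of kdumpctl): decide whether a kernel's
# crashkernel value should be replaced by the new default.
#   $1: crashkernel value currently configured for the kernel
#   $2: default value of the old kexec-tools package
#   $3: default value of the new kexec-tools package
should_reset_crashkernel()
{
    local _current="$1" _old_default="$2" _new_default="$3"

    # crashkernel=auto is always replaced.
    [ "$_current" = auto ] && return 0
    # No crashkernel configured at all: leave the kernel alone.
    [ -n "$_current" ] || return 1
    # Reset only if the kernel still uses the old default and the
    # default actually changed; anything else counts as a custom value.
    [ "$_current" = "$_old_default" ] && [ "$_new_default" != "$_old_default" ]
}
```

A caller could use it as `should_reset_crashkernel "$current" "$old" "$new" && echo reset`, which prints "reset" only for auto or for a kernel that is still on a changed old default.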
The implementation makes use of two RPM scriptlets [2],
- %pre is run before a package is installed, so we can use it to save the old default crashkernel value
- %post is run after a package is installed, so we can use it to try to reset the kernel crashkernel
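For context, RPM passes %pre and %post the number of instances of the package that will be installed once the transaction completes, so 1 means a fresh installation and 2 means an upgrade. That is what the `[ $1 == 2 ]` test in the spec file relies on. A minimal sketch, where is_package_upgrade is a hypothetical helper name used only for illustration:

```bash
# Hypothetical helper: succeed if the scriptlet argument ($1 of %pre/%post)
# indicates an upgrade, i.e. at least two instances of the package will be
# present when the transaction completes.
is_package_upgrade()
{
    # Reject empty or non-numeric input instead of letting test(1) fail.
    case "$1" in
        '' | *[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 2 ]
}
```

Inside a scriptlet this would be called as `is_package_upgrade "$1"`.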
There are several problems when running kdumpctl in the RPM scriptlets for CoreOS/Atomic/Silverblue: for example, the lock can't be acquired by kdumpctl, "rpm-ostree kargs" can't be run, etc. So don't enable this feature for CoreOS/Atomic/Silverblue.
Note that the latest shellcheck (0.8.0) gives false positives about the associative array as of this commit [1]. Fedora's shellcheck is 0.7.2 and can't even parse the shell code correctly because of the associative array.
[1] https://github.com/koalaman/shellcheck/issues/2399
[2] https://docs.fedoraproject.org/en-US/packaging-guidelines/Scriptlets/
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 kdumpctl         | 35 +++++++++++++++++++++++++++++++++++
 kexec-tools.spec | 15 +++++++++++++++
 2 files changed, 50 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index 412420d..a6582f8 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1534,6 +1534,36 @@ reset_crashkernel()
 	[[ $_reboot == yes && $_crashkernel_changed == yes ]] && reboot
 }

+# shellcheck disable=SC2154 # false positive when dereferencing an array
+reset_crashkernel_after_update()
+{
+	local _kernel _crashkernel _dump_mode _fadump_val _old_default_crashkernel _new_default_crashkernel
+	declare -A _crashkernel_vals
+
+	_crashkernel_vals[old_kdump]=$(cat /tmp/old_default_crashkernel 2> /dev/null)
+	_crashkernel_vals[old_fadump]=$(cat /tmp/old_default_crashkernel_fadump 2> /dev/null)
+	_crashkernel_vals[new_kdump]=$(get_default_crashkernel kdump)
+	_crashkernel_vals[new_fadump]=$(get_default_crashkernel fadump)
+
+	for _kernel in $(_get_all_kernels_from_grubby); do
+		_crashkernel=$(get_grub_kernel_boot_parameter "$_kernel" crashkernel)
+		if [[ $_crashkernel == auto ]]; then
+			reset_crashkernel "--kernel=$_kernel"
+		elif [[ -n $_crashkernel ]]; then
+			_dump_mode=$(get_dump_mode_by_kernel "$_kernel")
+			_old_default_crashkernel=${_crashkernel_vals[old_${_dump_mode}]}
+			_new_default_crashkernel=${_crashkernel_vals[new_${_dump_mode}]}
+			if [[ $_crashkernel == "$_old_default_crashkernel" ]] &&
+				[[ $_new_default_crashkernel != "$_old_default_crashkernel" ]]; then
+				_fadump_val=$(get_grub_kernel_boot_parameter "$_kernel" fadump)
+				if _update_grub "$_kernel" "$_new_default_crashkernel" "$_dump_mode" "$_fadump_val"; then
+					echo "For kernel=$_kernel, crashkernel=$_new_default_crashkernel now."
+				fi
+			fi
+		fi
+	done
+}
+
 if [[ ! -f $KDUMP_CONFIG_FILE ]]; then
 	derror "Error: No kdump config file found!"
 	exit 1
@@ -1599,6 +1629,11 @@ main()
 		shift
 		reset_crashkernel "$@"
 		;;
+	reset-crashkernel-after-update)
+		if [[ $(kdump_get_conf_val auto_reset_crashkernel) != no ]]; then
+			reset_crashkernel_after_update
+		fi
+		;;
 	*)
 		dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}"
 		exit 1
diff --git a/kexec-tools.spec b/kexec-tools.spec
index ab7f41f..d72044a 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -258,6 +258,14 @@ chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99zz-fadumpini
 mkdir -p $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/
 mv $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/* $RPM_BUILD_ROOT/%{dracutlibdir}/modules.d/

+%pre
+if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ] && grep "get-default-crashkernel" /usr/bin/kdumpctl &>/dev/null; then
do you really need to check if get-default-crashkernel exists here? What would happen if kdumpctl returns with an error?
I'm just asking myself as there is no corresponding check in the %post scriptlet. Furthermore I think we can assume that the spec file and kdumpctl are updated together. In particular, using a new spec file with an old kdumpctl that doesn't support get-default-crashkernel is not a supported use case. Or am I missing something?
Using a new spec file with an old kdumpctl does happen for the %pre RPM scriptlet: this scriptlet is called before the installation, so it is the old kdumpctl that gets called. But there is indeed no need to check if get-default-crashkernel exists, since the case of an empty /tmp/old_default_crashkernel file is handled correctly. I'll drop this check in v4.
Oh, I can't drop the check after all, because dnf checks the return code of the scriptlet. So in v4 I'll simply use "grep -q" instead.
Thanks Philipp
+  kdumpctl get-default-crashkernel kdump > /tmp/old_default_crashkernel 2>/dev/null
+%ifarch ppc64 ppc64le
+  kdumpctl get-default-crashkernel fadump > /tmp/old_default_crashkernel_fadump 2>/dev/null
+%endif
+fi
+
 %post
 # Initial installation
 %systemd_post kdump.service
@@ -290,6 +298,13 @@ then
     /etc/sysconfig/kdump > /etc/sysconfig/kdump.new
   mv /etc/sysconfig/kdump.new /etc/sysconfig/kdump
 fi
+if ! grep -q "ostree" /proc/cmdline && [ $1 == 2 ]; then
+  kdumpctl reset-crashkernel-after-update
+  rm /tmp/old_default_crashkernel 2>/dev/null || :
+%ifarch ppc64 ppc64le
+  rm /tmp/old_default_crashkernel_fadump 2>/dev/null || :
+%endif
+fi


 %postun
-- Best regards, Coiby
When kexec-tools updates the default crashkernel value, it tries to reset the crashkernel value of the existing installed kernels, including the currently running kernel. So the kernel cmdline parameters configured for the running kernel could differ from /proc/cmdline. When a kernel is installed after kexec-tools has been updated, kernel-install [1] calls /usr/lib/kernel/install.d/20-grub.install, which uses /proc/cmdline to set up the new kernel's cmdline. To address this special case, the installation hook resets the new kernel's crashkernel and fadump values to the values the running kernel would use after a reboot. One side effect of this commit is that it resets the installed kernel's crashkernel even if the running kernel won't use the default crashkernel value after rebooting. But I think this side effect is a benefit for the user.
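To make the comparison concrete, here is a minimal sketch that pulls one parameter out of a kernel command line string, so that a configured cmdline can be compared against /proc/cmdline. get_cmdline_param is a hypothetical helper and not the get_grub_kernel_boot_parameter function used by kdumpctl. It returns the last occurrence on the assumption that later parameters override earlier ones, treats the name as a regular expression, and does not handle quoted values containing spaces.

```bash
# Hypothetical helper: print the value of the last NAME=VALUE occurrence in a
# kernel command line string. Returns 1 if the parameter is absent.
#   $1: command line string, e.g. the content of /proc/cmdline
#   $2: parameter name, e.g. crashkernel
get_cmdline_param()
{
    local _value

    # One word per line, keep only NAME=..., take the last match.
    _value=$(printf '%s\n' "$1" | tr -s ' \t' '\n' | grep "^$2=" | tail -n 1)
    [ -n "$_value" ] || return 1
    printf '%s\n' "${_value#"$2"=}"
}
```

For instance, `get_cmdline_param "$(cat /proc/cmdline)" crashkernel` prints the value the running kernel was booted with, which may differ from what the bootloader entry now contains.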
The implementation depends on kernel-install, which runs the scripts in /usr/lib/kernel/install.d, passing the following arguments,
add KERNEL-VERSION $BOOT/MACHINE-ID/KERNEL-VERSION/ KERNEL-IMAGE [INITRD-FILE ...]
A concrete example is given as follows,
add 5.11.12-300.fc34.x86_64 /boot/e986846f63134c7295458cf36300ba5b/5.11.12-300.fc34.x86_64 /lib/modules/5.11.12-300.fc34.x86_64/vmlinuz
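The argument layout above maps onto positional parameters in a hook script. The following is a rough sketch of how a hook might dispatch on them. handle_kernel_install is a hypothetical name, and the echo lines are placeholders for real work such as calling kdumpctl.

```bash
# Sketch of a kernel-install hook body. The positional parameters follow the
# layout shown above: COMMAND KERNEL-VERSION ENTRY-DIR KERNEL-IMAGE.
handle_kernel_install()
{
    # _entry_dir is unused in this sketch but kept to document position $3.
    local _command="$1" _kernel_version="$2" _entry_dir="$3" _kernel_image="$4"

    case "$_command" in
        add)
            echo "add: version=$_kernel_version image=$_kernel_image"
            ;;
        remove)
            echo "remove: version=$_kernel_version"
            ;;
        *)
            # Unknown commands are ignored so the hook never breaks
            # kernel-install.
            ;;
    esac
    return 0
}
```

A real hook would end with `handle_kernel_install "$@"`.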
kernel-install could be started by the kernel package's RPM scriptlet [2]. As mentioned in the previous commit "try to reset kernel crashkernel when kexec-tools updates the default crashkernel value", kdumpctl has difficulty running in RPM scriptlets for CoreOS. But since rpm-ostree ignores all kernel hooks, there is no need to disable the kernel hook for CoreOS/Atomic/Silverblue. Instead, a collaboration between rpm-ostree and kexec-tools is needed [3] to take care of this special case.
Note the crashkernel.default support is dropped.
[1] https://www.freedesktop.org/software/systemd/man/kernel-install.html
[2] https://src.fedoraproject.org/rpms/kernel/blob/rawhide/f/kernel.spec#_2680
[3] https://github.com/coreos/rpm-ostree/issues/2894
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 92-crashkernel.install | 135 +----------------------------------------
 kdumpctl               |  31 ++++++++++
 2 files changed, 32 insertions(+), 134 deletions(-)
diff --git a/92-crashkernel.install b/92-crashkernel.install index 78365ff..1d67a13 100755 --- a/92-crashkernel.install +++ b/92-crashkernel.install @@ -5,142 +5,9 @@ KERNEL_VERSION="$2" KDUMP_INITRD_DIR_ABS="$3" KERNEL_IMAGE="$4"
-grub_etc_default="/etc/default/grub" - -ver_lt() { - [[ "$(echo -e "$1\n$2" | sort -V)" == $1$'\n'* ]] && [[ $1 != "$2" ]] -} - -# Read crashkernel= value in /etc/default/grub -get_grub_etc_ck() { - [[ -e $grub_etc_default ]] && \ - sed -n -e "s/^GRUB_CMDLINE_LINUX=.*(crashkernel=[^\ "]*)[\ "].*$/\1/p" $grub_etc_default -} - -# Read crashkernel.default value of specified kernel -get_ck_default() { - ck_file="/usr/lib/modules/$1/crashkernel.default" - [[ -f "$ck_file" ]] && cat "$ck_file" -} - -# Iterate installed kernels, find the kernel with the highest version that has a -# valid crashkernel.default file, exclude current installing/removing kernel -# -# $1: a string representing a crashkernel= cmdline. If given, will also check the -# content of crashkernel.default, only crashkernel.default with the same value will match -get_highest_ck_default_kver() { - for kernel in $(find /usr/lib/modules -maxdepth 1 -mindepth 1 -printf "%f\n" | sort --version-sort -r); do - [[ $kernel == "$KERNEL_VERSION" ]] && continue - [[ -s "/usr/lib/modules/$kernel/crashkernel.default" ]] || continue - - echo "$kernel" - return 0 - done - - return 1 -} - -set_grub_ck() { - sed -i -e "s/^(GRUB_CMDLINE_LINUX=.*)crashkernel=[^\ "]*([\ "].*)$/\1$1\2/" "$grub_etc_default" -} - -# Set specified kernel's crashkernel cmdline value -set_kernel_ck() { - kernel=$1 - ck_cmdline=$2 - - entry=$(grubby --info ALL | grep "^kernel=.*$kernel") - entry=${entry#kernel=} - entry=${entry#"} - entry=${entry%"} - - if [[ -z "$entry" ]]; then - echo "$0: failed to find boot entry for kernel $kernel" - return 1 - fi - - [[ -f /etc/zipl.conf ]] && zipl_arg="--zipl" - grubby --args "$ck_cmdline" --update-kernel "$entry" $zipl_arg - [[ $zipl_arg ]] && zipl > /dev/null ||: -} - case "$COMMAND" in add) - # - If current boot kernel is using default crashkernel value, update - # installing kernel's crashkernel value to its default value, - # - If intalling a higher version kernel, and /etc/default/grub's - # 
crashkernel value is using default value, update it to installing - # kernel's default value. - inst_ck_default=$(get_ck_default "$KERNEL_VERSION") - # If installing kernel doesn't have crashkernel.default, just exit. - [[ -z "$inst_ck_default" ]] && exit 0 - - boot_kernel=$(uname -r) - boot_ck_cmdline=$(sed -n -e "s/^.*(crashkernel=\S*).*$/\1/p" /proc/cmdline) - highest_ck_default_kver=$(get_highest_ck_default_kver) - highest_ck_default=$(get_ck_default "$highest_ck_default_kver") - - # Try update /etc/default/grub if present, else grub2-mkconfig could - # override crashkernel value. - grub_etc_ck=$(get_grub_etc_ck) - if [[ -n "$grub_etc_ck" ]]; then - if [[ -z "$highest_ck_default_kver" ]]; then - # None of installed kernel have a crashkernel.default, - # check for 'crashkernel=auto' in case of legacy kernel - [[ "$grub_etc_ck" == "crashkernel=auto" ]] && \ - set_grub_ck "$inst_ck_default" - else - # There is a valid crashkernel.default, check if installing kernel - # have a higher version and grub config is using default value - ver_lt "$highest_ck_default_kver" "$KERNEL_VERSION" && \ - [[ "$grub_etc_ck" == "$highest_ck_default" ]] && \ - [[ "$grub_etc_ck" != "$inst_ck_default" ]] && \ - set_grub_ck "$inst_ck_default" - fi - fi - - # Exit if crashkernel is not used in current cmdline - [[ -z $boot_ck_cmdline ]] && exit 0 - - # Get current boot kernel's default value - boot_ck_default=$(get_ck_default "$boot_kernel") - if [[ $boot_ck_cmdline == "crashkernel=auto" ]]; then - # Legacy RHEL kernel defaults to "auto" - boot_ck_default="$boot_ck_cmdline" - fi - - # If boot kernel doesn't have a crashkernel.default, check - # if it's using any installed kernel's crashkernel.default - if [[ -z $boot_ck_default ]]; then - [[ $(get_highest_ck_default_kver "$boot_ck_cmdline") ]] && boot_ck_default="$boot_ck_cmdline" - fi - - # If boot kernel is using a default crashkernel, update - # installing kernel's crashkernel to new default value - if [[ "$boot_ck_cmdline" != 
"$inst_ck_default" ]] && [[ "$boot_ck_cmdline" == "$boot_ck_default" ]]; then - set_kernel_ck "$KERNEL_VERSION" "$inst_ck_default" - fi - - exit 0 - ;; -remove) - # If grub default value is upgraded when this kernel was installed, try downgrade it - grub_etc_ck=$(get_grub_etc_ck) - [[ $grub_etc_ck ]] || exit 0 - - removing_ck_conf=$(get_ck_default "$KERNEL_VERSION") - [[ $removing_ck_conf ]] || exit 0 - - highest_ck_default_kver=$(get_highest_ck_default_kver) || exit 0 - highest_ck_default=$(get_ck_default "$highest_ck_default_kver") - [[ $highest_ck_default ]] || exit 0 - - if ver_lt "$highest_ck_default_kver" "$KERNEL_VERSION"; then - if [[ $grub_etc_ck == "$removing_ck_conf" ]] && [[ $grub_etc_ck != "$highest_ck_default" ]]; then - set_grub_ck "$highest_ck_default" - fi - fi - + kdumpctl reset-crashkernel-for-installed_kernel "$KERNEL_VERSION" exit 0 ;; esac diff --git a/kdumpctl b/kdumpctl index a6582f8..2979e1d 100755 --- a/kdumpctl +++ b/kdumpctl @@ -1564,6 +1564,32 @@ reset_crashkernel_after_update() done }
+reset_crashkernel_for_installed_kernel() +{ + local _installed_kernel _running_kernel _crashkernel _crashkernel_running + local _dump_mode_running _fadump_val_running + + if ! _installed_kernel=$(_find_kernel_path_by_release "$1"); then + exit 1 + fi + + if ! _running_kernel=$(_get_current_running_kernel_path); then + derror "Couldn't find current running kernel" + exit + fi + + _crashkernel=$(get_grub_kernel_boot_parameter "$_installed_kernel" crashkernel) + _crashkernel_running=$(get_grub_kernel_boot_parameter "$_running_kernel" crashkernel) + _dump_mode_running=$(get_dump_mode_by_kernel "$_running_kernel") + _fadump_val_running=$(get_grub_kernel_boot_parameter "$_kernel" fadump) + + if [[ $_crashkernel != "$_crashkernel_running" ]]; then + if _update_grub "$_installed_kernel" "$_crashkernel_running" "$_dump_mode_running" "$_fadump_val_running"; then + echo "kexec-tools has reset $_installed_kernel to use the new default crashkernel value $_crashkernel_running" + fi + fi +} + if [[ ! -f $KDUMP_CONFIG_FILE ]]; then derror "Error: No kdump config file found!" exit 1 @@ -1634,6 +1660,11 @@ main() reset_crashkernel_after_update fi ;; + reset-crashkernel-for-installed_kernel) + if [[ $(kdump_get_conf_val auto_reset_crashkernel) != no ]]; then + reset_crashkernel_for_installed_kernel "$2" + fi + ;; *) dinfo $"Usage: $0 {estimate|start|stop|status|restart|reload|rebuild|reset-crashkernel|propagate|showmem}" exit 1
osbuild is a tool to build OS images. It uses bwrap to install packages inside a sandbox/container. Since the kernel package recommends kexec-tools which in in turn recommends grubby, the installation order would be grubby -> kexec-tools -> kernel. So we can use the kernel hook 92-crashkernel.install provided by kexec-tools to set up kernel crashkernel for the target OS image. But in osbuild's case, there is no current running kernel and running `uname -r` in the container/sandbox actually returns the host kernel release. To set up kernel crashkernel for the OS image built by osbuild, a different logic is needed.
We check whether the kernel hook is running inside the osbuild container and set up the kernel crashkernel only if osbuild hasn't specified a custom value. osbuild exposes [1] the container=bwrap-osbuild environment variable. According to [2], the environment variable is not inherited down the process tree, so we need to check /proc/1/environ to detect this environment variable and tell whether the kernel hook is running inside a bwrap-osbuild container. After that we need to know if osbuild wants to use a custom crashkernel value. This is done by checking if /etc/kernel/cmdline has crashkernel set [3]. /etc/kernel/cmdline is written before packages are installed.
[1] https://github.com/osbuild/osbuild/pull/926
[2] https://systemd.io/CONTAINER_INTERFACE/
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2024976#c5
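/proc/1/environ stores its entries separated by NUL bytes. As an alternative sketch to the detection described above (this is not the implementation in the patch), the entries can be split onto separate lines and matched exactly. has_container_env is a hypothetical name, and the optional file argument exists only so the function can be tried without root privileges.

```bash
# Sketch: check a NUL-separated environ file for an exact
# container=<value> entry.
#   $1: expected container value, e.g. bwrap-osbuild
#   $2: environ file to read (defaults to /proc/1/environ)
has_container_env()
{
    local _value="$1" _environ="${2:-/proc/1/environ}"

    [ -r "$_environ" ] || return 1
    # Turn NUL separators into newlines, then require a whole-line,
    # fixed-string match so that e.g. mycontainer=... does not count.
    tr '\0' '\n' < "$_environ" | grep -qxF "container=$_value"
}
```

With this helper the check would read `has_container_env bwrap-osbuild`.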
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 kdumpctl | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index 2979e1d..5249244 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1564,6 +1564,11 @@ reset_crashkernel_after_update()
 	done
 }

+_is_osbuild()
+{
+	[[ $(sed -n -E 's/.*(^|\s)container=(\S*).*/\2/p' < /proc/1/environ) == bwrap-osbuild ]]
+}
+
 reset_crashkernel_for_installed_kernel()
 {
 	local _installed_kernel _running_kernel _crashkernel _crashkernel_running
@@ -1573,6 +1578,11 @@ reset_crashkernel_for_installed_kernel()
 		exit 1
 	fi

+	if _is_osbuild && ! grep crashkernel= /etc/kernel/cmdline &> /dev/null ; then
+		reset_crashkernel "--kernel=$_installed_kernel"
+		return
+	fi
+
 	if ! _running_kernel=$(_get_current_running_kernel_path); then
 		derror "Couldn't find current running kernel"
 		exit
Hi Coiby,
On Thu, 16 Dec 2021 14:36:54 +0800 Coiby Xu coxu@redhat.com wrote:
osbuild is a tool to build OS images. It uses bwrap to install packages inside a sandbox/container. Since the kernel package recommends kexec-tools which in in turn recommends grubby, the installation
typo s/in in/in/
order would be grubby -> kexec-tools -> kernel. So we can use the kernel hook 92-crashkernel.install provided by kexec-tools to set up kernel crashkernel for the target OS image. But in osbuild's case, there is no current running kernel and running `uname -r` in the container/sandbox actually returns the host kernel release. To set up kernel crashkernel for the OS image built by osbuild, a different logic is needed.
We check whether the kernel hook is running inside the osbuild container and set up the kernel crashkernel only if osbuild hasn't specified a custom value. osbuild exposes [1] the container=bwrap-osbuild environment variable. According to [2], the environment variable is not inherited down the process tree, so we need to check /proc/1/environ to detect this environment variable and tell whether the kernel hook is running inside a bwrap-osbuild container. After that we need to know if osbuild wants to use a custom crashkernel value. This is done by checking if /etc/kernel/cmdline has crashkernel set [3]. /etc/kernel/cmdline is written before packages are installed.
[1] https://github.com/osbuild/osbuild/pull/926
[2] https://systemd.io/CONTAINER_INTERFACE/
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2024976#c5
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 kdumpctl | 10 ++++++++++
 1 file changed, 10 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index 2979e1d..5249244 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -1564,6 +1564,11 @@ reset_crashkernel_after_update()
 	done
 }

+_is_osbuild()
+{
+	[[ $(sed -n -E 's/.*(^|\s)container=(\S*).*/\2/p' < /proc/1/environ) == bwrap-osbuild ]]
+}
+
 reset_crashkernel_for_installed_kernel()
 {
 	local _installed_kernel _running_kernel _crashkernel _crashkernel_running
@@ -1573,6 +1578,11 @@ reset_crashkernel_for_installed_kernel()
 		exit 1
 	fi

+	if _is_osbuild && ! grep crashkernel= /etc/kernel/cmdline &> /dev/null ; then
you can use grep -q instead of redirecting to /dev/null
Thanks Philipp
+		reset_crashkernel "--kernel=$_installed_kernel"
+		return
+	fi
+
 	if ! _running_kernel=$(_get_current_running_kernel_path); then
 		derror "Couldn't find current running kernel"
 		exit
On Wed, Dec 22, 2021 at 05:28:52PM +0100, Philipp Rudo wrote:
+	if _is_osbuild && ! grep crashkernel= /etc/kernel/cmdline &> /dev/null ; then
you can use grep -q instead of redirecting to /dev/null
Thanks for the tip. I'll apply it to v4.
Thanks Philipp
+		reset_crashkernel "--kernel=$_installed_kernel"
+		return
+	fi
+
 	if ! _running_kernel=$(_get_current_running_kernel_path); then
 		derror "Couldn't find current running kernel"
 		exit
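The "grep -q" suggestion from the review can be illustrated side by side with the redirect form. Both function names below are hypothetical. The redirect variant is written as `> /dev/null 2>&1`, which has the same effect as the bash-only `&>` used in the patch, and the `-s` flag in the second variant is an addition of this sketch to silence errors about a missing file.

```bash
# Both functions report, via exit status, whether FILE ($1) contains
# "crashkernel=". They differ only in how grep's output is silenced.
has_crashkernel_redirect()
{
    # Discard stdout and stderr explicitly.
    grep crashkernel= "$1" > /dev/null 2>&1
}

has_crashkernel_quiet()
{
    # -q prints nothing and stops at the first match;
    # -s suppresses messages about missing or unreadable files.
    grep -qs crashkernel= "$1"
}
```

Either form works in a condition such as `if ! has_crashkernel_quiet /etc/kernel/cmdline; then ...`.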
Update crashkernel-howto since crashkernel.default has been removed. The documentation is also simplified as a result.
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 crashkernel-howto.txt | 123 +++++++-----------------------------------
 1 file changed, 19 insertions(+), 104 deletions(-)
diff --git a/crashkernel-howto.txt b/crashkernel-howto.txt index 20f50e0..15768ab 100644 --- a/crashkernel-howto.txt +++ b/crashkernel-howto.txt @@ -13,13 +13,14 @@ kdump after you updated the `crashkernel=` value or changed the dump target. Default crashkernel value =========================
-Latest kernel packages include a `crashkernel.default` file installed in kernel -modules folder, available as: +Latest kexec-tools provides "kdumpctl get-default-crashkernel" to retrieve +the default crashkernel value,
- /usr/lib/modules/<kernel>/crashkernel.default + $ echo $(kdumpctl get-default-crashkernel) + 1G-4G:192M,4G-64G:256M,64G-102400T:512M
-The content of the file will be taken as the default value of 'crashkernel=', or -take this file as a reference for setting crashkernel value manually. +It will be taken as the default value of 'crashkernel=', you can use +this value as a reference for setting crashkernel value manually.
New installed system @@ -27,7 +28,7 @@ New installed system
Anaconda is the OS installer which sets all the kernel boot cmdline on a newly installed system. If kdump is enabled during Anaconda installation, Anaconda -will use the `crashkernel.default` file as the default `crashkernel=` value on +will use the default crashkernel value as the default `crashkernel=` value on the newly installed system.
Users can override the value during Anaconda installation manually. @@ -36,20 +37,11 @@ Users can override the value during Anaconda installation manually. Auto update of crashkernel boot parameter =========================================
-Following context in this section assumes all kernel packages have a -`crashkernel.default` file bundled, which is true for the latest official kernel -packages. For kexec-tools behavior with a kernel that doesn't have a -`crashkernel.default` file, please refer to the “Custom Kernel” section of this -doc. - -When `crashkernel=` is using the default value, kexec-tools will need to update -the `crashkernel=` value of new installed kernels, since the default value may -change in new kernel packages. - -kexec-tools does so by adding a kernel installation hook, which gets triggered -every time a new kernel is installed, so kexec-tools can do necessary checks and -updates. - +A new release of kexec-tools could update the the default crashkernel value. +By default, kexec-tools would reset crashkernel to the new default value if it +detects old default crashkernel value is used by installed kernels. If you don't +want kexec-tools to update the old default crashkernel to the new default +crashkernel, you can change auto_reset_crashkernel to no in kdump.conf.
Supported Bootloaders --------------------- @@ -59,92 +51,13 @@ on `grubby`. If other boot loaders are used, the user will have to update the `crashkernel=` value manually.
-Updating kernel package ------------------------ - -When a new version of package kernel is released in the official repository, the -package will always come with a `crashkernel.default` file bundled. Kexec-tools -will act with following rules: - -If current boot kernel is using the default `crashkernel=` boot param value from -its `crashkernel.default` file, then kexec-tools will update new installed -kernel’s `crashkernel=` boot param using the value from the new installed -kernel’s `crashkernel.default` file. This ensures `crashkernel=` is always using -the latest default value. - -If current boot kernel's `crashkernel=` value is set to a non-default value, the -new installed kernel simply inherits this value. - -On systems using GRUB2 as the bootloader, each kernel has its own boot entry, -making it possible to set different `crashkernel=` boot param values for -different kernels. So kexec-tools won’t touch any already installed kernel's -boot param, only new installed kernel's `crashkernel=` boot param value will be -updated. - -But some utilities like `grub2-mkconfig` and `grubby` can override all boot -entry's boot params with the boot params value from the GRUB config file -`/etc/defaults/grub`, so kexec-tools will also update the GRUB config file in -case old `crashkernel=` value overrides new installed kernel’s boot param. - - -Downgrading kernel package --------------------------- - -When upgrading a kernel package, kexec-tools may update the `crashkernel=` value -in GRUB2 config file to the new value. So when downgrading the kernel package, -kexec-tools will also try to revert that update by setting GRUB2 config file’s -`crashkernel=` value back to the default value in the older kernel package. This -will only occur when the GRUB2 config file is using the default `crashkernel=` -value. 
- - -Custom kernel -============= - -To make auto crashkernel update more robust, kexec-tools will try to keep -tracking the default 'crashkernel=` value with kernels that don’t have a -`crashkernel.default` file, such kernels are referred to as “custom kernel” in -this doc. This is only a best-effort support to make it easier debugging and -testing the system. - -When installing a custom kernel that doesn’t have a `crashkernel.default` file, -the `crashkernel=` value will be simply inherited from the current boot kernel. - -When installing a new official kernel package and current boot kernel is a -custom kernel, since the boot kernel doesn’t have a `crashkernel.default` file, -kexec-tools will iterate installed kernels and check if the boot kernel -inherited the default value from any other existing kernels’ -`crashkernel.default` file. If a matching `crashkernel.default` file is found, -kexec-tools will update the new installed kernel `crashkernel=` boot param using -the value from the new installed kernel’s `crashkernel.default` file, ensures -the auto crashkernel value update won’t break over one or two custom kernel -installations. - -It is possible that the auto crashkernel value update will fail when custom -kernels are used. One example is a custom kernel inheriting the default -`crashkernel=` value from an older official kernel package, but later that -kernel package is uninstalled. So when booted with the custom kernel, -kexec-tools can't determine if the boot kernel is inheriting a default -`crashkernel=` value from any official build. In such a case, please refer to -the "Reset crashkernel to default value" section of this doc. - - Reset crashkernel to default value ==================================
kexec-tools only perform the auto update of crashkernel value when it can confirm the boot kernel's crashkernel value is using its corresponding default -value or inherited from any installed kernel. - -kexec-tools may fail to determine if the boot kernel is using default -crashkernel value in some use cases: -- kexec-tools package is absent during a kernel package upgrade, and the new - kernel package’s `crashkernel.default` value has changed. -- Custom kernel is used and the kernel it inherits `crashkernel=` value from is - uninstalled. - -So it's recommended to reset the crashkernel value if users have uninstalled -kexec-tools or using a custom kernel. +value and auto_reset_crashkernel=yes in kdump.conf. In other cases, the user +can reset the crasherknel value by themselves.
Reset using kdumpctl -------------------- @@ -152,12 +65,14 @@ Reset using kdumpctl To make it easier to reset the `crashkernel=` kernel cmdline to this default value properly, `kdumpctl` also provides a sub-command:
- `kdumpctl reset-crashkernel [<kernel version>]` + `kdumpctl reset-crashkernel [--kernel=path_to_kernel] [--reboot]`
This command will read from the `crashkernel.default` file and reset bootloader's kernel cmdline to the default value. It will also update bootloader config if the bootloader has a standalone config file. User will have to reboot -the machine after this command to make it take effect. +the machine after this command to make it take effect if --reboot is not specified. +For ppc64le, an optional "[--fadump=[on|off|nocma]]" can also be specified to toggle +FADump on/off.
Reset manually -------------- @@ -166,7 +81,7 @@ To reset the crashkernel value manually, it's recommended to use utils like `grubby`. A one liner script for resetting `crashkernel=` value of all installed kernels to current boot kernel's crashkernel.default` is:
- grubby --update-kernel ALL --args "$(cat /usr/lib/modules/$(uname -r)/crashkernel.default)" + grubby --update-kernel ALL --args "$(kdumpctl get-default-crashkernel)"
Estimate crashkernel ====================
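As a reading aid for the default value quoted in the howto patch above (1G-4G:192M,4G-64G:256M,64G-102400T:512M), the following sketch picks the reservation that applies to a given amount of RAM. Both function names are hypothetical. It assumes half-open ranges (start included, end excluded), and it only handles closed start-end ranges with M, G or T suffixes, so it is not a full parser for the crashkernel= syntax.

```bash
# Convert a size such as 192M, 4G or 102400T to MiB. Other suffixes and
# missing suffixes are rejected.
_size_to_mib()
{
    local _num="${1%[MGT]}"

    case "$_num" in
        '' | *[!0-9]*) return 1 ;;
    esac
    case "$1" in
        *M) echo "$_num" ;;
        *G) echo $((_num * 1024)) ;;
        *T) echo $((_num * 1024 * 1024)) ;;
        *) return 1 ;;
    esac
}

# Print the size selected by a "start-end:size,..." value.
#   $1: crashkernel value
#   $2: system RAM in MiB
pick_crashkernel_size()
{
    local _rest="$1" _mem_mib="$2" _entry _range _size _start _end

    while [ -n "$_rest" ]; do
        # Split off the first comma-separated entry.
        _entry=${_rest%%,*}
        case "$_rest" in
            *,*) _rest=${_rest#*,} ;;
            *) _rest="" ;;
        esac
        _range=${_entry%%:*}
        _size=${_entry#*:}
        _start=$(_size_to_mib "${_range%%-*}") || return 1
        _end=$(_size_to_mib "${_range#*-}") || return 1
        if [ "$_mem_mib" -ge "$_start" ] && [ "$_mem_mib" -lt "$_end" ]; then
            echo "$_size"
            return 0
        fi
    done
    return 1
}
```

Under these assumptions, a machine with 8 GiB of RAM (8192 MiB) falls into the 4G-64G range, so the function prints 256M for the value above.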
Hi Coiby,
v3 looks really good, just a few small nits.
There is one big problem I see but I'm afraid it's out of scope for this series. The problem I see is that kdumpctl becomes more and more a cli for "normal" users.
The way I see it, kdumpctl was originally planned as a script run by the kdump.service and not directly by the user on the command line. Thus the "ui" were the config files /etc/kdump.conf and /etc/sysconfig/kdump. When users needed to change anything for kdump, they were supposed to update the config files and restart kdump.service. With the reset-crashkernel and estimate sub-commands this changed. Those are now expected to be run by "normal" users on the command line.
In my opinion this makes quite a difference on how we, as a team, need to treat those sub-commands. For example we need to keep "public" sub-commands, i.e. those that are supposed to be run by the user, stable within a rhel major release as incompatible changes might break scripts written by the user. Furthermore the "public" sub-commands should all follow one "design" (e.g. option naming) so the cli is easy to use.
That's something we should discuss in the team so we all head in the same direction. But as said earlier, I'm afraid it's a problem beyond this series. In particular, it is too late to start that discussion now if we want the feature to make it into 8.6.
Thanks
Philipp
On Thu, 16 Dec 2021 14:36:42 +0800 Coiby Xu <coxu@redhat.com> wrote:
v3:
- fixes suggested by Philipp
- fix incorrect usage of kdump_get_arch_recommend_crashkernel
- s/get_default_crashkernel/get-default-crashkernel
- no longer depend on grubby --update-kernel=ALL to update all kernels' command line parameters; use a single loop instead to simplify the code
- indentation issue fix
- commit message improvement
- update crashkernel-howto.txt as suggested by Dave
- CoreOS support
- make "kdumpctl reset-crashkernel" work for CoreOS
- kdumpctl can't be run in an RPM scriptlet, so disable it for CoreOS
- set up kernel crashkernel for osbuild
v2:
- no longer address the swiotlb memory requirement when SME is enabled
- automatically reset crashkernel to the default value only when the value was previously set by kexec-tools. So the crashkernel option that was to be added to kdump.conf is replaced by the auto_reset_crashkernel option
- multiple fixes suggested by Philipp including regex improvement, typo fixes, grubby kernel path check and commit message improvements
- address the case where a kernel path is not /boot/vmlinuz-{KERNEL_RELEASE}
- "kdumpctl fadump" dropped. Support fadump via "kdumpctl reset-crashkernel [--fadump=[on|off|nocma]]" instead
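The "single loop" item in v3 and the non-standard kernel path item in v2 can be illustrated with a rough sketch. This is not the code from the series: the function names `list_kernel_paths` and `reset_all_kernels` are hypothetical, and the sketch assumes that `grubby --info ALL` prints one `kernel="<path>"` line per boot entry. Taking the path from grubby's own output avoids assuming /boot/vmlinuz-{KERNEL_RELEASE}.

```shell
#!/bin/bash

# Hypothetical helper: read `grubby --info ALL`-style output on stdin and
# print one kernel path per line. Assumes lines of the form kernel="<path>".
list_kernel_paths() {
    sed -n 's/^kernel="\(.*\)"$/\1/p'
}

# Hypothetical loop: update every boot entry individually instead of relying
# on `grubby --update-kernel=ALL`. $1 is the crashkernel value to set.
reset_all_kernels() {
    local crashkernel=$1 kernel
    while read -r kernel; do
        # stdin is redirected so grubby cannot consume the loop's input
        grubby --update-kernel="$kernel" --args="crashkernel=$crashkernel" < /dev/null
    done < <(grubby --info ALL | list_kernel_paths)
}
```

Calling `reset_all_kernels "$(kdumpctl get-default-crashkernel)"` would then apply the default to each entry; a per-entry loop also leaves room to skip entries, which `--update-kernel=ALL` does not.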
The crashkernel=auto implementation in kernel space has been rejected upstream [1]. The current user space implementation [2] [3] ships a crashkernel.default but doesn't support fadump. Meanwhile, the crashkernel.default implementation seems to be overly complex:
- the default kernel crashkernel value rarely changes. There is no need to ship the same crashkernel.default for every kernel package of an architecture;
- when deciding the value of crashkernel for a new kernel, the crashkernel.default of the installed kernels and the running kernel is taken into consideration (for the details, check 92-crashkernel.install).
According to Kairui [4], crashkernel.default per kernel package is meant to accommodate kernel differences; for example, different kernels could be built with different configurations and thus need different crashkernel values. But these should be minor cases and may not be sufficient to justify the complexity of 92-crashkernel.install. Currently, we don't know how a kernel debug/feature config would affect the crashkernel value. Even if a kernel config may require a much larger crashkernel, we can address it in kexec-tools later.
There are known cases that could lead to a larger crashkernel, including enabling SME, LUKS encryption, etc. But this patch set puts them aside since they may be taken care of in kernel space instead.
So this patch set would simply add support for fadump and move the default kernel crashkernel from kernel package to kexec-tools,
- provide "kdumpctl get-default-crashkernel" for kdump-anaconda-addon to get the default kernel crashkernel values for a specific architecture (fadump is supported as well)
- re-write "kdumpctl reset-crashkernel" to support fadump
- introduce auto_reset_crashkernel which determines whether to reset kernel crashkernel to new default value or not when kexec-tools updates the default crashkernel value
Because the kernel hook /usr/lib/kernel/install.d/20-grub.install would make the installed kernel inherit the kernel cmdline of current running kernel i.e. /proc/cmdline, we only need to reset crashkernel when kexec-tools increases the default crashkernel values.
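The reset policy described above (only touch a kernel whose crashkernel value was set by kexec-tools, and only when the default actually changes) could be sketched as follows. Both function names are hypothetical and are not taken from the series. The sketch also assumes that the last `crashkernel=` occurrence on a command line is the effective one, and it treats "set by kexec-tools" as "equal to the old default".

```shell
#!/bin/bash

# Hypothetical helper: print the value of the last "crashkernel=" parameter
# in a kernel command line string (e.g. the content of /proc/cmdline).
# Prints an empty line if the parameter is absent.
get_crashkernel_from_cmdline() {
    local cmdline=$1 param value=""
    local -a params
    # read -a splits on whitespace without triggering pathname expansion
    read -r -a params <<< "$cmdline"
    for param in "${params[@]}"; do
        case $param in
            crashkernel=*) value=${param#crashkernel=} ;;
        esac
    done
    printf '%s\n' "$value"
}

# Hypothetical policy check: succeed only if the kernel still carries the old
# default (so the user has not customized it) and the default has changed.
should_reset_crashkernel() {
    local current=$1 old_default=$2 new_default=$3
    [[ -n $current && $current == "$old_default" && $old_default != "$new_default" ]]
}
```

For example, `should_reset_crashkernel "$(get_crashkernel_from_cmdline "$(cat /proc/cmdline)")" "$old" "$new"` would decide whether the running kernel's entry is a candidate, with `$old` and `$new` standing for the defaults before and after a kexec-tools update.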
[1] https://lore.kernel.org/linux-mm/20210507010432.IN24PudKT%25akpm@linux-found...
[2] https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1171
[3] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/...
[4] https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/...
Coiby Xu (13):
  update default crashkernel value
  factor out kdump_get_arch_recommend_crashkernel
  provide kdumpctl get-default-crashkernel for kdump_anaconda_addon and RPM scriptlet
  add a helper function to read kernel cmdline parameter from grubby --info
  add helper functions to get dump mode
  add helper functions to get kernel path by kernel release and the path of current running kernel
  fix incorrect usage of rpm-ostree to update kernel command line parameters
  rewrite reset_crashkernel to support fadump and to used by RPM scriptlet
  introduce the auto_reset_crashkernel option to kdump.conf
  try to reset kernel crashkernel when kexec-tools updates the default crashkernel value
  reset kernel crashkernel for the special case where the kernel is updated right after kexec-tools
  set up kernel crashkernel for osbuild in kernel hook
  update crashkernel-howto
 92-crashkernel.install | 135 +----------------
 crashkernel-howto.txt  | 123 +++-------------
 kdump-lib.sh           |  60 ++++----
 kdump.conf             |   7 +
 kdump.conf.5           |   6 +
 kdumpctl               | 320 ++++++++++++++++++++++++++++++++++++++---
 kdumpctl.8             |  16 ++-
 kexec-tools.spec       |  15 ++
 8 files changed, 393 insertions(+), 289 deletions(-)