On Mon, Apr 29, 2019 at 7:02 PM Kairui Song <kasong(a)redhat.com> wrote:
On Mon, Apr 29, 2019 at 1:58 PM Pingfan Liu <piliu(a)redhat.com> wrote:
>
> On powerpc, after hot-adding a cpu and triggering a crash on the hot-added
> cpu, the kdump kernel hangs after "I'm in purgatory".
>
> The current udev rules expect the dtb to be rebuilt on cpu add/remove events.
> But since powerpc does not follow the standard cpu hot add framework, it
> only emits online/offline events to user space when a cpu is hot
> added/removed, instead of add/remove events. Pingfan tried fixing that but
> it didn't please the maintainer as it breaks some old userspace tools.
>
> Because the dtb is not rebuilt, the kdump kernel fails to get the
> 'boot_cpuid' and eventually fails to boot [see early_init_dt_scan_cpus() in
> arch/powerpc/kernel/prom.c] if the system crashes on a hot-added CPU.
>
> Work around it by changing udev rules on powerpc to online/offline.
>
> As for the offline message, it is useless on powerpc and can be dropped.
> Explanation: on powerpc, /sys/devices/system/cpu/cpuX nodes are present
> for all "possible" CPUs, irrespective of whether a CPU is hot-added/removed.
> crash_notes are already built for all /sys/devices/system/cpu/cpuX nodes, and
> these nodes are present for all "possible" CPUs
> (online/offline/could-be-hot-removed/could-be-hot-added).
>
> Signed-off-by: Pingfan Liu <piliu(a)redhat.com>
> ---
> v3 -> v4: fix the mistakenly installed version on non-ppc platforms, and rename
> the file to 98-kexec.rules.ppc64 (installed as 98-kexec.rules).
> Tested on x86 and ppcle
>
> 98-kexec.rules.ppc64 | 15 +++++++++++++++
> kexec-tools.spec | 10 ++++++++--
> 2 files changed, 23 insertions(+), 2 deletions(-)
> create mode 100644 98-kexec.rules.ppc64
>
> diff --git a/98-kexec.rules.ppc64 b/98-kexec.rules.ppc64
> new file mode 100644
> index 0000000..9d783a0
> --- /dev/null
> +++ b/98-kexec.rules.ppc64
> @@ -0,0 +1,15 @@
> +SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload"
> +SUBSYSTEM=="memory", ACTION=="online", GOTO="kdump_reload"
> +SUBSYSTEM=="memory", ACTION=="offline", GOTO="kdump_reload"
> +
> +GOTO="kdump_reload_end"
> +
> +LABEL="kdump_reload"
> +
> +# If kdump is not loaded, calling "kdumpctl reload" will end up
> +# doing nothing, but it and systemd-run will always generate
> +# extra logs for each call, so trigger the "kdumpctl reload"
> +# only if kdump service is active to avoid unnecessary logs
> +RUN+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet /usr/bin/kdumpctl reload'"
Noticed a small problem when merging it: "kdumpctl reload" is called
directly here on every udev event. Maybe it's better to use
"kdump-udev-throttler", in case there are a lot of udev events, which
would bring extra overhead for the machine. On x86, some hypervisors
will time out waiting for everything to settle. Maybe such a problem
doesn't exist on PPC, but it is still better not to flood the machine
with kdump reloads.
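
Something like the following is what I have in mind. This is only an
untested sketch: it assumes the throttler ends up at
/usr/lib/udev/kdump-udev-throttler (that is,
%{_udevrulesdir}/../kdump-udev-throttler with the usual value of
%{_udevrulesdir}), and the --no-block option is my assumption, added so
that udev does not wait for the throttler to finish:

```
# Sketch only: same guard as above, but hand the event to the throttler
# instead of calling "kdumpctl reload" directly. The throttler path and
# the use of --no-block are assumptions, not taken from this patch.
RUN+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet --no-block /usr/lib/udev/kdump-udev-throttler'"
```

That way a burst of cpu/memory events should collapse into a small
number of reloads instead of one reload per event.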
> +
> +LABEL="kdump_reload_end"
> diff --git a/kexec-tools.spec b/kexec-tools.spec
> index a1e6686..3ebf9da 100644
> --- a/kexec-tools.spec
> +++ b/kexec-tools.spec
> @@ -18,7 +18,8 @@ Source8: kdump.conf
> Source9: http://downloads.sourceforge.net/project/makedumpfile/makedumpfile/1.6.5/...
> Source10: kexec-kdump-howto.txt
> Source12: mkdumprd.8
> -Source14: 98-kexec.rules
> +Source13: 98-kexec.rules
> +Source14: 98-kexec.rules.ppc64
> Source15: kdump.conf.5
> Source16: kdump.service
> Source18: kdump.sysconfig.s390x
> @@ -169,10 +170,15 @@ install -m 644 %{SOURCE25} $RPM_BUILD_ROOT%{_mandir}/man8/kdumpctl.8
> install -m 755 %{SOURCE20} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-lib.sh
> install -m 755 %{SOURCE23} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-lib-initramfs.sh
> %ifnarch s390x
> +install -m 755 %{SOURCE28} $RPM_BUILD_ROOT%{_udevrulesdir}/../kdump-udev-throttler
> +%endif
> +%ifnarch s390x ppc64 ppc64le
> # For s390x the ELF header is created in the kdump kernel and therefore kexec
> # udev rules are not required
> +install -m 644 %{SOURCE13} $RPM_BUILD_ROOT%{_udevrulesdir}/98-kexec.rules
> +%endif
> +%ifarch ppc64 ppc64le
> install -m 644 %{SOURCE14} $RPM_BUILD_ROOT%{_udevrulesdir}/98-kexec.rules
> -install -m 755 %{SOURCE28} $RPM_BUILD_ROOT%{_udevrulesdir}/../kdump-udev-throttler
> %endif
> install -m 644 %{SOURCE15} $RPM_BUILD_ROOT%{_mandir}/man5/kdump.conf.5
> install -m 644 %{SOURCE16} $RPM_BUILD_ROOT%{_unitdir}/kdump.service
> --
> 2.20.1
>
Looks good to me.
Acked-by: Kairui Song <kasong(a)redhat.com>
So nack my previous ack...
--
Best Regards,
Kairui Song