On Wed, Nov 27, 2019 at 8:19 PM d.hatayama(a)fujitsu.com
<d.hatayama(a)fujitsu.com> wrote:
> -----Original Message-----
> From: Kairui Song [mailto:kasong@redhat.com]
> Sent: Tuesday, November 26, 2019 3:05 PM
> To: kexec(a)lists.fedoraproject.org
> Cc: Hatayama, Daisuke/畑山 大輔 <d.hatayama(a)fujitsu.com>; Pingfan Liu
> <piliu(a)redhat.com>; Dave Young <dyoung(a)redhat.com>; Kairui Song
> <kasong(a)redhat.com>
> Subject: [PATCH] kdump-error-handler.service: Remove ExecStopPost
>
> Currently "systemctl --fail --no-block default" will be executed on
> kdump-error-handler exit due to the ExecStopPost setting. This may make
> systemd try to isolate to the default target again if both
> final_action and failure_action fail to terminate the system.
>
> The execution chain looks like this:
>
> initrd.target -> kdump.sh (failed) -> kdump-error-handler.service ->
> failure_action -> final_action -> ExecStopPost (go to initrd.target again
> and loop infinitely)
>
> Currently, reboot/shutdown/halt is guaranteed to be called by
> executing either failure_action or final_action, so the loop will be stopped.
> However, none of the reboot/shutdown/halt calls is blocking, so there might
> be a race, and this ExecStopPost setting is not useful after all.
>
> Just drop the ExecStopPost to fix this potential issue.
> ---
> dracut-kdump-error-handler.service | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/dracut-kdump-error-handler.service
> b/dracut-kdump-error-handler.service
> index 32b74ab..a23b75e 100644
> --- a/dracut-kdump-error-handler.service
> +++ b/dracut-kdump-error-handler.service
> @@ -21,7 +21,6 @@ Environment=DRACUT_SYSTEMD=1
> Environment=NEWROOT=/sysroot
> WorkingDirectory=/
> ExecStart=/bin/kdump-error-handler.sh
> -ExecStopPost=-/usr/bin/systemctl --fail --no-block default
> Type=oneshot
> StandardInput=tty-force
> StandardOutput=inherit
> --
> 2.23.0
[  OK  ] Started Setup Virtual Console.
Starting Kdump Error Handler...
Kdump: Executing failure action exit 0
[  OK  ] Started Kdump Error Handler.
[  OK  ] Started Journal Service.
[ 12.857263] systemd[1]: Startup finished in 1.728s (kernel) + 0 (initrd) + 11.127s (userspace) = 12.856s.
-- 1:d.hatayama@localhost:~ -- time-stamp -- Nov/27/19 7:15:34 --
I tested this with an artificial change to reproduce the issue
and saw no infinite loop.
Tested-by: HATAYAMA Daisuke <d.hatayama(a)fujitsu.com>
With this patch the system eventually hangs in that case, so a possible
further improvement would be to add a service unit that runs
systemctl reboot -ff or systemctl poweroff -ff, started by a timer unit
with OnBootSec=, for example.
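
A rough sketch of what such a pair of units might look like. The unit
names (kdump-force-reboot.timer, kdump-force-reboot.service) and the
10 minute timeout are assumptions made up for illustration and are not
part of the patch; how the timer gets pulled into the kdump initramfs
(for example via a symlink installed by the dracut module) is also left
open here.

```ini
# kdump-force-reboot.timer (hypothetical name)
[Unit]
Description=Force reboot if kdump did not terminate the system
DefaultDependencies=no

[Timer]
# Assumed timeout; must be longer than the slowest expected dump.
OnBootSec=10min
Unit=kdump-force-reboot.service

# kdump-force-reboot.service (hypothetical name)
[Unit]
Description=Forced reboot fallback for kdump
DefaultDependencies=no

[Service]
Type=oneshot
# -ff reboots immediately without going through the service manager.
ExecStart=/usr/bin/systemctl reboot -ff
```

The two sections would live in separate files. Because OnBootSec= is
measured from boot of the kdump kernel, the timeout would have to be
chosen with large memory dumps in mind, otherwise the fallback could
cut off a dump that is still in progress.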
Good suggestion, that could be done as an improvement later.
--
Best Regards,
Kairui Song