On 07/31/14 at 08:58am, Vivek Goyal wrote:
On Thu, Jul 31, 2014 at 02:30:09PM +0800, WANG Chao wrote:
[..]
I just realized this doesn't work, because dracut-initqueue.sh calls
"systemctl start dracut-emergency" directly, not via
"OnFailure=dracut-emergency". So this will block:
dracut-initqueue (running)
  --> call dracut-emergency:
      --> dracut-emergency (running)
          --> call dracut-initqueue:
              --> blocking and waiting for the original instance to exit.
Only dracut-initqueue has this issue, because of the way its code is
written. I'm thinking of introducing a wrapper emergency service. This
emergency service will replace both the systemd and dracut emergency
services. It does nothing but isolate to the real kdump error handler
service:
dracut-initqueue (running)
  --> call dracut-emergency:
      --> dracut-emergency isolates to kdump-error-handler.service
          --> dracut-emergency and dracut-initqueue will both be stopped
              and kdump-error-handler.service will run kdump-error-handler.sh
In a normal failure case, this still works:
foo.service fails
  --> trigger emergency.service
      --> emergency.service isolates to kdump-error-handler.service
          --> kdump-error-handler.service will run kdump-error-handler.sh
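
To make the isolate step concrete, here is a rough sketch of what the
wrapper emergency service could run. This is only an illustration, not
the final patch: the function name and the SYSTEMCTL override are made
up for this example, and passing --no-block is an assumption on my side
(so the caller does not wait on a job that is about to stop it). The
target unit would also need AllowIsolate=yes for isolate to be accepted.

```shell
#!/bin/sh
# Sketch of the wrapper emergency step (hypothetical helper, not from the patch).
# SYSTEMCTL can be overridden, e.g. for a dry run; it defaults to systemctl.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

isolate_to_kdump_error_handler() {
    # Assumption: --no-block so we do not wait for the isolate job,
    # which stops the calling unit (dracut-emergency / dracut-initqueue).
    "$SYSTEMCTL" --no-block isolate kdump-error-handler.service
}

# Usage in the wrapper service would simply be:
#   isolate_to_kdump_error_handler
```

With this, both the direct "systemctl start dracut-emergency" path and
the OnFailure= path end up in the same kdump-error-handler.service.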
> --- /dev/null
> +++ b/dracut-kdump-error-handler.sh
> @@ -0,0 +1,18 @@
> +#!/bin/sh
> +
> +. /lib/kdump-lib-initramfs.sh
> +
> +set -o pipefail
> +export PATH=$PATH:$KDUMP_SCRIPT_DIR
> +
> +# We only allow to enter kdump error handler once.
> +# If error happens in error handling path and we simply reboot.
> +if [ -f /kdump-error-handler ]; then
> + do_final_action
Why are we rebooting here? Why not simply return and exit, so that the
other instance of the error handler faces the error and reboots the system?
Only one instance of a service is allowed to run at a time. If we exit
from it, systemd would continue booting towards default.target
(initrd.target). Because kdump.sh runs before initrd.target, kdump.sh
would be run. That's totally wrong, because we've already entered the
kdump error handler.
A "reboot" once we finish error handling makes sense, because:
1. We've already done the error handling and there's nothing left for
us to do.
2. Exiting from the error handling path will cause systemd to continue
booting. We don't want to go any further in an error case. Who knows
what would happen once something unexpected (error/failure) has
already made the boot unstable.
> +else
> + > /kdump-error-handler
> +fi
How about calling this file kdump-error-handler-running, so that
nobody confuses it with the actual error handler?
I'm OK with it.
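
Roughly, the run-once guard with the new name could look like the
sketch below. It is only an illustration: the function name is made up,
and the marker path is passed as a parameter here so the logic can be
tried outside the initramfs, whereas the patch uses a fixed file
under /.

```shell
#!/bin/sh
# Sketch of the run-once guard (hypothetical helper, not the final patch).
# Returns 0 on first entry and creates the marker file;
# returns 1 if the marker already exists (the handler was entered before).
enter_error_handler_once() {
    marker=$1
    if [ -f "$marker" ]; then
        return 1
    fi
    : > "$marker"
    return 0
}

# Usage in the handler, with do_final_action coming from
# kdump-lib-initramfs.sh as in the patch:
#   enter_error_handler_once /kdump-error-handler-running || do_final_action
```

On a second entry the caller goes straight to do_final_action instead
of returning, for the reasons given above.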
Thanks
WANG Chao