On 01/06/17 at 05:49pm, Xunlei Pang wrote:
Check the number of cpus for the x86_64 kdump kernel to boot with.
We hit an issue on x86_64: kdump runs out of vectors with the
default "nr_cpus=1" when a large number of irqs are requested.
This patch detects this situation and warns users about the risk.
Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
- When detecting risky cpu vectors, we just warn users instead of
forcibly modifying "nr_cpus=X".
- Improved code comments.
- Replaced nr_old with nr_origin, and improved some logic.
- Improved the code according to Dave's suggestions.
kdumpctl | 73 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index b2068cc..9dfc9e2 100755
@@ -105,6 +105,77 @@ append_cmdline()
+# Check the number of cpus for kdump kernel to boot with.
+# We met an issue for x86_64: kdump runs out of vectors with
+# "nr_cpus=1" when requesting tons of irqs, so here we check
+# "nr_cpus=X" and warn users if kdump probably can't work.
+ local nr nr_search nr_origin nr_min nr_max
+ local arch=$(uname -m) cmdline=$KDUMP_COMMANDLINE
+ # Special treatment for x86_64 only currently.
+ if [ $arch != "x86_64" ]; then
+ # We only care about "nr_cpus=X" format for x86.
+ nr_search=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | cut -d "=" -f2 | grep "[0-9]" | sort)
+ # In case there are multiple "nr_cpus=X", get the lowest non-zero value.
+ for nr in $nr_search; do
+ if [ $nr -gt 0 ]; then
+ if [ -z "$nr_origin" ]; then
How about just: echo $cmdline | grep -e "nr_cpus=1[[:space:]]"
If it returns true, we can simply use 1 to compare with nr_min.
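
A minimal sketch of that check, for illustration only. The function name
has_nr_cpus_one is hypothetical and not part of kdumpctl. Note that the
pattern as written above needs whitespace after the value, so it would miss
"nr_cpus=1" at the very end of the command line; the sketch assumes that case
should match too and anchors on both sides.

```shell
# Hypothetical helper (not in kdumpctl): succeed if the given command
# line contains "nr_cpus=1" as a whole token.
has_nr_cpus_one()
{
    # -E: extended regex, -q: no output, exit status only.
    # The anchors keep "nr_cpus=16" and "xnr_cpus=1" from matching.
    echo "$1" | grep -Eq '(^|[[:space:]])nr_cpus=1([[:space:]]|$)'
}

# Example call with a made-up command line.
if has_nr_cpus_one "irqpoll nr_cpus=1 reset_devices"; then
    echo "kdump kernel will boot with a single cpu"
fi
```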
> + # Online cpus in first kernel.
> + nr_max=$(grep -c '^processor' /proc/cpuinfo)
> + # Calculate estimated minimum cpus required by irqs(vectors).
> + nr_min=$(ls /proc/irq/ -l | grep ^d | wc -l)
> + # We roughly use 256-32(see kernel FIRST_EXTERNAL_VECTOR)=224 as
> + # maximum supported vectors can be allocated to io devices percpu.
> + # As nr_min is a ballpark figure, also some high-numbered vectors
> + # are consumed by the kernel(see FIRST_SYSTEM_VECTOR), we need a
> + # variance for safety.
> + #
> + # We got a large machine with 240 cpus, 6TB memory, 8 iommus, and
> + # 12 io-apics, 132 irqs under /proc/irq/, it can boot successfully
> + # with "nr_cpus=1". (256-32-132)=92, so choosing 64 as the variance
> + # seems ok. Then we get the max external irqs supported per cpu:
> + # (256-32-64)=160 as the dividend.
> + nr_min=$(($nr_min + 160 - 1))
> + nr_min=$(($nr_min / 160))
> + if [ $nr_min -gt 1 ]; then
> + # The system seems to have tons of interrupts. Interrupts with
> + # multiple-cpu affinity can consume multiple vectors, with one
> + # vector for each cpu within the affinity mask. Fortunately, for
> + # x2apic, which is widely used on large modern machines, boot,
> + # device bringup etc. use a single cpu for the interrupt affinity
> + # by default to minimize vector pressure.
> + #
> + # For further safety, we add one more cpu and round it up to an
> + # even number which is commonly-used.
> + nr_min=$(($nr_min + 1))
> + nr_min=$(($nr_min + $nr_min % 2))
> + fi
> + if [ $nr_min -gt $nr_max ]; then
> + nr_min=$nr_max
> + fi
> + if [ $nr_origin -ge $nr_min ]; then
> + return
> + fi
> + echo "Warning: nr_cpus=$nr_origin may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
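
To make the arithmetic in the quoted hunk easier to follow, here is a rough
standalone sketch of the same estimate. The function name estimate_min_cpus
and its two arguments are hypothetical; the patch itself reads the irq count
from /proc/irq/ and the cpu count from /proc/cpuinfo. The constant 160 is the
(256-32-64) figure from the comment above.

```shell
# Hypothetical helper (not in kdumpctl).
# $1: number of irqs, $2: number of online cpus in the first kernel.
# The body runs in a subshell so its variables do not leak to the caller.
estimate_min_cpus()
(
    nr_irqs=$1
    nr_max=$2
    per_cpu=160    # 256 - 32 - 64, as in the patch comment

    # Ceiling division: cpus needed to hold nr_irqs vectors.
    nr_min=$(( (nr_irqs + per_cpu - 1) / per_cpu ))

    if [ "$nr_min" -gt 1 ]; then
        # Add one spare cpu, then round up to an even number.
        nr_min=$(( nr_min + 1 ))
        nr_min=$(( nr_min + nr_min % 2 ))
    fi

    # Never suggest more cpus than are online.
    if [ "$nr_min" -gt "$nr_max" ]; then
        nr_min=$nr_max
    fi

    echo "$nr_min"
)

estimate_min_cpus 132 240
```

With the figures from the comment (132 irqs, 240 cpus) this prints 1, which
agrees with the observation that the machine boots with "nr_cpus=1". With
161 irqs the ceiling division gives 2, and the safety step raises that to 4.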
> # This function performs a series of edits on the command line.
> # Store the final result in global $KDUMP_COMMANDLINE.
> @@ -134,6 +205,8 @@ prepare_cmdline()
> + check_kdump_cpus