On 2016/11/01 at 14:10, Xunlei Pang wrote:
On 2016/10/31 at 15:15, Xunlei Pang wrote:
> Add dracut-memdebug-ko.sh, install it to the dracut kdump module.
>
> The principle is to use kernel trace to track buddy page allocation
> events during kernel module loading (module_init), thus we can analyze
> all the trace data and get the total memory consumption. As for large
> slab allocation, it will fall into buddy, thus tracing "mm_page_alloc"
> only should be enough for the purpose.
>
> One major flaw of this method is that it consumes a lot of memory, users
> should increase the crash kernel memory reservation or trace buffer size
> (via "trace_buf_size=nn[KMG]") as needed.
We can address this major flaw now: filtering the trace events keeps the
trace data very small.
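To illustrate the idea, here is a minimal sketch of how the filter expression can be assembled from a list of task names. The helper names build_comm_filter and apply_page_alloc_filter are made up for this example and are not part of the patch. The tracing directory passed to the second helper is an assumption, and writing to it needs root.

```shell
# Build an event filter that matches any of the given task names (comm).
# Hypothetical helper, not part of the patch.
build_comm_filter() {
    filter=""
    for name in "$@"; do
        if [ -n "$filter" ]; then
            filter="$filter || comm == $name"
        else
            filter="comm == $name"
        fi
    done
    printf '%s\n' "$filter"
}

# Write the filter to the mm_page_alloc event under the given tracing
# directory, if that file is writable. Also a hypothetical helper.
apply_page_alloc_filter() {
    filter_file="$1/events/kmem/mm_page_alloc/filter"
    shift
    [ -w "$filter_file" ] || return 1
    build_comm_filter "$@" > "$filter_file"
}

# prints: comm == systemd-udevd || comm == modprobe || comm == insmod
build_comm_filter systemd-udevd modprobe insmod
```

Only events raised by tasks with one of these names reach the trace buffer, which is why the buffer no longer needs to be enlarged.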
Here is the improved version:
Subject: [PATCH v2 1/3] memdebug-ko: add dracut-memdebug-ko.sh to debug kernel
module memory consumption

Add dracut-memdebug-ko.sh and install it to the dracut kdump module.

The principle is to use kernel tracing to track buddy page allocation
events during kernel module loading (module_init), so that we can analyze
the trace data and get the total memory consumption. Large slab
allocations fall through to the buddy allocator, so tracing
"mm_page_alloc" alone should be enough for the purpose.

There are three known applications that load modules:
"systemd-udevd", "modprobe" and "insmod".
We use them as the mm_page_alloc filter, which avoids a large number of
events. As a result, the trace data is very small.

Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
---
dracut-memdebug-ko.sh | 111 ++++++++++++++++++++++++++++++++++++++++++++++++++
kexec-tools.spec | 2 +
2 files changed, 113 insertions(+)
create mode 100755 dracut-memdebug-ko.sh
diff --git a/dracut-memdebug-ko.sh b/dracut-memdebug-ko.sh
new file mode 100755
index 0000000..fc0a4ba
--- /dev/null
+++ b/dracut-memdebug-ko.sh
@@ -0,0 +1,111 @@
+# Try to find out kernel modules with large total memory allocation during loading.
+# For large slab allocation, it will fall into buddy, thus tracing "mm_page_alloc"
+# alone should be enough for the purpose.
+
+TRACE_BASE="/sys/kernel/debug"
+# trace access through debugfs would be obsolete if "/sys/kernel/tracing" is available.
+if [[ -d "/sys/kernel/tracing" ]]; then
+ TRACE_BASE="/sys/kernel"
+fi
+
+# old debugfs case.
+if ! [[ -d "$TRACE_BASE/tracing" ]]; then
+ mount none -t debugfs $TRACE_BASE
+# new tracefs case.
+elif ! [[ -f "$TRACE_BASE/tracing/trace" ]]; then
+ mount none -t tracefs "$TRACE_BASE/tracing"
+fi
+
+if ! [[ -f "$TRACE_BASE/tracing/trace" ]]; then
+ warn "Mount trace failed for kernel module memory analyzing."
+ return 0
+fi
+
+MATCH_EVENTS="module:module_put module:module_load kmem:mm_page_alloc"
+SET_EVENTS=$(echo $(cat $TRACE_BASE/tracing/set_event))
+# Check if trace was properly setup, prepare it if not.
+if [[ $(cat $TRACE_BASE/tracing/tracing_on) != 1 ]] || \
+ [[ "$SET_EVENTS" != "$MATCH_EVENTS" ]]; then
+ # Set our trace events.
+ echo $MATCH_EVENTS > $TRACE_BASE/tracing/set_event
+
+ # There are three kinds of known applications for module loading:
+ # "systemd-udevd", "modprobe" and "insmod".
+ # Set them to the mm_page_alloc event filter.
+ page_alloc_filter="comm == systemd-udevd* || comm == modprobe* || comm == modprobe*"
Oops, this line is wrong: a trailing "*" does not seem to work in the filter, and the last entry should be insmod. It should be:
    page_alloc_filter="comm == systemd-udevd || comm == modprobe || comm == insmod"
+ echo $page_alloc_filter > $TRACE_BASE/tracing/events/kmem/mm_page_alloc/filter
+
+ # Set the number of comm-pid. Thanks to filters, 4096 is big enough(also generally supported).
+ echo 4096 > $TRACE_BASE/tracing/saved_cmdlines_size
+
+ # Enable and clear trace data.
+ echo 1 > $TRACE_BASE/tracing/tracing_on
+ echo > $TRACE_BASE/tracing/trace
+ return 0
+fi
+
+# Indexed by task pid.
+declare -A current_module
+# Indexed by module name.
+declare -A module_loaded
+declare -A nr_alloc_pages
+
+# For large trace data, parsing tracing/trace turns out to be very slow,
+# so copy it out first and we parse the copy file to avoid this issue.
+TMPFILE=/tmp/kdump.trace.tmp.$$$$
+cp $TRACE_BASE/tracing/trace $TMPFILE -f
+while read pid cpu flags ts function ;
+do
+ # Skip comment lines
+ if [[ $pid = "#" ]]; then
+ continue
+ fi
+
+ if [[ $function = module_load* ]]; then
+ # One module is being loaded, save the task pid for tracking.
+ module_name=${function#*: }
+ # Remove the trailing after whitespace, there may be the module flags.
+ module_name=${module_name%% *}
+ module_names+=" $module_name"
+ current_module[$pid]="$module_name"
+ [[ ${module_loaded[$module_name]} ]] && warn "\"$module_name\" was loaded multiple times!"
+ unset module_loaded[$module_name]
+ nr_alloc_pages[$module_name]=0
+ fi
+
+ if ! [[ ${current_module[$pid]} ]]; then
+ continue
+ fi
+
+ if [[ $function = module_put* ]]; then
+ # Mark the module as loaded
+ module_loaded[${current_module[$pid]}]=1
+ # Module has been loaded when module_put is called, untrack the task
+ unset current_module[$pid]
+ continue
+ fi
+
+ # Once we get here, the task is being tracked(is loading a module).
+ # Get the module name.
+ module_name=${current_module[$pid]}
+
+ if [[ $function = mm_page_alloc* ]]; then
+ order=$(echo $function | sed -e 's/.*order=\([0-9]*\) .*/\1/')
+ nr_alloc_pages[$module_name]=$((${nr_alloc_pages[$module_name]}+$((2 ** $order))))
+ fi
+done < $TMPFILE
+
+echo "== debug_mem for kernel modules during loading begin ==" >&2
+for i in $module_names; do
+ status="load finished"
+ if ! [[ ${module_loaded[$i]} ]]; then
+ status="loading"
+ fi
+ echo "${nr_alloc_pages[$i]} pages consumed by \"$i\" [$status]" >&2
+done
+echo "== debug_mem for kernel modules during loading end ==" >&2
+
+unset module_names
+unset module_loaded
+rm $TMPFILE -f
+return 0
diff --git a/kexec-tools.spec b/kexec-tools.spec
index 1597071..691ad7a 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -41,6 +41,7 @@ Source103: dracut-kdump-error-handler.sh
Source104: dracut-kdump-emergency.service
Source105: dracut-kdump-error-handler.service
Source106: dracut-kdump-capture.service
+Source107: dracut-memdebug-ko.sh
Requires(post): systemd-units
Requires(preun): systemd-units
@@ -224,6 +225,7 @@ cp %{SOURCE103} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpb
cp %{SOURCE104} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE104}}
cp %{SOURCE105} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE105}}
cp %{SOURCE106} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE106}}
+cp %{SOURCE107} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE107}}
chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE100}}
chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE101}}
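
For reference, the page accounting done in the parsing loop of dracut-memdebug-ko.sh can be tried in isolation. The sketch below is not part of the patch: the helper names order_of and count_pages are made up, and the input lines are hand-written placeholders rather than captured trace output. Each mm_page_alloc event of order N contributes 2^N pages.

```shell
# Extract the allocation order from one mm_page_alloc event string.
# Prints nothing if the line carries no "order=" field.
order_of() {
    printf '%s\n' "$1" | sed -n 's/.*order=\([0-9][0-9]*\).*/\1/p'
}

# Read event strings from stdin and print the total number of pages,
# counting 2^order pages per event.
count_pages() {
    total=0
    while IFS= read -r line; do
        order=$(order_of "$line")
        [ -n "$order" ] || continue
        total=$((total + (1 << order)))
    done
    printf '%s\n' "$total"
}

# Placeholder input with orders 0, 2 and 3: 1 + 4 + 8 pages.
# prints: 13
printf '%s\n' \
    'mm_page_alloc: page=X pfn=1 order=0 migratetype=0' \
    'mm_page_alloc: page=X pfn=2 order=2 migratetype=0' \
    'mm_page_alloc: page=X pfn=3 order=3 migratetype=0' | count_pages
```

Multiplying the result by the page size (for example the value reported by "getconf PAGESIZE") gives the consumption in bytes.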