On 2019-08-12 17:18, Bhupesh Sharma wrote:
Hi Lianbo,
On 8/11/19 6:40 AM, lijiang wrote:
On 2019-08-09 13:45, Dave Young wrote:
On 08/06/19 at 07:22pm, Lianbo Jiang wrote:
From: Jun Wang junw99@yahoo.com
With some corrupted vmcore files, the vmcore-dmesg.txt file may grow forever until the kdump disk becomes full, which probably also causes disk error messages such as the following:
...
sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
blk_update_request: I/O error, dev sda, sector 134630552
sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
blk_update_request: I/O error, dev sda, sector 134630552
...
Thinking about the problem again: if log_buf_len is not corrupted, things will be fine. In the kernel code log_buf_len is a u32, so it should not be possible to fill the disk forever unless vmcore-dmesg is buggy.
Also, the implementation does not look very elegant. Can you try adding a limit in vmcore-dmesg.c instead?
Actually, the upstream kernel has a macro LOG_BUF_LEN_MAX, which is defined as 2G, so limiting the file output to 2G should be fine.
I checked vmcore-dmesg.c and made a draft patch that limits the size of vmcore-dmesg.txt, as follows:
diff --git a/util_lib/elf_info.c b/util_lib/elf_info.c
index 90a3b21662e7..a66241d8d76a 100644
--- a/util_lib/elf_info.c
+++ b/util_lib/elf_info.c
@@ -54,6 +54,9 @@ static uint64_t phys_offset = UINT64_MAX;
 #error "Unknown machine endian"
 #endif
 
+/* stole this macro from kernel printk.c */
+#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
+
 static uint16_t file16_to_cpu(uint16_t val)
 {
 	if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
@@ -534,6 +537,13 @@ static int32_t read_file_s32(int fd, uint64_t addr)
 static void write_to_stdout(char *buf, unsigned int nr)
 {
 	ssize_t ret;
+	static uint32_t n_bytes = 0;
+
+	n_bytes += nr;
+	if (n_bytes > LOG_BUF_LEN_MAX) {
+		fprintf(stderr, "The vmcore-dmesg.txt over 2G in size is not supported.\n");
+		return;
+	}
 
 	ret = write(STDOUT_FILENO, buf, nr);
 	if (ret != nr) {
What's your opinion?
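One thing that may be worth checking in the draft above: n_bytes is a uint32_t that is incremented even on calls that return early, so in principle it can wrap past 2^32 and drop back under LOG_BUF_LEN_MAX. The sketch below only simulates that arithmetic in shell; the 1 GiB chunk per call is a deliberately exaggerated, hypothetical value, and the variable names are made up for illustration, not taken from kexec-tools.

```shell
#!/bin/sh
# Simulate the draft's 32-bit byte counter (assumption: it wraps modulo 2^32,
# as an unsigned 32-bit integer does in C).
max=$((1 << 31))        # LOG_BUF_LEN_MAX, 2 GiB
mask=4294967295         # 2^32 - 1, used to emulate uint32_t wraparound
chunk=$((1 << 30))      # hypothetical: each call tries to write 1 GiB

n=0
verdicts=""
i=1
while [ "$i" -le 6 ]; do
    # The counter is bumped before the check, even when the write is refused.
    n=$(( (n + chunk) & mask ))
    if [ "$n" -gt "$max" ]; then
        verdict=blocked
    else
        verdict=written
    fi
    verdicts="${verdicts}${verdicts:+ }${verdict}"
    i=$((i + 1))
done

echo "$verdicts"
```

In this simulation the third call is refused, but the fourth call wraps the counter to 0 and writing resumes. If that matters in practice, a wider counter (for example uint64_t), or not adding to the counter once the limit has been hit, would avoid the wrap. That is only a suggestion, not something settled in this thread.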
I remember seeing a similar problem for arm64 vmcore-dmesg in the past (see https://docs.oracle.com/cd/E52668_01/E97565/html/ol7-bug-28064675.html for details).
However, what we normally observed was that the vmcore-dmesg output was almost never of abnormal size; rather, there was a bug (an infinite loop) in vmcore-dmesg.c, which was addressed by the following kexec-tools commit:
commit e277fa9ec702fea7bd3135393c67327c821d5a3a
Author: Omar Sandoval <osandov@fb.com>
Date:   Wed May 23 13:59:31 2018 -0700

    vmcore-dmesg: fix infinite loop if log buffer wraps around
I think if you have commits e08d26b3b7f1 and e277fa9ec702 applied, you shouldn't face this issue.
Thank you, Bhupesh. I have never reproduced this issue, but I guess the above patch can ensure that the size of vmcore-dmesg.txt is limited to 2G.
Hi Jun Wang,
Have you been able to reproduce this issue at your end even with the latest upstream kexec-tools (kexec-tools 2.0.20)?
If yes, I think we should probably fix any left-over infinite loop in vmcore-dmesg.c rather than worry about the vmcore-dmesg size, as the vmcore-dmesg output is normally never too large, even for corrupted vmcore cases (which I found by adding some debug prints to vmcore-dmesg.c).
Thanks, Bhupesh
Let's limit the size of vmcore-dmesg.txt to avoid such problems.
Signed-off-by: Jun Wang <junw99@yahoo.com>
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>

Changes since v1:
[1] Add dump_fs path to limit the size of vmcore-dmesg.txt
[2] Add the option 'iflag=fullblock' for the dd command.

 dracut-kdump.sh        | 4 ++--
 kdump-lib-initramfs.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index 2ae1c7c5d70d..ddd96efb5184 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -102,8 +102,8 @@ save_vmcore_dmesg_ssh() {
     local _opts="$3"
     local _location=$4
 
-    echo "kdump: saving vmcore-dmesg.txt"
-    $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt"
+    echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
+    $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock"
     _exitcode=$?
 
     if [ $_exitcode -eq 0 ]; then
diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
index 608dc6efc07e..44f9ae4dfb8f 100755
--- a/kdump-lib-initramfs.sh
+++ b/kdump-lib-initramfs.sh
@@ -137,8 +137,8 @@ save_vmcore_dmesg_fs() {
     local _dmesg_collector=$1
     local _path=$2
 
-    echo "kdump: saving vmcore-dmesg.txt"
-    $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
+    echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
+    $_dmesg_collector /proc/vmcore | dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock
     _exitcode=$?
     if [ $_exitcode -eq 0 ]; then
         mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
-- 
2.17.1
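For reference, the "up to 100MB" in the echo message follows from the dd arguments: bs=512 times count=204800. The sketch below checks that arithmetic and then runs a scaled-down cap (4 blocks instead of 204800) on a temporary file. The variable names and the temporary file are made up for illustration and are not part of the patch.

```shell
#!/bin/sh
# Byte limit implied by the dd arguments in the patch: bs * count.
bs=512
count=204800
limit_bytes=$((bs * count))
limit_mib=$((limit_bytes / 1024 / 1024))
echo "limit: ${limit_bytes} bytes (${limit_mib} MiB)"

# Scaled-down demo: 3000 bytes of input, capped at 4 blocks of 512 bytes.
demo_in=$(mktemp)
head -c 3000 /dev/zero > "$demo_in"
capped=$(dd if="$demo_in" bs="$bs" count=4 2>/dev/null | wc -c)
capped=$((capped))      # normalise any padding some wc implementations add
rm -f "$demo_in"
echo "capped demo output: ${capped} bytes"
```

The demo reads from a regular file, where every read returns a full block. In the patch, dd reads from a pipe, where a read can return fewer than bs bytes and still count as one block. That is why iflag=fullblock (a GNU dd flag) is added there: without it, the cap could kick in well before 100MB.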
Thanks Dave
kexec mailing list -- kexec@lists.fedoraproject.org