Hi Lianbo,
On 8/11/19 6:40 AM, lijiang wrote:
On 2019-08-09 13:45, Dave Young wrote:
> On 08/06/19 at 07:22pm, Lianbo Jiang wrote:
>> From: Jun Wang <junw99(a)yahoo.com>
>>
>> With some corrupted vmcore files, the vmcore-dmesg.txt file may grow
>> forever until the kdump disk becomes full, which can also cause
>> disk error messages such as the following:
>> ...
>> sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>> sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
>> blk_update_request: I/O error, dev sda, sector 134630552
>> sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>> sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
>> blk_update_request: I/O error, dev sda, sector 134630552
>> ...
>
> Thinking about the problem again: if log_buf_len is not corrupted, it will
> be fine. In the kernel code log_buf_len is a u32, so it is not possible to
> fill the disk forever unless vmcore-dmesg is buggy.
>
> Also, the implementation does not look very elegant. Can you try
> to add a limit in vmcore-dmesg.c instead?
>
> Actually, in the upstream kernel there is a macro LOG_BUF_LEN_MAX, which is
> defined as 2G, so limiting the file output to 2G should be fine.
>
I checked vmcore-dmesg.c and made a draft patch that limits the size of
vmcore-dmesg.txt, as follows:
diff --git a/util_lib/elf_info.c b/util_lib/elf_info.c
index 90a3b21662e7..a66241d8d76a 100644
--- a/util_lib/elf_info.c
+++ b/util_lib/elf_info.c
@@ -54,6 +54,9 @@ static uint64_t phys_offset = UINT64_MAX;
#error "Unknown machine endian"
#endif
+/* stole this macro from kernel printk.c */
+#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
+
static uint16_t file16_to_cpu(uint16_t val)
{
if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
@@ -534,6 +537,13 @@ static int32_t read_file_s32(int fd, uint64_t addr)
static void write_to_stdout(char *buf, unsigned int nr)
{
ssize_t ret;
+ static uint32_t n_bytes = 0;
+
+ n_bytes += nr;
+ if (n_bytes > LOG_BUF_LEN_MAX) {
+ fprintf(stderr, "The vmcore-dmesg.txt over 2G in size is not supported.\n");
+ return;
+ }
ret = write(STDOUT_FILENO, buf, nr);
if (ret != nr) {
What's your opinion?
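One caveat with the draft above, offered as a sketch rather than a finished fix: `n_bytes` is a `uint32_t`, so after the limit is hit the counter keeps growing, wraps at 4 GiB, and writes would resume; `1 << 31` also formally overflows a signed `int` before the cast is applied. A non-wrapping byte budget could look like the following. The names `DMESG_OUT_MAX` and `dmesg_budget()` are hypothetical and not part of kexec-tools.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap, mirroring the 2G LOG_BUF_LEN_MAX mentioned above. */
#define DMESG_OUT_MAX (UINT64_C(1) << 31)

/*
 * Return how many of the next `nr` bytes may still be written without
 * exceeding DMESG_OUT_MAX, and advance the running total by that amount.
 * The total is 64-bit and never exceeds the cap, so it cannot wrap.
 */
static size_t dmesg_budget(uint64_t *total, size_t nr)
{
	uint64_t left = DMESG_OUT_MAX - *total;
	size_t allowed = (nr > left) ? (size_t)left : nr;

	*total += allowed;
	return allowed;
}
```

In `write_to_stdout()` one could then write only the returned number of bytes and print the warning once, when the result is smaller than `nr`.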
I remember seeing a similar problem for arm64 vmcore-dmesg in the past
(see <https://docs.oracle.com/cd/E52668_01/E97565/html/ol7-bug-28064675.html>
for details).
However, what we normally observed was that the generated vmcore-dmesg was
almost never of abnormal size; rather, there was a bug (an infinite loop) in
vmcore-dmesg.c, which was addressed by this kexec-tools commit:
commit e277fa9ec702fea7bd3135393c67327c821d5a3a
Author: Omar Sandoval <osandov(a)fb.com>
Date: Wed May 23 13:59:31 2018 -0700
vmcore-dmesg: fix infinite loop if log buffer wraps around
I think if you have commits e08d26b3b7f1 and e277fa9ec702 applied, you
shouldn't face this issue.
Hi Jun Wang,
Have you been able to reproduce this issue at your end even with the
latest upstream kexec-tools (kexec-tools 2.0.20)?
If yes, I think we should fix any left-over infinite loop in
vmcore-dmesg.c rather than worry about the vmcore-dmesg size, as
the vmcore-dmesg is normally never too large (which I found by
adding some debug prints to vmcore-dmesg.c), even for corrupted vmcore
cases.
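To illustrate what "fixing the loop" rather than capping the output could mean, here is a minimal sketch of a record walk that is guaranteed to terminate on corrupted input. It does not reproduce commit e277fa9ec702 or the real printk record format: the layout (a native-endian 16-bit length at the start of each record, with 0 meaning "wrap to offset 0") and the name `walk_records()` are assumptions made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Walk length-prefixed records in a ring buffer from `first` to `next`.
 * Returns the number of records seen, or -1 if the buffer looks corrupted.
 * Hypothetical layout: see the assumptions stated in the text above.
 */
static int walk_records(const uint8_t *buf, size_t size,
			size_t first, size_t next)
{
	size_t idx = first;
	size_t consumed = 0;	/* record bytes walked so far */
	int wrapped = 0;	/* set once we have jumped back to offset 0 */
	int count = 0;

	if (first >= size || next >= size)
		return -1;

	while (idx != next) {
		uint16_t len = 0;

		/* Too little room for a header is treated like a wrap marker. */
		if (size - idx >= sizeof(len))
			memcpy(&len, buf + idx, sizeof(len));

		if (len == 0) {
			/* A second wrap, or a marker at offset 0, cannot be valid. */
			if (wrapped || idx == 0)
				return -1;
			wrapped = 1;
			idx = 0;
			continue;
		}

		if (len < sizeof(len) || len > size - idx)
			return -1;

		consumed += len;
		if (consumed > size)
			return -1;	/* walked more data than the buffer holds */

		idx += len;
		count++;
	}
	return count;
}
```

Every iteration either consumes at least two bytes of a budget bounded by the buffer size or performs the single permitted wrap, so the loop cannot spin forever even if `next` is never reached.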
Thanks,
Bhupesh
>> Let's limit the size of vmcore-dmesg.txt to avoid such problems.
>>
>> Signed-off-by: Jun Wang <junw99(a)yahoo.com>
>> Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
>> ---
>> Changes since v1:
>> [1] Add dump_fs path to limit the size of vmcore-dmesg.txt
>> [2] Add the option 'iflag=fullblock' for the dd command.
>>
>> dracut-kdump.sh | 4 ++--
>> kdump-lib-initramfs.sh | 4 ++--
>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/dracut-kdump.sh b/dracut-kdump.sh
>> index 2ae1c7c5d70d..ddd96efb5184 100755
>> --- a/dracut-kdump.sh
>> +++ b/dracut-kdump.sh
>> @@ -102,8 +102,8 @@ save_vmcore_dmesg_ssh() {
>> local _opts="$3"
>> local _location=$4
>>
>> - echo "kdump: saving vmcore-dmesg.txt"
>> - $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt"
>> + echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
>> + $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock"
>> _exitcode=$?
>>
>> if [ $_exitcode -eq 0 ]; then
>> diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
>> index 608dc6efc07e..44f9ae4dfb8f 100755
>> --- a/kdump-lib-initramfs.sh
>> +++ b/kdump-lib-initramfs.sh
>> @@ -137,8 +137,8 @@ save_vmcore_dmesg_fs() {
>> local _dmesg_collector=$1
>> local _path=$2
>>
>> - echo "kdump: saving vmcore-dmesg.txt"
>> - $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
>> + echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
>> + $_dmesg_collector /proc/vmcore | dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock
>> _exitcode=$?
>> if [ $_exitcode -eq 0 ]; then
>> mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
>> --
>> 2.17.1
>>
>
> Thanks
> Dave
>
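For reference, the "up to 100MB" in the quoted v2 patch follows from the dd arguments: `bs=512 count=204800` copies at most 512 * 204800 bytes, and `iflag=fullblock` makes dd keep reading from the pipe until each 512-byte block is full, so short reads are not counted as whole blocks. The arithmetic as a small sketch (`dd_limit_bytes()` is a hypothetical helper, not part of any tool):

```c
#include <stdint.h>

/*
 * Upper bound on the bytes copied by `dd bs=<bs> count=<count>`,
 * assuming every input block is full (as iflag=fullblock arranges).
 */
static uint64_t dd_limit_bytes(uint64_t bs, uint64_t count)
{
	return bs * count;
}
```

With the values from the patch this gives 104857600 bytes, which is exactly 100 * 1024 * 1024.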
_______________________________________________
kexec mailing list -- kexec(a)lists.fedoraproject.org