On 2019-08-12 17:18, Bhupesh Sharma wrote:
Hi Lianbo,
On 8/11/19 6:40 AM, lijiang wrote:
> On 2019-08-09 13:45, Dave Young wrote:
>> On 08/06/19 at 07:22pm, Lianbo Jiang wrote:
>>> From: Jun Wang <junw99(a)yahoo.com>
>>>
>>> With some corrupted vmcore files, the vmcore-dmesg.txt file may grow
>>> without bound until the kdump disk is full, which probably also causes
>>> disk error messages such as the following:
>>> ...
>>> sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>>> sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
>>> blk_update_request: I/O error, dev sda, sector 134630552
>>> sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
>>> sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
>>> blk_update_request: I/O error, dev sda, sector 134630552
>>> ...
>>
>> Thinking about the problem again: if log_buf_len is not corrupted, it will
>> be fine. In the kernel code log_buf_len is a u32, so it is not possible to
>> fill the disk forever unless vmcore-dmesg is buggy.
>>
>> Also, the implementation does not look very elegant. Can you try
>> adding a limit in vmcore-dmesg.c instead?
>>
>> Actually, in the upstream kernel there is a macro LOG_BUF_LEN_MAX which is
>> defined as 2G, so limiting the file output to 2G should be fine.
>>
>
> I checked vmcore-dmesg.c and made a draft patch that limits the size of
> vmcore-dmesg.txt, as follows:
>
> diff --git a/util_lib/elf_info.c b/util_lib/elf_info.c
> index 90a3b21662e7..a66241d8d76a 100644
> --- a/util_lib/elf_info.c
> +++ b/util_lib/elf_info.c
> @@ -54,6 +54,9 @@ static uint64_t phys_offset = UINT64_MAX;
> #error "Unknown machine endian"
> #endif
> +/* stole this macro from kernel printk.c */
> +#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
> +
> static uint16_t file16_to_cpu(uint16_t val)
> {
> if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
> @@ -534,6 +537,13 @@ static int32_t read_file_s32(int fd, uint64_t addr)
> static void write_to_stdout(char *buf, unsigned int nr)
> {
> ssize_t ret;
> + static uint32_t n_bytes = 0;
> +
> + n_bytes += nr;
> + if (n_bytes > LOG_BUF_LEN_MAX) {
> + fprintf(stderr, "The vmcore-dmesg.txt over 2G in size is not supported.\n");
> + return;
> + }
> ret = write(STDOUT_FILENO, buf, nr);
> if (ret != nr) {
>
> What's your opinion?
I remember seeing a similar problem for arm64 vmcore-dmesg in the past (see
<https://docs.oracle.com/cd/E52668_01/E97565/html/ol7-bug-28064675.html> for
details).
However, what we normally observed was that the generated vmcore-dmesg was almost never of
abnormal size; rather, there was a bug (an infinite loop) in vmcore-dmesg.c, which was
addressed by this kexec-tools commit:
commit e277fa9ec702fea7bd3135393c67327c821d5a3a
Author: Omar Sandoval <osandov(a)fb.com>
Date: Wed May 23 13:59:31 2018 -0700
vmcore-dmesg: fix infinite loop if log buffer wraps around
I think if you have commits e08d26b3b7f1 and e277fa9ec702 applied, you shouldn't hit
this issue.
Thank you, Bhupesh.
I have never been able to reproduce this issue, but I guess the above patch can ensure
that the size of vmcore-dmesg.txt is limited to 2G.
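
One caveat with the draft: n_bytes is a uint32_t that keeps growing after the
limit is hit, so it could wrap around once more than 4G has been requested and
writing would resume. A minimal sketch of a cap that avoids this, assuming a
64-bit running total, is below. The names write_capped and DMESG_OUT_MAX are
hypothetical and not part of kexec-tools; this is only an illustration, not a
proposed patch.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical cap mirroring the 2G value discussed in this thread.
 * The shift is done on a 64-bit value to avoid overflowing a signed int. */
#define DMESG_OUT_MAX ((uint64_t)1 << 31)

/*
 * Write at most (limit - *total) bytes of buf to fd and update *total.
 * Returns the number of bytes written (0 once the cap is reached),
 * or -1 on a write error.
 */
static ssize_t write_capped(int fd, const char *buf, size_t nr,
                            uint64_t *total, uint64_t limit)
{
    size_t done = 0;

    if (*total >= limit)
        return 0;                       /* cap already reached: drop data */
    if (nr > limit - *total)
        nr = (size_t)(limit - *total);  /* truncate the final chunk */

    while (done < nr) {
        ssize_t ret = write(fd, buf + done, nr - done);

        if (ret <= 0) {
            if (ret < 0 && errno == EINTR)
                continue;               /* interrupted: retry */
            return -1;
        }
        done += (size_t)ret;
    }

    *total += done;                     /* 64-bit total, never exceeds limit */
    return (ssize_t)done;
}
```

A caller would keep one uint64_t total per output stream and pass
DMESG_OUT_MAX as the limit. Because the total stops at the limit, it cannot
wrap no matter how much data a corrupted vmcore produces.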
> Hi Jun Wang,
>
> Have you been able to reproduce this issue at your end even with the latest upstream
> kexec-tools (kexec-tools 2.0.20)?
>
> If yes, I think we should probably fix any left-over infinite loop in vmcore-dmesg.c
> rather than worrying about the vmcore-dmesg size, since the vmcore-dmesg is normally never
> too large (which I found by adding some debug prints to vmcore-dmesg.c), even for
> corrupted vmcore cases.
>
> Thanks,
> Bhupesh
>
>
>>>> Let's limit the size of vmcore-dmesg.txt to avoid such problems.
>>>>
>>>> Signed-off-by: Jun Wang <junw99(a)yahoo.com>
>>>> Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
>>>> ---
>>>> Changes since v1:
>>>> [1] Add dump_fs path to limit the size of vmcore-dmesg.txt
>>>> [2] Add the option 'iflag=fullblock' for the dd command.
>>>>
>>>> dracut-kdump.sh | 4 ++--
>>>> kdump-lib-initramfs.sh | 4 ++--
>>>> 2 files changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/dracut-kdump.sh b/dracut-kdump.sh
>>>> index 2ae1c7c5d70d..ddd96efb5184 100755
>>>> --- a/dracut-kdump.sh
>>>> +++ b/dracut-kdump.sh
>>>> @@ -102,8 +102,8 @@ save_vmcore_dmesg_ssh() {
>>>> local _opts="$3"
>>>> local _location=$4
>>>> - echo "kdump: saving vmcore-dmesg.txt"
>>>> - $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt"
>>>> + echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
>>>> + $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock"
>>>> _exitcode=$?
>>>> if [ $_exitcode -eq 0 ]; then
>>>> diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
>>>> index 608dc6efc07e..44f9ae4dfb8f 100755
>>>> --- a/kdump-lib-initramfs.sh
>>>> +++ b/kdump-lib-initramfs.sh
>>>> @@ -137,8 +137,8 @@ save_vmcore_dmesg_fs() {
>>>> local _dmesg_collector=$1
>>>> local _path=$2
>>>> - echo "kdump: saving vmcore-dmesg.txt"
>>>> - $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
>>>> + echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
>>>> + $_dmesg_collector /proc/vmcore | dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock
>>>> _exitcode=$?
>>>> if [ $_exitcode -eq 0 ]; then
>>>> mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
>>>> --
>>>> 2.17.1
>>>>
>>>
>>> Thanks
>>> Dave
>>>