From: Jun Wang junw99@yahoo.com
With some corrupted vmcore files, the vmcore-dmesg.txt file may grow without bound until the kdump disk is full, which probably also causes disk error messages such as the following:
...
sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
blk_update_request: I/O error, dev sda, sector 134630552
sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00
blk_update_request: I/O error, dev sda, sector 134630552
...
Let's limit the size of vmcore-dmesg.txt to avoid such problems.
Signed-off-by: Jun Wang junw99@yahoo.com
Signed-off-by: Lianbo Jiang lijiang@redhat.com
---
Changes since v1:
[1] Add dump_fs path to limit the size of vmcore-dmesg.txt
[2] Add the option 'iflag=fullblock' for the dd command.

 dracut-kdump.sh        | 4 ++--
 kdump-lib-initramfs.sh | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index 2ae1c7c5d70d..ddd96efb5184 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -102,8 +102,8 @@ save_vmcore_dmesg_ssh() {
     local _opts="$3"
     local _location=$4
 
-    echo "kdump: saving vmcore-dmesg.txt"
-    $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt"
+    echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
+    $_dmesg_collector /proc/vmcore | ssh $_opts $_location "dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock"
     _exitcode=$?
 
     if [ $_exitcode -eq 0 ]; then
diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
index 608dc6efc07e..44f9ae4dfb8f 100755
--- a/kdump-lib-initramfs.sh
+++ b/kdump-lib-initramfs.sh
@@ -137,8 +137,8 @@ save_vmcore_dmesg_fs() {
     local _dmesg_collector=$1
     local _path=$2
 
-    echo "kdump: saving vmcore-dmesg.txt"
-    $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
+    echo "kdump: saving vmcore-dmesg.txt, up to 100MB"
+    $_dmesg_collector /proc/vmcore | dd of=$_path/vmcore-dmesg-incomplete.txt bs=512 count=204800 iflag=fullblock
     _exitcode=$?
     if [ $_exitcode -eq 0 ]; then
         mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
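A note on the numbers in the patch: bs=512 with count=204800 gives 512 * 204800 = 104857600 bytes, i.e. 100 MiB, which matches the "up to 100MB" message. With GNU dd, iflag=fullblock makes dd accumulate full input blocks, so short reads from the pipe are not counted as whole blocks and the cap is not reached early. The C sketch below illustrates the same idea of copying at most N bytes while tolerating short reads. It is only an illustration under stated assumptions: copy_capped() is a hypothetical helper and is not part of kexec-tools or of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy at most `limit` bytes from fd `in` to fd `out`.
 * Returns the number of bytes copied, or -1 on a read/write error.
 */
int64_t copy_capped(int in, int out, uint64_t limit)
{
	char buf[512];
	uint64_t total = 0;

	/* The result is returned as int64_t, so the limit must fit in it. */
	assert(limit <= (uint64_t)INT64_MAX);

	while (total < limit) {
		size_t want = sizeof(buf);
		ssize_t got, off;

		if (limit - total < want)
			want = (size_t)(limit - total);

		got = read(in, buf, want);
		if (got < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (got == 0)
			break;	/* end of input */

		/* Only bytes actually read count toward the limit. */
		for (off = 0; off < got; ) {
			ssize_t put = write(out, buf + off, (size_t)(got - off));

			if (put < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += put;
		}
		total += (uint64_t)got;
	}
	return (int64_t)total;
}
```

Counting bytes rather than read() calls is what makes the limit exact, which is the role iflag=fullblock plays for dd in the patch.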
On 08/06/19 at 07:22pm, Lianbo Jiang wrote:
From: Jun Wang junw99@yahoo.com
With some corrupted vmcore files, the vmcore-dmesg.txt file may grow forever till the kdump disk becomes full, and also probably causes the disk error messages as follow: ... sd 0:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 0:0:0:0: [sda] tag#6 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00 blk_update_request: I/O error, dev sda, sector 134630552 sd 0:0:0:0: [sda] tag#7 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK sd 0:0:0:0: [sda] tag#7 CDB: Read(10) 28 00 08 06 4c 98 00 00 08 00 blk_update_request: I/O error, dev sda, sector 134630552 ...
Thinking about the problem again: if log_buf_len is not corrupted, this will be fine. In the kernel code log_buf_len is a u32, so it should not be possible to fill the disk forever unless vmcore-dmesg is buggy.
Also, the implementation does not look very elegant. Can you try adding a limit in vmcore-dmesg.c instead?
Actually, the upstream kernel has a macro LOG_BUF_LEN_MAX, which is defined as 2G, so limiting the file output to 2G should be fine.
Thanks
Dave
The root cause is likely an infinite loop or corrupted data somewhere, rather than too many dmesg messages in the vmcore. Limiting the size of vmcore-dmesg.txt is mainly meant to defend kdump servers against the existing problem, as well as other future problems. It also does not prevent us from further investigating the root cause of the problem, which is hard to reproduce.
Thanks,
Jun
On 08/09/19 at 06:33am, Jun Wang wrote:
The root cause is likely an infinite loop or corrupted data somewhere, but not too many vmcore dmessages. The fix of limiting vmcore-dmesg.txt size is mainly to defend kdump servers against the existing problem, as well as other future problems. It doesn’t prevent us from further investigating the root cause of the problem that is hard to reproduce either.
It is easy to add a check in vmcore-dmesg.c that limits how much is written to the output fd and asserts on it, so that we do not need to add extra things in the distribution.
Also, 100M is not necessarily a good value for the size; it would be better to keep the same maximum that the kernel uses. Even if the data is corrupted, a 100M limit is still not good.
Thanks
Dave
Sure, the limit can be added in vmcore-dmesg.c instead, as long as it is sufficient to defend kdump servers. An assert alone may not be sufficient, because the error needs to be caught at run time in production builds, not only during the debugging phase.
Thanks, Jun
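To make the point about asserts concrete: the standard assert() macro expands to nothing when NDEBUG is defined, which is common for release builds, so a size guard written only as an assert would not be present in production. The sketch below contrasts the two styles. The names DMESG_OUT_MAX, check_debug_only() and fits_under_cap() are hypothetical and chosen for illustration; nothing here is taken from vmcore-dmesg.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap for illustration: 2 GiB, the value discussed in this thread. */
#define DMESG_OUT_MAX ((uint64_t)1 << 31)

/* Debug-only guard: the assert() disappears when built with -DNDEBUG. */
void check_debug_only(uint64_t written, uint64_t nr)
{
	assert(written <= DMESG_OUT_MAX && nr <= DMESG_OUT_MAX - written);
	/* Keep the parameters "used" even when assert() compiles to nothing. */
	(void)written;
	(void)nr;
}

/*
 * Runtime guard: always compiled in.
 * Returns 1 if nr more bytes still fit under the cap, 0 otherwise.
 */
int fits_under_cap(uint64_t written, uint64_t nr)
{
	return written <= DMESG_OUT_MAX && nr <= DMESG_OUT_MAX - written;
}
```

A caller in a production build would test the return value of fits_under_cap() and stop writing, instead of relying on the process aborting.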
On 2019-08-09 at 13:45, Dave Young wrote:
Rethink about the problem, if the log_buf_len is not corrupted, it will be fine, in kernel code log_buf_len is a u32 int. so it will be not possible to fill disk forever unless vmcore-dmesg is buggy.
Also about the implementation, it looks not very elegant. Can you try to add some limitation in vmcore-dmesg.c?
Actually in upstream kernel there is a macro LOG_BUF_LEN_MAX which is defined as 2G, so limit the file output to within 2G should be fine.
I checked vmcore-dmesg.c and made a draft patch that limits the size of vmcore-dmesg.txt, as follows:
diff --git a/util_lib/elf_info.c b/util_lib/elf_info.c
index 90a3b21662e7..a66241d8d76a 100644
--- a/util_lib/elf_info.c
+++ b/util_lib/elf_info.c
@@ -54,6 +54,9 @@ static uint64_t phys_offset = UINT64_MAX;
 #error "Unknown machine endian"
 #endif
 
+/* stole this macro from kernel printk.c */
+#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
+
 static uint16_t file16_to_cpu(uint16_t val)
 {
 	if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
@@ -534,6 +537,13 @@ static int32_t read_file_s32(int fd, uint64_t addr)
 static void write_to_stdout(char *buf, unsigned int nr)
 {
 	ssize_t ret;
+	static uint32_t n_bytes = 0;
+
+	n_bytes += nr;
+	if (n_bytes > LOG_BUF_LEN_MAX) {
+		fprintf(stderr, "The vmcore-dmesg.txt over 2G in size is not supported.\n");
+		return;
+	}
 
 	ret = write(STDOUT_FILENO, buf, nr);
 	if (ret != nr) {
What's your opinion?
Thanks. Lianbo
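Two details of the draft may be worth a second look. The counter n_bytes is a uint32_t that is incremented on every call, including calls made after the limit was hit, so it appears it could wrap around after 4 GiB and let writes resume. Also, 1 << 31 shifts into the sign bit of a 32-bit int, which is formally undefined in C even though it usually yields the expected value. The sketch below is one possible variant under those assumptions, not the patch that was proposed: DMESG_OUT_MAX, account_bytes() and write_to_stdout_capped() are hypothetical names, and returning an error code to the caller is an assumed design choice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed cap: same 2 GiB value as the draft, written without a signed shift. */
#define DMESG_OUT_MAX ((uint64_t)1 << 31)

/*
 * Hypothetical helper: add nr to *total if the cap allows it.
 * Returns 0 on success, -1 if the cap would be exceeded (*total unchanged).
 */
static int account_bytes(uint64_t *total, unsigned int nr)
{
	assert(*total <= DMESG_OUT_MAX);	/* invariant kept by this function */
	if (nr > DMESG_OUT_MAX - *total)
		return -1;
	*total += nr;
	return 0;
}

/* Variant of write_to_stdout() that reports failure to its caller. */
int write_to_stdout_capped(const char *buf, unsigned int nr)
{
	static uint64_t n_bytes;	/* 64-bit, so it cannot wrap near the cap */
	ssize_t ret;

	if (account_bytes(&n_bytes, nr) < 0) {
		fprintf(stderr, "vmcore-dmesg output would exceed 2G, stopping.\n");
		return -1;
	}
	ret = write(STDOUT_FILENO, buf, nr);
	return (ret >= 0 && (size_t)ret == nr) ? 0 : -1;
}
```

Because the counter is only advanced when the write is allowed, it never exceeds the cap, and a caller that sees -1 can stop walking the log buffer instead of continuing to loop.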
Hi Lianbo,
On 8/11/19 6:40 AM, lijiang wrote:
I checked vmcore-dmesg.c and made a draft patch that limits the size of vmcore-dmesg.txt, as follows:
diff --git a/util_lib/elf_info.c b/util_lib/elf_info.c
index 90a3b21662e7..a66241d8d76a 100644
--- a/util_lib/elf_info.c
+++ b/util_lib/elf_info.c
@@ -54,6 +54,9 @@ static uint64_t phys_offset = UINT64_MAX;
 #error "Unknown machine endian"
 #endif
 
+/* stole this macro from kernel printk.c */
+#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
+
 static uint16_t file16_to_cpu(uint16_t val)
 {
 	if (ehdr.e_ident[EI_DATA] != ELFDATANATIVE)
@@ -534,6 +537,13 @@ static int32_t read_file_s32(int fd, uint64_t addr)
 static void write_to_stdout(char *buf, unsigned int nr)
 {
 	ssize_t ret;
+	static uint32_t n_bytes = 0;
+
+	n_bytes += nr;
+	if (n_bytes > LOG_BUF_LEN_MAX) {
+		fprintf(stderr, "The vmcore-dmesg.txt over 2G in size is not supported.\n");
+		return;
+	}
 
 	ret = write(STDOUT_FILENO, buf, nr);
 	if (ret != nr) {
What's your opinion?
I remember seeing a similar problem for arm64 vmcore-dmesg in the past (see https://docs.oracle.com/cd/E52668_01/E97565/html/ol7-bug-28064675.html for details).
However, normally what we observed was that the vmcore-dmesg created was almost never of abnormal size, rather there was a bug (infinite loop) in vmcore-dmesg.c which was addressed via kexec-tools commit:
commit e277fa9ec702fea7bd3135393c67327c821d5a3a
Author: Omar Sandoval osandov@fb.com
Date:   Wed May 23 13:59:31 2018 -0700

    vmcore-dmesg: fix infinite loop if log buffer wraps around
I think if you have commit e08d26b3b7f1 and e277fa9ec702 applied you shouldn't face this issue.
Hi Jun Wang,
Have you been able to reproduce this issue at your end even with the latest upstream kexec-tools (kexec-tools 2.0.20)?
If yes, I think we should probably fix any left-over infinite loop in vmcore-dmesg.c, rather than worrying about the vmcore-dmesg size, as normally the vmcore-dmesg is never too large in size (which I found by adding some debug prints to vmcore-dmesg.c), even for corrupted vmcore cases.
Thanks, Bhupesh
kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org
On 2019-08-12 at 17:18, Bhupesh Sharma wrote:
Hi Lianbo,
I remember seeing a similar problem for arm64 vmcore-dmesg in the past (see https://docs.oracle.com/cd/E52668_01/E97565/html/ol7-bug-28064675.html for details).
However, normally what we observed was that the vmcore-dmesg created was almost never of abnormal size, rather there was a bug (infinite loop) in vmcore-dmesg.c which was addressed via kexec-tools commit:
commit e277fa9ec702fea7bd3135393c67327c821d5a3a
Author: Omar Sandoval osandov@fb.com
Date:   Wed May 23 13:59:31 2018 -0700

    vmcore-dmesg: fix infinite loop if log buffer wraps around
I think if you have commit e08d26b3b7f1 and e277fa9ec702 applied you shouldn't face this issue.
Thank you, Bhupesh. I have never reproduced this issue, but I guess the above patch can ensure that the size of vmcore-dmesg.txt is limited to 2G.
Hi Jun Wang,
Have you been able to reproduce this issue at your end even with the latest upstream kexec-tools (kexec-tools 2.0.20)?
If yes, I think we should probably fix any left-over infinite loop in vmcore-dmesg.c, rather than worrying about the vmcore-dmesg size, as normally the vmcore-dmesg is never too large in size (which I found by adding some debug prints to vmcore-dmesg.c), even for corrupted vmcore cases.
Thanks, Bhupesh