On Thu, Jul 11, 2013 at 12:42:16PM -0600, Stephen John Smoogen wrote:
[..]
Issues I ran into were:
- kdump needs to write to unencrypted disk space. I tried a USB disk
and various other places, but the best I could do was reinstall the laptop and make a /var/crash partition.
Is your root encrypted? USB should have worked. Otherwise try dumping to NFS partition. Or ssh the dump out to a different machine. All of these should work.
USB was the one I tried but couldn't get to work correctly. NFS and SSH were not going to work because the problem is with RHEL-5 talking over the bridge and my laptop has wireless.
[ I am ccing devel list again. So that if people have ideas about how to get serial console on laptop, that will help ]
What do you mean by "NFS and SSH were not going to work because the problem is with RHEL-5 talking over the bridge"?
I have never tested kdump with wireless, as I always tried to make these work on servers and always assumed Ethernet connectivity is there.
Anyway, the USB case is interesting. I have to admit I have never tried dumping to a USB disk either. But in theory it should work.
Right now it does not work with encrypted disks. Given the fact that dumping to root disk is easiest on a laptop, I think it is reasonable to try to make it work with encrypted disks.
With encrypted disks we don't know where to get the password. I think we probably can just wait for user to enter the password. But wait, we have plenty of issues with display reset in kdump environment. Kdump might be working in the background while display might be frozen or garbage displayed. That's why I always use serial console for any kind of debugging.
So until we figure out a way to solve the display reset issues, we can't expect a user to enter a password at a prompt, and supporting encrypted disks is hard.
Anyway, as you said, in your case mounting an unencrypted disk or partition and dumping to that partition is easiest.
- kdump didn't seem to dump for anything other than the forced dump in the
instruction manual.
You mean dump did not trigger after panic or it did not complete after panic?
If the kdump kernel is loaded and a panic happens (or an oops happens and panic_on_oops is set), we should transition into the second kernel and capture the dump.
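A minimal sketch of how one might check those two conditions before trying to reproduce the crash. The two default paths are the usual Linux interfaces for "crash kernel loaded" and panic_on_oops; the function name and the overridable variables are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a panic or an oops would hand over to the kdump kernel.
# KEXEC_LOADED and PANIC_ON_OOPS are illustrative variables so the paths can be
# overridden; the defaults are the standard sysfs/procfs locations.

KEXEC_LOADED="${KEXEC_LOADED:-/sys/kernel/kexec_crash_loaded}"
PANIC_ON_OOPS="${PANIC_ON_OOPS:-/proc/sys/kernel/panic_on_oops}"

kdump_readiness() {
    # 1 means a crash kernel is currently loaded
    loaded=$(cat "$KEXEC_LOADED" 2>/dev/null || echo 0)
    # 1 means an oops is escalated to a panic
    oops=$(cat "$PANIC_ON_OOPS" 2>/dev/null || echo 0)

    if [ "$loaded" != "1" ]; then
        echo "no crash kernel loaded: nothing will be captured"
    elif [ "$oops" = "1" ]; then
        echo "crash kernel loaded: panic and oops both trigger a dump"
    else
        echo "crash kernel loaded: only a panic triggers a dump"
    fi
}

kdump_readiness
```

If this reports that a crash kernel is loaded and the machine still freezes with nothing written, the failure is somewhere in the transition or in the second kernel, which is where a console becomes necessary.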
This did not happen. The system froze completely.
We need to have serial console to debug things here. Without console we have no idea where things might have gone wrong.
A power cycle was required and nothing was in /var/crash. This could be a problem with my setup, but it was pretty much stock, following what the Fedora web pages said to do. The system-config-kdump application didn't work when I tried it, so I went to fedora-kernel and got the "we don't expect it to work, please try a rawhide kernel and see if it oopses" response, which it did.
If it did not work, there must have been some kernel issue. Please open
bugs for these issues.
OK, what do you want to see in a bug? At the moment I would just be opening the unhelpful bug of:
laptop freezes. no kdump is found.
We will need a serial console to debug kdump issues. I am not expert enough to figure out how to reset the graphical console without going through the BIOS. Is there any reliable way to do that?
Once we have a console going, we can try different things, like enabling debugging messages in purgatory, enabling early printk, etc., to figure out where things failed.
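As a rough illustration of the early printk part, on a machine that does have a UART the kernel command line might carry something like the following. The port ttyS0 and the 115200 baud rate are assumptions; adjust them to the actual hardware.

```text
console=ttyS0,115200 earlyprintk=serial,ttyS0,115200
```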
Most new laptops don't have a serial port, though. Are there any USB-based gadgets which can help here? I don't know.
Which could be a setup issue on my part or a bunch of other stuff. Since the bug is still possible to trigger with Fedora 19+RHEL-5 guest, I can go through the steps again to see what needs to be done. I just need to know what they are.
So you are running RHEL-5 as Guest with F19 host and trying to take dump of host?
I think we need to solve the issue of how to get a serial console working on a laptop to debug this issue.
Anybody, any ideas?
Thanks Vivek
On Thu, Jul 11, 2013 at 03:10:07PM -0400, Vivek Goyal wrote:
We will need a serial console to debug kdump issues. I am not expert enough to figure out how to reset the graphical console without going through the BIOS. Is there any reliable way to do that?
Make sure the kdump kernel has graphics drivers - they should be able to reconfigure the device. Or just pass the framebuffer offset, size, stride and pixel format to the kdump kernel and have it treat it as an unaccelerated linear framebuffer.
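To make the second option concrete: with only the framebuffer offset, stride and pixel format, software can address any pixel directly, which is all an unaccelerated linear framebuffer needs (the size just bounds the valid range). A small sketch of that arithmetic follows; the function name and the example values are purely illustrative and do not come from any kernel interface.

```shell
#!/bin/sh
# Sketch: byte address of pixel (x, y) in an unaccelerated linear framebuffer.
# All names and numbers here are illustrative assumptions.

fb_pixel_address() {
    base=$1      # address (offset) where the framebuffer starts
    stride=$2    # bytes per scanline; may exceed width * bytes per pixel
    bpp=$3       # bits per pixel, taken from the pixel format (e.g. 32)
    x=$4
    y=$5
    echo $(( base + y * stride + x * (bpp / 8) ))
}

# Example: 32 bpp mode with a 4096-byte stride, framebuffer at offset 0
fb_pixel_address 0 4096 32 10 2
```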
On Thu, Jul 11, 2013 at 08:22:17PM +0100, Matthew Garrett wrote:
On Thu, Jul 11, 2013 at 03:10:07PM -0400, Vivek Goyal wrote:
We will need a serial console to debug kdump issues. I am not expert enough to figure out how to reset the graphical console without going through the BIOS. Is there any reliable way to do that?
Make sure the kdump kernel has graphics drivers - they should be able to reconfigure the device.
Ok, including graphics drivers in initramfs should be doable. But this will still not display the message on consoles if early failures happen during transition to second kernel and drivers are not loaded yet.
Or just pass the framebuffer offset, size, stride and pixel format to the kdump kernel and have it treat it as an unaccelerated linear framebuffer.
Ok, I will look into this. Thanks for the ideas though.
Thanks Vivek
On 11 July 2013 13:10, Vivek Goyal vgoyal@redhat.com wrote:
On Thu, Jul 11, 2013 at 12:42:16PM -0600, Stephen John Smoogen wrote:
[..]
Issues I ran into were:
- kdump needs to write to unencrypted disk space. I tried a USB disk
and various other places, but the best I could do was reinstall the laptop and make a /var/crash partition.
Is your root encrypted? USB should have worked. Otherwise try dumping to NFS partition. Or ssh the dump out to a different machine. All of these should work.
USB was the one I tried but couldn't get to work correctly. NFS and SSH were not going to work because the problem is with RHEL-5 talking over the bridge and my laptop has wireless.
[ I am ccing devel list again. So that if people have ideas about how to get serial console on laptop, that will help ]
What do you mean by "NFS and SSH were not going to work because the problem is with RHEL-5 talking over the bridge"?
Well, the system hard-crashes the laptop when I am on wireless. I expect that this is an untested scenario, and since most of the time I am sitting on some cafe's wireless, trying to push 8 GB of dump somewhere would not be the most useful approach.
I have never tested kdump with wireless, as I always tried to make these work on servers and always assumed Ethernet connectivity is there.
Anyway, the USB case is interesting. I have to admit I have never tried dumping to a USB disk either. But in theory it should work.
I tried USB direct dump and USB ext3. kdump said it could see the USB disk in the logs and then nothing would get written.
Right now it does not work with encrypted disks. Given the fact that dumping to root disk is easiest on a laptop, I think it is reasonable to try to make it work with encrypted disks.
I really can't see a way to do encrypted disks in a secure way. Basically everything I thought of required having the password stored somewhere, which is wrong on many levels. So I don't mind having to have an unencrypted space.
This did not happen. The system froze completely.
We need to have serial console to debug things here. Without console we have no idea where things might have gone wrong.
Sadly the laptop is USB only so I am not sure if this will be possible. I will defer to someone with a lot more hardware knowledge but I was under the assumption that unless I had a UART any console hooked up would really be a "software" versus "hardware" console and so data sent to it went through a lot of corruptible stacks :/. [Ah for a nice old x86 with UART.]
On Thu, Jul 11, 2013 at 04:46:42PM -0600, Stephen John Smoogen wrote:
Sadly the laptop is USB only so I am not sure if this will be possible. I will defer to someone with a lot more hardware knowledge but I was under the assumption that unless I had a UART any console hooked up would really be a "software" versus "hardware" console and so data sent to it went through a lot of corruptible stacks :/. [Ah for a nice old x86 with UART.]
There are USB debug cables, but last I heard they're currently impossible to get hold of.
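For reference, the kernel side of such a cable is the EHCI debug port, which early printk can target. A minimal sketch of the command-line option, assuming the kernel was built with CONFIG_EARLY_PRINTK_DBGP and the laptop actually exposes a debug-capable USB port:

```text
earlyprintk=dbgp
```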
On Thu, Jul 11, 2013 at 04:46:42PM -0600, Stephen John Smoogen wrote:
[..]
Anyway, the USB case is interesting. I have to admit I have never tried dumping to a USB disk either. But in theory it should work.
I tried USB direct dump and USB ext3. kdump said it could see the USB disk in the logs and then nothing would get written.
Ok, I just took a laptop (Lenovo T61, yes it is old), installed F19, and tried kdump (echo c > /proc/sysrq-trigger) in the following 3 configurations.
- Save to local disk (root, unencrypted).
- Save to a USB flash drive (4GB, ext4 file system).
- Save dump over ssh.
All 3 worked for me.
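For anyone trying to reproduce this, the three configurations would correspond roughly to the following /etc/kdump.conf settings, used one at a time. This is a sketch, not a copy of the file used in the test: the device name, user and host are hypothetical, and the directive names should be checked against kdump.conf(5) for the installed kexec-tools version.

```text
# 1. Local root disk: no target line, only a path on the root filesystem
path /var/crash

# 2. USB flash drive with ext4 (hypothetical device name)
ext4 /dev/sdb1
path /var/crash

# 3. Over ssh (hypothetical user and host)
ssh kdump@dumphost.example.com
sshkey /root/.ssh/kdump_id_rsa
core_collector makedumpfile -F -l -d 31
```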
Display does get reset but that happens very late and we don't see any of the kernel messages. I see just dracut and kdump messages.
If USB did not work for you, you can try passing rd.debug on the command line (edit /etc/sysconfig/kdump) and also set "default shell" in /etc/kdump.conf. Then, after failing to save the dump, you should be put in a shell, where you can look around for the USB device. The debug output should also tell us where we are.
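A sketch of what those two edits might look like. The options listed before rd.debug only stand in for whatever is already in the variable on your system; keep the existing value and append rd.debug to it.

```text
# /etc/sysconfig/kdump: append rd.debug to the existing options
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices rd.debug"

# /etc/kdump.conf: drop to a shell if saving the dump fails
default shell
```

The kdump service most likely needs a restart afterwards so that the kdump initramfs picks up the changes.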
Thanks Vivek