Hi Coiby,
Hi Tao,
On Sat, 2 Apr 2022 16:43:37 +0800
Coiby Xu <coxu(a)redhat.com> wrote:
> Hi Tao,
>
> On Sat, Apr 02, 2022 at 02:14:38PM +0800, Tao Liu wrote:
> >Hi Philipp,
> >
> >On Wed, Mar 30, 2022 at 11:34 PM Philipp Rudo <prudo(a)redhat.com> wrote:
> [...]
> >
> >> > implemented the reservation in the kernel side, as part of the kernel
> >> > crashkernel=auto implementation.
>
>
> For swiotlb, I think Dave and Emma prefer to do it in kernel space.
> Maybe you can try a kernel space solution first like [1]?
>
> [1]
https://lore.kernel.org/lkml/20190910151341.14986-3-kasong@redhat.com/
>
> >> >
> >> > For rhel9 however, crashkernel=auto has been mirrored to userspace
> >> > kexec-tools, so this patch will enable the swiotlb reservation for AMD
> >> > sme/sev in kexec-tools as well.
> >>
> >> Not really about this patch but I find it weird that we adjust the
> >> crashkernel=auto value for swiotlb but not for other cases wher we know
> >> that we need a larger crashkernel, e.g. when the dump target is on a
> >> LUKS encrypted device. I understand that in the past, when
> >> crashkernel=auto was handled by the kernel, we simply didn't know about
> >> what the dump target will be when we allocated the crashkernel during
> >> boot. But today, when crashkernel=auto is handled in userspace we can.
> >>
> >
> >I didn't have much context on LUKS, by looking into the code, I see only
> >"kdumpctl estimate" has the LUKS related crashkernel value increment.
It needs
> >the user to adjust crashkernel value manually by estimate-and-set. Maybe we
> >can merge the crashkernel adjustment of swiotlb and luks together,
> >so luks don't need to "kdumpctl estimate" manually anymore?
>
> For LUKS, a kernel-space solution by reusing LUKS master key is
> preferred because a) LUKS requires too much memory b) we can't expect
> the user to wait at the screen to input the passphrase to open the disk
> when kernel crashes. Let's see how far my work [2] can go.
you are right that it is better to solve the problem with LUKS in the
kernel. I only used it as an example where today we know that we need
more memory but don't automatically increase the crashkernel. So why do
it for swiotlb but for others? What's the reason for this inconsistency?
And a little bit more general. What is the relation between 'kdumpctl
reset-crashkernel' and 'kdumctl estimate'? The way I see it the
workflow we suggest to users currently is
1. $ kdumpctl reser-crashkernel
2. -> test if it works, if not
3. $ kdumpctl estimate
4. -> set crashkernel manually to suggested value
5. -> test if it works, if not
6. -> set crashkernel manually to larger value
7. -> test if it works, if not go back to 6
I don't understand why we need the separate 'kdumpctl estimate' step in
between. In my opinion the following workflow would be much simpler
from a user perspective
1. $ kdumpctl reset-crashkernel
(calls kdumpctl estimate internally using current default values
as minimum values)
2. -> test if it woks, if not
3. -> set crashkernel manually larger value
4. -> test if it works, if not go back to 3
Thanks for sharing your thoughts! This approach also looks better to me!
Previously I also tried addressing the swiotlb case in "kdumpctl
reset-crashkernel" but QE was worried customer requests to address other
cases may flood in and we would quickly reach our capacity to process
them. On the other end, I think I need more time to evaluate how "kdumpcl
estimate" (including the --reboot feature, Karui's patches still under
review) works. "kdumpctl estimate --reboot" seems more promising to me
because it offers a general way to estimate the memory requirement of
kdump kernel and we don't need to address memory requirement case by
case. My plan is to unify "reset-crashkernel" and estimate after
evaluating how well "estimate" works. For example, one issue
with "estimate --reboot" is it fails to find out the peak memory
consumption.