On Fri, May 27, 2022 at 06:05:27PM +0200, Zdenek Kabelac wrote:
On 27. 05. 22 at 17:39, Vivek Goyal wrote:
> On Fri, May 27, 2022 at 04:59:38PM +0200, Zdenek Kabelac wrote:
> > On 27. 05. 22 at 16:50, Vivek Goyal wrote:
> > > On Fri, May 27, 2022 at 04:42:25PM +0200, Zdenek Kabelac wrote:
> > > > On 27. 05. 22 at 14:20, Vivek Goyal wrote:
> > > > > On Fri, May 27, 2022 at 02:45:14PM +0800, Tao Liu wrote:
> > > > > > If lvm2 thinp is enabled in kdump, lvm2-monitor.service is needed to
> > > > > > monitor and autoextend the size of the thin pool. Otherwise the vmcore
> > > > > > dumped to a target without enough space will be incomplete and unusable
> > > > > > for further analysis.
> > > > > >
> > > > > > In this patch, lvm2-monitor.service will be started before
> > > > > > kdump-capture.service for the 2nd kernel, then be stopped in the kdump
> > > > > > post.d phase. So the thin pool monitoring and size-autoextend can be
> > > > > > ensured during kdump.
> > > > > >
> > > > > > Signed-off-by: Tao Liu <ltao(a)redhat.com>
> > > > > > ---
> > > > > > dracut-lvm2-monitor.service | 15 +++++++++++++++
> > > > > > dracut-module-setup.sh | 16 ++++++++++++++++
> > > > > > kexec-tools.spec | 2 ++
> > > > > > 3 files changed, 33 insertions(+)
> > > > > > create mode 100644 dracut-lvm2-monitor.service
> > > > > >
> > > > > > diff --git a/dracut-lvm2-monitor.service b/dracut-lvm2-monitor.service
> > > > > This seems to be a copy of /lib/systemd/system/lvm2-monitor.service.
> > > > > Wondering if we can directly include that file in the initramfs when
> > > > > generating the image. But I am fuzzy on the details of the dracut
> > > > > implementation. It has been too long since I played with it. So Bao
> > > > > and the kdump team will be best placed to comment on this.
> > > > >
> > > > This is quite interesting - monitoring should in fact never be started
> > > > within the 'ramdisk', so I'm actually wondering what this service file
> > > > is doing there.
> > > >
> > > > The design was to start 'monitoring' of devices just after the switch
> > > > to 'rootfs', since running 'dmeventd' out of the ramdisk does not make
> > > > any sense at all.
> > > Hi Zdenek,
> > >
> > > In case of kdump, we save the core dump from the initramfs context and
> > > reboot back into the primary kernel. That is why dm monitoring (and
> > > thin pool auto extension) needs to work from inside the initramfs
> > > context.
> > >
> > So IMHO this does not look like the best approach. AFAIK the
> > lvm.conf within the ramdisk is also a modified version.
> >
> > It looks like there should be a better alternative: 'after' activation,
> > check that there is 'enough' room in the thin-pool for use with the
> > thinLV. That should be 'computable', and in case the size is not good
> > enough, try to extend the thin-pool prior to use/mount of the thinLV (the
> > size of space in the thin-pool, %DATA & %METADATA, and the occupancy of
> > %DATA for the thinLV can be obtained with the 'lvs' tool).
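
For illustration only, the check described above could be sketched roughly as
follows. This is a minimal sketch under stated assumptions, not what kdump or
lvm2 actually ship: the helper names and the 80% threshold are hypothetical,
and the 'vg/pool' name is whatever the dump target uses. It relies on the
lvs report fields data_percent and metadata_percent, and on
'lvextend --use-policies', which extends according to the autoextend
settings in lvm.conf.

```python
import os
import subprocess
from typing import Tuple


def parse_pool_usage(report: str) -> Tuple[float, float]:
    """Parse 'data%:metadata%' as printed by the lvs call below."""
    data_pct, meta_pct = report.strip().split(":")
    return float(data_pct), float(meta_pct)


def needs_extend(data_pct: float, meta_pct: float,
                 threshold: float = 80.0) -> bool:
    """True when data or metadata usage has reached the threshold.

    The 80% default is a hypothetical value chosen for this sketch.
    """
    return data_pct >= threshold or meta_pct >= threshold


def query_pool_usage(pool: str) -> Tuple[float, float]:
    """Ask lvs for the thin-pool usage; pool is given as 'vg/poolname'."""
    env = dict(os.environ, LC_ALL="C")  # keep number formatting predictable
    out = subprocess.run(
        ["lvs", "--noheadings", "--separator", ":",
         "-o", "data_percent,metadata_percent", pool],
        check=True, capture_output=True, text=True, env=env,
    ).stdout
    return parse_pool_usage(out)


def extend_if_needed(pool: str, threshold: float = 80.0) -> bool:
    """Try to extend the pool before the thinLV is mounted.

    Returns True if an extension was attempted. Whether the pool really
    grows depends on the autoextend policy configured in lvm.conf.
    """
    data_pct, meta_pct = query_pool_usage(pool)
    if not needs_extend(data_pct, meta_pct, threshold):
        return False
    subprocess.run(["lvextend", "--use-policies", pool], check=True)
    return True
```

Only parse_pool_usage() and needs_extend() are pure; the other two need a
real thin-pool and root privileges to do anything useful.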
> One potential problem here is that we don't know the size of the
> vmcore in advance. It gets filtered and saved, and we don't know in
> advance how many kernel pages will be there.
>
> Is that still right, Bao?
>
> Technically speaking, one could first run makedumpfile just to determine
> what the size of the vmcore will be and then actually save the vmcore in a
> second round. But that will double the filtering time.
You could likely 'stream/buffer' this kdump data in the form of e.g. '4MiB ~
128MiB' chunks (or any other suitable size which will be 'quick enough') and
before each new write of such a chunk just check with lvs that there is enough
free space in the thin-pool. That should still be better than running
'dmeventd' in the background, and it also gives you the best control over the
deadlock in case you run completely out of space (i.e. leaving enough room in
the thin-pool and avoiding a full dump so the user could still 'boot').
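
A rough sketch of such a chunked writer, for illustration only. The function
name copy_in_chunks() and the has_room callback are hypothetical; in a real
setup the callback would be backed by an lvs query on the thin-pool (and
possibly an attempt to extend it), which is deliberately left out here so the
loop itself stays self-contained.

```python
import io
from typing import BinaryIO, Callable, Tuple

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, the low end of the suggested range


def copy_in_chunks(src: BinaryIO, dst: BinaryIO,
                   has_room: Callable[[int], bool],
                   chunk_size: int = CHUNK_SIZE) -> Tuple[int, bool]:
    """Copy src to dst one chunk at a time.

    Before each write, has_room(n) is asked whether n more bytes still fit.
    If it says no, copying stops before that chunk is written, so the target
    is never filled to the last block.

    Returns (bytes_written, completed).
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written, True   # source exhausted, dump is complete
        if not has_room(len(chunk)):
            return written, False  # stop early, dump is truncated
        dst.write(chunk)
        written += len(chunk)


if __name__ == "__main__":
    # Toy run: 10 bytes of input, 4-byte chunks, room for 8 bytes only.
    budget = {"free": 8}

    def toy_has_room(n: int) -> bool:
        if n > budget["free"]:
            return False
        budget["free"] -= n
        return True

    out = io.BytesIO()
    print(copy_in_chunks(io.BytesIO(b"x" * 10), out, toy_has_room, 4))
```

The point of the callback is that the policy (how much headroom to keep, and
whether to try growing the pool first) stays outside the copy loop.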
So if we fill up the thin pool completely, it might fail to activate after a
reboot? I do remember there were issues w.r.t. filling up the thin pool
completely, and it was not desired.

So the above does not involve growing the thin pool at all? It just says:
query the currently available space in the thin pool and, when it is about
to be full, stop writing to it? This is suboptimal if there is
free space in the underlying volume group.

Ok, this is going to be ugly given how kdump works right now. We have
this config option core_collector where the user can specify how the vmcore
should be saved (dd, cp, makedumpfile, ...).

None of these tools know about streaming, thin pool extension, etc.

I guess one could think of making makedumpfile aware of thin pools. But
given there can be so many dump targets, it will be really ugly from a
design point of view: embedding knowledge of a target in a generic
filtering tool.

Alternatively we could probably write a tool of our own and pipe
makedumpfile output to it. But then the user will have to specify it
in core_collector for thin pool targets only.

None of the solutions look clean or fit well into the current design.

Thanks
Vivek
Since you will be the only user of the thinLV in the initramfs, this should be
reasonably straightforward to achieve.

Regards
Zdenek