On Wed, May 25, 2022 at 09:05:11AM -0500, Eric Sandeen wrote:
> On 5/25/22 6:42 AM, Vivek Goyal wrote:
>>> Yes, I have tested with xfs, which works fine too. Please see the dmesg log:
>>>
>>> [    3.426025] XFS (dm-3): Mounting V5 Filesystem
>>> [    3.599421] XFS (dm-3): Starting recovery (logdev: internal)
>>> [    3.624443] XFS (dm-3): Ending recovery (logdev: internal)
>>> [    3.647705] xfs filesystem being mounted at /kdumproot/mnt supports timestamps until 2038 (0x7fffffff)
>>> ....
>>> [    3.552870] kdump[514]: saving vmcore
>>> [    4.082238] device-mapper: thin: 253:2: reached low water mark for data device: sending event.
>>> [    4.142192] device-mapper: thin: 253:2: switching pool to out-of-data-space (queue IO) mode
>>> [    3.679640] lvm[446]: WARNING: Sum of all thin volume sizes (300.00 MiB) exceeds the size of thin pools and the size of whole volume group (48.00 MiB).
>>> [    3.690712] lvm[446]: Size of logical volume vg00/thinpool_tdata changed from 12.00 MiB (3 extents) to 20.00 MiB (5 extents).
>>> [    4.211179] device-mapper: thin: 253:2: switching pool to write mode
>>> [    4.227481] device-mapper: thin: 253:2: growing the data device from 192 to 320 blocks
>>> [    3.789801] lvm[446]: Logical volume vg00/thinpool_tdata successfully resized.
>>> [    4.269802] device-mapper: thin: 253:2: reached low water mark for data device: sending event.
>>> [    4.335903] device-mapper: thin: 253:2: switching pool to out-of-data-space (queue IO) mode
>>> [    3.948400] lvm[446]: WARNING: Sum of all thin volume sizes (300.00 MiB) exceeds the size of thin pools and the size of whole volume group (48.00 MiB).
>>> [    3.972514] lvm[446]: Size of logical volume vg00/thinpool_tdata changed from 20.00 MiB (5 extents) to 32.00 MiB (8 extents).
>>> [    4.468128] device-mapper: thin: 253:2: switching pool to write mode
>>> [    4.485914] device-mapper: thin: 253:2: growing the data device from 320 to 512 blocks
>>> [    4.049409] lvm[446]: Logical volume vg00/thinpool_tdata successfully resized.
>>> [    4.528457] device-mapper: thin: 253:2: reached low water mark for data device: sending event.
>>> [    4.605303] device-mapper: thin: 253:2: switching pool to out-of-data-space (queue IO) mode
>>> [    4.224389] lvm[446]: Insufficient free space: 4 extents needed, but only 2 available
>>> [    4.238011] lvm[446]: Failed command for vg00-thinpool-tpool.
>>> [    4.251287] lvm[446]: WARNING: Thin pool vg00-thinpool-tpool data is now 100.00% full.
>>> [    4.272262] lvm[446]: Insufficient free space: 4 extents needed, but only 2 available
>>> [    4.293131] lvm[446]: Failed command for vg00-thinpool-tpool.
>>> [    4.339268] kdump.sh[515]: Checking for memory holes : [100.0 %], Excluding unnecessary pages : [100.0 %], Copying data : [100.0 %]
>>> [    4.365706] kdump.sh[515]: The dumpfile is saved to /kdumproot/mnt/var/crash/127.0.0.1-2022-05-25-00:26:53//vmcore-incomplete.
>>> [    4.374532] kdump.sh[515]: makedumpfile Completed.
>>> [   23.344673] lvm[446]: Insufficient free space: 4 extents needed, but only 2 available
>>> [   23.361064] lvm[446]: Failed command for vg00-thinpool-tpool.
>>> [   53.344919] lvm[446]: Insufficient free space: 4 extents needed, but only 2 available
>>> [   53.372743] lvm[446]: Failed command for vg00-thinpool-tpool.
>>> [   67.034930] device-mapper: thin: 253:2: switching pool to out-of-data-space (error IO) mode
>>> [   67.049185] dm-3: writeback error on inode 134, offset 176128, sector 1664
>>> [   67.049388] dm-3: writeback error on inode 134, offset 21590016, sector 42448
>>> [   67.058358] dm-3: writeback error on inode 134, offset 25784320, sector 58752
>>> [   67.067440] dm-3: writeback error on inode 134, offset 29978624, sector 64128
>>> [   67.077099] dm-3: writeback error on inode 134, offset 34172928, sector 67328
>>> [   67.086122] dm-3: writeback error on inode 134, offset 34340864, sector 82816
>>> [   67.095074] dm-3: writeback error on inode 134, offset 37294080, sector 88704
>>> [   67.104907] dm-3: writeback error on inode 134, offset 40325120, sector 90368
>>> [   67.120230] dm-3: writeback error on inode 134, offset 770048, sector 1784
>>> [   66.670306] kdump.sh[562]: sync: error syncing '/kdumproot/mnt/var/crash/127.0.0.1-2022-05-25-00:26:53//vmcore': Input/output error
>>> [   66.694671] kdump[570]: sync vmcore failed, exitcode:1
>>> [   66.709217] lvm[446]: Insufficient free space: 4 extents needed, but only 2 available
>>> [   66.757336] kdump[572]: saving vmcore failed
>> And the system rebooted by itself after this?
>>
>> This is strange. Has xfs's default behavior changed? It used to retry
>> indefinitely by default if the thin pool got full.
>>
>> Eric, would you have any idea what has changed?
> I don't actually see any XFS errors at all.... oh, ok - the "writeback
> error" messages are from iomap, presumably generated by xfs calls. Data
> IO errors won't be critical to xfs; the (now tunable) error retry
> behavior is only for metadata IO failures.
>
> It does look like the sync properly reported the failure, though, yes?
> Which was the goal of this exercise, I think.
Yes, that was one of the goals.
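
The kdump scripts sync the vmcore and act on the exit status, and
fsync(2) is where the deferred writeback error surfaces to userspace.
Roughly this pattern (an untested sketch; the path and makedumpfile
options are made up for illustration, this is not the actual kdump.sh):

    # Save the dump, then force it to stable storage and check the
    # result. coreutils sync(1) with a file argument calls fsync(2),
    # which is where the EIO from the full thin pool finally shows up.
    makedumpfile -l -d 31 /proc/vmcore /mnt/crash/vmcore || exit 1
    if ! sync /mnt/crash/vmcore; then
        echo "sync vmcore failed" >&2   # the failure seen in the log
        exit 1
    fi

That matches the "kdump[570]: sync vmcore failed, exitcode:1" line in
the log above.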
Will xfs not flush metadata as well as part of sync (if there is any)?
So it is possible that the thin pool is full, the metadata cannot be
flushed, and then xfs will hang? IOW, it looks like we might hang some
of the time and not at other times.
So tweaking the XFS error knobs is still a good idea, IMHO, to make
sure we do not hang while saving the dump. If we can't save the dump
because the thin pool is full, we should report an error and reboot.
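
For example, something like this in the kdump initramfs before saving
the dump (an untested sketch; the knobs are the sysfs controls
documented in Documentation/admin-guide/xfs.rst, and "dm-3" is taken
from the log above):

    dev=dm-3
    # Fail metadata writeback immediately instead of retrying forever,
    # so a full thin pool cannot wedge the kdump kernel.
    echo 0 > /sys/fs/xfs/$dev/error/metadata/ENOSPC/max_retries
    echo 0 > /sys/fs/xfs/$dev/error/metadata/EIO/max_retries
    # Do not keep retrying at unmount either; shut the fs down instead.
    echo 1 > /sys/fs/xfs/$dev/error/fail_at_unmount

Then a metadata IO failure shuts the filesystem down instead of
hanging, and we can report the error and reboot.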
Thanks
Vivek