On Tue, Feb 11, 2020 at 11:48 PM Chris Murphy <lists@colorremedies.com> wrote:
On Tue, Feb 11, 2020 at 3:00 AM Kamil Paral <kparal@redhat.com> wrote:
>
> On Mon, Feb 10, 2020 at 9:43 PM Chris Murphy <lists@colorremedies.com> wrote:
>>
>> Where I came up with 2:1 is from anaconda/blivet code:
>> anaconda/pyanaconda/storage/utils.py:642:    :param bool hibernation:
>> calculate swap size big enough for hibernation
>> https://github.com/rhinstaller/anaconda/blob/master/pyanaconda/storage/utils.py#L654
>>
>> Note on line 673 it actually could be 3x RAM, if --hibernation were
>> used, but this flag isn't used on Fedora Workstation so this
>> computation never gets used.
>
>
> I guess that's because if you search the Internet, the swap recommendations are all over the place, so some median numbers were picked :-) I had no idea about the hibernation limitation either, until web browsers started to eat 4+ GB of RAM... :-)
>
> The problem is that you don't know how much swap the system will use during regular usage, so you don't know how much headroom you need on top of the 0.5x mem size needed for hibernation. I think nowadays a new formula could be devised, something like `swap size = 0.5x mem size + 2 GB`. I think that's plenty for regular usage (even SSDs are too slow to be used as RAM, not to mention it will wear them out quickly) and it should allow for hibernation most of the time.

I don't think that's enough to be reliable.

Example system: 32G RAM, all of it used, plus 2G of page outs (into
the swap device).

+ 2G already paged out to swap
+ 16GB needs to be paged out to swap, to free up enough memory to
create the hibernation image

If I understood the kernel discussions correctly, there's currently no simple and reliable mechanism to achieve this (moving the excess memory to swap), so I wouldn't count it like this. If more than 50% of memory is occupied, you're out of luck: hibernation will be a no-op (it would be nice if it showed a user-friendly GUI message instead of only logging to the system journal). If you are under 50% memory utilization but your swap space is insufficient, hibernation will also be aborted (this time not immediately, but only after it compresses the memory and finds out the image doesn't fit into the free swap space). Again, a visible error message would be nice. It sounds like there are too many cases where the functionality fails, and that's true, but that's what we have, and I often consider it better to close a few extra tabs than to have to power off. But it's annoying and far from polished, yes.
 
+ 8-16GB for the (compressed) hibernation image to be written to a
*contiguous* range within swap

That's 26-34G needed for the swap device. Since the swap device is
shared for pages and hibernation image, the actual size isn't knowable
until the approximate time hibernation is called.
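To make the arithmetic in this example easy to redo for other RAM sizes, here is a rough Python sketch. The function names are hypothetical, and the image size range (0.25x to 0.5x of RAM) is only an assumption read off the 8-16GB figure for 32G of RAM, not a kernel constant. It compares the "0.5x mem size + 2 GB" proposal quoted earlier with the accounting used in the example.

```python
def proposed_swap_gb(ram_gb, headroom_gb=2.0):
    """The proposal quoted earlier: half of RAM plus a fixed headroom."""
    return 0.5 * ram_gb + headroom_gb


def example_swap_need_gb(ram_gb, already_swapped_gb, image_ratio_range=(0.25, 0.5)):
    """Swap needed under the accounting of the 32G example.

    image_ratio_range is the compressed image size as a fraction of RAM.
    It is an assumption taken from the 8-16GB figure, not a measured value.
    """
    must_page_out = 0.5 * ram_gb  # free half of RAM to build the image
    low, high = (ratio * ram_gb for ratio in image_ratio_range)
    base = already_swapped_gb + must_page_out
    return base + low, base + high


if __name__ == "__main__":
    print(proposed_swap_gb(32))         # 18.0
    print(example_swap_need_gb(32, 2))  # (26.0, 34.0)
```

Under these assumptions the proposal gives 18G for a 32G machine, while the example accounting asks for 26-34G. Whether the 16GB page-out step should be counted at all is disputed in this thread.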

Windows, AFAIK, pre-allocates the hibernation image and doesn't share it with swap space. It's very inefficient regarding disk size, but it improves reliability, obviously.

Btw, you make a very good point that the memory image gets compressed. So for the hibernation image, you can know its maximum size, but you can't know its actual size in advance, since that depends on how well your memory compresses. And that's why you can often hibernate even with quite a small swap: the memory just compressed well.
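The two failure modes described in this thread (more than half of RAM in use, or a compressed image that doesn't fit into free swap) can be sketched as a toy model. This is not the kernel's actual algorithm. The function name and the `compression_ratio` parameter are hypothetical, and the 50% threshold is simply the figure discussed here.

```python
def hibernation_outcome(ram_gb, used_gb, free_swap_gb, compression_ratio):
    """Simplified model of the hibernation failure modes discussed above.

    compression_ratio is compressed image size divided by used memory.
    It is a hypothetical input: the real value is only known once the
    kernel has actually compressed the memory.
    """
    if used_gb > 0.5 * ram_gb:
        # The image cannot be created while more than half of RAM is occupied.
        return "refused: more than half of RAM is in use"
    image_gb = used_gb * compression_ratio
    if image_gb > free_swap_gb:
        # Only detected after compression, so this failure comes late.
        return "aborted: compressed image does not fit in free swap"
    return "ok"
```

For example, with 32G of RAM, 12G in use and 4G of free swap, this model aborts at a ratio of 0.5 (6G image) but succeeds at 0.25 (3G image), which is the "it just compressed well" case.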


As Bastien noted in the pagure/workstation issue:
https://pagure.io/fedora-workstation/issue/121#comment-620831

I've started a followup thread upstream to get more information about
all of this. Kamil, can you do a brief write up of your use case or
reproduce steps? I want upstream to know that your case is real world,
and not some already known contrived case. You can either reply here
and I'll reference it on linux-mm@ or you can post directly to this
thread if you prefer. Thanks.

https://lore.kernel.org/linux-mm/CAA25o9TvFMEJnF45NFVqAfdxzKy5umzHHVDs+SCxrChGSKczTw@mail.gmail.com/

Hmm, what exactly do you want to know? :) Why I use hibernation?

On my desktop:
* Due to some hardware/firmware flakiness I can't suspend to RAM, because I sometimes get random memory corruption on resume (note: this is not faulty hardware in the usual sense, the problem occurs *only* when suspending to RAM, everything is super reliable otherwise). So I hibernate. I don't want to log in and open all the windows every time I return to the computer (that would be the case if I powered it off). Even if I didn't have flaky hardware, I'd sometimes want to suspend to RAM and sometimes hibernate, depending on whether I want to turn off the power strip at that time (e.g. when leaving the flat for a few days, during a big storm, when having other appliances on the same power strip and wanting to save power, etc).

On my laptop:
* I use suspend to RAM, and it works well. But occasionally the power drain is surprisingly high during suspend (perhaps a firmware issue) and the battery can be depleted overnight. Usually it can last 2-3 days, which is fine during the work week, but it might be risky over the weekend. It requires you to keep track of the battery level when suspending (especially on Fridays), or simply power it off. I don't want to say it's a problem, it's not, it's just not as comfortable as on Windows. A hybrid sleep would work wonders here. You simply suspend it and it hibernates after a few hours automatically. However, GNOME doesn't actually allow me to configure it and it even overrides systemd configs. I'd ideally want to pick the right action that suits my current need (suspend, hybrid sleep, hibernate) with a configurable default for closing the lid.

On my parents'/wife's laptop:
* This is the most difficult use case. They can't and will not think about battery level and how suspend works when closing the laptop lid. They expect the system to behave intelligently. If they come back to the laptop in a few days and it's drained to 0% so that it won't even turn on and all the progress in opened applications has been lost, that's a big failure. They don't consider it their fault ("I should have powered it off instead"), but the system's fault. After all, this situation doesn't happen on Windows. For these regular users, I believe that hybrid sleep is a necessity. Every time I install Linux for someone new, I have to note that they need to be much more careful about power management and using suspend.





>> And yet there is a 'resume=UUID' boot
>> parameter included. Why is this boot parameter set as if we're
>> supporting hibernation out of the box?
>
>
> Probably because of this?
> https://bugzilla.redhat.com/show_bug.cgi?id=1206936
> https://github.com/rhinstaller/anaconda/pull/1360

Right, but what I mean is that anaconda is not called with --hibernation.
The higher partition requirements for hibernation are tied to the
--hibernation flag, but the resume= boot parameter insertion is not.
And it seems to me those things should be tied to --hibernation, and
then also whether --hibernation is used should be an edition specific
decision, and that it should be explicit because otherwise we're just
setting users up for misadventure.

I somewhat disagree. The --hibernation flag just affects swap size. Larger swap makes it a bit more likely that hibernation will work, yes. But that doesn't mean it wouldn't work with a smaller swap. The resume= parameter is required in all cases, whether you target a system that has a large or a medium swap size by default. Also, don't forget these are just default values, the users can change them during partitioning. It would be sad if the user intentionally configured a large swap because she knows she wants to hibernate, and it didn't work just because the resume= parameter was missing, right? If some edition wants to prevent hibernation (that alone sounds like a bad idea), they can do it better by not offering the GUI option, or in the extreme case by somehow overriding "systemctl hibernate" functionality (let's not do that). If you allow hibernation but break the resume (by omitting resume=), then the user has just lost some data.
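Since the whole argument hinges on whether resume= is present on the kernel command line, a small sketch of how one might check for it may help. The function name is hypothetical, and the sample command line in the comment is made up for illustration.

```python
def resume_device(cmdline):
    """Return the value of the resume= kernel parameter, or None if absent.

    cmdline is a kernel command line as a single string, for example
    "root=UUID=abc ro resume=UUID=1234 rhgb quiet" (illustrative only).
    """
    prefix = "resume="
    for token in cmdline.split():
        if token.startswith(prefix):
            # Keep everything after the first "=", e.g. "UUID=1234".
            return token[len(prefix):]
    return None
```

On a running Linux system the string to pass in can be read from /proc/cmdline. A result of None would mean the kernel was not told where to look for a hibernation image at boot.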
 


> And yes, we currently support hibernation out of the box, and it works, provided you don't have SecureBoot enabled and you either use a non-GNOME desktop or know how to call it from GNOME.

It definitely doesn't work in all cases, and we don't have any
evidence that it works in most cases.

Yes, I meant it works in general, except for all the usual hardware issues of course (which are many, yes; I myself have had a ~50% success rate with my past hardware).
 
I have a laptop that 100% of the
time fails to resume from hibernation because the hibernation image is
considered corrupt by the kernel. I don't know if it's corrupt during
image write out, or read in. S4 relies on ACPI, and isn't even
reliable on Windows or macOS 100% of the time. And as you've
experienced, S3 certainly isn't reliable.

Whereas S0 low power mode is reliable, which is why so much effort is
going there. Just leave ACPI and the (logic board) firmware out of
consideration.

I would rather hibernation work. But I don't think it's OK to, by
default, create huge swap partitions when this very clearly is not at
all reliable as evidenced by your own experience, which requires
either luck or esoteric knowledge to get the active page amount below
some ~50% threshold in order for hibernation image creation to
succeed.

Yes, and I have no objections to lowering the default swap size (and thus making hibernation very unlikely to work) for these very reasons. But I'd like to keep it "functional" (except hardware/firmware issues) out of the box if the user decides to create a larger swap during installation (for this use case or some other).