On Sat, Jun 6, 2020 at 4:18 AM David Kaufmann <astra(a)ionic.at> wrote:
> Hi!
>
> On Sat, Jun 06, 2020 at 01:15:35AM -0700, Samuel Sieb wrote:
> > 5GB of swap space that normally would be on disk is now taking less than 2G
> > of RAM. Instead of the usual 6G in the disk swap, now I have less than 2.
>
> What's bugging me about this thread is that quite a few people made
> comparisons resulting in "compressed zram is smaller than uncompressed
> disk swap". To me this seems quite obvious, and looks like flawed
> testing.
>
> Wouldn't it be more correct to also compress disk swap to compare sizes?
> This would probably make the size argument moot, as I'd expect them to
> be the same size. If they're not, that would be a useful data point.
>
> Also, I don't care about 10GB of stuff sitting in disk swap, whilst 1GB
> of RAM could be quite essential. This leaves overall performance as the
> main potential benefit of zram swap.
To me this sounds like too much dependency on swap. There is a
balance. You really do want some swap, because incidental and
occasional heavy swapping is merely inactive page eviction, and it's
good to get those pages out of the way because it frees memory for
real work. It also avoids reclaim, which is what must happen in the
noswap case. What people hate is slow swap. So the proposal is
basically: use some fast swap rather than a lot of slow swap. It's
mainly an optimization.

I expect special workloads will need to make adjustments, and in the
case of heavy persistent swapping it might be that zswap is better, or
it might be that nothing can help that workload except buying more RAM.
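As a sketch of what "some fast swap and not a lot of slow swap" can look like in practice: the kernel tries higher-priority swap devices first, so a high-priority zram device absorbs hot pages while a low-priority disk partition is only overflow. Device names here are hypothetical examples, and this requires root:

```shell
# Hypothetical setup (device names are examples; requires root).
modprobe zram
echo lz4 > /sys/block/zram0/comp_algorithm   # must be set before disksize
echo 4G  > /sys/block/zram0/disksize
mkswap /dev/zram0
swapon --priority 100 /dev/zram0   # fast zram swap, tried first
swapon --priority 0 /dev/sda2      # hypothetical disk swap, overflow only
```

`swapon --show` and `zramctl` can then confirm which device is actually absorbing the swapped pages.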
> What almost no one wrote about is CPU usage, for which I saw two
> anecdotal references: "no change" and "doubled CPU usage".
For sure there is an impact on CPU. This exchanges IO-bound work for
CPU- and memory-bound work. But it's pretty lightweight compression.
And again, whatever counts as "too much" CPU hit for the workload
necessarily translates into either suffering the cost of IO-bound
swap-on-drive or paying for more memory. There is no free lunch. It
really isn't magic.
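To make the CPU-for-space trade concrete, here is a small sketch that compresses page-sized buffers and reports time and ratio. zram uses in-kernel lzo-rle or lz4; `zlib` here is only a stdlib stand-in to illustrate the trade, not a performance claim about zram:

```python
import time
import zlib

PAGE = 4096  # typical page size on x86-64

def compress_pages(n_pages: int, level: int = 1) -> tuple[float, float]:
    """Compress n_pages page-sized buffers; return (seconds, ratio)."""
    # Semi-repetitive content, roughly standing in for typical anonymous
    # pages; truly random pages would barely compress at all.
    data = (b"some fairly repetitive page content " * 200)[:PAGE]
    start = time.perf_counter()
    total_out = 0
    for _ in range(n_pages):
        total_out += len(zlib.compress(data, level))
    elapsed = time.perf_counter() - start
    return elapsed, (n_pages * PAGE) / total_out

elapsed, ratio = compress_pages(1000)
print(f"1000 pages in {elapsed:.3f}s, ratio {ratio:.1f}:1")
```

The point is only that each swapped page costs some CPU and buys some space; whether that beats waiting on a disk is exactly the workload-dependent question being discussed.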
One thing this might alter the calculus of is swappiness. Because the
zram device is so much faster, paging out and back in carries a lower
penalty than the reads caused by file reclaim. So instead of folks
thinking swappiness should be 1 (or even 0), it's now the opposite: it
probably should be 100.
See the swappiness section in the Chris Down article referenced in the proposal:
https://chrisdown.name/2018/01/02/in-defence-of-swap.html
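A minimal sketch of checking and raising swappiness; the `sysctl.d` filename is just an example:

```shell
sysctl vm.swappiness                 # show the current value (default 60)
sysctl -w vm.swappiness=100          # takes effect immediately (root)
# persist across reboots (filename is an example):
echo 'vm.swappiness = 100' > /etc/sysctl.d/99-swappiness.conf
```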
> It would be interesting to see a comparison of swap, compressed swap and
> zram swap performance while having processes competing for swap.
>
> E.g. a system with 8 GB RAM:
>
> * 2 processes with 3.5 GB memory each
>   This would leave about 1 GB for the system itself, so with disk swap
>   there should not be much swapping, and with zram swap I'd expect
>   little swapping too.
> * 2 processes with 4 GB memory each
>   This would force light disk swap usage in the first two cases; I'm not
>   really sure what it would do to zram swap.
> * 2 processes with 4.5 GB memory each
>   This would definitely lead to constant swapping, but it should still
>   fit in zram swap: with a ratio of 2:1 it would theoretically fit in a
>   2 GB zram device, and realistically in a 4 GB zram device.
I'm not sure. Maybe. But this is the plus of having it available and
folks testing it. It is still as rudimentary as swap. It might be
useful to make it a bit smarter; look into some of the cgroupsv2
resource control work for hints on how to make the zram device bigger
or smaller. Some workloads might favor a zram device sized to 100% of
RAM, without difficulty.
> For the programs I'd think of something allocating memory and
> writing/reading random parts of it, running for N iterations.
> Interesting points would be: average CPU usage and wallclock time for
> each of the processes.
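A worker along those lines could be sketched as below (a hypothetical test harness, not an existing tool). One caveat worth noting for any zram comparison: filling pages with purely random bytes makes them nearly incompressible, which would unfairly handicap zram, so a realistic test should mix in compressible content:

```python
import random
import time

def stress(size_bytes: int, iterations: int) -> float:
    """Allocate size_bytes, then write and read back random 4 KiB
    chunks for the given number of iterations; return wallclock seconds.
    Note: random bytes are incompressible, so a zram benchmark should
    use partly compressible data instead."""
    buf = bytearray(size_bytes)
    chunk = 4096
    start = time.perf_counter()
    for _ in range(iterations):
        off = random.randrange(0, size_bytes - chunk)
        buf[off:off + chunk] = random.randbytes(chunk)  # write
        _ = bytes(buf[off:off + chunk])                 # read back
    return time.perf_counter() - start

# Quick self-check with a small buffer; the proposed test would use
# sizes near or above free RAM (e.g. 3.5-4.5 GiB per process).
print(f"demo: {stress(8 * 2**20, 1000):.3f}s")
```

Two such processes run side by side, with `/usr/bin/time -v` capturing CPU and wallclock figures, would roughly match the proposed scenarios.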
> I've left out the obvious cases of "not enough memory usage to hit swap
> at all" and "so much memory usage that it results in OOM". The OOM
> killer should not be invoked in daily usage anyway; I see that as a
> sign that something has gone seriously wrong, comparable with "program
> segfaults on start".
There are worse things than OOM: being stuck with a totally
unresponsive system and no OOM kill on the way. Hence earlyoom, and
the ongoing resource control work with cgroupsv2 isolation.
--
Chris Murphy