= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
Change owner(s): Rolf Fokkens rolf@rolffokkens.nl
Using recent kernel (3.9 and later) features for (fast) SSD caching of (slow) ordinary hard disks.
== Detailed description ==
Recent Linux kernels support the use of Solid State Drives as caches for rotational hard disks. Because of the high cost per GB of SSD devices, this feature may bring the best of both worlds: fast and big yet affordable storage capacity. Linux kernel 3.9 introduced dm-cache; kernel 3.10 introduces bcache.
== Scope ==
Proposal owners: Enable caching features in new kernels
Other developers: Support the caching features in their respective packages. Special focus should be on making the system boot from a cached root FS.
Release engineering: All packages should operate in close harmony to make this work. Only a rebuild of the relevant packages is required.
Policies and guidelines: No changes I think.
_______________________________________________
devel-announce mailing list devel-announce@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel-announce
On 2013-07-15 12:56, Jaroslav Reznik wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
Note: there might already be such a feature in a very recent Fedora installer. I used the F18 one.
On Mon, Jul 15, 2013 at 7:47 AM, Mihamina Rakotomandimby mihamina@rktmb.org wrote:
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
you can set IO scheduler to noop for disks that are reported as not "rotational".
Example udev script at https://github.com/satya164/fedorautils/blob/master/plugins/disk_io_schedule...
It might be a good idea to add this to initscripts.
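The udev approach described above fits in a single rule. A minimal sketch, assuming the legacy (non-multiqueue) block layer of that era and `sd*` device naming; the file path is illustrative:

```
# /etc/udev/rules.d/60-ssd-scheduler.rules (illustrative path)
# For devices the kernel reports as non-rotational, select the noop elevator.
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", \
    ATTR{queue/scheduler}="noop"
```

Note the `==` forms are matches and the final `=` is an assignment, so the scheduler is only written for devices whose `rotational` attribute is 0.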
cheers,
m -- martin.langhoff@gmail.com - ask interesting questions - don't get distracted with shiny stuff - working code first ~ http://docs.moodle.org/en/User:Martin_Langhoff
On Mon, Jul 15, 2013 at 3:42 PM, Martin Langhoff martin.langhoff@gmail.com wrote:
On Mon, Jul 15, 2013 at 7:47 AM, Mihamina Rakotomandimby mihamina@rktmb.org wrote:
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
you can set IO scheduler to noop for disks that are reported as not "rotational".
Example udev script at https://github.com/satya164/fedorautils/blob/master/plugins/disk_io_schedule...
It might be a good idea to add this to initscripts.
A tuned profile I believe is the recommended spot for this sort of thing these days.
Peter
On Mon, Jul 15, 2013 at 10:55 AM, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Jul 15, 2013 at 3:42 PM, Martin Langhoff martin.langhoff@gmail.com wrote:
On Mon, Jul 15, 2013 at 7:47 AM, Mihamina Rakotomandimby mihamina@rktmb.org wrote:
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
you can set IO scheduler to noop for disks that are reported as not "rotational".
Example udev script at https://github.com/satya164/fedorautils/blob/master/plugins/disk_io_schedule...
It might be a good idea to add this to initscripts.
A tuned profile I believe is the recommended spot for this sort of thing these days.
Would you run tuned on a server?
Right. I am not familiar with it. OTOH, a noop scheduler is a simple enough, non-dynamic rule.
udev seems like a much better fit than a daemon.
But what do I know. I am never in the fashion :-)
m
On Mon, Jul 15, 2013 at 12:27 PM, Matthew Garrett mjg59@srcf.ucam.org wrote:
On Mon, Jul 15, 2013 at 12:21:57PM -0400, Martin Langhoff wrote:
Would you run tuned on a server?
It was written with that in mind.
-- Matthew Garrett | mjg59@srcf.ucam.org
-- devel mailing list devel@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/devel
On Mon, Jul 15, 2013 at 5:21 PM, Martin Langhoff martin.langhoff@gmail.com wrote:
On Mon, Jul 15, 2013 at 10:55 AM, Peter Robinson pbrobinson@gmail.com wrote:
On Mon, Jul 15, 2013 at 3:42 PM, Martin Langhoff martin.langhoff@gmail.com wrote:
On Mon, Jul 15, 2013 at 7:47 AM, Mihamina Rakotomandimby mihamina@rktmb.org wrote:
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
you can set IO scheduler to noop for disks that are reported as not "rotational".
Example udev script at https://github.com/satya164/fedorautils/blob/master/plugins/disk_io_schedule...
It might be a good idea to add this to initscripts.
A tuned profile I believe is the recommended spot for this sort of thing these days.
Would you run tuned on a server?
Yes, it sets and deals with a lot more than just schedulers.
[1] https://docs.fedoraproject.org/en-US/Fedora/15/html-single/Power_Management_...
On Mon, Jul 15, 2013 at 02:47:30PM +0300, Mihamina Rakotomandimby wrote:
On 2013-07-15 12:56, Jaroslav Reznik wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
One thing I would recommend would be to correctly detect SSDs: ATM, I installed my Fedora over an SSD and it did not adjust the mount settings nor suggest an appropriate setting for the SSD. If we can at least detect the SSD and suggest (just suggest) a setting it would be great.
What are the right default settings for an SSD? From an IO scheduler perspective, CFQ already reads rotational flag and changes its behavior.
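The flag CFQ consults is visible in sysfs. A read-only illustration (assuming `sd*` device names; nothing is changed):

```shell
# Print each sd* block device's rotational flag
# (1 = spinning disk, 0 = SSD/flash).
for f in /sys/block/sd*/queue/rotational; do
    [ -e "$f" ] || continue   # skip if the glob matched nothing
    echo "$f: $(cat "$f")"
done
```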
Thanks Vivek
On Mon, Jul 15, 2013 at 5:56 AM, Jaroslav Reznik jreznik@redhat.com wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
Change owner(s): Rolf Fokkens rolf@rolffokkens.nl
Using recent kernel (3.9 and later) features for (fast) SSD caching of (slow) ordinary hard disks.
== Detailed description ==
Recent Linux kernels support the use of Solid State Drives as caches for rotational hard disks. Because of the high cost per GB of SSD devices, this feature may bring the best of both worlds: fast and big yet affordable storage capacity. Linux kernel 3.9 introduced dm-cache; kernel 3.10 introduces bcache.
== Scope ==
Proposal owners: Enable caching features in new kernels
These options are already enabled in the kernel.
Other developers: Support the caching features in their respective packages. Special focus should be on making the system boot from a cached root FS.
This is pretty generic. Which packages? Who is going to do the work here?
Release engineering: All packages should operate in close harmony to make this work. Only a rebuild of the relevant packages is required.
This doesn't make sense. We're doing a mass rebuild, so all packages are getting rebuilt anyway.
I'm confused what this Change is actually for. It doesn't sound like an actual planned and targeted set of changes. It seems more of a nebulous "we should get people to do this" proposal.
josh
On Mon, Jul 15, 2013 at 09:17:56AM -0400, Josh Boyer wrote:
On Mon, Jul 15, 2013 at 5:56 AM, Jaroslav Reznik jreznik@redhat.com wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
Release engineering: All packages should operate in close harmony to make this work. Only a rebuild of the relevant packages is required.
This doesn't make sense. We're doing a mass rebuild, so all packages are getting rebuilt anyway.
I'm confused what this Change is actually for. It doesn't sound like an actual planned and targeted set of changes. It seems more of a nebulous "we should get people to do this" proposal.
Yes, I would *guess* it involves:
- modifying anaconda to allow cache device designation during installation (this is more important with bcache, as it needs special formatting; dm-cache can be disabled/enabled on the fly)
- modifying dracut to properly attach bcache in initramfs
- integrating dm-cache handling with local-fs.target
- finishing the SSD caching layer for btrfs
- (...) ?
Tomasz Torcz (tomek@pipebreaker.pl) said:
On Mon, Jul 15, 2013 at 09:17:56AM -0400, Josh Boyer wrote:
On Mon, Jul 15, 2013 at 5:56 AM, Jaroslav Reznik jreznik@redhat.com wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
Release engineering: All packages should operate in close harmony to make this work. Only a rebuild of the relevant packages is required.
This doesn't make sense. We're doing a mass rebuild, so all packages are getting rebuilt anyway.
I'm confused what this Change is actually for. It doesn't sound like an actual planned and targeted set of changes. It seems more of a nebulous "we should get people to do this" proposal.
Yes, I would *guess* it involves:
- modifying anaconda to allow cache device designation during installation (this is more important with bcache, as it needs special formatting; dm-cache can be disabled/enabled on the fly)
- modify dracut to properly attach bcache in initramfs
- integrate dm-cache handling with local-fs.target
- finishing SSD caching layer for btrfs
- (...) ?
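For the anaconda and dracut items above, the underlying bcache setup they would automate looks roughly like this. A sketch only, with hypothetical device names; it requires bcache-tools and root, and reformats the named devices:

```shell
# Format /dev/sdb as the backing (spinning) device and /dev/sdc as the
# SSD cache in one step; make-bcache registers and attaches them:
make-bcache -B /dev/sdb -C /dev/sdc
# The composite device then appears as /dev/bcache0:
mkfs.ext4 /dev/bcache0
mount /dev/bcache0 /mnt
```

This is exactly the "special formatting" step that makes bcache harder to retrofit than dm-cache, and why installer support matters.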
Yeah, I'm confused. If it's just the raw enablers, it's:
- turn on bcache & dm-cache (done)
- add dm-cache to device-mapper userspace (done, IIRC)
- build bcache-tools (don't see a review for it)
The real system wide change would be the more invasive things Tomasz mentions above.
Bill
----- Original Message -----
Tomasz Torcz (tomek@pipebreaker.pl) said:
On Mon, Jul 15, 2013 at 09:17:56AM -0400, Josh Boyer wrote:
On Mon, Jul 15, 2013 at 5:56 AM, Jaroslav Reznik jreznik@redhat.com wrote:
= Proposed System Wide Change: SSD cache = https://fedoraproject.org/wiki/Changes/SSD_cache
Release engineering: All packages should operate in close harmony to make this work. Only a rebuild of the relevant packages is required.
This doesn't make sense. We're doing a mass rebuild, so all packages are getting rebuilt anyway.
I'm confused what this Change is actually for. It doesn't sound like an actual planned and targeted set of changes. It seems more of a nebulous "we should get people to do this" proposal.
Yes, I would *guess* it involves:
- modifying anaconda to allow cache device designation during installation (this is more important with bcache, as it needs special formatting; dm-cache can be disabled/enabled on the fly)
- modify dracut to properly attach bcache in initramfs
- integrate dm-cache handling with local-fs.target
- finishing SSD caching layer for btrfs
- (...) ?
Yeah, I'm confused. If it's just the raw enablers, it's:
- turn on bcache & dm-cache (done)
- add dm-cache to device-mapper userspace (done, IIRC)
- build bcache-tools (don't see a review for it)
The real system wide change would be the more invasive things Tomasz mentions above.
From an initial look, before it was clarified here, it looked more system-wide (kernel changes etc.). Now I agree, it has changed.
Jaroslav
Bill
I think this is a bad idea, at least for my setup. I really don't want my small expensive boot SSD being beaten to death trying to cache a multi-terabyte array, especially since I have plenty of RAM that already serves that purpose (the machine rarely reboots).
At the very least, this feature should be disabled if the SSD is the boot/root drive. When SSDs fail, they fail completely, and it's irresponsible to cause early failure on a drive that's critical for booting and OS operation.
Also, I think such features should be postponed until/unless there's a clear and obvious way to configure/disable them that doesn't involve installing additional packages or editing obscure text files.
On Mon, 15 Jul 2013 13:57:39 -0400 DJ Delorie dj@redhat.com wrote:
I think this is a bad idea, at least for my setup. I really don't want my small expensive boot SSD being beaten to death trying to cache a multi-terabyte array, especially since I have plenty of RAM that already serves that purpose (the machine rarely reboots).
Actually, bcache is very good about *not* wearing out SSDs -- it writes in giant erase block-sized portions and likely you can tune how much is written.
And either of these layers must be turned on by an admin -- it's not going to be shoved down your throat.
At the very least, this feature should be disabled if the SSD is the boot/root drive. When SSDs fail, they fail completely, and it's irresponsible to cause early failure on a drive that's critical for booting and OS operation.
By default, bcache runs a write-through cache -- it only caches clean data. If the caching SSD dies, the bcache layer can just forward requests to spinning drive. No data is lost.
(Bcache has a writeback mode where data loss is possible. I do not recommend this mode.)
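For reference, bcache exposes the mode through sysfs once a device is set up; the `bcache0` name here is hypothetical:

```shell
# Show the current mode (the bracketed entry is the active one):
cat /sys/block/bcache0/bcache/cache_mode

# Switch to writeback -- the riskier mode discussed above:
echo writeback > /sys/block/bcache0/bcache/cache_mode
```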
Also, I think such features should be postponed until/unless there's a clear and obvious way to configure/disable them that doesn't involve installing additional packages or editing obscure text files.
Again -- no one is forcing you to use this. It's opt-in.
Conrad
it's not going to be shoved down your throat.
I've found this to be untrue in Fedora.
At the very least, this feature should be disabled if the SSD is the boot/root drive. When SSDs fail, they fail completely, and it's irresponsible to cause early failure on a drive that's critical for booting and OS operation.
By default, bcache runs a write-through cache -- it only caches clean data. If the caching SSD dies, the bcache layer can just forward requests to spinning drive. No data is lost.
No, I wasn't worried about the spinny disks. I was worried about the SSD itself, in the case where the SSD hosts both boot/root *and* a cache for, say, a /home array.
Also, I think such features should be postponed until/unless there's a clear and obvious way to configure/disable them that doesn't involve installing additional packages or editing obscure text files.
Again -- no one is forcing you to use this. It's opt-in.
Please read the /tmp-on-tmpfs thread for an example of what I'm worried about.
On Mon, 15 Jul 2013 15:36:06 -0400 DJ Delorie dj@redhat.com wrote:
it's not going to be shoved down your throat.
I've found this to be untrue in Fedora.
I think this is disingenuous. Especially at the file-system / block layer, one can point to numerous examples of new features *not* forced on users. No one forces you to use LUKS. No one forces you to use LVM. No one forces you to use btrfs, ext4, or even ext3. You can still use ext2 if you like. Tmp-on-tmpfs is a default, and while it's opt-out, you *can* opt out.
Bcache is not something that will be default, and certainly not for upgrades from existing systems.
At the very least, this feature should be disabled if the SSD is the boot/root drive. When SSDs fail, they fail completely, and it's irresponsible to cause early failure on a drive that's critical for booting and OS operation.
By default, bcache runs a write-through cache -- it only caches clean data. If the caching SSD dies, the bcache layer can just forward requests to spinning drive. No data is lost.
No, I wasn't worried about the spinny disks. I was worried about the SSD itself, in the case where the SSD hosts both boot/root *and* a cache for, say, a /home array.
Ah, I tend to think of /home as living on root.
Yes we agree here. One shouldn't use an SSD for both data (/boot, /, or otherwise) and cache.
Also, I think such features should be postponed until/unless there's a clear and obvious way to configure/disable them that doesn't involve installing additional packages or editing obscure text files.
Again -- no one is forcing you to use this. It's opt-in.
Please read the /tmp-on-tmpfs thread for an example of what I'm worried about.
*Sigh*. This is pure hyperbole. Even the author of bcache can't show that bcache gives performance wins on all systems / use-cases. It's a wash.
And as you're afraid of, there are risks to using it that shouldn't be the default for users. I agree.
I think bcache should be opt-in but available. I agree that it should not be the default. It would be nice if anaconda supported it as an option, but that takes a significant amount of work.
I'm not convinced this should be a Fedora feature (unless Anaconda support is there). Until then, it's just a new 3.9 kernel feature.
Conrad
On 07/15/2013 09:25 PM, Conrad Meyer wrote:
By default, bcache runs a write-through cache -- it only caches clean data. If the caching SSD dies, the bcache layer can just forward requests to spinning drive. No data is lost.
(Bcache has a writeback mode where data loss is possible. I do not recommend this mode.)
What's the benefit of bcache, compared to just sticking more RAM in the machine? That you can get more cache, especially on systems that are short on memory sockets? Or that the cache persists across reboots (something that can be tricky because it requires synchronizing writes to the cache and the disk)?
On Tue, 16 Jul 2013 10:36:55 +0200 Florian Weimer fweimer@redhat.com wrote:
On 07/15/2013 09:25 PM, Conrad Meyer wrote:
By default, bcache runs a write-through cache -- it only caches clean data. If the caching SSD dies, the bcache layer can just forward requests to spinning drive. No data is lost.
(Bcache has a writeback mode where data loss is possible. I do not recommend this mode.)
What's the benefit of bcache, compared to just sticking more RAM in the machine? That you can get more cache, especially on systems that are short on memory sockets? Or that the cache persists across reboots (something that can be tricky because it requires synchronizing writes to the cache and the disk)?
From 5 minutes of research:
- 512 GB SSD on newegg -- $390.
- 512 GB of RAM on newegg -- $4200*.

* Doesn't include the cost of a server board that has 32 RAM slots.
So bcache is a more cost-effective way (than RAM) to expand the working set of disk you can access very quickly.
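Putting numbers on that, using the quoted 2013 prices (back-of-envelope arithmetic only):

```python
# Cost per GB from the figures above: 512 GB SSD at $390, 512 GB RAM at $4200.
ssd_per_gb = 390 / 512    # ~$0.76/GB
ram_per_gb = 4200 / 512   # ~$8.20/GB
print(f"RAM is ~{ram_per_gb / ssd_per_gb:.1f}x the cost of SSD per GB")
# → RAM is ~10.8x the cost of SSD per GB
```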
bcache in write-back mode must persist or else you suffer data loss on any power failure. So, I think that answers that question. Getting the syncing right isn't actually that hard.
Regards, Conrad