When running an FTBFS job on both euca and openstack, I started to see some pretty big differences in performance. CPU and memory were the same or close enough, so I decided to look at disk performance.
euca is backed by local disks in a RAID/LVM layout and/or exported via iSCSI through the Storage Controller.
openstack is using a replicated/distributed gluster volume for all disk back ends - including ephemeral (local) and volume-backed (iSCSI).
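(For the curious, the ephemeral side is basically the gluster volume mounted where nova keeps its instance disks - the host and volume names below are illustrative, not our actual boxes:)

    # mounted on each compute node; gluster01/nova-vol are made-up names
    mount -t glusterfs gluster01:/nova-vol /var/lib/nova/instances

    # or persistently, via /etc/fstab:
    # gluster01:/nova-vol  /var/lib/nova/instances  glusterfs  defaults,_netdev  0 0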
Results are here and are kinda staggering:
http://skvidal.fedorapeople.org/misc/cloudbench.txt
In short - gluster performance really bogs us down when building in the cloud instances.
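(If anyone wants to pull comparable numbers on their own instance, something along these lines is the general idea - the flags here are illustrative, not the exact invocation behind cloudbench.txt:)

    # sequential write throughput, bypassing the page cache
    dd if=/dev/zero of=testfile bs=1M count=1024 oflag=direct

    # random 4k mixed read/write, closer to what a build does (needs fio)
    fio --name=randrw --rw=randrw --bs=4k --size=1g --direct=1 \
        --ioengine=libaio --iodepth=16 --runtime=60 --time_based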
Thoughts on improving that performance? Or do we simply want to keep certain workloads in euca, specifically because the disks are more disposable and/or faster?
-sv
On Sat, Oct 06, 2012 at 05:26:00PM -0400, Seth Vidal wrote:
> openstack is using a replicated/distributed gluster volume for all disk back ends - including ephemeral (local) and volume-backed (iSCSI).
> Results are here and are kinda staggering:
> http://skvidal.fedorapeople.org/misc/cloudbench.txt
We should contact the gluster people about this. When we were running GPFS (a proprietary cluster filesystem from IBM) on the cluster at my last job, it was noticeably faster than local disk and scaled better. Gluster might not match that yet, but it needs to play in that ballpark.
On 10/06/2012 02:26 PM, Seth Vidal wrote:
> When running an FTBFS job on both euca and openstack, I started to see some pretty big differences in performance. CPU and memory were the same or close enough, so I decided to look at disk performance.
> euca is backed by local disks in a RAID/LVM layout and/or exported via iSCSI through the Storage Controller.
> openstack is using a replicated/distributed gluster volume for all disk back ends - including ephemeral (local) and volume-backed (iSCSI).
> Results are here and are kinda staggering:
> http://skvidal.fedorapeople.org/misc/cloudbench.txt
> In short - gluster performance really bogs us down when building in the cloud instances.
> Thoughts on improving that performance? Or do we simply want to keep certain workloads in euca, specifically because the disks are more disposable and/or faster?
> -sv
A few thoughts - and bear in mind that I don't have an entirely optimal understanding of everything here ;) so I have no idea how much of this may apply...
- AIUI gluster can be less than optimal for lots of small writes, especially when doing replication. I don't know if you have looked at all of the translators, but there are a few performance/write translators that can help improve this (see the sketch after this list).
- If this stuff is on a LAN that is slow/bogged down/not operating at full pipe speed, then that could drag everything down (depending on... things? I'm just recalling vague bits here, but it could be worth checking.)
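Something along these lines, maybe - the option names are from memory, so check 'gluster volume set help' against whatever version you're running, and VOLNAME/peer-host are placeholders:

    # tune the write-behind / io-thread / cache translators a bit
    gluster volume set VOLNAME performance.write-behind on
    gluster volume set VOLNAME performance.write-behind-window-size 4MB
    gluster volume set VOLNAME performance.flush-behind on
    gluster volume set VOLNAME performance.io-thread-count 32
    gluster volume set VOLNAME performance.cache-size 256MB

    # and a quick sanity check that the replication links really are
    # giving full bandwidth (run 'iperf -s' on the peer first)
    iperf -c peer-host -t 30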
Anyway, I'm forwarding this on to the gluster folks (john mark, eco, jeff, kaleb) in the hopes they might be able to help fine-tune things; I know that john mark has been curious about how things are going anyhow.
-Robyn
On Sun, 7 Oct 2012, Robyn Bergeron wrote:
> - AIUI gluster can be less than optimal for lots of small writes, especially when doing replication. I don't know if you have looked at all of the translators, but there are a few performance/write translators that can help improve this.
Indeed - we're using gluster to back fedorahosted, so we know it has some issues with small writes. Our reason for using it here is mostly to get a shared, reliable fs - specifically so we can use the live migration features of openstack.
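(That's the payoff of the shared fs: with /var/lib/nova/instances on gluster, a guest can move between compute nodes without copying its disk around. Treat the syntax as an illustrative sketch of the novaclient CLI:)

    # migrate a running instance to another compute node; the disk
    # stays put on the shared gluster mount
    nova live-migration INSTANCE_ID target-compute-node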
> - If this stuff is on a LAN that is slow/bogged down/not operating at full pipe speed, then that could drag everything down (depending on... things? I'm just recalling vague bits here, but it could be worth checking.)
The LAN is completely free and clear of anything other than the cloud noise.
> Anyway, I'm forwarding this on to the gluster folks (john mark, eco, jeff, kaleb) in the hopes they might be able to help fine-tune things; I know that john mark has been curious about how things are going anyhow.
Thanks - I had planned on pinging Jeff today, but you beat me to it.
-sv
On Sat, Oct 6, 2012 at 5:26 PM, Seth Vidal <skvidal@fedoraproject.org> wrote:
> When running an FTBFS job on both euca and openstack, I started to see some pretty big differences in performance. CPU and memory were the same or close enough, so I decided to look at disk performance.
> euca is backed by local disks in a RAID/LVM layout and/or exported via iSCSI through the Storage Controller.
> openstack is using a replicated/distributed gluster volume for all disk back ends - including ephemeral (local) and volume-backed (iSCSI).
> Results are here and are kinda staggering:
> http://skvidal.fedorapeople.org/misc/cloudbench.txt
> In short - gluster performance really bogs us down when building in the cloud instances.
> Thoughts on improving that performance? Or do we simply want to keep certain workloads in euca, specifically because the disks are more disposable and/or faster?
> -sv
Gluster currently doesn't proclaim running VM images as one of its strong points, although many do use it in that capacity. A few months back a well-funded group attempted to use Gluster as storage for their CloudStack deployment and were unable to tune performance to acceptable levels. In the end they abandoned their Gluster efforts and moved to Ceph for their distributed storage. That said, I wouldn't expect any distributed storage solution to get close to DAS speed. If you are willing to live with the downsides of DAS, it is almost always going to be faster.
--David
On Sun, 7 Oct 2012, David Nalley wrote:
> [...] That said, I wouldn't expect any distributed storage solution to get close to DAS speed. If you are willing to live with the downsides of DAS, it is almost always going to be faster.
And the point of the comparison was not to say "DAS is sooo much better".
It was to see whether we can improve the gluster performance some, or whether we should tailor our use of each cloud based on workload/type of use. The other point was simply to be aware of the difference.
We use gluster in other places (fedorahosted) and quite like it. That doesn't mean it is perfect for every case, but the only way we're going to be sure is to benchmark. That's all I was trying to provide.
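(And for the build-ish workload specifically, a pile of small file creates and deletes is probably a more honest test than big sequential writes - purely a sketch, the mount point and tarball path are illustrative:)

    # thousands of small writes, then a metadata-heavy cleanup -
    # roughly the I/O shape of an actual package build
    cd /mount/under/test
    time tar xf ~/linux-3.6.tar.bz2
    time rm -rf linux-3.6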
-sv