I've been playing around with glusterfs for the last few days, and I
thought I would send out a note about what I found and ideas for
how we could use it. ;)
glusterfs is a distributed filesystem. It's actually very easy to
set up and manage, which is nice. ;)
You can set up a local one-node gluster volume in just a few commands:
yum install glusterfs\*
service glusterd start
gluster volume create testvolume yourhostname:/testbrick
gluster volume start testvolume
mkdir -p /mnt/testvolume
mount -t glusterfs yourhostname:/testvolume /mnt/testvolume
Setting up multiple nodes/peers/bricks is pretty easy.
Setting up data distribution and replication is pretty easy too.
The replication seems to work pretty transparently. I set up a 2-node
cluster and kill -9'ed the gluster processes on one node; the other kept
on trucking just fine, and the first resynced fine after I restarted it.
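For reference, a rough sketch of the 2-node replicated setup described above. The hostnames gluster1/gluster2, the brick path and the run helper are placeholders made up for illustration; by default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: hostnames, brick path and volume name are placeholders.
NODE1=${NODE1:-gluster1}
NODE2=${NODE2:-gluster2}
BRICK=${BRICK:-/testbrick}
VOLUME=${VOLUME:-testvolume}

# Print each command unless REALLY_RUN=1 is set (hypothetical helper).
run() {
    if [ "${REALLY_RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

setup_replica() {
    # Run on NODE1: add NODE2 to the trusted pool.
    run gluster peer probe "$NODE2"
    # One brick per node; "replica 2" keeps a full copy on each.
    run gluster volume create "$VOLUME" replica 2 \
        "$NODE1:$BRICK" "$NODE2:$BRICK"
    run gluster volume start "$VOLUME"
    # Show the volume type and brick list to confirm the layout.
    run gluster volume info "$VOLUME"
}

setup_replica
```

Run it with REALLY_RUN=1 as root on the first node once the printed commands look right for your hosts.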
It has an nfs mount ability, although it's not that good IMHO, as
whatever hostname you specify in the mount then becomes a single point
of failure. It could however be a handy fallback.
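A sketch of what the fallback could look like: try the native client first and only use NFS if that fails. This assumes the built-in gluster NFS server speaks NFSv3 over TCP, which is my understanding of the current releases; the run helper and function names are made up for illustration, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: server, volume and mount point are placeholders.
SERVER=${SERVER:-yourhostname}
VOLUME=${VOLUME:-testvolume}
MOUNTPOINT=${MOUNTPOINT:-/mnt/testvolume}

# Print each command unless REALLY_RUN=1 is set (hypothetical helper).
run() {
    if [ "${REALLY_RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

mount_native() {
    run mount -t glusterfs "$SERVER:/$VOLUME" "$MOUNTPOINT"
}

mount_nfs() {
    # Assumption: the gluster NFS server only offers NFSv3 over TCP.
    run mount -t nfs -o vers=3,mountproto=tcp "$SERVER:/$VOLUME" "$MOUNTPOINT"
}

mount_volume() {
    # NFS is only attempted when the native mount fails.
    mount_native || mount_nfs
}

mount_volume
```

In print-only mode just the native mount line is shown, since the NFS mount is only reached when the first command fails.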
- iptables rules are a bit annoying to set up to allow the nodes to talk
to each other.
- There is a georeplication feature to allow you to replicate over a
WAN link to a slave gluster instance or directory. However, this
can't be a live instance, it's just for disaster recovery, and
currently it requires a root ssh login with a passwordless key. Pass.
- df is a bit useless, as mounts show the space on the backing volume
that you created the brick on. Unless we set up a separate mount for
each volume to use, df won't really reflect space. On the other hand, du should
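On the iptables point above, a rough sketch of the rules for node-to-node traffic. The port numbers are an assumption based on my reading of the docs for the current releases (24007-24008 for glusterd, then one port per brick counting up from 24009) and should be checked against whatever version gets deployed; the NFS server needs additional ports that are not shown. The peer addresses, brick count and run helper are placeholders.

```shell
#!/bin/sh
# Sketch only: PEERS and BRICKS are placeholders, and the port ranges
# are assumptions that may differ between gluster releases.
PEERS=${PEERS:-"192.0.2.11 192.0.2.12"}
BRICKS=${BRICKS:-4}     # number of bricks hosted on this node

# Print each command unless REALLY_RUN=1 is set (hypothetical helper).
run() {
    if [ "${REALLY_RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

open_gluster_ports() {
    last_brick_port=$((24009 + BRICKS - 1))
    for peer in $PEERS; do
        # glusterd management traffic
        run iptables -A INPUT -p tcp -s "$peer" \
            --dport 24007:24008 -j ACCEPT
        # one port per brick, counting up from 24009
        run iptables -A INPUT -p tcp -s "$peer" \
            --dport "24009:$last_brick_port" -j ACCEPT
    done
}

open_gluster_ports
```

Clients using the native mount would need the same ports opened for their addresses.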
Possible uses for us:
We could look at using this for shared virt storage, which would let
us move things around more easily. However, there are a number of
problems with that: we would have to run it on the bare virthosts, and
we would have to switch (as far as I can tell) to filesystem .img files
for the virt images, which may not be as nice as lvm volumes. Also, we
haven't in the past really moved things around much, and libvirt allows
for migrations anyhow. So, I don't think this usage is much of a win.
So, looking at sharing application level data, I would think we would
want to set up a virt on each of our virthosts (called 'glusterN' or
something). Then we could make volumes and share them to needed
applications with the required replication/distribution needs for that
What things could we put on this?
- The tracker xapian db?
- How about other databases? I'm not sure how some dbs would handle
it, but it could let us stop the db on one host and bring it up
on another with virtually no outage. If each database is its own
volume, we can move each one around pretty easily (i.e., have 4 db
servers, move all dbs to one, reboot/update the other 3, move them
back, reboot the last, etc).
- Web/static content? Right now we rsync that to all the proxies every
hour. If we had a gluster volume for it, we could just build and
rsync to the gluster instance. Or if the build doesn't do anything
wacky, just build on there.
- Hosted and Collab data? This would be better than the current drbd
setup as we could have two instances actively using the mount/data at
the same time. We would need to figure out how to distribute requests
- Moving forward to next year/later this year if we get a hold of a
bunch of storage, how about /mnt/koji? We would need at least 2 nodes
that have enough space, but then that would get us good replication
and the ability to survive a machine crash much more easily.
- Insert your crazy idea here. What do we have that could be made
better by replication/distribution in this way? Is there enough here
to make it worth deploying?
The infrastructure team will be having its weekly meeting tomorrow,
2012-02-02 at 1900 UTC in #fedora-meeting on the freenode network.
Suggested topics (suggested by whom):
* New folks introductions and Apprentice tasks.
* 2 factor auth status
* Staging re-work status
* Upcoming outages.
* Applications status / discussion
* Upcoming Tasks/Items
2012-02-01 - 2012-02-03 dgillmore is at phx2
2012-02-07 - fas 0.8.11 final release.
2012-02-10 - drop inactive fi-apprentices
2012-02-14 to 2012-02-28 - F17 Alpha Freeze
2012-02-28 - F17 Alpha release day
* Meeting tagged tickets:
* Open Floor
Submit your agenda items as tickets in the trac instance and send a
note replying to this thread.
More info here:
So, I got to looking at search engines again the other day. In
particular the horrible horrible mediawiki one we are using on the
This pointed me to sphinx.
- There is a mediawiki sphinx plugin. (needs packaging)
- sphinx is c++ and already packaged.
- sphinx uses mysql directly to index the database contents.
- You can pass other data into it via an xml format. This could be a
pain for any non wiki setups.
It was noted that the new tagger application uses xapian as its search
- xapian is also c++
- xapian has a web crawler/indexer (omega) that could index our other
stuff more easily than sphinx.
- There's no mediawiki plugin for xapian, but we could point the wiki
search box to a site wide search using xapian.
So, there are tradeoffs either way.
Would anyone care to lead an effort to test these two?
xapian would probably be easy to test from anywhere.
sphinx might require some access to our mediawiki database, but you
could also just set up a new mediawiki, the plugin and sphinx and see
how it works there.
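For the xapian side, a quick local test might look something like the sketch below. It assumes the omindex tool from omega and the quest example tool from xapian-core are installed; as far as I know omindex walks a directory tree on disk rather than crawling over HTTP, so it would need a local copy of the content. The paths, function names and run helper are placeholders, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: database path and document root are placeholders.
DB=${DB:-/tmp/xapian-test/db}
DOCROOT=${DOCROOT:-/var/www/html}

# Print each command unless REALLY_RUN=1 is set (hypothetical helper).
run() {
    if [ "${REALLY_RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

build_index() {
    # Index every file under DOCROOT, recording URLs relative to "/".
    run omindex --db "$DB" --url / "$DOCROOT"
}

search_index() {
    # Run a free-text query against the index from the command line.
    run quest --db "$DB" "$1"
}

build_index
search_index "glusterfs"
```

Pointing DOCROOT at a copy of the static web content should be enough to get a feel for result quality.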
If no one steps up I can look at doing it next week. ;)
My name is Michael. I have been a Fedora user for roughly 10 years, and
I have also been working as a Systems Administrator for just over 5
years. I'm looking forward to lurking around and getting familiar with
the way things are run by the infrastructure team. Once I familiarize
myself with things I would like to contribute, in particular with sysadmin or
Fedora Account: kaos01