============================================
#fedora-meeting: Infrastructure (2013-02-21)
============================================
Meeting started by nirik at 19:00:01 UTC. The full logs are available at
http://meetbot.fedoraproject.org/fedora-meeting/2013-02-21/infrastructure...
.
Meeting summary
---------------
* welcome y'all (nirik, 19:00:01)
* New folks introductions and Apprentice tasks. (nirik, 19:02:15)
* new easyfix tasks welcome, team members are encouraged to try and
file tickets for them. (nirik, 19:05:28)
* Applications status / discussion (nirik, 19:06:17)
* pingou has vastly simplified the pkgdb db. (nirik, 19:07:42)
* new pkgdb-cli pushed out as well as copr-cli (nirik, 19:08:16)
* fas release being tested in staging, for 2013-02-28 release to prod.
(nirik, 19:08:57)
* askbot is now sending fedmsgs. (nirik, 19:11:56)
* more fas-openid testing welcome. Has worked for those folks that
have tried it so far. (nirik, 19:15:29)
* fedocal ready for 1.0 tag and review process. (nirik, 19:16:16)
* LINK:
http://elections-dev.cloud.fedoraproject.org/ (abadger1999,
19:16:30)
* testing on new elections version welcome:
http://elections-dev.cloud.fedoraproject.org/ (make account in
fakefas) (nirik, 19:17:04)
* will try out an f18 server for mm3 staging testing and feel out an
updates policy, etc. Possibly using snapshots more. (nirik,
19:33:27)
* will look at moving fas-openid to prod as soon as is feasible.
(nirik, 19:33:46)
* feedback on github reviews of all commits welcome. (nirik,
19:39:04)
* mirrormanager update to 1.4 soon. (nirik, 19:39:11)
* Sysadmin status / discussion (nirik, 19:43:00)
* smooge got our bnfs01 server's disks working again. (nirik,
19:43:56)
* nagios adjustments in progress (nirik, 19:44:30)
* arm boxes will get new net friday hopefully (nirik, 19:45:07)
* mass reboot next wed (tentative) for rhel 6.4 upgrades. (nirik,
19:47:52)
* Private Cloud status update / discussion (nirik, 19:52:50)
* euca cloudlet limping along after upgrade. (nirik, 19:55:11)
* work ongoing to bring openstack cloudlet up to a more production-ready state
(nirik, 19:55:26)
* please see skvidal if you want to get involved in our private cloud
setup (nirik, 20:01:29)
* Upcoming Tasks/Items (nirik, 20:01:33)
* 2013-02-28 end of 4th quarter (nirik, 20:01:44)
* 2013-03-01 nag fi-apprentices (nirik, 20:01:44)
* 2013-03-07 remove inactive apprentices. (nirik, 20:01:44)
* 2013-03-19 to 2013-03-26 - koji update (nirik, 20:01:44)
* 2013-03-29 - spring holiday. (nirik, 20:01:44)
* 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze (nirik,
20:01:46)
* 2013-04-16 F19 alpha release (nirik, 20:01:48)
* 2013-05-07 to 2013-05-21 BETA infrastructure freeze (nirik,
20:01:50)
* 2013-05-21 F19 beta release (nirik, 20:01:52)
* 2013-05-31 end of 1st quarter (nirik, 20:01:54)
* 2013-06-11 to 2013-06-25 FINAL infrastructure freeze. (nirik,
20:01:56)
* 2013-06-25 F19 FINAL release (nirik, 20:01:58)
* Open Floor (nirik, 20:02:49)
Meeting ended at 20:04:14 UTC.
Action Items
------------
Action Items, by person
-----------------------
* **UNASSIGNED**
* (none)
People Present (lines said)
---------------------------
* nirik (143)
* skvidal (99)
* abadger1999 (47)
* pingou (24)
* abompard (15)
* smooge (10)
* mdomsch (10)
* threebean (6)
* zodbot (5)
* SmootherFrOgZ (4)
* cyberworm54 (4)
* lmacken (2)
* maayke (1)
* ricky (0)
* dgilmore (0)
* CodeBlock (0)
--
19:00:01 <nirik> #startmeeting Infrastructure (2013-02-21)
19:00:01 <zodbot> Meeting started Thu Feb 21 19:00:01 2013 UTC. The chair is nirik.
Information about MeetBot at
http://wiki.debian.org/MeetBot.
19:00:01 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:01 <nirik> #meetingname infrastructure
19:00:01 <zodbot> The meeting name has been set to 'infrastructure'
19:00:01 <nirik> #topic welcome y'all
19:00:01 <nirik> #chair smooge skvidal CodeBlock ricky nirik abadger1999 lmacken
dgilmore mdomsch threebean
19:00:01 <zodbot> Current chairs: CodeBlock abadger1999 dgilmore lmacken mdomsch
nirik ricky skvidal smooge threebean
19:00:13 * skvidal is here
19:00:15 <nirik> hello everyone. who's around for an infrastructure meeting?
19:00:15 <smooge> not guilty
19:00:23 * cyberworm54 is here
19:00:25 * lmacken
19:00:26 * threebean is kinda here
19:00:28 * maayke is here
19:00:33 * abadger1999 here
19:00:40 * pingou here
19:00:52 * SmootherFrOgZ here
19:02:08 <nirik> ok, I guess lets go ahead and dive in...
19:02:15 <nirik> #topic New folks introductions and Apprentice tasks.
19:02:30 <nirik> any new folks like to introduce themselves? or apprentices with
questions or comments?
19:03:04 <cyberworm54> Hi I am an apprentice and hopefully I can learn and
contribute as much as I can
19:03:31 <nirik> welcome (back) cyberworm54
19:03:57 <cyberworm54> Thanks!
19:04:01 <nirik> to digress a bit... do folks think our apprentice setup is working
well? or is there anything we can do to improve it?
19:04:20 <nirik> I think the biggest problem is new people getting up to speed and
finding things they can work on.
19:04:52 <skvidal> nirik: also - we have a fair amount more code-related tasks than
general admin tasks that newcomers can get into
19:04:56 <nirik> we are also low on new easyfix tickets, particularly in the
sysadmin side.
19:05:02 <nirik> yeah.
19:05:14 <cyberworm54> it is a bit ...confusing but once you get to the docs and
actually read it you have a start point
19:05:28 <nirik> #info new easyfix tasks welcome, team members are encouraged to try
and file tickets for them.
19:06:06 <nirik> ok, moving on then I guess.
19:06:17 <nirik> #topic Applications status / discussion
19:06:27 <nirik> any application / development news this week or upcoming?
19:06:46 <pingou> I've been doing some cleanup on the pkgdb db scheme
19:06:49 <pingou> before:
http://ambre.pingoured.fr/public/pkgdb.png
19:06:57 <pingou> after:
http://ambre.pingoured.fr/public/pkgdb2.png
19:07:25 <pingou> that's with the help of abadger1999 :)
19:07:29 <nirik> wow. nice!
19:07:29 <lmacken> nice ☺
19:07:42 <nirik> #info pingou has vastly simplified the pkgdb db.
19:07:46 * abadger1999 just reviews and makes suggestions to what pingou writes ;-)
19:07:54 <pingou> pushed a new version of pkgdb-cli (waiting to arrive in testing)
and pushed upstream a new version of copr-cli
19:08:16 <nirik> #info new pkgdb-cli pushed out as well as copr-cli
19:08:19 <abadger1999> New fas release is finally out the door. Planning to upgrade
production on Feb 28.
19:08:29 <pingou> abadger1999 and I have started to think about pkgdb2 basically,
schema update is the first step
19:08:56 <abadger1999> pkgdb -- yeah, and pkgdb2 api is probably going to be the
second step
19:08:57 <nirik> #info fas release being tested in staging, for 2013-02-28 release
to prod.
19:09:19 <abadger1999> as a note for admins -- the fas release that introduced
fedmsg introduced a bug that you should know about
19:09:40 <SmootherFrOgZ> btw, there's a bunch of locale fixes in the new fas
release
19:09:41 <abadger1999> email verification when people change their email address was
broken.
19:09:50 <nirik> thats the one we have in prod, but we have hotfixed it right?
19:10:00 <SmootherFrOgZ> would be good to test fas with different languages
19:10:32 <nirik> cool.
19:10:39 <abadger1999> it would change the email when the user first entered the
updated email in the form instead of waiting for them to confirm that they received the
verification email.
19:10:45 <nirik> I saw in stg that it also has the 'no longer accept just
yubikey for password' in.
19:11:37 <threebean> askbot got fedmsg hooks in production this week. there are
some new bugs to chase down regarding invalid sigs and busted links..
19:11:41 <nirik> any other application news? oh...
19:11:56 <nirik> #info askbot is now sending fedmsgs.
19:11:58 <threebean> Latest status ->
http://www.fedmsg.com/en/latest/status/
19:12:08 <skvidal> fedmsg.com? wow
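For anyone chasing those askbot messages, fedmsg topics follow the org.fedoraproject.<env>.<service>.<event> naming convention; a minimal sketch of splitting one apart (the example topic is illustrative, not confirmed against askbot's actual emitter):

```python
def parse_fedmsg_topic(topic):
    """Split a fedmsg topic into its conventional parts.

    Topics look like org.fedoraproject.<env>.<service>.<event>,
    e.g. org.fedoraproject.prod.askbot.post.edit (illustrative).
    """
    parts = topic.split(".")
    if len(parts) < 5 or parts[:2] != ["org", "fedoraproject"]:
        raise ValueError("not a fedoraproject fedmsg topic: %r" % topic)
    env, service = parts[2], parts[3]
    event = ".".join(parts[4:])  # event names may themselves contain dots
    return env, service, event
```

Filtering a message stream down to one service's events is then a matter of comparing the parsed service field.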
19:12:25 <nirik> Has anyone had a chance to test patrick's fas-openid dev
instance? any feedback for him?
19:12:26 <abadger1999> nirik: Hmm... looks like production isn't hotfixed.
19:12:30 <skvidal> threebean: what's the status on fedmsg emitters from outside
of the vpn?
19:12:35 <abadger1999> nirik: but next fas release will have the fix.
19:12:40 <nirik> abadger1999: :( I thought we did. ok.
19:12:47 <threebean> skvidal: no material progress yet, but I've been thinking
it over.
19:12:50 <abadger1999> Can we wait until Thursday?
19:13:01 <skvidal> threebean: okay thanks
19:13:04 <threebean> skvidal: I have some janitorial work to do.. then that's
next on my list.
19:13:21 <skvidal> threebean: that's the limiting factor for adding notices from
coprs, I think
19:13:29 <nirik> abadger1999: I suppose
19:14:12 * threebean nods
19:14:18 <abadger1999> I've used fas-openid but not tested it heavily. It has
worked and looks nice. puiterwijk has a flask-fas-openid auth plugin that he's tested
and converted fedocal, IIRC, to use it.
19:14:41 <nirik> yeah, it's worked for me for a small set of sites I tested.
19:15:22 <pingou> speaking of fedocal, I need to tag 0.1.0 and put it up for review
19:15:29 <nirik> #info more fas-openid testing welcome. Has worked for those folks
that have tried it so far.
19:15:41 <pingou> the current feature requests will have to wait for the next
release...
19:15:57 <nirik> pingou: yeah. Will be good to get it setup. :)
19:16:15 <abadger1999> Oh, fchiulli has a new version of elections that's ready
for some light testing
19:16:16 <nirik> #info fedocal ready for 1.0 tag and review process.
19:16:24 <pingou> abadger1999: oh cool!
19:16:30 <abadger1999>
http://elections-dev.cloud.fedoraproject.org/
19:16:31 <nirik> abadger1999: cool. Is there an instance up?
19:16:34 <nirik> nice.
19:16:44 <skvidal> nirik: should be
19:16:48 <abadger1999> You need to make an account on fakefas in order to try it
out.
19:17:04 <nirik> #info testing on new elections version welcome:
http://elections-dev.cloud.fedoraproject.org/ (make account in fakefas)
19:17:06 <abadger1999> Please do try it out.
19:17:06 <skvidal> abadger1999: is elections switching to fas-openid, too?
19:17:24 <pingou> abadger1999: and the code is ?
19:17:45 <abadger1999> skvidal: I believe it is using flask-fas right now because
flask-fas-openid isn't in a released python-fedora yet.
19:18:03 <abadger1999> pingou:
https://github.com/fedora-infra/elections
19:18:14 <skvidal> abadger1999: got it
19:18:19 <pingou> abadger1999: great
19:18:20 <skvidal> abadger1999: thx
19:18:36 <abadger1999> np
19:18:47 <nirik> I have one more application type thing to discuss... dunno if
abompard is still awake, but we should discuss mailman3. ;)
19:18:51 <abadger1999> I am all for moving more things over to the flask-fas-openid
plugin though.
19:19:15 * nirik is too.
19:19:33 <nirik> anyhow, we are looking at setting up a mailman3 staging to do some
more testing and shake things out.
19:19:41 <nirik> however, mailman3 needs python 2.7
19:19:43 <abompard> nirik: yeah
19:20:06 <nirik> so, it seems: a) rhel6 + a bunch of python rpms we build and
maintain against python 2.7
19:20:12 <nirik> or b) fedora 18 instance
19:20:30 <smooge> abadger1999, congrats on election stuff
19:20:38 <abompard> yes, and MM3 really does not work on python 2.6, sadly
19:20:47 * pingou question: which one will be out first: EL7 or MM3? :-p
19:20:55 <nirik> we are starting to have more fedora in our infra (for example, the
arm builders are all f18)
19:21:09 <nirik> so, we might want to come up with some policy/process around them.
Like when do to updates, etc.
19:21:09 <abadger1999> smooge: thanks. It was all fchiulli though :-) I told him
he can be the new owner of the code too :-)
19:21:13 <abompard> I've already rebuilt an application for a non-system python,
and it's not much fun
19:21:33 <smooge> bwahahahah
19:21:33 <abompard> as in non-scriptable
19:21:58 <nirik> yeah, it's pain either way...
19:21:59 * abadger1999 thinks fedora boxes are going to be preferable to non-system
python.
19:22:07 <pingou> +1
19:22:09 <skvidal> nirik: an idea
19:22:11 <nirik> I'm leaning that way as well.
19:22:16 <abompard> by the way, Debian has a strange but nifty packaging policy for
python packages that makes them work with all the installed versions of python
19:22:21 <smooge> I think we should make a bunch of servers rawhide
19:22:40 <skvidal> abompard: I assume the db /data for mm3 is all separate from
where it needs to run, right
19:22:46 <abadger1999> abompard: yeah -- I've looked at the policy but not the
implementation. But every time I've run it by dmalcolm, he's said he doesn't
like it.
19:23:04 <abadger1999> abompard: i think some of that might be because he has looked
at the implementation :-)
19:23:05 <abompard> abadger1999: understandably, it's symlink-based
19:23:17 <abompard> skvidal: yeah, to some extent
19:23:23 <skvidal> nirik: I wonder if we could have 2 instances - talking to the
same db - so we could update f18 to latest - run mm3 on it in r/o mode - to make sure it
is working
19:23:27 <abompard> skvidal: it has local spool directories
19:23:30 <skvidal> nirik: then just pass the ip over to the other one
19:23:40 <nirik> in the past we have been shy of fedora instances because of the
massive updates flow I think, as well as possible bugs around those updates. I think
it's gotten much better in the last few years (I like to think due to the updates
policy, but hard to say)
19:23:59 <skvidal> nirik: which is why I was thinking we don't do updates to the
RUNNING instance
19:24:07 <skvidal> we just swap out the instance that is in use/has that ip
19:24:08 <abadger1999> ... or less contributors? /me ducks and runs
19:24:16 <nirik> :)
19:24:22 <skvidal> nirik: so we test the 'install'
19:24:24 <nirik> skvidal: right, so an extra level of staging?
19:24:31 <skvidal> nirik: one level, really
19:24:32 <abompard> skvidal: I don't know how MM3 will handle a read-only DB
19:24:37 <skvidal> prod and staging
19:25:02 <nirik> well, right now we are talking about a staging instance only, but
yeah, I see what you mean. we could do something along those lines.
19:25:17 <nirik> I also think for some use cases it's not as likely to break...
19:25:36 <nirik> ie, for mailman, postfix and mailman and httpd all need to work,
but it doesn't need super leaf nodes right?
19:25:39 <skvidal> abompard: understood
19:26:02 <skvidal> nirik: anyway - just an idea
19:26:04 <skvidal> nirik: ooo - actually
19:26:13 <skvidal> nirik: I just had a second idea that you will either hate or
love
19:26:14 <nirik> where as for something like a pyramid app, it would be a much more
complex stack
19:26:16 <skvidal> nirik: snapshots
19:26:16 <abompard> skvidal: we may get bugs because of that, not because of the
upgrade
19:26:30 <skvidal> nirik: we snapshot the running instance in the cloud
19:26:32 <nirik> yeah, we could do that too.
19:26:32 <skvidal> nirik: upgrade it
19:26:36 <skvidal> and if it dies - roll it out
19:27:04 <abompard> for the moment it will only be low-traffic lists anyway
19:27:22 <abompard> and I must check that, but if MM is not running, I think postfix
keeps the message
19:27:30 <abadger1999> skvidal: how would that work in terms of data? would we keep
the db and local spool directory separate from the snapshots?
19:27:33 <abompard> and re-delivers when MM starts
19:27:34 <skvidal> abompard: yes
19:27:35 <nirik> FWIW, I run f18 servers at home here, and they have been pretty
darn stable. (as they were when f17... earlier releases had more breakage from my
standpoint)
19:27:41 <skvidal> err
19:27:41 <skvidal> abadger1999: yes
19:27:44 <abadger1999> Cool.
19:28:11 <skvidal> abadger1999: no reason we can't have a mm3-db server in the
cloud :)
19:28:12 * abadger1999 kinda likes that. although possibly he just doesn't know all
the corner cases there :-)
19:28:16 <nirik> yeah. I'm sure we could do something with snapshots.
19:28:21 <skvidal> anyway - just an idea
19:28:23 <skvidal> nothing in stone
19:28:27 <nirik> yeah.
19:29:06 <nirik> also, for updates, we may just do them on the same schedule as rhel
ones, unless something security-related comes up in an exposed part... ie, just look at the httpd,
etc not the entire machine.
19:29:42 <nirik> anyhow, all to be determined, we can feel out a policy.
19:29:49 <nirik> anything else on the applications side?
19:29:56 <abadger1999> I have two more
19:30:00 <abadger1999> Do we have a schedule for getting fas-openid into
production?
19:30:28 <nirik> abadger1999: I think it's ready for stg for sure now... but not
sure when prod...
19:30:58 <nirik> I'm fine with rolling it out as fast as we are comfortable
with.
19:31:03 <nirik> I'd like to see it get more use. ;)
19:31:04 <abadger1999> I think we're coming along great. But if we're going
to start migrating apps to use fas-openid/telling people to use it when developing their
apps (like elections), then we need to have a plan for getting it into prod
19:31:09 <abadger1999> <nod>
19:31:19 <abadger1999> nirik: it's setup to replace the current fas urls?
19:31:34 <nirik> abadger1999: not fully sure on that. I think so...
19:31:36 * abadger1999 was wondering if we could deploy it and just not announce it for a
few weeks
19:31:46 <nirik> thats a thought.
19:32:22 <abadger1999> alright -- I guess let's talk about this more on Friday
after our classroom session with puiterwijk :-)
19:32:26 <nirik> Oddly I have noticed that for things like askbot you get two
different "users" with different urls.
19:32:28 <nirik> yeah
19:33:05 <abadger1999> Other thing is for all the devs here, how's the
"review all changes" idea working out?
19:33:27 <nirik> #info will try out an f18 server for mm3 staging testing and feel
out an updates policy, etc. Possibly using snapshots more.
19:33:39 <abadger1999> I've liked how it works with pingou, puiterwijk, and
SmootherFrogZ for fas, python-fedora, and packagedb.
19:33:46 <nirik> #info will look at moving fas-openid to prod as soon as is
feasible.
19:33:55 <skvidal> abompard: how much space do you need on the mm server itself - if
you are not storing the db there?
19:33:59 <abadger1999> lmacken: Is it working okay for bodhi and such too?
19:34:07 <abadger1999> anything that's falling through the cracks?
19:34:14 <abompard> skvidal: I need to check that
19:34:21 <nirik> skvidal: if we are doing this as a real staging, we might want to
just make a real 'lists01.stg.phx2' virthost instead of cloud?
19:34:26 <pingou> abompard: I definitely like it
19:34:53 <abadger1999> Do we want to say that certain things are okay to push
without review? (making a release would be a candidate...I was going to suggest
documentation earlier but pingou found a number of problems with my documentation patch
:-)
19:34:53 <pingou> abadger1999: ^ :)
19:34:54 <skvidal> nirik: okay - I didn't know if we wanted to be cloud-er-fic
about it or not
19:35:01 <skvidal> nirik: thx
19:35:31 <nirik> skvidal: yeah, I'm open to either, but I think right now until
we have less fog in our clouds, a real one might be better for this... but either way
19:35:53 <skvidal> nirik: well - with attached persistent volumes - using one of the
qcow imgs is non-harmful
19:35:55 <nirik> abadger1999: I like seeing the extra review. I've not done much
reviewing myself. ;)
19:36:06 <abompard> skvidal: not much, a few hundred MB
19:36:08 <skvidal> nirik: but I agree about fog
19:36:17 * abadger1999 notes that threebean is in another meeting but said he still likes
the idea but hasn't done it consistently all the time. So more experimentation with
it needed.
19:36:43 * abadger1999 liked that nb reviewed a documentation update the other day :-)
19:37:02 <pingou> I think it can bring us new contributors
19:37:21 <pingou> some of them are easyfix
19:37:31 <pingou> others are bigger and then might need more experienced reviewers
19:37:57 <nirik> yeah
19:38:21 <nirik> welcome mdomsch
19:38:41 <abadger1999> Yeah. I agree. it's nice to have someone else's
eyes on the bigger fixes even if they're relatively new too, though. It's better
than before where I would have committed it without any review at all.
19:38:49 <mdomsch> better late than never
19:38:51 <nirik> that reminds me, mdomsch was going to look at updating mm in prod
to 1.4 on friday... if not then, then sometime soon. ;)
19:39:04 <nirik> #info feedback on github reviews of all commits welcome.
19:39:11 <mdomsch> anyone have any grief with doing a major MM upgrade tomorrow
afternoon?
19:39:11 <nirik> #info mirrormanager update to 1.4 soon.
19:39:47 <abadger1999> mdomsch: If you're around in case it goes sideways it
would be very nice.
19:39:51 <mdomsch> everything I know I've broken, I've fixed. Now it's
time to test in production. :-)
19:39:52 <nirik> I think it should be fine. We can be somewhat paranoid and not
touch one of the apps so we have an easy fallback.
19:40:11 <abadger1999> get the fixes in that you've had pending and get us onto
a single codebase for development.
19:40:13 <mdomsch> k
19:40:25 <nirik> (until we are sure the others are all working right I mean)
19:40:31 <mdomsch> right
19:40:34 <mdomsch> so bapp02, then app01
19:40:47 * nirik nods.
19:40:48 <mdomsch> and I'll stop the automatic push from bapp02 to app*
19:40:58 <nirik> sounds good.
19:41:00 <mdomsch> until we're comfortable. Worst case, we have slightly stale
data for a few hours
19:41:21 * nirik nods.
19:41:28 <abadger1999> instead of "if you're around" it would've
been clearer for me to say "as long as you're around" :-)
19:41:43 <nirik> mdomsch: you've picked up all the hotfixes into 1.4 right?
19:41:46 <mdomsch> abadger1999: naturally; I'm not around nearly as much
19:41:56 <abadger1999> Yeah. we miss you ;-)
19:42:16 <nirik> abadger1999: +1 :)
19:42:40 <nirik> anyhow, any other application news? or shall we move on?
19:43:00 <nirik> #topic Sysadmin status / discussion
19:43:06 <mdomsch> nirik: yes I pulled them all in while at FUDCon
19:43:17 <nirik> lets see... this week smooge was out at phx2 for a whirlwind tour.
19:43:22 <nirik> mdomsch: cool.
19:43:45 <nirik> #info smooge got out bnfs01 server's disks working again.
19:43:51 <nirik> #undo
19:43:51 <zodbot> Removing item from minutes: <MeetBot.items.Info object at
0x281d8c50>
19:43:56 <nirik> #info smooge got our bnfs01 server's disks working again.
19:44:09 <smooge> kind of sort of
19:44:19 <nirik> I've been tweaking nagios of late... hopefully making it
better.
19:44:30 <nirik> #info nagios adjustments in progress
19:44:56 <nirik> We should have net for the rest of the arm boxes friday.
19:45:07 <nirik> #info arm boxes will get new net friday hopefully
19:45:14 <skvidal> I had a discussion with the author of pynag this morning
19:45:49 <nirik> cool. Worth using for a tool for us to runtime manage nagios?
19:45:50 <skvidal> if we have people willing to spend some time - we could easily
build a query tool/cli-tool for nagios downtimes/acknowledgements/etc
19:46:07 <nirik> that would be quite handy, IMHO
19:46:12 <skvidal> nirik: it needs some code to make it work - but I think the basic
functionality is available
19:46:41 <nirik> for some things the ansible nagios module would do, but for others
it would be nice to have a command line.
19:47:15 <nirik> I'd like to look at doing a mass reboot next wed or so...
upgrade everything to rhel 6.4.
19:47:17 <SmootherFrOgZ> skvidal: interesting!
19:47:37 <nirik> Might do staging today/tomorrow to let it soak there and see if any
of our stuff breaks. ;)
19:47:52 <nirik> #info mass reboot next wed (tentative) for rhel 6.4 upgrades.
19:47:59 <skvidal> nirik: right - I'd like to be able to enhance the ansible
nagios module to be more idempotent and 'proper'
19:48:04 <skvidal> nirik: pynag _could_ do that
19:48:17 <nirik> yeah, it looks very bare bones right now.
19:48:35 <nirik> in particular we could use a 'downtime for host and all
dependent hosts' type thing
19:48:52 <skvidal> nirik: we could also use a 'give me the state of this
host'
19:48:58 <skvidal> without having to go to the webpage
19:49:14 <skvidal> according to palli (a pynag developer) it can read status.dat
19:49:15 <skvidal> from nagios
19:49:18 <smooge> I am looking at lldpd for our PHX2 systems
http://vincentbernat.github.com/lldpd/ Mainly to better get an idea of where things are
19:49:20 <skvidal> to determine ACTUAL state
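A downtime cli-tool along these lines can start as small as writing Nagios external commands to the command file (the mechanism pynag wraps); a minimal sketch, with the command-file path as an assumption since it is whatever command_file in nagios.cfg points at:

```python
import time

# Assumed path; the real one is set by command_file in nagios.cfg.
NAGIOS_CMD_FILE = "/var/spool/nagios/cmd/nagios.cmd"

def host_downtime_command(host, minutes, author, comment):
    """Build a SCHEDULE_HOST_DOWNTIME external command string.

    Format: [ts] SCHEDULE_HOST_DOWNTIME;host;start;end;fixed;
    trigger_id;duration;author;comment -- fixed=1 means the
    downtime runs exactly from start to end.
    """
    now = int(time.time())
    return "[%d] SCHEDULE_HOST_DOWNTIME;%s;%d;%d;1;0;%d;%s;%s" % (
        now, host, now, now + minutes * 60, minutes * 60, author, comment)

def schedule_host_downtime(host, minutes, author, comment):
    # Appending the line to the command FIFO is what actually
    # hands the command to the running nagios daemon.
    with open(NAGIOS_CMD_FILE, "a") as f:
        f.write(host_downtime_command(host, minutes, author, comment) + "\n")
```

Acknowledgements and the "downtime for host and all dependents" case would follow the same pattern with other external-command verbs.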
19:49:26 <nirik> finally in the sysamin world, I'd really like to poke ansible
more and get it to where we can use it for more hosts. Keep getting sidetracked, but it
will happen! :)
19:51:04 <nirik> smooge: another thing we could look at there is
http://linux-ha.org/source-doc/assimilation/html/index.html (it uses lldpd type stuff).
They are about to have their first release... so very early days.
19:51:33 <smooge> ah cool
19:51:38 <smooge> will look at that also
19:51:55 <nirik> oh, on nagios, I set an option: soft_state_dependencies=1
19:52:22 <nirik> this hopefully will help us not get the flurry of notices when a
machine is dropping on and off the net, or has too high a load to answer, then answers
again.
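For reference, that option lives in the main Nagios configuration file; a minimal fragment (the comment wording is ours, not from the meeting):

```
# nagios.cfg (main configuration file)
# Apply host/service dependency logic on SOFT states too, so
# dependent checks and notices are suppressed as soon as a parent
# starts failing, rather than waiting for a HARD state.
soft_state_dependencies=1
```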
19:52:50 <nirik> #topic Private Cloud status update / discussion
19:53:01 <nirik> skvidal: want to share your pain where we are with cloudlets? :)
19:53:08 <skvidal> sure
19:53:23 <skvidal> last week I did the euca upgrade and the wheels came right off
19:53:29 <skvidal> and then it plunged over a cliff
19:53:31 <skvidal> into a volcano
19:53:41 <pingou> sounds like a lot of fun
19:53:42 <skvidal> where it was eaten by a volcano monster
19:53:54 <smooge> who was riding a yak
19:53:56 <skvidal> anyway the euca instance is limping along at the moment with
not-occasional failures :(
19:54:04 <skvidal> smooge: and the yak had to be shaven
19:54:17 <pingou> brought back some pictures >
19:54:19 <pingou> ?
19:54:21 <skvidal> so...
19:54:35 <skvidal> I've been working on porting our imgs/amis/etc over to
openstack
19:54:44 <skvidal> and getting things more production-y in the openstack instance -
19:54:58 <skvidal> I got ssl working around the ec2 api for openstack
19:55:11 <nirik> #info euca cloudlet limping along after upgrade.
19:55:12 <skvidal> working on ssl'ing the other items
19:55:18 <skvidal> for the past couple of days
19:55:26 <nirik> #info work ongoing to bring openstack cloudlet up to a more
production-ready state
19:55:28 <skvidal> I've been in a fist fight with openstack and qcow images
19:55:33 <skvidal> and resizing disks
19:55:47 <skvidal> I just got confirmation from someone that what we want to do is
just not possible at the moment :)
19:55:54 <nirik> lovely. ;(
19:56:10 <skvidal> nirik: not until we get the initramdisk to resize the partitions
:(
19:56:17 <skvidal> so - I'm punting on this
19:56:24 <skvidal> I just put in a new ami and kernel/ramdisk combo
19:56:29 <skvidal> that's rhel6.4 latest
19:56:30 <smooge> sometimes that is best
19:56:35 <nirik> yeah. I think that could work, but needs some time to get working
right. Hopefully by the cloud-utils maintainer. ;)
19:56:38 <skvidal> and since it is an AMI it resizes the disks
19:56:50 <skvidal> what it DOES NOT DO is follow the kernel on the disk - it uses
the one(s) in the cloud
19:56:54 <skvidal> which is suck
19:57:00 <skvidal> but at least it is known/obvious suck
19:57:08 <nirik> but it should also get us moving past it for now.
19:57:11 <skvidal> I've also just built a new qcow from rhel6.4
19:57:27 <skvidal> so for systems that don't need to be on-the-fly made - we can
spin them up
19:57:31 <skvidal> growpart the partition
19:57:33 <skvidal> reboot
19:57:35 <skvidal> resize
19:57:37 <skvidal> and go
19:57:47 <skvidal> and i'm working on a playbook to handle all of the above for
you
19:57:51 <skvidal> and, yes, it makes me cry inside
19:58:08 <nirik> ;(
19:58:12 <skvidal> that's where we are at the moment
19:58:26 <skvidal> I am making new keys/accounts/tenants/whatever
19:58:35 <skvidal> for our lockbox 'admin' user
19:58:40 <skvidal> for making persistent instances
19:58:53 * nirik nods.
19:58:57 <skvidal> the next step is to start making use of the resource tags in
openstack
19:59:02 <skvidal> so we can more easily track all this shit
19:59:15 <skvidal> also I have to make a bunch of volumes and rsync over all the
data from the euca volumes :(
19:59:30 <skvidal> I fully expect that last part to be a giant example of suffering
19:59:46 <nirik> yeah. we should probably move one set of instances first and sort
out if there's any doom
19:59:57 <skvidal> if I sound kinda 'bleah' there's a reason
20:00:02 <skvidal> nirik: I thought I'd start with the fartboard
20:00:07 <nirik> heh. ok
20:00:37 <skvidal> nirik: also - now that we have instance tags - it should be
doable to write a simple 'start me up' script using ansible to spin out the
instances
20:00:40 <skvidal> and KNOW where they are
20:00:48 <nirik> ok, we are running over time... let me quickly do upcoming and open
floor. ;)
20:00:52 <skvidal> sorry
20:00:55 <skvidal> thx
20:00:57 <nirik> thats fine. ;) all good info
20:01:04 <skvidal> one last thing
20:01:08 <skvidal> if anyone wants to get involved
20:01:08 <skvidal> ping me
20:01:29 <nirik> #info please see skvidal if you want to get involved in our private
cloud setup
20:01:33 <nirik> #topic Upcoming Tasks/Items
20:01:42 <nirik> (big paste)
20:01:44 <nirik> #info 2013-02-28 end of 4th quarter
20:01:44 <nirik> #info 2013-03-01 nag fi-apprentices
20:01:44 <nirik> #info 2013-03-07 remove inactive apprentices.
20:01:44 <nirik> #info 2013-03-19 to 2013-03-26 - koji update
20:01:44 <nirik> #info 2013-03-29 - spring holiday.
20:01:46 <nirik> #info 2013-04-02 to 2013-04-16 ALPHA infrastructure freeze
20:01:48 <nirik> #info 2013-04-16 F19 alpha release
20:01:50 <nirik> #info 2013-05-07 to 2013-05-21 BETA infrastructure freeze
20:01:52 <nirik> #info 2013-05-21 F19 beta release
20:01:54 <nirik> #info 2013-05-31 end of 1st quarter
20:01:56 <nirik> #info 2013-06-11 to 2013-06-25 FINAL infrastructure freeze.
20:01:58 <nirik> #info 2013-06-25 F19 FINAL release
20:02:00 <nirik> anything people want to schedule/note etc?
20:02:07 <nirik> I'll add the fas update and the mass reboot.
20:02:20 <abadger1999> Sounds good.
20:02:49 <nirik> #topic Open Floor
20:02:54 <nirik> Anyone have items for open floor?
20:03:32 <pingou> I have a series of blog posts 'Fedora-Infra: Did you know?'
coming, like once a week for the coming 4 weeks
20:03:32 <nirik> ok.
20:03:42 <skvidal> pingou: wow
20:03:46 <nirik> pingou: awesome. More blog posts would be great.
20:03:49 <pingou> short stuff, speaking about some cool features/ideas
20:03:52 <skvidal> pingou: looking forward to seeing those
20:04:10 <nirik> Thanks for coming everyone. Do continue over on our regular
channels. :)
20:04:14 <nirik> #endmeeting