On 06/10/2011 12:24 PM, Mark McLoughlin wrote:
> On Sat, 2011-06-11 at 00:00 +1000, Justin Clift wrote:
>> On 10/06/2011, at 2:36 AM, Carl Trieloff wrote:
>>> Justin,
>>>
>>> Can we get this info linked from the Aeolus web pages?
>>
>> Ok, this is my conversion of that info to an Aeolus web page
>> (dev version):
>>
>>   http://justinclift.fedorapeople.org/aeolus/high_availability.html
> Kudos Justin and Steve, I love the focus on why this is interesting to
> users. Very cool stuff.
>
> I've tried to get my head around the architecture of the technical
> integration here, but I'm still missing some of the big picture.
> e.g.
>
>  - matahari needs to be installed in all the guests because it's what
>    does the application monitoring and restarting. That sounds like a
>    job for our "service" concept. You can include a matahari service
>    descriptor in your deployable or template and it will be configured
>    at either boot-time or build-time, respectively
Ack.

 * Build-time config would require guest images to be built to include
   the matahari packages
 * Post-boot config would need to configure matahari to talk to the
   proper brokers in the cloud
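To make that a bit more concrete, a matahari service descriptor in a
deployable could look something like the sketch below. All element and
parameter names here are hypothetical (the actual deployable schema may
well differ) -- the point is just that build-time pulls in the packages,
and post-boot hands matahari its broker details:

```xml
<deployable name="webapp-ha">
  <assemblies>
    <assembly name="frontend" hwp="small">
      <services>
        <!-- build-time: image is built with the matahari packages -->
        <service name="matahari">
          <!-- post-boot: point matahari at the cloud's QMF broker -->
          <executable url="http://example.com/scripts/configure-matahari.sh"/>
          <parameters>
            <parameter name="broker_host">
              <value>qpid.cloud.example.com</value>
            </parameter>
            <parameter name="broker_port">
              <value>5672</value>
            </parameter>
          </parameters>
        </service>
      </services>
    </assembly>
  </assemblies>
</deployable>
```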
>  - the DPE - is that on the guest and built on matahari? or is it the
>    server-side piece that talks to matahari on the guest
The DPE is part of the server-side infrastructure. It communicates with
the guests via the QMF bus/Matahari.
>  - the CPE is server-side. Is there one of these per cloud? Similar to
>    the config server? Is it plausible for it to be combined with the
>    config server somehow? e.g. if we have an AMI for the config server
>    on EC2, would we put the CPE on it too?
I'll let Steve take that one :)
>  - it sounds like CPE is going to consume the deployable XML. Does
>    conductor push this to CPE, or does CPE watch for new deployables
>    and pull it from conductor?
Ditto
>  - the assumption is that the deployable XML will describe the
>    applications that the CPE needs to monitor. They'll be described as
>    services. And the service descriptions in the deployable XML will
>    include the information required for matahari to restart the
>    daemons configured by the deployable.
Correct, though as I mentioned in another thread, I think it would be
better to call these services that need to be monitored "Resources", as
that prevents confusion with the other things we're calling services and
is closer to what other cluster stacks (for HA) do.
> at first glance, I'm not sure that's the best way of doing this.
>
> e.g. if the script associated with a service (the script runs at
> build or boot time, depending on whether the service is in the
> deployable or the template) enables some daemons and configures
> matahari with the information it needs to restart it, does that
> work?
The script associated with an HA Resource (again, avoiding "Service"
since that term is overloaded) is controlled via either a:

 * SysV/LSB init script
 * OCF-compliant Resource Agent (see the resource-agents package in Fedora)

Matahari doesn't need to be configured by post-boot or build-time
scripts on the guest with 'what Resources to monitor'. The only thing
wrt Matahari that needs to be handled is the initial
bootstrap/connection to the Broker/Bus.

Once that occurs, telling the matahari-services agent which Resources to
monitor (via a SysV/LSB/OCF RA) would be done via the DPE, based on the
information it is given about the Deployable and Assemblies.
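In other words, on the guest a Resource is just a standard init script
or resource agent -- nothing matahari-specific lives in it. A minimal
sketch of the LSB-style "status" action that matahari-services would
invoke (daemon name and pidfile path are placeholders, and a real init
script obviously needs start/stop as well):

```shell
#!/bin/sh
# Sketch of the LSB "status" action for a hypothetical daemon
# ("mydaemon"). LSB convention: return 0 if running, 3 if stopped.
# matahari-services only needs these standard actions/exit codes.

mydaemon_status() {
  pidfile="$1"
  if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
    echo "running"
    return 0
  else
    echo "stopped"
    return 3
  fi
}

# Example: use our own shell's PID as a stand-in for the daemon
echo $$ > /tmp/mydaemon.pid.$$
mydaemon_status /tmp/mydaemon.pid.$$   # prints "running"
rm -f /tmp/mydaemon.pid.$$
```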
> think about where the service script is a puppet manifest. The
> recipe describes the daemons that need to be running. Does conductor
> need to be involved here, or can matahari just figure out from the
> manifest (which will be available on the VM) that it needs to
> monitor and restart some daemons?
The manifest is just a listing of packages, no? How would we know from
that list of packages (RPMs) which packages are:

 * Daemons that need to be monitored
 * Daemons that don't need to be monitored
 * Things that aren't daemons at all

So unless the manifest contains additional information that flags a
specific package as daemon/not-daemon/HA-daemon, we don't have enough
info for the DPE to tell Matahari what to monitor.

So something in the Deployable/Assembly creation step needs to store
that information and then provide it to the CPE/DPEs.
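i.e. something along these lines would need to exist at
Deployable/Assembly creation time. This is purely hypothetical markup
(none of these attributes exist today) -- it just illustrates the kind
of flag that's currently missing:

```xml
<assembly name="frontend">
  <packages>
    <!-- a plain library: not a daemon, nothing to monitor -->
    <package name="libxml2"/>
    <!-- an HA daemon: monitored via its standard init script -->
    <package name="httpd" daemon="true" monitor="true" agent="lsb:httpd"/>
    <!-- a daemon we deliberately leave unmonitored -->
    <package name="ntpd" daemon="true" monitor="false"/>
  </packages>
</assembly>
```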
>  - does conductor have a hard-dependency on pacemaker-cloud? or is it
>    an optional dependency? or does conductor know nothing at all about
>    pacemaker-cloud?
Well, right now it certainly doesn't have a dependency, since the first
releases of Conductor will not be integrated at all with Pacemaker Cloud.
Whether or not there is a hard dependency still needs to be worked
out... One thing to keep in mind is: Pacemaker Cloud is the integration
point with Matahari. So even taking away all of the HA concepts, it's
still useful to rely on pcmk-cloud purely for the guest monitoring and
introspection aspects. So I think it would make sense to always use
Conductor with pcmk-cloud, and the HA functionality specifically could
be enabled/disabled as desired. But the general OS monitoring should
probably always be present.
>  - should pacemaker-cloud be a part of Aeolus releases so that we can
>    pimp the HA stuff as an Aeolus feature? if so, is it okay for it to
>    follow the Aeolus release schedule?
I think that sounds like a reasonable idea. The notion of guest
monitoring is sort of essential, so I'd like to see tight integration.
pcmk-cloud is a separable component though, so I don't think it would be
in the same tarball as another Aeolus component, but I'd have no
objection to bundling pcmk-cloud releases with Aeolus overall releases,
and also no objection to tying their schedules together.
Steve, any thoughts on the above or objections?
Perry