Hi Bill,
On Wed, Apr 02, 2014 at 07:58:47PM +0000, Bill Oliver wrote:
> On Wed, 2 Apr 2014, Digimer wrote:
> >How do you define "real cluster"?
> >
> Something that I can take *one* program compiled for parallelization
> that will distribute the processing among machines, as compared to
> running multiple invocations of a program on different machine, each
> chewing on a different dataset.

I think this is more application-specific than your original email
suggested. I have seen this kind of feature implemented with a
client-server model. See root-proof in the Fedora repositories. I have
used it a few times, and some institutes affiliated with CERN use it
internally. This particular implementation allows for concurrent jobs
on (often distributed) datasets when the applications are written using
the ROOT C++ framework (also available in the Fedora repos). But I think
this does not satisfy your "not running on different datasets"
requirement. I really think you should choose the framework to
experiment with based on your workload rather than the other way around.

Cheers,
--
Suvayu
Open source is the future. It sets us free.