My goal is to queue up more tasks than there are CPUs, and have only as many
run in parallel as there are CPUs.
So if I have 120 jobs to run and 32 cores, I want to queue them all up and
run 32 at a time in parallel.
Or maybe I need to set --ncpus=16 to schedule 16 parallel jobs instead of
32 (my scheduler is very simple and doesn't know about free memory).
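To make the idea concrete, here is a minimal sketch of "queue everything, run at most N at a time". It is not how batch-queue is implemented; it is a standalone illustration using only the Python standard library, and the function name run_all and its ncpus parameter are made up for this example.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands, ncpus=None):
    """Run every command, at most `ncpus` at a time; return exit codes in order.

    `run_all` and `ncpus` are hypothetical names for this sketch only.
    """
    workers = ncpus or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread blocks on one child process, so no more than
        # `workers` child processes are running at any moment.
        results = pool.map(lambda cmd: subprocess.run(cmd).returncode, commands)
        return list(results)


if __name__ == "__main__":
    # 120 trivial placeholder jobs, 16 in parallel instead of one per core.
    jobs = [[sys.executable, "-c", "pass"] for _ in range(120)]
    codes = run_all(jobs, ncpus=16)
    print(sum(1 for c in codes if c != 0), "jobs failed")
```

Like the scheduler described above, this only counts slots: it knows nothing about free memory, so lowering ncpus is the only lever if the jobs are memory-hungry.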
On Fri, Feb 4, 2022 at 10:24 AM John Mellor <john.mellor(a)gmail.com> wrote:
On 2022-02-04 09:04, Neal Becker wrote:
After this discussion, I needed a simple batch scheduling system. I tried
installing and starting condor on F35. Never saw so many selinux
problems. Couldn't dnf remove it fast enough.
After a bit of searching, I found the system I wrote 11 years ago.
https://pypi.org/project/batch-queue/
I just finished updating for py3 and a few more tweaks.
It's a very simple system that runs on the local host and allows you to
submit jobs. It will schedule them up to the #cpus. There are commands to
list the queue, kill jobs, suspend and continue them. Does just what I
need.
If it helps you too, that'd be great.
On Wed, Jan 26, 2022 at 2:04 PM Fred Erickson <fredferickson(a)gmail.com>
wrote:
> On Wed, 26 Jan 2022 12:59:23 -0500
> Neal Becker <ndbecker2(a)gmail.com> wrote:
>
> > I've needed this over the years but all the ones I've seen appeared
> > much too complex for my simple use case. I ended up writing my own
> > using pyxmlrpc. Unfortunately haven't used it for years and don't
> > know if I could find it again (was uploaded to pypi at one time).
> >
> > Are any of these batch systems simple to install, use, and maintain?
> >
>
> I see batch was included with my f34 system and condor is provided in
> the updates repo.
I'm confused. Why do you feel the need for an overly-complicated job
scheduler, emulating the mainframe scheduling mess? Linux is a unix-like
system. Why don't you simply use the already-installed batch command? It
can easily handle thousands of simultaneously-scheduled batch jobs without
bringing the system to its knees.
Unless your jobs are 100% cpu-bound, scheduling jobs by the number of cpus
seems just wrong, leaving a lot of unused cpu cycles on the table. If your
jobs are i/o-bound, then disk or network load seems like it should be taken
into account, not cpus.
Luckily, the overall system load is always calculated for you with no
complicated mechanisms required. The batch command by default will not
schedule a job if the load > 1.5 so that you do not impact foreground
processes very much. You can also renice your batch job as required to
lessen the as-running impact even more.
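As a rough illustration of load-based gating (as opposed to counting cpus), the sketch below waits until the load average drops below a threshold and then starts the job under nice. It is not how atd or batch is implemented. The function names, the 60-second poll interval, and the choice of the 1-minute load average are all assumptions made for this example; the 1.5 default is simply the figure quoted above.

```python
import os
import subprocess
import time


def load_permits(threshold=1.5, loadavg=None):
    """Return True when the 1-minute load average is below `threshold`.

    `loadavg` may be passed in for testing; otherwise the current system
    value from os.getloadavg() (Unix only) is used.
    """
    one_minute = (loadavg or os.getloadavg())[0]
    return one_minute < threshold


def run_when_idle(cmd, threshold=1.5, poll_seconds=60, niceness=10):
    """Wait for the load to drop below `threshold`, then run `cmd` niced.

    All parameter names and defaults here are hypothetical.
    """
    while not load_permits(threshold):
        time.sleep(poll_seconds)
    # Start the job through nice(1) so it yields to foreground processes.
    return subprocess.run(["nice", "-n", str(niceness), *cmd]).returncode
```

For example, run_when_idle(["make", "-j4"]) would block until the machine is quiet and then run the build at reduced priority.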
If what you really need is a CI/CD system, then use the correct tools for
the job. Batch is generally not considered to be one of them. Install
Jenkins or CircleCI or any of a dozen tools that are built to do this right.
--
John Mellor