On 11Jul2019 08:17, Alex <mysqlstudent@gmail.com> wrote:
> > However, the RAID arrangement is proprietary and different to mdadm
> > and/or LVM. OTOH, I did once spend an hour on the phone with a very
> > helpful LSI engineer trying to rescue one here.
> >
> > So, using the LSI in JBOD (just a bunch of discs) mode, yes?
>
> No, I believe it's RAID5 - 8x240GB.
I thought you said earlier that it was managed with mdadm, so I took it
to be a JBOD on the LSI controller and a RAID5 in mdadm.
> I'm planning on replacing those disks with 4x1TB disks, which would
> work on the onboard controllers. I still can't decide whether to
> continue to use the LSI controller.
Your call. They're a bit more directly visible and manageable under
mdadm. If you're not using the LSI controller then the drives can also
be physically moved to a machine with no LSI controller should that
become necessary (thinking DR here; but you've got backups?)
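By "manageable" I mean everything is inspectable with the stock tools,
e.g. (device names here are just examples):

    cat /proc/mdstat                 # overall software RAID state
    mdadm --detail /dev/md0          # per-array health and members
    mdadm --examine /dev/sda1        # a component device's metadata
    # and mdadm can watch the arrays and email you on events:
    mdadm --monitor --scan --daemonise --mail you@example.com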
> Actually, if I did use the onboard, I could use mdadm, take the system
> down to perform the initial install, and sync the data from the LSI
> disks while the system is running, then shut it down briefly to do the
> final sync after the bulk of the data has transferred.
Yes, that would work I think. Which would migrate you off the LSI
controller, yes? To pure mdadm with on board SATA?
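Roughly like this - a sketch only, with made-up device names and paths,
and assuming you stay with RAID5:

    # build the new array on the 4x1TB drives on the onboard SATA
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt/new
    # bulk copy while the system is still live on the LSI array
    rsync -a /srv/mail/ /mnt/new/
    # later: stop the mail services, then the short final pass
    rsync -a --delete /srv/mail/ /mnt/new/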
> > The LSI stuff is pretty good in my experience. Ran them in several
> > IBM boxes and also at home for years.
>
> I am inclined to believe it will perform better than mdadm.
It should. The OS sees one drive for the raidset and does one I/O for a
write; the per-drive I/O is done by the LSI controller. If mdadm manages
the RAID it must update each backend drive itself from the OS. (For
RAID5 a sub-stripe write is a read-modify-write: read the old data and
parity, write the new data and parity - and with mdadm all of those
I/Os come from the OS.)
It doesn't change the underlying physical drive behaviour, but it moves
managing the RAID and the write requests out of the OS.
> > > Using the LSI makes me nervous - there have been one or two times
> > > when I almost lost the array, but I'll probably keep using it.
> >
> > The important thing is to be able to monitor them. I've some scripts
> > for that - put them in a 5 minute cronjob. Or in your monitoring
> > system eg nagios. Then you will get timely emails if a problem occurs.
> That sounds awesome. Do you know where I can find those scripts? I
> forgot they used to be referred to as megaraid.
I've attached "mcli" and "nagios-report-mcli". "mcli" invokes the
cs.app.megacli python module conveniently. The nagios script wraps mcli
and produces a nagios compatible status line.
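For the cron approach, something like this is what I have in mind; the
path is an assumption (wherever you install the scripts), and it relies
on the wrapper exiting nonzero when things are not OK, per the usual
nagios plugin convention, so cron emails you only on trouble:

    # /etc/cron.d/megaraid: check every 5 minutes
    MAILTO=alex@example.com
    */5 * * * * root out=$(/usr/local/bin/nagios-report-mcli 2>&1) || echo "$out"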
> > I wrote the cs.app.megacli Python module for this (see PyPI) and
> > have some small auxiliary scripts which wrap it.
>
> Can you forward it on?
The module is here:
https://pypi.org/project/cs.app.megacli/
It is Python 2 specific because historically it needed to run on the
native Python of some RHEL4 and RHEL5 machines. I've a TODO to make a
Python 3 version.
Install is "pip install cs.app.megacli". Or use the python file I've
also attached.
This is dependent on the LSI MegaRAID Linux software. Which you used to
be able to download, but I can't find a download for it any more. I can
ship you an RPM or a tarball of the unpacked tree for x86_64 separately.
The Python code expects this installed at /opt/MegaRAID.
> I have another system on the same network with like 7TB of data
> available. I'm thinking that I sync a copy of the user data to that
> system, and create a virtual machine on that system with the mail
> server config that somehow mounts the directory on the host system.
You could make a virtual drive for the VM in the usual way (distinct
from the VM's OS virtual drive). Copy to it via the VM. You could NFS
mount the original system's volume to the VM and do a regular
cp-then-rsync.
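For example (hostname and paths made up; assumes the original system
NFS-exports the volume):

    # on the VM
    mount -o ro mailhost:/srv/mail /mnt/oldmail    # NFS mount of the original
    cp -a /mnt/oldmail/. /data/mail/               # first bulk pass
    rsync -a --delete /mnt/oldmail/ /data/mail/    # repeatable catch-up passes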
> Disable the production system and change the IP of the virtual machine
> to assume that of the mail server.
Maybe allocate a service IP for the mail system instead (just an extra
IP _not_ officially assigned to a particular physical machine). _Add_ it
to the original system well ahead of the move. Update DNS for the mail
to point at the service address (hoping your clients use a logical name
like "mail" or "smtp" etc, not the personal hostname of the mail
system). Wait for all connections to be using the service address.
When the new address is in play, sync your replacement VM. (Of course,
get the copy and a mostly-sync underway while this is playing out.)
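Mechanically the service address is just a secondary IP on the existing
interface (example address and interface name):

    # on the current mail machine
    ip addr add 192.0.2.25/24 dev eth0
    # and in DNS, point the logical mail name at it, e.g.:
    #   smtp.example.com.  IN A  192.0.2.25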
> Users will then access this temporary system while I rebuild the
> production system and transfer back the data.
At cutover: down the mail services. Sync latest volume state to the VM
(from the VM so that it can be root at the VM end). Drop the service
address from the main machine, add it to the VM. Do final sync. Check
things, then bring up mail services on the VM.
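As a command sketch, reusing the made-up names from above (the service
names are assumptions too - substitute whatever you run):

    systemctl stop postfix dovecot                # old machine: stop mail
    rsync -a --delete /mnt/oldmail/ /data/mail/   # VM: sync the quiet volume
    ip addr del 192.0.2.25/24 dev eth0            # old machine: drop service IP
    ip addr add 192.0.2.25/24 dev eth0            # VM: take the service IP
    rsync -a --delete /mnt/oldmail/ /data/mail/   # VM: final sync
    systemctl start postfix dovecot               # VM: bring mail up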
> Once the initial bulk transfer has occurred, shut down services on the
> virtual machine and sync the remaining user data back to the
> production system.
Yep. And move the service address back.
> Does this sound feasible?
I think so. I'd get it all down on paper (or a text file) as a complete
sequence of operations. There's nothing like writing down a plan to
discover a gaping procedural gap. Especially if someone else then
reviews the plan for you.
I'd also make sure I had a separate backup of the big filesystem anyway
eg on a removable drive (or so); I have a lot of paranoia about
production data.
For added fun, the VM can be your proof of concept for the OS upgrade as
well: install it with the same OS etc you're intending for the main
machine. That way you can get any bugs or surprises out before
committing to the main machine upgrade.
Cheers,
Cameron Simpson <cs@cskk.id.au>