On 11Jul2019 08:17, Alex <mysqlstudent@gmail.com> wrote:
> > However, the RAID arrangement is proprietary and different to mdadm
> > and/or LVM. OTOH, I did once spend an hour on the phone with a very
> > helpful LSI engineer trying to rescue one here.
> >
> > So, using the LSI in JBOD (just a bunch of discs) mode, yes?
>
> No, I believe it's RAID5 - 8x240GB.
I thought you said earlier that it was managed with mdadm, so I took it
to be a JBOD on the LSI controller and a RAID5 in mdadm.
> I'm planning on replacing those disks with 4x1TB disks, which would
> work on the onboard controllers. I still can't decide whether to
> continue to use the LSI controller.
Your call. They're a bit more directly visible and manageable under
mdadm. If you're not using the LSI controller then the drives can also
be physically moved to a machine with no LSI controller should that
become necessary (thinking DR here; but you've got backups?)
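By "manageable" I mean everything is inspectable with the stock tools,
e.g. (device names here are just examples):

    cat /proc/mdstat                 # overall software RAID state
    mdadm --detail /dev/md0          # per-array health and members
    mdadm --examine /dev/sda1        # a component device's metadata
    # and mdadm can watch the arrays and email you on events:
    mdadm --monitor --scan --daemonise --mail you@example.com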
> Actually, if I did use the onboard, I could use mdadm, take the system
> down to perform the initial install, and sync the data from the LSI
> disks while the system is running, then shut it down briefly to do the
> final sync after the bulk of the data has transferred.
Yes, that would work I think. Which would migrate you off the LSI
controller, yes? To pure mdadm with on board SATA?
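Roughly like this - a sketch only, with made-up device names and paths,
and assuming you stay with RAID5:

    # build the new array on the 4x1TB drives on the onboard SATA
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt/new
    # bulk copy while the system is still live on the LSI array
    rsync -a /srv/mail/ /mnt/new/
    # later: stop the mail services, then the short final pass
    rsync -a --delete /srv/mail/ /mnt/new/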
> > The LSI stuff is pretty good in my experience. Ran them in several
> > IBM boxes and also at home for years.
>
> I am inclined to believe it will perform better than mdadm.
It should. The OS sees one drive for the raidset and does one I/O for a
write; the per-drive I/O is done by the LSI controller. If mdadm manages
the RAID it must update each backend drive itself from the OS. (For
RAID5 a sub-stripe write is a read-modify-write: read the old data and
parity, write the new data and parity - and with mdadm all of those
I/Os come from the OS.)
It doesn't change the underlying physical drive behaviour, but it moves
managing the RAID and the write requests out of the OS.
> > > Using the LSI makes me nervous - there have been one or two times
> > > when I almost lost the array, but I'll probably keep using it.
> >
> > The important thing is to be able to monitor them. I've some scripts
> > for that - put them in a 5 minute cronjob. Or in your monitoring
> > system eg nagios. Then you will get timely emails if a problem occurs.
> That sounds awesome. Do you know where I can find those scripts? I
> forgot they used to be referred to as megaraid.
I've attached "mcli" and "nagios-report-mcli". "mcli" invokes the
cs.app.megacli python module conveniently. The nagios script wraps mcli
and produces a nagios compatible status line.
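For the cron approach, something like this is what I have in mind; the
path is an assumption (wherever you install the scripts), and it relies
on the wrapper exiting nonzero when things are not OK, per the usual
nagios plugin convention, so cron emails you only on trouble:

    # /etc/cron.d/megaraid: check every 5 minutes
    MAILTO=alex@example.com
    */5 * * * * root out=$(/usr/local/bin/nagios-report-mcli 2>&1) || echo "$out"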
> > I wrote the cs.app.megacli Python module for this (see PyPI) and
> > have some small auxiliary scripts which wrap it.
>
> Can you forward it on?
The module is here:
https://pypi.org/project/cs.app.megacli/
It is Python 2 specific because historically it needed to run on the
native Python of some RHEL4 and RHEL5 machines. I've a TODO to make a
Python 3 version.
Install is "pip install cs.app.megacli". Or use the python file I've
also attached.
This is dependent on the LSI MegaRAID Linux software. Which you used to
be able to download, but I can't find a download for it any more. I can
ship you an RPM or a tarball of the unpacked tree for x86_64 separately.
The Python code expects this installed at /opt/MegaRAID.
> I have another system on the same network with like 7TB of data
> available. I'm thinking that I sync a copy of the user data to that
> system, and create a virtual machine on that system with the mail
> server config that somehow mounts the directory on the host system.
You could make a virtual drive for the VM in the usual way (distinct
from the VM's OS virtual drive). Copy to it via the VM. You could NFS
mount the original system's volume to the VM and do a regular
cp-then-rsync.
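For example (hostname and paths made up; assumes the original system
NFS-exports the volume):

    # on the VM
    mount -o ro mailhost:/srv/mail /mnt/oldmail    # NFS mount of the original
    cp -a /mnt/oldmail/. /data/mail/               # first bulk pass
    rsync -a --delete /mnt/oldmail/ /data/mail/    # repeatable catch-up passes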
> Disable the production system and change the IP of the virtual machine
> to assume that of the mail server.
Maybe allocate a service IP for the mail system instead (just an extra
IP _not_ officially assigned to a particular physical machine). _Add_ it
to the original system well ahead of the move. Update DNS for the mail
to point at the service address (hoping your clients use a logical name
like "mail" or "smtp" etc, not the personal hostname of the mail
system). Wait for all connections to be using the service address.
When the new address is in play, sync your replacement VM. (Of course,
get the copy and a mostly-sync underway while this is playing out.)
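Mechanically the service address is just a secondary IP on the existing
interface (example address and interface name):

    # on the current mail machine
    ip addr add 192.0.2.25/24 dev eth0
    # and in DNS, point the logical mail name at it, e.g.:
    #   smtp.example.com.  IN A  192.0.2.25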
> Users will then access this temporary system while I rebuild the
> production system and transfer back the data.
At cutover: down the mail services. Sync latest volume state to the VM
(from the VM so that it can be root at the VM end). Drop the service
address from the main machine, add it to the VM. Do final sync. Check
things, then bring up mail services on the VM.
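As a command sketch, reusing the made-up names from above (the service
names are assumptions too - substitute whatever you run):

    systemctl stop postfix dovecot                # old machine: stop mail
    rsync -a --delete /mnt/oldmail/ /data/mail/   # VM: sync the quiet volume
    ip addr del 192.0.2.25/24 dev eth0            # old machine: drop service IP
    ip addr add 192.0.2.25/24 dev eth0            # VM: take the service IP
    rsync -a --delete /mnt/oldmail/ /data/mail/   # VM: final sync
    systemctl start postfix dovecot               # VM: bring mail up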
> Once the initial bulk transfer has occurred, shut down services on the
> virtual machine and sync the remaining user data back to the
> production system.
Yep. And move the service address back.
> Does this sound feasible?
I think so. I'd get it all down on paper (or a text file) as a complete
sequence of operations. There's nothing like writing down a plan to
discover a gaping procedural gap. Especially if someone else then
reviews the plan for you.
I'd also make sure I had a separate backup of the big filesystem anyway
eg on a removable drive (or so); I have a lot of paranoia about
production data.
For added fun, the VM can be your proof of concept for the OS upgrade as
well: install it with the same OS etc you're intending for the main
machine. That way you can get any bugs or surprises out before
committing to the main machine upgrade.
Cheers,
Cameron Simpson <cs@cskk.id.au>