On Tue, Mar 26, 2024 at 7:31 PM Tim via users <users@lists.fedoraproject.org> wrote:
I hate having to deal with back-ups, it's time-consuming. Things can go wrong with them too, like what my web host did: backed up and restored my site's files without telling me (they were probably doing some maintenance on their gear the first time; the second time they were flailing around in the dark after they'd destroyed their perl installation). Every file had its permissions fouled up. Twice, now, I've had to un-munge about 1500 files.
'Tis a common tale but true.
I once spent a summer rescuing plain-text data files from a CDC Cyber backup onto Solaris. The files had missing or out-of-order records and blocks of duplicated records, and the data format had changed over the years.
The original files had been used to produce printed data reports, which were later scanned with an automated system that sometimes messed up a page without anyone checking. In some cases, OCR on short sections of those reports was able to replace missing records.
I did the work using command-line tools on Apple OS X. Rather than editing the files manually, I wrote shell scripts to remove duplicate records, sort them correctly, and convert them to a common format. It was a big exercise in text processing with the POSIX shell and utilities.
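To give a flavor of that kind of script, here is a minimal sketch, not the scripts I actually used. It assumes a made-up record layout (one record per line, whitespace-separated fields, a numeric record ID in the first field), and the function name clean_records is hypothetical too. The real formats were messier.

```shell
#!/bin/sh
# Sketch only: assumes one record per line, whitespace-separated fields,
# and a numeric record ID in field 1 (a hypothetical layout).

clean_records() {
    # 1. Drop carriage returns left over from other systems.
    # 2. Skip blank lines and rebuild each record with single spaces
    #    between fields (assigning $1 = $1 makes awk rejoin the fields).
    # 3. Sort numerically on the record ID; identical lines end up adjacent.
    # 4. Remove exact duplicate records.
    tr -d '\r' |
        awk 'NF { $1 = $1; print }' |
        LC_ALL=C sort -k1,1n |
        uniq
}
```

It reads from standard input and writes to standard output, so something like "clean_records < raw.txt > clean.txt" would do one file. Note the choice of sort followed by uniq rather than sort -u with a key: with a key, -u keeps only one line per record ID, which would silently discard records that share an ID but differ elsewhere.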