NetNews Usenet Archive 1992 #19

Internet Message Format | 1992-08-26 | 5.1 KB

Path: sparky!uunet!gatech!rutgers!rochester!cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!andrew.cmu.edu!<UNAUTHENTICATED>+
From: Barry_Wolman@transarc.com
Newsgroups: comp.unix.aix
Subject: HELP! How to Restore Lost Logical Vol
Message-ID: <web30vD0Bwx3I0mEQy@transarc.com>
Date: 27 Aug 92 01:30:35 GMT
Organization: Carnegie Mellon, Pittsburgh, PA
Lines: 102

Sorry for the length of the posting. Don't bother reading on if you're not an AIX Logical Volume Manager (LVM) expert.

Several weeks ago the 320H I used for performance testing was upgraded from five to eight external Maxtor 203 disks. The original five disks had been organized into two volume groups, vg01 and vg02. I reorganized the volume groups so that vg01 had six disks and vg02 had two disks, hdisk5 and hdisk9. Some of the disks that used to be in vg02 were moved into vg01. In vg02 I created two 48MB raw partitions (one per disk) and a 300MB mounted file system, /perf, striped across the two drives. lv01, the logical volume for /perf, consisted of 38 LPs on hdisk5 and 36 LPs on hdisk9. There was a loglv01 on hdisk9. I used the system without problems until today.

I used the past tense in the preceding paragraph because when I logged in today, I discovered that the system had been rebooted for a reason unrelated to the problems reported here. I found that /perf wasn't mounted. When I checked, I found that vg02 and all the logical volumes in it had disappeared! When I called our operations group I was told that they hadn't been able to varyon the logical volumes in vg02 after the reboot and that they had tried "various things" without success.

More than 90% of /perf consisted of files duplicated elsewhere or files that can be regenerated by recompiling sources stored on a regularly backed up partition. There were a small number of files that I'd miss, e.g. some recent raw profiles and traces, so I was willing to invest a few hours in trying to regenerate /perf.

Fortunately, when I reorganized the volume groups, I wrote a script that ran various combinations of lslv, lspv, and lsvg to generate a detailed summary of the status and location of every logical volume. I had a hard copy of the output of this script from just after the reorganization. I decided to see if I could recreate the logical volumes exactly where they were before the crash. My hope was that I could mount lv01 as /perf and be back on the air.

I recreated vg02 and then realized it might be a good idea to make a physical copy of the two disks. I used

    dd if=/dev/hdisk5 of=/dev/rmt0 bs=4096k

to make a copy of hdisk5 on an 8mm tape and used a similar dd to make a copy of hdisk9.

Using the detailed info of where the logical volumes had been, I used smit to create new instances of the logical volumes and verified that they occupied the same ranges of PPs that they used to occupy and had the same attributes. I then tried to mount /perf by

    mount -o rw,log=/dev/loglv01 /dev/lv01 /perf

Of course this didn't work (if it had, I wouldn't have made this posting :-(). I then tried the smit command that adds a new journaled file system using an existing logical volume (lv01). It got the right size for the file system (608xxx blocks), but after I mounted /perf I discovered it was empty.

I then tried to refill lv01 and loglv01 with what was on the disk before I started fiddling with smit. After unmounting /perf I used dd with appropriate skip= and seek= operands to copy the right number of PPs from the two tapes into the LPs in lv01 and loglv01. After I did this, mount complained about a file system call receiving an invalid parameter. I'm pretty sure I used the right set of parameters when I "refilled" lv01 and loglv01, but it's possible I made a mistake. I'll double check when I get back to work tomorrow.

After refilling the logical volumes failed, I used dd to restore the two physical disks from the tapes I had written.
This didn't result in a usable file system either. I think smit showed the file system size as 0.

I'm a bit surprised that none of the above worked (I suppose it's possible that operations did something to clobber the disk when they were trying to "fix" the problem). If I had been booting the system, I would have taken the two physical dumps as soon as the first varyon attempt failed and I realized that vg02 was gone.

Is there anything I can do? Should I resign myself to the loss of the files that weren't backed up?

A more troubling question is why did vg02 disappear? I did all the file operations from smit, and my detailed display of LVM values showed no signs of error. Could there be an LVM bug here?

In case it is relevant, I have a record of what was in vg01 and vg02 before the reorganization, but not at the same level of detail as for after the reorganization. Unfortunately, I have a partial, but not a complete, log of the smit operations I did during the reorganization. We use AFS, so when I "su rootl", I lose access to my home directory in AFS, where smit.log is stored. If I remember to "klog barry", the smit ops get logged; if I don't klog, they aren't. The reorganization was done in stages, and I sometimes didn't klog.

Thanks in advance for any advice,

Barry
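
The skip= and seek= arithmetic for the refill step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the 4 MB PP size, the 1 MB offset of the first PP from the start of the disk, the PP and LP numbers, the image path /tmp/hdisk5.img, and the raw device name /dev/rlv01 are all hypothetical and do not come from the post; the real values would have to be read off the saved lslv/lsvg listing. The function only prints a dd command for one contiguous run of partitions; it does not execute anything.

```sh
#!/bin/sh
# Dry-run sketch: print (never run) a dd command that would copy a contiguous
# run of physical partitions (PPs) out of a whole-disk image into the matching
# logical partitions (LPs) of a logical volume.
# All sizes are in MB because the printed command uses bs=1024k.

# Offset in MB of a given PP from the start of the disk image.
# usage: pp_offset_mb FIRST_PP PP_SIZE_MB PP1_OFFSET_MB
# PP1_OFFSET_MB (space before the first PP) is an ASSUMED parameter.
pp_offset_mb() {
    echo $(( $3 + ($1 - 1) * $2 ))
}

# usage: lp_copy_cmd IMAGE LVDEV FIRST_PP FIRST_LP NUM_PARTS PP_SIZE_MB PP1_OFFSET_MB
lp_copy_cmd() {
    src_off=$(pp_offset_mb "$3" "$6" "$7")   # where the run starts in the image
    dst_off=$(( ($4 - 1) * $6 ))             # where the run starts in the LV
    count=$(( $5 * $6 ))                     # length of the run
    echo "dd if=$1 of=$2 bs=1024k skip=$src_off seek=$dst_off count=$count"
}

# Hypothetical example: LPs 1-38 of lv01 sitting on PPs 13-50 of hdisk5,
# with an assumed 4 MB PP size and an assumed 1 MB gap before PP 1.
lp_copy_cmd /tmp/hdisk5.img /dev/rlv01 13 1 38 4 1
# prints: dd if=/tmp/hdisk5.img of=/dev/rlv01 bs=1024k skip=49 seek=0 count=152
```

Printing the command first makes it easy to compare each offset against the hard-copy listing before anything is written to disk. A logical volume whose PPs are not contiguous would need one such command per contiguous run.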