- Path: sparky!uunet!gatech!rutgers!rochester!cantaloupe.srv.cs.cmu.edu!crabapple.srv.cs.cmu.edu!andrew.cmu.edu!<UNAUTHENTICATED>+
- From: Barry_Wolman@transarc.com
- Newsgroups: comp.unix.aix
- Subject: HELP! How to Restore Lost Logical Vol
- Message-ID: <web30vD0Bwx3I0mEQy@transarc.com>
- Date: 27 Aug 92 01:30:35 GMT
- Organization: Carnegie Mellon, Pittsburgh, PA
- Lines: 102
-
- Sorry for the length of the posting. Don't bother reading on if
- you're not an AIX Logical Volume Manager (LVM) expert.
-
- Several weeks ago the 320H I used for performance testing was
- upgraded from five to eight external Maxtor 203 disks. The
- original five disks had been organized into two volume groups vg01
- and vg02. I reorganized the volume groups so vg01 had six disks
- and vg02 had two disks hdisk5 and hdisk9. Some of the disks that
- used to be in vg02 were moved into vg01.
-
- In vg02 I created two 48MB raw partitions (one per disk) and a
- 300MB mounted file system /perf striped across the two drives.
- lv01, the logical volume for /perf, consisted of 38 LPs on hdisk5
- and 36 LPs on hdisk9. There was a loglv01 on hdisk9. I used the
- system without problems until today.
-
- I used the past tense in the preceding paragraph, because when I
- logged in today, I discovered that the system had been rebooted
- for a reason unrelated to the problems reported here. I found
- that /perf wasn't mounted. When I checked, I found that vg02 and
- all the logical volumes in it had disappeared! When I called our
- operations group I was told that they hadn't been able to varyon
- the logical volumes in vg02 after the reboot and that they had
- tried "various things" without success.
-
- More than 90% of /perf consisted of files duplicated elsewhere or
- files that can be regenerated by recompiling sources stored on a
- regularly backed up partition. There were a small number of files
- that I'd miss, e.g. some recent raw profiles and traces, so I was
- willing to invest a few hours in trying to regenerate /perf.
-
- Fortunately, when I reorganized the volume groups, I wrote a
- script that ran various combinations of lslv, lspv, and lsvg to
- generate a detailed summary of the status and location of every
- logical volume. I had a hard copy of the output of this script
- from just after the reorganization.
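
A minimal sketch of what such an inventory script might look like follows. It is not the script described above; it assumes the AIX commands `lsvg -o`, `lsvg -l`, `lsvg -p`, `lslv -m`, and `lspv -p` behave as documented, and that `lsvg -l` and `lsvg -p` print two header lines before the data. The function name `lvm_inventory` is hypothetical.

```shell
#!/bin/sh
# Sketch: dump the LVM layout of every varied-on volume group.
# Assumes AIX lsvg, lslv, and lspv; lvm_inventory is a made-up name.

lvm_inventory() {
    for vg in $(lsvg -o); do
        echo "=== volume group: $vg ==="
        lsvg "$vg"            # volume group attributes, including PP size
        lsvg -p "$vg"         # physical volumes in the group
        lsvg -l "$vg"         # logical volumes in the group

        # Assumption: two header lines, then one LV per line, name first.
        for lv in $(lsvg -l "$vg" | awk 'NR > 2 { print $1 }'); do
            echo "--- logical volume: $lv ---"
            lslv "$lv"        # logical volume attributes
            lslv -m "$lv"     # LP-to-PP map
        done

        # Assumption: two header lines, then one disk per line, name first.
        for pv in $(lsvg -p "$vg" | awk 'NR > 2 { print $1 }'); do
            echo "--- physical volume: $pv ---"
            lspv -p "$pv"     # PP ranges and the LV that owns each range
        done
    done
}
```

Running `lvm_inventory > /tmp/lvm.layout` after each reorganization step and printing the result would give the kind of hard copy mentioned above.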
-
- I decided to see if I could recreate the logical volumes exactly
- where they were before the crash. My hope was that I could mount
- lv01 as /perf and be back on the air.
-
- I recreated vg02 and then realized it might be a good idea to make
- a physical copy of the two disks. I used
- dd if=/dev/hdisk5 of=/dev/rmt0 bs=4096k to make a copy of
- hdisk5 on an 8mm tape and used a similar dd to make a copy of
- hdisk9.
-
- Using the detailed info of where the logical volumes had been, I
- used smit to recreate new instances of the logical volumes and
- verified that they occupied the same ranges of PPs that they used
- to occupy and had the same attributes. I then tried to mount
- /perf by
- mount -o rw,log=/dev/loglv01 /dev/lv01 /perf
- Of course this didn't work (if it had, I wouldn't have made this
- posting :-().
-
- I then tried the smit command that adds a new journaled file
- system using an existing logical volume (lv01). It got the right
- size for the file system (608xxx blocks), but after I mounted
- /perf I discovered it was empty.
-
- I then tried to refill lv01 and loglv01 with what was on the disk
- before I started fiddling with smit. After unmounting /perf I
- used dd with appropriate skip= and seek= operands to copy the
- right number of PPs from the two tapes into the LPs in lv01 and
- loglv01. After I did this, mount complained about a file system
- call receiving an invalid parameter. I'm pretty sure I used the
- right set of parameters when I "refilled" lv01 and loglv01, but
- it's possible I made a mistake. I'll double check when I get back
- to work tomorrow.
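
The skip= and seek= arithmetic for the refill step can be sketched as below. This is an illustration under stated assumptions, not the commands actually used: the helper `pp_dd_operands` is hypothetical, the PP size and PP/LP numbers must come from the recorded lsvg/lslv/lspv output, and the offset of PP 1 from the start of the disk is a caller-supplied guess (LVM keeps its own records ahead of the first PP, so that offset is probably not zero). It also assumes the source is a seekable disk image; a tape written with bs=4096k may need to be read back with that same block size.

```shell
#!/bin/sh
# Sketch: compute dd operands for copying a run of PPs out of a whole-disk
# image into a run of LPs in a logical volume. All values are in KB blocks.
#   $1 pp_mb      PP size in MB
#   $2 origin_kb  offset in KB of PP 1 from the start of the disk (assumed)
#   $3 first_pp   first PP number on the disk, 1-based
#   $4 first_lp   first LP number in the logical volume, 1-based
#   $5 count      number of partitions to copy
pp_dd_operands() {
    pp_kb=$(( $1 * 1024 ))
    skip_kb=$(( $2 + ($3 - 1) * pp_kb ))   # where the run starts in the image
    seek_kb=$(( ($4 - 1) * pp_kb ))        # where the run lands in the LV
    count_kb=$(( $5 * pp_kb ))             # length of the run
    echo "bs=1k skip=$skip_kb seek=$seek_kb count=$count_kb"
}
```

For example, `dd if=disk.img of=/dev/rlv01 $(pp_dd_operands 4 1024 3 39 2)` would copy two 4MB partitions starting at PP 3 into LPs 39 and 40, with every number in that call being a made-up illustration. A wrong origin or an off-by-one in either partition number would shift the file system's superblock, which mount could reject.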
-
- After refilling the logical volumes failed, I used dd to restore
- the two physical disks from the tapes I had written. This didn't
- result in a usable file system either. I think smit showed the
- file system size as 0.
-
- I'm a bit surprised that none of the above worked (I suppose it's
- possible that operations did something to clobber the disk when
- they were trying to "fix" the problem). If I had been booting the
- system, I would have taken the two physical dumps as soon as the
- first varyon attempt failed and I realized that vg02 was gone.
-
- Is there anything I can do? Should I resign myself to the loss of
- the files that weren't backed up?
-
- A more troubling question is why did vg02 disappear? I did all
- the file operations from smit and my detailed display of LVM
- values showed no signs of error. Could there be an LVM bug here?
- In case it is relevant, I have a record of what was in vg01 and
- vg02 before the reorganization, but not at the same level of
- detail as for after the reorganization.
-
- Unfortunately, I have a partial, but not a complete log of the
- smit operations I did during the reorganization. We use AFS, so
- when I "su rootl", I lose access to my home directory in AFS where
- smit.log is stored. If I remember to "klog barry", the smit ops
- get logged; if I don't klog, they aren't. The reorganization was
- done in stages, and I sometimes didn't klog.
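
One way around the lost logs might be to point smit at a local directory. The wrapper below is a sketch that assumes smit accepts `-l File` for the log and `-s File` for the script file; the function name `smit_logged` and the `SMIT_LOG_DIR` variable are hypothetical.

```shell
#!/bin/sh
# Sketch: run smit with its log and script files on local disk, so that
# logging does not depend on holding an AFS token.
# Assumes smit supports -l (log file) and -s (script file).
smit_logged() {
    logdir=${SMIT_LOG_DIR:-/tmp}     # hypothetical override variable
    who=${LOGNAME:-root}             # tag the files with the login name
    smit -l "$logdir/smit.log.$who" -s "$logdir/smit.script.$who" "$@"
}
```

Invoked as `smit_logged lvm`, this would leave a record of every operation even when the klog step is forgotten.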
-
- Thanks in advance for any advice,
-
- Barry
-