Fedora Core 5 LV Disk Recovery, or: when the proverbial S..t hits the fan
My case:
A 160GB drive's loose power cable caused the drive to eventually stop running, and some serious damage was done to the file system. When I managed to get the drive back online and run fsck, it took about an hour and a half for all the corrections to be applied, and I had to run fsck several times before I had a clean drive again. All data was recovered.
Just to add a few things to the excellent article below:
I'm assuming a standard installation as set up by the automatic partitioning of the Fedora Core 5 installer, and no RAID. Understand that it's not /dev/hda2 that you are going to be addressing but /dev/VolGroup00/LogVol00, which in a standard Fedora Core 5 installation lives in the volume group built on /dev/hda2, alongside the swap logical volume.
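If you want to confirm that layout before touching anything, the lvm shell in the rescue environment can show it. A minimal sketch (VolGroup00 and LogVol00 are the Fedora defaults, so adjust the names if your install differs):
lvm pvscan    # should report /dev/hda2 as the physical volume
lvm lvscan    # should list /dev/VolGroup00/LogVol00 (root) and the small swap LV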
Note that the article below was written for a rather older Fedora version, so not everything is exactly as described, but it's close enough that you should be able to figure things out. If you cannot, email me, Skype me or message me.
email: Anthony.Dawson@thelasis.com
Skype ID: aegdawson
ICQ ID: 23-227-727
Messenger: aegdawson
MSN: anthony.dawson@hotmail.com
Any fsck -yf should be run against the LV, not the underlying partition, so when you are done with your recovery process you will probably want to do a final cleanup pass.
Also, if the file system superblock is totally screwed you will need to use the -b option to select an alternative superblock backup copy (of which there are many throughout the disk). If you don't know where they are, use:
mke2fs -n /dev/hda2
This will list a whole bunch of superblock copies.
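For example (the block numbers below are only illustrative; use whatever mke2fs -n actually prints for your device, and note that -n only pretends to create the file system, so it writes nothing):
mke2fs -n /dev/hda2
...
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, ...
fsck -yfb 32768 /dev/VolGroup00/LogVol00   # retry fsck using the first backup superblock
If the numbers it reports do not work, try running mke2fs -n against the LV itself, since that is the device fsck is actually checking.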
Apart from the rest, I had to issue a pvcreate command within the lvm environment in order to set up the LVM label, which I had previously deleted in ignorance… Also note that the article renames the volume group so as not to conflict with the one on the healthy installation. I booted the system off the first installation disc in "linux rescue" mode and repaired the disk to the point where I could finally boot the system up again.
In my case the process was the following (a rough command sketch follows the list):
1) Boot off the Fedora boot disk, press F5 and type in linux rescue.
2) Use parted to check that the partitions still exist.
3) fsck -yf /dev/hda1 (note that /dev/hda1 is not an LV).
4) Use dd to extract the volume information.
5) Use vi to edit the volume information and create a VolGroup00 configuration file.
6) Use pvcreate with the right drive ID to label the volume with the correct UUID, which I got from the volume information extracted in step 4.
7) Use vgcfgrestore to restore the volume description.
8) Use vgchange to make the volume active.
9) Use fsck -yf /dev/VolGroup00/LogVol00 to repair the file system.
10) If that fails, use mke2fs -n /dev/hda2 to find the locations of the superblock backups, and then
11) use fsck -yfb nnnn /dev/VolGroup00/LogVol00, where nnnn is the block number of an alternate superblock.
12) Reboot.
If you're lucky you're back in business :-)
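Here is a rough sketch of steps 4-11 as commands, adapted from the article below for the no-RAID /dev/hda2 case. The metadata file name is my own choice, the PV UUID has to be the one you find inside the extracted metadata, and newer LVM2 releases also want --restorefile alongside pvcreate --uuid:
dd if=/dev/hda2 bs=512 count=255 skip=1 of=/tmp/hda2-meta   # pull the LVM2 metadata area off the PV
vi /tmp/hda2-meta                                           # trim it to just the plain-text VolGroup00 { ... } block
pvcreate --uuid "<pv-uuid-from-that-metadata>" /dev/hda2    # relabel the PV with its original UUID
vgcfgrestore -f /tmp/hda2-meta VolGroup00                   # restore the volume group description
vgchange -a y VolGroup00                                    # activate the volume group
fsck -yf /dev/VolGroup00/LogVol00                           # repair the file system on the root LV
In the rescue environment you may have to prefix the LVM commands with lvm, for example lvm pvcreate.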
Recovery of RAID and LVM2 Volumes
Apr 28, 2006
By Richard Bullington-McGuire
RAID and Logical Volume Managers are great, until you lose data.
Restoring Access to the RAID Array Members
To recover, the first thing to do is to move the drive to another machine. You can do this pretty easily by putting the drive in a USB2 hard drive enclosure. It then will show up as a SCSI hard disk device, for example, /dev/sda, when you plug it in to your recovery computer. This reduces the risk of damaging the recovery machine while attempting to install the hardware from the original computer.
The challenge then is to get the RAID setup recognized and to gain access to the logical volumes within. You can use sfdisk -l /dev/sda to check that the partitions on the old drive are still there. To get the RAID setup recognized, use mdadm to scan the devices for their RAID volume UUID signatures, as shown in Listing 3.
Listing 3. Scanning a Disk for RAID Array Members
[root@recoverybox ~]# mdadm --examine --scan /dev/sda1 /dev/sda2 /dev/sda3
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=532502de:90e44fb0:242f485f:f02a2565
   devices=/dev/sda3
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=75fa22aa:9a11bcad:b42ed14a:b5f8da3c
   devices=/dev/sda2
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=b3cd99e7:d02be486:b0ea429a:e18ccf65
   devices=/dev/sda1
This format is very close to the format of the /etc/mdadm.conf file that the mdadm tool uses. You need to redirect the output of mdadm to a file, join the device lines onto the ARRAY lines and put in a nonexistent second device to get a RAID1 configuration. Viewing the md array in degraded mode will allow data recovery:
[root@recoverybox ~]# mdadm --examine --scan /dev/sda1 /dev/sda2 /dev/sda3 >> /etc/mdadm.conf
[root@recoverybox ~]# vi /etc/mdadm.conf
Edit /etc/mdadm.conf so that the devices statements are on the same lines as the ARRAY statements, as they are in Listing 4. Add the "missing" device to the devices entry for each array member to fill out the RAID1 complement of two devices per array. Don't forget to renumber the md entries if the recovery computer already has md devices and ARRAY statements in /etc/mdadm.conf.
Listing 4. /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=b3cd99e7:d02be486:b0ea429a:e18ccf65 devices=/dev/sda1,missing
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=75fa22aa:9a11bcad:b42ed14a:b5f8da3c devices=/dev/sda2,missing
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=532502de:90e44fb0:242f485f:f02a2565 devices=/dev/sda3,missing
Then, activate the new md devices with mdadm -A -s, and check /proc/mdstat to verify that the RAID array is active. Listing 5 shows how the RAID array should look.
Listing 5. Reactivating the RAID Array
[root@recoverybox ~]# mdadm -A -s
[root@recoverybox ~]# cat /proc/mdstat
Personalities : [raid1]
md2 : active raid1 sda3[1]
      77521536 blocks [2/1] [_U]
md1 : active raid1 sda2[1]
      522048 blocks [2/1] [_U]
md0 : active raid1 sda1[1]
      104320 blocks [2/1] [_U]
unused devices: <none>
If md devices show up in /proc/mdstat, all is well, and you can move on to getting the LVM volumes mounted again.
Recovering and Renaming the LVM2 Volume
The next hurdle is that the system now will have two sets of LVM2 disks with VolGroup00 in them. Typically, the vgchange -a y command would allow LVM2 to recognize a new volume group. That won't work if devices containing identical volume group names are present, though. Issuing vgchange -a y will report that VolGroup00 is inconsistent, and the VolGroup00 on the RAID device will be invisible. To fix this, you need to rename the volume group that you are about to mount on the system by hand-editing its LVM configuration file.
If you made a backup of the files in /etc on raidbox, you can edit a copy of the file /etc/lvm/backup/VolGroup00 so that it reads VolGroup01 or RestoreVG or whatever you want it to be named on the system you are going to restore under, making sure to change the volume group name inside the file as well as in the file name.
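A minimal sketch of that edit, assuming the old /etc was saved somewhere under /mnt/old-etc (that path and the VolGroup01 name are just example choices):
cp /mnt/old-etc/lvm/backup/VolGroup00 /etc/lvm/backup/VolGroup01
vi /etc/lvm/backup/VolGroup01    # change the line "VolGroup00 {" to "VolGroup01 {"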
If you don't have a backup, you can re-create the equivalent of an LVM2 backup file by examining the LVM2 header on the disk and editing out the binary stuff. LVM2 typically keeps copies of the metadata configuration at the beginning of the disk, in the first 255 sectors following the partition table in sector 1 of the disk. See /etc/lvm/lvm.conf and man lvm.conf for more details. Because each disk sector is typically 512 bytes, reading this area will yield a 128KB file. LVM2 may have stored several different text representations of the LVM2 configuration on the partition itself in the first 128KB. Extract these to an ordinary file as follows, then edit the file:
dd if=/dev/md2 bs=512 count=255 skip=1 of=/tmp/md2-raw-start
vi /tmp/md2-raw-start
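If wading through the raw dump in vi is slow going, strings can help you locate the text blocks first (just a convenience; the article edits the raw file directly):
strings /tmp/md2-raw-start | less    # shows only the printable text, including the LVM declarations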
You will see some binary gibberish, but look for the bits of plain text. LVM treats this metadata area as a ring buffer, so there may be multiple configuration entries on the disk. On my disk, the first entry had only the details for the physical volume and volume group, and the next entry had the logical volume information. Look for the block of text with the most recent timestamp, and edit out everything except the block of plain text that contains LVM declarations. This has the volume group declarations that include logical volumes information. Fix up physical device declarations if needed. If in doubt, look at the existing /etc/lvm/backup/VolGroup00 file to see what is there. On disk, the text entries are not as nicely formatted and are in a different order than in the normal backup file, but they will do. Save the trimmed configuration as VolGroup01. This file should then look like Listing 6.
Listing 6. Modified Volume Group Configuration File
VolGroup01 {
    id = "xQZqTG-V4wn-DLeQ-bJ0J-GEHB-4teF-A4PPBv"
    seqno = 1
    status = ["RESIZEABLE", "READ", "WRITE"]
    extent_size = 65536
    max_lv = 0
    max_pv = 0
    physical_volumes {
        pv0 {
            id = "tRACEy-cstP-kk18-zQFZ-ErG5-QAIV-YqHItA"
            device = "/dev/md2"
            status = ["ALLOCATABLE"]
            pe_start = 384
            pe_count = 2365
        }
    }
# Generated by LVM2: Sun Feb 5 22:57:19 2006
Once you have a volume group configuration file, migrate the volume group to this system with vgcfgrestore, as Listing 7 shows.
Listing 7. Activating the Recovered LVM2 Volume
[root@recoverybox ~]# vgcfgrestore -f VolGroup01 VolGroup01
[root@recoverybox ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup01" using metadata type lvm2
  Found volume group "VolGroup00" using metadata type lvm2
[root@recoverybox ~]# pvscan
  PV /dev/md2    VG VolGroup01   lvm2 [73.91 GB / 32.00 MB free]
  PV /dev/hda2   VG VolGroup00   lvm2 [18.91 GB / 32.00 MB free]
  Total: 2 [92.81 GB] / in use: 2 [92.81 GB] / in no VG: 0 [0   ]
[root@recoverybox ~]# vgchange VolGroup01 -a y
  1 logical volume(s) in volume group "VolGroup01" now active
[root@recoverybox ~]# lvscan
  ACTIVE            '/dev/VolGroup01/LogVol00' [73.88 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol00' [18.38 GB] inherit
  ACTIVE            '/dev/VolGroup00/LogVol01' [512.00 MB] inherit
At this point, you can mount the old volume on the new system and gain access to the files within, as shown in Listing 8.
Listing 8. Mounting the Recovered Volume
[root@recoverybox ~]# mount /dev/VolGroup01/LogVol00 /mnt
[root@recoverybox ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00    19G  4.7G   13G  28% /
/dev/hda1                          99M   12M   82M  13% /boot
none                              126M     0  126M   0% /dev/shm
/dev/mapper/VolGroup01-LogVol00    73G  2.5G   67G   4% /mnt
# ls -l /mnt
total 200
drwxr-xr-x   2 root root  4096 Feb  6 02:36 bin
drwxr-xr-x   2 root root  4096 Feb  5 18:03 boot
drwxr-xr-x   4 root root  4096 Feb  5 18:03 dev
drwxr-xr-x  79 root root 12288 Feb  6 23:54 etc
drwxr-xr-x   3 root root  4096 Feb  6 01:11 home
drwxr-xr-x   2 root root  4096 Feb 21  2005 initrd
drwxr-xr-x  11 root root  4096 Feb  6 02:36 lib
drwx------   2 root root 16384 Feb  5 17:59 lost+found
drwxr-xr-x   3 root root  4096 Feb  6 22:12 media
drwxr-xr-x   2 root root  4096 Oct  7 09:03 misc
drwxr-xr-x   2 root root  4096 Feb 21  2005 mnt
drwxr-xr-x   2 root root  4096 Feb 21  2005 opt
drwxr-xr-x   2 root root  4096 Feb  5 18:03 proc
drwxr-x---   5 root root  4096 Feb  7 00:19 root
drwxr-xr-x   2 root root 12288 Feb  6 22:37 sbin
drwxr-xr-x   2 root root  4096 Feb  5 23:04 selinux
drwxr-xr-x   2 root root  4096 Feb 21  2005 srv
drwxr-xr-x   2 root root  4096 Feb  5 18:03 sys
drwxr-xr-x   3 root root  4096 Feb  6 00:22 tftpboot
drwxrwxrwt   5 root root  4096 Feb  7 00:21 tmp
drwxr-xr-x  15 root root  4096 Feb  6 22:33 usr
drwxr-xr-x  20 root root  4096 Feb  5 23:15 var
Now that you have access to your data, a prudent final step would be to back up the volume group information with vgcfgbackup, as Listing 9 shows.
Listing 9. Backing Up Recovered Volume Group Configuration
[root@teapot-new ~]# vgcfgbackup
  Volume group "VolGroup01" successfully backed up.
  Volume group "VolGroup00" successfully backed up.
[root@teapot-new ~]# ls -l /etc/lvm/backup/
total 24
-rw------- 1 root root 1350 Feb 10 09:09 VolGroup00
-rw------- 1 root root 1051 Feb 10 09:09 VolGroup01
LVM2 and Linux software RAID make it possible to create economical, reliable storage solutions with commodity hardware. One trade-off involved is that some procedures for recovering from failure situations may not be clear. A tool that reliably extracted old volume group information directly from the disk would make recovery easier. Fortunately, the designers of the LVM2 system had the wisdom to keep plain-text backup copies of the configuration on the disk itself. With a little patience and some research, I was able to regain access to the logical volume I thought was lost; may you have as much success with your LVM2 and RAID installation.
Resources for this article: /article/8948.
Richard Bullington-McGuire is the Managing Partner of PKR Internet, LLC, a software and systems consulting firm in Arlington, Virginia, specializing in Linux, Open Source and Java. He has been a Linux sysadmin since 1994. You can reach him at rbulling@pkrinternet.com.