* [linux-lvm] Fedora Core 3 system with lvm2 won't boot
@ 2004-12-13 19:44 Dan Stromberg
  2004-12-13 23:53 ` [linux-lvm] " Dan Stromberg
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Stromberg @ 2004-12-13 19:44 UTC (permalink / raw)
  To: linux-lvm



Copied from https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=142737

Description of problem:
System won't boot

Version-Release number of selected component (if applicable):


How reproducible:
Probably difficult, but easy on my machine.  :)

Steps to Reproduce:
1. Shut down FC3 without sync'ing disks
2. Try to boot
3. It doesn't.
  
Actual results:
System won't boot

Expected results:
System should boot.

Additional info:
I have an FC3 system that was happy, but is now unhappy.  This may be
related to someone, who shall remain nameless, having shut off the power
on it without doing an orderly shutdown.  Then again, maybe it was
because of a "yum -y update", because I put off rebooting for a while
after that.

Anyway, now when it tries to boot, I see:

Red Hat nash version 4.1.18 starting
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
  2 logical volume(s) in volume group "VolGroup00" now active


...and that's it.  I've left it there for over an hour, and it never
gets past that.

I booted off of an FC3 rescue CD, and found that I could mount the /boot
partition, but I cannot mount the / partition.  I ran various lvm
commands that identified two lvm volumes on the system.
fsck'ing /dev/hda2 (which is /) is getting me nowhere though - it just
says "invalid argument".

I tried firing up device mapper and udev in order to get
a /dev/VolGroup00 directory, but it just wouldn't do it - at least, not
with the things I tried.  I could mkdir the directory, but then "lvm
vgmknodes" would remove it.

What do I need to do to get past this?  There's stuff in the filesystem
I want quite a bit.  :-S

I tried all 3 FC3 kernels I have on the system, but none would come up,
getting stuck at that same point.

When I boot up into the rescue CD and let it try to find my fedora
install, it gets really confused.  More specifically, it says:

Searching for Fedora Core installations...

        0%              install exited abnormally -- received signal 15
                                kernel panic - not syncing: Out of memory and no killable processes


If I remove "quiet" and add "single" to my boot options, I get:

EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: dm-0: orphan cleanup on readonly fs

...and there it hangs.


Also, I ran memtest86 on the box for a while (a little over an hour),
and found no errors.


* [linux-lvm] Re: Fedora Core 3 system with lvm2 won't boot
  2004-12-13 19:44 [linux-lvm] Fedora Core 3 system with lvm2 won't boot Dan Stromberg
@ 2004-12-13 23:53 ` Dan Stromberg
  2004-12-14  3:21   ` Robin Green
  0 siblings, 1 reply; 4+ messages in thread
From: Dan Stromberg @ 2004-12-13 23:53 UTC (permalink / raw)
  To: linux-lvm


Questions:

1) Anyone seen this before?  Redhat or otherwise?

2) Anyone know how I can boot from a rescue CD, and get to a point where
I can attempt to fsck the filesystem, or at least see where in bringing
up LVM2 the system is getting confused?

3) Should I just write off the data on that filesystem as a loss and
start over?  I had/have a bunch of financial records, an opensource app
I was developing, a pretty detailed html to palmdoc conversion setup,
and so on in the filesystem.

If this stuff is in a FAQ somewhere, please just tell me what to google
for.  I've google'd around already, and found essentially nothing.

Yes, I know, I should keep backups.

Thanks!



* Re: [linux-lvm] Re: Fedora Core 3 system with lvm2 won't boot
  2004-12-13 23:53 ` [linux-lvm] " Dan Stromberg
@ 2004-12-14  3:21   ` Robin Green
  2004-12-17 17:35     ` [linux-lvm] " Dan Stromberg
  0 siblings, 1 reply; 4+ messages in thread
From: Robin Green @ 2004-12-14  3:21 UTC (permalink / raw)
  To: LVM general discussion and development


On Mon, Dec 13, 2004 at 03:53:58PM -0800, Dan Stromberg wrote:
> 2) Anyone know how I can boot from a rescue CD, and get to a point where
> I can attempt to fsck the filesystem, or at least see where in bringing
> up LVM2 the system is getting confused?

I'm not an LVM2 expert, but I'll try and help.

> On Mon, 2004-12-13 at 11:44 -0800, Dan Stromberg wrote:
> > Anyway, now when it tries to boot, I see:
> > 
> > Red Hat nash version 4.1.18 starting
> >   Reading all physical volumes.  This may take a while...
> >   Found volume group "VolGroup00" using metadata type lvm2
> >   2 logical volume(s) in volume group "VolGroup00" now active
> > 
> > 
> > ...and that's it.  I've left it there for over an hour, and it never
> > gets past that.
> > 
> > I booted off of an FC3 rescue CD, and found that I could mount the /boot
> > partition, but I cannot mount the / partition.  I ran various lvm
> > commands that identified two lvm volumes on the system.
> > fsck'ing /dev/hda2 (which is /) is getting me nowhere though - it just
> > says "invalid argument".

Yes, it would: it looks like /dev/hda2 holds an LVM physical volume (sorry if my
terminology is incorrect, it's bloody confusing), and definitely not a filesystem
directly. So you don't want to fsck /dev/hda2!
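
One quick way to tell whether a partition belongs to LVM rather than holding a
filesystem directly is its partition type in the `fdisk -l` listing, where LVM
partitions appear as type 8e, "Linux LVM". The following is only a minimal
sketch: the function name lvm_partitions is made up for illustration, and it
assumes the classic MBR-style listing in which the type description is the last
field on each partition line.

```shell
# Hypothetical helper: read `fdisk -l` output on stdin and print the
# partitions whose type description is "Linux LVM" (type id 8e).
lvm_partitions() {
    while read -r dev rest; do
        case "$rest" in
            *"Linux LVM") printf '%s\n' "$dev" ;;
        esac
    done
}
```

On the rescue CD this would be run as `fdisk -l /dev/hda | lvm_partitions`. Any
partition it prints should be handed to LVM, not to fsck.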

> > I tried firing up device mapper and udev in order to get
> > a /dev/VolGroup00 directory, but it just wouldn't do it - at least, not
> > with the things I tried.  I could mkdir the directory, but then "lvm
> > vgmknodes" would remove it.

You don't actually have to use udev if you can't get it to work. udev is just
a userspace program which automates the grunt work of setting up a ramdisk-based
/dev filesystem.
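
To give an idea of what that grunt work amounts to (this is an illustration, not
a description of how udev itself is implemented): on a 2.6 kernel each block
device has a file /sys/block/<name>/dev containing its major:minor numbers, and
a device node is just a name attached to those numbers. The sketch below only
prints the mknod commands instead of running them. The function name
print_mknod_commands and its two arguments are made up for this example.

```shell
# Hypothetical helper: for every block device listed under <sysroot>/block,
# print the mknod command that would create its node under <devdir>.
# Nothing is created; the commands are only printed for review.
print_mknod_commands() {
    sysroot=$1
    devdir=$2
    for devfile in "$sysroot"/block/*/dev; do
        [ -r "$devfile" ] || continue     # skip if the glob matched nothing
        name=${devfile%/dev}              # strip the trailing /dev
        name=${name##*/}                  # keep only the device name, e.g. dm-0
        IFS=: read -r major minor < "$devfile"
        printf 'mknod %s/%s b %s %s\n' "$devdir" "$name" "$major" "$minor"
    done
}
```

Running `print_mknod_commands /sys /dev` and creating only the nodes you need is
usually enough for a rescue session.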

All you really need to do to gain access to the root filesystem is:

1) Note down the root= device that appears on the kernel command line (you can
find it by starting a boot from the hard drive and examining the kernel command
line in grub, or by looking in /boot/grub/grub.conf)

2) Boot from the rescue disk

3) Sanity check: ensure that the nodes /dev/hda, /dev/hda2 etc. exist

4) Start up LVM2 (assuming it is not already started by the rescue disk!) by
typing:

  lvm vgchange --ignorelockingfailure -P -a y

Looking at my initrd script, it doesn't seem necessary to run any other commands
to get LVM2 volumes activated - that's it.

5) Find out which major/minor number the root device is. This is the slightly tricky
bit. You may have to use trial-and-error. In my case, I guessed right first time:
(no comments about my odd hardware setup please ;)

[root@localhost t]# ls /sys/block
dm-0  dm-2  hdd    loop1  loop3  loop5  loop7  ram0  ram10  ram12  ram14  ram2  ram4  ram6  ram8
dm-1  hdc   loop0  loop2  loop4  loop6  md0    ram1  ram11  ram13  ram15  ram3  ram5  ram7  ram9
[root@localhost t]# cat /sys/block/dm-0/dev
253:0
[root@localhost t]# devmap_name 253 0
Volume01-LogVol02

In the first command, I listed the block devices known to the kernel. dm-* are the
device-mapper devices, which is where LVM2 volumes show up (on my 2.6.9 kernel, anyway).
In the second command, I found out the major:minor numbers of /dev/dm-0. In the third
command, I used devmap_name to check that the device mapper name of the node with major
253 and minor 0 is the same as the name of the root device from my kernel command line
(cf. step 1). Apart from a slight punctuation difference, it is the same, therefore I
have found the root device.

I'm not sure if FC3 includes the devmap_name command. According to fr2.rpmfind.net, it doesn't.
But you don't really need it, you can just try all the LVM devices in turn until you find
your root device. Or, I can email you a statically-linked binary of it if you want.

6) Create the /dev node for the root filesystem if it doesn't already exist, e.g.:

  mknod /dev/dm-0 b 253 0

using the major-minor numbers found in step 5.

Please note that for the purpose of _rescue_, the node doesn't actually have to be under
/dev (so /dev doesn't have to be writeable) and its name does not matter. It just needs
to exist somewhere on a filesystem, and you have to refer to it in the next command.

7) Do what you want to the root filesystem, e.g.:

  fsck /dev/dm-0
  mount /dev/dm-0 /where/ever

As you probably know, the fsck might actually work, because a fsck can sometimes
correct filesystem errors that the kernel filesystem modules cannot.

8) If the fsck doesn't work, look in the output of fsck and in dmesg for signs of
physical drive errors. If you find them, (a) think about calling a data recovery
specialist, (b) do NOT use the drive!

I _think_ that is the right order to do things in, but I'm not 100% sure.
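
The "slight punctuation difference" in step 5 follows a fixed rule: the device
mapper name is the volume group name and the logical volume name joined by a
hyphen, with any hyphen inside either name doubled. A rough sketch of that
mapping follows, assuming the root= value has the form /dev/<VG>/<LV> (it will
not make sense of forms such as LABEL=/). The function name dm_name_for_lv_path
is invented for this example.

```shell
# Hypothetical helper: turn a root= path such as /dev/Volume01/LogVol02
# into the device-mapper name that devmap_name would report for it.
# Assumes the argument has the form /dev/<VG>/<LV>.
dm_name_for_lv_path() {
    path=${1#/dev/}
    vg=${path%%/*}
    lv=${path#*/}
    # Hyphens inside a VG or LV name are doubled, so the single
    # hyphen joining the two names stays unambiguous.
    printf '%s-%s\n' \
        "$(printf '%s' "$vg" | sed 's/-/--/g')" \
        "$(printf '%s' "$lv" | sed 's/-/--/g')"
}
```

With my setup, `dm_name_for_lv_path /dev/Volume01/LogVol02` prints
Volume01-LogVol02, which is what to compare against the devmap_name output.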

If this works, I expect a reward, ta.

-- 
Robin



* [linux-lvm] Re: Re: Fedora Core 3 system with lvm2 won't boot
  2004-12-14  3:21   ` Robin Green
@ 2004-12-17 17:35     ` Dan Stromberg
  0 siblings, 0 replies; 4+ messages in thread
From: Dan Stromberg @ 2004-12-17 17:35 UTC (permalink / raw)
  To: linux-lvm


Robin, I owe you.

I'd like to repay you either by coding a small to medium-sized app for you
in C, bash or python, or by buying you a tea/coffee/beer/wine of your
choice (one of moderate price :).

What would you prefer?

This is what worked for me:

On FC3's recovery cdrom...

1) Do start up the network interfaces
2) Don't try to automatically mount the filesystems - not even readonly
3) lvm vgchange --ignorelockingfailure -P -a y
4) fdisk -l, and guess which partition is which based on size: the small one was /boot, and the large one was /
5) mkdir /mnt/boot
6) mount /dev/hda1 /mnt/boot
7) Look up the device node for the root filesystem in /mnt/boot/grub/grub.conf
8) A first tentative step, to see if things are working: fsck -n /dev/VolGroup00/LogVol00
9) Dive in: fsck -f -y /dev/VolGroup00/LogVol00
10) Wait a while...  Be patient.  Don't interrupt it
11) Reboot
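
For step 7, something along these lines can pull the root device out of
grub.conf instead of reading it by eye. It is only a sketch: the function name
roots_from_grub_conf is made up, and it assumes each boot entry has a "kernel"
line carrying a root= parameter, as in a stock FC3 grub.conf.

```shell
# Hypothetical helper: print each distinct root= value found on the
# "kernel" lines of the grub.conf file given as the first argument.
roots_from_grub_conf() {
    while read -r keyword rest; do
        [ "$keyword" = "kernel" ] || continue   # skips comments and other lines
        for word in $rest; do
            case "$word" in
                root=*) printf '%s\n' "${word#root=}" ;;
            esac
        done
    done < "$1" | sort -u
}
```

With /boot mounted as in step 6, `roots_from_grub_conf /mnt/boot/grub/grub.conf`
should print the device to pass to fsck -n in step 8.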
