linux-raid.vger.kernel.org archive mirror
* CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
@ 2012-10-03 11:17 Arun Khan
  2012-10-03 14:23 ` John Robinson
  0 siblings, 1 reply; 8+ messages in thread
From: Arun Khan @ 2012-10-03 11:17 UTC
  To: Linux MDADM Raid

I had posted the following in the CentOS General mailing list but the
problem remains unresolved.

Here is the link to the CentOS thread:
<http://lists.centos.org/pipermail/centos/2012-June/126927.html>

I am posting it here in the hope that someone will be able to help me
pinpoint the "gotcha" and fix it.

In case you are able to help, please read all the postings in the
above thread, to see what others have suggested and what I have
already tried, so as to avoid repetition of questions and answers.

My server environment:
CentOS 6.2 amd64 (min. server install)
2 virtual hard disks of 10GB each
Running as Linux KVM guest OS

Following the instructions on CentOS Wiki
<http://wiki.centos.org/HowTos/Install_On_Partitionable_RAID1>

I installed a minimal server as a Linux KVM guest (launch script shown below).

<script>
#!/bin/bash
nic_mac_addr0=00:07:43:53:2b:bb

kvm \
-vga std \
-m 1024 \
-cpu core2duo \
-smp 2,cores=2 \
-drive file=/home/arunk/KVM/vdisks/centos62.raid1.disk1.img \
-drive file=/home/arunk/KVM/vdisks/centos62.raid1.disk2.img \
-net nic,vlan=1,model=e1000,macaddr=${nic_mac_addr0} \
-net tap,vlan=1,ifname=tap0,script=no,downscript=no

</script>

The system boots fine when *both* disks are online.

However, when I remove either of the disks (by deleting its -drive file=
line), the GRUB menu is displayed and the boot progress bar advances to
about the halfway point, and then the kernel panics with:

Kernel panic - not syncing: Attempted to kill init!
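
[Editor's sketch] The degraded-boot test described above can be scripted instead of hand-editing the launch script each time. This is a minimal sketch, assuming the same image paths as in the script above. `kvm_cmd` is a hypothetical helper that only prints the command line so it can be reviewed before being run. The network options are left out for brevity.

```bash
#!/bin/bash
# Hypothetical helper: build a kvm command line that attaches only the
# disk images passed as arguments. It prints the command; it does not run it.
kvm_cmd() {
    local cmd="kvm -vga std -m 1024 -cpu core2duo -smp 2,cores=2"
    local img
    for img in "$@"; do
        cmd="$cmd -drive file=$img"
    done
    printf '%s\n' "$cmd"
}

disk1=/home/arunk/KVM/vdisks/centos62.raid1.disk1.img
disk2=/home/arunk/KVM/vdisks/centos62.raid1.disk2.img

kvm_cmd "$disk1" "$disk2"   # both mirrors present
kvm_cmd "$disk1"            # simulate loss of the second disk
kvm_cmd "$disk2"            # simulate loss of the first disk
```

Booting each single-disk variant in turn covers both failure cases without touching the working two-disk script.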

I am doing this testing in a VM environment before deploying such a
setup on bare metal.
The disk failure test is failing. My understanding is that such a
setup should behave the same whether it is a VM or bare metal.

Any suggestions/ideas as to what I may be doing incorrectly?

FWIW, here are the outputs of "fdisk -l" and "df -hT":

<fdisk -l>
root@centos62-raid1 ~ >
# fdisk -l

Disk /dev/sda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e8353

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         523     4194304   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2             523        1045     4194304   83  Linux
/dev/sda3            1045        1176     1048576   82  Linux swap / Solaris

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e8353

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         523     4194304   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sdb2             523        1045     4194304   83  Linux
/dev/sdb3            1045        1176     1048576   82  Linux swap / Solaris

Disk /dev/md_d0: 10.7 GB, 10737352704 bytes
2 heads, 4 sectors/track, 2621424 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000e8353

      Device Boot      Start         End      Blocks   Id  System
/dev/md_d0p1   *         257     1048832     4194304   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/md_d0p2         1048833     2097408     4194304   83  Linux
Partition 2 does not end on cylinder boundary.
/dev/md_d0p3         2097409     2359552     1048576   82  Linux swap / Solaris
Partition 3 does not end on cylinder boundary

</fdisk -l>

<df -hT>

# df -hT
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/md_d0p1  ext4    4.0G  1.9G  1.9G  50% /
tmpfs        tmpfs    499M     0  499M   0% /dev/shm
/dev/md_d0p2  ext4    4.0G  136M  3.7G   4% /home

</df -hT>

Thanks.
-- Arun Khan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-03 11:17 CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present Arun Khan
@ 2012-10-03 14:23 ` John Robinson
  2012-10-03 16:48   ` Arun Khan
  2012-10-03 17:45   ` Keith Keller
  0 siblings, 2 replies; 8+ messages in thread
From: John Robinson @ 2012-10-03 14:23 UTC
  To: Arun Khan; +Cc: Linux MDADM Raid

On 03/10/2012 12:17, Arun Khan wrote:
> I had posted the following in the CentOS General mailing list but the
> problem remains unresolved.
>
> Here is the link to the CentOS thread:
> <http://lists.centos.org/pipermail/centos/2012-June/126927.html>
>
> I am posting it here in the hope that some one will be able to help me
> pinpoint the "gotcha" and fix it.
>
> In case you are able to help, please read all the postings in the
> above thread, to see what others have suggested and what I have
> already tried, so as to avoid repetition of questions and answers.

Please post the full output of
   cat /proc/mdstat
   cat /etc/mdadm.conf
   mdadm -Evvs
   file -s /dev/sda
   file -s /dev/sda1
   file -s /dev/sdb
   file -s /dev/sdb1
   cat /boot/grub/grub.conf
(from inside the running VM, obviously).
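
[Editor's sketch] The commands requested above can be gathered into a single labelled report that is easy to paste into a reply. This is a minimal sketch. `collect` is a hypothetical helper name, and the usage line assumes the GRUB config lives at /boot/grub/grub.conf, which is where it is read from later in this thread.

```bash
#!/bin/bash
# Hypothetical helper: run each diagnostic command and label its output.
# A failing command does not stop the run; its exit status is noted instead.
collect() {
    local cmd
    for cmd in "$@"; do
        printf '=== %s ===\n' "$cmd"
        sh -c "$cmd" 2>&1 || printf '(exit status %s)\n' "$?"
        printf '\n'
    done
}

# Usage (as root, inside the running VM):
# collect 'cat /proc/mdstat' 'cat /etc/mdadm.conf' 'mdadm -Evvs' \
#         'file -s /dev/sda' 'file -s /dev/sdb' \
#         'cat /boot/grub/grub.conf' > raid-report.txt
```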

I have to say I don't like the wiki article you quoted as a method of 
installation, but let's see if we can fix it before starting out doing 
it another way.

In addition, you probably ought to be testing on bare metal similar to 
your future production box.

Cheers,

John.


* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-03 14:23 ` John Robinson
@ 2012-10-03 16:48   ` Arun Khan
       [not found]     ` <CAHhM8gAOfUXwPet1EY3gBa_9--K2opvEYFmOxBh9KJnHSCr_Hw@mail.gmail.com>
  2012-10-03 17:45   ` Keith Keller
  1 sibling, 1 reply; 8+ messages in thread
From: Arun Khan @ 2012-10-03 16:48 UTC
  To: Linux MDADM Raid

On Wed, Oct 3, 2012 at 7:53 PM, John Robinson wrote:

> Please post the full output of
>   cat /proc/mdstat

# cat /proc/mdstat
Personalities : [raid1]
md_d0 : active raid1 sda[0] sdb[1]
      10485696 blocks [2/2] [UU]

unused devices: <none>
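
[Editor's sketch] In the output above, `[2/2] [UU]` means both mirror members are active. After a disk is lost, the status would be expected to read `[2/1]` with a `_` in place of one `U`. This is a minimal sketch that lists arrays in that state. `degraded_arrays` is a hypothetical helper name, and it assumes the two-line mdstat layout shown above.

```bash
#!/bin/bash
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# name of every array whose member status contains "_" (a missing member).
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                           # remember the array name
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }  # "_" marks a missing member
    '
}

# Usage on a live system:
# degraded_arrays < /proc/mdstat
```

No output means every array listed has all of its members.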



>   cat /etc/mdadm.conf
# cat /etc/mdadm.conf
ARRAY /dev/md_d0 metadata=0.90 UUID=e04bec5e:534382ba:bfe78010:bc810f04

>   mdadm -Evvs

# mdadm -Evvs
mdadm: No md superblock detected on /dev/md_d0p3.
mdadm: No md superblock detected on /dev/md_d0p2.
mdadm: No md superblock detected on /dev/root.
/dev/md_d0:
   MBR Magic : aa55
Partition[0] :      8388608 sectors at         2048 (type fd)
Partition[1] :      8388608 sectors at      8390656 (type fd)
Partition[2] :      2097152 sectors at     16779264 (type 82)
/dev/sdb:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e04bec5e:534382ba:bfe78010:bc810f04
  Creation Time : Tue Jun 12 00:20:36 2012
     Raid Level : raid1
  Used Dev Size : 10485696 (10.00 GiB 10.74 GB)
     Array Size : 10485696 (10.00 GiB 10.74 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Wed Oct  3 22:04:38 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : fa05f4fd - correct
         Events : 71


      Number   Major   Minor   RaidDevice State
this     1       8       16        1      active sync   /dev/sdb

   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
/dev/sda:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : e04bec5e:534382ba:bfe78010:bc810f04
  Creation Time : Tue Jun 12 00:20:36 2012
     Raid Level : raid1
  Used Dev Size : 10485696 (10.00 GiB 10.74 GB)
     Array Size : 10485696 (10.00 GiB 10.74 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Wed Oct  3 22:04:38 2012
          State : active
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : fa05f4eb - correct
         Events : 71


      Number   Major   Minor   RaidDevice State
this     0       8        0        0      active sync   /dev/sda

   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
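
[Editor's sketch] The two superblocks above agree on UUID and on the event count (71), which is what a healthy mirror should show. This is a minimal sketch for pulling one field out of `mdadm -E` output so the members can be compared in a script. `sb_field` is a hypothetical helper name, and it assumes the `Name : value` layout shown above.

```bash
#!/bin/bash
# Hypothetical helper: print the value of one "Name : value" field from
# mdadm -E output read on stdin (first match only).
sb_field() {
    awk -F' : ' -v key="$1" '
        { k = $1; gsub(/^ +| +$/, "", k) }   # strip padding around the name
        k == key { print $2; exit }
    '
}

# Usage (as root), one member device at a time:
# a=$(mdadm -E /dev/sda | sb_field Events)
# b=$(mdadm -E /dev/sdb | sb_field Events)
# [ "$a" = "$b" ] && echo "event counts match" || echo "members out of sync"
```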

>   file -s /dev/sda

# file -s /dev/sda
/dev/sda: x86 boot sector; GRand Unified Bootloader, stage1 version
0x3, boot drive 0x80, 1st sector stage2 0x443840, GRUB version 0.94;
partition 1: ID=0xfd, active, starthead 32, startsector 2048, 8388608
sectors; partition 2: ID=0xfd, starthead 75, startsector 8390656,
8388608 sectors; partition 3: ID=0x82, starthead 254, startsector
16779264, 2097152 sectors, code offset 0x48

>   file -s /dev/sda1
# file -s /dev/sda1
/dev/sda1: cannot open `/dev/sda1' (No such file or directory)

>   file -s /dev/sdb
# file -s /dev/sdb
/dev/sdb: x86 boot sector; GRand Unified Bootloader, stage1 version
0x3, boot drive 0x80, 1st sector stage2 0x443840, GRUB version 0.94;
partition 1: ID=0xfd, active, starthead 32, startsector 2048, 8388608
sectors; partition 2: ID=0xfd, starthead 75, startsector 8390656,
8388608 sectors; partition 3: ID=0x82, starthead 254, startsector
16779264, 2097152 sectors, code offset 0x48


>   file -s /dev/sdb1

# file -s /dev/sdb1
/dev/sdb1: cannot open `/dev/sdb1' (No such file or directory)

>   cat /boot/grub/grub.conf

# cat /boot/grub/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You do not have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /, eg.
#          root (hd0,0)
#          kernel /boot/vmlinuz-version ro root=/dev/sda1
#          initrd /boot/initrd-[generic-]version.img
#boot=/dev/sda
default=0
timeout=2
# splashimage=(hd0,0)/boot/grub/splash.xpm.gz
# hiddenmenu
title CentOS (2.6.32-220.el6.x86_64)
	root (hd0,0)
	kernel /boot/vmlinuz-2.6.32-220.el6.x86_64 ro root=/dev/md_d0p1 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 crashkernel=auto  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM
	initrd /boot/initramfs-2.6.32-220.el6.x86_64.img

> I have to say I don't like the wiki article you quoted as a method of
> installation, but let's see if we can fix it before starting out doing it
> another way.

It looked like a different approach from carving out disk partitions
and creating a RAID1 on each pair of partitions. However, it is no good
if the system becomes unusable on disk failure, as I found out while
"break" testing.

> In addition, you probably ought to be testing on bare metal similar to your
> future production box.

Agreed, but I don't have spare hardware and therefore went the VM way.

Let me know if you need any other information regarding the setup.

Thanks,
-- Arun Khan


* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-03 14:23 ` John Robinson
  2012-10-03 16:48   ` Arun Khan
@ 2012-10-03 17:45   ` Keith Keller
  1 sibling, 0 replies; 8+ messages in thread
From: Keith Keller @ 2012-10-03 17:45 UTC
  To: linux-raid

On 2012-10-03, John Robinson <john.robinson@anonymous.org.uk> wrote:
>
> I have to say I don't like the wiki article you quoted as a method of 
> installation, but let's see if we can fix it before starting out doing 
> it another way.

Can you briefly describe your objections?  I'm looking at how to install
CentOS to a bootable RAID1, and am trying to evaluate that method
versus the method supported by anaconda (which, AFAICT, is partitioning
each drive and putting a RAID1 over matching partitions) versus whatever
other methods I can find.

--keith


-- 
kkeller@wombat.san-francisco.ca.us




* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
       [not found]     ` <CAHhM8gAOfUXwPet1EY3gBa_9--K2opvEYFmOxBh9KJnHSCr_Hw@mail.gmail.com>
@ 2012-10-07  9:06       ` John Robinson
  2012-10-07 16:13         ` Arun Khan
  0 siblings, 1 reply; 8+ messages in thread
From: John Robinson @ 2012-10-07  9:06 UTC
  To: Arun Khan; +Cc: Linux RAID

On 07/10/2012 07:24, Arun Khan wrote:
> Hi John,
>
> I posted the requested info to the list.
>
> I was wondering if you had a chance to go through it.
>
> Either way do let me know.  I am anxious to get this solved if possible.

Sorry, I was distracted by $realjob...

The only thing that occurs to me is:

>> # cat /boot/grub/grub.conf
>> # grub.conf generated by anaconda
>> #
>> # Note that you do not have to rerun grub after making changes to this file
>> # NOTICE:  You do not have a /boot partition.  This means that
>> #          all kernel and initrd paths are relative to /, eg.
>> #          root (hd0,0)
>> #          kernel /boot/vmlinuz-version ro root=/dev/sda1
>> #          initrd /boot/initrd-[generic-]version.img
>> #boot=/dev/sda

this. Uncomment it, and change to boot=/dev/md_d0p1, and re-run 
`grub-install`.

Also, just to be sure, if you haven't installed a new kernel since 
migrating to RAID, update your initrd - something like `mkinitrd -f 
/boot/initramfs-2.6.32-220.el6.x86_64.img 2.6.32-220.el6.x86_64`.
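
[Editor's sketch] On RHEL/CentOS 6 the initramfs is built by dracut, so the rebuild suggested above can also be written as a dracut call. This is a minimal sketch. `rebuild_cmds` is a hypothetical helper that only prints the commands for review. The backup step and the `lsinitrd` check for mdadm are the editor's additions, not steps anyone in the thread reported running. The kernel version argument is assumed to be the full `uname -r` string, including the architecture suffix.

```bash
#!/bin/bash
# Hypothetical helper: print (not run) the commands to back up, rebuild and
# inspect the initramfs for a given kernel version (default: running kernel).
rebuild_cmds() {
    local kver=${1:-$(uname -r)}
    local img=/boot/initramfs-$kver.img
    printf 'cp -p %s %s.bak\n' "$img" "$img"
    printf 'dracut -f %s %s\n' "$img" "$kver"
    printf 'lsinitrd %s | grep -i mdadm\n' "$img"
}

rebuild_cmds 2.6.32-220.el6.x86_64
```

If the final `grep` finds nothing, the image probably lacks the md tools needed to assemble the array at boot.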

I can't think of anything else.

Cheers,

John.



* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-07  9:06       ` John Robinson
@ 2012-10-07 16:13         ` Arun Khan
  2012-10-09 15:24           ` Arun Khan
  0 siblings, 1 reply; 8+ messages in thread
From: Arun Khan @ 2012-10-07 16:13 UTC
  To: Linux RAID

On Sun, Oct 7, 2012 at 2:36 PM, John Robinson
<john.robinson@anonymous.org.uk> wrote:
> On 07/10/2012 07:24, Arun Khan wrote:
>>
>> Hi John,
>>
>> I posted the requested info to the list.
>>
>> I was wondering if you had a chance to go through it.
>>
>> Either way do let me know.  I am anxious to get this solved if possible.
>
>
> Sorry, I was distracted by $realjob...

No issues.   We all have to worry about putting bread on the dinner table :)

> The only thing that occurs to me is:
>
>
>>> # cat /boot/grub/grub.conf
>>> # grub.conf generated by anaconda
>>> #
>>> # Note that you do not have to rerun grub after making changes to this
>>> file
>>> # NOTICE:  You do not have a /boot partition.  This means that
>>> #          all kernel and initrd paths are relative to /, eg.
>>> #          root (hd0,0)
>>> #          kernel /boot/vmlinuz-version ro root=/dev/sda1
>>> #          initrd /boot/initrd-[generic-]version.img
>>> #boot=/dev/sda
>
>
> this. Uncomment it, and change to boot=/dev/md_d0p1, and re-run
> `grub-install`.
>

OK.   Will do so and let you know.

> Also, just to be sure, if you haven't installed a new kernel since migrating
> to RAID, update your initrd - something like `mkinitrd -f
> /boot/initramfs-2.6.32-220.el6.x86_64.img 2.6.32-220.el6`.
>

No, I have not updated any package, let alone the kernel.

Thanks,
-- Arun Khan


* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-07 16:13         ` Arun Khan
@ 2012-10-09 15:24           ` Arun Khan
  2012-10-15 10:40             ` Arun Khan
  0 siblings, 1 reply; 8+ messages in thread
From: Arun Khan @ 2012-10-09 15:24 UTC
  To: Linux RAID

On Sun, Oct 7, 2012 at 9:43 PM, Arun Khan wrote:
> On Sun, Oct 7, 2012 at 2:36 PM, John Robinson wrote:
>> The only thing that occurs to me is:
>>
>>
>>>> # cat /boot/grub/grub.conf
>>>> # grub.conf generated by anaconda
>>>> #
>>>> # Note that you do not have to rerun grub after making changes to this
>>>> file
>>>> # NOTICE:  You do not have a /boot partition.  This means that
>>>> #          all kernel and initrd paths are relative to /, eg.
>>>> #          root (hd0,0)
>>>> #          kernel /boot/vmlinuz-version ro root=/dev/sda1
>>>> #          initrd /boot/initrd-[generic-]version.img
>>>> #boot=/dev/sda
>>
>>
>> this. Uncomment it, and change to boot=/dev/md_d0p1, and re-run
>> `grub-install`.
>>

I made the change suggested above to grub.conf.

There was an update for grub for CentOS 6.2 which I installed.

I also changed /boot/grub/device.map to include both disks (sda and sdb).

# grub-install /dev/sda

No joy.  I get an error:

               The file /boot/grub/stage1 not read correctly.

I also attempted the grub installation on /dev/md_d0 and on /dev/sdb,
and got the same error.

I then commented out the boot=/dev/... line in grub.conf (in effect
restoring the file to its original content) and still get the same error!

Looks like Grub and md_d0 do not get along :(
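
[Editor's sketch] A commonly used alternative to `grub-install` with GRUB legacy on a mirror is to drive the grub shell directly, mapping each member disk as (hd0) in turn so that either disk can boot alone. This is a minimal sketch, assuming the `(hd0,0)` root used in the grub.conf posted earlier. `grub_batch` is a hypothetical helper that only prints the grub shell commands. Nobody in the thread tried this, and the resolution posted at the end points to dracut rather than GRUB as the cause of the panic.

```bash
#!/bin/bash
# Hypothetical helper: print the GRUB legacy shell commands that would
# install stage1 to the MBR of one member disk, treating it as (hd0).
grub_batch() {
    local disk=$1
    cat <<EOF
device (hd0) $disk
root (hd0,0)
setup (hd0)
quit
EOF
}

for d in /dev/sda /dev/sdb; do
    grub_batch "$d"    # after review, pipe each script to: grub --batch
done
```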

Upon querying Uncle Google, I came across this link, which describes
something similar to what I am experiencing.

<http://idolinux.blogspot.in/2009/07/reinstall-grub-bootloader-on-md0.html>

I will report back when I hit upon a solution.

Meanwhile, do let me know if other ideas come to your mind.

I am also considering changing the bootloader to lilo or extlinux.
Please share any experience in configuring either for raid1.

Thanks,
-- Arun Khan


* Re: CentOS 6.2 on partition able RAID1 (md_d0) - kernel panic with either disk not present
  2012-10-09 15:24           ` Arun Khan
@ 2012-10-15 10:40             ` Arun Khan
  0 siblings, 0 replies; 8+ messages in thread
From: Arun Khan @ 2012-10-15 10:40 UTC
  To: Linux RAID

___ SOLVED ___

After experimenting with the boot loader changes (without any success
in solving the problem), I did some searching on the other suspect,
'dracut', the replacement for mkinitrd in RHEL/CentOS 6.

There were two bug reports on the tool (6.2 and 6.3).

I updated dracut to dracut-004-284.el6_3.1.noarch and that fixes the problem.

Details on the procedure posted in the CentOS mailing list.
Ref. <http://lists.centos.org/pipermail/centos/2012-October/129654.html>
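
[Editor's sketch] To check whether a system already has the dracut build reported as working above, the installed version can be compared against 004-284.el6_3.1. This is a minimal sketch. `version_ge` is a hypothetical helper built on GNU `sort -V`, which only approximates RPM's own version ordering. The rebuild step in the usage comments is an assumption: updating the package presumably does not regenerate an existing initramfs, and the exact procedure is in the linked CentOS post.

```bash
#!/bin/bash
# Hypothetical helper: succeed if version string $1 is at least $2,
# using GNU sort -V (an approximation of RPM version comparison).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed=004-284.el6_3.1   # the dracut version reported as working above

# Usage on the affected system (as root):
# installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' dracut)
# if version_ge "$installed" "$fixed"; then
#     echo "dracut $installed is at or beyond $fixed"
# else
#     echo "dracut $installed is older than $fixed: yum update dracut,"
#     echo "then rebuild the initramfs (assumed to be required)"
# fi
```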

Thanks to all who helped.

-- Arun Khan

