* RAID1 recovery fails with 2.6 kernel
@ 2003-10-19 14:27 Dick Streefland
  2003-10-20  6:15 ` Neil Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Dick Streefland @ 2003-10-19 14:27 UTC (permalink / raw)
  To: linux-raid

When I mark one of the devices in a RAID1 array faulty, remove it, and
re-add it, recovery starts, but stops too early, leaving the array in
degraded mode. I'm seeing this in the latest 2.6.0 kernels, including
2.6.0-test8. The 2.4.22 kernel works OK, although the array is marked
"dirty" (?). Below is a script to reproduce the problem, followed by
the output of the script. I'm using mdadm-1.3.0.

# cat raid-recovery
#!/bin/sh

# create two 20 MB backing files and bind them to loop devices
dd bs=1024k count=20 if=/dev/zero of=/tmp/img1 2> /dev/null
dd bs=1024k count=20 if=/dev/zero of=/tmp/img2 2> /dev/null
sync
losetup /dev/loop1 /tmp/img1
losetup /dev/loop2 /tmp/img2
sync

# create the mirror and wait for the initial resync to finish
mdadm -C -n 2 -l 1 /dev/md0 /dev/loop1 /dev/loop2
sleep 1
while grep resync /proc/mdstat; do sleep 1; done
sleep 1
cat /proc/mdstat
mdadm -QD /dev/md0

# fail and remove the second mirror half
mdadm /dev/md0 -f /dev/loop2
mdadm /dev/md0 -r /dev/loop2
mdadm -QD /dev/md0

# re-add it and wait for recovery; the array should end up as [2/2] [UU]
mdadm /dev/md0 -a /dev/loop2
while grep recovery /proc/mdstat; do sleep 1; done
sleep 1
cat /proc/mdstat
mdadm -QD /dev/md0

# tear everything down again
mdadm -S /dev/md0
losetup -d /dev/loop1
losetup -d /dev/loop2
rm /tmp/img1
rm /tmp/img2
# ./raid-recovery 
mdadm: array /dev/md0 started.
      [=>...................]  resync =  5.0% (1280/20416) finish=0.2min speed=1280K/sec
      [==>..................]  resync = 10.0% (2176/20416) finish=0.2min speed=1088K/sec
      [===>.................]  resync = 15.0% (3200/20416) finish=0.2min speed=1066K/sec
      [====>................]  resync = 20.0% (4224/20416) finish=0.2min speed=1056K/sec
      [=====>...............]  resync = 25.0% (5248/20416) finish=0.2min speed=1049K/sec
      [======>..............]  resync = 30.0% (6144/20416) finish=0.2min speed=1024K/sec
      [=======>.............]  resync = 35.0% (8064/20416) finish=0.1min speed=1152K/sec
      [========>............]  resync = 40.0% (9088/20416) finish=0.1min speed=1136K/sec
      [=========>...........]  resync = 45.0% (10112/20416) finish=0.1min speed=1123K/sec
      [==========>..........]  resync = 50.0% (11008/20416) finish=0.1min speed=1100K/sec
      [===========>.........]  resync = 55.0% (12032/20416) finish=0.1min speed=1093K/sec
      [============>........]  resync = 60.0% (13056/20416) finish=0.1min speed=1088K/sec
      [=============>.......]  resync = 65.0% (14080/20416) finish=0.0min speed=1083K/sec
      [==============>......]  resync = 70.0% (15104/20416) finish=0.0min speed=1078K/sec
      [===============>.....]  resync = 75.0% (16000/20416) finish=0.0min speed=1066K/sec
      [================>....]  resync = 80.0% (17024/20416) finish=0.0min speed=1064K/sec
      [=================>...]  resync = 85.0% (18048/20416) finish=0.0min speed=1061K/sec
      [==================>..]  resync = 90.0% (19072/20416) finish=0.0min speed=1059K/sec
      [===================>.]  resync = 95.0% (20096/20416) finish=0.0min speed=1057K/sec
Personalities : [raid1] 
md0 : active raid1 loop2[1] loop1[0]
      20416 blocks [2/2] [UU]
      
unused devices: <none>
/dev/md0:
        Version : 00.90.01
  Creation Time : Sun Oct 19 15:53:13 2003
     Raid Level : raid1
     Array Size : 20416 (19.94 MiB 20.91 MB)
    Device Size : 20416 (19.94 MiB 20.91 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Oct 19 15:53:34 2003
          State : clean, no-errors
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       7        1        0      active sync   /dev/loop1
       1       7        2        1      active sync   /dev/loop2
           UUID : 102b76c8:1af754b8:5c8d47a0:fe849836
         Events : 0.1
mdadm: set device faulty failed for /dev/loop2:  Success
mdadm: hot removed /dev/loop2
/dev/md0:
        Version : 00.90.01
  Creation Time : Sun Oct 19 15:53:13 2003
     Raid Level : raid1
     Array Size : 20416 (19.94 MiB 20.91 MB)
    Device Size : 20416 (19.94 MiB 20.91 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Oct 19 15:53:35 2003
          State : clean, no-errors
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0


    Number   Major   Minor   RaidDevice State
       0       7        1        0      active sync   /dev/loop1
       1       0        0       -1      removed
           UUID : 102b76c8:1af754b8:5c8d47a0:fe849836
         Events : 0.3
mdadm: hot added /dev/loop2
      [=>...................]  recovery =  5.0% (1024/20416) finish=0.2min speed=1024K/sec
      [==>..................]  recovery = 10.0% (2048/20416) finish=0.1min speed=2048K/sec
Personalities : [raid1] 
md0 : active raid1 loop2[2] loop1[0]
      20416 blocks [2/1] [U_]
      
unused devices: <none>
/dev/md0:
        Version : 00.90.01
  Creation Time : Sun Oct 19 15:53:13 2003
     Raid Level : raid1
     Array Size : 20416 (19.94 MiB 20.91 MB)
    Device Size : 20416 (19.94 MiB 20.91 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Oct 19 15:53:37 2003
          State : clean, no-errors
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1


    Number   Major   Minor   RaidDevice State
       0       7        1        0      active sync   /dev/loop1
       1       0        0       -1      removed
       2       7        2        1      spare   /dev/loop2
           UUID : 102b76c8:1af754b8:5c8d47a0:fe849836
         Events : 0.5

-- 
Dick Streefland                    ////               De Bilt
dick.streefland@xs4all.nl         (@ @)       The Netherlands
------------------------------oOO--(_)--OOo------------------


* Re: RAID1 recovery fails with 2.6 kernel
  2003-10-19 14:27 RAID1 recovery fails with 2.6 kernel Dick Streefland
@ 2003-10-20  6:15 ` Neil Brown
  2003-10-20  8:43   ` Dick Streefland
  2003-10-22 22:54   ` Kernel OOps: bad magic 0 while RAID5 resync operation Bo Moon
  0 siblings, 2 replies; 7+ messages in thread
From: Neil Brown @ 2003-10-20  6:15 UTC (permalink / raw)
  To: Dick Streefland; +Cc: linux-raid

On Sunday October 19, spam@streefland.xs4all.nl wrote:
> When I mark one of the devices in a RAID1 array faulty, remove it, and
> re-add it, recovery starts, but stops too early, leaving the array in
> degraded mode. I'm seeing this in the latest 2.6.0 kernels, including
> 2.6.0-test8. The 2.4.22 kernel works OK, although the array is marked
> "dirty" (?). Below is a script to reproduce the problem, followed by
> the output of the script. I'm using mdadm-1.3.0.

Thanks for providing a script...
It works fine for me (2.6.0-test8).

I don't suppose there is anything in the kernel logs about write
errors on loop2 ???

Does it fail consistently for you, or only occasionally?

NeilBrown


* Re: RAID1 recovery fails with 2.6 kernel
  2003-10-20  6:15 ` Neil Brown
@ 2003-10-20  8:43   ` Dick Streefland
  2003-10-22 17:43     ` Mike Tran
  2003-10-22 22:54   ` Kernel OOps: bad magic 0 while RAID5 resync operation Bo Moon
  1 sibling, 1 reply; 7+ messages in thread
From: Dick Streefland @ 2003-10-20  8:43 UTC (permalink / raw)
  To: linux-raid

Neil Brown <neilb@cse.unsw.edu.au> wrote:
| Thanks for providing a script...
| It works fine for me (2.6.0-test8).
| 
| I don't suppose there is anything in the kernel logs about write
| errors on loop2 ???

No, there was nothing unusual in the log files. I have no access to
the test machine at the moment, but there is a message when the
recovery starts, and a few seconds later the message "sync done".

| Does it fail consistently for you, or only occasionally?

It fails every time. This test was on a dual PIII 450 system, but it
also fails on a VIA C6 system with the 2.6.0-test5 kernel. Both
kernels are compiled without CONFIG_PREEMPT, because I had other
problems that might be related to this option:

  http://www.spinics.net/lists/raid/msg03507.html

Could this be related to CONFIG_DM_IOCTL_V4? I was not sure about this
option, and have not enabled it. Otherwise, I think it is time to put
in some printk's. Do you have suggestions where to start looking?
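
Something along these lines is what I had in mind -- untested, and the
exact function and parameter names in 2.6.0-test8 may differ, so treat
it as a sketch of where to start rather than a ready-made patch:

  /* hypothetical debug aid: at the top of md_done_sync() in
   * drivers/md/md.c, log how each resync chunk gets reported
   * ("blocks" and "ok" are assumed parameter names) */
  printk(KERN_INFO "md: done_sync blocks=%d ok=%d\n", blocks, ok);

A chunk reported with ok=0 should then show up in the log right at the
point where the recovery stops.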

-- 
Dick Streefland                      ////                      Altium BV
dick.streefland@altium.nl           (@ @)          http://www.altium.com
--------------------------------oOO--(_)--OOo---------------------------



* Re: RAID1 recovery fails with 2.6 kernel
  2003-10-20  8:43   ` Dick Streefland
@ 2003-10-22 17:43     ` Mike Tran
  2003-10-22 18:59       ` Dick Streefland
  2003-10-26 20:36       ` Dick Streefland
  0 siblings, 2 replies; 7+ messages in thread
From: Mike Tran @ 2003-10-22 17:43 UTC (permalink / raw)
  To: Dick Streefland; +Cc: linux-raid

On Mon, 2003-10-20 at 03:43, Dick Streefland wrote:
> Neil Brown <neilb@cse.unsw.edu.au> wrote:
> | Thanks for providing a script...
> | It works fine for me (2.6.0-test8).
> | 
> | I don't suppose there is anything in the kernel logs about write
> | errors on loop2 ???
> 
> No, there was nothing unusual in the log files. I have no access to
> the test machine at the moment, but there is a message when the
> recovery starts, and a few seconds later the message "sync done".
> 
> | Does it fail consistently for you, or only occasionally?
> 
> It fails every time. This test was on a dual PIII 450 system, but it
> also fails on a VIA C6 system with the 2.6.0-test5 kernel. Both
> kernels are compiled without CONFIG_PREEMPT, because I had other
> problems that might be related to this option:
> 
>   http://www.spinics.net/lists/raid/msg03507.html
> 
> Could this be related to CONFIG_DM_IOCTL_V4? I was not sure about this
> option, and have not enabled it. Otherwise, I think it is time to put
> in some printk's. Do you have suggestions where to start looking?

I have been experiencing the same problem on my test machine.  I found
out that the resync terminated early because of the MD_RECOVERY_ERR bit
set by raid1's sync_write_request().  I don't understand why it fails
the sync when all the writes have already completed successfully and
quickly.  If there is a need to check for "nowhere to write this to" as
in the 2.4.x kernel, I think we need a different check.

The following patch for 2.6.0-test8 kernel seems to fix it.

--- a/raid1.c   2003-10-17 16:43:14.000000000 -0500
+++ b/raid1.c   2003-10-22 11:57:59.350900256 -0500
@@ -841,7 +841,7 @@
        }
 
        if (atomic_dec_and_test(&r1_bio->remaining)) {
-               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
+               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 1);
                put_buf(r1_bio);
        }
 }
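
If the 2.4-style "nowhere to write this to" case still needs to be
caught, I imagine the check would end up looking more like this (a pure
sketch -- "writes_issued" is a hypothetical counter, not an existing
r1bio field):

        /* sketch only: report failure only when no mirror was
         * available to write to, instead of passing 0 unconditionally
         * ("writes_issued" is hypothetical) */
        md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9,
                     writes_issued > 0);
        put_buf(r1_bio);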


* Re: RAID1 recovery fails with 2.6 kernel
  2003-10-22 17:43     ` Mike Tran
@ 2003-10-22 18:59       ` Dick Streefland
  2003-10-26 20:36       ` Dick Streefland
  1 sibling, 0 replies; 7+ messages in thread
From: Dick Streefland @ 2003-10-22 18:59 UTC (permalink / raw)
  To: linux-raid

Mike Tran <mhtran@us.ibm.com> wrote:
| I have been experiencing the same problem on my test machine.  I found
| out that the resync terminated early because of the MD_RECOVERY_ERR bit
| set by raid1's sync_write_request().  I don't understand why it fails
| the sync when all the writes have already completed successfully and
| quickly.  If there is a need to check for "nowhere to write this to" as
| in the 2.4.x kernel, I think we need a different check.
| 
| The following patch for 2.6.0-test8 kernel seems to fix it.
| 
| --- a/raid1.c   2003-10-17 16:43:14.000000000 -0500
| +++ b/raid1.c   2003-10-22 11:57:59.350900256 -0500
| @@ -841,7 +841,7 @@
|         }
|  
|         if (atomic_dec_and_test(&r1_bio->remaining)) {
| -               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
| +               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 1);
|                 put_buf(r1_bio);
|         }
|  }

This is exactly the spot where I interrupted my investigations last
night to get some sleep. I can confirm that your patch fixes the
problem. Thanks!

-- 
Dick Streefland                    ////               De Bilt
dick.streefland@xs4all.nl         (@ @)       The Netherlands
------------------------------oOO--(_)--OOo------------------



* Kernel OOps: bad magic 0 while RAID5 resync operation
  2003-10-20  6:15 ` Neil Brown
  2003-10-20  8:43   ` Dick Streefland
@ 2003-10-22 22:54   ` Bo Moon
  1 sibling, 0 replies; 7+ messages in thread
From: Bo Moon @ 2003-10-22 22:54 UTC (permalink / raw)
  To: linux-raid

Hello!

Has anyone seen this Oops before? Is it harmful?

I am using Debian ARM Linux 2.4.20.

Thanks in advance,



Bo

-------------------------- console log --------------------------------------

PRAETORIAN:/home/bmoon# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 4096 sectors
md0 : active raid5 hde4[3] hdc4[1] hda4[0]
      20479360 blocks chunks 64k algorithm 2 [3/2] [UU_]
      [=================>...] recovery = 86.2% (8838224/10239680) finish=5.4min speed=4272K/sec

unused devices: <none>

-------------------------------- Kernel OOPS ---------------------------------

PRAETORIAN:/home/bmoon# bad magic 0 (should be 304a03c), <2>kernel BUG at /home/bo/queengate2/linux/include/linux/wait.h:229!
Unable to handle kernel NULL pointer dereference at virtual address 00000000
mm = c000e78c pgd = c37a8000
*pgd = c37ac001, *pmd = c37ac001, *pte = 00000000, *ppte = 00000000
Internal error: Oops: ffffffff
CPU: 0
pc : [<c0025844>] lr : [<c002dbf4>] Not tainted
sp : c2a1fec4 ip : c2a1fe74 fp : c2a1fed4
r10: 00000009 r9 : 00000104 r8 : c2a1ff64
r7 : 00000000 r6 : 00000100 r5 : a0000013 r4 : 00000000
r3 : 00000000 r2 : c38b3f38 r1 : c38b3f38 r0 : 00000001
Flags: nZCv IRQs off FIQs on Mode SVC_32 Segment user
Control: C37AB17F Table: C37AB17F DAC: 00000015
Process winbindd (pid: 527, stack limit = 0xc2a1e37c)
Stack: (0xc2a1fec4 to 0xc2a20000)
fec0: 0304a02c c2a1feec c2a1fed8 c002b718 c002580c 0304a028 c088a000
fee0: c2a1ff04 c2a1fef0 c005eed8 c002b6c8 00000000 c2b0b734 c2a1ff50 c2a1ff08
ff00: c005f308 c005eeb8 c2a1e000 c2a1ff20 00000000 00000000 c2a1ff60 00000009
ff20: 00000000 c088a000 00000004 00000001 c2a1e000 00000004 c1faa924 00000009
ff40: bfffe9bc c2a1ffa4 c2a1ff54 c005f648 c005f0e8 00000028 00000000 00000000
ff60: 00000009 c1faa924 c1faa928 c1faa92c c1faa930 c1faa934 c1faa938 bfffe8b4
ff80: bfffe9bc 00000009 0000008e c00206e4 c2a1e000 00000000 00000000 c2a1ffa8
ffa0: c0020540 c005f35c bfffe8b4 c00299ec 00000009 bfffe9bc 00000000 00000000
ffc0: bfffe8b4 bfffe9bc 00000009 00000000 00000000 00000000 00000000 bfffe9bc
ffe0: 0010c4d8 bfffe804 000591b0 4014482c a0000010 00000009 00000000 00000000
Backtrace:
Function entered at [<c0025800>] from [<c002b718>]
 r4 = 0304A02C
Function entered at [<c002b6bc>] from [<c005eed8>]
 r5 = C088A000 r4 = 0304A028
Function entered at [<c005eeac>] from [<c005f308>]
 r5 = C2B0B734 r4 = 00000000
Function entered at [<c005f0dc>] from [<c005f648>]
Function entered at [<c005f350>] from [<c0020540>]
Code: eb002013 e59f0014 eb002011 e3a03000 (e5833000)
md: hde4 [events: 00000002](write) hde4's sb offset: 10249344
md: hdc4 [events: 00000002](write) hdc4's sb offset: 10239680
md: hda4 [events: 00000002](write) hda4's sb offset: 10484096

--------------------------------- after OOPs ---------------------------------

PRAETORIAN:/home/bmoon# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5]
read_ahead 4096 sectors
md0 : active raid5 hde4[2] hdc4[1] hda4[0]
      20479360 blocks chunks 64k algorithm 2 [3/3] [UUU]

unused devices: <none>



* Re: RAID1 recovery fails with 2.6 kernel
  2003-10-22 17:43     ` Mike Tran
  2003-10-22 18:59       ` Dick Streefland
@ 2003-10-26 20:36       ` Dick Streefland
  1 sibling, 0 replies; 7+ messages in thread
From: Dick Streefland @ 2003-10-26 20:36 UTC (permalink / raw)
  To: linux-raid

Mike Tran <mhtran@us.ibm.com> wrote:
| I have been experiencing the same problem on my test machine.  I found
| out that the resync terminated early because of the MD_RECOVERY_ERR bit
| set by raid1's sync_write_request().  I don't understand why it fails
| the sync when all the writes have already completed successfully and
| quickly.  If there is a need to check for "nowhere to write this to" as
| in the 2.4.x kernel, I think we need a different check.
| 
| The following patch for 2.6.0-test8 kernel seems to fix it.
| 
| --- a/raid1.c   2003-10-17 16:43:14.000000000 -0500
| +++ b/raid1.c   2003-10-22 11:57:59.350900256 -0500
| @@ -841,7 +841,7 @@
|         }
|  
|         if (atomic_dec_and_test(&r1_bio->remaining)) {
| -               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
| +               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 1);
|                 put_buf(r1_bio);
|         }
|  }

Has this patch been forwarded to Linus already? It would be nice to
have this fixed before the final 2.6.0 is released.

-- 
Dick Streefland                    ////               De Bilt
dick.streefland@xs4all.nl         (@ @)       The Netherlands
------------------------------oOO--(_)--OOo------------------



Thread overview: 7+ messages
2003-10-19 14:27 RAID1 recovery fails with 2.6 kernel Dick Streefland
2003-10-20  6:15 ` Neil Brown
2003-10-20  8:43   ` Dick Streefland
2003-10-22 17:43     ` Mike Tran
2003-10-22 18:59       ` Dick Streefland
2003-10-26 20:36       ` Dick Streefland
2003-10-22 22:54   ` Kernel OOps: bad magic 0 while RAID5 resync operation Bo Moon
