* Hints on reducing kickouts
@ 2012-03-01 18:14 John Obaterspok
  2012-03-01 18:22 ` Bryan Mesich
  0 siblings, 1 reply; 7+ messages in thread
From: John Obaterspok @ 2012-03-01 18:14 UTC (permalink / raw)
  To: linux-raid

Hello,

Occasionally my system doesn't shut down cleanly, and almost every
time that happens mdadm kicks one of the disks out of the RAID5 array.
Is there anything I can do to help prevent this?

I'm using sw raid 5 with 3 x 3TB Hitachi Deskstar (7K3000 HDS723030ALA640 64MB).

--------

/dev/md2:
        Version : 1.2
  Creation Time : Thu Jul 28 07:23:45 2011
     Raid Level : raid5
     Array Size : 5706313728 (5441.96 GiB 5843.27 GB)
  Used Dev Size : 2853156864 (2720.98 GiB 2921.63 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Thu Mar  1 18:59:21 2012
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

 Rebuild Status : 14% complete

           Name : Emperor:2  (local to host Emperor)
           UUID : a9f823df:05acab4f:1b02cfb9:70173894
         Events : 792

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       4       8       19        1      spare rebuilding   /dev/sdb3
       3       8       35        2      active sync   /dev/sdc3

--------



/dev/sda3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a9f823df:05acab4f:1b02cfb9:70173894
           Name : Emperor:2  (local to host Emperor)
  Creation Time : Thu Jul 28 07:23:45 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 5706314639 (2720.98 GiB 2921.63 GB)
     Array Size : 11412627456 (5441.96 GiB 5843.27 GB)
  Used Dev Size : 5706313728 (2720.98 GiB 2921.63 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : abafe531:af25e9ab:54bba541:775a1c11

    Update Time : Thu Mar  1 19:11:53 2012
       Checksum : 3b777715 - correct
         Events : 794

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAA ('A' == active, '.' == missing)
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x2
     Array UUID : a9f823df:05acab4f:1b02cfb9:70173894
           Name : Emperor:2  (local to host Emperor)
  Creation Time : Thu Jul 28 07:23:45 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 5706314639 (2720.98 GiB 2921.63 GB)
     Array Size : 11412627456 (5441.96 GiB 5843.27 GB)
  Used Dev Size : 5706313728 (2720.98 GiB 2921.63 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
Recovery Offset : 713289232 sectors
          State : clean
    Device UUID : adefab73:b8a595de:666a951b:41513425

    Update Time : Thu Mar  1 19:11:53 2012
       Checksum : c875cb12 - correct
         Events : 794

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing)
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : a9f823df:05acab4f:1b02cfb9:70173894
           Name : Emperor:2  (local to host Emperor)
  Creation Time : Thu Jul 28 07:23:45 2011
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 5706314639 (2720.98 GiB 2921.63 GB)
     Array Size : 11412627456 (5441.96 GiB 5843.27 GB)
  Used Dev Size : 5706313728 (2720.98 GiB 2921.63 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9d3685e5:cbc37737:d8b864e4:452a5718

    Update Time : Thu Mar  1 19:11:53 2012
       Checksum : 249f6979 - correct
         Events : 794

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAA ('A' == active, '.' == missing)



---------
--john


* Re: Hints on reducing kickouts
  2012-03-01 18:14 Hints on reducing kickouts John Obaterspok
@ 2012-03-01 18:22 ` Bryan Mesich
  2012-03-03 21:28   ` John Obaterspok
  0 siblings, 1 reply; 7+ messages in thread
From: Bryan Mesich @ 2012-03-01 18:22 UTC (permalink / raw)
  To: John Obaterspok; +Cc: linux-raid

On Thu, Mar 01, 2012 at 07:14:38PM +0100, John Obaterspok wrote:
> Hello,
> 
> Occasionally my system doesn't shut down cleanly, and almost every
> time that happens mdadm kicks one of the disks out of the RAID5 array.
> Is there anything I can do to help prevent this?
> 
> I'm using sw raid 5 with 3 x 3TB Hitachi Deskstar (7K3000 HDS723030ALA640 64MB).
> 
> --------

[snip...]

I'm guessing you're trying to avoid re-syncing a 6TB array in the
event that your host goes down in an unclean state.  In that case, I
would suggest using a write-intent bitmap on the array.  If you're
concerned about loss of performance when using a bitmap, use an
external bitmap located on a separate spindle.
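
A minimal sketch of the two variants (pick one), assuming /dev/md2
from your mail; the external path is only illustrative:

  mdadm --grow /dev/md2 --bitmap=internal                # bitmap kept next to the md superblocks
  mdadm --grow /dev/md2 --bitmap=/mnt/other/md2.bitmap   # bitmap file on a separate spindle
  cat /proc/mdstat                                       # a "bitmap: ..." line should now show up for md2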

Bryan


* Re: Hints on reducing kickouts
  2012-03-01 18:22 ` Bryan Mesich
@ 2012-03-03 21:28   ` John Obaterspok
  2012-03-03 22:24     ` Bryan Mesich
  0 siblings, 1 reply; 7+ messages in thread
From: John Obaterspok @ 2012-03-03 21:28 UTC (permalink / raw)
  To: Bryan Mesich, John Obaterspok, linux-raid

Please see below for the errors I get when trying to add an internal bitmap.

2012/3/1 Bryan Mesich <bryan.mesich@ndsu.edu>:
> On Thu, Mar 01, 2012 at 07:14:38PM +0100, John Obaterspok wrote:
>> Hello,
>>
>> Occasionally my system doesn't shut down cleanly, and almost every
>> time that happens mdadm kicks one of the disks out of the RAID5 array.
>> Is there anything I can do to help prevent this?
>>
>> I'm using sw raid 5 with 3 x 3TB Hitachi Deskstar (7K3000 HDS723030ALA640 64MB).
>>
>> --------
>
> [snip...]
>
> I'm guessing you're trying to avoid re-syncing a 6TB array in the
> event that your host goes down in an unclean state.  In that case, I
> would suggest using a write-intent bitmap on the array.  If you're
> concerned about loss of performance when using a bitmap, use an
> external bitmap located on a separate spindle.

Okay, I'm trying to add an internal bitmap to see how much it slows things down:

[root@Emperor ~]# mdadm --version
mdadm - v3.2.3 - 23rd December 2011


[root@Emperor ~]# mdadm --grow --bitmap=internal /dev/md2
mdadm: failed to set internal bitmap.


[root@Emperor ~]# mdadm --examine-bitmap /dev/md2
        Filename : /dev/md2
           Magic : 00000000
mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
         Version : 0
mdadm: unknown bitmap version 0, either the bitmap file is corrupted
or you need to upgrade your tools


[root@Emperor ~]# mdadm -D /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Thu Jul 28 07:23:45 2011
     Raid Level : raid5
     Array Size : 5706313728 (5441.96 GiB 5843.27 GB)
  Used Dev Size : 2853156864 (2720.98 GiB 2921.63 GB)
   Raid Devices : 3
  Total Devices : 3
    Persistence : Superblock is persistent

    Update Time : Sat Mar  3 22:24:21 2012
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : Emperor:2  (local to host Emperor)
           UUID : a9f823df:05acab4f:1b02cfb9:70173894
         Events : 1041

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       4       8       19        1      active sync   /dev/sdb3
       3       8       35        2      active sync   /dev/sdc3

This appears in 'dmesg'


[50068.295573] mdadm: sending ioctl 1261 to a partition!
[50068.295577] mdadm: sending ioctl 1261 to a partition!
[50068.320235] mdadm: sending ioctl 800c0910 to a partition!
[50068.320239] mdadm: sending ioctl 800c0910 to a partition!
[50174.085742] scsi_verify_blk_ioctl: 76 callbacks suppressed
[50174.085745] mdadm: sending ioctl 1261 to a partition!
[50174.085748] mdadm: sending ioctl 1261 to a partition!
[50174.086052] mdadm: sending ioctl 1261 to a partition!
[50174.086056] mdadm: sending ioctl 1261 to a partition!
[50174.086261] mdadm: sending ioctl 1261 to a partition!
[50174.086263] mdadm: sending ioctl 1261 to a partition!
[50174.091615] md2: invalid bitmap file superblock: bad magic
[50174.091618] md2: bitmap file superblock:
[50174.091620]          magic: 00400000
[50174.091622]        version: 4194320
[50174.091623]           uuid: 00400020.20005fe0.00010000.00000000
[50174.091625]         events: 2510791976631140352
[50174.091626] events cleared: 18014471528120321
[50174.091628]          state: 00030000
[50174.091629]      chunksize: 0 B
[50174.091630]   daemon sleep: 0s
[50174.091631]      sync size: 1152991873353122064 KB
[50174.091633] max write behind: -1180360704
[50174.095699] mdadm: sending ioctl 1261 to a partition!
[50174.095701] mdadm: sending ioctl 1261 to a partition!
[50174.121531] mdadm: sending ioctl 800c0910 to a partition!
[50174.121535] mdadm: sending ioctl 800c0910 to a partition!
[50226.959946] scsi_verify_blk_ioctl: 76 callbacks suppressed
[50226.959949] mdadm: sending ioctl 1261 to a partition!
[50226.959952] mdadm: sending ioctl 1261 to a partition!

* Re: Hints on reducing kickouts
  2012-03-03 21:28   ` John Obaterspok
@ 2012-03-03 22:24     ` Bryan Mesich
  2012-03-11 18:55       ` John Obaterspok
  0 siblings, 1 reply; 7+ messages in thread
From: Bryan Mesich @ 2012-03-03 22:24 UTC (permalink / raw)
  To: John Obaterspok; +Cc: linux-raid

On Sat, Mar 03, 2012 at 10:28:20PM +0100, John Obaterspok wrote:
> Please see below for the errors I get when trying to add an internal bitmap.

[snip...]
 
> Okay, I'm trying to add an internal bitmap to see how much it slows things down:
> 
> [root@Emperor ~]# mdadm --version
> mdadm - v3.2.3 - 23rd December 2011
> 
> 
> [root@Emperor ~]# mdadm --grow --bitmap=internal /dev/md2
> mdadm: failed to set internal bitmap.

I think you are using the wrong command-line syntax.  Try the
following to add an internal bitmap to /dev/md2:

mdadm --grow /dev/md2 --bitmap=internal

If you want to use an external bitmap instead, use --bitmap="path to
file".

> [root@Emperor ~]# mdadm --examine-bitmap /dev/md2
>         Filename : /dev/md2
>            Magic : 00000000
> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>          Version : 0
> mdadm: unknown bitmap version 0, either the bitmap file is corrupted
> or you need to upgrade your tools
> 

[snip...]

Bryan


* Re: Hints on reducing kickouts
  2012-03-03 22:24     ` Bryan Mesich
@ 2012-03-11 18:55       ` John Obaterspok
  2012-03-11 19:15         ` Bryan Mesich
  0 siblings, 1 reply; 7+ messages in thread
From: John Obaterspok @ 2012-03-11 18:55 UTC (permalink / raw)
  To: Bryan Mesich, linux-raid

Hello,

mdadm --grow /dev/md2 --bitmap=internal
mdadm: failed to set internal bitmap.

I then tried the same on md1, and that crashed my machine
(bitmap_endwrite <- handle_stripe).  Hmm.

I don't know what to do.

-- John

2012/3/3 Bryan Mesich <bryan.mesich@ndsu.edu>:
> On Sat, Mar 03, 2012 at 10:28:20PM +0100, John Obaterspok wrote:
>> Please see below for the errors I get when trying to add an internal bitmap.
>
> [snip...]
>
>> Okay, I'm trying to add an internal bitmap to see how much it slows things down:
>>
>> [root@Emperor ~]# mdadm --version
>> mdadm - v3.2.3 - 23rd December 2011
>>
>>
>> [root@Emperor ~]# mdadm --grow --bitmap=internal /dev/md2
>> mdadm: failed to set internal bitmap.
>
> I think you are using the wrong command-line syntax.  Try the
> following to add an internal bitmap to /dev/md2:
>
> mdadm --grow /dev/md2 --bitmap=internal
>
> If you want to use an external bitmap instead, use --bitmap="path to
> file".
>
>> [root@Emperor ~]# mdadm --examine-bitmap /dev/md2
>>         Filename : /dev/md2
>>            Magic : 00000000
>> mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
>>          Version : 0
>> mdadm: unknown bitmap version 0, either the bitmap file is corrupted
>> or you need to upgrade your tools
>>
>
> [snip...]
>
> Bryan

* Re: Hints on reducing kickouts
  2012-03-11 18:55       ` John Obaterspok
@ 2012-03-11 19:15         ` Bryan Mesich
  2012-03-28 20:11           ` John Obaterspok
  0 siblings, 1 reply; 7+ messages in thread
From: Bryan Mesich @ 2012-03-11 19:15 UTC (permalink / raw)
  To: John Obaterspok; +Cc: linux-raid

On Sun, Mar 11, 2012 at 07:55:10PM +0100, John Obaterspok wrote:
> Hello,
> 
> mdadm --grow /dev/md2 --bitmap=internal
> mdadm: failed to set internal bitmap.
> 
> I then tried the same on md1, and that crashed my machine
> (bitmap_endwrite <- handle_stripe).  Hmm.
> 
> I don't know what to do.

This looks like a bug that Neil commented about last week (I
deleted the thread, so can't directly forward it to you).
The thread I'm referring to can be found here:

http://marc.info/?t=133122326800004&r=1&w=2

Subject of the thread is: "invalid bitmap file superblock: bad
magic".  Neil indicated that the fix in included in the following
commit:

http://neil.brown.name/git?p=mdadm;a=commitdiff;h=6ef89052d85b8137b8a7100f761d896ae6f61001


Bryan


* Re: Hints on reducing kickouts
  2012-03-11 19:15         ` Bryan Mesich
@ 2012-03-28 20:11           ` John Obaterspok
  0 siblings, 0 replies; 7+ messages in thread
From: John Obaterspok @ 2012-03-28 20:11 UTC (permalink / raw)
  To: Bryan Mesich, John Obaterspok, linux-raid

Thanks Bryan,

I appreciate it. I'll try a write-intent bitmap when I get a chance
(I'm still struggling with md).

My sdb disk got kicked out for a second time in perhaps two weeks. I
have a feeling it's sdb that gets kicked most of the time. This time
md0 & md2 had sdb kicked out, but md1 didn't. Any idea why a disk
gets kicked, or why it ends up unclean and is kicked as a result?

md2 : active raid5 sdc3[3] sda3[0]
      5706313728 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [U_U]

md0 : active raid1 sdc1[2] sda1[0]
      307188 blocks super 1.0 [3/2] [U_U]

md1 : active raid5 sdb2[4] sdc2[3] sda2[0]
      153596928 blocks super 1.1 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
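
(A sketch of how the kicked member could be put back, using the
device names from the mdstat above; with a bitmap the re-add only
syncs the dirty regions, without one it means a full rebuild:

  mdadm /dev/md2 --re-add /dev/sdb3
  mdadm /dev/md0 --re-add /dev/sdb1
  cat /proc/mdstat                      # watch the recovery progress

If --re-add is refused, a plain --add re-syncs the whole member.)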

--john

2012/3/11 Bryan Mesich <bryan.mesich@ndsu.edu>:
> On Sun, Mar 11, 2012 at 07:55:10PM +0100, John Obaterspok wrote:
>> Hello,
>>
>> mdadm --grow /dev/md2 --bitmap=internal
>> mdadm: failed to set internal bitmap.
>>
>> I then tried the same on md1, and that crashed my machine
>> (bitmap_endwrite <- handle_stripe).  Hmm.
>>
>> I don't know what to do.
>
> This looks like a bug that Neil commented about last week (I
> deleted the thread, so can't directly forward it to you).
> The thread I'm referring to can be found here:
>
> http://marc.info/?t=133122326800004&r=1&w=2
>
> Subject of the thread is: "invalid bitmap file superblock: bad
> magic".  Neil indicated that the fix in included in the following
> commit:
>
> http://neil.brown.name/git?p=mdadm;a=commitdiff;h=6ef89052d85b8137b8a7100f761d896ae6f61001
>
>
> Bryan
