* Re: change strip_cache_size freeze the whole raid
[not found] <001801c73e14$c3177170$28df0f3d@kylecea1512a3f>
@ 2007-01-22 12:18 ` Justin Piszcz
2007-01-22 13:09 ` kyle
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 12:18 UTC (permalink / raw)
To: kyle; +Cc: linux-raid, linux-kernel
On Mon, 22 Jan 2007, kyle wrote:
> Hi,
>
> Yesterday I tried increasing the value of stripe_cache_size to see whether I
> could get better performance. I increased the value from 2048 to something
> like 16384. After I did that, the raid5 froze. Any process reading from or
> writing to it got stuck in D state. I tried to change it back to 2048, read
> stripe_cache_active, cat /proc/mdstat, mdadm stop, etc. None of them returned.
> I could not even shut down the machine. Finally I had to press the reset
> button to regain control.
>
> Kernel is 2.6.17.8 x86-64, running on an AMD Athlon 3000+, 2GB RAM, 8 x Seagate
> 8200.10 250GB HDD, nvidia chipset.
>
> cat /proc/mdstat (after reboot):
> Personalities : [raid1] [raid5] [raid4]
> md1 : active raid1 hdc2[1] hda2[0]
>       6144768 blocks [2/2] [UU]
>
> md2 : active raid5 sdf1[7] sde1[6] sdd1[5] sdc1[4] sdb1[3] sda1[2] hdc4[1] hda4[0]
>       1664893440 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
>
> md0 : active raid1 hdc1[1] hda1[0]
>       104320 blocks [2/2] [UU]
>
> Kyle
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Yes, I noticed this bug too. If you change it too many times, or change it
at the 'wrong' time, it hangs when you echo number >
/sys/block/mdX/md/stripe_cache_size.
Basically, don't run it more than once and don't run it at the 'wrong' time,
and it works. Not sure where the bug lies, but yeah, I've seen that on 3
different machines!
Justin.
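
For reference, Kyle gives the location of this tunable later in the thread as /sys/block/<array>/md/stripe_cache_size. Below is a minimal sketch for inspecting the current values before changing anything. The function name show_stripe_cache and the optional sysfs-root argument are illustrative assumptions, not part of mdadm or any other tool, and reading these files may itself block once the array is already hung, as Kyle reports.

```shell
# Print stripe_cache_size and stripe_cache_active for one md array.
# $1 = array name (e.g. md2); $2 = sysfs root (defaults to /sys).
# Hypothetical helper for illustration only.
show_stripe_cache() {
    md_dir="${2:-/sys}/block/$1/md"
    if [ ! -r "$md_dir/stripe_cache_size" ]; then
        echo "$1: no stripe_cache_size (not a raid4/5/6 array?)" >&2
        return 1
    fi
    echo "$1 stripe_cache_size=$(cat "$md_dir/stripe_cache_size")"
    if [ -r "$md_dir/stripe_cache_active" ]; then
        echo "$1 stripe_cache_active=$(cat "$md_dir/stripe_cache_active")"
    fi
}
```

Usage would be something like: show_stripe_cache md2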
* Re: change strip_cache_size freeze the whole raid
2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
@ 2007-01-22 13:09 ` kyle
2007-01-22 14:56 ` Justin Piszcz
2007-01-22 14:57 ` Steve Cousins
2007-01-22 16:10 ` Liang Yang
2 siblings, 1 reply; 10+ messages in thread
From: kyle @ 2007-01-22 13:09 UTC (permalink / raw)
To: Justin Piszcz; +Cc: linux-raid, linux-kernel
>
> On Mon, 22 Jan 2007, kyle wrote:
>
>> Hi,
>>
>> Yesterday I tried to increase the value of strip_cache_size to see if I
>> can
>> get better performance or not. I increase the value from 2048 to
>> something
>> like 16384. After I did that, the raid5 freeze. Any proccess read / write
>> to
>> it stucked at D state. I tried to change it back to 2048, read
>> strip_cache_active, cat /proc/mdstat, mdadm stop, etc. All didn't return
>> back.
>> I even cannot shutdown the machine. Finally I need to press the reset
>> button
>> in order to get back my control.
> Yes, I noticed this bug too, if you change it too many times or change it
> at the 'wrong' time, it hangs up when you echo numbr >
> /proc/stripe_cache_size.
>
> Basically don't run it more than once and don't run it at the 'wrong' time
> and it works. Not sure where the bug lies, but yeah I've seen that on 3
> different machines!
>
> Justin.
>
>
I changed it only once, and then it froze. It's hard to find the 'right time'.
Actually, I had tried it several times before. As I remember, once it
froze for around 1 or 2 minutes and then went back to normal operation. This is
the first time it froze completely; I waited around 10 minutes and it
still didn't wake up.
Kyle
* Re: change strip_cache_size freeze the whole raid
2007-01-22 13:09 ` kyle
@ 2007-01-22 14:56 ` Justin Piszcz
2007-01-22 15:18 ` kyle
0 siblings, 1 reply; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 14:56 UTC (permalink / raw)
To: kyle; +Cc: linux-raid, linux-kernel
On Mon, 22 Jan 2007, kyle wrote:
> >
> > On Mon, 22 Jan 2007, kyle wrote:
> >
> > > Hi,
> > >
> > > Yesterday I tried to increase the value of strip_cache_size to see if I
> > > can
> > > get better performance or not. I increase the value from 2048 to something
> > > like 16384. After I did that, the raid5 freeze. Any proccess read / write
> > > to
> > > it stucked at D state. I tried to change it back to 2048, read
> > > strip_cache_active, cat /proc/mdstat, mdadm stop, etc. All didn't return
> > > back.
> > > I even cannot shutdown the machine. Finally I need to press the reset
> > > button
> > > in order to get back my control.
>
> > Yes, I noticed this bug too, if you change it too many times or change it
> > at the 'wrong' time, it hangs up when you echo numbr >
> > /proc/stripe_cache_size.
> >
> > Basically don't run it more than once and don't run it at the 'wrong' time
> > and it works. Not sure where the bug lies, but yeah I've seen that on 3
> > different machines!
> >
> > Justin.
> >
> >
>
> I just change it once, then it freeze. It's hard to get the 'right time'
>
> Actually I tried it several times before. As I remember there was once it
> freezed for around 1 or 2 minutes , then back to normal operation. This is the
> first time it completely freezed and I waited after around 10 minutes it still
> didn't wake up.
>
> Kyle
>
What kernel version are you using? It normally works the first time for
me; I put it in my startup scripts as one of the last items. However, if
I change it a few times, it will hang, and there is no way to reboot except
via SysRq or pressing the reboot button on the machine.
This seems to be true of 2.6.19.1 and 2.6.19.2. I did not try under
2.6.20-rc5 because I am tired of hanging my machine :)
Justin.
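
A startup-script entry along the lines Justin describes could be sketched as below. It writes the value only when it differs from the current one, so the knob is touched at most once per boot. This is an illustration under assumptions: set_stripe_cache_once is a made-up helper name, the sysfs path is the one Kyle mentions elsewhere in the thread, and nothing here is known to avoid the hang itself.

```shell
# Write a new stripe_cache_size only if it differs from the current value,
# so a boot script changes the setting at most once.
# $1 = array, $2 = desired value, $3 = sysfs root (defaults to /sys).
# Hypothetical helper; it does not work around the kernel bug.
set_stripe_cache_once() {
    knob="${3:-/sys}/block/$1/md/stripe_cache_size"
    [ -w "$knob" ] || { echo "$knob not writable" >&2; return 1; }
    current=$(cat "$knob")
    if [ "$current" = "$2" ]; then
        echo "$1: already $2, leaving it alone"
        return 0
    fi
    echo "$2" > "$knob" && echo "$1: $current -> $2"
}
```

Called as, for example, set_stripe_cache_once md2 16384 near the end of the boot sequence.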
* Re: change strip_cache_size freeze the whole raid
2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
2007-01-22 13:09 ` kyle
@ 2007-01-22 14:57 ` Steve Cousins
2007-01-22 15:01 ` Justin Piszcz
` (2 more replies)
2007-01-22 16:10 ` Liang Yang
2 siblings, 3 replies; 10+ messages in thread
From: Steve Cousins @ 2007-01-22 14:57 UTC (permalink / raw)
To: Justin Piszcz; +Cc: kyle, linux-raid, linux-kernel
Justin Piszcz wrote:
> Yes, I noticed this bug too, if you change it too many times or change it
> at the 'wrong' time, it hangs up when you echo numbr >
> /proc/stripe_cache_size.
>
> Basically don't run it more than once and don't run it at the 'wrong' time
> and it works. Not sure where the bug lies, but yeah I've seen that on 3
> different machines!
Can you tell us when the "right" time is or maybe what the "wrong" time
is? Also, is this kernel specific? Does it (increasing
stripe_cache_size) work with RAID6 too?
Thanks,
Steve
--
______________________________________________________________________
Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu
Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu
Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302
* Re: change strip_cache_size freeze the whole raid
2007-01-22 14:57 ` Steve Cousins
@ 2007-01-22 15:01 ` Justin Piszcz
2007-01-23 14:22 ` kyle
2007-01-22 15:10 ` Justin Piszcz
2007-01-22 15:13 ` kyle
2 siblings, 1 reply; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 15:01 UTC (permalink / raw)
To: Steve Cousins; +Cc: kyle, linux-raid, linux-kernel
On Mon, 22 Jan 2007, Steve Cousins wrote:
>
>
> Justin Piszcz wrote:
> > Yes, I noticed this bug too, if you change it too many times or change it at
> > the 'wrong' time, it hangs up when you echo numbr > /proc/stripe_cache_size.
> >
> > Basically don't run it more than once and don't run it at the 'wrong' time
> > and it works. Not sure where the bug lies, but yeah I've seen that on 3
> > different machines!
>
> Can you tell us when the "right" time is or maybe what the "wrong" time is?
> Also, is this kernel specific? Does it (increasing stripe_cache_size) work
> with RAID6 too?
>
> Thanks,
>
> Steve
> --
> ______________________________________________________________________
> Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu
> Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu
> Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302
>
>
>
The wrong time (for me anyway) is at or around the time when the kernel
is auto-detecting arrays and udev starts. When I put it there, I get oopses all
over the screen and it gets really nasty. The best time appears
to be right after the system has started up, but before I/O has started hitting
the array. Tricky, I know.
Justin.
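
One partial way to approximate the "right time" is to check that the array is not resyncing or recovering before touching the setting. The sketch below assumes the array exposes the md sync_action attribute in sysfs (it may be missing on older kernels). array_is_idle is a hypothetical helper name. It says nothing about ordinary filesystem I/O or boot-time autodetection, so it is only a heuristic and not a confirmed way to avoid the hang.

```shell
# Succeed only when the array reports no resync/recovery/check in progress.
# $1 = array, $2 = sysfs root (defaults to /sys).
# Returns 0 if idle, 1 if busy, 2 if the attribute is missing.
array_is_idle() {
    state_file="${2:-/sys}/block/$1/md/sync_action"
    [ -r "$state_file" ] || return 2      # attribute missing: cannot tell
    [ "$(cat "$state_file")" = "idle" ]
}
```

For example: array_is_idle md2 && echo 16384 > /sys/block/md2/md/stripe_cache_size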
* Re: change strip_cache_size freeze the whole raid
2007-01-22 14:57 ` Steve Cousins
2007-01-22 15:01 ` Justin Piszcz
@ 2007-01-22 15:10 ` Justin Piszcz
2007-01-22 15:13 ` kyle
2 siblings, 0 replies; 10+ messages in thread
From: Justin Piszcz @ 2007-01-22 15:10 UTC (permalink / raw)
To: Steve Cousins; +Cc: kyle, linux-raid, linux-kernel
On Mon, 22 Jan 2007, Steve Cousins wrote:
>
>
> Justin Piszcz wrote:
> > Yes, I noticed this bug too, if you change it too many times or change it at
> > the 'wrong' time, it hangs up when you echo numbr > /proc/stripe_cache_size.
> >
> > Basically don't run it more than once and don't run it at the 'wrong' time
> > and it works. Not sure where the bug lies, but yeah I've seen that on 3
> > different machines!
>
> Can you tell us when the "right" time is or maybe what the "wrong" time is?
> Also, is this kernel specific? Does it (increasing stripe_cache_size) work
> with RAID6 too?
>
> Thanks,
>
> Steve
> --
> ______________________________________________________________________
> Steve Cousins, Ocean Modeling Group Email: cousins@umit.maine.edu
> Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu
> Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302
>
>
Also, I have not tested stripe_cache_size under RAID6, so I am unsure.
Justin.
* Re: change strip_cache_size freeze the whole raid
2007-01-22 14:57 ` Steve Cousins
2007-01-22 15:01 ` Justin Piszcz
2007-01-22 15:10 ` Justin Piszcz
@ 2007-01-22 15:13 ` kyle
2 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-22 15:13 UTC (permalink / raw)
To: Steve Cousins, Justin Piszcz; +Cc: linux-raid, linux-kernel
> Justin Piszcz wrote:
>> Yes, I noticed this bug too, if you change it too many times or change it
>> at the 'wrong' time, it hangs up when you echo numbr >
>> /proc/stripe_cache_size.
>>
>> Basically don't run it more than once and don't run it at the 'wrong'
>> time and it works. Not sure where the bug lies, but yeah I've seen that
>> on 3 different machines!
>
> Can you tell us when the "right" time is or maybe what the "wrong" time
> is? Also, is this kernel specific? Does it (increasing
> stripe_cache_size) work with RAID6 too?
>
> Thanks,
>
> Steve
I think if your /sys/block/md_your_raid6/md/ directory has a file named
"stripe_cache_size", then it should work with raid6 too.
Kyle
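
Kyle's check can be applied to every array at once. The sketch below lists the md arrays whose sysfs directory contains a stripe_cache_size file. list_stripe_cache_arrays is an illustrative name, and the optional sysfs-root argument exists only so the function can be tried against a scratch directory.

```shell
# List md arrays that expose stripe_cache_size under the given sysfs root.
# $1 = sysfs root (defaults to /sys). Hypothetical helper.
list_stripe_cache_arrays() {
    for knob in "${1:-/sys}"/block/md*/md/stripe_cache_size; do
        [ -e "$knob" ] || continue        # glob matched nothing
        array=${knob%/md/stripe_cache_size}
        echo "${array##*/}"               # keep only the array name, e.g. md2
    done
}
```

On a machine like Kyle's, with two raid1 arrays and one raid5 array, only the raid5 array would be expected to appear.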
* Re: change strip_cache_size freeze the whole raid
2007-01-22 14:56 ` Justin Piszcz
@ 2007-01-22 15:18 ` kyle
0 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-22 15:18 UTC (permalink / raw)
To: Justin Piszcz; +Cc: linux-raid, linux-kernel
>>
>> > Yes, I noticed this bug too, if you change it too many times or change
>> > it
>> > at the 'wrong' time, it hangs up when you echo numbr >
>> > /proc/stripe_cache_size.
>> >
>> > Basically don't run it more than once and don't run it at the 'wrong'
>> > time
>> > and it works. Not sure where the bug lies, but yeah I've seen that on
>> > 3
>> > different machines!
>> >
>> > Justin.
>> >
>> >
>>
>> I just change it once, then it freeze. It's hard to get the 'right time'
>>
>> Actually I tried it several times before. As I remember there was once it
>> freezed for around 1 or 2 minutes , then back to normal operation. This
>> is the
>> first time it completely freezed and I waited after around 10 minutes it
>> still
>> didn't wake up.
>>
>> Kyle
>>
>
> What kernel version are you using? It normally works the first time for
> me, I put it in my startup scripts, as one of the last items. However, if
> I change it a few times, it will hang and there is no way to reboot except
> via SYSRQ or pressing the reboot button on the machine.
>
> This seems to be true of 2.6.19.1 and 2.6.19.2, I did not try under
> 2.6.20-rc5 because I am tired of hanging my machine :)
>
> Justin.
>
It was 2.6.17.8. Now it's 2.6.17.13, but I won't touch it now! It's around
15km away from me!
* Re: change strip_cache_size freeze the whole raid
2007-01-22 12:18 ` change strip_cache_size freeze the whole raid Justin Piszcz
2007-01-22 13:09 ` kyle
2007-01-22 14:57 ` Steve Cousins
@ 2007-01-22 16:10 ` Liang Yang
2 siblings, 0 replies; 10+ messages in thread
From: Liang Yang @ 2007-01-22 16:10 UTC (permalink / raw)
To: Justin Piszcz, kyle; +Cc: linux-raid, linux-kernel
Do we need to consider the chunk size when we adjust the value of
stripe_cache_size for the MD RAID5 array?
Liang
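
On the chunk-size question: as far as the kernel's md documentation describes it, the memory used by the stripe cache is roughly page size times number of member devices times stripe_cache_size, so the chunk size does not enter that estimate. Whether chunk size matters for performance tuning is not settled in this thread. The sketch below does the arithmetic for Kyle's 8-device array, assuming 4 KiB pages; stripe_cache_mib is an illustrative name.

```shell
# Estimate stripe cache memory in MiB:
#   stripe_cache_size * number of member devices * page size.
# $1 = stripe_cache_size, $2 = number of devices, $3 = page size in bytes
# (defaults to 4096, an assumption for x86-64).
stripe_cache_mib() {
    echo $(( $1 * $2 * ${3:-4096} / 1024 / 1024 ))
}

stripe_cache_mib 2048 8     # Kyle's original setting: prints 64
stripe_cache_mib 16384 8    # the value set before the freeze: prints 512
```

By this estimate the change would have asked for about 512 MiB on a machine with 2GB of RAM.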
----- Original Message -----
From: "Justin Piszcz" <jpiszcz@lucidpixels.com>
To: "kyle" <kylewong@southa.com>
Cc: <linux-raid@vger.kernel.org>; <linux-kernel@vger.kernel.org>
Sent: Monday, January 22, 2007 5:18 AM
Subject: Re: change strip_cache_size freeze the whole raid
>
>
> On Mon, 22 Jan 2007, kyle wrote:
>
>> Hi,
>>
>> Yesterday I tried to increase the value of strip_cache_size to see if I
>> can
>> get better performance or not. I increase the value from 2048 to
>> something
>> like 16384. After I did that, the raid5 freeze. Any proccess read / write
>> to
>> it stucked at D state. I tried to change it back to 2048, read
>> strip_cache_active, cat /proc/mdstat, mdadm stop, etc. All didn't return
>> back.
>> I even cannot shutdown the machine. Finally I need to press the reset
>> button
>> in order to get back my control.
>>
>> Kernel is 2.6.17.8 x86-64, running at AMD Athlon3000+, 2GB Ram, 8 x
>> Seagate
>> 8200.10 250GB HDD, nvidia chipset.
>>
>> cat /proc/mdstat (after reboot):
>> Personalities : [raid1] [raid5] [raid4]
>> md1 : active raid1 hdc2[1] hda2[0]
>> 6144768 blocks [2/2] [UU]
>>
>> md2 : active raid5 sdf1[7] sde1[6] sdd1[5] sdc1[4] sdb1[3] sda1[2]
>> hdc4[1]
>> hda4[0]
>> 1664893440 blocks level 5, 512k chunk, algorithm 2 [8/8] [UUUUUUUU]
>>
>> md0 : active raid1 hdc1[1] hda1[0]
>> 104320 blocks [2/2] [UU]
>>
>> Kyle
>>
>
> Yes, I noticed this bug too, if you change it too many times or change it
> at the 'wrong' time, it hangs up when you echo numbr >
> /proc/stripe_cache_size.
>
> Basically don't run it more than once and don't run it at the 'wrong' time
> and it works. Not sure where the bug lies, but yeah I've seen that on 3
> different machines!
>
> Justin.
>
* Re: change strip_cache_size freeze the whole raid
2007-01-22 15:01 ` Justin Piszcz
@ 2007-01-23 14:22 ` kyle
0 siblings, 0 replies; 10+ messages in thread
From: kyle @ 2007-01-23 14:22 UTC (permalink / raw)
To: Justin Piszcz, Steve Cousins; +Cc: linux-raid, linux-kernel
> I can try and do this later this week possibly.
> Justin.
>>
>> alt-sysrq-T or "echo t > /proc/sysrq-trigger" can be really helpful to
>> diagnose this sort of problem (providing the system isn't so badly
>> stuck that the kernel logs don't get stored).
>>
>> It is probably hitting a memory-allocation deadlock, though I cannot
>> see exactly where the deadlock would be. If you are able to reproduce
>> it and can get the kernel logs after 'alt-sysrq-T' I would really
>> appreciate it.
Justin, maybe you can try freezing it once more and get the kernel logs
before trying Neil's patch ...... :D
~Kyle
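
The diagnostic step quoted above could be scripted roughly as follows. This is a sketch under assumptions: dump_task_states is a made-up helper name, it needs root when pointed at the real trigger file, and the output file should live on a filesystem that is not on the hung array, otherwise the write may block as well.

```shell
# Ask the kernel to dump all task states to its log, then save the log.
# $1 = output file (keep it off the affected array);
# $2 = trigger file (defaults to /proc/sysrq-trigger).
dump_task_states() {
    echo t > "${2:-/proc/sysrq-trigger}" || return 1
    dmesg > "$1"                          # copy the kernel ring buffer
}
```

For example: dump_task_states /tmp/md-hang-tasks.log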