* Re: MD RAID1 deadlock on failed disk
@ 2010-10-27  0:18 Hubert Tonneau
  2010-10-26 23:56 ` Neil Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Hubert Tonneau @ 2010-10-27  0:18 UTC (permalink / raw)
  To: linux-raid

2.6.32.24 kernel worked fine.

Hubert Tonneau wrote:
>
> Hi,
> 
> The configuration is:
> Perc H200 controller configured with no RAID (mpt2sas driver),
> 2 SATA disks (sda and sdb),
> Linux MD Software RAID1 (md0),
> stock Linux 2.6.35.7 kernel.
> 
> I hot-unplug the second (sdb) disk, and the result is:
> . as expected, I can read sda device,
> . as expected, any read to sdb device fails,
> . unexpectedly, any read to md0 never returns.
> 
> No oops or anything like that in the kernel log.
> I did not try the same with other kernel releases.
> 
> Regards,
> Hubert Tonneau


* Re: MD RAID1 deadlock on failed disk
@ 2010-10-27 10:44 Hubert Tonneau
  2010-10-27  9:52 ` Neil Brown
  0 siblings, 1 reply; 5+ messages in thread
From: Hubert Tonneau @ 2010-10-27 10:44 UTC (permalink / raw)
  To: linux-scsi; +Cc: Neil Brown

Hi,

The configuration is:
Perc H200 controller configured with no RAID (mpt2sas driver),
2 SATA disks (sda and sdb),
Linux MD Software RAID1 (md0),
stock Linux 2.6.35.7 kernel.

I hot-unplug the second (sdb) disk, and the result is:
. as expected, I can read sda device,
. as expected, any read to sdb device fails,
. unexpectedly, any read to md0 never returns.

No oops or anything like that in the kernel log.
I did not try the same with other kernel releases.

2.6.32.24 kernel worked fine.
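
The three outcomes listed above (read succeeds, read fails, read never returns) can be told apart without hanging the shell that runs the check. The following is only a rough sketch, not something from the original report: `probe_read` and the 10-second timeout are hypothetical, and a buffered read may be served from the page cache, so a real test would probably want direct I/O.

```python
import subprocess
import sys

# Child program: open the path unbuffered and read one 4 KiB block.
_READER = "import sys; f = open(sys.argv[1], 'rb', buffering=0); f.read(4096)"


def probe_read(path, timeout=10.0):
    """Return 'ok', 'error', or 'hung' for a single read of `path`."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _READER, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        status = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not wait after kill(): a task stuck in uninterruptible sleep
        # would not exit, and waiting on it would hang this probe as well.
        proc.kill()
        return "hung"
    return "ok" if status == 0 else "error"


if __name__ == "__main__":
    # Example: python probe.py /dev/sda /dev/sdb /dev/md0
    for dev in sys.argv[1:]:
        print(dev, probe_read(dev))
```

With the behaviour described in this report, one would expect "ok" for sda, "error" for sdb and "hung" for md0.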

Neil Brown asked for /proc/sysrq-trigger output,
and concluded that the problem is related to 'fw_event0'.
See his answer below.

Regards,
Hubert Tonneau


Neil Brown wrote:
>
> The fw_event0 process is interesting.
> It seems to be hung trying to 'sync' the drive that has just been pulled.
> If that is somehow causing some IO request from the md/raid1 to be delayed
> then that would certainly hang the array.
> 
> There is a section in the middle of the trace which is missing - presumably
> the sysrq-trigger output overflowed a buffer - that isn't uncommon.
> 
> So I cannot see all the timing clearly.
> How long after pulling the drive was this trace taken?
> 
> I suspect that you need to post this to linux-scsi@vger.kernel.org
> and ask about that fw_event0 thread - whether that should happen, whether it
> has been fixed, and whether it could delay pending IO requests.
> 
> NeilBrown
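
For anyone wanting to capture the same kind of trace, here is a minimal sketch of requesting a task dump through /proc/sysrq-trigger and reading it back with dmesg. The helper names are hypothetical and this is not necessarily how the trace above was produced. The 'w' key dumps only blocked (uninterruptible) tasks, while 't' dumps all tasks; the shorter 'w' dump may be less likely to overflow the log buffer, but that is an assumption, not something tested here.

```python
import subprocess


def request_task_dump(key="w", trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel to dump task state into its log (needs root)."""
    if len(key) != 1:
        raise ValueError("sysrq key must be a single character")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)


def read_kernel_log():
    """Return the current kernel ring buffer as text, via dmesg."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    request_task_dump("w")  # 'w' = blocked tasks, 't' = all tasks
    print(read_kernel_log())
```

Noting how many seconds after pulling the drive the dump was taken would help answer the timing question raised above.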


* MD RAID1 deadlock on failed disk
@ 2010-10-26 22:32 Hubert Tonneau
  0 siblings, 0 replies; 5+ messages in thread
From: Hubert Tonneau @ 2010-10-26 22:32 UTC (permalink / raw)
  To: linux-raid

Hi,

The configuration is:
Perc H200 controller configured with no RAID (mpt2sas driver),
2 SATA disks (sda and sdb),
Linux MD Software RAID1 (md0),
stock Linux 2.6.35.7 kernel.

I hot-unplug the second (sdb) disk, and the result is:
. as expected, I can read sda device,
. as expected, any read to sdb device fails,
. unexpectedly, any read to md0 never returns.

No oops or anything like that in the kernel log.
I did not try the same with other kernel releases.

Regards,
Hubert Tonneau
