linux-kernel.vger.kernel.org archive mirror
* system hang on HDIO_DRIVE_RESET! help!
@ 2003-02-26 16:45 rain.wang
  2003-02-26 19:44 ` Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: rain.wang @ 2003-02-26 16:45 UTC (permalink / raw)
  To: linux-kernel

Hi,
    I issued the HDIO_DRIVE_RESET ioctl, but the system hung without any
response. It only printed some messages from the kernel (v2.4.20):

hda: DMA disabled
hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
bug: kernel timer added twice at c01ce102

     would you please help me with it?

Regards
rain.w
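
For reference, the request that 'hdparm -w' makes can be sketched from user
space roughly as follows. This is an illustration in Python, not the
reporter's program. The number 0x031c is the value of HDIO_DRIVE_RESET in
<linux/hdreg.h>; the helper name drive_reset, the open flags, and the zero
argument are assumptions made for this sketch. It should not be run against
a disk that is in use.

```python
import fcntl
import os

# Value of HDIO_DRIVE_RESET from <linux/hdreg.h>.
HDIO_DRIVE_RESET = 0x031C


def drive_reset(path):
    """Ask the driver to reset the drive behind `path` (hypothetical helper).

    Raises OSError if the open or the ioctl fails, for example when `path`
    is not an IDE block device or the caller lacks privileges.
    """
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # The argument is passed as a plain integer; 0 is an assumption here,
        # since this sketch sends no data along with the request.
        fcntl.ioctl(fd, HDIO_DRIVE_RESET, 0)
    finally:
        os.close(fd)
```

Calling drive_reset("/dev/hda") as root would correspond to the reset
reported above; on anything that is not such a device the call is expected
to fail with OSError.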






* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-02-26 16:45 system hang on HDIO_DRIVE_RESET! help! rain.wang
@ 2003-02-26 19:44 ` Alan Cox
  2003-02-28  5:04   ` rain.wang
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2003-02-26 19:44 UTC (permalink / raw)
  To: rain.wang; +Cc: Linux Kernel Mailing List

On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> Hi,
>     I issued the HDIO_DRIVE_RESET ioctl, but the system hung without any
> response. It only printed some messages from the kernel (v2.4.20):
> 
> hda: DMA disabled
> hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> bug: kernel timer added twice at c01ce102
> 
>      would you please help me with it?

Does this still occur on 2.4.21pre? It should be fixed now.



* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-02-26 19:44 ` Alan Cox
@ 2003-02-28  5:04   ` rain.wang
  2003-02-28 13:35     ` Alan Cox
  2003-03-04 13:22     ` rain.wang
  0 siblings, 2 replies; 12+ messages in thread
From: rain.wang @ 2003-02-28  5:04 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:

> On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> > Hi,
> >     I issued the HDIO_DRIVE_RESET ioctl, but the system hung without any
> > response. It only printed some messages from the kernel (v2.4.20):
> >
> > hda: DMA disabled
> > hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> > bug: kernel timer added twice at c01ce102
> >
> >      would you please help me with it?
>
> Does this still occur on 2.4.21pre? It should be fixed now.

I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still
exists, with the same messages as in 2.4.20.

rain.w




* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-02-28 13:35     ` Alan Cox
@ 2003-02-28 13:30       ` rain.wang
  0 siblings, 0 replies; 12+ messages in thread
From: rain.wang @ 2003-02-28 13:30 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:

> On Fri, 2003-02-28 at 05:04, rain.wang wrote:
> > > Does this still occur on 2.4.21pre? It should be fixed now.
> >
> > I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still
> > exists, with the same messages as in 2.4.20.
>
> What controller are you using? I'll look into it a bit further.

An Intel 82801AA host controller. I also found that when I disable DMA before
doing the drive reset, the system does not hang most of the time. So it does
not seem to be tightly tied to the host chip, does it?

rain.w
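
The workaround described above (turn DMA off, then reset) can be sketched as
a two-step sequence. This is only an illustration: it assumes the hdparm
options -d0 (disable DMA) and -w (device reset), and the helper name
reset_with_dma_off and its injectable `run` parameter are hypothetical, added
so the sequence can be inspected without touching a disk.

```python
import subprocess


def reset_with_dma_off(device, run=subprocess.run):
    """Disable DMA on `device`, then request a drive reset.

    `run` defaults to subprocess.run; passing a stub makes this a dry run.
    Returns the commands in the order they were issued.
    """
    steps = [
        ["hdparm", "-d0", device],  # switch the drive to PIO first
        ["hdparm", "-w", device],   # then issue the reset
    ]
    for cmd in steps:
        run(cmd, check=True)  # stop if either step fails
    return steps
```

As reported, this only made the hang less frequent, so it is a mitigation to
try rather than a fix.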




* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-02-28  5:04   ` rain.wang
@ 2003-02-28 13:35     ` Alan Cox
  2003-02-28 13:30       ` rain.wang
  2003-03-04 13:22     ` rain.wang
  1 sibling, 1 reply; 12+ messages in thread
From: Alan Cox @ 2003-02-28 13:35 UTC (permalink / raw)
  To: rain.wang; +Cc: Linux Kernel Mailing List

On Fri, 2003-02-28 at 05:04, rain.wang wrote:
> > Does this still occur on 2.4.21pre? It should be fixed now.
>
> I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still
> exists, with the same messages as in 2.4.20.

What controller are you using? I'll look into it a bit further.



* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-02-28  5:04   ` rain.wang
  2003-02-28 13:35     ` Alan Cox
@ 2003-03-04 13:22     ` rain.wang
  2003-03-04 15:27       ` Alan Cox
  1 sibling, 1 reply; 12+ messages in thread
From: rain.wang @ 2003-03-04 13:22 UTC (permalink / raw)
  To: Alan Cox, Linux Kernel Mailing List

"rain.wang" wrote:

> Alan Cox wrote:
>
> > On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> > > Hi,
> > >     I issued the HDIO_DRIVE_RESET ioctl, but the system hung without any
> > > response. It only printed some messages from the kernel (v2.4.20):
> > >
> > > hda: DMA disabled
> > > hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> > > bug: kernel timer added twice at c01ce102
> > >
> > >      would you please help me with it?
> >
> > Does this still occur on 2.4.21pre? It should be fixed now.
>
> I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still
> exists, with the same messages as in 2.4.20.
>
> rain.w

Hi Alan,
    I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1, and the system
crashed with a kernel oops message:
    kernel BUG at ide-iops:1046!
    ...

    Can this be resolved?

rain.w


* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-04 13:22     ` rain.wang
@ 2003-03-04 15:27       ` Alan Cox
  2003-03-07  6:04         ` rain.wang
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2003-03-04 15:27 UTC (permalink / raw)
  To: rain.wang; +Cc: Linux Kernel Mailing List

On Tue, 2003-03-04 at 13:22, rain.wang wrote:
>     I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1, and the system
> crashed with a kernel oops message:
>     kernel BUG at ide-iops:1046!
>     ...
>
>     Can this be resolved?

Once I understand what all the problems are, yes. The BUG() is good: it
confirms that what we are both seeing is the same thing, namely that the
reset is managing to issue two commands to the controller at the same time.
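
The failure mode described here, and the earlier "handler not null" message,
can be pictured with a toy model: a channel that allows only one pending
completion handler at a time. This is a sketch for illustration, not the
kernel's IDE code; ToyChannel, ChannelBusyError, read_done and reset_poll are
all hypothetical names.

```python
class ChannelBusyError(RuntimeError):
    """Raised when a second command is armed while one is still pending."""


class ToyChannel:
    """Toy stand-in for an IDE channel that tracks one pending handler."""

    def __init__(self):
        self.handler = None

    def set_handler(self, handler):
        # Analogue of the "handler not null" diagnostic: arming a new handler
        # while an old one is pending means two commands are in flight.
        if self.handler is not None:
            raise ChannelBusyError(
                f"handler not null; old={self.handler.__name__}, "
                f"new={handler.__name__}"
            )
        self.handler = handler

    def complete(self):
        # The pending command finishes: clear the slot, then run its handler.
        handler, self.handler = self.handler, None
        if handler is not None:
            handler()


def read_done():
    """Completion handler of an ordinary command (does nothing here)."""


def reset_poll():
    """Completion handler of the reset (does nothing here)."""


def demo():
    """Arm a reset while a normal command is still pending."""
    chan = ToyChannel()
    chan.set_handler(read_done)       # a normal command is in flight
    try:
        chan.set_handler(reset_poll)  # the reset arrives before it completes
    except ChannelBusyError as err:
        return str(err)
    return "no conflict"
```

In this model the conflict disappears if complete() is called before the
reset is armed, which is the ordering a correct reset path would have to
guarantee.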



* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-04 15:27       ` Alan Cox
@ 2003-03-07  6:04         ` rain.wang
  2003-03-07 12:58           ` Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: rain.wang @ 2003-03-07  6:04 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:

> On Tue, 2003-03-04 at 13:22, rain.wang wrote:
> >     I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1, and the system
> > crashed with a kernel oops message:
> >     kernel BUG at ide-iops:1046!
> >     ...
> >
> >     Can this be resolved?
>
> Once I understand what all the problems are, yes. The BUG() is good: it
> confirms that what we are both seeing is the same thing, namely that the
> reset is managing to issue two commands to the controller at the same time.

Hi,
    Thank you, Alan. I tested the pre5-ac2 patch and everything seems OK.

rain.w



* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-07  6:04         ` rain.wang
@ 2003-03-07 12:58           ` Alan Cox
  2003-03-14  8:28             ` rain.wang
  0 siblings, 1 reply; 12+ messages in thread
From: Alan Cox @ 2003-03-07 12:58 UTC (permalink / raw)
  To: rain.wang; +Cc: Linux Kernel Mailing List

On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > Once I understand what all the problems are, yes. The BUG() is good: it
> > confirms that what we are both seeing is the same thing, namely that the
> > reset is managing to issue two commands to the controller at the same time.
>
> Hi,
>     Thank you, Alan. I tested the pre5-ac2 patch and everything seems OK.

Thanks for the confirmation that it is fixed.



* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-07 12:58           ` Alan Cox
@ 2003-03-14  8:28             ` rain.wang
  2003-03-14  9:13               ` Andre Hedrick
  0 siblings, 1 reply; 12+ messages in thread
From: rain.wang @ 2003-03-14  8:28 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:

> On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > > Once I understand what all the problems are, yes. The BUG() is good: it
> > > confirms that what we are both seeing is the same thing, namely that the
> > > reset is managing to issue two commands to the controller at the same time.
> >
> > Hi,
> >     Thank you, Alan. I tested the pre5-ac2 patch and everything seems OK.
>
> Thanks for the confirmation that it is fixed.

Hi Alan,
    This is for the 2.4.21-pre5-ac2 and -ac3 patches as well.
    There is still a problem with reset. When I run 'hdparm -w /dev/hda' once
after another by hand, all seems OK. But when I write a shell script that
runs 'hdparm -w' several times in a loop, the system always crashes on the
second iteration and leaves oops messages:
    kernel BUG at ide.c:1700!
    ...
So, are there any bugs still lurking there?

rain.w
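
The reproduction described above (several resets issued back to back from a
script) can be sketched as a small loop. The original script was not posted,
so this is a guess at its shape, written in Python rather than shell; the
helper name reset_loop and its injectable `run` parameter are hypothetical.

```python
import subprocess


def reset_loop(device, count, run=subprocess.run):
    """Run `hdparm -w <device>` `count` times with no pause in between.

    `run` defaults to subprocess.run; passing a stub makes this a dry run.
    Returns the list of commands that were issued.
    """
    issued = []
    for _ in range(count):
        cmd = ["hdparm", "-w", device]
        run(cmd, check=True)  # stop at the first reset that fails
        issued.append(cmd)
    return issued
```

With the real runner, reset_loop("/dev/hda", 5) would issue five resets in
quick succession, which is the pattern reported to crash on the second
iteration.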




* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-14  8:28             ` rain.wang
@ 2003-03-14  9:13               ` Andre Hedrick
  2003-03-14 14:21                 ` Alan Cox
  0 siblings, 1 reply; 12+ messages in thread
From: Andre Hedrick @ 2003-03-14  9:13 UTC (permalink / raw)
  To: rain.wang; +Cc: Alan Cox, Linux Kernel Mailing List


Rain,

The only way to deal with this is to treat the operations as failed and
punch them back out to block for clean up.  Now we have failed the command.
However, I think I need to set a default block hook during the reset
process for the drive, channel, hba ... depending on the magnitude of the
wrecking ball generated.  I need to offline Alan for this core dump.

The hang is in the clean ups after the reset.

I suspect the driver/hba is in DMA and drive is not.

Cheers,

Andre Hedrick
LAD Storage Consulting Group
------------------------------------
Pokemon (n), A Jamaican proctologist
------------------------------------

On Fri, 14 Mar 2003, rain.wang wrote:

> Alan Cox wrote:
> 
> > On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > > > Once I understand what all the problems are, yes. The BUG() is good: it
> > > > confirms that what we are both seeing is the same thing, namely that the
> > > > reset is managing to issue two commands to the controller at the same time.
> > >
> > > Hi,
> > >     Thank you, Alan. I tested the pre5-ac2 patch and everything seems OK.
> >
> > Thanks for the confirmation that it is fixed.
>
> Hi Alan,
>     This is for the 2.4.21-pre5-ac2 and -ac3 patches as well.
>     There is still a problem with reset. When I run 'hdparm -w /dev/hda' once
> after another by hand, all seems OK. But when I write a shell script that
> runs 'hdparm -w' several times in a loop, the system always crashes on the
> second iteration and leaves oops messages:
>     kernel BUG at ide.c:1700!
>     ...
> So, are there any bugs still lurking there?
> 
> rain.w




* Re: system hang on HDIO_DRIVE_RESET! help!
  2003-03-14  9:13               ` Andre Hedrick
@ 2003-03-14 14:21                 ` Alan Cox
  0 siblings, 0 replies; 12+ messages in thread
From: Alan Cox @ 2003-03-14 14:21 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: rain.wang, Linux Kernel Mailing List

On Fri, 2003-03-14 at 09:13, Andre Hedrick wrote:
> Rain,
> 
> The only way to deal with this is to treat the operations as failed and
> punch them back out to block for clean up.  Now we have failed the command.
> However, I think I need to set a default block hook during the reset
> process for the drive, channel, hba ... depending on the magnitude of the
> wrecking ball generated.  I need to offline Alan for this core dump.

I fixed one set of races with resets, and it doesn't surprise me that there
is another right now.


