linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: PATCH, IDE corruption, 2.4.18
@ 2002-05-03 20:39 Neil Conway
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Conway @ 2002-05-03 20:39 UTC (permalink / raw)
  To: linux-kernel

Darn, forgot to mention: this only bites if the CD
drive supports DMA.

Neil

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05 20:44 ` Neil Conway
@ 2002-05-06  0:23   ` Mike Fedyk
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Fedyk @ 2002-05-06  0:23 UTC (permalink / raw)
  To: Neil Conway, Andrew Morton; +Cc: Bartlomiej Zolnierkiewicz, linux-kernel

On Sun, May 05, 2002 at 09:44:31PM +0100, Neil Conway wrote:
>  Also, does anyone understand why screwing up a DMA transfer results in
> the trashing of inodes?  Even better, how come this hasn't bitten many
> more people?  Surely there are lots of people out there with disks and
> CDs on the same IDE cable...
> 
> Neil
> (PS: I have reproduced the problem on two systems so far.)

That seems to be a seperate problem with the block layer and locked buffers
or pages (don't remember which).

I think a patch was submitted and integrated sometime in 2.4.19-pre.  Andrew
Morton would know more.

Mike

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05 15:49 Bartlomiej Zolnierkiewicz
  2002-05-05 15:04 ` Martin Dalecki
@ 2002-05-05 20:44 ` Neil Conway
  2002-05-06  0:23   ` Mike Fedyk
  1 sibling, 1 reply; 14+ messages in thread
From: Neil Conway @ 2002-05-05 20:44 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: linux-kernel

--- Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> wrote:
> 
> You've got been mistaken by unfortunate name (Martin changed
> name dmaproc() to udma() in 2.5.12).
> [snip]
> btw. udma() name is really misleading,
>      it should be read (u)dma() not udma() :)

Ouch, thanks for the wakeup.  I was scanning the code a little too
rapidly it seems...

Martin: why?  The old IDE code was admittedly in need of some work, but
a name like dmaproc is very obviously a function.  A name like udma is
likely to be (a) misconstrued by lusers like me as a variable, and (b)
misconstrued by all and sundry as something UDMA-specific, rather than
DMA-specific.  Would it really be too much grief to rename it back to
dmaproc()?  Misleading code will mutate into buggy code.

:-)

Now that a few people have cast their eyes over my report+patch, is it
safe to assume that the problem is real and not specific to my systems?
 Also, does anyone understand why screwing up a DMA transfer results in
the trashing of inodes?  Even better, how come this hasn't bitten many
more people?  Surely there are lots of people out there with disks and
CDs on the same IDE cable...

Neil
(PS: I have reproduced the problem on two systems so far.)

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
@ 2002-05-05 15:49 Bartlomiej Zolnierkiewicz
  2002-05-05 15:04 ` Martin Dalecki
  2002-05-05 20:44 ` Neil Conway
  0 siblings, 2 replies; 14+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2002-05-05 15:49 UTC (permalink / raw)
  To: Neil Conway; +Cc: linux-kernel


> Explanation: some code now differs in the code path concerned, and
> ide_register_subdriver now only calls ide_dma_check for UDMA drives
> (previously all DMA drives), but ultimately ide_dma_check still ends up
> in ide_config_drive_speed, and that's still fuctionally the same as
> 2.4.

You've got been mistaken by unfortunate name (Martin changed
name dmaproc() to udma() in 2.5.12).
Code calls ide_dma_check for chipsets which registerered udma()
handler (formerly dmaproc()), I think the same 2.4 does.

btw. udma() name is really misleading,
     it should be read (u)dma() not udma() :)

--
bkz


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05 15:49 Bartlomiej Zolnierkiewicz
@ 2002-05-05 15:04 ` Martin Dalecki
  2002-05-05 20:44 ` Neil Conway
  1 sibling, 0 replies; 14+ messages in thread
From: Martin Dalecki @ 2002-05-05 15:04 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Neil Conway, linux-kernel

Uz.ytkownik Bartlomiej Zolnierkiewicz napisa?:
>>Explanation: some code now differs in the code path concerned, and
>>ide_register_subdriver now only calls ide_dma_check for UDMA drives
>>(previously all DMA drives), but ultimately ide_dma_check still ends up
>>in ide_config_drive_speed, and that's still fuctionally the same as
>>2.4.
> 
> 
> You've got been mistaken by unfortunate name (Martin changed
> name dmaproc() to udma() in 2.5.12).
> Code calls ide_dma_check for chipsets which registerered udma()
> handler (formerly dmaproc()), I think the same 2.4 does.
> 
> btw. udma() name is really misleading,
>      it should be read (u)dma() not udma() :)


It's just an intermediate step before this whole crap get's
trown over ;-).


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05  7:36     ` Mike Fedyk
@ 2002-05-05  9:44       ` Neil Conway
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Conway @ 2002-05-05  9:44 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: linux-kernel

Well, downloaded 2.5.13 last night.  Executive summary: 2.5.13 IS
buggy.

Explanation: some code now differs in the code path concerned, and
ide_register_subdriver now only calls ide_dma_check for UDMA drives
(previously all DMA drives), but ultimately ide_dma_check still ends up
in ide_config_drive_speed, and that's still fuctionally the same as
2.4.

My patch doesn't apply though, due to slight changes to types and
macros, but it's not exactly a hard patch to port.  Moreover, it's
probably better to fix the problem more directly than to port my patch
;-))

Neil



__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05  1:54   ` Neil Conway
@ 2002-05-05  7:36     ` Mike Fedyk
  2002-05-05  9:44       ` Neil Conway
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Fedyk @ 2002-05-05  7:36 UTC (permalink / raw)
  To: Neil Conway; +Cc: linux-kernel

On Sun, May 05, 2002 at 02:54:17AM +0100, Neil Conway wrote:
> Hi...
> 
>  --- Mike Fedyk <mfedyk@matchmail.com> wrote: > On Sat, May 04, 2002 at
> 01:15:20PM +0100, Neil Conway wrote:
> > > -	byte stat;
> > > +	byte stat,unit;
> > 
> > [snip]
> > 
> > >  #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
> > > -	byte unit = (drive->select.b.unit & 0x01);
> > > +	unit = (drive->select.b.unit & 0x01);
> > 
> > Why are you moving the init of "unit" out of that ifdef?
> 
> Basically to make it compile.  I'm only moving the declaration, and
> that's only because I need to abort the routine BEFORE the second line
> of the ifdef switches off DMA on that unit.  I did compile a version
> with the ifdef split into two bits but I decided that was a little
> messy. (It won't compile if my check goes in before the first line of
> the ifdef as originally written as it declares a variable and C won't
> let declarations follow plain code.)
>

Right.  Duh on my part, sorry.

> > Can you see if this problem is still in 2.5 also? 
> 
> I haven't got a 2.5.13 tree but I found 2.5.7 on a source browser
> online and verified that, back then at least, ide-features.c was still
> basically the same.  Of course, the routines in between
> ide_register_subdriver and ide_config_drive_speed might have been
> different.  If someone can look at the code-path between these two
> routines in 2.5.13 to see if there is any check on whether or not the
> hwgroup is busy (or simply whether or not DMA is in progress) that
> would clear it up.  I'll probably download 2.5.13 sometime soon anyway.
> 

Andre cleaims that the situation is worse on 2.5.13 or so.  I wouldn't hurt
to test-run the code though.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-04 22:58 ` Andre Hedrick
@ 2002-05-05  2:10   ` Neil Conway
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Conway @ 2002-05-05  2:10 UTC (permalink / raw)
  To: Andre Hedrick; +Cc: linux-kernel

Ah good stuff, an honest-to-god expert with brains for the picking :)

 --- Andre Hedrick <andre@linux-ide.org> wrote: > 
> It is a viable check but it the need for it shows another problem in
> the
> code.  In the IDEDMA recovery kludges it must be losing the the
> hwgroup
> busy state.

Have I misunderstood then?  My understanding was this:
ide_config_drive_speed is doing a SELECT_DRIVE, which is quite liable
to happen during a DMA transfer to/from the other unit on the hwif, and
thus screw up that transfer.

The recovery or not (from the screw-up) is a secondary factor (and it
would be nice if such recoveries worked) but the primary fix is surely
to make sure we don't gratuitously screw up transfers in the first
place.

Or have I just demonstrated my utter ignorance of IDE?

Neil
PS: a secondary issue is that after ide_config_drive_speed buggers up a
transfer, the selected drive is NOT the one that ide_intr expects to be
selected.  I haven't yet got any clue if this is also relevant though.


__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05  0:22 ` Mike Fedyk
  2002-05-05  0:47   ` Andre Hedrick
@ 2002-05-05  1:54   ` Neil Conway
  2002-05-05  7:36     ` Mike Fedyk
  1 sibling, 1 reply; 14+ messages in thread
From: Neil Conway @ 2002-05-05  1:54 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: linux-kernel

Hi...

 --- Mike Fedyk <mfedyk@matchmail.com> wrote: > On Sat, May 04, 2002 at
01:15:20PM +0100, Neil Conway wrote:
> > -	byte stat;
> > +	byte stat,unit;
> 
> [snip]
> 
> >  #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
> > -	byte unit = (drive->select.b.unit & 0x01);
> > +	unit = (drive->select.b.unit & 0x01);
> 
> Why are you moving the init of "unit" out of that ifdef?

Basically to make it compile.  I'm only moving the declaration, and
that's only because I need to abort the routine BEFORE the second line
of the ifdef switches off DMA on that unit.  I did compile a version
with the ifdef split into two bits but I decided that was a little
messy. (It won't compile if my check goes in before the first line of
the ifdef as originally written as it declares a variable and C won't
let declarations follow plain code.)

> Can you see if this problem is still in 2.5 also? 

I haven't got a 2.5.13 tree but I found 2.5.7 on a source browser
online and verified that, back then at least, ide-features.c was still
basically the same.  Of course, the routines in between
ide_register_subdriver and ide_config_drive_speed might have been
different.  If someone can look at the code-path between these two
routines in 2.5.13 to see if there is any check on whether or not the
hwgroup is busy (or simply whether or not DMA is in progress) that
would clear it up.  I'll probably download 2.5.13 sometime soon anyway.

cheers
Neil

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-05  0:22 ` Mike Fedyk
@ 2002-05-05  0:47   ` Andre Hedrick
  2002-05-05  1:54   ` Neil Conway
  1 sibling, 0 replies; 14+ messages in thread
From: Andre Hedrick @ 2002-05-05  0:47 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Neil Conway, linux-kernel


It is not a problem in 2.5, you have a worse one.

NO DMA TIMEOUT RECOVERY -- woohoo ...........

On Sat, 4 May 2002, Mike Fedyk wrote:

> On Sat, May 04, 2002 at 01:15:20PM +0100, Neil Conway wrote:
> > -	byte stat;
> > +	byte stat,unit;
> 
> [snip]
> 
> >  #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
> > -	byte unit = (drive->select.b.unit & 0x01);
> > +	unit = (drive->select.b.unit & 0x01);
> 
> Why are you moving the init of "unit" out of that ifdef?
> 
> Can you see if this problem is still in 2.5 also?
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Andre Hedrick
LAD Storage Consulting Group


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-04 12:15 Neil Conway
  2002-05-04 22:58 ` Andre Hedrick
@ 2002-05-05  0:22 ` Mike Fedyk
  2002-05-05  0:47   ` Andre Hedrick
  2002-05-05  1:54   ` Neil Conway
  1 sibling, 2 replies; 14+ messages in thread
From: Mike Fedyk @ 2002-05-05  0:22 UTC (permalink / raw)
  To: Neil Conway; +Cc: linux-kernel

On Sat, May 04, 2002 at 01:15:20PM +0100, Neil Conway wrote:
> -	byte stat;
> +	byte stat,unit;

[snip]

>  #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
> -	byte unit = (drive->select.b.unit & 0x01);
> +	unit = (drive->select.b.unit & 0x01);

Why are you moving the init of "unit" out of that ifdef?

Can you see if this problem is still in 2.5 also?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
  2002-05-04 12:15 Neil Conway
@ 2002-05-04 22:58 ` Andre Hedrick
  2002-05-05  2:10   ` Neil Conway
  2002-05-05  0:22 ` Mike Fedyk
  1 sibling, 1 reply; 14+ messages in thread
From: Andre Hedrick @ 2002-05-04 22:58 UTC (permalink / raw)
  To: Neil Conway; +Cc: linux-kernel


It is a viable check but it the need for it shows another problem in the
code.  In the IDEDMA recovery kludges it must be losing the the hwgroup
busy state.


On Sat, 4 May 2002, Neil Conway wrote:

> Argh, humble apologies.  Just noticed that Yahoo didn't like it that my
> attachment didn't have a suffix and encoded it base64 :((
> 
> Here it is as a plain text attachment.
> 
> Neil
> 
> __________________________________________________
> Do You Yahoo!?
> Everything you'll ever need on one web page
> from News and Sport to Email and Music Charts
> http://uk.my.yahoo.com

Andre Hedrick
LAD Storage Consulting Group


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: PATCH, IDE corruption, 2.4.18
@ 2002-05-04 12:15 Neil Conway
  2002-05-04 22:58 ` Andre Hedrick
  2002-05-05  0:22 ` Mike Fedyk
  0 siblings, 2 replies; 14+ messages in thread
From: Neil Conway @ 2002-05-04 12:15 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 355 bytes --]

Argh, humble apologies.  Just noticed that Yahoo didn't like it that my
attachment didn't have a suffix and encoded it base64 :((

Here it is as a plain text attachment.

Neil

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

[-- Attachment #2: ide_patch030502.txt --]
[-- Type: text/plain, Size: 706 bytes --]

--- ide-features.c.orig	Fri Feb  9 19:40:02 2001
+++ ide-features.c	Fri May  3 20:21:58 2002
@@ -281,12 +281,18 @@
  */
 int ide_config_drive_speed (ide_drive_t *drive, byte speed)
 {
+	ide_hwgroup_t *hwgroup = HWGROUP(drive);
 	ide_hwif_t *hwif = HWIF(drive);
 	int	i, error = 1;
-	byte stat;
+	byte stat,unit;
+
+	if (hwgroup->busy) {
+		printk("Argh: hwgroup is busy in ide_config_drive_speed\n");
+		return error;
+	}
 
 #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
-	byte unit = (drive->select.b.unit & 0x01);
+	unit = (drive->select.b.unit & 0x01);
 	outb(inb(hwif->dma_base+2) & ~(1<<(5+unit)), hwif->dma_base+2);
 #endif /* (CONFIG_BLK_DEV_IDEDMA) && !(CONFIG_DMA_NONPCI) */
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* PATCH, IDE corruption, 2.4.18
@ 2002-05-03 20:36 Neil Conway
  0 siblings, 0 replies; 14+ messages in thread
From: Neil Conway @ 2002-05-03 20:36 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1762 bytes --]

Well, I say patch, but it's really a band-aid.

[Caveats: before reading further, understand that I'm
not by any means an IDE expert; I have just figured
out how to prevent disk corruption on my PC.  I may
have completely missed the point.]

While debugging some "0x58" status errors (which I
believed to have lead to disk corruption) I discovered
that ide_config_drive_speed() uses SELECT_DRIVE
without checking to see if there are disk transfers in
progress (by definition these have to be DMA
transfers).  Unless I've misread the ATA specs, this
is a Very Bad Thing.

This means that when modules which call
ide_register_subdriver() are loaded (so far I think
it's just ide-cd and ide-scsi) any disk transfers in
progress on the other half of the cable are stuffed.

This is very easy to reproduce: just do a "dd
if=/dev/hda of=/dev/null" and a few of "rmmod
ide-cd;modprobe ide-cd" (where there must be a CD on
hdb, or hdc and hdd but you get the idea).  On my box,
this is about 80% successful each module-load at
causing the "0x58" status error.  Repeated usage
trashes inodes (don't know why) and even causes
lockups (don't know why either).  This suggests that
the error-recovery doesn't work very well...

Anyway, here's the band aid as promised.  A timeout,
or a queued request would be better, but I'm not sure
of how best to do either, so at least now the real
experts can fix it properly...

cheers
Neil
PS: don't know (but doubt it) if this is what caused
the 0x58 problems in 2.2 etc.
PPS: I'm off-list at present, so please CC me if you
want me to notice responses.


__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

[-- Attachment #2: ide_patch030502 --]
[-- Type: application/x-unknown, Size: 706 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2002-05-06  0:23 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-05-03 20:39 PATCH, IDE corruption, 2.4.18 Neil Conway
  -- strict thread matches above, loose matches on Subject: below --
2002-05-05 15:49 Bartlomiej Zolnierkiewicz
2002-05-05 15:04 ` Martin Dalecki
2002-05-05 20:44 ` Neil Conway
2002-05-06  0:23   ` Mike Fedyk
2002-05-04 12:15 Neil Conway
2002-05-04 22:58 ` Andre Hedrick
2002-05-05  2:10   ` Neil Conway
2002-05-05  0:22 ` Mike Fedyk
2002-05-05  0:47   ` Andre Hedrick
2002-05-05  1:54   ` Neil Conway
2002-05-05  7:36     ` Mike Fedyk
2002-05-05  9:44       ` Neil Conway
2002-05-03 20:36 Neil Conway

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).