linux-kernel.vger.kernel.org archive mirror
* FS-corrupting IDE bug still in 2.4.20-rc3
@ 2002-11-27 10:23 Neil Conway
  2002-11-28 21:03 ` Marcelo Tosatti
  0 siblings, 1 reply; 4+ messages in thread
From: Neil Conway @ 2002-11-27 10:23 UTC (permalink / raw)
  To: Andre Hedrick, Marcelo Tosatti; +Cc: lkml

[-- Attachment #1: Type: text/plain, Size: 1212 bytes --]

Guys - you may remember this one from May this year.

I've been off-list and not paying much attention since Andre
acknowledged it was a bug (and didn't like my patch).

I recently needed to compile 2.4.19 and was surprised to find the bug
still present.  On examining 2.4.20-rc3 it still seems to be there too
-- no time to compile yet, sorry, but since 2.4.20 is imminent I
thought I should err on the side of caution and remind people about the
bug.

Let me be very clear: this bug has corrupted filesystems on three
machines of mine.  All of these had PIIX chipsets.  I have also
reproduced it on a VIA chipset, but since that machine was production I
didn't try very hard to corrupt the fs.

The patch is not a real fix, merely a workaround.  But since 6 months
have already elapsed, can I request that the patch be applied now, and
when Andre creates a proper fix we can use that.

I've updated the comments in the patch to reflect the fact that I now
realise it's not only DMA transfers that can be trashed by the bug.

Neil


__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com

[-- Attachment #2: ide_patch_2.4.20-rc3_261102.txt --]
[-- Type: text/plain, Size: 2204 bytes --]

--- ide-features.c.orig	Wed Nov 27 09:34:18 2002
+++ ide-features.c	Wed Nov 27 10:06:49 2002
@@ -272,12 +272,39 @@
  */
 int ide_config_drive_speed (ide_drive_t *drive, byte speed)
 {
+	ide_hwgroup_t *hwgroup = HWGROUP(drive);
 	ide_hwif_t *hwif = HWIF(drive);
 	int	i, error = 1;
-	byte stat;
+	byte stat,unit;
+	unsigned long flags;
+
+	spin_lock_irqsave(&io_request_lock, flags);
+	/*
+	 * XXXXX FIXME:
+	 * The next test is a band-aid.  This is because this routine can be
+	 * called while the hwgroup is busy - e.g., after a DMA or PIO
+	 * transfer has been initiated.  Known culprits: so far, the only
+	 * known way to trigger the bug is to load an IDE CD module (both
+	 * ide-scsi and ide-cd count) - on most chipsets, this ultimately
+	 * causes a call to this routine with no regard for the busy-ness of
+	 * the hwgroup.  If a transfer is in progress, then as soon as we issue
+	 * the SELECT_DRIVE() command below, we trash it.  This has caused
+	 * fs corruption (it probably shouldn't!).
+	 *
+	 * The RIGHT way to deal with this is probably either to queue the
+	 * call for execution when the hwgroup isn't busy, OR (dodgy?) to sleep
+	 * right here in this routine until it isn't busy.  We also now have
+	 * to use the io_request_lock spinlock to keep SMP systems honest.
+	 * This lot is temporary, pending a real fix.  NJC 9/5/02, 26/11/02
+	 */
+	if (hwgroup) if (hwgroup->busy) {
+		spin_unlock_irqrestore(&io_request_lock, flags);
+		printk("Argh: hwgroup is busy in ide_config_drive_speed\n");
+		return error;
+	}
 
 #if defined(CONFIG_BLK_DEV_IDEDMA) && !defined(CONFIG_DMA_NONPCI)
-	byte unit = (drive->select.b.unit & 0x01);
+	unit = (drive->select.b.unit & 0x01);
 	outb(inb(hwif->dma_base+2) & ~(1<<(5+unit)), hwif->dma_base+2);
 #endif /* (CONFIG_BLK_DEV_IDEDMA) && !(CONFIG_DMA_NONPCI) */
 
@@ -338,6 +365,7 @@
 	enable_irq(hwif->irq);
 
 	if (error) {
+		spin_unlock_irqrestore(&io_request_lock, flags);
 		(void) ide_dump_status(drive, "set_drive_speed_status", stat);
 		return error;
 	}
@@ -371,6 +399,7 @@
 		case XFER_SW_DMA_0: drive->id->dma_1word |= 0x0101; break;
 		default: break;
 	}
+	spin_unlock_irqrestore(&io_request_lock, flags);
 	return error;
 }
 

^ permalink raw reply	[flat|nested] 4+ messages in thread
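
Editorial sketch of the workaround's control flow, for readers who do not have the 2.4 tree to hand. This is a userspace model, not kernel code: `fake_lock`, `fake_hwgroup` and `config_speed_sketch` are hypothetical names invented for illustration, and the counter only stands in for `io_request_lock`. It shows the two things the patch relies on: the routine bails out when the hwgroup is busy, and every return path releases the lock it took on entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real 2.4 types. */
struct fake_lock { int depth; };      /* counts unmatched acquisitions */
struct fake_hwgroup { int busy; };    /* nonzero while a transfer is in flight */

void fake_lock_acquire(struct fake_lock *l)
{
	l->depth++;
}

void fake_lock_release(struct fake_lock *l)
{
	assert(l->depth > 0);         /* releasing an unheld lock is a bug */
	l->depth--;
}

/*
 * Models the shape of the patched routine: returns 1 on failure (busy
 * hwgroup or drive error) and 0 on success.  Each of the three exit
 * paths releases the lock exactly once.
 */
int config_speed_sketch(struct fake_lock *lock,
			const struct fake_hwgroup *hwgroup,
			int drive_reports_error)
{
	fake_lock_acquire(lock);

	/* Band-aid: refuse to touch the drive while a transfer is pending. */
	if (hwgroup && hwgroup->busy) {
		fake_lock_release(lock);
		return 1;
	}

	/* Stand-in for the status check after the speed command is issued. */
	if (drive_reports_error) {
		fake_lock_release(lock);
		return 1;
	}

	fake_lock_release(lock);
	return 0;
}
```

If any exit path skipped the release, `depth` would stay nonzero after the call, which is the userspace analogue of returning with the spinlock still held.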

* Re: FS-corrupting IDE bug still in 2.4.20-rc3
  2002-11-27 10:23 FS-corrupting IDE bug still in 2.4.20-rc3 Neil Conway
@ 2002-11-28 21:03 ` Marcelo Tosatti
  0 siblings, 0 replies; 4+ messages in thread
From: Marcelo Tosatti @ 2002-11-28 21:03 UTC (permalink / raw)
  To: Neil Conway; +Cc: Andre Hedrick, lkml



On Wed, 27 Nov 2002, Neil Conway wrote:

> Guys - you may remember this one from May this year.
>
> I've been off-list and not paying much attention since Andre
> acknowledged it was a bug (and didn't like my patch).

Neil,

Sorry for taking so long to answer, but this does not seem to be a
kernel problem.

Andre, could you comment on his issue?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: FS-corrupting IDE bug still in 2.4.20-rc3
  2002-11-27 10:59 Marc-Christian Petersen
@ 2002-11-27 12:36 ` Neil Conway
  0 siblings, 0 replies; 4+ messages in thread
From: Neil Conway @ 2002-11-27 12:36 UTC (permalink / raw)
  To: Marc-Christian Petersen, linux-kernel; +Cc: nconway_kernel

Hi Marc...

 --- Marc-Christian Petersen <m.c.p@wolk-project.de> wrote:
> You may try that patch with a VIA boxen and I am quite sure you may
> experience a bug that none of your harddisks may be recognized and
> result in a panic();

Aha...  Actually, if you read the patch, you'll see why that's no
longer the case.  I had in fact forgotten that I'd had to patch the
patch to make my VIA box boot (it was 6 months ago now!).  I now do "if
(hwgroup) ..." in the test-for-busy section.  So, the new patch does
NOT now cause panics on VIA.
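
Editorial sketch of the guard described above. It assumes, as one reading of this message, that the panic came from testing the busy flag through a hwgroup pointer that is not yet set up when the routine is called at boot. `hwgroup_sketch` and `must_back_off` are hypothetical names, not kernel identifiers. The nested `if (hwgroup) if (hwgroup->busy)` in the patch is equivalent to a single short-circuit `&&` test:

```c
#include <stddef.h>

/* Hypothetical stand-in for the hwgroup; only the busy flag matters here. */
struct hwgroup_sketch { int busy; };

/*
 * Returns 1 if the caller must back off, 0 if it may proceed.
 * A NULL hwgroup is treated as "not busy".
 */
int must_back_off(const struct hwgroup_sketch *hwgroup)
{
	/* && short-circuits, so ->busy is never read through a NULL pointer. */
	return hwgroup != NULL && hwgroup->busy;
}
```

With this shape, a call made before the hwgroup exists simply proceeds, while a call made during a transfer is refused.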

> I had the same Fix in WOLK some time ago and many users with VIA
> chipset complained that with the fix their machine does not recognize
> any harddisks and after trying to recognize they had a panic();

Yes, only some chipsets end up in ide_config_drive_speed() at bootup;
notably the PIIX doesn't and the VIA does.  Mea culpa!  I should have
posted the fixed patch perhaps, but then it was deprecated and I
thought Andre had a better fix in hand.  (BTW, when you say you had
"the same fix", you mean you fixed it independently or you used my
patch from May '02?)

> Maybe it's working for you with some VIA chipsets but I removed that
> fix and after removal all users with VIA were happy. I've never heard
> of a FS corruption of them.

Well, do they have the trigger ingredients?  You MUST have an IDE CDROM
sharing a bus with a HDD.  Also, the perfect recipe for disk corruption
is to reboot, and then log in to your chosen desktop, and while it's
hammering the disk starting everything up, it should fire off
"magicdev", which in turn loads ide-cd: BOOM.  RedHat 7.x with GNOME
does things in this order.  YMMV.

Neil


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: FS-corrupting IDE bug still in 2.4.20-rc3
@ 2002-11-27 10:59 Marc-Christian Petersen
  2002-11-27 12:36 ` Neil Conway
  0 siblings, 1 reply; 4+ messages in thread
From: Marc-Christian Petersen @ 2002-11-27 10:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: nconway_kernel

Hi Neil,

> I've been off-list and not paying much attention since Andre
> acknowledged it was a bug (and didn't like my patch).
I can imagine why ...

> Let me be very clear: this bug has corrupted filesystems on three
> machines of mine.  All of these had PIIX chipsets.  I have also
> reproduced it on a VIA chipset, but since that machine was production I
> didn't try very hard to corrupt the fs.
You may try that patch with a VIA boxen and I am quite sure you may experience 
a bug that none of your harddisks may be recognized and result in a 
panic();

> The patch is not a real fix, merely a workaround.  But since 6 months
> have already elapsed, can I request that the patch be applied now, and
> when Andre creates a proper fix we can use that.
I had the same Fix in WOLK some time ago and many users with VIA chipset
complained that with the fix their machine does not recognize any harddisks
and after trying to recognize they had a panic();

Maybe it's working for you with some VIA chipsets but I removed that fix and 
after removal all users with VIA were happy. I've never heard of a FS 
corruption of them.

ciao, Marc

^ permalink raw reply	[flat|nested] 4+ messages in thread
