linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 2.5.65-ac2 -- hda/ide trouble on ICH4
@ 2003-03-22 14:03 Dominik Brodowski
  2003-03-22 16:35 ` Alan Cox
  0 siblings, 1 reply; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-22 14:03 UTC (permalink / raw)
  To: alan; +Cc: linux-kernel

Hi Alan,

unfortunately 2.5.65-ac2 does not boot:

ICH4: IDE controller at PCI slot 00:1f.1
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
...
hda: ICH35L080AVVA07-0, ATA DISK driver
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
...
hda: host protected area => 1
hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
 hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >

and *deadlock*...

in plain 2.5.65 I was seeing strange error messages like:

Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
Mar 19 20:29:55 mondschein kernel:

which are not repeatable when I switch back to 2.5.62.

lspci:
00:00.0 Host bridge: Intel Corp. 82845 845 (Brookdale) Chipset Host Bridge (rev 11)
00:01.0 PCI bridge: Intel Corp. 82845 845 (Brookdale) Chipset AGP Bridge (rev 11)
00:1d.0 USB Controller: Intel Corp. 82801DB USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corp. 82801DB USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corp. 82801DB USB (Hub #3) (rev 01)
00:1d.7 USB Controller: Intel Corp. 82801DB USB EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corp. 82801DB ISA Bridge (LPC) (rev 01)
00:1f.1 IDE interface: Intel Corp. 82801DB ICH4 IDE (rev 01)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon R200 QL [Radeon 8500 LE]
02:03.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
02:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)

parts of dmesg in 2.5.62:
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
ICH4: chipset revision 1
ICH4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:DMA
hda: IC35L080AVVA07-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: HL-DT-STDVD-ROM GDR8161B, ATAPI CD/DVD-ROM drive
hdd: HL-DT-ST GCE-8480B, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: host protected area => 1
hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >


	Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 16:35 ` Alan Cox
@ 2003-03-22 16:25   ` Dominik Brodowski
  2003-03-22 17:42     ` Alan Cox
  2003-03-22 22:03     ` [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4 Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-22 16:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> On Sat, 2003-03-22 at 14:03, Dominik Brodowski wrote:
> > hda: host protected area => 1
> > hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
> >  hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >
> > 
> > and *deadlock*...
> 
> Where is the lock, what does the NMI oopser show ?

The lock is directly "below" that line -- and the NMI oopser isn't
triggered, AFAICT

> > in plain 2.5.65 I was seeing strange error messages like:
> > 
> > Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
> > Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
> > Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
> > Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
> > Mar 19 20:29:55 mondschein kernel:
> 
> I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> code on 2.4 so far. Which is odd but I don't yet understand what is going on.
/me neither, unfortunately :-(

	Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 14:03 2.5.65-ac2 -- hda/ide trouble on ICH4 Dominik Brodowski
@ 2003-03-22 16:35 ` Alan Cox
  2003-03-22 16:25   ` Dominik Brodowski
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Cox @ 2003-03-22 16:35 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Linux Kernel Mailing List

On Sat, 2003-03-22 at 14:03, Dominik Brodowski wrote:
> hda: host protected area => 1
> hda: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=159560/16/63, UDMA(100)
>  hda: [PTBL] [10011/255/63] hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 >
> 
> and *deadlock*...

Where is the lock, what does the NMI oopser show ?

> in plain 2.5.65 I was seeing strange error messages like:
> 
> Mar 19 20:29:55 mondschein kernel: hda: dma_timer_expiry: dma status == 0x24
> Mar 19 20:29:55 mondschein kernel: hda: lost interrupt
> Mar 19 20:29:55 mondschein kernel: hda: dma_intr: bad DMA status (dma_stat=30)
> Mar 19 20:29:55 mondschein kernel: hda: dma_intr: status=0x52 { DriveReady SeekComplete Index }
> Mar 19 20:29:55 mondschein kernel:

I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
code on 2.4 so far. Which is odd but I don't yet understand what is going on.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 17:42     ` Alan Cox
@ 2003-03-22 16:39       ` Jan Dittmer
  2003-03-23  1:03       ` Dominik Brodowski
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Dittmer @ 2003-03-22 16:39 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
> 
>>>Where is the lock, what does the NMI oopser show ?
>>
>>The lock is directly "below" that line -- and the NMI oopser isn't
>>triggered, AFAICT
> 
> 
> Anything useful off right-alt scroll-lock etc ?
> 
I'm seeing the same problem. VIA KT133A platform, nothing after 
partition detection.  No NMI-Watchdog, no sysrq magic. Just dead.
Any patch particular I could try to revert?

Jan

Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686b (rev 40) IDE UDMA100 controller on pci00:07.1
ide0: BM-DMA at 0x9000-0x9007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x9008-0x900f, BIOS settings: hdc:DMA, hdd:DMA
hda: IC35L060AVV207-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: SONY CD-RW CRX175E2, ATAPI CD/DVD-ROM drive
hdd: Pioneer DVD-ROM ATAPIModel DVD-104S 012, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: host protected area => 1
hda: 120103200 sectors (61493 MB) w/1821KiB Cache, CHS=7476/255/63, 
UDMA(100)
  /dev/ide/host0/bus0/target0/lun0: p1 p2 p3 < p5 p6 p7 >


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 16:25   ` Dominik Brodowski
@ 2003-03-22 17:42     ` Alan Cox
  2003-03-22 16:39       ` Jan Dittmer
  2003-03-23  1:03       ` Dominik Brodowski
  2003-03-22 22:03     ` [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4 Bartlomiej Zolnierkiewicz
  1 sibling, 2 replies; 31+ messages in thread
From: Alan Cox @ 2003-03-22 17:42 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Linux Kernel Mailing List

On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
> > Where is the lock, what does the NMI oopser show ?
> 
> The lock is directly "below" that line -- and the NMI oopser isn't
> triggered, AFAICT

Anything useful off right-alt scroll-lock etc ?


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 16:25   ` Dominik Brodowski
  2003-03-22 17:42     ` Alan Cox
@ 2003-03-22 22:03     ` Bartlomiej Zolnierkiewicz
  2003-03-22 23:27       ` Alan Cox
  2003-03-23  9:11       ` Dominik Brodowski
  1 sibling, 2 replies; 31+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-03-22 22:03 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Alan Cox, Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1091 bytes --]


On Sat, 22 Mar 2003, Dominik Brodowski wrote:
> On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> >
> > I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> > code on 2.4 so far. Which is odd but I don't yet understand what is going on.
> /me neither, unfortunately :-(


Alan, I can trigger the same dma_intr bug under 2.4.21-pre5-ac3 but not
2.4.20-ac2 (VIA vt8235 + WD800JB so it is not controller/disk related).

Dominik could you try attached patch with vanilla 2.5.65?
It reverts previous logic of handling masked_irq argument of ide_do_request().

Previously callers called it with masked_irq=0 and disabling/enabling
hwif->irq code wasn't executed, now ide_do_request() is called with
masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.

And no, I don't know wtf is exactly going on there :\.


[ Alan, please forget about yesterday's mail, I hitted dma_intr again today
  with yesterday's patch, with attached patch I hope it is gone now :-) ]

BTW 2.5.64-ac4 deadlocks for me the same way Dominik has described.


Greets
--
Bartlomiej

[-- Attachment #2: Type: TEXT/PLAIN, Size: 1340 bytes --]

--- linux-2.5.65/drivers/ide/ide-io.c	Fri Mar 21 14:24:31 2003
+++ linux-dma_intr-fix/drivers/ide/ide-io.c	Sat Mar 22 22:03:26 2003
@@ -838,14 +838,14 @@
 		 * happens anyway when any interrupt comes in, IDE or otherwise
 		 *  -- the kernel masks the IRQ while it is being handled.
 		 */
-		if (hwif->irq != masked_irq)
+		if (masked_irq && hwif->irq != masked_irq)
 			disable_irq_nosync(hwif->irq);
 		spin_unlock(&ide_lock);
 		local_irq_enable();
 			/* allow other IRQs while we start this request */
 		startstop = start_request(drive, rq);
 		spin_lock_irq(&ide_lock);
-		if (hwif->irq != masked_irq)
+		if (masked_irq && hwif->irq != masked_irq)
 			enable_irq(hwif->irq);
 		if (startstop == ide_released)
 			goto queue_next;
@@ -861,7 +861,7 @@
  */
 void do_ide_request(request_queue_t *q)
 {
-	ide_do_request(q->queuedata, IDE_NO_IRQ);
+	ide_do_request(q->queuedata, 0);
 }
 
 /*
@@ -1009,7 +1009,7 @@
 				hwgroup->busy = 0;
 		}
 	}
-	ide_do_request(hwgroup, IDE_NO_IRQ);
+	ide_do_request(hwgroup, 0);
 	spin_unlock_irqrestore(&ide_lock, flags);
 }
 
@@ -1299,7 +1299,7 @@
 		insert_end = 0;
 	}
 	__elv_add_request(&drive->queue, rq, insert_end, 0);
-	ide_do_request(hwgroup, IDE_NO_IRQ);
+	ide_do_request(hwgroup, 0);
 	spin_unlock_irqrestore(&ide_lock, flags);
 
 	err = 0;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 23:27       ` Alan Cox
@ 2003-03-22 22:33         ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 31+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-03-22 22:33 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dominik Brodowski, Linux Kernel Mailing List


On 22 Mar 2003, Alan Cox wrote:

> On Sat, 2003-03-22 at 22:03, Bartlomiej Zolnierkiewicz wrote:
> > Previously callers called it with masked_irq=0 and disabling/enabling
> > hwif->irq code wasn't executed, now ide_do_request() is called with
> > masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.
>
> You are right - I botched the simplification of that. The logic is actually
> cleaner than I did with a bit more thought - IDE_NO_IRQ can go away and
> we should be using hwif->irq as the argument.

I think so.

--
Bartlomiej


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 22:03     ` [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4 Bartlomiej Zolnierkiewicz
@ 2003-03-22 23:27       ` Alan Cox
  2003-03-22 22:33         ` Bartlomiej Zolnierkiewicz
  2003-03-23  9:11       ` Dominik Brodowski
  1 sibling, 1 reply; 31+ messages in thread
From: Alan Cox @ 2003-03-22 23:27 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Dominik Brodowski, Linux Kernel Mailing List

On Sat, 2003-03-22 at 22:03, Bartlomiej Zolnierkiewicz wrote:
> Previously callers called it with masked_irq=0 and disabling/enabling
> hwif->irq code wasn't executed, now ide_do_request() is called with
> masked_irq=IDE_NO_IRQ=-1 so this code is executed for sure.


You are right - I botched the simplification of that. The logic is actually
cleaner than I did with a bit more thought - IDE_NO_IRQ can go away and
we should be using hwif->irq as the argument.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 17:42     ` Alan Cox
  2003-03-22 16:39       ` Jan Dittmer
@ 2003-03-23  1:03       ` Dominik Brodowski
  2003-03-23 15:47         ` Alan Cox
  1 sibling, 1 reply; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-23  1:03 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

On Sat, Mar 22, 2003 at 05:42:02PM +0000, Alan Cox wrote:
> On Sat, 2003-03-22 at 16:25, Dominik Brodowski wrote:
> > > Where is the lock, what does the NMI oopser show ?
> > 
> > The lock is directly "below" that line -- and the NMI oopser isn't
> > triggered, AFAICT
> 
> Anything useful off right-alt scroll-lock etc ?

not from this debugging source - USB wireless keyboard :) - however, ~1000
printks later I've found out the following: the kernel spins in the while()
loop in drivers/ide/ide_register_driver:

	while (!list_empty(&list)) {
		ide_drive_t *drive = list_entry(list.next, ide_drive_t,
list);
		list_del_init(&drive->list);
		if (drive->present)
			ata_attach(drive);
	}


It was called by ide_register_driver, which itself got called by
idedisk_init. 

	Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-22 22:03     ` [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4 Bartlomiej Zolnierkiewicz
  2003-03-22 23:27       ` Alan Cox
@ 2003-03-23  9:11       ` Dominik Brodowski
  1 sibling, 0 replies; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-23  9:11 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Alan Cox, Linux Kernel Mailing List

On Sat, Mar 22, 2003 at 11:03:33PM +0100, Bartlomiej Zolnierkiewicz wrote:
> 
> On Sat, 22 Mar 2003, Dominik Brodowski wrote:
> > On Sat, Mar 22, 2003 at 04:35:05PM +0000, Alan Cox wrote:
> > >
> > > I've seen 3 or 4 reports of this, none of them duplicatable with the same IDE
> > > code on 2.4 so far. Which is odd but I don't yet understand what is going on.
> > /me neither, unfortunately :-(
> 
> 
> Alan, I can trigger the same dma_intr bug under 2.4.21-pre5-ac3 but not
> 2.4.20-ac2 (VIA vt8235 + WD800JB so it is not controller/disk related).
> 
> Dominik could you try attached patch with vanilla 2.5.65?
> It reverts previous logic of handling masked_irq argument of ide_do_request().

Seems to work fine over here...

> BTW 2.5.64-ac4 deadlocks for me the same way Dominik has described.

plain 2.5.65 does not, but 2.5.65-bk-current does.

	Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-23 15:47         ` Alan Cox
@ 2003-03-23 14:59           ` Dominik Brodowski
  2003-03-23 18:41             ` Alan Cox
  0 siblings, 1 reply; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-23 14:59 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

On Sun, Mar 23, 2003 at 03:47:54PM +0000, Alan Cox wrote:
> On Sun, 2003-03-23 at 01:03, Dominik Brodowski wrote:
> > 	while (!list_empty(&list)) {
> > 		ide_drive_t *drive = list_entry(list.next, ide_drive_t,
> > list);
> > 		list_del_init(&drive->list);
> > 		if (drive->present)
> > 			ata_attach(drive);
> 
> Can you boot and printk the drive name out each iteration see if the list
> is hosed somewhere.

printk("%4s\n", drive->name) prints out "hdd" all the time. 

hda is an ATA disk drive
hdb is empty
hdc is an ATAPI CD/DVD-ROM drive
hdd is an ATAPI CD-ROM CD-R/RW drive

	Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-23  1:03       ` Dominik Brodowski
@ 2003-03-23 15:47         ` Alan Cox
  2003-03-23 14:59           ` Dominik Brodowski
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Cox @ 2003-03-23 15:47 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

On Sun, 2003-03-23 at 01:03, Dominik Brodowski wrote:
> 	while (!list_empty(&list)) {
> 		ide_drive_t *drive = list_entry(list.next, ide_drive_t,
> list);
> 		list_del_init(&drive->list);
> 		if (drive->present)
> 			ata_attach(drive);

Can you boot and printk the drive name out each iteration see if the list
is hosed somewhere.

list is the list of all the drives. We pull the drive off the list
and attach it to the relevant device driver (or ide-default if none
is known).

The behaviour you see certainly sounds like the list got corrupted


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-23 18:41             ` Alan Cox
@ 2003-03-23 18:15               ` Dominik Brodowski
  2003-03-23 18:25                 ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Dominik Brodowski
  0 siblings, 1 reply; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-23 18:15 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

On Sun, Mar 23, 2003 at 06:41:10PM +0000, Alan Cox wrote:
> > printk("%4s\n", drive->name) prints out "hdd" all the time. 
> > 
> > hda is an ATA disk drive
> > hdb is empty
> > hdc is an ATAPI CD/DVD-ROM drive
> > hdd is an ATAPI CD-ROM CD-R/RW drive
> 
> This gets weirder by the minute, and I still can't get it to happen here
> annoyingly.  
>
> The list thats breaking is a private list. We delete the drive from that
> list, if its present (it may be an empty bay) we then use ata_attach 
> to add it to a device list (or back to ata_unused). 
> 
> I find it hard to believe something like this is a compiler bug, but right
> now I don't see how stuff is reappearing on the list.

Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
suspicious. Might be better to use list_for_each_safe() which is designed
exactly for this purpouse. I'm currently recompiling
2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
it works with this kernel, too...

	Dominik

--- linux/drivers/ide/ide.c.original	2003-03-23 19:08:40.000000000 +0100
+++ linux/drivers/ide/ide.c	2003-03-23 19:10:25.000000000 +0100
@@ -2392,6 +2392,8 @@
 int ide_register_driver(ide_driver_t *driver)
 {
 	struct list_head list;
+	struct list_head *list_loop;
+	struct list_head *tmp_storage;
 
 	spin_lock(&drivers_lock);
 	list_add(&driver->drivers, &drivers);
@@ -2402,8 +2404,8 @@
 	list_splice_init(&ata_unused, &list);
 	spin_unlock(&drives_lock);
 
-	while (!list_empty(&list)) {
-		ide_drive_t *drive = list_entry(list.next, ide_drive_t, list);
+	list_for_each_safe(list_loop, tmp_storage, &list) {
+		ide_drive_t *drive = container_of(list_loop, ide_drive_t, list);
 		list_del_init(&drive->list);
 		if (drive->present)
 			ata_attach(drive);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-23 18:15               ` Dominik Brodowski
@ 2003-03-23 18:25                 ` Dominik Brodowski
  2003-03-23 22:16                   ` Jan Dittmer
  2003-03-24  9:55                   ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Alexander Atanasov
  0 siblings, 2 replies; 31+ messages in thread
From: Dominik Brodowski @ 2003-03-23 18:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

On Sun, Mar 23, 2003 at 07:15:33PM +0100, Dominik Brodowski wrote:
> On Sun, Mar 23, 2003 at 06:41:10PM +0000, Alan Cox wrote:
> > > printk("%4s\n", drive->name) prints out "hdd" all the time. 
> > > 
> > > hda is an ATA disk drive
> > > hdb is empty
> > > hdc is an ATAPI CD/DVD-ROM drive
> > > hdd is an ATAPI CD-ROM CD-R/RW drive
> > 
> > This gets weirder by the minute, and I still can't get it to happen here
> > annoyingly.  
> >
> > The list thats breaking is a private list. We delete the drive from that
> > list, if its present (it may be an empty bay) we then use ata_attach 
> > to add it to a device list (or back to ata_unused). 
> > 
> > I find it hard to believe something like this is a compiler bug, but right
> > now I don't see how stuff is reappearing on the list.
> 
> Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
> suspicious. Might be better to use list_for_each_safe() which is designed
> exactly for this purpouse. I'm currently recompiling
> 2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
> it works with this kernel, too...

Yes, it also works with 2.5.65-bkX.

--- linux/drivers/ide/ide.c.original	2003-03-23 19:08:40.000000000 +0100
+++ linux/drivers/ide/ide.c	2003-03-23 19:10:25.000000000 +0100
@@ -2392,6 +2392,8 @@
 int ide_register_driver(ide_driver_t *driver)
 {
 	struct list_head list;
+	struct list_head *list_loop;
+	struct list_head *tmp_storage;
 
 	spin_lock(&drivers_lock);
 	list_add(&driver->drivers, &drivers);
@@ -2402,8 +2404,8 @@
 	list_splice_init(&ata_unused, &list);
 	spin_unlock(&drives_lock);
 
-	while (!list_empty(&list)) {
-		ide_drive_t *drive = list_entry(list.next, ide_drive_t, list);
+	list_for_each_safe(list_loop, tmp_storage, &list) {
+		ide_drive_t *drive = container_of(list_loop, ide_drive_t, list);
 		list_del_init(&drive->list);
 		if (drive->present)
 			ata_attach(drive);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: 2.5.65-ac2 -- hda/ide trouble on ICH4
  2003-03-23 14:59           ` Dominik Brodowski
@ 2003-03-23 18:41             ` Alan Cox
  2003-03-23 18:15               ` Dominik Brodowski
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Cox @ 2003-03-23 18:41 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Linux Kernel Mailing List, B.Zolnierkiewicz

> printk("%4s\n", drive->name) prints out "hdd" all the time. 
> 
> hda is an ATA disk drive
> hdb is empty
> hdc is an ATAPI CD/DVD-ROM drive
> hdd is an ATAPI CD-ROM CD-R/RW drive

This gets weirder by the minute, and I still can't get it to happen here
annoyingly.  

The list thats breaking is a private list. We delete the drive from that
list, if its present (it may be an empty bay) we then use ata_attach 
to add it to a device list (or back to ata_unused). 

I find it hard to believe something like this is a compiler bug, but right
now I don't see how stuff is reappearing on the list.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-23 18:25                 ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Dominik Brodowski
@ 2003-03-23 22:16                   ` Jan Dittmer
  2003-03-24 11:08                     ` PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers Norbert Wolff
  2003-03-24  9:55                   ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Alexander Atanasov
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Dittmer @ 2003-03-23 22:16 UTC (permalink / raw)
  To: Dominik Brodowski; +Cc: Alan Cox, Linux Kernel Mailing List, B.Zolnierkiewicz

Dominik Brodowski wrote:
> On Sun, Mar 23, 2003 at 07:15:33PM +0100, Dominik Brodowski wrote:
>>Just got it to boot :) -- the while(!list_empty...) { list_entry ... looks
>>suspicious. Might be better to use list_for_each_safe() which is designed
>>exactly for this purpouse. I'm currently recompiling
>>2.5.65-bk-current-as-of-yesterday with the attached patch. Let's see whether
>>it works with this kernel, too...
> 
> 
> Yes, it also works with 2.5.65-bkX.
> 

Yes, my system also boots again :)

Thanks,

Jan


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-23 18:25                 ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Dominik Brodowski
  2003-03-23 22:16                   ` Jan Dittmer
@ 2003-03-24  9:55                   ` Alexander Atanasov
  2003-03-24 13:59                     ` Alan Cox
  1 sibling, 1 reply; 31+ messages in thread
From: Alexander Atanasov @ 2003-03-24  9:55 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Linux Kernel Mailing List, Bartlomiej Zolnierkiewicz, Alan Cox

	Hello,

On Sun, 23 Mar 2003, Dominik Brodowski wrote:

> Yes, it also works with 2.5.65-bkX.

	Dominik, can you try this patch on top of 2.5.65-ac3/bk,
i can't reproduce the hang but it seems that drives without driver can get
both in ata_unused and idedefault_driver.drives and lists go nuts.
It kills ata_unused and uses idedefault_driver.drives only,
boots fine here. I'd guess you have ide-cd as module, and the two drives
handled by it couse the trouble - first joins the lists second couses the
loop.

-- 
have fun,
alex

===== drivers/ide/ide.c 1.52 vs edited =====
--- 1.52/drivers/ide/ide.c	Sun Mar 23 02:00:50 2003
+++ edited/drivers/ide/ide.c	Mon Mar 24 08:48:54 2003
@@ -469,7 +469,6 @@
 	return -ENXIO;
 }
 
-static LIST_HEAD(ata_unused);
 static spinlock_t drives_lock = SPIN_LOCK_UNLOCKED;
 static spinlock_t drivers_lock = SPIN_LOCK_UNLOCKED;
 static LIST_HEAD(drivers);
@@ -1440,9 +1439,6 @@
 	spin_unlock(&drivers_lock);
 	if(idedefault_driver.attach(drive) != 0)
 		panic("ide: default attach failed");
-	spin_lock(&drives_lock);
-	list_add_tail(&drive->list, &ata_unused);
-	spin_unlock(&drives_lock);
 	return 1;
 }
 
@@ -2399,7 +2395,7 @@
 
 	spin_lock(&drives_lock);
 	INIT_LIST_HEAD(&list);
-	list_splice_init(&ata_unused, &list);
+	list_splice_init(&idedefault_driver.drives, &list);
 	spin_unlock(&drives_lock);
 
 	while (!list_empty(&list)) {


^ permalink raw reply	[flat|nested] 31+ messages in thread

* PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers
  2003-03-23 22:16                   ` Jan Dittmer
@ 2003-03-24 11:08                     ` Norbert Wolff
  2003-03-24 13:54                       ` Alan Cox
  0 siblings, 1 reply; 31+ messages in thread
From: Norbert Wolff @ 2003-03-24 11:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: alan

Hi !

I tried linux-2.5.65-ac3 with ide-disk and ide-scsi drivers both built in
the Kernel.
Two ide-disks attached to ide0, two CDROMS attached to ide1.
Im using the sis5513-PCI-IDE-Driver, but configuring it out makes no difference.

The System hangs while booting with last messages
	ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
	ide1 at 0x170-0x177,0x376 on irq 15

It seems that all Drives went twice in the driver-list, so ata_attach
gets called twice for them, which leads to the hang.

Dirty Hack that works for me below.

Regards,

	Norbert


--- drivers/ide/ide.c.orig	2003-03-24 10:52:40.000000000 +0000
+++ drivers/ide/ide.c	2003-03-24 10:57:52.000000000 +0000
@@ -1423,6 +1423,14 @@
 {
 	struct list_head *p;
 	spin_lock(&drivers_lock);
+#define _DEBUG 1
+#ifdef _DEBUG
+	printk("ata_attach called for %s\n", drive->name);
+#endif
+	if (drive->already_attached) {
+		printk ("ata_attach: already attached for %s !\n", drive->name);
+		return 0;
+	}
 	list_for_each(p, &drivers) {
 		ide_driver_t *driver = list_entry(p, ide_driver_t, drivers);
 		if (!try_module_get(driver->owner))
@@ -1431,12 +1439,14 @@
 		if (driver->attach(drive) == 0) {
 			module_put(driver->owner);
 			drive->gendev.driver = &driver->gen_driver;
+			drive->already_attached = 1;
 			return 0;
 		}
 		spin_lock(&drivers_lock);
 		module_put(driver->owner);
 	}
 	drive->gendev.driver = &idedefault_driver.gen_driver;
+	drive->already_attached = 1;
 	spin_unlock(&drivers_lock);
 	if(idedefault_driver.attach(drive) != 0)
 		panic("ide: default attach failed");
--- include/linux/ide.h.orig	2003-03-24 11:01:41.000000000 +0000
+++ include/linux/ide.h	2003-03-24 11:02:33.000000000 +0000
@@ -791,6 +791,7 @@
 	int		forced_lun;	/* if hdxlun was given at boot */
 	int		lun;		/* logical unit */
 	int		crc_count;	/* crc counter to reduce drive speed */
+	int     already_attached;  /* Dirty Hack ... */
 	struct list_head list;
 	struct device	gendev;
 	struct gendisk *disk;

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers
  2003-03-24 11:08                     ` PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers Norbert Wolff
@ 2003-03-24 13:54                       ` Alan Cox
  0 siblings, 0 replies; 31+ messages in thread
From: Alan Cox @ 2003-03-24 13:54 UTC (permalink / raw)
  To: Norbert Wolff; +Cc: Linux Kernel Mailing List

On Mon, 2003-03-24 at 11:08, Norbert Wolff wrote:
> Hi !
> 
> I tried linux-2.5.65-ac3 with ide-disk and ide-scsi drivers both built in
> the Kernel.
> Two ide-disks attached to ide0, two CDROMS attached to ide1.
> Im using the sis5513-PCI-IDE-Driver, but configuring it out makes no difference.

See the -ac4 tree for a cleaner fix from Dominik

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24  9:55                   ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Alexander Atanasov
@ 2003-03-24 13:59                     ` Alan Cox
  2003-03-24 16:01                       ` Alexander Atanasov
  0 siblings, 1 reply; 31+ messages in thread
From: Alan Cox @ 2003-03-24 13:59 UTC (permalink / raw)
  To: Alexander Atanasov
  Cc: Dominik Brodowski, Linux Kernel Mailing List, Bartlomiej Zolnierkiewicz

On Mon, 2003-03-24 at 09:55, Alexander Atanasov wrote:
> i can't reproduce the hang but it seems that drives without driver can get
> both in ata_unused and idedefault_driver.drives and lists go nuts.
> It kills ata_unused and uses idedefault_driver.drives only,
> boots fine here. I'd guess you have ide-cd as module, and the two drives
> handled by it couse the trouble - first joins the lists second couses the
> loop.

We need to know the difference between the two really so I would much rather
ensure we don't end up on both lists at once (which is a bug) than lose a
list


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24 13:59                     ` Alan Cox
@ 2003-03-24 16:01                       ` Alexander Atanasov
  2003-03-24 17:40                         ` Alan Cox
  0 siblings, 1 reply; 31+ messages in thread
From: Alexander Atanasov @ 2003-03-24 16:01 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux, linux-kernel, B.Zolnierkiewicz

	Hello,

On 24 Mar 2003 13:59:33 +0000
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> On Mon, 2003-03-24 at 09:55, Alexander Atanasov wrote:
> > i can't reproduce the hang but it seems that drives without driver
> > can get both in ata_unused and idedefault_driver.drives and lists go
> > nuts. It kills ata_unused and uses idedefault_driver.drives only,
> > boots fine here. I'd guess you have ide-cd as module, and the two
> > drives handled by it couse the trouble - first joins the lists
> > second couses the loop.
> 
> We need to know the difference between the two really so I would much
> rather ensure we don't end up on both lists at once (which is a bug)
> than lose a list
> 

	I don't understand, what's the difference and how the list is lost?
ata_unused used to hold all drives that were not claimed by any driver,
now idedefault_driver claims all that drives, all drives go in the .list
of its driver. ide_register_driver wants to take all unused drives and
attach them to the newly registered driver, so take all drives that use
idedefault_driver, and try, if they fail to find a driver they end up
again with the same driver and list (idedefault_driver). I think
idedefault_driver.list and ata_unused became dublicates, and the proper
place is to hold drives with no real driver is idedefault_driver, so the
patch. list from ata_unused becomes idedefault_driver.list, and does
exactly the same as ata_unused. I want to understand where i'm wrong, 
please?

The bug is there,  and waiting to explode, keeping both lists would mean to 
add one more  list head  in ide_drive_t,  is that the fix you want?

--
have fun,
alex
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24 17:40                         ` Alan Cox
@ 2003-03-24 17:24                           ` Alexander Atanasov
  2003-03-25  4:16                             ` Andre Hedrick
  2003-03-25 20:05                             ` Bartlomiej Zolnierkiewicz
  2003-04-03  7:00                           ` PATCH:ide_do_reset() fix for 2.5.66 rain.wang
  1 sibling, 2 replies; 31+ messages in thread
From: Alexander Atanasov @ 2003-03-24 17:24 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux, linux-kernel, B.Zolnierkiewicz

	Hello, Alan!

On 24 Mar 2003 17:40:08 +0000
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > 	I don't understand, what's the difference and how the list is
> > 	lost?
> > ata_unused used to hold all drives that were not claimed by any
> > driver, now idedefault_driver claims all that drives, all drives go
> > in the .list
> 
> ata_unused -> unattached device slots, new hotplug discoveries

	Ok. 

> idedefault_driver -> attached/known devices with no driver
> other list -> driven by that driver
> 
> > The bug is there,  and waiting to explode, keeping both lists would
> > mean to add one more  list head  in ide_drive_t,  is that the fix
> > you want?
> 
> I don't see where stuff is ending up on both lists yet. I've not had
> time to look hard at it though
> 

	It happens this way:
	ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
	
--
have fun,
alex

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24 16:01                       ` Alexander Atanasov
@ 2003-03-24 17:40                         ` Alan Cox
  2003-03-24 17:24                           ` Alexander Atanasov
  2003-04-03  7:00                           ` PATCH:ide_do_reset() fix for 2.5.66 rain.wang
  0 siblings, 2 replies; 31+ messages in thread
From: Alan Cox @ 2003-03-24 17:40 UTC (permalink / raw)
  To: Alexander Atanasov; +Cc: linux, Linux Kernel Mailing List, B.Zolnierkiewicz

On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> 	I don't understand, what's the difference and how the list is lost?
> ata_unused used to hold all drives that were not claimed by any driver,
> now idedefault_driver claims all that drives, all drives go in the .list

ata_unused -> unattached device slots, new hotplug discoveries
idedefault_driver -> attached/known devices with no driver
other list -> driven by that driver

> The bug is there,  and waiting to explode, keeping both lists would mean to 
> add one more  list head  in ide_drive_t,  is that the fix you want?

I don't see where stuff is ending up on both lists yet. I've not had time to look
hard at it though


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24 17:24                           ` Alexander Atanasov
@ 2003-03-25  4:16                             ` Andre Hedrick
  2003-03-25 13:59                               ` Alan Cox
  2003-03-25 20:05                             ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 31+ messages in thread
From: Andre Hedrick @ 2003-03-25  4:16 UTC (permalink / raw)
  To: Alexander Atanasov; +Cc: Alan Cox, linux, linux-kernel, B.Zolnierkiewicz


This is one thing all of you don't get about hotplug.

You are not allowed on PATA, only SATA.

The BIOS and setup on the HBA's need a kick to come alive.
There are basic hooks that do not permit post boot hotplug in PATA.

Cheers,

On Mon, 24 Mar 2003, Alexander Atanasov wrote:

> 	Hello, Alan!
> 
> On 24 Mar 2003 17:40:08 +0000
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> 
> > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > 	I don't understand, what's the difference and how the list is
> > > 	lost?
> > > ata_unused used to hold all drives that were not claimed by any
> > > driver, now idedefault_driver claims all that drives, all drives go
> > > in the .list
> > 
> > ata_unused -> unattached device slots, new hotplug discoveries
> 
> 	Ok. 
> 
> > idedefault_driver -> attached/known devices with no driver
> > other list -> driven by that driver
> > 
> > > The bug is there,  and waiting to explode, keeping both lists would
> > > mean to add one more  list head  in ide_drive_t,  is that the fix
> > > you want?
> > 
> > I don't see where stuff is ending up on both lists yet. I've not had
> > time to look hard at it though
> > 
> 
> 	It happens this way:
> 	ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
> 	
> --
> have fun,
> alex
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Andre Hedrick
LAD Storage Consulting Group


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-25  4:16                             ` Andre Hedrick
@ 2003-03-25 13:59                               ` Alan Cox
  0 siblings, 0 replies; 31+ messages in thread
From: Alan Cox @ 2003-03-25 13:59 UTC (permalink / raw)
  To: Andre Hedrick
  Cc: Alexander Atanasov, linux, Linux Kernel Mailing List, B.Zolnierkiewicz

On Tue, 2003-03-25 at 04:16, Andre Hedrick wrote:
> This is one thing all of you don't get about hotplug.
> 
> You are not allowed on PATA, only SATA.
> 
> The BIOS and setup on the HBA's need a kick to come alive.
> There are basic hooks that do not permit post boot hotplug in PATA.

Several vendors support bus tristate handling. We now do error
handling on that. Its a first step towards being able to rescan
the bus.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-24 17:24                           ` Alexander Atanasov
  2003-03-25  4:16                             ` Andre Hedrick
@ 2003-03-25 20:05                             ` Bartlomiej Zolnierkiewicz
  2003-03-25 20:24                               ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 31+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-03-25 20:05 UTC (permalink / raw)
  To: Alexander Atanasov; +Cc: Alan Cox, linux, linux-kernel


On Mon, 24 Mar 2003, Alexander Atanasov wrote:
> 	Hello, Alan!
>
> On 24 Mar 2003 17:40:08 +0000
> Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
>
> > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > 	I don't understand, what's the difference and how the list is
> > > 	lost?
> > > ata_unused used to hold all drives that were not claimed by any
> > > driver, now idedefault_driver claims all that drives, all drives go
> > > in the .list
> >
> > ata_unused -> unattached device slots, new hotplug discoveries
>
> 	Ok.
>
> > idedefault_driver -> attached/known devices with no driver
> > other list -> driven by that driver
> >
> > > The bug is there,  and waiting to explode, keeping both lists would
> > > mean to add one more  list head  in ide_drive_t,  is that the fix
> > > you want?
> >
> > I don't see where stuff is ending up on both lists yet. I've not had
> > time to look hard at it though
> >
>
> 	It happens this way:
> 	ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> return to ata_attach -> list_add_tail(&drive->list, &ata_unused);

Exactly.

Alan, if we want to keep ata_unused, we should remove
list_add_tail(%drive->list, &ata_unused) from ata_attach()
and perhaps use (after fixing it to handle idedefault_driver)
ide_replace_subdriver() for driver switching for drives owned
by idedefault_driver.

BTW in ide_register_driver() we don't use ide_drives lock to protect
    drive->list changes, why?

--
bzolnier

> --
> have fun,
> alex
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4]
  2003-03-25 20:05                             ` Bartlomiej Zolnierkiewicz
@ 2003-03-25 20:24                               ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 31+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-03-25 20:24 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Alexander Atanasov, Alan Cox, linux, linux-kernel


On Tue, 25 Mar 2003, Bartlomiej Zolnierkiewicz wrote:

>
> On Mon, 24 Mar 2003, Alexander Atanasov wrote:
> > 	Hello, Alan!
> >
> > On 24 Mar 2003 17:40:08 +0000
> > Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:
> >
> > > On Mon, 2003-03-24 at 16:01, Alexander Atanasov wrote:
> > > > 	I don't understand, what's the difference and how the list is
> > > > 	lost?
> > > > ata_unused used to hold all drives that were not claimed by any
> > > > driver, now idedefault_driver claims all that drives, all drives go
> > > > in the .list
> > >
> > > ata_unused -> unattached device slots, new hotplug discoveries
> >
> > 	Ok.
> >
> > > idedefault_driver -> attached/known devices with no driver
> > > other list -> driven by that driver
> > >
> > > > The bug is there,  and waiting to explode, keeping both lists would
> > > > mean to add one more  list head  in ide_drive_t,  is that the fix
> > > > you want?
> > >
> > > I don't see where stuff is ending up on both lists yet. I've not had
> > > time to look hard at it though
> > >
> >
> > 	It happens this way:
> > 	ide_register_driver -> ata_attach -> idedefault_driver.attach -> ide_register_subdriver -> list_add(&driver->list, &driver->drives) ->
> > return to ata_attach -> list_add_tail(&drive->list, &ata_unused);
>
> Exactly.
>
> Alan, if we want to keep ata_unused, we should remove
> list_add_tail(%drive->list, &ata_unused) from ata_attach()
> and perhaps use (after fixing it to handle idedefault_driver)
> ide_replace_subdriver() for driver switching for drives owned
> by idedefault_driver.
>
> BTW in ide_register_driver() we don't use ide_drives lock to protect
>     drive->list changes, why?

drives_lock lock of course, writing from memory :-)

>
> --
> bzolnier
>
> > --
> > have fun,
> > alex


^ permalink raw reply	[flat|nested] 31+ messages in thread

* PATCH:ide_do_reset() fix for 2.5.66
  2003-03-24 17:40                         ` Alan Cox
  2003-03-24 17:24                           ` Alexander Atanasov
@ 2003-04-03  7:00                           ` rain.wang
  2003-04-03  7:16                             ` Jens Axboe
  1 sibling, 1 reply; 31+ messages in thread
From: rain.wang @ 2003-04-03  7:00 UTC (permalink / raw)
  To: Alan Cox, Linux Kernel Mailing List; +Cc: Andre Hedrick

[-- Attachment #1: Type: text/plain, Size: 141 bytes --]

Hi Alan,
    I found just changing ide_do_reset() to wait till completion can
handle the handler race. can this be enough?

regards
rain.w



[-- Attachment #2: ide-iops.c.diff --]
[-- Type: text/plain, Size: 1071 bytes --]

--- /usr/src/linux/drivers/ide/ide-iops.c	Thu Apr  3 14:13:51 2003
+++ ide-iops.c	Thu Apr  3 14:29:47 2003
@@ -1107,6 +1107,10 @@
 	}
 	/* done polling */
 	hwgroup->poll_timeout = 0;
+	
+	/* tell ide_do_reset it complete */
+	complete((struct completion *)hwif->hwif_data);
+
 	return ide_stopped;
 }
 
@@ -1171,6 +1175,10 @@
 		}
 	}
 	hwgroup->poll_timeout = 0;	/* done polling */
+
+	/* tell ide_do_reset it complete */
+	complete((struct completion *)hwif->hwif_data);
+
 	return ide_stopped;
 }
 
@@ -1307,7 +1315,27 @@
 
 ide_startstop_t ide_do_reset (ide_drive_t *drive)
 {
-	return do_reset1(drive, 0);
+	/* 
+	 * Waiting for completion needed.
+	 */
+	unsigned long flags;
+	ide_hwif_t *hwif;
+	void *old_data;
+	DECLARE_COMPLETION(wait);
+	
+	spin_lock_irqsave(&ide_lock, flags);
+	hwif = HWIF(drive);
+	
+	old_data = hwif->hwif_data;
+	hwif->hwif_data = &wait;
+
+	(void) do_reset1(drive, 0);
+	
+	wait_for_completion(&wait);
+
+	hwif->hwif_data = old_data;
+	spin_unlock_irqrestore(&ide_lock, flags);
+	return ide_stopped;
 }
 
 EXPORT_SYMBOL(ide_do_reset);

[-- Attachment #3: ide.c.diff --]
[-- Type: text/plain, Size: 357 bytes --]

--- /usr/src/linux/drivers/ide/ide.c	Tue Apr  1 17:26:45 2003
+++ ide.c	Thu Apr  3 14:31:38 2003
@@ -1586,8 +1586,6 @@
 			spin_lock_irqsave(&ide_lock, flags);
 			
 			DRIVER(drive)->abort(drive, "drive reset");
-			if(HWGROUP(drive)->handler)
-				BUG();
 				
 			/* Ensure nothing gets queued after we
 			   drop the lock. Reset will clear the busy */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: PATCH:ide_do_reset() fix for 2.5.66
  2003-04-03  7:00                           ` PATCH:ide_do_reset() fix for 2.5.66 rain.wang
@ 2003-04-03  7:16                             ` Jens Axboe
  2003-04-03  8:36                               ` PATCH RFC :ide_do_reset() " rain.wang
  0 siblings, 1 reply; 31+ messages in thread
From: Jens Axboe @ 2003-04-03  7:16 UTC (permalink / raw)
  To: rain.wang; +Cc: Alan Cox, Linux Kernel Mailing List, Andre Hedrick

On Thu, Apr 03 2003, rain.wang wrote:
> Hi Alan,
>     I found just changing ide_do_reset() to wait till completion can
> handle the handler race. can this be enough?

This is buggy for a number of reasons. Firstly, how do you make sure
that someone else doesn't race with your hwif_data manipulation? This
looks very suspect. By far the worst problem is that you are assuming
that ide_do_reset() can sleep, when in fact it cannot (just follow the
various paths into ide_do_request()). You even grab the ide_lock _and_
disable interrupts yourself prior calling wait_for_completion(), this is
incredibly broken.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 31+ messages in thread

* PATCH RFC :ide_do_reset() fix for 2.5.66
  2003-04-03  7:16                             ` Jens Axboe
@ 2003-04-03  8:36                               ` rain.wang
  2003-04-10  3:37                                 ` [rfc][patch]: fix handler race in HDIO_DRIVE_RESET path for 2.5.67-ac1 rain.wang
  0 siblings, 1 reply; 31+ messages in thread
From: rain.wang @ 2003-04-03  8:36 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Alan Cox, Linux Kernel Mailing List, Andre Hedrick

[-- Attachment #1: Type: text/plain, Size: 865 bytes --]

Jens Axboe wrote:

> On Thu, Apr 03 2003, rain.wang wrote:
> > Hi Alan,
> >     I found just changing ide_do_reset() to wait till completion can
> > handle the handler race. can this be enough?
>
> This is buggy for a number of reasons. Firstly, how do you make sure
> that someone else doesn't race with your hwif_data manipulation? This
> looks very suspect. By far the worst problem is that you are assuming
> that ide_do_reset() can sleep, when in fact it cannot (just follow the
> various paths into ide_do_request()). You even grab the ide_lock _and_
> disable interrupts yourself prior calling wait_for_completion(), this is
> incredibly broken.
>
> --
> Jens Axboe

Hi,
    Thank you, I'm too young. I should have put this in RFC.
please help me to replace using 'hwif-data', I mainly mean
to let drive reset wait for its clean up complete.

regards
rain.w

[-- Attachment #2: ide-iops.c.diff --]
[-- Type: text/plain, Size: 968 bytes --]

--- /usr/src/linux/drivers/ide/ide-iops.c	Thu Apr  3 14:13:51 2003
+++ ide-iops.c	Thu Apr  3 16:24:31 2003
@@ -1107,6 +1107,10 @@
 	}
 	/* done polling */
 	hwgroup->poll_timeout = 0;
+	
+	/* tell ide_do_reset it complete */
+	complete((struct completion *)hwif->hwif_data);
+
 	return ide_stopped;
 }
 
@@ -1171,6 +1175,10 @@
 		}
 	}
 	hwgroup->poll_timeout = 0;	/* done polling */
+
+	/* tell ide_do_reset it complete */
+	complete((struct completion *)hwif->hwif_data);
+
 	return ide_stopped;
 }
 
@@ -1307,7 +1315,25 @@
 
 ide_startstop_t ide_do_reset (ide_drive_t *drive)
 {
-	return do_reset1(drive, 0);
+	/* 
+	 * Waiting for completion needed.
+	 */
+	ide_hwif_t *hwif;
+	void *old_data;
+	DECLARE_COMPLETION(wait);
+	
+	hwif = HWIF(drive);
+	
+	old_data = hwif->hwif_data;
+	hwif->hwif_data = &wait;
+
+	(void) do_reset1(drive, 0);
+	
+	wait_for_completion(&wait);
+
+	hwif->hwif_data = old_data;
+	
+	return ide_stopped;
 }
 
 EXPORT_SYMBOL(ide_do_reset);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [rfc][patch]: fix handler race in HDIO_DRIVE_RESET path for 2.5.67-ac1
  2003-04-03  8:36                               ` PATCH RFC :ide_do_reset() " rain.wang
@ 2003-04-10  3:37                                 ` rain.wang
  0 siblings, 0 replies; 31+ messages in thread
From: rain.wang @ 2003-04-10  3:37 UTC (permalink / raw)
  To: Jens Axboe, Alan Cox, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 170 bytes --]

Hi,
    I found there's another 50 msec wait needed after the
first reset poll handler return to avoid the handler race.
but I can't find out reason why.

regards
rain.w

[-- Attachment #2: ide.c.diff.2 --]
[-- Type: text/plain, Size: 354 bytes --]

--- /usr/src/linux-2.5.67-ac1/drivers/ide/ide.c	Wed Apr  9 11:31:40 2003
+++ ide.c	Wed Apr  9 13:31:18 2003
@@ -1608,6 +1608,10 @@
 			HWGROUP(drive)->busy = 1;
 			spin_unlock_irqrestore(&ide_lock, flags);
 			(void) ide_do_reset(drive);
+
+			/* wait for another 50ms */
+			mdelay(50);
+
 			if (drive->suspend_reset) {
 /*
  *				APM WAKE UP todo !!

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2003-04-10  3:25 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-22 14:03 2.5.65-ac2 -- hda/ide trouble on ICH4 Dominik Brodowski
2003-03-22 16:35 ` Alan Cox
2003-03-22 16:25   ` Dominik Brodowski
2003-03-22 17:42     ` Alan Cox
2003-03-22 16:39       ` Jan Dittmer
2003-03-23  1:03       ` Dominik Brodowski
2003-03-23 15:47         ` Alan Cox
2003-03-23 14:59           ` Dominik Brodowski
2003-03-23 18:41             ` Alan Cox
2003-03-23 18:15               ` Dominik Brodowski
2003-03-23 18:25                 ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Dominik Brodowski
2003-03-23 22:16                   ` Jan Dittmer
2003-03-24 11:08                     ` PROBLEM: linux-2.5.65-ac3 does not boot whith IDE-drivers Norbert Wolff
2003-03-24 13:54                       ` Alan Cox
2003-03-24  9:55                   ` ide: indeed, using list_for_each_entry_safe removes endless looping / hang [Was: Re: 2.5.65-ac2 -- hda/ide trouble on ICH4] Alexander Atanasov
2003-03-24 13:59                     ` Alan Cox
2003-03-24 16:01                       ` Alexander Atanasov
2003-03-24 17:40                         ` Alan Cox
2003-03-24 17:24                           ` Alexander Atanasov
2003-03-25  4:16                             ` Andre Hedrick
2003-03-25 13:59                               ` Alan Cox
2003-03-25 20:05                             ` Bartlomiej Zolnierkiewicz
2003-03-25 20:24                               ` Bartlomiej Zolnierkiewicz
2003-04-03  7:00                           ` PATCH:ide_do_reset() fix for 2.5.66 rain.wang
2003-04-03  7:16                             ` Jens Axboe
2003-04-03  8:36                               ` PATCH RFC :ide_do_reset() " rain.wang
2003-04-10  3:37                                 ` [rfc][patch]: fix handler race in HDIO_DRIVE_RESET path for 2.5.67-ac1 rain.wang
2003-03-22 22:03     ` [PATCH] Re: 2.5.65-ac2 -- hda/ide trouble on ICH4 Bartlomiej Zolnierkiewicz
2003-03-22 23:27       ` Alan Cox
2003-03-22 22:33         ` Bartlomiej Zolnierkiewicz
2003-03-23  9:11       ` Dominik Brodowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).