linux-kernel.vger.kernel.org archive mirror
* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-07-30 18:19 Petr Vandrovec
  2002-07-31 19:48 ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-07-30 18:19 UTC (permalink / raw)
  To: dalecki; +Cc: linux-kernel

I wrote:
> On 30 Jul 02 at 16:25, Marcin Dalecki wrote:
> > > Second problem is that read operation which ends with
> > > "drive ready, seek complete, data request" (why it happened in first
> > > place?) will just read one sector from drive (it was DMA transfer,
> > > so drive->mult_count == 0), and then it returns from ata_error
> > > with ATA_OP_CONTINUES. But what continues? Drive told us that
> > > current operation is done, and no new operation was started, so
> > > there is very low chance that some IRQ will ever come, and timer was
> > > just removed by ata_irq_request(), so channel will never awake.
> > 
> > What should continue is the retry of the operation, since otherwise
> > it will be abandoned in do_ide_request(). However I will recheck.
> 
> It is UP machine (with SMP non-preemptible kernel). Stack trace does not 
> look like that it was caused by some race.

There is something severely broken... I reenabled the
"ide: unexpected interrupt" message in ata_irq_request, and to my surprise
we get one spurious interrupt for each request we do, on both
channels - primary and secondary.

The log looked like this:

udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
ide: unexpected interrupt 1 15 handler=00000000
callstack: ata_irq_request + 7e/234, handle_IRQ_event + 29/4c,
           do_IRQ + df/190, common_interrupt + 18/20, do_softirq + 50/ac,
           do_IRQ + 179/190, common_interrupt + 18/20
udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
ide: unexpected interrupt 1 15 handler=00000000
callstack: same as above
udma_pci_init: sending read command to drive
ata_irq_request: IRQ arrived, for us, calling handler
ata_irq_request: handler returned 0
udma_pci_init: sending read command to drive
ata_irq_request: command immediately queued by do_ide_request
ata_irq_request: IRQ arrived, for us, calling handler
oops: ide_dma_intr: udmastatus=00, diskstatus=58

So we are getting one spurious interrupt for each UDMA request.
As long as we do not issue a new command to the drive immediately, the IRQ
is silently ignored, and everybody is happy (?). But when we
queue a command immediately by calling do_ide_request from
ata_irq_request, sooner or later the spurious interrupt will
arrive with the wrong timing, and we'll think that the command is
done while it is still in progress.
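
For reference, the two status values in the last log line can be decoded
bit by bit. This is only a minimal sketch: the bit names follow the usual
ATA status register and bus-master IDE status register layouts, and the
helper name decode() is hypothetical, not something from the driver.

```python
# ATA status register bits (standard task-file layout).
ATA_STATUS_BITS = {
    0x80: "BSY",
    0x40: "DRDY",
    0x20: "DF",
    0x10: "DSC",
    0x08: "DRQ",
    0x04: "CORR",
    0x02: "IDX",
    0x01: "ERR",
}

# Bus-master IDE status register bits (low three bits only).
BM_STATUS_BITS = {
    0x04: "INTR",
    0x02: "ERROR",
    0x01: "ACTIVE",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in sorted(table.items(), reverse=True)
            if value & mask]


print(decode(0x58, ATA_STATUS_BITS))  # ['DRDY', 'DSC', 'DRQ']
print(decode(0x00, BM_STATUS_BITS))   # []
```

So diskstatus=58 is "drive ready, seek complete, data request", and
udmastatus=00 has the bus-master interrupt bit clear, which would fit an
interrupt handled before the DMA engine reported completion.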

I see the same spurious interrupt problem on the primary channel too,
but somehow the timing is different with UDMA100, and we always find the
command done instead of in progress when the spurious interrupt happens.

Unfortunately ATA/ATAPIv7 says that a single interrupt is triggered
after the command is done and all data transferred, and we do not play
with the select bit. But we do play with the nIEN bit of the disk. Do you see
any reason why this should cause a spurious interrupt? (The system is using
XT-PIC, FYI.)
                                        Thanks,
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz
                                            

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-07-30 18:19 IDE from current bk tree, UDMA and two channels Petr Vandrovec
@ 2002-07-31 19:48 ` Marcin Dalecki
  0 siblings, 0 replies; 29+ messages in thread
From: Marcin Dalecki @ 2002-07-31 19:48 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: linux-kernel

Petr Vandrovec wrote:
> I wrote:
> 
>>On 30 Jul 02 at 16:25, Marcin Dalecki wrote:
>>
>>>>Second problem is that read operation which ends with
>>>>"drive ready, seek complete, data request" (why it happened in first
>>>>place?) will just read one sector from drive (it was DMA transfer,
>>>>so drive->mult_count == 0), and then it returns from ata_error
>>>>with ATA_OP_CONTINUES. But what continues? Drive told us that
>>>>current operation is done, and no new operation was started, so
>>>>there is very low chance that some IRQ will ever come, and timer was
>>>>just removed by ata_irq_request(), so channel will never awake.
>>>
>>>What should continue is the retry of the operation, since otherwise
>>>it will be abandoned in do_ide_request(). However I will recheck.
>>
>>It is UP machine (with SMP non-preemptible kernel). Stack trace does not 
>>look like that it was caused by some race.
> 
> 
> There is something severely broken... I reenabled
> ide: unexpected interrupt in ata_irq_request and to my surprise here
> we get one spurious interrupt for each request we do, on both
> channels - primary and secondary.
> 
> It looked:
> 
> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> ide: unexpected interrupt 1 15 handler=00000000
> callstack: ata_irq_request + 7e/234, handle_IRQ_event + 29/4c,
>            do_IRQ + df/190, common_interrupt + 18/20, do_softirq + 50/ac,
>            do_IRQ + 179/190, common_interrupt + 18/20
> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> ide: unexpected interrupt 1 15 handler=00000000
> callstack: same as above
> udma_pci_init: sending read command to drive
> ata_irq_request: IRQ arrived, for us, calling handler
> ata_irq_request: handler returned 0
> udma_pci_init: sending read command to drive
> ata_irq_request: command immediately queued by do_ide_request
> ata_irq_request: IRQ arrived, for us, calling handler
> oops: ide_dma_intr: udmastatus=00, diskstatus=58
> 
> So we are getting one spurious interrupt for each UDMA request.
> Until we do not issue new command to the drive immediately, IRQ
> is silently ignored, and everybody is happy (?). But when we
> queue command immediately by call to do_ide_request in
> ata_irq_request, sooner or later spurious interrupt will
> arrive with wrong timing, and we'll think that command is
> done while it is still in progress.
> 
> I see same spurious interrupt problem on primary channel too,
> but somehow timing is different with UDMA100, and we always find
> command done instead of in progress when spurious interrupt happens.
> 
> Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
> after command is done and all data transferred, and we do not play
> with select bit. But we play with nIEN bit of disk. Do you see
> any reason why this should cause spurious interrupt? (system is using
> XT-PIC, FYI)

What I am actually trying to do is keep the nIEN bit set whenever
we are not doing any transfer to the disk in question,
precisely to prevent the disk from spewing IRQs at times
when it should not. And yes, this bit acts in an inverted manner
(setting it disables the interrupt), but I'm sure you already know this.
You could of course try making the ata_irq_enable()
function a no-op and see whether this changes anything.
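
The inverted sense of nIEN can be sketched as follows. This is an
illustration only: the bit positions are the usual ATA Device Control
register ones (nIEN is bit 1, SRST is bit 2), while the function name
device_control() is hypothetical and other bits of the register are
ignored for simplicity.

```python
# ATA Device Control register bits.
ATA_NIEN = 0x02  # "not Interrupt ENable": 1 masks INTRQ, 0 lets it through
ATA_SRST = 0x04  # software reset


def device_control(irq_enabled, soft_reset=False):
    """Build a Device Control value; note nIEN is set to DISABLE interrupts."""
    ctl = 0
    if not irq_enabled:
        ctl |= ATA_NIEN
    if soft_reset:
        ctl |= ATA_SRST
    return ctl


print(hex(device_control(irq_enabled=True)))   # 0x0
print(hex(device_control(irq_enabled=False)))  # 0x2
```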

(Me: Scratching my head with a puzzled expression on the face...;-)



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 23:13 Petr Vandrovec
@ 2002-08-02 13:07 ` Alan Cox
  0 siblings, 0 replies; 29+ messages in thread
From: Alan Cox @ 2002-08-02 13:07 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: Alexander Viro, martin, linux-kernel, mingo

On Fri, 2002-08-02 at 00:13, Petr Vandrovec wrote:
> Last half-KB is useless, as filesystem on it is ext2 with 4KB blocks... 
> Only problem is that previously stable system was now dying in e2fsck. I'll 
> try to invent some solution before 2.6 ;-) 

Guess where EFI puts partition tables 8(


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:52 ` Alexander Viro
@ 2002-08-02  9:11   ` Marcin Dalecki
  0 siblings, 0 replies; 29+ messages in thread
From: Marcin Dalecki @ 2002-08-02  9:11 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Petr Vandrovec, martin, linux-kernel, mingo

Alexander Viro wrote:
> 
> On Fri, 2 Aug 2002, Petr Vandrovec wrote:
> 
> 
>>>Uh-oh...
>>>
>>>Let me see if I got it straight:
>>>
>>>a) your disk doesn't work with half-Kb requests
>>>b) you have a partition with odd number of sectors
>>>c) hardsect_size is set to half-Kb
>>>d) old code worked since it rounded size to multiple of kilobyte.
>>>
>>>Correct?
>>
>>Yes, exactly. Replacing disk is not an option...
> 
> 
> OK.  At the very least we need a way for driver to tell what the sector
> size is.  And that can be a problem - AFAICS IDE shares the queue for
> master and slave and sector size is queue property.

Wrong. It shares the queue lock, not the queue itself.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:45 ` Alexander Viro
@ 2002-08-02  9:10   ` Marcin Dalecki
  0 siblings, 0 replies; 29+ messages in thread
From: Marcin Dalecki @ 2002-08-02  9:10 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Linus Torvalds, Martin Dalecki, linux-kernel

Alexander Viro wrote:
> 
> On Thu, 1 Aug 2002, Linus Torvalds wrote:
> 
> 
>>You probably saw this. Looks like blocksize has been buggered somehow.
>>Apparently Petr has a 1kB blocksize optical device..
> 
> 
> Yeah - with partition boundaries set not on a physical sector boundary ;-/
> 
> He's actually lucky that beginning of partition was not in the middle of
> a physical sector...
> 
> Looks like we need
> 	a) accurate hardsect_size for these beasts (which is a problem
> with current setup, since it's per-queue and not per-device; master and
> slave can have different hardsect sizes).

FYI: In the ATA driver area all queues *are* explicitly per device.

> 	b) extra check in check_partitions() that would verify that
> partition doesn't end in the middle of a sector (and round it down
> if it does).
> 
> Basically, old code worked by accident on that setup - Petr had half-Kb
> in the end of partition unaccessible and do_open() didn't notice that.
> Now it does and tries to give such access.  Disk is not happy...



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:53 Petr Vandrovec
  2002-08-01 23:02 ` Alexander Viro
@ 2002-08-01 23:54 ` Linus Torvalds
  1 sibling, 0 replies; 29+ messages in thread
From: Linus Torvalds @ 2002-08-01 23:54 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: Alexander Viro, Martin Dalecki, linux-kernel


On Fri, 2 Aug 2002, Petr Vandrovec wrote:
> 
> Just to correct you: it is normal magnetic disk with 512 byte sectors,
> from notebook. It works with 512B UDMA requests if we talk to the drive
> slowly, with pauses here and there. If we talk to it back-to-back, it
> dies.

Ugh.

You apparently use udma2 - can you try forcing it to udma0/1 or the other
DMA modes? It may just be that the drive simply cannot take udma2
reliably.

		Linus


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 23:13 Petr Vandrovec
  2002-08-02 13:07 ` Alan Cox
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 23:13 UTC (permalink / raw)
  To: Alexander Viro; +Cc: martin, linux-kernel, mingo

On  1 Aug 02 at 19:05, Alexander Viro wrote:
> 
> On Fri, 2 Aug 2002, Petr Vandrovec wrote:
>  
> > Normal DOS partition, with 512 byte block size, as this is 512B block
> > device, at least I believed to it until now. As start=63, it apparently
> > also handles 1024B requests on odd address (I believe that sfdisk -d dumps
> > start 0-based).
> > 
> > # partition table of /dev/hdc
> > unit: sectors
> > 
> > /dev/hdc1 : start=       63, size=12685617, Id=83, bootable
> 
> Blacklist time.  That, or decrementing size to 12685616, depending on whether
> you want that last half-Kb or not.

The last half-KB is useless, as the filesystem on it is ext2 with 4KB blocks...
The only problem is that a previously stable system was now dying in e2fsck. I'll
try to invent some solution before 2.6 ;-)

Maybe a fix to e2fsck would be sufficient. I always thought that it reads the disk
in blocksize (4KB) chunks, so the disk would never see a 512B request. But
apparently either it reads the partition in 512B chunks, or the block layer does not
do merging correctly.
                                                Petr Vandrovec
                                                vandrove@vc.cvut.cz
                                                

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 23:00 Petr Vandrovec
@ 2002-08-01 23:05 ` Alexander Viro
  0 siblings, 0 replies; 29+ messages in thread
From: Alexander Viro @ 2002-08-01 23:05 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: martin, linux-kernel, mingo



On Fri, 2 Aug 2002, Petr Vandrovec wrote:
 
> Normal DOS partition, with 512 byte block size, as this is 512B block
> device, at least I believed to it until now. As start=63, it apparently
> also handles 1024B requests on odd address (I believe that sfdisk -d dumps
> start 0-based).
> 
> # partition table of /dev/hdc
> unit: sectors
> 
> /dev/hdc1 : start=       63, size=12685617, Id=83, bootable

Blacklist time.  That, or decrementing size to 12685616, depending on whether
you want that last half-Kb or not.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:53 Petr Vandrovec
@ 2002-08-01 23:02 ` Alexander Viro
  2002-08-01 23:54 ` Linus Torvalds
  1 sibling, 0 replies; 29+ messages in thread
From: Alexander Viro @ 2002-08-01 23:02 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: Martin Dalecki, linux-kernel, torvalds



On Fri, 2 Aug 2002, Petr Vandrovec wrote:

> On  1 Aug 02 at 18:45, Alexander Viro wrote:
> > 
> > On Thu, 1 Aug 2002, Linus Torvalds wrote:
> > 
> > > You probably saw this. Looks like blocksize has been buggered somehow.
> > > Apparently Petr has a 1kB blocksize optical device..
> 
> Just to correct you: it is normal magnetic disk with 512 byte sectors,
> from notebook. It works with 512B UDMA requests if we talk to the drive
> slowly, with pauses here and there. If we talk to it back-to-back, it
> dies. Apparently it forgets that it is doing UDMA transfers and tries
> to do normal PIO or MDMA or what - host terminates transfer in the middle,
> and disk is signaling that it has more data to go.

_Ouch_.  Then I have to agree with Martin - it's a blacklist time.  There's
not much partition code could do with that - you really have a partition
with a chunk that _can't_ be handled by 1Kb request.

Old code (pretty much by accident) hid it from you, so I'd suggest just
decrementing partition size - it's not that you had anything in that last
half-Kb.  At least nothing that could be accessed by old kernels.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 23:00 Petr Vandrovec
  2002-08-01 23:05 ` Alexander Viro
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 23:00 UTC (permalink / raw)
  To: Alexander Viro; +Cc: martin, linux-kernel, mingo

On  1 Aug 02 at 18:52, Alexander Viro wrote:
> On Fri, 2 Aug 2002, Petr Vandrovec wrote:
> 
> > > Uh-oh...
> > > 
> > > Let me see if I got it straight:
> > > 
> > > a) your disk doesn't work with half-Kb requests
> > > b) you have a partition with odd number of sectors
> > > c) hardsect_size is set to half-Kb
> > > d) old code worked since it rounded size to multiple of kilobyte.
> > > 
> > > Correct?
> > 
> > Yes, exactly. Replacing disk is not an option...
> 
> OK.  At the very least we need a way for driver to tell what the sector
> size is.  And that can be a problem - AFAICS IDE shares the queue for
> master and slave and sector size is queue property.
> 
> BTW, what type of partition table do you have there?

Normal DOS partition table, with 512 byte block size, as this is a 512B block
device, or at least I believed so until now. As start=63, it apparently
also handles 1024B requests at odd addresses (I believe that sfdisk -d dumps
start 0-based).

# partition table of /dev/hdc
unit: sectors

/dev/hdc1 : start=       63, size=12685617, Id=83, bootable
/dev/hdc2 : start=        0, size=       0, Id= 0
/dev/hdc3 : start=        0, size=       0, Id= 0
/dev/hdc4 : start=        0, size=       0, Id= 0

                                                      Petr Vandrovec
                                                      
                                                            

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 22:53 Petr Vandrovec
  2002-08-01 23:02 ` Alexander Viro
  2002-08-01 23:54 ` Linus Torvalds
  0 siblings, 2 replies; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 22:53 UTC (permalink / raw)
  To: Alexander Viro; +Cc: Martin Dalecki, linux-kernel, torvalds

On  1 Aug 02 at 18:45, Alexander Viro wrote:
> 
> On Thu, 1 Aug 2002, Linus Torvalds wrote:
> 
> > You probably saw this. Looks like blocksize has been buggered somehow.
> > Apparently Petr has a 1kB blocksize optical device..

Just to correct you: it is a normal magnetic disk with 512 byte sectors,
from a notebook. It works with 512B UDMA requests if we talk to the drive
slowly, with pauses here and there. If we talk to it back-to-back, it
dies. Apparently it forgets that it is doing UDMA transfers and tries
to do normal PIO or MDMA or something: the host terminates the transfer in
the middle, and the disk signals that it has more data to go.
 
> Looks like we need
>     a) accurate hardsect_size for these beasts (which is a problem
> with current setup, since it's per-queue and not per-device; master and
> slave can have different hardsect sizes).
>     b) extra check in check_partitions() that would verify that
> partition doesn't end in the middle of a sector (and round it down
> if it does).

It will not help. Device is reporting 512B sectors, and it even supports
them in PIO.
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz
                                            

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:42 Petr Vandrovec
@ 2002-08-01 22:52 ` Alexander Viro
  2002-08-02  9:11   ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Alexander Viro @ 2002-08-01 22:52 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: martin, linux-kernel, mingo



On Fri, 2 Aug 2002, Petr Vandrovec wrote:

> > Uh-oh...
> > 
> > Let me see if I got it straight:
> > 
> > a) your disk doesn't work with half-Kb requests
> > b) you have a partition with odd number of sectors
> > c) hardsect_size is set to half-Kb
> > d) old code worked since it rounded size to multiple of kilobyte.
> > 
> > Correct?
> 
> Yes, exactly. Replacing disk is not an option...

OK.  At the very least we need a way for driver to tell what the sector
size is.  And that can be a problem - AFAICS IDE shares the queue for
master and slave and sector size is queue property.

BTW, what type of partition table do you have there?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
       [not found] <200208012219.g71MJV109133@penguin.transmeta.com>
@ 2002-08-01 22:45 ` Alexander Viro
  2002-08-02  9:10   ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Alexander Viro @ 2002-08-01 22:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Dalecki, linux-kernel



On Thu, 1 Aug 2002, Linus Torvalds wrote:

> You probably saw this. Looks like blocksize has been buggered somehow.
> Apparently Petr has a 1kB blocksize optical device..

Yeah - with partition boundaries set not on a physical sector boundary ;-/

He's actually lucky that beginning of partition was not in the middle of
a physical sector...

Looks like we need
	a) accurate hardsect_size for these beasts (which is a problem
with current setup, since it's per-queue and not per-device; master and
slave can have different hardsect sizes).
	b) extra check in check_partitions() that would verify that
partition doesn't end in the middle of a sector (and round it down
if it does).
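
The check in b) could look roughly like the sketch below. It is only an
illustration of the rounding, not the actual check_partitions() code: the
function name round_partition_size() is hypothetical, sizes are counted in
512-byte units, and the partition start is assumed to be aligned already.

```python
SECTOR = 512  # partition tables count in 512-byte units


def round_partition_size(size_sectors, hardsect_size):
    """Round a partition size down to a whole number of hardware sectors.

    size_sectors is in 512-byte units; hardsect_size is in bytes and is
    assumed to be a multiple of 512.
    """
    per_hardsect = hardsect_size // SECTOR
    return size_sectors - (size_sectors % per_hardsect)


# A partition with an odd number of 512-byte sectors on a device with
# 1 KB hardware sectors loses its last half-Kb:
print(round_partition_size(12685617, 1024))  # 12685616
# With true 512-byte hardware sectors nothing changes:
print(round_partition_size(12685617, 512))   # 12685617
```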

Basically, the old code worked by accident on that setup - Petr had the half-Kb
at the end of the partition inaccessible and do_open() didn't notice that.
Now it does and tries to give such access.  Disk is not happy...


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 22:42 Petr Vandrovec
  2002-08-01 22:52 ` Alexander Viro
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 22:42 UTC (permalink / raw)
  To: Alexander Viro; +Cc: martin, linux-kernel, mingo

On  1 Aug 02 at 18:39, Alexander Viro wrote:
> On Fri, 2 Aug 2002, Marcin Dalecki wrote:
> 
> > > I'd like to apologize to Ingo, his changes were completely innocent.
> > > Problem was triggered by Al's 'block device size cleanups' (currently
> > > cset 1.403.160.5 on bkbits).
> > > 
> > > Before this change, my system was using 4KB block size when reading
> > > from /dev/hdc1, because of blk_size[][] (which is in 1kB units) of this 
> > > partition was multiple of 2, and so i_size % 4096 was 0.  But after
> > > Al's change partition size is read from gendisk, and not from blk_size,
> > > and gendisk partition size is in 512 bytes units: and, as you can
> > > probably guess, now my partition had i_size % 4096 == 512, and so only
> > > 512 byte block size was chosen. And with 512 bytes block size my
> > > harddisk refuses to cooperate.
> 
> Uh-oh...
> 
> Let me see if I got it straight:
> 
> a) your disk doesn't work with half-Kb requests
> b) you have a partition with odd number of sectors
> c) hardsect_size is set to half-Kb
> d) old code worked since it rounded size to multiple of kilobyte.
> 
> Correct?

Yes, exactly. Replacing disk is not an option...
                                            Petr Vandrovec
                                            

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:13   ` Marcin Dalecki
@ 2002-08-01 22:39     ` Alexander Viro
  0 siblings, 0 replies; 29+ messages in thread
From: Alexander Viro @ 2002-08-01 22:39 UTC (permalink / raw)
  To: martin; +Cc: Petr Vandrovec, linux-kernel, mingo



On Fri, 2 Aug 2002, Marcin Dalecki wrote:

> > I'd like to apologize to Ingo, his changes were completely innocent.
> > Problem was triggered by Al's 'block device size cleanups' (currently
> > cset 1.403.160.5 on bkbits).
> > 
> > Before this change, my system was using 4KB block size when reading
> > from /dev/hdc1, because of blk_size[][] (which is in 1kB units) of this 
> > partition was multiple of 2, and so i_size % 4096 was 0.  But after
> > Al's change partition size is read from gendisk, and not from blk_size,
> > and gendisk partition size is in 512 bytes units: and, as you can
> > probably guess, now my partition had i_size % 4096 == 512, and so only
> > 512 byte block size was chosen. And with 512 bytes block size my
> > harddisk refuses to cooperate.

Uh-oh...

Let me see if I got it straight:

a) your disk doesn't work with half-Kb requests
b) you have a partition with odd number of sectors
c) hardsect_size is set to half-Kb
d) old code worked since it rounded size to multiple of kilobyte.

Correct?


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 22:34 Petr Vandrovec
  0 siblings, 0 replies; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 22:34 UTC (permalink / raw)
  To: Marcin Dalecki; +Cc: viro, linux-kernel, mingo

On  2 Aug 02 at 0:13, Marcin Dalecki wrote:
> 
> Would you mind sending me hdparm -i /dev/hdx and hdparm -I /dev/hdx
> for documentation purposes? The host controller chip could be the
> one to blame as well.
> 
> I fear the need for yet another black list.

Here they are. This is with the i845 (82801BA rev B5) host chip. I'll check
i440BX tomorrow and PDC20265 on Sunday. I believe that PDC20265
works, because I noticed the problem only at work, not at home.
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz

/dev/hdc:

 Model=TOSHIBA MK6409MAV, FwRev=F1.03 A, SerialNo=58S40974
 Config={ Fixed }
 RawCHS=13424/15/63, TrkSize=0, SectSize=0, ECCbytes=36
 BuffType=unknown, BuffSize=0kB, MaxMultSect=16, MultSect=off
 CurCHS=13424/15/63, CurSects=12685680, LBA=yes, LBAsects=12685680
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4 
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 
 AdvancedPM=no WriteCache=enabled
 Drive Supports : Reserved : ATA-1 ATA-2 ATA-3 ATA-4 


/dev/hdc:

non-removable ATA device, with non-removable media
    Model Number:       TOSHIBA MK6409MAV                       
    Serial Number:      58S40974            
    Firmware Revision:  F1.03 A 
Standards:
    Supported: 1 2 3 4 
    Likely used: 4
Configuration:
    Logical         max     current
    cylinders       13424   13424
    heads           15      15
    sectors/track   63      63
    bytes/track:    0       (obsolete)
    bytes/sector:   0       (obsolete)
    current sector capacity: 12685680
    LBA user addressable sectors = 12685680
Capabilities:
    LBA, IORDY(can be disabled)
    ECC bytes: 36   Queue depth: 1
    Standby timer values: spec'd by Vendor, no device specific minimum
    r/w multiple sector transfer: Max = 16  Current = 16
    DMA: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4 
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled Supported:
       *    NOP cmd
       *    READ BUFFER cmd
       *    WRITE BUFFER cmd
       *    Host Protected Area feature set
       *    look-ahead
       *    write cache
       *    Power Management feature set
            Security Mode feature set
       *    SMART feature set
Security: 
        supported
    not enabled
    not locked
        frozen
    not expired: security count
    not supported: enhanced erase
    22min for SECURITY ERASE UNIT. 

00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 05) (prog-if 80 [Master])
    Subsystem: Intel Corp.: Unknown device 5054
00:1f.1 Class 0101: 8086:244b (rev 05) (prog-if 80 [Master])
    Subsystem: 8086:5054
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 0
    Region 4: I/O ports at ffa0 [size=16]
00: 86 80 4b 24 05 00 80 02 05 80 01 01 00 00 00 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: a1 ff 00 00 00 00 00 00 00 00 00 00 86 80 54 50
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
40: 47 e3 47 e3 00 00 00 00 05 00 01 02 00 00 00 00
50: 00 00 00 00 10 14 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 47 0f 00 00 00 00 00 00


pcibus = 33333
00:1f.1 vendor=8086 device=244b class=0101 irq=0 base4=ffa1
----------PIIX BusMastering IDE Configuration---------------
Driver Version:                     1.3
South Bridge:                       9291
Revision:                           IDE 0x5
Highest DMA rate:                   UDMA100
BM-DMA base:                        0xffa0
PCI clock:                          33.3MHz
-----------------------Primary IDE-------Secondary IDE------
Enabled:                      yes                 yes
Simplex only:                  no                  no
Cable Type:                   80w                 40w
-------------------drive0----drive1----drive2----drive3-----
Prefetch+Post:        yes       yes       yes       yes
Transfer Mode:       UDMA       PIO      UDMA       PIO
Address Setup:       90ns      90ns      90ns      90ns
Cmd Active:         360ns     360ns     360ns     360ns
Cmd Recovery:       540ns     540ns     540ns     540ns
Data Active:         90ns     360ns      90ns     360ns
Data Recovery:       30ns     540ns      30ns     540ns
Cycle Time:          22ns     900ns      60ns     900ns
Transfer Rate:   88.8MB/s   2.2MB/s  33.3MB/s   2.2MB/s

(whee, Intel defines UDMA(100) to 88.8MB/s?)
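
One possible explanation, offered as a guess rather than something taken
from the driver source: if the table derives the rate from the cycle time
at 2 bytes (16 bits) per cycle, then a 22.5ns cycle (shown rounded as
22ns) gives about 88.9MB/s, while true UDMA100 would need a 20ns cycle.
The function name transfer_rate_mb_s() is hypothetical.

```python
def transfer_rate_mb_s(cycle_ns, bytes_per_cycle=2):
    """Transfer rate in MB/s (10^6 bytes/s) for a given cycle time in ns."""
    # bytes per nanosecond equals 1000 MB/s, hence the factor of 1000.
    return bytes_per_cycle / cycle_ns * 1000.0


for cycle in (20.0, 22.5, 60.0, 900.0):
    print(f"{cycle:6.1f} ns -> {transfer_rate_mb_s(cycle):6.1f} MB/s")
```

Under the same assumption the other columns of the table are consistent:
60ns gives about 33.3MB/s and 900ns gives about 2.2MB/s.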

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 22:00 ` Petr Vandrovec
@ 2002-08-01 22:13   ` Marcin Dalecki
  2002-08-01 22:39     ` Alexander Viro
  0 siblings, 1 reply; 29+ messages in thread
From: Marcin Dalecki @ 2002-08-01 22:13 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: viro, linux-kernel, mingo

Petr Vandrovec wrote:
> On Thu, Aug 01, 2002 at 07:07:21PM +0200, Petr Vandrovec wrote:
> 
>>On 31 Jul 02 at 22:01, Marcin Dalecki wrote:
>>
>>>Well OK this was my next idea, but apparently you already did the
>>>experiment on your own. Thanks for the result. I'm still scratching my
>>>head and I have already observed this before myself.
>>>It's always funny to see what happens when one stops a driver
>>>from deliberately disabling IRQs for eons of jiffies :-).
>>
>>I currently suspect IRQ handling changes, but maybe someone has
>>better idea? Also, I cannot reproduce problem with Seagate UDMA66
>>drive switched to UDMA33 mode, so it looks like that problem is 
>>timing/firmware (Toshiba MK6409MAV) dependent.
> 
> 
> I'd like to apologize to Ingo, his changes were completely innocent.
> Problem was triggered by Al's 'block device size cleanups' (currently
> cset 1.403.160.5 on bkbits).
> 
> Before this change, my system was using 4KB block size when reading
> from /dev/hdc1, because of blk_size[][] (which is in 1kB units) of this 
> partition was multiple of 2, and so i_size % 4096 was 0.  But after
> Al's change partition size is read from gendisk, and not from blk_size,
> and gendisk partition size is in 512 bytes units: and, as you can
> probably guess, now my partition had i_size % 4096 == 512, and so only
> 512 byte block size was chosen. And with 512 bytes block size my
> harddisk refuses to cooperate.
> 
> I was trying to find reason in code, why 512 byte block size should
> not work, but I was not able to reveal any. Maybe I/O gurus here
> will know?

Petr. First - I wish to express my respect (for whatever it's
worth). Once again you are fscking sharp and to the point in your problem
analysis.

The little I know about the situation is that the SATA
people are having great problems with the mediocre physical sector size
and are pushing hard toward bigger sector
sizes. That may explain a bit why one
should stay alert in this area.

Would you mind sending me hdparm -i /dev/hdx and hdparm -I /dev/hdx
for documentation purposes? The host controller chip could be the
one to blame as well.

I fear the need for yet another blacklist.



* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 17:07 Petr Vandrovec
@ 2002-08-01 22:00 ` Petr Vandrovec
  2002-08-01 22:13   ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 22:00 UTC (permalink / raw)
  To: dalecki; +Cc: viro, linux-kernel, mingo

On Thu, Aug 01, 2002 at 07:07:21PM +0200, Petr Vandrovec wrote:
> On 31 Jul 02 at 22:01, Marcin Dalecki wrote:
> > 
> > Well OK this was my next idea, but apparently you already did the
> > experiment on your own. Thanks for the result. I'm still scratching my
> > head and I have already observed this before myself.
> > It's always funny to see what happens when one stops a driver
> > from deliberately disabling IRQs for eons of jiffies :-).
> 
> I currently suspect IRQ handling changes, but maybe someone has
> better idea? Also, I cannot reproduce problem with Seagate UDMA66
> drive switched to UDMA33 mode, so it looks like that problem is 
> timing/firmware (Toshiba MK6409MAV) dependent.

I'd like to apologize to Ingo, his changes were completely innocent.
Problem was triggered by Al's 'block device size cleanups' (currently
cset 1.403.160.5 on bkbits).

Before this change, my system was using 4KB block size when reading
from /dev/hdc1, because blk_size[][] (which is in 1kB units) of this 
partition was a multiple of 2, and so i_size % 4096 was 0.  But after
Al's change partition size is read from gendisk, and not from blk_size,
and gendisk partition size is in 512-byte units: and, as you can
probably guess, now my partition had i_size % 4096 == 512, and so only
512 byte block size was chosen. And with 512 byte block size my
harddisk refuses to cooperate.
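
[Editorial sketch] The block size selection described above can be
illustrated with a small standalone function. This is only a sketch
under assumptions: pick_block_size() and SKETCH_PAGE_SIZE are
hypothetical names, the loop shape is inferred from the context lines
of the patch further down, and the starting value is assumed to be the
hardware sector size (512 here). It is not the kernel's actual code.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4KB pages, as on i386 */

/*
 * Hypothetical stand-in for the loop around the patched lines in
 * fs/block_dev.c: start at the hardware sector size and double the
 * block size until the corresponding bit is set in the device size
 * or the page size is reached.
 */
static unsigned int pick_block_size(unsigned long long size_bytes,
                                    unsigned int hardsect)
{
    unsigned int bsize = hardsect;

    /* The sector size must be a non-zero power of two. */
    assert(hardsect != 0 && (hardsect & (hardsect - 1)) == 0);

    while (bsize < SKETCH_PAGE_SIZE) {
        if (size_bytes & bsize)   /* size is not a multiple of 2*bsize */
            break;
        bsize <<= 1;
    }
    return bsize;
}
```

With a size that is a multiple of 4096 the function returns 4096; with
a size where size % 4096 == 512 it stops immediately at 512, which is
the case reported above.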

I was trying to find reason in code, why 512 byte block size should
not work, but I was not able to reveal any. Maybe I/O gurus here
will know?

For now, I'm using patch below. It fixes problems for me, block size = 1024
is sufficient in my configuration. If you have any insights what can be
a problem, please tell me. Problem apparently is not in i_size not being
multiple of 1024: without changing bsize problem still occurs, even if
I shrink i_size down to be multiple of 4K.

After some more testing I found that my other drive (120GB WD) handles
bsize=512 quite happily, so it looks like that just my Toshiba disk
does not like 512B back to back transfers?! Are there any plans to
read from block devices in 4KB blocks for all reads/writes except for
the last partial page?
					Thanks,
						Petr Vandrovec
						vandrove@vc.cvut.cz

--- linux-2.5.29-c548/fs/block_dev.c.orig	2002-07-31 12:48:23.000000000 +0200
+++ linux-2.5.29-c548/fs/block_dev.c	2002-08-01 23:20:43.000000000 +0200
@@ -608,6 +608,11 @@
 				break;
 			bsize <<= 1;
 		}
+		if (bsize == 512) {
+			printk(KERN_ERR "Found 512b device! Using larger block size...\n");
+			bdev->bd_inode->i_size -= 512;
+			bsize = 1024;
+		}
 		bdev->bd_block_size = bsize;
 		bdev->bd_inode->i_blkbits = blksize_bits(bsize);
 		if (p->queue)


* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-08-01 17:07 Petr Vandrovec
  2002-08-01 22:00 ` Petr Vandrovec
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-08-01 17:07 UTC (permalink / raw)
  To: Marcin Dalecki; +Cc: linux-kernel, mingo

On 31 Jul 02 at 22:01, Marcin Dalecki wrote:
> 
> Well OK this was my next idea, but apparently you already did the
> experiment on your own. Thanks for the result. I'm still scratching my
> head and I have already observed this before myself.
> It's always funny to see what happens when one stops a driver
> from deliberately disabling IRQs for eons of jiffies :-).

I finally managed to compile older kernels, and I found that
2.5.27 (and 2.4.19-rc1 and 2.5.26) works fine (modulo endless loop 
in ide_do_request... but it takes at least 5 minutes to trigger it), 
while 2.5.28 dies in one second with UDMA status 0x25 (irq requested, 
transfer in progress) and IDE status 0x58 (drq asserted).

Because the only changes in the IDE system between 2.5.27 and 2.5.28 are
renaming __save_flags => local_save_flags, fixing get_request for
ioctl commands (so 2.5.28 should be correct while 2.5.27 is not),
and moving some ioctls around, it looks like the problem is triggered
by something else.

I currently suspect IRQ handling changes, but maybe someone has
better idea? Also, I cannot reproduce problem with Seagate UDMA66
drive switched to UDMA33 mode, so it looks like that problem is 
timing/firmware (Toshiba MK6409MAV) dependent.

And I did all these tests with UP kernel, just to eliminate cli/sti 
changes.
                                            Petr Vandrovec
                                            vandrove@vc.cvut.cz
                                            


* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 10:33         ` Marcin Dalecki
@ 2002-08-01 10:45           ` Jens Axboe
  0 siblings, 0 replies; 29+ messages in thread
From: Jens Axboe @ 2002-08-01 10:45 UTC (permalink / raw)
  To: martin; +Cc: Petr Vandrovec, linux-kernel

On Thu, Aug 01 2002, Marcin Dalecki wrote:
> Jens Axboe wrote:
> 
> >>>that would work, but I think it would seriously starve the other device
> >>>on the same channel.
> >>
> >>We starve anyway, because the kernel isn't real time and we can't
> >>guarantee "sleeping" for some maximum time and coming back.
> >>We don't reschedule the kernel during this kind of "sleeping".
> >>And we can't know that a command on the "mate" will not take 
> >>extraordinary amounts of time. It's only a problem if mixing travan
> >>tapes with disks on a channel.
> >
> >
> >I'm thinking about the alternation of the devices so one device can't
> >starve the other device off the channel.
> 
> Ah so you are thinking about two equally powered devices
> competing for the channel. Something I would call the "sumo fight"
> situation. Well disks didn't use the "sleeping" mechanism at all anyway
> and the chances someone would do cp from CD-ROM to CD-ROM are low.
> 
> Finally I think that the proper granularity of scheduling requests to
> the drive is, well, the request layer. The queue processing layer should
> handle this because otherwise we would have two "competing" optimization
> mechanisms. And there we are indeed able to actually relinquish some CPU 
> time. If you look at a request processing optimization as a low pass
> signal filter it's immediately obvious that the effects of chaining them
> can be, well at least "counter intuitive".

Actually, I'm thinking of a much simpler scenario: basically any two
devices on the same channel, both with pending requests on the queue.
This could be a hard drive and a cd writer, for instance. If you have 60
requests pending for the hard drive, queue gets unplugged, you start the
first one. Correct me if I'm wrong, but now you pass back the drive to
the request handler when the first request completes, and you select a
new request from that very same drive without considering device
starvation? Any run of the cd writer queue would do nothing, since it
would just find the channel busy.

This sort of thing cannot be solved at the block layer. The two queues
are independent seen from that layer, the channel-busy dependency cannot
be solved there.
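
[Editorial sketch] One way to picture the alternation Jens asks for is
a round-robin pick between the devices sharing a channel. The sketch
below is an illustration only: struct sketch_channel, its fields and
sketch_next_unit() are hypothetical names and do not correspond to
the IDE driver's real data structures.

```c
#include <assert.h>

#define SKETCH_MAX_DEVICES 2   /* assumption: master and slave on one channel */

/* Hypothetical per-channel bookkeeping, not the driver's real layout. */
struct sketch_channel {
    unsigned int pending[SKETCH_MAX_DEVICES]; /* queued requests per device */
    int last_served;                          /* unit served last, -1 if none */
};

/*
 * Pick the next unit to serve, starting the search at the device after
 * the one served last, so a long queue on one device cannot keep the
 * other device off the channel. Returns -1 if nothing is pending.
 */
static int sketch_next_unit(struct sketch_channel *ch)
{
    int i;

    assert(ch != 0);
    for (i = 1; i <= SKETCH_MAX_DEVICES; i++) {
        int unit = (ch->last_served + i) % SKETCH_MAX_DEVICES;

        if (ch->pending[unit]) {
            ch->pending[unit]--;
            ch->last_served = unit;
            return unit;
        }
    }
    return -1;
}
```

With 60 requests pending on unit 0 and one on unit 1, successive calls
serve unit 0, then unit 1, then unit 0 again, instead of draining all
60 hard drive requests first.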

-- 
Jens Axboe



* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01 10:05       ` Jens Axboe
@ 2002-08-01 10:33         ` Marcin Dalecki
  2002-08-01 10:45           ` Jens Axboe
  0 siblings, 1 reply; 29+ messages in thread
From: Marcin Dalecki @ 2002-08-01 10:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: martin, Petr Vandrovec, linux-kernel

Jens Axboe wrote:

>>>that would work, but I think it would seriously starve the other device
>>>on the same channel.
>>
>>We starve anyway, because the kernel isn't real time and we can't
>>guarantee "sleeping" for some maximum time and coming back.
>>We don't reschedule the kernel during this kind of "sleeping".
>>And we can't know that a command on the "mate" will not take 
>>extraordinary amounts of time. It's only a problem if mixing travan
>>tapes with disks on a channel.
> 
> 
> I'm thinking about the alternation of the devices so one device can't
> starve the other device off the channel.

Ah so you are thinking about two equally powered devices
competing for the channel. Something I would call the "sumo fight"
situation. Well disks didn't use the "sleeping" mechanism at all anyway
and the chances someone would do cp from CD-ROM to CD-ROM are low.

Finally I think that the proper granularity of scheduling requests to
the drive is, well, the request layer. The queue processing layer should
handle this because otherwise we would have two "competing" optimization
mechanisms. And there we are indeed able to actually relinquish some CPU 
time. If you look at a request processing optimization as a low pass
signal filter it's immediately obvious that the effects of chaining them
can be, well at least "counter intuitive".






* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01  9:56     ` Marcin Dalecki
@ 2002-08-01 10:05       ` Jens Axboe
  2002-08-01 10:33         ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2002-08-01 10:05 UTC (permalink / raw)
  To: martin; +Cc: Petr Vandrovec, linux-kernel

On Thu, Aug 01 2002, Marcin Dalecki wrote:
> >hey that sucks :-)
> 
> Since IDE 111 not any more...

Yeah I just saw that 110 was the 'broken' solution, 111 made it right.
Good.

> >seriously, the better way to do this would be to change the q->queuedata
> >to be a pointer to drive instead of the channel.
> 
> ... because this is already *done* there :-).

:-)

> >that would work, but I think it would seriously starve the other device
> >on the same channel.
> 
> We starve anyway, because the kernel isn't real time and we can't
> guarantee "sleeping" for some maximum time and coming back.
> We don't reschedule the kernel during this kind of "sleeping".
> And we can't know that a command on the "mate" will not take 
> extraordinary amounts of time. It's only a problem if mixing travan
> tapes with disks on a channel.

I'm thinking about the alternation of the devices so one device can't
starve the other device off the channel.

-- 
Jens Axboe



* Re: IDE from current bk tree, UDMA and two channels...
  2002-08-01  9:56   ` Jens Axboe
@ 2002-08-01  9:56     ` Marcin Dalecki
  2002-08-01 10:05       ` Jens Axboe
  0 siblings, 1 reply; 29+ messages in thread
From: Marcin Dalecki @ 2002-08-01  9:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: martin, Petr Vandrovec, linux-kernel

Jens Axboe wrote:
> On Wed, Jul 31 2002, Marcin Dalecki wrote:
> 
>>>Unfortunately, problem is still here: when kernel was in idedisk_do_request
>>>performed on channel 0, IRQ for channel 1 arrived, and this irq found 
>>>channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after 
>>>that IRQ for channel 1 arrived again, but as it was unexpected, nothing 
>>>happened. 
>>>
>>>I hope that i845 is not simplex device, but first (unexpected) IRQ arrived 
>>>just when channel 0 code wrote new value to its IDE_SELECT_REG register. 
>>>Now I even disconnected DVD drive, so it is simple two masters, two
>>>channels configuration, but it still happens.
>>
>>One idea and one experiment I was already thinking about is
>>to change do_ide_request to actually *not* select deliberately which 
>>device to handle. (The big for loop found there...)
>>One can instead search for a device on the channel which is matching
>>the queue for which do_ide_request() was called.
>>
>>for (unit = 0; unit < MAX_DEVICES; ++unit) {
>>  ....
>>  if (tmp->queue == q) {
>>        drive = tmp;
>>	break;
>>  }
>>}
>>if (!drive)
>>  BUG();
> 
> 
> hey that sucks :-)

Since IDE 111 not any more...

> seriously, the better way to do this would be to change the q->queuedata
> to be a pointer to drive instead of the channel.

... because this is already *done* there :-).

> that would work, but I think it would seriously starve the other device
> on the same channel.

We starve anyway, because the kernel isn't real time and we can't
guarantee "sleeping" for some maximum time and coming back.
We don't reschedule the kernel during this kind of "sleeping".
And we can't know that a command on the "mate" will not take 
extraordinary amounts of time. It's only a problem if mixing travan
tapes with disks on a channel.






* Re: IDE from current bk tree, UDMA and two channels...
  2002-07-31 20:01 ` Marcin Dalecki
@ 2002-08-01  9:56   ` Jens Axboe
  2002-08-01  9:56     ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Jens Axboe @ 2002-08-01  9:56 UTC (permalink / raw)
  To: martin; +Cc: Petr Vandrovec, linux-kernel

On Wed, Jul 31 2002, Marcin Dalecki wrote:
> >Unfortunately, problem is still here: when kernel was in idedisk_do_request
> >performed on channel 0, IRQ for channel 1 arrived, and this irq found 
> >channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after 
> >that IRQ for channel 1 arrived again, but as it was unexpected, nothing 
> >happened. 
> >
> >I hope that i845 is not simplex device, but first (unexpected) IRQ arrived 
> >just when channel 0 code wrote new value to its IDE_SELECT_REG register. 
> >Now I even disconnected DVD drive, so it is simple two masters, two
> >channels configuration, but it still happens.
> 
> One idea and one experiment I was already thinking about is
> to change do_ide_request to actually *not* select deliberately which 
> device to handle. (The big for loop found there...)
> One can instead search for a device on the channel which is matching
> the queue for which do_ide_request() was called.
> 
> for (unit = 0; unit < MAX_DEVICES; ++unit) {
>   ....
>   if (tmp->queue == q) {
>         drive = tmp;
> 	break;
>   }
> }
> if (!drive)
>   BUG();

hey that sucks :-)

seriously, the better way to do this would be to change the q->queuedata
to be a pointer to drive instead of the channel.

that would work, but I think it would seriously starve the other device
on the same channel.

-- 
Jens Axboe



* Re: IDE from current bk tree, UDMA and two channels...
  2002-07-30 19:26 Petr Vandrovec
@ 2002-07-31 20:01 ` Marcin Dalecki
  2002-08-01  9:56   ` Jens Axboe
  0 siblings, 1 reply; 29+ messages in thread
From: Marcin Dalecki @ 2002-07-31 20:01 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: linux-kernel

Petr Vandrovec wrote:
> I wrote:
> 
>>Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
>>after command is done and all data transferred, and we do not play
>>with select bit. But we play with nIEN bit of disk. Do you see
>>any reason why this should cause spurious interrupt? (system is using
>>XT-PIC, FYI)
> 
> 
> OK. As I am using only one device on each channel, I commented
> out ata_irq_enable(drive, 1) in ide-disk.c when issuing command,
> and removed disabling irq in ide_do_request in ide.c when we
> do not issue command to the drive, and spurious interrupts disappeared.
> So now I'm getting only half of IRQs for channel 0, and system still
> works as before ;-)

Well OK this was my next idea, but apparently you already did the
experiment on your own. Thanks for the result. I'm still scratching my
head and I have already observed this before myself.
It's always funny to see what happens when one stops a driver
from deliberately disabling IRQs for eons of jiffies :-).

> Unfortunately, problem is still here: when kernel was in idedisk_do_request
> performed on channel 0, IRQ for channel 1 arrived, and this irq found 
> channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after 
> that IRQ for channel 1 arrived again, but as it was unexpected, nothing 
> happened. 
> 
> I hope that i845 is not simplex device, but first (unexpected) IRQ arrived 
> just when channel 0 code wrote new value to its IDE_SELECT_REG register. 
> Now I even disconnected DVD drive, so it is simple two masters, two
> channels configuration, but it still happens.

One idea and one experiment I was already thinking about is
to change do_ide_request to actually *not* select deliberately which 
device to handle. (The big for loop found there...)
One can instead search for a device on the channel which is matching
the queue for which do_ide_request() was called.

for (unit = 0; unit < MAX_DEVICES; ++unit) {
   ....
   if (tmp->queue == q) {
         drive = tmp;
	break;
   }
}
if (!drive)
   BUG();
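
[Editorial sketch] For readers who want something compilable, the
lookup above can be written out as a standalone function. Everything
here is an assumption made for illustration: the sketch_* types and
the "present" check are hypothetical, the part elided as "...." in
the fragment above is not reconstructed, and a NULL return stands in
for the BUG() call.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEVICES 2            /* assumption: master + slave */

/* Hypothetical stand-ins for the request queue and drive structures. */
struct sketch_queue { int unused; };

struct sketch_drive {
    int present;                        /* assumption: skip absent drives */
    struct sketch_queue *queue;
};

struct sketch_channel {
    struct sketch_drive drives[SKETCH_MAX_DEVICES];
};

/* Return the drive on ch whose queue is q, or NULL if there is none. */
static struct sketch_drive *sketch_find_drive(struct sketch_channel *ch,
                                              struct sketch_queue *q)
{
    int unit;

    assert(ch != NULL && q != NULL);
    for (unit = 0; unit < SKETCH_MAX_DEVICES; ++unit) {
        struct sketch_drive *tmp = &ch->drives[unit];

        if (!tmp->present)
            continue;
        if (tmp->queue == q)
            return tmp;
    }
    return NULL;    /* the caller decides whether this is a BUG() */
}
```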

Just please forget temporarily that there is a mechanism for "sleeping".
It is bogus anyway (doesn't give time back to anybody) and the only
consumers of it are ide-cd (easily removed there) and ide-tape.c (don't 
care, the driver was never usable in 2.5.xx).

> And as always, something else: ata_error does:
> 
> OUT_BYTE(WIN_NOP, ch->ports[IDE_CONTROL_OFFSET])
> 
> I'd say that it should use 0x00 instead of WIN_NOP, and also that
> comment above OUT_BYTE(0x04, ch->ports[IDE_CONTROL_OFFSET]) is bogus.
> Command register is IDE_COMMAND, not IDE_CONTROL ;-)

Yes, I already know about this; I will remove the comment.
(Must have forgotten about it.)




* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-07-30 19:26 Petr Vandrovec
  2002-07-31 20:01 ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-07-30 19:26 UTC (permalink / raw)
  To: dalecki; +Cc: linux-kernel

I wrote:
> 
> Unfortunately ATA/ATAPIv7 says that single interrupt is triggered
> after command is done and all data transferred, and we do not play
> with select bit. But we play with nIEN bit of disk. Do you see
> any reason why this should cause spurious interrupt? (system is using
> XT-PIC, FYI)

OK. As I am using only one device on each channel, I commented
out ata_irq_enable(drive, 1) in ide-disk.c when issuing command,
and removed disabling irq in ide_do_request in ide.c when we
do not issue command to the drive, and spurious interrupts disappeared.
So now I'm getting only half of IRQs for channel 0, and system still
works as before ;-)

Unfortunately, problem is still here: when kernel was in idedisk_do_request
performed on channel 0, IRQ for channel 1 arrived, and this irq found 
channel 1 DMA engine ready, but drive had DRQ set... oops. Shortly after 
that IRQ for channel 1 arrived again, but as it was unexpected, nothing 
happened. 

I hope that i845 is not simplex device, but first (unexpected) IRQ arrived 
just when channel 0 code wrote new value to its IDE_SELECT_REG register. 
Now I even disconnected DVD drive, so it is simple two masters, two
channels configuration, but it still happens.

And as always, something else: ata_error does:

OUT_BYTE(WIN_NOP, ch->ports[IDE_CONTROL_OFFSET])

I'd say that it should use 0x00 instead of WIN_NOP, and also that
comment above OUT_BYTE(0x04, ch->ports[IDE_CONTROL_OFFSET]) is bogus.
Command register is IDE_COMMAND, not IDE_CONTROL ;-)
                                        Best regards,
                                                Petr Vandrovec
                                                vandrove@vc.cvut.cz
                                                


* Re: IDE from current bk tree, UDMA and two channels...
@ 2002-07-30 16:15 Petr Vandrovec
  0 siblings, 0 replies; 29+ messages in thread
From: Petr Vandrovec @ 2002-07-30 16:15 UTC (permalink / raw)
  To: Marcin Dalecki; +Cc: linux-kernel

On 30 Jul 02 at 16:25, Marcin Dalecki wrote:
> > Second problem is that read operation which ends with
> > "drive ready, seek complete, data request" (why it happened in first
> > place?) will just read one sector from drive (it was DMA transfer,
> > so drive->mult_count == 0), and then it returns from ata_error
> > with ATA_OP_CONTINUES. But what continues? Drive told us that
> > current operation is done, and no new operation was started, so
> > there is very low chance that some IRQ will ever come, and timer was
> > just removed by ata_irq_request(), so channel will never awake.
> 
> What should continue is the retry of the operation, since otherwise
> it will be abandoned in do_ide_request(). However I will recheck.

It looks to me like that we only issue idle immediate and reset
to the drive. And even if we reset drive, we do not reissue
command, not even talking about resetting handler. And because of 
ide_dma_intr -> ata_error will report ATA_OP_CONTINUES, ata_irq_request 
will think that handler reissued command, and it will leave IDE_BUSY set.
So we are left with IDE_BUSY set, idle hardware, no handler and no timer 
active, and with one request on the fly lost somewhere in the system.
Probably code which reissued hardware was dropped sometime in the past
changes?

Another problem I found: ata_error calls ata_status_poll, which can
call back to ata_error. Hardwiring BUSY_STAT bit to 1 (== unplugging
drive from system, for example) can cause this loop, as far as I can see.
Fortunately on my system it reads 0x7F from status register after disk
unplug, but it still does not look correct.
 
> > And last thing: problem does not happen when only one of channels is
> > active, it is triggered only when both channels are active, and
> > channel #1 is always one which dies. Channel #0 uses IRQ14, channel #1
> > IRQ15, so there should be no sharing involved.
> 
> Do you do unmasking of IRQs? Holding them a bit longer could have some
> impact as well...

It was happening with default configuration, with unmaskirq=1. Now I tried

hdparm -u 0 /dev/hda; hdparm -u 0 /dev/hdc
vmware-config.pl -default & fsck -f /dev/hdc1

and it again died. vmware-config.pl is used as simple compile test,
it happens with 'ls -lRta /' too, but with 'vmware-config.pl' it happens
much faster.

Stack trace when this problem happens is:

ide_dma_intr + b8/cc (here I added printstate() call)
ata_irq_request + 11e/1cc
handle_IRQ_event + 29/4c
do_IRQ + df/190
common_interrupt + 18/20
madvise_willneed + 10/94
radix_tree_lookup + 18/60
do_page_cache_readahead + 92/13c
do_generic_file_read + 57/2a8
generic_file_read + 11c/138
file_read_actor + 0/8c
vfs_read + b4/134
sys_read + 2a/3c
syscall_call + 7/b

It is UP machine (with SMP non-preemptible kernel). Stack trace does not 
look like that it was caused by some race.
                                                Best regards,
                                                    Petr Vandrovec
                                                    vandrove@vc.cvut.cz
                                                    


* Re: IDE from current bk tree, UDMA and two channels...
  2002-07-30 14:03 Petr Vandrovec
@ 2002-07-30 14:25 ` Marcin Dalecki
  0 siblings, 0 replies; 29+ messages in thread
From: Marcin Dalecki @ 2002-07-30 14:25 UTC (permalink / raw)
  To: Petr Vandrovec; +Cc: linux-kernel

Petr Vandrovec wrote:
> Hi Martin,
>   here at work I have i845 chipset, with one UDMA100 disk connected
> to the primary channel, and one UDMA100 disk and one CD-DVD on the
> secondary one. CD-DVD driver is not loaded at all, all three devices
> are configured for UDMA by kernel. 
> 
>   Today 2.5.29-cset511 died when rebooting to 2.5.29-cset536 (rmap.c:212 
> BUG(), but I believe that it is fixed by Paulus's page->index patch 
> (cset520)) and after reboot I'm not able to fsck /dev/hdc1. It dies with
> 
> hdc: ide_dma_intr: status=0x58 [ drive ready,seek complete,data request]
> hdc: request error, nr. 1

That is usually indicating that some operation was started before
some other really finished.

> and fsck is D, and channel is stopped :-( First something easy: I think 
> that we should use ", " as a separator in dump_bits, and if there is
> space after opening "[", there should be also space before closing "]".

Yeep. No problem.
>
> Second problem is that read operation which ends with
> "drive ready, seek complete, data request" (why it happened in first
> place?) will just read one sector from drive (it was DMA transfer,
> so drive->mult_count == 0), and then it returns from ata_error
> with ATA_OP_CONTINUES. But what continues? Drive told us that
> current operation is done, and no new operation was started, so
> there is very low chance that some IRQ will ever come, and timer was
> just removed by ata_irq_request(), so channel will never awake.

What should continue is the retry of the operation, since otherwise
it will be abandoned in do_ide_request(). However I will recheck.

> And third, why this happens at all? When I instrumented ide_dma_intr 
> with printk, udma_stop() returns zero: it means that everything went 
> fine, UDMA engine asked for interrupt, no error, UDMA engine stopped. 
> Only reason I can invent is that drive did not clear DRQ bit yet, or 
> that we programmed UDMA engine with too few bytes to transfer. Either 
> of these explanations looks strange to me, as this does not explain
> why it happens only when both channels are in use simultaneously.
> 
> And last thing: problem does not happen when only one of channels is
> active, it is triggered only when both channels are active, and
> channel #1 is always one which dies. Channel #0 uses IRQ14, channel #1
> IRQ15, so there should be no sharing involved.

Hmm, the order of channels matters for the way the queues are fed.
I think we could experience reentrancy problems. Or there
are some errors in ata_irq_handler() in dispatching the incoming
IRQs. It should be a good idea to add an IRQ number parameter to the
IRQ handler type, since this would allow us to detect such situations.

One check that could help would be to discover the drive to service next, 
based on drive->queue in do_ide_request() instead of naively looking
through all drives in do_ide_request(). At least comparing it to the
queue parameter after selection would make sense.

Do you do unmasking of IRQs? Holding them a bit longer could have some
impact as well...

Thanks for the input, I will have to think through it a bit longer :-).



* IDE from current bk tree, UDMA and two channels...
@ 2002-07-30 14:03 Petr Vandrovec
  2002-07-30 14:25 ` Marcin Dalecki
  0 siblings, 1 reply; 29+ messages in thread
From: Petr Vandrovec @ 2002-07-30 14:03 UTC (permalink / raw)
  To: martin; +Cc: linux-kernel

Hi Martin,
  here at work I have i845 chipset, with one UDMA100 disk connected
to the primary channel, and one UDMA100 disk and one CD-DVD on the
secondary one. CD-DVD driver is not loaded at all, all three devices
are configured for UDMA by kernel. 

  Today 2.5.29-cset511 died when rebooting to 2.5.29-cset536 (rmap.c:212 
BUG(), but I believe that it is fixed by Paulus's page->index patch 
(cset520)) and after reboot I'm not able to fsck /dev/hdc1. It dies with

hdc: ide_dma_intr: status=0x58 [ drive ready,seek complete,data request]
hdc: request error, nr. 1

and fsck is D, and channel is stopped :-( First something easy: I think 
that we should use ", " as a separator in dump_bits, and if there is
space after opening "[", there should be also space before closing "]".
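
[Editorial sketch] The formatting proposed above (", " as separator
and a space before the closing "]") could look like the helper below.
The bit masks are the standard ATA status register bits; the function
and table names are hypothetical and this is not the driver's actual
dump_bits implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ATA status register bits, most significant first. */
static const struct {
    unsigned char mask;
    const char *name;
} sketch_status_bits[] = {
    { 0x80, "busy" },
    { 0x40, "drive ready" },
    { 0x20, "device fault" },
    { 0x10, "seek complete" },
    { 0x08, "data request" },
    { 0x04, "corrected error" },
    { 0x02, "index" },
    { 0x01, "error" },
};

/* Append s to the NUL-terminated string in buf without overflowing it. */
static void sketch_append(char *buf, size_t len, const char *s)
{
    size_t used = strlen(buf);

    if (used + 1 < len)
        snprintf(buf + used, len - used, "%s", s);
}

/* Hypothetical helper: render the set bits as "[ a, b, c ]" into buf. */
static char *sketch_format_status(unsigned char status, char *buf, size_t len)
{
    size_t i;
    int first = 1;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    sketch_append(buf, len, "[ ");
    for (i = 0; i < sizeof(sketch_status_bits) / sizeof(sketch_status_bits[0]); i++) {
        if (!(status & sketch_status_bits[i].mask))
            continue;
        if (!first)
            sketch_append(buf, len, ", ");
        sketch_append(buf, len, sketch_status_bits[i].name);
        first = 0;
    }
    sketch_append(buf, len, first ? "]" : " ]");
    return buf;
}
```

For the status 0x58 from the log above this yields
"[ drive ready, seek complete, data request ]".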

Second problem is that read operation which ends with
"drive ready, seek complete, data request" (why it happened in first
place?) will just read one sector from drive (it was DMA transfer,
so drive->mult_count == 0), and then it returns from ata_error
with ATA_OP_CONTINUES. But what continues? Drive told us that
current operation is done, and no new operation was started, so
there is very low chance that some IRQ will ever come, and timer was
just removed by ata_irq_request(), so channel will never awake.

And third, why this happens at all? When I instrumented ide_dma_intr 
with printk, udma_stop() returns zero: it means that everything went 
fine, UDMA engine asked for interrupt, no error, UDMA engine stopped. 
Only reason I can invent is that drive did not clear DRQ bit yet, or 
that we programmed UDMA engine with too few bytes to transfer. Either 
of these explanations looks strange to me, as this does not explain
why it happens only when both channels are in use simultaneously.

And last thing: problem does not happen when only one of channels is
active, it is triggered only when both channels are active, and
channel #1 is always one which dies. Channel #0 uses IRQ14, channel #1
IRQ15, so there should be no sharing involved.
                                Thanks,
                                    Petr Vandrovec
                                    vandrove@vc.cvut.cz
                                    


end of thread, other threads:[~2002-08-02 11:48 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-07-30 18:19 IDE from current bk tree, UDMA and two channels Petr Vandrovec
2002-07-31 19:48 ` Marcin Dalecki
  -- strict thread matches above, loose matches on Subject: below --
2002-08-01 23:13 Petr Vandrovec
2002-08-02 13:07 ` Alan Cox
2002-08-01 23:00 Petr Vandrovec
2002-08-01 23:05 ` Alexander Viro
2002-08-01 22:53 Petr Vandrovec
2002-08-01 23:02 ` Alexander Viro
2002-08-01 23:54 ` Linus Torvalds
     [not found] <200208012219.g71MJV109133@penguin.transmeta.com>
2002-08-01 22:45 ` Alexander Viro
2002-08-02  9:10   ` Marcin Dalecki
2002-08-01 22:42 Petr Vandrovec
2002-08-01 22:52 ` Alexander Viro
2002-08-02  9:11   ` Marcin Dalecki
2002-08-01 22:34 Petr Vandrovec
2002-08-01 17:07 Petr Vandrovec
2002-08-01 22:00 ` Petr Vandrovec
2002-08-01 22:13   ` Marcin Dalecki
2002-08-01 22:39     ` Alexander Viro
2002-07-30 19:26 Petr Vandrovec
2002-07-31 20:01 ` Marcin Dalecki
2002-08-01  9:56   ` Jens Axboe
2002-08-01  9:56     ` Marcin Dalecki
2002-08-01 10:05       ` Jens Axboe
2002-08-01 10:33         ` Marcin Dalecki
2002-08-01 10:45           ` Jens Axboe
2002-07-30 16:15 Petr Vandrovec
2002-07-30 14:03 Petr Vandrovec
2002-07-30 14:25 ` Marcin Dalecki
