linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alan Chandler <alan@chandlerfamily.org.uk>
To: Jens Axboe <axboe@suse.de>
Cc: linux-kernel@vger.kernel.org
Subject: Re: ide-cd problem
Date: Tue, 30 Nov 2004 08:59:43 +0000	[thread overview]
Message-ID: <200411300859.43422.alan@chandlerfamily.org.uk> (raw)
In-Reply-To: <200411232149.31701.alan@chandlerfamily.org.uk>

On Tuesday 23 November 2004 21:49, Alan Chandler wrote:
> On Tuesday 23 November 2004 14:51, Jens Axboe wrote:
> > On Tue, Nov 23 2004, Alan Chandler wrote:
> > > On Monday 22 November 2004 23:48, Alan Chandler wrote:
> > > ...
> > >
> > > > If I make the delay 600ns it works - I guess my hardware is a little
> > > > off spec.
...
> There is good and bad news related to 2.6.10-rc2
>
> The good news is that the acpi problem was the cause of the startup issues.
> adding pci=noacpi to the boot command line fixed that.
>
> The bad news is that with the delay at 800ns in drive_is_ready() I am
> getting the exact same symptoms I got with 2.6.9 before upping the delay to
> over 540ns.
...
>
> Firstly, the symptoms are the same between 2.6.9 and 2.6.10-rc2.  The halt
> seem to always be in exactly the same place. If it was a timing problem I
> would have thought it would have varied.
>
> Secondly, the command before seems to have an expectation of a 2048
> transfer rather than the 0 in the command, before the problem and then you
> get the strange DRQ=1 but 0 in the len register.
>
> Nov 23 20:37:33 kanger kernel: ide-cd:ide_do_rq_cdrom - cmd = 0x0
> Nov 23 20:37:33 kanger kernel: ide-cd:cdrom_newpc_intr - cmd=0x0 stat=0x50
> ireason=3 len=2048 rq len=0
> Nov 23 20:37:33 kanger kernel: ide-cd:ide_do_rq_cdrom - cmd = 0x1b
> Nov 23 20:37:33 kanger kernel: ide-cd:cdrom_newpc_intr - cmd=0x1b stat=0x58
> ireason=2 len=0 rq len=0
>

I have looked at this "every which way" to try and understand what is 
happening, and now think I have a handle on it.

The Test Unit Ready command (cmd=0x00) is the first command in the sequence 
that is issued with an expectation of no data being returned (rq len=0), but 
for some reason (which I don't know why) the device has decided it has data 
to send us (len=2048). I was confused for a long time because in normal 
circumstances cdrom_newpc_intr() is entered twice for a packet command with a 
data transfer attached, the first time with DRQ asserted and rq len == len, 
and once the transfer has taken place with DRQ not asserted and rq len having 
been set to 0 whilst len remains at its original value.  But in the case 
shown above, the first time cdrom_newpc_intr() is entered rq len (actually 
rq->data_len) is set to zero

In this case and (as far as I can see related to the packet commands) only 
this case, with my drive (CW078D), there is a delay between IRQ being 
asserted to say the command has been completed and DRQ being raised. (DRQ for 
the other cases seems to be asserted by the time of the interrupt).  

Once DRQ is asserted in this way, and until some command tries to read the 
data it wants to send us, it remains asserted.  That is why the start stop 
unit command (0x1b) appears to be the one that is failing.  In one trial I 
conducted I made this command into a no op - returning successfully, and the 
problem just full into the next command in sequence.

I was also experiencing different symptoms between 2.6.9 and 2.6.10-rc2, in 
that a change made to solve the problem in 2.6.9 (changing the delay in 
drive_is_ready to 600ns) did not appear to work in 2.6.10-rc2.  I haven't 
quite got to the bottom of this - but as the patch below shows, I have 
plugged what might be race conditions similar to the one identified by Alan 
Cox.  I do not fully understand the logic of the Alan's race condition, as 
from comments elsewhere in the code (and I really don't know how it really 
works) it says linux does not allow rentrant interrupts on the same unit - so 
I don't understand why the spinlocks need to be there, so my logic maybe 
totally spurious in this case - except it did make the difference between it 
working in 2.6.9 and getting it to work in 2.6.10-rc2.

 I did attempt in early tests I did with 2.6.9 to determine the accurately the 
size of the delay needed, it appeared to be around 540ns (400ns was not 
enough).  I have done all recent tests with 2.6.10-rc2 and 400ns is not 
enough 600ns is.  I have searched all over the ATAPI spec, but I can not see 
any really definitive statement on DRQ assertion in this case.  There is of 
course the 400ns delay in removing IRQ after either reading the status 
register or writing to the command register (clause 5.2.9), which the code 
seems to interpret as 400ns after either occurrence, but which my reading of 
the spec would be 400ns after the first occurence of one of these.  As I 
indicated in a previous note on this, as we always read the status register 
at the start of interrupt processing (in drive_is_ready), perhaps the 
solution is to store jiffies at this point and then delay 400ns - difference 
between jiffies now and as we exit ide_intr().

The other place where any delay is mentioned is when transitioning from bus 
idle state to various command state following issue of a command. Again the 
400ns is mentioned.  In the specific case of the packet command, this delay 
seems to need to be when the packet command is written - and before the 
command packet is transfered.  The only delay mentioned after that point is 
the one pio cycle (where it recommends reading the alt status reg) before 
checking status.  Given that we read the status twice (drive_is_ready and 
cdrom_decode_status) it appears we fufil this rule.

In fact, unless someone can point me at another place in the spec, the 
implication is that DRQ will be correct at the time of the interrupt and two 
reads of the status register that we do when entering cdrom_newpc_intr.  So I 
Have to conclude that my CD is operating outside of spec in this case.

This patch below is not a definitive answer to the issue, rather an 
illustration of the test envionment I was using to show things working.  You 
will see from the patches in drivers/ide/ide-iops.c that I have removed most 
of the timing delays, mainly to get a real handle on what was needed in 
ide-cd.c.  I never had any problem with delays apart from the one described 
above 

Index: drivers/ide/ide-iops.c
===================================================================
--- drivers/ide/ide-iops.c (.../vendor/2.6.10-rc2) (revision 14)
+++ drivers/ide/ide-iops.c (.../akc/delay) (revision 14)
@@ -476,10 +476,6 @@
  if (drive->waiting_for_dma)
   return hwif->ide_dma_test_irq(drive);
 
-#if 0
- /* need to guarantee 400ns since last command was issued */
- udelay(1);
-#endif
 
 #ifdef CONFIG_IDEPCI_SHARE_IRQ
  /*
@@ -560,7 +556,6 @@
   return 1;
  }
 
- udelay(1); /* spec allows drive 400ns to assert "BUSY" */
  if ((stat = hwif->INB(IDE_STATUS_REG)) & BUSY_STAT) {
   local_irq_set(flags);
   timeout += jiffies;
@@ -589,11 +584,13 @@
   * rather than expensively fail things immediately.
   * This fix courtesy of Matthew Faupel & Niccolo Rigacci.
   */
- for (i = 0; i < 10; i++) {
-  udelay(1);
+ 
+ /* take out this fix temporarily to test out what is happening without 
distractions*/
+/* for (i = 0; i < 10; i++) {
+  udelay(1);*/
   if (OK_STAT((stat = hwif->INB(IDE_STATUS_REG)), good, bad))
    return 0;
- }
+/* }*/
  *startstop = DRIVER(drive)->error(drive, "status error", stat);
  return 1;
 }
Index: drivers/ide/ide-cd.c
===================================================================
--- drivers/ide/ide-cd.c (.../vendor/2.6.10-rc2) (revision 14)
+++ drivers/ide/ide-cd.c (.../akc/delay) (revision 14)
@@ -890,8 +890,13 @@
   ide_execute_command(drive, WIN_PACKETCMD, handler, ATAPI_WAIT_PC, 
cdrom_timer_expiry);
   return ide_started;
  } else {
-  /* packet command */
-  HWIF(drive)->OUTB(WIN_PACKETCMD, IDE_COMMAND_REG);
+  unsigned long flags;
+
+                /* packet command */
+  spin_lock_irqsave(&ide_lock, flags);
+  HWIF(drive)->OUTBSYNC(drive, WIN_PACKETCMD, IDE_COMMAND_REG);
+  ndelay(400);
+  spin_unlock_irqrestore(&ide_lock, flags);
   return (*handler) (drive);
  }
 }
@@ -938,8 +943,12 @@
   cmd_len = ATAPI_MIN_CDB_BYTES;
 
  /* Send the command to the device. */
+ unsigned long flags; 
+ spin_lock_irqsave(&ide_lock, flags);
  HWIF(drive)->atapi_output_bytes(drive, rq->cmd, cmd_len);
-
+ ndelay(400);
+ spin_unlock_irqrestore(&ide_lock, flags);
+ 
  /* Start the DMA if need be */
  if (info->dma)
   hwif->dma_start(drive);
@@ -1649,6 +1658,8 @@
  xfer_func_t *xferfunc;
  unsigned long flags;
 
+ /* delay here enough to ensure the command has had time to complete and 
decide if its going to set DRQ */
+ ndelay(600);
  /* Check for errors. */
  dma_error = 0;
  dma = info->dma;
@@ -1665,7 +1676,6 @@
   */
  if (dma) {
   if (dma_error) {
-   printk("ide-cd: dma error\n");
    __ide_dma_off(drive);
    return DRIVER(drive)->error(drive, "dma error", stat);
   }
@@ -1737,10 +1747,19 @@
   if (blen > thislen)
    blen = thislen;
 
-  xferfunc(drive, ptr, blen);
-
   thislen -= blen;
   len -= blen;
+  
+  if((thislen <=0) && (len<=0)) {
+   unsigned long flags;
+   /* this is the last of this output */
+   spin_lock_irqsave(&ide_lock, flags);
+   xferfunc(drive, ptr, blen);
+   ndelay (400);
+   spin_unlock_irqrestore(&ide_lock, flags);
+  } else
+   xferfunc(drive,ptr,blen);
+  
   rq->data_len -= blen;
 
   if (rq->bio)
@@ -1748,7 +1767,7 @@
   else
    rq->data += blen;
  }
-
+ 
  /*
   * pad, if necessary
   */
@@ -1756,10 +1775,19 @@
   while (len > 0) {
    int pad = 0;
 
-   xferfunc(drive, &pad, sizeof(pad));
    len -= sizeof(pad);
+   if (len <= 0) {
+    unsigned long flags;
+
+    /* last output */
+    spin_lock_irqsave(&ide_lock,flags);
+    xferfunc(drive, &pad, sizeof(pad));
+    ndelay(400);
+    spin_unlock_irqrestore(&ide_lock,flags);
+   } else
+    xferfunc(drive, &pad, sizeof(pad));
   }
- }
+ } 
 
  if (HWGROUP(drive)->handler != NULL)
   BUG();



-- 
Alan Chandler
alan@chandlerfamily.org.uk
First they ignore you, then they laugh at you,
 then they fight you, then you win. --Gandhi

  parent reply	other threads:[~2004-11-30  9:03 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-20 18:42 ide-cd problem Alan Chandler
2004-11-20 19:47 ` Jens Axboe
2004-11-21  0:53   ` Alan Chandler
2004-11-21  8:56     ` Jens Axboe
2004-11-21 10:25       ` Alan Chandler
2004-11-21 16:13         ` Alan Chandler
2004-11-22  7:52           ` Alan Chandler
2004-11-22  8:01             ` Jens Axboe
2004-11-22 10:30               ` Alan Chandler
2004-11-22 10:51                 ` Jens Axboe
2004-11-22 11:29                   ` Alan Chandler
2004-11-22 11:31                     ` Jens Axboe
2004-11-22 12:53                       ` Alan Chandler
2004-11-22 13:02                         ` Jens Axboe
2004-11-22 19:19                           ` Alan Chandler
2004-11-22 23:48                             ` Alan Chandler
2004-11-23  7:13                               ` Alan Chandler
2004-11-23 14:51                                 ` Jens Axboe
2004-11-23 21:49                                   ` Alan Chandler
2004-11-26 23:39                                     ` Alan Chandler
2004-11-29 17:29                                       ` Bill Davidsen
2004-11-30  8:59                                     ` Alan Chandler [this message]
2004-12-10 21:32                                       ` ide-cd problem revisited - more brainpower needed Alan Chandler
2004-12-10 23:14                                         ` Alan Cox
2004-12-12  0:17                                           ` Alan Chandler
2004-12-12 11:39                                             ` Alan Cox
2004-12-12 13:34                                               ` Alan Chandler
2004-12-14  0:20                                                 ` Alan Chandler
2004-12-16 15:56                                                   ` Bill Davidsen
2004-12-17 23:59                                                     ` Alan Chandler
2004-11-24 23:19                           ` ide-cd problem Alan Cox
2004-11-25 15:29                             ` Jens Axboe
2004-11-25 16:25                               ` Alan Cox
2004-11-25 18:12                                 ` Jens Axboe
2004-11-25 18:45                                   ` Alan Chandler
2004-11-23 18:34               ` Jeff Garzik
2004-11-23 19:13                 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200411300859.43422.alan@chandlerfamily.org.uk \
    --to=alan@chandlerfamily.org.uk \
    --cc=axboe@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).