* 2.4.19 SCSI core bug? @ 2002-09-11 21:19 Russell King 2002-09-12 9:01 ` Russell King 0 siblings, 1 reply; 6+ messages in thread From: Russell King @ 2002-09-11 21:19 UTC (permalink / raw) To: linux-kernel Hi, I've just been running some of my tests on 2.4.19 on ARM, and came across the following bug, which seems to be somewhere in the bowls of the SCSI layer. I'm using drivers/acorn/scsi/powertec.c, which uses fas216.c as the core chipset driver (I authored both of these files) which contains some pretty rigorous sanity checking. Basically, what seems to be happening is that scsi can go wrong after a medium error, and repeatedly pass invalid requests back down to the HBA. My HBA driver dislikes invalid requests thusly: scsi0.2: bad request buffer length 38400, should be 130048 38400 is SCpnt->request_bufflen. 130048 is the total number of bytes in the SG list. Dumping out the offending command: Read (10) 00 00 00 46 00 00 00 fe 00 Ok, so we were asking for 0xfe 512-byte sectors, which is 130048. So why did SCSI tell me that it wanted 38400 bytes in SCpnt->request_bufflen? My workaround for this is to forcefully set request_bufflen correctly, which appears to work. However, the SCSI subsystem completely stops making progress at this point. (It just repeats the above command indefinitely.) -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: 2.4.19 SCSI core bug? 2002-09-11 21:19 2.4.19 SCSI core bug? Russell King @ 2002-09-12 9:01 ` Russell King 2002-09-12 15:39 ` Mike Anderson 0 siblings, 1 reply; 6+ messages in thread From: Russell King @ 2002-09-12 9:01 UTC (permalink / raw) To: linux-kernel On Wed, Sep 11, 2002 at 10:19:00PM +0100, Russell King wrote: > Ok, so we were asking for 0xfe 512-byte sectors, which is 130048. > So why did SCSI tell me that it wanted 38400 bytes in > SCpnt->request_bufflen? Ok, problem found. There's a nice loop in scsi_send_eh_cmnd() which just loops endlessly trying to retry a SCpnt command on medium error without restoring it to its pristine state before giving it back to the host, or limiting the number of retries. The end result is that the SCSI command says N sectors, the request list may say N * sector size bytes, but SCpnt->request_bufflen may be complete rubbish, since the HBA driver is allowed to modify this field during the processing of a command. We do have a specific function to restore the SCSI command data to a pristine state (scsi_eh_retry_command). Ok, so there's two bugs here: 1. It's possible for SCSI to completely bring a box to a complete standstill when it encounters a SCSI error. 2. Not retrying the command in its correct state. I'll look at cooking up a patch for both of these in the next few days or so. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: 2.4.19 SCSI core bug? 2002-09-12 9:01 ` Russell King @ 2002-09-12 15:39 ` Mike Anderson 2002-09-12 15:42 ` Russell King 0 siblings, 1 reply; 6+ messages in thread From: Mike Anderson @ 2002-09-12 15:39 UTC (permalink / raw) To: Russell King; +Cc: linux-kernel Russell King [rmk@arm.linux.org.uk] wrote: > On Wed, Sep 11, 2002 at 10:19:00PM +0100, Russell King wrote: > > Ok, so we were asking for 0xfe 512-byte sectors, which is 130048. > > So why did SCSI tell me that it wanted 38400 bytes in > > SCpnt->request_bufflen? > > Ok, problem found. > > There's a nice loop in scsi_send_eh_cmnd() which just loops endlessly > trying to retry a SCpnt command on medium error without restoring it > to its pristine state before giving it back to the host, or limiting > the number of retries. The scsi_eh_completed_normally function should be limiting your retries. So you should not be looping endlessly unless the problem of sending down a dirty command is causing another issue. I have a cleanup patch for 2.5 scsi_error I will add this fix in. scsi_send_eh_cmnd should not be retrying the command it should return to the caller the status and let them decide. We also should create a ?restore_scsi_cb? function that is shared so that it is done consistently. Eventually the retry policy is going to be changed, but until then we should fix this problem. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: 2.4.19 SCSI core bug? 2002-09-12 15:39 ` Mike Anderson @ 2002-09-12 15:42 ` Russell King 2002-09-12 16:45 ` Russell King 0 siblings, 1 reply; 6+ messages in thread From: Russell King @ 2002-09-12 15:42 UTC (permalink / raw) To: Mike Anderson; +Cc: linux-kernel On Thu, Sep 12, 2002 at 08:39:23AM -0700, Mike Anderson wrote: > I have a cleanup patch for 2.5 scsi_error I will add this fix in. > scsi_send_eh_cmnd should not be retrying the command it should return to > the caller the status and let them decide. We also should create a > ?restore_scsi_cb? function that is shared so that it is done > consistently. I've got a patch for both of these now. I'm working through some other issues at the moment though (like, why the hell requests for sectors after the sector in error don't have a data phase, and the drive returns status = GOOD, message = COMMAND COMPLETE, leading the kernel to believe that it did transfer data.) -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: 2.4.19 SCSI core bug? 2002-09-12 15:42 ` Russell King @ 2002-09-12 16:45 ` Russell King 2002-09-12 9:41 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 6+ messages in thread From: Russell King @ 2002-09-12 16:45 UTC (permalink / raw) To: Mike Anderson; +Cc: linux-kernel On Thu, Sep 12, 2002 at 04:42:01PM +0100, Russell King wrote: > On Thu, Sep 12, 2002 at 08:39:23AM -0700, Mike Anderson wrote: > > I have a cleanup patch for 2.5 scsi_error I will add this fix in. > > scsi_send_eh_cmnd should not be retrying the command it should return to > > the caller the status and let them decide. We also should create a > > ?restore_scsi_cb? function that is shared so that it is done > > consistently. > > I've got a patch for both of these now. I'm working through some other > issues at the moment though (like, why the hell requests for sectors > after the sector in error don't have a data phase, and the drive returns > status = GOOD, message = COMMAND COMPLETE, leading the kernel to believe > that it did transfer data.) <mode=fact> Ok, I've found the cause of this one. Lets assume we're issuing a READ10 command (transfer 1) for blocks N to N + L, and N + B is bad. ( L >= 0, B >= 0, B < L ) We successfully transfer blocks N to N + (B - 1) inclusive to the host without problem. We think B is bad, so we then request N + B + 1 to N + L blocks from the device using a READ10 command (transfer 2.) The device immediately comes back without any data, but with GOOD status and COMMAND COMPLETED message. In other words, for transfer 2's command, it goes through the states: Message Out -> Command Out -> Status In -> Message In Now, the interesting thing is that the READ10 commands have a special bit - FUA or Force Unit Access. When I set this bit in _all_ READ10 commands, I _do not_ see this behaviour. <mode=theory> It turns out that my drive appears to cache the request for transfer 1 (blocks N to N + L), and tries to satisfy the request from its cache. However, since block N + B was in error, it appears fill its cache only for N to N + (B - 1), but reserve N to N + L. When a request with the FUA bit clear for a block between N + B and N + L is received, it tries to satisfy it from its cache, which doesn't contain the data. It thus responds without a data phase, indicating that the command was successful and without error. <mode=fact> This means that, using READ10, it is possible to issue READ commands to the drive, and have the drive respond without any error, yet NOT actually read _or_ transfer any data to the host. Think about what can happen to your ext2fs superblocks / bitmaps / inode tables in this situation, and what happens if something writes back the page to the drive... I think we have two options: 1. we need to come up with a way of reasonably handing SCSI medium errors if we are going to use the READ10 command with the FUA bit clear. 2. we could just set the FUA bit and bypass the drives on-board cache completely. Note that the only reason I've found this is because my HBA drives are _really_ pedantic about checking that all expected data does in fact get transferred by the drive. I wonder how many other drives out there are buggy like this. 8/ -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: 2.4.19 SCSI core bug? 2002-09-12 16:45 ` Russell King @ 2002-09-12 9:41 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 6+ messages in thread From: Benjamin Herrenschmidt @ 2002-09-12 9:41 UTC (permalink / raw) To: Russell King, Mike Anderson; +Cc: linux-kernel >1. we need to come up with a way of reasonably handing SCSI medium > errors if we are going to use the READ10 command with the FUA > bit clear. > >2. we could just set the FUA bit and bypass the drives on-board > cache completely. > >Note that the only reason I've found this is because my HBA drives >are _really_ pedantic about checking that all expected data does >in fact get transferred by the drive. > >I wonder how many other drives out there are buggy like this. 8/ What about setting FUA on the next command after an error ? Would this work or may the disk really keep stale data around ? I don't have a recent SCSI spec at hand, but is there a command we can send after an error to force a cache flush & invalidate or will it only flush dirty datas to platter and not invalidate ? Ben. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2002-09-12 21:18 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2002-09-11 21:19 2.4.19 SCSI core bug? Russell King 2002-09-12 9:01 ` Russell King 2002-09-12 15:39 ` Mike Anderson 2002-09-12 15:42 ` Russell King 2002-09-12 16:45 ` Russell King 2002-09-12 9:41 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).