linux-kernel.vger.kernel.org archive mirror
From: Robert Hancock <hancockr@shaw.ca>
To: "Matthew \"Cheetah\" Gabeler-Lee" <cheetah-lkmlsata@fastcat.org>
Cc: linux-kernel@vger.kernel.org, ide <linux-ide@vger.kernel.org>
Subject: Re: Frequent SATA resets with sata_nv (fwd)
Date: Sun, 24 Jun 2007 13:53:28 -0600
Message-ID: <467ECBB8.2030706@shaw.ca>
In-Reply-To: <fa.Z5bo5y2TvW4efa6slAJpcdPwqAA@ifi.uio.no>

(ccing linux-ide)

Matthew "Cheetah" Gabeler-Lee wrote:
> (Please cc me on replies)
> 
> I have three samsung hdds (/sys/block/sda/device/model says SAMSUNG 
> SP2504C) in a raid configuration.  My system frequently (2-3x/day) 
> experiences temporary lockups, which produce messages as below in my 
> dmesg/syslog.  The system recovers, but the hang is annoying to say the 
> least.
> 
> All three drives are connected to sata_nv ports.  Oddly, it almost 
> always happens on ata6 or ata7 (the second and third ports of that 4 
> port setup on my motherboard).  There is an identical drive connected at 
> ata5, but I've only once or twice seen it hit that drive.
> 
> Googling around lkml.org, I found a few threads investigating what look 
> like very similar problems, some of which never seemed to find the 
> solution, but one of which came up with a fairly quick answer it seemed, 
> namely that the drive's NCQ implementation was horked: 
> http://lkml.org/lkml/2007/4/18/32
> 
> While I don't have older logs to verify exactly when this started, it 
> was fairly recent, perhaps around my 2.6.20.1 to 2.6.21.1 kernel 
> upgrade.
> 
> Any other info or tests I can provide/run to help?
> 
> Syslog snippet:
> Jun 21 10:35:23 cheetah kernel: ata6: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 next cpb count 0x0 next cpb idx 0x0
> Jun 21 10:35:24 cheetah kernel: ata6: CPB 0: ctl_flags 0x9, resp_flags 0x0
> Jun 21 10:35:24 cheetah kernel: ata6: timeout waiting for ADMA IDLE, stat=0x400
> Jun 21 10:35:24 cheetah kernel: ata6: timeout waiting for ADMA LEGACY, stat=0x400
> Jun 21 10:35:24 cheetah kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> Jun 21 10:35:24 cheetah kernel: ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
> Jun 21 10:35:24 cheetah kernel:          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jun 21 10:35:24 cheetah kernel: ata6: soft resetting port
> Jun 21 10:35:24 cheetah kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Jun 21 10:35:24 cheetah kernel: ata6.00: configured for UDMA/133
> Jun 21 10:35:24 cheetah kernel: ata6: EH complete
> Jun 21 10:35:24 cheetah kernel: SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
> Jun 21 10:35:24 cheetah kernel: sdb: Write Protect is off
> Jun 21 10:35:24 cheetah kernel: sdb: Mode Sense: 00 3a 00 00
> Jun 21 10:35:24 cheetah kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Unfortunately, this kind of problem is rather difficult to diagnose.
Essentially, we sent a command to the controller (in this case a cache
flush, opcode 0xea in the cmd line above), but the controller gave no
indication that it did anything with it. That is somewhat different from
the case in the link you mentioned, where the controller indicates it
has sent the command and is waiting for completion. This could be a
drive issue, a drive/controller incompatibility, a controller bug, or
the driver doing something the controller doesn't expect.
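
For anyone reading along, here is a rough sketch of how the "cmd" line
in the log maps to a command name. The first byte after "cmd" is the ATA
command opcode. The lookup table is a small, partial list taken from the
ATA command set; the function name and the regular expression are my own
illustration, not anything libata provides.

```python
import re

# Partial table of ATA command opcodes (illustrative, not exhaustive).
ATA_OPCODES = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0x60: "READ FPDMA QUEUED",   # NCQ read
    0x61: "WRITE FPDMA QUEUED",  # NCQ write
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
    0xE7: "FLUSH CACHE",
    0xEA: "FLUSH CACHE EXT",
    0xEC: "IDENTIFY DEVICE",
}

# Matches e.g. "ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 ..."
CMD_RE = re.compile(r"(ata\d+\.\d+): cmd ([0-9a-f]{2})/\S+ tag (\d+)")


def decode_cmd_line(line):
    """Return (device, opcode, command name, tag), or None if no match."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    device, opcode_hex, tag = m.groups()
    opcode = int(opcode_hex, 16)
    return device, opcode, ATA_OPCODES.get(opcode, "unknown"), int(tag)


line = ("Jun 21 10:35:24 cheetah kernel: ata6.00: cmd "
        "ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0")
print(decode_cmd_line(line))
```

For the line quoted above this reports FLUSH CACHE EXT on ata6.00, which
is a non-queued command.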

Does this drive actually support NCQ? I can't tell from this part of the
log.
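
One way to check, sketched below under some assumptions: libata
normally exposes the per-device queue depth in
/sys/block/<disk>/device/queue_depth, and a depth of 1 usually means NCQ
is not in use while a larger value (commonly 31) means it is. The disk
names sda, sdb and sdc are placeholders for your three drives, and the
helper names are hypothetical.

```python
from pathlib import Path


def queue_depth(disk, sysfs_root="/sys"):
    """Read the queue depth for a disk, or None if it cannot be read."""
    path = Path(sysfs_root) / "block" / disk / "device" / "queue_depth"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def ncq_in_use(disk, sysfs_root="/sys"):
    """Assumption: a queue depth above 1 indicates NCQ is active."""
    depth = queue_depth(disk, sysfs_root)
    if depth is None:
        return None
    return depth > 1


if __name__ == "__main__":
    # Placeholder disk names; adjust to match the drives on ata5-ata7.
    for disk in ("sda", "sdb", "sdc"):
        print(disk, queue_depth(disk), ncq_in_use(disk))
```

The boot-time dmesg lines for each ataN.00 device should also mention
NCQ and its depth if the drive advertises it.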

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/

