linux-kernel.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	linux-ide@vger.kernel.org, apiszcz@solarrain.com
Subject: Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
Date: Thu, 6 Dec 2007 15:05:11 -0800
Message-ID: <20071206150511.e0dd0b07.akpm@linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0712061737500.8523@p34.internal.lan>

On Thu, 6 Dec 2007 17:38:08 -0500 (EST)
Justin Piszcz <jpiszcz@lucidpixels.com> wrote:

> 
> 
> On Thu, 6 Dec 2007, Andrew Morton wrote:
> 
> > On Sat, 1 Dec 2007 06:26:08 -0500 (EST)
> > Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
> >
> >> I am putting a new machine together and I have dual raptor raid 1 for the
> >> root, which works just fine under all stress tests.
> >>
> >> Then I have the WD 750 GB drives (not RE2, desktop ones for ~150-160 on
> >> sale nowadays):
> >>
> >> I ran the following:
> >>
> >> dd if=/dev/zero of=/dev/sdc
> >> dd if=/dev/zero of=/dev/sdd
> >> dd if=/dev/zero of=/dev/sde
> >>
> >> (as it is always a very good idea to do this with any new disk)
> >>
> >> And sometime along the way(?) (I had gone to sleep and let it run), this
> >> occurred:
> >>
> >> [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0x2 frozen
> >
> > Gee we're seeing a lot of these lately.
> >
> >> [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed
> >> [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in
> >> [42880.680292]          res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10 (ATA bus error)
> >> [42881.841899] ata3: soft resetting port
> >> [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >> [42915.919042] ata3.00: qc timeout (cmd 0xec)
> >> [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
> >> [42915.919149] ata3.00: revalidation failed (errno=-5)
> >> [42915.919206] ata3: failed to recover some devices, retrying in 5 secs
> >> [42920.912458] ata3: hard resetting port
> >> [42926.411363] ata3: port is slow to respond, please be patient (Status 0x80)
> >> [42930.943080] ata3: COMRESET failed (errno=-16)
> >> [42930.943130] ata3: hard resetting port
> >> [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> >> [42931.413523] ata3.00: configured for UDMA/133
> >> [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4)
> >> [42931.413655] ata3: EH complete
> >> [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB)
> >> [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off
> >> [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> >> [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >>
> >> Usually when I see this sort of thing with another box I have full of
> >> raptors, it was due to a bad raptor and I never saw it again after I
> >> replaced the disk that it happened on, but that was using the Intel P965
> >> chipset.
> >>
> >> For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of
> >> the drives (2 raptors, 3 750s connected to the Intel ICH9 Southbridge).
> >>
> >> I am going to do some further testing but does this indicate a bad drive?
> >> Bad cable?  Bad connector?
> >>
> >> As you can see above, /dev/sdc stopped responding for a little bit and
> >> then the kernel reset the port.
> >>
> >> Why is this though?  What is the likely root cause?  Should I replace the
> >> drive?  Obviously this is not normal and cannot be good at all, the idea
> >> is to put these drives in a RAID5 and if one is going to timeout that is
> >> going to cause the array to go degraded and thus be worthless in a raid5
> >> configuration.
> >>
> >> Can anyone offer any insight here?
> >
> > It would be interesting to try 2.6.21 or 2.6.22.
> >
> 
> This was due to NCQ issues (disabling it fixed the problem).
> 

I cannot locate any further email discussion on this topic.

Disabling NCQ at either compile time or runtime is not a "fix" and further
work should be done here to make the kernel run acceptably on that
hardware.
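
For reference, the runtime route mentioned above can be sketched roughly as
follows. This assumes the disk exposes the libata queue_depth attribute in
sysfs (sdc is used only because it is the device in the log). The function
name ncq_off and its optional second argument are made up for illustration
and are not part of any kernel tool. Setting the depth to 1 stops queued
commands from being issued to the drive; it is a workaround for testing,
not a fix.

```shell
#!/bin/sh
# Sketch: cap a disk's command queue depth at 1 so that NCQ (queued)
# commands are no longer issued to it. ncq_off is a hypothetical helper.
#
# usage: ncq_off DEVICE [SYSFS_ROOT]
#   DEVICE      block device name without /dev/, e.g. sdc
#   SYSFS_ROOT  defaults to /sys; overridable so the function can be
#               tried against a scratch directory first
ncq_off() {
    dev=$1
    sysfs_root=${2:-/sys}
    qd_file="$sysfs_root/block/$dev/device/queue_depth"

    # Bail out if the attribute is missing or we lack permission (needs root).
    if [ ! -w "$qd_file" ]; then
        echo "ncq_off: cannot write $qd_file" >&2
        return 1
    fi

    old=$(cat "$qd_file")
    echo 1 > "$qd_file"
    # Report the value before and after the change.
    echo "$dev: queue_depth $old -> $(cat "$qd_file")"
}
```

Run as root, something like "ncq_off sdc" would apply it to the drive from
the log. The setting does not survive a reboot, so it has to be reapplied
for each test run.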
