* [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O
@ 2013-07-29  3:20 bugzilla-daemon
  2013-07-29  3:21 ` [Bug 60644] " bugzilla-daemon
                   ` (51 more replies)
  0 siblings, 52 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29  3:20 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

            Bug ID: 60644
           Summary: MPT2SAS drops all HDDs when under high I/O
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 3.11
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Other
          Assignee: scsi_drivers-other@kernel-bugs.osdl.org
          Reporter: liveaxle@live.com
        Regression: No

I have an issue that I have not been able to solve no matter what I do. My
ASRock board comes with an onboard SAS controller (LSI 2308), and ever since I
received it, it has done one thing: it drops all HDDs connected to it.

It happens only under heavy I/O, after a few minutes. I can reproduce
it easily by running dd, md5deep, or even btrfs scrub.

The kernel locks up, I can't even shut it down from the console, and a quick ls
/dev/disk/by-id shows that all the HDDs connected to the SAS controller have
disappeared.

It happens with the stable kernels (3.9 and 3.10.3) and with mainline
(3.11-rc2) as of today.

It's not a hardware issue, because I installed Windows Server 2012 on the
same machine with a few HDDs I had lying around, beat the controller into
the ground, and it never hung. So I know it's a Linux-specific issue.

dmesg logs from before and after the issue are attached.

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
@ 2013-07-29  3:21 ` bugzilla-daemon
  2013-07-29  3:32 ` bugzilla-daemon
                   ` (50 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29  3:21 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #1 from liveaxle@live.com ---
Created attachment 107032
  --> https://bugzilla.kernel.org/attachment.cgi?id=107032&action=edit
dmesg logs


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
  2013-07-29  3:21 ` [Bug 60644] " bugzilla-daemon
@ 2013-07-29  3:32 ` bugzilla-daemon
  2013-07-29 11:27 ` bugzilla-daemon
                   ` (49 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29  3:32 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #2 from liveaxle@live.com ---
The kernel lockups are rather "soft": the machine still functions, but the HDD
activity LED stays on and the kernel doesn't respond to a reboot or shutdown
command from the console.

It has to be hard-reset using the power button.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
  2013-07-29  3:21 ` [Bug 60644] " bugzilla-daemon
  2013-07-29  3:32 ` bugzilla-daemon
@ 2013-07-29 11:27 ` bugzilla-daemon
  2013-07-29 14:12 ` bugzilla-daemon
                   ` (48 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29 11:27 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Sreekanth Reddy <sreekanth.reddy@lsi.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sreekanth.reddy@lsi.com

--- Comment #3 from Sreekanth Reddy <sreekanth.reddy@lsi.com> ---
Hi, 

Can you please provide the driver logs after setting the driver logging level
to 0x3f8?

Here are the steps to set the mpt2sas driver logging level:

a. While loading the driver:
        modprobe mpt2sas logging_level=0x3f8

b. If the driver is in the ramdisk, then on RHEL5/SLES/OEL5 add the following
line to /etc/modprobe.conf and reboot the system:
    options mpt2sas logging_level=0x3f8
                (Or)
Add the parameter below to the end of the kernel parameters line in
/boot/grub/menu.lst or /boot/grub/grub.conf and reboot the system:
    mpt2sas.logging_level=0x3f8

c. At driver run time:
         echo 0x3f8 > /sys/module/mpt2sas/parameters/logging_level
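
The run-time step (c) can be wrapped in a small helper. This is only a sketch:
the function name set_mpt2sas_logging_level and the MPT2SAS_PARAM_DIR variable
are made up for illustration; only the sysfs path and the 0x3f8 value come from
the steps above. It has to run as root with the module loaded.

```shell
# Illustrative names only; neither the variable nor the function is part of
# the mpt2sas driver.
MPT2SAS_PARAM_DIR="${MPT2SAS_PARAM_DIR:-/sys/module/mpt2sas/parameters}"

# set_mpt2sas_logging_level LEVEL
# Writes LEVEL to the logging_level parameter and prints the value read back.
set_mpt2sas_logging_level() {
    level="$1"
    param="$MPT2SAS_PARAM_DIR/logging_level"
    if [ ! -w "$param" ]; then
        echo "cannot write $param (module not loaded, or not root?)" >&2
        return 1
    fi
    echo "$level" > "$param" || return 1
    # Read the value back so the caller can see what is now in effect.
    cat "$param"
}

# Usage (as root): set_mpt2sas_logging_level 0x3f8
```

The value read back from sysfs may be printed in a different number format
than the one written.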

Also, please tell us the I/O rate at which you are facing this problem.

Regards,
Sreekanth


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (2 preceding siblings ...)
  2013-07-29 11:27 ` bugzilla-daemon
@ 2013-07-29 14:12 ` bugzilla-daemon
  2013-07-29 14:14 ` bugzilla-daemon
                   ` (47 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29 14:12 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #4 from liveaxle@live.com ---
Hi.

I'll attach it; however, dmesg only shows the last 16000 events. I hope that
will be enough.

Sorry for being a noob, this is my first bug report, but can you tell me how
I can find the exact I/O rate?
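
One way to get a rough I/O rate, assuming a kernel that exposes
/proc/diskstats, is to sample the sector counters twice and divide by the
interval. The helper names below (read_sectors, sectors_to_kib_per_s, io_rate)
and the DISKSTATS variable are illustrative only; the iostat tool from sysstat
reports similar figures if it is installed.

```shell
# Illustrative sketch; DISKSTATS can be overridden for testing.
DISKSTATS="${DISKSTATS:-/proc/diskstats}"

# read_sectors DEV
# Prints "sectors_read sectors_written" (fields 6 and 10) for one device.
read_sectors() {
    awk -v dev="$1" '$3 == dev { print $6, $10 }' "$DISKSTATS"
}

# sectors_to_kib_per_s BEFORE AFTER SECONDS
# /proc/diskstats counts 512-byte sectors, so 2 sectors = 1 KiB.
sectors_to_kib_per_s() {
    echo $(( ($2 - $1) / 2 / $3 ))
}

# io_rate DEV [SECONDS]
# Samples the device twice and prints average read and write throughput.
io_rate() {
    dev="$1"
    interval="${2:-5}"
    set -- $(read_sectors "$dev")
    if [ $# -ne 2 ]; then
        echo "device $dev not found in $DISKSTATS" >&2
        return 1
    fi
    r0="$1"; w0="$2"
    sleep "$interval"
    set -- $(read_sectors "$dev")
    echo "read: $(sectors_to_kib_per_s "$r0" "$1" "$interval") KiB/s," \
         "write: $(sectors_to_kib_per_s "$w0" "$2" "$interval") KiB/s"
}

# Usage: io_rate sdc 5
```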

It doesn't happen under the daily workload, though (rsync cronjob, writing a
gzipped root backup to the RAID).

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (3 preceding siblings ...)
  2013-07-29 14:12 ` bugzilla-daemon
@ 2013-07-29 14:14 ` bugzilla-daemon
  2013-07-29 23:24 ` bugzilla-daemon
                   ` (46 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29 14:14 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #5 from liveaxle@live.com ---
Created attachment 107033
  --> https://bugzilla.kernel.org/attachment.cgi?id=107033&action=edit
dmeg logs 2


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (4 preceding siblings ...)
  2013-07-29 14:14 ` bugzilla-daemon
@ 2013-07-29 23:24 ` bugzilla-daemon
  2013-07-29 23:30 ` bugzilla-daemon
                   ` (45 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29 23:24 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #6 from liveaxle@live.com ---
Hi again.

I monitored the I/O of the RAID array using the iostat tool.

I'll attach the output.

One thing I noticed is that monitoring the RAID array made it survive a LOT
longer than before.

I simply used dd to dump 300G of zeros into the array while running
md5deep on the entire mountpoint at the same time.

It stopped after writing around 280G this time. I was surprised, because it
had never exceeded 77G before.

Tell me if you need me to do anything else.

Thank you very much.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (5 preceding siblings ...)
  2013-07-29 23:24 ` bugzilla-daemon
@ 2013-07-29 23:30 ` bugzilla-daemon
  2013-07-30  5:00 ` bugzilla-daemon
                   ` (44 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-29 23:30 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #7 from liveaxle@live.com ---
Created attachment 107038
  --> https://bugzilla.kernel.org/attachment.cgi?id=107038&action=edit
iostat log


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (6 preceding siblings ...)
  2013-07-29 23:30 ` bugzilla-daemon
@ 2013-07-30  5:00 ` bugzilla-daemon
  2013-07-30  5:26 ` bugzilla-daemon
                   ` (43 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30  5:00 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #8 from Sreekanth Reddy <sreekanth.reddy@lsi.com> ---
Hi,

Can you please provide me the /var/log/messages file, as the dmesg logs are not
enough to analyze this issue?

Thanks,
Sreekanth


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (7 preceding siblings ...)
  2013-07-30  5:00 ` bugzilla-daemon
@ 2013-07-30  5:26 ` bugzilla-daemon
  2013-07-30  5:33 ` bugzilla-daemon
                   ` (42 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30  5:26 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #9 from liveaxle@live.com ---
Hi.

OK. The journal for this entire day will be attached. It starts at 12:00 AM.

To save you time, the mpt2sas errors start at the 03:19:26 mark.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (8 preceding siblings ...)
  2013-07-30  5:26 ` bugzilla-daemon
@ 2013-07-30  5:33 ` bugzilla-daemon
  2013-07-30 10:03 ` bugzilla-daemon
                   ` (41 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30  5:33 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #10 from liveaxle@live.com ---
Created attachment 107041
  --> https://bugzilla.kernel.org/attachment.cgi?id=107041&action=edit
Journal 1


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (9 preceding siblings ...)
  2013-07-30  5:33 ` bugzilla-daemon
@ 2013-07-30 10:03 ` bugzilla-daemon
  2013-07-30 10:11 ` bugzilla-daemon
                   ` (40 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30 10:03 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #11 from Sreekanth Reddy <sreekanth.reddy@lsi.com> ---
Hi,

Thanks for providing the logs.

From the logs, what I observe is that the controller is going into the
non-operational state, which is why we see the message "mpt2sas0:
_base_fault_reset_work : SAS host is non-operational !!!!".

Once the controller stays in this state, the driver removes this
controller's host entry from the SCSI mid layer (i.e. the HBA's host is removed
from /sys/class/scsi_host/hostX). Hence you are observing that all the
drives attached to this controller are dropped.
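
To check a saved log for this condition, something like the following sketch
can count the messages and show the first one. The function names are
hypothetical, and the search string is just the fixed part of the message
quoted above.

```shell
# count_nonoperational LOGFILE
# Prints how many lines contain the "non-operational" message.
count_nonoperational() {
    grep -c 'SAS host is non-operational' "$1"
}

# first_nonoperational LOGFILE
# Prints the first matching line, prefixed with its line number.
first_nonoperational() {
    grep -n 'SAS host is non-operational' "$1" | head -n 1
}

# Usage: count_nonoperational /var/log/messages
#        first_nonoperational /var/log/messages
```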

But I am still not sure why the controller enters the non-operational state.
So I would like to reproduce this locally. Can you please help me
reproduce this issue, i.e. tell me the steps, utilities, and
commands you used to reproduce it?

Regards,
Sreekanth


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (10 preceding siblings ...)
  2013-07-30 10:03 ` bugzilla-daemon
@ 2013-07-30 10:11 ` bugzilla-daemon
  2013-07-30 10:12 ` bugzilla-daemon
                   ` (39 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30 10:11 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #12 from liveaxle@live.com ---
Hi.

I can easily and quickly reproduce this issue by running:

btrfs scrub start /MOUNTPOINT

The btrfs filesystem is a RAID1 that consists of 5 drives.


It also happens on an MD-RAID0 that consists of 3 drives when running somewhat
harsh commands like:

dd if=/dev/zero of=/MOUNTPOINT/dd.img bs=1G count=300

and/or:

md5deep -r /MOUNTPOINT


My CPU is an Ivy Bridge i5, with 32GB of RAM. (Watching htop, the CPU never
reaches 30% load.)

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (11 preceding siblings ...)
  2013-07-30 10:11 ` bugzilla-daemon
@ 2013-07-30 10:12 ` bugzilla-daemon
  2013-08-01 13:20 ` bugzilla-daemon
                   ` (38 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-07-30 10:12 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #13 from liveaxle@live.com ---
One more thing: the MD-RAID0 has XFS, but that doesn't matter because it used
to have ext4 with the same results.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (12 preceding siblings ...)
  2013-07-30 10:12 ` bugzilla-daemon
@ 2013-08-01 13:20 ` bugzilla-daemon
  2013-08-01 14:24 ` bugzilla-daemon
                   ` (37 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-01 13:20 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #14 from liveaxle@live.com ---
Hi .

Any updates regarding this bug ?


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (13 preceding siblings ...)
  2013-08-01 13:20 ` bugzilla-daemon
@ 2013-08-01 14:24 ` bugzilla-daemon
  2013-08-01 14:30 ` bugzilla-daemon
                   ` (36 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-01 14:24 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #15 from Sreekanth Reddy <sreekanth.reddy@lsi.com> ---
(In reply to liveaxle from comment #14)
> Hi .

> Any updates regarding this bug ?

I tried to reproduce this issue locally, but I could not
reproduce it.

Here are the steps I followed:

1. Created a RAID0 volume on two 500 GB SAS drives.
2. Created an ext4 file system on it.
3. Mounted this FS at /mnt.
4. Ran I/O using the command 'dd if=/dev/zero of=/mnt/dd.img bs=1G count=300'.

Result:
The I/O ran successfully without any issue.

Please let me know whether I have missed any steps while reproducing this
issue.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (14 preceding siblings ...)
  2013-08-01 14:24 ` bugzilla-daemon
@ 2013-08-01 14:30 ` bugzilla-daemon
  2013-08-02 10:44 ` bugzilla-daemon
                   ` (35 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-01 14:30 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #16 from liveaxle@live.com ---
Hello.

The thing is that I'm using SATA drives, not SAS drives. The motherboard
exposes the LSI controller as 8 SATA ports.

This wasn't an issue under Windows Server 2012, so I think a hardware fault is
very unlikely to be the cause here.

Sorry if I'm asking too much, but can you try to create a BTRFS RAID1,
fill it with data, and then run:

btrfs scrub start /MOUNTPOINT

It always reproduces the issue in less than 2 minutes.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (15 preceding siblings ...)
  2013-08-01 14:30 ` bugzilla-daemon
@ 2013-08-02 10:44 ` bugzilla-daemon
  2013-08-27  6:48 ` bugzilla-daemon
                   ` (34 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-02 10:44 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #17 from liveaxle@live.com ---
Hi.

Today I ran some tests on all 8 drives connected to the LSI 2308.

The results are rather surprising. The controller goes non-operational under
heavy READ workloads, while WRITE workloads always complete just fine.

I'll run more tests, but at this point I can safely say that heavy READ
operations (md5 checks, btrfs scrub, torrent file checking, etc.) are the
problem, while heavy WRITE workloads (dd, copy, rsync) always complete
successfully.
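
A read-only load of this kind can be sketched as parallel dd reads to
/dev/null. read_stress is a made-up helper name and the device names in the
usage line are placeholders; it has not been verified that this alone
reproduces the failure.

```shell
# read_stress DEV...
# Reads every given device (or file) to /dev/null in parallel, then reports
# how many of the reads failed. Returns non-zero if any read failed.
read_stress() {
    pids=""
    for dev in "$@"; do
        # With GNU dd, adding iflag=direct bypasses the page cache.
        dd if="$dev" of=/dev/null bs=1048576 2>/dev/null &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed of $# reads failed"
    [ "$failed" -eq 0 ]
}

# Usage (as root): read_stress /dev/sdc /dev/sdd /dev/sde
```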

I hope this is useful in nailing down this bug.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (16 preceding siblings ...)
  2013-08-02 10:44 ` bugzilla-daemon
@ 2013-08-27  6:48 ` bugzilla-daemon
  2013-08-27  7:14 ` bugzilla-daemon
                   ` (33 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-27  6:48 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Per Zetterlund <per@pz.se> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |per@pz.se

--- Comment #18 from Per Zetterlund <per@pz.se> ---
Looks to be the same as https://bugzilla.kernel.org/show_bug.cgi?id=59301

I'm seeing the same thing with an LSI 9211-8i card (firmware 16) with kernels
3.2, 3.5 and 3.8. My 5 SATA drives get dropped when resyncing a SW RAID-6 set.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (17 preceding siblings ...)
  2013-08-27  6:48 ` bugzilla-daemon
@ 2013-08-27  7:14 ` bugzilla-daemon
  2013-08-27 11:45 ` bugzilla-daemon
                   ` (32 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-27  7:14 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #19 from liveaxle@live.com ---
It seems that LSI isn't interested in fixing this. I also purchased a 9211-8i
card recently, and it has the same issue.

I might consider buying an Adaptec HBA to replace these LSI
controllers.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (18 preceding siblings ...)
  2013-08-27  7:14 ` bugzilla-daemon
@ 2013-08-27 11:45 ` bugzilla-daemon
  2013-08-27 11:46 ` bugzilla-daemon
                   ` (31 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-27 11:45 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Hannes Reinecke <hare@suse.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hare@suse.de

--- Comment #20 from Hannes Reinecke <hare@suse.de> ---
Created attachment 107333
  --> https://bugzilla.kernel.org/attachment.cgi?id=107333&action=edit
mpt2sas-disable-watchdog.patch

mpt2sas: add module option to disable watchdog.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (19 preceding siblings ...)
  2013-08-27 11:45 ` bugzilla-daemon
@ 2013-08-27 11:46 ` bugzilla-daemon
  2013-08-27 15:29 ` bugzilla-daemon
                   ` (30 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-27 11:46 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #21 from Hannes Reinecke <hare@suse.de> ---
Try this patch. With a bit of luck it's just the firmware becoming
sluggish under high load, so disabling the watchdog will circumvent this.
Any real error would still be handled by SCSI EH.

Keep your fingers crossed.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (20 preceding siblings ...)
  2013-08-27 11:46 ` bugzilla-daemon
@ 2013-08-27 15:29 ` bugzilla-daemon
  2013-08-28  7:57 ` bugzilla-daemon
                   ` (29 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-27 15:29 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #22 from liveaxle@live.com ---
Hi. Thank you very much for the patch, Hannes.

I compiled it into 3.11-rc7 and put "mpt2sas.disable_watchdog=1" in the
boot parameters.

It helped the driver survive longer, around an hour longer than before, but
then it failed. This time it hard-locked the machine (SSH sessions were
closed, new sessions timed out, iSCSI targets were dropped), and I had to
hard-reset the server.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (21 preceding siblings ...)
  2013-08-27 15:29 ` bugzilla-daemon
@ 2013-08-28  7:57 ` bugzilla-daemon
  2013-08-28  8:32 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-28  7:57 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #23 from Hannes Reinecke <hare@suse.de> ---
So the firmware does indeed wedge under high load. Given the issues I've had so
far with the LSI SATL, I'm not surprised.

Does the same thing happen when running on a single disk, i.e. without MD?
There have been issues with MD dropping queue limits (i.e. the 4k
physical / 512 logical block sizes you have), so MD might end up issuing
non-aligned requests, which in turn might trigger issues in the firmware
translation.

Using the devices directly without MD would rule out this possibility.
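
The block sizes mentioned here can be compared between the member disks and
the MD device through sysfs. A minimal sketch, assuming the usual
/sys/block/<dev>/queue layout; show_block_sizes and SYSFS_BLOCK are
illustrative names and the device names in the usage line are placeholders.

```shell
# Illustrative sketch; SYSFS_BLOCK can be overridden for testing.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# show_block_sizes DEV...
# Prints the logical and physical block size reported for each device.
show_block_sizes() {
    for dev in "$@"; do
        queue="$SYSFS_BLOCK/$dev/queue"
        if [ ! -r "$queue/logical_block_size" ]; then
            echo "$dev: no queue attributes under $queue" >&2
            continue
        fi
        echo "$dev logical=$(cat "$queue/logical_block_size") physical=$(cat "$queue/physical_block_size")"
    done
}

# Usage: show_block_sizes sdc sdd md0
```

If the MD device reports a smaller physical block size than its members, that
would be consistent with the limits not being propagated.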


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (22 preceding siblings ...)
  2013-08-28  7:57 ` bugzilla-daemon
@ 2013-08-28  8:32 ` bugzilla-daemon
  2013-08-28  8:44 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-28  8:32 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #24 from liveaxle@live.com ---
Hello.

Yes, I used to have the same issues with MD.

I'm now using BTRFS. I don't know whether the BTRFS RAID code was ported from
MD, but the issue is the same.

I haven't tried anything other than BTRFS and MD. Maybe I should give ZFS a
try, although it is still slower on Linux.

I'll report back as soon as possible.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (23 preceding siblings ...)
  2013-08-28  8:32 ` bugzilla-daemon
@ 2013-08-28  8:44 ` bugzilla-daemon
  2013-08-28 13:33 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-28  8:44 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Bernd Schubert <bernd.schubert@fastmail.fm> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bernd.schubert@fastmail.fm

--- Comment #25 from Bernd Schubert <bernd.schubert@fastmail.fm> ---
Have you already tried giving the controller less work, e.g. by setting
/sys/block/sdX/device/queue_depth to 1? If you can't set it, use the mpt2sas
option max_queue_depth=1. Low values of max_sgl_entries and max_sectors
might also help. Lowering all of that is not good for performance, but might
increase stability.
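
The queue_depth suggestion could be applied to several disks at once roughly
like this. set_queue_depth and SYSFS_BLOCK are hypothetical names; the only
path taken from the text above is /sys/block/sdX/device/queue_depth. The
setting does not survive a reboot.

```shell
# Illustrative sketch; SYSFS_BLOCK can be overridden for testing.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# set_queue_depth DEPTH DEV...
# Writes DEPTH to each device's queue_depth and prints "dev: old -> new".
set_queue_depth() {
    depth="$1"
    shift
    for dev in "$@"; do
        attr="$SYSFS_BLOCK/$dev/device/queue_depth"
        if [ ! -w "$attr" ]; then
            echo "$dev: cannot write $attr" >&2
            continue
        fi
        old=$(cat "$attr")
        echo "$depth" > "$attr" && echo "$dev: $old -> $(cat "$attr")"
    done
}

# Usage (as root): set_queue_depth 1 sdc sdd sde
```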


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (24 preceding siblings ...)
  2013-08-28  8:44 ` bugzilla-daemon
@ 2013-08-28 13:33 ` bugzilla-daemon
  2013-08-29  5:51 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-28 13:33 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #26 from liveaxle@live.com ---
Hi.

Thanks for your kind help, Bernd.

Setting /sys/block/sdX/device/queue_depth to 1 (for sdc through sdj)
unfortunately didn't solve the issue.

I put the following at the end of the boot line:

mpt2sas.disable_watchdog=1 mpt2sas.max_queue_depth=1 mpt2sas.max_sgl_entries=64
mpt2sas.max_sectors=64

However, the OS no longer sees the controllers, so I had to remove these
entries from the boot line.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (25 preceding siblings ...)
  2013-08-28 13:33 ` bugzilla-daemon
@ 2013-08-29  5:51 ` bugzilla-daemon
  2013-08-29 10:09 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-29  5:51 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #27 from liveaxle@live.com ---
Hi again.

I set up two disks as two separate BTRFS volumes (no RAID) and ran some tests.

One of the disks failed to complete the workload given to it, but it wasn't
dropped, and checking the mountpoint shows that it is still mounted. The error
it gives is "Stale file handle".

The other drive completed all of its workloads successfully.

It seems that RAID is indeed a problem, if not THE problem.

Thank you.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (26 preceding siblings ...)
  2013-08-29  5:51 ` bugzilla-daemon
@ 2013-08-29 10:09 ` bugzilla-daemon
  2013-08-29 19:34 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-29 10:09 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #28 from liveaxle@live.com ---
PS:

To summarize what I have in my server:

1 - BTRFS RAID1 (5 disks): FAILS

2 - BTRFS Single Data Profile (2 disks): FAILS

3 - BTRFS single-disk FS (no RAID): FAILS (but recovers without rebooting;
doesn't drop or unmount)

4 - BTRFS single-disk FS (no RAID): WORKS


I'll run one more test, this time using a leafsize of 64k. I'm not sure
whether it will help.

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (27 preceding siblings ...)
  2013-08-29 10:09 ` bugzilla-daemon
@ 2013-08-29 19:34 ` bugzilla-daemon
  2013-08-30  3:08 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-29 19:34 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #29 from liveaxle@live.com ---
Hi.

I created a BTRFS RAID0 using a leafsize of 64k.

Copying some files to the RAID results in some strange output in dmesg:

[107991.481826] sd 8:0:10:0: [sdm]
[107991.482239] Sense Key : 0x2 [current]
[107991.482657] sd 8:0:10:0: [sdm]
[107991.483067] ASC=0x4 ASCQ=0x0
[107991.483475] sd 8:0:10:0: [sdm] CDB:
[107991.483885] cdb[0]=0x2a: 2a 00 00 3a 0b 80 00 04 00 00
[107991.484337] sd 8:0:10:0: [sdm] Device not ready
[107991.484752] sd 8:0:10:0: [sdm]
[107991.485163] Result: hostbyte=0x00 driverbyte=0x08
[107991.485581] sd 8:0:10:0: [sdm]
[107991.486001] Sense Key : 0x2 [current]
[107991.486441] sd 8:0:10:0: [sdm]
[107991.486856] ASC=0x4 ASCQ=0x0
[107991.487260] sd 8:0:10:0: [sdm] CDB:
[107991.487659] cdb[0]=0x2a: 2a 00 00 3a 0f 80 00 04 00 00
[107991.488100] sd 8:0:10:0: [sdm] Device not ready
[107991.488510] sd 8:0:10:0: [sdm]
[107991.488915] Result: hostbyte=0x00 driverbyte=0x08
[107991.489321] sd 8:0:10:0: [sdm]
[107991.489715] Sense Key : 0x2 [current]
[107991.490108] sd 8:0:10:0: [sdm]
[107991.490495] ASC=0x4 ASCQ=0x0
[107991.490882] sd 8:0:10:0: [sdm] CDB:
[107991.491269] cdb[0]=0x2a: 2a 00 00 3a 13 80 00 04 00 00
[107991.491704] sd 8:0:10:0: [sdm] Device not ready
[107991.492103] sd 8:0:10:0: [sdm]
[107991.492491] Result: hostbyte=0x00 driverbyte=0x08
[107991.492882] sd 8:0:10:0: [sdm]
[107991.493291] Sense Key : 0x2 [current]
[107991.493684] sd 8:0:10:0: [sdm]
[107991.494070] ASC=0x4 ASCQ=0x0
[107991.494454] sd 8:0:10:0: [sdm] CDB:
[107991.494838] cdb[0]=0x2a: 2a 00 00 3a 17 80 00 04 00 00
[107991.495272] sd 8:0:10:0: [sdm] Device not ready
[107991.495669] sd 8:0:10:0: [sdm]
[107991.496057] Result: hostbyte=0x00 driverbyte=0x08
[107991.496446] sd 8:0:10:0: [sdm]
[107991.496835] Sense Key : 0x2 [current]
[107991.497227] sd 8:0:10:0: [sdm]
[107991.497612] ASC=0x4 ASCQ=0x0
[107991.497996] sd 8:0:10:0: [sdm] CDB:
[107991.498379] cdb[0]=0x2a: 2a 00 00 3a 1b 80 00 04 00 00
[107991.498810] sd 8:0:10:0: [sdm] Device not ready
[107991.499207] sd 8:0:10:0: [sdm]
[107991.499593] Result: hostbyte=0x00 driverbyte=0x08
[107991.500018] sd 8:0:10:0: [sdm]
[107991.500407] Sense Key : 0x2 [current]
[107991.500797] sd 8:0:10:0: [sdm]
[107991.501182] ASC=0x4 ASCQ=0x0
[107991.501567] sd 8:0:10:0: [sdm] CDB:
[107991.501951] cdb[0]=0x2a: 2a 00 00 3a 1f 80 00 04 00 00
[107991.502385] sd 8:0:10:0: [sdm] Device not ready
[107991.502782] sd 8:0:10:0: [sdm]
[107991.503170] Result: hostbyte=0x00 driverbyte=0x08
[107991.503558] sd 8:0:10:0: [sdm]
[107991.503948] Sense Key : 0x2 [current]
[107991.504338] sd 8:0:10:0: [sdm]
[107991.504723] ASC=0x4 ASCQ=0x0
[107991.505108] sd 8:0:10:0: [sdm] CDB:
[107991.505493] cdb[0]=0x2a: 2a 00 00 3a 23 80 00 04 00 00
[107991.505927] sd 8:0:10:0: [sdm] Device not ready
[107991.506343] sd 8:0:10:0: [sdm]
[107991.506730] Result: hostbyte=0x00 driverbyte=0x08
[107991.507119] sd 8:0:10:0: [sdm]
[107991.507507] Sense Key : 0x2 [current]
[107991.507896] sd 8:0:10:0: [sdm]
[107991.508296] ASC=0x4 ASCQ=0x0
[107991.508678] sd 8:0:10:0: [sdm] CDB:
[107991.509059] cdb[0]=0x2a: 2a 00 00 3a 27 80 00 04 00 00
[107991.509486] sd 8:0:10:0: [sdm] Device not ready
[107991.509881] sd 8:0:10:0: [sdm]
[107991.510267] Result: hostbyte=0x00 driverbyte=0x08
[107991.510655] sd 8:0:10:0: [sdm]
[107991.511043] Sense Key : 0x2 [current]
[107991.511430] sd 8:0:10:0: [sdm]
[107991.511815] ASC=0x4 ASCQ=0x0
[107991.512197] sd 8:0:10:0: [sdm] CDB:
[107991.512581] cdb[0]=0x2a: 2a 00 00 3a 2b 80 00 04 00 00
[107991.513044] sd 8:0:10:0: [sdm] Device not ready
[107991.513440] sd 8:0:10:0: [sdm]
[107991.513826] Result: hostbyte=0x00 driverbyte=0x08
[107991.514214] sd 8:0:10:0: [sdm]
[107991.514603] Sense Key : 0x2 [current]
[107991.514994] sd 8:0:10:0: [sdm]
[107991.515377] ASC=0x4 ASCQ=0x0
[107991.515760] sd 8:0:10:0: [sdm] CDB:
[107991.516144] cdb[0]=0x2a: 2a 00 00 3a 2f 80 00 04 00 00
[107991.516576] sd 8:0:10:0: [sdm] Device not ready
[107991.516973] sd 8:0:10:0: [sdm]
[107991.517360] Result: hostbyte=0x00 driverbyte=0x08
[107991.517749] sd 8:0:10:0: [sdm]
[107991.518138] Sense Key : 0x2 [current]
[107991.518527] sd 8:0:10:0: [sdm]
[107991.518911] ASC=0x4 ASCQ=0x0
[107991.519295] sd 8:0:10:0: [sdm] CDB:
[107991.519696] cdb[0]=0x2a: 2a 00 00 3a 33 80 00 04 00 00
[107991.520129] sd 8:0:10:0: [sdm] Device not ready
[107991.520528] sd 8:0:10:0: [sdm]
[107991.520915] Result: hostbyte=0x00 driverbyte=0x08
[107991.521305] sd 8:0:10:0: [sdm]
[107991.521695] Sense Key : 0x2 [current]
[107991.522087] sd 8:0:10:0: [sdm]
[107991.522472] ASC=0x4 ASCQ=0x0
[107991.522856] sd 8:0:10:0: [sdm] CDB:
[107991.523241] cdb[0]=0x2a: 2a 00 00 3a 37 80 00 04 00 00
[107991.523675] sd 8:0:10:0: [sdm] Device not ready
[107991.524073] sd 8:0:10:0: [sdm]
[107991.524460] Result: hostbyte=0x00 driverbyte=0x08
[107991.524850] sd 8:0:10:0: [sdm]
[107991.525241] Sense Key : 0x2 [current]
[107991.525631] sd 8:0:10:0: [sdm]
[107991.526018] ASC=0x4 ASCQ=0x0
[107991.526437] sd 8:0:10:0: [sdm] CDB:
[107991.526823] cdb[0]=0x2a: 2a 00 00 3a 3b 80 00 04 00 00
[107991.527257] sd 8:0:10:0: [sdm] Device not ready
[107991.527657] sd 8:0:10:0: [sdm]
[107991.528047] Result: hostbyte=0x00 driverbyte=0x08
[107991.528437] sd 8:0:10:0: [sdm]
[107991.528829] Sense Key : 0x2 [current]
[107991.529221] sd 8:0:10:0: [sdm]
[107991.529609] ASC=0x4 ASCQ=0x0
[107991.529996] sd 8:0:10:0: [sdm] CDB:
[107991.530382] cdb[0]=0x2a: 2a 00 00 3a 3f 80 00 04 00 00
[107991.530816] sd 8:0:10:0: [sdm] Device not ready
[107991.531216] sd 8:0:10:0: [sdm]
[107991.531604] Result: hostbyte=0x00 driverbyte=0x08
[107991.531996] sd 8:0:10:0: [sdm]
[107991.532387] Sense Key : 0x2 [current]
[107991.532780] sd 8:0:10:0: [sdm]
[107991.533184] ASC=0x4 ASCQ=0x0
[107991.533571] sd 8:0:10:0: [sdm] CDB:
[107991.533958] cdb[0]=0x2a: 2a 00 00 3a 43 80 00 04 00 00
[107991.534392] sd 8:0:10:0: [sdm] Device not ready
[107991.534791] sd 8:0:10:0: [sdm]
[107991.535180] Result: hostbyte=0x00 driverbyte=0x08
[107991.535572] sd 8:0:10:0: [sdm]
[107991.535964] Sense Key : 0x2 [current]
[107991.536358] sd 8:0:10:0: [sdm]
[107991.536747] ASC=0x4 ASCQ=0x0
[107991.537134] sd 8:0:10:0: [sdm] CDB:
[107991.537520] cdb[0]=0x2a: 2a 00 00 3a 47 80 00 04 00 00
[107991.537956] sd 8:0:10:0: [sdm] Device not ready
[107991.538357] sd 8:0:10:0: [sdm]
[107991.538746] Result: hostbyte=0x00 driverbyte=0x08
[107991.539139] sd 8:0:10:0: [sdm]
[107991.539529] Sense Key : 0x2 [current]
[107991.539954] sd 8:0:10:0: [sdm]
[107991.540340] ASC=0x4 ASCQ=0x0
[107991.540726] sd 8:0:10:0: [sdm] CDB:
[107991.541112] cdb[0]=0x2a: 2a 00 00 3a 4b 80 00 04 00 00
[107991.541542] sd 8:0:10:0: [sdm] Device not ready
[107991.541941] sd 8:0:10:0: [sdm]
[107991.542330] Result: hostbyte=0x00 driverbyte=0x08
[107991.542720] sd 8:0:10:0: [sdm]
[107991.543113] Sense Key : 0x2 [current]
[107991.543507] sd 8:0:10:0: [sdm]
[107991.543894] ASC=0x4 ASCQ=0x0
[107991.544281] sd 8:0:10:0: [sdm] CDB:
[107991.544668] cdb[0]=0x2a: 2a 00 00 3a 4f 80 00 01 80 00
[107991.545090] sd 8:0:10:0: [sdm] Device not ready
[107991.545489] sd 8:0:10:0: [sdm]
[107991.545883] Result: hostbyte=0x00 driverbyte=0x08
[107991.546277] sd 8:0:10:0: [sdm]
[107991.546684] Sense Key : 0x2 [current]
[107991.547076] sd 8:0:10:0: [sdm]
[107991.547464] ASC=0x4 ASCQ=0x0
[107991.547848] sd 8:0:10:0: [sdm] CDB:
[107991.548234] cdb[0]=0x2a: 2a 00 00 21 38 00 00 04 00 00
[107991.548669] sd 8:0:10:0: [sdm] Device not ready
[107991.549083] sd 8:0:10:0: [sdm]
[107991.549470] Result: hostbyte=0x00 driverbyte=0x08
[107991.549861] sd 8:0:10:0: [sdm]
[107991.550252] Sense Key : 0x2 [current]
[107991.550642] sd 8:0:10:0: [sdm]
[107991.551029] ASC=0x4 ASCQ=0x0
[107991.551415] sd 8:0:10:0: [sdm] CDB:
[107991.551800] cdb[0]=0x2a: 2a 00 00 21 3c 00 00 04 00 00
[107991.552235] sd 8:0:10:0: [sdm] Device not ready
[107991.552649] sd 8:0:10:0: [sdm]
[107991.553054] Result: hostbyte=0x00 driverbyte=0x08
[107991.553444] sd 8:0:10:0: [sdm]
[107991.553835] Sense Key : 0x2 [current]
[107991.554225] sd 8:0:10:0: [sdm]
[107991.554610] ASC=0x4 ASCQ=0x0
[107991.554996] sd 8:0:10:0: [sdm] CDB:
[107991.555382] cdb[0]=0x2a: 2a 00 00 21 40 00 00 04 00 00
[107991.555817] sd 8:0:10:0: [sdm] Device not ready
[107991.556232] sd 8:0:10:0: [sdm]
[107991.556621] Result: hostbyte=0x00 driverbyte=0x08
[107991.557013] sd 8:0:10:0: [sdm]
[107991.557402] Sense Key : 0x2 [current]
[107991.557792] sd 8:0:10:0: [sdm]
[107991.558181] ASC=0x4 ASCQ=0x0
[107991.558564] sd 8:0:10:0: [sdm] CDB:
[107991.558949] cdb[0]=0x2a: 2a 00 00 21 44 00 00 04 00 00
[107991.559385] sd 8:0:10:0: [sdm] Device not ready
[107991.559811] sd 8:0:10:0: [sdm]
[107991.560199] Result: hostbyte=0x00 driverbyte=0x08
[107991.560588] sd 8:0:10:0: [sdm]
[107991.560976] Sense Key : 0x2 [current]
[107991.561367] sd 8:0:10:0: [sdm]
[107991.561750] ASC=0x4 ASCQ=0x0
[107991.562132] sd 8:0:10:0: [sdm] CDB:
[107991.562516] cdb[0]=0x2a: 2a 00 00 21 48 00 00 04 00 00
[107991.562949] sd 8:0:10:0: [sdm] Device not ready
[107991.563362] sd 8:0:10:0: [sdm]
[107991.563748] Result: hostbyte=0x00 driverbyte=0x08
[107991.564135] sd 8:0:10:0: [sdm]
[107991.564524] Sense Key : 0x2 [current]
[107991.564912] sd 8:0:10:0: [sdm]
[107991.565295] ASC=0x4 ASCQ=0x0
[107991.565676] sd 8:0:10:0: [sdm] CDB:
[107991.566058] cdb[0]=0x2a: 2a 00 00 21 4c 00 00 04 00 00
[107991.566523] sd 8:0:10:0: [sdm] Device not ready
[107991.566937] sd 8:0:10:0: [sdm]
[107991.567322] Result: hostbyte=0x00 driverbyte=0x08
[107991.567708] sd 8:0:10:0: [sdm]
[107991.568095] Sense Key : 0x2 [current]
[107991.568482] sd 8:0:10:0: [sdm]
[107991.568865] ASC=0x4 ASCQ=0x0
[107991.569246] sd 8:0:10:0: [sdm] CDB:
[107991.569628] cdb[0]=0x2a: 2a 00 00 21 50 00 00 04 00 00
[107991.570058] sd 8:0:10:0: [sdm] Device not ready
[107991.570469] sd 8:0:10:0: [sdm]
[107991.570855] Result: hostbyte=0x00 driverbyte=0x08
[107991.571244] sd 8:0:10:0: [sdm]
[107991.571632] Sense Key : 0x2 [current]
[107991.572020] sd 8:0:10:0: [sdm]
[107991.572402] ASC=0x4 ASCQ=0x0
[107991.572783] sd 8:0:10:0: [sdm] CDB:
[107991.573182] cdb[0]=0x2a: 2a 00 00 21 54 00 00 04 00 00
[107991.573612] sd 8:0:10:0: [sdm] Device not ready
[107991.574023] sd 8:0:10:0: [sdm]
[107991.574409] Result: hostbyte=0x00 driverbyte=0x08
[107991.574796] sd 8:0:10:0: [sdm]
[107991.575186] Sense Key : 0x2 [current]
[107991.575576] sd 8:0:10:0: [sdm]
[107991.575960] ASC=0x4 ASCQ=0x0
[107991.576345] sd 8:0:10:0: [sdm] CDB:
[107991.576729] cdb[0]=0x2a: 2a 00 00 21 58 00 00 04 00 00
[107991.577161] sd 8:0:10:0: [sdm] Device not ready
[107991.577572] sd 8:0:10:0: [sdm]
[107991.577959] Result: hostbyte=0x00 driverbyte=0x08
[107991.578348] sd 8:0:10:0: [sdm]
[107991.578736] Sense Key : 0x2 [current]
[107991.579125] sd 8:0:10:0: [sdm]
[107991.579509] ASC=0x4 ASCQ=0x0
[107991.579924] sd 8:0:10:0: [sdm] CDB:
[107991.580307] cdb[0]=0x2a: 2a 00 00 21 5c 00 00 04 00 00


The copying process does not stop, the drives do not drop, and the mountpoint
is still intact. However, trying to read the copied files results in an
Input/Output Error.
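
For readers unfamiliar with the raw fields in the log above: opcode 0x2a is the
SCSI WRITE(10) command, Sense Key 0x2 is NOT READY, and ASC 0x04 / ASCQ 0x00 is
"logical unit not ready, cause not reportable". The following Python sketch is
not part of the original report; the function name decode_write10 is made up
for illustration. It decodes one of the logged CDBs into its starting LBA and
transfer length, assuming the standard 10-byte WRITE(10) layout.

```python
import struct


def decode_write10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # Layout (big-endian): opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length in logical blocks, control byte.
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"opcode": opcode, "lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # First CDB from the dmesg excerpt above.
    print(decode_write10("2a 00 00 3a 0b 80 00 04 00 00"))
```

For the first logged CDB this gives LBA 3804032 and a length of 1024 blocks.
Consecutive CDBs in the log advance the LBA by 0x400 (1024), which is
consistent with a run of sequential writes being rejected while the drive
reported itself as not ready.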

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (28 preceding siblings ...)
  2013-08-29 19:34 ` bugzilla-daemon
@ 2013-08-30  3:08 ` bugzilla-daemon
  2013-08-30  4:56 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-30  3:08 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Kurk <kurk@shiftmail.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kurk@shiftmail.org

--- Comment #30 from Kurk <kurk@shiftmail.org> ---
Hi Liveaxle, would you share the brand and model of your HDDs, and print here
the exact partition table you are using? Thank you

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (29 preceding siblings ...)
  2013-08-30  3:08 ` bugzilla-daemon
@ 2013-08-30  4:56 ` bugzilla-daemon
  2013-09-17 11:00 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-08-30  4:56 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #31 from liveaxle@live.com ---
Hi Kurk.

As for the HDDs, here is the list:

Hitachi_HDS5C4040ALE630 (4TB) (4 disks)

TOSHIBA_DT01ACA300 (3TB) (2 disks)

WDC_WD10EARS-00Y5B1_WD-WCAV5N165986 (1TB) (1 disk)

WDC_WD3200AAJS-00L7A0_WD-WMAV20125236 (320GB) (1 disk)

WDC_WD3200AAKX-001CA0_WD-WCAYUH130479 (320GB) (1 disk)


All of them use BTRFS, which has its own way of partitioning. When a new
BTRFS volume is created, it clears the old partition table, and fdisk -l
doesn't show any partitions.

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (30 preceding siblings ...)
  2013-08-30  4:56 ` bugzilla-daemon
@ 2013-09-17 11:00 ` bugzilla-daemon
  2013-09-20 10:13   ` Pasi Kärkkäinen
  2013-09-20 10:28 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  51 siblings, 1 reply; 54+ messages in thread
From: bugzilla-daemon @ 2013-09-17 11:00 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #32 from liveaxle@live.com ---
Hi.

This bug is still present in 3.12-rc1 as of today's tests.

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-09-17 11:00 ` bugzilla-daemon
@ 2013-09-20 10:13   ` Pasi Kärkkäinen
  0 siblings, 0 replies; 54+ messages in thread
From: Pasi Kärkkäinen @ 2013-09-20 10:13 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-scsi

On Tue, Sep 17, 2013 at 11:00:21AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=60644
> 
> --- Comment #32 from liveaxle@live.com ---
> Hi.
> 
> This bug is still present in 3.12-rc1 as of today's tests.
> 

You might want to try the latest P17 firmware as well; it has been out for a couple of weeks now.
There's not much in the changelogs, but it seems to fix some other SGPIO-related issues at least.

-- Pasi


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (31 preceding siblings ...)
  2013-09-17 11:00 ` bugzilla-daemon
@ 2013-09-20 10:28 ` bugzilla-daemon
  2013-09-20 14:38 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-09-20 10:28 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #33 from pasik@iki.fi ---

You might want to try the latest P17 firmware as well; it has been out for a
couple of weeks now. There's not much in the changelogs, but it seems to fix
some other SGPIO-related issues at least.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (32 preceding siblings ...)
  2013-09-20 10:28 ` bugzilla-daemon
@ 2013-09-20 14:38 ` bugzilla-daemon
  2013-12-05 18:00 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-09-20 14:38 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #34 from liveaxle@live.com ---
Hi.

In fact I installed P17 on both controllers (2308 as 9207, and M1015 as
9211), but nothing has changed at all.

P16 worked just fine under Windows Server 2012.

The problem lies in MPT2SAS as far as I can see.

Thank you.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (33 preceding siblings ...)
  2013-09-20 14:38 ` bugzilla-daemon
@ 2013-12-05 18:00 ` bugzilla-daemon
  2014-01-02 14:40 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2013-12-05 18:00 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Jeff Johnson <jeff.johnson@aeoncomputing.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jeff.johnson@aeoncomputing.
                   |                            |com

--- Comment #35 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
Has anyone checked operating temp of the SAS chip on the HBAs?

Max operating temp is 55C. I'm seeing this issue on a box with three 9207-8i
HBAs running zfs, and the operating temperature climbs to 64-67C before a
dropout occurs.

Temps can be checked with the lsiutil utility.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (34 preceding siblings ...)
  2013-12-05 18:00 ` bugzilla-daemon
@ 2014-01-02 14:40 ` bugzilla-daemon
  2014-01-13 17:27 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-02 14:40 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Tommy Apel <tommyapeldk@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tommyapeldk@gmail.com

--- Comment #36 from Tommy Apel <tommyapeldk@gmail.com> ---
Hello all,

I'm currently hitting this problem consistently with kernel 3.10.25 during an
MDRAID6 resync. One thing I found that stops this from happening is to
disable SERR and PERR in the BIOS; I don't know if that helps.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (35 preceding siblings ...)
  2014-01-02 14:40 ` bugzilla-daemon
@ 2014-01-13 17:27 ` bugzilla-daemon
  2014-01-13 17:31 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-13 17:27 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #37 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
I can confirm Tommy's observations about disabling PERR and SERR solving the
issue.

The motherboard I am using (Supermicro X8DTH-iF) does not have those exact BIOS
settings to control. In my case the following BIOS changes and kernel command
line arguments eliminated the issue:

MB BIOS: BIOS->Advanced->Advanced Chipset Configuration->North Bridge
Configuration->ASPM=Disabled

Linux boot command line options: pcie_aspm=off disable_msi=1 

With these changes a very intense fio run has gone four days without a single
error or issue on a Linux-ZFS filesystem.

Before these changes I could not go four hours without multiple HBAs
disappearing (in my config there are three HBAs). Sometimes the HBAs would
disappear within 15-20 minutes of benchmark runtime.
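
To confirm that settings like these actually took effect on a running system,
one can inspect the kernel command line and the ASPM policy exposed in sysfs.
The sketch below is an illustration, not something taken from this report. It
assumes a Linux kernel built with ASPM support that exposes
/sys/module/pcie_aspm/parameters/policy with the active policy shown in square
brackets; the helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed locations; both may be absent on kernels or systems without them.
CMDLINE = Path("/proc/cmdline")
ASPM_POLICY = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(text: str) -> Optional[str]:
    """Return the bracketed (active) entry from the ASPM policy list."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def cmdline_has(option: str, cmdline: str) -> bool:
    """True if the exact option appears on the kernel command line."""
    return option in cmdline.split()


if __name__ == "__main__":
    if CMDLINE.exists():
        booted_off = cmdline_has("pcie_aspm=off", CMDLINE.read_text())
        print("pcie_aspm=off on command line:", booted_off)
    if ASPM_POLICY.exists():
        print("active ASPM policy:", active_aspm_policy(ASPM_POLICY.read_text()))
```

This only reports what the kernel was asked to do; whether ASPM is really
disabled on a given link also depends on the BIOS setting mentioned above.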

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (36 preceding siblings ...)
  2014-01-13 17:27 ` bugzilla-daemon
@ 2014-01-13 17:31 ` bugzilla-daemon
  2014-01-13 18:02 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-13 17:31 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #38 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
Apologies, I forgot to add important information..

I am not running a 3.x kernel, I am running a 2.6 kernel. It would appear that
this problem is a hardware issue (LSI) and not a driver or kernel issue.

I am running:
Linux zfs-0-0.local 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux

LSI driver:
filename:       /lib/modules/2.6.32-279.14.1.el6.x86_64/extra/mpt2sas.ko
version:        18.00.01.00
license:        GPL
description:    LSI MPT Fusion SAS 2.0 Device Driver
author:         LSI Corporation <DL-MPTFusionLinux@lsi.com>
srcversion:     ED5DA691FBB263E9F3A55B1



--Jeff

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (37 preceding siblings ...)
  2014-01-13 17:31 ` bugzilla-daemon
@ 2014-01-13 18:02 ` bugzilla-daemon
  2014-01-13 19:41 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-13 18:02 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #39 from Sreekanth Reddy <sreekanth.reddy@lsi.com> ---
I am on vacation till 17th January 2014. I will have limited access to emails
during this time. For urgent issues, please contact my manager Krishna
(Krishnaraddi.Mankani@lsi.com), or call on my mobile (+91 8722810905).

Regards,
Sreekanth Reddy

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (38 preceding siblings ...)
  2014-01-13 18:02 ` bugzilla-daemon
@ 2014-01-13 19:41 ` bugzilla-daemon
  2014-01-19 12:14 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-13 19:41 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #40 from Tommy Apel <tommyapeldk@gmail.com> ---
Hello all,

I've actually been playing around with this a bit more and found that
disabling PERR and SERR might only hide the problem for a while: sustained load
over longer periods (4-5 hours) still ends in a crash. I have re-enabled PERR
and SERR and I'm running with ASPM enabled as well. The change that made it all
go away was switching the encoding on the PCIe to use "Above 4G". I also noted
that I had changed from a PCIe 2.0 SAS HBA to a PCIe 3.0 SAS HBA, and the
problem manifested itself only with the latter. Another thing to note is that I
have mixed PCIe 2 and PCIe 3 devices with a PCIe 3 capable CPU.

/Tommy

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (39 preceding siblings ...)
  2014-01-13 19:41 ` bugzilla-daemon
@ 2014-01-19 12:14 ` bugzilla-daemon
  2014-01-19 12:15 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-19 12:14 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

Konstantin <ktrackfd@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktrackfd@gmail.com

--- Comment #41 from Konstantin <ktrackfd@gmail.com> ---
Created attachment 122561
  --> https://bugzilla.kernel.org/attachment.cgi?id=122561&action=edit
dmesg showing mpt2sas errors with ibm m1015 in it mode (fw 19)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (40 preceding siblings ...)
  2014-01-19 12:14 ` bugzilla-daemon
@ 2014-01-19 12:15 ` bugzilla-daemon
  2014-01-19 12:17 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-19 12:15 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #42 from Konstantin <ktrackfd@gmail.com> ---
Created attachment 122581
  --> https://bugzilla.kernel.org/attachment.cgi?id=122581&action=edit
"zpool status" output

zpool status showing the disks configured as a raidz3 vdev.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (41 preceding siblings ...)
  2014-01-19 12:15 ` bugzilla-daemon
@ 2014-01-19 12:17 ` bugzilla-daemon
  2014-01-20 18:27 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-19 12:17 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #43 from Konstantin <ktrackfd@gmail.com> ---
Hello all,

enabling "Above 4G encoding" in the BIOS did not help in my case.

I enabled PERR and SERR as well. PCIe ASPM is forced on by the BIOS and the
kernel.

When I scrub my zpool, the system locks up, this time at 7.13% progress. After
a reset the scrubbing continues, and sometimes the system locks up a second
time.

So in general I get 1-2 lockups during a scrub, but it always finishes the
scrub without errors (of course, when the disks drop out, the zfs scrub reports
errors).

Hardware:

Case: Inter-Tech 4HU-4324L
Board: Supermicro X9SCM-F
CPU: Intel Xeon E3-1230 V2
RAM: 2x8GB ECC ( Samsung M391B1G73BH0-CH9 )
HBA: IBM ServeRAID M1015 ( IT mode, FW version 17 )
Disks: 10x 3TB WD Green ( WD30EZRX ) and 1x 3TB Hitachi ( HDS5C303 )

Software:

- Gentoo hardened, kernel 3.12.6-hardened-r4 (other kernel versions fail,
too)
- All the disks are LUKS encrypted
- A pool "rpool" for the system on an SSD
- A pool "tank" for the data on a raidz3

I have attached "zpool status" and dmesg logs (see posts above).

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (42 preceding siblings ...)
  2014-01-19 12:17 ` bugzilla-daemon
@ 2014-01-20 18:27 ` bugzilla-daemon
  2014-01-20 18:39 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-20 18:27 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #44 from Konstantin <ktrackfd@gmail.com> ---
modinfo mpt2sas

filename:       /lib/modules/3.12.6-hardened-r4/kernel/drivers/scsi/mpt2sas/mpt2sas.ko
version:        16.100.00.00
license:        GPL
description:    LSI MPT Fusion SAS 2.0 Device Driver
author:         LSI Corporation <DL-MPTFusionLinux@lsi.com>
srcversion:     17F8D55839A477BC4077B0B

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
  2013-07-29  3:20 [Bug 60644] New: MPT2SAS drops all HDDs when under high I/O bugzilla-daemon
                   ` (43 preceding siblings ...)
  2014-01-20 18:27 ` bugzilla-daemon
@ 2014-01-20 18:39 ` bugzilla-daemon
  2014-01-20 18:42 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  51 siblings, 0 replies; 54+ messages in thread
From: bugzilla-daemon @ 2014-01-20 18:39 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #45 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
I ran more detailed tests this weekend.

ASPM & MSI disabled = stable machine under zfs load

ASPM disabled / MSI enabled = stable machine under zfs load

ASPM enabled / MSI disabled = unstable, lost an HBA under zfs load


Hardware:
Supermicro X8DTH-iF, BIOS 2.1b (current)
2x Xeon X5670, 48GB DDR3 1333Mhz Reg/ECC
3x LSI 9207-8i, phase 18 firmware
36x Seagate ST32000444SS

The cause appears to be ASPM, and vulnerability to the issue may vary by
chipset. I know other motherboards mentioned in this thread have a different
chipset. I have other systems in the field with similar components but with a
C206 chipset based motherboard, and these issues are not occurring there.
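
The reasoning behind that conclusion can be made explicit with a small Python
sketch. It is only an illustration built from the three runs listed above; the
data structure and the function name are hypothetical. It checks which single
setting, when disabled, lines up with every stable run.

```python
# The three reported runs: True means the feature was enabled.
RUNS = [
    {"aspm": False, "msi": False, "stable": True},
    {"aspm": False, "msi": True, "stable": True},
    {"aspm": True, "msi": False, "stable": False},
]


def explains(factor: str, runs: list) -> bool:
    """True if every run was stable exactly when `factor` was disabled."""
    return all(run["stable"] == (not run[factor]) for run in runs)


if __name__ == "__main__":
    for factor in ("aspm", "msi"):
        print(factor, "explains the results:", explains(factor, RUNS))
```

Only ASPM fits all three runs. Note that the combination with both ASPM and
MSI enabled was not part of the reported tests, so this is a consistency check
on the available data rather than proof.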


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2014-01-20 18:42 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #46 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
Addendum/corrections to my last comment:

The LSI 9207-8i cards are running phase 17 firmware (not phase 18).
Motherboards with the Intel C602 chipset (not C206) appear to be functioning
without issue.

Apologies, not enough coffee consumed this morning.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2014-01-21  2:54 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #47 from Konstantin <ktrackfd@gmail.com> ---
Thanks for the hint about ASPM! After disabling it in the BIOS I was able to
scrub my zpool without a single issue.

# zpool status

....
scan: scrub repaired 0 in 6h32m with 0 errors on Tue Jan 21 03:15:12 2014
....

Problem solved. Proper support for PCIe ASPM would be great, though!
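
A BIOS switch is not the only place ASPM behaviour is set; the kernel also
accepts pcie_aspm= and pcie_aspm.policy= on its command line. Whether either
option avoids this bug was not tested in this thread. The sketch below only
reports which of them, if any, the running kernel was booted with; the helper
name aspm_cmdline_options is hypothetical:

```shell
#!/bin/sh
# Print every pcie_aspm-related option found in a kernel command line string.
aspm_cmdline_options() {
    (
        set -f   # no globbing while the string is split into words
        for word in $1; do
            case "$word" in
                pcie_aspm=*|pcie_aspm.policy=*) printf '%s\n' "$word" ;;
            esac
        done
    )
}

if [ -r /proc/cmdline ]; then
    found=$(aspm_cmdline_options "$(cat /proc/cmdline)")
    echo "ASPM boot options: ${found:-none}"
fi
```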


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2014-01-21 19:46 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #48 from Jeff Johnson <jeff.johnson@aeoncomputing.com> ---
Awesome! I'm glad it is working for others.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2016-03-29 15:02 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

rlung_74@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rlung_74@yahoo.com

--- Comment #49 from rlung_74@yahoo.com ---
(In reply to Jeff Johnson from comment #48)
> Awesome! I'm glad it is working for others.

Disabling ASPM did the trick for me!

It always hung within 20 minutes under heavy I/O before I disabled ASPM.

On the Supermicro X10SL7, the setting is under Advanced > Chipset Config >
System Agent Config > PCIe Config.


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2016-12-28 11:15 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

ojab@ojab.ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ojab@ojab.ru

--- Comment #50 from ojab@ojab.ru ---
Created attachment 248831
  --> https://bugzilla.kernel.org/attachment.cgi?id=248831&action=edit
Possible fix

This patch disables ASPM powersave for the controller's PCI link.
I can't reproduce the issue with ASPM enabled and this patch applied (4.9
kernel, LSI SAS 9217-8i HBA), but more testing will not hurt.
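
To check whether ASPM ends up disabled on the HBA's link, the LnkCtl line
printed by lspci -vv can be inspected. A rough sketch, assuming pciutils is
installed, the HBA is bound to the mpt3sas driver (adjust the path to mpt2sas
on older kernels), and lspci is run as root so the capability block is visible;
the helper name lnkctl_aspm_state is hypothetical:

```shell
#!/bin/sh
# Pull the ASPM state out of an lspci "LnkCtl:" line,
# e.g. "LnkCtl: ASPM Disabled; RCB 64 bytes ..." gives "Disabled".
lnkctl_aspm_state() {
    printf '%s\n' "$1" | sed -n 's/.*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

driver_dir=/sys/bus/pci/drivers/mpt3sas   # assumption: driver name on this kernel
if [ -d "$driver_dir" ] && command -v lspci >/dev/null 2>&1; then
    # Each bound device appears as a symlink named after its PCI address.
    for dev in "$driver_dir"/*:*:*.*; do
        [ -e "$dev" ] || continue
        addr=${dev##*/}
        line=$(lspci -vv -s "$addr" | grep 'LnkCtl:')
        echo "$addr: ASPM $(lnkctl_aspm_state "$line")"
    done
else
    echo "mpt3sas not bound or lspci missing; nothing to check"
fi
```

With the patch applied, the expectation would be a "Disabled" state for the
controller even when the system-wide policy allows ASPM; that expectation is
an inference from the patch description, not something shown in this thread.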


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2016-12-28 11:18 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #51 from ojab@ojab.ru ---
(as a side note, does anyone else have an issue [0] with `rmmod mpt3sas`? It's
reproducible with LSI SAS 9217-8i HBA, and I would like to know if other HBAs
are affected)

[0] https://www.spinics.net/lists/linux-scsi/msg100687.html


* [Bug 60644] MPT2SAS drops all HDDs when under high I/O
From: bugzilla-daemon @ 2017-02-19 11:55 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=60644

--- Comment #52 from ojab@ojab.ru ---
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ffdadd68af5a397b8a52289ab39d62e1acb39e63

Patch is merged, bug can be closed.


