* [Bug 90601] New: panic on write to 3ware raid array
@ 2015-01-02 14:31 bugzilla-daemon
  2015-01-02 14:32 ` [Bug 90601] " bugzilla-daemon
                   ` (28 more replies)
  0 siblings, 29 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-02 14:31 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

            Bug ID: 90601
           Summary: panic on write to 3ware raid array
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 3.16 and above
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@vger.kernel.org
          Reporter: merlin@liao.homelinux.org
        Regression: No

Created attachment 162291
  --> https://bugzilla.kernel.org/attachment.cgi?id=162291&action=edit
panic log

Since kernel version 3.16 we have been getting kernel panics, and the panic
message *might* suggest that it is related to the RAID controller we use.
I also tried kernel versions 3.17 and 3.18, and it keeps crashing.
3.15.10 runs totally fine, though.

I can reproduce the crash 100% of the time with a simple dd command:
dd if=/dev/zero of=test bs=1G count=10

I've attached the backtrace and tw_cli configuration options.
We have several cluster nodes so I can try patches or give additional
information when necessary.

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
@ 2015-01-02 14:32 ` bugzilla-daemon
  2015-01-02 14:33 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-02 14:32 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #1 from merlin@liao.homelinux.org ---
Created attachment 162301
  --> https://bugzilla.kernel.org/attachment.cgi?id=162301&action=edit
tw_cli raid config


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
  2015-01-02 14:32 ` [Bug 90601] " bugzilla-daemon
@ 2015-01-02 14:33 ` bugzilla-daemon
  2015-01-05  9:17 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-02 14:33 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #2 from merlin@liao.homelinux.org ---
I tried to mail "Adam Radford <linuxraid@lsi.com>" directly as stated in
drivers/scsi/3w-sas.c but the mail address doesn't exist anymore.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
  2015-01-02 14:32 ` [Bug 90601] " bugzilla-daemon
  2015-01-02 14:33 ` bugzilla-daemon
@ 2015-01-05  9:17 ` bugzilla-daemon
  2015-01-06 15:28 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-05  9:17 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

kashyap <kashyap.desai@lsi.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kashyap.desai@lsi.com

--- Comment #3 from kashyap <kashyap.desai@lsi.com> ---
Merlin, 

1.)
Can you enable CONFIG_DMA_API_DEBUG in your kernel config and send the complete
/var/log/messages along with the backtrace?

2.)
From the existing logs, I can see that the crash happened at the place below.

static void intel_unmap_sg(struct device *dev, struct scatterlist *sglist,
                           int nelems, enum dma_data_direction dir,
                           struct dma_attrs *attrs)
{
        intel_unmap(dev, sglist[0].dma_address);
}

sglist appears to be NULL, so the crash occurs while accessing
sglist[0].dma_address (which is at offset 0x10 as per the definition below).

struct scatterlist {
#ifdef CONFIG_DEBUG_SG
        unsigned long   sg_magic;
#endif
        unsigned long   page_link;
        unsigned int    offset;
        unsigned int    length;
        dma_addr_t      dma_address;   /* <-- offset 0x10 */
#ifdef CONFIG_NEED_SG_DMA_LENGTH
        unsigned int    dma_length;
#endif  
};
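
As a quick check of the 0x10 figure, the offset can be computed in userspace.
This is only a sketch: the struct below is a copy of the definition above with
CONFIG_DEBUG_SG and CONFIG_NEED_SG_DMA_LENGTH assumed unset, dma_addr_t is
assumed to be 64 bits wide (as on x86-64), and the names scatterlist_sketch and
dma_address_offset are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's dma_addr_t (assumed 64-bit). */
typedef uint64_t dma_addr_t;

static_assert(sizeof(dma_addr_t) == 8, "sketch assumes a 64-bit dma_addr_t");

/*
 * Copy of struct scatterlist as quoted above, with CONFIG_DEBUG_SG and
 * CONFIG_NEED_SG_DMA_LENGTH assumed unset. Hypothetical name, not the
 * kernel's own type.
 */
struct scatterlist_sketch {
        unsigned long   page_link;
        unsigned int    offset;
        unsigned int    length;
        dma_addr_t      dma_address;
};

/*
 * Byte offset of dma_address within the struct. If sglist were NULL,
 * reading sglist[0].dma_address would touch this address.
 */
static size_t dma_address_offset(void)
{
        return offsetof(struct scatterlist_sketch, dma_address);
}
```

On an LP64 x86-64 build dma_address_offset() should return 0x10, which is the
address a NULL sglist would make intel_unmap_sg read from. With CONFIG_DEBUG_SG
enabled, the extra sg_magic field would move it to 0x18.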

Crash detail -
[ 1005.899543] Hardware name: Supermicro X9DR7/E-(J)LN4F/X9DR7/E-(J)LN4F, BIOS 3.0a 09/10/2013
[ 1005.899971] task: ffffffff8160a460 ti: ffffffff815f4000 task.ti: ffffffff815f4000
[ 1005.900401] RIP: 0010:[<ffffffff813db181>]  [<ffffffff813db181>] intel_unmap_sg+0x1/0x10
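
The RIP line uses the kernel's usual symbol+offset/size notation:
intel_unmap_sg+0x1/0x10 means the instruction pointer is 0x1 bytes into a
function that is 0x10 bytes long. A minimal sketch of splitting such a string
follows; struct sym_loc and parse_sym_loc are hypothetical helper names, not
part of any kernel tool.

```c
#include <stdio.h>

/* Parsed form of a "symbol+0xOFF/0xSIZE" location from an oops. */
struct sym_loc {
        char          name[64];
        unsigned long offset;   /* bytes from the start of the function */
        unsigned long size;     /* total size of the function in bytes */
};

/* Returns 0 on success, -1 if the string does not match the format. */
static int parse_sym_loc(const char *s, struct sym_loc *out)
{
        /* %63[^+] reads the symbol name up to '+'; %lx accepts a 0x prefix. */
        int n = sscanf(s, "%63[^+]+%lx/%lx",
                       out->name, &out->offset, &out->size);

        return n == 3 ? 0 : -1;
}
```

Calling parse_sym_loc("intel_unmap_sg+0x1/0x10", &loc) fills in the name
intel_unmap_sg, offset 0x1 and size 0x10.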


~ kashyap


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (2 preceding siblings ...)
  2015-01-05  9:17 ` bugzilla-daemon
@ 2015-01-06 15:28 ` bugzilla-daemon
  2015-01-06 15:29 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-06 15:28 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #4 from merlin@liao.homelinux.org ---
Hi Kashyap,

1. I enabled CONFIG_DMA_API_DEBUG. Strangely enough, after enabling it I
couldn't even boot into the system anymore, because it would crash when trying
to access the root partition (after the kernel image was loaded). I think the
problem was the initial xfs_repair that is done on bootup.

So I took the same kernel and copied it to a USB device and booted from that on
the same host into a system that also resides on the USB disk.

Then I mounted /dev/sda (which is the first device on the raid controller which
is the root partition).

Then I tried to reproduce the panic with the simple dd that was used before
without success.
I then compiled the kernel sources with -j25 on the mounted root partition 2
times without any problems.
Then I chrooted onto that partition and did the same and could finally
reproduce the crash.

I am not sure if that backtrace is more useful though.

I have also attached the /var/log/messages but of course that crash wasn't
saved in it.
I also added my kernel config.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (3 preceding siblings ...)
  2015-01-06 15:28 ` bugzilla-daemon
@ 2015-01-06 15:29 ` bugzilla-daemon
  2015-01-06 15:30 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-06 15:29 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #5 from merlin@liao.homelinux.org ---
Created attachment 162641
  --> https://bugzilla.kernel.org/attachment.cgi?id=162641&action=edit
syslog before crash


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (4 preceding siblings ...)
  2015-01-06 15:29 ` bugzilla-daemon
@ 2015-01-06 15:30 ` bugzilla-daemon
  2015-01-06 15:48 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-06 15:30 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #6 from merlin@liao.homelinux.org ---
Created attachment 162651
  --> https://bugzilla.kernel.org/attachment.cgi?id=162651&action=edit
backtrace with CONFIG_DMA_API_DEBUG enabled


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (5 preceding siblings ...)
  2015-01-06 15:30 ` bugzilla-daemon
@ 2015-01-06 15:48 ` bugzilla-daemon
  2015-01-21 11:51 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-06 15:48 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #7 from merlin@liao.homelinux.org ---
Created attachment 162661
  --> https://bugzilla.kernel.org/attachment.cgi?id=162661&action=edit
kernel config


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (6 preceding siblings ...)
  2015-01-06 15:48 ` bugzilla-daemon
@ 2015-01-21 11:51 ` bugzilla-daemon
  2015-02-13  7:25 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-01-21 11:51 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #8 from merlin@liao.homelinux.org ---
Hi Kashyap,

Is there any additional info you need, or do you know a better way to create a
usable backtrace for you?


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (7 preceding siblings ...)
  2015-01-21 11:51 ` bugzilla-daemon
@ 2015-02-13  7:25 ` bugzilla-daemon
  2015-02-13  7:54 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-13  7:25 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

Alex Elsayed <eternaleye+kernelbugs@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eternaleye+kernelbugs@gmail
                   |                            |.com

--- Comment #9 from Alex Elsayed <eternaleye+kernelbugs@gmail.com> ---
I've been bitten by this as well, on a 3ware 9750. It's currently preventing me
from using recent kernels on the server in question, which is an issue
because, while 3.16.1 is rock-solid for SCSI, it has an intermittent wireless
oops, and 3.17.1 oopses out with this (same backtrace as the existing
attachment).


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (8 preceding siblings ...)
  2015-02-13  7:25 ` bugzilla-daemon
@ 2015-02-13  7:54 ` bugzilla-daemon
  2015-02-13  8:28 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-13  7:54 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #10 from Alex Elsayed <eternaleye+kernelbugs@gmail.com> ---
Testing with 3.19 shows it's still there for me.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (9 preceding siblings ...)
  2015-02-13  7:54 ` bugzilla-daemon
@ 2015-02-13  8:28 ` bugzilla-daemon
  2015-02-13 16:20 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-13  8:28 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #11 from merlin@liao.homelinux.org ---
I can confirm it's still happening with 3.19 here.

@Alex: What server hardware are you using? We are using Supermicro boards and
Intel CPUs.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (10 preceding siblings ...)
  2015-02-13  8:28 ` bugzilla-daemon
@ 2015-02-13 16:20 ` bugzilla-daemon
  2015-02-26  5:44 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-13 16:20 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #12 from Alex Elsayed <eternaleye+kernelbugs@gmail.com> ---
Created attachment 166681
  --> https://bugzilla.kernel.org/attachment.cgi?id=166681&action=edit
Hardware of a second machine with the issue

Intel CPUs on an ASUS board - here's the output of lshw on the machine.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (11 preceding siblings ...)
  2015-02-13 16:20 ` bugzilla-daemon
@ 2015-02-26  5:44 ` bugzilla-daemon
  2015-02-26  7:32 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-26  5:44 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #13 from kashyap <kashyap.desai@avagotech.com> ---
My old email ID no longer works, so I had to update it to the new
avagotech domain. Sorry for the delay in responding.

Does this mean the issue does not happen with 3.16.1
but only happens with 3.17.1 (onwards)? This is a good data point.

Can you share the whole dmesg logs for both 3.16.1 and 3.17.1?

Also, are you using intel_iommu=on in your config? Can you boot without
enabling the Intel IOMMU?

The new logs with "CONFIG_DMA_API_DEBUG" do not help much. You may have to post
the complete logs. We may have some interesting data before the crash.

Thanks, Kashyap


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (12 preceding siblings ...)
  2015-02-26  5:44 ` bugzilla-daemon
@ 2015-02-26  7:32 ` bugzilla-daemon
  2015-02-26 13:38 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-26  7:32 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #14 from merlin@liao.homelinux.org ---
@Kashyap

For me it started with 3.16 on all our servers with 3ware/lsi/avagotech
controllers.
I rechecked in our internal ticket system.

We are using IOMMU, so I completely disabled it and recompiled the kernel, but
I am still getting the same error.

I also attached my kernel config before so maybe you can use that config and
compile the kernel on your side.
Running with that config I can reproduce the crash with a simple dd command:
dd if=/dev/zero of=test bs=1G count=10


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (13 preceding siblings ...)
  2015-02-26  7:32 ` bugzilla-daemon
@ 2015-02-26 13:38 ` bugzilla-daemon
  2015-04-13 16:16 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-02-26 13:38 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #15 from kashyap <kashyap.desai@avagotech.com> ---
(In reply to merlin from comment #14)
> @Kashyap
> 
> For me it started with 3.16 on all our servers with 3ware/lsi/avagotech
> controllers.
> I rechecked in our internal ticket system.
> 
> We are using IOMMU, so I completely disabled it and recompiled the kernel, but
> I am still getting the same error.
> 
> I also attached my kernel config before so maybe you can use that config and
> compile the kernel on your side.
> Running with that config I can reproduce the crash with a simple dd command:
> dd if=/dev/zero of=test bs=1G count=10

Can you share the whole dmesg log for the failure, even with IOMMU disabled? I
will try to reproduce it on my setup, as I was able to find the same h/w.

~ Kashyap


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (14 preceding siblings ...)
  2015-02-26 13:38 ` bugzilla-daemon
@ 2015-04-13 16:16 ` bugzilla-daemon
  2015-04-13 16:16 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-13 16:16 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #16 from merlin@liao.homelinux.org ---
Created attachment 173931
  --> https://bugzilla.kernel.org/attachment.cgi?id=173931&action=edit
panic_screenshot

screenshot of kernel panic


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (15 preceding siblings ...)
  2015-04-13 16:16 ` bugzilla-daemon
@ 2015-04-13 16:16 ` bugzilla-daemon
  2015-04-16 11:21 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-13 16:16 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #17 from merlin@liao.homelinux.org ---
I am unable to get a dmesg output since the system crashes on boot.
It looks like it's happening when udev is initialized.

Sorry for the screenshot but the system crashed before any network came up.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (16 preceding siblings ...)
  2015-04-13 16:16 ` bugzilla-daemon
@ 2015-04-16 11:21 ` bugzilla-daemon
  2015-04-16 11:25 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-16 11:21 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #18 from merlin@liao.homelinux.org ---
Created attachment 174171
  --> https://bugzilla.kernel.org/attachment.cgi?id=174171&action=edit
dmesg

I finally managed to get a dmesg output with the crashing kernel.

Especially this part might indicate the root of the problem?
[Thu Apr 16 13:02:10 2015] ------------[ cut here ]------------
[Thu Apr 16 13:02:10 2015] WARNING: CPU: 6 PID: 1736 at lib/dma-debug.c:601 debug_dma_assert_idle+0x17c/0x1e0()
[Thu Apr 16 13:02:10 2015] 3w-sas 0000:03:00.0: DMA-API: cpu touching an active dma mapped cacheline [cln=0x0000000101ff8e00]
[Thu Apr 16 13:02:10 2015] Modules linked in:
[Thu Apr 16 13:02:10 2015] CPU: 6 PID: 1736 Comm: mount Not tainted 4.0.0-gentoo #1
[Thu Apr 16 13:02:11 2015] Hardware name: Supermicro X9DR7/E-(J)LN4F/X9DR7/E-(J)LN4F, BIOS 3.0a 09/10/2013
[Thu Apr 16 13:02:11 2015]  ffffffff8170e0f5 ffff88406050fcb8 ffffffff814c8778 000000000000009a
[Thu Apr 16 13:02:11 2015]  ffff88406050fd08 ffff88406050fcf8 ffffffff8107c250 ffff884060422e08
[Thu Apr 16 13:02:11 2015]  ffff8840647fe5e0 0000000000000206 ffffea00e1bf9ce8 ffff884060422e08
[Thu Apr 16 13:02:11 2015] Call Trace:
[Thu Apr 16 13:02:11 2015]  [<ffffffff814c8778>] dump_stack+0x45/0x57
[Thu Apr 16 13:02:11 2015]  [<ffffffff8107c250>] warn_slowpath_common+0x80/0xc0
[Thu Apr 16 13:02:11 2015]  [<ffffffff8107c2d1>] warn_slowpath_fmt+0x41/0x50
[Thu Apr 16 13:02:11 2015]  [<ffffffff812b9dfc>] debug_dma_assert_idle+0x17c/0x1e0
[Thu Apr 16 13:02:11 2015]  [<ffffffff81114814>] do_wp_page+0xc4/0x810
[Thu Apr 16 13:02:11 2015]  [<ffffffff8111744c>] handle_mm_fault+0xb3c/0x10f0
[Thu Apr 16 13:02:11 2015]  [<ffffffff8111ca6d>] ? do_mmap_pgoff+0x34d/0x400
[Thu Apr 16 13:02:11 2015]  [<ffffffff81070ec6>] __do_page_fault+0x146/0x390
[Thu Apr 16 13:02:11 2015]  [<ffffffff8107114c>] do_page_fault+0xc/0x10
[Thu Apr 16 13:02:11 2015]  [<ffffffff814cf252>] page_fault+0x22/0x30
[Thu Apr 16 13:02:11 2015] ---[ end trace 326b0530baa65706 ]---


I was still able to log in as root after that and reproduce the crash with
another backtrace. See the next attachment.
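
As a rough aid for reading the DMA-API warning above, the cln value can be
turned back into a physical address. This sketch rests on an assumption that
is not stated anywhere in this thread: that dma-debug's cacheline number is the
physical address divided by the cacheline size. The 64-byte cacheline and
4 KiB page sizes are likewise assumed (typical for x86-64), and cln_to_phys
and cln_to_pfn are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Both shifts are assumptions for x86-64: 64-byte cachelines, 4 KiB pages. */
#define CACHELINE_SHIFT_ASSUMED 6
#define PAGE_SHIFT_ASSUMED      12

static_assert(PAGE_SHIFT_ASSUMED > CACHELINE_SHIFT_ASSUMED,
              "a page must span more than one cacheline");

/* Physical address of the first byte of the given cacheline. */
static uint64_t cln_to_phys(uint64_t cln)
{
        return cln << CACHELINE_SHIFT_ASSUMED;
}

/* Page frame number that contains the given cacheline. */
static uint64_t cln_to_pfn(uint64_t cln)
{
        return cln >> (PAGE_SHIFT_ASSUMED - CACHELINE_SHIFT_ASSUMED);
}
```

Under these assumptions, cln=0x0000000101ff8e00 corresponds to physical address
0x407fe38000, i.e. page frame 0x407fe38.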


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (17 preceding siblings ...)
  2015-04-16 11:21 ` bugzilla-daemon
@ 2015-04-16 11:25 ` bugzilla-daemon
  2015-04-16 11:25 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-16 11:25 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #19 from merlin@liao.homelinux.org ---
Created attachment 174181
  --> https://bugzilla.kernel.org/attachment.cgi?id=174181&action=edit
crash-backtrace


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (18 preceding siblings ...)
  2015-04-16 11:25 ` bugzilla-daemon
@ 2015-04-16 11:25 ` bugzilla-daemon
  2015-04-19  3:59 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-16 11:25 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

merlin@liao.homelinux.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #162291|0                           |1
        is obsolete|                            |


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (19 preceding siblings ...)
  2015-04-16 11:25 ` bugzilla-daemon
@ 2015-04-19  3:59 ` bugzilla-daemon
  2015-04-19  4:00 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  3:59 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

Justin Keogh <kernel.org@v6y.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kernel.org@v6y.net

--- Comment #20 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174421
  --> https://bugzilla.kernel.org/attachment.cgi?id=174421&action=edit
dmesg for the commit that git bisect landed on

I'm getting a similar panic (attached).

$lspci | grep 3ware
05:00.0 RAID bus controller: 3ware Inc 9750 SAS2/SATA-II RAID PCIe (rev 05)

$uname -a
Linux localhost 3.16.0-rc5+ #18 SMP PREEMPT Wed Apr 8 20:11:03 2015 x86_64 Intel(R) Xeon(R) CPU X5482 @ 3.20GHz GenuineIntel GNU/Linux

root@localhost /usr/src/linux $git rev-parse HEAD
74665016086615bbaa3fa6f83af410a0a4e029ee

The panic bisected to:
74665016086615bbaa3fa6f83af410a0a4e029ee is the first bad commit
commit 74665016086615bbaa3fa6f83af410a0a4e029ee
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Jan 22 15:29:29 2014 +0100

    scsi: convert host_busy to atomic_t

    Avoid taking the host-wide host_lock to check the per-host queue limit.
    Instead we do an atomic_inc_return early on to grab our slot in the queue,
    and if necessary decrement it after finishing all checks.


I was unable to trigger the panic by writing to the 3w array with:
$dd if=/dev/zero of=test bs=1G count=10
or even:
$dd if=/dev/zero of=test bs=1G count=100

Instead, it takes somewhere between a few hours and 2 days of writing to panic,
so the good bisects were run for 3 days to make sure.

The crash happens with and without the ZFS modules loaded, but most of the
bisect was done with them loaded because I couldn't take the array out of the
equation for a month. Both cases are attached, along with the .config and
bisect log.

I'll recompile without IOMMU and post the results; further suggestions are
appreciated.
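
The pattern described in the commit message above (take a slot with an atomic
increment first, check the limit afterwards, and decrement again if the check
fails) can be sketched with C11 atomics. This is not the kernel code: struct
host_sketch, host_claim_slot and host_release_slot are hypothetical names, and
the kernel uses its own atomic_t API rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-host state; not the kernel's Scsi_Host. */
struct host_sketch {
        atomic_int host_busy;   /* commands currently in flight */
        int        can_queue;   /* per-host queue limit */
};

/* Grab a slot first, then check the limit; give the slot back on failure. */
static bool host_claim_slot(struct host_sketch *h)
{
        /* atomic_fetch_add returns the old value, so +1 is our own count. */
        int busy = atomic_fetch_add(&h->host_busy, 1) + 1;

        if (busy > h->can_queue) {
                atomic_fetch_sub(&h->host_busy, 1);
                return false;
        }
        return true;
}

/* Called when a command completes and its slot becomes free again. */
static void host_release_slot(struct host_sketch *h)
{
        int old = atomic_fetch_sub(&h->host_busy, 1);

        assert(old > 0);        /* releasing a slot that was never claimed */
        (void)old;
}
```

With can_queue set to 2, two calls to host_claim_slot() succeed and a third
fails, leaving host_busy at 2. No lock is held at any point, which is the aim
the commit message states; the cost is that the counter can briefly exceed the
limit between the increment and the corrective decrement.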


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (20 preceding siblings ...)
  2015-04-19  3:59 ` bugzilla-daemon
@ 2015-04-19  4:00 ` bugzilla-daemon
  2015-04-19  4:00 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:00 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #21 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174431
  --> https://bugzilla.kernel.org/attachment.cgi?id=174431&action=edit
git bisect log between 3.14.33 and 3.18.7


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (21 preceding siblings ...)
  2015-04-19  4:00 ` bugzilla-daemon
@ 2015-04-19  4:00 ` bugzilla-daemon
  2015-04-19  4:04 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:00 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #22 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174441
  --> https://bugzilla.kernel.org/attachment.cgi?id=174441&action=edit
.config


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (22 preceding siblings ...)
  2015-04-19  4:00 ` bugzilla-daemon
@ 2015-04-19  4:04 ` bugzilla-daemon
  2015-04-19  4:05 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:04 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #23 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174451
  --> https://bugzilla.kernel.org/attachment.cgi?id=174451&action=edit
dmesg for commit 'scsi: convert device_busy to atomic_t'


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (23 preceding siblings ...)
  2015-04-19  4:04 ` bugzilla-daemon
@ 2015-04-19  4:05 ` bugzilla-daemon
  2015-04-19  4:07 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:05 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #24 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174461
  --> https://bugzilla.kernel.org/attachment.cgi?id=174461&action=edit
dmesg for panic without zfs modules


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (24 preceding siblings ...)
  2015-04-19  4:05 ` bugzilla-daemon
@ 2015-04-19  4:07 ` bugzilla-daemon
  2015-04-19  4:21 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:07 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

Justin Keogh <kernel.org@v6y.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #174451|application/octet-stream    |text/plain
          mime type|                            |


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (25 preceding siblings ...)
  2015-04-19  4:07 ` bugzilla-daemon
@ 2015-04-19  4:21 ` bugzilla-daemon
  2015-04-20 17:43 ` bugzilla-daemon
  2015-10-10  7:30 ` bugzilla-daemon
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-19  4:21 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #25 from Justin Keogh <kernel.org@v6y.net> ---
Just to clarify, I couldn't trigger this by writing to the array from /dev/zero
at all (maybe because of ZFS); all panics were triggered by writing real data
from another array on the same controller.


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (26 preceding siblings ...)
  2015-04-19  4:21 ` bugzilla-daemon
@ 2015-04-20 17:43 ` bugzilla-daemon
  2015-10-10  7:30 ` bugzilla-daemon
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-04-20 17:43 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #26 from Justin Keogh <kernel.org@v6y.net> ---
Created attachment 174581
  --> https://bugzilla.kernel.org/attachment.cgi?id=174581&action=edit
dmesg for the commit that git bisect landed on, with IOMMU disabled


* [Bug 90601] panic on write to 3ware raid array
  2015-01-02 14:31 [Bug 90601] New: panic on write to 3ware raid array bugzilla-daemon
                   ` (27 preceding siblings ...)
  2015-04-20 17:43 ` bugzilla-daemon
@ 2015-10-10  7:30 ` bugzilla-daemon
  28 siblings, 0 replies; 30+ messages in thread
From: bugzilla-daemon @ 2015-10-10  7:30 UTC (permalink / raw)
  To: linux-scsi

https://bugzilla.kernel.org/show_bug.cgi?id=90601

--- Comment #27 from Justin Keogh <kernel.org@v6y.net> ---
I am unable to reproduce this on any of the 4.x.y kernels.
