* BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
@ 2015-04-08 16:33 Torsten Luettgert
  2015-04-13 17:41 ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-08 16:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Christoph Hellwig

Hello,

I've been getting NULL pointer dereference BUGs on one of my Supermicro
machines since 3.17. They occur at random uptimes, often a few hours
after booting (the longest uptime so far was 2 days).

I bisected the problem (took a while); the problematic commit seems
to be 746650160866 (scsi: convert host_busy to atomic_t) by
Christoph Hellwig.

Here's one of the logs (it's always the same trace):

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: iTCO_wdt iTCO_vendor_support lpc_ich mfd_core usb_storage
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-74665016086615bb+ #1
Hardware name: Supermicro X8DTT/X8DTT, BIOS 080016  10/05/2010
task: ffffffff81c16480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
RIP: 0010:[<ffffffff8133af60>]  [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
RSP: 0018:ffff88063fc03e08  EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 000000090e2ef000 RDI: ffff880c14e61a00
RBP: ffff88063fc03e38 R08: 0000000000000000 R09: ffff8806209cc098
R10: ffff88063f400120 R11: 0000000000001268 R12: 0000000000000002
R13: 0000000000000002 R14: ffff8806209cc098 R15: ffff880c200fcc70
FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000001c11000 CR4: 00000000000027e0
Stack:
 0000000000000094 0000000000000094 ffff880c200f8718 0000000000000094
 0000000000000094 0000000000000094 ffff88063fc03e48 ffffffff8146a0b4
 ffff88063fc03e88 ffffffff81477c1d ffff88063fc03e78 ffff880c213a57c0
Call Trace:
 <IRQ> 
 [<ffffffff8146a0b4>] scsi_dma_unmap+0x54/0x70
 [<ffffffff81477c1d>] twl_interrupt+0x26d/0x420
 [<ffffffff810fe2fd>] handle_irq_event_percpu+0x5d/0x1c0
 [<ffffffff810fe4a2>] handle_irq_event+0x42/0x70
 [<ffffffff8110165b>] handle_fasteoi_irq+0x5b/0x100
 [<ffffffff81053fdc>] handle_irq+0x5c/0x150
 [<ffffffff810c8f72>] ? __atomic_notifier_call_chain+0x12/0x20
 [<ffffffff810c8f96>] ? atomic_notifier_call_chain+0x16/0x20
 [<ffffffff81776f6e>] do_IRQ+0x5e/0x110
 [<ffffffff817754ea>] common_interrupt+0x6a/0x6a
 <EOI> 
 [<ffffffff815de8c3>] ? cpuidle_enter_state+0x53/0xd0
 [<ffffffff815de8bf>] ? cpuidle_enter_state+0x4f/0xd0
 [<ffffffff815de957>] cpuidle_enter+0x17/0x20
 [<ffffffff810e95a4>] cpuidle_idle_call+0xc4/0x250
 [<ffffffff810e9855>] cpu_idle_loop+0x125/0x1d0
 [<ffffffff810e9913>] cpu_startup_entry+0x13/0x20
 [<ffffffff81769597>] rest_init+0x77/0x80
 [<ffffffff81d74344>] start_kernel+0x39a/0x3a1
 [<ffffffff81d73dc8>] ? set_init_arg+0x5d/0x5d
 [<ffffffff8176f1ad>] ? memblock_reserve+0x4c/0x51
 [<ffffffff81d735ad>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d736f0>] x86_64_start_kernel+0x141/0x148
Code: 56 49 89 fe 41 55 41 89 cd 41 54 41 89 d4 53 48 83 ec 10 83 f9 03
74 5e 31 db 85 d2 48 89 f0 7e 48 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b
70 10 48 3b 35 d5 16 e0 00 8b 50 18 72 1e 48 3b 35 d1 16
RIP  [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
 RSP <ffff88063fc03e08>
CR2: 0000000000000010
---[ end trace 4e21be7f8b16aadd ]---

The same problem was reported by Kui Zhang last October with the
subject "3.17.0-rc7 kernel NULL pointer dereference (3ware 9650SE)".
Regrettably (for me), nobody replied.

We have a 3ware controller, too, but ours is a 9750. Controller
firmware and BIOS are current.

Any help with this is greatly appreciated.

Regards,
Torsten

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-08 16:33 BUG: unable to handle kernel NULL pointer deref, bisected to 746650160 Torsten Luettgert
@ 2015-04-13 17:41 ` Christoph Hellwig
  2015-04-13 18:28   ` Torsten Luettgert
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2015-04-13 17:41 UTC (permalink / raw)
  To: Torsten Luettgert; +Cc: linux-kernel, linux-scsi

On Wed, Apr 08, 2015 at 06:33:33PM +0200, Torsten Luettgert wrote:
> Hello,
> 
> I've been getting NULL pointer dereference BUGs on one of my Supermicro
> machines since 3.17. They occur at random uptimes, often a few hours
> after booting (the longest uptime so far was 2 days).
> 
> I bisected the problem (took a while); the problematic commit seems
> to be 746650160866 (scsi: convert host_busy to atomic_t) by
> Christoph Hellwig.

Seems like a very unlikely commit for this sort of change, so let's try
to dig a bit deeper.

> Here's one of the logs (it's always the same trace):
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> IP: [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: iTCO_wdt iTCO_vendor_support lpc_ich mfd_core usb_storage
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-74665016086615bb+ #1
> Hardware name: Supermicro X8DTT/X8DTT, BIOS 080016  10/05/2010
> task: ffffffff81c16480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
> RIP: 0010:[<ffffffff8133af60>]  [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
> RSP: 0018:ffff88063fc03e08  EFLAGS: 00010002
> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
> RDX: 0000000000000000 RSI: 000000090e2ef000 RDI: ffff880c14e61a00
> RBP: ffff88063fc03e38 R08: 0000000000000000 R09: ffff8806209cc098
> R10: ffff88063f400120 R11: 0000000000001268 R12: 0000000000000002
> R13: 0000000000000002 R14: ffff8806209cc098 R15: ffff880c200fcc70
> FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000010 CR3: 0000000001c11000 CR4: 00000000000027e0
> Stack:
>  0000000000000094 0000000000000094 ffff880c200f8718 0000000000000094
>  0000000000000094 0000000000000094 ffff88063fc03e48 ffffffff8146a0b4
>  ffff88063fc03e88 ffffffff81477c1d ffff88063fc03e78 ffff880c213a57c0
> Call Trace:
>  <IRQ> 
>  [<ffffffff8146a0b4>] scsi_dma_unmap+0x54/0x70

Can you run gdb on your vmlinux file and send the output of the
following command in gdb

l *(scsi_dma_unmap+0x54)

Thanks!

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-13 17:41 ` Christoph Hellwig
@ 2015-04-13 18:28   ` Torsten Luettgert
  2015-04-17 13:31     ` Torsten Luettgert
  2015-04-19 16:58     ` Christoph Hellwig
  0 siblings, 2 replies; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-13 18:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-scsi

On Mon, 13 Apr 2015 19:41:05 +0200
Christoph Hellwig <hch@lst.de> wrote:

> Can you run gdb on your vmlinux file and send the output of the
> following command in gdb
> 
> l *(scsi_dma_unmap+0x54)

Thanks for looking into it! Here is what gdb says:

Reading symbols from /opt/kvm/bisect/vmlinux-3.16.0-74665016086615bb+...done.
(gdb) l *(scsi_dma_unmap+0x54)
0xffffffff8146a0b4 is in scsi_dma_unmap (drivers/scsi/scsi_lib_dma.c:50).
45			struct device *dev = cmd->device->host->dma_dev;
46
47			dma_unmap_sg(dev, scsi_sglist(cmd), scsi_sg_count(cmd),
48				     cmd->sc_data_direction);
49		}
50	}
51	EXPORT_SYMBOL(scsi_dma_unmap);

Regards,
Torsten

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-13 18:28   ` Torsten Luettgert
@ 2015-04-17 13:31     ` Torsten Luettgert
  2015-04-17 13:55       ` Torsten Luettgert
  2015-04-19 16:58     ` Christoph Hellwig
  1 sibling, 1 reply; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-17 13:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-scsi

On Mon, 13 Apr 2015 20:28:29 +0200
Torsten Luettgert <ml-lkml@enda.eu> wrote:

> On Mon, 13 Apr 2015 19:41:05 +0200
> Christoph Hellwig <hch@lst.de> wrote:
> 
> > Can you run gdb on your vmlinux file and send the output of the
> > following command in gdb
> > 
> > l *(scsi_dma_unmap+0x54)
> 
> Thanks for looking into it! Here is what gdb says: [...]

Hello Christoph,

do you think it could be a compiler problem? I still can't imagine why
only 2 people have seen this (at least it wasn't just me), but it
_always_ hits with newer kernels on that node.

Right now, I'm trying the problematic release, compiled with a newer
gcc (4.9.2-6 from Fedora; I was using 4.4.7-11 from RHEL6 before). It
has only been running for a few hours so far, so I can't tell if
it's good.

Regards,
Torsten

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-17 13:31     ` Torsten Luettgert
@ 2015-04-17 13:55       ` Torsten Luettgert
  0 siblings, 0 replies; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-17 13:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: LKML, Linux SCSI mailing list

On Fri, 17 Apr 2015 15:31:16 +0200
Torsten Luettgert <ml-lkml@enda.eu> wrote:

> On Mon, 13 Apr 2015 20:28:29 +0200
> Torsten Luettgert <ml-lkml@enda.eu> wrote:
> 
> Right now, I'm trying the problematic release, compiled with a newer
> gcc (4.9.2-6 from Fedora; I was using 4.4.7-11 from RHEL6 before). It
> has only been running for a few hours so far, so I can't tell if
> it's good.

Ok, forget that, it just crashed.

Regards,
Torsten

> BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> IP: [<ffffffff8134d3c8>] swiotlb_unmap_sg_attrs+0x28/0x70
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: iTCO_wdt iTCO_vendor_support lpc_ich mfd_core usb_storage
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-74665016086615bbaa3fa6f83af410a0a4e029ee+ #1
> Hardware name: Supermicro X8DTT/X8DTT, BIOS 2.1c       04/22/2014
> task: ffffffff81c16500 ti: ffffffff81c00000 task.ti: ffffffff81c00000
> RIP: 0010:[<ffffffff8134d3c8>]  [<ffffffff8134d3c8>] swiotlb_unmap_sg_attrs+0x28/0x70
> RSP: 0018:ffff88063fc03e20  EFLAGS: 00010002
> RAX: ffff880c20567098 RBX: 0000000000000000 RCX: 0000000000000002
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880c20567098
> RBP: ffff88063fc03e48 R08: 0000000000000000 R09: ffffffff8134d3a0
> R10: 0000000000000000 R11: 00000000000007eb R12: 0000000000000000
> R13: 0000000000000002 R14: 0000000000000001 R15: ffff880c20567098
> FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000018 CR3: 0000000001c11000 CR4: 00000000000027e0
> Stack:
>  ffff880c1eae0718 ffff880c1eae4c70 ffff880c1f8c858c 00000000000000e4
>  00000000000000e4 ffff88063fc03e58 ffffffff8147975e ffff88063fc03ea8
>  ffffffff81486135 0000000000000046 ffff880613c6d050 ffff88063fc03f48
> Call Trace:
>  <IRQ> 
>  [<ffffffff8147975e>] scsi_dma_unmap+0x5e/0x80
>  [<ffffffff81486135>] twl_interrupt+0x425/0x750
>  [<ffffffff810f6139>] handle_irq_event_percpu+0x39/0x1b0
>  [<ffffffff810f62e7>] handle_irq_event+0x37/0x60
>  [<ffffffff810f95f8>] handle_fasteoi_irq+0x78/0x150
>  [<ffffffff8104fc64>] handle_irq+0x84/0x150
>  [<ffffffff810c0f6a>] ? atomic_notifier_call_chain+0x1a/0x20
>  [<ffffffff81778d5d>] do_IRQ+0x4d/0xe0
>  [<ffffffff817772ea>] common_interrupt+0x6a/0x6a
>  <EOI> 
>  [<ffffffff815e28a6>] ? cpuidle_enter_state+0x46/0xb0
>  [<ffffffff815e29f7>] cpuidle_enter+0x17/0x20
>  [<ffffffff810e12e9>] cpu_startup_entry+0x2b9/0x3e0
>  [<ffffffff81765887>] rest_init+0x77/0x80
>  [<ffffffff81d85e94>] start_kernel+0x41d/0x42a
>  [<ffffffff81d85495>] x86_64_start_reservations+0x2a/0x2c
>  [<ffffffff81d85588>] x86_64_start_kernel+0xf1/0xf4
> Code: 44 00 00 55 83 f9 03 48 89 e5 41 57 41 56 41 55 41 89 cd 41 54
> 53 74 4b 45 31 e4 85 d2 49 89 ff 48 89 f3 41 89 d6 7e 29 0f 1f 00
> <8b> 53 18 48 8b 73 10 44 89 e9 4c 89 ff 41 83 c4 01 e8 82 ff ff
> RIP [<ffffffff8134d3c8>] swiotlb_unmap_sg_attrs+0x28/0x70
>  RSP <ffff88063fc03e20>
> CR2: 0000000000000018
> ---[ end trace 3eba978c8cc1c24a ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Shutting down cpus with NMI
> Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> Rebooting in 10 seconds..
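
One way to read the two fault addresses in this thread: in the first
oops the marked instruction bytes (48 8b 70 10) load from 0x10(%rax)
with RAX = 0, and in the second (8b 53 18) from 0x18(%rbx) with RBX = 0,
which matches CR2 = 0x10 and CR2 = 0x18. Both look like field reads
through a NULL scatter-gather list pointer. The sketch below is only a
userspace model: struct mock_scatterlist and faulting_address() are
hypothetical names, and the field layout is an assumption about an
x86_64 scatterlist entry without the debug magic field, not a copy of
the kernel header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for an x86_64 scatterlist entry.
 * Assumed layout: no debug magic field, separate DMA length field.
 * This is NOT the kernel header, only a model for offset arithmetic.
 */
struct mock_scatterlist {
	uint64_t page_link;    /* offset 0x00 */
	uint32_t offset;       /* offset 0x08 */
	uint32_t length;       /* offset 0x0c */
	uint64_t dma_address;  /* offset 0x10 */
	uint32_t dma_length;   /* offset 0x18 */
};

/*
 * Address a load would fault on when it reads the field at
 * field_offset through a list pointer whose value is list_base
 * (0 for a NULL pointer).
 */
static inline uintptr_t faulting_address(uintptr_t list_base,
					 size_t field_offset)
{
	return list_base + field_offset;
}

/* Compile-time checks against the two CR2 values reported above. */
_Static_assert(offsetof(struct mock_scatterlist, dma_address) == 0x10,
	       "dma_address expected at 0x10 in this model");
_Static_assert(offsetof(struct mock_scatterlist, dma_length) == 0x18,
	       "dma_length expected at 0x18 in this model");
```

Under these assumptions, the different fault addresses in the two
builds would simply reflect which field each compiler chose to load
first, not two different bugs.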

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-13 18:28   ` Torsten Luettgert
  2015-04-17 13:31     ` Torsten Luettgert
@ 2015-04-19 16:58     ` Christoph Hellwig
  2015-04-20 11:24       ` Torsten Luettgert
  1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2015-04-19 16:58 UTC (permalink / raw)
  To: Torsten Luettgert; +Cc: Christoph Hellwig, linux-kernel, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 177 bytes --]

This looks to me like a long-standing bug in all three 3ware drivers,
one that taking the host lock around the host_busy manipulation
was hiding.

Can you test the patch below?

[-- Attachment #2: 0001-3w-sas-fix-command-completion-race.patch --]
[-- Type: text/x-patch, Size: 4523 bytes --]

From b500f76e6f92722d8f1fc0de90961af730320953 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Sun, 19 Apr 2015 18:46:36 +0200
Subject: 3w-sas: fix command completion race

The 3w-sas driver needs to tear down the dma mappings before returning
the command to the midlayer, as there is no guarantee the sglist and
count are valid after that point.  Also remove the dma mapping helpers
which have another inherent race due to the request_id index.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/3w-sas.c | 50 ++++++++++----------------------------------------
 drivers/scsi/3w-sas.h |  4 ----
 2 files changed, 10 insertions(+), 44 deletions(-)

diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c
index 2361772..3d4c5f9 100644
--- a/drivers/scsi/3w-sas.c
+++ b/drivers/scsi/3w-sas.c
@@ -290,26 +290,6 @@ static int twl_post_command_packet(TW_Device_Extension *tw_dev, int request_id)
 	return 0;
 } /* End twl_post_command_packet() */
 
-/* This function will perform a pci-dma mapping for a scatter gather list */
-static int twl_map_scsi_sg_data(TW_Device_Extension *tw_dev, int request_id)
-{
-	int use_sg;
-	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
-
-	use_sg = scsi_dma_map(cmd);
-	if (!use_sg)
-		return 0;
-	else if (use_sg < 0) {
-		TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1, "Failed to map scatter gather list");
-		return 0;
-	}
-
-	cmd->SCp.phase = TW_PHASE_SGLIST;
-	cmd->SCp.have_data_in = use_sg;
-
-	return use_sg;
-} /* End twl_map_scsi_sg_data() */
-
 /* This function hands scsi cdb's to the firmware */
 static int twl_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id, char *cdb, int use_sg, TW_SG_Entry_ISO *sglistarg)
 {
@@ -357,8 +337,8 @@ static int twl_scsiop_execute_scsi(TW_Device_Extension *tw_dev, int request_id,
 	if (!sglistarg) {
 		/* Map sglist from scsi layer to cmd packet */
 		if (scsi_sg_count(srb)) {
-			sg_count = twl_map_scsi_sg_data(tw_dev, request_id);
-			if (sg_count == 0)
+			sg_count = scsi_dma_map(srb);
+			if (sg_count <= 0)
 				goto out;
 
 			scsi_for_each_sg(srb, sg, sg_count, i) {
@@ -1102,15 +1082,6 @@ out:
 	return retval;
 } /* End twl_initialize_device_extension() */
 
-/* This function will perform a pci-dma unmap */
-static void twl_unmap_scsi_data(TW_Device_Extension *tw_dev, int request_id)
-{
-	struct scsi_cmnd *cmd = tw_dev->srb[request_id];
-
-	if (cmd->SCp.phase == TW_PHASE_SGLIST)
-		scsi_dma_unmap(cmd);
-} /* End twl_unmap_scsi_data() */
-
 /* This function will handle attention interrupts */
 static int twl_handle_attention_interrupt(TW_Device_Extension *tw_dev)
 {
@@ -1251,11 +1222,11 @@ static irqreturn_t twl_interrupt(int irq, void *dev_instance)
 			}
 
 			/* Now complete the io */
+			scsi_dma_unmap(cmd);
+			cmd->scsi_done(cmd);
 			tw_dev->state[request_id] = TW_S_COMPLETED;
 			twl_free_request_id(tw_dev, request_id);
 			tw_dev->posted_request_count--;
-			tw_dev->srb[request_id]->scsi_done(tw_dev->srb[request_id]);
-			twl_unmap_scsi_data(tw_dev, request_id);
 		}
 
 		/* Check for another response interrupt */
@@ -1400,10 +1371,12 @@ static int twl_reset_device_extension(TW_Device_Extension *tw_dev, int ioctl_res
 		if ((tw_dev->state[i] != TW_S_FINISHED) &&
 		    (tw_dev->state[i] != TW_S_INITIAL) &&
 		    (tw_dev->state[i] != TW_S_COMPLETED)) {
-			if (tw_dev->srb[i]) {
-				tw_dev->srb[i]->result = (DID_RESET << 16);
-				tw_dev->srb[i]->scsi_done(tw_dev->srb[i]);
-				twl_unmap_scsi_data(tw_dev, i);
+		    	struct scsi_cmnd * cmd = tw_dev->srb[i];
+
+			if (cmd) {
+				cmd->result = (DID_RESET << 16);
+				scsi_dma_unmap(cmd);
+				cmd->scsi_done(cmd);
 			}
 		}
 	}
@@ -1507,9 +1480,6 @@ static int twl_scsi_queue_lck(struct scsi_cmnd *SCpnt, void (*done)(struct scsi_
 	/* Save the scsi command for use by the ISR */
 	tw_dev->srb[request_id] = SCpnt;
 
-	/* Initialize phase to zero */
-	SCpnt->SCp.phase = TW_PHASE_INITIAL;
-
 	retval = twl_scsiop_execute_scsi(tw_dev, request_id, NULL, 0, NULL);
 	if (retval) {
 		tw_dev->state[request_id] = TW_S_COMPLETED;
diff --git a/drivers/scsi/3w-sas.h b/drivers/scsi/3w-sas.h
index d474892..fec6449 100644
--- a/drivers/scsi/3w-sas.h
+++ b/drivers/scsi/3w-sas.h
@@ -103,10 +103,6 @@ static char *twl_aen_severity_table[] =
 #define TW_CURRENT_DRIVER_BUILD 0
 #define TW_CURRENT_DRIVER_BRANCH 0
 
-/* Phase defines */
-#define TW_PHASE_INITIAL 0
-#define TW_PHASE_SGLIST  2
-
 /* Misc defines */
 #define TW_SECTOR_SIZE                        512
 #define TW_MAX_UNITS			      32
-- 
1.9.1
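
The ordering change in twl_interrupt() can be illustrated with a small
userspace model. This is a sketch under stated assumptions, not driver
code: struct mock_cmd, mock_done() and mock_unmap() are hypothetical
stand-ins, and mock_done() always clears the list, whereas in the real
kernel this is a race that only sometimes hits. Where the kernel would
dereference a NULL pointer, the model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a SCSI command; not kernel code. */
struct mock_cmd {
	int *sglist;   /* stands in for the scatter-gather list */
	int sg_count;  /* number of entries in the list */
	int unmapped;  /* entries torn down by mock_unmap() */
};

/*
 * Model of the completion callback. After it runs, the command belongs
 * to the midlayer again, which may release or reuse the list.
 */
static void mock_done(struct mock_cmd *cmd)
{
	cmd->sglist = NULL;
	cmd->sg_count = 0;
}

/*
 * Model of the unmap step. Returns 0 on success and -1 if the list is
 * already gone (the point where the real kernel would oops).
 */
static int mock_unmap(struct mock_cmd *cmd)
{
	if (cmd->sglist == NULL)
		return -1;
	cmd->unmapped = cmd->sg_count;
	return 0;
}

/* Ordering before the patch: complete first, then unmap. */
static int complete_then_unmap(struct mock_cmd *cmd)
{
	mock_done(cmd);
	return mock_unmap(cmd);
}

/* Ordering after the patch: unmap first, then complete. */
static int unmap_then_complete(struct mock_cmd *cmd)
{
	int ret = mock_unmap(cmd);

	mock_done(cmd);
	return ret;
}
```

In this model complete_then_unmap() always fails and
unmap_then_complete() always succeeds, which mirrors the reasoning in
the commit message: the sglist and count are only guaranteed valid
until the command is handed back.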


* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-19 16:58     ` Christoph Hellwig
@ 2015-04-20 11:24       ` Torsten Luettgert
  2015-04-22  8:45         ` Torsten Luettgert
  0 siblings, 1 reply; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-20 11:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: LKML, Linux SCSI mailing list

On Sun, 19 Apr 2015 18:58:41 +0200
Christoph Hellwig <hch@lst.de> wrote:

> This looks to me like a long-standing bug in all three 3ware drivers,
> one that taking the host lock around the host_busy manipulation
> was hiding.
> 
> Can you test the patch below?

I'm running it right now and keeping my fingers crossed.

Thanks & regards,
Torsten

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-20 11:24       ` Torsten Luettgert
@ 2015-04-22  8:45         ` Torsten Luettgert
  2015-04-27 15:17           ` Torsten Luettgert
  0 siblings, 1 reply; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-22  8:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: LKML, Linux SCSI mailing list

On Mon, 20 Apr 2015 13:24:24 +0200
Torsten Luettgert <ml-lkml@enda.eu> wrote:

> > Can you test the patch below?
> 
> I'm running it right now and keeping my fingers crossed.

Just under two days uptime now, and no crashes. I'm pretty sure you
nailed it.

I'll keep this running, and if - when - it has been going for a week (3
times the max uptime without the patch) - I'll give a final notice.

Thanks a lot, Christoph! Will this patch be integrated into mainline?

Regards,
Torsten

* Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160
  2015-04-22  8:45         ` Torsten Luettgert
@ 2015-04-27 15:17           ` Torsten Luettgert
  0 siblings, 0 replies; 9+ messages in thread
From: Torsten Luettgert @ 2015-04-27 15:17 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: LKML, Linux SCSI mailing list

On Wed, 22 Apr 2015 10:45:19 +0200
Torsten Luettgert <ml-lkml@enda.eu> wrote:

> I'll keep this running, and if - when - it has been going for a week
> (3 times the max uptime without the patch) - I'll give a final notice.

So here it is, the final notice:
17:16:36 up 7 days,  4:07,  2 users,  load average: 3.64, 3.11, 3.19

Thanks again,
Torsten
