* [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: mwilck @ 2021-04-27  8:52 UTC
  To: Keith Busch, Sagi Grimberg, Christoph Hellwig, Chao Leng
  Cc: Hannes Reinecke, Daniel Wagner, linux-nvme, Martin Wilck, stable

From: Martin Wilck <mwilck@suse.com>

We have observed a few crashes in run_timer_softirq(), where a broken
timer_list struct belonging to an anatt_timer was encountered. The broken
structures look like this, and we have actually seen multiple of them attached to
the same timer base:

crash> struct timer_list 0xffff92471bcfdc90
struct timer_list {
  entry = {
    next = 0xdead000000000122,  // LIST_POISON2
    pprev = 0x0
  },
  expires = 4296022933,
  function = 0xffffffffc06de5e0 <nvme_anatt_timeout>,
  flags = 20
}
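
For reference, this poisoned state (next == LIST_POISON2, pprev == NULL) is
exactly what detach_timer() leaves behind after unlinking a timer. A
paraphrased sketch of kernel/time/timer.c (exact code may differ between
kernel versions):

static inline void detach_timer(struct timer_list *timer, bool clear_pending)
{
        struct hlist_node *entry = &timer->entry;

        debug_deactivate(timer);

        __hlist_del(entry);
        if (clear_pending)
                entry->pprev = NULL;        /* -> pprev = 0x0 above */
        entry->next = LIST_POISON2;         /* -> next = 0xdead...122 above */
}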

If such a timer is encountered in run_timer_softirq(), the kernel
crashes. The test scenario was an I/O load test with lots of NVMe
controllers, some of which were removed and re-added on the storage side.

I think this may happen if the rdma recovery_work starts, in this call
chain:

nvme_rdma_error_recovery_work()
  /* this stops all sorts of activity for the controller, but not
     the multipath-related work queue and timer */
  nvme_rdma_reconnect_or_remove(ctrl)
    => kicks reconnect_work

work queue: reconnect_work

nvme_rdma_reconnect_ctrl_work()
  nvme_rdma_setup_ctrl()
    nvme_rdma_configure_admin_queue()
      nvme_init_identify()
        nvme_mpath_init()
          # this sets some fields of the timer_list without taking a lock
          timer_setup()
          nvme_read_ana_log()
            mod_timer() or del_timer_sync()
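
For reference, timer_setup() ends up in do_init_timer(), which rewrites the
timer's fields with no locking whatsoever. A paraphrased sketch of
kernel/time/timer.c (details vary between kernel versions):

static void do_init_timer(struct timer_list *timer,
                          void (*func)(struct timer_list *),
                          unsigned int flags,
                          const char *name, struct lock_class_key *key)
{
        /* "forgets" a still-queued timer without unlinking it */
        timer->entry.pprev = NULL;
        timer->function = func;
        /* may change the CPU base encoded in the flags */
        timer->flags = flags | raw_smp_processor_id();
        lockdep_init_map(&timer->lockdep_map, name, key, 0);
}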

Similarly for TCP. The idea for the patch is based on the observation that
nvme_rdma_reset_ctrl_work() calls nvme_stop_ctrl()->nvme_mpath_stop(),
whereas nvme_rdma_error_recovery_work() stops only the keepalive timer, but
not the anatt timer.
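
For reference, nvme_stop_ctrl() looks roughly like this (paraphrased from
drivers/nvme/host/core.c of that era; exact code may differ):

void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
{
        nvme_mpath_stop(ctrl);  /* del_timer_sync() on the anatt timer */
        nvme_stop_keep_alive(ctrl);
        flush_work(&ctrl->async_event_work);
        cancel_work_sync(&ctrl->fw_act_work);
}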

I admit that the root cause analysis isn't rock solid yet. In particular, I
can't explain why we see LIST_POISON2 in the "next" pointer, which would
indicate that the timer has been detached before; yet we find it linked to
the timer base when the crash occurs.

OTOH, the anatt_timer is only touched in nvme_mpath_init() (see above) and
nvme_mpath_stop(), so the hypothesis that modifying active timers may cause
the issue isn't totally far-fetched. I suspect that the LIST_POISON2 state
may arise in multiple steps.

If anyone has better ideas, please advise. The issue occurs very
sporadically; verifying this by testing will be difficult.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Cc: stable@vger.kernel.org

---
Changes in v3: Changed the subject line, as suggested by Sagi Grimberg

Changes in v2: Moved call to nvme_mpath_stop() further down, directly before
the call of nvme_rdma_reconnect_or_remove() (Chao Leng)
---
 drivers/nvme/host/multipath.c | 1 +
 drivers/nvme/host/rdma.c      | 1 +
 drivers/nvme/host/tcp.c       | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a1d476e1ac02..c63dd5dfa7ff 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -586,6 +586,7 @@ void nvme_mpath_stop(struct nvme_ctrl *ctrl)
 	del_timer_sync(&ctrl->anatt_timer);
 	cancel_work_sync(&ctrl->ana_work);
 }
+EXPORT_SYMBOL_GPL(nvme_mpath_stop);
 
 #define SUBSYS_ATTR_RW(_name, _mode, _show, _store)  \
 	struct device_attribute subsys_attr_##_name =	\
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index be905d4fdb47..fc07a7b0dc1d 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1202,6 +1202,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 		return;
 	}
 
+	nvme_mpath_stop(&ctrl->ctrl);
 	nvme_rdma_reconnect_or_remove(ctrl);
 }
 
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index a0f00cb8f9f3..46287b4f4d10 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2068,6 +2068,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 		return;
 	}
 
+	nvme_mpath_stop(ctrl);
 	nvme_tcp_reconnect_or_remove(ctrl);
 }
 
-- 
2.31.1


* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Daniel Wagner @ 2021-04-27  9:16 UTC
  To: mwilck
  Cc: Keith Busch, Sagi Grimberg, Christoph Hellwig, Chao Leng,
	Hannes Reinecke, linux-nvme, stable

On Tue, Apr 27, 2021 at 10:52:46AM +0200, mwilck@suse.com wrote:
> If anyone has better ideas, please advise. The issue occurs very
> sporadically; verifying this by testing will be difficult.

Maybe you could update the commit message with Chao's analysis?


* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Hannes Reinecke @ 2021-04-27  9:33 UTC
  To: mwilck, Keith Busch, Sagi Grimberg, Christoph Hellwig, Chao Leng
  Cc: Daniel Wagner, linux-nvme, stable

On 4/27/21 10:52 AM, mwilck@suse.com wrote:
> From: Martin Wilck <mwilck@suse.com>
> 
> We have observed a few crashes in run_timer_softirq(), where a broken
> timer_list struct belonging to an anatt_timer was encountered. The broken
> structures look like this, and we have actually seen multiple of them attached to
> the same timer base:
> 
> crash> struct timer_list 0xffff92471bcfdc90
> struct timer_list {
>   entry = {
>     next = 0xdead000000000122,  // LIST_POISON2
>     pprev = 0x0
>   },
>   expires = 4296022933,
>   function = 0xffffffffc06de5e0 <nvme_anatt_timeout>,
>   flags = 20
> }
> 
> If such a timer is encountered in run_timer_softirq(), the kernel
> crashes. The test scenario was an I/O load test with lots of NVMe
> controllers, some of which were removed and re-added on the storage side.
> 
> I think this may happen if the rdma recovery_work starts, in this call
> chain:
> 
> nvme_rdma_error_recovery_work()
>   /* this stops all sorts of activity for the controller, but not
>      the multipath-related work queue and timer */
>   nvme_rdma_reconnect_or_remove(ctrl)
>     => kicks reconnect_work
> 
> work queue: reconnect_work
> 
> nvme_rdma_reconnect_ctrl_work()
>   nvme_rdma_setup_ctrl()
>     nvme_rdma_configure_admin_queue()
>       nvme_init_identify()
>         nvme_mpath_init()
>           # this sets some fields of the timer_list without taking a lock
>           timer_setup()
>           nvme_read_ana_log()
>             mod_timer() or del_timer_sync()
> 
> Similarly for TCP. The idea for the patch is based on the observation that
> nvme_rdma_reset_ctrl_work() calls nvme_stop_ctrl()->nvme_mpath_stop(),
> whereas nvme_rdma_error_recovery_work() stops only the keepalive timer, but
> not the anatt timer.
> 
> I admit that the root cause analysis isn't rock solid yet. In particular, I
> can't explain why we see LIST_POISON2 in the "next" pointer, which would
> indicate that the timer has been detached before; yet we find it linked to
> the timer base when the crash occurs.
> 
> OTOH, the anatt_timer is only touched in nvme_mpath_init() (see above) and
> nvme_mpath_stop(), so the hypothesis that modifying active timers may cause
> the issue isn't totally far-fetched. I suspect that the LIST_POISON2 state
> may arise in multiple steps.
> 
> If anyone has better ideas, please advise. The issue occurs very
> sporadically; verifying this by testing will be difficult.
> 
> Signed-off-by: Martin Wilck <mwilck@suse.com>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> Reviewed-by: Chao Leng <lengchao@huawei.com>
> Cc: stable@vger.kernel.org
> 
As indicated in my previous mail, please change the description. We have
since established an actual reason (duplicate calls to add_timer()), so
please list it here.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Christoph Hellwig @ 2021-04-27 16:25 UTC
  To: Hannes Reinecke
  Cc: mwilck, Keith Busch, Sagi Grimberg, Christoph Hellwig, Chao Leng,
	Daniel Wagner, linux-nvme, stable

On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke wrote:
> As indicated in my previous mail, please change the description. We have
> since established an actual reason (duplicate calls to add_timer()), so
> please list it here.

So what happens if the offending add_timer is changed to mod_timer?

* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Hannes Reinecke @ 2021-04-27 18:05 UTC
  To: Christoph Hellwig
  Cc: mwilck, Keith Busch, Sagi Grimberg, Chao Leng, Daniel Wagner,
	linux-nvme, stable

On 4/27/21 6:25 PM, Christoph Hellwig wrote:
> On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke wrote:
>> As indicated in my previous mail, please change the description. We have
>> since established an actual reason (duplicate calls to add_timer()), so
>> please list it here.
> 
> So what happens if the offending add_timer is changed to mod_timer?
> 
I guess that should be fine, as the boilerplate said it can act
as a safe version of add_timer.

But that would just solve the crash upon add_timer(). We still have the
problem that the anatt timer might trigger just _after_ eg
nvme_tcp_teardown_admin_queue(), causing it to hit an invalid admin queue.
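
For context, the timer callback fires regardless of the transport's state and
kicks a reset; a paraphrased sketch of drivers/nvme/host/multipath.c (exact
code may differ):

static void nvme_anatt_timeout(struct timer_list *t)
{
        struct nvme_ctrl *ctrl = from_timer(ctrl, t, anatt_timer);

        dev_info(ctrl->device, "ANATT timeout, resetting controller.\n");
        nvme_reset_ctrl(ctrl);
}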

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Christoph Hellwig @ 2021-04-27 18:10 UTC
  To: Hannes Reinecke
  Cc: Christoph Hellwig, mwilck, Keith Busch, Sagi Grimberg, Chao Leng,
	Daniel Wagner, linux-nvme, stable

On Tue, Apr 27, 2021 at 08:05:41PM +0200, Hannes Reinecke wrote:
> On 4/27/21 6:25 PM, Christoph Hellwig wrote:
> > On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke wrote:
> >> As indicated in my previous mail, please change the description. We have
> >> since established an actual reason (duplicate calls to add_timer()), so
> >> please list it here.
> > 
> > So what happens if the offending add_timer is changed to mod_timer?
> > 
> I guess that should be fine, as the boilerplate said it can act
> as a safe version of add_timer.
> 
> But that would just solve the crash upon add_timer(). We still have the
> problem that the anatt timer might trigger just _after_ eg
> nvme_tcp_teardown_admin_queue(), causing it to hit an invalid admin queue.

Yeah.  nvme_mpath_init isn't really suitable for being called on an
already initialized controller for a few reasons.  I have an idea how
to fix it, let me give it a spin.
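
One hypothetical direction (an illustrative sketch only, not an actual patch;
the split and the function names are made up here): move the one-shot setup
out of nvme_mpath_init() so that reconnect paths never re-run timer_setup()
or INIT_WORK() on live objects:

/* called exactly once, when the controller is allocated */
void nvme_mpath_init_ctrl(struct nvme_ctrl *ctrl)
{
        mutex_init(&ctrl->ana_lock);
        timer_setup(&ctrl->anatt_timer, nvme_anatt_timeout, 0);
        INIT_WORK(&ctrl->ana_work, nvme_ana_work);
}

/* called on every (re)connect: only consumes Identify data */
int nvme_mpath_init_identify(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
{
        ctrl->anacap = id->anacap;
        ctrl->anatt = id->anatt;
        /* (re)allocate the ANA log buffer as needed, then: */
        return nvme_read_ana_log(ctrl);
}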

* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Martin Wilck @ 2021-04-27 19:54 UTC
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Chao Leng, Daniel Wagner, linux-nvme, stable

On Tue, 2021-04-27 at 20:05 +0200, Hannes Reinecke wrote:
> On 4/27/21 6:25 PM, Christoph Hellwig wrote:
> > On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke wrote:
> > > As indicated in my previous mail, please change the description.
> > > We have
> > > since established an actual reason (duplicate calls to
> > > add_timer()), so
> > > please list it here.
> > 
> > So what happens if the offending add_timer is changed to mod_timer?
> > 
> I guess that should be fine, as the boilerplate said it can act
> as a safe version of add_timer.
> 
> But that would just solve the crash upon add_timer().

The code doesn't use add_timer(), only mod_timer() and
del_timer_sync(). And we didn't observe a crash upon add_timer(). What
we observed was that a timer had been enqueued multiple times, and the
kernel crashes in expire_timers()->detach_timer(), when it encounters
an already detached entry in the timer list.
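
To illustrate: detach_timer() unlinks the entry via __hlist_del(), which
blindly dereferences the entry's neighbours (paraphrased from
include/linux/list.h), so a node whose next pointer is already poisoned makes
the kernel write to a poison address:

static inline void __hlist_del(struct hlist_node *n)
{
        struct hlist_node *next = n->next;
        struct hlist_node **pprev = n->pprev;

        WRITE_ONCE(*pprev, next);
        if (next)
                /* next == LIST_POISON2 here -> general protection fault */
                WRITE_ONCE(next->pprev, pprev);
}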

Regards,
Martin




* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Hannes Reinecke @ 2021-04-28  6:35 UTC
  To: Martin Wilck, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Chao Leng, Daniel Wagner, linux-nvme, stable

On 4/27/21 9:54 PM, Martin Wilck wrote:
> On Tue, 2021-04-27 at 20:05 +0200, Hannes Reinecke wrote:
>> On 4/27/21 6:25 PM, Christoph Hellwig wrote:
>>> On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke wrote:
>>>> As indicated in my previous mail, please change the description.
>>>> We have
> >>> since established an actual reason (duplicate calls to
>>>> add_timer()), so
>>>> please list it here.
>>>
>>> So what happens if the offending add_timer is changed to mod_timer?
>>>
>> I guess that should be fine, as the boilerplate said it can act
>> as a safe version of add_timer.
>>
>> But that would just solve the crash upon add_timer().
> 
> The code doesn't use add_timer(), only mod_timer() and
> del_timer_sync(). And we didn't observe a crash upon add_timer(). What
> we observed was that a timer had been enqueued multiple times, and the
> kernel crashes in expire_timers()->detach_timer(), when it encounters
> an already detached entry in the timer list.
> 
nvme_mpath_init() doesn't use add_timer, but it uses timer_setup(). And
calling that on an already pending timer is even worse :-)

And my point is that the anatt timer is not stopped at the end of
nvme_init_identify() if any of the calls to

nvme_configure_apst()
nvme_configure_timestamp()
nvme_configure_directives()
nvme_configure_acre()

returns with an error. If any of them does, the controller is reset, causing
e.g. nvme_tcp_configure_admin_queue() to be called, which will call
timer_setup() with the original timer still running.
If the (original) timer triggers _after_ that point, we get the crash.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		        Kernel Storage Architect
hare@suse.de			               +49 911 74053 688
SUSE Software Solutions Germany GmbH, 90409 Nürnberg
GF: F. Imendörffer, HRB 36809 (AG Nürnberg)

* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Maurizio Lombardi @ 2021-04-28  6:39 UTC
  To: Martin Wilck
  Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, Sagi Grimberg,
	Chao Leng, Daniel Wagner, linux-nvme, stable

On Tue, 27 Apr 2021 at 22:02, Martin Wilck <mwilck@suse.com> wrote:
> The code doesn't use add_timer(), only mod_timer() and
> del_timer_sync(). And we didn't observe a crash upon add_timer(). What
> we observed was that a timer had been enqueued multiple times, and the
> kernel crashes in expire_timers()->detach_timer(), when it encounters
> an already detached entry in the timer list.

How can a timer end up enqueued multiple times?
It's safe to call mod_timer() against either an active or an inactive timer,
and mod_timer() is also thread-safe.

IMO the problem is due to the fact that timer_setup() could end up being
called against an active, pending timer: timer_setup() doesn't take any
lock, and it modifies the pprev pointer and the timer's flags.

Maurizio


* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Martin Wilck @ 2021-04-28  6:41 UTC
  To: Hannes Reinecke, Christoph Hellwig
  Cc: Keith Busch, Sagi Grimberg, Chao Leng, Daniel Wagner, linux-nvme, stable

On Wed, 2021-04-28 at 08:35 +0200, Hannes Reinecke wrote:
> On 4/27/21 9:54 PM, Martin Wilck wrote:
> > On Tue, 2021-04-27 at 20:05 +0200, Hannes Reinecke wrote:
> > > On 4/27/21 6:25 PM, Christoph Hellwig wrote:
> > > > On Tue, Apr 27, 2021 at 11:33:04AM +0200, Hannes Reinecke
> > > > wrote:
> > > > > As indicated in my previous mail, please change the
> > > > > description.
> > > > > We have
> > > > > since established an actual reason (duplicate calls to
> > > > > add_timer()), so
> > > > > please list it here.
> > > > 
> > > > So what happens if the offending add_timer is changed to
> > > > mod_timer?
> > > > 
> > > I guess that should be fine, as the boilerplate said it can act
> > > as a safe version of add_timer.
> > > 
> > > But that would just solve the crash upon add_timer().
> > 
> > The code doesn't use add_timer(), only mod_timer() and
> > del_timer_sync(). And we didn't observe a crash upon add_timer().
> > What
> > we observed was that a timer had been enqueued multiple times, and
> > the
> > kernel crashes in expire_timers()->detach_timer(), when it
> > encounters
> > an already detached entry in the timer list.
> > 
> nvme_mpath_init() doesn't use add_timer, but it uses timer_setup().

Yes. The notion that this is wrong was the idea behind my patch.

> And
> calling that on an already pending timer is even worse :-)
> 
> And my point is that the anatt timer is not stopped at the end of
> nvme_init_identify() if any of the calls to
> 
> nvme_configure_apst()
> nvme_configure_timestamp()
> nvme_configure_directives()
> nvme_configure_acre()
> 
> returns with an error. If any of them does, the controller is reset,
> causing e.g. nvme_tcp_configure_admin_queue() to be called, which will
> call timer_setup() with the original timer still running.
> If the (original) timer triggers _after_ that point, we get the crash.

You are right. But afaics the problem doesn't have to originate in
these 4 function calls. The same applies even after
nvme_init_identify() has returned. Any error that would trigger the
error recovery work after the anatt timer has started would have this
effect.

Regards,
Martin



* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Martin Wilck @ 2021-04-28  7:06 UTC
  To: Maurizio Lombardi
  Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, Sagi Grimberg,
	Chao Leng, Daniel Wagner, linux-nvme, stable

On Wed, 2021-04-28 at 08:39 +0200, Maurizio Lombardi wrote:
> út 27. 4. 2021 v 22:02 odesílatel Martin Wilck <mwilck@suse.com>
> napsal:
> > The code doesn't use add_timer(), only mod_timer() and
> > del_timer_sync(). And we didn't observe a crash upon add_timer().
> > What
> > we observed was that a timer had been enqueued multiple times, and
> > the
> > kernel crashes in expire_timers()->detach_timer(), when it encounters
> > an already detached entry in the timer list.
> 
> How can a timer end up enqueued multiple times?

I observed in a dump that this can happen. More precisely, I observed
the following:

[ 4389.978732] nvme nvme36: Successfully reconnected (1 attempt)
...
[ 4441.445655] nvme nvme36: Successfully reconnected (1 attempt)
...
[ 4510.891400] nvme nvme36: ANATT timeout, resetting controller.
...
[ 4517.035411] general protection fault: 0000 [#1] SMP PTI (kernel crash)

In the crash dump, I could see an anatt_timer belonging to nvme36 being
pending in the timer list, with a remaining expiry time of 44s,
suggesting it had been created around time 4441.4s. That means that at
the time the ANATT timeout was seen (which corresponds to a timer added
around 4389.9s), the timer must have been pending twice. (*)

> It's safe to call mod_timer() against either an active or an inactive
> timer, and mod_timer() is also thread-safe.
> 
> IMO the problem is due to the fact that timer_setup() could end up being
> called against an active, pending timer: timer_setup() doesn't take any
> lock, and it modifies the pprev pointer and the timer's flags.

Yes, that's what I think has happened. timer_setup() doesn't clear any
pointers in the list of pending timers pointing to this entry. If the
newly-initialized timer is then added with mod_timer(), it becomes
linked in a second timer list. When the first one expires, the timer
will be detached, but only from one of the lists it's pending in. In a
scenario like the one we faced, this could actually happen multiple
times. If the detached timer remains linked into a timer list, once
that list is traversed, the kernel dereferences a pointer with value
LIST_POISON2, and crashes.
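
This fits how mod_timer() decides whether a timer is already queued:
timer_pending() looks only at pprev (paraphrased from include/linux/timer.h),
so once timer_setup() has zeroed pprev, a still-linked timer looks idle and
can be enqueued a second time:

static inline int timer_pending(const struct timer_list *timer)
{
        return !!(timer->entry.pprev);
}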

Anybody: correct me if this is nonsense.

Best,
Martin

(*) Note that the crash had been caused by another wrongly linked anatt
timer, not this one.


* Re: [PATCH v3] nvme: rdma/tcp: fix list corruption with anatt timer
From: Maurizio Lombardi @ 2021-04-28  7:21 UTC
  To: Martin Wilck
  Cc: Hannes Reinecke, Christoph Hellwig, Keith Busch, Sagi Grimberg,
	Chao Leng, Daniel Wagner, linux-nvme, stable

On Wed, 28 Apr 2021 at 09:06, Martin Wilck <mwilck@suse.com> wrote:
>
> Yes, that's what I think has happened. timer_setup() doesn't clear any
> pointers in the list of pending timers pointing to this entry. If the
> newly-initialized timer is then added with mod_timer(), it becomes
> linked in a second timer list. When the first one expires, the timer
> will be detached, but only from one of the lists it's pending in. In a
> scenario like the one we faced, this could actually happen multiple
> times. If the detached timer remains linked into a timer list, once
> that list is traversed, the kernel dereferences a pointer with value
> LIST_POISON2, and crashes.

Yes, I think it makes sense.
timer_setup() modifies the timer's base in the "flags" field, so mod_timer()
could then add the timer to the wrong base structure.
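
For reference (paraphrased from kernel/time/timer.c and include/linux/timer.h;
details vary between versions): the CPU of the owning base is encoded in the
low bits of timer->flags, and enqueueing looks the base up from there, so
rewriting the flags on a queued timer can point a later mod_timer() at a
different base than the one the timer is actually linked into:

#define TIMER_CPUMASK   0x0003FFFF

static inline struct timer_base *get_timer_base(u32 tflags)
{
        return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}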

Maurizio

