All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free
@ 2022-01-30  9:21 Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load Sagi Grimberg
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Sagi Grimberg @ 2022-01-30  9:21 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

A few use-after-free reports were seen in the wild with nvme-tcp when testing
ctrl reset and error recovery under load. Analysis shows that the exact same
use-after-free can happen with nvme-rdma as well. This patch series addresses
these issues for both.

Sagi Grimberg (4):
  nvme-tcp: fix a possible use-after-free in controller reset during
    load
  nvme-tcp: fix possible use-after-free in transport error_recovery work
  nvme-rdma: fix a possible use-after-free in controller reset during
    load
  nvme-rdma: fix possible use-after-free in transport error_recovery
    work

 drivers/nvme/host/rdma.c | 5 +++++
 drivers/nvme/host/tcp.c  | 5 +++++
 2 files changed, 10 insertions(+)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-01-30  9:21 [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
@ 2022-01-30  9:21 ` Sagi Grimberg
  2022-02-01  8:03   ` Christoph Hellwig
  2022-01-30  9:21 ` [PATCH 2/4] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2022-01-30  9:21 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

Unlike .queue_rq, .submit_async_event does not check the ctrl and
queue readiness for submitting a AER. This may lead to a use-after-free
condition in the following scenario:
1. nvme_tcp_reset_ctrl_work
2. -> nvme_stop_ctrl flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the queue socket)
5. AEN processed, submits another AER, calling nvme-tcp driver to submit
   prepares the cmd and schedules io_work
6. io_work triggers and attempts to send the cmd
==> use-after-free

In order to fix that, add ctrl and queue state check to validate the driver
is actually able to accept the socket send.

This solves the above race in the reset flow because the ctrl state is
changed to RESETTING, then the async_event_work is flushed, hence from
that point, any other AER command will find the ctrl state to be RESETTING
and bail out without scheduling io_work.

CC: stable@vger.kernel.org
Tested-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4ceb28675fdf..447bae34f2cb 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2211,6 +2211,10 @@ static void nvme_tcp_submit_async_event(struct nvme_ctrl *arg)
 	struct nvme_tcp_cmd_pdu *pdu = ctrl->async_req.pdu;
 	struct nvme_command *cmd = &pdu->cmd;
 	u8 hdgst = nvme_tcp_hdgst_len(queue);
+	bool queue_ready = test_bit(NVME_TCP_Q_LIVE, &queue->flags);
+
+	if (ctrl->ctrl.state != NVME_CTRL_LIVE || !queue_ready)
+		return;
 
 	memset(pdu, 0, sizeof(*pdu));
 	pdu->hdr.type = nvme_tcp_cmd;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] nvme-tcp: fix possible use-after-free in transport error_recovery work
  2022-01-30  9:21 [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-01-30  9:21 ` Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 4/4] nvme-rdma: fix possible use-after-free in transport error_recovery work Sagi Grimberg
  3 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2022-01-30  9:21 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

While nvme_tcp_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state.

CC: stable@vger.kernel.org
Tested-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 447bae34f2cb..f8001b35a9c9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2096,6 +2096,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
 	nvme_stop_keep_alive(ctrl);
+	flush_work(&ctrl->async_event_work);
 	nvme_tcp_teardown_io_queues(ctrl, false);
 	/* unquiesce to fail fast pending requests */
 	nvme_start_queues(ctrl);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load
  2022-01-30  9:21 [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load Sagi Grimberg
  2022-01-30  9:21 ` [PATCH 2/4] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
@ 2022-01-30  9:21 ` Sagi Grimberg
  2022-02-01  8:03   ` Christoph Hellwig
  2022-01-30  9:21 ` [PATCH 4/4] nvme-rdma: fix possible use-after-free in transport error_recovery work Sagi Grimberg
  3 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2022-01-30  9:21 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

Unlike .queue_rq, .submit_async_event does not check the ctrl and
queue readiness for submitting a AER. This may lead to a use-after-free
condition in the following scenario:
1. nvme_rdma_reset_ctrl_work
2. -> nvme_stop_ctrl flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the rdma queue-pair)
5. AEN processed, submits another AER, calling nvme-rdma driver to submit
   prepares the cmd posts on the admin qp
==> use-after-free accessing an already freed rdma queue-pair

In order to fix that, add ctrl and queue state check to validate the driver
is actually able to accept the rdma qp post.

This solves the above race in the reset flow because the ctrl state is
changed to RESETTING, then the async_event_work is flushed, hence from
that point, any other AER command will find the ctrl state to be RESETTING
and bail out without posting the cmd on the rdma qp.

CC: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 850f84d204d0..780e8fbf503f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1714,9 +1714,13 @@ static void nvme_rdma_submit_async_event(struct nvme_ctrl *arg)
 	struct ib_device *dev = queue->device->dev;
 	struct nvme_rdma_qe *sqe = &ctrl->async_event_sqe;
 	struct nvme_command *cmd = sqe->data;
+	bool queue_ready = test_bit(NVME_RDMA_Q_LIVE, &queue->flags);
 	struct ib_sge sge;
 	int ret;
 
+	if (ctrl->ctrl.state != NVME_CTRL_LIVE || !queue_ready)
+		return;
+
 	ib_dma_sync_single_for_cpu(dev, sqe->dma, sizeof(*cmd), DMA_TO_DEVICE);
 
 	memset(cmd, 0, sizeof(*cmd));
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] nvme-rdma: fix possible use-after-free in transport error_recovery work
  2022-01-30  9:21 [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
                   ` (2 preceding siblings ...)
  2022-01-30  9:21 ` [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-01-30  9:21 ` Sagi Grimberg
  3 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2022-01-30  9:21 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

While nvme_rdma_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state.

CC: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 780e8fbf503f..78e787c0e103 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1200,6 +1200,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 			struct nvme_rdma_ctrl, err_work);
 
 	nvme_stop_keep_alive(&ctrl->ctrl);
+	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
 	nvme_start_queues(&ctrl->ctrl);
 	nvme_rdma_teardown_admin_queue(ctrl, false);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-01-30  9:21 ` [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-02-01  8:03   ` Christoph Hellwig
  2022-02-01 12:36     ` Sagi Grimberg
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-01  8:03 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chris Leech

On Sun, Jan 30, 2022 at 11:21:16AM +0200, Sagi Grimberg wrote:
>  	struct nvme_tcp_cmd_pdu *pdu = ctrl->async_req.pdu;
>  	struct nvme_command *cmd = &pdu->cmd;
>  	u8 hdgst = nvme_tcp_hdgst_len(queue);
> +	bool queue_ready = test_bit(NVME_TCP_Q_LIVE, &queue->flags);
> +
> +	if (ctrl->ctrl.state != NVME_CTRL_LIVE || !queue_ready)

Why do we need the local variable?

Also what prevents the controller or queue state to change just after
this check?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load
  2022-01-30  9:21 ` [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-02-01  8:03   ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-01  8:03 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, Christoph Hellwig, Keith Busch, Chris Leech

Same comment as for the TCP version here.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-02-01  8:03   ` Christoph Hellwig
@ 2022-02-01 12:36     ` Sagi Grimberg
  2022-02-01 13:58       ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2022-02-01 12:36 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, Keith Busch, Chris Leech


>>   	struct nvme_tcp_cmd_pdu *pdu = ctrl->async_req.pdu;
>>   	struct nvme_command *cmd = &pdu->cmd;
>>   	u8 hdgst = nvme_tcp_hdgst_len(queue);
>> +	bool queue_ready = test_bit(NVME_TCP_Q_LIVE, &queue->flags);
>> +
>> +	if (ctrl->ctrl.state != NVME_CTRL_LIVE || !queue_ready)
> 
> Why do we need the local variable?

I can remove the local variable, didn't want to have a multiline
if condition.

> Also what prevents the controller or queue state to change just after
> this check?

The driver will make sure to flush ctrl->async_event_work _after_
changing the controller state (it is flushed in nvme_stop_ctrl).
Only after that it will continue to free the admin queue. So if
this check passed, it is safe to submit the aer command.

I think that the ctrl->state check should be sufficient. In fact, I
think we can move it to the core instead of doing it in the drivers:
--
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd18861f77c0..d60a6e65ff75 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct 
work_struct *work)
                 container_of(work, struct nvme_ctrl, async_event_work);

         nvme_aen_uevent(ctrl);
+       if (ctrl->state != NVME_CTRL_LIVE)
+               return;
         ctrl->ops->submit_async_event(ctrl);
  }
--

I'll send a revised patch.


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-02-01 12:36     ` Sagi Grimberg
@ 2022-02-01 13:58       ` Christoph Hellwig
  2022-02-01 15:25         ` Sagi Grimberg
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-01 13:58 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Christoph Hellwig, linux-nvme, Keith Busch, Chris Leech

On Tue, Feb 01, 2022 at 02:36:03PM +0200, Sagi Grimberg wrote:
> The driver will make sure to flush ctrl->async_event_work _after_
> changing the controller state (it is flushed in nvme_stop_ctrl).
> Only after that it will continue to free the admin queue. So if
> this check passed, it is safe to submit the aer command.
>
> I think that the ctrl->state check should be sufficient. In fact, I
> think we can move it to the core instead of doing it in the drivers:

Maybe through in a comment explaining this?  Otherwise having less
checks and having them in the core is always a good thing, so I'm
in favour.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-02-01 13:58       ` Christoph Hellwig
@ 2022-02-01 15:25         ` Sagi Grimberg
  2022-02-02  8:18           ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2022-02-01 15:25 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nvme, Keith Busch, Chris Leech


> On Tue, Feb 01, 2022 at 02:36:03PM +0200, Sagi Grimberg wrote:
>> The driver will make sure to flush ctrl->async_event_work _after_
>> changing the controller state (it is flushed in nvme_stop_ctrl).
>> Only after that it will continue to free the admin queue. So if
>> this check passed, it is safe to submit the aer command.
>>
>> I think that the ctrl->state check should be sufficient. In fact, I
>> think we can move it to the core instead of doing it in the drivers:
> 
> Maybe through in a comment explaining this?  Otherwise having less
> checks and having them in the core is always a good thing, so I'm
> in favour.

I sent a v2 already. You want me to send a v3 with something like:
--
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index c11cd3a814fd..4beeb53e33e8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4253,6 +4253,12 @@ static void nvme_async_event_work(struct 
work_struct *work)
         nvme_aen_uevent(ctrl);
         if (ctrl->state != NVME_CTRL_LIVE)
                 return;
+       /*
+        * drivers must guarantee aer submission here is safe
+        * by flushing the ctrl async_event_work after changing
+        * the controller state from LIVE and before freeing the
+        * admin queue.
+        */
         ctrl->ops->submit_async_event(ctrl);
  }
--


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load
  2022-02-01 15:25         ` Sagi Grimberg
@ 2022-02-02  8:18           ` Christoph Hellwig
  0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2022-02-02  8:18 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Christoph Hellwig, linux-nvme, Keith Busch, Chris Leech

On Tue, Feb 01, 2022 at 05:25:02PM +0200, Sagi Grimberg wrote:
>
>> On Tue, Feb 01, 2022 at 02:36:03PM +0200, Sagi Grimberg wrote:
>>> The driver will make sure to flush ctrl->async_event_work _after_
>>> changing the controller state (it is flushed in nvme_stop_ctrl).
>>> Only after that it will continue to free the admin queue. So if
>>> this check passed, it is safe to submit the aer command.
>>>
>>> I think that the ctrl->state check should be sufficient. In fact, I
>>> think we can move it to the core instead of doing it in the drivers:
>>
>> Maybe through in a comment explaining this?  Otherwise having less
>> checks and having them in the core is always a good thing, so I'm
>> in favour.
>
> I sent a v2 already. You want me to send a v3 with something like:

I've added the comment manually and applied the v2 series, thanks!


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-02-02  8:19 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-30  9:21 [PATCH 0/4] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
2022-01-30  9:21 ` [PATCH 1/4] nvme-tcp: fix a possible use-after-free in controller reset during load Sagi Grimberg
2022-02-01  8:03   ` Christoph Hellwig
2022-02-01 12:36     ` Sagi Grimberg
2022-02-01 13:58       ` Christoph Hellwig
2022-02-01 15:25         ` Sagi Grimberg
2022-02-02  8:18           ` Christoph Hellwig
2022-01-30  9:21 ` [PATCH 2/4] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
2022-01-30  9:21 ` [PATCH 3/4] nvme-rdma: fix a possible use-after-free in controller reset during load Sagi Grimberg
2022-02-01  8:03   ` Christoph Hellwig
2022-01-30  9:21 ` [PATCH 4/4] nvme-rdma: fix possible use-after-free in transport error_recovery work Sagi Grimberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.