All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] nvme-[tcp|rdma] fix for possible use-after-free
@ 2022-02-01 12:54 Sagi Grimberg
  2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sagi Grimberg @ 2022-02-01 12:54 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

A few use-after-free reports were seen in the wild with nvme-tcp when testing
ctrl reset and error recovery under load. Analysis shows that the exact same
use-after-free can happen with nvme-rdma as well. This patch series addresses
these issues for both.

Changes from v1:
- Move ctrl->state check from driver(s) .submit_async_event to core
  nvme_async_event_work (so need a single patch, not one per driver).
- omit queue state from the check - it is redundant, the ctrl state
  check is sufficient

Sagi Grimberg (3):
  nvme: fix a possible use-after-free in controller reset during load
  nvme-tcp: fix possible use-after-free in transport error_recovery work
  nvme-rdma: fix possible use-after-free in transport error_recovery
    work

 drivers/nvme/host/core.c | 2 ++
 drivers/nvme/host/rdma.c | 1 +
 drivers/nvme/host/tcp.c  | 1 +
 3 files changed, 4 insertions(+)

-- 
2.30.2



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load
  2022-02-01 12:54 [PATCH v2 0/3] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
@ 2022-02-01 12:54 ` Sagi Grimberg
  2022-02-03 14:43   ` Max Gurtovoy
  2022-02-04 12:20   ` Hannes Reinecke
  2022-02-01 12:54 ` [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
  2022-02-01 12:54 ` [PATCH v2 3/3] nvme-rdma: " Sagi Grimberg
  2 siblings, 2 replies; 10+ messages in thread
From: Sagi Grimberg @ 2022-02-01 12:54 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
readiness for AER submission. This may lead to a use-after-free
condition that was observed with nvme-tcp.

The race condition may happen in the following scenario:
1. driver executes its reset_ctrl_work
2. -> nvme_stop_ctrl - flushes ctrl async_event_work
3. ctrl sends AEN which is received by the host, which in turn
   schedules AEN handling
4. teardown admin queue (which releases the queue socket)
5. AEN processed, submits another AER, calling the driver to submit
6. driver attempts to send the cmd
==> use-after-free

In order to fix that, add ctrl state check to validate the ctrl
is actually able to accept the AER submission.

This addresses the above race in controller resets because the driver
during teardown should:
1. change ctrl state to RESETTING
2. flush async_event_work (as well as other async work elements)

So after 1,2, any other AER command will find the
ctrl state to be RESETTING and bail out without submitting the AER.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dd18861f77c0..c11cd3a814fd 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct work_struct *work)
 		container_of(work, struct nvme_ctrl, async_event_work);
 
 	nvme_aen_uevent(ctrl);
+	if (ctrl->state != NVME_CTRL_LIVE)
+		return;
 	ctrl->ops->submit_async_event(ctrl);
 }
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work
  2022-02-01 12:54 [PATCH v2 0/3] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
  2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-02-01 12:54 ` Sagi Grimberg
  2022-02-04 12:20   ` Hannes Reinecke
  2022-02-01 12:54 ` [PATCH v2 3/3] nvme-rdma: " Sagi Grimberg
  2 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2022-02-01 12:54 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

While nvme_tcp_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state.

Tested-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4ceb28675fdf..01e24b5703db 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2096,6 +2096,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
 	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
 
 	nvme_stop_keep_alive(ctrl);
+	flush_work(&ctrl->async_event_work);
 	nvme_tcp_teardown_io_queues(ctrl, false);
 	/* unquiesce to fail fast pending requests */
 	nvme_start_queues(ctrl);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 3/3] nvme-rdma: fix possible use-after-free in transport error_recovery work
  2022-02-01 12:54 [PATCH v2 0/3] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
  2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
  2022-02-01 12:54 ` [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
@ 2022-02-01 12:54 ` Sagi Grimberg
  2022-02-04 12:21   ` Hannes Reinecke
  2 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2022-02-01 12:54 UTC (permalink / raw)
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

While nvme_rdma_submit_async_event_work is checking the ctrl and queue
state before preparing the AER command and scheduling io_work, in order
to fully prevent a race where this check is not reliable the error
recovery work must flush async_event_work before continuing to destroy
the admin queue after setting the ctrl state to RESETTING such that
there is no race .submit_async_event and the error recovery handler
itself changing the ctrl state.

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/rdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 850f84d204d0..9c55e4be8a39 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1200,6 +1200,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 			struct nvme_rdma_ctrl, err_work);
 
 	nvme_stop_keep_alive(&ctrl->ctrl);
+	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
 	nvme_start_queues(&ctrl->ctrl);
 	nvme_rdma_teardown_admin_queue(ctrl, false);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load
  2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
@ 2022-02-03 14:43   ` Max Gurtovoy
  2022-02-03 15:03     ` Sagi Grimberg
  2022-02-04 12:20   ` Hannes Reinecke
  1 sibling, 1 reply; 10+ messages in thread
From: Max Gurtovoy @ 2022-02-03 14:43 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech


On 2/1/2022 2:54 PM, Sagi Grimberg wrote:
> Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
> readiness for AER submission. This may lead to a use-after-free
> condition that was observed with nvme-tcp.
>
> The race condition may happen in the following scenario:
> 1. driver executes its reset_ctrl_work
> 2. -> nvme_stop_ctrl - flushes ctrl async_event_work
> 3. ctrl sends AEN which is received by the host, which in turn
>     schedules AEN handling
> 4. teardown admin queue (which releases the queue socket)
> 5. AEN processed, submits another AER, calling the driver to submit
> 6. driver attempts to send the cmd
> ==> use-after-free
>
> In order to fix that, add ctrl state check to validate the ctrl
> is actually able to accept the AER submission.
>
> This addresses the above race in controller resets because the driver
> during teardown should:
> 1. change ctrl state to RESETTING
> 2. flush async_event_work (as well as other async work elements)
>
> So after 1,2, any other AER command will find the
> ctrl state to be RESETTING and bail out without submitting the AER.
>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/core.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index dd18861f77c0..c11cd3a814fd 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct work_struct *work)
>   		container_of(work, struct nvme_ctrl, async_event_work);
>   
>   	nvme_aen_uevent(ctrl);
> +	if (ctrl->state != NVME_CTRL_LIVE)
> +		return;

any reason you moved the queue_ready check in the transport drivers ?

Is it redundant ?


>   	ctrl->ops->submit_async_event(ctrl);
>   }
>   


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load
  2022-02-03 14:43   ` Max Gurtovoy
@ 2022-02-03 15:03     ` Sagi Grimberg
  2022-02-03 15:47       ` Max Gurtovoy
  0 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2022-02-03 15:03 UTC (permalink / raw)
  To: Max Gurtovoy, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech


>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index dd18861f77c0..c11cd3a814fd 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct 
>> work_struct *work)
>>           container_of(work, struct nvme_ctrl, async_event_work);
>>       nvme_aen_uevent(ctrl);
>> +    if (ctrl->state != NVME_CTRL_LIVE)
>> +        return;
> 
> any reason you moved the queue_ready check in the transport drivers ?
> 
> Is it redundant ?
> 

Yes, see the discussion with Christoph


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load
  2022-02-03 15:03     ` Sagi Grimberg
@ 2022-02-03 15:47       ` Max Gurtovoy
  0 siblings, 0 replies; 10+ messages in thread
From: Max Gurtovoy @ 2022-02-03 15:47 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech


On 2/3/2022 5:03 PM, Sagi Grimberg wrote:
>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index dd18861f77c0..c11cd3a814fd 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct 
>>> work_struct *work)
>>>           container_of(work, struct nvme_ctrl, async_event_work);
>>>       nvme_aen_uevent(ctrl);
>>> +    if (ctrl->state != NVME_CTRL_LIVE)
>>> +        return;
>>
>> any reason you moved the queue_ready check in the transport drivers ?
>>
>> Is it redundant ?
>>
>
> Yes, see the discussion with Christoph

The discussion was on the need for local variable, wasn't it ? not on 
the need for the check itself.

But yes, I see it's redundant. this flush you added is actually 
nvme_disable_aen (like we have nvme_start_keep_alive/nvme_stop_keep_alive).

I think it would be nice to have similar naming like we have for KA 
(nvme_enable_aen/nvme_disable_aen) but the series looks good 
with/without that,

Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load
  2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
  2022-02-03 14:43   ` Max Gurtovoy
@ 2022-02-04 12:20   ` Hannes Reinecke
  1 sibling, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2022-02-04 12:20 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

On 2/1/22 13:54, Sagi Grimberg wrote:
> Unlike .queue_rq, in .submit_async_event drivers may not check the ctrl
> readiness for AER submission. This may lead to a use-after-free
> condition that was observed with nvme-tcp.
> 
> The race condition may happen in the following scenario:
> 1. driver executes its reset_ctrl_work
> 2. -> nvme_stop_ctrl - flushes ctrl async_event_work
> 3. ctrl sends AEN which is received by the host, which in turn
>     schedules AEN handling
> 4. teardown admin queue (which releases the queue socket)
> 5. AEN processed, submits another AER, calling the driver to submit
> 6. driver attempts to send the cmd
> ==> use-after-free
> 
> In order to fix that, add ctrl state check to validate the ctrl
> is actually able to accept the AER submission.
> 
> This addresses the above race in controller resets because the driver
> during teardown should:
> 1. change ctrl state to RESETTING
> 2. flush async_event_work (as well as other async work elements)
> 
> So after 1,2, any other AER command will find the
> ctrl state to be RESETTING and bail out without submitting the AER.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/core.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index dd18861f77c0..c11cd3a814fd 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -4251,6 +4251,8 @@ static void nvme_async_event_work(struct work_struct *work)
>   		container_of(work, struct nvme_ctrl, async_event_work);
>   
>   	nvme_aen_uevent(ctrl);
> +	if (ctrl->state != NVME_CTRL_LIVE)
> +		return;
>   	ctrl->ops->submit_async_event(ctrl);
>   }
>   
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work
  2022-02-01 12:54 ` [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
@ 2022-02-04 12:20   ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2022-02-04 12:20 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

On 2/1/22 13:54, Sagi Grimberg wrote:
> While nvme_tcp_submit_async_event_work is checking the ctrl and queue
> state before preparing the AER command and scheduling io_work, in order
> to fully prevent a race where this check is not reliable the error
> recovery work must flush async_event_work before continuing to destroy
> the admin queue after setting the ctrl state to RESETTING such that
> there is no race .submit_async_event and the error recovery handler
> itself changing the ctrl state.
> 
> Tested-by: Chris Leech <cleech@redhat.com>
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/tcp.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 4ceb28675fdf..01e24b5703db 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2096,6 +2096,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
>   	struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
>   
>   	nvme_stop_keep_alive(ctrl);
> +	flush_work(&ctrl->async_event_work);
>   	nvme_tcp_teardown_io_queues(ctrl, false);
>   	/* unquiesce to fail fast pending requests */
>   	nvme_start_queues(ctrl);

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] nvme-rdma: fix possible use-after-free in transport error_recovery work
  2022-02-01 12:54 ` [PATCH v2 3/3] nvme-rdma: " Sagi Grimberg
@ 2022-02-04 12:21   ` Hannes Reinecke
  0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2022-02-04 12:21 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Chris Leech

On 2/1/22 13:54, Sagi Grimberg wrote:
> While nvme_rdma_submit_async_event_work is checking the ctrl and queue
> state before preparing the AER command and scheduling io_work, in order
> to fully prevent a race where this check is not reliable the error
> recovery work must flush async_event_work before continuing to destroy
> the admin queue after setting the ctrl state to RESETTING such that
> there is no race .submit_async_event and the error recovery handler
> itself changing the ctrl state.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>   drivers/nvme/host/rdma.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 850f84d204d0..9c55e4be8a39 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1200,6 +1200,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>   			struct nvme_rdma_ctrl, err_work);
>   
>   	nvme_stop_keep_alive(&ctrl->ctrl);
> +	flush_work(&ctrl->ctrl.async_event_work);
>   	nvme_rdma_teardown_io_queues(ctrl, false);
>   	nvme_start_queues(&ctrl->ctrl);
>   	nvme_rdma_teardown_admin_queue(ctrl, false);

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		           Kernel Storage Architect
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-02-04 12:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-01 12:54 [PATCH v2 0/3] nvme-[tcp|rdma] fix for possible use-after-free Sagi Grimberg
2022-02-01 12:54 ` [PATCH v2 1/3] nvme: fix a possible use-after-free in controller reset during load Sagi Grimberg
2022-02-03 14:43   ` Max Gurtovoy
2022-02-03 15:03     ` Sagi Grimberg
2022-02-03 15:47       ` Max Gurtovoy
2022-02-04 12:20   ` Hannes Reinecke
2022-02-01 12:54 ` [PATCH v2 2/3] nvme-tcp: fix possible use-after-free in transport error_recovery work Sagi Grimberg
2022-02-04 12:20   ` Hannes Reinecke
2022-02-01 12:54 ` [PATCH v2 3/3] nvme-rdma: " Sagi Grimberg
2022-02-04 12:21   ` Hannes Reinecke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.