linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/1] nvmet-tcp: Fix NULL pointer dereference during release
@ 2022-09-01  6:14 zhenwei pi
  2022-09-01  6:14 ` [PATCH v3 1/1] " zhenwei pi
  0 siblings, 1 reply; 4+ messages in thread
From: zhenwei pi @ 2022-09-01  6:14 UTC (permalink / raw)
  To: hch, sagi; +Cc: kch, linux-nvme, linux-kernel, zhenwei pi

v2 -> v3:
- Keep original function name as nvmet_tcp_uninit_data_in_cmds()
- Add new function nvmet_tcp_free_cmd_data_in_buffers() to free CMD buffers
- Rework against http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-6.0
- Move nvmet_tcp_free_cmd_data_in_buffers() after cancel_work_sync(&queue->io_work).
  Hi @Sagi, if you still prefer to clear the commands that are still waiting for
  data from the wire, please let me know.

v1 -> v2:
- Separate nvmet_tcp_uninit_data_in_cmds() into nvmet_tcp_uninit_req()
  & nvmet_tcp_uninit_data_in_cmds()(free buffers only).

v1:
- Move nvmet_tcp_uninit_data_in_cmds() after nvmet_sq_destroy()

zhenwei pi (1):
  nvmet-tcp: Fix NULL pointer dereference during release

 drivers/nvme/target/tcp.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v3 1/1] nvmet-tcp: Fix NULL pointer dereference during release
  2022-09-01  6:14 [PATCH v3 0/1] nvmet-tcp: Fix NULL pointer dereference during release zhenwei pi
@ 2022-09-01  6:14 ` zhenwei pi
  2022-09-01 10:52   ` Sagi Grimberg
  0 siblings, 1 reply; 4+ messages in thread
From: zhenwei pi @ 2022-09-01  6:14 UTC (permalink / raw)
  To: hch, sagi; +Cc: kch, linux-nvme, linux-kernel, zhenwei pi

nvmet-tcp frees CMD buffers in nvmet_tcp_uninit_data_in_cmds(),
and waits the inflight IO requests in nvmet_sq_destroy(). During wait
the inflight IO requests, the callback nvmet_tcp_queue_response()
is called from backend after IO complete, this leads a typical
Use-After-Free issue like this:

 BUG: kernel NULL pointer dereference, address: 0000000000000008
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 14 PID: 207 Comm: kworker/14:1H Kdump: loaded Tainted: G            E      6.0.0-rc2.bm.1-amd64 #12
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
 Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
 RIP: 0010:shash_ahash_digest+0x2b/0x110
 Code: 1f 44 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 48 89 f3 48 83 ec 08 44 8b 67 30 45 85 e4 74 1c 48 8b 57 38 b8 00 10 00 00 <44> 8b 7a 08 44 29 f8 39 42 0c 0f 46 42 0c 41 39 c4 76 43 48 8b 03
 RSP: 0018:ffffc900006e3dd8 EFLAGS: 00010206
 RAX: 0000000000001000 RBX: ffff888104ac1650 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff888104ac1650 RDI: ffff888104ac1600
 RBP: ffff888104ac1600 R08: ffff8881073980c8 R09: ffff8881057798b8
 R10: 8080808080808080 R11: 0000000000000000 R12: 0000000000001000
 R13: 0000000000000000 R14: ffff88810601a1cc R15: ffff888107398000
 FS:  0000000000000000(0000) GS:ffff88823fd80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000008 CR3: 000000010a8e8000 CR4: 0000000000350ee0
 Call Trace:
  <TASK>
  nvmet_tcp_io_work+0xa1c/0xb1c [nvmet_tcp]
  ? __switch_to+0x106/0x420
  process_one_work+0x1ae/0x380
  ? process_one_work+0x380/0x380
  worker_thread+0x30/0x360
  ? process_one_work+0x380/0x380
  kthread+0xe6/0x110
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30

Suggested by Sagi, separate nvmet_tcp_uninit_data_in_cmds() into two
steps:
  uninit data in cmds                  <- new step 1
  nvmet_sq_destroy();
  cancel_work_sync(&queue->io_work);
  free CMD buffers                     <- new step 2

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
---
 drivers/nvme/target/tcp.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index a3694a32f6d5..e18eae5ef90b 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1423,14 +1423,27 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
 	for (i = 0; i < queue->nr_cmds; i++, cmd++) {
 		if (nvmet_tcp_need_data_in(cmd))
 			nvmet_req_uninit(&cmd->req);
+	}
+
+	if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
+		/* failed in connect */
+		nvmet_req_uninit(&queue->connect.req);
+	}
+}
 
+static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue *queue)
+{
+	struct nvmet_tcp_cmd *cmd = queue->cmds;
+	int i;
+
+	for (i = 0; i < queue->nr_cmds; i++, cmd++) {
 		nvmet_tcp_unmap_pdu_iovec(cmd);
 		nvmet_tcp_free_cmd_buffers(cmd);
 	}
 
 	if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
-		/* failed in connect */
-		nvmet_tcp_finish_cmd(&queue->connect);
+		nvmet_tcp_unmap_pdu_iovec(&queue->connect);
+		nvmet_tcp_free_cmd_buffers(&queue->connect);
 	}
 }
 
@@ -1452,6 +1465,7 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
 	nvmet_tcp_uninit_data_in_cmds(queue);
 	nvmet_sq_destroy(&queue->nvme_sq);
 	cancel_work_sync(&queue->io_work);
+	nvmet_tcp_free_cmd_data_in_buffers(queue);
 	sock_release(queue->sock);
 	nvmet_tcp_free_cmds(queue);
 	if (queue->hdr_digest || queue->data_digest)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v3 1/1] nvmet-tcp: Fix NULL pointer dereference during release
  2022-09-01  6:14 ` [PATCH v3 1/1] " zhenwei pi
@ 2022-09-01 10:52   ` Sagi Grimberg
  2022-09-02  2:37     ` zhenwei pi
  0 siblings, 1 reply; 4+ messages in thread
From: Sagi Grimberg @ 2022-09-01 10:52 UTC (permalink / raw)
  To: zhenwei pi, hch; +Cc: kch, linux-nvme, linux-kernel



On 9/1/22 09:14, zhenwei pi wrote:
> nvmet-tcp frees CMD buffers in nvmet_tcp_uninit_data_in_cmds(),
> and waits the inflight IO requests in nvmet_sq_destroy(). During wait
> the inflight IO requests, the callback nvmet_tcp_queue_response()
> is called from backend after IO complete, this leads a typical
> Use-After-Free issue like this:
> 
>   BUG: kernel NULL pointer dereference, address: 0000000000000008
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] PREEMPT SMP NOPTI
>   CPU: 14 PID: 207 Comm: kworker/14:1H Kdump: loaded Tainted: G            E      6.0.0-rc2.bm.1-amd64 #12
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>   Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
>   RIP: 0010:shash_ahash_digest+0x2b/0x110
>   Code: 1f 44 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 48 89 f3 48 83 ec 08 44 8b 67 30 45 85 e4 74 1c 48 8b 57 38 b8 00 10 00 00 <44> 8b 7a 08 44 29 f8 39 42 0c 0f 46 42 0c 41 39 c4 76 43 48 8b 03
>   RSP: 0018:ffffc900006e3dd8 EFLAGS: 00010206
>   RAX: 0000000000001000 RBX: ffff888104ac1650 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffff888104ac1650 RDI: ffff888104ac1600
>   RBP: ffff888104ac1600 R08: ffff8881073980c8 R09: ffff8881057798b8
>   R10: 8080808080808080 R11: 0000000000000000 R12: 0000000000001000
>   R13: 0000000000000000 R14: ffff88810601a1cc R15: ffff888107398000
>   FS:  0000000000000000(0000) GS:ffff88823fd80000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000000000000008 CR3: 000000010a8e8000 CR4: 0000000000350ee0
>   Call Trace:
>    <TASK>
>    nvmet_tcp_io_work+0xa1c/0xb1c [nvmet_tcp]
>    ? __switch_to+0x106/0x420
>    process_one_work+0x1ae/0x380
>    ? process_one_work+0x380/0x380
>    worker_thread+0x30/0x360
>    ? process_one_work+0x380/0x380
>    kthread+0xe6/0x110
>    ? kthread_complete_and_exit+0x20/0x20
>    ret_from_fork+0x1f/0x30
> 
> Suggested by Sagi, separate nvmet_tcp_uninit_data_in_cmds() into two
> steps:
>    uninit data in cmds                  <- new step 1
>    nvmet_sq_destroy();
>    cancel_work_sync(&queue->io_work);
>    free CMD buffers                     <- new step 2
> 
> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
> ---
>   drivers/nvme/target/tcp.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
> index a3694a32f6d5..e18eae5ef90b 100644
> --- a/drivers/nvme/target/tcp.c
> +++ b/drivers/nvme/target/tcp.c
> @@ -1423,14 +1423,27 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
>   	for (i = 0; i < queue->nr_cmds; i++, cmd++) {
>   		if (nvmet_tcp_need_data_in(cmd))
>   			nvmet_req_uninit(&cmd->req);
> +	}
> +
> +	if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
> +		/* failed in connect */
> +		nvmet_req_uninit(&queue->connect.req);
> +	}
> +}
>   
> +static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue *queue)
> +{
> +	struct nvmet_tcp_cmd *cmd = queue->cmds;
> +	int i;
> +
> +	for (i = 0; i < queue->nr_cmds; i++, cmd++) {

		if (nvmet_tcp_need_data_in(cmd))

>   		nvmet_tcp_unmap_pdu_iovec(cmd);

This function should no longer exist in the code.

>   		nvmet_tcp_free_cmd_buffers(cmd);
>   	}
>   
>   	if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
> -		/* failed in connect */
> -		nvmet_tcp_finish_cmd(&queue->connect);
> +		nvmet_tcp_unmap_pdu_iovec(&queue->connect);

This function should no longer exist in the code.

> +		nvmet_tcp_free_cmd_buffers(&queue->connect);
>   	}
>   }
>   
> @@ -1452,6 +1465,7 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
>   	nvmet_tcp_uninit_data_in_cmds(queue);
>   	nvmet_sq_destroy(&queue->nvme_sq);
>   	cancel_work_sync(&queue->io_work);
> +	nvmet_tcp_free_cmd_data_in_buffers(queue);
>   	sock_release(queue->sock);
>   	nvmet_tcp_free_cmds(queue);
>   	if (queue->hdr_digest || queue->data_digest)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Re: [PATCH v3 1/1] nvmet-tcp: Fix NULL pointer dereference during release
  2022-09-01 10:52   ` Sagi Grimberg
@ 2022-09-02  2:37     ` zhenwei pi
  0 siblings, 0 replies; 4+ messages in thread
From: zhenwei pi @ 2022-09-02  2:37 UTC (permalink / raw)
  To: Sagi Grimberg, hch; +Cc: kch, linux-nvme, linux-kernel

On 9/1/22 18:52, Sagi Grimberg wrote:
> 
> 
> On 9/1/22 09:14, zhenwei pi wrote:
>> nvmet-tcp frees CMD buffers in nvmet_tcp_uninit_data_in_cmds(),
>> and waits the inflight IO requests in nvmet_sq_destroy(). During wait
>> the inflight IO requests, the callback nvmet_tcp_queue_response()
>> is called from backend after IO complete, this leads a typical
>> Use-After-Free issue like this:
>>
>>   BUG: kernel NULL pointer dereference, address: 0000000000000008
>>   #PF: supervisor read access in kernel mode
>>   #PF: error_code(0x0000) - not-present page
>>   PGD 0 P4D 0
>>   Oops: 0000 [#1] PREEMPT SMP NOPTI
>>   CPU: 14 PID: 207 Comm: kworker/14:1H Kdump: loaded Tainted: 
>> G            E      6.0.0-rc2.bm.1-amd64 #12
>>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>>   Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
>>   RIP: 0010:shash_ahash_digest+0x2b/0x110
>>   Code: 1f 44 00 00 41 57 41 56 41 55 41 54 55 48 89 fd 53 48 89 f3 48 
>> 83 ec 08 44 8b 67 30 45 85 e4 74 1c 48 8b 57 38 b8 00 10 00 00 <44> 8b 
>> 7a 08 44 29 f8 39 42 0c 0f 46 42 0c 41 39 c4 76 43 48 8b 03
>>   RSP: 0018:ffffc900006e3dd8 EFLAGS: 00010206
>>   RAX: 0000000000001000 RBX: ffff888104ac1650 RCX: 0000000000000000
>>   RDX: 0000000000000000 RSI: ffff888104ac1650 RDI: ffff888104ac1600
>>   RBP: ffff888104ac1600 R08: ffff8881073980c8 R09: ffff8881057798b8
>>   R10: 8080808080808080 R11: 0000000000000000 R12: 0000000000001000
>>   R13: 0000000000000000 R14: ffff88810601a1cc R15: ffff888107398000
>>   FS:  0000000000000000(0000) GS:ffff88823fd80000(0000) 
>> knlGS:0000000000000000
>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>   CR2: 0000000000000008 CR3: 000000010a8e8000 CR4: 0000000000350ee0
>>   Call Trace:
>>    <TASK>
>>    nvmet_tcp_io_work+0xa1c/0xb1c [nvmet_tcp]
>>    ? __switch_to+0x106/0x420
>>    process_one_work+0x1ae/0x380
>>    ? process_one_work+0x380/0x380
>>    worker_thread+0x30/0x360
>>    ? process_one_work+0x380/0x380
>>    kthread+0xe6/0x110
>>    ? kthread_complete_and_exit+0x20/0x20
>>    ret_from_fork+0x1f/0x30
>>
>> Suggested by Sagi, separate nvmet_tcp_uninit_data_in_cmds() into two
>> steps:
>>    uninit data in cmds                  <- new step 1
>>    nvmet_sq_destroy();
>>    cancel_work_sync(&queue->io_work);
>>    free CMD buffers                     <- new step 2
>>
>> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
>> ---
>>   drivers/nvme/target/tcp.c | 18 ++++++++++++++++--
>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index a3694a32f6d5..e18eae5ef90b 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -1423,14 +1423,27 @@ static void 
>> nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
>>       for (i = 0; i < queue->nr_cmds; i++, cmd++) {
>>           if (nvmet_tcp_need_data_in(cmd))
>>               nvmet_req_uninit(&cmd->req);
>> +    }
>> +
>> +    if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
>> +        /* failed in connect */
>> +        nvmet_req_uninit(&queue->connect.req);
>> +    }
>> +}
>> +static void nvmet_tcp_free_cmd_data_in_buffers(struct nvmet_tcp_queue 
>> *queue)
>> +{
>> +    struct nvmet_tcp_cmd *cmd = queue->cmds;
>> +    int i;
>> +
>> +    for (i = 0; i < queue->nr_cmds; i++, cmd++) {
> 
>          if (nvmet_tcp_need_data_in(cmd))
> 
>>           nvmet_tcp_unmap_pdu_iovec(cmd);
> 
> This function should no longer exist in the code.
> 

Hi, Sagi

Sorry, I didn't notice this patch is on the way...
https://lore.kernel.org/linux-nvme/Yw%2F0cZFCQQpx3hhZ@ZenIV/T/#t

I'll send the v4 after this patch apply. Thanks!

>>           nvmet_tcp_free_cmd_buffers(cmd);
>>       }
>>       if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
>> -        /* failed in connect */
>> -        nvmet_tcp_finish_cmd(&queue->connect);
>> +        nvmet_tcp_unmap_pdu_iovec(&queue->connect);
> 
> This function should no longer exist in the code.
> 
>> +        nvmet_tcp_free_cmd_buffers(&queue->connect);
>>       }
>>   }
>> @@ -1452,6 +1465,7 @@ static void nvmet_tcp_release_queue_work(struct 
>> work_struct *w)
>>       nvmet_tcp_uninit_data_in_cmds(queue);
>>       nvmet_sq_destroy(&queue->nvme_sq);
>>       cancel_work_sync(&queue->io_work);
>> +    nvmet_tcp_free_cmd_data_in_buffers(queue);
>>       sock_release(queue->sock);
>>       nvmet_tcp_free_cmds(queue);
>>       if (queue->hdr_digest || queue->data_digest)

-- 
zhenwei pi

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-09-02  2:37 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-01  6:14 [PATCH v3 0/1] nvmet-tcp: Fix NULL pointer dereference during release zhenwei pi
2022-09-01  6:14 ` [PATCH v3 1/1] " zhenwei pi
2022-09-01 10:52   ` Sagi Grimberg
2022-09-02  2:37     ` zhenwei pi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).