* [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
@ 2022-10-30  3:04 Li Zhijian
  2022-11-12  3:29 ` lizhijian
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Li Zhijian @ 2022-10-30  3:04 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma; +Cc: Bob Pearson, linux-kernel, Li Zhijian

When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
free mr->map again, causing a double free.

[43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
[43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[43895.945208] Call Trace:
[43895.946130]  <TASK>
[43895.946931]  dump_stack_lvl+0x45/0x5d
[43895.948049]  panic+0x19e/0x349
[43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
[43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[43895.952589]  ? preempt_count_sub+0x14/0xc0
[43895.953809]  end_report.part.0+0x54/0x7c
[43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.956406]  kasan_report.cold+0xa/0xf
[43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
[43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
[43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
[43895.964921]  ? __lock_acquire+0x876/0x31e0
[43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
[43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
[43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
[43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
[43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
[43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]

This issue was first exposed by
commit b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code"),
then fixed by
commit 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()"),
but that fix was reverted along with the rest by
commit 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")

Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index d4f10c2d1aa7..7c99d1591580 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
 		kfree(mr->map[i]);
 
 	kfree(mr->map);
+	mr->map = NULL;
 err1:
 	return -ENOMEM;
 }
@@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 	int			num_buf;
 	void			*vaddr;
 	int err;
-	int i;
 
 	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
 	if (IS_ERR(umem)) {
@@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 				pr_warn("%s: Unable to get virtual address\n",
 						__func__);
 				err = -ENOMEM;
-				goto err_cleanup_map;
+				goto err_release_umem;
 			}
-
 			buf->addr = (uintptr_t)vaddr;
 			buf->size = PAGE_SIZE;
 			num_buf++;
@@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 
 	return 0;
 
-err_cleanup_map:
-	for (i = 0; i < mr->num_map; i++)
-		kfree(mr->map[i]);
-	kfree(mr->map);
 err_release_umem:
 	ib_umem_release(umem);
 err_out:
-- 
1.8.3.1



* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-10-30  3:04 [PATCH for-next v2] RDMA/rxe: Fix mr->map double free Li Zhijian
@ 2022-11-12  3:29 ` lizhijian
  2022-11-13 12:40   ` Yanjun Zhu
  2022-11-16  0:10 ` Yanjun Zhu
  2022-11-19  0:15 ` Jason Gunthorpe
  2 siblings, 1 reply; 14+ messages in thread
From: lizhijian @ 2022-11-12  3:29 UTC (permalink / raw)
  To: zyjzyj2000
  Cc: Bob Pearson, linux-kernel, linux-rdma, Leon Romanovsky, Jason Gunthorpe

ping...

Hi Yanjun


I hope you could take a look at this bug fix sooner.


Thanks
Zhijian



On 30/10/2022 11:04, Li Zhijian wrote:
> When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
> free mr->map again, causing a double free.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> This issue was first exposed by
> commit b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code"),
> then fixed by
> commit 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()"),
> but that fix was reverted along with the rest by
> commit 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> 
> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index d4f10c2d1aa7..7c99d1591580 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>   		kfree(mr->map[i]);
>   
>   	kfree(mr->map);
> +	mr->map = NULL;
>   err1:
>   	return -ENOMEM;
>   }
> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   	int			num_buf;
>   	void			*vaddr;
>   	int err;
> -	int i;
>   
>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>   	if (IS_ERR(umem)) {
> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   				pr_warn("%s: Unable to get virtual address\n",
>   						__func__);
>   				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;
>   			}
> -
>   			buf->addr = (uintptr_t)vaddr;
>   			buf->size = PAGE_SIZE;
>   			num_buf++;
> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   
>   	return 0;
>   
> -err_cleanup_map:
> -	for (i = 0; i < mr->num_map; i++)
> -		kfree(mr->map[i]);
> -	kfree(mr->map);
>   err_release_umem:
>   	ib_umem_release(umem);
>   err_out:


* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-11-12  3:29 ` lizhijian
@ 2022-11-13 12:40   ` Yanjun Zhu
  0 siblings, 0 replies; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-13 12:40 UTC (permalink / raw)
  To: lizhijian, zyjzyj2000
  Cc: Bob Pearson, linux-kernel, linux-rdma, Leon Romanovsky, Jason Gunthorpe

On 2022/11/12 11:29, lizhijian@fujitsu.com wrote:
> ping...
Hi, Zhijian

The changes are substantial. I need time to delve into them.

Zhu Yanjun

> 
> Hi Yanjun
> 
> 
> I hope you could take a look at this bug fix sooner.
> 
> 
> Thanks
> Zhijian
> 
> 
> 
> On 30/10/2022 11:04, Li Zhijian wrote:
>> When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
>> free mr->map again, causing a double free.
>>
>> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
>> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>> [43895.945208] Call Trace:
>> [43895.946130]  <TASK>
>> [43895.946931]  dump_stack_lvl+0x45/0x5d
>> [43895.948049]  panic+0x19e/0x349
>> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
>> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [43895.952589]  ? preempt_count_sub+0x14/0xc0
>> [43895.953809]  end_report.part.0+0x54/0x7c
>> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.956406]  kasan_report.cold+0xa/0xf
>> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
>> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
>> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
>> [43895.964921]  ? __lock_acquire+0x876/0x31e0
>> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
>> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
>> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
>> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
>> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
>> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
>>
>> This issue was first exposed by
>> commit b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code"),
>> then fixed by
>> commit 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()"),
>> but that fix was reverted along with the rest by
>> commit 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
>>
>> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>> ---
>>    drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>>    1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index d4f10c2d1aa7..7c99d1591580 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>>    		kfree(mr->map[i]);
>>    
>>    	kfree(mr->map);
>> +	mr->map = NULL;
>>    err1:
>>    	return -ENOMEM;
>>    }
>> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    	int			num_buf;
>>    	void			*vaddr;
>>    	int err;
>> -	int i;
>>    
>>    	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>>    	if (IS_ERR(umem)) {
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    				pr_warn("%s: Unable to get virtual address\n",
>>    						__func__);
>>    				err = -ENOMEM;
>> -				goto err_cleanup_map;
>> +				goto err_release_umem;
>>    			}
>> -
>>    			buf->addr = (uintptr_t)vaddr;
>>    			buf->size = PAGE_SIZE;
>>    			num_buf++;
>> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    
>>    	return 0;
>>    
>> -err_cleanup_map:
>> -	for (i = 0; i < mr->num_map; i++)
>> -		kfree(mr->map[i]);
>> -	kfree(mr->map);
>>    err_release_umem:
>>    	ib_umem_release(umem);
>>    err_out:



* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-10-30  3:04 [PATCH for-next v2] RDMA/rxe: Fix mr->map double free Li Zhijian
  2022-11-12  3:29 ` lizhijian
@ 2022-11-16  0:10 ` Yanjun Zhu
  2022-11-16  2:39   ` lizhijian
  2022-11-19  0:15 ` Jason Gunthorpe
  2 siblings, 1 reply; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-16  0:10 UTC (permalink / raw)
  To: Li Zhijian, zyjzyj2000, jgg, leon, linux-rdma; +Cc: Bob Pearson, linux-kernel

On 2022/10/30 11:04, Li Zhijian wrote:
> When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
> free mr->map again, causing a double free.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> This issue was first exposed by
> commit b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code"),
> then fixed by
> commit 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()"),
> but that fix was reverted along with the rest by
> commit 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> 
> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index d4f10c2d1aa7..7c99d1591580 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>   		kfree(mr->map[i]);
>   
>   	kfree(mr->map);
> +	mr->map = NULL;
>   err1:
>   	return -ENOMEM;
>   }
> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   	int			num_buf;
>   	void			*vaddr;
>   	int err;
> -	int i;
>   
>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>   	if (IS_ERR(umem)) {
> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   				pr_warn("%s: Unable to get virtual address\n",
>   						__func__);
>   				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;

This call trace results from page_address() returning NULL, after which
we goto err_cleanup_map, where mr->map[i] and mr->map are freed.

When rxe_reg_user_mr() then gets an error from rxe_mr_init_user(),
rxe_mr_cleanup() is called to tear down the mr, freeing mr->map[i] and
mr->map again.

So mr->map[i] and mr->map are double freed.

As such, this commit is reasonable.

But why would page_address() return NULL?

Zhu Yanjun

>   			}
> -
>   			buf->addr = (uintptr_t)vaddr;
>   			buf->size = PAGE_SIZE;
>   			num_buf++;
> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   
>   	return 0;
>   
> -err_cleanup_map:
> -	for (i = 0; i < mr->num_map; i++)
> -		kfree(mr->map[i]);
> -	kfree(mr->map);
>   err_release_umem:
>   	ib_umem_release(umem);
>   err_out:



* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-11-16  0:10 ` Yanjun Zhu
@ 2022-11-16  2:39   ` lizhijian
  0 siblings, 0 replies; 14+ messages in thread
From: lizhijian @ 2022-11-16  2:39 UTC (permalink / raw)
  To: Yanjun Zhu, zyjzyj2000, jgg, leon, linux-rdma; +Cc: Bob Pearson, linux-kernel



On 16/11/2022 08:10, Yanjun Zhu wrote:
>>
>> index d4f10c2d1aa7..7c99d1591580 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>>           kfree(mr->map[i]);
>>       kfree(mr->map);
>> +    mr->map = NULL;
>>   err1:
>>       return -ENOMEM;
>>   }
>> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 
>> start, u64 length, u64 iova,
>>       int            num_buf;
>>       void            *vaddr;
>>       int err;
>> -    int i;
>>       umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>>       if (IS_ERR(umem)) {
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 
>> start, u64 length, u64 iova,
>>                   pr_warn("%s: Unable to get virtual address\n",
>>                           __func__);
>>                   err = -ENOMEM;
>> -                goto err_cleanup_map;
>> +                goto err_release_umem;
> 
> This call trace results from page_address() returning NULL, after which 
> we goto err_cleanup_map, where mr->map[i] and mr->map are freed.
> 
> When rxe_reg_user_mr() then gets an error from rxe_mr_init_user(), 
> rxe_mr_cleanup() is called to tear down the mr, freeing mr->map[i] and 
> mr->map again.
> 
> So mr->map[i] and mr->map are double freed.
> 
> As such, this commit is reasonable.
> 
> But why would page_address() return NULL?

ENOMEM? But I don't think we need to care too much about the reason.

This patch essentially ports back the reverted commit
8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()").

Actually, the double free can also be triggered by the error path below:

        err = rxe_mr_alloc(mr, num_buf);
        if (err) {
                pr_warn("%s: Unable to allocate memory for map\n",
                                __func__);
                goto err_release_umem;
        }

where rxe_mr_alloc() freed the memory but didn't set 'mr->map = NULL'.

Thanks
Zhijian


> 
> Zhu Yanjun
> 


* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-10-30  3:04 [PATCH for-next v2] RDMA/rxe: Fix mr->map double free Li Zhijian
  2022-11-12  3:29 ` lizhijian
  2022-11-16  0:10 ` Yanjun Zhu
@ 2022-11-19  0:15 ` Jason Gunthorpe
  2022-11-19  8:25   ` Yanjun Zhu
  2 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-11-19  0:15 UTC (permalink / raw)
  To: Li Zhijian; +Cc: zyjzyj2000, leon, linux-rdma, Bob Pearson, linux-kernel

On Sun, Oct 30, 2022 at 03:04:33AM +0000, Li Zhijian wrote:
> When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
> free mr->map again, causing a double free.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]

Please don't include timestamps in commit messages.

> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>  				pr_warn("%s: Unable to get virtual address\n",
>  						__func__);
>  				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;
>  			}
> -

page_address() fails if this is a highmem system and the page hasn't
been kmap'd yet. So the right thing to do is to use kmap().

But this looks right, so applied to for-next

Thanks,
Jason


* Re: [PATCH for-next v2] RDMA/rxe: Fix mr->map double free
  2022-11-19  0:15 ` Jason Gunthorpe
@ 2022-11-19  8:25   ` Yanjun Zhu
  2022-11-20  1:29     ` [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem Zhu Yanjun
  0 siblings, 1 reply; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-19  8:25 UTC (permalink / raw)
  To: Jason Gunthorpe, Li Zhijian
  Cc: zyjzyj2000, leon, linux-rdma, Bob Pearson, linux-kernel

On 2022/11/19 8:15, Jason Gunthorpe wrote:
> On Sun, Oct 30, 2022 at 03:04:33AM +0000, Li Zhijian wrote:
>> When rxe_mr_init_user() fails, rxe_mr_cleanup() is called and tries to
>> free mr->map again, causing a double free.
>>
>> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
>> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>> [43895.945208] Call Trace:
>> [43895.946130]  <TASK>
>> [43895.946931]  dump_stack_lvl+0x45/0x5d
>> [43895.948049]  panic+0x19e/0x349
>> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
>> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [43895.952589]  ? preempt_count_sub+0x14/0xc0
>> [43895.953809]  end_report.part.0+0x54/0x7c
>> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.956406]  kasan_report.cold+0xa/0xf
>> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
>> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
>> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
>> [43895.964921]  ? __lock_acquire+0x876/0x31e0
>> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
>> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
>> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
>> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
>> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
>> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> Please dont include timestamps in commit messages
> 
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>   				pr_warn("%s: Unable to get virtual address\n",
>>   						__func__);
>>   				err = -ENOMEM;
>> -				goto err_cleanup_map;
>> +				goto err_release_umem;
>>   			}
>> -
> 
> page_address() fails if this is a highmem system and the page hasn't
> been kmap'd yet. So the right thing to do is to use kmap..

Sure.

sgt_append.sgt is allocated in ib_umem_get(), where
sg_alloc_append_table_from_pages() is called to allocate the memory:

struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
                            size_t size, int access)
{
...
                ret = sg_alloc_append_table_from_pages(
                        &umem->sgt_append, page_list, pinned, 0,
                        pinned << PAGE_SHIFT, ib_dma_max_seg_size(device),
                        npages, GFP_KERNEL);
...

And it seems that it is not highmem.

So page_address() will not return NULL?

As such, it is not necessary to test the return value of page_address()?

If so, can we add a new commit to remove the testing of the return value
of page_address()?

Zhu Yanjun

> 
> But this looks right, so applied to for-next
> 
> Thanks,
> Jason



* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-20  1:29     ` [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem Zhu Yanjun
@ 2022-11-19 10:30       ` lizhijian
  2022-11-19 13:20         ` Yanjun Zhu
  2022-11-21 15:28       ` Jason Gunthorpe
  1 sibling, 1 reply; 14+ messages in thread
From: lizhijian @ 2022-11-19 10:30 UTC (permalink / raw)
  To: Zhu Yanjun, zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun



There was a thread that tries to refactor iova_to_vaddr()[1][2], where
page_address() will be dropped. If so, your changes will no longer be needed.

[1]https://lore.kernel.org/lkml/a7decec2-77d9-db4c-ff66-ff597da8bc71@fujitsu.com/T/
[2]https://www.spinics.net/lists/linux-rdma/msg114053.html

Thanks
Zhijian


On 20/11/2022 09:29, Zhu Yanjun wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> In ib_umem_get, sgt_append is allocated from the function
> sg_alloc_append_table_from_pages. And it is not from highmem.
> 
> As such, the return value of page_address will not be NULL.
> 
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index b1423000e4bc..3f33a4cdef24 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   	struct ib_umem		*umem;
>   	struct sg_page_iter	sg_iter;
>   	int			num_buf;
> -	void			*vaddr;
>   	int err;
>   
>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
> @@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   		buf = map[0]->buf;
>   
>   		for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
> +			void *vaddr;
> +
>   			if (num_buf >= RXE_BUF_PER_MAP) {
>   				map++;
>   				buf = map[0]->buf;
> @@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   			}
>   
>   			vaddr = page_address(sg_page_iter_page(&sg_iter));
> -			if (!vaddr) {
> -				rxe_dbg_mr(mr, "Unable to get virtual address\n");
> -				err = -ENOMEM;
> -				goto err_release_umem;
> -			}
>   			buf->addr = (uintptr_t)vaddr;
>   			buf->size = PAGE_SIZE;
>   			num_buf++;
>   			buf++;
> -
>   		}
>   	}
>   


* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-19 10:30       ` lizhijian
@ 2022-11-19 13:20         ` Yanjun Zhu
  0 siblings, 0 replies; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-19 13:20 UTC (permalink / raw)
  To: lizhijian, Zhu Yanjun, zyjzyj2000, jgg, leon, linux-rdma


On 2022/11/19 18:30, lizhijian@fujitsu.com wrote:
>
> There was a thread that tries to refactor iova_to_vaddr()[1][2], where
> page_address() will be dropped. If so, your changes will no longer be needed.

My commit is to indicate that your commit is not good enough.

And you should put all the related commits in one patch series.

Zhu Yanjun

>
> [1]https://lore.kernel.org/lkml/a7decec2-77d9-db4c-ff66-ff597da8bc71@fujitsu.com/T/
> [2]https://www.spinics.net/lists/linux-rdma/msg114053.html
>
> Thanks
> Zhijian
>
>
> On 20/11/2022 09:29, Zhu Yanjun wrote:
>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>
>> In ib_umem_get, sgt_append is allocated from the function
>> sg_alloc_append_table_from_pages. And it is not from highmem.
>>
>> As such, the return value of page_address will not be NULL.
>>
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>> ---
>>    drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>>    1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index b1423000e4bc..3f33a4cdef24 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    	struct ib_umem		*umem;
>>    	struct sg_page_iter	sg_iter;
>>    	int			num_buf;
>> -	void			*vaddr;
>>    	int err;
>>    
>>    	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>> @@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    		buf = map[0]->buf;
>>    
>>    		for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
>> +			void *vaddr;
>> +
>>    			if (num_buf >= RXE_BUF_PER_MAP) {
>>    				map++;
>>    				buf = map[0]->buf;
>> @@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    			}
>>    
>>    			vaddr = page_address(sg_page_iter_page(&sg_iter));
>> -			if (!vaddr) {
>> -				rxe_dbg_mr(mr, "Unable to get virtual address\n");
>> -				err = -ENOMEM;
>> -				goto err_release_umem;
>> -			}
>>    			buf->addr = (uintptr_t)vaddr;
>>    			buf->size = PAGE_SIZE;
>>    			num_buf++;
>>    			buf++;
>> -
>>    		}
>>    	}
>>    


* [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-19  8:25   ` Yanjun Zhu
@ 2022-11-20  1:29     ` Zhu Yanjun
  2022-11-19 10:30       ` lizhijian
  2022-11-21 15:28       ` Jason Gunthorpe
  0 siblings, 2 replies; 14+ messages in thread
From: Zhu Yanjun @ 2022-11-20  1:29 UTC (permalink / raw)
  To: yanjun.zhu, zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun

From: Zhu Yanjun <yanjun.zhu@linux.dev>

In ib_umem_get, sgt_append is allocated from the function
sg_alloc_append_table_from_pages. And it is not from highmem.

As such, the return value of page_address will not be NULL.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index b1423000e4bc..3f33a4cdef24 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -119,7 +119,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 	struct ib_umem		*umem;
 	struct sg_page_iter	sg_iter;
 	int			num_buf;
-	void			*vaddr;
 	int err;
 
 	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
@@ -149,6 +148,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 		buf = map[0]->buf;
 
 		for_each_sgtable_page (&umem->sgt_append.sgt, &sg_iter, 0) {
+			void *vaddr;
+
 			if (num_buf >= RXE_BUF_PER_MAP) {
 				map++;
 				buf = map[0]->buf;
@@ -156,16 +157,10 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 			}
 
 			vaddr = page_address(sg_page_iter_page(&sg_iter));
-			if (!vaddr) {
-				rxe_dbg_mr(mr, "Unable to get virtual address\n");
-				err = -ENOMEM;
-				goto err_release_umem;
-			}
 			buf->addr = (uintptr_t)vaddr;
 			buf->size = PAGE_SIZE;
 			num_buf++;
 			buf++;
-
 		}
 	}
 
-- 
2.27.0



* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-20  1:29     ` [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem Zhu Yanjun
  2022-11-19 10:30       ` lizhijian
@ 2022-11-21 15:28       ` Jason Gunthorpe
  2022-11-22  4:07         ` Yanjun Zhu
  1 sibling, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-11-21 15:28 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: zyjzyj2000, leon, linux-rdma, Zhu Yanjun

On Sat, Nov 19, 2022 at 08:29:38PM -0500, Zhu Yanjun wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> In ib_umem_get, sgt_append is allocated from the function
> sg_alloc_append_table_from_pages. And it is not from highmem.

You've confused the allocation of the SGL table itself with the
page_address called on the struct page * stored inside the SGL table.

Think of the SGL as an array of 'struct page *'

The page_address() can return NULL because the 'struct page *' it
contains came from userspace and could very well be highmem.

Jason


* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-21 15:28       ` Jason Gunthorpe
@ 2022-11-22  4:07         ` Yanjun Zhu
  2022-11-22 14:06           ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-22  4:07 UTC (permalink / raw)
  To: Jason Gunthorpe, Zhu Yanjun; +Cc: zyjzyj2000, leon, linux-rdma, Zhu Yanjun

On 2022/11/21 23:28, Jason Gunthorpe wrote:
> On Sat, Nov 19, 2022 at 08:29:38PM -0500, Zhu Yanjun wrote:
>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>
>> In ib_umem_get, sgt_append is allocated from the function
>> sg_alloc_append_table_from_pages. And it is not from highmem.
> 
> You've confused the allocation of the SGL table itself with the
> page_address called on the struct page * stored inside the SGL table.
> 
> Think of the SGL as an array of 'struct page *'
> 
About "The page_address() can return NULL because the 'struct page *' it
contains came from userspace and could very well be highmem.":

From the signature of ib_umem_get(struct ib_device *device,
unsigned long addr, size_t size, int access), I agree with you that
the struct page entries come from user space.

But "Understanding the Linux Kernel", third edition, sections
"8.1.3. Memory Zones" and "8.1.6. Kernel Mappings of High-Memory Page
Frames", describes the layout as follows.

In a process' virtual address space, user space occupies the first
3GB, and kernel space the 4th GB of the linear address space.

The first 896MB of kernel space (not only kernel code, but its data
as well) is "directly" mapped to the first 896MB of physical memory.

The last 128MB of the kernel virtual space is where pieces of the
physical "high memory" (> 896MB) are mapped: it can map no more than
128MB of high memory at a time.

It seems that page_address() on pages in this "high memory" will return NULL.

But can user space access these high memory? From "Understanding the 
Linux Kernel", third edition, it seems that it is in kernel space.

Thanks and Regards,
Zhu Yanjun

> 
> Jason



* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-22  4:07         ` Yanjun Zhu
@ 2022-11-22 14:06           ` Jason Gunthorpe
  2022-11-23  1:53             ` Yanjun Zhu
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2022-11-22 14:06 UTC (permalink / raw)
  To: Yanjun Zhu; +Cc: Zhu Yanjun, zyjzyj2000, leon, linux-rdma

On Tue, Nov 22, 2022 at 12:07:06PM +0800, Yanjun Zhu wrote:

> But can user space access these high memory? From "Understanding the Linux
> Kernel", third edition, it seems that it is in kernel space.

Yes, "highmem" is effectively the ability to have a struct page * that
is not mapped into the kernel address space, but is mapped into a
userspace process address space.

Jason


* Re: [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem
  2022-11-22 14:06           ` Jason Gunthorpe
@ 2022-11-23  1:53             ` Yanjun Zhu
  0 siblings, 0 replies; 14+ messages in thread
From: Yanjun Zhu @ 2022-11-23  1:53 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Zhu Yanjun, zyjzyj2000, leon, linux-rdma


On 2022/11/22 22:06, Jason Gunthorpe wrote:
> On Tue, Nov 22, 2022 at 12:07:06PM +0800, Yanjun Zhu wrote:
>
>> But can user space access these high memory? From "Understanding the Linux
>> Kernel", third edition, it seems that it is in kernel space.
> Yes, "highmem" is effectively the ability to have a struct page * that
> is not mapped into the kernel address space, but is mapped into a
> userspace process address space.
Got it. I am interested in this "highmem is mapped into a userspace
process address space."

Would you like to share some documents, links or source code about this
with me? I want to delve into how highmem is mapped into a userspace
process address space.

Thanks and Regards,

Zhu Yanjun


>
> Jason


end of thread, other threads:[~2022-11-23  1:53 UTC | newest]

Thread overview: 14+ messages
-- links below jump to the message on this page --
2022-10-30  3:04 [PATCH for-next v2] RDMA/rxe: Fix mr->map double free Li Zhijian
2022-11-12  3:29 ` lizhijian
2022-11-13 12:40   ` Yanjun Zhu
2022-11-16  0:10 ` Yanjun Zhu
2022-11-16  2:39   ` lizhijian
2022-11-19  0:15 ` Jason Gunthorpe
2022-11-19  8:25   ` Yanjun Zhu
2022-11-20  1:29     ` [for-next PATCH 1/1] RDMA/rxe: sgt_append from ib_umem_get is not highmem Zhu Yanjun
2022-11-19 10:30       ` lizhijian
2022-11-19 13:20         ` Yanjun Zhu
2022-11-21 15:28       ` Jason Gunthorpe
2022-11-22  4:07         ` Yanjun Zhu
2022-11-22 14:06           ` Jason Gunthorpe
2022-11-23  1:53             ` Yanjun Zhu
