linux-rdma.vger.kernel.org archive mirror
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
@ 2021-08-20 10:48 ` yangx.jy
  2021-08-20 18:44 ` Bob Pearson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: yangx.jy @ 2021-08-20 10:48 UTC (permalink / raw)
  To: aglo; +Cc: yangx.jy, linux-rdma, rpearsonhpe, zyjzyj2000, jgg, leon

Hi Olga Kornievskaia,

Could you check if this patch can fix your issues?

https://www.spinics.net/lists/linux-rdma/msg104358.html
https://www.spinics.net/lists/linux-rdma/msg104359.html
https://www.spinics.net/lists/linux-rdma/msg104360.html

By the way, this patch can fix my panic.

Best Regards,
Xiao Yang
On 2021/8/20 19:15, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
>    so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>  	if (*num_elem < 0)
>  		goto err1;
>  
> -	q = kmalloc(sizeof(*q), GFP_KERNEL);
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
>  	if (!q)
>  		goto err1;
>  


* [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
@ 2021-08-20 11:15 Xiao Yang
  2021-08-20 10:48 ` yangx.jy
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Xiao Yang @ 2021-08-20 11:15 UTC (permalink / raw)
  To: linux-rdma; +Cc: aglo, rpearsonhpe, zyjzyj2000, jgg, leon, Xiao Yang

1) New index member of struct rxe_queue is introduced but not zeroed
   so the initial value of index may be random.
2) Current index is not masked off to index_mask.
In such case, producer_addr() and consumer_addr() will get an invalid
address by the random index and then accessing the invalid address
triggers the following panic:
"BUG: unable to handle page fault for address: ffff9ae2c07a1414"

Fix the issue by using kzalloc() to zero out index member.

Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
index 85b812586ed4..72d95398e604 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.c
+++ b/drivers/infiniband/sw/rxe/rxe_queue.c
@@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
 	if (*num_elem < 0)
 		goto err1;
 
-	q = kmalloc(sizeof(*q), GFP_KERNEL);
+	q = kzalloc(sizeof(*q), GFP_KERNEL);
 	if (!q)
 		goto err1;
 
-- 
2.25.1
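
The failure mode described in the commit message can be sketched in user space. This is only an illustration under stated assumptions: the macros and the demo_* helpers are hypothetical, the queue geometry (8 slots of 16 bytes) is made up, and the unmasked shift follows the patch description rather than the exact rxe source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical queue geometry, chosen only for this sketch. */
#define DEMO_NUM_SLOTS      8u                      /* power of two */
#define DEMO_LOG2_ELEM_SIZE 4u                      /* 16-byte elements */
#define DEMO_INDEX_MASK     (DEMO_NUM_SLOTS - 1u)
#define DEMO_BUF_SIZE       ((size_t)DEMO_NUM_SLOTS << DEMO_LOG2_ELEM_SIZE)

/* Byte offset taken from the index as stored, with no masking. */
static size_t demo_offset_unmasked(uint32_t index)
{
	return (size_t)index << DEMO_LOG2_ELEM_SIZE;
}

/* Byte offset after masking the index to the valid slot range. */
static size_t demo_offset_masked(uint32_t index)
{
	return (size_t)(index & DEMO_INDEX_MASK) << DEMO_LOG2_ELEM_SIZE;
}

/* True if the offset lands inside the element buffer. */
static int demo_offset_in_bounds(size_t offset)
{
	return offset < DEMO_BUF_SIZE;
}

/* Usage: a zero index is fine, a stale one points far outside the buffer. */
static void demo_offset_self_check(void)
{
	const uint32_t garbage = 0x7a1414u;  /* arbitrary stale value */

	assert(demo_offset_unmasked(0) == 0);
	assert(!demo_offset_in_bounds(demo_offset_unmasked(garbage)));
	assert(demo_offset_in_bounds(demo_offset_masked(garbage)));
}
```

The patch itself takes the simpler route of zeroing the allocation, so the index starts at 0; masking is shown here only to make the out-of-range offset visible.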





* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
  2021-08-20 10:48 ` yangx.jy
@ 2021-08-20 18:44 ` Bob Pearson
  2021-08-21 13:00   ` yangx.jy
  2021-08-20 19:10 ` Jason Gunthorpe
  2021-08-21  7:21 ` Zhu Yanjun
  3 siblings, 1 reply; 13+ messages in thread
From: Bob Pearson @ 2021-08-20 18:44 UTC (permalink / raw)
  To: Xiao Yang, linux-rdma; +Cc: aglo, zyjzyj2000, jgg, leon

On 8/20/21 6:15 AM, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
>    so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> 
> Fix the issue by using kzalloc() to zero out index member.
> 
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>  	if (*num_elem < 0)
>  		goto err1;
>  
> -	q = kmalloc(sizeof(*q), GFP_KERNEL);
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
>  	if (!q)
>  		goto err1;
>  
> 

Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. It's a good catch.

Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
  2021-08-20 10:48 ` yangx.jy
  2021-08-20 18:44 ` Bob Pearson
@ 2021-08-20 19:10 ` Jason Gunthorpe
  2021-08-21  7:21 ` Zhu Yanjun
  3 siblings, 0 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2021-08-20 19:10 UTC (permalink / raw)
  To: Xiao Yang; +Cc: linux-rdma, aglo, rpearsonhpe, zyjzyj2000, leon

On Fri, Aug 20, 2021 at 07:15:09PM +0800, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
>    so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> 
> Fix the issue by using kzalloc() to zero out index member.
> 
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Applied to for-rc, thanks

Jason


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
                   ` (2 preceding siblings ...)
  2021-08-20 19:10 ` Jason Gunthorpe
@ 2021-08-21  7:21 ` Zhu Yanjun
  2021-08-23  4:37   ` yangx.jy
  3 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-21  7:21 UTC (permalink / raw)
  To: Xiao Yang
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang <yangx.jy@fujitsu.com> wrote:
>
> 1) New index member of struct rxe_queue is introduced but not zeroed
>    so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>         if (*num_elem < 0)
>                 goto err1;
>
> -       q = kmalloc(sizeof(*q), GFP_KERNEL);
> +       q = kzalloc(sizeof(*q), GFP_KERNEL);

Perhaps this is why I can not reproduce this problem in the local host.

Zhu Yanjun

>         if (!q)
>                 goto err1;
>
> --
> 2.25.1
>
>
>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-20 18:44 ` Bob Pearson
@ 2021-08-21 13:00   ` yangx.jy
  2021-08-24 18:04     ` Bob Pearson
  0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-21 13:00 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma, aglo, zyjzyj2000, jgg, leon

On 2021/8/21 2:44, Bob Pearson wrote:
> On 8/20/21 6:15 AM, Xiao Yang wrote:
>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>     so the initial value of index may be random.
>> 2) Current index is not masked off to index_mask.
>> In such case, producer_addr() and consumer_addr() will get an invalid
>> address by the random index and then accessing the invalid address
>> triggers the following panic:
>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>
>> Fix the issue by using kzalloc() to zero out index member.
>>
>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>> index 85b812586ed4..72d95398e604 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>   	if (*num_elem<  0)
>>   		goto err1;
>>
>> -	q = kmalloc(sizeof(*q), GFP_KERNEL);
>> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
>>   	if (!q)
>>   		goto err1;
>>
>>
> Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. It's a good catch.
Hi Bob,

The original 2016 rxe commit did introduce kmalloc(), but it
initialized all members of struct rxe_queue in subsequent steps.
When the new index member of struct rxe_queue was added, it was not
initialized in those steps, so I think the issue was caused by
your patch.
I use kzalloc() to fix the issue because I want to avoid the same issue
when another new member is added in the future.

Best Regards,
Xiao Yang
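
A minimal user-space sketch of the point above, assuming a hypothetical struct demo_queue whose index member was added after the field-by-field setup code was written. malloc() plus a 0xA5 fill stands in for kmalloc() returning stale memory, and calloc() stands in for kzalloc(); none of these names come from the rxe driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct; 'index' models a member added later. */
struct demo_queue {
	uint32_t elem_size;
	uint32_t index_mask;
	uint32_t index;  /* newer member, never assigned by demo_queue_setup() */
};

/* Stand-in for kmalloc(): contents are stale (modeled as 0xA5 bytes). */
static void *demo_alloc_stale(size_t size)
{
	void *p = malloc(size);

	if (p)
		memset(p, 0xA5, size);
	return p;
}

/* Stand-in for kzalloc(): contents are zeroed. */
static void *demo_alloc_zeroed(size_t size)
{
	return calloc(1, size);
}

/* Field-by-field setup that predates 'index' and so does not touch it. */
static void demo_queue_setup(struct demo_queue *q)
{
	q->elem_size = 16;
	q->index_mask = 7;
}

/* Usage: only the zeroed allocation gives the new member a defined value. */
static void demo_init_self_check(void)
{
	struct demo_queue *stale = demo_alloc_stale(sizeof(*stale));
	struct demo_queue *zeroed = demo_alloc_zeroed(sizeof(*zeroed));

	assert(stale && zeroed);
	demo_queue_setup(stale);
	demo_queue_setup(zeroed);

	assert(stale->index == 0xA5A5A5A5u);  /* leftover bytes leak through */
	assert(zeroed->index == 0);           /* covered without extra code */

	free(stale);
	free(zeroed);
}
```

The zeroing allocation covers any member added later without touching the setup code, which is the design choice argued for above.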
> Reviewed-by: Bob Pearson<rpearsonhpe@gmail.com>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-21  7:21 ` Zhu Yanjun
@ 2021-08-23  4:37   ` yangx.jy
  2021-08-23  5:42     ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-23  4:37 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On 2021/8/21 15:21, Zhu Yanjun wrote:
> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com>  wrote:
>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>     so the initial value of index may be random.
>> 2) Current index is not masked off to index_mask.
>> In such case, producer_addr() and consumer_addr() will get an invalid
>> address by the random index and then accessing the invalid address
>> triggers the following panic:
>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>
>> Fix the issue by using kzalloc() to zero out index member.
>>
>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>> index 85b812586ed4..72d95398e604 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>          if (*num_elem<  0)
>>                  goto err1;
>>
>> -       q = kmalloc(sizeof(*q), GFP_KERNEL);
>> +       q = kzalloc(sizeof(*q), GFP_KERNEL);
> Perhaps this is why I can not reproduce this problem in the local host.
Hi Yanjun,

I forgot to say that I reproduced the issue on my local vm.

Best Regards,
Xiao Yang
> Zhu Yanjun
>
>>          if (!q)
>>                  goto err1;
>>
>> --
>> 2.25.1
>>
>>
>>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-23  4:37   ` yangx.jy
@ 2021-08-23  5:42     ` Zhu Yanjun
  2021-08-23  6:18       ` yangx.jy
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-23  5:42 UTC (permalink / raw)
  To: yangx.jy
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/21 15:21, Zhu Yanjun wrote:
> > On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com>  wrote:
> >> 1) New index member of struct rxe_queue is introduced but not zeroed
> >>     so the initial value of index may be random.
> >> 2) Current index is not masked off to index_mask.
> >> In such case, producer_addr() and consumer_addr() will get an invalid
> >> address by the random index and then accessing the invalid address
> >> triggers the following panic:
> >> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> >>
> >> Fix the issue by using kzalloc() to zero out index member.
> >>
> >> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> >> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
> >> ---
> >>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> >> index 85b812586ed4..72d95398e604 100644
> >> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> >> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> >> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> >>          if (*num_elem<  0)
> >>                  goto err1;
> >>
> >> -       q = kmalloc(sizeof(*q), GFP_KERNEL);
> >> +       q = kzalloc(sizeof(*q), GFP_KERNEL);
> > Perhaps this is why I can not reproduce this problem in the local host.
> Hi Yanjun,
>
> I forgot to say that I reproduced the issue on my local vm.

Which OS are you using to reproduce this problem?

Zhu Yanjun

>
> Best Regards,
> Xiao Yang
> > Zhu Yanjun
> >
> >>          if (!q)
> >>                  goto err1;
> >>
> >> --
> >> 2.25.1
> >>
> >>
> >>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-23  5:42     ` Zhu Yanjun
@ 2021-08-23  6:18       ` yangx.jy
  2021-08-23  6:48         ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-23  6:18 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On 2021/8/23 13:42, Zhu Yanjun wrote:
> On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
> <yangx.jy@fujitsu.com>  wrote:
>> On 2021/8/21 15:21, Zhu Yanjun wrote:
>>> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com>   wrote:
>>>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>>>      so the initial value of index may be random.
>>>> 2) Current index is not masked off to index_mask.
>>>> In such case, producer_addr() and consumer_addr() will get an invalid
>>>> address by the random index and then accessing the invalid address
>>>> triggers the following panic:
>>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>>>
>>>> Fix the issue by using kzalloc() to zero out index member.
>>>>
>>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>>>> ---
>>>>    drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> index 85b812586ed4..72d95398e604 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>>>           if (*num_elem<   0)
>>>>                   goto err1;
>>>>
>>>> -       q = kmalloc(sizeof(*q), GFP_KERNEL);
>>>> +       q = kzalloc(sizeof(*q), GFP_KERNEL);
>>> Perhaps this is why I can not reproduce this problem in the local host.
>> Hi Yanjun,
>>
>> I forgot to say that I reproduced the issue on my local vm.
> Which OS are you using to reproduce this problem?

The OS is Fedora 31.

> Zhu Yanjun
>
>> Best Regards,
>> Xiao Yang
>>> Zhu Yanjun
>>>
>>>>           if (!q)
>>>>                   goto err1;
>>>>
>>>> --
>>>> 2.25.1
>>>>
>>>>
>>>>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-23  6:18       ` yangx.jy
@ 2021-08-23  6:48         ` Zhu Yanjun
  2021-09-01  5:42           ` yangx.jy
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-23  6:48 UTC (permalink / raw)
  To: yangx.jy
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On Mon, Aug 23, 2021 at 2:18 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/23 13:42, Zhu Yanjun wrote:
> > On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
> > <yangx.jy@fujitsu.com>  wrote:
> >> On 2021/8/21 15:21, Zhu Yanjun wrote:
> >>> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com>   wrote:
> >>>> 1) New index member of struct rxe_queue is introduced but not zeroed
> >>>>      so the initial value of index may be random.
> >>>> 2) Current index is not masked off to index_mask.
> >>>> In such case, producer_addr() and consumer_addr() will get an invalid
> >>>> address by the random index and then accessing the invalid address
> >>>> triggers the following panic:
> >>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> >>>>
> >>>> Fix the issue by using kzalloc() to zero out index member.
> >>>>
> >>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> >>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
> >>>> ---
> >>>>    drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> >>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> index 85b812586ed4..72d95398e604 100644
> >>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> >>>>           if (*num_elem<   0)
> >>>>                   goto err1;
> >>>>
> >>>> -       q = kmalloc(sizeof(*q), GFP_KERNEL);
> >>>> +       q = kzalloc(sizeof(*q), GFP_KERNEL);
> >>> Perhaps this is why I can not reproduce this problem in the local host.
> >> Hi Yanjun,
> >>
> >> I forgot to say that I reproduced the issue on my local vm.
> > Which OS are you using to reproduce this problem?
>
> OS is fedora31.

Can you reproduce this problem on Ubuntu 20.04?

Thanks,
Zhu Yanjun

>
> > Zhu Yanjun
> >
> >> Best Regards,
> >> Xiao Yang
> >>> Zhu Yanjun
> >>>
> >>>>           if (!q)
> >>>>                   goto err1;
> >>>>
> >>>> --
> >>>> 2.25.1
> >>>>
> >>>>
> >>>>


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-21 13:00   ` yangx.jy
@ 2021-08-24 18:04     ` Bob Pearson
  0 siblings, 0 replies; 13+ messages in thread
From: Bob Pearson @ 2021-08-24 18:04 UTC (permalink / raw)
  To: yangx.jy; +Cc: linux-rdma, aglo, zyjzyj2000, jgg, leon

On 8/21/21 8:00 AM, yangx.jy@fujitsu.com wrote:
> On 2021/8/21 2:44, Bob Pearson wrote:
>> On 8/20/21 6:15 AM, Xiao Yang wrote:
>>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>>     so the initial value of index may be random.
>>> 2) Current index is not masked off to index_mask.
>>> In such case, producer_addr() and consumer_addr() will get an invalid
>>> address by the random index and then accessing the invalid address
>>> triggers the following panic:
>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>>
>>> Fix the issue by using kzalloc() to zero out index member.
>>>
>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>>> ---
>>>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>>> index 85b812586ed4..72d95398e604 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>>   	if (*num_elem<  0)
>>>   		goto err1;
>>>
>>> -	q = kmalloc(sizeof(*q), GFP_KERNEL);
>>> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
>>>   	if (!q)
>>>   		goto err1;
>>>
>>>
>> Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. It's a good catch.
> Hi Bob,
> 
> The original 2016 rxe commit did introduce kmalloc(), but it
> initialized all members of struct rxe_queue in subsequent steps.
> When the new index member of struct rxe_queue was added, it was not
> initialized in those steps, so I think the issue was caused by
> your patch.
Yup. My comment was really that if it was an old one I was guilty either way most likely. But this is a good catch.
> I use kzalloc() to fix the issue because I want to avoid the same issue
> when another new member is added in the future.
> 
> Best Regards,
> Xiao Yang
>> Reviewed-by: Bob Pearson<rpearsonhpe@gmail.com>



* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-08-23  6:48         ` Zhu Yanjun
@ 2021-09-01  5:42           ` yangx.jy
  2021-09-01  6:03             ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-09-01  5:42 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On 2021/8/23 14:48, Zhu Yanjun wrote:
> Can you reproduce this problem on Ubuntu 20.04?
Hi Yanjun,

I cannot reproduce this issue on an Ubuntu 20.04 VM for now.
I think I didn't hit the condition where q->index holds a random value
after kmalloc().
Perhaps that happens only when kmalloc() returns memory that has been
used and freed before.

Best Regards,
Xiao Yang
> Thanks,
> Zhu Yanjun


* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
  2021-09-01  5:42           ` yangx.jy
@ 2021-09-01  6:03             ` Zhu Yanjun
  0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2021-09-01  6:03 UTC (permalink / raw)
  To: yangx.jy
  Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
	Jason Gunthorpe, Leon Romanovsky

On Wed, Sep 1, 2021 at 1:42 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/23 14:48, Zhu Yanjun wrote:
> > Can you reproduce this problem on Ubuntu 20.04?
> Hi Yanjun,
>
> I cannot reproduce this issue on Ubuntu 20.04 vm for now.
> I think I didn't hit the condition where q->index gets the random value
> after kmalloc().

Sure. Thanks a lot.
I can not reproduce this problem on Ubuntu 20.04.

Zhu Yanjun

> Perhaps only when allocating the memory which has been used and freed
> before.
>
> Best Regards,
> Xiao Yang
> > Thanks,
> > Zhu Yanjun


end of thread, other threads:[~2021-09-01  6:03 UTC | newest]

Thread overview: 13+ messages
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
2021-08-20 10:48 ` yangx.jy
2021-08-20 18:44 ` Bob Pearson
2021-08-21 13:00   ` yangx.jy
2021-08-24 18:04     ` Bob Pearson
2021-08-20 19:10 ` Jason Gunthorpe
2021-08-21  7:21 ` Zhu Yanjun
2021-08-23  4:37   ` yangx.jy
2021-08-23  5:42     ` Zhu Yanjun
2021-08-23  6:18       ` yangx.jy
2021-08-23  6:48         ` Zhu Yanjun
2021-09-01  5:42           ` yangx.jy
2021-09-01  6:03             ` Zhu Yanjun
