* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
@ 2021-08-20 10:48 ` yangx.jy
2021-08-20 18:44 ` Bob Pearson
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: yangx.jy @ 2021-08-20 10:48 UTC (permalink / raw)
To: aglo; +Cc: yangx.jy, linux-rdma, rpearsonhpe, zyjzyj2000, jgg, leon
Hi Olga Kornievskaia,
Could you check if this patch can fix your issues?
https://www.spinics.net/lists/linux-rdma/msg104358.html
https://www.spinics.net/lists/linux-rdma/msg104359.html
https://www.spinics.net/lists/linux-rdma/msg104360.html
By the way, this patch can fix my panic.
Best Regards,
Xiao Yang
On 2021/8/20 19:15, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
> so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> if (*num_elem < 0)
> goto err1;
>
> - q = kmalloc(sizeof(*q), GFP_KERNEL);
> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> if (!q)
> goto err1;
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
@ 2021-08-20 11:15 Xiao Yang
2021-08-20 10:48 ` yangx.jy
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Xiao Yang @ 2021-08-20 11:15 UTC (permalink / raw)
To: linux-rdma; +Cc: aglo, rpearsonhpe, zyjzyj2000, jgg, leon, Xiao Yang
1) New index member of struct rxe_queue is introduced but not zeroed
so the initial value of index may be random.
2) Current index is not masked off to index_mask.
In such case, producer_addr() and consumer_addr() will get an invalid
address by the random index and then accessing the invalid address
triggers the following panic:
"BUG: unable to handle page fault for address: ffff9ae2c07a1414"
Fix the issue by using kzalloc() to zero out index member.
Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
index 85b812586ed4..72d95398e604 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.c
+++ b/drivers/infiniband/sw/rxe/rxe_queue.c
@@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
if (*num_elem < 0)
goto err1;
- q = kmalloc(sizeof(*q), GFP_KERNEL);
+ q = kzalloc(sizeof(*q), GFP_KERNEL);
if (!q)
goto err1;
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
2021-08-20 10:48 ` yangx.jy
@ 2021-08-20 18:44 ` Bob Pearson
2021-08-21 13:00 ` yangx.jy
2021-08-20 19:10 ` Jason Gunthorpe
2021-08-21 7:21 ` Zhu Yanjun
3 siblings, 1 reply; 13+ messages in thread
From: Bob Pearson @ 2021-08-20 18:44 UTC (permalink / raw)
To: Xiao Yang, linux-rdma; +Cc: aglo, zyjzyj2000, jgg, leon
On 8/20/21 6:15 AM, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
> so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> if (*num_elem < 0)
> goto err1;
>
> - q = kmalloc(sizeof(*q), GFP_KERNEL);
> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> if (!q)
> goto err1;
>
>
Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. Its a good catch.
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
2021-08-20 10:48 ` yangx.jy
2021-08-20 18:44 ` Bob Pearson
@ 2021-08-20 19:10 ` Jason Gunthorpe
2021-08-21 7:21 ` Zhu Yanjun
3 siblings, 0 replies; 13+ messages in thread
From: Jason Gunthorpe @ 2021-08-20 19:10 UTC (permalink / raw)
To: Xiao Yang; +Cc: linux-rdma, aglo, rpearsonhpe, zyjzyj2000, leon
On Fri, Aug 20, 2021 at 07:15:09PM +0800, Xiao Yang wrote:
> 1) New index member of struct rxe_queue is introduced but not zeroed
> so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Applied to for-rc, thanks
Jason
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
` (2 preceding siblings ...)
2021-08-20 19:10 ` Jason Gunthorpe
@ 2021-08-21 7:21 ` Zhu Yanjun
2021-08-23 4:37 ` yangx.jy
3 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-21 7:21 UTC (permalink / raw)
To: Xiao Yang
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang <yangx.jy@fujitsu.com> wrote:
>
> 1) New index member of struct rxe_queue is introduced but not zeroed
> so the initial value of index may be random.
> 2) Current index is not masked off to index_mask.
> In such case, producer_addr() and consumer_addr() will get an invalid
> address by the random index and then accessing the invalid address
> triggers the following panic:
> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>
> Fix the issue by using kzalloc() to zero out index member.
>
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> if (*num_elem < 0)
> goto err1;
>
> - q = kmalloc(sizeof(*q), GFP_KERNEL);
> + q = kzalloc(sizeof(*q), GFP_KERNEL);
Perhaps this is why I can not reproduce this problem in the local host.
Zhu Yanjun
> if (!q)
> goto err1;
>
> --
> 2.25.1
>
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-20 18:44 ` Bob Pearson
@ 2021-08-21 13:00 ` yangx.jy
2021-08-24 18:04 ` Bob Pearson
0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-21 13:00 UTC (permalink / raw)
To: Bob Pearson; +Cc: linux-rdma, aglo, zyjzyj2000, jgg, leon
On 2021/8/21 2:44, Bob Pearson wrote:
> On 8/20/21 6:15 AM, Xiao Yang wrote:
>> 1) New index member of struct rxe_queue is introduced but not zeroed
>> so the initial value of index may be random.
>> 2) Current index is not masked off to index_mask.
>> In such case, producer_addr() and consumer_addr() will get an invalid
>> address by the random index and then accessing the invalid address
>> triggers the following panic:
>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>
>> Fix the issue by using kzalloc() to zero out index member.
>>
>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>> ---
>> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>> index 85b812586ed4..72d95398e604 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>> if (*num_elem< 0)
>> goto err1;
>>
>> - q = kmalloc(sizeof(*q), GFP_KERNEL);
>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
>> if (!q)
>> goto err1;
>>
>>
> Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. Its a good catch.
Hi Bob,
The original 2016 rxe commit actually introduced kmalloc() but it
initialized all members of struct rxe_queue at subsequent steps.
When the new index member of struct rxe_queue was added, it didn't
initialized at subsequent steps. So I think the issue was caused by
your patch.
I use kzalloc() to fix the issue because I want to avoid the same issue
when another new member will be added in future.
Best Regards,
Xiao Yang
> Reviewed-by: Bob Pearson<rpearsonhpe@gmail.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-21 7:21 ` Zhu Yanjun
@ 2021-08-23 4:37 ` yangx.jy
2021-08-23 5:42 ` Zhu Yanjun
0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-23 4:37 UTC (permalink / raw)
To: Zhu Yanjun
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On 2021/8/21 15:21, Zhu Yanjun wrote:
> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com> wrote:
>> 1) New index member of struct rxe_queue is introduced but not zeroed
>> so the initial value of index may be random.
>> 2) Current index is not masked off to index_mask.
>> In such case, producer_addr() and consumer_addr() will get an invalid
>> address by the random index and then accessing the invalid address
>> triggers the following panic:
>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>
>> Fix the issue by using kzalloc() to zero out index member.
>>
>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>> ---
>> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>> index 85b812586ed4..72d95398e604 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>> if (*num_elem< 0)
>> goto err1;
>>
>> - q = kmalloc(sizeof(*q), GFP_KERNEL);
>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> Perhaps this is why I can not reproduce this problem in the local host.
Hi Yanjun,
I forgot to say that I reproduced the issue on my local vm.
Best Regards,
Xiao Yang
> Zhu Yanjun
>
>> if (!q)
>> goto err1;
>>
>> --
>> 2.25.1
>>
>>
>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-23 4:37 ` yangx.jy
@ 2021-08-23 5:42 ` Zhu Yanjun
2021-08-23 6:18 ` yangx.jy
0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-23 5:42 UTC (permalink / raw)
To: yangx.jy
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/21 15:21, Zhu Yanjun wrote:
> > On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com> wrote:
> >> 1) New index member of struct rxe_queue is introduced but not zeroed
> >> so the initial value of index may be random.
> >> 2) Current index is not masked off to index_mask.
> >> In such case, producer_addr() and consumer_addr() will get an invalid
> >> address by the random index and then accessing the invalid address
> >> triggers the following panic:
> >> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> >>
> >> Fix the issue by using kzalloc() to zero out index member.
> >>
> >> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> >> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
> >> ---
> >> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> >> index 85b812586ed4..72d95398e604 100644
> >> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> >> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> >> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> >> if (*num_elem< 0)
> >> goto err1;
> >>
> >> - q = kmalloc(sizeof(*q), GFP_KERNEL);
> >> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> > Perhaps this is why I can not reproduce this problem in the local host.
> Hi Yanjun,
>
> I forgot to say that I reproduced the issue on my local vm.
Which OS are you using to reproduce this problem?
Zhu Yanjun
>
> Best Regards,
> Xiao Yang
> > Zhu Yanjun
> >
> >> if (!q)
> >> goto err1;
> >>
> >> --
> >> 2.25.1
> >>
> >>
> >>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-23 5:42 ` Zhu Yanjun
@ 2021-08-23 6:18 ` yangx.jy
2021-08-23 6:48 ` Zhu Yanjun
0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-08-23 6:18 UTC (permalink / raw)
To: Zhu Yanjun
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On 2021/8/23 13:42, Zhu Yanjun wrote:
> On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
> <yangx.jy@fujitsu.com> wrote:
>> On 2021/8/21 15:21, Zhu Yanjun wrote:
>>> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com> wrote:
>>>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>>> so the initial value of index may be random.
>>>> 2) Current index is not masked off to index_mask.
>>>> In such case, producer_addr() and consumer_addr() will get an invalid
>>>> address by the random index and then accessing the invalid address
>>>> triggers the following panic:
>>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>>>
>>>> Fix the issue by using kzalloc() to zero out index member.
>>>>
>>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>>>> ---
>>>> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> index 85b812586ed4..72d95398e604 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>>> if (*num_elem< 0)
>>>> goto err1;
>>>>
>>>> - q = kmalloc(sizeof(*q), GFP_KERNEL);
>>>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
>>> Perhaps this is why I can not reproduce this problem in the local host.
>> Hi Yanjun,
>>
>> I forgot to say that I reproduced the issue on my local vm.
> Which OS are you using to reproduce this problem?
OS is fedora31.
> Zhu Yanjun
>
>> Best Regards,
>> Xiao Yang
>>> Zhu Yanjun
>>>
>>>> if (!q)
>>>> goto err1;
>>>>
>>>> --
>>>> 2.25.1
>>>>
>>>>
>>>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-23 6:18 ` yangx.jy
@ 2021-08-23 6:48 ` Zhu Yanjun
2021-09-01 5:42 ` yangx.jy
0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2021-08-23 6:48 UTC (permalink / raw)
To: yangx.jy
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On Mon, Aug 23, 2021 at 2:18 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/23 13:42, Zhu Yanjun wrote:
> > On Mon, Aug 23, 2021 at 12:37 PM yangx.jy@fujitsu.com
> > <yangx.jy@fujitsu.com> wrote:
> >> On 2021/8/21 15:21, Zhu Yanjun wrote:
> >>> On Fri, Aug 20, 2021 at 6:44 PM Xiao Yang<yangx.jy@fujitsu.com> wrote:
> >>>> 1) New index member of struct rxe_queue is introduced but not zeroed
> >>>> so the initial value of index may be random.
> >>>> 2) Current index is not masked off to index_mask.
> >>>> In such case, producer_addr() and consumer_addr() will get an invalid
> >>>> address by the random index and then accessing the invalid address
> >>>> triggers the following panic:
> >>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
> >>>>
> >>>> Fix the issue by using kzalloc() to zero out index member.
> >>>>
> >>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> >>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
> >>>> ---
> >>>> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> index 85b812586ed4..72d95398e604 100644
> >>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> >>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
> >>>> if (*num_elem< 0)
> >>>> goto err1;
> >>>>
> >>>> - q = kmalloc(sizeof(*q), GFP_KERNEL);
> >>>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
> >>> Perhaps this is why I can not reproduce this problem in the local host.
> >> Hi Yanjun,
> >>
> >> I forgot to say that I reproduced the issue on my local vm.
> > Which OS are you using to reproduce this problem?
>
> OS is fedora31.
Can you reproduce this problem on Ubuntu 20.04?
Thanks,
Zhu Yanjun
>
> > Zhu Yanjun
> >
> >> Best Regards,
> >> Xiao Yang
> >>> Zhu Yanjun
> >>>
> >>>> if (!q)
> >>>> goto err1;
> >>>>
> >>>> --
> >>>> 2.25.1
> >>>>
> >>>>
> >>>>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-21 13:00 ` yangx.jy
@ 2021-08-24 18:04 ` Bob Pearson
0 siblings, 0 replies; 13+ messages in thread
From: Bob Pearson @ 2021-08-24 18:04 UTC (permalink / raw)
To: yangx.jy; +Cc: linux-rdma, aglo, zyjzyj2000, jgg, leon
On 8/21/21 8:00 AM, yangx.jy@fujitsu.com wrote:
> On 2021/8/21 2:44, Bob Pearson wrote:
>> On 8/20/21 6:15 AM, Xiao Yang wrote:
>>> 1) New index member of struct rxe_queue is introduced but not zeroed
>>> so the initial value of index may be random.
>>> 2) Current index is not masked off to index_mask.
>>> In such case, producer_addr() and consumer_addr() will get an invalid
>>> address by the random index and then accessing the invalid address
>>> triggers the following panic:
>>> "BUG: unable to handle page fault for address: ffff9ae2c07a1414"
>>>
>>> Fix the issue by using kzalloc() to zero out index member.
>>>
>>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>>> Signed-off-by: Xiao Yang<yangx.jy@fujitsu.com>
>>> ---
>>> drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>>> index 85b812586ed4..72d95398e604 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>> if (*num_elem< 0)
>>> goto err1;
>>>
>>> - q = kmalloc(sizeof(*q), GFP_KERNEL);
>>> + q = kzalloc(sizeof(*q), GFP_KERNEL);
>>> if (!q)
>>> goto err1;
>>>
>>>
>> Thanks for this!! I am happy to take the blame but this has been there from the original 2016 rxe commit. Its a good catch.
> Hi Bob,
>
> The original 2016 rxe commit actually introduced kmalloc() but it
> initialized all members of struct rxe_queue at subsequent steps.
> When the new index member of struct rxe_queue was added, it didn't
> initialized at subsequent steps. So I think the issue was caused by
> your patch.
Yup. My comment was really that if it was an old one I was guilty either way most likely. But this is a good catch.
> I use kzalloc() to fix the issue because I want to avoid the same issue
> when another new member will be added in future.
>
> Best Regards,
> Xiao Yang
>> Reviewed-by: Bob Pearson<rpearsonhpe@gmail.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-08-23 6:48 ` Zhu Yanjun
@ 2021-09-01 5:42 ` yangx.jy
2021-09-01 6:03 ` Zhu Yanjun
0 siblings, 1 reply; 13+ messages in thread
From: yangx.jy @ 2021-09-01 5:42 UTC (permalink / raw)
To: Zhu Yanjun
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On 2021/8/23 14:48, Zhu Yanjun wrote:
> Can you reproduce this problem on Ubuntu 20.04?
Hi Yanjun,
I cannot reproduce this issue on Ubuntu 20.04 vm for now.
I think I didn't hit the condition where q->index gets the random value
after kmalloc().
Perhaps only when allocating the memory which has been used and freed
before.
Best Regards,
Xiao Yang
> Thanks,
> Zhu Yanjun
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
2021-09-01 5:42 ` yangx.jy
@ 2021-09-01 6:03 ` Zhu Yanjun
0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2021-09-01 6:03 UTC (permalink / raw)
To: yangx.jy
Cc: RDMA mailing list, Olga Kornievskaia, Bob Pearson,
Jason Gunthorpe, Leon Romanovsky
On Wed, Sep 1, 2021 at 1:42 PM yangx.jy@fujitsu.com
<yangx.jy@fujitsu.com> wrote:
>
> On 2021/8/23 14:48, Zhu Yanjun wrote:
> > Can you reproduce this problem on Ubuntu 20.04?
> Hi Yanjun,
>
> I cannot reproduce this issue on Ubuntu 20.04 vm for now.
> I think I didn't hit the condition where q->index gets the random value
> after kmalloc().
Sure. Thanks a lot.
I can not reproduce this problem on Ubuntu 20.04.
Zhu Yanjun
> Perhaps only when allocating the memory which has been used and freed
> before.
>
> Best Regards,
> Xiao Yang
> > Thanks,
> > Zhu Yanjun
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-09-01 6:03 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-20 11:15 [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue Xiao Yang
2021-08-20 10:48 ` yangx.jy
2021-08-20 18:44 ` Bob Pearson
2021-08-21 13:00 ` yangx.jy
2021-08-24 18:04 ` Bob Pearson
2021-08-20 19:10 ` Jason Gunthorpe
2021-08-21 7:21 ` Zhu Yanjun
2021-08-23 4:37 ` yangx.jy
2021-08-23 5:42 ` Zhu Yanjun
2021-08-23 6:18 ` yangx.jy
2021-08-23 6:48 ` Zhu Yanjun
2021-09-01 5:42 ` yangx.jy
2021-09-01 6:03 ` Zhu Yanjun
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.