linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
@ 2020-01-17 13:56 Håkon Bugge
  2020-01-20 14:36 ` Håkon Bugge
  2020-01-20 15:06 ` Leon Romanovsky
  0 siblings, 2 replies; 6+ messages in thread
From: Håkon Bugge @ 2020-01-17 13:56 UTC (permalink / raw)
  To: Yishai Hadas; +Cc: linux-rdma

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping
in a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when
an entry has to evicted from the cache. When a DREP is sent from
mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
when a REJ is received by the mlx4_ib_demux_cm_handler(),
id_map_find_del() is called.

This function wipes out the TID in use from the IDR or XArray and
removes the id_map_entry from the table.

In short, it does everything except the topping of the cake, which is
to remove the entry from the list and free it. In other words, for the
DREP and REJ cases enumerated above, both will leak one id_map_entry.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
---
 drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index ecd6cadd529a..1df6d3ccfc62 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
 	if (!ent)
 		goto out;
 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
-	if (found_ent && found_ent == ent)
+	if (found_ent && found_ent == ent) {
 		rb_erase(&found_ent->node, sl_id_map);
+		if (!ent->scheduled_delete) {
+			list_del(&ent->list);
+			kfree(ent);
+		}
+	}
 out:
 	spin_unlock(&sriov->id_map_lock);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
  2020-01-17 13:56 [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del Håkon Bugge
@ 2020-01-20 14:36 ` Håkon Bugge
  2020-01-20 15:06 ` Leon Romanovsky
  1 sibling, 0 replies; 6+ messages in thread
From: Håkon Bugge @ 2020-01-20 14:36 UTC (permalink / raw)
  To: Yishai Hadas; +Cc: OFED mailing list



> On 17 Jan 2020, at 14:56, Håkon Bugge <haakon.bugge@oracle.com> wrote:
> 
> Using CX-3 virtual functions, either from a bare-metal machine or
> pass-through from a VM, MAD packets are proxied through the PF driver.
> 
> Since the VF drivers have separate name spaces for MAD Transaction Ids
> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
> in a cache.
> 
> Following the RDMA Connection Manager (CM) protocol, it is clear when
> an entry has to evicted from the cache. When a DREP is sent from
> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
> when a REJ is received by the mlx4_ib_demux_cm_handler(),
> id_map_find_del() is called.
> 
> This function wipes out the TID in use from the IDR or XArray and
> removes the id_map_entry from the table.
> 
> In short, it does everything except the topping of the cake, which is
> to remove the entry from the list and free it. In other words, for the
> DREP and REJ cases enumerated above, both will leak one id_map_entry.
> 
> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>

When/if this gets merged, please add:

Cc: stable@vger.kernel.org

Thxs, Håkon

> ---
> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> index ecd6cadd529a..1df6d3ccfc62 100644
> --- a/drivers/infiniband/hw/mlx4/cm.c
> +++ b/drivers/infiniband/hw/mlx4/cm.c
> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
> 	if (!ent)
> 		goto out;
> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
> -	if (found_ent && found_ent == ent)
> +	if (found_ent && found_ent == ent) {
> 		rb_erase(&found_ent->node, sl_id_map);
> +		if (!ent->scheduled_delete) {
> +			list_del(&ent->list);
> +			kfree(ent);
> +		}
> +	}
> out:
> 	spin_unlock(&sriov->id_map_lock);
> }
> -- 
> 2.20.1
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
  2020-01-17 13:56 [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del Håkon Bugge
  2020-01-20 14:36 ` Håkon Bugge
@ 2020-01-20 15:06 ` Leon Romanovsky
  2020-01-20 15:49   ` Håkon Bugge
  1 sibling, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2020-01-20 15:06 UTC (permalink / raw)
  To: Håkon Bugge; +Cc: Yishai Hadas, linux-rdma

On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
> Using CX-3 virtual functions, either from a bare-metal machine or
> pass-through from a VM, MAD packets are proxied through the PF driver.
>
> Since the VF drivers have separate name spaces for MAD Transaction Ids
> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
> in a cache.
>
> Following the RDMA Connection Manager (CM) protocol, it is clear when
> an entry has to evicted from the cache. When a DREP is sent from
> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
> when a REJ is received by the mlx4_ib_demux_cm_handler(),
> id_map_find_del() is called.
>
> This function wipes out the TID in use from the IDR or XArray and
> removes the id_map_entry from the table.
>
> In short, it does everything except the topping of the cake, which is
> to remove the entry from the list and free it. In other words, for the
> DREP and REJ cases enumerated above, both will leak one id_map_entry.
>
> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> ---
>  drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> index ecd6cadd529a..1df6d3ccfc62 100644
> --- a/drivers/infiniband/hw/mlx4/cm.c
> +++ b/drivers/infiniband/hw/mlx4/cm.c
> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
>  	if (!ent)
>  		goto out;
>  	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
> -	if (found_ent && found_ent == ent)
> +	if (found_ent && found_ent == ent) {
>  		rb_erase(&found_ent->node, sl_id_map);
> +		if (!ent->scheduled_delete) {

Why do we need to check scheduled_delete?

Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
can't race with id_map_find_del()? They both hold the same spinlock.

Thanks

> +			list_del(&ent->list);
> +			kfree(ent);
> +		}
> +	}
>  out:
>  	spin_unlock(&sriov->id_map_lock);
>  }
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
  2020-01-20 15:06 ` Leon Romanovsky
@ 2020-01-20 15:49   ` Håkon Bugge
  2020-01-20 16:27     ` Leon Romanovsky
  0 siblings, 1 reply; 6+ messages in thread
From: Håkon Bugge @ 2020-01-20 15:49 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Yishai Hadas, OFED mailing list



> On 20 Jan 2020, at 16:06, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
>> Using CX-3 virtual functions, either from a bare-metal machine or
>> pass-through from a VM, MAD packets are proxied through the PF driver.
>> 
>> Since the VF drivers have separate name spaces for MAD Transaction Ids
>> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
>> in a cache.
>> 
>> Following the RDMA Connection Manager (CM) protocol, it is clear when
>> an entry has to evicted from the cache. When a DREP is sent from
>> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
>> when a REJ is received by the mlx4_ib_demux_cm_handler(),
>> id_map_find_del() is called.
>> 
>> This function wipes out the TID in use from the IDR or XArray and
>> removes the id_map_entry from the table.
>> 
>> In short, it does everything except the topping of the cake, which is
>> to remove the entry from the list and free it. In other words, for the
>> DREP and REJ cases enumerated above, both will leak one id_map_entry.
>> 
>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>> ---
>> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
>> index ecd6cadd529a..1df6d3ccfc62 100644
>> --- a/drivers/infiniband/hw/mlx4/cm.c
>> +++ b/drivers/infiniband/hw/mlx4/cm.c
>> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
>> 	if (!ent)
>> 		goto out;
>> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
>> -	if (found_ent && found_ent == ent)
>> +	if (found_ent && found_ent == ent) {
>> 		rb_erase(&found_ent->node, sl_id_map);
>> +		if (!ent->scheduled_delete) {
> 
> Why do we need to check scheduled_delete?

1. Node receives a DREQ and mlx4_ib_demux_cm_handler() is called, which again calls schedule_delayed(), which sets scheduled_delete.

2. DREQ is proxied over to the VM, which replies with a DREP.

3. The DREP is proxied over to the PF driver, mlx4_ib_multiplex_cm_handler() is called, id_map_find_del() is called. If it is freed now (without checking scheduled_delete), it will be a double free when the delayed work kicks in.

But this documents that the commit message is not accurate, it is only the REJ case that has a leak.

> Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
> can't race with id_map_find_del()? They both hold the same spinlock.

No race, but it can be set as per the above.

If you agree, I will send a v2 with corrected commit message.


Thxs, Håkon


> 
> Thanks
> 
>> +			list_del(&ent->list);
>> +			kfree(ent);
>> +		}
>> +	}
>> out:
>> 	spin_unlock(&sriov->id_map_lock);
>> }
>> --
>> 2.20.1
>> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
  2020-01-20 15:49   ` Håkon Bugge
@ 2020-01-20 16:27     ` Leon Romanovsky
  2020-01-21 14:26       ` Håkon Bugge
  0 siblings, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2020-01-20 16:27 UTC (permalink / raw)
  To: Håkon Bugge; +Cc: Yishai Hadas, OFED mailing list

On Mon, Jan 20, 2020 at 04:49:56PM +0100, Håkon Bugge wrote:
>
>
> > On 20 Jan 2020, at 16:06, Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
> >> Using CX-3 virtual functions, either from a bare-metal machine or
> >> pass-through from a VM, MAD packets are proxied through the PF driver.
> >>
> >> Since the VF drivers have separate name spaces for MAD Transaction Ids
> >> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
> >> in a cache.
> >>
> >> Following the RDMA Connection Manager (CM) protocol, it is clear when
> >> an entry has to evicted from the cache. When a DREP is sent from
> >> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
> >> when a REJ is received by the mlx4_ib_demux_cm_handler(),
> >> id_map_find_del() is called.
> >>
> >> This function wipes out the TID in use from the IDR or XArray and
> >> removes the id_map_entry from the table.
> >>
> >> In short, it does everything except the topping of the cake, which is
> >> to remove the entry from the list and free it. In other words, for the
> >> DREP and REJ cases enumerated above, both will leak one id_map_entry.
> >>
> >> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> >> ---
> >> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> >> index ecd6cadd529a..1df6d3ccfc62 100644
> >> --- a/drivers/infiniband/hw/mlx4/cm.c
> >> +++ b/drivers/infiniband/hw/mlx4/cm.c
> >> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
> >> 	if (!ent)
> >> 		goto out;
> >> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
> >> -	if (found_ent && found_ent == ent)
> >> +	if (found_ent && found_ent == ent) {
> >> 		rb_erase(&found_ent->node, sl_id_map);
> >> +		if (!ent->scheduled_delete) {
> >
> > Why do we need to check scheduled_delete?
>
> 1. Node receives a DREQ and mlx4_ib_demux_cm_handler() is called, which again calls schedule_delayed(), which sets scheduled_delete.
>
> 2. DREQ is proxied over to the VM, which replies with a DREP.
>
> 3. The DREP is proxied over to the PF driver, mlx4_ib_multiplex_cm_handler() is called, id_map_find_del() is called. If it is freed now (without checking scheduled_delete), it will be a double free when the delayed work kicks in.

It will be the case if we don't cancel delayed work inside
id_map_find_del(), but it raises other question. Why do we need two
identical delete functions? Can we convert id_map_find_del() callers
to use id_map_ent_timeout() instead?

Thanks

>
> But this documents that the commit message is not accurate, it is only the REJ case that has a leak.
>
> > Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
> > can't race with id_map_find_del()? They both hold the same spinlock.
>
> No race, but it can be set as per the above.
>
> If you agree, I will send a v2 with corrected commit message.
>
>
> Thxs, Håkon
>
>
> >
> > Thanks
> >
> >> +			list_del(&ent->list);
> >> +			kfree(ent);
> >> +		}
> >> +	}
> >> out:
> >> 	spin_unlock(&sriov->id_map_lock);
> >> }
> >> --
> >> 2.20.1
> >>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del
  2020-01-20 16:27     ` Leon Romanovsky
@ 2020-01-21 14:26       ` Håkon Bugge
  0 siblings, 0 replies; 6+ messages in thread
From: Håkon Bugge @ 2020-01-21 14:26 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Yishai Hadas, OFED mailing list



> On 20 Jan 2020, at 17:27, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Mon, Jan 20, 2020 at 04:49:56PM +0100, Håkon Bugge wrote:
>> 
>> 
>>> On 20 Jan 2020, at 16:06, Leon Romanovsky <leon@kernel.org> wrote:
>>> 
>>> On Fri, Jan 17, 2020 at 02:56:22PM +0100, Håkon Bugge wrote:
>>>> Using CX-3 virtual functions, either from a bare-metal machine or
>>>> pass-through from a VM, MAD packets are proxied through the PF driver.
>>>> 
>>>> Since the VF drivers have separate name spaces for MAD Transaction Ids
>>>> (TIDs), the PF driver has to re-map the TIDs and keep the book keeping
>>>> in a cache.
>>>> 
>>>> Following the RDMA Connection Manager (CM) protocol, it is clear when
>>>> an entry has to evicted from the cache. When a DREP is sent from
>>>> mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar
>>>> when a REJ is received by the mlx4_ib_demux_cm_handler(),
>>>> id_map_find_del() is called.
>>>> 
>>>> This function wipes out the TID in use from the IDR or XArray and
>>>> removes the id_map_entry from the table.
>>>> 
>>>> In short, it does everything except the topping of the cake, which is
>>>> to remove the entry from the list and free it. In other words, for the
>>>> DREP and REJ cases enumerated above, both will leak one id_map_entry.
>>>> 
>>>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>>>> ---
>>>> drivers/infiniband/hw/mlx4/cm.c | 7 ++++++-
>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
>>>> index ecd6cadd529a..1df6d3ccfc62 100644
>>>> --- a/drivers/infiniband/hw/mlx4/cm.c
>>>> +++ b/drivers/infiniband/hw/mlx4/cm.c
>>>> @@ -197,8 +197,13 @@ static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
>>>> 	if (!ent)
>>>> 		goto out;
>>>> 	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
>>>> -	if (found_ent && found_ent == ent)
>>>> +	if (found_ent && found_ent == ent) {
>>>> 		rb_erase(&found_ent->node, sl_id_map);
>>>> +		if (!ent->scheduled_delete) {
>>> 
>>> Why do we need to check scheduled_delete?
>> 
>> 1. Node receives a DREQ and mlx4_ib_demux_cm_handler() is called, which again calls schedule_delayed(), which sets scheduled_delete.
>> 
>> 2. DREQ is proxied over to the VM, which replies with a DREP.
>> 
>> 3. The DREP is proxied over to the PF driver, mlx4_ib_multiplex_cm_handler() is called, id_map_find_del() is called. If it is freed now (without checking scheduled_delete), it will be a double free when the delayed work kicks in.
> 
> It will be the case if we don't cancel delayed work inside
> id_map_find_del(), but it raises other question. Why do we need two
> identical delete functions?

Almost identical. It's a question about being evolutionary vs. revolutionary, when fixing a leak. My preference would be to fix the leak as simple as possible, without changing any logic.

> Can we convert id_map_find_del() callers
> to use id_map_ent_timeout() instead?

I did the opposite, and I will send it out as a v2 after testing. But, I didn't like it. Fewer lines, yes, but more complexity.


Thxs, Håkon

> 
> Thanks
> 
>> 
>> But this documents that the commit message is not accurate, it is only the REJ case that has a leak.
>> 
>>> Isn't this to mark call to timeout cleanup (id_map_ent_timeout), which
>>> can't race with id_map_find_del()? They both hold the same spinlock.
>> 
>> No race, but it can be set as per the above.
>> 
>> If you agree, I will send a v2 with corrected commit message.
>> 
>> 
>> Thxs, Håkon
>> 
>> 
>>> 
>>> Thanks
>>> 
>>>> +			list_del(&ent->list);
>>>> +			kfree(ent);
>>>> +		}
>>>> +	}
>>>> out:
>>>> 	spin_unlock(&sriov->id_map_lock);
>>>> }
>>>> --
>>>> 2.20.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-01-21 14:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-17 13:56 [PATCH for-rc] IB/mlx4: Fix leak in id_map_find_del Håkon Bugge
2020-01-20 14:36 ` Håkon Bugge
2020-01-20 15:06 ` Leon Romanovsky
2020-01-20 15:49   ` Håkon Bugge
2020-01-20 16:27     ` Leon Romanovsky
2020-01-21 14:26       ` Håkon Bugge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).