linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Håkon Bugge" <haakon.bugge@oracle.com>,
	"Manjunath Patil" <manjunath.b.patil@oracle.com>,
	"Rama Nichanamatlu" <rama.nichanamatlu@oracle.com>,
	"Jack Morgenstein" <jackm@dev.mellanox.co.il>,
	"Jason Gunthorpe" <jgg@mellanox.com>,
	"Sasha Levin" <sashal@kernel.org>,
	linux-rdma@vger.kernel.org
Subject: [PATCH AUTOSEL 5.4 421/459] IB/mlx4: Fix leak in id_map_find_del
Date: Fri, 14 Feb 2020 11:01:11 -0500	[thread overview]
Message-ID: <20200214160149.11681-421-sashal@kernel.org> (raw)
In-Reply-To: <20200214160149.11681-1-sashal@kernel.org>

From: Håkon Bugge <haakon.bugge@oracle.com>

[ Upstream commit ea660ad7c1c476fd6e5e3b17780d47159db71dea ]

Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.

Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping in
a cache.

Following the RDMA Connection Manager (CM) protocol, it is clear when an
entry has to evicted from the cache. When a DREP is sent from
mlx4_ib_multiplex_cm_handler(), id_map_find_del() is called. Similar when
a REJ is received by the mlx4_ib_demux_cm_handler(), id_map_find_del() is
called.

This function wipes out the TID in use from the IDR or XArray and removes
the id_map_entry from the table.

In short, it does everything except the topping of the cake, which is to
remove the entry from the list and free it. In other words, for the REJ
case enumerated above, one id_map_entry will be leaked.

For the other case above, a DREQ has been received first. The reception of
the DREQ will trigger queuing of a delayed work to delete the
id_map_entry, for the case where the VM doesn't send back a DREP.

In the normal case, the VM _will_ send back a DREP, and id_map_find_del()
will be called.

But this scenario introduces a secondary leak. First, when the DREQ is
received, a delayed work is queued. The VM will then return a DREP, which
will call id_map_find_del(). As stated above, this will free the TID used
from the XArray or IDR. Now, there is window where that particular TID can
be re-allocated, lets say by an outgoing REQ. This TID will later be wiped
out by the delayed work, when the function id_map_ent_timeout() is
called. But the id_map_entry allocated by the outgoing REQ will not be
de-allocated, and we have a leak.

Both leaks are fixed by removing the id_map_find_del() function and only
using schedule_delayed(). Of course, a check in schedule_delayed() to see
if the work already has been queued, has been added.

Another benefit of always using the delayed version for deleting entries,
is that we do get a TimeWait effect; a TID no longer in use, will occupy
the XArray or IDR for CM_CLEANUP_CACHE_TIMEOUT time, without any ability
of being re-used for that time period.

Fixes: 3cf69cc8dbeb ("IB/mlx4: Add CM paravirtualization")
Link: https://lore.kernel.org/r/20200123155521.1212288-1-haakon.bugge@oracle.com
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Reviewed-by: Rama Nichanamatlu <rama.nichanamatlu@oracle.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/hw/mlx4/cm.c | 29 +++--------------------------
 1 file changed, 3 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index ecd6cadd529a5..b591861934b3c 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -186,23 +186,6 @@ static void id_map_ent_timeout(struct work_struct *work)
 	kfree(ent);
 }
 
-static void id_map_find_del(struct ib_device *ibdev, int pv_cm_id)
-{
-	struct mlx4_ib_sriov *sriov = &to_mdev(ibdev)->sriov;
-	struct rb_root *sl_id_map = &sriov->sl_id_map;
-	struct id_map_entry *ent, *found_ent;
-
-	spin_lock(&sriov->id_map_lock);
-	ent = xa_erase(&sriov->pv_id_table, pv_cm_id);
-	if (!ent)
-		goto out;
-	found_ent = id_map_find_by_sl_id(ibdev, ent->slave_id, ent->sl_cm_id);
-	if (found_ent && found_ent == ent)
-		rb_erase(&found_ent->node, sl_id_map);
-out:
-	spin_unlock(&sriov->id_map_lock);
-}
-
 static void sl_id_map_add(struct ib_device *ibdev, struct id_map_entry *new)
 {
 	struct rb_root *sl_id_map = &to_mdev(ibdev)->sriov.sl_id_map;
@@ -294,7 +277,7 @@ static void schedule_delayed(struct ib_device *ibdev, struct id_map_entry *id)
 	spin_lock(&sriov->id_map_lock);
 	spin_lock_irqsave(&sriov->going_down_lock, flags);
 	/*make sure that there is no schedule inside the scheduled work.*/
-	if (!sriov->is_going_down) {
+	if (!sriov->is_going_down && !id->scheduled_delete) {
 		id->scheduled_delete = 1;
 		schedule_delayed_work(&id->timeout, CM_CLEANUP_CACHE_TIMEOUT);
 	}
@@ -341,9 +324,6 @@ int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id
 
 	if (mad->mad_hdr.attr_id == CM_DREQ_ATTR_ID)
 		schedule_delayed(ibdev, id);
-	else if (mad->mad_hdr.attr_id == CM_DREP_ATTR_ID)
-		id_map_find_del(ibdev, pv_cm_id);
-
 	return 0;
 }
 
@@ -382,12 +362,9 @@ int mlx4_ib_demux_cm_handler(struct ib_device *ibdev, int port, int *slave,
 		*slave = id->slave_id;
 	set_remote_comm_id(mad, id->sl_cm_id);
 
-	if (mad->mad_hdr.attr_id == CM_DREQ_ATTR_ID)
+	if (mad->mad_hdr.attr_id == CM_DREQ_ATTR_ID ||
+	    mad->mad_hdr.attr_id == CM_REJ_ATTR_ID)
 		schedule_delayed(ibdev, id);
-	else if (mad->mad_hdr.attr_id == CM_REJ_ATTR_ID ||
-			mad->mad_hdr.attr_id == CM_DREP_ATTR_ID) {
-		id_map_find_del(ibdev, (int) pv_cm_id);
-	}
 
 	return 0;
 }
-- 
2.20.1


      parent reply	other threads:[~2020-02-14 17:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20200214160149.11681-1-sashal@kernel.org>
2020-02-14 15:54 ` [PATCH AUTOSEL 5.4 034/459] RDMA/i40iw: fix a potential NULL pointer dereference Sasha Levin
2020-02-14 15:54 ` [PATCH AUTOSEL 5.4 039/459] RDMA/netlink: Do not always generate an ACK for some netlink operations Sasha Levin
2020-02-14 15:55 ` [PATCH AUTOSEL 5.4 052/459] IB/core: Let IB core distribute cache update events Sasha Levin
2020-02-14 15:55 ` [PATCH AUTOSEL 5.4 102/459] RDMA/umem: Fix ib_umem_find_best_pgsz() Sasha Levin
2020-02-14 15:55 ` [PATCH AUTOSEL 5.4 103/459] RDMA/cma: Fix unbalanced cm_id reference count during address resolve Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 240/459] RDMA/rxe: Fix error type of mmap_offset Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 256/459] mlx5: work around high stack usage with gcc Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 257/459] RDMA/hns: Avoid printing address of mtt page Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 273/459] RDMA/core: Fix locking in ib_uverbs_event_read Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 274/459] IB/hfi1: Add software counter for ctxt0 seq drop Sasha Levin
2020-02-14 15:58 ` [PATCH AUTOSEL 5.4 275/459] IB/hfi1: Add RcvShortLengthErrCnt to hfi1stats Sasha Levin
2020-02-14 15:59 ` [PATCH AUTOSEL 5.4 295/459] RDMA/uverbs: Remove needs_kfree_rcu from uverbs_obj_type_class Sasha Levin
2020-02-14 15:59 ` [PATCH AUTOSEL 5.4 324/459] IB/srp: Never use immediate data if it is disabled by a user Sasha Levin
2020-02-14 15:59 ` [PATCH AUTOSEL 5.4 333/459] RDMA/mlx5: Don't fake udata for kernel path Sasha Levin
2020-02-14 15:59 ` [PATCH AUTOSEL 5.4 338/459] RDMA/uverbs: Verify MR access flags Sasha Levin
2020-02-14 15:59 ` [PATCH AUTOSEL 5.4 339/459] IB/mlx4: Fix memory leak in add_gid error flow Sasha Levin
2020-02-14 16:01 ` Sasha Levin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200214160149.11681-421-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=haakon.bugge@oracle.com \
    --cc=jackm@dev.mellanox.co.il \
    --cc=jgg@mellanox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=manjunath.b.patil@oracle.com \
    --cc=rama.nichanamatlu@oracle.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).