linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Jack Morgenstein <jackm@dev.mellanox.co.il>,
	Matan Barak <matanb@mellanox.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH 4.9 29/31] net/mlx4_core: Fix racy CQ (Completion Queue) free
Date: Sun, 16 Apr 2017 10:04:20 +0200	[thread overview]
Message-ID: <20170416080223.305267031@linuxfoundation.org> (raw)
In-Reply-To: <20170416080221.808058771@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

commit 291c566a28910614ce42d0ffe82196eddd6346f4 upstream.

In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
its being used by the radix tree lookup function.

Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().

Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
   we no longer take this spinlock in the interrupt context.
   The spinlock here, therefore, simply protects against different
   threads simultaneously invoking mlx4_cq_free() for different cq's.

2. In function mlx4_cq_free(), we move the radix tree delete to before
   the synchronize_irq() calls. This guarantees that we will not
   access this cq during any subsequent interrupts, and therefore can
   safely free the CQ after the synchronize_irq calls. The rcu_read_lock
   in the interrupt handlers only needs to protect against corrupting the
   radix tree; the interrupt handlers may access the cq outside the
   rcu_read_lock due to the synchronize_irq calls which protect against
   premature freeing of the cq.

3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.

4. We leave the cq reference count mechanism in place, because it is
   still needed for the cq completion tasklet mechanism.

Fixes: 6d90aa5cf17b ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/net/ethernet/mellanox/mlx4/cq.c |   38 ++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -101,13 +101,19 @@ void mlx4_cq_completion(struct mlx4_dev
 {
 	struct mlx4_cq *cq;
 
+	rcu_read_lock();
 	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
 			       cqn & (dev->caps.num_cqs - 1));
+	rcu_read_unlock();
+
 	if (!cq) {
 		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	++cq->arm_sn;
 
 	cq->comp(cq);
@@ -118,23 +124,19 @@ void mlx4_cq_event(struct mlx4_dev *dev,
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	spin_lock(&cq_table->lock);
-
+	rcu_read_lock();
 	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
-	if (cq)
-		atomic_inc(&cq->refcount);
-
-	spin_unlock(&cq_table->lock);
+	rcu_read_unlock();
 
 	if (!cq) {
-		mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
+		mlx4_dbg(dev, "Async event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	/* Acessing the CQ outside of rcu_read_lock is safe, because
+	 * the CQ is freed only after interrupt handling is completed.
+	 */
 	cq->event(cq, event_type);
-
-	if (atomic_dec_and_test(&cq->refcount))
-		complete(&cq->free);
 }
 
 static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox,
@@ -301,9 +303,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev,
 	if (err)
 		return err;
 
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 	if (err)
 		goto err_icm;
 
@@ -349,9 +351,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev,
 	return 0;
 
 err_radix:
-	spin_lock_irq(&cq_table->lock);
+	spin_lock(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	spin_unlock(&cq_table->lock);
 
 err_icm:
 	mlx4_cq_free_icm(dev, cq->cqn);
@@ -370,15 +372,15 @@ void mlx4_cq_free(struct mlx4_dev *dev,
 	if (err)
 		mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
 
+	spin_lock(&cq_table->lock);
+	radix_tree_delete(&cq_table->tree, cq->cqn);
+	spin_unlock(&cq_table->lock);
+
 	synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
 	if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
 	    priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
 		synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
 
-	spin_lock_irq(&cq_table->lock);
-	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
-
 	if (atomic_dec_and_test(&cq->refcount))
 		complete(&cq->free);
 	wait_for_completion(&cq->free);

  parent reply	other threads:[~2017-04-16  8:13 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-16  8:03 [PATCH 4.9 00/31] 4.9.23-stable review Greg Kroah-Hartman
2017-04-16  8:03 ` [PATCH 4.9 05/31] drm/i915: Drop support for I915_EXEC_CONSTANTS_* execbuf parameters Greg Kroah-Hartman
2017-04-16  8:03 ` [PATCH 4.9 06/31] drm/i915: Stop using RP_DOWN_EI on Baytrail Greg Kroah-Hartman
2017-04-16  8:03 ` [PATCH 4.9 07/31] drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker) Greg Kroah-Hartman
2017-04-16  8:03 ` [PATCH 4.9 08/31] orangefs: fix memory leak of string new on exit path Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 09/31] orangefs: Dan Carpenter influenced cleanups Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 10/31] orangefs: fix buffer size mis-match between kernel space and user space Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 12/31] rt2x00usb: fix anchor initialization Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 13/31] rt2x00usb: do not anchor rx and tx urbs Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 14/31] rt2x00: Fix incorrect usage of CONFIG_RT2X00_LIB_USB Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 16/31] MIPS: Introduce irq_stack Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 17/31] MIPS: Stack unwinding while on IRQ stack Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 18/31] MIPS: Only change $28 to thread_info if coming from user mode Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 19/31] MIPS: Switch to the irq_stack in interrupts Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 20/31] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 21/31] MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 24/31] [PATCH] Revert "drm/i915/execlists: Reset RING registers upon resume" Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 25/31] net/packet: fix overflow in check for priv area size Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 26/31] blk-mq: Avoid memory reclaim when remapping queues Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 27/31] usb: hub: Wait for connection to be reestablished after port reset Greg Kroah-Hartman
2017-04-16  8:04 ` [PATCH 4.9 28/31] net/mlx4_en: Fix bad WQE issue Greg Kroah-Hartman
2017-04-16  8:04 ` Greg Kroah-Hartman [this message]
2017-04-16  8:04 ` [PATCH 4.9 30/31] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Greg Kroah-Hartman
2017-04-16 21:29 ` [PATCH 4.9 00/31] 4.9.23-stable review Guenter Roeck
2017-04-17 18:21 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170416080223.305267031@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=jackm@dev.mellanox.co.il \
    --cc=linux-kernel@vger.kernel.org \
    --cc=matanb@mellanox.com \
    --cc=stable@vger.kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=tariqt@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).