* [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update
@ 2016-10-19 12:07 Petr Mladek
  2016-10-19 12:07 ` [PATCH 1/2] IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker Petr Mladek
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Petr Mladek @ 2016-10-19 12:07 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty
  Cc: Dennis Dalessandro, Hal Rosenstock, linux-rdma, Tejun Heo,
	linux-kernel, Petr Mladek

The kthread worker API has been improved in 4.9-rc1. The 2nd
patch uses the new functions and simplifies the kthread worker
creation and destruction.

I have found a possible race when working on the API conversion.
A proposed fix is in the 1st patch.

Both changes are compile-tested only. I did not find an easy way
to test them at runtime.

Petr Mladek (2):
  IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker
  IB/rdmavt: Handle the kthread worker using the new API

 drivers/infiniband/sw/rdmavt/cq.c | 64 +++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 37 deletions(-)

-- 
1.8.5.6


* [PATCH 1/2] IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker
  2016-10-19 12:07 [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Petr Mladek
@ 2016-10-19 12:07 ` Petr Mladek
  2016-10-19 12:07 ` [PATCH 2/2] IB/rdmavt: Handle the kthread worker using the new API Petr Mladek
  2016-10-20 12:56 ` [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Dalessandro, Dennis
  2 siblings, 0 replies; 5+ messages in thread
From: Petr Mladek @ 2016-10-19 12:07 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty
  Cc: Dennis Dalessandro, Hal Rosenstock, linux-rdma, Tejun Heo,
	linux-kernel, Petr Mladek

The memory barrier is not enough to prevent queuing work into
a destroyed cq kthread worker. Consider the following situation:

CPU1				CPU2

rvt_cq_enter()
  worker =  cq->rdi->worker;

				rvt_cq_exit()
				  rdi->worker = NULL;
				  smp_wmb();
				  kthread_flush_worker(worker);
				  kthread_stop(worker->task);
				  kfree(worker);

				  // nothing queued yet =>
				  // nothing flushed and
				  // happily stopped and freed

  if (likely(worker)) {
     // true => read before CPU2 acted
     cq->notify = RVT_CQ_NONE;
     cq->triggered++;
     kthread_queue_work(worker, &cq->comptask);

  BANG: worker has been flushed/stopped/freed in the meantime.

This patch solves the problem by protecting the critical sections
with rdi->n_cqs_lock. The lock does not seem to be heavily
contended and looks reasonable for this purpose.

One catch is that rvt_cq_enter() might be called from IRQ context.
Therefore we must always take the lock with IRQs disabled to avoid
a possible deadlock.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 drivers/infiniband/sw/rdmavt/cq.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 6d9904a4a0ab..223ec4589fc7 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -119,18 +119,17 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
 	if (cq->notify == IB_CQ_NEXT_COMP ||
 	    (cq->notify == IB_CQ_SOLICITED &&
 	     (solicited || entry->status != IB_WC_SUCCESS))) {
-		struct kthread_worker *worker;
 		/*
 		 * This will cause send_complete() to be called in
 		 * another thread.
 		 */
-		smp_read_barrier_depends(); /* see rvt_cq_exit */
-		worker = cq->rdi->worker;
-		if (likely(worker)) {
+		spin_lock(&cq->rdi->n_cqs_lock);
+		if (likely(cq->rdi->worker)) {
 			cq->notify = RVT_CQ_NONE;
 			cq->triggered++;
-			kthread_queue_work(worker, &cq->comptask);
+			kthread_queue_work(cq->rdi->worker, &cq->comptask);
 		}
+		spin_unlock(&cq->rdi->n_cqs_lock);
 	}
 
 	spin_unlock_irqrestore(&cq->lock, flags);
@@ -240,15 +239,15 @@ struct ib_cq *rvt_create_cq(struct ib_device *ibdev,
 		}
 	}
 
-	spin_lock(&rdi->n_cqs_lock);
+	spin_lock_irq(&rdi->n_cqs_lock);
 	if (rdi->n_cqs_allocated == rdi->dparms.props.max_cq) {
-		spin_unlock(&rdi->n_cqs_lock);
+		spin_unlock_irq(&rdi->n_cqs_lock);
 		ret = ERR_PTR(-ENOMEM);
 		goto bail_ip;
 	}
 
 	rdi->n_cqs_allocated++;
-	spin_unlock(&rdi->n_cqs_lock);
+	spin_unlock_irq(&rdi->n_cqs_lock);
 
 	if (cq->ip) {
 		spin_lock_irq(&rdi->pending_lock);
@@ -296,9 +295,9 @@ int rvt_destroy_cq(struct ib_cq *ibcq)
 	struct rvt_dev_info *rdi = cq->rdi;
 
 	kthread_flush_work(&cq->comptask);
-	spin_lock(&rdi->n_cqs_lock);
+	spin_lock_irq(&rdi->n_cqs_lock);
 	rdi->n_cqs_allocated--;
-	spin_unlock(&rdi->n_cqs_lock);
+	spin_unlock_irq(&rdi->n_cqs_lock);
 	if (cq->ip)
 		kref_put(&cq->ip->ref, rvt_release_mmap_info);
 	else
@@ -541,12 +540,15 @@ void rvt_cq_exit(struct rvt_dev_info *rdi)
 {
 	struct kthread_worker *worker;
 
-	worker = rdi->worker;
-	if (!worker)
+	/* block future queuing from send_complete() */
+	spin_lock_irq(&rdi->n_cqs_lock);
+	if (!rdi->worker) {
+		spin_unlock_irq(&rdi->n_cqs_lock);
 		return;
-	/* blocks future queuing from send_complete() */
+	}
 	rdi->worker = NULL;
-	smp_wmb(); /* See rdi_cq_enter */
+	spin_unlock_irq(&rdi->n_cqs_lock);
+
 	kthread_flush_worker(worker);
 	kthread_stop(worker->task);
 	kfree(worker);
-- 
1.8.5.6


* [PATCH 2/2] IB/rdmavt: Handle the kthread worker using the new API
  2016-10-19 12:07 [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Petr Mladek
  2016-10-19 12:07 ` [PATCH 1/2] IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker Petr Mladek
@ 2016-10-19 12:07 ` Petr Mladek
  2016-10-20 12:56 ` [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Dalessandro, Dennis
  2 siblings, 0 replies; 5+ messages in thread
From: Petr Mladek @ 2016-10-19 12:07 UTC (permalink / raw)
  To: Doug Ledford, Sean Hefty
  Cc: Dennis Dalessandro, Hal Rosenstock, linux-rdma, Tejun Heo,
	linux-kernel, Petr Mladek

Use the new API to create and destroy the cq kthread worker.
The API hides some implementation details.

In particular, kthread_create_worker() allocates and initializes
struct kthread_worker. It starts the kthread the proper way and
stores the task_struct pointer in the worker structure. In addition,
the *_on_cpu() variant binds the kthread to the given CPU and the
related memory node.

kthread_destroy_worker() flushes all pending works, stops
the kthread and frees the structure.

This patch does not change the existing behavior. Note that we must
use the *_on_cpu() variant because the function starts the kthread
and must bind it to the right CPU before waking it. The NUMA node
is derived from the given CPU as well.

Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 drivers/infiniband/sw/rdmavt/cq.c | 34 +++++++++++-----------------------
 1 file changed, 11 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 223ec4589fc7..4d0b6992e847 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -503,33 +503,23 @@ int rvt_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
  */
 int rvt_driver_cq_init(struct rvt_dev_info *rdi)
 {
-	int ret = 0;
 	int cpu;
-	struct task_struct *task;
+	struct kthread_worker *worker;
 
 	if (rdi->worker)
 		return 0;
+
 	spin_lock_init(&rdi->n_cqs_lock);
-	rdi->worker = kzalloc(sizeof(*rdi->worker), GFP_KERNEL);
-	if (!rdi->worker)
-		return -ENOMEM;
-	kthread_init_worker(rdi->worker);
-	task = kthread_create_on_node(
-		kthread_worker_fn,
-		rdi->worker,
-		rdi->dparms.node,
-		"%s", rdi->dparms.cq_name);
-	if (IS_ERR(task)) {
-		kfree(rdi->worker);
-		rdi->worker = NULL;
-		return PTR_ERR(task);
-	}
 
-	set_user_nice(task, MIN_NICE);
 	cpu = cpumask_first(cpumask_of_node(rdi->dparms.node));
-	kthread_bind(task, cpu);
-	wake_up_process(task);
-	return ret;
+	worker = kthread_create_worker_on_cpu(cpu, 0,
+					      "%s", rdi->dparms.cq_name);
+	if (IS_ERR(worker))
+		return PTR_ERR(worker);
+
+	set_user_nice(worker->task, MIN_NICE);
+	rdi->worker = worker;
+	return 0;
 }
 
 /**
@@ -549,7 +539,5 @@ void rvt_cq_exit(struct rvt_dev_info *rdi)
 	rdi->worker = NULL;
 	spin_unlock_irq(&rdi->n_cqs_lock);
 
-	kthread_flush_worker(worker);
-	kthread_stop(worker->task);
-	kfree(worker);
+	kthread_destroy_worker(worker);
 }
-- 
1.8.5.6


* Re: [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update
  2016-10-19 12:07 [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Petr Mladek
  2016-10-19 12:07 ` [PATCH 1/2] IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker Petr Mladek
  2016-10-19 12:07 ` [PATCH 2/2] IB/rdmavt: Handle the kthread worker using the new API Petr Mladek
@ 2016-10-20 12:56 ` Dalessandro, Dennis
  2016-12-14 17:17   ` Doug Ledford
  2 siblings, 1 reply; 5+ messages in thread
From: Dalessandro, Dennis @ 2016-10-20 12:56 UTC (permalink / raw)
  To: dledford, pmladek, Hefty, Sean
  Cc: hal.rosenstock, tj, linux-rdma, linux-kernel

On Wed, 2016-10-19 at 14:07 +0200, Petr Mladek wrote:
> The kthread worker API has been improved in 4.9-rc1. The 2nd
> patch uses the new functions and simplifies the kthread worker
> creation and destroying.
> 
> I have found a possible race when working on the API conversion.
> A proposed fix is in the 1st patch.
> 
> Both changes are compile tested only. I did not find an easy way
> how to test them at runtime.
> 
> Petr Mladek (2):
>   IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker
>   IB/rdmavt: Handle the kthread worker using the new API
> 
>  drivers/infiniband/sw/rdmavt/cq.c | 64 +++++++++++++++++----------
> ------------
>  1 file changed, 27 insertions(+), 37 deletions(-)

Thanks for the patches. I'm going to take a closer look; I've only
just now seen these.

-Denny


* Re: [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update
  2016-10-20 12:56 ` [PATCH 0/2] IB/rdmavt: cq kthread worker fix and API update Dalessandro, Dennis
@ 2016-12-14 17:17   ` Doug Ledford
  0 siblings, 0 replies; 5+ messages in thread
From: Doug Ledford @ 2016-12-14 17:17 UTC (permalink / raw)
  To: Dalessandro, Dennis, pmladek, Hefty, Sean
  Cc: hal.rosenstock, tj, linux-rdma, linux-kernel



On 10/20/2016 8:56 AM, Dalessandro, Dennis wrote:
> On Wed, 2016-10-19 at 14:07 +0200, Petr Mladek wrote:
>> The kthread worker API has been improved in 4.9-rc1. The 2nd
>> patch uses the new functions and simplifies the kthread worker
>> creation and destroying.
>>
>> I have found a possible race when working on the API conversion.
>> A proposed fix is in the 1st patch.
>>
>> Both changes are compile tested only. I did not find an easy way
>> how to test them at runtime.
>>
>> Petr Mladek (2):
>>   IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker
>>   IB/rdmavt: Handle the kthread worker using the new API
>>
>>  drivers/infiniband/sw/rdmavt/cq.c | 64 +++++++++++++++++----------
>> ------------
>>  1 file changed, 27 insertions(+), 37 deletions(-)
> 
> Thanks for the patches. I'm going to take a closer look, I just now
> seen these.
> 
> -Denny
> 

These looked good to me, and you haven't objected, so I'm taking this
series.  Thanks.

-- 
Doug Ledford <dledford@redhat.com>
    GPG Key ID: 0E572FDD



