* [PATCH RT] nvdimm: make lane acquirement RT aware
From: Yongxin Liu @ 2019-03-06  9:57 UTC
  To: linux-kernel, linux-rt-users
  Cc: linux-nvdimm, bigeasy, rostedt, paul.gortmaker, tglx

Currently, the nvdimm driver isn't RT compatible:
nd_region_acquire_lane() disables preemption with get_cpu(), which
causes "scheduling while atomic" spews on RT when using fio to test
pmem as a block device.

In this change, we replace get_cpu()/put_cpu() with local_lock_cpu()/
local_unlock_cpu() and introduce the per-CPU local lock
"ndl_local_lock". Since RT allows preemption, this lock prevents
races on the same lane between tasks on the same CPU. When there are
more CPUs than lanes, a lane can be shared among CPUs;
"ndl_lock->lock" protects the lane in that situation.

This patch is derived from Dan Williams and Pankaj Gupta's proposal from
https://www.mail-archive.com/linux-nvdimm@lists.01.org/msg13359.html
and https://www.spinics.net/lists/linux-rt-users/msg20280.html.
Many thanks to them.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
---
 drivers/nvdimm/region_devs.c | 40 +++++++++++++++++++---------------------
 1 file changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index fa37afcd43ff..6c5388cf2477 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -18,9 +18,13 @@
 #include <linux/sort.h>
 #include <linux/io.h>
 #include <linux/nd.h>
+#include <linux/locallock.h>
 #include "nd-core.h"
 #include "nd.h"
 
+/* lock for tasks on the same CPU to sequence the access to the lane */
+static DEFINE_LOCAL_IRQ_LOCK(ndl_local_lock);
+
 /*
  * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
  * irrelevant.
@@ -935,18 +939,15 @@ int nd_blk_region_init(struct nd_region *nd_region)
 unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
 {
 	unsigned int cpu, lane;
+	struct nd_percpu_lane *ndl_lock, *ndl_count;
 
-	cpu = get_cpu();
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
+	cpu = local_lock_cpu(ndl_local_lock);
 
-		lane = cpu % nd_region->num_lanes;
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (ndl_count->count++ == 0)
-			spin_lock(&ndl_lock->lock);
-	} else
-		lane = cpu;
+	lane = cpu % nd_region->num_lanes;
+	ndl_count = per_cpu_ptr(nd_region->lane, cpu);
+	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+	if (ndl_count->count++ == 0)
+		spin_lock(&ndl_lock->lock);
 
 	return lane;
 }
@@ -954,17 +955,14 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
 
 void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
 {
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		unsigned int cpu = get_cpu();
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
-
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (--ndl_count->count == 0)
-			spin_unlock(&ndl_lock->lock);
-		put_cpu();
-	}
-	put_cpu();
+	struct nd_percpu_lane *ndl_lock, *ndl_count;
+	unsigned int cpu = smp_processor_id();
+
+	ndl_count = per_cpu_ptr(nd_region->lane, cpu);
+	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+	if (--ndl_count->count == 0)
+		spin_unlock(&ndl_lock->lock);
+	local_unlock_cpu(ndl_local_lock);
 }
 EXPORT_SYMBOL(nd_region_release_lane);
 
-- 
2.14.4


* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Dan Williams @ 2019-03-06 16:35 UTC
  To: Yongxin Liu
  Cc: linux-rt-users, linux-nvdimm, Sebastian Andrzej Siewior,
	Linux Kernel Mailing List, Steven Rostedt, Paul Gortmaker,
	Thomas Gleixner

On Wed, Mar 6, 2019 at 2:05 AM Yongxin Liu <yongxin.liu@windriver.com> wrote:
>
> Currently, the nvdimm driver isn't RT compatible:
> nd_region_acquire_lane() disables preemption with get_cpu(), which
> causes "scheduling while atomic" spews on RT when using fio to test
> pmem as a block device.
>
> In this change, we replace get_cpu()/put_cpu() with local_lock_cpu()/
> local_unlock_cpu() and introduce the per-CPU local lock
> "ndl_local_lock". Since RT allows preemption, this lock prevents
> races on the same lane between tasks on the same CPU. When there are
> more CPUs than lanes, a lane can be shared among CPUs;
> "ndl_lock->lock" protects the lane in that situation.
>
> This patch is derived from Dan Williams and Pankaj Gupta's proposal from
> https://www.mail-archive.com/linux-nvdimm@lists.01.org/msg13359.html
> and https://www.spinics.net/lists/linux-rt-users/msg20280.html.
> Many thanks to them.
>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Pankaj Gupta <pagupta@redhat.com>
> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
> Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>

Looks ok to me in concept.

Acked-by: Dan Williams <dan.j.williams@intel.com>

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Sebastian Andrzej Siewior @ 2019-03-07 14:33 UTC
  To: Yongxin Liu
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	paul.gortmaker, tglx

On 2019-03-06 17:57:09 [+0800], Yongxin Liu wrote:
> In this change, we replace get_cpu()/put_cpu() with local_lock_cpu()/
> local_unlock_cpu() and introduce the per-CPU local lock
> "ndl_local_lock". Since RT allows preemption, this lock prevents
> races on the same lane between tasks on the same CPU. When there are
> more CPUs than lanes, a lane can be shared among CPUs;
> "ndl_lock->lock" protects the lane in that situation.

so what was the reason that get_cpu() can't be replaced with
raw_smp_processor_id()?

Sebastian

* RE: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Liu, Yongxin @ 2019-03-08  0:07 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de

> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Sebastian Andrzej Siewior
> Sent: Thursday, March 7, 2019 22:34
> To: Liu, Yongxin
> Cc: linux-kernel@vger.kernel.org; linux-rt-users@vger.kernel.org;
> tglx@linutronix.de; rostedt@goodmis.org; dan.j.williams@intel.com;
> pagupta@redhat.com; Gortmaker, Paul; linux-nvdimm@lists.01.org
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
> 
> On 2019-03-06 17:57:09 [+0800], Yongxin Liu wrote:
> > In this change, we replace get_cpu()/put_cpu() with local_lock_cpu()/
> > local_unlock_cpu() and introduce the per-CPU local lock
> > "ndl_local_lock". Since RT allows preemption, this lock prevents
> > races on the same lane between tasks on the same CPU. When there are
> > more CPUs than lanes, a lane can be shared among CPUs;
> > "ndl_lock->lock" protects the lane in that situation.
> 
> so what was the reason that get_cpu() can't be replaced with
> raw_smp_processor_id()?
> 
> Sebastian

The lane is a critical resource that needs to be protected. One CPU can use only one
lane. If the number of CPUs is greater than the total number of lanes, a lane can be
shared among CPUs.

In a non-RT kernel, get_cpu() disables preemption by calling preempt_disable() first,
so only one thread on a given CPU can hold the lane.

In an RT kernel, if we only use raw_smp_processor_id(), the lane isn't protected:
two threads on the same CPU can get the same lane at the same time.

In this patch, the two-level lock avoids that race on the lane.

          CPU A                  CPU B (B == A % num_lanes)
 
    task A1    task A2     task B1    task B2
       |          |           |          |
       |__________|           |__________|
            |                      |
       ndl_local_lock           ndl_local_lock
            |                      |
            |______________________|
                       |
                       |
                  ndl_lock->lock
                       |
                       |
                      lane
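
A user-space analogue of this two-level scheme (illustration only, not
kernel code): a recursive per-"CPU" mutex stands in for the -RT local
lock, which is owner-recursive, and a per-lane mutex stands in for
"ndl_lock->lock". All names and sizes here are invented for the sketch.

#include <pthread.h>

#define NR_SLOTS 8                      /* stand-in for nr_cpu_ids */
#define NR_LANES 4                      /* stand-in for num_lanes */

static pthread_mutex_t local_lock[NR_SLOTS];    /* ~ ndl_local_lock */
static pthread_mutex_t lane_lock[NR_LANES];     /* ~ ndl_lock->lock */
static int lane_count[NR_SLOTS];                /* ~ ndl_count->count */

static void init_locks(void)
{
	pthread_mutexattr_t attr;
	int i;

	/* model the owner-recursive behaviour of the -RT local lock */
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	for (i = 0; i < NR_SLOTS; i++)
		pthread_mutex_init(&local_lock[i], &attr);
	for (i = 0; i < NR_LANES; i++)
		pthread_mutex_init(&lane_lock[i], NULL);
}

static unsigned int acquire_lane(unsigned int slot)
{
	unsigned int lane = slot % NR_LANES;

	pthread_mutex_lock(&local_lock[slot]);  /* ~ local_lock_cpu() */
	/* only the outermost acquire on this slot takes the lane lock */
	if (lane_count[slot]++ == 0)
		pthread_mutex_lock(&lane_lock[lane]);
	return lane;
}

static void release_lane(unsigned int slot, unsigned int lane)
{
	if (--lane_count[slot] == 0)
		pthread_mutex_unlock(&lane_lock[lane]);
	pthread_mutex_unlock(&local_lock[slot]); /* ~ local_unlock_cpu() */
}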

 
Thanks,
Yongxin

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Pankaj Gupta @ 2019-03-08  6:31 UTC
  To: Yongxin Liu
  Cc: linux-rt-users, linux-nvdimm, bigeasy, linux-kernel, rostedt,
	paul.gortmaker, tglx


> Currently, the nvdimm driver isn't RT compatible:
> nd_region_acquire_lane() disables preemption with get_cpu(), which
> causes "scheduling while atomic" spews on RT when using fio to test
> pmem as a block device.
> 
> In this change, we replace get_cpu()/put_cpu() with local_lock_cpu()/
> local_unlock_cpu() and introduce the per-CPU local lock
> "ndl_local_lock". Since RT allows preemption, this lock prevents
> races on the same lane between tasks on the same CPU. When there are
> more CPUs than lanes, a lane can be shared among CPUs;
> "ndl_lock->lock" protects the lane in that situation.
> 
> This patch is derived from Dan Williams and Pankaj Gupta's proposal from
> https://www.mail-archive.com/linux-nvdimm@lists.01.org/msg13359.html
> and https://www.spinics.net/lists/linux-rt-users/msg20280.html.
> Many thanks to them.
> 
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Pankaj Gupta <pagupta@redhat.com>
> Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
> Cc: linux-nvdimm <linux-nvdimm@lists.01.org>
> Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>

This patch looks good to me.

Acked-by: Pankaj Gupta <pagupta@redhat.com>

> ---
>  drivers/nvdimm/region_devs.c | 40 +++++++++++++++++++---------------------
>  1 file changed, 19 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index fa37afcd43ff..6c5388cf2477 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -18,9 +18,13 @@
>  #include <linux/sort.h>
>  #include <linux/io.h>
>  #include <linux/nd.h>
> +#include <linux/locallock.h>
>  #include "nd-core.h"
>  #include "nd.h"
>  
> +/* lock for tasks on the same CPU to sequence the access to the lane */
> +static DEFINE_LOCAL_IRQ_LOCK(ndl_local_lock);
> +
>  /*
>   * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
>   * irrelevant.
> @@ -935,18 +939,15 @@ int nd_blk_region_init(struct nd_region *nd_region)
>  unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
>  {
>  	unsigned int cpu, lane;
> +	struct nd_percpu_lane *ndl_lock, *ndl_count;
>  
> -	cpu = get_cpu();
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> +	cpu = local_lock_cpu(ndl_local_lock);
>  
> -		lane = cpu % nd_region->num_lanes;
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (ndl_count->count++ == 0)
> -			spin_lock(&ndl_lock->lock);
> -	} else
> -		lane = cpu;
> +	lane = cpu % nd_region->num_lanes;
> +	ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> +	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> +	if (ndl_count->count++ == 0)
> +		spin_lock(&ndl_lock->lock);
>  
>  	return lane;
>  }
> @@ -954,17 +955,14 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
>  
>  void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
>  {
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		unsigned int cpu = get_cpu();
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> -
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (--ndl_count->count == 0)
> -			spin_unlock(&ndl_lock->lock);
> -		put_cpu();
> -	}
> -	put_cpu();
> +	struct nd_percpu_lane *ndl_lock, *ndl_count;
> +	unsigned int cpu = smp_processor_id();
> +
> +	ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> +	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> +	if (--ndl_count->count == 0)
> +		spin_unlock(&ndl_lock->lock);
> +	local_unlock_cpu(ndl_local_lock);
>  }
>  EXPORT_SYMBOL(nd_region_release_lane);
>  
> --
> 2.14.4
> 
> 

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Sebastian Andrzej Siewior @ 2019-03-08  9:41 UTC
  To: Liu, Yongxin
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de

On 2019-03-08 00:07:41 [+0000], Liu, Yongxin wrote:
> The lane is a critical resource that needs to be protected. One CPU can use only one
> lane. If the number of CPUs is greater than the total number of lanes, a lane can be
> shared among CPUs.
> 
> In a non-RT kernel, get_cpu() disables preemption by calling preempt_disable() first,
> so only one thread on a given CPU can hold the lane.
> 
> In an RT kernel, if we only use raw_smp_processor_id(), the lane isn't protected:
> two threads on the same CPU can get the same lane at the same time.
> 
> In this patch, the two-level lock avoids that race on the lane.

but you still have the ndl_lock->lock which protects the resource. So in
the unlikely (but possible) event that you switch CPUs after obtaining
the CPU number, you block on the lock. No harm is done, right?

> Thanks,
> Yongxin

Sebastian

* RE: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Liu, Yongxin @ 2019-03-11  0:44 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de


> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Sebastian Andrzej Siewior
> Sent: Friday, March 8, 2019 17:42
> To: Liu, Yongxin
> Cc: linux-kernel@vger.kernel.org; linux-rt-users@vger.kernel.org;
> tglx@linutronix.de; rostedt@goodmis.org; dan.j.williams@intel.com;
> pagupta@redhat.com; Gortmaker, Paul; linux-nvdimm@lists.01.org
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
> 
> On 2019-03-08 00:07:41 [+0000], Liu, Yongxin wrote:
> > The lane is a critical resource that needs to be protected. One CPU
> > can use only one lane. If the number of CPUs is greater than the
> > total number of lanes, a lane can be shared among CPUs.
> >
> > In a non-RT kernel, get_cpu() disables preemption by calling
> > preempt_disable() first, so only one thread on a given CPU can hold
> > the lane.
> >
> > In an RT kernel, if we only use raw_smp_processor_id(), the lane
> > isn't protected: two threads on the same CPU can get the same lane
> > at the same time.
> >
> > In this patch, the two-level lock avoids that race on the lane.
> 
> but you still have the ndl_lock->lock which protects the resource. So in
> the unlikely (but possible) event that you switch CPUs after obtaining
> the CPU number, you block on the lock. No harm is done, right?

The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a conditional lock.

ndl_count->count is per CPU.
ndl_lock->lock is per lane.

Here is an example:
Thread A  on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
--> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".

Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" ("ndl_count->count"
was changed by Thread A)

If we use raw_smp_processor_id(), no matter which CPU the thread was migrated to, 
if there is another thread running on the old CPU, there will be race condition 
due to per CPU variable "ndl_count->count".
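
To make the race concrete, here is a sketch of what a
raw_smp_processor_id()-only variant would look like (hypothetical
code, not from any posted patch), with the problematic interleaving
described at the spot where it bites:

unsigned int broken_acquire_lane(struct nd_region *nd_region)
{
	unsigned int cpu = raw_smp_processor_id();
	unsigned int lane = cpu % nd_region->num_lanes;
	struct nd_percpu_lane *ndl_count = per_cpu_ptr(nd_region->lane, cpu);
	struct nd_percpu_lane *ndl_lock = per_cpu_ptr(nd_region->lane, lane);

	/*
	 * Without preempt_disable() or a local lock, tasks on the same
	 * CPU are not serialized: task A increments count 0 -> 1 and
	 * takes the lane lock; task B on the same CPU then increments
	 * 1 -> 2 and skips the lock entirely, entering the lane
	 * critical section concurrently with A.
	 */
	if (ndl_count->count++ == 0)
		spin_lock(&ndl_lock->lock);
	return lane;
}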


Thanks,
Yongxin

> 
> Sebastian

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Sebastian Andrzej Siewior @ 2019-03-15 16:42 UTC
  To: Liu, Yongxin
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de

On 2019-03-11 00:44:58 [+0000], Liu, Yongxin wrote:
> > but you still have the ndl_lock->lock which protects the resource. So in
> > the unlikely (but possible) event that you switch CPUs after obtaining
> > the CPU number, you block on the lock. No harm is done, right?
> 
> The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a conditional lock.
> 
> ndl_count->count is per CPU.
> ndl_lock->lock is per lane.
> 
> Here is an example:
> Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get "ndl_lock->lock"
> --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock" due to "ndl_count->count++".
> 
> Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass "ndl_lock->lock"
> ("ndl_count->count" was already changed by Thread A).
> 
> If we use raw_smp_processor_id(), then no matter which CPU the thread is migrated to,
> if another thread is running on the old CPU, there will be a race condition on the
> per-CPU variable "ndl_count->count".

so I've been looking at it again. The recursive locking could have been
solved better, like the local_lock() on -RT does it.
Given that you lock with preempt_disable(), there should be no in-IRQ
usage.
But in the "nd_region->num_lanes >= nr_cpu_ids" case you don't take any
locks at all. That would be a problem with the raw_smp_processor_id()
approach.

So what about the completely untested patch here:

diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 379bf4305e615..98c2e9df4b2e4 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -109,7 +109,8 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 			res; res = next, next = next ? next->sibling : NULL)
 
 struct nd_percpu_lane {
-	int count;
+	struct task_struct *owner;
+	int nestcnt;
 	spinlock_t lock;
 };
 
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e2818f94f2928..8a62f9833513f 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -946,19 +946,17 @@ int nd_blk_region_init(struct nd_region *nd_region)
  */
 unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
 {
+	struct nd_percpu_lane *ndl_lock;
 	unsigned int cpu, lane;
 
-	cpu = get_cpu();
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
-
-		lane = cpu % nd_region->num_lanes;
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (ndl_count->count++ == 0)
-			spin_lock(&ndl_lock->lock);
-	} else
-		lane = cpu;
+	cpu = raw_smp_processor_id();
+	lane = cpu % nd_region->num_lanes;
+	ndl_lock  = per_cpu_ptr(nd_region->lane, lane);
+	if (ndl_lock->owner != current) {
+		spin_lock(&ndl_lock->lock);
+		ndl_lock->owner = current;
+	}
+	ndl_lock->nestcnt++;
 
 	return lane;
 }
@@ -966,17 +964,16 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
 
 void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
 {
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		unsigned int cpu = get_cpu();
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
+	struct nd_percpu_lane *ndl_lock;
 
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (--ndl_count->count == 0)
-			spin_unlock(&ndl_lock->lock);
-		put_cpu();
-	}
-	put_cpu();
+	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+	WARN_ON(ndl_lock->nestcnt == 0);
+	WARN_ON(ndl_lock->owner != current);
+	if (--ndl_lock->nestcnt)
+		return;
+
+	ndl_lock->owner = NULL;
+	spin_unlock(&ndl_lock->lock);
 }
 EXPORT_SYMBOL(nd_region_release_lane);
 
@@ -1042,7 +1039,8 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
 
 		ndl = per_cpu_ptr(nd_region->lane, i);
 		spin_lock_init(&ndl->lock);
-		ndl->count = 0;
+		ndl->owner = NULL;
+		ndl->nestcnt = 0;
 	}
 
 	for (i = 0; i < ndr_desc->num_mappings; i++) {

> Thanks,
> Yongxin

Sebastian

* RE: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Liu, Yongxin @ 2019-03-18  1:41 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de


> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Sebastian Andrzej Siewior
> Sent: Saturday, March 16, 2019 00:43
> To: Liu, Yongxin
> Cc: linux-kernel@vger.kernel.org; linux-rt-users@vger.kernel.org;
> tglx@linutronix.de; rostedt@goodmis.org; dan.j.williams@intel.com;
> pagupta@redhat.com; Gortmaker, Paul; linux-nvdimm@lists.01.org
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
> 
> On 2019-03-11 00:44:58 [+0000], Liu, Yongxin wrote:
> > > but you still have the ndl_lock->lock which protects the resource. So in
> > > the unlikely (but possible) event that you switch CPUs after obtaining
> > > the CPU number, you block on the lock. No harm is done, right?
> >
> > The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a
> > conditional lock.
> >
> > ndl_count->count is per CPU.
> > ndl_lock->lock is per lane.
> >
> > Here is an example:
> > Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> get
> > "ndl_lock->lock" --> nd_region_acquire_lane --> lane# 5 --> bypass
> > "ndl_lock->lock" due to "ndl_count->count++".
> >
> > Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypass
> > "ndl_lock->lock" ("ndl_count->count" was already changed by Thread A).
> >
> > If we use raw_smp_processor_id(), then no matter which CPU the thread is
> > migrated to, if another thread is running on the old CPU, there will be
> > a race condition on the per-CPU variable "ndl_count->count".
> 
> so I've been looking at it again. The recursive locking could have been
> solved better, like the local_lock() on -RT does it.
> Given that you lock with preempt_disable(), there should be no in-IRQ
> usage.
> But in the "nd_region->num_lanes >= nr_cpu_ids" case you don't take any
> locks at all. That would be a problem with the raw_smp_processor_id()
> approach.
> 
> So what about the completely untested patch here:
> 
> diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
> index 379bf4305e615..98c2e9df4b2e4 100644
> --- a/drivers/nvdimm/nd.h
> +++ b/drivers/nvdimm/nd.h
> @@ -109,7 +109,8 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
>  			res; res = next, next = next ? next->sibling : NULL)
> 
>  struct nd_percpu_lane {
> -	int count;
> +	struct task_struct *owner;
> +	int nestcnt;
>  	spinlock_t lock;
>  };
> 
> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> index e2818f94f2928..8a62f9833513f 100644
> --- a/drivers/nvdimm/region_devs.c
> +++ b/drivers/nvdimm/region_devs.c
> @@ -946,19 +946,17 @@ int nd_blk_region_init(struct nd_region *nd_region)
>   */
>  unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
>  {
> +	struct nd_percpu_lane *ndl_lock;
>  	unsigned int cpu, lane;
> 
> -	cpu = get_cpu();
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> -
> -		lane = cpu % nd_region->num_lanes;
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (ndl_count->count++ == 0)
> -			spin_lock(&ndl_lock->lock);
> -	} else
> -		lane = cpu;
> +	cpu = raw_smp_processor_id();
> +	lane = cpu % nd_region->num_lanes;
> +	ndl_lock  = per_cpu_ptr(nd_region->lane, lane);
> +	if (ndl_lock->owner != current) {
> +		spin_lock(&ndl_lock->lock);
> +		ndl_lock->owner = current;
> +	}
> +	ndl_lock->nestcnt++;
> 
>  	return lane;
>  }
> @@ -966,17 +964,16 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
> 
>  void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
>  {
> -	if (nd_region->num_lanes < nr_cpu_ids) {
> -		unsigned int cpu = get_cpu();
> -		struct nd_percpu_lane *ndl_lock, *ndl_count;
> +	struct nd_percpu_lane *ndl_lock;
> 
> -		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
> -		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> -		if (--ndl_count->count == 0)
> -			spin_unlock(&ndl_lock->lock);
> -		put_cpu();
> -	}
> -	put_cpu();
> +	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
> +	WARN_ON(ndl_lock->nestcnt == 0);
> +	WARN_ON(ndl_lock->owner != current);
> +	if (--ndl_lock->nestcnt)
> +		return;
> +
> +	ndl_lock->owner = NULL;
> +	spin_unlock(&ndl_lock->lock);
>  }
>  EXPORT_SYMBOL(nd_region_release_lane);
> 
> @@ -1042,7 +1039,8 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
> 
>  		ndl = per_cpu_ptr(nd_region->lane, i);
>  		spin_lock_init(&ndl->lock);
> -		ndl->count = 0;
> +		ndl->owner = NULL;
> +		ndl->nestcnt = 0;
>  	}
> 
>  	for (i = 0; i < ndr_desc->num_mappings; i++) {
> 
> > Thanks,
> > Yongxin
> 

Consider the recursive call to nd_region_acquire_lane() in the following situation.
Will there be a deadlock?


    Thread A                    Thread B
       |                           |
       |                           |
     CPU 1                       CPU 2
       |                           |
       |                           |
 get lock for Lane 1         get lock for Lane 2
       |                           |
       |                           |
 migrate to CPU 2            migrate to CPU 1
       |                           |
       |                           |
 wait lock for Lane 2        wait lock for Lane 1 
       |                           |
       |                           |
       _____________________________
                   |
                deadlock?
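
A minimal user-space reproduction of this ABBA pattern (illustration
only; the two mutexes stand in for two lane locks, and the sleep()
widens the window that the migration opens). Running it hangs both
threads:

#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lane1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lane2 = PTHREAD_MUTEX_INITIALIZER;

static void *thread_a(void *unused)
{
	pthread_mutex_lock(&lane1);	/* A acquires lane 1 on CPU 1 */
	sleep(1);			/* A "migrates" to CPU 2 */
	pthread_mutex_lock(&lane2);	/* now waits on B: deadlock */
	return NULL;
}

static void *thread_b(void *unused)
{
	pthread_mutex_lock(&lane2);	/* B acquires lane 2 on CPU 2 */
	sleep(1);			/* B "migrates" to CPU 1 */
	pthread_mutex_lock(&lane1);	/* now waits on A: deadlock */
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, thread_a, NULL);
	pthread_create(&b, NULL, thread_b, NULL);
	pthread_join(a, NULL);		/* never returns */
	pthread_join(b, NULL);
	return 0;
}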


Thanks,
Yongxin


> Sebastian

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Sebastian Andrzej Siewior @ 2019-03-18 11:40 UTC
  To: Liu, Yongxin
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de

On 2019-03-18 01:41:10 [+0000], Liu, Yongxin wrote:
> 
> Consider the recursive call to nd_region_acquire_lane() in the following situation.
> Will there be a deadlock?
> 
> 
>     Thread A                    Thread B
>        |                           |
>        |                           |
>      CPU 1                       CPU 2
>        |                           |
>        |                           |
>  get lock for Lane 1         get lock for Lane 2
>        |                           |
>        |                           |
>  migrate to CPU 2            migrate to CPU 1
>        |                           |
>        |                           |
>  wait lock for Lane 2        wait lock for Lane 1 
>        |                           |
>        |                           |
>        _____________________________
>                    |
>                 deadlock?

Bummer. That would deadlock indeed.
Is it easily possible to recognize the recursive case? 

> 
> Thanks,
> Yognxin

Sebastian

* RE: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Liu, Yongxin @ 2019-03-18 11:48 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de


> -----Original Message-----
> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> owner@vger.kernel.org] On Behalf Of Sebastian Andrzej Siewior
> Sent: Monday, March 18, 2019 19:40
> To: Liu, Yongxin
> Cc: linux-kernel@vger.kernel.org; linux-rt-users@vger.kernel.org;
> tglx@linutronix.de; rostedt@goodmis.org; dan.j.williams@intel.com;
> pagupta@redhat.com; Gortmaker, Paul; linux-nvdimm@lists.01.org
> Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
> 
> On 2019-03-18 01:41:10 [+0000], Liu, Yongxin wrote:
> >
> > Consider the recursive call to nd_region_acquire_lane() in the
> > following situation. Will there be a deadlock?
> >
> >
> >     Thread A                    Thread B
> >        |                           |
> >        |                           |
> >      CPU 1                       CPU 2
> >        |                           |
> >        |                           |
> >  get lock for Lane 1         get lock for Lane 2
> >        |                           |
> >        |                           |
> >  migrate to CPU 2            migrate to CPU 1
> >        |                           |
> >        |                           |
> >  wait lock for Lane 2        wait lock for Lane 1
> >        |                           |
> >        |                           |
> >        _____________________________
> >                    |
> >                 deadlock?
> 
> Bummer. That would deadlock indeed.
> Is it easily possible to recognize the recursive case?

Not easily. I don't have a test case for the recursive call.
For now, just code analysis.


Yongxin

> >
> > Thanks,
> > Yognxin
> 
> Sebastian

* Re: [PATCH RT] nvdimm: make lane acquirement RT aware
From: Sebastian Andrzej Siewior @ 2019-03-28 17:38 UTC
  To: Liu, Yongxin
  Cc: linux-rt-users, linux-nvdimm, linux-kernel, rostedt,
	Paul Gortmaker <Paul.Gortmaker@windriver.com>, tglx@linutronix.de

On 2019-03-18 11:48:28 [+0000], Liu, Yongxin wrote:
> 
> > 
> > Bummer. That would dead lock indeed.
> > Is it easily possible to recognize the recursive case?
> 
> Not easily. I don't have a test case for the recursive call.
> For now, just code analysis.

So I've been playing with qemu's nvdimm device. I *think* the
recursive case is not possible here because qemu only supports pmem,
while triggering it would require blk mode. It is just a wild
guess…

On top of qemu's nvdimm device I can create a block device via
	ndctl create-namespace  namespace0.0 --mode=sector

and then I trigger the code path in question.

I would *really* prefer to understand the recursive case and avoid it.
That way the recursive case is explicitly known and uses another path.
The lock can then always be acquired, which gives you lockdep coverage
all the time (which is now missing when you have at least as many
lanes as CPUs).

The local_lock thingy is completely unneeded: a simple get_cpu_light()
would do the job.
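
A hedged sketch of what that could look like (untested; it assumes the
owner/nestcnt rework from the earlier patch in this thread).
get_cpu_light() maps to migrate_disable() on -RT, so the task may be
preempted but not moved, which keeps the cpu -> lane mapping stable
and rules out the cross-CPU migration behind the deadlock above:

unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
{
	struct nd_percpu_lane *ndl_lock;
	unsigned int cpu, lane;

	cpu = get_cpu_light();			/* no migration from here on */
	lane = cpu % nd_region->num_lanes;
	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
	if (ndl_lock->owner != current) {	/* outermost acquisition */
		spin_lock(&ndl_lock->lock);
		ndl_lock->owner = current;
	}
	ndl_lock->nestcnt++;

	return lane;
}

The matching nd_region_release_lane() would then drop the lock on the
last nested release and end with put_cpu_light().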

> Yongxin

Sebastian
