* [PATCH] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
@ 2016-09-09  0:41 ` Joonwoo Park
  0 siblings, 0 replies; 9+ messages in thread
From: Joonwoo Park @ 2016-09-09  0:41 UTC (permalink / raw)
  To: Li Zefan; +Cc: Joonwoo Park, cgroups, linux-kernel

Discrepancy between cpu_online_mask and cpuset's effective CPU masks on
cpuset hierarchy is inevitable since cpuset defers updating of
effective CPU masks with workqueue while nothing prevents system from
doing CPU hotplug.  For that reason guarantee_online_cpus() walks up
the cpuset hierarchy until it finds intersection under the assumption
that top cpuset's effective CPU mask intersects with cpu_online_mask
even under such race.

However a sequence of CPU hotplugs can open a time window which is none
of effective CPUs in the top cpuset intersects with cpu_online_mask.

For example when there are 4 possible CPUs 0-3 where only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to online
CPUs eventually.  Hence fall back to online CPU mask when there is no
intersection between top cpuset's effective_cpus and online CPU mask.

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 kernel/cpuset.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index c7fd277..b5d2b73 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -325,8 +325,7 @@ static struct file_system_type cpuset_fs_type = {
 /*
  * Return in pmask the portion of a cpusets's cpus_allowed that
  * are online.  If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online cpus.  The top
- * cpuset always has some cpus online.
+ * until we find one that does have some online cpus.
  *
  * One way or another, we guarantee to return some non-empty subset
  * of cpu_online_mask.
@@ -335,8 +334,20 @@ static struct file_system_type cpuset_fs_type = {
  */
 static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
 {
-	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask))
+	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask)) {
 		cs = parent_cs(cs);
+		if (unlikely(!cs)) {
+			/*
+			 * The top cpuset doesn't have any online cpu in
+			 * consequence of race between cpuset_hotplug_work
+			 * and cpu hotplug notifier.  But we know the top
+			 * cpuset's effective_cpus is on its way to be same
+			 * with online cpus mask.
+			 */
+			cpumask_copy(pmask, cpu_online_mask);
+			return;
+		}
+	}
 	cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

* Re: [PATCH] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
@ 2016-09-12  2:48   ` Zefan Li
  0 siblings, 0 replies; 9+ messages in thread
From: Zefan Li @ 2016-09-12  2:48 UTC (permalink / raw)
  To: Joonwoo Park; +Cc: cgroups, linux-kernel, Tejun Heo

Cc: Tejun

On 2016/9/9 8:41, Joonwoo Park wrote:
> Discrepancy between cpu_online_mask and cpuset's effective CPU masks on
> cpuset hierarchy is inevitable since cpuset defers updating of
> effective CPU masks with workqueue while nothing prevents system from
> doing CPU hotplug.  For that reason guarantee_online_cpus() walks up
> the cpuset hierarchy until it finds intersection under the assumption
> that top cpuset's effective CPU mask intersects with cpu_online_mask
> even under such race.
> 
> However a sequence of CPU hotplugs can open a time window which is none
> of effective CPUs in the top cpuset intersects with cpu_online_mask.
> 
> For example when there are 4 possible CPUs 0-3 where only CPU0 is online:
> 
>   ========================  ===========================
>    cpu_online_mask           top_cpuset.effective_cpus
>   ========================  ===========================
>    echo 1 > cpu2/online.
>    CPU hotplug notifier woke up hotplug work but not yet scheduled.
>       [0,2]                     [0]
> 
>    echo 0 > cpu0/online.
>    The workqueue is still runnable.
>       [2]                       [0]
>   ========================  ===========================
> 
>   Now there is no intersection between cpu_online_mask and
>   top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
>   this moment can cause following:
> 
>    Unable to handle kernel NULL pointer dereference at virtual address 000000d0
>    ------------[ cut here ]------------
>    Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
>    Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
>    Modules linked in:
>    CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
>    task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
>    PC is at guarantee_online_cpus+0x2c/0x58
>    LR is at cpuset_cpus_allowed+0x4c/0x6c
>    <snip>
>    Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
>    Call trace:
>    [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
>    [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
>    [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
>    [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
>    [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28
> 
> The top cpuset's effective_cpus are guaranteed to be identical to online
> CPUs eventually.  Hence fall back to online CPU mask when there is no
> intersection between top cpuset's effective_cpus and online CPU mask.
> 
> Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: cgroups@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org

Thanks for fixing this!

Acked-by: Zefan Li <lizefan@huawei.com>
Cc: <stable@vger.kernel.org> # 3.17+

> ---
>  kernel/cpuset.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index c7fd277..b5d2b73 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -325,8 +325,7 @@ static struct file_system_type cpuset_fs_type = {
>  /*
>   * Return in pmask the portion of a cpusets's cpus_allowed that
>   * are online.  If none are online, walk up the cpuset hierarchy
> - * until we find one that does have some online cpus.  The top
> - * cpuset always has some cpus online.
> + * until we find one that does have some online cpus.
>   *
>   * One way or another, we guarantee to return some non-empty subset
>   * of cpu_online_mask.
> @@ -335,8 +334,20 @@ static struct file_system_type cpuset_fs_type = {
>   */
>  static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
>  {
> -	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask))
> +	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask)) {
>  		cs = parent_cs(cs);
> +		if (unlikely(!cs)) {
> +			/*
> +			 * The top cpuset doesn't have any online cpu in
> +			 * consequence of race between cpuset_hotplug_work
> +			 * and cpu hotplug notifier.  But we know the top
> +			 * cpuset's effective_cpus is on its way to be same
> +			 * with online cpus mask.
> +			 */
> +			cpumask_copy(pmask, cpu_online_mask);
> +			return;
> +		}
> +	}
>  	cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
>  }
>  
> 

* Re: [PATCH] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
@ 2016-09-12  4:05     ` Joonwoo Park
  0 siblings, 0 replies; 9+ messages in thread
From: Joonwoo Park @ 2016-09-12  4:05 UTC (permalink / raw)
  To: Zefan Li; +Cc: cgroups, linux-kernel, Tejun Heo

On Mon, Sep 12, 2016 at 10:48:31AM +0800, Zefan Li wrote:
> Cc: Tejun
> 
> On 2016/9/9 8:41, Joonwoo Park wrote:
> > Discrepancy between cpu_online_mask and cpuset's effective CPU masks on
> > cpuset hierarchy is inevitable since cpuset defers updating of
> > effective CPU masks with workqueue while nothing prevents system from
> > doing CPU hotplug.  For that reason guarantee_online_cpus() walks up
> > the cpuset hierarchy until it finds intersection under the assumption
> > that top cpuset's effective CPU mask intersects with cpu_online_mask
> > even under such race.
> > 
> > However a sequence of CPU hotplugs can open a time window which is none
> > of effective CPUs in the top cpuset intersects with cpu_online_mask.
> > 
> > For example when there are 4 possible CPUs 0-3 where only CPU0 is online:
> > 
> >   ========================  ===========================
> >    cpu_online_mask           top_cpuset.effective_cpus
> >   ========================  ===========================
> >    echo 1 > cpu2/online.
> >    CPU hotplug notifier woke up hotplug work but not yet scheduled.
> >       [0,2]                     [0]
> > 
> >    echo 0 > cpu0/online.
> >    The workqueue is still runnable.
> >       [2]                       [0]
> >   ========================  ===========================
> > 
> >   Now there is no intersection between cpu_online_mask and
> >   top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
> >   this moment can cause following:
> > 
> >    Unable to handle kernel NULL pointer dereference at virtual address 000000d0
> >    ------------[ cut here ]------------
> >    Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
> >    Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
> >    Modules linked in:
> >    CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
> >    task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
> >    PC is at guarantee_online_cpus+0x2c/0x58
> >    LR is at cpuset_cpus_allowed+0x4c/0x6c
> >    <snip>
> >    Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
> >    Call trace:
> >    [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
> >    [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
> >    [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
> >    [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
> >    [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28
> > 
> > The top cpuset's effective_cpus are guaranteed to be identical to online
> > CPUs eventually.  Hence fall back to online CPU mask when there is no
> > intersection between top cpuset's effective_cpus and online CPU mask.
> > 
> > Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
> > Cc: Li Zefan <lizefan@huawei.com>
> > Cc: cgroups@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> 
> Thanks for fixing this!
> 
> Acked-by: Zefan Li <lizefan@huawei.com>
> Cc: <stable@vger.kernel.org> # 3.17+
> 

Thanks for reviewing.

I will shortly send a v2 with a few grammar fixes in the changelog.
No code changes have been made.

Joonwoo

* [PATCH v2] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
  2016-09-12  4:05     ` Joonwoo Park
@ 2016-09-12  4:14       ` Joonwoo Park
  -1 siblings, 0 replies; 9+ messages in thread
From: Joonwoo Park @ 2016-09-12  4:14 UTC (permalink / raw)
  To: Li Zefan; +Cc: Joonwoo Park, Tejun Heo, cgroups, linux-kernel, stable

A discrepancy between cpu_online_mask and a cpuset's effective_cpus
mask is inevitable during hotplug, since cpuset defers updating the
effective_cpus mask to a workqueue, and nothing prevents the system
from performing further hotplug operations in the meantime.  For that
reason, guarantee_online_cpus() walks up the cpuset hierarchy until it
finds an intersection, under the assumption that the top cpuset's
effective_cpus mask intersects with cpu_online_mask even while such a
race is occurring.

However, a sequence of CPU hotplugs can open a time window during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example, when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   The CPU hotplug notifier woke up the hotplug work, but it is not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause the following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.
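
For illustration only (not part of the patch): a minimal userspace sketch
of the fallback described above, assuming CPU masks can be modelled as
plain 32-bit bitmaps.  The names model_cpuset, model_guarantee_online_cpus
and model_replay_race are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct cpuset: one bit per CPU. */
struct model_cpuset {
	struct model_cpuset *parent;	/* NULL for the top cpuset */
	uint32_t effective_cpus;	/* may be stale during hotplug */
};

/*
 * Models the patched walk: go up the hierarchy until a cpuset intersects
 * the online mask; if even the top cpuset does not, fall back to the
 * online mask itself.
 */
static uint32_t model_guarantee_online_cpus(const struct model_cpuset *cs,
					    uint32_t online)
{
	assert(online != 0);	/* assumption: some CPU is always online */

	while (!(cs->effective_cpus & online)) {
		cs = cs->parent;
		if (!cs)
			return online;	/* walked past the top cpuset */
	}
	return cs->effective_cpus & online;
}

/* Replays the timeline from the changelog and returns the chosen mask. */
static uint32_t model_replay_race(void)
{
	struct model_cpuset top = { .parent = NULL, .effective_cpus = 1u << 0 };
	struct model_cpuset child = { .parent = &top, .effective_cpus = 1u << 0 };
	uint32_t online = 1u << 0;

	online |= 1u << 2;	/* echo 1 > cpu2/online: [0,2] vs [0] */
	online &= ~(1u << 0);	/* echo 0 > cpu0/online: [2]   vs [0] */

	/* The hotplug work has not run, so effective_cpus is still [0]. */
	return model_guarantee_online_cpus(&child, online);
}
```

In this model, model_replay_race() ends up with only CPU2 in the returned
mask.  Without the NULL check, the same walk would dereference the parent
of the top cpuset, which is consistent with the oops shown above.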

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.17+
---
 v2: fixed changelog and comment.

 kernel/cpuset.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 73e93e5..27c6d78 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -325,8 +325,7 @@ static struct file_system_type cpuset_fs_type = {
 /*
  * Return in pmask the portion of a cpusets's cpus_allowed that
  * are online.  If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online cpus.  The top
- * cpuset always has some cpus online.
+ * until we find one that does have some online cpus.
  *
  * One way or another, we guarantee to return some non-empty subset
  * of cpu_online_mask.
@@ -335,8 +334,20 @@ static struct file_system_type cpuset_fs_type = {
  */
 static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
 {
-	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask))
+	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask)) {
 		cs = parent_cs(cs);
+		if (unlikely(!cs)) {
+			/*
+			 * The top cpuset doesn't have any online cpu as a
+			 * consequence of a race between cpuset_hotplug_work
+			 * and cpu hotplug notifier.  But we know the top
+			 * cpuset's effective_cpus is on its way to be
+			 * identical to cpu_online_mask.
+			 */
+			cpumask_copy(pmask, cpu_online_mask);
+			return;
+		}
+	}
 	cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
@ 2016-09-12  4:14       ` Joonwoo Park
  0 siblings, 0 replies; 9+ messages in thread
From: Joonwoo Park @ 2016-09-12  4:14 UTC (permalink / raw)
  To: Li Zefan; +Cc: Joonwoo Park, Tejun Heo, cgroups, linux-kernel, stable

A discrepancy between cpu_online_mask and cpuset's effective_cpus
mask is inevitable during hotplug since cpuset defers updating of
effective_cpus mask using a workqueue, during which time nothing
prevents the system from more hotplug operations.  For that reason
guarantee_online_cpus() walks up the cpuset hierarchy until it finds
an intersection under the assumption that top cpuset's effective_cpus
mask intersects with cpu_online_mask even with such a race occurring.

However a sequence of CPU hotplugs can open a time window, during which
none of the effective CPUs in the top cpuset intersect with
cpu_online_mask.

For example when there are 4 possible CPUs 0-3 and only CPU0 is online:

  ========================  ===========================
   cpu_online_mask           top_cpuset.effective_cpus
  ========================  ===========================
   echo 1 > cpu2/online.
   CPU hotplug notifier woke up hotplug work but not yet scheduled.
      [0,2]                     [0]

   echo 0 > cpu0/online.
   The workqueue is still runnable.
      [2]                       [0]
  ========================  ===========================

  Now there is no intersection between cpu_online_mask and
  top_cpuset.effective_cpus.  Thus invoking sys_sched_setaffinity() at
  this moment can cause following:

   Unable to handle kernel NULL pointer dereference at virtual address 000000d0
   ------------[ cut here ]------------
   Kernel BUG at ffffffc0001389b0 [verbose debug info unavailable]
   Internal error: Oops - BUG: 96000005 [#1] PREEMPT SMP
   Modules linked in:
   CPU: 2 PID: 1420 Comm: taskset Tainted: G        W       4.4.8+ #98
   task: ffffffc06a5c4880 ti: ffffffc06e124000 task.ti: ffffffc06e124000
   PC is at guarantee_online_cpus+0x2c/0x58
   LR is at cpuset_cpus_allowed+0x4c/0x6c
   <snip>
   Process taskset (pid: 1420, stack limit = 0xffffffc06e124020)
   Call trace:
   [<ffffffc0001389b0>] guarantee_online_cpus+0x2c/0x58
   [<ffffffc00013b208>] cpuset_cpus_allowed+0x4c/0x6c
   [<ffffffc0000d61f0>] sched_setaffinity+0xc0/0x1ac
   [<ffffffc0000d6374>] SyS_sched_setaffinity+0x98/0xac
   [<ffffffc000085cb0>] el0_svc_naked+0x24/0x28
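
The window above can be reproduced in a small user-space model.  This
is only a sketch, not kernel code: struct race_state, cpu_up(),
cpu_down(), hotplug_work() and masks_intersect() are hypothetical
names invented for illustration, and a plain unsigned int stands in
for struct cpumask.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per CPU; bit n set means CPU n is in the mask. */
typedef unsigned int mask_t;

struct race_state {
	mask_t online;    /* stands in for cpu_online_mask */
	mask_t effective; /* stands in for top_cpuset.effective_cpus */
};

/* Hotplug changes the online mask immediately... */
static void cpu_up(struct race_state *s, unsigned int cpu)
{
	assert(cpu < 32);
	s->online |= 1u << cpu;
}

static void cpu_down(struct race_state *s, unsigned int cpu)
{
	assert(cpu < 32);
	s->online &= ~(1u << cpu);
}

/* ...while the deferred work syncs the top cpuset only when it runs. */
static void hotplug_work(struct race_state *s)
{
	s->effective = s->online;
}

/* True when the two masks share at least one CPU. */
static bool masks_intersect(const struct race_state *s)
{
	return (s->online & s->effective) != 0;
}
```

Starting from { online = 0x1, effective = 0x1 }, calling cpu_up(2) and
then cpu_down(0) without an intervening hotplug_work() leaves the two
masks disjoint, which is the state in which the walk in
guarantee_online_cpus() runs past the top cpuset.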

The top cpuset's effective_cpus are guaranteed to be identical to
cpu_online_mask eventually.  Hence fall back to cpu_online_mask when
there is no intersection between top cpuset's effective_cpus and
cpu_online_mask.
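
As a rough user-space illustration of that fallback (a simplified
model, not the kernel implementation): struct cpuset_model,
cpumask_model_t and guarantee_online_cpus_model() are hypothetical
stand-ins, with an unsigned int in place of struct cpumask and an
explicit parent pointer in place of parent_cs().

```c
#include <assert.h>
#include <stddef.h>

/* One bit per CPU; a stand-in for struct cpumask. */
typedef unsigned int cpumask_model_t;

struct cpuset_model {
	cpumask_model_t effective_cpus;
	struct cpuset_model *parent; /* NULL for the top cpuset */
};

/*
 * Walk up the hierarchy until some effective CPU is online.  If even
 * the top cpuset has no online CPU (the hotplug race), fall back to
 * the online mask itself so the result is never empty.
 */
static void guarantee_online_cpus_model(struct cpuset_model *cs,
					cpumask_model_t online,
					cpumask_model_t *pmask)
{
	assert(cs != NULL && pmask != NULL);

	while (!(cs->effective_cpus & online)) {
		cs = cs->parent;
		if (cs == NULL) {
			/* Ran off the top: use the online mask as is. */
			*pmask = online;
			return;
		}
	}
	*pmask = cs->effective_cpus & online;
}
```

With a top cpuset whose effective_cpus is still [0] and an online mask
of [2], the model returns [2] rather than following a NULL parent.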

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> # 3.17+
---
 v2: fixed changelog and comment.

 kernel/cpuset.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 73e93e5..27c6d78 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -325,8 +325,7 @@ static struct file_system_type cpuset_fs_type = {
 /*
  * Return in pmask the portion of a cpusets's cpus_allowed that
  * are online.  If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online cpus.  The top
- * cpuset always has some cpus online.
+ * until we find one that does have some online cpus.
  *
  * One way or another, we guarantee to return some non-empty subset
  * of cpu_online_mask.
@@ -335,8 +334,20 @@ static struct file_system_type cpuset_fs_type = {
  */
 static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
 {
-	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask))
+	while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask)) {
 		cs = parent_cs(cs);
+		if (unlikely(!cs)) {
+			/*
+			 * The top cpuset doesn't have any online cpu as a
+			 * consequence of a race between cpuset_hotplug_work
+			 * and cpu hotplug notifier.  But we know the top
+			 * cpuset's effective_cpus is on its way to be
+			 * identical to cpu_online_mask.
+			 */
+			cpumask_copy(pmask, cpu_online_mask);
+			return;
+		}
+	}
 	cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


* Re: [PATCH v2] cpuset: handle race between CPU hotplug and cpuset_hotplug_work
  2016-09-12  4:14       ` Joonwoo Park
  (?)
@ 2016-09-13 15:27       ` Tejun Heo
  -1 siblings, 0 replies; 9+ messages in thread
From: Tejun Heo @ 2016-09-13 15:27 UTC (permalink / raw)
  To: Joonwoo Park; +Cc: Li Zefan, cgroups, linux-kernel, stable

On Sun, Sep 11, 2016 at 09:14:58PM -0700, Joonwoo Park wrote:
> A discrepancy between cpu_online_mask and cpuset's effective_cpus
> mask is inevitable during hotplug since cpuset defers updating of
> effective_cpus mask using a workqueue, during which time nothing
> prevents the system from more hotplug operations.  For that reason
> guarantee_online_cpus() walks up the cpuset hierarchy until it finds
> an intersection under the assumption that top cpuset's effective_cpus
> mask intersects with cpu_online_mask even with such a race occurring.
> 
> However a sequence of CPU hotplugs can open a time window, during which
> none of the effective CPUs in the top cpuset intersect with
> cpu_online_mask.

Applied to cgroup/for-4.8-fixes.

Thanks.

-- 
tejun


Thread overview: 9+ messages
2016-09-09  0:41 [PATCH] cpuset: handle race between CPU hotplug and cpuset_hotplug_work Joonwoo Park
2016-09-09  0:41 ` Joonwoo Park
2016-09-12  2:48 ` Zefan Li
2016-09-12  2:48   ` Zefan Li
2016-09-12  4:05   ` Joonwoo Park
2016-09-12  4:05     ` Joonwoo Park
2016-09-12  4:14     ` [PATCH v2] " Joonwoo Park
2016-09-12  4:14       ` Joonwoo Park
2016-09-13 15:27       ` Tejun Heo
