All of lore.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>,
	Jing-Ting Wu <jing-ting.wu@mediatek.com>,
	Mukesh Ojha <quic_mojha@quicinc.com>,
	Valentin Schneider <vschneid@redhat.com>,
	wsd_upstream@mediatek.com, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, Jonathan.JMChen@mediatek.com,
	"chris.redpath@arm.com" <chris.redpath@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Vincent Donnefort <vdonnefort@gmail.com>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Christian Brauner <brauner@kernel.org>,
	cgroups@vger.kernel.org, lixiong.liu@mediatek.com,
	wenju.xu@mediatek.com
Subject: Re: BUG: HANG_DETECT waiting for migration_cpu_stop() complete
Date: Tue, 6 Sep 2022 17:02:20 -0400	[thread overview]
Message-ID: <879886ba-7d77-11fc-5357-b21e141637ef@redhat.com> (raw)
In-Reply-To: <YxeylMvgc/wKcJqk@hirez.programming.kicks-ass.net>


On 9/6/22 16:50, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 04:40:03PM -0400, Waiman Long wrote:
>
> I've not followed the earlier stuff due to being unreadable; just
> reacting to this..
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 838623b68031..5d9ea1553ec0 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2794,9 +2794,9 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>                  if (cpumask_equal(&p->cpus_mask, new_mask))
>>                          goto out;
>>
>> -               if (WARN_ON_ONCE(p == current &&
>> -                                is_migration_disabled(p) &&
>> -                                !cpumask_test_cpu(task_cpu(p), new_mask)))
>> {
>> +               if (is_migration_disabled(p) &&
>> +                   !cpumask_test_cpu(task_cpu(p), new_mask)) {
>> +                       WARN_ON_ONCE(p == current);
>>                          ret = -EBUSY;
>>                          goto out;
>>                  }
>> @@ -2818,7 +2818,11 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>          if (flags & SCA_USER)
>>                  user_mask = clear_user_cpus_ptr(p);
>>
>> -       ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       if (!is_migration_disabled(p) || (flags & SCA_MIGRATE_ENABLE)) {
>> +               ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       } else {
>> +               task_rq_unlock(rq, p, rf);
>> +       }
> This cannot be right. There might be previous set_cpus_allowed_ptr()
> callers that are blocked and waiting for the task to land on a valid
> CPU.

You are probably right. I haven't fully understand all the migration 
disable code yet. However, if migration is disabled, there are some 
corner cases we need to handle properly.

Cheers,
Longman


WARNING: multiple messages have this Message-ID (diff)
From: Waiman Long <longman@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>,
	Jing-Ting Wu <jing-ting.wu@mediatek.com>,
	Mukesh Ojha <quic_mojha@quicinc.com>,
	Valentin Schneider <vschneid@redhat.com>,
	wsd_upstream@mediatek.com, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, Jonathan.JMChen@mediatek.com,
	"chris.redpath@arm.com" <chris.redpath@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Vincent Donnefort <vdonnefort@gmail.com>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Christian Brauner <brauner@kernel.org>,
	cgroups@vger.kernel.org, lixiong.liu@mediatek.com,
	wenju.xu@mediatek.com
Subject: Re: BUG: HANG_DETECT waiting for migration_cpu_stop() complete
Date: Tue, 6 Sep 2022 17:02:20 -0400	[thread overview]
Message-ID: <879886ba-7d77-11fc-5357-b21e141637ef@redhat.com> (raw)
In-Reply-To: <YxeylMvgc/wKcJqk@hirez.programming.kicks-ass.net>


On 9/6/22 16:50, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 04:40:03PM -0400, Waiman Long wrote:
>
> I've not followed the earlier stuff due to being unreadable; just
> reacting to this..
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 838623b68031..5d9ea1553ec0 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2794,9 +2794,9 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>                  if (cpumask_equal(&p->cpus_mask, new_mask))
>>                          goto out;
>>
>> -               if (WARN_ON_ONCE(p == current &&
>> -                                is_migration_disabled(p) &&
>> -                                !cpumask_test_cpu(task_cpu(p), new_mask)))
>> {
>> +               if (is_migration_disabled(p) &&
>> +                   !cpumask_test_cpu(task_cpu(p), new_mask)) {
>> +                       WARN_ON_ONCE(p == current);
>>                          ret = -EBUSY;
>>                          goto out;
>>                  }
>> @@ -2818,7 +2818,11 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>          if (flags & SCA_USER)
>>                  user_mask = clear_user_cpus_ptr(p);
>>
>> -       ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       if (!is_migration_disabled(p) || (flags & SCA_MIGRATE_ENABLE)) {
>> +               ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       } else {
>> +               task_rq_unlock(rq, p, rf);
>> +       }
> This cannot be right. There might be previous set_cpus_allowed_ptr()
> callers that are blocked and waiting for the task to land on a valid
> CPU.

You are probably right. I haven't fully understand all the migration 
disable code yet. However, if migration is disabled, there are some 
corner cases we need to handle properly.

Cheers,
Longman


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Jing-Ting Wu
	<jing-ting.wu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org>,
	Mukesh Ojha <quic_mojha-jfJNa2p1gH1BDgjK7y7TUQ@public.gmane.org>,
	Valentin Schneider
	<vschneid-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	wsd_upstream-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	linux-mediatek-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Jonathan.JMChen-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org,
	"chris.redpath-5wv7dgnIgG8@public.gmane.org"
	<chris.redpath-5wv7dgnIgG8@public.gmane.org>,
	Dietmar Eggemann <dietmar.eggemann-5wv7dgnIgG8@public.gmane.org>,
	Vincent Donnefort
	<vdonnefort-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Juri Lelli <juri.lelli-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Vincent Guittot
	<vincent.guittot-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
	Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org>,
	Ben Segall <bsegall-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Mel Gorman <mgorman-l3A5Bk7waGM@public.gmane.org>,
	Christian Brauner
	<brauner-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	lixiong.liu-NuS5LvNUpcJWk0Htik3J/w@public.gmane.org, wenju.xu
Subject: Re: BUG: HANG_DETECT waiting for migration_cpu_stop() complete
Date: Tue, 6 Sep 2022 17:02:20 -0400	[thread overview]
Message-ID: <879886ba-7d77-11fc-5357-b21e141637ef@redhat.com> (raw)
In-Reply-To: <YxeylMvgc/wKcJqk-Nxj+rRp3nVydTX5a5knrm8zTDFooKrT+cvkQGrU6aU0@public.gmane.org>


On 9/6/22 16:50, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 04:40:03PM -0400, Waiman Long wrote:
>
> I've not followed the earlier stuff due to being unreadable; just
> reacting to this..
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 838623b68031..5d9ea1553ec0 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2794,9 +2794,9 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>                  if (cpumask_equal(&p->cpus_mask, new_mask))
>>                          goto out;
>>
>> -               if (WARN_ON_ONCE(p == current &&
>> -                                is_migration_disabled(p) &&
>> -                                !cpumask_test_cpu(task_cpu(p), new_mask)))
>> {
>> +               if (is_migration_disabled(p) &&
>> +                   !cpumask_test_cpu(task_cpu(p), new_mask)) {
>> +                       WARN_ON_ONCE(p == current);
>>                          ret = -EBUSY;
>>                          goto out;
>>                  }
>> @@ -2818,7 +2818,11 @@ static int __set_cpus_allowed_ptr_locked(struct
>> task_struct *p,
>>          if (flags & SCA_USER)
>>                  user_mask = clear_user_cpus_ptr(p);
>>
>> -       ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       if (!is_migration_disabled(p) || (flags & SCA_MIGRATE_ENABLE)) {
>> +               ret = affine_move_task(rq, p, rf, dest_cpu, flags);
>> +       } else {
>> +               task_rq_unlock(rq, p, rf);
>> +       }
> This cannot be right. There might be previous set_cpus_allowed_ptr()
> callers that are blocked and waiting for the task to land on a valid
> CPU.

You are probably right. I haven't fully understand all the migration 
disable code yet. However, if migration is disabled, there are some 
corner cases we need to handle properly.

Cheers,
Longman


  reply	other threads:[~2022-09-06 21:02 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-05  2:47 BUG: HANG_DETECT waiting for migration_cpu_stop() complete Jing-Ting Wu
2022-09-05  2:47 ` Jing-Ting Wu
2022-09-05  2:47 ` Jing-Ting Wu
2022-09-05  6:44 ` Mukesh Ojha
2022-09-05  6:44   ` Mukesh Ojha
2022-09-05  6:44   ` Mukesh Ojha
2022-09-05  8:22   ` Jing-Ting Wu
2022-09-05  8:22     ` Jing-Ting Wu
2022-09-05  8:22     ` Jing-Ting Wu
2022-09-06 18:30     ` Tejun Heo
2022-09-06 18:30       ` Tejun Heo
2022-09-06 18:30       ` Tejun Heo
2022-09-06 20:01       ` Waiman Long
2022-09-06 20:01         ` Waiman Long
2022-09-06 20:40         ` Waiman Long
2022-09-06 20:40           ` Waiman Long
2022-09-06 20:40           ` Waiman Long
2022-09-06 20:50           ` Peter Zijlstra
2022-09-06 20:50             ` Peter Zijlstra
2022-09-06 20:50             ` Peter Zijlstra
2022-09-06 21:02             ` Waiman Long [this message]
2022-09-06 21:02               ` Waiman Long
2022-09-06 21:02               ` Waiman Long
2022-09-23 14:20             ` Mukesh Ojha
2022-09-23 14:20               ` Mukesh Ojha
2022-09-23 14:20               ` Mukesh Ojha
2022-09-29 15:13               ` Mukesh Ojha
2022-09-29 15:13                 ` Mukesh Ojha
2022-09-29 15:13                 ` Mukesh Ojha
2022-09-07  0:07 ` Hillf Danton
2022-09-22  5:40   ` Jing-Ting Wu
2022-09-22  5:40     ` Jing-Ting Wu
2022-09-22  5:40     ` Jing-Ting Wu
2022-09-22 12:02     ` Hillf Danton
2023-03-22  9:37 Ryan Xiao (肖水林)
2023-03-22  9:37 ` Ryan Xiao (肖水林)
2023-03-27  4:05 ` Ryan Xiao (肖水林)
2023-03-27  4:05   ` Ryan Xiao (肖水林)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=879886ba-7d77-11fc-5357-b21e141637ef@redhat.com \
    --to=longman@redhat.com \
    --cc=Jonathan.JMChen@mediatek.com \
    --cc=brauner@kernel.org \
    --cc=bsegall@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=chris.redpath@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=jing-ting.wu@mediatek.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=lixiong.liu@mediatek.com \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=quic_mojha@quicinc.com \
    --cc=rostedt@goodmis.org \
    --cc=tj@kernel.org \
    --cc=vdonnefort@gmail.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=wenju.xu@mediatek.com \
    --cc=wsd_upstream@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.