From: "Kuyo Chang (張建文)" <Kuyo.Chang@mediatek.com>
To: "peterz@infradead.org" <peterz@infradead.org>
Cc: "dietmar.eggemann@arm.com" <dietmar.eggemann@arm.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mediatek@lists.infradead.org" 
	<linux-mediatek@lists.infradead.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	wsd_upstream <wsd_upstream@mediatek.com>,
	"vschneid@redhat.com" <vschneid@redhat.com>,
	"bristot@redhat.com" <bristot@redhat.com>,
	"juri.lelli@redhat.com" <juri.lelli@redhat.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"bsegall@google.com" <bsegall@google.com>,
	"mgorman@suse.de" <mgorman@suse.de>,
	"matthias.bgg@gmail.com" <matthias.bgg@gmail.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"angelogioacchino.delregno@collabora.com" 
	<angelogioacchino.delregno@collabora.com>
Subject: Re: [PATCH 1/1] sched/core: Fix stuck on completion for affine_move_task() when stopper disable
Date: Tue, 10 Oct 2023 14:40:22 +0000
Message-ID: <8ad1b617a1040ce4cc56a5d04e8219b5313a9a6e.camel@mediatek.com>
In-Reply-To: <20230929102135.GD6282@noisy.programming.kicks-ass.net>

On Fri, 2023-09-29 at 12:21 +0200, Peter Zijlstra wrote:
>  On Wed, Sep 27, 2023 at 03:57:35PM +0000, Kuyo Chang (張建文) wrote:
> 
> > This issue occurs during a CPU hotplug/set_affinity stress test.
> > The reproduction rate is very low (about once a week).
> 
> I'm assuming you're running an arm64 kernel with preempt_full=y (the
> default for arm64).
> 
> Could you please test the below?
> 

It has been running well so far (more than a week) on the
hotplug/set_affinity stress test. I will keep testing and report back
if the issue happens again.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d8fd29d66b24..079a63b8a954 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2645,9 +2645,11 @@ static int migration_cpu_stop(void *data)
>  		 * it.
>  		 */
>  		WARN_ON_ONCE(!pending->stop_pending);
> +		preempt_disable();
>  		task_rq_unlock(rq, p, &rf);
>  		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
>  				    &pending->arg, &pending->stop_work);
> +		preempt_enable();
>  		return 0;
>  	}
>  out:
> @@ -2967,12 +2969,13 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
>  			complete = true;
>  		}
>  
> +		preempt_disable();
>  		task_rq_unlock(rq, p, rf);
> -
>  		if (push_task) {
>  			stop_one_cpu_nowait(rq->cpu, push_cpu_stop,
>  					    p, &rq->push_work);
>  		}
> +		preempt_enable();
>  
>  		if (complete)
>  			complete_all(&pending->done);
> @@ -3038,12 +3041,13 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
>  		if (flags & SCA_MIGRATE_ENABLE)
>  			p->migration_flags &= ~MDF_PUSH;
>  
> +		preempt_disable();
>  		task_rq_unlock(rq, p, rf);
> -
>  		if (!stop_pending) {
>  			stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
>  					    &pending->arg, &pending->stop_work);
>  		}
> +		preempt_enable();
>  
>  		if (flags & SCA_MIGRATE_ENABLE)
>  			return 0;
> @@ -9459,6 +9461,7 @@ static void balance_push(struct rq *rq)
>  	 * Temporarily drop rq->lock such that we can wake-up the stop task.
>  	 * Both preemption and IRQs are still disabled.
>  	 */
> +	preempt_disable();
>  	raw_spin_rq_unlock(rq);
>  	stop_one_cpu_nowait(rq->cpu, __balance_push_cpu_stop, push_task,
>  			    this_cpu_ptr(&push_work));
> @@ -9468,6 +9471,7 @@ static void balance_push(struct rq *rq)
>  	 * which kthread_is_per_cpu() and will push this task away.
>  	 */
>  	raw_spin_rq_lock(rq);
> +	preempt_enable();
>  }
>  
>  static void balance_push_set(int cpu, bool on)


Thread overview:
2023-09-27  3:34 [PATCH 1/1] sched/core: Fix stuck on completion for affine_move_task() when stopper disable Kuyo Chang
2023-09-27  8:08 ` Peter Zijlstra
2023-09-27 15:57   ` Kuyo Chang (張建文)
2023-09-28 15:16     ` Peter Zijlstra
2023-09-28 15:19       ` Peter Zijlstra
2023-09-29 10:21     ` Peter Zijlstra
2023-10-01 15:15       ` Kuyo Chang (張建文)
2023-10-10 14:40       ` Kuyo Chang (張建文) [this message]
2023-10-10 14:57         ` Peter Zijlstra
2023-10-10 20:04           ` [PATCH] sched: Fix stop_one_cpu_nowait() vs hotplug Peter Zijlstra
2023-10-11  3:24             ` Kuyo Chang (張建文)
2023-10-11 13:26               ` Peter Zijlstra
2023-10-13  8:06             ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
