All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qais Yousef <qais.yousef@arm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, mingo@kernel.org,
	linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	swood@redhat.com, valentin.schneider@arm.com,
	juri.lelli@redhat.com, vincent.guittot@linaro.org,
	dietmar.eggemann@arm.com, rostedt@goodmis.org,
	bsegall@google.com, mgorman@suse.de, bristot@redhat.com,
	vincent.donnefort@arm.com, tj@kernel.org, ouwen210@hotmail.com
Subject: Re: [PATCH v4 15/19] sched: Fix migrate_disable() vs rt/dl balancing
Date: Sat, 26 Dec 2020 13:54:45 +0000	[thread overview]
Message-ID: <20201226135445.gkxfn5lmbxhblnj4@e107158-lin> (raw)
In-Reply-To: <20201023102347.499155098@infradead.org>

Hi Peter

Apologies for the late comments on the patch.

On 10/23/20 12:12, Peter Zijlstra wrote:

[...]

> + * When a preempted task becomes elegible to run under the ideal model (IOW it
> + * becomes one of the M highest priority tasks), it might still have to wait
> + * for the preemptee's migrate_disable() section to complete. Thereby suffering
> + * a reduction in bandwidth in the exact duration of the migrate_disable()
> + * section.
> + *
> + * Per this argument, the change from preempt_disable() to migrate_disable()
> + * gets us:
> + *
> + * - a higher priority tasks gains reduced wake-up latency; with preempt_disable()
> + *   it would have had to wait for the lower priority task.
> + *
> + * - a lower priority tasks; which under preempt_disable() could've instantly
> + *   migrated away when another CPU becomes available, is now constrained
> + *   by the ability to push the higher priority task away, which might itself be
> + *   in a migrate_disable() section, reducing it's available bandwidth.
> + *
> + * IOW it trades latency / moves the interference term, but it stays in the
> + * system, and as long as it remains unbounded, the system is not fully
> + * deterministic.

The idea makes sense but I'm worried about some implementation details.

Specifically:

	* There's no guarantee the target CPU we're pushing to doesn't have
	  a lower priority task in migration_disabled too. So we could end up
	  having to push the task again. Although unlikely in practice, but as
	  I see it the worst case scenario is unbounded here. The planets could
	  align perfectly for the higher priority task to spend the majority of
	  its time migrating between cpus that have low priority RT tasks in
	  migration_disabled regions.

	  We need to track migration disabled at rq level to fix this.
	  It might be necessary to track the priority levels that are in
	  migration_disabled too :-/

	* Since this is a voluntary migration, I think we should ensure it is
	  restricted to cpus_share_cache() to guarantee the price is minimal
	  and acceptable.

	* The push is done via the stopper task; which will steal run time
	  and could contribute to worst case latency. I think it'd fine in
	  practice, but PREEMPT_RT folks will know better.

I think the combined effect of above could end up throwing off RT system
designers who could find their high-priority-hard-RT task is missing its
deadline to be nice to lower priority tasks who go often to migration_disabled
regions.

I seem to remember Clark saying in last LPC that few us latency is not unheard
of now.

> +int push_cpu_stop(void *arg)
> +{
> +	struct rq *lowest_rq = NULL, *rq = this_rq();
> +	struct task_struct *p = arg;
> +
> +	raw_spin_lock_irq(&p->pi_lock);
> +	raw_spin_lock(&rq->lock);
> +
> +	if (task_rq(p) != rq)
> +		goto out_unlock;
> +
> +	if (is_migration_disabled(p)) {
> +		p->migration_flags |= MDF_PUSH;
> +		goto out_unlock;
> +	}
> +
> +	p->migration_flags &= ~MDF_PUSH;
> +
> +	if (p->sched_class->find_lock_rq)
> +		lowest_rq = p->sched_class->find_lock_rq(p, rq);
> +
> +	if (!lowest_rq)
> +		goto out_unlock;
> +
> +	// XXX validate p is still the highest prio task

The task_rq(p) could have left the migration_disabled region by now too. If we
track that at rq level we could be able to do last minute check to bale out of
this voluntary push.

I think we should check that the lowest_rq is not in migration_disabled region
too otherwise the same task could end up here again.

Need to think more about it, but we might be able to get away with verifying
task_rq(p)->curr and lowest_rq->curr aren't in migration disabled. The only
worry I can think of now is that rq->curr is a similar task to this one. That
is: a higher priority task that has preempted a migration_disabled region.

Verifying that task_cpu(p) and lowest_rq->cpu are in the same llc will help
avoid a costly migration. After all this is a voluntary migration.

Once we do all these bale outs; we might need to rate limit another PULL
triggering this continuously. Need to dig more into that.

> +	if (task_rq(p) == rq) {
> +		deactivate_task(rq, p, 0);
> +		set_task_cpu(p, lowest_rq->cpu);
> +		activate_task(lowest_rq, p, 0);
> +		resched_curr(lowest_rq);
> +	}
> +
> +	double_unlock_balance(rq, lowest_rq);
> +
> +out_unlock:
> +	rq->push_busy = false;
> +	raw_spin_unlock(&rq->lock);
> +	raw_spin_unlock_irq(&p->pi_lock);
> +
> +	put_task_struct(p);
> +	return 0;
> +}

[...]

> +static inline struct task_struct *get_push_task(struct rq *rq)
> +{
> +	struct task_struct *p = rq->curr;

Shouldn't we verify the class of the task here? The RT task in migration
disabled could have been preempted by a dl or stopper task. Similarly, the dl
task could have been preempted by a stopper task.

I don't think an RT task should be allowed to push a dl task under any
circumstances?

Cheers

--
Qais Yousef

> +
> +	lockdep_assert_held(&rq->lock);
> +
> +	if (rq->push_busy)
> +		return NULL;
> +
> +	if (p->nr_cpus_allowed == 1)
> +		return NULL;
> +
> +	rq->push_busy = true;
> +	return get_task_struct(p);
> +}
> +
> +extern int push_cpu_stop(void *arg);
> +
>  #endif
>  
>  #ifdef CONFIG_CPU_IDLE
> 
> 

  parent reply	other threads:[~2020-12-26 13:56 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-23 10:11 [PATCH v4 00/19] sched: Migrate disable support Peter Zijlstra
2020-10-23 10:11 ` [PATCH v4 01/19] stop_machine: Add function and caller debug info Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 02/19] sched: Fix balance_callback() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-11-11 20:30     ` Paul Bolle
2020-11-11 20:45       ` Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 03/19] sched/hotplug: Ensure only per-cpu kthreads run during hotplug Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 04/19] sched/core: Wait for tasks being pushed away on hotplug Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Thomas Gleixner
2020-10-23 10:12 ` [PATCH v4 05/19] workqueue: Manually break affinity " Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 06/19] sched/hotplug: Consolidate task migration on CPU unplug Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Thomas Gleixner
2020-10-23 10:12 ` [PATCH v4 07/19] sched: Fix hotplug vs CPU bandwidth control Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 08/19] sched: Massage set_cpus_allowed() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 09/19] sched: Add migrate_disable() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 10/19] sched: Fix migrate_disable() vs set_cpus_allowed_ptr() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-11-12 16:38   ` [PATCH v4 10/19] " Qian Cai
2020-11-12 17:26     ` Valentin Schneider
2020-11-12 18:01       ` Qian Cai
2020-11-12 19:31         ` Valentin Schneider
2020-11-12 19:41           ` Qian Cai
2020-11-12 20:37           ` Qian Cai
2020-11-12 21:26             ` Valentin Schneider
2020-11-13 10:27           ` Peter Zijlstra
2020-11-12 18:35       ` Qian Cai
2020-11-20 12:34     ` [tip: sched/core] sched/core: Add missing completion for affine_move_task() waiters tip-bot2 for Valentin Schneider
2020-10-23 10:12 ` [PATCH v4 11/19] sched/core: Make migrate disable and CPU hotplug cooperative Peter Zijlstra
2020-10-29 16:27   ` Valentin Schneider
2020-10-29 17:34     ` Peter Zijlstra
2020-10-29 17:55       ` Valentin Schneider
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Thomas Gleixner
2020-11-13 15:06   ` [PATCH v4 11/19] " Qian Cai
2020-11-17 19:28     ` Valentin Schneider
2020-11-18 14:44       ` Qian Cai
2020-11-23 18:13         ` Sebastian Andrzej Siewior
2020-12-02 21:59           ` Qian Cai
2020-12-03 12:31           ` Qian Cai
2020-12-04  0:23       ` Qian Cai
2020-12-04 21:19       ` Qian Cai
2020-12-05 18:37         ` Valentin Schneider
2020-12-06  1:17           ` Qian Cai
2020-12-07 19:27         ` Valentin Schneider
2020-12-08 13:46           ` Qian Cai
2020-12-09 19:16             ` Valentin Schneider
2020-10-23 10:12 ` [PATCH v4 12/19] sched,rt: Use cpumask_any*_distribute() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 13/19] sched,rt: Use the full cpumask for balancing Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 14/19] sched, lockdep: Annotate ->pi_lock recursion Peter Zijlstra
2020-10-29 16:27   ` Valentin Schneider
2020-10-29 17:38     ` Peter Zijlstra
2020-10-29 18:09       ` Valentin Schneider
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 15/19] sched: Fix migrate_disable() vs rt/dl balancing Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-12-26 13:54   ` Qais Yousef [this message]
2021-03-05 14:56     ` [PATCH v4 15/19] " Peter Zijlstra
2021-03-05 15:41       ` Valentin Schneider
2021-03-05 17:11         ` Qais Yousef
2021-03-10 14:44         ` Qais Yousef
2021-03-05 16:48       ` Qais Yousef
2020-10-23 10:12 ` [PATCH v4 16/19] sched/proc: Print accurate cpumask vs migrate_disable() Peter Zijlstra
2020-11-11  8:23   ` [tip: sched/core] " tip-bot2 for Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 17/19] sched: Add migrate_disable() tracepoints Peter Zijlstra
2020-10-29 16:27   ` Valentin Schneider
2020-10-29 17:43     ` Peter Zijlstra
2020-10-29 17:56       ` Valentin Schneider
2020-10-29 17:59         ` Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 18/19] sched: Deny self-issued __set_cpus_allowed_ptr() when migrate_disable() Peter Zijlstra
2020-10-23 10:12 ` [PATCH v4 19/19] sched: Comment affine_move_task() Peter Zijlstra
2020-10-29 16:27   ` Valentin Schneider
2020-10-29 17:44     ` Peter Zijlstra
2020-10-29 19:03 ` [PATCH v4 00/19] sched: Migrate disable support Valentin Schneider
2020-11-09 16:39 ` Daniel Bristot de Oliveira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201226135445.gkxfn5lmbxhblnj4@e107158-lin \
    --to=qais.yousef@arm.com \
    --cc=bigeasy@linutronix.de \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=ouwen210@hotmail.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=swood@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.donnefort@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.