From: Quentin Perret <qperret@google.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Will Deacon <will@kernel.org>, Juri Lelli <juri.lelli@redhat.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Qais Yousef <qais.yousef@arm.com>,
	Suren Baghdasaryan <surenb@google.com>, Tejun Heo <tj@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Ingo Molnar <mingo@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	kernel-team@android.com
Subject: Re: [PATCH v6 13/21] sched: Admit forcefully-affined tasks into SCHED_DEADLINE
Date: Fri, 21 May 2021 13:02:01 +0000
Message-ID: <YKevSSLHjdRvrJ2i@google.com>
In-Reply-To: <3620bad5-2a27-0f9e-f1f0-70036997d33c@arm.com>

On Friday 21 May 2021 at 13:23:55 (+0200), Dietmar Eggemann wrote:
> On 21/05/2021 12:37, Will Deacon wrote:
> > On Fri, May 21, 2021 at 10:39:32AM +0200, Juri Lelli wrote:
> >> On 21/05/21 08:15, Quentin Perret wrote:
> >>> On Friday 21 May 2021 at 07:25:51 (+0200), Juri Lelli wrote:
> >>>> On 20/05/21 19:01, Will Deacon wrote:
> >>>>> On Thu, May 20, 2021 at 02:38:55PM +0200, Daniel Bristot de Oliveira wrote:
> >>>>>> On 5/20/21 12:33 PM, Quentin Perret wrote:
> >>>>>>> On Thursday 20 May 2021 at 11:16:41 (+0100), Will Deacon wrote:
> >>>>>>>> Ok, thanks for the insight. In which case, I'll go with what we discussed:
> >>>>>>>> require admission control to be disabled for sched_setattr() but allow
> >>>>>>>> execve() to a 32-bit task from a 64-bit deadline task with a warning (this
> >>>>>>>> is probably similar to CPU hotplug?).
> >>>>>>>
> >>>>>>> Still not sure that we can let execve go through ... It will break AC
> >>>>>>> all the same, so it should probably fail as well if AC is on IMO
> >>>>>>>
> >>>>>>
> >>>>>> If the cpumask of the 32-bit task is != of the 64-bit task that is executing it,
> >>>>>> the admission control needs to be re-executed, and it could fail. So I see this
> >>>>>> operation equivalent to sched_setaffinity(). This will likely be true for future
> >>>>>> schedulers that will allow arbitrary affinities (AC should run on affinity
> >>>>>> change, and could fail).
> >>>>>>
> >>>>>> I would vote with Juri: "I'd go with fail hard if AC is on, let it
> >>>>>> pass if AC is off (supposedly the user knows what to do)," (also hope nobody
> >>>>>> complains until we add better support for affinity, and use this as a motivation
> >>>>>> to get back on this front).
> >>>>>
> >>>>> I can have a go at implementing it, but I don't think it's a great solution
> >>>>> and here's why:
> >>>>>
> >>>>> Failing an execve() is _very_ likely to be fatal to the application. It's
> >>>>> also very likely that the task calling execve() doesn't know whether the
> >>>>> program it's trying to execute is 32-bit or not. Consequently, if we go
> >>>>> with failing execve() then all that will happen is that people will disable
> >>>>> admission control altogether.
> >>>
> >>> Right, but only on these dumb 32bit asymmetric systems, and only if we
> >>> care about running 32bits deadline tasks -- which I seriously doubt for
> >>> the Android use-case.
> >>>
> >>> Note that running deadline tasks is also a privileged operation, it
> >>> can't be done by random apps.
> >>>
> >>>>> That has a negative impact on "pure" 64-bit
> >>>>> applications and so I think we end up with the tail wagging the dog because
> >>>>> admission control will be disabled for everybody just because there is a
> >>>>> handful of 32-bit programs which may get executed. I understand that it
> >>>>> also means that RT throttling would be disabled.
> >>>>
> >>>> Completely understand your perplexity. But how can the kernel still give
> >>>> guarantees to "pure" 64-bit applications if there are 32-bit
> >>>> applications around that essentially broke admission control when they
> >>>> were restricted to a subset of cores?
> >>>>
> >>>>> Allowing the execve() to continue with a warning is very similar to the
> >>>>> case in which all the 64-bit CPUs are hot-unplugged at the point of
> >>>>> execve(), and this is much closer to the illusion that this patch series
> >>>>> intends to provide.
> >>>>
> >>>> So, for hotplug we currently have a check that would make hotplug
> >>>> operations fail if removing a CPU would mean not enough bandwidth to run
> >>>> the currently admitted set of DEADLINE tasks.
> >>>
> >>> Aha, wasn't aware. Any pointers to that check for my education?
> >>
> >> Hotplug ends up calling dl_cpu_busy() (after the cpu being hotplugged out
> >> got removed), IIRC. So, if that fails the operation is undone.
> > 
> > Interesting, thanks. Thinking about this some more, it strikes me that with
> > these silly asymmetric systems there could be an interesting additional
> > problem with hotplug and deadline tasks. Imagine the following sequence of
> > events:
> > 
> >   1. All online CPUs are 32-bit-capable
> >   2. sched_setattr() admits a 32-bit deadline task
> >   3. A 64-bit-only CPU is onlined
> >   4. Some of the 32-bit-capable CPUs are offlined
> > 
> > I wonder if we can get into a situation where we think we have enough
> > bandwidth available, but in reality the 32-bit task is in trouble because
> > it can't make use of the 64-bit-only CPU.
> > 
> > If so, then it seems to me that admission control is really just
> > "best-effort" for 32-bit deadline tasks on these systems because it's based
> > on a snapshot in time of the available resources.
> 
> IMHO DL AC is per root domain (rd). So if we have e.g. an 8 CPU system
> with aarch32_el0 eq. [0-3] then we would need 2 exclusive cpusets ([0-3]
> and [4-7]) to admit 32-bit DL tasks into [0-3] (i.e. to pass the `if
> (!cpumask_subset(span, p->cpus_ptr) ...` test in __sched_setscheduler()).
> 
> Trying to admit too many 32-bit DL tasks or trying to hp out a CPU[0-3]
> would lead to `Device or resource busy` in case the rd bw wouldn't be
> sufficient anymore for the set of admitted tasks. But the [0-3] DL AC
> wouldn't care about hp on CPU[4-7].

So I think Will has a point since, IIRC, the root domains get rebuilt
during hotplug. So you can imagine a case with a single root domain, but
CPUs 4-7 are offline. In this case, sched_setattr() will happily promote
a task to DL as long as its affinity mask is a superset of the rd span,
but things may get ugly when CPUs are plugged back in later on.
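
To make the affinity condition concrete, here is a minimal userspace sketch
of the subset test only. It is not kernel code: cpumask_model_t, cpu_bit(),
mask_subset() and dl_affinity_ok() are hypothetical names for illustration,
and the bandwidth part of admission control is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per CPU; a userspace stand-in for the kernel's struct cpumask. */
typedef uint64_t cpumask_model_t;

/* Mask with only 'cpu' set. The model supports at most 64 CPUs. */
static cpumask_model_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return UINT64_C(1) << cpu;
}

/* True if every CPU set in 'a' is also set in 'b' (a is a subset of b). */
static bool mask_subset(cpumask_model_t a, cpumask_model_t b)
{
	return (a & ~b) == 0;
}

/*
 * Model of the affinity condition only: the task may be promoted to DL
 * if its affinity mask covers the whole root-domain span.
 */
static bool dl_affinity_ok(cpumask_model_t rd_span, cpumask_model_t task_mask)
{
	return mask_subset(rd_span, task_mask);
}
```

With a span of CPUs 0-3 (0xf) and a task affined to CPUs 0-2 (0x7),
dl_affinity_ok() is false. Once the span shrinks to 0x7 the same task
passes, and the result only reflects the span at the time of the call.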

This looks like an existing bug though. I just tried the following on a
system with 4 CPUs:

    // Create a task affined to CPU [0-2]
    > while true; do echo "Hi" > /dev/null; done &
    [1] 560
    > mypid=$!
    > taskset -p 7 $mypid
    pid 560's current affinity mask: f
    pid 560's new affinity mask: 7

    // Try to move it to DL, this should fail because of the affinity
    > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
    chrt: failed to set pid 560's policy: Operation not permitted

    // Offline CPU 3, so the rd now covers CPUs 0-2 only
    > echo 0 > /sys/devices/system/cpu/cpu3/online
    [  400.843830] CPU3: shutdown
    [  400.844100] psci: CPU3 killed (polled 0 ms)

    // Try to admit the task again, which now succeeds
    > chrt -d -T 5000000 -P 16666666 -p 0 $mypid

    // Plug CPU3 back online
    > echo 1 > /sys/devices/system/cpu/cpu3/online
    [  408.819337] Detected PIPT I-cache on CPU3
    [  408.819642] GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
    [  408.820165] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]

I don't see any easy way to fix this without iterating over all deadline
tasks in the rd when hotplugging a CPU back on, and blocking the hotplug
operation if it'll cause affinity issues. Urgh.
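
As a rough illustration of what such a check would have to do, here is a
userspace model, assuming a flat array of admitted deadline tasks. The
struct dl_task_model type and can_online_cpu() are hypothetical names and
not existing kernel interfaces; a real implementation would also have to
deal with locking and root-domain rebuilds, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an admitted deadline task: pid plus affinity bits. */
struct dl_task_model {
	int pid;
	uint64_t cpus_mask;	/* one bit per CPU the task may run on */
};

/*
 * Returns true if onlining 'cpu' keeps the affinity of every admitted
 * deadline task a superset of the enlarged root-domain span, and false
 * if at least one task would no longer cover the span.
 */
static bool can_online_cpu(uint64_t rd_span, unsigned int cpu,
			   const struct dl_task_model *tasks, size_t nr_tasks)
{
	uint64_t new_span;
	size_t i;

	assert(cpu < 64);
	new_span = rd_span | (UINT64_C(1) << cpu);

	for (i = 0; i < nr_tasks; i++) {
		/* Some CPU in the new span is outside this task's affinity. */
		if ((new_span & ~tasks[i].cpus_mask) != 0)
			return false;
	}
	return true;
}
```

In the example above, the span is 0x7 after CPU3 goes offline and pid 560
has mask 0x7, so can_online_cpu(0x7, 3, ...) returns false in this model.
The cost is a walk over every deadline task on each online operation.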
