LKML Archive on lore.kernel.org
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-kernel@vger.kernel.org
Cc: Marco Perronet <perronet@mpi-sws.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Li Zefan <lizefan@huawei.com>, Tejun Heo <tj@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	cgroups@vger.kernel.org
Subject: Re: [PATCH 3/6] sched/deadline: Allow DL tasks on empty (cgroup v2) cpusets
Date: Thu, 14 Jan 2021 13:12:55 +0100
Message-ID: <17db52d5-9967-969f-de03-ae6aec03c53e@arm.com>
In-Reply-To: <8380113688bd64a6deb3241ff6a0fff62b157f47.1610463999.git.bristot@redhat.com>

On 12/01/2021 16:53, Daniel Bristot de Oliveira wrote:
> cgroups v2 allows the cpuset controller to be enabled/disabled on
> demand. On Fedora 32, cpuset is disabled by default. To enable it,
> a user needs to:
> 
>   # cd /sys/fs/cgroup/
>   # echo +cpuset > cgroup.subtree_control
> 
> Existing cgroups will expose the cpuset interface (e.g., cpuset.cpus
> file). By default, cpuset.cpus has no CPU assigned, which means that
> existing tasks will move to a cpuset without cpus.
> 
> With that in mind, look what happens if a SCHED_DEADLINE task exists
> on any cgroup (user.slice by default on Fedora):
> 
> ----- %< -----
>   # chrt -d --sched-period 1000000000 --sched-runtime 100000000 0 sleep 100 &

As you mentioned above, to reproduce the issue the DL task first has to
be moved into a cgroup (e.g. user.slice) at this point:

  # echo $PID > /sys/fs/cgroup/user.slice/cgroup.procs

>   # cd /sys/fs/cgroup/
>   # echo '+cpuset' > cgroup.subtree_control
>   [   65.384041] BUG: unable to handle page fault for address: ffffffffb720f7e0
>   [   65.384551] #PF: supervisor read access in kernel mode
>   [   65.384923] #PF: error_code(0x0000) - not-present page
>   [   65.385298] PGD 61a15067 P4D 61a15067 PUD 61a16063 PMD 800fffff9ddff062
>   [   65.385781] Oops: 0000 [#1] SMP PTI
>   [   65.386042] CPU: 0 PID: 799 Comm: sh Not tainted 5.10.0-rc3 #1
>   [   65.386461] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
>   [   65.387077] RIP: 0010:dl_task_can_attach+0x40/0x250
>   [   65.387429] Code: 54 55 53 48 83 ec 18 48 89 3c 24 bf ff ff ff ff e8 05 a2 52 00
>                        4c 63 f0 48 c7 c5 00 9e 02 00 4a 8b 04 f5 00 09 47 b6 48 89 ea
>                        <4c> 8b a4 10 e0 09 00 00 49 8d 44 24 40 48 89 c7 48 89 44 24
>                        08 e8
>   [   65.388768] RSP: 0018:ffffaee8c056fcd8 EFLAGS: 00010283
>   [   65.389148] RAX: ffffffffb71e5000 RBX: ffffaee8c056fdd0 RCX: 0000000000000040
>   [   65.389661] RDX: 0000000000029e00 RSI: ffff9db202534e48 RDI: ffffffffb6d3a3e0
>   [   65.390174] RBP: 0000000000029e00 R08: 0000000000000000 R09: 0000000000000004
>   [   65.390686] R10: 0000000000000001 R11: 00000000ffa6fbff R12: ffffaee8c056fbf0
>   [   65.391196] R13: ffff9db2024e1400 R14: 0000000000000004 R15: ffff9db20ebb31e0
>   [   65.391710] FS:  00007f6df41b1740(0000) GS:ffff9db377c00000(0000) knlGS:0000000000000000
>   [   65.392289] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [   65.392705] CR2: ffffffffb720f7e0 CR3: 000000010680a003 CR4: 0000000000370ef0
>   [   65.393220] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   [   65.393732] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   [   65.394244] Call Trace:
>   [   65.394437]  cpuset_can_attach+0x8b/0x110
>   [   65.394732]  cgroup_migrate_execute+0x70/0x430
>   [   65.395057]  cgroup_update_dfl_csses+0x222/0x230
>   [   65.395392]  cgroup_subtree_control_write+0x2c6/0x3c0
>   [   65.395759]  kernfs_fop_write+0xce/0x1b0
>   [   65.396048]  vfs_write+0xc2/0x230
>   [   65.396291]  ksys_write+0x4f/0xc0
>   [   65.396533]  do_syscall_64+0x33/0x40
>   [   65.396797]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   [   65.397166] RIP: 0033:0x7f6df42a6537
>   [   65.397428] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f
>                        1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05
>                        <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89
>                        74 24
>   [   65.398766] RSP: 002b:00007ffee4128018 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>   [   65.399838] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f6df42a6537
>   [   65.400923] RDX: 0000000000000008 RSI: 000055b3f7e549e0 RDI: 0000000000000001
>   [   65.402003] RBP: 000055b3f7e549e0 R08: 000000000000000a R09: 0000000000000007
>   [   65.403082] R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000008
>   [   65.404156] R13: 00007f6df4378500 R14: 0000000000000008 R15: 00007f6df4378700
>   [   65.405218] Modules linked in: <lots of modules>
>   [   65.414172] CR2: ffffffffb720f7e0
>   [   65.415117] ---[ end trace 2dbff1a688549e65 ]---
> ----- >% -----
> 
> That happens because on dl_task_can_attach():
>         dest_cpu = cpumask_any_and(cpu_active_mask, cs_cpus_allowed);
> 
> returns a non active cpu.

Since cs_cpus_allowed is empty, cpumask_any_and() finds no CPU, so
dest_cpu should be an invalid CPU number (>= nr_cpu_ids) here rather
than a non-active CPU.
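
To make the point concrete, here is a minimal user-space sketch (not
kernel code) of the contract I am relying on: cpumask_any_and() returns
a value >= nr_cpu_ids when the two masks share no CPU. The cpu_mask_t
type, the 64-CPU limit and the model_* names are made up for
illustration; only the return-value convention comes from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/*
 * Model of the cpumask_any_and() contract: return the number of a CPU
 * that is set in both masks, or a value >= nr_cpu_ids if the
 * intersection is empty.
 */
unsigned int model_cpumask_any_and(cpu_mask_t a, cpu_mask_t b,
				   unsigned int nr_cpu_ids)
{
	cpu_mask_t both = a & b;
	unsigned int cpu;

	assert(nr_cpu_ids > 0 && nr_cpu_ids <= 64);

	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
		if (both & ((cpu_mask_t)1 << cpu))
			return cpu;

	return nr_cpu_ids;	/* "no CPU found": not a valid index */
}

/* A caller has to check the result before using it as a per-CPU index. */
int model_cpu_is_valid(unsigned int cpu, unsigned int nr_cpu_ids)
{
	return cpu < nr_cpu_ids;
}
```

With an active mask of 0xf, an empty cs_cpus_allowed and nr_cpu_ids = 4,
the model returns 4. That is not a valid CPU, so it must not be used to
look up per-CPU data.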

> Initially, I thought about returning an error and blocking the
> operation. However, that is indeed not needed. The cpuset without
> CPUs assigned will be a non-root cpuset, hence its cpu mask will
> be the same as the root one. So, the bandwidth was already accounted,
> and the task can proceed.

LGTM.

After running 'echo +cpuset > cgroup.subtree_control' in /sys/fs/cgroup:

root's cpuset.cpus.effective == user.slice's cpuset.cpus.effective
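
That equality can also be checked from user space. A rough sketch,
assuming cgroup v2 is mounted at /sys/fs/cgroup and that comparing the
text of the two cpuset.cpus.effective files is good enough; both
function names are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of <dir>/cpuset.cpus.effective into buf.
 * Returns 0 on success, -1 if the path is too long or cannot be opened.
 */
int read_effective_cpus(const char *dir, char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	assert(len > 0);

	n = snprintf(path, sizeof(path), "%s/cpuset.cpus.effective", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f))
		buf[0] = '\0';		/* empty file: no CPUs listed */
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}

/*
 * Returns 1 if both cgroups report the same effective CPU list,
 * 0 if they differ, -1 if either file could not be read.
 */
int same_effective_cpus(const char *cg_a, const char *cg_b)
{
	char a[256], b[256];

	if (read_effective_cpus(cg_a, a, sizeof(a)) ||
	    read_effective_cpus(cg_b, b, sizeof(b)))
		return -1;

	return strcmp(a, b) == 0;
}
```

On the setup described above I would expect
same_effective_cpus("/sys/fs/cgroup", "/sys/fs/cgroup/user.slice") to
return 1 once the cpuset controller has been enabled.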

> Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ben Segall <bsegall@google.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Valentin Schneider <valentin.schneider@arm.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: cgroups@vger.kernel.org
> ---
>  kernel/sched/deadline.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 943aa32cc1bc..788a391657a5 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2871,6 +2871,13 @@ int dl_task_can_attach(struct task_struct *p,
>  	bool overflow;
>  	int ret;
>  
> +	/*
> +	 * The cpuset has no cpus assigned, so the thread will not
> +	 * change its affinity.
> +	 */
> +	if (cpumask_empty(cs_cpus_allowed))
> +		return 0;
> +
>  	/*
>  	 * The task is not moving to another root domain, so it is
>  	 * already accounted.
> 

