linux-kernel.vger.kernel.org archive mirror
* [PATCH] fix share rt runtime with offline rq
@ 2019-12-21  2:20 chenying
  2019-12-23 16:40 ` Steven Rostedt
  0 siblings, 1 reply; 3+ messages in thread
From: chenying @ 2019-12-21  2:20 UTC (permalink / raw)
  To: mingo
  Cc: peterz, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
	bsegall, mgorman, linux-kernel, xue.zhihong, wang.yi59,
	jiang.xuexin, chenying

In my environment, CPUs 0-11 are online, CPUs 12-15 are offline, CPU2 is isolated,
and sched_rt_runtime_us is 950000. I bind an RT process running a dead loop to CPU2.
CPU usage on CPU2 then reaches 100% instead of being throttled (with the default
sched_rt_period_us of 1000000 the limit would be 95%). Since CPU2 is the only
isolated CPU, it can be inferred that CPU2 borrows the RT runtime of the offline CPUs.

/ # cat /sys/devices/system/cpu/online
0-11
/ # cat /sys/devices/system/cpu/offline
12-15
/ # cat /sys/devices/system/cpu/isolated
2
/ # cat /proc/sys/kernel/sched_rt_runtime_us
950000
/ # chrt -p 357
pid 357's current scheduling policy: SCHED_FIFO
pid 357's current scheduling priority: 1

top - 15:52:12 up 4 min,  0 users,  load average: 0.92, 0.41, 0.16
Tasks: 201 total,   2 running, 199 sleeping,   0 stopped,   0 zombie
%Cpu0  :  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  357 root      -2   0    4044    172    136 R 100.0  0.0   2:32.99 deadloop
  366 root      20   0   22060   2404   2128 R   0.7  0.0   0:00.06 top
    1 root      20   0    2624     20      0 S   0.0  0.0   0:05.93 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
    4 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0

Signed-off-by: chenying <chen.ying153@zte.com.cn>
---
 kernel/sched/rt.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index a532558..d20dc86 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -648,8 +648,12 @@ static void do_balance_runtime(struct rt_rq *rt_rq)
 	rt_period = ktime_to_ns(rt_b->rt_period);
 	for_each_cpu(i, rd->span) {
 		struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
+		struct rq *rq = rq_of_rt_rq(iter);
 		s64 diff;
 
+		if (!rq->online)
+			continue;
+
 		if (iter == rt_rq)
 			continue;
 
-- 
2.15.2
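
The effect of the rq->online check in the patch above can be illustrated with a
small user-space model. This is only a sketch under stated assumptions, not
kernel code: struct model_rt_rq and model_balance_runtime() are hypothetical
names, the locking is omitted, and the arithmetic merely follows the shape of
the borrowing loop (take 1/n of each neighbour's spare runtime, capped at the
period).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a per-CPU rt_rq; not the kernel struct. */
struct model_rt_rq {
	int64_t rt_runtime; /* budget per period, in ns */
	int64_t rt_time;    /* budget already consumed, in ns */
	bool online;        /* stand-in for rq->online */
};

/*
 * Let rqs[self] borrow spare runtime from the other entries: take 1/n of
 * each neighbour's spare time, capped so the borrower never exceeds the
 * period. When skip_offline is true, offline entries are skipped, which
 * models the check added by the patch. Returns the borrower's runtime.
 */
static int64_t model_balance_runtime(struct model_rt_rq *rqs, int n, int self,
				     int64_t rt_period, bool skip_offline)
{
	assert(n > 0 && self >= 0 && self < n);

	/* Nothing to borrow if the budget already covers the whole period. */
	if (rqs[self].rt_runtime >= rt_period)
		return rqs[self].rt_runtime;

	for (int i = 0; i < n; i++) {
		struct model_rt_rq *iter = &rqs[i];
		int64_t diff;

		if (skip_offline && !iter->online)
			continue;
		if (i == self)
			continue;

		diff = iter->rt_runtime - iter->rt_time;
		if (diff <= 0)
			continue;

		/* n plays the role of the root-domain span weight. */
		diff /= n;
		if (rqs[self].rt_runtime + diff > rt_period)
			diff = rt_period - rqs[self].rt_runtime;
		iter->rt_runtime -= diff;
		rqs[self].rt_runtime += diff;
		if (rqs[self].rt_runtime == rt_period)
			break;
	}
	return rqs[self].rt_runtime;
}
```

In this model, one busy entry with 950000000 ns of a 1000000000 ns period plus
four idle offline entries ends up with the full period when skip_offline is
false, and stays at 950000000 ns when skip_offline is true.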


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] fix share rt runtime with offline rq
  2019-12-21  2:20 [PATCH] fix share rt runtime with offline rq chenying
@ 2019-12-23 16:40 ` Steven Rostedt
  2020-01-08 13:02   ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Rostedt @ 2019-12-23 16:40 UTC (permalink / raw)
  To: chenying
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, linux-kernel, xue.zhihong, wang.yi59,
	jiang.xuexin

On Sat, 21 Dec 2019 10:20:12 +0800
chenying <chen.ying153@zte.com.cn> wrote:

> In my environment, CPUs 0-11 are online, CPUs 12-15 are offline, CPU2 is isolated,
> and sched_rt_runtime_us is 950000. I bind an RT process running a dead loop to CPU2.
> CPU usage on CPU2 then reaches 100% instead of being throttled (with the default
> sched_rt_period_us of 1000000 the limit would be 95%). Since CPU2 is the only
> isolated CPU, it can be inferred that CPU2 borrows the RT runtime of the offline CPUs.
> 
> / # cat /sys/devices/system/cpu/online
> 0-11
> / # cat /sys/devices/system/cpu/offline
> 12-15
> / # cat /sys/devices/system/cpu/isolated
> 2
> / # cat /proc/sys/kernel/sched_rt_runtime_us
> 950000
> / # chrt -p 357
> pid 357's current scheduling policy: SCHED_FIFO
> pid 357's current scheduling priority: 1

I'm guessing that you took the cpus offline via the kernel command line
parameter. Because when I tried this with just:

 # echo 0 > /sys/devices/system/cpu/cpu${cpu}/online

I could not reproduce it. But when I booted with maxcpus=X set, I could.


> 
> top - 15:52:12 up 4 min,  0 users,  load average: 0.92, 0.41, 0.16
> Tasks: 201 total,   2 running, 199 sleeping,   0 stopped,   0 zombie
> %Cpu0  :  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu8  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu9  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> %Cpu10 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>   357 root      -2   0    4044    172    136 R 100.0  0.0   2:32.99 deadloop
>   366 root      20   0   22060   2404   2128 R   0.7  0.0   0:00.06 top
>     1 root      20   0    2624     20      0 S   0.0  0.0   0:05.93 init
>     2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
>     3 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
>     4 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0
> 
> Signed-off-by: chenying <chen.ying153@zte.com.cn>
> ---
>  kernel/sched/rt.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index a532558..d20dc86 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -648,8 +648,12 @@ static void do_balance_runtime(struct rt_rq *rt_rq)
>  	rt_period = ktime_to_ns(rt_b->rt_period);
>  	for_each_cpu(i, rd->span) {
>  		struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
> +		struct rq *rq = rq_of_rt_rq(iter);
>  		s64 diff;
>  
> +		if (!rq->online)
> +			continue;
> +

I think this might be papering over the real issue. Perhaps
rq_offline_rt() needs to be called for CPUs not being brought online?

-- Steve


>  		if (iter == rt_rq)
>  			continue;
>  


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] fix share rt runtime with offline rq
  2019-12-23 16:40 ` Steven Rostedt
@ 2020-01-08 13:02   ` Peter Zijlstra
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2020-01-08 13:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: chenying, mingo, juri.lelli, vincent.guittot, dietmar.eggemann,
	bsegall, mgorman, linux-kernel, xue.zhihong, wang.yi59,
	jiang.xuexin

On Mon, Dec 23, 2019 at 11:40:30AM -0500, Steven Rostedt wrote:
> >  kernel/sched/rt.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index a532558..d20dc86 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -648,8 +648,12 @@ static void do_balance_runtime(struct rt_rq *rt_rq)
> >  	rt_period = ktime_to_ns(rt_b->rt_period);
> >  	for_each_cpu(i, rd->span) {
> >  		struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
> > +		struct rq *rq = rq_of_rt_rq(iter);
> >  		s64 diff;
> >  
> > +		if (!rq->online)
> > +			continue;
> > +
> 
> I think this might be papering over the real issue. Perhaps
> rq_offline_rt() needs to be called for CPUs not being brought online?

Yeah, very much that. Something like the below perhaps. But I really
want to rip out the whole RT_CGROUP_SCHED stuff so we can start over.

Perhaps the poster can explain what he's using this stuff for?

---
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4043abe45459..96a0320cfadb 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -208,7 +208,13 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 			goto err_free_rq;
 
 		init_rt_rq(rt_rq);
+
+		cpus_read_lock();
 		rt_rq->rt_runtime = tg->rt_bandwidth.rt_runtime;
+		if (!cpu_online(i))
+			rt_rq->rt_runtime = RUNTIME_INF;
+		cpus_read_unlock();
+
 		init_tg_rt_entry(tg, rt_rq, rt_se, i, parent->rt_se[i]);
 	}
 

^ permalink raw reply related	[flat|nested] 3+ messages in thread
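
The alternative in the last message (give CPUs that are not online a runtime of
RUNTIME_INF at allocation time) can be sketched the same way in user space.
This is a simplified model, not the kernel implementation: struct inf_rt_rq,
inf_init_rt_rq(), inf_borrow() and INF_RUNTIME are hypothetical names, locking
is omitted, and the model assumes that the borrowing loop never steals from an
entry whose runtime is the "infinite" sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sentinel meaning "do not lend"; modelled on the kernel's RUNTIME_INF. */
#define INF_RUNTIME UINT64_MAX

/* Hypothetical user-space stand-in for a per-CPU rt_rq. */
struct inf_rt_rq {
	uint64_t rt_runtime; /* budget per period, in ns, or INF_RUNTIME */
	uint64_t rt_time;    /* budget already consumed, in ns */
};

/* Mirrors the idea of the diff: CPUs that are not online start at INF. */
static void inf_init_rt_rq(struct inf_rt_rq *rq, uint64_t group_runtime,
			   bool cpu_online)
{
	rq->rt_time = 0;
	rq->rt_runtime = cpu_online ? group_runtime : INF_RUNTIME;
}

/*
 * Borrow up to 1/n of each neighbour's spare runtime for rqs[self], capped
 * at the period. Entries marked INF_RUNTIME are never used as lenders
 * (an assumption of this model). Returns the borrower's runtime.
 */
static uint64_t inf_borrow(struct inf_rt_rq *rqs, int n, int self,
			   uint64_t rt_period)
{
	struct inf_rt_rq *me;

	assert(n > 0 && self >= 0 && self < n);
	me = &rqs[self];
	if (me->rt_runtime == INF_RUNTIME || me->rt_runtime >= rt_period)
		return me->rt_runtime;

	for (int i = 0; i < n; i++) {
		struct inf_rt_rq *iter = &rqs[i];
		uint64_t diff;

		if (i == self || iter->rt_runtime == INF_RUNTIME)
			continue;
		if (iter->rt_runtime <= iter->rt_time)
			continue;

		diff = (iter->rt_runtime - iter->rt_time) / (uint64_t)n;
		if (me->rt_runtime + diff > rt_period)
			diff = rt_period - me->rt_runtime;
		iter->rt_runtime -= diff;
		me->rt_runtime += diff;
		if (me->rt_runtime == rt_period)
			break;
	}
	return me->rt_runtime;
}
```

Compared with checking rq->online inside the loop, this variant leaves the
borrowing loop untouched and moves the decision to initialisation time, which
is closer to the suggestion that never-onlined CPUs should look the same as
CPUs that were taken offline.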
