* cgroup, RT reservation per core(s)?
@ 2009-02-09 19:30 Rolando Martins
2009-02-09 19:52 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 19:30 UTC (permalink / raw)
To: linux-kernel
Hi,
On a quad-core machine, I would like to have two cores totally (100%)
dedicated to RT tasks (SCHED_RR and SCHED_FIFO), and the other two cores
with normal behavior, that is, allowing SCHED_OTHER, SCHED_FIFO, and
SCHED_RR tasks but still with an RT reservation. An example follows:
# Setup first domain (cpu 0,1)
echo 0-1 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems
# Setup RT bandwidth for first domain (80% for RT, 20% others)
echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us
# Setup second domain (cpu 2,3)
mkdir /dev/cgroup/1
echo 2-3 > /dev/cgroup/1/cpuset.cpus
echo 0 > /dev/cgroup/1/cpuset.mems
# Setup RT bandwidth for second domain (100% for RT)
echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us
Is there any way to do this?
Thanks,
Rol
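The runtime/period pairs above express RT utilization as a fraction of the period (800000/1000000 = 80%). As a sketch of that arithmetic, assuming a hypothetical helper name (this is not a kernel interface):

```shell
# Hypothetical helper: compute the cpu.rt_runtime_us value that yields
# a given RT percentage of a given period.
rt_runtime_for() {
    period_us=$1
    percent=$2
    echo $(( period_us * percent / 100 ))
}

rt_runtime_for 1000000 80   # 80% of a 1s period -> 800000
```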
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
2009-02-09 19:30 cgroup, RT reservation per core(s)? Rolando Martins
@ 2009-02-09 19:52 ` Peter Zijlstra
2009-02-09 20:04 ` Rolando Martins
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-09 19:52 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel
On Mon, 2009-02-09 at 19:30 +0000, Rolando Martins wrote:
> Hi,
> I would like to have in a quad-core, 2 cores totally (100%) dedicated
> to RT tasks (SCHED_RR and SCHED_FIFO) and the other 2 cores with
> normal behavior, better said, allowing SCHED_OTHER & SCHED_FIFO &
> SCHED_RR tasks but still with a RT reservation. Follows an example:
>
>
> # Setup first domain (cpu 0,1)
> echo 0-1 > /dev/cgroup/0/cpuset.cpus
> echo 0 > /dev/cgroup/0/cpuset.mems
>
> # Setup RT bandwidth for firstdomain (80% for RT, 20% others)
> echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
> echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us
>
>
> # Setup second domain (cpu 2,3)
> mkdir /dev/cgroup/1
> echo 2-3 > /dev/cgroup/1/cpuset.cpus
> echo 0 > /dev/cgroup/1/cpuset.mems
>
> # Setup RT bandwidth for second domain (100% for RT)
> echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
> echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us
>
> Is there anyway for doing this?
Nope, but why do you need bandwidth groups if all you want is a full
cpu?
Just the cpuset should be plenty
* Re: cgroup, RT reservation per core(s)?
2009-02-09 19:52 ` Peter Zijlstra
@ 2009-02-09 20:04 ` Rolando Martins
2009-02-10 13:06 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 20:04 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel
Thanks for the quick reply.
On Mon, Feb 9, 2009 at 7:52 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 19:30 +0000, Rolando Martins wrote:
>> Hi,
>> I would like to have in a quad-core, 2 cores totally (100%) dedicated
>> to RT tasks (SCHED_RR and SCHED_FIFO) and the other 2 cores with
>> normal behavior, better said, allowing SCHED_OTHER & SCHED_FIFO &
>> SCHED_RR tasks but still with a RT reservation. Follows an example:
>>
>>
>> # Setup first domain (cpu 0,1)
>> echo 0-1 > /dev/cgroup/0/cpuset.cpus
>> echo 0 > /dev/cgroup/0/cpuset.mems
>>
>> # Setup RT bandwidth for firstdomain (80% for RT, 20% others)
>> echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
>> echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us
>>
>>
>> # Setup second domain (cpu 2,3)
>> mkdir /dev/cgroup/1
>> echo 2-3 > /dev/cgroup/1/cpuset.cpus
>> echo 0 > /dev/cgroup/1/cpuset.mems
>>
>> # Setup RT bandwidth for second domain (100% for RT)
>> echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
>> echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us
>>
>> Is there anyway for doing this?
>
> Nope, but why do you need bandwidth groups if all you want is a full
> cpu?
>
> Just the cpuset should be plenty
You have a point;)
>
>
I should have elaborated this more:
            root
        -----|-----
        |         |
        0         1  (100% rt, 0.5 mem)
   (0.5 mem)  ----|----
              |   |   |
              2   3   4  (33% rt for each group, 33% mem per group (0.165))
Rol
* Re: cgroup, RT reservation per core(s)?
2009-02-09 20:04 ` Rolando Martins
@ 2009-02-10 13:06 ` Peter Zijlstra
2009-02-10 14:46 ` Rolando Martins
2009-03-03 12:58 ` Rolando Martins
0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 13:06 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel
On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
> I should have elaborated this more:
>
>             root
>         -----|-----
>         |         |
>         0         1  (100% rt, 0.5 mem)
>    (0.5 mem)  ----|----
>               |   |   |
>               2   3   4  (33% rt for each group, 33% mem per group (0.165))
> Rol
Right, I think this can be done.
You would indeed need cpusets and sched-cgroups.
Split the machine in 2 using cpusets.
   ___R___
  /       \
 A         B
Where R is the root cpuset, and A and B are the siblings.
Assign A one half the cpus, and B the other half.
Disable load-balancing on R.
Then using sched cgroups create the hierarchy
   ____1____
  /    |    \
 2     3     4
Where 1 can be the root group if you like.
Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
of 33% each.
Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
and sched group 1.
Place your other tasks in B,{2-4} respectively.
The reason this works is that bandwidth distribution is sched-domain
wide, and by disabling load-balancing on R, you split the scheduling
domain.
I've never actually tried anything like this, let me know if it
works ;-)
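The steps above could be sketched as shell commands. This is untested (per the caveat above), and the mount points, group names, and runtime values are illustrative assumptions, not a verified recipe:

```shell
# Untested sketch of the cpuset split plus sched-cgroup hierarchy;
# assumes a kernel built with RT group scheduling support.

# Split the machine in two with cpusets (R is the mounted root).
mkdir -p /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset
mkdir A B
echo 0-1 > A/cpuset.cpus; echo 0 > A/cpuset.mems
echo 2-3 > B/cpuset.cpus; echo 0 > B/cpuset.mems
echo 0 > cpuset.sched_load_balance    # split the root scheduling domain

# Create the sched-cgroup hierarchy; the mounted root plays the role of 1.
mkdir -p /dev/sched
mount -t cgroup -o cpu none /dev/sched
cd /dev/sched
mkdir 2 3 4
for g in 2 3 4; do
    echo 1000000 > $g/cpu.rt_period_us    # 1s period
    echo 330000  > $g/cpu.rt_runtime_us   # ~33% utilization each
done
```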
* Re: cgroup, RT reservation per core(s)?
2009-02-10 13:06 ` Peter Zijlstra
@ 2009-02-10 14:46 ` Rolando Martins
2009-02-10 16:00 ` Peter Zijlstra
2009-03-03 12:58 ` Rolando Martins
1 sibling, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 14:46 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel
On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
> > I should have elaborated this more:
> >
> >             root
> >         -----|-----
> >         |         |
> >         0         1  (100% rt, 0.5 mem)
> >    (0.5 mem)  ----|----
> >               |   |   |
> >               2   3   4  (33% rt for each group, 33% mem per group (0.165))
> > Rol
>
>
>
> Right, i think this can be done.
>
> You would indeed need cpusets and sched-cgroups.
>
> Split the machine in 2 using cpusets.
>
>    ___R___
>   /       \
>  A         B
>
> Where R is the root cpuset, and A and B are the siblings.
> Assign A one half the cpus, and B the other half.
> Disable load-balancing on R.
>
> Then using sched cgroups create the hierarchy
>
>    ____1____
>   /    |    \
>  2     3     4
>
> Where 1 can be the root group if you like.
>
> Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
> of 33% each.
>
> Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
> and sched group 1.
>
> Place your other tasks in B,{2-4} respectively.
>
> The reason this works is that bandwidth distribution is sched domain
> wide, and by disabling load-balancing on R, you split the schedule
> domain.
>
> I've never actually tried anything like this, let me know if it
> works ;-)
>
Thanks Peter, it works!
I am thinking about different strategies to use in my RT middleware
project, and I think there is a limitation.
If I wanted some RT in the B cpuset, I couldn't, because I assigned
A.cpu.rt_runtime_us = root.cpu.rt_runtime_us (and then subdivided the
A cpuset with 2, 3, 4, each one having A.cpu.rt_runtime_us/3).
This happens because there is a global /proc/sys/kernel/sched_rt_runtime_us and
/proc/sys/kernel/sched_rt_period_us.
What do you think about adding a separate tuple (runtime,period) for
each core/cpu?
In this case:
/proc/sys/kernel/sched_rt_runtime_us_0
/proc/sys/kernel/sched_rt_period_us_0
...
/proc/sys/kernel/sched_rt_runtime_us_n (n = cpu count)
/proc/sys/kernel/sched_rt_period_us_n
Given this, we could do the following:
mkdir /dev/cgroup/A
echo 0-1 > /dev/cgroup/A/cpuset.cpus
echo 0 > /dev/cgroup/A/cpuset.mems
echo 1000000 > /dev/cgroup/A/cpu.rt_period_us
echo 1000000 > /dev/cgroup/A/cpu.rt_runtime_us
This would only work if we could allocate
(cpu.rt_runtime_us, cpu.rt_period_us) on both CPU 0 and CPU 1;
otherwise it would fail.
mkdir /dev/cgroup/B
echo 2-3 > /dev/cgroup/B/cpuset.cpus
echo 0 > /dev/cgroup/B/cpuset.mems
echo 1000000 > /dev/cgroup/B/cpu.rt_period_us
echo 800000 > /dev/cgroup/B/cpu.rt_runtime_us
The same here: fail if we couldn't allocate 0.8 on both CPU 2 and CPU 3.
Does this make sense? ;)
Rol
* Re: cgroup, RT reservation per core(s)?
2009-02-10 14:46 ` Rolando Martins
@ 2009-02-10 16:00 ` Peter Zijlstra
2009-02-10 17:32 ` Rolando Martins
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 16:00 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel
On Tue, 2009-02-10 at 14:46 +0000, Rolando Martins wrote:
>
> > I've never actually tried anything like this, let me know if it
> > works ;-)
> >
>
> Thanks Peter, it works!
> I am thinking for different strategies to be used in my rt middleware
> project, and I think there is a limitation.
> If I wanted to have some RT on the B cpuset, I couldn't because I
> assigned A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (then subdivided
> the A cpuset, with 2,3,4, each one having A.cpu.rt_runtime_ns/3).
Try it, you can run RT proglets in B.
You get n*utilization per schedule domain, where n is the number of cpus
in it.
So you still have 200% left in B, even if you use 200% of A.
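The n*utilization rule works out with a little arithmetic; a sketch with a hypothetical helper, expressing capacity as an integer percentage:

```shell
# A schedule domain's total RT capacity, in percent:
# n_cpus * (rt_runtime_us / rt_period_us) * 100
domain_rt_capacity() {
    ncpus=$1
    runtime_us=$2
    period_us=$3
    echo $(( ncpus * runtime_us * 100 / period_us ))
}

domain_rt_capacity 2 1000000 1000000   # 2-cpu domain at 100% runtime -> 200
domain_rt_capacity 2 800000 1000000    # 2-cpu domain at 80% runtime  -> 160
```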
> This happens because there is a
> global /proc/sys/kernel/sched_rt_runtime_us and
> /proc/sys/kernel/sched_rt_period_us.
These globals don't actually do much (except provide a global cap) in
the cgroup case.
> What do you think about adding a separate tuple (runtime,period) for
> each core/cpu?
> Does this make sense? ;)
That's going to give me a horrible head-ache trying to load-balance
stuff.
* Re: cgroup, RT reservation per core(s)?
2009-02-10 16:00 ` Peter Zijlstra
@ 2009-02-10 17:32 ` Rolando Martins
2009-02-10 19:53 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 17:32 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel
On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2009-02-10 at 14:46 +0000, Rolando Martins wrote:
> >
> > > I've never actually tried anything like this, let me know if it
> > > works ;-)
> > >
> >
> > Thanks Peter, it works!
>
> > I am thinking for different strategies to be used in my rt middleware
> > project, and I think there is a limitation.
> > If I wanted to have some RT on the B cpuset, I couldn't because I
> > assigned A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (then subdivided
> > the A cpuset, with 2,3,4, each one having A.cpu.rt_runtime_ns/3).
>
>
> Try it, you can run RT proglets in B.
>
> You get n*utilization per schedule domain, where n is the number of cpus
> in it.
>
> So you still have 200% left in B, even if you use 200% of A.
>
>
> > This happens because there is a
> > global /proc/sys/kernel/sched_rt_runtime_us and
> > /proc/sys/kernel/sched_rt_period_us.
>
>
> These globals don't actually do much (except provide a global cap) in
> the cgroup case.
>
>
> > What do you think about adding a separate tuple (runtime,period) for
> > each core/cpu?
>
>
> > Does this make sense? ;)
>
> That's going to give me a horrible head-ache trying to load-balance
> stuff.
>
Sorry Peter, I didn't think before typing ;)
I was looking at cgroups as a more integrated (rigid ;)) infrastructure,
and was therefore using only one mount point for all the operations... :x
Now I have everything working properly! Thanks for the support.
To help others:
mkdir /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset
echo 0 > cpuset.sched_load_balance
mkdir A
echo 0-1 > A/cpuset.cpus
echo 0 > A/cpuset.mems
mkdir B
echo 2-3 > B/cpuset.cpus
echo 0 > B/cpuset.mems
mkdir /dev/sched_domain
mount -t cgroup -o cpu none /dev/sched_domain
cd /dev/sched_domain
mkdir 1
cat cpu.rt_runtime_us > 1/cpu.rt_runtime_us   # give group 1 the root group's full runtime
mkdir 1/2
echo 33333 > 1/2/cpu.rt_runtime_us
mkdir 1/3
echo 33333 > 1/3/cpu.rt_runtime_us
mkdir 1/4
echo 33333 > 1/4/cpu.rt_runtime_us
For example, setting the current shell to a specific cpuset(A) and sched(1/2):
echo $$ > /dev/cpuset/A/tasks
echo $$ > /dev/sched_domain/1/2/tasks
"execute program"
Peter, can you confirm this code? ;)
Thanks!
Rol
* Re: cgroup, RT reservation per core(s)?
2009-02-10 17:32 ` Rolando Martins
@ 2009-02-10 19:53 ` Peter Zijlstra
2009-02-11 11:33 ` Rolando Martins
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 19:53 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel
On Tue, 2009-02-10 at 17:32 +0000, Rolando Martins wrote:
>
> For helping others:
>
> mkdir /dev/cpuset
> mount -t cgroup -o cpuset none /dev/cpuset
> cd /dev/cpuset
> echo 0 > cpuset.sched_load_balance
I'm not quite sure that it's allowed to disable load-balancing before
creating the children. Other than that it looks OK.
> mkdir A
> echo 0-1 > A/cpuset.cpus
> echo 0 > A/cpuset.mems
> mkdir B
> echo 2-3 > B/cpuset.cpus
> echo 0 > B/cpuset.mems
>
>
> mount -t cgroup -o cpu none /dev/sched_domain
> cd /dev/sched_domain
> mkdir 1
> cat cpu.rt_runtime_us > 1/cpu.rt_runtime_us
> mkdir 1/2
> echo 33333 > 1/2/cpu.rt_runtime_us
> mkdir 1/3
> echo 33333 > 1/3/cpu.rt_runtime_us
> mkdir 1/4
> echo 33333 > 1/4/cpu.rt_runtime_us
>
> For example, setting the current shell to a specific cpuset(A) and
> sched(1/2):
>
> echo $$ > /dev/cpuset/A/tasks
> echo $$ > /dev/sched_domain/1/2/tasks
> "execute program"
* Re: cgroup, RT reservation per core(s)?
2009-02-10 19:53 ` Peter Zijlstra
@ 2009-02-11 11:33 ` Rolando Martins
2009-02-11 11:42 ` Peter Zijlstra
0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 11:33 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel
On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2009-02-10 at 17:32 +0000, Rolando Martins wrote:
> >
> > For helping others:
> >
> > mkdir /dev/cpuset
> > mount -t cgroup -o cpuset none /dev/cpuset
> > cd /dev/cpuset
> > echo 0 > cpuset.sched_load_balance
>
>
> I'm not quite sure that its allowed to disable load-balance before
> creating children. Other than that it looks ok.
>
>
> > mkdir A
> > echo 0-1 > A/cpuset.cpus
> > echo 0 > A/cpuset.mems
> > mkdir B
> > echo 2-3 > B/cpuset.cpus
> > echo 0 > B/cpuset.mems
> >
> >
> > mount -t cgroup -o cpu none /dev/sched_domain
> > cd /dev/sched_domain
> > mkdir 1
> > cat cpu.rt_runtime_us > 1/cpu.rt_runtime_us
> > mkdir 1/2
> > echo 33333 > 1/2/cpu.rt_runtime_us
> > mkdir 1/3
> > echo 33333 > 1/3/cpu.rt_runtime_us
> > mkdir 1/4
> > echo 33333 > 1/4/cpu.rt_runtime_us
> >
> > For example, setting the current shell to a specific cpuset(A) and
> > sched(1/2):
> >
> > echo $$ > /dev/cpuset/A/tasks
> > echo $$ > /dev/sched_domain/1/2/tasks
> > "execute program"
>
>
Hi again,
is there any way to have multiple "distinct" sched domains, i.e.:
mount -t cgroup -o cpu none /dev/sched_domain_0
... setup sched_domain_0 (ex: 90% RT, 10% Others)
mount -t cgroup -o cpu none /dev/sched_domain_1
... setup sched_domain_1 (ex: 20% RT, 80% Others)
Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
Thanks,
Rol
* Re: cgroup, RT reservation per core(s)?
2009-02-11 11:33 ` Rolando Martins
@ 2009-02-11 11:42 ` Peter Zijlstra
2009-02-11 11:53 ` Balbir Singh
0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 11:42 UTC (permalink / raw)
To: Rolando Martins
Cc: linux-kernel, Paul Menage, Balbir Singh, Srivatsa Vaddagiri
On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> Hi again,
>
> is there any way to have multiple "distinct" sched domains, i.e.:
> mount -t cgroup -o cpu none /dev/sched_domain_0
> .... setup sched_domain_0 (ex: 90% RT, 10% Others)
> mount -t cgroup -o cpu none /dev/sched_domain_1
> .... setup sched_domain_1 (ex: 20% RT, 80% Others)
> Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
Nope.
We currently only support a single instance of a cgroup controller.
I see the use for what you propose, however implementing that will be
'interesting'.
* Re: cgroup, RT reservation per core(s)?
2009-02-11 11:42 ` Peter Zijlstra
@ 2009-02-11 11:53 ` Balbir Singh
2009-02-11 12:00 ` Peter Zijlstra
2009-02-11 12:10 ` Rolando Martins
0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2009-02-11 11:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri
* Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
> On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
>
> > Hi again,
> >
> > is there any way to have multiple "distinct" sched domains, i.e.:
> > mount -t cgroup -o cpu none /dev/sched_domain_0
> > .... setup sched_domain_0 (ex: 90% RT, 10% Others)
> > mount -t cgroup -o cpu none /dev/sched_domain_1
> > .... setup sched_domain_1 (ex: 20% RT, 80% Others)
> > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
>
> Nope.
>
> We currently only support a single instance of a cgroup controller.
>
> I see the use for what you propose, however implementing that will be
> 'interesting'.
I am confused: if you use cpusets, you get your own sched_domain. If you
mount the cpuset and cpu controllers together, you'll get what you want.
Or is this a figment of my imagination? You might need to use exclusive
cpusets, though.
--
Balbir
* Re: cgroup, RT reservation per core(s)?
2009-02-11 11:53 ` Balbir Singh
@ 2009-02-11 12:00 ` Peter Zijlstra
2009-02-11 12:10 ` Rolando Martins
1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 12:00 UTC (permalink / raw)
To: balbir; +Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri
On Wed, 2009-02-11 at 17:23 +0530, Balbir Singh wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
>
> > On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> >
> > > Hi again,
> > >
> > > is there any way to have multiple "distinct" sched domains, i.e.:
> > > mount -t cgroup -o cpu none /dev/sched_domain_0
> > > .... setup sched_domain_0 (ex: 90% RT, 10% Others)
> > > mount -t cgroup -o cpu none /dev/sched_domain_1
> > > .... setup sched_domain_1 (ex: 20% RT, 80% Others)
> > > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
> >
> > Nope.
> >
> > We currently only support a single instance of a cgroup controller.
> >
> > I see the use for what you propose, however implementing that will be
> > 'interesting'.
>
> I am confused, if you cpusets, you get your own sched_domain. If you
> mount cpusets and cpu controller together, you'll get what you want.
> Is this a figment of my imagination. You might need to use exclusive
> CPUsets though.
afaiui he wants a cgroup hierarchy per exclusive sched domain.
* Re: cgroup, RT reservation per core(s)?
2009-02-11 11:53 ` Balbir Singh
2009-02-11 12:00 ` Peter Zijlstra
@ 2009-02-11 12:10 ` Rolando Martins
1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 12:10 UTC (permalink / raw)
To: balbir; +Cc: Peter Zijlstra, linux-kernel, Paul Menage, Srivatsa Vaddagiri
On 2/11/09, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
>
>
> > On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> >
> > > Hi again,
> > >
> > > is there any way to have multiple "distinct" sched domains, i.e.:
> > > mount -t cgroup -o cpu none /dev/sched_domain_0
> > > .... setup sched_domain_0 (ex: 90% RT, 10% Others)
> > > mount -t cgroup -o cpu none /dev/sched_domain_1
> > > .... setup sched_domain_1 (ex: 20% RT, 80% Others)
> > > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
> >
> > Nope.
> >
> > We currently only support a single instance of a cgroup controller.
> >
> > I see the use for what you propose, however implementing that will be
> > 'interesting'.
>
>
> I am confused, if you cpusets, you get your own sched_domain. If you
> mount cpusets and cpu controller together, you'll get what you want.
> Is this a figment of my imagination. You might need to use exclusive
> CPUsets though.
>
> --
>
> Balbir
>
I don't know if you meant the following situation (mounting cpuset and
cpu together):
                R
         -------|-------
         |             |
         A             B
 (80% RT, 20% others)  (100% RT, 0% others)
 (CPUs 0-2)            (CPU 3)
If so, we can't do this because of the restriction imposed by the
global rt_runtime_ns.
Perhaps a "feasible" solution could be implemented by having distinct
global rt_runtime_ns values, one for each cpu (i.e. rt_runtime_ns_0,
..., rt_runtime_ns_n):
                R
         -------|-------
         |             |
         A             B
 (80% RT, 20% others)  (100% RT, 0% others)
 (CPUs 0-2)            (CPU 3)
 capacity_used_cpu_0_rt = 0.8   capacity_used_cpu_3_rt = 1
 capacity_used_cpu_1_rt = 0.8
 capacity_used_cpu_2_rt = 0.8
For a given processor i, we would then have the global restriction
enforced: SUM(capacity_used_cpu_i_rt) < rt_runtime_i.
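The proposed per-cpu admission rule could be sketched as a small check (hypothetical helper for a hypothetical interface; capacities expressed as integer percentages of the period):

```shell
# Sketch of the per-cpu admission test: on each cpu i, the summed RT
# capacities of the groups using that cpu must stay within that cpu's
# limit. All values are integer percentages of the period.
admit_on_cpu() {
    limit_pct=$1
    shift
    total=0
    for g in "$@"; do
        total=$(( total + g ))
    done
    if [ "$total" -le "$limit_pct" ]; then echo admitted; else echo rejected; fi
}

admit_on_cpu 100 80 10   # 80% + 10% fits under 100% -> admitted
admit_on_cpu 100 80 30   # 110% exceeds the limit    -> rejected
```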
Rol
* Re: cgroup, RT reservation per core(s)?
2009-02-10 13:06 ` Peter Zijlstra
2009-02-10 14:46 ` Rolando Martins
@ 2009-03-03 12:58 ` Rolando Martins
1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-03-03 12:58 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel
On Tue, Feb 10, 2009 at 1:06 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
>> I should have elaborated this more:
>>
>>             root
>>         -----|-----
>>         |         |
>>         0         1  (100% rt, 0.5 mem)
>>    (0.5 mem)  ----|----
>>               |   |   |
>>               2   3   4  (33% rt for each group, 33% mem per group (0.165))
>> Rol
>
>
> Right, i think this can be done.
>
> You would indeed need cpusets and sched-cgroups.
>
> Split the machine in 2 using cpusets.
>
>    ___R___
>   /       \
>  A         B
>
> Where R is the root cpuset, and A and B are the siblings.
> Assign A one half the cpus, and B the other half.
> Disable load-balancing on R.
>
> Then using sched cgroups create the hierarchy
>
>    ____1____
>   /    |    \
>  2     3     4
>
> Where 1 can be the root group if you like.
>
> Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
> of 33% each.
>
> Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
> and sched group 1.
>
> Place your other tasks in B,{2-4} respectively.
>
> The reason this works is that bandwidth distribution is sched domain
> wide, and by disabling load-balancing on R, you split the schedule
> domain.
>
> I've never actually tried anything like this, let me know if it
> works ;-)
>
Just to confirm, cpuset.sched_load_balance doesn't work with RT, right?
You cannot have tasks from sub-domain 2 utilize the bandwidth of
sub-domain 3, right?
    __1__
   /     \
  2       3
(50% rt) (50% rt)
For my application domain ;) it would be interesting to have rt_runtime
be a minimum of allocated RT rather than a maximum.
E.g. if an application in domain 2 needs to go up to 100% and domain 3
is idle, it would be nice to let it utilize the full bandwidth.
(We could also have a hard upper limit in each sub-domain, like
hard_up=0.8, i.e. even if we could get 100%, we would only utilize 80%.)
Does this make sense?
Thread overview: 14+ messages
2009-02-09 19:30 cgroup, RT reservation per core(s)? Rolando Martins
2009-02-09 19:52 ` Peter Zijlstra
2009-02-09 20:04 ` Rolando Martins
2009-02-10 13:06 ` Peter Zijlstra
2009-02-10 14:46 ` Rolando Martins
2009-02-10 16:00 ` Peter Zijlstra
2009-02-10 17:32 ` Rolando Martins
2009-02-10 19:53 ` Peter Zijlstra
2009-02-11 11:33 ` Rolando Martins
2009-02-11 11:42 ` Peter Zijlstra
2009-02-11 11:53 ` Balbir Singh
2009-02-11 12:00 ` Peter Zijlstra
2009-02-11 12:10 ` Rolando Martins
2009-03-03 12:58 ` Rolando Martins