linux-kernel.vger.kernel.org archive mirror
* cgroup, RT reservation per core(s)?
@ 2009-02-09 19:30 Rolando Martins
  2009-02-09 19:52 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 19:30 UTC (permalink / raw)
  To: linux-kernel

Hi,
I would like, on a quad-core machine, to have 2 cores totally (100%)
dedicated to RT tasks (SCHED_RR and SCHED_FIFO) and the other 2 cores
with normal behavior, i.e. allowing SCHED_OTHER, SCHED_FIFO and
SCHED_RR tasks but still with an RT reservation. An example follows:


# Setup first domain (cpu 0,1)
mkdir /dev/cgroup/0
echo 0-1 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems

# Setup RT bandwidth for first domain (80% for RT, 20% others)
echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us


# Setup second domain (cpu 2,3)
mkdir /dev/cgroup/1
echo 2-3 > /dev/cgroup/1/cpuset.cpus
echo 0 > /dev/cgroup/1/cpuset.mems

# Setup RT bandwidth for second domain (100% for RT)
echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us

Is there any way to do this?

Thanks,
Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: cgroup, RT reservation per core(s)?
  2009-02-09 19:30 cgroup, RT reservation per core(s)? Rolando Martins
@ 2009-02-09 19:52 ` Peter Zijlstra
  2009-02-09 20:04   ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-09 19:52 UTC (permalink / raw)
  To: Rolando Martins; +Cc: linux-kernel

On Mon, 2009-02-09 at 19:30 +0000, Rolando Martins wrote:
> Hi,
> I would like to have in a quad-core, 2 cores totally (100%) dedicated
> to RT tasks (SCHED_RR and SCHED_FIFO) and the other 2 cores with
> normal behavior, better said,  allowing SCHED_OTHER & SCHED_FIFO &
> SCHED_RR tasks but still with a RT reservation. Follows an example:
> 
> 
> # Setup first domain (cpu 0,1)
> echo 0-1 > /dev/cgroup/0/cpuset.cpus
> echo 0 > /dev/cgroup/0/cpuset.mems
> 
> # Setup RT bandwidth for firstdomain (80% for RT, 20% others)
> echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
> echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us
> 
> 
> # Setup second domain (cpu 2,3)
> mkdir /dev/cgroup/1
> echo 2-3 > /dev/cgroup/1/cpuset.cpus
> echo 0 > /dev/cgroup/1/cpuset.mems
> 
> # Setup RT bandwidth for second domain (100% for RT)
> echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
> echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us
> 
> Is there anyway for doing this?

Nope, but why do you need bandwidth groups if all you want is a full
cpu?

Just the cpuset should be plenty.
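Peter's cpuset-only suggestion might look like the following untested sketch (the mount point and the group names `rt`/`other` are illustrative, not from the thread; assumes a 4-CPU box):

```shell
# Dedicate CPUs 2-3 to RT tasks using only cpusets, no bandwidth groups.
mkdir -p /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset

mkdir rt
echo 2-3 > rt/cpuset.cpus
echo 0   > rt/cpuset.mems
echo 1   > rt/cpuset.cpu_exclusive   # keep these CPUs out of sibling sets

mkdir other
echo 0-1 > other/cpuset.cpus
echo 0   > other/cpuset.mems

# Push existing tasks onto CPUs 0-1 (kernel threads may refuse to move),
# then place this shell -- and the RT tasks it spawns -- on CPUs 2-3.
for pid in $(cat tasks); do echo "$pid" > other/tasks 2>/dev/null; done
echo $$ > rt/tasks
```

Needs root and a kernel with CONFIG_CPUSETS; RT tasks started from that shell then compete only with each other on CPUs 2-3.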



* Re: cgroup, RT reservation per core(s)?
  2009-02-09 19:52 ` Peter Zijlstra
@ 2009-02-09 20:04   ` Rolando Martins
  2009-02-10 13:06     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 20:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

Thanks for the quick reply.

On Mon, Feb 9, 2009 at 7:52 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 19:30 +0000, Rolando Martins wrote:
>> Hi,
>> I would like to have in a quad-core, 2 cores totally (100%) dedicated
>> to RT tasks (SCHED_RR and SCHED_FIFO) and the other 2 cores with
>> normal behavior, better said,  allowing SCHED_OTHER & SCHED_FIFO &
>> SCHED_RR tasks but still with a RT reservation. Follows an example:
>>
>>
>> # Setup first domain (cpu 0,1)
>> echo 0-1 > /dev/cgroup/0/cpuset.cpus
>> echo 0 > /dev/cgroup/0/cpuset.mems
>>
>> # Setup RT bandwidth for firstdomain (80% for RT, 20% others)
>> echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
>> echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us
>>
>>
>> # Setup second domain (cpu 2,3)
>> mkdir /dev/cgroup/1
>> echo 2-3 > /dev/cgroup/1/cpuset.cpus
>> echo 0 > /dev/cgroup/1/cpuset.mems
>>
>> # Setup RT bandwidth for second domain (100% for RT)
>> echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
>> echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us
>>
>> Is there anyway for doing this?
>
> Nope, but why do you need bandwidth groups if all you want is a full
> cpu?
>
> Just the cpuset should be plenty
You have a point;)
>
>
I should have elaborated this more:

                 root
              ----|----
              |       |
              0       1
         (0.5 mem)  (100% rt, 0.5 mem)
                  ---------
                  |   |   |
                  2   3   4
         (33% rt per group, 33% of 0.5 mem per group (0.165))
Rol


* Re: cgroup, RT reservation per core(s)?
  2009-02-09 20:04   ` Rolando Martins
@ 2009-02-10 13:06     ` Peter Zijlstra
  2009-02-10 14:46       ` Rolando Martins
  2009-03-03 12:58       ` Rolando Martins
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 13:06 UTC (permalink / raw)
  To: Rolando Martins; +Cc: linux-kernel

On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:

> I should have elaborated this more:
> 
>                  root
>               ----|----
>               |       |
>               0       1
>          (0.5 mem)  (100% rt, 0.5 mem)
>                   ---------
>                   |   |   |
>                   2   3   4
>          (33% rt per group, 33% of 0.5 mem per group (0.165))
> Rol


Right, I think this can be done.

You would indeed need cpusets and sched-cgroups.

Split the machine in 2 using cpusets.

   ___R___
  /       \
 A         B

Where R is the root cpuset, and A and B are the siblings.
Assign A one half the cpus, and B the other half.
Disable load-balancing on R.

Then using sched cgroups create the hierarchy

  ____1____
 /    |    \
2     3     4

Where 1 can be the root group if you like.

Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
of 33% each.

Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
and sched group 1.

Place your other tasks in B,{2-4} respectively.

The reason this works is that bandwidth distribution is sched-domain
wide, and by disabling load-balancing on R, you split the sched
domain.

I've never actually tried anything like this, let me know if it
works ;-)
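Peter's steps, transcribed into shell as an untested sketch (mount points, group names, and the 950000us figure -- the kernel's default global RT budget -- are assumptions, not from his mail; 4-CPU machine):

```shell
# 1. Split the machine with cpusets and disable load-balancing on the
#    root, so each half becomes its own sched domain.
mkdir -p /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
mkdir /dev/cpuset/A /dev/cpuset/B
echo 0-1 > /dev/cpuset/A/cpuset.cpus
echo 0   > /dev/cpuset/A/cpuset.mems
echo 2-3 > /dev/cpuset/B/cpuset.cpus
echo 0   > /dev/cpuset/B/cpuset.mems
echo 0   > /dev/cpuset/cpuset.sched_load_balance

# 2. Build the 1/{2,3,4} hierarchy with the cpu controller; group 1 gets
#    the full RT budget, groups 2-4 roughly a third each (their sum must
#    stay within the parent's budget).
mkdir -p /dev/sched
mount -t cgroup -o cpu none /dev/sched
mkdir /dev/sched/1 /dev/sched/1/2 /dev/sched/1/3 /dev/sched/1/4
echo 950000 > /dev/sched/1/cpu.rt_runtime_us
for g in 2 3 4; do
    echo 316666 > /dev/sched/1/$g/cpu.rt_runtime_us
done
```

Tasks are then placed by writing their PIDs into the `tasks` file of the chosen cpuset and sched group.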


* Re: cgroup, RT reservation per core(s)?
  2009-02-10 13:06     ` Peter Zijlstra
@ 2009-02-10 14:46       ` Rolando Martins
  2009-02-10 16:00         ` Peter Zijlstra
  2009-03-03 12:58       ` Rolando Martins
  1 sibling, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 14:46 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
>  > I should have elaborated this more:
>  >
>  >                      root
>  >                   ----|----
>  >                   |          |
>  > (0.5 mem) 0         1 (100% rt, 0.5 mem)
>  >                          ---------
>  >                          |    |    |
>  >                          2   3   4  (33% rt for each group, 33% mem
>  > per group(0.165))
>  > Rol
>
>
>
> Right, i think this can be done.
>
>  You would indeed need cpusets and sched-cgroups.
>
>  Split the machine in 2 using cpusets.
>
>    ___R___
>   /       \
>   A         B
>
>  Where R is the root cpuset, and A and B are the siblings.
>  Assign A one half the cpus, and B the other half.
>  Disable load-balancing on R.
>
>  Then using sched cgroups create the hierarchy
>
>   ____1____
>   /    |    \
>  2     3     4
>
>  Where 1 can be the root group if you like.
>
>  Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
>  of 33% each.
>
>  Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
>  and sched group 1.
>
>  Place your other tasks in B,{2-4} respectively.
>
>  The reason this works is that bandwidth distribution is sched domain
>  wide, and by disabling load-balancing on R, you split the schedule
>  domain.
>
>  I've never actually tried anything like this, let me know if it
>  works ;-)
>

Thanks Peter, it works!
I am thinking about different strategies to use in my RT middleware
project, and I think there is a limitation.
If I wanted to have some RT in the B cpuset, I couldn't, because I
assigned A.cpu.rt_runtime_us = root.cpu.rt_runtime_us (then subdivided
the A cpuset into 2, 3, 4, each one having A.cpu.rt_runtime_us/3).

This happens because there is a global /proc/sys/kernel/sched_rt_runtime_us and
/proc/sys/kernel/sched_rt_period_us.
What do you think about adding a separate tuple (runtime,period) for
each core/cpu?

In this case:
/proc/sys/kernel/sched_rt_runtime_us_0
/proc/sys/kernel/sched_rt_period_us_0
...
/proc/sys/kernel/sched_rt_runtime_us_n (where n = cpu count)
/proc/sys/kernel/sched_rt_period_us_n


Given this, we could do the following:

mkdir /dev/cgroup/A
echo 0-1 > /dev/cgroup/A/cpuset.cpus
echo 0 > /dev/cgroup/A/cpuset.mems
echo 1000000 > /dev/cgroup/A/cpu.rt_period_us
echo 1000000 > /dev/cgroup/A/cpu.rt_runtime_us

This would only work if we could allocate
(cpu.rt_runtime_us, cpu.rt_period_us) on both CPU 0 and CPU 1;
otherwise it would fail.

mkdir /dev/cgroup/B
echo 2-3 > /dev/cgroup/B/cpuset.cpus
echo 0 > /dev/cgroup/B/cpuset.mems
echo 1000000 > /dev/cgroup/B/cpu.rt_period_us
echo 800000 > /dev/cgroup/B/cpu.rt_runtime_us
The same here: fail if we couldn't allocate 0.8 on both CPU 2 and CPU 3.

Does this make sense? ;)

Rol


* Re: cgroup, RT reservation per core(s)?
  2009-02-10 14:46       ` Rolando Martins
@ 2009-02-10 16:00         ` Peter Zijlstra
  2009-02-10 17:32           ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 16:00 UTC (permalink / raw)
  To: Rolando Martins; +Cc: linux-kernel

On Tue, 2009-02-10 at 14:46 +0000, Rolando Martins wrote:
> 
> >  I've never actually tried anything like this, let me know if it
> >  works ;-)
> >
> 
> Thanks Peter, it works!

> I am thinking for different strategies to be used in my rt middleware
> project, and I think there is a limitation.
> If I wanted to have some RT on the B cpuset, I couldn't because I
> assigned A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (then subdivided
> the A cpuset, with 2,3,4, each one having A.cpu.rt_runtime_ns/3).

Try it, you can run RT proglets in B.

You get n*utilization per schedule domain, where n is the number of cpus
in it.

So you still have 200% left in B, even if you use 200% of A.
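As a back-of-the-envelope check of the n*utilization rule (a sketch; 950000/1000000 are the kernel's default sched_rt_runtime_us/sched_rt_period_us, so with the default 95% global cap a 2-CPU domain actually offers 190% rather than a full 200%):

```shell
# RT bandwidth available in one sched domain:
#   n_cpus * (rt_runtime / rt_period)
runtime=950000   # default /proc/sys/kernel/sched_rt_runtime_us
period=1000000   # default /proc/sys/kernel/sched_rt_period_us
ncpus=2          # CPUs in sched domain B

echo "$(( ncpus * runtime * 100 / period ))%"   # prints 190%
```

Raising the global runtime to equal the period would recover the full 200%.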

> This happens because there is a
> global /proc/sys/kernel/sched_rt_runtime_us and
> /proc/sys/kernel/sched_rt_period_us.

These globals don't actually do much (except provide a global cap) in
the cgroup case.

> What do you think about adding a separate tuple (runtime,period) for
> each core/cpu?

> Does this make sense? ;)

That's going to give me a horrible headache trying to load-balance
stuff.


* Re: cgroup, RT reservation per core(s)?
  2009-02-10 16:00         ` Peter Zijlstra
@ 2009-02-10 17:32           ` Rolando Martins
  2009-02-10 19:53             ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 17:32 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2009-02-10 at 14:46 +0000, Rolando Martins wrote:
>  >
>  > >  I've never actually tried anything like this, let me know if it
>  > >  works ;-)
>  > >
>  >
>  > Thanks Peter, it works!
>
>  > I am thinking for different strategies to be used in my rt middleware
>  > project, and I think there is a limitation.
>  > If I wanted to have some RT on the B cpuset, I couldn't because I
>  > assigned A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (then subdivided
>  > the A cpuset, with 2,3,4, each one having A.cpu.rt_runtime_ns/3).
>
>
> Try it, you can run RT proglets in B.
>
>  You get n*utilization per schedule domain, where n is the number of cpus
>  in it.
>
>  So you still have 200% left in B, even if you use 200% of A.
>
>
>  > This happens because there is a
>  > global /proc/sys/kernel/sched_rt_runtime_us and
>  > /proc/sys/kernel/sched_rt_period_us.
>
>
> These globals don't actually do much (except provide a global cap) in
>  the cgroup case.
>
>
>  > What do you think about adding a separate tuple (runtime,period) for
>  > each core/cpu?
>
>
> > Does this make sense? ;)
>
>  That's going to give me a horrible head-ache trying to load-balance
>  stuff.
>
Sorry Peter, I didn't think before typing ;)
I was looking at cgroups as a more integrated (rigid ;)) infrastructure,
and therefore using only one mount point for all the operations... :x

Now I got everything working properly! Thanks for the support.

For helping others:

mkdir /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset
echo 0 > cpuset.sched_load_balance
mkdir A
echo 0-1 > A/cpuset.cpus
echo 0 > A/cpuset.mems
mkdir B
echo 2-3 > B/cpuset.cpus
echo 0 > B/cpuset.mems


mkdir /dev/sched_domain
mount -t cgroup -o cpu none /dev/sched_domain
cd /dev/sched_domain
mkdir 1
cat cpu.rt_runtime_us > 1/cpu.rt_runtime_us   # give group 1 the root group's full RT budget
mkdir 1/2
echo 316666 > 1/2/cpu.rt_runtime_us           # ~1/3 of the default 950000us budget each
mkdir 1/3
echo 316666 > 1/3/cpu.rt_runtime_us
mkdir 1/4
echo 316666 > 1/4/cpu.rt_runtime_us

For example, setting the current shell to a specific cpuset (A) and sched group (1/2):

echo $$ > /dev/cpuset/A/tasks
echo $$ > /dev/sched_domain/1/2/tasks
"execute program"
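The final "execute program" step could be an RT launch via chrt (from util-linux); the program name here is made up:

```shell
# Launch under SCHED_FIFO at priority 50; the task inherits the shell's
# cpuset (A) and sched group (1/2) set up by the two echos above.
chrt --fifo 50 ./my_rt_program
```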


Peter, can you confirm this code? ;)

Thanks!
Rol


* Re: cgroup, RT reservation per core(s)?
  2009-02-10 17:32           ` Rolando Martins
@ 2009-02-10 19:53             ` Peter Zijlstra
  2009-02-11 11:33               ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 19:53 UTC (permalink / raw)
  To: Rolando Martins; +Cc: linux-kernel

On Tue, 2009-02-10 at 17:32 +0000, Rolando Martins wrote:
> 
> For helping others:
> 
> mkdir /dev/cpuset
> mount -t cgroup -o cpuset none /dev/cpuset
> cd /dev/cpuset
> echo 0 > cpuset.sched_load_balance

I'm not quite sure that it's allowed to disable load-balancing before
creating children. Other than that it looks ok.

> mkdir A
> echo 0-1 > A/cpuset.cpus
> echo 0 > A/cpuset.mems
> mkdir B
> echo 2-3 > B/cpuset.cpus
> echo 0 > B/cpuset.mems
> 
> 
> mount -t cgroup -o cpu none /dev/sched_domain
> cd /dev/sched_domain
> mkdir 1
> echo cpu.rt_runtime_ns > 1/cpu.rt_runtime_ns
> mkdir 1/2
> echo 33333 > 1/2/cpu.rt_runtime_ns
> mkdir 1/3
> echo 33333 > 1/3/cpu.rt_runtime_ns
> mkdir 1/4
> echo 33333 > 1/3/cpu.rt_runtime_ns
> 
> For example, setting the current shell to a specific cpuset(A) and
> sched(1/2):
> 
> echo $$ > /dev/cpuset/A/tasks
> echo $$ > /dev/sched_domain/1/2/tasks
> "execute program"






* Re: cgroup, RT reservation per core(s)?
  2009-02-10 19:53             ` Peter Zijlstra
@ 2009-02-11 11:33               ` Rolando Martins
  2009-02-11 11:42                 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 11:33 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2009-02-10 at 17:32 +0000, Rolando Martins wrote:
>  >
>  > For helping others:
>  >
>  > mkdir /dev/cpuset
>  > mount -t cgroup -o cpuset none /dev/cpuset
>  > cd /dev/cpuset
>  > echo 0 > cpuset.sched_load_balance
>
>
> I'm not quite sure that its allowed to disable load-balance before
>  creating children. Other than that it looks ok.
>
>
>  > mkdir A
>  > echo 0-1 > A/cpuset.cpus
>  > echo 0 > A/cpuset.mems
>  > mkdir B
>  > echo 2-3 > B/cpuset.cpus
>  > echo 0 > B/cpuset.mems
>  >
>  >
>  > mount -t cgroup -o cpu none /dev/sched_domain
>  > cd /dev/sched_domain
>  > mkdir 1
>  > echo cpu.rt_runtime_ns > 1/cpu.rt_runtime_ns
>  > mkdir 1/2
>  > echo 33333 > 1/2/cpu.rt_runtime_ns
>  > mkdir 1/3
>  > echo 33333 > 1/3/cpu.rt_runtime_ns
>  > mkdir 1/4
>  > echo 33333 > 1/3/cpu.rt_runtime_ns
>  >
>  > For example, setting the current shell to a specific cpuset(A) and
>  > sched(1/2):
>  >
>  > echo $$ > /dev/cpuset/A/tasks
>  > echo $$ > /dev/sched_domain/1/2/tasks
>  > "execute program"
>
>

Hi again,

is there any way to have multiple "distinct" sched domains, i.e.:
mount -t cgroup -o cpu none /dev/sched_domain_0
... setup sched_domain_0 (ex:  90% RT, 10% Others)
mount -t cgroup -o cpu none /dev/sched_domain_1
... setup sched_domain_1 (ex:  20% RT, 80% Others)
Then give sched_domain_0 to cpuset A and sched_domain_1 to B?

Thanks,
Rol


* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:33               ` Rolando Martins
@ 2009-02-11 11:42                 ` Peter Zijlstra
  2009-02-11 11:53                   ` Balbir Singh
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 11:42 UTC (permalink / raw)
  To: Rolando Martins
  Cc: linux-kernel, Paul Menage, Balbir Singh, Srivatsa Vaddagiri

On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:

> Hi again,
> 
> is there any way to have multiple "distinct" sched domains, i.e.:
> mount -t cgroup -o cpu none /dev/sched_domain_0
> .... setup sched_domain_0 (ex:  90% RT, 10% Others)
> mount -t cgroup -o cpu none /dev/sched_domain_1
> .... setup sched_domain_1 (ex:  20% RT, 80% Others)
> Then give sched_domain_0 to cpuset A and sched_domain_1 to B?

Nope.

We currently only support a single instance of a cgroup controller.

I see the use for what you propose, however implementing that will be
'interesting'.


* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:42                 ` Peter Zijlstra
@ 2009-02-11 11:53                   ` Balbir Singh
  2009-02-11 12:00                     ` Peter Zijlstra
  2009-02-11 12:10                     ` Rolando Martins
  0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2009-02-11 11:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri

* Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:

> On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> 
> > Hi again,
> > 
> > is there any way to have multiple "distinct" sched domains, i.e.:
> > mount -t cgroup -o cpu none /dev/sched_domain_0
> > .... setup sched_domain_0 (ex:  90% RT, 10% Others)
> > mount -t cgroup -o cpu none /dev/sched_domain_1
> > .... setup sched_domain_1 (ex:  20% RT, 80% Others)
> > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
> 
> Nope.
> 
> We currently only support a single instance of a cgroup controller.
> 
> I see the use for what you propose, however implementing that will be
> 'interesting'.

I am confused: if you use cpusets, you get your own sched_domain. If you
mount the cpuset and cpu controllers together, you'll get what you want.
Or is this a figment of my imagination? You might need to use exclusive
cpusets though.

-- 
	Balbir


* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:53                   ` Balbir Singh
@ 2009-02-11 12:00                     ` Peter Zijlstra
  2009-02-11 12:10                     ` Rolando Martins
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 12:00 UTC (permalink / raw)
  To: balbir; +Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri

On Wed, 2009-02-11 at 17:23 +0530, Balbir Singh wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
> 
> > On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> > 
> > > Hi again,
> > > 
> > > is there any way to have multiple "distinct" sched domains, i.e.:
> > > mount -t cgroup -o cpu none /dev/sched_domain_0
> > > .... setup sched_domain_0 (ex:  90% RT, 10% Others)
> > > mount -t cgroup -o cpu none /dev/sched_domain_1
> > > .... setup sched_domain_1 (ex:  20% RT, 80% Others)
> > > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
> > 
> > Nope.
> > 
> > We currently only support a single instance of a cgroup controller.
> > 
> > I see the use for what you propose, however implementing that will be
> > 'interesting'.
> 
> I am confused, if you cpusets, you get your own sched_domain. If you
> mount cpusets and cpu controller together, you'll get what you want.
> Is this a figment of my imagination. You might need to use exclusive
> CPUsets though.

afaiui, he wants a cgroup hierarchy per exclusive sched domain.


* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:53                   ` Balbir Singh
  2009-02-11 12:00                     ` Peter Zijlstra
@ 2009-02-11 12:10                     ` Rolando Martins
  1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 12:10 UTC (permalink / raw)
  To: balbir; +Cc: Peter Zijlstra, linux-kernel, Paul Menage, Srivatsa Vaddagiri

On 2/11/09, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
>
>
>  > On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
>  >
>  > > Hi again,
>  > >
>  > > is there any way to have multiple "distinct" sched domains, i.e.:
>  > > mount -t cgroup -o cpu none /dev/sched_domain_0
>  > > .... setup sched_domain_0 (ex:  90% RT, 10% Others)
>  > > mount -t cgroup -o cpu none /dev/sched_domain_1
>  > > .... setup sched_domain_1 (ex:  20% RT, 80% Others)
>  > > Then give sched_domain_0 to cpuset A and sched_domain_1 to B?
>  >
>  > Nope.
>  >
>  > We currently only support a single instance of a cgroup controller.
>  >
>  > I see the use for what you propose, however implementing that will be
>  > 'interesting'.
>
>
> I am confused, if you cpusets, you get your own sched_domain. If you
>  mount cpusets and cpu controller together, you'll get what you want.
>  Is this a figment of my imagination. You might need to use exclusive
>  CPUsets though.
>
>  --
>
>         Balbir
>
I don't know if you meant the following situation (mounting cpuset and
cpu together):

                     R
              --------------
              |            |
              A            B
   (80% RT, 20% others)  (100% RT, 0% others)
   (CPUs 0-2)            (CPU 3)

If so, we can't do this because of the restriction imposed by the
global rt_runtime_us.
Perhaps a "feasible" solution could be implemented by having distinct
global rt_runtime_us values, one for each cpu (i.e. rt_runtime_us_0;
...; rt_runtime_us_n):

                     R
              --------------
              |            |
              A            B
   (80% RT, 20% others)  (100% RT, 0% others)
   (CPUs 0-2)            (CPU 3)
capacity_used_cpu_0_rt = 0.8        capacity_used_cpu_3_rt = 1
capacity_used_cpu_1_rt = 0.8
capacity_used_cpu_2_rt = 0.8

For each processor i, the global restriction is enforced:
SUM(capacity_used_cpu_i_rt) <= rt_runtime_i
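The per-CPU admission test this proposal implies might be sketched as follows (a pure illustration of a hypothetical interface; none of these knobs exist in the kernel):

```shell
# Hypothetical per-CPU RT admission check: a new reservation on CPU i is
# accepted only while the summed runtime stays within rt_runtime_i.
rt_runtime_3=1000000    # hypothetical per-CPU budget for CPU 3, in us
used_3=800000           # runtime already reserved on CPU 3
request=300000          # new reservation being requested

if [ $(( used_3 + request )) -le "$rt_runtime_3" ]; then
    echo admitted
else
    echo rejected       # here: 1100000 > 1000000
fi
```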

Rol


* Re: cgroup, RT reservation per core(s)?
  2009-02-10 13:06     ` Peter Zijlstra
  2009-02-10 14:46       ` Rolando Martins
@ 2009-03-03 12:58       ` Rolando Martins
  1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-03-03 12:58 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On Tue, Feb 10, 2009 at 1:06 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
>
>> I should have elaborated this more:
>>
>>                  root
>>               ----|----
>>               |       |
>>               0       1
>>          (0.5 mem)  (100% rt, 0.5 mem)
>>                   ---------
>>                   |   |   |
>>                   2   3   4
>>          (33% rt per group, 33% of 0.5 mem per group (0.165))
>> Rol
>
>
> Right, i think this can be done.
>
> You would indeed need cpusets and sched-cgroups.
>
> Split the machine in 2 using cpusets.
>
>   ___R___
>  /       \
>  A         B
>
> Where R is the root cpuset, and A and B are the siblings.
> Assign A one half the cpus, and B the other half.
> Disable load-balancing on R.
>
> Then using sched cgroups create the hierarchy
>
>  ____1____
>  /    |    \
> 2     3     4
>
> Where 1 can be the root group if you like.
>
> Assign 1 a utilization limit of 100%, and 2,3 and 4 a utilization limit
> of 33% each.
>
> Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
> and sched group 1.
>
> Place your other tasks in B,{2-4} respectively.
>
> The reason this works is that bandwidth distribution is sched domain
> wide, and by disabling load-balancing on R, you split the schedule
> domain.
>
> I've never actually tried anything like this, let me know if it
> works ;-)
>

Just to confirm: cpuset.sched_load_balance doesn't work with RT, right?
That is, tasks in sub-domain 2 cannot utilize the bandwidth of
sub-domain 3, right?

            __1__
           /     \
          2       3
      (50% rt) (50% rt)

For my application domain ;) it would be interesting to have
rt_runtime_us as a minimum of allocated RT, not a maximum.
E.g. if an application in domain 2 needs to go up to 100% and domain 3
is idle, it would be cool to let it utilize the full bandwidth.
(We could also have a hard upper limit in each sub-domain, like
hard_up=0.8, i.e. even if we could get 100%, we would only utilize
80%.)

Does this make sense?

