* cgroup, RT reservation per core(s)?
@ 2009-02-09 19:30 Rolando Martins
  2009-02-09 19:52 ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 19:30 UTC (permalink / raw)
To: linux-kernel

Hi,
On a quad-core machine I would like to dedicate 2 cores entirely (100%)
to RT tasks (SCHED_RR and SCHED_FIFO), while the other 2 cores behave
normally, i.e. they run SCHED_OTHER as well as SCHED_FIFO and SCHED_RR
tasks but still carry an RT reservation. For example:

# Set up the first domain (cpus 0,1)
echo 0-1 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems

# Set up RT bandwidth for the first domain (80% RT, 20% others)
echo 1000000 > /dev/cgroup/0/cpu.rt_period_us
echo 800000 > /dev/cgroup/0/cpu.rt_runtime_us

# Set up the second domain (cpus 2,3)
mkdir /dev/cgroup/1
echo 2-3 > /dev/cgroup/1/cpuset.cpus
echo 0 > /dev/cgroup/1/cpuset.mems

# Set up RT bandwidth for the second domain (100% RT)
echo 1000000 > /dev/cgroup/1/cpu.rt_period_us
echo 1000000 > /dev/cgroup/1/cpu.rt_runtime_us

Is there any way to do this?

Thanks,
Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-09 19:30 cgroup, RT reservation per core(s)? Rolando Martins
@ 2009-02-09 19:52 ` Peter Zijlstra
  2009-02-09 20:04   ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-09 19:52 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel

On Mon, 2009-02-09 at 19:30 +0000, Rolando Martins wrote:
> Hi,
> On a quad-core machine I would like to dedicate 2 cores entirely
> (100%) to RT tasks (SCHED_RR and SCHED_FIFO), while the other 2 cores
> behave normally, i.e. they run SCHED_OTHER as well as SCHED_FIFO and
> SCHED_RR tasks but still carry an RT reservation. For example:
>
> [... setup example snipped ...]
>
> Is there any way to do this?

Nope, but why do you need bandwidth groups if all you want is a full
cpu?

Just the cpuset should be plenty.

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-09 19:52 ` Peter Zijlstra
@ 2009-02-09 20:04   ` Rolando Martins
  2009-02-10 13:06     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-09 20:04 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel

Thanks for the quick reply.

On Mon, Feb 9, 2009 at 7:52 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Nope, but why do you need bandwidth groups if all you want is a full
> cpu?
>
> Just the cpuset should be plenty.

You have a point ;)

I should have elaborated this more:

               root
           _____|_____
          |           |
          0           1   (100% rt, 0.5 mem)
      (0.5 mem)   ____|____
                 |    |    |
                 2    3    4   (33% rt for each group,
                                33% mem per group (0.165))

Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-09 20:04 ` Rolando Martins
@ 2009-02-10 13:06   ` Peter Zijlstra
  2009-02-10 14:46     ` Rolando Martins
  2009-03-03 12:58     ` Rolando Martins
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 13:06 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel

On Mon, 2009-02-09 at 20:04 +0000, Rolando Martins wrote:
> I should have elaborated this more:
>
> [... hierarchy diagram snipped ...]

Right, I think this can be done.

You would indeed need cpusets and sched-cgroups.

Split the machine in 2 using cpusets:

   ___R___
  /       \
 A         B

where R is the root cpuset, and A and B are the siblings.
Assign A one half of the cpus, and B the other half.
Disable load-balancing on R.

Then, using sched cgroups, create the hierarchy

   ____1____
  /    |    \
 2     3     4

where 1 can be the root group if you like.

Assign 1 a utilization limit of 100%, and 2, 3 and 4 a utilization
limit of 33% each.

Then place the tasks that get 100% cputime on your 2 cpus in cpuset A
and sched group 1.

Place your other tasks in B,{2-4} respectively.

The reason this works is that bandwidth distribution is sched-domain
wide, and by disabling load-balancing on R you split the schedule
domain.

I've never actually tried anything like this, let me know if it
works ;-)

^ permalink raw reply	[flat|nested] 14+ messages in thread
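[Editor's note: the recipe above can be written out as a shell sequence. This is a sketch only; it assumes a 4-cpu box with the cgroup-v1 interface, and the mount point /dev/cgroup_cpu, the cpu.rt_*_us file names, and the 333333 value are illustrative choices, not from the thread.]

```sh
# Split the machine in two with cpusets; disabling balancing at the
# root splits the schedule domain, which is what makes this work.
mkdir -p /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset
echo 0 > cpuset.sched_load_balance

mkdir A B
echo 0-1 > A/cpuset.cpus    # cpuset A: cpus 0 and 1
echo 0   > A/cpuset.mems
echo 2-3 > B/cpuset.cpus    # cpuset B: cpus 2 and 3
echo 0   > B/cpuset.mems

# Build the sched-cgroup hierarchy 1/{2,3,4} on a separate mount.
mkdir -p /dev/cgroup_cpu
mount -t cgroup -o cpu none /dev/cgroup_cpu
cd /dev/cgroup_cpu
mkdir -p 1/2 1/3 1/4
echo 1000000 > 1/cpu.rt_period_us      # group 1: full utilization
echo 1000000 > 1/cpu.rt_runtime_us
for g in 2 3 4; do                     # groups 2-4: a third each
    echo 1000000 > 1/$g/cpu.rt_period_us
    echo 333333  > 1/$g/cpu.rt_runtime_us
done
```

Tasks would then be attached by writing their pids to the respective `tasks` files, as shown later in the thread.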
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 13:06 ` Peter Zijlstra
@ 2009-02-10 14:46   ` Rolando Martins
  2009-02-10 16:00     ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 14:46 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> [... setup recipe snipped ...]
>
> I've never actually tried anything like this, let me know if it
> works ;-)

Thanks Peter, it works!

I am thinking about different strategies to be used in my RT middleware
project, and I think there is a limitation. If I wanted to have some RT
on the B cpuset, I couldn't, because I assigned
A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (and then subdivided the A
cpuset among 2, 3 and 4, each one getting A.cpu.rt_runtime_ns/3). This
happens because there is a global /proc/sys/kernel/sched_rt_runtime_us
and /proc/sys/kernel/sched_rt_period_us.

What do you think about adding a separate tuple (runtime, period) for
each core/cpu? In this case:

/proc/sys/kernel/sched_rt_runtime_us_0
/proc/sys/kernel/sched_rt_period_us_0
...
/proc/sys/kernel/sched_rt_runtime_us_n   (n = cpu count)
/proc/sys/kernel/sched_rt_period_us_n

Given this, we could do the following:

mkdir /dev/cgroup/A
echo 0-1 > /dev/cgroup/A/cpuset.cpus
echo 0 > /dev/cgroup/A/cpuset.mems
echo 1000000 > /dev/cgroup/A/cpu.rt_period_us
echo 1000000 > /dev/cgroup/A/cpu.rt_runtime_us

This would only work if we could allocate
(cpu.rt_runtime_us, cpu.rt_period_us) on both CPU 0 and CPU 1;
otherwise it would fail.

mkdir /dev/cgroup/B
echo 2-3 > /dev/cgroup/B/cpuset.cpus
echo 0 > /dev/cgroup/B/cpuset.mems
echo 1000000 > /dev/cgroup/B/cpu.rt_period_us
echo 800000 > /dev/cgroup/B/cpu.rt_runtime_us

The same here: fail if we couldn't allocate 0.8 on both CPU 2 and
CPU 3.

Does this make sense? ;)

Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 14:46 ` Rolando Martins
@ 2009-02-10 16:00   ` Peter Zijlstra
  2009-02-10 17:32     ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 16:00 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel

On Tue, 2009-02-10 at 14:46 +0000, Rolando Martins wrote:
> Thanks Peter, it works!
>
> I am thinking about different strategies to be used in my RT middleware
> project, and I think there is a limitation. If I wanted to have some RT
> on the B cpuset, I couldn't, because I assigned
> A.cpu.rt_runtime_ns = root.cpu.rt_runtime_ns (and then subdivided the A
> cpuset among 2, 3 and 4, each one getting A.cpu.rt_runtime_ns/3).

Try it, you can run RT proglets in B.

You get n*utilization per schedule domain, where n is the number of
cpus in it.

So you still have 200% left in B, even if you use 200% of A.

> This happens because there is a global
> /proc/sys/kernel/sched_rt_runtime_us and
> /proc/sys/kernel/sched_rt_period_us.

These globals don't actually do much (except provide a global cap) in
the cgroup case.

> What do you think about adding a separate tuple (runtime, period) for
> each core/cpu?
>
> Does this make sense? ;)

That's going to give me a horrible head-ache trying to load-balance
stuff.

^ permalink raw reply	[flat|nested] 14+ messages in thread
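[Editor's note: Peter's "n*utilization per schedule domain" point can be illustrated with a little arithmetic. This is a sketch, not kernel code; it assumes the usual sched_rt_runtime_us/sched_rt_period_us defaults of 950000/1000000, and the helper name is invented.]

```sh
# RT bandwidth pooled in one schedule domain, as a percentage of one
# cpu: n cpus contribute n * rt_runtime_us / rt_period_us.
domain_rt_capacity() {
    n_cpus=$1; runtime=${2:-950000}; period=${3:-1000000}
    echo $(( n_cpus * runtime * 100 / period ))
}

# Splitting a quad-core into two 2-cpu domains, as in the A/B setup:
domain_rt_capacity 2                    # prints 190 (1.9 cpus of RT in A)
domain_rt_capacity 2                    # prints 190 (and 1.9 cpus in B)
# Consuming all of A's pool leaves B's pool untouched, because the
# bandwidth is distributed per schedule domain, not machine-wide.
```

With the global cap lifted to 100% (runtime == period), each 2-cpu domain would pool a full 200%.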
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 16:00 ` Peter Zijlstra
@ 2009-02-10 17:32   ` Rolando Martins
  2009-02-10 19:53     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-10 17:32 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> That's going to give me a horrible head-ache trying to load-balance
> stuff.

Sorry Peter, I didn't think before typing ;) I was looking at cgroups
as a more integrated (rigid ;)) infrastructure, and was therefore
using only a single mount point for all the operations... :x

Now I have everything working properly! Thanks for the support.

To help others:

mkdir /dev/cpuset
mount -t cgroup -o cpuset none /dev/cpuset
cd /dev/cpuset
echo 0 > cpuset.sched_load_balance
mkdir A
echo 0-1 > A/cpuset.cpus
echo 0 > A/cpuset.mems
mkdir B
echo 2-3 > B/cpuset.cpus
echo 0 > B/cpuset.mems

mkdir /dev/sched_domain
mount -t cgroup -o cpu none /dev/sched_domain
cd /dev/sched_domain
mkdir 1
cat cpu.rt_runtime_ns > 1/cpu.rt_runtime_ns
mkdir 1/2
echo 33333 > 1/2/cpu.rt_runtime_ns
mkdir 1/3
echo 33333 > 1/3/cpu.rt_runtime_ns
mkdir 1/4
echo 33333 > 1/4/cpu.rt_runtime_ns

For example, to pin the current shell to a specific cpuset (A) and
sched group (1/2):

echo $$ > /dev/cpuset/A/tasks
echo $$ > /dev/sched_domain/1/2/tasks
"execute program"

Peter, can you confirm this code? ;)

Thanks!
Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 17:32 ` Rolando Martins
@ 2009-02-10 19:53   ` Peter Zijlstra
  2009-02-11 11:33     ` Rolando Martins
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-10 19:53 UTC (permalink / raw)
To: Rolando Martins; +Cc: linux-kernel

On Tue, 2009-02-10 at 17:32 +0000, Rolando Martins wrote:
> To help others:
>
> mkdir /dev/cpuset
> mount -t cgroup -o cpuset none /dev/cpuset
> cd /dev/cpuset
> echo 0 > cpuset.sched_load_balance

I'm not quite sure that it's allowed to disable load-balancing before
creating children. Other than that it looks OK.

> [... rest of the recipe snipped ...]

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 19:53 ` Peter Zijlstra
@ 2009-02-11 11:33   ` Rolando Martins
  2009-02-11 11:42     ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 11:33 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel

On 2/10/09, Peter Zijlstra <peterz@infradead.org> wrote:
> I'm not quite sure that it's allowed to disable load-balancing before
> creating children. Other than that it looks OK.

Hi again,

Is there any way to have multiple "distinct" sched domains, i.e.:

mount -t cgroup -o cpu none /dev/sched_domain_0
.... set up sched_domain_0 (e.g. 90% RT, 10% others)
mount -t cgroup -o cpu none /dev/sched_domain_1
.... set up sched_domain_1 (e.g. 20% RT, 80% others)

and then give sched_domain_0 to cpuset A and sched_domain_1 to B?

Thanks,
Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:33 ` Rolando Martins
@ 2009-02-11 11:42   ` Peter Zijlstra
  2009-02-11 11:53     ` Balbir Singh
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 11:42 UTC (permalink / raw)
To: Rolando Martins
Cc: linux-kernel, Paul Menage, Balbir Singh, Srivatsa Vaddagiri

On Wed, 2009-02-11 at 11:33 +0000, Rolando Martins wrote:
> Is there any way to have multiple "distinct" sched domains, i.e.:
>
> mount -t cgroup -o cpu none /dev/sched_domain_0
> .... set up sched_domain_0 (e.g. 90% RT, 10% others)
> mount -t cgroup -o cpu none /dev/sched_domain_1
> .... set up sched_domain_1 (e.g. 20% RT, 80% others)
>
> and then give sched_domain_0 to cpuset A and sched_domain_1 to B?

Nope.

We currently only support a single instance of a cgroup controller.

I see the use for what you propose, however implementing that will be
'interesting'.

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:42 ` Peter Zijlstra
@ 2009-02-11 11:53   ` Balbir Singh
  2009-02-11 12:00     ` Peter Zijlstra
  2009-02-11 12:10     ` Rolando Martins
  0 siblings, 2 replies; 14+ messages in thread
From: Balbir Singh @ 2009-02-11 11:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri

* Peter Zijlstra <peterz@infradead.org> [2009-02-11 12:42:14]:
> We currently only support a single instance of a cgroup controller.
>
> I see the use for what you propose, however implementing that will be
> 'interesting'.

I am confused: if you use cpusets, you get your own sched domain. If
you mount cpusets and the cpu controller together, you'll get what you
want. Or is this a figment of my imagination? You might need to use
exclusive cpusets, though.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:53 ` Balbir Singh
@ 2009-02-11 12:00   ` Peter Zijlstra
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Zijlstra @ 2009-02-11 12:00 UTC (permalink / raw)
To: balbir; +Cc: Rolando Martins, linux-kernel, Paul Menage, Srivatsa Vaddagiri

On Wed, 2009-02-11 at 17:23 +0530, Balbir Singh wrote:
> I am confused: if you use cpusets, you get your own sched domain. If
> you mount cpusets and the cpu controller together, you'll get what you
> want. Or is this a figment of my imagination? You might need to use
> exclusive cpusets, though.

AFAIU he wants a cgroup hierarchy per exclusive sched domain.

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: cgroup, RT reservation per core(s)?
  2009-02-11 11:53 ` Balbir Singh
  2009-02-11 12:00   ` Peter Zijlstra
@ 2009-02-11 12:10   ` Rolando Martins
  1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-02-11 12:10 UTC (permalink / raw)
To: balbir; +Cc: Peter Zijlstra, linux-kernel, Paul Menage, Srivatsa Vaddagiri

On 2/11/09, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> I am confused: if you use cpusets, you get your own sched domain. If
> you mount cpusets and the cpu controller together, you'll get what you
> want. Or is this a figment of my imagination? You might need to use
> exclusive cpusets, though.

I don't know if you meant the following situation (mounting cpuset and
cpu together):

                  R
          ________|________
         |                 |
         A                 B
     (cpus 0-2)         (cpu 3)
  (80% RT, 20% others)  (100% RT, 0% others)

If so, we can't do this, because of the restriction imposed by the
global rt_runtime_ns.

Perhaps a "feasible" solution could be implemented by having distinct
global rt_runtime_ns values, one for each cpu, i.e. rt_runtime_ns_0,
..., rt_runtime_ns_n. For the hierarchy above that would give:

capacity_used_cpu_0_rt = 0.8
capacity_used_cpu_1_rt = 0.8
capacity_used_cpu_2_rt = 0.8
capacity_used_cpu_3_rt = 1.0

with the restriction enforced globally per processor i:

    SUM(capacity_used_cpu_i_rt) < rt_runtime_i

Rol

^ permalink raw reply	[flat|nested] 14+ messages in thread
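[Editor's note: the per-cpu admission test proposed above can be sketched in a few lines. This is purely illustrative; no such per-cpu interface exists in the kernel, all names are invented, and the bound is treated as inclusive so that B's 100% reservation on cpu 3 is admissible.]

```sh
# rt_feasible LIMITS GROUPS
#   LIMITS: per-cpu RT fraction allowed, as "cpu:limit cpu:limit ..."
#   GROUPS: one entry per group, as "cpu,cpu,...:utilization ..."
# Prints "yes" if every cpu's summed requests fit within its limit.
rt_feasible() {
    awk -v limits="$1" -v groups="$2" 'BEGIN {
        n = split(limits, L, " ")
        for (i = 1; i <= n; i++) { split(L[i], kv, ":"); lim[kv[1]] = kv[2] }
        m = split(groups, G, " ")
        for (i = 1; i <= m; i++) {
            split(G[i], gu, ":")
            c = split(gu[1], cpus, ",")
            for (j = 1; j <= c; j++) {
                used[cpus[j]] += gu[2]            # accumulate per cpu
                if (used[cpus[j]] > lim[cpus[j]]) { print "no"; exit }
            }
        }
        print "yes"
    }'
}

# The A/B split above: A = cpus 0-2 at 0.8, B = cpu 3 at 1.0,
# against per-cpu limits of 0.95/0.95/0.95/1.0:
rt_feasible "0:0.95 1:0.95 2:0.95 3:1.0" "0,1,2:0.8 3:1.0"      # yes
# A second 30% group on cpus 0-2 would overcommit them:
rt_feasible "0:0.95 1:0.95 2:0.95 3:1.0" "0,1,2:0.8 0,1,2:0.3"  # no
```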
* Re: cgroup, RT reservation per core(s)?
  2009-02-10 13:06 ` Peter Zijlstra
  2009-02-10 14:46   ` Rolando Martins
@ 2009-03-03 12:58   ` Rolando Martins
  1 sibling, 0 replies; 14+ messages in thread
From: Rolando Martins @ 2009-03-03 12:58 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: linux-kernel

On Tue, Feb 10, 2009 at 1:06 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> [... setup recipe snipped ...]
>
> I've never actually tried anything like this, let me know if it
> works ;-)

Just to confirm: cpuset.sched_load_balance doesn't work with RT,
right? Tasks in sub-domain 2 cannot use the bandwidth of sub-domain 3,
right?

     __1__
    /     \
   2       3
(50% rt) (50% rt)

For my application domain ;) it would be interesting to have
rt_runtime_ns act as a minimum of allocated RT time, not a maximum.
For example, if an application in domain 2 needs to go up to 100% and
domain 3 is idle, it would be cool to let it use the full bandwidth.
(We could also have a hard upper limit in each sub-domain, like
hard_up = 0.8, i.e. even if we could get 100%, we would only use 80%.)

Does this make sense?

^ permalink raw reply	[flat|nested] 14+ messages in thread
end of thread, other threads:[~2009-03-03 12:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-09 19:30 cgroup, RT reservation per core(s)? Rolando Martins
2009-02-09 19:52 ` Peter Zijlstra
2009-02-09 20:04   ` Rolando Martins
2009-02-10 13:06     ` Peter Zijlstra
2009-02-10 14:46       ` Rolando Martins
2009-02-10 16:00         ` Peter Zijlstra
2009-02-10 17:32           ` Rolando Martins
2009-02-10 19:53             ` Peter Zijlstra
2009-02-11 11:33               ` Rolando Martins
2009-02-11 11:42                 ` Peter Zijlstra
2009-02-11 11:53                   ` Balbir Singh
2009-02-11 12:00                     ` Peter Zijlstra
2009-02-11 12:10                     ` Rolando Martins
2009-03-03 12:58     ` Rolando Martins