* [RFC] CPU hard limits
From: Bharata B Rao @ 2009-06-04  5:36 UTC
  To: linux-kernel
  Cc: Dhaval Giani, Balbir Singh, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

Hi,

This is an RFC about the CPU hard limits feature. It explains the need
for the feature, the proposed plan and the open issues around it.
Before I come up with an implementation for hard limits, I would like to
hear the community's thoughts on this scheduler enhancement and any
feedback and suggestions.

Regards,
Bharata.

1. CPU hard limit
2. Need for hard limiting CPU resource
3. Granularity of enforcing CPU hard limits
4. Existing solutions
5. Specifying hard limits
6. Per task group vs global bandwidth period
7. Configuring
8. Throttling of tasks
9. Group scheduler hierarchy considerations
10. SMP considerations
11. Starvation
12. Hard limits and fairness

1. CPU hard limit
-----------------
CFS is a proportional-share scheduler which tries to divide CPU time
proportionately between tasks or groups of tasks (task group/cgroup),
depending on the priority/weight of each task or the shares assigned to
each task group. In CFS, a task/task group can get more than its share
of CPU if there are enough idle CPU cycles available in the system, due
to the work-conserving nature of the scheduler.

However, there are scenarios (Sec 2) where giving a task/task group more
than its desired CPU share is not acceptable. In those scenarios, the
scheduler needs to put a hard stop on the CPU consumption of a task/task
group once it exceeds a preset limit. This is usually achieved by
throttling the task/task group when it has fully consumed its allocated
CPU time.

2. Need for hard limiting CPU resource
--------------------------------------
- Pay-per-use: In enterprise systems that cater to multiple clients/customers,
  where a customer demands a certain share of CPU resources and pays only
  for that share, CPU hard limits are useful to restrict the customer's job
  to the specified amount of CPU resource.
- In container-based virtualization environments running multiple containers,
  hard limits are useful to ensure that a container doesn't exceed its
  CPU entitlement.
- Hard limits can be used to provide guarantees.

3. Granularity of enforcing CPU hard limits
-------------------------------------------
Conceptually, hard limits can be enforced either for individual tasks or
for groups of tasks. However, enforcing limits per task would be too
fine-grained and would mean a lot of work for the system administrator,
who would have to set limits for every task. Based on the current
understanding of the users of this feature, hard limiting is felt to be
more useful at the task-group level than at the individual-task level.
Hence the subsequent paragraphs discuss the concept of hard limits as
applicable to task groups/cgroups.

4. Existing solutions
---------------------
- Both Linux-VServer and OpenVZ virtualization solutions support CPU hard
  limiting.
- A per-task limit can be enforced using rlimits (RLIMIT_CPU), but it is
  a cumulative cap rather than a rate-based limit.

5. Specifying hard limits
-------------------------
The CPU time consumed by a task group is measured over a time period
(called the bandwidth period), and the task group gets throttled when
its CPU time reaches a limit (the hard limit) within that bandwidth
period. The task group remains throttled until the bandwidth period is
renewed, at which time additional CPU time becomes available to its
tasks again.

When a task group's hard limit is specified as a ratio X/Y, it means that
the group will get throttled if its CPU time consumption exceeds X seconds
in a bandwidth period of Y seconds.

Specifying the hard limit as X/Y requires us to specify the bandwidth
period also.

Is having a uniform bandwidth period for all groups an option? If so, we
could even specify the hard limit as a percentage, like 30% of the
uniform bandwidth period.
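
For illustration, here is a minimal sketch of this accounting in Python
(illustrative only; the names quota/period/used are assumptions, not a
proposed API):

class GroupBandwidth:
    def __init__(self, x, y):
        self.quota = x           # X: allowed CPU time per period (seconds)
        self.period = y          # Y: bandwidth period (seconds)
        self.used = 0.0          # CPU time consumed in the current period
        self.throttled = False

    def account(self, delta):
        # Charge 'delta' seconds of CPU time; throttle at the hard limit.
        self.used += delta
        if self.used >= self.quota:
            self.throttled = True    # no CPU until the period is renewed

    def refresh(self):
        # Called when the bandwidth period is renewed.
        self.used = 0.0
        self.throttled = False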

6. Per task group vs global bandwidth period
--------------------------------------------
The bandwidth period can either be per task group or global. With a
global bandwidth period, the runtimes of all task groups need to be
replenished when the period ends. Though this appears conceptually
simple, the implementation might not scale. If instead every task group
maintains its bandwidth period separately, the refresh cycles of the
groups happen independently of each other. Moreover, different groups
might prefer different bandwidth periods. Hence the first implementation
will have a per-task-group bandwidth period.

Timers can be used to trigger the bandwidth refresh cycles (similar to
rt group sched).
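
A minimal user-space sketch of such per-group refresh timers, reusing
the GroupBandwidth sketch above (the kernel would use hrtimers, as rt
group sched does; this just illustrates the independent refresh cycles):

import threading

def start_refresh_timer(group):
    def tick():
        group.refresh()              # replenish this group's runtime
        start_refresh_timer(group)   # re-arm for the group's next period
    timer = threading.Timer(group.period, tick)
    timer.daemon = True
    timer.start()
    return timer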

7. Configuring
--------------
- The user could set the hard limit (X and/or Y) through the cgroup fs.
- When the scheduler supports hard limiting, should it be enabled for
  all task groups in the system? Or should the user have an option to
  enable hard limiting per group?
- When hard limiting is enabled for a group, should the limit start at
  a default value? Or should the user set the limit and the bandwidth
  period before enabling hard limiting?
- What would be a sane default value for the bandwidth period?

8. Throttling of tasks
----------------------
A task group can be taken off the runqueue when it hits its limit and
enqueued back when the bandwidth period is refreshed. This method
requires maintaining the list of throttled tasks separately for every
group.

Under heavy throttling, tasks would keep getting dequeued and enqueued
back at bandwidth refresh times, leading to frequent variations in
runqueue load. This might unduly stress the load balancer.

Note: A group (entity) can't be dequeued unless all tasks under it are
dequeued. So there can be false/failed attempts to run tasks of a
throttled group until all of its tasks are dequeued.
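
A sketch of this dequeue/enqueue scheme (illustrative; a real
implementation would operate on scheduler entities, not Python lists):

class RunQueue:
    def __init__(self):
        self.runnable = []       # groups eligible to run
        self.throttled = []      # groups waiting for a bandwidth refresh

    def throttle(self, group):
        # Take the group off the runqueue when it hits its hard limit.
        if group in self.runnable:
            self.runnable.remove(group)
            self.throttled.append(group)

    def unthrottle(self, group):
        # Enqueue the group back when its bandwidth period is refreshed.
        if group in self.throttled:
            self.throttled.remove(group)
            self.runnable.append(group)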

9. Group scheduler hierarchy considerations
-------------------------------------------
Since the group scheduler is hierarchical in nature, should there be any
relation between the hard limit values of a parent task group and those
of its child groups? Should the hard limit values set for child groups
be compatible with the parent's hard limit? For example, consider a
group A with hard limit X/Y that has two children A1 and A2. Should the
limits for A1 (X1/Y) and A2 (X2/Y) be set so that X1/Y + X2/Y <= X/Y?
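
This compatibility rule is easy to check mechanically; a sketch, for
limits expressed over a common period Y:

def limits_compatible(parent_x, children_x, y):
    # X1/Y + X2/Y + ... <= X/Y
    return sum(x / float(y) for x in children_x) <= parent_x / float(y)

assert limits_compatible(50, [20, 25], 100)      # 0.45 <= 0.50
assert not limits_compatible(50, [30, 30], 100)  # 0.60 >  0.50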

Or should child groups set their limits independently of the parent? In
that case, even if a child still has CPU time left before hitting its own
limit, it could get throttled because its parent got throttled. I would
think this method leads to an easier implementation.

AFAICS, the rt group scheduler needs EDF to support different bandwidth
periods for different groups (Ref: Documentation/scheduler/sched-rt-group.txt).
I don't think the same requirement applies to non-rt groups, because with
hard limits we are not guaranteeing CPU time for a group; we are just
specifying the maximum time for which a group can run within a bandwidth
period.

10. SMP considerations
----------------------
Hard limits could be enforced for the system as a whole or for individual
CPUs.

When enforced per CPU, a task group on a CPU gets throttled when it
reaches its hard limit on that CPU. This can lead to unfairness: the same
task group may still have unused runtime left on other CPUs which goes
unutilized.

If enforced system wide, a task group gets throttled when the sum of the
runtimes of its tasks running on different CPUs reaches the limit.

Could we use a hybrid method, where a task group that reaches its limit
on one CPU draws group bandwidth from another CPU on which the group has
no runnable tasks?

RT group scheduling does something similar: it borrows runtime from other
CPUs when runtimes are balanced.
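
A sketch of the hybrid method (illustrative; the per-CPU state names
nr_running/runtime_left are assumptions):

def borrow_runtime(local, remote_cpus):
    # local and each entry of remote_cpus describe one task group's
    # state on a CPU: remaining runtime and runnable tasks there.
    for rq in remote_cpus:
        if rq["nr_running"] == 0 and rq["runtime_left"] > 0:
            local["runtime_left"] += rq["runtime_left"]  # borrow it all
            rq["runtime_left"] = 0
            if local["runtime_left"] > 0:
                return True   # borrowed enough; group keeps running here
    return local["runtime_left"] > 0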

11. Starvation
---------------
When a task group that holds a shared resource (like a lock) is
throttled, another group which needs the same shared resource will not
be able to make progress even when the CPU has idle cycles to spare.
This leads to starvation and unfairness. The situation could be
mitigated by methods like:

- Disabling throttling while a group holds a lock.
- Inheriting runtime from the group that faces starvation.

The first implementation will not address this starvation problem.

12. Hard limits and fairness
----------------------------
Hard limits are set independently of group shares. The user may set hard
limits such that it is not possible for the scheduler to both maintain
fairness (as per the shares) and enforce the hard limits. In such cases,
hard limiting takes precedence.

* Re: [RFC] CPU hard limits
From: Avi Kivity @ 2009-06-04 12:19 UTC
  To: bharata
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Bharata B Rao wrote:
> 2. Need for hard limiting CPU resource
> --------------------------------------
> - Pay-per-use: In enterprise systems that cater to multiple clients/customers
>   where a customer demands a certain share of CPU resources and pays only
>   that, CPU hard limits will be useful to hard limit the customer's job
>   to consume only the specified amount of CPU resource.
> - In container based virtualization environments running multiple containers,
>   hard limits will be useful to ensure a container doesn't exceed its
>   CPU entitlement.
> - Hard limits can be used to provide guarantees.
>   
How can hard limits provide guarantees?

Let's take an example where I have 1 group that I wish to guarantee a
20% share of the cpu, and another 8 groups with no limits or guarantees.

One way to achieve the guarantee is to hard limit each of the 8 other 
groups to 10%; the sum total of the limits is 80%, leaving 20% for the 
guarantee group. The downside is the arbitrary limit imposed on the 
other groups.

Another way is to place the 8 groups in a container group, and limit 
that to 80%. But that doesn't work if I want to provide guarantees to 
several groups.

-- 
error compiling committee.c: too many arguments to function


* Re: [RFC] CPU hard limits
From: Mike Waychison @ 2009-06-04 21:32 UTC
  To: Avi Kivity
  Cc: bharata, Peter Zijlstra, Pavel Emelyanov, Dhaval Giani, kvm,
	Gautham R Shenoy, Linux Containers, linux-kernel, Ingo Molnar,
	Balbir Singh

Avi Kivity wrote:
> Bharata B Rao wrote:
>> 2. Need for hard limiting CPU resource
>> --------------------------------------
>> - Pay-per-use: In enterprise systems that cater to multiple clients/customers
>>   where a customer demands a certain share of CPU resources and pays only
>>   that, CPU hard limits will be useful to hard limit the customer's job
>>   to consume only the specified amount of CPU resource.
>> - In container based virtualization environments running multiple containers,
>>   hard limits will be useful to ensure a container doesn't exceed its
>>   CPU entitlement.
>> - Hard limits can be used to provide guarantees.
>>   
> How can hard limits provide guarantees?

Hard limits are useful and desirable in situations where we would like 
to maintain deterministic behavior.

Placing a hard cap on the cpu usage of a given task group (and 
configuring such that this cpu time is not overcommitted) on a system
allows us to create a hard guarantee that throughput for that task group 
will not fluctuate as other workloads are added and removed on the system.

Cache use and bus bandwidth in a multi-workload environment can still 
cause a performance deviation, but these are second order compared to 
the cpu scheduling guarantees themselves.

Mike Waychison

* Re: [RFC] CPU hard limits
  2009-06-04 12:19 ` Avi Kivity
  2009-06-04 21:32   ` Mike Waychison
@ 2009-06-05  3:03   ` Bharata B Rao
  2009-06-05  3:33     ` Avi Kivity
       [not found]     ` <20090605030309.GA3872-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
  2009-06-05  3:07   ` Balbir Singh
       [not found]   ` <4A27BBCA.5020606-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  3 siblings, 2 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-05  3:03 UTC (permalink / raw)
  To: Avi Kivity
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

On Thu, Jun 04, 2009 at 03:19:22PM +0300, Avi Kivity wrote:
> Bharata B Rao wrote:
>> 2. Need for hard limiting CPU resource
>> --------------------------------------
>> - Pay-per-use: In enterprise systems that cater to multiple clients/customers
>>   where a customer demands a certain share of CPU resources and pays only
>>   that, CPU hard limits will be useful to hard limit the customer's job
>>   to consume only the specified amount of CPU resource.
>> - In container based virtualization environments running multiple containers,
>>   hard limits will be useful to ensure a container doesn't exceed its
>>   CPU entitlement.
>> - Hard limits can be used to provide guarantees.
>>   
> How can hard limits provide guarantees?
>
> Let's take an example where I have 1 group that I wish to guarantee a  
> 20% share of the cpu, and another 8 groups with no limits or guarantees.
>
> One way to achieve the guarantee is to hard limit each of the 8 other  
> groups to 10%; the sum total of the limits is 80%, leaving 20% for the  
> guarantee group. The downside is the arbitrary limit imposed on the  
> other groups.

This method sounds very similar to the openvz method:
http://wiki.openvz.org/Containers/Guarantees_for_resources

>
> Another way is to place the 8 groups in a container group, and limit  
> that to 80%. But that doesn't work if I want to provide guarantees to  
> several groups.

Hmm why not ? Reduce the guarantee of the container group and provide
the same to additional groups ?

Regards,
Bharata.

* Re: [RFC] CPU hard limits
From: Balbir Singh @ 2009-06-05  3:07 UTC
  To: Avi Kivity
  Cc: bharata, Peter Zijlstra, Pavel Emelyanov, Dhaval Giani, kvm,
	Gautham R Shenoy, Linux Containers, linux-kernel, Ingo Molnar

* Avi Kivity <avi@redhat.com> [2009-06-04 15:19:22]:

> Bharata B Rao wrote:
> > 2. Need for hard limiting CPU resource
> > --------------------------------------
> > - Pay-per-use: In enterprise systems that cater to multiple clients/customers
> >   where a customer demands a certain share of CPU resources and pays only
> >   that, CPU hard limits will be useful to hard limit the customer's job
> >   to consume only the specified amount of CPU resource.
> > - In container based virtualization environments running multiple containers,
> >   hard limits will be useful to ensure a container doesn't exceed its
> >   CPU entitlement.
> > - Hard limits can be used to provide guarantees.
> >   
> How can hard limits provide guarantees?
> 
> Let's take an example where I have 1 group that I wish to guarantee a 
> 20% share of the cpu, and another 8 groups with no limits or guarantees.
> 
> One way to achieve the guarantee is to hard limit each of the 8 other 
> groups to 10%; the sum total of the limits is 80%, leaving 20% for the 
> guarantee group. The downside is the arbitrary limit imposed on the 
> other groups.
> 
> Another way is to place the 8 groups in a container group, and limit 
> that to 80%. But that doesn't work if I want to provide guarantees to 
> several groups.
>

Hi, Avi,

Take a look at
http://wiki.openvz.org/Containers/Guarantees_for_resources
and the associated program in the wiki page.

-- 
	Balbir

* Re: [RFC] CPU hard limits
From: Avi Kivity @ 2009-06-05  3:33 UTC
  To: bharata
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Bharata B Rao wrote:
>> Another way is to place the 8 groups in a container group, and limit  
>> that to 80%. But that doesn't work if I want to provide guarantees to  
>> several groups.
>>     
>
> Hmm why not ? Reduce the guarantee of the container group and provide
> the same to additional groups ?
>   

This method produces suboptimal results:

$ cgroup-limits 10 10 0
[50.0, 50.0, 40.0]

I want to provide two 10% guaranteed groups and one best-effort group.
Using the limits method, no group can now use more than 50% of the
resources.  However, having the first group use 90% of the resources
does not violate any guarantees, but it is not allowed by the solution.

#!/usr/bin/python

# Compute per-group limits that realize a set of guarantees g (out of a
# total R) purely by limiting the other groups: the resulting limits
# satisfy R - sum(limits of all groups except i) = g[i].
def calculate_limits(g, R):
    N = len(g)
    if N == 1:
        return [R]

    s = sum([R - gi for gi in g])
    return [(s - (R - gi) - (N - 2) * (R - gi)) / (N - 1)
            for gi in g]

import sys
print calculate_limits([float(x) for x in sys.argv[1:]], 100)
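
A quick sanity check of that property (a sketch; 'check' is illustrative
and not part of the script):

def check(guarantees, limits, R=100):
    for i, g in enumerate(guarantees):
        # Worst case for group i: every other group runs at its limit.
        worst = R - sum(l for j, l in enumerate(limits) if j != i)
        assert worst >= g

check([10, 10, 0], [50.0, 50.0, 40.0])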

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [RFC] CPU hard limits
From: Balbir Singh @ 2009-06-05  4:37 UTC
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

On Fri, Jun 5, 2009 at 11:33 AM, Avi Kivity <avi@redhat.com> wrote:
> Bharata B Rao wrote:
>>>
>>> Another way is to place the 8 groups in a container group, and limit
>>>  that to 80%. But that doesn't work if I want to provide guarantees to
>>>  several groups.
>>>
>>
>> Hmm why not ? Reduce the guarantee of the container group and provide
>> the same to additional groups ?
>>
>
> This method produces suboptimal results:
>
> $ cgroup-limits 10 10 0
> [50.0, 50.0, 40.0]
>
> I want to provide two 10% guaranteed groups and one best-effort group.
>  Using the limits method, no group can now use more than 50% of the
> resources.  However, having the first group use 90% of the resources does
> not violate any guarantees, but it is not allowed by the solution.
>

How, it works out fine in my calculation

50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
limited to 90%
50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
limited to 90%
50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
limited to 100%

Now if we really have zeros, I would recommend using

cgroup-limits 10 10 and you'll see that you'll get 90, 90 as output.

Adding zeros to the calculation is not recommended. Does that help?

Balbir

* Re: [RFC] CPU hard limits
From: Avi Kivity @ 2009-06-05  4:44 UTC
  To: Balbir Singh
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:
> On Fri, Jun 5, 2009 at 11:33 AM, Avi Kivity <avi@redhat.com> wrote:
>   
>> Bharata B Rao wrote:
>>     
>>>> Another way is to place the 8 groups in a container group, and limit
>>>>  that to 80%. But that doesn't work if I want to provide guarantees to
>>>>  several groups.
>>>>
>>>>         
>>> Hmm why not ? Reduce the guarantee of the container group and provide
>>> the same to additional groups ?
>>>
>>>       
>> This method produces suboptimal results:
>>
>> $ cgroup-limits 10 10 0
>> [50.0, 50.0, 40.0]
>>
>> I want to provide two 10% guaranteed groups and one best-effort group.
>>  Using the limits method, no group can now use more than 50% of the
>> resources.  However, having the first group use 90% of the resources does
>> not violate any guarantees, but it is not allowed by the solution.
>>
>>     
>
> How, it works out fine in my calculation
>
> 50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
> limited to 90%
> 50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
> limited to 90%
> 50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
> limited to 100%
>   

It's fine in that it satisfies the guarantees, but it is deeply 
suboptimal.  If I ran a cpu hog in the first group, while the other two 
were idle, it would be limited to 50% cpu.  On the other hand, if it 
consumed all 100% cpu it would still satisfy the guarantees (as the 
other groups are idle).

The result is that in such a situation, wall clock time would double 
even though cpu resources are available.
> Now if we really have zeros, I would recommend using
>
> cgroup-limits 10 10 and you'll see that you'll get 90, 90 as output.
>
> Adding zeros to the calculation is not recommended. Does that help?

What do you mean, it is not recommended? I have two groups which need at 
least 10% and one which does not need any guarantee, how do I express it?

In any case, changing the zero to 1% does not materially change the results.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


* Re: [RFC] CPU hard limits
From: Balbir Singh @ 2009-06-05  4:49 UTC
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Avi Kivity <avi@redhat.com> [2009-06-05 07:44:27]:

> Balbir Singh wrote:
>> On Fri, Jun 5, 2009 at 11:33 AM, Avi Kivity <avi@redhat.com> wrote:
>>   
>>> Bharata B Rao wrote:
>>>     
>>>>> Another way is to place the 8 groups in a container group, and limit
>>>>>  that to 80%. But that doesn't work if I want to provide guarantees to
>>>>>  several groups.
>>>>>
>>>>>         
>>>> Hmm why not ? Reduce the guarantee of the container group and provide
>>>> the same to additional groups ?
>>>>
>>>>       
>>> This method produces suboptimal results:
>>>
>>> $ cgroup-limits 10 10 0
>>> [50.0, 50.0, 40.0]
>>>
>>> I want to provide two 10% guaranteed groups and one best-effort group.
>>>  Using the limits method, no group can now use more than 50% of the
>>> resources.  However, having the first group use 90% of the resources does
> >>> not violate any guarantees, but it is not allowed by the solution.
>>>
>>>     
>>
>> How, it works out fine in my calculation
>>
>> 50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
>> limited to 90%
>> 50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
>> limited to 90%
>> 50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
>> limited to 100%
>>   
>
> It's fine in that it satisfies the guarantees, but it is deeply  
> suboptimal.  If I ran a cpu hog in the first group, while the other two  
> were idle, it would be limited to 50% cpu.  On the other hand, if it  
> consumed all 100% cpu it would still satisfy the guarantees (as the  
> other groups are idle).
>
> The result is that in such a situation, wall clock time would double  
> even though cpu resources are available.

But then there is no other way to make a *guarantee*; guarantees come
at a cost of idling resources, no? Can you show me any other combination
that will provide the specified guarantees without idling the system?


>> Now if we really have zeros, I would recommend using
>>
>> cgroup-limits 10 10 and you'll see that you'll get 90, 90 as output.
>>
>> Adding zeros to the calculation is not recommended. Does that help?
>
> What do you mean, it is not recommended? I have two groups which need at  
> least 10% and one which does not need any guarantee, how do I express it?
>
Ignore this part of my comment

> In any case, changing the zero to 1% does not materially change the results.

True.

-- 
	Balbir

* Re: [RFC] CPU hard limits
From: Chris Friesen @ 2009-06-05  5:09 UTC
  To: balbir
  Cc: Avi Kivity, bharata, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Balbir Singh wrote:

> But then there is no other way to make a *guarantee*; guarantees come
> at a cost of idling resources, no? Can you show me any other combination
> that will provide the specified guarantees without idling the system?

The example given was two 10% guaranteed groups and one best-effort
group.  Why would this require idling resources?

If I have a hog in each group, the requirements would be met if the
groups got 33, 33, and 33.  (Or 10/10/80, for that matter.)  If the
second and third groups go idle, why not let the first group use 100% of
the cpu?

The only hard restriction is that the sum of the guarantees must be less
than 100%.

Chris

* Re: [RFC] CPU hard limits
From: Balbir Singh @ 2009-06-05  5:10 UTC
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Balbir Singh <balbir@linux.vnet.ibm.com> [2009-06-05 12:49:46]:

> * Avi Kivity <avi@redhat.com> [2009-06-05 07:44:27]:
> 
> > Balbir Singh wrote:
> >> On Fri, Jun 5, 2009 at 11:33 AM, Avi Kivity <avi@redhat.com> wrote:
> >>   
> >>> Bharata B Rao wrote:
> >>>     
> >>>>> Another way is to place the 8 groups in a container group, and limit
> >>>>>  that to 80%. But that doesn't work if I want to provide guarantees to
> >>>>>  several groups.
> >>>>>
> >>>>>         
> >>>> Hmm why not ? Reduce the guarantee of the container group and provide
> >>>> the same to additional groups ?
> >>>>
> >>>>       
> >>> This method produces suboptimal results:
> >>>
> >>> $ cgroup-limits 10 10 0
> >>> [50.0, 50.0, 40.0]
> >>>
> >>> I want to provide two 10% guaranteed groups and one best-effort group.
> >>>  Using the limits method, no group can now use more than 50% of the
> >>> resources.  However, having the first group use 90% of the resources does
> >>> not violate any guarantees, but it is not allowed by the solution.
> >>>
> >>>     
> >>
> >> How, it works out fine in my calculation
> >>
> >> 50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
> >> limited to 90%
> >> 50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
> >> limited to 90%
> >> 50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
> >> limited to 100%
> >>   
> >
> > It's fine in that it satisfies the guarantees, but it is deeply  
> > suboptimal.  If I ran a cpu hog in the first group, while the other two  
> > were idle, it would be limited to 50% cpu.  On the other hand, if it  
> > consumed all 100% cpu it would still satisfy the guarantees (as the  
> > other groups are idle).
> >
> > The result is that in such a situation, wall clock time would double  
> > even though cpu resources are available.
> 
> But then there is no other way to make a *guarantee*; guarantees come
> at a cost of idling resources, no? Can you show me any other combination
> that will provide the specified guarantees without idling the system?

OK, I see part of your concern, but I think we could do some
optimizations during design. For example, if all groups have reached
their hard limit and the system is idle, should we start a new hard
limit interval right away and restart the groups, so that the idleness
is removed? Would that be an acceptable design point?
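
A minimal sketch of that design point (assuming per-group state like the
GroupBandwidth sketch in the RFC):

def maybe_refresh_early(groups):
    # If every group is throttled, the CPU would sit idle; start a new
    # hard limit interval immediately instead of waiting for the timers.
    if groups and all(g.throttled for g in groups):
        for g in groups:
            g.refresh()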

-- 
	Balbir

* Re: [RFC] CPU hard limits
From: Balbir Singh @ 2009-06-05  5:13 UTC
  To: Chris Friesen
  Cc: Avi Kivity, bharata, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

* Chris Friesen <cfriesen@nortel.com> [2009-06-04 23:09:22]:

> Balbir Singh wrote:
> 
> > But then there is no other way to make a *guarantee*, guarantees come
> > at a cost of idling resources, no? Can you show me any other
> > combination that will provide the guarantee and without idling the
> > system for the specified guarantees?
> 
> The example given was two 10% guaranteed groups and one best-effort
> group.  Why would this require idling resources?
> 
> If I have a hog in each group, the requirements would be met if the
> groups got 33, 33, and 33.  (Or 10/10/80, for that matter.)  If the
> second and third groups go idle, why not let the first group use 100% of
> the cpu?
> 
> The only hard restriction is that the sum of the guarantees must be less
> than 100%.
>

Chris,

I just responded to a variation of this; I think some of this could be
handled during design. I sent out that email a few minutes ago. Could
you look at it and respond?

-- 
	Balbir

* Re: [RFC] CPU hard limits
  2009-06-05  4:49           ` Balbir Singh
                               ` (2 preceding siblings ...)
  2009-06-05  5:10             ` Balbir Singh
@ 2009-06-05  5:16             ` Avi Kivity
  2009-06-05  5:20               ` Balbir Singh
       [not found]               ` <4A28AA25.4050206-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  3 siblings, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05  5:16 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:

    

>>> How, it works out fine in my calculation
>>>
>>> 50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
>>> limited to 90%
>>> 50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
>>> limited to 90%
>>> 50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
>>> limited to 100%
>>>   
>>>       
>> It's fine in that it satisfies the guarantees, but it is deeply  
>> suboptimal.  If I ran a cpu hog in the first group, while the other two  
>> were idle, it would be limited to 50% cpu.  On the other hand, if it  
>> consumed all 100% cpu it would still satisfy the guarantees (as the  
>> other groups are idle).
>>
>> The result is that in such a situation, wall clock time would double  
>> even though cpu resources are available.
>>     
>
> But then there is no other way to make a *guarantee*, guarantees come
> at a cost of idling resources, no? Can you show me any other
> combination that will provide the guarantee and without idling the
> system for the specified guarantees?
>   

Suppose in my example cgroup 1 consumed 100% of the cpu resources and
cgroups 2 and 3 were completely idle.  All of the guarantees are met (if
cgroup 2 is idle, there's no need to give it the 10% cpu time it is
guaranteed).

If your only tool to achieve the guarantees is a limit system, then
yes, the equation yields the correct results.  But given that it yields
such inferior results, I think we need to look for a more involved solution.

I think the limits method fits cases where it is difficult to evict a
resource (say, disk quotas -- if you want to guarantee 10% of space to
cgroup 1, you must limit all others to 90%).  But for processor usage,
you can evict a cgroup instantly, so nothing prevents a cgroup from
consuming all available resources as long as others do not contend for them.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread
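
To quantify the slowdown described above: on an otherwise idle machine,
a job needing C seconds of pure CPU time under a hard cap of fraction f
of one CPU needs at least C / f seconds of wall-clock time (assuming the
cap is enforced over periods much shorter than the job). With the 50%
limit from the example, f = 0.5 and the job takes 2C, i.e. wall-clock
time doubles even though the CPU would otherwise sit idle.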

* Re: [RFC] CPU hard limits
  2009-06-05  5:16             ` Avi Kivity
@ 2009-06-05  5:20               ` Balbir Singh
       [not found]               ` <4A28AA25.4050206-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  5:20 UTC (permalink / raw)
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Avi Kivity <avi@redhat.com> [2009-06-05 08:16:21]:

> Balbir Singh wrote:
>
>    
>
>>>> How, it works out fine in my calculation
>>>>
>>>> 50 + 40 for G2 and G3, make sure that G1 gets 10%, since others are
>>>> limited to 90%
>>>> 50 + 40 for G1 and G3, make sure that G2 gets 10%, since others are
>>>> limited to 90%
>>>> 50 + 50 for G1 and G2, make sure that G3 gets 0%, since others are
>>>> limited to 100%
>>>>         
>>> It's fine in that it satisfies the guarantees, but it is deeply   
>>> suboptimal.  If I ran a cpu hog in the first group, while the other 
>>> two  were idle, it would be limited to 50% cpu.  On the other hand, 
>>> if it  consumed all 100% cpu it would still satisfy the guarantees 
>>> (as the  other groups are idle).
>>>
>>> The result is that in such a situation, wall clock time would double  
>>> even though cpu resources are available.
>>>     
>>
>> But then there is no other way to make a *guarantee*, guarantees come
>> at a cost of idling resources, no? Can you show me any other
>> combination that will provide the guarantee and without idling the
>> system for the specified guarantees?
>>   
>
> Suppose in my example cgroup 1 consumed 100% of the cpu resources and  
> cgroup 2 and 3 were completely idle.  All of the guarantees are met (if  
> cgroup 2 is idle, there's no need to give it the 10% cpu time it is  
> guaranteed).
>
> If  your only tool to achieve the guarantees is a limit system, then  
> yes, the equation yields the correct results.  But given that it yields  
> such inferior results, I think we need to look for a more involved 
> solution.
>
> I think the limits method fits cases where it is difficult to evict a  
> resource (say, disk quotas -- if you want to guarantee 10% of space to  
> cgroups 1, you must limit all others to 90%).  But for processor usage,  
> you can evict a cgroup instantly, so nothing prevents a cgroup from  
> consuming all available resources as long as others do not contend for 
> them.

Avi,

Could you look at my newer email, where I've mentioned that I see your
concern and discussed a design point, and comment? We could probably
take this discussion forward from there.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  5:10             ` Balbir Singh
@ 2009-06-05  5:21               ` Avi Kivity
  2009-06-05  5:27                 ` Balbir Singh
       [not found]                 ` <4A28AB67.7040800-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
       [not found]               ` <20090605051050.GB11755-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
  1 sibling, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05  5:21 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:
>> But then there is no other way to make a *guarantee*, guarantees come
>> at a cost of idling resources, no? Can you show me any other
>> combination that will provide the guarantee and without idling the
>> system for the specified guarantees?
>>     
>
> OK, I see part of your concern, but I think we could do some
> optimizations during design. For example if all groups have reached
> their hard-limit and the system is idle, should we do start a new hard
> limit interval and restart, so that idleness can be removed. Would
> that be an acceptable design point?

I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a 
cpu hog running in each group, how would the algorithm divide resources?

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  5:21               ` Avi Kivity
@ 2009-06-05  5:27                 ` Balbir Singh
  2009-06-05  5:31                   ` Bharata B Rao
                                     ` (2 more replies)
       [not found]                 ` <4A28AB67.7040800-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 3 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  5:27 UTC (permalink / raw)
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Avi Kivity <avi@redhat.com> [2009-06-05 08:21:43]:

> Balbir Singh wrote:
>>> But then there is no other way to make a *guarantee*, guarantees come
>>> at a cost of idling resources, no? Can you show me any other
>>> combination that will provide the guarantee and without idling the
>>> system for the specified guarantees?
>>>     
>>
>> OK, I see part of your concern, but I think we could do some
>> optimizations during design. For example if all groups have reached
>> their hard-limit and the system is idle, should we do start a new hard
>> limit interval and restart, so that idleness can be removed. Would
>> that be an acceptable design point?
>
> I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a  
> cpu hog running in each group, how would the algorithm divide resources?
>

As per the matrix calculation; but as soon as we reach an idle point,
we redistribute the bandwidth and start a new quantum, so to speak, in
which all groups are charged up to their hard limits.

For your question: with a CPU hog running in each group, the division
would be exactly as per the matrix calculation, since the system has no
idle point during the bandwidth period.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread
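
The "matrix calculation" referred to above can be made concrete. One
closed form consistent with the numbers quoted in this thread is
Li = Gi + (1 - sum(G)) / (n - 1): for every group i, the other groups'
limits then sum to exactly 1 - Gi. A minimal user-space sketch of that
arithmetic (illustrative only, not a kernel implementation):

#include <stdio.h>

/*
 * Illustrative only: derive per-group hard limits from guarantees,
 * assuming n >= 2 and sum(g) <= 1.  For every group i, the other
 * groups' limits then sum to exactly 1 - g[i], which is what turns
 * g[i] into a guarantee.
 */
static void limits_from_guarantees(const double *g, double *l, int n)
{
        double sum = 0.0;
        int i;

        for (i = 0; i < n; i++)
                sum += g[i];
        for (i = 0; i < n; i++)
                l[i] = g[i] + (1.0 - sum) / (n - 1);
}

int main(void)
{
        double g[3] = { 0.10, 0.10, 0.00 };     /* the example above */
        double l[3];
        int i;

        limits_from_guarantees(g, l, 3);
        for (i = 0; i < 3; i++)
                printf("group %d: guarantee %.0f%%, limit %.0f%%\n",
                       i + 1, 100 * g[i], 100 * l[i]);
        return 0;
}

On G = (10%, 10%, 0%) this prints limits of 50%, 50% and 40%,
reproducing the 50 + 40, 50 + 40 and 50 + 50 sums quoted above. It also
makes Avi's objection visible: a hog in group 1 is capped at 50% even
while groups 2 and 3 sit idle.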

* Re: [RFC] CPU hard limits
  2009-06-05  5:27                 ` Balbir Singh
@ 2009-06-05  5:31                   ` Bharata B Rao
  2009-06-05  6:01                     ` Avi Kivity
                                       ` (2 more replies)
       [not found]                   ` <20090605052755.GE11755-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
  2009-06-05  6:03                   ` Avi Kivity
  2 siblings, 3 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-05  5:31 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Avi Kivity, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

On Fri, Jun 05, 2009 at 01:27:55PM +0800, Balbir Singh wrote:
> * Avi Kivity <avi@redhat.com> [2009-06-05 08:21:43]:
> 
> > Balbir Singh wrote:
> >>> But then there is no other way to make a *guarantee*, guarantees come
> >>> at a cost of idling resources, no? Can you show me any other
> >>> combination that will provide the guarantee and without idling the
> >>> system for the specified guarantees?
> >>>     
> >>
> >> OK, I see part of your concern, but I think we could do some
> >> optimizations during design. For example if all groups have reached
> >> their hard-limit and the system is idle, should we do start a new hard
> >> limit interval and restart, so that idleness can be removed. Would
> >> that be an acceptable design point?
> >
> > I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a  
> > cpu hog running in each group, how would the algorithm divide resources?
> >
> 
> As per the matrix calculation, but as soon as we reach an idle point,
> we redistribute the b/w and start a new quantum so to speak, where all
> groups are charged up to their hard limits.

But could there be client models where you are required to strictly
adhere to the limit within the bandwidth period and not provide more
(by advancing the bandwidth period) in the presence of idle cycles?

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  5:31                   ` Bharata B Rao
@ 2009-06-05  6:01                     ` Avi Kivity
       [not found]                       ` <4A28B4CE.4010004-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2009-06-05  9:39                       ` Balbir Singh
       [not found]                     ` <20090605053159.GB3872-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
  2009-06-05  9:24                     ` Balbir Singh
  2 siblings, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05  6:01 UTC (permalink / raw)
  To: bharata
  Cc: Balbir Singh, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Bharata B Rao wrote:
> On Fri, Jun 05, 2009 at 01:27:55PM +0800, Balbir Singh wrote:
>   
>> * Avi Kivity <avi@redhat.com> [2009-06-05 08:21:43]:
>>
>>     
>>> Balbir Singh wrote:
>>>       
>>>>> But then there is no other way to make a *guarantee*, guarantees come
>>>>> at a cost of idling resources, no? Can you show me any other
>>>>> combination that will provide the guarantee and without idling the
>>>>> system for the specified guarantees?
>>>>>     
>>>>>           
>>>> OK, I see part of your concern, but I think we could do some
>>>> optimizations during design. For example if all groups have reached
>>>> their hard-limit and the system is idle, should we do start a new hard
>>>> limit interval and restart, so that idleness can be removed. Would
>>>> that be an acceptable design point?
>>>>         
>>> I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a  
>>> cpu hog running in each group, how would the algorithm divide resources?
>>>
>>>       
>> As per the matrix calculation, but as soon as we reach an idle point,
>> we redistribute the b/w and start a new quantum so to speak, where all
>> groups are charged up to their hard limits.
>>     
>
> But could there be client models where you are required to strictly
> adhere to the limit within the bandwidth and not provide more (by advancing
> the bandwidth period) in the presence of idle cycles ?
>   

That's the limit part.  I'd like to be able to specify limits and 
guarantees on the same host and for the same groups; I don't think that 
works when you advance the bandwidth period.

I think we need to treat guarantees as first-class goals, not something 
derived from limits (in fact I think guarantees are more useful as they 
can be used to provide SLAs).

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  5:27                 ` Balbir Singh
  2009-06-05  5:31                   ` Bharata B Rao
       [not found]                   ` <20090605052755.GE11755-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
@ 2009-06-05  6:03                   ` Avi Kivity
  2009-06-05  6:32                     ` Bharata B Rao
       [not found]                     ` <4A28B539.3050001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2 siblings, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05  6:03 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:
>> I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a  
>> cpu hog running in each group, how would the algorithm divide resources?
>>
>>     
>
> As per the matrix calculation, but as soon as we reach an idle point,
> we redistribute the b/w and start a new quantum so to speak, where all
> groups are charged up to their hard limits.
>
> For your question, if there is a CPU hog running, it would be as per
> the matrix calculation, since the system has no idle point during the
> bandwidth period.
>   

So the groups with guarantees get a priority boost.  That's not a good 
side effect.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  6:03                   ` Avi Kivity
@ 2009-06-05  6:32                     ` Bharata B Rao
       [not found]                       ` <20090605063243.GC3872-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
  2009-06-05 12:57                       ` Avi Kivity
       [not found]                     ` <4A28B539.3050001-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 2 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-05  6:32 UTC (permalink / raw)
  To: Avi Kivity
  Cc: balbir, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

On Fri, Jun 05, 2009 at 09:03:37AM +0300, Avi Kivity wrote:
> Balbir Singh wrote:
>>> I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), 
>>> and a  cpu hog running in each group, how would the algorithm divide 
>>> resources?
>>>
>>>     
>>
>> As per the matrix calculation, but as soon as we reach an idle point,
>> we redistribute the b/w and start a new quantum so to speak, where all
>> groups are charged up to their hard limits.
>>
>> For your question, if there is a CPU hog running, it would be as per
>> the matrix calculation, since the system has no idle point during the
>> bandwidth period.
>>   
>
> So the groups with guarantees get a priority boost.  That's not a good  
> side effect.

That happens only in the presence of idle cycles when other groups [with or
without guarantees] have nothing useful to do. So how would that matter,
since there is nothing else to run anyway?

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
@ 2009-06-05  8:16                           ` Bharata B Rao
  0 siblings, 0 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-05  8:16 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Balbir Singh, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 09:01:50AM +0300, Avi Kivity wrote:
> Bharata B Rao wrote:
>>
>> But could there be client models where you are required to strictly
>> adhere to the limit within the bandwidth and not provide more (by advancing
>> the bandwidth period) in the presence of idle cycles ?
>>   
>
> That's the limit part.  I'd like to be able to specify limits and  
> guarantees on the same host and for the same groups; I don't think that  
> works when you advance the bandwidth period.
>
> I think we need to treat guarantees as first-class goals, not something  
> derived from limits (in fact I think guarantees are more useful as they  
> can be used to provide SLAs).

I agree that guarantees are important, but I am not sure about

1. specifying both limits and guarantees for groups and
2. not deriving guarantees from limits.

Guarantees are met by some form of throttling or limiting and hence I think
limiting should drive the guarantees.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-04  5:36 [RFC] CPU hard limits Bharata B Rao
  2009-06-04 12:19 ` Avi Kivity
       [not found] ` <20090604053649.GA3701-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
@ 2009-06-05  8:53 ` Paul Menage
  2009-06-05  9:27   ` Bharata B Rao
                     ` (4 more replies)
  2009-06-05  9:02 ` Reinhard Tartler
  3 siblings, 5 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05  8:53 UTC (permalink / raw)
  To: bharata
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Wed, Jun 3, 2009 at 10:36 PM, Bharata B
Rao<bharata@linux.vnet.ibm.com> wrote:
> - Hard limits can be used to provide guarantees.
>

This claim (and the subsequent long thread it generated on how limits
can provide guarantees) confused me a bit.

Why do we need limits to provide guarantees when we can already
provide guarantees via shares?

Suppose 10 cgroups each want 10% of the machine's CPU. We can just
give each cgroup an equal share, and they're guaranteed 10% if they
try to use it; if they don't use it, other cgroups can get access to
the idle cycles.

Suppose cgroup A wants a guarantee of 50% and two others, B and C,
want guarantees of 15% each; give A 50 shares and B and C 15 shares
each. In this case, if they all run flat out they'll get 62%/19%/19%,
which is within their SLA.

That's not to say that hard limits can't be useful in their own right
- e.g. for providing reproducible loadtesting conditions by
controlling how much CPU a service can use during the load test. But I
don't see why using them to implement guarantees is either necessary
or desirable.

(Unless I'm missing some crucial point ...)

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread
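
Paul's figures follow from the proportional-share rule that each
runnable group receives shares_i / sum(shares) of the CPU. A small
user-space sketch of that arithmetic (illustrative only, not the CFS
code itself):

#include <stdio.h>

int main(void)
{
        unsigned int shares[3] = { 50, 15, 15 };  /* cgroups A, B, C */
        unsigned int total = 0;
        int i;

        for (i = 0; i < 3; i++)
                total += shares[i];
        /* all three groups running flat out */
        for (i = 0; i < 3; i++)
                printf("group %c: %.1f%%\n", 'A' + i,
                       100.0 * shares[i] / total);
        return 0;
}

This prints 62.5%, 18.8% and 18.8%, i.e. the 62%/19%/19% quoted above.
Each figure stays above its guarantee only while the set of groups is
fixed, which is the objection raised in the replies that follow.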

* Re: [RFC] CPU hard limits
  2009-06-04  5:36 [RFC] CPU hard limits Bharata B Rao
                   ` (2 preceding siblings ...)
  2009-06-05  8:53 ` Paul Menage
@ 2009-06-05  9:02 ` Reinhard Tartler
  3 siblings, 0 replies; 107+ messages in thread
From: Reinhard Tartler @ 2009-06-05  9:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: kvm

Bharata B Rao <bharata@linux.vnet.ibm.com> writes:

> 4. Existing solutions
> ---------------------
> - Both Linux-VServer and OpenVZ virtualization solutions support CPU hard
>   limiting.
> - Per task limit can be enforced using rlimits, but it is not rate
> based.

Also related work:

http://www.usenix.org/events/osdi99/full_papers/banga/banga.pdf

It even had a preliminary Linux implementation from 2003, which was
proposed at the time but wasn't taken up:

http://admingilde.org/~martin/rc/

Maybe someone wants to pick up that work?


-- 
Gruesse/greetings,
Reinhard Tartler, KeyID 945348A4


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  5:31                   ` Bharata B Rao
  2009-06-05  6:01                     ` Avi Kivity
       [not found]                     ` <20090605053159.GB3872-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
@ 2009-06-05  9:24                     ` Balbir Singh
  2 siblings, 0 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  9:24 UTC (permalink / raw)
  To: Bharata B Rao
  Cc: Avi Kivity, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Bharata B Rao <bharata@linux.vnet.ibm.com> [2009-06-05 11:01:59]:

> On Fri, Jun 05, 2009 at 01:27:55PM +0800, Balbir Singh wrote:
> > * Avi Kivity <avi@redhat.com> [2009-06-05 08:21:43]:
> > 
> > > Balbir Singh wrote:
> > >>> But then there is no other way to make a *guarantee*, guarantees come
> > >>> at a cost of idling resources, no? Can you show me any other
> > >>> combination that will provide the guarantee and without idling the
> > >>> system for the specified guarantees?
> > >>>     
> > >>
> > >> OK, I see part of your concern, but I think we could do some
> > >> optimizations during design. For example if all groups have reached
> > >> their hard-limit and the system is idle, should we do start a new hard
> > >> limit interval and restart, so that idleness can be removed. Would
> > >> that be an acceptable design point?
> > >
> > > I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), and a  
> > > cpu hog running in each group, how would the algorithm divide resources?
> > >
> > 
> > As per the matrix calculation, but as soon as we reach an idle point,
> > we redistribute the b/w and start a new quantum so to speak, where all
> > groups are charged up to their hard limits.
> 
> But could there be client models where you are required to strictly
> adhere to the limit within the bandwidth and not provide more (by advancing
> the bandwidth period) in the presence of idle cycles ?
>

Good point, I think so. There should be a good default, with the
other case configurable.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread
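
A purely hypothetical sketch of the "good default plus configurable"
idea above, as a per-group flag; the struct and field names here are
invented for illustration and do not come from any proposed patch:

#include <linux/types.h>        /* u64, in a kernel context */

/*
 * strict == 1: the limit is absolute; once the quota for the current
 *              period is used, the group stays throttled until the
 *              period ends, even if the CPU would otherwise idle
 *              (e.g. pay-per-use billing).
 * strict == 0: if every runnable group is throttled and the CPU is
 *              about to idle, start the next bandwidth period early
 *              so the idle time is not wasted.
 */
struct hard_limit_bw {
        u64 quota;      /* allowed CPU time per bandwidth period */
        u64 period;     /* length of the bandwidth period */
        int strict;     /* 0 = may advance the period when idle */
};

Which of the two behaviours should be the default is exactly the open
question here.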

* Re: [RFC] CPU hard limits
  2009-06-05  8:53 ` Paul Menage
@ 2009-06-05  9:27   ` Bharata B Rao
  2009-06-05  9:32     ` Paul Menage
       [not found]     ` <20090605092733.GA27486-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
       [not found]   ` <6599ad830906050153i1afd104fqe70f681317349142-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-05  9:27 UTC (permalink / raw)
  To: Paul Menage
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 01:53:15AM -0700, Paul Menage wrote:
> On Wed, Jun 3, 2009 at 10:36 PM, Bharata B
> Rao<bharata@linux.vnet.ibm.com> wrote:
> > - Hard limits can be used to provide guarantees.
> >
> 
> This claim (and the subsequent long thread it generated on how limits
> can provide guarantees) confused me a bit.
> 
> Why do we need limits to provide guarantees when we can already
> provide guarantees via shares?

shares design is proportional and hence it can't by itself provide
guarantees.

> 
> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
> give each cgroup an equal share, and they're guaranteed 10% if they
> try to use it; if they don't use it, other cgroups can get access to
> the idle cycles.

Now if 11th group with same shares comes in, then each group will now
get 9% of CPU and that 10% guarantee breaks.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 107+ messages in thread
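
The dilution is plain proportional-share arithmetic: n equal-share
runnable groups get 1/n of the CPU each, so 10 groups get
100%/10 = 10%, but 11 groups get 100%/11, roughly 9.1%, and the
nominal 10% "guarantee" silently shrinks as groups are added (with 15
groups, as in a later reply, it drops to about 6.7% each).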

* Re: [RFC] CPU hard limits
  2009-06-05  9:27   ` Bharata B Rao
@ 2009-06-05  9:32     ` Paul Menage
       [not found]       ` <6599ad830906050232n11aa30d8xfcda0a279a482f32-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-06-05  9:48       ` Dhaval Giani
       [not found]     ` <20090605092733.GA27486-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
  1 sibling, 2 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05  9:32 UTC (permalink / raw)
  To: bharata
  Cc: linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:27 AM, Bharata B Rao<bharata@linux.vnet.ibm.com> wrote:
>>
>> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
>> give each cgroup an equal share, and they're guaranteed 10% if they
>> try to use it; if they don't use it, other cgroups can get access to
>> the idle cycles.
>
> Now if 11th group with same shares comes in, then each group will now
> get 9% of CPU and that 10% guarantee breaks.

So you're trying to guarantee 11 cgroups that they can each get 10% of
the CPU? That's called over-committing, and while there's nothing
wrong with doing that if you're confident that they'll not all need
their 10% at the same time, there's no way to *guarantee* them all
10%. You can guarantee them all 9% and hope the extra 1% is spare for
those that need it (over-committing), or you can guarantee 10 of them
10% and give the last one 0 shares.

How would you propose to guarantee 11 cgroups each 10% of the CPU
using hard limits?

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  8:53 ` Paul Menage
  2009-06-05  9:27   ` Bharata B Rao
       [not found]   ` <6599ad830906050153i1afd104fqe70f681317349142-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-06-05  9:36   ` Balbir Singh
       [not found]     ` <20090605093625.GI11755-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
  2009-06-05 11:32   ` Srivatsa Vaddagiri
  2009-06-05 13:02   ` Avi Kivity
  4 siblings, 1 reply; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  9:36 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

* menage@google.com <menage@google.com> [2009-06-05 01:53:15]:

> On Wed, Jun 3, 2009 at 10:36 PM, Bharata B
> Rao<bharata@linux.vnet.ibm.com> wrote:
> > - Hard limits can be used to provide guarantees.
> >
> 
> This claim (and the subsequent long thread it generated on how limits
> can provide guarantees) confused me a bit.
> 
> Why do we need limits to provide guarantees when we can already
> provide guarantees via shares?
> 
> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
> give each cgroup an equal share, and they're guaranteed 10% if they
> try to use it; if they don't use it, other cgroups can get access to
> the idle cycles.
> 
> Suppose cgroup A wants a guarantee of 50% and two others, B and C,
> want guarantees of 15% each; give A 50 shares and B and C 15 shares
> each. In this case, if they all run flat out they'll get 62%/19%/19%,
> which is within their SLA.
> 
> That's not to say that hard limits can't be useful in their own right
> - e.g. for providing reproducible loadtesting conditions by
> controlling how much CPU a service can use during the load test. But I
> don't see why using them to implement guarantees is either necessary
> or desirable.
> 
> (Unless I'm missing some crucial point ...)

The important scenario I have is adding and removing groups.

Consider 10 cgroups with shares of 10 each; what if 5 new ones are
created with the same shares? Each group now gets only 100/15 (about
6.7%), even though we did not change our shares.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  6:01                     ` Avi Kivity
@ 2009-06-05  9:39                       ` Balbir Singh
  2009-06-05 13:14                         ` Avi Kivity
  1 sibling, 2 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  9:39 UTC (permalink / raw)
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

* Avi Kivity <avi@redhat.com> [2009-06-05 09:01:50]:

> Bharata B Rao wrote:
>> On Fri, Jun 05, 2009 at 01:27:55PM +0800, Balbir Singh wrote:
>>   
>>> * Avi Kivity <avi@redhat.com> [2009-06-05 08:21:43]:
>>>
>>>     
>>>> Balbir Singh wrote:
>>>>       
>>>>>> But then there is no other way to make a *guarantee*, guarantees come
>>>>>> at a cost of idling resources, no? Can you show me any other
>>>>>> combination that will provide the guarantee without idling the
>>>>>> system for the specified guarantees?
>>>>>>               
>>>>> OK, I see part of your concern, but I think we could do some
>>>>> optimizations during design. For example if all groups have reached
>>>>> their hard-limit and the system is idle, should we start a new hard
>>>>> limit interval and restart, so that the idleness can be removed? Would
>>>>> that be an acceptable design point?
>>>>>         
>>>> I think so.  Given guarantees G1..Gn (0 <= Gi <= 1; sum(Gi) <= 1), 
>>>> and a  cpu hog running in each group, how would the algorithm 
>>>> divide resources?
>>>>
>>>>       
>>> As per the matrix calculation, but as soon as we reach an idle point,
>>> we redistribute the b/w and start a new quantum so to speak, where all
>>> groups are charged up to their hard limits.
>>>     
>>
>> But could there be client models where you are required to strictly
>> adhere to the limit within the bandwidth and not provide more (by advancing
>> the bandwidth period) in the presence of idle cycles?
>>   
>
> That's the limit part.  I'd like to be able to specify limits and  
> guarantees on the same host and for the same groups; I don't think that  
> works when you advance the bandwidth period.

Yes, this feature needs to be configurable. But your use case for both
limits and guarantees is interesting. We spoke to Peter and he was
convinced only of the guarantee use case. Could you please help
elaborate your use case, so that we can incorporate it into the RFC v2
we send out? Peter is opposed to having hard limits and is convinced
that they are not generally useful; so far I have seen you and Paul
say it is useful, so any arguments or any +1 from you will help us.
Peter, I am not back-stabbing you :)


>
> I think we need to treat guarantees as first-class goals, not something  
> derived from limits (in fact I think guarantees are more useful as they  
> can be used to provide SLAs).

Even limits are useful for SLAs, since the available bandwidth changes
quite drastically as we add or remove groups. There are other use
cases for limits as well.
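
A minimal sketch of the period-advance optimization discussed in the
quoted exchange above, as a toy userspace model: struct group, its
quota/runtime fields and maybe_advance_period() are all illustrative
assumptions, not the RFC's actual data structures:

#include <stdio.h>
#include <stdbool.h>

/* Toy model: each group has a quota (ms of CPU per bandwidth
 * period) and a runtime counter; it is throttled once the counter
 * reaches the quota. */
struct group {
	int quota_ms;
	int runtime_ms;
};

static bool throttled(const struct group *g)
{
	return g->runtime_ms >= g->quota_ms;
}

static bool all_throttled(const struct group *g, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (!throttled(&g[i]))
			return false;
	return true;
}

/* Normally the period timer resets the runtime counters; the
 * optimization is to reset early when every group is throttled
 * and the CPU would otherwise sit idle. */
static void maybe_advance_period(struct group *g, int n)
{
	int i;

	if (!all_throttled(g, n))
		return;
	for (i = 0; i < n; i++)
		g[i].runtime_ms = 0;	/* start a new quantum early */
}

int main(void)
{
	struct group g[2] = { { 30, 30 }, { 20, 20 } };

	maybe_advance_period(g, 2);
	printf("group0 runtime after early refresh: %d ms\n",
	       g[0].runtime_ms);
	return 0;
}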


-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:32     ` Paul Menage
@ 2009-06-05  9:48       ` Dhaval Giani
  2009-06-05  9:51         ` Paul Menage
  1 sibling, 2 replies; 107+ messages in thread
From: Dhaval Giani @ 2009-06-05  9:48 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Balbir Singh, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 02:32:51AM -0700, Paul Menage wrote:
> On Fri, Jun 5, 2009 at 2:27 AM, Bharata B Rao<bharata@linux.vnet.ibm.com> wrote:
> >>
> >> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
> >> give each cgroup an equal share, and they're guaranteed 10% if they
> >> try to use it; if they don't use it, other cgroups can get access to
> >> the idle cycles.
> >
> > Now if 11th group with same shares comes in, then each group will now
> > get 9% of CPU and that 10% guarantee breaks.
> 
> So you're trying to guarantee 11 cgroups that they can each get 10% of
> the CPU? That's called over-committing, and while there's nothing
> wrong with doing that if you're confident that they'll not all need
> their 10% at the same time, there's no way to *guarantee* them all
> 10%. You can guarantee them all 9% and hope the extra 1% is spare for
> those that need it (over-committing), or you can guarantee 10 of them
> 10% and give the last one 0 shares.
> 
> How would you propose to guarantee 11 cgroups each 10% of the CPU
> using hard limits?
> 

You cannot guarantee 10% to 11 groups on any system (unless I am missing
something). The sum of guarantees cannot exceed 100%.

How would you be able to do that with any other mechanism?
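
That invariant is easy to state as an admission check. A sketch,
where admit_guarantee() is an illustrative helper rather than any
existing kernel or libcgroup API:

#include <stdio.h>

/* Admit a new guarantee only if the sum of all guarantees,
 * including the new request, stays within 100% of the machine. */
static int admit_guarantee(const int *pct, int n, int new_pct)
{
	int i, sum = new_pct;

	for (i = 0; i < n; i++)
		sum += pct[i];
	return sum <= 100;
}

int main(void)
{
	int current[10] = { 10, 10, 10, 10, 10, 10, 10, 10, 10, 10 };

	/* An 11th request for 10% must be refused: 110% > 100%. */
	printf("admit an 11th 10%% group: %s\n",
	       admit_guarantee(current, 10, 10) ? "yes" : "no");
	return 0;
}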

Thanks,
-- 
regards,
Dhaval

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
@ 2009-06-05  9:48         ` Paul Menage
  0 siblings, 0 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05  9:48 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:36 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
>
> The important scenario I have is adding and removing groups.
>
> Consider 10 cgroups with shares of 10 each, what if 5 new are created
> with the same shares? We now start getting 100/15, even though we did
> not change our shares.

Are you assuming that arbitrary users can create new cgroups whenever
they like, with whatever shares they like? In that situation, how
would you use hard limits to provide guarantees? Presumably if the
user could create a cgroup with an arbitrary share, they could create
one with an arbitrary hard limit too.

Can you explain a bit more about how you're envisaging cgroups being
created, and how their shares and hard limits would get set? I was
working on the assumption that (for any sub-tree of the CFS hierarchy)
there's a single managing entity that gets to decide the shares given
to the cgroups within that tree. That managing entity would be
responsible for ensuring that the shares given out allowed guarantees
to be met (or alternatively, that the probability of violating those
guarantees based on the shares given out was within some tolerance
threshold).
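
As a sketch of the check such a managing entity might run, assume it
hands out shares in proportion to the promised percentages and then
verifies the worst-case allocation when everyone runs flat out;
guarantees_hold() is illustrative only:

#include <stdio.h>

/* Worst case for group i is when every group runs flat out: it
 * then gets shares[i] / sum(shares) of the CPU.  The guarantee
 * holds if that floor still meets the promised percentage. */
static int guarantees_hold(const int *shares, const int *want_pct, int n)
{
	int i, total = 0;

	for (i = 0; i < n; i++)
		total += shares[i];
	for (i = 0; i < n; i++)
		if (100 * shares[i] < want_pct[i] * total)
			return 0;
	return 1;
}

int main(void)
{
	/* A wants 50%, B and C want 15% each: shares of 50/15/15
	 * give worst-case floors of 62.5%/18.75%/18.75%. */
	int shares[] = { 50, 15, 15 };
	int want[] = { 50, 15, 15 };

	printf("guarantees hold: %s\n",
	       guarantees_hold(shares, want, 3) ? "yes" : "no");
	return 0;
}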

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:48       ` Dhaval Giani
@ 2009-06-05  9:51         ` Paul Menage
  2009-06-05  9:59           ` Dhaval Giani
  1 sibling, 2 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05  9:51 UTC (permalink / raw)
  To: Dhaval Giani
  Cc: bharata, linux-kernel, Balbir Singh, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:48 AM, Dhaval Giani<dhaval@linux.vnet.ibm.com> wrote:
>> > Now if 11th group with same shares comes in, then each group will now
>> > get 9% of CPU and that 10% guarantee breaks.
>>
>> So you're trying to guarantee 11 cgroups that they can each get 10% of
>> the CPU? That's called over-committing, and while there's nothing
>> wrong with doing that if you're confident that they'll not all need
>> their 10% at the same time, there's no way to *guarantee* them all
>> 10%. You can guarantee them all 9% and hope the extra 1% is spare for
>> those that need it (over-committing), or you can guarantee 10 of them
>> 10% and give the last one 0 shares.
>>
>> How would you propose to guarantee 11 cgroups each 10% of the CPU
>> using hard limits?
>>
>
> You cannot guarantee 10% to 11 groups on any system (unless I am missing
> something). The sum of guarantees cannot exceed 100%.

That's exactly my point. I was trying to counter Bharata's statement, which was:

> > Now if 11th group with same shares comes in, then each group will now
> > get 9% of CPU and that 10% guarantee breaks.

which seemed to be implying that this was a drawback of using shares
to implement guarantees.

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:48         ` Paul Menage
@ 2009-06-05  9:55         ` Balbir Singh
  2009-06-05  9:57           ` Paul Menage
                             ` (2 more replies)
  -1 siblings, 3 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05  9:55 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

* menage@google.com <menage@google.com> [2009-06-05 02:48:36]:

> On Fri, Jun 5, 2009 at 2:36 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
> >
> > The important scenario I have is adding and removing groups.
> >
> > Consider 10 cgroups with shares of 10 each, what if 5 new are created
> > with the same shares? We now start getting 100/15, even though we did
> > not change our shares.
> 
> Are you assuming that arbitrary users can create new cgroups whenever
> they like, with whatever shares they like? In that situation, how
> would you use hard limits to provide guarantees? Presumably if the
> user could create a cgroup with an arbitrary share, they could create
> one with an arbitrary hard limit too.
>

What about applications running as root that can create their own
groups? What about multiple instances of the same application being started?
Do applications need to know that creating a group will hurt
guarantees provided to others?
 
> Can you explain a bit more about how you're envisaging cgroups being
> created, and how their shares and hard limits would get set? I was
> working on the assumption that (for any sub-tree of the CFS hierarchy)
> there's a single managing entity that gets to decide the shares given
> to the cgroups within that tree. That managing entity would be
> responsible for ensuring that the shares given out allowed guarantees
> to be met (or alternatively, that the probability of violating those
> guarantees based on the shares given out was within some tolerance
> threshold).
>

The point is that there is no single control entity for creating
groups. If you run a solution, it might create groups without telling the
user. No one is arbitrating, not even libcgroup. What if someone
changes the cpuset assignment and moves CPUs x to y in an exclusive
cpuset all of a sudden? How do we arbitrate?

 
-- 
	Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:55         ` Balbir Singh
@ 2009-06-05  9:57           ` Paul Menage
  2009-06-05 10:02           ` Paul Menage
  2 siblings, 0 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05  9:57 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:55 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
> The point is that there is no single control entity for creating
> groups. If you run a solution, it might create groups without telling the
> user. No one is arbitrating, not even libcgroup. What if someone
> changes the cpuset assignment and moves CPUs x to y in an exclusive
> cpuset all of a sudden? How do we arbitrate?

But in that situation, how do hard limits help? If you can't control
when cgroups are being created, and you can't control their shares,
how are you going to control their hard limits?

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:51         ` Paul Menage
@ 2009-06-05  9:59           ` Dhaval Giani
  2009-06-05 10:03             ` Paul Menage
  1 sibling, 2 replies; 107+ messages in thread
From: Dhaval Giani @ 2009-06-05  9:59 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Balbir Singh, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 02:51:18AM -0700, Paul Menage wrote:
> On Fri, Jun 5, 2009 at 2:48 AM, Dhaval Giani<dhaval@linux.vnet.ibm.com> wrote:
> >> > Now if 11th group with same shares comes in, then each group will now
> >> > get 9% of CPU and that 10% guarantee breaks.
> >>
> >> So you're trying to guarantee 11 cgroups that they can each get 10% of
> >> the CPU? That's called over-committing, and while there's nothing
> >> wrong with doing that if you're confident that they'll not all need
> >> their 10% at the same time, there's no way to *guarantee* them all
> >> 10%. You can guarantee them all 9% and hope the extra 1% is spare for
> >> those that need it (over-committing), or you can guarantee 10 of them
> >> 10% and give the last one 0 shares.
> >>
> >> How would you propose to guarantee 11 cgroups each 10% of the CPU
> >> using hard limits?
> >>
> >
> > You cannot guarantee 10% to 11 groups on any system (unless I am missing
> > something). The sum of guarantees cannot exceed 100%.
> 
> That's exactly my point. I was trying to counter Bharata's statement, which was:
> 
> > > Now if 11th group with same shares comes in, then each group will now
> > > get 9% of CPU and that 10% guarantee breaks.
> 
> which seemed to be implying that this was a drawback of using shares
> to implement guarantees.
> 

OK :). Glad to see I did not get it wrong.

I think we are focusing on the wrong use case here. Guarantees are just a
useful side-effect we get by using hard limits. I think the more
important use case is where the provider wants to limit the amount of
time a user gets (such as in a cloud).

Maybe we should direct our attention to solving that problem? :)

thanks,
-- 
regards,
Dhaval

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:55         ` Balbir Singh
  2009-06-05  9:57           ` Paul Menage
@ 2009-06-05 10:02           ` Paul Menage
  2 siblings, 0 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05 10:02 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:55 AM, Balbir Singh<balbir@linux.vnet.ibm.com> wrote:
>
> What about applications running as root that can create their own
> groups? What about multiple instances of the same application being started?
> Do applications need to know that creating a group will hurt
> guarantees provided to others?

Yes, of course. If you're handing out guarantees, but other users
can/will create cgroups with whatever parameters they like and won't
respect the guarantees that you've made, then those guarantees are
worthless. How do hard limits help in that situation?

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:59           ` Dhaval Giani
@ 2009-06-05 10:03             ` Paul Menage
       [not found]               ` <6599ad830906050303r404c325anc60ded4f45a50b95-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-06-08  8:50               ` Pavel Emelyanov
  1 sibling, 2 replies; 107+ messages in thread
From: Paul Menage @ 2009-06-05 10:03 UTC (permalink / raw)
  To: Dhaval Giani
  Cc: bharata, linux-kernel, Balbir Singh, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 2:59 AM, Dhaval Giani<dhaval@linux.vnet.ibm.com> wrote:
>
> I think we are focusing on the wrong use case here. Guarantees are just a
> useful side-effect we get by using hard limits. I think the more
> important use case is where the provider wants to limit the amount of
> time a user gets (such as in a cloud).
>
> Maybe we should direct our attention to solving that problem? :)
>

Yes, that case and the "predictable load test behaviour" case are both
good reasons for hard limits.

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  8:53 ` Paul Menage
                     ` (2 preceding siblings ...)
  2009-06-05  9:36   ` Balbir Singh
@ 2009-06-05 11:32   ` Srivatsa Vaddagiri
                       ` (2 more replies)
  2009-06-05 13:02   ` Avi Kivity
  4 siblings, 3 replies; 107+ messages in thread
From: Srivatsa Vaddagiri @ 2009-06-05 11:32 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 01:53:15AM -0700, Paul Menage wrote:
> This claim (and the subsequent long thread it generated on how limits
> can provide guarantees) confused me a bit.
> 
> Why do we need limits to provide guarantees when we can already
> provide guarantees via shares?

I think the interval over which we need the guarantee matters here. Shares
can generally provide a guaranteed share of a resource over longer (sometimes
minutes-long) intervals. For high-priority bursty workloads, the latency in
achieving the guaranteed resource usage matters. By having hard limits, we are
"reserving" (potentially idle) slots where the high-priority group can run and
claim its guaranteed share almost immediately.
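
For contrast, the limit-based construction can be written down
directly: to guarantee group i a fraction G[i], cap the other groups
so that their limits sum to 1 - G[i]. One possible assignment is
sketched below; limits_from_guarantees() and the numbers are
illustrative, and the output also shows the idling cost (no group may
exceed its cap even when the CPU is otherwise free):

#include <stdio.h>

/* Give each group its guarantee plus an equal slice of the
 * unreserved bandwidth (assumes n > 1 and sum(g) <= 1).  For any
 * group i, the other limits then sum to exactly 1 - g[i], so i
 * can always claim its guaranteed fraction. */
static void limits_from_guarantees(const double *g, double *l, int n)
{
	double sum = 0.0, slack;
	int i;

	for (i = 0; i < n; i++)
		sum += g[i];
	slack = (1.0 - sum) / (n - 1);
	for (i = 0; i < n; i++)
		l[i] = g[i] + slack;
}

int main(void)
{
	/* Guarantees of 10%, 10% and 0% on one CPU. */
	double g[3] = { 0.10, 0.10, 0.0 }, l[3];
	int i;

	limits_from_guarantees(g, l, 3);
	for (i = 0; i < 3; i++)
		printf("group %d: limit %.0f%%\n", i, l[i] * 100);
	return 0;
}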

- vatsa

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05 11:32   ` Srivatsa Vaddagiri
@ 2009-06-05 12:18     ` Paul Menage
  2009-06-05 14:44     ` Chris Friesen
  2 siblings, 1 reply; 107+ messages in thread
From: Paul Menage @ 2009-06-05 12:18 UTC (permalink / raw)
  To: vatsa
  Cc: bharata, linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 5, 2009 at 4:32 AM, Srivatsa Vaddagiri<vatsa@in.ibm.com> wrote:
>
> I think the interval over which we need guarantee matters here. Shares
> can generally provide guaranteed share of resource over longer (sometimes
> minutes) intervals. For high-priority bursty workloads, the latency in
> achieving guaranteed resource usage matters.

Well yes, it's true that you *could* just enforce shares over a
granularity of minutes, and limits over a granularity of milliseconds.
But why would you? It could well make sense that you can adjust the
granularity over which shares are enforced - e.g. for batch jobs, only
enforcing over minutes or tens of seconds might be fine. But if you're
doing the fine-grained accounting and scheduling required for the
tight hard limit enforcement, it doesn't seem as though it should be
much harder to enforce shares at the same granularity for those
cgroups that matter. In fact I thought that's what CFS already did -
updated the virtual time accounting at each context switch, and picked
the runnable child with the oldest virtual time. (Maybe someone like
Ingo or Peter who's more familiar than I with the CFS implementation
could comment here?)
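
(A toy model of that pick-next rule, for reference: CFS actually keeps
runnable entities sorted by virtual runtime in a red-black tree and
picks the leftmost; the linear scan below is an illustrative
simplification.)

#include <stdio.h>

struct entity {
	const char *name;
	unsigned long long vruntime;	/* weighted virtual time used */
};

/* Run whichever entity has accumulated the least virtual time. */
static struct entity *pick_next(struct entity *e, int n)
{
	struct entity *best = &e[0];
	int i;

	for (i = 1; i < n; i++)
		if (e[i].vruntime < best->vruntime)
			best = &e[i];
	return best;
}

int main(void)
{
	struct entity e[] = {
		{ "A", 1000 }, { "B", 400 }, { "C", 700 },
	};

	printf("next: %s\n", pick_next(e, 3)->name);	/* B */
	return 0;
}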

> By having hard-limits, we are
> "reserving" (potentially idle) slots where the high-priority group can run and
> claim its guaranteed share almost immediately.

But you can always create an "idle" slot by forcibly preempting
whatever's running currently when you need to - you don't need to keep
the CPU deliberately idle just in case a cgroup with a guarantee wakes
up.

Paul

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  6:32                     ` Bharata B Rao
@ 2009-06-05 12:57                       ` Avi Kivity
  1 sibling, 0 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05 12:57 UTC (permalink / raw)
  To: bharata
  Cc: balbir, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Bharata B Rao wrote:
>> So the groups with guarantees get a priority boost.  That's not a good  
>> side effect.
>>     
>
> That happens only in the presence of idle cycles when other groups [with or
> without guarantees] have nothing useful to do. So how would that matter
> since there is nothing else to run anyway?
>   

If there are three groups, each running a cpu hog, and they have (say) 
guarantees of 10%, 10%, and 0%, then they should each get 33% of the 
cpu, not biased towards the groups with the guarantee.

If I want to change the weights, I'll alter their priority.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  8:53 ` Paul Menage
                     ` (3 preceding siblings ...)
  2009-06-05 11:32   ` Srivatsa Vaddagiri
@ 2009-06-05 13:02   ` Avi Kivity
  2009-06-05 13:43     ` Dhaval Giani
  4 siblings, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05 13:02 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Paul Menage wrote:
> On Wed, Jun 3, 2009 at 10:36 PM, Bharata B
> Rao<bharata@linux.vnet.ibm.com> wrote:
>   
>> - Hard limits can be used to provide guarantees.
>>
>>     
>
> This claim (and the subsequent long thread it generated on how limits
> can provide guarantees) confused me a bit.
>
> Why do we need limits to provide guarantees when we can already
> provide guarantees via shares?
>
> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
> give each cgroup an equal share, and they're guaranteed 10% if they
> try to use it; if they don't use it, other cgroups can get access to
> the idle cycles.
>
> Suppose cgroup A wants a guarantee of 50% and two others, B and C,
> want guarantees of 15% each; give A 50 shares and B and C 15 shares
> each. In this case, if they all run flat out they'll get 62%/19%/19%,
> which is within their SLA.
>
> That's not to say that hard limits can't be useful in their own right
> - e.g. for providing reproducible loadtesting conditions by
> controlling how much CPU a service can use during the load test. But I
> don't see why using them to implement guarantees is either necessary
> or desirable.
>
> (Unless I'm missing some crucial point ...)
>   

How many shares does a cgroup with a 0% guarantee get?

Ideally, the scheduler would hand out cpu time according to weight and 
demand, then clamp over-demand by a cgroup's limit and boost the share 
to meet guarantees.
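
A sketch of that allocation order with illustrative numbers; struct
grp and allocate() are assumptions for the example, and for brevity
the surplus freed by clamping is not redistributed, which a real
scheduler would have to do:

#include <stdio.h>

struct grp {
	double weight, limit, guarantee, alloc;	/* fractions of 1 CPU */
};

/* Weight-proportional split first, then clamp each group at its
 * limit, then boost each group up to its guarantee. */
static void allocate(struct grp *g, int n)
{
	double wsum = 0.0;
	int i;

	for (i = 0; i < n; i++)
		wsum += g[i].weight;
	for (i = 0; i < n; i++) {
		g[i].alloc = g[i].weight / wsum;
		if (g[i].alloc > g[i].limit)
			g[i].alloc = g[i].limit;	/* clamp */
		if (g[i].alloc < g[i].guarantee)
			g[i].alloc = g[i].guarantee;	/* boost */
	}
}

int main(void)
{
	struct grp g[] = {
		{ 1.0, 0.40, 0.10, 0 },
		{ 1.0, 1.00, 0.00, 0 },
		{ 2.0, 0.30, 0.25, 0 },
	};
	int i;

	allocate(g, 3);
	for (i = 0; i < 3; i++)
		printf("group %d gets %.0f%%\n", i, g[i].alloc * 100);
	return 0;
}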

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05  9:39                       ` Balbir Singh
@ 2009-06-05 13:14                         ` Avi Kivity
  2009-06-05 14:54                           ` Chris Friesen
  1 sibling, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-05 13:14 UTC (permalink / raw)
  To: balbir
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:
>> That's the limit part.  I'd like to be able to specify limits and  
>> guarantees on the same host and for the same groups; I don't think that  
>> works when you advance the bandwidth period.
>>     
>
> Yes, this feature needs to be configurable. But your use case for both
> limits and guarantees is interesting. We spoke to Peter and he was
> convinced only of the guarantee use case. Could you please help
> elaborate your use case, so that we can incorporate it into the RFC v2
> we send out? Peter is opposed to having hard limits and is convinced
> that they are not generally useful; so far I have seen you and Paul
> say it is useful, so any arguments or any +1 from you will help us.
> Peter, I am not back-stabbing you :)
>   

I am selling virtual private servers.  A 10% cpu share costs $x/month, 
and I guarantee you'll get that 10%, or your money back.  On the other 
hand, I want to limit cpu usage to that 10% (maybe a little more) so 
people don't buy 10% shares and use 100% on my underutilized servers.  
If they want 100%, let them pay for 100%.

>> I think we need to treat guarantees as first-class goals, not something  
>> derived from limits (in fact I think guarantees are more useful as they  
>> can be used to provide SLAs).
>>     
>
> Even limits are useful for SLAs, since the available bandwidth changes
> quite drastically as we add or remove groups. There are other use
> cases for limits as well

SLAs are specified in terms of guarantees on a service, not on limits on 
others.  If we could use limits to provide guarantees, that would be 
fine, but it doesn't quite work out.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
@ 2009-06-05 13:42                               ` Balbir Singh
  0 siblings, 0 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-05 13:42 UTC (permalink / raw)
  To: Avi Kivity
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

On Fri, Jun 5, 2009 at 6:44 PM, Avi Kivity<avi@redhat.com> wrote:
> Balbir Singh wrote:
>>>
>>> That's the limit part.  I'd like to be able to specify limits and
>>>  guarantees on the same host and for the same groups; I don't think that
>>>  works when you advance the bandwidth period.
>>>
>>
>> Yes, this feature needs to be configurable. But your use case for both
>> limits and guarantees is interesting. We spoke to Peter and he was
>> convinced only of the guarantee use case. Could you please help
>> elaborate your use case, so that we can incorporate it into the RFC v2 we
>> send out? Peter is opposed to having hard limits and is convinced that
>> they are not generally useful; so far I have seen you and Paul say they are
>> useful, so any arguments you have or any +1 from you will help us. Peter,
>> I am not backstabbing you :)
>>
>
> I am selling virtual private servers.  A 10% cpu share costs $x/month, and I
> guarantee you'll get that 10%, or your money back.  On the other hand, I
> want to limit cpu usage to that 10% (maybe a little more) so people don't
> buy 10% shares and use 100% on my underutilized servers.  If they want 100%,
> let them pay for 100%.

Excellent examples. We've covered them in the RFC; could you see if we
missed anything in terms of use cases? The real question is whether we care
enough to build a hard limits control into the CFS group scheduler. I
believe we should.

>
>>> I think we need to treat guarantees as first-class goals, not something
>>>  derived from limits (in fact I think guarantees are more useful as they
>>>  can be used to provide SLAs).
>>>
>>
>> Even limits are useful for SLAs, since your available b/w changes
>> quite drastically as we add or remove groups. There are other use
>> cases for limits as well.
>
> SLAs are specified in terms of guarantees on a service, not in terms of
> limits on others.  If we could use limits to provide guarantees, that would
> be fine, but it doesn't quite work out.

To be honest, I would disagree here, specifically if you compare how
you would build guarantees in the kernel with the proposed approach. I
don't want to harp on the technicality, but to point out the feasibility
for people who care about the lower end of the guarantee without
requiring density. I think the real technical discussion should be:
here are the use cases, let's agree on the need for the feature, and
then go ahead and start prototyping it.

Thanks,
Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread
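
To put rough numbers on the pay-per-use model above, here is a minimal
admission-control sketch in Python (the function and names are
hypothetical illustrations, not a proposed kernel interface): a provider
selling (guarantee, limit) pairs must keep the sum of sold guarantees
within machine capacity, while limits merely cap opportunistic use of
idle cycles and can be oversubscribed.

# Hypothetical admission check for the VPS model discussed above.
# Each customer buys (guarantee, limit) as fractions of one machine.
def admit(groups, new_guarantee, new_limit, capacity=1.0):
    """groups: list of (guarantee, limit) pairs already sold."""
    if not (0.0 <= new_guarantee <= new_limit <= capacity):
        raise ValueError("need 0 <= guarantee <= limit <= capacity")
    sold = sum(g for g, _ in groups)
    # A new guarantee is only honest if all guarantees still fit.
    return sold + new_guarantee <= capacity

existing = [(0.10, 0.12), (0.50, 0.50)]   # a 10% and a 50% customer
print(admit(existing, 0.30, 0.35))        # True:  0.6 + 0.3 <= 1.0
print(admit(existing, 0.45, 1.00))        # False: would oversell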

* Re: [RFC] CPU hard limits
  2009-06-05 13:02   ` Avi Kivity
@ 2009-06-05 13:43     ` Dhaval Giani
       [not found]       ` <20090605134320.GA3994-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
  2009-06-05 14:45       ` Chris Friesen
       [not found]     ` <4A291753.7090205-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 2 replies; 107+ messages in thread
From: Dhaval Giani @ 2009-06-05 13:43 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Paul Menage, bharata, linux-kernel, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 04:02:11PM +0300, Avi Kivity wrote:
> Paul Menage wrote:
>> On Wed, Jun 3, 2009 at 10:36 PM, Bharata B
>> Rao<bharata@linux.vnet.ibm.com> wrote:
>>   
>>> - Hard limits can be used to provide guarantees.
>>>
>>>     
>>
>> This claim (and the subsequent long thread it generated on how limits
>> can provide guarantees) confused me a bit.
>>
>> Why do we need limits to provide guarantees when we can already
>> provide guarantees via shares?
>>
>> Suppose 10 cgroups each want 10% of the machine's CPU. We can just
>> give each cgroup an equal share, and they're guaranteed 10% if they
>> try to use it; if they don't use it, other cgroups can get access to
>> the idle cycles.
>>
>> Suppose cgroup A wants a guarantee of 50% and two others, B and C,
>> want guarantees of 15% each; give A 50 shares and B and C 15 shares
>> each. In this case, if they all run flat out they'll get 62%/19%/19%,
>> which is within their SLA.
>>
>> That's not to say that hard limits can't be useful in their own right
>> - e.g. for providing reproducible loadtesting conditions by
>> controlling how much CPU a service can use during the load test. But I
>> don't see why using them to implement guarantees is either necessary
>> or desirable.
>>
>> (Unless I'm missing some crucial point ...)
>>   
>
> How many shares does a cgroup with a 0% guarantee get?
>

Shares cannot be used to provide guarantees. All they decide is what
proportion of CPU time groups get. (Yes, "shares" is a bad name; "weight"
conveys the intent better.)

thanks,
-- 
regards,
Dhaval

^ permalink raw reply	[flat|nested] 107+ messages in thread
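
To put numbers on the shares discussion above: under proportional
shares, the worst-case (all groups busy) fraction for group i is
shares_i / sum(shares). The sketch below (Python; the helper name is
made up for illustration) reproduces Paul's 50/15/15 example and shows
why a "0% guarantee" has no natural weight in such a scheme.

# Worst-case CPU fraction under proportional shares: with every
# group runnable, group i gets shares_i / sum(shares).
def worst_case_share(shares):
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}

print(worst_case_share({"A": 50, "B": 15, "C": 15}))
# {'A': 0.625, 'B': 0.1875, 'C': 0.1875}, i.e. ~62%/19%/19%.
# Any nonzero weight still guarantees *some* share when all groups
# are busy, which is why a "0% guarantee" has no sensible weight.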

* Re: [RFC] CPU hard limits
  2009-06-05 11:32   ` Srivatsa Vaddagiri
       [not found]     ` <20090605113217.GA20786-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>
  2009-06-05 12:18     ` Paul Menage
@ 2009-06-05 14:44     ` Chris Friesen
  2 siblings, 0 replies; 107+ messages in thread
From: Chris Friesen @ 2009-06-05 14:44 UTC (permalink / raw)
  To: vatsa
  Cc: Paul Menage, bharata, linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

Srivatsa Vaddagiri wrote:
> On Fri, Jun 05, 2009 at 01:53:15AM -0700, Paul Menage wrote:
>> This claim (and the subsequent long thread it generated on how limits
>> can provide guarantees) confused me a bit.
>>
>> Why do we need limits to provide guarantees when we can already
>> provide guarantees via shares?
> 
> I think the interval over which we need the guarantee matters here. Shares
> can generally provide a guaranteed share of a resource over longer (sometimes
> minutes-long) intervals. For high-priority bursty workloads, the latency in
> achieving the guaranteed resource usage matters. By having hard limits, we are
> "reserving" (potentially idle) slots where the high-priority group can run and
> claim its guaranteed share almost immediately.

Why do you need to "reserve" it though?  By definition, if it's
high-priority then it should be able to interrupt the currently running
task.

Chris



^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05 13:43     ` Dhaval Giani
       [not found]       ` <20090605134320.GA3994-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2009-06-05 14:45       ` Chris Friesen
  1 sibling, 0 replies; 107+ messages in thread
From: Chris Friesen @ 2009-06-05 14:45 UTC (permalink / raw)
  To: Dhaval Giani
  Cc: Avi Kivity, Paul Menage, bharata, linux-kernel, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Dhaval Giani wrote:

> Shares cannot be used to provide guarantees. All they decide is what
> proportion of CPU time groups get. (Yes, "shares" is a bad name; "weight"
> conveys the intent better.)

If I (as the administrator of the system) arbitrarily decide that all
the shares/weights must add up to 100, they magically become percentage
guarantees.

Chris

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05 13:14                         ` Avi Kivity
       [not found]                           ` <4A291A2F.3090201-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2009-06-05 14:54                           ` Chris Friesen
  2009-06-07  6:10                             ` Avi Kivity
       [not found]                             ` <4A293196.2060006-ZIRUuHA3oDzQT0dZR+AlfA@public.gmane.org>
  1 sibling, 2 replies; 107+ messages in thread
From: Chris Friesen @ 2009-06-05 14:54 UTC (permalink / raw)
  To: Avi Kivity
  Cc: balbir, bharata, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Avi Kivity wrote:

> I am selling virtual private servers.  A 10% cpu share costs $x/month, 
> and I guarantee you'll get that 10%, or your money back.  On the other 
> hand, I want to limit cpu usage to that 10% (maybe a little more) so 
> people don't buy 10% shares and use 100% on my underutilized servers.  
> If they want 100%, let them pay for 100%.

What about taking a page from the networking folks and specifying cpu
like a networking SLA?

Something like "group A is guaranteed X percent (or share) of the cpu,
but it is allowed to burst up to Y percent for Z milliseconds"

If a rule of this form were the first-class citizen, it would provide
guarantees, limits, and flexible behaviour all at once.

Chris

^ permalink raw reply	[flat|nested] 107+ messages in thread
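
Chris's rule maps naturally onto a token bucket: refill CPU credit at
the guaranteed rate X, and size the bucket so that running flat out at Y
drains it after exactly Z milliseconds. Below is a rough sketch of the
accounting in Python; the class and parameter names are invented for
illustration, not a proposed interface.

# "Guaranteed X%, may burst to Y% for Z ms" as a token bucket.
# Credit is measured in CPU-milliseconds.
class CpuSla:
    def __init__(self, guar_pct, burst_pct, burst_ms):
        self.rate = guar_pct / 100.0              # credit gained per wall ms
        self.depth = (burst_pct - guar_pct) / 100.0 * burst_ms
        self.credit = self.depth                  # start with a full bucket

    def tick(self, wall_ms, ran_ms):
        """Account ran_ms of CPU time used over wall_ms of wall time."""
        self.credit = min(self.depth,
                          self.credit + wall_ms * self.rate - ran_ms)
        return self.credit >= 0                   # False => throttle

sla = CpuSla(guar_pct=10, burst_pct=50, burst_ms=200)
print(sla.tick(100, 50))   # True:  bursting at 50%, draining credit
print(sla.tick(100, 50))   # True:  exactly 200 ms of burst used up
print(sla.tick(100, 50))   # False: throttled back toward the 10% rate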

* Re: [RFC] CPU hard limits
  2009-06-05  8:16                           ` Bharata B Rao
  (?)
@ 2009-06-07  6:04                           ` Avi Kivity
       [not found]                             ` <4A2B5881.9060204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2009-06-07 16:14                             ` Bharata B Rao
  -1 siblings, 2 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-07  6:04 UTC (permalink / raw)
  To: bharata
  Cc: Balbir Singh, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Bharata B Rao wrote:
> On Fri, Jun 05, 2009 at 09:01:50AM +0300, Avi Kivity wrote:
>   
>> Bharata B Rao wrote:
>>     
>>> But could there be client models where you are required to strictly
>>> adhere to the limit within the bandwidth and not provide more (by advancing
>>> the bandwidth period) in the presence of idle cycles?
>>>   
>>>       
>> That's the limit part.  I'd like to be able to specify limits and  
>> guarantees on the same host and for the same groups; I don't think that  
>> works when you advance the bandwidth period.
>>
>> I think we need to treat guarantees as first-class goals, not something  
>> derived from limits (in fact I think guarantees are more useful as they  
>> can be used to provide SLAs).
>>     
>
> I agree that guarantees are important, but I am not sure about
>
> 1. specifying both limits and guarantees for groups and
>   

Why would you allow specifying a lower bound for cpu usage (a 
guarantee) and an upper bound (a limit), but not both?

> 2. not deriving guarantees from limits.
>
> Guarantees are met by some form of throttling or limiting, and hence I think
> limiting should drive the guarantees.

That would be fine if it didn't idle the cpu despite there being demand 
and available cpu power.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05 13:42                               ` Balbir Singh
  (?)
@ 2009-06-07  6:09                               ` Avi Kivity
  -1 siblings, 0 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-07  6:09 UTC (permalink / raw)
  To: Balbir Singh
  Cc: bharata, linux-kernel, Dhaval Giani, Vaidyanathan Srinivasan,
	Gautham R Shenoy, Srivatsa Vaddagiri, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, kvm, Linux Containers,
	Herbert Poetzl

Balbir Singh wrote:
>> I am selling virtual private servers.  A 10% cpu share costs $x/month, and I
>> guarantee you'll get that 10%, or your money back.  On the other hand, I
>> want to limit cpu usage to that 10% (maybe a little more) so people don't
>> buy 10% shares and use 100% on my underutilized servers.  If they want 100%,
>> let them pay for 100%.
>>     
>
> Excellent examples. We've covered them in the RFC; could you see if we
> missed anything in terms of use cases? The real question is whether we care
> enough to build a hard limits control into the CFS group scheduler. I
> believe we should.
>   

You only cover the limit part.  Guarantees are left as an exercise to 
the reader.

I don't think implementing guarantees via limits is workable as it 
causes the cpu to be idled unnecessarily.

>>> Even limits are useful for SLAs, since your available b/w changes
>>> quite drastically as we add or remove groups. There are other use
>>> cases for limits as well.
>>>       
>> SLAs are specified in terms of guarantees on a service, not in terms of
>> limits on others.  If we could use limits to provide guarantees, that would
>> be fine, but it doesn't quite work out.
>>     
>
> To be honest, I would disagree here, specifically if you compare how
> you would build guarantees in the kernel with the proposed approach. I
> don't want to harp on the technicality, but to point out the feasibility
> for people who care about the lower end of the guarantee without
> requiring density. I think the real technical discussion should be:
> here are the use cases, let's agree on the need for the feature, and
> then go ahead and start prototyping it.
>   

I don't understand.  Are you saying implementing guarantees is too complex?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 107+ messages in thread
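
Avi's objection can be made concrete with the kind of derivation being
discussed here: give group i the cap limit_i = C - sum(g_j for j != i)
on capacity C, so that even when every other group runs flat out at its
cap, g_i is left over. The toy numbers below (Python, illustrative only;
they take the simplest case, where the guarantees sum to the whole
machine and the caps strictly partition it) also show the cost: the caps
keep applying when a guaranteed group goes idle.

# Guarantees derived purely from limits:
#   limit_i = C - sum(g_j for j != i)
# When sum(g) == C this degenerates to limit_i == g_i, i.e. a strict
# partition of the machine.
def limits_from_guarantees(g, C=1.0):
    total = sum(g)
    assert total <= C, "guarantees must fit within capacity"
    return [C - (total - gi) for gi in g]

g = [0.5, 0.25, 0.25]
limits = limits_from_guarantees(g)
print(limits)                      # [0.5, 0.25, 0.25]

# If group 0 goes idle while groups 1 and 2 have unbounded demand,
# they stay capped at 25% each, so half the machine idles despite
# runnable work -- the price of guarantees-via-limits.
busy = limits[1] + limits[2]
print("used %.0f%%, idle %.0f%%" % (busy * 100, (1 - busy) * 100))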

* Re: [RFC] CPU hard limits
  2009-06-05 14:54                           ` Chris Friesen
@ 2009-06-07  6:10                             ` Avi Kivity
       [not found]                             ` <4A293196.2060006-ZIRUuHA3oDzQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 107+ messages in thread
From: Avi Kivity @ 2009-06-07  6:10 UTC (permalink / raw)
  To: Chris Friesen
  Cc: balbir, bharata, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

Chris Friesen wrote:
> Avi Kivity wrote:
>
>   
>> I am selling virtual private servers.  A 10% cpu share costs $x/month, 
>> and I guarantee you'll get that 10%, or your money back.  On the other 
>> hand, I want to limit cpu usage to that 10% (maybe a little more) so 
>> people don't buy 10% shares and use 100% on my underutilized servers.  
>> If they want 100%, let them pay for 100%.
>>     
>
> What about taking a page from the networking folks and specifying cpu
> like a networking SLA?
>
> Something like "group A is guaranteed X percent (or share) of the cpu,
> but it is allowed to burst up to Y percent for Z milliseconds"
>
> If a rule of this form were the first-class citizen, it would provide
> guarantees, limits, and flexible behaviour all at once.
>   

I think you're introducing a new control (guarantees, limits, burst 
limit), but I like it.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
@ 2009-06-07 10:11           ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 107+ messages in thread
From: Srivatsa Vaddagiri @ 2009-06-07 10:11 UTC (permalink / raw)
  To: Paul Menage
  Cc: bharata, linux-kernel, Dhaval Giani, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Ingo Molnar,
	Peter Zijlstra, Pavel Emelyanov, Avi Kivity, kvm,
	Linux Containers, Herbert Poetzl

On Fri, Jun 05, 2009 at 05:18:13AM -0700, Paul Menage wrote:
> Well yes, it's true that you *could* just enforce shares over a
> granularity of minutes, and limits over a granularity of milliseconds.
> But why would you? It could well make sense that you can adjust the
> granularity over which shares are enforced - e.g. for batch jobs, only
> enforcing over minutes or tens of seconds might be fine. But if you're
> doing the fine-grained accounting and scheduling required for the
> tight hard limit enforcement, it doesn't seem as though it should be
> much harder to enforce shares at the same granularity for those
> cgroups that matter. In fact I thought that's what CFS already did -
> updated the virtual time accounting at each context switch, and picked
> the runnable child with the oldest virtual time. (Maybe someone like
> Ingo or Peter who's more familiar than I with the CFS implementation
> could comment here?)

Using shares to guarantee resources over short periods (<2-3 seconds) works
just fine on a single CPU. The complexity is with the multi-CPU case, where
CFS can take a long time to converge to a fair point. This is because fairness
is based on rebalancing tasks equally across all CPUs.

For something like 4 tasks on 4 CPUs, it will converge pretty quickly 
(2-3 seconds):

[top o/p refreshed every 2sec on 2.6.30-rc5-tip]

14753 vatsa     20   0 63812 1072  924 R 99.9  0.0   0:39.54 hog
14754 vatsa     20   0 63812 1072  924 R 99.9  0.0   0:38.69 hog
14756 vatsa     20   0 63812 1076  924 R 99.9  0.0   0:38.27 hog
14755 vatsa     20   0 63812 1072  924 R 99.6  0.0   0:38.27 hog

whereas for something like 5 tasks on 4 CPUs, it will take a significantly
longer time (>30 seconds):

[top o/p refreshed every 2sec]:

14754 vatsa     20   0 63812 1072  924 R 86.0  0.0   2:06.45 hog
14766 vatsa     20   0 63812 1072  924 R 83.0  0.0   0:07.95 hog
14756 vatsa     20   0 63812 1076  924 R 81.7  0.0   2:06.48 hog
14753 vatsa     20   0 63812 1072  924 R 78.7  0.0   2:07.10 hog
14755 vatsa     20   0 63812 1072  924 R 69.4  0.0   2:05.62 hog

[top o/p refreshed every 120sec]:

14766 vatsa     20   0 63812 1072  924 R 90.1  0.0   5:57.22 hog
14755 vatsa     20   0 63812 1072  924 R 84.8  0.0   8:01.61 hog
14754 vatsa     20   0 63812 1072  924 R 77.3  0.0   7:52.04 hog
14753 vatsa     20   0 63812 1072  924 R 74.1  0.0   7:29.01 hog
14756 vatsa     20   0 63812 1076  924 R 73.5  0.0   7:34.69 hog

[Note that even over 2min, we haven't achieved perfect fairness]

> > By having hard-limits, we are
> > "reserving" (potentially idle) slots where the high-priority group can run and
> > claim its guaranteed share almost immediately.

On further thinking, this is not as simple as that. In the above example
of 5 tasks on 4 CPUs, we could cap each task at a hard limit of 80%
(4 CPUs/5 tasks), which is still not sufficient to ensure that each
task gets the perfect fairness of 80%! Not just that: the hard limit
for a group (on each CPU) will have to be adjusted based on its task
distribution. For example, a group that has a hard limit of 25% on a 4-CPU
system and that has a single task is entitled to claim a whole CPU. So
the per-cpu hard limit for the group should be 100% on whatever CPU the
task is running. This adjustment of the per-cpu hard limit should happen
whenever the task distribution of the group across CPUs changes - which
in theory would require you to monitor every task exit/migration
event and readjust limits, making it very complex and high-overhead.

Balbir,
	I don't think guarantees can be met easily through hard limits in the
case of the CPU resource. At least it's not as straightforward as in the
case of memory!

- vatsa

^ permalink raw reply	[flat|nested] 107+ messages in thread
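
Srivatsa's per-cpu point can be sketched too. A group's limit is global
(25% of a 4-CPU machine is one CPU's worth of time), but enforcement is
per CPU, so a naive policy is to split the quota across CPUs in
proportion to the group's runnable tasks on each, capped at 100% per
CPU. The Python sketch below is illustrative only; note how the answer
changes with every change in task placement, which is exactly the
readjustment overhead described above.

# Split a group's global quota (in units of whole CPUs) across CPUs
# in proportion to its runnable task count on each CPU.
def per_cpu_limits(quota_cpus, tasks_per_cpu):
    total = sum(tasks_per_cpu)
    if total == 0:
        return [0.0] * len(tasks_per_cpu)
    return [min(1.0, quota_cpus * n / total) for n in tasks_per_cpu]

# 25% of a 4-CPU box = 1.0 CPU's worth of quota.
print(per_cpu_limits(1.0, [1, 0, 0, 0]))  # [1.0, 0.0, 0.0, 0.0]:
                                          # a lone task may claim a full CPU
print(per_cpu_limits(1.0, [2, 2, 0, 0]))  # [0.5, 0.5, 0.0, 0.0]
print(per_cpu_limits(1.0, [1, 1, 1, 1]))  # [0.25, 0.25, 0.25, 0.25]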

* Re: [RFC] CPU hard limits
  2009-06-07 10:11           ` Srivatsa Vaddagiri
  (?)
@ 2009-06-07 15:35           ` Balbir Singh
  2009-06-08  4:37             ` Srivatsa Vaddagiri
       [not found]             ` <661de9470906070835l383cd388h67e40a31be07aef6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  -1 siblings, 2 replies; 107+ messages in thread
From: Balbir Singh @ 2009-06-07 15:35 UTC (permalink / raw)
  To: vatsa
  Cc: Paul Menage, Peter Zijlstra, Pavel Emelyanov, Dhaval Giani, kvm,
	Gautham R Shenoy, Linux Containers, linux-kernel, Avi Kivity,
	bharata, Ingo Molnar

On Sun, Jun 7, 2009 at 3:41 PM, Srivatsa Vaddagiri<vatsa@in.ibm.com> wrote:
> On Fri, Jun 05, 2009 at 05:18:13AM -0700, Paul Menage wrote:
>> Well yes, it's true that you *could* just enforce shares over a
>> granularity of minutes, and limits over a granularity of milliseconds.
>> But why would you? It could well make sense that you can adjust the
>> granularity over which shares are enforced - e.g. for batch jobs, only
>> enforcing over minutes or tens of seconds might be fine. But if you're
>> doing the fine-grained accounting and scheduling required for the
>> tight hard limit enforcement, it doesn't seem as though it should be
>> much harder to enforce shares at the same granularity for those
>> cgroups that matter. In fact I thought that's what CFS already did -
>> updated the virtual time accounting at each context switch, and picked
>> the runnable child with the oldest virtual time. (Maybe someone like
>> Ingo or Peter who's more familiar than I with the CFS implementation
>> could comment here?)
>
> Using shares to guarantee resources over short periods (<2-3 seconds) works
> just fine on a single CPU. The complexity is with the multi-CPU case, where
> CFS can take a long time to converge to a fair point. This is because
> fairness is based on rebalancing tasks equally across all CPUs.
>
> For something like 4 tasks on 4 CPUs, it will converge pretty quickly
> (2-3 seconds):
>
> [top o/p refreshed every 2sec on 2.6.30-rc5-tip]
>
> 14753 vatsa     20   0 63812 1072  924 R 99.9  0.0   0:39.54 hog
> 14754 vatsa     20   0 63812 1072  924 R 99.9  0.0   0:38.69 hog
> 14756 vatsa     20   0 63812 1076  924 R 99.9  0.0   0:38.27 hog
> 14755 vatsa     20   0 63812 1072  924 R 99.6  0.0   0:38.27 hog
>
> whereas for something like 5 tasks on 4 CPUs, it will take a significantly
> longer time (>30 seconds):
>
> [top o/p refreshed every 2sec]:
>
> 14754 vatsa     20   0 63812 1072  924 R 86.0  0.0   2:06.45 hog
> 14766 vatsa     20   0 63812 1072  924 R 83.0  0.0   0:07.95 hog
> 14756 vatsa     20   0 63812 1076  924 R 81.7  0.0   2:06.48 hog
> 14753 vatsa     20   0 63812 1072  924 R 78.7  0.0   2:07.10 hog
> 14755 vatsa     20   0 63812 1072  924 R 69.4  0.0   2:05.62 hog
>
> [top o/p refreshed every 120sec]:
>
> 14766 vatsa     20   0 63812 1072  924 R 90.1  0.0   5:57.22 hog
> 14755 vatsa     20   0 63812 1072  924 R 84.8  0.0   8:01.61 hog
> 14754 vatsa     20   0 63812 1072  924 R 77.3  0.0   7:52.04 hog
> 14753 vatsa     20   0 63812 1072  924 R 74.1  0.0   7:29.01 hog
> 14756 vatsa     20   0 63812 1076  924 R 73.5  0.0   7:34.69 hog
>
> [Note that even over 2min, we haven't achieved perfect fairness]
>

Good observation, Thanks!

>> > By having hard-limits, we are
>> > "reserving" (potentially idle) slots where the high-priority group can run and
>> > claim its guaranteed share almost immediately.
>
> On further thinking, this is not as simple as that. In the above example
> of 5 tasks on 4 CPUs, we could cap each task at a hard limit of 80%
> (4 CPUs/5 tasks), which is still not sufficient to ensure that each
> task gets the perfect fairness of 80%! Not just that: the hard limit
> for a group (on each CPU) will have to be adjusted based on its task
> distribution. For example, a group that has a hard limit of 25% on a 4-CPU
> system and that has a single task is entitled to claim a whole CPU. So
> the per-cpu hard limit for the group should be 100% on whatever CPU the
> task is running. This adjustment of the per-cpu hard limit should happen
> whenever the task distribution of the group across CPUs changes - which
> in theory would require you to monitor every task exit/migration
> event and readjust limits, making it very complex and high-overhead.
>

We already do that for shares, right? I mean, instead of a 25% hard limit,
if the group had 25% of the shares, the same thing would apply - no?

> Balbir,
>        I don't think guarantees can be met easily through hard limits in
> the case of the CPU resource. At least it's not as straightforward as in
> the case of memory!

OK, based on the discussion - leaving implementation issues out - is it
possible to implement guarantees using shares? My answer would be:

1. Yes - but then the hard limits will prevent you and can cause idle
times; some of those can be handled in the implementation. There might
be other fairness and SMP concerns about the accuracy of the fairness -
thank you for that data.
2. We'll update the RFC (second version) with the findings and send it
out, so that the expectations are clearer.
3. From what I've read and seen, there seems to be no strong objection
to hard limits, but some reservations (based on 1) about using them
for guarantees, and our RFC will reflect that.

Do you agree?

Balbir

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-07  6:04                           ` Avi Kivity
       [not found]                             ` <4A2B5881.9060204-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2009-06-07 16:14                             ` Bharata B Rao
  1 sibling, 0 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-07 16:14 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Balbir Singh, linux-kernel, Dhaval Giani,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Pavel Emelyanov, kvm,
	Linux Containers, Herbert Poetzl

On Sun, Jun 07, 2009 at 09:04:49AM +0300, Avi Kivity wrote:
> Bharata B Rao wrote:
>> On Fri, Jun 05, 2009 at 09:01:50AM +0300, Avi Kivity wrote:
>>   
>>> Bharata B Rao wrote:
>>>     
>>>> But could there be client models where you are required to strictly
>>>> adhere to the limit within the bandwidth and not provide more (by advancing
>>>> the bandwidth period) in the presence of idle cycles?
>>>>         
>>> That's the limit part.  I'd like to be able to specify limits and   
>>> guarantees on the same host and for the same groups; I don't think 
>>> that  works when you advance the bandwidth period.
>>>
>>> I think we need to treat guarantees as first-class goals, not 
>>> something  derived from limits (in fact I think guarantees are more 
>>> useful as they  can be used to provide SLAs).
>>>     
>>
>> I agree that guarantees are important, but I am not sure about
>>
>> 1. specifying both limits and guarantees for groups and
>>   
>
> Why would you allow specifying a lower bound for cpu usage (a  
> guarantee), and upper bound (a limit), but not both?

I was saying that we specify only limits and not guarantees, since the
latter can be worked out from the limits. The initial thinking was that the
kernel would be made aware of only limits, and users could set the limits
appropriately to obtain the desired guarantees. I understand your
concerns/objections on this, and we will address them in the next version
of the RFC, as Balbir said.

Regards,
Bharata.

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-07 15:35           ` Balbir Singh
@ 2009-06-08  4:37             ` Srivatsa Vaddagiri
       [not found]             ` <661de9470906070835l383cd388h67e40a31be07aef6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 0 replies; 107+ messages in thread
From: Srivatsa Vaddagiri @ 2009-06-08  4:37 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Paul Menage, Peter Zijlstra, Pavel Emelyanov, Dhaval Giani, kvm,
	Gautham R Shenoy, Linux Containers, linux-kernel, Avi Kivity,
	bharata, Ingo Molnar

On Sun, Jun 07, 2009 at 09:05:23PM +0530, Balbir Singh wrote:
> > On further thinking, this is not as simple as that. In above example of
> > 5 tasks on 4 CPUs, we could cap each task at a hard limit of 80%
> > (4 CPUs/5 tasks), which is still not sufficient to ensure that each
> > task gets the perfect fairness of 80%! Not just that, hard-limit
> > for a group (on each CPU) will have to be adjusted based on its task
> > distribution. For ex: a group that has a hard-limit of 25% on a 4-cpu
> > system and that has a single task, is entitled to claim a whole CPU. So
> > the per-cpu hard-limit for the group should be 100% on whatever CPU the
> > task is running. This adjustment of per-cpu hard-limit should happen
> > whenever the task distribution of the group across CPUs change - which
> > in theory would require you to monitor every task exit/migration
> > event and readjust limits, making it very complex and high-overhead.
> >
> 
> We already do that for shares right? I mean instead of 25% hard limit,
> if the group had 25% of the shares the same thing would apply - no?

Yes and no. We do rebalance shares based on task distribution, but not
upon every task fork/exit/wakeup/migration event. It's done once in a while,
frequently enough to give "decent" fairness!

> > Balbir,
> >        I don't think guarantees can be met easily through hard limits in
> > the case of the CPU resource. At least it's not as straightforward as in
> > the case of memory!
> 
> OK, based on the discussion - leaving implementation issues out,
> speaking of whether it is possible to implement guarantees using
> shares? My answer would be
> 
> 1. Yes - but then the hard limits will prevent you and can cause idle
> times, some of those can be handled in the implementation. There might
> be other fairness and SMP concerns about the accuracy of the fairness,
> thank you for that data.
> 2. We'll update the RFC (second version) with the findings and send it
> out, so that the expectations are clearer
> 3. From what I've read and seen there seems to be no strong objection
> to hard limits, but some reservations (based on 1) about using them
> for guarantees and our RFC will reflect that.
> 
> Do you agree?

Well, yes - guarantees are not a good argument for providing hard limits.
A pay-per-use kind of usage would be a better argument, IMHO.

- vatsa

^ permalink raw reply	[flat|nested] 107+ messages in thread

* Re: [RFC] CPU hard limits
  2009-06-05 10:03             ` Paul Menage
       [not found]               ` <6599ad830906050303r404c325anc60ded4f45a50b95-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-06-08  8:50               ` Pavel Emelyanov
  1 sibling, 0 replies; 107+ messages in thread
From: Pavel Emelyanov @ 2009-06-08  8:50 UTC (permalink / raw)
  To: Paul Menage
  Cc: Dhaval Giani, bharata, linux-kernel, Balbir Singh,
	Vaidyanathan Srinivasan, Gautham R Shenoy, Srivatsa Vaddagiri,
	Ingo Molnar, Peter Zijlstra, Avi Kivity, kvm, Linux Containers,
	Herbert Poetzl

Paul Menage wrote:
> On Fri, Jun 5, 2009 at 2:59 AM, Dhaval Giani<dhaval@linux.vnet.ibm.com> wrote:
>> I think we are focusing on the wrong use case here. Guarantees is just a
>> useful side-effect we get by using hard limits. I think the more
>> important use case is where the provider wants to limit the amount of
>> time a user gets (such as in a cloud).
>>
>> Maybe we should direct our attention in solving that problem? :)
>>
> 
> Yes, that case and the "predictable load test behaviour" case are both
> good reasons for hard limits.

ACK.

I'd like to add two things.

First, the article @openvz.org about guarantees you were discussing was
not supposed to be a "best practices" paper. It was just a set of
theoretical thoughts on how to get guarantees out of limits for those
resources you cannot reclaim from the user and thus cannot guarantee any
other way. E.g. locked memory - once a user has it you cannot take it
back, so if you want to guarantee some amount of it for group X you have
to keep all the other groups away from that amount.

And the second thing is an addition to Dhaval's case about limiting the
amount of time a user gets. This is exactly what hosting providers do -
they _sell_ CPU power to their customers and thus need to limit the CPU
time dedicated to containers.

> Paul
> 


^ permalink raw reply	[flat|nested] 107+ messages in thread

* [RFC] CPU hard limits
@ 2009-06-04  5:36 Bharata B Rao
  0 siblings, 0 replies; 107+ messages in thread
From: Bharata B Rao @ 2009-06-04  5:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Pavel Emelyanov, Dhaval Giani, kvm,
	Gautham R Shenoy, Linux Containers, Avi Kivity, Ingo Molnar,
	Balbir Singh

Hi,

This is an RFC about the CPU hard limits feature where I have explained
the need for the feature, the proposed plan and the issues around it.
Before I come up with an implementation for hard limits, I would like to
know community's thoughts on this scheduler enhancement and any feedback
and suggestions.

Regards,
Bharata.

1. CPU hard limit
2. Need for hard limiting CPU resource
3. Granularity of enforcing CPU hard limits
4. Existing solutions
5. Specifying hard limits
6. Per task group vs global bandwidth period
7. Configuring
8. Throttling of tasks
9. Group scheduler hierarchy considerations
10. SMP considerations
11. Starvation
12. Hard limit and fairness

1. CPU hard limit
-----------------
CFS is a proportional share scheduler which tries to divide the CPU time
proportionately between tasks or groups of tasks (task group/cgroup) depending
on the priority/weight of the task or shares assigned to groups of tasks.
In CFS, a task/task group can get more than its share of CPU if there are
enough idle CPU cycles available in the system, due to the work conserving
nature of the scheduler.

However there are scenarios (Sec 2) where giving more than the desired
CPU share to a task/task group is not acceptable. In those scenarios, the
scheduler needs to put a hard stop on the CPU resource consumption of
task/task group if it exceeds a preset limit. This is usually achieved by
throttling the task/task group when it fully consumes its allocated CPU time.

2. Need for hard limiting CPU resource
--------------------------------------
- Pay-per-use: In enterprise systems that cater to multiple clients/customers
  where a customer demands a certain share of CPU resources and pays only
  that, CPU hard limits will be useful to hard limit the customer's job
  to consume only the specified amount of CPU resource.
- In container based virtualization environments running multiple containers,
  hard limits will be useful to ensure a container doesn't exceed its
  CPU entitlement.
- Hard limits can be used to provide guarantees.

3. Granularity of enforcing CPU hard limits
-------------------------------------------
Conceptually, hard limits can either be enforced for individual tasks or
groups of tasks.  However enforcing limits per task would be too fine
grained and would be a lot of work on the part of the system administrator
in terms of setting limits for every task. Based on the current understanding
of the users of this feature,  it is felt that hard limiting is more useful
at task group level than the individual tasks level. Hence in the subsequent
paragraphs, the concept of hard limit as applicable to task group/cgroup
is discussed.

4. Existing solutions
---------------------
- Both Linux-VServer and OpenVZ virtualization solutions support CPU hard
  limiting.
- Per task limit can be enforced using rlimits, but it is not rate based.

5. Specifying hard limits
-------------------------
CPU time consumed by a task group is generally measured over a
time period (called bandwidth period) and the task group gets throttled
when its CPU time reaches a limit (hard limit) within a bandwidth period.
The task group remains throttled until the bandwidth period gets renewed,
at which time additional CPU time becomes available to the tasks of the
group.

When a task group's hard limit is specified as a ratio X/Y, it means that
the group will get throttled if its CPU time consumption exceeds X seconds
in a bandwidth period of Y seconds.

Specifying the hard limit as X/Y requires specifying the bandwidth
period as well.
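
To make the X/Y semantics concrete, here is a minimal sketch of per-group
state and the accounting check (all names and types are invented for
illustration and are not proposed code):

#include <stdint.h>

/* Hypothetical per-group hard limit state; all times in nanoseconds. */
struct hard_limit {
	uint64_t runtime;	/* X: allowed CPU time per period */
	uint64_t period;	/* Y: length of the bandwidth period */
	uint64_t time_used;	/* CPU time consumed in the current period */
	int throttled;		/* set once time_used reaches runtime */
};

/* Charge 'delta' ns of CPU time; throttle the group on overrun. */
static void hl_account(struct hard_limit *hl, uint64_t delta)
{
	hl->time_used += delta;
	if (!hl->throttled && hl->time_used >= hl->runtime)
		hl->throttled = 1;	/* group gets dequeued elsewhere */
}

With X = 0.25s and Y = 1s, the group would be dequeued after consuming
250ms of CPU time and would stay off the runqueue for the remaining
750ms of the period.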

Is having a uniform bandwidth period for all groups an option? If so, we
could even specify the hard limit as a percentage of that uniform
bandwidth period, like 30%.

6. Per task group vs global bandwidth period
--------------------------------------------
The bandwidth period can either be per task group or global. With a global
bandwidth period, the runtimes of all task groups need to be replenished
when the period ends. Though this appears conceptually simple, the
implementation might not scale. Instead, if every task group maintains its
bandwidth period separately, the refresh cycles of the groups happen
independently of each other. Moreover, different groups might prefer
different bandwidth periods. Hence the first implementation will have a
per task group bandwidth period.

Timers can be used to trigger bandwidth refresh cycles (similar to the
rt group scheduler).
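
As a rough sketch of a per-group refresh (reusing the hypothetical struct
hard_limit from the sketch in Sec 5), a timer firing every Y nanoseconds
would simply replenish the runtime and unthrottle the group:

/* Hypothetical per-group timer callback, fired once per hl->period. */
static void hl_refresh(struct hard_limit *hl)
{
	hl->time_used = 0;		/* replenish the group's runtime */
	if (hl->throttled) {
		hl->throttled = 0;
		/* re-enqueue the group's throttled entities here */
	}
	/* re-arm the per-group timer for the next bandwidth period */
}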

7. Configuring
--------------
- The user could set the hard limit (X and/or Y) through the cgroup fs
  (see the sketch after this list).
- When the scheduler supports hard limiting, should it be enabled
  for all task groups in the system? Or should the user have an option
  to enable hard limiting per group?
- When hard limiting is enabled for a group, should the limit be
  set to a default to start with? Or should the user set the limit
  and the bandwidth period before enabling hard limiting?
- What would be a sane default value for the bandwidth period?
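
To illustrate the first point above, the userspace side might look like
this (the mount point and cgroup file names are purely hypothetical; no
such files exist yet):

#include <stdio.h>

/* Sketch: hypothetical interface files; limit group "grp1" to
 * 250ms of CPU time per 1s bandwidth period. */
int main(void)
{
	FILE *f;

	f = fopen("/cgroups/grp1/cpu.hard_limit_runtime_us", "w");
	if (f) {
		fprintf(f, "250000\n");		/* X = 250ms */
		fclose(f);
	}
	f = fopen("/cgroups/grp1/cpu.hard_limit_period_us", "w");
	if (f) {
		fprintf(f, "1000000\n");	/* Y = 1s */
		fclose(f);
	}
	return 0;
}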

8. Throttling of tasks
----------------------
A task group can be taken off the runqueue when it hits the limit and
enqueued back when the bandwidth period is refreshed. This method would
require maintaining a separate throttled-task list for every group.

Under heavy throttling, tasks could be repeatedly dequeued and enqueued
back at bandwidth refresh times, leading to frequent variations in the
runqueue load. This might unduly stress the load balancer.

Note: A group (entity) can't be dequeued until all tasks under it are
dequeued. So there can be false/failed attempts to run tasks of a
throttled group until all of its tasks are dequeued.
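
A sketch of the bookkeeping this implies (building on the hypothetical
struct hard_limit above; struct list_head is the usual <linux/list.h>
linked list):

/* Hypothetical per-group throttling state. */
struct hl_group {
	struct hard_limit hl;
	struct hl_group *parent;		/* hierarchy, see Sec 9 */
	struct list_head throttled_tasks;	/* dequeued tasks parked here
						   until the next refresh */
};

On throttle, each of the group's tasks is dequeued and parked on
throttled_tasks; on refresh, they are spliced back onto the runqueue.
Until the last task is dequeued, the group entity itself stays on the
runqueue, hence the false/failed run attempts noted above.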

9. Group scheduler hierarchy considerations
-------------------------------------------
Since the group scheduler is hierarchical in nature, should there be any
relation between the hard limit values of a parent task group
and the values of its child groups? Should the hard limit values set for
child groups be compatible with the parent's hard limit? For example,
consider a group A with a hard limit of X/Y that has two children A1 and
A2. Should the limits for A1 (X1/Y) and A2 (X2/Y) be set so that
X1/Y + X2/Y <= X/Y, i.e. X1 + X2 <= X?

Or should child groups set their limits independently of the parent? In
this case, even if the child still has CPU time left before it hits its
own limit, it could get throttled because its parent got throttled. I
would think this method leads to an easier implementation.
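
Under the independent-limits option, the runnability test is a simple
walk up the hierarchy (a sketch, building on the hypothetical struct
hl_group from Sec 8):

/* Sketch: a group may run only if neither it nor any ancestor is
 * throttled. */
static int hl_group_runnable(struct hl_group *g)
{
	for (; g; g = g->parent)
		if (g->hl.throttled)
			return 0;
	return 1;
}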

AFAICS, the rt group scheduler needs EDF to support different bandwidth
periods for different groups (Ref: Documentation/scheduler/sched-rt-group.txt).
I don't think the same requirement applies to non-rt groups, because with
hard limits we are not guaranteeing CPU time for a group; we are only
specifying the maximum time for which a group can run within a bandwidth
period.

10. SMP considerations
----------------------
Hard limits could be enforced for the system as a whole or for individual
CPUs.

When enforced per CPU, a task group on a CPU will be throttled when it
reaches its hard limit on that CPU. This can lead to unfairness if the
same task group has unused runtime left on other CPUs.

If enforced system wide, a task group will be throttled when the sum of
the runtimes of its tasks running on different CPUs reaches the limit.

Could we use a hybrid method, where a task group that reaches its limit
on a CPU draws group bandwidth from another CPU on which there are no
runnable tasks belonging to that group?

RT group scheduling already borrows runtime from other CPUs in this
manner when runtimes are balanced.
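
A sketch of the hybrid idea, loosely modeled on rt runtime balancing
(the per-CPU fields and names are invented):

#include <stdint.h>

/* Hypothetical per-CPU slice of a group's bandwidth. */
struct hl_percpu {
	uint64_t runtime_left;	/* unconsumed runtime on this CPU */
	int nr_running;		/* group's runnable tasks on this CPU */
};

/* When this CPU's slice is exhausted, pull spare runtime from a CPU
 * where the group currently has no runnable tasks. Returns the amount
 * borrowed; 0 means nothing to borrow and the group stays throttled. */
static uint64_t hl_borrow(struct hl_percpu *cpus, int ncpus, int me)
{
	uint64_t got;
	int i;

	for (i = 0; i < ncpus; i++) {
		if (i == me || cpus[i].nr_running || !cpus[i].runtime_left)
			continue;
		got = cpus[i].runtime_left;
		cpus[i].runtime_left = 0;	/* take the whole slice */
		return got;
	}
	return 0;
}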

11. Starvation
---------------
When a task group that holds a shared resource (like a lock) is throttled,
another group that needs the same shared resource will not be able to
make progress even when the CPU has idle cycles to spare. This leads to
starvation and unfairness. The situation could be avoided by methods
such as:

- Disabling throttling when a group is holding a lock (a rough sketch of
  this idea follows at the end of this section).
- Inheriting runtime from the group which faces starvation.

The first implementation will not address this problem of starvation.
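
Though the first implementation leaves this open, the first idea might be
sketched as follows (the per-task lock-hold counter is purely
hypothetical):

/* Hypothetical per-task state: number of locks currently held. */
struct task_hl {
	int lock_depth;		/* ++ on lock acquire, -- on release */
};

/* Defer throttling while the task holds any lock; throttle at the next
 * accounting point after the last lock is released. */
static int hl_may_throttle(struct task_hl *t)
{
	return t->lock_depth == 0;
}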

12. Hard limits and fairness
----------------------------
Hard limits are set independently of group shares. The user may set hard
limits in such a way that the scheduler cannot both maintain fairness and
enforce the limits. In such cases, hard limiting takes precedence.

^ permalink raw reply	[flat|nested] 107+ messages in thread

