Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide

From: Vikas Shivappa <vikas.shivappa@intel.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>,
	Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org, hpa@zytor.com,
	tglx@linutronix.de, mingo@kernel.org, tj@kernel.org,
	peterz@infradead.org, matt.fleming@intel.com,
	will.auld@intel.com, glenn.p.williamson@intel.com,
	kanaka.d.juvva@intel.com
Subject: Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide
Date: Wed, 29 Jul 2015 14:20:19 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.10.1507291412550.8113@vshiva-Udesk> (raw)
In-Reply-To: <20150728233701.GA17184@amt.cnet>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 5672 bytes --]

On Tue, 28 Jul 2015, Marcelo Tosatti wrote:

> On Tue, Mar 31, 2015 at 10:27:32AM -0700, Vikas Shivappa wrote:
>>
>>
>> On Thu, 26 Mar 2015, Marcelo Tosatti wrote:
>>
>>>
>>> I can't find any discussion relating to exposing the CBM interface
>>> directly to userspace in that thread ?
>>>
>>> Cpu.shares is written in ratio form, which is much more natural.
>>> Do you see any advantage in maintaining the
>>>
>>> (ratio -> cbm bitmasks)
>>>
>>> translation in userspace rather than in the kernel ?
>>>
>>> What about something like:
>>>
>>>
>>> 		      root cgroup
>>> 		   /		  \
>>> 		  /		    \
>>> 		/		      \
>>> 	cgroupA-80			cgroupB-30
>>>
>>>
>>> So that whatever exceeds 100% is the ratio of cache
>>> shared at that level (cgroup A and B share 10% of cache
>>> at that level).
>>
>> But this also means the 2 groups share all of the cache ?
>>
>> Specifying the amount of bits to be shared lets you specify the
>> exact cache area where you want to share and also when your total
>> occupancy does not cover all of the cache. For ex: it gets more
>> complex when you want to share say only the left quarter of the
>> cache. cgroupA gets left half and cgroup gets left quarter. The
>> bitmask aligns with how the h/w is designed to share the cache which
>> gives you flexibility to define any specific overlapping areas of
>> the cache.
>>
>>>
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html
>>>
>>> cpu — the cpu.shares parameter determines the share of CPU resources
>>> available to each process in all cgroups. Setting the parameter to 250,
>>> 250, and 500 in the finance, sales, and engineering cgroups respectively
>>> means that processes started in these groups will split the resources
>>> with a 1:1:2 ratio. Note that when a single process is running, it
>>> consumes as much CPU as necessary no matter which cgroup it is placed
>>> in. The CPU limitation only comes into effect when two or more processes
>>> compete for CPU resources.
>>>
>>>
>>
>> These are more defined in terms of how many cache lines (or how many
>> cache ways) they can use and would be difficult to define them in
>> terms of percentage. In contrast the cpu share is a time shared
>> thing and is much more granular where as here its not , its
>> occupancy in terms of cache lines/ways.. (however this is not really
>> defined as a restriction but thats the way it is now).
>> Also note that the granularity of the bitmasks define the
>> granularity of the percentages and in some SKUs the granularity is
>> 2b and not 1b.. So technically you wont be able to even allocate
>> percentage of cache even in 10% granularity for most of the cases
>> (if there are 30MB and 25 ways like in one of hsw SKU) and this will
>> vary for different SKUs which makes it more complicated for users.
>> However the user library is free to define own interface based on
>> the underlying cgroup interface say for example you never care about
>> the overlapping and using it for a specific SKU etc.. The underlying
>> cgroup framework is meant to be  generic for all SKus and used for
>> most of the use cases.
>>
>> Also at this point I see a lot of enterprise and and other users
>> already using the cgroup interface or shown interest in the same.
>> However I see your point where you indicate the ease with which user
>> can specify in size/percentage which he might be used to doing for
>> other resources rather than bits where he needs to get an idea size
>> by calculating it seperately - But again note that you may not be
>> able to define percentages in many scenarios like the one above. And
>> another question would be we would need to convince the users to
>> adapt to the modified percentage user model (ex: like the one you
>> say above where percentage - 100 is the one thats shared)
>> I can review this requirements and others I have received and get
>> back to see the closest that can be done if possible.
>>
>> Thanks,
>> Vikas
>
> Vikas,
>
> Three questions:
>
> First, usage model. The usage model for CAT is the following
> (please correct me if i'm wrong):
>
> 1) measure application performance without L3 cache reservation.
> 2) measure application perf with L3 cache reservation and
> X number of ways until desired perf is attained.
>
> On migration to a new hardware platform, to achieve similar benefit
> achieved when going from 1) to 2) is to reserve _at least_ the number of
> bytes that "X ways" provided when the measurement was performed. Is that
> correct?
>
> If that is correct, then the user does want to record "number of bytes"
> that X ways on measurement CPU provided.
>

The number of ways mapping to bits is implementation dependent. So we really 
cannot refer one way as a bit..

to map the size to bits. could check the cache capacity in /proc and then the 
number of bits in the cbm (max bits are shown in the root intel_rdt cgroup) .
ex: cache is 2MB. we have 16 bits cbm - a mask of 0xff would represent 1MB.

> Second question:
> Do you envision any use case which the placement of cache
> and not the quantity of cache is a criteria for decision?
> That is, two cases with the same amount of cache for each CLOSid,
> but with different locations inside the cache?
> (except sharing of ways by two CLOSid's, of course).
>

cbm max - 16 bits.  000f - allocate right quarter. f000 - allocate left 
quarter.. ? extend the case to any number of valid contiguous bits.

> Third question:
> How about support for the (new) I/D cache division?
>

Planning to be sending a patch end of this week or early next week.

Thanks,
Vikas

>
>
>
>
>
>
>