From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754124AbbG2VUZ (ORCPT );
	Wed, 29 Jul 2015 17:20:25 -0400
Received: from mga14.intel.com ([192.55.52.115]:28444 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751820AbbG2VUY (ORCPT );
	Wed, 29 Jul 2015 17:20:24 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.15,572,1432623600"; d="scan'208";a="772547359"
Date: Wed, 29 Jul 2015 14:20:19 -0700 (PDT)
From: Vikas Shivappa 
X-X-Sender: vikas@vshiva-Udesk
To: Marcelo Tosatti 
cc: Vikas Shivappa , Vikas Shivappa ,
	x86@kernel.org, linux-kernel@vger.kernel.org, hpa@zytor.com,
	tglx@linutronix.de, mingo@kernel.org, tj@kernel.org,
	peterz@infradead.org, matt.fleming@intel.com, will.auld@intel.com,
	glenn.p.williamson@intel.com, kanaka.d.juvva@intel.com
Subject: Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide
In-Reply-To: <20150728233701.GA17184@amt.cnet>
Message-ID: 
References: <1426202167-30598-1-git-send-email-vikas.shivappa@linux.intel.com>
	<1426202167-30598-8-git-send-email-vikas.shivappa@linux.intel.com>
	<20150325223941.GA5657@amt.cnet> <20150327012927.GA1709@amt.cnet>
	<20150728233701.GA17184@amt.cnet>
User-Agent: Alpine 2.10 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="8323329-948111101-1438204820=:8113"
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

This message is in MIME format.  The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--8323329-948111101-1438204820=:8113
Content-Type: TEXT/PLAIN; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8BIT

On Tue, 28 Jul 2015, Marcelo Tosatti wrote:

> On Tue, Mar 31, 2015 at 10:27:32AM -0700, Vikas Shivappa wrote:
>>
>>
>> On Thu, 26 Mar 2015, Marcelo Tosatti wrote:
>>
>>>
>>> I can't find any discussion relating to exposing the CBM interface
>>> directly to userspace in that thread?
>>>
>>> cpu.shares is written in ratio form, which is much more natural.
>>> Do you see any advantage in maintaining the
>>>
>>> (ratio -> cbm bitmasks)
>>>
>>> translation in userspace rather than in the kernel?
>>>
>>> What about something like:
>>>
>>>
>>>         root cgroup
>>>          /        \
>>>         /          \
>>>        /            \
>>>   cgroupA-80     cgroupB-30
>>>
>>>
>>> So that whatever exceeds 100% is the ratio of cache
>>> shared at that level (cgroup A and B share 10% of cache
>>> at that level).
>>
>> But this also means the two groups share all of the cache?
>>
>> Specifying the number of bits to be shared lets you specify the exact
>> cache area where you want to share, and it also works when your total
>> occupancy does not cover all of the cache. For example, it gets more
>> complex when you want to share, say, only the left quarter of the
>> cache: cgroupA gets the left half and cgroupB gets the left quarter.
>> The bitmask aligns with how the hardware is designed to share the
>> cache, which gives you the flexibility to define any specific
>> overlapping areas of the cache.
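
[ To make the overlap above concrete, a rough userspace sketch: it
  assumes a 16 bit cbm purely for illustration (the real max bits are
  whatever the root intel_rdt cgroup reports), and the mask values and
  the helper are example code only, not taken from the patch. ]

/*
 * Illustration only: express "cgroupA gets the left half, cgroupB gets
 * the left quarter" as two overlapping contiguous bitmasks.  Assumes a
 * 16-bit cbm; the real length comes from the root intel_rdt cgroup.
 */
#include <stdio.h>

/* contiguous mask of 'nbits' bits at the left (upper) end of the cbm */
static unsigned int left_mask(unsigned int cbm_len, unsigned int nbits)
{
	unsigned int full = (1u << cbm_len) - 1;

	return full & ~((1u << (cbm_len - nbits)) - 1);
}

int main(void)
{
	unsigned int cbm_len = 16;                         /* assumed      */
	unsigned int a = left_mask(cbm_len, cbm_len / 2);  /* 0xff00       */
	unsigned int b = left_mask(cbm_len, cbm_len / 4);  /* 0xf000       */

	printf("cgroupA cbm: 0x%04x\n", a);
	printf("cgroupB cbm: 0x%04x\n", b);
	printf("shared area: 0x%04x\n", a & b);            /* left quarter */
	return 0;
}

[ Both groups then own the left quarter, while cgroupA alone owns the
  second quarter - the kind of partial overlap a pure ratio cannot
  pin down. ]
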
>>
>>>
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html
>>>
>>> cpu — the cpu.shares parameter determines the share of CPU resources
>>> available to each process in all cgroups. Setting the parameter to 250,
>>> 250, and 500 in the finance, sales, and engineering cgroups respectively
>>> means that processes started in these groups will split the resources
>>> with a 1:1:2 ratio. Note that when a single process is running, it
>>> consumes as much CPU as necessary no matter which cgroup it is placed
>>> in. The CPU limitation only comes into effect when two or more processes
>>> compete for CPU resources.
>>>
>>>
>>
>> These are defined in terms of how many cache lines (or cache ways) a
>> group can use, and it would be difficult to express that as a
>> percentage. In contrast, a cpu share is a time-shared quantity and much
>> more granular, whereas here it is occupancy in terms of cache
>> lines/ways (this is not defined as a restriction, but that is the way
>> it is now).
>> Also note that the granularity of the bitmask defines the granularity
>> of the percentages, and on some SKUs the granularity is 2 bits rather
>> than 1 bit. So technically you would not even be able to allocate cache
>> in 10% steps in most cases (e.g. with 30MB and 25 ways, as on one of
>> the HSW SKUs), and this varies across SKUs, which makes it more
>> complicated for users. However, a user library is free to define its
>> own interface on top of the underlying cgroup interface, say if you
>> never care about overlapping and are targeting a specific SKU. The
>> underlying cgroup framework is meant to be generic for all SKUs and to
>> cover most of the use cases.
>>
>> Also, at this point I see a lot of enterprise and other users already
>> using the cgroup interface or showing interest in it. I do see your
>> point about the ease of specifying a size or percentage, which users
>> may be used to for other resources, rather than bits for which the size
>> has to be worked out separately. But again, note that you may not be
>> able to define percentages in many scenarios like the one above, and we
>> would also need to convince users to adapt to the modified percentage
>> model (e.g. the one you describe above, where whatever exceeds 100% is
>> the shared part).
>> I can review these requirements and the others I have received and get
>> back with the closest that can be done, if possible.
>>
>> Thanks,
>> Vikas
>
> Vikas,
>
> Three questions:
>
> First, usage model. The usage model for CAT is the following
> (please correct me if I'm wrong):
>
> 1) measure application performance without L3 cache reservation.
> 2) measure application perf with L3 cache reservation and
> X number of ways until desired perf is attained.
>
> On migration to a new hardware platform, the way to achieve a benefit
> similar to the one achieved when going from 1) to 2) is to reserve
> _at least_ the number of bytes that "X ways" provided when the
> measurement was performed. Is that correct?
>
> If that is correct, then the user does want to record the "number of
> bytes" that X ways on the measurement CPU provided.
>

The mapping of ways to bits is implementation dependent, so we really
cannot treat one way as one bit when mapping a size to bits. You could
check the cache capacity in /proc and then the number of bits in the cbm
(the max bits are shown in the root intel_rdt cgroup).
Ex: the cache is 2MB and we have a 16 bit cbm - a mask of 0xff would
represent 1MB.

> Second question:
> Do you envision any use case in which the placement of cache,
> and not the quantity of cache, is a criterion for the decision?
> That is, two cases with the same amount of cache for each CLOSid,
> but with different locations inside the cache?
> (except sharing of ways by two CLOSids, of course).
>

cbm max - 16 bits.
000f - allocate the right quarter. f000 - allocate the left quarter..?
Extend the case to any number of valid contiguous bits.
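
[ A rough sketch tying the size and placement answers together, reusing
  the illustrative 2MB cache / 16 bit cbm numbers from above; the real
  values come from the cache size in /proc and the max bits shown in
  the root intel_rdt cgroup, and the helpers are example code only. ]

/*
 * Illustration only: map cbm bits to bytes and place a contiguous
 * region.  Assumes a 2MB cache and a 16-bit cbm as in the example
 * above; real numbers come from /proc and the root intel_rdt cgroup.
 */
#include <stdio.h>

/* contiguous mask of 'nbits' bits starting at bit 'shift' (bit 0 = right end) */
static unsigned int contig_mask(unsigned int nbits, unsigned int shift)
{
	return ((1u << nbits) - 1) << shift;
}

/* number of set bits in the mask */
static unsigned int bits_set(unsigned int m)
{
	unsigned int n = 0;

	for (; m; m &= m - 1)
		n++;
	return n;
}

int main(void)
{
	unsigned long cache_bytes = 2ul << 20;               /* 2MB, assumed  */
	unsigned int cbm_len = 16;                           /* assumed       */
	unsigned long bytes_per_bit = cache_bytes / cbm_len; /* 128KB per bit */

	unsigned int half  = contig_mask(8, 0);   /* 0x00ff -> 1MB            */
	unsigned int right = contig_mask(4, 0);   /* 0x000f -> right quarter  */
	unsigned int left  = contig_mask(4, 12);  /* 0xf000 -> left quarter   */

	printf("0x%04x = %lu KB\n", half,  bits_set(half)  * bytes_per_bit / 1024);
	printf("0x%04x = %lu KB\n", right, bits_set(right) * bytes_per_bit / 1024);
	printf("0x%04x = %lu KB\n", left,  bits_set(left)  * bytes_per_bit / 1024);
	return 0;
}

[ 000f and f000 give the same amount of cache and differ only in
  location; whether that location matters beyond overlap with another
  CLOSid is exactly the question above. ]
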
> Third question:
> How about support for the (new) I/D cache division?
>

Planning to send a patch at the end of this week or early next week.

Thanks,
Vikas

--8323329-948111101-1438204820=:8113--