From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753188AbbDACOR (ORCPT <rfc822;w@1wt.eu>);
	Tue, 31 Mar 2015 22:14:17 -0400
Received: from mx1.redhat.com ([209.132.183.28]:50395 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752445AbbDACOL (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 31 Mar 2015 22:14:11 -0400
Date: Tue, 31 Mar 2015 19:56:23 -0300
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>, x86@kernel.org,
        linux-kernel@vger.kernel.org, hpa@zytor.com, tglx@linutronix.de,
        mingo@kernel.org, tj@kernel.org, peterz@infradead.org,
        matt.fleming@intel.com, will.auld@intel.com,
        glenn.p.williamson@intel.com, kanaka.d.juvva@intel.com
Subject: Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide
Message-ID: <20150331225623.GA3727@amt.cnet>
References: <1426202167-30598-1-git-send-email-vikas.shivappa@linux.intel.com>
 <1426202167-30598-8-git-send-email-vikas.shivappa@linux.intel.com>
 <20150325223941.GA5657@amt.cnet>
 <alpine.DEB.2.10.1503261133070.19649@vshiva-Udesk>
 <20150327012927.GA1709@amt.cnet>
 <alpine.DEB.2.10.1503310955180.3891@vshiva-Udesk>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <alpine.DEB.2.10.1503310955180.3891@vshiva-Udesk>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 31, 2015 at 10:27:32AM -0700, Vikas Shivappa wrote:
> 
> 
> On Thu, 26 Mar 2015, Marcelo Tosatti wrote:
> 
> >
> >I can't find any discussion relating to exposing the CBM interface
> >directly to userspace in that thread ?
> >
> >Cpu.shares is written in ratio form, which is much more natural.
> >Do you see any advantage in maintaining the
> >
> >(ratio -> cbm bitmasks)
> >
> >translation in userspace rather than in the kernel ?
> >
> >What about something like:
> >
> >
> >		      root cgroup
> >		   /		  \
> >		  /		    \
> >		/		      \
> >	cgroupA-80			cgroupB-30
> >
> >
> >So that whatever exceeds 100% is the ratio of cache
> >shared at that level (cgroup A and B share 10% of cache
> >at that level).
> 
> But this also means the 2 groups share all of the cache ?
> 
> Specifying the amount of bits to be shared lets you specify the
> exact cache area where you want to share and also when your total
> occupancy does not cover all of the cache. For ex: it gets more
> complex when you want to share say only the left quarter of the
> cache. cgroupA gets left half and cgroup gets left quarter. The
> bitmask aligns with how the h/w is designed to share the cache which
> gives you flexibility to define any specific overlapping areas of
> the cache.

> >https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html
> >
> >cpu — the cpu.shares parameter determines the share of CPU resources
> >available to each process in all cgroups. Setting the parameter to 250,
> >250, and 500 in the finance, sales, and engineering cgroups respectively
> >means that processes started in these groups will split the resources
> >with a 1:1:2 ratio. Note that when a single process is running, it
> >consumes as much CPU as necessary no matter which cgroup it is placed
> >in. The CPU limitation only comes into effect when two or more processes
> >compete for CPU resources.
> >
> >
> 
> These are more defined in terms of how many cache lines (or how many
> cache ways) they can use and would be difficult to define them in
> terms of percentage. In contrast the cpu share is a time shared
> thing and is much more granular where as here its not , its
> occupancy in terms of cache lines/ways.. (however this is not really
> defined as a restriction but thats the way it is now).
> Also note that the granularity of the bitmasks define the
> granularity of the percentages and in some SKUs the granularity is
> 2b and not 1b.. So technically you wont be able to even allocate
> percentage of cache even in 10% granularity for most of the cases
> (if there are 30MB and 25 ways like in one of hsw SKU) and this will
> vary for different SKUs which makes it more complicated for users.
> However the user library is free to define own interface based on
> the underlying cgroup interface say for example you never care about
> the overlapping and using it for a specific SKU etc.. The underlying
> cgroup framework is meant to be  generic for all SKus and used for
> most of the use cases.
> 
> Also at this point I see a lot of enterprise and and other users
> already using the cgroup interface or shown interest in the same.
> However I see your point where you indicate the ease with which user
> can specify in size/percentage which he might be used to doing for
> other resources rather than bits where he needs to get an idea size
> by calculating it seperately - But again note that you may not be
> able to define percentages in many scenarios like the one above. And
> another question would be we would need to convince the users to
> adapt to the modified percentage user model (ex: like the one you
> say above where percentage - 100 is the one thats shared)
> I can review this requirements and others I have received and get
> back to see the closest that can be done if possible.
> 
> Thanks,
> Vikas

Vikas,

I see. Don't have anything against performing the translation in userspace
(i agree userspace should be able to allow ratios and specific
minimum/maximum counts). Can you please export the relevant information
in files in /sys or cgroups itself rather than requiring userspace to
parse CPUID etc? Including the EBX register from CPUID(EAX=10H, ECX=1),
which is necessary to implement "reserved LLC" properly.

The current interface is unable to handle the cross CPU case, though.
It would be necessary to expose per-socket masks.