Date: Mon, 2 Nov 2015 20:20:58 -0200
From: Marcelo Tosatti
To: Peter Zijlstra
Cc: Thomas Gleixner, Fenghua Yu, H Peter Anvin, Ingo Molnar,
    linux-kernel@vger.kernel.org, x86, Vikas Shivappa, Karen Noel,
    Paolo Bonzini, Luiz Capitulino, Clark Williams, Tejun Heo
Subject: cat cgroup interface proposal (non hierarchical) was Re: [PATCH V15 00/11] x86: Intel Cache Allocation Technology Support
Message-ID: <20151102222057.GA30158@amt.cnet>
References: <1443766185-61618-1-git-send-email-fenghua.yu@intel.com>
    <20151013213125.GA16200@amt.cnet>
    <20151015113614.GL3816@twins.programming.kicks-ass.net>
    <20151016022851.GA9008@amt.cnet>
    <20151016095022.GP3816@twins.programming.kicks-ass.net>
In-Reply-To: <20151016095022.GP3816@twins.programming.kicks-ass.net>

On Fri, Oct 16, 2015 at 11:50:22AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 15, 2015 at 11:28:52PM -0300, Marcelo Tosatti wrote:
> > On Thu, Oct 15, 2015 at 01:36:14PM +0200, Peter Zijlstra wrote:
> > > On Tue, Oct 13, 2015 at 06:31:27PM -0300, Marcelo Tosatti wrote:
> > > > I am rewriting the interface with ioctls, with commands similar to the
> > > > syscall interface proposed.
> > >
> > > Which is horrible for other use cases. I really don't see the problem
> > > with the cgroup stuff.
> >
> > Can you detail what "horrible" means?
>
> Say an RT scenario; you set up your machine with cgroups.
> You create a
> cpuset with is disjoint from the others, you frob around with the cpu
> cgroup, etc..
>
> So once you're all done, you start your RT app into a cgroup.
>
> But oh, fail, now you have to go muck about with ioctl()s to get the
> cache allocation cruft to work.

Peter,

what follows is your cgroup proposal (extended), but at the end there is a
point about the impossibility of sharing cache between tasks with this
cgroup interface, which IMO renders it unusable (as it blocks any threads
from sharing reserved cache).

If you have any ideas on how to circumvent this, they are appreciated.

The non-hierarchical cgroup CAT interface proposal follows.
Thanks to some of the CC'ed Red Hat folks for early comments.

cgroup CAT interface (non hierarchical):
---------------------------------------

0) Main directory files:

cat_hw_info
-----------
CAT HW information: CBM length, CDP supported, etc.
Information reported per-socket, as sockets can have
different configurations. Perhaps should be inside sysfs.

1) Sub-directories represent cache reservations (size,type).

mkdir cache_reservation_for_forwarding_app
cd cache_reservation_for_forwarding_app
echo "80KB" > size
echo "data_and_code" > type
echo "socketmask=0xfffff" > socketmask (optional)
echo "1" > enable_reservation
echo "pid-of-forwarding-main-thread pid-of-helper-thread ..." > tasks

Files:

type
----------------
{data_and_code, only_code, only_data}. Type of
L3 CAT cache allocation to use. only_code,only_data only
supported on CDP capable processors.

size
----
size of L3 cache reservation.

rounding
--------
{round_down,round_up} whether to round up / round down
allocation size in kbytes, to cache-way size.

Default: round_up

socketmask
----------
Mask of sockets where the reservation is in effect.
A zero bit means the task will not have the L3 cache
portion that the cgroup references reserved on that socket.

Default: all sockets set.

enable
------
Allocate reservation with parameters set above.
When a reservation is enabled, it reserves L3 cache
space on any socket that is specified in "socketmask".

After the cgroup has been enabled by a write of "1" to the
"enable_reservation" file, only the "tasks" file can be modified.

To change the size of a cgroup reservation, recreate the directory.

tasks
-----
Contains the list of tasks which use this cache reservation.

Error reporting
---------------
Errors are reported in response to write as appropriate:
for example, write 1 > enable when there is not enough space
for "socketmask" would return -ENOSPC, etc.
Write to "enable" without size being set would return -EINVAL, etc.

Listing
-------
To list which reservations are in place, search for subdirectories
where "enabled" file has value 1.

Semantics: A task has guaranteed cache reservation on any CPU where
it is scheduled in, for the lifetime of the cgroup, as long as that
task is not attached to further cgroups.

That is, a task belonging to cgroup-A can have its cache reservation
invalidated when attached to cgroup-B (reasoning: it might be necessary
to reallocate the CBMs to respect contiguous bits in cache, a restriction
of the CAT HW interface).

-------
BLOCKER
-------

Can't use cgroups for CAT because:

"Control Groups extends the kernel as follows:

 - Each task in the system has a reference-counted pointer to a
   css_set.

 - A css_set contains a set of reference-counted pointers to
   cgroup_subsys_state objects, one for each cgroup subsystem
   registered in the system."

You need a task to be part of two cgroups at one time, to support
the following configuration:

Task-A: 70% of cache reserved exclusively (reservation-0).
        20% of cache reserved (reservation-1).

Task-B: 20% of cache reserved (reservation-1).

Unless reservations are created separately, then added to cgroups:

mount -t cgroup ... /../catcgroup/
cd /../catcgroup/
# create reservations
cd reservations
mkdir reservation-1
echo "80K" > size
echo "socketmask" > ...
echo "1" > enable
mkdir reservation-2
echo "160K" > size
echo "socketmask" > ...
echo "1" > enable

# attach reservation to cgroups
cd /../catcgroup/
mkdir cgroup-for-threaded-app
echo reservation-1 reservation-2 > reservations
echo "mainthread" > tasks
cd ..
mkdir cgroup-for-helper-thread
echo reservation-1 > reservations
echo "helperthread" > tasks
cd ..

This way mainthread and helperthread can share "reservation-1".

But this is abusing cgroups in a way they were not designed for.
Who is going to maintain the linkage between reservations and cgroups?
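
To make the Task-A / Task-B configuration from the BLOCKER section more
concrete, here is a rough Python sketch of how per-task capacity bitmasks
(CBMs) could be derived when a task may reference more than one reservation.
Everything in it is an assumption made for illustration: the 20-bit CBM
length, the names CBM_LEN, ways_for, allocate and is_contiguous, and the
policy of packing reservations into adjacent bit ranges. It is not the
allocation logic of the kernel or of the patch set.

```python
import errno

# All names and numbers below are hypothetical, for illustration only.
CBM_LEN = 20  # assumed CBM length in bits; real hardware reports its own


def ways_for(percent, cbm_len=CBM_LEN):
    """Number of CBM bits needed to cover `percent` of the cache, rounded up."""
    return -(-percent * cbm_len // 100)


def allocate(reservations, cbm_len=CBM_LEN):
    """Pack (name, percent) reservations into adjacent, non-overlapping bit ranges."""
    masks = {}
    next_bit = 0
    for name, percent in reservations:
        nbits = ways_for(percent, cbm_len)
        if next_bit + nbits > cbm_len:
            # Mirrors the -ENOSPC behaviour described for "enable".
            raise OSError(errno.ENOSPC, "not enough cache space for " + name)
        masks[name] = ((1 << nbits) - 1) << next_bit
        next_bit += nbits
    return masks


def is_contiguous(mask):
    """True if the set bits of `mask` form a single run (the CAT HW restriction)."""
    if mask == 0:
        return False
    run = mask // (mask & -mask)      # drop trailing zero bits
    return (run & (run + 1)) == 0     # remaining bits must be all ones


masks = allocate([("reservation-0", 70), ("reservation-1", 20)])

# A task's effective mask is the union of the reservations it references.
task_a = masks["reservation-0"] | masks["reservation-1"]
task_b = masks["reservation-1"]

print("Task-A CBM:  %#x" % task_a)
print("Task-B CBM:  %#x" % task_b)
print("shared bits: %#x" % (task_a & task_b))
```

With these assumed numbers, Task-A's mask is the union of both reservations
and Task-B's mask is reservation-1 alone, so the two tasks overlap exactly on
the shared bits. The union stays contiguous only because the two reservations
were placed next to each other, which is one way to read the earlier remark
about having to reallocate CBMs.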