From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932309AbbHGOsx (ORCPT );
	Fri, 7 Aug 2015 10:48:53 -0400
Received: from mail-yk0-f180.google.com ([209.85.160.180]:34430 "EHLO
	mail-yk0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753778AbbHGOsv (ORCPT );
	Fri, 7 Aug 2015 10:48:51 -0400
Date: Fri, 7 Aug 2015 10:48:48 -0400
From: Tejun Heo 
To: Vikas Shivappa 
Cc: Vikas Shivappa , linux-kernel@vger.kernel.org, x86@kernel.org,
	hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org,
	peterz@infradead.org, Matt Fleming , "Auld, Will" ,
	"Williamson, Glenn P" , "Juvva, Kanaka D" 
Subject: Re: [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service
	management
Message-ID: <20150807144848.GB14626@mtj.duckdns.org>
References: <1435789270-27010-1-git-send-email-vikas.shivappa@linux.intel.com>
	<1435789270-27010-6-git-send-email-vikas.shivappa@linux.intel.com>
	<20150730194458.GD3504@mtj.duckdns.org>
	<20150802163157.GB32599@mtj.duckdns.org>
	<20150804190324.GH17598@mtj.duckdns.org>
	<20150805154627.GL17598@mtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Thu, Aug 06, 2015 at 01:58:39PM -0700, Vikas Shivappa wrote:
> > I'm having a hard time believing that.  There definitely are use
> > cases where cachelines are thrashed among service threads.  Are you
> > proclaiming that those cases aren't gonna be supported?
>
> Please refer to the noisy neighbour example i give here to help resolve
> thrashing by a noisy neighbour -
> http://marc.info/?l=linux-kernel&m=143889397419199

I don't think that's relevant to the discussion.  Implement a
taskset-like tool and the administrator can deal with it just fine.
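(For comparison, per-task CPU affinity already works exactly this way
without any cgroup involvement.  A minimal sketch of such a
taskset-like operation in Python, using the sched_setaffinity(2)
wrappers; the helper name is illustrative, not anything from the
patch series:)

```python
import os

def pin_to_cpus(pid, cpus):
    """Pin a task to a set of CPUs, the way taskset(1) does.

    pid=0 means the calling process.  This is the kind of simple,
    per-task programmable interface that predates cgroups.
    """
    os.sched_setaffinity(pid, cpus)   # wraps sched_setaffinity(2)
    return os.sched_getaffinity(pid)  # report the effective mask

if __name__ == "__main__":
    # Pin ourselves to CPU 0 only; no cgroup hierarchy involved.
    print(pin_to_cpus(0, {0}))
```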
As I wrote multiple times now, people have been dealing with CPU
affinity just fine w/o cgroups.  Sure, cgroups add something on top,
but they're a much more complex facility, not a replacement for a more
basic control mechanism.

> >> - This interface, like you said, can easily bolt on.  Basically an
> >> easy-to-use interface without worrying about the architectural
> >> details.
> >
> > But it's rife with architectural details.
>
> If specifying the bitmask is an issue, it can easily be addressed by
> writing a script which calculates the bitmask to size - like mentioned
> here
> http://marc.info/?l=linux-kernel&m=143889397419199

Let's say we fully virtualize cache partitioning so that each user can
express what they want and the kernel can compute and manage the
closest mapping supportable by the underlying hardware.  That should be
doable, but I don't think it's what we want at this point.

This, at least for now, is a niche feature which requires specific
configurations to be useful, and while helpful to certain narrow use
cases it is unlikely to be used across the board.  Given that, we don't
want to overengineer the solution.  Implement something simple and
specific.  We don't yet even know the full usefulness or the use cases
of the feature.  It doesn't make sense to overcommit to complex
abstractions and mechanisms when there's a fairly good chance that our
understanding of the problem itself is still very porous.

The same applies to making it part of cgroups.  It's a lot more complex
and commits us to a lot more than implementing something simple and
specific would.  Let's please keep it simple.

> > I'm not saying they are mutually exclusive, but that we're going
> > overboard in this direction when a programmable interface should be
> > the priority.  This mostly happened naturally for other resources
> > because cgroups was introduced later, but I think there's a general
> > rule to follow there.
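(As an aside, the "script which calculates the bitmask to size"
suggested earlier is indeed straightforward.  A hypothetical sketch
follows; the 20-bit mask width and 20 MB cache size are made-up
example values, not figures from the patches or from real hardware:)

```python
def size_to_cbm(want_bytes, cache_bytes=20 * 1024 * 1024, cbm_bits=20):
    """Translate a desired cache size into a contiguous capacity bitmask.

    Intel CAT masks must be contiguous, so the request is rounded up
    to a whole number of mask bits, set starting from bit 0.  The
    default cache size and mask width here are purely illustrative.
    """
    per_bit = cache_bytes // cbm_bits          # bytes covered per mask bit
    nbits = -(-want_bytes // per_bit)          # ceiling division
    nbits = max(1, min(nbits, cbm_bits))       # >= 1 bit, <= full mask
    return (1 << nbits) - 1                    # contiguous mask from bit 0

# e.g. asking for 2 MB out of a 20 MB cache with a 20-bit mask -> 2 bits
print(hex(size_to_cbm(2 * 1024 * 1024)))       # 0x3
```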
>
> Right, the cache allocation cannot be treated like memory, as
> explained in 1.3 and 1.4 here
> http://marc.info/?l=linux-kernel&m=143889397419199

Who said that it could be?  If it actually were a resource as
ubiquitous, flexible and divisible as memory, cgroups would be a much
better fit.

> > If you factor in threads of a process, the above model is
> > fundamentally flawed.  How would root or any external entity find
> > out which threads are to be allocated what?
>
> the process ID can be added to the cgroup together with all its
> threads as shown in example of cgroup usage in (2) here -

And how does an external entity find out which ID should be put where?
That is knowledge known only to the process itself.  That's what I
meant when I said that going this route requires individual
applications communicating with external agents.

> In most cases in the cloud you will be able to decide based on what
> workloads are running - see the example 1.5 here
>
> http://marc.info/?l=linux-kernel&m=143889397419199

Sure, but that's at a far outer scope.  The point was that this can't
handle in-process scope.

> > Each application would constantly have to tell an external agent
> > about what its intentions are.  This might seem to work in a limited
> > feature-testing setup where you know everything about who's doing
> > what, but it is in no way a widely deployable solution.  This pretty
> > much degenerates into #3 you listed below.
>
> App may not be the best one to decide 1.1 and 1.2 here
> http://marc.info/?l=linux-kernel&m=143889397419199

That paragraph just shows how little is understood.  So you can't
imagine a situation where the threads of a process agree on how they'll
use the cache to improve performance?  Threads of the same program do
things like this all the time with other types of resources.  This is a
large part of what server software programmers do - making the threads
and other components behave in a way that maximizes the efficacy of the
underlying system.

Thanks.

-- 
tejun