From: Marcelo Tosatti <mtosatti@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	linux-kernel@vger.kernel.org, vikas.shivappa@intel.com,
	x86@kernel.org, hpa@zytor.com, tglx@linutronix.de,
	mingo@kernel.org, peterz@infradead.org, matt.fleming@intel.com,
	will.auld@intel.com, glenn.p.williamson@intel.com,
	kanaka.d.juvva@intel.com
Subject: Re: [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service management
Date: Tue, 4 Aug 2015 09:55:20 -0300
Message-ID: <20150804125520.GA31450@amt.cnet>
In-Reply-To: <20150803203250.GA31668@amt.cnet>

On Mon, Aug 03, 2015 at 05:32:50PM -0300, Marcelo Tosatti wrote:
> On Sun, Aug 02, 2015 at 12:23:25PM -0400, Tejun Heo wrote:
> > Hello,
> > 
> > On Fri, Jul 31, 2015 at 12:12:18PM -0300, Marcelo Tosatti wrote:
> > > > I don't really think it makes sense to implement a fully hierarchical
> > > > cgroup solution when there isn't the basic affinity-adjusting
> > > > interface 
> > > 
> > > What is an "affinity-adjusting interface"? Can you give an example
> > > please?
> > 
> > Something similar to sched_setaffinity().  Just a syscall / prctl or
> > whatever programmable interface that sets a per-task attribute.
> 
> You really want to specify the cache configuration "at once": 
> having process-A exclusive access to 2MB of cache at all times,
> and process-B 4MB exclusive, means you can't have process-C use 4MB of 
> cache exclusively (consider an 8MB cache machine).

That's not true. It's fine to set up the

	task set <--> cache portion

mapping in pieces.

In fact, it's more natural because you don't necessarily know in advance
the entire cache allocation (think of "cp largefile /destination" with
sequential use-once behavior).
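
To illustrate (a sketch only, not the interface this patch set
proposes): with the msr driver you can already program a single CLOS's
L3 bitmask in isolation and come back for the others later. MSR numbers
are per the Intel SDM section 17 (IA32_L3_MASK_n = 0xC90 + CLOSid);
assumes root and CAT-capable hardware:

/* sketch: set the L3 cache bitmask for one CLOS via /dev/cpu/N/msr */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define IA32_L3_MASK_BASE 0xC90		/* IA32_L3_MASK_0 */

static int write_l3_cbm(int cpu, unsigned int closid, uint64_t cbm)
{
	char path[32];
	int fd, ret = -1;

	snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	/* for the msr device, the file offset selects the MSR number */
	if (pwrite(fd, &cbm, sizeof(cbm), IA32_L3_MASK_BASE + closid) ==
	    sizeof(cbm))
		ret = 0;
	close(fd);
	return ret;
}

int main(void)
{
	/* give CLOS 1 cache ways 0-1 now; other classes can be
	 * configured later, in pieces, as their needs become known */
	return write_l3_cbm(0, 1, 0x3) ? 1 : 0;
}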

However, there is a use-case for sharing: in scenario 1 it might be
possible (and desirable) to share code between applications.
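
(With CAT this is just overlapping bitmasks: for example, CBM 0xF0 for
one class of service and 0x3C for another share ways 4-5 while keeping
the rest of each allocation private.)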

> > > > and it isn't clear whether fully hierarchical resource
> > > > distribution would be necessary, especially given that the granularity
> > > > of the target resource is very coarse.
> > > 
> > > As I see it, the benefit of the hierarchical structure to the CAT
> > > configuration is simply to organize sharing of cache ways in subtrees
> > > - two cgroups can share a given cache way only if they have a common
> > > parent. 
> > > 
> > > That is the only benefit. Vikas, please correct me if I'm wrong.
> > 
> > cgroups is not a superset of a programmable interface.  It has
> > distinctive disadvantages and is not, even with hierarchy support, a
> > substitute for a regular syscall-like interface.  I don't think it makes sense
> > to go full-on hierarchical cgroups when we don't have basic interface
> > which is likely to cover many use cases better.  A syscall-like
> > interface combined with a tool similar to taskset would cover a lot in
> > a more accessible way.
> 
> How are you going to specify sharing of portions of cache by two sets
> of tasks with a syscall interface?
> 
> > > > I can see that how cpuset would seem to invite this sort of usage but
> > > > cpuset itself is more of an arbitrary outgrowth (regardless of
> > > > history) in terms of resource control and most things controlled by
> > > > cpuset already have counterpart interfaces which are readily accessible
> > > > to the normal applications.
> > > 
> > > I can't parse that phrase (due to ignorance). Please educate.
> > 
> > Hmmm... consider CPU affinity.  cpuset definitely is useful for some
> > use cases as a management tool especially if the workloads are not
> > cooperative or delegated; however, it's no substitute for a proper
> > syscall interface and it'd be silly to try to replace that with
> > cpuset.
> > 
> > > > Given that what the feature allows is restricting usage rather than
> > > > granting anything exclusively, a programmable interface wouldn't need
> > > > to worry about complications around privileges
> > > 
> > > What complications about privileges do you refer to?
> > 
> > It's not granting exclusive access, so individual user applications
> > can be allowed to do whatever they want as long as the issuer has
> > enough privilege over the target task.
> 
> Privilege management with the cgroup system: to change cache allocation
> requires privilege over cgroups.
> 
> Privilege management with a system call interface: applications
> could be allowed to reserve up to a certain percentage of the cache.
> 
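
(To make the second model concrete - purely illustrative, the constants
are invented: the kernel would only need a popcount check against a
per-user quota before accepting an unprivileged reservation:)

/* sketch: allow an unprivileged CBM request only while it stays within
 * the caller's quota; L3_NUM_WAYS and max_pct are assumptions */
#include <stdbool.h>

#define L3_NUM_WAYS 20	/* e.g. a 20-way L3 */

static bool reservation_allowed(unsigned long cbm, int max_pct)
{
	return __builtin_popcountl(cbm) * 100 <= L3_NUM_WAYS * max_pct;
}
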
> > > > while being able to reap most of the benefits in a much easier way.
> > > > Am I missing something?
> > > 
> > > The interface does allow for exclusive cache usage by an application.
> > > Please read the Intel manual, section 17; it is very instructive.
> > 
> > For that, it'd have to require some CAP, but I think just having a
> > restrictive interface in the style of CPU or NUMA affinity would go a
> > long way.
> > 
> > > The use cases we have now are the following:
> > > 
> > > Scenario 1: Consider a system with 4 high performance applications
> > > running, one of which is a streaming application that manages a very
> > > large address space from which it reads and writes as it does its processing.
> > > As such, the application will use all the cache it can get but does
> > > not need much, if any, cache. So it spoils the cache for everyone for no
> > > gain on its own. In this case we'd like to constrain it to the
> > > smallest possible amount of cache while at the same time constraining
> > > the other 3 applications to stay out of this thrashed area of the
> > > cache.
> > 
> > A tool in the style of taskset should be enough for the above
> > scenario.
> > 
> > > Scenario 2: We have a numeric application that has been highly optimized
> > > to fit in the L2 cache (2MB, for example). We want to ensure that its
> > > cached data does not get flushed from the cache hierarchy while it is
> > > scheduled out. In this case we exclusively allocate enough L3 cache to
> > > hold all of the L2 cache.
> > >
> > > Scenario 3: Latency sensitive application executing in a shared
> > > environment, where memory to handle an event must be in L3 cache
> > > for latency requirements to be met.
> > 
> > Either isolate CPUs or run other stuff with affinity restricted.
> > 
> > cpuset-style allocation can be easier for things like this but that
> > should be an addition on top, not the one and only interface.  How is
> > it going to handle the case where multiple threads of a process want
> > to restrict cache usage to avoid stepping on each other's toes?  Delegate the
> > subdirectory and let the process itself open it and write to files to
> > configure when there isn't even a way to atomically access the
> > process's own directory or a way to synchronize against migration?
> 
> One would preconfigure that in advance - but you are right, a 
> syscall interface is more flexible in that respect.

So, systemd is responsible for locking.

> > cgroups may be an okay management interface but a horrible
> > programmable interface.
> > 
> > Sure, if this turns out to be as important as cpu or numa affinity and
> > gets widely used creating management burden in many use cases, we sure
> > can add cgroups controller for it but that's a remote possibility at
> > this point and the current attempt is over-engineering solution for
> > problems which haven't been shown to exist.  Let's please first
> > implement something simple and easy to use.
> > 
> > Thanks.
> > 
> > -- 
> > tejun

Don't see an easy way to fix the sharing use-case (it would require
exposing the "intersection" between two task sets).

Can't a "cacheset" helper (similar to taskset) talk to systemd
to achieve the flexibility you point out?
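
Something like this, say (hypothetical sketch - PR_SET_CACHE_MASK does
not exist, it stands in for whatever syscall/prctl plus systemd
negotiation we would end up with):

/* cacheset: taskset-style launcher built on a hypothetical prctl */
#include <stdio.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <unistd.h>

#define PR_SET_CACHE_MASK 0x59440001	/* invented request number */

int main(int argc, char **argv)
{
	unsigned long cbm;

	if (argc < 3) {
		fprintf(stderr, "usage: cacheset <hex way mask> <cmd> [args...]\n");
		return 1;
	}
	cbm = strtoul(argv[1], NULL, 16);
	/* reserve the requested cache ways for this task, then exec */
	if (prctl(PR_SET_CACHE_MASK, cbm, 0, 0, 0)) {
		perror("prctl(PR_SET_CACHE_MASK)");	/* fails on real kernels */
		return 1;
	}
	execvp(argv[2], &argv[2]);
	perror("execvp");
	return 1;
}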

