linux-kernel.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <h.peter.anvin@intel.com>,
	Tejun Heo <tj@kernel.org>, Borislav Petkov <bp@suse.de>,
	Stephane Eranian <eranian@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	David Carrillo-Cisneros <davidcc@google.com>,
	Ravi V Shankar <ravi.v.shankar@intel.com>,
	Vikas Shivappa <vikas.shivappa@linux.intel.com>,
	Sai Prakhya <sai.praneeth.prakhya@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>
Subject: Re: [PATCH 13/32] Documentation, x86: Documentation for Intel resource allocation user interface
Date: Thu, 14 Jul 2016 08:53:17 +0200 (CEST)
Message-ID: <alpine.DEB.2.11.1607132310310.4083@nanos>
In-Reply-To: <20160713171310.GA14521@intel.com>

On Wed, 13 Jul 2016, Luck, Tony wrote:
> On Wed, Jul 13, 2016 at 02:47:30PM +0200, Thomas Gleixner wrote:
> > On Tue, 12 Jul 2016, Fenghua Yu wrote:
> > > +3. Hierarchy in rscctrl
> > > +=======================
> > 
> > What does rscctrl mean?
> > 
> > You were not able to find a more cryptic acronym?
> 
> rscctrl == resource control
> 
> Intel marketing would (probably) like us to use:
> 
>    /sys/fs/Intel(R) Resource Director Technology(TM)/
> 
> Happy to take suggestions for something in between those
> extremes :-)

I'd suggest "resctrl". The abbreviation dictionaries tell me that the most
common abbreviations for resource are: R, RESORC, RES

> > > +Any tasks scheduled on the cpus will use the schemas. User can set
> > > +both "cpus" and "tasks" to share the same schema in one directory. But when
> > > +a CPU is bound to a schema, a task running on the CPU uses this schema and
> > > +kernel will ignore schema set up for the task in "tasks".
> > 
> > This does not make any sense. 
> > 
> > When a task is bound to a schema then this should have preference over the
> > schema which is associated to the CPU. The CPU association is meant for tasks
> > which are not bound to a particular partition/schema.
> > 
> > So the initial setup should be:
> > 
> >    - All CPUs are associated to the root resource partition
> > 
> >    - No thread is associated to a particular resource partition
> > 
> > When a thread is added to a 'tasks' file of a partition then this partition
> > takes preference. If it's removed, i.e. the association to a partition is
> > undone, then the CPU association is used.
> > 
> > I have no idea why you think that all threads should be in a tasks file by
> > default. Associating CPUs in the first place makes a lot more sense as it
> > represents the topology of the system nicely.
> 
> If we did it that way, it would be harder to change the default
> resources.  E.g. now we start with all processes in the root
> rdtgroup.  We can change the schema for the root group and restrict
> them to, say, 60% of L3 cache on one (or all) sockets - giving us
> 40% of cache to give out to one or more groups.

I tend to disagree.

If you start up with all resources assigned to all CPUs and all tasks are set
to use the CPU default, then you can still restrict the root CPU defaults to
60% L3 which gives you 40% of cache to hand out.

What's hard about this?
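
To put numbers on the 60%/40% split, here is a minimal sketch of the mask
arithmetic. It assumes a 20 bit wide capacity bitmask and contiguous masks;
the width, the helper name and the layout are made up for illustration, real
hardware reports its own mask length.

```python
def contiguous_mask(n_bits, start=0):
    """Return a bitmask with n_bits consecutive bits set, beginning at bit start."""
    return ((1 << n_bits) - 1) << start


CBM_LEN = 20  # assumed capacity bitmask width, not taken from the patch set

# Root CPU default is restricted to 60% of the L3 cache ways.
root_bits = CBM_LEN * 60 // 100
root_mask = contiguous_mask(root_bits)

# The remaining 40% can be handed out to new partitions.
spare_mask = contiguous_mask(CBM_LEN - root_bits, start=root_bits)

print(f"root  {root_mask:05x}")
print(f"spare {spare_mask:05x}")
```

The two masks do not overlap, so whatever gets carved out of the spare range
for a new partition never collides with the root default.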

Now you can start to create new partitions and either assign CPU or tasks to
them.

As a side effect that avoids the whole 'find all tasks' on mount machinery
simply because the CPU defaults do not change at all.
 
> So what we've implemented (and perhaps need to explain better here)
> is that every thread always belongs to one (and only one) rdtgroup.
> It will use the resources described in that group wherever it runs,
> except in the case where we have designated some cpus as special snowflakes.

I don't see that case as a special snowflake. Due to the very limited number
of CLOSIDs the CPU association is going to be a very useful tool.

> When a cpu is assigned to an rdtgroup the schema for the cpu has
> precedence (i.e. we write the MSR with a CLOSID once, and then it
> never changes).
> 
> Some of this is confusing because people will very likely also use
> cpu affinity to control where their processes run. But affinity is
> orthogonal to rdtgroup membership.

Right. It's confusing and what's even more confusing is that you have no way
to figure out what a particular task is actually using. With the 'use CPU
defaults, if not assigned to a partition' scheme you can very easily figure
out what a task is using because it's either in a partition task list or not.
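
The lookup in that scheme can be sketched as follows. This is only a model of
the rule, not kernel code; the class, the method names and the partition names
are hypothetical.

```python
ROOT = "root"


class Partitions:
    """Model: every CPU has a partition, tasks are bound only on request."""

    def __init__(self, cpus):
        # Initial setup: all CPUs in the root partition, no task bound anywhere.
        self.cpu_partition = {cpu: ROOT for cpu in cpus}
        self.task_partition = {}

    def bind_cpu(self, cpu, partition):
        self.cpu_partition[cpu] = partition

    def bind_task(self, tid, partition):
        self.task_partition[tid] = partition

    def unbind_task(self, tid):
        self.task_partition.pop(tid, None)

    def effective(self, tid, cpu):
        # A task listed in a 'tasks' file uses that partition,
        # otherwise it falls back to the default of the CPU it runs on.
        return self.task_partition.get(tid, self.cpu_partition[cpu])


p = Partitions(range(4))
p.bind_cpu(3, "isolated")
p.bind_task(100, "important")
print(p.effective(100, 3))  # task binding wins
print(p.effective(200, 3))  # unbound task uses the CPU default
```

Answering "what is this task using?" is then a single dictionary lookup plus
the CPU default.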

> I think what we have allows you to do all the things we talked about.
> But if we are missing a case, or if things can be simplified while
> still retaining the same functionality then let's discuss that.

It covers almost everything except the case I outlined before:

   Isolated CPU            Important Task runs on isolated CPU
   5% exclusive cache      10% exclusive cache

That's impossible with your scheme, but it's something which matters. You want
to make sure that the system services on that isolated CPU stay cache hot
without hurting the cache locality of your isolated task.
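
A small model of that case under both precedence rules, as I read them from
this thread. The group names and the two helper functions are hypothetical,
and the percentages are just the ones from the example above.

```python
# Percent of exclusive cache configured per group in the example.
SHARE = {"isolated_cpu": 5, "important": 10}


def cpu_first(task_group, cpu_group):
    # Posted scheme: a CPU bound to a group overrides the group of the task.
    return cpu_group if cpu_group is not None else task_group


def task_first(task_group, cpu_group):
    # Proposed scheme: an explicit task binding overrides the CPU default.
    return task_group if task_group is not None else cpu_group


def shares(rule, default_task_group):
    """Return (important task share, system service share) on the isolated CPU."""
    important = rule("important", "isolated_cpu")
    service = rule(default_task_group, "isolated_cpu")
    return SHARE[important], SHARE[service]


posted = shares(cpu_first, "root")    # every task sits in some group, root by default
proposed = shares(task_first, None)   # unbound tasks have no group at all
print("posted  ", posted)
print("proposed", proposed)
```

With the CPU taking precedence both the important task and the system
services end up in the same 5%. With the task taking precedence the task gets
its own 10% and the services keep the 5% of the CPU.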

> Otherwise we can revise the documentation to explain all this better.

That needs to be done in any case. The existing one does not really qualify as
proper documentation. It's closer to a fairy tale :)

I really have to ask why you did not take the time to put all the
information you just gave into that documentation file in the first place.

> > > +Initial value is all zeros which means there is no CPU bound to the schemas
> > > +in the root directory and tasks use the schemas.
> > 
> > As I said above this is backwards.
> 
> > > +If one resource is disabled, its line is not shown in schemas file.
> > 
> > That means:	  
> > 
> >      Resources which are not described in a schemata file are disabled for
> >      that particular partition.
> > 
> > Right?
> > 
> > Now that raises the question how this is supposed to work. Let's assume that
> > we have a partition 'foo' and thread X is in the tasks file of that
> > partition. The schema of that partition contains only an L2 entry. What's the
> > L3 association for thread X? Nothing at all?
> 
> Resources are either enabled or disabled globally. Each schema file
> must provide details for every enabled resource. So if we are on a
> processor that supports both L2 and L3, we will normally have schema
> files that specify both.
> We could boot with the "disable_cat_l2"
> kernel command line option and then every schema file would just
> specify L3 (and the MSRs for L2 would all be set to all-ones so that
> everyone had full access to the L2 on each core).

So the above should read:

   Each schema file must provide configuration for all resource controls which
   are enabled in the system.

Right?
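
If that is the rule, then checking a schema file boils down to something like
the sketch below. The "NAME:value" line format and the function name are
assumptions for illustration, not the format defined by the patch.

```python
def missing_resources(schema_text, enabled):
    """Return the enabled resources which have no line in schema_text."""
    seen = set()
    for line in schema_text.splitlines():
        name, sep, _ = line.partition(":")
        if sep:  # ignore lines without a "NAME:" prefix
            seen.add(name.strip())
    return sorted(set(enabled) - seen)


# L2 and L3 enabled, but the schema only describes L3: reject the write.
print(missing_resources("L3:0=fffff;1=fffff\n", ["L2", "L3"]))
```

A write would only be accepted when the returned list is empty.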
 
> > > +User can create a sub-directory under the root directory by "mkdir" command.
> > > +User can remove the sub-directory by "rmdir" command.
> > 
> > User? Any user?
> 
> Well if someone did:
>  # chmod 777 /sys/fs/rscctrl
> then any user could make directories.  That would be inadvisable.
> You could use 775 and let a trusted group have control so that you
> didn't require root access to modify things.
> 
> Should we say "system administrator" rather than "user"?

Yes. Because the default should be 755 which is the obvious choice for all
root/admin controlled things. If root decides to change it to 777 then it's
not the kernel's problem. But documentation should clearly say: It's a root
controlled resource.
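
For reference, this is what the three modes mentioned above mean for creating
directories. A minimal sketch which only looks at the mode bits; the function
name is made up, and root is not limited by these bits anyway.

```python
import stat


def can_create_entries(mode):
    """List the permission classes which may create entries in a directory.

    Creating an entry needs both write and search (execute) permission.
    """
    classes = {
        "owner": (stat.S_IWUSR, stat.S_IXUSR),
        "group": (stat.S_IWGRP, stat.S_IXGRP),
        "other": (stat.S_IWOTH, stat.S_IXOTH),
    }
    return [who for who, (w, x) in classes.items() if mode & w and mode & x]


for mode in (0o755, 0o775, 0o777):
    print(oct(mode), can_create_entries(mode))
```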

Thanks,

	tglx
