From: Luiz Capitulino <lcapitulino@redhat.com>
To: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H Peter Anvin" <hpa@zytor.com>, "Ingo Molnar" <mingo@redhat.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"linux-kernel" <linux-kernel@vger.kernel.org>,
	"x86" <x86@kernel.org>,
	"Vikas Shivappa" <vikas.shivappa@linux.intel.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	tj@kernel.org, riel@redhat.com
Subject: Re: [PATCH V15 00/11] x86: Intel Cache Allocation Technology Support
Date: Wed, 4 Nov 2015 09:42:27 -0500
Message-ID: <20151104094227.5aafdf2c@redhat.com>
In-Reply-To: <1443766185-61618-1-git-send-email-fenghua.yu@intel.com>

On Thu,  1 Oct 2015 23:09:34 -0700
Fenghua Yu <fenghua.yu@intel.com> wrote:

> This series has some preparatory patches and Intel cache allocation
> support.

Ping? What's the status of this series?

We badly need this series for KVM-RT workloads. I did try it and it
seems to work. Apart from small fixable issues, which I'll point out
in replies to the specific patches, there are some design issues on
which I need some clarification. In order of relevance:

 o Cache reservations are global to all NUMA nodes

   CAT is mostly intended for real-time and high performance
   computing. For both of them the most common setup is to
   pin your threads to specific cores on a specific NUMA node.

   So, suppose I have two HPC threads pinned to specific cores
   on node1. I want to reserve 80% of the L3 cache to those
   threads. With current patches I'd do this:

    1. Create an "all-tasks" cgroup which can only access 20% of
       the cache
    2. Create a "hpc" cgroup which can access 80% of the cache
    3. Move my HPC threads to "hpc" and all the other threads to
       "all-tasks"

   This has the intended behavior on node1: the "hpc" threads
   will write into 80% of the L3 cache and any "all-tasks" threads
   executing there will only write into 20% of the cache.
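
   To make the 80%/20% split concrete, here is a minimal sketch (Python,
   not part of the series) of how two non-overlapping masks could be
   derived. It relies on CBMs having to be contiguous runs of set bits.
   The 20-bit mask length, the function name split_cbm and the min_bits
   parameter are assumptions made for illustration; the real mask length
   and minimum width are enumerated by the hardware.

```python
def split_cbm(cbm_len: int, reserved_fraction: float, min_bits: int = 1):
    """Split a cbm_len-bit capacity bitmask into two contiguous, disjoint masks.

    Returns (reserved_mask, shared_mask). The reserved mask takes the high
    bits and the shared mask the low bits. min_bits is the smallest number
    of bits either mask may have (an assumption; hardware-dependent).
    """
    if cbm_len < 2 * min_bits:
        raise ValueError("mask too short to split")
    reserved_bits = round(cbm_len * reserved_fraction)
    # Clamp so that both masks keep at least min_bits bits set.
    reserved_bits = max(min_bits, min(cbm_len - min_bits, reserved_bits))
    shared_bits = cbm_len - reserved_bits
    shared_mask = (1 << shared_bits) - 1
    reserved_mask = ((1 << reserved_bits) - 1) << shared_bits
    return reserved_mask, shared_mask


# Assumed 20-bit CBM: 16 bits for "hpc", 4 bits for "all-tasks".
hpc, all_tasks = split_cbm(20, 0.8)
print(f"hpc=0x{hpc:05x} all-tasks=0x{all_tasks:05x}")
# prints "hpc=0xffff0 all-tasks=0x0000f"
```

   The two values would then be written as the masks of the "hpc" and
   "all-tasks" cgroups from steps 1 and 2.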

   However, this is also true for node0! So, the "all-tasks"
   threads can only write into 20% of the cache in node0 even
   though "hpc" threads will never execute there.

   Is this intended by design? For instance, is it a hardware limitation
   (given that the IA32_L3_MASK_n MSRs are global anyway), or maybe
   a way to enforce cache coherence?

   I was wondering if we could have masks per NUMA node, where
   they are applied to processes whenever they migrate among
   NUMA nodes.
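
   A rough sketch of what I have in mind, in Python rather than kernel
   code. The table layout, the names NODE_CBMS and cbm_for, and the
   20-bit mask values are all hypothetical; this only illustrates the
   lookup, not how the patches would implement it.

```python
FULL_CBM = 0xFFFFF  # assumed 20-bit mask covering the whole L3

# Hypothetical table: cgroup name -> {NUMA node -> CBM}.
# Nodes that are not listed fall back to the full mask.
NODE_CBMS = {
    "hpc":       {1: 0xFFFF0},
    "all-tasks": {1: 0x0000F},
}


def cbm_for(group: str, node: int) -> int:
    """CBM to apply when a task of `group` starts running on `node`."""
    return NODE_CBMS.get(group, {}).get(node, FULL_CBM)


for node in (0, 1):
    print(f"node{node}: all-tasks=0x{cbm_for('all-tasks', node):05x}")
```

   With such a table, "all-tasks" threads would be restricted to 20% of
   the cache only on node1 and would keep the whole cache on node0.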

 o How does this feature apply to kernel threads?

   I'm just unable to move kernel threads out of the root
   cgroup. This means that kernel threads can always write
   into all cache no matter what the reservation scheme is.

   Is this intended by design? Why? Unless I'm missing
   something, reservations could and should be applied to
   kernel threads as well.

 o You can't change the root cgroup's CBM

   I can understand this makes the implementation a lot simpler.
   However, the reality is that there are far too few CBMs
   and losing one for the root group seems like a waste.

   Can we change this or are there strong reasons not to do so?

 o cgroups hierarchy is limited by the number of CBMs

   Today on my Haswell system, this means that I can only have 3
   directories in my cgroups hierarchy. If the number of CBMs
   is expected to grow in future processors, then I think having
   this feature as cgroups makes sense. However, if we're still
   going to be this limited in terms of directory structure, then
   it seems a bit overkill to me to have this as cgroups.
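
   The limit can be modelled with a toy allocator (Python, purely
   illustrative). It assumes that the root cgroup permanently holds one
   class of service and that every other directory consumes one; the
   total of 4 is inferred from the 3 directories I can create, and the
   ClosidPool class and its method names are hypothetical.

```python
class ClosidPool:
    """Toy allocator: one class of service (CLOSid) per cgroup directory."""

    def __init__(self, num_closids: int):
        # CLOSid 0 is assumed to be permanently held by the root cgroup.
        self.owner = {0: "root"}
        self.free = list(range(1, num_closids))

    def mkdir(self, name: str) -> int:
        """Hand out the lowest free CLOSid, or fail if none is left."""
        if not self.free:
            raise OSError(f"cannot create {name!r}: no free CLOSid")
        closid = self.free.pop(0)
        self.owner[closid] = name
        return closid

    def rmdir(self, name: str) -> None:
        """Return the CLOSid held by `name` to the pool."""
        for closid, owner in list(self.owner.items()):
            if owner == name and closid != 0:
                del self.owner[closid]
                self.free.append(closid)
                self.free.sort()
                return
        raise KeyError(name)


pool = ClosidPool(num_closids=4)  # 4 is an assumption, see above
for name in ("all-tasks", "hpc", "kvm-rt", "one-too-many"):
    try:
        print(name, "-> CLOSid", pool.mkdir(name))
    except OSError as err:
        print(err)
```

   The first three directories succeed and the fourth fails, which is
   the behavior I see today.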


Thread overview: 42+ messages
2015-10-02  6:09 [PATCH V15 00/11] x86: Intel Cache Allocation Technology Support Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 01/11] x86/intel_cqm: Modify hot cpu notification handling Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 02/11] x86/intel_rapl: " Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 03/11] x86/intel_rdt: Cache Allocation documentation Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 04/11] x86/intel_rdt: Add support for Cache Allocation detection Fenghua Yu
2015-11-04 14:51   ` Luiz Capitulino
2015-10-02  6:09 ` [PATCH V15 05/11] x86/intel_rdt: Add Class of service management Fenghua Yu
2015-11-04 14:55   ` Luiz Capitulino
2015-10-02  6:09 ` [PATCH V15 06/11] x86/intel_rdt: Add L3 cache capacity bitmask management Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 07/11] x86/intel_rdt: Implement scheduling support for Intel RDT Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 08/11] x86/intel_rdt: Hot cpu support for Cache Allocation Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 09/11] x86/intel_rdt: Intel haswell Cache Allocation enumeration Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 10/11] x86,cgroup/intel_rdt : Add intel_rdt cgroup documentation Fenghua Yu
2015-10-02  6:09 ` [PATCH V15 11/11] x86,cgroup/intel_rdt : Add a cgroup interface to manage Intel cache allocation Fenghua Yu
2015-11-18 20:58   ` Marcelo Tosatti
2015-11-18 21:27   ` Marcelo Tosatti
2015-12-16 22:00     ` Yu, Fenghua
2015-11-18 22:15   ` Marcelo Tosatti
2015-12-14 22:58     ` Yu, Fenghua
2015-10-11 19:50 ` [PATCH V15 00/11] x86: Intel Cache Allocation Technology Support Thomas Gleixner
2015-10-12 18:52   ` Yu, Fenghua
2015-10-12 19:58     ` Thomas Gleixner
2015-10-13 22:40     ` Marcelo Tosatti
2015-10-15 11:37       ` Peter Zijlstra
2015-10-16  0:17         ` Marcelo Tosatti
2015-10-16  9:44           ` Peter Zijlstra
2015-10-16 20:24             ` Marcelo Tosatti
2015-10-19 23:49               ` Marcelo Tosatti
2015-10-13 21:31   ` Marcelo Tosatti
2015-10-15 11:36     ` Peter Zijlstra
2015-10-16  2:28       ` Marcelo Tosatti
2015-10-16  9:50         ` Peter Zijlstra
2015-10-26 20:02           ` Marcelo Tosatti
2015-11-02 22:20           ` cat cgroup interface proposal (non hierarchical) was " Marcelo Tosatti
2015-11-04 14:42 ` Luiz Capitulino [this message]
2015-11-04 14:57   ` Thomas Gleixner
2015-11-04 15:12     ` Luiz Capitulino
2015-11-04 15:28       ` Thomas Gleixner
2015-11-04 15:35         ` Luiz Capitulino
2015-11-04 15:50           ` Thomas Gleixner
2015-11-05  2:19 ` [PATCH 1/2] x86/intel_rdt,intel_cqm: Remove build dependency of RDT code on CQM code David Carrillo-Cisneros
2015-11-05  2:19   ` [PATCH 2/2] x86/intel_rdt: Fix bug in initialization, locks and write cbm mask David Carrillo-Cisneros
