Date: Mon, 13 Jul 2015 10:13:28 -0700 (PDT)
From: Vikas Shivappa
To: Vikas Shivappa
Cc: linux-kernel@vger.kernel.org, vikas.shivappa@intel.com, x86@kernel.org,
    hpa@zytor.com, tglx@linutronix.de, mingo@kernel.org, tj@kernel.org,
    peterz@infradead.org, Matt Fleming, "Auld, Will",
    "Williamson, Glenn P", Marcelo Tosatti, "Juvva, Kanaka D"
Subject: Re: [PATCH V12 0/9] Hot cpu handling changes to cqm, rapl and Intel Cache Allocation support
In-Reply-To: <1435789270-27010-1-git-send-email-vikas.shivappa@linux.intel.com>
References: <1435789270-27010-1-git-send-email-vikas.shivappa@linux.intel.com>

Hello Thomas,

Just a ping in case you have any feedback. I have tried to fix the
issues you pointed out in V11 and V12.

Thanks,
Vikas

On Wed, 1 Jul 2015, Vikas Shivappa wrote:

> This patch series has some changes to the hot cpu handling code in the
> existing cache monitoring and RAPL kernel code. This improves hot cpu
> notification handling by not looping through all online cpus, which
> could be expensive on large systems.
>
> The cache allocation patches (dependent on the prep patches) add a
> cgroup subsystem to support the new Cache Allocation feature found in
> future Intel Xeon processors. Cache Allocation is a sub-feature within
> the Resource Director Technology (RDT) feature.
> RDT provides support for controlling the sharing of platform resources
> such as the L3 cache.
>
> Cache Allocation Technology provides a way for software (OS/VMM) to
> restrict cache allocation to a defined 'subset' of the cache, which may
> overlap with other 'subsets'. This feature is used when allocating a
> line in the cache, i.e. when pulling new data into the cache. The
> hardware is programmed via MSRs. This patch series adds support for L3
> cache allocation.
>
> In today's processors the number of cores is continuously increasing,
> which in turn increases the number of threads or workloads that can run
> simultaneously. When multi-threaded applications run concurrently, they
> compete for shared resources, including the L3 cache. At times, this L3
> cache contention may result in inefficient space utilization. For
> example, a higher priority thread may end up with less L3 cache, or a
> cache-sensitive app may not get optimal cache occupancy, thereby
> degrading performance. The Cache Allocation patches provide a framework
> for sharing the L3 cache so that users can allocate the resource
> according to their requirements.
>
> More information about the feature can be found in the Intel SDM,
> Volume 3, section 17.15. The SDM does not use the 'RDT' term yet; that
> is planned to be changed at a later time.
>
> *All the patches will apply on tip/perf/core*.
>
> Changes in V12:
>
> - From Matt's feedback, replaced the function-scope static cpumask_t
>   tmp at multiple locations with a single static cpumask_t tmp_cpumask
>   for the whole file. This is a temporary mask used during handling of
>   hot cpu notifications in the cqm/rapl and rdt code (1/9, 2/9 and
>   8/9). Although all usage was already serialized by hot cpu locking,
>   this makes it more readable.
>
> Changes in V11: As per feedback from Thomas and discussions:
>
> - Removed cpumask_any_online_but; its usage could easily be replaced by
>   'and'ing the cpu_online mask during hot cpu notifications. Thomas
>   pointed out that the API had an issue where the tmp mask wasn't
>   thread safe. I also realized the support it intends to give does not
>   seem to match the others in cpumask.h.
> - The cqm patch which added a mutex to hot cpu notification had been
>   merged with the cqm hot plug patch without a commit log, which wasn't
>   correct. Separated them: sending just the cqm hot plug patch now and
>   will send the mutex cqm patch separately.
> - Fixed issues in the hot cpu rdt handling. Since cpu_starting was
>   replaced with cpu_online, the wrmsr now needs to actually be
>   scheduled on the target cpu - which the previous patch wasn't doing.
>   Replaced cpu_dead with cpu_down_prepare; cpu_down_failed is handled
>   the same way as cpu_online. By waiting until cpu_dead to update the
>   rdt_cpumask, we may miss some of the msr updates.
>
> Changes in V10:
>
> - Changed the hot cpu notifications we handle in cqm and cache
>   allocation to cpu_online and cpu_dead and removed the others, as the
>   cpu_*_prepare notifications also had corresponding cancel
>   notifications which we did not handle.
> - Changed the file in the rdt cgroup to l3_cache_mask to represent that
>   it is for the L3 cache.
>
> Changes as per Thomas and PeterZ feedback:
> - Fixed the cpumask declarations in cpumask.h and the rdt, cmt and rapl
>   code to be static so that they don't burden stack space when large.
> - Removed the mutex in cpu_starting notifications; replaced the locking
>   with cpu_online.
> - Changed the name from hsw_probetest to cache_alloc_hsw_probe.
> - Changed x86_rdt_max_closid to x86_cache_max_closid and
>   x86_rdt_max_cbm_len to x86_cache_max_cbm_len, as they are only
>   related to cache allocation and not to all of RDT.
>
> Changes in V9: Changes made as per Thomas feedback:
> - Added a comment where we call schedule in code only when RDT is
>   enabled.
> - Reordered the local declarations to follow convention in
>   intel_cqm_xchg_rmid.
>
> Changes in V8: Thanks to feedback from Thomas; the following changes
> are made based on his feedback:
>
> Generic changes/preparatory patches:
> - Added a new cpumask_any_online_but which returns the next core
>   sibling that is online.
> - Made changes in the Intel Cache Monitoring and Intel RAPL (Running
>   Average Power Limit) code to use the new function above to find the
>   next cpu that can be a designated reader for the package. Also
>   changed the way the package masks are computed, which can be
>   simplified using topology_core_cpumask.
>
> Cache allocation specific changes:
> - Moved the documentation to the beginning of the patch series.
> - Added more documentation for the rdt cgroup files.
> - Changed the dmesg output when cache alloc is enabled to be more
>   helpful, and updated a few other comments to be more readable.
> - Removed the __ prefix from functions like clos_get which were not
>   following convention.
> - Added code to take action on a WARN_ON in clos_put. Made a few other
>   changes to reduce code text.
> - Updated comments for the call to rdt_css_alloc and the data
>   structures to be more readable / kernel-doc format.
> - Removed cgroup_init.
> - Changed the names of functions to only have the intel_ prefix for
>   external APIs.
> - Replaced (void *)&closid with (void *)closid when calling
>   on_each_cpu_mask.
> - Fixed the reference release of the closid during the cache bitmask
>   write.
> - Changed the code to not ignore a cache mask which has bits set
>   outside of the max bits allowed. It returns an error instead.
> - Replaced bitmap_set(&max_mask, 0, max_cbm_len) with max_mask =
>   (1ULL << max_cbm) - 1.
> - Updated the rdt_cpu_mask, which has one cpu for each package, using
>   topology_core_cpumask instead of looping through the existing
>   rdt_cpu_mask.
>   Realized the topology_core_cpumask name is misleading: it actually
>   returns the cores in a cpu package!
> - Arranged the code better to keep code relating to similar tasks
>   together.
> - Improved searching for the next online cpu sibling and maintaining
>   the rdt_cpu_mask which has one cpu per package.
> - Removed the unnecessary wrapper rdt_enabled.
> - Removed an unnecessary spin lock and rcu lock in the scheduling code.
> - Merged all scheduling code into one patch, without separating out the
>   RDT common software cache code.
>
> Changes in V7: Based on feedback from PeterZ and Matt and following
> discussions:
> - Changed a lot of naming to reflect the data structures which are
>   common to RDT and those specific to cache allocation.
> - Removed all usage of 'cat'; replaced with the friendlier 'cache
>   allocation'.
> - Fixed a lot of convention issues (whitespace, return paradigm etc).
> - Changed the scheduling hook for RDT to not use an inline.
> - Removed the new scheduling hook and just reused the existing one,
>   similar to the perf hook.
>
> Changes in V6:
> - Rebased to 4.1-rc1, which has the CMT (cache monitoring) support
>   included.
> - (Thanks to Marcelo's feedback) Fixed support for hot cpu handling for
>   the IA32_L3_QOS MSRs. Although the MSRs need not be restored during
>   deep C-states, this is needed when a new package is physically added.
> - Some other coding convention changes, including renaming to
>   cache_mask and using a refcnt to track the number of cgroups using a
>   closid in the clos_cbm map.
> - 1-bit CBM support for non-HSW SKUs. HSW is an exception which needs
>   the cache bit masks to be at least 2 bits.
>
> Changes in V5:
> - Added support to propagate the cache bit mask update for each
>   package.
> - Removed the cache bit mask reference in the intel_rdt structure as
>   there was no need for it and we already maintain a separate
>   closid<->cbm mapping.
> - Made a few coding convention changes, which include adding an
>   assertion while freeing the CLOSID.
>
> Changes in V4:
> - Integrated with the latest V5 CMT patches.
> - Changed the naming of the cgroup to rdt (Resource Director
>   Technology) from cat (Cache Allocation Technology). This was done as
>   RDT is the umbrella term for platform shared resource allocation;
>   hence in future it would be easier to add resource allocation to the
>   same cgroup.
> - Naming changes also applied to a lot of other data structures/APIs.
> - Added documentation on cgroup usage for cache allocation, to address
>   a lot of questions from academia and industry regarding cache
>   allocation usage.
>
> Changes in V3:
> - Implemented a common software cache for the IA32_PQR_MSR.
> - Implemented support for HSW Cache Allocation enumeration. This does
>   not use the brand strings like the earlier version but does a probe
>   test. The probe test is done only on the HSW family of processors.
> - Made a few coding convention and name changes.
> - Check for the lock being held when ClosID manipulation happens.
>
> Changes in V2:
> - Removed HSW specific enumeration changes. Plan to include them later
>   as a separate patch.
> - Fixed the code in prep_arch_switch to be specific for x86 and removed
>   the x86 defines.
> - Fixed cbm_write to not write all 1s when a cgroup is freed.
> - Fixed one possible memory leak in init.
> - Changed some of the manual bitmap manipulation to use the predefined
>   bitmap APIs to make the code more readable.
> - Changed the name in sources from cqe to cat.
> - Changed the global cat enable flag to a static_key and disabled
>   cgroup early_init.
>
> [PATCH 1/9] x86/intel_cqm: Modify hot cpu notification handling
> [PATCH 2/9] x86/intel_rapl: Modify hot cpu notification handling for
> [PATCH 3/9] x86/intel_rdt: Cache Allocation documentation and cgroup
> [PATCH 4/9] x86/intel_rdt: Add support for Cache Allocation detection
> [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service
> [PATCH 6/9] x86/intel_rdt: Add support for cache bit mask management
> [PATCH 7/9] x86/intel_rdt: Implement scheduling support for Intel RDT
> [PATCH 8/9] x86/intel_rdt: Hot cpu support for Cache Allocation
> [PATCH 9/9] x86/intel_rdt: Intel haswell Cache Allocation enumeration
>