From: Huang Ying <ying.huang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Arjan Van De Ven <arjan@linux.intel.com>,
	Huang Ying <ying.huang@intel.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vlastimil Babka <vbabka@suse.cz>,
	David Hildenbrand <david@redhat.com>,
	Johannes Weiner <jweiner@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@suse.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Lameter <cl@linux.com>
Subject: [PATCH -V3 2/9] cacheinfo: calculate size of per-CPU data cache slice
Date: Mon, 16 Oct 2023 13:29:55 +0800
Message-ID: <20231016053002.756205-3-ying.huang@intel.com>
In-Reply-To: <20231016053002.756205-1-ying.huang@intel.com>

Calculate the size of the per-CPU data cache slice.  This can be used
to estimate the size of the data cache slice that one CPU can use
under ideal circumstances.  Both DATA caches and UNIFIED caches are
used in the calculation, so users need to account for the impact of
code cache usage.

Because the cache inclusive/non-inclusive information isn't available
now, we just use the size of the per-CPU slice of LLC to make the
result more predictable across architectures.  This may be improved
when more cache information is available in the future.
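
The calculation described above can be sketched in user space as
follows.  This is only an illustration under stated assumptions:
struct sketch_cache and sketch_slice_size() are hypothetical stand-ins,
not kernel interfaces, and the sketch returns 0 when no estimate can be
made, whereas the patch leaves per_cpu_data_slice_size unchanged in
that case.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cache type enumeration. */
enum sketch_cache_type {
	SKETCH_CACHE_INST,
	SKETCH_CACHE_DATA,
	SKETCH_CACHE_UNIFIED,
};

/* Hypothetical stand-in for struct cacheinfo; only the fields needed here. */
struct sketch_cache {
	enum sketch_cache_type type;
	unsigned int size;	/* cache size in bytes */
	unsigned int nr_shared;	/* number of CPUs sharing this cache */
};

/*
 * Return the per-CPU slice of the last level cache in bytes, or 0 if
 * the cache does not hold data or the sharing count is unknown.
 */
static unsigned int sketch_slice_size(const struct sketch_cache *llc)
{
	if (!llc)
		return 0;
	/* Only DATA and UNIFIED caches count as data cache. */
	if (llc->type != SKETCH_CACHE_DATA && llc->type != SKETCH_CACHE_UNIFIED)
		return 0;
	/* Avoid dividing by zero when the shared CPU map is empty. */
	if (!llc->nr_shared)
		return 0;
	return llc->size / llc->nr_shared;
}
```

For example, a 32 MiB UNIFIED LLC shared by 16 CPUs gives each CPU a
2 MiB slice.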

A brute-force algorithm that iterates over all online CPUs is used to
avoid allocating an extra cpumask, especially in the offline callback.
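
A rough user-space sketch of that iteration, assuming a fixed-size
array of online flags in place of the kernel's online CPU mask.
SKETCH_NR_CPUS, sketch_update_all() and the refresh callback are
hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/*
 * Visit every online CPU and refresh its slice size.  When the call is
 * made from the offline path (cpu_online == false), the CPU that is
 * going down is skipped.  Returns the number of CPUs visited so the
 * behaviour is easy to check; refresh may be NULL.
 */
static unsigned int sketch_update_all(const bool online[SKETCH_NR_CPUS],
				      bool cpu_online, unsigned int cpu,
				      void (*refresh)(unsigned int))
{
	unsigned int icpu, visited = 0;

	for (icpu = 0; icpu < SKETCH_NR_CPUS; icpu++) {
		if (!online[icpu])
			continue;
		/* The CPU going offline no longer has valid cache info. */
		if (!cpu_online && icpu == cpu)
			continue;
		if (refresh)
			refresh(icpu);
		visited++;
	}
	return visited;
}
```

No temporary mask is built: the cost is one pass over all CPUs per
hotplug event, which is acceptable because hotplug is rare.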

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
---
 drivers/base/cacheinfo.c  | 49 ++++++++++++++++++++++++++++++++++++++-
 include/linux/cacheinfo.h |  1 +
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index cbae8be1fe52..585c66fce9d9 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -898,6 +898,48 @@ static int cache_add_dev(unsigned int cpu)
 	return rc;
 }
 
+/*
+ * Calculate the size of the per-CPU data cache slice.  This can be
+ * used to estimate the size of the data cache slice that can be used
+ * by one CPU under ideal circumstances.  UNIFIED caches are counted
+ * in addition to DATA caches.  So, please consider code cache usage
+ * when using the result.
+ *
+ * Because the cache inclusive/non-inclusive information isn't
+ * available, we just use the size of the per-CPU slice of LLC to make
+ * the result more predictable across architectures.
+ */
+static void update_per_cpu_data_slice_size_cpu(unsigned int cpu)
+{
+	struct cpu_cacheinfo *ci;
+	struct cacheinfo *llc;
+	unsigned int nr_shared;
+
+	if (!last_level_cache_is_valid(cpu))
+		return;
+
+	ci = ci_cacheinfo(cpu);
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+
+	if (llc->type != CACHE_TYPE_DATA && llc->type != CACHE_TYPE_UNIFIED)
+		return;
+
+	nr_shared = cpumask_weight(&llc->shared_cpu_map);
+	if (nr_shared)
+		ci->per_cpu_data_slice_size = llc->size / nr_shared;
+}
+
+static void update_per_cpu_data_slice_size(bool cpu_online, unsigned int cpu)
+{
+	unsigned int icpu;
+
+	for_each_online_cpu(icpu) {
+		if (!cpu_online && icpu == cpu)
+			continue;
+		update_per_cpu_data_slice_size_cpu(icpu);
+	}
+}
+
 static int cacheinfo_cpu_online(unsigned int cpu)
 {
 	int rc = detect_cache_attributes(cpu);
@@ -906,7 +948,11 @@ static int cacheinfo_cpu_online(unsigned int cpu)
 		return rc;
 	rc = cache_add_dev(cpu);
 	if (rc)
-		free_cache_attributes(cpu);
+		goto err;
+	update_per_cpu_data_slice_size(true, cpu);
+	return 0;
+err:
+	free_cache_attributes(cpu);
 	return rc;
 }
 
@@ -916,6 +962,7 @@ static int cacheinfo_cpu_pre_down(unsigned int cpu)
 		cpu_cache_sysfs_exit(cpu);
 
 	free_cache_attributes(cpu);
+	update_per_cpu_data_slice_size(false, cpu);
 	return 0;
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index a5cfd44fab45..d504eb4b49ab 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -73,6 +73,7 @@ struct cacheinfo {
 
 struct cpu_cacheinfo {
 	struct cacheinfo *info_list;
+	unsigned int per_cpu_data_slice_size;
 	unsigned int num_levels;
 	unsigned int num_leaves;
 	bool cpu_map_populated;
-- 
2.39.2



