linux-kernel.vger.kernel.org archive mirror
* [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA
@ 2021-11-10  8:27 Shaopeng Tan
  2021-11-12 18:00 ` Reinette Chatre
  0 siblings, 1 reply; 4+ messages in thread
From: Shaopeng Tan @ 2021-11-10  8:27 UTC (permalink / raw)
  To: Fenghua Yu, Reinette Chatre, Shuah Khan
  Cc: linux-kernel, linux-kselftest, tan.shaopeng

From: "Tan, Shaopeng" <tan.shaopeng@jp.fujitsu.com>

When the Intel Sub-NUMA Clustering (SNC) feature is enabled,
the CMT and MBM counters may not be accurate.
In this case, skip the MBM and CMT tests.

Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
---
Hello,

According to the Intel RDT reference manual,
when the Sub-NUMA Clustering (SNC) feature is enabled, the CMT and MBM counters may not be accurate.
When running the CMT and MBM tests on an Intel processor with SNC enabled, the result is "not ok".
So, skip the CMT and MBM tests when SNC is enabled.

Thanks,

 tools/testing/selftests/resctrl/resctrl.h       |  1 +
 tools/testing/selftests/resctrl/resctrl_tests.c | 51 +++++++++++++++++++++++++
 tools/testing/selftests/resctrl/resctrlfs.c     | 26 +++++++++++++
 3 files changed, 78 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 1ad10c4..8e82ce3 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -85,6 +85,7 @@ struct resctrl_val_param {
 int validate_bw_report_request(char *bw_report);
 bool validate_resctrl_feature_request(const char *resctrl_val);
 char *fgrep(FILE *inf, const char *str);
+char *fgrep_last_match_line(FILE *inf, const char *str);
 int taskset_benchmark(pid_t bm_pid, int cpu_no);
 void run_benchmark(int signum, siginfo_t *info, void *ucontext);
 int write_schemata(char *ctrlgrp, char *schemata, int cpu_no,
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 973f09a..122aab6 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -8,12 +8,15 @@
  *    Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
  *    Fenghua Yu <fenghua.yu@intel.com>
  */
+#include <numa.h>
+#include <string.h>
 #include "resctrl.h"
 
 #define BENCHMARK_ARGS		64
 #define BENCHMARK_ARG_SIZE	64
 
 bool is_amd;
+bool sub_numa_cluster_enable;
 
 void detect_amd(void)
 {
@@ -34,6 +37,35 @@ void detect_amd(void)
 	fclose(inf);
 }
 
+void check_sub_numa_cluster(void)
+{
+	FILE *inf = fopen("/proc/cpuinfo", "r");
+	char *res, *s;
+	int socket_num = 0;
+	int numa_nodes = 0;
+
+	if (!inf)
+		return;
+
+	res = fgrep_last_match_line(inf, "physical id");
+
+	if (res) {
+		s = strpbrk(res, "1234567890");
+		socket_num = s ? atoi(s) + 1 : 0;
+		free(res);
+	}
+	fclose(inf);
+
+	numa_nodes = numa_max_node() + 1;
+
+	/*
+	 * when the Sub-NUMA Clustering(SNC) feature is enabled,
+	 * the number of numa nodes is twice the number of sockets.
+	 */
+	if (numa_nodes == (2 * socket_num))
+		sub_numa_cluster_enable = true;
+}
+
 static void cmd_help(void)
 {
 	printf("usage: resctrl_tests [-h] [-b \"benchmark_cmd [options]\"] [-t test list] [-n no_of_bits]\n");
@@ -61,6 +93,13 @@ static void run_mbm_test(bool has_ben, char **benchmark_cmd, int span,
 
 	ksft_print_msg("Starting MBM BW change ...\n");
 
+	/* when the Sub-NUMA Clustering(SNC) feature is enabled,
+	 * the CMT and MBM counters may not be accurate
+	 */
+	if (sub_numa_cluster_enable) {
+		ksft_test_result_skip("Sub-NUMA Clustering(SNC) feature is enabled, the MBM counters may not be accurate.\n");
+		return;
+	}
 	if (!validate_resctrl_feature_request(MBM_STR)) {
 		ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
 		return;
@@ -97,6 +136,14 @@ static void run_cmt_test(bool has_ben, char **benchmark_cmd, int cpu_no)
 	int res;
 
 	ksft_print_msg("Starting CMT test ...\n");
+
+	/* when the Sub-NUMA Clustering(SNC) feature is enabled,
+	 * the CMT and MBM counters may not be accurate
+	 */
+	if (sub_numa_cluster_enable) {
+		ksft_test_result_skip("Sub-NUMA Clustering(SNC) feature is enabled, the CMT counters may not be accurate.\n");
+		return;
+	}
 	if (!validate_resctrl_feature_request(CMT_STR)) {
 		ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
 		return;
@@ -210,6 +257,10 @@ int main(int argc, char **argv)
 	/* Detect AMD vendor */
 	detect_amd();
 
+	/* Check whether Sub-NUMA Clustering is enabled */
+	if (!is_amd)
+		check_sub_numa_cluster();
+
 	if (has_ben) {
 		/* Extract benchmark command from command line. */
 		for (i = ben_ind; i < argc; i++) {
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 5f5a166..1908ecb 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -606,6 +606,32 @@ char *fgrep(FILE *inf, const char *str)
 }
 
 /*
+ * Find the last line that starts with @str.
+ * Return a pointer to a copy of the matched line (the caller frees it),
+ * or NULL if no line matches.
+ */
+char *fgrep_last_match_line(FILE *inf, const char *str)
+{
+	char line[256];
+	char result_line[256] = "";
+	int slen = strlen(str);
+
+	while (!feof(inf)) {
+		if (!fgets(line, 256, inf))
+			break;
+		if (strncmp(line, str, slen))
+			continue;
+
+		strcpy(result_line, line);
+	}
+
+	if (strlen(result_line) >= slen)
+		return strdup(result_line);
+
+	return NULL;
+}
+
+/*
  * validate_resctrl_feature_request - Check if requested feature is valid.
  * @resctrl_val:	Requested feature
  *
-- 
1.8.3.1
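
The detection heuristic in check_sub_numa_cluster() can be separated from
file I/O, which makes it easier to reason about. The following is a minimal
sketch and not part of the patch: count_sockets() and snc_suspected() are
hypothetical names, and the sketch takes the highest "physical id" value
rather than the last matching line. Like the patch, it assumes that SNC
splits each socket into exactly two NUMA nodes.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the number of sockets implied by the
 * highest "physical id" value in a /proc/cpuinfo-style text buffer,
 * or 0 if no such line exists.
 */
int count_sockets(const char *cpuinfo)
{
	const char *key = "physical id";
	size_t klen = strlen(key);
	const char *line = cpuinfo;
	int max_id = -1;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len > klen && !strncmp(line, key, klen)) {
			/* Only look for the separator inside this line. */
			const char *colon = memchr(line, ':', len);

			if (colon) {
				int id = atoi(colon + 1);

				if (id > max_id)
					max_id = id;
			}
		}
		line = eol ? eol + 1 : NULL;
	}
	return max_id + 1;
}

/*
 * Hypothetical helper: same comparison as the patch. Assumes SNC
 * doubles the NUMA node count; other layouts are not detected.
 */
bool snc_suspected(int numa_nodes, int sockets)
{
	return sockets > 0 && numa_nodes == 2 * sockets;
}
```

In the selftest, the node count would still come from numa_max_node() + 1,
as in the patch. A system that legitimately has two NUMA nodes per socket
for other reasons would also match, so the result is only a hint.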



* Re: [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA
  2021-11-10  8:27 [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA Shaopeng Tan
@ 2021-11-12 18:00 ` Reinette Chatre
  2021-11-15  7:11   ` tan.shaopeng
  0 siblings, 1 reply; 4+ messages in thread
From: Reinette Chatre @ 2021-11-12 18:00 UTC (permalink / raw)
  To: Shaopeng Tan, Fenghua Yu, Shuah Khan; +Cc: linux-kernel, linux-kselftest

Hi,

On 11/10/2021 12:27 AM, Shaopeng Tan wrote:
> From: "Tan, Shaopeng" <tan.shaopeng@jp.fujitsu.com>
> 
> When the Intel Sub-NUMA Clustering(SNC) feature is enabled,
> the CMT and MBM counters may not be accurate.
> In this case, skip MBM&CMT tests.
> 
> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> ---
> Hello,
> 
> According to the Intel RDT reference Manual,
> when the sub-numa clustering feature is enabled, the CMT and MBM counters may not be accurate.
> When running CMT tests and MBM tests on Intel processor, the result is "not ok".
> So, fix it to skip the CMT & MBM test When the Intel Sub-NUMA Clustering(SNC) feature is enabled.
> 

It is not clear to me which exact document you refer to, but I did find an
RDT reference manual at the link below that describes the problem you
mention:
https://www.intel.com/content/dam/develop/external/us/en/documents/180115-intel-rdtcascadelake-serverreferencemanual-806717.pdf

What is not mentioned in your description is that this is a hardware 
errata so the test is expected to fail on these systems and I find that 
disabling the test for all systems based on this hardware errata is too 
drastic.

Reinette





* RE: [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA
  2021-11-12 18:00 ` Reinette Chatre
@ 2021-11-15  7:11   ` tan.shaopeng
  2021-11-17 16:46     ` Reinette Chatre
  0 siblings, 1 reply; 4+ messages in thread
From: tan.shaopeng @ 2021-11-15  7:11 UTC (permalink / raw)
  To: 'Reinette Chatre', Fenghua Yu, Shuah Khan
  Cc: linux-kernel, linux-kselftest

Hi Reinette,

> On 11/10/2021 12:27 AM, Shaopeng Tan wrote:
> > From: "Tan, Shaopeng" <tan.shaopeng@jp.fujitsu.com>
> >
> > When the Intel Sub-NUMA Clustering(SNC) feature is enabled,
> > the CMT and MBM counters may not be accurate.
> > In this case, skip MBM&CMT tests.
> >
> > Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> > ---
> > Hello,
> >
> > According to the Intel RDT reference Manual,
> > when the sub-numa clustering feature is enabled, the CMT and MBM
> > counters may not be accurate.
> > When running CMT tests and MBM tests on Intel processor, the result is
> > "not ok".
> > So, fix it to skip the CMT & MBM test When the Intel Sub-NUMA
> > Clustering(SNC) feature is enabled.
> >
> 
> It is not clear to me which exact document you refer to but I did find a
> RDT reference manual at the link below that describes the problem you
> mention:
> https://www.intel.com/content/dam/develop/external/us/en/documents/180115-intel-rdtcascadelake-serverreferencemanual-806717.pdf

Yes, I referred to this manual.

> What is not mentioned in your description is that this is a hardware
> errata so the test is expected to fail on these systems and I find that
> disabling the test for all systems based on this hardware errata is too
> drastic.

Understood. It is not reasonable to disable the test for all systems
based on this hardware errata.
When I ran resctrl_tests on an Intel(R) Xeon(R) Gold 6254 CPU,
the CMT and MBM results were "not ok", and it took me some time to debug.
So that other people can run the tests smoothly, I'd like to update the
patch to skip the tests only on 2nd Generation Intel Xeon Scalable processors.

Regards, 
Shaopeng Tan
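
A narrower check along these lines could compare the CPU signature instead
of skipping on every SNC system. The sketch below is illustrative only and
was not posted in this thread: is_cascade_lake() is a hypothetical name, and
the family, model, and stepping values are assumptions that should be
verified against Intel documentation before use. The three inputs would be
read from the "cpu family", "model", and "stepping" fields of /proc/cpuinfo.

```c
#include <stdbool.h>

/* Assumed CPUID signature values; verify before relying on them. */
#define ASSUMED_FAMILY		6
#define ASSUMED_MODEL		0x55	/* model number shared by several server generations */
#define ASSUMED_STEPPING_MIN	5
#define ASSUMED_STEPPING_MAX	7

/*
 * Hypothetical helper: true if the signature falls in the range assumed
 * here for 2nd Generation Intel Xeon Scalable processors.
 */
bool is_cascade_lake(int family, int model, int stepping)
{
	return family == ASSUMED_FAMILY &&
	       model == ASSUMED_MODEL &&
	       stepping >= ASSUMED_STEPPING_MIN &&
	       stepping <= ASSUMED_STEPPING_MAX;
}
```

Because the model number alone does not identify the generation, the
stepping range carries the distinction, which is why it is part of the
check.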


* Re: [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA
  2021-11-15  7:11   ` tan.shaopeng
@ 2021-11-17 16:46     ` Reinette Chatre
  0 siblings, 0 replies; 4+ messages in thread
From: Reinette Chatre @ 2021-11-17 16:46 UTC (permalink / raw)
  To: tan.shaopeng, Fenghua Yu, Shuah Khan; +Cc: linux-kernel, linux-kselftest

Hi Shaopeng Tan,

On 11/14/2021 11:11 PM, tan.shaopeng@fujitsu.com wrote:
> Hi Reinette,
> 
>> On 11/10/2021 12:27 AM, Shaopeng Tan wrote:
>>> From: "Tan, Shaopeng" <tan.shaopeng@jp.fujitsu.com>
>>>
>>> When the Intel Sub-NUMA Clustering(SNC) feature is enabled,
>>> the CMT and MBM counters may not be accurate.
>>> In this case, skip MBM&CMT tests.
>>>
>>> Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>>> ---
>>> Hello,
>>>
>>> According to the Intel RDT reference Manual,
>>> when the sub-numa clustering feature is enabled, the CMT and MBM
>>> counters may not be accurate.
>>> When running CMT tests and MBM tests on Intel processor, the result is
>>> "not ok".
>>> So, fix it to skip the CMT & MBM test When the Intel Sub-NUMA
>>> Clustering(SNC) feature is enabled.
>>>
>>
>> It is not clear to me which exact document you refer to but I did find a
>> RDT reference manual at the link below that describes the problem you
>> mention:
>> https://www.intel.com/content/dam/develop/external/us/en/documents/180115-intel-rdtcascadelake-serverreferencemanual-806717.pdf
> 
> Yes, I referred this manual.
> 
>> What is not mentioned in your description is that this is a hardware
>> errata so the test is expected to fail on these systems and I find that
>> disabling the test for all systems based on this hardware errata is too
>> drastic.
> 
> Understood. It is not reasonable to disable the test for all systems
> based on this hardware errata.
> When I run restrl_test on Intel(R) Xeon(R) Gold 6254 CPU,
> the result of CMT & MBM is "not ok", and I took some time to debug it.
> In order to other people can do the test smoothly, I'd like to update the
> patch to disable the test only on 2nd Generation Intel Xeon scalable processors.

I've been thinking about this some more and I do not think that the test 
should be disabled. There is a clear incompatibility between SNC and RDT 
on these systems and I do not think the test should hide that, indeed it 
is helpful to highlight that there is an issue. Even so, spending time 
to debug a known issue is not a good use of time. Instead of skipping 
the test on these systems, could the test perhaps be improved to provide 
more information on failure to help the user decide if they really need SNC 
enabled? The test would show that RDT cannot be used on their system 
with the SNC configuration; hiding that information by skipping the test 
may create the false idea that RDT is working with that configuration.

Reinette
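
One way to read the suggestion above is to keep the test running and attach
a hint to the failure when SNC is suspected. The following is a minimal
sketch under that assumption and was not posted in this thread:
format_snc_hint() is a hypothetical helper, and the message wording is
illustrative. In the selftest itself the text would more likely be emitted
with ksft_print_msg() next to the existing "not ok" result.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a diagnostic hint for a failed test into
 * buf. Returns the snprintf() result, or 0 with an empty buf when SNC
 * is not suspected and no hint is needed.
 */
int format_snc_hint(char *buf, size_t len, const char *test, bool snc_suspected)
{
	if (!snc_suspected) {
		if (len)
			buf[0] = '\0';
		return 0;
	}

	return snprintf(buf, len,
			"%s test failed and Sub-NUMA Clustering (SNC) appears to be enabled. "
			"CMT and MBM counters may not be accurate in this configuration; "
			"consider re-running with SNC disabled.\n",
			test);
}
```

The test result stays "not ok", so the incompatibility remains visible,
while the extra line saves the user from debugging a known issue.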


