* [PATCH 0/5] selftests/resctrl: Fixes to failing tests
@ 2023-09-11 11:19 Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
` (4 more replies)
0 siblings, 5 replies; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
Fix three issues with resctrl selftests.
The signal handling fix became necessary after the mount/umount fixes.
The other two came up when I ran resctrl selftests across the server
fleet in our lab to validate the upcoming CAT test rewrite (the rewrite
is not part of this series).
These are developed and should apply cleanly at least on top the
benchmark cleanup series (might apply cleanly also w/o the benchmark
series, I didn't test).
Ilpo Järvinen (5):
selftests/resctrl: Extend signal handler coverage to unmount on
receiving signal
selftests/resctrl: Remove duplicate feature check from CMT test
selftests/resctrl: Refactor feature check to use resource and feature
name
selftests/resctrl: Fix feature checks
selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
tools/testing/selftests/resctrl/cat_test.c | 8 ---
tools/testing/selftests/resctrl/cmt_test.c | 3 -
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
tools/testing/selftests/resctrl/resctrl.h | 6 +-
.../testing/selftests/resctrl/resctrl_tests.c | 37 ++++++++--
tools/testing/selftests/resctrl/resctrl_val.c | 22 +++---
tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
8 files changed, 73 insertions(+), 76 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
@ 2023-09-11 11:19 ` Ilpo Järvinen
2023-09-12 22:06 ` Reinette Chatre
2023-09-11 11:19 ` [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
` (3 subsequent siblings)
4 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
Unmounting resctrl FS has been moved into the per test functions in
resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
SIGTERM, or SIGHUP) is received, the running selftest is aborted by
ctrlc_handler() which then unmounts resctrl fs before exiting. The
current section between signal_handler_register() and
signal_handler_unregister(), however, does not cover the entire
duration when resctrl FS is mounted.
Move signal_handler_register() and signal_handler_unregister() call
into the test functions in resctrl_tests.c to properly unmount resctrl
fs. Adjust child process kill() call in ctrlc_handler() to only be
invoked if the child was already forked.
Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/cat_test.c | 8 -------
.../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
3 files changed, 34 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 97b87285ab2a..224ba8544d8a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
strcpy(param.filename, RESULT_FILE_NAME1);
param.num_of_runs = 0;
param.cpu_no = sibling_cpu_no;
- } else {
- ret = signal_handler_register();
- if (ret) {
- kill(bm_pid, SIGKILL);
- goto out;
- }
}
remove(param.filename);
@@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
}
close(pipefd[0]);
kill(bm_pid, SIGKILL);
- signal_handler_unregister();
}
-out:
cat_test_cleanup();
return ret;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 823672a20a43..3d66fbdc2df3 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
ksft_print_msg("Starting MBM BW change ...\n");
+ res = signal_handler_register();
+ if (res)
+ return;
+
res = mount_resctrlfs();
if (res) {
+ signal_handler_unregister();
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
return;
}
@@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
umount:
umount_resctrlfs();
+ signal_handler_unregister();
}
static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
@@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
ksft_print_msg("Starting MBA Schemata change ...\n");
+ res = signal_handler_register();
+ if (res)
+ return;
+
res = mount_resctrlfs();
if (res) {
+ signal_handler_unregister();
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
return;
}
@@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
umount:
umount_resctrlfs();
+ signal_handler_unregister();
}
static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
@@ -123,8 +135,13 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
ksft_print_msg("Starting CMT test ...\n");
+ res = signal_handler_register();
+ if (res)
+ return;
+
res = mount_resctrlfs();
if (res) {
+ signal_handler_unregister();
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
return;
}
@@ -141,6 +158,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
umount:
umount_resctrlfs();
+ signal_handler_unregister();
}
static void run_cat_test(int cpu_no, int no_of_bits)
@@ -149,8 +167,13 @@ static void run_cat_test(int cpu_no, int no_of_bits)
ksft_print_msg("Starting CAT test ...\n");
+ res = signal_handler_register();
+ if (res)
+ return;
+
res = mount_resctrlfs();
if (res) {
+ signal_handler_unregister();
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
return;
}
@@ -165,6 +188,7 @@ static void run_cat_test(int cpu_no, int no_of_bits)
umount:
umount_resctrlfs();
+ signal_handler_unregister();
}
int main(int argc, char **argv)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 51963a6f2186..a9fe61133119 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
{
- kill(bm_pid, SIGKILL);
+ /* Only kill child after bm_pid is set after fork() */
+ if (bm_pid)
+ kill(bm_pid, SIGKILL);
umount_resctrlfs();
tests_cleanup();
ksft_print_msg("Ending\n\n");
@@ -485,6 +487,8 @@ int signal_handler_register(void)
struct sigaction sigact;
int ret = 0;
+ bm_pid = 0;
+
sigact.sa_sigaction = ctrlc_handler;
sigemptyset(&sigact.sa_mask);
sigact.sa_flags = SA_SIGINFO;
@@ -706,10 +710,6 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
ksft_print_msg("Benchmark PID: %d\n", bm_pid);
- ret = signal_handler_register();
- if (ret)
- goto out;
-
/*
* The cast removes constness but nothing mutates benchmark_cmd within
* the context of this process. At the receiving process, it becomes
@@ -721,19 +721,19 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
/* Taskset benchmark to specified cpu */
ret = taskset_benchmark(bm_pid, param->cpu_no);
if (ret)
- goto unregister;
+ goto out;
/* Write benchmark to specified control&monitoring grp in resctrl FS */
ret = write_bm_pid_to_resctrl(bm_pid, param->ctrlgrp, param->mongrp,
resctrl_val);
if (ret)
- goto unregister;
+ goto out;
if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
ret = initialize_mem_bw_imc();
if (ret)
- goto unregister;
+ goto out;
initialize_mem_bw_resctrl(param->ctrlgrp, param->mongrp,
param->cpu_no, resctrl_val);
@@ -748,7 +748,7 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
sizeof(pipe_message)) {
perror("# failed reading message from child process");
close(pipefd[0]);
- goto unregister;
+ goto out;
}
}
close(pipefd[0]);
@@ -757,7 +757,7 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
if (sigqueue(bm_pid, SIGUSR1, value) == -1) {
perror("# sigqueue SIGUSR1 to child");
ret = errno;
- goto unregister;
+ goto out;
}
/* Give benchmark enough time to fully run */
@@ -786,8 +786,6 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
}
}
-unregister:
- signal_handler_unregister();
out:
kill(bm_pid, SIGKILL);
--
2.30.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
@ 2023-09-11 11:19 ` Ilpo Järvinen
2023-09-12 22:06 ` Reinette Chatre
2023-09-11 11:19 ` [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
` (2 subsequent siblings)
4 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
The test runner run_cmt_test() in resctrl_tests.c checks for CMT
feature and does not run cmt_resctrl_val() if CMT is not supported.
Then cmt_resctrl_val() also check is CMT is supported.
Remove the duplicated feature check for CMT from cmt_resctrl_val().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/cmt_test.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index cf2f5e92dea6..50bdbce9fba9 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -80,9 +80,6 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
size_t span;
int ret, i;
- if (!validate_resctrl_feature_request(CMT_STR))
- return -1;
-
ret = get_cbm_mask("L3", cbm_mask);
if (ret)
return ret;
--
2.30.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
@ 2023-09-11 11:19 ` Ilpo Järvinen
2023-09-12 22:09 ` Reinette Chatre
2023-09-11 11:19 ` [PATCH 4/5] selftests/resctrl: Fix feature checks Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
4 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
Feature check in validate_resctrl_feature_request() takes in the test
name string and maps that to what to check per test.
Pass resource and feature names to validate_resctrl_feature_request()
directly rather than deriving them from the test name inside the
function which makes the feature check easier to extend for new test
cases.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/resctrl.h | 6 +-
.../testing/selftests/resctrl/resctrl_tests.c | 10 +--
tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
3 files changed, 34 insertions(+), 51 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index dd07463cdf48..89ced4152933 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -28,10 +28,6 @@
#define RESCTRL_PATH "/sys/fs/resctrl"
#define PHYS_ID_PATH "/sys/devices/system/cpu/cpu"
#define INFO_PATH "/sys/fs/resctrl/info"
-#define L3_PATH "/sys/fs/resctrl/info/L3"
-#define MB_PATH "/sys/fs/resctrl/info/MB"
-#define L3_MON_PATH "/sys/fs/resctrl/info/L3_MON"
-#define L3_MON_FEATURES_PATH "/sys/fs/resctrl/info/L3_MON/mon_features"
#define ARCH_INTEL 1
#define ARCH_AMD 2
@@ -88,7 +84,7 @@ int get_resource_id(int cpu_no, int *resource_id);
int mount_resctrlfs(void);
int umount_resctrlfs(void);
int validate_bw_report_request(char *bw_report);
-bool validate_resctrl_feature_request(const char *resctrl_val);
+bool validate_resctrl_feature_request(const char *resource, const char *feature);
char *fgrep(FILE *inf, const char *str);
int taskset_benchmark(pid_t bm_pid, int cpu_no);
void run_benchmark(int signum, siginfo_t *info, void *ucontext);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 3d66fbdc2df3..3052394ca884 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -84,7 +84,9 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
return;
}
- if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") ||
+ !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
+ (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
goto umount;
}
@@ -116,7 +118,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
return;
}
- if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("MB", NULL) || (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
goto umount;
}
@@ -146,7 +148,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
return;
}
- if (!validate_resctrl_feature_request(CMT_STR)) {
+ if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy")) {
ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
goto umount;
}
@@ -178,7 +180,7 @@ static void run_cat_test(int cpu_no, int no_of_bits)
return;
}
- if (!validate_resctrl_feature_request(CAT_STR)) {
+ if (!validate_resctrl_feature_request("L3", NULL)) {
ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
goto umount;
}
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index bd36ee206602..bd547a10791c 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -10,6 +10,8 @@
*/
#include "resctrl.h"
+#include <limits.h>
+
static int find_resctrl_mount(char *buffer)
{
FILE *mounts;
@@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
/*
* validate_resctrl_feature_request - Check if requested feature is valid.
- * @resctrl_val: Requested feature
+ * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
+ * @feature: Feature to be checked under resource (can be NULL). This path
+ * is relative to the resource path.
*
- * Return: True if the feature is supported, else false. False is also
- * returned if resctrl FS is not mounted.
+ * Return: True if the resource/feature is supported, else false. False is
+ * also returned if resctrl FS is not mounted.
*/
-bool validate_resctrl_feature_request(const char *resctrl_val)
+bool validate_resctrl_feature_request(const char *resource, const char *feature)
{
struct stat statbuf;
- bool found = false;
+ char res_path[PATH_MAX];
char *res;
FILE *inf;
int ret;
- if (!resctrl_val)
+ if (!resource)
return false;
ret = find_resctrl_mount(NULL);
if (ret)
return false;
- if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
- if (!stat(L3_PATH, &statbuf))
- return true;
- } else if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
- if (!stat(MB_PATH, &statbuf))
- return true;
- } else if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
- !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
- if (!stat(L3_MON_PATH, &statbuf)) {
- inf = fopen(L3_MON_FEATURES_PATH, "r");
- if (!inf)
- return false;
-
- if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
- res = fgrep(inf, "llc_occupancy");
- if (res) {
- found = true;
- free(res);
- }
- }
-
- if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR))) {
- res = fgrep(inf, "mbm_total_bytes");
- if (res) {
- free(res);
- res = fgrep(inf, "mbm_local_bytes");
- if (res) {
- found = true;
- free(res);
- }
- }
- }
- fclose(inf);
- }
- }
+ snprintf(res_path, sizeof(res_path), "%s/%s", INFO_PATH, resource);
+
+ if (stat(res_path, &statbuf))
+ return false;
+
+ if (!feature)
+ return true;
+
+ snprintf(res_path, sizeof(res_path), "%s/%s/mon_features", INFO_PATH, resource);
+ inf = fopen(res_path, "r");
+ if (!inf)
+ return false;
+
+ res = fgrep(inf, feature);
+ free(res);
+ fclose(inf);
- return found;
+ return res;
}
int filter_dmesg(void)
--
2.30.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 4/5] selftests/resctrl: Fix feature checks
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (2 preceding siblings ...)
2023-09-11 11:19 ` [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
@ 2023-09-11 11:19 ` Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
4 siblings, 0 replies; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
The MBA and CMT tests expect support of other features to be able to
run.
When platform only supports MBA but not MBM, MBA test will fail with:
Failed to open total bw file: No such file or directory
When platform only supports CMT but not CAT, CMT test will fail with:
Failed to open bit mask file '/sys/fs/resctrl/info/L3/cbm_mask': No such file or directory
Extend feature checks to cover these two conditions.
Fixes: ee0415681eb6 ("selftests/resctrl: Use resctrl/info for feature detection")
Cc: <stable@vger.kernel.org> # selftests/resctrl: Refactor feature check to use resource and feature name
Cc: <stable@vger.kernel.org> # selftests/resctrl: Remove duplicate feature check from CMT test
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 3052394ca884..4e9b392d28dc 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -118,7 +118,9 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
return;
}
- if (!validate_resctrl_feature_request("MB", NULL) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("MB", NULL) ||
+ !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
+ (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
goto umount;
}
@@ -148,7 +150,8 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
return;
}
- if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy")) {
+ if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy") ||
+ !validate_resctrl_feature_request("L3", NULL)) {
ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
goto umount;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (3 preceding siblings ...)
2023-09-11 11:19 ` [PATCH 4/5] selftests/resctrl: Fix feature checks Ilpo Järvinen
@ 2023-09-11 11:19 ` Ilpo Järvinen
2023-09-12 22:10 ` Reinette Chatre
4 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-11 11:19 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable, Ilpo Järvinen
5% difference upper bound for success is a bit on the low side for the
MBA and MBM tests. Some platforms produce outliers that are slightly
above that, typically 6-7%.
Relaxing the MBA/MBM success bound to 8% removes most of the failures
due those frequent outliers.
Fixes: 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index cf8284dadcb2..d3bf4368341e 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
#define RESULT_FILE_NAME "result_mba"
#define NUM_OF_RUNS 5
-#define MAX_DIFF_PERCENT 5
+#define MAX_DIFF_PERCENT 8
#define ALLOCATION_MAX 100
#define ALLOCATION_MIN 10
#define ALLOCATION_STEP 10
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 1ae131a2e246..d3c0d30c676a 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
#include "resctrl.h"
#define RESULT_FILE_NAME "result_mbm"
-#define MAX_DIFF_PERCENT 5
+#define MAX_DIFF_PERCENT 8
#define NUM_OF_RUNS 5
static int
--
2.30.2
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-11 11:19 ` [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
@ 2023-09-12 22:06 ` Reinette Chatre
2023-09-13 10:01 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-12 22:06 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable
Hi Ilpo,
On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> Unmounting resctrl FS has been moved into the per test functions in
> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> ctrlc_handler() which then unmounts resctrl fs before exiting. The
> current section between signal_handler_register() and
> signal_handler_unregister(), however, does not cover the entire
> duration when resctrl FS is mounted.
>
> Move signal_handler_register() and signal_handler_unregister() call
> into the test functions in resctrl_tests.c to properly unmount resctrl
> fs. Adjust child process kill() call in ctrlc_handler() to only be
> invoked if the child was already forked.
Thank you for catching this.
>
> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 8 -------
> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
> 3 files changed, 34 insertions(+), 20 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 97b87285ab2a..224ba8544d8a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> strcpy(param.filename, RESULT_FILE_NAME1);
> param.num_of_runs = 0;
> param.cpu_no = sibling_cpu_no;
> - } else {
> - ret = signal_handler_register();
> - if (ret) {
> - kill(bm_pid, SIGKILL);
> - goto out;
> - }
> }
>
> remove(param.filename);
> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> }
> close(pipefd[0]);
> kill(bm_pid, SIGKILL);
> - signal_handler_unregister();
> }
>
> -out:
> cat_test_cleanup();
>
> return ret;
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index 823672a20a43..3d66fbdc2df3 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>
> ksft_print_msg("Starting MBM BW change ...\n");
>
> + res = signal_handler_register();
> + if (res)
> + return;
> +
> res = mount_resctrlfs();
> if (res) {
> + signal_handler_unregister();
> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> return;
> }
> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>
> umount:
> umount_resctrlfs();
> + signal_handler_unregister();
> }
>
> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>
> ksft_print_msg("Starting MBA Schemata change ...\n");
>
> + res = signal_handler_register();
> + if (res)
> + return;
> +
> res = mount_resctrlfs();
> if (res) {
> + signal_handler_unregister();
> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> return;
> }
> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>
> umount:
> umount_resctrlfs();
> + signal_handler_unregister();
> }
>
This adds more duplicated code for every test. Have you considered a
single test setup function that can be used to mount resctrl FS and setup
the signal handler paired with a single test teardown function?
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-11 11:19 ` [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
@ 2023-09-12 22:06 ` Reinette Chatre
2023-09-13 11:11 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-12 22:06 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable
Hi Ilpo,
On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> The test runner run_cmt_test() in resctrl_tests.c checks for CMT
> feature and does not run cmt_resctrl_val() if CMT is not supported.
> Then cmt_resctrl_val() also check is CMT is supported.
>
> Remove the duplicated feature check for CMT from cmt_resctrl_val().
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
This does not look like stable material to me.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-11 11:19 ` [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
@ 2023-09-12 22:09 ` Reinette Chatre
2023-09-13 11:02 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-12 22:09 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable
Hi Ilpo,
On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> Feature check in validate_resctrl_feature_request() takes in the test
> name string and maps that to what to check per test.
>
> Pass resource and feature names to validate_resctrl_feature_request()
> directly rather than deriving them from the test name inside the
> function which makes the feature check easier to extend for new test
> cases.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
This does not seem to be stable material.
> ---
> tools/testing/selftests/resctrl/resctrl.h | 6 +-
> .../testing/selftests/resctrl/resctrl_tests.c | 10 +--
> tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
> 3 files changed, 34 insertions(+), 51 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index dd07463cdf48..89ced4152933 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -28,10 +28,6 @@
> #define RESCTRL_PATH "/sys/fs/resctrl"
> #define PHYS_ID_PATH "/sys/devices/system/cpu/cpu"
> #define INFO_PATH "/sys/fs/resctrl/info"
> -#define L3_PATH "/sys/fs/resctrl/info/L3"
> -#define MB_PATH "/sys/fs/resctrl/info/MB"
> -#define L3_MON_PATH "/sys/fs/resctrl/info/L3_MON"
> -#define L3_MON_FEATURES_PATH "/sys/fs/resctrl/info/L3_MON/mon_features"
>
> #define ARCH_INTEL 1
> #define ARCH_AMD 2
> @@ -88,7 +84,7 @@ int get_resource_id(int cpu_no, int *resource_id);
> int mount_resctrlfs(void);
> int umount_resctrlfs(void);
> int validate_bw_report_request(char *bw_report);
> -bool validate_resctrl_feature_request(const char *resctrl_val);
> +bool validate_resctrl_feature_request(const char *resource, const char *feature);
> char *fgrep(FILE *inf, const char *str);
> int taskset_benchmark(pid_t bm_pid, int cpu_no);
> void run_benchmark(int signum, siginfo_t *info, void *ucontext);
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index 3d66fbdc2df3..3052394ca884 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -84,7 +84,9 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> return;
> }
>
> - if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
> + if (!validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") ||
> + !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
> + (get_vendor() != ARCH_INTEL)) {
> ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
> goto umount;
> }
> @@ -116,7 +118,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> return;
> }
>
> - if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
> + if (!validate_resctrl_feature_request("MB", NULL) || (get_vendor() != ARCH_INTEL)) {
> ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
> goto umount;
> }
> @@ -146,7 +148,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> return;
> }
>
> - if (!validate_resctrl_feature_request(CMT_STR)) {
> + if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy")) {
> ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
> goto umount;
> }
> @@ -178,7 +180,7 @@ static void run_cat_test(int cpu_no, int no_of_bits)
> return;
> }
>
> - if (!validate_resctrl_feature_request(CAT_STR)) {
> + if (!validate_resctrl_feature_request("L3", NULL)) {
> ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
> goto umount;
> }
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> index bd36ee206602..bd547a10791c 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -10,6 +10,8 @@
> */
> #include "resctrl.h"
>
> +#include <limits.h>
> +
Could you please include <limits.h> before the local resctrl.h?
> static int find_resctrl_mount(char *buffer)
> {
> FILE *mounts;
> @@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
>
> /*
> * validate_resctrl_feature_request - Check if requested feature is valid.
> - * @resctrl_val: Requested feature
> + * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
> + * @feature: Feature to be checked under resource (can be NULL). This path
> + * is relative to the resource path.
I do not think "this path" is accurate. @feature is not a path but an entry
within the mon_features file.
Also please note that mon_features only exists for L3_MON, none of the other
listed resources have an associated mon_features file in resctrl. This
function is created to be generic has specific requirements on what
valid (never checked) parameters should be. This may be ok with the usage
but it should not pretend to be generic.
> *
> - * Return: True if the feature is supported, else false. False is also
> - * returned if resctrl FS is not mounted.
> + * Return: True if the resource/feature is supported, else false. False is
> + * also returned if resctrl FS is not mounted.
> */
> -bool validate_resctrl_feature_request(const char *resctrl_val)
> +bool validate_resctrl_feature_request(const char *resource, const char *feature)
> {
> struct stat statbuf;
> - bool found = false;
> + char res_path[PATH_MAX];
Please maintain reverse fir order.
> char *res;
> FILE *inf;
> int ret;
>
> - if (!resctrl_val)
> + if (!resource)
> return false;
>
> ret = find_resctrl_mount(NULL);
> if (ret)
> return false;
>
> - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
> - if (!stat(L3_PATH, &statbuf))
> - return true;
> - } else if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
> - if (!stat(MB_PATH, &statbuf))
> - return true;
> - } else if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
> - !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
> - if (!stat(L3_MON_PATH, &statbuf)) {
> - inf = fopen(L3_MON_FEATURES_PATH, "r");
> - if (!inf)
> - return false;
> -
> - if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
> - res = fgrep(inf, "llc_occupancy");
> - if (res) {
> - found = true;
> - free(res);
> - }
> - }
> -
> - if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR))) {
> - res = fgrep(inf, "mbm_total_bytes");
> - if (res) {
> - free(res);
> - res = fgrep(inf, "mbm_local_bytes");
> - if (res) {
> - found = true;
> - free(res);
> - }
> - }
> - }
> - fclose(inf);
> - }
> - }
> + snprintf(res_path, sizeof(res_path), "%s/%s", INFO_PATH, resource);
> +
> + if (stat(res_path, &statbuf))
> + return false;
> +
> + if (!feature)
> + return true;
> +
> + snprintf(res_path, sizeof(res_path), "%s/%s/mon_features", INFO_PATH, resource);
> + inf = fopen(res_path, "r");
> + if (!inf)
> + return false;
> +
> + res = fgrep(inf, feature);
> + free(res);
> + fclose(inf);
>
> - return found;
> + return res;
This is unexpected. Function should return bool but instead returns a char * that
has been freed?
> }
>
> int filter_dmesg(void)
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-11 11:19 ` [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
@ 2023-09-12 22:10 ` Reinette Chatre
2023-09-13 11:43 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-12 22:10 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: linux-kernel, Shaopeng Tan, stable
Hi Ilpo,
On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> 5% difference upper bound for success is a bit on the low side for the
"a bit on the low side" is very vague.
> MBA and MBM tests. Some platforms produce outliers that are slightly
> above that, typically 6-7%.
>
> Relaxing the MBA/MBM success bound to 8% removes most of the failures
> due those frequent outliers.
This description needs more context on what issue is being solved here.
What does the % difference represent? How was new percentage determined?
Did you investigate why there are differences between platforms? From
what I understand these tests measure memory bandwidth using perf and
resctrl and then compare the difference. Are there interesting things
about the platforms on which the difference is higher than 5%? Could
those be systems with multiple sockets (and thus multiple PMUs that need
to be setup, reset, and read)? Can the reading of the counters be improved
instead of relaxing the success criteria? A quick comparison between
get_mem_bw_imc() and get_mem_bw_resctrl() makes me think that a difference
is not surprising ... note how the PMU counters are started and reset
(potentially on multiple sockets) at every iteration while the resctrl
counters keep rolling and new values are just subtracted from previous.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-12 22:06 ` Reinette Chatre
@ 2023-09-13 10:01 ` Ilpo Järvinen
2023-09-13 20:58 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-13 10:01 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 4186 bytes --]
On Tue, 12 Sep 2023, Reinette Chatre wrote:
> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> > Unmounting resctrl FS has been moved into the per test functions in
> > resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> > resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> > SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> > ctrlc_handler() which then unmounts resctrl fs before exiting. The
> > current section between signal_handler_register() and
> > signal_handler_unregister(), however, does not cover the entire
> > duration when resctrl FS is mounted.
> >
> > Move signal_handler_register() and signal_handler_unregister() call
> > into the test functions in resctrl_tests.c to properly unmount resctrl
> > fs. Adjust child process kill() call in ctrlc_handler() to only be
> > invoked if the child was already forked.
>
> Thank you for catching this.
>
> >
> > Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: <stable@vger.kernel.org>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 8 -------
> > .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
> > tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
> > 3 files changed, 34 insertions(+), 20 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 97b87285ab2a..224ba8544d8a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > strcpy(param.filename, RESULT_FILE_NAME1);
> > param.num_of_runs = 0;
> > param.cpu_no = sibling_cpu_no;
> > - } else {
> > - ret = signal_handler_register();
> > - if (ret) {
> > - kill(bm_pid, SIGKILL);
> > - goto out;
> > - }
> > }
> >
> > remove(param.filename);
> > @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > }
> > close(pipefd[0]);
> > kill(bm_pid, SIGKILL);
> > - signal_handler_unregister();
> > }
> >
> > -out:
> > cat_test_cleanup();
> >
> > return ret;
> > diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> > index 823672a20a43..3d66fbdc2df3 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> > @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting MBM BW change ...\n");
> >
> > + res = signal_handler_register();
> > + if (res)
> > + return;
> > +
> > res = mount_resctrlfs();
> > if (res) {
> > + signal_handler_unregister();
> > ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > return;
> > }
> > @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > umount:
> > umount_resctrlfs();
> > + signal_handler_unregister();
> > }
> >
> > static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> > @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting MBA Schemata change ...\n");
> >
> > + res = signal_handler_register();
> > + if (res)
> > + return;
> > +
> > res = mount_resctrlfs();
> > if (res) {
> > + signal_handler_unregister();
> > ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > return;
> > }
> > @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > umount:
> > umount_resctrlfs();
> > + signal_handler_unregister();
> > }
> >
>
> This adds more duplicated code for every test. Have you considered a
> single test setup function that can be used to mount resctrl FS and setup
> the signal handler paired with a single test teardown function?
Yes. Consolidating all these is among my not-yet submitted patches.
I just had to do a backport-friendly Fixes patch first for this.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-12 22:09 ` Reinette Chatre
@ 2023-09-13 11:02 ` Ilpo Järvinen
2023-09-13 20:59 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-13 11:02 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 5802 bytes --]
On Tue, 12 Sep 2023, Reinette Chatre wrote:
> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> > Feature check in validate_resctrl_feature_request() takes in the test
> > name string and maps that to what to check per test.
> >
> > Pass resource and feature names to validate_resctrl_feature_request()
> > directly rather than deriving them from the test name inside the
> > function which makes the feature check easier to extend for new test
> > cases.
> >
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: <stable@vger.kernel.org>
>
> This does not seem to be stable material.
Alone it isn't, but both 2/5 and this 3/5 are prerequisites for 4/5 as
shown by the tags there.
> > ---
> > tools/testing/selftests/resctrl/resctrl.h | 6 +-
> > .../testing/selftests/resctrl/resctrl_tests.c | 10 +--
> > tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
> > 3 files changed, 34 insertions(+), 51 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> > index dd07463cdf48..89ced4152933 100644
> > --- a/tools/testing/selftests/resctrl/resctrl.h
> > +++ b/tools/testing/selftests/resctrl/resctrl.h
> > diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> > index bd36ee206602..bd547a10791c 100644
> > --- a/tools/testing/selftests/resctrl/resctrlfs.c
> > +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> > @@ -10,6 +10,8 @@
> > */
> > #include "resctrl.h"
> >
> > +#include <limits.h>
> > +
>
> Could you please include <limits.h> before the local resctrl.h?
Believe me I tried that first but it did not work. So this intentionally
in the current order as resctrl.h defines _GNU_SOURCE which is among
things that tends to alter many things. If I reorder them, the build gives
me these issues:
resctrlfs.c: In function ‘taskset_benchmark’:
resctrlfs.c:284:2: warning: implicit declaration of function ‘CPU_ZERO’;
did you mean ‘FP_ZERO’? [-Wimplicit-function-declaration]
284 | CPU_ZERO(&my_set);
| ^~~~~~~~
| FP_ZERO
resctrlfs.c:285:2: warning: implicit declaration of function ‘CPU_SET’
[-Wimplicit-function-declaration]
285 | CPU_SET(cpu_no, &my_set);
| ^~~~~~~
resctrlfs.c:287:6: warning: implicit declaration of function
‘sched_setaffinity’ [-Wimplicit-function-declaration]
287 | if (sched_setaffinity(bm_pid, sizeof(cpu_set_t), &my_set)) {
| ^~~~~~~~~~~~~~~~~
It might be useful to move _GNU_SOURCE define into Makefile though to
avoid these kind of issues (but that's not material for this patch).
> > static int find_resctrl_mount(char *buffer)
> > {
> > FILE *mounts;
> > @@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
> >
> > /*
> > * validate_resctrl_feature_request - Check if requested feature is valid.
> > - * @resctrl_val: Requested feature
> > + * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
> > + * @feature: Feature to be checked under resource (can be NULL). This path
> > + * is relative to the resource path.
>
> I do not think "this path" is accurate. @feature is not a path but an entry
> within the mon_features file.
Yes, agreed.
> Also please note that mon_features only exists for L3_MON, none of the other
> listed resources have an associated mon_features file in resctrl. This
> function is created to be generic has specific requirements on what
> valid (never checked) parameters should be. This may be ok with the usage
> but it should not pretend to be generic.
So are you recommending I split this function into two where the new one
would do the mon_features check?
> > char *res;
> > FILE *inf;
> > int ret;
> >
> > - if (!resctrl_val)
> > + if (!resource)
> > return false;
> >
> > ret = find_resctrl_mount(NULL);
> > if (ret)
> > return false;
> >
> > - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
> > - if (!stat(L3_PATH, &statbuf))
> > - return true;
> > - } else if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
> > - if (!stat(MB_PATH, &statbuf))
> > - return true;
> > - } else if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
> > - !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
> > - if (!stat(L3_MON_PATH, &statbuf)) {
> > - inf = fopen(L3_MON_FEATURES_PATH, "r");
> > - if (!inf)
> > - return false;
> > -
> > - if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
> > - res = fgrep(inf, "llc_occupancy");
> > - if (res) {
> > - found = true;
> > - free(res);
> > - }
> > - }
> > -
> > - if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR))) {
> > - res = fgrep(inf, "mbm_total_bytes");
> > - if (res) {
> > - free(res);
> > - res = fgrep(inf, "mbm_local_bytes");
> > - if (res) {
> > - found = true;
> > - free(res);
> > - }
> > - }
> > - }
> > - fclose(inf);
> > - }
> > - }
> > + snprintf(res_path, sizeof(res_path), "%s/%s", INFO_PATH, resource);
> > +
> > + if (stat(res_path, &statbuf))
> > + return false;
> > +
> > + if (!feature)
> > + return true;
> > +
> > + snprintf(res_path, sizeof(res_path), "%s/%s/mon_features", INFO_PATH, resource);
> > + inf = fopen(res_path, "r");
> > + if (!inf)
> > + return false;
> > +
> > + res = fgrep(inf, feature);
> > + free(res);
> > + fclose(inf);
> >
> > - return found;
> > + return res;
>
> This is unexpected. Function should return bool but instead returns a char * that
> has been freed?
Okay, I understand it looks confusing when relying on implicit conversion
to boolean (despite being functionally correct).
I can do this instead to make it more obvious the intention is to convert
the result to boolean and not return a pointer:
return !!res;
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-12 22:06 ` Reinette Chatre
@ 2023-09-13 11:11 ` Ilpo Järvinen
2023-09-13 20:58 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-13 11:11 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 886 bytes --]
On Tue, 12 Sep 2023, Reinette Chatre wrote:
> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> > The test runner run_cmt_test() in resctrl_tests.c checks for CMT
> > feature and does not run cmt_resctrl_val() if CMT is not supported.
> > Then cmt_resctrl_val() also check is CMT is supported.
> >
> > Remove the duplicated feature check for CMT from cmt_resctrl_val().
> >
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: <stable@vger.kernel.org>
>
> This does not look like stable material to me.
I know but when constructing this series I had 2 options:
Either convert also this when changing validate_resctrl_feature_request()
or remove this call entirely.
Given it's duplicate of the other CMT check, I chose to just remove it
(which I'd do anyway). As patch 4/5 requires 3/5 which in turn requires
this, this has to go stable if 4/5 goes too.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-12 22:10 ` Reinette Chatre
@ 2023-09-13 11:43 ` Ilpo Järvinen
2023-09-13 21:00 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-13 11:43 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 2744 bytes --]
On Tue, 12 Sep 2023, Reinette Chatre wrote:
> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> > 5% difference upper bound for success is a bit on the low side for the
>
> "a bit on the low side" is very vague.
The commit that introduced that 5% bound plainly admitted it's "randomly
chosen value". At least that wasn't vague, I guess. :-)
So what I'm trying to do here is to have "randomly chosen value" replaced
with a value that seems to work well enough based on measurements on
a large set of platforms.
Personally, I don't care much about this, I can just ignore the failures
due to outliers (and also reports about failing MBA/MBM test if somebody
ever sends one to me), but if I'd be one running automated tests it would
be annoying to have a problem like this unaddressed.
> > MBA and MBM tests. Some platforms produce outliers that are slightly
> > above that, typically 6-7%.
> >
> > Relaxing the MBA/MBM success bound to 8% removes most of the failures
> > due those frequent outliers.
>
> This description needs more context on what issue is being solved here.
> What does the % difference represent? How was new percentage determined?
>
> Did you investigate why there are differences between platforms? From
> what I understand these tests measure memory bandwidth using perf and
> resctrl and then compare the difference. Are there interesting things
> about the platforms on which the difference is higher than 5%?
Not really I think. The number just isn't that stable to always remain
below 5% (even if it usually does).
Only systematic thing I've come across is that if I play with the read
pattern for defeating the hw prefetcher (you've seen a patch earlier and
it will be among the series I'll send after this one), it has an impact
which looks more systematic across all MBM/MBA tests. But it's not what
I'm trying now address with this patch.
> Could
> those be systems with multiple sockets (and thus multiple PMUs that need
> to be setup, reset, and read)? Can the reading of the counters be improved
> instead of relaxing the success criteria? A quick comparison between
> get_mem_bw_imc() and get_mem_bw_resctrl() makes me think that a difference
> is not surprising ... note how the PMU counters are started and reset
> (potentially on multiple sockets) at every iteration while the resctrl
> counters keep rolling and new values are just subtracted from previous.
Perhaps, I can try to look into it (add to my todo list so I won't
forget). But in the meantime, this new value is picked using a criteria
that looks better than "randomly chosen value". If I ever manage to
address the outliers, the bound could be lowered again.
I'll update the changelog to explain things better.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-13 10:01 ` Ilpo Järvinen
@ 2023-09-13 20:58 ` Reinette Chatre
2023-09-14 10:16 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-13 20:58 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/13/2023 3:01 AM, Ilpo Järvinen wrote:
> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>> Unmounting resctrl FS has been moved into the per test functions in
>>> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
>>> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
>>> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
>>> ctrlc_handler() which then unmounts resctrl fs before exiting. The
>>> current section between signal_handler_register() and
>>> signal_handler_unregister(), however, does not cover the entire
>>> duration when resctrl FS is mounted.
>>>
>>> Move signal_handler_register() and signal_handler_unregister() call
>>> into the test functions in resctrl_tests.c to properly unmount resctrl
>>> fs. Adjust child process kill() call in ctrlc_handler() to only be
>>> invoked if the child was already forked.
>>
>> Thank you for catching this.
>>
>>>
>>> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> Cc: <stable@vger.kernel.org>
>>> ---
>>> tools/testing/selftests/resctrl/cat_test.c | 8 -------
>>> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
>>> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
>>> 3 files changed, 34 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 97b87285ab2a..224ba8544d8a 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>> strcpy(param.filename, RESULT_FILE_NAME1);
>>> param.num_of_runs = 0;
>>> param.cpu_no = sibling_cpu_no;
>>> - } else {
>>> - ret = signal_handler_register();
>>> - if (ret) {
>>> - kill(bm_pid, SIGKILL);
>>> - goto out;
>>> - }
>>> }
>>>
>>> remove(param.filename);
>>> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>> }
>>> close(pipefd[0]);
>>> kill(bm_pid, SIGKILL);
>>> - signal_handler_unregister();
>>> }
>>>
>>> -out:
>>> cat_test_cleanup();
>>>
>>> return ret;
>>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
>>> index 823672a20a43..3d66fbdc2df3 100644
>>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>>> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>
>>> ksft_print_msg("Starting MBM BW change ...\n");
>>>
>>> + res = signal_handler_register();
>>> + if (res)
>>> + return;
>>> +
>>> res = mount_resctrlfs();
>>> if (res) {
>>> + signal_handler_unregister();
>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>> return;
>>> }
>>> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>
>>> umount:
>>> umount_resctrlfs();
>>> + signal_handler_unregister();
>>> }
>>>
>>> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>
>>> ksft_print_msg("Starting MBA Schemata change ...\n");
>>>
>>> + res = signal_handler_register();
>>> + if (res)
>>> + return;
>>> +
>>> res = mount_resctrlfs();
>>> if (res) {
>>> + signal_handler_unregister();
>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>> return;
>>> }
>>> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>
>>> umount:
>>> umount_resctrlfs();
>>> + signal_handler_unregister();
>>> }
>>>
>>
>> This adds more duplicated code for every test. Have you considered a
>> single test setup function that can be used to mount resctrl FS and setup
>> the signal handler paired with a single test teardown function?
>
> Yes. Consolidating all these is among my not-yet submitted patches.
> I just had to do a backport-friendly Fixes patch first for this.
>
Could you please help me understand how the duplicate calls are more
backport friendly?
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-13 11:11 ` Ilpo Järvinen
@ 2023-09-13 20:58 ` Reinette Chatre
2023-09-14 9:58 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-13 20:58 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/13/2023 4:11 AM, Ilpo Järvinen wrote:
> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>> The test runner run_cmt_test() in resctrl_tests.c checks for CMT
>>> feature and does not run cmt_resctrl_val() if CMT is not supported.
>>> Then cmt_resctrl_val() also check is CMT is supported.
>>>
>>> Remove the duplicated feature check for CMT from cmt_resctrl_val().
>>>
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> Cc: <stable@vger.kernel.org>
>>
>> This does not look like stable material to me.
>
> I know but when constructing this series I had 2 options:
>
> Either convert also this when changing validate_resctrl_feature_request()
> or remove this call entirely.
>
> Given it's duplicate of the other CMT check, I chose to just remove it
> (which I'd do anyway). As patch 4/5 requires 3/5 which in turn requires
> this, this has to go stable if 4/5 goes too.
>
Understood. This makes it a dependency of an actual fix, which is addressed
in 4/5's sign-off area. This notation is new to me but it is not clear to me
that the dependency should also be tagged as stable material (without a
fixes tag). Since it is not an actual fix by itself yet is sent to @stable
I think it may cause confusion. Is just listing it as a dependency of the
actual fix not sufficient (as you already do in 4/5)? Perhaps as compromise
this patch can also get a note to the stable team. Something like:
Cc: <stable@vger.kernel.org> # dependency of "selftests/resctrl: Fix feature checks"
I am not sure though - I would like to avoid confusion and not burden
the stable team. If this is a flow you have used before successfully I'd
defer to your experience.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-13 11:02 ` Ilpo Järvinen
@ 2023-09-13 20:59 ` Reinette Chatre
2023-09-14 11:06 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-13 20:59 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/13/2023 4:02 AM, Ilpo Järvinen wrote:
> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>> Feature check in validate_resctrl_feature_request() takes in the test
>>> name string and maps that to what to check per test.
>>>
>>> Pass resource and feature names to validate_resctrl_feature_request()
>>> directly rather than deriving them from the test name inside the
>>> function which makes the feature check easier to extend for new test
>>> cases.
>>>
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> Cc: <stable@vger.kernel.org>
>>
>> This does not seem to be stable material.
>
> Alone it isn't, but both 2/5 and this 3/5 are prerequisites for 4/5 as
> shown by the tags there.
>
>>> ---
>>> tools/testing/selftests/resctrl/resctrl.h | 6 +-
>>> .../testing/selftests/resctrl/resctrl_tests.c | 10 +--
>>> tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
>>> 3 files changed, 34 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
>>> index dd07463cdf48..89ced4152933 100644
>>> --- a/tools/testing/selftests/resctrl/resctrl.h
>>> +++ b/tools/testing/selftests/resctrl/resctrl.h
>
>>> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
>>> index bd36ee206602..bd547a10791c 100644
>>> --- a/tools/testing/selftests/resctrl/resctrlfs.c
>>> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
>>> @@ -10,6 +10,8 @@
>>> */
>>> #include "resctrl.h"
>>>
>>> +#include <limits.h>
>>> +
>>
>> Could you please include <limits.h> before the local resctrl.h?
>
> Believe me I tried that first but it did not work. So this intentionally
> in the current order as resctrl.h defines _GNU_SOURCE which is among
> things that tends to alter many things. If I reorder them, the build gives
> me these issues:
>
> resctrlfs.c: In function ‘taskset_benchmark’:
> resctrlfs.c:284:2: warning: implicit declaration of function ‘CPU_ZERO’;
> did you mean ‘FP_ZERO’? [-Wimplicit-function-declaration]
> 284 | CPU_ZERO(&my_set);
> | ^~~~~~~~
> | FP_ZERO
> resctrlfs.c:285:2: warning: implicit declaration of function ‘CPU_SET’
> [-Wimplicit-function-declaration]
> 285 | CPU_SET(cpu_no, &my_set);
> | ^~~~~~~
> resctrlfs.c:287:6: warning: implicit declaration of function
> ‘sched_setaffinity’ [-Wimplicit-function-declaration]
> 287 | if (sched_setaffinity(bm_pid, sizeof(cpu_set_t), &my_set)) {
> | ^~~~~~~~~~~~~~~~~
>
> It might be useful to move _GNU_SOURCE define into Makefile though to
> avoid these kind of issues (but that's not material for this patch).
How about a #define _GNU_SOURCE in this file as an intermediate step?
I did see your patch making this change but cannot see how it is
coordinated with fixing the include order in this file.
>
>>> static int find_resctrl_mount(char *buffer)
>>> {
>>> FILE *mounts;
>>> @@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
>>>
>>> /*
>>> * validate_resctrl_feature_request - Check if requested feature is valid.
>>> - * @resctrl_val: Requested feature
>>> + * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
>>> + * @feature: Feature to be checked under resource (can be NULL). This path
>>> + * is relative to the resource path.
>>
>> I do not think "this path" is accurate. @feature is not a path but an entry
>> within the mon_features file.
>
> Yes, agreed.
>
>> Also please note that mon_features only exists for L3_MON, none of the other
>> listed resources have an associated mon_features file in resctrl. This
>> function is created to be generic has specific requirements on what
>> valid (never checked) parameters should be. This may be ok with the usage
>> but it should not pretend to be generic.
>
> So are you recommending I split this function into two where the new one
> would do the mon_features check?
No need to split the function. That seems overkill considering its
captive usage. I think a snippet making its usage clear will be helpful.
Something like:
@feature: <description>. Can only be set for L3_MON. Must be
NULL for all other resources.
Please feel free to improve.
>
>>> char *res;
>>> FILE *inf;
>>> int ret;
>>>
>>> - if (!resctrl_val)
>>> + if (!resource)
>>> return false;
>>>
>>> ret = find_resctrl_mount(NULL);
>>> if (ret)
>>> return false;
>>>
>>> - if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
>>> - if (!stat(L3_PATH, &statbuf))
>>> - return true;
>>> - } else if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
>>> - if (!stat(MB_PATH, &statbuf))
>>> - return true;
>>> - } else if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
>>> - !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
>>> - if (!stat(L3_MON_PATH, &statbuf)) {
>>> - inf = fopen(L3_MON_FEATURES_PATH, "r");
>>> - if (!inf)
>>> - return false;
>>> -
>>> - if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
>>> - res = fgrep(inf, "llc_occupancy");
>>> - if (res) {
>>> - found = true;
>>> - free(res);
>>> - }
>>> - }
>>> -
>>> - if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR))) {
>>> - res = fgrep(inf, "mbm_total_bytes");
>>> - if (res) {
>>> - free(res);
>>> - res = fgrep(inf, "mbm_local_bytes");
>>> - if (res) {
>>> - found = true;
>>> - free(res);
>>> - }
>>> - }
>>> - }
>>> - fclose(inf);
>>> - }
>>> - }
>>> + snprintf(res_path, sizeof(res_path), "%s/%s", INFO_PATH, resource);
>>> +
>>> + if (stat(res_path, &statbuf))
>>> + return false;
>>> +
>>> + if (!feature)
>>> + return true;
>>> +
>>> + snprintf(res_path, sizeof(res_path), "%s/%s/mon_features", INFO_PATH, resource);
>>> + inf = fopen(res_path, "r");
>>> + if (!inf)
>>> + return false;
>>> +
>>> + res = fgrep(inf, feature);
>>> + free(res);
>>> + fclose(inf);
>>>
>>> - return found;
>>> + return res;
>>
>> This is unexpected. Function should return bool but instead returns a char * that
>> has been freed?
>
> Okay, I understand it looks confusing when relying on implicit conversion
> to boolean (despite being functionally correct).
>
> I can do this instead to make it more obvious the intention is to convert
> the result to boolean and not return a pointer:
> return !!res;
>
Looks good to me.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-13 11:43 ` Ilpo Järvinen
@ 2023-09-13 21:00 ` Reinette Chatre
0 siblings, 0 replies; 25+ messages in thread
From: Reinette Chatre @ 2023-09-13 21:00 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/13/2023 4:43 AM, Ilpo Järvinen wrote:
> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>> 5% difference upper bound for success is a bit on the low side for the
>>
>> "a bit on the low side" is very vague.
>
> The commit that introduced that 5% bound plainly admitted it's "randomly
> chosen value". At least that wasn't vague, I guess. :-)
>
> So what I'm trying to do here is to have "randomly chosen value" replaced
> with a value that seems to work well enough based on measurements on
> a large set of platforms.
Already a better motivation for this change. Your cover letter also hints
at this but this description does not mention that this is not just
another number pulled from the air but indeed one that is based on
significant testing on a large variety of systems. This description can
surely mention all the work you did that ended with proposing this new
number, no?
>
> Personally, I don't care much about this, I can just ignore the failures
> due to outliers (and also reports about failing MBA/MBM test if somebody
> ever sends one to me), but if I'd be one running automated tests it would
> be annoying to have a problem like this unaddressed.
In no way was I suggesting that this should not be addressed.
>
>>> MBA and MBM tests. Some platforms produce outliers that are slightly
>>> above that, typically 6-7%.
>>>
>>> Relaxing the MBA/MBM success bound to 8% removes most of the failures
>>> due those frequent outliers.
>>
>> This description needs more context on what issue is being solved here.
>> What does the % difference represent? How was new percentage determined?
>>
>> Did you investigate why there are differences between platforms? From
>> what I understand these tests measure memory bandwidth using perf and
>> resctrl and then compare the difference. Are there interesting things
>> about the platforms on which the difference is higher than 5%?
>
> Not really I think. The number just isn't that stable to always remain
> below 5% (even if it usually does).
>
> Only systematic thing I've come across is that if I play with the read
> pattern for defeating the hw prefetcher (you've seen a patch earlier and
> it will be among the series I'll send after this one), it has an impact
> which looks more systematic across all MBM/MBA tests. But it's not what
> I'm trying now address with this patch.
>
>> Could
>> those be systems with multiple sockets (and thus multiple PMUs that need
>> to be setup, reset, and read)? Can the reading of the counters be improved
>> instead of relaxing the success criteria? A quick comparison between
>> get_mem_bw_imc() and get_mem_bw_resctrl() makes me think that a difference
>> is not surprising ... note how the PMU counters are started and reset
>> (potentially on multiple sockets) at every iteration while the resctrl
>> counters keep rolling and new values are just subtracted from previous.
>
> Perhaps, I can try to look into it (add to my todo list so I won't
> forget). But in the meantime, this new value is picked using a criteria
> that looks better than "randomly chosen value". If I ever manage to
> address the outliers, the bound could be lowered again.
>
> I'll update the changelog to explain things better.
>
>
ok, thank you.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-13 20:58 ` Reinette Chatre
@ 2023-09-14 9:58 ` Ilpo Järvinen
2023-09-14 15:04 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-14 9:58 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 2651 bytes --]
On Wed, 13 Sep 2023, Reinette Chatre wrote:
> On 9/13/2023 4:11 AM, Ilpo Järvinen wrote:
> > On Tue, 12 Sep 2023, Reinette Chatre wrote:
> >> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> >>> The test runner run_cmt_test() in resctrl_tests.c checks for CMT
> >>> feature and does not run cmt_resctrl_val() if CMT is not supported.
> >>> Then cmt_resctrl_val() also check is CMT is supported.
> >>>
> >>> Remove the duplicated feature check for CMT from cmt_resctrl_val().
> >>>
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> Cc: <stable@vger.kernel.org>
> >>
> >> This does not look like stable material to me.
> >
> > I know but when constructing this series I had 2 options:
> >
> > Either convert also this when changing validate_resctrl_feature_request()
> > or remove this call entirely.
> >
> > Given it's duplicate of the other CMT check, I chose to just remove it
> > (which I'd do anyway). As patch 4/5 requires 3/5 which in turn requires
> > this, this has to go stable if 4/5 goes too.
> >
>
> Understood. This makes it a dependency of an actual fix, which is addressed
> in 4/5's sign-off area. This notation is new to me but it is not clear to me
> that the dependency should also be tagged as stable material (without a
> fixes tag). Since it is not an actual fix by itself yet is sent to @stable
> I think it may cause confusion. Is just listing it as a dependency of the
> actual fix not sufficient (as you already do in 4/5)? Perhaps as compromise
> this patch can also get a note to the stable team. Something like:
>
> Cc: <stable@vger.kernel.org> # dependency of "selftests/resctrl: Fix feature checks"
>
> I am not sure though - I would like to avoid confusion and not burden
> the stable team. If this is a flow you have used before successfully I'd
> defer to your experience.
I came across that dependency format when Greg KH replied to somebody how
to deal with the cases where there isn't yet a commit id
(the cases mentioned in Documentation/process/stable-kernel-rules.rst
assumes there is already a commit id). Unfortunately it's long time ago
so I cannot easily find the link.
Documentation/process/stable-kernel-rules.rst doesn't state that the
stable address should be only used for the patches with Fixes. In general,
I believe this doesn't matter much because whether something is Cc'ed or
not to stable@vger.kernel.org doesn't seems to impact the decision if a
patch goes into stable or not (even if even some maintainers seem to
pretend leaving it out makes a difference so I tend to play along and
smile myself how incorrect that assumption is :-)).
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-13 20:58 ` Reinette Chatre
@ 2023-09-14 10:16 ` Ilpo Järvinen
2023-09-14 15:04 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-14 10:16 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 6672 bytes --]
On Wed, 13 Sep 2023, Reinette Chatre wrote:
> On 9/13/2023 3:01 AM, Ilpo Järvinen wrote:
> > On Tue, 12 Sep 2023, Reinette Chatre wrote:
> >> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> >>> Unmounting resctrl FS has been moved into the per test functions in
> >>> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> >>> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> >>> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> >>> ctrlc_handler() which then unmounts resctrl fs before exiting. The
> >>> current section between signal_handler_register() and
> >>> signal_handler_unregister(), however, does not cover the entire
> >>> duration when resctrl FS is mounted.
> >>>
> >>> Move signal_handler_register() and signal_handler_unregister() call
> >>> into the test functions in resctrl_tests.c to properly unmount resctrl
> >>> fs. Adjust child process kill() call in ctrlc_handler() to only be
> >>> invoked if the child was already forked.
> >>
> >> Thank you for catching this.
> >>
> >>>
> >>> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> Cc: <stable@vger.kernel.org>
> >>> ---
> >>> tools/testing/selftests/resctrl/cat_test.c | 8 -------
> >>> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
> >>> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
> >>> 3 files changed, 34 insertions(+), 20 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >>> index 97b87285ab2a..224ba8544d8a 100644
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >>> strcpy(param.filename, RESULT_FILE_NAME1);
> >>> param.num_of_runs = 0;
> >>> param.cpu_no = sibling_cpu_no;
> >>> - } else {
> >>> - ret = signal_handler_register();
> >>> - if (ret) {
> >>> - kill(bm_pid, SIGKILL);
> >>> - goto out;
> >>> - }
> >>> }
> >>>
> >>> remove(param.filename);
> >>> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >>> }
> >>> close(pipefd[0]);
> >>> kill(bm_pid, SIGKILL);
> >>> - signal_handler_unregister();
> >>> }
> >>>
> >>> -out:
> >>> cat_test_cleanup();
> >>>
> >>> return ret;
> >>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> >>> index 823672a20a43..3d66fbdc2df3 100644
> >>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> >>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> >>> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >>>
> >>> ksft_print_msg("Starting MBM BW change ...\n");
> >>>
> >>> + res = signal_handler_register();
> >>> + if (res)
> >>> + return;
> >>> +
> >>> res = mount_resctrlfs();
> >>> if (res) {
> >>> + signal_handler_unregister();
> >>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> >>> return;
> >>> }
> >>> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >>>
> >>> umount:
> >>> umount_resctrlfs();
> >>> + signal_handler_unregister();
> >>> }
> >>>
> >>> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>>
> >>> ksft_print_msg("Starting MBA Schemata change ...\n");
> >>>
> >>> + res = signal_handler_register();
> >>> + if (res)
> >>> + return;
> >>> +
> >>> res = mount_resctrlfs();
> >>> if (res) {
> >>> + signal_handler_unregister();
> >>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> >>> return;
> >>> }
> >>> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>>
> >>> umount:
> >>> umount_resctrlfs();
> >>> + signal_handler_unregister();
> >>> }
> >>>
> >>
> >> This adds more duplicated code for every test. Have you considered a
> >> single test setup function that can be used to mount resctrl FS and setup
> >> the signal handler paired with a single test teardown function?
> >
> > Yes. Consolidating all these is among my not-yet submitted patches.
> > I just had to do a backport-friendly Fixes patch first for this.
> >
>
> Could you please help me understand how the duplicate calls are more
> backport friendly?
Hi,
It's simply because the refactoring that has to be done to be able to
introduce the generalized test framework is much more invasive and far
reaching than this patch. Essentially, all the call signatures of the test
functions need to match and the feature checks need to be done in new per
test functions too. This is the diffstat of those changes alone:
tools/testing/selftests/resctrl/cat_test.c | 21 +++--
tools/testing/selftests/resctrl/cmt_test.c | 26 +++--
tools/testing/selftests/resctrl/mba_test.c | 20 +++-
tools/testing/selftests/resctrl/mbm_test.c | 20 +++-
tools/testing/selftests/resctrl/resctrl.h | 43 ++++++++-
tools/testing/selftests/resctrl/resctrl_tests.c | 220 +++++++++++++++----------------------------
tools/testing/selftests/resctrl/resctrlfs.c | 5 +
(tools/testing/selftests/resctrl/resctrl_tests.c --- part would
be slightly less if I'd reorder this patch but that only 24 lines off as
per diffstat of this patch).
But that's not all.... To be able to push the generalized test framework
to stable, you need to also count in the benchmark cmd changes which
worked towards making the call signatures identical. So here's the
diffstat for that series for quick reference:
tools/testing/selftests/resctrl/cache.c | 5 +-
tools/testing/selftests/resctrl/cat_test.c | 13 +--
tools/testing/selftests/resctrl/cmt_test.c | 34 ++++--
tools/testing/selftests/resctrl/mba_test.c | 4 +-
tools/testing/selftests/resctrl/mbm_test.c | 7 +-
tools/testing/selftests/resctrl/resctrl.h | 16 +--
.../testing/selftests/resctrl/resctrl_tests.c | 100 ++++++++----------
tools/testing/selftests/resctrl/resctrl_val.c | 10 +-
That's ~500 lines changed vs ~50 so it's a magnitude worse and much less
localized.
And rest assured, I did not like introducing the duplicated calls any more
than you do (I did not write the generalized test framework for nothing,
after all) but the way taken in this patch seemed the most reasonable
option under these circumstances.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-13 20:59 ` Reinette Chatre
@ 2023-09-14 11:06 ` Ilpo Järvinen
0 siblings, 0 replies; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-14 11:06 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 5204 bytes --]
On Wed, 13 Sep 2023, Reinette Chatre wrote:
> On 9/13/2023 4:02 AM, Ilpo Järvinen wrote:
> > On Tue, 12 Sep 2023, Reinette Chatre wrote:
> >> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> >>> Feature check in validate_resctrl_feature_request() takes in the test
> >>> name string and maps that to what to check per test.
> >>>
> >>> Pass resource and feature names to validate_resctrl_feature_request()
> >>> directly rather than deriving them from the test name inside the
> >>> function which makes the feature check easier to extend for new test
> >>> cases.
> >>>
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> Cc: <stable@vger.kernel.org>
> >>
> >> This does not seem to be stable material.
> >
> > Alone it isn't, but both 2/5 and this 3/5 are prerequisites for 4/5 as
> > shown by the tags there.
> >
> >>> ---
> >>> tools/testing/selftests/resctrl/resctrl.h | 6 +-
> >>> .../testing/selftests/resctrl/resctrl_tests.c | 10 +--
> >>> tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
> >>> 3 files changed, 34 insertions(+), 51 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> >>> index dd07463cdf48..89ced4152933 100644
> >>> --- a/tools/testing/selftests/resctrl/resctrl.h
> >>> +++ b/tools/testing/selftests/resctrl/resctrl.h
> >
> >>> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> >>> index bd36ee206602..bd547a10791c 100644
> >>> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> >>> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> >>> @@ -10,6 +10,8 @@
> >>> */
> >>> #include "resctrl.h"
> >>>
> >>> +#include <limits.h>
> >>> +
> >>
> >> Could you please include <limits.h> before the local resctrl.h?
> >
> > Believe me I tried that first but it did not work. So this intentionally
> > in the current order as resctrl.h defines _GNU_SOURCE which is among
> > things that tends to alter many things. If I reorder them, the build gives
> > me these issues:
> >
> > resctrlfs.c: In function ‘taskset_benchmark’:
> > resctrlfs.c:284:2: warning: implicit declaration of function ‘CPU_ZERO’;
> > did you mean ‘FP_ZERO’? [-Wimplicit-function-declaration]
> > 284 | CPU_ZERO(&my_set);
> > | ^~~~~~~~
> > | FP_ZERO
> > resctrlfs.c:285:2: warning: implicit declaration of function ‘CPU_SET’
> > [-Wimplicit-function-declaration]
> > 285 | CPU_SET(cpu_no, &my_set);
> > | ^~~~~~~
> > resctrlfs.c:287:6: warning: implicit declaration of function
> > ‘sched_setaffinity’ [-Wimplicit-function-declaration]
> > 287 | if (sched_setaffinity(bm_pid, sizeof(cpu_set_t), &my_set)) {
> > | ^~~~~~~~~~~~~~~~~
> >
> > It might be useful to move _GNU_SOURCE define into Makefile though to
> > avoid these kind of issues (but that's not material for this patch).
>
> How about a #define _GNU_SOURCE in this file as an intermediate step?
> I did see your patch making this change but cannot see how it is
> coordinated with fixing the include order in this file.
I'll just make that change part of this series and use also it as
dependency. Making an intermediate step just for stable that is going to
immediately removed in mainline would just causing the code to diverge
unnecessarily, IMO.
There's also a small risk for some other bug that does not cause compile
to fail due to differences because of a late define for _GNU_SOURCE. I
don't find it very likely but seems possible due to differences in some
constant values (not that the resctrl selftest code is very good at using
those defined constants in the first place, there are plenty of literals
still to cleanup).
> >>> static int find_resctrl_mount(char *buffer)
> >>> {
> >>> FILE *mounts;
> >>> @@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
> >>>
> >>> /*
> >>> * validate_resctrl_feature_request - Check if requested feature is valid.
> >>> - * @resctrl_val: Requested feature
> >>> + * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
> >>> + * @feature: Feature to be checked under resource (can be NULL). This path
> >>> + * is relative to the resource path.
> >>
> >> I do not think "this path" is accurate. @feature is not a path but an entry
> >> within the mon_features file.
> >
> > Yes, agreed.
> >
> >> Also please note that mon_features only exists for L3_MON, none of the other
> >> listed resources have an associated mon_features file in resctrl. This
> >> function is created to be generic has specific requirements on what
> >> valid (never checked) parameters should be. This may be ok with the usage
> >> but it should not pretend to be generic.
> >
> > So are you recommending I split this function into two where the new one
> > would do the mon_features check?
>
> No need to split the function. That seems overkill considering its
> captive usage. I think a snippet making its usage clear will be helpful.
> Something like:
>
> @feature: <description>. Can only be set for L3_MON. Must be
> NULL for all other resources.
>
> Please feel free to improve.
Thanks, I'll do that.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-14 10:16 ` Ilpo Järvinen
@ 2023-09-14 15:04 ` Reinette Chatre
2023-09-14 17:05 ` Ilpo Järvinen
0 siblings, 1 reply; 25+ messages in thread
From: Reinette Chatre @ 2023-09-14 15:04 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/14/2023 3:16 AM, Ilpo Järvinen wrote:
> On Wed, 13 Sep 2023, Reinette Chatre wrote:
>> On 9/13/2023 3:01 AM, Ilpo Järvinen wrote:
>>> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>>>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>>>> Unmounting resctrl FS has been moved into the per test functions in
>>>>> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
>>>>> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
>>>>> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
>>>>> ctrlc_handler() which then unmounts resctrl fs before exiting. The
>>>>> current section between signal_handler_register() and
>>>>> signal_handler_unregister(), however, does not cover the entire
>>>>> duration when resctrl FS is mounted.
>>>>>
>>>>> Move signal_handler_register() and signal_handler_unregister() call
>>>>> into the test functions in resctrl_tests.c to properly unmount resctrl
>>>>> fs. Adjust child process kill() call in ctrlc_handler() to only be
>>>>> invoked if the child was already forked.
>>>>
>>>> Thank you for catching this.
>>>>
>>>>>
>>>>> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
>>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>> Cc: <stable@vger.kernel.org>
>>>>> ---
>>>>> tools/testing/selftests/resctrl/cat_test.c | 8 -------
>>>>> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
>>>>> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
>>>>> 3 files changed, 34 insertions(+), 20 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>>>> index 97b87285ab2a..224ba8544d8a 100644
>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>>>> strcpy(param.filename, RESULT_FILE_NAME1);
>>>>> param.num_of_runs = 0;
>>>>> param.cpu_no = sibling_cpu_no;
>>>>> - } else {
>>>>> - ret = signal_handler_register();
>>>>> - if (ret) {
>>>>> - kill(bm_pid, SIGKILL);
>>>>> - goto out;
>>>>> - }
>>>>> }
>>>>>
>>>>> remove(param.filename);
>>>>> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>>>> }
>>>>> close(pipefd[0]);
>>>>> kill(bm_pid, SIGKILL);
>>>>> - signal_handler_unregister();
>>>>> }
>>>>>
>>>>> -out:
>>>>> cat_test_cleanup();
>>>>>
>>>>> return ret;
>>>>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>> index 823672a20a43..3d66fbdc2df3 100644
>>>>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>
>>>>> ksft_print_msg("Starting MBM BW change ...\n");
>>>>>
>>>>> + res = signal_handler_register();
>>>>> + if (res)
>>>>> + return;
>>>>> +
>>>>> res = mount_resctrlfs();
>>>>> if (res) {
>>>>> + signal_handler_unregister();
>>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>>>> return;
>>>>> }
>>>>> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>
>>>>> umount:
>>>>> umount_resctrlfs();
>>>>> + signal_handler_unregister();
>>>>> }
>>>>>
>>>>> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>
>>>>> ksft_print_msg("Starting MBA Schemata change ...\n");
>>>>>
>>>>> + res = signal_handler_register();
>>>>> + if (res)
>>>>> + return;
>>>>> +
>>>>> res = mount_resctrlfs();
>>>>> if (res) {
>>>>> + signal_handler_unregister();
>>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>>>> return;
>>>>> }
>>>>> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>
>>>>> umount:
>>>>> umount_resctrlfs();
>>>>> + signal_handler_unregister();
>>>>> }
>>>>>
>>>>
>>>> This adds more duplicated code for every test. Have you considered a
>>>> single test setup function that can be used to mount resctrl FS and setup
>>>> the signal handler paired with a single test teardown function?
>>>
>>> Yes. Consolidating all these is among my not-yet submitted patches.
>>> I just had to do a backport-friendly Fixes patch first for this.
>>>
>>
>> Could you please help me understand how the duplicate calls are more
>> backport friendly?
>
> Hi,
>
> It's simply because the refactoring that has to be done to be able to
> introduce the generalized test framework is much more invasive and far
> reaching than this patch. Essentially, all the call signatures of the test
> functions need to match and the feature checks need to be done in new per
> test functions too. This is the diffstat of those changes alone:
>
> tools/testing/selftests/resctrl/cat_test.c | 21 +++--
> tools/testing/selftests/resctrl/cmt_test.c | 26 +++--
> tools/testing/selftests/resctrl/mba_test.c | 20 +++-
> tools/testing/selftests/resctrl/mbm_test.c | 20 +++-
> tools/testing/selftests/resctrl/resctrl.h | 43 ++++++++-
> tools/testing/selftests/resctrl/resctrl_tests.c | 220 +++++++++++++++----------------------------
> tools/testing/selftests/resctrl/resctrlfs.c | 5 +
>
> (tools/testing/selftests/resctrl/resctrl_tests.c --- part would
> be slightly less if I'd reorder this patch but that only 24 lines off as
> per diffstat of this patch).
>
> But that's not all.... To be able to push the generalized test framework
> to stable, you need to also count in the benchmark cmd changes which
> worked towards making the call signatures identical. So here's the
> diffstat for that series for quick reference:
>
> tools/testing/selftests/resctrl/cache.c | 5 +-
> tools/testing/selftests/resctrl/cat_test.c | 13 +--
> tools/testing/selftests/resctrl/cmt_test.c | 34 ++++--
> tools/testing/selftests/resctrl/mba_test.c | 4 +-
> tools/testing/selftests/resctrl/mbm_test.c | 7 +-
> tools/testing/selftests/resctrl/resctrl.h | 16 +--
> .../testing/selftests/resctrl/resctrl_tests.c | 100 ++++++++----------
> tools/testing/selftests/resctrl/resctrl_val.c | 10 +-
>
> That's ~500 lines changed vs ~50 so it's a magnitude worse and much less
> localized.
>
> And rest assured, I did not like introducing the duplicated calls any more
> than you do (I did not write the generalized test framework for nothing,
> after all) but the way taken in this patch seemed the most reasonable
> option under these circumstances.
>
hmmm ... I did not expect that a total refactoring would be needed.
I was thinking about a change from this:
testX(...)
{
res = signal_handler_register();
/* error handling */
res = mount_resctrlfs();
/* error handling */
/* test */
unmount_resctrlfs();
signal_handler_register();
}
to this:
int test_setup(...)
{
res = signal_handler_register();
/* error handling */
res = mount_resctrlfs();
/* error handling */
}
void test_cleanup(...)
{
unmount_resctrlfs();
signal_handler_register();
}
testX(...)
{
res = test_setup(..);
/* error handling */
/* test */
test_cleanup();
}
I expect this to also support the bigger refactoring.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-14 9:58 ` Ilpo Järvinen
@ 2023-09-14 15:04 ` Reinette Chatre
0 siblings, 0 replies; 25+ messages in thread
From: Reinette Chatre @ 2023-09-14 15:04 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/14/2023 2:58 AM, Ilpo Järvinen wrote:
> On Wed, 13 Sep 2023, Reinette Chatre wrote:
>> On 9/13/2023 4:11 AM, Ilpo Järvinen wrote:
>>> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>>>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>>>> The test runner run_cmt_test() in resctrl_tests.c checks for CMT
>>>>> feature and does not run cmt_resctrl_val() if CMT is not supported.
>>>>> Then cmt_resctrl_val() also check is CMT is supported.
>>>>>
>>>>> Remove the duplicated feature check for CMT from cmt_resctrl_val().
>>>>>
>>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>> Cc: <stable@vger.kernel.org>
>>>>
>>>> This does not look like stable material to me.
>>>
>>> I know but when constructing this series I had 2 options:
>>>
>>> Either convert also this when changing validate_resctrl_feature_request()
>>> or remove this call entirely.
>>>
>>> Given it's duplicate of the other CMT check, I chose to just remove it
>>> (which I'd do anyway). As patch 4/5 requires 3/5 which in turn requires
>>> this, this has to go stable if 4/5 goes too.
>>>
>>
>> Understood. This makes it a dependency of an actual fix, which is addressed
>> in 4/5's sign-off area. This notation is new to me but it is not clear to me
>> that the dependency should also be tagged as stable material (without a
>> fixes tag). Since it is not an actual fix by itself yet is sent to @stable
>> I think it may cause confusion. Is just listing it as a dependency of the
>> actual fix not sufficient (as you already do in 4/5)? Perhaps as compromise
>> this patch can also get a note to the stable team. Something like:
>>
>> Cc: <stable@vger.kernel.org> # dependency of "selftests/resctrl: Fix feature checks"
>>
>> I am not sure though - I would like to avoid confusion and not burden
>> the stable team. If this is a flow you have used before successfully I'd
>> defer to your experience.
>
> I came across that dependency format when Greg KH replied to somebody how
> to deal with the cases where there isn't yet a commit id
> (the cases mentioned in Documentation/process/stable-kernel-rules.rst
> assumes there is already a commit id). Unfortunately it's long time ago
> so I cannot easily find the link.
I see, thank you. I was not aware of this custom.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-14 15:04 ` Reinette Chatre
@ 2023-09-14 17:05 ` Ilpo Järvinen
2023-09-14 17:29 ` Reinette Chatre
0 siblings, 1 reply; 25+ messages in thread
From: Ilpo Järvinen @ 2023-09-14 17:05 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 8322 bytes --]
On Thu, 14 Sep 2023, Reinette Chatre wrote:
> On 9/14/2023 3:16 AM, Ilpo Järvinen wrote:
> > On Wed, 13 Sep 2023, Reinette Chatre wrote:
> >> On 9/13/2023 3:01 AM, Ilpo Järvinen wrote:
> >>> On Tue, 12 Sep 2023, Reinette Chatre wrote:
> >>>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> >>>>> Unmounting resctrl FS has been moved into the per test functions in
> >>>>> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> >>>>> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> >>>>> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> >>>>> ctrlc_handler() which then unmounts resctrl fs before exiting. The
> >>>>> current section between signal_handler_register() and
> >>>>> signal_handler_unregister(), however, does not cover the entire
> >>>>> duration when resctrl FS is mounted.
> >>>>>
> >>>>> Move signal_handler_register() and signal_handler_unregister() call
> >>>>> into the test functions in resctrl_tests.c to properly unmount resctrl
> >>>>> fs. Adjust child process kill() call in ctrlc_handler() to only be
> >>>>> invoked if the child was already forked.
> >>>>
> >>>> Thank you for catching this.
> >>>>
> >>>>>
> >>>>> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> >>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>>>> Cc: <stable@vger.kernel.org>
> >>>>> ---
> >>>>> tools/testing/selftests/resctrl/cat_test.c | 8 -------
> >>>>> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
> >>>>> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
> >>>>> 3 files changed, 34 insertions(+), 20 deletions(-)
> >>>>>
> >>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >>>>> index 97b87285ab2a..224ba8544d8a 100644
> >>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>>>> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >>>>> strcpy(param.filename, RESULT_FILE_NAME1);
> >>>>> param.num_of_runs = 0;
> >>>>> param.cpu_no = sibling_cpu_no;
> >>>>> - } else {
> >>>>> - ret = signal_handler_register();
> >>>>> - if (ret) {
> >>>>> - kill(bm_pid, SIGKILL);
> >>>>> - goto out;
> >>>>> - }
> >>>>> }
> >>>>>
> >>>>> remove(param.filename);
> >>>>> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> >>>>> }
> >>>>> close(pipefd[0]);
> >>>>> kill(bm_pid, SIGKILL);
> >>>>> - signal_handler_unregister();
> >>>>> }
> >>>>>
> >>>>> -out:
> >>>>> cat_test_cleanup();
> >>>>>
> >>>>> return ret;
> >>>>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> >>>>> index 823672a20a43..3d66fbdc2df3 100644
> >>>>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> >>>>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> >>>>> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >>>>>
> >>>>> ksft_print_msg("Starting MBM BW change ...\n");
> >>>>>
> >>>>> + res = signal_handler_register();
> >>>>> + if (res)
> >>>>> + return;
> >>>>> +
> >>>>> res = mount_resctrlfs();
> >>>>> if (res) {
> >>>>> + signal_handler_unregister();
> >>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> >>>>> return;
> >>>>> }
> >>>>> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> >>>>>
> >>>>> umount:
> >>>>> umount_resctrlfs();
> >>>>> + signal_handler_unregister();
> >>>>> }
> >>>>>
> >>>>> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>>>> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>>>>
> >>>>> ksft_print_msg("Starting MBA Schemata change ...\n");
> >>>>>
> >>>>> + res = signal_handler_register();
> >>>>> + if (res)
> >>>>> + return;
> >>>>> +
> >>>>> res = mount_resctrlfs();
> >>>>> if (res) {
> >>>>> + signal_handler_unregister();
> >>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> >>>>> return;
> >>>>> }
> >>>>> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >>>>>
> >>>>> umount:
> >>>>> umount_resctrlfs();
> >>>>> + signal_handler_unregister();
> >>>>> }
> >>>>>
> >>>>
> >>>> This adds more duplicated code for every test. Have you considered a
> >>>> single test setup function that can be used to mount resctrl FS and setup
> >>>> the signal handler paired with a single test teardown function?
> >>>
> >>> Yes. Consolidating all these is among my not-yet submitted patches.
> >>> I just had to do a backport-friendly Fixes patch first for this.
> >>>
> >>
> >> Could you please help me understand how the duplicate calls are more
> >> backport friendly?
> >
> > Hi,
> >
> > It's simply because the refactoring that has to be done to be able to
> > introduce the generalized test framework is much more invasive and far
> > reaching than this patch. Essentially, all the call signatures of the test
> > functions need to match and the feature checks need to be done in new per
> > test functions too. This is the diffstat of those changes alone:
> >
> > tools/testing/selftests/resctrl/cat_test.c | 21 +++--
> > tools/testing/selftests/resctrl/cmt_test.c | 26 +++--
> > tools/testing/selftests/resctrl/mba_test.c | 20 +++-
> > tools/testing/selftests/resctrl/mbm_test.c | 20 +++-
> > tools/testing/selftests/resctrl/resctrl.h | 43 ++++++++-
> > tools/testing/selftests/resctrl/resctrl_tests.c | 220 +++++++++++++++----------------------------
> > tools/testing/selftests/resctrl/resctrlfs.c | 5 +
> >
> > (tools/testing/selftests/resctrl/resctrl_tests.c --- part would
> > be slightly less if I'd reorder this patch but that only 24 lines off as
> > per diffstat of this patch).
> >
> > But that's not all.... To be able to push the generalized test framework
> > to stable, you need to also count in the benchmark cmd changes which
> > worked towards making the call signatures identical. So here's the
> > diffstat for that series for quick reference:
> >
> > tools/testing/selftests/resctrl/cache.c | 5 +-
> > tools/testing/selftests/resctrl/cat_test.c | 13 +--
> > tools/testing/selftests/resctrl/cmt_test.c | 34 ++++--
> > tools/testing/selftests/resctrl/mba_test.c | 4 +-
> > tools/testing/selftests/resctrl/mbm_test.c | 7 +-
> > tools/testing/selftests/resctrl/resctrl.h | 16 +--
> > .../testing/selftests/resctrl/resctrl_tests.c | 100 ++++++++----------
> > tools/testing/selftests/resctrl/resctrl_val.c | 10 +-
> >
> > That's ~500 lines changed vs ~50 so it's a magnitude worse and much less
> > localized.
> >
> > And rest assured, I did not like introducing the duplicated calls any more
> > than you do (I did not write the generalized test framework for nothing,
> > after all) but the way taken in this patch seemed the most reasonable
> > option under these circumstances.
> >
>
> hmmm ... I did not expect that a total refactoring would be needed.
>
> I was thinking about a change from this:
>
>
> testX(...)
> {
>
> res = signal_handler_register();
> /* error handling */
> res = mount_resctrlfs();
> /* error handling */
>
> /* test */
>
> unmount_resctrlfs();
> signal_handler_register();
>
> }
>
>
> to this:
>
>
> int test_setup(...)
> {
> res = signal_handler_register();
> /* error handling */
> res = mount_resctrlfs();
> /* error handling */
> }
>
>
> void test_cleanup(...)
> {
> unmount_resctrlfs();
> signal_handler_register();
> }
>
>
> testX(...)
> {
>
> res = test_setup(..);
> /* error handling */
>
> /* test */
>
> test_cleanup();
> }
>
> I expect this to also support the bigger refactoring.
Okay, I'll do so then.
However, having already written the generic run_single_test() function
that is part of the generic test framework, I definitely don't feel those
helpers would be that helpful for it. It more feels like they'd make the
flow less obvious by adding those two extra calls there but that's of
course matter of taste.
--
i.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-14 17:05 ` Ilpo Järvinen
@ 2023-09-14 17:29 ` Reinette Chatre
0 siblings, 0 replies; 25+ messages in thread
From: Reinette Chatre @ 2023-09-14 17:29 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/14/2023 10:05 AM, Ilpo Järvinen wrote:
> On Thu, 14 Sep 2023, Reinette Chatre wrote:
>> On 9/14/2023 3:16 AM, Ilpo Järvinen wrote:
>>> On Wed, 13 Sep 2023, Reinette Chatre wrote:
>>>> On 9/13/2023 3:01 AM, Ilpo Järvinen wrote:
>>>>> On Tue, 12 Sep 2023, Reinette Chatre wrote:
>>>>>> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
>>>>>>> Unmounting resctrl FS has been moved into the per test functions in
>>>>>>> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
>>>>>>> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
>>>>>>> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
>>>>>>> ctrlc_handler() which then unmounts resctrl fs before exiting. The
>>>>>>> current section between signal_handler_register() and
>>>>>>> signal_handler_unregister(), however, does not cover the entire
>>>>>>> duration when resctrl FS is mounted.
>>>>>>>
>>>>>>> Move signal_handler_register() and signal_handler_unregister() call
>>>>>>> into the test functions in resctrl_tests.c to properly unmount resctrl
>>>>>>> fs. Adjust child process kill() call in ctrlc_handler() to only be
>>>>>>> invoked if the child was already forked.
>>>>>>
>>>>>> Thank you for catching this.
>>>>>>
>>>>>>>
>>>>>>> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
>>>>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>>>> Cc: <stable@vger.kernel.org>
>>>>>>> ---
>>>>>>> tools/testing/selftests/resctrl/cat_test.c | 8 -------
>>>>>>> .../testing/selftests/resctrl/resctrl_tests.c | 24 +++++++++++++++++++
>>>>>>> tools/testing/selftests/resctrl/resctrl_val.c | 22 ++++++++---------
>>>>>>> 3 files changed, 34 insertions(+), 20 deletions(-)
>>>>>>>
>>>>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>>>>>> index 97b87285ab2a..224ba8544d8a 100644
>>>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>>>> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>>>>>> strcpy(param.filename, RESULT_FILE_NAME1);
>>>>>>> param.num_of_runs = 0;
>>>>>>> param.cpu_no = sibling_cpu_no;
>>>>>>> - } else {
>>>>>>> - ret = signal_handler_register();
>>>>>>> - if (ret) {
>>>>>>> - kill(bm_pid, SIGKILL);
>>>>>>> - goto out;
>>>>>>> - }
>>>>>>> }
>>>>>>>
>>>>>>> remove(param.filename);
>>>>>>> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
>>>>>>> }
>>>>>>> close(pipefd[0]);
>>>>>>> kill(bm_pid, SIGKILL);
>>>>>>> - signal_handler_unregister();
>>>>>>> }
>>>>>>>
>>>>>>> -out:
>>>>>>> cat_test_cleanup();
>>>>>>>
>>>>>>> return ret;
>>>>>>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>>>> index 823672a20a43..3d66fbdc2df3 100644
>>>>>>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>>>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>>>>>>> @@ -73,8 +73,13 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>>>
>>>>>>> ksft_print_msg("Starting MBM BW change ...\n");
>>>>>>>
>>>>>>> + res = signal_handler_register();
>>>>>>> + if (res)
>>>>>>> + return;
>>>>>>> +
>>>>>>> res = mount_resctrlfs();
>>>>>>> if (res) {
>>>>>>> + signal_handler_unregister();
>>>>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>>>>>> return;
>>>>>>> }
>>>>>>> @@ -91,6 +96,7 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>>>
>>>>>>> umount:
>>>>>>> umount_resctrlfs();
>>>>>>> + signal_handler_unregister();
>>>>>>> }
>>>>>>>
>>>>>>> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>>> @@ -99,8 +105,13 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>>>
>>>>>>> ksft_print_msg("Starting MBA Schemata change ...\n");
>>>>>>>
>>>>>>> + res = signal_handler_register();
>>>>>>> + if (res)
>>>>>>> + return;
>>>>>>> +
>>>>>>> res = mount_resctrlfs();
>>>>>>> if (res) {
>>>>>>> + signal_handler_unregister();
>>>>>>> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
>>>>>>> return;
>>>>>>> }
>>>>>>> @@ -115,6 +126,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>>>>>>>
>>>>>>> umount:
>>>>>>> umount_resctrlfs();
>>>>>>> + signal_handler_unregister();
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> This adds more duplicated code for every test. Have you considered a
>>>>>> single test setup function that can be used to mount resctrl FS and setup
>>>>>> the signal handler paired with a single test teardown function?
>>>>>
>>>>> Yes. Consolidating all these is among my not-yet submitted patches.
>>>>> I just had to do a backport-friendly Fixes patch first for this.
>>>>>
>>>>
>>>> Could you please help me understand how the duplicate calls are more
>>>> backport friendly?
>>>
>>> Hi,
>>>
>>> It's simply because the refactoring that has to be done to be able to
>>> introduce the generalized test framework is much more invasive and far
>>> reaching than this patch. Essentially, all the call signatures of the test
>>> functions need to match and the feature checks need to be done in new per
>>> test functions too. This is the diffstat of those changes alone:
>>>
>>> tools/testing/selftests/resctrl/cat_test.c | 21 +++--
>>> tools/testing/selftests/resctrl/cmt_test.c | 26 +++--
>>> tools/testing/selftests/resctrl/mba_test.c | 20 +++-
>>> tools/testing/selftests/resctrl/mbm_test.c | 20 +++-
>>> tools/testing/selftests/resctrl/resctrl.h | 43 ++++++++-
>>> tools/testing/selftests/resctrl/resctrl_tests.c | 220 +++++++++++++++----------------------------
>>> tools/testing/selftests/resctrl/resctrlfs.c | 5 +
>>>
>>> (tools/testing/selftests/resctrl/resctrl_tests.c --- part would
>>> be slightly less if I'd reorder this patch but that only 24 lines off as
>>> per diffstat of this patch).
>>>
>>> But that's not all.... To be able to push the generalized test framework
>>> to stable, you need to also count in the benchmark cmd changes which
>>> worked towards making the call signatures identical. So here's the
>>> diffstat for that series for quick reference:
>>>
>>> tools/testing/selftests/resctrl/cache.c | 5 +-
>>> tools/testing/selftests/resctrl/cat_test.c | 13 +--
>>> tools/testing/selftests/resctrl/cmt_test.c | 34 ++++--
>>> tools/testing/selftests/resctrl/mba_test.c | 4 +-
>>> tools/testing/selftests/resctrl/mbm_test.c | 7 +-
>>> tools/testing/selftests/resctrl/resctrl.h | 16 +--
>>> .../testing/selftests/resctrl/resctrl_tests.c | 100 ++++++++----------
>>> tools/testing/selftests/resctrl/resctrl_val.c | 10 +-
>>>
>>> That's ~500 lines changed vs ~50 so it's a magnitude worse and much less
>>> localized.
>>>
>>> And rest assured, I did not like introducing the duplicated calls any more
>>> than you do (I did not write the generalized test framework for nothing,
>>> after all) but the way taken in this patch seemed the most reasonable
>>> option under these circumstances.
>>>
>>
>> hmmm ... I did not expect that a total refactoring would be needed.
>>
>> I was thinking about a change from this:
>>
>>
>> testX(...)
>> {
>>
>> res = signal_handler_register();
>> /* error handling */
>> res = mount_resctrlfs();
>> /* error handling */
>>
>> /* test */
>>
>> unmount_resctrlfs();
>> signal_handler_register();
>>
>> }
>>
>>
>> to this:
>>
>>
>> int test_setup(...)
>> {
>> res = signal_handler_register();
>> /* error handling */
>> res = mount_resctrlfs();
>> /* error handling */
>> }
>>
>>
>> void test_cleanup(...)
>> {
>> unmount_resctrlfs();
>> signal_handler_register();
>> }
>>
>>
>> testX(...)
>> {
>>
>> res = test_setup(..);
>> /* error handling */
>>
>> /* test */
>>
>> test_cleanup();
>> }
>>
>> I expect this to also support the bigger refactoring.
>
> Okay, I'll do so then.
>
> However, having already written the generic run_single_test() function
> that is part of the generic test framework, I definitely don't feel those
> helpers would be that helpful for it. It more feels like they'd make the
> flow less obvious by adding those two extra calls there but that's of
> course matter of taste.
Sounds like there is some room for improvement here, perhaps open coding
the test_setup() and test_cleanup() helpers within run_single_test().
This is purely speculation on my part as I have not seen the code.
Reinette
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2023-09-14 17:30 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-11 11:19 [PATCH 0/5] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 1/5] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
2023-09-12 22:06 ` Reinette Chatre
2023-09-13 10:01 ` Ilpo Järvinen
2023-09-13 20:58 ` Reinette Chatre
2023-09-14 10:16 ` Ilpo Järvinen
2023-09-14 15:04 ` Reinette Chatre
2023-09-14 17:05 ` Ilpo Järvinen
2023-09-14 17:29 ` Reinette Chatre
2023-09-11 11:19 ` [PATCH 2/5] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
2023-09-12 22:06 ` Reinette Chatre
2023-09-13 11:11 ` Ilpo Järvinen
2023-09-13 20:58 ` Reinette Chatre
2023-09-14 9:58 ` Ilpo Järvinen
2023-09-14 15:04 ` Reinette Chatre
2023-09-11 11:19 ` [PATCH 3/5] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
2023-09-12 22:09 ` Reinette Chatre
2023-09-13 11:02 ` Ilpo Järvinen
2023-09-13 20:59 ` Reinette Chatre
2023-09-14 11:06 ` Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 4/5] selftests/resctrl: Fix feature checks Ilpo Järvinen
2023-09-11 11:19 ` [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
2023-09-12 22:10 ` Reinette Chatre
2023-09-13 11:43 ` Ilpo Järvinen
2023-09-13 21:00 ` Reinette Chatre
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).