linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs
@ 2022-09-08 19:28 Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 1/7] cpu isolation: basic block interference infrastructure Marcelo Tosatti
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner

There are a number of codepaths in the kernel that interrupt
code execution on remote CPUs. A subset of these codepaths is
triggered from userspace and can therefore return errors.

Introduce a cpumask named "block interference", writable from userspace.

This cpumask (and associated helpers) can be used by codepaths that
execute code on remote CPUs to optionally return an error instead.

Note: the word "interference" has been chosen since "interruption" is
often confused with "device interrupt".

To protect readers vs. writers of this cpumask, a per-CPU read-write
semaphore is used. This is acceptable since the codepaths that
trigger such interference are not (or should not be) hot.

What is proposed is to incrementally modify code that can return errors
in two ways:

1) Introduction of _fail variants of the functions that trigger
code execution on remote CPUs. With this, the modified code looks
like:

        block_interf_read_lock();

        ret = smp_call_func_single_fail() / stop_machine_fail() / ...

        block_interf_read_unlock();

This is grep-friendly (so one can still search for the smp_call_function_*
variants) and reuses code.

2) Usage of the block interference cpumask helpers. For certain
users of the smp_call_func_* and stop_machine_* functions, it
is natural to check for interference-blocked CPUs before
calling the functions for remote code execution.

For example, when it is not desirable to perform error handling at
smp_call_func_* time, or when performing the error handling would
require unjustified complexity:


        block_interf_read_lock();

	if target cpumask intersects with block interference cpumask {
		block_interf_read_unlock();
		return error
	}

	...
        ret = smp_call_function_single / stop_machine() / ...
	...

        block_interf_read_unlock();

Regarding housekeeping flags: it is usually the case that initialization
requires code execution on interference-blocked CPUs (for example MTRR
initialization, resctrlfs initialization, MSR writes, ...). Therefore
the CPUs must be tagged after system initialization, which is not
possible with the current housekeeping flags infrastructure.

This patchset converts clockevent and clocksource unbind, the perf_event_open
system call, and /proc/mtrr writes, but there are several more users
to convert (e.g. MSR reads/writes, resctrlfs, etc.).

Sending this as an RFC to gauge whether folks think this is the
right direction.




* [RFC PATCH 1/7] cpu isolation: basic block interference infrastructure
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 2/7] introduce smp_call_func_single_fail Marcelo Tosatti
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

There are a number of codepaths in the kernel that interrupt
code execution on remote CPUs. A subset of these codepaths is
triggered from userspace and can therefore return errors.

Introduce a cpumask named "block interference", writable from userspace.

This cpumask (and associated helpers) can be used by codepaths that
execute code on remote CPUs to optionally return an error instead.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/include/linux/sched/isolation.h
===================================================================
--- linux-2.6.orig/include/linux/sched/isolation.h
+++ linux-2.6/include/linux/sched/isolation.h
@@ -58,4 +58,33 @@ static inline bool housekeeping_cpu(int
 	return true;
 }
 
+#ifdef CONFIG_CPU_ISOLATION
+extern cpumask_var_t block_interf_cpumask;
+extern bool block_interf_cpumask_active;
+
+void block_interf_read_lock(void);
+void block_interf_read_unlock(void);
+
+void block_interf_write_lock(void);
+void block_interf_write_unlock(void);
+
+void block_interf_assert_held(void);
+
+#else
+static inline void block_interf_read_lock(void)		{ }
+static inline void block_interf_read_unlock(void)	{ }
+static inline void block_interf_write_lock(void)	{ }
+static inline void block_interf_write_unlock(void)	{ }
+static inline void block_interf_assert_held(void)	{ }
+#endif
+
+static inline bool block_interf_cpu(int cpu)
+{
+#ifdef CONFIG_CPU_ISOLATION
+	if (block_interf_cpumask_active)
+		return cpumask_test_cpu(cpu, block_interf_cpumask);
+#endif
+	return false;
+}
+
 #endif /* _LINUX_SCHED_ISOLATION_H */
Index: linux-2.6/kernel/sched/isolation.c
===================================================================
--- linux-2.6.orig/kernel/sched/isolation.c
+++ linux-2.6/kernel/sched/isolation.c
@@ -239,3 +239,116 @@ static int __init housekeeping_isolcpus_
 	return housekeeping_setup(str, flags);
 }
 __setup("isolcpus=", housekeeping_isolcpus_setup);
+
+DEFINE_STATIC_PERCPU_RWSEM(block_interf_lock);
+
+cpumask_var_t block_interf_cpumask;
+EXPORT_SYMBOL_GPL(block_interf_cpumask);
+
+bool block_interf_cpumask_active;
+EXPORT_SYMBOL_GPL(block_interf_cpumask_active);
+
+void block_interf_read_lock(void)
+{
+	percpu_down_read(&block_interf_lock);
+}
+EXPORT_SYMBOL_GPL(block_interf_read_lock);
+
+void block_interf_read_unlock(void)
+{
+	percpu_up_read(&block_interf_lock);
+}
+EXPORT_SYMBOL_GPL(block_interf_read_unlock);
+
+void block_interf_write_lock(void)
+{
+	percpu_down_write(&block_interf_lock);
+}
+EXPORT_SYMBOL_GPL(block_interf_write_lock);
+
+void block_interf_write_unlock(void)
+{
+	percpu_up_write(&block_interf_lock);
+}
+EXPORT_SYMBOL_GPL(block_interf_write_unlock);
+
+void block_interf_assert_held(void)
+{
+	percpu_rwsem_assert_held(&block_interf_lock);
+}
+EXPORT_SYMBOL_GPL(block_interf_assert_held);
+
+static ssize_t
+block_interf_cpumask_read(struct file *filp, char __user *ubuf,
+		     size_t count, loff_t *ppos)
+{
+	char *mask_str;
+	int len;
+
+	len = snprintf(NULL, 0, "%*pb\n",
+		       cpumask_pr_args(block_interf_cpumask)) + 1;
+	mask_str = kmalloc(len, GFP_KERNEL);
+	if (!mask_str)
+		return -ENOMEM;
+
+	len = snprintf(mask_str, len, "%*pb\n",
+		       cpumask_pr_args(block_interf_cpumask));
+	if (len >= count) {
+		count = -EINVAL;
+		goto out_err;
+	}
+	count = simple_read_from_buffer(ubuf, count, ppos, mask_str, len);
+
+out_err:
+	kfree(mask_str);
+
+	return count;
+}
+
+static ssize_t
+block_interf_cpumask_write(struct file *filp, const char __user *ubuf,
+			   size_t count, loff_t *ppos)
+{
+	cpumask_var_t block_interf_cpumask_new;
+	int err;
+
+	if (!zalloc_cpumask_var(&block_interf_cpumask_new, GFP_KERNEL))
+		return -ENOMEM;
+
+	err = cpumask_parse_user(ubuf, count, block_interf_cpumask_new);
+	if (err)
+		goto err_free;
+
+	block_interf_write_lock();
+	cpumask_copy(block_interf_cpumask, block_interf_cpumask_new);
+	block_interf_write_unlock();
+	free_cpumask_var(block_interf_cpumask_new);
+
+	return count;
+
+err_free:
+	free_cpumask_var(block_interf_cpumask_new);
+
+	return err;
+}
+
+static const struct file_operations block_interf_cpumask_fops = {
+	.read		= block_interf_cpumask_read,
+	.write		= block_interf_cpumask_write,
+};
+
+
+static int __init block_interf_cpumask_init(void)
+{
+	if (!zalloc_cpumask_var(&block_interf_cpumask, GFP_KERNEL))
+		return -ENOMEM;
+
+	debugfs_create_file_unsafe("block_interf_cpumask", 0644, NULL, NULL,
+				   &block_interf_cpumask_fops);
+
+	block_interf_cpumask_active = true;
+	return 0;
+}
+
+late_initcall(block_interf_cpumask_init);
+




* [RFC PATCH 2/7] introduce smp_call_func_single_fail
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 1/7] cpu isolation: basic block interference infrastructure Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 3/7] introduce _fail variants of stop_machine functions Marcelo Tosatti
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

Introduce smp_call_func_single_fail, which checks if
the target CPU is tagged as a "block interference" CPU,
and returns an error if so.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/include/linux/smp.h
===================================================================
--- linux-2.6.orig/include/linux/smp.h
+++ linux-2.6/include/linux/smp.h
@@ -50,6 +50,9 @@ extern unsigned int total_cpus;
 int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
 			     int wait);
 
+int smp_call_func_single_fail(int cpuid, smp_call_func_t func, void *info,
+			     int wait);
+
 void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
 			   void *info, bool wait, const struct cpumask *mask);
 
Index: linux-2.6/kernel/smp.c
===================================================================
--- linux-2.6.orig/kernel/smp.c
+++ linux-2.6/kernel/smp.c
@@ -25,6 +25,7 @@
 #include <linux/nmi.h>
 #include <linux/sched/debug.h>
 #include <linux/jump_label.h>
+#include <linux/sched/isolation.h>
 
 #include "smpboot.h"
 #include "sched/smp.h"
@@ -782,6 +783,30 @@ int smp_call_function_single(int cpu, sm
 }
 EXPORT_SYMBOL(smp_call_function_single);
 
+/*
+ * smp_call_func_single_fail - Run a function on a specific CPU,
 + * failing if the target CPU is tagged as a "block interference" CPU.
+ * @func: The function to run. This must be fast and non-blocking.
+ * @info: An arbitrary pointer to pass to the function.
+ * @wait: If true, wait until function has completed on other CPUs.
+ *
+ * Returns 0 on success, else a negative status code.
+ */
+int smp_call_func_single_fail(int cpu, smp_call_func_t func, void *info,
+			     int wait)
+{
+	int err;
+
+	block_interf_assert_held();
+	if (block_interf_cpu(cpu))
+		return -EPERM;
+
+	err = smp_call_function_single(cpu, func, info, wait);
+
+	return err;
+}
+EXPORT_SYMBOL(smp_call_func_single_fail);
+
 /**
  * smp_call_function_single_async() - Run an asynchronous function on a
  * 			         specific CPU.




* [RFC PATCH 3/7] introduce _fail variants of stop_machine functions
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 1/7] cpu isolation: basic block interference infrastructure Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 2/7] introduce smp_call_func_single_fail Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 4/7] clockevent unbind: use smp_call_func_single_fail Marcelo Tosatti
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

Introduce stop_machine_fail and stop_machine_cpuslocked_fail,
which check whether any online CPU in the system is tagged as
a block interference CPU.

If so, an error is returned.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/include/linux/stop_machine.h
===================================================================
--- linux-2.6.orig/include/linux/stop_machine.h
+++ linux-2.6/include/linux/stop_machine.h
@@ -113,6 +113,9 @@ static inline void print_stop_info(const
  */
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
+
+int stop_machine_fail(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
+
 /**
  * stop_machine_cpuslocked: freeze the machine on all CPUs and run this function
  * @fn: the function to run
@@ -124,6 +127,9 @@ int stop_machine(cpu_stop_fn_t fn, void
  */
 int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
+
+int stop_machine_cpuslocked_fail(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
+
 /**
  * stop_core_cpuslocked: - stop all threads on just one core
  * @cpu: any cpu in the targeted core
Index: linux-2.6/kernel/stop_machine.c
===================================================================
--- linux-2.6.orig/kernel/stop_machine.c
+++ linux-2.6/kernel/stop_machine.c
@@ -22,6 +22,7 @@
 #include <linux/atomic.h>
 #include <linux/nmi.h>
 #include <linux/sched/wake_q.h>
+#include <linux/sched/isolation.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -619,6 +620,17 @@ int stop_machine_cpuslocked(cpu_stop_fn_
 	return stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
 }
 
+int stop_machine_cpuslocked_fail(cpu_stop_fn_t fn, void *data,
+				 const struct cpumask *cpus)
+{
+	block_interf_assert_held();
+
+	if (cpumask_intersects(block_interf_cpumask, cpu_online_mask))
+		return -EPERM;
+
+	return stop_machine_cpuslocked(fn, data, cpus);
+}
+
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
 {
 	int ret;
@@ -631,6 +643,19 @@ int stop_machine(cpu_stop_fn_t fn, void
 }
 EXPORT_SYMBOL_GPL(stop_machine);
 
+int stop_machine_fail(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
+{
+	int ret;
+
+	/* No CPUs can come up or down during this. */
+	cpus_read_lock();
+	ret = stop_machine_cpuslocked_fail(fn, data, cpus);
+	cpus_read_unlock();
+	return ret;
+}
+EXPORT_SYMBOL_GPL(stop_machine_fail);
+
+
 #ifdef CONFIG_SCHED_SMT
 int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data)
 {




* [RFC PATCH 4/7] clockevent unbind: use smp_call_func_single_fail
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
                   ` (2 preceding siblings ...)
  2022-09-08 19:29 ` [RFC PATCH 3/7] introduce _fail variants of stop_machine functions Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 5/7] timekeeping_notify: use stop_machine_fail when appropriate Marcelo Tosatti
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

Convert clockevents_unbind from smp_call_function_single
to smp_call_func_single_fail, which fails in case the
target CPU is tagged as a block interference CPU.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/kernel/time/clockevents.c
===================================================================
--- linux-2.6.orig/kernel/time/clockevents.c
+++ linux-2.6/kernel/time/clockevents.c
@@ -13,6 +13,7 @@
 #include <linux/module.h>
 #include <linux/smp.h>
 #include <linux/device.h>
+#include <linux/sched/isolation.h>
 
 #include "tick-internal.h"
 
@@ -416,9 +417,14 @@ static void __clockevents_unbind(void *a
  */
 static int clockevents_unbind(struct clock_event_device *ced, int cpu)
 {
+	int ret;
 	struct ce_unbind cu = { .ce = ced, .res = -ENODEV };
 
-	smp_call_function_single(cpu, __clockevents_unbind, &cu, 1);
+	block_interf_read_lock();
+	ret = smp_call_func_single_fail(cpu, __clockevents_unbind, &cu, 1);
+	block_interf_read_unlock();
+	if (ret)
+		return ret;
 	return cu.res;
 }
 




* [RFC PATCH 5/7] timekeeping_notify: use stop_machine_fail when appropriate
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
                   ` (3 preceding siblings ...)
  2022-09-08 19:29 ` [RFC PATCH 4/7] clockevent unbind: use smp_call_func_single_fail Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 6/7] perf_event_open: check for block interference CPUs Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 7/7] mtrr_add_page/mtrr_del_page: " Marcelo Tosatti
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

Change timekeeping_notify to use stop_machine_fail when appropriate,
which fails in case any online CPU is tagged as a block interference CPU.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/include/linux/clocksource.h
===================================================================
--- linux-2.6.orig/include/linux/clocksource.h
+++ linux-2.6/include/linux/clocksource.h
@@ -267,7 +267,7 @@ extern void clocksource_arch_init(struct
 static inline void clocksource_arch_init(struct clocksource *cs) { }
 #endif
 
-extern int timekeeping_notify(struct clocksource *clock);
+extern int timekeeping_notify(struct clocksource *clock, bool fail);
 
 extern u64 clocksource_mmio_readl_up(struct clocksource *);
 extern u64 clocksource_mmio_readl_down(struct clocksource *);
Index: linux-2.6/kernel/time/clocksource.c
===================================================================
--- linux-2.6.orig/kernel/time/clocksource.c
+++ linux-2.6/kernel/time/clocksource.c
@@ -117,7 +117,7 @@ static u64 suspend_start;
 
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG
 static void clocksource_watchdog_work(struct work_struct *work);
-static void clocksource_select(void);
+static int clocksource_select(bool fail);
 
 static LIST_HEAD(watchdog_list);
 static struct clocksource *watchdog;
@@ -649,7 +649,7 @@ static int clocksource_watchdog_kthread(
 {
 	mutex_lock(&clocksource_mutex);
 	if (__clocksource_watchdog_kthread())
-		clocksource_select();
+		clocksource_select(false);
 	mutex_unlock(&clocksource_mutex);
 	return 0;
 }
@@ -946,7 +946,7 @@ static struct clocksource *clocksource_f
 	return NULL;
 }
 
-static void __clocksource_select(bool skipcur)
+static int __clocksource_select(bool skipcur, bool fail)
 {
 	bool oneshot = tick_oneshot_mode_active();
 	struct clocksource *best, *cs;
@@ -954,7 +954,7 @@ static void __clocksource_select(bool sk
 	/* Find the best suitable clocksource */
 	best = clocksource_find_best(oneshot, skipcur);
 	if (!best)
-		return;
+		return 0;
 
 	if (!strlen(override_name))
 		goto found;
@@ -991,10 +991,16 @@ static void __clocksource_select(bool sk
 	}
 
 found:
-	if (curr_clocksource != best && !timekeeping_notify(best)) {
+	if (curr_clocksource != best) {
+		int ret;
+
+		ret = timekeeping_notify(best, fail);
+		if (ret)
+			return ret;
 		pr_info("Switched to clocksource %s\n", best->name);
 		curr_clocksource = best;
 	}
+	return 0;
 }
 
 /**
@@ -1005,14 +1011,14 @@ found:
  * Select the clocksource with the best rating, or the clocksource,
  * which is selected by userspace override.
  */
-static void clocksource_select(void)
+static int clocksource_select(bool fail)
 {
-	__clocksource_select(false);
+	return __clocksource_select(false, fail);
 }
 
-static void clocksource_select_fallback(void)
+static int clocksource_select_fallback(void)
 {
-	__clocksource_select(true);
+	return __clocksource_select(true, true);
 }
 
 /*
@@ -1031,7 +1037,7 @@ static int __init clocksource_done_booti
 	 * Run the watchdog first to eliminate unstable clock sources
 	 */
 	__clocksource_watchdog_kthread();
-	clocksource_select();
+	clocksource_select(false);
 	mutex_unlock(&clocksource_mutex);
 	return 0;
 }
@@ -1179,7 +1185,7 @@ int __clocksource_register_scale(struct
 	clocksource_enqueue_watchdog(cs);
 	clocksource_watchdog_unlock(&flags);
 
-	clocksource_select();
+	clocksource_select(false);
 	clocksource_select_watchdog(false);
 	__clocksource_suspend_select(cs);
 	mutex_unlock(&clocksource_mutex);
@@ -1208,7 +1214,7 @@ void clocksource_change_rating(struct cl
 	__clocksource_change_rating(cs, rating);
 	clocksource_watchdog_unlock(&flags);
 
-	clocksource_select();
+	clocksource_select(false);
 	clocksource_select_watchdog(false);
 	clocksource_suspend_select(false);
 	mutex_unlock(&clocksource_mutex);
@@ -1230,8 +1236,12 @@ static int clocksource_unbind(struct clo
 	}
 
 	if (cs == curr_clocksource) {
+		int ret;
+
 		/* Select and try to install a replacement clock source */
-		clocksource_select_fallback();
+		ret = clocksource_select_fallback();
+		if (ret)
+			return ret;
 		if (curr_clocksource == cs)
 			return -EBUSY;
 	}
@@ -1322,17 +1332,17 @@ static ssize_t current_clocksource_store
 					 struct device_attribute *attr,
 					 const char *buf, size_t count)
 {
-	ssize_t ret;
+	ssize_t ret, err = 0;
 
 	mutex_lock(&clocksource_mutex);
 
 	ret = sysfs_get_uname(buf, override_name, count);
 	if (ret >= 0)
-		clocksource_select();
+		err = clocksource_select(true);
 
 	mutex_unlock(&clocksource_mutex);
 
-	return ret;
+	return err ? err : ret;
 }
 static DEVICE_ATTR_RW(current_clocksource);
 
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -13,6 +13,7 @@
 #include <linux/sched.h>
 #include <linux/sched/loadavg.h>
 #include <linux/sched/clock.h>
+#include <linux/sched/isolation.h>
 #include <linux/syscore_ops.h>
 #include <linux/clocksource.h>
 #include <linux/jiffies.h>
@@ -1497,13 +1498,24 @@ static int change_clocksource(void *data
  * This function is called from clocksource.c after a new, better clock
  * source has been registered. The caller holds the clocksource_mutex.
  */
-int timekeeping_notify(struct clocksource *clock)
+int timekeeping_notify(struct clocksource *clock, bool fail)
 {
 	struct timekeeper *tk = &tk_core.timekeeper;
 
 	if (tk->tkr_mono.clock == clock)
 		return 0;
-	stop_machine(change_clocksource, clock, NULL);
+
+	if (!fail)
+		stop_machine(change_clocksource, clock, NULL);
+	else {
+		int ret;
+
+		block_interf_read_lock();
+		ret = stop_machine_fail(change_clocksource, clock, NULL);
+		block_interf_read_unlock();
+		if (ret)
+			return ret;
+	}
 	tick_clock_notify();
 	return tk->tkr_mono.clock == clock ? 0 : -1;
 }




* [RFC PATCH 6/7] perf_event_open: check for block interference CPUs
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
                   ` (4 preceding siblings ...)
  2022-09-08 19:29 ` [RFC PATCH 5/7] timekeeping_notify: use stop_machine_fail when appropriate Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  2022-09-08 19:29 ` [RFC PATCH 7/7] mtrr_add_page/mtrr_del_page: " Marcelo Tosatti
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

When creating perf events, return an error rather than
interfering with CPUs tagged as block interference CPUs.

Note: this patch is incomplete; installation of the perf context
on block interference CPUs via the task context is not handled.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: linux-2.6/kernel/events/core.c
===================================================================
--- linux-2.6.orig/kernel/events/core.c
+++ linux-2.6/kernel/events/core.c
@@ -54,6 +54,7 @@
 #include <linux/highmem.h>
 #include <linux/pgtable.h>
 #include <linux/buildid.h>
+#include <linux/sched/isolation.h>
 
 #include "internal.h"
 
@@ -12391,6 +12392,26 @@ not_move_group:
 
 	WARN_ON_ONCE(ctx->parent_ctx);
 
+	block_interf_read_lock();
+	if (!task) {
+		if (move_group) {
+			for_each_sibling_event(sibling, group_leader) {
+				if (block_interf_cpu(sibling->cpu)) {
+					err = -EPERM;
+					goto err_block_interf;
+				}
+			}
+			if (block_interf_cpu(group_leader->cpu)) {
+				err = -EPERM;
+				goto err_block_interf;
+			}
+		}
+		if (block_interf_cpu(event->cpu)) {
+			err = -EPERM;
+			goto err_block_interf;
+		}
+	}
+
 	/*
 	 * This is the point on no return; we cannot fail hereafter. This is
 	 * where we start modifying current state.
@@ -12464,6 +12485,8 @@ not_move_group:
 		put_task_struct(task);
 	}
 
+	block_interf_read_unlock();
+
 	mutex_lock(&current->perf_event_mutex);
 	list_add_tail(&event->owner_entry, &current->perf_event_list);
 	mutex_unlock(&current->perf_event_mutex);
@@ -12478,6 +12501,8 @@ not_move_group:
 	fd_install(event_fd, event_file);
 	return event_fd;
 
+err_block_interf:
+	block_interf_read_unlock();
 err_locked:
 	if (move_group)
 		perf_event_ctx_unlock(group_leader, gctx);




* [RFC PATCH 7/7] mtrr_add_page/mtrr_del_page: check for block interference CPUs
  2022-09-08 19:28 [RFC PATCH 0/7] cpu isolation: infra to block interference to select CPUs Marcelo Tosatti
                   ` (5 preceding siblings ...)
  2022-09-08 19:29 ` [RFC PATCH 6/7] perf_event_open: check for block interference CPUs Marcelo Tosatti
@ 2022-09-08 19:29 ` Marcelo Tosatti
  6 siblings, 0 replies; 8+ messages in thread
From: Marcelo Tosatti @ 2022-09-08 19:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Frederic Weisbecker, Juri Lelli, Daniel Bristot de Oliveira,
	Prasad Pandit, Valentin Schneider, Yair Podemsky,
	Thomas Gleixner, Marcelo Tosatti

Check whether any online CPU in the system is tagged as
a block interference CPU, and if so return an error
from mtrr_add_page/mtrr_del_page.

This avoids interfering with such CPUs (while allowing
userspace to handle the failures).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


Index: linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -45,6 +45,7 @@
 #include <linux/smp.h>
 #include <linux/syscore_ops.h>
 #include <linux/rcupdate.h>
+#include <linux/sched/isolation.h>
 
 #include <asm/cpufeature.h>
 #include <asm/e820/api.h>
@@ -335,6 +336,13 @@ int mtrr_add_page(unsigned long base, un
 	error = -EINVAL;
 	replace = -1;
 
+	block_interf_read_lock();
+
+	if (cpumask_intersects(block_interf_cpumask, cpu_online_mask)) {
+		block_interf_read_unlock();
+		return -EPERM;
+	}
+
 	/* No CPU hotplug when we change MTRR entries */
 	cpus_read_lock();
 
@@ -399,6 +407,7 @@ int mtrr_add_page(unsigned long base, un
  out:
 	mutex_unlock(&mtrr_mutex);
 	cpus_read_unlock();
+	block_interf_read_unlock();
 	return error;
 }
 
@@ -484,6 +493,11 @@ int mtrr_del_page(int reg, unsigned long
 		return -ENODEV;
 
 	max = num_var_ranges;
+	block_interf_read_lock();
+	if (cpumask_intersects(block_interf_cpumask, cpu_online_mask)) {
+		block_interf_read_unlock();
+		return -EPERM;
+	}
 	/* No CPU hotplug when we change MTRR entries */
 	cpus_read_lock();
 	mutex_lock(&mtrr_mutex);
@@ -521,6 +535,7 @@ int mtrr_del_page(int reg, unsigned long
  out:
 	mutex_unlock(&mtrr_mutex);
 	cpus_read_unlock();
+	block_interf_read_unlock();
 	return error;
 }
 


